LLM-agents
AI

OpenAgents: A Foundation for Free and Open Language Agents in the Real World

author
10 minutes, 55 seconds Read

New advancements in the field have shown that language agents, especially those based on LLMs, can use natural language to accomplish complex tasks in various settings. But at the moment, building proof-of-concept language agents is where most language agent frameworks are concentrating their efforts. This concentration frequently disregards the ease of use for users without technical training and pays little heed to the designs at the application level.

A public platform for hosting and deploying language agents in the wild and across a multitude of common tasks has been developed by developers to address the present restrictions suffered by language agents. This platform is called the OpenAgents framework. The core of the OpenAgents framework are the three agents.

A data agent is someone who assists with data analysis by making use of data tools and languages like Python and SQL for querying and programming.
With the help of plugin agents, you can access more than 200 useful API tools for your day-to-day work.
Helps you surf the web anonymously with web agents.

In an effort to make the agent features accessible to the general public and to provide researchers and developers with a smooth deployment experience on their local installations, the OpenAgents framework employs a web user interface that is geared for typical failures and rapid answers. One may argue that the OpenAgents framework is an effort to provide the groundwork for creating cutting-edge language agents that are both innovative and effective in the real world.

In today’s article, we will discuss the OpenAgents framework in more depth and take a closer look at it. We will go into the internals of the framework, how it is designed, the typical problems that arise, and the outcomes. Alright, then, I will begin.

Language Agents and OpenAgents: A First Look
Intelligent agents are the fundamental ancestor of language agents. Theoretically, these smart beings should be able to solve problems on their own, as well as perceive their surroundings, form opinions, and take appropriate action. Since LLMs have become more advanced, the international community of developers has taken advantage of the idea of intelligent agents to build linguistic agents. These agents have lately demonstrated outstanding promise, and they use natural language programming (NLP) to execute a broad variety of complex tasks in different contexts.

Gravitas and Chase are two examples of existing language agent frameworks that mainly offer a developer-centric console interface and proof-of-concept examples. Nevertheless, they frequently limit the audience they may reach, especially those who aren’t good coders. Furthermore, developers currently build agent benchmarks with deterministic assessment in mind, particularly for use cases including online browsing, coding, tool utilization, or some mix of these.

Microsoft and OpenAI, two well-established companies, have released a number of well-designed solutions, such as Advanced Data Analysis (also known as Code Interpreter) and browser plugins, in an attempt to build intelligent and language agents powered by LLM for a wider audience. These agents serve their purposes well, but they don’t contribute much to the development community. The business logic code and model implementations are not open-sourced, which limits free access for consumers and prevents developers and researchers from further exploring them. This limitation emerges due of this lack of transparency.

Currently, three internal agents form the basis of OpenAgents, an open-source platform for hosting and utilizing agents, which developers have created to address this issue.

A data agent is someone who assists with data analysis by making use of data tools and languages like Python and SQL for querying and programming.
With the help of plugin agents, you can access more than 200 useful API tools for your day-to-day work.
Helps you surf the web anonymously with web agents.
Anyone interested in using, developing for, or researching the OpenAgents platform can see it in action in the following image.

1. The OpenAgents framework’s three agents can be interacted with by non-technical users using a web interface, rather than a package or consoles designed for programmers.
2. The OpenAgents framework gives developers access to research codes and business logic, allowing them to deploy the backend and frontend without any issues.
3. The web user interface (UI) gives researchers the freedom to construct new language agents from start or to apply agent-related approaches utilizing the shared components and examples.

To summarize, the original intention of the OpenAgents framework was to provide a comprehensive and realistic platform for evaluating language agents with humans in the loop. This platform would enable users to interact with agents to accomplish various tasks, and data from these interactions, as well as user feedback, would be stored and analyzed for future evaluations and development.

on case you didn’t know, LLM prompting is a way for developers to make smart instructions that protect against malicious or incorrect inputs, improve the look of the output, and satisfy the logic on the backend. While building the OpenAgents framework, developers stress the need of clearly defining application needs using the LLM prompting technique. Nevertheless, it was quickly realized by developers that these instructions or LLM prompts can accumulate to a large amount, which could impact the context handling capabilities of LLM frameworks and token constraints. The developers also noted that for these agents to be successfully used in the real world, the agent models need to be able to handle a variety of interacting scenarios in real-time and have outstanding performance. Even while existing agent frameworks are fast enough, they neglect real-world factors, particularly in real-time, which obscures the full potential of LLM frameworks since they compromise on accuracy or responsiveness.

We directly compare the OpenAgents framework with previous research on agent concept benchmarks and prototype creation in the following graphic.

Designing and Implementing the OpenAgents Platform
The OpenAgents platform’s architecture and design may be broadly divided into two parts: the User Interface (UI), which includes the backend and frontend, and the Language Agent (LA), which includes tools, language models, and environments. An interface for user-agent communication is provided by the OpenAgents framework. The following is the schematic of the framework’s interaction flow.

After getting user input, the agents use the tools at their disposal to plan and execute the necessary activities in their settings. Here we can see the framework’s methodical design, or architecture, in action.

User Interface
After taking on a large number of host agents and reusable business logic, the OpenAgents framework’s developers spared no effort in creating an intuitive and feature-rich user interface. The main objective was to make the OpenAgents framework user friendly while also making it highly effective and easy to use. Consequently, the framework boasts support for a wide array of technical tasks, including data streaming, error handling, and backend server operations.

Automated Language Translator
An environment, a language model, and a tool interface are the three main parts of a language agent in the OpenAgents framework. With the OpenAgents framework’s built-in prompting technique, agents are guided through a sequential process beginning with observation and ending with action. Parsers are a part of the tool interface that can convert the parsable text that LLMs produce into executable actions like creating code or performing API calls. The framework also instructs the LLM to do this more efficiently. Within the constraints of each environment, the framework carries out these operations.

Agents of OpenAgents
The three main agents of OpenAgents are the Data Agent, which aids in data analysis using data tools and query languages like SQL or Python, the Plugin Agents, which provide access to more than 200 API tools useful for everyday tasks, and the Web Agents, which allow you to browse the web anonymously. While these agents share some domain knowledge with ChatGPT plugins, OpenAgents’ implementation is entirely based on open language APIs, unlike ChatGPT.

Data Broker
Built and released into the OpenAgents framework, the data agent can handle a broad variety of data-related activities that end users often face. The data agents can generate and run code in Python and SQL, and they come with a plethora of data tools like ECharts Tool, Kaggle Data Search, and Data Profiling, which can give you basic data information and search datasets. In addition, the data agent is encouraged to proactively utilize these technologies by the OpenAgents architecture in order to properly respond to the requests made by end users. Furthermore, because to the extensive coding needs, the OpenAgents framework chooses embedded language models for the data agent. Instead of the agent itself creating the code, tools such as Python, ECharts, and SQL are responsible for this task. This method allows the framework to fully utilize the programming capabilities of language models, which in turn alleviates the data agent’s workload.

The data agent can handle a flood of data-centric requests with ease using these data tools, and it can graph, manipulate, and query data with ease, going well beyond the capabilities of code and text generation. Here we can see a data agent in action, together with the tools that regular users have at their disposal.

Automated Plugins
Specifically, developers paid close attention to the API pings, function calling interface, and response lengths when designing the OpenAgents framework’s plugin agent to meet the diverse needs of users conducting everyday tasks such as online shopping, reading news, or building websites and applications. The agent provides access to more than 200 plugins. The following are examples of popular plugins:

Search Engine by Google
The Wolfram Alpha
More quickly
Klarna
Online learning platform
Bring Me
Speak
Request a PDF from AskYourPDF
The App BizTok
Klook
The image below shows how the plugin agents work, and users can adjust the number of plugins they want to utilize according to their needs and requirements.

Also, the OpenAgents framework has a function that chooses the best plugins based on user instructions, which is helpful when users aren’t sure which plugins would work best for their needs.

Agents on the Web
To improve the chat agent’s performance and capabilities, the OpenAgents framework introduces web agent, a dedicated tool. Even while the web agent is still housed in the chat agent, it incorporates it effortlessly whenever needed. As seen in the figure below, the web agent subsequently delivers the end user the final response.

These web agents’ design strategy is highly beneficial because, before passing them on to the web agent, the chat agent systematically processes important parameters or starts URLs. This ensures that the output is better aligned with the user’s requirements, which leads to clear communication. In addition, the approach enables online agents to handle complex and versatile consumer inquiries through the use of dynamic multi-turn web navigation and chat conversations. So, the OpenAgents framework allows for the improvement and evolution of each module by clearly defining the functions and duties of chat and multi-browsing agents.

OpenAgents: Use Cases and Actual Implementation
Here we will discuss the development path of the OpenAgents framework, from its inception in theory to its actual deployment in the real world, including the difficulties faced, lessons learned, and evaluation complexity that the developers overcame.

Making Use of Prompts to Develop Practical Applications from Huge Language Models
In order to define specific needs, the OpenAgents framework makes use of LLM prompts while developing real-world applications for end users. Some of the instructions are meant to make sure the output follows a certain format so the backend logic can work, some are meant to make the output look nicer, and the rest are meant to keep the framework safe from bad guys.

Unpredictable Real-Life Circumstances
Internet infrastructure, users, business logics, and other real-world variables that developers could not control were greeted by the OpenAgents framework when it was deployed in the real world. Because of these external variables, developers had to rethink and fine-tune several assumptions based on previous research; nonetheless, these aspects may cause the framework to provide responses that end users are unhappy with.

Evaluation Difficulty
It becomes more difficult to evaluate the performance of applications powered by LLM due to the increased complexity of constructing such applications, even while agents targeted directly for apps may have a wider application and allow for better evaluation. In addition to increasing instability, this method lengthens the LLMs’ system chain, which makes it harder for the framework to adjust to new parts. Therefore, it is prudent to improve the agents’ operational logic and system design to streamline processes and guarantee effective output.

Last Remarks
Throughout this post, we have covered the OpenAgents framework, which provides an open platform for hosting and deploying language agents in real-world scenarios and for several common activities. The three main agents that make up the OpenAgents framework are the Data Agent, which aids in data analysis using tools like SQL or Python, the Plugin Agents, which provide access to more than 200 API tools useful for everyday tasks, and the Web Agents, which allow you to browse the web anonymously. In an effort to make the agent features accessible to the general public and to provide researchers and developers with a smooth deployment experience on their local installations, the OpenAgents framework employs a web user interface that is geared for typical failures and rapid answers. The goal of OpenAgents is to make LLMs more accessible, not just to developers and researchers but also to end users with less technical knowledge, by offering a transparent, comprehensive, and deployable platform.

Similar Posts

Leave a Reply

Your email address will not be published. Required fields are marked *