
Using Non-OpenAI Models with FastAgency#

FastAgency makes it simple to work with non-OpenAI models through its AutoGen runtime; as this guide shows, switching providers usually comes down to a small change in the LLM configuration.

This flexibility allows you to access a wider variety of models, assign tailored models to individual agents, and optimise inference costs, among other advantages.

To show how simple it is to use non-OpenAI models, we'll rewrite the Weatherman chatbot example. With just a few changes, we'll switch to the Together AI Cloud platform, utilizing their Meta-Llama-3.1-70B-Instruct-Turbo model. For a comprehensive list of models available through Together AI, please refer to their official documentation.

Let’s dive in!

Installation#

We strongly recommend using Cookiecutter to set up the project. Cookiecutter creates the project folder structure and a default workflow, installs all the necessary requirements, and creates a devcontainer that can be used with Visual Studio Code.

You can set up the project with Cookiecutter by following the project setup guide.

Alternatively, you can use pip + venv. Before getting started, ensure that you have FastAgency installed. Run the following command:

pip install "fastagency[autogen,mesop,openapi]"

This command installs the FastAgency library along with the AutoGen runtime and the mesop and openapi submodules. These components enable you to build multi-agent workflows and seamlessly integrate with external REST APIs.
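To confirm the installation succeeded, you can inspect the installed package with pip (an optional sanity check, not a required step):

Terminal

pip show fastagency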

Prerequisites#

Before you begin this guide, ensure you have:

  • Together AI account and API Key: This guide uses Together AI's Meta-Llama-3.1-70B-Instruct-Turbo model, so you'll need access to it. Follow the steps in the section below to create your Together AI account and obtain your API key.

Setting Up Your Together AI Account and API Key#

1. Create a Together AI account:

  • Go to https://api.together.ai.
  • Choose a sign-in option and follow the instructions to create your account.
  • If you already have an account, simply log in.

2. Obtain your API Key:

  • Once you complete the account creation process, your API key will be displayed on the screen; copy it from there.
  • Alternatively, you can view your API key at any time:
    • Click the person icon in the top-right corner, then click Settings.
    • In the left sidebar, navigate to API Keys.
    • Copy your API key, and you're ready to go!

Set Up Your API Keys in the Environment#

To use the API key securely in your project, store it as an environment variable.

Run the command for your platform in the same terminal where you will run the FastAgency application. This environment variable must be set for the application to function correctly; skipping this step will cause the example application to crash.

On macOS and Linux:

export TOGETHER_API_KEY="your_together_api_key"

On Windows:

set TOGETHER_API_KEY="your_together_api_key"
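To make a missing key easier to diagnose, a minimal sketch (not part of the example application) can check for it at startup:

import os

# Fail fast with a descriptive error if the key is unset.
if not os.getenv("TOGETHER_API_KEY"):
    raise RuntimeError(
        "TOGETHER_API_KEY is not set; export it before starting the app."
    )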

Example: Integrating a Weather API with AutoGen#

Code Walkthrough#

As we rewrite the existing Weatherman chatbot to use non-OpenAI models, most of the code remains unchanged. The only modifications to the original code are:

  • Configure the Language Model (LLM)
  • Update the System Message

Since the modifications are minor, this guide focuses only on these differences. For a detailed explanation of the original code, please refer to the original guide.

1. Configure the Language Model (LLM)#

First, update the LLM configuration to use non-OpenAI models. For our example, we'll use meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo, but you can choose any model from Together AI Cloud. For a complete list, refer to their official documentation.

Next, add two parameters: api_type and hide_tools.

  • hide_tools

    The hide_tools parameter in AutoGen controls when tools are visible during LLM conversations. It addresses a common issue where LLMs might repeatedly recommend tool calls, even after they've been executed, potentially creating an endless loop of tool invocations.

    This parameter offers three options to control tool visibility:

    1. never: Tools are never hidden and remain visible to the LLM throughout the conversation
    2. if_all_run: Tools are hidden once all of them have been called
    3. if_any_run: Tools are hidden after any one of them has been called

    In our example, we set hide_tools to if_any_run, hiding the tools as soon as one of them has been called and improving conversation flow.

  • api_type

    Set the api_type to together to instruct FastAgency to use Together AI Cloud for model inference.

llm_config = {
    "config_list": [
        {
            "model": "meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo",
            "api_key": os.getenv("TOGETHER_API_KEY"),
            "api_type": "together",
            "hide_tools": "if_any_run"
        }
    ],
    "temperature": 0.8,
}
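Any other Together AI model can be swapped in by changing the model field. As a hypothetical illustration (verify current model IDs in Together AI's documentation), a configuration using their Mixtral-8x7B-Instruct-v0.1 model might look like:

llm_config = {
    "config_list": [
        {
            # Illustrative model ID; check Together AI's model list.
            "model": "mistralai/Mixtral-8x7B-Instruct-v0.1",
            "api_key": os.getenv("TOGETHER_API_KEY"),
            "api_type": "together",
            "hide_tools": "if_any_run",
        }
    ],
    "temperature": 0.8,
}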

2. Update the System Message#

The system message has been adjusted to work optimally with the meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo model. You may need to experiment with the system prompt if you are using a different model.

weather_agent_system_message = """You are a weather agent. When asked
about the weather for a specific city, NEVER provide any information from
memory. ALWAYS respond with: "Please hold on while I retrieve the real-time
weather data for [city name]." and immediately call the provided function to
retrieve real-time data for that city. Be concise in your response."""
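Depending on the model you choose, a stricter or terser prompt may work better. As a purely hypothetical variant:

weather_agent_system_message = """You are a weather agent. Answer questions
about the weather ONLY by calling the provided function, and NEVER answer
from memory. Keep your final answer to one or two sentences."""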

Complete Application Code#

main.py
import os
from typing import Any

from autogen import UserProxyAgent
from autogen.agentchat import ConversableAgent

from fastagency import UI, FastAgency
from fastagency.api.openapi import OpenAPI
from fastagency.runtimes.autogen import AutoGenWorkflows
from fastagency.ui.mesop import MesopUI

llm_config = {
    "config_list": [
        {
            "model": "meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo",
            "api_key": os.getenv("TOGETHER_API_KEY"),
            "api_type": "together",
            "hide_tools": "if_any_run"
        }
    ],
    "temperature": 0.8,
}

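# Create a client for the weather service from its OpenAPI specification.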
openapi_url = "https://weather.tools.fastagency.ai/openapi.json"
weather_api = OpenAPI.create(openapi_url=openapi_url)

weather_agent_system_message = """You are a weather agent. When asked
about the weather for a specific city, NEVER provide any information from
memory. ALWAYS respond with: "Please hold on while I retrieve the real-time
weather data for [city name]." and immediately call the provided function to
retrieve real-time data for that city. Be concise in your response."""

wf = AutoGenWorkflows()

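# Register this function as a named workflow that the UI can launch.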
@wf.register(name="simple_weather", description="Weather chat")  # type: ignore[type-var]
def weather_workflow(
    ui: UI, params: dict[str, Any]
) -> str:
    initial_message = ui.text_input(
        sender="Workflow",
        recipient="User",
        prompt="I can help you with the weather. What would you like to know?",
    )

    user_agent = UserProxyAgent(
        name="User_Agent",
        system_message="You are a user agent",
        llm_config=llm_config,
        human_input_mode="NEVER",
        code_execution_config=False
    )
    weather_agent = ConversableAgent(
        name="Weather_Agent",
        system_message=weather_agent_system_message,
        llm_config=llm_config,
        human_input_mode="NEVER",
    )

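    # Expose the weather API's endpoints as tools: user_agent suggests
    # the calls, weather_agent executes them.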
    wf.register_api(  # type: ignore[attr-defined]
        api=weather_api,
        callers=[user_agent],
        executors=[weather_agent],
        functions=[
            {
                "get_daily_weather_daily_get": {
                    "name": "get_daily_weather",
                    "description": "Get the daily weather",
                }
            },
            "get_hourly_weather_hourly_get",
        ],
    )

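    # Run the chat for up to three turns and summarize it via LLM reflection.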
    chat_result = user_agent.initiate_chat(
        weather_agent,
        message=initial_message,
        summary_method="reflection_with_llm",
        max_turns=3,
    )

    return chat_result.summary  # type: ignore[no-any-return]


app = FastAgency(provider=wf, ui=MesopUI())

Running the Application#

The preferred way to run the Mesop application is with a Python WSGI HTTP server such as Gunicorn on Linux and Mac, or Waitress on Windows. First, install the server using a package manager such as pip, then run the application.

On Linux and Mac:

Terminal

pip install gunicorn
gunicorn main:app

On Windows:

Terminal

pip install waitress
waitress-serve --listen=0.0.0.0:8000 main:app
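If you need to serve on a specific host or port, Gunicorn accepts a --bind option, for example:

Terminal

gunicorn --bind 0.0.0.0:8000 main:app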

Output#

Once you run the command above, FastAgency will start a Mesop application. Below is the output from the terminal along with a partial screenshot of the Mesop application:

[2024-10-10 13:19:18 +0530] [23635] [INFO] Starting gunicorn 23.0.0
[2024-10-10 13:19:18 +0530] [23635] [INFO] Listening at: http://127.0.0.1:8000 (23635)
[2024-10-10 13:19:18 +0530] [23635] [INFO] Using worker: sync
[2024-10-10 13:19:18 +0530] [23645] [INFO] Booting worker with pid: 23645

[Screenshot: the Mesop application displaying the initial message]

This example demonstrates the power of the AutoGen runtime in FastAgency, highlighting how easily you can use non-OpenAI models with just a few changes to the code. With FastAgency, developers can quickly build interactive, scalable applications that work with live data sources.