WhatsApp API Integration and Web Scraping#

In this tutorial, we will explore how to leverage the FastAgency framework to create a dynamic and interactive chatbot that integrates two powerful agents:

  1. WebSurferAgent: A web-scraping agent capable of retrieving relevant content from webpages (learn more here).

  2. WhatsAppAgent: An agent that interacts with the Infobip WhatsApp API to send WhatsApp messages based on the user’s request. It is created using the standard ConversableAgent from AutoGen and an OpenAPI object instantiated from the OpenAPI specification of Infobip's REST API.

The chat system will operate between these two agents and the user, allowing them to scrape web content and send the relevant information via WhatsApp, all within a seamless conversation. This tutorial will guide you through setting up these agents, handling user interaction, and ensuring secure API communication.

What You’ll Learn#

By the end of this tutorial, you will understand how to:

  1. Integrate external APIs like Infobip WhatsApp API using OpenAPI.
  2. Build and register agents that autonomously scrape the web for relevant information using WebSurferAgent.
  3. Use AutoGenWorkflows to manage agent interactions and user input.
  4. Present scraped content to the user and offer sending that content via WhatsApp.
  5. Handle secure API credentials and ensure safe communication between agents using APIKeyHeader.

We will walk through setting up each agent, handling API security, and creating a cohesive conversation that scrapes data, processes user input, and sends it via WhatsApp in response.

Let’s dive into creating a powerful interactive agent system with FastAgency!

Project setup#

We strongly recommend using Cookiecutter for setting up the project. It creates the project folder structure, default workflow, automatically installs all the necessary requirements, and creates a devcontainer that can be used with Visual Studio Code for development.

You could also use virtual environment managers such as venv, and a Python package manager, such as pip.

  1. Install Cookiecutter with the following command:

    pip install cookiecutter
    

  2. Run the cookiecutter command:

    cookiecutter https://github.com/airtai/cookiecutter-fastagency.git
    

  3. Depending on the type of the project, choose the appropriate option in step 3:

    [1/4] project_name (My FastAgency App):
    [2/4] project_slug (my_fastagency_app):
    [3/4] Select app_type
        1 - fastapi+mesop
        2 - mesop
        3 - nats+fastapi+mesop
        Choose from [1/2/3] (1): 2
    [4/4] Select python_version
        1 - 3.12
        2 - 3.11
        3 - 3.10
        Choose from [1/2/3] (1):
    

    This command installs FastAgency with support for both the Console and Mesop interfaces for AutoGen workflows.

  4. Executing the cookiecutter command will create the following file structure:

    my_fastagency_app
    ├── deployment
    │   └── firebase
    │       ├── allowed_users.yaml
    │       └── firebase_config.yaml
    ├── docker
    │   ├── content
    │   │   ├── nginx.conf.template
    │   │   └── run_fastagency.sh
    │   └── Dockerfile
    ├── my_fastagency_app
    │   ├── deployment
    │   │   ├── __init__.py
    │   │   └── main.py
    │   ├── local
    │   │   ├── __init__.py
    │   │   ├── main_console.py
    │   │   └── main_mesop.py
    │   ├── __init__.py
    │   └── workflow.py
    ├── scripts
    │   ├── build_docker.sh
    │   ├── deploy_to_fly_io.sh
    │   ├── lint-pre-commit.sh
    │   ├── lint.sh
    │   ├── run_docker.sh
    │   ├── run_mesop_locally.sh
    │   ├── static-analysis.sh
    │   └── static-pre-commit.sh
    ├── tests
    │   ├── __init__.py
    │   ├── conftest.py
    │   └── test_workflow.py
    ├── README.md
    ├── fly.toml
    └── pyproject.toml
    
  5. To run LLM-based applications, you need an API key for the LLM provider you use. The most commonly used provider is OpenAI. To use it, create an OpenAI API key and set it as an environment variable in the terminal using the following command:

    export OPENAI_API_KEY=openai_api_key_here
    

    If you want to use a different LLM provider, follow this guide.

    Alternatively, you can skip this step and set the LLM API key as an environment variable later in the devcontainer's terminal. If you open the project in Visual Studio Code using the GUI, you will need to set the environment variable manually in the devcontainer's terminal.

  6. Open the generated project in Visual Studio Code with the following command:

    code my_fastagency_app
    

  7. Once the project is opened, you will get the following option to reopen it in a devcontainer:

  8. After reopening the project in devcontainer, you can verify that the setup is correct by running the provided tests with the following command:

    pytest -s
    

    You should get the following output if everything is set up correctly.

    =================================== test session starts ===================================
    platform linux -- Python 3.12.7, pytest-8.3.3, pluggy-1.5.0
    rootdir: /workspaces/my_fastagency_app
    configfile: pyproject.toml
    plugins: asyncio-0.24.0, anyio-4.6.2.post1
    asyncio: mode=Mode.STRICT, default_loop_scope=None
    collected 1 item
    
    tests/test_workflow.py .                                                            [100%]
    
    ==================================== 1 passed in 1.02s ====================================
    

    Running the test could take up to 30 seconds, depending on latency and throughput of OpenAI (or other LLM providers).

  9. Install additional dependencies which will be needed for this tutorial:

    pip install "fastagency[openapi]"
    

Info

If you used a different project_slug than the default my_fastagency_app, this will be reflected in the project module naming. Keep this in mind when running the commands later in this guide (in Running the Application): you will need to replace my_fastagency_app with your project_slug name.

If you are setting up the project without Cookiecutter, install FastAgency with the AutoGen, Mesop, and OpenAPI extras using pip, Python's package installer:

pip install "fastagency[autogen,mesop,openapi]"

API Key Setup#

WebSurferAgent requires a Bing Web Search API key, and WhatsAppAgent requires an API key to interact with Infobip's WhatsApp service. Follow these steps to create your API keys:

Create Bing Web Search API Key#

To create a Bing Web Search API key, follow the guide provided.

Note

You will need to create a Microsoft Azure account.

Create Infobip Account#

Step 1: If you don’t have an Infobip account, you’ll need to sign up.

Step 2: Settings

  • In the Customize your experience section, choose:
    1. WhatsApp
    2. Customer support
    3. By using code (APIs, SDKs)

Step 3: Test WhatsApp API

  • After you have created the account, you will be redirected to the Infobip homepage.
  • Check the Send your first message option and send a WhatsApp message to yourself.
  • In this tutorial, we will only be sending messages to your own number.

Important

Upon receiving this message, please reply (e.g., with "Hi") to initiate the session. Note that sessions expire after 24 hours. If your session has expired, simply send another message to create a new one.

Copy the API Key from the top-right corner and continue with the next steps.

Step 4: Register your WhatsApp sender (Optional)

  • By default, an Infobip number will be used as the sender for your messages.
  • If you wish to create a new sender phone number and customize your branding (including your name and logo), click on Register Sender.

Set Up Your API Keys in the Environment#

To use the API keys securely in your project, store them in environment variables.

You can set the API keys in your terminal as environment variables.

On Linux or macOS:

export WHATSAPP_API_KEY="your_whatsapp_api_key"
export BING_API_KEY="your_bing_api_key"

On Windows (Command Prompt; the value is written without quotes, because set would store them as part of the value):

set WHATSAPP_API_KEY=your_whatsapp_api_key
set BING_API_KEY=your_bing_api_key
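
A missing key usually only shows up later as a failed API call, so it can help to check the environment before starting the application. The following is a minimal sketch, not part of FastAgency: the helper name missing_env_vars is an assumption made for illustration, while the variable names are the ones used in this tutorial.

```python
import os
from collections.abc import Mapping
from typing import Optional

# Variable names used in this tutorial; OPENAI_API_KEY is set during project setup.
REQUIRED_ENV_VARS = ("OPENAI_API_KEY", "WHATSAPP_API_KEY", "BING_API_KEY")


def missing_env_vars(
    names: tuple[str, ...] = REQUIRED_ENV_VARS,
    env: Optional[Mapping[str, str]] = None,
) -> list[str]:
    """Return the names that are unset or blank in the given environment.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name, "").strip()]
```

Calling missing_env_vars() with no arguments inspects the real environment and returns an empty list when all three keys are present. Passing a plain dictionary as env makes the check easy to try out without touching your shell.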

Complete Application Code#

Workflow Code#

You need to define the workflow that your application will use. This is where you specify how the agents interact and what they do.

The workflow is generated in the my_fastagency_app/workflow.py file. Replace the contents of the existing workflow.py with the code below.

If you set up the project without Cookiecutter, create workflow.py and paste the code below inside.

workflow.py
import os
from typing import Annotated, Any, Optional

from autogen import register_function
from autogen.agentchat import ConversableAgent

from fastagency import UI
from fastagency.api.openapi.client import OpenAPI
from fastagency.api.openapi.security import APIKeyHeader
from fastagency.runtimes.autogen import AutoGenWorkflows
from fastagency.runtimes.autogen.agents.websurfer import WebSurferAgent

llm_config = {
    "config_list": [
        {
            "model": "gpt-4o-mini",
            "api_key": os.getenv("OPENAI_API_KEY"),
        }
    ],
    "temperature": 0.8,
}

openapi_url = "https://dev.infobip.com/openapi/products/whatsapp.json"

whatsapp_api = OpenAPI.create(
    openapi_url=openapi_url,
    # this is an optional parameter, but specified here because servers are not specified in the OpenAPI specification
    servers=[{"url": "https://api.infobip.com"}],
)

header_authorization = "App "  # pragma: allowlist secret
header_authorization += os.getenv("WHATSAPP_API_KEY", "")
whatsapp_api.set_security_params(APIKeyHeader.Parameters(value=header_authorization))

# This is the default sender number for Infobip.
# If you want to use your own sender, please update the value below:
sender = "447860099299"
WHATSAPP_SYSTEM_MESSAGE = f"""You are an agent in charge of communicating with the user and the WhatsApp API.
Always use 'present_completed_task_or_ask_question' to interact with the user.
- make sure that the 'message' parameter contains all the necessary information for the user!
Initially, the Web_Surfer_Agent will provide you with some content from the web.
You should ask the user if he would like to receive the summary of the scraped page
by using 'present_completed_task_or_ask_question'.
- "If you want to receive the summary of the page as a WhatsApp message, please provide your number."

    When sending the message, the Body must use the following format:
{{
    "from": "{sender}",
    "to": "receiverNumber",
    "messageId": "test-message-randomInt",
    "content": {{
        "text": "message"
    }},
    "callbackData": "Callback data"
}}

"from" number is always the same.
"""

wf = AutoGenWorkflows()


@wf.register(name="whatsapp_and_websurfer", description="WhatsApp and WebSurfer chat")
def whatsapp_and_websurfer_workflow(ui: UI, params: dict[str, Any]) -> str:
    def is_termination_msg(msg: dict[str, Any]) -> bool:
        return msg["content"] is not None and "TERMINATE" in msg["content"]

    def present_completed_task_or_ask_question(
        message: Annotated[str, "Message for examiner"],
    ) -> Optional[str]:
        try:
            return ui.text_input(
                sender="Whatsapp_Agent",
                recipient="User",
                prompt=message,
            )
        except Exception as e:  # pragma: no cover
            return f"present_completed_task_or_ask_question() FAILED! {e}"

    whatsapp_agent = ConversableAgent(
        name="WhatsApp_Agent",
        system_message=WHATSAPP_SYSTEM_MESSAGE,
        llm_config=llm_config,
        human_input_mode="NEVER",
        is_termination_msg=is_termination_msg,
    )

    web_surfer = WebSurferAgent(
        name="Web_Surfer_Agent",
        llm_config=llm_config,
        summarizer_llm_config=llm_config,
        human_input_mode="NEVER",
        executor=whatsapp_agent,
        is_termination_msg=is_termination_msg,
        bing_api_key=os.getenv("BING_API_KEY"),
    )

    register_function(
        present_completed_task_or_ask_question,
        caller=whatsapp_agent,
        executor=web_surfer,
        name="present_completed_task_or_ask_question",
        description="""Present completed task or ask question.
If you are presenting a completed task, last message should be a question: 'Do you need anything else?'""",
    )

    wf.register_api(
        api=whatsapp_api,
        callers=whatsapp_agent,
        executors=web_surfer,
        functions=["send_whatsapp_text_message"],
    )

    initial_message = ui.text_input(
        sender="Workflow",
        recipient="User",
        prompt="For which website would you like to receive a summary?",
    )

    chat_result = whatsapp_agent.initiate_chat(
        web_surfer,
        message=f"Users initial message: {initial_message}",
        summary_method="reflection_with_llm",
        max_turns=10,
    )

    return chat_result.summary  # type: ignore[no-any-return]

Deployment Code#

Deployment files are generated in the my_fastagency_app/deployment folder. The generated main.py should be the same as the code below, so you don't need to change anything.

If you set up the project without Cookiecutter, create deployment/main.py and paste the code below inside.

main.py
from fastagency import FastAgency
from fastagency.ui.mesop import MesopUI

from ..workflow import wf

ui = MesopUI()


app = FastAgency(
    provider=wf,
    ui=ui,
)

Code Walkthrough#

Now we will go over each key part of the code, explaining its function and purpose within the FastAgency framework. Understanding these components is crucial for building a dynamic interaction between the user, the WebSurferAgent, and the WhatsAppAgent.

Creating the WhatsApp API Instance#

The following lines show how to initialize the WhatsApp API by loading the OpenAPI specification from a URL. The OpenAPI spec defines how to interact with the WhatsApp API, including endpoints, parameters, and security details.

Also, we configure the WhatsApp API with the WHATSAPP_API_KEY using set_security_params to authenticate our requests.

whatsapp_api = OpenAPI.create(
    openapi_url=openapi_url,
    # this is an optional parameter, but specified here because servers are not specified in the OpenAPI specification
    servers=[{"url": "https://api.infobip.com"}],
)

header_authorization = "App "  # pragma: allowlist secret
header_authorization += os.getenv("WHATSAPP_API_KEY", "")
whatsapp_api.set_security_params(APIKeyHeader.Parameters(value=header_authorization))

For more information, visit API Integration User Guide.
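
Note that os.getenv("WHATSAPP_API_KEY", "") falls back to an empty string, so a missing key produces the header value "App " and the problem only appears when the API rejects the request. As a hedged sketch, the header could instead be built by a small helper that fails early. The function name build_infobip_auth_header is hypothetical; only the "App " prefix comes from the code above.

```python
def build_infobip_auth_header(api_key: str) -> str:
    """Return the header value in the 'App <key>' form used in this tutorial.

    Hypothetical helper: raises instead of silently producing 'App ' when
    the key is empty or blank.
    """
    api_key = api_key.strip()
    if not api_key:
        raise ValueError("WHATSAPP_API_KEY is empty or not set")
    return f"App {api_key}"


# Possible use in workflow.py (assumes `os` is imported there, as it is above):
# header_authorization = build_infobip_auth_header(os.getenv("WHATSAPP_API_KEY", ""))
```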

Registering the Workflow#

Here, we create a workflow collection using AutoGenWorkflows() and register a workflow under the name "whatsapp_and_websurfer". The @wf.register decorator registers the function that handles the chat flow between WhatsAppAgent and WebSurferAgent. API security is configured separately on the whatsapp_api object, as shown above.

@wf.register(name="whatsapp_and_websurfer", description="WhatsApp and WebSurfer chat")
def whatsapp_and_websurfer_workflow(ui: UI, params: dict[str, Any]) -> str:
    ...
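
If the decorator pattern is unfamiliar, the idea can be illustrated with a toy registry in plain Python. This is explicitly not FastAgency's implementation: ToyWorkflows, its run method, and the echo workflow are hypothetical names used only to show how a register decorator can store a function under a name and look it up later.

```python
from typing import Any, Callable

# Hypothetical stand-in, NOT FastAgency's implementation.
WorkflowFn = Callable[[Any, dict[str, Any]], str]


class ToyWorkflows:
    def __init__(self) -> None:
        # Maps workflow name -> (function, description)
        self._workflows: dict[str, tuple[WorkflowFn, str]] = {}

    def register(self, name: str, description: str) -> Callable[[WorkflowFn], WorkflowFn]:
        def decorator(func: WorkflowFn) -> WorkflowFn:
            if name in self._workflows:
                raise ValueError(f"Workflow '{name}' is already registered")
            self._workflows[name] = (func, description)
            return func  # the decorated function is returned unchanged

        return decorator

    def run(self, name: str, ui: Any, **params: Any) -> str:
        # Look the workflow up by name and call it with the UI and parameters.
        func, _description = self._workflows[name]
        return func(ui, params)


toy_wf = ToyWorkflows()


@toy_wf.register(name="echo", description="Echo the user's input")
def echo_workflow(ui: Any, params: dict[str, Any]) -> str:
    return f"echo: {params.get('text', '')}"
```

The workflow function keeps the same (ui, params) shape as in the tutorial code, which is what lets a runtime call any registered workflow in a uniform way.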

Interaction with the user#

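This is the core function used by the WhatsAppAgent to either present the task result or ask the user a follow-up question. The message is passed as the prompt to ui.text_input(), which shows it to the user and returns the user's reply. If the call fails, the function returns an error string instead of raising an exception, so the agent can see what went wrong.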

    def present_completed_task_or_ask_question(
        message: Annotated[str, "Message for examiner"],
    ) -> Optional[str]:
        try:
            return ui.text_input(
                sender="Whatsapp_Agent",
                recipient="User",
                prompt=message,
            )
        except Exception as e:  # pragma: no cover
            return f"present_completed_task_or_ask_question() FAILED! {e}"
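
To experiment with this function outside a running workflow, the UI can be replaced by a small stand-in. The sketch below assumes only that the UI object exposes text_input(sender, recipient, prompt), as used above. FakeUI and make_present_function are hypothetical names and not part of FastAgency.

```python
from typing import Annotated, Callable, Optional


class FakeUI:
    """Hypothetical stand-in for the FastAgency UI object, for local experiments only."""

    def __init__(self, reply: Optional[str] = None, fail: bool = False) -> None:
        self.reply = reply
        self.fail = fail
        self.prompts: list[str] = []

    def text_input(self, sender: str, recipient: str, prompt: str) -> Optional[str]:
        if self.fail:
            raise RuntimeError("UI unavailable")
        self.prompts.append(prompt)  # remember what was shown to the user
        return self.reply


def make_present_function(ui: FakeUI) -> Callable[[str], Optional[str]]:
    """Build the tool function as a closure over `ui`, mirroring the workflow code."""

    def present_completed_task_or_ask_question(
        message: Annotated[str, "Message for examiner"],
    ) -> Optional[str]:
        try:
            return ui.text_input(sender="Whatsapp_Agent", recipient="User", prompt=message)
        except Exception as e:
            # The error is returned as text so the calling agent can read it.
            return f"present_completed_task_or_ask_question() FAILED! {e}"

    return present_completed_task_or_ask_question
```

With FakeUI(reply="some reply") the function returns the canned reply, and with FakeUI(fail=True) it returns the failure string rather than raising.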

Creating the WhatsApp and WebSurfer Agents#

  • WhatsAppAgent: A ConversableAgent is created with the name "WhatsApp_Agent". It uses the system message defined earlier and relies on the termination function to end the chat when needed.
  • WebSurferAgent: The WebSurferAgent is responsible for scraping web content and passes the retrieved data to the WhatsAppAgent. It’s configured with a summarizer to condense web content, which is useful when presenting concise data to the user. For more information, visit WebSurfer User Guide.

    whatsapp_agent = ConversableAgent(
        name="WhatsApp_Agent",
        system_message=WHATSAPP_SYSTEM_MESSAGE,
        llm_config=llm_config,
        human_input_mode="NEVER",
        is_termination_msg=is_termination_msg,
    )

    web_surfer = WebSurferAgent(
        name="Web_Surfer_Agent",
        llm_config=llm_config,
        summarizer_llm_config=llm_config,
        human_input_mode="NEVER",
        executor=whatsapp_agent,
        is_termination_msg=is_termination_msg,
        bing_api_key=os.getenv("BING_API_KEY"),
    )
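
Both agents share the is_termination_msg check defined at the top of the workflow function. The sketch below repeats that check on a few sample messages and adds a more defensive variant. The variant, is_termination_msg_safe, is a hypothetical addition for illustration: it also tolerates messages that have no "content" key at all.

```python
from typing import Any


def is_termination_msg(msg: dict[str, Any]) -> bool:
    # Same check as in the workflow: content must be set and contain the keyword.
    return msg["content"] is not None and "TERMINATE" in msg["content"]


def is_termination_msg_safe(msg: dict[str, Any]) -> bool:
    # Hypothetical variant: a missing "content" key is treated like None.
    content = msg.get("content")
    return isinstance(content, str) and "TERMINATE" in content


examples = [
    {"content": "Summary sent. TERMINATE"},
    {"content": "Here is the summary."},
    {"content": None},  # for example, a message without any text
]
results = [is_termination_msg(m) for m in examples]
safe_results = [is_termination_msg_safe(m) for m in examples]
```

The explicit None check matters because the `in` operator would raise a TypeError on a None content value.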

Registering Functions#

The function present_completed_task_or_ask_question is registered to allow the WhatsAppAgent to ask questions or present completed tasks after receiving data from the WebSurferAgent.

    register_function(
        present_completed_task_or_ask_question,
        caller=whatsapp_agent,
        executor=web_surfer,
        name="present_completed_task_or_ask_question",
        description="""Present completed task or ask question.
If you are presenting a completed task, last message should be a question: 'Do you need anything else?'""",
    )

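Next, we register the WhatsApp API with the workflow, exposing only the send_whatsapp_text_message function. The WhatsAppAgent is the caller, so it suggests the API call that sends the message to the user, and the WebSurferAgent is the executor that runs it.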

    wf.register_api(
        api=whatsapp_api,
        callers=whatsapp_agent,
        executors=web_surfer,
        functions=["send_whatsapp_text_message"],
    )
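
The request body that the agent is asked to produce for send_whatsapp_text_message is described in WHATSAPP_SYSTEM_MESSAGE. As a rough illustration of that format, the same structure can be built in plain Python. The helper name build_text_message_body is hypothetical, and the field names and default sender are copied from the system message above rather than checked against the Infobip specification.

```python
import random
from typing import Any

DEFAULT_SENDER = "447860099299"  # default Infobip sender used in this tutorial


def build_text_message_body(to: str, text: str, sender: str = DEFAULT_SENDER) -> dict[str, Any]:
    """Build a body in the format given in WHATSAPP_SYSTEM_MESSAGE (illustrative only)."""
    return {
        "from": sender,
        "to": to,
        # The system message asks for "test-message-" followed by a random integer.
        "messageId": f"test-message-{random.randint(0, 999_999)}",
        "content": {"text": text},
        "callbackData": "Callback data",
    }
```

In the workflow itself the LLM fills in these fields, so a helper like this is mainly useful for understanding or logging what a valid call should look like.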

Initiating the Chat#

We initiate the conversation between the user, WebSurferAgent, and WhatsAppAgent. The user’s initial message is provided, and the system is configured to handle up to 10 turns of interaction. The conversation is summarized using the reflection_with_llm method, which uses a language model to summarize the chat.

Once the conversation ends, the summary is returned to the user, wrapping up the session.

    chat_result = whatsapp_agent.initiate_chat(
        web_surfer,
        message=f"Users initial message: {initial_message}",
        summary_method="reflection_with_llm",
        max_turns=10,
    )

    return chat_result.summary  # type: ignore[no-any-return]

Starting the Application#

The FastAgency app is created, using the registered workflows (wf) and web-based user interface (MesopUI). This makes the conversation between agents and the user interactive.

ui = MesopUI()


app = FastAgency(
    provider=wf,
    ui=ui,
)

For more information, visit Mesop User Guide.

Running the Application#

The preferred way to run the Mesop application is using a Python WSGI HTTP server like Gunicorn on Linux and Mac or Waitress on Windows.

If you created the project with Cookiecutter, run:

Terminal

gunicorn my_fastagency_app.deployment.main:app

Otherwise, first install the server using a package manager such as pip and then run it.

On Linux or macOS:

Terminal

pip install gunicorn
gunicorn deployment.main:app

On Windows:

Terminal

pip install waitress
waitress-serve --listen=0.0.0.0:8000 deployment.main:app

With Gunicorn, you should see output similar to the following:

[2024-10-10 13:19:18 +0530] [23635] [INFO] Starting gunicorn 23.0.0
[2024-10-10 13:19:18 +0530] [23635] [INFO] Listening at: http://127.0.0.1:8000 (23635)
[2024-10-10 13:19:18 +0530] [23635] [INFO] Using worker: sync
[2024-10-10 13:19:18 +0530] [23645] [INFO] Booting worker with pid: 23645

The command launches a web interface where users can enter their requests and interact with the agents (in this case, at http://localhost:8000).

Note

Ensure that your OpenAI API key is set in the environment, as the agents rely on it to interact using the gpt-4o-mini model configured in llm_config. If the API key is not correctly configured, the application may fail to retrieve LLM-powered responses.

Chat Example#

In this scenario, the user instructs the agents to scrape BBC Sport for the latest sports news.

Initial message

Upon receiving the request, WebSurferAgent initiates the process by scraping the webpage for relevant updates.

Scraping

After the scraping process is complete, the agents compile the findings and present them to the user. In the final step, the user submits their phone number to receive the results via WhatsApp message.

Scraped Info

WhatsApp API call

Finally, the results are delivered to the user through a WhatsApp message.

WhatsApp Message

Conclusion#

In summary, connecting FastAgency with the Infobip WhatsApp API lets you create chat systems that gather web data and send it straight to users on WhatsApp. By using two agents, WebSurferAgent to pull web content and WhatsAppAgent for messaging, you can build engaging experiences for users. This tutorial covered the essential steps to set up these agents, secure the API, and manage user interactions. With this setup, you can extend your chatbot’s capabilities to provide up-to-date information and smooth communication across different platforms.