WhatsApp API Integration and Web Scraping#
In this tutorial, we will explore how to leverage the FastAgency framework to create a dynamic and interactive chatbot that integrates two powerful agents:
- WebSurferAgent: A web-scraping agent capable of retrieving relevant content from webpages (learn more here).
- WhatsApp agent: An agent that interacts with the Infobip WhatsApp API to send WhatsApp messages based on the user’s request. It will be created using the standard ConversableAgent from AutoGen and the OpenAPI object instantiated with an OpenAPI specification of Infobip's REST API.
The chat system will operate between these two agents and the user, allowing them to scrape web content and send the relevant information via WhatsApp, all within a seamless conversation. This tutorial will guide you through setting up these agents, handling user interaction, and ensuring secure API communication.
What You’ll Learn#
By the end of this tutorial, you will understand how to:
- Integrate external APIs like the Infobip WhatsApp API using OpenAPI.
- Build and register agents that autonomously scrape the web for relevant information using WebSurferAgent.
- Use AutoGenWorkflows to manage agent interactions and user input.
- Present scraped content to the user and offer to send that content via WhatsApp.
- Handle secure API credentials and ensure safe communication between agents using APIKeyHeader.
We will walk through setting up each agent, handling API security, and creating a cohesive conversation that scrapes data, processes user input, and sends it via WhatsApp in response.
Let’s dive into creating a powerful interactive agent system with FastAgency!
Project setup#
We strongly recommend using Cookiecutter for setting up the project. It creates the project folder structure, default workflow, automatically installs all the necessary requirements, and creates a devcontainer that can be used with Visual Studio Code for development.
You could also use virtual environment managers such as venv, and a Python package manager, such as pip.
- Install Cookiecutter with the following command:
- Run the cookiecutter command:
- Depending on the type of the project, choose the appropriate option in step 3:
[1/4] project_name (My FastAgency App):
[2/4] project_slug (my_fastagency_app):
[3/4] Select app_type
    1 - fastapi+mesop
    2 - mesop
    3 - nats+fastapi+mesop
    Choose from [1/2/3] (1): 2
[4/4] Select python_version
    1 - 3.12
    2 - 3.11
    3 - 3.10
    Choose from [1/2/3] (1):
This command installs FastAgency with support for both the Console and Mesop interfaces for AutoGen workflows.
- Executing the cookiecutter command will create the following file structure:
my_fastagency_app
├── docker
│   ├── content
│   │   ├── nginx.conf.template
│   │   └── run_fastagency.sh
│   └── Dockerfile
├── my_fastagency_app
│   ├── deployment
│   │   ├── __init__.py
│   │   └── main.py
│   ├── local
│   │   ├── __init__.py
│   │   ├── main_console.py
│   │   └── main_mesop.py
│   ├── __init__.py
│   └── workflow.py
├── scripts
│   ├── build_docker.sh
│   ├── check-registered-app-pre-commit.sh
│   ├── check-registered-app.sh
│   ├── deploy_to_fly_io.sh
│   ├── lint-pre-commit.sh
│   ├── lint.sh
│   ├── register_to_fly_io.sh
│   ├── run_docker.sh
│   ├── run_mesop_locally.sh
│   ├── static-analysis.sh
│   └── static-pre-commit.sh
├── tests
│   ├── __init__.py
│   ├── conftest.py
│   └── test_workflow.py
├── README.md
├── fly.toml
└── pyproject.toml
- To run LLM-based applications, you need an API key for the LLM used. The most commonly used LLM provider is OpenAI. To use it, create an OpenAI API key and set it as an environment variable in the terminal using the following command:
If you want to use a different LLM provider, follow this guide.
Alternatively, you can skip this step and set the LLM API key as an environment variable later in the devcontainer's terminal. If you open the project in Visual Studio Code using the GUI, you will need to set the environment variable manually in the devcontainer's terminal.
- Open the generated project in Visual Studio Code with the following command:
- Once the project is opened, you will get the following option to reopen it in a devcontainer:
- After reopening the project in the devcontainer, you can verify that the setup is correct by running the provided tests with the following command:
You should get the following output if everything is set up correctly:
=================================== test session starts ===================================
platform linux -- Python 3.12.7, pytest-8.3.3, pluggy-1.5.0
rootdir: /workspaces/my_fastagency_app
configfile: pyproject.toml
plugins: asyncio-0.24.0, anyio-4.6.2.post1
asyncio: mode=Mode.STRICT, default_loop_scope=None
collected 1 item

tests/test_workflow.py .                                                            [100%]

==================================== 1 passed in 1.02s ====================================
Running the test could take up to 30 seconds, depending on latency and throughput of OpenAI (or other LLM providers).
- Install additional dependencies which will be needed for this tutorial:
Info
If you used a different project_slug than the default my_fastagency_app, this will be reflected in the project module naming. Keep this in mind when running the commands later in this guide (in Run Application): you will need to replace my_fastagency_app with your project_slug name.
API Key Setup#
WebSurferAgent requires a Bing Web Search API key, and WhatsAppAgent requires an API key to interact with Infobip's WhatsApp service. Follow these steps to create your API keys:
Create Bing Web Search API Key#
To create a Bing Web Search API key, follow the guide provided.
Note
You will need to create a Microsoft Azure account.
Create Infobip Account#
Step 1: If you don’t have an Infobip account, you’ll need to sign up:
- Go to the Infobip Portal and create an account.
Step 2: Settings
- In the Customize your experience section, choose:
- Customer support
- By using code (APIs, SDKs)
Step 3: Test WhatsApp API
- After you have created the account, you will be redirected to the Infobip homepage.
- Check the Send your first message option and send a WhatsApp message to yourself.
- In this tutorial, we will only be sending messages to your own number.
Important
Upon receiving this message, please reply (e.g., with "Hi") to initiate the session. Note that sessions expire after 24 hours. If your session has expired, simply send another message to create a new one.
Copy the API Key from the top-right corner and continue with the next steps.
Step 4: Register your WhatsApp sender (Optional)
- By default, an Infobip number will be used as the sender for your messages.
- If you wish to create a new sender phone number and customize your branding (including your name and logo), click on Register Sender.
Set Up Your API Keys in the Environment#
To use the API keys securely in your project, store them in environment variables.
You can set the API keys in your terminal as environment variables:
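Before starting the application, it can help to confirm that the keys are actually visible to Python. The following is a minimal sketch, not part of the generated project: the variable names are the ones read with os.getenv() in workflow.py, while the helper name missing_api_keys is hypothetical.

```python
import os
from typing import Mapping

# Variable names taken from the os.getenv() calls in workflow.py.
REQUIRED_KEYS = ("OPENAI_API_KEY", "WHATSAPP_API_KEY", "BING_API_KEY")


def missing_api_keys(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variable names that are unset or blank in env."""
    return [name for name in REQUIRED_KEYS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_api_keys()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All API keys are set.")
```

Running this in the same terminal (or devcontainer terminal) that will launch the app shows which keys still need to be exported.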
Complete Application Code#
Workflow Code#
You need to define the workflow that your application will use. This is where you specify how the agents interact and what they do.
The workflow is generated in the my_fastagency_app/workflow.py file. Replace the contents of the existing workflow.py with the code below.
If you set up the project without Cookiecutter, create workflow.py and paste the code below inside.
workflow.py
import os
from typing import Annotated, Any, Optional

from autogen import register_function
from autogen.agentchat import ConversableAgent

from fastagency import UI
from fastagency.api.openapi import OpenAPI
from fastagency.api.openapi.security import APIKeyHeader
from fastagency.runtimes.autogen import AutoGenWorkflows
from fastagency.runtimes.autogen.agents.websurfer import WebSurferAgent

llm_config = {
    "config_list": [
        {
            "model": "gpt-4o-mini",
            "api_key": os.getenv("OPENAI_API_KEY"),
        }
    ],
    "temperature": 0.8,
}

openapi_url = "https://dev.infobip.com/openapi/products/whatsapp.json"
whatsapp_api = OpenAPI.create(
    openapi_url=openapi_url,
    # this is an optional parameter, but specified here because servers are not specified in the OpenAPI specification
    servers=[{"url": "https://api.infobip.com"}],
)

header_authorization = "App "  # pragma: allowlist secret
header_authorization += os.getenv("WHATSAPP_API_KEY", "")
whatsapp_api.set_security_params(APIKeyHeader.Parameters(value=header_authorization))

# This is the default sender number for Infobip.
# If you want to use your own sender, please update the value below:
sender = "447860099299"
WHATSAPP_SYSTEM_MESSAGE = f"""You are an agent in charge to communicate with the user and WhatsAPP API.
Always use 'present_completed_task_or_ask_question' to interact with the user.
- make sure that the 'message' parameter contains all the necessary information for the user!
Initially, the Web_Surfer_Agent will provide you with some content from the web.
You should ask the user if he would like to receive the summary of the scraped page
by using 'present_completed_task_or_ask_question'.
- "If you want to receive the summary of the page as a WhatsApp message, please provide your number."
When sending the message, the Body must use the following format:
{{
"from": "{sender}",
"to": "receiverNumber",
"messageId": "test-message-randomInt",
"content": {{
"text": "message"
}},
"callbackData": "Callback data"
}}
"from" number is always the same.
"""

wf = AutoGenWorkflows()


@wf.register(name="whatsapp_and_websurfer", description="WhatsApp and WebSurfer chat")
def whatsapp_and_websurfer_workflow(ui: UI, params: dict[str, Any]) -> str:
    def is_termination_msg(msg: dict[str, Any]) -> bool:
        return msg["content"] is not None and "TERMINATE" in msg["content"]

    def present_completed_task_or_ask_question(
        message: Annotated[str, "Message for examiner"],
    ) -> Optional[str]:
        try:
            return ui.text_input(
                sender="Whatsapp_Agent",
                recipient="User",
                prompt=message,
            )
        except Exception as e:  # pragma: no cover
            return f"present_completed_task_or_ask_question() FAILED! {e}"

    whatsapp_agent = ConversableAgent(
        name="WhatsApp_Agent",
        system_message=WHATSAPP_SYSTEM_MESSAGE,
        llm_config=llm_config,
        human_input_mode="NEVER",
        is_termination_msg=is_termination_msg,
    )

    web_surfer = WebSurferAgent(
        name="Web_Surfer_Agent",
        llm_config=llm_config,
        summarizer_llm_config=llm_config,
        human_input_mode="NEVER",
        executor=whatsapp_agent,
        is_termination_msg=is_termination_msg,
        bing_api_key=os.getenv("BING_API_KEY"),
    )

    register_function(
        present_completed_task_or_ask_question,
        caller=whatsapp_agent,
        executor=web_surfer,
        name="present_completed_task_or_ask_question",
        description="""Present completed task or ask question.
If you are presenting a completed task, last message should be a question: 'Do you need anything else?'""",
    )

    wf.register_api(
        api=whatsapp_api,
        callers=whatsapp_agent,
        executors=web_surfer,
        functions=["send_whatsapp_text_message"],
    )

    initial_message = ui.text_input(
        sender="Workflow",
        recipient="User",
        prompt="For which website would you like to receive a summary?",
    )

    chat_result = whatsapp_agent.initiate_chat(
        web_surfer,
        message=f"Users initial message: {initial_message}",
        summary_method="reflection_with_llm",
        max_turns=10,
    )

    return chat_result.summary  # type: ignore[no-any-return]
Deployment Code#
Deployment files are generated under the my_fastagency_app/deployment folder. The generated main.py should be the same as the code below. You don't need to change anything.
If you set up the project without Cookiecutter, create deployment/main.py and paste the code below inside.
main.py
Code Walkthrough#
Now we will go over each key part of the code, explaining its function and purpose within the FastAgency framework. Understanding these components is crucial for building a dynamic interaction between the user, the WebSurferAgent, and the WhatsAppAgent.
Creating the WhatsApp API Instance#
The following lines show how to initialize the WhatsApp API client by loading the OpenAPI specification from a URL. The OpenAPI specification defines how to interact with the WhatsApp API, including endpoints, parameters, and security details.
We also configure the client with the WHATSAPP_API_KEY using set_security_params so that our requests are authenticated.
whatsapp_api = OpenAPI.create(
    openapi_url=openapi_url,
    # this is an optional parameter, but specified here because servers are not specified in the OpenAPI specification
    servers=[{"url": "https://api.infobip.com"}],
)

header_authorization = "App "  # pragma: allowlist secret
header_authorization += os.getenv("WHATSAPP_API_KEY", "")
whatsapp_api.set_security_params(APIKeyHeader.Parameters(value=header_authorization))
For more information, visit API Integration User Guide.
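Note that the code above falls back to an empty string when WHATSAPP_API_KEY is not set, which yields a header value of just "App ". As a minimal sketch of a stricter alternative, the hypothetical helper below (build_authorization_header is not part of FastAgency) builds the same "App <key>" value but fails early when the key is missing:

```python
import os


def build_authorization_header(api_key: str) -> str:
    """Return the header value in the "App <key>" form used in workflow.py."""
    key = api_key.strip()
    if not key:
        # Fail early instead of sending a request with an empty credential.
        raise ValueError("WHATSAPP_API_KEY is empty; set it before starting the app")
    return f"App {key}"


if __name__ == "__main__":
    try:
        build_authorization_header(os.getenv("WHATSAPP_API_KEY", ""))
        print("Authorization header can be built.")
    except ValueError as err:
        print(f"Configuration problem: {err}")
```

The returned string could then be passed as the value argument of APIKeyHeader.Parameters, exactly as header_authorization is in the original code.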
Registering the Workflow#
Here, we initialize a new workflow using AutoGenWorkflows() and register it under the name "whatsapp_and_websurfer". The @wf.register decorator registers the function to handle chat flow with security enabled, combining both WhatsAppAgent and WebSurferAgent.
@wf.register(name="whatsapp_and_websurfer", description="WhatsApp and WebSurfer chat")
def whatsapp_and_websurfer_workflow(ui: UI, params: dict[str, Any]) -> str:
    ...
Interaction with the user#
This is a core function used by the WhatsAppAgent to either present the task result or ask the user a follow-up question. The message is passed to ui.text_input(), which shows it to the user as a prompt and returns the user's reply. If the call raises an exception, the function returns an error string instead, so the agent sees the failure as a normal reply.
    def present_completed_task_or_ask_question(
        message: Annotated[str, "Message for examiner"],
    ) -> Optional[str]:
        try:
            return ui.text_input(
                sender="Whatsapp_Agent",
                recipient="User",
                prompt=message,
            )
        except Exception as e:  # pragma: no cover
            return f"present_completed_task_or_ask_question() FAILED! {e}"
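To see the error-handling pattern in isolation, here is a small sketch that runs without FastAgency. The two stand-in functions (fake_text_input and broken_text_input) and the make_ask_user wrapper are hypothetical and only imitate the keyword arguments that ui.text_input receives in the code above:

```python
from typing import Callable, Optional


def make_ask_user(text_input: Callable[..., Optional[str]]) -> Callable[[str], Optional[str]]:
    """Wrap a UI prompt callable so failures come back as text, not exceptions."""

    def ask_user(message: str) -> Optional[str]:
        try:
            return text_input(sender="Whatsapp_Agent", recipient="User", prompt=message)
        except Exception as e:
            return f"present_completed_task_or_ask_question() FAILED! {e}"

    return ask_user


# Hypothetical stand-ins for ui.text_input, used only for this illustration.
def fake_text_input(sender: str, recipient: str, prompt: str) -> str:
    return f"reply to: {prompt}"


def broken_text_input(sender: str, recipient: str, prompt: str) -> str:
    raise RuntimeError("UI disconnected")


ok_reply = make_ask_user(fake_text_input)("Send the summary?")
failed_reply = make_ask_user(broken_text_input)("Send the summary?")
print(ok_reply)
print(failed_reply)
```

Returning the error as text keeps the agent conversation going, at the cost of the agent having to interpret the failure message itself.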
Creating the WhatsApp and WebSurfer Agents#
- WhatsAppAgent: A ConversableAgent is created with the name "WhatsApp_Agent". It uses the system message defined earlier and relies on the termination function to end the chat when needed.
- WebSurferAgent: The WebSurferAgent is responsible for scraping web content and passes the retrieved data to the WhatsAppAgent. It’s configured with a summarizer to condense web content, which is useful when presenting concise data to the user. For more information, visit WebSurfer User Guide.
    whatsapp_agent = ConversableAgent(
        name="WhatsApp_Agent",
        system_message=WHATSAPP_SYSTEM_MESSAGE,
        llm_config=llm_config,
        human_input_mode="NEVER",
        is_termination_msg=is_termination_msg,
    )

    web_surfer = WebSurferAgent(
        name="Web_Surfer_Agent",
        llm_config=llm_config,
        summarizer_llm_config=llm_config,
        human_input_mode="NEVER",
        executor=whatsapp_agent,
        is_termination_msg=is_termination_msg,
        bing_api_key=os.getenv("BING_API_KEY"),
    )
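Both agents share the is_termination_msg function defined at the top of the workflow. The standalone copy below, with a few sample messages chosen for illustration, shows how it behaves; the None check matters because a message's content is not guaranteed to be a string.

```python
from typing import Any


def is_termination_msg(msg: dict[str, Any]) -> bool:
    # Same check as in workflow.py: guard against a missing (None) content
    # before searching for the TERMINATE keyword.
    return msg["content"] is not None and "TERMINATE" in msg["content"]


# Sample messages for illustration only.
print(is_termination_msg({"content": "All done. TERMINATE"}))
print(is_termination_msg({"content": "Still working"}))
print(is_termination_msg({"content": None}))
```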
Registering Functions#
The function present_completed_task_or_ask_question is registered to allow the WhatsAppAgent to ask questions or present completed tasks after receiving data from the WebSurferAgent.
    register_function(
        present_completed_task_or_ask_question,
        caller=whatsapp_agent,
        executor=web_surfer,
        name="present_completed_task_or_ask_question",
        description="""Present completed task or ask question.
If you are presenting a completed task, last message should be a question: 'Do you need anything else?'""",
    )
Next, we register the WhatsApp API with the workflow. Only the send_whatsapp_text_message function is exposed: the WhatsAppAgent is the caller that proposes the call, and the WebSurferAgent is the executor that runs it.
    wf.register_api(
        api=whatsapp_api,
        callers=whatsapp_agent,
        executors=web_surfer,
        functions=["send_whatsapp_text_message"],
    )
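The system message in workflow.py tells the agent which body format to use when sending a message. As a rough illustration of that format, the sketch below builds the same structure in plain Python. The helper name build_text_message, the random-number range, and the recipient number are assumptions made for this example; in the actual workflow the LLM fills in these fields itself.

```python
import json
import random

DEFAULT_SENDER = "447860099299"  # default Infobip sender from workflow.py


def build_text_message(to: str, text: str, sender: str = DEFAULT_SENDER) -> dict:
    """Build a body in the shape the system message asks the agent to use."""
    return {
        "from": sender,
        "to": to,
        # The prompt only says "test-message-randomInt"; the range is an assumption.
        "messageId": f"test-message-{random.randint(0, 99999)}",
        "content": {"text": text},
        "callbackData": "Callback data",
    }


# Placeholder recipient number, used only for this illustration.
body = build_text_message("441134960001", "Summary of the page")
print(json.dumps(body, indent=2))
```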
Initiating the Chat#
We initiate the conversation between the user, WebSurferAgent, and WhatsAppAgent. The user’s initial message is provided, and the system is configured to handle up to 10 turns of interaction. The conversation is summarized using the reflection_with_llm method, which uses a language model to summarize the chat.
Once the conversation ends, the summary is returned to the user, wrapping up the session.
    chat_result = whatsapp_agent.initiate_chat(
        web_surfer,
        message=f"Users initial message: {initial_message}",
        summary_method="reflection_with_llm",
        max_turns=10,
    )

    return chat_result.summary  # type: ignore[no-any-return]
Starting the Application#
The FastAgency app is created using the registered workflows (wf) and a web-based user interface (MesopUI). This makes the conversation between agents and the user interactive.
For more information, visit Mesop User Guide.
Running the Application#
The preferred way to run the Mesop application is using a Python WSGI HTTP server like Gunicorn on Linux and Mac or Waitress on Windows.
[2024-10-10 13:19:18 +0530] [23635] [INFO] Starting gunicorn 23.0.0
[2024-10-10 13:19:18 +0530] [23635] [INFO] Listening at: http://127.0.0.1:8000 (23635)
[2024-10-10 13:19:18 +0530] [23635] [INFO] Using worker: sync
[2024-10-10 13:19:18 +0530] [23645] [INFO] Booting worker with pid: 23645
The command launches a web interface where users can input their requests and interact with the agents (in this case, at http://localhost:8000).
Note
Ensure that your OpenAI API key is set in the environment, as the agents rely on it to interact using the gpt-4o-mini model configured in workflow.py. If the API key is not correctly configured, the application may fail to retrieve LLM-powered responses.
Chat Example#
In this scenario, the user instructs the agents to scrape BBC Sport for the latest sports news.
Upon receiving the request, WebSurferAgent initiates the process by scraping the webpage for relevant updates.
After the scraping process is complete, the agents compile the findings and present them to the user. In the final step, the user submits their phone number to receive the results via WhatsApp message.
Finally, the results are delivered to the user through a WhatsApp message.
Conclusion#
In summary, connecting FastAgency with the Infobip WhatsApp API lets you create chat systems that can gather web data and send it straight to users on WhatsApp. By using two agents, WebSurferAgent to pull web content and WhatsAppAgent for messaging, you can build engaging experiences for users. This tutorial covered the essential steps to set up these agents, secure the API, and manage user interactions. With this setup, you can enhance your chatbot’s capabilities, providing real-time information and smooth communication across different platforms.