🦜🔗 LangGraph AI agent
Example of how to use LangGraph with Apify Actors to create a social media analysis tool-calling agent.
src/main.py
"""This module defines the main entry point for the Apify Actor.

Feel free to modify this file to suit your specific needs.

To build Apify Actors, utilize the Apify SDK toolkit, read more at the official documentation:
https://docs.apify.com/sdk/python
"""

from __future__ import annotations

import logging

from apify import Actor
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

from src.models import AgentStructuredOutput
from src.ppe_utils import charge_for_actor_start, charge_for_model_tokens, get_all_messages_total_tokens
from src.tools import tool_calculator_sum, tool_scrape_instagram_profile_posts
from src.utils import log_state

# Fallback input is provided only for testing; delete it before deploying.
fallback_input = {
    'query': 'This is a fallback test query, do nothing and ignore it.',
    'modelName': 'gpt-4o-mini',
}


async def main() -> None:
    """Main entry point for the Apify Actor.

    This coroutine is executed using `asyncio.run()`, so it must remain an asynchronous function for proper execution.
    Asynchronous execution is required for communication with the Apify platform, and it also enhances performance in
    the field of web scraping significantly.

    Raises:
        ValueError: If the input is missing required attributes.
    """
    async with Actor:
        # Handle input. Actor.get_input() may return None when no input is provided,
        # so default to an empty dict before merging.
        actor_input = await Actor.get_input() or {}
        # Fallback input is provided only for testing; delete this line before deploying.
        actor_input = {**fallback_input, **actor_input}

        query = actor_input.get('query')
        model_name = actor_input.get('modelName', 'gpt-4o-mini')
        if actor_input.get('debug', False):
            Actor.log.setLevel(logging.DEBUG)
        if not query:
            msg = 'Missing "query" attribute in input!'
            raise ValueError(msg)

        await charge_for_actor_start()

        llm = ChatOpenAI(model=model_name)

        # Create the ReAct agent graph
        # see https://langchain-ai.github.io/langgraph/reference/prebuilt/?h=react#langgraph.prebuilt.chat_agent_executor.create_react_agent
        tools = [tool_calculator_sum, tool_scrape_instagram_profile_posts]
        graph = create_react_agent(llm, tools, response_format=AgentStructuredOutput)

        inputs: dict = {'messages': [('user', query)]}
        response: AgentStructuredOutput | None = None
        last_message: str | None = None
        last_state: dict | None = None
        # Stream the graph state and stop as soon as the structured response is available
        async for state in graph.astream(inputs, stream_mode='values'):
            last_state = state
            log_state(state)
            if 'structured_response' in state:
                response = state['structured_response']
                last_message = state['messages'][-1].content
                break

        if not response or not last_message or not last_state:
            Actor.log.error('Failed to get a response from the ReAct agent!')
            await Actor.fail(status_message='Failed to get a response from the ReAct agent!')
            return

        if not (messages := last_state.get('messages')):
            Actor.log.error('Failed to get messages from the ReAct agent!')
            await Actor.fail(status_message='Failed to get messages from the ReAct agent!')
            return

        if not (total_tokens := get_all_messages_total_tokens(messages)):
            Actor.log.error('Failed to calculate the total number of tokens used!')
            await Actor.fail(status_message='Failed to calculate the total number of tokens used!')
            return

        await charge_for_model_tokens(model_name, total_tokens)

        # Push results to the key-value store and dataset
        store = await Actor.open_key_value_store()
        await store.set_value('response.txt', last_message)
        Actor.log.info('Saved the "response.txt" file into the key-value store!')

        await Actor.push_data(
            {
                'response': last_message,
                'structured_response': response.dict() if response else {},
            }
        )
        Actor.log.info('Pushed the data into the dataset!')
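The helper get_all_messages_total_tokens is defined in src/ppe_utils.py, which is not shown here. As a rough sketch of what such a helper might do, the example below assumes each AI message exposes a usage_metadata mapping with a total_tokens entry, as LangChain's AIMessage does. The FakeMessage class and the sum_total_tokens name are hypothetical and exist only to keep the example free of third-party imports.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class FakeMessage:
    """Hypothetical stand-in for a LangChain message; not part of the template."""

    content: str
    usage_metadata: dict | None = None


def sum_total_tokens(messages: list) -> int:
    """Add up `total_tokens` across messages that carry usage metadata."""
    total = 0
    for message in messages:
        usage = getattr(message, 'usage_metadata', None)
        if usage:
            total += usage.get('total_tokens', 0)
    return total


messages = [
    FakeMessage('user question'),  # messages without usage metadata are skipped
    FakeMessage('tool call', {'input_tokens': 40, 'output_tokens': 10, 'total_tokens': 50}),
    FakeMessage('final answer', {'input_tokens': 80, 'output_tokens': 20, 'total_tokens': 100}),
]
print(sum_total_tokens(messages))
```

A total of 0 is falsy, which is why main() treats a zero token count as a failure before charging for model usage.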
Python LangGraph template
A template for LangGraph projects in Python for building AI agents with Apify Actors. The template provides a basic structure and an example LangGraph ReAct agent that calls Actors as tools in a workflow.
How it works
A ReAct agent is created and given a set of tools to accomplish a task. The agent receives a query from the user and decides which tools to use, and in what order, to complete the task. In this case, the agent is provided with an Instagram Scraper Actor to scrape Instagram profile posts and a calculator tool that sums a list of numbers to calculate the total number of likes and comments. The agent is also configured to output structured data, which is pushed to the dataset, while the textual output is stored in the key-value store as a response.txt file.
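The decide, act, observe loop described above can be illustrated without any LLM. The following is a toy sketch, not LangGraph's implementation: scripted_policy is a hypothetical stand-in for the model, fake_scrape_posts returns made-up posts instead of calling the Instagram Scraper Actor, and all names are illustrative.

```python
from __future__ import annotations


def fake_scrape_posts(handle: str) -> list[dict]:
    """Hypothetical stand-in for the Instagram scraper tool; returns canned data."""
    return [{'likes': 10, 'comments': 2}, {'likes': 5, 'comments': 1}]


def calculator_sum(numbers: list[int]) -> int:
    """Sum a list of numbers."""
    return sum(numbers)


# Tools are looked up by name, much as an agent selects a tool by its name.
TOOLS = {'scrape_posts': fake_scrape_posts, 'calculator_sum': calculator_sum}


def scripted_policy(observations: list) -> tuple[str, object]:
    """Stand-in for the LLM: pick the next action from what has been observed so far."""
    if not observations:
        return 'scrape_posts', 'some_handle'
    if len(observations) == 1:
        posts = observations[0]
        return 'calculator_sum', [p['likes'] + p['comments'] for p in posts]
    return 'final', f'Total likes and comments: {observations[-1]}'


def run_agent(max_steps: int = 5) -> str:
    """Alternate between choosing an action and running the chosen tool."""
    observations: list = []
    for _ in range(max_steps):
        action, argument = scripted_policy(observations)
        if action == 'final':
            return argument
        observations.append(TOOLS[action](argument))
    raise RuntimeError('Agent did not finish within the step limit')


print(run_agent())
```

In the real template the policy is the chat model, which chooses tools from their names and descriptions rather than from a fixed script.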
How to use
Add or modify the agent tools in the src/tools.py file, and make sure to include new tools in the agent tools list in src/main.py. Additionally, you can update the agent system prompt in src/main.py. For more information, refer to the LangGraph ReAct agent documentation and the LangChain tools documentation.
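As a sketch of the first step, a new tool is typically a typed, documented function. In src/tools.py it would presumably be wrapped with LangChain's tool decorator (from langchain_core.tools) so the agent can read its name, docstring, and argument types. The function name and logic below are hypothetical, and the decorator is left as a comment so the snippet runs with the standard library alone.

```python
from __future__ import annotations


# In src/tools.py, this function would be decorated with `@tool`
# (from langchain_core.tools import tool). Omitted here to avoid the dependency.
def tool_calculator_average(numbers: list[float]) -> float:
    """Tool to calculate the average of a list of numbers.

    Args:
        numbers: The numbers to average.

    Returns:
        The arithmetic mean, or 0.0 for an empty list.
    """
    if not numbers:
        return 0.0
    return sum(numbers) / len(numbers)


print(tool_calculator_average([10, 20, 60]))
```

Once decorated, the new tool would be added to the tools list in src/main.py, next to tool_calculator_sum and tool_scrape_instagram_profile_posts.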
For a more advanced multi-agent example, see the Finance Monitoring Agent Actor or visit the LangGraph documentation.
Included features
- Apify SDK for Python - a toolkit for building Apify Actors and scrapers in Python
- Input schema - define and easily validate a schema for your Actor's input
- Dataset - store structured data where each object stored has the same attributes
- Key-value store - store any kind of data, such as JSON documents, images, or text files