Terms of Service (TOS) Watchdog
Pricing: from $0.10 / 1,000 results

An Apify Actor that compares two Terms of Service URLs (old vs new) and uses AI to analyze the legal risk of changes.
Overview
This Actor performs a one-time comparison between two Terms of Service URLs (old version vs new version). It fetches both URLs, compares their content, and if differences are found, uses GPT-4o-mini to assess legal risk.
Features
- Direct Comparison: Compares two URLs side-by-side (old vs new)
- One-Time Analysis: No stateful storage - perfect for ad-hoc comparisons
- Change Detection: Identifies differences between versions
- AI-Powered Analysis: Uses GPT-4o-mini to identify high-risk changes
- Structured Output: Saves results to Apify Dataset in JSON format
Tech Stack
- Python 3.11+
- httpx: Async HTTP client for fetching web pages
- BeautifulSoup4: HTML parsing and text extraction
- LangChain: LLM orchestration
- OpenAI GPT-4o-mini: Legal risk analysis
- Apify SDK: Actor framework and storage
Input Schema
{
  "old_url": "https://example.com/terms/old",
  "new_url": "https://example.com/terms/new",
  "openai_api_key": "sk-..."
}
Required Fields:
- old_url: URL of the old/previous version to compare
- new_url: URL of the new/current version to compare
- openai_api_key: OpenAI API key for GPT-4o-mini analysis
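Since all three fields are required, a run can fail fast if any is missing. A minimal sketch of that check (validate_input is a hypothetical helper, not part of the Actor's published code):

```python
def validate_input(actor_input: dict) -> list[str]:
    """Return the names of required fields that are missing or empty."""
    required = ("old_url", "new_url", "openai_api_key")
    return [field for field in required if not actor_input.get(field)]

# Example: only old_url supplied, so two fields are reported missing.
missing = validate_input({"old_url": "https://example.com/terms/old"})
```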
Output Schema
Results are saved to the default Dataset with the following structure:
{
  "old_url": "https://example.com/terms/old",
  "new_url": "https://example.com/terms/new",
  "change_detected": true,
  "risk_analysis": "Analysis of legal implications...",
  "timestamp": "2024-01-01T12:00:00"
}
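A sketch of how such a record might be assembled before being pushed to the Dataset (build_record is a hypothetical helper; it assumes an ISO-8601 UTC timestamp, which the Apify SDK does not mandate):

```python
from datetime import datetime, timezone

def build_record(old_url: str, new_url: str,
                 change_detected: bool, risk_analysis: str) -> dict:
    # Assemble a Dataset record matching the documented output schema.
    return {
        "old_url": old_url,
        "new_url": new_url,
        "change_detected": change_detected,
        "risk_analysis": risk_analysis,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }

record = build_record("https://example.com/terms/old",
                      "https://example.com/terms/new",
                      True, "Analysis of legal implications...")
```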
Installation
For Local Development/Testing
You need to install the dependencies before running locally:
- Create a virtual environment (recommended):
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
- Install dependencies:
pip install -r requirements.txt
- Set up input file for local testing:
mkdir -p storage/key_value_stores/default
# Create storage/key_value_stores/default/INPUT.json with your configuration
Example INPUT.json:
{
  "old_url": "https://example.com/terms/old",
  "new_url": "https://example.com/terms/new",
  "openai_api_key": "sk-your-openai-api-key-here"
}
Note: The INPUT.json file is already created in the project. Edit it with your actual URLs and API key.
For Apify Platform
No manual installation is needed - dependencies are installed automatically during the Actor build process on Apify.
Usage
Local Development
After installing dependencies, you can run the Actor locally:
# Using Python directly
python src/main.py

# Or using the Apify CLI (if installed)
apify run
Note: For local testing, you'll need:
- Python 3.11 or higher
- All dependencies from requirements.txt installed
- OpenAI API key (passed via input)
- Optional: Apify token if you want to use Apify storage locally
Apify Platform
- Push the Actor to Apify (dependencies install automatically)
- Configure input with your old and new ToS URLs and OpenAI API key
- Run the Actor to perform the comparison
How It Works
- Fetch Old Version: Downloads and extracts text content from the old_url
- Fetch New Version: Downloads and extracts text content from the new_url
- Text Comparison: Compares the two texts directly
- Change Detection: If the texts differ, both versions are sent to GPT-4o-mini for analysis
- Risk Assessment: The LLM analyzes changes focusing on:
  - Data privacy implications
  - AI training rights
  - Billing and payment terms
  - User rights changes
- Output: The comparison result and risk analysis are saved to the Dataset
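The extraction and comparison steps above can be sketched as follows. The Actor itself uses BeautifulSoup and httpx; this stdlib-only illustration mirrors the same idea (strip script/style/nav, keep visible text, compare), and extract_text and change_detected are hypothetical names:

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Stdlib stand-in for the BeautifulSoup extraction step:
    keeps visible text, drops <script>, <style>, and <nav> content."""
    SKIP = {"script", "style", "nav"}

    def __init__(self):
        super().__init__()
        self.skip_depth = 0   # nesting level inside skipped elements
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())

def extract_text(html: str) -> str:
    parser = TextExtractor()
    parser.feed(html)
    return " ".join(parser.chunks)

def change_detected(old_html: str, new_html: str) -> bool:
    # A simple equality check on the cleaned text, as described above.
    return extract_text(old_html) != extract_text(new_html)
```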
File Structure
tos_watchdog/
├── apify.json          # Actor configuration for Apify platform
├── requirements.txt    # Python dependencies
├── README.md           # This file
├── TROUBLESHOOTING.md  # Troubleshooting guide
├── .gitignore          # Git ignore rules
└── src/
    ├── __init__.py     # Package init
    ├── main.py         # Entry point
    ├── routes.py       # URL fetching and comparison logic
    └── analysis.py     # LLM analysis logic
Notes
- Stateless: No persistent storage - each run is independent
- Direct Comparison: Fetches both URLs and compares them directly
- Text Extraction: Removes scripts, styles, and navigation elements for cleaner comparison
- Token Limits: Both texts are limited to 15,000 characters each to stay within GPT-4o-mini token limits
- Error Handling: The Actor handles errors gracefully and reports them in the output
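The 15,000-character cap can be applied with a straightforward truncation before the texts are sent to the LLM; MAX_CHARS and truncate are illustrative names, not the Actor's actual identifiers:

```python
MAX_CHARS = 15_000  # per-document cap noted above

def truncate(text: str, limit: int = MAX_CHARS) -> str:
    # Trim each version so both fit within GPT-4o-mini's context window.
    return text if len(text) <= limit else text[:limit]
```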