Website Terms & Policy Scraper
Pricing
Pay per event
Extract and monitor website terms of service, privacy policies, and legal pages to detect added clauses or removed text via SHA256 diffs.
Developer
太郎 山田
Maintained by Community
8 days ago
Last modified
SaaS / Company Site Change Monitor Pro
Extract text from legal pages and automatically monitor website terms of service, privacy policies, and compliance documents for critical updates. Corporate legal teams, compliance officers, and vendor risk managers rely on this specialized scraper to track third-party policy changes without manual review. Schedule the scraper to run weekly or daily across a list of vendor URLs to extract the exact text of their service agreements and legal notices.
Each run hashes the extracted text with SHA256 and compares it against the stored snapshot from the previous run. When the hash differs, the tool outputs a diff summary of added and removed lines, which shows which clauses were added, altered, or removed. This is essential for detecting silent updates to data processing agreements, liability limitations, or subprocessor lists that vendors often change without direct notification.
The extracted structured data includes the page URL, timestamp, full text snapshot, and a detailed array of text differences. By automating the monitoring of these websites, teams can proactively manage vendor risk and ensure continuous compliance. Rely on this web scraper to track pages, extract policy details, and provide the exact textual evidence needed to maintain a secure supply chain.
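The hash-then-diff approach described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the actor's actual implementation; the function names and the shape of the returned dictionary are assumptions made for the example.

```python
import difflib
import hashlib


def snapshot_hash(text: str) -> str:
    """Return the SHA-256 hex digest of a UTF-8 encoded text snapshot."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def diff_snapshots(previous: str, current: str) -> dict:
    """Compare two snapshots; collect added/removed lines only if the hashes differ."""
    prev_hash = snapshot_hash(previous)
    cur_hash = snapshot_hash(current)
    if prev_hash == cur_hash:
        # Identical text: nothing to diff.
        return {"changed": False, "hash": cur_hash, "added": [], "removed": []}

    added, removed = [], []
    # ndiff prefixes each line with a two-character code: "+ ", "- ", "  " or "? ".
    for line in difflib.ndiff(previous.splitlines(), current.splitlines()):
        if line.startswith("+ "):
            added.append(line[2:])
        elif line.startswith("- "):
            removed.append(line[2:])
    return {
        "changed": True,
        "hash": cur_hash,
        "previousHash": prev_hash,
        "added": added,
        "removed": removed,
    }
```

In this sketch an altered clause appears as one removed line plus one added line, which is usually enough to locate the edit in the original page.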
Store Quickstart
Run this actor with your target input. Results appear in the Apify Dataset and can be piped to webhooks for real-time delivery. Use dryRun to validate before committing to a schedule.
Key Features
- **Multi-target monitoring** (`price|terms|features|general`)
- Text extraction + SHA256 snapshot management
- Change summary (added/removed lines)
- Falls back to a local `state/` directory in environments where Apify KV is unavailable
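The local `state/` fallback mentioned in the last feature could look roughly like the sketch below. The file layout (one JSON file named after the snapshot key) and the helper names `load_snapshots` and `save_snapshots` are assumptions for illustration; the README does not document the actor's real on-disk format.

```python
import json
from pathlib import Path


def load_snapshots(state_dir: str, key: str) -> dict:
    """Load stored snapshots from <state_dir>/<key>.json, or {} on first run."""
    path = Path(state_dir) / f"{key}.json"
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def save_snapshots(snapshots: dict, state_dir: str, key: str) -> Path:
    """Write snapshots to a temp file, then rename, so a crash cannot leave a half-written file."""
    path = Path(state_dir) / f"{key}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(snapshots, ensure_ascii=False, indent=2), encoding="utf-8")
    tmp.replace(path)
    return path
```

Keying the stored entries by target `id` keeps lookups simple when the same `snapshotKey` is reused across scheduled runs.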
Use Cases
| Who | Why |
|---|---|
| Developers | Automate recurring data fetches without building custom scrapers |
| Data teams | Pipe structured output into analytics warehouses |
| Ops teams | Monitor changes via webhook alerts |
| Product managers | Track competitor/market signals without engineering time |
Input
| Field | Type | Default | Description |
|---|---|---|---|
| targets | array | prefilled | List of targets. Each item supports id, name, kind, url, includePatterns, excludePatterns, maxChars, fixtureHtml, fixtur |
| requestTimeoutSeconds | integer | 30 | HTTP request timeout per target. |
| userAgent | string | — | Optional user-agent header for target requests. |
| maxChars | integer | 25000 | Upper bound for extracted text length. |
| delivery | string | "dataset" | Where to send run results. |
| datasetMode | string | "changes_only" | Choose whether dataset/webhook sends all targets or only event items. |
| webhookUrl | string | — | Required when delivery is webhook. |
| notifyOnNoChange | boolean | false | When false, webhook mode skips if no event items. |
Input Example

    {
      "targets": [
        {
          "id": "notion-pricing",
          "name": "Notion Pricing",
          "kind": "price",
          "url": "https://www.notion.so/pricing",
          "includePatterns": ["price", "free", "business", "enterprise"],
          "excludePatterns": ["cookie", "login"]
        }
      ],
      "requestTimeoutSeconds": 30,
      "maxChars": 25000,
      "delivery": "dataset",
      "datasetMode": "changes_only",
      "notifyOnNoChange": false,
      "snapshotKey": "saas-change-monitor-snapshots",
      "dryRun": false
    }

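Before scheduling a run, it can help to assemble the input in code and fail fast on the constraints the Input table states, such as `webhookUrl` being required for webhook delivery. The following is a sketch only: `build_run_input` is a hypothetical helper, and treating `kind` as mandatory and limited to the four listed values is an assumption rather than documented actor behavior.

```python
# Target kinds listed under Key Features; assumed here to be the only valid values.
ALLOWED_KINDS = {"price", "terms", "features", "general"}


def build_run_input(targets, delivery="dataset", webhook_url=None, dry_run=True):
    """Assemble an actor input dict, raising ValueError on obvious mistakes."""
    if not targets:
        raise ValueError("targets must contain at least one item")
    for target in targets:
        url = str(target.get("url", ""))
        if not url.startswith(("http://", "https://")):
            raise ValueError(f"target {target.get('id')!r} needs an absolute http(s) url")
        if target.get("kind") not in ALLOWED_KINDS:
            raise ValueError(f"target {target.get('id')!r} has unknown kind {target.get('kind')!r}")
    if delivery == "webhook" and not webhook_url:
        raise ValueError("webhookUrl is required when delivery is 'webhook'")

    run_input = {
        "targets": targets,
        "requestTimeoutSeconds": 30,   # documented default
        "maxChars": 25000,             # documented default
        "delivery": delivery,
        "datasetMode": "changes_only",
        "notifyOnNoChange": False,
        "dryRun": dry_run,             # defaults to True here so a first run only validates
    }
    if webhook_url:
        run_input["webhookUrl"] = webhook_url
    return run_input
```

The returned dictionary can be passed as `run_input` to the Apify client call shown under API Usage.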
Input Examples
Example: Single-target audit
{"targets": ["example-target-1"],"maxResultsPerTarget": 30}
Example: Bulk portfolio
{"targets": ["target-1","target-2","target-3"],"maxResultsPerTarget": 50,"snapshotKey": "saas-change-monitor-actor-premium-state"}
Example: Recurring delta watch
{"targets": ["target-1"],"snapshotKey": "saas-change-monitor-actor-premium-state","emitChangedOnly": true}
Output
| Field | Type | Description |
|---|---|---|
| meta | object | |
| results | array | |
| results[].id | string | |
| results[].name | string | |
| results[].kind | string | |
| results[].url | string | |
| results[].capturedAt | timestamp | |
| results[].changed | boolean | |
| results[].status | string | |
| results[].hash | string | |
| results[].previousHash | string | |
| results[].previousCapturedAt | string | |
| results[].lineCount | number | |
| results[].changeSummary | object | |
| results[].preview | string | |
| results[].error | string | |
| results[].finalUrl | string | |
Output Example
{"status": "ok","data": []}
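Downstream code usually only needs to know which targets changed and which failed. The sketch below reads the `results[].changed` and `results[].error` fields from the Output table; `summarize_results` is a hypothetical helper, and the sample values in the usage note are invented for illustration.

```python
def summarize_results(output: dict) -> dict:
    """Split a run's results into changed and failed target ids."""
    results = output.get("results", [])
    return {
        "total": len(results),
        # A target counts as changed when its `changed` flag is truthy.
        "changed": [r["id"] for r in results if r.get("changed")],
        # A target counts as failed when it carries a non-empty `error` string.
        "failed": [r["id"] for r in results if r.get("error")],
    }


if __name__ == "__main__":
    sample = {
        "meta": {},
        "results": [
            {"id": "vendor-a-terms", "changed": True, "error": None},
            {"id": "vendor-b-privacy", "changed": False, "error": None},
            {"id": "vendor-c-dpa", "changed": False, "error": "timeout"},
        ],
    }
    print(summarize_results(sample))
```

Reporting failed targets separately avoids mistaking a fetch error for "no change".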
API Usage
Run this actor programmatically using the Apify API. Replace YOUR_API_TOKEN with your token from Apify Console → Settings → Integrations.
cURL

    curl -X POST "https://api.apify.com/v2/acts/taroyamada~saas-change-monitor-actor-premium/run-sync-get-dataset-items?token=YOUR_API_TOKEN" \
      -H "Content-Type: application/json" \
      -d '{
        "targets": [
          {
            "id": "notion-pricing",
            "name": "Notion Pricing",
            "kind": "price",
            "url": "https://www.notion.so/pricing",
            "includePatterns": ["price", "free", "business", "enterprise"],
            "excludePatterns": ["cookie", "login"]
          }
        ],
        "requestTimeoutSeconds": 30,
        "maxChars": 25000,
        "delivery": "dataset",
        "datasetMode": "changes_only",
        "notifyOnNoChange": false,
        "snapshotKey": "saas-change-monitor-snapshots",
        "dryRun": false
      }'

Python

    from apify_client import ApifyClient

    client = ApifyClient("YOUR_API_TOKEN")

    # Start the actor and wait for the run to finish.
    run = client.actor("taroyamada/saas-change-monitor-actor-premium").call(
        run_input={
            "targets": [
                {
                    "id": "notion-pricing",
                    "name": "Notion Pricing",
                    "kind": "price",
                    "url": "https://www.notion.so/pricing",
                    "includePatterns": ["price", "free", "business", "enterprise"],
                    "excludePatterns": ["cookie", "login"],
                }
            ],
            "requestTimeoutSeconds": 30,
            "maxChars": 25000,
            "delivery": "dataset",
            "datasetMode": "changes_only",
            "notifyOnNoChange": False,  # Python booleans, not JSON false
            "snapshotKey": "saas-change-monitor-snapshots",
            "dryRun": False,
        }
    )

    # Print every item the run wrote to its default dataset.
    for item in client.dataset(run["defaultDatasetId"]).iterate_items():
        print(item)

JavaScript / Node.js

    import { ApifyClient } from 'apify-client';

    const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

    const run = await client.actor('taroyamada/saas-change-monitor-actor-premium').call({
      "targets": [
        {
          "id": "notion-pricing",
          "name": "Notion Pricing",
          "kind": "price",
          "url": "https://www.notion.so/pricing",
          "includePatterns": ["price", "free", "business", "enterprise"],
          "excludePatterns": ["cookie", "login"]
        }
      ],
      "requestTimeoutSeconds": 30,
      "maxChars": 25000,
      "delivery": "dataset",
      "datasetMode": "changes_only",
      "notifyOnNoChange": false,
      "snapshotKey": "saas-change-monitor-snapshots",
      "dryRun": false
    });

    const { items } = await client.dataset(run.defaultDatasetId).listItems();
    console.log(items);

Tips & Limitations
- Schedule weekly runs for procurement watchlists; daily for critical vendors.
- Use webhook delivery to drop diffs into a shared procurement Slack channel.
- Track competitor pricing quarterly to inform your own pricing strategy.
- Store snapshots persistently — these are valuable for contract renegotiation evidence.
- Combine with `vendor-change-monitor` for term-level changes alongside pricing.
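For the Slack tip above, one option is a small relay that turns a result item into a Slack incoming-webhook message. The sketch below assumes you run your own relay between the actor and Slack, since this README does not document the actor's webhook payload format; `build_slack_payload` and `post_to_slack` are hypothetical helpers that only use fields listed in the Output table.

```python
import json
import urllib.request


def build_slack_payload(result: dict) -> dict:
    """Format one result item as a Slack incoming-webhook message body."""
    lines = [
        f"*{result['name']}* changed ({result['kind']})",
        result["url"],
        f"captured: {result['capturedAt']}",
        f"hash: {result.get('previousHash') or 'n/a'} -> {result['hash']}",
    ]
    preview = result.get("preview")
    if preview:
        # Keep the message short; the full snapshot stays in the dataset.
        lines.append(preview[:300])
    return {"text": "\n".join(lines)}


def post_to_slack(webhook_url: str, payload: dict) -> int:
    """POST the payload as JSON and return the HTTP status code."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Truncating the preview keeps channel messages readable while the hashes still identify exactly which snapshots were compared.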
FAQ
How are diffs computed?
Semantic diff on structured fields (prices, plan names, feature tables) rather than raw HTML — filters out cosmetic changes.
What happens if a vendor redesigns their site?
The actor falls back to best-effort structured extraction; material redesigns may require a vendor-specific adapter update.
Can I track changes to downloadable contracts/PDFs?
The actor tracks content text; PDF-only vendors with dynamic URLs may require manual review.
Does this bypass anti-bot measures?
No — it uses standard requests. Some vendors with aggressive bot protection may not be monitorable; report these issues and we'll adjust fetch strategy.
Can I monitor internal portals?
No — this actor is public-web only. Internal portals with auth are out of scope.
Related Actors
SaaS & Vendor Monitoring cluster — explore related Apify tools:
- SaaS Pricing & Terms Monitor API — Monitor pricing, terms, and feature pages with machine-readable diffs, snapshot history, and dataset/webhook delivery.
- Vendor Pricing, Terms & Renewal Watch API — Monitor vendor pricing changes, terms-of-service updates, renewal language, privacy / DPA policy diffs, and procurement / vendor-risk alerts with one summary-first vendor digest per monitored vendor.
- Vendor Status Page & Incident Digest Monitor — Monitor public vendor status pages and incident feeds.
Cost
Pay Per Event:
- actor-start: $0.01 (flat fee per run)
- dataset-item: $0.003 per output item
Example: 1,000 items = $0.01 + (1,000 × $0.003) = $3.01
No subscription required — you only pay for what you use.
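The pricing arithmetic above generalizes to a small estimator. This sketch uses the two fees listed in this section; the function names and the 30-day month are assumptions for the example, and `Decimal` is used to avoid floating-point rounding on currency values.

```python
from decimal import Decimal

ACTOR_START_FEE = Decimal("0.01")    # flat fee per run
DATASET_ITEM_FEE = Decimal("0.003")  # fee per output item


def estimate_run_cost(items: int) -> Decimal:
    """Cost of a single run that emits `items` dataset items."""
    return ACTOR_START_FEE + DATASET_ITEM_FEE * items


def estimate_monthly_cost(runs_per_month: int, items_per_run: int) -> Decimal:
    """Cost of a recurring schedule, assuming every run emits the same item count."""
    return estimate_run_cost(items_per_run) * runs_per_month


if __name__ == "__main__":
    print(estimate_run_cost(1000))        # the 1,000-item example above
    print(estimate_monthly_cost(30, 20))  # daily runs, 20 items each
```

With `datasetMode` set to `changes_only`, the item count per run is the number of targets that produced an event, so quiet periods cost little more than the start fee.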
⭐ Was this helpful?
If this actor saved you time, please leave a ★ rating on Apify Store. It takes 10 seconds, helps other developers discover it, and keeps updates free.
Bug report or feature request? Open an issue on the Issues tab of this actor.