# AI Website Content Checker (`scraping_samurai/ai-website-content-checker`) Actor

AI Content Checker monitors web pages for content changes and provides AI-powered summaries, visual comparisons, and smart notifications. It helps track competitors, compliance updates, and critical information.

- **URL**: https://apify.com/scraping_samurai/ai-website-content-checker.md
- **Developed by:** [Scraping Samurai](https://apify.com/scraping_samurai) (community)
- **Categories:** AI, Agents, Automation
- **Stats:** 1 total users, 1 monthly users, 100.0% runs succeeded, 1 bookmarks
- **User rating**: No ratings yet

## Pricing

$19.99/month + usage

To use this Actor, you pay a monthly rental fee to the developer. The rent is subtracted from your prepaid usage every month. You also pay for Apify platform usage, which gets cheaper the higher your Apify subscription plan is.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#rental-actors

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

### AI Content Checker

#### Smart website monitoring with AI-powered content change detection and analysis

Monitor any web page for content changes and receive intelligent notifications with AI-powered analysis of what changed. Perfect for tracking competitors, monitoring critical information, or staying informed about updates to important web pages.

---

### 🔍 Key Features

-   **AI-Powered Change Analysis**: Receive detailed summaries of how content has changed, not just that it changed.
-   **Visual Comparison**: Before/after screenshots of changes for visual confirmation.
-   **Smart Notifications**: Get email alerts with AI-summarized changes and screenshots.
-   **Robust Web Handling**: Works on complex sites with dynamic content and anti-bot measures.
-   **Selective Monitoring**: Monitor specific elements on a page rather than the entire page.
-   **Reliable Execution**: Multiple retry strategies to ensure successful data extraction.

---

### 📋 Use Cases

-   **Competitor Monitoring**: Track price changes, product features, or marketing messages.
-   **News & Publication Tracking**: Be the first to know when important content is published.
-   **Documentation Monitoring**: Stay informed about API or documentation updates.
-   **E-commerce Tracking**: Monitor product availability, pricing, or specifications.
-   **Legal & Compliance**: Track terms of service, privacy policies, or regulatory changes.
-   **Crisis Management**: Keep tabs on news articles or public statements during sensitive situations.

---

### 🧠 AI-Powered Analysis

The AI Content Checker doesn't just tell you that something changed—it tells you what changed and how. Our AI integration:

-   **Summarizes Changes**: Analyzes differences and provides a concise, human-readable summary.
-   **Identifies Important Updates**: Focuses on meaningful changes while ignoring trivial ones.
-   **Contextualizes Changes**: Understands the significance of changes in their context.
-   **Cleans Content**: Automatically filters out navigation elements, ads, and other noise.

##### Example AI Analysis:

```plaintext
The company has changed its pricing structure. The basic plan increased from $9.99 to $12.99 per month. A new "Enterprise" tier was added at $49.99 with additional features including dedicated support and custom integrations. The free tier remains unchanged.
```

***

### ⚙️ Configuration Options

| Parameter              | Description                                                      |
| ---------------------- | ---------------------------------------------------------------- |
| `url`                  | URL of the page to monitor                                       |
| `contentSelector`      | CSS selector targeting the content to monitor                    |
| `screenshotSelector`   | CSS selector for screenshot area (defaults to `contentSelector`) |
| `sendNotificationTo`   | Email address for notifications                                  |
| `sendNotificationText` | Custom message included in notifications                         |
| `aiMode`               | Enable AI analysis (string `"true"` or `"false"`)                |
| `loaderSelector`       | Page load strategy (`networkidle`, `domcontentloaded`, `load`)   |
| `navigationTimeout`    | Maximum time to wait for page load (milliseconds)                |
| `informOnError`        | Send email on errors (string `"true"` or `"false"`)              |
| `maxRetries`           | Number of retry attempts                                         |
| `retryStrategy`        | When to retry (`on-block`, `on-all-errors`, `never-retry`)       |
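
The options above can be checked locally before a run is started. The following is a minimal sketch, not part of the Actor: `build_input` is a hypothetical helper name, and the allowed values and defaults are copied from the Actor input schema further down this page.

```python
# Allowed values and defaults, copied from the Actor input schema on this page.
ALLOWED = {
    "aiMode": {"true", "false"},  # string flags, not booleans
    "informOnError": {"true", "false"},
    "retryStrategy": {"on-block", "on-all-errors", "never-retry"},
    "loaderSelector": {"networkidle", "domcontentloaded", "load"},
}
DEFAULTS = {
    "aiMode": "false",
    "informOnError": "false",
    "navigationTimeout": 30000,
    "retryStrategy": "on-block",
    "loaderSelector": "networkidle",
    "maxRetries": 5,
}


def build_input(url, **overrides):
    """Hypothetical helper: merge overrides into the defaults and validate them."""
    if not url:
        raise ValueError("url is required")
    config = {"url": url, **DEFAULTS, **overrides}
    # Enumerated options must use one of the documented string values.
    for key, allowed in ALLOWED.items():
        if config[key] not in allowed:
            raise ValueError(f"{key} must be one of {sorted(allowed)}, got {config[key]!r}")
    # Numeric options must be non-negative integers (bool is rejected explicitly).
    for key in ("navigationTimeout", "maxRetries"):
        value = config[key]
        if isinstance(value, bool) or not isinstance(value, int) or value < 0:
            raise ValueError(f"{key} must be a non-negative integer, got {value!r}")
    return config


config = build_input(
    "https://example.com/product-page",
    contentSelector="#product-description",
    aiMode="true",
)
print(config["aiMode"], config["retryStrategy"])
```

Passing a Python boolean such as `aiMode=True` is rejected here on purpose, because the schema declares `aiMode` and `informOnError` as strings.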

***

### 🚀 Getting Started

#### Best Practice: Create Dedicated Tasks

We recommend creating separate tasks for each website you want to monitor. This approach offers several advantages:

- **Optimized Configuration**: Each task can be configured specifically for its target website.
- **Independent Scheduling**: Run different monitors on different schedules as needed.
- **Isolated Error Handling**: If one monitor encounters issues, others continue to function.
- **Better Organization**: Keep your monitoring efforts neatly organized.
- **Easier Maintenance**: Update or modify individual monitors without affecting others.

#### Example Configuration

##### Monitoring a Product Page:

```json
{
    "url": "https://example.com/product-page",
    "contentSelector": "#product-description",
    "sendNotificationTo": "user@example.com",
    "aiMode": "true",
    "loaderSelector": "networkidle",
    "retryStrategy": "on-all-errors"
}
```

***

### 📊 Understanding the Results

After each run, the system provides:

- **Dataset Records**: Detailed information about any changes detected.
- **Key-Value Store**: Current and previous content and screenshots.
- **Email Notifications**: Summaries of changes with screenshots (if configured).

Each output includes:

- The original URL monitored.
- Previous content.
- Current content.
- AI-powered analysis of changes (when enabled).
- Links to before/after screenshots.
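
The record and key names the Actor writes are not listed on this page, so the sketch below makes no assumption about them and simply collects whatever a run stored. `list_run_outputs` is a hypothetical helper; it expects an `ApifyClient` instance from the official Python client and the run dictionary returned by `.call()`.

```python
def list_run_outputs(client, run):
    """Hypothetical helper: gather dataset items and key-value store keys for one run.

    `client` is an apify_client.ApifyClient and `run` is the dict returned by
    client.actor(...).call(...). Nothing is assumed about field or key names.
    """
    # Dataset records describing any detected changes.
    items = list(client.dataset(run["defaultDatasetId"]).iterate_items())
    # Keys of the stored content and screenshots (first page of results only).
    listing = client.key_value_store(run["defaultKeyValueStoreId"]).list_keys()
    keys = [entry["key"] for entry in listing["items"]]
    return items, keys
```

Printing `keys` after a first run is a quick way to discover which records hold the previous and current content and screenshots.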

***

### 🛠️ Advanced Configuration

#### Handling Complex Websites

For websites with complex loading behavior:

```json
{
    "url": "https://complex-site.com/dynamic-page",
    "contentSelector": ".dynamic-content",
    "loaderSelector": "networkidle",
    "navigationTimeout": 60000,
    "retryStrategy": "on-all-errors",
    "maxRetries": 5
}
```

#### Monitoring Multiple Elements

You can create separate tasks to monitor different elements on the same page:

##### Task 1: Monitor Pricing

```json
{
    "url": "https://example.com/product",
    "contentSelector": ".price-container",
    "sendNotificationTo": "pricing@example.com"
}
```

##### Task 2: Monitor Features

```json
{
    "url": "https://example.com/product",
    "contentSelector": ".feature-list",
    "sendNotificationTo": "product@example.com"
}
```
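
The two tasks above could also be created programmatically. This is a rough sketch using the task collection of the official Python client; the task names and the `create_monitor_tasks` helper are hypothetical, and it assumes the API accepts the `username/actor-name` form as `actor_id` (if it does not, use the Actor's ID from Apify Console instead).

```python
ACTOR_ID = "scraping_samurai/ai-website-content-checker"

# The two monitors from the examples above, keyed by a hypothetical task name.
MONITORS = {
    "monitor-pricing": {
        "url": "https://example.com/product",
        "contentSelector": ".price-container",
        "sendNotificationTo": "pricing@example.com",
    },
    "monitor-features": {
        "url": "https://example.com/product",
        "contentSelector": ".feature-list",
        "sendNotificationTo": "product@example.com",
    },
}


def create_monitor_tasks(client, monitors, ai_mode="false"):
    """Create one task per monitor and return the created task objects.

    `client` is an apify_client.ApifyClient instance.
    """
    created = []
    for name, task_input in monitors.items():
        # aiMode is listed as required in the input schema, so set it explicitly.
        full_input = {"aiMode": ai_mode, **task_input}
        created.append(
            client.tasks().create(actor_id=ACTOR_ID, name=name, task_input=full_input)
        )
    return created


# Usage (requires `pip install apify-client` and a valid token):
# from apify_client import ApifyClient
# tasks = create_monitor_tasks(ApifyClient("<YOUR_API_TOKEN>"), MONITORS)
```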

***

### 📈 Scheduling

For optimal monitoring, schedule your tasks to run at appropriate intervals:

- **Critical updates**: Every few hours or daily.
- **Competitor monitoring**: Daily or weekly.
- **Documentation changes**: Weekly.

Remember that more frequent checks may require additional proxy rotation to avoid being blocked by websites.
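
Apify schedules are defined with cron expressions, so the intervals above can be written down as follows. The category names and the exact times are assumptions chosen for illustration, not values prescribed by the Actor.

```python
# Hypothetical mapping from the monitoring categories above to cron expressions.
# Field order: minute, hour, day of month, month, day of week.
SCHEDULES = {
    "critical": "0 */6 * * *",     # every 6 hours, on the hour
    "competitor": "0 8 * * *",     # every day at 08:00
    "documentation": "0 8 * * 1",  # every Monday at 08:00
}


def cron_for(category):
    """Return the cron expression for a monitoring category."""
    try:
        return SCHEDULES[category]
    except KeyError:
        raise ValueError(f"unknown category: {category!r}") from None


print(cron_for("critical"))
```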

***

### 🔧 Troubleshooting

#### Common Issues

- **No content detected**: Check your CSS selector and ensure it targets the right element.
- **Page blocking**: Adjust the retry strategy or decrease check frequency.
- **Timeout errors**: Increase the navigation timeout for slow-loading pages.
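
As an illustration of the last two fixes, the sketch below returns an adjusted copy of an input object. The helper name, the `problem` labels, the doubling step, and the 120-second cap are all assumptions of this example. Switching to `on-all-errors` is one possible adjustment for blocking, chosen because blocked pages are not always recognized as such.

```python
def adjust_after_failure(config, problem, timeout_cap_ms=120000):
    """Hypothetical helper: return a copy of `config` tuned for the next attempt.

    `problem` is either "timeout" or "blocked" (labels invented for this sketch).
    """
    updated = dict(config)  # leave the caller's config untouched
    if problem == "timeout":
        current = updated.get("navigationTimeout", 30000)  # 30000 is the schema default
        updated["navigationTimeout"] = min(current * 2, timeout_cap_ms)
    elif problem == "blocked":
        # Retry on every error so that undetected blocks are retried too.
        updated["retryStrategy"] = "on-all-errors"
    else:
        raise ValueError(f"unknown problem: {problem!r}")
    return updated


base = {"url": "https://example.com/product", "aiMode": "false", "navigationTimeout": 30000}
print(adjust_after_failure(base, "timeout")["navigationTimeout"])
```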

#### Improving Detection Accuracy

- Use specific CSS selectors to target only the content you care about.
- Enable `aiMode` for smarter change detection and filtering.
- Test your selectors in browser developer tools before configuring.

***

### 📞 Support

Need help setting up your monitoring tasks? Submit an issue through the Issues tab.

Start monitoring important content changes with AI-powered analysis today!

# Actor input Schema

## `url` (type: `string`):

URL of a web page to be monitored

## `aiMode` (type: `string`):

When enabled, AI Mode uses AI analysis to compare the entire page, or the content within specific CSS selectors, and produces higher-quality output.

## `contentSelector` (type: `string`):

CSS selector of an area you want to monitor

## `screenshotSelector` (type: `string`):

CSS selector of the area to capture in the screenshot

## `sendNotificationTo` (type: `string`):

Email address where you want to get the notification

## `sendNotificationText` (type: `string`):

Optional text to include in the email notification.

## `informOnError` (type: `string`):

If there is a problem with the selectors on the page, you will receive a notification email
with a screenshot of the page attached.

## `navigationTimeout` (type: `integer`):

Maximum time, in milliseconds, to wait before page navigation times out

## `retryStrategy` (type: `string`):

Sometimes the page doesn't load properly or the Actor gets blocked, so retrying helps. On the other hand, retrying with a wrong selector doesn't help. The recognition of blocked pages is not perfect (about 80%).

## `loaderSelector` (type: `string`):

Which page readiness state should the scraper wait for: `networkidle` or another state? (`networkidle` should cover 90% of use cases.)

## `maxRetries` (type: `integer`):

How many times the Actor should retry in case of an error.

## Actor input object example

```json
{
  "url": "https://www.apify.com/change-log",
  "aiMode": "false",
  "contentSelector": "article",
  "screenshotSelector": "article",
  "sendNotificationText": "ContentChecker found a new change!",
  "informOnError": "false",
  "navigationTimeout": 30000,
  "retryStrategy": "on-block",
  "loaderSelector": "networkidle",
  "maxRetries": 5
}
```

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "url": "https://www.apify.com/change-log",
    "aiMode": "false",
    "contentSelector": "article",
    "screenshotSelector": "article",
    "sendNotificationText": "ContentChecker found a new change!",
    "navigationTimeout": 30000
};

// Run the Actor and wait for it to finish
const run = await client.actor("scraping_samurai/ai-website-content-checker").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "url": "https://www.apify.com/change-log",
    "aiMode": "false",
    "contentSelector": "article",
    "screenshotSelector": "article",
    "sendNotificationText": "ContentChecker found a new change!",
    "navigationTimeout": 30000,
}

# Run the Actor and wait for it to finish
run = client.actor("scraping_samurai/ai-website-content-checker").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```
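
In an unattended monitor it can help to fail loudly when a run does not finish successfully. The following sketch wraps the call shown above; `run_monitor` is a hypothetical helper, and reading the token from an `APIFY_TOKEN` environment variable in the usage comment is an assumption about your setup.

```python
ACTOR_ID = "scraping_samurai/ai-website-content-checker"


def run_monitor(client, run_input):
    """Run the Actor and return the run details, raising if it did not succeed.

    `client` is an apify_client.ApifyClient instance.
    """
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    if run is None:
        raise RuntimeError("The Actor call returned no run details")
    status = run.get("status")
    if status != "SUCCEEDED":
        raise RuntimeError(f"Run {run.get('id')} finished with status {status}")
    return run


# Usage (requires `pip install apify-client`):
# import os
# from apify_client import ApifyClient
# client = ApifyClient(os.environ["APIFY_TOKEN"])
# run = run_monitor(client, {"url": "https://www.apify.com/change-log", "aiMode": "false"})
```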

## CLI example

```bash
echo '{
  "url": "https://www.apify.com/change-log",
  "aiMode": "false",
  "contentSelector": "article",
  "screenshotSelector": "article",
  "sendNotificationText": "ContentChecker found a new change!",
  "navigationTimeout": 30000
}' |
apify call scraping_samurai/ai-website-content-checker --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=scraping_samurai/ai-website-content-checker",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "AI Website Content Checker",
        "description": "AI Content Checker monitors web pages for content changes and provides AI-powered summaries, visual comparisons, and smart notifications. It helps track competitors, compliance updates, and critical information.",
        "version": "0.0",
        "x-build-id": "z7gS5X5GuXDHbJpJJ"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/scraping_samurai~ai-website-content-checker/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-scraping_samurai-ai-website-content-checker",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/scraping_samurai~ai-website-content-checker/runs": {
            "post": {
                "operationId": "runs-sync-scraping_samurai-ai-website-content-checker",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/scraping_samurai~ai-website-content-checker/run-sync": {
            "post": {
                "operationId": "run-sync-scraping_samurai-ai-website-content-checker",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "url",
                    "aiMode"
                ],
                "properties": {
                    "url": {
                        "title": "URL to check",
                        "type": "string",
                        "description": "URL of a web page to be monitored"
                    },
                    "aiMode": {
                        "title": "AI Mode",
                        "enum": [
                            "true",
                            "false"
                        ],
                        "type": "string",
                        "description": "When enabled, AI Mode allows you to compare the entire page or the content within specific CSS selectors using AI analysis, providing better quality of the output.",
                        "default": "false"
                    },
                    "contentSelector": {
                        "title": "Monitored area selector",
                        "type": "string",
                        "description": "CSS selector of an area you want to monitor"
                    },
                    "screenshotSelector": {
                        "title": "Screenshot selector",
                        "type": "string",
                        "description": "CSS selector of a screenshot you want to get"
                    },
                    "sendNotificationTo": {
                        "title": "Email address",
                        "type": "string",
                        "description": "Email address where you want to get the notification"
                    },
                    "sendNotificationText": {
                        "title": "Notification Text",
                        "type": "string",
                        "description": "Optional text to include in the email notification."
                    },
                    "informOnError": {
                        "title": "Notification in case of error",
                        "enum": [
                            "true",
                            "false"
                        ],
                        "type": "string",
                        "description": "In case of the problem with selectors on the page, you will get notification mail \n with the screenshot of the page attached.",
                        "default": "false"
                    },
                    "navigationTimeout": {
                        "title": "Navigation Timeout",
                        "type": "integer",
                        "description": "How long it should wait, in milliseconds, until the page times out",
                        "default": 30000
                    },
                    "retryStrategy": {
                        "title": "How to retry",
                        "enum": [
                            "on-block",
                            "on-all-errors",
                            "never-retry"
                        ],
                        "type": "string",
                        "description": "Sometimes the page doesn't load properly or the actor gets blocked so retrying those helps. On the other hand retrying wrong selector doesn't help. The recognition of blocked pages is not perfect (about 80%).",
                        "default": "on-block"
                    },
                    "loaderSelector": {
                        "title": "Until which page readiness status should the scraper wait?",
                        "enum": [
                            "networkidle",
                            "domcontentloaded",
                            "load"
                        ],
                        "type": "string",
                        "description": "Until which page readiness status should the scraper wait — networkidle or another state (networkidle should cover 90% of use cases)?",
                        "default": "networkidle"
                    },
                    "maxRetries": {
                        "title": "Maximum number of retries",
                        "type": "integer",
                        "description": "How many times the actor should retry in case of error.",
                        "default": 5
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
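
For projects without a client library, the `run-sync-get-dataset-items` endpoint from the specification above can be called with the Python standard library alone. This is a minimal sketch: `build_sync_request` is a hypothetical helper that only builds the request, and it sends the token in an `Authorization: Bearer` header rather than the `token` query parameter listed in the specification (the Apify API accepts both).

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "scraping_samurai~ai-website-content-checker"


def build_sync_request(token, actor_input):
    """Build (but do not send) a POST request that runs the Actor synchronously."""
    url = f"{API_BASE}/acts/{ACTOR_PATH}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(actor_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


request = build_sync_request(
    "<YOUR_API_TOKEN>",
    {"url": "https://www.apify.com/change-log", "aiMode": "false"},
)
print(request.get_method(), request.full_url)

# To send it and read the dataset items (needs a real token and network access):
# with urllib.request.urlopen(request) as response:
#     items = json.load(response)
```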
