# SaaS Pricing Page Scraper (`aetherlabx/saas-pricing-page-scraper`) Actor

Extract accurate, dynamically rendered pricing data from any SaaS website. Built with Playwright to bypass JS loading screens. Supports custom CSS selectors. Perfect for competitor intelligence, n8n workflows, and feeding clean text to AI Agents (LLMs) for pricing analysis.

- **URL**: https://apify.com/aetherlabx/saas-pricing-page-scraper.md
- **Developed by:** [Aether](https://apify.com/aetherlabx) (community)
- **Categories:** Developer tools, Automation, E-commerce
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

from $2.90 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
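
As a rough worked example of the arithmetic, the sketch below estimates a lower-bound cost from the listed "from" rate. It assumes every result is billed at exactly $2.90 per 1,000; the real per-event prices may be higher, and `estimate_cost_usd` is a hypothetical helper, not part of the Actor or the Apify API.

```python
from decimal import Decimal

# Assumption: each result is billed at the listed "from" rate of $2.90 per 1,000 results.
# Actual pay-per-event prices can differ by plan and event type.
PRICE_PER_1000_RESULTS = Decimal("2.90")


def estimate_cost_usd(result_count: int) -> Decimal:
    """Return a rough lower-bound cost in USD for `result_count` results."""
    if result_count < 0:
        raise ValueError("result_count must be non-negative")
    cost = PRICE_PER_1000_RESULTS * result_count / 1000
    # Round to a hundredth of a cent for display.
    return cost.quantize(Decimal("0.0001"))


# One pricing page scraped daily for 30 days produces 30 results.
print(estimate_cost_usd(30))
```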

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows (PowerShell)
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## SaaS Pricing Monitor Actor

A production-grade Apify Actor built for scraping dynamically rendered SaaS pricing pages. Powered by Playwright, Crawlee, and the Apify platform, it extracts structured data and full-page screenshots from any pricing page—regardless of how JavaScript-heavy it is.

### Why This Actor?

Most SaaS companies render their pricing behind JavaScript frameworks (React, Vue, Next.js). Traditional HTTP scrapers fail silently on these pages, returning empty bodies or loading spinners. This Actor solves that problem with a full Playwright browser engine that executes JavaScript, waits for network idle, and captures the fully rendered page exactly as a human sees it.

#### Key Advantages

| Feature | Benefit |
|---|---|
| **Full Browser Rendering** | Handles React, Vue, Angular, Next.js, and any JS-rendered page. No more blank responses from `curl` or `fetch`. |
| **Full-Page Screenshots** | Every crawl produces a high-resolution PNG screenshot automatically stored to Apify cloud storage with a permanent public URL. Perfect for visual verification and archival. |
| **Smart Selector Extraction** | Target specific pricing plan containers, feature comparison tables, or any DOM element with a CSS selector. Falls back gracefully to full-body extraction when no selector is provided or the selector is not found. |
| **Structured Output for n8n / Make / Zapier** | Every result is pushed as clean JSON to the Apify Dataset, making it instantly consumable by automation platforms. |
| **AI & LLM Ready** | Clean `extractedText` and `extractedHtml` fields are ready to feed into GPT, Claude, or any LLM for pricing analysis, competitive intelligence, and trend detection. |
| **Zero-Config Robustness** | Built-in retries (up to 2 per failed request), network idle timeout handling, selector not-found fallback, and comprehensive error reporting per URL. |
| **Docker-Based Deployment** | Runs on Apify's official `apify/actor-node-playwright-chrome:20` image with all Playwright browsers pre-installed. No setup required. |
| **Single-File Architecture** | The entire Actor logic lives in one clean, readable file (`src/main.js`). Easy to audit, modify, and extend. |

---

### How It Works

````
Input (URL + optional selector)
               │
               ▼
┌──────────────────────────────┐
│  Playwright Browser Launch   │
│  (Full Chromium engine)      │
└──────────────┬───────────────┘
               │
               ▼
┌──────────────────────────────┐
│  Page Load & Render          │
│  • Wait for network idle     │
│  • Extra 2s settling time    │
│  • Handle selector lookup    │
└──────────────┬───────────────┘
               │
               ▼
┌──────────────────────────────┐
│  Data Extraction             │
│  • HTML (from selector/body) │
│  • Plain text (InnerText)    │
│  • Page title                │
└──────────────┬───────────────┘
               │
               ▼
┌──────────────────────────────┐
│  Screenshot Capture          │
│  • Full-page PNG screenshot  │
│  • Stored in Apify KV Store  │
│  • Public URL generated      │
└──────────────┬───────────────┘
               │
               ▼
┌──────────────────────────────┐
│  Output to Apify Dataset     │
│  • Structured JSON           │
│  • Ready for n8n / API       │
└──────────────────────────────┘
````

---

### Input

The Actor accepts two fields via Apify's input schema:

| Field | Type | Required | Description |
|---|---|---|---|
| `url` | `string` | **Yes** | The full URL of the SaaS pricing page to scrape. |
| `selector` | `string` | No | A CSS selector to target a specific section (e.g., `.pricing-container`, `#plans`, `.plan-grid`). Leave empty to scrape the full page body. |

#### Example Input

```json
{
  "url": "https://www.notion.so/pricing",
  "selector": ""
}
```

With a specific selector:

```json
{
  "url": "https://www.linear.app/pricing",
  "selector": ".pricing-plans"
}
```
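
A caller-side check of these two fields might look like the sketch below. It is written in Python purely for illustration; `normalize_input` is a hypothetical helper rather than part of the Actor, and the http(s) check is an assumption about what "full URL" means here.

```python
from urllib.parse import urlparse


def normalize_input(raw: dict) -> dict:
    """Validate the two documented fields and apply the documented default."""
    url = raw.get("url")
    if not isinstance(url, str) or not url.strip():
        raise ValueError("'url' is required and must be a non-empty string")
    url = url.strip()

    # Assumption: a "full URL" means an absolute http(s) URL with a host.
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"'url' must be an absolute http(s) URL, got {url!r}")

    selector = raw.get("selector") or ""
    if not isinstance(selector, str):
        raise ValueError("'selector' must be a string")

    # An empty selector means "scrape the full page body".
    return {"url": url, "selector": selector.strip() or "body"}


print(normalize_input({"url": "https://www.notion.so/pricing", "selector": ""}))
```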

***

### Output

Each crawl pushes a structured JSON object to the Apify Dataset:

| Field | Type | Description |
|---|---|---|
| `url` | `string` | The crawled URL |
| `title` | `string` | Page `<title>` tag content |
| `selector` | `string` | The CSS selector used (defaults to `"body"`) |
| `extractedHtml` | `string` | Raw HTML of the extracted section |
| `extractedText` | `string` | Plain text content (innerText) of the extracted section |
| `screenshotUrl` | `string` | Permanent public URL of the full-page screenshot |
| `crawledAt` | `string` | ISO 8601 timestamp of the crawl |
| `error` | `string` | Error message (only present on failure) |

#### Example Output

```json
{
  "url": "https://www.notion.so/pricing",
  "title": "Notion Pricing – Plans & Features",
  "selector": "body",
  "extractedHtml": "<div class=\"pricing-page\">...</div>",
  "extractedText": "Free\n$0/month\nPlus\n$10/month\nBusiness\n$18/month\nEnterprise\nContact sales",
  "screenshotUrl": "https://api.apify.com/v2/key-value-stores/xxx/records/pricing-screenshot-1712345678901.png",
  "crawledAt": "2025-05-24T10:00:00.000Z"
}
```
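
For downstream code, a dataset item with the fields above could be mapped onto a typed record, as in this sketch. `PricingResult` and `parse_item` are hypothetical names, and the sketch assumes that error entries may omit any field except `error`, which the README does not spell out.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class PricingResult:
    url: str
    title: Optional[str]
    selector: Optional[str]
    extracted_html: Optional[str]
    extracted_text: Optional[str]
    screenshot_url: Optional[str]
    crawled_at: Optional[datetime]
    error: Optional[str]

    @property
    def ok(self) -> bool:
        """True when the item carries no `error` field."""
        return self.error is None


def parse_item(item: dict) -> PricingResult:
    """Convert one dataset item (a plain dict) into a PricingResult."""
    crawled = item.get("crawledAt")
    return PricingResult(
        url=item.get("url", ""),
        title=item.get("title"),
        selector=item.get("selector"),
        extracted_html=item.get("extractedHtml"),
        extracted_text=item.get("extractedText"),
        screenshot_url=item.get("screenshotUrl"),
        # Older Python versions reject a trailing "Z", so swap it for an explicit offset.
        crawled_at=datetime.fromisoformat(crawled.replace("Z", "+00:00")) if crawled else None,
        error=item.get("error"),
    )
```

Checking `result.ok` before reading `extracted_text` lets a workflow skip failed URLs without raising.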

***

### Use Cases

#### 1. Competitor Pricing Intelligence

Schedule this Actor to scrape your competitors' pricing pages daily. The structured `extractedText` output makes it trivial to detect plan changes, price adjustments, and new feature tiers.
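
One way to detect such changes is a line-level diff between the `extractedText` of two runs. The sketch below uses Python's standard `difflib`; `pricing_changes` is a hypothetical helper, and storing the previous snapshot is left to the caller.

```python
import difflib


def pricing_changes(previous_text: str, current_text: str) -> list:
    """Return removed ("- ...") and added ("+ ...") lines between two snapshots."""
    diff = difflib.ndiff(previous_text.splitlines(), current_text.splitlines())
    # ndiff also emits unchanged ("  ") and hint ("? ") lines; keep only real changes.
    return [line for line in diff if line.startswith(("+ ", "- "))]


yesterday = "Free\n$0/month\nPlus\n$10/month"
today = "Free\n$0/month\nPlus\n$12/month"
for change in pricing_changes(yesterday, today):
    print(change)
```

An empty list means the extracted text is unchanged, so an alert can be skipped.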

#### 2. n8n Automation Workflows

Connect the Actor to n8n via the official Apify integration. Trigger actions when pricing changes are detected—send Slack alerts, update spreadsheets, or feed data into internal dashboards.

#### 3. AI-Powered Pricing Analysis

Feed the `extractedText` output directly into LLMs (GPT-4, Claude, Gemini) to generate competitive analysis reports, pricing strategy recommendations, and market positioning insights.
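
Before sending text to an LLM, it can help to pull out candidate price strings so the prompt stays short. The sketch below is a simple heuristic that only recognizes US-dollar amounts; `PRICE_RE` and `find_prices` are hypothetical names, not fields or functions provided by the Actor.

```python
import re

# Heuristic: a dollar amount, optionally followed by a "/period" suffix such as "/month".
PRICE_RE = re.compile(r"\$\d[\d,]*(?:\.\d+)?(?:\s*/\s*[A-Za-z]+)?")


def find_prices(extracted_text: str) -> list:
    """Return every dollar-denominated price string found in the text, in order."""
    return PRICE_RE.findall(extracted_text)


sample = "Free\n$0/month\nPlus\n$10/month\nBusiness\n$18/month\nEnterprise\nContact sales"
print(find_prices(sample))
```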

#### 4. Screenshot Archival

Every crawl generates a permanent, publicly accessible screenshot URL. Build a visual timeline of pricing page changes to track redesigns and messaging shifts over time.

#### 5. Multi-Page Batch Scraping

The input schema accepts a single `url` per run, so to cover dozens of competitors, start one run per URL, for example through the Apify API or a schedule. Combine the resulting datasets to build fully automated competitor monitoring pipelines.
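
Because the documented input takes one `url`, a batch can be sketched as one run per URL. In the sketch below, `build_inputs` and `run_batch` are hypothetical helpers, and `run_actor` is an injected callable (an assumption) so the loop stays independent of any particular client.

```python
from typing import Callable, Iterable, List


def build_inputs(urls: Iterable[str], selector: str = "") -> List[dict]:
    """Build one Actor input per URL, dropping blanks and duplicates in order."""
    unique = dict.fromkeys(u.strip() for u in urls if u.strip())
    return [{"url": u, "selector": selector} for u in unique]


def run_batch(
    urls: Iterable[str],
    run_actor: Callable[[dict], List[dict]],
    selector: str = "",
) -> List[dict]:
    """Call `run_actor` once per input and collect all returned dataset items."""
    items: List[dict] = []
    for actor_input in build_inputs(urls, selector):
        items.extend(run_actor(actor_input))
    return items
```

With the official Python client, `run_actor` could wrap the `call(run_input=...)` and `iterate_items()` calls shown in the Python example in the API section.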

***

### Project Structure

```
saas-pricing-monitor-actor/
├── Dockerfile              # Apify-ready Docker image (Playwright + Chrome pre-installed)
├── INPUT_SCHEMA.json       # Apify input field definitions
├── package.json            # Node.js dependencies (apify, crawlee, playwright)
├── src/
│   └── main.js             # Core Actor logic (entry point)
├── storage/
│   └── key_value_stores/
│       └── default/
│           └── INPUT.json  # Local development input sample
└── .actor/
    └── actor.json          # Apify Actor specification
```

### Tech Stack

| Technology | Purpose |
|---|---|
| **[Apify](https://apify.com)** | Actor runtime, Dataset storage, Key-Value Store for screenshots |
| **[Crawlee](https://crawlee.dev)** | Web scraping framework with built-in retry logic and request management |
| **[Playwright](https://playwright.dev)** | Full Chromium browser engine for rendering JavaScript-heavy pages |
| **Node.js** | Runtime environment (ES modules) |

### Running Locally

```bash
# Install dependencies
npm install

# Set up your input (written to storage/key_value_stores/default/INPUT.json)
cat > storage/key_value_stores/default/INPUT.json <<'EOF'
{
  "url": "https://www.notion.so/pricing",
  "selector": ""
}
EOF

# Run the Actor
npm start
```

### Deploying to Apify

1. Install the [Apify CLI](https://docs.apify.com/cli/):
   ```bash
   npm install -g apify-cli
   ```

2. Log in to your Apify account:
   ```bash
   apify login
   ```

3. Push the Actor to Apify:
   ```bash
   apify push
   ```

4. Run it from the Apify Console, API, or n8n integration.

### Error Handling

The Actor is designed to fail gracefully:

- **Network idle timeout**: If the page takes too long to stabilize, the Actor proceeds with whatever content is loaded.
- **Selector not found**: Falls back to extracting the full `<body>` content instead of crashing.
- **Extraction failures**: Catches errors per-URL, logs them, and pushes an error entry to the Dataset so downstream workflows can handle failures without breaking.
- **Automatic retries**: Failed requests are retried up to 2 times before giving up.
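
The selector fallback and per-URL error capture described above can be sketched as follows. This is not the Actor's code (which lives in `src/main.js` and is not shown here); it is a Python illustration in which `PageLike`, `resolve_target`, and `scrape_with_retries` are hypothetical names, and "retried up to 2 times" is assumed to mean one initial attempt plus two retries.

```python
from typing import Any, Callable, Optional, Protocol, Tuple

MAX_RETRIES = 2  # assumption: 1 initial attempt + 2 retries


class PageLike(Protocol):
    """Minimal page interface; Playwright's sync `Page.query_selector` fits it."""

    def query_selector(self, selector: str) -> Optional[Any]: ...


def resolve_target(page: PageLike, selector: str = "") -> Tuple[Any, str]:
    """Return (element, selector_used), falling back to <body> when needed."""
    if selector:
        element = page.query_selector(selector)
        if element is not None:
            return element, selector
    # No selector given, or nothing matched: use the whole page body.
    return page.query_selector("body"), "body"


def scrape_with_retries(url: str, scrape: Callable[[str], dict]) -> dict:
    """Run `scrape(url)`, retrying on failure, and return an error entry if all fail."""
    last_error: Optional[Exception] = None
    for _ in range(1 + MAX_RETRIES):
        try:
            return scrape(url)
        except Exception as exc:  # remember the failure and try again
            last_error = exc
    # Every attempt failed: report the error as data instead of raising.
    return {"url": url, "error": str(last_error)}
```

Returning an error entry rather than raising keeps one bad URL from stopping the rest of a pipeline.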

### Notes

- The maximum request handler timeout is set to 120 seconds to accommodate slow-loading pricing pages with heavy JavaScript.
- Screenshots are stored in the Apify Key-Value Store with a timestamp-based filename for uniqueness.
- The Actor is designed to be integrated into larger automation pipelines—all output is structured and machine-readable.
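
The timestamp-based filename can be illustrated with a short sketch. The key pattern and record URL shape below are inferred from the example output (`pricing-screenshot-1712345678901.png`), not confirmed against the Actor's source, and `screenshot_key` and `record_url` are hypothetical helpers.

```python
import time
from typing import Optional


def screenshot_key(timestamp_ms: Optional[int] = None) -> str:
    """Build a key like 'pricing-screenshot-<epoch milliseconds>.png'."""
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)
    return f"pricing-screenshot-{timestamp_ms}.png"


def record_url(store_id: str, key: str) -> str:
    """Build the key-value store record URL in the shape shown in the example output."""
    return f"https://api.apify.com/v2/key-value-stores/{store_id}/records/{key}"


print(record_url("xxx", screenshot_key(1712345678901)))
```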

### License

MIT

# Actor input Schema

## `url` (type: `string`):

The SaaS pricing page URL to scrape.

## `selector` (type: `string`):

CSS selector to extract a specific pricing section (leave empty to scrape the full body).

## Actor input object example

```json
{
  "url": "https://notion.so/pricing",
  "selector": ".pricing-container"
}
```

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "url": "https://notion.so/pricing",
    "selector": ".pricing-container"
};

// Run the Actor and wait for it to finish
const run = await client.actor("aetherlabx/saas-pricing-page-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "url": "https://notion.so/pricing",
    "selector": ".pricing-container",
}

# Run the Actor and wait for it to finish
run = client.actor("aetherlabx/saas-pricing-page-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "url": "https://notion.so/pricing",
  "selector": ".pricing-container"
}' |
apify call aetherlabx/saas-pricing-page-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=aetherlabx/saas-pricing-page-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "SaaS Pricing Page Scraper",
        "description": "Extract accurate, dynamically rendered pricing data from any SaaS website. Built with Playwright to bypass JS loading screens. Supports custom CSS selectors. Perfect for competitor intelligence, n8n workflows, and feeding clean text to AI Agents (LLMs) for pricing analysis.",
        "version": "1.0",
        "x-build-id": "JoDzNt7lLrrJudaR9"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/aetherlabx~saas-pricing-page-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-aetherlabx-saas-pricing-page-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/aetherlabx~saas-pricing-page-scraper/runs": {
            "post": {
                "operationId": "runs-sync-aetherlabx-saas-pricing-page-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/aetherlabx~saas-pricing-page-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-aetherlabx-saas-pricing-page-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "url"
                ],
                "properties": {
                    "url": {
                        "title": "URL",
                        "type": "string",
                        "description": "The SaaS pricing page URL to scrape."
                    },
                    "selector": {
                        "title": "CSS Selector",
                        "type": "string",
                        "description": "CSS selector to extract a specific pricing section (leave empty to scrape the full body)."
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
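
For projects without an Apify client, the `run-sync-get-dataset-items` path from the specification above can be called directly over HTTP. The sketch below uses only the Python standard library; `build_run_sync_request` and `run_sync` are hypothetical helpers, the 300-second timeout is an arbitrary client-side choice, and the response is assumed to be a JSON array of dataset items.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR = "aetherlabx~saas-pricing-page-scraper"


def build_run_sync_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request to run-sync-get-dataset-items."""
    query = urllib.parse.urlencode({"token": token})
    url = f"{API_BASE}/acts/{ACTOR}/run-sync-get-dataset-items?{query}"
    body = json.dumps(actor_input).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sync(token: str, actor_input: dict, timeout: float = 300) -> list:
    """Send the request and return the parsed JSON response (assumed to be a list)."""
    request = build_run_sync_request(token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `run_sync("<YOUR_API_TOKEN>", {"url": "https://notion.so/pricing"})` performs a real, billable run, so the request builder is kept separate for inspection and testing.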
