# Content Gap Analyzer (`fabian_maume/content-gap-analyzer`) Actor

Identifies topics, insights, and questions that top-ranking competitors cover but your blog post misses. Runs a Google search, classifies each competitor page, extracts key insights via LLM, and reports what's missing from your own article.

- **URL**: https://apify.com/fabian_maume/content-gap-analyzer.md
- **Developed by:** [Fabian Maume](https://apify.com/fabian_maume) (community)
- **Categories:** SEO tools
- **Stats:** 2 total users, 1 monthly user, 87.5% runs succeeded
- **User rating**: No ratings yet

## Pricing

Pay per usage

This Actor is billed per platform usage. The Actor itself is free to use; you pay only for the Apify platform usage it consumes, which gets cheaper on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-usage

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Content Gap Analyzer

Did you have a great blog post that suddenly stopped bringing in traffic? That might be a matter of content gap. Check in Google Search Console which keywords your blog post was ranking for, then run the Content Gap Analyzer to identify which insights are missing from your blog post.

### What does Content Gap Analyzer do?

**Content Gap Analyzer** identifies the topics, insights, and questions that competitors cover but your blog post misses.

Point it at one of your articles and a target keyword - the Actor pulls the top-ranking pages from Google, classifies each one by content type, extracts the key insights, and reports back which insights are absent from your own page.

Like any Apify Actor, you can run it on a schedule, integrate via the Apify API, or hook into Make, Zapier, and other automation tools.

### Why use Content Gap Analysis?

Search engines rank pages on topical depth. If a competitor covers an angle, statistic, or question that your article doesn't, you lose ground in the SERP - even when your writing is stronger. Doing this audit manually means opening ten tabs, extracting the bullet points by hand, and cross-referencing them against your own draft. This Actor automates the entire loop.

Key features:

- Runs a **Google search** for each keyword and pulls the top organic results
- **Classifies every competitor page** as one of 13 content types (Guide, Listicle, Landing page, Comparison, Alternatives, Pricing page, Case study, …)
- Extracts **up to 20 key insights per page** using LLM analysis
- Shortlists **missing insights**: what competitors cover but your article doesn't
- Surfaces **question-style headings** from competitor articles, ready to add into your FAQ
- Reports **where your article ranks** for each target keyword


### What data can Content Gap Analyzer extract?

The main dataset contains one row per competitor page:

| Field | Type | Description |
|-------|------|-------------|
| `url` | string | Competitor page URL |
| `ownArticle` | boolean | `true` if the row is your own blog post |
| `contentType` | string | LLM classification (Guide, Listicle, Landing page, …) |
| `confidence` | number | Classification confidence (0–1) |
| `contentInsight` | array | Up to 20 key insights extracted from the page |
| `contentLength` | number | Length of the page markdown (characters) |
| `OrganicAppearance` | array | Keywords and Google rank where the page appeared |
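
As a rough illustration of working with these rows, the sketch below tallies competitor pages by `contentType`, skipping your own article and low-confidence classifications. The helper name `content_type_mix` and the 0.5 cutoff are illustrative assumptions, not part of the Actor, and the sample rows are hand-written.

```python
from collections import Counter


def content_type_mix(rows, min_confidence=0.5):
    """Count competitor pages per content type.

    `rows` is a list of dataset items shaped like the table above.
    `min_confidence` is an arbitrary, illustrative cutoff.
    """
    counts = Counter()
    for row in rows:
        if row.get("ownArticle"):
            continue  # skip your own blog post
        if row.get("confidence", 0) < min_confidence:
            continue  # ignore shaky classifications
        counts[row.get("contentType", "Unknown")] += 1
    return counts


# Hand-written sample rows, not real Actor output.
sample_rows = [
    {"url": "https://a.example/guide", "ownArticle": False, "contentType": "Guide", "confidence": 0.92},
    {"url": "https://b.example/top-10", "ownArticle": False, "contentType": "Listicle", "confidence": 0.81},
    {"url": "https://c.example/howto", "ownArticle": False, "contentType": "Guide", "confidence": 0.30},
    {"url": "https://me.example/post", "ownArticle": True, "contentType": "Guide", "confidence": 0.95},
]

print(content_type_mix(sample_rows).most_common())
```

A mix dominated by one content type can hint at the format searchers expect for that keyword.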

Three auxiliary records are produced in the default key-value store:

- **`MISSING_INSIGHTS`** - insights present in competitors but absent from your article
- **`Question`** - question-style headings (H1–H6) from competitor articles
- **`SUB_DATASETS`** - links to the raw datasets produced by Google Search Scraper and Website Content Crawler, so you can drill down into the upstream data
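
A minimal sketch of reading these records with the Python client, assuming the record keys are spelled exactly as listed above and that the store ID comes from the run's `defaultKeyValueStoreId`. The helper `read_aux_records` is a hypothetical name; it takes an already-constructed `ApifyClient` and relies on `key_value_store(...).get_record(...)`, which returns `None` when a key is absent.

```python
AUX_KEYS = ("MISSING_INSIGHTS", "Question", "SUB_DATASETS")


def read_aux_records(client, store_id, keys=AUX_KEYS):
    """Return {key: value} for each auxiliary record, or None if a key is missing.

    `client` is expected to be an apify_client.ApifyClient instance.
    """
    store = client.key_value_store(store_id)
    records = {}
    for key in keys:
        record = store.get_record(key)  # dict with a "value" entry, or None
        records[key] = record["value"] if record else None
    return records


# Usage (needs a real token and a finished run):
# from apify_client import ApifyClient
# client = ApifyClient("<YOUR_API_TOKEN>")
# aux = read_aux_records(client, run["defaultKeyValueStoreId"])
# print(aux["MISSING_INSIGHTS"])
```

Passing the client in as a parameter keeps the helper easy to exercise with a stub before spending credits on a real run.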

### How to run a content gap analysis

1. Open the Actor in Apify Console.
2. Paste the URL of the blog post you want to analyze into **Blog post URL**.
3. Enter your target **Keyword** - one query, or several separated by line breaks.
4. Set **Number of organic results per keyword** (default: 10).
5. Click **Save & Start**.
6. When the run finishes, open the **Output** tab to see the competitor landscape table, missing insights, questions, and links to the sub-Actor datasets.

### How much will it cost to run Content Gap Analyzer?

The Actor is billed on compute usage and passes through the cost of the sub-Actors it calls - [Google Search Scraper](https://apify.com/apify/google-search-scraper), [Website Content Crawler](https://apify.com/apify/website-content-crawler) - plus LLM tokens consumed via the OpenRouter integration. The actual run cost depends on:

- **Number of organic results** - drives Google Search Scraper and Website Content Crawler cost
- **Total length of competitor markdown** - drives LLM token cost
- **Memory allocated to Website Content Crawler** - the Actor auto-requests the maximum power-of-2 RAM available on your account, so larger plans crawl faster
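
The "maximum power-of-2 RAM" rule amounts to rounding the available memory down to a power of two. The sketch below shows only that arithmetic; the function name is hypothetical, and how the Actor determines the available memory on your account is not covered here.

```python
def largest_power_of_two_at_most(available_mb: int) -> int:
    """Round `available_mb` down to the nearest power of two."""
    if available_mb < 1:
        raise ValueError("available_mb must be a positive integer")
    # bit_length() - 1 is the exponent of the highest set bit.
    return 1 << (available_mb.bit_length() - 1)


for available in (4096, 6000, 32768):
    print(available, "->", largest_power_of_two_at_most(available))
```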

You can cap the total spend per run with the `maxTotalChargeUsd` option when starting the Actor. The orchestrator checks the remaining budget before each sub-Actor call and refuses to overspend.
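
The budget check described above can be pictured roughly as follows. This is a hypothetical sketch, not the Actor's source: the `Budget` class, its method names, and the cost estimates are all invented for illustration.

```python
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised when a step would push spending past the cap."""


@dataclass
class Budget:
    cap_usd: float          # plays the role of maxTotalChargeUsd
    spent_usd: float = 0.0

    @property
    def remaining_usd(self) -> float:
        return max(self.cap_usd - self.spent_usd, 0.0)

    def reserve(self, estimated_cost_usd: float) -> None:
        """Refuse a step whose estimated cost exceeds the remaining budget."""
        if estimated_cost_usd > self.remaining_usd:
            raise BudgetExceeded(
                f"estimated ${estimated_cost_usd:.2f} exceeds "
                f"remaining ${self.remaining_usd:.2f}"
            )
        self.spent_usd += estimated_cost_usd


budget = Budget(cap_usd=1.00)
budget.reserve(0.25)      # e.g. the search step (made-up estimate)
budget.reserve(0.50)      # e.g. the crawl step (made-up estimate)
try:
    budget.reserve(0.50)  # would exceed the cap, so it is refused
except BudgetExceeded as err:
    print("Skipped:", err)
print(f"Remaining: ${budget.remaining_usd:.2f}")
```

Checking before each step, rather than after, is what keeps a run from overshooting the cap.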

### Input

See the **Input** tab for the full schema. Three fields drive the run:

- **`BlogUrls`**: the article you want to analyze. Only the first URL is used.
- **`keywords`**: the target search query. Separate multiple queries with line breaks.
- **`organicResultCount`**: number of organic Google results to analyze per keyword. Defaults to 10. The value is rounded up to the nearest page of 10.
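
A small sketch of assembling this input in Python. `build_input` and `effective_result_count` are hypothetical helpers; the rounding function is one reading of "rounded up to the nearest page of 10" and assumes a page size of 10.

```python
import math


def build_input(blog_url, keywords, organic_result_count=10):
    """Assemble an input object with the three fields described above."""
    if isinstance(keywords, str):
        keywords = [keywords]
    return {
        "BlogUrls": [{"url": blog_url}],  # only the first URL is used
        # Multiple queries are joined with line breaks.
        "keywords": "\n".join(k.strip() for k in keywords if k.strip()),
        "organicResultCount": organic_result_count,
    }


def effective_result_count(requested, page_size=10):
    """Round a requested count up to a whole results page."""
    return math.ceil(requested / page_size) * page_size


payload = build_input(
    "https://example.com/my-post",
    ["web scraping cheerio", "cheerio vs jsdom"],
    organic_result_count=15,
)
print(payload)
print(effective_result_count(15))
```

Under that reading, a request for 15 results analyzes as many pages as a request for 20, so rounding your own value to a multiple of 10 makes the cost easier to predict.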

### Output

You can download the dataset in various formats such as JSON, HTML, CSV, or Excel. A simplified output row looks like:

```json
{
  "url": "https://competitor.com/guide-to-x",
  "ownArticle": false,
  "contentType": "Guide",
  "confidence": 0.92,
  "contentInsight": [
    "Cheerio is up to 8x faster than jsdom for HTML parsing.",
    "Cheerio does not execute JavaScript - use Playwright for SPAs."
  ],
  "contentLength": 18342,
  "OrganicAppearance": [
    { "keyword": "web scraping cheerio", "rank": 2 }
  ]
}
````
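
To illustrate how `ownArticle` and `OrganicAppearance` combine, the sketch below finds your article's rank for one keyword and lists the competitor URLs ranked above it. `rank_summary` is a hypothetical helper, and the sample rows are hand-written, not real Actor output.

```python
def rank_summary(rows, keyword):
    """Return (own_rank, urls_ranked_above) for one keyword.

    own_rank is None if your article did not appear for that keyword;
    in that case every competitor found for the keyword is listed.
    """
    def rank_for(row):
        for hit in row.get("OrganicAppearance", []):
            if hit.get("keyword") == keyword:
                return hit.get("rank")
        return None

    own_rank = None
    competitors = []
    for row in rows:
        rank = rank_for(row)
        if rank is None:
            continue  # page did not appear for this keyword
        if row.get("ownArticle"):
            own_rank = rank
        else:
            competitors.append((rank, row["url"]))

    competitors.sort()  # best rank first
    if own_rank is None:
        return None, [url for _, url in competitors]
    return own_rank, [url for rank, url in competitors if rank < own_rank]


sample_rows = [
    {"url": "https://competitor.com/guide-to-x", "ownArticle": False,
     "OrganicAppearance": [{"keyword": "web scraping cheerio", "rank": 2}]},
    {"url": "https://other.example/cheerio-tips", "ownArticle": False,
     "OrganicAppearance": [{"keyword": "web scraping cheerio", "rank": 7}]},
    {"url": "https://me.example/my-post", "ownArticle": True,
     "OrganicAppearance": [{"keyword": "web scraping cheerio", "rank": 5}]},
]

print(rank_summary(sample_rows, "web scraping cheerio"))
```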

Auxiliary outputs in the key-value store:

```json
// MISSING_INSIGHTS
[
  "Cheerio's selector engine supports a subset of CSS4.",
  "Most modern scraping stacks now pair Cheerio with a queue-based crawler."
]

// SUB_DATASETS
[
  { "name": "Google Search Scraper",   "resultUrl": "https://console.apify.com/storage/datasets/abc123" },
  { "name": "Website Content Crawler", "resultUrl": "https://console.apify.com/storage/datasets/def456" }
]
```
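
To drill into the upstream data programmatically, you need the dataset IDs rather than the Console links. The sketch below assumes each `resultUrl` ends with the dataset ID, as in the example above; `dataset_id_from_result_url` is a hypothetical helper.

```python
from urllib.parse import urlparse


def dataset_id_from_result_url(result_url):
    """Take the last path segment of a Console dataset URL as the dataset ID."""
    path = urlparse(result_url).path.rstrip("/")
    return path.rsplit("/", 1)[-1]


# Copied from the SUB_DATASETS example above.
sub_datasets = [
    {"name": "Google Search Scraper", "resultUrl": "https://console.apify.com/storage/datasets/abc123"},
    {"name": "Website Content Crawler", "resultUrl": "https://console.apify.com/storage/datasets/def456"},
]

dataset_ids = {d["name"]: dataset_id_from_result_url(d["resultUrl"]) for d in sub_datasets}
print(dataset_ids)
# → {'Google Search Scraper': 'abc123', 'Website Content Crawler': 'def456'}
```

Each ID can then be passed to `client.dataset(<id>).iterate_items()`, the same call used in the Python example in the API section.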

### Tips

- **Use one focused long-tail keyword.** Content gaps are keyword-specific, so broad keyword sets dilute the analysis.
- **Lower `organicResultCount` for faster, cheaper runs.** Ten results capture the dominant topics for most queries.
- **Cap your run cost.** Set `maxTotalChargeUsd` when starting the Actor - the orchestrator allocates the remaining budget across sub-Actors and stops when the cap is reached.
- **Inspect upstream data via `SUB_DATASETS`.** If a result looks wrong, the linked Google Search Scraper and Website Content Crawler datasets show exactly what was fed into the LLM.

### FAQ

#### Why does my own URL appear in the competitor landscape?

That's by design: including your own page lets the LLM compare it against competitors on equal footing. The `ownArticle` field flags which row is yours.

#### The "missing insights" list is empty.

Either your article already covers the competitor topics (good!) or the competitors didn't yield enough usable text. Check the **Sub-Actor datasets** tab to see what Website Content Crawler returned.

#### Can I use this without an LLM?

No. Classification and insight extraction both rely on the OpenRouter integration, and LLM cost is part of the run cost.
However, this Actor is open source, so you can remove the LLM logic and build your own custom Actor.

#### What is the content gap analysis tool?

A tool that compares your content with your competitors'. It typically has two steps:

- Data extraction
- Content analysis with LLM

#### What is content gap analysis?

A content gap analysis is the process of identifying missing topics, keywords, or formats that your audience is searching for, but that your website currently fails to cover. It usually relies on mapping out what your competitors rank for, to discover content opportunities.

# Actor input Schema

## `BlogUrls` (type: `array`):

Blog post you wish to analyze

## `organicResultCount` (type: `integer`):

Maximum number of organic Google search results to analyze per keyword.

## `keywords` (type: `string`):

Separate keywords with line breaks

## Actor input object example

```json
{
  "BlogUrls": [
    {
      "url": "https://apify.com"
    }
  ],
  "organicResultCount": 10,
  "keywords": "Apify"
}
```

# Actor output Schema

## `competitorLandscape` (type: `string`):

Table of competitor pages with classification, extracted insights, and organic search appearances.

## `missingInsight` (type: `string`):

Insights present in competitor articles but absent from your own article.

## `questions` (type: `string`):

Question-style H1-H6 headings extracted from competitor articles. Useful for FAQ inspiration.

## `subDatasets` (type: `string`):

Links to the raw datasets produced by the Google Search Scraper and Website Content Crawler sub-Actors.

## `results` (type: `string`):

Full raw output as stored in the dataset.

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "BlogUrls": [
        {
            "url": "https://apify.com"
        }
    ],
    "keywords": "Apify"
};

// Run the Actor and wait for it to finish
const run = await client.actor("fabian_maume/content-gap-analyzer").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "BlogUrls": [{ "url": "https://apify.com" }],
    "keywords": "Apify",
}

# Run the Actor and wait for it to finish
run = client.actor("fabian_maume/content-gap-analyzer").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "BlogUrls": [
    {
      "url": "https://apify.com"
    }
  ],
  "keywords": "Apify"
}' |
apify call fabian_maume/content-gap-analyzer --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=fabian_maume/content-gap-analyzer",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Content Gap Analyzer",
        "description": "Identifies topics, insights, and questions that top-ranking competitors cover but your blog post misses. Runs a Google search, classifies each competitor page, extracts key insights via LLM, and reports what's missing from your own article.",
        "version": "0.0",
        "x-build-id": "XlUlegbG3hMQ5hsnw"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/fabian_maume~content-gap-analyzer/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-fabian_maume-content-gap-analyzer",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/fabian_maume~content-gap-analyzer/runs": {
            "post": {
                "operationId": "runs-sync-fabian_maume-content-gap-analyzer",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/fabian_maume~content-gap-analyzer/run-sync": {
            "post": {
                "operationId": "run-sync-fabian_maume-content-gap-analyzer",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "BlogUrls": {
                        "title": "Blog post URL",
                        "type": "array",
                        "description": "Blog post you wish to analyze",
                        "items": {
                            "type": "object",
                            "required": [
                                "url"
                            ],
                            "properties": {
                                "url": {
                                    "type": "string",
                                    "title": "URL of a web page",
                                    "format": "uri"
                                }
                            }
                        }
                    },
                    "organicResultCount": {
                        "title": "Number of organic results for analysis",
                        "type": "integer",
                        "description": "Maximum number of organic Google search results to analyze per keyword.",
                        "default": 10
                    },
                    "keywords": {
                        "title": "Keywords to analyze",
                        "type": "string",
                        "description": "Separate keywords with line breaks"
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
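
If you prefer to call the REST API directly, the `run-sync-get-dataset-items` path from the specification can be turned into a request using only the Python standard library. This is a sketch under the assumptions visible in the spec (token passed as a query parameter, JSON body matching the input schema); `build_run_request` is a hypothetical helper and nothing is sent until you call `urlopen`.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/fabian_maume~content-gap-analyzer/run-sync-get-dataset-items"


def build_run_request(token, actor_input):
    """Build (but do not send) a POST request that runs the Actor synchronously."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        url=f"{API_BASE}{ACTOR_PATH}?{query}",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_run_request(
    "<YOUR_API_TOKEN>",
    {"BlogUrls": [{"url": "https://apify.com"}], "keywords": "Apify"},
)
print(request.get_method(), request.full_url)

# To actually send it (network call that waits for the run to finish):
# with urllib.request.urlopen(request) as response:
#     items = json.loads(response.read())
```

For long analyses, the asynchronous `/runs` endpoint from the same specification may be a better fit than holding one synchronous connection open.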
