# AI Keyword Clustering Tool - Topical Clustering + Bulk SERP (`doesaiknow/ai-keyword-clustering-tool-topical-clustering-bulk-serp`) Actor

Bulk AI keyword clustering & topical clustering map with Google volume + CPC + 12-month trend + live SERP-validated MERGE/SPLIT recs. 1000 keywords/batch, keyword grouping for pillar pages. Ahrefs/Semrush/MarketMuse/Frase alternative - pay per 1000 KW, no subscription, no API keys.

- **URL**: https://apify.com/doesaiknow/ai-keyword-clustering-tool-topical-clustering-bulk-serp.md
- **Developed by:** [Dawid S](https://apify.com/doesaiknow) (community)
- **Categories:** SEO tools, AI, Developer tools
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

From $0.99 per 1,000 input keywords

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Because this Actor supports Apify Store discounts, the price drops as your subscription plan goes up.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

### AI Keyword Clustering Tool 🧠🔍

Bulk **AI keyword clustering**, **topical clustering** map and **live SERP-validated** MERGE / SPLIT recommendations in a single run. Submit up to **1,000 keywords per batch**; get Google Ads search volume, CPC, 12-month trend, **deep semantic clusters** with canonical_query + central_entity + buyer intent + recommended content format, a hierarchical **CORE / OUTER topical authority map** keyed to your business context — and per-cluster top-10 Google SERPs cross-checked for **keyword cannibalization** and **content consolidation** opportunities.

A practical alternative to **Ahrefs / Semrush / MarketMuse / Frase / Surfer SEO / KeywordInsights** when you need the *clustering output*, not a full SEO suite subscription. Pay per 1,000 keywords. No subscription. No API keys to manage.

This Actor is built for:

- 📈 **SEO professionals & agencies** planning pillar pages and topical authority maps
- ✍️ **Content marketers & strategists** turning a seed list into a 6-month editorial calendar
- 🛠️ **In-house SEO teams & developers** integrating bulk keyword clustering into pipelines, dashboards, n8n / Make / Zapier flows
- 🤖 **AI agent builders** wiring topical maps into LLM workflows via Apify MCP
- 🎯 **GEO / AEO specialists** generating cluster-level briefs for AI-engine optimization

> **Co-designed with a senior SEO analyst.** Every metric, threshold, validation layer and output field reflects what a professional content-strategy deliverable needs to look like — the kind that holds up in an agency client review or an editorial-planning meeting.

---

#### 📌 Table of Contents

- [✨ Features](#-features)
- [🎯 Use Cases](#-use-cases)
- [⚡ Quick Start](#-quick-start)
- [🧾 Input Parameters](#-input-parameters)
- [📤 Output](#-output)
- [💰 Pricing](#-pricing)
- [🆚 How it compares](#-how-it-compares)
- [❓ FAQ](#-faq)
- [🔎 SEO Keywords](#-seo-keywords)

---

#### ✨ Features

- **🧠 AI semantic clustering of up to 1,000 keywords per batch**
  - Deep semantic embeddings group keywords by *intent* and *meaning*, not just surface tokens. "ascendent kalkulator" + "kalkulator ascendentu" + "jak obliczyć ascendent" land in one cluster, not three.
- **🗺️ CORE / OUTER topical authority map**
  - Every cluster is classified relative to your `businessContext`: **CORE** = pillar pages (your main offering or a direct derivative); **OUTER** = supporting / E-E-A-T content that builds topical authority around the core. Output is a hierarchical pillar → supporting structure with URL-safe slugs ready for editorial planning.
- **🔍 Live Google SERP validation — MERGE & SPLIT recommendations**
  - For every cluster's canonical query, we fetch the live Google top-10 and check cross-cluster URL overlap. **MERGE** = two clusters with ≥ 50% URL overlap (Google ranks them as the same intent — consolidate to one page). **SPLIT** = a cluster whose sampled members have < 30% mean SERP overlap with the canonical query (the cluster mixes distinct intents — separate them).
- **📊 Per-keyword Google Ads metrics**
  - Search volume, CPC, competition + competition_index, full 12-month monthly history and month-over-month trend — for **every** keyword in the expanded set.
- **🧮 Per-cluster intelligence**
  - `canonical_query` (the cluster's root keyword, picked by volume × low-competition), `central_entity` (the head noun), `intent` (commercial / informational / transactional), `recommended_format` (ranking / how-to guide / comparison / informational article — picked from the SERP), `members[]`, members_count, total_volume, avg_cpc, avg_competition, top-10 SERP for the canonical.
- **🌱 Optional seed enrichment**
  - Submit fewer than 200 seeds and the pipeline auto-expands each with related queries (Google PAA + Related searches + organic title tokens + product-filter attributes + AI-generated query variants). Submit 200+ and your input IS the corpus — no auto-expansion, just clustering.
- **🌍 Country & language localization**
  - 53 countries, 2-letter ISO codes (`us` / `en`, `de` / `de`, `pl` / `pl`, `fr` / `fr`, …). Cluster labels and topical-map text honor the language code.
- **🧼 Clean structured JSON output**
  - Drop straight into SEO dashboards, BI tools, Looker / Metabase, spreadsheets, agent workflows, or Apify MCP clients.
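
The MERGE rule described in the list above can be sketched in a few lines. This is a minimal illustration, not the Actor's actual implementation: the helper names `serp_overlap_pct` and `should_merge` are hypothetical, and the overlap formula (shared URLs divided by the size of the larger result list) is an assumption, since this README states only the 50% threshold.

```python
from typing import Sequence

MERGE_THRESHOLD_PCT = 50.0  # threshold stated in this README


def serp_overlap_pct(serp_a: Sequence[str], serp_b: Sequence[str]) -> float:
    """Percentage of URLs shared by two SERPs.

    Assumption: overlap = shared URLs / size of the larger list.
    The README does not specify the exact formula.
    """
    urls_a, urls_b = set(serp_a), set(serp_b)
    if not urls_a or not urls_b:
        return 0.0
    shared = len(urls_a & urls_b)
    return 100.0 * shared / max(len(urls_a), len(urls_b))


def should_merge(serp_a: Sequence[str], serp_b: Sequence[str]) -> bool:
    """True when two canonical-query SERPs overlap enough to suggest one page."""
    return serp_overlap_pct(serp_a, serp_b) >= MERGE_THRESHOLD_PCT


# Illustrative SERPs: pages 0..9 and pages 4..13 share six URLs.
serp_x = [f"https://example.com/page-{i}" for i in range(10)]
serp_y = [f"https://example.com/page-{i}" for i in range(4, 14)]
print(serp_overlap_pct(serp_x, serp_y), should_merge(serp_x, serp_y))
```

Using sets ignores ranking position, so a URL at rank 1 and rank 10 counts the same; a position-weighted variant would be a different design choice.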

---

#### 🎯 Use Cases

| Use Case | What you can do | Why it helps |
|---|---|---|
| 🧠 **Topical clustering** for pillar pages | Turn 1,000 keywords into 30–80 semantic clusters | Build hub-and-spoke / pillar-cluster content models from data, not guesses |
| 🗺️ **Topical authority map** for new sites | CORE / OUTER classification keyed to your business | Plan the full topical authority footprint before writing a word |
| 🔁 **Keyword cannibalization audit** | MERGE recommendations expose clusters Google sees as the same intent | Consolidate competing pages before they suppress each other's rankings |
| ✂️ **Content split audit** | SPLIT recommendations expose clusters mixing distinct intents | Spin off misfit members to their own focused pages |
| 📅 Editorial calendar planning | Per-cluster `recommended_format` + total_volume + intent | Prioritize the 30 highest-value clusters for the next quarter |
| 💰 PPC keyword grouping | Bulk keyword grouping by intent + CPC + competition | Build ad groups that share intent (and Quality Score) |
| 🤖 Topical maps for AI agents | Feed clean JSON into LLM workflows / Apify MCP | Replace expensive multi-tool SaaS stacks with one actor call |
| 🎯 **GEO / AEO content briefs** | Cluster + canonical SERP + intent → AI-engine-ready brief input | Generate generative-engine-optimized content briefs from one dataset |

---

#### ⚡ Quick Start

##### Click **"Try for free"** with the default input

The prefill is a curated 10-keyword running-gear seed list with a real `businessContext`. One run gives you 2–4 clusters, a topical authority map and at least one SPLIT or MERGE recommendation — enough to validate the dataset shape and pricing on your own account in ~3 minutes.

##### Real run — your own keywords

```json
{
  "keywords": [
    "topical clustering",
    "keyword clustering tool",
    "keyword grouping",
    "semantic keyword clustering",
    "topical authority map",
    "pillar page planning",
    "content cluster tool",
    "seo content strategy",
    "keyword cannibalization",
    "MERGE keyword recommendation"
  ],
  "businessContext": "We sell AI-powered SEO software for agencies and in-house content teams — keyword research, content planning, AI brand visibility tracking. Buyer is a senior SEO / content strategist.",
  "country": "us",
  "language": "en",
  "enableSerpValidation": true,
  "enableSplitCheck": true
}
````

##### Tips for clean clusters

- **`businessContext` is REQUIRED.** It drives CORE / OUTER classification. The richer the context, the sharper the topical map: 1–3 specific sentences beat one generic sentence.
- **200+ keyword inputs skip auto-expansion** — submit your own complete keyword universe at scale; submit < 200 if you want the pipeline to add Google PAA / Related / variant queries.
- **Group by language per run.** Mixed-language batches work, but clusters are stronger when all seeds share a language.
- **Toggle `enableSerpValidation` off** for cheaper exploratory runs without MERGE / SPLIT recommendations.

***

#### 🧾 Input Parameters

| Parameter | Type | Required | Default | Description |
|---|---:|:---:|---:|---|
| `keywords` | `array<string>` | ✅ | — | 1–1,000 seed keywords. Case-insensitive duplicates removed automatically. |
| `businessContext` | `string` | ✅ | — | 1–3 sentences describing what your business does. Drives CORE / OUTER classification. |
| `country` | `string` | ❌ | `us` | 2-letter ISO-3166 country code, lowercase. 53 supported (`us`, `gb`, `de`, `fr`, `pl`, `es`, …). |
| `language` | `string` | ❌ | derived from country | 2-letter ISO-639 language code, lowercase. Set explicitly when your keywords are in a different language than the country default. |
| `enableSerpValidation` | `boolean` | ❌ | `true` | Run live SERP overlap analysis for MERGE recommendations. |
| `enableSplitCheck` | `boolean` | ❌ | `true` | Run the SPLIT coherence check on clusters with more than 10 members. |
| `minClusterSize` | `integer` | ❌ | `2` | Smallest reportable cluster (members below this get marked outliers). |
| `maxClusters` | `integer` | ❌ | auto | Hard cap on cluster count. Default auto-tunes via cluster quality score. |
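
Before submitting a large list, it can help to check the documented input limits locally. The sketch below is a hypothetical pre-flight helper (`build_input` is not part of any Apify library). It mirrors the limits stated in this README: 1 to 1,000 keywords, up to 200 characters each, case-insensitive de-duplication, and a required `businessContext`. The Actor performs its own normalization regardless, so treat this as a convenience check only.

```python
MAX_KEYWORDS = 1000
MAX_KEYWORD_CHARS = 200


def build_input(keywords, business_context, country="us", language=None):
    """Trim and de-duplicate keywords, then assemble an Actor input dict."""
    context = (business_context or "").strip()
    if not context:
        raise ValueError("businessContext is required")

    seen, unique = set(), []
    for raw in keywords:
        keyword = raw.strip()
        if not keyword:
            continue  # skip blank entries
        if len(keyword) > MAX_KEYWORD_CHARS:
            raise ValueError(f"keyword longer than {MAX_KEYWORD_CHARS} chars: {keyword[:40]!r}")
        key = keyword.lower()  # case-insensitive duplicate detection
        if key not in seen:
            seen.add(key)
            unique.append(keyword)

    if not 1 <= len(unique) <= MAX_KEYWORDS:
        raise ValueError(f"expected 1-{MAX_KEYWORDS} unique keywords, got {len(unique)}")

    payload = {"keywords": unique, "businessContext": context, "country": country.lower()}
    if language:
        payload["language"] = language.lower()  # omit to use the country default
    return payload


payload = build_input(
    ["Best Running Shoes", " best running shoes ", "marathon training plan"],
    "We sell premium running gear for first-time marathon runners.",
)
print(payload["keywords"])
```

The returned dict can be passed as the run input in the Python client example in the API section.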

***

#### 📤 Output

Each batch run produces one dataset item containing the full `ClusterResult` payload.

##### Example output (trimmed)

```json
{
  "batch_id": "a3cdcec9-826e-4fb0-aaef-978196d0736e",
  "status": "done",
  "country": "us",
  "language": "en",
  "business_context": "We sell AI-powered SEO software …",
  "summary": {
    "input_keywords": 10,
    "expanded_keywords": 290,
    "total_clusters": 30,
    "outliers": 0,
    "total_search_volume": 388170,
    "core_topics": 10,
    "outer_topics": 20,
    "split_recommendations": 9,
    "merge_recommendations": 17
  },
  "keywords": [
    {
      "keyword": "topical clustering",
      "google_search_volume": 480,
      "google_cpc_usd": 11.70,
      "google_competition": "LOW",
      "monthly_history": [{"year": 2026, "month": 4, "search_volume": 590}],
      "trend_mom_pct": 22.9,
      "cluster_id": 0
    }
  ],
  "clusters": [
    {
      "id": 0,
      "canonical_query": "topical clustering",
      "central_entity": "topical clustering",
      "cluster_label": "Topical Clustering & Topical Authority Maps",
      "intent": "commercial",
      "recommended_format": "ranking",
      "classification": "CORE",
      "members": ["topical clustering", "topical authority map", "topical authority tool", "..."],
      "members_count": 17,
      "total_volume": 18570,
      "avg_cpc_usd": 14.32,
      "serp_top10": [
        {"rank": 1, "url": "https://example.com/topical-clustering", "title": "...", "domain": "example.com"}
      ]
    }
  ],
  "validation": {
    "splits": [
      {
        "cluster_id": 4,
        "canonical_query": "keyword clustering",
        "mean_serp_overlap_pct": 22.5,
        "subgroups": [["semantic keyword clustering"], ["keyword grouping tool"]],
        "reason": "mean SERP overlap of 5 sampled members vs 'keyword clustering' is 22.5% (< 30%); the cluster likely mixes distinct intents"
      }
    ],
    "merges": [
      {
        "cluster_ids": [1, 13],
        "canonical_queries": ["keyword research", "keyword research tools"],
        "serp_overlap_pct": 60.0,
        "reason": "top-10 SERP overlap of 60.0% between 'keyword research' and 'keyword research tools' (≥ 50%) — Google ranks them as the same intent"
      }
    ]
  },
  "topical_map": {
    "core": [
      {
        "pillar_page": "topical-clustering",
        "central_entity": "topical clustering",
        "attribute_type": "Main",
        "supporting_pages": [3, 7, 12],
        "cluster_id": 0
      }
    ],
    "outer": [
      {
        "page": "keyword-cannibalization",
        "central_entity": "keyword cannibalization",
        "reasoning": "A diagnostic concept supporting the core clustering offering, not a primary pillar.",
        "cluster_id": 12
      }
    ]
  },
  "created_at": "2026-05-14T12:53:14Z",
  "completed_at": "2026-05-14T12:54:30Z"
}
```

##### Output fields

| Category | Fields |
|---|---|
| 🆔 Batch | `batch_id`, `status`, `country`, `language`, `business_context`, `created_at`, `completed_at` |
| 📊 Summary | `input_keywords`, `expanded_keywords`, `total_clusters`, `outliers`, `total_search_volume`, `core_topics`, `outer_topics`, `split_recommendations`, `merge_recommendations` |
| 🔑 Per-keyword | `keyword`, `google_search_volume`, `google_cpc_usd`, `google_competition`, `monthly_history[]`, `trend_mom_pct`, `cluster_id` |
| 🧠 Per-cluster | `id`, `canonical_query`, `central_entity`, `cluster_label`, `intent`, `recommended_format`, `classification` (CORE/OUTER), `members[]`, `members_count`, `total_volume`, `avg_cpc_usd`, `avg_competition`, `serp_top10[]` |
| ✂️ MERGE / SPLIT | `validation.splits[]` (`cluster_id`, `canonical_query`, `mean_serp_overlap_pct`, `subgroups[]`, `reason`) and `validation.merges[]` (`cluster_ids`, `canonical_queries`, `serp_overlap_pct`, `reason`) |
| 🗺️ Topical map | `topical_map.core[]` (pillar pages), `topical_map.outer[]` (supporting pages) — both with URL-safe slugs |

You can download the dataset in JSON, CSV, Excel, HTML or RSS — Apify Store handles the format conversion.
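
As one way to consume the dataset item, the sketch below ranks clusters for editorial planning: CORE clusters first, then by `total_volume`. The helper name `top_clusters` and the sample values are illustrative, not real run data; only the field names come from the example output above.

```python
def top_clusters(result, limit=30):
    """Return clusters ordered CORE-first, then by total_volume descending."""
    return sorted(
        result.get("clusters", []),
        key=lambda c: (c.get("classification") != "CORE", -c.get("total_volume", 0)),
    )[:limit]


# Illustrative ClusterResult fragment (invented volumes).
sample = {
    "clusters": [
        {"id": 0, "canonical_query": "topical clustering", "classification": "CORE", "total_volume": 18570},
        {"id": 12, "canonical_query": "keyword cannibalization", "classification": "OUTER", "total_volume": 25000},
        {"id": 4, "canonical_query": "keyword clustering", "classification": "CORE", "total_volume": 40000},
    ]
}

for cluster in top_clusters(sample):
    print(cluster["id"], cluster["classification"], cluster["canonical_query"], cluster["total_volume"])
```

Sorting CORE ahead of OUTER is one possible prioritization; sorting purely by volume or by `avg_cpc_usd` may suit other planning goals better.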

***

#### 💰 Pricing

**Pay-per-event** — one event per batch (up to 1,000 keywords), charged only on **successful** runs. No subscription. No platform usage charges. No outbound API keys to manage.

| Apify subscription | Effective price / 1,000 keywords |
|---|---:|
| Business | **$0.99** |
| Scale | $1.25 |
| Starter | $1.75 |
| Free | $3.95 (one batch fits inside the $5 monthly Apify free credit) |

Sending fewer than 1,000 keywords counts as one batch (the pipeline cost is dominated by SERP fetches and Gemini calls, not the input array length).
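
For budgeting, the per-batch pricing above reduces to simple arithmetic. The sketch below is a rough estimator under two assumptions: the tier prices are copied from the table above and may change, and lists longer than 1,000 keywords are split into separate runs that are each billed as one batch. The function name `estimate_cost_usd` is hypothetical.

```python
import math

# Per-batch prices copied from the pricing table above (USD).
PRICE_PER_BATCH_USD = {"business": 0.99, "scale": 1.25, "starter": 1.75, "free": 3.95}
MAX_KEYWORDS_PER_BATCH = 1000


def estimate_cost_usd(keyword_count: int, plan: str) -> float:
    """Estimate spend, assuming one billed batch per started 1,000 keywords."""
    if keyword_count < 1:
        raise ValueError("keyword_count must be at least 1")
    batches = math.ceil(keyword_count / MAX_KEYWORDS_PER_BATCH)
    return round(batches * PRICE_PER_BATCH_USD[plan.lower()], 2)


print(estimate_cost_usd(100, "business"))   # one batch
print(estimate_cost_usd(2500, "starter"))   # three batches
```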

Learn more: [Apify pay-per-event docs](https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event)

***

#### 🆚 How it compares

| | AI Keyword Clustering Tool | Ahrefs / Semrush UI scrapers | Single-engine keyword volume scrapers |
|---|---|---|---|
| Bulk semantic clustering | ✅ 1,000 KW / batch | ❌ raw keyword data only | ❌ |
| CORE / OUTER topical map | ✅ keyed to your business context | ❌ | ❌ |
| Live SERP MERGE / SPLIT recs | ✅ Google top-10 overlap analysis | ❌ | ❌ |
| Google Ads volume + CPC + 12-mo trend | ✅ for every expanded keyword | ✅ | ✅ |
| Buyer intent classification per cluster | ✅ commercial / informational / transactional | ❌ | ❌ |
| Recommended content format per cluster | ✅ ranking / how-to / comparison / article | ❌ | ❌ |
| Subscription required | ❌ pay-per-batch | ❌ but typically $3–10 / scan | ❌ |
| Best for | content planning + topical authority + pillar pages | suite-replacement scraping | bulk volume only |

> Sibling actors from the same `doesaiknow` developer:
>
> - [**Keyword Metrics Pro**](https://apify.com/doesaiknow/doesaiknow-keyword-metrics-apify) — raw bulk Google + Bing volume only, no clustering. $1.35 / 1,000 KW on Business tier, multi-engine.
> - [**AI Brand Visibility**](https://apify.com/doesaiknow/ai-brand-visibility---chatgpt-perplexity-copilot-google-ai) — track how often your brand is mentioned by ChatGPT, Perplexity, Copilot, and Google AI Overviews.

***

#### ❓ FAQ

##### **Q1: Why "topical clustering" instead of plain keyword clustering?**

Topical clustering does what flat keyword clustering refuses to: it groups keywords by **search intent + central entity + business relevance**, not by string similarity. "running shoes for flat feet" and "best supportive trainers" cluster together — even with zero overlapping tokens — because Google ranks them on the same intent. That is the difference between a list of synonyms and a content map you can act on.

##### **Q2: How is this different from keyword grouping in Ahrefs / Semrush?**

Ahrefs and Semrush group keywords by **token overlap** on a single canonical query. We cluster on **deep semantic embeddings**, then validate against **live Google SERPs** (top-10 URL overlap). The MERGE recommendation, in particular, has no equivalent in the major SEO suites — it directly tells you which clusters Google itself sees as the same intent (≥ 50% URL overlap) and should be a single page.

##### **Q3: What does the topical authority map actually contain?**

Two arrays. `core[]` lists clusters whose central entity is your main offering or a direct derivative — each becomes a pillar page with a URL slug, attribute type (Main / Derived) and a list of supporting cluster IDs. `outer[]` lists clusters that are tangential but build E-E-A-T / topical authority around the core — each becomes a supporting article with a reasoning sentence explaining why it's adjacent. Together they're a complete pillar-cluster content plan.
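
To turn the two arrays into a readable plan, the supporting cluster IDs can be resolved against `clusters[]`. The sketch below assumes the field names shown in the example output (`pillar_page`, `supporting_pages`, `canonical_query`); the helper `pillar_plan` and the sample data are hypothetical. IDs that are absent from `clusters[]` (for example in a trimmed payload) are skipped.

```python
def pillar_plan(result):
    """Map each CORE pillar slug to the canonical queries of its supporting clusters."""
    clusters = {c["id"]: c for c in result.get("clusters", [])}
    plan = []
    for pillar in result.get("topical_map", {}).get("core", []):
        supporting = [
            clusters[cid]["canonical_query"]
            for cid in pillar.get("supporting_pages", [])
            if cid in clusters  # skip IDs missing from the payload
        ]
        plan.append((pillar["pillar_page"], supporting))
    return plan


# Illustrative fragment; cluster 7 is deliberately missing.
sample = {
    "clusters": [
        {"id": 0, "canonical_query": "topical clustering"},
        {"id": 3, "canonical_query": "content cluster tool"},
        {"id": 12, "canonical_query": "keyword cannibalization"},
    ],
    "topical_map": {
        "core": [{"pillar_page": "topical-clustering", "supporting_pages": [3, 7, 12], "cluster_id": 0}],
        "outer": [{"page": "keyword-cannibalization", "cluster_id": 12}],
    },
}

for slug, supporting in pillar_plan(sample):
    print(slug, "->", supporting)
```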

##### **Q4: How much does it cost to cluster 1,000 keywords?**

One batch event — **$0.99** on Business tier, $1.25 on Scale, $1.75 on Starter, $3.95 on Free (which fits inside Apify's $5 monthly free credit). No platform usage charges. No outbound API costs. No per-keyword surcharge. Sending 100 or 1,000 keywords both count as one batch.

##### **Q5: Can I run this from n8n / Make / Zapier / a Python script / an AI agent?**

Yes — use the Apify API client in your stack. Code samples auto-generated by Apify Store are below in the **API** section. The actor is also discoverable via the Apify MCP server, so Claude Desktop, Cursor, ChatGPT custom GPTs, n8n / Zapier MCP nodes and in-house AI agents can call it as a single tool.

##### **Q6: What languages and countries are supported?**

53 countries via the `country` parameter (`us`, `gb`, `de`, `fr`, `es`, `it`, `pl`, `nl`, `se`, `br`, `pt`, `jp`, `in`, `mx`, …). Language defaults from country but is overridable via `language` (e.g. `country=pl`, `language=en` for English keywords in the Polish ad-volume locale). Cluster labels, central entities and topical-map reasoning honor the language code — submit Polish keywords with `language=pl` and the labels come back in Polish.

##### **Q7: What is the SPLIT recommendation actually checking?**

For every cluster with > 10 members, we randomly sample 5 of them, fetch each one's live Google top-10, and compare the URL set to the cluster's canonical-query SERP. Members whose individual top-10 has < 30% URL overlap with the canonical are flagged as misfits — i.e. Google ranks them differently from the cluster's center, so they're mixing distinct intents. The `subgroups[]` field lists the misfits so you can pull them into their own focused cluster / page.
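
The check described above can be sketched as follows. This is an illustration under stated assumptions, not the Actor's code: `fetch_serp` is a hypothetical callable that returns a keyword's top-10 URLs, overlap is computed as the share of canonical URLs that also appear in the member's SERP, and the sample size and thresholds are the ones quoted in this README.

```python
import random
from statistics import mean

SPLIT_THRESHOLD_PCT = 30.0  # threshold stated in this README
SAMPLE_SIZE = 5
MIN_MEMBERS = 10  # the check applies to clusters with more than 10 members


def overlap_pct(serp, canonical_serp):
    """Share of canonical-SERP URLs that also appear in the member's SERP."""
    canonical = set(canonical_serp)
    if not canonical:
        return 0.0
    return 100.0 * len(set(serp) & canonical) / len(canonical)


def split_check(members, canonical_serp, fetch_serp, rng=None):
    """Return a SPLIT verdict for one cluster, or None if the cluster is too small."""
    if len(members) <= MIN_MEMBERS:
        return None
    rng = rng or random.Random()
    sampled = rng.sample(members, SAMPLE_SIZE)
    overlaps = {kw: overlap_pct(fetch_serp(kw), canonical_serp) for kw in sampled}
    mean_overlap = mean(overlaps.values())
    return {
        "mean_serp_overlap_pct": round(mean_overlap, 1),
        "split": mean_overlap < SPLIT_THRESHOLD_PCT,
        "misfits": [kw for kw, pct in overlaps.items() if pct < SPLIT_THRESHOLD_PCT],
    }


canonical_serp = [f"https://example.com/c-{i}" for i in range(10)]


def fake_fetch_serp(keyword):
    # Stand-in for a live SERP call: 2 canonical URLs plus 8 unrelated ones.
    return canonical_serp[:2] + [f"https://example.org/{keyword}/{i}" for i in range(8)]


members = [f"keyword {i}" for i in range(11)]
result = split_check(members, canonical_serp, fake_fetch_serp, rng=random.Random(0))
print(result["mean_serp_overlap_pct"], result["split"], len(result["misfits"]))
```

Passing a seeded `random.Random` makes the sample reproducible, which is useful when comparing two runs of the same cluster.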

##### **Q8: Do I need to deduplicate my keywords before submitting?**

No. The actor lowercases + trims every input keyword and removes duplicates before billing — what shows up in the dataset is the unique set. Submit 50 raw variations of "best running shoes" and you'll see one keyword on the output side.

##### **Q9: What happens if my run hits a timeout?**

The clustering pipeline has a 12-minute backend hard limit; the actor polls for ~13 minutes. A 1,000-keyword run typically completes in 5–8 minutes. If a run does time out, the dataset is empty and you are **not charged** (PPE only fires on success).

***

#### 🔎 SEO Keywords

topical clustering, AI keyword clustering tool, keyword clustering tool, keyword clustering, semantic keyword clustering, AI semantic clustering, keyword grouping, keyword grouping tool, topical authority map, topical authority tool, content cluster tool, content cluster generator, pillar page planning, pillar cluster model, hub and spoke seo, content hub planning, SEO content strategy, content gap analysis, semantic seo tool, AI seo tool, keyword cannibalization, MERGE keyword recommendation, SPLIT keyword recommendation, cluster validation, content brief generator, editorial calendar tool, keyword research tool, bulk keyword research, Google Ads search volume, keyword search volume, keyword cpc, keyword competition, 12-month keyword trend, SERP analysis tool, SERP overlap analysis, live SERP analysis, search intent classification, intent clustering, transactional keyword intent, informational keyword intent, commercial keyword intent, canonical query, central entity, recommended content format, GEO keyword tool, AEO keyword tool, ai agent seo, agentic seo tooling, Ahrefs alternative, Semrush alternative, MarketMuse alternative, Frase alternative, Surfer SEO alternative, KeywordTool.io alternative, KeywordInsights alternative, AnswerThePublic alternative, LowFruits alternative, agency keyword research, agency keyword clustering, freelance SEO tool, in-house SEO tool, content strategist tool, AI content planning, AI pillar page, AI topical map, pay per 1000 keywords, no subscription seo tool, no api keys seo tool, bulk seo tool, JSON keyword data, structured keyword data, Apify keyword actor, MCP keyword tool, Model Context Protocol seo, large scale keyword analysis, batch keyword tool, buyer intent keywords.

If this actor is useful in your workflow, please **leave a rating on Apify** so others can find it.

# Actor input Schema

## `keywords` (type: `array`):

Flat list of seed keywords to research and cluster. Submit up to 1000 per batch — the full clustering benefit kicks in around 200+ keywords (below that you'll see fewer clusters, and seed-enrichment expansion runs only when you provide fewer than 200). Mixed-language batches work but produce stronger clusters when all keywords share a language. Each keyword: 1–200 chars, ASCII or Unicode. Duplicates (case-insensitive) are removed.

## `businessContext` (type: `string`):

1–3 sentences describing what your business does, who you serve, and what you sell. This is the Source Context: every cluster gets a CORE / OUTER classification — CORE = clusters whose central entity is the main offering or a direct derivative (build pillar pages); OUTER = clusters tangential to the core offering that still build topical authority (supporting articles for E-E-A-T). Required in v1.0 so every run yields a real topical map.

## `country` (type: `string`):

ISO-3166 country code (lowercase, 2 chars). Drives Google Ads search volume + CPC + SERP locale routing. 53 countries supported. Unknown codes silently fall back to US / English.

## `language` (type: `string`):

ISO-639 lowercase 2-char language code. If omitted, the language tied to `country` is used. Set explicitly when you want EN keywords run against a non-EN volume locale (e.g. `country=pl`, `language=en`).

## `enableSerpValidation` (type: `boolean`):

When TRUE, every cluster's canonical\_query is run through a live Google SERP for its top-10 results; cluster pairs with ≥50% URL overlap get a MERGE recommendation (Google ranks them as the same intent). Adds ~1–2 min runtime per 50 clusters and is already priced into the per-batch fee — no upcharge.

## `enableSplitCheck` (type: `boolean`):

When TRUE, clusters with more than 10 members are checked for internal SERP coherence: a random sample of members has its SERP compared to the cluster's canonical-query SERP, and clusters with <30% mean URL overlap get a SPLIT recommendation (the cluster mixes distinct intents). Uses a few extra SERP calls, already priced in.

## `minClusterSize` (type: `integer`):

Minimum keywords per cluster. Keywords that don't fit a cluster of this size are marked as outliers (cluster\_id = -1). Increase to 3+ for tighter, fewer clusters; set to 1 to keep every keyword in some cluster.
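
On the output side, outliers can be separated from clustered keywords by checking for `cluster_id == -1`. The sketch below assumes the `keywords[]` shape from the example output; `split_outliers` and the sample rows are hypothetical.

```python
OUTLIER_CLUSTER_ID = -1  # marker described for keywords that fit no cluster


def split_outliers(result):
    """Return (clustered_keywords, outlier_keywords) from a ClusterResult dict."""
    clustered, outliers = [], []
    for row in result.get("keywords", []):
        target = outliers if row.get("cluster_id") == OUTLIER_CLUSTER_ID else clustered
        target.append(row["keyword"])
    return clustered, outliers


# Illustrative rows, not real run data.
sample = {
    "keywords": [
        {"keyword": "topical clustering", "cluster_id": 0},
        {"keyword": "running cadence training", "cluster_id": -1},
        {"keyword": "topical authority map", "cluster_id": 0},
    ]
}

clustered, outliers = split_outliers(sample)
print(clustered, outliers)
```

A high outlier count after raising `minClusterSize` can be a hint that the threshold is too strict for the size of the keyword list.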

## `maxClusters` (type: `integer`):

Hard cap on the number of clusters produced. Default is auto-tuned via silhouette score (typically 50–80 for 1000 keywords). Set explicitly for a strict pillar-page count when planning editorial.

## Actor input object example

```json
{
  "keywords": [
    "best running shoes",
    "running shoes for flat feet",
    "running shoes review",
    "running shoes brands",
    "marathon training plan",
    "marathon nutrition guide",
    "couch to 5k plan",
    "running form tips",
    "stretching for runners",
    "running cadence training"
  ],
  "businessContext": "We sell premium running gear (shoes, apparel, accessories) and publish marathon training content for amateur athletes preparing for their first marathon.",
  "country": "us",
  "enableSerpValidation": true,
  "enableSplitCheck": true,
  "minClusterSize": 2
}
```

# Actor output Schema

## `clusterResult` (type: `string`):

Default dataset link. Holds the full ClusterResult JSON: summary, keywords\[] (per-keyword volume + CPC + competition + monthly history + trend + cluster\_id), clusters\[] (canonical\_query, central\_entity, cluster\_label, intent, recommended\_format, classification, members, serp\_top10), validation{splits\[],merges\[]}, topical\_map{core\[],outer\[]}. See dataset\_schema.json for per-field documentation.

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "keywords": [
        "best running shoes",
        "running shoes for flat feet",
        "running shoes review",
        "running shoes brands",
        "marathon training plan",
        "marathon nutrition guide",
        "couch to 5k plan",
        "running form tips",
        "stretching for runners",
        "running cadence training"
    ],
    "businessContext": "We sell premium running gear (shoes, apparel, accessories) and publish marathon training content for amateur athletes preparing for their first marathon."
};

// Run the Actor and wait for it to finish
const run = await client.actor("doesaiknow/ai-keyword-clustering-tool-topical-clustering-bulk-serp").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "keywords": [
        "best running shoes",
        "running shoes for flat feet",
        "running shoes review",
        "running shoes brands",
        "marathon training plan",
        "marathon nutrition guide",
        "couch to 5k plan",
        "running form tips",
        "stretching for runners",
        "running cadence training",
    ],
    "businessContext": "We sell premium running gear (shoes, apparel, accessories) and publish marathon training content for amateur athletes preparing for their first marathon.",
}

# Run the Actor and wait for it to finish
run = client.actor("doesaiknow/ai-keyword-clustering-tool-topical-clustering-bulk-serp").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "keywords": [
    "best running shoes",
    "running shoes for flat feet",
    "running shoes review",
    "running shoes brands",
    "marathon training plan",
    "marathon nutrition guide",
    "couch to 5k plan",
    "running form tips",
    "stretching for runners",
    "running cadence training"
  ],
  "businessContext": "We sell premium running gear (shoes, apparel, accessories) and publish marathon training content for amateur athletes preparing for their first marathon."
}' |
apify call doesaiknow/ai-keyword-clustering-tool-topical-clustering-bulk-serp --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=doesaiknow/ai-keyword-clustering-tool-topical-clustering-bulk-serp",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "AI Keyword Clustering Tool - Topical Clustering + Bulk SERP",
        "description": "Bulk AI keyword clustering & topical clustering map with Google volume + CPC + 12-month trend + live SERP-validated MERGE/SPLIT recs. 1000 keywords/batch, keyword grouping for pillar pages. Ahrefs/Semrush/MarketMuse/Frase alternative - pay per 1000 KW, no subscription, no API keys.",
        "version": "0.0",
        "x-build-id": "qMQkDmd5cZpwK81RD"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/doesaiknow~ai-keyword-clustering-tool-topical-clustering-bulk-serp/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-doesaiknow-ai-keyword-clustering-tool-topical-clustering-bulk-serp",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/doesaiknow~ai-keyword-clustering-tool-topical-clustering-bulk-serp/runs": {
            "post": {
                "operationId": "runs-sync-doesaiknow-ai-keyword-clustering-tool-topical-clustering-bulk-serp",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/doesaiknow~ai-keyword-clustering-tool-topical-clustering-bulk-serp/run-sync": {
            "post": {
                "operationId": "run-sync-doesaiknow-ai-keyword-clustering-tool-topical-clustering-bulk-serp",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "keywords",
                    "businessContext"
                ],
                "properties": {
                    "keywords": {
                        "title": "Keywords (1–1000 per batch)",
                        "minItems": 1,
                        "maxItems": 1000,
                        "type": "array",
                        "description": "Flat list of seed keywords to research and cluster. Submit up to 1000 per batch — the full clustering benefit kicks in around 200+ keywords (below that you'll see fewer clusters, and seed-enrichment expansion runs only when you provide fewer than 200). Mixed-language batches work but produce stronger clusters when all keywords share a language. Each keyword: 1–200 chars, ASCII or Unicode. Duplicates (case-insensitive) are removed.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "businessContext": {
                        "title": "Business context (REQUIRED — unlocks CORE / OUTER classification)",
                        "minLength": 1,
                        "maxLength": 2000,
                        "type": "string",
                        "description": "1–3 sentences describing what your business does, who you serve, and what you sell. This is the Source Context: every cluster gets a CORE / OUTER classification — CORE = clusters whose central entity is the main offering or a direct derivative (build pillar pages); OUTER = clusters tangential to the core offering that still build topical authority (supporting articles for E-E-A-T). Required in v1.0 so every run yields a real topical map."
                    },
                    "country": {
                        "title": "Country / market",
                        "enum": [
                            "us",
                            "gb",
                            "ca",
                            "au",
                            "de",
                            "fr",
                            "es",
                            "it",
                            "pl",
                            "nl",
                            "se",
                            "no",
                            "dk",
                            "fi",
                            "br",
                            "pt",
                            "jp",
                            "in",
                            "mx",
                            "ie",
                            "at",
                            "ch",
                            "be",
                            "cz",
                            "ro",
                            "ae",
                            "ar",
                            "bg",
                            "cl",
                            "co",
                            "ee",
                            "eg",
                            "gr",
                            "hr",
                            "hu",
                            "id",
                            "il",
                            "kr",
                            "lt",
                            "lv",
                            "my",
                            "ng",
                            "nz",
                            "pe",
                            "ph",
                            "pk",
                            "sa",
                            "sg",
                            "sk",
                            "th",
                            "tr",
                            "ua",
                            "vn",
                            "za"
                        ],
                        "type": "string",
                        "description": "ISO-3166 country code (lowercase, 2 chars). Drives Google Ads search volume + CPC + SERP locale routing. 54 countries supported. Unknown codes silently fall back to US / English.",
                        "default": "us"
                    },
                    "language": {
                        "title": "Language (optional)",
                        "enum": [
                            "en",
                            "pl",
                            "de",
                            "fr",
                            "es",
                            "it",
                            "pt",
                            "nl",
                            "sv",
                            "no",
                            "da",
                            "fi",
                            "ja",
                            "ko",
                            "zh",
                            "tr",
                            "ar",
                            "ru",
                            "uk",
                            "cs",
                            "hu",
                            "ro",
                            "el"
                        ],
                        "type": "string",
                        "description": "ISO-639 lowercase 2-char language code. If omitted, the language tied to `country` is used. Set explicitly when you want EN keywords run against a non-EN volume locale (e.g. `country=pl`, `language=en`)."
                    },
                    "enableSerpValidation": {
                        "title": "Enable SERP validation + MERGE recommendations",
                        "type": "boolean",
                        "description": "When TRUE, every cluster's canonical_query is run through a live Google SERP for its top-10 results; cluster pairs with ≥50% URL overlap get a MERGE recommendation (Google ranks them as the same intent). Adds ~1–2 min runtime per 50 clusters and is already priced into the per-batch fee — no upcharge.",
                        "default": true
                    },
                    "enableSplitCheck": {
                        "title": "Enable SPLIT coherence check",
                        "type": "boolean",
                        "description": "When TRUE, clusters with more than 10 members are checked for internal SERP coherence: a random sample of members has its SERP compared to the cluster's canonical-query SERP, and clusters with <30% mean URL overlap get a SPLIT recommendation (the cluster mixes distinct intents). Uses a few extra SERP calls, already priced in.",
                        "default": true
                    },
                    "minClusterSize": {
                        "title": "Min cluster size (optional, defaults to 2)",
                        "minimum": 1,
                        "maximum": 10,
                        "type": "integer",
                        "description": "Minimum keywords per cluster. Keywords that don't fit a cluster of this size are marked as outliers (cluster_id = -1). Increase to 3+ for tighter, fewer clusters; set to 1 to keep every keyword in some cluster.",
                        "default": 2
                    },
                    "maxClusters": {
                        "title": "Max clusters (optional, auto-tuned by default)",
                        "minimum": 5,
                        "maximum": 200,
                        "type": "integer",
                        "description": "Hard cap on the number of clusters produced. Default is auto-tuned via silhouette score (typically 50–80 for 1000 keywords). Set explicitly for a strict pillar-page count when planning editorial."
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
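
The specification above can be exercised without any client library. The following is a minimal Python sketch, using only the standard library, that posts an input document to the `run-sync-get-dataset-items` path with the `token` query parameter. The helper names `build_request` and `run_actor`, the `APIFY_TOKEN` environment variable, and the 600-second client timeout are assumptions made for illustration. The sketch also assumes the response body is a JSON array of dataset items; the specification itself only documents a `200 OK` response for this path.

```python
import json
import os
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"  # "servers" entry in the specification
ACTOR_PATH = "/acts/doesaiknow~ai-keyword-clustering-tool-topical-clustering-bulk-serp"


def build_request(token, actor_input):
    """Build a POST request for the run-sync-get-dataset-items endpoint."""
    query = urllib.parse.urlencode({"token": token})
    url = f"{API_BASE}{ACTOR_PATH}/run-sync-get-dataset-items?{query}"
    body = json.dumps(actor_input).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_actor(token, actor_input, timeout_secs=600):
    """Send the request and return the parsed JSON response body."""
    request = build_request(token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout_secs) as response:
        return json.load(response)


if __name__ == "__main__":
    token = os.environ.get("APIFY_TOKEN")  # assumed variable name
    if token:
        items = run_actor(
            token,
            {
                "keywords": ["stretching for runners", "running cadence training"],
                "businessContext": "We sell premium running gear and publish marathon training content.",
                "country": "us",
            },
        )
        print(f"received {len(items)} dataset items")
    else:
        print("Set APIFY_TOKEN to start a run")
```

Each call starts a real, billed run of the Actor. For large batches a synchronous call may hit an HTTP timeout; the `/runs` path in the specification starts the run and returns the run information immediately instead.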

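The `enableSerpValidation` and `enableSplitCheck` descriptions in the input schema refer to URL-overlap thresholds: at least 50% overlap between two clusters' top-10 results for a MERGE recommendation, and under 30% mean overlap between sampled members and the canonical query for a SPLIT recommendation. The Actor's exact overlap formula is not documented here. The Python sketch below assumes one plausible definition (shared URLs divided by the length of the shorter result list) purely to show how such thresholds behave; the function names and the example URLs are hypothetical.

```python
MERGE_THRESHOLD = 0.5  # ">=50% URL overlap" from the enableSerpValidation description
SPLIT_THRESHOLD = 0.3  # "<30% mean URL overlap" from the enableSplitCheck description


def url_overlap(serp_a, serp_b):
    """Share of URLs in common, relative to the shorter list (assumed metric)."""
    a, b = set(serp_a), set(serp_b)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def should_merge(serp_a, serp_b):
    """True when two clusters' canonical-query SERPs overlap enough to merge."""
    return url_overlap(serp_a, serp_b) >= MERGE_THRESHOLD


def should_split(canonical_serp, member_serps):
    """True when sampled members rarely share URLs with the canonical query."""
    if not member_serps:
        return False
    overlaps = [url_overlap(canonical_serp, serp) for serp in member_serps]
    return sum(overlaps) / len(overlaps) < SPLIT_THRESHOLD


if __name__ == "__main__":
    # Hypothetical top-10 result lists; real SERP data would come from the Actor.
    serp_a = [f"https://example.com/page-{i}" for i in range(10)]
    serp_b = serp_a[:5] + [f"https://example.org/page-{i}" for i in range(5)]
    serp_c = serp_a[:2] + [f"https://example.net/page-{i}" for i in range(8)]
    print("merge a+b:", should_merge(serp_a, serp_b))
    print("split a given [c, c]:", should_split(serp_a, [serp_c, serp_c]))
```

Under this assumed metric, two top-10 lists sharing five URLs sit exactly at the MERGE threshold, while members sharing only two URLs with the canonical query fall below the SPLIT threshold.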