Similarweb Scraper - Traffic, AI Traffic & WHOIS

Pricing

from $0.80 / 1,000 domains

🔍 Spy on any website in seconds: traffic, rankings, top keywords, AI traffic share (ChatGPT/Claude/Gemini), competitors, similar sites & WHOIS, all from Similarweb. No login or API key. Bulk parallel scrape, captcha-resilient. Export to JSON/CSV/Excel. SEO, lead gen, research.

Rating: 5.0 (1)

Developer: VortexData (maintained by Community)

Actor stats: 2 bookmarked · 12 total users · 11 monthly active users · last modified 5 days ago

🔍 Similarweb Scraper

📊 Website intelligence for any domain in seconds. Start with one website, choose traffic/rankings, similar sites, or WHOIS + homepage keywords, then export the results as JSON, CSV, Excel or any other format Apify supports.

💎 What is Similarweb Scraper?

Similarweb Scraper is a fast, captcha-resilient web scraper that pulls the same data the Similarweb web app shows you, without requiring a Similarweb account, login, or API key. Behind the scenes it talks to Similarweb's own SPA data endpoint using a real Chrome TLS / JA3 fingerprint via curl_cffi and routes every request through a fresh Apify Residential proxy session, so you get reliable, production-grade data for any domain.
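The "fresh proxy session per request" idea can be sketched as follows. This is an illustration under stated assumptions, not the Actor's actual code: `fresh_proxy_url` and `proxies_for_request` are hypothetical helpers, the password is a placeholder, and the username layout (`groups-…,session-…` against `proxy.apify.com:8000`) follows Apify Proxy's documented connection format.

```python
import secrets

# Apify Proxy's documented host and port.
PROXY_HOST = "proxy.apify.com:8000"


def fresh_proxy_url(password: str, group: str = "RESIDENTIAL") -> str:
    """Build a proxy URL with a new random session ID (hypothetical helper)."""
    # A new session ID asks the proxy for a different exit IP.
    session_id = secrets.token_hex(8)
    return f"http://groups-{group},session-{session_id}:{password}@{PROXY_HOST}"


def proxies_for_request(password: str) -> dict:
    """Return a requests-style proxies mapping to use for one HTTP call only."""
    url = fresh_proxy_url(password)
    return {"http": url, "https": url}
```

A mapping like this could be passed as the `proxies` argument of a curl_cffi request together with `impersonate="chrome"`. Calling `proxies_for_request` again for each request, including each retry, is what keeps a blocked address from affecting more than one attempt.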

You can start with a single domain, then scale to a whole list when you are ready. Pick one dataset mode per run and the Actor returns clean records ready to drop into a spreadsheet, BI tool, warehouse, or AI agent.

🚀 What can Similarweb Scraper do?

  • 🗂️ Choose one of three dataset modes for each run:
    • 📊 Base data: global / country / category ranks, monthly visits, bounce rate, pages per visit, time on site, traffic-source split (direct, search, referral, social, paid, mail), top organic keywords with volume and CPC, AI traffic share per LLM (ChatGPT / Claude / Gemini / Perplexity / Copilot).
    • 🪞 Similar sites: competitors and alternatives with their traffic, category and ranking.
    • 🆔 AITDK: WHOIS via RDAP (registrar, registration / expiration dates, name servers, EPP status, DNSSEC) plus on-page keyword density analysis of the domain's homepage.
  • ⚡ Start small or run in bulk: one domain is enough for a test run; larger batches process up to 10 domains concurrently by default.
  • 🛡️ Captcha-resilient: uses Similarweb's open SPA endpoint, which serves 200 OK to Chrome TLS fingerprints, so no captcha solving is required for base data and similar sites.
  • 🔄 Per-request IP rotation: every HTTP call gets a fresh Apify Residential proxy session, so a blocked address costs at most one attempt.
  • 🌐 Three input formats: accepts example.com, www.example.com, or https://example.com. The domain is extracted automatically.
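If you want to clean a list before pasting it in, the three accepted input formats can be reduced to a bare domain with a few lines of Python. This is a minimal sketch with hypothetical helper names (`normalize_domain`, `normalize_all`). The Actor's own extraction rules are not documented here, so stripping the `www.` prefix is an assumption.

```python
from urllib.parse import urlsplit


def normalize_domain(raw: str) -> str:
    """Reduce example.com, www.example.com or https://example.com to a bare host."""
    text = raw.strip().lower()
    if "://" not in text:
        # Without a scheme, urlsplit only recognises the host after a leading "//".
        text = "//" + text
    host = urlsplit(text).hostname or ""
    # Assumption: the www. prefix is not significant for lookups.
    if host.startswith("www."):
        host = host[4:]
    return host


def normalize_all(values) -> list:
    """Normalize every entry, drop empties, and de-duplicate while keeping order."""
    seen = {}
    for value in values:
        domain = normalize_domain(value)
        if domain:
            seen.setdefault(domain, None)
    return list(seen)
```

De-duplicating before a bulk run avoids paying for the same domain twice when a list mixes formats.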

☁️ Remember the Apify platform

Running this Actor on Apify gives you everything that comes with the platform out of the box: managed Residential proxies with global exit IPs, scheduling (run hourly / daily / weekly), free storage in Apify Datasets with export to JSON / CSV / Excel / JSONL / XML / RSS, webhooks and integrations (Make, Zapier, n8n, Google Sheets, Slack, Airtable, Pipedream), and a REST API + Python / JavaScript SDKs to plug results into your own pipelines.

🗝️ What data can this Actor extract?

| Field group | Examples |
| --- | --- |
| Rankings | Global rank · country rank · category rank |
| Engagement | Total visits · monthly visits (3 months) · bounce rate · pages / visit · time on site |
| Traffic sources | Direct · search · referral · social · paid · mail (as shares) |
| AI traffic share | ChatGPT · Claude · Gemini · Perplexity · Copilot (current + 3-month history) |
| Top keywords | Keyword · estimated value · search volume · CPC |
| Country breakdown | Top countries with share + monthly visit estimates per country |
| Similar sites | Up to 20 related sites with traffic, ranks, category and thumbnails |
| WHOIS (via RDAP) | Registrar · IANA ID · registration / expiration / last-changed dates · name servers · EPP status |
| Keyword density | Top-20 non-stopword tokens from the homepage with count and density |
| Assets | Desktop / mobile screenshots · favicon |

🎯 How to use Similarweb Scraper

  1. Click Try for free on the Actor's Apify Store page.
  2. In the Domains field, enter one website to test, or paste a larger list later: one per line, any format (example.com, www.example.com, or https://example.com).
  3. Pick exactly one Dataset to fetch. Start with base_data if you want the standard Similarweb dashboard data.
  4. Click Start. A one-domain test run is fine; there is no 10-domain minimum. aitdk takes longer than the other modes because it also fetches RDAP and the homepage.
  5. When the run finishes, open Storage → Dataset and export to JSON, CSV, Excel, JSONL, XML or RSS. Or pull the results through the API: https://api.apify.com/v2/datasets/{dataset_id}/items.
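For step 5, the export URL can be built programmatically. The sketch below assumes the `format` query parameter of Apify's dataset items endpoint, with Excel mapped to `xlsx`. `dataset_items_url` is a hypothetical helper, and the optional `token` argument is only relevant if your dataset is not readable by ID alone.

```python
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"
# Export formats named on this page; Excel corresponds to "xlsx".
EXPORT_FORMATS = {"json", "jsonl", "csv", "xlsx", "xml", "rss"}


def dataset_items_url(dataset_id: str, fmt: str = "json", token: str = None) -> str:
    """Return the items URL for a dataset in the requested export format."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    params = {"format": fmt}
    if token:
        params["token"] = token
    return f"{API_BASE}/datasets/{dataset_id}/items?{urlencode(params)}"
```

For example, `dataset_items_url("<DATASET_ID>", "csv")` gives a link you can hand to a spreadsheet import or a download job.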

📥 Input

The form has only two fields; everything else has sensible defaults:

| Field | Type | Default |
| --- | --- | --- |
| domains | array | required, minimum 1 domain |
| datasets | enum | base_data (one of base_data, similar_sites, aitdk) |

Example input

{
  "domains": ["openai.com"],
  "datasets": "base_data"
}
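When generating inputs from a script, it can help to check the two constraints described above (at least one domain, exactly one of the three dataset modes) before starting a run. This is a minimal sketch, and `build_input` is a hypothetical helper rather than part of the Actor.

```python
ALLOWED_DATASETS = ("base_data", "similar_sites", "aitdk")


def build_input(domains, dataset: str = "base_data") -> dict:
    """Validate and assemble an input object in the shape shown above."""
    # Drop blank lines that often sneak in when pasting a list.
    cleaned = [d.strip() for d in domains if d and d.strip()]
    if not cleaned:
        raise ValueError("at least one domain is required")
    if dataset not in ALLOWED_DATASETS:
        raise ValueError(f"datasets must be one of {ALLOWED_DATASETS}, got {dataset!r}")
    return {"domains": cleaned, "datasets": dataset}
```

Serialising the result with `json.dumps` produces a body that matches the example input.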

📤 Output

Each domain produces one dataset item. Each item conforms to the dataset schema and is rendered in the Apify Console views that match the selected dataset mode: 📊 Overview · 🚦 Traffic sources · 💫 Engagement · 🤖 AI traffic share · 🆔 AITDK (WHOIS and keywords).

Example item (abridged)

{
  "domain": "openai.com",
  "rankGlobal": 207,
  "country": "US",
  "countryRank": 306,
  "category": "ai_chatbots_and_tools",
  "categoryRank": 6,
  "title": "OpenAI",
  "totalVisits": 195737812,
  "bounceRate": 0.5937,
  "pagesPerVisit": 2.59,
  "timeOnSite": 138.72,
  "socialTraffic": 0.0287,
  "searchTraffic": 0.2154,
  "directTraffic": 0.3840,
  "referralTraffic": 0.1038,
  "aiTrafficShareChatgpt": 0.8825,
  "aiTrafficShareClaude": 0.0029,
  "aiTrafficShareGemini": 0.0106,
  "topKeywords": [
    {"keyword": "chatgpt", "estimatedValue": 20907500.0, "searchVolume": 173339160.0, "cpc": 0.14},
    {"keyword": "chat gpt", "estimatedValue": 5688810.0, "searchVolume": 95011780.0, "cpc": 0.14}
  ]
}
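A record like the one above is easy to post-process. The sketch below relies only on field names visible in the abridged example. The helper names are hypothetical, and the assumption that every numeric `aiTrafficShare*` field is a current share (rather than part of the history data) is not confirmed by the schema.

```python
AI_PREFIX = "aiTrafficShare"


def ai_shares(item: dict) -> dict:
    """Collect the numeric aiTrafficShare* fields, keyed by the LLM suffix."""
    return {
        key[len(AI_PREFIX):]: value
        for key, value in item.items()
        if key.startswith(AI_PREFIX) and isinstance(value, (int, float))
    }


def leading_ai_source(item: dict):
    """Return the LLM with the largest share, or None if none are present."""
    shares = ai_shares(item)
    return max(shares, key=shares.get) if shares else None


def keyword_rows(item: dict) -> list:
    """Flatten topKeywords into one row per keyword, tagged with the domain."""
    return [{"domain": item["domain"], **kw} for kw in item.get("topKeywords", [])]


# Values copied from the abridged example item.
sample = {
    "domain": "openai.com",
    "aiTrafficShareChatgpt": 0.8825,
    "aiTrafficShareClaude": 0.0029,
    "aiTrafficShareGemini": 0.0106,
    "topKeywords": [
        {"keyword": "chatgpt", "searchVolume": 173339160.0, "cpc": 0.14},
        {"keyword": "chat gpt", "searchVolume": 95011780.0, "cpc": 0.14},
    ],
}
```

Flat rows of this kind can be written with `csv.DictWriter` or loaded into a dataframe, which is convenient when comparing keywords across many domains.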

🔗 Integrate Similarweb Scraper anywhere

Apify Actors are exposed through a REST API, so every run, dataset and webhook is addressable from your code:

# Trigger a run from anywhere
curl -X POST "https://api.apify.com/v2/acts/<USER>~similarweb-scraper/runs?token=<API_TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{"domains": ["openai.com"], "datasets": "base_data"}'

# Read results from the run's default dataset
curl "https://api.apify.com/v2/datasets/<DATASET_ID>/items?format=json"

Or use the official Python and JavaScript clients.
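With the Python client, the same two steps (start a run, read its default dataset) might look like the sketch below. It assumes the `apify-client` package, whose `ApifyClient.actor(...).call(run_input=...)` returns a run record containing `defaultDatasetId`, and whose `dataset(...).iterate_items()` yields the items. `run_and_collect` and `main` are hypothetical names, `<USER>` is the same placeholder as in the curl example, and the `APIFY_TOKEN` environment variable is an assumption about where you keep your token.

```python
def run_and_collect(client, actor_id: str, run_input: dict) -> list:
    """Start the Actor, wait for it to finish, and return its dataset items."""
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("the Actor run did not return a run record")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


def main() -> None:
    # Imported here so the helper above stays usable without the package installed.
    import os
    from apify_client import ApifyClient

    client = ApifyClient(os.environ["APIFY_TOKEN"])
    items = run_and_collect(
        client,
        "<USER>~similarweb-scraper",  # placeholder Actor ID, as in the curl example
        {"domains": ["openai.com"], "datasets": "base_data"},
    )
    for item in items:
        print(item.get("domain"), item.get("totalVisits"))
```

Call `main()` once the token is set. Passing the client in as an argument also makes `run_and_collect` easy to exercise with a stub in tests.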

❓ FAQ

Can I test it with only one domain? Yes. The input requires at least one domain, not ten. The default example uses openai.com with base_data, so a new user can click Start immediately.

🔑 Do I need a Similarweb account or API key? No. This Actor talks to Similarweb's public SPA endpoint directly. No login, no API key, and no scraping the captcha-gated in-depth pages.

🆕 Is the data fresh? Yes: it's the same JSON Similarweb's UI loads. The snapshotDate field on every record tells you exactly which month it represents. Similarweb refreshes its traffic data monthly.

⏰ Can I run this on a schedule? Yes. Open the Actor in Apify Console, go to Schedules and pick hourly, daily, weekly or a custom cron. Combine with webhooks to push fresh data into Google Sheets, Slack, Make, Zapier or your own backend automatically.

⚖️ Is web scraping legal? Public web pages are generally legal to scrape, but you must respect copyright, terms of service, and personal-data protection laws (GDPR in the EU and similar regulations elsewhere). This Actor only extracts publicly visible data; no personal data is collected. See Apify's legal blog for details.

💬 Support

📝 Changelog

See CHANGELOG.md for the full release history. The Actor follows Semantic Versioning.