Similarweb Scraper - Traffic, AI Traffic & WHOIS
Pricing
from $0.80 / 1,000 domains
Spy on any website in seconds: traffic, rankings, top keywords, AI traffic share (ChatGPT/Claude/Gemini), competitors, similar sites and WHOIS, all from Similarweb. No login or API key. Bulk parallel scrape, captcha-resilient. Export to JSON/CSV/Excel. SEO, lead gen, research.
Rating: 5.0 (1)
Developer: VortexData
Maintained by Community
Actor stats
Bookmarked: 2
Total users: 12
Monthly active users: 11
Last modified: 5 days ago
Similarweb Scraper
Website intelligence for any domain in seconds. Start with one website, choose traffic/rankings, similar sites, or WHOIS + homepage keywords, then export the results as JSON, CSV, Excel or any other format Apify supports.
What is Similarweb Scraper?
Similarweb Scraper is a fast, captcha-resilient web scraper that pulls the same data the Similarweb web app shows you, without requiring a Similarweb account, login, or API key. Behind the scenes it talks to Similarweb's own SPA data endpoint using a real Chrome TLS / JA3 fingerprint via curl_cffi and routes every request through a fresh Apify Residential proxy session, so you get reliable, production-grade data for any domain.
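The request pattern described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the Actor's actual source: the helper names `fresh_proxy_url` and `fetch_json` are hypothetical, the target URL is left to the caller because the endpoint is not documented here, and the proxy username follows Apify's documented `groups-...,session-...` convention.

```python
import uuid


def fresh_proxy_url(password: str) -> str:
    """Build an Apify Residential proxy URL with a new random session ID.

    Hypothetical helper: a new session ID asks the proxy for a new exit IP.
    """
    session_id = uuid.uuid4().hex[:12]
    username = f"groups-RESIDENTIAL,session-{session_id}"
    return f"http://{username}:{password}@proxy.apify.com:8000"


def fetch_json(url: str, proxy_password: str, timeout: float = 30.0) -> dict:
    """GET `url` with a Chrome TLS fingerprint through a fresh proxy session."""
    # Imported lazily so fresh_proxy_url() works without curl_cffi installed.
    from curl_cffi import requests

    proxy = fresh_proxy_url(proxy_password)
    response = requests.get(
        url,
        impersonate="chrome",  # mimic Chrome's TLS / JA3 handshake
        proxies={"http": proxy, "https": proxy},
        timeout=timeout,
    )
    response.raise_for_status()
    return response.json()
```

Because each call builds its own proxy URL, a blocked address affects only the request that used it; a retry simply calls `fetch_json` again.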
You can start with a single domain, then scale to a whole list when you are ready. Pick one dataset mode per run and the Actor returns clean records ready to drop into a spreadsheet, BI tool, warehouse, or AI agent.
What can Similarweb Scraper do?
- Choose one of three dataset modes for each run:
  - Base data: global / country / category ranks, monthly visits, bounce rate, pages per visit, time on site, traffic-source split (direct, search, referral, social, paid, mail), top organic keywords with volume and CPC, AI traffic share per LLM (ChatGPT / Claude / Gemini / Perplexity / Copilot).
  - Similar sites: competitors and alternatives with their traffic, category and ranking.
  - AITDK: WHOIS via RDAP (registrar, registration / expiration dates, name servers, EPP status, DNSSEC) plus on-page keyword density analysis of the domain's homepage.
- Start small or run in bulk: one domain is enough for a test run; larger batches process up to 10 domains concurrently by default.
- Captcha-resilient: uses Similarweb's open SPA endpoint, which serves `200 OK` to Chrome TLS fingerprints, so no captcha solving is required for base data and similar sites.
- Per-request IP rotation: every HTTP call gets a fresh Apify Residential proxy session, so a blocked address costs at most one attempt.
- Three input formats: accepts `example.com`, `www.example.com`, or `https://example.com`. The domain is extracted automatically.
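One way the three accepted input formats could be reduced to a bare domain is sketched below. The function name `extract_domain` is hypothetical, and stripping a leading `www.` is an assumption about what "extracted automatically" means; the Actor's real normalization may differ.

```python
from urllib.parse import urlsplit


def extract_domain(raw: str) -> str:
    """Return the bare domain for inputs like 'example.com',
    'www.example.com' or 'https://example.com'."""
    value = raw.strip().lower()
    if "://" not in value:
        # Prefix '//' so urlsplit treats the value as a network location.
        value = "//" + value
    host = urlsplit(value).hostname or ""
    if host.startswith("www."):
        host = host[4:]  # assumption: 'www.' is not part of the lookup key
    if "." not in host:
        raise ValueError(f"not a domain: {raw!r}")
    return host
```

All three spellings of the example above map to `example.com`, so a pasted list can mix formats freely.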
Remember the Apify platform
Running this Actor on Apify gives you everything that comes with the platform out of the box: managed Residential proxies with global exit IPs, scheduling (run hourly / daily / weekly), free storage in Apify Datasets with export to JSON / CSV / Excel / JSONL / XML / RSS, webhooks and integrations (Make, Zapier, n8n, Google Sheets, Slack, Airtable, Pipedream), and a REST API + Python / JavaScript SDKs to plug results into your own pipelines.
What data can this Actor extract?
| Field group | Examples |
|---|---|
| Rankings | Global rank · country rank · category rank |
| Engagement | Total visits · monthly visits (3 months) · bounce rate · pages / visit · time on site |
| Traffic sources | Direct · search · referral · social · paid · mail (as shares) |
| AI traffic share | ChatGPT · Claude · Gemini · Perplexity · Copilot (current + 3-month history) |
| Top keywords | Keyword · estimated value · search volume · CPC |
| Country breakdown | Top countries with share + monthly visit estimates per country |
| Similar sites | Up to 20 related sites with traffic, ranks, category and thumbnails |
| WHOIS (via RDAP) | Registrar · IANA ID · registration / expiration / last-changed dates · name servers · EPP status |
| Keyword density | Top-20 non-stopword tokens from the homepage with count and density |
| Assets | Desktop / mobile screenshots · favicon |
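To make the "Keyword density" row concrete, here is a minimal sketch of how top non-stopword tokens with count and density could be computed from page text. The stopword list, the tokenizer, the output key names, and the definition of density (count divided by the number of non-stopword tokens) are all assumptions made for illustration; the Actor's own rules are not documented here.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real one would be much longer.
STOPWORDS = frozenset(
    {"a", "an", "and", "the", "of", "to", "in", "for", "is", "on", "with"}
)


def keyword_density(text: str, top_n: int = 20) -> list:
    """Return the top_n non-stopword tokens with their count and density."""
    tokens = [
        token
        for token in re.findall(r"[a-z0-9]+", text.lower())
        if token not in STOPWORDS
    ]
    total = len(tokens)
    if total == 0:
        return []
    return [
        {"keyword": word, "count": count, "density": round(count / total, 4)}
        for word, count in Counter(tokens).most_common(top_n)
    ]
```

For the text "The cat and the hat and the cat", three tokens survive the stopword filter, so `cat` gets a count of 2 and a density of 0.6667.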
How to use Similarweb Scraper
- Click Try for free on the Actor's Apify Store page.
- In the Domains field, enter one website to test, or paste a larger list later, one per line, in any format (`example.com`, `www.example.com`, or `https://example.com`).
- Pick exactly one Dataset to fetch. Start with `base_data` if you want the standard Similarweb dashboard data.
- Click Start. A one-domain test run is fine; there is no 10-domain minimum. `aitdk` takes longer than the other modes because it also fetches RDAP and the homepage.
- When the run finishes, open Storage → Dataset and export to JSON, CSV, Excel, JSONL, XML or RSS. Or pull the results through the API: `https://api.apify.com/v2/datasets/{dataset_id}/items`.
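Pulling results through that dataset endpoint might look like the sketch below, which uses only the Python standard library. The helper names are hypothetical; `format` and `token` are query parameters of the Apify dataset items endpoint, and the token is only needed when the dataset is not publicly readable.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen

API_BASE = "https://api.apify.com/v2"


def dataset_items_url(
    dataset_id: str, fmt: str = "json", token: Optional[str] = None
) -> str:
    """Build the items URL for a dataset in the requested export format."""
    params = {"format": fmt}
    if token:
        params["token"] = token
    return f"{API_BASE}/datasets/{dataset_id}/items?{urlencode(params)}"


def fetch_items(dataset_id: str, token: Optional[str] = None) -> list:
    """Download all items of a dataset as a list of dicts (network call)."""
    with urlopen(dataset_items_url(dataset_id, "json", token), timeout=30) as resp:
        return json.load(resp)
```

Switching `fmt` to `csv` or `xlsx` yields a URL suitable for a direct spreadsheet download.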
Input
The form has only two fields; everything else has sensible defaults:
| Field | Type | Default |
|---|---|---|
| `domains` | array | required, minimum 1 domain |
| `datasets` | enum | `base_data` (one of `base_data`, `similar_sites`, `aitdk`) |
Example input

    {
      "domains": ["openai.com"],
      "datasets": "base_data"
    }
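Before submitting a run from code, it can help to check the input against the two rules stated above (at least one domain, exactly one of the three dataset modes). The following is a small client-side sketch; `build_input` is a hypothetical helper and is not part of the Actor.

```python
ALLOWED_DATASETS = ("base_data", "similar_sites", "aitdk")


def build_input(domains, dataset="base_data"):
    """Return a run input dict, or raise ValueError if it breaks the rules."""
    # Drop empty lines, e.g. from a pasted one-per-line list.
    cleaned = [d.strip() for d in domains if d and d.strip()]
    if not cleaned:
        raise ValueError("at least one domain is required")
    if dataset not in ALLOWED_DATASETS:
        raise ValueError(f"datasets must be one of {ALLOWED_DATASETS}")
    return {"domains": cleaned, "datasets": dataset}
```

Calling `build_input(["openai.com"])` reproduces the example input shown above.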
Output
Each domain produces one dataset item. Each item conforms to the dataset schema and is rendered in the Apify Console views that match the selected dataset mode: Overview · Traffic sources · Engagement · AI traffic share · AITDK (WHOIS and keywords).
Example item (abridged)

    {
      "domain": "openai.com",
      "rankGlobal": 207,
      "country": "US",
      "countryRank": 306,
      "category": "ai_chatbots_and_tools",
      "categoryRank": 6,
      "title": "OpenAI",
      "totalVisits": 195737812,
      "bounceRate": 0.5937,
      "pagesPerVisit": 2.59,
      "timeOnSite": 138.72,
      "socialTraffic": 0.0287,
      "searchTraffic": 0.2154,
      "directTraffic": 0.3840,
      "referralTraffic": 0.1038,
      "aiTrafficShareChatgpt": 0.8825,
      "aiTrafficShareClaude": 0.0029,
      "aiTrafficShareGemini": 0.0106,
      "topKeywords": [
        {"keyword": "chatgpt", "estimatedValue": 20907500.0, "searchVolume": 173339160.0, "cpc": 0.14},
        {"keyword": "chat gpt", "estimatedValue": 5688810.0, "searchVolume": 95011780.0, "cpc": 0.14}
      ]
    }
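As an illustration of working with such an item, the sketch below finds the largest traffic source and ranks the AI traffic shares. It relies only on field names visible in the abridged example; the fields for paid and mail traffic are not shown there, so they are deliberately left out rather than guessed. Both function names are hypothetical.

```python
from typing import Optional, Tuple

# Only the traffic fields that appear in the abridged example item.
TRAFFIC_FIELDS = ("directTraffic", "searchTraffic", "referralTraffic", "socialTraffic")
AI_PREFIX = "aiTrafficShare"


def top_traffic_source(item: dict) -> Optional[Tuple[str, float]]:
    """Return (field, share) for the largest known traffic source, if any."""
    present = {f: item[f] for f in TRAFFIC_FIELDS if item.get(f) is not None}
    if not present:
        return None
    name = max(present, key=present.get)
    return name, present[name]


def ai_shares(item: dict) -> dict:
    """Return AI traffic shares keyed by lowercased LLM name, largest first."""
    shares = {
        key[len(AI_PREFIX):].lower(): value
        for key, value in item.items()
        if key.startswith(AI_PREFIX) and isinstance(value, (int, float))
    }
    return dict(sorted(shares.items(), key=lambda kv: kv[1], reverse=True))
```

For the example item, direct traffic (0.384) is the largest of the four listed sources and ChatGPT leads the AI shares.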
Integrate Similarweb Scraper anywhere
Apify Actors can be driven through a REST API, so every run, dataset and webhook is addressable from your code:

    # Trigger a run from anywhere
    curl -X POST "https://api.apify.com/v2/acts/<USER>~similarweb-scraper/runs?token=<API_TOKEN>" \
      -H "Content-Type: application/json" \
      -d '{"domains": ["openai.com"], "datasets": "base_data"}'

    # Read results from the run's default dataset
    curl "https://api.apify.com/v2/datasets/<DATASET_ID>/items?format=json"

Or use the official Python and JavaScript clients.
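With the official Python client (`apify-client`), the same flow could be sketched as below. The wrapper `run_scraper` is a hypothetical helper; it takes the client as a parameter so it can be exercised without network access. The `<USER>` and `<API_TOKEN>` placeholders are kept as in the curl example because the real values are not given here.

```python
def run_scraper(client, actor_id, domains, dataset="base_data"):
    """Start a run, wait for it to finish, and return its dataset items.

    `client` is expected to behave like apify_client.ApifyClient.
    """
    run_input = {"domains": list(domains), "datasets": dataset}
    # call() blocks until the run finishes and returns the run record.
    run = client.actor(actor_id).call(run_input=run_input)
    dataset_client = client.dataset(run["defaultDatasetId"])
    return list(dataset_client.iterate_items())
```

After `pip install apify-client`, a call such as `run_scraper(ApifyClient("<API_TOKEN>"), "<USER>/similarweb-scraper", ["openai.com"])` would return one dict per domain.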
FAQ
Can I test it with only one domain?
Yes. The input requires at least one domain, not ten. The default
example uses openai.com with base_data, so a new user can click
Start immediately.
Do I need a Similarweb account or API key?
No. This Actor talks to Similarweb's public SPA endpoint directly. No login, no API key, and no scraping the captcha-gated in-depth pages.
Is the data fresh?
Yes, it's the same JSON that Similarweb's UI loads. The `snapshotDate` field on every record tells you exactly which month it represents. Similarweb refreshes its traffic data monthly.
Can I run this on a schedule?
Yes. Open the Actor in Apify Console, go to Schedules and pick hourly, daily, weekly or a custom cron. Combine with webhooks to push fresh data into Google Sheets, Slack, Make, Zapier or your own backend automatically.
Is web scraping legal?
Scraping public web pages is generally legal, but you must respect copyright, terms of service, and personal-data protection laws (GDPR in the EU and similar regulations elsewhere). This Actor only extracts publicly visible data; no personal data is collected. See Apify's legal blog for details.
Support
- Found a bug or have a feature request? Open an issue in the Actor's Issues tab on Apify Console.
- Questions about Apify itself? Visit docs.apify.com or the Apify Discord community.
Changelog
See CHANGELOG.md for the full release history. The Actor follows Semantic Versioning.