Similarweb Scraper

7 days trial then $15.00/month - No credit card required now

epctex/similarweb-scraper

The most comprehensive Similarweb Scraper you will ever find. Obtain data on website popularity and receive it in formats such as JSON, XML, CSV, Excel, or an HTML table.

Each scrape gathers the following details:

  • Company information
  • Engagement metrics: total visits, pages per visit, average visit duration, and bounce rate
  • Popularity rankings (global, by country, and by category)
  • Traffic sources (such as ads, direct visits, search engines, social networks, and more)
  • Social network distribution
  • Competitor websites
  • Technologies used
  • Demographic breakdown by age and gender
  • Top topics and websites that interest visitors
  • Top visitor countries

Bugs, fixes, updates, and changelog

This scraper is under active development. If you have a feature request, you can create an issue for it.

Input Parameters

The input to this scraper is a JSON object listing the Similarweb pages to visit. The required fields are:

  • websites: (Required) (Array) Domains or full Similarweb URLs to retrieve results for.

  • proxy: (Required) (Proxy Object) Proxy configuration.

This actor requires proxy servers: either your own or Apify Proxy.

Compute Unit Consumption

The actor is optimized to run fast and scrape as many items as possible, so it prioritizes the detail requests. If the actor isn't blocked very often, it scrapes 100 websites in about 2 minutes using ~0.02-0.05 compute units, depending on the proxy and the number of retries.
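
As a rough planning aid, the figure quoted above can be scaled to other batch sizes. The sketch below assumes consumption grows linearly with the number of websites, which this page does not state; `estimate_compute_units` is a hypothetical helper and not part of the actor.

```python
def estimate_compute_units(num_websites: int) -> tuple[float, float]:
    """Scale the quoted ~0.02-0.05 CU per 100 websites to another batch size.

    Linear scaling is an assumption: blocking and retries can push
    real consumption above the upper bound.
    """
    if num_websites < 0:
        raise ValueError("num_websites must be non-negative")
    batches = num_websites / 100
    return (batches * 0.02, batches * 0.05)


low, high = estimate_compute_units(1000)
print(f"1000 websites: roughly {low:.2f}-{high:.2f} compute units")
```

Treat the result as a lower-end estimate, since heavy blocking or a slow proxy will add retries.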

Similarweb Scraper Input example

{
  "websites": [
    "apify.com",
    "https://www.similarweb.com/website/example.com"
  ],
  "disableAnalytics": true,
  "proxy": {
    "useApifyProxy": true
  }
}
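
An input like the one above can also be assembled in code before starting a run. This is a minimal sketch using only the two documented required fields; `build_input` is a hypothetical helper, and the undocumented `disableAnalytics` flag from the example is deliberately left out.

```python
import json


def build_input(websites, proxy=None):
    """Assemble the actor input from domains or full Similarweb URLs.

    Only the documented required fields are produced. The default proxy
    object mirrors the example input; pass your own dict to override it.
    """
    cleaned = [w.strip() for w in websites if w and w.strip()]
    if not cleaned:
        raise ValueError("'websites' must contain at least one entry")
    if proxy is None:
        proxy = {"useApifyProxy": True}
    return {"websites": cleaned, "proxy": proxy}


actor_input = build_input(
    ["apify.com", "https://www.similarweb.com/website/example.com"]
)
print(json.dumps(actor_input, indent=2))
```

Checking for an empty list locally avoids starting a run that would stop immediately with a failure state.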

During the Run

During the run, the actor outputs log messages describing its progress. Each message contains a short label identifying which page from the provided list is currently being processed. When items are loaded from a page, a message reports the loaded item count and the total item count for that page.

If you provide incorrect input to the actor, it will immediately stop with a failure state and output an explanation of what is wrong.

Similarweb Export

During the run, the actor stores results in a dataset. Each scraped website is stored as a separate item in the dataset.

You can process the results in any language (Python, PHP, Node.js). See the FAQ or our API reference to learn more about getting results from this Similarweb actor.
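
For instance, the items of a finished run can be downloaded over HTTP with the Python standard library. This sketch assumes the Apify API v2 dataset-items endpoint and bearer-token authentication as I understand them from Apify's public API; check both against the API reference. The dataset ID and token are placeholders, and both function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"  # assumed base URL of the Apify API


def dataset_items_url(dataset_id, fmt="json"):
    """Build the URL for a dataset's items in one of the export formats."""
    allowed = {"json", "xml", "csv", "xlsx", "html"}
    if fmt not in allowed:
        raise ValueError(f"unsupported format: {fmt}")
    query = urllib.parse.urlencode({"format": fmt})
    safe_id = urllib.parse.quote(dataset_id, safe="")
    return f"{API_BASE}/datasets/{safe_id}/items?{query}"


def fetch_items(dataset_id, token):
    """Download all items as parsed JSON (performs a network request)."""
    request = urllib.request.Request(
        dataset_items_url(dataset_id, "json"),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Building the URL needs no network access; fetch_items does.
print(dataset_items_url("YOUR_DATASET_ID", "csv"))
```

Switching `fmt` selects the other export formats mentioned at the top of this page.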

Scraped Similarweb Properties

Each item in the dataset has the following structure:

Item Detail

{
  "url": "https://www.similarweb.com/website/example.com",
  "name": "example.com",
  "description": "",
  "icon": "https://site-images.similarcdn.com/image?url=example.com&t=2&s=1&h=b63fbdb6c1a27c3f1467398b04fb81fcac694d85223c8a8eb2ca7c3e276b4af8",
  "previewDesktop": "https://site-images.similarcdn.com/image?url=example.com&t=1&s=1&h=b63fbdb6c1a27c3f1467398b04fb81fcac694d85223c8a8eb2ca7c3e276b4af8",
  "previewMobile": "https://site-images.similarcdn.com/image?url=example.com&t=4&s=1&h=b63fbdb6c1a27c3f1467398b04fb81fcac694d85223c8a8eb2ca7c3e276b4af8",
  "totalVisits": 2731009,
  "bounceRate": 0.6584328504024405,
  "pagesPerVisit": 2.1271126903721043,
  "avgVisitDuration": "00:01:10",
  "companyName": "example.com",
  "companyYearFounded": 2020,
  "companyEmployeesMin": 11,
  "companyEmployeesMax": 50,
  "companyAnnualRevenueMin": 5000000,
  "companyAnnualRevenueMax": 10000000,
  "companyHeadquarterCountryCode": "US",
  "companyHeadquarterCity": "Poway",
  "categoryId": "computers_electronics_and_technology/programming_and_developer_software",
  "globalRank": 27325,
  "countryRank": 24029,
  "categoryRank": 494,
  "organicTraffic": 0.02037535305716658,
  "paidTraffic": 0.00013297517471535874,
  "socialNetworkDistribution": [
    {
      "name": "Facebook",
      "visitsShare": 0.3255293008801001,
      "icon": "https://site-images.similarcdn.com/image?url=facebook.com&t=2&s=1&h=be773d6b77aa3d59b6a671c5c27ad729b1ae77400e89776e2f749cce6b926c4b"
    },
    {
      "name": "Reddit",
      "visitsShare": 0.20677069669874004,
      "icon": "https://site-images.similarcdn.com/image?url=reddit.com&t=2&s=1&h=66f2412047e0362ec60d5583d4b186511a8e859446bb112c60d22968facae906"
    }
  ],
  "topReferrals": [
    {
      "domain": "sainstore.com.cn",
      "icon": "https://site-images.similarcdn.com/image?url=sainstore.com.cn&t=2&s=1&h=6599e599a77a07a671b445f6dbcc0dcb642ed742b73da654a8f249cd14cd34e2",
      "visitsShare": 0.039163300136706325,
      "isLocked": false
    }
  ],
  "topIncomingCategories": [
    {
      "category": "Computers_Electronics_and_Technology/Programming_and_Developer_Software",
      "visitsShare": 0.20976287943264638
    }
  ],
  "topOutgoingSites": [
    {
      "domain": "bing.com",
      "icon": "https://site-images.similarcdn.com/image?url=bing.com&t=2&s=1&h=a37ae326ae78ba2da30e9c8da030e84e07d4c76c5452c239f85c9050781ffde8",
      "visitsShare": 0.15244198374460416,
      "isLocked": false
    }
  ],
  "trafficSources": {
    "directVisitsShare": 0.7987929429397509,
    "referralVisitsShare": 0.14399020951353203,
    "organicSearchVisitsShare": 0.02037535305716658,
    "paidSearchVisitsShare": 0.00013297517471535874,
    "socialNetworksVisitsShare": 0.012302031511970067,
    "mailVisitsShare": 0.00922292962838403,
    "adsVisitsShare": 0.015183558174481039
  },
  "adsSource": [
    {
      "domain": "spy.house",
      "icon": "https://site-images.similarcdn.com/image?url=spy.house&t=2&s=1&h=726c13ae2747aa7e6eaa3b875da2e2f1fa14bfd4c535bd7a649c7cc417785d7b",
      "visitsShare": 0.024292125814544006,
      "isLocked": false
    }
  ],
  "topKeywords": [
    {
      "name": "example.com",
      "estimatedValue": 2310.074622306198,
      "volume": 8810
    }
  ],
  "topSimilarityCompetitors": [
    {
      "domain": "imagemagick.org",
      "icon": "https://site-images.similarcdn.com/image?url=imagemagick.org&t=2&s=1&h=4acde3cf8fb1d14a87ca92846d2233fb2ff498afed6faa2520f19e65309e4fd3",
      "visitsTotalCount": 313828,
      "categoryId": "computers_electronics_and_technology/programming_and_developer_software",
      "categoryRank": 4116,
      "affinity": 1,
      "isDataFromGa": false
    }
  ],
  "technologies": [
    {
      "categoryId": "payment_and_currencies",
      "topTechName": "PayPal",
      "topTechIconUrl": "https://s3.amazonaws.com/s3-static-us-east-1.similarweb.com/technographics/id=1146",
      "technologiesTotalCount": 21
    }
  ],
  "globalRankHistory": [
    {
      "date": "2023-07-01T00:00:00+00:00",
      "rank": 27111
    },
    {
      "date": "2023-08-01T00:00:00+00:00",
      "rank": 23239
    }
  ],
  "countryRankCompetitors": [
    {
      "rank": 24027,
      "domain": "beautyrest.com",
      "icon": "https://site-images.similarcdn.com/image?url=beautyrest.com&t=2&s=1&h=8b7053f2087658d99913b631de5c97b9083682ee9fc4b4bfe4a20e40ef093cbd"
    }
  ],
  "countryRankHistory": [
    {
      "date": "2023-07-01T00:00:00+00:00",
      "rank": 19812
    }
  ],
  "categoryRankCompetitors": [
    {
      "rank": 492,
      "domain": "getintopc.com",
      "icon": "https://site-images.similarcdn.com/image?url=getintopc.com&t=2&s=1&h=27eff1e0f77ac735d5db39957bb5b137f482a949bea4dcf5b83ff0ca2a8087a6"
    }
  ],
  "categoryRankHistory": [
    {
      "date": "2023-07-01T00:00:00+00:00",
      "rank": 407
    },
    {
      "date": "2023-08-01T00:00:00+00:00",
      "rank": 402
    },
    {
      "date": "2023-09-01T00:00:00+00:00",
      "rank": 494
    }
  ],
  "globalRankCompetitors": [
    {
      "rank": 27323,
      "domain": "soccerbet.rs",
      "icon": "https://site-images.similarcdn.com/image?url=soccerbet.rs&t=2&s=1&h=6b63bac9019d26763bf796676857925f149df5806baaaf5ff4c92e2860fb3676"
    }
  ],
  "ageDistribution": [
    {
      "minAge": 25,
      "maxAge": 34,
      "value": 0.34934147381902564
    }
  ],
  "topInterestedWebsites": [
    {
      "domain": "iana.org",
      "icon": "https://site-images.similarcdn.com/image?url=iana.org&t=2&s=1&h=8dc33e2b49124d93e7fdb59eb700c2f1535c5e1a10905e508efc65ca6880c622"
    }
  ],
  "topInterestedTopics": [
    "google",
    "search",
    "news",
    "internet",
    "community"
  ],
  "topInterestedCategories": [
    "computers_electronics_and_technology/computers_electronics_and_technology",
    "computers_electronics_and_technology/programming_and_developer_software",
    "adult",
    "computers_electronics_and_technology/file_sharing_and_hosting",
    "computers_electronics_and_technology/social_networks_and_online_communities"
  ],
  "topCountries": [
    {
      "countryAlpha2Code": "US",
      "countryUrlCode": "united-states",
      "visitsShare": 0.17667851173150384,
      "visitsShareChange": -0.02132305166773374
    }
  ],
  "globalRankPrev": 23239,
  "globalRankChange": -4086,
  "categoryRankPrev": 402,
  "categoryRankChange": -92,
  "countryRankPrev": 19798,
  "countryRankChange": -4231
}
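
A few of these fields are easier to use after light post-processing. The sketch below works on values copied from the sample item. The helper names are hypothetical, and the reading of the `*RankChange` fields as "previous rank minus current rank" is inferred from the sample numbers rather than documented.

```python
def duration_to_seconds(hhmmss):
    """Convert an "HH:MM:SS" string such as avgVisitDuration to seconds."""
    hours, minutes, seconds = (int(part) for part in hhmmss.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def top_traffic_source(item):
    """Return the (name, share) pair with the largest share in trafficSources."""
    return max(item["trafficSources"].items(), key=lambda pair: pair[1])


def rank_change(previous, current):
    """Previous rank minus current rank; negative means the rank number grew.

    Inferred from the sample item (23239 - 27325 = -4086), not documented.
    """
    return previous - current


# Values copied from the sample item.
item = {
    "avgVisitDuration": "00:01:10",
    "globalRank": 27325,
    "globalRankPrev": 23239,
    "trafficSources": {
        "directVisitsShare": 0.7987929429397509,
        "referralVisitsShare": 0.14399020951353203,
        "organicSearchVisitsShare": 0.02037535305716658,
        "paidSearchVisitsShare": 0.00013297517471535874,
        "socialNetworksVisitsShare": 0.012302031511970067,
        "mailVisitsShare": 0.00922292962838403,
        "adsVisitsShare": 0.015183558174481039,
    },
}

print(duration_to_seconds(item["avgVisitDuration"]))            # 70
print(top_traffic_source(item)[0])                              # directVisitsShare
print(rank_change(item["globalRankPrev"], item["globalRank"]))  # -4086
```

The same subtraction reproduces the sample's `categoryRankChange` (402 - 494 = -92) and `countryRankChange` (19798 - 24029 = -4231).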

Contact

Visit epctex.com to see all of our available products. If you are looking for a custom integration, reach out to us through the chat box on epctex.com. For support, contact devops@epctex.com.

Developer
Maintained by Community
Actor metrics
  • 18 monthly users
  • 100.0% runs succeeded
  • Created in Oct 2023