BuiltWith Scraper
Pricing
from $4.99 / 1,000 results
Developer: API Empire
Maintained by Community
BuiltWith Domain Technology Scraper
Extract the complete technology stack, company profile, and social footprint of any website at scale, with no captchas to solve by hand and a 3-tier proxy fallback that keeps runs going when a tier is blocked.
Point the actor at one URL or a thousand. You get back one clean JSON record per domain: analytics tools, CDNs, ad networks, JavaScript libraries, hosting, payment processors, name servers, and copyright signals, as listed on the domain's public BuiltWith.com profile.
Why Choose This Actor?
| Feature | What you get |
|---|---|
| Smart proxy fallback | Starts direct for speed → escalates to datacenter → finally residential with 3 retries. Once it locks onto residential, it stays there. No budget is wasted on proxies a run does not need. |
| Gate bypass built-in | The BuiltWith JS / image-tile captcha is handled internally, so there is no need to paste cookies or solve puzzles. |
| Bulk input | Feed a single domain or thousands. Same actor, same output shape. |
| Full tech stack | 100+ categories: Analytics, CDN, Frameworks, JS libs, Ad networks, Payments, Hosting, Email, DNS, Operating systems, Copyright signals, and more. |
| Company profile | First-indexed date, global footprint with country flags. |
| Social links | Twitter / X, LinkedIn, GitHub, YouTube, Instagram, TikTok, Reddit, Pinterest, Threads, all auto-detected. |
| Live saving | Every record is pushed to your dataset as it lands, so a crash mid-run leaves you with partial results, never an empty dataset. |
| 4 dataset views | Overview, Company, Technologies, Meta. Switch between them in the Console with one click. |
| API & MCP ready | Run synchronously or asynchronously from your own code. |
Key Features
- Async-first: built on Apify SDK 3.x with `async with Actor:`
- 3-tier auto-fallback proxy: `None → Datacenter → Residential (×3, sticky)`
- Full BuiltWith section parsing: Analytics, Tracking, CDN, JavaScript, Ad networks, Frameworks, Servers, Mobile, Audio/Video, Aggregation, Verified, Copyright, Document Standards, Registrar, Web Master Registration, and every other section
- Company-level enrichment: company name, first-indexed date, global location footprint
- Auto-extracted social profiles: only the ones that match the target domain
- Real-time dataset push: results stream into the Output tab as they are scraped
- Configurable politeness: request delay, retry count, proxy preference
- Readable live logs: counts, top sections, country flags, success/failure markers
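The fallback ladder listed above can be pictured with a small sketch. This is not the actor's actual implementation: the `ProxyLadder` class, the `attempt` callback, and the single try on the direct and datacenter tiers are assumptions made for illustration. Only the tier order, the three residential retries, and the sticky residential behaviour come from this README.

```python
from typing import Callable, List, Optional, Tuple

# (tier name, tries per tier). One try for the first two tiers is an assumption.
TIERS: List[Tuple[str, int]] = [("none", 1), ("datacenter", 1), ("residential", 3)]


class ProxyLadder:
    """Hypothetical helper that walks the tiers and sticks to residential."""

    def __init__(self) -> None:
        self.start_index = 0  # index into TIERS where the next domain starts

    def fetch(
        self, domain: str, attempt: Callable[[str, str], Optional[str]]
    ) -> Optional[str]:
        """Return the first non-None result, escalating tier by tier."""
        for index in range(self.start_index, len(TIERS)):
            tier, tries = TIERS[index]
            for _ in range(tries):
                result = attempt(domain, tier)
                if result is not None:
                    if tier == "residential":
                        # Sticky: later domains skip the cheaper tiers.
                        self.start_index = index
                    return result
        return None  # every remaining tier was exhausted
```

Here `attempt(domain, tier)` stands in for whatever performs the real request and returns `None` when the request is rejected.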
Input
Example
{
  "urls": [
    { "url": "https://apify.com" },
    { "url": "https://crunchbase.com" }
  ],
  "scrapeTechnologies": true,
  "scrapeCompany": true,
  "scrapeMeta": true,
  "requestDelay": 6.0,
  "maxRetries": 3,
  "proxyConfiguration": { "useApifyProxy": false }
}
Field reference
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| urls | array | yes | (none) | Domains or full URLs. Accepts apify.com, https://apify.com, or full request-list entries. |
| scrapeTechnologies | boolean | no | true | Include the live technology stack. |
| scrapeCompany | boolean | no | true | Include company-level data (name, first indexed, footprint). |
| scrapeMeta | boolean | no | true | Include social links, contacts, rankings. |
| requestDelay | number | no | 6.0 | Seconds between requests. Random jitter is added. |
| maxRetries | integer | no | 3 | Retry count when running on the residential tier. |
| proxyConfiguration | object | no | { useApifyProxy: false } | Override the auto-fallback ladder. |
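When the domain list comes from a file or a CRM export, it can help to build the input payload in code. The sketch below fills in the defaults from the field reference; `build_run_input` is a hypothetical helper, not part of the actor or of any Apify library, and the duplicate and blank-line handling is an assumption about what a caller might want.

```python
from typing import Any, Dict, Iterable, List


def build_run_input(domains: Iterable[str], **overrides: Any) -> Dict[str, Any]:
    """Build an input payload with the documented defaults filled in."""
    urls: List[Dict[str, str]] = []
    seen = set()
    for raw in domains:
        value = raw.strip()
        if not value or value.lower() in seen:
            continue  # skip blank lines and case-insensitive duplicates
        seen.add(value.lower())
        urls.append({"url": value})
    if not urls:
        raise ValueError("urls is required and must not be empty")

    run_input: Dict[str, Any] = {
        "urls": urls,
        "scrapeTechnologies": True,
        "scrapeCompany": True,
        "scrapeMeta": True,
        "requestDelay": 6.0,
        "maxRetries": 3,
        "proxyConfiguration": {"useApifyProxy": False},
    }
    unknown = set(overrides) - set(run_input)
    if unknown:
        raise KeyError(f"unknown input fields: {sorted(unknown)}")
    run_input.update(overrides)
    return run_input
```

For example, `build_run_input(["apify.com", "https://crunchbase.com"], scrapeMeta=False)` yields a payload that can be passed as the run input in any of the API calls shown in this README.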
Output
Each successfully scraped domain becomes one dataset item. Example:
{
  "domain": "apify.com",
  "status": "ok",
  "companyName": "Apify",
  "firstIndexed": "June 2004",
  "domainName": "APIFY.COM",
  "lastDetected": "Wednesday, November 12, 2025",
  "liveTechnologies": 198,
  "technologiesCount": 198,
  "socialLinksCount": 7,
  "globalFootprintCount": 0,
  "scrapedAt": "2026-05-16T10:42:13+00:00",
  "company": {
    "companyName": "Apify",
    "firstIndexed": "June 2004",
    "globalFootprint": [],
    "churnData": [],
    "spendTimeline": [],
    "innovationTimeline": [],
    "longevityData": []
  },
  "technologies": {
    "domainName": "APIFY.COM",
    "lastDetected": "Wednesday, November 12, 2025",
    "liveTechnologies": 198,
    "rank": null,
    "technologies": [
      {
        "section": "Analytics and Tracking",
        "name": "Google Analytics 4",
        "url": "//trends.builtwith.com/analytics/Google-Analytics-4",
        "description": "Google Analytics 4 formerly known as App + Web is a new version of Google Analytics that was released in October 2020.",
        "category": null,
        "icon": "https://x.cdnpi.pe/serve/UPASDUCKmLywn7PY/google.com"
      }
    ]
  },
  "meta": {
    "companyName": "Apify",
    "location": null,
    "telephones": [],
    "postalAddresses": [],
    "contacts": [],
    "socialLinks": [
      { "platform": "twitter.com", "url": "https://twitter.com/apify", "fullText": "twitter.com/apify" }
    ],
    "websiteInfo": {},
    "rankings": {}
  }
}
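A common first step with a record like this is to group the technology names by section. The following is a minimal sketch that relies only on the `technologies.technologies[].section` and `.name` keys shown in the example; `technologies_by_section` is a hypothetical helper name, and the "Unknown" fallback for a missing section is an assumption.

```python
from collections import defaultdict
from typing import Any, Dict, List


def technologies_by_section(item: Dict[str, Any]) -> Dict[str, List[str]]:
    """Map each BuiltWith section to the sorted technology names found in it."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    block = item.get("technologies") or {}  # absent if scrapeTechnologies is off
    for tech in block.get("technologies") or []:
        section = tech.get("section") or "Unknown"
        name = tech.get("name")
        if name:
            grouped[section].append(name)
    return {section: sorted(names) for section, names in grouped.items()}
```

The function tolerates a missing or empty `technologies` block, so it can be applied to every dataset item regardless of which sections were enabled for the run.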
Dataset views
The Output tab in the Console gives you four ready-made views:
| View | What you see |
|---|---|
| Overview | One row per domain: company, counts, status |
| Company Profile | Domain + the full company block |
| Technologies | Domain + the technology stack block |
| Meta & Social | Domain + the meta/social block |
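If you work with exported items outside the Console, an Overview-style table can be approximated locally. The sketch below assumes the Overview view corresponds to the top-level scalar fields of the example record; the exact column set of the real view is not documented here, so `OVERVIEW_FIELDS` and `overview_csv` are illustrative names and choices.

```python
import csv
import io
from typing import Any, Dict, Iterable

# Assumed columns, taken from the top-level keys of the example output.
OVERVIEW_FIELDS = [
    "domain",
    "status",
    "companyName",
    "technologiesCount",
    "socialLinksCount",
    "globalFootprintCount",
]


def overview_csv(items: Iterable[Dict[str, Any]]) -> str:
    """Render one CSV row per dataset item using only the overview fields."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=OVERVIEW_FIELDS, lineterminator="\n")
    writer.writeheader()
    for item in items:
        # Missing fields (for example on blocked rows) become empty cells.
        writer.writerow({field: item.get(field, "") for field in OVERVIEW_FIELDS})
    return buffer.getvalue()
```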
How to Use (Apify Console)
- Log in at https://console.apify.com → Actors.
- Open BuiltWith Domain Technology Scraper.
- Configure the input:
  - Paste your domains (one per line in `urls`)
  - Leave proxy on default (auto-fallback) or pick your own
  - Enable only the sections you actually need, to save credits
- Click Start.
- Watch the live log:
  - Processing domain: …
  - Parsed N technologies across M sections
  - Social links: 7 (github.com, linkedin.com, …)
  - Saved record to dataset
- Open the Output tab and switch to Overview / Technologies / etc.
- Export to JSON / CSV / XLSX / RSS / HTML.
Use via API / MCP
Synchronous run (waits for completion, returns dataset items)
curl -X POST "https://api.apify.com/v2/acts/<ACTOR_ID>/run-sync-get-dataset-items?token=$APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"urls": [{ "url": "https://apify.com" }],"scrapeTechnologies": true,"scrapeCompany": true,"scrapeMeta": true}'
Asynchronous run (returns immediately with run ID)
curl -X POST "https://api.apify.com/v2/acts/<ACTOR_ID>/runs?token=$APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{ "urls": [{ "url": "https://crunchbase.com" }] }'
From Python
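import asyncio

from apify_client import ApifyClientAsync


async def main() -> None:
    client = ApifyClientAsync(token="apify_api_...")
    # call() starts the actor and waits for the run to finish
    run = await client.actor("<ACTOR_ID>").call(
        run_input={"urls": [{"url": "https://apify.com"}]},
    )
    # Await list_items() first, then read .items from the returned page
    page = await client.dataset(run["defaultDatasetId"]).list_items()
    items = page.items
    print(f"Fetched {len(items)} items")


asyncio.run(main())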
Best Use Cases
- Competitive intel: see which tools your competitors use
- Sales prospecting: filter leads by tech stack (uses Stripe? uses Shopify?)
- Market research: map adoption of a SaaS product across an industry
- M&A due diligence: quick technical fingerprint of a target
- Security & compliance: inventory third-party scripts on partner sites
- Investor research: track product evolution via tech changes
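As an illustration of the sales-prospecting case, the sketch below filters dataset items down to the domains whose stack mentions a given technology. It assumes the item shape from the Output example; `uses_technology` and `filter_leads` are hypothetical helper names, and the case-insensitive substring match is a deliberately loose choice that may need tightening for short names.

```python
from typing import Any, Dict, Iterable, List


def uses_technology(item: Dict[str, Any], name: str) -> bool:
    """True if any technology name in the item contains `name` (case-insensitive)."""
    wanted = name.casefold()
    block = item.get("technologies") or {}
    return any(
        wanted in (tech.get("name") or "").casefold()
        for tech in block.get("technologies") or []
    )


def filter_leads(items: Iterable[Dict[str, Any]], name: str) -> List[str]:
    """Return the domains of successfully scraped items that use `name`."""
    return [
        item["domain"]
        for item in items
        if item.get("status") == "ok" and uses_technology(item, name)
    ]
```

Calling `filter_leads(items, "stripe")` on exported items would return the list of domains to hand to a sales workflow.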
Pricing
This actor uses Pay-Per-Event (PPE) with two events:
| Event | Description |
|---|---|
| apify-actor-start | One-time charge when a run starts (synthetic event). |
| apify-default-dataset-item | One charge per scraped domain pushed to the dataset (synthetic event). |
Apart from the start charge, you pay per dataset item. Failed or blocked domains still produce a row (status: blocked-or-unavailable) so that you can re-queue them, and each such row is billed as one item.
Tip: Toggle off `scrapeTechnologies` / `scrapeCompany` / `scrapeMeta` when you only need one slice. The actor will spend less CPU per domain and run faster.
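A rough budget for a run can be worked out from the two events. The sketch below uses the listed "from $4.99 / 1,000 results" figure as the default per-item price; the price of the start event is not stated in this README, so `start_fee` defaults to a placeholder of zero and should be replaced with the real value from the actor's pricing tab. `estimate_cost` is a hypothetical helper.

```python
def estimate_cost(
    domains: int,
    price_per_1000_items: float = 4.99,  # listed "from" price; your plan may differ
    start_fee: float = 0.0,  # placeholder: the start-event price is not documented here
) -> float:
    """Estimate the USD cost of one run that pushes one item per domain."""
    if domains < 0:
        raise ValueError("domains must be non-negative")
    # Blocked rows are billed too, so every input domain counts as one item.
    return round(start_fee + domains * price_per_1000_items / 1000, 4)
```

Because blocked or unavailable rows are billed as items, the estimate counts every input domain rather than only the successful ones.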
Frequently Asked Questions
Does this actor work without a proxy?
Yes. The default starting tier is direct (no proxy), which is the fastest option and free. The actor only escalates to datacenter and then residential if BuiltWith rejects the direct request.
Will I get blocked?
The actor uses an internal bypass that handles BuiltWith's standard gate. If a small fraction of domains fail, they'll be marked blocked-or-unavailable and you can re-run them after a delay.
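Re-running the failed domains can be scripted from the previous run's dataset. This sketch assumes that blocked rows carry the same `domain` field as successful ones, which the README does not state explicitly; `requeue_input` is a hypothetical helper that returns a new input payload, or `None` when nothing needs retrying.

```python
from typing import Any, Dict, Iterable, Optional


def requeue_input(items: Iterable[Dict[str, Any]]) -> Optional[Dict[str, Any]]:
    """Build a follow-up run input from rows marked blocked-or-unavailable."""
    failed = [
        item["domain"]
        for item in items
        if item.get("status") == "blocked-or-unavailable" and item.get("domain")
    ]
    unique = list(dict.fromkeys(failed))  # de-duplicate, keep first-seen order
    if not unique:
        return None
    return {"urls": [{"url": domain} for domain in unique]}
```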
Can I scrape thousands of domains?
Yes, there is no limit on input size. Use a higher requestDelay (6-10 s) for very large jobs to stay polite and avoid escalating to residential too quickly.
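Since requests are spaced by requestDelay, the delay largely determines how long a big job takes. The sketch below shows one plausible way to add jitter and a back-of-the-envelope runtime estimate. The README only says that random jitter is added, so the ±25% uniform spread is an assumption, as are the helper names; the estimate also ignores fetch time and retries.

```python
import random
from typing import Optional


def jittered_delay(
    base_seconds: float = 6.0,
    spread: float = 0.25,  # assumed jitter range; the actor's real range is not documented
    rng: Optional[random.Random] = None,
) -> float:
    """Return the base delay scaled by a uniform factor in [1 - spread, 1 + spread]."""
    if rng is None:
        rng = random.Random()
    return base_seconds * rng.uniform(1 - spread, 1 + spread)


def estimated_runtime_hours(domains: int, base_seconds: float = 6.0) -> float:
    """Lower-bound runtime from delays alone (jitter averages out around the base)."""
    return domains * base_seconds / 3600
```

At the default 6 s delay, the delays alone for 6,000 domains add up to about ten hours, which is worth knowing before choosing a larger delay.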
Why does the output have empty arrays for churnData, spendTimeline, etc.?
Those are placeholders for paid BuiltWith Pro fields that aren't exposed on the public profile pages. They are kept so that the schema stays stable for consumer code.
Can I use my own proxy?
Yes. Set proxyConfiguration.useApifyProxy: true and pick groups in the input. The auto-fallback ladder will start from your chosen tier.
Does this scrape private accounts?
No. All data comes from BuiltWith.com's public profile pages, which only index publicly observable site signals.
Caution / Legal
- Data is collected only from publicly available BuiltWith.com profile pages.
- Honour reasonable rate limits. The default 6 s delay is a good-citizen baseline.
- The end user is responsible for compliance with target-site terms of service, BuiltWith.com terms, and applicable laws (GDPR, CCPA, etc.).
- Do not use the data for spam, harassment, or any activity prohibited by local law.
Support and Feedback
Issues, feature requests, and feedback are very welcome. Please open an issue on the actor's page or contact the maintainer. We ship updates often.