🔍 BuiltWith Scraper

Pricing

from $4.99 / 1,000 results

Developer

API Empire

Maintained by Community

🔍 BuiltWith Domain Technology Scraper

Extract the complete technology stack, company profile, and social footprint of any website at scale. The BuiltWith gate is handled internally, and a 3-tier proxy fallback keeps your runs flowing.

Point the actor at one URL or a thousand. Get back a clean JSON record per domain: analytics tools, CDNs, ad networks, JavaScript libraries, hosting, payment processors, name servers, copyright signals, and everything else BuiltWith.com exposes on its public profile pages. ⚡


🚀 Why Choose This Actor?

| ✨ | What you get |
|---|---|
| 🧠 Smart proxy fallback | Starts direct for speed → escalates to datacenter → finally residential with 3 retries. Once it locks onto residential, it stays there. No wasted budget on overkill proxies. |
| 🛡️ Gate bypass built-in | The BuiltWith JS / image-tile captcha is handled internally, so there is no need to paste cookies or solve puzzles. |
| 📦 Bulk input | Feed a single domain or thousands. Same actor, same shape. |
| 🧰 Full tech stack | 100+ categories: Analytics, CDN, Frameworks, JS libs, Ad networks, Payments, Hosting, Email, DNS, Operating systems, Copyright signals, and more. |
| 🏢 Company profile | First-indexed date, global footprint with country flags. |
| 🔗 Social links | Twitter / X, LinkedIn, GitHub, YouTube, Instagram, TikTok, Reddit, Pinterest, Threads, all auto-detected. |
| 💾 Live saving | Every record is pushed to your dataset as it lands, so a crash mid-run leaves you with partial results, never an empty dataset. |
| 📊 4 dataset views | Overview, Company, Technologies, Meta. Switch in the Console with one click. |
| 🤖 API & MCP ready | Run synchronously or asynchronously from your own code. |
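
The fallback ladder from the table can be sketched roughly as follows. This is an illustration only, not the actor's actual code: `ProxyLadder`, `TIERS`, and the `do_request` callback are hypothetical names, and giving each lower tier exactly one attempt is an assumption.

```python
from typing import Callable, Optional

# Tier order taken from the description above; the labels are illustrative.
TIERS = ["direct", "datacenter", "residential"]


class ProxyLadder:
    """Tracks the current proxy tier and escalates on failure (hypothetical helper)."""

    def __init__(self, residential_retries: int = 3) -> None:
        self.tier_index = 0  # start direct for speed
        self.residential_retries = residential_retries

    def fetch(self, url: str, do_request: Callable[[str, str], Optional[str]]) -> Optional[str]:
        """Try the current tier, escalating until a request succeeds or tiers run out.

        `do_request(url, tier)` is caller-supplied and returns the page body,
        or None when the request was rejected.
        """
        while True:
            tier = TIERS[self.tier_index]
            # Assumption: one attempt on lower tiers, several on residential.
            attempts = self.residential_retries if tier == "residential" else 1
            for _ in range(attempts):
                body = do_request(url, tier)
                if body is not None:
                    return body
            if self.tier_index == len(TIERS) - 1:
                return None  # residential exhausted: give up on this URL
            self.tier_index += 1  # escalate; the ladder never steps back down
```

Because `tier_index` lives on the instance and only ever increases, later URLs start from whichever tier last worked, which is one way to get the "sticky" behaviour described above.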

🎯 Key Features

  • ⚡ Async-first: built on Apify SDK 3.x, using the async with Actor: context manager
  • 🛡️ 3-tier auto-fallback proxy: None → Datacenter → Residential (×3, sticky)
  • 🧰 Full BuiltWith section parsing: Analytics, Tracking, CDN, JavaScript, Ad networks, Frameworks, Servers, Mobile, Audio/Video, Aggregation, Verified, Copyright, Document Standards, Registrar, Web Master Registration, and every other section
  • 🏢 Company-level enrichment: company name, first-indexed date, global location footprint
  • 🔗 Auto-extracted social profiles: only ones that match the target domain
  • 📊 Real-time dataset push: see results streaming into the Output tab as they happen
  • 🐢 Configurable politeness: request delay, retry count, proxy preference
  • 📝 Engaging live logs: counts, top sections, country flags, success/failure markers

📥 Input

Example

{
  "urls": [
    { "url": "https://apify.com" },
    { "url": "https://crunchbase.com" }
  ],
  "scrapeTechnologies": true,
  "scrapeCompany": true,
  "scrapeMeta": true,
  "requestDelay": 6.0,
  "maxRetries": 3,
  "proxyConfiguration": { "useApifyProxy": false }
}

Field reference

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| 🌐 urls | array | yes | - | Domains or full URLs. Accepts apify.com, https://apify.com, or full request-list entries. |
| 🧰 scrapeTechnologies | boolean | no | true | Include the live technology stack. |
| 🏢 scrapeCompany | boolean | no | true | Include company-level data (name, first indexed, footprint). |
| 📨 scrapeMeta | boolean | no | true | Include social links, contacts, rankings. |
| ⏱️ requestDelay | number | no | 6.0 | Seconds between requests. Random jitter is added. |
| 🔁 maxRetries | integer | no | 3 | Retry count when running on the residential tier. |
| 🛡️ proxyConfiguration | object | no | { useApifyProxy: false } | Override the auto-fallback ladder. |
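
If you assemble the input in code, a small helper can give every entry the same shape. The sketch below is only one possible approach: `to_request_entry` and `build_input` are hypothetical names, and defaulting bare domains to https is an assumption (the actor accepts bare domains as-is, so this normalization is optional).

```python
from typing import Any, Dict, Iterable, Union

UrlLike = Union[str, Dict[str, Any]]


def to_request_entry(value: UrlLike) -> Dict[str, Any]:
    """Turn a bare domain, a full URL, or a request-list entry into {"url": ...}."""
    if isinstance(value, dict):
        return value  # already a request-list entry
    text = value.strip()
    if "://" not in text:
        text = "https://" + text  # assumption: default to https for bare domains
    return {"url": text}


def build_input(urls: Iterable[UrlLike], request_delay: float = 6.0,
                max_retries: int = 3) -> Dict[str, Any]:
    """Assemble an input object using the field names and defaults from the table."""
    return {
        "urls": [to_request_entry(u) for u in urls],
        "scrapeTechnologies": True,
        "scrapeCompany": True,
        "scrapeMeta": True,
        "requestDelay": request_delay,
        "maxRetries": max_retries,
        "proxyConfiguration": {"useApifyProxy": False},
    }
```

For example, `build_input(["apify.com", "https://crunchbase.com"])` yields an object equivalent to the input example above.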

📤 Output

Each successfully scraped domain becomes one dataset item. Example:

{
  "domain": "apify.com",
  "status": "ok",
  "companyName": "Apify",
  "firstIndexed": "June 2004",
  "domainName": "APIFY.COM",
  "lastDetected": "Wednesday, November 12, 2025",
  "liveTechnologies": 198,
  "technologiesCount": 198,
  "socialLinksCount": 7,
  "globalFootprintCount": 0,
  "scrapedAt": "2026-05-16T10:42:13+00:00",
  "company": {
    "companyName": "Apify",
    "firstIndexed": "June 2004",
    "globalFootprint": [],
    "churnData": [],
    "spendTimeline": [],
    "innovationTimeline": [],
    "longevityData": []
  },
  "technologies": {
    "domainName": "APIFY.COM",
    "lastDetected": "Wednesday, November 12, 2025",
    "liveTechnologies": 198,
    "rank": null,
    "technologies": [
      {
        "section": "Analytics and Tracking",
        "name": "Google Analytics 4",
        "url": "//trends.builtwith.com/analytics/Google-Analytics-4",
        "description": "Google Analytics 4 formerly known as App + Web is a new version of Google Analytics that was released in October 2020.",
        "category": null,
        "icon": "https://x.cdnpi.pe/serve/UPASDUCKmLywn7PY/google.com"
      }
    ]
  },
  "meta": {
    "companyName": "Apify",
    "location": null,
    "telephones": [],
    "postalAddresses": [],
    "contacts": [],
    "socialLinks": [
      { "platform": "twitter.com", "url": "https://twitter.com/apify", "fullText": "twitter.com/apify" }
    ],
    "websiteInfo": {},
    "rankings": {}
  }
}
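
Since the technology list is nested under `technologies.technologies`, a common first step is to group names by section. The helper below is a minimal sketch based only on the example record above; `technologies_by_section` is a hypothetical name, and the guards for missing blocks assume that a disabled section may come back empty or null.

```python
from collections import defaultdict
from typing import Any, Dict, List


def technologies_by_section(record: Dict[str, Any]) -> Dict[str, List[str]]:
    """Map each BuiltWith section to the technology names listed under it."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    # `or {}` / `or []` tolerate a missing or null block (assumed possible
    # when scrapeTechnologies is switched off).
    block = record.get("technologies") or {}
    for tech in block.get("technologies") or []:
        grouped[tech.get("section") or "Unknown"].append(tech["name"])
    return dict(grouped)
```

Applied to the example record, this returns a single "Analytics and Tracking" key holding "Google Analytics 4".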

📊 Dataset views

The Output tab in the Console gives you four ready-made views:

| View | What you see |
|---|---|
| 🌐 Overview | One row per domain: company, counts, status |
| 🏢 Company Profile | Domain + the full company block |
| 🧰 Technologies | Domain + the technology stack block |
| 📨 Meta & Social | Domain + the meta/social block |

🚀 How to Use (Apify Console)

  1. Log in at https://console.apify.com → Actors.
  2. Open BuiltWith Domain Technology Scraper.
  3. Configure the input:
    • 🌐 Paste your domains (one per line in urls)
    • 🛡️ Leave proxy on default (auto-fallback) or pick your own
    • 🧰 Toggle the sections you actually need to save credits
  4. Click ▶ Start.
  5. Watch the live log:
    • 🌐 Processing domain: …
    • 🧰 Parsed N technologies across M sections
    • 🔗 Social links: 7 (github.com, linkedin.com, …)
    • 💾 Saved record to dataset
  6. Open the Output tab → switch to Overview / Technologies / etc.
  7. Export to JSON / CSV / XLSX / RSS / HTML.

🤖 Use via API / MCP

Synchronous run (waits for completion, returns dataset items)

curl -X POST "https://api.apify.com/v2/acts/<ACTOR_ID>/run-sync-get-dataset-items?token=$APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "urls": [{ "url": "https://apify.com" }],
    "scrapeTechnologies": true,
    "scrapeCompany": true,
    "scrapeMeta": true
  }'

Asynchronous run (returns immediately with run ID)

curl -X POST "https://api.apify.com/v2/acts/<ACTOR_ID>/runs?token=$APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{ "urls": [{ "url": "https://crunchbase.com" }] }'
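
The two curl calls differ only in the final path segment. As a dependency-free sketch, the same requests can be built with Python's standard library; `build_run_request` and `API_BASE` are hypothetical names, and the function only constructs the request without sending it.

```python
import json
import urllib.request
from typing import Any, Dict

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, run_input: Dict[str, Any],
                      sync: bool = True) -> urllib.request.Request:
    """Build (but do not send) the POST request for a sync or async run."""
    # Same endpoints as the curl examples above.
    endpoint = "run-sync-get-dataset-items" if sync else "runs"
    url = f"{API_BASE}/acts/{actor_id}/{endpoint}?token={token}"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To actually send it, pass the result to `urllib.request.urlopen(...)` and decode the JSON response. For a synchronous run the response body is the list of dataset items.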

From Python

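import asyncio
from apify_client import ApifyClientAsync

async def main():
    client = ApifyClientAsync(token="apify_api_...")
    # call() starts the run and waits for it to finish
    run = await client.actor("<ACTOR_ID>").call(run_input={
        "urls": [{"url": "https://apify.com"}],
    })
    # Await list_items() first, then read .items from the result
    items = (await client.dataset(run["defaultDatasetId"]).list_items()).items
    print(items)

asyncio.run(main())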

💎 Best Use Cases

  • 🎯 Competitive intel: see which tools your competitors use
  • 🛒 Sales prospecting: filter leads by tech stack (uses Stripe? Uses Shopify?)
  • 📈 Market research: map adoption of a SaaS across an industry
  • 🔍 M&A due diligence: quick technical fingerprint of a target
  • 🛡️ Security & compliance: inventory third-party scripts on partner sites
  • 📊 Investor research: track product evolution via tech changes
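
The sales-prospecting case, for instance, comes down to filtering dataset items by technology name. A minimal sketch, assuming items shaped like the output example above (`domains_using` is a hypothetical helper, and matching by case-insensitive substring is a design choice, not the actor's behaviour):

```python
from typing import Any, Dict, Iterable, List


def domains_using(items: Iterable[Dict[str, Any]], tech_name: str) -> List[str]:
    """Return domains whose technology names contain `tech_name` (case-insensitive substring)."""
    wanted = tech_name.lower()
    matches: List[str] = []
    for item in items:
        block = item.get("technologies") or {}
        names = [t.get("name", "") for t in block.get("technologies") or []]
        if any(wanted in name.lower() for name in names):
            matches.append(item["domain"])
    return matches
```

Substring matching catches variants such as "Stripe Checkout" when searching for "stripe", at the cost of occasional false positives. Switch to exact comparison if that matters for your list.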

💰 Pricing

This actor uses Pay-Per-Event (PPE) with two events:

| Event | Description |
|---|---|
| 🚀 apify-actor-start | One-time charge when a run starts (synthetic event). |
| 📦 apify-default-dataset-item | One charge per scraped domain pushed to the dataset (synthetic event). |

Every row pushed to the dataset is billed as one item. That includes failed or blocked domains: they still produce a row (status: blocked-or-unavailable) so you can re-queue them, and each such row counts as one billed item.

Tip: Toggle off scrapeTechnologies / scrapeCompany / scrapeMeta when you only need one slice. The actor then spends less CPU per domain and runs faster.
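
For budgeting, the per-item billing can be turned into a rough estimate. This sketch uses the "from $4.99 / 1,000 results" figure from the listing as the default rate. The start charge is not stated here, so `start_fee` is a placeholder you would need to fill in, and `estimate_cost` is a hypothetical helper.

```python
def estimate_cost(num_items: int, price_per_1000: float = 4.99,
                  start_fee: float = 0.0) -> float:
    """Rough cost in USD: one start charge plus one charge per dataset item."""
    # price_per_1000 defaults to the listed "from" price; your actual rate may differ.
    # start_fee is a placeholder because the listing does not state the amount.
    return round(start_fee + num_items * price_per_1000 / 1000, 2)
```

Remember to count blocked rows as items when estimating, since they are billed like successful ones.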


❓ Frequently Asked Questions

Does this actor work without a proxy? Yes. The default starting tier is direct (no proxy), which is the fastest option and incurs no proxy cost. The actor escalates to datacenter and then residential proxies only if BuiltWith rejects the direct request.

Will I get blocked? The actor uses an internal bypass that handles BuiltWith's standard gate. If a small fraction of domains fail, they'll be marked blocked-or-unavailable and you can re-run them after a delay.
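
Re-running the failed domains can be automated by filtering on the status field. A minimal sketch, assuming blocked rows carry the same `domain` field as the successful example (`requeue_input` and `BLOCKED_STATUS` are hypothetical names):

```python
from typing import Any, Dict, Iterable

BLOCKED_STATUS = "blocked-or-unavailable"  # status value quoted in this document


def requeue_input(items: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Build a follow-up input containing only the domains that were blocked."""
    retry = [item["domain"] for item in items if item.get("status") == BLOCKED_STATUS]
    return {"urls": [{"url": f"https://{domain}"} for domain in retry]}
```

The returned object can be passed as the input of a later run. An empty `urls` list means nothing needs retrying.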

Can I scrape thousands of domains? Yes. The input has no stated size limit. For very large jobs, use a requestDelay of 6 to 10 s to stay polite and avoid escalating to residential too quickly.
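
The delay setting largely determines how long a big job takes. The sketch below assumes domains are processed one after another and that jitter is a small uniform addition. Neither detail is confirmed here, and `jittered_delay`, `min_runtime_seconds`, and `max_jitter` are hypothetical names.

```python
import random


def jittered_delay(base_delay: float = 6.0, max_jitter: float = 1.0) -> float:
    """Base delay plus random jitter; max_jitter is an assumed value, not the actor's."""
    return base_delay + random.uniform(0.0, max_jitter)


def min_runtime_seconds(num_domains: int, base_delay: float = 6.0) -> float:
    """Lower bound on run time from delays alone, ignoring fetch time and retries."""
    # Delays fall between consecutive requests, so n domains incur n - 1 delays.
    return max(num_domains - 1, 0) * base_delay
```

Under these assumptions, 1,000 domains at the default 6 s delay spend at least 5,994 seconds (about 100 minutes) just waiting between requests.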

Why does the output have empty arrays for churnData, spendTimeline, etc.? Those are placeholders for paid BuiltWith pro fields that aren't exposed on the public profile pages. They stay in the schema so that consumer code always sees the same shape.

Can I use my own proxy? Yes. Set proxyConfiguration.useApifyProxy: true and pick groups in the input. The auto-fallback ladder will start from your chosen tier.

Does this scrape private accounts? No. All data comes from BuiltWith.com's public profile pages, which only index publicly observable site signals.


  • ✅ Data is collected only from publicly available BuiltWith.com profile pages.
  • ✅ Honour reasonable rate limits. The default 6 s delay is a good-citizen baseline.
  • ✅ The end user is responsible for compliance with target-site terms of service, BuiltWith.com terms, and applicable laws (GDPR, CCPA, etc.).
  • ❌ Do not use the data for spam, harassment, or any activity prohibited by local law.

💬 Support and Feedback

Issues, feature requests, and feedback are very welcome. Please open an issue on the actor's page or contact the maintainer. We ship updates often. 🚀