🔍 BuiltWith Scraper

Developer: API Empire

Last modified: 19 days ago

🔍 BuiltWith Domain Technology Scraper

Extract the complete technology stack, company profile, and social footprint of any website — at scale, with zero captcha-solving headaches and a smart 3-tier proxy fallback that keeps your runs flowing.

Point the actor at one URL or a thousand. Get back a clean JSON record per domain covering analytics tools, CDNs, ad networks, JavaScript libraries, hosting, payment processors, name servers, and copyright signals: everything BuiltWith.com exposes on its public profile pages. ⚡

🚀 Why Choose This Actor?

| ✨ | What you get |
|---|---|
| 🧠 Smart proxy fallback | Starts direct for speed → escalates to datacenter → finally residential with 3 retries. Once it locks onto residential, it stays there. No wasted budget on overkill proxies. |
| 🛡️ Gate bypass built-in | The BuiltWith JS / image-tile captcha is handled internally, so there is no need to paste cookies or solve puzzles. |
| 📦 Bulk input | Feed a single domain or thousands. Same actor, same shape. |
| 🧰 Full tech stack | 100+ categories: Analytics, CDN, Frameworks, JS libs, Ad networks, Payments, Hosting, Email, DNS, Operating systems, Copyright signals, and more. |
| 🏢 Company profile | First-indexed date, global footprint with country flags. |
| 🔗 Social links | Twitter / X, LinkedIn, GitHub, YouTube, Instagram, TikTok, Reddit, Pinterest, Threads (auto-detected). |
| 💾 Live saving | Every record is pushed to your dataset as it lands, so a crash mid-run leaves you with partial results, never an empty dataset. |
| 📊 4 dataset views | Overview, Company, Technologies, Meta. Switch in the Console with one click. |
| 🤖 API & MCP ready | Run synchronously or asynchronously from your own code. |

🎯 Key Features

⚡ Async-first: built on Apify SDK 3.x using the `async with Actor:` context manager
🛡️ 3-tier auto-fallback proxy — None → Datacenter → Residential (×3, sticky)
🧰 Full BuiltWith section parsing — Analytics, Tracking, CDN, JavaScript, Ad networks, Frameworks, Servers, Mobile, Audio/Video, Aggregation, Verified, Copyright, Document Standards, Registrar, Web Master Registration, and every other section
🏢 Company-level enrichment — Company name, first-indexed date, global location footprint
🔗 Auto-extracted social profiles — only ones that match the target domain
📊 Real-time dataset push — see results streaming into the Output tab as they happen
🐢 Configurable politeness — request delay, retry count, proxy preference
📝 Engaging live logs — counts, top sections, country flags, success/failure markers
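
The 3-tier fallback listed above can be modelled roughly as follows. This is only an illustrative sketch, not the actor's actual code: the tier names, the `ProxyLadder` class, and the `fetch_fn` callable are hypothetical, and it assumes the ladder becomes sticky as soon as the residential tier is reached.

```python
from typing import Callable, Optional

# Hypothetical tier names, ordered from cheapest to most expensive.
TIERS = ["none", "datacenter", "residential"]


class ProxyLadder:
    """Illustrative 3-tier fallback: direct -> datacenter -> residential (sticky)."""

    def __init__(self, residential_retries: int = 3) -> None:
        self.residential_retries = residential_retries
        self.start_index = 0  # moves to the residential tier once it is reached

    def get(self, url: str, fetch_fn: Callable[[str, str], Optional[str]]) -> Optional[str]:
        """Try each tier in order; fetch_fn(url, tier) returns HTML, or None if blocked."""
        for index in range(self.start_index, len(TIERS)):
            tier = TIERS[index]
            attempts = 1
            if tier == "residential":
                attempts = self.residential_retries
                # Assumption: sticky from here on, so later domains skip cheaper tiers.
                self.start_index = index
            for _ in range(attempts):
                html = fetch_fn(url, tier)
                if html is not None:
                    return html
        return None  # the caller would mark the domain blocked-or-unavailable
```

A single `ProxyLadder` instance would be shared across all domains in a run, which is what makes the residential tier stick.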

📥 Input

Example

{
  "urls": [
    { "url": "https://apify.com" },
    { "url": "https://crunchbase.com" }
  ],
  "scrapeTechnologies": true,
  "scrapeCompany": true,
  "scrapeMeta": true,
  "requestDelay": 6.0,
  "maxRetries": 3,
  "proxyConfiguration": { "useApifyProxy": false }
}

Field reference

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| 🌐 `urls` | `array` | yes | n/a | Domains or full URLs. Accepts `apify.com`, `https://apify.com`, or full request-list entries. |
| 🧰 `scrapeTechnologies` | `boolean` | no | `true` | Include the live technology stack. |
| 🏢 `scrapeCompany` | `boolean` | no | `true` | Include company-level data (name, first indexed, footprint). |
| 📨 `scrapeMeta` | `boolean` | no | `true` | Include social links, contacts, rankings. |
| ⏱️ `requestDelay` | `number` | no | `6.0` | Seconds between requests. Random jitter is added. |
| 🔁 `maxRetries` | `integer` | no | `3` | Retry count when running on the residential tier. |
| 🛡️ `proxyConfiguration` | `object` | no | `{ useApifyProxy: false }` | Override the auto-fallback ladder. |
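
If you assemble the input in your own code, a small helper can keep the shape consistent. The sketch below is optional client-side convenience, since the actor is described as accepting bare domains itself. `normalize_urls` and `build_input` are hypothetical helper names, and prefixing bare domains with `https://` is an assumption.

```python
from typing import Any, Dict, Iterable, List, Union

UrlEntry = Union[str, Dict[str, Any]]


def normalize_urls(entries: Iterable[UrlEntry]) -> List[Dict[str, Any]]:
    """Turn bare domains, full URLs, or {"url": ...} dicts into request-list entries."""
    normalized: List[Dict[str, Any]] = []
    for entry in entries:
        if isinstance(entry, dict):
            normalized.append(entry)  # already a request-list entry
            continue
        value = entry.strip()
        if not value:
            continue  # skip blank lines
        if "://" not in value:
            value = "https://" + value  # assumption: HTTPS for bare domains
        normalized.append({"url": value})
    return normalized


def build_input(entries: Iterable[UrlEntry], **overrides: Any) -> Dict[str, Any]:
    """Combine normalized URLs with the documented defaults, then apply overrides."""
    run_input: Dict[str, Any] = {
        "urls": normalize_urls(entries),
        "scrapeTechnologies": True,
        "scrapeCompany": True,
        "scrapeMeta": True,
        "requestDelay": 6.0,
        "maxRetries": 3,
        "proxyConfiguration": {"useApifyProxy": False},
    }
    run_input.update(overrides)
    return run_input
```

For example, `build_input(["apify.com", "crunchbase.com"], scrapeMeta=False)` yields an input with two URL entries and the meta section switched off.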

📤 Output

Each processed domain becomes one dataset item; failed or blocked domains get a row with status blocked-or-unavailable. Example of a successful record:

{
  "domain": "apify.com",
  "status": "ok",
  "companyName": "Apify",
  "firstIndexed": "June 2004",
  "domainName": "APIFY.COM",
  "lastDetected": "Wednesday, November 12, 2025",
  "liveTechnologies": 198,
  "technologiesCount": 198,
  "socialLinksCount": 7,
  "globalFootprintCount": 0,
  "scrapedAt": "2026-05-16T10:42:13+00:00",

  "company": {
    "companyName": "Apify",
    "firstIndexed": "June 2004",
    "globalFootprint": [],
    "churnData": [],
    "spendTimeline": [],
    "innovationTimeline": [],
    "longevityData": []
  },

  "technologies": {
    "domainName": "APIFY.COM",
    "lastDetected": "Wednesday, November 12, 2025",
    "liveTechnologies": 198,
    "rank": null,
    "technologies": [
      {
        "section": "Analytics and Tracking",
        "name": "Google Analytics 4",
        "url": "//trends.builtwith.com/analytics/Google-Analytics-4",
        "description": "Google Analytics 4 formerly known as App + Web is a new version of Google Analytics that was released in October 2020.",
        "category": null,
        "icon": "https://x.cdnpi.pe/serve/UPASDUCKmLywn7PY/google.com"
      }
    ]
  },

  "meta": {
    "companyName": "Apify",
    "location": null,
    "telephones": [],
    "postalAddresses": [],
    "contacts": [],
    "socialLinks": [
      { "platform": "twitter.com", "url": "https://twitter.com/apify", "fullText": "twitter.com/apify" }
    ],
    "websiteInfo": {},
    "rankings": {}
  }
}
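
Once items are downloaded, the nested `technologies.technologies` array is usually the part worth summarising. The following sketch assumes records shaped like the example above; the helper names are hypothetical, and the substring match on technology names is a deliberately loose choice.

```python
from collections import Counter
from typing import Any, Dict


def technologies_by_section(item: Dict[str, Any]) -> Counter:
    """Count detected technologies per BuiltWith section for one dataset item."""
    techs = (item.get("technologies") or {}).get("technologies") or []
    return Counter(t.get("section") or "Unknown" for t in techs)


def uses_technology(item: Dict[str, Any], name: str) -> bool:
    """Case-insensitive substring check against technology names."""
    techs = (item.get("technologies") or {}).get("technologies") or []
    needle = name.lower()
    return any(needle in (t.get("name") or "").lower() for t in techs)


# Trimmed-down record in the shape shown above.
sample = {
    "domain": "apify.com",
    "status": "ok",
    "technologies": {
        "technologies": [
            {"section": "Analytics and Tracking", "name": "Google Analytics 4"},
        ]
    },
}

print(technologies_by_section(sample))
print(uses_technology(sample, "google analytics"))
```

Both helpers tolerate a missing or null `technologies` block, which can occur when `scrapeTechnologies` is switched off or a domain failed.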

📊 Dataset views

The Output tab in the Console gives you four ready-made views:

| View | What you see |
|---|---|
| 🌐 Overview | One row per domain: company, counts, status |
| 🏢 Company Profile | Domain + the full company block |
| 🧰 Technologies | Domain + the technology stack block |
| 📨 Meta & Social | Domain + the meta/social block |

🚀 How to Use (Apify Console)

1. Log in at https://console.apify.com → Actors.
2. Open BuiltWith Domain Technology Scraper.
3. Configure the input:
   - 🌐 Paste your domains (one per line in urls)
   - 🛡️ Leave proxy on default (auto-fallback) or pick your own
   - 🧰 Toggle the sections you actually need to save credits
4. Click ▶ Start.
5. Watch the live log:
   - 🌐 Processing domain: …
   - 🧰 Parsed N technologies across M sections
   - 🔗 Social links: 7 (github.com, linkedin.com, …)
   - 💾 Saved record to dataset
6. Open the Output tab → switch to Overview / Technologies / etc.
7. Export to JSON / CSV / XLSX / RSS / HTML.

🤖 Use via API / MCP

Synchronous run (waits for completion, returns dataset items)

curl -X POST "https://api.apify.com/v2/acts/<ACTOR_ID>/run-sync-get-dataset-items?token=$APIFY_TOKEN" \
     -H "Content-Type: application/json" \
     -d '{
       "urls": [{ "url": "https://apify.com" }],
       "scrapeTechnologies": true,
       "scrapeCompany": true,
       "scrapeMeta": true
     }'

Asynchronous run (returns immediately with run ID)

curl -X POST "https://api.apify.com/v2/acts/<ACTOR_ID>/runs?token=$APIFY_TOKEN" \
     -H "Content-Type: application/json" \
     -d '{ "urls": [{ "url": "https://crunchbase.com" }] }'
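
Because the asynchronous call returns before the run finishes, the caller has to poll the run status before reading the dataset. A minimal polling loop might look like the sketch below. `wait_for_run` and `get_status` are hypothetical names; `get_status` stands in for whatever you use to read the run's `status` field (for instance a GET on the Apify API's actor-runs endpoint for that run ID), and the terminal status names are the ones Apify reports for finished runs.

```python
import time
from typing import Callable

# Run statuses after which an Apify run will not change any more.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "ABORTED", "TIMED-OUT"}


def wait_for_run(
    get_status: Callable[[], str],
    poll_interval: float = 5.0,
    timeout: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status until the run reaches a terminal status or the timeout passes."""
    waited = 0.0
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if waited >= timeout:
            raise TimeoutError(f"run still {status} after {waited:.0f}s")
        sleep(poll_interval)
        waited += poll_interval
```

The `sleep` parameter is injectable only so the loop can be exercised in tests without real waiting.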

From Python

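import asyncio
from apify_client import ApifyClientAsync

async def main() -> None:
    client = ApifyClientAsync(token="apify_api_...")
    # call() starts the actor and waits for the run to finish.
    run = await client.actor("<ACTOR_ID>").call(run_input={
        "urls": [{"url": "https://apify.com"}],
    })
    # Await list_items() first, then read .items from the returned page.
    page = await client.dataset(run["defaultDatasetId"]).list_items()
    print(page.items)

asyncio.run(main())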

💎 Best Use Cases

🎯 Competitive intel — see which tools your competitors use
🛒 Sales prospecting — filter leads by tech stack (uses Stripe? Uses Shopify?)
📈 Market research — map adoption of a SaaS across an industry
🔍 M&A due diligence — quick technical fingerprint of a target
🛡️ Security & compliance — inventory third-party scripts on partner sites
📊 Investor research — track product evolution via tech changes

💰 Pricing

This actor uses Pay-Per-Event (PPE) with two events:

| Event | Description |
|---|---|
| 🚀 `apify-actor-start` | One-time charge when a run starts (synthetic event). |
| 📦 `apify-default-dataset-item` | One charge per scraped domain pushed to the dataset (synthetic event). |

You pay once per domain pushed to the dataset. Note that failed / blocked domains still produce a row (status: blocked-or-unavailable) so you can re-queue them, and each such row is billed as one item.
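
Re-queuing the failed rows can be scripted. The sketch below builds a fresh input from the items of a finished run; `build_retry_input` is a hypothetical helper, and it assumes failed rows carry the same `domain` field as successful ones.

```python
from typing import Any, Dict, Iterable, List

FAILED_STATUS = "blocked-or-unavailable"  # status value quoted in the pricing note


def build_retry_input(items: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Collect failed domains from a finished run into a fresh actor input."""
    seen = set()
    urls: List[Dict[str, str]] = []
    for item in items:
        domain = item.get("domain")
        # Skip successful rows, rows without a domain, and duplicates.
        if item.get("status") != FAILED_STATUS or not domain or domain in seen:
            continue
        seen.add(domain)
        urls.append({"url": f"https://{domain}"})
    return {"urls": urls}
```

The returned dictionary can be merged with your usual options and submitted as a new run after a delay.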

Tip: Toggle off scrapeTechnologies / scrapeCompany / scrapeMeta when you only need one slice — the actor will spend less CPU per domain and run faster.

❓ Frequently Asked Questions

Does this actor work without a proxy? Yes. The default starting tier is direct (no proxy), which is the fastest option and incurs no proxy cost. The actor escalates to datacenter and then residential proxies only if BuiltWith rejects the direct request.

Will I get blocked? The actor uses an internal bypass that handles BuiltWith's standard gate. If a small fraction of domains fail, they'll be marked blocked-or-unavailable and you can re-run them after a delay.

Can I scrape thousands of domains? Yes — input is unlimited. Use a higher requestDelay (6–10s) for very large jobs to stay polite and avoid escalating to residential too quickly.
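
For planning large jobs, the request delay gives a simple lower bound on run time, and splitting the list into several runs is one optional way to keep each run manageable. Both helpers below are hypothetical, and the estimate deliberately ignores fetch time, jitter, and retries.

```python
from typing import Iterator, List, Sequence


def min_runtime_seconds(domain_count: int, request_delay: float = 6.0) -> float:
    """Lower bound on run time: one delay between each pair of consecutive requests."""
    return max(domain_count - 1, 0) * request_delay


def batches(domains: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` domains, one slice per run."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(domains), size):
        yield list(domains[start:start + size])


# Minutes spent on delays alone for 1,000 domains at the default 6 s delay.
print(min_runtime_seconds(1000, 6.0) / 60)
```

At the default delay, 1,000 domains spend roughly 100 minutes on delays alone, so actual runs will take longer.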

Why does the output have empty arrays for churnData, spendTimeline, etc.? Those are placeholders for paid BuiltWith pro fields that aren't exposed on the public profile pages. They are kept in the schema for stable consumer code.

Can I use my own proxy? Yes — set proxyConfiguration.useApifyProxy: true and pick groups in the input. The auto-fallback ladder will start from your chosen tier.
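
As an illustration, an input that starts on Apify residential proxies might look like the sketch below. `apifyProxyGroups` is the standard Apify proxy-input field; whether this actor honours a particular group, and which groups your plan includes, are assumptions to verify.

```python
run_input = {
    "urls": [{"url": "https://apify.com"}],
    "proxyConfiguration": {
        "useApifyProxy": True,
        # Assumed group name; use the groups available on your Apify plan.
        "apifyProxyGroups": ["RESIDENTIAL"],
    },
}
```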

Does this scrape private accounts? No. All data comes from BuiltWith.com's public profile pages, which only index publicly observable site signals.

⚠️ Caution / Legal

✅ Data is collected only from publicly available BuiltWith.com profile pages.
✅ Honour reasonable rate limits — the default 6 s delay is a good citizen baseline.
✅ The end user is responsible for compliance with target-site terms of service, BuiltWith.com terms, and applicable laws (GDPR, CCPA, etc.).
❌ Do not use the data for spam, harassment, or any activity prohibited by local law.

💬 Support and Feedback

Issues, feature requests, and feedback are very welcome — please open an issue on the actor's page or contact the maintainer. We ship updates often. 🚀
