Company Info Scraper – Contacts, Tech Stack, Social Profiles
Pricing
$9.99/month + usage
Crawl any company website: extract emails (support/sales/decision-maker), phone numbers, addresses, social profiles, technologies, industry, size, lead score. Smart crawling + JS fallback. $9.99/month. Perfect for B2B leads & competitor intel.
Developer
Scrape Pilot
Maintained by Community
🏢 Company Info Scraper – Extract Business Data, Contacts, Tech Stack, Social Profiles
Crawl any company website – extract company name, description, logo, favicon, email addresses (support/sales/decision‑maker), phone numbers, social media profiles, physical addresses, technologies used, industry, business model, and more.
Includes smart crawling (priority pages first), fallback browser rendering, checkpoint/resume, and lead scoring. Perfect for lead generation, sales intelligence, competitor research, and market analysis.
💡 What is the Company Info Scraper?
The Company Info Scraper is an intelligent Apify actor that analyzes any business website and returns a structured, comprehensive profile of that company. It goes far beyond simple scraping:
- Intelligent crawling – prioritizes contact, about, careers, blog, pricing, and portal pages first.
- Multi‑engine fetching – tries lightweight HTTP (curl_cffi) first, falls back to headless Chromium (Playwright) for JavaScript‑heavy sites.
- Contact extraction – finds email addresses (categorised into support, sales, decision‑maker, general), phone numbers (E.164 format), physical addresses (with pattern matching).
- Social media discovery – detects Facebook, Instagram, LinkedIn, Twitter/X, YouTube, TikTok, Pinterest, GitHub, Threads, Snapchat, Reddit.
- Technology detection – identifies CMS (WordPress, Shopify, Webflow), analytics (GA, GTM), chat widgets (Intercom, Zendesk), payment providers (Stripe, PayPal), and more.
- Business intelligence – infers industry, company size estimate, business model (B2B SaaS, E‑commerce, Agency, etc.), and lead scoring (High/Medium/Low).
- Resume & checkpoint – saves progress after every page; survives interruptions.
- Bulk processing – supply many start URLs, and the actor crawls each independently.
The output includes everything a sales or research team needs to qualify a lead – direct contact URLs, email buckets, phone numbers, social handles, and even a lead quality score.
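The social media discovery step listed above could be approximated as follows. This is a minimal sketch, not the actor's actual code: the `SOCIAL_HOSTS` map, the `find_social_profiles` function, and the regex-based link extraction are all assumptions made for illustration, and only a subset of the listed platforms is covered.

```python
import re
from collections import defaultdict
from urllib.parse import urlparse

# Hypothetical host -> platform map (subset of the platforms named above).
SOCIAL_HOSTS = {
    "facebook.com": "facebook",
    "instagram.com": "instagram",
    "linkedin.com": "linkedin",
    "twitter.com": "twitter",
    "x.com": "twitter",
    "youtube.com": "youtube",
    "tiktok.com": "tiktok",
    "github.com": "github",
}

# Matches absolute http(s) links inside href attributes.
HREF_RE = re.compile(r'href=["\'](https?://[^"\']+)["\']', re.IGNORECASE)


def find_social_profiles(html: str) -> dict[str, list[str]]:
    """Group absolute links in `html` by platform, de-duplicated in page order."""
    profiles: dict[str, list[str]] = defaultdict(list)
    for url in HREF_RE.findall(html):
        host = (urlparse(url).hostname or "").lower().removeprefix("www.")
        platform = SOCIAL_HOSTS.get(host)
        if platform and url not in profiles[platform]:
            profiles[platform].append(url)
    return dict(profiles)
```

The result has the same shape as the `social_profiles` output field: a dictionary mapping each platform to a list of profile URLs.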
🚀 Key Features
| Feature | Description |
|---|---|
| Smart priority crawling | Visits contact, about, careers, privacy, terms, blog, pricing, login, portal, product pages early. |
| Dual fetch engine | Uses curl_cffi with Chrome impersonation + fallback to Playwright (full JS rendering). |
| Contact extraction | Emails (categorised into support, sales, decision-maker, general), phone numbers (E.164), physical addresses. |
| Social media discovery | Detects 13+ social platforms; returns full profile URLs. |
| Technology stack detection | Recognises 30+ technologies (CMS, analytics, chat, payments, hosting). |
| Business insights | Industry (10 categories), company size (from hints or number of employees), business model, lead score. |
| Checkpoint & resume | Saves state after every page; restart without re‑scanning visited URLs. |
| Bulk domains | Process hundreds of websites in one run (each independently crawled). |
| Flat monthly pricing | $9.99/month – unlimited runs, no per‑page fees. |
| Clean JSON output | One comprehensive item per domain + intermediate items for each scanned page (with status field). |
📥 Input Parameters
The actor accepts a JSON object with the following fields:
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| startUrls | array of objects | Yes | – | List of starting URLs (e.g., [{"url": "https://example.com"}]). |
| maxPagesPerDomain | integer | No | 20 | Maximum pages to crawl per domain (prevents runaway crawls). |
| concurrency | integer | No | 20 | Number of concurrent HTTP requests. |
| regionHint | string | No | "US" | Two‑letter country code for phone number parsing (e.g., "US", "BD"). |
| proxyConfiguration | object | No | – | Apify proxy configuration. Residential proxies recommended. |
Example Input

    {
      "startUrls": [
        {"url": "https://stripe.com"},
        {"url": "https://shopify.com"},
        {"url": "https://airbnb.com"}
      ],
      "maxPagesPerDomain": 30,
      "concurrency": 15,
      "regionHint": "US",
      "proxyConfiguration": {
        "useApifyProxy": true,
        "apifyProxyGroups": ["RESIDENTIAL"]
      }
    }

📤 Output Fields
The actor pushes two types of items to the dataset:
- Page‑level items (one per crawled page, status: "scanned") – useful for intermediate debugging.
- One final item per domain (when crawling finishes or maxPagesPerDomain is reached, status: "completed" or "partial").
Below is a sample of the final company item (most relevant for users):

    {
      "domain": "stripe.com",
      "website": "https://stripe.com",
      "company_name": "Stripe",
      "website_title": "Stripe: Financial infrastructure for the internet",
      "website_description": "Stripe powers online and in-person payment processing and financial solutions for businesses of all sizes.",
      "industry": "Fintech",
      "country": "United States",
      "company_size_estimate": "1000+",
      "business_model": "B2B SaaS",
      "founded_year": "2010",
      "logo_url": "https://stripe.com/img/about/logos/logomark.png",
      "favicon_url": "https://stripe.com/favicon.ico",
      "phone_numbers": ["+18889638944"],
      "emails": ["support@stripe.com", "sales@stripe.com", "press@stripe.com"],
      "support_emails": ["support@stripe.com"],
      "sales_emails": ["sales@stripe.com"],
      "decision_maker_emails": [],
      "email_buckets": {
        "support": ["support@stripe.com"],
        "sales": ["sales@stripe.com"],
        "general": [],
        "other": ["press@stripe.com"]
      },
      "addresses": ["3180 18th St, San Francisco, CA 94110"],
      "social_profiles": {
        "twitter": ["https://twitter.com/stripe"],
        "linkedin": ["https://linkedin.com/company/stripe"],
        "github": ["https://github.com/stripe"]
      },
      "contact_url": "https://stripe.com/contact",
      "about_url": "https://stripe.com/about",
      "careers_url": "https://stripe.com/jobs",
      "privacy_url": "https://stripe.com/privacy",
      "terms_url": "https://stripe.com/legal",
      "blog_url": "https://stripe.com/blog",
      "pricing_url": "https://stripe.com/pricing",
      "login_url": "https://dashboard.stripe.com/login",
      "customer_portal_url": "https://dashboard.stripe.com",
      "product_url": "https://stripe.com/products",
      "technologies": ["Cloudflare", "Google Analytics", "React", "Stripe", "PayPal"],
      "has_contact_form": true,
      "has_live_chat": false,
      "newsletter_signup": true,
      "accepts_online_payments": true,
      "pages_scanned": 25,
      "status": "completed",
      "lead_score": 85,
      "lead_quality": "High",
      "scraped_at": "2026-06-01T12:30:00Z"
    }

| Field | Type | Description |
|---|---|---|
| domain | string | Company domain (e.g., stripe.com). |
| company_name | string | Best‑guess company name (from OG tags, title, H1). |
| website_title | string | Page <title> of homepage or first meaningful page. |
| website_description | string | Meta description / OG description. |
| industry | string | Detected industry (Fintech, E‑commerce, SaaS, etc.). |
| country | string | Detected country from address or text. |
| company_size_estimate | string | 1-10, 11-50, ..., 1000+ or Unknown. |
| business_model | string | B2B SaaS, E-commerce, Agency / Services, etc. |
| founded_year | string | From JSON‑LD or text (4‑digit year). |
| logo_url | string | URL of the company logo (if found). |
| favicon_url | string | URL of the favicon. |
| phone_numbers | array | E.164‑formatted phone numbers. |
| emails | array | All found email addresses. |
| support_emails | array | Emails matching support@, help@, etc. |
| sales_emails | array | Emails matching sales@, business@, etc. |
| decision_maker_emails | array | Emails matching ceo@, founder@, director@, etc. |
| email_buckets | object | Categorised emails for easy integration. |
| addresses | array | Physical address strings. |
| social_profiles | object | Dictionary of platform → array of URLs. |
| contact_url | string | First discovered contact page URL. |
| about_url | string | First discovered about page URL. |
| careers_url | string | Careers/jobs page URL. |
| privacy_url | string | Privacy policy URL. |
| terms_url | string | Terms of service URL. |
| blog_url | string | Blog/news page URL. |
| pricing_url | string | Pricing page URL. |
| login_url | string | Login page URL. |
| customer_portal_url | string | Customer portal/dashboard URL. |
| product_url | string | Product/solutions URL. |
| technologies | array | Detected technology stack (CMS, analytics, etc.). |
| has_contact_form | boolean | Whether a contact form was detected. |
| has_live_chat | boolean | Live chat widget detected (Intercom, etc.). |
| newsletter_signup | boolean | Newsletter subscription form detected. |
| accepts_online_payments | boolean | Payment methods (Stripe, PayPal, etc.) detected. |
| pages_scanned | integer | Number of pages crawled for this domain. |
| status | string | completed (reached limit or finished), partial (still links left but stopped). |
| lead_score | integer | Score 0–100 based on available data (emails, phones, social, etc.). |
| lead_quality | string | High (≥70), Medium (35–69), Low (<35). |
| scraped_at | string | ISO timestamp. |
💰 Pricing
| Plan | Price | Description |
|---|---|---|
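| Monthly Subscription | $9.99/month + usage | Unlimited runs with no per‑page fees from the actor. Apify platform usage (for example, metered residential proxy traffic) is billed separately. |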
- You can scrape as many domains as you want, with up to maxPagesPerDomain pages per domain, as many times per month as you like.
- The actor automatically saves checkpoints; if you stop early, it resumes from where it left off.
- No pay‑per‑event – this is a fixed monthly subscription.
🛠 How to Use on Apify
- Create a task with this actor.
- Provide startUrls – one or more company website URLs.
- Adjust maxPagesPerDomain (default 20) – higher values give more thorough results but take longer.
- Set concurrency (default 20) – increase for faster crawling (but may trigger blocking).
- Enable residential proxies – strongly recommended to avoid being blocked.
- Run – the actor will crawl each domain, extract all data, and push results to the Dataset.
- Export – download final company profiles as JSON, CSV, or Excel.
Tip: The actor produces both page‑level items (many) and a final summary item per domain. Filter by status: "completed" or status: "partial" to get only the final company profiles.
Running via API

    curl -X POST "https://api.apify.com/v2/acts/your-username~company-info-scraper/runs" \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer YOUR_API_TOKEN" \
      -d '{
        "startUrls": [{"url": "https://stripe.com"}],
        "maxPagesPerDomain": 15,
        "proxyConfiguration": {"useApifyProxy": true, "apifyProxyGroups": ["RESIDENTIAL"]}
      }'

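The same request can be made from Python with the `apify-client` package. The sketch below is an illustration under assumptions: `build_run_input` and `run_actor` are hypothetical helper names, and the actor ID is the placeholder from the curl example, so it must be replaced with the real one.

```python
def build_run_input(urls, max_pages=20, concurrency=20, region_hint="US"):
    """Assemble the actor input from the documented parameters and defaults."""
    return {
        "startUrls": [{"url": u} for u in urls],
        "maxPagesPerDomain": max_pages,
        "concurrency": concurrency,
        "regionHint": region_hint,
        "proxyConfiguration": {
            "useApifyProxy": True,
            "apifyProxyGroups": ["RESIDENTIAL"],
        },
    }


def run_actor(token, actor_id, urls):
    """Start a run, wait for it to finish, and return all dataset items."""
    # Imported lazily so build_run_input works without the client installed.
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=build_run_input(urls))
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


# Example call (placeholder token and actor ID):
# items = run_actor("YOUR_API_TOKEN", "your-username/company-info-scraper",
#                   ["https://stripe.com"])
```

Unlike the curl call, which only starts the run, `call()` blocks until the run finishes, so the returned list already contains both page-level and final items.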
🎯 Use Cases
| Use Case | How the Company Info Scraper Helps |
|---|---|
| Lead generation | Extract contact emails, phones, social profiles, and decision‑maker emails for outbound campaigns. |
| Sales intelligence | Score leads automatically (lead_quality) and identify technologies used (e.g., Shopify stores → pitch e‑commerce solutions). |
| Competitor research | Gather website tech stack, business model, and industry classification for benchmarking. |
| Market analysis | Batch‑process hundreds of companies in a sector to identify common tools, locations, and sizes. |
| Mergers & acquisitions | Quickly collect company overview, founding year, and contact channels for initial due diligence. |
| Partnership sourcing | Find potential partners by discovering their contact and social pages. |
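As an illustration of the market analysis use case, the final profiles of a batch run can be aggregated to see which tools are most common in a sector. This is a small sketch that assumes the profiles have already been downloaded as a list of dictionaries; `technology_counts` is a hypothetical helper, not part of the actor.

```python
from collections import Counter


def technology_counts(profiles):
    """Count in how many final profiles each technology appears."""
    counts = Counter()
    for profile in profiles:
        # set() guards against a technology being listed twice for one company.
        counts.update(set(profile.get("technologies", [])))
    return counts


profiles = [
    {"domain": "a.example", "technologies": ["Shopify", "Stripe"]},
    {"domain": "b.example", "technologies": ["Stripe"]},
]
print(technology_counts(profiles).most_common(1))
```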
❓ Frequently Asked Questions
1. Do I need a proxy?
Residential proxies are strongly recommended, especially for many domains or for sites using Cloudflare. Datacenter IPs may get blocked quickly.
2. How many pages does it crawl?
maxPagesPerDomain controls the limit (default 20). The actor prioritises important pages (contact, about, careers, etc.), so even a low limit gives good data.
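One way to picture this prioritisation is as a sort over discovered links before the page limit is applied. The sketch below is an assumption about how such ordering could work, not the actor's implementation; the `PRIORITY_KEYWORDS` list and both function names are hypothetical.

```python
from urllib.parse import urlparse

# Assumed priority keywords, based on the page types named in this README.
PRIORITY_KEYWORDS = ["contact", "about", "careers", "jobs", "pricing", "blog"]


def link_priority(url: str) -> int:
    """Lower is better: rank of the first keyword found in the URL path."""
    path = urlparse(url).path.lower()
    for rank, keyword in enumerate(PRIORITY_KEYWORDS):
        if keyword in path:
            return rank
    return len(PRIORITY_KEYWORDS)  # non-priority pages go last


def plan_crawl(links: list[str], max_pages: int = 20) -> list[str]:
    """Return up to `max_pages` unique links, priority pages first."""
    unique = list(dict.fromkeys(links))  # de-duplicate, keep discovery order
    # sorted() is stable, so links with equal priority keep discovery order.
    return sorted(unique, key=link_priority)[:max_pages]
```

With this ordering, a limit as low as 3 would still cover a contact, about, and pricing page if the site links to them.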
3. What happens if the website requires JavaScript?
The actor tries HTTP first (fast). If that fails or returns minimal content, it falls back to a headless Chrome browser (Playwright) to render the page.
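The fallback logic can be sketched independently of the fetch libraries. In the example below the two fetchers are passed in as plain callables, standing in for a curl_cffi request and a Playwright page render; the `MIN_CONTENT_CHARS` threshold and the function name are assumptions, since the README does not say what counts as "minimal content".

```python
from typing import Callable

MIN_CONTENT_CHARS = 500  # assumed threshold for "minimal content"


def fetch_with_fallback(
    url: str,
    http_fetch: Callable[[str], str],
    browser_fetch: Callable[[str], str],
) -> tuple[str, str]:
    """Return (html, engine), trying the cheap fetcher before the browser."""
    try:
        html = http_fetch(url)
        if len(html.strip()) >= MIN_CONTENT_CHARS:
            return html, "http"
    except Exception:
        pass  # any HTTP-level failure triggers the browser fallback
    return browser_fetch(url), "browser"
```

Keeping the fetchers injectable makes the decision logic easy to test without network access or a browser install.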
4. How does it categorise emails?
- support_emails: starts with support@, help@, care@, service@, contact@, hello@.
- sales_emails: starts with sales@, business@, partners@, bd@.
- decision_maker_emails: starts with ceo@, founder@, owner@, president@, director@, vp@, chief@, head@, lead@.
- general: info@, hello@, contact@ (if not already captured).
- other: all remaining.
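The bucketing rules above can be expressed as a lookup on the local part of each address. This sketch assumes the buckets are checked in the order listed, which is one reading of "if not already captured": hello@ and contact@ land in support, leaving info@ as the only prefix that reaches general. The function names are hypothetical.

```python
# Prefix lists copied from the answer above; order defines precedence (assumed).
BUCKETS = (
    ("support", {"support", "help", "care", "service", "contact", "hello"}),
    ("sales", {"sales", "business", "partners", "bd"}),
    ("decision_maker", {"ceo", "founder", "owner", "president", "director",
                        "vp", "chief", "head", "lead"}),
    ("general", {"info", "hello", "contact"}),
)


def bucket_for(email: str) -> str:
    """Return the first bucket whose prefixes contain the local part."""
    local = email.split("@", 1)[0].lower()
    for name, prefixes in BUCKETS:
        if local in prefixes:
            return name
    return "other"


def categorise(emails: list[str]) -> dict[str, list[str]]:
    """Group addresses by bucket, always returning every bucket key."""
    result = {name: [] for name, _ in BUCKETS}
    result["other"] = []
    for email in emails:
        result[bucket_for(email)].append(email)
    return result
```

Matching on the exact local part means an address such as leadership@ falls into other rather than decision_maker.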
5. How is the lead score calculated?
Points are added for:
- Phone numbers (20), emails (20), addresses (10), social profiles (10), contact page (10), company name (5), technologies (5), description (5), logo/favicon (5), live chat (5), newsletter (5).
Total ≥70 = High, 35–69 = Medium, <35 = Low.
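A minimal sketch of this scoring scheme follows. It assumes each signal is all-or-nothing (full points if the field is non-empty, otherwise zero) and maps the signals onto the output field names from the table above; the actor may weight signals more finely, so this may not reproduce its exact scores.

```python
# Points per signal, as listed above; field names taken from the output table.
POINTS = {
    "phone_numbers": 20,
    "emails": 20,
    "addresses": 10,
    "social_profiles": 10,
    "contact_url": 10,
    "company_name": 5,
    "technologies": 5,
    "website_description": 5,
    "has_live_chat": 5,
    "newsletter_signup": 5,
}


def lead_score(item: dict) -> int:
    """Sum the points for every signal present in a final company item."""
    score = sum(pts for field, pts in POINTS.items() if item.get(field))
    # Logo and favicon share a single 5-point signal.
    if item.get("logo_url") or item.get("favicon_url"):
        score += 5
    return score


def lead_quality(score: int) -> str:
    """Map a 0-100 score to the documented High/Medium/Low bands."""
    if score >= 70:
        return "High"
    if score >= 35:
        return "Medium"
    return "Low"
```

Under these assumptions a profile with only a phone number and an email scores 40, which falls in the Medium band.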
6. Can I run this for thousands of domains?
Yes, but be mindful of proxy usage (residential proxies are metered). For very large lists, spread runs over time or use a dedicated proxy pool.
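Spreading a large list over several runs can be as simple as splitting it into batches, each shaped like the startUrls input. The helper below is a hypothetical sketch; the batch size is a choice left to the user and is not prescribed by the actor.

```python
def chunked(urls: list[str], size: int) -> list[list[dict]]:
    """Split URLs into startUrls-shaped batches of at most `size` entries."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [
        [{"url": u} for u in urls[i:i + size]]
        for i in range(0, len(urls), size)
    ]
```

Each batch can then be submitted as the startUrls of a separate run, for example on a schedule.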
7. Does it extract the full website content?
No – it focuses on metadata, contact info, and structured data. It does not archive entire pages.
8. What is the checkpoint feature?
If the actor stops (due to spending limit, timeout, or user interruption), it saves which pages have been visited. When you restart, it resumes without re‑scanning already processed URLs.
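The idea behind the checkpoint can be shown with a visited-URL set that is written to disk after every page. This is a simplified sketch using a local JSON file; the actor presumably uses Apify storage instead, and the `Checkpoint` class is a hypothetical name.

```python
import json
from pathlib import Path


class Checkpoint:
    """Persist the set of visited URLs to a JSON file after every page."""

    def __init__(self, path: str):
        self.path = Path(path)
        self.visited: set[str] = set()
        if self.path.exists():
            # Resume: reload the URLs recorded by a previous run.
            self.visited = set(json.loads(self.path.read_text()))

    def mark_visited(self, url: str) -> None:
        """Record a processed URL and save state immediately."""
        self.visited.add(url)
        self.path.write_text(json.dumps(sorted(self.visited)))

    def pending(self, urls: list[str]) -> list[str]:
        """Return the URLs that still need to be crawled."""
        return [u for u in urls if u not in self.visited]
```

Saving after every page costs one small write per page, but it means an interruption loses at most the page that was in progress.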
9. How do I get only the final company profile (not the page‑level items)?
After the run, filter the dataset by status: "completed" or "partial". Page‑level items have status: "scanned".
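If the dataset has been downloaded as a list of JSON objects, the filter is a one-liner. A small sketch, with `final_profiles` as a hypothetical helper name:

```python
FINAL_STATUSES = {"completed", "partial"}


def final_profiles(items: list[dict]) -> list[dict]:
    """Keep only the per-domain summary items, dropping page-level ones."""
    return [item for item in items if item.get("status") in FINAL_STATUSES]
```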
10. What if the website is not in English?
The actor can crawl sites in any language. Pattern matching (addresses, phone numbers) is language‑agnostic, but keyword detection (for industry, social media) uses a base set of English terms, so those fields may be less reliable on non‑English sites. To support other languages, modify INDUSTRY_HINTS and the related keyword lists in the source code.
🔗 Related Actors
- LinkedIn Company Scraper – Extract data from LinkedIn company pages.
- Instagram Profile Scraper – Retrieve public Instagram profile info.
- Amazon Product Scraper – Extract product details and pricing.
Start extracting complete company intelligence – only $9.99/month. Crawl any business website, get contacts, tech stack, social profiles, and lead scoring.