Pricing

from $0.001 / actor start

Company Website Research

Extracts comprehensive data from a company's website.

Rating

4.4

(3)

Developer

Jian Lee

Actor stats

75 days

Issues response

3 months ago

Last modified

Company Research Actor

Apify Actor for researching a public company website and returning structured website evidence in one JSON result.

This Actor is built for company research, lead enrichment, and downstream automation. It can start from a direct website, a bare domain, or only a company name.

What This Actor Does

accepts website_url, domain, or company_name
discovers an official website when only the company name is provided
prefers Apify's Google Search Results Scraper for company-name discovery and uses the first valid Google organic website result directly
falls back to the internal heuristic search flow only when the nested Google search actor is unavailable or returns no usable website result
if discovery is still ambiguous after the fallback, returns candidate_websites instead of guessing
crawls a small set of high-value pages such as homepage, about, products/services, and contact
uses a hybrid crawl strategy:
- http-first when HTML is enough
- browser-fallback when the site is JS-heavy or the HTTP probe is not enough
fails fast on heavy block signals such as CAPTCHA, WAF, or explicit access denial instead of spending time on low-value salvage attempts
when running on Apify, prepares a standby Apify Proxy profile and can automatically escalate to the proxy for hosts that appear to be blocking requests, even if use_proxy is left off
extracts:
- company name
- resolved website and domain
- LinkedIn company URL when found
- cleaned text from kept pages
- public emails, phones, and an address candidate
- rule-based summary, products, and market signals
returns crawl metadata including strategy, mode, confidence, failure_reason, timing breakdown, browser engine, and salvage usage

Best Fit

Works best for:

company websites
manufacturer and industrial sites
B2B corporate sites
one-page company sites
public product/catalog websites with clear navigation

Less reliable for:

login-only sites
CAPTCHA or anti-bot protected sites
sites with very heavy client-side rendering
sites where key information is hidden behind forms, PDFs, or gated downloads

Input

Resolution order:

website_url
domain
discovery from company_name
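
The resolution order above can be sketched as a small helper. This is an illustration only, not the Actor's actual code: the function name resolve_start_point and its return shape are hypothetical, and the domain normalization follows the https://<domain>/ rule stated in the field list below.

```python
from typing import Optional, Tuple


def resolve_start_point(run_input: dict) -> Tuple[str, Optional[str]]:
    """Return (source, url) following the documented resolution order.

    Hypothetical helper: the name and the return shape are assumptions.
    """
    website_url = (run_input.get("website_url") or "").strip()
    if website_url:
        # Highest priority: a full website URL is used as given.
        return "website_url", website_url

    domain = (run_input.get("domain") or "").strip().strip("/")
    if domain:
        # A bare domain is normalized to https://<domain>/.
        return "domain", f"https://{domain}/"

    company_name = (run_input.get("company_name") or "").strip()
    if company_name:
        # No URL is known yet; a website would have to be discovered.
        return "company_name", None

    raise ValueError("Provide website_url, domain, or company_name")
```

For example, an input that sets both website_url and domain resolves to the website_url value, and the domain is ignored.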

Main input fields:

company_name: company name for website discovery or as a hint for extraction
website_url: full website URL, highest priority input
domain: bare domain, normalized to https://<domain>/
social_link: known company social URL, usually LinkedIn
country: optional discovery hint, available as a dropdown in the Apify input UI
mode: fast or deep
anti_block_mode: browser hardening level: off, basic, or aggressive
use_proxy: force Apify Proxy from the start for HTTP and browser crawling
proxy_groups: optional Apify Proxy groups such as RESIDENTIAL
salvage_if_blocked: try likely subpages if the homepage is blocked or unavailable, except for clearly heavily blocked sites, which fail fast
max_pages: max number of kept pages in output
max_text_chars: max total extracted text characters across kept pages
discover_if_missing: whether to discover a website when only the company name is given
extract_contacts: whether to extract emails, phones, and address
follow_subpages: whether to crawl internal pages beyond the first page
include_path_hints: preferred path fragments used to prioritize internal links

Mode

fast

lower latency
stops earlier once enough useful content is found
good for lead enrichment and bulk runs

deep

broader page coverage
better for contacts, products, and company profile quality
slower than fast

Anti-Block Mode

off

no browser hardening beyond the default crawler setup

basic

adds browser environment hardening and lightweight blocker dismissal
recommended default for most runs

aggressive

adds stronger popup/overlay removal and lightweight resource blocking
useful for difficult websites, but slightly riskier on fragile sites
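
When inputs are built programmatically, it may help to check the mode and anti_block_mode values before starting a run. The sketch below is a client-side convenience, not part of the Actor: build_run_input is a hypothetical helper, and defaulting anti_block_mode to basic only reflects the recommendation above.

```python
ALLOWED_MODES = {"fast", "deep"}
ALLOWED_ANTI_BLOCK_MODES = {"off", "basic", "aggressive"}


def build_run_input(target: dict, mode: str, anti_block_mode: str = "basic", **extra) -> dict:
    """Merge a target (website_url, domain, or company_name) with run options.

    Hypothetical helper; only the allowed values come from the documentation.
    """
    if mode not in ALLOWED_MODES:
        raise ValueError(f"mode must be one of {sorted(ALLOWED_MODES)}")
    if anti_block_mode not in ALLOWED_ANTI_BLOCK_MODES:
        raise ValueError(
            f"anti_block_mode must be one of {sorted(ALLOWED_ANTI_BLOCK_MODES)}"
        )
    # Extra keyword arguments (max_pages, extract_contacts, ...) pass through unchanged.
    return {**target, "mode": mode, "anti_block_mode": anti_block_mode, **extra}
```

For instance, build_run_input({"domain": "pny.com"}, mode="deep", max_pages=3) yields an input with anti_block_mode set to basic.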

Example Inputs

Direct website:

{
  "website_url": "https://vnsteel.vn/",
  "mode": "fast",
  "max_pages": 3,
  "max_text_chars": 7000,
  "extract_contacts": true,
  "follow_subpages": true
}

Bare domain:

{
  "domain": "pny.com",
  "mode": "deep",
  "max_pages": 3,
  "max_text_chars": 8000,
  "extract_contacts": true,
  "follow_subpages": true
}

Company name only:

{
  "company_name": "VNSTEEL",
  "country": "Vietnam",
  "mode": "deep",
  "max_pages": 3,
  "max_text_chars": 7000,
  "discover_if_missing": true,
  "extract_contacts": true,
  "follow_subpages": true
}

Company name discovery notes:

when only company_name is provided, this Actor first tries to call apify/google-search-scraper
if Google returns a usable organic website result, the Actor uses that website directly for crawling
the nested search run executes under the account of the user running this Actor, so that user pays for the search usage
if the nested search run is unavailable or returns no usable website result, the Actor falls back to its internal discovery heuristic
if discovery is ambiguous, the Actor returns candidate_websites and stops instead of crawling the wrong website
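
Downstream code may want to separate resolved results from ambiguous ones. The following is a minimal sketch under one assumption that the documentation does not confirm: that an ambiguous result leaves resolved_website_url empty while candidate_websites is populated. The function name and the status labels are hypothetical.

```python
def classify_discovery(result: dict) -> str:
    """Label a result as 'resolved', 'ambiguous', or 'not_found'.

    Assumes (not confirmed by the docs) that an ambiguous result has no
    resolved_website_url and a non-empty candidate_websites list.
    """
    if result.get("resolved_website_url"):
        return "resolved"
    if result.get("candidate_websites"):
        # The Actor stopped instead of guessing; a person or a later step
        # can pick one candidate and rerun with website_url set.
        return "ambiguous"
    return "not_found"
```

An ambiguous result could then be queued for manual review, and the chosen candidate passed back as website_url in a second run.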

Custom path hints:

{
  "website_url": "https://eup.vn/",
  "mode": "deep",
  "max_pages": 4,
  "max_text_chars": 10000,
  "extract_contacts": true,
  "follow_subpages": true,
  "include_path_hints": [
    "about",
    "products",
    "services",
    "contact",
    "gioi-thieu",
    "linh-vuc",
    "lien-he"
  ]
}
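
To illustrate what prioritizing internal links by path fragments could look like, here is a hypothetical sketch. It is not the Actor's implementation: the ranking rule (earlier hints win, unmatched links go last) and the example URLs are assumptions made for illustration.

```python
from urllib.parse import urlparse


def prioritize_links(links, path_hints):
    """Order links so that paths containing an earlier hint come first.

    Hypothetical ranking rule; the Actor's real scoring is not documented.
    """
    hints = [hint.lower() for hint in path_hints]

    def rank(link):
        path = urlparse(link).path.lower()
        for position, hint in enumerate(hints):
            if hint in path:
                return position
        return len(hints)  # links without any hint sort last

    # sorted() is stable, so links with equal rank keep their original order.
    return sorted(links, key=rank)
```

With the hints from the example above, a link whose path contains gioi-thieu would be ranked ahead of one containing lien-he, and both ahead of a link that matches no hint.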

Output

The Actor writes one result object to:

the default dataset
the OUTPUT record in the default key-value store
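
A run can be started and its result read back with the Apify API client for Python. The sketch below assumes the apify-client package and a version that returns run details as a dictionary. The actor ID is a placeholder because this page does not state it, and run_and_fetch is a hypothetical helper name.

```python
def run_and_fetch(client, actor_id: str, run_input: dict) -> dict:
    """Run the Actor and return its single result object.

    `client` is expected to behave like apify_client.ApifyClient, e.g.:
        from apify_client import ApifyClient
        client = ApifyClient("<YOUR_API_TOKEN>")
    `actor_id` is a placeholder; substitute the Actor's real ID.
    """
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("The Actor run did not return run details")

    # Prefer the OUTPUT record in the default key-value store.
    store = client.key_value_store(run["defaultKeyValueStoreId"])
    record = store.get_record("OUTPUT")
    if record is not None:
        return record["value"]

    # Fall back to the first item in the default dataset.
    items = client.dataset(run["defaultDatasetId"]).list_items().items
    if not items:
        raise RuntimeError("The run produced no result object")
    return items[0]
```

Passing the client in as an argument keeps the helper easy to test with a stub and avoids hard-coding an API token.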

Output Shape

{
  "company_name": "PNY Technologies Inc.",
  "resolved_website_url": "https://www.pny.com/",
  "resolved_domain": "pny.com",
  "resolved_social_link": "https://www.linkedin.com/company/pny-technologies/",
  "candidate_websites": [],
  "sources": [
    "https://www.pny.com/",
    "https://www.pny.com/professional/support/contact-us"
  ],
  "pages": [
    {
      "url": "https://www.pny.com/",
      "title": "PNY | NVIDIA Graphics, Storage, Networking & Memory Solutions",
      "page_type": "homepage",
      "text": "PNY delivers solutions in over 50 countries...",
      "text_chars": 3200
    }
  ],
  "contacts": {
    "emails": ["gopny@pny.com", "tsupport@pny.com"],
    "phones": ["19735159700"],
    "address": "100 Jefferson Road, Parsippany, New Jersey 07054 US"
  },
  "signals": {
    "about_summary": "PNY delivers solutions in over 50 countries...",
    "products": ["GeForce graphics cards", "Solid state drives", "PC memory"],
    "markets": ["Global"]
  },
  "metadata": {
    "discovery_used": false,
    "strategy": "http-first",
    "mode": "deep",
    "anti_block_mode": "basic",
    "browser_used": false,
    "browser_engine": null,
    "salvage_used": false,
    "pages_crawled": 3,
    "failure_reason": null,
    "confidence": {
      "website": 0.99,
      "contacts": 0.99,
      "summary": 0.85,
      "products": 0.63,
      "overall": 0.92
    },
    "timings": {
      "total_ms": 5472,
      "discovery_ms": 0,
      "crawl_ms": 5472,
      "http_probe_ms": 5472,
      "browser_crawl_ms": 0
    },
    "duration_ms": 5472
  }
}
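
For lead enrichment, the nested result is often flattened into one row per company. The sketch below reads only fields shown in the output shape above. The helper name to_lead_row, the choice of columns, and the 0.8 confidence threshold are assumptions, not recommendations from the Actor's author.

```python
from typing import Optional


def to_lead_row(result: dict, min_overall: float = 0.8) -> Optional[dict]:
    """Flatten a result into a lead row, or return None if it looks unreliable.

    The threshold is an arbitrary illustrative value.
    """
    metadata = result.get("metadata") or {}
    if metadata.get("failure_reason"):
        return None  # the crawl reported a failure

    confidence = metadata.get("confidence") or {}
    if confidence.get("overall", 0.0) < min_overall:
        return None  # overall confidence is too low to trust the row

    contacts = result.get("contacts") or {}
    signals = result.get("signals") or {}
    return {
        "company_name": result.get("company_name"),
        "website": result.get("resolved_website_url"),
        "linkedin": result.get("resolved_social_link"),
        # Take the first email and phone, or None when the list is empty.
        "primary_email": (contacts.get("emails") or [None])[0],
        "primary_phone": (contacts.get("phones") or [None])[0],
        "products": "; ".join(signals.get("products") or []),
    }
```

Rows that come back as None can be logged together with metadata.failure_reason so that blocked or low-confidence sites are easy to revisit.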
