Company Website Research
Pricing
from $0.001 / actor start
Developer
Jian Lee
Maintained by Community
Company Research Actor
Apify Actor for researching a public company website and returning structured website evidence in one JSON result.
This Actor is built for company research, lead enrichment, and downstream automation. It can start from a direct website, a bare domain, or only a company name.
What This Actor Does
- accepts `website_url`, `domain`, or `company_name`
- discovers an official website when only the company name is provided
- prefers Apify's Google Search Results Scraper for company-name discovery and uses the first valid Google organic website result directly
- falls back to the internal heuristic search flow only when the nested Google search actor is unavailable or returns no usable website result
- if discovery is still ambiguous after fallback, returns `candidate_websites` instead of guessing
- crawls a small set of high-value pages such as `homepage`, `about`, `products/services`, and `contact`
- uses a hybrid crawl strategy:
  - `http-first` when HTML is enough
  - `browser-fallback` when the site is JS-heavy or the HTTP probe is not enough
- fails fast on heavy block signals such as CAPTCHA, WAF, or explicit access denial instead of spending time on low-value salvage attempts
- when running on Apify, prepares a standby Apify Proxy profile and can auto-escalate to proxy for suspicious blocked hosts even if `use_proxy` is left off
- extracts:
  - company name
  - resolved website and domain
  - LinkedIn company URL when found
  - cleaned text from kept pages
  - public emails, phones, and an address candidate
  - rule-based summary, products, and market signals
- returns crawl metadata including `strategy`, `mode`, `confidence`, `failure_reason`, timing breakdown, browser engine, and salvage usage
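The hybrid crawl decision described above can be pictured with a small sketch. This is not the Actor's actual logic: the block markers, the 403 check, the `MIN_TEXT_CHARS` threshold, and the `"blocked"` return value are all assumptions made for illustration. Only the `http-first` and `browser-fallback` names come from this README.

```python
import re

# Hypothetical signals and threshold; the Actor's real rules are not documented here.
BLOCK_MARKERS = ("captcha", "access denied")
MIN_TEXT_CHARS = 500


def choose_strategy(status_code: int, html: str) -> str:
    """Pick a crawl strategy from the result of a plain HTTP probe."""
    lowered = html.lower()
    # Heavy block signals: stop early instead of attempting salvage.
    if status_code == 403 or any(marker in lowered for marker in BLOCK_MARKERS):
        return "blocked"

    # Estimate how much readable text the raw HTML carries.
    without_scripts = re.sub(r"(?is)<(script|style)\b.*?</\1>", " ", html)
    visible_text = " ".join(re.sub(r"(?s)<[^>]+>", " ", without_scripts).split())

    # Very little text usually means the page is rendered client-side.
    if len(visible_text) < MIN_TEXT_CHARS:
        return "browser-fallback"
    return "http-first"
```

A nearly empty shell such as `<div id="root"></div>` plus a script tag would fall through to `browser-fallback` under these assumptions, while a server-rendered page with several paragraphs stays on `http-first`.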
Best Fit
Works best for:
- company websites
- manufacturer and industrial sites
- B2B corporate sites
- one-page company sites
- public product/catalog websites with clear navigation
Less reliable for:
- login-only sites
- CAPTCHA or anti-bot protected sites
- sites with very heavy client-side rendering
- sites where key information is hidden behind forms, PDFs, or gated downloads
Input
Resolution order:
1. `website_url`
2. `domain`
3. discovery from `company_name`
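The resolution order can be expressed as a short function. This is a minimal sketch of the documented precedence, not the Actor's source code. The function name `resolve_start_point` and its return shape are hypothetical, and treating `discover_if_missing` as enabled when omitted is an assumption.

```python
from typing import Optional, Tuple


def resolve_start_point(run_input: dict) -> Tuple[str, Optional[str]]:
    """Return (source, value) following the documented resolution order."""
    # 1. A full website URL has the highest priority.
    website_url = (run_input.get("website_url") or "").strip()
    if website_url:
        return "website_url", website_url

    # 2. A bare domain is normalized to https://<domain>/.
    domain = (run_input.get("domain") or "").strip().strip("/")
    if domain:
        return "domain", f"https://{domain}/"

    # 3. Otherwise fall back to discovery from the company name.
    # Assumption: discovery is on unless discover_if_missing is explicitly false.
    company_name = (run_input.get("company_name") or "").strip()
    if company_name and run_input.get("discover_if_missing", True):
        return "discovery", company_name

    return "none", None
```

With this sketch, an input that sets both `website_url` and `domain` resolves to the URL, and `{"domain": "pny.com"}` resolves to `https://pny.com/`.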
Main input fields:
- `company_name`: company name for website discovery or as a hint for extraction
- `website_url`: full website URL, highest priority input
- `domain`: bare domain, normalized to `https://<domain>/`
- `social_link`: known company social URL, usually LinkedIn
- `country`: optional discovery hint, available as a dropdown in the Apify input UI
- `mode`: `fast` or `deep`
- `anti_block_mode`: browser hardening level, `off`, `basic`, or `aggressive`
- `use_proxy`: force Apify Proxy from the start for HTTP and browser crawling
- `proxy_groups`: optional Apify Proxy groups such as `RESIDENTIAL`
- `salvage_if_blocked`: try likely subpages if the homepage is blocked or unavailable, except for clearly heavy-blocked sites, which fail fast
- `max_pages`: max number of kept pages in output
- `max_text_chars`: max total extracted text characters across kept pages
- `discover_if_missing`: whether to discover a website when only the company name is given
- `extract_contacts`: whether to extract emails, phones, and address
- `follow_subpages`: whether to crawl internal pages beyond the first page
- `include_path_hints`: preferred path fragments used to prioritize internal links
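Before starting a bulk run, it can help to check inputs locally. The sketch below validates only what the field list states: the allowed values for `mode` and `anti_block_mode`, and the need for at least one starting field. The helper `validate_run_input` is hypothetical, and the positive-integer check on the two limits is an assumption rather than a documented rule.

```python
ALLOWED_MODES = {"fast", "deep"}
ALLOWED_ANTI_BLOCK_MODES = {"off", "basic", "aggressive"}
START_FIELDS = ("website_url", "domain", "company_name")


def validate_run_input(run_input: dict) -> list:
    """Return a list of problems; an empty list means the input looks usable."""
    problems = []

    # The Actor needs a website, a domain, or a company name to start from.
    if not any(run_input.get(field) for field in START_FIELDS):
        problems.append("provide at least one of website_url, domain, or company_name")

    mode = run_input.get("mode")
    if mode is not None and mode not in ALLOWED_MODES:
        problems.append(f"mode must be one of {sorted(ALLOWED_MODES)}, got {mode!r}")

    anti_block = run_input.get("anti_block_mode")
    if anti_block is not None and anti_block not in ALLOWED_ANTI_BLOCK_MODES:
        problems.append(
            f"anti_block_mode must be one of {sorted(ALLOWED_ANTI_BLOCK_MODES)}, got {anti_block!r}"
        )

    # Assumption: both limits are meant to be positive integers.
    for field in ("max_pages", "max_text_chars"):
        value = run_input.get(field)
        if value is not None and (isinstance(value, bool) or not isinstance(value, int) or value <= 0):
            problems.append(f"{field} must be a positive integer")

    return problems
```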
Mode
fast
- lower latency
- stops earlier once enough useful content is found
- good for lead enrichment and bulk runs
deep
- broader page coverage
- better for contacts, products, and company profile quality
- slower than `fast`
Anti-Block Mode
off
- no browser hardening beyond the default crawler setup
basic
- adds browser environment hardening and lightweight blocker dismissal
- recommended default for most runs
aggressive
- adds stronger popup/overlay removal and lightweight resource blocking
- useful for difficult websites, but slightly riskier on fragile sites
Example Inputs
Direct website:
    {
        "website_url": "https://vnsteel.vn/",
        "mode": "fast",
        "max_pages": 3,
        "max_text_chars": 7000,
        "extract_contacts": true,
        "follow_subpages": true
    }
Bare domain:
    {
        "domain": "pny.com",
        "mode": "deep",
        "max_pages": 3,
        "max_text_chars": 8000,
        "extract_contacts": true,
        "follow_subpages": true
    }
Company name only:
    {
        "company_name": "VNSTEEL",
        "country": "Vietnam",
        "mode": "deep",
        "max_pages": 3,
        "max_text_chars": 7000,
        "discover_if_missing": true,
        "extract_contacts": true,
        "follow_subpages": true
    }
Company name discovery notes:
- when only `company_name` is provided, this Actor first tries to call `apify/google-search-scraper`
- if Google returns a usable organic website result, the Actor uses that website directly for crawling
- the nested search run is executed under the current runner account, so the runner pays for that search usage
- if the nested search run is unavailable or returns no usable website result, the Actor falls back to its internal discovery heuristic
- if discovery is ambiguous, the Actor returns `candidate_websites` and stops instead of crawling the wrong website
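The discovery notes above describe a three-step flow, which can be sketched as follows. The two search functions are injected as parameters because their real implementations are not documented here. The names `discover_website`, `google_search`, and `heuristic_search` are hypothetical, and treating "exactly one heuristic candidate" as unambiguous is a simplifying assumption.

```python
from typing import Callable, List, Optional


def discover_website(
    company_name: str,
    google_search: Callable[[str], Optional[List[str]]],
    heuristic_search: Callable[[str], List[str]],
) -> dict:
    """Resolve a website for a company name, or return candidates if unsure."""
    # Step 1: try the nested Google search. None or an exception models
    # "search actor unavailable"; an empty list models "no usable result".
    try:
        organic = google_search(company_name)
    except Exception:
        organic = None
    if organic:
        # Use the first valid organic website result directly.
        return {"resolved_website_url": organic[0], "candidate_websites": []}

    # Step 2: fall back to the internal heuristic.
    candidates = heuristic_search(company_name)
    if len(candidates) == 1:
        return {"resolved_website_url": candidates[0], "candidate_websites": []}

    # Step 3: still ambiguous, so return candidates instead of guessing.
    return {"resolved_website_url": None, "candidate_websites": candidates}
```

Injecting the search functions keeps the flow easy to test with stubs and makes the fallback order explicit.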
Custom path hints:
    {
        "website_url": "https://eup.vn/",
        "mode": "deep",
        "max_pages": 4,
        "max_text_chars": 10000,
        "extract_contacts": true,
        "follow_subpages": true,
        "include_path_hints": [
            "about",
            "products",
            "services",
            "contact",
            "gioi-thieu",
            "linh-vuc",
            "lien-he"
        ]
    }
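Any of the inputs above can also be submitted programmatically. The sketch below assumes the `apify-client` Python package and its `ApifyClient.actor(...).call(run_input=...)` and `dataset(...).list_items()` methods. The Actor ID is not given in this README, so `<username>/<actor-name>` is a placeholder, and the helper names `make_client` and `run_company_research` are hypothetical.

```python
def make_client(token: str):
    """Create an Apify API client (requires: pip install apify-client)."""
    from apify_client import ApifyClient  # imported lazily so the sketch loads without it

    return ApifyClient(token)


def run_company_research(client, actor_id: str, run_input: dict) -> dict:
    """Start the Actor, wait for it to finish, and return its single result object."""
    run = client.actor(actor_id).call(run_input=run_input)
    if not run:
        return {}
    # The Actor writes one result object to the default dataset.
    items = client.dataset(run["defaultDatasetId"]).list_items().items
    return items[0] if items else {}


# Example usage (placeholder Actor ID, token read from the environment):
#
#   import os
#   client = make_client(os.environ["APIFY_TOKEN"])
#   result = run_company_research(
#       client,
#       "<username>/<actor-name>",
#       {"website_url": "https://eup.vn/", "mode": "deep", "max_pages": 4},
#   )
#   print(result.get("resolved_domain"))
```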
Output
The Actor writes one result object to:
- the default dataset
- the `OUTPUT` record in the default key-value store
Output Shape
    {
        "company_name": "PNY Technologies Inc.",
        "resolved_website_url": "https://www.pny.com/",
        "resolved_domain": "pny.com",
        "resolved_social_link": "https://www.linkedin.com/company/pny-technologies/",
        "candidate_websites": [],
        "sources": [
            "https://www.pny.com/",
            "https://www.pny.com/professional/support/contact-us"
        ],
        "pages": [
            {
                "url": "https://www.pny.com/",
                "title": "PNY | NVIDIA Graphics, Storage, Networking & Memory Solutions",
                "page_type": "homepage",
                "text": "PNY delivers solutions in over 50 countries...",
                "text_chars": 3200
            }
        ],
        "contacts": {
            "emails": ["gopny@pny.com", "tsupport@pny.com"],
            "phones": ["19735159700"],
            "address": "100 Jefferson Road, Parsippany, New Jersey 07054 US"
        },
        "signals": {
            "about_summary": "PNY delivers solutions in over 50 countries...",
            "products": ["GeForce graphics cards", "Solid state drives", "PC memory"],
            "markets": ["Global"]
        },
        "metadata": {
            "discovery_used": false,
            "strategy": "http-first",
            "mode": "deep",
            "anti_block_mode": "basic",
            "browser_used": false,
            "browser_engine": null,
            "salvage_used": false,
            "pages_crawled": 3,
            "failure_reason": null,
            "confidence": {
                "website": 0.99,
                "contacts": 0.99,
                "summary": 0.85,
                "products": 0.63,
                "overall": 0.92
            },
            "timings": {
                "total_ms": 5472,
                "discovery_ms": 0,
                "crawl_ms": 5472,
                "http_probe_ms": 5472,
                "browser_crawl_ms": 0
            },
            "duration_ms": 5472
        }
    }
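Downstream automation usually needs to decide what to do with each result. The sketch below triages a result object using only fields shown in the output shape. The function name `triage_result`, the status labels, and the `0.7` confidence threshold are hypothetical choices, not values defined by the Actor.

```python
def triage_result(result: dict, min_confidence: float = 0.7) -> str:
    """Classify a result as 'ambiguous', 'failed', 'low_confidence', or 'ok'."""
    metadata = result.get("metadata") or {}

    # Discovery stopped with candidates instead of crawling a guessed website.
    if result.get("candidate_websites") and not result.get("resolved_website_url"):
        return "ambiguous"

    # The crawl reported a failure, for example a heavy block signal.
    if metadata.get("failure_reason"):
        return "failed"

    # Assumption: treat a missing overall confidence as 0.0.
    overall = (metadata.get("confidence") or {}).get("overall", 0.0)
    return "ok" if overall >= min_confidence else "low_confidence"
```

Applied to the PNY example, which has an empty `candidate_websites`, a null `failure_reason`, and an overall confidence of 0.92, this sketch returns `"ok"`.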