Company Deep Research Agent
Pricing
from $1,000.00 / 1,000 companies researched
Research any company from a domain. Get website metadata, Wikipedia summary, GitHub repos & stars, SEC EDGAR filings & ticker, academic papers, DNS records, and social media profiles in one JSON report.
Developer: ryan clinton
Generate a comprehensive intelligence report on any company from just a domain name. The Company Deep Research Agent automatically gathers data from 7 different public sources — website metadata, Wikipedia, GitHub, SEC EDGAR filings, OpenAlex academic papers, DNS infrastructure, and social media profiles — and compiles everything into a single structured JSON report.
No API keys required for the core functionality. Just enter a domain like stripe.com and get back a detailed company dossier in seconds.
Why Use Company Deep Research Agent?
Manual company research means visiting a dozen websites, copying data into spreadsheets, and hoping you didn't miss anything. This actor does it all in one run: website analysis, Wikipedia summary, GitHub presence, SEC filings, academic citations, DNS infrastructure, and social media — compiled into a single structured JSON report you can feed into any pipeline.
Features
- Website analysis — extracts title, meta description, Open Graph images, favicon, and discovers social media links directly from the company homepage
- Wikipedia summary — finds the company's Wikipedia page and pulls the summary, description, thumbnail image, and direct URL
- GitHub presence — locates the company's GitHub organization, counts public repos and followers, and lists top repositories ranked by stars
- SEC EDGAR filings — determines whether the company is publicly traded, retrieves the stock ticker and CIK number, and lists recent 10-K, 10-Q, and 8-K filings
- Academic research — searches OpenAlex for scholarly papers mentioning the company, ranked by citation count, with DOI links and publication details
- DNS infrastructure — resolves A, MX, TXT, and NS records to reveal hosting providers, email infrastructure, and domain verification tokens
- Social media detection — checks Twitter/X, LinkedIn, Facebook, Instagram, YouTube, and GitHub for active company profiles using both website-linked URLs and intelligent slug guessing
- Auto-detection — automatically detects the company name from the website title and Open Graph metadata
- Configurable modules — toggle SEC filings, research papers, and GitHub search on or off to speed up runs or focus on what matters
- No API keys needed — all core data sources use free public APIs. An optional GitHub token increases rate limits but is not required
How to Use
- Enter the domain: provide the company's website domain (e.g., `stripe.com`, `tesla.com`). You can include or omit `https://`; the actor strips it automatically.
- Optionally set the company name: if left blank, the actor detects the company name from the website's `<title>` tag or Open Graph metadata. Override this if the auto-detected name is a tagline rather than the company name.
- Toggle data modules: enable or disable SEC filings search, academic research papers, and GitHub analysis depending on your needs. All three are enabled by default.
- Run and download: click "Start" and wait for the run to complete (typically 15–45 seconds). Download the full JSON report from the dataset.
Input Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `domain` | String | Yes | - | Company website domain (e.g., `stripe.com`). The `https://` prefix and trailing slashes are stripped automatically. |
| `companyName` | String | No | Auto-detected | Company name override for search queries. If not provided, detected from `<title>` or `og:title`. |
| `includeFinancials` | Boolean | No | `true` | Search SEC EDGAR for public company filings, ticker, and CIK. |
| `includeResearch` | Boolean | No | `true` | Search OpenAlex for academic papers mentioning the company. |
| `includeGithub` | Boolean | No | `true` | Search GitHub for company organization and repositories. |
| `githubToken` | String | No | - | GitHub personal access token for higher API rate limits (60 → 5,000 requests/hour). |
| `maxResults` | Integer | No | `50` | Maximum number of items to return per data source (1–100). |
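The actor normalizes `domain` on its own, but when assembling inputs from a spreadsheet or CRM export it can help to clean values before sending them. The following is a minimal client-side sketch, not the actor's implementation; `normalize_domain`, `clamp_max_results`, and `build_input` are hypothetical helper names, and only the parameter names come from the table above.

```python
from urllib.parse import urlparse


def normalize_domain(raw: str) -> str:
    """Strip the scheme, path, and trailing slashes from a user-supplied domain."""
    value = raw.strip()
    # urlparse only fills in netloc when a scheme (or a leading //) is present.
    if "://" not in value:
        value = "//" + value
    return urlparse(value).netloc.lower().rstrip("/")


def clamp_max_results(value: int, low: int = 1, high: int = 100) -> int:
    """Keep maxResults inside the documented 1-100 range."""
    return max(low, min(high, value))


def build_input(domain: str, max_results: int = 50, **overrides) -> dict:
    """Assemble an input object using the documented parameter names."""
    payload = {
        "domain": normalize_domain(domain),
        "maxResults": clamp_max_results(max_results),
    }
    payload.update(overrides)  # e.g. includeGithub=False
    return payload
```

For example, `build_input("https://stripe.com/", includeResearch=False)` yields an object that can be passed as the run input.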
Input Examples
Quick company lookup (all modules enabled):

```json
{
  "domain": "stripe.com"
}
```

Named company with GitHub token:

```json
{
  "domain": "openai.com",
  "companyName": "OpenAI",
  "githubToken": "ghp_xxxxxxxxxxxxxxxxxxxx",
  "maxResults": 20
}
```

Fast scan (website + DNS + social media only):

```json
{
  "domain": "acme.com",
  "includeFinancials": false,
  "includeResearch": false,
  "includeGithub": false
}
```
Input Tips
- Provide `companyName` explicitly for companies whose website title is a tagline (e.g., "Build the Future" rather than "Acme Corp"). This improves Wikipedia, SEC, and GitHub search accuracy.
- Use `maxResults: 10` for quick overviews, `maxResults: 50` for comprehensive reports.
- Set `includeFinancials: false` for private companies to skip the SEC EDGAR search and save time.
Output
Each run produces one dataset item with this structure:
```json
{
  "domain": "stripe.com",
  "companyName": "Stripe",
  "researchDate": "2025-03-15",
  "website": {
    "title": "Stripe | Financial Infrastructure for the Internet",
    "description": "Stripe powers online and in-person payment processing...",
    "favicon": "https://stripe.com/favicon.ico",
    "ogImage": "https://stripe.com/img/v3/home/social.png",
    "socialLinks": {
      "twitter": "https://twitter.com/stripe",
      "linkedin": "https://www.linkedin.com/company/stripe",
      "github": "https://github.com/stripe"
    }
  },
  "wikipedia": {
    "found": true,
    "summary": "Stripe, Inc. is an Irish-American multinational financial services...",
    "description": "American-Irish financial services company",
    "thumbnail": "https://upload.wikimedia.org/...",
    "url": "https://en.wikipedia.org/wiki/Stripe,_Inc."
  },
  "github": {
    "found": true,
    "orgProfile": {
      "name": "Stripe",
      "bio": "Financial infrastructure for the internet.",
      "publicRepos": 186,
      "followers": 1523,
      "url": "https://github.com/stripe"
    },
    "topRepositories": [
      {
        "name": "stripe-node",
        "description": "Node.js library for the Stripe API.",
        "stars": 3842,
        "forks": 745,
        "language": "TypeScript",
        "url": "https://github.com/stripe/stripe-node"
      }
    ],
    "totalStars": 28450
  },
  "financials": {
    "isPublicCompany": false,
    "ticker": null,
    "cik": null,
    "recentFilings": []
  },
  "research": {
    "paperCount": 1247,
    "topPapers": [
      {
        "title": "The Rise of Embedded Finance...",
        "doi": "https://doi.org/10.1016/j.jfi.2024.101032",
        "citationCount": 89,
        "publicationDate": "2024-06-15",
        "source": "Journal of Financial Intermediation"
      }
    ]
  },
  "dns": {
    "aRecords": ["185.166.143.32"],
    "mxRecords": ["10 aspmx.l.google.com"],
    "txtRecords": ["v=spf1 include:_spf.google.com ~all"],
    "nameServers": ["ns1.p16.dynect.net"]
  },
  "socialMedia": [
    { "platform": "Twitter/X", "url": "https://twitter.com/stripe", "found": true },
    { "platform": "LinkedIn", "url": "https://www.linkedin.com/company/stripe", "found": true },
    { "platform": "Instagram", "url": "https://www.instagram.com/stripe", "found": false }
  ]
}
```
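Because modules can be disabled or can come back empty, downstream code is safer if it does not assume every section is populated. Here is a small sketch of flattening a report into headline fields; `summarize_report` and its output keys are hypothetical, and only the input field names are taken from the sample report.

```python
def summarize_report(report: dict) -> dict:
    """Flatten a report into a few headline fields, tolerating missing sections."""
    # `or {}` covers both a missing key and an explicit null section.
    github = report.get("github") or {}
    org = github.get("orgProfile") or {}
    financials = report.get("financials") or {}
    research = report.get("research") or {}
    social = report.get("socialMedia") or []
    return {
        "company": report.get("companyName"),
        "public_repos": org.get("publicRepos"),
        "total_stars": github.get("totalStars"),
        "is_public": financials.get("isPublicCompany", False),
        "ticker": financials.get("ticker"),
        "paper_count": research.get("paperCount", 0),
        "confirmed_profiles": [p["platform"] for p in social if p.get("found")],
    }
```

Applied to the sample report, `confirmed_profiles` would contain only the platforms whose `found` flag is true.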
Output Fields
Top-level fields:
| Field | Type | Description |
|---|---|---|
| `domain` | String | The company domain that was researched |
| `companyName` | String | Detected or provided company name |
| `researchDate` | String | ISO date of the research (YYYY-MM-DD) |
website fields:
| Field | Type | Description |
|---|---|---|
| `title` | String | Website `<title>` tag content |
| `description` | String | Meta description or `og:description` |
| `favicon` | String | URL to the website favicon |
| `ogImage` | String | Open Graph image URL |
| `socialLinks` | Object | Social media URLs found in the page HTML (keys: `twitter`, `linkedin`, `facebook`, `instagram`, `youtube`, `github`) |
wikipedia fields:
| Field | Type | Description |
|---|---|---|
| `found` | Boolean | Whether a Wikipedia page was found |
| `summary` | String | Wikipedia article extract |
| `description` | String | Short Wikipedia description |
| `thumbnail` | String | Wikipedia thumbnail image URL |
| `url` | String | Direct URL to the Wikipedia page |
github fields:
| Field | Type | Description |
|---|---|---|
| `found` | Boolean | Whether a GitHub org or repos were found |
| `orgProfile.name` | String | GitHub organization display name |
| `orgProfile.bio` | String | Organization description |
| `orgProfile.publicRepos` | Integer | Number of public repositories |
| `orgProfile.followers` | Integer | Number of GitHub followers |
| `topRepositories[]` | Array | Top repos by stars, each with `name`, `description`, `stars`, `forks`, `language`, `url` |
| `totalStars` | Integer | Sum of stars across returned repos |
financials fields:
| Field | Type | Description |
|---|---|---|
| `isPublicCompany` | Boolean | Whether SEC filings were found |
| `ticker` | String / null | Stock ticker symbol (e.g., "AAPL") |
| `cik` | String / null | SEC Central Index Key |
| `recentFilings[]` | Array | Recent SEC filings, each with `formType`, `filedDate`, `description`, `url` |
research fields:
| Field | Type | Description |
|---|---|---|
| `paperCount` | Integer | Total papers found on OpenAlex |
| `topPapers[]` | Array | Top papers by citations, each with `title`, `doi`, `citationCount`, `publicationDate`, `source` |
dns fields:
| Field | Type | Description |
|---|---|---|
| `aRecords` | String[] | IPv4 addresses |
| `mxRecords` | String[] | Mail exchange records (priority + exchange) |
| `txtRecords` | String[] | TXT records (SPF, DKIM, verification tokens) |
| `nameServers` | String[] | Authoritative name servers |
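The `mxRecords` and `txtRecords` strings are what reveal email infrastructure, so a consumer usually parses them further. The sketch below assumes the "priority exchange" string shape shown in the sample output; the provider mapping is a deliberately tiny, hypothetical lookup and the function names are illustrative, not part of the actor.

```python
from typing import Optional

# Hypothetical, intentionally incomplete mapping of MX host suffixes to providers.
EMAIL_PROVIDERS = {
    "google.com": "Google Workspace",
    "outlook.com": "Microsoft 365",
}


def parse_mx(record: str) -> tuple[int, str]:
    """Split a "priority exchange" string such as "10 aspmx.l.google.com"."""
    priority, exchange = record.split(maxsplit=1)
    return int(priority), exchange.rstrip(".").lower()


def guess_email_provider(mx_records: list[str]) -> Optional[str]:
    """Return the provider for the lowest-priority-number MX host we recognize."""
    for _priority, exchange in sorted(parse_mx(r) for r in mx_records):
        for suffix, provider in EMAIL_PROVIDERS.items():
            if exchange == suffix or exchange.endswith("." + suffix):
                return provider
    return None


def spf_includes(txt_records: list[str]) -> list[str]:
    """Collect the include: targets from any SPF record in the TXT list."""
    includes = []
    for record in txt_records:
        if record.lower().startswith("v=spf1"):
            for token in record.split():
                if token.lower().startswith("include:"):
                    includes.append(token[len("include:"):])
    return includes
```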
socialMedia[] fields:
| Field | Type | Description |
|---|---|---|
| `platform` | String | Platform name (Twitter/X, LinkedIn, Facebook, Instagram, YouTube, GitHub) |
| `url` | String | Profile URL (discovered from website or guessed from company slug) |
| `found` | Boolean | Whether the profile exists and returned HTTP 200 |
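The `url` field is either a link found on the website or a guess built from the company slug, with website links taking priority. One way to picture that selection is sketched below. The URL templates are assumptions: the Twitter, LinkedIn, Instagram, and GitHub patterns mirror the sample output, while the Facebook and YouTube patterns are guesses, and `candidate_profiles` is a hypothetical name.

```python
# Assumed URL templates, keyed like the website.socialLinks object.
PROFILE_TEMPLATES = {
    "twitter": "https://twitter.com/{slug}",
    "linkedin": "https://www.linkedin.com/company/{slug}",
    "facebook": "https://www.facebook.com/{slug}",    # assumption
    "instagram": "https://www.instagram.com/{slug}",
    "youtube": "https://www.youtube.com/@{slug}",     # assumption
    "github": "https://github.com/{slug}",
}


def candidate_profiles(slug: str, website_links: dict) -> list[dict]:
    """Prefer URLs linked from the website; fall back to slug-based guesses."""
    candidates = []
    for platform, template in PROFILE_TEMPLATES.items():
        linked = website_links.get(platform)
        candidates.append({
            "platform": platform,
            "url": linked or template.format(slug=slug),
            "source": "website" if linked else "guess",
        })
    return candidates
```

Each candidate would then be checked with an HTTP request; guessed URLs deserve more skepticism than website-linked ones.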
Use Cases
- Sales & BD professionals preparing company briefs before outbound prospecting — identify tech stack, filings status, and social channels to personalize outreach
- Competitive intelligence analysts pulling together website metadata, GitHub activity, SEC filings, and academic citations into one report
- Venture capital & PE researchers evaluating investment targets — assess public market presence, open-source footprint, and academic research impact
- Journalists & investigators compiling background information — Wikipedia summaries, SEC filings, DNS records, and social presence in seconds
- M&A due diligence teams running preliminary technical and public-records checks on acquisition targets
- Marketing strategists auditing a brand's digital footprint across social platforms, website metadata, and open-source presence
How to Use the API
You can call Company Deep Research Agent programmatically from any language:
Python
```python
import requests
import time

# Start the actor run
run = requests.post(
    "https://api.apify.com/v2/acts/ryanclinton~company-deep-research/runs",
    params={"token": "YOUR_APIFY_TOKEN"},
    json={
        "domain": "stripe.com",
        "includeFinancials": True,
        "includeResearch": True,
        "includeGithub": True,
        "maxResults": 20,
    },
    timeout=30,
).json()

# Wait for completion (stop on any terminal status, including timeouts)
run_id = run["data"]["id"]
while True:
    status = requests.get(
        f"https://api.apify.com/v2/actor-runs/{run_id}",
        params={"token": "YOUR_APIFY_TOKEN"},
        timeout=10,
    ).json()
    if status["data"]["status"] in ("SUCCEEDED", "FAILED", "ABORTED", "TIMED-OUT"):
        break
    time.sleep(5)

# Get results
dataset_id = status["data"]["defaultDatasetId"]
items = requests.get(
    f"https://api.apify.com/v2/datasets/{dataset_id}/items",
    params={"token": "YOUR_APIFY_TOKEN"},
    timeout=30,
).json()

report = items[0]
print(f"Company: {report['companyName']}")
print(f"Wikipedia: {report['wikipedia']['summary'][:100]}...")
print(f"GitHub repos: {report['github']['orgProfile']['publicRepos'] if report['github']['orgProfile'] else 'N/A'}")
print(f"Public company: {report['financials']['isPublicCompany'] if report['financials'] else 'N/A'}")
```
JavaScript
```javascript
const response = await fetch(
  "https://api.apify.com/v2/acts/ryanclinton~company-deep-research/run-sync-get-dataset-items?token=YOUR_APIFY_TOKEN",
  {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      domain: "stripe.com",
      includeFinancials: true,
      includeGithub: true,
      maxResults: 20,
    }),
  }
);

const [report] = await response.json();
console.log(`${report.companyName} - ${report.wikipedia?.summary?.slice(0, 100)}`);
console.log(`GitHub: ${report.github.totalStars} total stars`);
```
cURL
```bash
curl -X POST "https://api.apify.com/v2/acts/ryanclinton~company-deep-research/run-sync-get-dataset-items?token=YOUR_APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"domain": "stripe.com","includeFinancials": true,"includeGithub": true,"maxResults": 20}'
```
How It Works
```text
Input (domain, companyName, module toggles)
                         │
                         ▼
┌──────────────────────────────────────────────────┐
│ Step 1: Website Analysis                         │
│ Fetch HTML → extract <title>, og:*, favicon,     │
│ social links via regex pattern matching          │
│ Auto-detect company name from title              │
└──────────────────────────────────────────────────┘
                         │
                         ▼
┌──────────────────────────────────────────────────┐
│ Step 2: Wikipedia                                │
│ Try direct page summary API first                │
│ Fallback: search API → top result → summary      │
│ Returns: summary, description, thumbnail, URL    │
└──────────────────────────────────────────────────┘
                         │
                         ▼
┌──────────────────────────────────────────────────┐
│ Step 3: GitHub (optional)                        │
│ Try org lookup with 3 name guesses:              │
│   • domain base (e.g., "stripe")                 │
│   • company lowercase ("openai")                 │
│   • company with dashes ("some-company")         │
│ Fallback: repository search API                  │
│ Returns: org profile, top repos by stars         │
└──────────────────────────────────────────────────┘
                         │
                         ▼
┌──────────────────────────────────────────────────┐
│ Step 4: SEC EDGAR (optional)                     │
│ 3 endpoints tried in sequence:                   │
│   • EFTS full-text search (10-K, 10-Q, 8-K)      │
│   • browse-edgar company search (Atom XML)       │
│   • company_tickers.json for ticker/CIK          │
│ Returns: isPublic, ticker, CIK, recent filings   │
└──────────────────────────────────────────────────┘
                         │
                         ▼
┌──────────────────────────────────────────────────┐
│ Step 5: OpenAlex (optional)                      │
│ Search works by company name, sorted by          │
│ citation count descending                        │
│ Returns: total paper count, top papers w/ DOI    │
└──────────────────────────────────────────────────┘
                         │
                         ▼
┌──────────────────────────────────────────────────┐
│ Step 6: DNS                                      │
│ Resolve A, MX, TXT, NS records via Node.js       │
│ dns.promises module                              │
│ Reveals hosting, email, verification tokens      │
└──────────────────────────────────────────────────┘
                         │
                         ▼
┌──────────────────────────────────────────────────┐
│ Step 7: Social Media                             │
│ 6 platforms: Twitter/X, LinkedIn, Facebook,      │
│ Instagram, YouTube, GitHub                       │
│ Priority: website-linked URLs > slug guessing    │
│ HTTP GET check for existence                     │
└──────────────────────────────────────────────────┘
                         │
                         ▼
Single JSON report pushed to dataset
```
Data Sources
| Step | Source | API Used | Auth Required |
|---|---|---|---|
| 1 | Company website | Direct HTTPS fetch + HTML parsing | No |
| 2 | Wikipedia | REST API (/api/rest_v1/page/summary) + search API | No |
| 3 | GitHub | REST API (/orgs/{name}, /orgs/{name}/repos) | Optional token |
| 4 | SEC EDGAR | EFTS search, browse-edgar, company_tickers.json | No |
| 5 | OpenAlex | REST API (/works?search=) | No |
| 6 | DNS | Node.js dns.promises (resolve4, resolveMx, resolveTxt, resolveNs) | No |
| 7 | Social Media | HTTP GET to profile URLs | No |
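To make the table concrete, the sketch below builds request URLs for three of the public endpoints it lists. The paths come from the table and from each service's public API; the English-language Wikipedia host, the exact OpenAlex `sort` and `per-page` values, and the helper names are assumptions about how such calls could be formed, not a description of the actor's internals.

```python
from urllib.parse import quote, urlencode


def wikipedia_summary_url(title: str) -> str:
    """REST summary endpoint for a page title (English Wikipedia assumed)."""
    slug = quote(title.replace(" ", "_"), safe="")
    return "https://en.wikipedia.org/api/rest_v1/page/summary/" + slug


def github_org_url(org: str) -> str:
    """GitHub REST endpoint for an organization profile."""
    return f"https://api.github.com/orgs/{quote(org, safe='')}"


def openalex_search_url(company: str, per_page: int = 50) -> str:
    """OpenAlex works search, most-cited first."""
    query = urlencode({
        "search": company,
        "sort": "cited_by_count:desc",
        "per-page": per_page,
    })
    return f"https://api.openalex.org/works?{query}"
```

None of these three requests needs authentication; a GitHub token, if supplied, would be sent in an `Authorization` header to raise the rate limit.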
Company Name Auto-Detection
When companyName is not provided, the actor extracts it from the website:
- Fetches the homepage HTML
- Checks `og:title` first, falls back to the `<title>` tag
- Splits on separators (`|`, `-`, `–`, `—`, `:`)
- Takes the first part (e.g., "Stripe | Financial Infrastructure" → "Stripe")
- If no title is found, capitalizes the domain name (e.g., "stripe.com" → "Stripe")
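The detection steps above can be sketched in a few lines. This is an approximation under stated assumptions: the actor's real regex and edge-case handling are not shown in the source, the stripping of a leading `www.` is an addition of this sketch, and `detect_company_name` is a hypothetical name.

```python
import re

# Separators listed above: pipe, hyphen, en dash (\u2013), em dash (\u2014), colon.
SEPARATORS = re.compile(r"\s*[|\-\u2013\u2014:]\s*")


def detect_company_name(domain: str, og_title: str = None, title: str = None) -> str:
    """Prefer og:title, then <title>; keep the part before the first separator."""
    raw = (og_title or title or "").strip()
    if raw:
        first = SEPARATORS.split(raw, maxsplit=1)[0].strip()
        if first:
            return first
    # No usable title: capitalize the base of the domain instead.
    base = domain.lower().removeprefix("www.").split(".")[0]
    return base.capitalize()
```

A tagline-first title such as "Build the Future - Acme" would return "Build the Future", which is exactly the case where passing `companyName` explicitly is the better choice.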
GitHub Organization Resolution
The actor tries multiple strategies to find the GitHub org:
- Domain base: `stripe.com` → tries `https://api.github.com/orgs/stripe`
- Company name lowercase: "OpenAI" → tries `openai`
- Company name with dashes: "Some Company" → tries `some-company`
- Search fallback: if no org is found, searches repositories by company name
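The three name guesses can be generated as follows. This is a sketch, not the actor's code: how a multi-word name is lowercased (here, with spaces removed) is an assumption, and `github_org_candidates` is a hypothetical name.

```python
import re


def github_org_candidates(domain: str, company_name: str) -> list[str]:
    """Return org-name guesses in the order they would be tried, without repeats."""
    domain_base = domain.lower().removeprefix("www.").split(".")[0]
    words = re.findall(r"[a-z0-9]+", company_name.lower())
    guesses = [
        domain_base,       # "stripe.com"   -> "stripe"
        "".join(words),    # "OpenAI"       -> "openai" (spaces dropped: assumption)
        "-".join(words),   # "Some Company" -> "some-company"
    ]
    unique = []
    for guess in guesses:
        if guess and guess not in unique:
            unique.append(guess)
    return unique
```

Each candidate would be tried against `/orgs/{name}` in turn, and the repository search only runs if none of them resolves.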
How Much Does It Cost?
| Plan | Monthly Cost | Included Runs | Cost Per Run |
|---|---|---|---|
| Free | $0 | ~200 runs | $0 |
| Personal ($49) | $49/month | ~10,000 runs | ~$0.005 |
The actor uses 256 MB memory and completes in 15–45 seconds. It makes only lightweight API calls with no browser rendering, keeping compute costs minimal.
Tips
- Provide the company name explicitly for companies whose website title is a tagline rather than the company name. This improves Wikipedia, SEC, and GitHub search accuracy.
- Disable unused modules to cut run time in half. If you only need website metadata and social media, turn off SEC, research, and GitHub.
- Use a GitHub token when researching multiple companies in a batch. Without a token, GitHub allows 60 requests/hour. A free personal access token raises this to 5,000/hour.
- Combine with other actors — feed the SEC CIK number into SEC EDGAR Filing Analyzer, or pass the domain into Website Tech Stack Detector for technology fingerprinting.
- Batch process company lists by calling this actor via the Apify API in a loop. Each run is independent, so you can research hundreds of companies in parallel.
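Since each run is independent, a batch can be fanned out with a thread pool. The sketch below keeps the network call out of scope: `run_actor` is a caller-supplied function (for example, a wrapper around the synchronous API endpoint) that takes an input object and returns a report, and `research_batch` is a hypothetical helper name.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def research_batch(
    domains: list[str],
    run_actor: Callable[[dict], dict],
    max_workers: int = 5,
) -> dict:
    """Run one research job per domain in parallel; one failure does not stop the rest."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {d: pool.submit(run_actor, {"domain": d}) for d in domains}
        for domain, future in futures.items():
            try:
                results[domain] = future.result()
            except Exception as exc:  # record the failure and keep going
                results[domain] = {"error": str(exc)}
    return results
```

Keeping `max_workers` modest is a reasonable default when the GitHub module runs without a token, given the 60 requests/hour unauthenticated limit.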
Limitations
- Company name detection depends on website title: sites with tagline-only titles (e.g., "Build the Future") will produce poor search results across Wikipedia, SEC, and GitHub unless you provide `companyName` manually.
- SEC EDGAR is US-only: the financials module only finds companies that file with the US Securities and Exchange Commission. Public companies that do not file with the SEC are not covered.
- GitHub org matching is heuristic — the actor tries 3 name guesses plus a search fallback. Companies with GitHub org names that differ significantly from their company name may not be found.
- Social media detection uses HTTP status — some platforms may return false positives (200 for redirect pages) or false negatives (rate limiting). Website-discovered links are more reliable.
- No WHOIS data — the actor resolves DNS but does not query WHOIS registrars. Use the WHOIS Domain Lookup actor for registration details.
- Wikipedia search may match wrong entity — common company names (e.g., "Apple") may match the Wikipedia article for a different entity. Providing the full company name helps.
- Sequential processing — the 7 steps run sequentially. A single slow API response can extend the total run time.
- OpenAlex results may include false matches — papers mentioning the company name in passing may appear in results alongside genuinely relevant research.
Responsible Use
- All data is from public sources — Wikipedia (Creative Commons), SEC (public domain), OpenAlex (open access), GitHub (public API), DNS (public records).
- Respect GitHub rate limits — use a personal access token when running batch queries to avoid hitting the 60-request/hour unauthenticated limit.
- Comply with SEC EDGAR fair use policy — the actor includes a descriptive User-Agent string. Avoid excessive request volumes.
- Use for legitimate business research — this tool is designed for sales intelligence, competitive analysis, due diligence, and journalism.
FAQ
Is this actor free to use?
Yes. All data sources are free public APIs. The only cost is Apify platform compute, which is covered by the free tier for moderate usage.
Does it work for non-US companies?
Yes. Website analysis, Wikipedia, GitHub, DNS, and social media work globally. The SEC EDGAR module only returns results for companies that file with the US SEC.
How accurate is the company name auto-detection?
The actor extracts the company name from og:title or <title> and splits on common separators. For most corporate websites this works well. For sites with tagline-first titles, provide the company name manually.
Can I increase the GitHub API rate limit?
Yes. Generate a free GitHub personal access token at github.com/settings/tokens and enter it in the githubToken input field.
What happens if a data source is unavailable?
Each module handles errors independently. If Wikipedia is down or the SEC API times out, that section returns null or empty results while the rest completes normally.
How often should I re-run research on the same company?
Monthly for actively changing companies. Quarterly for stable companies.
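The per-module error handling described in the FAQ can be pictured as a small wrapper that substitutes a fallback when one source fails. This is an illustrative sketch only: the actor's actual implementation is not shown in the source, and `run_modules` and its argument shape are hypothetical.

```python
from typing import Any, Callable


def run_modules(modules: dict[str, tuple[Callable[[], Any], Any]]) -> dict:
    """Run each (fetch, fallback) pair; a failing module yields its fallback value."""
    report = {}
    for name, (fetch, fallback) in modules.items():
        try:
            report[name] = fetch()
        except Exception:
            # One unavailable source should not abort the whole report.
            report[name] = fallback
    return report
```

With this shape, a timed-out SEC lookup would leave `financials` at its empty default while the Wikipedia and DNS sections still come back populated.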
Integrations
The Company Deep Research Agent works with the full Apify platform ecosystem:
- Apify API — trigger runs programmatically and retrieve results as JSON for custom company research pipelines.
- Zapier — trigger a company research run when a new lead enters your CRM, then push the report into Google Sheets or Slack.
- Make (Integromat) — build workflows that research companies and route findings into Airtable, HubSpot, or email sequences.
- Google Sheets — export the dataset directly for team collaboration.
- Webhooks — receive the research report as soon as the run completes for real-time integrations.
- Scheduled Runs — monitor companies over time by tracking changes in SEC filings, new GitHub repositories, or updated website metadata.
Related Actors
Build a complete company intelligence pipeline by combining this actor with other tools:
| Actor | What it does | Use with Company Deep Research |
|---|---|---|
| SEC EDGAR Filing Analyzer | Deep SEC filing analysis | Dive deeper using the CIK from this actor's output |
| Website Tech Stack Detector | Technology fingerprinting | Identify the tech stack behind the company's website |
| Website Contact Scraper | Extract contact details | Get emails and phone numbers from company websites |
| GitHub Repository Search | Cross-GitHub repo search | Expand the GitHub analysis beyond the company org |
| OpenAlex Research Search | Full OpenAlex search | Explore academic papers with author and institution filters |
| WHOIS Domain Lookup | Domain registration details | Get registrar, creation date, and expiration for the domain |
| DNS Record Lookup | Detailed DNS queries | Deeper DNS analysis beyond built-in A/MX/TXT/NS |
| Brand Protection Monitor | Brand threat monitoring | Check for typosquatting and impersonation |
| SaaS Competitive Intelligence | SaaS competitor analysis | Compare pricing, features, and positioning |
| SEC Insider Trading | Insider stock transactions | Track insider trades for public companies |