Company Deep Research Agent

Pricing

from $1,000.00 / 1,000 companies researched

Research any company from a domain. Get website metadata, Wikipedia summary, GitHub repos & stars, SEC EDGAR filings & ticker, academic papers, DNS records, and social media profiles in one JSON report.

Developer

ryan clinton

Maintained by Community

Generate a comprehensive intelligence report on any company from just a domain name. The Company Deep Research Agent automatically gathers data from 7 different public sources — website metadata, Wikipedia, GitHub, SEC EDGAR filings, OpenAlex academic papers, DNS infrastructure, and social media profiles — and compiles everything into a single structured JSON report.

No API keys required for the core functionality. Just enter a domain like stripe.com and get back a detailed company dossier in seconds.

Why Use Company Deep Research Agent?

Manual company research means visiting a dozen websites, copying data into spreadsheets, and hoping you didn't miss anything. This actor does it all in one run: website analysis, Wikipedia summary, GitHub presence, SEC filings, academic citations, DNS infrastructure, and social media — compiled into a single structured JSON report you can feed into any pipeline.

Features

  • Website analysis — extracts title, meta description, Open Graph images, favicon, and discovers social media links directly from the company homepage
  • Wikipedia summary — finds the company's Wikipedia page and pulls the summary, description, thumbnail image, and direct URL
  • GitHub presence — locates the company's GitHub organization, counts public repos and followers, and lists top repositories ranked by stars
  • SEC EDGAR filings — determines whether the company is publicly traded, retrieves the stock ticker and CIK number, and lists recent 10-K, 10-Q, and 8-K filings
  • Academic research — searches OpenAlex for scholarly papers mentioning the company, ranked by citation count, with DOI links and publication details
  • DNS infrastructure — resolves A, MX, TXT, and NS records to reveal hosting providers, email infrastructure, and domain verification tokens
  • Social media detection — checks Twitter/X, LinkedIn, Facebook, Instagram, YouTube, and GitHub for active company profiles using both website-linked URLs and intelligent slug guessing
  • Auto-detection — automatically detects the company name from the website title and Open Graph metadata
  • Configurable modules — toggle SEC filings, research papers, and GitHub search on or off to speed up runs or focus on what matters
  • No API keys needed — all core data sources use free public APIs. An optional GitHub token increases rate limits but is not required

How to Use

  1. Enter the domain — provide the company's website domain (e.g., stripe.com, tesla.com). You can include or omit https:// — the actor strips it automatically.

  2. Optionally set the company name — if left blank, the actor detects the company name from the website's <title> tag or Open Graph metadata. Override this if the auto-detected name is a tagline rather than the company name.

  3. Toggle data modules — enable or disable SEC filings search, academic research papers, and GitHub analysis depending on your needs. All three are enabled by default.

  4. Run and download — click "Start" and wait for the run to complete (typically 15–45 seconds). Download the full JSON report from the dataset.

Input Parameters

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| domain | String | Yes | | Company website domain (e.g., stripe.com). The https:// prefix and trailing slashes are stripped automatically. |
| companyName | String | No | Auto-detected | Company name override for search queries. If not provided, detected from <title> or og:title. |
| includeFinancials | Boolean | No | true | Search SEC EDGAR for public company filings, ticker, and CIK. |
| includeResearch | Boolean | No | true | Search OpenAlex for academic papers mentioning the company. |
| includeGithub | Boolean | No | true | Search GitHub for company organization and repositories. |
| githubToken | String | No | | GitHub personal access token for higher API rate limits (60 → 5,000 requests/hour). |
| maxResults | Integer | No | 50 | Maximum number of items to return per data source (1–100). |
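
The table says the https:// prefix and trailing slashes are stripped and that maxResults is limited to 1–100. If you want to apply the same clean-up before submitting a batch of inputs, a minimal sketch could look like the following. The function names are hypothetical, and this is an assumption about reasonable behaviour, not the actor's actual code.

```python
from urllib.parse import urlsplit


def normalize_domain(raw: str) -> str:
    """Reduce user input such as 'https://Stripe.com/' to a bare domain."""
    value = raw.strip()
    # urlsplit only fills in netloc when a scheme (or a leading //) is present.
    if "://" not in value:
        value = "//" + value
    host = urlsplit(value).netloc
    # Drop credentials and port if present, then normalise case.
    host = host.rsplit("@", 1)[-1].split(":", 1)[0]
    return host.lower().rstrip(".")


def clamp_max_results(value: int) -> int:
    """Keep maxResults inside the documented 1-100 range."""
    return max(1, min(100, value))
```

Normalizing locally keeps the domain field in your own records consistent with what the report returns.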

Input Examples

Quick company lookup — all modules enabled:

{
  "domain": "stripe.com"
}

Named company with GitHub token:

{
  "domain": "openai.com",
  "companyName": "OpenAI",
  "githubToken": "ghp_xxxxxxxxxxxxxxxxxxxx",
  "maxResults": 20
}

Fast scan — website + DNS + social media only:

{
  "domain": "acme.com",
  "includeFinancials": false,
  "includeResearch": false,
  "includeGithub": false
}

Input Tips

  • Provide companyName explicitly for companies whose website title is a tagline (e.g., "Build the Future" rather than "Acme Corp"). This improves Wikipedia, SEC, and GitHub search accuracy.
  • Use maxResults: 10 for quick overviews, maxResults: 50 for comprehensive reports.
  • Set includeFinancials: false for private companies to skip the SEC EDGAR search and save time.

Output

Each run produces one dataset item with this structure:

{
  "domain": "stripe.com",
  "companyName": "Stripe",
  "researchDate": "2025-03-15",
  "website": {
    "title": "Stripe | Financial Infrastructure for the Internet",
    "description": "Stripe powers online and in-person payment processing...",
    "favicon": "https://stripe.com/favicon.ico",
    "ogImage": "https://stripe.com/img/v3/home/social.png",
    "socialLinks": {
      "twitter": "https://twitter.com/stripe",
      "linkedin": "https://www.linkedin.com/company/stripe",
      "github": "https://github.com/stripe"
    }
  },
  "wikipedia": {
    "found": true,
    "summary": "Stripe, Inc. is an Irish-American multinational financial services...",
    "description": "American-Irish financial services company",
    "thumbnail": "https://upload.wikimedia.org/...",
    "url": "https://en.wikipedia.org/wiki/Stripe,_Inc."
  },
  "github": {
    "found": true,
    "orgProfile": {
      "name": "Stripe",
      "bio": "Financial infrastructure for the internet.",
      "publicRepos": 186,
      "followers": 1523,
      "url": "https://github.com/stripe"
    },
    "topRepositories": [
      {
        "name": "stripe-node",
        "description": "Node.js library for the Stripe API.",
        "stars": 3842,
        "forks": 745,
        "language": "TypeScript",
        "url": "https://github.com/stripe/stripe-node"
      }
    ],
    "totalStars": 28450
  },
  "financials": {
    "isPublicCompany": false,
    "ticker": null,
    "cik": null,
    "recentFilings": []
  },
  "research": {
    "paperCount": 1247,
    "topPapers": [
      {
        "title": "The Rise of Embedded Finance...",
        "doi": "https://doi.org/10.1016/j.jfi.2024.101032",
        "citationCount": 89,
        "publicationDate": "2024-06-15",
        "source": "Journal of Financial Intermediation"
      }
    ]
  },
  "dns": {
    "aRecords": ["185.166.143.32"],
    "mxRecords": ["10 aspmx.l.google.com"],
    "txtRecords": ["v=spf1 include:_spf.google.com ~all"],
    "nameServers": ["ns1.p16.dynect.net"]
  },
  "socialMedia": [
    { "platform": "Twitter/X", "url": "https://twitter.com/stripe", "found": true },
    { "platform": "LinkedIn", "url": "https://www.linkedin.com/company/stripe", "found": true },
    { "platform": "Instagram", "url": "https://www.instagram.com/stripe", "found": false }
  ]
}
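
Because individual sections can come back null or empty (for example when a module is disabled or a source is unavailable), downstream code should read the report defensively. The sketch below is one way to do that. The summarize_report helper and its output keys are hypothetical, and the field names are taken from the example item above.

```python
from typing import Any, Optional


def summarize_report(report: dict[str, Any]) -> dict[str, Optional[Any]]:
    """Pull a few headline values, tolerating null or missing sections."""
    wikipedia = report.get("wikipedia") or {}
    github = report.get("github") or {}
    org = github.get("orgProfile") or {}
    financials = report.get("financials") or {}
    social = report.get("socialMedia") or []
    return {
        "company": report.get("companyName"),
        "wikipedia_url": wikipedia.get("url") if wikipedia.get("found") else None,
        "public_repos": org.get("publicRepos"),
        "ticker": financials.get("ticker"),
        # Keep only the profiles the actor marked as found.
        "confirmed_profiles": [p["url"] for p in social if p.get("found")],
    }
```

Using `or {}` rather than a `.get()` default matters here: a key that is present with a null value would otherwise come back as None and break the next lookup.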

Output Fields

Top-level fields:

| Field | Type | Description |
|---|---|---|
| domain | String | The company domain that was researched |
| companyName | String | Detected or provided company name |
| researchDate | String | ISO date of the research (YYYY-MM-DD) |

website fields:

| Field | Type | Description |
|---|---|---|
| title | String | Website <title> tag content |
| description | String | Meta description or og:description |
| favicon | String | URL to the website favicon |
| ogImage | String | Open Graph image URL |
| socialLinks | Object | Social media URLs found in the page HTML (keys: twitter, linkedin, facebook, instagram, youtube, github) |

wikipedia fields:

| Field | Type | Description |
|---|---|---|
| found | Boolean | Whether a Wikipedia page was found |
| summary | String | Wikipedia article extract |
| description | String | Short Wikipedia description |
| thumbnail | String | Wikipedia thumbnail image URL |
| url | String | Direct URL to the Wikipedia page |
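
To spot-check the wikipedia section by hand, you can build the page summary request for a given title yourself. This sketch assumes the public Wikipedia REST path /api/rest_v1/page/summary listed under Data Sources. The helper name and the lang parameter are illustrative, not part of the actor.

```python
from urllib.parse import quote


def wikipedia_summary_url(title: str, lang: str = "en") -> str:
    """Build the REST summary URL for a page title such as 'Stripe, Inc.'."""
    # Wikipedia page titles use underscores in place of spaces.
    slug = title.strip().replace(" ", "_")
    # Percent-encode everything else, including commas and slashes.
    return f"https://{lang}.wikipedia.org/api/rest_v1/page/summary/{quote(slug, safe='')}"
```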

github fields:

| Field | Type | Description |
|---|---|---|
| found | Boolean | Whether a GitHub org or repos were found |
| orgProfile.name | String | GitHub organization display name |
| orgProfile.bio | String | Organization description |
| orgProfile.publicRepos | Integer | Number of public repositories |
| orgProfile.followers | Integer | Number of GitHub followers |
| topRepositories[] | Array | Top repos by stars, each with name, description, stars, forks, language, url |
| totalStars | Integer | Sum of stars across returned repos |

financials fields:

| Field | Type | Description |
|---|---|---|
| isPublicCompany | Boolean | Whether SEC filings were found |
| ticker | String / null | Stock ticker symbol (e.g., "AAPL") |
| cik | String / null | SEC Central Index Key |
| recentFilings[] | Array | Recent SEC filings, each with formType, filedDate, description, url |
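
When cik is not null, it can be used to look the company up directly on EDGAR. The sketch below zero-pads the CIK to 10 digits and builds the URL of the SEC's public submissions endpoint on data.sec.gov. Nothing in this document says the actor calls that endpoint, and the helper names are hypothetical.

```python
from typing import Union


def pad_cik(cik: Union[str, int]) -> str:
    """Zero-pad a CIK to the 10-digit form used in EDGAR data URLs."""
    digits = str(cik).strip()
    if not digits.isdigit() or len(digits) > 10:
        raise ValueError(f"not a valid CIK: {cik!r}")
    return digits.zfill(10)


def edgar_submissions_url(cik: Union[str, int]) -> str:
    """URL of the JSON filing history for one filer."""
    return f"https://data.sec.gov/submissions/CIK{pad_cik(cik)}.json"
```

Requests to SEC endpoints should carry a descriptive User-Agent header, in line with the fair use note under Responsible Use.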

research fields:

| Field | Type | Description |
|---|---|---|
| paperCount | Integer | Total papers found on OpenAlex |
| topPapers[] | Array | Top papers by citations, each with title, doi, citationCount, publicationDate, source |
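
If paperCount looks inflated by incidental mentions, it can help to reproduce the query against OpenAlex directly. The following sketch builds a /works search sorted by citation count. The exact parameters the actor sends are not documented here, so treat the sort and per-page values as assumptions.

```python
from urllib.parse import urlencode


def openalex_works_url(company_name: str, max_results: int = 50) -> str:
    """Build an OpenAlex works search ordered by citation count, highest first."""
    params = {
        "search": company_name,
        "sort": "cited_by_count:desc",
        # Mirror the actor's documented 1-100 bound on maxResults.
        "per-page": max(1, min(100, max_results)),
    }
    return "https://api.openalex.org/works?" + urlencode(params)
```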

dns fields:

| Field | Type | Description |
|---|---|---|
| aRecords | String[] | IPv4 addresses |
| mxRecords | String[] | Mail exchange records (priority + exchange) |
| txtRecords | String[] | TXT records (SPF, DKIM, verification tokens) |
| nameServers | String[] | Authoritative name servers |
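
The mxRecords strings combine priority and exchange, so a small parser is usually the first step toward inferring the email provider. This is a minimal sketch: the MAIL_PROVIDER_HINTS table is a hypothetical, deliberately tiny example, not a list the actor ships.

```python
from typing import NamedTuple, Optional


class MxRecord(NamedTuple):
    priority: int
    exchange: str


def parse_mx(record: str) -> MxRecord:
    """Split a string like '10 aspmx.l.google.com' into priority and host."""
    priority, exchange = record.split(None, 1)
    return MxRecord(int(priority), exchange.strip().rstrip(".").lower())


# Illustrative suffix-to-provider hints; extend for your own needs.
MAIL_PROVIDER_HINTS = {
    "google.com": "Google Workspace",
    "outlook.com": "Microsoft 365",
}


def guess_mail_provider(mx_records: list[str]) -> Optional[str]:
    """Guess the provider from the most preferred (lowest priority) MX host."""
    if not mx_records:
        return None
    primary = min((parse_mx(r) for r in mx_records), key=lambda mx: mx.priority)
    for suffix, provider in MAIL_PROVIDER_HINTS.items():
        if primary.exchange == suffix or primary.exchange.endswith("." + suffix):
            return provider
    return None
```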

socialMedia[] fields:

| Field | Type | Description |
|---|---|---|
| platform | String | Platform name (Twitter/X, LinkedIn, Facebook, Instagram, YouTube, GitHub) |
| url | String | Profile URL (discovered from website or guessed from company slug) |
| found | Boolean | Whether the profile exists and returned HTTP 200 |
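
The url field is either a link found on the website or a guess built from the company slug. A rough sketch of that priority rule follows. The Twitter, LinkedIn, and Instagram patterns match the example output, while the Facebook, YouTube, and GitHub patterns are assumptions, as is the function itself.

```python
from typing import Optional

# URL patterns per platform; only some are confirmed by the example output.
PROFILE_TEMPLATES = {
    "twitter": "https://twitter.com/{slug}",
    "linkedin": "https://www.linkedin.com/company/{slug}",
    "facebook": "https://www.facebook.com/{slug}",    # assumed pattern
    "instagram": "https://www.instagram.com/{slug}",
    "youtube": "https://www.youtube.com/@{slug}",     # assumed pattern
    "github": "https://github.com/{slug}",            # assumed pattern
}


def candidate_profiles(slug: str, website_links: Optional[dict] = None) -> dict:
    """Prefer URLs linked from the website and fall back to slug guesses."""
    linked = website_links or {}
    return {
        platform: linked.get(platform) or template.format(slug=slug)
        for platform, template in PROFILE_TEMPLATES.items()
    }
```

Each candidate would still need an HTTP check before being reported as found, and guessed URLs deserve more scepticism than linked ones.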

Use Cases

  • Sales & BD professionals preparing company briefs before outbound prospecting — identify tech stack, filings status, and social channels to personalize outreach
  • Competitive intelligence analysts pulling together website metadata, GitHub activity, SEC filings, and academic citations into one report
  • Venture capital & PE researchers evaluating investment targets — assess public market presence, open-source footprint, and academic research impact
  • Journalists & investigators compiling background information — Wikipedia summaries, SEC filings, DNS records, and social presence in seconds
  • M&A due diligence teams running preliminary technical and public-records checks on acquisition targets
  • Marketing strategists auditing a brand's digital footprint across social platforms, website metadata, and open-source presence

How to Use the API

You can call Company Deep Research Agent programmatically from any language:

Python

import requests
import time

API_TOKEN = "YOUR_APIFY_TOKEN"
TERMINAL_STATUSES = ("SUCCEEDED", "FAILED", "ABORTED", "TIMED-OUT")

# Start the actor run
run = requests.post(
    "https://api.apify.com/v2/acts/ryanclinton~company-deep-research/runs",
    params={"token": API_TOKEN},
    json={
        "domain": "stripe.com",
        "includeFinancials": True,
        "includeResearch": True,
        "includeGithub": True,
        "maxResults": 20,
    },
    timeout=30,
).json()

# Wait for completion by polling the run status every 5 seconds
run_id = run["data"]["id"]
while True:
    status = requests.get(
        f"https://api.apify.com/v2/actor-runs/{run_id}",
        params={"token": API_TOKEN},
        timeout=10,
    ).json()
    if status["data"]["status"] in TERMINAL_STATUSES:
        break
    time.sleep(5)

# Stop here if the run did not finish successfully
if status["data"]["status"] != "SUCCEEDED":
    raise RuntimeError(f"Run ended with status {status['data']['status']}")

# Get results
dataset_id = status["data"]["defaultDatasetId"]
items = requests.get(
    f"https://api.apify.com/v2/datasets/{dataset_id}/items",
    params={"token": API_TOKEN},
    timeout=30,
).json()
report = items[0]

# Sections may be null or empty when a module is disabled or a source fails
wikipedia = report.get("wikipedia") or {}
org_profile = (report.get("github") or {}).get("orgProfile") or {}
financials = report.get("financials") or {}

print(f"Company: {report['companyName']}")
print(f"Wikipedia: {(wikipedia.get('summary') or 'N/A')[:100]}...")
print(f"GitHub repos: {org_profile.get('publicRepos', 'N/A')}")
print(f"Public company: {financials.get('isPublicCompany', 'N/A')}")

JavaScript

// Run the actor synchronously and receive the dataset items in the response
const response = await fetch(
  "https://api.apify.com/v2/acts/ryanclinton~company-deep-research/run-sync-get-dataset-items?token=YOUR_APIFY_TOKEN",
  {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      domain: "stripe.com",
      includeFinancials: true,
      includeGithub: true,
      maxResults: 20,
    }),
  }
);
// Each run produces a single dataset item
const [report] = await response.json();
console.log(`${report.companyName}: ${report.wikipedia?.summary?.slice(0, 100)}`);
console.log(`GitHub: ${report.github?.totalStars ?? "N/A"} total stars`);

cURL

curl -X POST "https://api.apify.com/v2/acts/ryanclinton~company-deep-research/run-sync-get-dataset-items?token=YOUR_APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "domain": "stripe.com",
    "includeFinancials": true,
    "includeGithub": true,
    "maxResults": 20
  }'

How It Works

Input (domain, companyName, module toggles)
                         ↓
┌──────────────────────────────────────────────────┐
│ Step 1: Website Analysis                         │
│ Fetch HTML → extract <title>, og:*, favicon,     │
│ social links via regex pattern matching          │
│ Auto-detect company name from title              │
└──────────────────────────────────────────────────┘
                         ↓
┌──────────────────────────────────────────────────┐
│ Step 2: Wikipedia                                │
│ Try direct page summary API first                │
│ Fallback: search API → top result → summary      │
│ Returns: summary, description, thumbnail, URL    │
└──────────────────────────────────────────────────┘
                         ↓
┌──────────────────────────────────────────────────┐
│ Step 3: GitHub (optional)                        │
│ Try org lookup with 3 name guesses:              │
│ • domain base (e.g., "stripe")                   │
│ • company lowercase ("openai")                   │
│ • company with dashes ("some-company")           │
│ Fallback: repository search API                  │
│ Returns: org profile, top repos by stars         │
└──────────────────────────────────────────────────┘
                         ↓
┌──────────────────────────────────────────────────┐
│ Step 4: SEC EDGAR (optional)                     │
│ 3 endpoints tried in sequence:                   │
│ • EFTS full-text search (10-K, 10-Q, 8-K)        │
│ • browse-edgar company search (Atom XML)         │
│ • company_tickers.json for ticker/CIK            │
│ Returns: isPublic, ticker, CIK, recent filings   │
└──────────────────────────────────────────────────┘
                         ↓
┌──────────────────────────────────────────────────┐
│ Step 5: OpenAlex (optional)                      │
│ Search works by company name, sorted by          │
│ citation count descending                        │
│ Returns: total paper count, top papers w/ DOI    │
└──────────────────────────────────────────────────┘
                         ↓
┌──────────────────────────────────────────────────┐
│ Step 6: DNS                                      │
│ Resolve A, MX, TXT, NS records via Node.js       │
│ dns.promises module                              │
│ Reveals hosting, email, verification tokens      │
└──────────────────────────────────────────────────┘
                         ↓
┌──────────────────────────────────────────────────┐
│ Step 7: Social Media                             │
│ 6 platforms: Twitter/X, LinkedIn, Facebook,      │
│ Instagram, YouTube, GitHub                       │
│ Priority: website-linked URLs > slug guessing    │
│ HTTP GET check for existence                     │
└──────────────────────────────────────────────────┘
                         ↓
Single JSON report pushed to dataset

Data Sources

| Step | Source | API Used | Auth Required |
|---|---|---|---|
| 1 | Company website | Direct HTTPS fetch + HTML parsing | No |
| 2 | Wikipedia | REST API (/api/rest_v1/page/summary) + search API | No |
| 3 | GitHub | REST API (/orgs/{name}, /orgs/{name}/repos) | Optional token |
| 4 | SEC EDGAR | EFTS search, browse-edgar, company_tickers.json | No |
| 5 | OpenAlex | REST API (/works?search=) | No |
| 6 | DNS | Node.js dns.promises (resolve4, resolveMx, resolveTxt, resolveNs) | No |
| 7 | Social Media | HTTP GET to profile URLs | No |

Company Name Auto-Detection

When companyName is not provided, the actor extracts it from the website:

  1. Fetches the homepage HTML
  2. Checks og:title first, falls back to <title> tag
  3. Splits on separators such as |, -, and :
  4. Takes the first part (e.g., "Stripe | Financial Infrastructure" → "Stripe")
  5. If no title found, capitalizes the domain name (e.g., "stripe.com" → "Stripe")
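
The steps above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the actor's code: the function name is hypothetical, and the sketch only treats a hyphen as a separator when it has whitespace on both sides, so that names like "Coca-Cola" survive.

```python
import re
from typing import Optional

# Pipe or colon anywhere; hyphen only when surrounded by whitespace (sketch choice).
_SEPARATORS = re.compile(r"\s*[|:]\s*|\s+-\s+")


def detect_company_name(
    domain: str,
    og_title: Optional[str] = None,
    title: Optional[str] = None,
) -> str:
    """Prefer og:title, then <title>, then fall back to the capitalized domain."""
    raw = (og_title or title or "").strip()
    if raw:
        first = _SEPARATORS.split(raw, maxsplit=1)[0].strip()
        if first:
            return first
    return domain.split(".")[0].capitalize()
```

A tagline-first title such as "Build the Future | Acme Corp" would still yield "Build the Future", which is why supplying companyName explicitly is recommended in that case.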

GitHub Organization Resolution

The actor tries multiple strategies to find the GitHub org:

  1. Domain base: stripe.com → tries https://api.github.com/orgs/stripe
  2. Company name lowercase: "OpenAI" → tries openai
  3. Company name with dashes: "Some Company" → tries some-company
  4. Search fallback: if no org found, searches repositories by company name
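
The three name guesses can be generated in a few lines. The sketch below is an assumption about how such slugs might be derived (for instance, it removes spaces for the lowercase variant) and expects an already normalized domain without a www prefix. Only the /orgs/{name} endpoint comes from this document.

```python
import re


def github_org_candidates(domain: str, company_name: str) -> list[str]:
    """Return de-duplicated org slug guesses in the order they would be tried."""
    words = re.findall(r"[a-z0-9]+", company_name.lower())
    guesses = [
        domain.lower().split(".")[0],  # domain base, e.g. "stripe"
        "".join(words),                # company lowercase, e.g. "openai"
        "-".join(words),               # company with dashes, e.g. "some-company"
    ]
    ordered: list[str] = []
    for guess in guesses:
        if guess and guess not in ordered:
            ordered.append(guess)
    return ordered


def org_lookup_url(slug: str) -> str:
    """GitHub REST endpoint for an organization profile."""
    return f"https://api.github.com/orgs/{slug}"
```

De-duplicating matters for rate limits: for a company like OpenAI all three guesses collapse to one request.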

How Much Does It Cost?

| Plan | Monthly Cost | Included Runs | Cost Per Run |
|---|---|---|---|
| Free | $0 | ~200 runs | $0 |
| Personal ($49) | $49/month | ~10,000 runs | ~$0.005 |

The actor uses 256 MB memory and completes in 15–45 seconds. It makes only lightweight API calls with no browser rendering, keeping compute costs minimal.

Tips

  • Provide the company name explicitly for companies whose website title is a tagline rather than the company name. This improves Wikipedia, SEC, and GitHub search accuracy.
  • Disable unused modules to cut run time in half. If you only need website metadata and social media, turn off SEC, research, and GitHub.
  • Use a GitHub token when researching multiple companies in a batch. Without a token, GitHub allows 60 requests/hour. A free personal access token raises this to 5,000/hour.
  • Combine with other actors — feed the SEC CIK number into SEC EDGAR Filing Analyzer, or pass the domain into Website Tech Stack Detector for technology fingerprinting.
  • Batch process company lists by calling this actor via the Apify API in a loop. Each run is independent, so you can research hundreds of companies in parallel.
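
For the batch tip, a thread pool is one simple way to run several independent lookups at once. In this sketch, run_one is a hypothetical callable that you supply, for example a wrapper around the start, poll, and fetch sequence from the Python API example. The worker count of 5 is an arbitrary default.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Iterable


def research_batch(
    domains: Iterable[str],
    run_one: Callable[[str], Any],
    max_workers: int = 5,
) -> dict[str, Any]:
    """Run run_one(domain) concurrently and collect results keyed by domain."""
    results: dict[str, Any] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {domain: pool.submit(run_one, domain) for domain in domains}
        for domain, future in futures.items():
            try:
                results[domain] = future.result()
            except Exception as exc:
                # One failed company should not abort the whole batch.
                results[domain] = {"error": str(exc)}
    return results
```

Keep max_workers modest when GitHub analysis is enabled without a token, since every run draws on the same unauthenticated rate limit.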

Limitations

  • Company name detection depends on website title — sites with tagline-only titles (e.g., "Build the Future") will produce poor search results across Wikipedia, SEC, and GitHub unless you provide companyName manually.
  • SEC EDGAR is US-only — the financials module only finds companies that file with the US Securities and Exchange Commission. Non-US public companies are not covered.
  • GitHub org matching is heuristic — the actor tries 3 name guesses plus a search fallback. Companies with GitHub org names that differ significantly from their company name may not be found.
  • Social media detection uses HTTP status — some platforms may return false positives (200 for redirect pages) or false negatives (rate limiting). Website-discovered links are more reliable.
  • No WHOIS data — the actor resolves DNS but does not query WHOIS registrars. Use the WHOIS Domain Lookup actor for registration details.
  • Wikipedia search may match wrong entity — common company names (e.g., "Apple") may match the Wikipedia article for a different entity. Providing the full company name helps.
  • Sequential processing — the 7 steps run sequentially. A single slow API response can extend the total run time.
  • OpenAlex results may include false matches — papers mentioning the company name in passing may appear in results alongside genuinely relevant research.

Responsible Use

  • All data is from public sources — Wikipedia (Creative Commons), SEC (public domain), OpenAlex (open access), GitHub (public API), DNS (public records).
  • Respect GitHub rate limits — use a personal access token when running batch queries to avoid hitting the 60-request/hour unauthenticated limit.
  • Comply with SEC EDGAR fair use policy — the actor includes a descriptive User-Agent string. Avoid excessive request volumes.
  • Use for legitimate business research — this tool is designed for sales intelligence, competitive analysis, due diligence, and journalism.

FAQ

Is this actor free to use?
Yes. All data sources are free public APIs. The only cost is Apify platform compute, which is covered by the free tier for moderate usage.

Does it work for non-US companies?
Yes. Website analysis, Wikipedia, GitHub, DNS, and social media work globally. The SEC EDGAR module only returns results for companies that file with the US SEC.

How accurate is the company name auto-detection?
The actor extracts the company name from og:title or <title> and splits on common separators. For most corporate websites this works well. For sites with tagline-first titles, provide the company name manually.

Can I increase the GitHub API rate limit?
Yes. Generate a free GitHub personal access token at github.com/settings/tokens and enter it in the githubToken input field.

What happens if a data source is unavailable?
Each module handles errors independently. If Wikipedia is down or the SEC API times out, that section returns null or empty results while the rest completes normally.

How often should I re-run research on the same company?
Monthly for actively changing companies. Quarterly for stable companies.

Integrations

The Company Deep Research Agent works with the full Apify platform ecosystem:

  • Apify API — trigger runs programmatically and retrieve results as JSON for custom company research pipelines.
  • Zapier — trigger a company research run when a new lead enters your CRM, then push the report into Google Sheets or Slack.
  • Make (Integromat) — build workflows that research companies and route findings into Airtable, HubSpot, or email sequences.
  • Google Sheets — export the dataset directly for team collaboration.
  • Webhooks — receive the research report as soon as the run completes for real-time integrations.
  • Scheduled Runs — monitor companies over time by tracking changes in SEC filings, new GitHub repositories, or updated website metadata.

Build a complete company intelligence pipeline by combining this actor with other tools:

| Actor | What it does | Use with Company Deep Research |
|---|---|---|
| SEC EDGAR Filing Analyzer | Deep SEC filing analysis | Dive deeper using the CIK from this actor's output |
| Website Tech Stack Detector | Technology fingerprinting | Identify the tech stack behind the company's website |
| Website Contact Scraper | Extract contact details | Get emails and phone numbers from company websites |
| GitHub Repository Search | Cross-GitHub repo search | Expand the GitHub analysis beyond the company org |
| OpenAlex Research Search | Full OpenAlex search | Explore academic papers with author and institution filters |
| WHOIS Domain Lookup | Domain registration details | Get registrar, creation date, and expiration for the domain |
| DNS Record Lookup | Detailed DNS queries | Deeper DNS analysis beyond built-in A/MX/TXT/NS |
| Brand Protection Monitor | Brand threat monitoring | Check for typosquatting and impersonation |
| SaaS Competitive Intelligence | SaaS competitor analysis | Compare pricing, features, and positioning |
| SEC Insider Trading | Insider stock transactions | Track insider trades for public companies |