theHarvester - OSINT Email & Subdomain Harvester

Pricing

$3.00 / 1,000 osint records

Cloud-hosted theHarvester OSINT tool. Harvest emails, subdomains, IPs, URLs and ASNs from 54+ public sources (Shodan, Censys, crt.sh, VirusTotal, SecurityTrails, GitHub, hunter.io). Full CLI feature parity: DNS brute force, subdomain takeover, screenshots. $0.003 per record harvested.

Developer

Anshuman Atrey

Maintained by Community

theHarvester OSINT Actor

📦 Open source · MIT: github.com/AnshumanAtrey/theharvester-osint

Wraps laramies/theHarvester, the OSINT tool used by penetration testers and red teams to gather emails, subdomains, IPs, URLs, and ASNs from public sources.

This actor exposes the full CLI surface of theHarvester (16 flags, 54 data sources) as a structured Apify input. Results are pushed as individual dataset records (one per host, email, IP, and so on) plus a summary row, ready for tables, CSV export, or downstream pipelines.

Quick start

{
  "domain": "example.com",
  "sources": ["crtsh", "hackertarget", "rapiddns", "certspotter"]
}

These four sources are free and require no API key. For a basic domain, expect 100-1000 subdomains in 30-60 seconds.
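As a rough budgeting aid, the arithmetic behind the listed price ($3.00 per 1,000 records) can be sketched as follows. The helper name is hypothetical, and whether the summary row counts as a billed record is an assumption that this page does not confirm.

```python
from decimal import Decimal

# Listed price: $3.00 per 1,000 records, i.e. $0.003 per record.
PRICE_PER_RECORD = Decimal("3.00") / Decimal(1000)


def estimate_cost(record_count: int, include_summary: bool = True) -> Decimal:
    """Estimate the charge for one run (hypothetical helper).

    Assumption: every dataset record is billed, including the summary row
    unless include_summary is False.
    """
    if record_count < 0:
        raise ValueError("record_count must be non-negative")
    billed = record_count + (1 if include_summary else 0)
    return (PRICE_PER_RECORD * billed).quantize(Decimal("0.001"))


if __name__ == "__main__":
    # The quick start's expected range of 100-1000 subdomains.
    for n in (100, 1000):
        print(n, estimate_cost(n, include_summary=False))
```

Under these assumptions, a quick-start run that finds 100 to 1000 subdomains lands between roughly $0.30 and $3.00.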

Full power (with API keys)

{
  "domain": "example.com",
  "sources": ["all"],
  "dnsBrute": true,
  "takeOver": true,
  "shodan": true,
  "shodanApiKey": "YOUR_KEY",
  "securitytrailsApiKey": "YOUR_KEY",
  "virustotalApiKey": "YOUR_KEY",
  "hunterApiKey": "YOUR_KEY"
}
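To avoid hard-coding keys, an input like the one above could be assembled from environment variables. This is a minimal sketch: the environment variable names and the `build_input` helper are hypothetical, and only the input field names come from the example above.

```python
import json
import os
from typing import Mapping, Optional

# Hypothetical environment variable names mapped to the actor's key fields.
ENV_TO_INPUT = {
    "SHODAN_API_KEY": "shodanApiKey",
    "SECURITYTRAILS_API_KEY": "securitytrailsApiKey",
    "VIRUSTOTAL_API_KEY": "virustotalApiKey",
    "HUNTER_API_KEY": "hunterApiKey",
}


def build_input(domain: str, env: Optional[Mapping[str, str]] = None) -> dict:
    """Build a run input, adding only the API keys that are actually set."""
    env = os.environ if env is None else env
    run_input = {
        "domain": domain,
        "sources": ["all"],
        "dnsBrute": True,
        "takeOver": True,
    }
    for env_name, field in ENV_TO_INPUT.items():
        value = env.get(env_name, "").strip()
        if value:
            run_input[field] = value
    # Only request Shodan enrichment when a Shodan key is present.
    run_input["shodan"] = "shodanApiKey" in run_input
    return run_input


if __name__ == "__main__":
    print(json.dumps(build_input("example.com"), indent=2))
```

Leaving unset keys out of the input, rather than sending empty strings, keeps the payload honest about which paid sources can actually respond.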

What you get

The dataset receives one record per finding, each with a recordType discriminator:

| recordType | Fields | When emitted |
| --- | --- | --- |
| host | host, ip, domain | One per discovered subdomain |
| email | email, domain | One per discovered email |
| ip | ip, domain | One per discovered IP |
| url | url, domain | One per discovered "interesting URL" |
| asn | asn, domain | One per discovered ASN |
| shodan | shodan (object) | One per host enriched by Shodan |
| person | person, domain | One per discovered person/name |
| summary | counts, sources, success, cmd | One per run, always last |
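Because every record carries recordType, downstream code can split a downloaded dataset into per-type buckets. The sketch below assumes the items are already loaded as a list of dicts; the sample values are made up and only the field names follow the table above.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_by_record_type(records: Iterable[dict]) -> Dict[str, List[dict]]:
    """Bucket dataset items by their recordType discriminator."""
    groups: Dict[str, List[dict]] = defaultdict(list)
    for record in records:
        # Items without a recordType are kept under "unknown" rather than dropped.
        groups[record.get("recordType", "unknown")].append(record)
    return dict(groups)


if __name__ == "__main__":
    # Made-up sample items shaped after the record table.
    sample = [
        {"recordType": "host", "host": "www.example.com", "ip": "192.0.2.10", "domain": "example.com"},
        {"recordType": "email", "email": "info@example.com", "domain": "example.com"},
        {"recordType": "host", "host": "mail.example.com", "ip": "192.0.2.11", "domain": "example.com"},
    ]
    for record_type, items in group_by_record_type(sample).items():
        print(record_type, len(items))
```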

54 supported data sources

Free (no API key): crtsh, hackertarget, rapiddns, certspotter, otx, urlscan, threatcrowd, dnsdumpster, brave, duckduckgo, baidu, yahoo, mojeek, commoncrawl, waybackarchive, robtex, sitedossier, anubis, subdomaincenter, hudsonrock, thc

Paid / API key required: shodan, censys, virustotal, securityTrails, hunter, hunterhow, intelx, fullhunt, netlas, leakix, leaklookup, zoomeye, criminalip, dehashed, fofa, github-code, gitlab, bitbucket, bevigil, builtwith, chaos, projectdiscovery, onyphe, pentesttools, rocketreach, securityscorecard, subdomainfinderc99, tomba, venacus, whoisxml, windvane, dymo, haveibeenpwned, bufferoverun, shodanInternetDB

Mapped CLI flags

Every theHarvester CLI flag maps to an actor input, except -f, which the actor handles internally:

| CLI flag | Actor input | Notes |
| --- | --- | --- |
| -d | domain | Required |
| -b | sources | Comma-separated or array |
| -l | limit | Per-source result cap |
| -S | start | Pagination offset |
| -n | dnsLookup | Resolve discovered hosts |
| -c | dnsBrute | Brute-force subdomains |
| -r | dnsResolve | Custom resolver list |
| -e | dnsServer | DNS server IP |
| -s | shodan | Enrich via Shodan |
| -t | takeOver | Subdomain takeover check |
| --screenshot | screenshot | Capture subdomain screenshots |
| -a | apiScan | API endpoint scan |
| -w | wordlist | Wordlist path (cloud-limited) |
| -p | useProxies | Use proxies.yaml |
| -q | quiet | Suppress key warnings |
| -f | (internal) | Output is parsed and pushed automatically |
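To make the mapping concrete, here is a sketch of how an input object could be translated into a theHarvester command line. It is an illustration only, not the actor's actual implementation: `build_argv` is a hypothetical helper, and treating the listed inputs as plain values or on/off switches is an assumption. dnsResolve and screenshot are left out because this page does not describe how their arguments are passed.

```python
from typing import List

# (actor input, CLI flag) pairs taken from the mapping table.
# Assumption: these inputs carry a single value.
VALUE_FLAGS = [
    ("domain", "-d"),
    ("limit", "-l"),
    ("start", "-S"),
    ("dnsServer", "-e"),
    ("wordlist", "-w"),
]

# Assumption: these inputs are booleans that toggle a bare switch.
SWITCH_FLAGS = [
    ("dnsLookup", "-n"),
    ("dnsBrute", "-c"),
    ("shodan", "-s"),
    ("takeOver", "-t"),
    ("apiScan", "-a"),
    ("useProxies", "-p"),
    ("quiet", "-q"),
]


def build_argv(run_input: dict) -> List[str]:
    """Translate an actor input dict into an argv list (hypothetical helper)."""
    if not run_input.get("domain"):
        raise ValueError("domain is required")
    argv = ["theHarvester"]
    for field, flag in VALUE_FLAGS:
        value = run_input.get(field)
        if value is not None and value != "":
            argv += [flag, str(value)]
    sources = run_input.get("sources")
    if sources:
        # -b accepts a comma-separated list; arrays are joined first.
        joined = sources if isinstance(sources, str) else ",".join(sources)
        argv += ["-b", joined]
    for field, flag in SWITCH_FLAGS:
        if run_input.get(field) is True:
            argv.append(flag)
    return argv


if __name__ == "__main__":
    print(build_argv({"domain": "example.com", "sources": ["crtsh", "rapiddns"], "dnsBrute": True}))
```

Building an argv list instead of a shell string avoids quoting problems if a domain or wordlist path contains unusual characters.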

Notes for Apify cloud

  • Screenshots write to /tmp/screenshots inside the container. They are not yet auto-uploaded to the key-value store; that is planned.
  • Wordlist must reference a file that exists inside the Docker image. Custom wordlists from the input field do not transfer.
  • Proxies require a proxies.yaml baked into the image. Skip useProxies unless you've forked this actor with a custom image.
  • API key fields are marked isSecret, so they are stored encrypted and not logged.

FAQ

Do I need API keys?

Not for basic recon. The four free sources from the quick start (crtsh, hackertarget, rapiddns, certspotter) work without any keys and return 100-1000 subdomains for most domains. API keys unlock deeper sources like Shodan, Censys, VirusTotal, SecurityTrails, and Hunter.

Which sources should I pick first?

For pure subdomain discovery: crtsh, certspotter, hackertarget, rapiddns, otx, dnsdumpster, anubis, subdomaincenter. All are free and complement each other. For emails: add hunter (paid) or intelx (paid).

Why are some sources returning 0 results?

A few sources require a working API key (you'll see "key not set" warnings in logs). Others rate-limit per IP; re-running 10 minutes later usually works. CommonCrawl and WaybackArchive can be slow on large domains; raise limit if needed.

Can I combine this with nmap for full recon?

Yes, that's the canonical workflow. theHarvester finds subdomains, each one is fed into nmap, and nmap returns open ports and services per host. Stitch them into a single recon report via Apify's webhook output.
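One way to hand the results to nmap is to write the discovered hosts to a target file. The sketch below assumes the dataset items are already loaded as dicts with the host record shape described under "What you get"; the helper names and the targets.txt file name are hypothetical.

```python
from typing import Iterable, List


def nmap_targets(records: Iterable[dict]) -> List[str]:
    """Collect unique, normalized hostnames from host records."""
    hosts = set()
    for record in records:
        if record.get("recordType") != "host":
            continue
        # Normalize case and drop any trailing dot so duplicates collapse.
        host = str(record.get("host", "")).strip().lower().rstrip(".")
        if host:
            hosts.add(host)
    return sorted(hosts)


def write_target_file(records: Iterable[dict], path: str = "targets.txt") -> int:
    """Write one hostname per line and return how many were written."""
    targets = nmap_targets(records)
    with open(path, "w", encoding="utf-8") as fh:
        for target in targets:
            fh.write(target + "\n")
    return len(targets)
```

The resulting file can be passed to nmap with its -iL option, for example `nmap -iL targets.txt`. Only scan hosts you are authorized to test.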

How does this differ from running theHarvester locally?

Same tool, but you skip the install pain (Python deps, system requirements, proxy setup), and outputs are pre-parsed into structured records ready for CSV, Google Sheets, or Notion. You also get Apify Residential Proxies built in.

Pairs nicely with

Bundle for full attack-surface mapping:

  • nmap: Port-scan every subdomain theHarvester discovers
  • NetIntel: Enrich each discovered IP with WHOIS, GeoIP, SSL, reputation data
  • Bug Bounty Finder: Check whether the target has a public bounty program before reporting
  • Holehe Email OSINT: Take the discovered emails and find which sites they're registered on
  • Social Analyzer: Investigate the people behind the discovered emails
  • Zomato Restaurant Scraper: Restaurant lead lists (separate B2B use case)

Credits

Built on top of theHarvester by Christian Martorella (Edge Security). MIT/GPL licensed per upstream.