Pricing

from $1.00 / 1,000 results

Darkweb Scraper

Crawl dark web .onion sites via Tor. Extract links, emails, phone numbers, cryptocurrency wallet addresses, social media handles, and API keys from hidden services.

Developer

Crawler Bros

Last modified

23 days ago

What is Darkweb Scraper?

Darkweb Scraper is an Apify actor that accesses Tor hidden services (.onion sites) and extracts structured data from them. It bundles a Tor daemon internally, so you don't need any special setup or proxy configuration. Provide a search keyword or direct .onion URLs, and the scraper will crawl the dark web and return organized results.

What can this actor do?

Crawl .onion sites — Navigate through dark web pages with configurable crawl depth and page limits
Search the dark web — Enter any keyword and discover relevant .onion sites automatically via dark web search engines (best-effort; search engines are frequently offline)
Extract emails — Find email addresses embedded in dark web pages
Extract phone numbers — Detect phone numbers in international formats
Extract cryptocurrency addresses — Identify Bitcoin, Ethereum, Monero, Litecoin, Bitcoin Cash, and Ripple wallet addresses
Extract social media handles — Find Twitter/X, Instagram, and Telegram usernames and links
Detect exposed API keys — Discover exposed AWS keys, Google API keys, and generic credential strings
Keyword matching — Check whether your search term appears on each crawled page

Use cases

Threat intelligence — Monitor the dark web for leaked credentials, stolen data, or mentions of your organization
Brand protection — Detect unauthorized use of your brand name or products on hidden services
Security research — Discover exposed API keys, wallet addresses, and sensitive data on .onion sites
OSINT investigations — Map dark web site structures, discover linked hidden services, and extract contact information
Cryptocurrency tracking — Find wallet addresses associated with dark web activity

Input

Field	Type	Default	Description
Mode	Select	`crawl`	How to discover pages: `crawl` (start URLs only — most reliable), `search` (query dark web engines), `searchAndCrawl` (merge search seeds with start URLs).
Search Keyword	String	—	Keyword to query dark web search engines (modes `search`, `searchAndCrawl`).
Start URLs	URL List	BBC News onion	Direct `.onion` URLs to crawl (modes `crawl`, `searchAndCrawl`). Must be valid Tor hidden service addresses.
Max Crawl Depth	Integer	1	Maximum link depth to follow from seed pages. `0` = only the provided URLs.
Max Pages to Crawl	Integer	5	Maximum number of pages to fetch during the crawl.
Max Output Items	Integer	5	Maximum number of items to include in the output dataset.
Extract emails	Boolean	true	Find email addresses on each page.
Extract phone numbers	Boolean	true	Detect phone numbers on each page.
Extract cryptocurrency addresses	Boolean	true	Identify BTC / ETH / XMR / LTC / BCH / XRP wallet addresses.
Extract social media handles	Boolean	true	Find Twitter/X, Instagram, and Telegram handles.
Detect exposed API keys	Boolean	true	Discover AWS, Google, and generic API key strings.

Note: mode=crawl requires at least one Start URLs entry. mode=search requires a Search Keyword. mode=searchAndCrawl requires at least one of the two.
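
The mode rules in the note above can be checked before starting a run. The following is a minimal sketch, not the actor's own validation code: the function name `validate_input` is hypothetical, and the input keys (`mode`, `search`, `startUrls`) are taken from the example inputs in this README.

```python
def validate_input(run_input: dict) -> list[str]:
    """Return a list of problems; an empty list means the input looks valid."""
    mode = run_input.get("mode", "crawl")  # crawl is the documented default
    has_urls = bool(run_input.get("startUrls"))
    has_keyword = bool((run_input.get("search") or "").strip())

    problems = []
    if mode == "crawl" and not has_urls:
        problems.append("mode=crawl requires at least one Start URLs entry")
    elif mode == "search" and not has_keyword:
        problems.append("mode=search requires a Search Keyword")
    elif mode == "searchAndCrawl" and not (has_urls or has_keyword):
        problems.append("mode=searchAndCrawl requires a keyword or a start URL")
    elif mode not in ("crawl", "search", "searchAndCrawl"):
        problems.append(f"unknown mode: {mode!r}")
    return problems


if __name__ == "__main__":
    print(validate_input({"mode": "search"}))
    print(validate_input({"mode": "searchAndCrawl", "search": "forum"}))
```

Running the check locally avoids spending a Tor bootstrap on a run that would be rejected anyway.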

Example input — Crawl mode (default, most reliable)

{
    "mode": "crawl",
    "startUrls": [
        { "url": "http://bbcnewsd73hkzno2ini43t4gblxvycyac5aw4gnv7t2rfl3d5jakber2iniad.onion/" }
    ],
    "maxDepth": 1,
    "maxPages": 5,
    "maxItems": 5
}

Example input — Search mode

{
    "mode": "search",
    "search": "marketplace",
    "maxDepth": 2,
    "maxPages": 20,
    "maxItems": 20
}

Example input — Search + crawl combined

{
    "mode": "searchAndCrawl",
    "search": "forum",
    "startUrls": [
        { "url": "http://xjfbpuj56rdazx4iolylxplbvyft2onuerjeimlcqwaihp3s6r4xebqd.onion/" }
    ],
    "maxDepth": 2,
    "maxPages": 30
}

Output

Each crawled page produces one item in the output dataset with the following fields:

Field	Type	Description
url	String	The `.onion` page URL that was scraped
sourceUrl	String	Canonical source URL (same as `url`; included for consistency with downstream tooling)
title	String	The page title extracted from the HTML `<title>` tag
links	Array	All links discovered on the page (both `.onion` and clearnet), resolved to absolute URLs
onionLinks	Array	Subset of `links` that point to `.onion` hosts
emails	Array	Email addresses found on the page (false-positive domains filtered)
phones	Array	Phone numbers found on the page (short / repeated / letter-containing matches filtered)
cryptoAddresses	Object	Cryptocurrency wallet addresses grouped by type (`bitcoin`, `ethereum`, `monero`, `litecoin`, `bitcoinCash`, `ripple`)
misc	Object	Social media handles (`twitter`, `instagram`, `telegram`) and exposed `apiKeys`
searchKeywordFound	Boolean	Whether the search keyword was found on this page (only present when a keyword was supplied)
recordType	String	Always `"page"`
scrapedAt	String	UTC ISO-8601 timestamp of when the page was scraped

Empty arrays / objects / strings are never emitted — any field that would be empty is omitted entirely from the record.
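
The omit-empty rule can be illustrated with a short sketch. This is an assumption about how such a filter might look, not the actor's source: the helper name `omit_empty` is hypothetical, and the recursion into nested objects is inferred from the sample output, where `cryptoAddresses` and `misc` list only the populated keys.

```python
def omit_empty(record: dict) -> dict:
    """Drop keys whose value is None, "", [], or {}; recurse into nested dicts."""
    cleaned = {}
    for key, value in record.items():
        if isinstance(value, dict):
            value = omit_empty(value)
        if value is None or value == "" or value == [] or value == {}:
            continue
        cleaned[key] = value
    return cleaned


if __name__ == "__main__":
    raw = {
        "url": "http://example.onion/",
        "title": "",
        "emails": [],
        "misc": {"twitter": [], "telegram": ["@example"]},
        "searchKeywordFound": False,
    }
    print(omit_empty(raw))
```

Note that `False` is not treated as empty, so `searchKeywordFound: false` would survive. On the consuming side, read optional fields with a default, for example `record.get("emails", [])`, because a missing key means "nothing found".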

Sample output

{
    "url": "http://xjfbpuj56rdazx4iolylxplbvyft2onuerjeimlcqwaihp3s6r4xebqd.onion/",
    "sourceUrl": "http://xjfbpuj56rdazx4iolylxplbvyft2onuerjeimlcqwaihp3s6r4xebqd.onion/",
    "title": "Dark Market - Home",
    "links": [
        "http://xjfbpuj56rdazx4iolylxplbvyft2onuerjeimlcqwaihp3s6r4xebqd.onion/faq/",
        "http://xjfbpuj56rdazx4iolylxplbvyft2onuerjeimlcqwaihp3s6r4xebqd.onion/support/",
        "http://phobosxilamwcg75xt22id7aywkzol6q6rfl2flipcqoc4e4ahima5id.onion/"
    ],
    "onionLinks": [
        "http://xjfbpuj56rdazx4iolylxplbvyft2onuerjeimlcqwaihp3s6r4xebqd.onion/faq/",
        "http://xjfbpuj56rdazx4iolylxplbvyft2onuerjeimlcqwaihp3s6r4xebqd.onion/support/",
        "http://phobosxilamwcg75xt22id7aywkzol6q6rfl2flipcqoc4e4ahima5id.onion/"
    ],
    "emails": ["contact@darkservice.onion"],
    "phones": ["+1-555-0123"],
    "cryptoAddresses": {
        "bitcoin": ["1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa"],
        "monero": ["4AdUndXHHZ6cfufTMvppY6JwXNouMBzSkbLYfpAV5Usx3skxNgYeYTRj5UzqtReoS44qo9mtmXCqY45DJ852K5Jv2684Rge"]
    },
    "misc": {
        "twitter": ["@darkmarket"],
        "telegram": ["@darkmarketgroup"]
    },
    "searchKeywordFound": true,
    "recordType": "page",
    "scrapedAt": "2026-06-18T12:00:00.000000+00:00"
}
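
The relationship between `links` and `onionLinks` can be reproduced on the consumer side. The sketch below assumes the subset is decided by the URL's hostname ending in `.onion`; the function name `onion_links` is hypothetical and the actor's actual filter may differ.

```python
from urllib.parse import urlsplit


def onion_links(links: list[str]) -> list[str]:
    """Keep only the links whose hostname ends in .onion, preserving order."""
    result = []
    for link in links:
        host = urlsplit(link).hostname or ""  # hostname is lowercased by urlsplit
        if host.endswith(".onion"):
            result.append(link)
    return result


if __name__ == "__main__":
    sample = [
        "http://xjfbpuj56rdazx4iolylxplbvyft2onuerjeimlcqwaihp3s6r4xebqd.onion/faq/",
        "https://example.com/about",
    ]
    print(onion_links(sample))
```

Checking the hostname rather than searching the whole URL string avoids misclassifying clearnet URLs that merely contain `.onion` in their path.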

How to use

Crawl specific .onion sites (recommended)

Add one or more .onion URLs to the Start URLs field
Set Mode to crawl
Set Max Crawl Depth to 0 to only scrape the provided pages, or higher to follow links
Click Start

Search the dark web (best-effort)

Set Mode to search
Enter a keyword in Search Keyword (e.g., "marketplace", "forum", "leaked data")
Set Max Crawl Depth to control how far the crawler follows links (0 = only the search results themselves, 1 or more = follow links found on those pages)
Set Max Pages to limit the crawl scope
Click Start and wait for results

Combine both modes

Set Mode to searchAndCrawl with both a keyword and start URLs. The scraper merges all discovered URLs and crawls them together, removing duplicates.
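
The merge-and-deduplicate step can be sketched as follows. The normalization used here (trimming whitespace and a trailing slash) and the ordering (search results first, then start URLs) are assumptions for illustration; `merge_seeds` is a hypothetical name, not part of the actor.

```python
def merge_seeds(search_results: list[str], start_urls: list[str]) -> list[str]:
    """Merge two URL lists, keeping the first occurrence of each URL."""
    seen = set()
    merged = []
    for url in [*search_results, *start_urls]:
        url = url.strip()
        key = url.rstrip("/")  # treat "http://a.onion" and "http://a.onion/" as equal
        if key and key not in seen:
            seen.add(key)
            merged.append(url)
    return merged


if __name__ == "__main__":
    found = ["http://a.onion/", "http://b.onion"]
    given = ["http://a.onion", "http://c.onion/"]
    print(merge_seeds(found, given))
```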

Tips

Start with crawl mode — mode=crawl with a known-good .onion URL is the most reliable path. Search engines on the dark web are frequently offline.
Start small — Use maxDepth: 0 and maxPages: 5 for your first run to see how the actor works
Dark web sites are unreliable — Many .onion sites go offline frequently. If a site is unreachable, the scraper will skip it and continue with other URLs
Tor is slow — Connecting through the Tor network adds latency. Expect each page to take 5-30 seconds to load
No proxy needed — The actor bundles its own Tor daemon, so you don't need to configure any proxy or pay extra for proxy services
Keyword search — Use specific, relevant keywords for better results. Generic terms may return many unrelated pages
Toggle extractors off — Disable individual extractors (extractEmails, extractPhones, etc.) to speed up runs when you only care about one data type

Reliability

Crawl mode (mode=crawl) is the reliable default. The daily test run on Apify uses crawl mode against a stable .onion seed and is expected to produce at least one record.
Search mode (mode=search) is best-effort. Dark web search engines (Ahmia, Torch, Haystack) rotate addresses and go offline frequently. The actor tries multiple engines in order, but search mode may return zero results on any given run. This is expected and not a bug.
Retry behavior. Every page fetch retries up to 2 times with exponential backoff on timeout / network error. HTTP 4xx/5xx responses are logged and skipped (the crawl continues with the next URL).
Tor bootstrap. The actor waits up to 120 seconds for Tor to bootstrap. If bootstrap fails, the run fails with a clear status message suggesting a retry.
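
The retry behavior described above follows a common pattern, sketched below. This is not the actor's implementation: `fetch_with_retry`, the `fetch` callable, and the 1-second base delay are hypothetical. Only the limit of 2 retries and the use of exponential backoff on timeout or network error come from this README.

```python
import time


def fetch_with_retry(fetch, url, retries=2, base_delay=1.0, sleep=time.sleep):
    """Call fetch(url); on timeout or network error, retry with doubling delays."""
    attempt = 0
    while True:
        try:
            return fetch(url)
        except (TimeoutError, ConnectionError):
            if attempt >= retries:
                raise  # retries exhausted; the caller can log and skip this URL
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
            attempt += 1


if __name__ == "__main__":
    calls = []

    def flaky(url):
        calls.append(url)
        if len(calls) < 3:
            raise TimeoutError("circuit too slow")
        return "<html>ok</html>"

    print(fetch_with_retry(flaky, "http://example.onion/", sleep=lambda s: None))
```

With `retries=2` a page is attempted at most three times in total. Passing `sleep` as a parameter keeps the sketch testable without real waiting.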

Data Source

This actor scrapes Tor hidden services (.onion websites) via the Tor network. Tor access is free and requires no paid API keys, no residential proxy groups, and no user-supplied credentials. The actor bundles its own Tor daemon and uses the free Apify datacenter network only for the build. This satisfies the project's zero-cost-cloud requirement: any user on the Apify free plan can run this actor with no configuration beyond the input fields.

For search mode, the actor queries the public dark web search engines Ahmia (ahmia.fi), Torch, and Haystack. These are public indexes of .onion sites; no account or API key is required.

Limitations

The actor can only access .onion (Tor hidden service) URLs. Regular clearnet websites are not crawled, although clearnet links found on a page are still listed in `links`
Dark web search engine results depend on what has been indexed. Not all .onion sites are discoverable via search
Some hidden services use CAPTCHAs or anti-bot measures that may prevent scraping
Tor circuit establishment typically takes 10-30 seconds at the start of each run (the actor waits up to 120 seconds before giving up)
The actor does not render JavaScript. Sites that require JavaScript for content display may return incomplete data

FAQ

Is it legal to scrape the dark web? Accessing the dark web via Tor is legal in most jurisdictions. However, the legality depends on what you do with the data. This tool is intended for security research, OSINT, and threat intelligence. Always comply with applicable laws.

Do I need a proxy to use this actor? No. The actor includes a built-in Tor daemon that handles all network routing automatically. No additional proxy configuration is needed and no paid Apify proxy groups are used.

How fast is the scraping? Tor connections are inherently slower than regular internet connections. Expect 5-30 seconds per page, depending on the hidden service's responsiveness. A typical run with 10 pages completes in 2-5 minutes.

Why are some pages not scraped? Dark web sites have high failure rates. Sites may be temporarily offline, overloaded, or have moved to a new .onion address. The scraper will skip unreachable pages and continue with others.

Why did my search run return zero results? Dark web search engines (Ahmia, Torch, Haystack) frequently go offline or rotate their .onion addresses. The actor tries them in order and falls back gracefully, but on any given day search mode may return zero seeds. Use mode=crawl with known .onion URLs for reliable results.

What cryptocurrency addresses are detected? The scraper identifies Bitcoin (BTC), Ethereum (ETH), Monero (XMR), Litecoin (LTC), Bitcoin Cash (BCH), and Ripple (XRP) wallet addresses.

Can I scrape a specific .onion site deeply? Yes. Add the site URL to Start URLs, set Mode to crawl, set Max Crawl Depth to 3-5, and increase Max Pages to allow thorough crawling of the site's internal pages.
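
A deep-crawl input can be built programmatically along these lines. This is a sketch under stated assumptions: `build_deep_crawl_input` is a hypothetical helper, the default of 50 pages is illustrative, and the keys match the example inputs in this README. It also raises `maxItems`, on the assumption that the default of 5 output items would otherwise cap the dataset regardless of how many pages are crawled.

```python
import json


def build_deep_crawl_input(site_url: str, depth: int = 3, max_pages: int = 50) -> dict:
    """Build a crawl-mode input that follows links several levels deep on one site."""
    if depth < 0 or max_pages < 1:
        raise ValueError("depth must be >= 0 and max_pages must be >= 1")
    return {
        "mode": "crawl",
        "startUrls": [{"url": site_url}],
        "maxDepth": depth,
        "maxPages": max_pages,
        "maxItems": max_pages,  # assumption: keep the output cap in step with the page cap
    }


if __name__ == "__main__":
    run_input = build_deep_crawl_input(
        "http://bbcnewsd73hkzno2ini43t4gblxvycyac5aw4gnv7t2rfl3d5jakber2iniad.onion/"
    )
    print(json.dumps(run_input, indent=4))
```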

What happens if Tor fails to connect? The actor will wait up to 2 minutes for Tor to establish a connection. If it fails, the run will end with an error message suggesting you retry.

Why are empty fields missing from my records? The actor uses an omit-empty contract: any field that would be null, "", [], or {} is removed from the record before it is pushed to the dataset. This keeps the dataset clean and easy to consume downstream. Only fields with real data are present.
