urlscan.io Threat Intelligence Scraper
Pricing: from $26.62 / 1,000 results
Search the urlscan.io public scan database with Lucene queries (domain, page.url, hash, IP, ASN, tag) and export scan metadata: page URL, IP, ASN, server, TLS, screenshot, redirect chain, country, brand, verdict.
Developer: ParseForge
Maintained by Community

urlscan.io Threat Intelligence Scraper
Export urlscan.io scan results in seconds. Run Lucene-style queries across the public urlscan.io scan database and pull back domain, IP, ASN, TLS, brand, verdict, and screenshot metadata. No API key, no rate-limit dance, no manual JSON parsing.
Last updated: 2026-05-13 · 31 fields per record · Phishing + malware feed · Any domain, IP, ASN, or tag
The urlscan.io Threat Intelligence Scraper queries the urlscan.io public search API with full Lucene syntax (domain:, page.url:, task.tags:phishing, page.asn:, brand.name:, verdicts.overall.malicious:true, plus AND, OR, NOT, wildcards, and date ranges) and returns one row per scan. Each row carries the page URL, apex domain, IP, ASN, server software, TLS issuer, redirect chain, country, page title, request count, brand attribution, and the malicious verdict score, plus links to the rendered screenshot and the full urlscan report.
Coverage spans the entire urlscan.io public corpus, which adds millions of new scans every week across phishing kits, brand impersonation, malware C2s, fast-flux infrastructure, and regular web pages. Every field maps directly to the upstream API so you can join scans to your own SIEM, takedown queue, or brand-protection workflow.
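As a minimal sketch of how the query operators listed above compose, the helper below joins clauses with `AND` and appends a relative date filter. The function names `build_query` and `any_of` are hypothetical and not part of the Actor; the Actor itself only needs the final query string.

```python
from typing import Iterable, Optional


def build_query(clauses: Iterable[str], since: Optional[str] = None) -> str:
    """Join Lucene clauses with AND and optionally append a relative date filter."""
    parts = [c.strip() for c in clauses if c and c.strip()]
    if since:
        # Relative window in the same form the examples in this README use, e.g. "7d".
        parts.append(f"date:>now-{since}")
    if not parts:
        raise ValueError("at least one clause is required")
    return " AND ".join(parts)


def any_of(field: str, values: Iterable[str]) -> str:
    """Build a parenthesised OR group over one field."""
    terms = [f"{field}:{value}" for value in values]
    if not terms:
        raise ValueError("values must not be empty")
    return "(" + " OR ".join(terms) + ")"


query = build_query(["page.domain:paypal.com", "task.tags:phishing"], since="7d")
print(query)
```

Building the string in code keeps scheduled runs consistent when the same brand or ASN list feeds several queries.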
| Target Audience | Primary Use Cases |
|---|---|
| Threat intel teams, SOC analysts, brand-protection engineers, takedown vendors, anti-phishing researchers, OSINT investigators | Phishing kit discovery, brand impersonation monitoring, IP / ASN attribution, malware infrastructure mapping, screenshot enrichment, indicator-of-compromise hunting |
What the urlscan.io Scraper does
Five intel workflows in one Actor:
- Phishing discovery. Pull every scan tagged `phishing` for a brand or apex domain.
- Brand impersonation monitoring. Watch `brand.name:<your-brand>` across the global scan feed.
- Infrastructure attribution. Pivot on `page.ip:`, `page.asn:`, or `page.server:` to map hosting clusters.
- Redirect-chain analysis. Trace landing-page redirects and final URLs across recent scans.
- Visual enrichment. Every record links to a public urlscan screenshot and the full result report.
Each scan record carries scan metadata (UUID, visibility, method, time, tags), page facts (URL, domain, apex, country, IP, ASN, server, status, title, MIME), TLS context (issuer, valid days), traffic stats (unique IPs, unique countries, request count, data length), brand attribution, and the urlscan verdict (score, malicious flag, categories), plus deep links to the screenshot and report page.
Why it matters: brand-protection and SOC teams burn hours stitching together phishing kit pivots from raw urlscan JSON. This Actor flattens the response into a spreadsheet-ready table so triage, takedown filings, and dashboards land in one query.
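To illustrate what flattening means here, the sketch below turns a nested scan object into underscore-joined columns. It is not the Actor's actual code. The nested `task` / `page` layout of the sample is an assumption about the upstream JSON, and the Actor's real column names differ in places (for example `page_apex_domain`), so treat the generic naming as illustrative only.

```python
from typing import Any, Dict


def flatten(record: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Recursively flatten nested dicts into one level, joining keys with '_'."""
    flat: Dict[str, Any] = {}
    for key, value in record.items():
        column = f"{prefix}_{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, column))
        else:
            # Lists (such as tags) are kept as-is rather than exploded into rows.
            flat[column] = value
    return flat


# Illustrative sample; the nested shape is assumed, the values come from the schema examples.
sample = {
    "task": {"visibility": "public", "tags": ["phishing", "malicious"]},
    "page": {"url": "https://classai-jdssb5uo04.edgeone.dev/", "asn": "AS139341"},
}
row = flatten(sample)
print(sorted(row))
```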
Full Demo
Coming soon: a 3-minute walkthrough showing a phishing query, a pivot to ASN, and a Slack alert.
Input
| Input | Type | Default | Behavior |
|---|---|---|---|
| `query` | string | `"domain:apify.com"` | Lucene query. Required. Supports `domain:`, `page.url:`, `page.ip:`, `page.asn:`, `task.tags:`, `brand.name:`, `verdicts.overall.malicious:true`, `hash:`, `filename:`, plus `AND`, `OR`, `NOT`, wildcards, and date ranges. |
| `maxItems` | integer | `10` | Records to return. Free plan caps at 10, paid plan at 1,000,000. |
| `pageSize` | integer | `100` | Results per API request. Lower values are friendlier to free-tier rate limits. |
Example: every phishing scan against PayPal in the last seven days.
{"query": "page.domain:paypal.com AND task.tags:phishing AND date:>now-7d","maxItems": 500}
Example: malicious verdicts hosted on a specific ASN.
{"query": "page.asn:AS139341 AND verdicts.overall.malicious:true","maxItems": 200,"pageSize": 100}
Good to Know: urlscan.io rate-limits anonymous search and may return partial results for very broad queries. Narrow with `date:>now-30d` or an apex domain when running bulk pulls, and keep `pageSize` modest on the free tier.
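One way to narrow a bulk pull is to split it into fixed date windows and run one query per window. The sketch below is hypothetical: `windowed_queries` is not part of the Actor, and the absolute range form `date:[start TO end]` is assumed to be accepted alongside the relative `date:>now-30d` form, so check it against the urlscan.io search documentation before relying on it.

```python
from datetime import date, timedelta
from typing import Iterator


def windowed_queries(base: str, start: date, end: date, days: int = 7) -> Iterator[str]:
    """Yield one query per consecutive window of `days` days covering start..end inclusive."""
    if days < 1:
        raise ValueError("days must be >= 1")
    if start > end:
        raise ValueError("start must not be after end")
    cursor = start
    while cursor <= end:
        window_end = min(cursor + timedelta(days=days - 1), end)
        # Assumed range syntax; the base query is parenthesised so OR clauses stay grouped.
        yield f"({base}) AND date:[{cursor.isoformat()} TO {window_end.isoformat()}]"
        cursor = window_end + timedelta(days=1)


queries = list(windowed_queries("task.tags:phishing", date(2026, 5, 1), date(2026, 5, 13)))
for q in queries:
    print(q)
```

Each generated string can be passed as the `query` input of a separate run, which keeps individual result sets small.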
Output
Each scan record carries 31 fields. Download the dataset as CSV, Excel, JSON, or XML.
Schema
| Field | Type | Example |
|---|---|---|
| `uuid` | string | "019e2370-d463-72c7-a1ef-3f07c7db0e75" |
| `task_url` | string | "https://classai-jdssb5uo04.edgeone.dev/" |
| `task_visibility` | string | "public" |
| `task_method` | string | "api" |
| `task_time` | ISO 8601 | "2026-05-13T22:24:25.440Z" |
| `task_tags` | string[] | ["phishing","malicious"] |
| `page_url` | string | "https://classai-jdssb5uo04.edgeone.dev/" |
| `page_domain` | string | "classai-jdssb5uo04.edgeone.dev" |
| `page_apex_domain` | string | "edgeone.dev" |
| `page_country` | string | "SG" |
| `page_server` | string | "edgeone-pages" |
| `page_ip` | string | "43.174.247.29" |
| `page_asn` | string | "AS139341" |
| `page_asn_name` | string | "ACE-AS-AP ACE, SG" |
| `page_ptr` | string \| null | null |
| `page_status` | string | "200" |
| `page_tlsValidDays` | number | 364 |
| `page_tlsIssuer` | string | "DigiCert Secure Site OV G2 TLS CN RSA4096 SHA256 2022 CA1" |
| `page_redirected` | string \| null | null |
| `page_title` | string | "ๆฌข่ฟๆฅๅฐไฟกๆฏ็งๆๅฎๅฎ" |
| `page_mime_type` | string | "text/html" |
| `page_language` | string \| null | null |
| `domain_age_days` | number | 1273 |
| `unique_ips` | number | 1 |
| `unique_countries` | number | 1 |
| `request_count` | number | 2 |
| `data_length` | number | 10882 |
| `brand_name` | string | "PayPal" |
| `verdict_score` | number | 100 |
| `verdicts_overall_malicious` | boolean | true |
| `screenshot` | string | "https://urlscan.io/screenshots/<uuid>.png" |
| `report_url` | string | "https://urlscan.io/result/<uuid>/" |
| `scrapedAt` | ISO 8601 | "2026-05-13T22:25:22.027Z" |
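Once rows are exported, the verdict and ASN columns from the schema above can drive a quick triage pass. The sketch below is hypothetical: the helper names, the threshold of 80, and the three sample rows are made up for illustration, and only the field names come from the schema.

```python
from collections import Counter
from typing import Any, Dict, Iterable, List


def high_confidence_hits(rows: Iterable[Dict[str, Any]], min_score: int = 80) -> List[Dict[str, Any]]:
    """Keep rows flagged malicious with a verdict score at or above min_score, highest first."""
    hits = [
        row for row in rows
        if row.get("verdicts_overall_malicious") is True
        and (row.get("verdict_score") or 0) >= min_score
    ]
    return sorted(hits, key=lambda r: r.get("verdict_score") or 0, reverse=True)


def rows_per_asn(rows: Iterable[Dict[str, Any]]) -> Counter:
    """Count rows per page_asn, grouping missing values under 'unknown'."""
    return Counter(row.get("page_asn") or "unknown" for row in rows)


# Made-up rows using field names from the schema table.
rows = [
    {"uuid": "scan-a", "page_asn": "AS139341", "verdict_score": 100, "verdicts_overall_malicious": True},
    {"uuid": "scan-b", "page_asn": "AS139341", "verdict_score": 40, "verdicts_overall_malicious": True},
    {"uuid": "scan-c", "page_asn": None, "verdict_score": 0, "verdicts_overall_malicious": False},
]
hits = high_confidence_hits(rows)
print([row["uuid"] for row in hits], rows_per_asn(rows).most_common(1))
```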
Why choose this Actor
| Capability | Details |
|---|---|
| Lucene-native search | Every urlscan search operator works as-is: `domain:`, `page.ip:`, `task.tags:`, `brand.name:`, `verdicts.overall.malicious:true`, date ranges, wildcards, boolean logic. |
| Public corpus | Searches the global pool of public scans contributed by the urlscan community and automated submitters. |
| Screenshot + report links | Every record points at the rendered PNG and the full urlscan report page for analyst review. |
| Brand & verdict attribution | Includes urlscan's own brand match, verdict score, and malicious flag where present. |
| Fast pagination | Server-side `search_after` cursor walks the full result set without timing out. |
| No API key required | Uses the public search endpoint. Plug it in and run. |
| Always fresh | Every run hits the live urlscan index. |
The urlscan.io public corpus is one of the most cited threat-intel data sources in modern SOC tooling, takedown vendor pipelines, and brand-protection products.
How it compares to alternatives
| Approach | Cost | Coverage | Refresh | Filters | Setup |
|---|---|---|---|---|---|
| urlscan.io Scraper (this Actor) | $5 free credit, then pay-per-use | Public urlscan corpus | Live per run | Full Lucene syntax | 2 min |
| urlscan PRO subscription | $200+/month per seat | Public + private | Live | Full Lucene | Vendor onboarding |
| Build your own integration | Engineering time | Same | Same | Same | Days |
| Commercial brand-protection suite | $$$ | Curated | Hourly | Vendor-defined | Weeks |
Pick this Actor when you want urlscan firepower without the seat licenses or the parser code.
How to use
- Sign up. Create a free account with $5 credit (takes 2 minutes).
- Open the Actor. Go to the urlscan.io Threat Intelligence Scraper page on the Apify Store.
- Set the query. Try `domain:yourbrand.com AND task.tags:phishing` and set `maxItems`.
- Run it. Click Start and let the Actor walk the search index.
- Download. Grab results in the Dataset tab as CSV, Excel, JSON, or XML.
Total time from signup to a phishing feed export: 3-5 minutes. No coding required.
Automating urlscan.io Scraper
Control the scraper programmatically for scheduled runs and pipeline integrations:
- Node.js. Install the `apify-client` NPM package.
- Python. Use the `apify-client` PyPI package.
- See the Apify API documentation for full details.
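A minimal Python sketch of a programmatic run with the `apify-client` package follows. `ACTOR_ID` is a placeholder, since the real identifier is not stated in this README and has to be copied from the Actor's page. The `APIFY_TOKEN` environment variable and the helper names are also assumptions. The input keys match the Input table.

```python
import os
from typing import Any, Dict, List

# Placeholder: replace with the Actor ID shown on the Actor's page in the Apify Store.
ACTOR_ID = "<username>/<actor-name>"


def build_run_input(query: str, max_items: int = 10, page_size: int = 100) -> Dict[str, Any]:
    """Validate and assemble the Actor input using the keys from the Input table."""
    if not query.strip():
        raise ValueError("query is required")
    if max_items < 1 or page_size < 1:
        raise ValueError("max_items and page_size must be positive")
    return {"query": query.strip(), "maxItems": max_items, "pageSize": page_size}


def run_actor(run_input: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Start a run, wait for it to finish, and return the dataset items."""
    # Imported lazily so the helpers above work without the package installed.
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(os.environ["APIFY_TOKEN"])  # assumed env var holding your API token
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    if run is None:
        raise RuntimeError("Actor run did not return a result")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


run_input = build_run_input("page.asn:AS139341 AND verdicts.overall.malicious:true", max_items=200)
print(run_input)
# rows = run_actor(run_input)  # uncomment once ACTOR_ID and APIFY_TOKEN are set
```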
The Apify Schedules feature lets you trigger this Actor on any cron interval. Hourly phishing sweeps, daily brand watches, and weekly ASN audits keep your downstream SIEM, takedown vendor, or Slack channel in sync.
Beyond business use cases
Threat intel data feeds far more than commercial SOCs. The same structured records support research, civic transparency, and personal security projects.
Frequently Asked Questions
How does it work?
Drop a Lucene query into the input form, click Start, and the Actor walks the urlscan.io public search API with a cursor-based pager. Each scan is flattened into 31 columns covering page, network, TLS, brand, and verdict data, plus links to the screenshot and the full report.
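The cursor-based paging described above can be sketched roughly as follows. This is not the Actor's source code. It assumes the public endpoint `https://urlscan.io/api/v1/search/` accepts `q`, `size`, and `search_after` parameters, that each result carries a `sort` array whose comma-joined values form the next cursor, and that the response may include a `has_more` flag. Verify these details against the urlscan.io API documentation.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Callable, Dict, Iterator, Optional

SEARCH_URL = "https://urlscan.io/api/v1/search/"  # assumed public search endpoint

FetchPage = Callable[[str, int, Optional[str]], Dict[str, Any]]


def fetch_page(query: str, size: int, search_after: Optional[str]) -> Dict[str, Any]:
    """Fetch one page of search results over HTTP (performs a network call)."""
    params = {"q": query, "size": str(size)}
    if search_after:
        params["search_after"] = search_after
    url = SEARCH_URL + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)


def iter_scans(query: str, max_items: int, page_size: int = 100,
               fetch: FetchPage = fetch_page) -> Iterator[Dict[str, Any]]:
    """Yield up to max_items scans, following the cursor from page to page."""
    cursor: Optional[str] = None
    yielded = 0
    while yielded < max_items:
        page = fetch(query, min(page_size, max_items - yielded), cursor)
        results = page.get("results", [])
        if not results:
            return
        for scan in results:
            yield scan
            yielded += 1
            if yielded >= max_items:
                return
        if page.get("has_more") is False:
            return
        # Assumed cursor format: the last result's sort values joined by commas.
        cursor = ",".join(str(value) for value in results[-1]["sort"])
```

Passing `fetch` as a parameter keeps the paging loop testable offline, for example with a stub that returns canned pages.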
What query syntax can I use?
Anything that works in urlscan's own search bar. Common fields: domain:, page.url:, page.domain:, page.ip:, page.asn:, page.country:, task.tags:, brand.name:, verdicts.overall.malicious:true, hash:, filename:, plus AND, OR, NOT, wildcards, and date ranges like date:>now-7d.
How accurate is the data?
Every field maps to a urlscan.io public API response. urlscan is widely cited across SOC and brand-protection tooling, though its tags and verdicts come from a mix of community submissions and automated heuristics. Treat verdicts as one input among several when making takedown decisions.
How fresh is the data?
Every run hits the live urlscan index, so results reflect scans submitted up to the moment the run started.
Do I need a urlscan API key?
No. This Actor uses the public search endpoint. For very high-volume use cases consider a urlscan PRO subscription on top of this Actor.
Can I schedule daily phishing sweeps?
Yes. Use Apify Schedules to trigger the Actor on any cron interval and pipe results into Slack, email, a webhook, or your warehouse.
Are screenshots included?
Yes. Every record includes a public screenshot URL and the full urlscan report URL.
Is this data legal to use?
urlscan.io publishes scan results publicly. Use the data in line with urlscan's terms and your local regulations. For takedowns and legal filings, follow standard evidence-handling practices.
Do I need a paid Apify plan?
No. The free plan covers small runs (10 records). A paid plan unlocks higher limits, scheduling, and concurrency.
What if I need help?
Reach out via the contact form below to request a custom intel pipeline, a private workflow, or a feature.
Integrate with any app
urlscan.io Threat Intelligence Scraper connects to any cloud service via Apify integrations:
- Make - Automate multi-step phishing workflows
- Zapier - Connect with 5,000+ apps
- Slack - Get phishing alerts in your channels
- Airbyte - Pipe scans into your warehouse
- GitHub - Trigger runs from commits or issues
- Google Drive - Export datasets straight to Sheets
You can also use webhooks to trigger downstream actions when a run finishes. Push new phishing scans into your takedown queue or alert your SOC in Slack.
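A webhook receiver needs very little logic to find the finished run's dataset. The sketch below assumes Apify's default webhook payload, with an `eventType` string and a `resource` object that contains the run's `defaultDatasetId`. Adjust it if you use a custom payload template. The helper names are hypothetical, and the dataset items URL requires an API token for non-public datasets.

```python
from typing import Any, Dict, Optional


def dataset_id_from_webhook(payload: Dict[str, Any]) -> Optional[str]:
    """Return the run's default dataset ID for successful runs, otherwise None."""
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    resource = payload.get("resource") or {}
    return resource.get("defaultDatasetId")


def items_url(dataset_id: str, fmt: str = "json") -> str:
    """Build the Apify API URL for downloading dataset items in the given format."""
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?format={fmt}&clean=true"


# Made-up payload in the assumed default shape; the IDs are placeholders.
payload = {
    "eventType": "ACTOR.RUN.SUCCEEDED",
    "eventData": {"actorId": "ACTOR_ID_PLACEHOLDER", "actorRunId": "RUN_ID_PLACEHOLDER"},
    "resource": {"status": "SUCCEEDED", "defaultDatasetId": "DATASET_ID_PLACEHOLDER"},
}
dataset_id = dataset_id_from_webhook(payload)
if dataset_id:
    print(items_url(dataset_id, fmt="csv"))
```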
Recommended Actors
- RDAP Domain Lookup Scraper - Modern WHOIS replacement via the RDAP protocol
- GSA eLibrary Scraper - U.S. federal contract vendor and price data
- Hubspot Marketplace Scraper - Marketplace app and integration catalog
- PR Newswire Scraper - Press release feed with publish dates
- Hugging Face Model Scraper - AI model registry metadata
Pro Tip: browse the complete ParseForge collection for more reference-data and intel scrapers.
Need Help? Open our contact form to request a new scraper, propose a custom data project, or report an issue.
Disclaimer: this Actor is an independent tool and is not affiliated with, endorsed by, or sponsored by urlscan.io GmbH or any of its partners. All trademarks mentioned are the property of their respective owners. Only publicly available scan data from the urlscan.io public search API is collected.