Pricing

from $0.50 / full recon scan

Automated reconnaissance actor for bug bounty hunters

This Apify Actor automates bug bounty recon by querying the Wayback Machine and GitHub for legacy attack surface. It collects historical URLs, public code, and deprecated files, then parses them to uncover hidden subdomains and forgotten API endpoints. Findings are deduplicated and saved as structured JSON in the run's default dataset.

Developer

Zaher el siddik

Actor stats

Last modified

3 months ago

Bug Bounty Recon Actor

Automated reconnaissance for bug bounty hunters. Discovers hidden API endpoints, subdomains, interesting files, and potential secrets from Wayback Machine archives and GitHub code search.

What It Does

Wayback Machine CDX API — Retrieves thousands of historical URLs for a target domain, extracting subdomains, API endpoints, and interesting files (.env, .json, .sql, .bak, etc.)
GitHub Code Search — Finds code referencing the target domain across all public repos, searching for API keys, passwords, tokens, configs, and endpoints
Deep Content Analysis — Fetches archived pages and raw GitHub files, parsing them with 30+ regex patterns for hidden endpoints and 15 secret detectors
Structured Output — Deduplicates and categorizes all findings into a clean JSON dataset
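
The Wayback step can be sketched roughly as follows. This is a minimal illustration, not the Actor's actual code: the helper names `build_cdx_url` and `extract_subdomains` are hypothetical, and the CDX parameters chosen here (`matchType=domain`, `collapse=urlkey`, `fl=original`) are an assumption about one reasonable way to query the public CDX server.

```python
from urllib.parse import urlencode, urlsplit

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_url(target_domain: str, limit: int = 5000) -> str:
    """Build a CDX query for every archived URL under the target domain."""
    params = {
        "url": target_domain,
        "matchType": "domain",   # include subdomains of the target
        "output": "json",
        "fl": "original",        # return only the original URL field
        "collapse": "urlkey",    # collapse repeated captures of the same URL
        "limit": limit,          # plays the role of waybackLimit
    }
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


def extract_subdomains(urls, target_domain: str) -> list:
    """Return the sorted, deduplicated subdomains seen in a list of URLs."""
    suffix = "." + target_domain.lower()
    found = set()
    for url in urls:
        host = (urlsplit(url).hostname or "").lower()
        if host.endswith(suffix):
            found.add(host)
    return sorted(found)
```

Matching on the `.example.com` suffix rather than on the bare string keeps look-alike hosts such as `notexample.com` out of the subdomain list.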

Input

Parameter	Type	Default	Description
`targetDomain`	string	required	Root domain to recon (e.g. `example.com`)
`githubToken`	string	—	GitHub PAT for authenticated code search (prevents rate limits)
`waybackLimit`	integer	5000	Max historical URLs from Wayback CDX
`fetchContent`	boolean	true	Fetch & parse archived pages for endpoints/secrets
`contentFetchLimit`	integer	200	Max pages to fetch for deep analysis
`githubSearchPages`	integer	5	GitHub search result pages (30 results/page)
`debug`	boolean	false	Enable verbose logging
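
As a sketch of how these parameters fit together, the snippet below assembles an input object using the defaults from the table. The `build_input` helper and its checks are hypothetical and only illustrate the shape of the input; the Actor's own validation may differ.

```python
import json

# Defaults copied from the parameter table above.
DEFAULTS = {
    "waybackLimit": 5000,
    "fetchContent": True,
    "contentFetchLimit": 200,
    "githubSearchPages": 5,
    "debug": False,
}


def build_input(target_domain: str, **overrides) -> dict:
    """Merge user overrides into the documented defaults (hypothetical helper)."""
    if not target_domain:
        raise ValueError("targetDomain is required")
    unknown = set(overrides) - set(DEFAULTS) - {"githubToken"}
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    return {"targetDomain": target_domain, **DEFAULTS, **overrides}


if __name__ == "__main__":
    # A reduced-limit input for a first, quick run.
    quick = build_input("example.com", waybackLimit=500, contentFetchLimit=50)
    print(json.dumps(quick, indent=2))
```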

Output

Results are pushed to the default dataset as structured JSON:

{
  "targetDomain": "example.com",
  "scanTimestamp": "2024-01-15T12:00:00.000Z",
  "totalSubdomains": 15,
  "totalEndpoints": 42,
  "totalGithubFindings": 87,
  "totalWaybackUrls": 5000,
  "totalInterestingFiles": 290,
  "totalPotentialSecrets": 3,
  "subdomains": ["api.example.com", "staging.example.com", ...],
  "endpoints": [
    { "path": "/api/v2/users", "category": "Versioned API" },
    { "path": "/graphql", "category": "GraphQL" },
    { "path": "/admin/config", "category": "Admin" },
    { "path": "/auth/oauth/callback", "category": "Authentication" }
  ],
  "endpointsByCategory": { "API": [...], "Authentication": [...], ... },
  "interestingFiles": [
    { "url": "https://example.com/.env", "extension": ".env", ... }
  ],
  "githubFindings": [
    { "repository": "org/repo", "file": "config.js", "url": "..." }
  ],
  "potentialSecrets": [
    { "type": "AWS Access Key", "value": "AKIA12345678***REDACTED***", "fullLength": 20 }
  ]
}
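
Once a dataset item has been loaded as a Python dict, a short summary like the one below can help with triage. This is an illustrative sketch: `summarize` is a hypothetical helper, and it assumes only the field names shown in the sample output above.

```python
from collections import Counter


def summarize(item: dict) -> dict:
    """Count endpoints per category, files per extension, and list secret types."""
    endpoints = item.get("endpoints", [])
    files = item.get("interestingFiles", [])
    secrets = item.get("potentialSecrets", [])
    return {
        "endpointsPerCategory": dict(Counter(e["category"] for e in endpoints)),
        "filesPerExtension": dict(Counter(f["extension"] for f in files)),
        "secretTypes": sorted({s["type"] for s in secrets}),
    }
```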

Endpoint Categories

Discovered endpoints are auto-categorized:

API — /api/* paths
Versioned API — /v1/*, /v2/* paths
GraphQL — /graphql endpoints
Authentication — /auth/*, /oauth/*, /login/*
Admin — /admin/* paths
Internal — /internal/*, /private/*, /debug/*
Webhook — /webhook/*, /webhooks/*
API Documentation — /swagger*, /openapi*
Sensitive File — .json, .env, .sql, .bak, .config, etc.
File/Upload — /upload/*, /media/*
Payment — /pay/*, /billing/*, /subscription/*
User Management — /user/*, /account/*, /profile/*
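
A rule-based categorizer along these lines could look like the sketch below. The regular expressions, their ordering, and the `Other` fallback are assumptions made for illustration; the Actor's real patterns are not published here. More specific rules are listed first so that `/api/v2/users` lands in Versioned API rather than API, as in the sample output.

```python
import re

# Ordered (category, pattern) pairs; the first match wins.
RULES = [
    ("GraphQL", re.compile(r"/graphql\b", re.I)),
    ("Versioned API", re.compile(r"/v\d+(/|$)", re.I)),
    ("API Documentation", re.compile(r"/(swagger|openapi)", re.I)),
    ("Authentication", re.compile(r"/(auth|oauth|login)(/|$)", re.I)),
    ("Admin", re.compile(r"/admin(/|$)", re.I)),
    ("Internal", re.compile(r"/(internal|private|debug)(/|$)", re.I)),
    ("Webhook", re.compile(r"/webhooks?(/|$)", re.I)),
    ("Payment", re.compile(r"/(pay|billing|subscription)(/|$)", re.I)),
    ("User Management", re.compile(r"/(user|account|profile)(/|$)", re.I)),
    ("File/Upload", re.compile(r"/(upload|media)(/|$)", re.I)),
    ("Sensitive File", re.compile(r"\.(json|env|sql|bak|config)$", re.I)),
    ("API", re.compile(r"/api(/|$)", re.I)),
]


def categorize(path: str) -> str:
    """Return the first matching category, or 'Other' (a hypothetical fallback)."""
    for category, pattern in RULES:
        if pattern.search(path):
            return category
    return "Other"
```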

Secret Detection

Scans fetched content for 15 secret patterns:

AWS Access Keys & Secret Keys
Generic API Keys, Secrets, Passwords
Bearer Tokens, JWTs
GitHub, Slack, Google, Stripe, SendGrid, Twilio, Mailgun tokens
Private Keys (RSA, EC, DSA)
Heroku API Keys

All detected values are redacted in the output: only the first 12 characters and the secret type are shown, along with the length of the original value.
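
The redaction rule can be sketched as below, using AWS access key IDs as the single example detector. The pattern `AKIA` followed by 16 uppercase alphanumerics is the commonly used format for these IDs; the helper names are hypothetical, and the Actor's own detectors may differ.

```python
import re

# Commonly used pattern for AWS access key IDs (20 characters in total).
AWS_ACCESS_KEY = re.compile(r"\bAKIA[0-9A-Z]{16}\b")


def redact(value: str, visible: int = 12) -> str:
    """Keep the first `visible` characters and mask the rest."""
    return value[:visible] + "***REDACTED***"


def find_aws_keys(text: str) -> list:
    """Return redacted findings in the same shape as `potentialSecrets`."""
    return [
        {
            "type": "AWS Access Key",
            "value": redact(match.group(0)),
            "fullLength": len(match.group(0)),
        }
        for match in AWS_ACCESS_KEY.finditer(text)
    ]
```

Keeping `fullLength` alongside the masked value lets a reviewer judge whether a match has a plausible length without exposing the secret itself.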

GitHub Token

A GitHub Personal Access Token is recommended for the GitHub code search phase. Without it, GitHub's unauthenticated rate limit (10 req/min) will be hit quickly.

Generate one at https://github.com/settings/tokens — no special scopes needed (public repo access is sufficient).
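
For illustration, an authenticated request to GitHub's code search endpoint can be built as shown below. This is a sketch under stated assumptions: `build_code_search_request` is a hypothetical helper, the quoted-domain query is only one possible search term, and the Actor may construct its requests differently. Nothing is sent over the network here.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request


def build_code_search_request(
    target_domain: str, page: int = 1, token: Optional[str] = None
) -> Request:
    """Prepare (but do not send) a GitHub code search request."""
    query = urlencode({
        "q": f'"{target_domain}"',  # assumed query: exact domain string
        "per_page": 30,             # 30 results per page, as in the Input table
        "page": page,
    })
    headers = {
        "Accept": "application/vnd.github+json",
        "User-Agent": "recon-sketch",  # hypothetical client name
    }
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return Request(f"https://api.github.com/search/code?{query}", headers=headers)
```

Passing the resulting object to `urllib.request.urlopen` would perform the search; the token is attached only when one is supplied.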

Tips

Start small — Use waybackLimit: 500 and contentFetchLimit: 50 for a quick scan
Scale up — Increase limits for thorough recon on high-value targets
GitHub token — Highly recommended for meaningful GitHub results
Content fetch — Some domains block Wayback Machine serving; set fetchContent: false if the fetch success rate is very low
Schedule runs — Run weekly to catch new archived content

Cost

Typical run costs on Apify:

Quick scan (500 URLs, 50 content fetches): ~$0.01, 30-60 seconds
Full scan (5000 URLs, 200 content fetches): ~$0.03, 2-5 minutes
Deep scan (50000 URLs, 2000 content fetches): ~$0.15, 15-30 minutes

Limitations

Wayback Machine CDX only returns URLs that were previously archived — not all subdomains/pages will be present
Content fetch success depends on the target; some sites block Wayback from serving archived content
GitHub code search is limited to indexed public repositories
Secret detection uses pattern matching and may produce false positives on placeholder/example values
The Actor filters common vulnerability scanner payloads from Wayback results but some noise may remain
