WikiGrabber

WikiGrabber is an Apify Actor and lightweight web app for finding Wikipedia pages with citation-needed tags, dead-link templates, broken-link signals, and other source-cleanup hints.

What it does

  • Searches English Wikipedia by keyword
  • Parses page wikitext and rendered HTML
  • Detects citation-needed, dead-link, and cleanup-style signals
  • Extracts exact citation and dead-link locations from article sections
  • Adds direct article, section, and section-edit links for faster action
  • Scores results so higher-opportunity pages rise to the top
  • Stores filtered results in an Apify dataset
  • Lets you browse results in the built-in browser UI
  • Exports saved results as CSV

Endpoints

  • GET / serves the browser UI
  • GET /api/health returns a simple health check
  • GET /api/search?keyword=SEO&limit=30&page=1 runs a keyword search and creates a request-safe dataset
  • GET /api/dataset?dataset=<datasetName-from-search>&page=2&limit=20 pages through saved dataset results
  • GET /api/export.csv?dataset=<datasetName-from-search> exports a dataset as CSV (see the example flow after this list)
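
A typical end-to-end flow, assuming the app is running locally on the default port; DATASET_NAME is a placeholder for the dataset name returned by the search call:

curl "http://localhost:4321/api/health"
curl "http://localhost:4321/api/search?keyword=SEO&limit=30&page=1"
curl "http://localhost:4321/api/dataset?dataset=DATASET_NAME&page=2&limit=20"
curl -o results.csv "http://localhost:4321/api/export.csv?dataset=DATASET_NAME"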

Advanced result workflow

  • Filter result pages by Show all, Missing Citations, or Dead Links
  • See exact issue rows with section title, line reference, and excerpt
  • Open the exact Wikipedia section directly from the result card
  • Jump straight into action=edit&section=<n> links to add a citation or replace a dead link (example below)
  • Review mixed pages that contain both citation and dead-link opportunities
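
For instance, a row pointing at a section of the article Link building could carry an edit link in this shape (the article and section index here are illustrative):

https://en.wikipedia.org/w/index.php?title=Link_building&action=edit&section=4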

Local development

npm install
npm start

By default the app starts on http://localhost:4321.

For a local one-off QA run that follows the same standard-run code path as Apify's automated test, put an INPUT.json file under your chosen CRAWLEE_STORAGE_DIR, then start the actor with WIKI_GRABBER_FORCE_STANDARD_MODE=1.
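
A minimal sketch of that setup, assuming the Apify SDK's usual storage layout where run input is read from key_value_stores/default/INPUT.json inside the storage directory:

mkdir -p ./storage/key_value_stores/default
echo '{"keyword":"seo tool","limit":10}' > ./storage/key_value_stores/default/INPUT.json
CRAWLEE_STORAGE_DIR=./storage WIKI_GRABBER_FORCE_STANDARD_MODE=1 npm start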

Deploy on Apify

npx apify login
npx apify push
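
After pushing, a standard run can also be triggered through the generic Apify API run-sync endpoint; USERNAME~wiki-grabber and YOUR_APIFY_TOKEN below are placeholders for your own actor ID and API token:

curl -X POST \
  "https://api.apify.com/v2/acts/USERNAME~wiki-grabber/run-sync-get-dataset-items?token=YOUR_APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"keyword":"seo tool","limit":10}'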

Important note about Apify run modes

This project supports both Apify run modes, but they behave differently:

  • Standard Actor run: the Actor does not keep the HTTP server alive on Apify. Instead, it treats the run as a one-off batch job. If you provide input like {"keyword":"seo tool","limit":10}, it builds the dataset, saves output, and finishes with SUCCEEDED. If a standard run starts without a keyword, the actor falls back to the built-in QA keyword seo tool, so automated tests and manual one-off runs still produce a non-empty default dataset.
  • Standby mode: the Actor behaves like a web server behind a stable URL, and Apify keeps standby runs available according to the standby configuration.

If you want a persistent app-like experience, use Standby mode instead of manually starting a normal Actor run from the Console.

The input schema now uses both prefill and default on the search keyword for maximum compatibility with Apify's QA flow, while operational settings such as limit keep a real default value for API, task, and scheduler runs.
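
A sketch of the corresponding INPUT_SCHEMA.json fragment; the titles and editor value are illustrative, while the prefill/default pairing on keyword and the plain default on limit match the behavior described above:

{
  "title": "WikiGrabber input",
  "type": "object",
  "schemaVersion": 1,
  "properties": {
    "keyword": {
      "title": "Search keyword",
      "type": "string",
      "editor": "textfield",
      "prefill": "seo tool",
      "default": "seo tool"
    },
    "limit": {
      "title": "Result limit",
      "type": "integer",
      "default": 10
    }
  }
}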

Apify QA checklist

  • In Apify Console, use Source > Input > Restore example input and confirm it fills keyword: "seo tool" with limit: 10
  • Start the Actor from that restored example input and verify the run finishes within Apify's 5-minute automated-test window
  • Confirm the default dataset is non-empty and that fallback rows, when emitted, are clearly marked with resultType: "fallback"
  • If Wikipedia is temporarily unavailable during the test window, expect a successful run with a diagnostic fallback row instead of an empty default dataset (sample row below)
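
The exact shape of a fallback row is implementation-defined; a plausible example, using fields from the Output fields list below:

{
  "resultType": "fallback",
  "keyword": "seo tool",
  "title": "WikiGrabber diagnostic fallback",
  "note": "Wikipedia API was unreachable during the run; no live results were fetched."
}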

Standby behavior

  • Repeated identical searches can be served from an in-memory cache while a Standby run stays warm
  • Concurrent identical requests share the same in-flight search work instead of duplicating Wikipedia fetches
  • Each generated dataset name is request-safe, so one user search does not drop or overwrite another user's dataset
  • Add refresh=true to /api/search if you want to bypass the cache and force a new dataset build (example after this list)
  • Wikipedia API calls automatically retry on transient timeout and 429/5xx responses, and large revision batches fall back to smaller groups when needed
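
For example, to force a fresh dataset build from a warm Standby run (the host below is a placeholder for your Standby URL):

curl "https://YOUR-STANDBY-URL/api/search?keyword=SEO&limit=30&refresh=true"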

Example use cases

  • Wikipedia citation research
  • Dead-link replacement prospecting
  • Link-building opportunity discovery
  • SEO outreach research
  • Topic-based cleanup analysis
  • CSV export for campaign workflows

Output fields

Each result can include the fields below; a sample record follows the list:

  • resultType
  • keyword
  • title
  • note
  • pageid
  • url
  • snippet
  • wordcount
  • timestamp
  • citationNeededTemplates
  • deadLinkTemplates
  • brokenLinkSignals
  • cleanupTemplates
  • bareUrlCount
  • refCount
  • score
  • issueCounts
  • locations[]
  • actionLinks
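
A hedged sample record, assuming one result row per page; every value here is illustrative, and the non-fallback resultType value plus the exact shapes of issueCounts, locations[], and actionLinks are assumptions rather than guaranteed output:

{
  "resultType": "page",
  "keyword": "seo tool",
  "title": "Search engine optimization",
  "note": null,
  "pageid": 12345,
  "url": "https://en.wikipedia.org/wiki/Search_engine_optimization",
  "snippet": "Search engine optimization (SEO) is the process of...",
  "wordcount": 5400,
  "timestamp": "2024-05-01T12:00:00Z",
  "citationNeededTemplates": 3,
  "deadLinkTemplates": 1,
  "brokenLinkSignals": 2,
  "cleanupTemplates": 0,
  "bareUrlCount": 4,
  "refCount": 120,
  "score": 7.5,
  "issueCounts": { "citationNeeded": 3, "deadLinks": 1 },
  "locations": [
    {
      "section": "History",
      "line": 42,
      "type": "citation-needed",
      "excerpt": "...search engines grew in popularity[citation needed]..."
    }
  ],
  "actionLinks": {
    "article": "https://en.wikipedia.org/wiki/Search_engine_optimization",
    "section": "https://en.wikipedia.org/wiki/Search_engine_optimization#History",
    "sectionEdit": "https://en.wikipedia.org/w/index.php?title=Search_engine_optimization&action=edit&section=2"
  }
}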