Pricing

Pay per event

Media Bias/Fact Check Source Credibility Scraper

Pull structured source-credibility records from Media Bias/Fact Check (MBFC), the largest media-source reliability database (7,000+ profiles). Each record returns the bias rating, factual-reporting tier, MBFC credibility rating, country press-freedom rating, media type, traffic level, and the full History / Funding / Analysis prose.

Developer

BowTiedRaccoon

Last modified

12 days ago

Use Cases

Fact-checking pipelines — source-level trust layer alongside claim-level fact-checkers (PolitiFact, Snopes)
Disinformation research — identify and filter conspiracy-pseudoscience / questionable sources
Brand safety — screen media sources before advertising placement
RAG source-filtering — weight or exclude sources by credibility rating before ingestion
OSINT / media-literacy — annotate article URLs with source bias and credibility metadata

Input

Field	Type	Default	Description
`mode`	string (required)	`all`	`all` = full corpus; `category` = selected bias categories; `seed` = explicit profile URLs
`categories`	string[]	—	Bias categories when `mode=category`. Options: `center`, `left-center`, `left`, `right-center`, `right`, `pro-science`, `conspiracy-pseudoscience`, `questionable`, `satire`
`seedUrls`	string[]	—	Explicit MBFC profile URLs when `mode=seed`
`maxItems`	integer	`200`	Maximum source profiles to return (0 = unlimited)
`includeBody`	boolean	`true`	Include History / Funding / Analysis prose sections
`proxyConfiguration`	object	no proxy	Optional proxy configuration. MBFC does not require a proxy when requests use a plain User-Agent
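
The input fields above can be assembled and sanity-checked before a run is started. The following is a minimal sketch, not the actor's own validation: `build_run_input` is a hypothetical helper, and the rules that `mode=category` needs at least one category and `mode=seed` needs at least one URL are assumptions inferred from the table.

```python
from typing import Any, Dict

# Category slugs copied from the `categories` row of the input table.
VALID_CATEGORIES = {
    "center", "left-center", "left", "right-center", "right",
    "pro-science", "conspiracy-pseudoscience", "questionable", "satire",
}
VALID_MODES = {"all", "category", "seed"}


def build_run_input(mode: str = "all", **overrides: Any) -> Dict[str, Any]:
    """Hypothetical helper: apply overrides to the documented defaults, then validate."""
    run_input: Dict[str, Any] = {"mode": mode, "maxItems": 200, "includeBody": True}
    run_input.update(overrides)

    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode!r}")

    if mode == "category":
        categories = run_input.get("categories") or []
        unknown = set(categories) - VALID_CATEGORIES
        # Assumption: an empty or unrecognised category list is an error.
        if not categories or unknown:
            raise ValueError(f"invalid categories: {sorted(unknown) or 'none given'}")

    if mode == "seed" and not run_input.get("seedUrls"):
        # Assumption: seed mode needs at least one profile URL.
        raise ValueError("mode=seed requires seedUrls")

    max_items = run_input["maxItems"]
    if not isinstance(max_items, int) or max_items < 0:
        raise ValueError("maxItems must be a non-negative integer (0 = unlimited)")

    return run_input
```

For example, `build_run_input("category", categories=["satire"], maxItems=50)` yields a dictionary that could be passed as the run input, while `build_run_input("seed")` raises `ValueError`.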

Output

Each record contains:

{
  "sourceName": "247Sports",
  "sourceUrl": "https://mediabiasfactcheck.com/247sports-bias-and-credibility/",
  "sourceHomepage": "https://247sports.com",
  "biasRating": "center",
  "rawBiasRating": "LEAST BIASED",
  "factualReporting": "HIGH",
  "credibilityRating": "HIGH CREDIBILITY",
  "country": "United States",
  "countryFreedomRating": "MOSTLY FREE",
  "mediaType": "Website",
  "trafficPopularity": "High Traffic",
  "categoryIndex": "center",
  "history": "247Sports, established in 2010 by Shannon Terry...",
  "fundedByOwnership": "247Sports is owned by CBS Interactive...",
  "analysisBias": "247Sports focuses on sports news...",
  "lastUpdated": "April 16, 2024",
  "reviewedBy": "",
  "articleJsonLd": { "...": "..." },
  "bodyMarkdown": "## History\n\n...",
  "status": "success",
  "errorMsg": ""
}
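
Records in this shape can feed the RAG source-filtering use case directly. The sketch below builds a domain allow-list from a list of records. `trusted_domains` is a hypothetical helper, and the default thresholds (only `HIGH CREDIBILITY`, no `conspiracy-pseudoscience` or `questionable` sources) are illustrative choices, not recommendations from the actor.

```python
from typing import Any, Dict, Iterable
from urllib.parse import urlparse


def trusted_domains(
    records: Iterable[Dict[str, Any]],
    allowed_credibility=frozenset({"HIGH CREDIBILITY"}),
    blocked_bias=frozenset({"conspiracy-pseudoscience", "questionable"}),
) -> Dict[str, Dict[str, Any]]:
    """Map homepage host -> record for sources that pass the credibility filter."""
    trusted: Dict[str, Dict[str, Any]] = {}
    for rec in records:
        if rec.get("status") != "success":
            continue  # skip records the scraper flagged as failed
        if rec.get("credibilityRating") not in allowed_credibility:
            continue
        if rec.get("biasRating") in blocked_bias:
            continue
        host = urlparse(rec.get("sourceHomepage") or "").netloc.lower()
        if host.startswith("www."):
            host = host[4:]
        if host:
            trusted[host] = rec
    return trusted


# The first record mirrors the sample above; the second is made up for illustration.
sample = [
    {"sourceHomepage": "https://247sports.com", "biasRating": "center",
     "credibilityRating": "HIGH CREDIBILITY", "status": "success"},
    {"sourceHomepage": "https://www.example.invalid", "biasRating": "questionable",
     "credibilityRating": "HIGH CREDIBILITY", "status": "success"},
]
allow_list = trusted_domains(sample)
```

An ingestion step could then check each article URL's host against `allow_list` before indexing it.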

Bias Rating Normalization

MBFC Label	Normalized Slug
LEAST BIASED	`center`
LEFT-CENTER BIAS	`left-center`
LEFT BIAS	`left`
RIGHT-CENTER BIAS	`right-center`
RIGHT BIAS	`right`
PRO-SCIENCE	`pro-science`
CONSPIRACY-PSEUDOSCIENCE	`conspiracy-pseudoscience`
QUESTIONABLE SOURCE	`questionable`
SATIRE	`satire`
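
The table above amounts to a lookup from the raw MBFC label to a slug. A minimal sketch of that lookup follows; `normalize_bias` is a hypothetical name, and the case and whitespace folding is an assumption about how tolerant the matching should be.

```python
from typing import Optional

# Raw MBFC label -> normalized slug, copied from the table above.
BIAS_SLUGS = {
    "LEAST BIASED": "center",
    "LEFT-CENTER BIAS": "left-center",
    "LEFT BIAS": "left",
    "RIGHT-CENTER BIAS": "right-center",
    "RIGHT BIAS": "right",
    "PRO-SCIENCE": "pro-science",
    "CONSPIRACY-PSEUDOSCIENCE": "conspiracy-pseudoscience",
    "QUESTIONABLE SOURCE": "questionable",
    "SATIRE": "satire",
}


def normalize_bias(raw_label: str) -> Optional[str]:
    """Return the slug for a raw label, or None when the label is not in the table."""
    key = " ".join(raw_label.upper().split())  # fold case and collapse whitespace
    return BIAS_SLUGS.get(key)
```

Returning `None` for unknown labels keeps unexpected values visible to the caller instead of silently mapping them to a default.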

Predefined Dataset Views

Source Credibility Table — sourceName, sourceHomepage, biasRating, factualReporting, credibilityRating, country
Low-Credibility Sources — focused view for conspiracy-pseudoscience, questionable, and low-credibility sources

Architecture

Pure HTTP two-level hierarchical crawl using CoreCrawler. No browser, no proxy required.

Level 1 (category):  Walk each MBFC bias-category index page
                     (/center/ /leftcenter/ /left/ /right-center/ /right/
                      /pro-science/ /conspiracy/ /questionable/ /satire/)
                     → parse <table> of source profile links
                     → link text "Source Name (domain.com)" → extract sourceHomepage

Level 2 (profile):   Fetch each source profile page
                     → parse "Detailed Report" block for structured fields
                     → extract History / Funded by / Analysis sections
                     → extract JSON-LD Article metadata
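
The Level 1 step that turns link text of the form "Source Name (domain.com)" into a `sourceHomepage` value can be sketched with a regular expression. This is an illustration, not the actor's parser: `parse_link_text` is a hypothetical name, and prefixing the domain with `https://` is an assumption based on the sample output record.

```python
import re
from typing import Optional, Tuple

# Name, then a final parenthesised token that contains a dot and no spaces.
LINK_TEXT_RE = re.compile(r"^(?P<name>.*?)\s*\((?P<domain>[^()\s]+\.[^()\s]+)\)\s*$")


def parse_link_text(text: str) -> Tuple[str, Optional[str]]:
    """Split index-page link text into (source name, homepage URL or None)."""
    text = text.strip()
    match = LINK_TEXT_RE.match(text)
    if not match:
        return text, None  # no trailing "(domain)" part; keep the text as the name
    # Assumption: the homepage is the bare domain served over https.
    return match.group("name"), "https://" + match.group("domain")
```

Requiring a dot inside the parentheses keeps qualifiers such as "(UK)" in the name, so "Foo (UK) (foo.co.uk)" splits into the name "Foo (UK)" and the homepage `https://foo.co.uk`.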

Rate-limit handling: CoreCrawler detects 429 responses and backs off exponentially. MBFC enforces per-IP rate limits on aggressive crawlers; the actor uses polite concurrency (3 concurrent requests max).
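
The exponential backoff described above can be illustrated with a small retry loop. This is a generic sketch, not CoreCrawler's implementation: the base delay, growth factor, cap, and retry count are all assumed values, and `fetch` stands in for whatever function performs the HTTP request and returns `(status, body)`.

```python
import time
from typing import Callable, List, Tuple


def backoff_delays(base: float = 1.0, factor: float = 2.0,
                   max_delay: float = 60.0, retries: int = 5) -> List[float]:
    """Delays in seconds before each retry: base, base*factor, ... capped at max_delay."""
    return [min(base * factor ** attempt, max_delay) for attempt in range(retries)]


def fetch_with_backoff(fetch: Callable[[str], Tuple[int, str]], url: str,
                       sleep: Callable[[float], None] = time.sleep,
                       **backoff_options: float) -> Tuple[int, str]:
    """Call fetch(url); on HTTP 429, wait with growing delays and try again."""
    for delay in backoff_delays(**backoff_options):
        status, body = fetch(url)
        if status != 429:
            return status, body
        sleep(delay)  # rate-limited: back off before the next attempt
    return fetch(url)  # final attempt after the last wait
```

Passing `sleep` as a parameter makes the loop easy to exercise in tests without real waiting.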

Crawl Modes

mode=all (default): Walks all 9 bias-category index pages and fetches every source profile in the corpus (~7,000 profiles total). Suitable for full-archive downloads.

mode=category: Walks only the specified bias categories. Useful for targeted pulls (e.g., all conspiracy-pseudoscience sources for a disinfo pipeline).

mode=seed: Fetches only the explicitly provided MBFC profile URLs. Suitable for spot-lookups or updating specific source records.
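
How the three modes select starting URLs can be sketched as follows. This is an illustration under stated assumptions: `start_urls` is a hypothetical name, the base URL is taken from the sample `sourceUrl`, and the pairing of each category slug with an index path assumes the two lists in this document are in the same order.

```python
from typing import Any, Dict, List

BASE_URL = "https://mediabiasfactcheck.com"

# Assumed slug -> index path pairing (same order as the lists in this document).
CATEGORY_PATHS = {
    "center": "/center/",
    "left-center": "/leftcenter/",
    "left": "/left/",
    "right-center": "/right-center/",
    "right": "/right/",
    "pro-science": "/pro-science/",
    "conspiracy-pseudoscience": "/conspiracy/",
    "questionable": "/questionable/",
    "satire": "/satire/",
}


def start_urls(run_input: Dict[str, Any]) -> List[str]:
    """Return the URLs a crawl would begin from for the given input."""
    mode = run_input.get("mode", "all")
    if mode == "seed":
        return list(run_input.get("seedUrls", []))  # profile pages, no index walk
    if mode == "category":
        slugs = run_input.get("categories", [])
    elif mode == "all":
        slugs = list(CATEGORY_PATHS)
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return [BASE_URL + CATEGORY_PATHS[slug] for slug in slugs]
```

In `all` and `category` modes the result is a list of index pages to walk; in `seed` mode it is the profile URLs themselves.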

Performance

Default memory: 512 MB
Full corpus run: ~7,000 profiles at polite 1-3 req/s ≈ 2-4 hours
Category run (e.g., center ~500 profiles): ~15-30 minutes
Seed run (single URL): under 1 minute
