Pricing

from $0.50 / 1,000 results

AI Research Radar — compliant feed of new AI papers and news

AI research feed of new ML papers and AI news from HuggingFace, Anthropic, Google, The Decoder — structured JSON, robots-compliant.

Developer

Connor Teskey

Last modified

a month ago

AI Research Radar

New AI papers, lab announcements, and AI news from five permitted sources, delivered as one structured, schedule-ready feed.

Built for AI newsletter writers, research agents, and trend dashboards. Instead of hand-maintaining a scraper per site, you run one actor and get the latest items from HuggingFace papers and blog, the Anthropic and Google AI newsrooms, and The Decoder as uniform JSON records — ready to rank, summarize, alert on, or pipe into a RAG index.

What you get

Field	Meaning
`title`	Paper, post, or article headline
`url`	Canonical link on the source site
`category`	`papers`, `blog`, `labs`, or `news` — set per source
`source`	Source domain, e.g. `huggingface.co`
`fetched_at`	UTC timestamp of the run (ISO 8601)
`extraction`	Extractor version tag (`selector_free_v1`)

Quick start

{
    "sources": [
        { "url": "https://huggingface.co/papers", "category": "papers" },
        { "url": "https://huggingface.co/blog", "category": "blog" },
        { "url": "https://www.anthropic.com/news", "category": "labs" }
    ],
    "maxItemsPerSource": 25
}

This returns up to 75 fresh items (25 per source), typically in under a minute. Omit `sources` entirely to use the full five-source default set, which adds the Google AI blog and The Decoder.
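As a rough sketch of how the input above could be assembled in code, the helper below builds the same JSON and computes the upper bound on returned items. The function names `build_run_input` and `max_items` are hypothetical, and the validation is an assumption about what a well-formed input looks like, not the actor's own checks.

```python
import json

def build_run_input(sources, max_items_per_source=25):
    """Build an input object shaped like the Quick start example.

    Assumption: every source is a dict with 'url' and 'category' keys.
    """
    for src in sources:
        if not isinstance(src, dict) or not {"url", "category"} <= src.keys():
            raise ValueError(f"each source needs 'url' and 'category': {src!r}")
    return {"sources": list(sources), "maxItemsPerSource": max_items_per_source}

def max_items(run_input):
    """Upper bound on returned items: number of sources times the per-source cap."""
    return len(run_input["sources"]) * run_input["maxItemsPerSource"]

run_input = build_run_input([
    {"url": "https://huggingface.co/papers", "category": "papers"},
    {"url": "https://huggingface.co/blog", "category": "blog"},
    {"url": "https://www.anthropic.com/news", "category": "labs"},
])
print(json.dumps(run_input, indent=4))
print(max_items(run_input))  # 3 sources x 25 per source
```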

Output example

{
    "category": "papers",
    "title": "Role-Agent: Bootstrapping LLM Agents via Dual-Role Evolution",
    "url": "https://huggingface.co/papers/2606.10917",
    "source": "huggingface.co",
    "fetched_at": "2026-06-10T14:12:08.421337+00:00",
    "extraction": "selector_free_v1"
}
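A minimal sketch of loading one such record into a typed object, assuming the six fields shown above are always present. The `RadarItem` class and `parse_item` function are illustrative names, not part of the actor.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class RadarItem:
    category: str
    title: str
    url: str
    source: str
    fetched_at: datetime  # run timestamp, not the publish date
    extraction: str

def parse_item(record: dict) -> RadarItem:
    """Convert one dataset record into a RadarItem, parsing the ISO 8601 timestamp."""
    return RadarItem(
        category=record["category"],
        title=record["title"],
        url=record["url"],
        source=record["source"],
        fetched_at=datetime.fromisoformat(record["fetched_at"]),
        extraction=record["extraction"],
    )

item = parse_item({
    "category": "papers",
    "title": "Role-Agent: Bootstrapping LLM Agents via Dual-Role Evolution",
    "url": "https://huggingface.co/papers/2606.10917",
    "source": "huggingface.co",
    "fetched_at": "2026-06-10T14:12:08.421337+00:00",
    "extraction": "selector_free_v1",
})
print(item.fetched_at.isoformat())
```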

Why this one

Selector-free extraction. Titles are pulled by link-text shape and URL structure rather than page-specific CSS selectors, so the site redesigns that break conventional scrapers do not break this one.
Layout drift is flagged, never hidden. A source that suddenly yields zero items is marked `zero_yield_check_layout` in the HEALTH report instead of quietly shrinking your feed.
Papers, labs, and press in one schema. The five default sources cover research papers, official lab announcements, and AI journalism, each record tagged with its category.
Bring your own sources. Pass any list of `{url, category}` pages; the same robots check, retry logic, and extraction apply to every source you add.
Fresh by design. Each run is a live snapshot of the source pages — schedule it hourly or daily and the radar stays current.
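To make the selector-free idea concrete, here is a simplified sketch that keeps only links whose text looks like a headline and whose target stays on the source's domain. It is not the actor's code: the thresholds come from the limits stated in this README (at least 4 words and 24 characters, same-domain links only), the class name `HeadlineLinkParser` is hypothetical, and the URL-structure checks the actor also applies are not modeled.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

MIN_WORDS = 4   # minimum words in the link text
MIN_CHARS = 24  # minimum characters in the link text

class HeadlineLinkParser(HTMLParser):
    """Collect same-domain links whose text is headline-shaped."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.base_host = urlparse(base_url).netloc
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag != "a" or self._href is None:
            return
        title = " ".join("".join(self._text).split())  # collapse whitespace
        url = urljoin(self.base_url, self._href)
        headline_shaped = len(title.split()) >= MIN_WORDS and len(title) >= MIN_CHARS
        if headline_shaped and urlparse(url).netloc == self.base_host:
            self.items.append({"title": title, "url": url})
        self._href = None

html = """
<a href="/papers/2606.10917">Role-Agent: Bootstrapping LLM Agents via Dual-Role Evolution</a>
<a href="/login">Sign in</a>
<a href="https://example.com/post">A long headline that lives on another domain</a>
"""
parser = HeadlineLinkParser("https://huggingface.co/papers")
parser.feed(html)
print(parser.items)
```

Because nothing here depends on class names or page layout, a redesign that keeps headline links in the HTML would leave the result unchanged, while a page rendered entirely by JavaScript would yield an empty list.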

Compliance and reliability

Topsail actors are built compliance-first and ship with self-healing plumbing:

robots.txt is always respected — fail-closed. If a robots check cannot complete, the source is skipped, never scraped. There is no input to turn this off.
Sources are public listing and newsroom pages — HuggingFace papers and blog, Anthropic news, the Google AI blog, and The Decoder — pages these publishers serve openly to every visitor, with no account, paywall, or personal data involved.
Transient failures retry once with backoff; persistent failures are reported, not hidden.
Every run writes a per-source HEALTH report to the key-value store, so you can see exactly which sources delivered and which were blocked, empty, or erroring.
No PII, no paywalled or login-gated content, no circumvention.
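The fail-closed robots rule in the list above can be sketched with Python's standard `urllib.robotparser`. This is an illustration under assumptions, not the actor's implementation: `allowed_fail_closed`, the `fetch_robots_txt` callable, and the user-agent string are all hypothetical, and the fetcher is injected so the policy can be exercised without network access.

```python
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser

USER_AGENT = "example-radar-bot"  # hypothetical user agent

def allowed_fail_closed(url, fetch_robots_txt, user_agent=USER_AGENT):
    """Return True only if robots.txt was read successfully and permits the URL.

    fetch_robots_txt is any callable that takes the robots.txt URL and
    returns its text, raising an exception on failure.
    """
    parts = urlparse(url)
    robots_url = f"{parts.scheme}://{parts.netloc}/robots.txt"
    try:
        body = fetch_robots_txt(robots_url)
    except Exception:
        return False  # fail closed: an incomplete check means the source is skipped
    parser = RobotFileParser()
    parser.parse(body.splitlines())
    return parser.can_fetch(user_agent, url)

# Stub fetchers standing in for a real HTTP call.
ROBOTS = "User-agent: *\nDisallow: /private/\n"

def ok_fetcher(robots_url):
    return ROBOTS

def failing_fetcher(robots_url):
    raise TimeoutError("robots.txt could not be fetched")

print(allowed_fail_closed("https://example.org/news", ok_fetcher))
print(allowed_fail_closed("https://example.org/private/x", ok_fetcher))
print(allowed_fail_closed("https://example.org/news", failing_fetcher))
```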

Pricing

Pay per result: $0.50 per 1,000 dataset items — one item is one paper, post, or article. Sources that come back robots-blocked, erroring, or empty add nothing to the dataset and cost nothing — you pay only for delivered records. A typical default run of around 100 items costs about $0.05.
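The arithmetic behind that estimate, as a small sketch. `run_cost` is an illustrative helper, and the item counts passed to it are example values rather than measured run sizes.

```python
from decimal import Decimal

PRICE_PER_1000_ITEMS = Decimal("0.50")  # USD, from the pricing stated above

def run_cost(delivered_items: int) -> Decimal:
    """Cost in USD for a run; only delivered dataset items are billed."""
    return PRICE_PER_1000_ITEMS * delivered_items / 1000

print(run_cost(100))  # the "around 100 items" default run
print(run_cost(75))   # the three-source Quick start at its 75-item cap
print(run_cost(0))    # every source blocked, empty, or erroring
```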

Honest limits

Titles and canonical links only: no abstracts, authors, publication dates, or article text. `fetched_at` is the run timestamp, not the publish date.
Extraction expects headline-shaped link text (at least 4 words and 24 characters), so very short titles can be missed and an occasional non-article link can slip through.
Only same-domain links are collected from each source page.
Pages that render their listings entirely with JavaScript yield zero items; the run flags them in HEALTH rather than failing.
No cross-run deduplication or diff detection; each run is a full snapshot. Dedupe by `url` downstream if you ingest continuously.
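Since each run is a full snapshot, downstream deduplication by `url` can be as simple as the sketch below. The `new_items` helper and the in-memory `seen_urls` set are assumptions; a continuous pipeline would persist the seen URLs between runs.

```python
def new_items(items, seen_urls):
    """Return items whose url has not been seen before, updating seen_urls in place."""
    fresh = []
    for item in items:
        url = item["url"]
        if url not in seen_urls:
            seen_urls.add(url)
            fresh.append(item)
    return fresh

seen = set()
run_1 = [
    {"title": "First headline from the feed", "url": "https://huggingface.co/papers/a"},
    {"title": "Second headline from the feed", "url": "https://huggingface.co/papers/b"},
]
run_2 = [
    {"title": "Second headline from the feed", "url": "https://huggingface.co/papers/b"},
    {"title": "Third headline from the feed", "url": "https://huggingface.co/papers/c"},
]
first_batch = new_items(run_1, seen)
second_batch = new_items(run_2, seen)
print([item["url"] for item in second_batch])
```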

FAQ

Can I use this as an ML papers API? Yes. Trigger runs on a schedule through the Apify API and read the dataset as JSON or CSV — a lightweight ML papers API without maintaining your own scraper.
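As a sketch of the "read the dataset as JSON or CSV" step, the helper below builds the URL for Apify's dataset items endpoint. `DATASET_ID` is a placeholder for the ID a finished run reports, `dataset_items_url` is a hypothetical helper, and authentication (an API token for non-public datasets) is left out.

```python
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"

def dataset_items_url(dataset_id, fmt="json", limit=None):
    """Build the URL that returns a dataset's items in the requested format."""
    if fmt not in {"json", "csv"}:
        raise ValueError("this sketch only handles json and csv")
    params = {"format": fmt}
    if limit is not None:
        params["limit"] = limit
    return f"{API_BASE}/datasets/{dataset_id}/items?{urlencode(params)}"

# DATASET_ID is a placeholder, not a real dataset.
print(dataset_items_url("DATASET_ID"))
print(dataset_items_url("DATASET_ID", fmt="csv", limit=100))
```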

How fresh is the AI research feed? Each run is a live snapshot of the source pages at run time. Schedule the actor hourly or daily to keep an always-current AI news feed.

Can I add my own sources? Yes. The `sources` input accepts any list of `{url, category}` pages. The robots check and selector-free extraction apply to every source you add; blog-style listing pages work best.

Does it return abstracts or full article text? No — titles and canonical links only. Pair it with Topsail's Site to Markdown actor when you need full LLM-ready page content.

What happens when a source site redesigns? Usually nothing: extraction keys on link-text shape and URL structure, not page-specific selectors. If a source still drops to zero items, the run flags it as `zero_yield_check_layout` in the HEALTH report.

More compliant data feeds from Topsail

Site to Markdown — any site to clean LLM-ready markdown
GTA 6 Countdown & Developments Tracker — countdown, confirmed facts, diffed developments, market odds
Commodity Intel — oil, gold, uranium headlines from permitted sources
Crypto News — BTC/ETH/DeFi headlines from major outlets
