Dev.to Articles Scraper
Pricing
from $3.50 / 1,000 results
Scrape developer articles from Dev.to by tag or author: title, description, tags, reactions, comments and reading time. Schedule it to track trending tech content and topics.
Developer
Logiover
✍️ Dev.to Articles Scraper — Developer Blog Posts by Tag or Author to JSON/CSV

Bulk-scrape developer articles from Dev.to via the official public API — by tag, by author username, or the latest feed across all tags — fully paginated through Forem's public JSON. Title, description, URL, tags, author display name + username, reactions count, comments count, reading time, cover image and publish timestamp. No login, no Forem API key, no proxy. Export to JSON, CSV, Excel or XML.
Built for technical content marketers running newsletters, dev-relations teams monitoring tool adoption, content aggregators, technical journalists, devtool product teams tracking ecosystem conversation, and ML engineers building tech-writing corpora.
🟢 No Dev.to account. No API key. No proxy. Pure public REST.
🚀 Why this scraper
Dev.to (Forem) is one of the largest dedicated technical-writing platforms — thousands of new posts every week from indie developers, OSS maintainers, DevRel teams at major companies, ML researchers, frontend craftspeople, system designers, and self-taught builders sharing what they learn. The signal is dense and well-tagged: every post carries up to four tags (#javascript, #ai, #webdev, #python, #react, #rust, #devops, #career, #beginners, #tutorial...), reading time, engagement counts and a clear author identity.
Pulling Dev.to at scale yourself runs into:
- Knowing the Forem `/api/articles` query parameter conventions (`tag`, `username`, `top`, `state`, `page`, `per_page`)
- Threading full pagination across hundreds of pages
- Distinguishing "tag feed" from "username feed" from "latest feed"
- Flattening Forem's nested response into flat rows
- Handling occasional 429s with backoff
- Persisting output in a format your warehouse, BI tool or content pipeline can use
This Actor handles all of that. Set a tag, a username (or both, or neither), set the cap, hit run — get back a flat, structured, paginated dataset of every matching article with all the metadata you need.
✨ Key features
| Feature | What it gives you |
|---|---|
| 🔌 Official Dev.to / Forem API | Stable, well-documented, fully paginated — no fragile HTML parsing |
| 🏷️ Tag-based filtering | Pull articles for any topic: javascript, ai, webdev, python, react, rust, devops, career, beginners, tutorial, etc. |
| 👤 Author-based filtering | Collect every article from a specific Dev.to writer by username |
| ♾️ Full pagination | Walks the entire feed for your filter, not just page 1 |
| 📊 Rich metadata per article | 13 fields: title, description, URL, tags, author display name + username, reactions, comments, reading time, cover image, publish date |
| 📈 Engagement metrics | reactionsCount and commentsCount make it easy to rank by popularity |
| ⏱️ Reading-time included | Estimated reading time in minutes — useful for content curation and email digests |
| 🎯 Adjustable run size | maxArticles=0 pulls everything; cap to anything for faster runs |
| 🧱 Flat, export-ready schema | No nested JSON — drop straight into a spreadsheet or warehouse |
| 📦 All export formats | JSON, CSV, Excel, HTML, XML, JSONL via the Apify Dataset |
| 🔓 No auth, no proxy | Pure public-API access — no Dev.to account, no API key, no residential proxy |
| 🧰 Built-in Overview view | Pre-configured Apify Dataset view with the most-useful columns visible by default |
🎯 Built for these use cases
1. Tech content discovery & curation
What's being written about AI agents this week? About Rust on the backend? About Astro vs Next.js? Schedule the Actor with a tag filter for a continuously-fresh stream of new developer articles in your niche — feed your newsletter, your podcast research, your weekly internal dev digest.
2. Author network analysis
Pass a username — get every article that author has ever published, with tags and engagement metrics. Build maps of who writes about what, who's gaining traction, who's a credible voice in a given subdomain. Useful for DevRel outreach, podcast guest sourcing, technical-recruiting research and content partnerships.
3. Tag trend tracking
Pull weekly snapshots of the top tags in your space. Plot article counts per tag over time. See which technologies are heating up (rust vs go, nextjs vs remix, ollama vs vllm) months before consensus catches up.
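As a rough illustration of the counting step, the sketch below groups exported rows by ISO week and tag. It assumes rows shaped like this Actor's output (`publishedAt` as an ISO 8601 string, `tags` as a list); the function name `weekly_tag_counts` is hypothetical.

```python
from collections import Counter
from datetime import datetime
from typing import Any


def weekly_tag_counts(rows: list[dict[str, Any]]) -> Counter:
    """Count articles per (ISO week, tag) pair."""
    counts: Counter = Counter()
    for row in rows:
        # Replace a trailing "Z" so older Python versions can parse the timestamp.
        published = datetime.fromisoformat(row["publishedAt"].replace("Z", "+00:00"))
        year, week, _ = published.isocalendar()
        for tag in row.get("tags", []):
            counts[(f"{year}-W{week:02d}", tag)] += 1
    return counts
```

The resulting counter can be written out per run and plotted over time in whatever charting tool you already use.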
4. Newsletter & content aggregator pipelines
Daily run, top-reactions filter (sort/cap downstream), email-ready output. Power a "best of Dev.to" weekly newsletter, a Slack-channel digest, an internal "what should we read?" feed, or an SEO content-syndication pipeline.
5. Devtool marketing & DevRel monitoring
You sell a devtool (CI service, framework, observability tool, hosting platform). Monitor mentions of your product, your competitor and your category in Dev.to articles. Catch reviews early, partner with active writers, surface integration tutorials your community is asking for.
6. Technical writing benchmarking
Studying a topic for an upcoming book / course / docs site? Pull every Dev.to article on that tag. See what's been covered, what hasn't, what got engagement and what didn't — a research shortcut.
7. LLM / NLP training data
Dev.to articles are well-tagged, clearly attributed and topically diverse — great structure for fine-tuning a developer-assistant model on contemporary tech writing.
8. Competitive analysis & influencer tracking
Track every Dev.to article published by competitors and influential authors in your space. Engagement deltas, posting cadence, topic mix — all directly observable.
📥 Inputs
| Field | Type | Required | Description |
|---|---|---|---|
| tag | string | No | Tag filter, e.g. javascript, ai, webdev, python, react, rust, devops, career. Leave empty for the latest feed across all tags. |
| username | string | No | Filter to a specific author's articles by Dev.to username (e.g. ben, ali, florincornea). Combine with tag or leave standalone. |
| maxArticles | integer | No | Hard cap on rows. 0 = pull every available article for the filter. |
Example inputs
Latest articles across all tags:
{"tag": "","username": "","maxArticles": 200}
Every AI article on Dev.to:
{"tag": "ai","maxArticles": 0}
Every article by a specific author:
{"username": "ben","maxArticles": 0}
A specific author's posts on a specific tag:
{"tag": "javascript","username": "florincornea","maxArticles": 50}
📤 Output
One Apify dataset row per article. Sample:
{"id": 1234567,"title": "10 JavaScript Tricks You Should Know in 2026","description": "A practical roundup of modern JS techniques you can apply today.","url": "https://dev.to/janedev/10-javascript-tricks-you-should-know-2026","author": "Jane Developer","authorUsername": "janedev","tags": ["javascript", "webdev", "beginners", "tutorial"],"commentsCount": 24,"reactionsCount": 312,"readingTimeMinutes": 6,"coverImage": "https://res.cloudinary.com/practicaldev/.../cover.png","publishedAt": "2026-05-10T09:00:00Z","scrapedAt": "2026-05-16T10:00:00.000Z"}
Full field reference
| Field | Type | Meaning |
|---|---|---|
| id | number | Dev.to numeric article ID |
| title | string | Article title |
| description | string | Short description / summary |
| url | string | Canonical URL of the article on dev.to |
| author | string | Display name of the author |
| authorUsername | string | Dev.to username (use this for follow-up username queries) |
| tags | array | Tags attached to the article (up to 4 on Dev.to) |
| commentsCount | number | Number of comments on the article |
| reactionsCount | number | Total reactions (likes, unicorns, bookmarks combined) |
| readingTimeMinutes | number | Estimated reading time in minutes |
| coverImage | string | URL of the article's cover image |
| publishedAt | string | ISO 8601 publication timestamp |
| scrapedAt | string | ISO 8601 timestamp of the scrape |
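To make the schema concrete, here is one way a raw Forem article object could be flattened into a row with these 13 fields. It is a sketch, not the Actor's actual code: only `user.name`, `user.username` and `tag_list` are named in this README, and the other raw keys used here (`comments_count`, `public_reactions_count`, `reading_time_minutes`, `cover_image`, `published_at`) are assumptions about the Forem payload.

```python
from datetime import datetime, timezone
from typing import Any


def flatten_article(article: dict[str, Any], scraped_at: datetime) -> dict[str, Any]:
    """Map one nested Forem article object onto the flat output schema."""
    user = article.get("user") or {}
    tags = article.get("tag_list") or []
    if isinstance(tags, str):
        # Be tolerant of a comma-separated string instead of a list.
        tags = [t.strip() for t in tags.split(",") if t.strip()]
    stamp = scraped_at.astimezone(timezone.utc).isoformat(timespec="milliseconds")
    return {
        "id": article.get("id"),
        "title": article.get("title"),
        "description": article.get("description"),
        "url": article.get("url"),
        "author": user.get("name"),
        "authorUsername": user.get("username"),
        "tags": list(tags),
        "commentsCount": article.get("comments_count", 0),            # assumed raw key
        "reactionsCount": article.get("public_reactions_count", 0),   # assumed raw key
        "readingTimeMinutes": article.get("reading_time_minutes"),    # assumed raw key
        "coverImage": article.get("cover_image"),                     # assumed raw key
        "publishedAt": article.get("published_at"),                   # assumed raw key
        "scrapedAt": stamp.replace("+00:00", "Z"),
    }
```

Pass a timezone-aware `scraped_at` (for example `datetime.now(timezone.utc)`) so the UTC conversion is unambiguous.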
⚙️ How it works
- Parses input: tag, username, max cap.
- Calls `https://dev.to/api/articles` with the right query: `?tag=<tag>&username=<username>&per_page=100&page=1`.
- Paginates through `page=1..N` until the cap is hit or the API returns an empty page.
- Backs off on HTTP 429 / 5xx with exponential retry.
- Flattens the nested Forem response: extracts `user.name` → `author`, `user.username` → `authorUsername`, `tag_list` → `tags`, and normalizes timestamps to ISO 8601.
- Streams each article as one flat row directly into the Apify Dataset.
The Actor uses ONLY the official, publicly-documented Dev.to / Forem v1 API (dev.to/api/articles). No HTML scraping, no headless browser, no proxy, no auth.
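The paginate-and-retry loop described above could look roughly like the sketch below. It is an illustration under stated assumptions, not the Actor's source: the HTTP call is injected as a hypothetical `fetch` callable returning `(status, json_list)`, and the retry count and base delay are made-up defaults.

```python
import time
from typing import Any, Callable, Iterator
from urllib.parse import urlencode

API_URL = "https://dev.to/api/articles"
PER_PAGE = 100

# Hypothetical fetcher type: takes a URL, returns (HTTP status, decoded JSON list).
Fetcher = Callable[[str], tuple[int, list[dict[str, Any]]]]


def build_url(tag: str, username: str, page: int) -> str:
    """Build the query string, omitting filters that are empty."""
    params: dict[str, Any] = {"per_page": PER_PAGE, "page": page}
    if tag:
        params["tag"] = tag
    if username:
        params["username"] = username
    return f"{API_URL}?{urlencode(params)}"


def fetch_with_backoff(fetch: Fetcher, url: str, max_retries: int = 5,
                       base_delay: float = 1.0,
                       sleep: Callable[[float], None] = time.sleep) -> list[dict[str, Any]]:
    """Retry on HTTP 429 and 5xx, doubling the delay after each failed attempt."""
    for attempt in range(max_retries + 1):
        status, body = fetch(url)
        if status == 200:
            return body
        if status != 429 and status < 500:
            raise RuntimeError(f"unexpected HTTP {status} for {url}")
        if attempt < max_retries:
            sleep(base_delay * 2 ** attempt)
    raise RuntimeError(f"gave up on {url} after {max_retries} retries")


def iter_articles(fetch: Fetcher, tag: str = "", username: str = "",
                  max_articles: int = 0,
                  sleep: Callable[[float], None] = time.sleep) -> Iterator[dict[str, Any]]:
    """Yield articles page by page until the cap is hit or a page comes back empty."""
    page, yielded = 1, 0
    while True:
        batch = fetch_with_backoff(fetch, build_url(tag, username, page), sleep=sleep)
        if not batch:
            return
        for article in batch:
            yield article
            yielded += 1
            if max_articles and yielded >= max_articles:
                return
        page += 1
```

Injecting `fetch` and `sleep` keeps the loop testable offline; a real fetcher would wrap whichever HTTP client you prefer and return the status code together with the parsed JSON body.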
⚡ Performance
| Workload | Approx time | API calls |
|---|---|---|
| Latest 100 articles, no filter | ~3 seconds | 1 |
| 500 articles for a tag | ~10 seconds | 5 |
| 2,000 articles for a tag | ~40 seconds | 20 |
| All articles for an author (200 typical) | ~5 seconds | 2 |
| Full tag backfill (10,000+ articles) | ~5 minutes | ~100 |
The Forem API returns up to 100 articles per page. The Actor stays comfortably within published rate limits via built-in pacing.
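The API-call column in the table follows from the 100-per-page figure with a simple ceiling division. A small sketch (the function name is hypothetical):

```python
import math

PER_PAGE = 100  # page size stated above


def api_calls_needed(article_count: int, per_page: int = PER_PAGE) -> int:
    """Number of data-bearing pages needed to collect article_count articles."""
    return max(1, math.ceil(article_count / per_page))


if __name__ == "__main__":
    for n in (100, 500, 2000, 200, 10_000):
        print(n, api_calls_needed(n))
```

An uncapped run that stops on the first empty page may make one additional request beyond this count.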
💰 Cost model
Pay-Per-Result. You only pay for article rows actually saved. Pages that return zero matches are not billed.
Typical costs (rough order):
- Daily newsletter ingestion (~100 latest) → tiny
- Weekly tag-feed sweep (~500 per tag) → small
- Author network mapping (1,000 authors × top 10 each) → moderate
- Full historical tag backfill (10,000+) → moderate but bounded
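For a rough budget, the listed starting price of $3.50 per 1,000 results can be turned into a per-run estimate. This is only a sketch: the listing says "from $3.50", so the effective rate on your plan may differ, and `estimated_cost` is a hypothetical helper.

```python
from decimal import Decimal

PRICE_PER_1000 = Decimal("3.50")  # listed starting price; treat as an assumption


def estimated_cost(rows: int) -> Decimal:
    """Estimated charge in USD for a run that saves `rows` article rows."""
    return (PRICE_PER_1000 * rows / 1000).quantize(Decimal("0.01"))


if __name__ == "__main__":
    print(estimated_cost(100))      # daily newsletter ingestion
    print(estimated_cost(500))      # weekly sweep of one tag
    print(estimated_cost(10_000))   # 1,000 authors x 10 each, or a 10,000-row backfill
```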
🔄 Schedule for continuous monitoring
Common patterns:
- Hourly for new-article alerts in fast-moving tags (`ai`, `llm`)
- Daily at 7:00 UTC for newsletter and Slack-channel digests
- Weekly for "what's trending in
- Monthly for ecosystem health dashboards and content audits
Push new rows into Slack, Discord, Notion, Airtable, Sheets, your CRM, Postgres, BigQuery, your newsletter sender or any HTTP endpoint via Apify Webhooks.
🛠️ FAQ
Do I need a Dev.to API key or login? No. The Actor uses Dev.to's fully public v1 API, which doesn't require authentication for read-only article queries.
Is it legal to scrape Dev.to? The Actor reads publicly available article metadata via Dev.to's official public API — an API intended for programmatic access. You are responsible for complying with Dev.to's terms of service and for how you use the data (especially attribution if you republish).
How many articles can I get per run?
As many as the API serves for your filter. Set maxArticles=0 to pull everything for your tag/username; set a number for a faster, capped run.
Can I filter by tag AND author at the same time?
Yes — supply both tag and username and the Actor returns articles that match both.
Does it return full article body / Markdown / HTML? This Actor returns the article metadata (including title, description, URL, tags, engagement). For the full article body, the Dev.to API exposes that on the per-article endpoint — request a companion Actor build if you need it.
Does it include engagement metrics?
Yes. Every record includes reactionsCount and commentsCount as engagement signals, plus readingTimeMinutes. Rank, filter or threshold by them downstream.
Are private / unpublished articles included? No. The public API only exposes published articles. Drafts, scheduled posts and private content are not accessible.
Can I sort by reactions or comments?
The Forem /api/articles endpoint returns articles in publish-date order by default. Sort by reactionsCount or commentsCount downstream in your spreadsheet, SQL or pandas.
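If you are not using a spreadsheet or SQL, the same downstream ranking takes a few lines of plain Python. A minimal sketch over exported rows (the helper name and its defaults are hypothetical):

```python
from typing import Any


def top_articles(rows: list[dict[str, Any]], key: str = "reactionsCount",
                 limit: int = 10, min_value: int = 0) -> list[dict[str, Any]]:
    """Return up to `limit` rows with the highest `key`, dropping rows below `min_value`."""
    kept = [r for r in rows if (r.get(key) or 0) >= min_value]
    # Missing or null counts are treated as 0 so they sort last.
    return sorted(kept, key=lambda r: r.get(key) or 0, reverse=True)[:limit]
```

Pass `key="commentsCount"` to rank by discussion instead of reactions.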
Is the data fresh? Yes — the API serves data in near real-time. New articles typically appear within a minute or two of publishing.
How is this different from RSS-feed-based aggregators? Dev.to's RSS is shallow (latest N items only) and lacks engagement counts. This Actor uses the structured API for full pagination and rich metadata.
Can I use this for LLM training? Yes. Dev.to article metadata (and the body via the per-article endpoint) is well-tagged, clearly attributed and topically diverse, which makes it a common pick for tech-writing training sets. Respect attribution and the original license.
What output formats are supported? JSON, CSV, Excel, HTML, XML, JSONL via the Apify Dataset, plus REST API and webhooks for live integrations.
📚 Related scrapers
Adjacent data sources in the social/dev/content suite:
| Scraper | Purpose |
|---|---|
| devto-articles-scraper | You are here. Dev.to articles by tag/author/feed via the public API. |
| hacker-news-search-scraper | HN stories/comments/Show HN/Ask HN/front page by keyword. |
| hacker-news-who-is-hiring-scraper | Monthly HN "Who is hiring?" thread parsed by company/role/stack. |
| reddit-subreddit-scraper | Posts from any subreddit by sort and time window. |
| reddit-historical-archive-scraper | Years of subreddit history at scale. |
| stack-exchange-questions-scraper | Q&A across 170+ Stack Exchange sites by tag/site/sort. |
| github-repository-scraper | Public GitHub repo metadata by search query. |
| product-hunt-daily-launches-scraper | Today's Product Hunt launches with votes and makers. |
| linkedin-top-content-scraper | Top-performing LinkedIn posts by keyword/author. |
| linkedin-ad-library-scraper | LinkedIn Ad Library: competitor ad creative & spend signals. |
| letterboxd-film-review-scraper | Film reviews from Letterboxd for culture/sentiment work. |
| instagram-media-downloader | Reels/Posts/Stories HD download URLs in bulk. |
🔑 Keyword cloud
Core: devto scraper, dev.to scraper, devto api scraper, dev.to articles scraper, devto blog scraper, devto tag scraper, devto author scraper, devto json export, devto csv export, forem scraper, forem api scraper, tech blog scraper, developer articles scraper, developer blog dataset.
Niche: devto javascript scraper, devto python scraper, devto ai scraper, devto webdev scraper, devto rust scraper, devto react scraper, devto devops scraper, devto career scraper, devto beginners scraper, devto tutorial scraper, devto reactions scraper, devto comments count scraper, devto reading time scraper, devto cover image scraper.
Use case: tech content discovery, developer content aggregator, tech newsletter automation, author network analysis, tag trend tracking, devrel monitoring, devtool marketing intelligence, technical writing benchmarking, content audit dataset, content curation pipeline, competitive content tracking, influencer monitoring, podcast guest research, technical recruiting research, llm training data for developer writing, nlp corpus building, sentiment analysis on tech content.
Audience: technical content marketers, devrel teams, newsletter writers, dev tool product managers, founders of dev-tool startups, technical journalists, content aggregator owners, ml/llm engineers, ai researchers, technical recruiters, dev community managers, growth marketers targeting developers, podcast hosts in tech, developer educators, technical writers and authors.
Changelog
- 2026-06-01: Maintenance and reliability pass. Pulled the latest source and rebuilt the Actor on the current base image; build verified.
- 2026-05-25: Maintenance and reliability pass. Pulled the latest source and rebuilt the Actor on the current base image; build verified.
- 2026-05-20: Maintenance pass. Reviewed the input schema and default values for a smooth one-click start, and rebuilt the Actor on the latest base image.
Last reviewed: 2026-06-01.
📝 Changelog
2026-06-04
- Verified live & refreshed build — reliability/maintenance pass.