Slashdot Scraper
Pricing
Pay per event
Extract technology news stories from Slashdot.org including article titles, authors, publication dates, source links, comment counts, and department tags. Browse all sections or scrape the main feed.
Developer
BowTiedRaccoon
Maintained by Community
Extract technology news stories from Slashdot.org — the long-running "News for nerds, stuff that matters" aggregator. Captures all story fields available on the main feed and section pages: headline, Slashdot URL, external source link, author, publish time, comment count, department tag, and topic section.
What you get
Each result record contains:
| Field | Description |
|---|---|
| `storyId` | Slashdot internal story ID |
| `title` | Story headline |
| `url` | Slashdot story page URL |
| `sourceUrl` | External source article URL |
| `sourceDomain` | Domain of the external source (e.g. reuters.com) |
| `commentCount` | Number of community comments |
| `author` | Slashdot editor who posted the story |
| `publishedAt` | Publication datetime |
| `department` | "from the X dept." tagline |
| `section` | Topic section (Hardware, Science, YRO, etc.) |
| `summary` | Story summary text (first paragraph of the post) |
| `scrapedAt` | ISO-8601 timestamp of when the record was scraped |
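To illustrate how `sourceDomain` relates to `sourceUrl`, here is a minimal Python sketch. The helper name `source_domain` is hypothetical, and stripping a leading `www.` is an assumption based on the `reuters.com` example above; the scraper's actual rule may differ.

```python
from urllib.parse import urlparse


def source_domain(source_url: str) -> str:
    """Return the host part of an external source URL.

    Assumption: a leading "www." is dropped, so that
    "https://www.reuters.com/..." maps to "reuters.com".
    """
    host = urlparse(source_url).hostname or ""
    return host[4:] if host.startswith("www.") else host
```

A record with no external link would yield an empty string under this sketch.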
Input
| Parameter | Type | Default | Description |
|---|---|---|---|
| `maxItems` | integer | 10 | Maximum number of stories to return (0 = unlimited) |
| `startPage` | integer | 1 | Page to start from (1-based) |
| `maxPages` | integer | 3 | Number of listing pages to crawl (~15 stories/page) |
Usage
The default run fetches the first 10 stories from Slashdot's homepage. Increase maxItems and maxPages to collect more. Each page yields approximately 15 stories.
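The interaction between the two limits can be sketched as simple arithmetic. This assumes both limits apply and the smaller one wins, and it uses the approximate figure of 15 stories per page; the function names are hypothetical and the real page size can vary.

```python
import math

STORIES_PER_PAGE = 15  # approximate figure from the description above


def estimate_story_count(max_items: int = 10, max_pages: int = 3,
                         per_page: int = STORIES_PER_PAGE) -> int:
    """Rough upper bound on stories returned for a given input.

    Assumption: the run stops at whichever limit is reached first,
    and max_items == 0 means "no item limit".
    """
    page_cap = max_pages * per_page
    return page_cap if max_items == 0 else min(max_items, page_cap)


def pages_needed(target_items: int, per_page: int = STORIES_PER_PAGE) -> int:
    """Smallest maxPages value that could yield target_items stories."""
    return math.ceil(target_items / per_page)
```

Under these assumptions, raising `maxItems` to 100 while leaving `maxPages` at 3 would still return only about 45 stories; roughly 7 pages would be needed to reach 100.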
Technical notes
- Uses Cheerio HTML scraping (server-rendered pages, no JS execution required)
- Cloudflare is present on the site but operates at the CDN layer only; impit's Chrome fingerprinting gets through it cleanly without a proxy
- Protocol-relative URLs (`//subdomain.slashdot.org/...`) are normalised to `https://` in output
- Memory usage: 256 MB
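The protocol-relative URL normalisation mentioned in the notes can be sketched as follows. This is an illustrative Python version, not the scraper's own code; the function name `normalise_url` is hypothetical.

```python
def normalise_url(url: str) -> str:
    """Prefix protocol-relative URLs ("//host/path") with "https:".

    URLs that already carry a scheme, and relative paths, are
    returned unchanged.
    """
    return "https:" + url if url.startswith("//") else url
```

For example, `//hardware.slashdot.org/story/example` becomes `https://hardware.slashdot.org/story/example`, while an absolute `https://` URL passes through untouched.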