Content Intelligence Extractor
Pricing
from $5.00 per 1,000 pages converted
Extract clean Markdown from Reddit threads and news sites. Built for LLM pipelines, n8n workflows, and AI content analysis. Uses Mozilla Readability + Reddit JSON API for noise-free output.
Developer

Andrew Luxem
Actor stats
- Bookmarked: 0
- Total users: 2
- Monthly active users: 1
- Last modified: 7 days ago
Converts Reddit threads and film/entertainment news articles into clean, structured Markdown optimized for LLM pipelines, AI content analysis, and n8n automation workflows.
What it does
Give it a list of URLs — Reddit threads or articles from sites like Screen Rant, CBR, IGN, or any news site — and it returns clean Markdown with engagement signals, metadata, and source-specific fields ready to pipe directly into Claude, GPT, or any LLM.
Reddit threads are extracted via Reddit's JSON API (no browser needed) with post body, top comments sorted by upvotes, and engagement data.
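As a rough illustration of the Reddit path: requesting a thread URL with `.json` appended returns a two-element array, the post listing followed by the comment listing. The sketch below reduces such a payload to the post plus its top comments sorted by upvotes. The function name `parse_reddit_thread` and the sample payload are made up for illustration, and the actor's own implementation may differ.

```python
from __future__ import annotations

from typing import Any


def parse_reddit_thread(payload: list[dict[str, Any]], max_comments: int = 10) -> dict[str, Any]:
    """Reduce a Reddit thread JSON payload to the post plus its top comments."""
    post = payload[0]["data"]["children"][0]["data"]
    # Keep only real comments (kind "t1"); "more" placeholders carry no body.
    comments = [
        child["data"]
        for child in payload[1]["data"]["children"]
        if child.get("kind") == "t1"
    ]
    # Highest-scored comments first. Only top-level comments are considered here.
    comments.sort(key=lambda c: c.get("score", 0), reverse=True)
    return {
        "title": post["title"],
        "body": post.get("selftext", ""),
        "subreddit": post["subreddit"],
        "upvotes": post["score"],
        "commentCount": post["num_comments"],
        "topComments": [
            {"body": c["body"], "upvotes": c["score"]} for c in comments[:max_comments]
        ],
    }


# Trimmed, made-up payload in the shape returned for "<thread URL>.json".
sample = [
    {"data": {"children": [{"kind": "t3", "data": {
        "title": "Theory: The ending means something else entirely",
        "selftext": "Full post body",
        "subreddit": "FanTheories",
        "score": 3200,
        "num_comments": 3,
    }}]}},
    {"data": {"children": [
        {"kind": "t1", "data": {"body": "Second best", "score": 40}},
        {"kind": "t1", "data": {"body": "Best", "score": 412}},
        {"kind": "more", "data": {"count": 5}},
        {"kind": "t1", "data": {"body": "Third", "score": 7}},
    ]}},
]

thread = parse_reddit_thread(sample, max_comments=2)
print(thread["topComments"])
```

Replies nested under each comment are ignored in this sketch; whether the actor includes them is not stated here.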
Film/news sites are extracted in two steps. Mozilla Readability, the library behind Firefox Reader View, strips ads, sidebars, author bios, and newsletter popups. Turndown then converts the cleaned HTML to Markdown.
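To make the strip-then-convert idea concrete, here is a deliberately tiny Python stand-in built on the standard library's `html.parser`. It is not Readability or Turndown (both are JavaScript libraries with far more sophisticated logic) and it is not the actor's code. It only drops a fixed set of noisy tags and turns headings and paragraphs into Markdown. All names in it are illustrative.

```python
from __future__ import annotations

from html.parser import HTMLParser

SKIP_TAGS = {"script", "style", "nav", "aside", "footer", "form"}
HEADINGS = {"h1": "# ", "h2": "## ", "h3": "### "}
BLOCK_TAGS = set(HEADINGS) | {"p"}


class MiniMarkdown(HTMLParser):
    """Collect heading and paragraph text, ignoring anything inside SKIP_TAGS."""

    def __init__(self) -> None:
        super().__init__()
        self.blocks: list[str] = []
        self.current: list[str] = []
        self.prefix = ""
        self.skip_depth = 0
        self.in_block = False

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif self.skip_depth == 0 and tag in BLOCK_TAGS:
            self.in_block = True
            self.prefix = HEADINGS.get(tag, "")
            self.current = []

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth == 0 and tag in BLOCK_TAGS and self.in_block:
            # Collapse runs of whitespace before emitting the block.
            text = " ".join("".join(self.current).split())
            if text:
                self.blocks.append(self.prefix + text)
            self.in_block = False

    def handle_data(self, data):
        if self.skip_depth == 0 and self.in_block:
            self.current.append(data)


def html_to_markdown(html: str) -> str:
    parser = MiniMarkdown()
    parser.feed(html)
    parser.close()
    return "\n\n".join(parser.blocks)


page = """
<html><body>
<nav><p>Home | Movies | TV</p></nav>
<article>
<h1>Sequel Confirmed</h1>
<p>The studio announced a <b>sequel</b> today.</p>
<aside><p>Subscribe to our newsletter!</p></aside>
<p>Filming starts next year.</p>
</article>
<script>trackPageView();</script>
</body></html>
"""

markdown = html_to_markdown(page)
print(markdown)
```

Inline formatting such as bold text and links is dropped in this sketch, and the skip list is hard-coded, whereas Readability scores page elements to find the main content.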
Use cases
- Content gap analysis — feed competitor articles to an LLM to find unexplored angles
- n8n content pipelines — schedule weekly runs, pipe output to Claude or GPT for article briefs
- Reddit trend monitoring — extract high-upvote fan theories or discussions for content research
- SEO research — extract and analyze top-ranking articles in bulk
- RAG knowledge bases — clean Markdown is ideal for vector embeddings
Example input
{
  "urls": [
    "https://www.reddit.com/r/FanTheories/comments/abc123/my_theory/",
    "https://screenrant.com/some-article/"
  ],
  "maxRedditComments": 10,
  "includeEngagementData": true
}
Example output
{
  "url": "https://www.reddit.com/r/FanTheories/comments/abc123/",
  "sourceType": "reddit",
  "title": "Theory: The ending means something else entirely",
  "markdown": "# Theory: The ending...\n\nFull post body...\n\n## Top Comments\n\n...",
  "metadata": {
    "wordCount": 847,
    "estimatedReadTime": 4,
    "engagementSignal": 3200
  },
  "redditSpecific": {
    "subreddit": "FanTheories",
    "upvotes": 3200,
    "commentCount": 143,
    "topComments": [
      { "body": "...", "upvotes": 412 }
    ]
  }
}
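One way to consume records of this shape before handing them to an LLM is to rank them by `metadata.engagementSignal` and build a prompt from the `markdown` field. The sketch below assumes every record has `url`, `title`, and `markdown`, and treats a missing `engagementSignal` as 0. The helper names and sample records are made up for illustration.

```python
from __future__ import annotations

from typing import Any


def engagement(item: dict[str, Any]) -> int:
    """Read metadata.engagementSignal, treating a missing value as 0."""
    return item.get("metadata", {}).get("engagementSignal", 0)


def top_items(items: list[dict[str, Any]], min_signal: int = 0) -> list[dict[str, Any]]:
    """Keep items at or above the threshold, highest engagement first."""
    kept = [item for item in items if engagement(item) >= min_signal]
    return sorted(kept, key=engagement, reverse=True)


def build_prompt(item: dict[str, Any], max_chars: int = 4000) -> str:
    """Prefix the (truncated) Markdown with its source URL and title."""
    header = f"Source: {item['url']}\nTitle: {item['title']}\n\n"
    return header + item["markdown"][:max_chars]


# Made-up records in the shape of the example output.
items = [
    {
        "url": "https://www.reddit.com/r/FanTheories/comments/def456/",
        "title": "A quieter theory",
        "markdown": "# A quieter theory\n\nBody text",
        "metadata": {"wordCount": 120, "engagementSignal": 85},
    },
    {
        "url": "https://www.reddit.com/r/FanTheories/comments/abc123/",
        "title": "Theory: The ending means something else entirely",
        "markdown": "# Theory: The ending\n\nFull post body",
        "metadata": {"wordCount": 847, "engagementSignal": 3200},
    },
]

ranked = top_items(items, min_signal=100)
prompt = build_prompt(ranked[0], max_chars=20)
print(prompt)
```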
n8n integration
Use the native Apify n8n node to trigger this actor on a schedule:
- Schedule Trigger — weekly or daily
- Apify: Run Actor — pass your URL list
- Apify: Get Dataset — fetch results
- Loop + LLM node — Claude/GPT analysis prompt
- Google Sheets / Notion — store content briefs
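For readers who want to try the run-then-fetch part of this workflow outside n8n, the same two steps can be sketched with the `apify-client` Python package. The actor ID below is a placeholder, not the real one (copy it from the actor's page), and `run_extractor` is an illustrative helper rather than part of any library. The input keys come from the example input above.

```python
from __future__ import annotations

from typing import Any

# Hypothetical placeholder; replace with the ID shown on the actor's page.
ACTOR_ID = "username/content-intelligence-extractor"


def run_extractor(client: Any, urls: list[str], max_comments: int = 10) -> list[dict[str, Any]]:
    """Run the actor once and return every item from its default dataset."""
    run_input = {
        "urls": urls,
        "maxRedditComments": max_comments,
        "includeEngagementData": True,
    }
    # Equivalent of the "Run Actor" step: start the run and wait for it to finish.
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    # Equivalent of the "Get Dataset" step: read the results of that run.
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


def main() -> None:
    import os

    from apify_client import ApifyClient  # third-party: pip install apify-client

    client = ApifyClient(os.environ["APIFY_TOKEN"])
    for item in run_extractor(client, ["https://screenrant.com/some-article/"]):
        print(item["title"], item["metadata"]["wordCount"])
```

Calling `main()` requires an `APIFY_TOKEN` environment variable and starts a billable run. The LLM and storage steps are left out because they depend on the provider you choose.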
Pricing
Pay-per-page: $0.005 per URL processed ($5.00 per 1,000 pages). The first 20 pages are free.
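The arithmetic can be checked with a short calculation. This sketch assumes the 20 free pages are a single allowance deducted before billing; whether that allowance resets (for example monthly) is not stated here. `estimate_cost` is an illustrative helper, not part of any Apify API.

```python
from decimal import Decimal

PRICE_PER_URL = Decimal("0.005")  # USD, from the pricing line above
FREE_PAGES = 20                   # assumed to be a single, non-recurring allowance


def estimate_cost(url_count: int, free_remaining: int = FREE_PAGES) -> Decimal:
    """Return the estimated charge in USD for processing url_count URLs."""
    billable = max(0, url_count - free_remaining)
    return billable * PRICE_PER_URL


print(estimate_cost(15))    # 0.000 (covered by the free pages)
print(estimate_cost(1000))  # 4.900 (980 billable URLs at $0.005 each)
```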
Supported sources
- Reddit (all subreddits via JSON API)
- Screen Rant, CBR, IGN, Variety, Hollywood Reporter
- Any article-based news or blog site
- Custom CSS selectors to strip site-specific noise

