Website To Rss
Pricing
Pay per usage
Convert any website into RSS, Atom, or JSON feeds. Auto-detects articles, tracks changes, and sends notifications. Works with WordPress, Ghost, Medium, Substack, and any blog.
Developer: Gabriel Antony Xaviour
Last modified: 7 days ago
Website to RSS
Transform any website into a standards-compliant RSS, Atom, or JSON feed with automatic article detection and change monitoring.
Features • Quick Start • Configuration • Output • Monitoring • Use Cases
What is Website to RSS?
Website to RSS converts any website into a subscribable feed, even if the site doesn't offer one. It automatically detects article patterns, extracts content using multiple strategies, and outputs valid RSS 2.0, Atom 1.0, or JSON Feed formats.
Key Benefits
| Feature | Description |
|---|---|
| Auto-Discovery | Automatically detects site structure and article patterns |
| Platform Presets | Optimized extraction for WordPress, Ghost, Medium, Substack, Hugo |
| Smart Extraction | Uses OpenGraph, JSON-LD, and semantic HTML for reliable content |
| Change Detection | Track new posts and content changes between runs |
| Multiple Formats | Generate RSS 2.0, Atom 1.0, JSON Feed, and HTML preview |
| Notifications | Send webhooks or Slack alerts when new content is detected |
Features
Auto-Discovery Engine
The actor automatically analyzes websites to find articles:
- Platform Detection: Checks for WordPress, Ghost, and Medium signatures
- Structure Analysis: Identifies article patterns from page structure
- Link Filtering: Excludes author, tag, and category pages
- Article Scoring: Uses heuristics to identify real articles
- Content Extraction: Tries OpenGraph → JSON-LD → Semantic HTML → Fallback
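The extraction order in the last step can be illustrated with a small sketch. This is not the actor's actual code (which this README does not show); it is a minimal standard-library illustration of the same fallback idea, applied to the title only. The names `_MetaCollector` and `extract_title` are hypothetical, and using the JSON-LD `headline` property and the first `<h1>` as sources is an assumption.

```python
import json
from html.parser import HTMLParser


class _MetaCollector(HTMLParser):
    """Collects OpenGraph tags, JSON-LD blocks, and the first <h1> text."""

    def __init__(self):
        super().__init__()
        self.og = {}
        self.json_ld = []
        self.h1 = None
        self._in_json_ld = False
        self._in_h1 = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("property") or "").startswith("og:"):
            self.og[attrs["property"]] = attrs.get("content") or ""
        elif tag == "script" and attrs.get("type") == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []
        elif tag == "h1" and self.h1 is None:
            self._in_h1 = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_json_ld or self._in_h1:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self._in_json_ld = False
            try:
                self.json_ld.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # ignore malformed JSON-LD and fall through to the next source
        elif tag == "h1" and self._in_h1:
            self._in_h1 = False
            self.h1 = "".join(self._buffer).strip()


def extract_title(html, fallback="Untitled"):
    """Return (title, source), trying each source in priority order."""
    parser = _MetaCollector()
    parser.feed(html)
    parser.close()
    if parser.og.get("og:title"):
        return parser.og["og:title"], "opengraph"
    for block in parser.json_ld:
        if isinstance(block, dict) and block.get("headline"):
            return block["headline"], "json-ld"
    if parser.h1:
        return parser.h1, "semantic-html"
    return fallback, "fallback"
```

Returning the source label alongside the value makes it easy to see which strategy fired for a given page, which helps when deciding whether manual selectors are needed.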
Platform Presets
Pre-configured extraction rules for popular platforms:
| Platform | What's Optimized |
|---|---|
| WordPress | Post selectors, date formats, category extraction |
| Ghost | Card content, member content handling |
| Medium | Story pages, clap counts, reading time |
| Substack | Newsletter posts, subscriber content |
| Hugo | Front matter, taxonomy handling |
| Generic | Universal patterns for unknown sites |
Output Formats
| Format | Content Type | Description |
|---|---|---|
| RSS 2.0 | application/rss+xml | Most compatible feed format |
| Atom 1.0 | application/atom+xml | Modern feed standard |
| JSON Feed | application/json | Developer-friendly format |
| HTML | text/html | Visual preview page |
Quick Start
Basic Usage
Just provide a URL; the actor handles the rest:
{"websiteUrl": "https://blog.example.com","maxItems": 20}
With Platform Preset
Optimize extraction for known platforms:
{"websiteUrl": "https://my-ghost-blog.com","platformPreset": "ghost","maxItems": 20}
With Manual Selectors
Full control for custom sites:
{"websiteUrl": "https://custom-site.com","discoveryMode": "manual","linkSelector": ".article-link","titleSelector": ".article-title","contentSelector": ".article-body","maxItems": 20}
With Change Monitoring
Get notified when new content appears:
{"websiteUrl": "https://blog.example.com","stateStoreName": "my-blog-monitor","monitorMode": "new_pages","slackWebhook": "https://hooks.slack.com/services/..."}
Input Configuration
Core Options
| Parameter | Type | Default | Description |
|---|---|---|---|
| websiteUrl | string | required | URL of the website to convert |
| maxItems | number | 50 | Maximum items in the feed |
| outputFormats | array | ["rss", "json"] | Formats: rss, atom, json, html |
Discovery Options
| Parameter | Type | Default | Description |
|---|---|---|---|
| discoveryMode | string | auto | auto, preset, or manual |
| platformPreset | string | auto | wordpress, ghost, medium, substack, hugo, generic |
| autoDetectArticles | boolean | true | Use heuristics to filter non-articles |
Manual Selectors
| Parameter | Type | Description |
|---|---|---|
| linkSelector | string | CSS selector for article links |
| titleSelector | string | CSS selector for article title |
| contentSelector | string | CSS selector for article body |
| dateSelector | string | CSS selector for publication date |
| urlExcludePatterns | array | URL patterns to skip (e.g., /author/) |
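How `urlExcludePatterns` is matched is not specified in this README. The sketch below assumes the simplest interpretation, a plain substring match against the URL path; the actor may use globs or regular expressions instead. `filter_links` is a hypothetical helper for previewing which links a pattern list would drop.

```python
from urllib.parse import urlparse


def filter_links(links, exclude_patterns):
    """Keep links whose path contains none of the exclude patterns.

    Assumption: patterns are plain substrings such as "/author/".
    """
    kept = []
    for link in links:
        path = urlparse(link).path
        if any(pattern in path for pattern in exclude_patterns):
            continue
        kept.append(link)
    return kept


links = [
    "https://blog.example.com/post-1",
    "https://blog.example.com/author/jane",
    "https://blog.example.com/tag/python",
]
print(filter_links(links, ["/author/", "/tag/"]))
```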
Monitoring Options
| Parameter | Type | Default | Description |
|---|---|---|---|
| stateStoreName | string | none | Named store to persist state between runs |
| monitorMode | string | both | new_pages, content_changes, or both |
| webhookUrl | string | none | URL to POST when new content found |
| slackWebhook | string | none | Slack webhook for notifications |
Output
Key-Value Store Files
| Key | Content Type | Description |
|---|---|---|
| feed.xml | application/rss+xml | RSS 2.0 feed |
| feed.atom | application/atom+xml | Atom 1.0 feed |
| feed.json | application/json | JSON Feed 1.1 |
| feed.html | text/html | HTML preview page |
| OUTPUT | application/json | Run summary and statistics |
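To subscribe in a feed reader you need a URL for one of these records. The sketch below builds one using the Apify API's key-value store record endpoint; the store ID is a placeholder you would take from your own run, and private stores may additionally require an API token. `record_url` is a hypothetical helper name.

```python
from urllib.parse import quote

API_BASE = "https://api.apify.com/v2"


def record_url(store_id, key):
    """Build the API URL of a single key-value store record, e.g. feed.xml."""
    return (
        f"{API_BASE}/key-value-stores/{quote(store_id, safe='')}"
        f"/records/{quote(key, safe='')}"
    )


# "YOUR_STORE_ID" is a placeholder, not a real store.
print(record_url("YOUR_STORE_ID", "feed.xml"))
```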
Dataset Item Schema
{"url": "https://blog.example.com/post-1","title": "My First Post","description": "A short description...","content": "Full article content...","date": "2024-01-15T10:00:00Z","author": "John Doe","image": "https://blog.example.com/image.jpg","categories": ["tech", "tutorial"],"changeStatus": "new"}
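As an illustration of how items in this shape map onto a feed, here is a minimal sketch that renders them as RSS 2.0 with the Python standard library. It is not the actor's generator; `build_rss` is a hypothetical name, and the channel description text and the choice of the item URL as `guid` are assumptions.

```python
import xml.etree.ElementTree as ET
from datetime import datetime
from email.utils import format_datetime


def build_rss(feed_title, feed_url, items):
    """Render dataset items as an RSS 2.0 document and return it as a string."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = feed_url
    ET.SubElement(channel, "description").text = f"Feed generated for {feed_url}"
    for item in items:
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = item["title"]
        ET.SubElement(node, "link").text = item["url"]
        ET.SubElement(node, "guid").text = item["url"]  # assumption: URL as GUID
        ET.SubElement(node, "description").text = item.get("description", "")
        if item.get("date"):
            # RSS 2.0 uses RFC 822 dates; the dataset stores ISO 8601.
            parsed = datetime.fromisoformat(item["date"].replace("Z", "+00:00"))
            ET.SubElement(node, "pubDate").text = format_datetime(parsed)
        for category in item.get("categories", []):
            ET.SubElement(node, "category").text = category
    return ET.tostring(rss, encoding="unicode")
```

Building the tree with `ElementTree` rather than string templates means titles and descriptions containing `&` or `<` are escaped automatically.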
Run Summary (OUTPUT)
{"feedTitle": "My Blog","feedUrl": "https://blog.example.com","itemCount": 20,"newItems": 3,"changedItems": 1,"formats": ["rss", "json"],"crawlDurationSec": 45}
Change Detection
Enable persistent monitoring to track content changes:
{"websiteUrl": "https://blog.example.com","stateStoreName": "my-blog-monitor","webhookUrl": "https://my-webhook.com/notify"}
How It Works
- First Run: Captures all current items as the baseline
- Subsequent Runs: Compares current items with the stored state
- Detection: Identifies new pages and content changes
- Notification: Sends a webhook or Slack notification listing the changes
- State Update: Stores the new state for the next run
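The steps above can be sketched as a comparison of content fingerprints keyed by URL. This is an illustration, not the actor's implementation: hashing title plus content with SHA-256, the `"changed"` status value, and treating the first run as reporting nothing are all assumptions (only `"new"` appears in this README). `fingerprint` and `detect_changes` are hypothetical names.

```python
import hashlib


def fingerprint(item):
    """Hash the fields that should trigger a 'changed' status when edited."""
    text = (item.get("title") or "") + "\n" + (item.get("content") or "")
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def detect_changes(previous_state, items, monitor_mode="both"):
    """Return (changes, new_state); previous_state is None on the first run."""
    new_state = {item["url"]: fingerprint(item) for item in items}
    if previous_state is None:
        # First run: everything becomes the baseline, nothing is reported.
        return [], new_state
    changes = []
    for item in items:
        url = item["url"]
        if url not in previous_state:
            status = "new"
        elif previous_state[url] != new_state[url]:
            status = "changed"
        else:
            continue
        wanted = {"new": "new_pages", "changed": "content_changes"}[status]
        if monitor_mode in ("both", wanted):
            changes.append({**item, "changeStatus": status})
    return changes, new_state
```

The returned `new_state` is what would be written back to the named store so the next run has something to compare against.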
Webhook Payload
{"event": "new_items","feedTitle": "My Blog","itemCount": 3,"items": [{"title": "New Post","url": "https://blog.example.com/new-post","date": "2024-01-15T10:00:00Z"}]}
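On the receiving side, a custom backend might turn this payload into a chat message. The sketch below assumes the payload fields shown above and formats them as a Slack incoming-webhook body (a JSON object with a `text` field); `format_slack_message` is a hypothetical helper, and it does not send anything.

```python
import json


def format_slack_message(body):
    """Convert a raw webhook body (bytes or str) into a Slack message dict."""
    payload = json.loads(body)
    lines = [f"{payload['itemCount']} new item(s) in {payload['feedTitle']}"]
    for item in payload.get("items", []):
        # <url|label> is Slack's link syntax.
        lines.append(f"- <{item['url']}|{item['title']}>")
    return {"text": "\n".join(lines)}


body = json.dumps({
    "event": "new_items",
    "feedTitle": "My Blog",
    "itemCount": 3,
    "items": [{"title": "New Post", "url": "https://blog.example.com/new-post"}],
})
print(format_slack_message(body)["text"])
```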
Use Cases
| Use Case | Description |
|---|---|
| RSS for RSS-less Sites | Create feeds for websites that don't offer them |
| Content Monitoring | Track competitors, news sources, or blogs for updates |
| Aggregation | Combine multiple sites into monitoring workflows |
| Archiving | Capture content changes over time |
| Automation Triggers | Use webhooks to trigger downstream workflows |
Integrations
Schedule Regular Updates
Use Apify Scheduler for periodic monitoring:
| Update Frequency | Best For |
|---|---|
| Every hour | News sites, high-frequency publishers |
| Every 6 hours | Active blogs, company announcements |
| Daily | Personal blogs, slow-updating sites |
Feed Readers
Generated feeds work with any RSS reader:
- Feedly
- Inoreader
- NewsBlur
- Feedbin
- Any RSS-compatible app
Automation Platforms
Connect via webhooks to:
- Zapier
- Make (Integromat)
- n8n
- Custom backends
Troubleshooting
No articles detected
| Problem | Solution |
|---|---|
| Site renders content with JavaScript | This actor fetches pages over plain HTTP without a browser. Try a Playwright-based scraper instead |
| Custom structure | Use discoveryMode: "manual" with specific selectors |
| Site blocks requests | Enable proxy configuration |
Wrong content extracted
| Problem | Solution |
|---|---|
| Navigation text is extracted as content | Provide a specific contentSelector for the article body |
| Missing dates | Add dateSelector for date elements |
| Extra pages included | Add patterns to urlExcludePatterns |
FAQ
Q: Does this work with JavaScript-heavy sites?
This actor uses HTTP requests (CheerioCrawler), not a browser. For sites that require JavaScript rendering, consider a Playwright-based scraper.
Q: How often should I run monitoring?
It depends on the site's update frequency: hourly for news sites, every 6-24 hours for blogs. Over-polling wastes resources and may trigger rate limits.
Q: Can I monitor multiple sites?
Run the actor separately for each site with different stateStoreName values. Use Apify's scheduling to orchestrate multiple monitors.
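One way to prepare such per-site inputs is to derive the store name from each hostname, as in this sketch. The naming scheme and the helper `build_monitor_inputs` are assumptions for illustration; any names work as long as each site gets its own.

```python
import re
from urllib.parse import urlparse


def build_monitor_inputs(site_urls, max_items=20):
    """Build one actor input per site, each with its own stateStoreName."""
    inputs = []
    for url in site_urls:
        host = urlparse(url).netloc.lower()
        # Assumption: store names limited to lowercase letters, digits, hyphens.
        store_name = re.sub(r"[^a-z0-9]+", "-", host).strip("-") + "-monitor"
        inputs.append({
            "websiteUrl": url,
            "stateStoreName": store_name,
            "monitorMode": "both",
            "maxItems": max_items,
        })
    return inputs


for actor_input in build_monitor_inputs(["https://blog.example.com", "https://news.example.org"]):
    print(actor_input["stateStoreName"])
```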
Q: What if the site changes its structure?
Auto-discovery adapts to many changes. For manual selectors, you'll need to update them if the site's HTML structure changes.
Support
- Issues: Report bugs or request features on GitHub
- Documentation: See the full README in the Actor source
- API: Use Apify API to run this Actor programmatically
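A minimal sketch of starting a run through the Apify API with only the standard library follows. The Actor ID below is a placeholder (this README does not state the real one), and `build_run_request` is a hypothetical helper; it only constructs the request so you can inspect it before sending.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id, token, actor_input):
    """Build a POST request that starts an Actor run with the given input."""
    return urllib.request.Request(
        f"{API_BASE}/acts/{actor_id}/runs",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# Placeholder Actor ID and token; replace both with your own values.
request = build_run_request(
    "username~website-to-rss",
    "YOUR_APIFY_TOKEN",
    {"websiteUrl": "https://blog.example.com", "maxItems": 20},
)
print(request.get_method(), request.full_url)
# To actually start the run: urllib.request.urlopen(request)
```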
License
Apache 2.0