Convert Any Website to RSS Feed
Pricing
from $2.00 / 1,000 results
Turn blogs, news pages, job boards, product listings, directories, and sitemaps into RSS feeds, JSON Feed, and structured datasets with change detection.
Developer
Inus Grobler
Maintained by Community
Any Website to RSS Feed
Website to RSS Feed Generator for blogs, news pages, job boards, product listings, directories, and sitemaps.
Turn almost any public website into an RSS feed, JSON Feed, and structured Apify dataset.
Any Website to RSS Feed is a lightweight website to RSS feed generator for blogs, news pages, job boards, product listings, directories, category pages, and sitemap-backed websites. Use it when a site does not offer a clean RSS feed, when you want a normalized feed from many different websites, or when you need to detect new and changed items over time.
The Actor is designed to be cost-aware by default. It looks for existing RSS, Atom, and JSON feeds first, then tries static HTML extraction, and only uses browser rendering or optional AI selector discovery when needed.
SEO Title
Convert Any Website to RSS Feed | Blogs, News, Jobs
SEO Description
Create RSS feeds from blogs, news pages, job boards, product listings, directories, and sitemaps. Export JSON Feed and structured datasets with lightweight change detection on Apify.
Why People Use It
- Turn websites without RSS into feeds for Slack, email, Zapier, Make, or custom automation
- Monitor job boards, product catalogs, newsroom sections, and directory pages
- Publish normalized RSS and JSON Feed outputs from inconsistent websites
- Detect new and changed content across scheduled runs without building a custom scraper
What You Can Do With It
- Create an RSS feed from almost any public website
- Convert blogs, listings, directories, and job boards into structured feed items
- Generate both RSS 2.0 and JSON Feed from the same run
- Export feed items to an Apify dataset for automation and analysis
- Detect new, changed, and unchanged items across repeated runs
- Use custom CSS selectors for difficult websites
- Use optional OpenRouter selector discovery when normal extraction is weak
Common Use Cases
- Website to RSS feed conversion
- Blog to RSS feed generation
- Job board monitoring
- Product listing monitoring
- News and article aggregation
- Directory and category page tracking
- Sitemap-based content discovery
- Lightweight website change monitoring for feed publishing
- RSS feed generation for automation tools, newsletters, and dashboards
How It Works
For each URL you provide, the Actor tries these approaches:
- Finds existing RSS, Atom, or JSON feeds
- Checks common feed URLs such as /feed, /rss.xml, and /atom.xml
- Discovers sitemap URLs from robots.txt and sitemap.xml
- Extracts repeated cards or listing items from static HTML
- Uses your custom CSS selectors when provided
- Falls back to browser rendering when JavaScript is required
- Optionally uses OpenRouter to infer selectors for repeated listings
- Stores RSS, JSON Feed, dataset rows, and a run summary
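The first two steps above (finding advertised feeds and probing common feed paths) can be sketched with the Python standard library. This is a minimal illustration, not the Actor's actual implementation: the names `FeedLinkParser`, `discover_feeds`, and `candidate_feed_urls` are hypothetical, and the set of accepted MIME types is an assumption.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumption: MIME types commonly used to advertise feeds in <link> tags.
FEED_TYPES = {
    "application/rss+xml",
    "application/atom+xml",
    "application/feed+json",
}


class FeedLinkParser(HTMLParser):
    """Collects hrefs of <link rel="alternate"> tags that advertise a feed."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = {name: (value or "") for name, value in attrs}
        rel_tokens = attr.get("rel", "").lower().split()
        is_feed = attr.get("type", "").lower() in FEED_TYPES
        if "alternate" in rel_tokens and is_feed and attr.get("href"):
            # Resolve relative hrefs against the page URL.
            self.feeds.append(urljoin(self.base_url, attr["href"]))


def discover_feeds(html, base_url):
    """Return absolute feed URLs advertised in the page's HTML."""
    parser = FeedLinkParser(base_url)
    parser.feed(html)
    return parser.feeds


def candidate_feed_urls(base_url):
    """Return the common feed paths to probe when nothing is advertised."""
    return [urljoin(base_url, path) for path in ("/feed", "/rss.xml", "/atom.xml")]
```

Checking advertised feeds before anything else is the cheapest path, since it needs only the HTML that would be fetched anyway.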
Quick Start
Paste one or more public URLs into startUrls.
{
  "startUrls": [{ "url": "https://example.com/blog" }]
}
Best Starting Setup
For most websites, this is the best place to start:
{
  "startUrls": [{ "url": "https://example.com/blog" }],
  "mode": "auto",
  "maxPages": 25,
  "maxItems": 100
}
Use pageList for category pages, search results, job boards, and product grids. Use customSelectors only when a site has an unusual layout and automatic extraction is weak.
For a listing page, such as jobs, products, articles, or directory cards:
{
  "startUrls": [{ "url": "https://example.com/jobs" }],
  "mode": "pageList",
  "maxPages": 25,
  "maxItems": 100
}
For a page where you already know the selectors:
{
  "startUrls": [{ "url": "https://example.com/products" }],
  "mode": "customSelectors",
  "customSelectors": {
    "itemSelector": ".product-card",
    "titleSelector": ".product-title",
    "urlSelector": "a",
    "summarySelector": ".product-summary",
    "imageSelector": "img"
  }
}
Input Options
- startUrls: public website URLs to convert into feed items
- mode: choose automatic discovery, RSS-only discovery, sitemap mode, page-list extraction, or custom selectors
- maxPages: maximum pages to fetch
- maxItems: maximum items to return
- renderJavaScript: use browser rendering for JavaScript-heavy pages
- includeUrlPatterns: keep only URLs matching these patterns
- excludeUrlPatterns: remove URLs matching these patterns
- customSelectors: CSS selectors for manual extraction
- changedItemPolicy: include changed items, exclude them, or output only changed items
Technical settings such as crawl depth, browser fallback, deduplication, and debug reporting stay on safe built-in defaults so the input stays simple.
Output
The Actor writes feed items to the dataset and stores feed files in the run key-value store.
Dataset items include:
{
  "title": "Frontend Engineer",
  "url": "https://example.com/jobs/frontend-engineer",
  "canonicalUrl": "https://example.com/jobs/frontend-engineer",
  "summary": "Frontend role for a fast-growing product team.",
  "publishedAt": "2026-05-13T09:00:00.000Z",
  "updatedAt": null,
  "imageUrl": "https://example.com/images/frontend.png",
  "sourcePageUrl": "https://example.com/jobs",
  "sourceSite": "https://example.com",
  "contentHash": "sha256...",
  "isNew": true,
  "isChanged": false,
  "previousHash": null,
  "discoveryMethod": "html_repeated",
  "extractionMethod": "static_html",
  "confidence": 0.86,
  "scrapedAt": "2026-05-13T12:00:00.000Z"
}
Key-value store records:
- RSS_XML: generated RSS 2.0 feed
- JSON_FEED: generated JSON Feed
- RUN_SUMMARY: totals, failed URLs, and extraction flags
- DEBUG_REPORT: optional debug details when troubleshooting is enabled internally
RSS and JSON Feed
The generated RSS feed includes channel metadata and item fields such as title, link, guid, publication date, and description when available.
The generated JSON Feed includes feed metadata and item fields such as id, URL, title, summary, publication date, and image when available.
Descriptions and summaries are optional because many listing pages do not expose full excerpts. When possible, the Actor tries to enrich missing summaries from page metadata and detail pages while staying within your page limits.
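To make the field mapping concrete, here is a rough sketch of how dataset items shaped like the example above could be turned into RSS 2.0 and JSON Feed documents. The functions `build_rss` and `build_json_feed` are hypothetical, and using the item URL as the guid and id is an assumption; the Actor's own output may differ in detail.

```python
import xml.etree.ElementTree as ET
from datetime import datetime
from email.utils import format_datetime


def build_rss(items, title, link, description):
    """Build an RSS 2.0 document (as a string) from dataset-style items."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for item in items:
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = item["title"]
        ET.SubElement(node, "link").text = item["url"]
        ET.SubElement(node, "guid").text = item["url"]  # assumption: URL as guid
        if item.get("publishedAt"):
            # RSS expects RFC 822 dates; the dataset uses ISO 8601 with "Z".
            parsed = datetime.fromisoformat(item["publishedAt"].replace("Z", "+00:00"))
            ET.SubElement(node, "pubDate").text = format_datetime(parsed)
        if item.get("summary"):
            ET.SubElement(node, "description").text = item["summary"]
    return ET.tostring(rss, encoding="unicode")


def build_json_feed(items, title, home_page_url):
    """Build a JSON Feed 1.1 document (as a dict) from dataset-style items."""
    entries = []
    for item in items:
        entry = {"id": item["url"], "url": item["url"], "title": item["title"]}
        # Optional fields are emitted only when the source item has them.
        if item.get("summary"):
            entry["summary"] = item["summary"]
        if item.get("publishedAt"):
            entry["date_published"] = item["publishedAt"]
        if item.get("imageUrl"):
            entry["image"] = item["imageUrl"]
        entries.append(entry)
    return {
        "version": "https://jsonfeed.org/version/1.1",
        "title": title,
        "home_page_url": home_page_url,
        "items": entries,
    }
```

Optional fields are skipped rather than written as empty values, which matches the "when available" wording above.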
New and Changed Items
The Actor can track items across runs using a named state store. This lets you tell whether an item is:
- new
- changed
- unchanged
This is useful for scheduled RSS generation, content alerts, product monitoring, job board monitoring, and feed publishing workflows.
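One common way to implement this kind of tracking is to hash each item's content and compare it with the hash stored from the previous run. The sketch below assumes that approach; `content_hash`, `classify_items`, the choice of hashed fields, and the URL-keyed state dict are all hypothetical and not confirmed details of the Actor.

```python
import hashlib
import json


def content_hash(item, fields=("title", "url", "summary")):
    """Hash a stable subset of fields (assumed here) with SHA-256."""
    payload = json.dumps(
        {field: item.get(field) for field in fields},
        sort_keys=True,
        ensure_ascii=False,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def classify_items(items, previous_state):
    """Annotate items as new, changed, or unchanged.

    previous_state maps item URL -> content hash from the last run.
    Returns (annotated_items, new_state) without mutating the inputs.
    """
    annotated = []
    new_state = dict(previous_state)
    for item in items:
        current = content_hash(item)
        previous = previous_state.get(item["url"])
        annotated.append({
            **item,
            "contentHash": current,
            "previousHash": previous,
            "isNew": previous is None,
            "isChanged": previous is not None and previous != current,
        })
        new_state[item["url"]] = current
    return annotated, new_state
```

Persisting `new_state` between runs (for example in a named key-value store) is what lets a later run distinguish changed items from new ones.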
Example Saved Tasks
These are good starting points for production tasks in Apify.
Monitor a News Section
{
  "startUrls": [{ "url": "https://example.com/news" }],
  "mode": "auto",
  "maxPages": 20,
  "maxItems": 50,
  "changedItemPolicy": "include"
}
Track a Job Board
{
  "startUrls": [{ "url": "https://example.com/jobs" }],
  "mode": "pageList",
  "maxPages": 30,
  "maxItems": 100,
  "changedItemPolicy": "onlyChanged"
}
Follow a Product Listing Page
{
  "startUrls": [{ "url": "https://example.com/products" }],
  "mode": "pageList",
  "maxPages": 20,
  "maxItems": 100,
  "includeUrlPatterns": ["/products/"]
}
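The effect of includeUrlPatterns and excludeUrlPatterns can be pictured with a small filter. The page does not say whether patterns are substrings, globs, or regular expressions, so this sketch assumes plain substring matching; `filter_urls` is a hypothetical helper for previewing which URLs a pattern would keep.

```python
def filter_urls(urls, include_patterns=(), exclude_patterns=()):
    """Keep URLs that match any include pattern and no exclude pattern.

    Assumption: patterns are plain substrings. An empty include list
    keeps everything that is not excluded.
    """
    kept = []
    for url in urls:
        if include_patterns and not any(p in url for p in include_patterns):
            continue
        if any(p in url for p in exclude_patterns):
            continue
        kept.append(url)
    return kept


urls = [
    "https://example.com/products/widget",
    "https://example.com/products/archive/old-widget",
    "https://example.com/about",
]
print(filter_urls(urls, ["/products/"], ["/archive/"]))
```

Running a check like this against a handful of real URLs from the site is a cheap way to validate a pattern before spending pages on it.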
Optional AI Selector Discovery
Some websites have unusual markup. When needed, the Actor can use OpenRouter to infer CSS selectors for repeated cards or listing items.
The AI is only used for selector discovery. It is not used to summarize content, rewrite text, or analyze the full website.
To use this feature, set OPENROUTER_API_KEY as a secret environment variable in Apify. Do not put the key in your input JSON.
Pricing and Resource Tips
- Start with mode: "auto"
- Keep renderJavaScript off unless the site requires it
- Use maxPages and maxItems to control cost
- Use includeUrlPatterns for large websites and sitemaps
- Use customSelectors for pages with predictable layouts
- Use OpenRouter only as a fallback unless a website clearly needs selector inference
Practical guidance:
- Small blog or feed-backed website: usually maxPages: 10-25
- News section or job board: usually maxPages: 20-40
- Large sitemap or directory: start with maxPages: 25 and narrow with includeUrlPatterns
- Turn on renderJavaScript only for sites that clearly depend on client-side rendering
If you are launching this as a scheduled production task, start conservative, review the first few runs, and only increase page limits when the extra items are worth the extra cost.
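As a rough budgeting aid, the listed starting price of $2.00 per 1,000 results gives an upper bound on result charges for a schedule. The sketch below assumes every dataset item counts as one billed result and that maxItems is reached on every run; actual charges may differ, and any other platform costs are not modeled. The function names are hypothetical.

```python
def estimate_result_cost(results, price_per_1000=2.00):
    """Cost in USD for a number of results at a per-1,000 price."""
    return results / 1000 * price_per_1000


def worst_case_monthly_cost(max_items, runs_per_day, days=30, price_per_1000=2.00):
    """Upper bound assuming every run returns max_items results."""
    total_results = max_items * runs_per_day * days
    return estimate_result_cost(total_results, price_per_1000)


# One run capped at 100 items costs at most $0.20 in results.
print(f"${estimate_result_cost(100):.2f}")
# Hourly runs at that cap add up to at most $144.00 over 30 days.
print(f"${worst_case_monthly_cost(100, runs_per_day=24):.2f}")
```

Running the numbers this way makes it easier to see when a lower maxItems or a less frequent schedule is the better lever.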
Limitations
- Works on public pages only
- Does not log in to websites
- Does not perform screenshot comparison
- Does not generate AI summaries
- Does not crawl deeply by default
- Some complex websites may require custom selectors
Troubleshooting
- Empty dataset: try mode: "customSelectors" with manual selectors
- Missing summaries: the source page may not expose descriptions or excerpts
- JavaScript-heavy site: enable renderJavaScript
- Large sitemap: reduce maxPages or use includeUrlPatterns
- LLM fallback not running: check that OPENROUTER_API_KEY is set as a secret environment variable
Launch Checklist
Before going live with a scheduled task:
- Run the Actor once with mode: "auto" and confirm the dataset items look correct.
- Check RSS_XML, JSON_FEED, and RUN_SUMMARY in the key-value store.
- Make sure maxPages and maxItems are low enough for the budget you want.
- Add includeUrlPatterns if the site is broad and you only want one section.
- If the page is JavaScript-heavy, enable renderJavaScript.
- If extraction is weak, switch to customSelectors before reaching for AI fallback.
- Schedule the task and review the second run to confirm isNew and isChanged behave as expected.
Python API Example
Use the Apify API from Python with apify_client:
import os

from apify_client import ApifyClient

client = ApifyClient(os.environ["APIFY_API_TOKEN"])

run_input = {
    "startUrls": [{"url": "https://example.com/blog"}],
    "mode": "auto",
    "maxPages": 25,
    "maxItems": 100,
}

run = client.actor("TheScrapeLab/any-website-to-rss-feed").call(run_input=run_input)

dataset = client.dataset(run["defaultDatasetId"])
store = client.key_value_store(run["defaultKeyValueStoreId"])

items = list(dataset.iterate_items())
rss_xml = store.get_record("RSS_XML")["value"]
json_feed = store.get_record("JSON_FEED")["value"]
run_summary = store.get_record("RUN_SUMMARY")["value"]

print(f"Run ID: {run['id']}")
print(f"Items found: {len(items)}")
print(f"New items: {run_summary['newItems']}")
print(f"Changed items: {run_summary['changedItems']}")
print(items[0]["title"] if items else "No items found")
print(rss_xml[:200])
print(json_feed["title"])
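Once the items are loaded, a typical next step is to separate new and changed items before sending alerts. The helper below is a small sketch that relies only on the isNew and isChanged fields shown in the dataset example; `split_by_status` and `format_alert` are hypothetical names, and the alert line format is an arbitrary choice.

```python
def split_by_status(items):
    """Group dataset items into new, changed, and unchanged buckets."""
    buckets = {"new": [], "changed": [], "unchanged": []}
    for item in items:
        if item.get("isNew"):
            buckets["new"].append(item)
        elif item.get("isChanged"):
            buckets["changed"].append(item)
        else:
            buckets["unchanged"].append(item)
    return buckets


def format_alert(item):
    """Render one item as a short line for chat or email notifications."""
    title = item.get("title") or "(untitled)"
    return f"{title}: {item['url']}"


sample_items = [
    {"title": "Frontend Engineer", "url": "https://example.com/jobs/frontend-engineer",
     "isNew": True, "isChanged": False},
    {"title": "Backend Engineer", "url": "https://example.com/jobs/backend-engineer",
     "isNew": False, "isChanged": True},
]
for item in split_by_status(sample_items)["new"]:
    print(format_alert(item))
```

In a real script, `sample_items` would be replaced by the `items` list read from the run's dataset.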
Who This Actor Is For
This website to RSS feed generator is useful for marketers, growth teams, recruiters, publishers, analysts, developers, automation builders, and anyone who needs reliable feed data from public websites.