Convert Any Website to RSS Feed
Pricing
from $0.60 / 1,000 results
Turn blogs, news pages, job boards, product listings, directories, and sitemaps into RSS feeds, JSON Feed, and structured datasets with change detection.
Developer: Inus Grobler
Maintained by Community
Any Website to RSS Feed
At a glance: the Actor converts public websites into RSS, JSON Feed, and dataset items. Input is one or more public start URLs plus optional selectors. Output is a set of feed item rows plus RSS_XML, JSON_FEED, and RUN_SUMMARY records. Typical use cases are monitoring and automation. Limitations, troubleshooting, and pricing notes are covered below.
Any Website to RSS Feed converts public blogs, news sections, job boards, product listings, directories, category pages, and sitemap-backed websites into RSS feeds, JSON Feed files, and structured Apify dataset items.
Use it when a site has no useful RSS feed, when you need a normalized feed from several websites, or when you want scheduled monitoring for new and changed content.
Main Use Cases
- Create an RSS feed from a website that does not publish one
- Convert blogs, newsroom pages, job boards, product grids, and directories into feed items
- Monitor websites for new or changed posts, jobs, listings, products, or pages
- Export structured content to automation tools, dashboards, newsletters, or data pipelines
- Generate both RSS 2.0 and JSON Feed output from the same run
- Use custom CSS selectors when automatic extraction needs help
What It Extracts
Each dataset item can include:
- Title
- URL and canonical URL
- Summary or excerpt, when available
- Published and updated dates, when available
- Image URL, when available
- Source page and source website
- Content hash for change detection
- New/changed status across repeated runs
- Discovery method, extraction method, confidence score, and scrape timestamp
The Actor also stores:
- RSS_XML: generated RSS 2.0 feed
- JSON_FEED: generated JSON Feed
- RUN_SUMMARY: totals, failed URLs, and extraction flags
- DEBUG_REPORT: optional troubleshooting details when debug mode is enabled
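The content hash and new/changed status listed above could work roughly as in the following sketch. It is an illustration only: the choice of hashed fields, the `content_hash` and `classify` helpers, and the URL-keyed state dictionary are assumptions, not the Actor's actual implementation.

```python
import hashlib
import json


def content_hash(item: dict) -> str:
    """Hash the fields that define an item's content (hypothetical field choice)."""
    payload = {key: item.get(key) for key in ("title", "url", "summary", "imageUrl")}
    # sort_keys gives a stable serialization, so equal content yields equal hashes.
    encoded = json.dumps(payload, sort_keys=True, ensure_ascii=False).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def classify(item: dict, previous_hashes: dict) -> dict:
    """Compare an item against hashes from an earlier run, keyed by item URL."""
    new_hash = content_hash(item)
    old_hash = previous_hashes.get(item["url"])
    return {
        **item,
        "contentHash": new_hash,
        "previousHash": old_hash,
        "isNew": old_hash is None,
        "isChanged": old_hash is not None and old_hash != new_hash,
    }


# First run: no stored state, so the item is new.
first = classify(
    {"title": "Frontend Engineer", "url": "https://example.com/jobs/frontend-engineer"},
    {},
)
# Later run: same URL, edited title, so the item is changed rather than new.
state = {first["url"]: first["contentHash"]}
second = classify({"title": "Senior Frontend Engineer", "url": first["url"]}, state)
```

Keying the state by URL means a renamed URL shows up as a new item, which is one reason a canonical URL is useful alongside the raw URL.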
How It Works
For each start URL, the Actor uses a cost-conscious extraction order:
- Look for existing RSS, Atom, and JSON feeds.
- Check common feed URLs such as /feed, /rss.xml, and /atom.xml.
- Extract repeated listing cards from static HTML.
- Read structured metadata such as JSON-LD and Open Graph tags.
- Follow selected same-site detail pages when page limits allow it.
- Use browser rendering only when you enable it.
- Use AI selector discovery only when you enable it.
High-confidence feeds are accepted without extra fallback work, which keeps routine feed-backed runs cheap.
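The first two steps of that order, feed autodiscovery and common feed paths, can be sketched with the standard library alone. This is a minimal illustration under assumptions: the `FeedLinkParser` class, the `candidate_feed_urls` helper, and the exact set of MIME types are hypothetical, and the sketch only builds candidate URLs without fetching them.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# MIME types commonly used to advertise RSS, Atom, and JSON Feed (assumed set).
FEED_TYPES = {
    "application/rss+xml",
    "application/atom+xml",
    "application/feed+json",
    "application/json",
}
# Common feed paths named in the list above.
COMMON_FEED_PATHS = ("/feed", "/rss.xml", "/atom.xml")


class FeedLinkParser(HTMLParser):
    """Collect href values from <link rel="alternate"> tags that point at feeds."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower().split()
        mime = (attributes.get("type") or "").lower()
        href = attributes.get("href")
        if "alternate" in rel and mime in FEED_TYPES and href:
            self.hrefs.append(href)


def candidate_feed_urls(page_url: str, html: str) -> list:
    """Return declared feed links first, then guessed common paths, without duplicates."""
    parser = FeedLinkParser()
    parser.feed(html)
    declared = [urljoin(page_url, href) for href in parser.hrefs]
    guessed = [urljoin(page_url, path) for path in COMMON_FEED_PATHS]
    # dict.fromkeys removes duplicates while preserving order.
    return list(dict.fromkeys(declared + guessed))


sample_html = '<head><link rel="alternate" type="application/rss+xml" href="/blog/feed.xml"></head>'
candidates = candidate_feed_urls("https://example.com/blog", sample_html)
```

Putting declared links ahead of guessed paths reflects the cost-conscious idea: a feed the site advertises is the cheapest and most reliable source to try first.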
Input Configuration
Most users only need four settings:
- startUrls: The public website pages to convert into feed items.
- preset: How deep the Actor should scan.
- maxItems: The maximum number of feed items to return.
- maxPages: The maximum number of pages to fetch.
Leave the other defaults alone for the cheapest first run.
Required
- startUrls: One or more public web pages to convert into feed items.
Basic Options
- preset: quick, balanced, deep, or javascript.
- maxPages: Maximum pages to fetch. Default is 25.
- maxItems: Maximum feed items to return. Default is 100.
- includeUrlPatterns: Keep only item URLs matching these text or regex-like patterns.
- excludeUrlPatterns: Drop item URLs matching these patterns.
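To make the include and exclude patterns concrete, here is a small sketch of how such filtering could behave. It assumes patterns are treated as regular expressions searched anywhere in the URL and that exclusions win over inclusions; the `keep_url` helper is hypothetical and the Actor's real matching rules may differ.

```python
import re


def keep_url(url: str, include: list, exclude: list) -> bool:
    """Decide whether an item URL survives the include/exclude filters."""
    # Assumption: an exclude match always drops the URL.
    if any(re.search(pattern, url) for pattern in exclude):
        return False
    # An empty include list keeps everything that was not excluded.
    return not include or any(re.search(pattern, url) for pattern in include)


urls = [
    "https://example.com/jobs/frontend-engineer",
    "https://example.com/jobs/archive/2019",
    "https://example.com/about",
]
kept = [u for u in urls if keep_url(u, include=["/jobs/"], exclude=["/archive/"])]
```

Plain text such as `/jobs/` works as a pattern here because it contains no special regex characters; patterns with dots or question marks would need escaping under this assumption.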
Advanced Options
customSelectors: Manual CSS selectors for pages where automatic extraction needs help.
Older JSON/API inputs can still use advanced fields such as:
- mode: auto, rssDiscoveryOnly, sitemap, pageList, or customSelectors.
- crawlDepth: Internal link depth for page-list and sitemap modes. Default is 1.
- renderJavaScript: Use browser rendering for JavaScript-heavy pages. Keep off unless needed.
- playwrightFallback: Try one browser fallback if static extraction finds no useful items. Default is off.
- useLLM: Optional AI selector discovery. Default is off.
- changedItemPolicy: Include all items, exclude changed items, or output only changed items. On a first run, onlyChanged usually returns no dataset rows because there is no previous state.
- stateMaxItems: Maximum historical item fingerprints to keep per start URL. Default is 10000.
Static and feed-backed runs are designed for low memory. Browser-rendered runs should be launched with 1024 MB memory.
Example Input
    {
      "startUrls": [{ "url": "https://example.com/blog" }],
      "preset": "balanced",
      "maxItems": 100,
      "maxPages": 25
    }
Listing Page Example
    {
      "startUrls": [{ "url": "https://example.com/jobs" }],
      "preset": "balanced",
      "maxPages": 25,
      "maxItems": 100,
      "includeUrlPatterns": ["/jobs/"]
    }
Custom Selectors Example
    {
      "startUrls": [{ "url": "https://example.com/products" }],
      "preset": "balanced",
      "customSelectors": {
        "itemSelector": ".product-card",
        "titleSelector": ".product-title",
        "urlSelector": "a",
        "summarySelector": ".product-summary",
        "imageSelector": "img"
      }
    }
Example Output
    {
      "title": "Frontend Engineer",
      "url": "https://example.com/jobs/frontend-engineer",
      "canonicalUrl": "https://example.com/jobs/frontend-engineer",
      "summary": "Frontend role for a fast-growing product team.",
      "publishedAt": "2026-05-13T09:00:00.000Z",
      "updatedAt": null,
      "imageUrl": "https://example.com/images/frontend.png",
      "sourcePageUrl": "https://example.com/jobs",
      "sourceSite": "https://example.com",
      "contentHash": "sha256...",
      "isNew": true,
      "isChanged": false,
      "previousHash": null,
      "discoveryMethod": "html_repeated",
      "extractionMethod": "static_html",
      "confidence": 0.86,
      "scrapedAt": "2026-05-13T12:00:00.000Z"
    }
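For readers who want to build their own feed from dataset rows, the sketch below maps one item of the shape shown above onto an RSS 2.0 `<item>` element. The field-to-element mapping and the `item_to_rss` helper are assumptions for illustration; the RSS_XML record the Actor generates may be structured differently.

```python
import xml.etree.ElementTree as ET
from datetime import datetime
from email.utils import format_datetime


def item_to_rss(item: dict) -> ET.Element:
    """Build an RSS 2.0 <item> element from a dataset item (assumed mapping)."""
    node = ET.Element("item")
    ET.SubElement(node, "title").text = item["title"]
    ET.SubElement(node, "link").text = item["url"]
    guid = ET.SubElement(node, "guid", isPermaLink="true")
    guid.text = item.get("canonicalUrl") or item["url"]
    if item.get("summary"):
        ET.SubElement(node, "description").text = item["summary"]
    if item.get("publishedAt"):
        # RSS 2.0 uses RFC 822 style dates; the dataset uses ISO 8601 with a Z suffix.
        published = datetime.fromisoformat(item["publishedAt"].replace("Z", "+00:00"))
        ET.SubElement(node, "pubDate").text = format_datetime(published)
    return node


example_item = {
    "title": "Frontend Engineer",
    "url": "https://example.com/jobs/frontend-engineer",
    "canonicalUrl": "https://example.com/jobs/frontend-engineer",
    "summary": "Frontend role for a fast-growing product team.",
    "publishedAt": "2026-05-13T09:00:00.000Z",
}
node = item_to_rss(example_item)
xml_text = ET.tostring(node, encoding="unicode")
```

Optional fields such as the summary and publication date are only emitted when present, which matches the "when available" wording in the extraction list.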
How to Run on Apify
- Open the Actor in Apify Console.
- Add one or more startUrls.
- Keep mode as auto for the first run.
- Set maxPages and maxItems to a small number for testing.
- Run the Actor.
- Review the dataset and the RSS_XML, JSON_FEED, and RUN_SUMMARY records.
- Increase limits only when the extra results are useful.
Exporting Results
After a run finishes:
- Download dataset rows as JSON, CSV, Excel, XML, RSS, or HTML from the Dataset tab.
- Open the Key-value store to copy or download RSS_XML and JSON_FEED.
- Use the API to fetch dataset items and feed records programmatically.
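For programmatic access, the two helpers below build URLs for the Apify API v2 dataset-items and key-value-store record endpoints. Treat this as a sketch: the helper names and the placeholder IDs are hypothetical, the real IDs come from a finished run, and depending on storage access settings a request may also need an API token.

```python
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"


def dataset_items_url(dataset_id: str, fmt: str = "json", limit=None) -> str:
    """URL for downloading dataset rows in a given export format."""
    params = {"format": fmt}
    if limit is not None:
        params["limit"] = limit
    return f"{API_BASE}/datasets/{dataset_id}/items?{urlencode(params)}"


def record_url(store_id: str, key: str) -> str:
    """URL for a single key-value store record such as RSS_XML or JSON_FEED."""
    return f"{API_BASE}/key-value-stores/{store_id}/records/{key}"


# Placeholder IDs for illustration only.
csv_url = dataset_items_url("DATASET_ID", fmt="csv", limit=10)
rss_url = record_url("STORE_ID", "RSS_XML")
```

A record URL of this shape is what a feed reader would subscribe to, provided the store is reachable without interactive login.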
Pricing and Resource Tips
Recommended pricing model: pay per dataset result.
Measured test runs showed low static-run platform cost at 256 MB, with browser rendering reserved for opt-in 1024 MB runs. For most users, a simple result-based price is easier to understand than charging separately for every internal fetch.
Recommended Store pricing:
- Event: apify-default-dataset-item
- User-facing event title: result
- Suggested price: $0.001 per result for FREE/BRONZE users
- Suggested discounts: $0.0008 for SILVER and $0.0006 for GOLD and higher
- Keep the tiny Actor-start event only if needed to discourage empty spam runs
- Include platform usage in the event price rather than asking users to reason about memory and compute units
Cost control tips:
- Use 256 MB for normal feed/static runs.
- Use 1024 MB when renderJavaScript is enabled.
- Keep renderJavaScript, playwrightFallback, and useLLM off unless needed.
- Use includeUrlPatterns for broad websites.
- Start with maxPages: 10-25 and increase gradually.
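As a quick way to reason about run cost, the sketch below multiplies a result count by the suggested per-result prices quoted in this section. It is a rough estimate only: the `estimate_cost` helper is hypothetical, the prices are the suggested figures rather than confirmed live pricing, and any Actor-start event is ignored.

```python
from decimal import Decimal

# Suggested per-result prices from this section, in USD.
PRICE_PER_RESULT = {
    "FREE": Decimal("0.001"),
    "BRONZE": Decimal("0.001"),
    "SILVER": Decimal("0.0008"),
    "GOLD": Decimal("0.0006"),
}


def estimate_cost(results: int, tier: str = "FREE") -> Decimal:
    """Estimated charge in USD for a number of dataset results on a given tier."""
    return PRICE_PER_RESULT[tier] * results


# 1,000 results at the GOLD rate works out to the "from $0.60 / 1,000 results" figure.
gold_per_thousand = estimate_cost(1000, "GOLD")
# A default-sized run capped at maxItems = 100 on the FREE tier.
free_default_run = estimate_cost(100, "FREE")
```

Using Decimal rather than float keeps small per-result prices exact when they are multiplied by large result counts.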
Limits and Caveats
- Works on public pages only.
- Does not log in to websites.
- Does not bypass paywalls or private content.
- Does not guarantee every website can be converted automatically.
- JavaScript-heavy websites may need browser rendering.
- Some unusual layouts may need custom CSS selectors.
- Very broad websites should be narrowed with URL patterns and page limits.
- AI selector discovery uses an external AI provider only when you explicitly enable it.
Troubleshooting
- Empty dataset: try pageList mode or add custom selectors.
- Wrong section extracted: add includeUrlPatterns or excludeUrlPatterns.
- Missing summaries: the source page may not expose excerpts; increase maxPages if detail-page enrichment is worth the cost.
- JavaScript-heavy page: enable renderJavaScript and run with 1024 MB memory.
- Too many irrelevant links: lower crawlDepth or switch to customSelectors.
- Scheduled run shows no new items: check changedItemPolicy and the state store name.
Python API Example
    import os

    from apify_client import ApifyClient

    # Read the API token from the environment instead of hard-coding it.
    client = ApifyClient(os.environ["APIFY_TOKEN"])

    run_input = {
        "startUrls": [{"url": "https://example.com/blog"}],
        "mode": "auto",
        "maxPages": 25,
        "maxItems": 100,
    }

    # Start the Actor and wait for the run to finish.
    run = client.actor("thescrapelab/any-website-to-rss-feed").call(run_input=run_input)

    dataset = client.dataset(run["defaultDatasetId"])
    store = client.key_value_store(run["defaultKeyValueStoreId"])

    # Feed item rows come from the dataset; generated feeds come from the key-value store.
    items = list(dataset.iterate_items())
    rss_xml = store.get_record("RSS_XML")["value"]
    json_feed = store.get_record("JSON_FEED")["value"]
    summary = store.get_record("RUN_SUMMARY")["value"]

    print(f"Run ID: {run['id']}")
    print(f"Items: {len(items)}")
    print(f"New items: {summary['newItems']}")
    print(f"Changed items: {summary['changedItems']}")
    print(items[0]["title"] if items else "No items found")
    print(rss_xml[:200])
    print(json_feed["title"])
FAQ
Can I create an RSS feed from any website?
You can create feeds from many public websites, especially blogs, news pages, listings, directories, and sitemap-backed sites. Some complex or protected websites may need custom selectors or may not be suitable.
Does this Actor find existing RSS feeds?
Yes. It checks feed links on the page and common RSS, Atom, and JSON Feed paths before doing page extraction.
Can it monitor job boards and product listings?
Yes. Use pageList mode with includeUrlPatterns that match the job or product URLs.
Does it detect changed pages?
Yes. It stores content fingerprints in a state store and marks each item as new, changed, or unchanged on later runs.
Should I enable browser rendering?
Only when static extraction does not see the content. Browser rendering costs more and should use 1024 MB memory.
Does it use AI?
Not by default. AI selector discovery is optional and only runs when you set useLLM to a non-off value.
How do I reduce cost?
Use the default static mode, keep browser and AI options off, lower maxPages, use URL patterns, and start at 256 MB memory.
Can I use the RSS output directly?
Yes. Open the RSS_XML key-value store record after the run and use that URL or record content in your feed reader or automation.
Suggested Keywords
website to RSS feed, RSS feed generator, create RSS from website, blog to RSS, news RSS scraper, job board monitoring, product listing monitor, sitemap to RSS, JSON Feed generator, Apify RSS scraper.