Pricing

from $0.60 / 1,000 results

Convert Any Website to RSS Feed

Turn blogs, news pages, job boards, product listings, directories, and sitemaps into RSS feeds, JSON Feed, and structured datasets with change detection.

Developer

Inus Grobler

Last modified: a month ago

Any Website to RSS Feed

At a glance: the Actor converts public websites into RSS, JSON Feed, and dataset items. Input is one or more public start URLs plus optional selectors. Output is a set of feed item rows plus RSS_XML, JSON_FEED, and RUN_SUMMARY records. Typical use cases are monitoring and automation. Limitations, troubleshooting, and pricing notes are covered below.

Any Website to RSS Feed converts public blogs, news sections, job boards, product listings, directories, category pages, and sitemap-backed websites into RSS feeds, JSON Feed files, and structured Apify dataset items.

Use it when a site has no useful RSS feed, when you need a normalized feed from several websites, or when you want scheduled monitoring for new and changed content.

Main Use Cases

Create an RSS feed from a website that does not publish one
Convert blogs, newsroom pages, job boards, product grids, and directories into feed items
Monitor websites for new or changed posts, jobs, listings, products, or pages
Export structured content to automation tools, dashboards, newsletters, or data pipelines
Generate both RSS 2.0 and JSON Feed output from the same run
Use custom CSS selectors when automatic extraction needs help

What It Extracts

Each dataset item can include:

Title
URL and canonical URL
Summary or excerpt, when available
Published and updated dates, when available
Image URL, when available
Source page and source website
Content hash for change detection
New/changed status across repeated runs
Discovery method, extraction method, confidence score, and scrape timestamp
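
The content hash and new/changed status can be pictured with a small sketch. This is not the Actor's implementation: the set of hashed fields, the use of the item URL as the state key, and the helper names content_hash and classify are all assumptions made for illustration.

```python
import hashlib
import json

# Assumption: these are the fields whose changes should count as "changed".
HASHED_FIELDS = ("title", "url", "summary", "imageUrl")


def content_hash(item):
    """Return a stable SHA-256 fingerprint of the item's visible content."""
    payload = {key: item.get(key) for key in HASHED_FIELDS}
    encoded = json.dumps(payload, sort_keys=True, ensure_ascii=False).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def classify(item, previous_hashes):
    """Compare the item against hashes from earlier runs, keyed by URL."""
    new_hash = content_hash(item)
    old_hash = previous_hashes.get(item["url"])
    return {
        **item,
        "contentHash": new_hash,
        "previousHash": old_hash,
        "isNew": old_hash is None,
        "isChanged": old_hash is not None and old_hash != new_hash,
    }


# Simulate two runs against the same URL with an edited title.
state = {}
url = "https://example.com/jobs/frontend-engineer"
first = classify({"title": "Frontend Engineer", "url": url}, state)
state[url] = first["contentHash"]
second = classify({"title": "Senior Frontend Engineer", "url": url}, state)

print(first["isNew"], first["isChanged"])    # True False
print(second["isNew"], second["isChanged"])  # False True
```

Serializing with sorted keys keeps the fingerprint stable when field order differs between runs, which is the property a change detector needs.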

The Actor also stores:

RSS_XML: generated RSS 2.0 feed
JSON_FEED: generated JSON Feed
RUN_SUMMARY: totals, failed URLs, and extraction flags
DEBUG_REPORT: optional troubleshooting details when debug mode is enabled

How It Works

For each start URL, the Actor uses a cost-conscious extraction order:

Look for existing RSS, Atom, and JSON feeds.
Check common feed URLs such as /feed, /rss.xml, and /atom.xml.
Extract repeated listing cards from static HTML.
Read structured metadata such as JSON-LD and Open Graph tags.
Follow selected same-site detail pages when page limits allow it.
Use browser rendering only when you enable it.
Use AI selector discovery only when you enable it.

High-confidence feeds are accepted without extra fallback work, which keeps routine feed-backed runs cheap.
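
The first two steps can be sketched with the standard library alone. This is a rough illustration under stated assumptions, not the Actor's code: the FeedLinkParser class and candidate_feed_urls function are hypothetical names, the list of feed MIME types is an assumption, and the source does not say whether common paths are probed at the site root (as done here) or relative to the start URL.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumption: MIME types commonly used to advertise RSS, Atom, and JSON Feed.
FEED_TYPES = {"application/rss+xml", "application/atom+xml", "application/feed+json"}
# Common feed paths named in the steps above.
COMMON_FEED_PATHS = ("/feed", "/rss.xml", "/atom.xml")


class FeedLinkParser(HTMLParser):
    """Collect href values from <link rel="alternate"> tags that point at feeds."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower().split()
        link_type = (attributes.get("type") or "").lower()
        if "alternate" in rel and link_type in FEED_TYPES and attributes.get("href"):
            self.hrefs.append(attributes["href"])


def candidate_feed_urls(page_url, html):
    """Return declared feed links first, then common paths, without duplicates."""
    parser = FeedLinkParser()
    parser.feed(html)
    declared = [urljoin(page_url, href) for href in parser.hrefs]
    guessed = [urljoin(page_url, path) for path in COMMON_FEED_PATHS]
    ordered = []
    for url in declared + guessed:
        if url not in ordered:
            ordered.append(url)
    return ordered


html = '<head><link rel="alternate" type="application/rss+xml" href="/blog/index.xml"></head>'
for url in candidate_feed_urls("https://example.com/blog", html):
    print(url)
```

Trying declared links before guessed paths mirrors the cost-conscious order: a feed the page advertises is the cheapest high-confidence source.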

Input Configuration

Most users only need four settings:

startUrls: The public website pages to convert into feed items.
preset: How deep the Actor should scan.
maxItems: The maximum number of feed items to return.
maxPages: The maximum number of pages to fetch.

Leave the other defaults alone for the cheapest first run.

Required

startUrls: One or more public web pages to convert into feed items.

Basic Options

preset: quick, balanced, deep, or javascript.
maxPages: Maximum pages to fetch. Default is 25.
maxItems: Maximum feed items to return. Default is 100.
includeUrlPatterns: Keep only item URLs matching these text or regex-like patterns.
excludeUrlPatterns: Drop item URLs matching these patterns.
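
To reason about how include and exclude patterns interact, a minimal sketch may help. It assumes each pattern is applied as a regular-expression search over the full URL and that exclusions win over inclusions; the Actor's exact matching rules are not documented here, and keep_url is a hypothetical helper.

```python
import re


def keep_url(url, include_patterns=(), exclude_patterns=()):
    """Return True when the URL passes the include/exclude filters."""
    # Assumption: an exclude match always drops the URL.
    if any(re.search(pattern, url) for pattern in exclude_patterns):
        return False
    # With no include patterns, every non-excluded URL is kept.
    if not include_patterns:
        return True
    return any(re.search(pattern, url) for pattern in include_patterns)


urls = [
    "https://example.com/jobs/frontend-engineer",
    "https://example.com/jobs/archive/2019",
    "https://example.com/about",
]
kept = [u for u in urls if keep_url(u, ["/jobs/"], ["/archive/"])]
print(kept)  # ['https://example.com/jobs/frontend-engineer']
```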

Advanced Options

customSelectors: Manual CSS selectors for pages where automatic extraction needs help.

Older JSON/API inputs can still use advanced fields such as:

mode: auto, rssDiscoveryOnly, sitemap, pageList, or customSelectors.
crawlDepth: Internal link depth for page-list and sitemap modes. Default is 1.
renderJavaScript: Use browser rendering for JavaScript-heavy pages. Keep off unless needed.
playwrightFallback: Try one browser fallback if static extraction finds no useful items. Default is off.
useLLM: Optional AI selector discovery. Default is off.
changedItemPolicy: Include all items, exclude changed items, or output only changed items. On a first run, onlyChanged usually returns no dataset rows because there is no previous state.
stateMaxItems: Maximum historical item fingerprints to keep per start URL. Default is 10000.

Static and feed-backed runs are designed for low memory. Browser-rendered runs should be launched with 1024 MB memory.

Example Input

{
  "startUrls": [
    { "url": "https://example.com/blog" }
  ],
  "preset": "balanced",
  "maxItems": 100,
  "maxPages": 25
}

Listing Page Example

{
  "startUrls": [
    { "url": "https://example.com/jobs" }
  ],
  "preset": "balanced",
  "maxPages": 25,
  "maxItems": 100,
  "includeUrlPatterns": ["/jobs/"]
}

Custom Selectors Example

{
  "startUrls": [
    { "url": "https://example.com/products" }
  ],
  "preset": "balanced",
  "customSelectors": {
    "itemSelector": ".product-card",
    "titleSelector": ".product-title",
    "urlSelector": "a",
    "summarySelector": ".product-summary",
    "imageSelector": "img"
  }
}

Example Output

{
  "title": "Frontend Engineer",
  "url": "https://example.com/jobs/frontend-engineer",
  "canonicalUrl": "https://example.com/jobs/frontend-engineer",
  "summary": "Frontend role for a fast-growing product team.",
  "publishedAt": "2026-05-13T09:00:00.000Z",
  "updatedAt": null,
  "imageUrl": "https://example.com/images/frontend.png",
  "sourcePageUrl": "https://example.com/jobs",
  "sourceSite": "https://example.com",
  "contentHash": "sha256...",
  "isNew": true,
  "isChanged": false,
  "previousHash": null,
  "discoveryMethod": "html_repeated",
  "extractionMethod": "static_html",
  "confidence": 0.86,
  "scrapedAt": "2026-05-13T12:00:00.000Z"
}
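
For readers who want to build their own feed from dataset rows, the sketch below maps one item to an RSS 2.0 item element. The field mapping is an assumption for illustration; the Actor's own RSS_XML record may be generated differently, and item_to_rss_element is a hypothetical helper.

```python
import xml.etree.ElementTree as ET
from datetime import datetime
from email.utils import format_datetime


def item_to_rss_element(item):
    """Build an RSS 2.0 <item> element from one dataset row."""
    element = ET.Element("item")
    ET.SubElement(element, "title").text = item["title"]
    ET.SubElement(element, "link").text = item["url"]
    # Assumption: the canonical URL is a reasonable permanent identifier.
    guid = ET.SubElement(element, "guid", isPermaLink="true")
    guid.text = item.get("canonicalUrl") or item["url"]
    if item.get("summary"):
        ET.SubElement(element, "description").text = item["summary"]
    if item.get("publishedAt"):
        # RSS 2.0 expects RFC 822 style dates, so convert from ISO 8601.
        published = datetime.fromisoformat(item["publishedAt"].replace("Z", "+00:00"))
        ET.SubElement(element, "pubDate").text = format_datetime(published)
    return element


row = {
    "title": "Frontend Engineer",
    "url": "https://example.com/jobs/frontend-engineer",
    "canonicalUrl": "https://example.com/jobs/frontend-engineer",
    "summary": "Frontend role for a fast-growing product team.",
    "publishedAt": "2026-05-13T09:00:00.000Z",
}
rss_item = item_to_rss_element(row)
print(ET.tostring(rss_item, encoding="unicode"))
```

Optional fields such as summary and publishedAt are skipped when missing, which matches the "when available" wording in the field list.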

How to Run on Apify

Open the Actor in Apify Console.
Add one or more startUrls.
Keep mode as auto for the first run.
Set maxPages and maxItems to a small number for testing.
Run the Actor.
Review the dataset and the RSS_XML, JSON_FEED, and RUN_SUMMARY records.
Increase limits only when the extra results are useful.

Exporting Results

After a run finishes:

Download dataset rows as JSON, CSV, Excel, XML, RSS, or HTML from the Dataset tab.
Open the Key-value store to copy or download RSS_XML and JSON_FEED.
Use the API to fetch dataset items and feed records programmatically.
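
For programmatic access without the client library, the sketch below builds the Apify API URLs for dataset items and key-value store records. The IDs are placeholders, and the helper names dataset_items_url and record_url are hypothetical; requests for non-public storage would also need an API token, which is left out here.

```python
from urllib.parse import quote, urlencode

API_BASE = "https://api.apify.com/v2"


def dataset_items_url(dataset_id, fmt="json", limit=None):
    """Return the URL that lists dataset items in the requested format."""
    params = {"format": fmt}
    if limit is not None:
        params["limit"] = limit
    return f"{API_BASE}/datasets/{quote(dataset_id, safe='')}/items?{urlencode(params)}"


def record_url(store_id, key):
    """Return the URL of one key-value store record, such as RSS_XML."""
    return f"{API_BASE}/key-value-stores/{quote(store_id, safe='')}/records/{quote(key, safe='')}"


# Placeholder IDs; real values come from the run's defaultDatasetId
# and defaultKeyValueStoreId fields.
print(dataset_items_url("DATASET_ID", fmt="csv", limit=10))
print(record_url("STORE_ID", "RSS_XML"))
```

The second URL is the kind of address a feed reader could poll, provided the store is readable by that reader.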

Pricing and Resource Tips

Recommended pricing model: pay per dataset result.

Measured test runs showed low static-run platform cost at 256 MB, with browser rendering reserved for opt-in 1024 MB runs. For most users, a simple result-based price is easier to understand than charging separately for every internal fetch.

Recommended Store pricing:

Event: apify-default-dataset-item
User-facing event title: result
Suggested price: $0.001 per result for FREE/BRONZE users
Suggested discounts: $0.0008 for SILVER and $0.0006 for GOLD and higher
Keep the tiny Actor-start event only if needed to discourage empty spam runs
Include platform usage in the event price rather than asking users to reason about memory and compute units
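
As a worked example of the suggested per-result prices, the sketch below estimates the charge for a run. It uses only the suggested figures listed above, ignores any Actor-start event, and is not a statement of the live Store price; estimate_cost is a hypothetical helper.

```python
from decimal import Decimal

# Suggested prices per dataset result, in USD, taken from the list above.
PRICE_PER_RESULT = {
    "FREE": Decimal("0.001"),
    "BRONZE": Decimal("0.001"),
    "SILVER": Decimal("0.0008"),
    "GOLD": Decimal("0.0006"),
}


def estimate_cost(results, tier="FREE"):
    """Return the estimated charge in USD for a number of dataset results."""
    return PRICE_PER_RESULT[tier] * results


for tier in ("FREE", "SILVER", "GOLD"):
    print(tier, estimate_cost(1000, tier))
```

At the GOLD rate, 1,000 results come to $0.60, which lines up with the "from $0.60 / 1,000 results" figure quoted at the top of the page.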

Cost control tips:

Use 256 MB for normal feed/static runs.
Use 1024 MB when renderJavaScript is enabled.
Keep renderJavaScript, playwrightFallback, and useLLM off unless needed.
Use includeUrlPatterns for broad websites.
Start with maxPages: 10-25 and increase gradually.
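
The memory tips can be turned into a small helper that picks a memory size from the run input. It assumes that the javascript preset and the playwrightFallback option may also launch a browser, which the text does not state outright; recommended_memory_mbytes is a hypothetical name.

```python
# Assumption: any of these input flags can cause a browser to be launched.
BROWSER_FLAGS = ("renderJavaScript", "playwrightFallback")


def recommended_memory_mbytes(run_input):
    """Return 1024 for browser-rendered runs and 256 for feed/static runs."""
    uses_browser = run_input.get("preset") == "javascript" or any(
        run_input.get(flag) for flag in BROWSER_FLAGS
    )
    return 1024 if uses_browser else 256


static_input = {"startUrls": [{"url": "https://example.com/blog"}], "preset": "balanced"}
browser_input = {"startUrls": [{"url": "https://example.com/app"}], "renderJavaScript": True}

print(recommended_memory_mbytes(static_input))   # 256
print(recommended_memory_mbytes(browser_input))  # 1024
```

The returned value could then be supplied as the memory setting when starting a run, for example in the Console run options.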

Limits and Caveats

Works on public pages only.
Does not log in to websites.
Does not bypass paywalls or private content.
Does not guarantee every website can be converted automatically.
JavaScript-heavy websites may need browser rendering.
Some unusual layouts may need custom CSS selectors.
Very broad websites should be narrowed with URL patterns and page limits.
AI selector discovery uses an external AI provider only when you explicitly enable it.

Troubleshooting

Empty dataset: try pageList mode or add custom selectors.
Wrong section extracted: add includeUrlPatterns or excludeUrlPatterns.
Missing summaries: the source page may not expose excerpts; increase maxPages if detail-page enrichment is worth the cost.
JavaScript-heavy page: enable renderJavaScript and run with 1024 MB memory.
Too many irrelevant links: lower crawlDepth or switch to customSelectors.
Scheduled run shows no new items: check changedItemPolicy and the state store name.

Python API Example

import os
from apify_client import ApifyClient

# Read the API token from the environment instead of hard-coding it.
client = ApifyClient(os.environ["APIFY_TOKEN"])

run_input = {
    "startUrls": [{"url": "https://example.com/blog"}],
    "mode": "auto",
    "maxPages": 25,
    "maxItems": 100,
}

# Start the Actor and wait for the run to finish.
run = client.actor("thescrapelab/any-website-to-rss-feed").call(run_input=run_input)

dataset = client.dataset(run["defaultDatasetId"])
store = client.key_value_store(run["defaultKeyValueStoreId"])


def record_value(key):
    """Return the stored value for a key, or None when the record is missing."""
    record = store.get_record(key)
    return record["value"] if record else None


items = list(dataset.iterate_items())
rss_xml = record_value("RSS_XML")
json_feed = record_value("JSON_FEED")
summary = record_value("RUN_SUMMARY") or {}

print(f"Run ID: {run['id']}")
print(f"Items: {len(items)}")
print(f"New items: {summary.get('newItems')}")
print(f"Changed items: {summary.get('changedItems')}")
print(items[0]["title"] if items else "No items found")
print(rss_xml[:200] if rss_xml else "No RSS_XML record found")
print(json_feed["title"] if json_feed else "No JSON_FEED record found")

FAQ

Can I create an RSS feed from any website?

You can create feeds from many public websites, especially blogs, news pages, listings, directories, and sitemap-backed sites. Some complex or protected websites may need custom selectors or may not be suitable.

Does this Actor find existing RSS feeds?

Yes. It checks feed links on the page and common RSS, Atom, and JSON Feed paths before doing page extraction.

Can it monitor job boards and product listings?

Yes. Use pageList mode with includeUrlPatterns that match the job or product URLs.

Does it detect changed pages?

Yes. It stores content fingerprints in a state store and marks each item as new, changed, or unchanged on later runs.

Should I enable browser rendering?

Only when static extraction does not see the content. Browser rendering costs more and should use 1024 MB memory.

Does it use AI?

Not by default. AI selector discovery is optional and only runs when you set useLLM to a non-off value.

How do I reduce cost?

Use the default static mode, keep browser and AI options off, lower maxPages, use URL patterns, and start at 256 MB memory.

Can I use the RSS output directly?

Yes. After the run, open the RSS_XML record in the key-value store, then use either the record's URL or its content in your feed reader or automation.

Suggested Keywords

website to RSS feed, RSS feed generator, create RSS from website, blog to RSS, news RSS scraper, job board monitoring, product listing monitor, sitemap to RSS, JSON Feed generator, Apify RSS scraper.
