News & Announcements to Markdown for RAG

Pricing

from $40.00 / 1,000 markdown chunks

Convert press releases, corporate announcements & news articles into clean, chunked Markdown for RAG and LLM pipelines. Article URLs or RSS feeds. No login.

Rating

0.0

(0)

Developer

NexGenData

Maintained by Community

Actor stats

Bookmarked: 0
Total users: 2
Monthly active users: 1
Last modified: 2 days ago

📰 News & Announcements to Markdown for RAG

Turn press releases, corporate announcements, and news articles into clean, chunked Markdown for RAG and LLM pipelines. Feed it article URLs or RSS/Atom feeds and get LLM-ready text with citations.

⚡ What you get

| Field | Description |
| --- | --- |
| url | Source article URL (citation) |
| title | Article / release title |
| chunkIndex / totalChunks | Position within the article |
| markdown | Clean Markdown chunk |

🎯 Use cases

  1. AI engineers building news/PR RAG copilots
  2. Market & competitive intel feeding event data to an LLM
  3. PR/IR teams building searchable announcement archives
  4. Fintech/research products needing announcement text with citations

🚀 Sample inputs

{ "rssFeeds": ["https://www.prnewswire.com/rss/news-releases-list.rss"], "maxPerFeed": 10 }
{ "urls": ["https://www.businesswire.com/news/home/.../en/..."], "chunkWords": 600 }
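
As a rough sketch of how inputs like these could be assembled and submitted from Python: the field names (`urls`, `rssFeeds`, `maxPerFeed`, `chunkWords`) come from the samples above, while `ACTOR_ID`, `build_run_input`, and `run_actor` are hypothetical names introduced for illustration. The Actor ID is not stated in this listing, so the placeholder must be replaced with the real one from the Store page. `run_actor` assumes the third-party `apify-client` package is installed.

```python
from typing import Any, Iterator, Optional

# Placeholder only: this listing does not state the Actor ID.
ACTOR_ID = "<username>/<actor-name>"


def build_run_input(
    urls: Optional[list[str]] = None,
    rss_feeds: Optional[list[str]] = None,
    max_per_feed: Optional[int] = None,
    chunk_words: Optional[int] = None,
) -> dict[str, Any]:
    """Assemble an input object using the field names from the samples."""
    if not urls and not rss_feeds:
        raise ValueError("provide at least one article URL or RSS/Atom feed")
    run_input: dict[str, Any] = {}
    if urls:
        # dict.fromkeys drops repeated URLs while keeping their order.
        run_input["urls"] = list(dict.fromkeys(urls))
    if rss_feeds:
        run_input["rssFeeds"] = list(dict.fromkeys(rss_feeds))
    # Optional fields are only sent when set, so the Actor's defaults apply.
    if max_per_feed is not None:
        run_input["maxPerFeed"] = max_per_feed
    if chunk_words is not None:
        run_input["chunkWords"] = chunk_words
    return run_input


def run_actor(token: str, run_input: dict[str, Any]) -> Iterator[dict[str, Any]]:
    """Start a run, wait for it to finish, and yield one item per chunk."""
    from apify_client import ApifyClient  # third-party: pip install apify-client

    client = ApifyClient(token)
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    if run is None:
        raise RuntimeError("the Actor run did not return a result")
    yield from client.dataset(run["defaultDatasetId"]).iterate_items()
```

For instance, `build_run_input(rss_feeds=[...], max_per_feed=10)` reproduces the first sample input, and passing the result to `run_actor` with an Apify API token would stream the resulting chunk rows.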

📦 Sample output

{ "url": "https://www.prnewswire.com/news-releases/...", "title": "Acme Raises $50M Series B", "chunkIndex": 0, "totalChunks": 6, "markdown": "# Acme Corp Raises $50M...\n..." }
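
Since each row is one chunk, a downstream pipeline usually regroups rows per article before indexing. The sketch below is one way to do that with the documented fields. It assumes `chunkIndex` is zero-based, which the sample suggests (`chunkIndex` 0 of 6) but the listing does not state outright. The helper names are hypothetical.

```python
from collections import defaultdict
from typing import Any, Iterable


def group_chunks(rows: Iterable[dict[str, Any]]) -> dict[str, list[dict[str, Any]]]:
    """Group chunk rows by source URL, ordered by chunkIndex."""
    by_url: dict[str, list[dict[str, Any]]] = defaultdict(list)
    for row in rows:
        by_url[row["url"]].append(row)
    for chunks in by_url.values():
        chunks.sort(key=lambda r: r["chunkIndex"])
    return dict(by_url)


def missing_chunks(chunks: list[dict[str, Any]]) -> list[int]:
    """Return indices implied by totalChunks that are absent (zero-based assumption)."""
    expected = chunks[0]["totalChunks"]
    seen = {c["chunkIndex"] for c in chunks}
    return [i for i in range(expected) if i not in seen]


def citation_label(row: dict[str, Any]) -> str:
    """Build a human-readable citation string for one chunk."""
    position = f'{row["chunkIndex"] + 1}/{row["totalChunks"]}'
    return f'{row["title"]} [{position}] {row["url"]}'
```

Checking `missing_chunks` before embedding is a cheap guard against indexing an article whose run was cut short.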

🛠 How it works

  1. Source: fetches article URLs directly, or pulls the latest items from RSS/Atom feeds.
  2. Extract: isolates the main article (<article>/<main>) and strips nav, ads, and scripts.
  3. Convert: turns the HTML into ATX-style Markdown (# headings).
  4. Chunk: splits the Markdown into chunks of roughly chunkWords words for embedding.
  5. Schema: emits one row per chunk, with the source URL as the citation.
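
The last two steps (chunk and schema) can be sketched in a few lines of standard-library Python. This is an illustration only: the listing does not document the Actor's actual splitting rule, so the paragraph-packing strategy below is an assumption, and `chunk_markdown` and `to_rows` are hypothetical names.

```python
import re
from typing import Any


def chunk_markdown(markdown: str, chunk_words: int = 800) -> list[str]:
    """Greedily pack paragraphs into chunks of at most chunk_words words."""
    if chunk_words < 1:
        raise ValueError("chunk_words must be positive")
    chunks: list[str] = []
    current: list[str] = []
    count = 0
    for para in re.split(r"\n\s*\n", markdown.strip()):
        words = para.split()
        if not words:
            continue
        # Close the current chunk if this paragraph would overflow it.
        if current and count + len(words) > chunk_words:
            chunks.append("\n\n".join(current))
            current, count = [], 0
        if len(words) > chunk_words:
            # Oversized paragraph: fall back to fixed-size word windows.
            for i in range(0, len(words), chunk_words):
                chunks.append(" ".join(words[i:i + chunk_words]))
            continue
        current.append(para.strip())
        count += len(words)
    if current:
        chunks.append("\n\n".join(current))
    return chunks


def to_rows(url: str, title: str, markdown: str, chunk_words: int = 800) -> list[dict[str, Any]]:
    """Emit one row per chunk in the shape of the output fields above."""
    chunks = chunk_markdown(markdown, chunk_words)
    return [
        {"url": url, "title": title, "chunkIndex": i,
         "totalChunks": len(chunks), "markdown": chunk}
        for i, chunk in enumerate(chunks)
    ]
```

Packing whole paragraphs keeps headings and sentences intact at chunk boundaries, at the cost of chunks that are somewhat shorter than the target on average.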

💰 Pricing Example

Pay-per-event: $0.005 per run + $0.04 per Markdown chunk (document-record).

| Chunks | Cost |
| --- | --- |
| 100 | ~$4.00 |
| 500 | ~$20.00 |
| 2,000 | ~$80.00 |

Apify's $5 free credit covers ~124 chunks. Start free →
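
To budget a run ahead of time, the pay-per-event formula above can be written out directly. This is a minimal sketch using the two prices quoted in this section; the function names are hypothetical, and any platform charges beyond those two events are not modeled.

```python
from decimal import Decimal

RUN_FEE = Decimal("0.005")   # charged once per run
CHUNK_FEE = Decimal("0.04")  # charged per Markdown chunk


def estimate_cost(chunks: int, runs: int = 1) -> Decimal:
    """Total cost in USD for a number of chunks spread over some runs."""
    return RUN_FEE * runs + CHUNK_FEE * chunks


def chunks_within_budget(budget: Decimal, runs: int = 1) -> int:
    """Largest whole number of chunks that fits in the budget."""
    remaining = budget - RUN_FEE * runs
    if remaining < 0:
        return 0
    return int(remaining // CHUNK_FEE)
```

With a single run, `chunks_within_budget(Decimal("5"))` gives 124, matching the free-credit figure above, and `estimate_cost(100)` gives 4.005, which the table rounds to ~$4.00. `Decimal` is used instead of `float` so the cent-level arithmetic stays exact.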

Fetches publicly-accessible articles/feeds with an identified User-Agent. Respect each publisher's terms for your downstream use; output includes source URLs for attribution.

❓ FAQ

  • URLs or feeds? Either or both; feeds expand to their latest items.
  • Citations? Yes; every chunk keeps its source URL.
  • Chunk size? chunkWords (default 800).
  • Paywalled articles? Only public content is reachable.
  • Fresh? Pulled live at run time.
  • Dedup? Repeated URLs in one run are skipped.

🆘 Troubleshooting

  • Empty markdown — the page may be JS-rendered or paywalled.
  • Too much boilerplate — the article wrapper wasn't detected; try a direct article URL.
  • Feed returns nothing — confirm it's a valid RSS/Atom URL.
  • Huge output — lower maxPerFeed or chunkWords.
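
For the "feed returns nothing" case, a quick local check of the document's root element can tell a real feed from an HTML page. The sketch below uses only the standard library and is not part of the Actor; `detect_feed_type` is a hypothetical helper, and it only inspects the root tag rather than validating the feed fully.

```python
import xml.etree.ElementTree as ET
from typing import Optional

ATOM_FEED = "{http://www.w3.org/2005/Atom}feed"
RDF_ROOT = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}RDF"


def detect_feed_type(xml_text: str) -> Optional[str]:
    """Return 'rss', 'atom', or 'rss-1.0' based on the root tag, else None."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return None  # not well-formed XML, e.g. an HTML landing page
    if root.tag == "rss":
        return "rss"        # RSS 0.9x / 2.0
    if root.tag == ATOM_FEED:
        return "atom"       # Atom 1.0
    if root.tag == RDF_ROOT:
        return "rss-1.0"    # RDF-based RSS 1.0
    return None
```

The feed body can be fetched with any HTTP client and passed in as text. A `None` result usually means the URL points at an HTML page or an error response rather than a feed.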

🏷️ About NexGenData

Structured public-data tools for analysts, developers, and operators. thenextgennexus.com.