🔥Czech News Scraper

Pricing

from $2.00 / 1,000 articles

Developed by

P Brother

Maintained by Community

Extracts articles from Czech news sites (Novinky.cz, Seznam Zprávy, Super.cz, Proženy.cz) in JSON, real-time & historical, 1M+ articles.

5.0 (1)

Last modified

3 hours ago

Czech News Scraper

Czech News Scraper extracts article text from selected Czech news websites, with a focus on both speed and data quality.

In addition to the article text, Czech News Scraper also retrieves various metadata for each article. The full output is detailed below.

Features

  • High speed — scrape thousands of articles within seconds
  • 📦 JSON output — structured data ready for further processing
  • 🏷️ Extracted metadata — author, title, dates, tags, categories, and more
  • 📝 Content in Markdown — clean, analysis-ready format
  • 🔄 Unified content format — consistent schema across all supported websites
  • ⏱️ Realtime & historical articles — access both the latest and historical articles

Filtering

  • 🔍 Full-text query
  • 📅 Created date range
  • 🕒 Updated date range

Sorting

  • 📆 By created date
  • ♻️ By updated date
  • ⭐ By rank (full-text relevance)

Pagination

  • 📑 Up to 100 articles per page

🔗 Try all filters and features right away in the Start Console section — you only pay for the results you actually want and receive. Plus, you get free credits from Apify to get started.

Supported Websites

Czech News Scraper currently supports scraping articles from the following websites:

  • Novinky.cz
  • Seznam Zprávy
  • Super.cz
  • Proženy.cz

Total: over 1,076,000 articles

Additional websites will be added over time.
If you’d like to see a new website supported, go to the Issues tab and create a request.

Why Scrape News Articles?

There are many reasons why scraping news articles is useful:

  • Media monitoring: Track mentions of your company, competitors, or industry-related keywords to stay on top of reputation and trends.
  • Research and analysis: Collect and analyze articles to identify patterns, trends, and insights in politics, economics, or social issues.
  • Sentiment analysis: Determine sentiment around topics, companies, or individuals to understand public opinion.
  • Event detection: Detect and track events (natural disasters, protests, product launches) for fast response.
  • Topic modeling: Identify underlying topics and themes to understand broader context.
  • Entity extraction: Extract people, organizations, and locations to build databases or track relationships.
  • News recommendation: Build personalized recommendation systems for users.
  • Fake news detection: Identify potential misinformation and promote fact-based journalism.
  • Historical research: Archive articles for long-term study of past events and trends.
  • Business intelligence: Gather competitive intelligence, track markets, and discover opportunities.
  • Content generation: Use articles as input for summaries, abstracts, or generated content.
  • Academic research: Support studies in journalism, communication, sociology, and political science.
  • Data journalism: Create interactive dashboards and visualizations for storytelling.
  • AI training: Provide large, high-quality datasets for training AI models.

Output example

Scraped articles are stored in a dataset, which you can find in the Output tab.
For easier inspection, the results are first displayed as a table.

Below is a sample dataset item in JSON format:

{
  "articleId": 40528950,
  "created": 1751640522,
  "updated": 1751652939,
  "recommendedUntilDate": 1751816446,
  "domicile": null,
  "url": "https://www.novinky.cz/clanek/ekonomika-jako-kdyby-spadl-most-na-plne-dalnici-komentuje-analytik-blackout-40528950",
  "section": "ekonomika",
  "tags": ["Elektřina", "Blackout", "Blackout v Česku"],
  "relatedArticles": [],
  "authors": ["Martin Procházka"],
  "title": "Jako kdyby spadl most na plné dálnici, komentuje analytik blackout",
  "perex": "Příčiny výpadku elektřiny v částech Česka nejsou zatím jasné...",
  "captionTitle": "Analytik společnosti Capitalinked Radim Dohnal",
  "captionImageUrl": "//d15-a.sdn.cz/d_15/c_img_ob_A/nPvP7Dwxi2txZrV7DpuCck/1c9a/radim-dohnal.jpeg",
  "content": "# `Jako kdyby spadl most na plné dálnici, komentuje analytik blackout`\n\nVytvořeno: 04.07.25 14:48\n\nAktualizováno: 04.07.25 18:15\n\n..."
}
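In the sample item, `created`, `updated`, and `recommendedUntilDate` look like Unix timestamps in seconds, and `captionImageUrl` is protocol-relative (it starts with `//`). The sketch below shows one way to normalize such an item after download. Reading the timestamps as UTC is an assumption, although for this sample it reproduces the 14:48 and 18:15 times shown in the `content` header. The helper names `to_utc`, `absolute_url`, and `normalize` are made up for this example.

```python
from datetime import datetime, timezone
from typing import Optional

# Timestamp fields as they appear in the sample item.
TIMESTAMP_FIELDS = ("created", "updated", "recommendedUntilDate")


def to_utc(ts: int) -> datetime:
    """Interpret a Unix timestamp in seconds as a UTC datetime (assumption)."""
    return datetime.fromtimestamp(ts, tz=timezone.utc)


def absolute_url(url: Optional[str], scheme: str = "https") -> Optional[str]:
    """Prefix protocol-relative URLs ('//host/path') with a scheme."""
    if url and url.startswith("//"):
        return f"{scheme}:{url}"
    return url


def normalize(item: dict) -> dict:
    """Return a copy of a dataset item with parsed dates and an absolute image URL."""
    out = dict(item)
    for key in TIMESTAMP_FIELDS:
        if out.get(key) is not None:
            out[key] = to_utc(out[key])
    out["captionImageUrl"] = absolute_url(out.get("captionImageUrl"))
    return out


if __name__ == "__main__":
    sample = {
        "created": 1751640522,
        "updated": 1751652939,
        "captionImageUrl": "//d15-a.sdn.cz/d_15/c_img_ob_A/nPvP7Dwxi2txZrV7DpuCck/1c9a/radim-dohnal.jpeg",
    }
    print(normalize(sample))
```

Missing or `null` fields are passed through unchanged, so the same function can be applied to items from any of the supported websites.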

NOTE: On this page you can also see a larger sample output with full JSON data and rendered Markdown content:
https://apify-czech-news.vercel.app/

NOTE: All textual content is converted into Markdown, including titles, text, images, tables, and external embeds such as Facebook posts and Tweets, among others.
Occasionally, embedded HTML widgets remain unprocessed; these are placed inside html blocks in the output.
You can safely delete them locally or process them further with tools like BeautifulSoup (Python) or Cheerio (Node.js).
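If you simply want to drop the leftover widgets, a regular expression may be enough. The sketch below assumes the html blocks appear in `content` as fenced Markdown code blocks tagged `html`; the exact delimiter is not documented here, so check a real item and adjust the pattern if needed. The function name `strip_html_blocks` is hypothetical.

```python
import re

FENCE = "`" * 3  # three backticks, built this way to keep the example readable

# Assumption: an unprocessed widget is wrapped in a fenced block opened by
# three backticks followed by "html" and closed by a line of three backticks.
HTML_BLOCK = re.compile(
    r"^`{3}html[ \t]*\n.*?^`{3}[ \t]*$\n?",
    re.DOTALL | re.MULTILINE,
)


def strip_html_blocks(markdown: str) -> str:
    """Remove fenced html blocks and collapse the blank lines left behind."""
    without_blocks = HTML_BLOCK.sub("", markdown)
    return re.sub(r"\n{3,}", "\n\n", without_blocks)


if __name__ == "__main__":
    content = (
        "First paragraph.\n\n"
        + FENCE + "html\n<div class=\"widget\">embedded widget</div>\n" + FENCE + "\n\n"
        + "Second paragraph.\n"
    )
    print(strip_html_blocks(content))
```

To keep and parse the widgets instead of deleting them, capture the block body with a group and pass it to an HTML parser such as BeautifulSoup.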

Is it legal to scrape news articles?

Yes, extracting articles is legal, since you are scraping publicly available content. However, most articles are protected by copyright law.

Before publishing extracted content anywhere, always check the terms of use of the source website.

Your feedback

If you find a bug or have feedback, please create an issue in the Issues tab.

📧 Contact: pbrother@seznam.cz — don’t hesitate to reach out, I’ll look into it quickly.