
🔥Czech News Scraper
Pricing
from $2.00 / 1,000 articles

Extracts real-time and historical articles from Czech news sites (Novinky.cz, Seznam Zprávy, Super.cz, Proženy.cz) as JSON. Over 1M articles available.
Czech News Scraper
Czech News Scraper extracts article text from selected Czech news websites, with a focus on both speed and data quality.
In addition to the article text, Czech News Scraper also retrieves various metadata for each article. The full output is detailed below.
Features
- ⚡ High speed — scrape thousands of articles within seconds
- 📦 JSON output — structured data ready for further processing
- 🏷️ Extracted metadata — author, title, dates, tags, categories, and more
- 📝 Content in Markdown — clean, analysis-ready format
- 🔄 Unified content format — consistent schema across all supported websites
- ⏱️ Realtime & historical articles — access both the latest and historical articles
Filtering
- 🔍 Full-text query
- 📅 Created date range
- 🕒 Updated date range
Sorting
- 📆 By created date
- ♻️ By updated date
- ⭐ By rank (full-text relevance)
Pagination
- 📑 Up to 100 articles per page
🔗 Try all filters and features right away in the Start Console section — you only pay for the results you actually want and receive. Plus, you get free credits from Apify to get started.
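To get a feel for how pagination and pay-per-result pricing interact, here is a minimal Python sketch. It uses the 100-articles-per-page limit from the Pagination section and assumes a flat rate equal to the listed "from $2.00 / 1,000 articles" price; the actual rate may be higher, and `estimate_run` is a hypothetical helper, not part of the Actor.

```python
import math

PAGE_SIZE_MAX = 100        # from the Pagination section: up to 100 articles per page
PRICE_PER_1000_USD = 2.00  # assumption: flat rate equal to the listed "from" price


def estimate_run(total_articles: int, page_size: int = PAGE_SIZE_MAX) -> tuple[int, float]:
    """Return (number of pages, estimated cost in USD) for a given result count."""
    if total_articles < 0:
        raise ValueError("total_articles must be non-negative")
    if not 1 <= page_size <= PAGE_SIZE_MAX:
        raise ValueError(f"page_size must be between 1 and {PAGE_SIZE_MAX}")
    pages = math.ceil(total_articles / page_size)
    cost = total_articles / 1000 * PRICE_PER_1000_USD
    return pages, round(cost, 2)


pages, cost = estimate_run(2500)
print(f"{pages} pages, about ${cost:.2f}")
```

Under these assumptions, 2,500 articles would take 25 full pages and cost about $5.00.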
Supported Websites
Czech News Scraper currently supports scraping articles from the following websites:
- Novinky.cz — over 724,000 articles
- Seznam Zprávy — over 171,000 articles
- Super.cz — over 134,000 articles
- ProŽeny.cz — over 46,000 articles
Total: over 1,076,000 articles
Additional websites will be added over time.
If you’d like to see a new website supported, go to the Issues tab and create a request.
Why Scrape News Articles?
There are many reasons why scraping news articles is useful:
- Media monitoring: Track mentions of your company, competitors, or industry-related keywords to stay on top of reputation and trends.
- Research and analysis: Collect and analyze articles to identify patterns, trends, and insights in politics, economics, or social issues.
- Sentiment analysis: Determine sentiment around topics, companies, or individuals to understand public opinion.
- Event detection: Detect and track events (natural disasters, protests, product launches) for fast response.
- Topic modeling: Identify underlying topics and themes to understand broader context.
- Entity extraction: Extract people, organizations, and locations to build databases or track relationships.
- News recommendation: Build personalized recommendation systems for users.
- Fake news detection: Identify potential misinformation and promote fact-based journalism.
- Historical research: Archive articles for long-term study of past events and trends.
- Business intelligence: Gather competitive intelligence, track markets, and discover opportunities.
- Content generation: Use articles as input for summaries, abstracts, or generated content.
- Academic research: Support studies in journalism, communication, sociology, and political science.
- Data journalism: Create interactive dashboards and visualizations for storytelling.
- AI training: Provide large, high-quality datasets for training AI models.
Output example
The scraped articles are stored in a dataset, which you can find in the Output tab.
For easier inspection, the results are first displayed as a table.
Below is a sample dataset in JSON format:
{
  "articleId": 40528950,
  "created": 1751640522,
  "updated": 1751652939,
  "recommendedUntilDate": 1751816446,
  "domicile": null,
  "url": "https://www.novinky.cz/clanek/ekonomika-jako-kdyby-spadl-most-na-plne-dalnici-komentuje-analytik-blackout-40528950",
  "section": "ekonomika",
  "tags": ["Elektřina", "Blackout", "Blackout v Česku"],
  "relatedArticles": [],
  "authors": ["Martin Procházka"],
  "title": "Jako kdyby spadl most na plné dálnici, komentuje analytik blackout",
  "perex": "Příčiny výpadku elektřiny v částech Česka nejsou zatím jasné...",
  "captionTitle": "Analytik společnosti Capitalinked Radim Dohnal",
  "captionImageUrl": "//d15-a.sdn.cz/d_15/c_img_ob_A/nPvP7Dwxi2txZrV7DpuCck/1c9a/radim-dohnal.jpeg",
  "content": "# `Jako kdyby spadl most na plné dálnici, komentuje analytik blackout`\n\nVytvořeno: 04.07.25 14:48\n\nAktualizováno: 04.07.25 18:15\n\n..."
}
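The sample record stores dates as integers and the caption image as a protocol-relative URL (starting with `//`). The sketch below shows one way to make such a record easier to work with in Python. It assumes the integers are Unix timestamps in seconds and that `https:` is the right scheme to prepend; `normalize_article` is a hypothetical helper, and the record is a trimmed copy of the sample.

```python
import json
from datetime import datetime, timezone

TIMESTAMP_FIELDS = ("created", "updated", "recommendedUntilDate")


def normalize_article(raw: dict) -> dict:
    """Return a copy with timestamps as datetimes and an absolute image URL."""
    article = dict(raw)  # shallow copy so the input record is left untouched
    for key in TIMESTAMP_FIELDS:
        value = article.get(key)
        if value is not None:
            # Assumption: values are Unix timestamps in seconds.
            article[key] = datetime.fromtimestamp(value, tz=timezone.utc)
    image_url = article.get("captionImageUrl")
    if image_url and image_url.startswith("//"):
        # Assumption: the image host serves the file over HTTPS.
        article["captionImageUrl"] = "https:" + image_url
    return article


sample = json.loads(
    '{"articleId": 40528950, "created": 1751640522, "updated": 1751652939, '
    '"captionImageUrl": "//d15-a.sdn.cz/d_15/c_img_ob_A/nPvP7Dwxi2txZrV7DpuCck/1c9a/radim-dohnal.jpeg"}'
)
article = normalize_article(sample)
print(article["created"].isoformat())  # 2025-07-04T14:48:42+00:00
print(article["captionImageUrl"])
```

Read this way, `created` comes out as 14:48:42, which matches the "Vytvořeno: 04.07.25 14:48" line in the sample content. Whether that clock time is UTC or Czech local time is not stated, so treat the time zone as unconfirmed.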
NOTE: On this page you can also see a larger sample output with full JSON data and rendered Markdown content:
https://apify-czech-news.vercel.app/
NOTE: All textual content is converted into Markdown, including titles, text, images, tables, and external embeds such as Facebook posts and Tweets.
Sometimes, however, embedded HTML widgets remain unprocessed. These are inserted inside html blocks in the output.
You can safely delete them locally, or further process them with tools like BeautifulSoup (Python) or Cheerio (Node.js).
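If you simply want to drop the leftover widgets, a regular expression may be enough. The sketch below assumes each widget is wrapped in a fenced block that opens with three backticks followed by `html` and closes with a bare three-backtick line; check this against your own output before relying on it. `strip_html_blocks` is a hypothetical helper name.

```python
import re

FENCE = "`" * 3  # three backticks, built programmatically to keep this example readable

# Assumption: leftover widgets are fenced blocks opened by FENCE + "html"
# and closed by a line containing only FENCE.
HTML_BLOCK_RE = re.compile(
    rf"^{FENCE}html[^\n]*\n.*?^{FENCE}[ \t]*$\n?",
    flags=re.MULTILINE | re.DOTALL,
)


def strip_html_blocks(markdown: str) -> str:
    """Remove html-fenced blocks and collapse the blank lines they leave behind."""
    cleaned = HTML_BLOCK_RE.sub("", markdown)
    return re.sub(r"\n{3,}", "\n\n", cleaned)


content = (
    "Intro\n\n"
    + FENCE + "html\n"
    + '<div class="widget"></div>\n'
    + FENCE + "\n\n"
    + "Outro\n"
)
print(strip_html_blocks(content))
```

Fenced blocks in other languages are left untouched, since the pattern only matches an opening fence tagged `html`.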
Is It Legal to Extract Articles?
Yes, extracting articles is legal, since you are scraping publicly available content. However, most articles are protected by copyright laws.
Before publishing extracted content anywhere, always check the terms of use of the source website.
Your feedback
If you find a bug or have feedback, please create an issue in the Issues tab.
📧 Contact: pbrother@seznam.cz — don’t hesitate to reach out, I’ll look into it quickly.