Ultimate News API
7 days trial then $10.00/month - No credit card required now
glitch_404/ultimate-news-scraper
A news scraper that collects up to 10K news articles from over 1,000 news sources in less than 20 minutes. It covers more than 20 categories, e.g. Crypto News, World News, Latest News, Celebrities News, and many more. You can get news from websites such as Fox News, BBC News, and CNN News, as well as crypto and cryptocurrency news sites.
change log for v2.0 Beta
- this scraper is still in Beta, so you might find some bugs or issues. Please don't hesitate to report them by opening an issue, or contact me through my GitHub
new features added
- allow specific agencies, or even a single agency, so the scraper scrapes only from them (if found)
- new Date Range feature: it is much better now, and it is documented further down this page
- cleaner code and better speed; you can scrape far more historical data now
- API Mode is faster; you can use it to get the HTML and extract the text on your own instead of letting the scraper do it (this will lower your costs a lot)
- better logging system for detecting bugs and errors
- better and more accurate data collection
- more scrapers added to collect much more data from across the web
how to use v2.0 Beta new features
- Date Range
- you can enter a date range, enter today to get today's news, or enter all to get articles from any date (can scrape months-old articles)
- today = articles from today's date
- yesterday = articles from yesterday's date
- to scrape a specific date range, use / to separate the 2 dates (only 2 dates are counted)
- dates are always in the format YYYY-MM-DD
- example: 2021-12-31/2022-01-01
- to scrape a specific date, enter it like this: 2021-12-31 (all articles will be in this format)
- this version also supports these relative formats (1s, 2m, 3h, 1d, 1w, 5M, 2y)
- so, you can use them to scrape
- 1s for articles 1 second old
- 1m (lowercase m) for articles 1 minute old
- 1h for articles 1 hour old
- 1d for articles 1 day old
- 1w for articles 1 week old
- 1M (uppercase M) for articles 1 month old
- 1y for articles 1 year old
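The accepted Date Range values above can be sketched as a small parser. This is only an illustration of the input rules, not the actor's own code: the function name `parse_date_range`, the reading of relative values as "from that long ago until now", and the 30-day month and 365-day year are all assumptions.

```python
import re
from datetime import datetime, timedelta

# Unit lengths in seconds. Month = 30 days and year = 365 days are assumptions.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400,
                "w": 604800, "M": 2592000, "y": 31536000}


def parse_date_range(value, now=None):
    """Return (start, end) datetimes, or None for "all" (no date filter)."""
    now = now or datetime.now()
    value = value.strip()
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    one_day = timedelta(days=1)

    if value == "all":
        return None
    if value == "today":
        return midnight, midnight + one_day
    if value == "yesterday":
        return midnight - one_day, midnight

    # Relative values are case-sensitive: "m" is minutes, "M" is months.
    match = re.fullmatch(r"(\d+)([smhdwMy])", value)
    if match:
        seconds = int(match.group(1)) * UNIT_SECONDS[match.group(2)]
        return now - timedelta(seconds=seconds), now

    # One date, or two dates separated by "/"; only the first two are used.
    days = [datetime.strptime(part.strip(), "%Y-%m-%d")
            for part in value.split("/")[:2]]
    # The end bound is exclusive, so the last day is fully included.
    return days[0], days[-1] + one_day


now = datetime(2022, 1, 1, 12, 0)
print(parse_date_range("2021-12-31/2022-01-01", now))
print(parse_date_range("3h", now))
```

Anything that is not a keyword, a relative value, or a YYYY-MM-DD date raises `ValueError` from `strptime`, which is a convenient way to catch typos before starting a paid run.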
- Maximum articles amount
- API mode
- you can lower the scraper costs by having it return the news as raw HTML instead of letting it extract the text, and then extract the text yourself
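If you use API mode and extract the text yourself, something like the sketch below may be a starting point. It uses only Python's standard `html.parser` module. The `sample` HTML and the helper names `TextExtractor` and `html_to_text` are made up for illustration, and the name of the field that holds the HTML in the actor's output is not shown here because this page does not document it.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def html_to_text(html: str) -> str:
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


# Hypothetical article HTML, not real actor output.
sample = ("<html><head><style>p{color:red}</style></head><body>"
          "<h1>Title</h1><p>First <b>bold</b> paragraph.</p>"
          "<script>var x=1;</script></body></html>")
print(html_to_text(sample))
```

Real news pages also contain menus, ads, and footers, so in practice you would probably narrow the extraction to the article body element of each site.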
- Exclude specific agencies
- set this to exclude any agency from being scraped; separate agencies with commas, like this
bbc, fox
- this feature works like this
if agency in exclude_agencies.split(','): skip article
so please be careful with this setting
- Allow specific agencies
- same as exclude but works in reverse: you can allow 1, 2, or many more specific news agencies to be scraped, for example
bbc news, fox news
this will allow only BBC News and Fox News to be scraped
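To see why the exclude and allow settings need care, here is a minimal sketch of comma-separated agency filtering. It is not the actor's implementation: `parse_agencies` and `keep_article` are hypothetical names, and trimming whitespace and lowercasing the names are assumptions added here to make the matching more forgiving.

```python
def parse_agencies(raw: str) -> list[str]:
    """Split a comma-separated setting and normalise each name."""
    return [name.strip().lower() for name in raw.split(",") if name.strip()]


def keep_article(agency: str, exclude: str = "", allow: str = "") -> bool:
    """Return True if an article from `agency` should be kept."""
    agency = agency.strip().lower()
    allowed = parse_agencies(allow)
    # With an allow list set, anything not on it is dropped.
    if allowed and agency not in allowed:
        return False
    # Exact name match, as in the one-line rule quoted for the exclude setting.
    return agency not in parse_agencies(exclude)


# A plain split(",") keeps the space after each comma:
print("bbc, fox".split(","))  # ['bbc', ' fox']

print(keep_article("fox", exclude="bbc, fox"))               # False
print(keep_article("cnn", allow="bbc news, fox news"))       # False
print(keep_article("bbc news", allow="bbc news, fox news"))  # True
```

Because the match is on the whole name, `bbc` and `bbc news` are different entries in this sketch, so it is worth checking how agency names appear in your results before relying on either setting.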
- Proxies
- with this option you can set your own proxies from whatever provider you use, or you can use Apify proxies
- proxies are entirely optional; this scraper doesn't need any. If you find that no articles are being scraped, allow this actor to use some proxies as a test; otherwise contact me through my GitHub
more info
- this project took me over 55 hours of pure coding to finish. If you find any bugs or errors, please report them to me and I will fix them all
- if you have ANY suggestions or questions, please contact me through my GitHub or open an issue on the scraper. I will be very happy to maintain this scraper for as long as I can
- thank you for using my project
Developer
Maintained by Community
Actor Metrics
12 monthly users
4 stars
>99% runs succeeded
Created in Feb 2024
Modified 2 months ago