Scrape up to 10,000 news articles from over 4,500 news sources in less than 20 minutes. The news spans more than 20 categories, e.g., Crypto News, World News, Latest News, Celebrities, and many more.
You can find news from websites such as Fox News, BBC News, CNN, and cryptocurrency-related news sources.
This scraper is still in alpha, so you might find some bugs or issues. Please don't hesitate to report them by opening an issue. Here is my GitHub if you need to contact me.
Allow specific agencies, or even a single agency, for the scraper to scrape from (if found).
New Date Range feature; it is much more accurate now.
Some speed improvements; still working on making it faster.
API Mode is faster: it returns the HTML so you can extract the text on your own instead of letting the scraper do it (this will lower your costs a lot).
Better logging system for detecting bugs and errors. Please report any you find.
Better and more accurate data collection.
How to use the new v3.0 features
Keywords filter
Search and filter articles based on keywords.
Separate keywords with commas (e.g. 'apple, google, amazon').
Matching is case-sensitive and still under testing (not fully functional).
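As a rough illustration of what a comma-separated, case-sensitive keyword filter means in practice, here is a minimal Python sketch. It is not the scraper's actual code: the function names `parse_keywords` and `matches_keywords` are hypothetical, and treating an article as a match when any single keyword appears in it is an assumption.

```python
def parse_keywords(raw: str) -> list[str]:
    """Split a comma-separated keyword string and drop empty entries."""
    return [part.strip() for part in raw.split(",") if part.strip()]


def matches_keywords(text: str, keywords: list[str]) -> bool:
    """Return True if any keyword occurs in the text.

    The comparison is case-sensitive, mirroring the behaviour described
    above. Matching on ANY keyword (rather than all) is an assumption.
    """
    return any(keyword in text for keyword in keywords)


keywords = parse_keywords("apple, google, amazon")
print(keywords)
print(matches_keywords("apple unveils a new laptop", keywords))
print(matches_keywords("Apple unveils a new laptop", keywords))
```

Because the match is case-sensitive, 'apple' does not match 'Apple' in the last call, so it may be worth listing both spellings when setting keywords.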
Date Range filter
You can choose from today, 3 days, 1 week, 1 month, 1 year, or all.
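To make the options concrete, the sketch below shows one way a date range option could translate into a cutoff timestamp. It is only an illustration under stated assumptions: the names `RANGE_TO_DAYS`, `cutoff_for`, and `in_range` are hypothetical, 'today' is treated as the last 24 hours, and a month and a year are approximated as 30 and 365 days. The scraper itself may compute these differently.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumed mapping from the option labels to a number of days.
RANGE_TO_DAYS = {"today": 1, "3 days": 3, "1 week": 7, "1 month": 30, "1 year": 365}


def cutoff_for(date_range: str, now: datetime) -> Optional[datetime]:
    """Return the oldest allowed publication time, or None for 'all'."""
    if date_range == "all":
        return None
    return now - timedelta(days=RANGE_TO_DAYS[date_range])


def in_range(published: datetime, date_range: str, now: datetime) -> bool:
    """Return True if an article published at `published` passes the filter."""
    cutoff = cutoff_for(date_range, now)
    return cutoff is None or published >= cutoff


now = datetime(2024, 1, 31, 12, 0, tzinfo=timezone.utc)
two_days_ago = now - timedelta(days=2)
print(in_range(two_days_ago, "3 days", now))
print(in_range(two_days_ago, "today", now))
```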
Maximum articles amount
Set this amount to cap the scraper output. The scraper collects approximately that number, not exactly that number, so it might find fewer articles than the value you set.
API mode
You can lower the scraper's costs by scraping the news as HTML instead of letting the scraper extract the text; you can then extract the text yourself.
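If you go the API mode route, the text extraction step is up to you. The following is a minimal sketch using only Python's standard library `html.parser`; the `TextExtractor` class and `html_to_text` function are hypothetical helpers, and the sample HTML is made up for illustration. Real article pages usually need more careful handling (navigation, ads, comments), so treat this as a starting point rather than a full solution.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped tags.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self) -> str:
        return " ".join(self._chunks)


def html_to_text(html: str) -> str:
    """Return the visible text of an HTML document as a single string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


sample = (
    "<html><head><style>p {color: red}</style></head>"
    "<body><h1>Title</h1><p>First <b>bold</b> paragraph.</p>"
    "<script>var x = 1;</script></body></html>"
)
print(html_to_text(sample))  # prints "Title First bold paragraph."
```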
Exclude specific agencies
Set this to exclude agencies from being scraped. Separate agencies like this: bbc, fox
Only allow specific agencies
Same as exclude, but works in reverse: you can allow one, two, or many more specific news agencies to be scraped. For example,
bbc, fox news will allow only BBC and Fox News to be scraped (not case-sensitive).
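The interaction between the exclude list and the allow list can be sketched as follows. This is an illustration, not the scraper's implementation: `parse_agencies` and `is_agency_permitted` are hypothetical names, substring matching (so that 'bbc' matches 'BBC News') is an assumption, and so is applying the same case-insensitive comparison to the exclude list.

```python
def parse_agencies(raw: str) -> set[str]:
    """Split a comma-separated agency string into lowercase tokens."""
    return {part.strip().lower() for part in raw.split(",") if part.strip()}


def is_agency_permitted(agency: str, allowed: set[str], excluded: set[str]) -> bool:
    """Decide whether articles from `agency` should be scraped.

    Assumptions: matching is a case-insensitive substring test, the
    exclude list wins over the allow list, and an empty allow list
    means every non-excluded agency is permitted.
    """
    name = agency.strip().lower()
    if any(token in name for token in excluded):
        return False
    return not allowed or any(token in name for token in allowed)


allowed = parse_agencies("bbc, fox news")
for agency in ["BBC News", "Fox News", "CNN"]:
    print(agency, is_agency_permitted(agency, allowed, set()))
```

With the allow list above, only BBC News and Fox News pass. Passing `parse_agencies("bbc, fox")` as the exclude list with an empty allow list gives the reverse result.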
Proxies
With this option you can set your own proxies from whatever provider you use, or you can use Apify proxies.
Using proxies is entirely optional; this scraper doesn't need any. If you face an issue where no articles are scraped, allow this actor to use some proxies as a test. Otherwise, contact me; here is my GitHub.
More info
Telegram group for easier communication with the developer.
This project took me over 60 hours of pure coding to finish. If you find any bugs or errors, please report them to me and I will fix them all.
If you have ANY suggestions or questions, please contact me (here is my GitHub) or open an issue on the scraper. I will be very happy to maintain this scraper for as long as I can.