Articles Extractor

3-day trial, then $15.00/month. No credit card required now.

web.harvester/articles-extractor

Articles Extractor is a web scraping Actor for extracting structured data from news articles, blog posts, and online publications. It parses each page's HTML to isolate the article content and is intended to work across a wide range of websites.

Developer
Maintained by Community

Actor Metrics

  • 34 monthly users

  • 5.0 / 5 (2)

  • 17 bookmarks

  • >99% runs succeeded

  • Created in Jun 2023

  • Modified 10 hours ago

Start URLs

startUrls (array, optional)

List of article URLs to extract.

Default value of this property is []

Use alternative parser

alternativeParser (boolean, optional)

Enable this to use the alternative content parser, which preserves more formatting. It may work better for some websites.

Default value of this property is false

Save article HTML

saveArticleHtml (boolean, optional)

If checked, saves only the article section of the HTML for each article.

Default value of this property is false

Save full page HTML

savePageHtml (boolean, optional)

If checked, saves the full page HTML for each article.

Default value of this property is false

Custom headers (Manual)

customHeaders (object, optional)

Custom headers to be added to each request.

Use header generator (Random)

useHeaderGenerator (boolean, optional)

If checked, uses the header generator to produce random headers for each request.

Default value of this property is false

Header generator options (Can be left empty)

headerGeneratorOptions (object, optional)

Options for the header generator, including the devices and browsers to emulate. These help generate realistic browser headers for your requests.

Proxy configuration (Residential Proxies are recommended)

proxyConfiguration (object, optional)

This is required if you want to use Apify Proxy.

Default value of this property is {"useApifyProxy":true,"apifyProxyGroups":["RESIDENTIAL"]}
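The input fields listed above can be assembled into a single input object. The following is a minimal sketch, not the Actor's own code: the helper name build_input is hypothetical, and the shape of each startUrls entry (an object with a "url" key, as is common for Apify start URL lists) is an assumption, since the schema above only says "array". The default values are the ones listed on this page.

```python
import copy
import json

# Defaults copied from the input schema on this page.
DEFAULT_INPUT = {
    "startUrls": [],
    "alternativeParser": False,
    "saveArticleHtml": False,
    "savePageHtml": False,
    "useHeaderGenerator": False,
    "proxyConfiguration": {
        "useApifyProxy": True,
        "apifyProxyGroups": ["RESIDENTIAL"],
    },
}

# Fields with no documented default are still valid input keys.
ALLOWED_KEYS = set(DEFAULT_INPUT) | {"customHeaders", "headerGeneratorOptions"}


def build_input(urls, **overrides):
    """Hypothetical helper: schema defaults plus start URLs and overrides."""
    unknown = set(overrides) - ALLOWED_KEYS
    if unknown:
        raise ValueError(f"Unknown input fields: {sorted(unknown)}")
    run_input = copy.deepcopy(DEFAULT_INPUT)
    # Assumption: each start URL is wrapped as {"url": ...}.
    run_input["startUrls"] = [{"url": u} for u in urls]
    run_input.update(overrides)
    return run_input


if __name__ == "__main__":
    example = build_input(
        ["https://example.com/some-article"],
        saveArticleHtml=True,
        customHeaders={"Accept-Language": "en-US"},
    )
    print(json.dumps(example, indent=2))
```

The resulting dictionary could be pasted as JSON into the Actor's input editor, or passed as the run input when starting web.harvester/articles-extractor through an API client. Rejecting unknown keys up front is a design choice of this sketch to catch typos in field names; the Actor itself may behave differently.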