Smart Article Extractor

Pricing

Pay per usage

lukaskrivka/article-extractor-smart

Developed by

Lukáš Křivka

Maintained by Apify

📰 Smart Article Extractor extracts articles from any scientific, academic, or news website with just one click. The extractor crawls the whole website and automatically distinguishes articles from other web pages. Download your data as an HTML table, JSON, Excel, RSS feed, and more.

4.7 (6)

103

Monthly users

290

Runs succeeded

>99%

Response time

37 days

Last modified

10 days ago

Regarding the previous issue

Closed
pzubkiewicz opened this issue
10 months ago

Sorry, I can't reply to your response.

I tried something similar, but it didn't scrape the contents.

{text: $("//*[@id='middle-panel']/article/div[5] | //div[@class='pb-5']").text().trim()}

However, in Chrome this XPath finds the correct section of the webpage.

Could you please explain why?

pzubkiewicz

10 months ago

I am using | in the XPath so it can handle different HTML structures on this particular page.

lukaskrivka

Hello,

You should be able to reply to the closed issue as well. I cannot get the XPath to work, but I don't have much experience with it. I will see if any colleagues can give me advice. How exactly do you run it in Chrome?

lukaskrivka

I realized we don't use the browser to run the parser, so only CSS selectors are available.
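Since only CSS selectors are available, the XPath from the snippet above would need to be translated. Here is a rough sketch of one possible translation, assuming `$` is the Cheerio-style handle used in that snippet. The helper name `extractArticleText` is hypothetical and only for illustration; it has not been run against the page in question.

```javascript
// Hypothetical helper: `$` is assumed to be a Cheerio-style selector function,
// as in the snippet quoted earlier in this thread.
function extractArticleText($) {
  // XPath //*[@id='middle-panel']/article/div[5]
  //   -> child combinators, with :nth-of-type(5) for the fifth <div> child of <article>
  // XPath //div[@class='pb-5']
  //   -> exact match on the class attribute
  // XPath union `|` -> CSS selector list `,`
  const selector =
    '#middle-panel > article > div:nth-of-type(5), div[class="pb-5"]';
  return { text: $(selector).text().trim() };
}
```

The comma in the CSS selector plays the same role as `|` in the XPath, so either page structure can still match. If the target `div` carries more classes than just `pb-5`, the looser `div.pb-5` may be the better choice.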

Pricing

Pricing model

Pay per usage

This Actor is paid per platform usage. The Actor is free to use, and you only pay for the Apify platform usage.