Smart Article Extractor

Pricing

Pay per usage

lukaskrivka/article-extractor-smart

Developed by

Lukáš Křivka

Maintained by Apify

📰 Smart Article Extractor extracts articles from any scientific, academic, or news website with just one click. The extractor crawls the whole website and automatically distinguishes articles from other web pages. Download your data as HTML table, JSON, Excel, RSS feed, and more.

4.7 (6)

103

Monthly users: 280

Runs succeeded: >99%

Response time: 36 days

Last modified: 6 days ago

Article has too few words

Closed
pzubkiewicz opened this issue
10 months ago

Hi Lukas,

For some reason, the crawler reports that this article has only 19 words, which is not true: https://community.aws/content/2eYoqeFRqaVnk900emsknDfzhfW/the-ultimate-cheat-sheet-for-using-amazon-q-developer-in-your-ide

Is there something I could do?

lukaskrivka

Hello,

Thanks for the report. Sadly, the automatic extraction is not perfect, and this article has a somewhat non-standard structure.

There is a way to override the parser using the Extend Output Function, like this:

($) => {
    // Replace the auto-extracted text with the text of selected elements inside <article>.
    const result = {};
    result.text = $('article div, article h2, article h3').text();

    return result;
}

See this run, though the result is still not perfect: https://console.apify.com/view/runs/fgyTfOIjyt5chg8yT
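One possible reason the override above is "still not perfect" is that calling `.text()` on a multi-element selection concatenates the text of every match with no separator, so a heading can end up glued to the text that follows it. The sketch below is a hedged variant, not the Actor's own code: it assumes `$` is the Cheerio/jQuery-style handle shown in the reply, reuses the same selector, and joins each match with a newline. `countWords` and the name `extendOutputFunction` are hypothetical helpers added only for illustration.

```javascript
// Hypothetical helper: count whitespace-separated words in a string.
function countWords(text) {
    const trimmed = text.trim();
    return trimmed === '' ? 0 : trimmed.split(/\s+/).length;
}

// Sketch of an Extend Output Function (assumes `$` is a Cheerio/jQuery-style handle).
// It collects the text of each matched element separately and joins the pieces
// with newlines, so words from adjacent elements are not fused together.
const extendOutputFunction = ($) => {
    const parts = [];
    $('article div, article h2, article h3').each((_, el) => {
        const text = $(el).text().trim();
        if (text) parts.push(text); // skip empty elements
    });
    return { text: parts.join('\n') };
};
```

Only the arrow function would be pasted into the Extend Output Function field; `countWords` is just a quick way to sanity-check the extracted text locally. Because the selector still matches nested `div` elements, some text may be repeated, so the selector likely needs tuning for the specific page.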

Pricing

Pricing model

Pay per usage

This Actor is paid per platform usage: the Actor itself is free to use, and you pay only for the Apify platform usage.