
Smart Article Extractor
Pricing
Pay per usage

📰 Smart Article Extractor extracts articles from any scientific, academic, or news website with just one click. The extractor crawls the whole website and automatically distinguishes articles from other web pages. Download your data as HTML table, JSON, Excel, RSS feed, and more.
4.7 (6)
133
Total users: 5.3K
Monthly users: 381
Runs succeeded: 99%
Issues response: 4.4 days
Last modified: 3 months ago
Empty results from medium.com
Closed
Nothing is returned from that page
2024-04-04T09:58:45.807Z INFO Adding article URL: https://medium.com/@daniel.castillo_48013/streamlining-event-driven-architecture-documentation-with-event-catalog-a-step-by-step-guide-6d10d95abaa1
2024-04-04T09:58:46.294Z INFO CheerioCrawler: Starting the crawler.
2024-04-04T09:58:47.526Z WARN IS NOT VALID ARTICLE --- Reasons: [Article has too few words: 145 (should be at least 150)] --- https://medium.com/@daniel.castillo_48013/streamlining-event-driven-architecture-documentation-with-event-catalog-a-step-by-step-guide-6d10d95abaa1
2024-04-04T09:58:47.599Z INFO CheerioCrawler: All requests from the queue have been processed, the crawler will shut down.
2024-04-04T09:58:47.799Z INFO CheerioCrawler: Final request statistics: {"requestsFinished":1,"requestsFailed":0,"retryHistogram":[1],"requestAvgFailedDurationMillis":null,"requestAvgFinishedDurationMillis":1195,"requestsFinishedPerMinute":37,"requestsFailedPerMinute":0,"requestTotalDurationMillis":1195,"requestsTotal":1,"crawlerRuntimeMillis":1616}
2024-04-04T09:58:47.804Z INFO CheerioCrawler: Finished! Total 1 requests: 1 succeeded, 0 failed. {"terminal":true}

Hello,
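Thanks for the report. Unfortunately, Medium can be quite complex to parse, and the actor only parsed text from the first paragraph, which is why the log reports 145 words, under the 150-word minimum, and rejects the page as not a valid article.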
You can improve that by using the Extend Output Function to point the actor at the correct text.
One example would be this:
($) => {return {text: $('article p, article ol').text().trim()}}
https://console.apify.com/view/runs/ClHPxNgQaEw3pOYSM
But it would require a better selector to get really clean text.
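
As a rough sketch of what a somewhat cleaner version might look like: the one-liner above calls .text() on the whole matched set, which joins the text of every matched element with no separator, so the last word of one paragraph runs into the first word of the next. The variant below reads each element separately and joins the pieces with blank lines. It assumes the actor passes a Cheerio handle as `$`, as in the example above. The name `extendOutputFunction` and the selector `article p, article li` are only illustrative and have not been checked against Medium's current markup.

```javascript
// Sketch of an Extend Output Function that keeps paragraph boundaries.
// Assumption: `$` is the Cheerio handle the actor passes in.
// Assumption: the selectors below are a guess at Medium's markup and may need adjusting.
const extendOutputFunction = ($) => {
    const blocks = $('article p, article li')
        .map((_, el) => $(el).text().trim()) // text of each paragraph or list item
        .get()                               // Cheerio collection -> plain array
        .filter((t) => t.length > 0);        // drop empty blocks
    // Join with blank lines so words at block boundaries do not run together.
    return { text: blocks.join('\n\n') };
};
```

As in the example above, only the arrow function itself would go into the Extend Output Function field. If a list item wraps its text in a paragraph, that text would match both selectors and appear twice, so the selector may still need narrowing.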