Smart Article Extractor
lukaskrivka/article-extractor-smart

📰 Smart Article Extractor extracts articles from any scientific, academic, or news website with just one click. The extractor crawls the whole website and automatically distinguishes articles from other web pages. Download your data as an HTML table, JSON, Excel, an RSS feed, and more.

Does not support Hebrew

Closed

abiding_flare opened this issue
a year ago

I tried to run this scraper on Hebrew websites and it didn't work. Is there a way to add Hebrew support?

User avatar

Hi Makan,

I'm sorry, but this scraper will very likely perform poorly on Hebrew (it might work on some websites).

For less detailed parsing, you can try https://apify.com/apify/website-content-crawler, which should work better for non-English content.

abiding_flare

a year ago

Thank you for your quick response. I do need all of the parsing. It is an amazing tool and exactly what I need. (I tried it in English with great results.)

abiding_flare

a year ago

Could this help with adjusting the scraper to work in Hebrew? https://stackoverflow.com/questions/1365510/where-can-i-find-a-list-of-hebrew-stop-words

User avatar

I don't think that will be enough. There is quite a complicated parsing library behind it. If you know of a good one for Hebrew, we could add it as a backend.

abiding_flare

a year ago

Would this help? https://github.com/amir-zeldes/HebPipe Thank you.

abiding_flare

a year ago

Hi, do you think https://github.com/amir-zeldes/HebPipe could help? I tried running the scraper with no minimum words per article and got some information, but not the full text. https://console.apify.com/view/runs/A4SU2iteF1ZF8Mkg3

I would appreciate it if this could work for Hebrew.

abiding_flare

a year ago

Hi, is there any update on this? Thank you!

User avatar

Hi Makam, I'm sorry, but this is a big feature and we don't see enough demand to justify the development cost just yet.

abiding_flare

a year ago

Hi, is there a way for us to pay for such a development?

User avatar

Hi Makam, you have two options:

  1. Ask some of our partners or freelancers to build this from scratch.
  2. Reach out to https://apify.com/enterprise

The best bet for Hebrew is probably GPT, so https://apify.com/drobnikj/gpt-scraper might help here.

User avatar

I will close this issue now. We will keep your request in mind for this scraper in case an easier path forward is found.

Developer
Maintained by Apify
Actor metrics
  • 172 monthly users
  • 74.9% runs succeeded
  • 2.8 days response time
  • Created in Nov 2019
  • Modified about 2 months ago