📰 Smart Article Extractor extracts articles from any scientific, academic, or news website with just one click. The extractor crawls the whole website and automatically distinguishes articles from other web pages. Download your data as an HTML table, JSON, Excel, RSS feed, and more.

The tags are not always scraped properly


tblobaum opened this issue
10 months ago

My script keeps failing because the tags are not always formatted as advertised. Try this link to see what I mean:

Hi Tom,

Thanks for the report. I'm sorry for my belated reply.

You are right that some of the tags shouldn't be there. However, we are currently unable to fix such fine-grained parts of the system, as it works on a generic, best-effort basis.

