Article Text Extractor

mtrunkat/article-text-extractor

Extracts the article text and other metadata from a given URL. It uses node-unfluff (https://github.com/ageitgey/node-unfluff), a Node.js implementation of python-goose (https://github.com/grangier/python-goose).

The code examples below show how to run the Actor and get its results. To run the code, you need an Apify account. Replace <YOUR_API_TOKEN> in the code with your API token, which you can find under Settings > Integrations in Apify Console.

from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = { "url": "https://www.bbc.com/news/world-asia-china-48659073" }

# Run the Actor and wait for it to finish
run = client.actor("mtrunkat/article-text-extractor").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start
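
Printing raw items is fine for a first look, but a small post-processing step is often handy. The following is a minimal sketch, not part of the Actor or the Apify client. It assumes each dataset item is a dict with `title` and `text` keys, field names borrowed from node-unfluff's output that this page does not confirm. The helpers `summarize_item` and `summarize_items` and the sample data are hypothetical.

```python
from typing import Any, Dict, Iterable, List


def summarize_item(item: Dict[str, Any], preview_chars: int = 80) -> Dict[str, Any]:
    """Reduce one dataset item to a few fields, tolerating missing keys."""
    # ASSUMPTION: "title" and "text" are the field names in the Actor's output.
    text = (item.get("text") or "").strip()
    return {
        "title": item.get("title") or "(untitled)",
        "word_count": len(text.split()),
        "preview": text[:preview_chars],
    }


def summarize_items(items: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Summarize every item, dropping those with no extracted text."""
    summaries = [summarize_item(item) for item in items]
    return [s for s in summaries if s["word_count"] > 0]


# Hypothetical sample items standing in for real Actor output.
sample_items = [
    {"title": "Example headline", "text": "First sentence. Second sentence."},
    {"title": None, "text": ""},
]

for summary in summarize_items(sample_items):
    print(summary)
```

To apply this to a real run, one option is to pass `client.dataset(run["defaultDatasetId"]).iterate_items()` to `summarize_items` in place of `sample_items`. If the Actor's items use different keys, adjust the two `item.get(...)` lookups accordingly.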
Developer
Maintained by Community
Actor metrics
  • 22 monthly users
  • 8 stars
  • 99.7% runs succeeded
  • 7.3 hours response time
  • Created in Mar 2018
  • Modified 10 months ago
Categories