Article Text Extractor

mtrunkat/article-text-extractor

Extracts the article text and related metadata from a given URL. It uses node-unfluff (https://github.com/ageitgey/node-unfluff), a Node.js implementation of python-goose (https://github.com/grangier/python-goose).

The code example below shows how to run the Actor and retrieve its results. To run it, you need an Apify account. Replace <YOUR_API_TOKEN> in the code with your API token, which you can find under Settings > Integrations in Apify Console.

Node.js

import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "url": "https://www.bbc.com/news/world-asia-china-48659073"
};

(async () => {
    // Run the Actor and wait for it to finish
    const run = await client.actor("mtrunkat/article-text-extractor").call(input);

    // Fetch and print Actor results from the run's dataset (if any)
    console.log('Results from dataset');
    console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
    const { items } = await client.dataset(run.defaultDatasetId).listItems();
    items.forEach((item) => {
        console.dir(item);
    });
})();

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs
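
The example above hardcodes both the API token and the article URL. As a minimal sketch, the two helpers below read the token from an environment variable and validate the URL before it is sent as Actor input. The helper names `getToken` and `buildInput` are hypothetical and not part of `apify-client`, and the variable name `APIFY_TOKEN` is an assumption made for this illustration.

```javascript
// Hypothetical helpers for illustration; they are not part of apify-client.

// Read the API token from the environment instead of hardcoding it.
// APIFY_TOKEN is an assumed variable name; use whatever your setup provides.
function getToken(env = process.env) {
    const token = env.APIFY_TOKEN;
    if (!token || token === '<YOUR_API_TOKEN>') {
        throw new Error('Set the APIFY_TOKEN environment variable to your Apify API token.');
    }
    return token;
}

// Build the Actor input, rejecting anything that is not an absolute http(s) URL.
function buildInput(articleUrl) {
    const parsed = new URL(articleUrl); // throws a TypeError on malformed input
    if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
        throw new Error(`Unsupported protocol: ${parsed.protocol}`);
    }
    return { url: parsed.href };
}
```

Under these assumptions, the client could be created with `new ApifyClient({ token: getToken() })` and the Actor called with `.call(buildInput('https://www.bbc.com/news/world-asia-china-48659073'))`, so that a missing token or a malformed URL fails locally before a run is started.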
Developer
Maintained by Community
Actor metrics
  • 23 monthly users
  • 98.2% runs succeeded
  • 0.0 days response time
  • Created in Mar 2018
  • Modified 7 months ago
Categories