Simply extracts article texts and other meta info from the given URL. Uses https://github.com/ageitgey/node-unfluff which is a NodeJS implementation of https://github.com/grangier/python-goose.

- Modified
- Users498
- Runs105,372
To run the code examples, you need to have an Apify account. Replace <YOUR_API_TOKEN> in the code with your API token. For a more detailed explanation, please read about running actors via the API in Apify Docs.
import { ApifyClient } from 'apify-client';
// Initialize the ApifyClient with API token
const client = new ApifyClient({
token: '<YOUR_API_TOKEN>',
});
// Prepare actor input
const input = {
"url": "https://www.bbc.com/news/world-asia-china-48659073"
};
(async () => {
// Run the actor and wait for it to finish
const run = await client.actor("mtrunkat/article-text-extractor").call(input);
// Fetch and print actor results from the run's dataset (if any)
console.log('Results from dataset');
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
console.dir(item);
});
})();