Web Page Analyzer

  • apify/page-analyzer
  • Users 714
  • Runs 42.7k
  • Created by Apify

Analyzes a web page to determine the best way to scrape its data. As input, it takes a URL and an array of strings to search for; as output, it returns a crawler definition.

To run the code example, you need an Apify account. Replace <YOUR_API_TOKEN> in the code with your API token. For a more detailed explanation, read about running actors via the API in the Apify Docs.

import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with API token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare actor input
const input = {
    "url": "https://www.scrapethissite.com/pages/ajax-javascript/#2015",
    "keywords": [
        "Spotlight",
        "Oscar Winning Films: AJAX and Javascript",
        "https://en.wikipedia.org/wiki/List_of_Academy_Award-winning_films",
        "A Girl in the River: The Price of Forgiveness",
        "87 items"
    ],
    "proxyConfig": {
        "useApifyProxy": true
    }
};

(async () => {
    // Run the actor and wait for it to finish
    const run = await client.actor("apify/page-analyzer").call(input);

    // Fetch and print actor results from the run's dataset (if any)
    console.log('Results from dataset');
    const { items } = await client.dataset(run.defaultDatasetId).listItems();
    items.forEach((item) => {
        console.dir(item);
    });
})();
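
Because the input is a plain object with a URL and an array of search strings, it can be convenient to check it locally before starting a run. The following is a minimal sketch, not part of apify-client: the helper name buildAnalyzerInput is hypothetical, and the rules it enforces (http/https only, at least one non-empty keyword, duplicates removed) are assumptions of this sketch rather than documented requirements of the actor. Only the field names url, keywords and proxyConfig.useApifyProxy come from the example above.

```javascript
// Hypothetical helper (not part of apify-client): builds an input object
// in the same shape as the example above and rejects obviously bad values.
function buildAnalyzerInput(url, keywords, useApifyProxy = true) {
    // new URL() throws a TypeError if the string is not a valid absolute URL
    const parsed = new URL(url);
    if (!['http:', 'https:'].includes(parsed.protocol)) {
        throw new Error(`Unsupported protocol: ${parsed.protocol}`);
    }
    if (!Array.isArray(keywords)) {
        throw new TypeError('keywords must be an array of strings');
    }
    // Keep only strings, trim them, drop empty entries,
    // and remove duplicates while preserving the original order
    const cleaned = [...new Set(
        keywords
            .filter((k) => typeof k === 'string')
            .map((k) => k.trim())
            .filter((k) => k.length > 0),
    )];
    // Assumption of this sketch: an analysis without keywords is not useful
    if (cleaned.length === 0) {
        throw new Error('At least one non-empty keyword is required');
    }
    return {
        url,
        keywords: cleaned,
        proxyConfig: { useApifyProxy },
    };
}
```

The returned object could then be passed to the call(input) method shown above in place of the hand-written input.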