
# Economist Category Scraper


Example implementation of a scraper built using the apify/web-scraper actor. Crawls the latest updates from a given Economist category.

Author: Marek Trunkát

# Dockerfile contains instructions for building a Docker image that
# will contain all the code and configuration needed to run your actor.
# For a full Dockerfile reference, see the official Docker documentation.

# First, specify the base Docker image. Apify provides the following
# base images for your convenience:
#  apify/actor-node-basic (Node.js 10 on Alpine Linux, small and fast)
#  apify/actor-node-chrome (Node.js 10 + Chrome on Debian)
#  apify/actor-node-chrome-xvfb (Node.js 10 + Chrome + Xvfb on Debian)
# For more information, see the Apify documentation on base Docker images.
# Note that you can use any other image from Docker Hub.
FROM apify/actor-node-basic

# Second, copy just package.json since it should be the only file
# that affects NPM install in the next step
COPY package.json ./

# Install NPM packages, skipping optional and development dependencies to
# keep the image small. Avoid logging too much, and print the dependency
# tree for debugging.
RUN npm --quiet set progress=false \
 && npm install --only=prod --no-optional \
 && echo "Installed NPM packages:" \
 && npm list \
 && echo "Node.js version:" \
 && node --version \
 && echo "NPM version:" \
 && npm --version
# Next, copy the remaining files and directories with the source code.
# Since we do this after NPM install, a quick build will be really fast
# for most source file changes.
COPY . ./

# Optionally, specify how to launch the source code of your actor.
# By default, Apify's base Docker images define the CMD instruction
# that runs the source code using the command specified
# in the "scripts.start" section of the package.json file.
# In short, the instruction looks something like this:  
# CMD npm start


{
    "title": "My input schema",
    "type": "object",
    "schemaVersion": 1,
    "properties": {
        "category": {
            "title": "Category",
            "type": "string",
            "description": "Economist category to be scraped",
            "editor": "textfield",
            "prefill": "briefing"
        }
    }
}
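The schema above prefills `category` with `"briefing"`, but an actor started via the API can still receive empty input. A minimal sketch in plain Node.js of reading that field defensively — `normalizeInput` is a hypothetical helper for illustration, not part of the actor:

```javascript
// Hypothetical helper mirroring how input produced by the schema above
// might be read defensively. "category" is prefilled with "briefing" in
// the Apify UI, but a caller invoking the actor programmatically can omit it.
function normalizeInput(input) {
    const category = (input && input.category) || 'briefing';
    return { category: category.trim() };
}

console.log(normalizeInput({ category: 'europe' })); // { category: 'europe' }
console.log(normalizeInput(null));                   // { category: 'briefing' }
```

In the actor itself, the value would come from `Apify.getInput()` instead of a function argument.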

# Economist category scraper

Example implementation of a scraper built using the apify/web-scraper
actor. Crawls the latest updates from a given Economist category.


// This is the main Node.js source code file of your actor.
// It is referenced from the "scripts" section of the package.json file.

const Apify = require('apify');

Apify.main(async () => {
    // Get the input of the actor. Input fields can be modified in the INPUT_SCHEMA.json file.
    // For more information, see the Apify documentation on input schemas.
    const input = await Apify.getInput();

    // Here you can prepare the input for the apify/web-scraper actor. This input is based
    // on the actor task you used as the starting point.
    const metamorphInput = {
        "startUrls": [
            {
                "url": `${input.category}/?page=1`,
                "method": "GET"
            }
        ],
        "useRequestQueue": true,
        "pseudoUrls": [
            {
                "purl": `${input.category}/?page=[\\d+]`,
                "method": "GET"
            }
        ],
        "linkSelector": "a",
        "pageFunction": async function pageFunction(context) {
            // request is an instance of Apify.Request
            // $ is an instance of jQuery
            const request = context.request;
            const $ = context.jQuery;
            const pageNum = parseInt(request.url.split('?page=').pop());
            context.log.info(`Scraping ${context.request.url}`);

            // Extract all articles.
            const articles = [];
            $('article').each((index, articleEl) => {
                const $articleEl = $(articleEl);
                // H3 contains 2 child elements, where the first one is the topic
                // and the second is the article title.
                const $h3El = $articleEl.find('h3');
                // Extract additional info and push it to the articles array.
                articles.push({
                    topic: $h3El.children().first().text(),
                    title: $h3El.children().last().text(),
                    url: $articleEl.find('a')[0].href,
                    teaser: $articleEl.find('.teaser__text').text(),
                });
            });

            // Return results.
            return articles;
        },
        "proxyConfiguration": {
            "useApifyProxy": false
        },
        "debugLog": false,
        "browserLog": false,
        "injectJQuery": true,
        "injectUnderscore": false,
        "downloadMedia": false,
        "downloadCss": false,
        "ignoreSslErrors": false
    };

    // Now let's metamorph into actor apify/web-scraper using the created input.
    await Apify.metamorph('apify/web-scraper', metamorphInput);
});
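A sketch of how the `purl` pattern above matches pagination links discovered via `linkSelector`. In Apify's pseudo-URL syntax, `[\\d+]` marks a regular-expression fragment inside an otherwise literal URL; here that matching is emulated with a plain `RegExp`. The `https://www.economist.com` base is an assumption for illustration only — the actual URLs are built from `input.category`:

```javascript
// Emulate pseudo-URL matching with a plain RegExp (hypothetical base URL).
// "[\\d+]" in an Apify purl corresponds to the regex part "\d+" here.
const category = 'briefing';
const purl = new RegExp(`^https://www\\.economist\\.com/${category}/\\?page=\\d+$`);

console.log(purl.test('https://www.economist.com/briefing/?page=2')); // true
console.log(purl.test('https://www.economist.com/world/?page=2'));    // false
```

Because every page links to its neighbouring pages, this pattern is what lets web-scraper walk the whole pagination sequence starting from `?page=1`.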


{
    "name": "my-actor",
    "version": "0.0.1",
    "dependencies": {
        "apify": "^0.14.5"
    },
    "scripts": {
        "start": "node main.js"
    },
    "author": "Me!"
}