You can access Web Scraper Task programmatically from your own applications through the Apify API. To use the API, you’ll need an Apify account and your API token, which you can find under Integrations in Apify Console.

Available options: Python, JavaScript, CLI, OpenAPI, HTTP, MCP. The example below uses HTTP (curl).

# Set API token
API_TOKEN="<YOUR_API_TOKEN>"

# Prepare Actor input
cat > input.json << 'EOF'
{
  "runMode": "DEVELOPMENT",
  "startUrls": [
    {
      "url": "https://crawlee.dev"
    }
  ],
  "linkSelector": "a[href]",
  "globs": [
    {
      "glob": "https://crawlee.dev/*/*"
    }
  ],
  "pseudoUrls": [],
  "pageFunction": "// The function accepts a single argument: the \"context\" object.\n// For a complete list of its properties and functions,\n// see https://apify.com/apify/web-scraper#page-function \nasync function pageFunction(context) {\n    // This statement works as a breakpoint when you're trying to debug your code. Works only with Run mode: DEVELOPMENT!\n    // debugger; \n\n    // jQuery is handy for finding DOM elements and extracting data from them.\n    // To use it, make sure to enable the \"Inject jQuery\" option.\n    const $ = context.jQuery;\n    const pageTitle = $('title').first().text();\n    const h1 = $('h1').first().text();\n    const first_h2 = $('h2').first().text();\n    const random_text_from_the_page = $('p').first().text();\n\n\n    // Print some information to actor log\n    context.log.info(`URL: ${context.request.url}, TITLE: ${pageTitle}`);\n\n    // Manually add a new page to the queue for scraping.\n    await context.enqueueRequest({ url: 'http://www.example.com' });\n\n    // Return an object with the data extracted from the page.\n    // It will be stored to the resulting dataset.\n    return {\n        url: context.request.url,\n        pageTitle,\n        h1,\n        first_h2,\n        random_text_from_the_page\n    };\n}",
  "proxyConfiguration": {
    "useApifyProxy": true
  },
  "initialCookies": [],
  "waitUntil": [
    "networkidle2"
  ],
  "preNavigationHooks": "// We need to return array of (possibly async) functions here.\n// The functions accept two arguments: the \"crawlingContext\" object\n// and \"gotoOptions\".\n[\n    async (crawlingContext, gotoOptions) => {\n        // ...\n    },\n]\n",
  "postNavigationHooks": "// We need to return array of (possibly async) functions here.\n// The functions accept a single argument: the \"crawlingContext\" object.\n[\n    async (crawlingContext) => {\n        // ...\n    },\n]",
  "breakpointLocation": "NONE",
  "customData": {}
}
EOF

# Run the Actor using an HTTP API
# See the full API reference at https://docs.apify.com/api/v2
curl "https://api.apify.com/v2/acts/undrtkr984~web-scraper-task/runs?token=$API_TOKEN" \
  -X POST \
  -d @input.json \
  -H 'Content-Type: application/json'

Web Scraper Task API

Below, you can find a list of relevant HTTP API endpoints for calling the Web Scraper Task Actor. For this, you’ll need an Apify account. Replace <YOUR_API_TOKEN> in the URLs with your Apify API token, which you can find under Integrations in Apify Console. For details, see the API reference.

Run Actor

POST https://api.apify.com/v2/acts/undrtkr984~web-scraper-task/runs?token=<YOUR_API_TOKEN>

Note: By adding the method=POST query parameter, this API endpoint can be called using a GET request and thus used in third-party webhooks. Please refer to our Run Actor API documentation.
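The method=POST note above can be sketched in shell as follows. This is a minimal illustration, not an official snippet: the helper name get_style_run_url is hypothetical, while the URL and the method=POST parameter come from this page. Because a GET request carries no body, no input payload is sent, so the run relies on whatever default input the Actor defines.

```shell
# Hypothetical helper: builds the Run Actor URL with method=POST appended,
# so the endpoint accepts a plain GET request (for example from a webhook).
get_style_run_url() {
  # $1 = Apify API token
  printf 'https://api.apify.com/v2/acts/undrtkr984~web-scraper-task/runs?token=%s&method=POST' "$1"
}

# Print the URL for the placeholder token; swap in a real token before use.
get_style_run_url '<YOUR_API_TOKEN>'
echo
```

With a real token in API_TOKEN, running curl "$(get_style_run_url "$API_TOKEN")" would send the GET request and start a run.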

Run Actor synchronously and get dataset items

POST https://api.apify.com/v2/acts/undrtkr984~web-scraper-task/run-sync-get-dataset-items?token=<YOUR_API_TOKEN>

Note: This endpoint supports both POST and GET request methods. However, only the POST method allows you to pass input data. For more information, please refer to our Run Actor synchronously and get dataset items API documentation.
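As a rough sketch of the synchronous endpoint, the shell functions below wrap the POST call so that the dataset items come back in the response body. The function names sync_items_url and run_sync are hypothetical, and the input file is assumed to be a JSON file such as the input.json prepared earlier on this page.

```shell
# Hypothetical helper: URL of the synchronous endpoint for a given token.
sync_items_url() {
  # $1 = Apify API token
  printf 'https://api.apify.com/v2/acts/undrtkr984~web-scraper-task/run-sync-get-dataset-items?token=%s' "$1"
}

# Hypothetical helper: POST the input file and write the returned items to stdout.
run_sync() {
  # $1 = Apify API token, $2 = path to the JSON input file
  curl --fail --silent --show-error \
    -X POST \
    -H 'Content-Type: application/json' \
    -d @"$2" \
    "$(sync_items_url "$1")"
}
```

For example, run_sync "$API_TOKEN" input.json > items.json would save the scraped items to a local file. The request blocks until the run finishes, so long crawls may be better served by the asynchronous Run Actor endpoint.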

Get Actor

GET https://api.apify.com/v2/acts/undrtkr984~web-scraper-task?token=<YOUR_API_TOKEN>

For more information, please refer to our Get Actor API documentation.

Actors can be used to scrape web pages, extract data, or automate browser tasks. You can call Web Scraper Task programmatically via the Apify API.

You can choose from:

Web Scraper Task API in Python

Web Scraper Task API in JavaScript

Web Scraper Task API through CLI

Web Scraper Task OpenAPI definition

You can start Web Scraper Task with the Apify API by sending an HTTP POST request to the Run Actor endpoint. The Actor’s input is passed as the payload of the POST request, with its content type given in the Content-Type header, and additional options can be specified using URL query parameters. Within the API, Web Scraper Task is identified by its ID, which combines the creator’s username and the name of the Actor (undrtkr984~web-scraper-task in the endpoint URLs on this page).
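The ID format and the query-parameter options described above can be sketched like this. The helper names actor_id and run_url are hypothetical. The memory and timeout parameters in the usage note are given as examples of Run Actor options; check the API reference for the options your run actually needs.

```shell
# Hypothetical helper: join username and Actor name with a tilde,
# matching the ID format used in the endpoint URLs on this page.
actor_id() {
  # $1 = creator's username, $2 = Actor name
  printf '%s~%s' "$1" "$2"
}

# Hypothetical helper: build the Run Actor URL, optionally appending
# extra query parameters (already URL-encoded) after the token.
run_url() {
  # $1 = username, $2 = Actor name, $3 = API token, $4 = optional extra query string
  base="https://api.apify.com/v2/acts/$(actor_id "$1" "$2")/runs?token=$3"
  if [ -n "${4:-}" ]; then
    printf '%s&%s' "$base" "$4"
  else
    printf '%s' "$base"
  fi
}
```

For example, run_url undrtkr984 web-scraper-task "$API_TOKEN" 'memory=1024&timeout=300' yields a URL that could be passed to curl together with -X POST, -H 'Content-Type: application/json', and -d @input.json.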

When the Web Scraper Task run finishes, you can list the data from its default dataset (storage) via the API, or preview the data directly in Apify Console.
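Listing the results of a finished run could look roughly like the sketch below. It assumes you already know the ID of the run's default dataset (the run object returned by the Run Actor endpoint is assumed here to expose it as defaultDatasetId) and that the dataset items endpoint of the Apify API is used. The helper name dataset_items_url is hypothetical.

```shell
# Hypothetical helper: URL for fetching a dataset's items as JSON.
dataset_items_url() {
  # $1 = dataset ID (assumed to come from the run's defaultDatasetId field)
  # $2 = Apify API token
  printf 'https://api.apify.com/v2/datasets/%s/items?token=%s&format=json' "$1" "$2"
}
```

With DATASET_ID and API_TOKEN set, curl "$(dataset_items_url "$DATASET_ID" "$API_TOKEN")" would print the stored items, one object per scraped page.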

Web Scraper Experimental Debug

mtrunkat/web-scraper-experimental-dbgr

Experimental version of Apify Web Scraper with Chrome debugger integrated

Marek Trunkát

Dynamic Web Scraper

josejet/dynamic-web-scraper

Dynamic Web Scraper is an Apify Actor that gathers information online by simulating user browsing behavior on the web. It reduces the time and amount of scraped web pages by using a model (ChatGPT) to make decisions regarding browser navigation and results evaluation.

Pepa J W̚͠h̾̔̎̿͊͛̄͊e̢̦̲̰̦̋̇͗̾̑oi̟͈̯̝̊̉́̇͑̕ğ̆͘͡e͗͛o͊̔̇̄

224

Web Scraper

apify/web-scraper

Crawls arbitrary websites using a web browser and extracts structured data from web pages using a provided JavaScript function. The Actor supports both recursive crawling and lists of URLs, and automatically manages concurrency for maximum performance.

Apify

100K

3.5

Web Scraper

futurizerush/web-scraper

Simple web scraper. Extract titles, paragraphs, links, images, tables and more from websites. Supports custom CSS selectors and batch collection. For large needs, try Apify's Web Content Crawler.

Futurize Rush

Pro Web Content Crawler (With Images)

assertive_analogy/pro-web-content-crawler

Pro Web Content Crawler is a powerful tool that digs deep into web content and images. It handles complex sites, dynamic pages, and hidden content, making it perfect for extracting both data and images. Customizable and API-ready for your unique data needs.

Gideon Nesh

170

5.0

Website Content Vector Retriever

hamza.alwan/website-content-vector-retriever

Hamza Alwan

Instant web data scraper - Scrape any website

curious_coder/instant-web-scraper

Scrape data from any public or private website by providing just a URL and, optionally, cookies and proxy information. This scraper is similar to Instant Data Scraper, but it runs in the cloud and can also be used as an API.

Curious Coder

1.7K

3.6

Shopify API Scraper

rl1987/shopify-api-scraper

API scraper to get product data from almost any Shopify store!

R.L.

Magento E-Commerce Scraper 🚧

jupri/magento-scraper

Scrape data about product price, description and other information from Magento E-Commerce websites.

cat

388

Shopify Product Scraper: Extract Product Data via JSON API

linen_snack/shopify-product-scraper-extract-product-data-via-json-api

Effortlessly scrape comprehensive product data (titles, descriptions, prices, variants, images, SKUs, inventory, and more) from any Shopify store. Useful for e-commerce analysis, price monitoring, or building product feeds. Fast, reliable, and easy to configure with just the store URL.