Legacy PhantomJS Crawler

A replacement for the legacy Apify Crawler product with a backward-compatible interface. The Actor uses the PhantomJS headless browser to recursively crawl websites and extract data from them using a piece of front-end JavaScript code.

Pricing: Pay per usage
Rating: 5.0 (6)
Developer: Apify
Total users: 852
Last modified: 2 months ago
Categories: Developer tools, Open source

You can access the Legacy PhantomJS Crawler programmatically from your own applications by using the Apify API. To use the Apify API, you’ll need an Apify account and your API token, found under Integrations in Apify Console. The example below uses the HTTP API with curl.

# Set API token (quoted so the shell does not treat < and > as redirections)
API_TOKEN='<YOUR_API_TOKEN>'

# Prepare Actor input
cat > input.json << 'EOF'
{
  "startUrls": [
    {
      "key": "START",
      "value": "https://www.example.com/"
    }
  ],
  "crawlPurls": [
    {
      "key": "MY_LABEL",
      "value": "https://www.example.com/[.*]"
    }
  ],
  "clickableElementsSelector": "a:not([rel=nofollow])",
  "pageFunction": "function pageFunction(context) {\n    // called on every page the crawler visits, use it to extract data from it\n    var $ = context.jQuery;\n    var result = {\n        title: $('title').text(),\n        myValue: $('TODO').text()\n    };\n    return result;\n}\n",
  "interceptRequest": "function interceptRequest(context, newRequest) {\n    // called whenever the crawler finds a link to a new page,\n    // use it to override default behavior\n    return newRequest;\n}\n"
}
EOF

# Run the Actor using an HTTP API
# See the full API reference at https://docs.apify.com/api/v2
curl "https://api.apify.com/v2/acts/apify~legacy-phantomjs-crawler/runs?token=$API_TOKEN" \
  -X POST \
  -d @input.json \
  -H 'Content-Type: application/json'

Legacy PhantomJS Crawler - Crawl websites, extract data API

Below, you can find a list of relevant HTTP API endpoints for calling the Legacy PhantomJS Crawler Actor. For this, you’ll need an Apify account. Replace <YOUR_API_TOKEN> in the URLs with your Apify API token, which you can find under Integrations in Apify Console. For details, see the API reference.

Run Actor

POST
https://api.apify.com/v2/acts/apify~legacy-phantomjs-crawler/runs?token=<YOUR_API_TOKEN>

Note: By adding the method=POST query parameter, this API endpoint can be called using a GET request and thus used in third-party webhooks. Please refer to our Run Actor API documentation.
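As a minimal sketch of the note above, the following snippet builds a Run Actor URL that carries the method=POST query parameter, so it can be triggered with a plain GET request (for example from a third-party webhook). The helper name run_actor_get_url is hypothetical and not part of any Apify tool; the token is assumed to be passed as the first argument.

```shell
# Hypothetical helper: prints a Run Actor URL that can be called with GET.
# $1 = Apify API token
run_actor_get_url() {
  printf 'https://api.apify.com/v2/acts/apify~legacy-phantomjs-crawler/runs?token=%s&method=POST\n' "$1"
}

# Example (not executed here): start a run with a GET request.
# curl "$(run_actor_get_url "$API_TOKEN")"
```

Keeping the URL construction in one function makes it easy to paste the printed URL into a webhook configuration without running curl at all.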

Run Actor synchronously and get dataset items

POST
https://api.apify.com/v2/acts/apify~legacy-phantomjs-crawler/run-sync-get-dataset-items?token=<YOUR_API_TOKEN>

Note: This endpoint supports both POST and GET request methods. However, only the POST method allows you to pass input data. For more information, please refer to our Run Actor synchronously and get dataset items API documentation.
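The synchronous endpoint can be called much like the asynchronous one. The sketch below assumes an input file prepared as in the earlier example on this page; the helper names run_sync_url and run_sync_to_file are hypothetical, and the curl call is wrapped in a function so nothing is sent until you invoke it.

```shell
# Hypothetical helper: prints the synchronous run URL for a given token.
# $1 = Apify API token
run_sync_url() {
  printf 'https://api.apify.com/v2/acts/apify~legacy-phantomjs-crawler/run-sync-get-dataset-items?token=%s\n' "$1"
}

# Hypothetical helper: POSTs an input file and writes the response body
# (the dataset items) to an output file.
# $1 = API token, $2 = path to input JSON, $3 = output file
run_sync_to_file() {
  curl --fail --silent --show-error \
    -X POST \
    -H 'Content-Type: application/json' \
    -d @"$2" \
    -o "$3" \
    "$(run_sync_url "$1")"
}

# Example (not executed here):
# run_sync_to_file "$API_TOKEN" input.json items.json
```

POST is used here because, as noted above, only POST allows input data to be passed.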

Get Actor

GET
https://api.apify.com/v2/acts/apify~legacy-phantomjs-crawler?token=<YOUR_API_TOKEN>

For more information, please refer to our Get Actor API documentation.

Actors can be used to scrape web pages, extract data, or automate browser tasks. Use the Legacy PhantomJS Crawler API programmatically via the Apify API.

You can choose from:

Legacy PhantomJS Crawler API in Python

Legacy PhantomJS Crawler API in JavaScript

Legacy PhantomJS Crawler API through CLI

Legacy PhantomJS Crawler OpenAPI definition

You can start Legacy PhantomJS Crawler with the Apify API by sending an HTTP POST request to the Run Actor endpoint. The Actor’s input and its content type can be passed as the payload of the POST request, and additional options can be specified using URL query parameters. Within the API, the Legacy PhantomJS Crawler is identified by its ID, which combines the creator’s username and the name of the Actor, joined by a tilde: apify~legacy-phantomjs-crawler.
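To illustrate how the Actor ID and the query parameters fit into the Run Actor URL, here is a small sketch. The helper names actor_id and run_url are hypothetical. The timeout and memory parameters in the example are run options taken from the Apify API reference rather than from this page, so check the reference before relying on them.

```shell
# Hypothetical helper: joins the creator's username and the Actor name
# with a tilde, as seen in the endpoint URLs on this page.
# $1 = username, $2 = Actor name
actor_id() {
  printf '%s~%s\n' "$1" "$2"
}

# Hypothetical helper: prints a Run Actor URL.
# $1 = Actor ID, $2 = API token, $3 = optional extra query string
run_url() {
  url="https://api.apify.com/v2/acts/$1/runs?token=$2"
  if [ -n "${3:-}" ]; then
    url="$url&$3"
  fi
  printf '%s\n' "$url"
}

# Example (not executed here): start a run with extra options.
# curl -X POST -H 'Content-Type: application/json' -d @input.json \
#   "$(run_url "$(actor_id apify legacy-phantomjs-crawler)" "$API_TOKEN" 'timeout=60&memory=1024')"
```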

When the Legacy PhantomJS Crawler run finishes, you can list the data from its default dataset (storage) via the API, or preview the data directly in Apify Console.
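Listing the results via the API might look like the sketch below. It assumes the dataset items endpoint and the defaultDatasetId field of the run object as described in the Apify API reference; neither appears on this page. The helper name dataset_items_url is hypothetical, and the commented example additionally assumes that jq is installed and that run.json holds the JSON response of the Run Actor call.

```shell
# Hypothetical helper: prints the URL that lists items of a dataset.
# $1 = dataset ID, $2 = API token, $3 = output format (defaults to json)
dataset_items_url() {
  printf 'https://api.apify.com/v2/datasets/%s/items?token=%s&format=%s\n' "$1" "$2" "${3:-json}"
}

# Example (not executed here):
# DATASET_ID=$(jq -r '.data.defaultDatasetId' run.json)
# curl "$(dataset_items_url "$DATASET_ID" "$API_TOKEN")" -o items.json
```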
