Legacy PhantomJS Crawler
Try for free
No credit card required
View all Actors
Legacy PhantomJS Crawler
apify/legacy-phantomjs-crawler
Try for free
No credit card required
Replacement for the legacy Apify Crawler product with a backward-compatible interface. The actor uses PhantomJS headless browser to recursively crawl websites and extract data from them using a piece of front-end JavaScript code.
The code examples below show how to run the Actor and get its results. To run the code, you need to have an Apify account. Replace <YOUR_API_TOKEN> in the code with your API token, which you can find under Settings > Integrations in Apify Console. Learn mode
Node.js
Python
curl
1import { ApifyClient } from 'apify-client';
2
3// Initialize the ApifyClient with your Apify API token
4const client = new ApifyClient({
5 token: '<YOUR_API_TOKEN>',
6});
7
8// Prepare Actor input
9const input = {
10 "startUrls": [
11 {
12 "key": "START",
13 "value": "https://www.example.com/"
14 }
15 ],
16 "crawlPurls": [
17 {
18 "key": "MY_LABEL",
19 "value": "https://www.example.com/[.*]"
20 }
21 ],
22 "clickableElementsSelector": "a:not([rel=nofollow])",
23 "pageFunction": function pageFunction(context) {
24 // called on every page the crawler visits, use it to extract data from it
25 var $ = context.jQuery;
26 var result = {
27 title: $('title').text(),
28 myValue: $('TODO').text()
29 };
30 return result;
31 },
32 "interceptRequest": function interceptRequest(context, newRequest) {
33 // called whenever the crawler finds a link to a new page,
34 // use it to override default behavior
35 return newRequest;
36 }
37};
38
39(async () => {
40 // Run the Actor and wait for it to finish
41 const run = await client.actor("apify/legacy-phantomjs-crawler").call(input);
42
43 // Fetch and print Actor results from the run's dataset (if any)
44 console.log('Results from dataset');
45 const { items } = await client.dataset(run.defaultDatasetId).listItems();
46 items.forEach((item) => {
47 console.dir(item);
48 });
49})();
Developer
Maintained by Apify
Actor metrics
- 124 monthly users
- 99.9% runs succeeded
- Created in Mar 2019
- Modified 6 months ago
Categories