Page Scraping Analyzer

apify/page-analyzer

Analyzes a web page to determine the best way to scrape its data. Provide a URL and the data points to find, and get back a detailed dashboard showing how the data can be scraped. Works with initial and rendered HTML, JavaScript variables, and dynamically loaded data.

The code example below shows how to start a run of the Actor through the Apify API. To run the code, you need an Apify account. Replace <YOUR_API_TOKEN> in the code with your API token, which you can find under Settings > Integrations in Apify Console.

curl

# Set API token
API_TOKEN=<YOUR_API_TOKEN>

# Prepare Actor input
cat > input.json <<'EOF'
{
  "url": "http://books.toscrape.com/catalogue/a-light-in-the-attic_1000/index.html",
  "keywords": [
    "A Light in the Attic",
    "51.77",
    "In stock",
    "22 available",
    "a897fe39b1053632",
    "It's hard to imagine a world without A Light in the Attic. This now-classic collection of poetry and drawings from Shel Silverstein celebrates its 20th anniversary with this special edition. Silverstein's humorous and creative verse can amuse the dowdiest of readers. Lemon-faced adults and fidgety kids sit still and read these rhythmic words and laugh and smile and love th It's hard to imagine a world without A Light in the Attic. This now-classic collection of poetry and drawings from Shel Silverstein celebrates its 20th anniversary with this special edition. Silverstein's humorous and creative verse can amuse the dowdiest of readers. Lemon-faced adults and fidgety kids sit still and read these rhythmic words and laugh and smile and love that Silverstein. Need proof of his genius? RockabyeRockabye baby, in the treetopDon't you know a treetopIs no safe place to rock?And who put you up there,And your cradle, too?Baby, I think someone down here'sGot it in for you. Shel, you never sounded so good. ...more"
  ],
  "proxyConfig": {
    "useApifyProxy": true
  }
}
EOF

# Run the Actor using an HTTP API
# See the full API reference at https://docs.apify.com/api/v2
curl "https://api.apify.com/v2/acts/apify~page-analyzer/runs?token=$API_TOKEN" \
  -X POST \
  -d @input.json \
  -H 'Content-Type: application/json'
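
The request above only starts a run; the JSON response describes the run rather than the analysis itself. As a minimal sketch of a possible follow-up, the helpers below build the Apify API v2 URLs for checking a run and reading its default storages. The function names are hypothetical (they are not part of any Apify tool), and the page does not state whether this Actor writes its results to the dataset or to the key-value store, so both are shown. The record key OUTPUT in the usage comment is likewise an assumption.

```shell
# Hypothetical helper functions; they only build URLs for endpoints of the
# public Apify API v2 and make no network calls themselves.
API_BASE="https://api.apify.com/v2"

# URL for reading one run's status. waitForFinish asks the API to hold the
# request for up to that many seconds (the API caps it at 60).
run_status_url() {
  printf '%s/actor-runs/%s?waitForFinish=%s' "$API_BASE" "$1" "${2:-60}"
}

# URL for listing the items of a dataset as JSON.
dataset_items_url() {
  printf '%s/datasets/%s/items?format=json' "$API_BASE" "$1"
}

# URL for reading one record from a key-value store.
kv_record_url() {
  printf '%s/key-value-stores/%s/records/%s' "$API_BASE" "$1" "$2"
}

# Example usage. Assumes RUN_ID, DATASET_ID and STORE_ID were copied from the
# "id", "defaultDatasetId" and "defaultKeyValueStoreId" fields of the run
# object returned by the POST request; OUTPUT is an assumed record key.
#   curl -H "Authorization: Bearer $API_TOKEN" "$(run_status_url "$RUN_ID")"
#   curl -H "Authorization: Bearer $API_TOKEN" "$(dataset_items_url "$DATASET_ID")"
#   curl -H "Authorization: Bearer $API_TOKEN" "$(kv_record_url "$STORE_ID" OUTPUT)"
```

Sending the token in an Authorization header instead of the ?token= query parameter keeps it out of URLs that may end up in logs or shell history; either form is accepted by the API.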
Developer
Maintained by Apify
Actor metrics
  • 40 monthly users
  • 80.6% runs succeeded
  • 0.0 days response time
  • Created in Feb 2018
  • Modified 9 months ago