Page Scraping Analyzer

apify/page-analyzer

Analyzes a webpage to determine the best way to scrape its data. Provide a URL and the data points to find, and get back a detailed dashboard showing how the data can be scraped. Works with initial and rendered HTML, JavaScript variables, and dynamically loaded data.

You can access the Page Scraping Analyzer programmatically from your own applications by using the Apify API. To use the Apify API, you’ll need an Apify account and your API token, found in Integrations settings in Apify Console.

# Set API token (replace the placeholder with your own token)
API_TOKEN='<YOUR_API_TOKEN>'

# Prepare Actor input
cat > input.json << 'EOF'
{
  "url": "http://books.toscrape.com/catalogue/a-light-in-the-attic_1000/index.html",
  "keywords": [
    "A Light in the Attic",
    "51.77",
    "In stock",
    "22 available",
    "a897fe39b1053632",
    "It's hard to imagine a world without A Light in the Attic. This now-classic collection of poetry and drawings from Shel Silverstein celebrates its 20th anniversary with this special edition. Silverstein's humorous and creative verse can amuse the dowdiest of readers. Lemon-faced adults and fidgety kids sit still and read these rhythmic words and laugh and smile and love th It's hard to imagine a world without A Light in the Attic. This now-classic collection of poetry and drawings from Shel Silverstein celebrates its 20th anniversary with this special edition. Silverstein's humorous and creative verse can amuse the dowdiest of readers. Lemon-faced adults and fidgety kids sit still and read these rhythmic words and laugh and smile and love that Silverstein. Need proof of his genius? RockabyeRockabye baby, in the treetopDon't you know a treetopIs no safe place to rock?And who put you up there,And your cradle, too?Baby, I think someone down here'sGot it in for you. Shel, you never sounded so good. ...more"
  ],
  "proxyConfig": {
    "useApifyProxy": true
  }
}
EOF

# Run the Actor using an HTTP API
# See the full API reference at https://docs.apify.com/api/v2
curl "https://api.apify.com/v2/acts/apify~page-analyzer/runs?token=$API_TOKEN" \
  -X POST \
  -d @input.json \
  -H 'Content-Type: application/json'

Page Scraping Analyzer API

Below is a list of the relevant HTTP API endpoints for calling the Page Scraping Analyzer Actor. To use them, you’ll need an Apify account. Replace <YOUR_API_TOKEN> in the URLs with your Apify API token, which you can find under Integrations in Apify Console. For details, see the API reference.

Run Actor

POST
https://api.apify.com/v2/acts/apify~page-analyzer/runs?token=<YOUR_API_TOKEN>

Note: By adding the method=POST query parameter, this API endpoint can also be called using a GET request, which makes it usable in third-party webhooks. For details, see the Run Actor API documentation.
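As a rough illustration of the note above, the following sketch assembles the GET form of the Run Actor URL by appending method=POST. The function name run_actor_get_url and the TOKEN placeholder are hypothetical; the snippet only builds a string, and the real request is left commented out because it would start an actual run.

```shell
# Base URL and Actor ID as shown on this page.
API_BASE='https://api.apify.com/v2'
ACTOR='apify~page-analyzer'

# Hypothetical helper: prints a Run Actor URL that can be called with GET.
run_actor_get_url() {
  printf '%s/acts/%s/runs?token=%s&method=POST\n' "$API_BASE" "$ACTOR" "$1"
}

# Show the URL a webhook would call (TOKEN is a placeholder).
run_actor_get_url 'TOKEN'

# To actually trigger a run, uncomment the next line:
# curl "$(run_actor_get_url "$API_TOKEN")"
```

A webhook service that can only issue GET requests could then be pointed at the printed URL.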

Run Actor synchronously and get dataset items

POST
https://api.apify.com/v2/acts/apify~page-analyzer/run-sync-get-dataset-items?token=<YOUR_API_TOKEN>

Note: This endpoint supports both POST and GET request methods. However, only the POST method allows you to pass input data. For more information, see the Run Actor synchronously and get dataset items API documentation.
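A minimal sketch of a synchronous call with input might look like the following. It assumes an input.json file like the one prepared earlier on this page; the helper names sync_items_url and run_sync_get_items are hypothetical. The function is only defined here, not invoked, since calling it performs a real run.

```shell
API_BASE='https://api.apify.com/v2'
ACTOR='apify~page-analyzer'

# Hypothetical helper: prints the synchronous endpoint URL for a token.
sync_items_url() {
  printf '%s/acts/%s/run-sync-get-dataset-items?token=%s\n' "$API_BASE" "$ACTOR" "$1"
}

# Hypothetical helper: POSTs a JSON input file and prints the dataset items.
# $1 = API token, $2 = path to the input JSON file.
run_sync_get_items() {
  curl --fail --silent --show-error \
    -X POST \
    -H 'Content-Type: application/json' \
    -d @"$2" \
    "$(sync_items_url "$1")"
}

# Example call (performs a real request, so it is commented out):
# run_sync_get_items "$API_TOKEN" input.json > items.json
```

POST is used because, as noted above, it is the only method that lets you pass input data to this endpoint.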

Get Actor

GET
https://api.apify.com/v2/acts/apify~page-analyzer?token=<YOUR_API_TOKEN>

For more information, see the Get Actor API documentation.

Actors can be used to scrape web pages, extract data, or automate browser tasks. You can run the Page Scraping Analyzer programmatically via the Apify API.

You can start the Page Scraping Analyzer with the Apify API by sending an HTTP POST request to the Run Actor endpoint. The Actor’s input and its content type can be passed as the payload of the POST request, and additional options can be specified using URL query parameters. The Page Scraping Analyzer is identified within the API by its ID, which combines the creator’s username and the Actor’s name: apify/page-analyzer, written as apify~page-analyzer in API URLs.
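To make the ID format and query-parameter options more concrete, here is a small sketch. The helper names actor_api_id and run_url are hypothetical. The memory and timeout parameters are given as examples of run options and should be checked against the Run Actor API reference before use.

```shell
API_BASE='https://api.apify.com/v2'

# Hypothetical helper: converts a Store-style ID (username/name)
# into the tilde-separated form used in API URLs (username~name).
actor_api_id() {
  printf '%s\n' "$1" | tr '/' '~'
}

# Hypothetical helper: builds a Run Actor URL.
# $1 = Store-style Actor ID, $2 = API token, $3 = optional extra query string.
run_url() {
  url="$API_BASE/acts/$(actor_api_id "$1")/runs?token=$2"
  if [ -n "$3" ]; then
    url="$url&$3"
  fi
  printf '%s\n' "$url"
}

# TOKEN is a placeholder; memory and timeout are assumed example options.
run_url 'apify/page-analyzer' 'TOKEN' 'memory=1024&timeout=120'
```

The resulting URL can be passed to curl with -X POST and a JSON payload, in the same way as the Run Actor request shown earlier on this page.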

When the Page Scraping Analyzer run finishes, you can list the data from its default dataset (storage) via the API, or preview the data directly in Apify Console.
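One way to get from a finished run to its data is sketched below, assuming the run object returned by the API exposes the default dataset's ID as data.defaultDatasetId. The helper names and the sample response are hypothetical, and the sed-based extraction is deliberately crude; a JSON tool such as jq would be more robust in practice.

```shell
API_BASE='https://api.apify.com/v2'

# Hypothetical helper: crude extraction of defaultDatasetId
# from a run object read on standard input.
extract_dataset_id() {
  sed -n 's/.*"defaultDatasetId"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Hypothetical helper: builds the URL that lists a dataset's items as JSON.
# $1 = dataset ID, $2 = API token.
dataset_items_url() {
  printf '%s/datasets/%s/items?token=%s&format=json\n' "$API_BASE" "$1" "$2"
}

# Heavily trimmed, made-up run response used only for illustration.
sample_run='{"data":{"id":"RUN_ID","defaultDatasetId":"DATASET_ID"}}'

dataset_id=$(printf '%s\n' "$sample_run" | extract_dataset_id)
dataset_items_url "$dataset_id" 'TOKEN'

# To download the items for a real run, uncomment:
# curl "$(dataset_items_url "$dataset_id" "$API_TOKEN")" > items.json
```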

Developer
Maintained by Apify

Actor Metrics

  • 14 monthly users

  • 10 stars

  • 97% runs succeeded

  • Created in Feb 2018

  • Modified 6 months ago