Page Scraping Analyzer
Analyzes a webpage to determine the best way to scrape its data. Provide a URL and the data points to find, and get back a detailed dashboard showing how the data can be scraped. Works with initial and rendered HTML, JavaScript variables, and dynamically loaded data.
You can access the Page Scraping Analyzer programmatically from your own Python applications by using the Apify API. To use the Apify API, you’ll need an Apify account and your API token, found in the Integrations settings in Apify Console.
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "url": "http://books.toscrape.com/catalogue/a-light-in-the-attic_1000/index.html",
    "keywords": [
        "A Light in the Attic",
        "51.77",
        "In stock",
        "22 available",
        "a897fe39b1053632",
        "It's hard to imagine a world without A Light in the Attic. This now-classic collection of poetry and drawings from Shel Silverstein celebrates its 20th anniversary with this special edition. Silverstein's humorous and creative verse can amuse the dowdiest of readers. Lemon-faced adults and fidgety kids sit still and read these rhythmic words and laugh and smile and love th It's hard to imagine a world without A Light in the Attic. This now-classic collection of poetry and drawings from Shel Silverstein celebrates its 20th anniversary with this special edition. Silverstein's humorous and creative verse can amuse the dowdiest of readers. Lemon-faced adults and fidgety kids sit still and read these rhythmic words and laugh and smile and love that Silverstein. Need proof of his genius? RockabyeRockabye baby, in the treetopDon't you know a treetopIs no safe place to rock?And who put you up there,And your cradle, too?Baby, I think someone down here'sGot it in for you. Shel, you never sounded so good. ...more",
    ],
    "proxyConfig": { "useApifyProxy": True },
}

# Run the Actor and wait for it to finish
run = client.actor("apify/page-analyzer").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start
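The script above can also be split into small reusable helpers. The following is a minimal sketch, not part of the official example: the function names `build_run_input` and `run_analysis` are hypothetical, and the check on the run's `status` field assumes the run object returned by `call()` reports `"SUCCEEDED"` for a successful run and may be `None` in some cases. The `client` argument is expected to be an `ApifyClient` instance created as shown above.

```python
from typing import Any, Dict, Iterable, List


def build_run_input(
    url: str, keywords: Iterable[str], use_apify_proxy: bool = True
) -> Dict[str, Any]:
    """Build an input dict with the same keys as the example input (hypothetical helper)."""
    if not url.startswith(("http://", "https://")):
        raise ValueError(f"Expected an absolute http(s) URL, got: {url!r}")
    # Drop empty or whitespace-only keywords; keep the rest unchanged.
    kept = [k for k in keywords if k and k.strip()]
    if not kept:
        raise ValueError("At least one non-empty keyword is required")
    return {
        "url": url,
        "keywords": kept,
        "proxyConfig": {"useApifyProxy": use_apify_proxy},
    }


def run_analysis(
    client: Any, run_input: Dict[str, Any], actor_id: str = "apify/page-analyzer"
) -> List[Dict[str, Any]]:
    """Run the Actor, wait for it to finish, and return its dataset items as a list."""
    run = client.actor(actor_id).call(run_input=run_input)
    # Assumption: a missing run object or a non-SUCCEEDED status means no usable results.
    if run is None:
        raise RuntimeError("The Actor call did not return a run object")
    if run.get("status") != "SUCCEEDED":
        raise RuntimeError(f"Actor run ended with status {run.get('status')!r}")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

A call such as `run_analysis(client, build_run_input(url, ["51.77", "In stock"]))` would then return the analysis results as a list of dictionaries instead of printing them one by one.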
Page Scraping Analyzer API in Python
The Apify API client for Python is the official library for using the Page Scraping Analyzer API in Python. It provides convenience functions and automatic retries on errors.
Install the apify-client
pip install apify-client
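Once the client is installed, the API token does not have to be hard-coded in the script. The sketch below reads it from an environment variable instead. The helper name `load_api_token` is hypothetical, and the variable name `APIFY_TOKEN` is an assumption chosen for illustration; use whatever name fits your deployment.

```python
import os
from typing import Mapping, Optional


def load_api_token(
    env: Optional[Mapping[str, str]] = None, var_name: str = "APIFY_TOKEN"
) -> str:
    """Return the API token from the environment, or raise if it is missing.

    `var_name` defaults to APIFY_TOKEN, which is an assumed name, not a requirement.
    """
    source = os.environ if env is None else env
    token = source.get(var_name, "").strip()
    if not token:
        raise RuntimeError(f"Set the {var_name} environment variable to your Apify API token")
    return token
```

The returned value can then be passed to the client in place of the placeholder, for example `ApifyClient(load_api_token())`.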
Actor Metrics
19 monthly users
10 stars
95% runs succeeded
Created in Feb 2018
Modified 5 months ago