Vanilla JS Scraper
mstephen190/vanilla-js-scraper
Scrape the web using familiar JavaScript methods! Crawls websites using raw HTTP requests, parses the HTML with the JSDOM package, and extracts data from the pages using Node.js code. Supports both recursive crawling and lists of URLs. This Actor is a non-jQuery alternative to CheerioScraper.
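Recursive crawls are scoped with pseudo-URLs (the `pseudoUrls` field in the Actor input): sections enclosed in square brackets are regular expressions, and everything outside them is matched literally. A rough sketch of that matching logic, under that assumption about the bracket semantics (the `purl_to_regex` helper is hypothetical, not part of the Actor or the Apify client):

```python
import re

def purl_to_regex(purl: str) -> "re.Pattern":
    """Convert an Apify-style pseudo-URL into a compiled regex.

    Text outside [...] is matched literally; text inside [...] is kept
    as a raw regex fragment. Brackets may nest, since the fragment can
    itself contain character classes like [\\w-].
    """
    pattern = ""
    i = 0
    while i < len(purl):
        if purl[i] == "[":
            # Find the matching closing bracket, tracking nesting depth
            depth, j = 1, i + 1
            while depth:
                if purl[j] == "[":
                    depth += 1
                elif purl[j] == "]":
                    depth -= 1
                j += 1
            pattern += purl[i + 1:j - 1]  # keep the regex fragment as-is
            i = j
        else:
            pattern += re.escape(purl[i])  # literal character
            i += 1
    return re.compile(f"^{pattern}$")
```

With the pseudo-URL from the example input, `purl_to_regex("https://apify.com[(/[\\w-]+)?]")` matches `https://apify.com` and one-segment paths like `https://apify.com/store`, but not deeper paths, which is how the crawl stays on top-level pages.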
The code example below shows how to run the Actor and fetch its results. To run it, you need an Apify account. Replace <YOUR_API_TOKEN> in the code with your API token, which you can find under Settings > Integrations in Apify Console.
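Rather than pasting the token into the source, you can read it from an environment variable (the variable name `APIFY_TOKEN` here is an arbitrary choice, not one the client enforces):

```python
import os

# Read the Apify API token from the environment instead of hardcoding it.
# APIFY_TOKEN is our name for the variable; fall back to the placeholder.
token = os.environ.get("APIFY_TOKEN", "<YOUR_API_TOKEN>")
# client = ApifyClient(token)  # pass it to the client as usual
```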
```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "requests": [{ "url": "https://apify.com" }],
    "pseudoUrls": [{ "purl": "https://apify.com[(/[\\w-]+)?]" }],
    "linkSelector": "a[href]",
    "pageFunction": """async function pageFunction(context) {
    const { window, document, crawler, enqueueRequest, request, response, userData, json, body, kvStore, customData } = context;

    const title = document.querySelector('title').textContent;

    const responseHeaders = response.headers;

    return {
        title,
        responseHeaders,
    };
}""",
    "preNavigationHooks": """// We need to return an array of (possibly async) functions here.
// The functions accept two arguments: the \"crawlingContext\" object
// and \"requestAsBrowserOptions\", which is passed to the `requestAsBrowser()`
// function the crawler calls to navigate.
[
    async (crawlingContext, requestAsBrowserOptions) => {
        // ...
    }
]""",
    "postNavigationHooks": """// We need to return an array of (possibly async) functions here.
// The functions accept a single argument: the \"crawlingContext\" object.
[
    async (crawlingContext) => {
        // ...
    },
]""",
    "proxy": { "useApifyProxy": True },
    "additionalMimeTypes": [],
    "customData": {},
}

# Run the Actor and wait for it to finish
run = client.actor("mstephen190/vanilla-js-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start
```
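The items yielded by `iterate_items()` are plain dictionaries, so they can be post-processed with the standard library alone. A minimal sketch of flattening them to CSV (the `items_to_csv` helper and the sample records are illustrative, not part of the Apify client):

```python
import csv
import io

def items_to_csv(items):
    """Serialize a list of flat dicts (dataset items) to CSV text.

    Uses the union of all keys as the header so items with differing
    fields still fit; missing values are written as empty cells.
    """
    if not items:
        return ""
    fieldnames = sorted({key for item in items for key in item})
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames)
    writer.writeheader()
    for item in items:
        writer.writerow(item)
    return buf.getvalue()
```

For example, `items_to_csv([{"title": "Apify", "url": "https://apify.com"}])` yields a two-line CSV with a `title,url` header.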
Maintained by Community
Actor metrics
- 11 monthly users
- 2 stars
- 99.5% runs succeeded
- Created in Mar 2022
- Modified 10 months ago