Web Scraper

apify/web-scraper

Crawls arbitrary websites using the Chrome browser and extracts structured data from web pages using a provided JavaScript function. The Actor supports both recursive crawling and lists of URLs, and automatically manages concurrency for maximum performance.

received 401 status code

Open
Competent Path (competent_path) opened this issue
4 days ago

I tried this with the following input:

{
    "breakpointLocation": "NONE",
    "browserLog": false,
    "closeCookieModals": false,
    "debugLog": false,
    "downloadCss": false,
    "downloadMedia": false,
    "excludes": [
        {
            "glob": "/**/*.{png,jpg,jpeg,pdf}"
        }
    ],
    "globs": [
        {
            "glob": ""
        }
    ],
    "headless": false,
    "ignoreCorsAndCsp": true,
    "ignoreSslErrors": true,
    "injectJQuery": true,
    "keepUrlFragments": false,
    "pageFunction": "async function pageFunction(context) {\n    const $ = context.jQuery;\n    return {html: $('html').first().html()};\n}",
    "postNavigationHooks": "// We need to return array of (possibly async) functions here.\n// The functions accept a single argument: the \"crawlingContext\" object.\n[\n    async (crawlingContext) => {\n        // ...\n    },\n]",
    "preNavigationHooks": "// We need to return array of (possibly async) functions here.\n// The functions accept two arguments: the \"crawlingContext\" object\n// and \"gotoOptions\".\n[\n    async (crawlingContext, gotoOptions) => {\n        // ...\n    },\n]\n",
    "proxyConfiguration": {
        "useApifyProxy": true,
        "apifyProxyGroups": [
            "RESIDENTIAL"
        ]
    },
    "runMode": "PRODUCTION",
    "startUrls": [
        {
            "url": "https://www.wsj.com/livecoverage/stock-market-today-dow-sp500-nasdaq-live-08-07-2024/card/robinhood-reports-record-quarterly-revenue-and-profit-tIlQ0DnKKwNWFeqoRcA2",
            "method": "GET"
        }
    ],
    "useChrome": true,
    "waitUntil": [
        "networkidle2"
    ]
}
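
For readability, the escaped pageFunction string in the input above corresponds to the function below once the \n sequences are expanded. Nothing is added or changed. It relies on context.jQuery, which should be available here because injectJQuery is set to true in the same input.

```javascript
// The pageFunction from the input, unescaped for readability.
// It returns the inner HTML of the page's <html> element.
async function pageFunction(context) {
    // jQuery handle injected into the page (input has "injectJQuery": true).
    const $ = context.jQuery;
    return {html: $('html').first().html()};
}
```

Since the request is rejected with a 401 before any page is processed, this function probably never runs for the start URL.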

PuppeteerCrawler: Reclaiming failed request back to the list or queue. Request blocked - received 401 status code. 2025-02-18T22:39:55.764Z {"id":"9nWDjDToDXvA6Ny","url":"https://www.wsj.com/livecoverage/stock-market-today-dow-sp500-nasdaq-live-08-07-2024/card/robinhood-reports-record-quarterly-revenue-and-profit-tIlQ0DnKKwNWFeqoRcA2","retryCount":1}
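
When comparing several failed runs, it can help to pull the status code and request details out of log lines like the one above. The following is only a rough sketch. parseBlockedRequestLog is a hypothetical helper, not part of Apify or Crawlee, and it assumes the line contains the phrase "received NNN status code" and that the first "{" in the line starts a JSON object describing the request.

```javascript
// Hypothetical helper (not an Apify/Crawlee API): extracts the HTTP status
// code and request details from a "Request blocked" log line.
function parseBlockedRequestLog(line) {
    // Status code as it appears in "received 401 status code".
    const statusMatch = line.match(/received (\d{3}) status code/);

    // Assumption: the first "{" in the line begins the request JSON.
    const jsonStart = line.indexOf('{');
    let details = null;
    if (jsonStart !== -1) {
        try {
            details = JSON.parse(line.slice(jsonStart));
        } catch {
            details = null; // Not valid JSON, so leave the details empty.
        }
    }

    return {
        statusCode: statusMatch ? Number(statusMatch[1]) : null,
        url: details?.url ?? null,
        retryCount: details?.retryCount ?? null,
    };
}

// Usage with a shortened, made-up line in the same shape as the log above.
const sample = 'PuppeteerCrawler: Reclaiming failed request back to the list or queue. '
    + 'Request blocked - received 401 status code. 2025-02-18T22:39:55.764Z '
    + '{"id":"example","url":"https://example.com/page","retryCount":1}';
const parsed = parseBlockedRequestLog(sample);
console.log(parsed.statusCode, parsed.url, parsed.retryCount);
```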

Developer
Maintained by Apify

Actor Metrics

  • 3.3k monthly users

  • 456 bookmarks

  • >99% runs succeeded

  • 4.8 days response time

  • Created in Mar 2019

  • Modified a month ago