Web Scraper

Pricing

Pay per usage

Developed by

Apify

Maintained by Apify

Crawls arbitrary websites using a web browser and extracts structured data from web pages using a provided JavaScript function. The Actor supports both recursive crawling and lists of URLs, and automatically manages concurrency for maximum performance.

received 401 status code

Open

Competent Path (competent_path) opened this issue
3 months ago

I tried this with the following input:

{
  "breakpointLocation": "NONE",
  "browserLog": false,
  "closeCookieModals": false,
  "debugLog": false,
  "downloadCss": false,
  "downloadMedia": false,
  "excludes": [
    {
      "glob": "/**/*.{png,jpg,jpeg,pdf}"
    }
  ],
  "globs": [
    {
      "glob": ""
    }
  ],
  "headless": false,
  "ignoreCorsAndCsp": true,
  "ignoreSslErrors": true,
  "injectJQuery": true,
  "keepUrlFragments": false,
  "pageFunction": "async function pageFunction(context) {\n const $ = context.jQuery;\n return {html: $('html').first().html()};\n}",
  "postNavigationHooks": "// We need to return array of (possibly async) functions here.\n// The functions accept a single argument: the \"crawlingContext\" object.\n[\n async (crawlingContext) => {\n // ...\n },\n]",
  "preNavigationHooks": "// We need to return array of (possibly async) functions here.\n// The functions accept two arguments: the \"crawlingContext\" object\n// and \"gotoOptions\".\n[\n async (crawlingContext, gotoOptions) => {\n // ...\n },\n]\n",
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ]
  },
  "runMode": "PRODUCTION",
  "startUrls": [
    {
      "url": "https://www.wsj.com/livecoverage/stock-market-today-dow-sp500-nasdaq-live-08-07-2024/card/robinhood-reports-record-quarterly-revenue-and-profit-tIlQ0DnKKwNWFeqoRcA2",
      "method": "GET"
    }
  ],
  "useChrome": true,
  "waitUntil": [
    "networkidle2"
  ]
}
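
For readability, the escaped "pageFunction" string from the input above unescapes to roughly the following. This is only a sketch for trying the function outside the Actor: `stubContext` is a hypothetical stand-in for the real context object and is not part of the original input.

```javascript
// The pageFunction from the input above, unescaped for readability.
async function pageFunction(context) {
    // In the input, "injectJQuery" is true, so the function reads jQuery from the context.
    const $ = context.jQuery;
    // Return the inner HTML of the page's <html> element.
    return { html: $('html').first().html() };
}

// Hypothetical stub (an assumption, not the Actor's real context):
// it mimics only the three jQuery calls the function uses.
const stubContext = {
    jQuery: (selector) => ({
        first: () => ({ html: () => `<body>stub for ${selector}</body>` }),
    }),
};

pageFunction(stubContext).then((result) => console.log(result.html));
```

Since the log reports the request as blocked with a 401 status code, this function presumably never got to run against the target page, so the function itself is unlikely to be the cause.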

PuppeteerCrawler: Reclaiming failed request back to the list or queue. Request blocked - received 401 status code. 2025-02-18T22:39:55.764Z {"id":"9nWDjDToDXvA6Ny","url":"https://www.wsj.com/livecoverage/stock-market-today-dow-sp500-nasdaq-live-08-07-2024/card/robinhood-reports-record-quarterly-revenue-and-profit-tIlQ0DnKKwNWFeqoRcA2","retryCount":1}

Competent Path (competent_path)

Please let me know if there is any update on this one.