
Web Scraper
Crawls arbitrary websites using a web browser and extracts structured data from web pages using a provided JavaScript function. The Actor supports both recursive crawling and lists of URLs, and automatically manages concurrency for maximum performance.
4.5 (23)
Pricing: Pay per usage
920
Total users: 90K
Monthly users: 4.9K
Runs succeeded: >99%
Issues response: 7.8 days
Last modified: 2 months ago

received 401 status code
Closed
I tried this with the following input:
{
  "breakpointLocation": "NONE",
  "browserLog": false,
  "closeCookieModals": false,
  "debugLog": false,
  "downloadCss": false,
  "downloadMedia": false,
  "excludes": [{"glob": "/**/*.{png,jpg,jpeg,pdf}"}],
  "globs": [{"glob": ""}],
  "headless": false,
  "ignoreCorsAndCsp": true,
  "ignoreSslErrors": true,
  "injectJQuery": true,
  "keepUrlFragments": false,
  "pageFunction": "async function pageFunction(context) {\n const $ = context.jQuery;\n return {html: $('html').first().html()};\n}",
  "postNavigationHooks": "// We need to return array of (possibly async) functions here.\n// The functions accept a single argument: the \"crawlingContext\" object.\n[\n async (crawlingContext) => {\n // ...\n },\n]",
  "preNavigationHooks": "// We need to return array of (possibly async) functions here.\n// The functions accept two arguments: the \"crawlingContext\" object\n// and \"gotoOptions\".\n[\n async (crawlingContext, gotoOptions) => {\n // ...\n },\n]\n",
  "proxyConfiguration": {"useApifyProxy": true, "apifyProxyGroups": ["RESIDENTIAL"]},
  "runMode": "PRODUCTION",
  "startUrls": [{"url": "https://www.wsj.com/livecoverage/stock-market-today-dow-sp500-nasdaq-live-08-07-2024/card/robinhood-reports-record-quarterly-revenue-and-profit-tIlQ0DnKKwNWFeqoRcA2", "method": "GET"}],
  "useChrome": true,
  "waitUntil": ["networkidle2"]
}
PuppeteerCrawler: Reclaiming failed request back to the list or queue. Request blocked - received 401 status code. 2025-02-18T22:39:55.764Z {"id":"9nWDjDToDXvA6Ny","url":"https://www.wsj.com/livecoverage/stock-market-today-dow-sp500-nasdaq-live-08-07-2024/card/robinhood-reports-record-quarterly-revenue-and-profit-tIlQ0DnKKwNWFeqoRcA2","retryCount":1}
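
For readability, the pageFunction string from the input above expands to the function below once its "\n" escapes are resolved. The fakeContext object is a hypothetical stand-in, added only so the snippet can run outside the Actor; in a real run the Actor supplies the context.

```javascript
// The pageFunction from the input, with the "\n" escapes expanded.
async function pageFunction(context) {
    const $ = context.jQuery;
    // Returns the inner HTML of the page's <html> element.
    return { html: $('html').first().html() };
}

// Hypothetical stand-in for the context object (an assumption for
// illustration). It mimics only the jQuery calls used above.
const fakeContext = {
    jQuery: (selector) => ({
        first: () => ({
            html: () => `<!-- inner HTML of ${selector} -->`,
        }),
    }),
};

pageFunction(fakeContext).then((result) => console.log(result));
```

Since the function only reads the page's HTML, it runs after the navigation has already succeeded. A request that is rejected with a 401 never reaches it.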

Competent Path (competent_path)
Please let me know if there is any update on this one.
Hello, and sorry for the delay.
The 401 error indicates that access was blocked, and unfortunately, the Wall Street Journal has very strong anti-bot protections in place. We haven’t been able to successfully scrape this page using either the Web Scraper or Camoufox Scraper Actors with any combination of input options.
At the moment, the most viable path forward is to develop a custom solution tailored specifically for WSJ. If you don’t have the development capacity to do this yourself, you might consider hiring a freelancer from our official Apify Discord server.
I'll close this issue as there is likely no way forward for this Actor to support scraping this server. Let us know if you have any other questions!
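
To illustrate why the 401 in the log leads to "Reclaiming failed request", here is a minimal sketch of how a crawler might treat certain status codes as a block signal and retry the request until a limit is reached. The list of codes, the function names, and the retry limit are assumptions made for illustration; they are not taken from the Actor's source.

```javascript
// Status codes treated as "blocked" in this sketch.
// Assumption: an illustrative list, not the Actor's actual configuration.
const BLOCKED_STATUS_CODES = [401, 403, 429];

// Returns true when the status code is considered a block signal.
function isBlocked(statusCode) {
    return BLOCKED_STATUS_CODES.includes(statusCode);
}

// Hypothetical helper: decides what to do with a response.
// 'process' -> hand the page to the pageFunction
// 'retry'   -> put the request back in the queue
// 'give-up' -> mark the request as failed
function classifyResponse(statusCode, retryCount, maxRetries = 3) {
    if (!isBlocked(statusCode)) return 'process';
    return retryCount < maxRetries ? 'retry' : 'give-up';
}

console.log(classifyResponse(401, 1)); // "retry"
console.log(classifyResponse(401, 3)); // "give-up"
console.log(classifyResponse(200, 0)); // "process"
```

Under this model, retrying only helps when the block is tied to a particular IP address or session. If the site rejects every attempt, as described above for WSJ, all retries end in 'give-up'.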