GPT Scraper
Pay $9.00 for 1,000 pages
Extract data from any website and feed it into GPT via the OpenAI API. Use ChatGPT to proofread content, analyze sentiment, summarize reviews, extract contact details, and much more.
For this input, the Actor returned a wrong answer. When we checked the generated Markdown prompt in the key-value store, it was flooded with the inline base64 representation of one of the images on the website.
When we remove the img and link tags, everything works fine. I do not think that including the inline base64-encoded source of an image in the prompt should be the default behavior.
{
  "dynamicContentWaitSecs": 10,
  "instructions": "Gets the amount of results on the page and return it as single number in JSON format: \njobAmount",
  "pageFormatInRequest": "Markdown",
  "proxyConfiguration": {
    "useApifyProxy": true
  },
  "removeElementsCssSelector": "script, style, noscript, path, svg, xlink",
  "removeLinkUrls": true,
  "saveSnapshots": true,
  "schema": {
    "type": "object",
    "properties": {
      "jobAmount": {
        "type": "number",
        "description": "results"
      }
    },
    "required": [
      "jobAmount"
    ]
  },
  "startUrls": [
    {
      "url": "https://workforcenow.adp.com/mascsr/default/mdf/recruitment/recruitment.html?cid=e4f6ff38-1bcd-40e3-b778-ec98a30f2192&ccId=19000101_000001&type=MP&lang=en_US&selectedMenuKey=CurrentOpenings",
      "method": "GET"
    }
  ],
  "useStructureOutput": false,
  "includeUrlGlobs": [],
  "excludeUrlGlobs": [],
  "maxCrawlingDepth": 99999999,
  "maxPagesPerCrawl": 10,
  "initialCookies": [],
  "temperature": "0",
  "topP": "1",
  "frequencyPenalty": "0",
  "presencePenalty": "0"
}
Thanks for reporting this!
Yeah, this should definitely not be the default; it should be stripped out of the page content before it is sent to GPT. We will investigate and fix this :)
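For anyone hitting the same problem in their own pipeline, here is a rough sketch of the kind of pre-processing described above: dropping inline base64 images from the page content before it goes into the prompt. This is not the Actor's actual code. The function name `strip_inline_images` and both regular expressions are assumptions made for illustration, and they only cover Markdown images and HTML img tags whose source is a `data:` URI.

```python
import re

# Markdown images whose target is an inline data URI,
# e.g. ![logo](data:image/png;base64,AAAA...)
MD_DATA_IMAGE = re.compile(r"!\[[^\]]*\]\(\s*data:[^)]*\)")

# HTML <img> tags whose src attribute is a data URI
HTML_DATA_IMAGE = re.compile(
    r"<img\b[^>]*\bsrc\s*=\s*(['\"])data:[^'\"]*\1[^>]*>",
    re.IGNORECASE,
)


def strip_inline_images(page_content: str) -> str:
    """Remove inline (data URI) images; hypothetical helper, not part of the Actor."""
    without_md = MD_DATA_IMAGE.sub("", page_content)
    return HTML_DATA_IMAGE.sub("", without_md)
```

Images that point to a regular URL are left untouched, so only the oversized inline payloads are dropped from the prompt.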
Actor Metrics
150 monthly users
69 stars
>99% runs succeeded
2.6 days response time
Created in Mar 2023
Modified 8 days ago