GPT Scraper

Pricing

$9.00 / 1,000 pages

Developed by

Jakub Drobník

Maintained by Apify

Extract data from any website and feed it into GPT via the OpenAI API. Use ChatGPT to proofread content, analyze sentiment, summarize reviews, extract contact details, and much more.

4.4 (7)

91

Monthly users: 167

Runs succeeded: >99%

Response time: 3.2 days

Last modified: 3 months ago

prompt gets filled with base64 representation of image

Open

Pepa J (JoseJet) opened this issue 5 months ago

For this input, the Actor returned a wrong answer. When we checked the generated Markdown prompt in the key-value store, it was flooded with the inline base64 representation of one of the images on the website.

When we remove the img and link tags, everything works fine. I do not think that including the inline base64-encoded source of an image in the prompt should be the default behavior.

{
  "dynamicContentWaitSecs": 10,
  "instructions": "Gets the amount of results on the page and return it as single number in JSON format: \njobAmount",
  "pageFormatInRequest": "Markdown",
  "proxyConfiguration": {
    "useApifyProxy": true
  },
  "removeElementsCssSelector": "script, style, noscript, path, svg, xlink",
  "removeLinkUrls": true,
  "saveSnapshots": true,
  "schema": {
    "type": "object",
    "properties": {
      "jobAmount": {
        "type": "number",
        "description": "results"
      }
    },
    "required": [
      "jobAmount"
    ]
  },
  "startUrls": [
    {
      "url": "https://workforcenow.adp.com/mascsr/default/mdf/recruitment/recruitment.html?cid=e4f6ff38-1bcd-40e3-b778-ec98a30f2192&ccId=19000101_000001&type=MP&lang=en_US&selectedMenuKey=CurrentOpenings",
      "method": "GET"
    }
  ],
  "useStructureOutput": false,
  "includeUrlGlobs": [],
  "excludeUrlGlobs": [],
  "maxCrawlingDepth": 99999999,
  "maxPagesPerCrawl": 10,
  "initialCookies": [],
  "temperature": "0",
  "topP": "1",
  "frequencyPenalty": "0",
  "presencePenalty": "0"
}
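
The workaround described above (removing the img and link tags) can be sketched as a small change to the reported input. This is only an illustration: it assumes the tags were removed through the `removeElementsCssSelector` field, and the helper name `with_extra_selectors` is hypothetical, not part of the Actor or the Apify SDK.

```python
from typing import Any, Dict, Iterable


def with_extra_selectors(actor_input: Dict[str, Any], extra: Iterable[str]) -> Dict[str, Any]:
    """Return a copy of the input with extra selectors appended to removeElementsCssSelector.

    Hypothetical helper. It splits on commas, so it only suits simple selector
    lists like the one in the reported input.
    """
    raw = actor_input.get("removeElementsCssSelector", "")
    current = [part.strip() for part in raw.split(",") if part.strip()]
    # Append only selectors that are not already listed, preserving order.
    merged = current + [sel for sel in extra if sel not in current]
    return {**actor_input, "removeElementsCssSelector": ", ".join(merged)}


# Selector list copied from the reported input.
reported = {"removeElementsCssSelector": "script, style, noscript, path, svg, xlink"}
patched = with_extra_selectors(reported, ["img", "link"])
print(patched["removeElementsCssSelector"])
```

Removing every img element also drops alt text and ordinary image references, so this is a blunt workaround rather than a fix.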
lukas.prusa

Thanks for reporting this!

Yeah, this should definitely not be the default. Inline base64 images should be stripped from the page content before it is sent to GPT. We will investigate and fix this :)
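
As a rough illustration of the kind of preprocessing mentioned in the reply, the sketch below strips inline base64 data URIs from Markdown before it would be placed in a prompt. It is not the Actor's actual implementation; the function name `strip_inline_images` and both regular expressions are assumptions made for this example.

```python
import re

# Markdown image whose target is an inline data URI, for example
# ![logo](data:image/png;base64,iVBORw0...), with an optional "title".
DATA_URI_IMAGE = re.compile(r'!\[([^\]]*)\]\(\s*data:[^)\s]*(?:\s+"[^"]*")?\s*\)')

# Any remaining bare base64 data URI, for example inside a raw HTML attribute.
BARE_DATA_URI = re.compile(
    r"data:[\w.+-]+/[\w.+-]+(?:;[\w=.+-]+)*;base64,[A-Za-z0-9+/=]+"
)


def strip_inline_images(markdown: str) -> str:
    """Replace inline base64 images with their alt text and drop stray data URIs."""
    without_images = DATA_URI_IMAGE.sub(lambda m: m.group(1), markdown)
    return BARE_DATA_URI.sub("", without_images)


page = "# Jobs\n![logo](data:image/png;base64,iVBORw0KGgo=)\n12 results"
print(strip_inline_images(page))
```

Keeping the alt text preserves whatever meaning the image carried while dropping the payload that inflates the prompt. Images that point to a regular URL are left untouched.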

Pricing

Pricing model

Pay per result 

This Actor is paid per result. You are not charged for Apify platform usage; you pay only a fixed price for every 1,000 items in the Actor's output dataset.

Price per 1,000 items

$9.00
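
For a quick estimate, the price above can be turned into a cost per run. The sketch assumes that billing scales linearly with the number of output items and that one scraped page yields one item. Neither assumption is confirmed by the pricing text, and `estimated_cost` is a hypothetical helper.

```python
from decimal import Decimal

# Price taken from the pricing section above.
PRICE_PER_1000_ITEMS = Decimal("9.00")


def estimated_cost(items: int) -> Decimal:
    """Estimate the cost in USD, assuming pro-rata billing per output item."""
    if items < 0:
        raise ValueError("items must be non-negative")
    return (PRICE_PER_1000_ITEMS * items / 1000).quantize(Decimal("0.01"))


# The reported input caps the crawl at 10 pages (maxPagesPerCrawl).
print(estimated_cost(10))
```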