Zach's "Webpage Content To Markdown" Scraper

3-day trial, then $19.00/month. No credit card required now.

This Actor is under maintenance.

This Actor may be unreliable while under maintenance.

dyf/webpage-to-markdown

Scrape a webpage and parse to markdown. Packed with features to ensure high success rate and low cost. Includes 2 modes of operation so that you can optimize for either cost (as cheap as possible) or yield (as many successful results as possible).

You can access Zach's "Webpage Content To Markdown" Scraper programmatically from your own applications through the Apify API. To use the Apify API, you’ll need an Apify account and your API token, which you can find in the Integrations settings in Apify Console.
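As a minimal sketch of such programmatic access, the Python below builds a POST request for the `run-sync-get-dataset-items` path listed in this Actor's OpenAPI definition, passing the token as the `token` query parameter. The function names `build_run_request` and `run_actor` are hypothetical, and the sketch assumes the endpoint answers with a JSON body (the definition only documents a `200 OK` response without a schema).

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"   # "servers" URL from the definition
ACTOR_ID = "dyf~webpage-to-markdown"    # Actor ID as it appears in the paths


def build_run_request(token: str, start_urls: list[str]) -> urllib.request.Request:
    """Build (but do not send) a request for run-sync-get-dataset-items."""
    query = urllib.parse.urlencode({"token": token})
    url = f"{API_BASE}/acts/{ACTOR_ID}/run-sync-get-dataset-items?{query}"
    # Only startUrls is sent here; other input fields are left to their defaults.
    body = json.dumps({"startUrls": start_urls}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_actor(token: str, start_urls: list[str], timeout: float = 300.0):
    """Send the request and decode the response.

    Assumption: the response body is JSON (the dataset items).
    """
    request = build_run_request(token, start_urls)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `run_actor("<YOUR_API_TOKEN>", ["https://example.com"])` would perform a real, billable run, so `build_run_request` is kept separate to allow inspecting the URL and payload without any network traffic.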

{
  "openapi": "3.0.1",
  "info": {
    "version": "1.0",
    "x-build-id": "wKDpkCFLgQUv8SXxI"
  },
  "servers": [
    {
      "url": "https://api.apify.com/v2"
    }
  ],
  "paths": {
    "/acts/dyf~webpage-to-markdown/run-sync-get-dataset-items": {
      "post": {
        "operationId": "run-sync-get-dataset-items-dyf-webpage-to-markdown",
        "x-openai-isConsequential": false,
        "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
        "tags": [
          "Run Actor"
        ],
        "requestBody": {
          "required": true,
          "content": {
            "application/json": {
              "schema": {
                "$ref": "#/components/schemas/inputSchema"
              }
            }
          }
        },
        "parameters": [
          {
            "name": "token",
            "in": "query",
            "required": true,
            "schema": {
              "type": "string"
            },
            "description": "Enter your Apify token here"
          }
        ],
        "responses": {
          "200": {
            "description": "OK"
          }
        }
      }
    },
    "/acts/dyf~webpage-to-markdown/runs": {
      "post": {
        "operationId": "runs-sync-dyf-webpage-to-markdown",
        "x-openai-isConsequential": false,
        "summary": "Executes an Actor and returns information about the initiated run in response.",
        "tags": [
          "Run Actor"
        ],
        "requestBody": {
          "required": true,
          "content": {
            "application/json": {
              "schema": {
                "$ref": "#/components/schemas/inputSchema"
              }
            }
          }
        },
        "parameters": [
          {
            "name": "token",
            "in": "query",
            "required": true,
            "schema": {
              "type": "string"
            },
            "description": "Enter your Apify token here"
          }
        ],
        "responses": {
          "200": {
            "description": "OK",
            "content": {
              "application/json": {
                "schema": {
                  "$ref": "#/components/schemas/runsResponseSchema"
                }
              }
            }
          }
        }
      }
    },
    "/acts/dyf~webpage-to-markdown/run-sync": {
      "post": {
        "operationId": "run-sync-dyf-webpage-to-markdown",
        "x-openai-isConsequential": false,
        "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
        "tags": [
          "Run Actor"
        ],
        "requestBody": {
          "required": true,
          "content": {
            "application/json": {
              "schema": {
                "$ref": "#/components/schemas/inputSchema"
              }
            }
          }
        },
        "parameters": [
          {
            "name": "token",
            "in": "query",
            "required": true,
            "schema": {
              "type": "string"
            },
            "description": "Enter your Apify token here"
          }
        ],
        "responses": {
          "200": {
            "description": "OK"
          }
        }
      }
    }
  },
  "components": {
    "schemas": {
      "inputSchema": {
        "type": "object",
        "properties": {
          "startUrls": {
            "title": "Start URLs or Domains",
            "type": "array",
            "description": "A list of URLs or domains to start with. For domains, the script will automatically prepend 'https://'.",
            "items": {
              "type": "string"
            }
          },
          "minContentLengthChars": {
            "title": "Minimum Content Length to Mark \"Success\" (Characters)",
            "type": "integer",
            "description": "If the content length is less than this value, the crawler will mark the page as a soft fail.\nEnter 0 for no limit.",
            "default": 500
          },
          "maxContentLengthChars": {
            "title": "Maximum Content Length (Characters)",
            "type": "integer",
            "description": "Enter 0 for no limit.",
            "default": 0
          },
          "getDataUsingBrowser": {
            "title": "Get Data Using Browser",
            "type": "boolean",
            "description": "If true, the crawler will use a headless browser to get the HTML content of the page. (need atleast 4096MB memory)",
            "default": true
          },
          "proxyConfigurationSettings": {
            "title": "Proxy configuration",
            "type": "object",
            "description": "Select proxies to be used by your crawler."
          }
        }
      },
      "runsResponseSchema": {
        "type": "object",
        "properties": {
          "data": {
            "type": "object",
            "properties": {
              "id": {
                "type": "string"
              },
              "actId": {
                "type": "string"
              },
              "userId": {
                "type": "string"
              },
              "startedAt": {
                "type": "string",
                "format": "date-time",
                "example": "2025-01-08T00:00:00.000Z"
              },
              "finishedAt": {
                "type": "string",
                "format": "date-time",
                "example": "2025-01-08T00:00:00.000Z"
              },
              "status": {
                "type": "string",
                "example": "READY"
              },
              "meta": {
                "type": "object",
                "properties": {
                  "origin": {
                    "type": "string",
                    "example": "API"
                  },
                  "userAgent": {
                    "type": "string"
                  }
                }
              },
              "stats": {
                "type": "object",
                "properties": {
                  "inputBodyLen": {
                    "type": "integer",
                    "example": 2000
                  },
                  "rebootCount": {
                    "type": "integer",
                    "example": 0
                  },
                  "restartCount": {
                    "type": "integer",
                    "example": 0
                  },
                  "resurrectCount": {
                    "type": "integer",
                    "example": 0
                  },
                  "computeUnits": {
                    "type": "integer",
                    "example": 0
                  }
                }
              },
              "options": {
                "type": "object",
                "properties": {
                  "build": {
                    "type": "string",
                    "example": "latest"
                  },
                  "timeoutSecs": {
                    "type": "integer",
                    "example": 300
                  },
                  "memoryMbytes": {
                    "type": "integer",
                    "example": 1024
                  },
                  "diskMbytes": {
                    "type": "integer",
                    "example": 2048
                  }
                }
              },
              "buildId": {
                "type": "string"
              },
              "defaultKeyValueStoreId": {
                "type": "string"
              },
              "defaultDatasetId": {
                "type": "string"
              },
              "defaultRequestQueueId": {
                "type": "string"
              },
              "buildNumber": {
                "type": "string",
                "example": "1.0.0"
              },
              "containerUrl": {
                "type": "string"
              },
              "usage": {
                "type": "object",
                "properties": {
                  "ACTOR_COMPUTE_UNITS": {
                    "type": "integer",
                    "example": 0
                  },
                  "DATASET_READS": {
                    "type": "integer",
                    "example": 0
                  },
                  "DATASET_WRITES": {
                    "type": "integer",
                    "example": 0
                  },
                  "KEY_VALUE_STORE_READS": {
                    "type": "integer",
                    "example": 0
                  },
                  "KEY_VALUE_STORE_WRITES": {
                    "type": "integer",
                    "example": 1
                  },
                  "KEY_VALUE_STORE_LISTS": {
                    "type": "integer",
                    "example": 0
                  },
                  "REQUEST_QUEUE_READS": {
                    "type": "integer",
                    "example": 0
                  },
                  "REQUEST_QUEUE_WRITES": {
                    "type": "integer",
                    "example": 0
                  },
                  "DATA_TRANSFER_INTERNAL_GBYTES": {
                    "type": "integer",
                    "example": 0
                  },
                  "DATA_TRANSFER_EXTERNAL_GBYTES": {
                    "type": "integer",
                    "example": 0
                  },
                  "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                    "type": "integer",
                    "example": 0
                  },
                  "PROXY_SERPS": {
                    "type": "integer",
                    "example": 0
                  }
                }
              },
              "usageTotalUsd": {
                "type": "number",
                "example": 0.00005
              },
              "usageUsd": {
                "type": "object",
                "properties": {
                  "ACTOR_COMPUTE_UNITS": {
                    "type": "integer",
                    "example": 0
                  },
                  "DATASET_READS": {
                    "type": "integer",
                    "example": 0
                  },
                  "DATASET_WRITES": {
                    "type": "integer",
                    "example": 0
                  },
                  "KEY_VALUE_STORE_READS": {
                    "type": "integer",
                    "example": 0
                  },
                  "KEY_VALUE_STORE_WRITES": {
                    "type": "number",
                    "example": 0.00005
                  },
                  "KEY_VALUE_STORE_LISTS": {
                    "type": "integer",
                    "example": 0
                  },
                  "REQUEST_QUEUE_READS": {
                    "type": "integer",
                    "example": 0
                  },
                  "REQUEST_QUEUE_WRITES": {
                    "type": "integer",
                    "example": 0
                  },
                  "DATA_TRANSFER_INTERNAL_GBYTES": {
                    "type": "integer",
                    "example": 0
                  },
                  "DATA_TRANSFER_EXTERNAL_GBYTES": {
                    "type": "integer",
                    "example": 0
                  },
                  "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                    "type": "integer",
                    "example": 0
                  },
                  "PROXY_SERPS": {
                    "type": "integer",
                    "example": 0
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
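The two schemas in the definition above can be mirrored in client code. The sketch below is one possible way to do that in Python: `ActorInput` copies the property names and defaults from `inputSchema`, and `summarize_run` picks a few fields that `runsResponseSchema` lists under `data`. Both names are hypothetical helpers, and omitting an unset proxy configuration from the payload is an assumption rather than documented behavior.

```python
from dataclasses import asdict, dataclass
from typing import Any, Optional


@dataclass
class ActorInput:
    """Mirrors inputSchema; defaults are copied from the definition."""

    startUrls: list[str]
    minContentLengthChars: int = 500   # shorter pages count as a soft fail; 0 = no limit
    maxContentLengthChars: int = 0     # 0 = no limit
    getDataUsingBrowser: bool = True   # headless browser mode
    proxyConfigurationSettings: Optional[dict[str, Any]] = None

    def to_payload(self) -> dict[str, Any]:
        """Return a JSON-serializable request body."""
        payload = asdict(self)
        # Assumption: an unset proxy configuration is left out instead of sent as null.
        if payload["proxyConfigurationSettings"] is None:
            del payload["proxyConfigurationSettings"]
        return payload


def summarize_run(response: dict[str, Any]) -> dict[str, Any]:
    """Extract a few fields listed under "data" in runsResponseSchema."""
    data = response.get("data", {})
    return {key: data.get(key) for key in ("id", "status", "defaultDatasetId")}
```

For example, `ActorInput(startUrls=["example.com"], getDataUsingBrowser=False).to_payload()` produces a body with the browser disabled and the remaining fields at their schema defaults.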

Zach's "Webpage Content To Markdown" Scraper OpenAPI definition

OpenAPI is a standard for designing and describing RESTful APIs, allowing developers to define API structure, endpoints, and data formats in a machine-readable way. It simplifies API development, integration, and documentation.

OpenAPI works well with AI agents and GPTs because it standardizes how these systems interact with APIs, which makes integrations more reliable.

Because the API specification is machine-readable, AI models such as GPTs can determine which endpoints exist and what input each one expects. This speeds up development and reduces integration errors, which is why OpenAPI is a core component of many AI applications.
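To illustrate what "machine-readable" means in practice, here is a small Python sketch that walks an OpenAPI document and lists its operations, much as a tool or agent might when discovering what an API offers. `SPEC_EXCERPT` is a hand-trimmed excerpt of this Actor's definition, and `list_operations` is a hypothetical helper name.

```python
from typing import Any

# HTTP method keys that an OpenAPI 3.0 path item may contain.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}

# Trimmed excerpt of the definition: paths, methods, and operation IDs only.
SPEC_EXCERPT: dict[str, Any] = {
    "openapi": "3.0.1",
    "paths": {
        "/acts/dyf~webpage-to-markdown/run-sync-get-dataset-items": {
            "post": {"operationId": "run-sync-get-dataset-items-dyf-webpage-to-markdown"}
        },
        "/acts/dyf~webpage-to-markdown/runs": {
            "post": {"operationId": "runs-sync-dyf-webpage-to-markdown"}
        },
        "/acts/dyf~webpage-to-markdown/run-sync": {
            "post": {"operationId": "run-sync-dyf-webpage-to-markdown"}
        },
    },
}


def list_operations(spec: dict[str, Any]) -> list[tuple[str, str, str]]:
    """Return (METHOD, path, operationId) for every operation in the spec."""
    operations = []
    for path, path_item in spec.get("paths", {}).items():
        for method, operation in path_item.items():
            # Skip non-operation keys such as "parameters" or "summary".
            if method in HTTP_METHODS:
                operations.append((method.upper(), path, operation.get("operationId", "")))
    return operations


for method, path, operation_id in list_operations(SPEC_EXCERPT):
    print(f"{method} {path} ({operation_id})")
```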

If you’d like to learn more about how OpenAPI powers GPTs, read our blog post.

Developer
Maintained by Community

Actor Metrics

  • 4 monthly users

  • 1 star

  • 18% runs succeeded

  • Created in Jan 2025

  • Modified 6 days ago