The following table shows the specification of the actor's INPUT fields, as defined by its input schema. You can enter these fields manually in the app or provide them in a JSON object when running the actor through the API. Read more in docs.
Start URLs
List of web pages where the actor will start crawling.
Proxy servers let you bypass website protections, avoid IP address blocking, and view content as it is served to other countries. Try using a proxy if you are experiencing timeout errors.
Maximum link depth
The maximum number of links away from the Start URLs that the actor will crawl. If set to 0, the actor will not follow any links. If empty or null, the actor will follow links to an arbitrary depth.
Total maximum pages
The maximum number of pages the crawler will load. It is always a good idea to limit the number of pages; otherwise the actor might run indefinitely or consume too many resources.
Maximum pages per start URL
The maximum number of pages that will be enqueued from each start URL you provide.
Stay within domain
If set, the actor will only follow links within the same domain as the referring page.
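To illustrate how the four limits above might interact, here is a minimal sketch of a breadth-first crawl over an in-memory link graph. It is not the actor's implementation. The function name, its parameters, and the `links` dictionary are all hypothetical, and the sketch makes three assumptions the text does not confirm: the start page itself counts toward the per-start-URL limit, "same domain" means an exact host name match, and pages are visited in breadth-first order.

```python
from collections import deque
from urllib.parse import urlparse


def crawl_order(links, start_urls, max_depth=None, max_pages=None,
                max_pages_per_start_url=None, same_domain=True):
    """Return the URLs a breadth-first crawl would visit under the given limits.

    `links` is a hypothetical in-memory link graph: {url: [linked urls]}.
    """
    starts = list(dict.fromkeys(start_urls))  # drop duplicate start URLs
    visited = []
    seen = set(starts)
    # Assumption: the start page counts toward its own per-start-URL limit.
    per_start = {start: 1 for start in starts}
    # Each queue entry: (url, depth from its start URL, originating start URL).
    queue = deque((url, 0, url) for url in starts)

    while queue:
        # Total maximum pages: stop once enough pages have been loaded.
        if max_pages is not None and len(visited) >= max_pages:
            break
        url, depth, start = queue.popleft()
        visited.append(url)

        # Maximum link depth: 0 means "follow no links", None means unlimited.
        if max_depth is not None and depth >= max_depth:
            continue

        for target in links.get(url, []):
            if target in seen:
                continue
            # Stay within domain: compare with the referring page's host name.
            if same_domain and urlparse(target).netloc != urlparse(url).netloc:
                continue
            # Maximum pages per start URL: stop enqueuing for this start URL.
            if (max_pages_per_start_url is not None
                    and per_start[start] >= max_pages_per_start_url):
                break
            seen.add(target)
            per_start[start] += 1
            queue.append((target, depth + 1, start))

    return visited
```

With a depth of 0 the function returns only the start URLs, and with every limit left at `None` it walks the whole reachable graph, which is why setting a total page limit is usually worthwhile.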
If set, the actor will display a live view on the container URL, where you can monitor its progress. Note that the live view has a small performance overhead.
If set, the actor will also extract contact information from IFRAMEs. Sometimes you might not want that (e.g. it will include data from online ads).
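As a rough illustration of the JSON object mentioned at the top, the sketch below builds an input in Python and serializes it. Every key name and the shape of the start URL list are placeholders chosen for this example and are not taken from the actor's input schema, so check the schema for the real field names before using it. The proxy field is left out because its structure is not described here.

```python
import json

# Hypothetical key names: consult the actor's input schema for the real ones.
actor_input = {
    "startUrls": ["https://example.com"],  # pages where crawling starts
    "maxDepth": 2,                # 0 = follow no links, None (null) = unlimited
    "maxPages": 100,              # total maximum pages
    "maxPagesPerStartUrl": 20,    # maximum pages per start URL
    "stayWithinDomain": True,     # only follow links on the referring page's domain
    "liveView": False,            # live view adds a small performance overhead
    "includeIframes": False,      # skip IFRAMEs, e.g. to avoid data from online ads
}

# Serialize to the JSON text that would be sent when running the actor via the API.
payload = json.dumps(actor_input, indent=2)
print(payload)
```

Setting `maxDepth` to `None` serializes it as `null`, which matches the "empty or null" case described for the maximum link depth.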