Apify Store Scraper

jurooravec/apify-store-scraper

This Actor is deprecated: it is unavailable because the developer has decided to deprecate it.

Extract all Actors from the Apify Store. Includes cost, trial minutes, number of users, number of builds, version, author, and more. Optionally filter by category or search term. Download as JSON, JSONL, XML, CSV, Excel, or HTML formats.

Start URLs

startUrls (array, required)

Select specific URLs to scrape.

Default value of this property is [{"url":"https://apify.com/store"}]

Search query

listingFilterQuery (string, optional)

If given, only Actors matching the query will be retrieved.

Actor category

listingFilterCategory (enum, optional)

If given, only Actors from this category will be retrieved.

Value options:

"ai", "automation", "business", "covid-19", "developer examples", "developer tools", "e-commerce", "games", "jobs", "lead generation", "marketing", "news", "seo tools", "social media", "travel", "videos", "real estate", "sports", "education", "other"

Target number of results

listingFilterMaxCount (integer, optional)

If set, only up to this number of entries will be extracted. The actual number of entries might be higher than this because the results are paginated.
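Taken together, the filter fields above combine into a run input like the following. This is a hypothetical sketch: only startUrls is required, and the query, category, and count values shown here are illustrative, not defaults.

```python
# Hypothetical run input for this Actor, built from the fields documented above.
# The filter values are illustrative examples, not defaults.
run_input = {
    "startUrls": [{"url": "https://apify.com/store"}],  # required
    "listingFilterQuery": "scraper",        # only Actors matching this query
    "listingFilterCategory": "e-commerce",  # one of the documented categories
    "listingFilterMaxCount": 100,           # stop after roughly this many entries
}
```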

Proxy configuration

proxy (object, optional)

Select proxies to be used by your crawler.

Include personal data

includePersonalData (boolean, optional)

By default, fields that are potential personal data are censored. Toggle this option on to get the uncensored values.

WARNING: Turn this on ONLY if you have consent, legal basis for using the data, or at your own risk. Learn more

Default value of this property is false
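The censoring behavior described above can be sketched as follows. Note that the set of fields treated as personal data is decided internally by the Actor; the field names used here are hypothetical.

```python
def apply_censorship(entry, personal_fields, include_personal_data=False):
    """Sketch of the documented behavior: fields that are potential personal
    data are censored unless includePersonalData is toggled on. The set of
    personal fields here is hypothetical; the Actor chooses the real one."""
    if include_personal_data:
        return dict(entry)
    return {
        key: "<CENSORED>" if key in personal_fields else value
        for key, value in entry.items()
    }
```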

Dataset ID or name

outputDatasetIdOrName (string, optional)

By default, data is written to the default dataset. Set this option if you want to write data to a non-default dataset. Learn more

Pick dataset fields

outputPickFields (array, optional)

Select a subset of fields of an entry that will be pushed to the dataset.

If not set, all fields on an entry will be pushed to the dataset.

This is done before outputRenameFields.

Keys can be nested, e.g. "someProp.value[0]". Nested path is resolved using Lodash.get().
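The path syntax can be approximated in Python as follows. This is a simplified stand-in for Lodash.get(), handling only dot- and bracket-separated segments, not the full Lodash path grammar.

```python
import re

def get_path(obj, path):
    """Resolve a Lodash-style nested path such as "someProp.value[0]".
    Simplified sketch: splits the path on dots and brackets only."""
    for part in re.findall(r"[^.\[\]]+", path):
        if isinstance(obj, list):
            obj = obj[int(part)]
        elif isinstance(obj, dict):
            obj = obj.get(part)
        else:
            return None
    return obj
```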

Rename dataset fields

outputRenameFields (object, optional)

Rename fields (columns) of the output data.

If not set, all fields will have their original names.

This is done after outputPickFields.

Keys can be nested, e.g. "someProp.value[0]". Nested path is resolved using Lodash.get().
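The ordering of the two output options can be sketched as below. This is a simplified illustration handling top-level keys only; the Actor also supports nested Lodash-style paths.

```python
def shape_output(entry, pick_fields=None, rename_fields=None):
    """Sketch of the documented order: outputPickFields is applied first,
    outputRenameFields second. Top-level keys only."""
    if pick_fields is not None:
        # Keep only the selected fields.
        entry = {k: v for k, v in entry.items() if k in pick_fields}
    if rename_fields:
        # Rename fields; unmapped keys keep their original names.
        entry = {rename_fields.get(k, k): v for k, v in entry.items()}
    return entry
```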

Metamorph actor ID - metamorph to another actor at the end

metamorphActorId (string, optional)

Use this option if you want to run another actor with the same dataset after this actor has finished (AKA metamorph into another actor). Learn more

New actor is identified by its ID, e.g. "apify/web-scraper".

Metamorph actor build

metamorphActorBuild (string, optional)

Tag or number of the target actor build to metamorph into (e.g. 'beta' or '1.2.345').

Metamorph actor input

metamorphActorInput (object, optional)

Input object passed to the follow-up (metamorph) actor. Learn more

maxRequestRetries

maxRequestRetries (integer, optional)

Indicates how many times the request is retried if BasicCrawlerOptions.requestHandler fails.

maxRequestsPerMinute

maxRequestsPerMinute (integer, optional)

The maximum number of requests per minute the crawler should perform. Any positive, non-zero integer is accepted.

maxRequestsPerCrawl

maxRequestsPerCrawl (integer, optional)

Maximum number of pages that the crawler will open. The crawl will stop when this limit is reached.

NOTE: In cases of parallel crawling, the actual number of pages visited might be slightly higher than this value.

minConcurrency

minConcurrency (integer, optional)

Sets the minimum concurrency (parallelism) for the crawl.

WARNING: If you set this value too high relative to the available system memory and CPU, the crawler will run extremely slowly or crash. If unsure, keep the default value, and the concurrency will scale up automatically.

maxConcurrency

maxConcurrency (integer, optional)

Sets the maximum concurrency (parallelism) for the crawl.

navigationTimeoutSecs

navigationTimeoutSecs (integer, optional)

Timeout in which the HTTP request to the resource needs to finish, given in seconds.

requestHandlerTimeoutSecs

requestHandlerTimeoutSecs (integer, optional)

Timeout in which the function passed as BasicCrawlerOptions.requestHandler needs to finish, in seconds.

keepAlive

keepAlive (boolean, optional)

Keeps the crawler alive even when the RequestQueue becomes empty. With keepAlive: true, the crawler will keep running, waiting for more requests to come.

additionalMimeTypes

additionalMimeTypes (array, optional)

An array of MIME types you want the crawler to load and process. By default, only text/html and application/xhtml+xml MIME types are supported.

suggestResponseEncoding

suggestResponseEncoding (string, optional)

By default, this crawler extracts the correct encoding from the HTTP response headers. Some websites, however, use invalid headers; their responses are then decoded as UTF-8. If such a site actually uses a different encoding, the response will be corrupted. Use suggestResponseEncoding to fall back to a specific encoding when you know the target website uses it. To force an encoding regardless of the response headers, use forceResponseEncoding.

forceResponseEncoding

forceResponseEncoding (string, optional)

By default, this crawler extracts the correct encoding from the HTTP response headers. Use forceResponseEncoding to force a specific encoding regardless of the response headers. To only provide a fallback for missing or invalid encodings, use suggestResponseEncoding.
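The precedence between the two encoding options can be sketched as follows, under the assumption (stated in the descriptions above) that UTF-8 is the last-resort default:

```python
def pick_encoding(header_encoding, suggest=None, force=None):
    """Sketch of the documented precedence: forceResponseEncoding always
    wins; a valid header encoding comes next; suggestResponseEncoding is
    only a fallback for missing or invalid headers, ahead of the UTF-8
    default."""
    if force:
        return force
    if header_encoding:
        return header_encoding
    return suggest or "utf-8"
```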

Log Level

logLevel (enum, optional)

Select how detailed the logging should be.

Value options:

"off", "debug", "info", "warn", "error"

Default value of this property is "info"

Error reporting dataset ID

errorReportingDatasetId (string, optional)

Apify dataset ID or name to which errors should be captured.

Default value of this property is "REPORTING"

Send errors to Sentry

errorSendToSentry (boolean, optional)

Whether to send actor error reports to Sentry.

This information is used by the author of this actor to identify broken integrations and to track down and fix issues.

Default value of this property is true

Developer
Maintained by Community