If checked, the scraper will obtain more detailed info by downloading linked resources (e.g. org's researchers, org's projects, ...).

If un-checked, only the data from the detail page is extracted.

Note 1: This is a different type of data than what is scraped from individual entries, as this data describes the relationships.

Note 2: This dramatically increases the running time (full dataset takes days, up to a week).
Consider that the whole DB has more than 500,000 entries of all kinds.
Whichever dataset you choose, the downloaded entries WILL have relationships to those 500k entries.

For details, please refer to https://apify.com/jurooravec/skcris-scraper#output

Default value of this property is false

Filter by first letter

listingFilterFirstLetterEnumOptional

If set, only entries starting with this letter will be extracted.

NOTE: Only characters A-Z are supported. Letters with diacritics (eg Á), can be found under the base character (eg A).

Value options:

"a": string"b": string"c": string"d": string"e": string"f": string"g": string"h": string"i": string"j": string"k": string"l": string"m": string"n": string"o": string"p": string"q": string"r": string"s": string"t": string"u": string"v": string"w": string"x": string"y": string"z": string

Filter by region (kraj)

listingFilterRegionEnumOptional

If set, only entries within this region will be extracted.

Value options:

"bratislava": string"trnava": string"trencin": string"nitra": string"zilina": string"banskabystrica": string"presov": string"kosice": string"zahranicie": string

Results per page

listingItemsPerPageintegerOptional

If set, this number of entries will be extracted per page.

NOTE: Default is set to 500. This balances 1) slow server start-up time, 2) total server response time, 3) the risk of the request failure.

Default value of this property is 500

Count the total matched results

listingCountOnlybooleanOptional

If checked, no data is extracted. Instead, the count of matched results is printed in the log.

Default value of this property is false

Extend Actor input from URL

inputExtendUrlstringOptional

Extend Actor input with a config from a URL.
For example, you can store your actor input in a source control, and import it here.
In case of a conflict (if a field is defined both in Actor input and in imported input) the Actor input overwrites the imported fields.
The URL is requested with GET method, and must point to a JSON file containing a single object (the config).
If you need to send a POST request or to modify the response further, use inputExtendFromFunction instead.

Extend Actor input from custom function

inputExtendFromFunctionstringOptional

Extend Actor input with a config defined by a custom function.
For example, you can store your actor input in a source control, and import it here.
In case of a conflict (if a field is defined both in Actor input and in imported input) the Actor input overwrites the imported fields.
The function must return an object (the config).

Start URLs

startUrlsarrayOptional

List of URLs to scrape.

Start URLs from Dataset

startUrlsFromDatasetstringOptional

Import URLs to scrape from an existing Dataset.
The dataset and the field to import from should be written as {datasetID}#{field}.
Example: datasetid123#url will take URLs from dataset datasetid123 from field url.

Start URLs from custom function

startUrlsFromFunctionstringOptional

Import or generate URLs to scrape using a custom function.

Proxy configuration

proxyobjectOptional

Select proxies to be used by your crawler.

Include personal data

includePersonalDatabooleanOptional

By default, fields that are potential personal data are censored. Toggle this option on to get the un-uncensored values.
WARNING: Turn this on ONLY if you have consent, legal basis for using the data, or at your own risk. Learn more

Default value of this property is false

Limit the number of requests

requestMaxEntriesintegerOptional

If set, only at most this many requests will be processed.
The count is determined from the RequestQueue that's used for the Actor run.
This means that if requestMaxEntries is set to 50, but the associated queue already handled 40 requests, then only 10 new requests will be handled.

Transform requests

requestTransformstringOptional

Freely transform the request object using a custom function.
If not set, the request will remain as is.

Transform requests - Setup

requestTransformBeforestringOptional

Use this if you need to run one-time initialization code before requestTransform.

Transform requests - Teardown

requestTransformAfterstringOptional

Use this if you need to run one-time teardown code after requestTransform.

Filter requests

requestFilterstringOptional

Decide which requests should be processed by using a custom function.
If not set, all requests will be included.
This is done after requestTransform.

Filter requests - Setup

requestFilterBeforestringOptional

Use this if you need to run one-time initialization code before requestFilter.

Filter requests - Teardown

requestFilterAfterstringOptional

Use this if you need to run one-time teardown code after requestFilter.

RequestQueue ID

requestQueueIdstringOptional

By default, requests are stored in the default request queue. Set this option if you want to use a non-default queue. Learn more
NOTE: RequestQueue name can only contain letters 'a' through 'z', the digits '0' through '9', and the hyphen ('-') but only in the middle of the string (e.g. 'my-value-1')

Limit the number of scraped entries

outputMaxEntriesintegerOptional

If set, only at most this many entries will be scraped.
The count is determined from the Dataset that's used for the Actor run.
This means that if outputMaxEntries is set to 50, but the associated Dataset already has 40 items in it, then only 10 new entries will be saved.

Rename dataset fields

outputRenameFieldsobjectOptional

Rename fields (columns) of the output data.
If not set, all fields will have their original names.
This is done before outputPickFields.
Keys can be nested, e.g. "someProp.value[0]". Nested path is resolved using Lodash.get().

Pick dataset fields

outputPickFieldsarrayOptional

Select a subset of fields of an entry that will be pushed to the dataset.
If not set, all fields on an entry will be pushed to the dataset.
This is done after outputRenameFields.
Keys can be nested, e.g. "someProp.value[0]". Nested path is resolved using Lodash.get().

Transform entries

outputTransformstringOptional

Freely transform the output data object using a custom function.
If not set, the data will remain as is.
This is done after outputPickFields and outputRenameFields.

Transform entries - Setup

outputTransformBeforestringOptional

Use this if you need to run one-time initialization code before outputTransform.

Transform entries - Teardown

outputTransformAfterstringOptional

Use this if you need to run one-time teardown code after outputTransform.

Filter entries

outputFilterstringOptional

Decide which scraped entries should be included in the output by using a custom function.
If not set, all scraped entries will be included.
This is done after outputPickFields, outputRenameFields, and outputTransform.

Filter entries - Setup

outputFilterBeforestringOptional

Use this if you need to run one-time initialization code before outputFilter.

Filter entries - Teardown

outputFilterAfterstringOptional

Use this if you need to run one-time teardown code after outputFilter.

Dataset ID

outputDatasetIdstringOptional

By default, data is written to Default dataset. Set this option if you want to write data to non-default dataset. Learn more
NOTE: Dataset name can only contain letters 'a' through 'z', the digits '0' through '9', and the hyphen ('-') but only in the middle of the string (e.g. 'my-value-1')

Cache ID

outputCacheStoreIdstringOptional

Set this option if you want to cache scraped entries in Apify's Key-value store.
This is useful for example when you want to scrape only NEW entries. In such case, you can use the outputFilter option to define a custom function to filter out entries already found in the cache. Learn more
NOTE: Cache name can only contain letters 'a' through 'z', the digits '0' through '9', and the hyphen ('-') but only in the middle of the string (e.g. 'my-value-1')

Cache primary keys

outputCachePrimaryKeysarrayOptional

Specify fields that uniquely identify entries (primary keys), so entries can be compared against the cache.
NOTE: If not set, the entries are hashed based on all fields

Cache action on result

outputCacheActionOnResultEnumOptional

Specify whether scraped results should be added to, removed from, or overwrite the cache.
- add - Adds scraped results to the cache
- remove - Removes scraped results from the cache
- set - First clears all entries from the cache, then adds scraped results to the cache
NOTE: No action happens when this field is empty.

Value options:

"add": string"remove": string"overwrite": string

maxRequestRetries

maxRequestRetriesintegerOptional

Indicates how many times the request is retried if BasicCrawlerOptions.requestHandler fails.

Default value of this property is 10

maxRequestsPerMinute

maxRequestsPerMinuteintegerOptional

The maximum number of requests per minute the crawler should run. We can pass any positive, non-zero integer.

maxRequestsPerCrawl

maxRequestsPerCrawlintegerOptional

Maximum number of pages that the crawler will open. The crawl will stop when this limit is reached.
NOTE: In cases of parallel crawling, the actual number of pages visited might be slightly higher than this value.

minConcurrency

minConcurrencyintegerOptional

Sets the minimum concurrency (parallelism) for the crawl.
WARNING: If we set this value too high with respect to the available system memory and CPU, our crawler will run extremely slow or crash. If not sure, it's better to keep the default value and the concurrency will scale up automatically.

maxConcurrency

maxConcurrencyintegerOptional

Sets the maximum concurrency (parallelism) for the crawl.

Default value of this property is 5

navigationTimeoutSecs

navigationTimeoutSecsintegerOptional

Timeout in which the HTTP request to the resource needs to finish, given in seconds.

requestHandlerTimeoutSecs

requestHandlerTimeoutSecsintegerOptional

Timeout in which the function passed as BasicCrawlerOptions.requestHandler needs to finish, in seconds.

keepAlive

keepAlivebooleanOptional

Allows to keep the crawler alive even if the RequestQueue gets empty. With keepAlive: true the crawler will keep running, waiting for more requests to come.

ignoreSslErrors

ignoreSslErrorsbooleanOptional

If set to true, SSL certificate errors will be ignored.

additionalMimeTypes

additionalMimeTypesarrayOptional

An array of MIME types you want the crawler to load and process. By default, only text/html and application/xhtml+xml MIME types are supported.

suggestResponseEncoding

suggestResponseEncodingstringOptional

By default this crawler will extract correct encoding from the HTTP response headers. There are some websites which use invalid headers. Those are encoded using the UTF-8 encoding. If those sites actually use a different encoding, the response will be corrupted. You can use suggestResponseEncoding to fall back to a certain encoding, if you know that your target website uses it. To force a certain encoding, disregarding the response headers, use forceResponseEncoding.

forceResponseEncoding

forceResponseEncodingstringOptional

By default this crawler will extract correct encoding from the HTTP response headers. Use forceResponseEncoding to force a certain encoding, disregarding the response headers. To only provide a default for missing encodings, use suggestResponseEncoding.

Batch requests

perfBatchSizeintegerOptional

If set, multiple Requests will be handled by a single Actor instance.
Example: If set to 20, then up to 20 requests will be handled in a single "go", after which the actor instance will reset.
See Apify documentation.

Wait (in seconds) between processing requests in a single batch

perfBatchWaitSecsintegerOptional

How long to wait between entries within a single batch.
Increase this value if you're using batching and you're sending requests to the scraped website too fast.
Example: If set to 1, then after each entry in a batch, wait 1 second before continuing.

Log Level

logLevelEnumOptional

Select how detailed should be the logging.

Value options:

"off": string"debug": string"info": string"warn": string"error": string

Default value of this property is "info"

Error reporting dataset ID

errorReportingDatasetIdstringOptional

Dataset ID to which errors should be captured.
Default: 'REPORTING'.
NOTE: Dataset name can only contain letters 'a' through 'z', the digits '0' through '9', and the hyphen ('-') but only in the middle of the string (e.g. 'my-value-1')

Default value of this property is "REPORTING"

Send errors to Sentry

errorSendToTelemetrybooleanOptional

Whether to report actor errors to telemetry such as Sentry.
This info is used by the author of this actor to identify broken integrations, and track down and fix issues.

Default value of this property is true

Metamorph actor ID - metamorph to another actor at the end

metamorphActorIdstringOptional

Use this option if you want to run another actor with the same dataset after this actor has finished (AKA metamorph into another actor). Learn more
New actor is identified by its ID, e.g. "apify/web-scraper".

Metamorph actor build

metamorphActorBuildstringOptional

Tag or number of the target actor build to metamorph into (e.g. 'beta' or '1.2.345')

Metamorph actor input

metamorphActorInputobjectOptional

Input object passed to the follow-up (metamorph) actor. Learn more

Website Content Crawler

apify/website-content-crawler

Crawl websites and extract text content to feed AI models, LLM applications, vector databases, or RAG pipelines. The Actor supports rich formatting using Markdown, cleans the HTML, downloads files, and integrates well with 🦜🔗 LangChain, LlamaIndex, and the wider LLM ecosystem.

Apify

76K

4.6

(66)

🔥Apollo Scraper - Scrape upto 50k Leads

code_crafter/apollo-io-scraper

Scrape up to 50,000 leads per search URL.

Code Pioneer

70K

4.5

(355)

TikTok Scraper

clockworks/tiktok-scraper

Extract data from TikTok videos, hashtags, and users. Use URLs or search queries to scrape TikTok profiles, hashtags, posts, URLs, shares, followers, hearts, names, video, and music-related data. Export scraped data, run the scraper via API, schedule and monitor runs or integrate with other tools.

Clockworks

71K

4.6

(50)

Google Maps Scraper

compass/crawler-google-places

Extract data from thousands of Google Maps locations and businesses, including reviews, reviewer details, images, contact info, opening hours, location, prices & more. Export scraped data, run the scraper via API, schedule and monitor runs, or integrate with other tools.

Compass

167K

4.5

(210)

Instagram Scraper

apify/instagram-scraper

Scrape and download Instagram posts, profiles, places, hashtags, photos, and comments. Get data from Instagram using one or more Instagram URLs or search queries. Export scraped data, run the scraper via API, schedule and monitor runs or integrate with other tools.

Apify

133K

4.3

(110)

🏯 Tweet Scraper V2 - X / Twitter Scraper

apidojo/tweet-scraper

⚡️ Lightning-fast search, URL, list, and profile scraping, with customizable filters. At $0.40 per 1000 tweets, and 30-80 tweets per second, it is ideal for researchers, entrepreneurs, and businesses! Get comprehensive insights from Twitter (X) now!

API Dojo

22K

3.0

(94)

Facebook Posts Scraper

apify/facebook-posts-scraper

Extract data from hundreds of Facebook posts from one or multiple Facebook pages and profiles. Get post URL, post text, page or profile URL, timestamp, number of likes, shares, comments, and more. Download the data in JSON, CSV, and Excel and use it in apps, spreadsheets, and reports.

Apify

30K

4.9

(42)

Instagram Post Scraper

apify/instagram-post-scraper

Scrape Instagram posts. Just add one or more Instagram usernames and get your data in seconds including caption, metrics, images, mentions, coauthors, recent comments, sponsored status, video duration, views. Export scraped data, schedule scraper via API, integrate with other tools or AI workflows.

Apify

51K

4.7

(43)

Facebook Ad Library Scraper

curious_coder/facebook-ads-library-scraper

Scrape facebook ads search and ads run by facebook pages - Fast and lightweight. $0.75 per 1000 results

Curious Coder

10K

4.3

(25)

Full TikTok API Scraper

scraptik/tiktok-api

Apify’s LOWEST cost TikTok Scraper. Access the TikTok mobile API for user, posts, sounds, search, comments, followers, and more. Unlock TikTok data at scale. Have custom needs? Visit scraptik.com

ScrapTik

318

5.0

(3)

Google Maps Extractor

compass/google-maps-extractor

Extract data from hundreds of places fast. Scrape Google Maps by keyword, category, location, URLs & other filters. Get addresses, contact info, opening hours, popular times, prices, menus & more. Export scraped data, run the scraper via API, schedule and monitor runs, or integrate with other tools.

Compass

58K

4.9

(82)

SKCRIS Scraper

SKCRIS Scraper

Dataset type

Value options:

Include linked resources

Filter by first letter

Value options:

Filter by region (kraj)

Value options:

Results per page

Count the total matched results

Extend Actor input from URL

Extend Actor input from custom function

Start URLs

Start URLs from Dataset

Start URLs from custom function

Proxy configuration

Include personal data

Limit the number of requests

Transform requests

Transform requests - Setup

Transform requests - Teardown

Filter requests

Filter requests - Setup

Filter requests - Teardown

RequestQueue ID

Limit the number of scraped entries

Rename dataset fields

Pick dataset fields

Transform entries

Transform entries - Setup

Transform entries - Teardown

Filter entries

Filter entries - Setup

Filter entries - Teardown

Dataset ID

Cache ID

Cache primary keys

Cache action on result

Value options:

maxRequestRetries

maxRequestsPerMinute

maxRequestsPerCrawl

minConcurrency

maxConcurrency

navigationTimeoutSecs

requestHandlerTimeoutSecs

keepAlive

ignoreSslErrors

additionalMimeTypes

suggestResponseEncoding

forceResponseEncoding

Batch requests

Wait (in seconds) between processing requests in a single batch

Log Level

Value options:

Error reporting dataset ID

Send errors to Sentry

Metamorph actor ID - metamorph to another actor at the end

Metamorph actor build

Metamorph actor input

You might also like

Website Content Crawler

🔥Apollo Scraper - Scrape upto 50k Leads

TikTok Scraper

Google Maps Scraper

Instagram Scraper

🏯 Tweet Scraper V2 - X / Twitter Scraper

Facebook Posts Scraper

Instagram Post Scraper

Facebook Ad Library Scraper

Full TikTok API Scraper

Google Maps Extractor

Dataset type

Value options:

Include linked resources

Filter by first letter

Value options:

Filter by region (kraj)

Value options: