Dataset Image Downloader & Uploader
Download image files from image URLs in your datasets and save them to a Zip file, a key-value store, or directly to your AWS S3 bucket.
Dataset ID
datasetId
string, optional
ID of the dataset where the data is located. Image URLs are extracted from its items.
Path to image URLs
pathToImageUrls
string, optional
Path from the item object to the string or array where the image URL(s) are located. Provide it in "JavaScript style", e.g. "details[0].images".
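To illustrate what such a path means, here is a minimal sketch of resolving it against one dataset item. The helpers getByPath and extractImageUrls are hypothetical names used only for this example; how the Actor itself resolves the path is not documented here.

```javascript
// Hypothetical helper: resolve a "JavaScript style" path against a dataset item.
function getByPath(item, path) {
  // Turn "details[0].images" into ["details", "0", "images"].
  const keys = path.replace(/\[(\d+)\]/g, '.$1').split('.').filter(Boolean);
  // Walk the keys; stop early with undefined if any step is missing.
  return keys.reduce((value, key) => (value == null ? undefined : value[key]), item);
}

// Hypothetical helper: the path may point to a string or an array,
// so normalize the result to an array of URLs.
function extractImageUrls(item, path) {
  const found = getByPath(item, path);
  if (found == null) return [];
  return Array.isArray(found) ? found : [found];
}

const item = {
  details: [{ images: ['https://example.com/a.jpg', 'https://example.com/b.jpg'] }],
};
const urls = extractImageUrls(item, 'details[0].images');
```

With this item, the path "details[0].images" selects the array holding both URLs.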
Filename function
fileNameFunction
string, optional
Function that specifies how the image filename is created from its URL. If left empty, the filename is the MD5 hash of the URL.
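As a rough sketch of the default behavior and of one possible custom function: the argument shape ({ url, md5 }) is an assumption made for this example and should be checked against the Actor's own input editor before use.

```javascript
const crypto = require('crypto');

// MD5 hex digest of a string, mirroring the documented default naming scheme.
const md5 = (text) => crypto.createHash('md5').update(text).digest('hex');

// Assumed signature: an object carrying the image URL and an md5 helper.
// Equivalent of the default: the filename is the MD5 hash of the URL.
const defaultFileName = ({ url, md5 }) => md5(url);

// Hypothetical custom function: reuse the last path segment of the URL.
const customFileName = ({ url }) => new URL(url).pathname.split('/').pop();

const url = 'https://example.com/images/cat.jpg';
const hashedName = defaultFileName({ url, md5 });
const readableName = customFileName({ url, md5 });
```

Hashed names avoid collisions between different URLs that end in the same filename, while readable names are easier to browse.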
Limit
limit
integer, optional
Maximum number of items to load from the dataset. Use with offset to paginate over the data (this can reduce the memory requirements of large loads).
Offset
offset
integer, optional
Number of items to skip from the start of the dataset. Use with limit to paginate over the data (this can reduce the memory requirements of large loads).
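One way to use limit and offset together is to split a large dataset across several runs. The sketch below only builds the input objects; buildPagedInputs is a hypothetical helper, and the dataset ID and item counts are placeholders.

```javascript
// Hypothetical helper: split a dataset of `totalItems` items into run inputs,
// each covering at most `pageSize` items via the limit and offset fields.
function buildPagedInputs(baseInput, totalItems, pageSize) {
  const inputs = [];
  for (let offset = 0; offset < totalItems; offset += pageSize) {
    // The last page may be shorter than pageSize.
    const limit = Math.min(pageSize, totalItems - offset);
    inputs.push({ ...baseInput, offset, limit });
  }
  return inputs;
}

// Placeholder values: a 25,000-item dataset processed 10,000 items at a time.
const pagedInputs = buildPagedInputs({ datasetId: 'YOUR_DATASET_ID' }, 25000, 10000);
```

Each resulting object can be used as the input of a separate run, so no single run has to hold the whole dataset in memory.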
Output to
outputTo
Enum, optional
Where to save the input data after any transformations applied during the download process.
Value options:
"no-output": string
"key-value-store": string
"dataset": string
Output dataset Name or ID
outputDatasetId
string, optional
Name or ID of the dataset where the data will be saved. Only relevant if you output to a dataset.
Key Value store input
storeInput
string, optional
Use this to load the input data from a key-value store instead of a dataset. Notation: storeId-recordKey, e.g. kWdGzuXuKfYkrntWw-OUTPUT
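The notation can be read as a store ID and a record key joined by a hyphen. The sketch below assumes the split happens at the first hyphen, so the record key may itself contain hyphens but the store ID may not; parseStoreInput is a hypothetical helper, not the Actor's own parser.

```javascript
// Hypothetical helper: split "storeId-recordKey" at the first hyphen.
// Assumption: the store ID contains no hyphen; the record key may contain some.
function parseStoreInput(storeInput) {
  const separatorIndex = storeInput.indexOf('-');
  if (separatorIndex <= 0 || separatorIndex === storeInput.length - 1) {
    throw new Error(`Expected "storeId-recordKey", got "${storeInput}"`);
  }
  return {
    storeId: storeInput.slice(0, separatorIndex),
    recordKey: storeInput.slice(separatorIndex + 1),
  };
}

const parsed = parseStoreInput('kWdGzuXuKfYkrntWw-OUTPUT');
```

For the example value, this yields the store ID kWdGzuXuKfYkrntWw and the record key OUTPUT.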
Upload to
uploadTo
Enum, optional
Where to upload the image files.
Value options:
"zip-file": string
"key-value-store": string
"s3": string
"no-upload": string
Key-value store name
uploadStoreName
string, optional
Name of the key-value store where the images will be uploaded. If left empty, they are uploaded to the default key-value store.
S3 Bucket
s3Bucket
string, optional
Only relevant if you upload to S3. Name of the bucket to upload to.
S3 Access key id
s3AccessKeyId
string, optional
Only relevant if you upload to S3. You can create these credentials for an IAM user.
S3 Secret access key
s3SecretAccessKey
string, optional
Only relevant if you upload to S3. You can create these credentials for an IAM user.
Check if key is already on S3
s3CheckIfAlreadyThere
boolean, optional
This option is useful if you don't want to overwrite an image that is already in the bucket. GET requests are also cheaper than PUT requests.
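Putting the S3 fields together, an input for an S3 upload might look like the sketch below. All values are placeholders, and reading the credentials from the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY is just one assumed way to keep them out of source code.

```javascript
// Example input object built from the documented field names.
// Every value here is a placeholder to replace with your own.
const input = {
  datasetId: 'YOUR_DATASET_ID',
  pathToImageUrls: 'details[0].images',
  uploadTo: 's3',
  s3Bucket: 'your-bucket-name',
  // Assumption: credentials of an IAM user are provided via environment variables.
  s3AccessKeyId: process.env.AWS_ACCESS_KEY_ID,
  s3SecretAccessKey: process.env.AWS_SECRET_ACCESS_KEY,
  // Skip images whose key already exists in the bucket.
  s3CheckIfAlreadyThere: true,
};
```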
Pre-download function
preDownloadFunction
string, optional
Function that specifies how the data is transformed before the images are downloaded. The input and output of the function are the whole data array. To skip downloading the images of an item, add a skipItem: true field to it.
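A minimal sketch of such a function follows. The argument shape ({ data }) and the images field on each item are assumptions made for this example; only the skipItem: true convention comes from the description above.

```javascript
// Assumed signature: the function receives an object with the whole `data` array
// and returns the transformed array.
const preDownloadFunction = ({ data }) => data.map((item) => {
  // `images` is a hypothetical field name for this example.
  const hasImages = Array.isArray(item.images) && item.images.length > 0;
  // Mark items without images so that their download is skipped.
  return hasImages ? item : { ...item, skipItem: true };
});

const transformed = preDownloadFunction({
  data: [
    { id: 1, images: ['https://example.com/a.jpg'] },
    { id: 2, images: [] },
  ],
});
```

Here the second item has no images, so it is returned with skipItem: true while the first item is passed through unchanged.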
Post-download function
postDownloadFunction
string, optional
Function that specifies how the data is transformed after the images are downloaded. The input and output of the function are the whole data array. By default, it adds either the file URL or an errors array, depending on whether the download was successful.
Max retries
imageCheckMaxRetries
integer, optional
How many times the Actor should retry when a downloaded file fails the image check. Setting this too high can lead to unnecessary loops.
Default value of this property is 6
Image check type
imageCheckType
Enum, optional
Type of the image check. If the image does not pass, the download is retried with a proxy, and if that also fails, the image is not uploaded.
Value options:
"none": string
"content-type": string
"image-size": string
Min size in KB
imageCheckMinSize
integer, optional
Minimum size of the image in KB to pass the image check.
Min width
imageCheckMinWidth
integer, optional
Minimum width of the image in pixels to pass the image check. Works only if the image check type is 'jimp'.
Min height
imageCheckMinHeight
integer, optional
Minimum height of the image in pixels to pass the image check. Works only if the image check type is 'jimp'.
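The three minimum thresholds can be thought of as one combined test on each downloaded image. The sketch below is illustrative only: passesImageCheck and the image object shape are hypothetical, 1 KB is assumed to be 1024 bytes, and the restriction of the width and height checks to a particular check type is not modeled.

```javascript
// Hypothetical check: compare an image's measured properties to the thresholds.
// `image` is an assumed shape: { sizeBytes, width, height }.
function passesImageCheck(image, options) {
  // Unset thresholds default to 0, so they never reject an image.
  const {
    imageCheckMinSize = 0,
    imageCheckMinWidth = 0,
    imageCheckMinHeight = 0,
  } = options;
  const sizeKb = image.sizeBytes / 1024; // assumption: 1 KB = 1024 bytes
  return sizeKb >= imageCheckMinSize
    && image.width >= imageCheckMinWidth
    && image.height >= imageCheckMinHeight;
}

const image = { sizeBytes: 20480, width: 800, height: 600 };
const accepted = passesImageCheck(image, { imageCheckMinSize: 10, imageCheckMinWidth: 640 });
const rejected = passesImageCheck(image, { imageCheckMinHeight: 1000 });
```

The sample image is 20 KB and 800 by 600 pixels, so it passes the first set of thresholds and fails the minimum height of 1000.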
Max concurrency
maxConcurrency
integer, optional
Maximum number of download/upload requests running in parallel. The limit exists to avoid overloading the host server.
Default value of this property is 40
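To show what a concurrency limit does in general, here is a small sketch of a worker pool that never runs more than maxConcurrency tasks at once. It is a generic pattern, not the Actor's implementation, and runWithConcurrency is a hypothetical name.

```javascript
// Generic pattern: run async tasks with at most `maxConcurrency` in flight.
// `tasks` is an array of functions that each return a promise.
async function runWithConcurrency(tasks, maxConcurrency) {
  const results = new Array(tasks.length);
  let nextIndex = 0;

  // Each worker repeatedly claims the next unstarted task until none remain.
  async function worker() {
    while (nextIndex < tasks.length) {
      const index = nextIndex;
      nextIndex += 1;
      results[index] = await tasks[index]();
    }
  }

  const workerCount = Math.min(maxConcurrency, tasks.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results; // results keep the order of the input tasks
}

// Usage sketch: ten simulated downloads, at most three at a time.
const simulatedDownloads = Array.from({ length: 10 }, (_, i) => async () => {
  await new Promise((resolve) => setTimeout(resolve, 5));
  return i;
});
const downloadResults = runWithConcurrency(simulatedDownloads, 3);
```

A lower limit is gentler on the host server but makes the whole run take longer.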
Download timeout in ms
downloadTimeout
integer, optional
How long to wait, in milliseconds, for each image to download.
Default value of this property is 15000
Batch Size
batchSize
integer, optional
Number of items loaded from the dataset in one batch.
Default value of this property is 10000
Convert webp to png
convertWebpToPng
boolean, optional
If checked, the Actor automatically converts all WebP images to standard PNG. This increases the size of the images.
Actor Metrics
38 monthly users
14 stars
>99% runs succeeded
Created in Nov 2018
Modified 2 months ago