Apify Storage

Scalable and reliable cloud data storages designed specifically for web scraping and automation workloads

Streamline your web scraping, crawling, automation or data processing workloads with specialized data storages from Apify. Easily maintain queues of URLs of web pages to crawl, store screenshots, save scraping results and export them to formats such as JSON, CSV or Excel.

Benefits

Designed especially for web scraping and crawling

Traditional database systems are not well suited for web scraping and crawling operations, and they can become prohibitively expensive for large workloads or fail to scale altogether. Apify provides low-cost storage carefully designed for these types of operations.

Enterprise-grade reliability, performance, and scalability

Store a few records or a few hundred million, with the same low latency and high reliability. Apify storages are designed according to industry best practices and use Amazon Web Services for the underlying data storage, giving you high availability and peace of mind.

Easy to use

Data stored in Apify storages can be viewed on the web, giving you a quick way to review the data and share it with other people. The Apify API and SDK make it easy to integrate our storages into your apps. All storage types come with extensive documentation and code examples.

Dataset

Store results from your web scraping, crawling or data processing jobs in Apify datasets and then export them to various formats like JSON, CSV, XML, RSS, Excel or HTML.

Datasets are ideal for storing lists of items, such as products from an online store or contact details of prospective customers. The advanced formatting and filtering options let you easily integrate datasets into your data pipeline.
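To make the idea concrete, here is a minimal in-memory sketch of what "store a list of items, filter the fields, export to JSON or CSV" can look like. It uses only the Python standard library and is not the Apify API or SDK; the item fields and the helper names `filter_fields`, `to_json` and `to_csv` are hypothetical and chosen just for illustration.

```python
import csv
import io
import json

# Hypothetical scraped items; the field names are made up for this sketch.
items = [
    {"name": "Laptop", "price": 999.0, "in_stock": True},
    {"name": "Mouse", "price": 19.5, "in_stock": False},
]


def filter_fields(records, fields):
    """Keep only the listed fields, in the given order."""
    return [{field: record.get(field) for field in fields} for record in records]


def to_json(records):
    """Serialize the records as a JSON array."""
    return json.dumps(records, indent=2)


def to_csv(records):
    """Serialize the records as CSV, using the first record's keys as the header."""
    if not records:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(records[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


selected = filter_fields(items, ["name", "price"])
print(to_json(selected))
print(to_csv(selected))
```

The same list of records feeds both exporters, which is the property that makes a dataset easy to plug into a downstream pipeline.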

Request queue

Maintain a queue of URLs of web pages in order to recursively crawl websites, starting from initial URLs and adding new links as they are found while skipping duplicates.

The request queue lets you query whether specific URLs were already found, push new URLs to the queue and fetch the next ones to process. Request queues support both breadth-first and depth-first crawling orders, and custom data attributes.
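The behavior described above can be sketched with a small in-memory queue. This is only an illustration of the concept under simplifying assumptions, not the Apify request queue implementation or its API: the class `RequestQueueSketch`, its method names, and the `LINKS` graph standing in for real web pages are all hypothetical.

```python
from collections import deque


class RequestQueueSketch:
    """In-memory stand-in for a URL queue with deduplication (illustrative only)."""

    def __init__(self, depth_first=False):
        self._pending = deque()
        self._seen = set()
        self._user_data = {}
        self._depth_first = depth_first

    def add_request(self, url, user_data=None):
        """Enqueue a URL; return False if it was already seen."""
        if url in self._seen:
            return False
        self._seen.add(url)
        self._user_data[url] = user_data or {}
        self._pending.append(url)
        return True

    def was_seen(self, url):
        """Report whether the URL was ever added to the queue."""
        return url in self._seen

    def fetch_next(self):
        """Return the next (url, user_data) pair, or None when the queue is empty."""
        if not self._pending:
            return None
        # Newest first gives depth-first order, oldest first gives breadth-first.
        url = self._pending.pop() if self._depth_first else self._pending.popleft()
        return url, self._user_data[url]


# Hypothetical link graph used instead of fetching real pages.
LINKS = {
    "https://example.com/": ["https://example.com/a", "https://example.com/b"],
    "https://example.com/a": ["https://example.com/c", "https://example.com/"],
    "https://example.com/b": ["https://example.com/c"],
    "https://example.com/c": [],
}


def crawl(start_url, depth_first=False):
    """Visit every reachable URL once and return the visiting order."""
    queue = RequestQueueSketch(depth_first)
    queue.add_request(start_url, {"depth": 0})
    visited = []
    while (item := queue.fetch_next()) is not None:
        url, data = item
        visited.append(url)
        for link in LINKS.get(url, []):
            # Duplicates are skipped inside add_request.
            queue.add_request(link, {"depth": data["depth"] + 1})
    return visited


print(crawl("https://example.com/"))
print(crawl("https://example.com/", depth_first=True))
```

The `user_data` dictionary plays the role of the custom data attributes: here it carries the crawl depth alongside each URL.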

Key-value store

Store arbitrary data records along with their MIME content type. The records are accessible under a unique name and can be written and read at a rapid rate. The key-value store is ideal for saving files, such as screenshots of web pages or PDFs, or for persisting the state of your actors and crawlers.
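As a rough illustration of the idea, the sketch below keeps records in memory as raw bytes paired with a MIME content type and uses one record to persist crawler state. It is not the Apify key-value store or its API; the class `KeyValueStoreSketch`, its methods, the key names and the placeholder image bytes are hypothetical.

```python
import json


class KeyValueStoreSketch:
    """In-memory stand-in for a key-value store (illustrative only)."""

    def __init__(self):
        self._records = {}

    def set_value(self, key, data, content_type="application/octet-stream"):
        """Store raw bytes under a unique key together with their MIME type."""
        if not isinstance(data, bytes):
            raise TypeError("data must be bytes")
        self._records[key] = (data, content_type)

    def get_value(self, key):
        """Return (data, content_type) for the key, or None if it is missing."""
        return self._records.get(key)


store = KeyValueStoreSketch()

# Save a file-like record; these bytes are a placeholder, not a complete image.
store.set_value("screenshot-home", b"\x89PNG placeholder", "image/png")

# Persist crawler state as JSON so a later run could resume from it.
state = {"pages_done": 42, "last_url": "https://example.com/a"}
store.set_value(
    "CRAWLER_STATE",
    json.dumps(state).encode("utf-8"),
    "application/json; charset=utf-8",
)

# Read the state back and decode it according to its content type.
data, content_type = store.get_value("CRAWLER_STATE")
restored = json.loads(data.decode("utf-8"))
print(content_type, restored)
```

Keeping the content type next to the bytes lets a reader of the record decide how to decode it without guessing from the key name.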
