Firestore Import

Developed by

Apify

Seamlessly import data from Apify datasets into a Firebase Firestore database. This integration gives you full control over document IDs, conflict resolution (overwrite, merge, skip), and data transformation using a custom JavaScript function.

5.0 (4)

Pricing

Pay per usage

Last modified

5 months ago

Automation

Integrations

Open source

Firestore Import is an Apify integration Actor that imports data from an Apify dataset into Firebase Firestore, a NoSQL cloud database built on Google Cloud infrastructure. You can configure the target collection, how conflicts with existing documents are handled, and how each dataset item is transformed before it is imported into Firestore.

Features

The Firestore Import Actor takes a dataset, applies transformations, and imports the data into a Firestore database. The Actor is highly customizable. You can control how the data is imported, for example by:

Selecting the Firestore database and collection.
Automatically generating document IDs or using a field from the dataset as the document ID.
Handling document conflicts by overwriting, merging, or skipping documents whose ID already exists.
Transforming data before it is imported using a custom JavaScript function.
Producing multiple Firestore inserts/updates from one dataset item.
Giving each document its own configuration, such as a custom collection or document ID.
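
To make the three conflict modes concrete, here is a minimal in-memory sketch. It uses a plain Map in place of a Firestore collection, and the helper name applyWrite is hypothetical. It only illustrates the semantics listed above and is not the Actor's implementation; in particular, the merge shown here is shallow, which is a simplification.

```javascript
// In-memory stand-in for a Firestore collection: document ID -> document data.
const store = new Map([["a1", { title: "Old title", views: 10 }]]);

// Hypothetical helper illustrating overwrite / merge / skip semantics.
function applyWrite(collection, id, data, resolution) {
    const existing = collection.get(id);
    if (existing === undefined) {
        // No conflict: the document is created regardless of the resolution mode.
        collection.set(id, { ...data });
        return "created";
    }
    switch (resolution) {
        case "overwrite":
            collection.set(id, { ...data }); // existing fields are dropped
            return "overwritten";
        case "merge":
            collection.set(id, { ...existing, ...data }); // shallow merge only
            return "merged";
        case "skip":
            return "skipped"; // existing document stays untouched
        default:
            throw new Error(`Unknown conflict resolution: ${resolution}`);
    }
}

const mergeResult = applyWrite(store, "a1", { title: "New title" }, "merge");
const afterMerge = { ...store.get("a1") }; // snapshot: title replaced, views kept
const skipResult = applyWrite(store, "a1", { title: "Ignored" }, "skip");
const overwriteResult = applyWrite(store, "a1", { title: "Replaced" }, "overwrite");
const createResult = applyWrite(store, "b2", { title: "Fresh" }, "skip");
```

Note that the last call creates the document even though the mode is skip, because no document with that ID exists yet.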

Input

The Actor requires several input fields to work correctly. Below is a detailed description of each input field:

Field Name	Type	Description
`serviceAccountKey`	`string` (secret, required)	Service account key in JSON format. You can get it from Firebase Console -> Project Settings -> Service accounts -> Generate new private key. Paste the whole JSON string here. This is a secret input, so the value is stored in encrypted form.
`datasetId`	`string` (required)	ID of the Apify dataset to import data from.
`collection`	`string` (required)	Firestore collection to import data to. If it doesn't exist, it will be created. Note: you can customize the collection for each record by using the `transformFunction` input. This can be useful when you want to import data to sub-collections.
`databaseName`	`string` (optional)	Name of the Firestore database. If not provided, the default database (`"(default)"`) will be used.
`idField`	`string` (optional)	Field in the dataset item whose value is used as the Firestore document ID. The value must be a `string` or `number`. Setting this field is useful when you want to update existing documents in Firestore. If it is not provided, every document is created with a random ID generated by Firestore, and the value of `documentConflictResolution` is ignored. Note: you can set the ID for each document independently using the `transformFunction` input field.
`documentConflictResolution`	`enum`: `overwrite`, `merge`, `skip` (required)	How to handle conflicts when importing data to Firestore. `overwrite`: replace existing Firestore documents that have the same ID. `merge`: merge data from the dataset items into existing Firestore documents. `skip`: leave documents with existing IDs untouched. ⚠️ The `skip` resolution performs poorly at large scale because it cannot use batch writes; it sends a separate request to Firestore for each document.
`transformFunction`	`string` (javascript, optional)	JavaScript function that transforms each item from the dataset before importing it to Firestore. The function must return an object (or an array of objects) with the `data` key that contains the transformed record and other optional fields. See examples below.
`batchSize`	`number` (optional)	Number of items to import in a single batch. Lower values are safer but slower, see Firestore limits (10 MiB batch write). Please note that skip conflict resolution does not use batch writes and will always import one item at a time. Defaults to `500`.
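
For orientation, an input built from these fields might look like the sketch below. Every value is a placeholder: the dataset ID, collection name, `idField` value, and the shortened service account key are made up, and a real key contains more fields. The check at the end only verifies that the fields marked as required in the table are present.

```javascript
// Example Actor input. All values are placeholders, not real credentials or IDs.
const input = {
    // Real service account keys contain more fields than shown here.
    serviceAccountKey: JSON.stringify({ type: "service_account", project_id: "my-project" }),
    datasetId: "YOUR_DATASET_ID",
    collection: "products",
    idField: "sku",
    documentConflictResolution: "merge",
    transformFunction: "(item) => ({ data: item })",
    batchSize: 200,
};

// Fields marked as required in the table above.
const requiredFields = ["serviceAccountKey", "datasetId", "collection", "documentConflictResolution"];
const missingFields = requiredFields.filter(
    (field) => input[field] === undefined || input[field] === "",
);

if (missingFields.length > 0) {
    throw new Error(`Missing required input fields: ${missingFields.join(", ")}`);
}
```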

Transformation Function

The optional transformFunction input field lets you transform each dataset item before it is imported to Firestore. The field accepts a JavaScript function that takes one dataset item as a parameter and returns an object (or an array of objects) with the following keys:

data (required): transformed document that will be imported to Firestore.
id (optional): custom document ID. If not provided, the document ID is taken from the field named by the idField input. If idField is not set either, the document is created with a random ID generated by Firestore.
collection (optional): custom collection name. If not provided, the collection input field will be used.
documentConflictResolution (optional): custom conflict resolution for the document. If not provided, the documentConflictResolution input field will be used.

(item) => {
    return {
        data: item,                           // transformed document
        id: item.id,                          // custom document ID
        collection: "customCollection",       // custom collection name
        documentConflictResolution: "merge",  // custom conflict resolution
    };
}
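
The fallback order described above can be sketched as a small resolver. The names resolveWrite and normalizeResults and the shape of the actorInput object are assumptions made for illustration, as is the conversion of numeric IDs to strings; the Actor's internal code may differ.

```javascript
// A transform may return one object or an array of objects; normalize to an array.
function normalizeResults(returned) {
    return Array.isArray(returned) ? returned : [returned];
}

// Hypothetical resolver: fills in per-document settings from the Actor input
// when the transform result does not provide them.
function resolveWrite(result, item, actorInput) {
    if (result === null || typeof result !== "object" || result.data === undefined) {
        throw new Error("Transform result must be an object with a `data` key");
    }
    // ID fallback: explicit id, then the idField value (assumed to be read from
    // the original dataset item), then undefined (Firestore generates a random ID).
    let id = result.id;
    if (id === undefined && actorInput.idField !== undefined) {
        id = item[actorInput.idField];
    }
    return {
        data: result.data,
        id: id === undefined ? undefined : String(id), // assumption: numbers become strings
        collection: result.collection ?? actorInput.collection,
        documentConflictResolution:
            result.documentConflictResolution ?? actorInput.documentConflictResolution,
    };
}

// Usage with made-up values.
const actorInput = { collection: "products", idField: "sku", documentConflictResolution: "overwrite" };
const item = { sku: 42, name: "Widget" };
const writes = normalizeResults({ data: item, documentConflictResolution: "merge" })
    .map((result) => resolveWrite(result, item, actorInput));
```

Here the transform result overrides only the conflict resolution, so the collection and the document ID fall back to the input fields.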

Examples

Simple transformation function:

The function below sets newField to the value of oldField plus 1 and removes the unused field from the dataset item.

(item) => {
    item.newField = item.oldField + 1;
    delete item.unused;
    return { data: item };
}

Nested objects:

The function below transforms the dataset item into a Firestore document with nested objects. It updates the subdocument.field field and overwrites the whole author sub-document.

(item) => {
    return {
        data: {
            title: item.title,
            "subdocument.field": item.name,  // update single field of subdocument
            author: item.author              // overwrite whole subdocument
        },
    };
}

Field value functions:

The function below demonstrates how to use Firestore FieldValue functions. It adds new IDs to the existing ids array, removes values from the values array, increments the count field, and deletes the old field.

(item) => {
    return {
        data: {
            ids: FieldValue.arrayUnion(...item.ids),         // add new ids to the existing ids array (elements are passed as separate arguments)
            values: FieldValue.arrayRemove(...item.values),  // remove the given values from the existing array
            count: FieldValue.increment(item.count),         // increment the existing count field by the provided value
            old: FieldValue.delete(),                        // delete the field
        },
    };
}

Data types:

The function below demonstrates how to create Firestore data types such as Timestamp, Vector, GeoPoint, and DocumentReference.

(item) => {
    return {
        data: {
            updatedAt: Timestamp.fromDate(new Date(item.date)),            // create Timestamp data type (fromDate expects a Date object)
            vector: FieldValue.VectorValue(item.values),                   // create vector data type
            position: GeoPoint(item.lat, item.lon),                        // create geopoint data type
            reference: DocumentReference("collection", "referenceDocId"),  // create reference type
        },
    };
}

Subcollection:

The function below demonstrates how to import data to sub-collections. It returns an array where the first item is the main document and other items are documents for sub-collection.

(item) => {
    const subDocuments = item.items.map((subItem) => ({
        id: subItem.id,
        collection: `records/${item.customId}/items`,
        documentConflictResolution: "skip",
        data: {
            weight: subItem.weight,
            length: subItem.length,
            name: subItem.name,
        },
    }));

    return [
        {
            id: item.customId,
            collection: "records",
            documentConflictResolution: "merge",
            data: {
                title: item.title,
                description: item.description,
            },
        },
        ...subDocuments,
    ];
}
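
Before running an import, it can help to call the transform locally on a sample item and inspect what it would write. The sketch below repeats the function above under the hypothetical name toRecordWithItems and applies it to a made-up item; the sample values do not come from any real dataset.

```javascript
// Same transform as above, bound to a name so it can be called locally.
const toRecordWithItems = (item) => {
    const subDocuments = item.items.map((subItem) => ({
        id: subItem.id,
        collection: `records/${item.customId}/items`,
        documentConflictResolution: "skip",
        data: { weight: subItem.weight, length: subItem.length, name: subItem.name },
    }));

    return [
        {
            id: item.customId,
            collection: "records",
            documentConflictResolution: "merge",
            data: { title: item.title, description: item.description },
        },
        ...subDocuments,
    ];
};

// Made-up dataset item for a local dry run.
const sampleItem = {
    customId: "rec-1",
    title: "Sample record",
    description: "Made-up data for a local dry run",
    items: [
        { id: "i1", weight: 2, length: 10, name: "first" },
        { id: "i2", weight: 3, length: 12, name: "second" },
    ],
};

const plannedWrites = toRecordWithItems(sampleItem);
// Full document paths the transform would target.
const targets = plannedWrites.map((w) => `${w.collection}/${w.id}`);
// targets: ["records/rec-1", "records/rec-1/items/i1", "records/rec-1/items/i2"]
```

One dataset item yields three writes here: the parent document plus one document per entry in its items array.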

Output

The Actor writes statistics about the import to the key-value store under the key Statistics, with the following structure:

imported: total number of processed Firestore documents (either created, updated or skipped).
skipped: number of skipped Firestore documents.
overwritten: number of overwritten Firestore documents.
merged: number of merged Firestore documents.
created: number of created Firestore documents (includes documents written when documentConflictResolution is skip).
failed: number of failed writes to Firestore documents.
itemsProcessed: total number of processed dataset items (including failed items).
itemsFailed: number of failed dataset items.
executionTimeMs: time in milliseconds it took to import the data.
startTime: timestamp when the import started.
endTime: timestamp when the import ended.

{
  "imported": 59278,
  "skipped": 0,
  "overwritten": 0,
  "merged": 59278,
  "created": 0,
  "failed": 0,
  "itemsProcessed": 1136,
  "itemsFailed": 0,
  "executionTimeMs": 19725,
  "startTime": "2025-02-26T17:56:22.652Z",
  "endTime": "2025-02-26T17:56:42.377Z"
}
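
A few convenience numbers can be derived from this record. The sketch below uses the example statistics shown above; the helper name summarize and the derived field names are hypothetical and not part of the Actor's output.

```javascript
// Statistics object copied from the example above.
const statistics = {
    imported: 59278,
    skipped: 0,
    overwritten: 0,
    merged: 59278,
    created: 0,
    failed: 0,
    itemsProcessed: 1136,
    itemsFailed: 0,
    executionTimeMs: 19725,
    startTime: "2025-02-26T17:56:22.652Z",
    endTime: "2025-02-26T17:56:42.377Z",
};

// Hypothetical helper that derives throughput and a failure flag.
function summarize(stats) {
    const seconds = stats.executionTimeMs / 1000;
    return {
        documentsPerSecond: seconds > 0 ? stats.imported / seconds : 0,
        documentsPerItem: stats.itemsProcessed > 0 ? stats.imported / stats.itemsProcessed : 0,
        // Duration computed from the timestamps, for cross-checking executionTimeMs.
        elapsedMs: Date.parse(stats.endTime) - Date.parse(stats.startTime),
        hasFailures: stats.failed > 0 || stats.itemsFailed > 0,
    };
}

const summary = summarize(statistics);
console.log(summary);
```

In this example each dataset item produced roughly 52 documents on average, which is consistent with a transform function that returns several documents per item.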
