Output to Dataset

Output to Dataset is an Apify actor that merges outputs from multiple actors into a single dataset. Execute actors in series or in parallel, combine data from datasets, key-value stores, and webhooks, and get ready-made JSON/CSV/XLSX download links straight from the run logs.

Features

  • Multiple Data Sources: Fetch data from:

    • Existing datasets
    • Key-value stores
    • Actor runs
    • Webhook URLs
  • Actor Execution: Run multiple actors and collect their outputs

    • Parallel execution: Run all actors simultaneously for faster results
    • Series execution: Run actors one after another
  • Merge Strategies:

    • Append: Combine all data (keeps duplicates)
    • Deduplicate: Remove duplicates based on specified fields
  • Data Transformations: Filter, remap, pick, or enrich records before merging

  • Instant Downloads: Every run logs the dataset console link plus JSON, CSV, and XLSX download URLs powered by Apify dataset exports

Input Configuration

Sources

Array of existing data sources to merge:

{
  "sources": [
    {
      "type": "dataset",
      "id": "datasetId123"
    },
    {
      "type": "keyValueStore",
      "id": "storeId456",
      "key": "OUTPUT"
    },
    {
      "type": "actorRun",
      "id": "runId789"
    },
    {
      "type": "webhook",
      "id": "https://api.example.com/data"
    }
  ]
}

Actor Runs

Array of actors to execute before merging:

{
  "actorRuns": [
    {
      "actorId": "apify/web-scraper",
      "input": {
        "startUrls": [{"url": "https://example.com"}]
      },
      "outputType": "dataset"
    },
    {
      "actorId": "apify/google-search-scraper",
      "input": {
        "queries": "apify"
      },
      "outputType": "keyValueStore",
      "outputKey": "OUTPUT"
    }
  ]
}

Use outputType to control where each run stores its data before merging:

  • dataset (default) – read the items that the actor pushed to its default dataset; no outputKey needed.
  • keyValueStore – read a file/record saved via Actor.setValue in the default key-value store; set outputKey to the record name (e.g., "MERGED_OUTPUT.json").

Example mixing both sinks:

{
  "actorRuns": [
    {
      "actorId": "my-dataset-actor",
      "outputType": "dataset"
    },
    {
      "actorId": "my-exporting-actor",
      "outputType": "keyValueStore",
      "outputKey": "LATEST_EXPORT"
    }
  ]
}

Execution Mode

  • parallel (default): Run all actors at the same time
  • series: Run actors one after another
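The difference between the two modes can be sketched in a few lines. This is an illustrative sketch only, not the actor's actual source; `run_actor` is a hypothetical stand-in for starting a run and awaiting its output.

```python
import asyncio

async def run_actor(actor_id: str) -> str:
    # Hypothetical stand-in for starting an actor run and awaiting its result.
    await asyncio.sleep(0)  # placeholder for the real API call
    return f"items from {actor_id}"

async def execute(actor_ids: list[str], mode: str = "parallel") -> list[str]:
    if mode == "parallel":
        # Start every run at once and await them together.
        return list(await asyncio.gather(*(run_actor(a) for a in actor_ids)))
    # Series: each run starts only after the previous one has finished.
    return [await run_actor(a) for a in actor_ids]

results = asyncio.run(execute(["scraper-a", "scraper-b"], mode="series"))
```

Parallel mode is faster but runs all actors concurrently against your account's memory limits; series mode trades speed for predictable, one-at-a-time resource usage.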

Merge Strategy

  • append (default): Combine all items, keeping duplicates
  • deduplicate: Remove duplicate items based on specified fields

{
  "mergeStrategy": "deduplicate",
  "deduplicateBy": ["url", "title"]
}
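Deduplication of this kind is typically first-occurrence-wins on the combination of the listed fields. A minimal sketch of that behavior (assumed semantics, not the actor's actual source):

```python
def merge(items, strategy="append", deduplicate_by=None):
    """Merge already-collected items according to the chosen strategy."""
    if strategy == "append" or not deduplicate_by:
        return list(items)  # keep everything, duplicates included
    seen, merged = set(), []
    for item in items:
        # Identity key built from the configured fields; first occurrence wins.
        key = tuple(item.get(f) for f in deduplicate_by)
        if key not in seen:
            seen.add(key)
            merged.append(item)
    return merged

rows = [
    {"url": "https://a.com", "title": "A"},
    {"url": "https://a.com", "title": "A"},
    {"url": "https://b.com", "title": "B"},
]
deduped = merge(rows, "deduplicate", ["url", "title"])  # 2 unique rows remain
```

Note that items missing one of the `deduplicateBy` fields would all share the same `None` key under this scheme, so choose fields that every source reliably populates.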

Output Location

All merged records are pushed to the actor's default dataset. Use Apify Console exports (JSON, CSV, XLSX, etc.) when you need a specific download format.

Transformations

Apply zero or more transformations to each item before the merge step. Transformations run in the order provided.

{
  "transformations": [
    {
      "type": "filter",
      "field": "price",
      "operator": "lessThan",
      "value": 50
    },
    {
      "type": "mapFields",
      "mapping": {
        "title": "product.name",
        "price": "product.price"
      },
      "removeOriginal": true
    },
    {
      "type": "pickFields",
      "fields": ["product.name", "product.price", "url"]
    },
    {
      "type": "setField",
      "field": "currency",
      "value": "USD",
      "overwrite": false
    }
  ]
}

Supported transformation types:

  • filter: keep only items whose field matches a condition (equals, notEquals, contains, greaterThan, lessThan, exists).
  • mapFields: copy data from one field path to another (with optional removal of the original field).
  • pickFields: keep only the listed field paths (missing values are kept unless dropUndefined is true).
  • setField: write a static value into a field, optionally skipping existing values unless overwrite is true.
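The pipeline above can be sketched as a single pass over each item. This is an assumed, simplified model of the documented behavior (flat field names only, no dotted-path traversal), not the actor's actual implementation:

```python
def apply_transformations(item, transformations):
    """Apply filter/mapFields/pickFields/setField steps in order.
    Returns None when a filter rejects the item."""
    ops = {
        "equals": lambda a, b: a == b,
        "notEquals": lambda a, b: a != b,
        "contains": lambda a, b: b in (a or ""),
        "greaterThan": lambda a, b: a is not None and a > b,
        "lessThan": lambda a, b: a is not None and a < b,
        "exists": lambda a, b: a is not None,
    }
    for t in transformations:
        if t["type"] == "filter":
            if not ops[t["operator"]](item.get(t["field"]), t.get("value")):
                return None  # filtered out before the merge step
        elif t["type"] == "mapFields":
            for src, dst in t["mapping"].items():
                item[dst] = item.get(src)
                if t.get("removeOriginal"):
                    item.pop(src, None)
        elif t["type"] == "pickFields":
            item = {k: v for k, v in item.items() if k in t["fields"]}
        elif t["type"] == "setField":
            if t.get("overwrite", True) or t["field"] not in item:
                item[t["field"]] = t["value"]
    return item

result = apply_transformations({"price": 20, "title": "Widget"}, [
    {"type": "filter", "field": "price", "operator": "lessThan", "value": 50},
    {"type": "setField", "field": "currency", "value": "USD", "overwrite": False},
])
```

Here the filter passes (20 < 50) and `setField` fills in `currency` because the field is absent and `overwrite` is false.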

Complete Example

{
  "actorRuns": [
    {
      "actorId": "apify/web-scraper",
      "input": {
        "startUrls": [
          {"url": "https://apify.com/store"}
        ],
        "pageFunction": "async function pageFunction(context) { return context.request; }"
      }
    },
    {
      "actorId": "apify/google-search-scraper",
      "input": {
        "queries": "web scraping"
      }
    }
  ],
  "sources": [
    {
      "type": "dataset",
      "id": "existingDatasetId"
    }
  ],
  "executionMode": "parallel",
  "mergeStrategy": "deduplicate",
  "deduplicateBy": ["url"]
}

Output

The actor saves all merged data to its default dataset. You can:

  1. Access via Apify Console: View the dataset in the run's output tab
  2. Download: Export the dataset in any format from the Apify platform
  3. Follow the logs: After each run the actor prints both a console link and ready-to-use JSON/CSV/XLSX download URLs for the merged dataset.
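The logged download URLs follow the Apify API's dataset-items endpoint, where a `format` query parameter selects the export type. A small sketch of how such links can be constructed for any dataset ID (private datasets additionally require a `token` query parameter):

```python
def export_urls(dataset_id: str) -> dict:
    # The Apify API serves dataset items at /v2/datasets/{id}/items;
    # the `format` query parameter selects the export type.
    base = f"https://api.apify.com/v2/datasets/{dataset_id}/items"
    return {fmt: f"{base}?format={fmt}" for fmt in ("json", "csv", "xlsx")}

links = export_urls("abc123")
# links["csv"] -> "https://api.apify.com/v2/datasets/abc123/items?format=csv"
```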

Use Cases

1. Merge Multiple Scraping Runs

Run the same scraper with different inputs and merge results:

{
  "actorRuns": [
    {
      "actorId": "my-scraper",
      "input": {"category": "electronics"}
    },
    {
      "actorId": "my-scraper",
      "input": {"category": "books"}
    },
    {
      "actorId": "my-scraper",
      "input": {"category": "clothing"}
    }
  ],
  "executionMode": "parallel",
  "mergeStrategy": "append"
}

2. Combine Historical Data

Merge data from multiple previous runs:

{
  "sources": [
    {"type": "actorRun", "id": "run1"},
    {"type": "actorRun", "id": "run2"},
    {"type": "actorRun", "id": "run3"}
  ],
  "mergeStrategy": "deduplicate",
  "deduplicateBy": ["id"]
}

3. Aggregate Multiple Datasets

Combine existing datasets into one:

{
  "sources": [
    {"type": "dataset", "id": "dataset1"},
    {"type": "dataset", "id": "dataset2"},
    {"type": "dataset", "id": "dataset3"}
  ]
}