Output to Dataset

Merges outputs from multiple actors into a single dataset. Execute actors in series or parallel, combine data from datasets, key-value stores, webhooks, and export the final output in various formats.

Pricing

from $0.07 / 1,000 results

Rating: 5.0 (1)
Developer: njoylab · Maintained by Community
Actor stats: 0 bookmarked · 8 total users · 1 monthly active user · last modified 7 days ago

Apify actor that merges outputs from multiple actors into a single dataset. Execute actors in series or parallel, combine data from datasets, key-value stores, and webhooks, and get ready-made JSON/CSV/XLSX download links straight from the run logs.

📋 Table of Contents

- ✨ Features
- 🚀 Quick Start
- 📝 Input Configuration
- 📦 Complete Example
- 📤 Output
- 💡 Use Cases
- 📄 License

✨ Features

| Feature | Description |
|---------|-------------|
| 🔗 Multiple Data Sources | Fetch data from existing datasets, key-value stores, actor runs, or webhook URLs |
| ⚙ Actor Execution | Run multiple actors and collect their outputs in parallel or series |
| 🔀 Merge Strategies | Append all data or deduplicate based on specified fields |
| 🔧 Transformations | Filter, remap, pick, or enrich records before merging |
| 📥 Instant Downloads | Every run logs a dataset console link plus JSON, CSV, and XLSX download URLs |

🚀 Quick Start

Run this actor to merge data from multiple sources into a single dataset:

```json
{
  "actorRuns": [
    {
      "actorId": "apify/web-scraper",
      "input": { "startUrls": [{ "url": "https://example.com" }] }
    }
  ],
  "mergeStrategy": "append"
}
```

After the run completes, check the logs for direct download links in JSON, CSV, and XLSX formats.


📝 Input Configuration

Sources

Merge data from existing Apify resources. Specify an array of sources with their type and identifier:

| Type | Description | Required Fields |
|------|-------------|-----------------|
| `dataset` | Read items from an existing dataset | `id` - Dataset ID |
| `keyValueStore` | Read a record from a key-value store | `id` - Store ID, `key` - Record name |
| `actorRun` | Read output from a previous actor run | `id` - Run ID |
| `webhook` | Fetch data from an external URL | `id` - Webhook URL |

Example:

```json
{
  "sources": [
    { "type": "dataset", "id": "datasetId123" },
    { "type": "keyValueStore", "id": "storeId456", "key": "OUTPUT" },
    { "type": "actorRun", "id": "runId789" },
    { "type": "webhook", "id": "https://api.example.com/data" }
  ]
}
```
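The required-fields rules in the table above can be sketched as a small validation helper. This is illustrative only; the actor's actual input validation may differ, and the function name is ours:

```javascript
// Validate a source entry against the required fields listed above.
// Throws on a malformed entry; accepted types come from the table.
function validateSource(source) {
    const types = ['dataset', 'keyValueStore', 'actorRun', 'webhook'];
    if (!types.includes(source.type)) {
        throw new Error(`Unknown source type: ${source.type}`);
    }
    if (!source.id) {
        throw new Error(`Source of type "${source.type}" is missing "id"`);
    }
    // Key-value store sources additionally need the record name.
    if (source.type === 'keyValueStore' && !source.key) {
        throw new Error('keyValueStore source is missing "key"');
    }
    return source;
}
```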

Actor Runs

Execute actors before merging their outputs. Each actor run configuration supports:

| Field | Type | Description |
|-------|------|-------------|
| `actorId` | string | Required. Actor ID or `username/actor-name` |
| `input` | object | Input object to pass to the actor |
| `outputType` | string | `"dataset"` (default) or `"keyValueStore"` |
| `outputKey` | string | Record name when using the `keyValueStore` output type |

Example:

```json
{
  "actorRuns": [
    {
      "actorId": "apify/web-scraper",
      "input": { "startUrls": [{ "url": "https://example.com" }] },
      "outputType": "dataset"
    },
    {
      "actorId": "apify/google-search-scraper",
      "input": { "queries": "apify" },
      "outputType": "keyValueStore",
      "outputKey": "OUTPUT"
    }
  ]
}
```

💡 Tip: Use `outputType: "dataset"` (default) when the actor pushes items to its dataset. Use `outputType: "keyValueStore"` when the actor saves data via `Actor.setValue()`.


Execution Mode

Controls how actors are executed:

| Mode | Description |
|------|-------------|
| `parallel` | Default. Run all actors simultaneously for faster results |
| `series` | Run actors one after another (useful when order matters or for rate limiting) |
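For illustration, the difference between the two modes can be sketched with plain promises. This is a minimal sketch, not the actor's actual implementation; each actor run is modeled as an async function returning its output:

```javascript
// Run a list of async tasks in the configured execution mode.
// 'parallel' starts everything at once; 'series' awaits each task
// before starting the next, preserving order and throttling load.
async function runAll(tasks, executionMode) {
    if (executionMode === 'parallel') {
        return Promise.all(tasks.map((task) => task()));
    }
    const results = [];
    for (const task of tasks) {
        results.push(await task());
    }
    return results;
}
```

Either way, the results array keeps the same order as the input tasks; only the start times differ.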

Merge Strategy

Determines how data is combined:

| Strategy | Description |
|----------|-------------|
| `append` | Default. Combine all items, keeping duplicates |
| `deduplicate` | Remove duplicate items based on `deduplicateBy` fields |

Deduplication example:

```json
{
  "mergeStrategy": "deduplicate",
  "deduplicateBy": ["url", "title"]
}
```
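One way to implement this strategy is a composite-key filter; a minimal sketch, assuming the first occurrence of each key combination wins (the actor's exact tie-breaking rule is an assumption here):

```javascript
// Drop items whose deduplicateBy field values have been seen before.
function deduplicate(items, deduplicateBy) {
    const seen = new Set();
    return items.filter((item) => {
        // Build a composite key from the configured fields.
        const key = JSON.stringify(deduplicateBy.map((field) => item[field]));
        if (seen.has(key)) return false;
        seen.add(key);
        return true;
    });
}
```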

Transformations

Apply transformations to each item before merging. Transformations run in the order provided.

| Type | Description | Parameters |
|------|-------------|------------|
| `filter` | Keep only items matching a condition | `field`, `operator`, `value` |
| `mapFields` | Copy data from one field path to another | `mapping`, `removeOriginal` |
| `pickFields` | Keep only specified fields | `fields`, `dropUndefined` |
| `setField` | Set a static value on a field | `field`, `value`, `overwrite` |

Filter operators: `equals`, `notEquals`, `contains`, `greaterThan`, `lessThan`, `exists`
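The operators can be read as a single predicate over one item. The sketch below is illustrative; how the actor treats missing fields or non-string values for `contains` is an assumption:

```javascript
// Evaluate one filter condition against an item.
function matches(item, { field, operator, value }) {
    const actual = item[field];
    switch (operator) {
        case 'equals': return actual === value;
        case 'notEquals': return actual !== value;
        // Assumed: 'contains' applies to string fields only.
        case 'contains': return typeof actual === 'string' && actual.includes(value);
        case 'greaterThan': return actual > value;
        case 'lessThan': return actual < value;
        // Assumed: null counts as absent.
        case 'exists': return actual !== undefined && actual !== null;
        default: throw new Error(`Unknown operator: ${operator}`);
    }
}
```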

Example:

```json
{
  "transformations": [
    {
      "type": "filter",
      "field": "price",
      "operator": "lessThan",
      "value": 50
    },
    {
      "type": "mapFields",
      "mapping": {
        "title": "product.name",
        "price": "product.price"
      },
      "removeOriginal": true
    },
    {
      "type": "pickFields",
      "fields": ["product.name", "product.price", "url"]
    },
    {
      "type": "setField",
      "field": "currency",
      "value": "USD",
      "overwrite": false
    }
  ]
}
```
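The `mapFields` step in this example writes flat fields into nested dot-paths. A minimal sketch of that behavior, assuming mapping keys are source fields and values are target paths (both the direction and the helper names are our assumptions):

```javascript
// Write a value into an object at a dot-separated path,
// creating intermediate objects as needed.
function setByPath(obj, path, value) {
    const parts = path.split('.');
    let cursor = obj;
    for (const part of parts.slice(0, -1)) {
        cursor[part] = cursor[part] ?? {};
        cursor = cursor[part];
    }
    cursor[parts[parts.length - 1]] = value;
}

// Apply a mapFields transformation to a single item.
function mapFields(item, { mapping, removeOriginal }) {
    const out = { ...item };
    for (const [from, to] of Object.entries(mapping)) {
        setByPath(out, to, out[from]);
        if (removeOriginal) delete out[from];
    }
    return out;
}
```

With the mapping from the example above, `{ title: "X", price: 5, url: "u" }` becomes `{ url: "u", product: { name: "X", price: 5 } }`.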

📦 Complete Example

This example runs two actors in parallel, merges their outputs with an existing dataset, and deduplicates by URL:

```json
{
  "actorRuns": [
    {
      "actorId": "apify/web-scraper",
      "input": {
        "startUrls": [{ "url": "https://apify.com/store" }],
        "pageFunction": "async function pageFunction(context) { return context.request; }"
      }
    },
    {
      "actorId": "apify/google-search-scraper",
      "input": { "queries": "web scraping" }
    }
  ],
  "sources": [
    { "type": "dataset", "id": "existingDatasetId" }
  ],
  "executionMode": "parallel",
  "mergeStrategy": "deduplicate",
  "deduplicateBy": ["url"]
}
```

📤 Output

All merged records are saved to the actor's default dataset. After each run, the logs display:

  1. Console link - Direct link to view the dataset in Apify Console
  2. Download URLs - Ready-to-use links for JSON, CSV, and XLSX exports

You can also export the dataset manually from the Apify Console in any supported format.
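The logged download URLs can also be built by hand from the run's default dataset ID via Apify's public dataset-items endpoint. A sketch, assuming the `format` query values match the export formats above:

```javascript
// Build export links for a dataset using the Apify API
// dataset-items endpoint (GET /v2/datasets/{datasetId}/items).
function buildDownloadLinks(datasetId) {
    const base = `https://api.apify.com/v2/datasets/${datasetId}/items`;
    return {
        json: `${base}?format=json`,
        csv: `${base}?format=csv`,
        xlsx: `${base}?format=xlsx`,
    };
}
```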


💡 Use Cases

1. Merge Multiple Scraping Runs

Run the same scraper with different inputs and merge all results:

```json
{
  "actorRuns": [
    { "actorId": "my-scraper", "input": { "category": "electronics" } },
    { "actorId": "my-scraper", "input": { "category": "books" } },
    { "actorId": "my-scraper", "input": { "category": "clothing" } }
  ],
  "executionMode": "parallel",
  "mergeStrategy": "append"
}
```

2. Combine Historical Data

Merge data from multiple previous actor runs:

```json
{
  "sources": [
    { "type": "actorRun", "id": "run1" },
    { "type": "actorRun", "id": "run2" },
    { "type": "actorRun", "id": "run3" }
  ],
  "mergeStrategy": "deduplicate",
  "deduplicateBy": ["id"]
}
```

3. Aggregate Multiple Datasets

Combine existing datasets into one unified dataset:

```json
{
  "sources": [
    { "type": "dataset", "id": "dataset1" },
    { "type": "dataset", "id": "dataset2" },
    { "type": "dataset", "id": "dataset3" }
  ]
}
```

📄 License

This project is licensed under the ISC License.