Output to Dataset

Pricing: from $0.07 / 1,000 results


Merges outputs from multiple actors into a single dataset. Execute actors in series or parallel, combine data from datasets, key-value stores, webhooks, and export the final output in various formats.


Rating: 5.0 (1)

Developer: njoylab (Maintained by Community)

Actor stats: 0 bookmarked · 15 total users · 2 monthly active users · last modified a month ago


Apify actor that merges outputs from multiple actors into a single dataset. Execute actors in series or parallel, combine data from datasets, key-value stores, and webhooks, and get ready-made JSON/CSV/XLSX download links straight from the run logs.

✨ Features

| Feature | Description |
| --- | --- |
| 🔗 Multiple Data Sources | Fetch data from existing datasets, key-value stores, actor runs, or webhook URLs |
| ⚡ Actor Execution | Run multiple actors and collect their outputs in parallel or in series |
| 🔀 Merge Strategies | Append all data or deduplicate based on specified fields |
| 🔧 Transformations | Filter, remap, pick, or enrich records before merging |
| 📥 Instant Downloads | Every run logs the dataset console link plus JSON, CSV, and XLSX download URLs |

🚀 Quick Start

Run this actor to merge data from multiple sources into a single dataset:

```json
{
  "actorRuns": [
    {
      "actorId": "apify/web-scraper",
      "input": { "startUrls": [{ "url": "https://example.com" }] }
    }
  ],
  "mergeStrategy": "append"
}
```

After the run completes, check the logs for direct download links in JSON, CSV, and XLSX formats.


๐Ÿ“ Input Configuration

Sources

Merge data from existing Apify resources. Specify an array of sources with their type and identifier:

| Type | Description | Required Fields |
| --- | --- | --- |
| `dataset` | Read items from an existing dataset | `id` - Dataset ID |
| `keyValueStore` | Read a record from a key-value store | `id` - Store ID, `key` - Record name |
| `actorRun` | Read output from a previous actor run | `id` - Run ID |
| `webhook` | Fetch data from an external URL | `id` - Webhook URL |

Example:

```json
{
  "sources": [
    { "type": "dataset", "id": "datasetId123" },
    { "type": "keyValueStore", "id": "storeId456", "key": "OUTPUT" },
    { "type": "actorRun", "id": "runId789" },
    { "type": "webhook", "id": "https://api.example.com/data" }
  ]
}
```
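The dispatch over these four source types can be sketched in Python. The fetcher callables below are stubs standing in for real API calls (names are illustrative, not the actor's code); only the `type`/`id`/`key` shape comes from the table above.

```python
def load_source(source, fetchers):
    """Route one source config to the matching fetcher callable."""
    kind = source["type"]
    if kind == "keyValueStore":
        # Key-value store sources additionally need the record name.
        return fetchers[kind](source["id"], source["key"])
    if kind in ("dataset", "actorRun", "webhook"):
        # For webhooks, "id" holds the URL to fetch.
        return fetchers[kind](source["id"])
    raise ValueError(f"Unknown source type: {kind}")

# Stub fetchers; a real implementation would call the Apify API or HTTP.
fetchers = {
    "dataset": lambda sid: [{"from": "dataset", "id": sid}],
    "keyValueStore": lambda sid, key: [{"from": "kvs", "id": sid, "key": key}],
    "actorRun": lambda sid: [{"from": "run", "id": sid}],
    "webhook": lambda url: [{"from": "webhook", "url": url}],
}

merged = []
for source in [
    {"type": "dataset", "id": "datasetId123"},
    {"type": "keyValueStore", "id": "storeId456", "key": "OUTPUT"},
]:
    merged.extend(load_source(source, fetchers))
```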

Actor Runs

Execute actors before merging their outputs. Each actor run configuration supports:

| Field | Type | Description |
| --- | --- | --- |
| `actorId` | string | Required. Actor ID or `username/actor-name` |
| `input` | object | Input object to pass to the actor |
| `outputType` | string | `"dataset"` (default) or `"keyValueStore"` |
| `outputKey` | string | Record name when using the `keyValueStore` output type |

Example:

```json
{
  "actorRuns": [
    {
      "actorId": "apify/web-scraper",
      "input": { "startUrls": [{ "url": "https://example.com" }] },
      "outputType": "dataset"
    },
    {
      "actorId": "apify/google-search-scraper",
      "input": { "queries": "apify" },
      "outputType": "keyValueStore",
      "outputKey": "OUTPUT"
    }
  ]
}
```

💡 Tip: Use `outputType: "dataset"` (default) when the actor pushes items to its dataset. Use `outputType: "keyValueStore"` when the actor saves data via `Actor.setValue()`.


Execution Mode

Controls how actors are executed:

| Mode | Description |
| --- | --- |
| `parallel` | Default. Runs all actors simultaneously for faster results |
| `series` | Runs actors one after another (useful when order matters or for rate limiting) |
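The difference between the two modes can be illustrated with a small asyncio sketch. Here `run_actor` is a hypothetical stand-in for starting an actor run and awaiting its output, not a real Apify call:

```python
import asyncio

# Hypothetical stand-in for an actor run; the sleep simulates run time.
async def run_actor(actor_id: str) -> str:
    await asyncio.sleep(0.01)
    return f"output of {actor_id}"

async def run_all(actor_ids, mode):
    if mode == "parallel":
        # All runs start at once; wall time is roughly the slowest run.
        return list(await asyncio.gather(*(run_actor(a) for a in actor_ids)))
    # "series": each run starts only after the previous one finishes.
    results = []
    for actor_id in actor_ids:
        results.append(await run_actor(actor_id))
    return results

outputs = asyncio.run(
    run_all(["apify/web-scraper", "apify/google-search-scraper"], "parallel")
)
```

Either way, results are collected in the order the actors were listed.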

Merge Strategy

Determines how data is combined:

| Strategy | Description |
| --- | --- |
| `append` | Default. Combines all items, keeping duplicates |
| `deduplicate` | Removes duplicate items based on `deduplicateBy` fields |

Deduplication example:

```json
{
  "mergeStrategy": "deduplicate",
  "deduplicateBy": ["url", "title"]
}
```
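A minimal sketch of how `deduplicateBy`-style merging could work (assuming field values are hashable and keeping the first item seen for each key; this is illustrative, not the actor's code):

```python
def deduplicate(items, deduplicate_by):
    """Keep the first item for each unique combination of the given fields."""
    seen = set()
    result = []
    for item in items:
        key = tuple(item.get(field) for field in deduplicate_by)
        if key not in seen:
            seen.add(key)
            result.append(item)
    return result

items = [
    {"url": "https://example.com", "title": "Home", "price": 10},
    {"url": "https://example.com", "title": "Home", "price": 12},  # same url+title
    {"url": "https://example.com/a", "title": "A", "price": 7},
]
merged = deduplicate(items, ["url", "title"])  # drops the second item
```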

Transformations

Apply transformations to each item before merging. Transformations run in the order provided.

| Type | Description | Parameters |
| --- | --- | --- |
| `filter` | Keep only items matching a condition | `field`, `operator`, `value` |
| `mapFields` | Copy data from one field path to another | `mapping`, `removeOriginal` |
| `pickFields` | Keep only specified fields | `fields`, `dropUndefined` |
| `setField` | Set a static value on a field | `field`, `value`, `overwrite` |

Filter operators: `equals`, `notEquals`, `contains`, `greaterThan`, `lessThan`, `exists`

Example:

```json
{
  "transformations": [
    {
      "type": "filter",
      "field": "price",
      "operator": "lessThan",
      "value": 50
    },
    {
      "type": "mapFields",
      "mapping": {
        "title": "product.name",
        "price": "product.price"
      },
      "removeOriginal": true
    },
    {
      "type": "pickFields",
      "fields": ["product.name", "product.price", "url"]
    },
    {
      "type": "setField",
      "field": "currency",
      "value": "USD",
      "overwrite": false
    }
  ]
}
```
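The transformation semantics described above can be sketched in Python. This is an illustrative reimplementation, not the actor's actual code, and it simplifies in two places: `removeOriginal` only removes top-level source fields, and `pickFields` keeps dotted paths as flat keys.

```python
OPERATORS = {
    "equals": lambda a, b: a == b,
    "notEquals": lambda a, b: a != b,
    "contains": lambda a, b: b in a if a is not None else False,
    "greaterThan": lambda a, b: a is not None and a > b,
    "lessThan": lambda a, b: a is not None and a < b,
    "exists": lambda a, b: a is not None,
}

def get_path(item, path):
    """Read a possibly dotted field path, returning None when missing."""
    cur = item
    for part in path.split("."):
        if not isinstance(cur, dict):
            return None
        cur = cur.get(part)
    return cur

def set_path(item, path, value):
    """Write a value at a dotted field path, creating nested dicts as needed."""
    parts = path.split(".")
    cur = item
    for part in parts[:-1]:
        cur = cur.setdefault(part, {})
    cur[parts[-1]] = value

def apply_transformations(items, transformations):
    for t in transformations:  # transformations run in the order provided
        if t["type"] == "filter":
            op = OPERATORS[t["operator"]]
            items = [i for i in items if op(get_path(i, t["field"]), t.get("value"))]
        elif t["type"] == "mapFields":
            for i in items:
                for src, dst in t["mapping"].items():
                    set_path(i, dst, get_path(i, src))
                    if t.get("removeOriginal"):
                        i.pop(src, None)  # simplification: top-level sources only
        elif t["type"] == "pickFields":
            # Simplification: dotted paths become flat keys in the result.
            items = [{f: get_path(i, f) for f in t["fields"]} for i in items]
        elif t["type"] == "setField":
            for i in items:
                if t.get("overwrite", True) or get_path(i, t["field"]) is None:
                    set_path(i, t["field"], t["value"])
    return items

items = [
    {"title": "Widget", "price": 30, "url": "https://example.com/widget"},
    {"title": "Gadget", "price": 80, "url": "https://example.com/gadget"},
]
out = apply_transformations(items, [
    {"type": "filter", "field": "price", "operator": "lessThan", "value": 50},
    {"type": "mapFields", "mapping": {"title": "product.name"}, "removeOriginal": True},
    {"type": "setField", "field": "currency", "value": "USD", "overwrite": False},
])
```

Because the steps run in order, filtering first keeps later transformations from touching records that will be dropped anyway.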

📦 Complete Example

This example runs two actors in parallel, merges their outputs with an existing dataset, and deduplicates by URL:

```json
{
  "actorRuns": [
    {
      "actorId": "apify/web-scraper",
      "input": {
        "startUrls": [{ "url": "https://apify.com/store" }],
        "pageFunction": "async function pageFunction(context) { return context.request; }"
      }
    },
    {
      "actorId": "apify/google-search-scraper",
      "input": { "queries": "web scraping" }
    }
  ],
  "sources": [
    { "type": "dataset", "id": "existingDatasetId" }
  ],
  "executionMode": "parallel",
  "mergeStrategy": "deduplicate",
  "deduplicateBy": ["url"]
}
```

📤 Output

All merged records are saved to the actor's default dataset. After each run, the logs display:

  1. Console link - Direct link to view the dataset in Apify Console
  2. Download URLs - Ready-to-use links for JSON, CSV, and XLSX exports

You can also export the dataset manually from the Apify Console in any supported format.
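The logged download links presumably point at Apify's dataset-items endpoint, which accepts a `format` query parameter. A small sketch of building such links (the dataset ID is a placeholder):

```python
API_BASE = "https://api.apify.com/v2"

def dataset_download_urls(dataset_id: str) -> dict:
    # Apify serves dataset items at /v2/datasets/{id}/items; the `format`
    # parameter selects the export format (json, csv, xlsx, and others).
    return {
        fmt: f"{API_BASE}/datasets/{dataset_id}/items?format={fmt}"
        for fmt in ("json", "csv", "xlsx")
    }

urls = dataset_download_urls("myDatasetId")
```

Such URLs work for any dataset you can access, not just this actor's output.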


💡 Use Cases

1. Merge Multiple Scraping Runs

Run the same scraper with different inputs and merge all results:

```json
{
  "actorRuns": [
    { "actorId": "my-scraper", "input": { "category": "electronics" } },
    { "actorId": "my-scraper", "input": { "category": "books" } },
    { "actorId": "my-scraper", "input": { "category": "clothing" } }
  ],
  "executionMode": "parallel",
  "mergeStrategy": "append"
}
```

2. Combine Historical Data

Merge data from multiple previous actor runs:

```json
{
  "sources": [
    { "type": "actorRun", "id": "run1" },
    { "type": "actorRun", "id": "run2" },
    { "type": "actorRun", "id": "run3" }
  ],
  "mergeStrategy": "deduplicate",
  "deduplicateBy": ["id"]
}
```

3. Aggregate Multiple Datasets

Combine existing datasets into one unified dataset:

```json
{
  "sources": [
    { "type": "dataset", "id": "dataset1" },
    { "type": "dataset", "id": "dataset2" },
    { "type": "dataset", "id": "dataset3" }
  ]
}
```

📄 License

This project is licensed under the ISC License.