
LLM Dataset Processor


dusan.vystrcil/llm-dataset-processor

Allows you to process the output of other Actors, or a stored dataset, with a single LLM prompt. It's useful when you need to enrich data, summarize content, extract specific information, or manipulate data in a structured way using AI.
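For example, a typical run input can be assembled from the fields documented below. This is an illustrative sketch (the dataset ID and API key are placeholders), and the apify-client call at the end is shown commented out so the snippet stands alone:

```python
# Example run input for the LLM Dataset Processor, built from the
# fields documented below. IDs and keys are placeholders.
run_input = {
    "inputDatasetId": "YOUR_DATASET_ID",      # dataset to process
    "model": "gpt-4o-mini",                   # cost-effective default
    "llmProviderApiKey": "YOUR_OPENAI_KEY",   # key for the chosen provider
    "temperature": "0.1",                     # near-deterministic output
    "multipleColumns": False,                 # single llmresponse column
    "prompt": "Summarize this article: ${text}",
    "skipItemIfEmpty": True,
    "maxTokens": 300,
    "testPrompt": True,                       # validate on a few items first
    "testItemsCount": 3,
}

# With the apify-client package installed, the Actor could then be run like:
# from apify_client import ApifyClient
# client = ApifyClient("YOUR_APIFY_TOKEN")
# run = client.actor("dusan.vystrcil/llm-dataset-processor").call(run_input=run_input)
```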

Input Dataset ID

inputDatasetId (string, optional)

The ID of the dataset to process.

Large Language Model

model (enum, required)

The LLM to use for processing. Each model has different capabilities and pricing. GPT-4o-mini and Claude 3.5 Haiku are recommended for cost-effective processing, while models like Claude 3 Opus or GPT-4o offer higher quality but at a higher cost.

Value options:

  • "gpt-4o-mini"

  • "gpt-4o"

  • "claude-3-5-haiku-latest"

  • "claude-3-5-sonnet-latest"

  • "claude-3-opus-latest"

  • "gemini-1.5-flash"

  • "gemini-1.5-flash-8b"

  • "gemini-1.5-pro"

LLM Provider API Key

llmProviderApiKey (string, required)

Your API key for the LLM Provider (e.g., OpenAI).

Temperature

temperature (string, required)

Sampling temperature for the LLM API (controls randomness). We recommend a value closer to 0 for precise, deterministic results, and a value closer to 1 for more 'creative' output.

Default value of this property is "0.1"

Multiple columns in output

multipleColumns (boolean, optional)

When enabled, instructs the LLM to return responses as JSON objects, creating multiple columns in the output dataset. The columns need to be named and described in the prompt. If disabled, responses are stored in a single llmresponse column.

Default value of this property is false
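Conceptually, with multiple columns enabled the model is asked to answer with a JSON object whose keys become columns in the output dataset. A sketch of how such a response maps to an output row (illustrative only, not the Actor's actual code; the column names are hypothetical):

```python
import json

# A well-formed model response when multipleColumns is enabled:
llm_response = '{"sentiment": "positive", "summary": "Short recap."}'

try:
    # Each top-level key becomes its own column in the output dataset.
    row = json.loads(llm_response)
except json.JSONDecodeError:
    # Fallback comparable to multipleColumns=false: one llmresponse column.
    row = {"llmresponse": llm_response}
```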

Prompt Template

prompt (string, required)

The prompt template to use for processing. You can use ${fieldName} placeholders to reference fields from the input dataset.
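The ${fieldName} syntax happens to match Python's string.Template placeholders, so the substitution step can be sketched like this (my illustration, not the Actor's actual implementation):

```python
from string import Template

# Each dataset item's fields are substituted into the prompt template.
template = Template("Summarize the article titled ${title}:\n\n${text}")
item = {"title": "Example Article", "text": "Full article text here..."}

# safe_substitute leaves any unknown ${...} placeholders untouched
# instead of raising an error.
prompt = template.safe_substitute(item)
```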

Skip item if one or more ${fields} are empty

skipItemIfEmpty (boolean, optional)

When enabled, items will be skipped if any ${field} referenced in the prompt is empty, null, undefined, or contains only whitespace. This helps prevent processing incomplete data.

Default value of this property is true
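The skip condition described above can be sketched as follows (a hypothetical helper, not the Actor's source):

```python
import re

def should_skip(prompt_template: str, item: dict) -> bool:
    """Skip an item if any ${field} referenced in the prompt is
    missing, None, or contains only whitespace."""
    fields = re.findall(r"\$\{(\w+)\}", prompt_template)
    for name in fields:
        value = item.get(name)
        if value is None or (isinstance(value, str) and not value.strip()):
            return True
    return False
```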

Max Tokens

maxTokens (integer, required)

Maximum number of tokens in the LLM API response for each item.

Default value of this property is 300

Test Prompt Mode

testPrompt (boolean, optional)

Test mode that processes only a limited number of items (defined by testItemsCount). Use this to validate your prompt and configuration before running on the full dataset. We highly recommend enabling this option first, since LLM responses can be unpredictable.

Default value of this property is true

Test Items Count

testItemsCount (integer, optional)

Number of items to process when Test Prompt Mode is enabled.

Default value of this property is 3

Developer
Maintained by Community

Actor Metrics

  • 0 monthly users

  • 1 star

  • >99% runs succeeded

  • Created in Dec 2024

  • Modified a day ago
