LLM Dataset Processor

dusan.vystrcil/llm-dataset-processor

This Actor is under maintenance and may be unreliable.
Allows you to process a whole dataset with a single LLM prompt. It's useful if you need to enrich data, summarize content, extract specific information, or manipulate data in a structured way using AI.

Input Dataset ID

inputDatasetId (string, Required)

The ID of the dataset to process.

Large Language Model

model (enum, Required)

The LLM to use for processing. Each model has different capabilities and pricing. GPT-4o-mini and Claude 3.5 Haiku are recommended for cost-effective processing, while models like Claude 3 Opus or GPT-4o offer higher quality but at a higher cost.

Value options:

  • "gpt-4o-mini"
  • "gpt-4o"
  • "claude-3-5-haiku-latest"
  • "claude-3-5-sonnet-latest"
  • "claude-3-opus-latest"
  • "gemini-1.5-flash"
  • "gemini-1.5-flash-8b"
  • "gemini-1.5-pro"

LLM Provider API Token

llmApiToken (string, Required)

Your API token for the LLM Provider (e.g., OpenAI).

Temperature

temperature (string, Required)

Sampling temperature for the LLM API (controls randomness). We recommend a value closer to 0 for exact results and a value closer to 1 for more 'creative' results.

Default value of this property is "0.1"

Multiple columns in output

multipleColumns (boolean, Optional)

When enabled, instructs the LLM to return responses as JSON objects, creating multiple columns in the output dataset. The columns need to be named and described in the prompt. If disabled, responses are stored in a single llmresponse column.

Default value of this property is false
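The difference between the two modes can be sketched as follows. This is an illustrative sketch of the mapping described above, not the Actor's actual code; the function name is hypothetical.

```python
import json

def to_output_record(raw_response: str, multiple_columns: bool) -> dict:
    """Map one LLM response to an output dataset record (illustrative sketch)."""
    if not multiple_columns:
        # Single-column mode: the raw text lands in one llmresponse field.
        return {"llmresponse": raw_response}
    # Multiple-columns mode: the LLM is instructed to return a JSON object,
    # whose keys become the columns of the output dataset.
    return json.loads(raw_response)

print(to_output_record('{"sentiment": "positive", "summary": "Great product"}', True))
print(to_output_record("A short plain-text answer.", False))
```

In multiple-columns mode the prompt must describe each expected key, otherwise the model may not return parseable JSON.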

Prompt

prompt (string, Required)

The prompt template to send to the LLM API.

Use {{fieldName}} placeholders to insert values from the input dataset (e.g., Summarize this text: {{content.text}}). For multiple-columns output, ensure your prompt names and describes each desired output column.

See README for more details.
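Placeholder substitution, including dotted paths like content.text, can be sketched as below. This is an assumption about the behavior described above (the Actor resolves placeholders internally); render_prompt is a hypothetical helper.

```python
import re

def render_prompt(template: str, item: dict) -> str:
    """Replace {{fieldName}} placeholders with values from a dataset item.
    Supports dotted paths such as content.text (illustrative sketch)."""
    def resolve(match: re.Match) -> str:
        value = item
        for part in match.group(1).strip().split("."):
            # Walk one level deeper per path segment; missing keys become "".
            value = value.get(part, "") if isinstance(value, dict) else ""
        return str(value)
    return re.sub(r"\{\{([^}]+)\}\}", resolve, template)

item = {"content": {"text": "Apify makes scraping easy."}}
print(render_prompt("Summarize this text: {{content.text}}", item))
# → Summarize this text: Apify makes scraping easy.
```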

Skip item if one or more {{field}} are empty

skipItemIfEmpty (boolean, Optional)

When enabled, items will be skipped if any {{field}} referenced in the prompt is empty, null, undefined, or contains only whitespace. This helps prevent processing incomplete data.

Default value of this property is true
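The skip condition described above can be sketched as a predicate over the prompt template and one dataset item. This is a sketch under the stated rules (empty, null/None, missing, or whitespace-only), not the Actor's code; should_skip is a hypothetical name.

```python
import re

def should_skip(template: str, item: dict) -> bool:
    """Return True if any {{field}} referenced in the prompt is missing,
    None, or whitespace-only (mirrors the skipItemIfEmpty rule, as a sketch)."""
    for path in re.findall(r"\{\{([^}]+)\}\}", template):
        value = item
        for part in path.strip().split("."):
            value = value.get(part) if isinstance(value, dict) else None
        if value is None or (isinstance(value, str) and not value.strip()):
            return True  # at least one referenced field is effectively empty
    return False

print(should_skip("Summarize: {{content.text}}", {"content": {"text": "   "}}))  # → True
```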

Max Tokens

maxTokens (integer, Required)

Maximum number of tokens in the LLM API response.

Default value of this property is 150

Test Prompt Mode

testPrompt (boolean, Optional)

Test mode that processes only a limited number of items (defined by testItemsCount). Use this to validate your prompt and configuration before running on the full dataset. We highly recommend enabling this option first, since LLM responses can be ambiguous.

Default value of this property is true

Test Items Count

testItemsCount (integer, Optional)

Number of items to process when Test Prompt Mode is enabled.

Default value of this property is 3
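Putting the fields together, an example run input might look like the following. All values are illustrative placeholders, and the apify-client call at the end is a hedged sketch of how such an input could be submitted, not a tested invocation.

```python
# Example run input assembled from the fields documented above.
run_input = {
    "inputDatasetId": "YOUR_DATASET_ID",    # placeholder, not a real dataset ID
    "model": "gpt-4o-mini",                 # cost-effective default
    "llmApiToken": "YOUR_PROVIDER_API_KEY", # placeholder
    "temperature": "0.1",
    "multipleColumns": False,
    "prompt": "Summarize this text: {{content.text}}",
    "skipItemIfEmpty": True,
    "maxTokens": 150,
    "testPrompt": True,                     # validate on a few items first
    "testItemsCount": 3,
}

# With the apify-client package, this input could be passed to the Actor, e.g.:
# from apify_client import ApifyClient
# ApifyClient("APIFY_TOKEN").actor("dusan.vystrcil/llm-dataset-processor").call(run_input=run_input)
```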

Developer
Maintained by Community

Actor Metrics

  • 0 monthly users

  • No stars yet

  • >99% runs succeeded

  • Created in Dec 2024

  • Modified 2 days ago
