Dataset2GPT

2 hours trial then $7.95/month - No credit card required now

flamboyant_leaf/dataset2gpt

Dataset2GPT reads any dataset on Apify and uses GPT-based AI to perform a variety of text-processing tasks—such as summarization, analysis, classification, or transformation. Whether you want a comprehensive 2,000-word synopsis, key insights, sentiment analysis, or custom NLP transformations, Dataset2GPT can handle it.

For a complete walkthrough of using Dataset2GPT to analyse Reddit comments in bulk, see our tutorial on the GPT-powered summary actor.

Features

  • GPT-Powered Processing using GPT-4o-mini for best price/performance ratio
  • Flexible Output Length: Control how long or detailed the result should be (100–2,000 words or more)
  • Task Focus: Provide a specific topic or angle for the AI to focus on (e.g., “marketing analysis,” “customer feedback,” “sentiment extraction”)
  • Token Usage & Cost Tracking: Monitor how many tokens the AI uses and the associated cost
  • PDF Export with Markdown formatting for easy sharing
  • Optional Email Delivery of processed results

How It Works

  1. Input: Your Apify actor collects or generates data in a dataset.
  2. Processing: Dataset2GPT reads the dataset content as plain text, then uses GPT to perform your desired task.
  3. AI Output: The AI generates the requested output (e.g., summary, analysis, transformation).
  4. Delivery: Retrieve the output in multiple formats (dataset record, PDF, email).
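
Steps 2 and 3 can be pictured with a minimal Python sketch. It is not the actor's actual code: the helper names `items_to_plain_text` and `build_prompt`, the one-JSON-object-per-line flattening, and the prompt wording are all assumptions made for illustration.

```python
import json


def items_to_plain_text(items):
    """Flatten dataset items into plain text, one JSON object per line (assumed format)."""
    return "\n".join(
        json.dumps(item, ensure_ascii=False, sort_keys=True) for item in items
    )


def build_prompt(text, summary_length=1000, focus_topic=None):
    """Assemble a hypothetical instruction prompt from the documented input parameters."""
    instruction = f"Process the following data and write about {summary_length} words."
    if focus_topic:
        instruction += f" Focus on: {focus_topic}."
    return f"{instruction}\n\n{text}"


# Two made-up dataset items standing in for scraped comments.
items = [
    {"author": "alice", "comment": "Great product"},
    {"author": "bob", "comment": "Shipping was slow"},
]
prompt = build_prompt(
    items_to_plain_text(items), summary_length=200, focus_topic="customer feedback"
)
print(prompt)
```

The resulting string is what would be sent to the GPT model in a setup like this one; the real actor may chunk, truncate, or format the data differently.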

Benefits

  • Rapid Insights: Quickly extract meaning or structure from large datasets.
  • Plain Text Processing: No complex parsing required—just feed text.
  • Seamless Integration: Works with any Apify actor that outputs to a dataset.
  • Automation: Set up once and let Dataset2GPT handle the heavy lifting.

Perfect For

  • Market Researchers
  • Social Media Managers
  • Data Scientists
  • Business Intelligence Teams
  • Content Analysts
  • Anyone working with scraped or collected data

Input Parameters

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| datasetId | string | No | - | Dataset ID for testing. When used as an integration, dataset ID is handled automatically |
| openaiApiKey | string | Yes | - | Your OpenAI API key for GPT-based processing |
| summaryLength | integer | No | 1000 | Desired length of output in words (100–2000). You can also use it for controlling output detail |
| focusTopic | string | No | - | Specific topic or angle to guide the AI’s processing (e.g., “sentiment analysis” or “key points”) |
| targetEmail | string | No | - | Email address to send the results to (requires EMAIL_SUPPORT=true) |
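
As a rough illustration of the parameters above, the following Python sketch builds an input object and checks the documented constraints before a run. The helper `build_input` is hypothetical, and the strict 100–2000 range check simply mirrors the table; how the actor itself treats out-of-range values is not documented here.

```python
def build_input(openai_api_key, dataset_id=None, summary_length=1000,
                focus_topic=None, target_email=None):
    """Build a Dataset2GPT input dict, checking the documented constraints."""
    if not openai_api_key:
        raise ValueError("openaiApiKey is required")
    if not 100 <= summary_length <= 2000:
        raise ValueError("summaryLength must be between 100 and 2000 words")

    run_input = {"openaiApiKey": openai_api_key, "summaryLength": summary_length}
    # Optional fields are only included when a value was provided.
    optional = {
        "datasetId": dataset_id,
        "focusTopic": focus_topic,
        "targetEmail": target_email,
    }
    run_input.update({key: value for key, value in optional.items() if value is not None})
    return run_input


print(build_input("your-openai-api-key", focus_topic="sentiment analysis"))
```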

Output Formats

  1. Dataset Record

    {
        "output": "The AI-generated text (e.g., summary, analysis, transformations)...",
        "tokenUsage": {
            "promptTokens": 1234,
            "completionTokens": 567,
            "totalTokens": 1801
        },
        "costs": {
            "promptCost": 0.001234,
            "completionCost": 0.000567,
            "totalCost": 0.001801
        }
    }
    • The output field contains the GPT-processed text.
    • Token usage and cost breakdown are provided for transparency.
  2. PDF Output

    • Stored in a key-value store under the OUTPUT key.
    • Includes Markdown formatting for better readability.
    • Downloadable via the Apify console or API.
  3. Email Delivery (Optional)

    • Sends the GPT output in HTML and plain text formats.
    • Requires EMAIL_SUPPORT=true and a valid targetEmail.
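
A downstream script might sanity-check a dataset record before using it. The Python sketch below parses the sample record from format 1 and verifies that the totals equal the sum of their parts. That relationship holds for the sample values, but treating it as a general guarantee is an assumption, and `check_record` is a hypothetical helper.

```python
import json
import math

# The sample record from the "Dataset Record" format, with a shortened output text.
record = json.loads("""
{
    "output": "The AI-generated text...",
    "tokenUsage": {"promptTokens": 1234, "completionTokens": 567, "totalTokens": 1801},
    "costs": {"promptCost": 0.001234, "completionCost": 0.000567, "totalCost": 0.001801}
}
""")


def check_record(rec):
    """Return True if token and cost totals match the sum of their components."""
    usage, costs = rec["tokenUsage"], rec["costs"]
    tokens_ok = usage["promptTokens"] + usage["completionTokens"] == usage["totalTokens"]
    # Costs are floats, so compare with a tolerance rather than exact equality.
    costs_ok = math.isclose(
        costs["promptCost"] + costs["completionCost"], costs["totalCost"], abs_tol=1e-9
    )
    return tokens_ok and costs_ok


print(check_record(record))
```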

Getting Started

  1. Add Dataset2GPT to Your Apify Actor: For example, if you have an actor scraping Reddit or Twitter, attach Dataset2GPT as an integration step.
  2. Use the Default Dataset: In your input configuration, set "datasetId": "{{resource.defaultDatasetId}}" to automatically consume the scraped data.
  3. Run Your Apify Actor: Once it finishes, Dataset2GPT will read the dataset and produce the requested AI output.
  4. Retrieve Your Results:
    • Download from the dataset.
    • Grab the PDF from the key-value store.
    • Or check your inbox if you enabled email delivery.
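
For retrieval over the API, the Python sketch below only builds the download URLs and makes no network calls. The paths are assumed to follow the Apify API v2 layout for dataset items and key-value store records, so verify them against the Apify API reference. The helper names and the IDs `abc123` and `store456` are placeholders, and authentication (an API token) is deliberately left out.

```python
from urllib.parse import quote, urlencode

API_BASE = "https://api.apify.com/v2"  # assumed base URL of the Apify API


def dataset_items_url(dataset_id, fmt="json"):
    """URL for downloading the items of a dataset in the given format."""
    query = urlencode({"format": fmt})
    return f"{API_BASE}/datasets/{quote(dataset_id, safe='')}/items?{query}"


def pdf_record_url(store_id, key="OUTPUT"):
    """URL for the PDF that Dataset2GPT stores under the OUTPUT key."""
    return f"{API_BASE}/key-value-stores/{quote(store_id, safe='')}/records/{quote(key, safe='')}"


print(dataset_items_url("abc123"))
print(pdf_record_url("store456"))
```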

Usage Examples

{
    "openaiApiKey": "your-openai-api-key",
    "summaryLength": 1500,
    "focusTopic": "product feedback analysis"
}

Here, Dataset2GPT will generate a ~1500-word analysis focusing on product feedback.

For Testing

{
    "datasetId": "your-dataset-id",
    "openaiApiKey": "your-openai-api-key",
    "summaryLength": 1000
}

Useful for local testing: specify a known datasetId instead of relying on the integration to supply it.

With Email Delivery

{
    "openaiApiKey": "your-openai-api-key",
    "targetEmail": "user@example.com",
    "focusTopic": "sentiment extraction"
}

Generates text focusing on sentiments and sends it to user@example.com.

Environment Variables

  • EMAIL_SUPPORT: Enable/disable email functionality (default: true)
  • MAX_PARALLEL_REQUESTS: Controls parallel API requests (default: 10)
  • PROMPT_TOKEN_COST: Cost per 1M prompt tokens (default: 0.150)
  • COMPLETION_TOKEN_COST: Cost per 1M completion tokens (default: 0.075)
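
The two cost variables are expressed per 1M tokens, so a cost figure is presumably the token count times the rate, divided by one million. The Python sketch below shows that arithmetic using the documented defaults. `compute_costs` is a hypothetical helper, and the formula is an assumption about how the actor derives its cost fields.

```python
import os


def compute_costs(prompt_tokens, completion_tokens, env=os.environ):
    """Estimate costs from token counts using per-1M-token rates from the environment."""
    prompt_rate = float(env.get("PROMPT_TOKEN_COST", "0.150"))
    completion_rate = float(env.get("COMPLETION_TOKEN_COST", "0.075"))
    prompt_cost = prompt_tokens * prompt_rate / 1_000_000
    completion_cost = completion_tokens * completion_rate / 1_000_000
    return {
        "promptCost": prompt_cost,
        "completionCost": completion_cost,
        "totalCost": prompt_cost + completion_cost,
    }


# Passing an empty mapping forces the documented default rates.
costs = compute_costs(1_000_000, 1_000_000, env={})
print(costs)
```

Overriding either variable changes the reported figures only; it does not affect what OpenAI actually bills.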

With Dataset2GPT, turn any dataset into meaningful insights, summaries, transformations, or advanced NLP tasks—powered by GPT.

Developer
Maintained by Community

  • Created in Jan 2025
