SmartData Executor
Under maintenance

Pricing

from $0.01 / 1,000 processed rows

Run structured data processing on CSV or JSON files. Clean, filter, aggregate, and transform datasets using simple parameters. Designed for analysts, automation workflows, and ETL pipelines. Outputs results as Apify Datasets with execution metadata.

Developer

Am Af

Maintained by Community

8 days ago

Last modified

Run fast, controlled Python data processing jobs on large datasets without managing infrastructure.

SmartData Executor is an Apify Actor that performs common data operations—cleaning, filtering, aggregation, and transformation—on CSV or JSON datasets using Pandas. It is designed for reliability, security, and predictable costs.


🚀 Features

  • Supports CSV and JSON inputs
  • Powered by Pandas for high-performance data manipulation
  • Controlled execution environment; no arbitrary code execution
  • Handles large datasets efficiently
  • Outputs are stored in an Apify Key-Value Store and Dataset
  • Integrates seamlessly with Apify workflows

🔧 Supported Operations

1️⃣ clean

  • Removes empty rows
  • Removes duplicate rows
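
As a rough illustration of what the clean operation describes, the Pandas sketch below drops empty rows and then duplicate rows. It assumes that "empty" means every column is missing; the sample data is hypothetical and the Actor's internal logic may differ.

```python
import pandas as pd

# Hypothetical sample data: one fully empty row and one exact duplicate.
df = pd.DataFrame(
    {
        "name": ["Ana", "Ben", None, "Ana"],
        "score": [10.0, 7.5, None, 10.0],
    }
)

# Assumption: an "empty row" is a row where every column is missing.
# Drop those rows first, then drop exact duplicates.
cleaned = (
    df.dropna(how="all")
    .drop_duplicates()
    .reset_index(drop=True)
)

print(cleaned)
```

Rows with only some values missing are kept in this sketch; pass how="any" to dropna to remove those as well.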

2️⃣ filter

  • Filters rows by column values
  • Supports multiple filters simultaneously
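
One way to picture several filters applied at once is a boolean mask that every filter narrows. The sketch below assumes exact-equality matching combined with AND; the helper name apply_filters and the sample data are hypothetical, not part of the Actor.

```python
import pandas as pd


def apply_filters(df: pd.DataFrame, filters: dict) -> pd.DataFrame:
    """Keep rows where every listed column equals its target value."""
    # Start with all rows selected, then narrow the mask per filter.
    mask = pd.Series(True, index=df.index)
    for column, value in filters.items():
        mask &= df[column] == value
    return df[mask]


# Hypothetical sample data.
df = pd.DataFrame(
    {
        "country": ["US", "US", "DE"],
        "status": ["active", "inactive", "active"],
        "amount": [120, 80, 200],
    }
)

# Same shape as the "filters" parameter: {"ColumnName": "Value"}.
filtered = apply_filters(df, {"country": "US", "status": "active"})
print(filtered)
```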

3️⃣ aggregate

  • Supported aggregation functions:
    • sum
    • mean
    • count
    • min
    • max

4️⃣ groupBy

  • Group by one or more columns
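
The aggregate and groupBy settings map naturally onto Pandas groupby and agg. The sketch below assumes the parameter shapes from the input format (a list of group columns and a column-to-function mapping); the sample data is hypothetical and the Actor may combine the two differently.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame(
    {
        "region": ["north", "north", "south"],
        "sales": [100, 150, 200],
        "units": [1, 3, 2],
    }
)

# Shapes mirror the "groupBy" and "aggregations" parameters.
group_by = ["region"]
aggregations = {"sales": "sum", "units": "mean"}

# as_index=False keeps the group columns as regular columns in the result.
result = df.groupby(group_by, as_index=False).agg(aggregations)
print(result)
```

Without any group columns, df.agg(aggregations) would reduce the whole table to a single value per column instead.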

5️⃣ renameColumns

  • Rename columns

6️⃣ computedColumns

  • Create computed columns using safe Pandas expressions
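
Renaming and computed columns can be sketched with DataFrame.rename and DataFrame.eval. This is only an illustration: the order of the two steps, the use of eval, and the sample data are assumptions, and nothing here describes how the Actor restricts expressions to "safe" ones.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"qty": [2, 5], "unit_price": [3.0, 1.5]})

# Shapes mirror the "renameColumns" and "computedColumns" parameters.
rename_columns = {"qty": "quantity"}
computed_columns = {"total": "quantity * unit_price"}

# Assumption: renames are applied first, so expressions use the new names.
out = df.rename(columns=rename_columns)
for new_column, expression in computed_columns.items():
    # DataFrame.eval evaluates a column expression given as a string.
    out[new_column] = out.eval(expression)

print(out)
```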

📥 Input Format

The input JSON must include the operation type, input format, and data; params is optional:

{
  "operation": "clean | filter | aggregate | transform",
  "inputFormat": "csv | json",
  "data": "URL or raw CSV/JSON string",
  "params": {
    "filters": { "ColumnName": "Value" },
    "groupBy": ["Column1", "Column2"],
    "aggregations": { "Column1": "sum", "Column2": "mean" },
    "renameColumns": { "oldName": "newName" },
    "computedColumns": { "NewColumn": "Expression" }
  }
}
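
To make the template above concrete, the sketch below builds one input for a filter run and shows how a raw string in data could be parsed according to inputFormat. The helper load_frame and the sample values are hypothetical; it handles raw strings only (not URLs) and assumes JSON input is an array of objects.

```python
import io
import json

import pandas as pd


def load_frame(actor_input: dict) -> pd.DataFrame:
    """Parse raw CSV or JSON text from the "data" field into a DataFrame."""
    fmt = actor_input["inputFormat"]
    raw = actor_input["data"]
    if fmt == "csv":
        return pd.read_csv(io.StringIO(raw))
    if fmt == "json":
        # Assumption: the JSON text is an array of flat objects.
        return pd.DataFrame(json.loads(raw))
    raise ValueError(f"Unsupported inputFormat: {fmt!r}")


# A filled-in input with hypothetical values.
actor_input = {
    "operation": "filter",
    "inputFormat": "csv",
    "data": "city,temp\nOslo,4\nCairo,31\n",
    "params": {"filters": {"city": "Oslo"}},
}

df = load_frame(actor_input)
print(json.dumps(actor_input, indent=2))
print(df)
```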