Data Cleaning Actor
Pricing
$15.00 / 1,000 results

This actor automatically cleans, analyzes, and summarizes spreadsheet data. It handles different file types (CSV, XLSX), fixes missing values, detects outliers, generates charts, computes correlations, and returns a cleaned dataset along with downloadable files and visual insights.
Developer: Mitchell Wanjiru
Maintained by Community
data_cleaning_actor
A Python-based Apify Actor for automatic data cleaning, summarization, and visualization of CSV or Excel files.
Features
- Upload CSV or Excel files
- Clean data (remove missing values, standardize column types)
- Compute summary statistics (mean, median, missing values)
- Generate correlation matrix and plots (histogram, heatmap)
- Output a cleaned CSV, a JSON summary, and an optional plot image
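The summary statistics listed above (mean, median, missing values) can be sketched with the standard library alone. This is an illustration, not the Actor's own code: the helper name `summarize_column`, the sample data, and the rule that empty or non-numeric cells count as missing are all assumptions.

```python
import csv
import io
import math
import statistics


def summarize_column(rows, column):
    """Return mean, median, and missing-value count for one column.

    Assumption: empty, non-numeric, and NaN cells all count as missing.
    """
    values = []
    missing = 0
    for row in rows:
        cell = (row.get(column) or "").strip()
        try:
            number = float(cell)
        except ValueError:
            missing += 1
            continue
        if math.isnan(number):
            missing += 1
        else:
            values.append(number)
    return {
        "mean": statistics.mean(values) if values else None,
        "median": statistics.median(values) if values else None,
        "missing": missing,
    }


# Hypothetical sample data; the second row has no price.
sample = "item,price\na,10\nb,\nc,20\nd,60\n"
rows = list(csv.DictReader(io.StringIO(sample)))
print(summarize_column(rows, "price"))
# {'mean': 30.0, 'median': 20.0, 'missing': 1}
```

Returning `None` for an all-missing column keeps the result serializable as JSON `null`.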
Usage
Inputs
- dataFile: Path to the CSV or Excel file to process
- columnsToAnalyze: (optional) List of columns to include in the analysis
- summaryType: "stats", "correlations", or "charts"
Example key_value_stores/default/INPUT.json:
{
  "dataFile": "sample.csv",
  "columnsToAnalyze": [],
  "summaryType": "stats"
}
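As a rough sketch of how such an input could be checked before processing, the snippet below validates the three fields described above. The helper name `parse_input`, the error messages, and the fallback to "stats" when summaryType is absent are assumptions made for illustration; the Actor's real validation may differ.

```python
import json

# The three summary types named in the Inputs section.
ALLOWED_SUMMARY_TYPES = {"stats", "correlations", "charts"}


def parse_input(raw_json):
    """Parse and validate an INPUT.json payload (hypothetical helper)."""
    data = json.loads(raw_json)
    if not isinstance(data, dict):
        raise ValueError("input must be a JSON object")

    data_file = data.get("dataFile")
    if not isinstance(data_file, str) or not data_file:
        raise ValueError("dataFile must be a non-empty string")

    # columnsToAnalyze is optional; a missing value is treated as an empty list.
    columns = data.get("columnsToAnalyze") or []
    if not isinstance(columns, list):
        raise ValueError("columnsToAnalyze must be a list of column names")

    # Assumption: default to "stats" when summaryType is not given.
    summary_type = data.get("summaryType", "stats")
    if summary_type not in ALLOWED_SUMMARY_TYPES:
        raise ValueError(f"summaryType must be one of {sorted(ALLOWED_SUMMARY_TYPES)}")

    return {
        "dataFile": data_file,
        "columnsToAnalyze": columns,
        "summaryType": summary_type,
    }


example = '{"dataFile": "sample.csv", "columnsToAnalyze": [], "summaryType": "stats"}'
print(parse_input(example))
```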
Outputs
- key_value_stores/default/cleaned.csv: Cleaned data file
- key_value_stores/default/summary.json: JSON summary (mean, median, missing values, correlations)
- key_value_stores/default/plot.png: Plot image (if requested)
Running Locally
- Install Python dependencies:
pip install -r requirements.txt
- Prepare your input file and INPUT.json as above.
- Run the Actor:
python src/main.py
- Check outputs in key_value_stores/default/
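For local runs, the input file can also be generated from a script rather than written by hand. The sketch below is one way to do that; the helper name `write_input` is hypothetical, and the base directory that contains key_value_stores/ is an assumption that depends on your project layout.

```python
import json
import tempfile
from pathlib import Path


def write_input(base_dir, payload):
    """Write payload to <base_dir>/key_value_stores/default/INPUT.json."""
    store = Path(base_dir) / "key_value_stores" / "default"
    store.mkdir(parents=True, exist_ok=True)
    input_path = store / "INPUT.json"
    input_path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return input_path


# Demonstrated in a temporary directory so no real project files are touched.
with tempfile.TemporaryDirectory() as tmp:
    path = write_input(
        tmp,
        {"dataFile": "sample.csv", "columnsToAnalyze": [], "summaryType": "stats"},
    )
    print(path.relative_to(tmp).as_posix())
    # key_value_stores/default/INPUT.json
```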
Publishing to Apify
- Ensure all required files are present:
.actor/actor.json, input_schema.json, output_schema.json, dataset_schema.json, src/main.py, requirements.txt, Dockerfile, AGENTS.md, README.md
- Test with real data and input options.
- Log in and push:
apify login
apify push
Notes
- Only numeric columns are used for correlations and plots.
- All NaN values in JSON output are converted to null for compatibility.
- For questions or issues, see AGENTS.md for Apify Actor guidance.
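The NaN-to-null conversion mentioned above matters because Python's `json.dumps` emits a bare `NaN` token by default, which strict JSON parsers reject. A minimal sketch of one way to do the conversion follows; the helper name `nan_to_none` and the sample summary are hypothetical, not taken from the Actor's source.

```python
import json
import math


def nan_to_none(value):
    """Recursively replace float NaN with None so it serializes as null."""
    if isinstance(value, float) and math.isnan(value):
        return None
    if isinstance(value, dict):
        return {key: nan_to_none(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [nan_to_none(item) for item in value]
    return value


# Hypothetical summary containing NaN, e.g. from a column with no numeric data.
summary = {
    "mean": {"price": 30.0, "notes": float("nan")},
    "missing": [0, float("nan")],
}

# allow_nan=False makes json.dumps raise if any NaN or infinity slipped through.
text = json.dumps(nan_to_none(summary), allow_nan=False)
print(text)
# {"mean": {"price": 30.0, "notes": null}, "missing": [0, null]}
```

Note that this sketch leaves infinite values untouched, so `allow_nan=False` would still raise a `ValueError` for them.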


