Scraped Data Cleaner & Converter (No-Code CSV/JSON Tool) - PPE

Developed by M3Web

Maintained by Community

Clean and organize scraped .json or .csv data — no coding required. Remove duplicates, empty rows, unwanted columns, and sort by any field. Cleaned results are stored in Apify's Key-Value Store. Perfect for marketers, researchers, and no-code workflows.

🧹 Scraped Data Cleaner & Converter (No-Code CSV/JSON Tool)

Clean and transform structured datasets from .json or .csv files — no coding required.

This Actor helps you prepare messy scraped data for analysis, import, or enrichment workflows. Whether you're working with leads, profiles, products, or survey results, it removes noise and gives you clean, structured rows you can actually use.


💰 Pricing – $0.03 Flat Per Run (PPE)

This Actor uses Pay Per Event (PPE) pricing with a flat $0.03 per run — no compute or storage fees.
Want to test it first? Try the Rental version, which includes 7 days of free access.


✅ Features

  • Accepts both .csv and .json files (uploaded directly or linked from Apify Key-Value Store)
  • Removes duplicate rows based on a field you choose (e.g. email, sku)
  • Discards rows missing required data
  • Choose whether all required fields must be filled or just one
  • Optionally remove rows with no meaningful values
  • Filter rows that match a specific field-value pair (e.g. status = active)
  • Delete unwanted fields (columns in CSV / keys in JSON)
  • Sort rows by one or more fields (text, number, date supported)
  • Saves results in both .csv and .json formats — download links shown in log

🧾 How to Use

The Actor provides a clean no-code interface. Just upload a file and select any combination of cleaning options.


1.1 📁 Uploaded Data File

Upload a .csv or .json file manually — or enter a full Apify Key-Value Store URL pointing to one.

When uploading directly, you'll see a window titled “Upload file to key-value store” with these options:

  • New temporary storage (recommended) — creates short-lived storage with no additional cost
  • 🗂️ New permanent storage — for keeping the file long-term
  • 📁 Existing storage — reuse an existing Apify KV store

If you’re cleaning a one-off dataset, just use the default temporary option. It’s lightweight, instant, and cost-free.


1.2 🧠 Deduplicate By Field

Specify a field name (e.g. email, id) to remove duplicate rows. Only the first occurrence of each unique value is kept.
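
The rule above can be sketched in a few lines of Python. This is only an illustration, not the Actor's actual code: the function name `dedupe_by_field` is hypothetical, and leaving rows with a blank value untouched is an assumption based on the sample tables further down, where rows without an email are not collapsed into one.

```python
def dedupe_by_field(rows, field):
    """Keep only the first row seen for each non-empty value of `field`."""
    seen = set()
    kept = []
    for row in rows:
        value = row.get(field)
        # Assumption: rows with a blank value are never treated as duplicates.
        if value is None or value == "":
            kept.append(row)
            continue
        if value not in seen:
            seen.add(value)
            kept.append(row)
    return kept


rows = [
    {"name": "Eve", "email": "eve@example.com"},
    {"name": "Charlie", "email": ""},
    {"name": "Eve", "email": "eve@example.com"},
    {"name": "Dana", "email": ""},
]
deduped = dedupe_by_field(rows, "email")
print([r["name"] for r in deduped])  # ['Eve', 'Charlie', 'Dana']
```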

1.3 🧹 Remove Empty Rows

Enable to discard rows where all fields are blank, null, or empty strings. Works for both JSON and CSV rows.
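
As a rough sketch of what "empty" could mean here (the helper names are hypothetical, and treating whitespace-only strings as blank is an assumption, not something the Actor documents):

```python
def is_empty_row(row):
    """True when every value is None or a blank string."""
    # Assumption: whitespace-only strings count as blank.
    return all(
        value is None or (isinstance(value, str) and value.strip() == "")
        for value in row.values()
    )


def remove_empty_rows(rows):
    """Drop rows in which no field carries a value."""
    return [row for row in rows if not is_empty_row(row)]


rows = [
    {"name": "Charlie", "email": ""},
    {"name": "", "email": None},
    {"name": "Dana", "email": ""},
]
print([r["name"] for r in remove_empty_rows(rows)])  # ['Charlie', 'Dana']
```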


2.1.1 🔎 Must-Have Fields

List field names (e.g. email, profile, company) that should contain data. Rows with no data in those fields are removed; the next option controls whether every listed field must be filled or just one of them.

2.1.2 🔎 Match All Required Fields

Enable for strict filtering: only rows with all listed fields filled will be kept.
Disable to keep any row that has at least one of the listed fields filled.
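
The two modes map naturally onto "all" versus "any". A minimal sketch, assuming hypothetical names `has_value` and `filter_required` and assuming that an empty field list leaves the data unchanged:

```python
def has_value(row, field):
    """True when the field exists and is not blank."""
    value = row.get(field)
    return value is not None and str(value).strip() != ""


def filter_required(rows, fields, match_all=True):
    """Keep rows with all (match_all=True) or at least one (False) field filled."""
    if not fields:
        return list(rows)  # Assumption: nothing required means nothing removed.
    check = all if match_all else any
    return [row for row in rows if check(has_value(row, f) for f in fields)]


rows = [
    {"name": "Bob", "email": "bob@example.com", "linkedin": ""},
    {"name": "Charlie", "email": "", "linkedin": "https://linkedin.com/in/charlie"},
    {"name": "Dana", "email": "", "linkedin": ""},
    {"name": "Alice", "email": "alice@example.com", "linkedin": "https://linkedin.com/in/alice"},
]
required = ["email", "linkedin"]
print([r["name"] for r in filter_required(rows, required, match_all=False)])  # ['Bob', 'Charlie', 'Alice']
print([r["name"] for r in filter_required(rows, required, match_all=True)])   # ['Alice']
```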


2.2.1 🎯 Filter by Field

Specify a field name (e.g. members) to match against a specific value.

2.2.2 🎯 Match Specific Value

Enter the exact value the field should contain (e.g. pro). Only rows with that exact match will be kept.
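
In code terms, this filter amounts to an equality check on one field. The sketch below uses a hypothetical `filter_by_value` helper and assumes a case-sensitive string comparison, which the description does not spell out:

```python
def filter_by_value(rows, field, value):
    """Keep rows whose `field` equals `value` exactly."""
    # Assumption: values are compared as case-sensitive strings.
    return [
        row for row in rows
        if row.get(field) is not None and str(row[field]) == str(value)
    ]


rows = [
    {"name": "Bob", "members": "basic"},
    {"name": "Eve", "members": "pro"},
    {"name": "Dana", "members": "guest"},
    {"name": "Alice", "members": "pro"},
]
print([r["name"] for r in filter_by_value(rows, "members", "pro")])  # ['Eve', 'Alice']
```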


3.1 🪓 Remove Columns (optional)

List column names you want deleted from every row (status, id, etc.). Applies to both CSV and JSON files.
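
For JSON rows, removing a column simply means dropping a key from every object. A small sketch with a hypothetical `remove_columns` helper:

```python
def remove_columns(rows, columns):
    """Return copies of the rows without the listed keys."""
    drop = set(columns)
    return [{k: v for k, v in row.items() if k not in drop} for row in rows]


rows = [
    {"group": "3", "name": "Bob", "linkedin": "", "status": "active"},
    {"group": "1", "name": "Eve", "linkedin": "", "status": "active"},
]
print(remove_columns(rows, ["linkedin", "status"]))
# [{'group': '3', 'name': 'Bob'}, {'group': '1', 'name': 'Eve'}]
```

Column names that do not occur in a row are ignored in this sketch, so a typo in the list would not raise an error.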


4.1 📌 Sort By Fields

Enter a list of field names to sort by, in order of priority (status, createdAt, email, etc.).

4.2 🔄 Sort Descending

Enable to reverse the sort direction (Z–A, latest-to-earliest, etc.).
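
A multi-field sort with one shared direction can be sketched as follows. The names `sort_key` and `sort_rows` are hypothetical, and the type handling is an assumption: numeric-looking values sort as numbers and everything else sorts as case-insensitive text, which also orders ISO-style dates (e.g. 2024-01-31) correctly. How the Actor itself detects types is not documented here.

```python
import math


def sort_key(value):
    """Sort numbers before text; compare text case-insensitively."""
    text = "" if value is None else str(value).strip()
    try:
        number = float(text)
    except ValueError:
        number = None
    if number is not None and math.isfinite(number):
        return (0, number, "")
    return (1, 0.0, text.lower())


def sort_rows(rows, fields, descending=False):
    """Sort by the listed fields in priority order."""
    return sorted(
        rows,
        key=lambda row: tuple(sort_key(row.get(f)) for f in fields),
        reverse=descending,
    )


rows = [
    {"group": "3", "name": "Bob"},
    {"group": "1", "name": "Eve"},
    {"group": "1", "name": "Charlie"},
    {"group": "2", "name": "Dana"},
    {"group": "1", "name": "Alice"},
]
print([r["name"] for r in sort_rows(rows, ["group", "name"])])
# ['Alice', 'Charlie', 'Eve', 'Dana', 'Bob']
print([r["name"] for r in sort_rows(rows, ["group", "name"], descending=True)])
# ['Bob', 'Dana', 'Eve', 'Charlie', 'Alice']
```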


📁 Output

Cleaned results are stored in Apify's Key-Value Store and made available in both .csv and .json formats, regardless of original file type.

You'll see two download links directly in the Actor logs:

  • 🟢 JSON format → originalFileName-CLEANED.json
  • 🟠 CSV format → originalFileName-CLEANED.csv

Right-click a link and choose “Save link as…” to download the cleaned file.

No need to browse datasets, export manually, or convert formats — everything is ready to go.


🧠 What’s It Good For?

Let’s say you scraped a bunch of data — like contacts, products, survey answers, whatever. This tool helps clean it up and make it actually usable.

You can:

  • Get rid of duplicate entries, like the same email showing up twice
  • Filter out rows that are missing stuff you care about — like empty emails or profiles
  • Keep only the ones with specific values (like people who have status: active or members: pro)
  • Delete random columns you don’t need, like internalNotes or debugInfo
  • Sort everything — by date, group, name, whatever you want
  • Convert between .json and .csv so you can open the file wherever
  • Basically, take any messy scraped file and make it clean, neat, and ready to use

It’s like having a smart assistant that tidies your data for you without writing a single line of code.


🛠 No Coding Required

You don’t need any JavaScript, parsing logic, or scripting knowledge. Just upload your file, tweak a few inputs, and go.

Ideal for:

  • Marketers analyzing scraped leads
  • Researchers organizing field data
  • Journalists working with tabular records
  • Data-driven workflows powered by no-code integrations

🔍 File Format Notes

  • If your input is a .json array, the Actor auto-converts it to tabular .csv as well
  • If your input is a .csv, the Actor also generates .json output for flexible reuse
  • Result filenames follow this pattern:
    yourFileName-CLEANED.csv and yourFileName-CLEANED.json
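
The conversion and naming notes above can be approximated with Python's standard `csv` and `json` modules. This is a sketch under assumptions, not the Actor's implementation: the helper names are hypothetical, the file extension is assumed to be dropped before "-CLEANED" is appended, and keys missing from a JSON object are assumed to become empty CSV cells.

```python
import csv
import io
import json
from pathlib import Path


def cleaned_names(filename):
    """Build both output names from the input file name."""
    stem = Path(filename).stem  # Assumption: the original extension is dropped.
    return f"{stem}-CLEANED.csv", f"{stem}-CLEANED.json"


def rows_to_csv(rows):
    """Serialize a list of dicts as CSV, using the union of all keys as the header."""
    fieldnames = list(dict.fromkeys(key for row in rows for key in row))
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)  # Missing keys are written as empty cells.
    return buffer.getvalue()


def csv_to_rows(text):
    """Parse CSV text into a list of dicts (all values stay strings)."""
    return list(csv.DictReader(io.StringIO(text)))


print(cleaned_names("leads.csv"))  # ('leads-CLEANED.csv', 'leads-CLEANED.json')
print(json.dumps(csv_to_rows("name,email\nEve,eve@example.com\n")))
# [{"name": "Eve", "email": "eve@example.com"}]
```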

🧪 Sample Pre-Filled Input

To try it instantly, use the example CSV file provided in the interface or paste this Apify URL:
https://api.apify.com/v2/key-value-stores/9oIROyE5tcs83ZqP5/records/data-example.csv


📊 Input vs Output Examples

📄 Original CSV

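| group | name | email | linkedin | status | members |
|---|---|---|---|---|---|
| 3 | Bob | bob@example.com | | active | basic |
| 1 | Eve | eve@example.com | | active | pro |
| 1 | Charlie | | https://linkedin.com/in/charlie | inactive | |
| | | | | | |
| 2 | Dana | | | pending | guest |
| 1 | Eve | eve@example.com | | active | pro |
| 1 | Alice | alice@example.com | https://linkedin.com/in/alice | active | pro |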

1.2 🧠 Deduplicate By Field: email

Duplicate row for eve@example.com is removed.

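| group | name | email | linkedin | status | members |
|---|---|---|---|---|---|
| 3 | Bob | bob@example.com | | active | basic |
| 1 | Eve | eve@example.com | | active | pro |
| 1 | Charlie | | https://linkedin.com/in/charlie | inactive | |
| | | | | | |
| 2 | Dana | | | pending | guest |
| 1 | Alice | alice@example.com | https://linkedin.com/in/alice | active | pro |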

1.3 🧹 Remove Empty Rows

Removes the row with no values (between Charlie and Dana).

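| group | name | email | linkedin | status | members |
|---|---|---|---|---|---|
| 3 | Bob | bob@example.com | | active | basic |
| 1 | Eve | eve@example.com | | active | pro |
| 1 | Charlie | | https://linkedin.com/in/charlie | inactive | |
| 2 | Dana | | | pending | guest |
| 1 | Alice | alice@example.com | https://linkedin.com/in/alice | active | pro |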

2.1.1 🔎 Required Fields: email, linkedin

2.1.2 🔎 Match All Required Fields: 🔘 OFF — keeps rows with at least one of the fields filled

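| group | name | email | linkedin | status | members |
|---|---|---|---|---|---|
| 3 | Bob | bob@example.com | | active | basic |
| 1 | Eve | eve@example.com | | active | pro |
| 1 | Charlie | | https://linkedin.com/in/charlie | inactive | |
| 1 | Alice | alice@example.com | https://linkedin.com/in/alice | active | pro |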

2.1.1 🔎 Required Fields: email, linkedin

2.1.2 🔎 Match All Required Fields: 🟢 ON — keeps only rows with both fields filled

| group | name | email | linkedin | status | members |
|---|---|---|---|---|---|
| 1 | Alice | alice@example.com | https://linkedin.com/in/alice | active | pro |

2.2.1 🎯 Filter by Field: members

2.2.2 🎯 Match Specific Value: pro

Keeps only rows where the members field is exactly pro.

| group | name | email | linkedin | status | members |
|---|---|---|---|---|---|
| 1 | Eve | eve@example.com | | active | pro |
| 1 | Alice | alice@example.com | https://linkedin.com/in/alice | active | pro |

3.1 🪓 Remove Columns: linkedin, status

Removes those columns from all rows.

| group | name | email | members |
|---|---|---|---|
| 3 | Bob | bob@example.com | basic |
| 1 | Eve | eve@example.com | pro |
| 1 | Charlie | | |
| 2 | Dana | | guest |
| 1 | Alice | alice@example.com | pro |

4.1 📌 Sort By Fields: group, then name

4.2 🔄 Sort Descending: 🔘 OFF — lowest group first, A–Z within group

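| group | name | email | linkedin | status | members |
|---|---|---|---|---|---|
| 1 | Alice | alice@example.com | https://linkedin.com/in/alice | active | pro |
| 1 | Charlie | | https://linkedin.com/in/charlie | inactive | |
| 1 | Eve | eve@example.com | | active | pro |
| 2 | Dana | | | pending | guest |
| 3 | Bob | bob@example.com | | active | basic |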

4.1 📌 Sort By Fields: group, then name

4.2 🔄 Sort Descending: 🟢 ON — highest group first, Z–A within group

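| group | name | email | linkedin | status | members |
|---|---|---|---|---|---|
| 3 | Bob | bob@example.com | | active | basic |
| 2 | Dana | | | pending | guest |
| 1 | Eve | eve@example.com | | active | pro |
| 1 | Charlie | | https://linkedin.com/in/charlie | inactive | |
| 1 | Alice | alice@example.com | https://linkedin.com/in/alice | active | pro |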

⚖️ Pay Per Event (PPE) vs Rental – Which Version Should You Use?

| Feature | 🟢 PPE Version | 🔵 Rental Version |
|---|---|---|
| Pricing Model | Pay Per Event | Monthly Subscription |
| Cost | $0.03 per run | $2.90/month |
| Usage Charges | ✅ No compute/storage fees | ⚠️ Usage billed separately (CU, dataset) |
| Free Trial | ❌ None | ✅ 7 days free |
| Output Storage | Key-Value Store (CSV + JSON) | Dataset Export |

If you clean data occasionally and want zero billing surprises, use the PPE version — simple, predictable pricing.
If you run this frequently (e.g. 100+ runs/month), the Rental version offers better long-term value, and includes a 7-day free trial.
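
The "100+ runs/month" guideline can be checked with a quick break-even calculation using the two listed prices. In the sketch below, `usage_per_run` is a hypothetical estimate of the platform usage billed on top of the Rental fee; the real figure depends on your files and is not stated here.

```python
import math
from decimal import Decimal

PPE_PER_RUN = Decimal("0.03")       # listed PPE price per run
RENTAL_PER_MONTH = Decimal("2.90")  # listed Rental price per month


def break_even_runs(usage_per_run=Decimal("0")):
    """Monthly runs at which PPE costs at least as much as Rental.

    usage_per_run: hypothetical usage cost billed per Rental run.
    """
    saving_per_run = PPE_PER_RUN - usage_per_run
    if saving_per_run <= 0:
        return None  # Rental never becomes cheaper under this estimate.
    return math.ceil(RENTAL_PER_MONTH / saving_per_run)


print(break_even_runs())                 # 97
print(break_even_runs(Decimal("0.01")))  # 145
```

With no extra usage charges, the two versions cost about the same at 97 runs per month, which is in line with the 100+ guideline; any usage billed on the Rental side pushes the break-even point higher.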


💬 Feedback & Ideas

Want new filtering modes, regex support, or nested data handling?
Have ideas to make it even simpler for non-coders? Just send me a message — I’d love to hear how you're using the tool.