42 Polars AI Data Transformer
Pricing
Pay per event
Transform datasets using natural language. Upload CSV/Excel/JSON, describe your transformation in plain English, get results + reusable Python code. Powered by AI.
Developer
Salesmart Srl
Maintained by Community
Actor stats
4 days ago
Last modified
polars-runner
Natural-language → Polars code, with auto-improving RAG cache. Runs on Apify, Docker, or as a Python library.
Part of the wiki-embedded monorepo (packages/polars-runner/).
What it does
You describe a data transformation in plain English, the runner generates Polars code with an LLM, executes it in a sandbox, and returns the results — plus the code itself for inspection and reuse.
- Multi-format input: CSV, JSON, Excel, Parquet, or a JSON object passed inline.
- Multi-table mode: ship { "contacts": [...], "companies": [...] } and the LLM gets one schema per table for explicit JOINs.
- Three tiers:
  - BYOK: bring your Google / Anthropic / OpenAI / Groq key.
  - Hosted basic: Gemini 2.5 Flash-Lite (cheap, fast).
  - Hosted premium: Gemini 2.5 Pro with reasoning (slow, high quality).
- Auto-improving: every successful execution is embedded and stored in a Pinecone index; future similar prompts retrieve and reuse working code, skipping the generation step.
- Error recovery: on a sandbox exception, the analyzer extracts a structured hint (missing column, wrong dtype, JOIN mismatch) and asks the LLM to fix-and-retry up to N times.
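The cache lookup and fix-and-retry behaviour described in the list above can be sketched roughly as follows. This is an illustrative sketch, not the runner's actual code: `CodeCache`, `extract_hint`, `run`, and `fake_llm` are hypothetical names, fuzzy string matching from the standard library stands in for the embedding lookup in Pinecone, a plain `exec` stands in for the sandbox, and a stub function stands in for the LLM.

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher
from typing import Any, Callable, Optional


@dataclass
class CodeCache:
    """Toy stand-in for the vector index: fuzzy string match, not embeddings."""
    threshold: float = 0.9
    entries: list[tuple[str, str]] = field(default_factory=list)

    def lookup(self, prompt: str) -> Optional[str]:
        best_score, best_code = 0.0, None
        for cached_prompt, code in self.entries:
            score = SequenceMatcher(None, prompt.lower(), cached_prompt.lower()).ratio()
            if score > best_score:
                best_score, best_code = score, code
        return best_code if best_score >= self.threshold else None

    def store(self, prompt: str, code: str) -> None:
        self.entries.append((prompt, code))


def extract_hint(exc: Exception) -> str:
    """Turn an exception into a short structured hint for the next attempt."""
    if isinstance(exc, KeyError):
        return f"missing column: {exc.args[0]}"
    if isinstance(exc, TypeError):
        return f"wrong dtype: {exc}"
    return f"{type(exc).__name__}: {exc}"


def run(prompt: str, rows: list[dict], generate: Callable[[str, Optional[str]], str],
        cache: CodeCache, max_retries: int = 2) -> dict[str, Any]:
    code = cache.lookup(prompt)          # reuse known-good code when available
    from_cache = code is not None
    hint: Optional[str] = None
    for attempt in range(1, max_retries + 2):
        if code is None:
            code = generate(prompt, hint)  # hint is None on the first attempt
        scope: dict[str, Any] = {"rows": rows}
        try:
            exec(code, scope)            # assumption: NOT a real sandbox
            if "result" not in scope:
                raise ValueError("generated code did not set `result`")
        except Exception as exc:
            hint = extract_hint(exc)     # feed the failure back to the generator
            code, from_cache = None, False
            continue
        if not from_cache:
            cache.store(prompt, code)    # only successful executions are cached
        return {"result": scope["result"], "code": code,
                "from_cache": from_cache, "attempts": attempt}
    raise RuntimeError(f"gave up after {max_retries + 1} attempts; last hint: {hint}")


def fake_llm(prompt: str, hint: Optional[str]) -> str:
    # Stub generator: misspells the column until it receives a hint.
    column = "revenue" if hint else "revnue"
    return f"result = sum(row['{column}'] for row in rows)"
```

With the stub above, a first call such as `run("sum revenue", rows, fake_llm, CodeCache())` fails once on the misspelled column, retries with the hint, and caches the corrected code; a second call with a near-identical prompt on the same cache skips generation entirely.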
Two build targets
    packages/polars-runner/
    ├── .actor/Dockerfile          Apify deploy (apify push)
    └── builds/docker/Dockerfile   Standalone (docker build)
Same source, different wrappers — see the Dockerfile and docker-compose.yml files for runtime knobs.
Quick start
    # Install (workspace member of wiki-embedded)
    uv sync --package polars-runner

    # Run with your own key, single table
    GOOGLE_API_KEY=... polars-runner \
      --input data.csv \
      --prompt "Group by sector, sum revenue, top 10 desc"
Why it lives here
It is the analytics engine that backfills analytics/ chunks at compile time for ../wiki-compiler/, and may be called at runtime by ../../ for ad-hoc graph queries when a precomputed answer is not available. Standalone use is fully supported.
Status
v0.3.0 — refactored into the monorepo, FAANG-grade packaging. The post-execution semantic validator (planned v0.4) will tackle the "status SUCCEEDED but answer is semantically wrong" class of bugs surfaced by our benchmark.
License
MIT — see ../../LICENSE.
