LatAm Fintech Synthetic Data Generator
Pricing
from $0.25 / 1,000 synthetic users generated
Generate privacy-safe synthetic users, savings goals & transactions calibrated on 506K real records from a production LatAm savings app (2015–2024). Multimodal amounts, real seasonality, reproducible by seed.
Developer
Joel Mendoza
Maintained by Community
LatAm Synth — Synthetic Financial Savings Data Generator
Generate realistic synthetic test data for fintech applications, calibrated on 506,311 real LatAm records (2015–2024). Use it to generate synthetic datasets for ML training, mock financial transactions for integration testing, realistic fake user data for testing pipelines, or seed data for fintech development — all with full referential integrity, 100% synthetic, zero PII.
Each run produces linked users, savings goals, and deposit/withdrawal transactions that behave statistically like real Latin American savings-app data.
What you get
Each run produces three linked tables of synthetic dataset output:
| Table | Description |
|---|---|
| `users` | Synthetic users with country, scores, join date |
| `goals` | Savings goals with category, required amount, status (achieved / overdue / in_progress) |
| `transactions` | Mock financial transactions: deposits and withdrawals within each goal's time window |
Calibrated against real data: lognormal mixture distributions (KS=0.032), 69.5% round-value snap, 73.8% overdue rate, monthly seasonality, 8 goal categories with realistic amounts and horizons. Suitable as a synthetic dataset for ML training or as realistic fake user data for testing any fintech system.
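The KS figure above presumably refers to a Kolmogorov-Smirnov statistic, the largest gap between two empirical cumulative distributions. To compare generated amounts against your own reference data, a minimal pure-Python sketch of the two-sample version could look like this. The function name `ks_statistic` is illustrative and is not part of the actor.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: max |F_a(x) - F_b(x)|.

    Illustrative helper (not part of the actor). Both inputs are
    iterables of numbers, e.g. transaction amounts.
    """
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")

    largest_gap = 0.0
    # Both empirical CDFs are step functions that only change at observed
    # values, so checking every observed value is enough to find the maximum.
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap
```

A value near 0 means the two samples have nearly identical distributions; a value of 1 means they do not overlap at all.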
Pricing — Pay Per Event
This actor uses Apify's Pay Per Event model. You are charged per synthetic user generated, regardless of output format or destination.
| Event | Unit | What triggers it |
|---|---|---|
| `users-generated` | per user | Each run, before data is written |
Example: generating 1,000 users = 1,000 × unit price.
The charge fires before output is written, so it applies equally whether you use CSV files, JSON, or the Dataset export — there is no cheaper path. If your spending limit is reached mid-run, the actor stops cleanly and reports the limit in the log.
Set your spending cap in Apify Console → Run → Max total charge before starting a run.
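To budget a run before setting the cap, the arithmetic can be sketched as follows. The default of $0.25 per 1,000 users is an assumption taken from the "from" price on the listing; your actual unit price may differ, so pass your own. Both function names are illustrative.

```python
from decimal import Decimal

# Assumption: the listing's "from $0.25 / 1,000 users" price. Replace with
# the unit price shown for your plan.
DEFAULT_PRICE_PER_1000 = Decimal("0.25")


def estimate_charge(users, price_per_1000=DEFAULT_PRICE_PER_1000):
    """Estimated charge in USD for generating `users` synthetic users."""
    if users < 0:
        raise ValueError("users must be non-negative")
    return Decimal(users) * price_per_1000 / Decimal(1000)


def max_users_for_cap(cap_usd, price_per_1000=DEFAULT_PRICE_PER_1000):
    """Largest user count whose estimated charge stays within `cap_usd`."""
    return int(Decimal(cap_usd) * Decimal(1000) // price_per_1000)
```

For example, `estimate_charge(1000)` returns `Decimal("0.25")` under the assumed price, and `max_users_for_cap(5)` returns `20000`.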
Input parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `users` | integer | 1000 | Number of synthetic users (1–50,000) |
| `seed` | integer | null | Random seed for reproducibility. Same seed → identical output |
| `countries` | string[] | null | Restrict to specific countries. Empty = full LatAm mix (Mexico 46.6%, Colombia 14.4%, …) |
| `format` | `csv` or `json` | csv | Output format for Key-value store files |
| `push_to_dataset` | boolean | true | Push transactions to Apify Dataset (enables native JSON/CSV/Excel export and integrations) |
| `start_date` | string | 2023-01-01 | Start of generation period (YYYY-MM-DD) |
| `end_date` | string | 2024-12-31 | End of generation period (YYYY-MM-DD) |
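If you assemble the input in code, a small client-side check can catch out-of-range values before a run is started. The sketch below mirrors the parameters and defaults listed above. The helper name `build_run_input` is illustrative, and the rule that `start_date` must not fall after `end_date` is an assumption the actor may or may not enforce.

```python
from datetime import date


def build_run_input(users=1000, seed=None, countries=None, fmt="csv",
                    push_to_dataset=True,
                    start_date="2023-01-01", end_date="2024-12-31"):
    """Return a validated input dict for the actor (illustrative helper)."""
    if not 1 <= users <= 50_000:
        raise ValueError("users must be between 1 and 50,000")
    if fmt not in ("csv", "json"):
        raise ValueError("format must be 'csv' or 'json'")

    # date.fromisoformat raises ValueError on anything that is not YYYY-MM-DD.
    start = date.fromisoformat(start_date)
    end = date.fromisoformat(end_date)
    if start > end:  # assumption: the period must not be inverted
        raise ValueError("start_date must not be after end_date")

    run_input = {
        "users": users,
        "format": fmt,
        "push_to_dataset": push_to_dataset,
        "start_date": start.isoformat(),
        "end_date": end.isoformat(),
    }
    # Leave out optional fields so the actor applies its own defaults.
    if seed is not None:
        run_input["seed"] = seed
    if countries:
        run_input["countries"] = list(countries)
    return run_input
```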
Where to find your output
Every run writes to two places:
Key-value store — all three tables
- Open the run → Storage tab → Key-value store
- Download files:
  - `users.csv` / `goals.csv` / `transactions.csv` (CSV mode)
  - `OUTPUT_DATA`: single JSON with all three tables (JSON mode)
  - `OUTPUT`: always present; JSON summary of the run (parameters, row counts, file list)
Dataset — transactions (Apify-native export)
When push_to_dataset=true (default), all transactions appear in the run's Dataset tab:
- Export as JSON, CSV, or Excel in one click
- Connect Google Sheets, webhooks, or any Apify integration
- Disable with `push_to_dataset=false` for runs > 10K users where you only need the KVS files
Use cases
- Fintech testing & QA — realistic fixtures for payment pipelines, budget apps, savings engines
- ML training data — bootstrap churn, recommendation, and segmentation models with real LatAm patterns
- Demos & POCs — dashboards with publicly shareable synthetic data
- Education — unlimited datasets for data science courses with real business narrative
For AI agents
This actor is designed to be invoked programmatically. Minimum input to generate synthetic test data:
{"users": 500, "seed": 42}
Full input with all options:
{"users": 1000, "seed": 42, "countries": ["Mexico", "Colombia"], "format": "csv", "push_to_dataset": true, "start_date": "2023-01-01", "end_date": "2024-12-31"}
Reading the output after the run completes:
- Dataset items (transactions): `GET {run.defaultDatasetId}/items`, paginated JSON, directly iterable.
- CSV files (users, goals, transactions): `GET {run.defaultKeyValueStoreId}/records/users.csv`, `.../goals.csv`, `.../transactions.csv`.
- Run summary (row counts, parameter echo): `GET {run.defaultKeyValueStoreId}/records/OUTPUT`, JSON, always present.
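The endpoints above are written relative to a base URL. As a sketch, assuming the public Apify API v2 base `https://api.apify.com/v2` with its `datasets` and `key-value-stores` resources, full URLs could be assembled from a finished run object as shown below. The helper name `output_urls` is illustrative, and requests still need your API token.

```python
# Assumption: Apify API v2 base URL; the README lists only relative paths.
API_BASE = "https://api.apify.com/v2"


def output_urls(run, csv_mode=True):
    """Map each output of a finished run to a full URL (illustrative helper).

    `run` is the run object returned by the platform; only its
    `defaultDatasetId` and `defaultKeyValueStoreId` fields are used.
    """
    dataset_id = run["defaultDatasetId"]
    store_id = run["defaultKeyValueStoreId"]
    records = f"{API_BASE}/key-value-stores/{store_id}/records"

    urls = {
        "transactions_dataset": f"{API_BASE}/datasets/{dataset_id}/items",
        "OUTPUT": f"{records}/OUTPUT",  # run summary, always present
    }
    # CSV mode writes three files; JSON mode writes a single OUTPUT_DATA record.
    keys = ["users.csv", "goals.csv", "transactions.csv"] if csv_mode else ["OUTPUT_DATA"]
    for key in keys:
        urls[key] = f"{records}/{key}"
    return urls
```

The dataset URL only returns data when the run used `push_to_dataset=true`.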
The seed parameter guarantees deterministic output: the same seed always produces the same rows, useful for reproducible test fixtures. Omit seed for a random run.
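One way to check that determinism in a test suite is to fingerprint the rows from two runs with the same seed and compare the digests. The sketch below assumes rows are JSON-serializable dicts (for example, dataset items) returned in the same order on every run; `fingerprint` is an illustrative name.

```python
import hashlib
import json


def fingerprint(rows):
    """SHA-256 digest of a sequence of row dicts (illustrative helper).

    Keys are sorted so dict key order does not matter, but row order does:
    two runs only match if they return the same rows in the same order.
    """
    canonical = json.dumps(list(rows), sort_keys=True,
                           separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Store the digest of a known-good run alongside your fixtures; a later run with the same input should reproduce it.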
Privacy & compliance
Output is 100% synthetic — no record derives from a real individual. No PII, no re-identification risk. GDPR / LGPD / CCPA / LFPDPPP non-applicable.