LatAm Fintech Synthetic Data Generator

Pricing

from $0.25 / 1,000 synthetic users generated

Generate privacy-safe synthetic users, savings goals & transactions calibrated on 506K real records from a production LatAm savings app (2015–2024). Multimodal amounts, real seasonality, reproducible by seed

Rating

5.0

(1)

Developer

Joel Mendoza

Maintained by Community

Actor stats

Bookmarked: 0
Total users: 2
Monthly active users: 0
Last modified: 6 days ago

LatAm Synth — Synthetic Financial Savings Data Generator

Generate realistic synthetic test data for fintech applications, calibrated on 506,311 real LatAm records (2015–2024). Use it to generate synthetic datasets for ML training, mock financial transactions for integration testing, realistic fake user data for testing pipelines, or seed data for fintech development — all with full referential integrity, 100% synthetic, zero PII.

Each run produces linked users, savings goals, and deposit/withdrawal transactions that behave statistically like real Latin American savings-app data.

What you get

Each run produces three linked tables:

| Table | Description |
|---|---|
| users | Synthetic users with country, scores, join date |
| goals | Savings goals with category, required amount, status (achieved / overdue / in_progress) |
| transactions | Mock financial transactions: deposits and withdrawals within each goal's time window |

The generator is calibrated against the real data: lognormal mixture distributions (Kolmogorov-Smirnov statistic 0.032), 69.5% of amounts snapped to round values, a 73.8% overdue rate, monthly seasonality, and 8 goal categories with realistic amounts and horizons. The output is suitable as a synthetic dataset for ML training or as realistic fake user data for testing fintech systems.
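
To make the idea of a lognormal mixture with round-value snapping concrete, here is a minimal sketch using only the Python standard library. The mixture weights, the `mu` and `sigma` values, and the "nearest multiple of 10" rounding rule are all invented for illustration; they are not the actor's calibrated parameters. Only the 69.5% snap rate comes from the text above.

```python
import random

# Hypothetical mixture: (weight, mu, sigma) per component. These numbers are
# invented for illustration and are NOT the actor's calibrated parameters.
COMPONENTS = [(0.6, 4.0, 0.5), (0.4, 6.5, 0.7)]
ROUND_SNAP_RATE = 0.695  # share of amounts snapped to a round value (from the text)


def sample_amount(rng: random.Random) -> float:
    """Draw one amount from a lognormal mixture, then maybe snap it to a round value."""
    weights = [w for w, _, _ in COMPONENTS]
    _, mu, sigma = rng.choices(COMPONENTS, weights=weights, k=1)[0]
    amount = rng.lognormvariate(mu, sigma)
    if rng.random() < ROUND_SNAP_RATE:
        # Assumed rounding rule: nearest multiple of 10, never below 10.
        amount = max(10.0, round(amount / 10) * 10.0)
    return round(amount, 2)


rng = random.Random(42)  # a fixed seed makes the sample reproducible
amounts = [sample_amount(rng) for _ in range(10_000)]
round_share = sum(a % 10 == 0 for a in amounts) / len(amounts)
print(f"share of round amounts: {round_share:.3f}")
```

The printed share should land close to the snap rate, slightly above it because an unsnapped draw can occasionally fall on a multiple of 10 by chance.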

Pricing — Pay Per Event

This actor uses Apify's Pay Per Event model. You are charged per synthetic user generated, regardless of output format or destination.

| Event | Unit | What triggers it |
|---|---|---|
| users-generated | per user | Each run, before data is written |

Example: generating 1,000 users = 1,000 × unit price.

The charge fires before output is written, so it applies equally whether you use CSV files, JSON, or the Dataset export — there is no cheaper path. If your spending limit is reached mid-run, the actor stops cleanly and reports the limit in the log.

Set your spending cap in Apify Console → Run → Max total charge before starting a run.
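
The arithmetic behind the charge can be sketched as follows. The unit price is an assumption taken from the listing's "from $0.25 / 1,000" figure, so the price that applies to your plan may differ; the function names are hypothetical helpers, not part of the actor.

```python
from decimal import Decimal

# Assumed unit price: the listing says "from $0.25 / 1,000" users, so the
# actual price on your plan may differ.
PRICE_PER_1000_USERS = Decimal("0.25")


def run_cost(users: int, price_per_1000: Decimal = PRICE_PER_1000_USERS) -> Decimal:
    """Cost in USD of generating `users` synthetic users."""
    return price_per_1000 * users / 1000


def max_users_for_budget(
    budget_usd: Decimal, price_per_1000: Decimal = PRICE_PER_1000_USERS
) -> int:
    """Largest user count whose cost stays within `budget_usd`."""
    return int(budget_usd * 1000 / price_per_1000)


print(run_cost(1_000))                     # cost of the default run size
print(run_cost(50_000))                    # cost of the largest allowed run
print(max_users_for_budget(Decimal("5")))  # users that fit in a $5 cap
```

A figure like the last one is a reasonable starting point for the Max total charge setting, bearing in mind that a single run is limited to 50,000 users regardless of budget.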

Input parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| users | integer | 1000 | Number of synthetic users (1–50,000) |
| seed | integer | null | Random seed for reproducibility. Same seed → identical output |
| countries | string[] | null | Restrict to specific countries. Empty = full LatAm mix (Mexico 46.6%, Colombia 14.4%, …) |
| format | csv or json | csv | Output format for Key-value store files |
| push_to_dataset | boolean | true | Push transactions to Apify Dataset (enables native JSON/CSV/Excel export and integrations) |
| start_date | string | 2023-01-01 | Start of generation period (YYYY-MM-DD) |
| end_date | string | 2024-12-31 | End of generation period (YYYY-MM-DD) |
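
When the input is assembled in code, it can help to check it against the documented ranges before starting a paid run. The sketch below is a hypothetical client-side helper (`build_input` is not part of the actor), and the rule that `start_date` must not be after `end_date` is an assumption, since the parameter table does not state it.

```python
import json
from datetime import date
from typing import Optional


def build_input(
    users: int = 1000,
    seed: Optional[int] = None,
    countries: Optional[list[str]] = None,
    fmt: str = "csv",
    push_to_dataset: bool = True,
    start_date: str = "2023-01-01",
    end_date: str = "2024-12-31",
) -> dict:
    """Build the actor input, rejecting values outside the documented ranges."""
    if not 1 <= users <= 50_000:
        raise ValueError("users must be between 1 and 50,000")
    if fmt not in ("csv", "json"):
        raise ValueError("format must be 'csv' or 'json'")
    # date.fromisoformat raises ValueError for strings that are not valid ISO dates.
    start, end = date.fromisoformat(start_date), date.fromisoformat(end_date)
    if start > end:
        # Assumption: a reversed period is treated as an error here.
        raise ValueError("start_date must not be after end_date")
    return {
        "users": users,
        "seed": seed,
        "countries": countries,
        "format": fmt,
        "push_to_dataset": push_to_dataset,
        "start_date": start_date,
        "end_date": end_date,
    }


payload = build_input(users=500, seed=42, countries=["Mexico", "Colombia"])
print(json.dumps(payload, indent=2))
```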

Where to find your output

Every run writes to two places:

Key-value store — all three tables

  1. Open the run → Storage tab → Key-value store
  2. Download files:
    • users.csv / goals.csv / transactions.csv (CSV mode)
    • OUTPUT_DATA — single JSON with all three tables (JSON mode)
    • OUTPUT — always present; JSON summary of the run (parameters, row counts, file list)
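
Once the three CSV files are downloaded, the links between them can be checked with a few lines of Python. This is a minimal sketch: the column names `user_id` and `goal_id` are assumptions (check the headers of your own files), and the inline rows are tiny stand-ins with one deliberately broken link, not actual actor output.

```python
import csv
import io


def dangling_refs(parents, children, key):
    """Return child rows whose `key` value has no matching parent row."""
    parent_ids = {row[key] for row in parents}
    return [row for row in children if row[key] not in parent_ids]


# Tiny inline stand-ins for the downloaded files. The column names user_id
# and goal_id are assumptions; check the headers of your own CSV files.
users = list(csv.DictReader(io.StringIO("user_id,country\nu1,Mexico\nu2,Colombia\n")))
goals = list(csv.DictReader(io.StringIO("goal_id,user_id\ng1,u1\ng2,u2\n")))
transactions = list(
    csv.DictReader(io.StringIO("transaction_id,goal_id\nt1,g1\nt2,g2\nt3,g9\n"))
)

# Goals must point at existing users; transactions must point at existing goals.
print(dangling_refs(users, goals, "user_id"))
print(dangling_refs(goals, transactions, "goal_id"))
```

With real output, both lists are expected to be empty. For actual files, replace each `io.StringIO(...)` with an opened file, for example `open("users.csv", newline="")`.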

Dataset — transactions (Apify-native export)

When push_to_dataset=true (default), all transactions appear in the run's Dataset tab:

  • Export as JSON, CSV, or Excel in one click
  • Connect Google Sheets, webhooks, or any Apify integration
  • Disable with push_to_dataset=false for runs > 10K users where you only need the KVS files

Use cases

  • Fintech testing & QA — realistic fixtures for payment pipelines, budget apps, savings engines
  • ML training data — bootstrap churn, recommendation, and segmentation models with real LatAm patterns
  • Demos & POCs — dashboards with publicly shareable synthetic data
  • Education — unlimited datasets for data science courses with real business narrative

For AI agents

This actor is designed to be invoked programmatically. Minimum input to generate synthetic test data:

{
  "users": 500,
  "seed": 42
}

Full input with all options:

{
  "users": 1000,
  "seed": 42,
  "countries": ["Mexico", "Colombia"],
  "format": "csv",
  "push_to_dataset": true,
  "start_date": "2023-01-01",
  "end_date": "2024-12-31"
}

Reading the output after the run completes:

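  • Dataset items (transactions): GET https://api.apify.com/v2/datasets/{run.defaultDatasetId}/items returns paginated JSON, directly iterable.
  • CSV files (users, goals, transactions; written when format is csv): GET https://api.apify.com/v2/key-value-stores/{run.defaultKeyValueStoreId}/records/users.csv, and likewise .../goals.csv and .../transactions.csv.
  • Run summary (row counts, parameter echo): GET https://api.apify.com/v2/key-value-stores/{run.defaultKeyValueStoreId}/records/OUTPUT returns JSON and is always present.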
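
The three reads above can be sketched with the Python standard library. The helper names `output_urls` and `fetch_json` are hypothetical, the IDs are placeholders, and the sketch assumes the public Apify API base URL `https://api.apify.com/v2` with a bearer token. Nothing is fetched until `fetch_json` is called.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def output_urls(run: dict) -> dict:
    """Map each output of a finished run to its Apify API URL."""
    store = f"{API_BASE}/key-value-stores/{run['defaultKeyValueStoreId']}/records"
    return {
        "transactions_dataset": f"{API_BASE}/datasets/{run['defaultDatasetId']}/items",
        "users_csv": f"{store}/users.csv",
        "goals_csv": f"{store}/goals.csv",
        "transactions_csv": f"{store}/transactions.csv",
        "summary": f"{store}/OUTPUT",
    }


def fetch_json(url: str, token: str):
    """GET a URL with a bearer token and decode the JSON body."""
    request = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Placeholder IDs for illustration; a real run object comes from the Apify API.
run = {"defaultDatasetId": "DATASET_ID", "defaultKeyValueStoreId": "STORE_ID"}
urls = output_urls(run)
print(urls["summary"])
```

With a real run object and token, `fetch_json(urls["summary"], token)` would return the run summary, which is a sensible first read because it is always present.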

The seed parameter guarantees deterministic output: the same seed always produces the same rows, useful for reproducible test fixtures. Omit seed for a random run.
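
One way to confirm reproducibility in a test suite is to compare a hash of the same output file from two runs that used the same seed. This is a sketch under the assumption that identical rows also mean byte-identical files; the sample bytes and their column names are stand-ins, not actual actor output.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """SHA-256 hex digest of a downloaded output file."""
    return hashlib.sha256(data).hexdigest()


# Stand-in bytes; in practice read transactions.csv from each run.
run_a = b"transaction_id,goal_id,amount\nt1,g1,100.00\n"
run_b = b"transaction_id,goal_id,amount\nt1,g1,100.00\n"

assert fingerprint(run_a) == fingerprint(run_b), "same seed should give identical files"
```

Storing the fingerprint next to the seed in a fixture manifest makes it easy to notice if a later run with that seed ever produces different data.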

Privacy & compliance

Output is 100% synthetic — no record derives from a real individual. No PII, no re-identification risk. GDPR / LGPD / CCPA / LFPDPPP non-applicable.