Synthetic E-Commerce Data Generator
Pricing
Pay per event
Generate realistic e-commerce test data with interconnected products, customers, orders, and reviews. Features referential integrity, realistic distributions, temporal coherence, industry presets, and deterministic seed mode.
Developer
BowTiedRaccoon
4 days ago
Last modified
Generate realistic e-commerce test datasets with four interconnected entity types: products, customers, orders, and reviews. All entities maintain referential integrity — orders reference real product and customer IDs, reviews reference real products and customers. Timestamps maintain temporal coherence: orders are placed after customer registration, shipments follow order placement, deliveries follow shipment.
Features
- Four entity types with cross-references: products, customers, orders, reviews
- Referential integrity — every order and review links to real product and customer IDs generated in the same run
- Realistic statistical distributions: log-normal product prices, right-skewed review ratings (average ~4.2), weighted order statuses (70% delivered)
- Temporal coherence: `shipped_at` always follows `ordered_at`, `delivered_at` follows `shipped_at`, and orders are placed after customer registration
- Five industry presets with tailored categories, brand names, and price ranges
- Deterministic seed mode for reproducible datasets
- Five locale options for names, addresses, and phone numbers
- No network calls, no proxy needed — pure CPU data generation
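The referential-integrity and temporal-coherence guarantees listed above can be spot-checked on your side after a run. The following is a minimal sketch, not part of the actor: `parse_ts` and `check_integrity` are hypothetical helpers, the field names are taken from the sample records in the Output section, and it assumes `shipped_at` and `delivered_at` may be missing for orders that have not shipped. If `maxItems` truncated the output, some references may point at records that were cut.

```python
from datetime import datetime


def parse_ts(value):
    """Parse a timestamp such as '2025-06-22T10:21:51.493Z'."""
    # Swap the trailing 'Z' for an explicit offset so this also works before Python 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def check_integrity(records):
    """Return a list of problems found in a list of unified-mode records."""
    product_ids = {r["product_id"] for r in records if r.get("entityType") == "product"}
    customers = {r["customer_id"]: r for r in records if r.get("entityType") == "customer"}
    problems = []

    for r in records:
        kind = r.get("entityType")
        if kind == "order":
            label = r["order_id"]
            customer = customers.get(r["order_customer_id"])
            if customer is None:
                problems.append(f"{label}: unknown customer {r['order_customer_id']}")
            elif parse_ts(r["ordered_at"]) < parse_ts(customer["customer_created_at"]):
                problems.append(f"{label}: ordered before customer registration")
            # product_ids is a comma-separated string in the sample output.
            for pid in (p.strip() for p in r["product_ids"].split(",")):
                if pid not in product_ids:
                    problems.append(f"{label}: unknown product {pid}")
            # Assumption: shipped_at / delivered_at may be absent or null,
            # so missing values are skipped rather than treated as errors.
            stamps = [r.get(k) for k in ("ordered_at", "shipped_at", "delivered_at")]
            times = [parse_ts(s) for s in stamps if s]
            if times != sorted(times):
                problems.append(f"{label}: timestamps out of order")
        elif kind == "review":
            label = r["review_id"]
            if r["review_product_id"] not in product_ids:
                problems.append(f"{label}: unknown product {r['review_product_id']}")
            if r["review_customer_id"] not in customers:
                problems.append(f"{label}: unknown customer {r['review_customer_id']}")
    return problems


# Hand-written sample: the review deliberately points at a product that is not present.
sample = [
    {"entityType": "product", "product_id": "PROD-00001"},
    {"entityType": "customer", "customer_id": "CUST-00001",
     "customer_created_at": "2024-03-28T15:21:19.313Z"},
    {"entityType": "order", "order_id": "ORD-00001", "order_customer_id": "CUST-00001",
     "product_ids": "PROD-00001", "ordered_at": "2025-06-22T10:21:51.493Z",
     "shipped_at": "2025-06-26T10:21:51.493Z", "delivered_at": "2025-06-28T10:21:51.493Z"},
    {"entityType": "review", "review_id": "REV-00001",
     "review_product_id": "PROD-00009", "review_customer_id": "CUST-00001"},
]
print(check_integrity(sample))  # ['REV-00001: unknown product PROD-00009']
```

An empty list means every order and review in the input resolved to a product and customer, and every order's timestamps were in sequence.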
Who Uses Synthetic E-Commerce Data and Why
- E-commerce developers — populate Shopify, WooCommerce, or Magento staging environments with realistic test data before launch
- Data engineers — validate ETL pipelines with known-schema e-commerce records that include edge cases (zero-order customers, cancelled orders, one-star reviews)
- Analytics teams — build and demo dashboards with realistic order volumes, customer segments, and product catalogs without exposing production data
- QA engineers — stress-test order processing systems with thousands of orders referencing real product inventories and customer accounts
- Bootcamp instructors — provide students with clean, well-structured datasets for SQL exercises, pandas workshops, and data visualization projects
How It Works
- You configure how many products, customers, orders, and reviews to generate, pick an industry preset and locale, and optionally set a random seed.
- The generator creates products first (with industry-specific categories, brands, and log-normal price distributions), then customers (with segment-weighted lifetime values), then orders (referencing real products and customers, with calculated totals and temporal timestamps), then reviews (with rating-appropriate text templates and referential links).
- In unified mode, all entities go to one dataset with an `entityType` field. In separate mode, products go to the dataset and other entities are saved as JSON in the key-value store.
- The `maxItems` cap is applied after generation to limit total output size.
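The generation order described above (products, then customers, then orders that reference both) can be illustrated with a short sketch. This is not the actor's source code: the function name `generate`, the log-normal parameters, the date offsets, and every status weight other than the 70% delivered share are assumptions chosen for illustration. It shows how a single seeded random generator yields reproducible output, and how deriving each timestamp from the previous one keeps them ordered.

```python
import random
from datetime import datetime, timedelta, timezone

# Only the 70% "delivered" share comes from the feature list;
# the other statuses and their weights are illustrative assumptions.
STATUSES = ["delivered", "shipped", "processing", "cancelled"]
STATUS_WEIGHTS = [70, 15, 10, 5]


def generate(num_products, num_customers, num_orders, seed=None):
    """Build products, customers, and orders; requires at least one product and customer."""
    rng = random.Random(seed)  # one seeded generator drives every draw below
    epoch = datetime(2024, 1, 1, tzinfo=timezone.utc)

    products = [
        # Log-normal prices; mu=3.5 and sigma=0.8 are assumed values.
        {"product_id": f"PROD-{i:05d}", "price": round(rng.lognormvariate(3.5, 0.8), 2)}
        for i in range(1, num_products + 1)
    ]
    customers = [
        {"customer_id": f"CUST-{i:05d}",
         "customer_created_at": epoch + timedelta(days=rng.randint(0, 365))}
        for i in range(1, num_customers + 1)
    ]

    orders = []
    for i in range(1, num_orders + 1):
        customer = rng.choice(customers)
        items = rng.sample(products, k=rng.randint(1, min(3, len(products))))
        # Each timestamp is derived from the previous one, so the sequence cannot invert.
        ordered_at = customer["customer_created_at"] + timedelta(days=rng.randint(1, 200))
        shipped_at = ordered_at + timedelta(days=rng.randint(1, 5))
        delivered_at = shipped_at + timedelta(days=rng.randint(1, 7))
        orders.append({
            "order_id": f"ORD-{i:05d}",
            "order_customer_id": customer["customer_id"],
            "product_ids": ", ".join(p["product_id"] for p in items),
            "subtotal": round(sum(p["price"] for p in items), 2),
            "order_status": rng.choices(STATUSES, weights=STATUS_WEIGHTS)[0],
            "ordered_at": ordered_at,
            "shipped_at": shipped_at,
            "delivered_at": delivered_at,
        })
    return products, customers, orders


# Two runs with the same seed produce identical data.
first = generate(5, 5, 10, seed=42)
second = generate(5, 5, 10, seed=42)
print(first == second)  # True
```

Because orders only ever pick from the already-built `products` and `customers` lists, every reference resolves by construction rather than needing a repair pass afterwards.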
Input
Default run — 100 mixed records
{"numProducts": 20,"numCustomers": 30,"numOrders": 50,"numReviews": 40,"maxItems": 100,"industry": "general","locale": "en","outputFormat": "unified"}
Electronics dataset with deterministic seed
{"numProducts": 50,"numCustomers": 100,"numOrders": 200,"numReviews": 150,"maxItems": 0,"industry": "electronics","seed": 42,"outputFormat": "unified"}
Fashion dataset in separate mode (only products go to the dataset)
{"numProducts": 100,"numCustomers": 50,"numOrders": 80,"numReviews": 60,"maxItems": 0,"industry": "fashion","outputFormat": "separate"}
Input Reference
| Field | Type | Default | Description |
|---|---|---|---|
| `numProducts` | integer | `20` | Number of product records to generate (1–10,000) |
| `numCustomers` | integer | `30` | Number of customer records to generate (1–50,000) |
| `numOrders` | integer | `50` | Number of order records to generate (0–100,000) |
| `numReviews` | integer | `40` | Number of review records to generate (0–100,000) |
| `maxItems` | integer | `100` | Maximum total records across all entity types. Set to `0` for no limit |
| `industry` | string | `general` | Industry preset: `general`, `fashion`, `electronics`, `grocery`, `home_goods` |
| `locale` | string | `en` | Locale for names and addresses: `en`, `de`, `fr`, `ja`, `es` |
| `seed` | integer | `null` | Random seed for deterministic output. Omit for random data each run |
| `outputFormat` | string | `unified` | `unified` puts all entities in one dataset. `separate` puts only products in the dataset and saves the rest to the key-value store |
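If you build input objects programmatically, it can help to check them against the table above before starting a run. The sketch below is a hypothetical client-side helper (`validate_input` is not part of the actor), with defaults, ranges, and allowed values copied from the table. Rejecting unknown fields and negative `maxItems` values are assumptions; the actor's own validation may behave differently.

```python
DEFAULTS = {
    "numProducts": 20, "numCustomers": 30, "numOrders": 50, "numReviews": 40,
    "maxItems": 100, "industry": "general", "locale": "en",
    "seed": None, "outputFormat": "unified",
}
INT_RANGES = {
    "numProducts": (1, 10_000), "numCustomers": (1, 50_000),
    "numOrders": (0, 100_000), "numReviews": (0, 100_000),
}
CHOICES = {
    "industry": {"general", "fashion", "electronics", "grocery", "home_goods"},
    "locale": {"en", "de", "fr", "ja", "es"},
    "outputFormat": {"unified", "separate"},
}


def is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def validate_input(user_input):
    """Merge user_input over the defaults and raise ValueError on invalid values."""
    unknown = set(user_input) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    cfg = {**DEFAULTS, **user_input}

    for field, (low, high) in INT_RANGES.items():
        if not is_int(cfg[field]) or not low <= cfg[field] <= high:
            raise ValueError(f"{field} must be an integer between {low} and {high}")
    if not is_int(cfg["maxItems"]) or cfg["maxItems"] < 0:
        raise ValueError("maxItems must be a non-negative integer (0 means no limit)")
    if cfg["seed"] is not None and not is_int(cfg["seed"]):
        raise ValueError("seed must be an integer or omitted")
    for field, allowed in CHOICES.items():
        if cfg[field] not in allowed:
            raise ValueError(f"{field} must be one of {sorted(allowed)}")
    return cfg


cfg = validate_input({"industry": "electronics", "seed": 42, "maxItems": 0})
print(cfg["numProducts"], cfg["industry"], cfg["seed"])  # 20 electronics 42
```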
Output
Product record
{"entityType": "product","product_id": "PROD-00001","product_name": "Premium Laptops X7K","sku": "SKU-RSJ7NHY5","brand": "NovaTech","category": "Electronics","subcategory": "Laptops","price": 54.56,"cost": 34.63,"weight_kg": 5.68,"rating_avg": 4.2,"review_count": 140,"in_stock": true,"created_at": "2024-03-23T03:33:06.557Z"}
Customer record
{"entityType": "customer","customer_id": "CUST-00001","first_name": "Bonita","last_name": "Tremblay","email": "bonita.tremblay@hotmail.com","phone": "(983) 829-9005","address": "5836 E Main Street","city": "Flagstaff","state": "VT","zip": "75793-8196","country": "US","customer_created_at": "2024-03-28T15:21:19.313Z","lifetime_value": 2450.75,"order_count": 12,"segment": "returning"}
Order record
{"entityType": "order","order_id": "ORD-00001","order_customer_id": "CUST-00017","product_ids": "PROD-00007, PROD-00013, PROD-00002","quantities": "1, 2, 1","subtotal": 326.41,"tax": 28.97,"shipping": 0,"total": 355.38,"order_status": "delivered","ordered_at": "2025-06-22T10:21:51.493Z","shipped_at": "2025-06-26T10:21:51.493Z","delivered_at": "2025-06-28T10:21:51.493Z"}
Review record
{"entityType": "review","review_id": "REV-00001","review_product_id": "PROD-00003","review_customer_id": "CUST-00012","review_rating": 5,"review_title": "Love it!","review_body": "Absolutely love this Premium Tablets A3M! The build quality is outstanding. Would definitely buy again.","helpful_count": 7,"verified_purchase": true,"reviewed_at": "2025-08-15T14:30:22.100Z"}
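In the sample order record, `product_ids` and `quantities` are comma-separated strings rather than arrays, so consumers typically need to split and pair them. The sketch below shows one way to do that, using the sample order's values. The helper names `line_items` and `total_is_consistent` are hypothetical, and the rule that `total` equals `subtotal + tax + shipping` is inferred from the sample record rather than stated anywhere in this document.

```python
def line_items(order):
    """Pair each product ID with its quantity, in the order they appear."""
    ids = [p.strip() for p in order["product_ids"].split(",")]
    quantities = [int(q) for q in order["quantities"].split(",")]
    if len(ids) != len(quantities):
        raise ValueError(f"{order['order_id']}: product_ids and quantities differ in length")
    return list(zip(ids, quantities))


def total_is_consistent(order, tolerance=0.005):
    """Check total against subtotal + tax + shipping, allowing for float rounding."""
    expected = order["subtotal"] + order["tax"] + order["shipping"]
    return abs(expected - order["total"]) < tolerance


# Values copied from the sample order record above.
order = {
    "order_id": "ORD-00001",
    "product_ids": "PROD-00007, PROD-00013, PROD-00002",
    "quantities": "1, 2, 1",
    "subtotal": 326.41, "tax": 28.97, "shipping": 0, "total": 355.38,
}
print(line_items(order))           # [('PROD-00007', 1), ('PROD-00013', 2), ('PROD-00002', 1)]
print(total_is_consistent(order))  # True
```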
Industry Presets
| Preset | Categories | Price Range | Example Brands |
|---|---|---|---|
| `general` | Electronics, Clothing, Home & Kitchen, Sports, Books | $5–$500 | Apex, NovaTech, Zenith |
| `fashion` | Women's Clothing, Men's Clothing, Shoes, Accessories, Sportswear | $15–$800 | Luxe & Co, Urban Thread, Maison Noir |
| `electronics` | Computers, Mobile, Audio, Smart Home, Gaming | $10–$2,500 | TechVault, PixelForge, Quantum |
| `grocery` | Fresh Produce, Dairy & Eggs, Bakery, Beverages, Pantry | $1–$50 | Green Valley, Harvest Moon, Farm Fresh |
| `home_goods` | Furniture, Decor, Kitchen, Bedding, Garden | $8–$1,200 | HomeStead, Craftwell, Willow & Oak |
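To see how generated prices relate to a preset's listed range, a check along the following lines may be useful. The ranges are copied from the table above; `prices_outside_preset` is a hypothetical helper, and the sample products are hand-written. This document does not say whether the listed range is a hard limit or only a typical span, so treat any IDs this returns as something to inspect rather than as errors.

```python
# Price ranges in USD, copied from the Industry Presets table.
PRICE_RANGES = {
    "general": (5, 500),
    "fashion": (15, 800),
    "electronics": (10, 2500),
    "grocery": (1, 50),
    "home_goods": (8, 1200),
}


def prices_outside_preset(products, industry):
    """Return the IDs of products whose price falls outside the preset's listed range."""
    low, high = PRICE_RANGES[industry]
    return [p["product_id"] for p in products if not low <= p["price"] <= high]


# Hand-written sample products, not actual generator output.
products = [
    {"product_id": "PROD-00001", "price": 54.56},
    {"product_id": "PROD-00002", "price": 0.99},
]
print(prices_outside_preset(products, "general"))  # ['PROD-00002']
```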
Performance
This actor generates data in-memory with no network calls. Approximate run times:
- 100 records: < 1 second
- 1,000 records: 1–2 seconds
- 10,000 records: 5–10 seconds
- 100,000 records: 30–60 seconds
Memory usage stays under 256 MB for datasets up to 100,000 records.
Need More Features?
If you need additional entity types (inventory, shipping carriers, promotions), custom field mappings, or integration with specific e-commerce platforms, file an issue or get in touch. We are always open to extending the generator to suit your needs.