Synthetic Data: Test Your Generated App with Sample Data

Generate domain-aligned sample data from your UML model so you can exercise list views, forms, and APIs—without hand-writing fixtures or using irrelevant seed files.

A generated full-stack app is only convincing when you can click through real flows. An empty database makes every screen look unfinished: blank tables, validation paths that never get exercised, and APIs that never return the edge cases you care about. Hand-written fixtures help, but they drift from your model and take time to maintain.

Synthetic data in EcosystemCode is built to match your domain design—the same entities and relationships you modeled—so you can test how the app actually behaves before you invest in production data and integrations.

Why generic seed data falls short

Typical approaches—static CSV files, ad hoc inserts, or copy-pasted JSON—often miss the point:

  • Column names and shapes may not match your generated schemas.
  • Relationships (foreign keys, references between aggregates) are easy to get wrong.
  • Volume is arbitrary: too little to stress pagination, or too much to iterate quickly.

What you want is data that respects your class model and database target, so list pages, detail views, and create/update flows all see coherent records.
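One way to see what "coherent records" means in practice is a quick referential-integrity check on generated fixtures before loading them. The sketch below is illustrative, not EcosystemCode output: the file contents, table names (`customers`, `orders`), and the `customer_id` column are assumptions standing in for whatever your model defines.

```python
# Hypothetical sketch: verify that generated CSV fixtures stay referentially
# coherent before loading. Names (customers, orders, customer_id) are
# illustrative, not actual EcosystemCode output names.
import csv
import io

customers_csv = """id,name
1,Acme Corp
2,Globex
"""

orders_csv = """id,customer_id,total
10,1,99.50
11,2,14.00
12,1,250.00
"""

def check_foreign_keys(parent_csv, child_csv, parent_key, child_fk):
    """Return child rows whose foreign key has no matching parent row."""
    parent_ids = {row[parent_key] for row in csv.DictReader(io.StringIO(parent_csv))}
    return [row for row in csv.DictReader(io.StringIO(child_csv))
            if row[child_fk] not in parent_ids]

orphans = check_foreign_keys(customers_csv, orders_csv, "id", "customer_id")
print(f"orphaned order rows: {len(orphans)}")  # prints: orphaned order rows: 0
```

If that count is ever non-zero with hand-written fixtures, you are looking at exactly the drift this feature is meant to avoid.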

What “domain-aligned” means here

Synthetic data is generated in the context of your project’s database selection and UML-backed entities. You choose how many rows to create per entity, which output formats you need, and optional extras such as DDL and import scripts—so the result fits how you plan to load and verify data in dev or staging.

Common output formats include CSV, JSON, and SQL, with options such as DDL, an import script, and helper scripts (for example probing the database port from your app’s environment configuration). Locale settings help generated strings feel plausible for names, addresses, and similar fields when that matters for UI review.
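As a rough illustration of the kind of helper script mentioned above, here is a minimal sketch that reads a database port from a `.env`-style file. The variable name `DB_PORT` and the file contents are assumptions; match them to whatever your generated app's environment configuration actually uses.

```python
# Minimal sketch (assumed variable names): resolve the database port from
# simple KEY=VALUE lines in a .env-style configuration.
def read_env_var(env_text, key, default=None):
    """Parse KEY=VALUE lines; comments and blank lines are ignored."""
    for line in env_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        k, _, v = line.partition("=")
        if k.strip() == key:
            return v.strip().strip('"').strip("'")
    return default

env_example = """# app settings
DB_HOST=localhost
DB_PORT=5432
"""

port = int(read_env_var(env_example, "DB_PORT", "5432"))
print(port)  # prints: 5432
```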

The goal is not random noise: it is structured sample data you can use to see how your generated app works end to end.

A practical workflow

  1. Stabilize the model
    Make sure your class diagram (and related views) reflect the entities you care about. If you already use pre-generation validation, run those checks so you are not debugging data on top of inconsistent diagrams.

  2. Generate the application baseline
    Produce your runnable stack (UI, API, database layer) so there is a real surface to test—see code generation for the full picture.

  3. Open Synthetic Data
    In the product, use the Synthetic Data flow: confirm the target database (aligned with your project’s database selection), set rows per entity to match how heavy you want lists and joins to be, and select output formats (for example CSV, JSON, and SQL) plus any optional artifacts you need for your environment.

  4. Generate and load
    Run generation, then use your DDL/import script or SQL as appropriate to load data into the database your app uses. If your project includes helper scripts (such as resolving the DB port from .env), use them so local runs stay consistent.
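The load step above can be sketched end to end with SQLite standing in for your project's actual database target. The DDL, table, and column names here are hypothetical placeholders, not the artifacts EcosystemCode emits; the point is only the shape of the flow: apply DDL, then bulk-insert parsed CSV rows.

```python
# Hedged sketch of step 4 with SQLite as a stand-in target: apply a DDL
# artifact, then bulk-insert rows parsed from a generated CSV file.
# Table and column names are illustrative assumptions.
import csv
import io
import sqlite3

ddl = "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL);"
rows_csv = "id,name\n1,Acme Corp\n2,Globex\n"

conn = sqlite3.connect(":memory:")
conn.execute(ddl)  # create the schema the generated app expects
records = [(r["id"], r["name"]) for r in csv.DictReader(io.StringIO(rows_csv))]
conn.executemany("INSERT INTO customer (id, name) VALUES (?, ?)", records)
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
print(f"loaded {count} rows")  # prints: loaded 2 rows
```

With a real target such as PostgreSQL or MySQL, the generated import script would typically take the place of this inline loop.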

  5. Exercise the app
    Walk critical paths: list and filter, open detail pages, create and edit records, and spot-check APIs (for example via OpenAPI/Swagger if your stack exposes it). Note gaps in the model or UX while the cost of change is still low.

When this pays off

  • Manual QA and demos: Stakeholders see realistic screens instead of empty states.
  • Integration and spike work: You can validate assumptions about cardinality and navigation before wiring external systems.
  • Onboarding: New developers get a populated environment without tribal knowledge about “the right” test rows.

Limits and expectations

Synthetic data is for testing and demonstration; it is not a substitute for privacy, compliance, or performance testing against production-grade data. Treat it as a fast way to answer "does this generated baseline behave the way we intended?"—not as production-ready content.


Next steps