How is privacy protected in generated data?

Real PII is never used. Every record is synthetic: realistic in shape but fake, which is what lets the tool serve requests for HIPAA-safe data such as healthcare patient records.

What output formats are supported?

JSON is the default; CSV and SQL INSERT formats are also supported with proper escaping.

How can I tailor the data scope?

Describe the data you need via $ARGUMENTS. The project data model is read from qa-artifacts/.qa-config.json, and output is saved under qa-artifacts/test-data/.

test-data

npx machina-cli add skill cyberwalk3r/qa-toolkit/test-data --openclaw


Test Data Generator

Generate synthetic test data. Read qa-artifacts/.qa-config.json for project context.
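A minimal sketch of the context-loading step, assuming the config is a plain JSON object. The skill gives only the path, not the schema, so `load_project_context` is a hypothetical helper, and falling back to an empty dict when the file is missing or malformed is an assumption about reasonable behavior.

```python
import json
from pathlib import Path

# Path taken from the skill description; the file's keys are not documented there.
CONFIG_PATH = Path("qa-artifacts/.qa-config.json")


def load_project_context(path: Path = CONFIG_PATH) -> dict:
    """Return the parsed config, or an empty dict if it is missing or invalid."""
    try:
        with path.open(encoding="utf-8") as fh:
            data = json.load(fh)
    except (FileNotFoundError, json.JSONDecodeError):
        return {}
    # Only a JSON object is usable as project context; ignore anything else.
    return data if isinstance(data, dict) else {}
```

With an empty result, the data model has to come entirely from the description passed in `$ARGUMENTS`.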

Input

Accept via $ARGUMENTS: description of what data is needed. Examples:

"Generate 20 test users with varied roles, names, and email formats"
"Create sample order data with different statuses and edge cases"
"I need healthcare patient records for testing — HIPAA-safe fake data"

Workflow

1. Understand the data model from the description or project context
2. Generate data with variety:
   - Normal values (80%)
   - Edge cases (15%): empty strings, max length, special characters, unicode
   - Boundary values (5%): 0, negative, MAX_INT, very long strings
3. Ensure privacy safety: never use real PII; generate realistic but fake data
4. Output in requested format (default: JSON)
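The 80/15/5 mix in the workflow can be sketched as a per-record category plan. This is one possible reading, not the skill's actual implementation: `plan_categories` is a hypothetical helper that uses rounded quotas so small batches still hit the stated proportions, then shuffles the labels.

```python
import random
from typing import Optional

# Category weights from the workflow: normal 80%, edge 15%, boundary 5%.
MIX = {"normal": 0.80, "edge": 0.15, "boundary": 0.05}


def plan_categories(count: int, seed: Optional[int] = None) -> list[str]:
    """Return one category label per record using rounded 80/15/5 quotas."""
    edge = round(count * MIX["edge"])
    boundary = round(count * MIX["boundary"])
    normal = count - edge - boundary  # remainder goes to normal values
    labels = ["normal"] * normal + ["edge"] * edge + ["boundary"] * boundary
    # A seeded shuffle keeps the plan reproducible between runs.
    random.Random(seed).shuffle(labels)
    return labels
```

For a request of 20 users this gives 16 normal, 3 edge, and 1 boundary record. Sampling each label independently with `random.choices` would also work, but the proportions would only hold on average.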

Data Dimensions

Names: multicultural, varied lengths, special characters (O'Brien, José, 田中)
Emails: valid, edge cases (plus addressing, long domains, unicode)
Dates: past, future, timezone edge cases, leap years, epoch boundaries
Numbers: zero, negative, decimal precision, very large
Addresses: international formats, multi-line, special postal codes
Strings: empty, whitespace-only, max-length, HTML/script injection strings
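As an illustration, a few of these dimensions can be collected into a lookup of edge values. The specific choices are assumptions: 255 as "max length", the signed 32-bit limit as "very large", and 2024-02-29 as the leap day. `EDGE_VALUES` and `edge_values_for` are hypothetical names.

```python
from datetime import datetime, timezone

INT32_MAX = 2**31 - 1  # assumed "very large" value; adjust to the target system

EDGE_VALUES = {
    "names": ["O'Brien", "José", "田中"],
    "emails": ["tanaka+test@example.co.jp"],  # plus addressing
    "dates": [
        datetime(1970, 1, 1, tzinfo=timezone.utc),            # Unix epoch start
        datetime.fromtimestamp(INT32_MAX, tz=timezone.utc),   # 32-bit time_t limit
        datetime(2024, 2, 29, tzinfo=timezone.utc),           # leap day
    ],
    "numbers": [0, -1, 0.1 + 0.2, INT32_MAX],  # 0.1 + 0.2 exposes float precision
    "strings": ["", "   ", "x" * 255, "<script>alert(1)</script>"],
}


def edge_values_for(dimension: str) -> list:
    """Return the edge values for a dimension, or an empty list if unknown."""
    return list(EDGE_VALUES.get(dimension, []))
```

The injection string is inert test input; it is there to check that the system under test escapes it, not to attack anything.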

Output Formats

JSON (default):

[
  { "id": 1, "name": "Alice Johnson", "email": "alice@example.com", ... },
  { "id": 2, "name": "田中太郎", "email": "tanaka+test@example.co.jp", ... }
]

CSV: Comma-separated with header row

SQL INSERT: Ready-to-execute INSERT statements with proper escaping
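A sketch of what "proper escaping" can look like for the two non-default formats, assuming standard SQL single-quote doubling and trusted table and column names. The function names are hypothetical, and some dialects (MySQL with default settings, for example) also treat backslashes specially, which this sketch does not handle.

```python
import csv
import io


def to_csv(records: list[dict]) -> str:
    """Render records as CSV with a header row; the csv module handles quoting."""
    if not records:
        return ""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=list(records[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


def sql_literal(value) -> str:
    """Format a Python value as a SQL literal, doubling single quotes in text."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):  # checked before int, since bool subclasses int
        return "TRUE" if value else "FALSE"
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"


def to_sql_inserts(table: str, records: list[dict]) -> str:
    """Build one INSERT per record; identifiers are assumed to be trusted."""
    lines = []
    for rec in records:
        cols = ", ".join(rec.keys())
        vals = ", ".join(sql_literal(v) for v in rec.values())
        lines.append(f"INSERT INTO {table} ({cols}) VALUES ({vals});")
    return "\n".join(lines)
```

A name such as `O'Brien` comes out as `'O''Brien'` in SQL, and a CSV field containing a comma is wrapped in double quotes.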

For domain-specific data patterns, read references/domain-data.md.

Save

Save to qa-artifacts/test-data/data-YYYY-MM-DD-<description>.json (or .csv / .sql)
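The naming pattern above can be sketched as follows. The slug rule (lowercase, non-alphanumerics collapsed to hyphens) is an assumption, since the skill only specifies `<description>`, and `output_path` is a hypothetical helper.

```python
import re
from datetime import date
from pathlib import Path
from typing import Optional

OUTPUT_DIR = Path("qa-artifacts/test-data")
EXTENSIONS = {"json": ".json", "csv": ".csv", "sql": ".sql"}


def output_path(
    description: str, fmt: str = "json", today: Optional[date] = None
) -> Path:
    """Build data-YYYY-MM-DD-<description>.<ext> under the output directory."""
    # Assumed slug rule: lowercase, runs of other characters become one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-") or "data"
    day = (today or date.today()).isoformat()
    return OUTPUT_DIR / f"data-{day}-{slug}{EXTENSIONS[fmt]}"
```

Passing `today` explicitly keeps the result deterministic in tests; an unsupported format raises `KeyError` rather than silently writing a file with the wrong extension.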

Suggested Next Steps

After generating test data, suggest:

"Use this data in API tests (/qa-toolkit:api-test) or E2E tests (/qa-toolkit:e2e-test)."

Source

https://github.com/cyberwalk3r/qa-toolkit/blob/main/skills/test-data/SKILL.md

Overview

Generates realistic but fake test data for JSON, CSV, or SQL outputs. It reads project context from qa-artifacts/.qa-config.json and creates diverse records including names, emails, dates, numbers, addresses, and strings while strictly avoiding real PII.

How This Skill Works

The tool infers the data model from the description or project context, then generates records with a standard mix: normal values (80%), edge cases (15%), and boundary values (5%). It enforces privacy by using synthetic data and outputs in the requested format, defaulting to JSON.

When to Use It

Generate 20 test users with varied roles, names, and email formats
Create sample order data with different statuses and edge cases
Produce HIPAA-safe healthcare patient data for testing
Build multilingual datasets with international addresses for localization testing
Create large datasets with mixed numeric and date edge cases for performance testing

Quick Start

Step 1: Read qa-artifacts/.qa-config.json to understand project context and requirements
Step 2: Describe the data you need via $ARGUMENTS
Step 3: Run the generator to output JSON by default or choose CSV/SQL and save to qa-artifacts/test-data/data-YYYY-MM-DD-<description>.<ext>

Best Practices

Define the target schema from qa-artifacts/.qa-config.json before generation
Distribute values to include normal, edge, and boundary cases (80/15/5)
Always enforce privacy: use fake data and never real PII
Validate a sample of the output against the expected schema
Save files to qa-artifacts/test-data/data-YYYY-MM-DD-<description>.<ext>
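The validation practice above might look like this in its simplest form. The schema format (field name mapped to a Python type) is an assumption made for illustration; the skill does not define how the expected schema is expressed, and `validate_records` is a hypothetical helper.

```python
def validate_records(records: list[dict], schema: dict) -> list[str]:
    """Return readable problems; an empty list means the sample passed."""
    problems = []
    for i, rec in enumerate(records):
        # Report required fields that are absent from the record.
        for field in sorted(schema.keys() - rec.keys()):
            problems.append(f"record {i}: missing field '{field}'")
        # Report present fields whose value has the wrong type.
        for field, expected in schema.items():
            if field in rec and not isinstance(rec[field], expected):
                actual = type(rec[field]).__name__
                problems.append(
                    f"record {i}: '{field}' is {actual}, expected {expected.__name__}"
                )
    return problems
```

Deliberate edge and boundary records (an empty name, say) still pass a type-only check like this one, which is the point: the shape stays valid while the values stress the system under test.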

Example Use Cases

JSON: 20 user records with varied names and emails
CSV: sample orders with statuses and edge cases
SQL INSERT: HIPAA-safe patient records
JSON: dataset with international addresses and unicode
JSON: large dataset for load and performance testing

Frequently Asked Questions
