test-data
npx machina-cli add skill cyberwalk3r/qa-toolkit/test-data --openclaw
Test Data Generator
Generate synthetic test data. Read qa-artifacts/.qa-config.json for project context.
Input
Accept via $ARGUMENTS: description of what data is needed. Examples:
- "Generate 20 test users with varied roles, names, and email formats"
- "Create sample order data with different statuses and edge cases"
- "I need healthcare patient records for testing — HIPAA-safe fake data"
Workflow
- Understand the data model from the description or project context
- Generate data with variety:
  - Normal values (80%)
  - Edge cases (15%): empty strings, max length, special characters, unicode
  - Boundary values (5%): 0, negative, MAX_INT, very long strings
- Ensure privacy safety — never use real PII; generate realistic but fake data
- Output in requested format (default: JSON)
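As a rough illustration of the 80/15/5 mix in the workflow above, the sketch below plans how many records fall into each category and then fills them from small value pools. The function names, the pools, and the choice of `2**31 - 1` for MAX_INT are assumptions made for this example; the skill itself does not prescribe an implementation.

```python
import json
import random

# Hypothetical value pools; the example values echo the workflow list above.
POOLS = {
    "normal": ["Alice Johnson", "Priya Patel", "Carlos Mendes"],
    "edge": ["", "O'Brien", "田中太郎", "x" * 255],
    "boundary": [0, -1, 2**31 - 1, "y" * 10_000],  # assumes a 32-bit MAX_INT
}


def plan_categories(count: int, seed: int = 0) -> list[str]:
    """Return a shuffled list of category labels in roughly an 80/15/5 split."""
    # Integer round-half-up avoids floating-point surprises for small counts.
    edge = (count * 15 + 50) // 100
    boundary = (count * 5 + 50) // 100
    normal = count - edge - boundary
    plan = ["normal"] * normal + ["edge"] * edge + ["boundary"] * boundary
    random.Random(seed).shuffle(plan)
    return plan


def generate_records(count: int, seed: int = 0) -> list[dict]:
    """Build records whose values are drawn from the pool of each planned category."""
    rng = random.Random(seed)
    return [
        {"id": i, "category": category, "value": rng.choice(POOLS[category])}
        for i, category in enumerate(plan_categories(count, seed), start=1)
    ]


if __name__ == "__main__":
    # JSON is the default output format; ensure_ascii=False keeps names readable.
    print(json.dumps(generate_records(20), ensure_ascii=False, indent=2))
```

Planning exact counts up front, rather than sampling each record independently, keeps small datasets from accidentally containing no edge or boundary cases at all.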
Data Dimensions
- Names: multicultural, varied lengths, special characters (O'Brien, José, 田中)
- Emails: valid, edge cases (plus addressing, long domains, unicode)
- Dates: past, future, timezone edge cases, leap years, epoch boundaries
- Numbers: zero, negative, decimal precision, very large
- Addresses: international formats, multi-line, special postal codes
- Strings: empty, whitespace-only, max-length, HTML/script injection strings
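One way to keep these dimensions reusable is a small catalog keyed by dimension. The sketch below is illustrative only: the dictionary name, the helper, and the specific sample values are assumptions, chosen to match the categories listed above and to use the reserved `example.com` domain.

```python
from datetime import date, datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Hypothetical catalog of edge-case values, one list per data dimension.
EDGE_CASES = {
    "names": ["O'Brien", "José", "田中", "A", "N" * 100],
    "emails": [
        "alice@example.com",               # plain valid address
        "alice+test@example.com",          # plus addressing
        "bob@" + "a" * 63 + ".example.com",  # 63-character DNS label (the maximum)
    ],
    "dates": [
        date(2024, 2, 29).isoformat(),                            # leap day
        EPOCH.isoformat(),                                        # Unix epoch start
        (EPOCH + timedelta(seconds=2**31 - 1)).isoformat(),       # 32-bit epoch limit
    ],
    "numbers": [0, -1, 0.1 + 0.2, 2**63 - 1],
    "strings": ["", "   ", "s" * 255, "<script>alert(1)</script>"],
}


def edge_values(dimension: str) -> list:
    """Return a copy of the edge-case values for one dimension."""
    if dimension not in EDGE_CASES:
        raise KeyError(f"unknown dimension: {dimension!r}")
    return list(EDGE_CASES[dimension])
```

Returning a copy means callers can shuffle or trim the list without mutating the shared catalog.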
Output Formats
JSON (default):
[
{ "id": 1, "name": "Alice Johnson", "email": "alice@example.com", ... },
{ "id": 2, "name": "田中太郎", "email": "tanaka+test@example.co.jp", ... }
]
CSV: Comma-separated with header row
SQL INSERT: Ready-to-execute INSERT statements with proper escaping
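A minimal sketch of the CSV and SQL INSERT outputs, assuming records are flat dictionaries that share the same keys. The helper names and the `users` table are hypothetical. The SQL escaping shown is the standard doubling of single quotes; column and table names are not escaped here and are assumed to be trusted identifiers.

```python
import csv
import io


def to_csv(records: list[dict]) -> str:
    """Serialize records as CSV with a header row taken from the first record."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(records[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


def sql_literal(value) -> str:
    """Render a Python value as a SQL literal, doubling single quotes in strings."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):  # check bool before int: bool is a subclass of int
        return "TRUE" if value else "FALSE"
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"


def to_sql_inserts(table: str, records: list[dict]) -> str:
    """Build one INSERT statement per record."""
    statements = []
    for record in records:
        columns = ", ".join(record.keys())
        values = ", ".join(sql_literal(v) for v in record.values())
        statements.append(f"INSERT INTO {table} ({columns}) VALUES ({values});")
    return "\n".join(statements)


if __name__ == "__main__":
    sample = [{"id": 1, "name": "O'Brien"}, {"id": 2, "name": "Doe, Jane"}]
    print(to_csv(sample))
    print(to_sql_inserts("users", sample))
```

Names such as O'Brien are a useful check here: the CSV writer leaves the apostrophe alone, while the SQL output must double it.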
For domain-specific data patterns, read references/domain-data.md.
Save
Save to qa-artifacts/test-data/data-YYYY-MM-DD-<description>.json (or .csv / .sql)
Suggested Next Steps
After generating test data, suggest:
- "Use this data in API tests (/qa-toolkit:api-test) or E2E tests (/qa-toolkit:e2e-test)."
Source
git clone https://github.com/cyberwalk3r/qa-toolkit/blob/main/skills/test-data/SKILL.md
Overview
Generates realistic but fake test data for JSON, CSV, or SQL outputs. It reads project context from qa-artifacts/.qa-config.json and creates diverse records including names, emails, dates, numbers, addresses, and strings while strictly avoiding real PII.
How This Skill Works
The tool infers the data model from the description or project context, then generates records with a standard mix: normal values (80%), edge cases (15%), and boundary values (5%). It enforces privacy by using synthetic data and outputs in the requested format, defaulting to JSON.
When to Use It
- Generate 20 test users with varied roles, names, and email formats
- Create sample order data with different statuses and edge cases
- Produce HIPAA-safe healthcare patient data for testing
- Build multilingual datasets with international addresses for localization testing
- Create large datasets with mixed numeric and date edge cases for performance testing
Quick Start
- Step 1: Read qa-artifacts/.qa-config.json to understand project context and requirements
- Step 2: Describe the data you need via $ARGUMENTS
- Step 3: Run the generator to output JSON by default or choose CSV/SQL and save to qa-artifacts/test-data/data-YYYY-MM-DD-<description>.<ext>
Best Practices
- Define the target schema from qa-artifacts/.qa-config.json before generation
- Distribute values to include normal, edge, and boundary cases (80/15/5)
- Always enforce privacy: use fake data and never include real PII
- Validate a sample of the output against the expected schema
- Save files to qa-artifacts/test-data/data-YYYY-MM-DD-<description>.<ext>
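Validating a sample against the expected schema can be as simple as checking field presence and types. The sketch below assumes a hypothetical schema expressed as a mapping from field name to Python type; the actual structure of qa-artifacts/.qa-config.json is not described here, so this is not tied to it.

```python
import random


def validate_sample(
    records: list[dict],
    schema: dict[str, type],
    sample_size: int = 5,
    seed: int = 0,
) -> list[str]:
    """Check a random sample of records and return a list of problems found."""
    rng = random.Random(seed)
    sample = rng.sample(records, min(sample_size, len(records)))
    errors = []
    for record in sample:
        label = record.get("id", "?")
        for field, expected in schema.items():
            if field not in record:
                errors.append(f"record {label}: missing field {field!r}")
            elif not isinstance(record[field], expected):
                actual = type(record[field]).__name__
                errors.append(
                    f"record {label}: {field!r} is {actual}, expected {expected.__name__}"
                )
    return errors


if __name__ == "__main__":
    schema = {"id": int, "name": str, "email": str}  # hypothetical user schema
    users = [
        {"id": 1, "name": "Alice Johnson", "email": "alice@example.com"},
        {"id": 2, "name": "田中太郎"},  # email deliberately missing
    ]
    for problem in validate_sample(users, schema):
        print(problem)
```

Deliberate boundary values may fail a strict type check by design, so an error list from this kind of check is a prompt for review, not automatically a defect.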
Example Use Cases
- JSON: 20 user records with varied names and emails
- CSV: sample orders with statuses and edge cases
- SQL INSERT: HIPAA-safe patient records
- JSON: dataset with international addresses and unicode
- JSON: large dataset for load and performance testing