example-data-processor
Install with:
npx machina-cli add skill fkesheh/skill-mcp/example-skill --openclaw
Example Data Processor
This skill demonstrates a complete skill structure with scripts, references, and proper documentation.
What This Skill Does
Processes CSV data files with these capabilities:
- Clean and validate data
- Transform columns
- Generate summary statistics
- Export results
Usage
Process a CSV file
To process a CSV file:
Process the data in myfile.csv
The skill will:
- Read the CSV file
- Clean the data (remove nulls, fix formats)
- Generate statistics
- Output a summary report
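The clean-then-summarize flow above can be sketched with the standard library. The function names and inline data here are illustrative, not the actual API of scripts/process_csv.py:

```python
import csv
import io
from statistics import mean

def clean_rows(rows):
    """Strip whitespace from values and drop rows with empty cells."""
    cleaned = []
    for row in rows:
        stripped = {k: v.strip() for k, v in row.items()}
        if all(stripped.values()):
            cleaned.append(stripped)
    return cleaned

def summarize(rows, column):
    """Return count and mean for a numeric column."""
    values = [float(r[column]) for r in rows]
    return {"count": len(values), "mean": mean(values)}

# Inline CSV stands in for myfile.csv; the row for "bob" has a null amount
raw = "name,amount\nalice, 10\nbob,\ncarol,20\n"
rows = list(csv.DictReader(io.StringIO(raw)))
clean = clean_rows(rows)          # drops the "bob" row
report = summarize(clean, "amount")
```

The real script also writes the report to disk; this sketch only shows the in-memory steps.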
Custom Processing
For custom processing options:
Process sales.csv and group by region
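A group-by request like the one above reduces to aggregating a numeric column per key. A minimal sketch, where group_totals is a hypothetical helper rather than part of the skill:

```python
import csv
import io
from collections import defaultdict

def group_totals(rows, key, value):
    """Sum a numeric column for each distinct value of a key column."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += float(row[value])
    return dict(totals)

# Inline CSV stands in for sales.csv
raw = "region,sales\nnorth,100\nsouth,50\nnorth,25\n"
rows = csv.DictReader(io.StringIO(raw))
by_region = group_totals(rows, "region", "sales")
```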
Scripts
scripts/process_csv.py - Main data processing script
- Reads CSV files
- Applies transformations
- Generates output
scripts/fetch_data.py - API data fetcher (demonstrates uv dependencies)
- Fetches data from APIs using requests
- Beautiful output formatting with rich
- Auto-installs dependencies via uv inline metadata (PEP 723)
- No manual pip install needed!
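PEP 723 inline metadata is a specially formatted comment block at the top of the script. A sketch of what the header in scripts/fetch_data.py plausibly looks like; the exact version constraint and dependency list are assumptions:

```python
# /// script
# requires-python = ">=3.10"
# dependencies = [
#     "requests",
#     "rich",
# ]
# ///
#
# Running the file with `uv run scripts/fetch_data.py` makes uv read the
# block above, install requests and rich into an ephemeral environment,
# and then execute the script body - no manual pip install.
```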
scripts/validate.py - Data validation script
- Checks data quality
- Reports issues
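Data validation typically means checking each row against a required schema and collecting one message per problem. A minimal sketch with illustrative names, not the actual interface of scripts/validate.py:

```python
import csv
import io

def validate_rows(rows, required):
    """Return a list of human-readable issues for missing required fields."""
    issues = []
    for i, row in enumerate(rows, start=1):
        for field in required:
            if not row.get(field, "").strip():
                issues.append(f"row {i}: missing {field}")
    return issues

# Inline CSV with a missing name in row 2
raw = "name,amount\nalice,10\n,20\n"
rows = csv.DictReader(io.StringIO(raw))
issues = validate_rows(rows, ["name", "amount"])
```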
Configuration
The scripts use these environment variables:
- OUTPUT_DIR - Where to save processed files (optional)
- MAX_ROWS - Maximum rows to process (optional)
Set them using:
Set OUTPUT_DIR to /path/to/output
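In Python, the scripts would read these variables through os.environ. A sketch with assumed defaults; the actual defaults in the scripts may differ:

```python
import os

def load_config(env=os.environ):
    """Read the optional settings; max_rows of 0 means no row limit."""
    return {
        "output_dir": env.get("OUTPUT_DIR", "./output"),
        "max_rows": int(env.get("MAX_ROWS", "0")),
    }

# Passing a dict makes the function easy to test without touching the
# real environment
cfg = load_config({"OUTPUT_DIR": "/tmp/out", "MAX_ROWS": "500"})
```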
Reference Documentation
For detailed information:
- Data Formats - Supported data formats and schemas
- Examples - Common usage examples
Troubleshooting
"File not found" error:
- Ensure the CSV file exists
- Provide the full path to the file
"Invalid data" error:
- Check the CSV format matches expected schema
- See Data Formats for requirements
Overview
Example Data Processor handles CSV files end to end: it cleans and validates data, transforms columns, and generates summary statistics with exportable results. It suits workflows that need reliable CSV cleanup and quick basic analysis.
How This Skill Works
The core workflow lives in scripts/process_csv.py, which reads a CSV file, applies transformations, and generates output. scripts/validate.py performs data quality checks, and environment variables such as OUTPUT_DIR and MAX_ROWS control the output location and processing scope. scripts/fetch_data.py demonstrates API data integration and auto-installs its dependencies via uv inline metadata (PEP 723).
When to Use It
- Clean messy CSV data with nulls or bad formats
- Transform or normalize columns
- Generate quick summary statistics and a report
- Export results for sharing or reporting
- Handle custom processing like group by region or other aggregations
Quick Start
- Step 1: Save your CSV file and set OUTPUT_DIR and MAX_ROWS if needed
- Step 2: Run the processor with a request like "Process myfile.csv" or "Process sales.csv and group by region"
- Step 3: Check OUTPUT_DIR for the summary report and transformed data
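Assuming you work from a shell, the optional settings from Step 1 can be exported before invoking the skill; the paths and limits here are examples:

```shell
# Optional settings read by the processing scripts
export OUTPUT_DIR=/tmp/processed
export MAX_ROWS=10000
```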
Best Practices
- Verify the CSV file exists and matches the expected schema
- Clean data by removing nulls and fixing formats
- Define MAX_ROWS to limit processing on large datasets
- Apply transformations in a repeatable pipeline
- Review the summary report and saved outputs in OUTPUT_DIR
Example Use Cases
- Process the data in myfile.csv
- Process sales.csv and group by region
- Run the main processor script via scripts/process_csv.py
- Use scripts/validate.py to check data quality before processing
- Export the final summary report to OUTPUT_DIR