realign-meta-framework
npx machina-cli add skill akaszubski/autonomous-dev/realign-meta-framework --openclaw

Realignment Meta-Framework
Shared framework for all realignment training workflows. Provides the common pipeline template, quality thresholds, and performance optimization guidance used across all domain-specific realignment workflows.
7-Stage Pipeline Template
All realignment workflows follow this common pipeline:
- Capability Assessment: Evaluate current model capabilities and identify gaps
- Data Preparation: Collect and prepare domain-specific training data
- SFT Preparation: Supervised fine-tuning on curated examples
- Preference/Reward Modeling: Domain-specific optimization (DPO, RLVR, SRF, etc.)
- Iterative Training: Multi-round training with quality gates
- Evaluation & Monitoring: Comprehensive evaluation against baselines
- Deployment & Validation: Final validation and deployment readiness
Quality Thresholds
| Metric | Minimum | Target | Critical |
|---|---|---|---|
| Task accuracy | 85% | 92% | < 80% triggers rollback |
| Capability retention | 95% | 98% | < 90% triggers rollback |
| Data quality score | 0.8 | 0.9 | < 0.7 blocks training |
| Evaluation coverage | 80% | 95% | < 70% blocks deployment |
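The table above maps directly onto a gate function. The sketch below is a hypothetical helper (the `gate` function and `THRESHOLDS` dict are not part of the framework); it assumes percentages are expressed as fractions and that the Critical column triggers a hard block while values between critical and minimum merely warn.

```python
# (minimum, target, critical) per metric, mirroring the quality-threshold table.
THRESHOLDS = {
    "task_accuracy":        (0.85, 0.92, 0.80),
    "capability_retention": (0.95, 0.98, 0.90),
    "data_quality_score":   (0.80, 0.90, 0.70),
    "evaluation_coverage":  (0.80, 0.95, 0.70),
}


def gate(metric: str, value: float) -> str:
    """Classify a metric value against its thresholds.

    Returns "block" below critical (rollback / block training),
    "warn" between critical and minimum, "pass" between minimum
    and target, and "target_met" at or above target.
    """
    minimum, target, critical = THRESHOLDS[metric]
    if value < critical:
        return "block"
    if value < minimum:
        return "warn"
    return "pass" if value < target else "target_met"
```

For example, a task accuracy of 79% falls below the 80% critical threshold and would trigger a rollback.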
Capability Regression Detection
- Run baseline evaluation suite before and after each training stage
- Track per-capability scores across training rounds
- Automatic rollback if any capability drops > 5% from baseline
- Cross-domain contamination checks between training stages
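The automatic-rollback rule can be expressed as a small comparison over per-capability scores. This is a minimal sketch under one assumption: the "> 5%" drop is read as a relative drop from the baseline score (the text does not say whether it is relative or absolute), and the function name is hypothetical.

```python
def needs_rollback(baseline: dict[str, float], current: dict[str, float],
                   max_drop: float = 0.05) -> list[str]:
    """Return capabilities whose score dropped more than max_drop
    (relative to baseline), i.e. the capabilities that should trigger
    an automatic rollback."""
    regressed = []
    for cap, base in baseline.items():
        drop = (base - current.get(cap, 0.0)) / base
        if drop > max_drop:
            regressed.append(cap)
    return regressed
```

Running this after every training stage, against the pre-training baseline suite, gives the per-stage regression check described above.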
Performance Optimization
Memory Management
- Use gradient checkpointing for models > 7B parameters
- Batch size auto-tuning based on available memory
- Mixed precision training (fp16/bf16) by default
Training Efficiency
- Learning rate warmup: 5-10% of total steps
- Cosine annealing schedule with min_lr = 0.1 * max_lr
- Early stopping with patience = 3 evaluation rounds
- Checkpoint every N steps (configurable per domain)
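The warmup and cosine-annealing rules above combine into a single learning-rate function. The sketch below is illustrative (the function name and signature are not from the framework); it assumes linear warmup over the stated 5-10% of steps (5% by default) and anneals down to min_lr = 0.1 * max_lr as specified.

```python
import math


def lr_at(step: int, total_steps: int, max_lr: float,
          warmup_frac: float = 0.05, min_lr_ratio: float = 0.1) -> float:
    """Linear warmup for warmup_frac of total steps, then cosine
    annealing from max_lr down to min_lr_ratio * max_lr."""
    warmup_steps = int(total_steps * warmup_frac)
    min_lr = min_lr_ratio * max_lr
    if step < warmup_steps:
        # Linear ramp from max_lr / warmup_steps up to max_lr.
        return max_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + 0.5 * (max_lr - min_lr) * (1 + math.cos(math.pi * progress))
```

The schedule peaks at max_lr exactly when warmup ends and decays smoothly to min_lr by the final step.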
Hardware Considerations
- See the mlx-performance skill for Apple Silicon optimization
- GPU memory estimation: model_params * 4 bytes * 3 (model + optimizer + gradients)
- Multi-device training coordination patterns
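The memory-estimation heuristic above is straightforward to encode. This is a rough sketch of that rule of thumb (the function is hypothetical); it assumes fp32 parameters (4 bytes) and deliberately ignores activation memory and framework overhead, which the heuristic itself omits.

```python
def gpu_memory_gb(n_params: float, bytes_per_param: int = 4) -> float:
    """Rough training footprint in GB: model + optimizer state + gradients
    (the x3 factor from the heuristic). Excludes activations and overhead."""
    return n_params * bytes_per_param * 3 / 1e9
```

For a 7B-parameter model this gives 7e9 * 4 * 3 / 1e9 = 84 GB, which is why gradient checkpointing and mixed precision are recommended above that size.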
Cross-References
- Hardware details: See the mlx-performance skill
- Domain workflows: See the realign-domain-workflows skill
- Data quality: See the preference-data-quality skill
Source
Source: https://github.com/akaszubski/autonomous-dev/blob/master/plugins/autonomous-dev/skills/archived/realign-meta-framework/SKILL.md

Overview
Provides a common 7-stage pipeline, quality thresholds, and optimization guidance for realignment workflows. It standardizes data prep, SFT, preference modeling, evaluation, and deployment across domains to ensure consistent quality and rollback safety.
How This Skill Works
The framework encodes realignment as a 7-stage pipeline: Capability Assessment, Data Preparation, SFT Preparation, Preference/Reward Modeling, Iterative Training, Evaluation & Monitoring, and Deployment & Validation. It defines concrete quality thresholds and automatic rollback rules, and bundles memory and training-efficiency guidance to optimize large-model workflows across domains.
When to Use It
- Starting a new realignment domain workflow and defining the 7-stage pipeline
- Enforcing quality gates and rollback policies across training stages
- Training large models with memory and precision optimization (e.g., >7B params)
- Detecting and preventing cross-domain contamination between training stages
- Coordinating DPO/RLVR/SRF-style optimization within realignment cycles
Quick Start
- Step 1: Define the realignment domain, data sources, and baseline capabilities
- Step 2: Implement Data Preparation, SFT Preparation, and Preference Modeling with quality gates
- Step 3: Run iterative training, monitor metrics, and perform deployment validation
Best Practices
- Run baseline evaluations before and after each training stage to quantify impact
- Define and document minimum, target, and critical metrics for each task
- Enable automatic rollback if any capability drops more than 5% from baseline
- Apply gradient checkpointing and mixed precision to improve memory and speed
- Coordinate with related skills (mlx-performance, realign-domain-workflows, preference-data-quality) for end-to-end quality
Example Use Cases
- Realign a content moderation model using DPO with the 7-stage pipeline and rollback safeguards
- Finance advisory assistant alignment applying data-quality thresholds and iterative gates
- Medical QA model alignment ensuring capability retention above baseline with checks
- Multimodal model alignment using gradient checkpointing on a large parameter set
- Cross-domain realignment rollout across multiple teams with early stopping and gates