realign-meta-framework

npx machina-cli add skill akaszubski/autonomous-dev/realign-meta-framework --openclaw
Files (1)
SKILL.md
2.5 KB

Realignment Meta-Framework

Shared framework for all realignment training workflows. Provides the common pipeline template, quality thresholds, and performance optimization guidance used across all domain-specific realignment workflows.

7-Stage Pipeline Template

All realignment workflows follow this common pipeline:

  1. Capability Assessment: Evaluate current model capabilities and identify gaps
  2. Data Preparation: Collect and prepare domain-specific training data
  3. SFT Preparation: Supervised fine-tuning on curated examples
  4. Preference/Reward Modeling: Domain-specific optimization (DPO, RLVR, SRF, etc.)
  5. Iterative Training: Multi-round training with quality gates
  6. Evaluation & Monitoring: Comprehensive evaluation against baselines
  7. Deployment & Validation: Final validation and deployment readiness
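The seven stages above can be sketched as an ordered sequence of gated steps. This is an illustrative sketch only; the stage names and the `run_pipeline` helper are assumptions, not an API defined by this skill.

```python
# Hypothetical encoding of the 7-stage pipeline as an ordered list of
# stage names, executed in sequence with a quality gate after each.
STAGES = [
    "capability_assessment",
    "data_preparation",
    "sft_preparation",
    "preference_reward_modeling",
    "iterative_training",
    "evaluation_monitoring",
    "deployment_validation",
]

def run_pipeline(run_stage, stages=STAGES):
    """Run each stage in order; stop at the first stage whose gate fails.

    `run_stage` is a callable taking a stage name and returning True
    (gate passed) or False (gate failed).
    """
    completed = []
    for stage in stages:
        if not run_stage(stage):
            return completed, stage  # completed stages, failing stage
        completed.append(stage)
    return completed, None

# Example: a run where the evaluation stage fails its gate.
done, failed = run_pipeline(lambda s: s != "evaluation_monitoring")
```

Stopping at the first failed gate (rather than continuing) matches the rollback-safety emphasis of the framework: later stages never run on outputs that did not pass their gate.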

Quality Thresholds

Metric                 Minimum   Target   Critical
Task accuracy          85%       92%      < 80% triggers rollback
Capability retention   95%       98%      < 90% triggers rollback
Data quality score     0.8       0.9      < 0.7 blocks training
Evaluation coverage    80%       95%      < 70% blocks deployment
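The thresholds table can be checked mechanically. The dictionary layout and the `gate` function below are illustrative assumptions, not part of this skill's defined interface.

```python
# Thresholds from the table above, normalized to fractions:
# metric -> (minimum, target, critical).
THRESHOLDS = {
    "task_accuracy":        (0.85, 0.92, 0.80),
    "capability_retention": (0.95, 0.98, 0.90),
    "data_quality_score":   (0.80, 0.90, 0.70),
    "evaluation_coverage":  (0.80, 0.95, 0.70),
}

def gate(metric, value):
    """Classify a metric value against the thresholds table."""
    minimum, target, critical = THRESHOLDS[metric]
    if value < critical:
        return "critical"       # triggers rollback / blocks training or deployment
    if value < minimum:
        return "below_minimum"  # fails the quality gate
    if value >= target:
        return "at_target"
    return "acceptable"         # between minimum and target
```

For example, `gate("task_accuracy", 0.79)` falls below the 80% critical line and would trigger a rollback.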

Capability Regression Detection

  • Run baseline evaluation suite before and after each training stage
  • Track per-capability scores across training rounds
  • Automatic rollback if any capability drops > 5% from baseline
  • Cross-domain contamination checks between training stages
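A minimal sketch of the rollback rule in the list above, assuming the 5% drop is measured relative to each capability's baseline score (the skill does not specify absolute vs. relative; the function name is illustrative):

```python
def regressed(baseline, current, tolerance=0.05):
    """Return capabilities whose score dropped more than `tolerance`
    (relative) from baseline -- any hit should trigger rollback."""
    return [
        cap for cap, base in baseline.items()
        if base > 0 and (base - current.get(cap, 0.0)) / base > tolerance
    ]

baseline = {"reasoning": 0.90, "coding": 0.80, "safety": 0.95}
current  = {"reasoning": 0.88, "coding": 0.70, "safety": 0.95}
# coding dropped 12.5% relative to baseline -> triggers rollback;
# reasoning dropped ~2.2%, within tolerance.
```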

Performance Optimization

Memory Management

  • Use gradient checkpointing for models > 7B parameters
  • Batch size auto-tuning based on available memory
  • Mixed precision training (fp16/bf16) by default
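Batch-size auto-tuning is commonly done as a doubling search against a trial step under a memory cap. The sketch below is a hypothetical illustration; `fits` stands in for whatever memory probe a given framework provides.

```python
# Hypothetical batch-size auto-tuning via doubling search: grow the
# batch until `fits` (e.g. a trial forward/backward pass under the
# available-memory cap) fails, then keep the last size that fit.
def autotune_batch_size(fits, start=1, max_size=4096):
    size = start
    best = None
    while size <= max_size and fits(size):
        best = size
        size *= 2
    return best

# Example with a stand-in memory model: batches of up to 48 samples fit,
# so the doubling search settles on 32.
best = autotune_batch_size(lambda b: b <= 48)
```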

Training Efficiency

  • Learning rate warmup: 5-10% of total steps
  • Cosine annealing schedule with min_lr = 0.1 * max_lr
  • Early stopping with patience = 3 evaluation rounds
  • Checkpoint every N steps (configurable per domain)
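The warmup and cosine-annealing settings above combine into a single schedule. This is a minimal sketch assuming linear warmup over the first 5% of steps (the low end of the 5-10% range); the function is illustrative, not this skill's API.

```python
import math

def lr_at(step, total_steps, max_lr, warmup_frac=0.05, min_lr_ratio=0.1):
    """Linear warmup for `warmup_frac` of steps, then cosine decay
    from max_lr down to min_lr = min_lr_ratio * max_lr."""
    warmup_steps = max(1, int(total_steps * warmup_frac))
    min_lr = min_lr_ratio * max_lr
    if step < warmup_steps:
        # linear ramp from max_lr/warmup_steps up to max_lr
        return max_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + 0.5 * (max_lr - min_lr) * (1 + math.cos(math.pi * progress))
```

With `total_steps=1000` and `max_lr=1e-4`, the rate ramps up over the first 50 steps, peaks at `1e-4`, and decays toward `min_lr = 1e-5` by the final step.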

Hardware Considerations

  • See mlx-performance skill for Apple Silicon optimization
  • GPU memory estimation: model_params * 4 bytes * 3 (model + optimizer + gradients)
  • Multi-device training coordination patterns
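The memory rule of thumb above works out as a one-liner. Note it counts only fp32 weights, gradients, and one optimizer copy; activations and framework overhead come on top, so treat the result as a lower bound.

```python
# Rough training-memory estimate following the rule of thumb above:
# model_params * 4 bytes (fp32) * 3 (model + optimizer + gradients).
def estimate_train_memory_gb(n_params, bytes_per_param=4, factor=3):
    return n_params * bytes_per_param * factor / 1024**3

# e.g. a 7B-parameter model needs roughly 78 GiB before activations.
gb = estimate_train_memory_gb(7e9)
```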

Cross-References

  • Hardware details: See mlx-performance skill
  • Domain workflows: See realign-domain-workflows skill
  • Data quality: See preference-data-quality skill

Source

git clone https://github.com/akaszubski/autonomous-dev.git
# Skill file: plugins/autonomous-dev/skills/archived/realign-meta-framework/SKILL.md
View on GitHub

Overview

Provides a common 7-stage pipeline, quality thresholds, and optimization guidance for realignment workflows. It standardizes data prep, SFT, preference modeling, evaluation, and deployment across domains to ensure consistent quality and rollback safety.

How This Skill Works

The framework encodes realignment as a 7-stage pipeline: Capability Assessment, Data Preparation, SFT Preparation, Preference/Reward Modeling, Iterative Training, Evaluation & Monitoring, and Deployment & Validation. It defines concrete quality thresholds and automatic rollback rules, and bundles memory and training-efficiency guidance to optimize large-model workflows across domains.

When to Use It

  • Starting a new realignment domain workflow and defining the 7-stage pipeline
  • Enforcing quality gates and rollback policies across training stages
  • Training large models with memory and precision optimization (e.g., >7B params)
  • Detecting and preventing cross-domain contamination between training stages
  • Coordinating DPO/RLVR/SRF-style optimization within realignment cycles

Quick Start

  1. Define the realignment domain, data sources, and baseline capabilities
  2. Implement Data Preparation, SFT Preparation, and Preference Modeling with quality gates
  3. Run iterative training, monitor metrics, and perform deployment validation

Best Practices

  • Run baseline evaluations before and after each training stage to quantify impact
  • Define and document minimum, target, and critical metrics for each task
  • Enable automatic rollback if any capability drops more than 5% from baseline
  • Apply gradient checkpointing and mixed precision to improve memory and speed
  • Coordinate with related skills (mlx-performance, realign-domain-workflows, preference-data-quality) for end-to-end quality

Example Use Cases

  • Realign a content moderation model using DPO with the 7-stage pipeline and rollback safeguards
  • Finance advisory assistant alignment applying data-quality thresholds and iterative gates
  • Medical QA model alignment ensuring capability retention above baseline with checks
  • Multimodal model alignment using gradient checkpointing on a large parameter set
  • Cross-domain realignment rollout across multiple teams with early stopping and gates
