What prerequisites are required?

Have uvx available (install uv if needed), a supported Python version (3.10–3.13), and network access to fetch the template. The process also assumes Git is installed for post-generation steps.

How do I customize the template during setup?

During interactive configuration, pick PyTorch/CUDA presets, enable tools like ruff/ty/pytest, choose experiment trackers, and select a template type (Image Classification, Segmentation, GNN, etc.). This tailors the generated project to your research needs.

ml-project-init

npx machina-cli add skill nishide-dev/claude-code-ml-research/ml-project-init --openclaw

ML Project Initialization

Initialize a new machine learning research project using the ML Research Copier template with PyTorch Lightning, Hydra configuration, and modern Python tooling (uv).

Process

1. Explain the Template System

This command uses uvx copier to create projects from templates. The template supports:

PyTorch + CUDA: Multiple version presets (PyTorch 2.4-2.9, CUDA 11.8-13.0)
PyTorch Lightning: For training infrastructure
Hydra: For configuration management
Experiment Tracking: TensorBoard, W&B, MLflow
Modern Tooling: uv (package manager), ruff (linter), ty (type checker)
Multiple Templates: Image classification, segmentation, GNN, etc.

2. Check Prerequisites

Verify that uvx is available:

uvx --version

If not available, install uv:

curl -LsSf https://astral.sh/uv/install.sh | sh
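
The prerequisite check can also be scripted. The following is a minimal sketch, not part of the template: the function name check_prerequisites is hypothetical, and it assumes the Python interpreter running the script is the one you intend to use for the project (uv can also manage a separate interpreter, in which case the version check is only indicative).

```python
import shutil
import sys

# Inclusive range of Python versions quoted in this document.
SUPPORTED_PYTHON = ((3, 10), (3, 13))


def check_prerequisites(version=None):
    """Return a dict mapping each prerequisite to True/False.

    `version` is a (major, minor) tuple; it defaults to the running interpreter.
    """
    major_minor = tuple((version or sys.version_info)[:2])
    low, high = SUPPORTED_PYTHON
    return {
        "uvx": shutil.which("uvx") is not None,  # installed together with uv
        "git": shutil.which("git") is not None,  # needed for `git init -b main`
        "python": low <= major_minor <= high,
    }


for name, ok in check_prerequisites().items():
    print(f"{name}: {'ok' if ok else 'missing or unsupported'}")
```

Any entry reported as missing should be resolved before running Copier, since the post-generation steps rely on both uv and Git.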

3. Run Copier Template

Execute the copier command pointing to the ML research template:

# From GitHub (recommended)
uvx copier copy --trust gh:nishide-dev/ml-research-template <project-directory>

# From local clone (for development)
uvx copier copy --trust /path/to/ml-research-template <project-directory>

Where:

gh:nishide-dev/ml-research-template is Copier's shorthand for the GitHub repository nishide-dev/ml-research-template
<project-directory> is where the new project will be created

Note: The template is maintained in a separate repository (ml-research-template) for independent versioning and broader reusability.

4. Interactive Configuration

Copier will ask the user to configure:

Project Basics:

Project name (defaults to directory name)
Package name (Python import name, auto-generated from project name)
Description
Author name and email
Python version (3.10-3.13)

Development Tools:

Use ruff? (default: yes)
Use ty? (default: yes)
Use pytest? (default: yes)
Use GitHub Actions? (default: yes)
Use Nix + direnv? (default: no)

PyTorch/CUDA:

PyTorch + CUDA preset (interactive dropdown with compatible combinations)
- PyTorch 2.8.0 + CUDA 12.6 (recommended)
- PyTorch 2.9.0 + CUDA 12.6 (latest)
- ... (many presets)
- Custom (manual version entry)
Include torchvision? (default: yes)
Include torchaudio? (default: no)

ML Frameworks:

Use PyTorch Lightning? (default: yes)
Lightning version (if enabled)
Use Hydra? (default: yes)
Hydra version (if enabled)
Use PyTorch Geometric? (default: no)

Experiment Tracking:

Logger choice: TensorBoard / W&B / MLflow / Both / None
W&B entity (if W&B selected)

Template Type:

Image Classification (default)
Segmentation
Object Detection
Text Classification
GNN (Graph Neural Network)
Minimal (custom template)

Dataset:

MNIST / CIFAR-10 / CIFAR-100 / Fashion-MNIST / Custom (for image classification)

5. Post-Generation

Copier automatically executes post-generation tasks:

git init -b main - Initialize git repository
uv venv - Create virtual environment
uv lock - Lock dependencies
uv sync - Install dependencies
ruff check . --fix - Lint and auto-fix
ruff format . - Format code
ty check - Type check (if enabled)
pytest - Run tests (if enabled)
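
If one of these steps fails partway, it can help to know which commands still need to be re-run by hand. The sketch below simply reproduces the list above as data. The function name and its flags are hypothetical, and treating the ruff steps as conditional on the "Use ruff?" answer is an assumption, since the list only marks ty and pytest as "if enabled".

```python
def post_generation_commands(use_ruff=True, use_ty=True, use_pytest=True):
    """Return the post-generation commands in the order listed above."""
    commands = [
        "git init -b main",  # initialize the repository
        "uv venv",           # create the virtual environment
        "uv lock",           # lock dependencies
        "uv sync",           # install dependencies
    ]
    if use_ruff:  # assumption: skipped when ruff is disabled
        commands += ["ruff check . --fix", "ruff format ."]
    if use_ty:
        commands.append("ty check")
    if use_pytest:
        commands.append("pytest")
    return commands


for command in post_generation_commands(use_ty=False):
    print(command)
```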

6. Project Structure

The generated project will have:

<project-name>/
├── src/
│   └── <package-name>/
│       ├── __init__.py
│       ├── train.py          # Main training script with Hydra
│       ├── models/           # LightningModule definitions
│       ├── data/             # DataModule and datasets
│       └── utils/            # Utility functions
├── tests/
│   └── test_<package_name>.py
├── configs/                  # Hydra configuration
│   ├── config.yaml
│   ├── model/
│   ├── data/
│   ├── trainer/
│   ├── logger/
│   └── experiment/
├── pyproject.toml           # uv + project config
├── ruff.toml                # Ruff linting config
├── .gitignore
├── README.md
└── .venv/                   # Virtual environment (created)
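
A quick way to confirm that generation produced this layout is to check for the core paths. This is a rough sketch under stated assumptions: the helper names are hypothetical, and the list covers only entries from the tree above that do not depend on optional tools (ruff.toml and .venv/ are left out on purpose).

```python
from pathlib import Path


def expected_paths(package):
    """Core paths from the tree above, relative to the project root."""
    return [
        f"src/{package}/__init__.py",
        f"src/{package}/train.py",
        f"src/{package}/models",
        f"src/{package}/data",
        f"src/{package}/utils",
        "tests",
        "configs/config.yaml",
        "pyproject.toml",
        "README.md",
    ]


def missing_paths(root, package):
    """Return the expected paths that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in expected_paths(package) if not (root / rel).exists()]
```

For example, missing_paths(".", "my_cifar10_project") run from the project root returns an empty list when every core path is present.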

7. Verify Installation

After project creation, verify the setup:

cd <project-name>

# Verify CUDA availability
uv run python -c "import torch; print(f'CUDA: {torch.cuda.is_available()}')"

# Run a quick test
uv run pytest tests/ -v

# Check code quality
uv run ruff check .
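
The one-line CUDA check above prints only True or False. A slightly longer sketch, which could be saved to a file and run with uv run python, also reports the PyTorch version, the CUDA version PyTorch was built against, and the visible devices. The function name cuda_summary is hypothetical.

```python
def cuda_summary():
    """Describe the PyTorch/CUDA setup of the current environment as a string."""
    try:
        import torch
    except ImportError:
        return "torch is not installed in this environment"
    if not torch.cuda.is_available():
        return f"torch {torch.__version__}: CUDA not available"
    names = [torch.cuda.get_device_name(i) for i in range(torch.cuda.device_count())]
    return f"torch {torch.__version__}, CUDA build {torch.version.cuda}: {', '.join(names)}"


print(cuda_summary())
```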

8. Next Steps

Guide the user:

# Start training (with Hydra defaults)
uv run python src/<package-name>/train.py

# Override configuration
uv run python src/<package-name>/train.py trainer.max_epochs=20 data.batch_size=64

# Run specific experiment
uv run python src/<package-name>/train.py experiment=baseline

# Start TensorBoard (if using TensorBoard)
tensorboard --logdir logs/

# Login to W&B (if using W&B)
wandb login
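
The key=value arguments in the override command address nested configuration values by dotted path. The sketch below illustrates that idea on a plain dictionary. It is not Hydra's implementation, the base values (10 epochs, batch size 32) are made up for the example, and it handles only simple value overrides, not group selections such as experiment=baseline.

```python
import copy


def apply_overrides(config, overrides):
    """Return a copy of `config` with dotted `key=value` overrides applied."""
    result = copy.deepcopy(config)
    for item in overrides:
        dotted_key, _, raw = item.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = result
        for part in parents:
            node = node.setdefault(part, {})
        # Keep integers as integers; everything else stays a string.
        node[leaf] = int(raw) if raw.lstrip("-").isdigit() else raw
    return result


base = {"trainer": {"max_epochs": 10}, "data": {"batch_size": 32}}
updated = apply_overrides(base, ["trainer.max_epochs=20", "data.batch_size=64"])
print(updated)  # {'trainer': {'max_epochs': 20}, 'data': {'batch_size': 64}}
```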

Alternative: Manual Template Selection

If the user wants a specific template variant without interactive prompts, they can use:

# Use defaults with --defaults flag (from GitHub)
uvx copier copy --trust --defaults gh:nishide-dev/ml-research-template <project-directory>

# Pre-answer specific questions with --data flags (add --defaults to skip the remaining prompts)
uvx copier copy --trust \
  --data project_name="my-project" \
  --data pytorch_cuda_preset="pytorch-2.8.0-cuda-12.6" \
  --data use_lightning=true \
  --data logger_choice="wandb" \
  gh:nishide-dev/ml-research-template <project-directory>

# From local clone
uvx copier copy --trust --defaults /path/to/ml-research-template <project-directory>
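
When project creation is scripted, for example to generate several variants, the same command line can be assembled programmatically. This is a minimal sketch: build_copier_command is a hypothetical helper, and it uses only the flags and answer keys shown above (--trust, --defaults, --data, project_name, pytorch_cuda_preset, use_lightning, logger_choice). Any other answer key would need to be checked against the template.

```python
import shlex

TEMPLATE = "gh:nishide-dev/ml-research-template"


def build_copier_command(destination, data=None, defaults=False, template=TEMPLATE):
    """Build the argument list for a `uvx copier copy` call."""
    cmd = ["uvx", "copier", "copy", "--trust"]
    if defaults:
        cmd.append("--defaults")
    for key, value in (data or {}).items():
        # Booleans are written lowercase, matching use_lightning=true above.
        text = str(value).lower() if isinstance(value, bool) else str(value)
        cmd += ["--data", f"{key}={text}"]
    cmd += [template, str(destination)]
    return cmd


cmd = build_copier_command(
    "my-project",
    data={
        "project_name": "my-project",
        "pytorch_cuda_preset": "pytorch-2.8.0-cuda-12.6",
        "use_lightning": True,
        "logger_choice": "wandb",
    },
)
print(shlex.join(cmd))
# To execute it: subprocess.run(cmd, check=True)
```

Building a list rather than a single string avoids shell quoting problems when values contain spaces.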

Troubleshooting

copier not found

# Install uv first
curl -LsSf https://astral.sh/uv/install.sh | sh

# Then uvx will be available

CUDA version mismatch

Check the CUDA version on the system (nvidia-smi reports the highest CUDA version the installed driver supports):

nvidia-smi

Select matching preset or custom CUDA version during copier prompts.
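
The comparison can be automated by reading the "CUDA Version" field from the nvidia-smi header. The sketch below is illustrative: both function names are hypothetical, the sample line only mimics the header format, and the rule "preset CUDA must not exceed the driver's CUDA version" is a conservative rule of thumb rather than the template's own logic.

```python
import re


def parse_cuda_version(nvidia_smi_output):
    """Extract (major, minor) from the 'CUDA Version: X.Y' header field, or None."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", nvidia_smi_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def preset_is_compatible(preset_cuda, driver_cuda):
    """Conservative check: the preset's CUDA version must not exceed the driver's."""
    return driver_cuda is not None and preset_cuda <= driver_cuda


# Illustrative header line in the format nvidia-smi prints.
sample = "| NVIDIA-SMI 535.104.05   Driver Version: 535.104.05   CUDA Version: 12.2 |"
driver = parse_cuda_version(sample)
print(driver, preset_is_compatible((11, 8), driver), preset_is_compatible((12, 6), driver))
```

On a real machine the text would come from subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout.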

Import errors after generation

Ensure virtual environment is activated or use uv run:

uv run python src/<package-name>/train.py

Success Criteria

Copier template executes without errors
All dependencies installed successfully
CUDA detection works (if GPU available)
Tests pass
Ruff checks pass
README generated with correct instructions
Git repository initialized

Project is ready for ML research!

Example Usage

# From GitHub (recommended)
uvx copier copy --trust gh:nishide-dev/ml-research-template ~/projects/my-cifar10-project

# From local clone (for development)
uvx copier copy --trust /path/to/ml-research-template ~/projects/my-cifar10-project

# Follow interactive prompts, then:
cd ~/projects/my-cifar10-project
uv run python src/my_cifar10_project/train.py
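
In the example above, the project directory my-cifar10-project yields the import package my_cifar10_project. The sketch below shows one plausible normalization that produces this result. It is an assumption about the rule, not the template's actual code, and to_package_name is a hypothetical name.

```python
import re


def to_package_name(project_name):
    """Turn a project name into a valid lowercase Python package name."""
    name = re.sub(r"[^0-9a-zA-Z]+", "_", project_name.strip()).strip("_").lower()
    # Identifiers cannot start with a digit, so prefix an underscore if needed.
    return f"_{name}" if name[:1].isdigit() else name


print(to_package_name("my-cifar10-project"))  # my_cifar10_project
```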

Source

https://github.com/nishide-dev/claude-code-ml-research/blob/main/skills/ml-project-init/SKILL.md

Overview

Initialize a new ML research project from scratch using the ML Research Copier template, featuring PyTorch Lightning, Hydra configuration, and modern tooling (uv, ruff, ty). It streamlines setup from presets to experiment tracking, ensuring a scalable, reproducible foundation.

How This Skill Works

The workflow runs Copier through uvx to generate a project from the ml-research-template, wiring in PyTorch Lightning, Hydra, and modern tooling. During interactive configuration, you specify project basics, development tools, PyTorch/CUDA presets, experiment tracking, and template type. After generation, Copier runs a post-generation sequence: git init, uv venv, uv lock, uv sync, ruff check and format, then ty check and pytest if those tools were enabled.

When to Use It

Starting a new ML research project from scratch and needing a solid scaffold
Standardizing project structure for experimentation and reproducibility
Needing Hydra for config management and PyTorch Lightning for training
Wanting modern tooling (uv, ruff, ty) and built-in experiment tracking options
Creating image classification, segmentation, GNN, or text classification projects with templates

Quick Start

Step 1: Ensure uvx is available (uvx --version). If missing, install uv via the provided script (curl -LsSf https://astral.sh/uv/install.sh | sh).
Step 2: Generate the project using the template: uvx copier copy --trust gh:nishide-dev/ml-research-template <project-directory> (or use the local path).
Step 3: Complete the interactive configuration. Copier then runs the post-generation tasks automatically (git init -b main; uv venv; uv lock; uv sync; ruff check . --fix; ruff format .; ty check and pytest if enabled).

Best Practices

Verify uvx availability before starting (uvx --version) and install uv if needed
Choose a compatible PyTorch/CUDA preset during interactive config
Enable essential tooling (ruff, ty, pytest) and decide on CI/experiment trackers
Confirm that the post-generation tasks (git init, venv, lock, sync, lint, tests) completed without errors, and re-run any that failed
Keep the ml-research-template versioned and reuse it for future projects

Example Use Cases

Bootstrapping an image classification research project with PyTorch Lightning and Hydra
Setting up a segmentation study with the ml-research-template and integrated trackers
Launching a GNN research project with Hydra config and Lightning modules
Creating a minimal custom template for rapid experimentation
Configuring a CUDA-enabled, multi-GPU training workflow with presets
