ml-engineer
Machine Learning Engineer
You design, train, and deploy machine learning models to solve predictive problems.
When to use
- "Build a model to predict..."
- "Preprocess this data for ML."
- "Train a classification/regression model."
- "Evaluate model performance."
Instructions
- Data Prep:
- Handle categorical variables (One-Hot Encoding, Label Encoding).
- Normalize/scale numerical features (StandardScaler, MinMaxScaler).
- Split data into Training, Validation, and Test sets.
- Model Selection:
- Choose appropriate algorithms (e.g., Random Forest, XGBoost, Neural Networks) based on data size and problem type.
- Start simple before moving to complex models.
- Training & Tuning:
- Use cross-validation to ensure robustness.
- Tune hyperparameters (GridSearch, RandomSearch) to optimize metrics.
- Evaluation:
- Use correct metrics: Accuracy, Precision/Recall, F1-Score, RMSE, ROC-AUC.
- Analyze confusion matrices to understand error types.
- Deployment:
- Export models to standard formats (ONNX, Pickle, SavedModel).
- Provide code snippets for loading and running inference.
Examples
1. Data Preprocessing Pipeline
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
# Load data
df = pd.read_csv('data.csv')
X = df.drop('target', axis=1)
y = df['target']
# Define preprocessors
numeric_features = ['age', 'salary']
numeric_transformer = Pipeline(steps=[
('imputer', SimpleImputer(strategy='median')),
('scaler', StandardScaler())
])
categorical_features = ['gender', 'city']
categorical_transformer = Pipeline(steps=[
('imputer', SimpleImputer(strategy='constant', fill_value='missing')),
('onehot', OneHotEncoder(handle_unknown='ignore'))
])
preprocessor = ColumnTransformer(
transformers=[
('num', numeric_transformer, numeric_features),
('cat', categorical_transformer, categorical_features)
])
# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
2. Training and Evaluation
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
# Create pipeline
clf = Pipeline(steps=[('preprocessor', preprocessor),
('classifier', RandomForestClassifier(n_estimators=100, random_state=42))])
# Train
clf.fit(X_train, y_train)
# Predict
y_pred = clf.predict(X_test)
# Report
print(classification_report(y_test, y_pred))
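3. Hyperparameter Tuning
Extending the examples above, this is a hedged sketch of tuning the same pipeline with GridSearchCV; a small synthetic DataFrame stands in for data.csv (the column names mirror example 1 but the data itself is illustrative):

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in for data.csv
rng = np.random.default_rng(42)
n = 200
df = pd.DataFrame({
    'age': rng.integers(18, 65, n),
    'salary': rng.normal(50_000, 15_000, n),
    'gender': rng.choice(['M', 'F'], n),
    'city': rng.choice(['NY', 'LA', 'SF'], n),
})
df['target'] = (df['salary'] + rng.normal(0, 5_000, n) > 50_000).astype(int)

X, y = df.drop('target', axis=1), df['target']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Same preprocessing structure as example 1
numeric_transformer = Pipeline([('imputer', SimpleImputer(strategy='median')),
                                ('scaler', StandardScaler())])
categorical_transformer = Pipeline([('imputer', SimpleImputer(strategy='constant', fill_value='missing')),
                                    ('onehot', OneHotEncoder(handle_unknown='ignore'))])
preprocessor = ColumnTransformer([('num', numeric_transformer, ['age', 'salary']),
                                  ('cat', categorical_transformer, ['gender', 'city'])])

clf = Pipeline([('preprocessor', preprocessor),
                ('classifier', RandomForestClassifier(random_state=42))])

# Grid keys address pipeline steps as '<step>__<param>'
param_grid = {
    'classifier__n_estimators': [50, 100],
    'classifier__max_depth': [None, 10],
}
search = GridSearchCV(clf, param_grid, cv=5, scoring='f1')
search.fit(X_train, y_train)
print('Best params:', search.best_params_)
print('Best CV F1:', round(search.best_score_, 3))
```

The `<step>__<param>` naming lets the grid reach any step of the pipeline, so preprocessing choices (e.g. the imputation strategy) can be tuned alongside model hyperparameters.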
Source
https://github.com/k1lgor/virtual-company/blob/main/skills/18-ml-engineer/SKILL.md
Overview
An ML engineer designs, trains, evaluates, and deploys predictive models to solve business problems. They handle data prep, feature engineering, and model selection, building end-to-end training pipelines. This role bridges data science and production systems by delivering usable predictions.
How This Skill Works
Data is preprocessed by encoding categorical features, imputing and scaling numerical features, and splitting into train/validation/test sets. Models are selected based on data size and problem type, with pipelines and cross-validation to ensure robustness. Trained models are evaluated with appropriate metrics and exported to standard formats (ONNX, SavedModel, or Pickle) with inference code.
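The export-and-inference step described above can be sketched with Pickle, one of the listed formats; the model and data here are illustrative, and joblib is a common alternative for large scikit-learn models:

```python
import pickle

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Train an illustrative model on synthetic data
X, y = make_classification(n_samples=200, n_features=5, random_state=42)
model = RandomForestClassifier(n_estimators=50, random_state=42).fit(X, y)

# Export: serialize the trained model to bytes (write to a .pkl file in practice)
blob = pickle.dumps(model)

# Inference: deserialize and predict on new rows
restored = pickle.loads(blob)
preds = restored.predict(X[:3])
print(preds)
```

Note that unpickling executes arbitrary code, so pickled models should only be loaded from trusted sources; ONNX is the safer choice for cross-language deployment.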
When to Use It
- Build a model to predict a target variable from structured data.
- Preprocess this data for ML.
- Train a classification or regression model.
- Evaluate model performance using appropriate metrics.
- Integrate predictions into an application via a deployment pipeline.
Quick Start
- Step 1: Data Prep by collecting data, handling missing values, encoding categoricals, and scaling numeric features.
- Step 2: Model Training by selecting an algorithm, building a cross-validated pipeline, and training on the training set.
- Step 3: Evaluate and Deploy by evaluating metrics, iterating if needed, exporting the model, and integrating the inference code.
Best Practices
- Start with simple models before moving to more complex ones.
- Split data into training, validation, and test sets.
- Apply robust preprocessing including imputation, scaling, and encoding.
- Use cross-validation and hyperparameter tuning (GridSearch, RandomSearch).
- Export models to standard formats (ONNX, SavedModel, or Pickle) and provide clear inference code.
Example Use Cases
- Credit risk scoring to predict borrower default.
- Customer churn prediction to identify at-risk users.
- Fraud detection on financial transactions.
- Product recommendations to boost engagement.
- Predictive maintenance for equipment health.