# Deep Learning with Keras 3

Install as a skill:

```shell
npx machina-cli add skill Aznatkoiny/zAI-Skills/deep-learning --openclaw
```

Patterns and best practices based on *Deep Learning with Python, 2nd Edition* by François Chollet, updated for Keras 3 (multi-backend).
## Core Workflow

- **Prepare data**: Normalize, split into train/val/test, create a `tf.data.Dataset`
- **Build model**: Sequential, Functional, or Subclassing API
- **Compile**: `model.compile(optimizer, loss, metrics)`
- **Train**: `model.fit(data, epochs, validation_data, callbacks)`
- **Evaluate**: `model.evaluate(test_data)`
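The steps above can be sketched end-to-end. This is a minimal illustration with synthetic stand-in data; shapes, layer sizes, and epoch counts are arbitrary:

```python
import numpy as np
import keras
from keras import layers

# Synthetic stand-in data: 200 samples, 64 features, 10 integer classes.
x = np.random.rand(200, 64).astype("float32")
y = np.random.randint(0, 10, size=(200,))

# Prepare: split train/validation (normalization omitted for brevity).
x_train, x_val = x[:160], x[160:]
y_train, y_val = y[:160], y[160:]

# Build: a simple Sequential stack.
model = keras.Sequential([
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),
])

# Compile: integer labels -> sparse categorical crossentropy.
model.compile(optimizer="rmsprop",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Train and evaluate.
history = model.fit(x_train, y_train, epochs=2,
                    validation_data=(x_val, y_val), verbose=0)
loss, acc = model.evaluate(x_val, y_val, verbose=0)
```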
## Model Building APIs

**Sequential** - simple stack of layers:

```python
model = keras.Sequential([
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),
])
```
**Functional** - multi-input/output, shared layers, non-linear topologies:

```python
inputs = keras.Input(shape=(64,))
x = layers.Dense(64, activation="relu")(inputs)
outputs = layers.Dense(10, activation="softmax")(x)
model = keras.Model(inputs=inputs, outputs=outputs)
```
**Subclassing** - full flexibility via the `call()` method:

```python
class MyModel(keras.Model):
    def __init__(self):
        super().__init__()
        self.dense1 = layers.Dense(64, activation="relu")
        self.dense2 = layers.Dense(10, activation="softmax")

    def call(self, inputs):
        x = self.dense1(inputs)
        return self.dense2(x)
```
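A subclassed model builds its weights lazily on the first forward pass; a minimal usage sketch (the class definition repeats the one above so the snippet is self-contained):

```python
import numpy as np
import keras
from keras import layers

class MyModel(keras.Model):
    def __init__(self):
        super().__init__()
        self.dense1 = layers.Dense(64, activation="relu")
        self.dense2 = layers.Dense(10, activation="softmax")

    def call(self, inputs):
        x = self.dense1(inputs)
        return self.dense2(x)

model = MyModel()
# Weights are created here, on the first call.
out = model(np.zeros((2, 64), dtype="float32"))
print(tuple(out.shape))  # -> (2, 10)
```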
## Quick Reference: Loss & Optimizer Selection

| Task | Loss | Final Activation |
|---|---|---|
| Binary classification | `binary_crossentropy` | `sigmoid` |
| Multiclass (one-hot labels) | `categorical_crossentropy` | `softmax` |
| Multiclass (integer labels) | `sparse_categorical_crossentropy` | `softmax` |
| Regression | `mse` or `mae` | None |

Optimizers: `rmsprop` (default), `adam` (popular), `sgd` (with momentum, for fine-tuning)
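As a sketch of how the table maps to code (layer sizes here are arbitrary):

```python
import keras
from keras import layers

# Binary classification: sigmoid output + binary_crossentropy.
binary_model = keras.Sequential([
    layers.Dense(16, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])
binary_model.compile(optimizer="rmsprop",
                     loss="binary_crossentropy",
                     metrics=["accuracy"])

# Regression: no final activation + mse (or mae) loss.
regression_model = keras.Sequential([
    layers.Dense(16, activation="relu"),
    layers.Dense(1),
])
regression_model.compile(optimizer="rmsprop", loss="mse", metrics=["mae"])
```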
## Domain-Specific Guides
| Topic | Reference | When to Use |
|---|---|---|
| Keras 3 Migration | keras3_changes.md | START HERE: Multi-backend setup, keras.ops, import keras |
| Fundamentals | basics.md | Overfitting, regularization, data prep, K-fold validation |
| Keras Deep Dive | keras_working.md | Custom metrics, callbacks, training loops, tf.function |
| Computer Vision | computer_vision.md | Convnets, data augmentation, transfer learning |
| Advanced CV | advanced_cv.md | Segmentation, ResNets, Xception, Grad-CAM |
| Time Series | timeseries.md | RNNs (LSTM/GRU), 1D convnets, forecasting |
| NLP & Transformers | nlp_transformers.md | Text processing, embeddings, Transformer encoder/decoder |
| Generative DL | generative_dl.md | Text generation, VAEs, GANs, style transfer |
| Best Practices | best_practices.md | KerasTuner, mixed precision, multi-GPU, TPU |
## Essential Callbacks

```python
callbacks = [
    keras.callbacks.EarlyStopping(monitor="val_loss", patience=3),
    keras.callbacks.ModelCheckpoint("best.keras", save_best_only=True),
    keras.callbacks.TensorBoard(log_dir="./logs"),
]
model.fit(..., callbacks=callbacks)
```
## Utility Scripts
| Script | Description |
|---|---|
| quick_train.py | Reusable training template with standard callbacks and history plotting |
| visualize_filters.py | Visualize convnet filter patterns via gradient ascent |
## Source

```shell
git clone https://github.com/Aznatkoiny/zAI-Skills
```

The skill definition lives at `AI-Toolkit/skills/deep-learning/SKILL.md` in the repository.

## Overview
This skill provides patterns and best practices for building neural networks using Keras 3 across JAX, TensorFlow, and PyTorch. It covers model building via Sequential, Functional, or Subclassing APIs, as well as custom training loops, data augmentation, transfer learning, and production-ready practices for CV, NLP, time series, and generative models.
## How This Skill Works
Prepare data into tf.data.Datasets, choose a model style (Sequential, Functional, or Subclassing), compile with an optimizer and loss, and train with model.fit or custom training loops. The approach supports multi-backend experimentation (JAX, TF, PyTorch) and includes domain-specific guides for CV, NLP, time series, and generative DL to streamline development and production readiness.
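Backend selection in Keras 3 happens through the `KERAS_BACKEND` environment variable, which must be set before `keras` is first imported. A minimal sketch, assuming the TensorFlow backend is installed:

```python
import os

# Must run before the first `import keras`.
# Options: "tensorflow" (default), "jax", or "torch".
os.environ["KERAS_BACKEND"] = "tensorflow"

import keras

print(keras.backend.backend())  # -> "tensorflow"
```

The same model-building code then runs unchanged on whichever backend is selected.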
## When to Use It
- You’re building a computer vision model (CNNs) for image classification or detection.
- You’re designing NLP or Transformer-based systems for text understanding or generation.
- You need time-series forecasting with RNNs/1D convs or sequence models.
- You’re creating generative models like VAEs or GANs.
- You’re preparing a production-ready workflow with transfer learning and deployment considerations.
## Quick Start
- Step 1: Prepare data as tf.data.Dataset, including normalization and train/val/test splits.
- Step 2: Build a model using Sequential, Functional, or Subclassing API and compile with an optimizer and loss.
- Step 3: Train and evaluate using model.fit or a custom training loop, then apply deployment-ready practices.
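Step 1 might look like the following; the arrays are hypothetical stand-ins for real data, and the 80/10/10 split is illustrative:

```python
import numpy as np
import tensorflow as tf

# Hypothetical raw data: 1000 flattened samples, integer labels.
features = np.random.rand(1000, 64).astype("float32")
labels = np.random.randint(0, 10, size=(1000,))

# Normalize per feature, using training-set statistics only.
mean = features[:800].mean(axis=0)
std = features[:800].std(axis=0) + 1e-7
features = (features - mean) / std

# 80/10/10 split, wrapped in batched tf.data pipelines.
train_ds = (tf.data.Dataset.from_tensor_slices((features[:800], labels[:800]))
            .shuffle(800).batch(32))
val_ds = tf.data.Dataset.from_tensor_slices(
    (features[800:900], labels[800:900])).batch(32)
test_ds = tf.data.Dataset.from_tensor_slices(
    (features[900:], labels[900:])).batch(32)
```

These datasets plug directly into `model.fit(train_ds, validation_data=val_ds, ...)`.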
## Best Practices
- Leverage KerasTuner for hyperparameter tuning to optimize architecture and training settings.
- Enable mixed-precision training and use appropriate data types for speed and memory.
- Plan for multi-GPU/TPU or JAX backends to scale training; use tf.data and caching.
- Use data augmentation and regularization to prevent overfitting; implement callbacks for monitoring.
- Work with transfer learning by freezing/unfreezing layers and using pretrained weights; validate before deployment.
## Example Use Cases
- Image classification using a Sequential CNN for a baseline in a CV project.
- Text classification or sentiment analysis with a Functional/Transformer-based model.
- Time-series forecasting with LSTM/GRU or 1D convnets on a real-world dataset.
- Generative DL project: training a VAE or GAN for image synthesis tasks.
- Transfer learning pipeline: fine-tuning a pretrained model for a downstream task and deploying it.