# Deep Learning with Keras 3

Install as a skill:

```shell
npx machina-cli add skill Aznatkoiny/zAI-Skills/deep-learning --openclaw
```

Patterns and best practices based on *Deep Learning with Python, 2nd Edition* by François Chollet, updated for Keras 3 (multi-backend).
## Core Workflow

- **Prepare data**: Normalize, split into train/val/test, create a `tf.data.Dataset`
- **Build model**: Sequential, Functional, or Subclassing API
- **Compile**: `model.compile(optimizer, loss, metrics)`
- **Train**: `model.fit(data, epochs, validation_data, callbacks)`
- **Evaluate**: `model.evaluate(test_data)`
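The steps above can be sketched end-to-end. This is a minimal illustration with synthetic stand-in data; shapes, layer sizes, and epoch counts are arbitrary:

```python
import numpy as np
import keras
from keras import layers

# Synthetic stand-in data: 200 samples, 64 features, 10 integer classes.
x = np.random.rand(200, 64).astype("float32")
y = np.random.randint(0, 10, size=(200,))

# Prepare: split train/validation (normalization omitted for brevity).
x_train, x_val = x[:160], x[160:]
y_train, y_val = y[:160], y[160:]

# Build: a simple Sequential stack.
model = keras.Sequential([
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),
])

# Compile: integer labels -> sparse categorical crossentropy.
model.compile(optimizer="rmsprop",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Train and evaluate.
history = model.fit(x_train, y_train, epochs=2,
                    validation_data=(x_val, y_val), verbose=0)
loss, acc = model.evaluate(x_val, y_val, verbose=0)
```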
## Model Building APIs

**Sequential** - simple stack of layers:

```python
model = keras.Sequential([
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),
])
```
**Functional** - multi-input/output, shared layers, non-linear topologies:

```python
inputs = keras.Input(shape=(64,))
x = layers.Dense(64, activation="relu")(inputs)
outputs = layers.Dense(10, activation="softmax")(x)
model = keras.Model(inputs=inputs, outputs=outputs)
```
**Subclassing** - full flexibility via the `call()` method:

```python
class MyModel(keras.Model):
    def __init__(self):
        super().__init__()
        self.dense1 = layers.Dense(64, activation="relu")
        self.dense2 = layers.Dense(10, activation="softmax")

    def call(self, inputs):
        x = self.dense1(inputs)
        return self.dense2(x)
```
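A subclassed model builds its weights lazily on the first forward pass; a minimal usage sketch (the class definition repeats the one above so the snippet is self-contained):

```python
import numpy as np
import keras
from keras import layers

class MyModel(keras.Model):
    def __init__(self):
        super().__init__()
        self.dense1 = layers.Dense(64, activation="relu")
        self.dense2 = layers.Dense(10, activation="softmax")

    def call(self, inputs):
        x = self.dense1(inputs)
        return self.dense2(x)

model = MyModel()
# Weights are created here, on the first call.
out = model(np.zeros((2, 64), dtype="float32"))
print(tuple(out.shape))  # -> (2, 10)
```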
## Quick Reference: Loss & Optimizer Selection

| Task | Loss | Final Activation |
|---|---|---|
| Binary classification | `binary_crossentropy` | `sigmoid` |
| Multiclass (one-hot labels) | `categorical_crossentropy` | `softmax` |
| Multiclass (integer labels) | `sparse_categorical_crossentropy` | `softmax` |
| Regression | `mse` or `mae` | None |

Optimizers: `rmsprop` (default), `adam` (popular), `sgd` (with momentum, for fine-tuning)
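As a sketch of how the table maps to code (layer sizes here are arbitrary):

```python
import keras
from keras import layers

# Binary classification: sigmoid output + binary_crossentropy.
binary_model = keras.Sequential([
    layers.Dense(16, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])
binary_model.compile(optimizer="rmsprop",
                     loss="binary_crossentropy",
                     metrics=["accuracy"])

# Regression: no final activation + mse (or mae) loss.
regression_model = keras.Sequential([
    layers.Dense(16, activation="relu"),
    layers.Dense(1),
])
regression_model.compile(optimizer="rmsprop", loss="mse", metrics=["mae"])
```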
## Domain-Specific Guides
| Topic | Reference | When to Use |
|---|---|---|
| Keras 3 Migration | keras3_changes.md | START HERE: Multi-backend setup, keras.ops, import keras |
| Fundamentals | basics.md | Overfitting, regularization, data prep, K-fold validation |
| Keras Deep Dive | keras_working.md | Custom metrics, callbacks, training loops, tf.function |
| Computer Vision | computer_vision.md | Convnets, data augmentation, transfer learning |
| Advanced CV | advanced_cv.md | Segmentation, ResNets, Xception, Grad-CAM |
| Time Series | timeseries.md | RNNs (LSTM/GRU), 1D convnets, forecasting |
| NLP & Transformers | nlp_transformers.md | Text processing, embeddings, Transformer encoder/decoder |
| Generative DL | generative_dl.md | Text generation, VAEs, GANs, style transfer |
| Best Practices | best_practices.md | KerasTuner, mixed precision, multi-GPU, TPU |
## Essential Callbacks

```python
callbacks = [
    keras.callbacks.EarlyStopping(monitor="val_loss", patience=3),
    keras.callbacks.ModelCheckpoint("best.keras", save_best_only=True),
    keras.callbacks.TensorBoard(log_dir="./logs"),
]
model.fit(..., callbacks=callbacks)
```
## Utility Scripts
| Script | Description |
|---|---|
| quick_train.py | Reusable training template with standard callbacks and history plotting |
| visualize_filters.py | Visualize convnet filter patterns via gradient ascent |
## Source

```shell
git clone https://github.com/Aznatkoiny/zAI-Skills
```

The skill definition lives at `AI-Toolkit/skills/deep-learning/SKILL.md` in the repository.

## Overview
This skill provides patterns and best practices for building neural networks using Keras 3 across JAX, TensorFlow, and PyTorch. It covers model building via Sequential, Functional, or Subclassing APIs, as well as custom training loops, data augmentation, transfer learning, and production-ready practices for CV, NLP, time series, and generative models.
## How This Skill Works
Prepare data into tf.data.Datasets, choose a model style (Sequential, Functional, or Subclassing), compile with an optimizer and loss, and train with model.fit or custom training loops. The approach supports multi-backend experimentation (JAX, TF, PyTorch) and includes domain-specific guides for CV, NLP, time series, and generative DL to streamline development and production readiness.
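Backend selection in Keras 3 happens through the `KERAS_BACKEND` environment variable, which must be set before `keras` is first imported. A minimal sketch, assuming the TensorFlow backend is installed:

```python
import os

# Must run before the first `import keras`.
# Options: "tensorflow" (default), "jax", or "torch".
os.environ["KERAS_BACKEND"] = "tensorflow"

import keras

print(keras.backend.backend())  # -> "tensorflow"
```

The same model-building code then runs unchanged on whichever backend is selected.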
## When to Use It
- You’re building a computer vision model (CNNs) for image classification or detection.
- You’re designing NLP or Transformer-based systems for text understanding or generation.
- You need time-series forecasting with RNNs/1D convs or sequence models.
- You’re creating generative models like VAEs or GANs.
- You’re preparing a production-ready workflow with transfer learning and deployment considerations.
## Quick Start
- Step 1: Prepare data as tf.data.Dataset, including normalization and train/val/test splits.
- Step 2: Build a model using Sequential, Functional, or Subclassing API and compile with an optimizer and loss.
- Step 3: Train and evaluate using model.fit or a custom training loop, then apply deployment-ready practices.
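Step 1 might look like the following; the arrays are hypothetical stand-ins for real data, and the 80/10/10 split is illustrative:

```python
import numpy as np
import tensorflow as tf

# Hypothetical raw data: 1000 flattened samples, integer labels.
features = np.random.rand(1000, 64).astype("float32")
labels = np.random.randint(0, 10, size=(1000,))

# Normalize per feature, using training-set statistics only.
mean = features[:800].mean(axis=0)
std = features[:800].std(axis=0) + 1e-7
features = (features - mean) / std

# 80/10/10 split, wrapped in batched tf.data pipelines.
train_ds = (tf.data.Dataset.from_tensor_slices((features[:800], labels[:800]))
            .shuffle(800).batch(32))
val_ds = tf.data.Dataset.from_tensor_slices(
    (features[800:900], labels[800:900])).batch(32)
test_ds = tf.data.Dataset.from_tensor_slices(
    (features[900:], labels[900:])).batch(32)
```

These datasets plug directly into `model.fit(train_ds, validation_data=val_ds, ...)`.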
## Best Practices
- Leverage KerasTuner for hyperparameter tuning to optimize architecture and training settings.
- Enable mixed-precision training and use appropriate data types for speed and memory.
- Plan for multi-GPU/TPU or JAX backends to scale training; use tf.data and caching.
- Use data augmentation and regularization to prevent overfitting; implement callbacks for monitoring.
- Work with transfer learning by freezing/unfreezing layers and using pretrained weights; validate before deployment.
## Example Use Cases
- Image classification using a Sequential CNN for a baseline in a CV project.
- Text classification or sentiment analysis with a Functional/Transformer-based model.
- Time-series forecasting with LSTM/GRU or 1D convnets on a real-world dataset.
- Generative DL project: training a VAE or GAN for image synthesis tasks.
- Transfer learning pipeline: fine-tuning a pretrained model for a downstream task and deploying it.