
lightgbm

npx machina-cli add skill G1Joshi/Agent-Skills/lightgbm --openclaw
Files (1)
SKILL.md
993 B

LightGBM

LightGBM is Microsoft's gradient boosting library. It is often faster and more memory-efficient than XGBoost thanks to histogram-based binning and leaf-wise tree growth.

When to Use

  • Huge Datasets: Optimized for efficiency.
  • Ranking: LGBMRanker is excellent for search/recommendation systems.

Core Concepts

Leaf-wise Growth

Grows the tree by splitting the leaf with the maximum loss reduction, which produces deeper, unbalanced trees; level-wise growth, by contrast, keeps trees balanced.

Histogram-based

Buckets continuous values into discrete bins for speed.

Best Practices (2025)

Do:

  • Tune num_leaves: The most important parameter for controlling complexity.
  • Use Categorical Features: Pass the indexes (or names) of categorical columns via categorical_feature instead of one-hot encoding.

Don't:

  • Don't overfit: Leaf-wise growth overfits easily on small data. Limit max_depth.

References

Source

git clone https://github.com/G1Joshi/Agent-Skills.git
(skill file: skills/ai-ml/lightgbm/SKILL.md)

Overview

LightGBM is Microsoft's gradient boosting library designed for speed and memory efficiency. It uses leaf-wise tree growth and histogram-based binning to train on large datasets quickly, and it supports ranking via LGBMRanker.

How This Skill Works

LightGBM grows trees using leaf-wise growth, selecting the leaf with the maximum loss delta to split, which can yield deeper, more accurate trees. It also bins continuous features into discrete histogram bins to speed up computations and reduce memory usage.

When to Use It

  • Huge datasets that require efficient training
  • Ranking models (LGBMRanker) for search/recommendation
  • Faster training and lower memory usage than other frameworks
  • Histogram-based speed-ups from discrete binning
  • Native categorical-feature handling by passing column indexes directly

Quick Start

  1. Install LightGBM, load your dataset, identify the label and feature columns, and note categorical feature indices
  2. Initialize a model (e.g., LGBMClassifier or LGBMRegressor) with key parameters such as num_leaves and learning_rate, and specify categorical feature indices
  3. Train on the training data, use validation with early stopping where possible, and evaluate with appropriate metrics

Best Practices

  • Tune num_leaves to control model complexity
  • Pass indexes of categorical features directly to improve handling
  • Don't overfit: leaf-wise growth can overfit on small data; limit max_depth
  • Leverage histogram-based training to speed up computations
  • For ranking tasks, consider using LGBMRanker to optimize order-based metrics

Example Use Cases

  • Training large-scale ranking models for search results or recommendations using LGBMRanker
  • Efficiently training gradient boosting on web-scale datasets with lower memory usage
  • Replacing heavier boosting frameworks in pipelines to achieve faster preprocessing and training
  • Deploying LightGBM models in production to reduce latency due to histogram-based binning
  • Using categorical feature indexes directly to accelerate feature processing
