mcts-select
npx machina-cli add skill NewJerseyStyle/plugin-mcts/mcts-select --openclaw
MCTS Selection Phase
You are executing the SELECTION phase of Monte Carlo Tree Search.
UCB1 Formula
For each node, calculate:
UCB = Q/N + c * sqrt(ln(parent_N) / N)
Where:
- Q: Total value/reward accumulated at this node
- N: Number of visits to this node
- parent_N: Number of visits to parent node
- c: Exploration constant (typically sqrt(2) ≈ 1.414)
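The formula above can be sketched as a small Python function. This is a minimal illustration, not the plugin's implementation: the function name `ucb1` is hypothetical, and returning infinity for an unvisited node (N=0) is an assumed convention that avoids the division by zero.

```python
import math


def ucb1(q: float, n: int, parent_n: int, c: float = math.sqrt(2)) -> float:
    """UCB1 score: average reward (Q/N) plus an exploration bonus."""
    if n == 0:
        # The formula divides by N, so an unvisited node is treated as
        # infinitely attractive (assumed convention, not from the source).
        return math.inf
    return q / n + c * math.sqrt(math.log(parent_n) / n)
```

With the same average reward, the node with fewer visits scores higher, because the exploration bonus shrinks as N grows.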
Selection Algorithm
- Start at root node
- While current node is fully expanded and not terminal:
  - Calculate UCB for all children
  - Select child with highest UCB value
  - Move to selected child
- Return the selected leaf node
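The loop above might look roughly like this in Python. The `Node` class, its field names, and the `untried_actions` counter used to decide "fully expanded" are all assumptions made for this sketch; the plugin's actual tree representation may differ.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical node layout, for illustration only.
    node_id: str
    q: float = 0.0             # total reward accumulated (Q)
    n: int = 0                 # visit count (N)
    children: list["Node"] = field(default_factory=list)
    untried_actions: int = 0   # > 0 means the node is not fully expanded
    is_terminal: bool = False


def ucb(child: Node, parent_n: int, c: float) -> float:
    if child.n == 0:
        return math.inf        # unvisited children win (assumed convention)
    return child.q / child.n + c * math.sqrt(math.log(parent_n) / child.n)


def select(root: Node, c: float = math.sqrt(2)) -> list[Node]:
    """Return the path from the root to the selected leaf."""
    path = [root]
    node = root
    # Descend while the node is fully expanded, non-terminal, and has children.
    while node.untried_actions == 0 and not node.is_terminal and node.children:
        parent = node
        node = max(parent.children, key=lambda ch: ucb(ch, parent.n, c))
        path.append(node)
    return path
```

Returning the whole path rather than only the leaf keeps the information needed later to report the route and to update visit counts along it.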
Using MCP Tools
Call mcts_select with optional parameters:
- exploration_constant: Value for c (default: 1.414)
- tree_id: If managing multiple trees
The tool returns:
- selected_node_id: The ID of the selected node
- path: The path from root to selected node
- node_state: The state at the selected node
- is_terminal: Whether this is a terminal state
- ucb_scores: UCB scores for nodes along the path
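As a rough guide to working with the result, the returned fields could be modeled as below. Only the field names come from the list above; the value types, the `SelectResult` and `summarize` names, and the sample payload are assumptions for illustration, not real tool output.

```python
from typing import Any, TypedDict


class SelectResult(TypedDict):
    # Field names are from the tool description; the types are guesses.
    selected_node_id: str
    path: list[str]
    node_state: Any
    is_terminal: bool
    ucb_scores: dict[str, float]


def summarize(result: SelectResult) -> str:
    """Build a one-line summary of a selection result for logs."""
    route = " -> ".join(result["path"])
    kind = "terminal" if result["is_terminal"] else "non-terminal"
    return f"selected {result['selected_node_id']} ({kind}) via {route}"


# Made-up payload for illustration only.
example: SelectResult = {
    "selected_node_id": "n3",
    "path": ["root", "n1", "n3"],
    "node_state": {"depth": 2},
    "is_terminal": False,
    "ucb_scores": {"n1": 1.21, "n3": 1.57},
}
print(summarize(example))
```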
Selection Strategy
For the current problem context: $ARGUMENTS
- Check if any nodes are unexplored (N=0) - these get priority
- Among explored nodes, balance:
  - Exploitation: Nodes with high average reward (Q/N)
  - Exploration: Nodes visited less frequently
- Consider domain-specific heuristics from observations
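To make the exploitation/exploration balance concrete, the sketch below splits the UCB score into its two terms for two visited nodes. The helper name `ucb_terms` and all statistics are made up for the example.

```python
import math


def ucb_terms(q: float, n: int, parent_n: int, c: float = math.sqrt(2)):
    """Split UCB1 into (exploitation, exploration) parts for a visited node."""
    return q / n, c * math.sqrt(math.log(parent_n) / n)


# Made-up statistics: the parent has been visited 100 times.
strong = ucb_terms(q=42.0, n=60, parent_n=100)  # high average, many visits
rare = ucb_terms(q=2.0, n=5, parent_n=100)      # lower average, few visits

for name, (exploit, explore) in (("strong", strong), ("rare", rare)):
    print(f"{name}: exploit={exploit:.3f} explore={explore:.3f} "
          f"total={exploit + explore:.3f}")
```

With the default c, the rarely visited node gets the higher total even though its average reward is lower. Setting c to 0 removes the exploration term, and the node with the higher average wins instead.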
Output
After selection, report:
- Selected node ID and state
- Path taken from root
- UCB reasoning for the selection
- Whether expansion is needed (if node has unexplored children)
Proceed to EXPANSION phase with the selected node.
Source
https://github.com/NewJerseyStyle/plugin-mcts/blob/main/skills/mcts-select/SKILL.md
Overview
Implements the Selection phase of Monte Carlo Tree Search using the UCB1 score to traverse from the root to a promising leaf. It prioritizes unexplored nodes (N=0), balances exploitation and exploration, and reports the path, node state, and UCB scores for debugging. The process prepares the next phase (Expansion) by returning the chosen leaf and its context.
How This Skill Works
Start at the root and, while the current node is fully expanded and non-terminal, compute the UCB1 score Q/N + c * sqrt(ln(parent_N)/N) for each child and move to the child with the highest score. A child with N=0 has no defined score (the formula divides by N), so unvisited children are selected before any visited sibling. Return the selected leaf along with the path, node state, terminal status, and the UCB scores used for the decision.
When to Use It
- During a single MCTS iteration to identify the leaf node for expansion
- When balancing high-reward nodes against rarely visited ones
- When you want to inspect or debug the selected path and the UCB scores along it
- When using multi-tree setups via tree_id to manage several MCTS trees in parallel
- When a domain heuristic should influence the selection amidst UCB-based decisions
Quick Start
- Step 1: Call mcts_select, optionally passing exploration_constant and tree_id.
- Step 2: While the current node is fully expanded and non-terminal, compute UCB for all children and pick the highest; move to that child.
- Step 3: Return and inspect selected_node_id, path, node_state, is_terminal, and ucb_scores to decide on Expansion or rollout.
Best Practices
- Prioritize unexplored nodes (N=0) before descending into already-visited children
- Use the standard UCB1 formula: UCB = Q/N + c * sqrt(ln(parent_N) / N)
- Set an appropriate exploration constant c (default ~1.414) and keep it consistent
- Record and review the path and ucb_scores to diagnose poor expansions
- Proceed to Expansion only after a leaf is selected or a stopping condition is met
Example Use Cases
- AI game agent selecting the next move in a board game by traversing from the root to a leaf using UCB1
- Robotic path planning where the agent chooses the next waypoint based on visitation and reward
- Resource allocation in a strategy game where known good moves must be balanced against exploration
- Puzzle-solving or planning tasks where the search tree is large and parallel simulations help
- Multi-tree MCTS setups where different trees are managed with a shared selection strategy (tree_id)