What is /dist-design?

A design-time analysis framework for distributed systems that evaluates trade-offs across communication patterns, component boundaries, state management, failure strategy, and scaling.

When should I use dist-design?

Use it when choosing among communication patterns, determining component boundaries, or selecting scaling strategies for a distributed system.

What outputs does dist-design produce?

A side-by-side comparison of options, a recommended design, required topology changes, and a list of unknowns that could affect the decision.

Dist Design

npx machina-cli add skill Langerrr/distributed-architect/dist-design --openclaw

/dist-design — Design-Time Architecture Analysis

Evaluate architectural options and their trade-offs for distributed system decisions.

Arguments

$ARGUMENTS may contain a description of what's being designed

Step 1: Gather Requirements

Understand from $ARGUMENTS or by asking the user:

What problem does this solve?
What are the constraints (latency, throughput, consistency, fault tolerance)?
What components already exist? (load topology file if available)
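
The constraint questions above can be tracked with a small record, so that anything the user has not stated yet is easy to spot and ask about. This is a minimal sketch, not part of the skill itself; the `Constraints` class, its field names, and `missing_constraints` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Constraints:
    """Step 1 constraints (hypothetical record); None means 'not yet stated'."""
    latency: Optional[str] = None          # e.g. "p99 under 200 ms"
    throughput: Optional[str] = None       # e.g. "5k events/s"
    consistency: Optional[str] = None      # e.g. "eventual"
    fault_tolerance: Optional[str] = None  # e.g. "survive one node loss"


def missing_constraints(c: Constraints) -> list[str]:
    """Return the names of constraints that still need to be asked about."""
    return [f.name for f in fields(c) if getattr(c, f.name) is None]


c = Constraints(latency="p99 under 200 ms", consistency="eventual")
print(missing_constraints(c))  # ['throughput', 'fault_tolerance']
```

Anything returned here is either a question for the user or, if it cannot be answered yet, a candidate for the unknowns flagged in Step 6.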

Step 2: Load Project Topology

Look for a topology file. If it exists, understand the current system shape. If it doesn't, work with whatever architecture context is available.
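
The "load if present, otherwise carry on" behavior can be sketched as follows. The skill does not specify the topology file's name or format here, so the `topology.json` path and the JSON encoding are assumptions made purely for illustration, as is the `load_topology` helper.

```python
import json
from pathlib import Path
from typing import Optional


def load_topology(path: str = "topology.json") -> Optional[dict]:
    """Return the parsed topology, or None when no file exists.

    The default file name and the JSON format are illustrative assumptions.
    """
    p = Path(path)
    if not p.is_file():
        return None  # fall back to whatever architecture context is available
    with p.open(encoding="utf-8") as f:
        return json.load(f)


topology = load_topology()
if topology is None:
    print("No topology file found; relying on described architecture context.")
else:
    print(f"Loaded topology with {len(topology)} top-level entries.")
```

Returning `None` rather than raising keeps the missing-file case a normal path, which matches the step's instruction to proceed with whatever context is available.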

Step 3: Identify the Design Decision

Categorize what's being decided:

Decision type	Examples
Communication pattern	Stream vs queue vs RPC, push vs pull, sync vs async
Component boundary	Where to split responsibilities, same process vs separate
State management	Where state lives, who owns it, consistency model
Failure strategy	Retry policy, circuit breakers, fallback behavior, DLQ design
Scaling approach	Horizontal vs vertical, partitioning, sharding, replication

Step 4: Enumerate Options

For each viable option, analyze:

a. Topology impact: What cardinality does it introduce? New convergence points?
b. Failure analysis: What fails, blast radius, recovery mechanism?
c. Concurrency implications: New concurrent access to shared resources?
d. Operational cost: Complexity to operate, monitor, debug?
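
One way to make the topology-impact question concrete is to list the components that an option would turn into convergence points. The sketch below assumes a convergence point means a component with more than one upstream producer, which is an interpretation rather than a definition given by the skill; the edge list and component names are hypothetical.

```python
from collections import defaultdict

# Hypothetical topology: (producer, consumer) edges for one candidate option.
edges = [
    ("api", "queue"),
    ("batch-importer", "queue"),
    ("queue", "worker"),
    ("worker", "db"),
    ("api", "db"),
]


def convergence_points(edges):
    """Map each component with more than one upstream to its sorted upstreams."""
    upstreams = defaultdict(set)
    for src, dst in edges:
        upstreams[dst].add(src)
    return {node: sorted(srcs) for node, srcs in upstreams.items() if len(srcs) > 1}


print(convergence_points(edges))
# {'queue': ['api', 'batch-importer'], 'db': ['api', 'worker']}
```

Components that show up in this result are also natural places to check for the new concurrent access asked about in item c.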

Step 5: Compare and Recommend

Present options side by side:

## Design Analysis: [decision description]

### Option A: [name]
- Topology: [impact]
- Failure modes: [key risks]
- Concurrency: [concerns]
- Operational cost: [assessment]
- Best when: [conditions where this option excels]

### Option B: [name]
- [same structure]

### Recommendation
[Which option and why, given the stated constraints]

### Topology Changes
[How the topology file should be updated if adopted]
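
If the comparison is assembled programmatically, the template above can be filled from structured data. The following is a rough sketch under that assumption; the `Option` record, the `render_analysis` function, and the sample values are all hypothetical and only mirror the template's fields.

```python
from dataclasses import dataclass


@dataclass
class Option:
    """One candidate design, with one field per line of the template."""
    name: str
    topology: str
    failure_modes: str
    concurrency: str
    operational_cost: str
    best_when: str


def render_analysis(decision, options, recommendation, topology_changes):
    """Fill the Step 5 template; options are labelled A, B, C, ..."""
    lines = [f"## Design Analysis: {decision}", ""]
    for i, opt in enumerate(options):
        label = chr(ord("A") + i)
        lines += [
            f"### Option {label}: {opt.name}",
            f"- Topology: {opt.topology}",
            f"- Failure modes: {opt.failure_modes}",
            f"- Concurrency: {opt.concurrency}",
            f"- Operational cost: {opt.operational_cost}",
            f"- Best when: {opt.best_when}",
            "",
        ]
    lines += ["### Recommendation", recommendation, "",
              "### Topology Changes", topology_changes]
    return "\n".join(lines)


# Illustrative values only; a real analysis would come from Steps 1 to 4.
report = render_analysis(
    "ingestion path",
    [
        Option("Queue", "adds a broker between producers and workers",
               "broker outage stalls ingestion", "competing consumers",
               "broker to run and monitor", "bursty load, async is acceptable"),
        Option("RPC", "direct producer-to-service calls",
               "caller blocked when service is down", "per-request handling",
               "no extra infrastructure", "low latency, synchronous replies needed"),
    ],
    "Option A, because the stated constraints tolerate asynchronous processing.",
    "Add the broker as a new component between producers and workers.",
)
print(report)
```

Keeping every option in the same record type makes it harder to skip a dimension for one option while filling it in for another.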

Step 6: Flag Unknowns

Explicitly state what you DON'T know that could change the recommendation:

Load characteristics not yet known
Failure rates of dependencies not measured
Scaling requirements still undefined

Source

https://github.com/Langerrr/distributed-architect/blob/main/skills/dist-design/SKILL.md

Overview

Dist Design is a structured, design-time analysis of the options for a distributed system decision. It weighs trade-offs across communication patterns, component boundaries, state management, failure strategy, and scaling. By gathering requirements, loading the topology, and enumerating options, it arrives at a practical recommendation before implementation begins.

How This Skill Works

It follows a six-step workflow. First it gathers requirements and constraints, loads the project topology (or works from available architecture context), and identifies the type of design decision. It then enumerates viable options by topology impact, failure modes, concurrency, and operational cost, compares them side by side with a recommendation and the required topology changes, and flags the unknowns that could change that recommendation.

When to Use It

Choosing between communication patterns (stream, queue, RPC) for a new service.
Deciding where to split responsibilities between components (same process vs separate services).
Evaluating state ownership and consistency for distributed storage or caches.
Assessing scaling strategies (horizontal vs vertical, partitioning, replication).
Performing a design-time review when topology files exist or are being created.

Quick Start

Step 1: Gather requirements and constraints from $ARGUMENTS or user input.
Step 2: Load the project topology (or infer context) and identify the decision types.
Step 3: Enumerate options, compare them, draft topology changes, and select a recommendation.

Best Practices

Start with explicit constraints: latency, throughput, consistency, and fault tolerance.
Load or define the topology context before evaluating options.
Document topology changes and rationale for each option.
Assess topology impact, failure modes, concurrency, and operational cost for every option.
Validate recommendations with simulations, pilots, or incremental rollout.

Example Use Cases

Choosing between stream processing and request-driven RPC for a data ingestion path.
Deciding component boundaries when breaking a monolith into microservices.
Selecting a horizontal vs vertical scaling strategy for a data store with eventual consistency.
Defining where state lives and who owns it in a distributed cache layer.
Designing a retry and circuit-breaker policy as part of a failure strategy.
