Dist Design
npx machina-cli add skill Langerrr/distributed-architect/dist-design --openclaw
/dist-design: Design-Time Architecture Analysis
Evaluate architectural options and their trade-offs for distributed system decisions.
Arguments
$ARGUMENTS may contain a description of what's being designed.
Step 1: Gather Requirements
Understand from $ARGUMENTS or by asking the user:
- What problem does this solve?
- What are the constraints (latency, throughput, consistency, fault tolerance)?
- What components already exist? (load topology file if available)
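The answers to these questions can be kept in one small record so that gaps are easy to spot. The following is a minimal sketch, not part of the skill itself: the `Requirements` class, its field names, and the `missing_constraints` helper are all hypothetical.

```python
from dataclasses import dataclass, field

# The four constraint categories named in Step 1.
CONSTRAINT_KEYS = ("latency", "throughput", "consistency", "fault_tolerance")


@dataclass
class Requirements:
    """Hypothetical container for the answers gathered in Step 1."""
    problem: str
    constraints: dict = field(default_factory=dict)
    existing_components: list = field(default_factory=list)

    def missing_constraints(self):
        """Return the constraint categories that still need to be asked about."""
        return [key for key in CONSTRAINT_KEYS if not self.constraints.get(key)]


# Illustrative values only.
reqs = Requirements(
    problem="Ingest device telemetry",
    constraints={"latency": "p99 under 200 ms", "consistency": "eventual"},
)
print(reqs.missing_constraints())  # ['throughput', 'fault_tolerance']
```

Anything returned by `missing_constraints` is a prompt for a follow-up question to the user before moving on.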
Step 2: Load Project Topology
Look for a topology file. If it exists, understand the current system shape. If it doesn't, work with whatever architecture context is available.
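The "load if present, otherwise carry on" behavior might look like the sketch below. The source does not say what the topology file is called or how it is formatted, so the file name `topology.json` and the JSON format are assumptions made purely for illustration.

```python
import json
from pathlib import Path


def load_topology(path):
    """Return the parsed topology, or None when the file is absent."""
    file = Path(path)
    if not file.is_file():
        return None  # Step 2 fallback: work with whatever context is available
    return json.loads(file.read_text(encoding="utf-8"))


topology = load_topology("topology.json")  # hypothetical file name and format
if topology is None:
    print("No topology file found; continuing with available context.")
else:
    print(f"Loaded topology with top-level keys: {sorted(topology)}")
```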
Step 3: Identify the Design Decision
Categorize what's being decided:
| Decision type | Examples |
|---|---|
| Communication pattern | Stream vs queue vs RPC, push vs pull, sync vs async |
| Component boundary | Where to split responsibilities, same process vs separate |
| State management | Where state lives, who owns it, consistency model |
| Failure strategy | Retry policy, circuit breakers, fallback behavior, DLQ design |
| Scaling approach | Horizontal vs vertical, partitioning, sharding, replication |
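The table above can be expressed as an enumeration, with a deliberately naive keyword match to suggest which categories a description touches. This is a rough sketch: the `DecisionType` enum, the keyword lists (loosely drawn from the Examples column), and `categorize` are hypothetical, and real categorization would need judgment rather than substring matching.

```python
from enum import Enum


class DecisionType(Enum):
    COMMUNICATION_PATTERN = "Communication pattern"
    COMPONENT_BOUNDARY = "Component boundary"
    STATE_MANAGEMENT = "State management"
    FAILURE_STRATEGY = "Failure strategy"
    SCALING_APPROACH = "Scaling approach"


# Naive keyword hints; purely illustrative and easy to fool.
KEYWORDS = {
    DecisionType.COMMUNICATION_PATTERN: ("stream", "queue", "rpc", "push", "pull", "sync"),
    DecisionType.COMPONENT_BOUNDARY: ("split", "process", "boundary"),
    DecisionType.STATE_MANAGEMENT: ("state", "consistency", "owner"),
    DecisionType.FAILURE_STRATEGY: ("retry", "circuit breaker", "fallback", "dlq"),
    DecisionType.SCALING_APPROACH: ("horizontal", "vertical", "partition", "shard", "replica"),
}


def categorize(description):
    """Return every decision type whose keywords appear in the description."""
    text = description.lower()
    return [d for d, words in KEYWORDS.items() if any(w in text for w in words)]


matches = categorize("Should ingestion use a queue with a retry policy?")
print([d.value for d in matches])  # ['Communication pattern', 'Failure strategy']
```

A single design question often spans more than one category, which is why `categorize` returns a list rather than a single value.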
Step 4: Enumerate Options
For each viable option, analyze:
a. Topology impact: What cardinality does it introduce? New convergence points?
b. Failure analysis: What fails, blast radius, recovery mechanism?
c. Concurrency implications: New concurrent access to shared resources?
d. Operational cost: Complexity to operate, monitor, debug?
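To make sure every option is analyzed along all four dimensions, each option can be captured in a record that reports which dimensions are still blank. The sketch below is illustrative; `OptionAnalysis` and its field names are hypothetical, and the sample values are invented.

```python
from dataclasses import dataclass, fields


@dataclass
class OptionAnalysis:
    """Hypothetical record with one field per Step 4 dimension."""
    name: str
    topology_impact: str = ""
    failure_analysis: str = ""
    concurrency_implications: str = ""
    operational_cost: str = ""

    def unanswered(self):
        """List the dimensions that have not been filled in yet."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


# Invented example values.
option = OptionAnalysis(
    name="Durable queue",
    topology_impact="Adds one convergence point: N producers to 1 queue",
    failure_analysis="Broker outage stalls consumers; producers buffer locally",
)
print(option.unanswered())  # ['concurrency_implications', 'operational_cost']
```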
Step 5: Compare and Recommend
Present options side by side:
## Design Analysis: [decision description]
### Option A: [name]
- Topology: [impact]
- Failure modes: [key risks]
- Concurrency: [concerns]
- Operational cost: [assessment]
- Best when: [conditions where this option excels]
### Option B: [name]
- [same structure]
### Recommendation
[Which option and why, given the stated constraints]
### Topology Changes
[How the topology file should be updated if adopted]
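The template above can be filled in mechanically once each option has been analyzed. The following sketch assumes each option is a plain dict with the keys shown; the function name `render_analysis`, the dict shape, and the sample content are all hypothetical.

```python
from string import ascii_uppercase


def render_analysis(decision, options, recommendation, topology_changes):
    """Fill the Step 5 template; `options` is a list of dicts (assumed shape)."""
    lines = [f"## Design Analysis: {decision}"]
    for letter, opt in zip(ascii_uppercase, options):
        lines += [
            f"### Option {letter}: {opt['name']}",
            f"- Topology: {opt['topology']}",
            f"- Failure modes: {opt['failure_modes']}",
            f"- Concurrency: {opt['concurrency']}",
            f"- Operational cost: {opt['operational_cost']}",
            f"- Best when: {opt['best_when']}",
        ]
    lines += ["### Recommendation", recommendation,
              "### Topology Changes", topology_changes]
    return "\n".join(lines)


# Invented content, for illustration only.
options = [
    {"name": "Durable queue", "topology": "N producers to 1 queue",
     "failure_modes": "broker outage", "concurrency": "competing consumers",
     "operational_cost": "broker to run and monitor", "best_when": "bursty load"},
    {"name": "Direct RPC", "topology": "N callers to M servers",
     "failure_modes": "cascading timeouts", "concurrency": "per-request only",
     "operational_cost": "low", "best_when": "low latency is required"},
]
report = render_analysis("Ingestion path", options,
                         "Option A, given the bursty load.",
                         "Add a queue node between producers and workers.")
print(report)
```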
Step 6: Flag Unknowns
Explicitly state what you DON'T know that could change the recommendation:
- Load characteristics not yet known
- Failure rates of dependencies not measured
- Scaling requirements still undefined
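An unknown is most useful when it is paired with how it could change the recommendation. A minimal sketch of that pairing follows; the `Unknown` record and `format_unknowns` helper are hypothetical, and the sample entries are invented.

```python
from dataclasses import dataclass


@dataclass
class Unknown:
    what: str          # the missing piece of information
    could_change: str  # how resolving it might alter the recommendation


def format_unknowns(unknowns):
    """Render unknowns as a bullet list for the end of the analysis."""
    return "\n".join(f"- {u.what} (could change: {u.could_change})" for u in unknowns)


# Invented example entries.
unknowns = [
    Unknown("Peak load not yet known", "queue vs direct RPC"),
    Unknown("Dependency failure rate not measured", "retry budget"),
]
print(format_unknowns(unknowns))
```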
Source
https://github.com/Langerrr/distributed-architect/blob/main/skills/dist-design/SKILL.md
Overview
Dist Design offers a structured, design-time analysis of distributed system options. It focuses on trade-offs across communication patterns, component boundaries, state management, failure strategy, and scaling. By gathering requirements, loading topology, and enumerating options, it guides a practical recommendation before implementation.
How This Skill Works
It follows a six-step workflow: gather requirements and constraints, load or infer the project topology, identify the type of design decision, enumerate viable options (topology impact, failure modes, concurrency, and operational cost), compare them side by side and recommend one along with any required topology changes, and flag the unknowns that could change that recommendation.
When to Use It
- Choosing between communication patterns (stream, queue, RPC) for a new service.
- Deciding where to split responsibilities between components (same process vs separate services).
- Evaluating state ownership and consistency for distributed storage or caches.
- Assessing scaling strategies (horizontal vs vertical, partitioning, replication).
- Performing a design-time review when topology files exist or are being created.
Quick Start
- Step 1: Gather requirements and constraints from $ARGUMENTS or user input.
- Step 2: Load the project topology (or infer context) and identify the decision types.
- Step 3: Enumerate options, compare them, draft topology changes, and select a recommendation.
Best Practices
- Start with explicit constraints: latency, throughput, consistency, and fault tolerance.
- Load or define the topology context before evaluating options.
- Document topology changes and rationale for each option.
- Assess topology impact, failure modes, concurrency, and operational cost for every option.
- Validate recommendations with simulations, pilots, or incremental rollout.
Example Use Cases
- Choosing between stream processing and request-driven RPC for a data ingestion path.
- Deciding component boundaries when breaking a monolith into microservices.
- Selecting a horizontal vs vertical scaling strategy for a data store with eventual consistency.
- Defining where state lives and who owns it in a distributed cache layer.
- Designing a retry and circuit-breaker policy as part of a failure strategy.