suffix-structure-builder
npx machina-cli add skill a5c-ai/babysitter/suffix-structure-builder --openclaw
Suffix Structure Builder Skill
Purpose
Build suffix arrays, suffix trees, and related structures with efficient construction algorithms and common query implementations.
Capabilities
- Suffix array construction (SA-IS, DC3)
- LCP array construction
- Suffix tree construction
- Suffix automaton construction
- Query implementations for each structure
- Sparse table for LCP queries
Target Processes
- trie-suffix-structures
- pattern-matching-algorithms
- string-processing
Suffix Structures
Suffix Array
- O(n log n) or O(n) construction
- Combined with LCP for powerful queries
- Pattern matching in O(m log n)
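As a minimal sketch of the points above, the following builds a suffix array by prefix doubling and searches it with two binary searches. It assumes plain Python strings, and it uses a sort-based doubling variant (O(n log² n)) as a simpler stand-in for SA-IS or DC3; the function names `build_suffix_array` and `find_occurrences` are illustrative, not part of this skill's output.

```python
def build_suffix_array(s: str) -> list[int]:
    """Prefix doubling: sort suffixes by their first 1, 2, 4, ... characters."""
    n = len(s)
    sa = list(range(n))
    rank = [ord(c) for c in s]
    k = 1
    while n > 1:
        # Key = (rank of the first k chars, rank of the next k chars or -1 past the end).
        def key(i: int) -> tuple[int, int]:
            return (rank[i], rank[i + k] if i + k < n else -1)

        sa.sort(key=key)
        new_rank = [0] * n
        for j in range(1, n):
            new_rank[sa[j]] = new_rank[sa[j - 1]] + (key(sa[j]) != key(sa[j - 1]))
        rank = new_rank
        if rank[sa[-1]] == n - 1:  # all ranks distinct: order is final
            break
        k *= 2
    return sa


def find_occurrences(s: str, sa: list[int], p: str) -> list[int]:
    """Start positions of p in s, found with two binary searches over sa."""
    m = len(p)
    lo, hi = 0, len(sa)
    while lo < hi:  # first suffix whose length-m prefix is >= p
        mid = (lo + hi) // 2
        if s[sa[mid]:sa[mid] + m] < p:
            lo = mid + 1
        else:
            hi = mid
    start = lo
    hi = len(sa)
    while lo < hi:  # first suffix whose length-m prefix is > p
        mid = (lo + hi) // 2
        if s[sa[mid]:sa[mid] + m] <= p:
            lo = mid + 1
        else:
            hi = mid
    return sorted(sa[start:lo])


text = "banana"
sa = build_suffix_array(text)
print(sa)                                  # [5, 3, 1, 0, 4, 2]
print(find_occurrences(text, sa, "ana"))   # [1, 3]
```

Each binary-search step compares at most m characters, which is where the O(m log n) bound for pattern matching comes from.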
LCP Array
- Kasai's algorithm O(n)
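- Range minimum queries over the LCP array give the longest common prefix of any two suffixes (the string depth of their LCA in the suffix tree)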
- Distinct substring counting
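A short sketch of Kasai's algorithm and the distinct-substring count it enables is shown here. It assumes the convention that `lcp[i]` is the longest common prefix of the suffixes at `sa[i-1]` and `sa[i]` (with `lcp[0] = 0`); to keep the example self-contained, the suffix array is built with a naive sort, which is an assumption made for brevity and not the construction this skill targets.

```python
def kasai_lcp(s: str, sa: list[int]) -> list[int]:
    """lcp[i] = LCP length of suffixes sa[i-1] and sa[i]; lcp[0] = 0. Runs in O(n)."""
    n = len(s)
    rank = [0] * n
    for i, suf in enumerate(sa):
        rank[suf] = i
    lcp = [0] * n
    h = 0  # current match length, carried over between consecutive text positions
    for i in range(n):
        if rank[i] == 0:
            h = 0
            continue
        j = sa[rank[i] - 1]  # suffix just before suffix i in sorted order
        while i + h < n and j + h < n and s[i + h] == s[j + h]:
            h += 1
        lcp[rank[i]] = h
        if h > 0:
            h -= 1  # dropping the first character loses at most one matched char
    return lcp


def count_distinct_substrings(s: str, lcp: list[int]) -> int:
    """All n(n+1)/2 suffix prefixes minus those shared with the previous suffix."""
    n = len(s)
    return n * (n + 1) // 2 - sum(lcp)


text = "banana"
sa = sorted(range(len(text)), key=lambda i: text[i:])  # naive build, for the demo only
lcp = kasai_lcp(text, sa)
print(lcp)                                   # [0, 1, 3, 0, 0, 2]
print(count_distinct_substrings(text, lcp))  # 15
```

The count works because each suffix contributes one new substring per prefix, except for the `lcp[i]` prefixes it already shares with its predecessor in sorted order.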
Suffix Tree
- Ukkonen's algorithm O(n)
- More complex to implement and heavier on memory than a suffix array, but powerful
- Direct pattern matching O(m)
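To illustrate why matching costs O(m) once the tree exists, here is a deliberately simplified sketch. Note that it is not Ukkonen's algorithm: it builds an uncompressed suffix trie in O(n²) time and space, swapped in because it fits in a few lines. The names `build_suffix_trie`, `contains`, and `count_nodes` are hypothetical helpers for this example.

```python
def build_suffix_trie(s: str) -> dict:
    """Naive O(n^2) suffix trie: insert every suffix character by character."""
    root: dict = {}
    for i in range(len(s)):
        node = root
        for ch in s[i:]:
            node = node.setdefault(ch, {})
    return root


def contains(trie: dict, p: str) -> bool:
    """Walk one edge per pattern character: O(m) regardless of text length."""
    node = trie
    for ch in p:
        if ch not in node:
            return False
        node = node[ch]
    return True


def count_nodes(trie: dict) -> int:
    """Non-root nodes; each corresponds to one distinct substring of the text."""
    return sum(1 + count_nodes(child) for child in trie.values())


trie = build_suffix_trie("banana")
print(contains(trie, "nan"))  # True
print(contains(trie, "nab"))  # False
print(count_nodes(trie))      # 15
```

A real suffix tree compresses chains of single-child nodes into labelled edges, which is what brings the size down to O(n) and is the part Ukkonen's algorithm constructs online.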
Suffix Automaton
- O(n) construction
- Minimal deterministic automaton accepting every suffix of the text; every substring is a path from the start state
- Powerful for counting problems
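The following sketch shows the usual online construction of a suffix automaton and one typical counting query. It assumes dictionary-based transitions and parallel lists for suffix links and lengths; the class and method names are illustrative and not taken from this skill's generated code.

```python
class SuffixAutomaton:
    """Online construction; at most 2n - 1 states for a text of length n >= 2."""

    def __init__(self, s: str) -> None:
        self.next: list[dict[str, int]] = [{}]  # transitions per state
        self.link: list[int] = [-1]             # suffix links
        self.length: list[int] = [0]            # longest substring ending in each state
        self.last = 0
        for ch in s:
            self._extend(ch)

    def _new_state(self, length: int, link: int, trans: dict[str, int]) -> int:
        self.next.append(trans)
        self.length.append(length)
        self.link.append(link)
        return len(self.next) - 1

    def _extend(self, ch: str) -> None:
        cur = self._new_state(self.length[self.last] + 1, -1, {})
        p = self.last
        while p != -1 and ch not in self.next[p]:
            self.next[p][ch] = cur
            p = self.link[p]
        if p == -1:
            self.link[cur] = 0
        else:
            q = self.next[p][ch]
            if self.length[p] + 1 == self.length[q]:
                self.link[cur] = q
            else:
                # Split q: the clone keeps the shorter substrings of q's class.
                clone = self._new_state(self.length[p] + 1, self.link[q], dict(self.next[q]))
                while p != -1 and self.next[p].get(ch) == q:
                    self.next[p][ch] = clone
                    p = self.link[p]
                self.link[q] = clone
                self.link[cur] = clone
        self.last = cur

    def contains(self, p: str) -> bool:
        state = 0
        for ch in p:
            nxt = self.next[state].get(ch)
            if nxt is None:
                return False
            state = nxt
        return True

    def count_distinct_substrings(self) -> int:
        # State v represents length[v] - length[link[v]] distinct substrings.
        return sum(self.length[v] - self.length[self.link[v]]
                   for v in range(1, len(self.next)))


sam = SuffixAutomaton("banana")
print(sam.count_distinct_substrings())  # 15
print(sam.contains("nan"))              # True
```

Summing `length[v] - length[link[v]]` over states is a common reason to prefer the automaton for counting problems, since no separate LCP array is needed.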
Input Schema
{
  "type": "object",
  "properties": {
    "structure": {
      "type": "string",
      "enum": ["suffixArray", "lcpArray", "suffixTree", "suffixAutomaton"]
    },
    "algorithm": { "type": "string" },
    "queries": { "type": "array" },
    "language": {
      "type": "string",
      "enum": ["cpp", "python", "java"]
    }
  },
  "required": ["structure"]
}
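A request can be checked against this schema before it is submitted. The sketch below hand-rolls the checks with the standard library only, so it covers just the constraints listed above (required `structure`, the two enums, and the basic types); `validate_request` is a hypothetical helper, not an API provided by this skill.

```python
ALLOWED_STRUCTURES = {"suffixArray", "lcpArray", "suffixTree", "suffixAutomaton"}
ALLOWED_LANGUAGES = {"cpp", "python", "java"}


def validate_request(req: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means the request is valid."""
    errors: list[str] = []
    if "structure" not in req:
        errors.append("missing required field: structure")
    elif not isinstance(req["structure"], str) or req["structure"] not in ALLOWED_STRUCTURES:
        errors.append(f"structure must be one of {sorted(ALLOWED_STRUCTURES)}")
    if "algorithm" in req and not isinstance(req["algorithm"], str):
        errors.append("algorithm must be a string")
    if "queries" in req and not isinstance(req["queries"], list):
        errors.append("queries must be an array")
    if "language" in req and (
        not isinstance(req["language"], str) or req["language"] not in ALLOWED_LANGUAGES
    ):
        errors.append(f"language must be one of {sorted(ALLOWED_LANGUAGES)}")
    return errors


request = {"structure": "suffixArray", "algorithm": "SA-IS", "language": "cpp"}
print(validate_request(request))  # []
```

Returning a list of problems instead of raising on the first one makes it easier to report every schema violation at once.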
Output Schema
{
  "type": "object",
  "properties": {
    "success": { "type": "boolean" },
    "code": { "type": "string" },
    "complexity": { "type": "object" },
    "queryImplementations": { "type": "array" }
  },
  "required": ["success", "code"]
}
Source
git clone https://github.com/a5c-ai/babysitter/blob/main/plugins/babysitter/skills/babysit/process/specializations/algorithms-optimization/skills/suffix-structure-builder/SKILL.md
Overview
Builds suffix arrays, trees, and related structures with efficient construction algorithms and common query implementations. This skill enables fast pattern matching, substring counting, and text analysis by combining SA-IS, DC3, Kasai, Ukkonen, and sparse table techniques.
How This Skill Works
Implements constructors for SA (SA-IS/DC3), LCP (Kasai), suffix trees (Ukkonen), and suffix automata, then provides query implementations for each structure. Uses a structured input schema to select the structure, algorithm, and language, and outputs a standardized result object including success, code, and optional query implementations.
When to Use It
- Fast pattern matching on large texts using a suffix array with an LCP array
- Counting distinct substrings, or answering longest-common-prefix queries between suffixes via RMQ over the LCP array
- Building a suffix tree or suffix automaton for substring queries
- Prototyping or comparing suffix-structure algorithms (SA-IS vs DC3 vs Ukkonen)
- Designing a string-processing component for trie-suffix-structures workflows
Quick Start
- Step 1: Define input with {structure: 'suffixArray', algorithm: 'SA-IS', language: 'cpp'}
- Step 2: Run construction and query methods; inspect queryImplementations
- Step 3: Validate results and optimize with LCP RMQ or switch algorithms
Best Practices
- Choose SA-IS for linear-time suffix array construction on large inputs
- Combine suffix arrays with LCP and RMQ for efficient substring queries
- Use a suffix automaton when counting or enumerating distinct substrings
- Validate input against the provided schema (structure, algorithm, language)
- Cache or reuse constructed structures when processing many queries in batch
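The suffix array + LCP + RMQ combination recommended above can be sketched as a sparse table over the LCP array, built once and reused for many queries. The example assumes `lcp[k]` is the LCP of the suffixes at `sa[k-1]` and `sa[k]`; to stay short it builds `sa` and `lcp` naively, which is an assumption for the demo, and all function names are illustrative.

```python
def build_sparse_table(values: list[int]) -> list[list[int]]:
    """table[j][i] = min(values[i : i + 2**j]); O(n log n) time and space."""
    n = len(values)
    table = [list(values)]
    j = 1
    while (1 << j) <= n:
        prev, half = table[-1], 1 << (j - 1)
        table.append([min(prev[i], prev[i + half]) for i in range(n - (1 << j) + 1)])
        j += 1
    return table


def range_min(table: list[list[int]], lo: int, hi: int) -> int:
    """Minimum of values[lo:hi] (hi exclusive, lo < hi) using two overlapping blocks."""
    j = (hi - lo).bit_length() - 1
    return min(table[j][lo], table[j][hi - (1 << j)])


def common_prefix_len(a: str, b: str) -> int:
    k = 0
    while k < len(a) and k < len(b) and a[k] == b[k]:
        k += 1
    return k


def lcp_of_suffixes(i: int, j: int, rank: list[int], table: list[list[int]], n: int) -> int:
    """LCP of the suffixes starting at text positions i and j, in O(1) per query."""
    if i == j:
        return n - i
    a, b = sorted((rank[i], rank[j]))
    return range_min(table, a + 1, b + 1)


text = "banana"
n = len(text)
sa = sorted(range(n), key=lambda i: text[i:])  # naive build, for the demo only
lcp = [0] + [common_prefix_len(text[sa[k - 1]:], text[sa[k]:]) for k in range(1, n)]
rank = [0] * n
for pos, suf in enumerate(sa):
    rank[suf] = pos
table = build_sparse_table(lcp)  # build once, then reuse across queries
print(lcp_of_suffixes(1, 3, rank, table, n))  # 3  ("anana" vs "ana")
```

Because the table is immutable after construction, it is a natural candidate for the caching suggested in the last bullet when many queries hit the same text.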
Example Use Cases
- Text search indexing and autocomplete in document corpora
- DNA or genome sequence analysis with substring queries
- Plagiarism detection by matching substrings across texts
- Log analysis for pattern occurrences and anomaly detection
- Educational tools demonstrating suffix-structure algorithms