Deslop — Code Hardening Skill
Systematically audit and harden an existing codebase for production. Designed for AI-generated ("vibe coded") codebases that work superficially but are fragile, sloppy, or insecure under real-world conditions.
This skill operates in two strict phases. Never mix them.
Phase 1: AUDIT → produces AUDIT.md (human-reviewable, machine-parseable)
Phase 2: FIX → consumes AUDIT.md → applies fixes in safety order
The user provides a codebase (directory path or repo). They may specify focus areas, exclusions, or constraints. If not specified, audit everything.
Phase 1: AUDIT
MANDATORY before starting: Read the full audit checklist and format spec:
Read ~/.claude/skills/deslop/references/checklist.md # 15-category audit checklist with specific patterns
Read ~/.claude/skills/deslop/references/audit-format.md # Structured finding format for AUDIT.md
The audit runs five sequential passes. Each pass builds on the previous. Never skip a pass — thoroughness matters more than speed.
Pass 0 — Reconnaissance
Map the codebase before judging anything. Do NOT write findings yet.
- Generate file tree: `find . -type f | head -500` or equivalent. Understand the shape.
- Classify files: Source, tests, configs, migrations, infra, docs, generated, vendored.
- Identify the stack: Language(s), framework(s), package manager, build system, runtime.
- Build dependency map: Which files import from which. Use `grep -rn "import\|require\|from\|include"` or language-specific tooling.
- Identify entry points: Main files, route handlers, API endpoints, CLI entry points, event handlers.
- Detect existing tooling: Check for `.eslintrc`, `pyproject.toml`, `tsconfig.json`, `.prettierrc`, `Makefile`, CI configs, etc.
- Run available linters: If linters are configured, run them and capture output. If not, note this as a finding. Common commands:
  - JS/TS: `npx eslint . --format json 2>/dev/null` or `npx tsc --noEmit 2>&1`
  - Python: `python -m pylint **/*.py --output-format=json 2>/dev/null` or `python -m flake8 . 2>&1`
  - Go: `go vet ./... 2>&1`
  - Rust: `cargo clippy --message-format=json 2>/dev/null`
  - If no linter is configured, skip — don't install new tooling unless asked.
- Produce the codebase map: Write a structured summary to the top of `AUDIT.md`:
  - Stack summary (language, framework, runtime)
  - File count by category
  - Module/directory architecture
  - Entry points
  - High-coupling modules (most imports in + out)
  - Linter output summary (if available)
Output of Pass 0: The "Codebase Overview" section of AUDIT.md. No findings yet.
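As an illustration of the dependency-mapping step, import counts can be ranked to surface high-coupling modules. This is only a sketch: the fixture files and the JS-style `^import` pattern below are assumptions, not part of the skill, and the pattern should be adapted per language.

```shell
# Sketch: rank files by import count to spot high-coupling modules.
# The fixture files stand in for a real codebase; the "^import"
# pattern is a JS-style assumption.
set -eu
dir=$(mktemp -d)
printf 'import a from "./a";\nimport b from "./b";\n' > "$dir/main.js"
printf 'export const a = 1;\n' > "$dir/a.js"

# grep -c with multiple files prints file:count; sort by count, descending.
ranking=$(grep -c '^import' "$dir"/*.js | sort -t: -k2 -nr)
echo "$ranking"
```

The same counts, taken in both directions (imports out per file, references in per module), feed the "High-coupling modules" entry of the codebase map.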
Pass 1 — Patterns & Consistency
Scan for mechanical quality issues that don't require deep semantic understanding. Process file-by-file, prioritizing high-coupling modules first (from Pass 0's dependency map).
What to scan (load ~/.claude/skills/deslop/references/checklist.md categories 8-14):
- Naming convention violations and inconsistencies
- DRY violations — duplicated logic blocks, copy-pasted code
- Dead code — unused functions, unreachable branches, commented-out code, unused imports
- Magic numbers and hardcoded values
- Missing type annotations (for typed languages)
- Inconsistent patterns (e.g., callbacks in one file, promises in another)
- KISS/YAGNI violations — unnecessary abstractions, premature optimization
- Dependency issues — unused deps, circular imports
Chunking strategy for large files (>500 lines): Read in function/class-sized chunks. Never split in the middle of a function. When analyzing a chunk, re-read the file's import header for context.
For each finding: Write it to AUDIT.md using the format from ~/.claude/skills/deslop/references/audit-format.md.
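Much of Pass 1 is mechanical enough to approximate with grep. A minimal sketch for two of the categories above (commented-out code and magic numbers), using a hypothetical fixture file; real scans would be tuned per language:

```shell
# Sketch: flag commented-out code and magic numbers in a source file.
# The fixture and the JS-style comment syntax are assumptions.
set -eu
dir=$(mktemp -d)
cat > "$dir/app.js" <<'EOF'
// const old = legacyCall();
setTimeout(poll, 30000);
EOF

# A comment that ends like a statement is likely commented-out code:
commented=$(grep -nE '^[[:space:]]*//.*[;)][[:space:]]*$' "$dir/app.js" || true)
# Bare numeric literals of 3+ digits are candidate magic numbers:
magic=$(grep -nE '[^0-9a-zA-Z_][0-9]{3,}[^0-9]' "$dir/app.js" || true)
echo "$commented"
echo "$magic"
```

Both patterns produce candidates, not verdicts — each hit still needs a human-reviewable finding written in the audit format.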
Pass 2 — Logic & Resilience
Trace critical paths through the code. This pass requires cross-file context.
Start from entry points identified in Pass 0. For each critical path:
- Trace the happy path from input to output.
- Then ask: what happens when each step fails?
- Check every error handling point: is the error caught? Logged? Re-thrown? Swallowed?
- Check edge cases: null/undefined inputs, empty collections, boundary values, concurrent access.
- Check data validation at trust boundaries (user input, API responses, file reads, env vars).
What to scan (load ~/.claude/skills/deslop/references/checklist.md categories 2-4):
- Correctness bugs: null derefs, off-by-one, race conditions, resource leaks, incorrect API usage
- Error handling: empty catches, swallowed exceptions, generic catches, missing async error handling, silent failures
- Business logic fragility: happy-path-only logic, missing edge cases, hardcoded business rules
- Regex hacks: patterns that pass test cases but fail on real-world input
- Async issues: unhandled promise rejections, missing await, callback hell without error propagation
CRITICAL: Do NOT flag business logic as "wrong" unless it's clearly a bug. The goal is resilience, not redesign. If you're unsure whether something is intentional behavior or a bug, flag it as [NEEDS_CLARIFICATION] and describe the concern.
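The empty-catch check above can also be seeded mechanically before tracing paths by hand. A sketch assuming JS-style syntax and a hypothetical fixture file:

```shell
# Sketch: flag empty catch blocks (swallowed exceptions).
# JS-style syntax and the fixture file are assumptions.
set -eu
dir=$(mktemp -d)
cat > "$dir/svc.js" <<'EOF'
try { save(user); } catch (e) {}
try { load(id); } catch (e) { log.error(e); }
EOF

# Match a catch whose body is empty (or whitespace-only) on one line:
swallowed=$(grep -nE 'catch[[:space:]]*\([^)]*\)[[:space:]]*[{][[:space:]]*[}]' "$dir/svc.js" || true)
echo "$swallowed"
```

This only catches single-line cases; multi-line empty catches and catch-and-ignore patterns still require the manual path tracing this pass describes.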
Pass 3 — Security
What to scan (load ~/.claude/skills/deslop/references/checklist.md categories 1, 7):
- Injection: SQL injection, XSS, command injection, path traversal, SSRF
- Auth/Authz: missing checks, privilege escalation, broken session management
- Secrets: hardcoded API keys, tokens, passwords, connection strings in source
- Crypto: weak algorithms, insecure random, missing HTTPS enforcement
- Input validation: missing server-side validation, inconsistent validation across endpoints
- Data exposure: sensitive data in logs, verbose error messages to clients, PII leaks
- Dependencies: known CVE in dependencies (check lock files if present)
Use OWASP Top 10 as the mental framework. For each finding, note the CWE number if applicable.
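For the hardcoded-secrets category (typically CWE-798), a crude first-pass grep can seed findings. The fixture file and keyword list below are illustrative assumptions; this is nowhere near exhaustive:

```shell
# Sketch: crude hardcoded-secret scan (CWE-798 candidates).
# The fixture and the keyword list are assumptions; entropy-based
# scanners catch far more than this.
set -eu
dir=$(mktemp -d)
cat > "$dir/config.py" <<'EOF'
API_KEY = "sk_live_abc123def456"
timeout = 30
EOF

# Assignments of secret-like names to string literals:
hits=$(grep -rniE '(api_key|secret|password|token) *= *"' "$dir" || true)
echo "$hits"
```

Every hit should become a CRITICAL finding unless the value is demonstrably a placeholder or test fixture.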
Pass 4 — Synthesis & Prioritization
After all passes complete:
- Deduplicate: Some findings may appear in multiple passes. Merge them, keeping the most severe classification.
- Cross-reference: A DRY violation might also be a security concern (divergent validation logic). Link related findings.
- Classify severity for every finding:
  - CRITICAL: Direct security vulnerabilities, data loss risk, crashes in core paths
  - HIGH: Logic errors with production impact, missing auth checks, resource leaks
  - MEDIUM: Missing validation, performance issues, significant code smells
  - LOW: Minor inconsistencies, naming issues, missing docs
  - INFO: Style suggestions, optimization opportunities, nice-to-haves
- Assign fix tier (from `~/.claude/skills/deslop/references/fix-playbook.md`):
  - Tier 1 (safe): No behavioral change possible. Formatting, dead code removal, renaming, type annotations.
  - Tier 2 (medium): Additive changes. Input validation, error handling, logging. Needs characterization tests.
  - Tier 3 (high risk): Structural changes. Refactoring APIs, changing data access patterns. Needs human review.
- Write the executive summary at the top of `AUDIT.md`:
  - Total findings by severity
  - Top 5 most critical issues with one-line descriptions
  - Overall assessment: is this codebase production-ready? What's the blast radius of the issues found?
  - Estimated fix effort by tier
- Present `AUDIT.md` to the user. Ask if they want to proceed to Phase 2, adjust priorities, or exclude anything.
Phase 2: FIX
MANDATORY before starting: Read the fix playbook:
Read ~/.claude/skills/deslop/references/fix-playbook.md # Safety hierarchy, characterization tests, fix ordering
NEVER start Phase 2 without user confirmation on the audit. The user must review AUDIT.md first.
Fix Ordering
Process findings in this exact order:
- Tier 1 fixes, CRITICAL severity first → safest changes, highest impact
- Tier 1 fixes, remaining severities → clear out all safe mechanical fixes
- Tier 2 fixes, CRITICAL severity → write characterization tests first, then fix
- Tier 2 fixes, HIGH severity → same pattern
- Tier 2 fixes, MEDIUM and below → same pattern
- Tier 3 fixes → flag for human review, do NOT auto-fix unless user explicitly approves each one
Fix Protocol
For every fix:
- State the finding ID (e.g., `SEC-003`) so the user can cross-reference `AUDIT.md`.
- For Tier 2+: Write a characterization test FIRST that captures current behavior. Run it. It must pass.
- Apply the fix. Prefer mechanical transformations (rename, extract, inline) over generative rewrites. If you're rewriting more than ~20 lines, stop and reconsider — you might be changing business logic.
- Run the characterization test again. It must still pass (unless the fix intentionally changes error handling from "silent failure" to "explicit error", in which case update the test).
- Run existing tests if they exist. Nothing should break.
- Update `AUDIT.md`: Mark the finding as `[FIXED]` with a one-line description of what changed.
Fix Guardrails
NEVER do any of these during fixes:
- Change business logic, pricing, billing, or domain rules
- Modify public API contracts (URL paths, request/response shapes, function signatures used by external callers)
- Delete or rewrite tests (you can ADD tests, never remove them)
- Install new dependencies without asking the user
- Rewrite working code just because you'd "write it differently"
- Batch-apply more than one Tier 2+ fix at a time without running tests between
- Assume you know what the user intended — when in doubt, mark `[NEEDS_CLARIFICATION]` and move on
ALWAYS do these:
- Keep the application in a deployable state after every individual fix
- Preserve all existing tests in passing state
- Add comments explaining WHY a fix was made (not what — the diff shows that)
- Use the project's existing code style and conventions (detected in Pass 0)
- Prefer the project's existing patterns over "better" patterns from elsewhere
After All Fixes
- Run the full test suite (if one exists).
- Run linters (if configured).
- Update `AUDIT.md` with a final status section showing: fixed count, skipped count, needs-clarification count, needs-human-review count.
- Present the summary to the user.
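Assuming findings are marked inline with tags like `[FIXED]` (the finding IDs and the `[NEEDS_HUMAN_REVIEW]` tag below are illustrative, not mandated by the format spec), the final counts can be tallied straight from AUDIT.md:

```shell
# Sketch: tally finding statuses for the final summary section.
# The finding IDs and [NEEDS_HUMAN_REVIEW] tag are illustrative.
set -eu
dir=$(mktemp -d)
cat > "$dir/AUDIT.md" <<'EOF'
SEC-001 [FIXED] parameterized the user query
LOG-002 [FIXED] replaced empty catch with logged error
BIZ-003 [NEEDS_CLARIFICATION] discount rounding direction
ARCH-004 [NEEDS_HUMAN_REVIEW] repository-layer refactor
EOF

fixed=$(grep -c '\[FIXED\]' "$dir/AUDIT.md")
clarify=$(grep -c '\[NEEDS_CLARIFICATION\]' "$dir/AUDIT.md")
echo "fixed=$fixed needs-clarification=$clarify"
```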
Handling Large Codebases
For codebases with 50+ files or 10,000+ lines:
- Pass 0 is even more critical. Spend extra time building the codebase map. Identify the "core" (most-connected modules) vs "periphery" (leaf modules, utilities).
- Prioritize by connectivity. Audit core modules first — bugs there have the largest blast radius.
- Use scripts for mechanical scanning. Run `bash ~/.claude/skills/deslop/scripts/scan_dead_code.sh` and `bash ~/.claude/skills/deslop/scripts/scan_patterns.sh` to accelerate Pass 1 findings. Review their output rather than manually grepping every file.
- Use dependency mapping. Run `bash ~/.claude/skills/deslop/scripts/scan_dependencies.sh` during Pass 0 to build the dependency graph automatically.
- Chunk Pass 2 by critical path. Don't try to trace every possible path. Focus on: authentication flow, main CRUD operations, payment/billing (if any), data export/import, and any path that handles sensitive data.
- Fix periphery first. During Phase 2, fix leaf modules before core modules. Lower blast radius, and fixes in leaf modules sometimes resolve findings in core modules (e.g., adding validation in a utility function used by core).
Asking Questions
The skill should ask the user questions in these situations:
- Before starting: "I'll audit the codebase at `[path]`. Any areas to focus on or exclude?"
- During Pass 2: When you encounter something that might be intentional behavior or might be a bug, flag it as `[NEEDS_CLARIFICATION]` in the audit file and batch all clarification questions for the user between Pass 2 and Pass 3.
- Before Phase 2: Always. "Here's the audit. Want to proceed with fixes? Any findings to skip or reprioritize?"
- During Tier 3 fixes: Each one individually. "This fix would change [specific thing]. Approve?"
Never ask about Tier 1 fixes — just do them. They're safe by definition.
What This Skill Is NOT
- Not a test-writing skill (though it writes characterization tests as a safety mechanism)
- Not a feature-building skill (it never adds new functionality)
- Not a migration tool (it doesn't change frameworks, languages, or architectures)
- Not a style enforcer (it uses the project's existing style, not an external standard)
- Not a replacement for manual security audit (it catches common issues but is not exhaustive)
The single goal: make the existing codebase more robust, maintainable, and production-ready without changing what it does.
Source
https://github.com/thtskaran/claude-skills/blob/master/deslop/SKILL.md
Overview
Deslop systematically audits an existing codebase for production readiness, focusing on AI-generated or vibe-coded code that is fragile or insecure. It operates in two strict phases: Phase 1 audits and generates AUDIT.md; Phase 2 consumes AUDIT.md to apply fixes in a safety-tiered order, without altering business logic. It is language-agnostic and aims to make code hardened, clean, and robust.
How This Skill Works
In Phase 1, Deslop performs a comprehensive multi-pass audit guided by a formal checklist and outputs a structured AUDIT.md. In Phase 2, it consumes AUDIT.md to apply fixes in safety order, ensuring changes improve resilience without changing behavior. The process is repeatable and auditable across any language or framework.
When to Use It
- Before releasing an AI-generated or vibe-coded codebase to production, to validate stability and security.
- When you need to clean up dead code, magic numbers, or inconsistent patterns across a repo.
- If you want a formal, machine-parseable audit report (AUDIT.md) to guide fixes.
- When refactoring or hardening code but without changing business logic.
- For language-agnostic projects requiring a safety-first, staged improvement plan.
Quick Start
- Step 1: Provide the target codebase path or repo and any focus constraints.
- Step 2: Run Phase 1 AUDIT to produce AUDIT.md by following the checklist.
- Step 3: Run Phase 2 FIX to apply fixes in safety order and validate with tests.
Best Practices
- Run Pass 0 reconnaissance first to map the codebase.
- Follow the five sequential audit passes and never skip a pass.
- Reference the checklist and audit-format specs at ~/.claude/skills/deslop/references.
- Keep business logic untouched; focus fixes on structure, robustness, and security.
- Document findings clearly in AUDIT.md and plan fixes in Phase 2.
Example Use Cases
- AI-generated Python service with unstable imports and insecure eval usage.
- Node.js/TypeScript microservice with hardcoded URLs, magic numbers, and inconsistent linting.
- Go-based CLI where error handling and exit codes are inconsistent.
- ML data-pipeline scripts with fragile data transforms and no strong typing.
- Frontend app with mixed patterns and dead code that degrades in production.