behavioral-mutation-analyzer
Install: npx machina-cli add skill ArabelaTso/Skills-4-SE/behavioral-mutation-analyzer --openclaw
Behavioral Mutation Analyzer
Overview
This skill systematically analyzes surviving mutants from mutation testing to understand test suite weaknesses and automatically generate improvements. It identifies why mutants survived, categorizes root causes, and produces actionable test enhancements to increase mutation detection rates.
Analysis Workflow
Step 1: Input Collection and Validation
Gather required inputs and verify completeness:
Required Inputs:
- Repository source code (path or files)
- Test suite (test files and framework)
- Mutation testing results (report file or data)
Mutation Result Formats:
- PIT (Java): XML or HTML reports
- Stryker (JavaScript/TypeScript): JSON reports
- mutmut (Python): result files
- Others: Infection (PHP), Cosmic Ray (Python), etc.
Validation checklist:
- Source code accessible
- Test suite runnable
- Mutation results parseable
- Mutation tool and version identified
Step 2: Surviving Mutant Extraction
Parse mutation results to identify all surviving mutants:
Extract for each mutant:
- Mutant ID
- Source file and line number
- Mutation operator (e.g., boundary change, negation)
- Original code
- Mutated code
- Status (survived/killed/timeout/error)
Focus on survived mutants: Filter out killed mutants and focus analysis on survivors that indicate test weaknesses.
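As a minimal sketch of this extraction step, the following parses a Stryker-style mutation.json (assuming the mutation-testing-elements schema: a "files" map whose entries hold "mutants" arrays); field names for other tools will differ:

```python
import json

def extract_survivors(report_text):
    """Parse a Stryker-style JSON report and return surviving mutants.

    Assumes the schema {"files": {path: {"mutants": [{"id", "mutatorName",
    "location", "status", "replacement"}, ...]}}} -- verify against your
    tool's report format before relying on it.
    """
    report = json.loads(report_text)
    survivors = []
    for path, data in report.get("files", {}).items():
        for m in data.get("mutants", []):
            if m.get("status") == "Survived":
                survivors.append({
                    "id": m.get("id"),
                    "file": path,
                    "line": m.get("location", {}).get("start", {}).get("line"),
                    "operator": m.get("mutatorName"),
                    "mutated_code": m.get("replacement"),
                })
    return survivors
```

Each returned record carries exactly the fields listed above, so later steps (classification, test generation) can work from a uniform structure regardless of the source tool.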
Step 3: Root Cause Classification
Analyze each surviving mutant to determine why it survived:
Category 1: Insufficient Coverage
Indicators:
- Mutated line not executed by any test
- Mutated method/function never called
- Conditional branch not taken
Analysis:
- Check code coverage data
- Identify uncovered code paths
- Trace execution from test entry points
Example:
// Original
public int calculate(int x) {
    if (x > 0) {
        return x * 2; // Line 3: covered
    }
    return 0;         // Line 5: NOT covered
}
// Mutant: line 5 changed to "return 1;"
// Survives because no test calls calculate() with x <= 0
Category 2: Equivalent Mutants
Indicators:
- Mutation produces semantically identical behavior
- Mathematical or logical equivalence
- Dead code or unreachable state
Analysis:
- Compare control flow graphs
- Check for mathematical identities
- Identify redundant operations
Example:
# Original
result = x * 1
# Mutant: changed to "result = x"
# Equivalent: multiplying by 1 has no effect
Category 3: Weak Assertions
Indicators:
- Test executes mutated code but doesn't verify output
- Assertions too broad or generic
- Only checking for exceptions, not correctness
Analysis:
- Review test assertions
- Check what properties are verified
- Identify missing postconditions
Example:
// Test
test('calculate returns a number', () => {
  const result = calculate(5);
  expect(typeof result).toBe('number'); // Weak: doesn't check the value
});
// Mutant: "return x * 2" → "return x * 3"
// Survives because the test only checks the type, not the value
Category 4: Missed Edge Cases
Indicators:
- Mutation affects boundary conditions
- Special values not tested (null, zero, empty, max/min)
- Error handling paths not verified
Analysis:
- Identify boundary values in mutated code
- Check test inputs for edge case coverage
- Review exception handling tests
Example:
// Original
public int divide(int a, int b) {
    return a / b;
}
// Mutant: adds "if (b == 0) return 0;" before the division
// Survives because no test checks division by zero
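A killing test for this gap can be sketched in Python against a hypothetical port of divide() (the example above is Java; the function and test names here are illustrative):

```python
# Hypothetical Python port of the divide() example; the mutant silently
# returns 0 on division by zero instead of raising.
def divide(a, b):
    return a // b

def test_divide_by_zero_raises():
    # Kills the "if b == 0: return 0" mutant: the original code raises,
    # while the mutant would return 0 and fail this expectation.
    try:
        divide(1, 0)
    except ZeroDivisionError:
        pass
    else:
        raise AssertionError("divide(1, 0) should raise ZeroDivisionError")
```

The key point is that the test pins down the behavior at the edge case itself; a test using only nonzero divisors executes the division but can never distinguish the mutant.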
Category 5: Timing and Concurrency Issues
Indicators:
- Mutant affects timing, delays, or synchronization
- Race conditions or thread safety
- Asynchronous behavior changes
Analysis:
- Check for concurrent code
- Identify timing-dependent logic
- Review async/await patterns
Category 6: State-Dependent Behavior
Indicators:
- Mutant affects state transitions
- Order-dependent operations
- Side effects not verified
Analysis:
- Trace state changes
- Check for stateful objects
- Verify side effect assertions
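To illustrate Category 6, consider a hypothetical stateful class where a statement-deletion mutant removes a side effect. A test that checks only the return value survives the mutant; asserting the side effect kills it:

```python
# Hypothetical stateful class: a mutant that deletes the
# self.history.append(...) call survives unless a test asserts
# the side effect, not just the return value.
class Counter:
    def __init__(self):
        self.value = 0
        self.history = []

    def increment(self):
        self.value += 1
        self.history.append(self.value)  # side effect a mutant could delete
        return self.value

def test_increment_records_history():
    c = Counter()
    c.increment()
    c.increment()
    assert c.value == 2         # weak alone: survives the deletion mutant
    assert c.history == [1, 2]  # kills the statement-deletion mutant
```

This is the general pattern for state-dependent survivors: trace which state the mutated statement touches, then assert on that state directly.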
Step 4: Test Generation Strategy
For each surviving mutant, determine the appropriate test enhancement:
Strategy 1: Add Missing Test Cases
- When: Insufficient coverage
- Action: Generate new test that executes mutated code
- Focus: Cover the uncovered path
Strategy 2: Strengthen Assertions
- When: Weak assertions
- Action: Add specific value checks
- Focus: Verify exact expected behavior
Strategy 3: Add Edge Case Tests
- When: Missed edge cases
- Action: Generate boundary value tests
- Focus: Test special inputs (null, zero, empty, max, min)
Strategy 4: Mark as Equivalent
- When: Equivalent mutant
- Action: Document equivalence reasoning
- Focus: No test needed, update mutation config to ignore
Strategy 5: Add Integration Tests
- When: State or timing issues
- Action: Create tests verifying end-to-end behavior
- Focus: Observable effects and state transitions
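The five strategies above form a simple dispatch on the Step 3 root-cause categories. A minimal sketch (category and strategy identifiers here are illustrative, not a fixed API):

```python
# Sketch: mapping root-cause categories (Step 3) to enhancement
# strategies (Step 4). Names are illustrative identifiers.
STRATEGY_BY_CAUSE = {
    "insufficient_coverage": "add_missing_test_cases",
    "weak_assertions": "strengthen_assertions",
    "missed_edge_cases": "add_edge_case_tests",
    "equivalent_mutant": "mark_as_equivalent",
    "timing_concurrency": "add_integration_tests",
    "state_dependent": "add_integration_tests",
}

def pick_strategy(root_cause):
    """Return the enhancement strategy for a classified mutant,
    falling back to manual review for unrecognized causes."""
    return STRATEGY_BY_CAUSE.get(root_cause, "manual_review")
```

Keeping the mapping explicit makes it easy to audit and to extend when new root-cause categories are added.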
Step 5: Automated Test Generation
Generate concrete test code to kill surviving mutants:
Test Generation Process:
- Identify test framework (JUnit, pytest, Jest, etc.)
- Analyze existing test patterns and style
- Generate test following project conventions
- Include descriptive test names
- Add comments explaining what mutant is targeted
Example Generated Test:
def test_calculate_with_negative_input():
    """
    Test to kill mutant #42: calculate() with x <= 0.
    The mutant changed 'return 0' to 'return 1' on line 5.
    """
    result = calculate(-5)
    assert result == 0, "calculate() should return 0 for negative input"
    result = calculate(0)
    assert result == 0, "calculate() should return 0 for zero input"
Step 6: Report Generation
Create a comprehensive analysis report using the template in assets/mutation_analysis_report.md:
Report Sections:
- Executive summary (mutation score, survival rate)
- Surviving mutants by category
- Root cause analysis for each mutant
- Generated test enhancements
- Equivalent mutant documentation
- Recommendations for test suite improvement
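The executive-summary figures can be computed directly from the mutant statuses. A minimal sketch, treating documented equivalent mutants as excluded from the denominator (a common convention, but confirm how your mutation tool scores them):

```python
def summarize(mutants, equivalent_ids=()):
    """Compute mutation score and survival rate for the report header.

    mutants: list of dicts with "id" and "status" ("killed"/"survived"/...).
    equivalent_ids: mutants documented as equivalent, excluded from scoring.
    """
    excluded = set(equivalent_ids)
    scored = [m for m in mutants if m["id"] not in excluded]
    killed = sum(1 for m in scored if m["status"] == "killed")
    survived = sum(1 for m in scored if m["status"] == "survived")
    total = len(scored)
    return {
        "mutation_score": killed / total if total else 0.0,
        "survival_rate": survived / total if total else 0.0,
        "total_scored": total,
    }
```

Reporting the score both with and without documented equivalents makes it clear how much of the gap is genuinely addressable by new tests.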
Mutation Operators Reference
Common mutation operators and their implications:
Arithmetic Operators:
- + ↔ -, * ↔ /, % ↔ *
- Tests should verify exact numeric results
Relational Operators:
- > ↔ >=, < ↔ <=, == ↔ !=
- Tests should cover boundary conditions
Logical Operators:
- && ↔ ||, ! insertion/removal
- Tests should verify boolean logic
Conditional Boundaries:
- < ↔ <=, > ↔ >=
- Tests should include boundary values
Return Values:
- Return value changes, void method calls removed
- Tests should assert return values
Statement Deletion:
- Remove method calls, assignments
- Tests should verify side effects
For detailed mutation operator catalog, see references/mutation_operators.md.
Tool Integration
PIT (Java)
Parse PIT XML reports:
# Run PIT
mvn org.pitest:pitest-maven:mutationCoverage
# Report location
target/pit-reports/YYYYMMDDHHMM/mutations.xml
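A minimal sketch of parsing PIT's mutations.xml with the standard library (assuming PIT's usual <mutation status="..."> elements with <sourceFile>, <lineNumber>, and <mutator> children; verify against your PIT version's report):

```python
import xml.etree.ElementTree as ET

def pit_survivors(xml_text):
    """Parse PIT's mutations.xml and return surviving mutants."""
    root = ET.fromstring(xml_text)
    survivors = []
    for m in root.iter("mutation"):
        if m.get("status") == "SURVIVED":
            survivors.append({
                "file": m.findtext("sourceFile"),
                "line": int(m.findtext("lineNumber")),
                "operator": m.findtext("mutator"),
            })
    return survivors
```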
Stryker (JavaScript/TypeScript)
Parse Stryker JSON reports:
# Run Stryker
npx stryker run
# Report location
reports/mutation/mutation.json
mutmut (Python)
Parse mutmut results:
# Run mutmut
mutmut run
# Show results
mutmut results
mutmut show [mutant-id]
For tool-specific parsing guidance, see references/tool_integration.md.
Practical Examples
Example 1: Insufficient Coverage
Surviving mutant:
// Line 15: return defaultValue; → return null;
Analysis: No test calls this method with conditions triggering line 15.
Generated test:
@Test
public void testGetValueWithMissingKey() {
    // Kills mutant on line 15
    String result = config.getValue("nonexistent");
    assertEquals("default", result);
}
Example 2: Weak Assertion
Surviving mutant:
# Line 8: return items[:5] → return items[:4]
Analysis: Test only checks len(result) > 0, not exact length.
Enhanced test:
def test_get_top_items_returns_five():
    # Kills mutant on line 8
    items = create_test_items(10)
    result = get_top_items(items)
    assert len(result) == 5, "Should return exactly 5 items"
Example 3: Equivalent Mutant
Surviving mutant:
// Original: if (x > 0 && x < 100)
// Mutant: if (0 < x && 100 > x)
Analysis: Logically equivalent, no behavioral difference.
Action: Mark as equivalent in mutation config, no test needed.
Best Practices
Prioritize mutants:
- High-impact code (critical business logic)
- Frequently executed paths
- Security-sensitive operations
- Public API methods
Test quality over quantity:
- Focus on meaningful assertions
- Avoid brittle tests
- Test behavior, not implementation
Iterative improvement:
- Start with easiest mutants to kill
- Gradually tackle complex cases
- Re-run mutation testing after improvements
Document equivalent mutants:
- Maintain list of known equivalent mutants
- Configure mutation tool to skip them
- Explain equivalence reasoning
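One lightweight way to maintain such a list is a version-controlled JSON file of equivalents with reasoning, filtered out before each analysis run. A sketch (the file format here is hypothetical, not a standard mutation-tool config):

```python
import json

def load_equivalents(json_text):
    """Parse a project-maintained equivalents file (hypothetical format):
    a JSON list of {"id": ..., "reason": ...} entries."""
    return {e["id"]: e["reason"] for e in json.loads(json_text)}

def filter_known_equivalents(survivors, equivalents):
    """Drop documented-equivalent mutants before root-cause analysis."""
    return [m for m in survivors if m["id"] not in equivalents]
```

Storing the reason alongside each ID preserves the equivalence argument for reviewers and keeps later mutation runs focused on genuinely killable survivors.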
References
For detailed information on specific topics:
- Mutation operators: references/mutation_operators.md
- Tool integration: references/tool_integration.md
- Test patterns: references/test_patterns.md
Source
https://github.com/ArabelaTso/Skills-4-SE/blob/main/skills/behavioral-mutation-analyzer/SKILL.md
Overview
Behavioral Mutation Analyzer systematically analyzes surviving mutants from mutation testing to understand test suite weaknesses and automatically generate improvements. It identifies why mutants survived (coverage gaps, equivalent mutants, weak assertions, missed edge cases) and produces actionable test enhancements to increase mutation detection rates. It supports multiple mutation tooling formats (PIT, Stryker, mutmut, etc.) and can auto-generate new test cases.
How This Skill Works
Collects and validates repo code, tests, and mutation results in common formats (PIT, Stryker, mutmut). Parses results to enumerate surviving mutants with IDs, locations, operators, and code snippets. Classifies each survivor into root causes (insufficient coverage, equivalent mutants, weak assertions, missed edge cases) and outputs actionable test improvements and new test cases.
When to Use It
- When you have mutation testing results and need to understand why some mutants survived
- When aiming to improve test suite effectiveness and kill more mutants
- When mutation scores are unexpectedly low and you need root-cause analysis
- When you want to generate new tests specifically to kill surviving mutants
- When you want to raise overall test quality based on mutation analysis
Quick Start
- Step 1: Provide repository path, test suite, and a mutation testing report in supported formats
- Step 2: Run the Behavioral Mutation Analyzer to extract surviving mutants and root causes
- Step 3: Review the generated test improvements or auto-generated tests and re-run mutation testing
Best Practices
- Provide complete inputs: repository, test suite, and a parseable mutation report
- Match mutation tool versions and report formats to your project
- Prioritize fixes by impact and likelihood of killing mutants, not just raw mutant counts
- Cross-check with existing code coverage to identify true gaps
- Validate new tests by re-running mutation testing and updating as needed
Example Use Cases
- A Java project where a surviving mutant in a boundary branch is revealed; addition of a test for x <= 0 kills the mutant
- A Python function where a mutant is proven equivalent; the equivalence is documented and the mutant excluded from future runs
- A JS project with weak assertions; after analysis, tests assert exact values rather than types
- An edge-case miss (null, empty, or zero) discovered by survival; new tests cover the edge case and fix the gap
- Post-analysis, new tests are auto-generated and mutation score improves on re-run