Sensitive Data Leakage
npx machina-cli add skill allsmog/vuln-scout/sensitive-data-leakage --openclawGeneric Sensitive Data Leakage Detection
Core Principle
IF data MATCHES sensitive_pattern
AND data FLOWS TO output_sink
THEN potential_leak
This skill detects credentials, secrets, and sensitive data flowing to logging, error messages, HTTP responses, or any other output - regardless of which libraries the codebase uses.
Phase 1: Identify Sensitive Data (Sources)
1.1 Sensitive Naming Patterns
Search for variables, fields, parameters with these patterns:
Go:
grep -rniE "(secret|password|passwd|pwd|apikey|api_key|token|credential|private.?key|access.?key|auth.?token|bearer|encryption.?key|signing.?key|client.?secret|consumer.?secret|conn.?str|connection.?string)" --include="*.go" | grep -v "_test\.go"
Python:
grep -rniE "(secret|password|passwd|pwd|apikey|api_key|token|credential|private.?key|access.?key|auth.?token|bearer)" --include="*.py" | grep -v "test_"
Java:
grep -rniE "(secret|password|passwd|pwd|apiKey|api_key|token|credential|privateKey|accessKey|authToken|bearer)" --include="*.java" | grep -v "Test\.java"
JavaScript/TypeScript:
grep -rniE "(secret|password|passwd|pwd|apiKey|api_key|token|credential|privateKey|accessKey|authToken|bearer)" --include="*.js" --include="*.ts" | grep -v "\.test\." | grep -v "\.spec\."
1.2 Sensitive Function Returns
# Functions that return/fetch secrets (Go)
grep -rniE "func.*(Get|Read|Fetch|Load|Decrypt|Retrieve).*(Secret|Password|Key|Token|Cred)" --include="*.go"
# Assignments from credential functions
grep -rniE "(secret|password|key|token|cred).*:?=.*(Get|Read|Fetch|Load|Decrypt|Retrieve)" --include="*.go"
1.3 Sensitive Struct/Class Fields
# Go struct fields
grep -rniE "^\s+(Secret|Password|Key|Token|Credential|ApiKey|PrivateKey|AccessKey)\s+\S+" --include="*.go"
# Python class attributes
grep -rniE "self\.(secret|password|key|token|credential|api_key)" --include="*.py"
# Java fields
grep -rniE "(private|protected|public)\s+\S+\s+(secret|password|key|token|credential)" --include="*.java"
1.4 Environment Variables
# Go
grep -rniE "os\.Getenv\([\"'].*?(SECRET|PASSWORD|KEY|TOKEN|CREDENTIAL|API_KEY)" --include="*.go"
# Python
grep -rniE "os\.environ\.get\([\"'].*?(SECRET|PASSWORD|KEY|TOKEN|CREDENTIAL|API_KEY)" --include="*.py"
# Node.js
grep -rniE "process\.env\.(SECRET|PASSWORD|KEY|TOKEN|CREDENTIAL|API_KEY)" --include="*.js" --include="*.ts"
Phase 2: Identify Output Sinks
2.1 Discover Logging Library Used
# Go - find log imports
grep -rniE "^import|^\t\"" --include="*.go" | grep -iE "log|zap|logrus|zerolog|klog|glog|slog" | head -10
# Find actual log function calls
grep -rhoE "\w+\.(Error|Info|Debug|Warn|Fatal|Print|Log|Msg)(f|ln|w|Context)?\s*\(" --include="*.go" | sort | uniq -c | sort -rn | head -20
2.2 All Logging Calls (Generic)
# Matches ANY logging library
grep -rniE "\.(log|print|error|warn|info|debug|fatal|trace|notice|output|write|emit|send|record)(f|ln|w)?\s*\(" --include="*.go" --include="*.py" --include="*.java" --include="*.js"
2.3 Error Creation/Wrapping
# Go
grep -rniE "(fmt\.Errorf|errors\.New|errors\.Wrap|errors\.Wrapf|fmt\.Sprintf.*[Ee]rr)" --include="*.go"
# Python
grep -rniE "(raise\s+\w+Exception|raise\s+\w+Error)" --include="*.py"
# Java
grep -rniE "throw\s+new\s+\w+Exception" --include="*.java"
2.4 HTTP Responses
# Go
grep -rniE "(\.Write\(|\.WriteString\(|json\.Encode|\.JSON\(|c\.String\(|w\.Write)" --include="*.go"
# Python (Flask/Django)
grep -rniE "(jsonify|JsonResponse|Response\(|return.*json)" --include="*.py"
# Node.js
grep -rniE "(res\.send|res\.json|res\.write|response\.send)" --include="*.js" --include="*.ts"
Phase 3: Find Dangerous Intersections
3.1 Sensitive Variable in Log Call
# Direct pattern - sensitive var name in log arguments
grep -rniE "(log|print|error|warn|info|debug|fatal)\w*\(.*\b(secret|password|key|token|cred|apikey)\w*\b" --include="*.go" | grep -v "_test\.go"
3.2 Format String Struct Dumps (%v, %+v, %#v)
# These format verbs dump ALL struct fields including secrets
grep -rniE "%[+#]?v" --include="*.go" | grep -v "_test\.go"
# More specific - %v with config/options types
grep -rniE "(Error|Info|Debug|Warn|Print|Log)(f|w)?\(.*%[+#]?v.*(config|option|session|setting|client|request)" --include="*.go"
3.3 Sensitive Data Passed to Format Functions
# Sensitive variable as argument to printf-style function
grep -rniE "(printf|errorf|sprintf|infof|debugf|warnf|fatalf)\([^)]+,\s*\w*(secret|password|key|token|cred)" --include="*.go"
3.4 Error Returns Containing Secrets
# Functions returning errors with sensitive data
grep -rniE "return.*(fmt\.Errorf|errors\.).*%(v|s|w).*\w*(secret|password|key|token|config|opt)" --include="*.go"
Phase 4: Contextual Analysis (Critical for Avoiding False Positives)
For each finding, verify:
| Check | Question | How to Verify |
|---|---|---|
| Is it actually sensitive? | Not a map key, keyboard key, or generic "key" | Check variable usage context |
| Does it reach output? | Trace variable through code to log/response | Follow data flow |
| Has safe String() method? | Struct implements fmt.Stringer that redacts secrets? | grep -A10 "func (.*TypeName) String()" |
| Format verb? | Using %+v/%#v? (these bypass String() methods) | Check format string - %s and %v use String() |
| Is it SDK error? | SDK errors rarely contain config structs | Don't flag SDK error logging by default |
| Log level? | Debug logs may be disabled in prod | Lower severity for debug-only |
Critical: Always Check for String() Method
Before flagging any struct being logged:
# For a struct named "Server" or "Config":
grep -rn "func (.*Server) String()" --include="*.go"
grep -rn "func (.*Config) String()" --include="*.go"
If a safe String() method exists that omits credentials → NOT a vulnerability (unless %+v or %#v is used)
Phase 5: Common Vulnerable Patterns
Pattern 1: Direct Struct Dump with Credentials
// VULNERABLE: struct with credentials logged directly
type Config struct {
Region string
SecretKey string // Sensitive!
}
log.Warnf("Config issue: %+v", config) // Dumps ALL fields including SecretKey
log.Warnf("Server error: %s", server) // ONLY vulnerable if Server lacks safe String() method
IMPORTANT: Before flagging struct logging, check if the struct has a custom String() method:
# Check for safe String() implementation
grep -A5 "func (.*TypeName) String()" --include="*.go"
If the struct has a String() method that omits sensitive fields, logging with %s or %v is SAFE.
Pattern 2: Config Struct Dump
// VULNERABLE: config.SecretKey exposed
log.Debugf("Using config: %+v", config)
Pattern 3: Request Logging
// VULNERABLE: Authorization header exposed
log.Infof("Request: %+v", req)
log.Infof("Headers: %v", req.Header)
Pattern 4: Error Chain Propagation
// VULNERABLE: secret propagates up call stack
err := connectWithSecret(secretKey)
return fmt.Errorf("connection failed: %w", err) // wraps error containing secret
Pattern 5: Response Body Logging
// VULNERABLE: response may contain tokens
body, _ := ioutil.ReadAll(resp.Body)
log.Debugf("Response: %s", body) // May contain access_token, refresh_token
Quick Scan Commands
All-in-One Scan (Go)
#!/bin/bash
echo "=== Sensitive Data Leakage Scan ==="
echo -e "\n[1] Sensitive identifiers in log calls:"
grep -rniE "(log|print|error|warn|info|debug|fatal)\w*\([^)]*\b(secret|password|key|token|cred|apikey)\w*" --include="*.go" | grep -v "_test\.go" | head -20
echo -e "\n[2] Struct dumps with %v/%+v:"
grep -rniE "(Error|Info|Debug|Warn|Print)(f)?\([^)]*%[+#]?v" --include="*.go" | grep -v "_test\.go" | head -20
echo -e "\n[3] Sensitive data in error creation:"
grep -rniE "(Errorf|Wrapf?|New)\([^)]*\b(secret|password|key|token|cred)" --include="*.go" | grep -v "_test\.go" | head -20
echo -e "\n[4] Config/Options types being logged:"
grep -rniE "(log|print)\w*\([^)]*(config|option|session|setting|credential)" --include="*.go" | grep -v "_test\.go" | head -20
echo -e "\n=== Scan Complete ==="
Format String Audit
# Find all %v/%+v usage for manual review
grep -rn "%+v\|%#v" --include="*.go" | grep -v "_test\.go" | while read line; do
file=$(echo "$line" | cut -d: -f1)
linenum=$(echo "$line" | cut -d: -f2)
echo "[$file:$linenum] $(echo "$line" | cut -d: -f3-)"
done
Remediation
Fix 1: Log Only Error Message
// Before (vulnerable)
log.Errorf("Failed: %v", err)
// After (safe)
log.Errorf("Failed: %s", err.Error())
Fix 2: Implement fmt.Stringer Interface
func (c *Config) String() string {
return fmt.Sprintf("Config{Region: %s, Bucket: %s}",
c.Region, c.Bucket)
// Omit SecretKey, Password, etc.
}
Fix 3: Use Structured Logging with Explicit Fields
// Only log non-sensitive fields
logger.Error("connection failed",
zap.String("region", config.Region),
zap.String("endpoint", config.Endpoint),
// Don't include: zap.String("secret", config.SecretKey)
)
Fix 4: Redact Before Logging
func redact(s string) string {
if len(s) <= 4 {
return "****"
}
return s[:2] + "****" + s[len(s)-2:]
}
log.Infof("Using key: %s", redact(apiKey))
Fix 5: Use Log Sanitization Middleware
// Wrap logger to auto-redact patterns
type RedactingLogger struct {
inner Logger
patterns []*regexp.Regexp
}
func (l *RedactingLogger) Errorf(format string, args ...interface{}) {
msg := fmt.Sprintf(format, args...)
msg = l.redactPatterns(msg)
l.inner.Errorf("%s", msg)
}
Common False Positives to Avoid
Before reporting a finding, verify it's not one of these common false positives:
FP 1: SDK Error Logging
// USUALLY NOT VULNERABLE
session, err := session.NewSessionWithOptions(opts)
if err != nil {
log.Errorf("Failed to create session: %v", err) // err is just an error message
}
Why it's usually safe: SDK errors typically contain descriptive error messages, NOT the config struct with credentials. The err object from most SDKs (AWS, GCP, Azure) does not embed or reference the options/config passed to the function.
When it IS vulnerable: Only if the SDK explicitly includes config in error (rare), or if wrapping an error that contains sensitive data.
FP 2: Struct with Safe String() Method
// NOT VULNERABLE if Server has safe String() method
log.Warnf("Server issue: %s", server)
// Check: Does Server implement String()?
func (s *Server) String() string {
return s.ID + "=>" + s.Address // Safe - omits credentials
}
Verification: Always grep for func (.*StructName) String() before flagging.
FP 3: Generic "key" or "token" Variable Names
// NOT VULNERABLE - these are map keys, not secrets
for key, value := range items {
log.Debugf("Processing key: %s", key)
}
// NOT VULNERABLE - JWT token parsing (token is being validated, not a secret)
token, err := jwt.Parse(tokenString, keyFunc)
Verification: Check context - is "key" a cryptographic key or a map/dictionary key?
FP 4: Error Contains Path, Not Credentials
// USUALLY NOT VULNERABLE
credsJSON, err := ioutil.ReadFile(storageCredsPath)
if err != nil {
log.Errorf("Unable to read credentials: %v", err) // err contains file path, not credentials
}
Why it's usually safe: File read errors contain the path and OS error, not file contents.
When it IS vulnerable: If the path itself is sensitive (contains account IDs, etc.)
Verification Checklist
Before reporting any credential logging finding, verify:
| Step | Check | If No → |
|---|---|---|
| 1 | Is the logged variable actually a struct with credentials? | Not a vulnerability |
| 2 | If struct: Does it have a custom String() method? | If safe String() exists → Not vulnerable |
| 3 | If error: Does the SDK/library actually embed config in errors? | Usually no → Likely FP |
| 4 | Is %+v or %#v used? (These bypass String() methods) | If just %v or %s → Check String() method |
| 5 | Is the sensitive field directly in the format string args? | If not direct → trace data flow |
Integration with Other Skills
- Use dangerous-functions skill for traditional injection sinks
- Use data-flow-tracing skill for complex flow analysis
- Use vuln-patterns skill for exploitation context
Source
git clone https://github.com/allsmog/vuln-scout/blob/main/whitebox-pentest/skills/sensitive-data-leakage/SKILL.mdView on GitHub Overview
Sensitive Data Leakage detects credentials, secrets, and sensitive data flowing to output sinks such as logging, error messages, and HTTP responses. It uses source-patterns (names, returns, fields, env vars) and sink detection to flag potential leaks across Go, Python, Java, and JS/TS projects. This helps prevent CWE-532 and secret exposure by surfacing risky data flows.
How This Skill Works
Phase 1 identifies sensitive data sources using naming patterns, function returns, struct/class fields, and environment variables. Phase 2 traces data to output sinks such as logs and HTTP responses using language-agnostic and language-specific probes. If a sensitive value matches a source pattern and reaches a sink, the leak is flagged as potential leakage.
When to Use It
- Auditing code for password or secret exposure in logs or error messages
- Investigating API responses that may leak tokens or credentials
- Scanning server logs for accidental logging of secrets or private keys
- Reviewing environment variable usage to ensure secrets are not emitted in outputs
- Assessing error handling paths that could reveal credentials or API keys
Quick Start
- Step 1: Run pattern searches for sensitive identifiers (secret, password, api_key, token) across Go, Python, Java, and JS/TS
- Step 2: Identify sensitive function returns and struct/class fields that may fetch or hold secrets, plus environment variables
- Step 3: Scan for output sinks (logs, HTTP responses) and triage any matches to verify actual leaks
Best Practices
- Use language-specific sensitive naming patterns (Go, Python, Java, JS/TS) to identify sources
- Combine source detection (1.1–1.4) with sink analysis (2.1–2.4) to map data flow
- Include all output sinks (logs, responses, errors) in scans across the codebase
- Exclude test files and mocks to reduce false positives
- Triage findings by tracing actual data flow and confirming the presence of leaked data
Example Use Cases
- A Go service logs a password variable in an error message, enabling credential leakage
- A Python Flask API returns a JSON response that includes a token value in the body
- A Java application writes an API key to logs via an Info/Debug log statement
- A Node.js service prints a JWT from an environment variable to the console
- A Go server exposes a private key through an HTTP error response