security-ownership-map
npx machina-cli add skill tech-leads-club/agent-skills/security-ownership-map --openclaw
Security Ownership Map
Overview
Build a bipartite graph of people and files from git history, then compute ownership risk and export graph artifacts for Neo4j/Gephi. Also build a file co-change graph (Jaccard similarity on shared commits) to cluster files by how they move together while ignoring large, noisy commits.
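The co-change weighting described above can be sketched in a few lines. This is a minimal illustration, not the code in scripts/run_ownership_map.py: the function name cochange_jaccard, the max_files parameter, and the choice to drop oversized commits entirely (rather than only skipping their pairs) are assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations


def cochange_jaccard(commits, max_files=50):
    """Return {(file_a, file_b): jaccard} for files that share commits.

    commits: mapping of commit id -> iterable of file paths touched.
    max_files: hypothetical cutoff; larger commits are skipped entirely.
    """
    file_commits = defaultdict(set)  # file -> commits that touched it
    pair_counts = defaultdict(int)   # (file_a, file_b) -> shared commit count

    for commit_id, files in commits.items():
        files = sorted(set(files))
        if len(files) > max_files:
            continue  # assumption: a noisy "supernode" commit contributes nothing
        for path in files:
            file_commits[path].add(commit_id)
        for a, b in combinations(files, 2):
            pair_counts[(a, b)] += 1

    edges = {}
    for (a, b), shared in pair_counts.items():
        # Jaccard = |A intersect B| / |A union B| over the commit sets of a and b
        union = len(file_commits[a]) + len(file_commits[b]) - shared
        edges[(a, b)] = shared / union
    return edges
```

Pairs are keyed in sorted order, so each undirected edge appears once. Lowering max_files trades recall for cleaner clusters, which is the same trade-off the --cochange-max-files option exposes.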
Requirements
- Python 3
- networkx (required; community detection is enabled by default)
Install with:
pip install networkx
Workflow
- Scope the repo and time window (optional --since/--until).
- Decide sensitivity rules (use defaults or provide a CSV config).
- Build the ownership map with scripts/run_ownership_map.py (co-change graph is on by default; use --cochange-max-files to ignore supernode commits).
- Communities are computed by default; graphml output is optional (--graphml).
- Query the outputs with scripts/query_ownership.py for bounded JSON slices.
- Persist and visualize (see references/neo4j-import.md).
By default, the co-change graph ignores common “glue” files (lockfiles, .github/*, editor config) so clusters reflect actual code movement instead of shared infra edits. Override with --cochange-exclude or --no-default-cochange-excludes. Dependabot commits are excluded by default; override with --no-default-author-excludes or add patterns via --author-exclude-regex.
If you want to exclude Linux build glue like Kbuild from co-change clustering, pass:
python skills/skills/security-ownership-map/scripts/run_ownership_map.py \
--repo /path/to/linux \
--out ownership-map-out \
--cochange-exclude "**/Kbuild"
Quick start
Run from the repo root:
python skills/skills/security-ownership-map/scripts/run_ownership_map.py \
--repo . \
--out ownership-map-out \
--since "12 months ago" \
--emit-commits
Defaults: commits are attributed by author identity, dated by author date, and merge commits are excluded. Use --identity committer, --date-field committer, or --include-merges to change this.
Example (override co-change excludes):
python skills/skills/security-ownership-map/scripts/run_ownership_map.py \
--repo . \
--out ownership-map-out \
--cochange-exclude "**/Cargo.lock" \
--cochange-exclude "**/.github/**" \
--no-default-cochange-excludes
Communities are computed by default. To disable:
python skills/skills/security-ownership-map/scripts/run_ownership_map.py \
--repo . \
--out ownership-map-out \
--no-communities
Sensitivity rules
By default, the script flags common auth/crypto/secret paths. Override by providing a CSV file:
# pattern,tag,weight
**/auth/**,auth,1.0
**/crypto/**,crypto,1.0
**/*.pem,secrets,1.0
Use it with --sensitive-config path/to/sensitive.csv.
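To see how a rules file like the one above could be applied, here is a small sketch that parses the CSV and tags a path. It is illustrative only: load_rules and tags_for are hypothetical helpers, and Python's fnmatch stands in for whatever glob matching the script actually uses.

```python
import csv
import fnmatch
import io


def load_rules(text):
    """Parse 'pattern,tag,weight' rows, skipping blank lines and # comments."""
    rules = []
    for row in csv.reader(io.StringIO(text)):
        if not row or row[0].lstrip().startswith("#"):
            continue
        pattern, tag, weight = (cell.strip() for cell in row[:3])
        rules.append((pattern, tag, float(weight)))
    return rules


def tags_for(path, rules):
    """Return {tag: weight} for every rule whose pattern matches the path."""
    # Assumption: fnmatch semantics, where '*' also matches '/' characters.
    return {
        tag: weight
        for pattern, tag, weight in rules
        if fnmatch.fnmatchcase(path, pattern)
    }


SAMPLE = """# pattern,tag,weight
**/auth/**,auth,1.0
**/crypto/**,crypto,1.0
**/*.pem,secrets,1.0
"""

rules = load_rules(SAMPLE)
auth_tags = tags_for("src/auth/login.py", rules)
pem_tags = tags_for("keys/server.pem", rules)
```

Under fnmatch, a pattern such as **/auth/** needs a leading directory, so a top-level auth/login.py would not match in this sketch. Check how the real script treats that case before relying on the default patterns for root-level directories.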
Output artifacts
ownership-map-out/ contains:
- people.csv (nodes: people)
- files.csv (nodes: files)
- edges.csv (edges: touches)
- cochange_edges.csv (file-to-file co-change edges with Jaccard weight; omitted with --no-cochange)
- summary.json (security ownership findings)
- commits.jsonl (optional, if --emit-commits)
- communities.json (computed by default from co-change edges when available; includes maintainers per community; disable with --no-communities)
- cochange.graph.json (NetworkX node-link JSON with community_id + community_maintainers; falls back to ownership.graph.json if no co-change edges)
- ownership.graphml / cochange.graphml (optional, if --graphml)
people.csv includes timezone detection based on author commit offsets: primary_tz_offset, primary_tz_minutes, and timezone_offsets.
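One plausible way to derive those timezone columns is to take the most frequent UTC offset across a person's commits. The sketch below assumes git-style offset strings such as +0530. The helper names and the exact shape of the returned values are hypothetical, not the script's actual output format.

```python
from collections import Counter


def offset_to_minutes(offset):
    """Convert a git-style offset such as '+0530' or '-0800' to minutes."""
    sign = -1 if offset.startswith("-") else 1
    digits = offset.lstrip("+-")
    return sign * (int(digits[:2]) * 60 + int(digits[2:4]))


def primary_offset(offsets):
    """Return (most common offset, its minutes, per-offset counts)."""
    if not offsets:
        return None
    counts = Counter(offsets)
    primary, _ = counts.most_common(1)[0]
    return primary, offset_to_minutes(primary), dict(counts)
```

An offset is only a hint: it does not distinguish regions that share a UTC offset, and it shifts with daylight saving time.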
LLM query helper
Use scripts/query_ownership.py to return small, bounded JSON slices without loading the full graph into context.
Examples:
python skills/skills/security-ownership-map/scripts/query_ownership.py --data-dir ownership-map-out people --limit 10
python skills/skills/security-ownership-map/scripts/query_ownership.py --data-dir ownership-map-out files --tag auth --bus-factor-max 1
python skills/skills/security-ownership-map/scripts/query_ownership.py --data-dir ownership-map-out person --person alice@corp --limit 10
python skills/skills/security-ownership-map/scripts/query_ownership.py --data-dir ownership-map-out file --file crypto/tls
python skills/skills/security-ownership-map/scripts/query_ownership.py --data-dir ownership-map-out cochange --file crypto/tls --limit 10
python skills/skills/security-ownership-map/scripts/query_ownership.py --data-dir ownership-map-out summary --section orphaned_sensitive_code
python skills/skills/security-ownership-map/scripts/query_ownership.py --data-dir ownership-map-out community --id 3
Use --community-top-owners 5 (default) to control how many maintainers are stored per community.
Basic security queries
Run these to answer common security ownership questions with bounded output:
# Orphaned sensitive code (stale + low bus factor)
python skills/skills/security-ownership-map/scripts/query_ownership.py --data-dir ownership-map-out summary --section orphaned_sensitive_code
# Hidden owners for sensitive tags
python skills/skills/security-ownership-map/scripts/query_ownership.py --data-dir ownership-map-out summary --section hidden_owners
# Sensitive hotspots with low bus factor
python skills/skills/security-ownership-map/scripts/query_ownership.py --data-dir ownership-map-out summary --section bus_factor_hotspots
# Auth/crypto files with bus factor <= 1
python skills/skills/security-ownership-map/scripts/query_ownership.py --data-dir ownership-map-out files --tag auth --bus-factor-max 1
python skills/skills/security-ownership-map/scripts/query_ownership.py --data-dir ownership-map-out files --tag crypto --bus-factor-max 1
# Who is touching sensitive code the most
python skills/skills/security-ownership-map/scripts/query_ownership.py --data-dir ownership-map-out people --sort sensitive_touches --limit 10
# Co-change neighbors (cluster hints for ownership drift)
python skills/skills/security-ownership-map/scripts/query_ownership.py --data-dir ownership-map-out cochange --file path/to/file --min-jaccard 0.05 --limit 20
# Community maintainers (for a cluster)
python skills/skills/security-ownership-map/scripts/query_ownership.py --data-dir ownership-map-out community --id 3
# Monthly maintainers for the community containing a file
python skills/skills/security-ownership-map/scripts/community_maintainers.py \
--data-dir ownership-map-out \
--file network/card.c \
--since 2025-01-01 \
--top 5
# Quarterly buckets instead of monthly
python skills/skills/security-ownership-map/scripts/community_maintainers.py \
--data-dir ownership-map-out \
--file network/card.c \
--since 2025-01-01 \
--bucket quarter \
--top 5
Notes:
- Touches default to one authored commit (not per-file). Use --touch-mode file to count per-file touches.
- Use --window-days 90 or --weight recency --half-life-days 180 to smooth churn.
- Filter bots with --ignore-author-regex '(bot|dependabot)'.
- Use --min-share 0.1 to show stable maintainers only.
- Use --bucket quarter for calendar quarter groupings.
- Use --identity committer or --date-field committer to switch from author attribution.
- Use --include-merges to include merge commits (excluded by default).
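For intuition about the recency weighting mentioned in the notes above, a half-life weight is usually an exponential decay in commit age. The function below is a sketch of that standard formula. Whether community_maintainers.py computes it exactly this way is an assumption, and recency_weight is a hypothetical name.

```python
from datetime import datetime, timedelta, timezone


def recency_weight(commit_time, now, half_life_days=180.0):
    """Weight a commit by age: 1.0 when new, 0.5 after one half-life."""
    age_days = (now - commit_time).total_seconds() / 86400.0
    # Clamp so commits dated in the future are not weighted above 1.0.
    return 0.5 ** (max(age_days, 0.0) / half_life_days)


now = datetime(2025, 7, 1, tzinfo=timezone.utc)
fresh = recency_weight(now, now)
one_half_life = recency_weight(now - timedelta(days=180), now)
two_half_lives = recency_weight(now - timedelta(days=360), now)
```

With a 180-day half-life, a commit from a year ago counts for roughly a quarter of a commit made today, so long-departed contributors fade out of the maintainer ranking gradually instead of dropping off at a hard window edge.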
Summary format (default)
Use this structure and add fields if needed:
{
  "orphaned_sensitive_code": [
    {
      "path": "crypto/tls/handshake.rs",
      "last_security_touch": "2023-03-12T18:10:04+00:00",
      "bus_factor": 1
    }
  ],
  "hidden_owners": [
    {
      "person": "alice@corp",
      "controls": "63% of auth code"
    }
  ]
}
Graph persistence
Use references/neo4j-import.md when you need to load the CSVs into Neo4j. It includes constraints, import Cypher, and visualization tips.
Notes
- bus_factor_hotspots in summary.json lists sensitive files with low bus factor; orphaned_sensitive_code is the stale subset.
- If git log is too large, narrow with --since or --until.
- Compare summary.json against CODEOWNERS to highlight ownership drift.
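The CODEOWNERS comparison in the last note can be approximated as follows. This is a rough sketch: the helper names are hypothetical, only the orphaned_sensitive_code entries and their path field from the summary format are used, and fnmatch is a simplification of real CODEOWNERS matching, which follows gitignore-style rules.

```python
import fnmatch


def parse_codeowners(text):
    """Return [(pattern, [owners])] from CODEOWNERS text, ignoring comments."""
    rules = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        pattern, *owners = line.split()
        rules.append((pattern, owners))
    return rules


def owners_for(path, rules):
    """Return the owners of the last matching rule (later rules win)."""
    owners = []
    for pattern, rule_owners in rules:
        # Simplification: fnmatch instead of gitignore-style matching.
        if fnmatch.fnmatchcase(path, pattern.lstrip("/")):
            owners = rule_owners
    return owners


def uncovered_orphans(summary, rules):
    """List orphaned sensitive paths that no CODEOWNERS rule claims."""
    return [
        entry["path"]
        for entry in summary.get("orphaned_sensitive_code", [])
        if not owners_for(entry["path"], rules)
    ]
```

Paths returned by uncovered_orphans are both stale in git history and unclaimed on paper, which makes them reasonable first candidates for an ownership review.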
Source
git clone https://github.com/tech-leads-club/agent-skills/blob/main/packages/skills-catalog/skills/(security)/security-ownership-map/SKILL.md
Overview
Build a bipartite graph of people and files from git history to compute ownership risk and bus factor, exporting graph-ready artifacts for Neo4j/Gephi. It also creates a file co-change graph to cluster files by movement while filtering noise, enabling security-focused governance and risk visualization.
How This Skill Works
The tool parses git history to create people and files nodes with touches edges, computing ownership risk and bus factor from participation. It also builds a co-change graph using Jaccard similarity on shared commits and applies default community detection to form clusters, with outputs including CSV/JSON artifacts suitable for graph databases and visualization.
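The description above does not spell out how bus factor is derived from participation. One common definition is the smallest number of people who together account for at least half of a file's touches. The sketch below uses that definition as an assumption; the threshold and the function name are illustrative, not taken from the script.

```python
def bus_factor(touches, threshold=0.5):
    """Smallest number of people covering `threshold` of all touches.

    touches: mapping of person -> number of touches on one file.
    """
    total = sum(touches.values())
    if total == 0:
        return 0
    covered = 0
    # Walk contributors from most to least active until the share is reached.
    for rank, count in enumerate(sorted(touches.values(), reverse=True), start=1):
        covered += count
        if covered / total >= threshold:
            return rank
    return len(touches)
```

A file where one person made 8 of 10 touches gets a bus factor of 1, which is the kind of result the --bus-factor-max 1 queries are meant to surface.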
When to Use It
- When you explicitly need a security-oriented ownership or bus-factor analysis grounded in git history.
- To identify orphaned sensitive code or security hotspots lacking proper ownership.
- To perform CODEOWNERS reality checks for risk and reassess ownership coverage.
- To reveal sensitive hotspots and ownership clusters that may indicate security risk.
- To generate graph artifacts for visualization in graph databases or network tools.
Quick Start
- Step 1: Scope the repo and optional time window with --since/--until and set sensitivity rules if needed.
- Step 2: Run the map: python skills/skills/security-ownership-map/scripts/run_ownership_map.py --repo . --out ownership-map-out
- Step 3: Query or visualize outputs (edges.csv, people.csv, files.csv, cochange.graph.json) in Neo4j/Gephi or using summary.json.
Best Practices
- Scope the repository and optional time window using --since/--until to limit history.
- Define sensitivity rules via a CSV configuration or rely on defaults.
- Run the ownership map with scripts/run_ownership_map.py and adjust co-change settings as needed.
- Review outputs: people.csv, files.csv, edges.csv, summary.json, communities.json, and cochange edges.
- Validate results by visualizing in Neo4j/Gephi and cross-checking with CODEOWNERS and security policies.
Example Use Cases
- Identify orphaned sensitive code paths with no active security maintainers.
- Audit CODEOWNERS coverage against high-risk paths (auth/crypto) to reduce risk.
- Cluster files by co-change to reveal security-related ownership communities.
- Export graphs to Neo4j/Gephi for a governance dashboard and risk scoring.
- Use summary.json to surface high-risk ownership clusters and hotspots for remediation.