ci-fix
CI fix (GitHub Actions)
Goal
- Get CI green quickly with minimal, reviewable diffs.
- Use gh to locate failing runs, inspect logs/artifacts, rerun jobs, and confirm the fix.
Inputs to ask for (if missing)
- Repo (OWNER/REPO) and whether this is a PR or branch build.
- Failing run URL/ID (or PR number / branch name).
- What "green" means (required workflows? allowed flaky reruns?).
- Any constraints (no workflow edits, no permission changes, no force-push, etc.).
Workflow (checklist)
- Confirm gh context
  - Auth: gh auth status
  - Repo: gh repo view --json nameWithOwner -q .nameWithOwner
  - If needed, add -R OWNER/REPO to all commands.
  - If gh is not installed or not authenticated, tell the user and ask whether to install/authenticate or proceed by pasting logs/run URLs manually.
- Find the failing run
  - If you have a run URL, extract the run ID: .../actions/runs/<id>.
  - Otherwise:
    - Recent failures: gh run list --limit 20 --status failure
    - Branch failures: gh run list --branch <branch> --limit 20 --status failure
    - Workflow failures: gh run list -w <workflow> --limit 20 --status failure
  - Open in browser: gh run view <id> --web
- Pull the signal from logs
  - Job/step overview: gh run view <id> --verbose
  - Failed steps only: gh run view <id> --log-failed
  - Full log for a job: gh run view <id> --log --job <job-id>
  - Download artifacts: gh run download <id> -D .artifacts/<id>
- Identify root cause (prefer the smallest fix)
  - Use references/ci-failure-playbook.md for common patterns and safe fixes.
  - Prefer: deterministic code/config fix > workflow plumbing fix > rerun flake.
- Implement the fix (minimal diff)
  - Update code/tests/config and/or .github/workflows/*.yml.
  - Keep changes scoped to the failing job/step.
  - If changing triggers/permissions/secrets, call out risk and get explicit confirmation.
- Verify in GitHub Actions
  - Rerun only failures: gh run rerun <id> --failed
  - Rerun a specific job (note: use the job databaseId): list the jobs with gh run view <id> --json jobs --jq '.jobs[] | {name,databaseId,conclusion}', then run gh run rerun --job <databaseId>
  - Watch until done: gh run watch <id> --compact --exit-status
  - Manually trigger: gh workflow run <workflow> --ref <branch>
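The find, inspect, and verify steps above can be sketched as a few small shell helpers. This is a minimal sketch, not part of gh itself: the function names (run_id_from_url, error_lines, rerun_and_watch) and the error pattern are assumptions made for illustration, and only the gh subcommands and flags already listed in the checklist are used.

```shell
# Hypothetical helpers; the names are illustrative and not provided by gh.

# Extract the numeric run ID from a URL shaped like .../actions/runs/<id>[/...].
# Prints nothing if the URL does not contain a run ID.
run_id_from_url() {
  printf '%s\n' "$1" | sed -n 's#.*/actions/runs/\([0-9][0-9]*\).*#\1#p'
}

# Read a log on stdin and keep only lines that look like errors,
# prefixed with their line numbers. The pattern is an assumption;
# tune it to the toolchain used by the failing job.
error_lines() {
  grep -n -i -E 'error|failed|exception' || true
}

# Rerun only the failed jobs of a run, then block until the run finishes.
# Returns non-zero if the rerun cannot be started or the run ends in failure.
rerun_and_watch() {
  gh run rerun "$1" --failed && gh run watch "$1" --compact --exit-status
}

# Example usage (not executed here):
#   id=$(run_id_from_url "<failing run URL>")
#   gh run view "$id" --log-failed | error_lines
#   rerun_and_watch "$id"
```

Keeping the helpers separate means each command stays visible and can be pasted into the deliverable's verification section unchanged.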
Safety notes
- Avoid pull_request_target (and any change that runs untrusted fork code with secrets) unless the user explicitly requests it and understands the security tradeoffs.
- Keep workflow permissions: least-privilege; don’t broaden token access “just to make it pass”.
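One way to back these notes with a quick check is to scan the workflow files before pushing a fix. The sketch below is an assumption-laden illustration: the helper name risky_workflow_lines and the search pattern are hypothetical, and a match only means "review this line", not that the workflow is unsafe.

```shell
# Hypothetical helper: list workflow lines that deserve a security review.
# Defaults to .github/workflows; pass another directory as the first argument.
risky_workflow_lines() {
  dir="${1:-.github/workflows}"
  # Flags the pull_request_target trigger and blanket write-all permissions.
  grep -rnE 'pull_request_target|permissions:[[:space:]]*write-all' "$dir" 2>/dev/null || true
}

# Example usage (not executed here):
#   risky_workflow_lines
```

An empty result does not prove the workflow is least-privilege; it only rules out the two patterns searched for.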
Deliverable (paste in chat / PR)
- Summary: ...
- Failing run: <link/id> (job/step)
- Root cause: ...
- Fix: ...
- Verification: commands + new run link/id
- Notes/risks: ...
Overview
ci-fix helps you diagnose and repair failing GitHub Actions builds using the GitHub CLI. It guides you to locate failing runs, inspect logs and artifacts, implement a minimal, reviewable patch, and verify the fix by rerunning jobs. This approach keeps CI green with clear traceability.
How This Skill Works
First confirm gh context and the target repo, then locate the failing run using gh run list or a direct URL and inspect logs with gh run view. Identify the root cause and implement a minimal patch to code, tests, or workflows with a scoped diff. Finally rerun the failed jobs and watch the run to completion.
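The first step described here, confirming gh context, could be sketched as a pre-flight function. The function name check_gh_context, its messages, and its return codes are assumptions for illustration; the gh commands are the ones named in the checklist, plus gh auth login as the suggested remedy.

```shell
# Hypothetical pre-flight check; name, messages, and return codes are illustrative.
check_gh_context() {
  # Return 1 if the gh CLI cannot be found.
  if ! command -v gh >/dev/null 2>&1; then
    echo "gh is not installed" >&2
    return 1
  fi
  # Return 2 if gh is installed but not logged in.
  if ! gh auth status >/dev/null 2>&1; then
    echo "gh is not authenticated (try: gh auth login)" >&2
    return 2
  fi
  # Print OWNER/REPO for the repository in the current directory.
  gh repo view --json nameWithOwner -q .nameWithOwner
}

# Example usage (not executed here):
#   repo=$(check_gh_context) || echo "fall back to pasted logs/run URLs"
```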
When to Use It
- CI is failing after the latest PR and needs diagnosis
- You have a direct failing run URL or ID to investigate
- You want a minimal, reviewable patch rather than a broad workflow rewrite
- You need to verify fixes by rerunning only the failed jobs
- You must preserve security and permissions while fixing CI
Quick Start
- Step 1: Confirm gh context and repo state with gh auth status and gh repo view
- Step 2: Locate the failing run with gh run list and inspect logs with gh run view
- Step 3: Implement a minimal fix in code or workflow, push, and verify by rerunning the failing jobs
Best Practices
- Verify gh is installed and authenticated before starting
- Limit changes to the failing job or step and prefer small diffs
- Use deterministic fixes when possible and avoid broad workflow plumbing
- Rerun only the failed jobs and document the commands used
- Provide a clear deliverable including summary, failing run, root cause, fix, verification, and risks
Example Use Cases
- A unit test intermittently fails causing CI to fail; fix by adjusting timeout and caching
- Workflow step fails due to a missing secret; patch workflow to handle missing secrets gracefully
- CI failure due to flaky network; add retry with backoff in the action
- PR introduces a breaking change; fix by pinning an action version and updating tests
- Job uses deprecated API; patch code to use new API and rerun
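For the flaky-network case in the list above, a retry with exponential backoff inside a workflow step's shell script might look like the following. This is a sketch under stated assumptions: the retry function, its argument order, and the doubling delay are illustrative choices, not something the skill prescribes.

```shell
# Hypothetical helper: retry a command with exponential backoff.
# Usage: retry <max_attempts> <initial_delay_seconds> <command> [args...]
retry() {
  max="$1"
  delay="$2"
  shift 2
  attempt=1
  while true; do
    # Stop as soon as the command succeeds.
    if "$@"; then
      return 0
    fi
    # Give up once the attempt budget is spent.
    if [ "$attempt" -ge "$max" ]; then
      echo "failed after $attempt attempts: $*" >&2
      return 1
    fi
    sleep "$delay"
    delay=$((delay * 2))       # double the wait before the next attempt
    attempt=$((attempt + 1))
  done
}

# Example usage (not executed here; the wrapped command is a placeholder):
#   retry 3 2 <network-dependent command>
```

A retry hides the symptom rather than the cause, so it fits the "rerun flake" end of the preference order and is worth calling out under Notes/risks in the deliverable.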