Pynchy Ops
Install: npx machina-cli add skill crypdick/pynchy/pynchy-ops --openclaw
The pynchy service runs on pynchy-server over Tailscale. SSH: ssh pynchy-server.
Auto-deploy: Never Restart Manually
Pynchy self-manages. Two mechanisms trigger automatic restarts:
- Git changes on main — the polling mechanism detects new commits, pulls, and restarts (with a container rebuild if source files changed).
- Config file changes — editing config.toml, litellm_config.yaml, or other settings files triggers an automatic restart. Edit the file and wait ~30–90s.
Do not manually restart containers or the service. This includes docker restart, systemctl restart, and direct container management (docker kill/stop/rm). Manual restarts bypass lifecycle management and can leave things in a bad state.
Only use manual commands when the service is unhealthy and needs fixing. See references/server-debug.md for diagnostic steps.
Quick Status Check
Preferred: the /status endpoint. Single command that returns everything:
# On pynchy-server directly:
curl -s http://localhost:8484/status | python3 -m json.tool
# Remotely (via Tailscale):
curl -s http://pynchy-server:8484/status | python3 -m json.tool
Returns JSON with: service (uptime), deploy (SHA, dirty, unpushed), channels (slack/whatsapp connected), gateway (LiteLLM health, model counts), queue (active containers, waiting groups), repos (per-repo worktree status — SHA, dirty, ahead/behind, conflicts), messages (inbound/outbound counts, last activity), tasks (scheduled tasks with status/next run), host_jobs, groups (total, active sessions).
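The /status payload can be condensed into a quick pass/fail check. The sketch below is a minimal parser over a hypothetical payload; the field names follow the list above, but treat the exact schema (key names, nesting) as an assumption and adjust to what your server actually returns.

```python
import json

# Minimal stand-in for a /status response; the schema here is assumed,
# not guaranteed to match the real server exactly.
SAMPLE = json.loads("""
{
  "service": {"uptime_s": 5432},
  "deploy": {"sha": "ab12cd3", "dirty": false, "unpushed": 0},
  "channels": {"slack": true, "whatsapp": true},
  "gateway": {"healthy": true, "models": 14},
  "queue": {"active_containers": 2, "waiting_groups": 0}
}
""")

def health_summary(status: dict) -> str:
    """Condense a /status payload into a one-line health string."""
    problems = []
    if status["deploy"]["dirty"]:
        problems.append("dirty worktree")
    if not status["gateway"]["healthy"]:
        problems.append("gateway unhealthy")
    for name, connected in status["channels"].items():
        if not connected:
            problems.append(f"{name} disconnected")
    return "OK" if not problems else "; ".join(problems)

print(health_summary(SAMPLE))  # OK
```

Pipe the real endpoint into this instead of SAMPLE (e.g. `curl -s http://pynchy-server:8484/status | python3 summary.py`) once the keys are confirmed.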
Fallback: manual commands (when the HTTP server is down or you need logs):
# 1. Is the service running?
systemctl --user status pynchy
# 2. Any running containers?
docker ps --filter name=pynchy
# 3. Any stopped/orphaned containers?
docker ps -a --filter name=pynchy
# 4. Recent errors in service log?
journalctl --user -u pynchy -p err -n 20
# 5. Is WhatsApp connected?
journalctl --user -u pynchy --grep 'Connected to WhatsApp|Connection closed' -n 5
# 6. Are groups loaded?
journalctl --user -u pynchy --grep 'groupCount' -n 3
Deploy & Observe
# Trigger a deploy (from HOST — use mcp__pynchy__deploy_changes from containers)
curl -s -X POST http://pynchy-server:8484/deploy
# Observe (always safe)
ssh pynchy-server 'systemctl --user status pynchy'
ssh pynchy-server 'journalctl --user -u pynchy -f'
ssh pynchy-server 'journalctl --user -u pynchy -n 100'
ssh pynchy-server 'docker ps --filter name=pynchy'
# Manual restart — ONLY for unhealthy/stuck service
ssh pynchy-server 'systemctl --user restart pynchy'
Monitoring Live Agent Activity
journalctl only shows lifecycle events (container spawn, session create/destroy, errors). It does NOT show agent output (tool calls, thinking, text broadcasts). To monitor what an agent is actually doing, query SQLite:
# Recent activity for a specific group (replace <JID> with e.g. slack:C0AFR6DB0FK)
ssh pynchy-server 'sqlite3 data/messages.db "
SELECT timestamp, message_type, substr(content, 1, 120)
FROM messages WHERE chat_jid = '\''<JID>'\''
ORDER BY timestamp DESC LIMIT 15;
"'
# All recent activity across all groups
ssh pynchy-server 'sqlite3 data/messages.db "
SELECT timestamp, chat_jid, message_type, substr(content, 1, 80)
FROM messages ORDER BY timestamp DESC LIMIT 15;
"'
When to use what:
| What you need | Tool |
|---|---|
| Is the service running? | systemctl --user status pynchy |
| Did the container spawn/crash? | journalctl or docker logs |
| What is the agent doing right now? | SQLite messages table |
| Agent tool calls and traces | SQLite events table |
| Container startup errors (before DB writes) | docker logs pynchy-<group> |
Sending Synthetic Messages
Use the TUI API to inject messages into any group's chat pipeline (useful for testing):
# Send a message as if a user typed it
curl -s -X POST http://pynchy-server:8484/api/send \
-H "Content-Type: application/json" \
-d '{"jid": "<JID>", "content": "your message here"}'
This goes through the full message pipeline (routing → agent → output → broadcast), same as a real Slack/WhatsApp message.
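For scripted testing, the same request can be built in Python. The sketch below only constructs the request (no network call); the /api/send path and JSON field names come from this section, and the example JID is hypothetical.

```python
import json
import urllib.request

def build_send_request(jid: str, content: str,
                       base: str = "http://pynchy-server:8484") -> urllib.request.Request:
    """Build (but do not send) the synthetic-message POST described above."""
    body = json.dumps({"jid": jid, "content": content}).encode()
    return urllib.request.Request(
        f"{base}/api/send", data=body,
        headers={"Content-Type": "application/json"}, method="POST")

req = build_send_request("slack:C0AFR6DB0FK", "ping from test harness")
print(req.full_url, req.get_method())
# Actually send with: urllib.request.urlopen(req)
```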
Service Management Reference
macOS:
launchctl load ~/Library/LaunchAgents/com.pynchy.plist
launchctl unload ~/Library/LaunchAgents/com.pynchy.plist
Linux:
systemctl --user start pynchy
systemctl --user stop pynchy
systemctl --user restart pynchy
journalctl --user -u pynchy -f # Follow logs
Systemd unit template: config-examples/pynchy.service.EXAMPLE
Container GitHub Access
Admin containers only. GH_TOKEN is forwarded only to admin containers. Non-admin containers have git operations routed through host IPC and never receive the token.
# Interactive login (works over SSH with -t for TTY)
ssh -t pynchy-server 'gh auth login -p ssh'
# Verify
ssh pynchy-server 'gh auth status'
After authenticating, _write_env_file() auto-discovers GH_TOKEN and git identity on each admin container launch. No manual env configuration needed.
Container Build Cache
Apple Container's buildkit caches the build context aggressively. --no-cache alone does NOT invalidate COPY steps. To force a truly clean rebuild:
container builder stop && container builder rm && container builder start
./src/pynchy/agent/build.sh
Verify: container run -i --rm --entrypoint python pynchy-agent:latest -c "import agent_runner; print('OK')"
LiteLLM Gateway
Runs as pynchy-litellm Docker container with PostgreSQL sidecar (pynchy-litellm-db). Access at http://localhost:4000 on the pynchy server, or via Tailscale at port 4000.
Master key: ssh pynchy-server 'grep master_key ~/src/PERSONAL/pynchy/config.toml'
Pass as: Authorization: Bearer <key>
If master_key is not in config.toml, it may be injected via .env or container env. Prefer a scripted lookup that does not print the key, e.g. using it inline for a request (see references/litellm-diagnostics.md for examples).
Config: ~/src/PERSONAL/pynchy/litellm_config.yaml. Editing it triggers an automatic restart (~30–90s). Do not manually restart containers.
Dashboard: http://pynchy-server:4000/ui/
- Diagnostics, spend tracking, failure analysis: references/litellm-diagnostics.md
- MCP server management API and gotchas: references/litellm-mcp-api.md
Zombie Processes (LiteLLM)
If SSH login reports zombie processes, check whether they live inside the LiteLLM container:
ssh pynchy-server 'docker exec pynchy-litellm ps -eo pid,ppid,stat,args | awk '\''$3 ~ /Z/ {print}'\'''
Note: use args, not cmd — cmd can appear empty for zombie processes.
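The awk filter above can also be expressed in Python, which is easier to extend (e.g. to report parent PIDs for follow-up). The ps output below is fabricated for illustration; the column order matches `ps -eo pid,ppid,stat,args`.

```python
# Parse `ps -eo pid,ppid,stat,args` output and keep zombie rows,
# mirroring the awk filter above ($3 ~ /Z/).
PS_OUTPUT = """\
  PID  PPID STAT ARGS
    1     0 Ss   /usr/bin/python3 litellm
   42     1 Z    [worker] <defunct>
   43     1 S    postgres: checkpointer
"""

def zombies(ps_text: str) -> list[dict]:
    out = []
    for line in ps_text.splitlines()[1:]:  # skip the header row
        pid, ppid, stat, args = line.split(None, 3)
        if "Z" in stat:
            out.append({"pid": int(pid), "ppid": int(ppid), "args": args})
    return out

print(zombies(PS_OUTPUT))  # [{'pid': 42, 'ppid': 1, 'args': '[worker] <defunct>'}]
```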
MCP Server Containers
MCP tool servers (e.g., Playwright) run as separate Docker containers managed by McpManager. They start on-demand when an agent needs them and stop after the configured idle_timeout.
See src/pynchy/host/container_manager/mcp/ and MCP management.
Database Files
All databases live in data/:
| File | Purpose |
|---|---|
| data/messages.db | Main DB — messages, groups, sessions, tasks, events, outbound ledger |
| data/neonize.db | WhatsApp auth state (Neonize credentials) |
| data/memories.db | BM25-ranked memory store (sqlite-memory plugin) |
Quick inspection (run on pynchy-server or prefix with ssh pynchy-server):
# List registered groups
sqlite3 data/messages.db "SELECT name, folder, is_admin FROM registered_groups;"
# Recent messages across all channels
sqlite3 data/messages.db "SELECT timestamp, chat_jid, sender_name, substr(content, 1, 80) FROM messages ORDER BY timestamp DESC LIMIT 10;"
# Active sessions
sqlite3 data/messages.db "SELECT * FROM sessions;"
# Scheduled tasks
sqlite3 data/messages.db "SELECT id, group_folder, status, next_run FROM scheduled_tasks WHERE status = 'active';"
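The scheduled-tasks query above, sketched in Python against an in-memory stand-in table (column names follow the query; the real schema may differ):

```python
import sqlite3

# Stand-in for the scheduled_tasks table in data/messages.db.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE scheduled_tasks (
    id INTEGER PRIMARY KEY, group_folder TEXT, status TEXT, next_run TEXT)""")
db.executemany(
    "INSERT INTO scheduled_tasks (group_folder, status, next_run) VALUES (?, ?, ?)",
    [("ops", "active", "2025-01-02T09:00:00"),
     ("dev", "paused", None)])

active = db.execute(
    "SELECT id, group_folder, status, next_run "
    "FROM scheduled_tasks WHERE status = 'active'").fetchall()
print(active)  # [(1, 'ops', 'active', '2025-01-02T09:00:00')]
```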
For the full query cookbook (traces, tool calls, cross-table debugging), see the pynchy-dev skill's sqlite-queries.md.
Server Debugging
For specific failure scenarios — container timeouts, agent not responding, mount issues, WhatsApp auth — see references/server-debug.md.
Docker logs are useful for runtime errors (container crashes, process failures) where the issue occurs before messages reach the database. For agent behavior, use the pynchy-dev skill's SQLite query reference instead.
Source
git clone https://github.com/crypdick/pynchy
The skill file lives at .claude/skills/pynchy-ops/SKILL.md in that repository.
Overview
Pynchy Ops provides practical commands and workflows to manage the pynchy service on pynchy-server via SSH and to diagnose the LiteLLM proxy. It covers deploying changes, observing logs, checking service status, restarting when needed, and configuring GitHub auth or rebuilding the agent container. It also guides troubleshooting of the LiteLLM UI and dashboard, proxy errors, and model availability.
How This Skill Works
The pynchy service runs on pynchy-server over Tailscale and is operable via SSH (ssh pynchy-server). Automatic restarts are triggered by Git changes on main or config file edits (config.toml, litellm_config.yaml), with manual restarts discouraged unless the service is unhealthy. Use the /status endpoint for a full health view, and perform low-level checks with systemctl, docker, journalctl, and SQLite queries to observe actual agent activity.
When to Use It
- Trigger a deploy after a commit by sending a POST to http://pynchy-server:8484/deploy from HOST.
- Quickly verify the overall health and status using the /status endpoint or, if unavailable, via systemctl, docker, and journalctl queries.
- Diagnose unhealthy or stuck services and perform manual restarts only as a last resort.
- Investigate LiteLLM proxy issues, including failed requests, model routing, spend tracking, and API gateway diagnostics or config changes.
- Consult the LiteLLM UI/dashboard when a user reports proxy errors or model availability issues.
Quick Start
- Step 1: Ensure you can reach pynchy-server (via SSH or Tailscale).
- Step 2: Trigger a deploy: curl -s -X POST http://pynchy-server:8484/deploy
- Step 3: Check status or logs: curl -s http://pynchy-server:8484/status | python3 -m json.tool
Best Practices
- Always check the /status endpoint first for a comprehensive health snapshot.
- Avoid manual container or service restarts; rely on automatic restarts triggered by code or config changes.
- Use SSH to pynchy-server for in-depth checks (systemctl, docker, journalctl) when issues persist.
- Use journalctl to inspect lifecycle events and SQLite data/messages.db to see actual agent activity.
- Edit config.toml or litellm_config.yaml to adjust settings and allow automatic restart rather than manual intervention.
Example Use Cases
- Trigger a deploy from the host after a main branch commit and then verify status remotely via the /status endpoint.
- Check service status and recent logs to confirm a healthy restart or to identify the root cause of a failure.
- Verify that WhatsApp is connected and groups are loaded by inspecting logs and status fields.
- Diagnose LiteLLM proxy health, inspect model counts, and review API gateway diagnostics when requests fail.
- Query the SQLite messages database to review recent agent activity for a specific group.