Pynchy Ops
Install: npx machina-cli add skill crypdick/pynchy/pynchy-ops --openclaw
The pynchy service runs on pynchy-server over Tailscale. SSH: ssh pynchy-server.
Auto-deploy: Never Restart Manually
Pynchy self-manages. Two mechanisms trigger automatic restarts:
- Git changes on main — the polling mechanism detects new commits, pulls, and restarts (with a container rebuild if source files changed).
- Config file changes — editing config.toml, litellm_config.yaml, or other settings files triggers an automatic restart. Edit the file and wait ~30–90s.
Do not manually restart containers or the service. This includes docker restart, systemctl restart, and direct container management (docker kill/stop/rm). Manual restarts bypass lifecycle management and can leave things in a bad state.
Only use manual commands when the service is unhealthy and needs fixing. See references/server-debug.md for diagnostic steps.
Quick Status Check
Preferred: the /status endpoint. Single command that returns everything:
# On pynchy-server directly:
curl -s http://localhost:8484/status | python3 -m json.tool
# Remotely (via Tailscale):
curl -s http://pynchy-server:8484/status | python3 -m json.tool
Returns JSON with: service (uptime), deploy (SHA, dirty, unpushed), channels (slack/whatsapp connected), gateway (LiteLLM health, model counts), queue (active containers, waiting groups), repos (per-repo worktree status — SHA, dirty, ahead/behind, conflicts), messages (inbound/outbound counts, last activity), tasks (scheduled tasks with status/next run), host_jobs, groups (total, active sessions).
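The /status payload can be condensed into a quick pass/fail check. The sketch below is a minimal parser over a hypothetical payload; the field names follow the list above, but treat the exact schema (key names, nesting) as an assumption and adjust to what your server actually returns.

```python
import json

# Minimal stand-in for a /status response; the schema here is assumed,
# not guaranteed to match the real server exactly.
SAMPLE = json.loads("""
{
  "service": {"uptime_s": 5432},
  "deploy": {"sha": "ab12cd3", "dirty": false, "unpushed": 0},
  "channels": {"slack": true, "whatsapp": true},
  "gateway": {"healthy": true, "models": 14},
  "queue": {"active_containers": 2, "waiting_groups": 0}
}
""")

def health_summary(status: dict) -> str:
    """Condense a /status payload into a one-line health string."""
    problems = []
    if status["deploy"]["dirty"]:
        problems.append("dirty worktree")
    if not status["gateway"]["healthy"]:
        problems.append("gateway unhealthy")
    for name, connected in status["channels"].items():
        if not connected:
            problems.append(f"{name} disconnected")
    return "OK" if not problems else "; ".join(problems)

print(health_summary(SAMPLE))  # OK
```

Pipe the real endpoint into this instead of SAMPLE (e.g. `curl -s http://pynchy-server:8484/status | python3 summary.py`) once the keys are confirmed.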
Fallback: manual commands (when the HTTP server is down or you need logs):
# 1. Is the service running?
systemctl --user status pynchy
# 2. Any running containers?
docker ps --filter name=pynchy
# 3. Any stopped/orphaned containers?
docker ps -a --filter name=pynchy
# 4. Recent errors in service log?
journalctl --user -u pynchy -p err -n 20
# 5. Is WhatsApp connected?
journalctl --user -u pynchy --grep 'Connected to WhatsApp|Connection closed' -n 5
# 6. Are groups loaded?
journalctl --user -u pynchy --grep 'groupCount' -n 3
Deploy & Observe
# Trigger a deploy (from HOST — use mcp__pynchy__deploy_changes from containers)
curl -s -X POST http://pynchy-server:8484/deploy
# Observe (always safe)
ssh pynchy-server 'systemctl --user status pynchy'
ssh pynchy-server 'journalctl --user -u pynchy -f'
ssh pynchy-server 'journalctl --user -u pynchy -n 100'
ssh pynchy-server 'docker ps --filter name=pynchy'
# Manual restart — ONLY for unhealthy/stuck service
ssh pynchy-server 'systemctl --user restart pynchy'
Monitoring Live Agent Activity
journalctl only shows lifecycle events (container spawn, session create/destroy, errors). It does NOT show agent output (tool calls, thinking, text broadcasts). To monitor what an agent is actually doing, query SQLite:
# Recent activity for a specific group (replace <JID> with e.g. slack:C0AFR6DB0FK)
ssh pynchy-server 'sqlite3 data/messages.db "
SELECT timestamp, message_type, substr(content, 1, 120)
FROM messages WHERE chat_jid = '\''<JID>'\''
ORDER BY timestamp DESC LIMIT 15;
"'
# All recent activity across all groups
ssh pynchy-server 'sqlite3 data/messages.db "
SELECT timestamp, chat_jid, message_type, substr(content, 1, 80)
FROM messages ORDER BY timestamp DESC LIMIT 15;
"'
When to use what:
| What you need | Tool |
|---|---|
| Is the service running? | systemctl --user status pynchy |
| Did the container spawn/crash? | journalctl or docker logs |
| What is the agent doing right now? | SQLite messages table |
| Agent tool calls and traces | SQLite events table |
| Container startup errors (before DB writes) | docker logs pynchy-<group> |
Sending Synthetic Messages
Use the TUI API to inject messages into any group's chat pipeline (useful for testing):
# Send a message as if a user typed it
curl -s -X POST http://pynchy-server:8484/api/send \
-H "Content-Type: application/json" \
-d '{"jid": "<JID>", "content": "your message here"}'
This goes through the full message pipeline (routing → agent → output → broadcast), same as a real Slack/WhatsApp message.
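For scripted testing, the same request can be built in Python. The sketch below only constructs the request (no network call); the /api/send path and JSON field names come from this section, and the example JID is hypothetical.

```python
import json
import urllib.request

def build_send_request(jid: str, content: str,
                       base: str = "http://pynchy-server:8484") -> urllib.request.Request:
    """Build (but do not send) the synthetic-message POST described above."""
    body = json.dumps({"jid": jid, "content": content}).encode()
    return urllib.request.Request(
        f"{base}/api/send", data=body,
        headers={"Content-Type": "application/json"}, method="POST")

req = build_send_request("slack:C0AFR6DB0FK", "ping from test harness")
print(req.full_url, req.get_method())
# Actually send with: urllib.request.urlopen(req)
```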
Service Management Reference
macOS:
launchctl load ~/Library/LaunchAgents/com.pynchy.plist
launchctl unload ~/Library/LaunchAgents/com.pynchy.plist
Linux:
systemctl --user start pynchy
systemctl --user stop pynchy
systemctl --user restart pynchy
journalctl --user -u pynchy -f # Follow logs
Systemd unit template: config-examples/pynchy.service.EXAMPLE
Container GitHub Access
Admin containers only. GH_TOKEN is forwarded only to admin containers. Non-admin containers have git operations routed through host IPC and never receive the token.
# Interactive login (works over SSH with -t for TTY)
ssh -t pynchy-server 'gh auth login -p ssh'
# Verify
ssh pynchy-server 'gh auth status'
After authenticating, _write_env_file() auto-discovers GH_TOKEN and git identity on each admin container launch. No manual env configuration needed.
Container Build Cache
Apple Container's buildkit caches the build context aggressively. --no-cache alone does NOT invalidate COPY steps. To force a truly clean rebuild:
container builder stop && container builder rm && container builder start
./src/pynchy/agent/build.sh
Verify: container run -i --rm --entrypoint python pynchy-agent:latest -c "import agent_runner; print('OK')"
LiteLLM Gateway
Runs as pynchy-litellm Docker container with PostgreSQL sidecar (pynchy-litellm-db). Access at http://localhost:4000 on the pynchy server, or via Tailscale at port 4000.
Master key: ssh pynchy-server 'grep master_key ~/src/PERSONAL/pynchy/config.toml'
Pass as: Authorization: Bearer <key>
If master_key is not in config.toml, it may be injected via .env or container env. Prefer a scripted lookup that does not print the key, e.g. using it inline for a request (see references/litellm-diagnostics.md for examples).
Config: ~/src/PERSONAL/pynchy/litellm_config.yaml. Editing it triggers an automatic restart (~30–90s). Do not manually restart containers.
Dashboard: http://pynchy-server:4000/ui/
- Diagnostics, spend tracking, failure analysis: references/litellm-diagnostics.md
- MCP server management API and gotchas: references/litellm-mcp-api.md
Zombie Processes (LiteLLM)
If SSH login reports zombie processes, check whether they live inside the LiteLLM container:
ssh pynchy-server 'docker exec pynchy-litellm ps -eo pid,ppid,stat,args | awk '\''$3 ~ /Z/ {print}'\'''
Note: use args, not cmd — cmd can appear empty for zombie processes.
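The awk filter above can also be expressed in Python, which is easier to extend (e.g. to report parent PIDs for follow-up). The ps output below is fabricated for illustration; the column order matches `ps -eo pid,ppid,stat,args`.

```python
# Parse `ps -eo pid,ppid,stat,args` output and keep zombie rows,
# mirroring the awk filter above ($3 ~ /Z/).
PS_OUTPUT = """\
  PID  PPID STAT ARGS
    1     0 Ss   /usr/bin/python3 litellm
   42     1 Z    [worker] <defunct>
   43     1 S    postgres: checkpointer
"""

def zombies(ps_text: str) -> list[dict]:
    out = []
    for line in ps_text.splitlines()[1:]:  # skip the header row
        pid, ppid, stat, args = line.split(None, 3)
        if "Z" in stat:
            out.append({"pid": int(pid), "ppid": int(ppid), "args": args})
    return out

print(zombies(PS_OUTPUT))  # [{'pid': 42, 'ppid': 1, 'args': '[worker] <defunct>'}]
```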
MCP Server Containers
MCP tool servers (e.g., Playwright) run as separate Docker containers managed by McpManager. They start on-demand when an agent needs them and stop after the configured idle_timeout.
See src/pynchy/host/container_manager/mcp/ and MCP management.
Database Files
All databases live in data/:
| File | Purpose |
|---|---|
| data/messages.db | Main DB — messages, groups, sessions, tasks, events, outbound ledger |
| data/neonize.db | WhatsApp auth state (Neonize credentials) |
| data/memories.db | BM25-ranked memory store (sqlite-memory plugin) |
Quick inspection (run on pynchy-server or prefix with ssh pynchy-server):
# List registered groups
sqlite3 data/messages.db "SELECT name, folder, is_admin FROM registered_groups;"
# Recent messages across all channels
sqlite3 data/messages.db "SELECT timestamp, chat_jid, sender_name, substr(content, 1, 80) FROM messages ORDER BY timestamp DESC LIMIT 10;"
# Active sessions
sqlite3 data/messages.db "SELECT * FROM sessions;"
# Scheduled tasks
sqlite3 data/messages.db "SELECT id, group_folder, status, next_run FROM scheduled_tasks WHERE status = 'active';"
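The scheduled-tasks query above, sketched in Python against an in-memory stand-in table (column names follow the query; the real schema may differ):

```python
import sqlite3

# Stand-in for the scheduled_tasks table in data/messages.db.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE scheduled_tasks (
    id INTEGER PRIMARY KEY, group_folder TEXT, status TEXT, next_run TEXT)""")
db.executemany(
    "INSERT INTO scheduled_tasks (group_folder, status, next_run) VALUES (?, ?, ?)",
    [("ops", "active", "2025-01-02T09:00:00"),
     ("dev", "paused", None)])

active = db.execute(
    "SELECT id, group_folder, status, next_run "
    "FROM scheduled_tasks WHERE status = 'active'").fetchall()
print(active)  # [(1, 'ops', 'active', '2025-01-02T09:00:00')]
```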
For the full query cookbook (traces, tool calls, cross-table debugging), see the pynchy-dev skill's sqlite-queries.md.
Server Debugging
For specific failure scenarios — container timeouts, agent not responding, mount issues, WhatsApp auth — see references/server-debug.md.
Docker logs are useful for runtime errors (container crashes, process failures) where the issue occurs before messages reach the database. For agent behavior, use the pynchy-dev skill's SQLite query reference instead.
Source
git clone https://github.com/crypdick/pynchy
The skill file lives at .claude/skills/pynchy-ops/SKILL.md in that repository.
Overview
Pynchy Ops provides practical commands and workflows to manage the pynchy service on pynchy-server via SSH and to diagnose the LiteLLM proxy. It covers deploying changes, observing logs, checking service status, restarting when needed, and configuring GitHub auth or rebuilding the agent container. It also guides troubleshooting of the LiteLLM UI and dashboard, proxy errors, and model availability.
How This Skill Works
The pynchy service runs on pynchy-server over Tailscale and is operable via SSH (ssh pynchy-server). Automatic restarts are triggered by Git changes on main or config file edits (config.toml, litellm_config.yaml), with manual restarts discouraged unless the service is unhealthy. Use the /status endpoint for a full health view, and perform low-level checks with systemctl, docker, journalctl, and SQLite queries to observe actual agent activity.
When to Use It
- Trigger a deploy after a commit by sending a POST to http://pynchy-server:8484/deploy from HOST.
- Quickly verify the overall health and status using the /status endpoint or, if unavailable, via systemctl, docker, and journalctl queries.
- Diagnose unhealthy or stuck services and perform manual restarts only as a last resort.
- Investigate LiteLLM proxy issues, including failed requests, model routing, spend tracking, and API gateway diagnostics or config changes.
- Consult the LiteLLM UI/dashboard when a user reports proxy errors or model availability issues.
Quick Start
- Step 1: Ensure you can reach pynchy-server (via SSH or Tailscale).
- Step 2: Trigger a deploy: curl -s -X POST http://pynchy-server:8484/deploy
- Step 3: Check status or logs: curl -s http://pynchy-server:8484/status | python3 -m json.tool
Best Practices
- Always check the /status endpoint first for a comprehensive health snapshot.
- Avoid manual container or service restarts; rely on automatic restarts triggered by code or config changes.
- Use SSH to pynchy-server for in-depth checks (systemctl, docker, journalctl) when issues persist.
- Use journalctl to inspect lifecycle events and SQLite data/messages.db to see actual agent activity.
- Edit config.toml or litellm_config.yaml to adjust settings and allow automatic restart rather than manual intervention.
Example Use Cases
- Trigger a deploy from the host after a main branch commit and then verify status remotely via the /status endpoint.
- Check service status and recent logs to confirm a healthy restart or to identify the root cause of a failure.
- Verify that WhatsApp is connected and groups are loaded by inspecting logs and status fields.
- Diagnose LiteLLM proxy health, inspect model counts, and review API gateway diagnostics when requests fail.
- Query the SQLite messages database to review recent agent activity for a specific group.