How do I view all failed services at a glance?

Use systemctl --failed to list failed units, then inspect each one with systemctl status <service-name> or read its logs with journalctl -u <service-name>.
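The two steps in this answer can be chained in a small loop. What follows is a minimal sketch rather than part of the original skill: the function names `failed_unit_names` and `inspect_failed_units` are hypothetical, and the sketch assumes a systemd host where `systemctl --failed --no-legend --plain` prints one unit per line with the unit name in the first column.

```bash
# Hypothetical helper: extract unit names (first column) from
# `systemctl --failed --no-legend --plain` style output on stdin.
failed_unit_names() {
  awk 'NF { print $1 }'
}

# Hypothetical helper: print a short status and the last 20 log lines
# for every failed unit. Requires systemd; some logs may need sudo.
inspect_failed_units() {
  systemctl --failed --no-legend --plain | failed_unit_names |
  while read -r unit; do
    echo "=== $unit ==="
    # `systemctl status` exits non-zero for failed units, so ignore its status.
    systemctl status "$unit" --no-pager --lines=0 || true
    journalctl -u "$unit" -n 20 --no-pager || true
  done
}
```

Calling `inspect_failed_units` with no arguments walks every failed unit; on a healthy host it prints nothing.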

How can I capture outputs for later analysis without losing interactivity?

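Pipe the output through tee, e.g. your_command | tee -a /var/log/sys-health-$(date +%F).log, so it still appears on the terminal while being appended to a file as an audit trail. A plain redirect (>) would hide the output from the terminal, and writing under /var/log usually requires root.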
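For repeated use, the tee pattern can be wrapped in a function. This is only a sketch: `run_logged` is a hypothetical name, and the log path is passed in by the caller so that nothing is written to /var/log unless you choose that location.

```bash
# Hypothetical helper: run a command, show its output on the terminal,
# and append the same output (stdout and stderr) to a log file.
# Usage: run_logged <log-file> <command> [args...]
run_logged() {
  log_file=$1
  shift
  # Record a timestamp and the command line before its output.
  printf '[%s] $ %s\n' "$(date '+%F %T')" "$*" >> "$log_file"
  "$@" 2>&1 | tee -a "$log_file"
}
```

For example, `run_logged "$HOME/sys-health-$(date +%F).log" df -h` prints the disk report and appends it to a dated file. Note that the function's exit status is that of tee, not of the wrapped command.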

What is the recommended way to monitor IO performance over time?

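Use sar for trends: with an interval and count (e.g., sar -u 1 10 for CPU, sar -d 1 10 for disk) it samples live, and without them it reports the history that sysstat has already collected for the current day. Use iostat -x 1 for current IO stats and iotop for per-process IO when needed.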

How do I ensure a service starts on boot after changes?

Use systemctl enable <service-name> to start a service at boot and systemctl disable <service-name> to stop it from doing so; verify with systemctl list-unit-files --type=service or systemctl is-enabled <service-name>.
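Enabling and verifying can be combined so that a failed enable does not go unnoticed. The sketch below is illustrative only; `ensure_enabled` is a hypothetical function name, and it relies on `systemctl is-enabled --quiet` returning success when the unit is enabled.

```bash
# Hypothetical helper: enable a unit at boot and confirm the result.
# Usage: ensure_enabled <unit>   (typically needs root, e.g. via sudo)
ensure_enabled() {
  unit=$1
  systemctl enable "$unit" || return 1
  if systemctl is-enabled --quiet "$unit"; then
    echo "$unit: enabled at boot"
  else
    echo "$unit: NOT enabled at boot" >&2
    return 1
  fi
}
```

Keep in mind that enabling a unit does not start it in the current session; `systemctl enable --now <service-name>` does both.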

system-admin

linux system monitoring admin

npx machina-cli add skill chaterm/terminal-skills/system-admin --openclaw

Linux System Administration

Overview

Core commands and best practices for Linux system administration, covering system information, resource monitoring, and service management.

System Information

Basic Information

# System version
cat /etc/os-release
uname -a

# Hostname
hostnamectl

# Uptime and load
uptime
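When a script needs a single field rather than the whole file, /etc/os-release can be parsed directly because it consists of KEY=value lines. A minimal sketch, assuming values are either unquoted or double-quoted; `os_release_value` is a hypothetical helper name.

```bash
# Hypothetical helper: print one field from an os-release style file.
# Usage: os_release_value <KEY> [file]   (file defaults to /etc/os-release)
os_release_value() {
  # Keep only the line that starts with KEY=, drop the key, strip double quotes.
  sed -n "s/^$1=//p" "${2:-/etc/os-release}" | tr -d '"'
}
```

A one-line host summary could then be built with `echo "$(os_release_value PRETTY_NAME), kernel $(uname -r)"`.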

Hardware Information

# CPU information
lscpu
cat /proc/cpuinfo

# Memory information
free -h
cat /proc/meminfo

# Disk information
lsblk
df -h
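The figures that free -h reports come from /proc/meminfo, so a script can compute its own summary from that file. The sketch below is an illustration, not part of the original skill: `mem_available_pct` is a hypothetical name, and it assumes a kernel that exposes the MemAvailable line.

```bash
# Hypothetical helper: percentage of memory still available, computed from
# the MemTotal and MemAvailable lines of a /proc/meminfo style file.
# Usage: mem_available_pct [file]   (file defaults to /proc/meminfo)
mem_available_pct() {
  awk '/^MemTotal:/     { total = $2 }
       /^MemAvailable:/ { avail = $2 }
       END { if (total > 0) printf "%.1f\n", avail * 100 / total }' \
    "${1:-/proc/meminfo}"
}
```

Running `mem_available_pct` with no argument reads the live file; passing a saved copy makes it possible to compare snapshots taken at different times.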

Resource Monitoring

Real-time Monitoring

# Comprehensive monitoring
top
htop

# Memory monitoring
vmstat 1

# IO monitoring
iostat -x 1
iotop

# Network monitoring
iftop
nethogs

Historical Data

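# System activity report (sysstat): sample every 1 second, 10 times
sar -u 1 10    # CPU
sar -r 1 10    # Memory
sar -d 1 10    # Disk

# Without interval and count, sar reports today's collected history
sar -u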
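To use sar output in a script, the summary line is usually enough. This sketch assumes the default sysstat layout for `sar -u`, in which the "Average:" line for CPU "all" ends with the %idle column; `sar_avg_idle` is a hypothetical helper name.

```bash
# Hypothetical helper: print the average %idle from `sar -u` output on stdin.
# Assumes the "Average:" line for CPU "all" ends with the %idle column.
sar_avg_idle() {
  awk '$1 == "Average:" && $2 == "all" { print $NF }'
}
```

A typical call would be `LC_ALL=C sar -u 1 5 | sar_avg_idle`, where LC_ALL=C keeps the "Average:" label in English. The same filter works on history read with `sar -u -f <data-file>`; the data file location varies by distribution.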

Service Management

Systemd Services

# Service status
systemctl status service-name
systemctl is-active service-name

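# Start, stop, or restart a service
systemctl start service-name
systemctl stop service-name
systemctl restart service-name

# Enable or disable start at boot
systemctl enable service-name
systemctl disable service-name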

# View all services
systemctl list-units --type=service

Common Scenarios

Scenario 1: System Health Check

# Quick health check script
echo "=== System Load ===" && uptime
echo "=== Memory Usage ===" && free -h
echo "=== Disk Usage ===" && df -h
echo "=== Failed Services ===" && systemctl --failed
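The disk step of the health check can be reduced to the mounts that actually need attention. The following is a sketch under stated assumptions: `disks_over` is a hypothetical name, it reads the POSIX-format output of `df -P`, and it does not handle mount points that contain spaces.

```bash
# Hypothetical helper: list mount points at or above a usage threshold.
# Reads `df -P` output on stdin. Usage: df -P | disks_over <percent>
disks_over() {
  awk -v limit="$1" 'NR > 1 {
    use = $5                 # Capacity column, e.g. "91%"
    sub(/%/, "", use)
    if (use + 0 >= limit + 0) print $6, $5
  }'
}
```

For instance, `df -P | disks_over 90` prints nothing on a host with free space and one line per mount point once usage reaches 90%.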

Scenario 2: Troubleshoot High Load

# 1. Check load
uptime

# 2. Find high CPU processes
ps aux --sort=-%cpu | head -10

# 3. Find high memory processes
ps aux --sort=-%mem | head -10
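A load average only means something relative to the number of CPUs, so step 1 is easier to judge when normalized. A rough sketch, assuming a Linux host with /proc/loadavg and the coreutils `nproc` command; `load_per_cpu` is a hypothetical helper name.

```bash
# Hypothetical helper: 1-minute load average divided by the CPU count.
# Usage: load_per_cpu [cpu-count] [loadavg-file]
# Defaults: cpu-count from `nproc`, file /proc/loadavg.
load_per_cpu() {
  cpus=${1:-$(nproc)}
  awk -v cpus="$cpus" '{ printf "%.2f\n", $1 / cpus }' "${2:-/proc/loadavg}"
}
```

A result that stays well above 1.00 suggests more runnable or IO-blocked tasks than CPUs, which is a reasonable cue to move on to the per-process listings in steps 2 and 3.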

Troubleshooting

| Problem | Commands |
| --- | --- |
| System lag | `top`, `vmstat 1`, `iostat -x 1` |
| Disk full | `df -h`, `du -sh /*`, `ncdu` |
| Memory shortage | `free -h`, `ps aux --sort=-%mem` |
| Service failing | `systemctl status service-name`, `journalctl -u service-name` |
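For the disk-full case, sorting the `du` output makes the largest entries stand out immediately. This is a small sketch with a hypothetical function name, `largest_entries`; it sizes only the non-hidden entries directly under the given directory and silently skips paths it cannot read.

```bash
# Hypothetical helper: largest entries directly under a directory, in KiB.
# Usage: largest_entries <dir> [count]   (count defaults to 10)
largest_entries() {
  du -sk "$1"/* 2>/dev/null | sort -rn | head -n "${2:-10}"
}
```

Starting with `largest_entries /var 5` and repeating on the biggest result narrows down the culprit in a few steps; run it with sudo if unreadable directories matter.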

Source

git clone https://github.com/chaterm/terminal-skills
Skill file: https://github.com/chaterm/terminal-skills/blob/main/linux/system-admin/SKILL.md

Overview

system-admin provides practical workflows for Linux system administration, including viewing basic OS and host information, inspecting hardware, real-time and historical resource monitoring, and managing services via systemd. You use these repeatable patterns to verify server status, troubleshoot performance issues, and ensure correct startup behavior of services.

How This Skill Works

You leverage a curated suite of native Linux tools described in the skill content to gather state and enforce operational checks. You’ll typically run commands like cat /etc/os-release, uname -a, and uptime to establish a quick health baseline, then drill into hardware with lscpu and /proc reads, monitor in real time with top/htop, vmstat, iostat, iotop, iftop, and nethogs, and finally inspect historical trends with sar. For service management, you use systemctl to query, start/stop/restart, and enable/disable units, complemented by journalctl for per-service logs.

When to Use It

After provisioning a new Linux host to verify baseline (OS, host, hardware)
During performance troubleshooting (high load, IO wait, memory pressure)
When investigating a failing or misbehaving systemd service
For capacity planning and resource auditing using real-time and historical data
For post-change validation after patches or configuration updates

Quick Start

1) Run a quick health snapshot to establish baseline:
```bash
echo "=== System Load ===" && uptime
echo "=== Memory Usage ===" && free -h
echo "=== Disk Usage ===" && df -h
echo "=== Failed Services ===" && systemctl --failed
```
2) Inspect OS and host identity:
```bash
cat /etc/os-release
uname -a
hostnamectl
```
3) Check hardware and storage layout:
```bash
lscpu
free -h
lsblk
df -h
```
4) Start real-time monitoring and service checks:
```bash
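top                      # interactive; press q to quit
vmstat 1                 # prints every second until Ctrl+C
iostat -x 1              # prints every second until Ctrl+C
systemctl status nginx   # nginx is an example unit name
journalctl -u nginx -n 20 --no-pager   # last 20 log lines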
```

Best Practices

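Use sudo only where it is needed: starting, stopping, enabling, or disabling units with systemctl requires root, and iotop typically does too, while read-only commands such as systemctl status, free, or reading /proc/cpuinfo do not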
Prefer non-intrusive monitoring on production (avoid long-running heavy captures)
Capture outputs to logs for auditability (e.g., using tee to write to a file)
Use sar for historical data instead of relying solely on ad-hoc snapshots
Combine complementary commands (e.g., systemctl status with journalctl -u) to correlate service state with logs

Example Use Cases

Scenario: Post-Provisioning Sanity Check — you verify the fresh server baseline with cat /etc/os-release, uname -a, hostnamectl, and baseline hardware view via lscpu and lsblk before deploying workloads.
Scenario: Troubleshoot High CPU / Load — you identify top CPU consumers with ps aux --sort=-%cpu | head -10 and cross-check with uptime, then isolate memory pressure via ps aux --sort=-%mem | head -10 and free -h.
Scenario: Disk Space Shortage — you check disk usage with df -h, then drill into large directories with du -sh /*, and optionally ncdu for interactive exploration.
Scenario: Service Misbehavior — you verify service status via systemctl status, check if it's enabled on boot, and inspect recent logs with journalctl -u <service-name> --since today.
Scenario: IO and Network Bottlenecks — you monitor IO with iostat -x 1 and iotop, and examine network usage with iftop or nethogs to identify bandwidth-heavy processes.

Related Skills

network-tools

chaterm/terminal-skills

Linux network tools and diagnostics

shell-scripting

chaterm/terminal-skills

Bash shell scripting

monitoring

chaterm/terminal-skills

Monitoring and alerting

file-operations

chaterm/terminal-skills

Linux file and directory operations

process-management

chaterm/terminal-skills

Linux process management and control

user-permissions

chaterm/terminal-skills

Linux user and permission management