encoding-handler
npx machina-cli add skill a5c-ai/babysitter/encoding-handler --openclaw
Encoding Handler
Handle text encoding across platforms.
Capabilities
- Detect file encoding
- Convert between encodings
- Handle BOM markers
- Configure Windows codepage support
- Normalize text encoding
- Handle encoding errors
Generated Patterns

```typescript
import { Buffer } from 'buffer';
import iconv from 'iconv-lite';

// Identify the encoding from a leading byte-order mark, if present.
export function detectBOM(buffer: Buffer): string | null {
  if (buffer[0] === 0xEF && buffer[1] === 0xBB && buffer[2] === 0xBF) return 'utf-8';
  if (buffer[0] === 0xFF && buffer[1] === 0xFE) return 'utf-16le';
  if (buffer[0] === 0xFE && buffer[1] === 0xFF) return 'utf-16be';
  return null;
}

// Remove a decoded BOM (U+FEFF) from the start of a string.
export function stripBOM(content: string): string {
  return content.charCodeAt(0) === 0xFEFF ? content.slice(1) : content;
}

// Decode a buffer, preferring the BOM-detected encoding over the caller's default.
export function decodeBuffer(buffer: Buffer, encoding = 'utf-8'): string {
  const bom = detectBOM(buffer);
  if (bom) {
    // The BOM bytes decode to U+FEFF, so strip it from the resulting string.
    return stripBOM(iconv.decode(buffer, bom));
  }
  return iconv.decode(buffer, encoding);
}

// Encode a string, optionally prepending a UTF-8 BOM for consumers that expect one.
export function encodeString(content: string, encoding = 'utf-8', addBOM = false): Buffer {
  const encoded = iconv.encode(content, encoding);
  if (addBOM && encoding.toLowerCase() === 'utf-8') {
    return Buffer.concat([Buffer.from([0xEF, 0xBB, 0xBF]), encoded]);
  }
  return encoded;
}
```
Target Processes
- cross-platform-cli-compatibility
- cli-output-formatting
- configuration-management-system
Source
https://github.com/a5c-ai/babysitter/blob/main/plugins/babysitter/skills/babysit/process/specializations/cli-mcp-development/skills/encoding-handler/SKILL.md
Overview
Encoding Handler standardizes text across platforms by detecting encoding, converting between encodings, and managing BOM markers. It supports Windows codepages, UTF-8, and BOM handling, ensuring consistent text in pipelines and configurations. It also normalizes text and gracefully handles encoding errors.
How This Skill Works
The skill uses detectBOM to identify BOM markers, optionally strips them with stripBOM, and decodes buffers via decodeBuffer using iconv-lite. When writing, encodeString applies the target encoding and can prepend a UTF-8 BOM if addBOM is true. These primitives enable reliable conversion and normalization across environments.
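The BOM-detection step can be illustrated without iconv-lite at all. The sketch below, using a hypothetical `readUtf8WithoutBOM` helper built only on Node's `Buffer`, mirrors what detectBOM and stripBOM do for the UTF-8 case:

```typescript
import { Buffer } from 'buffer';

// Minimal sketch of the BOM flow using only Node built-ins:
// detect a UTF-8 BOM on the raw bytes, slice it off, then decode.
function readUtf8WithoutBOM(buffer: Buffer): string {
  const hasBOM =
    buffer.length >= 3 &&
    buffer[0] === 0xef && buffer[1] === 0xbb && buffer[2] === 0xbf;
  // Removing the BOM bytes before decoding avoids a stray U+FEFF
  // at the start of the resulting string.
  return (hasBOM ? buffer.subarray(3) : buffer).toString('utf-8');
}

const withBOM = Buffer.from([0xef, 0xbb, 0xbf, 0x68, 0x69]); // BOM + "hi"
console.log(readUtf8WithoutBOM(withBOM)); // "hi"
```

Stripping at the byte level (before decoding) and at the string level (after decoding, as stripBOM does) are equivalent for UTF-8; the skill strips after decoding so the same helper works for UTF-16 variants too.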
When to Use It
- Ingesting legacy files written in Windows codepages (e.g., CP1252) and converting to UTF-8.
- Normalizing project assets across macOS, Linux, and Windows in a CI pipeline.
- Reading and writing CLI output to ensure consistent encoding across platforms.
- Migrating configuration data with mixed encodings to a UTF-8-based system.
- Handling BOM markers to prevent hidden characters in text processing.
Quick Start
- Step 1: Read the file as a Buffer and detect BOM with detectBOM(buffer).
- Step 2: Decode to string using decodeBuffer(buffer) or decodeBuffer(buffer, 'cp1252') as needed.
- Step 3: Write back using encodeString(text, 'utf-8', true/false) depending on whether a BOM is required.
Best Practices
- Detect BOMs early with detectBOM and stripBOM when needed to avoid stray characters.
- Use decodeBuffer with a sensible default but allow BOM-based auto-detection when possible.
- Only add a BOM on output if a consumer explicitly requires it (via addBOM).
- Normalize all text through a single target encoding (e.g., UTF-8) in pipelines.
- Validate decoded text and handle encoding errors with fallbacks and logging.
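The last practice, handling encoding errors with fallbacks, can be sketched with Node's `TextDecoder` in strict mode. The `decodeWithFallback` helper below is illustrative, not part of the skill; the skill itself would route the fallback through iconv-lite:

```typescript
import { Buffer } from 'buffer';
import { TextDecoder } from 'util';

// Try strict UTF-8 first; fall back to latin1 when the bytes are not valid UTF-8.
function decodeWithFallback(buffer: Buffer): { text: string; encoding: string } {
  try {
    // fatal: true makes TextDecoder throw on malformed UTF-8 instead of
    // silently inserting U+FFFD replacement characters.
    const text = new TextDecoder('utf-8', { fatal: true }).decode(buffer);
    return { text, encoding: 'utf-8' };
  } catch {
    // latin1 maps every byte to a code point, so it never fails;
    // log the fallback in real pipelines so bad inputs are visible.
    return { text: buffer.toString('latin1'), encoding: 'latin1' };
  }
}

console.log(decodeWithFallback(Buffer.from('héllo', 'utf-8')).encoding); // "utf-8"
console.log(decodeWithFallback(Buffer.from([0xe9]))); // lone 0xE9 is invalid UTF-8, falls back to latin1 "é"
```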
Example Use Cases
- A script reads a Windows CP1252 config file, decodes with decodeBuffer(buffer, 'cp1252'), and saves as UTF-8 without BOM.
- A log processor ingests server logs from Windows and Linux, detects BOM when present, and decodes to a unified UTF-8 string.
- A CI pipeline converts all source assets to UTF-8 and emits BOM-free outputs for consistency.
- An export tool writes UTF-8 with BOM to satisfy Windows applications expecting BOM.
- A text-processing service stores normalized UTF-8 in a database after decoding mixed-encoding user input.