encoding-handler
npx machina-cli add skill a5c-ai/babysitter/encoding-handler --openclaw
Encoding Handler
Handle text encoding across platforms.
Capabilities
- Detect file encoding
- Convert between encodings
- Handle BOM markers
- Configure Windows codepage support
- Normalize text encoding
- Handle encoding errors
Generated Patterns

```typescript
import { Buffer } from 'buffer';
import iconv from 'iconv-lite';

// Identify the encoding from a leading byte-order mark, if present.
export function detectBOM(buffer: Buffer): string | null {
  if (buffer[0] === 0xEF && buffer[1] === 0xBB && buffer[2] === 0xBF) return 'utf-8';
  if (buffer[0] === 0xFF && buffer[1] === 0xFE) return 'utf-16le';
  if (buffer[0] === 0xFE && buffer[1] === 0xFF) return 'utf-16be';
  return null;
}

// Remove a decoded BOM (U+FEFF) from the start of a string.
export function stripBOM(content: string): string {
  return content.charCodeAt(0) === 0xFEFF ? content.slice(1) : content;
}

// Decode a buffer, preferring the BOM-detected encoding over the caller's default.
export function decodeBuffer(buffer: Buffer, encoding = 'utf-8'): string {
  const bom = detectBOM(buffer);
  if (bom) {
    // The BOM bytes decode to U+FEFF, so strip it from the resulting string.
    return stripBOM(iconv.decode(buffer, bom));
  }
  return iconv.decode(buffer, encoding);
}

// Encode a string, optionally prepending a UTF-8 BOM for consumers that expect one.
export function encodeString(content: string, encoding = 'utf-8', addBOM = false): Buffer {
  const encoded = iconv.encode(content, encoding);
  if (addBOM && encoding.toLowerCase() === 'utf-8') {
    return Buffer.concat([Buffer.from([0xEF, 0xBB, 0xBF]), encoded]);
  }
  return encoded;
}
```
Target Processes
- cross-platform-cli-compatibility
- cli-output-formatting
- configuration-management-system
Source
https://github.com/a5c-ai/babysitter/blob/main/plugins/babysitter/skills/babysit/process/specializations/cli-mcp-development/skills/encoding-handler/SKILL.md
Overview
Encoding Handler standardizes text across platforms by detecting encoding, converting between encodings, and managing BOM markers. It supports Windows codepages, UTF-8, and BOM handling, ensuring consistent text in pipelines and configurations. It also normalizes text and gracefully handles encoding errors.
How This Skill Works
The skill uses detectBOM to identify BOM markers, optionally strips them with stripBOM, and decodes buffers via decodeBuffer using iconv-lite. When writing, encodeString applies the target encoding and can prepend a UTF-8 BOM if addBOM is true. These primitives enable reliable conversion and normalization across environments.
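The BOM-detection step can be illustrated without iconv-lite at all. The sketch below, using a hypothetical `readUtf8WithoutBOM` helper built only on Node's `Buffer`, mirrors what detectBOM and stripBOM do for the UTF-8 case:

```typescript
import { Buffer } from 'buffer';

// Minimal sketch of the BOM flow using only Node built-ins:
// detect a UTF-8 BOM on the raw bytes, slice it off, then decode.
function readUtf8WithoutBOM(buffer: Buffer): string {
  const hasBOM =
    buffer.length >= 3 &&
    buffer[0] === 0xef && buffer[1] === 0xbb && buffer[2] === 0xbf;
  // Removing the BOM bytes before decoding avoids a stray U+FEFF
  // at the start of the resulting string.
  return (hasBOM ? buffer.subarray(3) : buffer).toString('utf-8');
}

const withBOM = Buffer.from([0xef, 0xbb, 0xbf, 0x68, 0x69]); // BOM + "hi"
console.log(readUtf8WithoutBOM(withBOM)); // "hi"
```

Stripping at the byte level (before decoding) and at the string level (after decoding, as stripBOM does) are equivalent for UTF-8; the skill strips after decoding so the same helper works for UTF-16 variants too.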
When to Use It
- Ingesting legacy files written in Windows codepages (e.g., CP1252) and converting to UTF-8.
- Normalizing project assets across macOS, Linux, and Windows in a CI pipeline.
- Reading and writing CLI output to ensure consistent encoding across platforms.
- Migrating configuration data with mixed encodings to a UTF-8-based system.
- Handling BOM markers to prevent hidden characters in text processing.
Quick Start
- Step 1: Read the file as a Buffer and detect BOM with detectBOM(buffer).
- Step 2: Decode to string using decodeBuffer(buffer) or decodeBuffer(buffer, 'cp1252') as needed.
- Step 3: Write back using encodeString(text, 'utf-8', true/false) depending on whether a BOM is required.
Best Practices
- Detect BOMs early with detectBOM and stripBOM when needed to avoid stray characters.
- Use decodeBuffer with a sensible default but allow BOM-based auto-detection when possible.
- Only add a BOM on output if a consumer explicitly requires it (via addBOM).
- Normalize all text through a single target encoding (e.g., UTF-8) in pipelines.
- Validate decoded text and handle encoding errors with fallbacks and logging.
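The last practice, handling encoding errors with fallbacks, can be sketched with Node's `TextDecoder` in strict mode. The `decodeWithFallback` helper below is illustrative, not part of the skill; the skill itself would route the fallback through iconv-lite:

```typescript
import { Buffer } from 'buffer';
import { TextDecoder } from 'util';

// Try strict UTF-8 first; fall back to latin1 when the bytes are not valid UTF-8.
function decodeWithFallback(buffer: Buffer): { text: string; encoding: string } {
  try {
    // fatal: true makes TextDecoder throw on malformed UTF-8 instead of
    // silently inserting U+FFFD replacement characters.
    const text = new TextDecoder('utf-8', { fatal: true }).decode(buffer);
    return { text, encoding: 'utf-8' };
  } catch {
    // latin1 maps every byte to a code point, so it never fails;
    // log the fallback in real pipelines so bad inputs are visible.
    return { text: buffer.toString('latin1'), encoding: 'latin1' };
  }
}

console.log(decodeWithFallback(Buffer.from('héllo', 'utf-8')).encoding); // "utf-8"
console.log(decodeWithFallback(Buffer.from([0xe9]))); // lone 0xE9 is invalid UTF-8, falls back to latin1 "é"
```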
Example Use Cases
- A script reads a Windows CP1252 config file, decodes with decodeBuffer(buffer, 'cp1252'), and saves as UTF-8 without BOM.
- A log processor ingests server logs from Windows and Linux, detects BOM when present, and decodes to a unified UTF-8 string.
- A CI pipeline converts all source assets to UTF-8 and emits BOM-free outputs for consistency.
- An export tool writes UTF-8 with BOM to satisfy Windows applications expecting BOM.
- A text-processing service stores normalized UTF-8 in a database after decoding mixed-encoding user input.