npx machina-cli add skill aiskillstore/marketplace/vision --openclaw
Files (1)
SKILL.md
939 B

You are a Vision Analyst specialized in interpreting visual content.

Focus

  • Describe visible UI elements, text, errors, code, layout, and diagrams.
  • Extract any legible text accurately, preserving formatting when relevant.
  • Note uncertainty or low-confidence readings.

Output

  • Provide concise, actionable observations.
  • Call out anything that looks broken, inconsistent, or suspicious.

Source

git clone https://github.com/aiskillstore/marketplace
SKILL.md: https://github.com/aiskillstore/marketplace/blob/main/skills/0xsero/vision/SKILL.md

Overview

This Vision skill interprets images, screenshots, and diagrams, describing the UI elements, text, errors, code, and layout they contain. It returns concise observations, transcribes legible text accurately, and calls out anything that looks broken, inconsistent, or uncertain.

How This Skill Works

The model processes the image to identify UI components, extract legible text, and interpret layouts and diagrams. It then returns concise observations and flags low-confidence readings for review.
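One way to picture the flagging step described above is as a list of observations, each carrying a confidence score, with low scores set aside for review. The sketch below is only an illustration: the `Observation` type, its fields, the sample readings, and the 0.6 threshold are all assumptions made here, not part of the skill itself.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """Hypothetical record for one thing read from an image."""
    kind: str          # e.g. "text", "ui_element", "error"
    content: str       # what was seen or transcribed
    confidence: float  # 0.0 to 1.0, assumed to be reported by the model


def flag_low_confidence(observations, threshold=0.6):
    """Split observations into (confident, needs_review) lists.

    The threshold is an arbitrary illustrative value.
    """
    confident = [o for o in observations if o.confidence >= threshold]
    needs_review = [o for o in observations if o.confidence < threshold]
    return confident, needs_review


# Made-up sample readings from an imagined crash screenshot.
observations = [
    Observation("error", "TypeError: undefined is not a function", 0.95),
    Observation("text", "Build #4?17 failed", 0.40),  # one digit illegible
    Observation("ui_element", "Retry button, top right", 0.80),
]

confident, needs_review = flag_low_confidence(observations)
for o in needs_review:
    print(f"REVIEW [{o.kind}] {o.content} (confidence {o.confidence:.2f})")
```

Keeping the uncertain readings in a separate list, rather than dropping them, matches the skill's instruction to note low-confidence readings instead of hiding them.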

When to Use It

  • When you need to understand an error or notification shown in a screenshot.
  • When reviewing UI mockups or app screens to document components and layout.
  • When analyzing architecture or flow diagrams to identify connections.
  • When extracting instructions or code snippets from visual content.
  • When spotting UI inconsistencies, broken visuals, or misalignments.

Quick Start

  1. Upload or provide the image, screenshot, or diagram.
  2. The skill analyzes UI elements, text, and layout, and extracts legible text.
  3. Review the concise observations and export notes or annotations.
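For the first step, an image usually has to be encoded before it can be handed to a model alongside the skill's instructions. The sketch below shows one possible way to bundle the two using only the Python standard library. The `build_request` helper and the shape of the dictionary it returns are hypothetical, not an OpenClaw or machina-cli format. Adapt the structure to whatever your agent runtime actually expects.

```python
import base64
import json

# First line of the skill's SKILL.md, used here as the system prompt.
SKILL_PROMPT = "You are a Vision Analyst specialized in interpreting visual content."


def build_request(image_bytes, media_type="image/png",
                  question="Describe this screenshot."):
    """Bundle the skill prompt, a question, and a base64-encoded image.

    The returned dictionary layout is a hypothetical example only.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "system": SKILL_PROMPT,
        "question": question,
        "image": {"media_type": media_type, "data": encoded},
    }


# Stand-in bytes (the PNG file signature); in practice, read a real file:
#   with open("screenshot.png", "rb") as f: image_bytes = f.read()
fake_png = b"\x89PNG\r\n\x1a\n"

request = build_request(fake_png)
print(json.dumps(request, indent=2))
```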

Best Practices

  • Describe all visible elements and text with precise labels.
  • Preserve legible text formatting where relevant.
  • Clearly indicate uncertainty and confidence levels.
  • Call out broken, inconsistent, or suspicious visuals.
  • Provide actionable follow-ups (reproduction steps, fixes, or annotations).

Example Use Cases

  • Extracts an error message from a crash screenshot.
  • Documents a login screen's UI components and interactions.
  • Interprets an architecture diagram to identify services and data paths.
  • Pulls URLs and code blocks from a developer screenshot.
  • Notes misaligned buttons and icon inconsistencies in a UI mockup.
