npx machina-cli add skill aiskillstore/marketplace/vision --openclaw
You are a Vision Analyst specialized in interpreting visual content.
Focus
- Describe visible UI elements, text, errors, code, layout, and diagrams.
- Extract any legible text accurately, preserving formatting when relevant.
- Note uncertainty or low-confidence readings.
Output
- Provide concise, actionable observations.
- Call out anything that looks broken, inconsistent, or suspicious.
Source
git clone https://github.com/aiskillstore/marketplace/blob/main/skills/0xsero/vision/SKILL.md
Overview
This Vision skill interprets images, screenshots, diagrams, and other visual content, describing the UI elements, text, errors, code, and layout they contain. Its output is a set of concise observations that prioritizes accurate text extraction and issue detection.
How This Skill Works
The model processes the image to identify UI components, extract legible text, and interpret layouts and diagrams. It then returns concise observations and flags low-confidence readings for review.
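The flagging step described above can be pictured as a simple filter over structured observations. The following is a minimal sketch, not the skill's actual implementation: the `Observation` type, its fields, and the 0.6 threshold are all hypothetical, since the skill itself does not define a numeric confidence scale.

```python
from dataclasses import dataclass

# Hypothetical cutoff; the skill does not specify a numeric threshold.
LOW_CONFIDENCE_THRESHOLD = 0.6


@dataclass
class Observation:
    kind: str          # e.g. "ui_element", "text", "error", "diagram"
    content: str       # what was seen, or the extracted text
    confidence: float  # 0.0 to 1.0, assumed to come from the analysis step


def flag_for_review(observations, threshold=LOW_CONFIDENCE_THRESHOLD):
    """Split observations into (confident, needs_review) lists."""
    confident = [o for o in observations if o.confidence >= threshold]
    needs_review = [o for o in observations if o.confidence < threshold]
    return confident, needs_review


# Illustrative data only: one clear reading and one blurry one.
observations = [
    Observation("error", "TypeError: undefined is not a function", 0.95),
    Observation("text", "Build #4l7 failed", 0.40),
]
confident, needs_review = flag_for_review(observations)
```

Keeping low-confidence readings in a separate list, rather than discarding them, matches the skill's instruction to note uncertainty instead of hiding it.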
When to Use It
- When you need to understand an error or notification shown in a screenshot.
- When reviewing UI mockups or app screens to document components and layout.
- When analyzing architecture or flow diagrams to identify connections.
- When extracting instructions or code snippets from visual content.
- When spotting UI inconsistencies, broken visuals, or misalignments.
Quick Start
- Step 1: Upload or provide the image, screenshot, or diagram.
- Step 2: The skill analyzes UI elements, text, and layout, and extracts legible text.
- Step 3: Review the concise observations and export notes or annotations.
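How an image is "provided" in Step 1 depends on the host application, which this page does not specify. One common approach, shown here purely as an assumption, is to pass the image inline as a base64 data URL. The helper name `to_data_url` is hypothetical; it uses only the Python standard library.

```python
import base64
import mimetypes


def to_data_url(filename, data):
    """Encode raw image bytes as a base64 data URL.

    The MIME type is guessed from the file extension; non-image
    files are rejected.
    """
    mime_type, _ = mimetypes.guess_type(filename)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Not a recognized image type: {filename}")
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# Illustrative bytes only (the first four bytes of a PNG header).
# In practice the bytes would come from reading the screenshot file.
example = to_data_url("screenshot.png", b"\x89PNG")
```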
Best Practices
- Describe all visible elements and text with precise labels.
- Preserve legible text formatting where relevant.
- Clearly indicate uncertainty and confidence levels.
- Call out broken, inconsistent, or suspicious visuals.
- Provide actionable follow-ups (reproduction steps, fixes, or annotations).
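As an illustration of the practices above, a report can label each element, mark uncertain readings, and attach a follow-up where one exists. This is a sketch under assumed conventions: the dictionary keys (`label`, `detail`, `uncertain`, `follow_up`) and the output layout are hypothetical, not a format the skill prescribes.

```python
def render_report(findings):
    """Format findings as plain-text bullets.

    Each finding is a dict with the hypothetical keys 'label' and
    'detail', plus optional 'uncertain' (bool) and 'follow_up' (str).
    """
    lines = []
    for finding in findings:
        marker = " (uncertain)" if finding.get("uncertain") else ""
        lines.append(f"- {finding['label']}: {finding['detail']}{marker}")
        if finding.get("follow_up"):
            lines.append(f"  Follow-up: {finding['follow_up']}")
    return "\n".join(lines)


# Illustrative findings only.
report = render_report([
    {
        "label": "Submit button",
        "detail": "overlaps the footer",
        "follow_up": "check the layout at narrow viewport widths",
    },
    {
        "label": "Error banner",
        "detail": "text partly cropped, reads 'Sess... expired'",
        "uncertain": True,
    },
])
print(report)
```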
Example Use Cases
- Extracts an error message from a crash screenshot.
- Documents a login screen's UI components and interactions.
- Interprets an architecture diagram to identify services and data paths.
- Pulls URLs and code blocks from a developer screenshot.
- Notes misaligned buttons and icon inconsistencies in a UI mockup.