observe
npx machina-cli add skill Crawlio-app/crawlio-plugin/observe --openclaw
Query Crawlio's observation log — the append-only timeline of everything Crawlio observed during a crawl session.
When to Use
Use this skill when the user wants to:
- See what happened during a crawl
- Review extension captures (framework detection, network requests, console logs)
- Reconstruct a timeline of events
- Find specific observations by host, source, or time range
Quick Reference
Get Recent Observations
get_observations({ limit: 20 })
Filter by Host
get_observations({ host: "example.com", limit: 50 })
Filter by Source
| Source | What It Captures |
|---|---|
| extension | Chrome extension enrichment (framework, network, console, DOM) |
| engine | Crawl lifecycle events (crawl_start, crawl_done) |
| webkit | WebKit runtime capture |
| agent | AI-created findings |
get_observations({ source: "extension", limit: 30 })
Filter by Operation
| Op | Meaning |
|---|---|
| observe | Raw data capture |
| finding | Agent-created insight |
| crawl_start | Crawl began |
| crawl_done | Crawl completed |
| page | Single page observation |
get_observations({ op: "crawl_done" })
Time-Based Query
Pass a Unix timestamp as since to return observations from that time onward (the example value is in seconds):
get_observations({ since: 1708444200, limit: 100 })
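Returned entries carry ts as an ISO 8601 string, while since takes a Unix timestamp, so a small conversion helper can be useful. The sketch below is illustrative only: `toUnixSeconds` is a hypothetical name, not part of Crawlio, and it assumes since expects seconds, as the example value suggests.

```typescript
// Hypothetical helper (not part of Crawlio): convert an ISO 8601 string
// to Unix seconds, the unit the example `since` value appears to use.
function toUnixSeconds(iso: string): number {
  const ms = Date.parse(iso);
  if (isNaN(ms)) {
    throw new Error(`Invalid ISO 8601 timestamp: ${iso}`);
  }
  return Math.floor(ms / 1000);
}

// 2024-02-20T15:50:00Z corresponds to the value used in the example above.
const sinceSeconds = toUnixSeconds("2024-02-20T15:50:00Z");
console.log(sinceSeconds); // 1708444200
```

The resulting number can then be passed as the since filter.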
Combine Filters
get_observations({
  host: "example.com",
  source: "extension",
  op: "observe",
  limit: 50
})
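To make the effect of combining filters concrete, here is a minimal in-memory sketch. It assumes that every filter that is set must match (logical AND), that host is compared against the URL's hostname, and that limit is applied last. The `ObservationFilter` and `Entry` types and both functions are hypothetical and do not describe Crawlio's actual implementation.

```typescript
// Illustrative only: field names mirror the filters shown above, but the
// types and the matching logic are assumptions, not Crawlio's implementation.
interface ObservationFilter {
  host?: string;
  source?: string;
  op?: string;
  since?: number; // Unix seconds (assumed)
  limit?: number;
}

interface Entry {
  op: string;
  ts: string; // ISO 8601
  url: string;
  source: string;
}

// Returns true when the entry satisfies every filter that is set (logical AND).
function matches(entry: Entry, f: ObservationFilter): boolean {
  if (f.host !== undefined && new URL(entry.url).hostname !== f.host) return false;
  if (f.source !== undefined && entry.source !== f.source) return false;
  if (f.op !== undefined && entry.op !== f.op) return false;
  if (f.since !== undefined && Date.parse(entry.ts) / 1000 < f.since) return false;
  return true;
}

// Applies the filters first, then truncates to `limit` if one is given.
function applyFilter(entries: Entry[], f: ObservationFilter): Entry[] {
  const hits = entries.filter((e) => matches(e, f));
  return f.limit === undefined ? hits : hits.slice(0, f.limit);
}

// Made-up sample data for illustration.
const sample: Entry[] = [
  { op: "observe", ts: "2024-02-20T15:50:00Z", url: "https://example.com/", source: "extension" },
  { op: "crawl_done", ts: "2024-02-20T15:55:00Z", url: "https://example.com/", source: "engine" },
  { op: "observe", ts: "2024-02-20T15:51:00Z", url: "https://other.test/", source: "extension" },
];

const narrowed = applyFilter(sample, { host: "example.com", source: "extension", op: "observe", limit: 50 });
console.log(narrowed.length); // 1
```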
Reading Observations
Each observation entry contains:
- id: unique identifier (obs_ prefix for observations, fnd_ prefix for findings)
- op: what type of event this is
- ts: ISO 8601 timestamp
- url: the URL this relates to
- source: what produced this entry
- data: composite payload (framework detection, network requests, console logs, progress, etc.)
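The field list above could be modeled roughly as follows. This is a sketch under assumptions: the concrete types, the open-ended shape of data, and the `isFinding` helper are all hypothetical; only the field names and id prefixes come from the text.

```typescript
// Sketch of an entry's shape based on the field list above. The concrete
// types (and the open-ended `data` record) are assumptions.
interface Observation {
  id: string; // "obs_..." for observations, "fnd_..." for findings
  op: string;
  ts: string; // ISO 8601
  url: string;
  source: string;
  data: Record<string, unknown>;
}

// Hypothetical helper: tell findings from observations by id prefix.
function isFinding(entry: Observation): boolean {
  return /^fnd_/.test(entry.id);
}

// Made-up entry for illustration.
const exampleEntry: Observation = {
  id: "obs_001",
  op: "observe",
  ts: "2024-02-20T15:50:00Z",
  url: "https://example.com/",
  source: "extension",
  data: {},
};

console.log(isFinding(exampleEntry)); // false
```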
Common Patterns
Timeline Reconstruction
Query by host with no limit to see the full story of a crawl:
get_observations({ host: "example.com" })
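Once the entries are in hand, they can be ordered and printed as a timeline. The text does not say in what order results are returned, so this sketch sorts by ts defensively; `toTimeline` and `formatLine` are hypothetical helper names.

```typescript
// Minimal subset of an entry needed to build a timeline (assumed shape).
interface TimedEntry {
  ts: string; // ISO 8601
  op: string;
  url: string;
}

// Return a new array ordered oldest-first by parsed timestamp.
function toTimeline<T extends TimedEntry>(entries: T[]): T[] {
  return entries.slice().sort((a, b) => Date.parse(a.ts) - Date.parse(b.ts));
}

// One human-readable line per entry.
function formatLine(e: TimedEntry): string {
  return `${e.ts}  ${e.op}  ${e.url}`;
}

// Made-up entries, deliberately out of order.
const unordered: TimedEntry[] = [
  { ts: "2024-02-20T15:55:00Z", op: "crawl_done", url: "https://example.com/" },
  { ts: "2024-02-20T15:50:00Z", op: "crawl_start", url: "https://example.com/" },
  { ts: "2024-02-20T15:52:00Z", op: "page", url: "https://example.com/about" },
];

const timeline = toTimeline(unordered);
timeline.forEach((e) => console.log(formatLine(e)));
```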
Crawl Summary
Get start and end events to see crawl performance:
get_observations({ op: "crawl_start" })
get_observations({ op: "crawl_done" })
The crawl_done entry includes progress data (totalDiscovered, downloaded, failed).
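A crawl summary could be derived from the two lifecycle entries roughly like this. The progress field names come from the text above, but their nesting under `data.progress` is a guess, and `summarizeCrawl` is a hypothetical helper; adjust both to the real payload.

```typescript
// Assumed shapes: field names are from the text, nesting under
// `data.progress` is a guess.
interface Progress {
  totalDiscovered: number;
  downloaded: number;
  failed: number;
}

interface LifecycleEntry {
  op: string;
  ts: string; // ISO 8601
  data: { progress?: Progress };
}

interface CrawlSummary {
  durationSeconds: number;
  progress?: Progress;
}

// Duration is the gap between the two timestamps; progress is taken
// from the crawl_done entry if present.
function summarizeCrawl(startEntry: LifecycleEntry, doneEntry: LifecycleEntry): CrawlSummary {
  const durationMs = Date.parse(doneEntry.ts) - Date.parse(startEntry.ts);
  return { durationSeconds: durationMs / 1000, progress: doneEntry.data.progress };
}

// Made-up entries and numbers for illustration.
const crawlStart: LifecycleEntry = { op: "crawl_start", ts: "2024-02-20T15:50:00Z", data: {} };
const crawlDone: LifecycleEntry = {
  op: "crawl_done",
  ts: "2024-02-20T15:55:30Z",
  data: { progress: { totalDiscovered: 120, downloaded: 115, failed: 5 } },
};

const crawlSummary = summarizeCrawl(crawlStart, crawlDone);
console.log(crawlSummary.durationSeconds); // 330
```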
Extension Audit
See everything the Chrome extension captured:
get_observations({ source: "extension", limit: 200 })
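For an audit, it can help to see which pages produced the most extension captures. The following sketch tallies entries per URL; `countByUrl` and the `Capture` shape are hypothetical and use only the url and source fields described earlier.

```typescript
// Minimal subset of an entry needed for the tally (assumed shape).
interface Capture {
  url: string;
  source: string;
}

// Count entries whose source is "extension", keyed by URL.
function countByUrl(entries: Capture[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const e of entries) {
    if (e.source !== "extension") continue;
    counts[e.url] = (counts[e.url] || 0) + 1;
  }
  return counts;
}

// Made-up captures for illustration.
const captures: Capture[] = [
  { url: "https://example.com/", source: "extension" },
  { url: "https://example.com/", source: "extension" },
  { url: "https://example.com/about", source: "extension" },
  { url: "https://example.com/", source: "engine" },
];

const perUrl = countByUrl(captures);
console.log(perUrl["https://example.com/"]); // 2
```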
After Observation — Create Findings
Once you've identified patterns in observations, use the finding skill to record insights with evidence chains.
Source
https://github.com/Crawlio-app/crawlio-plugin/blob/main/skills/observe/SKILL.md
Overview
Use this skill to review what happened during a crawl by querying Crawlio's observation log. The log is an append-only timeline that you can filter by host, source, operation, and time, which helps you audit extension captures, network activity, and crawl lifecycle events.
How This Skill Works
Call get_observations with any of the optional filters host, source, op, since, and limit. Each returned entry has an id, op, ts, url, source, and a data payload that carries framework detections, network activity, console logs, progress, and other captured details. Combine filters to reconstruct a precise timeline.
When to Use It
- See what happened during a crawl
- Review extension captures (framework detection, network requests, console logs)
- Reconstruct a timeline of events for a session
- Find specific observations by host, source, or time range
- Audit crawl performance using crawl_start and crawl_done entries
Quick Start
- Step 1: Decide the filters and time window (host, source, op, since)
- Step 2: Run get_observations with the chosen filters, e.g. get_observations({ host: "example.com", limit: 50 })
- Step 3: Review returned entries (id, ts, op, source, data) to build a timeline or identify patterns
Best Practices
- Filter by host with a broad time window to map the full crawl timeline
- Combine host, source, and op filters to narrow down results
- Use since or a time range to focus on a specific period
- Check crawl_start and crawl_done entries to gauge duration and progress
- Differentiate observe (raw data) from finding (insights) to separate data types
Example Use Cases
- get_observations({ limit: 20 })
- get_observations({ host: "example.com", limit: 50 })
- get_observations({ source: "extension", limit: 30 })
- get_observations({ op: "crawl_done" })
- get_observations({ since: 1708444200, limit: 100 })