bootstrap-template-evaluation

npx machina-cli add skill Goodeye-Labs/truesight-mcp-skills/bootstrap-template-evaluation --openclaw

Bootstrap Template Evaluation

Use this skill when a pre-built template likely covers the target use case.

Interactive Q&A protocol (mandatory)

If the template choice is ambiguous, ask one question at a time and offer lettered options.

Example:

Which template family best matches your goal?
A) AI writing detection
B) Code quality
C) Unsure, list all templates first

Rules:

  • Ask one question per message.
  • Prefer lettered options.
  • Ask one follow-up only when needed.
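
A question in the format above could be rendered with a small helper. This is only a sketch: the function name `format_question` is hypothetical and is not part of the skill or of Truesight.

```python
from string import ascii_uppercase


def format_question(prompt: str, options: list[str]) -> str:
    """Render a single question followed by lettered options (A, B, C, ...)."""
    if not options:
        raise ValueError("at least one option is required")
    if len(options) > len(ascii_uppercase):
        raise ValueError("too many options to letter")
    lines = [prompt]
    # Pair each option with the next letter of the alphabet.
    for letter, option in zip(ascii_uppercase, options):
        lines.append(f"{letter}) {option}")
    return "\n".join(lines)


message = format_question(
    "Which template family best matches your goal?",
    ["AI writing detection", "Code quality", "Unsure, list all templates first"],
)
print(message)
```

Sending one such message per turn keeps to the "one question per message" rule.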

Workflow

  1. Discover templates:
    • Call list_templates.
  2. Select template:
    • Match use case to template slug.
  3. Provision private dataset:
    • Call provision_template(slug).
  4. Deploy live evaluation:
    • Call create_and_deploy_evaluation(dataset_id).
    • Capture api_key immediately because it is returned only once.
  5. Verify:
    • Run run_eval with representative inputs.
  6. Return deployment artifacts:
    • dataset_id
    • live_evaluation_id
    • verification result
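
The workflow above can be sketched in Python as follows. Only the tool names come from this skill. Everything else is an assumption made for illustration: the `call_tool` callable (a stand-in for whatever MCP client you use), the argument names, the response field names, the `save_api_key` callback, and the `code-quality` slug.

```python
from typing import Any, Callable

# Hypothetical client interface: call_tool(tool_name, arguments) -> response dict.
ToolCaller = Callable[[str, dict[str, Any]], dict[str, Any]]


def bootstrap_template_evaluation(
    call_tool: ToolCaller,
    slug: str,
    sample_inputs: list[str],
    save_api_key: Callable[[str], None],
) -> dict[str, Any]:
    # Steps 1-2: discover templates and confirm the chosen slug exists.
    templates = call_tool("list_templates", {})["templates"]  # assumed response shape
    if slug not in [t["slug"] for t in templates]:
        # Guardrail: no fitting template means handing off to create-evaluation.
        raise LookupError(f"no template {slug!r}; hand off to create-evaluation")

    # Step 3: provision a private dataset from the template.
    dataset_id = call_tool("provision_template", {"slug": slug})["dataset_id"]

    # Step 4: deploy, and store the api_key at once because it is returned only once.
    deployment = call_tool("create_and_deploy_evaluation", {"dataset_id": dataset_id})
    save_api_key(deployment["api_key"])
    live_evaluation_id = deployment["live_evaluation_id"]

    # Step 5: verify with representative inputs (argument names are assumed).
    verification = call_tool(
        "run_eval",
        {"live_evaluation_id": live_evaluation_id, "inputs": sample_inputs},
    )

    # Step 6: return the deployment artifacts.
    return {
        "dataset_id": dataset_id,
        "live_evaluation_id": live_evaluation_id,
        "verification_result": verification,
    }


def fake_call_tool(name: str, args: dict[str, Any]) -> dict[str, Any]:
    """Offline stub with made-up responses so the sketch runs without a server."""
    responses = {
        "list_templates": {"templates": [{"slug": "code-quality"}]},
        "provision_template": {"dataset_id": "ds_123"},
        "create_and_deploy_evaluation": {
            "live_evaluation_id": "le_456",
            "api_key": "secret",
        },
        "run_eval": {"status": "ok"},
    }
    return responses[name]


captured_keys: list[str] = []
artifacts = bootstrap_template_evaluation(
    fake_call_tool, "code-quality", ["sample input"], captured_keys.append
)
print(artifacts)
```

Passing the key to a callback rather than including it in the returned artifacts keeps the secret out of anything that might later be logged or displayed.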

Guardrails

  • If no template fits, hand off to create-evaluation.
  • Do not skip verification after deployment.

Scopes reference

  • list_templates requires datasets:read
  • provision_template requires datasets:write
  • create_and_deploy_evaluation requires evaluations:write, live-evaluations:write
  • run_eval requires live-evaluations:execute
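
Before starting the workflow, it can help to check that a credential covers every scope listed above. The mapping below copies the scopes from this reference. The helper name `missing_scopes` and the idea of holding granted scopes as a set of strings are assumptions for the sketch, not a Truesight API.

```python
# Scopes required by each tool, as listed in the scopes reference.
REQUIRED_SCOPES: dict[str, set[str]] = {
    "list_templates": {"datasets:read"},
    "provision_template": {"datasets:write"},
    "create_and_deploy_evaluation": {"evaluations:write", "live-evaluations:write"},
    "run_eval": {"live-evaluations:execute"},
}


def missing_scopes(granted: set[str]) -> dict[str, set[str]]:
    """Return, per tool, the required scopes that are not in `granted`."""
    gaps: dict[str, set[str]] = {}
    for tool, required in REQUIRED_SCOPES.items():
        missing = required - granted
        if missing:
            gaps[tool] = missing
    return gaps


# Example: a credential that can only read and write datasets.
gaps = missing_scopes({"datasets:read", "datasets:write"})
for tool, scopes in gaps.items():
    print(tool, "is missing", sorted(scopes))
```

An empty result means the credential covers all four tools used by the workflow.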

Source

View on GitHub: https://github.com/Goodeye-Labs/truesight-mcp-skills/blob/main/skills/bootstrap-template-evaluation/SKILL.md

Overview

Bootstrap Template Evaluation is the fastest path to a deployed live evaluation. It starts from a pre-built Truesight template, so you skip building judgment configs from scratch and move directly from template discovery to live deployment. It suits rapid experimentation where a template already matches the use case.

How This Skill Works

Start by discovering templates with list_templates and selecting the template slug that best matches your use case. Provision a private dataset with provision_template(slug), then deploy the live evaluation using create_and_deploy_evaluation(dataset_id) and capture the api_key (returned only once). Finally, verify the evaluation with run_eval and return deployment artifacts such as dataset_id, live_evaluation_id, and the verification result.

When to Use It

  • You want a quick-start live evaluation without building judgment configs from scratch
  • A pre-built template closely matches your goal and you want to minimize setup time
  • You need to provision a private dataset and deploy in a single flow
  • You have representative inputs ready to verify the evaluation right after deployment
  • You want a traceable path from template discovery to deployed artifacts (dataset_id, live_evaluation_id)

Quick Start

  1. Discover templates with list_templates and pick the matching template slug
  2. Provision a private dataset by calling provision_template(slug)
  3. Deploy with create_and_deploy_evaluation(dataset_id), capture the api_key, then verify with run_eval

Best Practices

  • Validate that the selected template slug matches your use case before provisioning
  • If the template choice is ambiguous, follow the interactive Q&A protocol and ask one question at a time
  • Capture the api_key immediately after deployment since it is returned only once
  • Do not skip verification after deployment; run run_eval with representative inputs
  • Record and review deployment artifacts (dataset_id, live_evaluation_id, verification results) and check required scopes

Example Use Cases

  • Moderation: deploy a live evaluation using an AI writing detection template to monitor user-generated content
  • Code quality: use a pre-built code evaluation template to assess a codebase without building judgments from scratch
  • Sentiment analysis: evaluate customer support interactions, if list_templates shows a matching template
  • Compliance checks: run a live evaluation for regulated workflows, if a privacy/compliance template is available
  • Threat detection: evaluate security-related patterns, if a matching pre-built template is available
