What if no template fits my use case?

Follow the Guardrails: if no template matches, hand off to create-evaluation.

Do I need to verify the evaluation after deployment?

Yes. Do not skip verification; run run_eval with representative inputs and review results.

What API scopes are required for these actions?

list_templates requires datasets:read; provision_template requires datasets:write; create_and_deploy_evaluation requires evaluations:write and live-evaluations:write; run_eval requires live-evaluations:execute.

bootstrap-template-evaluation

npx machina-cli add skill Goodeye-Labs/truesight-mcp-skills/bootstrap-template-evaluation --openclaw


Bootstrap Template Evaluation

Use this skill when a pre-built template likely covers the target use case.

Interactive Q&A protocol (mandatory)

If template choice is ambiguous, ask one question at a time with lettered options.

Example:

Which template family best matches your goal?
A) AI writing detection
B) Code quality
C) Unsure, list all templates first

Rules:

Ask one question per message.
Prefer lettered options.
Ask one follow-up only when needed.
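The lettered-option format above can be produced mechanically. The sketch below is one possible helper, not part of the skill itself; the function name `format_question` is hypothetical.

```python
from string import ascii_uppercase


def format_question(prompt: str, options: list[str]) -> str:
    """Render a single question followed by lettered options (A, B, C, ...)."""
    if not options:
        raise ValueError("at least one option is required")
    if len(options) > len(ascii_uppercase):
        raise ValueError("too many options to letter")
    lines = [prompt]
    # zip stops at the shorter sequence, so each option gets exactly one letter.
    for letter, option in zip(ascii_uppercase, options):
        lines.append(f"{letter}) {option}")
    return "\n".join(lines)


question = format_question(
    "Which template family best matches your goal?",
    ["AI writing detection", "Code quality", "Unsure, list all templates first"],
)
print(question)
```

Sending one such string per message keeps the exchange within the one-question-at-a-time rule.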

Workflow

1. Discover templates:
   - Call list_templates.
2. Select template:
   - Match use case to template slug.
3. Provision private dataset:
   - Call provision_template(slug).
4. Deploy live evaluation:
   - Call create_and_deploy_evaluation(dataset_id).
   - Capture api_key immediately because it is returned only once.
5. Verify:
   - Run run_eval with representative inputs.
6. Return deployment artifacts:
   - dataset_id
   - live_evaluation_id
   - verification result
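The workflow above could be sketched end to end as follows. This is a minimal illustration under stated assumptions: the `call_tool(name, arguments)` transport, the response field names (`templates`, `dataset_id`, `live_evaluation_id`, `api_key`), and the `run_eval` argument names are all hypothetical, since the skill only names the tools and the artifacts. A canned `fake_call_tool` stands in for a real client so the sketch runs on its own.

```python
from typing import Any, Callable

# Assumed transport: call_tool(name, arguments) returns the tool result as a dict.
ToolCaller = Callable[[str, dict[str, Any]], dict[str, Any]]


def bootstrap(call_tool: ToolCaller, slug: str, samples: list[dict[str, Any]]):
    # Discover templates and confirm the chosen slug exists (assumed response shape).
    listing = call_tool("list_templates", {})
    slugs = {template["slug"] for template in listing["templates"]}
    if slug not in slugs:
        raise LookupError(f"no template with slug {slug!r}; hand off to create-evaluation")

    # Provision a private dataset from the template.
    dataset_id = call_tool("provision_template", {"slug": slug})["dataset_id"]

    # Deploy, and capture api_key right away because it is returned only once.
    deployed = call_tool("create_and_deploy_evaluation", {"dataset_id": dataset_id})
    api_key = deployed["api_key"]
    live_evaluation_id = deployed["live_evaluation_id"]

    # Verify with representative inputs (argument names are assumptions).
    verification = [
        call_tool("run_eval", {"live_evaluation_id": live_evaluation_id, "inputs": sample})
        for sample in samples
    ]

    # Return the artifacts; the api_key is handed back separately.
    artifacts = {
        "dataset_id": dataset_id,
        "live_evaluation_id": live_evaluation_id,
        "verification_result": verification,
    }
    return artifacts, api_key


def fake_call_tool(name: str, arguments: dict[str, Any]) -> dict[str, Any]:
    """Canned in-memory responses so the sketch runs without a server."""
    responses = {
        "list_templates": {"templates": [{"slug": "code-quality"}]},
        "provision_template": {"dataset_id": "ds_demo"},
        "create_and_deploy_evaluation": {"live_evaluation_id": "le_demo", "api_key": "key_demo"},
        "run_eval": {"status": "ok"},
    }
    return responses[name]


artifacts, api_key = bootstrap(fake_call_tool, "code-quality", [{"text": "sample input"}])
print(artifacts)
```

Returning the api_key apart from the artifacts keeps the secret out of anything that gets logged or passed along as a deployment summary.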

Guardrails

If no template fits, hand off to create-evaluation.
Do not skip verification after deployment.
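The first guardrail amounts to a branch: provision when a template matches, otherwise hand off. The sketch below shows one way to express that decision. The template shape (`slug` plus a `tags` list) and the keyword-matching rule are assumptions made for illustration; the skill does not specify how matching is done.

```python
def next_action(keywords: set[str], templates: list[dict]) -> dict:
    """Pick the first template whose tags overlap the keywords, else hand off."""
    wanted = {keyword.lower() for keyword in keywords}
    for template in templates:
        # "tags" is a hypothetical field used only for this illustration.
        tags = {tag.lower() for tag in template.get("tags", [])}
        if tags & wanted:
            return {"action": "provision_template", "slug": template["slug"]}
    # Guardrail: no template fits, so hand off to the create-evaluation skill.
    return {"action": "handoff", "skill": "create-evaluation"}


templates = [
    {"slug": "ai-writing-detection", "tags": ["writing", "detection"]},
    {"slug": "code-quality", "tags": ["code", "quality"]},
]
print(next_action({"code"}, templates))
print(next_action({"sentiment"}, templates))
```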

Scopes reference

list_templates requires datasets:read
provision_template requires datasets:write
create_and_deploy_evaluation requires evaluations:write, live-evaluations:write
run_eval requires live-evaluations:execute
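A preflight check against this table can catch a missing scope before the workflow starts. The sketch below encodes the scopes listed above; the helper name `missing_scopes` and the idea of passing in a set of granted scopes are assumptions, since how granted scopes are obtained depends on your setup.

```python
# Scope requirements copied from the reference above.
REQUIRED_SCOPES = {
    "list_templates": {"datasets:read"},
    "provision_template": {"datasets:write"},
    "create_and_deploy_evaluation": {"evaluations:write", "live-evaluations:write"},
    "run_eval": {"live-evaluations:execute"},
}


def missing_scopes(granted: set[str]) -> dict[str, set[str]]:
    """Return, per tool, the required scopes that are not in `granted`."""
    gaps = {}
    for tool, needed in REQUIRED_SCOPES.items():
        lacking = needed - granted
        if lacking:
            gaps[tool] = lacking
    return gaps


gaps = missing_scopes({"datasets:read", "datasets:write", "evaluations:write"})
print(gaps)
```

An empty result means the key covers every tool in the workflow.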

Source

https://github.com/Goodeye-Labs/truesight-mcp-skills/blob/main/skills/bootstrap-template-evaluation/SKILL.md

Overview

Bootstrap Template Evaluation deploys a live evaluation from a pre-built Truesight template, so you do not have to build judgment configs from scratch. The flow runs from template discovery through provisioning to live deployment and verification, which suits rapid experimentation.

How This Skill Works

Start by discovering templates with list_templates and selecting the template slug that best matches your use case. Provision a private dataset with provision_template(slug), then deploy the live evaluation using create_and_deploy_evaluation(dataset_id) and capture the api_key (returned only once). Finally, verify the evaluation with run_eval and return deployment artifacts such as dataset_id, live_evaluation_id, and the verification result.

When to Use It

You want a quick-start live evaluation without building judgment configs from scratch
A pre-built template closely matches your goal and you want to minimize setup time
You need to provision a private dataset and deploy in a single flow
You have representative inputs on hand to verify the evaluation after deployment
You want a traceable path from template discovery to deployed artifacts (dataset_id, live_evaluation_id)

Quick Start

Step 1: Discover templates with list_templates to find a matching template slug
Step 2: Provision the template by calling provision_template(slug) to create a private dataset
Step 3: Deploy the live evaluation with create_and_deploy_evaluation(dataset_id), capture the api_key, then verify with run_eval

Best Practices

Validate that the selected template slug matches your use case before provisioning
If the template choice is ambiguous, follow the interactive Q&A protocol and ask one question at a time
Capture the api_key immediately after deployment since it is returned only once
Do not skip verification after deployment; run run_eval with representative inputs
Record and review deployment artifacts (dataset_id, live_evaluation_id, verification results) and check required scopes
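One way to follow the practices on api_key capture and artifact recording is to hold everything in a single record that keeps the key out of printed output. The sketch below is an illustration only: the class name `DeploymentRecord` and the string-valued `verification_result` are assumptions, as the skill does not define a result format.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DeploymentRecord:
    dataset_id: str
    live_evaluation_id: str
    verification_result: str
    # repr=False keeps the one-time api_key out of logs and debug prints.
    api_key: str = field(repr=False)

    def artifacts(self) -> dict[str, str]:
        """Artifacts to return to the caller; the api_key is deliberately excluded."""
        return {
            "dataset_id": self.dataset_id,
            "live_evaluation_id": self.live_evaluation_id,
            "verification_result": self.verification_result,
        }


record = DeploymentRecord(
    dataset_id="ds_demo",
    live_evaluation_id="le_demo",
    verification_result="passed",
    api_key="key_demo",
)
print(record)
print(record.artifacts())
```

The record is frozen so the captured key cannot be overwritten by accident after deployment.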

Example Use Cases

Moderation: deploy a live evaluation using an AI writing detection template to monitor user-generated content
Code quality: use a pre-built code evaluation template to assess a codebase without building judgments from scratch
Sentiment analysis: fast-deploy to evaluate customer support interactions with a matching template
Compliance checks: run a live evaluation against a privacy/compliance template for regulated workflows
Threat detection: quick setup to evaluate security-related patterns using a pre-built template

Frequently Asked Questions
