bootstrap-template-evaluation
npx machina-cli add skill Goodeye-Labs/truesight-mcp-skills/bootstrap-template-evaluation --openclaw
Bootstrap Template Evaluation
Use this skill when a pre-built template likely covers the target use case.
Interactive Q&A protocol (mandatory)
If template choice is ambiguous, ask one question at a time with lettered options.
Example:
Which template family best matches your goal?
A) AI writing detection
B) Code quality
C) Unsure, list all templates first
Rules:
- Ask one question per message.
- Prefer lettered options.
- Ask one follow-up only when needed.
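A question in this shape can be produced with a small formatter. The following is a minimal sketch only; the function name format_question is hypothetical and not part of the skill.

```python
from string import ascii_uppercase


def format_question(prompt: str, options: list[str]) -> str:
    """Render a single question followed by lettered options (A, B, C, ...)."""
    if not options:
        raise ValueError("at least one option is required")
    if len(options) > len(ascii_uppercase):
        raise ValueError("too many options to letter")
    lines = [prompt]
    for letter, option in zip(ascii_uppercase, options):
        lines.append(f"{letter}) {option}")
    return "\n".join(lines)


# One question per message, with lettered options.
question = format_question(
    "Which template family best matches your goal?",
    ["AI writing detection", "Code quality", "Unsure, list all templates first"],
)
print(question)
```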
Workflow
- Discover templates:
  - Call list_templates.
- Select template:
  - Match use case to template slug.
- Provision private dataset:
  - Call provision_template(slug).
- Deploy live evaluation:
  - Call create_and_deploy_evaluation(dataset_id).
  - Capture api_key immediately because it is returned only once.
- Verify:
  - Run run_eval with representative inputs.
- Return deployment artifacts:
  - dataset_id
  - live_evaluation_id
  - verification result
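The workflow above can be sketched end to end as follows. This is an illustration under stated assumptions, not the Truesight client: the call_tool transport, the response field names (templates, slug, dataset_id, live_evaluation_id, api_key), the run_eval argument names, and the example slug are all hypothetical. Only the four tool names come from this skill.

```python
from typing import Any, Callable

# Hypothetical transport: takes a tool name and arguments, returns a dict.
ToolCaller = Callable[[str, dict], dict]


def bootstrap_from_template(call_tool: ToolCaller, slug: str, sample_inputs: list[str]) -> tuple[dict, str]:
    """Discover, provision, deploy, and verify. Response shapes are assumed."""
    templates = call_tool("list_templates", {})["templates"]
    if slug not in {t["slug"] for t in templates}:
        raise LookupError(f"no template with slug {slug!r}; hand off to create-evaluation")

    dataset_id = call_tool("provision_template", {"slug": slug})["dataset_id"]

    deployed = call_tool("create_and_deploy_evaluation", {"dataset_id": dataset_id})
    api_key = deployed["api_key"]  # returned only once, so keep it right away
    live_evaluation_id = deployed["live_evaluation_id"]

    # Verification: run the deployed evaluation on representative inputs.
    verification: list[Any] = [
        call_tool("run_eval", {"live_evaluation_id": live_evaluation_id, "api_key": api_key, "input": text})
        for text in sample_inputs
    ]
    artifacts = {
        "dataset_id": dataset_id,
        "live_evaluation_id": live_evaluation_id,
        "verification": verification,
    }
    return artifacts, api_key


def fake_call_tool(name: str, args: dict) -> dict:
    """In-memory stand-in so the sketch runs without a server."""
    responses = {
        "list_templates": {"templates": [{"slug": "ai-writing-detection"}]},
        "provision_template": {"dataset_id": "ds_123"},
        "create_and_deploy_evaluation": {"live_evaluation_id": "le_456", "api_key": "key_once"},
        "run_eval": {"ok": True},
    }
    return responses[name]


artifacts, api_key = bootstrap_from_template(fake_call_tool, "ai-writing-detection", ["sample text"])
print(artifacts)
```

Keeping api_key out of the artifacts dictionary means the record that gets returned or logged contains only the three items the workflow lists.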
Guardrails
- If no template fits, hand off to create-evaluation.
- Do not skip verification after deployment.
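The first guardrail can be expressed as a fallback in the selection step. The sketch below is hypothetical throughout: keyword matching is only one possible way to match a use case to a slug, and the slug and description fields are assumed rather than taken from the list_templates response.

```python
from typing import Optional

HANDOFF_SKILL = "create-evaluation"  # skill named in the guardrail


def select_slug(templates: list[dict], keywords: list[str]) -> Optional[str]:
    """Return the first slug whose slug or description mentions any keyword."""
    wanted = [k.lower() for k in keywords]
    for template in templates:
        slug = template.get("slug")
        if not slug:
            continue  # ignore entries without a slug
        haystack = f"{slug} {template.get('description', '')}".lower()
        if any(k in haystack for k in wanted):
            return slug
    return None


def next_step(templates: list[dict], keywords: list[str]) -> str:
    """Decide whether to provision a template or hand off to another skill."""
    slug = select_slug(templates, keywords)
    if slug is None:
        return f"handoff:{HANDOFF_SKILL}"
    return f"provision:{slug}"


# Illustrative template list; the slug and description are made up.
templates = [{"slug": "code-quality", "description": "Scores code readability"}]
matched = next_step(templates, ["code"])
unmatched = next_step(templates, ["sentiment"])
print(matched, unmatched)
```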
Scopes reference
- list_templates requires datasets:read
- provision_template requires datasets:write
- create_and_deploy_evaluation requires evaluations:write, live-evaluations:write
- run_eval requires live-evaluations:execute
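A pre-flight check against these scopes might look like the sketch below. The mapping mirrors the scopes reference; the function name missing_scopes and the idea of receiving the granted scopes as a set are assumptions for illustration.

```python
from typing import Optional

# Tool name -> scopes it requires, copied from the scopes reference.
REQUIRED_SCOPES = {
    "list_templates": {"datasets:read"},
    "provision_template": {"datasets:write"},
    "create_and_deploy_evaluation": {"evaluations:write", "live-evaluations:write"},
    "run_eval": {"live-evaluations:execute"},
}


def missing_scopes(granted: set[str], tools: Optional[list[str]] = None) -> dict[str, set[str]]:
    """Return, per tool, the required scopes that are not in `granted`."""
    tools = list(REQUIRED_SCOPES) if tools is None else tools
    gaps = {}
    for tool in tools:
        lacking = REQUIRED_SCOPES[tool] - set(granted)
        if lacking:
            gaps[tool] = lacking
    return gaps


# A key that can read and write datasets and write evaluations only.
gaps = missing_scopes({"datasets:read", "datasets:write", "evaluations:write"})
print(gaps)
```

Running the check before the first call surfaces a missing deployment or execution scope before a dataset has been provisioned.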
Source
https://github.com/Goodeye-Labs/truesight-mcp-skills/blob/main/skills/bootstrap-template-evaluation/SKILL.md
Overview
Bootstrap Template Evaluation provides the fastest path to a deployed live evaluation by using a pre-built Truesight template. It helps you skip building judgment configs from scratch and move quickly from template discovery to live deployment. This is ideal for rapid experimentation and getting evaluation results fast.
How This Skill Works
Start by discovering templates with list_templates and selecting the template slug that best matches your use case. Provision a private dataset with provision_template(slug), then deploy the live evaluation using create_and_deploy_evaluation(dataset_id) and capture the api_key (returned only once). Finally, verify the evaluation with run_eval and return deployment artifacts such as dataset_id, live_evaluation_id, and the verification result.
When to Use It
- You want a quick-start live evaluation without building judgment configs from scratch
- A pre-built template closely matches your goal and you want to minimize setup time
- You need to provision a private dataset and deploy in a single flow
- You are ready to verify the deployed evaluation with representative inputs before relying on its results
- You want a traceable path from template discovery to deployed artifacts (dataset_id, live_evaluation_id)
Quick Start
- Step 1: Discover templates with list_templates to find a matching template slug
- Step 2: Provision the template by calling provision_template(slug) to create a private dataset
- Step 3: Deploy the live evaluation with create_and_deploy_evaluation(dataset_id), capture api_key, then run_eval for verification
Best Practices
- Validate that the selected template slug matches your use case before provisioning
- If the template is ambiguous, follow the interactive Q&A protocol and ask one question at a time
- Capture the api_key immediately after deployment since it is returned only once
- Do not skip verification after deployment; run run_eval with representative inputs
- Record and review deployment artifacts (dataset_id, live_evaluation_id, verification results) and check required scopes
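One way to record the artifacts while keeping the one-time api_key out of logs is sketched below. The Deployment class, its field names, and the example values are hypothetical, not something the skill defines.

```python
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class Deployment:
    dataset_id: str
    live_evaluation_id: str
    verification_passed: bool
    api_key: str = field(repr=False)  # excluded from repr so logs do not leak it

    def artifacts(self) -> dict:
        """Shareable record: everything except the one-time api_key."""
        record = asdict(self)
        record.pop("api_key")
        return record


# Placeholder values for illustration only.
deployment = Deployment("ds_123", "le_456", True, api_key="key_once")
print(deployment)
print(deployment.artifacts())
```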
Example Use Cases
- Moderation: deploy a live evaluation using an AI writing detection template to monitor user-generated content
- Code quality: use a pre-built code evaluation template to assess a codebase without building judgments from scratch
- Sentiment analysis: fast-deploy to evaluate customer support interactions with a matching template
- Compliance checks: run a live evaluation against a privacy/compliance template for regulated workflows
- Threat detection: quick setup to evaluate security-related patterns using a pre-built template