pubfi-dsl-client
PubFi DSL Client Guide
Install:
npx machina-cli add skill helixbox/pubfi-skills/pubfi-dsl-client --openclaw
Overview
This skill defines how to build safe, structured DSL requests for PubFi data access. The server only accepts structured DSL and executes a fixed OpenSearch query shape. The server does not accept SQL, raw OpenSearch DSL, or natural language compilation.
Base URL
Set API_BASE to choose the environment:
- Staging (publish target): API_BASE=https://api-stg.pubfi.ai
- Local debugging (Actix server): API_BASE=http://127.0.0.1:23340
Note: 0.0.0.0 is typically a listen address, not a client address. If you still need it for local debugging, you may set API_BASE=http://0.0.0.0:23340.
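As a minimal sketch, the environment choice above can be read from the process environment so the same client code works against staging and a local server (the staging default mirrors the URL listed above):

```python
import os

# Default to staging; export API_BASE=http://127.0.0.1:23340 when
# debugging against a local Actix server.
API_BASE = os.environ.get("API_BASE", "https://api-stg.pubfi.ai")
```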
When To Use
- You need to convert an agent intent into a structured request.
- You must avoid sending SQL or raw OpenSearch DSL to the server.
- You need a deterministic, auditable query shape for billing and rate limits.
Core Rules
- Only send fields defined in the DSL schema.
- Do not send SQL, OpenSearch DSL, or arbitrary field names.
- Use explicit UTC timestamps and treat window.end as exclusive.
- Keep search.doc_topk, filter sizes, and aggregation sizes within the published limits from the schema endpoint.
- Use search.doc_topk = 0 when you only need aggregation results.
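The rules above can be enforced with a small pre-flight check before any request leaves the client. This is a sketch only: MAX_DOC_TOPK is an illustrative placeholder, since the authoritative limits come from GET /v1/dsl/schema.

```python
from datetime import datetime

# Illustrative limit only; fetch the real limits from GET /v1/dsl/schema.
MAX_DOC_TOPK = 200
TOP_LEVEL_FIELDS = {"window", "search", "filters", "output"}

def check_request(req: dict) -> list:
    """Return rule violations; an empty list means the request passes these checks."""
    errors = []
    extra = set(req) - TOP_LEVEL_FIELDS
    if extra:
        errors.append(f"unknown top-level fields: {sorted(extra)}")
    for key in ("start", "end"):
        ts = req.get("window", {}).get(key)
        try:
            if not ts.endswith("Z"):
                raise ValueError
            datetime.fromisoformat(ts.replace("Z", "+00:00"))
        except (AttributeError, ValueError):
            errors.append(f"window.{key} must be an explicit UTC timestamp like 2026-02-05T00:00:00Z")
    topk = req.get("search", {}).get("doc_topk", 0)
    if not 0 <= topk <= MAX_DOC_TOPK:
        errors.append(f"search.doc_topk must be between 0 and {MAX_DOC_TOPK}")
    return errors
```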
Workflow
- Fetch the schema from the API, or use references/dsl-schema.md as the local contract.
- Convert the user intent into time window, filters, document fields, and aggregation primitives.
- Validate the JSON shape and required fields before sending.
- Record the request and response for audit and billing reconciliation.
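The intent-to-request step can be sketched as a small builder that only ever emits the structured fields, never a raw query string. The helper name and keyword shape here are illustrative, not part of the published contract:

```python
def build_request(start, end, *, text=None, tags=None, doc_topk=0,
                  fields=None, aggregations=None):
    """Assemble a DSL request from structured intent; never pass SQL or raw DSL."""
    req = {
        "window": {"start": start, "end": end},  # end is exclusive
        "search": {"doc_topk": doc_topk},        # 0 = aggregation-only query
        "output": {"fields": fields or [], "aggregations": aggregations or []},
    }
    if text:
        req["search"]["text"] = text
    if tags:
        req["filters"] = {"tags": tags}
    return req
```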
Endpoints
GET /v1/dsl/schema
POST /v1/dsl/query
Example Commands
Fetch the schema:
curl -sS "$API_BASE/v1/dsl/schema"
Run a query:
curl -sS "$API_BASE/v1/dsl/query" \
-H 'content-type: application/json' \
-d @request.json
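The same POST can be issued from Python's standard library. This sketch only builds the request object; pass it to urllib.request.urlopen to actually send it (and keep a copy of body and response for audit):

```python
import json
import urllib.request

def dsl_query_request(api_base, body):
    """Build the POST for /v1/dsl/query; send it with urllib.request.urlopen."""
    return urllib.request.Request(
        api_base.rstrip("/") + "/v1/dsl/query",
        data=json.dumps(body).encode("utf-8"),
        headers={"content-type": "application/json"},
        method="POST",
    )
```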
Example Request
{
"window": {"start": "2026-02-05T00:00:00Z", "end": "2026-02-06T00:00:00Z"},
"search": {"text": "liquidity stress exchange withdrawal", "doc_topk": 50},
"filters": {"tags": ["withdrawal_pause"], "entities": ["Binance"], "sources": ["coindesk"]},
"output": {
"fields": [
"document_id",
"title",
"source",
"source_published_at",
"tag_slugs",
"entity_labels",
"document_summary"
],
"aggregations": [
{
"name": "top_tags",
"aggregation": {"type": "terms", "field": "tag_slugs", "size": 50, "min_doc_count": 2}
},
{
"name": "volume_1h",
"aggregation": {"type": "date_histogram", "field": "source_published_at", "fixed_interval": "1h"}
},
{
"name": "tag_edges",
"aggregation": {"type": "cooccurrence", "field": "tag_slugs", "outer_size": 50, "inner_size": 50, "min_doc_count": 2}
}
]
}
}
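Once the response comes back, aggregation results can be unpacked per named aggregation. The bucket layout below is an assumption modeled on OpenSearch terms-aggregation output; confirm the actual response shape against GET /v1/dsl/schema before relying on it:

```python
def top_tag_counts(response):
    """Map tag slug -> doc_count from the 'top_tags' terms aggregation.

    Assumed bucket layout (OpenSearch-style); verify against the schema endpoint.
    """
    buckets = response.get("aggregations", {}).get("top_tags", {}).get("buckets", [])
    return {b["key"]: b["doc_count"] for b in buckets}
```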
Common Mistakes
- Passing natural language constraints instead of structured filters.
- Omitting window or using non-UTC timestamps.
- Requesting too many fields or too many aggregations.
- Using unsupported fixed_interval values.
- Requesting large co-occurrence expansions that violate edge limits.
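The fixed_interval mistake above can be caught client-side. The allowed set here is purely hypothetical; the authoritative list of supported intervals comes from the schema endpoint:

```python
# Hypothetical allowed intervals; replace with the list from GET /v1/dsl/schema.
ALLOWED_INTERVALS = {"1m", "5m", "15m", "1h", "6h", "1d"}

def check_intervals(request):
    """Flag date_histogram aggregations whose fixed_interval is not allowed."""
    bad = []
    for agg in request.get("output", {}).get("aggregations", []):
        spec = agg.get("aggregation", {})
        if spec.get("type") == "date_histogram":
            iv = spec.get("fixed_interval")
            if iv not in ALLOWED_INTERVALS:
                bad.append(f"{agg.get('name')}: unsupported fixed_interval {iv!r}")
    return bad
```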
Source
View on GitHub: https://github.com/helixbox/pubfi-skills/blob/main/skills/pubfi-dsl-client/SKILL.md