pubfi-dsl-client
PubFi DSL Client Guide
Install:
npx machina-cli add skill helixbox/pubfi-skills/pubfi-dsl-client --openclaw
Overview
This skill defines how to build safe, structured DSL requests for PubFi data access. The server only accepts structured DSL and executes a fixed OpenSearch query shape. The server does not accept SQL, raw OpenSearch DSL, or natural language compilation.
Base URL
Set API_BASE to choose the environment:
- Staging (publish target): API_BASE=https://api-stg.pubfi.ai
- Local debugging (Actix server): API_BASE=http://127.0.0.1:23340
Note: 0.0.0.0 is typically a listen address, not a client address. If you still need it for local debugging, you may set API_BASE=http://0.0.0.0:23340.
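As a minimal sketch, the environment choice above can be read from the process environment so the same client code works against staging and a local server (the staging default mirrors the URL listed above):

```python
import os

# Default to staging; export API_BASE=http://127.0.0.1:23340 when
# debugging against a local Actix server.
API_BASE = os.environ.get("API_BASE", "https://api-stg.pubfi.ai")
```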
When To Use
- You need to convert an agent intent into a structured request.
- You must avoid sending SQL or raw OpenSearch DSL to the server.
- You need a deterministic, auditable query shape for billing and rate limits.
Core Rules
- Only send fields defined in the DSL schema.
- Do not send SQL, OpenSearch DSL, or arbitrary field names.
- Use explicit UTC timestamps and treat window.end as exclusive.
- Keep search.doc_topk, filter sizes, and aggregation sizes within the published limits from the schema endpoint.
- Use search.doc_topk = 0 when you only need aggregation results.
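The rules above can be enforced with a small pre-flight check before any request leaves the client. This is a sketch only: MAX_DOC_TOPK is an illustrative placeholder, since the authoritative limits come from GET /v1/dsl/schema.

```python
from datetime import datetime

# Illustrative limit only; fetch the real limits from GET /v1/dsl/schema.
MAX_DOC_TOPK = 200
TOP_LEVEL_FIELDS = {"window", "search", "filters", "output"}

def check_request(req: dict) -> list:
    """Return rule violations; an empty list means the request passes these checks."""
    errors = []
    extra = set(req) - TOP_LEVEL_FIELDS
    if extra:
        errors.append(f"unknown top-level fields: {sorted(extra)}")
    for key in ("start", "end"):
        ts = req.get("window", {}).get(key)
        try:
            if not ts.endswith("Z"):
                raise ValueError
            datetime.fromisoformat(ts.replace("Z", "+00:00"))
        except (AttributeError, ValueError):
            errors.append(f"window.{key} must be an explicit UTC timestamp like 2026-02-05T00:00:00Z")
    topk = req.get("search", {}).get("doc_topk", 0)
    if not 0 <= topk <= MAX_DOC_TOPK:
        errors.append(f"search.doc_topk must be between 0 and {MAX_DOC_TOPK}")
    return errors
```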
Workflow
- Fetch the schema from the API, or use references/dsl-schema.md as the local contract.
- Convert the user intent into time window, filters, document fields, and aggregation primitives.
- Validate the JSON shape and required fields before sending.
- Record the request and response for audit and billing reconciliation.
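The intent-to-request step can be sketched as a small builder that only ever emits the structured fields, never a raw query string. The helper name and keyword shape here are illustrative, not part of the published contract:

```python
def build_request(start, end, *, text=None, tags=None, doc_topk=0,
                  fields=None, aggregations=None):
    """Assemble a DSL request from structured intent; never pass SQL or raw DSL."""
    req = {
        "window": {"start": start, "end": end},  # end is exclusive
        "search": {"doc_topk": doc_topk},        # 0 = aggregation-only query
        "output": {"fields": fields or [], "aggregations": aggregations or []},
    }
    if text:
        req["search"]["text"] = text
    if tags:
        req["filters"] = {"tags": tags}
    return req
```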
Endpoints
GET /v1/dsl/schema
POST /v1/dsl/query
Example Commands
Fetch the schema:
curl -sS "$API_BASE/v1/dsl/schema"
Run a query:
curl -sS "$API_BASE/v1/dsl/query" \
-H 'content-type: application/json' \
-d @request.json
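The same POST can be issued from Python's standard library. This sketch only builds the request object; pass it to urllib.request.urlopen to actually send it (and keep a copy of body and response for audit):

```python
import json
import urllib.request

def dsl_query_request(api_base, body):
    """Build the POST for /v1/dsl/query; send it with urllib.request.urlopen."""
    return urllib.request.Request(
        api_base.rstrip("/") + "/v1/dsl/query",
        data=json.dumps(body).encode("utf-8"),
        headers={"content-type": "application/json"},
        method="POST",
    )
```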
Example Request
{
"window": {"start": "2026-02-05T00:00:00Z", "end": "2026-02-06T00:00:00Z"},
"search": {"text": "liquidity stress exchange withdrawal", "doc_topk": 50},
"filters": {"tags": ["withdrawal_pause"], "entities": ["Binance"], "sources": ["coindesk"]},
"output": {
"fields": [
"document_id",
"title",
"source",
"source_published_at",
"tag_slugs",
"entity_labels",
"document_summary"
],
"aggregations": [
{
"name": "top_tags",
"aggregation": {"type": "terms", "field": "tag_slugs", "size": 50, "min_doc_count": 2}
},
{
"name": "volume_1h",
"aggregation": {"type": "date_histogram", "field": "source_published_at", "fixed_interval": "1h"}
},
{
"name": "tag_edges",
"aggregation": {"type": "cooccurrence", "field": "tag_slugs", "outer_size": 50, "inner_size": 50, "min_doc_count": 2}
}
]
}
}
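Once the response comes back, aggregation results can be unpacked per named aggregation. The bucket layout below is an assumption modeled on OpenSearch terms-aggregation output; confirm the actual response shape against GET /v1/dsl/schema before relying on it:

```python
def top_tag_counts(response):
    """Map tag slug -> doc_count from the 'top_tags' terms aggregation.

    Assumed bucket layout (OpenSearch-style); verify against the schema endpoint.
    """
    buckets = response.get("aggregations", {}).get("top_tags", {}).get("buckets", [])
    return {b["key"]: b["doc_count"] for b in buckets}
```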
Common Mistakes
- Passing natural language constraints instead of structured filters.
- Omitting window or using non-UTC timestamps.
- Requesting too many fields or too many aggregations.
- Using unsupported fixed_interval values.
- Requesting large co-occurrence expansions that violate edge limits.
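The fixed_interval mistake above can be caught client-side. The allowed set here is purely hypothetical; the authoritative list of supported intervals comes from the schema endpoint:

```python
# Hypothetical allowed intervals; replace with the list from GET /v1/dsl/schema.
ALLOWED_INTERVALS = {"1m", "5m", "15m", "1h", "6h", "1d"}

def check_intervals(request):
    """Flag date_histogram aggregations whose fixed_interval is not allowed."""
    bad = []
    for agg in request.get("output", {}).get("aggregations", []):
        spec = agg.get("aggregation", {})
        if spec.get("type") == "date_histogram":
            iv = spec.get("fixed_interval")
            if iv not in ALLOWED_INTERVALS:
                bad.append(f"{agg.get('name')}: unsupported fixed_interval {iv!r}")
    return bad
```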
Source
View on GitHub: https://github.com/helixbox/pubfi-skills/blob/main/skills/pubfi-dsl-client/SKILL.md