# Search Cluster

Unified search system for multi-source information gathering.

Install: `npx machina-cli add skill @1999AZZAR/search-cluster --openclaw`
## Prerequisites

- Binary: `python3` must be installed.
- Google Search: requires `GOOGLE_CSE_KEY` and `GOOGLE_CSE_ID`.
- NewsAPI: requires `NEWSAPI_KEY`.
- Cache (optional): an active Redis instance (defaults to `localhost:6379`).

## Setup

- Define API keys in your environment or a local `.env` file.
- Install the optional Redis client: `pip install redis`.
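The setup steps above amount to making the keys visible in the process environment. As a minimal sketch of how a local `.env` file can be loaded (the skill may ship its own loader; this `load_dotenv` helper is illustrative, not part of the skill):

```python
import os

def load_dotenv(path=".env"):
    """Minimal .env loader: read KEY=VALUE lines into os.environ.

    Blank lines and comments are skipped; existing environment
    variables are never overwritten.
    """
    if not os.path.exists(path):
        return
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip().strip('"'))
```

Keys already exported in the shell take precedence over `.env` values, which keeps CI and local development from fighting each other.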
## Core Workflows

### 1. Single Source Search

Query a specific engine for targeted results.

- Usage: `python3 $WORKSPACE/skills/search-cluster/scripts/search-cluster.py <source> "<query>"`
- Sources: `google`, `wiki`, `reddit`, `newsapi`.
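Because the CLI prints structured JSON, it is easy to drive from other Python code. A hedged sketch (the `SCRIPT` path and the `search` wrapper are assumptions for illustration; adjust the path to your `$WORKSPACE`):

```python
import json
import subprocess

# Assumed location relative to the workspace root; adjust as needed.
SCRIPT = "skills/search-cluster/scripts/search-cluster.py"

def build_command(source, query, script=SCRIPT):
    """Build the argv list for a single-source search."""
    return ["python3", script, source, query]

def search(source, query):
    """Run the CLI and parse its JSON output (raises on a non-zero exit)."""
    result = subprocess.run(build_command(source, query),
                            capture_output=True, text=True, check=True)
    return json.loads(result.stdout)
```

Passing the query as a single argv element (rather than through a shell string) sidesteps quoting and injection issues.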
### 2. Aggregated Search

Query all supported engines in parallel and aggregate the results.

- Usage: `python3 $WORKSPACE/skills/search-cluster/scripts/search-cluster.py all "<query>"`
### 3. RSS/Feed Fetching

Retrieve and parse standard RSS or Atom feeds.

- Usage: `python3 $WORKSPACE/skills/search-cluster/scripts/search-cluster.py rss "<url>"`
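The skill's own feed parser is not shown here; as a minimal sketch of what handling both formats involves, this stdlib-only helper pulls entry titles from either an RSS 2.0 or an Atom document:

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

def parse_feed_titles(xml_text):
    """Extract entry/item titles from an RSS 2.0 or Atom feed."""
    root = ET.fromstring(xml_text)
    if root.tag == f"{ATOM}feed":  # Atom: namespaced <entry><title>
        return [e.findtext(f"{ATOM}title") for e in root.iter(f"{ATOM}entry")]
    # RSS 2.0: plain <item><title> under <channel>
    return [i.findtext("title") for i in root.iter("item")]

sample = """<rss version="2.0"><channel><title>Demo</title>
<item><title>First post</title></item>
<item><title>Second post</title></item>
</channel></rss>"""
```

Atom elements are namespace-qualified while RSS 2.0 elements are not, which is why the two branches search for different tag names.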
## Reliability & Security

- Secure networking: enforces strict SSL/TLS verification for all API and feed requests; no unverified fallback is permitted.
- Namespace isolation: cache keys are prefixed with `search:` to avoid collisions.
- Local preference: Redis connectivity defaults to `localhost`. Set `REDIS_HOST` explicitly for remote instances.
- User agent: uses a standardized `SearchClusterBot` agent to comply with site policies.
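The properties above can be sketched in a few lines. The exact cache-key layout is internal to the skill, so the `cache_key` format below is an assumption chosen only to show the `search:` namespacing; the TLS context and header are standard-library idioms:

```python
import hashlib
import ssl

def cache_key(source, query):
    """Namespaced cache key: the 'search:' prefix keeps this skill's
    entries from colliding with other users of the same Redis instance."""
    digest = hashlib.sha256(query.encode("utf-8")).hexdigest()[:16]
    return f"search:{source}:{digest}"

# Strict TLS: the default context verifies certificates and hostnames,
# matching the "no unverified fallback" guarantee.
ctx = ssl.create_default_context()

HEADERS = {"User-Agent": "SearchClusterBot"}
```

Hashing the query keeps keys a fixed, Redis-friendly length regardless of how long the query string is.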
## Reference

- API setup: see `references/search-apis.md`.
## Overview
A single, unified search system that queries Google, Wikipedia, Reddit, NewsAPI, and RSS feeds. It aggregates results in parallel and returns structured JSON, speeding up research and competitive analysis.
## How This Skill Works

The skill queries multiple engines in parallel and returns structured JSON output. Redis caching is optional, with cache keys prefixed `search:` to prevent collisions. All requests use a fixed `SearchClusterBot` user agent and enforce strict SSL/TLS verification.
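The parallel fan-out described above can be sketched with a thread pool. The source callables and result shape here are stand-ins, not the skill's real internals; the point is that one slow or failing source degrades to an error entry instead of sinking the whole batch:

```python
from concurrent.futures import ThreadPoolExecutor

def query_all(query, sources):
    """Fan one query out to every source in parallel and aggregate results.

    `sources` maps a source name to a callable(query) -> list of results.
    """
    def run(name, fn):
        try:
            return name, {"results": fn(query)}
        except Exception as exc:
            return name, {"error": str(exc)}

    with ThreadPoolExecutor(max_workers=max(1, len(sources))) as pool:
        futures = [pool.submit(run, n, f) for n, f in sources.items()]
        return dict(f.result() for f in futures)
```

Threads (rather than processes) fit here because each worker spends its time waiting on network I/O.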
## When to Use It
- Research topics across multiple sources in parallel to save time
- Aggregate results from Google, Wikipedia, Reddit, NewsAPI, and RSS feeds for a comprehensive brief
- Monitor trends or mentions across sources with up-to-date results
- Build a structured JSON knowledge snapshot for downstream apps or dashboards
- Fetch and parse RSS/Atom feeds alongside API results for a complete feed roundup
## Quick Start

1. Set the `GOOGLE_CSE_KEY`, `GOOGLE_CSE_ID`, and `NEWSAPI_KEY` environment variables, and make sure Python 3 is installed.
2. Optional: install the Redis client (`pip install redis`) and set `REDIS_HOST` if using a remote cache.
3. Run a command, e.g. `python3 $WORKSPACE/skills/search-cluster/scripts/search-cluster.py all "your query"`.
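Steps 1 and 2 condense to a few exports. The key values below are placeholders, not real credentials; substitute your own:

```shell
export GOOGLE_CSE_KEY="your-google-cse-key"
export GOOGLE_CSE_ID="your-google-cse-id"
export NEWSAPI_KEY="your-newsapi-key"
export REDIS_HOST="cache.example.internal"  # optional; omit to use localhost

# Then run an aggregated search:
# python3 "$WORKSPACE/skills/search-cluster/scripts/search-cluster.py" all "your query"
```

Putting the same lines in a `.env` file keeps the keys out of your shell history.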
## Best Practices

- Define and protect API keys in environment variables or a local `.env` file (`GOOGLE_CSE_KEY`, `GOOGLE_CSE_ID`, `NEWSAPI_KEY`).
- Enable optional Redis caching by installing the `redis` client and setting `REDIS_HOST` when needed.
- Use the aggregated `all` workflow to leverage parallel querying and reduce latency.
- Rely on the built-in `search:` cache-key prefix to avoid collisions and enable clean invalidation.
- Keep strict SSL/TLS verification and the consistent `SearchClusterBot` user agent for compliance.
## Example Use Cases
- A marketing team runs all-source searches to monitor brand mentions across Google, Wikipedia, Reddit, NewsAPI, and RSS feeds, compiling a structured JSON report.
- A researcher aggregates diverse sources to build a topic brief, combining API results with RSS updates for completeness.
- A newsroom assembles breaking-story rundowns by querying multiple sources in parallel and delivering unified JSON results.
- A competitive intelligence analyst tracks competitors across search and social sources in a single query to speed insight generation.
- A knowledge-base pipeline collects multi-source results into structured JSON for ingestion into downstream analytics or dashboards.