categorizing-bsky-accounts

npx machina-cli add skill oaustegard/claude-skills/categorizing-bsky-accounts --openclaw
Categorizing Bluesky Accounts

Fetch Bluesky account data and extract keywords for Claude to categorize by topic. The script compresses account context (bio + posts) into bio + keywords, then Claude performs intelligent categorization.

Prerequisites

Requires: extracting-keywords skill (provides YAKE venv + domain stopwords)

The analyzer delegates keyword extraction to the extracting-keywords skill, which provides:

  • Optimized YAKE installation with minimal dependencies
  • Domain-specific stopwords: English (574), AI/ML (1357), Life Sciences (1293)
  • Support for 34 languages

Core Workflow

When users request Bluesky account analysis:

  1. Ensure keyword extraction is set up - Invoke the extracting-keywords skill using the Skill tool to ensure YAKE venv exists (skip if already invoked in this session)

  2. Determine input mode based on user's request:

    • Following list → use --following handle
    • Followers → use --followers handle
    • List of handles → use --handles "h1,h2,h3"
    • File provided → use --file accounts.txt
  3. Configure parameters:

    • --accounts N - Number to analyze (default: 100, max: 100)
    • --posts N - Posts per account (default: 20, max: 100)
    • --stopwords [en|ai|ls] - Choose domain-specific stopwords:
      • en: English (general purpose)
      • ai: AI/ML domain (recommended for tech accounts)
      • ls: Life Sciences (for biomedical/research accounts)
    • --exclude "pattern1,pattern2" - Skip spam/bot accounts
  4. Run script - Outputs simple text format to stdout:

    @handle1.bsky.social (Display Name)
    Bio text here
    Keywords: keyword1, keyword2, keyword3
    
    @handle2.bsky.social (Another Name)
    Bio text here
    Keywords: keyword4, keyword5, keyword6
    
  5. Categorize accounts - Claude analyzes bio + keywords to categorize by topic

Quick Start

Analyze following list with AI/ML stopwords:

python scripts/bluesky_analyzer.py --following austegard.com --stopwords ai

Analyze followers:

python scripts/bluesky_analyzer.py --followers austegard.com

Analyze specific handles:

python scripts/bluesky_analyzer.py --handles "user1.bsky.social,user2.bsky.social,user3.bsky.social"

From file:

python scripts/bluesky_analyzer.py --file accounts.txt --stopwords ai

Filter out bot accounts:

python scripts/bluesky_analyzer.py --following handle --exclude "bot,spam,promo" --stopwords ai

Parameters

Input Modes (choose one)

--handles "h1,h2,h3" Comma-separated list of Bluesky handles

--following HANDLE Analyze accounts followed by HANDLE

--followers HANDLE Analyze accounts following HANDLE

--file PATH Read handles from file (one per line)
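
As a rough illustration of how --handles and --file input could be normalized, the sketch below turns either form into a clean list of handles. The function names, the stripping of a leading "@", lowercasing, de-duplication, and skipping of "#" comment lines are all assumptions for illustration, not confirmed behavior of bluesky_analyzer.py.

```python
from pathlib import Path
from typing import List


def parse_handles(text: str) -> List[str]:
    """Split comma- or newline-separated text into unique, cleaned handles."""
    handles: List[str] = []
    for raw in text.replace(",", "\n").splitlines():
        handle = raw.strip().lstrip("@").lower()
        # Skip blank lines and "#" comment lines (comment support is an assumption).
        if not handle or handle.startswith("#"):
            continue
        if handle not in handles:
            handles.append(handle)
    return handles


def read_handles_file(path: str) -> List[str]:
    """Read a one-handle-per-line file, as described for --file."""
    return parse_handles(Path(path).read_text(encoding="utf-8"))
```

With this sketch, `parse_handles("@Alice.bsky.social, bob.bsky.social")` yields `["alice.bsky.social", "bob.bsky.social"]`.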

Analysis Options

--accounts N Number of accounts to analyze (1-100, default: 100)

--posts N Posts to fetch per account (1-100, default: 20)

--stopwords [en|ai|ls] Stopwords to use for keyword extraction (default: en)

  • en: English stopwords (574 terms) - general purpose
  • ai: AI/ML domain stopwords (1357 terms) - tech-focused accounts
  • ls: Life Sciences stopwords (1293 terms) - biomedical/research accounts

--exclude "word1,word2" Skip accounts with these keywords in bio/posts
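
The exclusion check might look something like the following sketch, which assumes case-insensitive substring matching against the bio and post text. The function name and the matching rule are hypothetical; the script's actual matching logic is not shown here.

```python
from typing import Iterable, List


def is_excluded(bio: str, posts: Iterable[str], patterns: List[str]) -> bool:
    """Return True if any pattern occurs in the bio or any post (case-insensitive)."""
    haystack = " ".join([bio, *posts]).lower()
    # Ignore empty patterns such as those produced by a trailing comma.
    return any(p.strip().lower() in haystack for p in patterns if p.strip())


# Patterns arrive as a single comma-separated string, e.g. --exclude "bot,spam,promo".
patterns = "bot,spam,promo".split(",")
```

Under plain substring matching, a short pattern such as "bot" would also match "robotics", so longer or more specific patterns are safer if the script behaves this way.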

Output Format

The script outputs simple text format for Claude to process:

@alice.bsky.social (Alice Smith)
AI researcher working on LLM alignment and safety
Keywords: alignment, safety research, interpretability, llm evaluation

@bob.bsky.social (Bob Johnson)
Full-stack developer building web applications
Keywords: react, typescript, node.js, api design, postgresql

@carol.bsky.social (Carol Williams)
Biotech researcher studying CRISPR applications
Keywords: crispr, gene editing, therapeutics, clinical trials

Claude then categorizes accounts based on bio + keywords without hardcoded rules.
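
If the output needs to be post-processed outside of Claude, a small parser along these lines could turn it into structured records. This is a minimal sketch that assumes the format shown above (header line, bio line(s), a "Keywords:" line, blank line between accounts) and that bios contain no blank lines; `parse_analyzer_output` is a hypothetical helper, not part of the skill.

```python
import re
from typing import Dict, List


def parse_analyzer_output(text: str) -> List[Dict[str, object]]:
    """Parse '@handle (Name) / bio / Keywords: ...' blocks into dicts."""
    accounts: List[Dict[str, object]] = []
    # Accounts are separated by one or more blank (or whitespace-only) lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        if not lines or not lines[0].startswith("@"):
            continue
        handle, _, rest = lines[0].partition(" ")
        rest = rest.strip()
        # The display name is wrapped in one pair of parentheses.
        name = rest[1:-1] if rest.startswith("(") and rest.endswith(")") else rest
        keywords: List[str] = []
        bio_lines: List[str] = []
        for ln in lines[1:]:
            if ln.startswith("Keywords:"):
                raw = ln[len("Keywords:"):]
                keywords = [k.strip() for k in raw.split(",") if k.strip()]
            else:
                bio_lines.append(ln)
        accounts.append({
            "handle": handle.lstrip("@"),
            "name": name,
            "bio": " ".join(bio_lines),
            "keywords": keywords,
        })
    return accounts
```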

Common Workflows

Audit Your Following List

python scripts/bluesky_analyzer.py --following your-handle.bsky.social --stopwords ai

Claude will categorize accounts by topic and identify patterns in who you follow.

Find Experts in a Topic

python scripts/bluesky_analyzer.py --following alice.bsky.social --stopwords ai

Ask Claude: "Which of these accounts are ML researchers?" or "Who focuses on climate tech?"

Analyze a Curated List

cat > accounts.txt << 'EOF'
expert1.bsky.social
expert2.bsky.social
expert3.bsky.social
EOF

python scripts/bluesky_analyzer.py --file accounts.txt --stopwords ls

Filter Out Bot Accounts

python scripts/bluesky_analyzer.py --following handle --exclude "bot,spam,promo,follow back" --stopwords ai

Technical Details

Keyword Extraction

Delegates to extracting-keywords skill using YAKE venv:

  • Stopwords options (--stopwords):
    • en: English (574 terms) - general purpose
    • ai: AI/ML domain (1357 terms) - filters technical noise, ML boilerplate
    • ls: Life Sciences (1293 terms) - filters research methodology, clinical terms
  • N-grams: 1-3 words
  • Deduplication: 0.9 threshold
  • Top keywords: 10 per account
  • Performance: ~5% overhead with domain stopwords vs English
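
For orientation, the settings above map onto YAKE's `KeywordExtractor` roughly as in this sketch. It assumes the third-party `yake` package is installed and that a custom stopword list is passed in directly; how the extracting-keywords skill actually loads its stopword files and invokes YAKE is not shown in this document, so the helper names here are hypothetical.

```python
from typing import Iterable, List, Optional


def yake_settings(stopwords: Optional[Iterable[str]] = None) -> dict:
    """Keyword-extraction settings as listed above."""
    return {
        "lan": "en",
        "n": 3,           # n-grams of 1-3 words
        "dedupLim": 0.9,  # deduplication threshold
        "top": 10,        # top keywords per account
        # None lets YAKE fall back to its bundled stopwords for the language.
        "stopwords": set(stopwords) if stopwords is not None else None,
    }


def extract_keywords(text: str, stopwords: Optional[Iterable[str]] = None) -> List[str]:
    """Return keyword strings for one account's combined bio and posts."""
    import yake  # third-party; imported lazily so the settings helper works without it

    extractor = yake.KeywordExtractor(**yake_settings(stopwords))
    # extract_keywords returns (keyword, score) pairs; lower scores rank higher.
    return [keyword for keyword, _score in extractor.extract_keywords(text)]
```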

API Rate Limits

Bluesky API limits:

  • 3000 requests per 5 minutes
  • 5000 requests per hour

The analyzer respects these limits with built-in delays.
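
One common way to stay under a limit such as 3000 requests per 5 minutes is a sliding-window limiter, sketched below. This is an illustration under the limits listed above, not the analyzer's actual delay logic, which is not shown; the class name and the injectable `clock` and `sleep` parameters are assumptions made for testability.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Allow at most max_requests calls within any window_seconds span."""

    def __init__(
        self,
        max_requests: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.max_requests = max_requests
        self.window = window_seconds
        self._clock = clock
        self._sleep = sleep
        self._stamps: Deque[float] = deque()

    def _purge(self, now: float) -> None:
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()

    def wait_time(self) -> float:
        """Seconds to wait before the next request is allowed."""
        now = self._clock()
        self._purge(now)
        if len(self._stamps) < self.max_requests:
            return 0.0
        return self.window - (now - self._stamps[0])

    def acquire(self) -> None:
        """Block until a request slot is free, then record the request."""
        delay = self.wait_time()
        if delay > 0:
            self._sleep(delay)
        now = self._clock()
        self._purge(now)
        self._stamps.append(now)


# Example: one limiter per documented limit; call acquire() before each request.
five_minute_limit = SlidingWindowLimiter(3000, 5 * 60)
```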

Categorization Algorithm

Script's role:

  1. Fetch account data (bio + posts)
  2. Extract keywords to compress context
  3. Output bio + keywords in simple format
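
For the fetch step, the public Bluesky AppView exposes unauthenticated XRPC endpoints that cover each kind of data involved. The sketch below only builds request URLs and provides a small JSON fetch helper; whether bluesky_analyzer.py uses these exact endpoints, this host, or a client library is an assumption, and the helper names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Public, unauthenticated AppView host (assumed; the script may use another).
PUBLIC_API = "https://public.api.bsky.app/xrpc"


def xrpc_url(method: str, **params: object) -> str:
    """Build a GET URL for an XRPC method with query parameters."""
    return f"{PUBLIC_API}/{method}?{urlencode(params)}"


def profile_url(handle: str) -> str:
    """Profile record, which includes the bio (description) and display name."""
    return xrpc_url("app.bsky.actor.getProfile", actor=handle)


def follows_url(handle: str, limit: int = 100) -> str:
    """Accounts that the given handle follows (one page)."""
    return xrpc_url("app.bsky.graph.getFollows", actor=handle, limit=limit)


def followers_url(handle: str, limit: int = 100) -> str:
    """Accounts that follow the given handle (one page)."""
    return xrpc_url("app.bsky.graph.getFollowers", actor=handle, limit=limit)


def author_feed_url(handle: str, limit: int = 20) -> str:
    """Recent posts by the given handle (one page)."""
    return xrpc_url("app.bsky.feed.getAuthorFeed", actor=handle, limit=limit)


def fetch_json(url: str, timeout: float = 10.0) -> dict:
    """Fetch a URL and decode the JSON response body."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```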

Claude's role:

  1. Read bio + keywords for each account
  2. Intelligently categorize by topic (no hardcoded rules)
  3. Group accounts, identify patterns, answer user questions

This agentic pattern is more flexible than hardcoded keyword matching.

Troubleshooting

"No accounts to analyze"

  • Verify handle format (include domain: handle.bsky.social)
  • Check if account exists and has public following/followers

"Insufficient content for keyword extraction"

  • Account has few posts (<5)
  • Posts are very short
  • Try increasing --posts parameter

Rate limit errors

  • Reduce --accounts parameter
  • Add delays between batches
  • Check Bluesky API status

Import errors

  • Verify extracting-keywords skill is available
  • Check YAKE venv exists: /home/claude/yake-venv/bin/python -c "import yake"
  • Verify Python 3.8+: python3 --version

Integration with Other Skills

Built-in integration:

  • extracting-keywords: The analyzer automatically delegates keyword extraction to the extracting-keywords skill, which supplies the optimized YAKE venv and domain-specific stopwords

Example Sessions

User: "Can you analyze the accounts I follow on Bluesky and tell me what topics they focus on?"

Claude:

python scripts/bluesky_analyzer.py --following user-handle.bsky.social --stopwords ai

Based on the output, I can see you follow:

  • AI/ML researchers (15 accounts): Focus on LLM safety, alignment, interpretability
  • Software engineers (20 accounts): Web development, React, TypeScript, DevOps
  • Writers (8 accounts): Tech journalism, newsletters, long-form content
  • Scientists (7 accounts): Climate science, biotech, physics

User: "Find ML researchers in @alice's network"

Claude:

python scripts/bluesky_analyzer.py --following alice.bsky.social --stopwords ai

I found 23 ML researchers in Alice's network:

  • 8 working on LLM alignment and safety
  • 6 focused on model evaluation and benchmarks
  • 5 in ML infrastructure and MLOps
  • 4 in computer vision and multimodal models

User: "Here's a list of 30 accounts, categorize them"

Claude:

python scripts/bluesky_analyzer.py --file accounts.txt --stopwords ai

Categorized into:

  • Climate Tech (8 accounts)
  • Biotech (6 accounts)
  • Fintech (5 accounts)
  • AI/ML (7 accounts)
  • Other (4 accounts)

Source

git clone https://github.com/oaustegard/claude-skills

SKILL.md on GitHub: https://github.com/oaustegard/claude-skills/blob/main/categorizing-bsky-accounts/SKILL.md

Overview

Categorizing Bluesky Accounts fetches Bluesky data and derives keywords to categorize accounts by topic. It compresses bio and posts into a bio+keywords payload, then Claude applies topic-based labeling. It relies on the extracting-keywords skill to provide a YAKE-based keyword set with domain stopwords.

How This Skill Works

First, ensure the keyword extraction environment is ready by invoking the extracting-keywords skill. Next, choose an input mode (--following, --followers, --handles, or --file) and configure parameters (--accounts, --posts, --stopwords, --exclude). Finally, run the bluesky_analyzer; it outputs a simple text block of bio and keywords per account, which Claude uses to categorize by topic.

When to Use It

  • You need topic discovery across Bluesky accounts (bio + posts) to surface themes.
  • You want to analyze who someone follows or is followed by for network patterns.
  • You’re curating Bluesky accounts by topic for a project or organization.
  • You’re filtering out bots or spam accounts before analysis with --exclude.
  • You’re preparing topic-based analytics or reporting on Bluesky networks.

Quick Start

  1. Ensure keyword extraction is set up (invoke the extracting-keywords skill).
  2. Run the analyzer with your input (e.g., --following, --followers, --handles, or --file) and optional --stopwords/--exclude.
  3. Review the simple text output and let Claude categorize accounts by topic.

Best Practices

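  • Match --stopwords to the target accounts: en for general content, ai for AI/ML and tech accounts, ls for biomedical and life-sciences accounts.
  • Start with the default of 20 posts per account (and at most 100 accounts per run) to balance depth and speed; increase --posts when keyword extraction reports insufficient content.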
  • Exclude obvious bots or promo accounts with --exclude patterns.
  • Choose input mode that matches your data source (followers, following, handles, or file).
  • Review keywords for domain relevance and adjust stopwords if needed before re-running.

Example Use Cases

  • AI research group accounts: capture topics like alignment, safety, and model eval.
  • Biomedical researchers: surface keywords such as genomics, proteomics, and clinical trials.
  • Developer communities: identify topics around React, TypeScript, Node.js, and APIs.
  • Tech journalists: map coverage areas like ML policy, AI ethics, and tools.
  • Startup ecosystems: link founders and investors via topics like funding, growth, and market trends.
