discord_vector_db
A tool for retrieving Discord messages and storing them in a vector database for semantic search and analysis.
claude mcp add --transport stdio youngsecurity-discord_vector_db python -m discord_vector_db \
  --env LOG_LEVEL="INFO" \
  --env CHROMA_DB_DIR="path/to/chroma/db" \
  --env CHROMA_DB_URI="http://localhost:7700" \
  --env DISCORD_BOT_TOKEN="your_discord_bot_token" \
  --env OPTOUT_REGISTRY_URL="https://example.com/opt-out-registry" \
  --env PII_REDACTION_ENABLED="true"
How to use
This MCP server connects to a Discord bot and provides tools to retrieve messages from Discord channels, sanitize them for privacy, and store them in a vector database (ChromaDB) for semantic search and analysis. It offers a Python-based workflow including a message retriever, a privacy processor for PII detection and redaction, and a processor that converts messages into embeddings for vector storage. The server also includes a CLI for end-to-end usage: fetch messages, prepare them for the vector database, and perform semantic searches. Users can opt out of processing via the opt-out registry, and data minimization is enforced by default.
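The sanitization stage described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual privacy processor: the function names (`redact_pii`, `sanitize_messages`), the message dictionary shape, and the two regular expressions are all assumptions made for the example, and real PII detection would need far broader rules.

```python
import re
from typing import Dict, Iterable, List, Set

# Hypothetical patterns for illustration only. The phone pattern is deliberately
# loose and would also match other long digit runs (for example numeric IDs).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact_pii(text: str) -> str:
    """Replace email addresses and phone-like digit runs with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return PHONE_RE.sub("[PHONE]", text)


def sanitize_messages(
    messages: Iterable[Dict[str, str]], opted_out: Set[str]
) -> List[Dict[str, str]]:
    """Drop messages from opted-out authors, then redact what remains."""
    cleaned = []
    for msg in messages:
        if msg["author_id"] in opted_out:
            continue  # respect the opt-out before any further processing
        cleaned.append({**msg, "content": redact_pii(msg["content"])})
    return cleaned


if __name__ == "__main__":
    sample = [
        {"id": "1", "author_id": "u1", "content": "Mail me at jane@example.com"},
        {"id": "2", "author_id": "u2", "content": "This should be skipped"},
    ]
    for msg in sanitize_messages(sample, opted_out={"u2"}):
        print(msg["id"], msg["content"])
```

Filtering opted-out authors before redaction means their content is never touched by later stages, which is one plausible way to honor the opt-out registry and keep processing minimal.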
How to install
Prerequisites:
- Python 3.8+ installed on your system
- Virtual environment support (venv) or your preferred environment manager
- Access to a Discord bot with appropriate permissions and a Discord bot token
- Optional: a running ChromaDB instance or accessible ChromaDB endpoint
Steps:
- Clone the repository:
  git clone https://github.com/yourusername/discord_vector_db.git
  cd discord_vector_db
- Create and activate a virtual environment:
  python -m venv venv
  macOS/Linux: source venv/bin/activate
  Windows: venv\Scripts\activate
- Install dependencies:
  pip install -r requirements.txt
- Configure environment variables (example):
  export DISCORD_BOT_TOKEN="your_discord_bot_token"
  export CHROMA_DB_DIR="./chromadb"
  export CHROMA_DB_URI="http://localhost:7700"
  export PII_REDACTION_ENABLED="true"
  export OPTOUT_REGISTRY_URL="https://example.com/opt-out-registry"
- Run the server:
  python -m discord_vector_db
- Optional: verify connectivity to ChromaDB and Discord by running basic CLI commands provided by the project (see CLI in README).
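To make the role of the environment variables concrete, here is a minimal sketch of how a Python process might read them at startup. The `Settings` class, the `load_settings` function, and every default value are assumptions for illustration; the project's own configuration code may differ.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the variables listed in the install steps."""
    discord_bot_token: str
    chroma_db_dir: str
    chroma_db_uri: Optional[str]
    pii_redaction_enabled: bool
    optout_registry_url: Optional[str]
    log_level: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from a mapping; defaults here are assumed, not documented."""
    token = env.get("DISCORD_BOT_TOKEN", "").strip()
    if not token:
        raise ValueError("DISCORD_BOT_TOKEN is not set")
    truthy = {"1", "true", "yes"}
    return Settings(
        discord_bot_token=token,
        chroma_db_dir=env.get("CHROMA_DB_DIR", "./chromadb"),
        chroma_db_uri=env.get("CHROMA_DB_URI") or None,
        pii_redaction_enabled=env.get("PII_REDACTION_ENABLED", "true").lower() in truthy,
        optout_registry_url=env.get("OPTOUT_REGISTRY_URL") or None,
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
    )
```

Passing the mapping as a parameter rather than reading `os.environ` directly keeps the function easy to exercise with a plain dictionary.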
Additional notes
Tips:
- Ensure the Discord bot has access to the channels you intend to scan.
- If PII redaction is enabled, test with sample messages to validate redaction rules.
- Use the opt-out registry to respect user preferences; ensure the registry URL is reachable.
- Monitor logs (LOG_LEVEL) to troubleshoot connectivity or processing errors.
- For large volumes, consider enabling checkpointing and circuit-breaker settings if available in the config.
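If the project's configuration does not expose checkpointing, the idea can be approximated outside it. The sketch below is an assumption-laden illustration rather than the project's mechanism: it records the last processed message ID per channel in a JSON file, and the file layout and function names are hypothetical.

```python
import json
import os
import tempfile
from typing import Dict


def load_checkpoint(path: str) -> Dict[str, str]:
    """Return {channel_id: last_processed_message_id}, or {} if no file exists."""
    try:
        with open(path, "r", encoding="utf-8") as fh:
            return json.load(fh)
    except FileNotFoundError:
        return {}


def save_checkpoint(path: str, state: Dict[str, str]) -> None:
    """Write atomically: dump to a temp file, then rename it over the target."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(state, fh)
    os.replace(tmp_path, path)  # readers never see a half-written file
```

A long-running fetch could call `save_checkpoint` after each batch and, on restart, use `load_checkpoint` to request only messages newer than the stored ID for each channel.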
Common issues:
- Invalid or missing DISCORD_BOT_TOKEN: check token and bot permissions.
- ChromaDB connection failures: verify CHROMA_DB_URI and that the service is running.
- File permission errors writing to CHROMA_DB_DIR: ensure write access to the directory.
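The issues above can be caught before the server starts with a small preflight script. This is a hedged sketch using only the standard library; the `preflight` function is hypothetical and is not part of the project. It checks the variable names shown earlier but does not contact Discord or ChromaDB.

```python
import os
import tempfile
from typing import List, Mapping


def preflight(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return human-readable problems; an empty list means no issue was found."""
    problems: List[str] = []

    if not env.get("DISCORD_BOT_TOKEN", "").strip():
        problems.append("DISCORD_BOT_TOKEN is missing or empty")

    uri = env.get("CHROMA_DB_URI", "")
    if uri and not uri.startswith(("http://", "https://")):
        problems.append(f"CHROMA_DB_URI does not look like an HTTP URL: {uri}")

    db_dir = env.get("CHROMA_DB_DIR")
    if db_dir:
        try:
            # Side effect: creates the directory if it does not exist yet.
            os.makedirs(db_dir, exist_ok=True)
            # Opening a temporary file proves the directory is writable.
            with tempfile.TemporaryFile(dir=db_dir):
                pass
        except OSError as exc:
            problems.append(f"cannot write to CHROMA_DB_DIR ({db_dir}): {exc}")

    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("PROBLEM:", problem)
```

A token that is present but invalid, or a bot that lacks channel permissions, can only be detected by an actual connection attempt, so this check complements the server logs rather than replacing them.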