kuon
KUON: a large language model-based voice assistant under development, currently focused on ease of use and quick onboarding. It supports selective conversation memory and the Model Context Protocol (MCP).
```shell
claude mcp add --transport stdio lissettecarlr-kuon python kuon.py \
  --env OPENAI_API_KEY="Your OpenAI API key (if using OpenAI models)" \
  --env KUON_CONFIG_PATH="Path to the KUON configuration file if needed"
```
How to use
KUON is a modular large-model voice assistant that supports both text and speech input, and can output either text or speech. The project splits functionality into independent components (ASR, TTS, dialogue model integration) so you can mix-and-match modules as needed. The system can connect to OpenAI-compatible chat models for dialogue, and it exposes a set of scripts and configuration files to run and test each component individually. Typical usage involves running the main server script to start the orchestrator, then using provided tests to verify speech input, speech-to-text, chat-model responses, and text-to-speech playback. The repository emphasizes flexibility: you can deploy ASR (Automatic Speech Recognition) separately, TTS (Text-to-Speech) separately, and point KUON at your deployed services via configuration files. You’ll also find guidance for different deployment targets and model backends.
To operate KUON, you can start the Python-based server (kuon.py) which coordinates the modules, and use the configuration to adjust whether voice output, text output, and initial audio input are enabled. For dialogue, KUON supports OpenAI-style chat completions via a configured endpoint and API key. The system also provides scripts to update ASR and TTS components and a test script to validate each subsystem individually (audio input, ASR, chat model, TTS, and playback).
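The selective-memory dialogue flow described above can be sketched roughly as follows. Note that `build_messages` and the `remember` flag are illustrative names for this sketch, not KUON's actual API; the output is the `messages` list an OpenAI-style chat completions endpoint expects:

```python
def build_messages(system_prompt, history, user_input):
    """Assemble an OpenAI-style messages list, keeping only the
    conversation turns flagged for selective memory."""
    messages = [{"role": "system", "content": system_prompt}]
    for turn in history:
        # Selective memory: skip turns the user chose not to keep.
        if turn.get("remember", False):
            messages.append({"role": turn["role"], "content": turn["content"]})
    messages.append({"role": "user", "content": user_input})
    return messages
```

The resulting list would then be sent to the configured endpoint (for example via an OpenAI-compatible client's `chat.completions.create`), with the reply optionally routed on to TTS for speech output.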
How to install
Prerequisites:
- Python 3.10+ installed (recommended)
- Basic Python tooling (pip, virtualenv optional)
- Optional: CUDA-capable GPU and PyTorch if you want GPU-accelerated models for ASR/TTS (depending on which components you use)
- Prepare a Python environment (recommended). Create a virtual environment and activate it:

  ```shell
  python -m venv venv
  source venv/bin/activate  # On Windows use: venv\Scripts\activate
  ```

- Install the required dependencies:

  ```shell
  pip install -r requirements.txt
  ```

- (Optional) Install PyTorch with CUDA support if you plan to use GPU-accelerated models:

  ```shell
  pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
  ```

- Retrieve or clone the submodules or dependent repositories referenced by KUON (ASR, TTS, etc.):
  - ASR repository: AutomaticSpeechRecognition
  - TTS repository: TextToSpeech

- Configure environment variables and settings:
  - Open kuon.py and/or the corresponding config files to adjust module endpoints (ASR, TTS, chat model), and set API keys if using OpenAI or other hosted models.
  - Create or edit the configuration file as needed (paths to ASR/TTS services, server URLs, and keys).

- Run the server:

  ```shell
  python kuon.py
  ```

- (Optional) Run the test script to verify components:

  ```shell
  python check.py
  ```
Notes:
- If you host ASR and TTS services separately, ensure their URLs and credentials are correctly set in the configuration files (e.g., kuonasr/config.yaml, kuontts/config.yaml).
- On Windows, you may need to adjust path separators and possibly install additional dependencies for voice playback (playsound).
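For separately hosted services, the configuration might look roughly like the sketch below. The keys shown (`url`, `api_key`, `voice`) are illustrative assumptions; check the actual kuonasr/config.yaml and kuontts/config.yaml in the repository for the real schema:

```yaml
# kuonasr/config.yaml (illustrative keys, not the project's actual schema)
asr:
  url: "http://localhost:9000/asr"   # URL of your deployed ASR service
  api_key: ""                        # credential, if the service requires one

# kuontts/config.yaml (illustrative keys)
tts:
  url: "http://localhost:9001/tts"
  voice: "default"
```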
Additional notes
Tips and common issues:
- Ensure your OpenAI API key or other chat model credentials are correctly configured; a missing key will prevent the dialogue component from functioning.
- When using online TTS/ASR services, confirm network access to the service endpoints and that the service URLs in config.yaml reflect the actual deployment locations.
- If voice input fails to trigger, adjust the microphone input channel and threshold in your configuration (as described in the README) to suit your hardware.
- For offline deployments, refer to the separate repositories (ASR and TTS) and update the kuon config to point to local models and endpoints.
- If modules are independently deployed, ensure consistent data formats and authentication methods across components to avoid compatibility issues.
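As a rough illustration of the threshold tuning mentioned above, voice triggering typically compares each audio frame's RMS energy against a configurable threshold. The function names and default value below are assumptions for this sketch, not KUON's actual implementation:

```python
import math
import struct

def frame_rms(frame: bytes) -> float:
    """RMS energy of a frame of 16-bit little-endian PCM samples."""
    count = len(frame) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack(f"<{count}h", frame)
    return math.sqrt(sum(s * s for s in samples) / count)

def is_speech(frame: bytes, threshold: float = 500.0) -> bool:
    """True when the frame is loud enough to count as voice input."""
    return frame_rms(frame) >= threshold
```

Raising the threshold makes triggering less sensitive (fewer false starts from background noise); lowering it helps with quiet microphones.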