Get the Admin UI running in 2 minutes.
For a complete walkthrough of your first successful call (dialplan, transport selection, and verification), see:
```bash
# Clone repository
git clone https://github.com/hkjarral/Asterisk-AI-Voice-Agent.git
cd Asterisk-AI-Voice-Agent
```
```bash
# Run preflight with auto-fix (creates .env, generates JWT_SECRET)
sudo ./preflight.sh --apply-fixes
```

**Important:** Preflight creates your `.env` file and generates a secure `JWT_SECRET`. Always run this first!
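For reference, a secret of the strength preflight generates can be produced with Python's `secrets` module (a sketch; the script's actual generation method may differ):

```python
import secrets

# 32 random bytes, hex-encoded -> a 64-character secret,
# comparable in strength to what preflight writes into .env
jwt_secret = secrets.token_hex(32)
print(f"JWT_SECRET={jwt_secret}")
```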
```bash
# Start the Admin UI container
docker compose up -d --build admin-ui
```

Open in your browser:

- Local: http://localhost:3003
- Remote server: http://<server-ip>:3003
Default Login: admin / admin
Follow the Setup Wizard to configure your providers and make a test call.
⚠️ Security: The Admin UI is accessible on the network. Change the default password immediately and restrict port 3003 via firewall, VPN, or reverse proxy for production use.
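One common way to restrict access is to put the UI behind a reverse proxy with TLS and basic auth; a minimal nginx sketch (the server name, certificate paths, and htpasswd file are illustrative, not shipped with the project):

```nginx
server {
    listen 8443 ssl;
    server_name pbx.example.com;                 # illustrative
    ssl_certificate     /etc/ssl/admin-ui.crt;   # illustrative paths
    ssl_certificate_key /etc/ssl/admin-ui.key;

    auth_basic           "Admin UI";
    auth_basic_user_file /etc/nginx/.htpasswd;

    location / {
        proxy_pass http://127.0.0.1:3003;
        proxy_http_version 1.1;
        # Keep the UI's WebSocket log streaming working through the proxy
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
    }
}
```

With this in place, block direct access to port 3003 at the firewall so the proxy is the only way in.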
```bash
# Start ai-engine (required for health checks)
docker compose up -d --build ai-engine

# Check ai-engine health
curl http://localhost:15000/health
# Expected: {"status":"healthy"}
```
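The `/health` endpoint returns a small JSON document, so a startup script can gate on it; a minimal sketch, assuming only the `status` field shown above:

```python
import json

def is_healthy(payload: str) -> bool:
    """Return True when the engine reports itself healthy."""
    try:
        return json.loads(payload).get("status") == "healthy"
    except json.JSONDecodeError:
        return False  # engine still starting, or response is not JSON yet

# The documented response body from http://localhost:15000/health:
print(is_healthy('{"status":"healthy"}'))  # True
print(is_healthy(""))                      # False
```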
```bash
# View logs for any errors
docker compose logs ai-engine | tail -20
```

The wizard will generate the necessary dialplan configuration for your Asterisk server.
Transport selection is configuration-dependent (not strictly “pipelines vs full agents”). Use the validated matrix in:
For users who prefer the command line or need headless setup.
```bash
./install.sh
agent quickstart
```

```bash
# Configure environment
cp .env.example .env
# Edit .env with your API keys

# Start services
docker compose up -d
```

Add this to your FreePBX `extensions_custom.conf`:
```ini
[from-ai-agent]
exten => s,1,NoOp(Asterisk AI Voice Agent v4.5.3)
 same => n,Stasis(asterisk-ai-voice-agent)
 same => n,Hangup()
```
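The `[from-ai-agent]` context only runs once a call is routed into it; a hypothetical way to reach it for testing from an internal phone (the extension number and context are illustrative):

```ini
; In extensions_custom.conf — dial 7000 internally to reach the agent
[from-internal-custom]
exten => 7000,1,NoOp(Test call to AI agent)
 same => n,Goto(from-ai-agent,s,1)
```

In production you would point an inbound route (DID) at the context instead.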
Health check:

```bash
curl http://localhost:15000/health
```

View logs:

```bash
docker compose logs -f ai-engine
```

Latest Updates
- Full Call Logging: Every call saved with conversation history, timing, and outcome
- Per-Call Debugging: Review transcripts, tool executions, and errors from Admin UI
- Search & Filter: Find calls by caller, provider, context, or date range
- Export: Download call data as CSV or JSON
- Immediate Interruption: Agent audio stops instantly when caller speaks
- Provider-Owned Turn-Taking: Full agents (Google, Deepgram, OpenAI, ElevenLabs) handle VAD natively
- Platform Flush: Local playback clears immediately on interruption signal
- Transport Parity: Works with both ExternalMedia RTP and AudioSocket
- Faster Whisper: High-accuracy STT backend with GPU acceleration
- MeloTTS: New neural TTS option for local pipelines
- Model Hot-Swap: Switch models via Dashboard without container restart
- External Tools Framework: Connect AI agents to external services via Model Context Protocol
- Admin UI Config: Configure MCP servers from the web interface
- Remote Endpoint Pinning: Lock RTP streams to prevent audio hijacking
- Allowlist Support: Restrict allowed remote hosts for ExternalMedia
- Cross-Talk Prevention: SSRC-based routing ensures call isolation
- `local_hybrid` Default: Privacy-focused pipeline is now the out-of-box default
- Pipeline-Aware Readiness: Health probes correctly reflect pipeline component status
Previous Versions
- 🌍 Pre-flight Script: System compatibility checker with auto-fix mode.
- 🔧 Admin UI Fixes: Models page, providers page, dashboard improvements.
- 🛠️ Developer Experience: Code splitting, ESLint + Prettier.
- 🎤 New STT Backends: Kroko ASR, Sherpa-ONNX.
- 🔊 Kokoro TTS: High-quality neural TTS.
- 🔄 Model Management: Dynamic backend switching from Dashboard.
- 📚 Documentation: LOCAL_ONLY_SETUP.md guide.
- 🖥️ Admin UI v1.0: Modern web interface (http://localhost:3003).
- 🎙️ ElevenLabs Conversational AI: Premium voice quality provider.
- 🎵 Background Music: Ambient music during AI calls.
- 🔧 Complete Tool Support: Works across ALL pipeline types.
- 📚 Documentation Overhaul: Reorganized structure.
- 💬 Discord Community: Official server integration.
- 🤖 Google Live API: Gemini 2.0 Flash integration.
- 🚀 Interactive Setup: `agent quickstart` wizard.
- 🔧 Tool Calling System: Transfer calls, send emails.
- 🩺 Agent CLI Tools: `doctor`, `troubleshoot`, `demo`.
| Feature | Benefit |
|---|---|
| Asterisk-Native | Works directly with your existing Asterisk/FreePBX - no external telephony providers required. |
| Truly Open Source | MIT licensed with complete transparency and control. |
| Modular Architecture | Choose cloud, local, or hybrid - mix providers as needed. |
| Production-Ready | Battle-tested baselines with Call History-first debugging. |
| Cost-Effective | Local Hybrid costs ~$0.001-0.003/minute (LLM only). |
| Privacy-First | Keep audio local while using cloud intelligence. |
- **OpenAI Realtime** (Recommended for Quick Start)
  - Modern cloud AI with natural conversations (<2s response).
  - Config: `config/ai-agent.golden-openai.yaml`
  - Best for: Enterprise deployments, quick setup.
- **Deepgram Voice Agent** (Enterprise Cloud)
  - Advanced Think stage for complex reasoning (<3s response).
  - Config: `config/ai-agent.golden-deepgram.yaml`
  - Best for: Deepgram ecosystem, advanced features.
- **Google Live API** (Multimodal AI)
  - Gemini Live (Flash) with multimodal capabilities (<2s response).
  - Config: `config/ai-agent.golden-google-live.yaml`
  - Best for: Google ecosystem, advanced AI features.
- **ElevenLabs Agent** (Premium Voice Quality)
  - ElevenLabs Conversational AI with premium voices (<2s response).
  - Config: `config/ai-agent.golden-elevenlabs.yaml`
  - Best for: Voice quality priority, natural conversations.
- **Local Hybrid** (Privacy-Focused)
  - Local STT/TTS + Cloud LLM (OpenAI). Audio stays on-premises.
  - Config: `config/ai-agent.golden-local-hybrid.yaml`
  - Best for: Audio privacy, cost control, compliance.
Run your own local LLM using Ollama - perfect for privacy-focused deployments:
```yaml
# In ai-agent.yaml
active_pipeline: local_ollama
```

Features:
- No API key required - fully self-hosted on your network
- Tool calling support with compatible models (Llama 3.2, Mistral, Qwen)
- Local Vosk STT + Your Ollama LLM + Local Piper TTS
- Complete privacy - all processing stays on-premises
Requirements:
- Mac Mini, gaming PC, or server with Ollama installed
- 8GB+ RAM (16GB+ recommended for larger models)
- See docs/OLLAMA_SETUP.md for setup guide
Recommended Models:
| Model | Size | Tool Calling |
|---|---|---|
| `llama3.2` | 2GB | ✅ Yes |
| `mistral` | 4GB | ✅ Yes |
| `qwen2.5` | 4.7GB | ✅ Yes |
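Tool calling with these models works by including a `tools` array in the chat request; a sketch of the payload shape for Ollama's `/api/chat` endpoint (the `transfer_call` tool definition is illustrative, not the project's actual schema):

```python
import json

# Illustrative tool definition in the OpenAI-style function schema
# that Ollama accepts for tool-capable models.
transfer_tool = {
    "type": "function",
    "function": {
        "name": "transfer_call",
        "description": "Transfer the caller to an extension or queue",
        "parameters": {
            "type": "object",
            "properties": {"destination": {"type": "string"}},
            "required": ["destination"],
        },
    },
}

# Request body a client would POST to http://localhost:11434/api/chat
request = {
    "model": "llama3.2",
    "messages": [{"role": "user", "content": "Transfer me to sales"}],
    "tools": [transfer_tool],
    "stream": False,
}
print(json.dumps(request, indent=2))
```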
- Tool Calling System: AI-powered actions (transfers, emails) work with any provider.
- Agent CLI Tools: `doctor`, `troubleshoot`, `demo`, `init` commands.
- Modular Pipeline System: Independent STT, LLM, and TTS provider selection.
- Dual Transport Support: AudioSocket and ExternalMedia RTP (see Transport Compatibility matrix).
- High-Performance Architecture: Separate `ai-engine` and `local-ai-server` containers.
- Observability: Built-in Call History for per-call debugging + optional `/metrics` scraping.
- State Management: SessionStore for centralized, typed call state.
- Barge-In Support: Interrupt handling with configurable gating.
Modern web interface for configuration and system management.
Quick Start:

```bash
docker compose up -d admin-ui
# Access at: http://localhost:3003
# Login: admin / admin (change immediately!)
```

Key Features:
- Setup Wizard: Visual provider configuration.
- Dashboard: Real-time system metrics and container status.
- Live Logs: WebSocket-based log streaming.
- YAML Editor: Monaco-based editor with validation.
Experience our production-ready configurations with a single phone call:
Dial: (925) 736-6718
- Press 5 → Google Live API (Multimodal AI with Gemini 2.0)
- Press 6 → Deepgram Voice Agent (Enterprise cloud with Think stage)
- Press 7 → OpenAI Realtime API (Modern cloud AI, most natural)
- Press 8 → Local Hybrid Pipeline (Privacy-focused, audio stays local)
- Press 9 → ElevenLabs Agent (Santa voice with background music)
- Press 10 → Fully Local Pipeline (100% on-premises, CPU-based)
Your AI agent can perform real-world telephony actions through tool calling.
Caller: "Transfer me to the sales team"
Agent: "I'll connect you to our sales team right away."
[Transfer to sales queue with queue music]
Supported Destinations:
- Extensions: Direct SIP/PJSIP endpoint transfers.
- Queues: ACD queue transfers with position announcements.
- Ring Groups: Multiple agents ring simultaneously.
- Cancel Transfer: "Actually, cancel that" (during ring).
- Hangup Call: Ends call gracefully with farewell.
- Voicemail: Routes to voicemail box.
- Automatic Call Summaries: Admins receive full transcripts and metadata.
- Caller-Requested Transcripts: "Email me a transcript of this call."
| Tool | Description | Status |
|---|---|---|
| `transfer` | Transfer to extensions, queues, or ring groups | ✅ |
| `cancel_transfer` | Cancel in-progress transfer (during ring) | ✅ |
| `hangup_call` | End call gracefully with farewell message | ✅ |
| `leave_voicemail` | Route caller to voicemail extension | ✅ |
| `send_email_summary` | Auto-send call summaries to admins | ✅ |
| `request_transcript` | Caller-initiated email transcripts | ✅ |
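Under the hood, a tool call from the provider reduces to name-based dispatch; a minimal sketch of that pattern (the handler bodies are illustrative stand-ins for real telephony actions):

```python
import json

def transfer(args: dict) -> str:
    return f"Transferring to {args['destination']}"

def hangup_call(args: dict) -> str:
    return "Call ended with farewell"

# Map tool names (as in the tools table) to handler functions.
TOOLS = {"transfer": transfer, "hangup_call": hangup_call}

def dispatch(tool_call: str) -> str:
    """Execute a provider tool call given as a JSON string."""
    call = json.loads(tool_call)
    handler = TOOLS.get(call["name"])
    if handler is None:
        return f"Unknown tool: {call['name']}"
    return handler(call.get("arguments", {}))

print(dispatch('{"name": "transfer", "arguments": {"destination": "sales"}}'))
# Transferring to sales
```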
Production-ready CLI for operations and setup.
Installation:
```bash
curl -sSL https://raw.githubusercontent.com/hkjarral/Asterisk-AI-Voice-Agent/main/scripts/install-cli.sh | bash
```

Commands:

```bash
agent quickstart       # Interactive setup wizard
agent dialplan         # Generate dialplan snippets
agent config validate  # Validate configuration
agent doctor --fix     # System health check
agent troubleshoot     # Analyze specific call
agent demo             # Demo features
```

Example `.env`:
```bash
OPENAI_API_KEY=sk-your-key-here
DEEPGRAM_API_KEY=your-key-here
ASTERISK_ARI_USERNAME=asterisk
ASTERISK_ARI_PASSWORD=your-password
```

The engine exposes Prometheus-format metrics at `http://<engine-host>:15000/metrics`.
Per-call debugging is handled via Admin UI → Call History.
Two-container architecture for performance and scalability:
- `ai-engine` (lightweight orchestrator): Connects to Asterisk via ARI, manages call lifecycle.
- `local-ai-server` (optional): Runs local STT/LLM/TTS models (Vosk, Sherpa, Kroko, Piper, Kokoro, llama.cpp).
```mermaid
graph LR
    A[Asterisk Server] <-->|ARI, RTP| B[ai-engine]
    B <-->|API| C[AI Provider]
    B <-->|WS| D[local-ai-server]
    style A fill:#f9f,stroke:#333,stroke-width:2px
    style B fill:#bbf,stroke:#333,stroke-width:2px
    style C fill:#bfb,stroke:#333,stroke-width:2px
    style D fill:#fbf,stroke:#333,stroke-width:2px
```
| Requirement | Details |
|---|---|
| Architecture | x86_64 (AMD64) only |
| OS | Linux with systemd |
| Supported Distros | Ubuntu 20.04+, Debian 11+, RHEL/Rocky/Alma 8+, Fedora 38+, Sangoma Linux |
Note: ARM64 (Apple Silicon, Raspberry Pi) is not currently supported. See Supported Platforms for the full compatibility matrix.
| Type | CPU | RAM | Disk |
|---|---|---|---|
| Cloud (OpenAI/Deepgram) | 2+ cores | 4GB | 1GB |
| Local Hybrid | 4+ cores | 8GB+ | 2GB |
- Docker + Docker Compose v2
- Asterisk 18+ with ARI enabled
- FreePBX (recommended) or vanilla Asterisk
The `preflight.sh` script handles initial setup:

- Seeds `.env` from `.env.example` with your settings
- Prompts for the Asterisk config directory location
- Sets `ASTERISK_UID`/`ASTERISK_GID` to match host permissions (fixes media access issues)
- Re-running preflight often resolves permission problems
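The UID/GID matching matters because the container reads and writes media files as `ASTERISK_UID`; a quick way to inspect file ownership when debugging (the real path to check would be a recording or sound file — the temp file here is just for demonstration):

```python
import os
import tempfile

def owner_ids(path: str) -> tuple[int, int]:
    """Return the (uid, gid) owning a file, to compare with ASTERISK_UID/GID."""
    st = os.stat(path)
    return st.st_uid, st.st_gid

# Demonstrate on a temp file we own; in practice point this at a media file
# under your Asterisk spool directory (illustrative target).
with tempfile.NamedTemporaryFile() as f:
    uid, gid = owner_ids(f.name)
    print(uid == os.getuid())  # True — this process created the file
```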
Contributions are welcome! Please see our Contributing Guide.
This project is licensed under the MIT License. See the LICENSE file for details.
If you find this project useful, please give it a ⭐️ on GitHub!
