Show HN: Offline RAG System Using Docker and Llama 3 (No Cloud APIs)

Original link: https://github.com/PhilYeh1212/Local-AI-Knowledge-Base-Docker-Llama3

## Offline RAG System – Black Friday Sale!

This project provides an out-of-the-box, **100% offline Retrieval-Augmented Generation (RAG)** system for interacting securely with your proprietary documents (PDF, TXT, Markdown). Keep your sensitive data private – no cloud services or API keys required. The system uses a microservices architecture with **Docker Compose** for one-command deployment. It runs **Ollama** (Llama 3 8B) for LLM inference, **mxbai-embed-large** for embeddings, and **ChromaDB** for local vector storage. A **Python & Streamlit** backend provides a context-aware chat interface.

**Key features:** GPU acceleration (CUDA support), smart document ingestion, conversation history, and simplified setup. Recommended hardware is 16GB+ RAM and an NVIDIA RTX 3060 (8GB VRAM) for best performance, though CPU-only mode also works.

**Get 15% off now with code BLACKFRIDAY!** The full package includes the complete source code, Docker configuration, and a setup guide – saving you weeks of development time. [Gumroad link](https://gumroad.com/)

Hacker News thread: PhilYeh, 1 day ago [flagged]

Alifatisk (1 day ago): Why post this here? It's just a README and a link to buy a product. The GitHub link feels misleading.

wkat4242 (1 day ago, reply): Yeah, GitHub is for open source. Misleading, yes.

Original article:

🔥 BLACK FRIDAY SALE: Get 15% OFF all source codes with code BLACKFRIDAY. Click here to apply discount automatically


Stop sending your sensitive engineering data to the cloud.

This project provides a production-grade, 100% offline RAG (Retrieval-Augmented Generation) architecture. It allows you to chat with your proprietary documents (PDF, TXT, Markdown) using a local LLM, ensuring absolute data privacy.


🏗️ System Architecture

This system is designed with a microservices architecture, fully containerized using Docker Compose for one-click deployment.

  • LLM Inference: Ollama (Running Meta Llama 3 8B)
  • Embeddings: mxbai-embed-large (State-of-the-art retrieval performance)
  • Vector Database: ChromaDB (Persistent local storage)
  • Backend/Frontend: Python + Streamlit (Optimized for RAG workflows)
  • Deployment: Docker Compose (Isolated environment)
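The actual Compose file ships with the paid bundle, so as an illustration only, a minimal `docker-compose.yml` for this kind of stack might look like the following (service names, ports, and volume paths are assumptions, not the project's real configuration):

```yaml
services:
  ollama:
    image: ollama/ollama:latest
    volumes:
      - ollama_data:/root/.ollama     # persist pulled models across restarts
    ports:
      - "11434:11434"

  chromadb:
    image: chromadb/chroma:latest
    volumes:
      - chroma_data:/chroma/chroma    # persistent local vector store
    ports:
      - "8000:8000"

  app:
    build: .                          # Python + Streamlit backend
    ports:
      - "8501:8501"
    environment:
      - OLLAMA_HOST=http://ollama:11434
      - CHROMA_HOST=chromadb
    depends_on:
      - ollama
      - chromadb

volumes:
  ollama_data:
  chroma_data:
```

Keeping Ollama and ChromaDB in separate containers with named volumes is what makes the "isolated environment" claim work: the app container can be rebuilt freely without re-downloading models or re-indexing documents.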

  • 🔒 100% Privacy: No data leaves your machine. No OpenAI API keys required. Zero monthly fees.
  • 🚀 GPU Acceleration: Native support for NVIDIA GPUs (CUDA) for lightning-fast inference.
  • 📂 Smart Ingestion: Automatically parses, chunks, and vectorizes PDF and text documents.
  • 💬 Context-Aware Chat: Remembers conversation history and retrieves relevant context from your knowledge base.
  • 🐳 One-Click Setup: No "dependency hell". Just run `docker-compose up -d`.
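The bundle's ingestion code is not public, but the "parses, chunks, and vectorizes" step generally starts with splitting documents into overlapping windows. A minimal sketch of such a chunker (the chunk size and overlap here are arbitrary illustrative values, not the project's actual parameters):

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 100) -> list[str]:
    """Split text into overlapping chunks so that context spanning a
    chunk boundary remains retrievable from at least one chunk."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break
    return chunks

# Each chunk would then be embedded (e.g. with mxbai-embed-large via Ollama)
# and stored in ChromaDB alongside its source-document metadata.
```

The overlap means a sentence that straddles a chunk boundary still appears whole in one of the two neighbouring chunks, which noticeably improves retrieval quality over non-overlapping splits.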

Click the image below to watch the system in action:

Watch the Demo

1. Chat Interface (Streamlit)

[Chat UI screenshot]

[Ingestion Process]

```mermaid
graph TD
    subgraph Docker_Container [🐳 Docker Containerized Environment]
        style Docker_Container fill:#e1f5fe,stroke:#01579b,stroke-width:2px,rx:10,ry:10

        UI["🖥️ Streamlit Web UI"]:::ui
        Backend["⚙️ Python RAG Backend"]:::code

        subgraph Local_AI [🧠 Local AI Engine]
            style Local_AI fill:#fff3e0,stroke:#ff6f00,stroke-width:2px
            Ollama["🦙 Ollama Service<br/>(Llama 3 Model)"]:::ai
            Embed["✨ Embedding Model<br/>(mxbai-embed-large)"]:::ai
        end

        DB[("🗄️ ChromaDB<br/>Vector Store")]:::db
    end

    User([👤 User]) -->|Upload PDF/Ask Question| UI
    UI <-->|API Request| Backend
    Backend <-->|Store/Retrieve Vectors| DB
    Backend <-->|Inference Request| Ollama
    Backend -->|Generate Embeddings| Embed

    classDef ui fill:#d1c4e9,stroke:#512da8,stroke-width:2px,color:black;
    classDef code fill:#c8e6c9,stroke:#2e7d32,stroke-width:2px,color:black;
    classDef db fill:#ffcdd2,stroke:#c62828,stroke-width:2px,color:black;
    classDef ai fill:#fff9c4,stroke:#fbc02d,stroke-width:2px,color:black;
```
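The "Store/Retrieve Vectors" edge in the diagram boils down to nearest-neighbour search over embedding vectors. The real system delegates this to the ChromaDB client; as a toy, dependency-free sketch of the underlying idea:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec: list[float], store: list[tuple[str, list[float]]], k: int = 3) -> list[str]:
    """Return the k chunk texts whose embeddings are closest to the query.
    `store` is a list of (chunk_text, embedding) pairs."""
    ranked = sorted(store,
                    key=lambda item: cosine_similarity(query_vec, item[1]),
                    reverse=True)
    return [text for text, _ in ranked[:k]]
```

The retrieved chunks are then pasted into the Llama 3 prompt as context, which is the whole trick behind RAG: the model never needs to have seen your documents at training time.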

💻 Hardware Requirements

To run this system smoothly with Llama 3 (8B), the following hardware is recommended:

  • OS: Windows 10/11 (WSL2) or Linux (Ubuntu)
  • RAM: 16GB+ System Memory
  • GPU: NVIDIA RTX 3060 (8GB VRAM) or higher recommended.
    • Note: The system can run on CPU-only mode, but inference will be slower.

📥 Get the Complete System

Building a stable RAG system from scratch takes weeks of configuration (handling Python dependencies, Vector DB connections, and Docker networking).

I have packaged the Full Source Code, Docker Configuration, and Setup Guide into a ready-to-deploy bundle.

📦 What's included in the Full Package?

  • Complete Source Code (Python)
  • docker-compose.yml (Production ready)
  • Embedding & Vectorization Logic
  • UI/UX Implementation
  • Premium Support Guide

👉 Download the System Here: Get it on Gumroad (Instant Access. One-time payment. Lifetime usage.)


👨‍💻 About the Author

Phil Yeh - Senior Automation & Systems Engineer. Specializing in Hardware-Software Integration, Industrial Automation, and Local AI Solutions.


Keywords: RAG, Llama 3, Ollama, Docker, Local AI, Private GPT, Knowledge Base, Python, Vector Database, ChromaDB, Source Code
