Plug in a USB drive and run a modern LLM locally on any PC with a double-click.
No installation, no internet, no API keys, no cloud, no GPU, no admin rights required.
Local LLM Notepad is an open-source, offline, plug-and-play app for running local large language models. Drop the single bundled .exe onto a USB stick, walk up to any computer, and start chatting, brainstorming, or drafting documents.
🔌 Portable
Drop the one‑file EXE and your .gguf model onto a flash drive; run on any Windows PC without admin rights.
🪶 Clean UI
Two‑pane layout: type prompts below, watch token‑streamed answers above—no extra chrome.
🔍 Source-word underlining
Every word or number from your prompt is automatically bold-underlined in the model's reply. Ctrl + left-click an underlined word to view it in a separate window. Handy for fact-checking summaries, tables, or data extractions.
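The matching behind this feature can be sketched in a few lines of Python. This is a minimal illustration, not the app's actual implementation: it collects every word or number from the prompt, then wraps matching tokens in the reply with `**__…__**` markers as stand-ins for bold-underline.

```python
import re

def source_words(prompt: str) -> set[str]:
    # Collect every word or number from the prompt, case-insensitively.
    return {w.lower() for w in re.findall(r"\w+", prompt)}

def mark_reply(prompt: str, reply: str) -> str:
    # Wrap reply tokens that also appear in the prompt with markers
    # standing in for the UI's bold-underline styling.
    sources = source_words(prompt)

    def mark(m: re.Match) -> str:
        token = m.group(0)
        return f"**__{token}__**" if token.lower() in sources else token

    return re.sub(r"\w+", mark, reply)

print(mark_reply("Revenue was 42 million", "The revenue figure of 42 million is correct."))
# → The **__revenue__** figure of **__42__** **__million__** is correct.
```

Because matching is done per token, partial-word overlaps are left alone; only whole words and numbers get highlighted.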
💾 Save/Load chats
One‑click JSON export keeps conversations with the model portable alongside the EXE.
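Because the export is plain JSON, saved chats can be inspected or post-processed with any tooling. A quick sketch of reading and rewriting such a file follows; the field names (`role`, `content`) and filename are assumptions for illustration, not the app's documented schema:

```python
import json
import tempfile
import os

# Hypothetical export shape: a list of chat turns.
chat = [
    {"role": "user", "content": "Summarize the report."},
    {"role": "assistant", "content": "The report covers Q3 revenue."},
]

# Write the conversation out, then load it back, as a save/load round trip.
path = os.path.join(tempfile.gettempdir(), "chat_export_demo.json")
with open(path, "w", encoding="utf-8") as f:
    json.dump(chat, f, ensure_ascii=False, indent=2)

with open(path, encoding="utf-8") as f:
    restored = json.load(f)

print(restored == chat)  # → True
```

Keeping the exported files next to the EXE on the same drive is what makes the whole setup portable between machines.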
⚡ Llama.cpp inside
CPU‑only by default for max compatibility.
🎹 Hot‑keys
Ctrl + S to send, Ctrl + Z to stop, Ctrl + F to find, Ctrl + X to clear chat history, Ctrl + Mouse‑Wheel zoom, etc.
Download Local_LLM_Notepad-portable.exe from the Releases page.
Copy the EXE and a compatible GGUF model (e.g. gemma-3-1b-it-Q4_K_M.gguf) onto your USB.
Double‑click the EXE on any Windows computer. First launch caches the model into RAM; subsequent prompts stream instantly.
Need another model? Use File ▸ Select Model… and point to a different GGUF.
Every word or number you used in the prompt is bold-underlined in the LLM's answer.
Ctrl + click any underlined word to open a side window listing every prompt that contained it, useful for tracing sources.
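This kind of lookup can be implemented as an inverted index mapping each word to the prompts that contained it. The sketch below assumes that design; it is an illustration, not the app's actual code:

```python
import re
from collections import defaultdict

def build_index(prompts: list[str]) -> dict[str, list[str]]:
    # Map each lowercased word/number to the list of prompts containing it.
    index: dict[str, list[str]] = defaultdict(list)
    for prompt in prompts:
        # Use a set so a prompt is recorded once per word it contains.
        for word in {w.lower() for w in re.findall(r"\w+", prompt)}:
            index[word].append(prompt)
    return index

index = build_index(["Revenue was 42 million", "Headcount grew to 42"])
print(index["42"])       # → ['Revenue was 42 million', 'Headcount grew to 42']
print(index["revenue"])  # → ['Revenue was 42 million']
```

With such an index, a Ctrl + click only needs one dictionary lookup to populate the side window, no matter how long the chat history is.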
$ git clone https://github.com/runzhouye/Local_LLM_Notepad.git
$ cd Local_LLM_Notepad
$ python -m venv .venv && .venv\Scripts\activate
$ pip install -r requirements.txt
$ pyinstaller --onefile --noconsole --additional-hooks-dir=. main.py