Debug Mode for LLMs in vLLora

原始链接: https://vllora.dev/blog/debug-mode/

## vLLora Introduces Debug Mode for LLM Requests

Developing complex LLM applications such as agents and RAG pipelines is often hampered by the models' "black box" nature. When an LLM fails to behave as expected, it is hard to pinpoint *why*: the cause may be an unseen prompt mutation, a parameter change, or incorrectly packaged data.

vLLora's new **Debug Mode** addresses this by pausing every LLM request *before* it is sent, providing full visibility and editability. Users can inspect the complete request payload, including messages, parameters, tool definitions, and headers, and modify it directly.

This "pause, inspect, edit, continue" workflow lets developers quickly identify and fix problems such as silent tool-call failures, context overload, and state drift in multi-step processes. Debug Mode is especially valuable for agents, where errors accumulate across many steps and make root-cause analysis difficult. By offering direct control over LLM inputs, Debug Mode streamlines development, shortens debugging time, and keeps application behavior predictable.

vLLora (vllora.dev) is a new tool for debugging large language model (LLM) applications. Originally licensed under Apache 2.0, parts of the project now use a "fair-code" license that applies specifically to the local debugging UI/tooling; it is intended to prevent resale, not to restrict free use. The core embeddable Rust crate remains Apache 2.0. The developer has clarified that paid features will ship as a separate cloud service, avoiding licensing concerns around core functionality.

While some commenters questioned whether vLLora is necessary, arguing that LLM debugging can be done with prompts and existing AI tools, the creator maintains that a dedicated tool is essential. vLLora focuses on inspecting "agent data flow" (prompts, outputs, and tool interactions), providing visibility into LLM runtime cost, timing, and behavior. It also supports model modification and fine-tuning (involving LoRA, hence the name). It is positioned as a complement to traditional debugging methods, offering a specialized approach for complex agent workflows.

Original Article

LLMs behave like black boxes. You send them a request, hope the prompt is right, hope your agent didn't mutate it, hope the framework packaged it correctly — and then hope the response makes sense. In simple one-shot queries this usually works fine. But when you're building agents, tools, multi-step workflows, or RAG pipelines, it becomes very hard to see what the model is actually receiving. A single unexpected message, parameter, or system prompt change can shift the entire run.

Today we're introducing Debug Mode for LLM requests in vLLora, which makes this visible and editable.

Here’s what debugging looks like in practice:

Debugging an LLM request using Debug Mode

vLLora now supports Debug Mode for LLM requests. When Debug Mode is enabled, vLLora inserts a breakpoint on every outgoing LLM request: each request pauses before it reaches the model, so you can inspect it, edit it, or continue execution.

You can:

  • Inspect the exact request
  • Edit anything
  • Continue execution normally

This brings a familiar software-engineering workflow ("pause -> inspect -> edit -> continue") to LLM development.
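The workflow can be sketched as a simple gate that holds a request, optionally applies an edit, and then releases it. This is an illustrative sketch only; `debug_gate` and its signature are hypothetical and not part of the vLLora API.

```python
import copy

def debug_gate(payload, edit=None):
    """Hypothetical breakpoint: hold a request, optionally edit it, release it."""
    paused = copy.deepcopy(payload)   # "pause": snapshot the request as-is
    if edit is not None:
        paused = edit(paused)         # "edit": modify the held copy
    return paused                     # "continue": release it downstream

# The original request your application built (OpenAI-style shape, assumed):
request = {"model": "gpt-4o", "messages": [{"role": "user", "content": "hi"}]}

# At the breakpoint, pin the temperature for this one request only:
released = debug_gate(request, edit=lambda p: {**p, "temperature": 0.0})
```

Note that the application's original `request` object is untouched; only the released copy carries the edit, mirroring the one-off nature of breakpoint edits.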

Why We Built This

If you've built anything beyond a simple chat interface, you've likely hit one of these:

  • Silent tool-call failures (wrong name / bad params / malformed JSON)
  • Overloaded or corrupted context / RAG input leading to hallucination or truncation
  • Error accumulation and state drift in long or multi-step workflows
  • Lack of visibility: standard logs rarely show the actual request sent to the model

It is difficult to fix these issues without proper observability. Debug Mode changes that.

What Happens When a Request Pauses

Here's what it looks like when vLLora intercepts a request right before it's sent:

Paused request example

You get a real-time snapshot of:

  • The selected model
  • Full message array (system, user, assistant)
  • Parameters like temperature or max tokens
  • Any tool definitions
  • Any extra fields and headers your framework injected

This is the full request payload your application is about to send — not what you assume it's sending.
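A paused request might look like the following snapshot. The field names follow the common OpenAI-style chat-completions shape, and every value here is illustrative, not output captured from vLLora.

```python
# Hypothetical snapshot of an intercepted request, shown as a Python dict.
paused_request = {
    "model": "gpt-4o-mini",                       # the selected model
    "messages": [                                  # full message array
        {"role": "system", "content": "You are a helpful agent."},
        {"role": "user", "content": "Summarize the last tool output."},
    ],
    "temperature": 0.2,                            # sampling parameters
    "max_tokens": 512,
    "tools": [                                     # tool definitions, if any
        {"type": "function",
         "function": {"name": "search_docs",
                      "parameters": {"type": "object", "properties": {}}}},
    ],
    "headers": {"x-framework": "my-agent-lib/1.2"},  # framework-injected extras
}
```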

Edit Anything

Click Edit and the payload becomes modifiable:

Edit Request modal with JSON editor

You can adjust:

  • Message content
  • System prompts
  • Model name
  • Parameters
  • Tool definitions
  • Metadata

This affects only the current request. Your application code stays untouched.

It's a fast way to validate fixes, test ideas, and confirm what the agent should have sent.
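The one-off nature of an edit can be modeled as a deep copy: the edited payload goes to the model for this request, while the application's own state is untouched. The payload shape below is an assumption for illustration.

```python
import copy

# What the application built (values are illustrative):
original = {
    "model": "gpt-4o-mini",
    "temperature": 0.9,
    "messages": [{"role": "system", "content": "You are terse."}],
}

# Edit only this request: deterministic sampling, extra system instruction.
edited = copy.deepcopy(original)
edited["temperature"] = 0.0
edited["messages"][0]["content"] = "You are terse. Always cite sources."
```

Because `edited` is a deep copy, the next request the application builds still starts from its own unmodified state.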

Continue the Workflow

Continue Workflow

When you click Continue, vLLora:

  1. Sends your edited request to the model
  2. Receives the real response
  3. Passes it back to your application
  4. Resumes the workflow as if nothing unusual happened

After you click Continue, the workflow proceeds using the response from your edited request. The agent treats it the same way it would treat any normal response from the model.
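The four steps above amount to a pass-through: send the edited request, get the real response, and hand it back unchanged. The sketch below is hypothetical; `continue_request` and the stub provider are not the vLLora implementation.

```python
def fake_model(request):
    # Stand-in for the real provider call; returns an OpenAI-style response.
    return {"choices": [{"message": {"role": "assistant",
                                     "content": f"response for {request['model']}"}}]}

def continue_request(edited_request, send=fake_model):
    # Steps 1-2: send the edited request, receive the real response.
    response = send(edited_request)
    # Steps 3-4: return it to the application unchanged; the workflow
    # resumes as if this were an ordinary model response.
    return response

resp = continue_request({"model": "gpt-4o-mini", "messages": []})
```

From the application's point of view, `resp` is indistinguishable from a response to the unedited request.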

Why This Matters for Agents

Agents are long-running chains of decisions. Each step can depend on the previous one, and each step can affect the next. Once you're 15 steps deep, you might not know whether:

  • The prompt changed
  • A system message was overwritten
  • A parameter was set differently than expected
  • The context blew up
  • A tool schema got mutated

With Debug Mode:

  • You catch drift early
  • You see exactly what the model receives
  • You fix issues in seconds
  • You avoid rerunning long multi-step workflows
  • You test prompt or parameter changes instantly

For deep agents, debugging becomes 10x easier.

Closing Thoughts

Debugging LLM systems has been mostly tedious. Debug Mode gives you a clear view into what’s happening and a way to correct issues as they occur.

If you need to understand or fix what an agent is sending, this is the most direct way to do it.

Read the docs: Debug Mode

Try it locally: Quickstart
