Llama-Scan: Convert PDFs to Text with Local LLMs

Original link: https://github.com/ngafar/llama-scan

Hacker News discussion: Llama-Scan: Convert PDFs to Text with Local LLMs (github.com/ngafar). 17 points, posted by nawazgafar 1 hour ago, 3 comments.

david_draco (6 minutes ago): Looking at the code, this converts the PDF pages to images and then transcribes each image. I would have expected a pdftotext post-processor. The complexity of PDFs, I guess...

firesteelrain (1 minute ago, in reply to david_draco): There is a very popular Python module called ocrmypdf. I used it to help my HOA run OCR on old PDFs. https://github.com/ocrmypdf/OCRmyPDF No LLM needed.

roscas (24 minutes ago): Almost perfect; it only missed a few symbols in the PDF I tested. But this is something I will definitely use. Thanks.

Original text

A tool for converting PDFs to text files using Ollama.

Convert PDFs to text files locally, no token costs.
Use the latest multimodal models supported by Ollama.
Turn images and diagrams into detailed text descriptions.
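
To make the idea of a multimodal model turning a page image into text more concrete, here is a minimal sketch of a request to Ollama's local REST endpoint (`/api/generate`), which accepts base64-encoded images alongside a prompt. The helper names `build_request` and `transcribe`, the prompt wording, and the default address are assumptions made for illustration; this is not taken from Llama-Scan's source.

```python
import base64
import json
import urllib.request

# Assumption: Ollama is listening on its default local address.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(image_bytes: bytes, model: str = "qwen2.5vl:latest") -> dict:
    """Build the JSON body for one page image (hypothetical helper)."""
    return {
        "model": model,
        # Illustrative prompt, not the one Llama-Scan actually uses.
        "prompt": "Transcribe all text on this page. Describe any images or diagrams.",
        # Ollama expects each image as a base64-encoded string.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        # Ask for a single JSON reply instead of a token stream.
        "stream": False,
    }


def transcribe(image_bytes: bytes, model: str = "qwen2.5vl:latest") -> str:
    """Send one image to a locally running Ollama and return the model's text."""
    body = json.dumps(build_request(image_bytes, model)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Calling `transcribe(open("page.png", "rb").read())` would require Ollama to be running with the model already pulled; `build_request` on its own has no such dependency.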

Requirements:

Python 3.10+
Ollama installed and running locally

Installing Ollama and the Default Model

Install Ollama
Pull the default model:

ollama run qwen2.5vl:latest

Install using pip:

pip install llama-scan

or uv:

uv tool install llama-scan

Basic usage:

llama-scan path/to/your/file.pdf

--output, -o: Output directory (default: "output")
--model, -m: Ollama model to use (default: "qwen2.5vl:latest")
--keep-images, -k: Keep the intermediate image files (default: False)
--width, -w: Width of the resized images (0 to skip resizing; default: 0)
--start, -s: Start page number (default: 0)
--end, -e: End page number (default: 0)
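
For readers who want to see how such an option set maps onto a command-line parser, the following is a sketch using Python's standard `argparse` module with the flags and defaults listed above. It is an illustration only: the positional argument name `pdf` and the function `build_parser` are assumptions, and Llama-Scan's real CLI may be built differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Mirror the documented options and defaults (hypothetical re-creation)."""
    parser = argparse.ArgumentParser(prog="llama-scan")
    parser.add_argument("pdf", help="path to the PDF file")
    parser.add_argument("--output", "-o", default="output", help="output directory")
    parser.add_argument("--model", "-m", default="qwen2.5vl:latest", help="Ollama model to use")
    # store_true makes the flag default to False, as documented.
    parser.add_argument("--keep-images", "-k", action="store_true",
                        help="keep the intermediate image files")
    parser.add_argument("--width", "-w", type=int, default=0,
                        help="width of the resized images (0 skips resizing)")
    parser.add_argument("--start", "-s", type=int, default=0, help="start page number")
    parser.add_argument("--end", "-e", type=int, default=0, help="end page number")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```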

Process specific pages with custom width:

llama-scan document.pdf --start 1 --end 5 --width 1000
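
The README does not spell out how `--start`, `--end`, and `--width` are interpreted, so the sketch below states its assumptions explicitly: page numbers are 1-based and inclusive, 0 means "no limit", and resizing to a target width preserves the aspect ratio. Both function names are hypothetical, and the tool's actual behaviour may differ.

```python
def select_pages(total: int, start: int = 0, end: int = 0) -> list[int]:
    """Return the page numbers to process.

    Assumed semantics (not confirmed by the README): pages are 1-based and
    inclusive, and 0 means "no limit" on that side of the range.
    """
    first = start if start > 0 else 1
    last = min(end, total) if end > 0 else total
    return list(range(first, last + 1))


def resized(size: tuple[int, int], width: int = 0) -> tuple[int, int]:
    """Scale (width, height) to the target width, keeping the aspect ratio.

    A width of 0 skips resizing, matching the documented default.
    """
    w, h = size
    if width <= 0:
        return size
    return width, round(h * width / w)


if __name__ == "__main__":
    print(select_pages(total=10, start=1, end=5))
    print(resized((2000, 3000), width=1000))
```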

Use a different Ollama model:

llama-scan document.pdf --model qwen2.5vl:3b
