Tree-sitter vs. Language Servers

Original link: https://lambdaland.org/posts/2026-01-21_tree-sitter_vs_lsp/

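## Tree-sitter and Language Servers: Summary

Tree-sitter and language servers are both tools that enhance text editors, but they work differently. **Tree-sitter** is a *fast parser generator*. Given a definition of a language, it creates a program that parses code even when syntax errors are present, which is essential for responsive syntax highlighting while you type. It also lets you query the structure of the code through a dedicated query language, giving analysis that is more robust than simple regular expressions.

**Language servers**, by contrast, *analyze code and provide semantic information to the editor* through the Language Server Protocol (LSP). This standardized communication avoids the need for a separate analyzer for every language/editor combination. They offer features such as go-to-definition and code completion, drawing on the language's runtime and compiler for accuracy. Language servers *can* handle syntax highlighting, but doing so may be slower and more complex than Tree-sitter's approach. For now, Tree-sitter remains the first choice for many, offering faithful, high-performance highlighting. Ultimately they serve different needs: Tree-sitter excels at parsing and structural analysis, while language servers provide deeper semantic understanding.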

Hacker News discussion: Tree-sitter vs. Language Servers (lambdaland.org), submitted by ashton314

tetris11: I like tree-sitter+eglot, but several of the languages/schemes I use simply have no parser:
> pacman -Ssq tree-sitter
tree-sitter tree-sitter-bash tree-sitter-c tree-sitter-cli tree-sitter-javascript tree-sitter-lua tree-sitter-markdown tree-sitter-python tree-sitter-query tree-sitter-rust tree-sitter-vim tree-sitter-vimdoc
What about R, YAML, Golang, and several others?

taeric (in reply): Odd, does yaml-ts-mode exist? Did they change how parsers are obtained?

Original text

21 Jan 2026

I got asked a good question today: what is the difference between Tree-sitter and a language server? I don’t understand how either of these tools works in depth, so I’m just going to explain from an observable, pragmatic point of view.

Tree-sitter

Tree-sitter is a parser generator. What this means is that you can hand Tree-sitter a description for a programming language and it will create a program that will parse that language for you. What’s special about Tree-sitter is that it a.) is fast, and b.) can tolerate syntax errors in the input. These two properties make Tree-sitter ideal for creating syntax highlighting engines in text editors. When you’re editing a program, most of the time the program will be in a syntactically invalid state. During that time, you don’t want your colors changing or just outright breaking while you’re typing. Naïve regex-based syntax highlighters frequently suffer from this issue.
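
To make the failure mode concrete, here is a minimal sketch in Python. It is a toy, not Tree-sitter's actual error-recovery algorithm: the function names and the sample input are invented for illustration, and the "tolerant" tokenizer only knows about double-quoted strings. It shows how a single regex mis-pairs quotes when one string is still being typed, while a tokenizer that confines the error to one line keeps the rest of the buffer stable.

```python
import re

# Naive approach: one regex that only recognizes complete string literals.
STRING_RE = re.compile(r'"[^"]*"')


def naive_strings(source: str) -> list[str]:
    """Return every span the naive regex believes is a string literal."""
    return STRING_RE.findall(source)


def tolerant_tokens(source: str) -> list[tuple[str, str]]:
    """Tokenize line by line. An unterminated string becomes an ERROR token
    that stops at the end of its line, so later lines are unaffected."""
    tokens = []
    for line in source.splitlines():
        i = 0
        while i < len(line):
            if line[i] == '"':
                end = line.find('"', i + 1)
                if end == -1:
                    # No closing quote yet: mark the rest of the line and move on.
                    tokens.append(("ERROR", line[i:]))
                    break
                tokens.append(("string", line[i:end + 1]))
                i = end + 1
            else:
                j = line.find('"', i)
                if j == -1:
                    j = len(line)
                tokens.append(("code", line[i:j]))
                i = j
    return tokens


# The user is halfway through typing the first string.
half_typed = 'x = "abc\ny = "def"\n'

print(naive_strings(half_typed))    # the regex pairs quotes across the line break
print(tolerant_tokens(half_typed))  # the second line is still tokenized correctly
```

The naive version treats `"abc\ny = "` as one string and leaves `def` uncolored, which is the kind of flicker described above.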

Tree-sitter also provides a query language where you can make queries against the parse tree. I use this in the Emacs package I’m trying to develop to add Typst support to the Citar citation/bibliography tool: I can ask Tree-sitter to find a particular syntax object; it is safer and more robust than using a regular expression because it can do similar parsing to the Typst engine itself.
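
The difference between querying a parse tree and matching text can be sketched with Python's built-in `ast` module standing in for a Tree-sitter parse tree. This is an analogy under stated assumptions: it does not use Tree-sitter or its query language, the sample source and function names are made up, and unlike Tree-sitter, `ast.parse` rejects syntactically invalid input.

```python
import ast
import re

SOURCE = '''
# remember to pop the queue
message = "call pop() later"
item = stack.pop()
'''


def regex_hits(source: str) -> list[int]:
    """Line numbers where the text `pop(` appears anywhere, code or not."""
    return [source.count("\n", 0, m.start()) + 1
            for m in re.finditer(r"pop\(", source)]


def tree_hits(source: str) -> list[int]:
    """Line numbers of actual method calls named `pop` in the syntax tree."""
    tree = ast.parse(source)
    return [node.lineno
            for node in ast.walk(tree)
            if isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "pop"]


print(regex_hits(SOURCE))  # includes the occurrence inside the string literal
print(tree_hits(SOURCE))   # only the real call
```

The regex cannot tell a call from the same characters inside a string; the tree-based search can, because it asks about syntax nodes rather than text.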

In short, Tree-sitter provides syntax highlighting that is faithful to how the language implementation parses the program, instead of relying on regular expressions that incidentally come close.

Language server

A language server is a program that can analyze a program and report interesting information about that program to a text editor. A standard, called the Language Server Protocol (LSP), defines the kinds of JSON messages that pass between a text editor and the server. The protocol is an open standard; any language and any text editor can take advantage of the protocol to get nice smart programming helps in their system. Language servers can provide information like locating the definition of a symbol, possible completions at the cursor point, etc. to a text editor, which can then decide how and when to display or use this information.
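
As a rough illustration of those JSON messages, the sketch below builds a go-to-definition request the way an editor might send it. The `textDocument/definition` method, the JSON-RPC 2.0 envelope, and the `Content-Length` header framing come from the LSP specification; the file URI, the cursor position, and the `frame` helper are hypothetical.

```python
import json


def frame(payload: dict) -> bytes:
    """Wrap a JSON-RPC payload in LSP's base-protocol framing:
    a Content-Length header, a blank line, then the JSON body."""
    body = json.dumps(payload).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


# "Where is the symbol under the cursor defined?"
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "textDocument/definition",
    "params": {
        "textDocument": {"uri": "file:///tmp/example.rs"},  # hypothetical file
        "position": {"line": 9, "character": 4},            # zero-based
    },
}

wire = frame(request)
print(wire.decode("utf-8"))
```

The server replies with a message carrying the same `id`, and the editor decides what to do with the location it gets back.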

Language servers solve the “\(N \times M\) problem” where \(N\) programming languages and \(M\) text editors would mean there have to be \(N \times M\) implementations for language analyzers. Now, every language just needs a language server, and every editor needs to be able to speak LSP.
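
As a worked instance of the counts above, taking \(N = 10\) languages and \(M = 5\) editors purely as illustrative numbers:

\[ N \times M = 10 \times 5 = 50 \ \text{analyzers without a shared protocol, versus} \ N + M = 10 + 5 = 15 \ \text{servers and clients with one.} \]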

Language servers are powerful because they can hook into the language’s runtime and compiler toolchain to get semantically correct answers to user queries. For example, suppose you have two versions of a pop function, one imported from a stack library, and another from a heap library. If you use a tool like the dumb-jump package in Emacs to jump to the definition for a call to pop, it might get confused as to where to go because it’s not sure what module is in scope at that point. (I just want to say that I think dumb-jump is very cool and I am not trying to knock it down at all. It’s honest about its limitations and can be handy when you do not have a language server available.) A language server, on the other hand, should have access to this information and would not get confused.

Using a language server for highlighting

It is possible to use the language server for syntax highlighting. I am not aware of any particularly strong reasons why one would want to (or not want to) do this. The language server can be a more complicated program and so could surface particularly detailed information about the syntax; it might also be slower than Tree-sitter.
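
For a sense of what server-provided highlighting looks like on the wire, LSP's semantic tokens response carries a flat integer array in which each token is five numbers: line delta, start-character delta, length, token type index, and modifier bits. The decoder below is a minimal sketch of that encoding; the sample `data` array and the three-entry legend are invented for illustration (a real server announces its own legend in its capabilities), and modifiers are ignored.

```python
def decode_semantic_tokens(data: list[int], token_types: list[str]):
    """Turn LSP's relative 5-integer encoding into absolute
    (line, start_char, length, type_name) tuples."""
    tokens = []
    line = 0
    start = 0
    for i in range(0, len(data), 5):
        delta_line, delta_start, length, type_index, _modifiers = data[i:i + 5]
        line += delta_line
        # The start is relative to the previous token only on the same line.
        start = start + delta_start if delta_line == 0 else delta_start
        tokens.append((line, start, length, token_types[type_index]))
    return tokens


# Hypothetical legend and response payload.
legend = ["keyword", "function", "variable"]
data = [
    0, 0, 2, 0, 0,  # line 0, col 0, length 2: keyword
    0, 3, 4, 1, 0,  # same line, 3 columns after the previous start: function
    1, 4, 3, 2, 0,  # next line, col 4: variable
]

for token in decode_semantic_tokens(data, legend):
    print(token)
```

Unlike a purely syntactic highlighter, the token types here come from the server's semantic analysis, which is where the extra detail mentioned above would show up.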

Emacs’ built-in LSP client, Eglot, recently added eglot-semantic-tokens-mode to support syntax highlighting as provided by the language server. I have tried this a little bit in Rust code and it seems fine; the Tree-sitter-based syntax highlighting has been working just fine for me, so I will probably stick to that unless I find a compelling reason to use the LSP-based highlighting.

I wrote all of the above article. I did not ask an LLM to generate any portion of it. Please know that whenever you read something on my blog, it comes 100% from a human—me, Ashton Wiersdorf.

I am not so anti-AI as to say that LLMs are worthless or should never be used. I’ve used LLMs a little bit. I think they’re fantastic at translating between languages; this seems to be something that they should be good at doing. They’re helpful at writing some boring parts of the code I write. However, most of the time I find that I can typically write the tricky bits of the code about as fast as I could specify to an LLM what I want.

I know that an LLM could have generated a facile pile of text much like the above, and honestly it would probably be decently helpful. However, know that what you have just read came directly from the fingers of a person who thought about the topic and bent his effort to helping you understand. This is from a real human who understands the meaning behind each word here. I do not play games with syntax and generate answer-shaped blog posts. There is real meaning here. Enjoy it, and go forth and make more of it.