带有类型的集合论

带有类型的集合论
Set theory with types

原始链接: https://lawrencecpaulson.github.io//2025/11/21/Typed_Set_Theory.html

## 德布鲁因的愿景：类型论 vs. 集合论最近对 NG 德布鲁因 1973 年的论文《带有类型限制的集合论》的反思，强调了数学中一个长期存在的争论：集合论的基础地位。虽然传统上被认为是数学的基石，但德布鲁因认为应该转向**类型论**，设想一个系统，其中事物被“区分开来”——一个有理数并不固有地与点的集合可比较。他批评了 ZF 集合论坚持“一切都是集合”的观点，认为这不直观且容易产生悖论。德布鲁因提出了一种**类型化的集合论**，其中元素必须具有兼容的类型，从而防止了像 *x ∈ x* 这样的无意义构造。他认为这种方法，在 AUTOMATH 等语言中实现，足以完成大部分数学工作，甚至几十年就预见了它的强大之处。现代高阶逻辑，源自《数学原理》，提供了一个类似的框架。它允许使用方便的基于集合的语言，同时保留类型系统的优势。像 Isabelle/HOL 这样的形式化证明助手甚至可以将 ZF 公理作为独立的“世界”使用类型类，从而弥合了这两种方法之间的差距。最终，德布鲁因倡导一种务实的方法——一个足以进行数学运算而没有不必要复杂性的系统，甚至可能针对特定应用（如编程语言建模）利用遗传有限集合。

## 集合论与数学基础 - Hacker News 讨论总结最近 Hacker News 的讨论围绕数学基础，特别是集合论与类型论的优劣。链接文章（[lawrencecpaulson.github.io](https://lawrencecpaulson.github.io)）可能探讨了带有类型的集合论。争论突显了一种核心矛盾：一些数学家更喜欢“一切皆是集合”作为基础方法，因为它简单且具有统一性，而另一些人认为它存在问题，因为“抽象泄漏”——依赖于特定的集合论实现（如冯·诺伊曼自然数），这可能导致实现相关的结果，甚至对于看似基本的问题。反驳意见认为，这些问题可以通过关注抽象结构来解决，并且集合论更容易形式化。另一些人则提倡类型论或范畴论基础，认为它们可以避免这些陷阱，并提供更好的形式化可能性。一个关键点是存在不同的“集合论”，以及清晰定义比标签更重要，这反映了数学概念通常的处理方式。讨论还涉及了材料集合论和结构集合论的区别。

原文

21 Nov 2025

[ AUTOMATH NG de Bruijn type theory set theory type classes Principia Mathematica Archive of Formal Proofs ]

It is known that mathematics is heavily reliant on set theory, but no one can agree on what set theory is. Many people today understand that we have a choice between set theory and type theory, but they don’t know what type theory is either. Many think that type theory refers to some sort of dependent type theory, as found in Lean or Agda, while everything else is set theory. But prior to 1980 or so, “type theory” generally referred to higher-order logic and related systems. In 1973, NG de Bruijn wrote a paper called “Set Theory with Type Restrictions”. His discussion of set theory with types was intended as motivation for the AUTOMATH language, a dependent type theory. His thoughts are fascinating and pertinent today.

Set theory according to de Bruijn

De Bruijn begins “it has been believed throughout this century that set theory is the basis of all mathematics.” He specifies this as “Cantor set theory”, or Zermelo–Fraenkel (ZF). And indeed a lot of people continue to insist on this ZF foundation. Paradoxically, mathematicians (who are the ones with skin in the game) are generally the least interested in the technicalities of set theory: actual set theorists are scarce, and the questions they study seldom have a direct impact on other branches of mathematics. De Bruijn continues:

It seems, however that there is a revolt. Some people have begun to dislike the doctrine “everything is a set”, just at the moment that educational modernizers have pushed set theory even into kindergarten. It is not unthinkable that this educational innovation is one of the things that make others try to get rid of set theory’s grip for absolute power.

I was one of those who was taught a tiny bit of set theory in school, when I was something like eight years old. I don’t think it went beyond unions and intersections of two sets. They didn’t even cover Aronszajn trees. I don’t know who was behind this educational trend, but intersections and unions are useful concepts even in everyday life. Roger Needham liked to describe unrealistic expectations as “an elaborate description of the empty set”; people need to be able to understand such remarks.

De Briujn’s chief objection to ZF set theory comes from “the weird idea that everything is set”:

In our mathematical culture we have learned to keep things apart. If we have a rational number and a set of points in the Euclidean plane, we cannot even imagine what it means to form the intersection. The idea that both might have been coded in ZF with a coding so crazy that the intersection is not empty seems to be ridiculous.

He also mentions the possibility of writing $x\in x$, which seems nonsensical and gave us Russell’s paradox. Also objectionable to many are the numerous coding tricks employed, where the ordered pair $(x,y)$ becomes $\{\{x\},\{x,y\}\}$ and the natural number $n$ becomes $\{0, \ldots, n-1\}$. He continues:

The natural, intuitive way to think of a set, is to collect things that belong to a class or type given beforehand. In this way one can try to get theories that stay quite close to their interpretations, that exclude $x\in x$ and are yet rich enough for everyday mathematics.

I like this statement, because it is precisely how sets are used in Isabelle/HOL and how they could be used (but seldom are) in the HOL family of proof assistants.

I have my own objection to Zermelo–Fraenkel set theory: it is much, much too large for any imaginable purpose. It bugs me that people are continuing to grasp for even stronger systems, notably Tarski–Grothendieck set theory. Towards the end of his paper, de Bruijn says

The world where we have $\mathbb{N}$, $\mathbb{N}^\mathbb{N}$, $\mathbb{N}^{\mathbb{N}^\mathbb{N}}, \ldots$, but where [the union of all those sets] is “inaccessible”, is a world most mathematicians will doubtlessly find big enough to live in. For those who want to have a bigger world…, there is a simple way out: they just take a type SET and provide it with Zermelo–Fraenkel axioms. If they want to have the picture complete, they will not find it hard to embed the types $\mathbb{N}$, $\mathbb{N}^\mathbb{N}$, $\mathbb{N}^{\mathbb{N}^\mathbb{N}}, \ldots$ into a small portion of their paradise.

The context of this remark is that AUTOMATH, by virtue of its built-in general product, allows $\mathbb{N}$, $\mathbb{N}^\mathbb{N}$, $\mathbb{N}^{\mathbb{N}^\mathbb{N}}, \ldots$ to be constructed while having no way to form their “union” because it provides no way of indexing over types. I am impressed that de Bruijn already perceived back in 1973 that such a weak system was strong enough for most mathematics. I came to this conclusion only recently, after a series of formalisation experiments using Isabelle and higher-order logic. We can easily use the convenient language of sets in higher-order logic, and we also have the option to “take a type SET and provide it with Zermelo–Fraenkel axioms”.

Sets in higher-order logic

Higher-order logic is descended from the mysterious, never-properly-formalised system of Principia Mathematica (PM). Whitehead and Russell wrote extensively about classes, which were clearly the same thing as typed sets, though PM was never explicit about what its type system actually was. Higher-order logic as formalised by Church is PM done right. A set is nothing but a boolean-valued function, exactly the same thing as a logical predicate, and the phrase “predicate sets” is fairly common in the literature of the HOL proof assistant. Nevertheless, we think about sets and predicates differently, even in the trivial case of unions and intersections. Often I have looked at an HOL Light proof and seen a prediate $Q(x)$ defined by $\exists y. P(y) \land x = f(y)$, which is simply the image under the function $f$ of a set.

Early versions of Isabelle/HOL maintained separate types for sets and predicates. At some point about 20 years ago, somebody had the idea of abolishing the distinction, presumably to avoid some duplication. This change persisted for a couple of releases before being laboriously reversed. Sets and predicates are simply not the same.

The elements of typed set theory

Typed set theory has everything that you would expect, only typed. As de Bruijn would wish, it excludes $x\in x$: you can write $x\in A$ only if the types agree. That means, if $x$ has type $\tau$ then $A$ has type $\tau\,\texttt{set}$, which is also the type of the comprehension $\{x.\,P(x)\}$ if $x$ has type $\tau$. Union, intersection and difference combine two sets of the same type in the obvious way. We have the universal set and therefore set complement because our sets are typed: we have the set of all integers, say, or of all sets of integers, but never the set of all sets. The power set operator (which denotes the set of all subsets of its argument) takes type $\tau\,\texttt{set}$ to type $(\tau\,\texttt{set})\,\texttt{set}$.

The image operator is the set-theoretic version of the “apply to all” operator that’s called map in programming languages from Standard ML to Perl, except in LISP where it’s called MAPCAR; their MAP does something weird. In Isabelle/HOL, the image of a set A under the function f is written f ` A and please accept my apologies for a syntax influenced by PM. But mathematicians write $f(A)$ for the set of all $f(x)$ for $x\in A$, treating $A$ as a set because of its capital letter, which would never do. The inverse image is written f -` A and denotes the set of all $x$ such that $f(x)\in A$. Both operators are typed in the obvious way.

This brings us to something crucial: the function space $A\to B$ and its generalisation to the indexed product, $\prod_{x\in A}\,B(x)$. The latter was called a “dependent function space” by Robert Constable but is called the “dependent product” by people who never bothered to read Constable’s papers. It has been around for more than a century due to its close association with the axiom of choice, which states that $\prod_{x\in A}\,B(x)$ is nonempty provided $B(x)$ is nonempty for all $x\in A$. In typed set theory, we often need to talk about the set of all functions that have domain $A$ and codomain $B$, and occasionally the greater precision of the indexed product is helpful. Both are trivial to define using set comprehension; in fact, $f\in A\to B$ if and only if $f(A)\subseteq B$. There is a complication concerning function extensionality, which I will not discuss here. Another complication is that the inverse image involving a function in $A\to B$ is generally expected to be a subset of $A$; the Isabelle/HOL inverse image operator does not and cannot accomplish that.

Analogously, we need the binary product space $A\times B$. Its generalisation $\sum_{x\in A}\,B(x)$, the indexed sum (Constable’s “dependent product”), is not especially important but has its uses. Both are again trivial to define using set comprehension. As in type theory, the “non-dependent” versions $A\to B$ and $A\times B$ are not even defined separately but are simply degenerate cases of the full versions when $B$ does not depend on $x$.

If $f\in A\to B$ then we may want to know whether $f$ is an injective on the set $A$ or whether it is surjective (the image $f(A)$ equals $B$). If $f$ is both then it is a bijection between the two sets. These properties also can give us an indication of the relative sizes of $A$ and $B$: if $f$ is an injection then $A$ is “smaller” than $B$ (written $A\precsim B$ or $A\prec B$ if $A$ is strictly smaller). Then we can also talk about countable sets. Exhibiting a bijection between two sets shows that they are equinumerous and is often the easiest way to show that they have the same cardinality. In the basic Isabelle/HOL library, the cardinality function for sets returns a natural number, so it is only useful for finite sets, but we can do much better if necessary.

Incorporating all of Zermelo–Fraenkel set theory

Typed set theory, as sketched above, meets de Bruijn’s desiderata. It is also simpler and more intuitive than AUTOMATH, and can express vast swathes of mathematics in a natural manner. But, for better or worse, set-theoretic notions such as transfinite ordinals and cardinals sometimes pop up in odd places. ZF set theory becomes inescapable once its very language finds its way into your theorem statement. I have already written on how to incorporate ZF set theory into higher-order logic. It is done exactly as de Bruijn suggested: by postulating a type of ZF sets and equipping it with all the ZF axioms.

A possible annoyance with this approach is ending up with two separate mathematical worlds:

the higher-order logic world, painstakingly constructed by more than 100 Isabelle/HOL theories
the vast ZF world, with only eight Isabelle/HOL theories mainly concerned with ordinal and cardinal numbers, everything else existing only as possible consequences of the ZF axioms

As I described earlier, the AFP entry ZFC_in_HOL connects the two worlds using Isabelle’s type class system. It sets up a family of embeddings from the standard Isabelle/HOL type constructors to suitable ZF analogues. ZF thereby gets things like the set of real numbers for free: we don’t have to construct it all over again. Summarising the earlier blog post, it’s done by

defining the type class embeddable of types that can be embedded into the ZF universe
defining the type class small of types that can be embedded into some ZF set

An Isabelle library defines a type class of countable types and proves many basic types to be countable or to preserve countability. These include nat (the natural numbers) and rat (the rational numbers). Moreover, countable is a subclass of small, which is a subclass of embeddable. It follows immediately that all those basic types are small. Moreover, the function space type constructor preserves smallness. To show that there is a ZF set corresponding to the Isabelle/HOL type real, we need to show that real belongs to small, which is trivial because each real number is represented by a set of functions from nat to rat.

A simpler way to get some set theory

De Bruijn objects to the principle that “everything is a set”, but it comes in handy when you are trying to define spaces. Given any two sets $A$ and $B$, you can always construct their unions and intersections, the product $A\times B$, function space $A\to B$, also the disjoint sum $A+B$, enumerations like $\{A,B\}$, the indexed sum and product and much more. This sort of flexibility comes in handy when defining the semantics of a programming language, where variables can range over a variety of types. It also comes in handy when talking about finite state machines, where different ways of combining machines require analogous ways of combining sets of machine states. If you are in a typed world like higher-order logic, you may be tempted to just use the natural numbers along with some encoding of the necessary structures. But that sucks.

The hereditarily finite sets (type hf) give us everything that ZF does, with the big exception of infinite sets. Hereditarily finite sets are finite all the way down. Finitary constructions such as integers, rationals, and obvious data structures like pairs and lists are easily represented. But we have no real numbers (only floating point) and cannot use infinite equivalence classes either. Still, hf has everything we need for modelling possible data in a programming language or state spaces in the world of finite state machines.

The set of all hereditarily finite sets is countable, and they map naturally to the natural numbers. That means, if you use type hf, you are essentially using the natural numbers, but the structural embeddings are hidden and you get to use those beloved set-theoretic constructions. And hf is definable in higher-order logic without additional axioms.

Addendum 2025-11-22: I wrote above that the technicalities of set theory seldom mattered to anybody else, and on the same day, Quanta Magazine published this article!