数据库设计的原则,亦或,真相就在那里
The principles of database design, or, the Truth is out there

原始链接: https://ebellani.github.io/blog/2025/the-principles-of-database-design-or-the-truth-is-out-there/

Eduardo Bellani的文章《数据库设计的原则,或者,真相就在那里》强调了正确的数据库设计在软件项目中准确表示现实世界场景的关键作用。他认为开发人员往往缺乏数据库设计和逻辑方面的正规培训,导致采用临时方法,并可能产生数据不一致等严重后果。 Bellani概述了关键的设计原则,包括正交设计、表达简约、表达完整性、完全范式化、信息原则和逻辑独立性,这些都借鉴了McGoveran和Pascal的研究成果。他还介绍了他提出的本质表示原则(PED),提倡使用反映实体固有身份的自然键,而不是任意的代理键。 文章强调,作为现实的表达,数据库需要严格的设计才能避免语义混淆和技术不稳定。Bellani总结道,扎实掌握基础原则对于构建真实可靠的信息系统至关重要。

Hacker News 最新 | 过去 | 评论 | 提问 | 展示 | 招聘 | 提交 登录 数据库设计的原则,或者说,真相就在那里 (ebellani.github.io) 8 分,来自 b-man,1 小时前 | 隐藏 | 过去 | 收藏 | 1 条评论 AnonHP 21 分钟前 [–] 这篇文章似乎过分强调了规范化,这在许多情况下是合适的,但在报表等需求方面可能会造成巨大的成本和性能问题。对于同一个应用程序的不同需求,你可能需要不同类型的模式和数据存储结构,这反过来可能会导致数据重复,但结果是可以接受的权衡。 回复 指南 | 常见问题 | 列表 | API | 安全 | 法律 | 申请 YC | 联系我们 搜索:

原文

The principles of database design, or, the Truth is out there by Eduardo Bellani

Every software project needs to represent the reality of the business he is embedded in. The way we can represent reality as limited rational beings is through propositions, i.e, declarative statements that affirm or deny something about reality. When a collection of such propositions is stored in a computer system, we call it a database.

Such database needs to be designed to properly reflect reality. This can’t be automated, since the semantics of the situation need to be encoded in a way that can be processed by a computer. Such then is the goal of database design: to encode propositions in such a way that can properly be processed by a database management system (DBMS).

At this point, a regular software developer comes to a stall. Since there is scarcely any formal training in database design (or formal logic) in his education, he tends to fall back haphazardly on ad-hoc methods, with severe consequences (update anomalies and data inconsitencies with huge potential downsides).

If you are such developer, you need to understand the underlying principles of database design. Think about it, if you don’t have principles of design, you are not doing engineering, are you?

Here is a list of design principles to follow for formal database design(McGoveran 2012, 2015)(Pascal 2016):

  • Principle of Orthogonal Design (POOD): Base relations are independent;
  • Principle of Representational Parsimony (PORP): There are no superfluous base relations;
  • Principle of Expressive Completeness (POEC): All meaningful relations are derivable from the base relations.
  • Principle of Full Normalization (POFN) : Every base relation should be in its highest normal form (3, 5 or 6th normal form).​ Thus eliminating redundancy and preventing anomalies by ensuring that each relation is free from undesirable characteristics like partial, transitive, or join dependencies.
  • The Information Principle (TIP) : All information in the database is represented explicitly and in exactly one way — by attribute values drawn from domains in relations.
  • Principle of Logical Independence (PLI) : Application programs and terminal activities remain logically unimpaired when information preserving changes of any kind that theoretically permit unimpairment are made to the base relations.

To these I’d like to introduce a new principle, an innovation I hope to develop in future work:

  • Principle of Essential Denotation (PED): A relation should be identified by a natural key that reflects the entity’s essential, domain-defined identity — not by arbitrary or surrogate values.

The following pseudo-SQL shows the contrast between improper and proper denotation:

  -- usage of surrogate keys
  create table citizen (
    id uuid primary key,
    national_id text,
    full_name text);

Code Snippet 1: Illustration of not using PED with SQL

This usage has some problems, the worst of it is the disconnection between database structure and domain semantics. Here is a rework that preserves such connection:

  create domain national_id as text check (...);

  create table citizen (
    national_id national_id primary key,
    full_name text);

Code Snippet 2: Illustration of using PED with SQL

Conclusion

Databases are representations of reality, and as such, they are foundational to any serious information system. Poor design leads to semantic confusion and technical instability—with consequences that can be costly and far-reaching. Good design demands rigor, discipline, and a firm grasp of foundational principles.

To put it simply: if you’re in the business of information, you need to know how to build structures that tell the truth that is out there.

References

Figure 1: But revolutionary Parisians had had enough of its royal resonance. The cathedral’s west facade featured 28 statues that portrayed the biblical Kings of Judah. In fall 1793, the new government ordered workers to remove them. They didn’t portray French kings, but no matter: The 500-year-old statues combined monarchy and religion, and they were brought to the cathedral’s square and decapitated. Twenty-one of the heads were only recovered in 1977, when workers found them behind a wall in an old Parisian mansion.(Blakemore 2019)

Figure 1: But revolutionary Parisians had had enough of its royal resonance. The cathedral’s west facade featured 28 statues that portrayed the biblical Kings of Judah. In fall 1793, the new government ordered workers to remove them. They didn’t portray French kings, but no matter: The 500-year-old statues combined monarchy and religion, and they were brought to the cathedral’s square and decapitated. Twenty-one of the heads were only recovered in 1977, when workers found them behind a wall in an old Parisian mansion.(Blakemore 2019)

Feel free to send me an email: ebellani -at- gmail -dot- com

Fingerprint: 48C50C6F1139C5160AA0DC2BC54D00BC4DF7CA7C

联系我们 contact @ memedata.com