ULID: Universally Unique Lexicographically Sortable Identifier

Original link: https://packagemain.tech/p/ulid-identifier-golang-postgres

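## ULID: A Powerful Alternative to UUIDs

UUIDs are widely used as unique identifiers, but they have limitations: they are inefficient, not easy for humans to read, and random UUIDv4 values can cause database performance problems through fragmentation. ULID (Universally Unique Lexicographically Sortable Identifier) offers a compelling solution, especially for Go programs using Postgres (although it applies to other languages and databases as well).

A ULID is a 128-bit identifier made up of a 48-bit timestamp and an 80-bit random value. This structure provides key advantages: **lexicographic sortability** (which allows efficient database indexing by insertion time), case insensitivity, and URL-safe characters. Importantly, ULIDs can usually be used directly with existing UUID database columns through packages such as `oklog/ulid`, avoiding schema changes. Compared with random UUIDs, ULIDs ensure that records are physically ordered in the index by creation time, with no need for a separate timestamp column. Although extremely high-volume writes *may* create "hot spots" because of timestamp clustering, ULIDs generally offer significant performance advantages. The newer UUIDv7 aims to replicate ULID's time-ordered structure, which highlights ULID's influence on identifier standards.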

Hacker News discussion: ULID: Universally Unique Lexicographically Sortable Identifier (packagemain.tech), 11 points, posted by der_gopher 1 hour ago, 2 comments

sblom, 7 minutes ago: I like how it looks. For many applications, the cryptographic-strength tradeoff (relative to UUIDv7) seems bad.

nighthawk454, 16 minutes ago: Mentioned in the article's comments:
> Why not use UUID7?
> "ULID is older than UUID v7 and looks better"
For those unfamiliar, UUIDv7 has nearly the same properties: sortable, carries a timestamp, and so on.
ULID: 01ARZ3NDEKTSV4RRFFQ69G5FAV
UUIDv7: 019b04ff-09e3-7abe-907f-d67ef9384f4f

Original text

The UUID format is a highly popular and amazing standard for unique identifiers. However, despite its ubiquity, it can be suboptimal for many common use-cases because of several inherent limitations:

It isn’t the most character efficient or human-readable.
UUID v1/v2 is impractical in many environments, as it requires access to a unique, stable MAC address.
UUID v3/v5 requires a unique seed.
UUID v4 carries no information beyond its random bits, so new keys land at random positions in data structures like B-trees. This leads to database fragmentation and ultimately hurts write performance.

A few projects I worked on used ULID (Universally Unique Lexicographically Sortable Identifier). I really enjoyed working with it and would love to share that experience with you, specifically for Go programs using a Postgres database. The same applies to other languages and databases too.

You can find the full spec at github.com/ulid/spec

ulid() // 01ARZ3NDEKTSV4RRFFQ69G5FAV

ULID addresses the drawbacks of traditional UUID versions by focusing on four key characteristics:

Lexicographically sortable. Yes, you can sort the IDs. This is the single biggest advantage for database indexing.
Case insensitive.
No special characters (URL safe).
It’s compatible with UUID, so you can still use native UUID columns in your database, for example.

ULID’s structure is key to its sortability. It is composed of 128 bits, just like a UUID, but those bits are structured for function: 48 bits of timestamp followed by 80 bits of cryptographically secure randomness.

 01AN4Z07BY      79KA1307SR9X4MV3

|----------|    |----------------|
 Timestamp          Randomness
   48bits             80bits
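
To make the layout above concrete, here is a minimal sketch that packs a 48-bit millisecond timestamp and 80 random bits into 16 bytes and renders them as 26 Crockford Base32 characters. It uses only the Go standard library. The helper names `newULID` and `encodeULID` are made up for this illustration and are not part of any package; in a real program a library such as oklog/ulid is the more sensible choice.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"math/big"
	"time"
)

// Crockford Base32 alphabet used by the ULID spec (no I, L, O, U).
// The characters are in ascending ASCII order, which is what makes
// the encoded strings sort the same way as the underlying numbers.
const crockford = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"

// newULID (hypothetical helper) lays out 48 bits of timestamp followed by
// 80 bits of entropy, most significant byte first.
func newULID(ms uint64, entropy [10]byte) [16]byte {
	var id [16]byte
	for i := 0; i < 6; i++ {
		id[i] = byte(ms >> uint(8*(5-i)))
	}
	copy(id[6:], entropy[:])
	return id
}

// encodeULID (hypothetical helper) renders the 128 bits as 26 characters,
// 5 bits per character, starting from the least significant end.
func encodeULID(id [16]byte) string {
	n := new(big.Int).SetBytes(id[:])
	mask := big.NewInt(31)
	chunk := new(big.Int)
	out := make([]byte, 26)
	for i := 25; i >= 0; i-- {
		chunk.And(n, mask)
		out[i] = crockford[chunk.Int64()]
		n.Rsh(n, 5)
	}
	return string(out)
}

func main() {
	var entropy [10]byte
	if _, err := rand.Read(entropy[:]); err != nil {
		panic(err)
	}
	id := newULID(uint64(time.Now().UnixMilli()), entropy)
	// 26 characters: the first 10 carry the timestamp, the last 16 the randomness.
	fmt.Println(encodeULID(id))
}
```

Because the timestamp occupies the most significant bits and the alphabet is in ASCII order, comparing two such strings byte by byte gives the same result as comparing their timestamps first.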

The power of ULID is its seamless integration into existing systems, even those relying on the UUID data type. Here is a demonstration using Go with the popular pgx driver for PostgreSQL and the oklog/ulid package.

The code below first connects to a running PostgreSQL instance and creates a table where the primary key is of type UUID. We then insert records using both standard UUID v4 and ULID.

package main

import (
  "context"
  "fmt"

  "github.com/google/uuid"
  "github.com/jackc/pgx/v5"
  "github.com/oklog/ulid/v2"
)

func main() {
  ctx := context.Background()

  conn, err := pgx.Connect(ctx, "postgres://...")
  if err != nil {
    panic(err)
  }
  defer conn.Close(ctx)

  _, err = conn.Exec(ctx, `
CREATE TABLE IF NOT EXISTS ulid_test (
  id UUID PRIMARY KEY,
  kind TEXT NOT NULL,
  value TEXT NOT NULL
);`)
  if err != nil {
    panic(err)
  }

  insertUUID(ctx, conn, "1")
  insertUUID(ctx, conn, "2")
  insertUUID(ctx, conn, "3")
  insertUUID(ctx, conn, "4")
  insertUUID(ctx, conn, "5")

  insertULID(ctx, conn, "1")
  insertULID(ctx, conn, "2")
  insertULID(ctx, conn, "3")
  insertULID(ctx, conn, "4")
  insertULID(ctx, conn, "5")
}

func insertUUID(ctx context.Context, conn *pgx.Conn, value string) {
  id := uuid.New()
  // check the error so a failed insert is not silently reported as success
  if _, err := conn.Exec(ctx, "INSERT INTO ulid_test (id, value, kind) VALUES ($1, $2, 'uuid')", id, value); err != nil {
    panic(err)
  }

  fmt.Printf("Inserted UUID: %s\n", id.String())
}

func insertULID(ctx context.Context, conn *pgx.Conn, value string) {
  id := ulid.Make()

  // as you can see, we don’t need to format the ULID as a string, it can be used directly
  if _, err := conn.Exec(ctx, "INSERT INTO ulid_test (id, value, kind) VALUES ($1, $2, 'ulid')", id, value); err != nil {
    panic(err)
  }

  fmt.Printf("Inserted ULID: %s\n", id.String())
}

The oklog/ulid package implements the necessary interfaces (specifically, database/sql/driver.Valuer and encoding.TextMarshaler) that allow it to be automatically converted into a compatible format (like a string or []byte representation of the UUID) that the pgx driver can successfully map to the PostgreSQL UUID column type. This allows developers to leverage the sortable advantages of ULID without having to change the underlying database schema type in many popular environments.
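
As a rough illustration of that mechanism, the sketch below defines a hypothetical 16-byte ID type that implements `driver.Valuer` from the standard library. The type name `sortableID` and the choice to return UUID-formatted text are assumptions made for this example. It is not how oklog/ulid is implemented, only the general shape of the hook a database driver can call when it receives such a value as a query argument.

```go
package main

import (
	"database/sql/driver"
	"encoding/hex"
	"fmt"
)

// sortableID is a hypothetical stand-in for a 128-bit identifier type.
type sortableID [16]byte

// String renders the 16 bytes in the 8-4-4-4-12 hex layout that
// Postgres accepts as text input for a UUID column.
func (id sortableID) String() string {
	h := hex.EncodeToString(id[:])
	return h[0:8] + "-" + h[8:12] + "-" + h[12:16] + "-" + h[16:20] + "-" + h[20:32]
}

// Value implements driver.Valuer, so a driver can ask the type
// itself how it wants to be sent to the database.
func (id sortableID) Value() (driver.Value, error) {
	return id.String(), nil
}

// Compile-time check that the interface is satisfied.
var _ driver.Valuer = sortableID{}

func main() {
	// Only the first six bytes (the timestamp part) are set here; the rest stay zero.
	id := sortableID{0x01, 0x9a, 0xaa, 0xe4, 0xbe, 0x9c}
	v, err := id.Value()
	if err != nil {
		panic(err)
	}
	fmt.Println(v)
}
```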


The time-based prefix means that a ULID generated in a later millisecond always sorts after older ULIDs, so records inserted later are placed at the end of the index. Within the same millisecond, the order is decided by the random part unless the generator is monotonic. This contrasts sharply with UUID v4, where the randomness scatters records throughout the index structure.

With traditional UUID v4, sorting records by their insert time is not possible without an extra column. When using ULID, the sort order is inherent in the ID itself, as demonstrated by the following database query output:

select * from ulid_test where kind = 'ulid' order by id;

019aaae4-be9c-d307-238f-be1692b3e8d7 | ulid | 1
019aaae4-be9d-011f-b82e-b870ca2abe9d | ulid | 2
019aaae4-be9f-e9d7-6efc-5b298ecc572b | ulid | 3
019aaae4-bea0-deae-6408-d89e7e3ce030 | ulid | 4
019aaae4-bea1-8ed2-c2f5-144bb1ffedde | ulid | 5
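
The ordering in these rows comes from the first 48 bits of each ID. As a small sketch, assuming the IDs are in the UUID text form Postgres prints, the following standard-library code reads those bits back as a Unix millisecond timestamp. The helper name `timestampOf` is hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// timestampOf (hypothetical helper) reads the first 48 bits of an ID printed
// in UUID text form and interprets them as Unix milliseconds.
func timestampOf(id string) (time.Time, error) {
	h := strings.ReplaceAll(id, "-", "")
	if len(h) != 32 {
		return time.Time{}, fmt.Errorf("expected 32 hex digits, got %d", len(h))
	}
	// 12 hex digits = 48 bits.
	ms, err := strconv.ParseUint(h[:12], 16, 64)
	if err != nil {
		return time.Time{}, err
	}
	return time.UnixMilli(int64(ms)).UTC(), nil
}

func main() {
	// The IDs from the query output above.
	ids := []string{
		"019aaae4-be9c-d307-238f-be1692b3e8d7",
		"019aaae4-be9d-011f-b82e-b870ca2abe9d",
		"019aaae4-be9f-e9d7-6efc-5b298ecc572b",
		"019aaae4-bea0-deae-6408-d89e7e3ce030",
		"019aaae4-bea1-8ed2-c2f5-144bb1ffedde",
	}
	for _, id := range ids {
		ts, err := timestampOf(id)
		if err != nil {
			panic(err)
		}
		fmt.Println(id, ts.Format(time.RFC3339Nano))
	}
}
```

The five timestamps differ by only a few milliseconds, which matches a program that inserts the rows one after another.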

As we can see, the records are returned in the same order they were inserted. Furthermore, ULID is much shorter and cleaner when used in contexts like a URL:

/users/01KANDQMV608PBSMF7TM9T1WR4
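
An ID taken from a URL like this one still refers to the same 128 bits stored in the UUID column. In the Go program above, parsing the string with the oklog/ulid package would be the natural route. The standard-library sketch below only shows what the conversion amounts to. The helper name `ulidToUUID` is hypothetical, and the sketch skips the Crockford convention of treating I, L and O as look-alike digits.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"math/big"
	"strings"
)

const crockford = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"

// ulidToUUID (hypothetical helper) re-renders a 26-character ULID string
// in the 8-4-4-4-12 hex layout used for UUIDs.
func ulidToUUID(s string) (string, error) {
	if len(s) != 26 {
		return "", fmt.Errorf("ULID must be 26 characters, got %d", len(s))
	}
	n := new(big.Int)
	// ULIDs are case insensitive, so normalize before looking characters up.
	for _, c := range strings.ToUpper(s) {
		v := strings.IndexRune(crockford, c)
		if v < 0 {
			return "", fmt.Errorf("invalid ULID character %q", c)
		}
		n.Lsh(n, 5)
		n.Or(n, big.NewInt(int64(v)))
	}
	// 26 characters hold 130 bits; anything above 128 bits is out of range.
	if n.BitLen() > 128 {
		return "", fmt.Errorf("ULID overflows 128 bits")
	}
	var b [16]byte
	n.FillBytes(b[:])
	h := hex.EncodeToString(b[:])
	return h[0:8] + "-" + h[8:12] + "-" + h[12:16] + "-" + h[16:20] + "-" + h[20:32], nil
}

func main() {
	u, err := ulidToUUID("01KANDQMV608PBSMF7TM9T1WR4")
	if err != nil {
		panic(err)
	}
	fmt.Println(u)
}
```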

ULID can generate 1.21e+24 unique ULIDs per millisecond (2^80, the number of values the 80-bit random part can take), which should be more than enough for most applications.

There’s really no major drawback to using ULID, but you should understand its limitations. For very (and I mean very) high-volume write systems, ULIDs can become problematic. Since all writes are clustered around the current timestamp, you will have hot spots around the current index key, which can potentially lead to slower writes and increased latency on that specific index block.

While other alternative identifiers exist, such as CUID or NanoID, the benefits of ULID have become a major factor in the evolution of unique identifier standards.

It is worth noting that UUID v7, the newest time-ordered UUID version (standardized in RFC 9562), addresses the sortability and database performance issues of older UUID versions by adopting a time-ordered structure similar to ULID's.
