Things Unix can do atomically (2010)

Original link: https://rcrowley.org/2010/01/06/things-unix-can-do-atomically.html

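This document catalogs atomic operations available on UNIX-like/POSIX-compliant systems, offering building blocks for thread-safe and multi-process-safe programming *without* traditional locks. The core idea is to rely on the kernel's inherent atomicity and to trust the kernel rather than custom locking mechanisms. The list focuses on operations such as `mv -T` (atomically changing the target of a symlink), `link` & `symlink` (creating hard or symbolic links for locking; they fail if the new name already exists), `rename` (atomic path changes within one filesystem), and `open(O_CREAT|O_EXCL)`/`mkdir` (create only if the file or directory does not already exist). It also covers file-descriptor operations such as `fcntl` (locking regions of a file) and `mmap`/`msync` (shared memory). Finally, it mentions the GCC atomic builtins (`__sync_fetch_and_add` and friends) for lock-free algorithms. **Important:** many of these operations depend on the filesystem (avoid NFS), and the `mv` on macOS does not use atomic `rename`. The author welcomes feedback with corrections and additions.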

Original text

This is a catalog of things UNIX-like/POSIX-compliant operating systems can do atomically, making them useful as building blocks for thread-safe and multi-process-safe programs without mutexes or read/write locks.  The list is by no means exhaustive and I expect it to be updated frequently for the foreseeable future.

The philosophy here is to let the kernel do as much work as possible.  At my most pessimistic, I trust the kernel developers more than I trust myself.  More practically, it’s stupid to spend CPU time locking around an operation that’s already atomic.  Added 2010-01-07.

Operating on a pathname

The operations below are best left to local filesystems.  More than a few people have written in crying foul if any of these techniques are used on an NFS mount.  True.  When there are multiple kernels involved, the kernel can’t very well take care of all the locking for us.  Added 2010-01-06.

  • mv -T <oldsymlink> <newsymlink> atomically changes the target of <newsymlink> to the directory pointed to by <oldsymlink> and is indispensable when deploying new code.  Updated 2010-01-06: both operands are symlinks.  (So this isn’t a system call, but it’s still useful.)  A reader pointed out that ln -Tfs <directory> <symlink> accomplishes the same thing without the second symlink.  Added 2010-01-06.  Deleted 2010-01-06: strace(1) shows that ln -Tfs <directory> <symlink> actually calls symlink(2), unlink(2), and symlink(2) once more, which leaves a window in which <symlink> does not exist and disqualifies it from this page.  mv -T <oldsymlink> <newsymlink> ends up calling rename(2), which can atomically replace <newsymlink>.  Caveat 2013-01-07: this does not apply to Mac OS X, whose mv(1) doesn’t call rename(2).  mv(1).
  • rename(oldpath, newpath) can change a pathname atomically, provided oldpath and newpath are on the same filesystem.  This will fail with the error code ENOENT if oldpath does not exist, enabling interprocess locking much like link(2).  I find this technique more natural when the files in question will be unlinked later.  rename(2).
  • open(pathname, O_CREAT | O_EXCL, 0644) creates and opens a new file.  (Don’t forget to set the mode in the third argument!)  O_EXCL instructs this to fail with the error code EEXIST if pathname exists.  This is a useful way to decide which process should handle a task: whoever successfully creates the file.  open(2).
  • mkdir(dirname, 0755) creates a new directory but fails with the error code EEXIST if dirname exists.  This provides for directories the same mechanism open(2) with O_EXCL provides for files.  mkdir(2).  Added 2010-01-06; edited 2013-01-07.
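
The pathname operations above might be wrapped roughly as follows.  This is a minimal sketch rather than anything from the original article: the helper names try_claim, try_lock_dir, and swap_symlink are hypothetical, and the temporary-name scheme in swap_symlink (the link path plus ".tmp." and the process ID) is an assumption chosen only for illustration.  It assumes a local filesystem, as the note at the top of this section advises.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if this process created pathname (and so
 * "won" the task), 0 if it already existed, -1 on any other error. */
int try_claim(const char *pathname)
{
    int fd = open(pathname, O_CREAT | O_EXCL, 0644);
    if (fd >= 0) {
        close(fd);
        return 1;
    }
    return errno == EEXIST ? 0 : -1;
}

/* Hypothetical helper: the same idea for directories, using mkdir(2).
 * Returns 1 if created, 0 if dirname already existed, -1 otherwise. */
int try_lock_dir(const char *dirname)
{
    if (mkdir(dirname, 0755) == 0)
        return 1;
    return errno == EEXIST ? 0 : -1;
}

/* Hypothetical helper: make the symlink linkpath point at target by
 * creating a temporary symlink and rename(2)ing it over linkpath, the
 * system call mv -T ends up making.  Returns 0 on success, -1 on error. */
int swap_symlink(const char *target, const char *linkpath)
{
    char tmp[4096];
    int n = snprintf(tmp, sizeof tmp, "%s.tmp.%ld", linkpath, (long)getpid());
    if (n < 0 || (size_t)n >= sizeof tmp)
        return -1;
    unlink(tmp);                      /* clear a stale temporary, if any */
    if (symlink(target, tmp) != 0)
        return -1;
    if (rename(tmp, linkpath) != 0) { /* atomic replace on one filesystem */
        unlink(tmp);
        return -1;
    }
    return 0;
}
```

A deploy script could call swap_symlink("/srv/app/releases/42", "/srv/app/current") (both paths made up here); readers resolving the link see either the old release or the new one, never a missing link.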

Operating on a file descriptor

  • fcntl(fd, F_GETLK, &lock), fcntl(fd, F_SETLK, &lock), and fcntl(fd, F_SETLKW, &lock) allow cooperating processes to lock regions of a file to serialize their access.  lock is of type struct flock and describes the type of lock and the region being locked.  F_SETLKW is particularly useful as it blocks the calling process until the lock is acquired.  There is a “mandatory locking” mode but Linux’s implementation is unreliable as it’s subject to a race condition.  fcntl(2).
  • fcntl(fd, F_GETLEASE) and fcntl(fd, F_SETLEASE, lease) ask the kernel to notify the calling process with SIGIO when another process opens or truncates the file referred to by fd.  When that signal arrives, the lease needs to be removed by fcntl(fd, F_SETLEASE, F_UNLCK).  fcntl(fd, F_NOTIFY, arg) is similar but doesn’t block other processes, so it isn’t useful for synchronization.  fcntl(2).
  • mmap(0, length, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0) returns a pointer from which a file’s contents can be read and written by normal memory operations.  By making frequent use of msync(addr, length, MS_INVALIDATE), data written in this manner can be shared between processes that both map the same file.  mmap(2), msync(2).
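
As an illustration of the F_SETLKW bullet, here is a sketch of cooperating processes serializing a read-modify-write on a shared file.  The helper names lock_whole_file and locked_increment are hypothetical, and storing a raw long at offset 0 is just an assumed file layout for the example.  Remember that these locks are advisory: they only protect against other processes that take the same lock.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: apply F_WRLCK, F_RDLCK, or F_UNLCK to the whole
 * file.  F_SETLKW blocks until the lock is granted (it can also return
 * -1 with EINTR if a signal arrives; this sketch does not retry). */
static int lock_whole_file(int fd, short type)
{
    struct flock lock;
    memset(&lock, 0, sizeof lock);
    lock.l_type = type;
    lock.l_whence = SEEK_SET;
    lock.l_start = 0;
    lock.l_len = 0;               /* 0 means "through the end of the file" */
    return fcntl(fd, F_SETLKW, &lock);
}

/* Hypothetical helper: increment the long stored at offset 0 of fd while
 * holding a write lock.  An empty file counts as zero.  Returns the new
 * value, or -1 on error. */
long locked_increment(int fd)
{
    long value = 0;
    if (lock_whole_file(fd, F_WRLCK) != 0)
        return -1;

    ssize_t got = pread(fd, &value, sizeof value, 0);
    if (got != 0 && got != (ssize_t)sizeof value) {
        lock_whole_file(fd, F_UNLCK);
        return -1;                /* read error or a short, corrupt file */
    }

    value++;
    ssize_t put = pwrite(fd, &value, sizeof value, 0);
    lock_whole_file(fd, F_UNLCK);
    return put == (ssize_t)sizeof value ? value : -1;
}
```

Each process opens the same file with O_RDWR and calls locked_increment(fd); the kernel queues the callers, so no increment is lost.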

Operating on virtual memory

  • __sync_fetch_and_add, __sync_add_and_fetch, __sync_val_compare_and_swap, and friends provide a full barrier so “no memory operand will be moved across the operation, either forward or backward.” These operations are the basis for most (all?) lock-free algorithms.  GCC Atomic Builtins.
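
To make the builtins concrete, here is a small sketch of two common patterns: a ticket counter built on __sync_fetch_and_add and a compare-and-swap retry loop built on __sync_val_compare_and_swap.  The function names next_ticket and store_max are hypothetical, and the code assumes GCC or a compatible compiler such as Clang.

```c
/* Hypothetical helper: hand out unique, increasing ticket numbers.
 * __sync_fetch_and_add returns the value held before the addition. */
long next_ticket(long *counter)
{
    return __sync_fetch_and_add(counter, 1);
}

/* Hypothetical helper: raise *slot to value unless it is already at
 * least that large.  Returns the value left in *slot as seen by this
 * call.  This is the usual compare-and-swap retry loop. */
long store_max(long *slot, long value)
{
    long seen = __sync_fetch_and_add(slot, 0);   /* atomic read */
    while (seen < value) {
        /* Swap in value only if *slot still holds what we last saw;
         * the builtin returns what *slot held before the attempt. */
        long prev = __sync_val_compare_and_swap(slot, seen, value);
        if (prev == seen)
            return value;        /* our swap went through */
        seen = prev;             /* another thread changed it; re-check */
    }
    return seen;
}
```

Threads can call either function on shared variables without a mutex; the retry loop simply runs again whenever another thread got there first.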

Something I should add to my repertoire?  Race condition?  Let me know at [email protected] or @rcrowley and I’ll fix it.
