显示 HN:一个 shell 原生、cd 兼容的目录跳转器,使用幂律频率法。
Show HN: A shell-native cd-compatible directory jumper using power-law frecency

原始链接: https://github.com/jghub/sd-switchdir

## SD:一个智能目录导航工具 SD是一个为ksh93、bash(≥4.2)和zsh(≥4.3)设计的shell工具,旨在智能地导航目录,作为`cd`的直接替代品。它使用明确的历史记录跟踪目录访问频率和最近性,并根据可配置的幂律核对目录进行排名。 `sd` 不仅接受路径名,还接受模式(正则表达式)来从动态排序的堆栈中查找目录。重复使用相同的模式会确定性地循环遍历匹配项。一个配套命令`ds`允许检查和管理这个排序堆栈。 SD记录目录更改,维护内存历史记录和持久性日志文件,并在每次更改后重新计算排名。关键配置选项包括窗口大小(历史记录长度)、幂律指数(最近性偏差)和日志文件限制。 如果安装了`fzf`,`ds`提供了一个交互式模糊查找器。可以将`cd`别名为`sd`以实现无缝使用。SD优先考虑完整的访问时间顺序,而不是聚合统计数据,提供不受时间影响的确定性排名。性能通常很快,典型的执行时间约为22-32毫秒。

## Hacker News 上的新目录跳转工具 jghub 创建的一款快速跳转目录的新工具,自 2011 年以来一直私下使用,现已在 Hacker News 上分享。它功能类似于 `z` 或 `zoxide` 等工具,但其独特的“frecency”排名系统使其与众不同。 该工具不只是追踪上次访问时间,而是使用用户 `cd` 历史记录(通常是最近的 1000+ 次事件)的幂律卷积,来平衡一致的目录使用和最近的、可能临时的活动。这避免了单次访问高峰掩盖长期习惯。 尽管这种方法在计算上很复杂,但查询时间仍然很快(在 bash/ksh 中为 20-30 毫秒)。该工具还优先使用逻辑路径名进行搜索(使用 `pwd -L`),以保留符号链接名称,方便搜索。该项目可在 GitHub 上找到。
相关文章

原文

SD is a directory navigation utility for ksh93u+, bash ≥ 4.2, and zsh ≥ 4.3, using frequency–recency tracking over an explicit visit history. Ksh-compatibility options are enabled: extglob in bash; KSH_ARRAYS, KSH_GLOB, POSIX_BUILTINS, and SH_WORD_SPLIT in zsh.

Note: This project is unrelated to the Rust-based sd text replacement tool.

This repository is a read-only snapshot mirror containing tagged stable states.
Issues are monitored, and pull requests may be considered for manual integration.

SD provides the sd command, designed to act as a drop-in replacement for cd for pathname arguments. In addition, it supports pattern-based directory selection from a dynamically ranked directory stack.

Directory ranking is computed over a trailing window of recorded directory visits. For each directory, a score is obtained by summing weighted contributions of its visits within that window. The weighting follows a configurable power-law kernel over normalized event indices.

Repeated invocation with the same pattern cycles deterministically through successive rank-ordered matches.

The companion command ds exposes the ranked directory stack and provides inspection and management functions.


  • Records directory changes in a logfile.
  • Maintains an in-memory history including events not yet flushed to disk.
  • Recomputes a ranked stack from a trailing window of directory visit events after each directory change.
  • Matches patterns (regular expressions) against full directory paths.
  • Changes directory according to highest-ranked match.
  • Allows deterministic cycling through further matches by repeating the same pattern.

These shell functions provide the interface:

  • sd ARG... — change directory (pathname or pattern)
  • ds [options] [pattern] — inspect and optionally select directories interactively
    • If fzf is installed, matches are presented in an interactive fuzzy finder.
  • cd ARG... — identical to sd

The cd command behaves identically to sd. It is provided for convenience so existing muscle memory works without retraining; use whichever you prefer.

sd accepts arguments but does not implement command-line options.

Behavior:

  1. If the argument resolves to a valid pathname, standard cd semantics apply.
  2. Otherwise, the argument is treated as a pattern and matched against the ranked stack.

Special case:

  • sd - behaves like builtin cd -.

All other arguments beginning with - are treated as ordinary arguments (pathname or pattern). Builtin cd options (e.g., -P, -L) are not parsed or forwarded.

For valid pathnames, behavior matches the shell builtin cd, including:

  • Absolute and relative paths
  • ., ..
  • ~ expansion
  • Permission handling

If no valid pathname is found, arguments are interpreted as a pattern:

  • Multiple arguments are merged into a single pattern.
  • Whitespace is normalized to single blanks.
  • Matching uses regular expressions against full directory paths.
  • Smart case: matching is case-insensitive unless the pattern contains uppercase characters.

Shell metacharacters may require quoting (e.g., 'a\.b').


If sd pattern is invoked repeatedly with the exact same pattern:

  • Matches are traversed in strictly rank-based order.
  • Cycling is deterministic during the first traversal of the current stack.
  • The stack is recomputed after each directory change.
  • The selected directory's score increases, but the relative order of the remaining matches is preserved during the first traversal.
  • Cycling resets as soon as a different argument is used.
  • Intervening non-sd commands do not reset the cycle.

After a full traversal, a further cycle may reflect updated ranking due to score changes.


Scoring is computed over a trailing window of $N$ events. Let the last $N$ directory-change events be indexed $1, \dots, N,$ oldest to newest. Let $n_i \subseteq {1,\dots,N}$ be the set of indices at which directory $i$ was visited. Then the score of directory $i$ is

$$ F(i) = \sum_{n \in n_i} \left(\frac{n}{N}\right)^p $$

where:

  • $N$ is the window length (not the logfile size)
  • $p$ is the power-law exponent
  • $n_i$ are the event indices within the window at which directory $i$ was visited

Higher $p$ increases recency bias. Default: $p = 9.97$.

Properties:

  • Full visit chronology is preserved within the window.
  • Scores are computed from event indices.
  • Ranking is deterministic given the recorded history.
  • Recency is measured in number of directory-change events, not wall-clock time. Inactive periods do not affect scores or ranking.

A first-time visit receives score $F = 1$, which decreases as subsequent events push it back in history.


Two independent limits exist:

  • loglim — maximum number of stored directory-change events (hard cutoff)
  • window — number of trailing events used for scoring

Events older than window do not influence ranking. Events older than loglim are permanently discarded.

With default settings:

  • window = 1280 events. At ~50 directory changes per day, this spans roughly one month.
  • loglim = 8192 typically retains substantially longer history.

Changing window recomputes the stack immediately.


Stale and missing directories

If a ranked match no longer exists:

  1. Lower-ranked matches are attempted.
  2. If all stack matches are stale (or none exist), matching is expanded to the full recorded history currently held in memory.

Permission errors are handled consistently with the builtin cd.

Logfile integrity is verified via a header marker when loading. No automatic repair mechanism is implemented.

Logfile locking is used to coordinate concurrent shells.


Performance depends on the scoring window size.

On typical hardware and for default settings:

  • ~22 ms (ksh) to ~32 ms (bash) per cd action (real time)

Extending window size to the full event log reduces performance by about a factor of two.


Requirements

  • ksh93, bash, or zsh
  • Optional: fzf for interactive selection via ds

Add to your shell rc file:

On first run, a logfile is created and initialized.


Configuration is held in associative array SD_CFG, initialized with defaults. Define only the keys you want to override:

typeset -A SD_CFG=(
    [loglim]=8192
    [window]=1280
    [power]=9.97
    [stacklim]=0
    [smartcase]=1
    [verbose]=1
    [period]=3600
)
. /path/to/sd.ksh

Runtime adjustments (via ds):

  • ds -l N — set scoring window
  • ds -k K — cap stack size
  • ds -e pow — adjust exponent
  • ds -c — clean stale entries
  • ds -i — show status
  • ds -m — show full manual

sd itself takes no options.


Other directory-jumping tools include:

These tools maintain per-directory aggregate state rather than a full sequence of visits. SD retains the complete visit history up to a configurable limit and derives scores from it, unaffected by elapsed wall-clock time.

联系我们 contact @ memedata.com