Beagle: Git, URIs and all the dirty words

Original link: https://replicated.wiki/blog/uris.html

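Beagle is a Git-compatible source code management (SCM) system that aims to replace Git's often confusing and bloated command-line interface with a simpler, standardized model based on URIs and HTTP verbs. Although Git's underlying structure is a clean and elegant system of blob trees and commit chains, its interface is notoriously complex. Beagle addresses this by treating the repository as a content-addressed filesystem, using standard URI syntax (e.g. `scheme://host/path?query#fragment`) to locate files, branches, and specific code locations. Beagle further simplifies version control by replacing Git's inconsistent commands (such as merge, rebase, squash, cherry-pick) with a strictly orthogonal set of HTTP operations:

* **GET/HEAD:** retrieve data or do a dry run.
* **POST:** commit changes.
* **PUT/DELETE:** manage refs (branches/tags) and staging.
* **PATCH:** apply changes between versions.

By decomposing version control into these precise primitives and using consistent URI addressing, Beagle lets developers perform complex operations (such as rebasing or squashing) through predictable, logical sequences, without learning a specialized and messy vocabulary of legacy commands.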

Hacker News: Beagle: Git, URIs and all the dirty words (replicated.wiki), 4 points, by gritzko, 1 hour ago

Original text

Git's basic model is a wonderfully simple system of blob trees and commit chains that one can explain in 5 minutes to anyone. Further up the stack, that wonderful simplicity devolves into a mess of commands and flags that even developers with 20 years of git experience have difficulty remembering.

That is doubly so when multi-tasking with LLMs. "I believe we implemented it on Tuesday, but it is not here. Where is it?" "Which branch corresponds to that remote?" And so on.

If only we had some universal language to address and access local and remote resources, files and locations in files! Oh wait, we have HTTP and URI, which are as standard as it gets. Those were specifically designed for this task. Supported in so many apps and libs. Can we apply that to git?

URIs

The URI layout we all remember by heart:

  1. scheme: -- the access protocol / addressing scheme,
  2. //authority -- most often the network host,
  3. /path -- path in the remote filesystem,
  4. ?query -- other stuff (like arguments),
  5. #fragment -- location within the document.

Can we retrofit that to a versioned store? Well, if all the versioning info goes into the query, the rest is obvious. http://somehost/dir/file?branch#L101 for example. In fact, Beagle is a git-compatible SCM doing exactly that.
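To make the layout concrete, here is a minimal sketch that splits the example URI above into its five parts with Python's standard `urllib.parse`. The function name `split_versioned_uri` and the key names `version` and `location` are illustrative assumptions, not part of Beagle.

```python
from urllib.parse import urlsplit


def split_versioned_uri(uri: str) -> dict:
    """Split a URI into the five generic parts.

    The query is read as versioning info and the fragment as a location in
    the file; those two labels are assumptions made for illustration.
    """
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,        # access protocol / addressing scheme
        "authority": parts.netloc,     # most often the network host
        "path": parts.path,            # path in the remote filesystem
        "version": parts.query,        # branch, tag or other versioning info
        "location": parts.fragment,    # location within the document
    }


parsed = split_versioned_uri("http://somehost/dir/file?branch#L101")
for key, value in parsed.items():
    print(f"{key:10} {value}")
```

Nothing here is specific to versioning: any generic URI parser already yields these parts, which is the point of reusing the standard layout.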

HTTP verbs

The case of HTTP is more interesting. HTTP has a vocabulary of verbs: HEAD, GET, PUT, POST, PATCH, DELETE, although people mostly use GET and POST nowadays. But there was some reason for the other verbs to exist, right?

  • GET "retrieves information"
  • HEAD is like GET, but no body
  • POST makes the server "accept the entity"
  • PUT requests the entity to be "stored"
  • DELETE does what it says
  • PATCH requests "changes" to be "applied"

While the vocabulary is a bit vague, fundamentally it grows out of the need to access a remote filesystem. That naturally fits the git model, which is, as described, a content-addressed filesystem. For that reason, Beagle uses the HTTP verbs exclusively.

Wait, but it only has patch? What about merge vs rebase?

Git's dirty words

There is always plenty of confusion around merge, rebase, squash, cherry-pick and all the related techniques of git-handling the twisted history of edits. Each command does several often unrelated things and each thing can be done by several commands, subtly differently.

Beagle decomposes those practices into a set of orthogonal operations, building on that wonderfully simple underlying model of git:

  • GET moves data from repo to worktree (including remotes)
  • HEAD is like GET's dry-run - fetch and report
  • POST moves data from worktree to repo (commits)
  • PUT only edits the reflog (sets branches/tags, stages)
  • DELETE is like PUT, but deletes
  • PATCH applies another version's changes to the worktree

As you might see, there is no way to substitute one operation for another: they are strictly orthogonal. Let's see how that applies to the pandemonium of merge/rebase/squash/cherry-pick.

Let's see what all git merge variants do:

  1. they apply changes from a diverging commit or branch,
  2. they reuse (rebase) or add a new (merge, squash) message,
  3. they refer to the original (merge) or not (rebase, squash).

Consequently, we have 8 options, the product of three binary choices: commit/branch, reuse/retitle, and refer/forget. In fact, only some of these 8 have git terms defined. For example, to squash we have to apply a diverging branch in its entirety, add a new commit message, and not refer to the original branch. To rebase, we apply separate commits, reuse the messages, and do not refer back. To merge, we apply all of a branch, add a new message, and refer back (the parent header).
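The counting argument can be checked by enumerating the three binary choices. This is only an illustrative sketch: the tuple labels come from the paragraph above, the lookup table holds just the three combinations the text names, and the identifiers (`SCOPE`, `GIT_NAMES`, `enumerate_options`) are made up for the example.

```python
from itertools import product

# The three independent choices described in the text.
SCOPE = ("commit", "branch")      # apply one commit or the whole branch
MESSAGE = ("reuse", "retitle")    # reuse the original message or write a new one
LINK = ("refer", "forget")        # refer back to the original or not

# Only the combinations the text explicitly names.
GIT_NAMES = {
    ("branch", "retitle", "forget"): "squash",
    ("commit", "reuse", "forget"): "rebase",
    ("branch", "retitle", "refer"): "merge",
}


def enumerate_options():
    """Return all (combination, git term) pairs, 2 * 2 * 2 = 8 in total."""
    return [
        (combo, GIT_NAMES.get(combo, "(no git term named in the text)"))
        for combo in product(SCOPE, MESSAGE, LINK)
    ]


for combo, name in enumerate_options():
    print("/".join(combo), "->", name)
```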

The way to express it in Beagle CLI:

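    # Arguments are quoted: in a POSIX shell an unquoted word starting
    # with # begins a comment, and an unquoted ? is a glob character.

    # rebase one commit: apply, post
    be patch '?feature'
    be post '#!'

    # merge a branch: apply all, post with a new message
    be patch '?feature!'
    be post '#merge the feature'

    # squash a branch: apply all, post with a new message, no link back
    be patch '?feature!'
    be post '#add a new feature!'

    # rebase the entire branch, one commit at a time
    while be patch '?feature'; do
        # stop at the first commit that fails to build or pass the tests
        make && make test && be post '#!' || break
    done

    # cherry pick one commit
    be patch '#391a0d33'
    be post '#!'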

Here we use the bang modifier to:

  1. '?branch!' apply the entire branch (default: one commit),
  2. '#message!' don't link the original commit (the parent ref).

Note: when we supply no message, the original one gets reused. We may keep message/author but drop the original commit: #!.
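As a reading aid, the bang rules can be written down as a tiny parser. This is a hypothetical sketch of the rules as stated in this post, not Beagle's actual argument handling: the names `PatchArg`, `PostArg`, `parse_patch_arg` and `parse_post_arg` are invented, only the `?branch` form of the patch argument is covered (not the `#hash` form), and a message that should end in a literal "!" cannot be expressed under these rules.

```python
from typing import NamedTuple, Optional


class PatchArg(NamedTuple):
    branch: str
    whole_branch: bool       # True: apply the entire branch; False: one commit


class PostArg(NamedTuple):
    message: Optional[str]   # None: reuse the original message
    link_original: bool      # False: do not refer back to the original commit


def parse_patch_arg(arg: str) -> PatchArg:
    """Parse '?branch' or '?branch!' (assumed syntax, per the list above)."""
    if not arg.startswith("?"):
        raise ValueError("expected an argument of the form ?branch or ?branch!")
    body = arg[1:]
    whole = body.endswith("!")
    return PatchArg(body[:-1] if whole else body, whole)


def parse_post_arg(arg: str) -> PostArg:
    """Parse '#message', '#message!' or '#!' (assumed syntax)."""
    if not arg.startswith("#"):
        raise ValueError("expected an argument of the form #message or #message!")
    body = arg[1:]
    forget = body.endswith("!")
    text = body[:-1] if forget else body
    return PostArg(text or None, not forget)


print(parse_patch_arg("?feature!"))
print(parse_post_arg("#merge the feature"))
print(parse_post_arg("#!"))
```

Read this way, merge and squash in the CLI examples differ only in the `link_original` flag of the post step.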

Branch rebase here can only happen as a loop, because we make as many posts as there are commits. This also ensures that all the committed revisions build and pass the tests.

FAQ

So, how is PUT different from POST?

POST does commit and/or fast-forward. PUT resets a branch or marks a file for commit/removal (reflog-only operations).

How does that compare to the URIs git uses?

git only uses URIs to access repos, e.g. git://github.com/gritzko/beagle.git. That is very limiting, so we want to extend that addressing scheme to access files, revisions, and locations in files.

How does that compare to GitHub URIs?

GitHub URIs have a typical web-app structure, which makes them inconvenient for our case.

https://github.com/gritzko/beagle/blob/main/keeper/README.md

In particular, Beagle URIs move all the versioning information into the query part to avoid overusing the path for everything (project, user, branch, path). Beagle branches are tree-ordered and filesystem-like, and the top-level entries are project trunks, so the GitHub URI above becomes

be://replicated.live/keeper/README.md?/beagle

Note that a Beagle repo may host any number of projects, and the default way to convey a project is the query. If we want to peek into a branch, the URI becomes

be://replicated.live/keeper/README.md?/beagle/MEM-issues
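The mapping between the two URI styles can be sketched in a few lines. This is an illustration under several assumptions, not a Beagle tool: the helper `github_blob_to_beagle` is hypothetical, the Beagle host has to be supplied by the caller since it cannot be derived from the GitHub URL, the GitHub user is simply dropped, the `main` branch is assumed to correspond to the project trunk, and branch names are assumed to contain no slashes.

```python
from urllib.parse import urlsplit


def github_blob_to_beagle(url: str, host: str, trunk_branch: str = "main") -> str:
    """Rewrite a GitHub 'blob' URL in the Beagle style shown above.

    GitHub packs user, project, branch and file path into the path;
    here the project and branch move into the query instead.
    """
    segments = urlsplit(url).path.strip("/").split("/")
    if len(segments) < 5 or segments[2] != "blob":
        raise ValueError("expected /user/project/blob/branch/path...")
    _user, project, _blob, branch, *file_path = segments
    query = "/" + project
    if branch != trunk_branch:
        # A non-trunk branch is addressed as a child of the project trunk.
        query += "/" + branch
    return f"be://{host}/{'/'.join(file_path)}?{query}"


print(github_blob_to_beagle(
    "https://github.com/gritzko/beagle/blob/main/keeper/README.md",
    "replicated.live",
))
```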
