No LLM Code in Dependencies

Original link: https://joeyh.name/blog/entry/no_LLM_code_in_dependencies/

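The developer of *git-annex* spent about 100 hours over the past month making sure the project can build without dependencies that contain code generated by large language models (LLMs). The audit turned up serious quality problems: large LLM-generated changes reverted in the next release without explanation, an incoherent 1,489-line commit message accompanying 10,000 lines of changes to a 26,000-line code base, and an LLM prompt to copy code from another project that appears to have avoided copyright infringement only by luck. The developer is troubled that reviewing a program's whole dependency tree on an ongoing basis now seems to be part of programming. Although they acknowledge they are probably trying to hold back the tide (noting that Software Freedom Conservancy punted on the issue and doubting that the FSF will do any better), they continue their work and support their users. The developer warns against casually prompting an LLM to add a formatter config and reformat a module, committing the result, and calling yourself a "10xer", and asks contributors to consider the broader impact of doing so; one project lost the developer's further collaboration this way. The experience has the developer reconsidering participation in these communities.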

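The Hacker News community is in heated debate over the push for "no LLM code in dependencies". The discussion stems from one developer's decision to keep LLM-generated code out of a real-world project, *git-annex*. It highlights a deep split over the future direction of open source software (OSS). Opponents of the ban argue that LLMs are just tools and that rejecting them is a conservative "purity test" that ignores their usefulness and the competitive pressure of the job market. Supporters counter that AI-generated code (so-called "slop") places an unsustainable burden on maintainers. They stress that mentoring human contributors is a core value of open source culture, whereas auditing machine-generated code offers no such social benefit and risks polluting code bases with unreviewed, low-quality, or possibly plagiarized content. The conversation also touches on broader issues: the erosion of software craftsmanship, the difficulty of identifying LLM contributions, and the sustainability of the open source model. Some expect AI to help build safer, more robust systems, while others fear that reliance on it will cause code quality to collapse, leaving the industry buried in a "garbage heap" that no human can verify or understand.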

Original text

I've spent about 100 hours of work over the past month to make sure git-annex can build without dependencies that contain LLM generated code. At least so far.

https://git-annex.branchable.com/no_llm_code/

Needing to review a program's whole dependency tree on an ongoing basis is apparently what programming has come to?

I've found some real stinkers. Large LLM-generated changes being reverted in the next release without any explanation. An incoherent 1,489-line commit message with 10,000 lines of changes to a 26,000 LOC code base. An LLM prompt to copy code from another project that seems to have only avoided being copyright infringement due to luck.

I now have additional information about the quality of dependencies which will surely influence future decisions. As far as I can see, that's the only positive benefit of this work.

I realize that I am probably trying to hold back the tide at this point. That appears to be why Software Freedom Conservancy punted, and I doubt that the FSF will do any better.

As these dominos fall, I am reconsidering my participation in these communities. But I continue my work and support my users.

It may seem easy to prompt an LLM with

> Add fourmolu config and restyled
>
> neat
>
> format a module

And commit the result and call yourself a 10xer. But please consider the broader impact of your actions. (In the above case, that project lost my further collaboration on it.)
