Too many R packages: CRAN is inundated with submissions

Original link: https://rworks.dev/posts/too-many-R-packages/

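As the long-time curator of the “Top 40” new R packages, the author reflects on the unsustainable growth in R packages submitted to CRAN. Because of the sheer volume of newly released packages, a once-manageable task has become hard to keep up with. The author believes that the ease of deploying software is driving the surge and compares it to the proliferation of low-impact AI applications. The author questions the value of this trend and points out that many new R packages make no substantive contribution to the community. A key indicator of the decline in quality is missing documentation: many new packages lack a README file, vignettes, or a repository link, which makes them effectively unusable. Ultimately, the author asks whether this influx is beneficial or merely creates clutter. Given the lack of utility and transparency in recent submissions, the author invites the R community to weigh in through the R Works GitHub repository.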

Original text

If you are reading this post on R-bloggers, you will probably know that I have been publishing my selection of the “Top 40” new R packages on CRAN for quite some time. I did this first as part of my work at Revolution Analytics, then on R Views for RStudio and Posit, and now here on R Works. It used to take about a day’s worth of pleasurable work spread out over a month to select forty interesting packages. For a hundred or so packages, I could look at all of the package webpages, download and play with a small number of them. Now, the “Top 40” has become a real hamster-on-the-wheel project. The following plot shows my count of the number of new packages to make it to CRAN since I began publishing on R Works.

Why the sharp increase? What could possibly be going on? Well, I have a guess. It is apparently now just too easy to package up some code and ship it to CRAN. That’s understandable: it’s just too easy to write and deploy any kind of software. The following plot that John Burn-Murdoch recently published in the Financial Times based on NBER research shows the explosion of apps in our Agentic AI era.

[Figure: App releases over time]

The plot also indicates that the new apps can’t be making much of a positive contribution to people’s lives or corporate profits. They are apparently not being used, or reviewed, or even discovered.

So let’s ask the same kinds of questions about new R packages. Are most of them really making a contribution to R and the R Community? Are they contributing new statistical methods, extending the reach of R into new application areas, offering efficient high-performance code, or doing something else that would arguably benefit the R Community?

My impression as an engaged dilettante is no: most new R packages are not making a contribution. One obvious indicator of quality is documentation. A significant number of new R packages don’t provide sufficient documentation to explain what they offer. For example, in May, 40 of the 323 new CRAN packages had no README file, no vignettes, and no URL linking to a repository. To my way of thinking, with the possible exceptions of packages that have some sort of discoverable out-of-band documentation (e.g., a journal publication) or are not meant to be called by end users because they are infrastructure for some suite of packages, packages that don’t describe what, why, and how they work are not contributions.
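The documentation check described above can be sketched as a simple filter. This is a minimal illustration in Python, not the author's actual workflow: the `PackageInfo` record, its field names, and the sample entries are all hypothetical, and the two exception flags stand in for the exceptions mentioned in the paragraph (out-of-band documentation such as a journal publication, or infrastructure packages not meant for end users).

```python
from dataclasses import dataclass


@dataclass
class PackageInfo:
    """Hypothetical summary of one new package (field names are assumptions)."""
    name: str
    has_readme: bool
    has_vignettes: bool
    has_repo_url: bool
    has_external_docs: bool = False   # e.g. a journal publication
    is_infrastructure: bool = False   # not meant to be called by end users


def lacks_basic_docs(pkg: PackageInfo) -> bool:
    """True if the package has no README, no vignettes, and no repository URL."""
    return not (pkg.has_readme or pkg.has_vignettes or pkg.has_repo_url)


def is_undocumented(pkg: PackageInfo) -> bool:
    """Apply the three-way check, but let either exception excuse the package."""
    if pkg.has_external_docs or pkg.is_infrastructure:
        return False
    return lacks_basic_docs(pkg)


def undocumented_share(flagged: int, total: int) -> float:
    """Fraction of packages flagged as lacking documentation."""
    if total <= 0:
        raise ValueError("total must be positive")
    return flagged / total


# Made-up sample records, for illustration only.
sample = [
    PackageInfo("pkgA", has_readme=True, has_vignettes=False, has_repo_url=True),
    PackageInfo("pkgB", has_readme=False, has_vignettes=False, has_repo_url=False),
    PackageInfo("pkgC", has_readme=False, has_vignettes=False, has_repo_url=False,
                has_external_docs=True),
]
flagged_names = [p.name for p in sample if is_undocumented(p)]
print(flagged_names)  # ['pkgB']

# The May figures quoted in the post: 40 of 323 new packages.
print(f"{undocumented_share(40, 323):.1%}")  # 12.4%
```

With the May figures from the post, the three-way check flags roughly one in eight new packages. The post does not say how many of those 40 would be excused by the exceptions, so the share after exceptions is unknown.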

As a hamster on the wheel, I would be very pleased to hear what you have to say. If you are motivated, please leave a comment at Issue #68 on the R Works GitHub repository.
