GitHut – Programming Languages and GitHub (2014)

Original link: https://githut.info/

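GitHut is a project that visualizes the landscape of programming languages on GitHub, the world's largest code hosting platform, with millions of users. It analyzes GitHub's publicly available data (specifically GitHub Archive, accessed through Google BigQuery) to understand the popularity and characteristics of each language.

The project treats language use as a reflection of how human problem-solving and collaboration have evolved, not merely as a tool. Because the data on language at repository *creation* is incomplete, GitHut focuses on "activity" (the number of code changes pushed) as the most reliable measure of popularity.

The release year of each language comes from Wikipedia's Timeline of programming languages. Details on the methodology and data collection are available in the GitHut GitHub repository, and the data is updated quarterly. Overall, GitHut offers a distinctive view of the changing world of programming through the lens of open-source development.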

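## GitHut: Revisiting GitHub Programming Language Trends

A Hacker News discussion revisited the 2014 GitHut project (githut.info), a visualization of programming language usage on GitHub. The original data runs only to 2014, which prompted a conversation about language trends and how the development landscape has changed since.

Notably, users pointed out that the decline in direct JavaScript "pushes" correlates with the rise of TypeScript. The discussion also highlighted the growing popularity of Go, and the higher "pushes per repository" figures for languages such as C++, Rust, and C, which may indicate complexity or active development.

Many commenters wished for an updated version of GitHut, and others pointed out that GitHut 2.0 already exists ([https://madnight.github.io/githut/#/pull_requests/2024/1](https://madnight.github.io/githut/#/pull_requests/2024/1)), though some found the new visualization less appealing. There was also discussion of rebuilding GitHut with more current data from GitHub Archive, using existing resources such as Google BigQuery.

Other discussion points included Copilot's impact on adoption among new developers, and questions about how languages such as Swift are represented.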

Original text
GitHut

GitHut is an attempt to visualize and explore the complexity of the universe of programming languages used across the repositories hosted on GitHub.

Programming languages are not simply the tools developers use to create programs or express algorithms but also instruments to code and decode creativity. By observing the history of languages we can follow humankind's quest for better ways to solve problems, to facilitate collaboration between people, and to reuse the effort of others.

GitHub is the largest code host in the world, with 3.4 million users. It is the place where the open-source development community offers access to most of its projects. By analyzing how languages are used on GitHub it is possible to understand the popularity of programming languages among developers and also to discover the unique characteristics of each language.

Data

GitHub provides a publicly available API for working with its huge dataset of events and interactions with the hosted repositories.
GitHub Archive takes this data a step further by aggregating and storing it for public consumption. The GitHub Archive dataset is also available via Google BigQuery.
The quantitative data used in GitHut is collected from GitHub Archive. The data is updated on a quarterly basis.
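
To make the data source more concrete, here is a minimal sketch of reading one GitHub Archive file. It assumes the hourly-file layout currently documented on gharchive.org (gzip-compressed, one JSON event per line, URLs of the form `https://data.gharchive.org/YYYY-MM-DD-H.json.gz`). GitHut itself queried the data through BigQuery, so this is not the project's own pipeline, and the helper names are hypothetical.

```python
import gzip
import json
from collections import Counter


def archive_url(year, month, day, hour):
    """Build the URL of one hourly archive file (assumed gharchive.org layout)."""
    return f"https://data.gharchive.org/{year:04d}-{month:02d}-{day:02d}-{hour}.json.gz"


def parse_events(gz_bytes):
    """Decode a gzip-compressed, newline-delimited JSON payload into dicts."""
    text = gzip.decompress(gz_bytes).decode("utf-8")
    # Split on "\n" only, so unusual line separators inside JSON strings are left alone.
    return [json.loads(line) for line in text.split("\n") if line.strip()]


def count_event_types(events):
    """Tally events by their `type` field (e.g. PushEvent, CreateEvent)."""
    return Counter(event.get("type", "unknown") for event in events)


if __name__ == "__main__":
    # Local stand-in for a downloaded file, so the sketch runs without network access.
    lines = [json.dumps({"type": "PushEvent"}), json.dumps({"type": "CreateEvent"})]
    payload = gzip.compress("\n".join(lines).encode("utf-8"))
    print(archive_url(2015, 1, 1, 15))
    print(count_event_types(parse_events(payload)))
```

To try it on real data, the bytes returned by `urllib.request.urlopen(archive_url(...)).read()` could be passed to `parse_events`, assuming the file layout above still holds.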

One further note about the data: a large number of records do not specify the programming language. This is especially evident for Create Events (repository creation), so it is not possible to visualize trending languages in terms of newly created repositories. For this reason the Activity value (the number of changes pushed) has been considered the best metric for the popularity of programming languages.
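
As a rough illustration of the Activity metric described above, the sketch below counts push events per language and measures how often the language is missing for a given event type. The record layout (`type` and `language` keys) and the sample data are hypothetical, chosen for illustration; they are neither the actual GitHub Archive schema nor GitHut's own code.

```python
from collections import Counter

# Hypothetical, simplified event records; the real GitHub Archive schema differs.
SAMPLE_EVENTS = [
    {"type": "PushEvent", "language": "JavaScript"},
    {"type": "PushEvent", "language": "Python"},
    {"type": "PushEvent", "language": "JavaScript"},
    {"type": "PushEvent", "language": None},
    {"type": "CreateEvent", "language": None},
    {"type": "CreateEvent", "language": "Go"},
]


def activity_by_language(events):
    """Count push events per language, skipping records with no language."""
    counts = Counter()
    for event in events:
        if event.get("type") == "PushEvent" and event.get("language"):
            counts[event["language"]] += 1
    return counts


def missing_language_share(events, event_type):
    """Fraction of events of the given type whose language is unspecified."""
    matching = [e for e in events if e.get("type") == event_type]
    if not matching:
        return 0.0
    missing = sum(1 for e in matching if not e.get("language"))
    return missing / len(matching)


if __name__ == "__main__":
    for language, pushes in activity_by_language(SAMPLE_EVENTS).most_common():
        print(language, pushes)
    print(missing_language_share(SAMPLE_EVENTS, "CreateEvent"))
```

A high missing-language share for creation events is the situation the text describes: ranking languages by newly created repositories would then be unreliable, while push counts remain usable.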

The release year of each programming language is taken from the Wikipedia table Timeline of programming languages.

For more information on the data collection methodology, check out the publicly available GitHub repository of GitHut.
