Visualizing The Training Costs Of AI Models Over Time

Original link: https://www.zerohedge.com/technology/visualizing-training-costs-ai-models-over-time

Advanced AI models such as ChatGPT and Google's Gemini Ultra require substantial investment, totaling millions of dollars, because of ever-increasing computational demands. Training costs have soared due to the high demand for powerful computing resources. According to Visual Capitalist's Dorothy Neufeld and Stanford University's 2024 AI Index Report, the cost of training complex AI models has risen markedly. Factors considered when determining training cost include training duration, hardware utilization efficiency, and the value of the hardware, though accurate cost figures remain scarce because of a lack of reliable data. The table below lists the approximate, inflation-adjusted training costs of notable AI models since 2017:

| Model | Training cost (inflation-adjusted) |
|---|---|
| Early Transformer model (2017) | $930 |
| Google's PaLM (540B) | $12.4 million |
| OpenAI's GPT-4 | $78.4 million |
| Google's Gemini Ultra | $191 million |

By comparison, OpenAI's GPT-4, trained last year, required more than 80,000 times the investment of the 2017 Transformer model. Despite this trend, new methods for cutting training costs while preserving performance keep emerging, including the development of specialized smaller models and the use of self-generated synthetic data. Sustained success remains elusive, however: recent experiments have shown models trained on synthetic data producing nonsensical output, a phenomenon referred to as "model collapse."


Original Article

Training advanced AI models like OpenAI’s ChatGPT and Google’s Gemini Ultra requires millions of dollars, with costs escalating rapidly.

As computational demands increase, the cost of the computing power needed to train these models is soaring. In response, AI companies are rethinking how they train generative AI systems. In many cases, this means adopting strategies to reduce computational costs given current growth trajectories.

As Visual Capitalist's Dorothy Neufeld shows in the following graphic, based on analysis from Stanford University’s 2024 Artificial Intelligence Index Report, the training costs for advanced AI models have surged.

How Training Cost is Determined

The AI Index collaborated with research firm Epoch AI to estimate AI model training costs, which were based on cloud compute rental prices. Key factors that were analyzed include the model’s training duration, the hardware’s utilization rate, and the value of the training hardware.
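The report does not spell out a formula, but the general approach can be sketched as a back-of-the-envelope calculation. The sketch below is only an assumption about how such an estimate might combine the factors named above; the function name and all numeric inputs are hypothetical placeholders, not figures from the AI Index or Epoch AI.

```python
def estimate_training_cost(training_flop: float,
                           peak_flops_per_chip: float,
                           utilization_rate: float,
                           hourly_rental_rate: float) -> float:
    """Rough cloud-rental cost estimate for a training run.

    training_flop:        total floating-point operations of the run
    peak_flops_per_chip:  peak FLOP/s of one rented accelerator
    utilization_rate:     fraction of peak throughput actually achieved
    hourly_rental_rate:   cloud rental price per accelerator-hour (USD)
    """
    effective_flops = peak_flops_per_chip * utilization_rate  # FLOP/s actually delivered per chip
    chip_seconds = training_flop / effective_flops            # accelerator-seconds of rented time
    chip_hours = chip_seconds / 3600
    return chip_hours * hourly_rental_rate


# Entirely made-up inputs, purely to show the arithmetic:
# 2e25 FLOP of training compute, 3e14 FLOP/s chips, 35% utilization, $2.50/hour.
print(f"${estimate_training_cost(2e25, 3e14, 0.35, 2.50):,.0f}")
```

A duration-based estimate (number of chips × wall-clock hours × hourly rental rate) arrives at the same quantity from the other direction, which is why training duration and utilization both appear in the list of factors.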

While many have speculated that training AI models has become increasingly costly, there is a lack of comprehensive data supporting these claims. The AI Index is one of the rare sources for these estimates.

Ballooning Training Costs

Below, we show the training cost of major AI models, adjusted for inflation, since 2017:

| Model | Year | Training cost (inflation-adjusted) |
|---|---|---|
| Transformer | 2017 | $930 |
| PaLM (540B) | 2022 | $12.4 million |
| GPT-4 | 2023 | $78.4 million |
| Gemini Ultra | 2023 | $191 million |

Last year, OpenAI’s GPT-4 cost an estimated $78.4 million to train, a steep rise from Google’s PaLM (540B) model, which cost $12.4 million just a year earlier.

For perspective, the training cost for Transformer, an early AI model developed in 2017, was $930. This model plays a foundational role in shaping the architecture of many large language models used today.

Google’s AI model, Gemini Ultra, cost even more to train, at a staggering $191 million. As of early 2024, the model outperforms GPT-4 on several metrics, most notably the Massive Multitask Language Understanding (MMLU) benchmark. This benchmark serves as a crucial yardstick for gauging the capabilities of large language models: it evaluates knowledge and problem-solving proficiency across 57 subject areas.
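As a rough illustration of what an MMLU-style evaluation involves, the sketch below grades multiple-choice answers and averages accuracy per subject. The sample questions and the `ask_model` stub are hypothetical stand-ins, not the actual benchmark dataset or harness.

```python
from collections import defaultdict

# Toy MMLU-style scoring: multiple-choice questions grouped by subject,
# graded by exact match on the chosen letter. The real benchmark spans
# 57 subjects; these two questions are illustrative only.
QUESTIONS = [
    {"subject": "high_school_physics",
     "question": "What is the SI unit of force?",
     "choices": {"A": "joule", "B": "newton", "C": "pascal", "D": "watt"},
     "answer": "B"},
    {"subject": "world_history",
     "question": "In which year did World War II end?",
     "choices": {"A": "1943", "B": "1944", "C": "1945", "D": "1946"},
     "answer": "C"},
]

def ask_model(question: str, choices: dict[str, str]) -> str:
    """Placeholder for a call to the model under evaluation; returns a letter."""
    return "B"  # a real harness would prompt the model and parse its reply

def evaluate(questions) -> dict[str, float]:
    correct, total = defaultdict(int), defaultdict(int)
    for q in questions:
        total[q["subject"]] += 1
        if ask_model(q["question"], q["choices"]) == q["answer"]:
            correct[q["subject"]] += 1
    return {subject: correct[subject] / total[subject] for subject in total}

print(evaluate(QUESTIONS))  # per-subject accuracy; MMLU reports the average
```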

Training Future AI Models

Given these challenges, AI companies are finding new solutions for training language models to combat rising costs.

These approaches include creating smaller models designed to perform specific tasks. Other companies are experimenting with generating their own synthetic data to feed into AI systems. However, a clear breakthrough is yet to be seen.

Today, AI models trained with synthetic data have been shown to produce nonsense when given certain prompts, triggering what is referred to as “model collapse”.
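Researchers have illustrated this failure mode with simple self-training loops. The toy simulation below is my own cartoon of the effect, not the experiments alluded to above: a categorical "model" is repeatedly refit to samples drawn from itself, so any category that happens not to be sampled disappears for good and diversity can only shrink.

```python
import random
from collections import Counter

# Toy illustration of "model collapse": a categorical distribution is refit,
# generation after generation, to synthetic data sampled from the previous
# generation of itself. Categories that go unsampled get probability zero
# and can never reappear, so the distribution's support only shrinks.
random.seed(42)

categories = list("ABCDEFGHIJ")
probs = {c: 1 / len(categories) for c in categories}  # generation 0: uniform
samples_per_generation = 20

for generation in range(1, 31):
    # Generate synthetic data from the current model...
    synthetic = random.choices(list(probs), weights=list(probs.values()),
                               k=samples_per_generation)
    # ...then refit the model to the synthetic data alone.
    counts = Counter(synthetic)
    probs = {c: counts[c] / samples_per_generation for c in counts}
    if generation % 5 == 0:
        print(f"generation {generation:2d}: "
              f"{len(probs)} of {len(categories)} categories survive")
```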
