Huawei's star AI model was built on burnout and plagiarism

Original link: https://the-open-source-ward.ghost.io/the-pangu-illusion-how-huaweis-star-ai-model-was-built-on-burnout-betrayal-and-open-source-theft/

A whistleblower report published on GitHub alleges that Huawei's Pangu LLM project involved plagiarism of Alibaba's Qwen models, corroborating claims previously made by HonestAGI. The report describes intense internal pressure and competition at Huawei that led the "small model team" to repackage open-source models such as Qwen and DeepSeek rather than develop their own from scratch. They allegedly added layers and tweaked parameters on top of the open-source models while claiming to have genuinely improved them with hundreds of billions of tokens of training, reporting dramatic 10-point metric gains. The report also highlights the challenges the core LLM team faced due to U.S. sanctions and buggy chips. The author argues that the affair illustrates the potential strength of open-source LLMs and the rise of AI forensics through honest plagiarism-detection methods. The case raises questions about intellectual property and the need for clearer guidelines on model attribution and fair use in AI training.


Original article

On July 5th, the True-Story-of-Pangu GitHub repository was published. It is not a traditional code repository, as it contains only a README.md: a whistleblower report detailing the development of Pangu, the flagship LLM of Huawei's Noah's Ark Lab.

This project was recently accused by HonestAGI of copying Alibaba's Qwen models, a claim that Huawei executives strongly deny. However, the whistleblower's version of events appears to align with HonestAGI's allegations, depicting an intense development cycle marked by fierce internal competition and plagiarism.

In this article, I wanted to cover the key points of the affair while also taking a step back to reflect on how such news impacts and shapes the future of open-source LLMs. If you enjoy this kind of content and want fresh news from the open-source world, subscribe to the newsletter now!

Note: HonestAGI's work was taken down from the internet, and I haven't been able to find out why. (Let me know if you come across any additional information.)


About The Pangu project

Huawei’s Pangu Model, unveiled in 2021, was positioned as a major milestone in China's race for artificial intelligence dominance. The project aimed to rival Western large language models (LLMs) like GPT-3 by leveraging Huawei’s computing infrastructure and research capabilities. Internally, it was seen not just as a technical endeavor, but as a national priority — especially under increasing U.S. tech sanctions.

According to the author, the work was split across multiple teams, including the core LLM team (the author’s own) and the small model team. Although initially presented as a long-term research initiative, the project quickly transformed into a crunch-intensive production race.

LLM Core Team Development Hell

From the outset, Huawei’s core LLM team faced a steep uphill battle in building the Pangu model. Due to U.S. sanctions, Huawei was restricted from using the world-standard NVIDIA GPUs, forcing them to rely primarily on Ascend chips—a custom in-house alternative. These chips were buggy and unstable, and working with them required the core team to spend enormous engineering effort stabilizing the stack before real progress on model training could even begin.

Despite this, the team made genuine technical strides: training several iterations of dense and Mixture of Experts (MoE) models ranging from 13B to 135B parameters. Even when they faced severe issues—such as an inefficient tokenizer or a collapsed 230B model—they kept refining their approach and architecture. Eventually, they delivered a fully from-scratch 135B dense model (V3) that was seen internally as Huawei’s first truly competitive, honest effort.

The 135B V2 Deception

When the core team's original 135B model lagged behind competitors, the small model team claimed to have improved it with just a few hundred billion tokens of training, reporting dramatic 10-point metric gains.

The Reality: According to the whistleblower, they had repackaged Qwen 1.5 110B by:

  • Adding layers and increasing FFN dimensions to reach ~135B parameters (a rough parameter-count sketch follows this list)
  • Inserting mechanisms from the Pangu Pi paper for legitimacy
  • Changing architecture from 107 layers to 82 layers
  • Smoking gun: Parameter distributions matched Qwen 110B almost exactly, with model class names still referencing "Qwen"
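
To make the "adding layers and widening the FFN" claim concrete, here is a back-of-the-envelope sketch of how those two knobs drive a dense transformer's parameter count. All hyperparameters below are made up for illustration; they are not the real Qwen or Pangu configurations.

```python
# Rough dense-transformer parameter-count estimate.
# Every number here is hypothetical -- it only shows that a handful of extra
# layers plus a wider FFN can add tens of billions of parameters.

def dense_param_count(n_layers: int, d_model: int, d_ffn: int, vocab_size: int) -> int:
    """Crude estimate: token embeddings plus per-layer attention and gated-FFN weights."""
    embed = vocab_size * d_model   # token embedding table (output head assumed tied)
    attn = 4 * d_model * d_model   # Q, K, V, O projections
    ffn = 3 * d_model * d_ffn      # gated FFN: up, gate, and down projections
    return embed + n_layers * (attn + ffn)

# Hypothetical "before" and "after" configurations.
base = dense_param_count(n_layers=80, d_model=8192, d_ffn=43008, vocab_size=152_000)
wider = dense_param_count(n_layers=90, d_model=8192, d_ffn=49152, vocab_size=152_000)

print(f"base : {base / 1e9:.1f}B parameters")   # roughly 107B
print(f"wider: {wider / 1e9:.1f}B parameters")  # roughly 134B
```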

Pangu Pro MoE 72B

Later, the small model team claimed this was scaled from their 7B model, but evidence suggests it was based on Qwen 2.5 14B. It quickly outperformed the core team's 38B V3 in internal evaluations, demoralizing those who had worked genuinely.

DeepSeek V3 Response

After DeepSeek V3's impressive launch, the small model team allegedly:

  • Directly loaded DeepSeek's checkpoint (directory name unchanged)
  • Froze the parameters and repackaged the result as their own work (a generic sketch of what loading and freezing a checkpoint looks like follows this list)
  • Easily outperformed the core team's genuine 718B model still in development
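
For readers who don't work with these tools, the operation described above is mechanically very simple. The snippet below is a generic sketch, using Hugging Face transformers and a placeholder checkpoint directory, of what loading a released checkpoint and freezing its parameters looks like; it is not the actual code or model involved in the allegations.

```python
# Generic illustration of "load a released checkpoint and freeze its parameters".
# The directory name is a placeholder, not the checkpoint named in the report.
import torch
from transformers import AutoModelForCausalLM

checkpoint_dir = "./open_checkpoint"  # hypothetical local copy of an open-weights model

model = AutoModelForCausalLM.from_pretrained(checkpoint_dir, torch_dtype=torch.bfloat16)

# Freeze every parameter: no gradients flow, so subsequent "training" cannot change
# the original weights -- only newly added modules (if any) would learn anything.
for param in model.parameters():
    param.requires_grad = False

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"trainable parameters after freezing: {trainable}")  # 0 if nothing was added
```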

The Systemic Injustice

The whistleblower describes a system where legitimate work was consistently undermined:

While the core team followed strict processes and built models honestly, the small model team operated with impunity, repackaging competitors' work and receiving credit. Management knew the truth but allowed it because the fake results benefited them.

This drove away top talent to companies like DeepSeek, ByteDance, and Moonshot AI. One departing colleague said: "Joining this place was the greatest shame of my technical career."


Taking a Step Back

While every story has multiple perspectives and this account is heavily biased toward the whistleblower's view, it reveals something significant about the current state of AI development.

The Pangu allegations illustrate what happens when geopolitical pressures and corporate competition collide with technical reality. Under intense pressure to deliver results, organizations may prioritize appearances over genuine innovation—a pattern that extends far beyond this single case.

So, is it just more of the same? Nothing new under the sun?

I tend to disagree — at least from a technical point of view — because, to me, this story highlights three interesting aspects of the ongoing AI race:

Open-source models might ultimately come out on top, which would be a remarkable shift in a world where most algorithms and code remain behind closed doors. The irony here is striking: while companies like Huawei allegedly cut corners by repackaging open models, they're inadvertently proving the superiority of openly developed systems.

HonestAGI is paving the way for detecting plagiarism between models by making it possible to identify whether a given model contains the signature of an existing one. This represents a genuinely new frontier in AI accountability — something like forensic science for neural networks. The ability to trace parameter distributions, architectural choices, and training signatures means we're moving beyond simple performance comparisons to actual model genealogy. As this technology matures, it could fundamentally change how we verify claims about model development and establish intellectual property rights in the AI space.
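
As a rough illustration of what such model forensics can look like in practice, the sketch below compares per-tensor weight statistics of two checkpoints to see whether one resembles a lightly modified copy of the other. This is my own simplification of the general idea, not HonestAGI's actual method, and the model IDs are placeholders.

```python
# Crude "model genealogy" check: compare per-tensor weight statistics of two models.
# Placeholder model IDs; an illustrative simplification, not HonestAGI's method.
import torch
from transformers import AutoModelForCausalLM

def weight_signature(model) -> dict[str, tuple[float, float]]:
    """Map each parameter tensor name to (mean, std) -- a crude distributional fingerprint."""
    sig = {}
    for name, param in model.named_parameters():
        tensor = param.detach().float()
        sig[name] = (tensor.mean().item(), tensor.std().item())
    return sig

model_a = AutoModelForCausalLM.from_pretrained("org-a/model-a")  # placeholder
model_b = AutoModelForCausalLM.from_pretrained("org-b/model-b")  # placeholder

sig_a, sig_b = weight_signature(model_a), weight_signature(model_b)

# Compare the std profiles of tensors that share a name in both checkpoints.
shared = sorted(set(sig_a) & set(sig_b))
diffs = [abs(sig_a[n][1] - sig_b[n][1]) for n in shared]
print(f"shared parameter tensors: {len(shared)}")
if diffs:
    print(f"mean |std difference|: {sum(diffs) / len(diffs):.6f}")  # near zero suggests near-identical weights
```

A near-zero difference across many tensors is not proof of copying on its own, but it is exactly the kind of signal that justifies a closer look at architecture choices and leftover naming.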

There is still a long road ahead when it comes to addressing data privacy and intellectual property in training AI models. The Pangu case exposes how murky these waters remain — when does inspiration become imitation, and when does imitation become theft? The current legal frameworks simply weren't designed for a world where billion-parameter models can be reverse-engineered, fine-tuned, and repackaged. We need new standards for attribution, new concepts of derivative work in AI, and clearer guidelines about what constitutes fair use when training data and model architectures are involved.


Thanks for reading! I hope this deep dive into the Pangu controversy has given you some food for thought about the evolving landscape of AI development. If you enjoy this kind of content and want fresh news from the open-source world, subscribe to the newsletter now!
