Apple Neural Engine: Architecture, Programming, and Performance

Original link: https://arxiv.org/abs/2606.22283

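This paper presents a comprehensive, reverse-engineered technical analysis of the Apple Neural Engine (ANE), the fixed-function matrix accelerator integrated into Apple's A-series and M-series chips. It covers the A11 through A18 and M1 through M5 hardware families and describes the ANE's architecture in detail, including its datapath, weight-compression scheme, firmware, kernel driver, and command protocol. By combining direct hardware measurement (on the M1 and M5) with static analysis of the private runtime and compiler, the author determines the engine's performance bounds and operating characteristics. The guide shows that although the ANE is normally reachable only through Apple's Core ML framework, it can be called directly from ordinary user space. The author cautions, however, that this low-level route is undocumented, unsupported, and version-fragile; it is intended for measurement, research, and on-device work, not for shipping software, where Core ML remains the supported path. Overall, the work offers a transparent view of the proprietary machinery behind machine-learning acceleration on Apple silicon.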

Original text

[Submitted on 21 Jun 2026]

Apple Neural Engine: Architecture, Programming, and Performance, by Spencer H. Bryngelson

Abstract: The Apple Neural Engine (ANE) is the fixed-function matrix accelerator that has shipped in Apple systems-on-chip since the A11-class iPhone and iPad chips and the M1-class Mac chips, exposed to applications only through the Core ML model framework. This guide reports a reverse-engineered account of the engine, based on direct measurement on Apple silicon and static analysis of the private runtime, compiler, kernel driver, and firmware. It documents the datapath and the roofline that bound the engine's throughput and energy, the dispatch route that reaches it below Core ML, the compiler and on-disk program format, the weight-compression scheme, and the kernel driver, firmware, and command protocol beneath them. The account covers the A11 through A18 and M1 through M5 families, with per-chip target tables and an operation-by-device matrix; the direct measurements are on the M1 and M5. Claims are labeled as measured, decompile-derived, or predicted, and the methodology and open questions are recorded. The direct route is callable from ordinary user space but remains undocumented, unsupported, and version-fragile; it is intended for measurement, research, and on-device work, not for shipping software, where Core ML remains the supported path.
From: Spencer Bryngelson
[v1] Sun, 21 Jun 2026 00:17:34 UTC (407 KB)
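
The abstract mentions a roofline that bounds the engine's throughput. As a rough illustration of what a roofline bound means in general, the sketch below computes the standard bound min(peak compute, memory bandwidth × arithmetic intensity) for an fp16 matrix multiply. The function names and the peak and bandwidth figures are hypothetical placeholders chosen for round arithmetic; they are not values measured or reported in the paper, and the byte count assumes each operand crosses the memory interface exactly once.

```python
def arithmetic_intensity(m, n, k, bytes_per_element=2):
    """FLOPs per byte for an (m x k) @ (k x n) matmul.

    Assumes each input and the output is moved exactly once (an idealization).
    """
    flops = 2 * m * n * k  # one multiply and one add per inner-product term
    bytes_moved = bytes_per_element * (m * k + k * n + m * n)
    return flops / bytes_moved


def attainable_flops(intensity, peak_flops, bandwidth_bytes_per_s):
    """Generic roofline bound: limited by compute or by memory traffic."""
    return min(peak_flops, bandwidth_bytes_per_s * intensity)


def ridge_point(peak_flops, bandwidth_bytes_per_s):
    """Intensity (FLOP/byte) above which a kernel becomes compute-bound."""
    return peak_flops / bandwidth_bytes_per_s


# Hypothetical placeholder hardware numbers, NOT taken from the paper.
PEAK_FLOPS = 10e12   # 10 TFLOP/s
BANDWIDTH = 100e9    # 100 GB/s

if __name__ == "__main__":
    print("ridge point (FLOP/byte):", ridge_point(PEAK_FLOPS, BANDWIDTH))
    for shape in [(1, 1024, 1024), (1024, 1024, 1024)]:
        ai = arithmetic_intensity(*shape)
        bound = attainable_flops(ai, PEAK_FLOPS, BANDWIDTH)
        regime = "compute-bound" if bound >= PEAK_FLOPS else "bandwidth-bound"
        print(shape, f"intensity={ai:.2f}", f"bound={bound:.3e}", regime)
```

Under these placeholder numbers, a vector-matrix product has an intensity near 1 FLOP/byte and sits on the bandwidth-limited slope, while a large square matmul exceeds the ridge point and is capped by peak compute. Real figures for any given chip would have to come from the paper's measurements.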