ONNX Runtime and CoreML May Silently Convert Your Model to FP16

Original link: https://ym2132.github.io/ONNX_MLProgram_NN_exploration

Running an ONNX model on a Mac with ONNX Runtime's (ORT) CoreMLExecutionProvider can produce predictions that differ unexpectedly from PyTorch or from ORT on CPU, because the model is implicitly converted to FP16 precision. When converting to CoreML, ORT casts the model to FP16 by default, which can change results, especially near decision thresholds. The fix is to explicitly set `ModelFormat` to "MLProgram" when creating the `InferenceSession`: `ort.InferenceSession(onnx_model_path, providers=[("CoreMLExecutionProvider", {"ModelFormat": "MLProgram"})])`. This keeps the model in FP32. The discrepancy stems from CoreML's two model formats: the older NeuralNetwork format (ORT's default), which lacks explicit types and defaults to FP16 on Apple GPUs, and the newer MLProgram format, which preserves FP32 precision through a stronger, typed intermediate representation. MLProgram uses CoreML's Model Intermediate Language (MIL) for optimisation while keeping the specified data types. Although FP16 can be faster on Apple hardware, the implicit conversion can significantly affect model accuracy, so thorough testing on every deployment platform is essential to avoid issues like this.


Original article

Running an ONNX model in ONNX RunTime (ORT) with the CoreMLExecutionProvider may change the predictions your model makes implicitly and you may observe differences when running with PyTorch on MPS or ONNX on CPU. This is because the default arguments ORT uses when converting your model to CoreML will cast the model to FP16.

The fix is to use the following setup when creating the InferenceSession:
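A minimal sketch of that setup (the model path is a placeholder; the key part is the ModelFormat provider option):

```python
import onnxruntime as ort

# Ask the CoreML execution provider to use the MLProgram model format,
# which keeps the model in FP32 instead of implicitly casting it to FP16.
session = ort.InferenceSession(
    "model.onnx",  # placeholder path to your exported ONNX model
    providers=[("CoreMLExecutionProvider", {"ModelFormat": "MLProgram"})],
)
```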

While working on my EyesOff model, I began evaluating the model and its runtime. I was looking into the ONNX format and using it to run the model efficiently. I set up a little test bench in which I ran the model using PyTorch and ONNX with ONNX Runtime (ORT), both on MPS and CPU. While checking the outputs, I noticed that the metrics from the model run with ONNX on MPS differed from those on ONNX CPU and PyTorch CPU and MPS. Note that the metrics from PyTorch on CPU and MPS were the same.

When I say ORT on MPS, I mean running through ORT's execution providers. To run an ONNX model on the Mac GPU you have to use the CoreMLExecutionProvider (more on this to come).

Now, in Figures 1 and 2, observe the metric values: the PyTorch ones (Figure 1) are the same across CPU and MPS, but that isn't the same story for ONNX (Figure 2):

Figure 1 - PyTorch CPU & MPS metrics output

Figure 2 - ORT CPU & MPS metrics output

Wow, look at the difference in Figure 2! When I saw this it was quite concerning. Floating point math can lead to small discrepancies between calculations carried out on the GPU and the CPU, but the differences here are too large to be explained by that alone.

Given the difference in metrics, I was worried that running the model with ORT was changing the output of the model and hence its behaviour. The metrics change because some predictions near the decision threshold (0.5) flipped to the opposite side of it, which can be seen in the confusion matrices for the ONNX CPU run and MPS run:

FP32 Confusion Matrix (ONNX on CPU)

                    Predicted Negative    Predicted Positive
Actual Negative     207 (TN)              24 (FP)
Actual Positive     69 (FN)               164 (TP)

FP16 Confusion Matrix (ONNX on MPS)

                    Predicted Negative    Predicted Positive
Actual Negative     206 (TN)              25 (FP)
Actual Positive     68 (FN)               165 (TP)

So two predictions flipped from negative to positive.

Having said that, the first thing I did was make my life easier by simplifying the scenario: swapping the large EyesOff model for a simple one-layer MLP and using that to run the experiments.

Why am I Using ONNX and ONNX RunTime?

Before going on it’s worth discussing what ONNX and ORT are, and why I’m even using them in the first place.

ONNX [1]

ONNX stands for Open Neural Network Exchange. It can be thought of as a common programming language in which to describe ML models. Under the hood, ONNX models are represented as graphs; these graphs outline the computation path of a model and show the operators and transformations required to get from input to prediction. These graphs are called ONNX graphs.

The use of a common language to describe models makes deployment easier and in some cases improves efficiency, in terms of resource usage or inference speed. Firstly, the ONNX graph itself can be optimised. Take PyTorch for example: you train the model in it, and while PyTorch is very mature and extremely optimised, it's such a large package that some things can be overlooked or difficult to change. By converting the model to ONNX, you take advantage of the fact that ONNX was built specifically with inference in mind, and with that come optimisations which the PyTorch team could implement but have not yet.

Furthermore, ONNX models can be run cross-platform in specialised runtimes. These runtimes are optimised for different architectures and add another layer of efficiency gains.

ONNX RunTime (ORT) [2]

ORT is one of these runtimes. ORT actually runs the model: it can be thought of as an interpreter that takes the ONNX graph, implements the operators and runs them on the specified hardware. ORT has a lot of magic built into it; the operators are extremely optimised and, through the use of execution providers, they target a wide range of hardware. Each execution provider is optimised for the specific hardware it targets, which enables the ORT team to implement extremely efficient operators, giving us another efficiency gain.
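As a quick illustration, this is roughly how execution providers are selected when creating a session (a sketch; the model path is a placeholder):

```python
import onnxruntime as ort

# See which execution providers this ORT build supports.
print(ort.get_available_providers())

# Create a session with a preferred provider; ORT falls back to the
# CPU provider for any part of the graph the first provider cannot handle.
session = ort.InferenceSession(
    "model.onnx",  # placeholder path
    providers=["CoreMLExecutionProvider", "CPUExecutionProvider"],
)
```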

CoreML [3]

As mentioned before, I used the CoreMLExecutionProvider to run the model on the Mac GPU. This execution provider tells ORT to make use of CoreML, an Apple-developed framework which lets models (neural networks and classical ML models) run on Apple hardware: CPU, GPU and ANE. ORT's job in this phase is to take the ONNX graph and convert it to a CoreML model. CoreML is Apple's answer to running efficient on-device models on Apple hardware.

Note that all of this doesn't always mean the model will run faster. Some models may run faster in PyTorch, TensorRT or another framework, which is why it is important to benchmark and test as many approaches as is sensible.

Finding the Source of the CPU vs MPS Difference - With an MLP

The MLP used is very simple: it has a single layer with 4 inputs, 3 outputs and the bias turned off. So I pretty much created a fancy matrix multiplication.
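In PyTorch terms, the model is essentially the following (a minimal sketch; the export path and input names are placeholders):

```python
import torch
import torch.nn as nn

# A single linear layer: 4 inputs -> 3 outputs, no bias,
# which is effectively just a matrix multiplication.
mlp = nn.Linear(in_features=4, out_features=3, bias=False)
mlp.eval()

# Export it to ONNX so the same weights can be run through ORT / CoreML.
example_input = torch.randn(1, 4)
torch.onnx.export(
    mlp,
    example_input,
    "mlp.onnx",  # placeholder path
    input_names=["input"],
    output_names=["output"],
)
```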

To understand where the issue was coming from I ran this MLP through some different setups:

- PyTorch CPU
- PyTorch MPS
- ORT CPU
- ORT MPS
- CoreML FP32
- CoreML FP16

The goal of this exercise is (1) to find out whether the difference in outputs also appears in a simple model, and (2) to figure out where exactly the issue arises.

Before showing the full results, I want to explain why I included the CoreML FP16 and FP32 runs, specifically the FP16 one. When I initially ran the MLP experiment I only ran PyTorch, ORT and CoreML FP32, but the output numbers from ORT on MPS looked like FP16 numbers. So I tested whether they were, and also whether the outputs from the other runs were true FP32 numbers. You can do this with a "round trip" test: convert a number to FP16 and back to FP32. If the number is unchanged after this process then it was already representable in FP16, but if it changes then it was a true FP32 value. The change happens because FP16 can represent far fewer floating point numbers than FP32. It's a very simple check to carry out:
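Something along these lines (a sketch using NumPy; the test value is just an example):

```python
import numpy as np

def is_fp16_representable(x: float) -> bool:
    """Round-trip a value through FP16 and report whether it survives unchanged."""
    x32 = np.float32(x)
    round_tripped = np.float32(np.float16(x32))
    return bool(round_tripped == x32)

print(is_fp16_representable(0.5))        # True: 0.5 exists exactly in FP16
print(is_fp16_representable(0.1234567))  # False: this value only exists in FP32
```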

FP16 and FP32, specifically, have the following specifications:

Single precision (FP32): 32 bits in total (23 significand bits + 1 sign bit, 8 exponent bits), representable range roughly \(1.2 \times 10^{-38}\) to \(3.4 \times 10^{38}\)
Half precision (FP16): 16 bits in total (10 significand bits + 1 sign bit, 5 exponent bits), representable range roughly \(5.96 \times 10^{-8}\) to \(6.55 \times 10^{4}\)

So, as FP16 is half the size, it affords a couple of benefits: it requires half the memory to store, and computations with it can be quicker too. However, this comes at a cost of precision; FP16 cannot represent very small numbers, or the distances between nearby numbers, as accurately as FP32.

An example of FP16 vs FP32 - The Largest Number Below 1

  • FP32 - 0.99999994039535522461
  • FP16 - 0.99951171875

As you see FP32 can represent a value much closer to 1.

If you need a quick refresher on FP values, see the box above. If you've already read that, or you know enough about FP already, let's look at why some predictions flip.

In my model the base threshold for a 0 or 1 class is 0.5. Both FP16 and FP32 can represent 0.5 exactly:
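The threshold itself isn't the issue, since 0.5 survives an FP16 round trip unchanged. The problem is that model outputs sitting very close to 0.5 in FP32 may not, so once the computation happens in FP16 they can end up on the other side of the threshold. A small illustrative sketch (the probability value here is made up, not taken from the EyesOff model):

```python
import numpy as np

threshold = np.float32(0.5)

# 0.5 itself is exactly representable in FP16: it survives a round trip unchanged.
assert np.float32(np.float16(0.5)) == threshold

# A probability sitting just above the threshold in FP32...
p_fp32 = np.float32(0.50001)

# ...collapses onto 0.5 in FP16: the next FP16 value above 0.5 is
# 0.5 + 2**-11 (about 0.500488), so 0.50001 rounds down to exactly 0.5.
p_fp16 = np.float16(p_fp32)

print(p_fp32 > threshold)              # True  -> class 1
print(np.float32(p_fp16) > threshold)  # False -> class 0: the prediction flips
```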

Claude came in useful here. I gave it the structure of the repo and it told me where I ought to place breakpoints to debug the ORT package (it turns out you can debug a C++ package from Python: clone the repo, build it from source and then attach a debugger to your running Python process using its PID). However, upon running the code the breakpoints weren't being hit. I was puzzled for a bit, but then I realised why. It turns out CoreML has two model formats, "NeuralNetwork" and "MLProgram"; I will call them the NN and MLP formats respectively. The behaviour of the ORT repo changes depending on which you want, as does the behaviour of CoreML, with the default being the NN format. So the breakpoints weren't hit because the code I was stepping into handled the MLP format, whereas I had not set that option, so execution flowed through the default NN code path. Knowing this, I took a step back and began experimenting with the NN vs MLP format.

The Fix - NeuralNetwork vs MLProgram CoreML Format

So, CoreML has two model formats, which determine how the model is stored and run with CoreML. The NeuralNetwork (NN) format is older and the MLProgram (MLP) format is newer. ORT specifies the NN format by default, but it does allow you to pass a flag to use the MLP format.

Testing the MLP format revealed it as the solution! See below in Figure 6 the final output, which includes ORT with both the MLP and NN formats run on the GPU.

Figure 6 - Running ORT with MLProgram format

So ORT on MPS with the NN format shows the same difference from the PyTorch CPU baseline as CoreML FP16, whereas ORT with the MLP format matches the baseline, which is exactly what I wanted. Mystery solved! By setting the model format to the newer MLProgram format, no implicit cast to FP16 takes place.

Why Did the MLProgram Format Work When the NeuralNetwork Format Didn't?

To understand the difference in behaviour between these two model formats, we need to take a deep dive into the internals of CoreML, its goals and the two formats themselves. Let's begin with CoreML.

CoreML

ORT implements methods to convert the ONNX graph into CoreML model formats. CoreML has two types of model format, which define how the model is represented in the CoreML framework, how it's stored and how it's run. The older is the NeuralNetwork format and the newer one, which solved our issue, is the MLProgram format. The reason MLProgram keeps the model at FP32 when running on MPS comes down to the differences in model representation between these two formats. Let's take a look at both of them.

Neural Network Format

As stated, the NN format is the older one; it came out in 2017. It stores models as a Directed Acyclic Graph (DAG). Each layer in the model is a node in the DAG, and the nodes encode information on the layer type, a list of input names, output names and a collection of parameters specific to the layer type [7]. We can inspect the model which is created by ORT's InferenceSession call with the following code:
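Roughly along these lines, assuming you can locate the CoreML model that ORT produces on disk (the path below is a placeholder; coremltools is used for the inspection):

```python
import coremltools as ct

# Load the CoreML model that ORT generated from the ONNX graph.
# The path is a placeholder: point it at the converted .mlmodel file.
spec = ct.utils.load_spec("converted_model.mlmodel")

# Which of the two CoreML formats is this? The default reports
# "neuralNetwork"; the MLProgram format reports "mlProgram".
print(spec.WhichOneof("Type"))

# The input/output descriptions claim FP32, even though the NN format
# will still run intermediate layers in FP16 on the GPU.
print(spec.description)

# The layer nodes themselves; note there is no dtype information here.
print(spec.neuralNetwork.layers)
```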

The CoreML runtime also decides which parts of the model can run on which hardware [9], and with the NN format each hardware unit has different abilities in terms of what FP types it can handle. Take a look at Figure 7 for Apple's guide on the hardware and the FP types each uses:

Figure 7 - The FP types Apple hardware will use when running the NN format

When running on CPU, the NN format model runs in FP32, as we observed. However, on the GPU it is implicitly cast to FP16, even though the input and output are specified as FP32, as you can see in the inspection output above. This is an inherent limitation of the NN format: the DAG structure of the model does not store any information on the types of the intermediate layers. You can see this in the inspection output; the part beginning neuralNetwork stores the info on the actual layer node, in our case a single linear layer. Observe that there is no information on the FP precision of the node itself, hence CoreML implicitly sets it to FP16 on the GPU.

Why Does CoreML Implicitly Use FP16?

From the typed execution docs for coremltools [8], the goal of CoreML is to run ML models in the most performant way, and FP16 happens to be more performant than FP32 on Apple GPUs (which makes sense, as it's half the precision). They also state that most of the time the reduced precision doesn't matter for inference. This whole blog post shows why that is false and a pitfall of the NN format: the user should choose which precision the model is run in; it should never be implicit.

MatMul Test - Is FP16 Faster on Apple Hardware?

To test Apple's claim that FP16 is more performant on Apple hardware, I carried out a large matmul: multiplying a 16384x16384 matrix by another 16384x16384 matrix should show us whether FP16 is faster. The size is arbitrary; I just wanted something large.

The matmul was run 10 times, in both FP32 and FP16 on the MPS hardware, and we take the average:

FP32 Average Time: 8.6521 seconds
FP16 Average Time: 6.7691 seconds

Speedup Factor: 1.28x faster
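A minimal sketch of this kind of benchmark, assuming PyTorch with the MPS backend (timings will of course vary by machine):

```python
import time
import torch

def bench_matmul(dtype: torch.dtype, size: int = 16384, runs: int = 10) -> float:
    """Average wall-clock time of a size x size matmul on MPS in the given dtype."""
    a = torch.randn(size, size, device="mps", dtype=dtype)
    b = torch.randn(size, size, device="mps", dtype=dtype)
    torch.mps.synchronize()  # make sure setup work has finished

    start = time.perf_counter()
    for _ in range(runs):
        _ = a @ b
    torch.mps.synchronize()  # wait for GPU work to finish before stopping the clock
    return (time.perf_counter() - start) / runs

fp32_time = bench_matmul(torch.float32)
fp16_time = bench_matmul(torch.float16)
print(f"FP32 Average Time: {fp32_time:.4f} seconds")
print(f"FP16 Average Time: {fp16_time:.4f} seconds")
print(f"Speedup Factor: {fp32_time / fp16_time:.2f}x faster")
```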

So FP16 is quicker, which sheds a bit of light on why the NN format implicitly casts to FP16: on paper, if you only care about speed, it's the better option.

A final point on the NeuralNetwork format: it's surprising that the weights themselves are stored as FP32 values (a round-trip test verifies this) yet the layer is still executed in FP16, once again showing that the NN format doesn't respect the FP precision of the layer but simply casts it to FP16.

All that is to say: this was not a bug but an explicit design choice, which funnily enough involves implicitly going against what the user wants. The NN format has its downsides, which is why Apple introduced the MLProgram format. Let's look into that.

The MLProgram (MLP) Format

The MLP format is the newer and better model format in CoreML, released in 2021. The core thing we care about is that the intermediate tensors are typed, i.e. there is no implicit casting when using the MLP format; the user controls whether the model is run in FP16 or FP32.

The MLP format allows for this because it uses a different representation of ML models: instead of a DAG, it uses a programmatic representation of the model. By representing the model as code, it allows for greater control over the operations.

Let’s see what this looks like in the stored model format and how it differs to the NN format inspection.

The code to do so is pretty similar:
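Again roughly as follows, assuming you can locate the generated model on disk (MLProgram models are stored as an .mlpackage; the path is a placeholder):

```python
import coremltools as ct

# Load the MLProgram-format model that ORT generated (placeholder path).
spec = ct.utils.load_spec("converted_model.mlpackage")

# This time the oneof reports "mlProgram" rather than "neuralNetwork".
print(spec.WhichOneof("Type"))

# The serialised MIL program: functions, ops and, crucially,
# explicit dtypes on every intermediate tensor.
print(spec.mlProgram)
```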

Much like the compiler community consolidated on tools like LLVM, the ML community has too [10]. Intermediate Representations (IRs) began to be used in ML; an IR is a hardware-agnostic specification of a program which a compiler can optimise. CoreML introduced its own IR, called MIL (Model Intermediate Language), and it is what produces the output we see in the stored MLProgram.

The MIL Approach

MIL, and IRs in general, afford a lot of benefits. They are inherently designed for optimisation, and by providing a general framework they let optimisation engineers work towards a common goal and extract maximal value. In MIL specifically, some of the differences we've discussed between the NN and MLProgram formats are implemented at this level; namely, each variable within the model has an explicit dtype.

Note that the MLProgram serialises and stores the output of the MIL phase; we've already observed how it differs from the NeuralNetwork model, with the biggest difference being the explicit types.

Further Reading on ML Compilers

https://huyenchip.com/2021/09/07/a-friendly-introduction-to-machine-learning-compilers-and-optimizers.html

Takeaways

The Fix

The solution to all the issues we discussed today: if you are using the CoreMLExecutionProvider in ORT, be sure to set ModelFormat to MLProgram. This ensures that whatever precision your model was trained in is the precision it is run with, which in my case was FP32 (whereas the default ModelFormat, NeuralNetwork, casts the model to FP16).

You can implement this as such:
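That is (reconstructed here; paths and input names are placeholders, the ModelFormat option is the important part):

```python
import onnxruntime as ort

# Explicitly request the MLProgram format so the model stays in FP32
# when running through CoreML on the GPU.
session = ort.InferenceSession(
    "model.onnx",  # placeholder path to your ONNX model
    providers=[("CoreMLExecutionProvider", {"ModelFormat": "MLProgram"})],
)

# Run inference as usual; outputs should now match the CPU / PyTorch results.
# (Input name and data below are placeholders.)
# outputs = session.run(None, {"input": my_input_array})
```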

[1] https://onnx.ai/onnx/intro/concepts.html
[2] https://onnxruntime.ai
[3] https://developer.apple.com/documentation/coreml
[4] https://en.wikipedia.org/wiki/Single-precision_floating-point_format
[5] https://en.wikipedia.org/wiki/Half-precision_floating-point_format
[6] https://floating-point-gui.de/formats/fp/
[7] https://apple.github.io/coremltools/mlmodel/Format/NeuralNetwork.html
[8] https://apple.github.io/coremltools/docs-guides/source/typed-execution.html
[9] https://github.com/microsoft/onnxruntime/issues/21271#issuecomment-3637845056
[10] https://www.modular.com/blog/democratizing-ai-compute-part-8-what-about-the-mlir-compiler-infrastructure
