Hardware Acceleration of LLMs: A comprehensive survey and comparison

Original link: https://arxiv.org/abs/2409.03384

The paper "Hardware Acceleration of LLMs: A Comprehensive Survey and Comparison" discusses the use of large language models (LLMs) for natural language processing tasks. It surveys various approaches to accelerating transformer network hardware for these models, including frameworks such as TensorFlow, PyTorch, and BERT. The authors provide qualitative and quantitative comparisons of these frameworks based on factors like technology, processing platform (FPGA, ASIC, In-Memory, GPU), speedup, energy efficiency, and performance (GOPs). However, due to differences in implementation technologies, it's challenging to compare the schemes fairly. To address this, the authors estimate the performance and energy efficiency of each approach when applied to the same technology. They also test some parts of the LLMs on multiple FPGA chips to achieve a fair comparison. The main goal of this study is to identify efficient methods for hardware acceleration of LLMs for improved natural language processing capabilities.


[Submitted on 5 Sep 2024]

Authors: Nikoletta Koilia, Christoforos Kachris
Abstract: Large Language Models (LLMs) have emerged as powerful tools for natural language processing tasks, revolutionizing the field with their ability to understand and generate human-like text. In this paper, we present a comprehensive survey of the research efforts that have been presented for the acceleration of transformer networks for Large Language Models using hardware accelerators.
The survey presents the frameworks that have been proposed and then performs a qualitative and quantitative comparison regarding the technology, the processing platform (FPGA, ASIC, In-Memory, GPU), the speedup, the performance (GOPs), and the energy efficiency (GOPs/W) of each framework. The main challenge in comparison is that every proposed scheme is implemented on a different process technology, making a fair comparison hard. The main contribution of this paper is that we extrapolate the performance and energy efficiency results to the same technology to make a fair comparison, at two levels: one theoretical and one more practical. We implement part of the LLMs on several FPGA chips to extrapolate the results to the same process technology, and then we make a fair comparison of the performance.
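The abstract describes the extrapolation only at a high level. One common first-order normalization, used here purely as an illustrative assumption rather than the paper's actual model, scales clock rate (and hence throughput) and dynamic power linearly with the ratio of process nodes:

```python
# First-order process-node normalization (assumption: classical scaling,
# where delay and dynamic power shrink roughly linearly with feature size).
# This sketch is NOT the paper's methodology, only an illustration of how
# results reported on different nodes can be projected onto a common one.
def scale_to_node(gops: float, watts: float, node_nm: float, target_nm: float):
    s = node_nm / target_nm     # > 1 when moving to a smaller (faster) node
    return gops * s, watts / s  # throughput up, power down, both ~linear

# Example: a design reported at 28 nm, normalized to a 16 nm target node.
gops, watts = scale_to_node(gops=400.0, watts=20.0, node_nm=28.0, target_nm=16.0)
print(f"{gops:.1f} GOPs at {watts:.1f} W -> {gops / watts:.2f} GOPs/W")
```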
From: Christoforos Kachris
[v1] Thu, 5 Sep 2024 09:43:25 UTC (1,209 KB)