Building LLMs from the Ground Up: A 3-Hour Coding Workshop

Original link: https://magazine.sebastianraschka.com/p/building-llms-from-the-ground-up

This is a 3-hour video tutorial presenting a hands-on approach to understanding large language models (LLMs). The video covers implementing, training, and using these models, starting with an introduction to their basics. It provides a step-by-step guide through creating a custom tokenizer class, setting up a model architecture based on existing models such as GPT-2 and Llama, loading pretrained weights, performing instruction finetuning, running benchmark evaluations, and testing conversational performance. The video uses the Python programming language, and all necessary materials can be found in its accompanying GitHub repository and Lightning Studio resources. The workshop also leverages the LitGPT library, which has a separate GitHub repository. This tutorial allows viewers to build their own LLMs from scratch.
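To give a flavor of the tokenizer step, here is a rough sketch of the kind of simple word-level tokenizer class the workshop builds. It is illustrative only and not the code from the video; the class name, the regex-based splitting, and the special tokens are simplifications of my own.

```python
import re

class SimpleTokenizer:
    """A minimal word-level tokenizer: builds a vocabulary from a text,
    then maps between strings and lists of integer token IDs."""

    def __init__(self, text):
        # Split on punctuation and whitespace, keeping punctuation as tokens
        tokens = re.split(r'([,.:;?_!"()\']|--|\s)', text)
        tokens = [t.strip() for t in tokens if t.strip()]
        vocab = sorted(set(tokens)) + ["<|unk|>", "<|endoftext|>"]
        self.str_to_int = {tok: i for i, tok in enumerate(vocab)}
        self.int_to_str = {i: tok for tok, i in self.str_to_int.items()}

    def encode(self, text):
        tokens = re.split(r'([,.:;?_!"()\']|--|\s)', text)
        tokens = [t.strip() for t in tokens if t.strip()]
        # Words not seen during vocabulary construction map to <|unk|>
        return [self.str_to_int.get(t, self.str_to_int["<|unk|>"]) for t in tokens]

    def decode(self, ids):
        text = " ".join(self.int_to_str[i] for i in ids)
        # Remove the spaces inserted before punctuation marks
        return re.sub(r'\s+([,.:;?!"()\'])', r'\1', text)


tokenizer = SimpleTokenizer("Hello, world. This is a test.")
ids = tokenizer.encode("Hello, world. This is unknown.")
print(ids)
print(tokenizer.decode(ids))
```

In practice, GPT-2-style models use a byte-pair-encoding (BPE) subword tokenizer rather than a word-level vocabulary like this one, so a toy class such as the above is mainly a stepping stone for understanding the encode/decode interface.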


If you’d like to spend a few hours this weekend diving into Large Language Models (LLMs) and understanding how they work, I've prepared a 3-hour coding workshop presentation on implementing, training, and using LLMs.

Below, you'll find a table of contents to get an idea of what this video covers (the video itself has clickable chapter marks, allowing you to jump directly to topics of interest); a short illustrative code sketch follows the chapter list:

0:00 – Workshop overview

2:17 – Part 1: Intro to LLMs

9:14 – Workshop materials

10:48 – Part 2: Understanding LLM input data

23:25 – A simple tokenizer class

41:03 – Part 3: Coding an LLM architecture

45:01 – GPT-2 and Llama 2

1:07:11 – Part 4: Pretraining

1:29:37 – Part 5.1: Loading pretrained weights

1:45:12 – Part 5.2: Pretrained weights via LitGPT

1:53:09 – Part 6.1: Instruction finetuning

2:08:21 – Part 6.2: Instruction finetuning via LitGPT

2:26:45 – Part 6.3: Benchmark evaluation

2:36:55 – Part 6.4: Evaluating conversational performance

2:42:40 – Conclusion
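To make Part 3 ("Coding an LLM architecture") a bit more concrete, below is a rough sketch of a single GPT-style transformer block in PyTorch. It is not the workshop's code: the workshop builds components such as multi-head attention from scratch, whereas this sketch leans on PyTorch's built-in nn.MultiheadAttention for brevity, and the default hyperparameters (embedding size 768, 12 heads) are simply chosen to mirror GPT-2 small.

```python
import torch
import torch.nn as nn

class TransformerBlock(nn.Module):
    """One pre-LayerNorm decoder block: masked multi-head self-attention
    followed by a position-wise feed-forward network, each wrapped in a
    residual connection."""

    def __init__(self, emb_dim=768, n_heads=12, context_len=1024, dropout=0.1):
        super().__init__()
        self.norm1 = nn.LayerNorm(emb_dim)
        self.attn = nn.MultiheadAttention(emb_dim, n_heads,
                                          dropout=dropout, batch_first=True)
        self.norm2 = nn.LayerNorm(emb_dim)
        self.ff = nn.Sequential(
            nn.Linear(emb_dim, 4 * emb_dim),
            nn.GELU(),
            nn.Linear(4 * emb_dim, emb_dim),
        )
        # Boolean causal mask: True marks positions that may NOT be attended to,
        # so each token can only look at itself and earlier tokens.
        mask = torch.triu(torch.ones(context_len, context_len, dtype=torch.bool),
                          diagonal=1)
        self.register_buffer("causal_mask", mask)

    def forward(self, x):
        seq_len = x.size(1)
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h,
                                attn_mask=self.causal_mask[:seq_len, :seq_len],
                                need_weights=False)
        x = x + attn_out                  # residual around attention
        x = x + self.ff(self.norm2(x))    # residual around feed-forward
        return x


block = TransformerBlock()
dummy = torch.randn(1, 16, 768)   # (batch, sequence length, embedding dim)
print(block(dummy).shape)         # -> torch.Size([1, 16, 768])
```

A full GPT-style model stacks a number of these blocks (12 in GPT-2 small) between a token-plus-positional embedding layer and a final output projection over the vocabulary.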

It's a slight departure from my usual text-based content, but the last time I did this a few months ago, it was so well-received that I thought it might be nice to do another one!

Happy viewing!

  1. Build an LLM from Scratch book

  2. Build an LLM from Scratch GitHub repository

  3. GitHub repository with workshop code

  4. Lightning Studio for this workshop

  5. LitGPT GitHub repository
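If you would like to try LitGPT before (or while) watching, its README documents a high-level Python API for loading pretrained checkpoints and generating text. The snippet below follows that pattern, but treat the exact class and method names as assumptions on my part; they may differ between LitGPT versions, so check the LitGPT repository for the current interface.

```python
# Assumed usage based on the LitGPT README; verify against the version
# you install (pip install litgpt).
from litgpt import LLM

# Load a pretrained checkpoint by its Hugging Face model ID
# (recent LitGPT versions download it automatically if needed).
llm = LLM.load("microsoft/phi-2")

print(llm.generate("What is instruction finetuning?"))
```

Parts 5.2 and 6.2 of the video cover loading pretrained weights and instruction finetuning via LitGPT.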
