This is a 3-hour video tutorial that takes a hands-on approach to understanding large language models (LLMs). It covers implementing, training, and using these models, starting with an introduction to the basics. The step-by-step guide walks through creating a custom tokenizer class, setting up a model architecture based on existing models such as GPT-2 and Llama, loading pretrained weights, performing instruction finetuning, running benchmark evaluations, and testing conversational performance. The code is written in Python, and all necessary materials are available in the accompanying GitHub repository and Lightning Studio. The workshop also uses the LitGPT library, which has a separate GitHub repository. The tutorial shows viewers how to build their own LLM from scratch.
If you'd like to spend a few hours this weekend diving into large language models (LLMs) and understanding how they work, I've prepared a 3-hour coding workshop on implementing, training, and using LLMs.
Below, you'll find a table of contents to get an idea of what this video covers (the video itself has clickable chapter marks, allowing you to jump directly to topics of interest):
0:00 – Workshop overview
2:17 – Part 1: Intro to LLMs
9:14 – Workshop materials
10:48 – Part 2: Understanding LLM input data
23:25 – A simple tokenizer class
41:03 – Part 3: Coding an LLM architecture
45:01 – GPT-2 and Llama 2
1:07:11 – Part 4: Pretraining
1:29:37 – Part 5.1: Loading pretrained weights
1:45:12 – Part 5.2: Pretrained weights via LitGPT
1:53:09 – Part 6.1: Instruction finetuning
2:08:21 – Part 6.2: Instruction finetuning via LitGPT
2:26:45 – Part 6.3: Benchmark evaluation
2:36:55 – Part 6.4: Evaluating conversational performance
2:42:40 – Conclusion
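To give a flavor of the "A simple tokenizer class" chapter, here is a minimal sketch of a word-level tokenizer. It is not the code from the workshop: the class name `SimpleTokenizer`, the splitting regex, and the `<unk>` fallback token are all assumptions chosen for illustration. The idea is simply to map text to integer IDs and back.

```python
import re


class SimpleTokenizer:
    """Toy word-level tokenizer (illustrative sketch, not the workshop's code)."""

    def __init__(self, text):
        # Build the vocabulary from the training text, plus a hypothetical
        # "<unk>" token for words that were never seen.
        vocab = sorted(set(self._split(text)) | {"<unk>"})
        self.str_to_int = {tok: i for i, tok in enumerate(vocab)}
        self.int_to_str = {i: tok for tok, i in self.str_to_int.items()}

    @staticmethod
    def _split(text):
        # Split on whitespace and keep punctuation marks as separate tokens.
        pieces = re.split(r'([,.:;?_!"()\']|--|\s)', text)
        return [p.strip() for p in pieces if p.strip()]

    def encode(self, text):
        # Unknown words fall back to the "<unk>" ID.
        unk = self.str_to_int["<unk>"]
        return [self.str_to_int.get(tok, unk) for tok in self._split(text)]

    def decode(self, ids):
        text = " ".join(self.int_to_str[i] for i in ids)
        # Remove the space that the join inserted before punctuation.
        return re.sub(r'\s+([,.:;?!"()\'])', r"\1", text)


tokenizer = SimpleTokenizer("Hello, world. Is this a test?")
ids = tokenizer.encode("Hello, world.")
print(ids)
print(tokenizer.decode(ids))
```

Production LLMs such as GPT-2 and Llama 2 use subword tokenizers rather than a word-level scheme like this one, so treat the sketch only as a way to see the encode/decode round trip in a few lines.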
It's a slight departure from my usual text-based content, but the last time I did this a few months ago, it was so well-received that I thought it might be nice to do another one!
Happy viewing!
Build an LLM from Scratch book
Build an LLM from Scratch GitHub repository
GitHub repository with workshop code
Lightning Studio for this workshop
LitGPT GitHub repository