Show HN: Learn LLMs LeetCode Style

Original link: https://github.com/Exorust/TorchLeet

TorchLeet provides PyTorch practice problems organized into a "Question Set" and an "LLM Set" to strengthen deep-learning skills. The Question Set ranges from beginner to advanced topics such as tensors, autograd, CNNs, and GANs. The LLM Set focuses on implementing large language models from scratch, covering attention mechanisms, embeddings, and advanced LLM techniques such as quantization and reinforcement learning. Each problem provides an incomplete code block with "#TODO" comments for hands-on practice, making it well suited to guided learning; solutions are also provided for comparison. The project keeps problems in a "questions/" directory and solutions in "solutions/". Users are encouraged to contribute by adding new, well-documented problems or improving existing ones while following the established project structure.

This Hacker News thread discusses "Learn LLMs LeetCode Style", a GitHub project that aims to teach LLM concepts through a LeetCode-like problem-solving approach. Users praise the idea but note differences from LeetCode: the problems are more open-ended, which can be both a pro and a con. One suggestion is to add reproducible data-generation functions and clear evaluation metrics to each exercise to ensure code quality. The author, Exorust, acknowledges that GPT was used in generating the project and plans to add a disclosure. This sparked a debate about the appropriateness of using LLMs to create learning resources, alongside advice against using them to solve the problems. Other suggestions include publishing the prompts used to generate the questions, for transparency. Some users also asked for recommendations on learning lower-level ML tooling such as PyTorch and CUDA.

Original text

TorchLeet is broken into two sets of questions:

  1. Question Set: A collection of PyTorch practice problems, ranging from basic to hard, designed to enhance your skills in deep learning and PyTorch.
  2. LLM Set: A new set of questions focused on understanding and implementing Large Language Models (LLMs) from scratch, including attention mechanisms, embeddings, and more.

Note

Avoid using GPT. Try to solve these problems on your own. The goal is to learn and understand PyTorch concepts deeply.

Mostly for beginners to get started with PyTorch.

  1. Implement linear regression (Solution)
  2. Write a custom Dataset and Dataloader to load from a CSV file (Solution)
  3. Write a custom activation function (Simple) (Solution)
  4. Implement Custom Loss Function (Huber Loss) (Solution)
  5. Implement a Deep Neural Network (Solution)
  6. Visualize Training Progress with TensorBoard in PyTorch (Solution)
  7. Save and Load Your PyTorch Model (Solution)
  8. Implement Softmax function from scratch
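As a taste of the beginner set, problem 8 (softmax from scratch) can be sketched in plain Python; a real solution would operate on PyTorch tensors, but the numerics are identical. Subtracting the maximum before exponentiating is the standard trick for numerical stability:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)                        # subtract the max to avoid overflow in exp
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([1.0, 2.0, 3.0])
print(probs)   # probabilities sum to 1; the largest logit gets the largest mass
```

Without the max subtraction, `softmax([1000.0, 1000.0])` would overflow in `math.exp`; with it, it cleanly returns `[0.5, 0.5]`.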

Recommended for those who have a basic understanding of PyTorch and want to practice their skills.

  1. Implement a CNN on CIFAR-10 (Solution)
  2. Implement an RNN from Scratch (Solution)
  3. Use torchvision.transforms to apply data augmentation (Solution)
  4. Add a benchmark to your PyTorch code (Solution)
  5. Train an autoencoder for anomaly detection (Solution)
  6. Quantize your language model (Solution)
  7. Implement Mixed Precision Training using torch.cuda.amp (Solution)
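Problem 4 in this set ("Add a benchmark to your PyTorch code") can be approached with a small timing helper. This is a framework-agnostic sketch using only the standard library; for GPU code you would additionally need to call `torch.cuda.synchronize()` before reading the clock, since CUDA kernels launch asynchronously:

```python
import time
from functools import wraps

def benchmark(n_runs=5):
    """Decorator: run a function n_runs times and report the best wall-clock time."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            best = float("inf")
            result = None
            for _ in range(n_runs):
                start = time.perf_counter()
                result = fn(*args, **kwargs)
                best = min(best, time.perf_counter() - start)
            print(f"{fn.__name__}: best of {n_runs} runs = {best * 1e3:.3f} ms")
            return result
        return wrapper
    return decorator

@benchmark(n_runs=3)
def matmul_naive(a, b):
    # toy workload: naive matrix multiply on nested lists
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

out = matmul_naive([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

Reporting the best of several runs (rather than the mean) filters out warm-up and scheduler noise, which is why `timeit` does the same.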

These problems are designed to challenge your understanding of PyTorch and deep learning concepts. They require you to implement things from scratch or apply advanced techniques.

  1. Implement parameter initialization for a CNN (Solution)
  2. Implement a CNN from Scratch
  3. Implement an LSTM from Scratch (Solution)
  4. Implement AlexNet from scratch
  5. Build a Dense Retrieval System using PyTorch
  6. Implement KNN from scratch in PyTorch
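Problem 6 (KNN from scratch) reduces to computing pairwise distances and taking a majority vote among the k closest points. A plain-Python sketch; a PyTorch version would replace the loop with `torch.cdist` and `torch.topk`:

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    dists = [
        (math.dist(x, query), y)       # Euclidean distance to each training point
        for x, y in zip(train_x, train_y)
    ]
    dists.sort(key=lambda d: d[0])     # k smallest distances first
    votes = Counter(y for _, y in dists[:k])
    return votes.most_common(1)[0][0]

train_x = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.2, 4.9)]
train_y = ["a", "a", "b", "b"]
print(knn_predict(train_x, train_y, (0.2, 0.1)))   # → "a"
```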

These problems are for advanced users who want to push their PyTorch skills to the limit. They involve complex architectures, custom layers, and advanced techniques.

  1. Write a custom Autograd function for activation (SILU) (Solution)
  2. Write a Neural Style Transfer
  3. Build a Graph Neural Network (GNN) from scratch
  4. Build a Graph Convolutional Network (GCN) from scratch
  5. Write a Transformer (Solution)
  6. Write a GAN (Solution)
  7. Write Sequence-to-Sequence with Attention (Solution)
  8. Enable distributed training in PyTorch (DistributedDataParallel)
  9. Work with Sparse Tensors
  10. Add Grad-CAM/SHAP to explain the model (Solution)
  11. Linear Probe on CLIP Features
  12. Add Cross Modal Embedding Visualization to CLIP (t-SNE/UMAP)
  13. Implement a Vision Transformer
  14. Implement a Variational Autoencoder
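Problem 1 in this set asks for a custom autograd function for SiLU. The core of any custom `torch.autograd.Function` is the forward value and the analytic backward; both can be checked in plain Python against a central finite difference before wiring them into PyTorch. With s(x) = sigmoid(x), SiLU(x) = x·s(x) and SiLU′(x) = s(x)·(1 + x·(1 − s(x))):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def silu(x):
    """SiLU (a.k.a. swish): x * sigmoid(x) — what forward() would compute."""
    return x * sigmoid(x)

def silu_grad(x):
    """Analytic derivative — what backward() would multiply grad_output by."""
    s = sigmoid(x)
    return s * (1.0 + x * (1.0 - s))

# sanity-check the analytic gradient against central finite differences
for x in (-2.0, -0.5, 0.0, 1.0, 3.0):
    h = 1e-6
    numeric = (silu(x + h) - silu(x - h)) / (2 * h)
    assert abs(numeric - silu_grad(x)) < 1e-5
```

This is exactly the check `torch.autograd.gradcheck` automates for a real `Function` subclass.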

An all new set of questions to help you understand and implement Large Language Models from scratch.

Each question is designed to take you one step closer to building your own LLM.

  1. Implement KL Divergence Loss
  2. Implement RMS Norm
  3. Implement Byte Pair Encoding from Scratch (Solution)
  4. Create a RAG Search of Embeddings from a set of Reviews
  5. Implement Predictive Prefill with Speculative Decoding
  6. Implement Attention from Scratch (Solution)
  7. Implement Multi-Head Attention from Scratch (Solution)
  8. Implement Grouped Query Attention from Scratch (Solution)
  9. Implement KV Cache in Multi-Head Attention from Scratch
  10. Implement Sinusoidal Embeddings (Solution)
  11. Implement RoPE Embeddings (Solution)
  12. Implement SmolLM from Scratch (Solution)
  13. Implement Quantization of Models
    1. GPTQ
  14. Implement Beam Search atop LLM for decoding
  15. Implement Top K Sampling atop LLM for decoding
  16. Implement Top p Sampling atop LLM for decoding
  17. Implement Temperature Sampling atop LLM for decoding
  18. Implement LoRA on a layer of an LLM
    1. QLoRA
  19. Mix two models to create a mixture of Experts
  20. Apply SFT on SmolLM
  21. Apply RLHF on SmolLM
  22. Implement DPO based RLHF
  23. Add continuous batching to your LLM
  24. Chunk Textual Data for Dense Passage Retrieval
  25. Implement Large scale Training => 5D Parallelism
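From the LLM set, RMSNorm (problem 2) is one of the smallest building blocks: divide the activations by their root-mean-square and rescale with a learned per-feature gain. A pure-Python sketch over a single vector; a PyTorch version would vectorize this over the last dimension of a tensor, with `gain` as an `nn.Parameter`:

```python
import math

def rms_norm(xs, gain=None, eps=1e-8):
    """RMSNorm: x_i / sqrt(mean(x^2) + eps) * g_i.
    Unlike LayerNorm, there is no mean-centering and no bias."""
    if gain is None:
        gain = [1.0] * len(xs)
    rms = math.sqrt(sum(x * x for x in xs) / len(xs) + eps)
    return [g * x / rms for g, x in zip(gain, xs)]

out = rms_norm([3.0, 4.0])   # rms = sqrt((9 + 16) / 2) ≈ 3.5355
print(out)
```

After normalization (with unit gain) the output vector itself has RMS ≈ 1, which is the invariant the layer exists to enforce.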

What's cool? 🚀

  • Diverse Questions: Covers beginner to advanced PyTorch concepts (e.g., tensors, autograd, CNNs, GANs, and more).
  • Guided Learning: Includes incomplete code blocks (... and #TODO) for hands-on practice, along with answers.

Folder structure

  • <E/M/H><ID>/: Easy/Medium/Hard plus the question ID.
  • <E/M/H><ID>/qname.ipynb: The question file with incomplete code blocks.
  • <E/M/H><ID>/qname_SOLN.ipynb: The corresponding solution file.

How to use

  1. Navigate to questions/ and pick a problem.
  2. Fill in the missing code blocks (...) and address the #TODO comments.
  3. Test your solution and compare it with the corresponding file in solutions/.
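For illustration, a question's incomplete block and a filled-in answer might look like the following. This is a hypothetical plain-Python rendering of the Huber-loss problem from the beginner set; the actual notebooks work on PyTorch tensors:

```python
# --- roughly as it appears in the question notebook ---
# def huber_loss(pred, target, delta=1.0):
#     # TODO: return 0.5*err**2 when |err| <= delta,
#     #       else delta*(|err| - 0.5*delta)
#     ...

# --- one possible completed solution ---
def huber_loss(pred, target, delta=1.0):
    err = abs(pred - target)
    if err <= delta:
        return 0.5 * err ** 2             # quadratic region near zero
    return delta * (err - 0.5 * delta)    # linear region for large errors

print(huber_loss(2.0, 2.5))   # small error, quadratic: 0.5 * 0.5**2 = 0.125
print(huber_loss(0.0, 3.0))   # large error, linear: 1.0 * (3.0 - 0.5) = 2.5
```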

Happy Learning! 🚀

Feel free to contribute by adding new questions or improving existing ones. Ensure that new problems are well-documented and follow the project structure. Submit a PR and tag the authors.

Stargazers over time
