[AI] Trivedi Anand - Building LLMs with PyTorch [2025, PDF/EPUB, ENG]
Torrent statistics: Size: 35.59 MB | Registered: 6 months 4 days ago | Downloaded: 317 times | Multitracker seeding active | Full source absent: never
Author: MAGNAT ®
Building LLMs with PyTorch: A step-by-step guide to building advanced AI models with PyTorch
Year: 2025
Author: Trivedi Anand
Publisher: BPB Publications
ISBN: 978-93-65898-255
Language: English
Format: PDF/EPUB
Quality: Publisher's layout (eBook)
Interactive table of contents: Yes
Number of pages: 589

Description:
PyTorch has become the go-to framework for building cutting-edge large language models (LLMs), enabling developers to harness the power of deep learning for natural language processing. This book serves as your practical guide to navigating the intricacies of PyTorch, empowering you to create your own LLMs from the ground up.

You will begin by mastering PyTorch fundamentals, including tensors, autograd, and model creation, before diving into core neural network concepts such as gradients, loss functions, and backpropagation. Progressing through regression and image classification with convolutional neural networks, you will then explore advanced image processing through object detection and segmentation. The book transitions into NLP, covering RNNs, LSTMs, and attention mechanisms, culminating in the construction of Transformer-based LLMs, including a practical mini-GPT project. You will also gain a strong understanding of generative models such as VAEs and GANs.

A single idea drove this book: how can anyone who wants to start their journey into AI begin? How can someone understand the complex concepts of LLMs, generative AI, and diffusion models? How can those who want to modify existing AI programs and models, or even build new models from scratch, get started? By the end of this book, you will possess the technical proficiency to build, train, and deploy sophisticated LLMs using PyTorch, equipping you to contribute to the rapidly evolving landscape of AI.

What you will learn:
- Build and train PyTorch models for linear and logistic regression (a minimal sketch follows this list).
- Configure PyTorch environments and utilize GPU acceleration with CUDA.
- Construct CNNs for image classification and apply transfer learning techniques.
- Master PyTorch tensors and autograd, and build fundamental neural networks.
- Utilize SSD and YOLO for object detection and perform image segmentation.
- Develop RNNs and LSTMs for sequence modeling and text generation.
- Implement attention mechanisms and build Transformer-based language models.
- Create generative models using VAEs and GANs for diverse applications.
- Build and deploy your own mini-GPT language model, applying the acquired skills.

Who this book is for:
Software engineers, AI researchers, architects seeking AI insights, and professionals in finance, medicine, engineering, and mathematics will find this book a comprehensive starting point, regardless of prior deep learning expertise.
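As a taste of the book's starting point, here is a minimal sketch of the kind of exercise the early chapters describe: fitting a linear regression with tensors, autograd, zero_grad(), and an explicit training loop. It is written for this post against stock PyTorch, not taken from the book; the toy data and hyperparameters are arbitrary.

    import torch

    # Synthetic data: y = 2x + 1 plus a little noise
    torch.manual_seed(0)
    x = torch.linspace(0, 1, 100).unsqueeze(1)
    y = 2 * x + 1 + 0.05 * torch.randn_like(x)

    model = torch.nn.Linear(1, 1)          # one weight, one bias
    loss_fn = torch.nn.MSELoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

    for epoch in range(200):
        optimizer.zero_grad()              # clear gradients from the previous step
        loss = loss_fn(model(x), y)        # forward pass and loss
        loss.backward()                    # autograd computes gradients
        optimizer.step()                   # gradient descent update

    print(model.weight.item(), model.bias.item())

Running it prints a weight near 2.0 and a bias near 1.0, recovering the line the data was generated from.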
Table of contents:
1. Introduction to Deep Learning
   Introduction; Structure; Objectives; Applications and benefits of industrial deep learning in modern industries; Why PyTorch, and not TensorFlow; My experience with PyTorch; Understanding deep learning; Learning about the artificial neuron; Examples of neural networks in action; Learning how neurons work; Decoding the mysterious black box; Arrangement of neurons makes the difference; Understanding the learning process of neural networks; Deep learning model lifecycle; Running all the code; Bare bones of machine learning example; Conclusion
2. Nuts and Bolts of AI with PyTorch
   Introduction; Structure; Objectives; Neural network and neurons; Tensor simplified; Running a pre-trained PyTorch model; Introduction to PyTorch modules with examples; Linear regression with PyTorch; pytorch.nn module; Significance of using zero_grad(); Understanding gradients; Getting to the lowest point in hilly terrain with a blindfold; Striving to find the absolute lowest points; Error, loss and cost; Cost versus loss; Learning rate and global minima; Learning rate and batch size; Improving the linear neural network; Importance of incorporating non-linearity in deep learning models; Importance of non-linearity; Simple classification as an example of non-linearity; Conclusion; Points to remember
3. Introduction to Convolution Neural Network
   Introduction; Structure; Objectives; Need of deep networks for images; Computer vision without neural networks; Computer vision with neural networks; Visualizing a convolution; Convolution; Filters; Choice of filters in CNN; Convolution to pooling; Flattening; Passing data to dense layers; PyTorch torchvision module; torchvision.transforms; PyTorch datasets and dataloaders; Combining components to create a complete CNN network; Convolutional layers; Activation functions; Max-pooling layers; Flatten layer; Fully connected layers; Choice of parameters; CNN model on custom datasets; Usage of CNN in enterprises; Visualizing CNN internal layers; Conclusion
4. Model Building with Custom Layers and PyTorch 2.0
   Introduction; Structure; Objectives; Designing and developing models with PyTorch; NN module in PyTorch; Step-by-step model creation; Exploring linear layers in PyTorch; Convolutional layers available in PyTorch; Other important layers; Constructing neural networks in PyTorch; Simplicity of nn.Sequential; Subclassing nn.Module; Iterative designs with nn.ModuleList; Organizing with nn.ModuleDict; Combining methods: Hybrid approaches; Nested models for modularity; Preparing for deployment: TorchScript; Nested models for modularity in PyTorch; Famous neural network architectures using nested models and modularity; Residual Networks; Inception; Crafting bespoke layers and activation functions in PyTorch; Creating custom layers in PyTorch; Using the custom layer; Weight initialization; Custom activation function; Pretrained models and PyTorch Hub; PyTorch Hub; Saving, exporting, and understanding model methods with examples; Conclusion
5. Advances in Computer Vision: Transfer Learning and Object Detection
   Introduction; Structure; Objectives; Necessity of transfer learning; Transfer learning; Training later layers; Hierarchical feature learning in neural networks; Reasons for training only the last layers; Example; Brain tumor image classification; Object detection; Visualizing feature maps in a pre-trained Faster R-CNN model; Methods of object detection; Fast R-CNN; Faster R-CNN; Conclusion
6. Advanced Object Detection and Segmentation
   Introduction; Structure; Objectives; Drawbacks of Faster R-CNN; Exploring SSD, YOLO, and transformer-based solutions; Single Shot MultiBox Detector for object detection; Dive deeper into SSD; You Only Look Once; What makes YOLO so fast; YOLO updates; YOLO versus SSD; Practical example for YOLO; Training YOLOv5 on a custom dataset; Image segmentation; Image segmentation implementation; Conclusion
7. Mastering Object Detection with Detectron2
   Introduction; Structure; Objectives; Versatility of Detectron2; Detectron2 architecture; Setting up Detectron2 in Google Colab; Exploring the Detectron2 model zoo; Object detection on a custom dataset; Tools for annotating datasets in object detection and image segmentation; Implementing image segmentation; Advanced human pose estimation with Detectron2; DensePose; Demonstrating dense pose estimation; Project: Yoga pose estimation; Conclusion
8. Introduction to RNNs and LSTMs
   Introduction; Structure; Objectives; Emergence of specialized neural networks for sequences; Handling long sequences; Recurrent neural networks; RNN vs feed-forward neural networks; Example: Language translation; Limitations of RNNs; Working of RNN; PyTorch components for RNNs and LSTM networks; Understanding RNN cell and RNN layer; Formula for hidden state update in a basic RNN cell; Example calculation; RNN layer; Predicting the next even number in a sequence; Understanding the challenges with RNNs; Why these problems occur; Vanishing and exploding gradient; Vanishing gradient; Exploding gradient; Context vector; Working example: Language modeling; LSTM deep dive; LSTM memory cell; Internal working of LSTM; Basic architecture; Working of LSTM; LSTM implementation: stock price prediction; Downloading and visualizing data; Visualizing the last two years' data for Reliance; LSTM for stock prediction; Conclusion
9. Understanding Text Processing and Generation in Machine Learning
   Introduction; Structure; Objectives; LSTM models; Word embeddings; One-hot encoding; Example; Problems with one-hot encoding; Introduction to word embeddings; How word embeddings capture semantic information; Creating word embeddings from scratch; PyTorch embedding layer; Initializing the embedding layer; Using the embedding layer; Example: Embedding layer in a neural network; Tips and considerations; Comparing two sentences using word embeddings in PyTorch; Using predefined word embeddings; LSTM application in text generation; Advantages of practical LSTM-based text generation; Pride and Prejudice by Jane Austen as dataset; Implementation; Stacked LSTM; Implementing stacked LSTM in PyTorch; Sequence to sequence models; Variable input and output lengths; Overview of Seq2Seq models; Encoder; Attention mechanism; Decoder; Transformer model; Applications and evolution; Language translation; Image captioning; Implementing image captioning; EncoderCNN class; DecoderRNN class; CNNtoRNN class; Attention is all you need; Without attention mechanism; With attention mechanism; Attention technical overview; Components of attention; Evolution from attention to transformers; Using attention with LSTM; Using attention with LSTMs; Going beyond with transformers; Attention example; Integrating an attention mechanism with LSTM for sequence classification; Use case; Types of attention mechanisms; Why do we need different attention mechanisms?; Key types of attention mechanisms; Conclusion
10. Transformers Unleashed
   Introduction; Structure; Objectives; NLP revolution with attention; Power of attention mechanisms; Understanding the difference between attention and transformers; Transformers: A complete model using attention; Transformer architecture essentials; Understanding transformers through examples; Challenge; How transformers calculate attention; Head in multi-head attention; This process is for one head; Example: Explaining the encoder process; Positional encodings; Working; Why it works; Complete encoder working; Post multi-head attention processing in transformer encoders; Decoder in transformer; Input to the decoder; Start with a special token; Masked multi-head attention; Add and Norm; In the context of transformers; Feed-forward; Add and Norm; Linear layer and softmax; Step-by-step translation from English to French with a transformer model; Input from the encoder; Begin with a start token; Self-attention on generated tokens; Cross-attention with encoder outputs; Predict the first word; Feed the generated word back into the decoder; Cross-attention with encoder outputs; Predict the next word; Repeat until end token; Implementing language translation using a transformer; Simple example using nn.Transformer; Downloading datasets; Tokenization and setting the vocabulary; Adding positional encoding; Token embedding; Seq2Seq transformer; Masking; Sentence representation; Adding padding; Creating masks; Square subsequent mask for the target sequence; Application of masks in the attention mechanism; Final result of masking and its effect; Utility function for text; Example with specific values; Training; Greedy decode; Greedy decoding example; Advanced decoding strategies; Translating; Deep dive into architectures and applications; Vision transformers; Traditional approach with CNNs; Better understanding of how vision transformers work; Implementing a vision transformer from scratch; Vision transformer architecture; Step 1; Step 2; Step 3; Step 4; Step 5; Future directions; Conclusion
11. Introduction to GANs: Building Blocks of Generative Models
   Introduction; Structure; Objectives; Generative AI, the AI artist; Discriminative versus generative models; Working of generative AI; Working of GANs via a deepfake example; How transformers are different; Transformer model diagram; Practical example of GANs; Loading the dataset; Generating new images; What GANs can do; Step-by-step guide to generating anime faces with PyTorch; Architecture; Generator; Discriminator; Training code; Train the discriminator; Train the generator; Conclusion
12. Conditional GANs, Latent Spaces, and Diffusion Models
   Introduction; Structure; Objectives; Convolutional GAN; Using convolutional neural networks; Convolutional filters; Convolution operation; Importance of transposed convolutions in upsampling; Importance of convolutions in downsampling; Training process of a convolutional GAN; Convolutional GANs with the CelebA dataset; CelebA dataset; Improvements in the architecture compared to normal GANs; Explanation of the practical example; Generator class; Defining the network layers; Main training loop; Understanding the training process; Latent space in GAN; Latent space in GANs using a pretrained model; Conditional GANs; Use of conditional GANs; Practical implementation of CGAN; Forward method, generating an image; Forward method, evaluating the image; Output after training; Diffusion models; Types of diffusion models; Applications of diffusion models; Example; Train a diffusion model from scratch; U-Net model for better prediction; Working of U-Net; U-Net model implementation; Conclusion
13. PyTorch 2.0: New Features, Efficient CUDA Usage, and Accelerated Model Training
   Introduction; Structure; Objectives; Introduction to PyTorch 2.0; PyTorch 2.0 and CUDA 11.8; Installing or upgrading to PyTorch 2.0 with CUDA 11.8; On Google Colab; On a local environment; PyTorch 2.0 comparison with previous versions; Mixed precision training; Simplified precision in PyTorch 2.0; Asynchronous CUDA execution; Accelerating training; Leveraging modern GPU capabilities to enhance model performance; Mixed precision training; TorchScript; Using TorchScript; Taking advantage of new kernel libraries; Leveraging new kernel libraries in PyTorch 2.0; Distributed training using PyTorch 2.0/1.x; Advantages of distributed training; Implementation specifics in PyTorch 2.0; Overview of distributed training implementation; Conclusion
14. Building Large Language Models from Scratch
   Introduction; Structure; Objectives; Building a GPT-like model from scratch; Understanding GPT models; Visualizing model growth; Understanding model parameters in machine learning; Comparing GPT and the original transformer architecture; Understanding language modeling; From basic to large language models; 'Large' in large language models; Pre-training and fine-tuning; Pre-training large language models; Pre-training ChatGPT; Role of pre-training; Instruction pre-training; Finetuning; LLM pre-training; Fine-tuning; Fine-tuning methods; Building an LLM from scratch; Multi-head attention and the block in GPT; Working of multi-head attention; Transformer block; GPT model; Size of GPT model; Building large models like GPT-3 and LLaMA; Conclusion
Index
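Chapter 14's mini-GPT project revolves around the transformer block (multi-head attention plus a feed-forward network). Here is a minimal sketch of such a GPT-style block in stock PyTorch, written for this post rather than taken from the book; the dimensions are arbitrary, and nn.MultiheadAttention stands in for the hand-rolled attention the chapter builds.

    import torch
    import torch.nn as nn

    class GPTBlock(nn.Module):
        """One GPT-style transformer block: pre-norm multi-head
        self-attention followed by a position-wise feed-forward network,
        each wrapped in a residual connection."""
        def __init__(self, d_model=128, n_heads=4):
            super().__init__()
            self.ln1 = nn.LayerNorm(d_model)
            self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
            self.ln2 = nn.LayerNorm(d_model)
            self.ff = nn.Sequential(
                nn.Linear(d_model, 4 * d_model),
                nn.GELU(),
                nn.Linear(4 * d_model, d_model),
            )

        def forward(self, x):
            # Causal mask: True entries are blocked, so each position
            # attends only to itself and earlier positions.
            t = x.size(1)
            mask = torch.triu(torch.ones(t, t, dtype=torch.bool, device=x.device), diagonal=1)
            h = self.ln1(x)
            attn_out, _ = self.attn(h, h, h, attn_mask=mask)
            x = x + attn_out               # residual connection around attention
            x = x + self.ff(self.ln2(x))   # residual connection around feed-forward
            return x

    # Smoke test: a batch of 2 sequences, 16 tokens, 128-dim embeddings.
    block = GPTBlock()
    out = block(torch.randn(2, 16, 128))
    print(out.shape)  # torch.Size([2, 16, 128])

Stacking several such blocks on top of token and positional embeddings, and adding a final linear layer over the vocabulary, gives the kind of mini-GPT the chapter describes.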