Anshul's Notes
About
  • Aug 15, 2024 Quantizing the Large Language Model - GPTQ
    Notes on Quantization of DNNs | Pre-req for GPTQ
  • Aug 14, 2024 Optimal Brain Quantizer - Pruning and Quantizing the Neural Nets
    Notes on Quantization of DNNs | Pre-req for GPTQ
  • Jul 13, 2024 Training Models Across Multiple Machines - The Federated Learning Way
    Notes on Distributed Training
  • Jun 7, 2024 How to train your CUDA? Going deep into GPU Architecture!
    The overall Architecture of the GPU and some bottlenecks!
  • May 13, 2024 Everything Everywhere at once - A Distributed Training Saga...
    my notes on distributed data parallelism (DDP)
  • Apr 1, 2024 2.0 - Language Models as Operating System
    These are my notes exploring the use case of Large Language Models as an Operating System of this decade. P
  • Jan 7, 2024 How to Program Your GPUs? Learning CUDA...
    how to operate the parallel chips if I ever become GPU rich...
  • Jan 7, 2024 State Space Models - Discretization and Convolution
    Notes on State Space Models and the Mamba Architecture
  • anshulsc

I dump my notes and thoughts here.