
ML-DL

2025

Transformer architecture variation: RMSNorm 05-11
One for all: the torch.einsum API 04-14
Weight Tying in Language Models: A Technique for Parameter Efficiency 03-11
What is Multi-Head Attention (MHA) 03-04
An Explanation of the Self-Attention Mechanism in the Transformer 03-02
The Flow of GraphRAG 02-12
Reading Notes: "Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer" 02-02

2024

How the KNN Algorithm Works 12-15

2023

LLM inference optimization - KV Cache 10-12
LoRA fine-tuning 09-14
A trick to calculating partial derivatives in machine learning 07-26
Demystifying PyTorch's Strides Format 07-14
How to understand the backpropagation algorithm 04-04
Linear Regression Model Guide - theory part 03-15