A trick to calculating partial derivatives in machine learning

Like me, you may have had trouble calculating partial derivatives in machine learning. Even though I found a good reference cookbook for deriving gradients, I still got confused. Today, I want to share a practical technique I recently learned from this video: when calculating partial derivatives in machine learning, you can treat everything as if it were a scalar and then make the shapes match.
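
As a rough illustration of the idea, here is a minimal sketch in plain Python. The setup is an assumption made for this example, not something taken from the video: a linear map $Y = XW$ with a toy loss $L$ equal to the sum of the entries of $Y$. Treating everything as a scalar suggests $\partial L / \partial W = X \cdot \partial L / \partial Y$; the only way to make the shapes match is $X^\top \, \partial L / \partial Y$. All helper names are hypothetical, and a finite-difference check is included only to confirm the result.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Swap the rows and columns of a matrix."""
    return [list(col) for col in zip(*a)]


def loss(x, w):
    """Toy scalar loss (an assumption): the sum of every entry of Y = X W."""
    return sum(sum(row) for row in matmul(x, w))


def grad_w_by_shape_matching(x, w):
    # Scalar-style guess: dL/dW = x * dL/dY.
    # Shapes: X is (n, d), dL/dY is (n, k), and dL/dW must be (d, k) like W,
    # so the only product that fits is X^T (d, n) times dL/dY (n, k).
    n, k = len(x), len(w[0])
    d_y = [[1.0] * k for _ in range(n)]  # dL/dY is all ones for a sum loss
    return matmul(transpose(x), d_y)


def grad_w_numeric(x, w, eps=1e-6):
    """Central finite differences, used only to check the analytic result."""
    grad = [[0.0] * len(w[0]) for _ in w]
    for i in range(len(w)):
        for j in range(len(w[0])):
            w_plus = [row[:] for row in w]
            w_minus = [row[:] for row in w]
            w_plus[i][j] += eps
            w_minus[i][j] -= eps
            grad[i][j] = (loss(x, w_plus) - loss(x, w_minus)) / (2 * eps)
    return grad


X = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]    # shape (2, 3)
W = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]  # shape (3, 2)
analytic = grad_w_by_shape_matching(X, W)  # shape (3, 2), same as W
numeric = grad_w_numeric(X, W)
```

The gradient has the same shape as `W`, which is the sanity check the trick relies on.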

Demystifying Pytorch's Strides Format

Even though I have been using NumPy and PyTorch for a long time, I never really knew how they implemented the underlying tensors or why they are so efficient. Recently, while taking the course Deep Learning Systems, I finally got the opportunity to implement tensors on my own. After going through the process, I understand tensors much better 🧐

As a PyTorch user, is it necessary to understand the underlying tensor storage mechanism? I believe it is essential. In most cases, understanding the underlying principles helps you grasp higher-level concepts better. For example, understanding the tensor storage mechanism can help you answer the following questions:

How to memorize the Red-black tree

If the title of this post caught your attention, you probably agree with me: memorizing the insertion and deletion operations of the red-black tree can be incredibly arduous. It means keeping track of complex tree rotations and recoloring nodes as required. I once read the renowned Introduction to Algorithms (CLRS). However, there were so many cases to remember that I quickly got overwhelmed.

Git bundle guide

git bundle is a relatively uncommon Git command. Its purpose is to package a Git repo into a single file, which others can then use to recreate the original repo. Additionally, git bundle supports incremental updates. Before I learned about the git bundle command, I would usually just use tar czf some_git_repo to package a Git repo. Recently, I accidentally discovered git bundle and found it quite useful🍻.

Understanding GAT through MPNN

Justin Gilmer et al. proposed the MPNN (Message Passing Neural Network) framework [1] for describing graph neural network models used in supervised learning on graphs. I found it to be a useful framework that gives a clear picture of how different GNN models work and makes it easy to see the differences between them. Consider a node $v$ in the graph $G$; the update procedure for its vector representation $h_v$ is as follows:

SICP Exercise 2.27

Modify your reverse procedure of exercise 2.18 to produce a deep-reverse procedure that takes a list as an argument and returns as its value the list with its elements reversed and with all sublists deep-reversed as well.
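
The exercise asks for a Scheme procedure; as a rough sketch of the same idea in Python (an assumption made for illustration, with nested Python lists standing in for Scheme lists), the recursion could look like this:

```python
def deep_reverse(items):
    """Return a new list with its elements reversed, recursing into sublists."""
    if not isinstance(items, list):
        return items  # a leaf value: nothing to reverse
    # Reverse this level, and deep-reverse every element along the way.
    return [deep_reverse(item) for item in reversed(items)]


nested = [[1, 2], [3, 4]]
result = deep_reverse(nested)  # [[4, 3], [2, 1]]
```

The input list is left unmodified, since every level is rebuilt as a new list.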

SICP Exercise 1.46

Several of the numerical methods described in this chapter are instances of an extremely general computational strategy known as iterative improvement. Iterative improvement says that, to compute something, we start with an initial guess for the answer, test if the guess is good enough, and otherwise improve the guess and continue the process using the improved guess as the new guess. Write a procedure iterative-improve that takes two procedures as arguments: a method for telling whether a guess is good enough and a method for improving a guess. Iterative-improve should return as its value a procedure that takes a guess as argument and keeps improving the guess until it is good enough. Rewrite the sqrt procedure of section 1.1.7 and the fixed-point procedure of section 1.3.3 in terms of iterative-improve.
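
The exercise is stated for Scheme; the following is a minimal Python sketch of the same structure, not the book's solution. The absolute tolerances and the initial guess of 1.0 are assumptions chosen for illustration, and this `fixed_point` returns the current guess rather than the next one.

```python
import math


def iterative_improve(good_enough, improve):
    """Return a procedure that improves a guess until it is good enough."""
    def iterate(guess):
        while not good_enough(guess):
            guess = improve(guess)
        return guess
    return iterate


def sqrt(x):
    # Newton's method for square roots; the 1e-9 tolerance is an assumption.
    good_enough = lambda guess: abs(guess * guess - x) < 1e-9
    improve = lambda guess: (guess + x / guess) / 2
    return iterative_improve(good_enough, improve)(1.0)


def fixed_point(f, first_guess, tolerance=1e-9):
    # A guess is good enough when applying f barely changes it.
    good_enough = lambda guess: abs(f(guess) - guess) < tolerance
    return iterative_improve(good_enough, f)(first_guess)


root_two = sqrt(2.0)
cos_fixed = fixed_point(math.cos, 1.0)
```

Both procedures share the same loop and only differ in the two functions they pass in, which is the point of the exercise.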

Solution of Proj4. Scheme Interpreter of CS61A (2021-Fall)

Recently, I have been reading Crafting Interpreters by Robert Nystrom. In the book, a tree-walk interpreter, jlox, is implemented in Java, and I am trying to rewrite it in Python as pylox. I highly recommend it👍. While working on this, I suddenly remembered that there were a few small issues with the Scheme interpreter for CS61A that I had not resolved after finishing it a year ago, which left it in an unfinished state. So today I opened the project, intending to run through it from beginning to end and talk about the ideas.

Solving DP problems by SRTBOT Framework

When solving algorithm problems, what often gives me a headache are dynamic programming problems (DP problems). They are the type of problem that I can't figure out on my own even after thinking for a long time, but once I see the answer, it suddenly becomes clear and reasonable. However, the next time I encounter a similar problem, I may have forgotten how to solve it. I have also read many people's solutions and tried to digest and apply their ideas, but I could not find a particularly good framework that works for all dynamic programming problems. It seems that everyone has their own way of solving dynamic programming problems, and when I try to apply their methods to new problems, I always run into difficulties. Things started to change after I took MIT 6.006. The instructor presented six steps for solving dynamic programming problems, called the SRTBOT framework. I found it so useful and practical that I decided to write this blog post to share it with everyone 🙌
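
To make the six steps concrete, here is a minimal sketch. The choice of problem (longest increasing subsequence) is my own for illustration, and the step names in the comments follow my recollection of the course: Subproblems, Relate, Topological order, Base cases, Original problem, Time.

```python
def longest_increasing_subsequence(a):
    """Length of the longest strictly increasing subsequence of a."""
    n = len(a)
    if n == 0:
        return 0
    # Subproblems: best[i] = length of the longest increasing
    # subsequence that starts at a[i] (and includes it).
    # Base cases: every element alone is a subsequence of length 1.
    best = [1] * n
    # Topological order: best[i] depends only on best[j] with j > i,
    # so fill the table from right to left.
    for i in range(n - 1, -1, -1):
        for j in range(i + 1, n):
            # Relate: extend a[i] with the best subsequence starting
            # at any later, larger element a[j].
            if a[j] > a[i]:
                best[i] = max(best[i], 1 + best[j])
    # Original problem: the answer may start at any index.
    return max(best)
    # Time: n subproblems, O(n) work each, so O(n^2) overall.


length = longest_increasing_subsequence([10, 9, 2, 5, 3, 7, 101, 18])  # 4
```

Writing the steps down as comments first, and only then filling in the code, is what makes the approach repeatable on a new problem.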

How to understand the backpropagation algorithm

Update: Backpropagation in matrix form can be found here

In deep learning, optimizing a network means continuously updating its weights and bias terms. This is done with gradient descent, which progressively minimizes the loss function. At the heart of this process lies the backpropagation algorithm, which makes it possible to compute gradients efficiently across the network.

To better understand this concept, let us recall the formula for gradient descent. In this formula, we utilize the symbol $\theta$ to represent all the learnable parameters of the model, $J$ to represent the cost or loss function, and $\alpha$ to denote the learning rate. Thus, we can express the updating process as:
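
In its standard form, that update is $\theta \leftarrow \theta - \alpha \nabla_\theta J(\theta)$. As a minimal sketch, the loop below applies it to a hypothetical one-dimensional cost $J(\theta) = (\theta - 3)^2$, chosen only because its minimum at $\theta = 3$ is easy to check; the function names, learning rate, and step count are assumptions.

```python
def gradient_descent(grad, theta, alpha, steps):
    """Repeatedly apply theta <- theta - alpha * grad(theta)."""
    for _ in range(steps):
        theta = theta - alpha * grad(theta)
    return theta


def grad_j(theta):
    # Gradient of the hypothetical cost J(theta) = (theta - 3)^2.
    return 2 * (theta - 3)


theta_star = gradient_descent(grad_j, theta=0.0, alpha=0.1, steps=100)
```

In a real network $\theta$ is a large collection of weights and biases, and computing the gradient for all of them is the job backpropagation does.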

Linear Regression Model Guide - theory part

Recently, I reviewed Andrew Ng's machine learning course on Coursera. Surprisingly, I still learned a lot, so I decided to write some posts👍.

To talk about linear regression, we must first have a basic understanding of what machine learning is. Abstractly speaking, machine learning is learning a function: $$ f(input) = output $$ where $f$ refers to the specific machine learning model. Machine learning is a methodology for automatically mining the relationship between input and output. Sometimes it is hard to define a specific algorithm to solve a problem, and this is where machine learning shines: we can let it learn and summarize patterns from data and then make predictions. This is also where it differs from traditional algorithms (binary search, recursion, etc.). One has to admit that machine learning is fascinating by definition, and it seems to provide a viable framework for solving otherwise intractable problems. It just so happens that many real-life problems are so hard that solving them with traditional algorithms is impractical.