Bag-of-Word model
What is the bag-of-word model?
In NLP, we need to represent each document as a vector because machine learning can only accept input as numbers. That is, we want to find a magic function that: $$ f(\text{document}) = vector $$
Today’s topic is bag-of-word(BoW) model, which can transform a document into a vector representation.
💡 Although the BoW model is outdated in 2023, I still encourage you to learn from the history and think about some essential problems: