TF-IDF model
What is the TF-IDF model
In the previous post, we talked about the bag-of-words model, which has many limitations. Today we take a step further and try to fix one of them: every word is treated as equally important.
💡 The crux of the problem: how do we define word importance?
One idea is: The more frequently a word appears within a single document, the more important it is for that document. For instance, in an article discussing dogs, the word “dog” is likely to appear frequently, reflecting the document’s main topic.
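To make this idea concrete, here is a minimal sketch of how a per-document word frequency could be computed. The function name `term_frequency`, the whitespace tokenization, and the choice to divide by the document length are all assumptions made for illustration; real implementations often tokenize and normalize differently.

```python
from collections import Counter


def term_frequency(document: str) -> dict[str, float]:
    """Return each word's share of the total word count in one document."""
    # Assumption: naive tokenization by lowercasing and splitting on whitespace.
    tokens = document.lower().split()
    counts = Counter(tokens)
    total = len(tokens)
    # Dividing by the document length keeps long and short documents comparable.
    # For an empty document, counts is empty, so no division happens.
    return {word: count / total for word, count in counts.items()}


# Hypothetical example document about dogs.
doc = "the dog chased the other dog"
tf = term_frequency(doc)
for word, score in sorted(tf.items(), key=lambda item: -item[1]):
    print(f"{word}: {score:.2f}")
```

In this toy document, "dog" makes up 2 of the 6 words, so it gets a higher score than "chased" or "other". Note that "the" scores just as high as "dog", which suggests that frequency within a single document is not enough on its own.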