There's no question that BM25 is a more advanced relevancy algorithm than what ts_rank or ts_rank_cd use. BM25 uses more input signals, it's based on better heuristics, and it typically doesn't require tuning. Ultimately, this gives better results over a wider range of document types. While the TF-IDF formula is mostly based on intuition and practical experiments, BM25 is the result of more formal mathematical research. If you're curious about that research, I recommend this talk, which makes it accessible. Interestingly, the resulting BM25 formula is not all that different from TF-IDF, but it incorporates two more concepts: term frequency saturation and document length.
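
To make the two extra concepts concrete, here is a minimal sketch of BM25 scoring in Python. It is not how any particular search engine implements it: the function name `bm25_scores` is made up for illustration, documents are assumed to be pre-tokenized lists of words, the IDF term uses the always-positive `log(1 + ...)` variant popularized by Lucene, and `k1=1.2` and `b=0.75` are simply commonly used defaults.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document against the query terms with BM25.

    k1 controls term frequency saturation; b controls how strongly
    scores are normalized by document length (0 disables it).
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for d in docs:
        df.update(set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        # Longer-than-average documents get a value above 1, which
        # enlarges the denominator below and lowers the score.
        length_norm = 1 - b + b * len(d) / avg_len
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            # Rare terms weigh more (the IDF part, as in TF-IDF).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Saturating TF: each repeat of the term adds less than the last.
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * length_norm)
        scores.append(score)
    return scores


docs = [
    ["postgres", "full", "text", "search"],
    ["postgres", "postgres", "postgres", "tuning", "guide", "for", "busy", "people"],
    ["cooking", "recipes"],
]
print(bm25_scores(["postgres"], docs))
```

With plain TF-IDF, a document that mentions a term ten times can score roughly ten times higher than one that mentions it once. In the sketch above, the term frequency part can never exceed `k1 + 1`, and a match in a short document counts for more than the same match in a long one.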