The ts_rank functions use the term frequency within that document, not a global term frequency (which is why you need a separate index, like what Elasticsearch does).

This is important because if a word is too common, it's considered less significant for a document match. When we calculate IDF, it will be very low for the most frequently occurring words, such as stop words ("is" is present in almost all of the documents, so tf-idf gives that word a very low value).
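
To make the IDF point concrete, here's a rough sketch in Python. The three-document corpus and the helper names (`idf`, `tf_idf`) are made up for illustration, and it uses the plain textbook log(N / df) formula, not whatever Elasticsearch or any Postgres extension actually computes.

```python
import math
from collections import Counter

# Hypothetical toy corpus, for illustration only.
docs = [
    "postgres is a relational database",
    "elasticsearch is a search engine",
    "ranking is hard",
]

tokenized = [d.split() for d in docs]
n_docs = len(tokenized)

# Document frequency: how many documents contain each term.
# This is the corpus-wide statistic that a per-document ranking lacks.
df = Counter(term for tokens in tokenized for term in set(tokens))

def idf(term):
    # Textbook IDF, log(N / df). Assumes the term occurs somewhere in the corpus.
    return math.log(n_docs / df[term])

def tf_idf(term, tokens):
    # Term frequency within this one document, scaled by the global IDF.
    tf = tokens.count(term) / len(tokens)
    return tf * idf(term)

for term in ("is", "postgres"):
    print(term, idf(term), tf_idf(term, tokenized[0]))
```

"is" appears in every document, so its IDF is log(3/3) = 0 and it contributes nothing to the score, while "postgres" appears in only one document and gets a positive weight. A ranking based only on within-document frequency can't make that distinction.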

Someone has implemented this, and it's pretty cool, but performance definitely takes a hit versus a separate Elasticsearch cluster. https://codebots.com/crud/How-to-efficiently-search-text-usi...



