File-based TF-IDF: Calculates keywords in a document, using a word corpus
Why? Because I found myself with hundreds of plain text files and no way to know what each one contained. I then recalled this thing called TF-IDF from university, but found no utility that operates on files. Hence, here we are. How? Basically, each word in the current document gets a score. The score increases each time the word appears in this document, and decreases each time it appears in […]
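The scoring idea can be sketched in a few lines of Python. This is not the tool's actual code, just a minimal illustration under some assumptions: the score is decreased by how many other texts in the corpus contain the word (the usual TF-IDF reading), the inverse document frequency is smoothed as log((1 + N) / (1 + df)), and the helper names `tokenize`, `tf_idf`, and `top_keywords` are made up for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple alphabetic words."""
    return re.findall(r"[a-z']+", text.lower())


def tf_idf(document, corpus):
    """Score each word of `document` against `corpus` (a list of texts).

    Assumed formula: tf * log((1 + N) / (1 + df)), where tf is the word's
    relative frequency in the document, N is the number of corpus texts,
    and df is the number of corpus texts that contain the word.
    """
    words = tokenize(document)
    if not words:
        return {}
    counts = Counter(words)
    corpus_vocab = [set(tokenize(text)) for text in corpus]
    n_docs = len(corpus_vocab)
    scores = {}
    for word, count in counts.items():
        tf = count / len(words)  # grows with each occurrence in this document
        df = sum(1 for vocab in corpus_vocab if word in vocab)
        idf = math.log((1 + n_docs) / (1 + df))  # shrinks as more texts use the word
        scores[word] = tf * idf
    return scores


def top_keywords(document, corpus, k=3):
    """Return the k highest-scoring words, ties broken alphabetically."""
    scores = tf_idf(document, corpus)
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]


if __name__ == "__main__":
    corpus = [
        "the cat sat on the mat",
        "the dog ate the bone",
        "the bird sang",
    ]
    document = "the cat chased the cat and the cat"
    print(top_keywords(document, corpus))
```

To run this over files rather than strings, one would read each file's contents and pass them in as `document` and `corpus`. A word such as "the", which appears in every corpus text, ends up with a score of zero under this formula, which is why it never shows up as a keyword.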