Indexing in Natural Language Processing for Information Retrieval

This article was published as a part of the Data Science Blogathon

Overview

  • This blog covers grep (Global Regular Expression Print) and its drawbacks
  • Then we move on to Document Term Matrix and Inverted Matrix
  • Finally, we end with dynamic and distributed indexing

Global Regular Expression Print

When we are dealing with a small amount of data, the grep command is an efficient way to search it. It scans one or more files and prints every line that matches a given pattern.

For example:

“grep pat check.txt”

This command prints every line of the file check.txt that contains the text string “pat”.

Because grep matches the pattern anywhere in a line, lines containing strings such as “pat”, “patty”, “pattern”, or “patties” will all be printed to the terminal.
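
To make the matching behaviour concrete, here is a minimal Python sketch of what a grep-style search does. It is only an illustration, not how grep itself is implemented: the function name grep_lines and the sample lines standing in for check.txt are assumptions made for this example.

```python
import re
from typing import Iterable, List


def grep_lines(pattern: str, lines: Iterable[str]) -> List[str]:
    """Return every line in which the regular expression matches anywhere."""
    compiled = re.compile(pattern)
    # Linear scan: every line is examined once for every search.
    return [line for line in lines if compiled.search(line)]


# Hypothetical stand-in for the contents of check.txt.
sample_lines = [
    "a pat on the back",
    "patty melts are on the menu",
    "this pattern repeats",
    "two beef patties",
    "nothing relevant here",
]

matches = grep_lines("pat", sample_lines)
for line in matches:
    print(line)
```

The first four sample lines are printed because each contains “pat” as a substring, while the last one is skipped. Note that the sketch reads every line on every search, which is fine for a small file like this one.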

Drawbacks of the grep command: