Part 17: Step by Step Guide to Master NLP – Topic Modelling using pLSA

This article was published as a part of the Data Science Blogathon

Introduction

This article is part of an ongoing blog series on Natural Language Processing (NLP). In the previous article, we discussed a Topic modelling technique named Latent Semantic Analysis (LSA), but we observed that there are some disadvantages of LSA, so to overcome those problems, we come up with the concept of pLSA, which stands for Probabilistic Latent Semantic Analysis.

So, In this article, we will deep dive into the concepts of pLSA, which is a technique used to model information under a probabilistic framework, and also discuss the mathematics behind the different parametrization of this technique in a detailed manner.

This is part-17 of the blog series on the Step by Step Guide to Natural Language Processing.

 

Table of Contents

1. Familiar with variables involved in pLSA

2. What is pLSA?

3. Latent Variable Model for pLSA

4. Matrix Factorization model for pLSA

5. Advantages and Disadvantages of pLSA

 

To finish reading, please visit source site