NLP: Answer Retrieval from Document using Python

This article was published as a part of the Data Science Blogathon. Introduction → This article focuses on answer retrieval from a document using similarity and difference metrics. This task falls under Natural Language Processing (NLP), a subfield of artificial intelligence. In this article, we will look at general similarity algorithms and how they can be applied to complete our task. The coding part uses Python. How to Approach → To […]

Resume Screening with Natural Language Processing in Python

For each recruitment drive, companies take out online ads, collect referrals, and go through the applications manually. Companies often receive thousands of resumes for every posting. When companies collect resumes through online advertisements, they categorize those resumes according to their requirements. After collecting resumes, companies close the advertisements and online application portals. Then they send the collected resumes to the hiring team(s). It becomes very difficult for the hiring teams to read every resume and select candidates according to the requirements, there is […]

Part 14: Step by Step Guide to Master NLP – Basics of Topic Modelling

This article was published as a part of the Data Science Blogathon. Introduction: This article is part of an ongoing blog series on Natural Language Processing (NLP). Earlier in this series, we completed our discussion of the entity extraction technique “Named Entity Recognition (NER)”. At that time, however, we didn’t discuss another popular technique called Topic Modelling. So, in continuation of that article, we will discuss Topic Modelling here. In this article, we will first discuss some of […]

Part 17: Step by Step Guide to Master NLP – Topic Modelling using pLSA

This article was published as a part of the Data Science Blogathon. Introduction: This article is part of an ongoing blog series on Natural Language Processing (NLP). In the previous article, we discussed a topic modelling technique named Latent Semantic Analysis (LSA) and observed that it has some disadvantages. To overcome those problems, we turn to pLSA, which stands for Probabilistic Latent Semantic Analysis. So, in this article, we will deep dive into […]

Topic extraction From Prime Minister Modi’s Speech

This article was published as a part of the Data Science Blogathon. Introduction: Artificial Intelligence (AI) has been a trendy term for many years. Earlier, when we heard the term “AI”, we could only think of robots. However, AI is not limited to robots, and nowadays many of the electronic devices we use have AI associated with them, be it smartphones, smart TVs, refrigerators, or air conditioners. AI basically means that a machine can make its own decisions without human intervention. […]

Part 3: Step by Step Guide to NLP – Text Cleaning and Preprocessing

This article was published as a part of the Data Science Blogathon. Introduction: This article is part of an ongoing blog series on Natural Language Processing (NLP). In part 1 and part 2 of this blog series, we completed the theoretical concepts related to NLP. Now, in continuation of those parts, we will cover some new concepts. In this article, we will understand the required terminology and then start our journey towards text cleaning and preprocessing, which is […]

Part 6: Step by Step Guide to Master NLP – Word2Vec

This article was published as a part of the Data Science Blogathon. Introduction: This article is part of an ongoing blog series on Natural Language Processing (NLP). In the previous article of this series, we completed the statistical or frequency-based word embedding techniques, which belong to the pre-word-embedding era. So, in this article, we will discuss the more recent word-embedding-era techniques. NOTE: There are many such techniques, but in this article, we will discuss only the Word2Vec […]

Regex Cheatsheet For Natural Language Processing tasks

This article was published as a part of the Data Science Blogathon. Introduction: Regex is shorthand for Regular Expression. It is a representation of a set of strings. Say we have a list of emails and we want to check whether they are in the correct format. One way is to check each email manually, but that is not practical if the number of emails is large. This is where regex comes to your rescue. […]

Part 13: Step by Step Guide to Master NLP – Regular Expressions

This article was published as a part of the Data Science Blogathon. Introduction: This article is part of an ongoing blog series on Natural Language Processing (NLP). With this article, we start our discussion of Regular Expressions. When a data scientist comes across a text processing problem, whether it is searching for titles in names or dates of birth in a dataset, regular expressions rear their ugly head very frequently. They form part of the basic techniques in NLP and […]

Part 2: Topic Modeling and Latent Dirichlet Allocation (LDA) using Gensim and Sklearn

This article was published as a part of the Data Science Blogathon. Introduction: In the previous article, we started by understanding the basic terminology of text in Natural Language Processing (NLP), what topic modeling is, its applications, the types of models, and the different topic modeling techniques available. Let’s continue from there and explore Latent Dirichlet Allocation (LDA), how LDA works, and its similarity to another very popular dimensionality reduction technique called Principal Component Analysis (PCA). Table of Contents: A Little […]
