July 20, 2021 Advanced, NLP, Text

Feature Extraction and Embeddings in NLP: A Beginners guide to understand Natural Language Processing

This article was published as a part of the Data Science Blogathon. Introduction: In Natural Language Processing, feature extraction is one of the basic steps to follow to better understand the context of what we are dealing with. After the initial text is cleaned and normalized, we need to transform it into features that can be used for modeling. We use a particular method to assign weights to particular words within our document before modeling them. We go […]

July 14, 2021 Data Science, Intermediate, Libraries, NLP, Python, Text, Word Embeddings

Practical Guide to Word Embedding System

This article was published as a part of the Data Science Blogathon. Pre-requisites: basic knowledge of Python; understanding of the basics of NLP (Natural Language Processing). Introduction: In natural language processing, word embedding represents words for text analysis as vectors that encode the meaning of each word, such that words that are closer in the vector space are expected to be similar in meaning. Consider, boy-men vs […]

July 13, 2021 Beginner, NLP, Python, Text, Word Embeddings

Word2Vec For Word Embeddings -A Beginner’s Guide

This article was published as a part of the Data Science Blogathon. Why are word embeddings needed? Let us consider two sentences: “You can scale your business.” and “You can grow your business.” These two sentences have the same meaning. If we build a vocabulary from these two sentences, it will consist of these words: {You, can, scale, grow, your, business}. A one-hot encoding of these words would create a vector of length 6. The encodings for each of […]
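
The one-hot idea in the excerpt above can be sketched in a few lines of Python. This is a minimal illustration, not code from the linked article: the ordering of the vocabulary and the helper name `one_hot` are assumptions made here. It shows that each of the six words gets a vector of length 6, and that “scale” and “grow” share no information in this encoding even though the two sentences mean the same thing.

```python
# Vocabulary from the excerpt; the ordering is an assumption for illustration.
vocab = ["You", "can", "scale", "grow", "your", "business"]


def one_hot(word, vocab):
    """Return a list with 1 at the word's index in vocab and 0 elsewhere."""
    vec = [0] * len(vocab)
    vec[vocab.index(word)] = 1
    return vec


scale_vec = one_hot("scale", vocab)
grow_vec = one_hot("grow", vocab)

# The dot product of two different one-hot vectors is always 0,
# so the encoding carries no notion of similarity between words.
dot = sum(a * b for a, b in zip(scale_vec, grow_vec))

print(scale_vec)  # [0, 0, 1, 0, 0, 0]
print(grow_vec)   # [0, 0, 0, 1, 0, 0]
print(dot)        # 0
```

The zero dot product is the limitation that dense word embeddings are meant to address.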

July 8, 2021 Beginner, Data Cleaning, Machine Learning, NLP, Python, Text, Word Embeddings

Part 5: Step by Step Guide to Master NLP – Word Embedding and Text Vectorization

This article was published as a part of the Data Science Blogathon. Introduction: This article is part of an ongoing blog series on Natural Language Processing (NLP). By the end of the previous part of this series, we had almost completed the steps involved in text cleaning and normalization. Next, we will convert the processed text to numeric feature vectors so that we can feed it to computers for machine learning applications. NOTE: Some concepts included in the pipeline of […]

December 2, 2020 Classification, Intermediate, Libraries, NLP, Programming, Python, PyTorch, Supervised, Text, Unstructured Data

Introduction to Flair for NLP: A Simple yet Powerful State-of-the-Art NLP Library

Introduction: The last couple of years have been incredible for Natural Language Processing (NLP) as a domain! We have seen multiple breakthroughs: ULMFiT, ELMo, Facebook’s PyText, and Google’s BERT, among many others. These have rapidly accelerated state-of-the-art research in NLP (and language modeling, in particular). We can now predict the next sentence, given a sequence of preceding words. What’s even more important is that machines are now beginning to understand the key element that had eluded them for so long. Context! Understanding context […]

November 10, 2020 Classification, Intermediate, Machine Learning, NLP, Python, Supervised, Technique, Text, Unstructured Data

Ultimate Guide to Understand and Implement Natural Language Processing (with codes in Python)

Overview: Complete guide on natural language processing (NLP) in Python; learn various techniques for implementing NLP, including parsing and text processing; understand how to use NLP for text feature engineering. Introduction: According to industry estimates, only 21% of the available data is present in structured form. Data is being generated as we speak, as we tweet, as we send messages on WhatsApp, and in various other activities. The majority of this data exists in textual form, which is highly unstructured […]

November 9, 2020 Classification, Intermediate, Libraries, Machine Learning, NLP, Programming, Python, Supervised, Text, Unstructured Data

Text Classification & Word Representations using FastText (An NLP library by Facebook)

Introduction: If you put a status update on Facebook about purchasing a car, don’t be surprised if Facebook serves you a car ad on your screen. This is not black magic! This is Facebook leveraging text data to serve you better ads. The picture below takes a jibe at a challenge in dealing with text data. Well, it clearly failed in the above attempt to deliver the right ad. It is all the more important to capture the context […]

November 9, 2020 Classification, Intermediate, Machine Learning, NLP, Python, Supervised, Technique, Text, Unstructured Data

Ultimate guide to deal with Text Data (using Python) – for Data Scientists and Engineers

Introduction: One of the biggest breakthroughs required for achieving any level of artificial intelligence is to have machines that can process text data. Thankfully, the amount of text data being generated in this universe has exploded exponentially in the last few years. It has become imperative for an organization to have a structure in place to mine actionable insights from the text being generated. From social media analytics to risk management and cybercrime protection, dealing with text data has never […]

November 6, 2020 Algorithm, Deep Learning, Intermediate, NLP, Sound Processing

An Introductory Guide to Understand how ANNs Conceptualize New Ideas (using Embedding)

Introduction: Here’s something you don’t hear every day: everything we perceive is just a best-case probabilistic prediction by our brain, based on our past encounters and knowledge gained through other mediums. This might sound extremely counterintuitive because we have always imagined that our brain mostly gives us deterministic answers. We’ll do a small experiment to showcase this logic. Take a look at the below image: Q1. Do you see a human? Q2. Can you identify the person? […]

November 6, 2020 Advanced, NLP, Python, Social Media, Technique, Text, Unstructured Data, Unsupervised, Word Embeddings

A Step-by-Step NLP Guide to Learn ELMo for Extracting Features from Text

Introduction: I work on different Natural Language Processing (NLP) problems (the perks of being a data scientist!). Each NLP problem is a unique challenge in its own way. That’s just a reflection of how complex, beautiful and wonderful human language is. But one thing that has always been a thorn in an NLP practitioner’s side is the inability (of machines) to understand the true meaning of a sentence. Yes, I’m talking about context. Traditional NLP techniques and frameworks were great when […]
