December 2, 2020 Intermediate, NLP, Project, Python, Sequence Modeling, Supervised, Text, Unstructured Data

Building a FAQ Chatbot in Python – The Future of Information Searching

Introduction: What do we do when we need any information? Simple: “We Ask, and Google Tells”. But if the answer depends on multiple variables, then the existing Ask-Tell model tends to sputter. State-of-the-art search engines usually cannot handle such requests. We would have to search for information available in bits and pieces and then try to filter and assemble the relevant parts. Sounds time-consuming, doesn’t it? This Ask-Tell model is evolving rapidly with the […]

December 2, 2020 Classification, Data Science, Intermediate, Machine Learning, NLP, Project, Python, Supervised, Text, Unstructured Data

A Comprehensive Guide to Understand and Implement Text Classification in Python

Improving Text Classification Models: The above framework can be applied to a number of text classification problems, but achieving good accuracy requires some improvements to the overall framework. The following are some tips to improve the performance of text classification models and this framework. 1. Text Cleaning: text cleaning can help reduce the noise present in text data in the form of stopwords, punctuation marks, suffix variations, etc. This article can help to understand how […]

December 2, 2020 Classification, Intermediate, Libraries, NLP, Programming, Python, PyTorch, Supervised, Text, Unstructured Data

Introduction to Flair for NLP: A Simple yet Powerful State-of-the-Art NLP Library

Introduction: The last couple of years have been incredible for Natural Language Processing (NLP) as a domain! We have seen multiple breakthroughs – ULMFiT, ELMo, Facebook’s PyText, Google’s BERT, among many others. These have rapidly accelerated state-of-the-art research in NLP (and language modeling, in particular). We can now predict the next sentence, given a sequence of preceding words. What’s even more important is that machines are now beginning to understand the key element that had eluded them for so long. Context! Understanding context […]

December 2, 2020 Analytics Vidhya, Intermediate, Listicle, NLP, Winners Approach

Innoplexus Sentiment Analysis Hackathon: Top 3 Out-of-the-Box Winning Approaches

Overview: Hackathons are a wonderful opportunity to gauge your data science knowledge and compete to win lucrative prizes and job opportunities. Here are the top 3 approaches from the Innoplexus Sentiment Analysis Hackathon – a superb NLP challenge. Introduction: I’m a big fan of hackathons. I’ve learned so much about data science from participating in these hackathons in the past few years. I’ll admit it – I have gained a lot of knowledge through this medium and this, in […]

December 2, 2020 Classification, Intermediate, Libraries, Machine Learning, NLP, Programming, Project, Python, Spark, Text, Unstructured Data

How to use a Machine Learning Model to Make Predictions on Streaming Data using PySpark

Overview: Streaming data is a thriving concept in the machine learning space. Learn how to use a machine learning model (such as logistic regression) to make predictions on streaming data using PySpark. We’ll cover the basics of Streaming Data and Spark Streaming, and then dive into the implementation part. Introduction: Picture this – every second, more than 8,500 Tweets are sent, more than 900 photos are uploaded on Instagram, more than 4,200 Skype calls are made, more than 78,000 […]

November 26, 2020 Intermediate, NLP, Project, Python, Text

Words that matter! A Simple Guide to Keyword Extraction in Python

This article was published as a part of the Data Science Blogathon. Introduction: Unstructured data contains a plethora of information. Like energy, when harnessed it creates high value for its stakeholders. A lot of work is already being done in this area by various companies. There is no doubt that unstructured data is noisy and that significant work has to be done to clean it, analyze it, and make it meaningful to use. This article talks about an area which […]

November 10, 2020 Big data, Business Analytics, Intermediate, NLP, Technique

Information Retrieval System explained in simple terms!

Introduction: While searching for things over the internet, I always wondered what kind of algorithms might be running behind these search engines to provide us with the most relevant information. How do they decide which result to show for which set of search keywords? This might be a no-brainer for a few people, but it is definitely an interesting problem for some of the best brains around the world. To find the answer, I read every guide, tutorial, and piece of learning material that came my way. Eventually, I learnt […]

November 10, 2020 Classification, Intermediate, Machine Learning, NLP, Python, Supervised, Technique, Text, Unstructured Data

Ultimate Guide to Understand and Implement Natural Language Processing (with codes in Python)

Overview: Complete guide on natural language processing (NLP) in Python. Learn various techniques for implementing NLP including parsing & text processing. Understand how to use NLP for text feature engineering. Introduction: According to industry estimates, only 21% of the available data is present in structured form. Data is being generated as we speak, as we tweet, as we send messages on WhatsApp and in various other activities. The majority of this data exists in textual form, which is highly unstructured […]

November 10, 2020 Classification, Intermediate, Machine Learning, NLP, Python, R, Supervised, Technique, Text, Unstructured Data

Complete tutorial on Text Classification using Conditional Random Fields Model (in Python)

Introduction: The amount of text data being generated in the world is staggering. Google processes more than 40,000 searches EVERY second! According to a Forbes report, every single minute we send 16 million text messages and post 510,000 comments on Facebook. For a layman, it is difficult to even grasp the sheer magnitude of data out there. News sites and other online media alone generate tons of text content on an hourly basis. Analyzing patterns in that data can become […]

November 10, 2020 Intermediate, NLP, Podcast

DataHack Radio #21: Detecting Fake News using Machine Learning with Mike Tamir, Ph.D.

Introduction: Fake news is one of the biggest scourges in our digitally connected world. That is no exaggeration. It is no longer limited to little squabbles – fake news spreads like wildfire and is impacting millions of people every day. How do you deal with such a sensitive issue? Millions of articles are being churned out every day on the internet – how do you tell real from fake? It’s not as easy as turning to a simple fact checker. […]
