Python tutorials

Ultimate guide to deal with Text Data (using Python) – for Data Scientists and Engineers

Introduction One of the biggest breakthroughs required for achieving any level of artificial intelligence is to have machines which can process text data. Thankfully, the amount of text data being generated in this universe has exploded exponentially in the last few years. It has become imperative for an organization to have a structure in place to mine actionable insights from the text being generated. From social media analytics to risk management and cybercrime protection, dealing with text data has never […]

Read more

Comprehensive Hands on Guide to Twitter Sentiment Analysis with dataset and code

Introduction Natural Language Processing (NLP) is a hotbed of research in data science these days and one of the most common applications of NLP is sentiment analysis. From opinion polls to creating entire marketing strategies, this domain has completely reshaped the way businesses work, which is why this is an area every data scientist must be familiar with. Thousands of text documents can be processed for sentiment (and other features including named entities, topics, themes, etc.) in seconds, compared to […]

Read more

8 Excellent Pretrained Models to get you Started with Natural Language Processing (NLP)

Introduction Natural Language Processing (NLP) applications have become ubiquitous these days. I seem to stumble across websites and applications regularly that are leveraging NLP in one form or another. In short, this is a wonderful time to be involved in the NLP domain. This rapid increase in NLP adoption has happened largely thanks to the concept of transfer learning enabled through pretrained models. Transfer learning, in the context of NLP, is essentially the ability to train a model on one dataset […]

Read more

How to Get Started with NLP – 6 Unique Methods to Perform Tokenization

Overview Looking to get started with Natural Language Processing (NLP)? Here’s the perfect first step Learn how to perform tokenization – a key aspect to preparing your data for building NLP models We present 6 different ways to perform tokenization on text data   Introduction Are you fascinated by the amount of text data available on the internet? Are you looking for ways to work with this text data but aren’t sure where to begin? Machines, after all, recognize numbers, […]

Read more

How Search Engines like Google Retrieve Results: Introduction to Information Extraction using Python and spaCy

Overview How do search engines like Google understand our queries and provide relevant results? Learn about the concept of information extraction We will apply information extraction in Python using the popular spaCy library – so a lot of hands-on learning is ahead!   Introduction I rely heavily on search engines (especially Google) in my daily role as a data scientist. My search results span a variety of queries – Python code questions, machine learning algorithms, comparison of Natural Language Processing […]

Read more

Hands-on NLP Project: A Comprehensive Guide to Information Extraction using Python

Overview Information extraction is a powerful NLP concept that will enable you to parse through any piece of text Learn how to perform information extraction using NLP techniques in Python   Introduction I’m a bibliophile – I love pouring through books in my free time and extracting as much knowledge as I can. But in today’s information overload age, the way we read stuff has changed. Most of us tend to skip the entire text, whether that’s an article, a […]

Read more

Steps for effective text data cleaning (with case study using Python)

Introduction   The days when one would get data in tabulated spreadsheets are truly behind us. A moment of silence for the data residing in the spreadsheet pockets. Today, more than 80% of the data is unstructured – it is either present in data silos or scattered around the digital archives. Data is being produced as we speak – from every conversation we make in the social media to every content generated from news sources. In order to produce any […]

Read more

Beginners Guide to Topic Modeling in Python

Introduction Analytics Industry is all about obtaining the “Information” from the data. With the growing amount of data in recent years, that too mostly unstructured, it’s difficult to obtain the relevant and desired information. But, technology has developed some powerful methods which can be used to mine through the data and fetch the information that we are looking for. One such technique in the field of text mining is Topic Modelling. As the name suggests, it is a process to […]

Read more

Natural Language Processing for Beginners: Using TextBlob

Introduction Natural Language Processing (NLP) is an area of growing attention due to increasing number of applications like chatbots, machine translation etc. In some ways, the entire revolution of intelligent machines in based on the ability to understand and interact with humans. I have been exploring NLP for some time now.  My journey started with NLTK library in Python, which was the recommended library to get started at that time. NLTK is a perfect library for education and research, it becomes […]

Read more

The Top GitHub Repositories & Reddit Threads Every Data Scientist should know (June 2018)

Introduction Half the year has flown by and that brings us to the June edition of our popular series – the top GitHub repositories and Reddit threads from last month. During the course of writing these articles, I have learned so much about machine learning from either open source codes or invaluable discussions among the top data science brains in the world. What makes GitHub special is not just it’s code hosting and social collaboration features for data scientists. It […]

Read more
1 140 141 142 143 144 183