Topic Modeling and Latent Dirichlet Allocation (LDA) using Gensim and Sklearn: Part 1

This article was published as a part of the Data Science Blogathon. Introduction: Let’s say you have a client who owns a publishing house. Your client comes to you with two tasks: first, he wants to categorize all the books or research papers he receives weekly by common theme or topic; second, he wants to condense large documents into smaller, bite-sized texts. Is there any technique and tool available that can do both of these two […]

Text Preprocessing in NLP with Python codes

This article was published as a part of the Data Science Blogathon. Introduction: Natural Language Processing (NLP) is a branch of Data Science that deals with text data. Besides numerical data, text data is widely available and is used to analyze and solve business problems. But before the data can be used for analysis or prediction, it has to be processed. To prepare text data for model building, we perform text preprocessing. It is the very first […]

Why and how to use BERT for NLP Text Classification?

This article was published as a part of the Data Science Blogathon. Introduction: NLP, or Natural Language Processing, is a rapidly growing field. In the “new normal” imposed by COVID-19, a significant proportion of educational material, news, and discussion is delivered through digital media platforms. This makes more text data available to work with! Originally, simple RNNs (Recurrent Neural Networks) were used for training on text data. But in recent years there have been many new research publications that provide state-of-the-art results. One of […]

Text detection from images using EasyOCR: Hands-on guide

# Changing the image path
IMAGE_PATH = 'Turkish_text.png'
# Same code as before, only changing the language list from ['en'] to ['tr']
reader = easyocr.Reader(['tr'])
# Note: the string "False" is truthy in Python, so this call behaves like paragraph mode,
# which is consistent with the output below carrying no confidence scores
result = reader.readtext(IMAGE_PATH, paragraph="False")
result

Output:
[[[[89, 7], [717, 7], [717, 108], [89, 108]], 'Most Common Texting Slang in Turkish'],
 [[[392, 234], [446, 234], [446, 260], [392, 260]], 'test'],
 [[[353, 263], [488, 263], [488, 308], [353, 308]], 'yazmak'],
 [[[394, 380], [446, 380], [446, 410], [394, 410]], 'link'],
 [[[351, 409], [489, 409], [489, 453], [351, 453]], 'bağlantı'],
 [[[373, 525], […]
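The nested list printed in the output above can be awkward to read. As a minimal sketch (not part of the original article), the snippet below unpacks entries of the shape `[bounding_box, text]` seen in that output and reports the size of each text box. The helper name `summarize` is hypothetical, and only the first three entries of the output are reproduced as sample data.

```python
# Sample entries copied from the output above; each entry is [bounding_box, text],
# where bounding_box lists four [x, y] corner points.
sample_result = [
    [[[89, 7], [717, 7], [717, 108], [89, 108]], "Most Common Texting Slang in Turkish"],
    [[[392, 234], [446, 234], [446, 260], [392, 260]], "test"],
    [[[353, 263], [488, 263], [488, 308], [353, 308]], "yazmak"],
]


def summarize(result):
    """Return a (text, width, height) tuple for each [bounding_box, text] entry."""
    rows = []
    for box, text in result:
        xs = [x for x, _ in box]
        ys = [y for _, y in box]
        rows.append((text, max(xs) - min(xs), max(ys) - min(ys)))
    return rows


for text, width, height in summarize(sample_result):
    print(f"{text!r}: {width}x{height} px")
```

This assumes every entry has exactly two elements, as in the output shown; entries that also carry a confidence score would need a third name in the unpacking.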

Analyzing customer feedback using Aspect-Based Sentiment Analysis

This article was published as a part of the Data Science Blogathon. Introduction: With advances in technology, social media platforms such as Facebook, Twitter, and Instagram have become a place for customers to give feedback to businesses based on their satisfaction. Reviews posted by customers are a globally trusted source of genuine content for other users. Customer feedback serves as a third-party validation tool that builds user trust in the brand. For understanding this customer feedback […]

Part 6: Step by Step Guide to Master Natural Language Processing (NLP) in Python

This article was published as a part of the Data Science Blogathon. Introduction: This article is part of an ongoing blog series on Natural Language Processing (NLP). In the previous article of this series, we covered the statistical, or frequency-based, word embedding techniques, which predate the word embedding era. In this article, we will discuss the more recent techniques of the word embedding era. NOTE: There are many such techniques, but in this article we will discuss only the Word2Vec […]

Must-Know Techniques for text preprocessing in NLP

This article was published as a part of the Data Science Blogathon. In any machine learning task, cleaning or preprocessing the data is as important as model building. Text data is one of the most unstructured forms of available data, and human language makes it especially complex to deal with. Have you ever wondered how Alexa, Siri, and Google Assistant can understand, process, and respond in human language? NLP is the technology that works behind it, where before any response […]

Language Translation with Transformer In Python!

This article was published as a part of the Data Science Blogathon. Introduction: Natural Language Processing (NLP) is a field at the convergence of artificial intelligence and linguistics. The aim is to make computers understand real-world, or natural, language so that they can perform tasks like question answering, language translation, and many more. NLP has many applications in different fields.
1. NLP enables the recognition and prediction of diseases based on electronic health records.
2. It is used […]

Generate Questions from Movies!

This article was published as a part of the Data Science Blogathon. Have you ever thought of generating questions from the SRT files of movies? I don’t know how useful this is, but as a beginner I was pretty excited to learn that it can be done. What is SRT? In simple terms, the subtitles you see on Amazon Prime, Netflix, Hotstar, HBO, etc. are saved with timestamps in a text file with the .srt extension. The timestamp […]

Measuring Text Similarity Using BERT

This article was published as a part of the Data Science Blogathon. BERT is too kind, so this article will touch on BERT and sequence relationships! Abstract: A significant portion of NLP relies on relationships in high-dimensional spaces. Typically, an NLP pipeline takes a text, processes it to generate a large vector/array representing that text, and then applies certain transformations. It’s a high-dimensional charm. At a high level, there’s not much more to it. We need to […]
