Part 16: Step by Step Guide to Master NLP – Topic Modelling using LSA

This article was published as a part of the Data Science Blogathon. Introduction: This article is part of an ongoing blog series on Natural Language Processing (NLP). In the previous article, we covered a basic Topic Modeling technique named Non-Negative Matrix Factorization. Continuing from that part, in this article we take a deep dive into another Topic Modeling technique named Latent Semantic Analysis […]
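The excerpt only names the technique; as a minimal illustrative sketch (the toy corpus, parameters, and use of scikit-learn below are assumptions for this listing, not code from the article), LSA amounts to running a truncated SVD on a TF-IDF document-term matrix and reading each component as a latent topic:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.decomposition import TruncatedSVD

    docs = [
        "cats and dogs are popular pets",
        "dogs chase cats in the yard",
        "python and java are programming languages",
        "I write python code every day",
    ]

    # Build the TF-IDF document-term matrix
    tfidf = TfidfVectorizer(stop_words="english")
    X = tfidf.fit_transform(docs)

    # LSA: truncated SVD of that matrix; each component acts as a "topic"
    lsa = TruncatedSVD(n_components=2, random_state=0)
    doc_topic = lsa.fit_transform(X)

    # Show the top terms per topic
    terms = tfidf.get_feature_names_out()
    for i, component in enumerate(lsa.components_):
        top = component.argsort()[-3:][::-1]
        print(f"Topic {i}:", [terms[j] for j in top])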

Read more

Part 20: Step by Step Guide to Master NLP – Information Retrieval

This article was published as a part of the Data Science Blogathon. Introduction: This article is part of an ongoing blog series on Natural Language Processing (NLP). In the previous article, we completed our discussion of Topic Modelling techniques. In this article, we turn to an important application of NLP: Information Retrieval. We will discuss the basic concepts of Information Retrieval along with some of the models used in it. NOTE: […]

Read more

Bag-of-words vs TFIDF vectorization – A Hands-on Tutorial

This article was published as a part of the Data Science Blogathon Whenever we apply any algorithm to textual data, we need to convert the text to a numeric form. Hence, there arises a need for some pre-processing techniques that can convert our text to numbers. Both bag-of-words (BOW) and TFIDF are pre-processing techniques that can generate a numeric form from an input text. Bag-of-Words: The bag-of-words model converts text into fixed-length vectors by counting how many times each word appears. […]
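As a quick hands-on illustration of the difference (the two example sentences and the use of scikit-learn below are illustrative choices, not the tutorial's own code), both vectorizers return fixed-length numeric vectors, but TF-IDF down-weights words that occur in every document:

    from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

    docs = ["the cat sat on the mat", "the dog sat on the log"]

    # Bag-of-words: raw counts of each vocabulary word per document
    bow = CountVectorizer()
    counts = bow.fit_transform(docs)
    print(bow.get_feature_names_out())
    print(counts.toarray())

    # TF-IDF: the same counts re-weighted, so words shared by all
    # documents ("the", "sat", "on") contribute less
    tfidf = TfidfVectorizer()
    print(tfidf.fit_transform(docs).toarray().round(2))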

Read more

Vision Transformer for Fast and Efficient Scene Text Recognition

deep-text-recognition-benchmark: ViTSTR is a simple single-stage model that uses a pre-trained Vision Transformer (ViT) to perform Scene Text Recognition (ViTSTR). Its accuracy is comparable to state-of-the-art STR models although it uses significantly fewer parameters and FLOPS. ViTSTR is also fast due to the parallel computation inherent to the ViT architecture. ViTSTR is built using a fork of the CLOVA AI Deep Text Recognition Benchmark, whose original documentation is at the bottom. Below we document how to train and evaluate […]
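The repository's actual training and evaluation commands are not reproduced in this excerpt. The snippet below is only a conceptual PyTorch sketch of the single-stage, parallel per-character prediction the description refers to; it uses an untrained stand-in encoder in place of the pre-trained ViT backbone, and every size in it is an arbitrary assumption:

    import torch
    import torch.nn as nn

    # Stand-in for the ViTSTR idea: embed image patches as tokens, run a
    # transformer encoder once, and classify one character per output
    # position in parallel (no recurrent decoding step).
    class ToySTR(nn.Module):
        def __init__(self, num_chars=95, max_len=25, patch_dim=16 * 16 * 3, d_model=192):
            super().__init__()
            self.embed = nn.Linear(patch_dim, d_model)                  # patch -> token
            layer = nn.TransformerEncoderLayer(d_model, nhead=3, batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, num_layers=4)   # ViTSTR uses a pre-trained ViT here
            self.head = nn.Linear(d_model, num_chars)                   # shared character classifier
            self.max_len = max_len

        def forward(self, patches):                      # patches: (batch, num_patches, patch_dim)
            tokens = self.encoder(self.embed(patches))
            return self.head(tokens[:, : self.max_len])  # (batch, max_len, num_chars)

    logits = ToySTR()(torch.randn(2, 196, 16 * 16 * 3))
    print(logits.shape)  # torch.Size([2, 25, 95])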

Read more

A Simple Strong Baseline for TextVQA and TextCaps

Simple is not Easy: A Simple Strong Baseline for TextVQA and TextCaps [AAAI 2021]. Citation: if you use ssbaseline in your work, please cite:

    @article{zhu2020simple,
      title={Simple is not Easy: A Simple Strong Baseline for TextVQA and TextCaps},
      author={Zhu, Qi and Gao, Chenyu and Wang, Peng and Wu, Qi},
      journal={arXiv preprint arXiv:2012.05153},
      year={2020}
    }

Installation: first install the repo using

    git clone https://github.com/ZephyrZhuQi/ssbaseline.git ~/ssbaseline
    cd ~/ssbaseline
    python setup.py build develop

Getting Data: we provide SBD-Trans OCR for TextVQA and […]

Read more

Getting Started with Natural Language Processing using Python

This article was published as a part of the Data Science Blogathon. Why NLP? Natural Language Processing has always been a key tenet of Artificial Intelligence (AI). With the increase in the adoption of AI, systems to automate sophisticated tasks are being built. Some examples are described below. Diagnosing a rare form of cancer – At the University of Tokyo’s Institute of Medical Science, doctors used artificial intelligence to successfully diagnose a rare type of leukemia. The doctors used an AI […]

Read more

A Python parser that takes the content of a text file and reads it into variables

Text-File-Parser: a Python parser that takes the content of a text file and reads it into variables. Input.text file:

    1. What is your ***?
       1. 18 – 34
       2. 35 – 44
       3. 45 – 54
       4. 55 – 64
       5. Over 65
       6. Don’t know
    2. What *** do you live in?
       1. Ontario
       2. Quebec
       3. Manitoba
       4. Alberta
       5. Other

Given a plain text file as above, this Python script reads all the questions and their numbers, storing them into two […]
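The excerpt is cut off before naming the two containers, so the sketch below is only one possible reading (the regular expressions and the two lists are assumptions for this listing, not the repository's code): question lines are recognised by their trailing question mark, and each question's numbered options are collected alongside it:

    import re

    questions, options = [], []   # assumed containers; the excerpt stops at "two ..."

    with open("Input.text", encoding="utf-8") as f:
        current = None
        for raw in f:
            line = raw.strip()
            question = re.match(r"^(\d+)\.\s+(.*\?)$", line)   # e.g. "1. What is your ***?"
            if question:
                questions.append((int(question.group(1)), question.group(2)))
                current = []
                options.append(current)
            elif current is not None and re.match(r"^\d+\.\s+", line):
                current.append(re.sub(r"^\d+\.\s+", "", line))  # "1. Ontario" -> "Ontario"

    print(questions)
    print(options)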

Read more

Feature Extraction and Embeddings in NLP: A Beginner's Guide to Understanding Natural Language Processing

This article was published as a part of the Data Science Blogathon. Introduction: In Natural Language Processing, Feature Extraction is one of the basic steps to be followed for a better understanding of the context of what we are dealing with. After the initial text is cleaned and normalized, we need to transform it into features that can be used for modeling. We use particular methods to assign weights to particular words within our document before modeling them. We go […]
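As one concrete illustration of turning cleaned text into weighted numeric features (the toy corpus and the use of gensim's Word2Vec below are assumptions for this listing, not the article's code), word embeddings map each token to a dense vector learned from its context:

    from gensim.models import Word2Vec

    # Toy corpus: each document is a list of normalized tokens
    corpus = [
        ["nlp", "extracts", "features", "from", "text"],
        ["word", "embeddings", "map", "words", "to", "dense", "vectors"],
        ["similar", "words", "get", "similar", "vectors"],
    ]

    # Train a small Word2Vec model (gensim 4.x API; all parameters are illustrative)
    model = Word2Vec(corpus, vector_size=32, window=3, min_count=1, epochs=50, seed=1)

    print(model.wv["vectors"][:5])           # first dimensions of one word's embedding
    print(model.wv.most_similar("words"))    # nearest neighbours in the embedding space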

Read more

A Python application to encode and decode text

Text Encoder and Decoder: encode and decode text in many ways using this GUI application! Encode in: ASCII85, Base85, Base64, Base32, Base16, URL, MD5 hash, SHA-1, SHA-224, SHA-384, SHA-256, SHA-512. Decode in: ASCII85, Base85, Base64, Base32, Base16, URL. GitHub: https://github.com/nonimportant/text-encode-and-decoder
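The application's own source is linked above; as a rough illustration of the same idea using only the Python standard library (the sample text is arbitrary), the base-N encodings are reversible while the hashes are one-way digests:

    import base64
    import hashlib

    data = "Hello, world!".encode("utf-8")

    # Reversible encodings: the original text can be recovered by decoding
    b64 = base64.b64encode(data)
    a85 = base64.a85encode(data)
    print(b64, base64.b64decode(b64).decode("utf-8"))
    print(a85, base64.a85decode(a85).decode("utf-8"))

    # One-way hashes: they cannot be decoded, only recomputed and compared
    print(hashlib.md5(data).hexdigest())
    print(hashlib.sha256(data).hexdigest())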

Read more

Indexing in Natural Language Processing for Information Retrieval

This article was published as a part of the Data Science Blogathon. Overview: This blog covers GREP (Global Regular Expression Print) and its drawbacks. Then we move on to the Document Term Matrix and the Inverted Index. Finally, we end with dynamic and distributed indexing. (Image source: https://javarevisited.blogspot.com/2011/06/10-examples-of-grep-command-in-unix-and.html#axzz6zwakOXgt) Global Regular Expression Print: Whenever we are dealing with a small amount of data, we can use the grep command very efficiently. It allows us to search one or more files for lines that contain a pattern. For […]
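As a small illustrative sketch (the three toy documents below are invented for this listing, not taken from the blog), an inverted index maps each term to the set of documents containing it, so a multi-term query becomes an intersection of posting lists rather than a grep-style scan of every file:

    from collections import defaultdict

    docs = {
        1: "grep searches files for lines that match a pattern",
        2: "an inverted index maps each term to the documents containing it",
        3: "indexing makes information retrieval fast at scale",
    }

    # Build the inverted index: term -> set of document ids (its posting list)
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)

    print(sorted(index["index"]))                  # -> [2]
    # A query with several terms intersects their posting lists
    print(sorted(index["grep"] & index["files"]))  # -> [1]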

Read more