The Rich Get Richer: Disparate Impact of Semi-Supervised Learning

Preprocessing the dataset used for implicit sub-populations (demographic groups: race and gender). The preprocessing script processes the Jigsaw dataset and returns train/test dataset files that include demographic group information.

Step-1: Download the Jigsaw dataset file identity_individual_annotations.csv from https://www.kaggle.com/c/jigsaw-unintended-bias-in-toxicity-classification/data.

Step-2: Run python preprocecss_jiasaw_toxicity_gender_and_race_balanced.py

Implementation of SSL methods. Please follow the official implementations of MixMatch, MixText, and UDA.

[1] https://github.com/google-research/mixmatch
[2] https://github.com/GT-SALT/MixText
[3] https://github.com/google-research/uda

GitHub repository: UCSC-REAL/Disparate-SSL
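The preprocessing script itself is not reproduced in this excerpt. As a rough illustration of what a group-balanced train/test split might look like, here is a minimal sketch. It assumes that "balanced" means downsampling every demographic group to the size of the smallest one; the function name balanced_split, the column names (text, toxic, group), and the toy rows are all hypothetical and are not taken from the repository.

```python
import random
from collections import defaultdict


def balanced_split(rows, group_key, test_fraction=0.2, seed=0):
    """Downsample each group to the smallest group's size, then split.

    Hypothetical helper: an assumption about what a "balanced" split
    could mean, not the logic of the repository's script.
    """
    rng = random.Random(seed)

    # Bucket the rows by their demographic group label.
    by_group = defaultdict(list)
    for row in rows:
        by_group[row[group_key]].append(row)

    # Every group is cut down to the size of the smallest group.
    smallest = min(len(members) for members in by_group.values())
    n_test = max(1, int(round(smallest * test_fraction)))

    train, test = [], []
    for group in sorted(by_group):
        members = by_group[group][:]
        rng.shuffle(members)
        members = members[:smallest]
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return train, test


# Toy rows with hypothetical column names (not the real Jigsaw schema).
rows = (
    [{"text": f"a{i}", "toxic": i % 2, "group": "female"} for i in range(10)]
    + [{"text": f"b{i}", "toxic": i % 2, "group": "male"} for i in range(6)]
)
train, test = balanced_split(rows, group_key="group")
print(len(train), len(test))
```

Downsampling keeps the sketch simple but discards data from larger groups; reweighting or oversampling would be alternative design choices.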


Neural Semi supervised Learning for Text Classification Under Large Scale Pretraining

Neural semi-supervised learning for text classification under large-scale pretraining. Download Models and Dataset: the datasets and models are listed below. Download the 3.4M IMDB movie reviews and save the data at [REVIEWS_PATH]. Download the vanilla RoBERTa-large model released by HuggingFace and save the model at [VANILLA_ROBERTA_LARGE_PATH]. Download the in-domain pretrained models from the paper and save them at [PRETRAIN_MODELS]. The following three models are provided. init-roberta-base: […]


Semi-supervised Semantic Segmentation with Directional Context-aware Consistency

Semi-supervised Semantic Segmentation with Directional Context-aware Consistency (CAC). Xin Lai*, Zhuotao Tian*, Li Jiang, Shu Liu, Hengshuang Zhao, Liwei Wang, Jiaya Jia. This is the official PyTorch implementation of our paper Semi-supervised Semantic Segmentation with Directional Context-aware Consistency, which has been accepted to the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2021). Our method achieves state-of-the-art performance on semi-supervised semantic segmentation. Based on CCT, this repository also supports efficient distributed training with multiple GPUs. Environment The […]


Fake news classifier on US Election News📰 | LSTM 🈚

Introduction: News media has become a channel for passing on information about what is happening in the world. People often perceive whatever is conveyed in the news to be true. There have been cases where even news channels acknowledged that their news was not as true as they had written. But some news has a significant impact not only on the people or […]


Multilingualism in Natural Language Processing: Targeting Low Resource Indian Languages

Introduction: A language is a systematic form of communication that can take a variety of forms. There are approximately 7,000 languages believed to be spoken across the globe. Despite this diversity, the majority of the world’s population speaks only a fraction of these languages. In spite of such rich diversity, languages are still evolving over time, much like the society we live in. While the English language is uniform, having the distinct status of being the official language of […]


Multilingualism in Natural Language Processing targeting low resource Indian languages

Introduction: Language is a systematic form of communication that can take a variety of forms. There are approximately 7,000 languages believed to be spoken across the globe. Despite this diversity, the majority of the world’s population speaks only a fraction of these languages. In spite of such rich diversity, languages are still evolving over time, much like the society we live in. While the English language is uniform, having the distinct status of being the official language of multiple […]


An Exhaustive Guide to Detecting and Fighting Neural Fake News using NLP

Overview: Neural fake news (fake news generated by AI) can be a huge issue for our society. This article discusses different Natural Language Processing methods for developing a robust defense against neural fake news, including the GPT-2 detector model and Grover (AllenNLP). Every data science professional should be aware of what neural fake news is and how to combat it. Introduction: Fake news is a major concern in our society right now. It has gone hand-in-hand with the rise […]
