Detecting Fake News with Natural Language Processing

This article was published as a part of the Data Science Blogathon. 1. Introduction: We consume news through several mediums throughout the day, but it is sometimes difficult to tell which stories are fake and which are authentic. Do you trust all the news you consume from online media? Not all the news we consume is real. If you listen to fake news, you are collecting wrong information about the world, which can […]

Stochastic Gradient Descent Algorithm With Python and NumPy

Stochastic gradient descent is an optimization algorithm often used in machine learning applications to find the model parameters that correspond to the best fit between predicted and actual outputs. It’s an inexact but powerful technique. Combined with backpropagation, it’s dominant in neural network training applications. Basic Gradient Descent Algorithm: The gradient descent algorithm is an approximate and iterative method for mathematical optimization. You can use it to approach the minimum […]
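As a rough illustration of the idea in this teaser, here is a minimal sketch of stochastic gradient descent fitting a straight line. It is not the code from the linked article: it uses only the Python standard library rather than NumPy, and the function name `sgd_linear_fit`, the learning rate, the epoch count, and the toy data are all assumptions made for the example.

```python
import random


def sgd_linear_fit(xs, ys, learn_rate=0.05, n_epochs=500, seed=0):
    """Fit y ~ w * x + b by stochastic gradient descent on squared error.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    w, b = 0.0, 0.0            # start from an arbitrary point
    indices = list(range(len(xs)))
    for _ in range(n_epochs):
        rng.shuffle(indices)   # visit the samples in a random order each epoch
        for i in indices:
            # Prediction error for ONE sample (this is the "stochastic" part).
            error = (w * xs[i] + b) - ys[i]
            # Gradient of 0.5 * error**2 with respect to w and b.
            w -= learn_rate * error * xs[i]
            b -= learn_rate * error
    return w, b


# Toy data generated from y = 2x + 1 (an assumption for the demo).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]
w, b = sgd_linear_fit(xs, ys)
print(round(w, 3), round(b, 3))
```

Each update uses the gradient from a single sample instead of the whole dataset, which is what makes the method inexact but cheap per step.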

Spelling Correction in Python with TextBlob

Introduction: Spelling mistakes are common, and most people are used to software indicating if a mistake was made. From autocorrect on our phones, to red underlining in text editors, spell checking is an essential feature for many different products. The first program to implement spell checking was written in 1971 for the DEC PDP-10. Called SPELL, it was capable of performing only simple comparisons of words and detecting one- or two-letter differences. As hardware and software advanced, so have […]

A Review of 2020 and Trends in 2021 – A Technical Overview of Machine Learning and Deep Learning!

Introduction: Data science is not a choice anymore. It is a necessity. 2020 is almost in the books now. What a crazy year from whichever standpoint you look at it. A pandemic raged around the world and yet it failed to dim the light on data science. The thirst to learn more continued unabated in our community and we saw some incredible developments and breakthroughs this year. From OpenAI’s mind-boggling GPT-3 framework to Facebook’s DETR model, this was a year […]

Top 15 Open-Source Datasets of 2020 that every Data Scientist Should add to their Portfolio!

Overview: Here is a list of Top 15 Datasets for 2020 that we feel every data scientist should practice on. The article contains 5 datasets each for machine learning, computer vision, and NLP. By no means is this list exhaustive. Feel free to add other datasets in the comments below. Introduction: “For the things we have to learn before we can do them, we learn by doing them” - Aristotle. I am sure everyone can attest to this saying. No […]

Simple NLP in Python with TextBlob: N-Grams Detection

Introduction: The constant growth of data on the Internet creates a demand for a tool that can process textual information faster, with no effort from the ordinary user. Moreover, it’s highly important that this text analysis tool can handle both low- and high-level NLP tasks, such as counting word frequencies, analyzing the sentiment of texts, or detecting patterns in relationships between words. TextBlob is a great lightweight library for a wide variety of […]
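To make the n-gram idea from this teaser concrete, here is a minimal sketch in plain Python. It deliberately does not use TextBlob, so it says nothing about that library's API; the function name `ngrams`, the whitespace tokenization, and the sample sentence are assumptions for illustration.

```python
from collections import Counter


def ngrams(text, n=2):
    """Return all runs of n consecutive words in text (hypothetical helper).

    Tokenization here is a naive lowercase whitespace split, which a real
    NLP library would handle more carefully.
    """
    words = text.lower().split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


sentence = "the quick brown fox jumps over the quick brown dog"

# Count how often each pair of adjacent words (bigram) occurs.
bigram_counts = Counter(ngrams(sentence, n=2))
print(bigram_counts.most_common(2))
```

Counting n-grams this way is one simple route to spotting recurring word patterns; a text shorter than n words yields an empty list.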

Top 5 Machine Learning GitHub Repositories & Reddit Discussions (October 2018)

Introduction: “Should I use GitHub for my projects?” – I’m often asked this question by aspiring data scientists. There’s only one answer to this – “Absolutely!”. GitHub is an invaluable platform for data scientists looking to stand out from the crowd. It’s an online resume for displaying your code to recruiters and other fellow professionals. The fact that GitHub hosts open-source projects from the top tech behemoths like Google, Facebook, IBM, NVIDIA, etc. is what adds to the gloss of […]

How to use a Machine Learning Model to Make Predictions on Streaming Data using PySpark

Overview: Streaming data is a thriving concept in the machine learning space. Learn how to use a machine learning model (such as logistic regression) to make predictions on streaming data using PySpark. We’ll cover the basics of Streaming Data and Spark Streaming, and then dive into the implementation part. Introduction: Picture this – every second, more than 8,500 Tweets are sent, more than 900 photos are uploaded on Instagram, more than 4,200 Skype calls are made, more than 78,000 […]

How to create a poet / writer using Deep Learning (Text Generation using Python)?

Introduction: From short stories to 50,000-word novels, machines are churning out words like never before. There are tons of examples available on the web where developers have used machine learning to write pieces of text, and the results range from the absurd to the delightfully funny. Thanks to major advancements in the field of Natural Language Processing (NLP), machines are able to understand the context and spin up tales all by themselves. […]

The Ultimate Learning Path to Become a Data Scientist and Master Machine Learning in 2019

The Learning Path to Become a Data Scientist in 2020 is now live! Head over here to start your data science journey. Introduction: Learning paths are immensely popular among our readers, and with good reason! Learning paths take away the pain and confusion from the learning process. For those who don’t know what a learning path is – we take on the pain of going through all the resources available on data science, machine learning and Artificial Intelligence, select the best […]
