The 15 Most Popular Data Science and Machine Learning Articles on Analytics Vidhya in 2018

Introduction What is the one thing you enjoy most about Analytics Vidhya? The most popular answer we receive (and have received since Kunal transformed his idea into reality) is the content we publish. Our content is the one thing we take pride in, and 2018 saw us take our high-quality content to a whole new level. We launched multiple top-quality and popular training courses, published knowledge-rich machine learning and deep learning articles and guides, and saw our blog visits cross 2.5 million […]

How Search Engines like Google Retrieve Results: Introduction to Information Extraction using Python and spaCy

Overview How do search engines like Google understand our queries and provide relevant results? Learn about the concept of information extraction. We will apply information extraction in Python using the popular spaCy library – so a lot of hands-on learning is ahead! Introduction I rely heavily on search engines (especially Google) in my daily role as a data scientist. My search results span a variety of queries – Python code questions, machine learning algorithms, comparison of Natural Language Processing […]

Steps for effective text data cleaning (with case study using Python)

Introduction The days when one would get data in tabulated spreadsheets are truly behind us. A moment of silence for the data residing in the spreadsheet pockets. Today, more than 80% of the data is unstructured – it is either present in data silos or scattered around the digital archives. Data is being produced as we speak – from every conversation we have on social media to every piece of content generated by news sources. In order to produce any […]

The Top GitHub Repositories & Reddit Threads Every Data Scientist should know (June 2018)

Introduction Half the year has flown by and that brings us to the June edition of our popular series – the top GitHub repositories and Reddit threads from last month. During the course of writing these articles, I have learned so much about machine learning from both open source code and invaluable discussions among the top data science brains in the world. What makes GitHub special is not just its code hosting and social collaboration features for data scientists. It […]

The 25 Best Data Science and Machine Learning GitHub Repositories from 2018

Introduction What’s the best platform for hosting your code and collaborating with team members, one that also acts as an online resume to showcase your coding skills? Ask any data scientist, and they’ll point you towards GitHub. It has been a truly revolutionary platform in recent years and has changed the landscape of how we host and even write code. But that’s not all. It acts as a learning tool as well. How, you ask? I’ll give you a hint – open […]

5 Amazing Deep Learning Frameworks Every Data Scientist Must Know! (with Illustrated Infographic)

Introduction I have been a programmer since before I can remember. I enjoy writing code from scratch – this helps me understand a topic (or technique) clearly. This approach is especially helpful when we’re first learning data science. Try to implement a neural network from scratch and you’ll understand a lot of interesting things. But do you think this is a good idea when building deep learning models on a real-world dataset? It’s definitely possible if you have days or […]

A Complete List of Important Natural Language Processing Frameworks you should Know (NLP Infographic)

Overview Here’s a list of the most important Natural Language Processing (NLP) frameworks from the last two years that you need to know. From Google AI’s Transformer to Facebook Research’s XLM/mBERT, we chart the rise of NLP through the lens of these seismic breakthroughs. Introduction Have you heard about the latest Natural Language Processing framework that was released recently? I don’t blame you if you’re still catching up with the superb StanfordNLP library or the PyTorch-Transformers framework! There has been […]

2019 In-Review and Trends for 2020 – A Technical Overview of Machine Learning and Deep Learning!

Overview A comprehensive look at the top machine learning highlights from 2019, including an exhaustive dive into NLP frameworks. Check out the machine learning trends in 2020 – and hear from top experts like Sudalai Rajkumar and Dat Tran! Introduction 2020 is almost upon us! It’s time to welcome the new year with a splash of machine learning sprinkled into our brand new resolutions. Machine learning will continue to be at the heart of what we do and how […]

Hugging Face Releases New NLP ‘Tokenizers’ Library Version (v0.8.0)

Hugging Face is at the forefront of a lot of updates in the NLP space. They have released one groundbreaking NLP library after another in the last few years. Honestly, I have learned and improved my own NLP skills a lot thanks to the work open-sourced by Hugging Face. And today, they’ve released another big update – a brand new version of their popular Tokenizers library. A Quick Introduction to Tokenization So, what is tokenization? Tokenization is a crucial […]

The Ultimate Learning Path to Becoming a Data Scientist in 2018

Introduction So you’ve taken the plunge. You want to become a data scientist. But where to begin? There are far too many resources out there. How do you decide the starting point? Did you miss out on topics you should have studied? Which are the best resources to learn? Don’t worry, we have you covered! Analytics Vidhya’s learning path for 2016 saw 250,000+ views. In 2017, we went even further and saw an incredible 500,000+ views! So this year, we […]
