DataHack Radio #12: Exploring the Nuts and Bolts of Natural Language Processing with Sebastian Ruder

https://soundcloud.com/datahack-radio/episode-12-sebastian-ruder

Introduction

There’s text everywhere around us, from digital sources like social media to physical objects like books and print media. The amount of text data generated every day is mind-boggling, and yet we’re not even close to harnessing the full power of natural language processing. I see a ton of aspiring data scientists interested in this field, but they often turn away, daunted by the challenges NLP presents. It’s such a niche line of work, and we […]


25 Open Datasets for Deep Learning Every Data Scientist Must Work With

Introduction

The key to getting better at deep learning (or most fields in life) is practice. Practice on a variety of problems, from image processing to speech recognition. Each of these problems has its own unique nuance and approach. But where can you get this data? A lot of the research papers you see these days use proprietary datasets that are usually not released to the general public. This becomes a problem if you want to learn and apply your […]
