It’s time to stop using Python 3.8

Upgrading to new software versions is work, and work that doesn’t benefit your software’s users. Users care about features and bug fixes, not how up-to-date you are. So it’s perhaps not surprising how many people still use Python 3.8. As of September 2024, about 14% of packages downloaded from PyPI were for Python 3.8. This includes automated downloads as part of CI runs, so it doesn’t mean 3.8 is used in 14% of applications, but that’s still 250 million packages […]

When should you upgrade to Python 3.13?

Python 3.13 will be out October 1, 2024, but should you switch to it immediately? And if you shouldn’t upgrade just yet, when should you? Immediately after the release, you probably won’t want to upgrade just yet. But from December 2024 onwards, upgrading is definitely worth trying, though it may not succeed. To understand why, we need to consider Python packaging and the software development process, and take a look at the history of past releases. The problems with a new […]

Comparing Scikit-Learn and TensorFlow for Machine Learning

Choosing a machine learning (ML) library to learn and use is an important step on the way to mastering this discipline of AI. Understanding the strengths and limitations of popular libraries like Scikit-learn and TensorFlow helps you choose the one that fits your needs. This article compares these two popular Python libraries for ML under eight criteria. Scope of Models and Techniques: Let’s start by highlighting […]

Filling the Gaps: A Comparative Guide to Imputation Techniques in Machine Learning

In our previous exploration of penalized regression models such as Lasso, Ridge, and ElasticNet, we demonstrated how effectively these models manage multicollinearity, allowing us to utilize a broader array of features to enhance model performance. Building on this foundation, we now address another crucial aspect of data preprocessing—handling missing values. Missing data can significantly compromise the accuracy and reliability of models if not appropriately managed. This post explores various imputation strategies to address missing data and embed them into our […]

Automating Data Cleaning Processes with Pandas

Few data science projects are exempt from the necessity of cleaning data. Data cleaning encompasses the initial steps of preparing data. Its purpose is to retain only the relevant and useful information in the data, whether for later analysis, for use as input to an AI or machine learning model, and so on. Unifying or converting data types, dealing with missing values, eliminating noisy values stemming from erroneous measurements, and […]

Quiz: Python Virtual Environments: A Primer

Interactive Quiz ⋅ 10 Questions ⋅ By Kate Finegan. So you’ve been primed on Python virtual environments! Test your understanding of the tutorial here. The quiz contains 10 questions and there is no time limit. You’ll get 1 point for each correct answer. At the end of the quiz, you’ll receive a total score. The maximum score is 100%. Good luck!

Quiz: Python 3.13: Free-Threading and a JIT Compiler

Interactive Quiz ⋅ 16 Questions ⋅ By Bartosz Zaczyński. In this quiz, you’ll test your understanding of the new features in Python 3.13. By working through this quiz, you’ll revisit how to compile a custom Python build, disable the Global Interpreter Lock (GIL), enable the Just-In-Time (JIT) compiler, determine the availability of new features at runtime, assess the performance improvements in Python 3.13, and make a C extension module targeting Python’s new ABI. The quiz contains 16 questions and there is […]

Research Focus: Week of September 9, 2024

Welcome to Research Focus, a series of blog posts that highlights notable publications, events, code/datasets, new hires and other milestones from across the research community at Microsoft. NEW RESEARCH: Can LLMs be Fooled? Investigating Vulnerabilities in LLMs. Large language models (LLMs) are the de facto standard for numerous machine learning tasks, ranging from text generation and […]

Tips for Using Machine Learning in Fraud Detection

The battle against fraud is more intense than it has ever been. As transactions become increasingly digital and complex, fraudsters are constantly devising new ways to exploit vulnerabilities in financial systems. This is where the power of machine learning comes into play. Machine learning offers a robust approach to identifying and even preventing fraudulent activities. By harnessing advanced algorithms and analytics, financial institutions can stay one […]

Scaling to Success: Implementing and Optimizing Penalized Models

This post will demonstrate the usage of Lasso, Ridge, and ElasticNet models using the Ames housing dataset. These models are particularly valuable when dealing with data that may suffer from multicollinearity. We leverage these advanced regression techniques to show how feature scaling and hyperparameter tuning can improve model performance. In this post, we’ll provide a step-by-step walkthrough on setting up preprocessing pipelines, implementing each model with scikit-learn, and fine-tuning them to achieve optimal results. This comprehensive approach not only aids […]
