Finding Value with Data: The Cohesive Force Behind Luxury Real Estate Decisions

The real estate industry is a vast network of stakeholders including agents, homeowners, investors, developers, municipal planners, and tech innovators, each bringing unique perspectives and objectives to the table. Within this intricate ecosystem, data emerges as the critical element that binds these diverse interests together, facilitating collaboration and innovation. PropTech, or Property Technology, illustrates this synergy by applying information technology to real estate, transforming how properties are researched, bought, sold, and managed through the power of data science. From its […]

Python Basics Exercises: Dictionaries

In plain English, a dictionary is a book containing the definitions of words. Each entry in a dictionary has two parts: the word being defined, and its definition. Python dictionaries, like lists and tuples, store a collection of objects. However, instead of storing objects in a sequence, dictionaries hold information in pairs of data called key-value pairs. That is, each object in a dictionary has two parts: a key and a value. Each key is assigned a single value, which […]
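
The key-value idea described above can be illustrated with a minimal sketch. The `glossary` name and its entries are made up for this illustration and are not taken from the exercises themselves.

```python
# A small dictionary mapping words (keys) to definitions (values).
# The entries are hypothetical examples.
glossary = {
    "tuple": "an immutable sequence of objects",
    "list": "a mutable sequence of objects",
}

# Look up a value by its key.
print(glossary["tuple"])

# Assigning to a new key adds a key-value pair.
glossary["dictionary"] = "a collection of key-value pairs"

# Assigning to an existing key replaces its value,
# so each key still maps to exactly one value.
glossary["list"] = "an ordered, mutable collection"

# Iterate over all key-value pairs.
for word, definition in glossary.items():
    print(f"{word}: {definition}")
```

Because each key holds a single value, the second assignment to `"list"` overwrites the earlier definition rather than storing both.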

Python News: What’s New From February 2024

As February takes a rare leap forward with an extra day this year, the Python community follows suit! Python versions 3.12 and 3.11 receive a security fix, and CPython source distributions now document the software supply chain to allow for more effective vulnerability detection. Another Rust-based tool makes its way into the Python ecosystem, promising exciting improvements to the existing package management system. Looking ahead, the reveal of the PyCon US 2024 schedule gives us a glimpse into the […]

Scaling early detection of esophageal cancer with AI

Microsoft Research and Cyted have collaborated to build novel AI models to scale the early detection of esophageal cancer. The AI-supported methods demonstrated the same diagnostic performance as the existing manual workflow, potentially reducing the pathologist’s workload by up to 63%. Esophageal cancer is the sixth most common cause of cancer deaths worldwide, in part because this disease […]

Harmonizing Data: A Symphony of Segmenting, Concatenating, Pivoting, and Merging

In the world of data science, where raw information swirls in a cacophony of numbers and variables, lies the art of harmonizing data. Like a maestro conducting a symphony, the skilled data scientist orchestrates the disparate elements of datasets, weaving them together into a harmonious composition of insights. Welcome to a journey where data transcends mere numbers and, instead, transforms into a vibrant melody of patterns and revelations. Let’s explore the intricacies of segmenting, concatenating, pivoting, and merging data using […]

Improving LLM understanding of structured data and exploring advanced prompting methods

This research paper was presented at the 17th ACM International Conference on Web Search and Data Mining (WSDM 2024), the premier conference on web-inspired research on search and data mining. In today’s data-driven landscape, tables are indispensable for organizing and presenting information, particularly text. They streamline repetitive content, enhance […]

Beyond SQL: Transforming Real Estate Data into Actionable Insights with Pandas

In the realm of data analysis, SQL stands as a mighty tool, renowned for its robust capabilities in managing and querying databases. However, Python’s pandas library brings SQL-like functionalities to the fingertips of analysts and data scientists, enabling sophisticated data manipulation and analysis without the need for a traditional SQL database. This exploration delves into applying SQL-like functions within Python to dissect and understand data, using the Ames Housing dataset as your canvas. The Ames Housing dataset, a comprehensive compilation […]

Skewness Be Gone: Transformative Tricks for Data Scientists

Data transformations enable data scientists to refine, normalize, and standardize raw data into a format ripe for analysis. These transformations are not merely procedural steps; they are essential in mitigating biases, handling skewed distributions, and enhancing the robustness of statistical models. This post will primarily focus on how to address skewed data. By focusing on the ‘SalePrice’ and ‘YearBuilt’ attributes from the Ames housing dataset, we will provide examples of positively and negatively skewed data and illustrate ways to normalize […]
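
As a rough illustration of handling positive skew, the sketch below measures skewness before and after a log transform. The log transform is one common choice and is an assumption here, not necessarily the method the post uses; the `prices` values are invented for the example and are not drawn from the Ames dataset, and `skewness` is a hypothetical helper.

```python
import math

def skewness(values):
    """Population (Fisher-Pearson) skewness: third central moment
    divided by the second central moment raised to the power 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical sale prices with a long right tail (positive skew).
prices = [95_000, 120_000, 135_000, 150_000, 160_000,
          175_000, 210_000, 320_000, 755_000]

# A log transform compresses large values more than small ones,
# which pulls in the right tail.
log_prices = [math.log(p) for p in prices]

print(f"skewness before: {skewness(prices):.2f}")
print(f"skewness after log transform: {skewness(log_prices):.2f}")
```

For negatively skewed data, a log transform does not help directly; other transforms (for example, squaring the values or reflecting them before taking the log) are typically tried instead.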

Highlights from Machine Translation and Multilinguality in February 2024

With a new month, here are a few papers that I noticed on arXiv in February. Linear-time Minimum Bayes Risk Decoding with Reference Aggregation: A preprint from the University of Zurich proposes a linear time version of Minimum Bayes Risk (MBR) decoding in machine translation. This decoding algorithm does not aim to generate the most probable sequence given the model but the most typical one. This is typically done by sampling dozens of candidate output sentences, from which we select […]
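
To make the idea of picking the "most typical" candidate concrete, here is a minimal sketch of standard sampling-based MBR selection, where each sample is scored by its average utility against the other samples. This is the usual quadratic-time formulation, not the linear-time reference aggregation variant from the preprint. The `token_overlap` utility is a toy stand-in for a real translation metric, and all names and sample sentences are hypothetical.

```python
def token_overlap(hypothesis, reference):
    """Toy utility: Jaccard overlap of word sets.
    A real system would use a translation quality metric instead."""
    hyp, ref = set(hypothesis.split()), set(reference.split())
    if not hyp and not ref:
        return 1.0
    return len(hyp & ref) / len(hyp | ref)

def mbr_select(candidates, utility=token_overlap):
    """Return the candidate with the highest average utility
    when every other candidate is treated as a pseudo-reference."""
    if len(candidates) < 2:
        return candidates[0]

    def expected_utility(i):
        others = [c for j, c in enumerate(candidates) if j != i]
        return sum(utility(candidates[i], o) for o in others) / len(others)

    best = max(range(len(candidates)), key=expected_utility)
    return candidates[best]

# Hypothetical sampled translations of the same source sentence.
samples = [
    "the cat sat on the mat",
    "the cat sat on a mat",
    "a cat is sitting on the mat",
    "the dog barked loudly",
]

choice = mbr_select(samples)
print(choice)
```

Scoring every candidate against every other one costs a number of utility calls quadratic in the number of samples, which is the cost a linear-time variant would aim to avoid.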

Research Forum Episode 2: Transforming health care and the natural sciences, AI and society, and the evolution of foundational AI technologies

Research advances are driving real-world impact faster than ever. Recent developments in AI are reshaping the way people live, work, and think. In the latest episode of Microsoft Research Forum, we explore how AI is transforming health care and the natural sciences, the intersection of AI and society, and the continuing evolution of foundational AI technologies. Below is a brief recap of the event, including select quotes from the presentations. Full replays of each session and […]
