Automating Outlier Detection via Meta-Learning

Given an unsupervised outlier detection (OD) task on a new dataset, how can we automatically select a good outlier detection method and its hyperparameter(s) (collectively called a model)? Thus far, model selection for OD has been a “black art”; as any model evaluation is infeasible due to the lack of (i) hold-out data with labels, and (ii) a universal objective function… In this work, we develop the first principled data-driven approach to model selection for OD, called MetaOD, based on […]

Read more

MStream: Fast Streaming Multi-Aspect Group Anomaly Detection

Given a stream of entries in a multi-aspect data setting i.e., entries having multiple dimensions, how can we detect anomalous activities? For example, in the intrusion detection setting, existing work seeks to detect anomalous events or edges in dynamic graph streams, but this does not allow us to take into account additional attributes of each entry… Our work aims to define a streaming multi-aspect data anomaly detection framework, termed MStream, which can detect unusual group anomalies as they occur, in […]

Read more

Real-Time Streaming Anomaly Detection in Dynamic Graphs

Given a stream of graph edges from a dynamic graph, how can we assign anomaly scores to edges in an online manner, for the purpose of detecting unusual behavior, using constant time and memory? Existing approaches aim to detect individually surprising edges… In this work, we propose MIDAS, which focuses on detecting microcluster anomalies, or suddenly arriving groups of suspiciously similar edges, such as lockstep behavior, including denial of service attacks in network traffic data. We further propose MIDAS-F, to […]

Read more

Meta-AAD: Active Anomaly Detection with Deep Reinforcement Learning

High false-positive rate is a long-standing challenge for anomaly detection algorithms, especially in high-stake applications. To identify the true anomalies, in practice, analysts or domain experts will be employed to investigate the top instances one by one in a ranked list of anomalies identified by an anomaly detection system… This verification procedure generates informative labels that can be leveraged to re-rank the anomalies so as to help the analyst to discover more true anomalies given a time budget. Some re-ranking […]

Read more