Unsupervised Concept Drift Detection from Deep Learning Representations in Real-time
Sep 1, 2024ยท,,,ยท
1 min read
Salvatore Greco
Bartolomeo Vacchetti
Daniele Apiletti
Tania Cerquitelli
Abstract
Concept Drift is a phenomenon in which the underlying data distribution and statistical properties of a target domain change over time, leading to a degradation of the model’s performance. Consequently, models deployed in production require continuous monitoring through drift detection techniques. Most drift detection methods to date are supervised, i.e., based on ground-truth labels. However, true labels are usually not available in many real-world scenarios. Although recent efforts have been made to develop unsupervised methods, they often lack the required accuracy, have a complexity that makes real-time implementation in production environments difficult, or are unable to effectively characterize drift. To address these challenges, we propose DriftLens, an unsupervised real-time concept drift detection framework. It works on unstructured data by exploiting the distribution distances of deep learning representations. DriftLens can also provide drift characterization by analyzing each label separately. A comprehensive experimental evaluation is presented with multiple deep learning classifiers for text, image, and speech. Results show that (i) DriftLens performs better than previous methods in detecting drift in 11/13 use cases; (ii) it runs at least 5 times faster; (iii) its detected drift value is very coherent with the amount of drift (correlation ); (iv) it is robust to parameter changes.
Type
This work is driven by the results in my previous paper on LLMs.
Create your slides in Markdown - click the Slides button to check out the example.
Add the publication’s full text or supplementary notes here. You can use rich formatting such as including code, math, and images.