Concept Drift

In this research, I proposed a framework for real-time drift detection on large volumes of data. The framework currently works for NLP classifiers, but I am working to extend it to other data types (e.g., images, audio).

Article

Greco, S., Cerquitelli, T. “Drift Lens: Real-time unsupervised Concept Drift detection by evaluating per-label embedding distributions”. 2021 International Conference on Data Mining Workshops (ICDMW), Auckland, New Zealand, 2021, pp. 341-349, doi: 10.1109/ICDMW53433.2021.00049

Despite the significant improvements made by deep learning models, their adoption in real-world dynamic applications is still limited. Concept drift is among the open issues preventing the widespread exploitation of deep learning models in real-life settings. The dynamic world changes very quickly, and the collected data drifts accordingly. Prediction models, usually trained on static historical data, should be promptly re-trained in case of new real-time drifted data distributions. Although some drift detection methodologies have been proposed over the years, different issues are still open since state-of-the-art solutions show limited effectiveness and efficiency. This paper proposes Drift Lens, a novel real-time unsupervised per-label drift detection methodology based on embedding distribution distances in deep learning models. The preliminary experiments performed on a transformer-based model fine-tuned for topic text classification show promising results in drift detection accuracy, drift characterization, and efficient execution time to support real-time concept drift detection.
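
To make the idea of comparing per-label embedding distributions more concrete, here is a minimal sketch. It is not the Drift Lens implementation: the abstract only says "embedding distribution distances", so the choice of the Fréchet distance between Gaussian fits, the function names `frechet_distance` and `per_label_drift`, the integer class ids, and the synthetic embeddings are all assumptions made for illustration. Because the setting is unsupervised, the labels of the new window are assumed to be the model's predicted labels.

```python
import numpy as np

def frechet_distance(a, b):
    """Frechet distance between Gaussians fitted to two embedding sets.

    Assumption: this distance is one possible choice, not necessarily
    the one used in the paper. Rows are samples, columns are dimensions.
    """
    mu_a, mu_b = a.mean(axis=0), b.mean(axis=0)
    cov_a = np.cov(a, rowvar=False)
    cov_b = np.cov(b, rowvar=False)
    # Tr((cov_a cov_b)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_a @ cov_b, which are real and non-negative for
    # covariance matrices (clipping guards against round-off).
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    mean_term = np.sum((mu_a - mu_b) ** 2)
    return float(mean_term + np.trace(cov_a) + np.trace(cov_b) - 2.0 * tr_sqrt)

def per_label_drift(base_emb, base_labels, win_emb, win_labels):
    """Distance between baseline and window embeddings, one score per label.

    Hypothetical helper: win_labels are assumed to be predicted labels,
    and labels are assumed to be integer class ids.
    """
    scores = {}
    for label in np.unique(base_labels):
        ref = base_emb[base_labels == label]
        new = win_emb[win_labels == label]
        if len(ref) < 2 or len(new) < 2:
            continue  # too few samples to estimate a covariance
        scores[int(label)] = frechet_distance(ref, new)
    return scores

# Synthetic demo: two labels, drift injected only into label 1.
rng = np.random.default_rng(0)
dim = 4
base_emb = rng.normal(size=(400, dim))
base_labels = np.repeat([0, 1], 200)
win_emb = rng.normal(size=(400, dim))
win_labels = np.repeat([0, 1], 200)
win_emb[win_labels == 1] += 3.0  # shift the embeddings of label 1

scores = per_label_drift(base_emb, base_labels, win_emb, win_labels)
print(scores)
```

In this toy setup the score for label 1 should be much larger than the score for label 0, which is the kind of per-label signal that lets drift be characterized and not only detected. In practice, a threshold for flagging drift would have to be calibrated on windows known to be drift-free; how that is done is outside this sketch.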

Article