Adaptive AutoML Pipelines for Large-Scale Data Streams Under Concept Drift

International Journal of Development Research

Volume: 15
Article ID: 29532
8 pages
Research Article

Akash Vijayrao Chaudhari and Pallavi Ashokrao Charate

Abstract: 

Data stream mining in non-stationary environments presents the twin challenges of automated model selection and concept drift adaptation. This paper proposes a framework for Adaptive AutoML Pipelines capable of continuous learning from large-scale streaming data under evolving distributions. We integrate Automated Machine Learning (AutoML) with online learning to dynamically optimize full model pipelines (including preprocessing, feature selection, and classification) as new data arrive and concepts change. A drift detection mechanism triggers rapid pipeline reconfiguration or incremental update when the statistical properties of the target variable shift over time. Experiments on real and synthetic data streams with sudden, gradual, and recurring drift demonstrate that the proposed adaptive pipelines significantly outperform static AutoML solutions and classical stream-learning baselines in both accuracy and time to recovery after drift. We present a detailed methodology, including a high-level pipeline architecture diagram and concept drift handling strategies, and we report results with tables and figures for multiple benchmark streams. The findings underscore the importance of continuous pipeline (re)optimization for maintaining robust performance in dynamic environments. Finally, we discuss scalability considerations, such as asynchronous model search and distributed deployment, that enable our approach to handle high-velocity data streams in real-world applications like fraud detection and IoT sensor networks.
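
The drift-triggered reconfiguration loop described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not name a drift detector or a search procedure, so the windowed error-rate detector (`WindowDriftDetector`), the two toy candidate "pipelines" in `CANDIDATES`, the helper names, and all parameter values (`window`, `delta`, buffer size, drift position) are assumptions chosen only to show the control flow of predict, test, detect, and re-select on a synthetic stream with one sudden drift.

```python
import random
from collections import deque


class WindowDriftDetector:
    """Assumed stand-in detector: flags drift when the error rate over the
    most recent window exceeds the first full window's rate by more than delta."""

    def __init__(self, window=50, delta=0.3):
        self.window = window
        self.delta = delta
        self.reset()

    def reset(self):
        self.reference = None                  # error rate of the first full window
        self.recent = deque(maxlen=self.window)

    def update(self, error):
        self.recent.append(error)
        if len(self.recent) < self.window:
            return False                       # not enough evidence yet
        rate = sum(self.recent) / self.window
        if self.reference is None:
            self.reference = rate              # baseline for the current concept
            return False
        return rate - self.reference > self.delta


# Hypothetical candidate "pipelines": each maps a feature x in [0, 1) to a label.
# A real system would search over preprocessing, feature selection, and models.
CANDIDATES = {
    "above_half": lambda x: int(x > 0.5),
    "below_half": lambda x: int(x < 0.5),
}


def select_best(buffer):
    """Return the candidate with the fewest errors on recent (x, y) pairs."""
    def errors(name):
        return sum(CANDIDATES[name](x) != y for x, y in buffer)
    return min(CANDIDATES, key=errors)


def run_stream(n=1000, drift_at=500, seed=0):
    """Prequential (test-then-train) loop over a synthetic stream whose
    labeling rule flips at position drift_at (a sudden drift)."""
    rng = random.Random(seed)
    detector = WindowDriftDetector()
    recent = deque(maxlen=10)                  # short buffer of latest labeled examples
    active = "above_half"
    detections, correct = [], 0
    for t in range(n):
        x = rng.random()
        y = int(x > 0.5) if t < drift_at else int(x < 0.5)
        error = int(CANDIDATES[active](x) != y)  # predict first, then see the label
        correct += 1 - error
        recent.append((x, y))
        if detector.update(error):
            detections.append(t)
            active = select_best(recent)       # reconfigure on drift
            detector.reset()                   # start a new baseline
    return detections, active, correct / n


if __name__ == "__main__":
    detections, active, accuracy = run_stream()
    print("drift detected at:", detections)
    print("active pipeline:", active)
    print("prequential accuracy:", accuracy)
```

The re-selection buffer is deliberately shorter than the detection window so that, by the time drift is flagged, it holds mostly post-drift examples. A production system would replace `select_best` with an actual pipeline search and would likely use an established detector rather than this fixed-threshold heuristic.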

DOI: https://doi.org/10.37118/ijdr.29532.04.2025