Imbalanced datasets often bias downstream models toward majority classes, a critical challenge in deep learning, where large amounts of data are pivotal for optimal performance. Traditional solutions, such as classical data augmentation, often struggle to capture nuanced data characteristics and lack adaptability. The emergence of deep learning techniques such as Autoencoders (AEs), Generative Adversarial Networks (GANs), Diffusion Models (DMs), and Large Language Models (LLMs) opens promising avenues for addressing class imbalance through synthetic data generation. This paper presents a comprehensive survey of generative AI techniques for mitigating class imbalance in tabular datasets.
Omar A. Mures, Javier Taibo, Emilio J. Padrón, Jose A. Iglesias-Guitian