Imbalanced Data: Analyze First, Sample Later

M. Masum, PhD
6 min readOct 18, 2023
Class distribution of an imbalanced dataset (Image by author)

Classification is a common task in machine learning, where the goal is to classify or label given data or observations. Specifically, in the field of machine learning, binary classification is the most common type. In a binary classification task, many real-world datasets encounter the challenge of imbalanced class distribution, if not all.

--

--