Classification
Multi-label Classification Stratified Split
Iterative Stratification: Easily stratify the train-test split for multi-label classification problems
In machine learning classification problems, when your input data has imbalanced classes, it's necessary to stratify the train-test split so that the minority class keeps the same proportion in both the train and test sets.
Unfortunately, sklearn's train_test_split only supports stratified splits for single-label classification and does not support multiple labels.
Enter iterative-stratification:
Stratify the train test split across multiple labels concurrently
With K-Fold cross-validation!
Compatible with sklearn
pip install iterative-stratification
GitHub: https://github.com/trent-b/iterative-stratification
Efficient Metrics Calculation
Typically, when you want to calculate multiple metrics for a machine learning model, you'd import and run them one by one.
Instead, you can refactor your code to calculate multiple metrics with sklearn's get_scorer() function, as shown in the image.
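In case the image isn't available, here is a rough sketch of the idea. The dataset, the model, and the particular metric names in `metrics_list` are assumptions chosen for illustration; `get_scorer()` itself is part of `sklearn.metrics` and returns a callable that takes `(estimator, X, y)`.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import get_scorer
from sklearn.model_selection import train_test_split

# Toy binary classification problem (assumption, for illustration only).
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Metric names are looked up by string, so no per-metric imports are needed.
metrics_list = ["accuracy", "precision", "recall", "f1", "roc_auc"]

# Each scorer is called as scorer(estimator, X, y) and returns a float.
scores = {
    name: get_scorer(name)(model, X_test, y_test) for name in metrics_list
}

for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Note that sklearn scorers follow a "greater is better" convention, so error metrics are exposed under names such as `neg_mean_squared_error`.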
You could also move metrics_list to a config.yml so that you can change which metrics are calculated without changing any code!
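A possible way to wire that up is sketched below, assuming PyYAML is installed and that the config file stores the names under a `metrics_list` key (the key name and the file contents are assumptions). The YAML is parsed from a string here to keep the example self-contained; in practice you would read it from `config.yml` with `open()`.

```python
import yaml  # PyYAML (assumed to be installed)
from sklearn.metrics import get_scorer

# Stand-in for the contents of config.yml (hypothetical layout).
CONFIG_TEXT = """
metrics_list:
  - accuracy
  - f1
"""

config = yaml.safe_load(CONFIG_TEXT)
metrics_list = config["metrics_list"]

# Build the scorers from the configured names; editing the YAML
# changes which metrics are computed without touching this code.
scorers = {name: get_scorer(name) for name in metrics_list}
print(list(scorers))
```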