Classification

Multi-label Classification Stratified Split

Iterative Stratification: easily stratify the train-test split for multi-label classification problems

  • In machine learning classification problems with imbalanced classes, you need to stratify the train-test split so that both the train and test splits keep the same proportion of the minority class.

  • Unfortunately, sklearn's train_test_split only allows stratified splits for single-label classification and does not support multiple labels.

Enter iterative-stratification:

  • Stratifies the train-test split across multiple labels concurrently

  • Supports K-Fold cross-validation

  • Compatible with sklearn

pip install iterative-stratification

🌟 GitHub: https://github.com/trent-b/iterative-stratification

Efficient Metrics Calculation

Typically, when you want to calculate multiple metrics for a machine learning model, you'd import and run them one by one.

Instead, you can refactor your code to calculate multiple metrics with sklearn's get_scorer() function, as shown in the image.
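A minimal sketch of this pattern is given here, assuming a binary classification task. The synthetic dataset, the LogisticRegression model, and the particular metric names in metrics_list are illustrative choices, not taken from the original example.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import get_scorer
from sklearn.model_selection import train_test_split

# Names of built-in sklearn scorers to compute (illustrative selection).
metrics_list = ["accuracy", "precision", "recall", "f1", "roc_auc"]

# Synthetic binary classification data, only for demonstration.
X, y = make_classification(n_samples=200, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# get_scorer(name) returns a callable with signature scorer(estimator, X, y).
scores = {name: get_scorer(name)(model, X_test, y_test) for name in metrics_list}
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Because each scorer takes the fitted estimator rather than precomputed predictions, it decides for itself whether to call predict, predict_proba, or decision_function, which is why a single loop can cover metrics such as f1 and roc_auc.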

You could also move metrics_list to a config.yml so that you can change which metrics are calculated without changing any code!
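One way this could look is sketched here, assuming PyYAML is installed. The config contents are inlined as a string so the example is self-contained, and the metrics_list key name is a hypothetical layout for config.yml.

```python
import yaml  # PyYAML, assumed to be installed
from sklearn.metrics import get_scorer

# Stand-in for the contents of config.yml (hypothetical key name and metrics).
config_text = """
metrics_list:
  - accuracy
  - f1
  - roc_auc
"""

# In a real project this would be: yaml.safe_load(open("config.yml"))
config = yaml.safe_load(config_text)
metrics_list = config["metrics_list"]

# Build one scorer per configured metric name; get_scorer raises a
# ValueError for names sklearn does not recognise.
scorers = {name: get_scorer(name) for name in metrics_list}
print(list(scorers))
```

Each entry in scorers can then be called as scorer(model, X_test, y_test) on any fitted sklearn estimator, so adding or removing a metric only means editing the YAML file.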