As Big Data is applied operationally throughout plants, one challenge that manufacturers face is imbalanced classes. In the discipline of Automated Machine Learning for Asset Maintenance, imbalanced data must be addressed explicitly.
When modelling for Predictive Maintenance, one needs to address the classic problem of modelling with imbalanced data when only a fraction of the data constitutes failure.
This kind of data poses several issues. Normal operations data (i.e. non-failure data), which constitutes most of the data, tends to look alike, whereas failure data can differ widely from one failure to the next. Standard methods for feature selection, feature extraction and feature construction do not work well for imbalanced data.
Moreover, the metrics used to evaluate the model can be misleading. For example, in a classification model for a dataset with more than 99% non-failure data and less than 1% failure data, a near perfect accuracy could be achieved simply by assigning all instances in the data to the majority (non-failure) class.
This model, however, is not useful, as it has never learned to predict a failure. More appropriate metrics for evaluating these types of models are precision, recall, AUC etc. Instead of conventional accuracy, the accuracy per class should be computed and the mean of these accuracies should be reported.
For details on how to compute these evaluation metrics see below.
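As a minimal sketch of these metrics, the Python below computes accuracy, precision, recall and the mean per-class accuracy from a list of true and predicted labels. The data is made up for illustration (995 normal readings, 5 failures), and the function names are hypothetical rather than part of any product or library.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn), treating `positive` as the failure class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def evaluate(y_true, y_pred, positive=1):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # accuracy on the failure class
    specificity = tn / (tn + fp) if (tn + fp) else 0.0   # accuracy on the non-failure class
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "mean_per_class_accuracy": (recall + specificity) / 2,
    }


# Illustrative data only: 995 normal readings (0) and 5 failures (1).
y_true = [0] * 995 + [1] * 5
always_normal = [0] * 1000  # a "model" that never predicts a failure
scores = evaluate(y_true, always_normal)
print(scores)
```

On this data the always-normal model reaches 99.5% accuracy, yet its recall is 0 and its mean per-class accuracy is only 0.5, which is what a coin flip would achieve.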
Deep Learning (DL) methods, considered the cutting edge of Machine Learning, show very good performance when trained on large, balanced data sets. However, predictive asset maintenance problems involve imbalanced data because the failure class has only a small number of training samples.
This is because failures tend to be rare relative to the regular operations of machines. The performance of DL methods, as well as more traditional classifiers, drops significantly in such settings. Most of the existing solutions for imbalanced problems focus on customizing the data for training.
With such problems, it is easy to collect background data, while data representing the target class is rare or hard (expensive) to obtain. Most existing powerful classifiers (e.g., SVM, Neural Networks, including deep ones) assume balanced training sets, and when trained on imbalanced sets, they show degraded classification performance. A standard classifier, trained on an imbalanced dataset, is significantly skewed towards the majority class.
Popular approaches for handling imbalanced training sets
Data Sampling methods manipulate the training set to provide a standard classifier with a balanced training set. A popular approach for balancing the datasets is random oversampling of the minority class to increase its size, or random subsampling of the majority class to decrease its size.
This method is simple and easy to understand and shows good results in many cases. However, it may introduce problems of its own. Subsampling can discard informative (“good”) majority samples and so lose valuable information, while oversampling repeats the same few minority samples, including noisy (“bad”) ones, which encourages overfitting and poor generalization.
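Random oversampling and subsampling can be sketched in a few lines of Python. This is an illustration under simple assumptions (each class is a plain list of samples, and the function names and data are hypothetical), not a production implementation.

```python
import random


def random_oversample(majority, minority, seed=0):
    """Duplicate randomly chosen minority samples until both classes are the same size."""
    rng = random.Random(seed)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return majority, minority + extra


def random_undersample(majority, minority, seed=0):
    """Keep a random subset of the majority class, the same size as the minority class."""
    rng = random.Random(seed)
    return rng.sample(majority, len(minority)), minority


# Illustrative data only: 990 normal readings and 10 failures.
majority = [("normal", i) for i in range(990)]
minority = [("failure", i) for i in range(10)]

over_maj, over_min = random_oversample(majority, minority)
under_maj, under_min = random_undersample(majority, minority)
print(len(over_maj), len(over_min))    # both classes now have 990 samples
print(len(under_maj), len(under_min))  # both classes now have 10 samples
```

Note that the oversampled minority class still contains only 10 distinct samples, and the subsampled majority class has thrown away 980 readings.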
More recent methods achieve better results by sampling in a more informed way. For example, hard-negative mining retrains the model on the samples that were misclassified by a model previously trained on a subset of the training data. Cluster-based oversampling (CBO) applies k-means clustering to each class and then resamples from each cluster to achieve balanced sets. The informed sampling methods reduce poor generalization and overfitting, but in some cases may also introduce artifacts.
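The idea behind cluster-based oversampling can be sketched as follows. This is a simplified illustration, not the published CBO algorithm: it clusters only one class (here, the failures), uses a small hand-written k-means with a deterministic farthest-first start, and tops every cluster up to the same size. All names and data points are hypothetical.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means; returns the non-empty clusters as lists of points."""
    # Farthest-first initialisation: deterministic, and spreads the centroids out.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        )
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Recompute each centroid; keep the old one if a cluster ended up empty.
        centroids = [
            tuple(sum(dim) / len(c) for dim in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return [c for c in clusters if c]


def cluster_based_oversample(points, target_size, k=2, seed=0):
    """Keep all original points and top each cluster up to an equal share of target_size."""
    rng = random.Random(seed)
    clusters = kmeans(points, k)
    per_cluster = target_size // len(clusters)
    resampled = []
    for cluster in clusters:
        resampled.extend(cluster)
        extra = max(0, per_cluster - len(cluster))
        resampled.extend(rng.choice(cluster) for _ in range(extra))
    return resampled


# Illustrative failure data with two failure modes: a common one and a rare one.
common_mode = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.3, 0.3),
               (0.1, 0.1), (0.2, 0.3), (0.0, 0.2), (0.3, 0.0)]
rare_mode = [(10.0, 10.0), (10.2, 9.9)]
balanced_failures = cluster_based_oversample(common_mode + rare_mode, target_size=100)
print(len(balanced_failures))
```

Plain random oversampling would reproduce the 8-to-2 ratio between the two failure modes; resampling per cluster gives the rare mode as much weight as the common one.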
Another approach is synthetic sampling, where synthetic samples augment the dataset. The augmentation process can be “domain-specific” augmentation or “generic” augmentation. Domain-specific augmentation has been widely used in training Neural Networks for image understanding tasks. For example, images can be flipped, slightly rotated, re-sized, cropped and manipulated in other ways, while all the derived instances still represent the same class.
While domain-specific augmentation is tailored for the domain, generic methods operate directly on the feature space and are thus domain-agnostic. An example of such a method is the synthetic minority oversampling technique (SMOTE). SMOTE performs well on various problems but may increase instance overlapping.
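The core interpolation step of SMOTE can be sketched as below. This is a simplified illustration of the idea (pick a minority sample, pick one of its k nearest minority neighbours, and place a new point somewhere on the line between them), not a reference implementation. The function name, parameters and data are assumptions made for the example.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def smote(minority, n_synthetic, k=3, seed=0):
    """Create synthetic points between minority samples and their nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # The k nearest other minority samples to the chosen base point.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: squared_distance(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Illustrative failure samples in a two-dimensional feature space.
failures = [(1.0, 2.0), (1.5, 2.5), (2.0, 2.0), (1.2, 3.0)]
new_points = smote(failures, n_synthetic=6)
print(new_points)
```

Because each synthetic point lies between two real minority samples, it can land in a region occupied by the majority class, which is the instance overlapping mentioned above.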
With SKF Enlight AI’s Automated Machine Learning for asset maintenance, our Advanced AI algorithms use sophisticated techniques to deal with imbalanced classes and give the customer accurate failure predictions. When the SKF Enlight AI team built its Deep Learning models, a clear understanding of the business requirements and the tolerance to false negatives and false positives was necessary.
For some businesses, failure to predict a malfunction can be dangerous (e.g. steam turbine failure) or exorbitantly expensive (e.g. production shutdown in a factory), in which case we must tune our machine learning model for high recall.
Factories would prefer that the model errs on the side of caution, as it is more cost-effective to do a maintenance checkup in response to a false prediction than to suffer a full-blown shutdown. On the other hand, falsely predicting a failure when there is none can be a problem for other businesses, because time and resources are lost addressing a failure that was never going to happen. In that case the model should be tuned for high precision.
In the language of statistics, this is known as “misclassification cost”. The business can evaluate the actual dollar amount associated with a false prediction by accounting for the repair costs, for parts as well as labor, and by quantifying the effect on its brand and reputation, customer satisfaction etc.
This should be the driving factor for tuning the model for cost-sensitive learning.
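One simple way to let misclassification cost drive the tuning is to choose the decision threshold that minimizes total cost on validation data. The sketch below assumes the model outputs a failure score between 0 and 1. The dollar figures, scores and function names are made up for illustration and do not describe any particular product.

```python
def expected_cost(y_true, scores, threshold, cost_fn, cost_fp):
    """Total cost when a failure is flagged whenever score >= threshold."""
    cost = 0.0
    for truth, score in zip(y_true, scores):
        predicted = 1 if score >= threshold else 0
        if truth == 1 and predicted == 0:
            cost += cost_fn  # missed failure (false negative)
        elif truth == 0 and predicted == 1:
            cost += cost_fp  # unnecessary maintenance check (false positive)
    return cost


def best_threshold(y_true, scores, cost_fn, cost_fp):
    """Try every observed score (plus 'never flag') and keep the cheapest threshold."""
    candidates = sorted(set(scores)) + [float("inf")]
    return min(candidates, key=lambda t: expected_cost(y_true, scores, t, cost_fn, cost_fp))


# Illustrative validation data: 1 = failure, 0 = normal operation.
y_true = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.05, 0.10, 0.15, 0.20, 0.30, 0.40, 0.45, 0.60, 0.70, 0.90]

# Hypothetical costs: a missed failure is far more expensive than a false alarm.
recall_first = best_threshold(y_true, scores, cost_fn=50_000, cost_fp=500)
# Hypothetical costs: false alarms are the expensive mistake.
precision_first = best_threshold(y_true, scores, cost_fn=500, cost_fp=5_000)
print(recall_first, precision_first)
```

With an expensive missed failure the chosen threshold drops low enough to catch every failure and accept a false alarm. With expensive false alarms it rises until no normal reading is flagged, at the price of missing one failure.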
Big Data scientists struggle with issues of imbalanced classes. SKF Enlight AI’s flexible and dynamic algorithms are based on artificial intelligence and have been designed to adjust for imbalanced classes.