Tsirizo Rabenoro's PhD Defense

Health Monitoring

My PhD student Tsirizo Rabenoro (jointly advised by Prof. Marie Cottrell) worked during his thesis on aircraft engine health monitoring. This Cifre thesis took place at Snecma, one of the two world leaders in aircraft engines. The world of aircraft engines is representative of the constraints we constantly face when doing data science in industrial contexts. The ultimate goal is to build explainable and auditable classifiers. In the health monitoring context, this means detecting early signs of possible failures in a way that can be explained to the field experts, so that they can accept (or not!) the automated decision.

Tsirizo's strategy was to use a simple classifier (the naive Bayes classifier) and to combine it with low-level detectors defined by field experts. The key idea is that experts are able to describe, in a rough way, early signs of failures. However, those descriptions need to be tuned to the actual data (for instance, some thresholds have to be set). In addition, the predictive power of each individual detector is generally rather low. Therefore we need to use a large number of those low-level detectors and to combine them using a classifier. A direct combination of all those detectors works only with machine learning methods that can handle high-dimensional noisy inputs, e.g., random forests. Unfortunately, such methods belong to the black box class of models and are thus unacceptable for field experts. Tsirizo showed that, using advanced feature selection methods, it was possible to significantly reduce the number of low-level tests needed to achieve acceptable performance while using an easy-to-understand white box classifier, the naive Bayes classifier. This work is covered by the following publications:

Search Strategies for Binary Feature Selection for a Naive Bayes Classifier (2015) Tsirizo Rabenoro, Jérôme Lacaille, Marie Cottrell and Fabrice Rossi. In Proceedings of the 23rd European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN 2015), pages 291-296, Bruges (Belgium), April 2015.
publisher version arXiv:1506.04177 hal preprint (pdf) bib
Interpretable Aircraft Engine Diagnostic via Expert Indicator Aggregation (2014) Tsirizo Rabenoro, Jérôme Lacaille, Marie Cottrell and Fabrice Rossi. Transactions on Machine Learning and Data Mining, volume 7, number 2, pages 41-63, October 2014.
publisher version arXiv:1503.05526 hal preprint (pdf) bib
A Methodology for the Diagnostic of Aircraft Engine Based on Indicators Aggregation (2014) Tsirizo Rabenoro, Jérôme Lacaille, Marie Cottrell and Fabrice Rossi. In Advances in Data Mining. Applications and Theoretical Aspects (Proceedings of the 14th Industrial Conference, ICDM 2014), edited by Petra Perner, volume 8557, pages 144-158, St. Petersburg (Russia), July 2014.
doi:10.1007/978-3-319-08976-8_11 arXiv:1408.6214 hal preprint (pdf) bib
Anomaly detection based on indicators aggregation (2014) Tsirizo Rabenoro, Jérôme Lacaille, Marie Cottrell and Fabrice Rossi. In Proceedings of the International Joint Conference on Neural Networks (IJCNN 2014), pages 2548-2555, Beijing (China), July 2014.
doi:10.1109/IJCNN.2014.6889841 arXiv:1409.4747 hal preprint (pdf) bib
Anomaly Detection Based on Aggregation of Indicators (2014) Tsirizo Rabenoro, Jérôme Lacaille, Marie Cottrell and Fabrice Rossi. In Proceedings of the 23rd Annual Belgian-Dutch Conference on Machine Learning (Benelearn 2014), edited by Benoît Frénay, Michel Verleysen and Pierre Dupont, pages 64-71, Brussels (Belgium), June 2014.
publisher version arXiv:1407.0880 hal preprint (pdf) bib
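
To make the idea of combining many weak binary detectors with a naive Bayes classifier more concrete, here is a minimal sketch in Python. It is not the method from the publications above: the data are synthetic, the detectors are assumed to be already thresholded into 0/1 outputs, and the search is a plain greedy forward selection, which stands in for the more advanced search strategies studied in the thesis. All names (make_data, fit_nb, forward_selection, INFORMATIVE) are hypothetical.

```python
import math
import random

# Assumption: detectors 0..4 carry some signal, the 25 others are pure noise.
INFORMATIVE = {0, 1, 2, 3, 4}

def make_data(n, n_detectors, rng):
    """Synthetic engines: label 1 = anomaly, each detector is a noisy binary test."""
    X, y = [], []
    for _ in range(n):
        label = 1 if rng.random() < 0.5 else 0
        row = []
        for j in range(n_detectors):
            if j in INFORMATIVE:
                p_fire = 0.8 if label else 0.2  # weak but real signal
            else:
                p_fire = 0.5                    # uninformative detector
            row.append(1 if rng.random() < p_fire else 0)
        X.append(row)
        y.append(label)
    return X, y

def fit_nb(X, y, features):
    """Bernoulli naive Bayes (Laplace smoothing) restricted to `features`."""
    n1 = sum(y)
    n0 = len(y) - n1
    log_prior = math.log((n1 + 1) / (n0 + 1))
    weights = {}
    for j in features:
        fires1 = sum(row[j] for row, lab in zip(X, y) if lab == 1)
        fires0 = sum(row[j] for row, lab in zip(X, y) if lab == 0)
        p1 = (fires1 + 1) / (n1 + 2)
        p0 = (fires0 + 1) / (n0 + 2)
        # Log-likelihood ratio when the detector fires / stays silent.
        weights[j] = (math.log(p1 / p0), math.log((1 - p1) / (1 - p0)))
    return log_prior, weights

def contributions(model, row):
    """Additive evidence of each selected detector: this is what an expert can audit."""
    _, weights = model
    return {j: (w_on if row[j] else w_off) for j, (w_on, w_off) in weights.items()}

def score(model, row):
    """Log posterior odds of 'anomaly'; a positive score means predict anomaly."""
    return model[0] + sum(contributions(model, row).values())

def accuracy(model, X, y):
    hits = sum((score(model, row) > 0) == (lab == 1) for row, lab in zip(X, y))
    return hits / len(y)

def forward_selection(X_tr, y_tr, X_val, y_val, max_features):
    """Greedily add the detector that most improves validation accuracy."""
    selected, best_acc = [], 0.0
    candidates = set(range(len(X_tr[0])))
    while candidates and len(selected) < max_features:
        trial = {
            j: accuracy(fit_nb(X_tr, y_tr, selected + [j]), X_val, y_val)
            for j in sorted(candidates)
        }
        j_best = max(trial, key=trial.get)
        if trial[j_best] <= best_acc:
            break  # no remaining detector helps
        selected.append(j_best)
        candidates.remove(j_best)
        best_acc = trial[j_best]
    return selected, best_acc

rng = random.Random(0)
X_tr, y_tr = make_data(500, 30, rng)
X_val, y_val = make_data(500, 30, rng)

selected, best_acc = forward_selection(X_tr, y_tr, X_val, y_val, max_features=5)
model = fit_nb(X_tr, y_tr, selected)
print("selected detectors:", selected)
print("validation accuracy:", best_acc)
print("evidence for first validation engine:", contributions(model, X_val[0]))
```

The point of the sketch is the last line: with naive Bayes, the decision is a sum of one term per selected detector, so a field expert can see which detectors pushed the score towards "anomaly" and by how much. Keeping the number of selected detectors small keeps that sum short enough to read.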

The defense

The defense took place on the 18th of September. Tsirizo gave an excellent talk in front of the following jury:

Dr. Allou Samé, IFSTTAR, reviewer
Prof. Michel Verleysen, Université Catholique de Louvain, reviewer
Mr. Serge Blanchard, Snecma
Dr. Jérôme Lacaille, Snecma
Prof. Ludovic Denoyer, LIP6, UPMC, president of the jury
Prof. Marie Cottrell, SAMM, co-adviser

and myself.

The summary of the thesis follows:

Identifying early signs of failure in a complex industrial system is one of the main goals of preventive maintenance. It makes it possible to avoid failures and to reduce the degradation of a component by carrying out maintenance operations earlier. Health monitoring for aircraft engines is one of the industrial fields in which this kind of anomaly detection is very important and meaningful. Aircraft engine manufacturers such as Snecma collect large amounts of engine-related data during each flight. The idea is to automatically detect when the engine is deviating from its normal behavior. Snecma is thus developing applications that allow people to prevent engine failures by detecting early signs of anomalies. This doctoral thesis first presents how the experts' knowledge is used to process this engine-related data. This first step points out the difficulties in handling the data, whether they relate to storage or to the processing algorithms themselves. The thesis then proposes a method to combine experts' knowledge with machine learning processes that meet Snecma's needs, such as the combination of various pieces of information, error control, and the interpretability of diagnostic results. To do so, the method works directly on the data produced by the algorithms developed by the experts themselves. This is done by homogenizing these data and then merging them. This step allows the use of supervised classification algorithms, whose goal is to group items (here the engines) of a similar nature in the same class, without losing the temporal component of the information. The homogenization of the data also allows the use of monitoring applications developed by experts in order to detect anomalies. Before merging the data, a selection algorithm is used. This thesis describes how the selection process allows the monitoring algorithms to calibrate themselves. Moreover, this selection satisfies the first constraint imposed by Snecma, namely the interpretability of the results. Finally, the method introduced in this thesis aims at helping Snecma make the anomaly labels converge across all its users. It also aims at encouraging the gathering of all the data in a single database containing the raw and processed data from the engine as well as engine-related data that could be useful, such as the results of expert analyses. Using this database, this thesis can then offer a labeling tool that can be used to improve selection and classification algorithms.

Tsirizo Rabenoro, Outils statistiques de traitement d'indicateurs pour le diagnostic et le pronostic des moteurs d'avions

The thesis is available on TEL here (it's written in French).

Published

18 September 2015

Tags

naive bayes classifier

health monitoring