Sankey diagram showing the relative importance of the 12 general statistical approaches in addressing the 10 main summary topics in the EFSA remits. For instance, “Surveillance” relies in particular on regression models and spatial statistical techniques.

Machine Learning Techniques applied in risk assessment related to food safety

Giuseppe Ru, Maria Ines Crescio, Francesco Ingravalle, Cristina Maurella, Dario Gregori, Corrado Lanera, Danila Azzolina, Giulia Lorenzoni, Nicola Soriani, Slavica Zec, Paola Berchialla, Silvio Mercadante, Federica Zobec, Marco Ghidina, Solidea Baldas, Barbara Bonifacio, Al Kinkopf, Dejan Kozina, Luca Nicolandi, Luca Rosati

July 2017

Abstract

In 2014, the European Food Safety Authority (EFSA) commissioned this evaluation of the potential use of Machine Learning Techniques (MLTs) to provide insights for the elaboration of a guidance document and to facilitate harmonisation in EFSA’s assessments. Four objectives were set: 1. to produce an inventory of MLTs that could be of use in EFSA risk assessment activities; 2. to classify EFSA opinions in order to identify the questions most commonly asked; 3. to assess the performance of MLTs compared with non-MLTs and to propose a decision tree to help in choosing the most appropriate methodology; 4. to develop, if possible, machine learning algorithms tailored to answer EFSA-specific questions. An extensive literature search of 22 online databases led to an inventory of more than 2.6 million MLT references: 213,070 abstracts were classified as relevant for EFSA and labelled by applying a Support Vector Machine and a Name co-Occurrences analysis. The application of Latent Dirichlet Allocation and Correlated Topic Modeling to the text of 3,744 EFSA scientific documents allowed the description of 28 main topics characterising the overall assessment activity carried out by EFSA. Moreover, the statistical techniques most commonly applied in EFSA to address these topics were identified by text mining and by a questionnaire survey that involved 49 EFSA staff. Six examples were used to show and compare the performance of MLTs and non-MLT techniques; this activity served to develop a decision tree that, on the basis of a set of predefined criteria, provides a guideline for selecting a fit-for-purpose MLT. Finally, to better address some specific issues, data from the European Union Summary Reports on Zoonoses and on Antimicrobial Resistance were used to develop case studies in which existing MLTs were expressly modified.

Type

Journal article

Publication

EFSA Supporting Publications