Sankey diagram showing the relative importance of the 12 general statistical approaches used to address the 10 main summary topics in the EFSA remits. For instance, “Surveillance” relies in particular on regression models and spatial statistical techniques.

Machine Learning Techniques applied in risk assessment related to food safety

Abstract

In 2014 the European Food Safety Authority (EFSA) commissioned this evaluation of the potential use of Machine Learning Techniques (MLTs) to provide insights for the elaboration of a guidance document and to facilitate harmonisation in EFSA’s assessments. Four objectives were set: 1. To produce an inventory of MLTs that could be of use in EFSA risk assessment activities; 2. To classify EFSA opinions in order to identify the questions most commonly asked; 3. To assess the performance of MLTs compared with non-MLT techniques and to propose a decision tree to help in the choice of the most appropriate methodology; 4. To develop, if possible, machine learning algorithms tailored to answer EFSA-specific questions. An extensive literature search of 22 online databases led to an inventory of more than 2.6 million MLT references; of these, 213,070 abstracts were classified as relevant for EFSA and labelled by applying a Support Vector Machine and a Name co-Occurrences analysis. Applying Latent Dirichlet Allocation and Correlated Topic Modeling to the text of 3,744 EFSA scientific documents allowed the description of 28 main topics characterising the overall assessment activity carried out by EFSA. Moreover, the statistical techniques most commonly applied in EFSA to address these topics were identified by text mining and by a questionnaire survey that involved 49 EFSA staff. Six different examples were used to show and compare the performance of MLT and non-MLT techniques; this activity served to develop a decision tree that, on the basis of a set of predefined criteria, provides a guideline for the selection of a fit-for-purpose MLT. Finally, to better address some specific issues, data from the European Union Summary Reports on Zoonoses and on Antimicrobial Resistance were used to develop case studies in which existing MLTs were expressly modified.

Publication
EFSA Supporting Publications