Structure of the Bayesian network estimated from demographic variables, known risk factors and genetic factors. The red node is the clinical endpoint, which indicates the presence or absence of EIMs. The chosen network was learned with the Tabu Search algorithm, one of the structure-learning algorithms available in the “bnlearn” R package.

The Role of Genetic Factors in Characterizing Extra-Intestinal Manifestations in Crohn’s Disease Patients: Are Bayesian Machine Learning Methods Improving Outcome Predictions?

Abstract

Background: The high heterogeneity of inflammatory bowel disease (IBD) makes the study of this condition challenging. In subjects affected by Crohn’s disease (CD), extra-intestinal manifestations (EIMs) can have a remarkable impact on health status. The growing number of patient characteristics and the small size of the analyzed samples make EIM prediction very difficult. Under such constraints, Bayesian machine learning techniques (BMLTs) have been proposed as a robust alternative to classical models for outcome prediction. This study aims to determine whether BMLTs could improve EIM prediction and the statistical support for clinicians' decision-making.

Methods: Three of the most popular BMLTs were employed in this study: Naïve Bayes (NB), Bayesian Network (BN) and Bayesian Additive Regression Trees (BART). They were applied to a retrospective observational Italian study of IBD genetics.

Results: Model performance is strongly affected by the features of the dataset, and BMLTs poorly classify EIM appearance.

Conclusions: This study shows that BMLTs perform worse than expected in classifying the presence of EIMs compared to classical statistical tools in a context where mixed genetic and clinical data are available but relevant data are also missing, as often occurs in clinical practice.

Publication
Journal of Clinical Medicine, (8)