text mining

Use of Machine Learning Techniques for Case-Detection of Varicella Zoster Using Routinely Collected Textual Ambulatory Records: Pilot Observational Study

This paper compares machine learning techniques applied to electronic health record (EHR) analysis for disease detection. Boosting has demonstrated promising performance in large-scale EHR-based infectious disease identification.

Screening PubMed abstracts: is class imbalance always a challenge to machine learning?

We combined four machine learning techniques with four data preprocessing strategies for class imbalance to identify the best-performing approach for screening PubMed articles for inclusion in systematic reviews. We used textual data from 14 systematic reviews as case studies. Meta-analytic fixed-effect models were used to pool delta AUCs separately by classifier and strategy. Resampling techniques slightly improved the performance of the investigated machine learning techniques. From a computational perspective, random undersampling 35:65 may be preferred.
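
As a rough illustration of random undersampling, the sketch below keeps every minority-class record and draws a random subset of the majority class until the minority makes up about 35% of the retained set. It assumes that "35:65" means minority:majority, which the summary above does not state explicitly. The function name `random_undersample`, its parameters, and the toy labels are hypothetical and are not taken from the study.

```python
import random

def random_undersample(labels, minority_share=0.35, seed=0):
    """Return sorted indices that keep all minority items and a random
    subset of majority items, so the minority is about `minority_share`
    of the retained set. Assumes binary labels coded as 0 and 1."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    # Whichever class is smaller is treated as the minority.
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    # Majority items needed to reach the target ratio (65 per 35 by default).
    target = round(len(minority) * (1 - minority_share) / minority_share)
    keep = min(target, len(majority))
    return sorted(minority + rng.sample(majority, keep))

# Toy, made-up labels: 35 relevant abstracts among 1000 (illustrative only).
labels = [1] * 35 + [0] * 965
kept = random_undersample(labels)
print(len(kept), sum(labels[i] for i in kept))
```

Only the training set would normally be resampled in this way; evaluation data are usually left at their original class distribution so that performance estimates reflect the real screening task.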

Analysis of unstructured text-based data using machine learning techniques: the case of pediatric emergency department records in Nicaragua

Five hundred Random Forests were trained on bootstrap samples of the whole dataset (1789 ED visits) to perform the classification task. Machine learning techniques (MLTs) appear to be a promising way to exploit the unstructured information reported in ED records in low- and middle-income Spanish-speaking countries.
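
The bootstrap resampling step can be sketched as follows, assuming one bootstrap sample per forest (the summary does not spell this out). Each sample draws 1789 row indices with replacement; the rows never drawn form the out-of-bag set, which is commonly used for validation. The helper `bootstrap_indices` is a hypothetical name, and classifier fitting is deliberately left out.

```python
import random

def bootstrap_indices(n_rows, n_samples, seed=0):
    """Yield (in_bag, out_of_bag) index lists for each bootstrap sample."""
    rng = random.Random(seed)
    all_rows = set(range(n_rows))
    for _ in range(n_samples):
        # Sample n_rows indices with replacement.
        in_bag = [rng.randrange(n_rows) for _ in range(n_rows)]
        # Rows that were never drawn are out-of-bag for this sample.
        out_of_bag = sorted(all_rows - set(in_bag))
        yield in_bag, out_of_bag

# 500 samples of 1789 visits, matching the counts reported above.
samples = list(bootstrap_indices(n_rows=1789, n_samples=500))
first_in_bag, first_oob = samples[0]
print(len(samples), len(first_in_bag), len(first_oob))
```

On average a bootstrap sample contains roughly 63% of the distinct rows, so about 37% of the visits remain out-of-bag for each sample.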

Extending PubMed searches to ClinicalTrials.gov through a machine learning approach for systematic reviews

The proposed machine learning instrument has the potential to help researchers identify relevant studies in the SR process by reducing workload, without losing sensitivity and at a small price in terms of specificity.
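
To make the sensitivity/specificity trade-off concrete, the minimal sketch below computes both measures from binary labels. The function name and the toy labels are hypothetical and do not come from the study; they only illustrate a screener that misses no relevant record but flags one irrelevant record.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded 0/1.
    Assumes at least one positive and one negative in y_true."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

# Made-up example: 3 relevant and 5 irrelevant records.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 0, 0, 0]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec)
```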