Reducing screening workload in medical literature monitoring with machine learning






Medical literature monitoring for adverse drug reactions and special situations is an important part of the pharmacovigilance process and a regulatory requirement for every medicinal product available on the European market. It is, however, a time-consuming effort that requires specialist domain knowledge, and only a small fraction of the articles reviewed eventually become valid individual case safety reports (ICSRs).

We present an approach that applies machine learning models to reliably filter out irrelevant articles ahead of manual screening, based on the information available in the article title and abstract. A benchmark model is trained on a labeled dataset produced for this study by asking annotators which articles include mentions of suspected adverse events. This choice of label helps overcome the incomplete nature of article titles and abstracts, and produces a drug-agnostic dataset that requires less annotation effort.

Using historical data from the EMA’s own literature screening activities and a benchmark deep learning classification model, we achieve significant savings in the volume of articles to be screened, even when setting low target levels for false negatives (i.e. high recall). For example, a target recall of 95% yields average monthly savings of 44% (from 40% to 49%) in the volume of abstracts to screen. These results suggest that our approach is a promising use of machine learning to pragmatically reduce manual workloads in medical literature monitoring.
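
The trade-off between target recall and screening savings can be illustrated with a minimal sketch. It is not the model or procedure used in this work; it assumes only that a classifier assigns each article a relevance score and that a labeled validation set is available. The function names `threshold_for_recall`, `recall_at`, and `screening_saving`, and the toy scores and labels, are hypothetical.

```python
import math

def threshold_for_recall(scores, labels, target_recall):
    """Return the highest score threshold that still keeps at least
    `target_recall` of the relevant (label == 1) articles.

    Assumption: articles with score >= threshold go to manual screening,
    and articles below it are filtered out.
    """
    positives = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    if not positives:
        raise ValueError("need at least one relevant article to set a threshold")
    # Minimum number of relevant articles that must be kept.
    # The small epsilon guards against floating-point error in the product.
    k = max(1, math.ceil(target_recall * len(positives) - 1e-9))
    return positives[k - 1]

def recall_at(scores, labels, threshold):
    """Fraction of relevant articles that remain in the screening queue."""
    relevant = [s for s, y in zip(scores, labels) if y == 1]
    kept = sum(1 for s in relevant if s >= threshold)
    return kept / len(relevant)

def screening_saving(scores, threshold):
    """Fraction of all articles filtered out before manual screening."""
    filtered = sum(1 for s in scores if s < threshold)
    return filtered / len(scores)

# Toy validation data (hypothetical): classifier scores and annotator labels.
scores = [0.95, 0.90, 0.80, 0.60, 0.40, 0.30, 0.20, 0.15, 0.10, 0.05]
labels = [1, 1, 0, 1, 0, 1, 0, 0, 0, 0]

t = threshold_for_recall(scores, labels, target_recall=0.75)
print(t, recall_at(scores, labels, t), screening_saving(scores, t))
```

With this toy data, a target recall of 0.75 gives a threshold of 0.60 and filters out 6 of the 10 articles. Raising the target to 1.0 lowers the threshold to 0.30 and the saving to 4 of 10, which shows why a stricter recall target reduces the workload saving. In practice the threshold would be chosen on held-out data and then checked on later months, since achieved recall on new data can differ from the target.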

