Join us to hear Ph.D. student Anna Seffernick discuss the use of a Bayesian Stereotype Model for feature selection in acute myeloid leukemia. 

Ordinal outcomes are common in public health and medical research and often arise from applying cutpoints to a latent continuous variable. However, some ordinal outcomes are truly discrete, and their categorization relies on the subjective combination of several factors. One example of this type of ordinal variable is the European LeukemiaNet (ELN) risk stratification system, which characterizes acute myeloid leukemia (AML) patients as having favorable, intermediate, or adverse risk. Because AML is a heterogeneous mix of diseases, accurate classification of AML patients can have important prognostic and treatment implications. Thus, we are interested in identifying genomic features that are monotonically associated with ELN score, as these might be mechanistically related to disease severity or progression and could be useful as diagnostic markers or potential therapeutic targets.

The stereotype logit model was developed for these discrete, or “assessed,” ordinal variables. Though not widely used, this model has been implemented for high-dimensional data with the frequentist generalized monotone incremental forward stagewise (GMIFS) algorithm and, in the Bayesian setting, for low-dimensional data only. Here, we extend the Bayesian framework for the stereotype model to high-dimensional data and explore different regularization and variable selection priors. We conducted simulation studies to investigate the feature selection performance of these priors and to compare our method with existing methods. We also applied our proposed model to an AML gene expression dataset to identify genomic features that might be driving patients’ cytogenetic risk group classification.
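
For readers unfamiliar with the stereotype logit model, the sketch below shows one common parameterization: the log-odds of category j against a reference category equal alpha_j + phi_j * (beta . x), with the scale parameters phi ordered from 1 down to 0. It is an illustration only, not the speaker's implementation. The function name, the choice of the last category as reference, and all numeric values are hypothetical.

```python
import math


def stereotype_probs(x, alpha, phi, beta):
    """Category probabilities under a stereotype logit model.

    Assumed form: log(P(Y=j|x) / P(Y=J|x)) = alpha[j] + phi[j] * (beta . x),
    where the last category J is the reference (alpha[J] = 0, phi[J] = 0).
    """
    # A single linear predictor is shared by every category.
    eta = sum(b * xi for b, xi in zip(beta, x))
    # Each category rescales that predictor by its own phi.
    scores = [a + p * eta for a, p in zip(alpha, phi)]
    # Softmax, shifted by the maximum score for numerical stability.
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical values for three ordered categories
# (favorable, intermediate, adverse), with adverse as the reference.
alpha = [0.5, 0.2, 0.0]
phi = [1.0, 0.4, 0.0]      # ordering constraint: 1 = phi_1 >= phi_2 >= phi_3 = 0
beta = [0.8, 0.0, -0.5]    # a zero coefficient means the feature is not selected
x = [1.2, 3.0, 0.4]        # made-up expression values for three features

probs = stereotype_probs(x, alpha, phi, beta)
```

Because every category shares the same coefficient vector, a feature whose coefficient is shrunk to zero has no effect on any category probability. That shared structure is what lets regularization or variable selection priors on the coefficients act as feature selection.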

Learn more about this presentation on Monday at 3:10 p.m. via this Zoom link.