A New Framework for Understanding How Genes Function
Gene set enrichment analysis (GSEA), though an esoteric-sounding term, lies behind many advances in molecular biology today. It aided in the discovery of subtypes of triple-negative breast cancer, for instance. It also helped confirm the efficacy of the gene editing tool CRISPR-Cas9 in human cells.
GSEA helps scientists study the coordinated actions of a set of genes, via the biological pathways and functions they share. But GSEA isn’t standardized. Now, in the most comprehensive study of GSEA methods to date, researchers at CUNY’s Graduate School of Public Health and Health Policy have benchmarked the most popular GSEA methods and provided an open-source framework that lets researchers evaluate other methods.
“This study improves our understanding of how gene set enrichment analysis methods perform in biomedical applications, and will help future developments focus on improvements relevant to public health,” Professor Levi Waldron, one of the study's authors, told CUNY SPH.
The study appears in Briefings in Bioinformatics. Its authors include Waldron and postdoctoral fellow Ludwig Geistlinger, as well as CUNY SPH alumni Marcel Ramos and Lucas Schiffer.
GSEA is widely used across a range of biological studies. Until now, however, many methods had been evaluated only on simulated data or on just a few datasets, making it hard to know which methods work best.
To remedy this, the study authors have provided benchmarking software and a curated collection of 75 datasets, so that other researchers can evaluate GSEA methods themselves.
They also assessed the 10 most popular GSEA methods, setting standards for criteria such as statistical soundness and the ability to correctly prioritize the biological pathways most relevant to a range of human diseases. The assessment revealed substantial differences between methods, showing that some work objectively better than others.