Research

Our laboratory is broadly interested in problems at the intersection of Genomics, Computer Science and Statistics. In particular, we develop scalable and efficient methods to make sense of the complex, large-scale datasets being generated in the fields of human genetics and medical imaging. The goal of our work is to leverage these new methodologies to understand the genetic basis of disease and to address questions related to human evolution.

Some major research themes of the lab are:

1. Applying methods in AI to medical imaging data connected to genomic and lifetime electronic healthcare records. Over the past decades, enormous amounts of data have been generated in the field of medical imaging. However, linking these images with genomic data has been hindered by the inability to measure phenotypes directly from them at high throughput. We are building machine learning methods, particularly deep learning approaches, to automatically extract features and phenotypes from a variety of biomedical imaging datasets and then connect these with genomic information to understand the genetic basis of these traits.

2. Studying human evolution using the powerful new technology of ancient DNA. The ability to sequence genomes from skeletal remains tens of thousands of years old has revolutionized the field of genomics by allowing us to extend genomic datasets beyond the single dimension of space, adding the important new dimension of time. We generate data from a variety of species to address questions related to the spread of human languages, the extinction of megafauna, the response of organisms to climate change and the process of early domestication.