23, Python and Me: Using Machine Learning in Python to Analyze Consumer Genomics Data
30-Minute Talk
Sunday at 2:45 pm in Calypso
We are over 20 years into the genomics era, with new insights into human health and our recent evolutionary history emerging almost daily. Genomics may seem like the province of PhDs and R&D departments, but anyone with basic Python skills can navigate a genomics data pipeline and explore human genetic diversity, including their own! In this presentation, I will introduce the fields of population and health genomics and the types of domain-specific data used in genomic studies. I will then demonstrate how you can use Python to visualize and analyze public data sources like the 1000 Genomes Project using unsupervised machine learning methods, and how to investigate your own genomic data from sources like 23andMe. Since many people are not comfortable giving for-profit companies access to their genomic data, I'll also show how you can simulate realistic personal genomic data with a supervised ML model.
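To give a flavor of the unsupervised analysis mentioned above, here is a minimal sketch of principal component analysis (PCA) on a genotype matrix. It is not the code from the talk. It assumes genotypes are coded as 0, 1, or 2 copies of an alternate allele per variant, and it uses synthetic data for two made-up populations rather than real 1000 Genomes or 23andMe files. The function names `simulate_genotypes` and `pca`, the allele frequencies, and the sample sizes are all hypothetical.

```python
import numpy as np


def simulate_genotypes(n_per_pop=50, n_snps=500, seed=0):
    """Simulate 0/1/2 genotypes for two hypothetical populations.

    Assumption: each variant has an allele frequency per population, and
    each genotype is a binomial draw of two alleles at that frequency.
    """
    rng = np.random.default_rng(seed)
    # Made-up allele frequencies; population B drifts away from population A.
    freq_a = rng.uniform(0.1, 0.9, size=n_snps)
    freq_b = np.clip(freq_a + rng.normal(0.0, 0.2, size=n_snps), 0.05, 0.95)
    pop_a = rng.binomial(2, freq_a, size=(n_per_pop, n_snps))
    pop_b = rng.binomial(2, freq_b, size=(n_per_pop, n_snps))
    genotypes = np.vstack([pop_a, pop_b]).astype(float)
    labels = np.array([0] * n_per_pop + [1] * n_per_pop)
    return genotypes, labels


def pca(genotypes, n_components=2):
    """Project samples onto their top principal components via SVD."""
    # Center each variant (column) so PCA captures variation between samples.
    centered = genotypes - genotypes.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    # Sample coordinates are the left singular vectors scaled by singular values.
    return u[:, :n_components] * s[:n_components]


genotypes, labels = simulate_genotypes()
coords = pca(genotypes)

# Compare where each simulated population falls on the first component.
for pop in (0, 1):
    print(f"population {pop}: mean PC1 = {coords[labels == pop, 0].mean():.2f}")
```

Plotting `coords[:, 0]` against `coords[:, 1]`, colored by `labels`, would show whether the two simulated groups form separate clusters. With real data, the genotype matrix would come from a parsed variant file instead of the simulator.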