Track
The Python programming language
Type
Talk
Level
Beginner
Language
English
Duration
20 minutes
What if the tools we use to predict disease risk don't work for entire populations, simply because very few studies have tested them there? That is the case for many polygenic risk score (PRS) models and pipelines, which often fail when applied to African genomes. A PRS is a number that estimates how likely someone is to develop a disease based on their DNA. In this talk, I'll walk through how I am using Python to build something better: a reproducible pipeline for genomic tools that center inclusion.

More than a technical talk, this is a story about learning, persistence and impact. I'll share how I use accessible Python libraries like argparse, pandas, subprocess and shutil to automate preprocessing, run containerised genomic tools with Docker, and stitch everything together with workflow managers like Nextflow. The tools I'm building are still a work in progress, but this talk will spotlight the imperfect, iterative process of building pipelines, while still learning, that don't exclude underrepresented populations by design. I'll share how Python makes it possible to go from fragmented data to actionable results.

This talk is for anyone curious about bioinformatics, passionate about global inclusion, or simply learning Python.
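To give a flavour of the building blocks named above, here is a minimal sketch. It is not the pipeline presented in the talk. It assumes the common additive form of a PRS (effect weight times allele dosage, summed over variants), and the function names, the `/data` mount point and the `--image` and `--workdir` options are all hypothetical choices made for illustration.

```python
import argparse
import shutil
import subprocess
from pathlib import Path

import pandas as pd


def compute_prs(dosages: pd.DataFrame, weights: pd.Series) -> pd.Series:
    """Additive score per sample: sum of (effect weight * allele dosage).

    Assumes rows are samples, columns are variant IDs, and `weights` is
    indexed by variant ID. Variants missing from either input are ignored.
    """
    shared = dosages.columns.intersection(weights.index)
    weighted = dosages[shared].mul(weights[shared], axis="columns")
    return weighted.sum(axis="columns")


def build_docker_command(image: str, tool_args: list[str], workdir: str) -> list[str]:
    """Build a `docker run` command that mounts workdir at /data (hypothetical layout)."""
    return ["docker", "run", "--rm", "-v", f"{workdir}:/data", image, *tool_args]


def build_parser() -> argparse.ArgumentParser:
    """Command-line options for the wrapper; the option names are illustrative."""
    parser = argparse.ArgumentParser(description="Run a containerised genomic tool.")
    parser.add_argument("--image", required=True, help="Docker image to run")
    parser.add_argument("--workdir", default=".", help="Host directory to mount")
    return parser


def run_tool(argv: list[str]) -> None:
    """Parse options, check that Docker is installed, then run the container."""
    parser = build_parser()
    # Options the wrapper does not recognise are passed through to the tool.
    args, tool_args = parser.parse_known_args(argv)
    if shutil.which("docker") is None:
        parser.error("docker was not found on PATH")
    workdir = str(Path(args.workdir).resolve())  # Docker needs an absolute host path
    subprocess.run(build_docker_command(args.image, tool_args, workdir), check=True)
```

In this sketch, `compute_prs` can be tried on a small hand-made table without Docker installed, while `run_tool` would be called with something like `sys.argv[1:]` from a script entry point.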