Industry Experiences
Sarepta Therapeutics, Inc.
Research Data Scientist I
May 2025 - July 2025
- Experimental guidance: Supported scientific decision-making by developing AI/ML models that leverage internal and external experimental data.
- Protein sequence-to-function mapping: Identified tissue-targeting and detargeting mutations with over 10% impact and high capsid viability in AAV-based gene therapy.
- In silico solution design: Employed cutting-edge protein LLMs for protein and peptide function identification, identified key scientific problems, and curated relevant datasets to develop novel computational solutions.
Nine Square Therapeutics
Computational Biology Intern
May 2024 - August 2024
- Vision transformer (ViT) model development: Trained ViT models on multi-cell and single-cell immunofluorescence images for Gaucher's disease classification, exceeding internal benchmarks by 7%.
- Model explainability: Designed white-box deep learning models to interpret model behavior during inference and identified disease-associated morphologies across different cell lines using image heatmaps.
- Cell morphology analysis: Validated findings through literature review and in-house analysis, and selected disease phenotypes for experimental validation in subsequent screens.
Academic Experiences
The Ohio State University, College of Medicine
Graduate Research Associate
May 2020 - May 2025
- Dissertation title: Artificial Intelligence Guided Systems Biology for Synthetic Lethality and Gene Combination Effect Prediction: Bridging In Silico and In Vitro
- In silico guidance for in vitro assays: Trained AI models with active learning to improve in vitro CRISPR-Cas9 synthetic lethality identification rate from 7.2% to 46.8%, increased on-target and reduced off-target effects in sgRNAs, and similarly improved cell type annotation accuracy by 13.5% in thyroid cancer.
- Improving assay efficiency: Reduced theoretical genome-wide CRISPR-Cas9 double knockout (CDKO) experiment time from 883 years to half a year and costs from $36M to $18K.
- Biology-first deep learning modeling: Developed biologically relevant AI/DL methods that allowed in silico analyses and in vitro assays to be comparable, and created a new experiment paradigm for synthetic lethality.
- Novel model training strategies: Developed a suite of novel transfer learning strategies, increased average accuracy by 19.02% across 8,900 DL models (classification tasks) and Pearson correlation by 23.33% across 8,600 DL models (regression tasks), and made models explainable through a custom Shapley value implementation.
- General responsibilities: Conceptualized and led computational and cross-functional projects for drug discovery, and created R-Shiny full-stack web applications, Python software packages, and computational workflows.