Research

My research interests and ongoing work

Research Interests

AI for Science

Leveraging machine learning to advance scientific discovery by understanding physical laws, modeling complex systems, and extracting insights from scientific data across genomics, physics, and materials science.

Computational Physics & Modeling

Building machine learning models to simulate, predict, and optimize physical phenomena. Focus on scientific machine learning approaches that respect physical constraints and domain knowledge.

ML Systems & Applications

Designing robust, scalable machine learning systems for NLP, real-world applications, and education. Passionate about supporting underrepresented groups in tech through accessible tools and knowledge sharing.

Research Projects

Capstone: NLP for Course Evaluation Analysis

Ongoing

Built an NLP pipeline to analyze and summarize course evaluations, extracting actionable insights from noisy, unstructured feedback. Evaluated summarization and sentiment models across datasets (CSUMB, UCLA), designing metrics to measure performance and generalization. Developed a benchmarking framework to compare model outputs and assess insight quality across diverse evaluation formats.

Role: Undergraduate Researcher
Advisor: Dr. Cao Thang Bui
April 2026 - Present
Python, NLP, Machine Learning, Evaluation Metrics

PAT: Pangenome Annotation Toolkit

Ongoing

PAT is a scalable pipeline that merges annotations from multiple source genomes into a unified set of graph coordinates, so that any assembly newly added to the graph can be annotated rapidly.

Pangenome Annotation Toolkit Research Poster
Role: Undergraduate Researcher
Principal Investigator: Dr. Benedict Paten
2025 - Present
Python, Bash, Pangenomics