Data Scientist & ML Researcher

Turning real-world data into rigorous intelligence.

I build end-to-end machine-learning and NLP systems, from LLM multi-agent research to deployed prediction services, that turn complex data into clarity and value.

Featured work

Selected Projects


Soccer Player Market Value Scout

A deployed ML service that predicts soccer players' market value from performance stats and explains every prediction. Streamlit app + FastAPI on Google Cloud Run.

Python, scikit-learn, FastAPI, Streamlit, Docker

LLM Multi-Agent Research Teams

M.S. thesis: LLM multi-agent simulations of five real research teams (~24K utterances) with a four-layer fidelity framework and a 55-metric NLP evaluation suite. Under review at EMNLP 2026.

LLM Agents, NLP, Statistical Inference, Python

Enhancing Emotion Recognition in AI

Improving the interpretability and reducing the bias of an image emotion-recognition model using CLIP-Dissect and concept-based analysis on a ResNet-50 backbone.

PyTorch, CLIP, ResNet-50, Interpretability
Sungjin Choi

About me

Solving real problems with data and intelligence.

I'm a master's student in Applied Statistics & Data Science at UCLA (GPA 4.0), with a B.S. in Data Science from UC San Diego. My work sits where large language models, NLP, and applied statistics meet: I build reproducible pipelines that turn messy interaction data into rigorous inference.

In summer 2025 I researched multimodal LLM merging as an AI Model Research Intern at Samsung Electronics. My current research on LLM multi-agent simulations of research teams is under review at EMNLP 2026.

4.0: UCLA M.S. GPA
EMNLP '26: Paper Under Review
Samsung: AI Research Intern

Tech stack

Technologies I work with

Python, PyTorch, TensorFlow, scikit-learn, pandas, SQL, FastAPI, Streamlit, Docker, AWS, Google Cloud, Git, LLMs / NLP

Let's build something great

Have a project or role in mind?

I'm always open to research collaborations and data-science opportunities.

Get in touch →