Designed and productionized data and training pipelines for code-generation agents targeting domain-specific languages, using AWS Bedrock, SageMaker, LangChain, and Python.
Collaborated with cross-functional engineering and science partners to improve agent tool-use workflows, retrieval quality, and evaluation coverage for production-facing code generation.
Research Assistant
Social Cognitive AI (SCAI) Lab at Johns Hopkins University
Applied science and research focus:
Studied embodied assistance and spoken instruction following with theory-of-mind models, Bayesian inverse planning, and simulated/real-world robot task data.
Contributed to research artifacts on embodied AI and human-centered AI, including SIFToM and UnclearInstruct, advised by Prof. Tianmin Shu.
Research Assistant
Learning to Defer with an Uncertain Rejector via Conformal Prediction
Applied science and research focus:
Proposed an uncertainty-aware, distribution-free post-training rejector for learning to defer, improving human-AI collaboration on object recognition and hate-speech detection tasks.
Developed surrogate losses and baselines for learning-to-defer experiments with Wide ResNet, human expert simulators, data augmentation, and PyTorch.
Built experiment pipelines for uncertainty quantification and distribution shift, including batch ensembles, SNGP, MC-Dropout, Bayesian neural networks, CIFAR-10/100 corruption tests, and GPU/HPC automation with Slurm, Shell, and Docker.
Research Assistant
Augmentation for Distribution Drift in Credit Scoring
Applied science and research focus:
Proposed data augmentation algorithms with kernel density estimation to reduce distribution drift in credit scoring models, improving AUC from 0.73 to 0.85 under varying economic factors.
Benchmarked gradient-boosting and neural-network credit risk models with Python, PyTorch, LightGBM, NumPy, Pandas, and Matplotlib.
Built large-scale financial time-series datasets of approximately 2 billion data points with Spark and SQLAlchemy.