CV

Education

Doctor of Philosophy (PhD) in Biostatistics - University of California, Los Angeles
- September 2024 - Present
- Courses: Uncertainty in LLMs, Agent-based AI
Master of Science in Computer Science - Northeastern University
- January 2023 - May 2024
- Courses: Algorithms, Distributed Databases
Master of Science in Statistics and Operations Research - University of North Carolina at Chapel Hill
- August 2018 - June 2020
- Courses: Applied Statistics, Machine Learning, Time Series Forecasting

Work Experience

Software Engineer Intern at Amazon
June 2024 - August 2024

Developed key features for Amazon B2B on the AWS platform to automate event-driven applications using EventBridge
Implemented enhancements in the Visibility Service using Java, JavaScript and TypeScript, improving real-time tracking and monitoring of important business transactions across the platform
Conducted integration tests for all Outbound services and events, ensuring smooth and error-free deployment
Designed and tested dashboards using CloudWatch to monitor service performance metrics

Data Scientist at ByteDance Ltd.
February 2021 - August 2022

Collaborated with a cross-functional team to deploy a dynamic subscription tool to improve customer experience on the platform
Verified product feasibility and deployed an XGBoost model to identify the most important customer indicators, informing product feature design
Applied interrupted time series analysis to estimate the potential revenue impact and to evaluate, in advance, the risk of a strategy restricting creators' quotes
Conducted attribution analysis to evaluate marketing campaign performance, providing strong evidence to support product marketing

Data Scientist at Blingby
August 2020 - December 2020

Built ETL pipelines with Apache Spark to transform raw data into features, combining business sense and statistical knowledge
Developed and maintained web-based Tableau dashboards to update the daily data analysis report, increasing daily work efficiency by 20%

Machine Learning Intern at TouchSuite
June 2020 - August 2020

Queried and cleaned terabyte-sized order data from Azure SQL using pyodbc
Conducted online analytical processing (OLAP) to display critical sales performance across different dimensions
Developed item-based approaches to handle cold-start problems and tuned model hyperparameters with the SparkML cross-validation tools, reducing root mean square error by 10%

Publications

Incentivizing Truthful Language Models via Peer Elicitation Games
- NeurIPS 2025, September 2025
- This paper introduces Peer Elicitation Games (PEG), a training-free, game-theoretic framework for aligning LLMs.
Competitive Multi-Agent Delegation For LLM Reasoning
- Submitted to ICML 2025
- This paper introduces COMMAND, a competitive multi-agent delegation framework that treats LLM reasoning as a principal-agent delegation process, using competition and alignment incentives to elicit higher-quality answers.
LightAgent: Production-level Open-source Agentic AI Framework
- arXiv preprint, September 2025
- This paper proposes LightAgent, a lightweight, open-source agentic framework that balances flexibility and simplicity by integrating core functionalities like Memory, Tools, and Tree of Thought.