Professional Work Experience
Mozilla (Current)
Machine Learning Engineer
SoundHound
Machine Learning Engineer / Data Scientist
- Managed a team of machine learning scientists and software engineers to build innovative solutions in natural language processing
and understanding using traditional approaches and LLMs
- Led development of end-to-end machine learning research projects in areas of natural language processing, text (pair)/token/audio
classification, phrase segmentation, and natural language/expression generation to solve business problems including failed query handling,
ASR error detection and automatic training data generation
- Built automated, scalable data processing pipelines and production-level machine learning solutions that are integrated into the company's
product offerings (cloud & embedded)
- Reduced the production NLU failure rate by over 40%, improving the user experience in in-car and voice-ordering environments
- Filed several patents for our novel solutions
- Received the top impact award at the company's quarterly meeting
Manulife — Lab of Forward Thinking
Data Scientist
- Led multiple data science and machine learning projects, built various statistical models and
deep learning models (NLP & CNN) to provide insights to business or to
semi-automate current repetitive business processes
- Researched adversarial attacks and defenses in computer vision; used Grad-CAM to
visualize how attacks confuse the attention of deep neural networks
and how effective the available defenses are; demonstrated how adversarial attacks
could affect the insurance industry by using transfer learning
to generate industry-specific cases; and presented the research at Vector Institute ESS #2
- Worked closely with back-end and front-end engineers to design databases, create ETL pipelines
for data flow and deploy RESTful APIs for exposing developed machine learning models and visualizations
Athos
Data Scientist
- Assessed the existing algorithm, then independently invented, implemented and tested a new algorithm
to calculate the Athos Score (currently in production app)
- Implemented the first machine learning algorithm to predict lower body heart rate confidence
- Wrote Python and MySQL scripts to automate data cleaning, transfer and storage processes
- Created sensor signal and other data visualizations for problem detection and algorithm improvement
MD Financial Management
Insight Analyst
- Conducted statistical analysis on a product pre-launch survey; designed and implemented predictive models
to identify future product purchasers and a visualization application for the survey data using R with Shiny
- Mined customer and survey databases to explore gender differences in investment behaviour
Princess Margaret Hospital — University Health Network
Bioinformatics Data Analyst
- Implemented an experiment database using Python with SQLite3 which greatly improved efficiency for data standardization and sharing
- Automated experimental data storage to a portable SQLite3 database using Python and Excel
- Applied mixed-effects models and forecasting to a set of therapeutic experiment data using R; wrote a detailed statistical report in LaTeX
Institute of Systems Science — National University of Singapore
Business Research Assistant
- Analyzed customer survey data using SPSS, R and Excel and produced visualization plots for presentation and final report to clients
- Mined Singapore's transportation data for commuters' travel pattern analysis
Education
University of Waterloo
MMath Statistics, 2017
Longitudinal Data Analysis • Non-parametric Regression • Causal Inference and Graphical Models • Hypothesis Testing • Sampling • Stochastic Processes
University of Waterloo
BMath Honours Statistics & Computational Mathematics Co-op, 2016
Mathematical Statistics • Experimental Design • Applied Probabilities • Classification • Data Visualization • Generalized Linear Models • Spatial Data Analysis • Statistical Learning • Data Structures
Hackathons
Top 10% & 1st in NA, 2017 International Data Mining Cup — Online shopping revenue prediction.
1st Place, 2015 6sense DataHack — Yelp-for-business: where should I open the restaurant?
1st Place, 2015 Capital One Data Mining Cup — Break-even bid price prediction for search engine advertising.
Talks
Adversarial Attacks and Defenses on Computer Vision Systems and Their Impact to Regulated Industries, Vector Institute ESS2, Nov 2017.