Yilun (Tom) Zhang

Professional Work Experience

Mozilla (Current)

Machine Learning Engineer

SoundHound

Machine Learning Engineer / Data Scientist
  • Managed a team of machine learning scientists and software engineers to build innovative solutions in natural language processing and understanding using traditional approaches and LLMs
  • Led development of end-to-end machine learning research projects in areas of natural language processing, text (pair)/token/audio classification, phrase segmentation, and natural language/expression generation to solve business problems including failed query handling, ASR error detection and automatic training data generation
  • Built automatic and scalable data processing pipelines and production-level machine learning solutions that are integrated into the company's product offerings (cloud & embedded)
  • Reduced the production NLU failure rate by over 40% and greatly improved the user experience in in-car and voice-ordering environments
  • Filed several patents for our novel solutions
  • Received the top impact award at the company's quarterly meeting

Manulife — Lab of Forward Thinking

Data Scientist
  • Led multiple data science and machine learning projects; built statistical and deep learning models (NLP & CNN) to provide business insights or to semi-automate repetitive business processes
  • Researched adversarial attacks and defenses in computer vision; used Grad-CAM to visualize how attacks confuse the attention of deep neural networks and how effective the available defenses are; demonstrated how adversarial attacks could affect the insurance industry by using transfer learning to generate industry-specific cases; presented the research at Vector Institute ESS #2
  • Worked closely with back-end and front-end engineers to design databases, create ETL pipelines for data flow and deploy RESTful APIs for exposing developed machine learning models and visualizations

Athos

Data Scientist
  • Assessed the existing algorithm, then independently invented, implemented and tested a new algorithm to calculate the Athos Score (currently in the production app)
  • Implemented the first machine learning algorithm to predict lower body heart rate confidence
  • Wrote Python and MySQL scripts to automate data cleaning, transfer and storage processes
  • Created sensor signal and other data visualizations for problem detection and algorithm improvement

MD Financial Management

Insight Analyst
  • Conducted statistical analysis on a product pre-launch survey; designed and implemented predictive models to identify future product purchasers and a visualization application for the survey data using R with Shiny
  • Mined customer and survey databases to explore gender differences in investment behaviour

Princess Margaret Hospital — University Health Network

Bioinformatics Data Analyst
  • Implemented an experiment database using Python with SQLite3, which greatly improved the efficiency of data standardization and sharing
  • Automated experimental data storage to portable SQLite3 database using Python and Excel
  • Applied mixed-effects models and forecasting to a set of therapeutic experiment data using R; wrote a detailed statistical report in LaTeX

Institute of Systems Science — National University of Singapore

Business Research Assistant
  • Analyzed customer survey data using SPSS, R and Excel and produced visualization plots for presentation and final report to clients
  • Mined Singapore's transportation data for commuters' travel pattern analysis

Education

University of Waterloo

MMath Statistics, 2017

Longitudinal Data Analysis • Non-parametric Regression • Causal Inference and Graphical Models • Hypothesis Testing • Sampling • Stochastic Processes

University of Waterloo

BMath Honours Statistics & Computational Mathematics Co-op, 2016

Mathematical Statistics • Experimental Design • Applied Probabilities • Classification • Data Visualization • Generalized Linear Models • Spatial Data Analysis • Statistical Learning • Data Structures

Hackathons

Top 10% & 1st in NA, 2017 International Data Mining Cup — Online shopping revenue prediction.

1st Place, 2015 6sense DataHack — Yelp-for-business: where should I open the restaurant?

1st Place, 2015 Capital One Data Mining Cup — Break-even bid price prediction for search engine advertisement.

Talks

Adversarial Attacks and Defenses on Computer Vision Systems and Their Impact to Regulated Industries, Vector Institute ESS2, Nov 2017.