Biography

🐎 I am moving my website to https://viterbi-web.usc.edu/~yzhao010/, and this site is no longer maintained.


Next Steps. 🐎 I will join USC as an Assistant Professor of Computer Science in August 2023 (more details on intern/visitor, graduate student, and postdoc recruitment to follow).

Contact me by email (yzhao010 [AT] usc.edu) or WeChat @ yzhao062. Please add me on WeChat only for a concrete reason, e.g., collaboration, visiting, or joining the group.

Short Bio. I earned my Ph.D. in four years at Carnegie Mellon University (CMU). My research accelerates and automates unsupervised ML, focusing on (1) speeding up large-scale learning tasks with ML systems and (2) automating unsupervised ML model selection and hyperparameter optimization. I also build AI/ML applications in finance, healthcare, and security.

Open-source Contribution. I have led more than 10 ML open-source initiatives, which have received 16,000 GitHub stars (top 0.002%: ranked 800 out of 40M GitHub users) and more than 15M downloads.

Social Media. I have written more than 300 articles (in Chinese) on Zhihu (知乎), a platform with 200 million+ users, where I have 220,000 followers. My articles have been read more than 20,000,000 times. See my Zhihu (微调).

Ph.D. time. At CMU, I worked with Prof. Leman Akoglu, Prof. Zhihao Jia, and Prof. George H. Chen. I was a member of the CMU automated learning systems group (Catalyst). I also collaborated with Prof. Jure Leskovec at Stanford and Prof. Philip S. Yu at UIC.


Interests

  • Unsupervised ML
  • ML Systems (MLSys)
  • Automated ML (AutoML)
  • Anomaly Detection
  • Outlier/out-of-distribution Detection
  • Graph Neural Networks
  • Healthcare AI
  • AI + Security

Education

  • Ph.D., 2019-2023

    Carnegie Mellon University

  • M.S., 2015-2017

    University of Toronto

  • B.S., 2015

    University of Cincinnati

  • High School Diploma, 2010

    Shanxi Experimental Secondary School 山西省实验中学

Miscellaneous

News & Travel

May 2023: I defended my thesis!!! See the photo :)

Apr 2023: Do Not Train It: A Linear Neural Architecture Search of Graph Neural Networks will appear in ICML 2023. Congrats to Peng Xu and Lin Zhang!

Feb 2023: Weakly Supervised Anomaly Detection: A Survey is out! [code]

Profile & Casual Pictures

Publications

See my Google Scholar, DBLP, ORCID, and ResearchGate.


Preprints & Working Papers

[w23a] Weakly Supervised Anomaly Detection: A Survey, with Minqi Jiang, Chaochuan Hou, Ao Zheng, Xiyang Hu, Songqiao Han, Hailiang Huang, Xiangnan He, and Philip S. Yu. Preprint.

[w22f] Diffusion Models: A Comprehensive Survey of Methods and Applications, with Ling Yang, Zhilong Zhang, Yang Song, Shenda Hong, Runsheng Xu, Yingxia Shao, Wentao Zhang, Bin Cui, and Ming-Hsuan Yang. Preprint.

[w22e] Hyperparameter Optimization for Unsupervised Outlier Detection, with Leman Akoglu. Preprint.


Peer-reviewed Papers

(2023). TOD: GPU-accelerated Outlier Detection via Tensor Operations. International Conference on Very Large Data Bases (VLDB).

PDF Code DOI

(2022). ADBench: Anomaly Detection Benchmark. Advances in Neural Information Processing Systems (NeurIPS) (Equal contribution).

PDF Code

(2022). ELECT: Toward Unsupervised Outlier Model Selection. IEEE International Conference on Data Mining (ICDM).

PDF Code

(2022). ECOD: Unsupervised Outlier Detection Using Empirical Cumulative Distribution Functions. IEEE Transactions on Knowledge and Data Engineering (TKDE) (Co-first author; equal contribution).

PDF Code DOI IEEE Xplore

(2021). Automatic Unsupervised Outlier Model Selection. Advances in Neural Information Processing Systems (NeurIPS).

PDF Code Project

(2020). Combining Machine Learning Models Using combo Library. Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI), demo track.

PDF Code Video DOI

(2020). SynC: A Unified Framework for Generating Synthetic Population with Gaussian Copula. Workshops at the Thirty-Fourth AAAI Conference on Artificial Intelligence.

PDF Code PPAI Arxiv

(2018). DCSO: Dynamic Combination of Detector Scores for Outlier Ensembles. ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), Workshop on Outlier Detection De-constructed (ODD).

PDF Slides

(2017). An empirical study of touch-based authentication methods on smartwatches. Proceedings of the 2017 ACM International Symposium on Wearable Computers (ISWC) (Co-first author; equal contribution).

PDF DOI ACM DL

Awards and Funds

Meta 2022 AI4AI Research Award

To foster further innovation in this area, and to deepen our collaboration with academia, Meta is pleased to invite faculty to respond to this call for research proposals pertaining to the aforementioned topics.

The Norton Labs Graduate Fellowship

The Norton Labs Graduate Fellowship provides up to $20,000 USD that may be used to cover one year of the student's tuition fees and/or reimburse expenses incurred by the student during collaboration with Norton Labs. Selected as one of only two graduate students to receive the award.

CMU Presidential Fellowship

PwC Presidential Fellowship ($80,000).

Mitacs-Accelerate Research and Development Funding

Project IT07884 ($30,000): machine learning in HR analytics.

Mantei/Mae Award & Scholar

Awarded to the highest-performing students in Electrical Engineering, Computer Engineering, and Computer Science ($40,000 over four years).

University Global Award and Scholarship

Awarded to top-performing international students ($32,000 over four years).

Services

Conference Organizing Committee

External Reviewer for Funding Proposals

Journal Reviewer

Program Committee and/or Reviewer for Conferences and Workshops

Experience

Professional Positions

Assistant Professor of Computer Science

University of Southern California

Aug 2023 – Present

Machine Learning Research Intern

NortonLifeLock Research Group

May 2022 – Dec 2022

Supervised by Dr. Acar Tamersoy and Dr. Kevin Roundy.

Machine Learning Research Intern

Microsoft Research

Jan 2022 – Mar 2022

Designed weakly supervised anomaly detection algorithms.

Supervised by Dr. Guoqing Zheng and Dr. Subhabrata (Subho) Mukherjee.

Visiting Student Researcher

Stanford University, Computer Science Department

May 2021 – Aug 2021 Stanford, CA, USA

Designed new GNN systems and models.

Supervised by Prof. Jure Leskovec.

Machine Learning Research Intern

IQVIA, Analytics Center of Excellence

May 2020 – Aug 2020 Boston, MA, USA

Designed new machine learning systems and models in healthcare.

Supervised by Dr. Cao (Danica) Xiao (IQVIA) and Prof. Jimeng Sun (UIUC).

Senior Consultant

PwC Canada, Consulting & Deals

Feb 2017 – Jun 2019 Toronto, ON, Canada

As a senior consultant, I:

  • Designed fraud analytics solutions for major Canadian banks and insurance firms.
  • Led applied data analytics projects, e.g., client segmentation and churn analysis.
  • Developed multiple pricing optimization models with statistical methods.

Research Associate (Intern)

PwC Canada, Consulting & Deals

May 2016 – Dec 2016 Toronto, ON, Canada

Applied research in people analytics with machine learning.

Supervised by Prof. Anthony Bonner. The project was partly supported by Mitacs-Accelerate Research and Development Funding (IT07884).

Software Engineer (Contract & Intern)

Siemens PLM Software USA

Mar 2012 – Dec 2014 Cincinnati, OH, USA

As a co-op student and contractor, I:

  • Managed a Java project to transition the LabManager system to vCloud Director.
  • Refactored outdated automation code and added new modules and JUnit test cases.
  • Led a C++ code coverage project on the Teamcenter platform to strengthen its stability.

Teaching Positions

Teaching Assistant for multiple courses

Carnegie Mellon University, Heinz College of Information Systems

Feb 2020 – May 2022 Pittsburgh, PA, USA

I was a teaching assistant for the following courses:

  • Intro to Artificial Intelligence taught by Prof. David Steier (Fall 2020, Spring 2021, Fall 2021, Spring 2022).
  • Digital Transformation taught by Prof. James Riel (Spring 2022).
  • Statistics for IT Managers taught by Prof. Daniel Nagin (Fall 2021).

My main duties included grading assignments and giving lectures on selected topics.

Teaching Assistant

University of Toronto, Department of Computer Science

Sep 2015 – Dec 2015 Toronto, ON, Canada

I was a teaching assistant for Embedded Systems taught by Prof. Philip Anderson.

Teaching Assistant

University of Cincinnati, Department of Electrical Engineering & Computer Science

Sep 2014 – Dec 2014 Cincinnati, OH, USA

I was a teaching assistant for Introduction to Programming taught by Prof. George Purdy.

Open-source Initiatives

To find more of my open-source initiatives, see my GitHub. Popular ones:

  • PyOD: A Python Toolbox for Scalable Outlier Detection (Anomaly Detection).
  • Therapeutics Data Commons (TDC): Machine learning for drug discovery.
  • ADBench: The most comprehensive tabular anomaly detection benchmark (30 anomaly detection algorithms on 57 benchmark datasets).
  • TOD: Tensor-based Outlier Detection, the first large-scale GPU-based system for accelerating outlier detection.
  • PyTorch Geometric (PyG): Graph Neural Network Library for PyTorch. Contributed to profiler & benchmarking, and heterogeneous data transformation.
  • SUOD: An Acceleration System for Large-scale Heterogeneous Outlier Detection.
  • Python Graph Outlier Detection (PyGOD): A Python Library for Graph Outlier Detection.
  • combo: A Python Toolbox for ML Model Combination (Ensemble Learning).
  • TODS: Time-series Outlier Detection. Contributed to core detection models.

ADBench

ADBench: Anomaly Detection Benchmark

PyGOD (Python Graph Outlier Detection)

A Python Library for Graph Outlier Detection (Anomaly Detection) and Its Benchmark

Therapeutics Data Commons (TDC)

Machine Learning Datasets and Tasks for Drug Discovery and Development

SUOD

SUOD: Accelerating Large-scale Unsupervised Heterogeneous Outlier Detection

combo

A Python Toolbox for Machine Learning Model Combination.

Python Outlier Detection Toolbox

PyOD: A Python Toolbox for Scalable Outlier Detection (Anomaly Detection).

Talks

Previous Talks

Contact

WeChat (微信) @ yzhao062 | WeChat @ 加群小助手 (group-joining assistant)