I am a Ph.D. student at H. John Heinz III College, Carnegie Mellon University. I am advised by Prof. Leman Akoglu and Prof. Amelia Haviland. My research focuses on data mining algorithms, systems, and applications, and on their implications for decision processes and policy-making.
I am an active software developer with more than 4,000 GitHub stars in total (top 1,500 among 37,000,000 GitHub developers ranked by Gitstar Ranking). I have led multiple popular open-source machine learning initiatives, including pyod (more than 100,000 total downloads), combo, anomaly-detection-resources, and awesome-ensemble-learning. Before coming to CMU, I had more than five years of industry experience as a software engineer and management consultant. See my professional experience for more information.
Ph.D. in Information Systems & Management (Primary); Joint Ph.D. in Machine Learning & Public Policy (Expected), 2019-2024
Carnegie Mellon University
M.S. in Applied Computing, 2016
University of Toronto
B.S. in Computer Engineering, 2015
University of Cincinnati
High School Diploma, 2010
Shanxi Experimental Secondary School (山西省实验中学)
I am open to peer review opportunities in outlier and anomaly detection, ensemble learning, clustering, and ML systems. Please send me an email (email@example.com) or a request in the corresponding reviewing system.
July 2019: I started a new Python toolbox, combo, for the easy use of combination methods in machine learning.
Mar 7th, 2019: Our paper on music artist classification with deep nets was accepted at the International Joint Conference on Neural Networks (IJCNN).
I am a dedicated technical writer with more than 200 articles (in Chinese) and 80,000 followers on Zhihu (知乎), the Chinese counterpart of Quora (200 million+ registered users). Since 2018, I have been officially recognized as a “Top Zhihu Writer” (优秀回答者) in four fields (AI, ML, DM, and STAT). See my Zhihu page.
If needed, high-resolution profile pictures can be downloaded here:
[w18a] DivBoost: Constructing Effective Outlier Ensembles by Base Learner Diversity Maximization
[w19a] HD-Cluster: Synthesized Cluster Analysis and Outlier Detection on High-dimensional Data
[w19c] [A new statistical model. *Name masked due to double-blind review policy] AAAI Conference on Artificial Intelligence (AAAI), 2019. Submitted, under review.
[w19d] [Combining Machine Learning Models and Scores using combo library] AAAI Conference on Artificial Intelligence (AAAI), demo track, 2019. To submit.
I am an enthusiastic open-source developer: I build machine learning libraries and systems. Specifically, I started the Python Outlier Detection library (PyOD) in 2018, which has become the most popular Python outlier detection toolkit. I also started combo: A Python Toolbox for Machine Learning Model Combination in July 2019; it is currently under active development. Watches, stars, and follows are welcome!
Applied research in people analytics: building machine learning models for various people analytics projects.
Supervised by Prof. Anthony Bonner; the project is partly supported by Mitacs-Accelerate Research and Development Funding (IT07884).