I am a Master of Computational Data Science student at Carnegie Mellon University. My main research interests are Multimodal Deep Learning and Multimodal Question Answering. Broadly, my work lies at the intersection of Deep Learning, Machine Learning, Natural Language Processing, Computer Vision, Artificial Intelligence, Multimodal Learning, and Data Science (including Statistics).

I completed my bachelor's degree at the National Institute of Technology, Hamirpur, India.

Select Projects

Atom Smashing using Machine Learning at CERN (CERN Openlab project)

Used Apache Spark to streamline predictive prototypes built on information gathered from CMS, ran predictive models, and predicted which datasets would become popular over time. Evaluated the quality of individual models, performed component analysis, and selected the best predictive model for new data.

Publication: Siddha Ganju, Valentin Kuznetsov, Tony Wildish, Manuel Martin Marquez, Antonio Romero Marin (2015). Evaluation of Apache Spark as an Analytics framework for CERN’s Big Data Analytics.


Presented at Strata+Hadoop World 2016, San Jose, USA

O’Reilly blog post I about my talk, also featured in their data newsletter.

O’Reilly blog post II about my talk, also featured in their data newsletter.

See here for news coverage and Twitter feed.

Automated Pipeline for Machine Learning Problems

Created a Python command-line toolkit using the scikit-learn, NumPy, pandas, and matplotlib libraries to solve machine learning problems automatically. Imputation and hyperparameter optimization placed my model in the top 10% of the Kaggle Titanic challenge (rank 198 out of 2035 in July 2014). Experimented with large datasets and deployed on a Hadoop cluster on AWS.

Mentor: Anirudh Koul, Data Scientist, Microsoft

Presented at Grace Hopper 2015

In progress


Panelist, IBM+Apache Spark Maker Community Event, San Francisco, USA, 2016. News coverage and Twitter feed

Speaker, Strata + Hadoop World, San Jose, 2016. Gave a talk about my research at CERN, the European Organization for Nuclear Research.


Invited to the Open Leadership Cohort, Working Open Workshop, Mozilla Science Lab, Berlin, Germany, 2016

Poster Presentation, Grace Hopper Conference, Texas, 2015. News coverage

Talk, MozFest, London, 2015, by Team OpenCosmics. Git announcement by Mozilla


Scholar, Grace Hopper Conference, Texas, USA, 2015

Winner, Best Innovative Outreach, CERN WebFest, Geneva, Switzerland, 2015

Winner, Grace Hopper Conference Hackathon, Bangalore, India, 2014

Finalist, New York University International Hackathon, Abu Dhabi, 2014

Winner, India Scholarship Award, Institution of Engineering and Technology (IET), Delhi, India, 2013. News coverage


Student Women Representative, Community Volunteers Conference, IET, Sri Lanka, 2013


You can contact me at:

Carnegie Mellon University
Language Technologies Institute
5000 Forbes Avenue
Gates Hillman Center – 5404
Pittsburgh, PA 15213

Curriculum Vitae

Check out my CV

Siddha Ganju