Adapted (mostly verbatim) from Prof Larry Wasserman's excellent book "All of Statistics".

| Statistics | Computer Science | Meaning |
|---|---|---|
| estimation | learning | using data to estimate an unknown quantity |
| classification | supervised learning | predicting a discrete $Y$ from $X$ (groups/categories are known) |
| clustering | unsupervised learning | putting data into "clusters" (categories are not known beforehand) |
| data | training sample | $(X_1, Y_1), (X_2, Y_2), (X_3, Y_3), \ldots, (X_{n-1}, Y_{n-1}), (X_n, Y_n)$ |
| covariates | features | the $X_i$'s |
| classifier | hypothesis | a map from covariates to outcomes |
| hypothesis | --- | subset of a parameter space $\Theta$ |
| confidence interval | --- | interval that contains an unknown quantity with given frequency |
| directed acyclic graph | Bayes net | multivariate distribution with given conditional independence relations |
| Bayesian inference | Bayesian inference | statistical methods for using data to update beliefs |
| frequentist inference | --- | statistical methods with guaranteed frequency behavior |
| large deviation bounds | PAC learning | uniform bounds on probability of errors |
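
To make a few of the terms above concrete, here is a minimal Python sketch that is not taken from the book. The data values, the function name `fit_threshold_classifier`, and the midpoint-of-class-means rule are all assumptions chosen purely for illustration. It shows a training sample of $(X_i, Y_i)$ pairs, an estimation (learning) step, and the resulting classifier (hypothesis) as a map from a covariate to an outcome.

```python
# Data / training sample: pairs (X_i, Y_i).
# These values are made up for illustration only.
training_sample = [(1.0, 0), (1.5, 0), (2.0, 0), (4.0, 1), (4.5, 1), (5.0, 1)]


def fit_threshold_classifier(sample):
    """Estimation / learning: use the data to estimate an unknown quantity.

    Here the unknown quantity is a decision threshold, estimated as the
    midpoint between the two class means. This rule is a hypothetical
    choice for the sketch and assumes class 1 has the larger mean.
    """
    xs0 = [x for x, y in sample if y == 0]  # covariates/features with Y = 0
    xs1 = [x for x, y in sample if y == 1]  # covariates/features with Y = 1
    mean0 = sum(xs0) / len(xs0)
    mean1 = sum(xs1) / len(xs1)
    threshold = (mean0 + mean1) / 2

    def classifier(x):
        # Classifier / hypothesis: a map from a covariate to an outcome.
        return 1 if x > threshold else 0

    return classifier, threshold


classifier, threshold = fit_threshold_classifier(training_sample)

# Classification / supervised learning: predict a discrete Y from a new X.
predictions = [classifier(x) for x in (1.2, 4.8)]
print(threshold, predictions)
```

With this toy sample the estimated threshold sits between the two groups, so a point near the first group is assigned class 0 and a point near the second is assigned class 1.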