

| Statistics | Computer Science | Meaning |
| --- | --- | --- |
| estimation | learning | using data to estimate an unknown quantity |
| classification | supervised learning | predicting a discrete $Y$ from $X$ (groups/categories are known) |
| clustering | unsupervised learning | putting data into "clusters" (categories are not known beforehand) |
| data | training sample | $(X_{1}, Y_{1}), \ldots, (X_{n}, Y_{n})$ |
| covariates | features | the $X_{i}$'s |
| classifier | hypothesis | a map from covariates to outcomes |
| hypothesis | | subset of a parameter space $\Theta$ |
| confidence interval | | interval that contains an unknown quantity with given frequency |
| directed acyclic graph | Bayes net | multivariate distribution with given conditional independence relations |
| Bayesian inference | Bayesian inference | statistical methods for using data to update beliefs |
| frequentist inference | | statistical methods with guaranteed frequency behavior |
| large deviation bounds | PAC learning | uniform bounds on probability of errors |
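
Three of the entries above (estimation, confidence interval, large deviation bounds) can be tied together in a small simulation. The following is a minimal sketch, not a method taken from the text: it assumes Bernoulli data, uses Hoeffding's inequality as one example of a large deviation bound, and the names `hoeffding_interval` and `coverage` are hypothetical helpers introduced only for illustration.

```python
import math
import random


def hoeffding_interval(sample, alpha=0.05):
    """Estimate the mean of values in [0, 1]; return (estimate, lower, upper).

    Assumption: the observations are independent and lie in [0, 1].
    """
    n = len(sample)
    estimate = sum(sample) / n  # estimation: use data to estimate the unknown mean
    # Hoeffding's inequality: P(|estimate - mean| > eps) <= 2 * exp(-2 * n * eps**2).
    # Setting the right-hand side equal to alpha and solving for eps gives:
    eps = math.sqrt(math.log(2 / alpha) / (2 * n))
    return estimate, max(0.0, estimate - eps), min(1.0, estimate + eps)


def coverage(p=0.3, n=200, trials=2000, alpha=0.05, seed=0):
    """Fraction of simulated samples whose interval contains the true mean p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [1 if rng.random() < p else 0 for _ in range(n)]
        _, lower, upper = hoeffding_interval(sample, alpha)
        hits += lower <= p <= upper
    return hits / trials


if __name__ == "__main__":
    print(coverage())
```

Because Hoeffding's bound is conservative, the observed coverage in such a simulation should be at least $1 - \alpha$, which is the "given frequency" in the confidence interval entry.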






