Statistical Independence

Intuitively, two variables are independent when knowing about one variable adds nothing to what you know about the other. For example, hair color and gender are independent. Knowing someone's hair color adds nothing to your knowledge of their gender. Height and weight are dependent, however. Knowing someone's height does not determine their weight, but you know more about their weight after you have been told their height than before. For example, if you know nothing about Clark and Peter, you have no basis for guessing whether Clark is heavier than Peter. If I tell you that Clark is 6'2" and Peter is 5'5", now you know something, and it's a good guess that Clark weighs more. Now if I tell you that Peter eats no breakfast and only a bagel for lunch, and that Clark is to dieting what Liz Taylor is to marriage, you have even more information.

Probabilistic or statistical independence is a way to formalize these intuitions. Two random variables X and Y are probabilistically independent if the probability distribution over X is the same as the probability distribution over X conditional on Y=y, for every value y that Y can take on with nonzero probability. That is, for all y with P(Y=y) > 0, P(X) = P(X|Y=y).
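
To make the definition concrete, here is a minimal Python sketch that checks it directly on a small discrete joint distribution. The function names (marginal_x, conditional_x_given_y, is_independent), the dictionary layout, and the probabilities in the two example tables are all assumptions made up for illustration; they are not real data and not part of the definition itself.

```python
from collections import defaultdict
from math import isclose


def marginal_x(joint):
    """Return P(X) from a joint table {(x, y): P(X=x, Y=y)}."""
    px = defaultdict(float)
    for (x, _y), p in joint.items():
        px[x] += p
    return dict(px)


def conditional_x_given_y(joint, y):
    """Return P(X | Y=y); undefined (raises) when P(Y=y) is zero."""
    py = sum(p for (_x, yy), p in joint.items() if yy == y)
    if py == 0:
        raise ValueError(f"P(Y={y!r}) is zero, so P(X | Y={y!r}) is undefined")
    return {x: p / py for (x, yy), p in joint.items() if yy == y}


def is_independent(joint, tol=1e-9):
    """Check P(X) = P(X | Y=y) for every y with P(Y=y) > 0."""
    px = marginal_x(joint)
    for y in {yy for (_x, yy) in joint}:
        py = sum(p for (_x, yy), p in joint.items() if yy == y)
        if py == 0:
            continue  # conditioning on a zero-probability value is skipped
        cond = conditional_x_given_y(joint, y)
        for x, p in px.items():
            # A missing (x, y) entry is treated as probability zero.
            if not isclose(cond.get(x, 0.0), p, abs_tol=tol):
                return False
    return True


# Hypothetical table built as a product of marginals, so X and Y are independent.
independent_joint = {
    ("a", 0): 0.12, ("a", 1): 0.18,
    ("b", 0): 0.28, ("b", 1): 0.42,
}

# Hypothetical height/weight table in which tall people are more often heavy.
dependent_joint = {
    ("tall", "heavy"): 0.4, ("tall", "light"): 0.1,
    ("short", "heavy"): 0.1, ("short", "light"): 0.4,
}

print(is_independent(independent_joint))  # True
print(is_independent(dependent_joint))    # False
```

In the second table, P(X=tall) is 0.5, but P(X=tall | Y=heavy) is 0.4 / 0.5 = 0.8, so learning Y changes the distribution over X and the check fails. The tolerance is needed only because floating-point sums are not exact.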