Learn, Think & Do

Enjoy Randomness !

The Independence of Events (P-5)

Independence of events in a sample space is one of the most important ideas in probability theory, and the same notion carries over to random variables. It is an inherent or desired assumption for almost all statistical samples and observed data. In this post I am going to address the following questions:

  1. What are independent events?
  2. Why do they come naturally in probability?
  3. When do we call two or more random variables independent?

Independent Events

Let S be the given sample space and P the probability function defined on it, and let A and B be two events in S. There are two main ways to say that A and B are independent. First, if P(A|B)=P(A) (which requires P(B)>0), we say that the two events are independent. Second, if P(A \cap B)=P(A)P(B), we also say that the two events are independent. The second form follows from the first through the definition of conditional probability, P(A|B)=P(A \cap B)/P(B), and it is also called the multiplication rule.

We use the multiplication rule implicitly in experiments like tossing a coin twice or rolling a die three times, where we multiply the probabilities for each toss or roll. For example, what is the probability of getting heads on both tosses of a fair coin? Call the event of a head on the first toss A and the event of a head on the second toss B. What we want is the probability of the event A \cap B. We multiply the two probabilities, i.e. 0.5*0.5=0.25, because the outcome of the first toss does not affect the second and vice versa.

But what if we want the probability of getting at least one head? Call this event C, so P(C)=1-P(no heads)=1-0.25=0.75. Now we cannot simply multiply the probabilities, because ‘at least one’ describes both tosses together, so C is tied to each individual toss. What is P(A\cap C)? We know P(A \cap C)=P(A)+P(C)-P(A \cup C), and P(A \cup C)=P(C) because A is contained in C. So we get P(A \cap C)=P(A)=0.5 \neq P(A)P(C)=0.5*0.75=0.375, and therefore A and C are not independent. Similarly one can show that B and C are not independent.

That covers two events; what about more than two? Suppose A_1, A_2 and A_3 are three events of the given sample space. Is P(A_1 \cap A_2 \cap A_3)=P(A_1)P(A_2)P(A_3) enough to call them independent? Sadly the answer is no. Consider the sample space of rolling two dice, S=\{(1,1),(1,2),...,(6,1),...,(6,6)\}. Let A=\{(1,1),(2,2),...,(6,6)\} (the doubles), B=\{the sum is between 7 and 10 inclusive\}, and C=\{the sum is 2 or 7 or 8\}. Check that P(A)=1/6, P(B)=1/2, P(C)=1/3 and P(A \cap B \cap C)=P(\{(4,4)\})=1/36=1/6 \times 1/2 \times 1/3. But P(B \cap C)=11/36 \neq P(B)P(C)=1/6. Similarly P(A \cap B)=1/18 \neq P(A)P(B)=1/12. (A and C do happen to satisfy P(A \cap C)=1/18=P(A)P(C), but one failing pair is enough.) So the triple product condition does not even guarantee pairwise independence, and we need to define independence in some other way for more than two events. This example is taken from Casella and Berger.
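The coin and dice calculations above are small enough to check by brute-force enumeration. The following is a minimal sketch, assuming all outcomes are equally likely; the helper names `prob` and `independent` are made up for this illustration. It prints, for every pair of events, the probability of the intersection next to the product of the individual probabilities, so you can see which pairs satisfy the multiplication rule.

```python
from fractions import Fraction
from itertools import combinations, product


def prob(space, *events):
    """P(intersection of the given events) when all outcomes in `space` are equally likely."""
    hits = sum(1 for outcome in space if all(event(outcome) for event in events))
    return Fraction(hits, len(space))


def independent(space, e1, e2):
    """True when the multiplication rule P(E1 and E2) == P(E1) * P(E2) holds."""
    return prob(space, e1, e2) == prob(space, e1) * prob(space, e2)


# Two tosses of a fair coin: A = head first, B = head second, C = at least one head.
coin = list(product("HT", repeat=2))
coin_events = {
    "A": lambda w: w[0] == "H",
    "B": lambda w: w[1] == "H",
    "C": lambda w: "H" in w,
}

# Two dice: A = doubles, B = sum between 7 and 10, C = sum is 2, 7 or 8.
dice = list(product(range(1, 7), repeat=2))
dice_events = {
    "A": lambda w: w[0] == w[1],
    "B": lambda w: 7 <= sum(w) <= 10,
    "C": lambda w: sum(w) in (2, 7, 8),
}

# Compare P(E1 and E2) with P(E1) * P(E2) for every pair in both experiments.
for label, space, events in [("coin", coin, coin_events), ("dice", dice, dice_events)]:
    for (n1, e1), (n2, e2) in combinations(events.items(), 2):
        joint = prob(space, e1, e2)
        rule = prob(space, e1) * prob(space, e2)
        verdict = "independent" if independent(space, e1, e2) else "dependent"
        print(label, n1, n2, joint, rule, verdict)

# Triple intersection for the dice events versus the product of the three probabilities.
triple = prob(dice, *dice_events.values())
triple_rule = Fraction(1)
for event in dice_events.values():
    triple_rule *= prob(dice, event)
print("dice A, B, C:", triple, "vs", triple_rule)
```

Using `Fraction` keeps the comparison exact, which matters here because the whole question is whether two numbers are equal.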

Now what if all three events are pairwise independent, i.e., if we have P(A \cap B)=P(A)P(B), P(A \cap C)=P(A)P(C) and P(B \cap C)=P(B)P(C), does it follow that P(A \cap B \cap C)=P(A)P(B)P(C)? Unfortunately the answer in this case is also no. Consider the sample space S=\{aaa, bbb, ccc, abc, acb, bca, bac, cba, cab\}, with each element having probability 1/9. Let A_i=\{the ith place in the triple is occupied by a\}. We can calculate that P(A_i)=1/3 for all i=1,2,3 and P(A_1 \cap A_2)=P(A_2 \cap A_3)=P(A_1\cap A_3)=1/9 (only aaa has a in two given places), which means the A_i are pairwise independent, but P(A_1 \cap A_2 \cap A_3)=1/9 \neq P(A_1)P(A_2)P(A_3)=1/27.

So for the simultaneous (or mutual, as it is generally called) independence of more than two events we need a stronger definition: a collection of events A_1,A_2,..., A_n is mutually independent if for every subcollection A_{i_1}, A_{i_2},..., A_{i_k} (any choice of k \geq 2 distinct events from the given n), we have P(\bigcap_{j=1}^{k}A_{i_j})=\prod_{j=1}^{k}P(A_{i_j}).
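The definition of mutual independence translates almost directly into code: loop over every subcollection of size two or more and test the product formula. Here is a minimal sketch on the nine-element sample space above, assuming equally likely outcomes; the function names `factorizes`, `pairwise_independent` and `mutually_independent` are made up for this illustration.

```python
from fractions import Fraction
from itertools import combinations

# The nine equally likely triples from the example.
space = ["aaa", "bbb", "ccc", "abc", "acb", "bca", "bac", "cba", "cab"]


def prob(events):
    """P(intersection of `events`) under the uniform distribution on `space`."""
    hits = sum(1 for w in space if all(event(w) for event in events))
    return Fraction(hits, len(space))


def factorizes(events):
    """True when P(intersection) equals the product of the individual probabilities."""
    expected = Fraction(1)
    for event in events:
        expected *= prob([event])
    return prob(events) == expected


def pairwise_independent(events):
    """Check the product formula for every pair only."""
    return all(factorizes(pair) for pair in combinations(events, 2))


def mutually_independent(events):
    """Check the product formula for every subcollection of size 2..n."""
    return all(
        factorizes(sub)
        for k in range(2, len(events) + 1)
        for sub in combinations(events, k)
    )


# A_i = "the i-th place of the triple is occupied by a" (0-based index in code).
A = [lambda w, i=i: w[i] == "a" for i in range(3)]

print("pairwise independent:", pairwise_independent(A))
print("mutually independent:", mutually_independent(A))
```

For n events the mutual check visits 2^n - n - 1 subcollections, so this brute-force approach is only practical for small n.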

Why are independent events so natural?

This comes from the way the product measure, and integration with respect to it, is defined (no need to worry about these terms if you do not understand them). Let A and B be events in sample spaces S_1 and S_2, with probability functions P_1 and P_2 respectively. Then A\times B=\{(a,b): a\in A, b\in B\} is an event in the Cartesian product S_1 \times S_2, and we would like the probability function P on the product to satisfy P(A \times B)= P_1(A) \times P_2(B). In particular we would like to have P(A\times S_2)=P_1(A)\times P_2(S_2)=P_1(A), since P_2(S_2)=1, and similarly P(S_1\times B)=P_2(B). Also note that (S_1\times B)\cap (A\times S_2)=A\times B, so P((A\times S_2)\cap (S_1\times B))=P(A\times S_2)P(S_1\times B), which is exactly the multiplication rule. That is how intersection enters the picture.
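To make the product construction concrete, here is a minimal sketch with two finite sample spaces. The biased coin and the fair die are assumptions chosen only for illustration, as is the helper name `measure`. It builds P on S_1 \times S_2 by multiplying point probabilities and checks the three identities from the paragraph above.

```python
from fractions import Fraction
from itertools import product

# Hypothetical component experiments: a biased coin (S_1) and a fair die (S_2).
P1 = {"H": Fraction(1, 3), "T": Fraction(2, 3)}
P2 = {face: Fraction(1, 6) for face in range(1, 7)}

# Product probability on S_1 x S_2: each pair gets the product of its point masses.
P = {(a, b): P1[a] * P2[b] for a, b in product(P1, P2)}


def measure(event):
    """Probability of a set of (a, b) pairs under the product probability P."""
    return sum(P[w] for w in event)


A = {"H"}          # event in S_1
B = {2, 4, 6}      # event in S_2

A_times_S2 = set(product(A, P2))   # A x S_2: "coin shows H, die is anything"
S1_times_B = set(product(P1, B))   # S_1 x B: "coin is anything, die is even"
A_times_B = set(product(A, B))     # A x B

print(measure(A_times_S2), sum(P1[a] for a in A))   # P(A x S_2) and P_1(A)
print(measure(S1_times_B), sum(P2[b] for b in B))   # P(S_1 x B) and P_2(B)
print(A_times_S2 & S1_times_B == A_times_B)         # the intersection is A x B
print(measure(A_times_B), measure(A_times_S2) * measure(S1_times_B))
```

The last line compares P(A \times B) with P(A \times S_2) P(S_1 \times B), which is the multiplication rule for the two "cylinder" events.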

Independence of random variables

As with the independence of two events, we can define independence of two random variables in the same fashion: let X and Y be two random variables, then they are said to be independent if P(X\in B_1, Y \in B_2)=P(X\in B_1)P(Y\in B_2) for all (Borel) sets B_1, B_2 \subset \mathbb{R}. This can be extended easily to more than two random variables. In terms of the joint pdf/pmf and joint cdf (we won’t be discussing the definition of a joint distribution in this blog, search online for a quick introduction), independence means that they factor variable by variable. That is, if F_{X,Y} is the joint cdf of X and Y (simply said, it is just a double sum or double integral of the joint pmf/pdf), and F_X and F_Y are the cdfs of X and Y respectively, then when X and Y are independent we have F_{X,Y}(x,y)=F_X(x)F_Y(y) for all x and y. The same holds for the joint pdf/pmf, that is f_{X,Y}(x,y)=f_X(x)f_Y(y). This makes life easy, because a multiple integral can often be turned into a product of single integrals.
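For discrete random variables the factorization f_{X,Y}=f_X f_Y can be tested directly. Below is a minimal sketch, assuming two fair dice; the variable choices (X = first die, Y = second die, and the sum of the two) and the helper names are made up for this illustration. It builds the joint pmf, computes the marginals by summing, and checks whether the joint pmf equals the product of the marginals at every point.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product


def joint_pmf(pairs):
    """Joint pmf of (X, Y) from a list of equally likely (x, y) values."""
    pmf = defaultdict(Fraction)          # Fraction() is 0, so unseen keys start at 0
    for xy in pairs:
        pmf[xy] += Fraction(1, len(pairs))
    return dict(pmf)


def marginals(pmf):
    """Marginal pmfs f_X and f_Y obtained by summing the joint pmf."""
    f_x, f_y = defaultdict(Fraction), defaultdict(Fraction)
    for (x, y), p in pmf.items():
        f_x[x] += p
        f_y[y] += p
    return dict(f_x), dict(f_y)


def factorizes(pmf):
    """True when f_{X,Y}(x, y) == f_X(x) * f_Y(y) for every x and y."""
    f_x, f_y = marginals(pmf)
    return all(pmf.get((x, y), 0) == f_x[x] * f_y[y] for x in f_x for y in f_y)


rolls = list(product(range(1, 7), repeat=2))

# X = first die, Y = second die.
first_and_second = joint_pmf(rolls)

# X = first die, Y = sum of both dice.
first_and_sum = joint_pmf([(a, a + b) for a, b in rolls])

print("first die vs second die factorizes:", factorizes(first_and_second))
print("first die vs sum factorizes:", factorizes(first_and_sum))
```

The check must cover every (x, y) combination, including those with joint probability zero, which is why `factorizes` uses `pmf.get((x, y), 0)` rather than looping over the keys of the joint pmf alone.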
