
[Probability & Statistics] 4. Bayes’ Rule, Concept of a Random Variable

by Air’s Big Data 2020. 4. 18.



Chapter 2. Probability

Section 2.7 Bayes’ Rule

Figure 2.12 Venn diagram for the events A, E, and E′

 

Theorem 2.13 : Rule of Total Probability

If the events B1, B2, ..., Bk constitute a partition of the sample space S such that P(Bi) ≠ 0 for i = 1, 2, ..., k, then for any event A in S:

P(A) = Σ_{i=1..k} P(Bi ∩ A) = Σ_{i=1..k} P(Bi)P(A|Bi)


Figure 2.14 Partitioning the sample space S


Example 2.41

In a certain assembly plant, three machines, B1 , B2, and B3, make 30%, 45%, and 25%, respectively, of the products. It is known from past experience that 2%, 3%, and 2% of the products made by each machine, respectively, are defective. Now, suppose that a finished product is randomly selected. What is the probability that it is defective? 

 

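Let A be the event that the selected product is defective, and let Bi be the event that it was made by machine Bi.

Given that P(B1) = 0.30, P(B2) = 0.45, P(B3) = 0.25

P(A|B1) = 0.02, P(A|B2) = 0.03, P(A|B3) = 0.02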

 

a) P(A) = P(B1)P(A|B1) + P(B2)P(A|B2) + P(B3)P(A|B3)

         = 0.30*0.02 + 0.45*0.03 + 0.25*0.02 = 0.0245

b) If the selected product is found to be defective, the probability that it was made by machine B3 is

P(B3|A) = P(B3 ∩ A)/P(A) = P(B3)P(A|B3)/P(A) = (0.25*0.02)/0.0245 = 0.005/0.0245 ≈ 0.204082
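The arithmetic in this example can be reproduced with a short script. This is a minimal sketch: the numbers come from Example 2.41, while the names `priors`, `defect_rates`, `total_probability`, and `posterior` are illustrative choices, not from the original.

```python
# Priors P(Bi): share of products made by each machine (Example 2.41).
priors = {"B1": 0.30, "B2": 0.45, "B3": 0.25}
# Likelihoods P(A|Bi): defect rate of each machine (Example 2.41).
defect_rates = {"B1": 0.02, "B2": 0.03, "B3": 0.02}


def total_probability(priors, likelihoods):
    """Return P(A) = sum over i of P(Bi) * P(A|Bi) for a partition {Bi}."""
    return sum(priors[b] * likelihoods[b] for b in priors)


def posterior(event, priors, likelihoods):
    """Return P(event|A) = P(event) * P(A|event) / P(A)."""
    return priors[event] * likelihoods[event] / total_probability(priors, likelihoods)


p_defective = total_probability(priors, defect_rates)
p_b3_given_defective = posterior("B3", priors, defect_rates)

print(f"P(A)    = {p_defective:.4f}")
print(f"P(B3|A) = {p_b3_given_defective:.6f}")
```

Storing the partition as dictionaries keyed by machine keeps each prior paired with its defect rate, so the same two functions work for any number of machines.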

 


Theorem 2.14 : Bayes' Rule 

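If the events B1, B2, ··· , Bk constitute a partition of a sample space S and P(Bi) ≠ 0 for i = 1, 2, ··· , k, then for any event A in S such that P(A) ≠ 0:

P(Br|A) = P(Br ∩ A) / Σ_{i=1..k} P(Bi ∩ A) = P(Br)P(A|Br) / Σ_{i=1..k} P(Bi)P(A|Bi), for r = 1, 2, ··· , k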
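As a sketch of why the rule holds, it can be derived from the definition of conditional probability together with the rule of total probability (Theorem 2.13). The index r is just a label for the event of interest in the partition.

```latex
\begin{align*}
P(B_r \mid A)
  &= \frac{P(B_r \cap A)}{P(A)}
  && \text{(definition of conditional probability, } P(A) \neq 0\text{)} \\
  &= \frac{P(B_r)\,P(A \mid B_r)}{P(A)}
  && \text{(product rule, } P(B_r) \neq 0\text{)} \\
  &= \frac{P(B_r)\,P(A \mid B_r)}{\sum_{i=1}^{k} P(B_i)\,P(A \mid B_i)}
  && \text{(Theorem 2.13 applied to } P(A)\text{)}
\end{align*}
```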


Example 2.43

A manufacturing firm employs three analytical plans for the design and development of a particular product. For cost reasons, all three are used at varying times. In fact, plans 1, 2, and 3 are used for 30%, 20%, and 50% of the products, respectively. The defect rate is different for the three procedures as follows:

P(D|P1) = 0.01, P(D|P2) = 0.03, P(D|P3) = 0.02,

where P(D|Pj) is the probability of a defective product, given plan j. If a random product was observed and found to be defective, which plan was most likely used and thus responsible?

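From the usage rates, P(P1) = 0.30, P(P2) = 0.20, and P(P3) = 0.50, so by the rule of total probability

P(D) = 0.30*0.01 + 0.20*0.03 + 0.50*0.02 = 0.003 + 0.006 + 0.010 = 0.019

Applying Bayes' rule,

P(P1|D) = 0.003/0.019 ≈ 0.158

P(P2|D) = 0.006/0.019 ≈ 0.316

P(P3|D) = 0.010/0.019 ≈ 0.526

Given a defective product, plan 3 is the most likely to have been used.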
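Bayes' rule for this example can be checked numerically. The sketch below uses only the usage shares and defect rates stated in Example 2.43. The variable names are illustrative assumptions.

```python
# Usage shares P(Pj) and defect rates P(D|Pj), as given in Example 2.43.
plan_share = {"P1": 0.30, "P2": 0.20, "P3": 0.50}
defect_rate = {"P1": 0.01, "P2": 0.03, "P3": 0.02}

# Joint probabilities P(Pj and D) = P(Pj) * P(D|Pj).
joint = {plan: plan_share[plan] * defect_rate[plan] for plan in plan_share}

# Rule of total probability: P(D) is the sum of the joint probabilities.
p_defect = sum(joint.values())

# Bayes' rule: P(Pj|D) = P(Pj and D) / P(D).
plan_posterior = {plan: joint[plan] / p_defect for plan in joint}
most_likely_plan = max(plan_posterior, key=plan_posterior.get)

for plan, prob in plan_posterior.items():
    print(f"P({plan}|D) = {prob:.3f}")
print("Most likely plan:", most_likely_plan)
```

The products P(Pj)P(D|Pj) are only the numerators of Bayes' rule. Dividing each by P(D) turns them into posterior probabilities that sum to 1.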

 

 


Chapter 3. Random Variables and Probability Distributions

Section 3.1 Concept of a Random Variable

Definition 3.1

A random variable is a function that associates a real number with each element in the sample space.

 

Example 3.1

Two balls are drawn in succession without replacement from an urn containing 4 red balls and 3 black balls. The possible outcomes and the values y of the random variable Y, where Y is the number of red balls, are

Sample space    y
RR              2
RB              1
BR              1
BB              0
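The outcomes of this experiment and the value Y assigns to each can be enumerated directly. The sketch below is illustrative only. It labels outcomes by color in draw order (for example "RB" for red then black), and the probabilities at the end rest on an assumption not stated in the example: that every ordered draw of two balls is equally likely.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations

# Urn contents from Example 3.1: 4 red (R) and 3 black (B) balls.
urn = ["R"] * 4 + ["B"] * 3

# Every ordered draw of two distinct balls (drawing without replacement).
draws = list(permutations(range(len(urn)), 2))


def Y(outcome):
    """Random variable Y: number of red balls in an outcome such as 'RB'."""
    return outcome.count("R")


# Collapse ball-level draws into color outcomes: RR, RB, BR, BB.
outcome_counts = Counter(urn[i] + urn[j] for i, j in draws)
values_of_y = {outcome: Y(outcome) for outcome in sorted(outcome_counts)}

# Assumption (not in the example): each ordered draw is equally likely.
pmf_of_y = {}
for outcome, count in outcome_counts.items():
    y = Y(outcome)
    pmf_of_y[y] = pmf_of_y.get(y, Fraction(0)) + Fraction(count, len(draws))

print(values_of_y)
print(pmf_of_y)
```

Note that Y maps the two different outcomes RB and BR to the same number, 1. A random variable does not have to be one-to-one.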


Examples of RVs

・ Consider a coin flip

 : S={H,T}

 Define a random variable X: S → R

 : X(s)=1 if s=H, X(s)=0 if s=T

 Define another RV Y: S → R

 : Y(s)=0 if s=H, Y(s)=1 if s=T

・ Consider a roll of a die

 : S={1,2,3,4,5,6}

 Define a random variable X: S → R

 : X(s)=s, for s=1,2,3,4,5,6

 Define another RV Y: S → R

 : Y(s)=0 if s=1,3,5, Y(s)=1 if s=2,4,6

 Define another RV Z: S → R

 : Z(s)=2s+1, for s=1,...,6
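Because a random variable is just a function on the sample space, the examples above translate directly into code. In this minimal sketch, the function names (`coin_X`, `die_Y`, and so on) and the helper `value_range` are hypothetical labels chosen to keep the coin and die variables apart.

```python
# Sample spaces from the examples above.
coin_space = ["H", "T"]
die_space = [1, 2, 3, 4, 5, 6]


# Coin flip: X marks heads with 1, Y marks tails with 1.
def coin_X(s):
    return 1 if s == "H" else 0


def coin_Y(s):
    return 1 if s == "T" else 0


# Die roll: X is the face value, Y flags even faces, Z rescales the face.
def die_X(s):
    return s


def die_Y(s):
    return 0 if s % 2 == 1 else 1


def die_Z(s):
    return 2 * s + 1


def value_range(rv, sample_space):
    """Return the sorted set of values the random variable takes on the space."""
    return sorted({rv(s) for s in sample_space})


print(value_range(coin_X, coin_space))
print(value_range(die_Y, die_space))
print(value_range(die_Z, die_space))
```

The same sample space supports many different random variables. Which one to define depends on what aspect of the outcome is being measured.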


Definition 3.2

If a sample space contains a finite number of possibilities or an unending sequence with as many elements as there are whole numbers (countable), it is called a discrete sample space.

  (Example) Outcomes of a roll of a die : S={1,2,3,4,5,6}


Definition 3.3

If a sample space contains an infinite number of possibilities equal to the number of points on a line segment, it is called a continuous sample space.

  (Example) Temperature in Seoul during 2018 : S={x | -15 < x < 38}


Definition

 

 A random variable X is said to be a discrete RV if the range of X is discrete

  – That is, X(·) takes discrete values
  – Example: S={1,...,6}, X(s)=2s for s in S

 

 A random variable X is said to be a continuous RV if the range of X is continuous

  – Example:



Why use a random variable?

(1) Working directly with sample spaces is not mathematically convenient, because outcomes need not be numbers

 : S={H,T}

(2) By mapping the sample space (and events) to real numbers with an RV, it becomes much more convenient to define and work with probability mathematically
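To illustrate point (2), an event can be written as a statement about a random variable, such as {Y = 1}, and its probability computed by summing over the outcomes that Y maps to that value. The sketch below reuses the die-roll variable Y from the earlier examples. It assumes a fair die, which the original does not state, and the helper name `prob_rv_equals` is hypothetical.

```python
from fractions import Fraction

# Die-roll sample space from the earlier examples.
sample_space = [1, 2, 3, 4, 5, 6]

# Assumption (not in the original text): the die is fair.
prob_of_outcome = {s: Fraction(1, 6) for s in sample_space}


def Y(s):
    """Y(s) = 0 for odd faces, 1 for even faces."""
    return 0 if s % 2 == 1 else 1


def prob_rv_equals(rv, value):
    """P(rv = value): total probability of the outcomes that rv maps to value."""
    return sum(
        (p for s, p in prob_of_outcome.items() if rv(s) == value),
        Fraction(0),
    )


print(prob_rv_equals(Y, 1))
```

Here the event "the roll is even" becomes the numeric statement Y = 1, so probabilities can be attached to numbers rather than to descriptions of outcomes.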


 
