Solutions to Assignment 4

***1. Classify each of the following statements as an example of theoretical probability, relative frequency interpretation of probability, or personal (subjective) probability.

Ages of voters	Frequency (millions)
18 to 24 years old	32.3
25 to 44 years old	49.8
45 to 64 years old	64.1
65 years and over	67.6

a. What is the probability that a voter chosen randomly from these voters is 18 to 24 years old?

32.3/(32.3+49.8+64.1+67.6) ≈ 0.151

b. What is the probability that a voter chosen randomly from these voters is over 45 years old?

(64.1+67.6)/(32.3+49.8+64.1+67.6) ≈ 0.616

c. Suppose you choose two random voters. What is the probability that both are 18 to 24 years old?

Choosing two random voters can be viewed as essentially independent events, since the number of voters is so large. (In general, choosing without replacement is not independent.) So the probability is 0.151*0.151 ≈ 0.0228
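
The arithmetic in parts a through c can be double-checked with a short Python sketch. The variable names are my own, and part c uses the same independence approximation as above.

```python
# Frequencies (millions of voters) copied from the table above.
voters = {
    "18 to 24": 32.3,
    "25 to 44": 49.8,
    "45 to 64": 64.1,
    "65 and over": 67.6,
}
total = sum(voters.values())

# Part a: one randomly chosen voter is 18 to 24 years old.
p_18_24 = voters["18 to 24"] / total

# Part b: one randomly chosen voter falls in either of the two oldest groups.
p_45_plus = (voters["45 to 64"] + voters["65 and over"]) / total

# Part c: both of two voters are 18 to 24, treating the draws as independent
# (an approximation that is reasonable because the population is so large).
p_both_18_24 = p_18_24 ** 2

print(round(p_18_24, 3), round(p_45_plus, 3), round(p_both_18_24, 4))
```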

***3. Suppose a family has four children.

a. Describe the sample space for the gender distribution (boy, girl, boy, boy, etc.) of the four children. Note: the order matters! There are sixteen possibilities.

BBBB, BBBG, BBGB, BBGG, BGBB, BGBG, BGGB, BGGG, GBBB, GBBG, GBGB, GBGG, GGBB, GGBG, GGGB, GGGG

b. If the probability of having a male baby is 0.512, what is the probability of having four girls?

The probability of having one girl is 1 – 0.512 = 0.488. The probability of four girls is 0.488^4 ≈ 0.0567. This event is fairly uncommon.

(skip) c. Simulate this scenario in Excel as follows. In Column H type:

0 0.512

1 0.488

Then generate 10000 rows of four 0's and 1's based on the theoretical probability.

Paste the top 10 rows of your simulation into your Word document.

Here is what I got. The fifth column is the sum of the first four columns.

0 0 1 0 1

0 1 1 1 3

0 0 0 0 0

1 0 0 0 1

1 1 1 0 3

1 0 1 1 3

0 1 0 0 1

1 1 1 1 4

1 0 0 0 1

0 0 0 0 0

(skip) d. Using your simulation, calculate the relative frequency of having all girls. You should get an answer fairly close to the answer you got in part b. (Hint. The way I would do this is to add up each row and then count how many rows add up to 4, but there are other ways as well.)

I got 564/10000 = 0.0564 which is quite close to the theoretical 0.0567.
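
For anyone who wants to repeat the experiment outside Excel, here is a rough Python sketch of the same simulation. It is not the Excel procedure itself. The seed is an arbitrary choice of mine, so the count will not match the 564 above exactly, but the relative frequency should land near 0.0567.

```python
import random

P_ZERO = 0.512        # probability of a 0 (boy), from the lookup table in part c
N_FAMILIES = 10_000   # number of simulated rows, as in part c

rng = random.Random(0)  # fixed seed (arbitrary) so the run is repeatable


def simulate_family(rng):
    """Return four values: 0 with probability 0.512, 1 with probability 0.488."""
    return [0 if rng.random() < P_ZERO else 1 for _ in range(4)]


families = [simulate_family(rng) for _ in range(N_FAMILIES)]

# Add up each row, then count how many rows add up to 4 (all girls).
row_sums = [sum(family) for family in families]
all_girls = sum(1 for s in row_sums if s == 4)
rel_freq = all_girls / N_FAMILIES

print(all_girls, rel_freq)
```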

(skip) e. What is more likely in a family of four: having two boys and two girls or having three of one gender and one of the other gender? Your answer should include the calculation of the probability for each scenario. Hint. You can do this theoretically or you can use your simulation to obtain a very close approximation to the theoretical probability. If you proceed theoretically, use the sample space to count how many possibilities there are in each case. You have to consider two subcases in calculating the probability of having three of one gender and one of the other gender, namely the case of three girls and one boy and the case of three boys and one girl. If you use your simulation, use COUNTIF.

The theoretical probability of having two boys and two girls is 6*(0.512)^2*(0.488)^2 ≈ 0.375 (there are 6 ways: BBGG, BGBG, BGGB, GBBG, GBGB, GGBB). The theoretical probability of having three of one and one of the other is 4*(0.512)^3*0.488 + 4*0.512*(0.488)^3 ≈ 0.500. So having three of one gender and one of the other is more likely than having two boys and two girls.
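
The same numbers can be obtained by brute force from the sample space in part a. The sketch below is only a check, assuming independent births with P(boy) = 0.512. The function and variable names are mine.

```python
from itertools import product

P_BOY = 0.512          # given in part b
P_GIRL = 1 - P_BOY


def outcome_probability(outcome):
    """Probability of one ordered outcome such as 'BGGB' (independent births assumed)."""
    p = 1.0
    for child in outcome:
        p *= P_BOY if child == "B" else P_GIRL
    return p


# All 16 ordered outcomes, as listed in part a.
sample_space = ["".join(s) for s in product("BG", repeat=4)]

# Total probability for each possible number of boys (0 through 4).
prob_by_boys = {k: 0.0 for k in range(5)}
for outcome in sample_space:
    prob_by_boys[outcome.count("B")] += outcome_probability(outcome)

p_four_girls = prob_by_boys[0]
p_two_and_two = prob_by_boys[2]
p_three_and_one = prob_by_boys[1] + prob_by_boys[3]

print(f"four girls:    {p_four_girls:.4f}")
print(f"two and two:   {p_two_and_two:.4f}")
print(f"three and one: {p_three_and_one:.4f}")
```

The five values in prob_by_boys are also the bar heights one would plot to see why the three-and-one case wins.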

(skip) f. Did the result of part e surprise you? Why or why not?
This result surprised me the first time I heard it. Making a graph of the probabilities gives an idea of what is going on: the two bars next to the middle bar add up to more than the middle bar. Intuitively, it seems that those two bars should be smaller than they really are.

***4. The distribution of ABO and Rh blood type for both men and women in the US is given in the table below:

(skip) 5. When tossing a coin, a "run" is a string of consecutive results that are all the same. For example, in the sequence T T H H T T T T H T there is a run of length 2 (two T's) followed by a run of length 2 (two H's), followed by a run of length 4 (four T's), followed by a run of length 1 (one H), and a final run of length 1 (one T). The longest run in this case is one of length 4. Open the file SequencesOfTenCoinFlips.xls. This file contains the results of 50,000 sequences of 10 coin flips. As usual, "1" denotes heads, and "0" denotes tails. In addition, all but the top ten rows of Column L are filled with the length of the longest run for the coin flip sequence in that row.

The cells shaded in yellow show the ways one can get a sum of 8: (2,6), (3,5), (4,4), (5,3), and (6,2). Since 5 of the 36 equally likely outcomes give a sum of 8, the probability is 5/36 or about 0.139.
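
The count of 5 out of 36 can be confirmed by listing every roll of two dice. This is a minimal sketch assuming two fair six-sided dice. The variable names are mine.

```python
from fractions import Fraction
from itertools import product

# All 36 ordered pairs (first die, second die).
rolls = list(product(range(1, 7), repeat=2))

# Keep only the pairs whose sum is 8.
sum_is_8 = [roll for roll in rolls if sum(roll) == 8]

p_sum_8 = Fraction(len(sum_is_8), len(rolls))
print(sum_is_8)
print(p_sum_8, round(float(p_sum_8), 3))
```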

*** 8. A common data collection method in consumer and market research is the telephone survey. However, a major problem with consumer telephone surveys is nonresponse. A nonresponse can result from a person not being home, a person not answering, a person refusing to participate, etc. Research has shown that the probability of a successful response to a telephone survey is 0.08.

		ABO blood type
		O	A	B	AB	Total
Rh Factor	Rh positive	38.4%	32.3%	9.4%	3.2%
Rh Factor	Rh negative	7.7%	6.5%	1.7%	0.7%
	Total

		ABO blood type
		O	A	B	AB	Total
Rh Factor	Rh positive	38.4%	32.3%	9.4%	3.2%	83.3%
Rh Factor	Rh negative	7.7%	6.5%	1.7%	0.7%	16.6%
	Total	46.1%	38.8%	11.1%	3.9%	99.9%
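
The totals in the completed table are just row and column sums of the eight cell values, which the sketch below recomputes. The dictionary layout is my own. The grand total comes out at 99.9% rather than 100%, presumably because the cell percentages are rounded.

```python
# Cell percentages copied from the blood type table above.
table = {
    "Rh positive": {"O": 38.4, "A": 32.3, "B": 9.4, "AB": 3.2},
    "Rh negative": {"O": 7.7, "A": 6.5, "B": 1.7, "AB": 0.7},
}
abo_types = ["O", "A", "B", "AB"]

# Row totals: percentage of people with each Rh factor.
row_totals = {rh: round(sum(cells.values()), 1) for rh, cells in table.items()}

# Column totals: percentage of people with each ABO type.
col_totals = {abo: round(sum(table[rh][abo] for rh in table), 1) for abo in abo_types}

grand_total = round(sum(row_totals.values()), 1)

print(row_totals)
print(col_totals)
print(grand_total)
```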

Length of longest streak	Frequency	Percentage
1	92	0.2%
2	8397	16.8%
3	18155	36.3%
4	12454	24.9%
5	6193	12.4%
6	2745	5.5%
7	1161	2.3%
8	478	1.0%
9	225	0.5%
10	100	0.2%
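
As a cross-check on the simulated percentages, the exact distribution can be found by listing all 2^10 = 1024 possible sequences of ten flips. This sketch assumes a fair coin, so that every sequence is equally likely. The longest_run helper is my own and is not part of the spreadsheet.

```python
from collections import Counter
from itertools import groupby, product


def longest_run(flips):
    """Length of the longest string of consecutive equal values."""
    return max(len(list(group)) for _, group in groupby(flips))


# Every possible sequence of 10 flips (1 = heads, 0 = tails).
sequences = list(product((0, 1), repeat=10))

# How many sequences have each longest-run length.
counts = Counter(longest_run(seq) for seq in sequences)

for length in range(1, 11):
    share = counts[length] / len(sequences)
    print(f"{length:2d}  {counts[length]:4d}  {share:.1%}")
```

The exact shares should be close to the simulated percentages in the table, with the small differences one expects from 50,000 random sequences.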

	1	2	3	4	5	6
1	(1,1)	(1,2)	(1,3)	(1,4)	(1,5)	(1,6)
2	(2,1)	(2,2)	(2,3)	(2,4)	(2,5)	(2,6)
3	(3,1)	(3,2)	(3,3)	(3,4)	(3,5)	(3,6)
4	(4,1)	(4,2)	(4,3)	(4,4)	(4,5)	(4,6)
5	(5,1)	(5,2)	(5,3)	(5,4)	(5,5)	(5,6)
6	(6,1)	(6,2)	(6,3)	(6,4)	(6,5)	(6,6)

	Calculation	Probability
First success on first	0.08	0.0800
First success on second	0.92*0.08	0.0736
First success on third	0.92*0.92*0.08	0.0677
First success on fourth	0.92*0.92*0.92*0.08	0.0623
First success on fifth	0.92*0.92*0.92*0.92*0.08	0.0573
etc	etc
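
Each row of this table follows one pattern: k - 1 failures, each with probability 0.92, followed by one success with probability 0.08. A small sketch of that pattern, assuming the calls are independent (the function name is mine):

```python
P_SUCCESS = 0.08   # probability of a successful response, from the problem statement


def first_success_on(k, p=P_SUCCESS):
    """Probability that the first success happens on attempt k (independent attempts assumed)."""
    return (1 - p) ** (k - 1) * p


table_rows = [(k, round(first_success_on(k), 4)) for k in range(1, 6)]
for k, prob in table_rows:
    print(k, prob)
```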

0	0	0	1	0	0	0	1	1	1	3
1	0	1	0	0	0	1	0	1	1	3
0	1	1	1	0	1	0	0	1	1	3
1	0	0	1	0	1	1	1	1	0	4
0	1	1	0	1	1	0	1	1	0	2
0	1	1	0	0	1	1	1	0	0	3
0	0	1	0	1	1	1	1	1	0	5
0	1	1	0	0	0	1	1	1	1	4
1	1	0	1	0	0	1	0	0	1	2
0	0	1	0	0	0	1	1	0	1	3
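
In the ten rows above, the eleventh value appears to be the length of the longest run in the ten flips before it. The sketch below recomputes that column as a check. The longest_run helper is my own, not something from the spreadsheet.

```python
from itertools import groupby


def longest_run(flips):
    """Length of the longest string of consecutive equal values."""
    return max(len(list(group)) for _, group in groupby(flips))


# The ten flip sequences copied from the rows above (1 = heads, 0 = tails).
rows = [
    [0, 0, 0, 1, 0, 0, 0, 1, 1, 1],
    [1, 0, 1, 0, 0, 0, 1, 0, 1, 1],
    [0, 1, 1, 1, 0, 1, 0, 0, 1, 1],
    [1, 0, 0, 1, 0, 1, 1, 1, 1, 0],
    [0, 1, 1, 0, 1, 1, 0, 1, 1, 0],
    [0, 1, 1, 0, 0, 1, 1, 1, 0, 0],
    [0, 0, 1, 0, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 0, 0, 0, 1, 1, 1, 1],
    [1, 1, 0, 1, 0, 0, 1, 0, 0, 1],
    [0, 0, 1, 0, 0, 0, 1, 1, 0, 1],
]

longest = [longest_run(row) for row in rows]
print(longest)
```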
