Midterm Review and Practice

I.  Correlation -

Practice Problem

1. Open the file GestationLongevity.xls

a.  Make an XY scatterplot of gestation vs. average longevity.  Include a trendline (no equation) and the correlation coefficient.

b.  Open CriticalValuesForR.xls.  Is the relationship between gestation time and average longevity statistically significant?
 

We have 43 points, and our r-value of 0.671 is greater than the critical value of 0.403, so we are confident at the 99% level. Therefore the relationship is statistically significant.
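The significance check above can be sketched in Python. This is only an illustration: the function names `pearson_r` and `is_significant` are made up for this sketch, the four-point data set is a toy example (not the gestation data), and the critical value still has to be read from the table for your number of points.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def is_significant(r, critical_value):
    """True when |r| exceeds the critical value read from the table."""
    return abs(r) > critical_value

# Toy data (hypothetical): a perfectly linear relationship gives r = 1.
toy_r = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])

# Values reported in the answer above: r = 0.671, critical value = 0.403.
significant = is_significant(0.671, 0.403)
print(significant)  # True
```

The absolute value matters because a strongly negative r is just as significant as a strongly positive one.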

c.  Discuss the strength (strong, moderate, or weak) of the relationship between these two variables.  Also discuss whether the relationship is positively or negatively correlated.

There is a positive relationship between gestation and average longevity. This means that, generally speaking, as gestation time increases, longevity increases. Since our r-value is still under 0.7, we do not consider this relationship to be very strong. However, we notice there are potential outliers that may be influencing our r-value.

d.  Are there any outliers in this dataset? What are they? Are there any benefits to removing the outliers? Are there any negatives to removing the outliers? What might you do in this case?

Three possible outliers are Hippopotamus, Elephant (Asian), and Elephant (African). If we remove the outliers, our r-value actually decreases to 0.578. This may suggest that the outliers are causing our data to appear more correlated than it actually is. However, our graph is about mammals, and hippos and elephants certainly qualify, so it might not be fair to remove the data points. In this case, I would report my findings as in part (a), but I would make a note of the outliers and point out that they are raising my correlation a little.

e. Does a longer gestation time cause an animal to live longer?

Though they are correlated, we cannot say that longer gestation causes an animal to live longer. Perhaps animals with longer gestation periods have offspring that are born more fully formed and better able to deal with the dangers of their environment. Therefore, they are more skilled at survival.

 

II.  Probability -

Answer the following questions.  Determine whether each is asking for a theoretical, relative frequency, or personal probability.
If you flip a fair coin, what is the probability that it will land heads up?
 
What is the probability that you will eventually own a home; that is, how likely do you think it is?
 
A bag contains some marbles.  You pull one marble 10 times and get a red marble 6 times.  What is the probability of getting a red marble?
 

Basic Rules to Know:

The probability of an event is between 0 and 1.   A probability of 1 is equivalent to 100% certainty.

If A, B, and C are the only  possible outcomes:
pr(A) + pr(B) + pr(C) = 1
 

If the probability of an event A is pr(A), then the probability of event A not occurring is

pr(not A) + pr(A) = 1  OR    pr(not A)  = 1 - pr(A)

If two events A and B are independent (this means that the occurrence of A has no impact at all on whether B occurs and vice versa), then the probability of A and then B occurring is

pr(A and B) = pr(A)×pr(B)

If two events A and B are mutually exclusive (this means A cannot occur if B occurs and vice versa), then the probability of either A or B occurring is

pr(A or B) = pr(A) + pr(B)
 

If two events are not mutually exclusive (meaning A and B could happen together), you must subtract the probability that A and B do happen at the same time.

pr(A or B) = pr(A) + pr(B) – pr(A and B)
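The rules above can be checked by counting outcomes. The sketch below uses a fair six-sided die as an illustrative example that is not part of the original notes; the event names `even`, `high`, and `one` and the helper `pr` are made up for this sketch.

```python
from fractions import Fraction
from itertools import product

die = {1, 2, 3, 4, 5, 6}  # equally likely outcomes of one fair die

def pr(event, sample_space):
    """Probability of an event by counting equally likely outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))

even = {2, 4, 6}   # roll is even
high = {5, 6}      # roll is greater than 4
one = {1}          # roll is exactly 1

# Complement rule: pr(not A) = 1 - pr(A)
p_not_even = pr(die - even, die)

# Mutually exclusive events ("one" and "high" cannot both happen).
p_one_or_high = pr(one | high, die)

# Not mutually exclusive (a 6 is both even and high).
p_even_or_high = pr(even | high, die)

# Independent events: first roll even AND second roll high, over 36 pairs.
two_rolls = set(product(die, repeat=2))
both = {(a, b) for (a, b) in two_rolls if a in even and b in high}
p_even_then_high = pr(both, two_rolls)

print(p_not_even, p_one_or_high, p_even_or_high, p_even_then_high)
# 1/2 1/2 2/3 1/6
```

In each case the counted probability agrees with the corresponding formula, for example 3/6 + 2/6 - 1/6 = 4/6 for "even or high".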

Practice:  See Probability Worksheet and Activities 2 and 3.

 

More probability practice:

1. When flipping a coin 5 times, what is the probability of getting...

5 heads in a row? outcome: HHHHH    The probability of getting heads on the first flip is 1/2 or .5.  For each subsequent flip the prob of getting heads is also 1/2 or .5.  Since you want H then H then H then H then H, these are successive events.  Multiply the individual probs together:  1/2 * 1/2 * 1/2 * 1/2 * 1/2 = 1/32 or (.5)^5 = .03125

4 heads and then a tail?   outcome:  HHHHT    The prob for each flip is again 1/2, so the prob of this specific outcome is again (1/2)^5 = 1/32.

exactly one tail?  outcomes: HHHHT or HHHTH or HHTHH or HTHHH or THHHH.  The prob of each of these outcomes is 1/32.  Since the first outcome or the second or the third or the fourth or the fifth would qualify as exactly one tail out of 5, add these probs together.  5/32

at least one tail?   outcomes:  could be 1 out of 5 tails, or 2 out of 5, or 3 out of 5, or 4 out of 5, or 5 out of 5; every outcome possible except no tails at all.  pr(at least one) = 1 - pr(none).  None = HHHHH.  prob of all heads = 1/32 (see first question) so prob at least one tail = 1 - 1/32 = 31/32
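The four coin-flip answers above can be double-checked by listing all 32 possible sequences and counting. This is a brute-force sketch, not the method you would use on the exam; the helper name `pr` is made up for this example.

```python
from fractions import Fraction
from itertools import product

# All 2**5 = 32 equally likely sequences of five fair-coin flips.
outcomes = ["".join(flips) for flips in product("HT", repeat=5)]

def pr(event):
    """Probability of an event, given as a test applied to each sequence."""
    favorable = sum(1 for seq in outcomes if event(seq))
    return Fraction(favorable, len(outcomes))

p_five_heads = pr(lambda seq: seq == "HHHHH")
p_hhhht = pr(lambda seq: seq == "HHHHT")
p_exactly_one_tail = pr(lambda seq: seq.count("T") == 1)
p_at_least_one_tail = pr(lambda seq: "T" in seq)

print(p_five_heads, p_hhhht, p_exactly_one_tail, p_at_least_one_tail)
# 1/32 1/32 5/32 31/32
```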

2. In our class of 27 students, there are 20 females and 7 males.  Of all the last names, 6 end with a vowel and the rest end with a consonant.  There are 5 females with a last name ending in a vowel.  Fill in and complete the following table and then answer the questions.

                     Female   Male   Total
ends in vowel           5       1       6
ends in consonant      15       6      21
total                  20       7      27

What is the probability that a randomly chosen person from the class...

is a male?  7/27

has a name that ends in a consonant?   21/27

is a female whose name ends in a consonant?  15/27

is either a male or has a name that ends in a vowel?  7/27 + 6/27 - 1/27 = 12/27.  Since being male and having a name that ends in a vowel are not mutually exclusive, you must subtract the prob that the person is a male whose name ends in a vowel.
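The answers to problem 2 can be reproduced from the four inner cells of the table. This is a small sketch; the dictionary layout and the helper `pr` are choices made for this example. Note that Python's `Fraction` reduces automatically, so 21/27 prints as 7/9 and 12/27 as 4/9.

```python
from fractions import Fraction

# Counts from the completed table: (sex, last-letter type) -> students.
counts = {
    ("female", "vowel"): 5,
    ("male", "vowel"): 1,
    ("female", "consonant"): 15,
    ("male", "consonant"): 6,
}
total = sum(counts.values())  # 27 students

def pr(condition):
    """Probability that a randomly chosen student satisfies the condition."""
    matching = sum(n for (sex, ending), n in counts.items()
                   if condition(sex, ending))
    return Fraction(matching, total)

p_male = pr(lambda sex, ending: sex == "male")
p_vowel = pr(lambda sex, ending: ending == "vowel")
p_consonant = pr(lambda sex, ending: ending == "consonant")
p_female_consonant = pr(lambda sex, ending: sex == "female" and ending == "consonant")
p_male_and_vowel = pr(lambda sex, ending: sex == "male" and ending == "vowel")
p_male_or_vowel = pr(lambda sex, ending: sex == "male" or ending == "vowel")

print(p_male, p_consonant, p_female_consonant, p_male_or_vowel)
# 7/27 7/9 5/9 4/9
```

Counting the "male or vowel" students directly gives the same result as the addition rule, 7/27 + 6/27 - 1/27.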

 

3. Suppose you have to cross a train track on your commute.  The probability that you will have to wait for a train is 0.20.  If you don't have to wait, the commute takes 15 minutes, but if you have to wait, it takes 20 minutes.

i. What is the expected value of the time it takes you to commute?  (0.20)*(20) + (0.80)*(15) = 16 minutes

ii. Is the expected value ever the actual commute time? Briefly explain.   No, the expected value is never an actual commute time.  Every commute takes either 15 or 20 minutes, so no single commute takes 16 minutes.  The expected value is the average commute time over many commutes.
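The expected value calculation in problem 3 can be written as a short function. This is a sketch; the name `expected_value` and the list-of-pairs layout are choices made for this example, and exact fractions are used to avoid rounding noise.

```python
from fractions import Fraction

def expected_value(distribution):
    """Sum of probability * value over (probability, value) pairs."""
    # The probabilities of all possible outcomes must add up to 1.
    assert sum(p for p, _ in distribution) == 1
    return sum(p * value for p, value in distribution)

# (probability, commute time in minutes)
commute = [
    (Fraction(20, 100), 20),  # wait for a train
    (Fraction(80, 100), 15),  # no train
]

ev = expected_value(commute)
possible_times = {value for _, value in commute}

print(ev)                    # 16
print(ev in possible_times)  # False
```

The second line of output matches part ii: 16 minutes is an average, not a time any single commute can take.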
 

4.  To quote Forrest Gump, "My momma always said, 'Life was like a box of chocolates. You never know what you're gonna get.'"  You have a box of 20 chocolates with 8 creme caramel, 5 dark chocolate truffle, 4 coconut almond, and 3 mint. 

What is the probability that, if you close your eyes and pick one chocolate, it is:

dark chocolate truffle?   5/20

not a coconut almond?  1 - 4/20 = 16/20

either a mint or a creme caramel?   3/20 + 8/20 = 11/20

You hate coconut almond.  What is the probability that if you randomly pick a chocolate, taste it, put it back and pick another one, that they both would be coconut almond?   (4/20)*(4/20) = .04

If you pick 5 (and return after one taste), what is the probability that at least one of them is the dreaded coconut almond? 

1 - P( none of the 5 are almond) = 1 - (16/20)*(16/20)*(16/20)*(16/20)*(16/20) = 0.672
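The chocolate-box answers can be reproduced as follows. This sketch assumes, as the problem states, that each chocolate is returned to the box before the next pick, so the picks are independent; the helper name `pr` is made up for this example.

```python
from fractions import Fraction

# Contents of the box from the problem statement.
box = {
    "creme caramel": 8,
    "dark chocolate truffle": 5,
    "coconut almond": 4,
    "mint": 3,
}
total = sum(box.values())  # 20 chocolates

def pr(flavor):
    """Probability of drawing the given flavor on a single pick."""
    return Fraction(box[flavor], total)

p_truffle = pr("dark chocolate truffle")               # 5/20
p_not_coconut = 1 - pr("coconut almond")               # 16/20
p_mint_or_caramel = pr("mint") + pr("creme caramel")   # 11/20

# Two picks with replacement: multiply the independent probabilities.
p_two_coconut = pr("coconut almond") ** 2

# Five picks with replacement: 1 - pr(no coconut almond in any pick).
p_at_least_one_coconut = 1 - p_not_coconut ** 5

print(float(p_two_coconut), round(float(p_at_least_one_coconut), 3))
# 0.04 0.672
```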

5. Insurance companies use the concept of expected value to determine how much to charge their customers.  Customers pay for insurance coverage.  If there is a claim made, the company must pay the customer.  An insurance company charges one customer $500 for its policy for one year.   There is a 10% probability that the customer will make a claim for $2500, which the insurance company will have to pay out. There is a 20% probability that the customer will make a claim for $1000.  (This means there is a 70% probability that the customer will make no claim.)

a. What is the expected value for how much the company will pay out to the customer?  (0.10)*(2500) + (0.20)*(1000) + (0.70)*(0) = $450

b. Is the company expected to make money on that policy?  Explain.   Yes, because the customer pays $500 for the policy. Therefore, on average, the company is making $50 per policy.
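The two insurance answers can be checked with a few lines of arithmetic. This is a sketch using the numbers from the problem; the variable names are made up for this example.

```python
from fractions import Fraction

premium = 500  # dollars the customer pays for one year

# (probability, claim amount in dollars)
claims = [
    (Fraction(10, 100), 2500),
    (Fraction(20, 100), 1000),
    (Fraction(70, 100), 0),     # no claim
]

expected_payout = sum(p * amount for p, amount in claims)
expected_profit = premium - expected_payout

print(expected_payout, expected_profit)  # 450 50
```

A positive expected profit means the company makes money on average, even though on any single policy it either keeps the full $500 or pays out more than it collected.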

6. Redo Homework 2

 

NOTE: from our last lecture, expected value, gambler's fallacy, confusion of the inverse, Simpson's paradox, coincidences, and risk/relative risk will be covered on the midterm.  Descriptive Statistics will NOT be covered on the midterm.