ISP 120 Midterm Practice

1.  Review activities, assignments and worksheets from days one through four.
2.  No other notes, websites, spreadsheets, etc... are allowed.  You will not have access to the class website or any other internet sites during the exam.
3.  Review topics below.  (This review is not meant to be a substitute for reviewing the activities, assignments, and lecture notes.)

Topics to Review

Linear Modeling

1.  Linear Trendlines. Open the file USTobacco.xls which contains data on the total amount of tobacco produced in the US.

a. Make an XY scatter plot of the data. Include it in your document.  Add a linear trendline to your plot, including the equation and the R-squared value.
b. Use the equation in b. to predict the amount of tobacco produced in 2002.
c. How much confidence do you have in your prediction?
Write at least three sentences justifying your prediction including a lead sentence, at least two supporting sentences and any items that that must be pointed out (if applicable).

a. and b.  (shows the prediction graphically)

b.   -.059*2002 +118.77 = .652 billion pounds  (shows the prediction algebraically)

c.  Based on data from 1975 through 1990, I predict that 652 million pounds of tobacco will be produced in 2002.   I am confident in my prediction because the r-squared value of .7011 shows that the data has a strong linear relationship.  In addition, because of societal pressures and the numerous health risks (which continue to make the news), we believe that the consumption of tobacco products will continue to decline.  Even though the trend in the last three years seems to indicate an increase in tobacco production, I still believe that the trend will decline in the long run based on the negative publicity associated with tobacco use.

(The fact that there are a sufficient number of data points and no outliers, these two areas do not need to be addressed.)

Note:  You could've decided to add a localized trendline.  Below shows the addition of a localized trendline.  Based on the localized trendline, in 2002 it is expected that there will be 3.24 million pounds of tobacco produced.  To make a prediction that shows an increase in tobacco production, you would need a different argument. 

 

 

 

2. Open LeaningTowerOfPisa.xls file.  Make a linear model of this data.  ( an X,Y scatter graph of the data with a trendline with equation and R-squared value).  After you finish the graph, click on the equation and R-squared value and increase the number of decimal places to 8 decimal places.  (If you know how to use =slope() and =intercept() to calculate predictions, it is an easier alternative to showing 8 decimal points.)  If you do not use at least 8 decimal places in your equation, you will calculate a prediction that doesn't make sense.

   

a. What does the R-squared value tell you about the data.  

 

b. Use the equation to make a prediction for the size of the lean in 2005. 

 

c.  How much confidence do you have in your prediction? Write at least three sentences justifying your prediction including a lead sentence, at least two supporting sentences and any items that that must be pointed out (if applicable).

 

d. According to the trendline equation, in what year was the tower completely upright?   What factors affect your confidence?

a.  The R-squared value is 0.988.  Since this is very close to one, the data is very close to linear. This R-squared value shows a strong relationship.

b.  To make a prediction of the lean in 2005, plug 2005 in for the X in the trendline equation.  = 0.00093187*2005 + 1.12333846.  According to the model, the lean in 2005 will be 2.9917 meters. 

c.  If the trend established from 1975-1987 persists, the size of the lean will be 2.9917 meters in 2005.   The R-squared value of .988 indicates a strong relationship.  Despite this, I have only moderate confidence in my prediction because it is 18 years after the last given date.  Also, I would think there be a limit as to far this building can lean. 

(The fact that there are a sufficient number of data points, the trend continues the direction as the data and there are no outliers, these three areas do not need to be addressed.)

d.  To find the year for which the tower was completely upright, meaning the lean was 0 meters, plug 0 into the equation for Y and solve for X.  0 =.000093187X+1.12333846    -1.12333846=.00093187X      -1.12333846/.00093187=X        -1205=X       Since X represents years, the prediction is that in the year 1205 BC, there was no lean in the tower.  I have no confidence in my prediction because the tower was not built until 1173 AD.

Graphing

4. Open the file MotorVehicleFatalityRatesByAge.xls.
    a) Make an effective graph of the data.  Copy and paste the graph into your Word document. 
    b) Write 1-2 sentences carefully describing the graph.  What do you want you audience to know about the data you graphed?
    c) Do you have any concerns about this graph?

a. 

b.  The age groups with the highest motor vehicle fatality rates are 16-24 and >80.  The lowest fatality rates are for children under 5.
c.  One concern with this graph is that the age groups are not consistent.  (For example, the bars should be from 5-15, 26-25, 26-35, etc...)

5. Open the Doctoral Degrees.xls file. 

    a) Make am effective graph of the 1993 data.  Put the percents on the graph.
    b) Which category earns the second highest number of doctoral degrees? 
    c) Why is a pie chart appropriate to use for this data?  Why is a line graph not appropriate?

a.  

b.  Nonresident aliens earned the second highest number of Doctoral degrees in 1993.

c.  A pie chart is appropriate when you have data from all possible categories (here all races/ethnicities are accounted for) and there is no overlapping data (for example, race and gender data on the same graph).  The pie graph also takes the absolute data (the number of degrees earned by each group) and converts it to percent of the total number of degrees.  We can clear see the differences among the groups.  A X,Y scatter graph is not appropriate here because we are not looking at data change over time.  However, a column/bar graph could be appropriate for this data but since we are given all parts of a total, pie is better.

Calculate and Analyze Percentages in context or using a data set

6.  In 1960 there were 38,137 motor vehicle deaths in the US; in 2000 there were 42,500.   What percent increase does the change from 38,137 to 42,500 represent?

11.4% increase

7.  Open the file AccidentalDeaths.xls which lists the number of different types of accidental deaths reported in 1980, 1994, 1995, and 1996 for the total population of the United States (U.S. Census Bureau data). 

a.  What type of accident accounted for the highest number of deaths in each of the years?

b.  In column G, calculate the absolute change in number of deaths for each accident category between 1980 and 1996.  Then in column H, calculate the percent change in number of deaths from 1980 to 1996.

c.  What type of accidental deaths decreased the most during this 16 year time span?  What category had the largest increase?  Which type has the largest percent increase?  the largest percent decrease?

d. How, mathematically, does an accident type such as Motor Vehicle Accidents have such a large absolute change but not a large relative change? 

e. What was the total number of accidental deaths reported in 1996?  What percentage of the total accidental deaths are due to motor vehicle accidents in that year?  

f.  Given that the total population of the US in 1980 was 227,726,000, what was the rate of total accidental deaths per person in 1980?  Express as a decimal with 4 decimal places and as a fraction. 

 

a.  Motor vehicle accidents accounted for the highest number of accidents.

b.  The correct formula for cell G7 is =F7-B7.  The correct formula for cell H7 is either =(F7-B7)/B7 OR =G7/B7.  Format the cell by clicking on the percent format button.  Double click on the corner to fill the column. The table should look like :

Deaths from Accidents 1980 to 1996            
             
[Source: Statistical Abstract of the US]            
             
Deaths from Accidents 1980 1990 1994 1995 1996 absolute change percent change
             
Motor Vehicle Accidents  53,172  46,814  42,524  43,363  43,649 -9523 -18%
Accidental Falls  13,294  12,313  13,450  13,986  14,986 1692 13%
Accidental Drowning    6,043    3,979    3,404    3,790    3,488 -2555 -42%
Fires    5,822    4,175    3,986    3,761    3,741 -2081 -36%
Ingestion or Inhalation of objects    3,249    3,303    3,065    3,185    3,206 -43 -1%
Accidental Poisoning    2,492    4,506    7,828    8,000    8,431 5939 238%
Firearms and Handguns    1,955    1,416    1,356    1,225    1,134 -821 -42%
Air and Space Transport Accidents    1,494       941    1,075       851    1,061 -433 -29%
Water-Transport Accidents    1,429       923       723       762       675 -754 -53%
Electric Current    1,095       670       561       559       482 -613 -56%
Railway Accidents       632       663       635       569       565 -67 -11%

c.  Motor Vehicle Accidents had the largest absolute decrease and Accidental Poisoning had the largest absolute increase.  Electric Current had the largest percent decrease and Accidental Poisoning had the largest percent increase.

d.  If the absolute change is large and the percent change is fairly small, there must have been a large number of such accidents to begin with.  Dividing a large number by a very large number results in a small percent.

e.  Use the sum button to add up column F.  The total number of accidental deaths reported in 1996 was 81,418.  Part/total * 100.  43649/81418 * 100 = 54%.  Motor vehicle accidents accounted for 54% of all the accidental deaths.

f.  Sum column B.  The rate in 1980 was 90,677/227726000 = .0004.   The rate per 10,000 is 4.  This is 4 out of 10,000 which means that for every 10,000 people, 4 will die from an accident.

8. In 1960 there were 38,137 motor vehicle deaths in the US; in 2000 there were 42,500.  

    a.  What percent increase does the change from 38,137 to 42,500 represent?

    b. How could it be reasonable to conclude that even though the number of traffic fatalities increased from 1960 to 2000, traffic safety has improved?

a.  (42500-38137)/38137 * 100 = 11.4% 

b.  Since the population of the US increased over the 40 years from 1960 to 2000, it is quite possible that percentage of people who die in motor vehicles decreased.

9.  Chicago’s population grew from 2.78 million in 1990 to 2.90 million in 2000.  By how many percent did it grow?

 4.3% increase

10.  In 1993, 248.7 million people in the United States were born in the United States, and the rest, 19.8 million were foreign born.  What percentage of the population of the US was foreign born?

 7.4% foreign born

11.  In 2001, 6,113 applicants were accepted to DePaul. This represents about a 72% acceptance rate. Of the students who were accept, 2,050 students enrolled at the university.

  1. How many students in 2001 applied for admission at DePaul?
  2. What percent of the students who were accepted to DePaul actually enrolled at the University?

a.  8490 students applied
b.  33.54% of students who were accepted enrolled

12.  The January 13, 2002 issue of the New York Times  reported that there were 547,867 foreign students enrolled at US colleges and universities in 2000-01 — representing a 6.4% increase from the previous year.  The greatest number (59,939) of foreign students came from China.  What percent of the foreign students studying in the US were from China?  

10.94% from China

13.  Identify whether the bolded amount in the following statements refer to a relative quantity or an absolute quantity.

a.  DePaul's total enrollment of 23,149 represents a diverse student body in terms of ethnicity, geography and other demographics.  These students come from all fifty states and about 100 countries.  Absolute

b.  From 1998  to 2006, the undergraduate class has grown 43% and now accounts for 64% of total enrollment, up from 60% in 1996.  Relative

c.  A total of 54% are women and 28% are minority students.  Relative

14.  Practice successive percent (see worksheet, activity and assignment)

 

Review Excel skills (ie sorting, graphing, summing, etc...)

15.  Open the file mileage.xls which contains the city and highway mileage for 1996 model cars and four-wheel drive vehicles.  Using Excel's sort function, determine: 

a. The mid-size car model that gets the best city mileage Dodge Stratus

b. The mid-size car model that gets the worst highway mileage Jaguar XJ12

c. The four wheel drive car model that gets the best city mileage GeoTracker