Problem Set #4: Statistical Analysis of Charlotte-Rock Hill Air Quality Data
Individual work required. Show all work to receive credit.
You must submit both your handwritten solutions and a copy of
your spreadsheet.
Download dataset1 from the course syllabus
page. The data set consists of ambient air measurements of ozone, NO, CO,
and hydrocarbons taken on several days during a recent ozone season in the
Charlotte-Rock Hill Metropolitan Statistical Area.
Complete the following exercises:
-
For each of the variables (columns) listed in the data set, determine
the mean, median, standard deviation, standard error, variance, coefficient
of variation (CV), geometric mean, and range.
-
For CO readings, calculate the 90% confidence interval.
-
Fit a normal distribution to the data and use it to predict the percentage
of the time that the eight-hour ozone reading would be expected to exceed 80
parts per billion (ppb). Then compare this prediction with the actual
percentage of readings that were above 80 ppb.
-
For the aromatics readings:
-
Assume that these data are normally distributed and then calculate
the 95% confidence interval for the readings.
-
Assume that these data are log-normally distributed and then calculate
the 95% confidence interval.
-
Plot a histogram of the data using a class width of 4 ppb.
-
From the histogram, indicate whether the data appear to be normally
or log-normally distributed.
-
Which of the two confidence intervals calculated above better
represents the data? Explain.
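
The calculations named in the exercises above can be sketched in Python as a cross-check on the spreadsheet. This is a minimal sketch under stated assumptions, not a prescribed method. The readings in `sample` are hypothetical values invented for illustration and are not taken from dataset1. All function names are illustrative. The confidence intervals here are intervals for the mean (or, in the log-normal case, for the geometric mean), which is one possible reading of the exercises. The critical value `t_crit` is assumed to be looked up in a Student's t table for n - 1 degrees of freedom.

```python
import math
import statistics as st

# Hypothetical readings in ppb, invented for illustration (NOT from dataset1).
sample = [62.0, 71.0, 55.0, 84.0, 90.0, 68.0, 77.0, 59.0, 81.0, 73.0]


def describe(x):
    """Summary statistics for one column of readings."""
    mean = st.mean(x)
    sd = st.stdev(x)  # sample standard deviation (n - 1 denominator)
    return {
        "mean": mean,
        "median": st.median(x),
        "std_dev": sd,
        "std_error": sd / math.sqrt(len(x)),
        "variance": st.variance(x),  # sample variance (n - 1 denominator)
        "cv": sd / mean,
        "geometric_mean": st.geometric_mean(x),
        "range": max(x) - min(x),
    }


def mean_ci(x, t_crit):
    """Two-sided interval for the mean: mean +/- t_crit * standard error."""
    m = st.mean(x)
    half_width = t_crit * st.stdev(x) / math.sqrt(len(x))
    return m - half_width, m + half_width


def lognormal_ci(x, t_crit):
    """Interval computed on ln(x), then back-transformed to ppb.

    The result brackets the geometric mean, not the arithmetic mean.
    """
    lo, hi = mean_ci([math.log(v) for v in x], t_crit)
    return math.exp(lo), math.exp(hi)


def normal_exceedance(x, threshold):
    """Fraction of time above threshold predicted by a fitted normal model."""
    model = st.NormalDist(st.mean(x), st.stdev(x))
    return 1.0 - model.cdf(threshold)


def observed_exceedance(x, threshold):
    """Fraction of the actual readings that lie above threshold."""
    return sum(v > threshold for v in x) / len(x)


def histogram_counts(x, width):
    """Counts per class, keyed by the lower edge of each class."""
    counts = {}
    for v in x:
        lower = math.floor(v / width) * width
        counts[lower] = counts.get(lower, 0) + 1
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    print(describe(sample))
    # t values for 9 degrees of freedom: 1.833 (90%) and 2.262 (95%).
    print("90% CI:", mean_ci(sample, 1.833))
    print("95% CI, normal:", mean_ci(sample, 2.262))
    print("95% CI, log-normal:", lognormal_ci(sample, 2.262))
    print("predicted fraction > 80 ppb:", normal_exceedance(sample, 80))
    print("observed fraction > 80 ppb:", observed_exceedance(sample, 80))
    print("class counts, width 4 ppb:", histogram_counts(sample, 4))
```

With a different sample size, the t values passed to `mean_ci` and `lognormal_ci` must be replaced by those for the new degrees of freedom. If the exercise is read as asking for the range expected to contain individual readings rather than the mean, the half-width would use the standard deviation instead of the standard error.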