More on Two Variable Relationships


 Milestone

California, like every other state, is going test-crazy. In this milestone we're going to ask you to examine some of California's educational data. The data are scores on the Academic Performance Index (API) for all 806 high schools in 1999. The API presumably measures performance, so one might predict that some variables will be strongly related to it. In 1999 the API was essentially the STAR exam (if that means anything to you), which tests individual students in various content areas.

The data set we're giving you includes the APIs for all but one high school as well as some demographic data:

Variable     Description
school       name of school
api99        API score
pct_meals    % of students receiving free meals
not_high_g   % who do not graduate
high_grad    % who graduate but no college
some_coll    % who go to college, no degree
coll_grad    % with a college degree, no higher
grad_schl    % who go to graduate school
avg_ed       average years of higher ed (?)
pct_cred     % of teachers who are credentialed
pct_emerg    % of teachers who hold emergency credentials

The data on education (not_high_g and some_coll, for example) refer to the parents of the children.

1) All of these variables are possible predictors of API. Choose the two that you think tell the most interesting story. Explain these stories by showing the scatterplots and describing the scatterplots.

2) For each of the two variables you chose in part (1), find the best regression line fit that you can. Some things to keep in mind: try transformations. Also, there might not be a very good straight line fit for any transformation. Is the regression line still interpretable in this case?

3) Interpret your regression lines.

4) One school, Whittier High (my alma mater), was left out. Here are its values:

variable value
pct_meals 34
not_high_g 27
high_grad 27
some_coll 23
coll_grad 18
grad_schl 5
avg_ed 2.47
pct_cred 85
pct_emerg 15
API ??????

Predict its API score.
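
If you'd like to sanity-check your Fathom work by hand, here is a minimal Python sketch of the fit-then-predict idea from parts (2) through (4). Everything in it is an assumption made for illustration: the x and y values are made up (they are NOT from the milestone data), the helper names fit_line and predict are hypothetical, and the log transformation is just one candidate among many. It fits a least-squares line to the raw predictor and to the transformed predictor, compares r-squared, and then plugs a new value into the better fit.

```python
import math

def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x.

    Returns (intercept, slope, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return intercept, slope, r_squared

def predict(intercept, slope, x):
    """Plug a (possibly transformed) predictor value into the fitted line."""
    return intercept + slope * x

# Made-up illustrative values, NOT taken from the milestone data set.
x = [1, 2, 4, 8, 16, 32, 64]
y = [900 - 60 * math.log(v) for v in x]  # built to be exactly linear in log(x)

raw_fit = fit_line(x, y)                          # fit on the raw predictor
log_fit = fit_line([math.log(v) for v in x], y)   # fit on the log-transformed predictor

print("r^2, raw x:", round(raw_fit[2], 3))
print("r^2, log x:", round(log_fit[2], 3))

# Predict for a new case: apply the SAME transformation used in the fit.
new_x = 10  # hypothetical predictor value
predicted = predict(log_fit[0], log_fit[1], math.log(new_x))
print("prediction at x = 10:", round(predicted, 1))
```

The point to notice is the last step: if the line was fit to a transformed predictor, the new school's value has to go through the same transformation before you plug it in. With real data neither r-squared will be perfect, so you still have to judge how good is good enough.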

 

The data are at

http://inspire.stat.ucla.edu/unit_03/milestone.txt

You can type this URL directly into Fathom to load the file.

Here are some hints: Many schools report values of 0. How should you deal with these? (This is a loaded question. First, you need to think about what you want to do with them. Then you need to decide how to get Fathom to do it. Talk to your instructor or post to the bulletin board for help with either aspect of the question.) Another thing to consider is that you probably won't find a transformation that makes the relationship perfectly linear. So you need to think about how good is good enough.
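
To make the question about zeros concrete, here is a small Python sketch of two things you might consider doing with them. It is only an illustration under stated assumptions: the rows are made up (NOT from the real file), and neither option is being recommended as the right answer. Which one makes sense depends on whether you think a reported 0 is a real value or a stand-in for missing data.

```python
import math

# Made-up rows of (school, api99, pct_meals), NOT from the real file.
rows = [("A", 700, 20), ("B", 650, 0), ("C", 600, 45), ("D", 720, 0)]

# Option 1: treat a reported 0 as missing and leave those schools out.
kept = [row for row in rows if row[2] != 0]

# Option 2: keep every school, but shift by 1 before taking logs,
# because log(0) is undefined.
shifted = [(name, api, math.log(meals + 1)) for name, api, meals in rows]

print("schools in file:", len(rows))
print("schools kept under option 1:", len(kept))
print("schools kept under option 2:", len(shifted))
```

Option 1 changes which schools your regression line describes; option 2 keeps them all but changes what the slope means. Either way, say what you did and why.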

Name your file ms3xxxxx, where xxxxx is your last name. So mine, for example, would be ms3gould. Drop it in the drop-box.