The data set for the end-of-semester assignment is in a file called Data Set for Assignment.xls, which presents Weekly Income (WI), Weekly Expenditure on Food (WEF), Highest Level of Education (HLE), Family Size (FS), and Gender of the Head of Household (GHH) for a population of 1000 households.
Column A lists the households, numbered from 1 to 1000.
Columns B to F record these households' WI, WEF, HLE, FS and GHH, respectively.
Column I presents the 100 households that make up your sample, selected as follows:
The 1st household is based on the first 3 digits of your student ID.
The 2nd household is based on the last 3 digits of your student ID.
The 3rd to 5th households are based on the day, the month and the last 2 digits of the year of your birthday, respectively.
The 6th to 100th households are randomly chosen.
Example: if your student ID is 181XX728 and you were born on 31 March 1985:
The 1st household is no. 181.
The 2nd household is no. 728.
The 3rd household is no. 31, the 4th is no. 03, and the 5th is no. 85.
The 6th to 100th households are randomly chosen.
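The selection rule for the first five households can be sketched in a few lines of Python. This is only an illustration of the rule described above; the function name `fixed_households` is made up for this sketch and is not part of the assignment files.

```python
def fixed_households(student_id: str, day: int, month: int, year: int) -> list[int]:
    """Return the five household numbers fixed by the student ID and birthday."""
    first = int(student_id[:3])   # first 3 digits of the student ID
    last = int(student_id[-3:])   # last 3 digits of the student ID
    # year % 100 keeps the last 2 digits of the birth year
    return [first, last, day, month, year % 100]


# The example from the instructions: ID 181XX728, born 31 March 1985.
print(fixed_households("181XX728", 31, 3, 1985))  # [181, 728, 31, 3, 85]
```

Household "no. 03" is simply household 3 once the leading zero is dropped.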
A. Organize your sample data in a spreadsheet as per the “Instructions” above. (Students who fail to follow the instructions will not be marked and will be awarded a mark of “0”.)
B. What sampling method is used to select your sample data?
C. Do you think that is the best method of sampling? Why or why not?
D. Which statistic is best for comparing the volatility of the WEF, WI, and FS values? Why?
Based on your sample data:
A. Develop a frequency table and a bar chart of WI based on the following classification:
1st Class = Very Poor
2nd Class = Poor
3rd Class = Moderate
4th Class = Rich
5th Class = Very Rich.
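A frequency table for five classes can be sketched as follows. The handout does not state the class boundaries, so this sketch assumes five equal-width classes between the smallest and largest WI value; that choice, the function name `classify`, and the sample values are all assumptions made for illustration.

```python
LABELS = ["Very Poor", "Poor", "Moderate", "Rich", "Very Rich"]


def classify(values, labels=LABELS):
    """Count values per class, assuming equal-width classes (an assumption)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / len(labels)
    counts = {label: 0 for label in labels}
    for v in values:
        if width == 0:
            idx = 0  # all values identical: everything falls in the first class
        else:
            # the maximum value is placed in the last class
            idx = min(int((v - lo) / width), len(labels) - 1)
        counts[labels[idx]] += 1
    return counts


wi_sample = [100, 250, 400, 550, 700, 1100]  # hypothetical WI values
for label, count in classify(wi_sample).items():
    print(f"{label:<10} {count}")
```

The resulting counts are what the bar chart would display, one bar per class.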
B. What is the most frequent group in your WI sample data? What does that indicate in terms of your data distribution?
C. Do you think the WI values in your sample are normally distributed? Provide the “statistical reason” for your answer.
A. What are the top 10% and bottom 10% of your WEF household values?
B. What is the probability that your WI values will be less than or equal to $200?
C. What is the probability that FS will be equal to 2?
D. Are there any outliers in your WEF sample data? Show a graph or other proof. If yes, what is the best statistic to measure the dispersion of your WEF?
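One way to read the probability questions in parts B and C is as relative frequencies in the sample: the share of households that satisfy the condition. The sketch below takes that reading, which is an assumption (a probability from a fitted normal model is another possible reading). The helper name and the sample values are hypothetical.

```python
def relative_frequency(values, predicate):
    """Share of values for which predicate(value) is true."""
    values = list(values)
    return sum(1 for v in values if predicate(v)) / len(values)


wi_sample = [150, 200, 320, 480, 90]  # hypothetical WI values
fs_sample = [2, 4, 2, 5, 3]           # hypothetical FS values

p_wi = relative_frequency(wi_sample, lambda wi: wi <= 200)  # P(WI <= 200)
p_fs = relative_frequency(fs_sample, lambda fs: fs == 2)    # P(FS = 2)
print(p_wi, p_fs)
```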
A. What is the probability that the head of household is a woman and her HLE is Primary?
B. What is the probability that the head of household is a man and has a College degree?
C. Among male heads of household, what proportion have Secondary as their highest level of education?
D. Among male heads of household, what proportion have Intermediate as their highest level of education?
E. Do you think that the events “gender of household head is male” and “having a College degree” are independent?
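Two events A and B are independent when P(A and B) = P(A) × P(B). The sketch below computes both sides from sample records so they can be compared. The coding of gender as "M"/"F" and of education as text labels is an assumption about the spreadsheet, and the records are hypothetical.

```python
def independence_check(records, event_a, event_b):
    """Return P(A), P(B), P(A and B) and P(A) * P(B) as sample proportions."""
    n = len(records)
    p_a = sum(1 for r in records if event_a(r)) / n
    p_b = sum(1 for r in records if event_b(r)) / n
    p_ab = sum(1 for r in records if event_a(r) and event_b(r)) / n
    return p_a, p_b, p_ab, p_a * p_b


# Hypothetical (GHH, HLE) pairs; the real coding may differ.
records = [("M", "College"), ("M", "Primary"), ("F", "College"), ("F", "Secondary")]

p_a, p_b, p_ab, product = independence_check(
    records,
    lambda r: r[0] == "M",        # event A: head of household is male
    lambda r: r[1] == "College",  # event B: HLE is College
)
print(p_a, p_b, p_ab, product)
```

In real sample data the two numbers will rarely match exactly, so the comparison is a matter of how close they are.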
A. Provide the most accurate interval estimate of WI and interpret your result.
B. Provide the least accurate interval estimate of WEF and interpret your result.
C. Provide the most and least accurate interval estimates of FS and interpret your results.
D. Explain the main differences between the most and least accurate interval estimates. Why are they called the most and least accurate interval estimates?
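The handout does not define "most" and "least accurate" here, so the sketch below only shows how an interval estimate for a mean is computed at a chosen confidence level and how its width changes with that level. The 90% and 99% levels are illustrative assumptions, as are the sample values. It uses a normal critical value, which is a common approximation for a sample of 100; a t-based interval would be slightly wider.

```python
from statistics import NormalDist, mean, stdev


def mean_interval(values, confidence):
    """Two-sided interval for the mean using a normal critical value."""
    n = len(values)
    centre = mean(values)
    std_error = stdev(values) / n ** 0.5
    # critical value that leaves (1 - confidence) / 2 in each tail
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return centre - z * std_error, centre + z * std_error


fs_sample = [3, 5, 4, 6, 2, 4, 5, 3, 4, 4]  # hypothetical FS values
narrow = mean_interval(fs_sample, 0.90)
wide = mean_interval(fs_sample, 0.99)
print(narrow, wide)
```

A higher confidence level gives a wider interval around the same sample mean.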
A. After surveying many countries, Michael Scott, a La Trobe University researcher, believes that for a city to be considered wealthy, the average weekly household income must be at least $1200. Based on this statement, can you consider your sample data to be from a wealthy city? (α = 0.10)
B. Michael Scott also believes a city can be considered fertile if the average household family size is greater than 8 (μ_FS > 8). Based on this statement, can you consider your sample to be from a fertile city? (α = 0.05)
C. Michael Scott believes a city can be considered obese if the average weekly household expenditure on food is greater than or equal to 50. Based on this statement, can you consider your sample to be from an obese city? (α = 0.01)
D. Based on the calculations above, which prediction is the most accurate, and why?
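The mechanics of a one-sample test of a mean can be sketched as follows, using the family-size claim (mean greater than 8) as the alternative hypothesis. This is a sketch under assumptions: it uses a normal approximation, which is reasonable for a sample of 100, while the course may expect a t-test; the function name and the sample values are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def upper_tail_z_test(values, mu0):
    """Test H0: mean = mu0 against H1: mean > mu0 with a normal approximation."""
    n = len(values)
    z = (mean(values) - mu0) / (stdev(values) / n ** 0.5)
    p_value = 1 - NormalDist().cdf(z)  # area to the right of z
    return z, p_value


fs_sample = [3, 5, 4, 6, 2, 4, 5, 3, 4, 4] * 10  # hypothetical FS values, n = 100
alpha = 0.05
z, p_value = upper_tail_z_test(fs_sample, 8)
reject = p_value < alpha  # reject H0 only when the p-value is below alpha
print(z, p_value, reject)
```

For parts A and C the direction of the alternative hypothesis has to be chosen from the wording of each claim before the same steps are applied.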
A. What is the relationship between the amount of WEF and FS in your sample?
B. What is the relationship between the amount of WI and gender of the family head in your sample?
C. How do HLE and WI affect WEF in your selected sample?
NB: Use the estimated linear regression line, R, R² and a graph to explain each relationship.
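For a single predictor, the regression line, R and R² can be computed directly from sums of squares, as in the sketch below. It covers only the one-predictor case (such as WEF against FS); part C involves two predictors and would need multiple regression, and gender in part B would need a numeric coding such as 0/1, which is an assumption. The data values are hypothetical.

```python
def simple_linear_regression(x, y):
    """Least-squares line y = intercept + slope * x, with R and R squared."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / (sxx * syy) ** 0.5  # Pearson correlation coefficient
    return slope, intercept, r, r * r


fs = [1, 2, 3, 4]   # hypothetical FS values
wef = [3, 5, 7, 9]  # hypothetical WEF values
slope, intercept, r, r_squared = simple_linear_regression(fs, wef)
print(slope, intercept, r, r_squared)
```

The sign of the slope and of R gives the direction of the relationship, and R² gives the share of variation in the response explained by the line.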
As one of the largest cities in the USA, New York is also known as the food city. People in this city spend a great deal of money on food, and Bill de Blasio, Mayor of New York, believes that the average amount of weekly income (WI) spent by households is not equal to that in your sample data. To prove this, he collects a random sample of data from 50 households in his city. (The data is attached in the New York tab of the Excel file.)
Based on Bill de Blasio's statement, perform a hypothesis test at the 5% level of significance. Do you think Bill de Blasio's statement is correct?
You may consider the following assumptions while performing this test:
A. The populations for both your sample and the New York data are normally distributed, and the samples are independent.
B. The population variances of Weekly Expenditure on Food (WEF) are unknown and unequal.
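With independent samples and unknown, unequal population variances, a common choice is Welch's two-sample t-test. The sketch below computes its test statistic and the Welch-Satterthwaite degrees of freedom; choosing Welch's test is an interpretation of the assumptions above, and the function name and data are hypothetical. The critical value or p-value would then come from t tables or statistical software.

```python
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic and approximate degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    va = variance(sample_a) / na  # squared standard error of mean A
    vb = variance(sample_b) / nb  # squared standard error of mean B
    t = (mean(sample_a) - mean(sample_b)) / (va + vb) ** 0.5
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


own_sample = [1, 2, 3, 4, 5]    # hypothetical values from your sample
new_york = [2, 4, 6, 8, 10]     # hypothetical values from the New York tab
t_stat, df = welch_t(own_sample, new_york)
print(t_stat, df)
```

Because the claim is "not equal", the test is two-sided: the null hypothesis is rejected at the 5% level when the absolute value of the statistic exceeds the two-sided critical value for the computed degrees of freedom.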