KEY INFORMATION
- Friday, 4:00–7:00 pm @ TEAMS
- Prof. Ioannis Pavlidis — ipavlidis[@]uh.edu — Office Hours: Fri 3–4 pm @ TEAMS
- Mohammed Emtiaz Ahmed — mahmed24[@]uh.edu — Office Hours: Thu 12–2 pm @ TEAMS
The course covers statistical methods for studies and experiments involving humans and technology, which produce the bulk of scientific and engineering data. It situates statistics in the context of data science and emphasizes its relationship to machine learning.
The course is practical in orientation: it emphasizes understanding the concepts and being able to choose the right experimental design and apply the right statistical test.
- 10% — Participation
- 50% — Homework assignments
- 40% — Course project
The course has a semester-long project in place of a final exam. Homework assignments are completed individually, while the project is a group assignment; each project group typically consists of 2–3 students.
- R
- RStudio
- [1] Horton, N.J. and Kleinman, K. Using R and RStudio for Data Management, Statistical Analysis, and Graphics. CRC Press, 2015.
- [2] Freund, R.J., Wilson, W.J., and Mohr, D.L. Statistical Methods. 2010.
- [3] Montgomery, D.C. Design and Analysis of Experiments. Ninth Edition. John Wiley & Sons, 2017.
COURSE OUTLINE
1/17/2020
Topics to Cover: Situating statistics and machine learning in data science; observations and variables; types of measurements for variables; distributions; numerical descriptive statistics; exploratory data analysis; bivariate data; data collection
1/24/2020
Topics to Cover: Probability; discrete probability distributions; continuous probability distributions; sampling distributions
Homework #1 Out on 1/24/2020
1/31/2020
Topics to Cover: Hypothesis testing; estimation; sample size; assumptions
Assignment of Projects
2/7/2020
Topics to Cover: Inferences on the population mean; inferences on a proportion; inferences on the variance of one population; assumptions
Homework #1 Due on 2/7/2020
Homework #2 Out on 2/7/2020
2/14/2020
Topics to Cover: Inferences on the difference between means using independent samples; inferences on variances; inferences on means for dependent samples; inferences on proportions; assumptions
2/21/2020
Topics to Cover: Analysis of variance; linear model; assumptions; specific comparisons; random models; unequal sample sizes; analysis of means
Homework #2 Due on 2/21/2020
Homework #3 Out on 2/21/2020
3/6/2020
Topics to Cover: The regression model; estimation of parameters; inferences for regression; correlation; regression diagnostics
3/27/2020
Topics to Cover: The multiple regression model; estimation of coefficients; inferential procedures; correlations; special models; multicollinearity; variable selection; detection of outliers
4/3/2020
Topics to Cover: The dummy variable model; unbalanced data; models with dummy and interval variables; weighted least squares; correlated errors
Homework #3 Due on 4/3/2020
Homework #4 Out on 4/3/2020
4/10/2020
Topics to Cover: Hypothesis test for a multinomial population; goodness of fit; contingency tables; loglinear model
4/17/2020
Topics to Cover: Nonparametric methods for one sample, two independent samples, and more than two samples; rank correlation; the bootstrap
4/24/2020
Topics to Cover: Randomized designs; paired comparison designs; randomized complete block designs; Latin square designs; Greco-Latin square designs; balanced incomplete block designs; two-factor factorial designs; general factorial designs
Homework #4 Due on 4/27/2020
Project Reports Due on 4/29/2020
