STAT 7200 - Winter 2020

General Information

This is the course website for STAT 7200: Multivariate Statistics. This course aims to provide students with a broad overview of techniques used in multivariate statistical analysis, with an emphasis on Multivariate Linear Regression and Principal Component Analysis. At the end of the course, students will be able to

  • make decisions on how and when to use the techniques discussed in class;
  • apply and assess multivariate methods on real data;
  • make sound statistical conclusions based on a multivariate analysis.

Moreover, the course aims to make students familiar with, and ideally competent in, the R statistical software.

The course outline can be downloaded here.

Prerequisites

Consent of the instructor. Good knowledge of linear algebra and mathematical statistics is required.

Textbook

There is no required textbook for this class. If you are looking for a good reference, I recommend:

  • Anderson, An Introduction to Multivariate Statistical Analysis. Wiley (2003).
  • Muirhead, Aspects of Multivariate Statistical Theory. Wiley (2005).
  • Johnson & Wichern, Applied Multivariate Statistical Analysis. Prentice Hall (2007).
  • Fujikoshi, Ulyanov & Shimizu, Multivariate Statistics: High-Dimensional and Large-Sample Approximations. Wiley (2010).
  • Bilodeau & Brenner, Theory of Multivariate Statistics. Springer (1999).

Assessments

The assessments for this course include:

  • Three (3) assignments;
  • One (1) midterm test;
  • One (1) final project, which includes a written report and an oral presentation.
    • The guidelines for the final project can be found here.

Note that there is no final exam.

Outline of Topics

The course is expected to cover the following topics:

  1. Aspects of multivariate analysis: handling multivariate data, graphical displays, statistical distance
  2. Matrix algebra and random vectors: eigenvalues and eigenvectors, positive definite matrices, mean vectors, covariance matrices and matrix decompositions
  3. Random Samples: sample geometry, characterizing random samples
  4. Multivariate normal distribution: definition and properties, estimation and sampling distributions
  5. Inferences about a mean vector: Hotelling’s T² and likelihood ratio tests, confidence regions and multiple comparisons
  6. Multivariate linear regression: multivariate analysis of variance, least squares estimation and inference
  7. Principal Component Analysis: interpretation and use of principal components
  8. Factor Analysis: orthogonal factor model, estimation and inference
  9. Canonical Correlation Analysis: canonical variables and canonical correlations
  10. Graphical models (if time permits)

Statistical Software

The course will use the R statistical software to demonstrate some of the theoretical concepts. Sample code will be provided to students.

You can download R for free (for Windows, Mac, Linux, and Solaris) from the Comprehensive R Archive Network at: https://cran.r-project.org/
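As a rough illustration of what working with R for multivariate analysis can look like, here is a minimal sketch. It is not taken from the course materials. The choice of the built-in iris data set and the variable names are assumptions made purely for the example. It computes a sample mean vector and covariance matrix, then obtains principal components in two ways.

```r
# Illustrative sketch only; the data set and variable names are assumptions,
# not part of the course's own sample code.

# Four numeric measurements from the iris data set that ships with R.
X <- as.matrix(iris[, 1:4])

# Sample mean vector and sample covariance matrix (divisor n - 1).
xbar <- colMeans(X)
S <- cov(X)

# Principal components from the eigendecomposition of S.
eig <- eigen(S, symmetric = TRUE)

# Proportion of total variance explained by each component.
prop_var <- eig$values / sum(eig$values)

# The same analysis using prcomp(), which centers the data by default.
pca <- prcomp(X)

print(round(prop_var, 3))
summary(pca)
```

The variances reported by prcomp() (the squared values in pca$sdev) should agree with the eigenvalues of S, since both use the n - 1 divisor.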

For additional resources on R, see here.