General Information
This is the course website for SCI 2000: Introduction to Data Science. The course introduces students to data science, and specifically to the tools and hands-on experience needed to analyse data. By the end of the course, students will:
- Become proficient in R, to the level that they can analyse data using the tools from this class.
- Be able to describe and analyse data through visualization and simple statistical procedures.
- Be introduced to statistical thinking and be able to think critically about variation and biases.
Course Details
- Instructor: Max Turgeon
- Email: firstname.lastname@example.org
- Office: 373 Machray Hall
- Website: https://maxturgeon.ca/f20-stat3150/
- Lectures: TR 11:30 AM–12:45 PM, via Webex
- Office Hours: By appointment only
The course outline can be downloaded here.
There is no textbook for this course. Notes will be provided to students through UM Learn, along with additional resources.
The assessments for this course include:
- Four (4) assignments.
- Three (3) data analysis summaries.
- One (1) final project.
Outline of Topics
The course is expected to cover the following topics:
- Data visualization
- Data wrangling
- Relational data
- Web scraping
- Introduction to regular expressions
- (If time permits) Automation and version control
Throughout the course, the applied topics above will be complemented with an introduction to statistical thinking: how to think about variability, what biases can occur in the data, and how to perform simple statistical procedures (e.g. comparing means, proportions, linear regression).
Statistical Software
The course requires you to make extensive use of the R statistical software for your assignments and final data project. Sample code will be provided to students.
You can download R for free (for Windows, Mac, Linux, and Solaris) from the Comprehensive R Archive Network at: https://cran.r-project.org/
For additional resources on R, see here.