General Information #
This is the course website for DATA 2010: Tools and Techniques for Data Science. This course aims to provide students with an introduction to the field of data science with an emphasis on the fundamental tools and techniques that underlie the field. Throughout the course, students will:
- Become proficient in R/Python, to the level that they can analyze data using the tools from this class.
- Be able to describe and analyze data through visualization and simple statistical procedures.
- Be introduced to statistical thinking and be able to think critically about variation and biases.
At the end of the course, students will be able to analyze data using both R and Python. Each programming language has its pros and cons, and knowing how to use both is a valuable skill.
Course Details #
- Instructor: Max Turgeon
- Email: max.turgeon@umanitoba.ca
- Office: 373 Machray Hall
- Website: https://maxturgeon.ca/f21-data2010/
- Lectures: MWF 1:30 PM–2:20 PM, via Zoom
- Lab: M 4:30 PM–5:45 PM (on Webex)
- Office Hours: By appointment only
Prerequisites #
MATH 1240 (Elementary Discrete Mathematics); MATH 1300 or MATH 1220 (Linear Algebra); and one of MATH 1700, MATH 1710, or MATH 1232 (Calculus 2)
Co-Requisites #
STAT 2150 (Statistics and Computing) and COMP 2140 (Data Structures and Algorithms)
Textbook #
The following textbooks are good references, but they are not required:
- Skiena, The Data Science Design Manual. Springer (2017)
- Kim & Ismay, Statistical Inference Via Data Science: A ModernDive Into R and the Tidyverse. CRC Press (2019).
- VanderPlas, Python Data Science Handbook. O’Reilly (2016)
Assessments #
The assessments for this course include:
- Five (5) labs.
- Three (3) assignments.
- One (1) course project.
- One (1) final exam.
Outline of Topics #
The course is expected to cover the following topics:
- Data cleaning and wrangling
- Correlation, distributions, significance
- Data visualization
- Scores and rankings
- Linear regression
- Introduction to Machine Learning