Course Title: Practical Data Science

Part A: Course Overview

Credit Points: 12.00

Terms

Course Code: COSC2738
Campus: City Campus
Career: Undergraduate
School: 171H School of Science
Learning Mode: Face-to-Face
Teaching Period(s): Sem 1 2018, Sem 1 2019, Sem 1 2020, Sem 1 2021, Sem 2 2021

Course Code: COSC2789
Campus: RMIT University Vietnam
Career: Undergraduate
School: 171H School of Science
Learning Mode: Face-to-Face
Teaching Period(s): Viet2 2019, Viet3 2020, Viet3 2021

Course Coordinator: Professor Shane Culpepper

Course Coordinator Phone: +61 3 9925 1704

Course Coordinator Email: shane.culpepper@rmit.edu.au

Course Coordinator Location: 14.9.21

Course Coordinator Availability: By appointment


Pre-requisite Courses and Assumed Knowledge and Capabilities

None


Course Description

The course gives you a set of practical skills for handling data that comes in a variety of formats and sizes, such as text, spatial, and time series data. These skills cover the data analysis lifecycle: initial access and acquisition, modelling, transformation, integration, querying, application of statistical learning and data mining methods, and presentation of results. This includes data wrangling, the process of converting raw data into a more useful form that can subsequently be analysed. The course is hands-on and uses Python.


Objectives/Learning Outcomes/Capability Development

On completion of this course you should be able to:

  1. Wrangle data including: selecting, uploading, cleaning up and transforming the data into a format suitable for a data science pipeline
  2. Extract an interpretation of data using exploratory data analysis
  3. Manipulate data by creating new features, reducing dimensionality, and by handling outliers in the data
  4. Apply simple machine learning tools to the data
  5. Visualise and plot graphical representations of data
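
As an illustration of outcome 3 (handling outliers), the following is a minimal sketch in Python using only the standard library. It applies the common interquartile-range rule and clips extreme values to the computed fences. The function names, the multiplier `k=1.5`, and the sample `readings` are assumptions chosen for this example; they are not taken from the course materials, and the course may use other tools or rules.

```python
import statistics


def iqr_bounds(values, k=1.5):
    """Return (low, high) fences from the interquartile-range rule.

    k=1.5 is a conventional default, assumed here for illustration.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def clip_outliers(values, k=1.5):
    """Clip each value to the IQR fences instead of dropping it."""
    low, high = iqr_bounds(values, k)
    return [min(max(v, low), high) for v in values]


# Hypothetical sample data with one extreme reading at the end.
readings = [10.0, 12.0, 11.0, 13.0, 12.0, 95.0]
clipped = clip_outliers(readings)
print(clipped)
```

Clipping keeps the number of observations unchanged, which can be convenient when later steps expect aligned rows. Removing the outlying rows instead is an equally valid choice, depending on the analysis.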


This course contributes to the following Program Learning Outcomes for BH119 Bachelor of Analytics (Honours):


Knowledge and technical competence:
an understanding of appropriate and relevant, fundamental and applied mathematical and statistical knowledge, methodologies and modern computational tools.


Problem Solving:
the ability to bring together and flexibly apply knowledge to characterise, analyse and solve a wide range of problems
an understanding of the balance between the complexity / accuracy of the mathematical / statistical models used and the timeliness of the delivery of the solution.


Overview of Learning Activities

You will learn about key concepts in pre-recorded lecture videos, where you can engage with the course material as it is illustrated through demonstrations and examples.

Tutorials, workshops, labs and/or group discussions (including online forums) focused on projects and problem solving will give you practice in applying theory and procedures. You will explore the concepts with teaching staff and other students, and receive feedback on your progress. You will develop an integrated understanding of the subject matter through private study by working through the course as presented in classes. Comprehensive learning materials will aid you in gaining practice at solving conceptual and technical problems.

This course includes 2 hours per week of lectures and 2 hours per week of tutorial/laboratory classes. To achieve high levels of academic results you are expected to spend on average an additional 6 hours per week on self-directed independent learning (reading, online activities and assignments).


Total study hours

This course includes 4 hours per week of tutorial/lectorial classes. To achieve high levels of academic results you are expected to spend on average an additional 4 hours per week on self-directed independent learning (reading, online activities and assignments).



Overview of Learning Resources

You will make extensive use of computer laboratories and relevant software provided by RMIT University. You will be able to access course information and learning materials through myRMIT.


Overview of Assessment

The assessment for this course comprises practical, written, and presentation assignments, including data pre-processing, data analysis and data modelling. The assessment tasks involve the processing and analysis of various types of datasets, and the application of various machine learning models. While this course will use machine learning tools, the focus of the assessment is on analysis, application and problem solving. Across all assessment tasks, students are required to demonstrate their knowledge of theoretical concepts and practical techniques, including identifying the appropriate techniques and applying them to new situations.

This course has no hurdle requirements.

Assessment Task 1: Practical Assignment (individual) -- 30% 
This assignment involves the pre-processing and exploration of a dataset representing a specific data science challenge. Students need to apply a range of suitable techniques to clean, process, and analyse the data (e.g. how to deal with missing values).

This assessment task supports CLOs 1, 3, 4, 6
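
As an example of the kind of cleaning step mentioned for Assessment Task 1 (dealing with missing values), here is a minimal sketch in Python using only the standard library. It fills missing entries in one column with the median of the observed values. The function name `impute_median`, the `records` data, and the use of `None` to mark a missing value are all assumptions made for illustration. Median imputation is only one of several possible strategies and is not prescribed by the assessment.

```python
import statistics


def impute_median(rows, column):
    """Return new rows with missing values (None) in `column`
    replaced by the median of the observed values."""
    observed = [r[column] for r in rows if r[column] is not None]
    median = statistics.median(observed)
    return [
        {**r, column: median if r[column] is None else r[column]}
        for r in rows
    ]


# Hypothetical records; None marks a missing age.
records = [
    {"id": 1, "age": 34},
    {"id": 2, "age": None},
    {"id": 3, "age": 28},
    {"id": 4, "age": 41},
]
cleaned = impute_median(records, "age")
print(cleaned)
```

The function builds new dictionaries rather than modifying the input, so the raw data remains available for comparison with the cleaned version.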

Assessment Task 2: Practical and Written Assignment (individual) -- 35%
This assignment focuses on data modelling, a core step in the data science process. Students need to perform an in-depth investigation and analysis of a data science problem by using different data modelling techniques. Then, students are required to apply appropriate techniques to solve the data science problem and demonstrate that the solution is both efficient and effective using appropriate measurement and evaluation techniques.

This assessment task supports CLOs 1, 2, 3, 4, 5, 6

Assessment Task 3: Practical and Written Assignment (individual) -- 35%
This assignment tests the ability to use more advanced techniques to solve a data science problem. This will be an extension of Assessment Task 2 where a more complex goal will be defined. Students will also be expected to write a formal report to summarise their findings and display their ability to use a program to comprehensively study a large collection of raw data.

This assessment task supports CLOs 1, 2, 3, 4, 5, 6