Course Title: Data Preparation for Analytics

Part A: Course Overview

Credit Points: 12.00


Course Code

Learning Mode

Teaching Period(s)

City Campus, 145H Mathematical & Geospatial Sciences: Sem 2 2010, Sem 2 2011, Sem 2 2012, Sem 2 2013, Sem 2 2014, Sem 2 2015, Sem 2 2016

City Campus, 171H School of Science: Sem 2 2018, Sem 2 2019

Course Coordinator: Mr Xu Zhang

Course Coordinator Phone: -

Course Coordinator Email:

Pre-requisite Courses and Assumed Knowledge and Capabilities

MATH2200 Statistics 1 and MATH2201 Statistics 2 or their equivalent.


Course Description

Data preparation and data cleaning are unavoidable steps in statistical analysis. In business environments, data frequently has to be transferred from databases before statistical analysis can be performed. Establishing a link between data marts and statistical packages is therefore an important task in professional organisations. This course introduces you to the concepts and techniques for preparing data held in business intelligence data marts for statistical analysis using SAS software. It covers reading, cleaning and pre-analysing data using SAS.

Objectives/Learning Outcomes/Capability Development

On successful completion of this course you should be able to:

  1. Work in a business environment in which data preparation occurs.
  2. Prepare data marts for statistical analysis using SAS software.
  3. Write efficient SAS programs.
  4. Read data from databases and clean the data for statistical analysis in SAS.
  5. Develop strategies for dealing with imperfect real-world data.

This course contributes to the development of the following Program Learning Outcomes:

Personal and professional awareness

  • The ability to contextualise outputs where data are drawn from diverse and evolving social, political and cultural dimensions.
  • The ability to reflect on experience and improve your own future practice.
  • The ability to apply the principles of lifelong learning to any new challenge.

Knowledge and technical competence

  • The ability to use appropriate and relevant fundamental and applied mathematical and statistical knowledge, methodologies and modern computational tools.
  • The ability to bring together and flexibly apply knowledge to characterise, analyse and solve a wide range of problems.
  • An understanding of the balance between the complexity / accuracy of the mathematical / statistical models used and the timeliness of the delivery of the solution.

Overview of Learning Activities

Key concepts of data preparation will be covered extensively in this course. They will be explained and illustrated with relevant class and computer laboratory examples. The assignments and labs will test your understanding of the class materials.

You will undertake 3 hours of lecture/lab sessions (face-to-face learning) per week. In addition, an average of 6 hours per week of out-of-class study is recommended for course review and completing assessment tasks.

Overview of Learning Resources

A list of recommended textbooks for this course is provided on Canvas.

All course materials, including lecture notes, lab exercises, practical exercises and assignments, will be posted on the Canvas LMS.

The statistical package SAS can be accessed from the school computer labs. You can also access SAS anywhere and at any time through the RMIT MyDesktop system.

A Library Guide is available at:

Overview of Assessment

This course has no hurdle requirements.


Assessment Tasks:

Early Assessment Task: Lab Test

Weighting 5%

This assessment task supports CLOs 1, 2, 3, 4 & 5

Assessment Task 2: Lab Tests

Weighting 15%

This assessment task supports CLOs 1, 2, 3, 4 & 5

Assessment Task 3: Assignments - related to usage of computer software

Weighting 30%

This assessment task supports CLOs 1, 2, 3, 4 & 5

Assessment Task 4: Final Exam

Weighting 50%

This assessment task supports CLOs 1, 2, 3, 4 & 5