Course Title: Data Preprocessing

Part A: Course Overview

Credit Points: 12.00


Course Code:

Campus: City Campus

School: 171H School of Science

Learning Mode:

Teaching Period(s): Sem 1 2021, Sem 1 2022, Sem 2 2022

Course Coordinator: Dr. Sona Taheri

Course Coordinator Phone: +61 3 9925 2526

Course Coordinator Email:

Course Coordinator Location: 15.04.02

Course Coordinator Availability: by email

Pre-requisite Courses and Assumed Knowledge and Capabilities

Applied business mathematics


Course Description

Real-world data is commonly incomplete, noisy, and inconsistent. This course equips you with the skills needed to prepare all forms of untidy data for analysis. You will learn the core concepts of data wrangling, namely tidy data, data integration, data cleaning, data transformation, data standardisation, data discretisation, and data reduction. You will develop and apply your data wrangling skills to complex, noisy, and inconsistent real-world data using the leading open-source software R.


Objectives/Learning Outcomes/Capability Development

On successful completion of this course, you should be able to:

  1. Utilise leading open-source software, R, to address and resolve data wrangling tasks.
  2. Select, perform, and justify data validation processes for raw datasets to satisfy quality requirements.
  3. Apply and evaluate the best practice standards of Tidy Data Principles.
  4. Critically analyse data integration procedures for combining data with different types and structures into a suitable format.


This course contributes to the following Program Learning Outcomes for BP330, Bachelor of Space Science.

Understanding science and engineering

  • You will demonstrate an understanding of the scientific method and engineering fundamentals and an ability to apply them in practice.

Knowledge and technical competence

  • You will have broad knowledge in space science and technology with deep knowledge in its core concepts.
  • You will have knowledge in at least one discipline other than your primary discipline and some understanding of interdisciplinary linkages.

Inquiry and Problem Solving

  • You will be able to choose appropriate tools and methods to solve scientific problems within your area of specialisation.
  • You will demonstrate well-developed problem-solving skills, applying your knowledge and using your ability to think analytically and creatively.

Information literacy

  • You will develop a capacity for independent and self-directed work.
  • You will work responsibly, safely, legally, and ethically.
  • You will develop an ability to work collaboratively.


Overview of Learning Activities

This course uses highly structured learning activities to guide your learning and prepare you for your assessments. The activities are a combination of individual, peer-supported and facilitator-guided activities, with opportunities for feedback throughout. 


Authentic and industry-relevant learning is critical to this course; you will therefore be encouraged to critically compare current thinking and practice within this context and industry. You will apply your thinking by producing relevant real-world assessment tasks and engaging with scenarios and case studies.


Social learning is another important aspect of coursework; you are therefore expected to participate in group activities, share drafts of your work and other resources that might be helpful, and give and receive peer feedback. By working efficiently and effectively with others, you will achieve outcomes greater than those you might have achieved on your own.


Above all, the learning activities are designed to maximise the likelihood that you will not only understand the course learning resources, but also be able to apply those learnings to your own professional practice.

Overview of Learning Resources

The learning and teaching approaches used in this program may include webinars, problem-based learning, and case studies.  


The activities and tasks are designed to facilitate the application of theory and encourage peer learning in a collaborative, open manner using online tools and interactive discussion forums. Assessment is integrated throughout the program to ensure that you graduate with a set of applicable skills and knowledge.  


There are services available to support your learning via the RMIT University Library. The Library provides guides on academic referencing and subject specialist help as well as a range of study support services.  


RMIT provides support and equal opportunities for students with a disability, long-term illness and/or mental health condition and primary carers of individuals with a disability. If you need assistance, please speak to your Program Manager or contact the Equitable Learning Services (ELS).  


At RMIT you can apply for credit so that your previous learning or experience counts toward your RMIT Online program. Further information on how to apply for credit is available from RMIT.


Please view the Assessment and Assessment Flexibility Policy for further information regarding applying for an extension, special consideration, equitable assessment arrangements and supplementary assessment. 


Overview of Assessment

You will be assessed on how well you meet the course learning outcomes and on your development against the program learning outcomes.  


This course has no hurdle requirements.

Assessment Tasks: 


Assessment Task 1  

Pre-processing data project, 20%  

CLOs: 1, 2 and 3.


Assessment Task 2  

Coding exercises, 35%  

CLOs: 1, 2, 3 and 4.


Assessment Task 3 

Applied relational data project, 45% 

CLOs: 1, 2, 3 and 4.