Course Title: Data Wrangling

Part A: Course Overview

Credit Points: 12.00

Terms

Course Code | Campus | Career | School | Learning Mode | Teaching Period(s)
MATH2349 | City Campus | Postgraduate | 171H School of Science | Face-to-Face | Sem 1 2018, Sem 2 2018, Sem 1 2019, Sem 2 2019, Sem 1 2020, Sem 2 2020, Sem 2 2021, Sem 1 2022, Sem 2 2022, Sem 1 2023, Sem 2 2023, Sem 1 2024, Sem 2 2024
MATH2349 | City Campus | Postgraduate | 171H School of Science | Internet | Sem 1 2021

Flexible Terms

Course Code | Campus | Career | School | Learning Mode | Teaching Period(s)
MATH2405 | RMIT Online | Postgraduate | 171H School of Science | Internet | JanJun2020 (TP3)
MATH2405 | RMIT Online | Postgraduate | 171H School of Science | Internet | JanJun2021 (All)
MATH2405 | RMIT Online | Postgraduate | 171H School of Science | Internet | JulDec2021 (KP5)
MATH2405 | RMIT Online | Postgraduate | 171H School of Science | Internet | JanJun2022 (KP2)
MATH2405 | RMIT Online | Postgraduate | 171H School of Science | Internet | JulDec2022 (KP4)
MATH2405 | RMIT Online | Postgraduate | 171H School of Science | Internet | JulDec2023 (KP4), JulDec2023 (KP6)
MATH2405 | RMIT Online | Postgraduate | 171H School of Science | Internet | JanJun2024 (KP2)
MATH2405 | RMIT Online | Postgraduate | 171H School of Science | Internet | JulDec2024 (KP6), JulDec2024 (KP4)

Course Coordinator: Dr. Sona Taheri

Course Coordinator Phone: +61 3 9925

Course Coordinator Email: sona.taheri@rmit.edu.au

Course Coordinator Availability: By appointment, by email


Pre-requisite Courses and Assumed Knowledge and Capabilities

Applied business mathematics


Course Description

Real-world data is commonly incomplete, noisy, and inconsistent. You will be equipped with the skills needed to prepare all forms of untidy data for analysis. You will learn about the core concepts of data wrangling, namely tidy data, data integration, data cleaning, data transformation, data standardisation, data discretisation, and data reduction. You will develop and apply your data wrangling skills to complex, noisy, and inconsistent real-world data using the leading open-source software, R.


Objectives/Learning Outcomes/Capability Development

This course contributes to Program Learning Outcomes for the following programs:

  • MC004 Master of Statistics and Operations Research
  • MC242 Master of Analytics
  • GC173KP19 Graduate Certificate of Data Science
  • MC274 Master of Data Science Strategy and Leadership


On successful completion of this course, you should be able to:

  1. Utilise leading open-source software, R, to address and resolve data wrangling tasks.
  2. Select, perform, and justify data validation processes for raw datasets to satisfy quality requirements.
  3. Apply and evaluate the best practice standards of Tidy Data Principles.
  4. Critically analyse data integration procedures for combining data with different types and structures into a suitable format.


Overview of Learning Activities

This course uses highly structured learning activities to guide your learning and prepare you for your assessments. The activities are a combination of individual, peer-supported and facilitator-guided activities, with opportunities for feedback throughout.  

Authentic and industry-relevant learning is critical to this course; you will therefore be encouraged to critically compare current thinking and practice within this context and industry. You will apply your thinking by producing relevant real-world assessment tasks and by engaging with scenarios and case studies. Social learning is another important aspect of coursework; you are therefore expected to participate in group activities, share drafts of your work and other helpful resources, and give and receive peer feedback. By working efficiently and effectively with others, you will achieve outcomes greater than those you might have achieved on your own. Above all, the learning activities are designed to maximise the likelihood that you will not only understand the course learning resources, but also be able to apply what you learn to your own professional practice.


Overview of Learning Resources

The learning and teaching approaches used in this program may include webinars, problem-based learning and case studies.  

The activities and tasks are designed to facilitate the application of theory and encourage peer learning in a collaborative, open manner using online tools and interactive discussion forums. Assessment is integrated throughout the program to ensure that you graduate with a set of applicable skills and knowledge.  

There are services available to support your learning via the RMIT University Library. The Library provides guides on academic referencing and subject specialist help as well as a range of study support services.  

RMIT Online provides support and equal opportunities for students with a disability, long-term illness and/or mental health condition and primary carers of individuals with a disability. If you need assistance, please speak to your Program Manager or contact the Equitable Learning Services (ELS).  

At RMIT you can apply for credit so that your previous learning or experience counts toward your RMIT Online program. For further information, see RMIT's guidance on how to apply for credit.

Please view the Assessment and Assessment Flexibility Policy for further information regarding applying for an extension, special consideration, equitable assessment arrangements and supplementary assessment.  


Overview of Assessment

Please note that this Part A course guide supports two program offerings: RMIT campus and RMIT Online.

You will be assessed on how well you meet the course learning outcomes and on your development against the program learning outcomes.

Assessment Task 1: Pre-processing data project

Weighting 20%  

CLOs: 1, 2 and 3.

Assessment Task 2: Coding exercises

Weighting 35%  

CLOs: 1, 2, 3 and 4.

Assessment Task 3: Applied relational data project

Weighting 45%  

CLOs: 1, 2, 3 and 4.