Course Title: Data Wrangling

Part A: Course Overview

Credit Points: 12.00

Terms

Course Code: MATH2349
Campus: City Campus
Career: Postgraduate
School: 171H School of Science
Learning Mode: Face-to-Face
Teaching Period(s): Sem 1 2018, Sem 2 2018, Sem 1 2019, Sem 2 2019, Sem 1 2020, Sem 2 2020, Sem 2 2021

Course Code: MATH2349
Campus: City Campus
Career: Postgraduate
School: 171H School of Science
Learning Mode: Internet
Teaching Period(s): Sem 1 2021

Flexible Terms

Course Code: MATH2405
Campus: RMIT Online
Career: Postgraduate
School: 171H School of Science
Learning Mode: Internet
Teaching Period(s): JanJun2020 (TP3)

Course Code: MATH2405
Campus: RMIT Online
Career: Postgraduate
School: 171H School of Science
Learning Mode: Internet
Teaching Period(s): JanJun2021 (All)

Course Code: MATH2405
Campus: RMIT Online
Career: Postgraduate
School: 171H School of Science
Learning Mode: Internet
Teaching Period(s): JulDec2021 (KP5)

Course Coordinator: Dr. Sona Taheri

Course Coordinator Phone: +61 3 9925 2526

Course Coordinator Email: sona.taheri@rmit.edu.au

Course Coordinator Location: 15.04.02

Course Coordinator Availability: By appointment, by email


Pre-requisite Courses and Assumed Knowledge and Capabilities

A working knowledge of basic mathematics and familiarity with computers.


Course Description

Real-world data are commonly incomplete, noisy, and inconsistent. This course will cover a wide range of topics designed to equip you with the skills needed to prepare all forms of untidy data for analysis. The course will cover the core concepts of data pre-processing, namely tidy data, data integration, data cleaning, data transformation, data standardisation, data discretisation, and data reduction. You will develop and apply your data wrangling skills to complex, noisy, and inconsistent real-world data using leading open source software.
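
As a rough illustration of three of the pre-processing concepts named above (data cleaning, data standardisation and data discretisation), the sketch below uses only the Python standard library. It is not course material: the course itself names R as an example tool, and the records, field names, cut points and helper functions here are all hypothetical, chosen only to make the ideas concrete.

```python
from statistics import mean, pstdev

# Hypothetical raw records: inconsistent text labels and a missing value.
raw = [
    {"id": 1, "city": " Melbourne", "height_cm": "170"},
    {"id": 2, "city": "melbourne ", "height_cm": "NA"},
    {"id": 3, "city": "SYDNEY", "height_cm": "182"},
    {"id": 4, "city": "Sydney", "height_cm": "158"},
]


def clean(record):
    """Data cleaning: trim and normalise text, turn 'NA' into None."""
    value = record["height_cm"]
    return {
        "id": record["id"],
        "city": record["city"].strip().title(),
        "height_cm": None if value == "NA" else float(value),
    }


def standardise(values):
    """Data standardisation: z-scores using the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def discretise(value, cut_points, labels):
    """Data discretisation: return the label of the first bin whose upper bound is not exceeded."""
    for upper, label in zip(cut_points, labels):
        if value <= upper:
            return label
    return labels[-1]  # value lies above every cut point


cleaned = [clean(r) for r in raw]
# Missing heights are simply dropped here; imputation is another common choice.
observed = [r["height_cm"] for r in cleaned if r["height_cm"] is not None]
z_scores = standardise(observed)
bins = [discretise(h, [165, 175], ["short", "medium", "tall"]) for h in observed]
```

After cleaning, the four city labels collapse to two consistent values, and the three observed heights can be compared on a common scale or grouped into categories.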


Objectives/Learning Outcomes/Capability Development

This course contributes to the following Program Learning Outcomes for MC004 Master of Statistics and Operations Research and MC242 Master of Analytics:

 

Personal and professional awareness

  • the ability to contextualise outputs where data are drawn from diverse and evolving social, political and cultural dimensions
  • the ability to reflect on experience and improve your own future practice
  • the ability to apply the principles of lifelong learning to any new challenge.

Knowledge and technical competence

  • an understanding of appropriate and relevant, fundamental and applied mathematical and statistical knowledge, methodologies and modern computational tools.

Problem-solving

  • the ability to bring together and flexibly apply knowledge to characterise, analyse and solve a wide range of problems
  • an understanding of the balance between the complexity / accuracy of the mathematical / statistical models used and the timeliness of the delivery of the solution.

Information literacy

  • the ability to locate and use data and information and evaluate its quality with respect to its authority and relevance.



This course contributes to the following Program Learning Outcomes from GC173KP19 Graduate Certificate of Data Science:

  1. Understand and use appropriate and relevant, fundamental and applied mathematical and statistical knowledge and methodologies and modern processes and analytical tools. 
  2. Bring together and flexibly apply knowledge to characterise, analyse and solve a range of data science problems. 
  3. Select and justify relevant approaches to analyse data problems and/or identify opportunities. 
  4. Use visualisation approaches to interpret and communicate findings to inform decision makers.
  5. Critically reflect on their own practice to support their continual professional development and become lifelong learners.


Course Learning Outcomes

On completion of this course you should be able to:

  1. Accurately, logically and ethically combine data from multiple sources to make them suitable for statistical analysis and to draw valid interpretations.
  2. Articulate how data meet best-practice standards (e.g. tidy data principles).
  3. Select, perform and justify data validation processes for raw datasets.
  4. Use leading open source software (e.g. R) for reproducible, automated data processing.


Overview of Learning Activities

This course uses highly structured learning activities to guide your learning process and prepare you for your assessments. The activities are a combination of individual, peer-supported and facilitator-guided activities, and where possible project-led, with opportunities for feedback throughout.  

Authentic and industry-relevant learning is critical to this course and you will be encouraged to critically compare what is happening in your context and in industry, and to use your insights.  

Social learning is another important component and you are expected to participate in class and group activities, share drafts of work and resources and give and receive peer feedback. You will be expected to work efficiently and effectively with others to achieve outcomes greater than those that you might have achieved alone.  

Above all, the learning activities are designed to maximise the likelihood that you will not only understand the course learning resources but also apply that learning to improve your own practice, for example by producing real-world artefacts and engaging in scenarios and case studies.


Overview of Learning Resources

You will be able to access course information and learning materials through RMIT’s Learning Management System (LMS).  

The LMS will give access to important announcements, a discussion forum, staff contact details, the teaching schedule, course contents, notes, learning materials and data sets, and all assessment briefs.  

A list of recommended textbooks for this course will also be provided on Canvas as reference sources. 

A Library Guide is available at: http://rmit.libguides.com/mathstats 


Overview of Assessment

Please note that this Part A course guide supports two program offerings: RMIT on campus and RMIT Online.

For RMIT on campus 


Assessment item 1: Practical Assessments  

Weighting 45%  

This assessment task supports CLOs 1, 2, 3 & 4

Assessment item 2: Module discipline-based Assessments  

Weighting 10%  

This assessment task supports CLOs 1, 2, 3 & 4


Final Assessment - Online short answer and MCQ assessment 

Weighting 45% 

This assessment task supports CLOs 1, 2, 3 & 4


For RMIT Online - where there are no on campus sessions


Assessment item 1: Assignment 1  

Weighting 30%  

This assessment task supports CLOs 1, 2, 3 & 4 


Assessment item 2: Assignment 2 

Weighting 30%  

This assessment task supports CLOs 1, 2, 3 & 4


Final Assessment - Assignment 3 

Weighting 40% 

This assessment task supports CLOs 1, 2, 3 & 4
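
The weightings listed above can be checked and combined with a short sketch like the one below. It assumes the overall result is a simple weighted sum of component marks, each out of 100; this guide does not state the calculation rule or any hurdle requirements, so treat that as an assumption. The function name and the example marks are hypothetical.

```python
# Assessment weightings (percent) as listed in this course guide.
ON_CAMPUS = {
    "Practical Assessments": 45,
    "Module discipline-based Assessments": 10,
    "Final Assessment": 45,
}
ONLINE = {
    "Assignment 1": 30,
    "Assignment 2": 30,
    "Assignment 3": 40,
}


def weighted_total(weights, marks):
    """Combine component marks (each out of 100) into an overall percentage."""
    if sum(weights.values()) != 100:
        raise ValueError("weightings must sum to 100")
    return sum(weights[name] * marks[name] for name in weights) / 100


# Hypothetical marks, for illustration only.
example_marks = {"Assignment 1": 70, "Assignment 2": 80, "Assignment 3": 65}
overall = weighted_total(ONLINE, example_marks)
```

Both offerings' weightings sum to 100, so the same function applies to either scheme.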