Course Title: Big Data Processing

Part A: Course Overview

Credit Points: 12.00


Course Code:

Learning Mode:

Teaching Period(s):

City Campus, 140H Computer Science & Information Technology: Sem 2 2015, Sem 2 2016

City Campus, 171H School of Science: Sem 2 2018, Sem 2 2019, Sem 2 2020, Sem 2 2021

City Campus, 175H Computing Technologies: Sem 2 2022, Sem 2 2023

Course Coordinator: Dr Ke Deng

Course Coordinator Phone: +61 3 9925 3202

Course Coordinator Email:

Course Coordinator Location: 14.9.12

Course Coordinator Availability: By appointment, by email

Pre-requisite Courses and Assumed Knowledge and Capabilities

Enforced Prerequisites:

COSC1295 Advanced Programming 


ISYS1055 Database Concepts

Note: it is a condition of enrolment at RMIT that you accept responsibility for ensuring that you have completed the prerequisite/s and agree to concurrently enrol in co-requisite courses before enrolling in a course. 

For further information, see the RMIT Course Requisites webpage.

Course Description

This course builds on your database and programming skills. It aims to give you an in-depth understanding of a wide range of fundamental algorithms and processing platforms used in big data management.

The course covers Big Data fundamentals, including the characteristics of Big Data, the sources of Big Data (such as social media, sensor data, and geospatial data), and the challenges it poses for information management, data analytics, platforms, and architectures. Emphasis will be given to non-relational databases by examining techniques for storing and processing large volumes of structured and unstructured data and streaming data, and for running complex analytics on them. Cloud computing and data centres will also be presented as a solution for handling big data and business intelligence applications.

The course aims to keep a balance between algorithmic and systems issues. The algorithms discussed in this course involve methods of organising big data for efficient complex computation. In addition, we consider Big Data platforms (such as Hadoop) to present practical applications of the algorithms covered in the course.

Objectives/Learning Outcomes/Capability Development

Program Learning Outcomes

This course is a specialisation course that contributes to the following Program Learning Outcomes (PLOs) for MC267 Master of Data Science and MC208 Master of Information Technology:

Critical Analysis:

You will learn to accurately and objectively examine, and critically investigate computer science and information technology (IT) concepts, evidence, theories or situations, in particular to:

-- analyse and model complex requirements and constraints for the purpose of designing and implementing software artefacts and IT systems

-- evaluate and compare designs of software artefacts and IT systems on the basis of organisational and user requirements.

Problem Solving: 

Your capability to analyse complex problems and synthesise suitable solutions will be extended as you learn to: design and implement software solutions that accommodate specified requirements and constraints, based on analysis or modelling or requirements specification.


Communication:

You will learn to communicate effectively with a variety of audiences through a range of modes and media, in particular to: interpret abstract theoretical propositions, choose methodologies, justify conclusions and defend professional decisions to both IT and non-IT personnel via technical reports of professional standard and technical presentations.

Course Learning Outcomes

Upon successful completion of this course, you should understand Big Data concepts, including cloud and big data architectures, Big Data analytics, and the implementation of Big Data platforms, and be able to apply these concepts using an industry-standard non-relational database environment.

The key course learning outcomes are:

  1. Model and implement efficient big data solutions for various application areas using appropriately selected algorithms and data structures.
  2. Analyse methods and algorithms, comparing and evaluating them with respect to time and space requirements, and make appropriate design choices when solving real-world problems.
  3. Motivate and explain trade-offs in the design and analysis of big data processing techniques, in written and oral form.
  4. Explain Big Data fundamentals, including the evolution of Big Data, the characteristics of Big Data and the challenges introduced.
  5. Apply non-relational databases, the techniques for storing and processing large volumes of structured and unstructured data, as well as streaming data.
  6. Apply the novel architectures and platforms introduced for Big Data, i.e., Hadoop, MapReduce, and Spark.

Overview of Learning Activities

You will be actively engaged in a range of learning activities such as lectorials, tutorials, practicals, laboratories, seminars, project work, class discussion, individual and group activities. Delivery may be face to face, online or a mix of both.

You are encouraged to be proactive and self-directed in your learning, asking questions of your lecturer and/or peers and seeking out information as required, especially from the numerous sources available through the RMIT library, and through links and materials specific to this course that are available through myRMIT Studies Course.

Overview of Learning Resources

RMIT will provide you with resources and tools for learning in this course through myRMIT Studies Course.

There are services available to support your learning through the University Library. The Library provides guides on academic referencing and subject specialist help as well as a range of study support services. For further information, please visit the Library page on the RMIT University website and the myRMIT student portal.

Overview of Assessment

The assessment for this course comprises four assignments. 

Note: This course has no hurdle requirements.

Assessment Tasks


Assessment Task 1: MapReduce Programming

This assignment helps students build an understanding of fundamental MapReduce programming principles.
Weighting 25% 
This assessment task supports CLOs 1-4 and 6.

Assessment Task 2: Big Data Processing with a High-Level Language

This assignment features big data processing with a high-level language on the Hadoop platform.
Weighting 25%
This assessment task supports CLOs 1-4 and 6.

Assessment Task 3: Spark Problem Solving 

This assignment gives students the chance to understand Spark programming principles and to develop advanced problem-solving skills by working on a challenging big data analysis task.
Weighting 25%
This assessment task supports CLOs 1-4 and 6.

Assessment Task 4: Timed Assessment (online)

This time-controlled assessment evaluates students' understanding of the fundamental and advanced big data processing knowledge taught in the course.
Weighting 25%
This assessment task supports CLOs 1, 2, 4, 5, and 6.
This is a 2-hour assessment that may be taken at any time within a 24-hour period.