Course Title: Big Data Infrastructures

Part A: Course Overview

Credit Points: 12.00



Course Coordinator: Professor James Thom

Course Coordinator Phone: +61 3 9925 2992

Course Coordinator Email: james.thom@rmit.edu.au

Course Coordinator Location: 14.9.16

Course Coordinator Availability: By appointment


Pre-requisite Courses and Assumed Knowledge and Capabilities

Expected prior study:

Databases: this prerequisite knowledge can be attained by completing ISYS1057 Database Concepts.
Extensive programming skills: this prerequisite knowledge can be attained by completing COSC1076 Advanced Programming Techniques.


Course Description

This course builds on skills gained in database concepts and gives you an in-depth understanding of a wide range of fundamental Big Data processing platforms.
This course is an overview of Big Data tools and technologies. It establishes a strong working knowledge of the concepts, techniques and products associated with Big Data. The main focus is on the different storage models, processing approaches and reporting tools available to work with Big Data.
You will learn the core functionality of each major Big Data component and how they integrate to form a coherent solution with business benefit. Hands-on exercises aim to provide insight into what the tools do so that their role in Big Data systems can be understood.
The course emphasises how to plan and implement a Big Data solution and the various technologies that comprise Big Data. Many examples and exercises of Big Data systems are provided throughout the course. There will be some exposure, although minimal, to programming examples. These examples will provide an understanding of the workings of the major components of a Big Data solution and how they can be integrated to solve different business problems.
The course keeps a good balance between algorithmic and systems issues. The algorithms discussed in this course involve methods of organising big data for efficient complex computation, using MapReduce on particular platforms such as Hadoop to present practical applications for Big Data.
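
As an illustration of the MapReduce style of computation mentioned above, the following is a minimal sketch of a word count written in plain Python. It runs in a single process and does not use Hadoop; the function names (map_phase, shuffle_phase, reduce_phase, word_count) are hypothetical and chosen only for this example. It is intended to show the shape of the map, shuffle and reduce steps, not how any particular platform implements them.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(document: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input record."""
    for word in document.lower().split():
        yield word, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key, as a framework would do."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: combine all values for one key into a single total."""
    return key, sum(values)


def word_count(documents: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in one process (illustrative only)."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "big ideas"]))
```

On a real cluster the map and reduce functions would run in parallel across many machines, with the platform handling the shuffle, data distribution and fault recovery; the sketch only mirrors the logical flow.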


Objectives/Learning Outcomes/Capability Development

On completion of this course you should have gained an understanding of Big Data concepts, including cloud and big data architectures, and an overview of Big Data tools and platforms, and be able to apply these concepts using industry-standard tools and products. In particular, you should be able to:

  • CLO 1: Describe Big Data Fundamentals, including the evolution of Big Data, the characteristics of Big Data and the challenges introduced.
  • CLO 2: Apply this knowledge using distributed file systems to accommodate Big Data, and to store and query Big Data using state-of-the-art environments such as Hadoop.
  • CLO 3: Critically analyse distributed processing algorithms (for example, MapReduce) and use them to create, distribute, monitor and execute processing jobs in distributed storage environments.
  • CLO 4: Reflect upon and communicate about the issues in developing a Big Data Strategy, defining a Big Data strategy for an organisation, establishing Big Data needs, evaluating commercial Big Data tools as well as enabling analytic innovation and selecting the correct tools.



Overview of Learning Activities

The learning activities included in this course are:

  • Key concepts will be explained in lectures, classes or online, where syllabus material will be presented and the subject matter will be illustrated with demonstrations and examples;
  • Tutorials and/or labs and/or group discussions (including online forums) focused on projects and problem solving will provide practice in the application of theory and procedures, allow exploration of concepts with teaching staff and other students, and give feedback on your progress and understanding;
  • Assignments, as described in Overview of Assessment (below), requiring an integrated understanding of the subject matter; and
  • Private study, working through the course as presented in classes and learning materials, and gaining practice at solving conceptual and technical problems.

Total study hours

A total of 120 hours of study is expected during this course, comprising:
Teacher-directed hours (48 hours):
Lectures, tutorials and laboratory sessions. Each week there will be 2 hours of lectures and 2 hours of tutorial / computer laboratory work. You are encouraged to participate during lectures by asking questions, commenting on the lecture material based on your own experiences and presenting solutions to written exercises. The tutorial / laboratory sessions will introduce you to the tools necessary to undertake the assignment work.
Student-directed hours (72 hours):
You are expected to be self-directed, studying independently outside class.


Overview of Learning Resources

The course is supported by the learning management system which provides specific learning resources. See also the RMIT Library Guide at http://rmit.libguides.com/compsci
You will make use of computer laboratories and relevant software provided by the School. You will be able to access course information and learning materials through myRMIT and may be provided with copies of additional materials in class or via email. Lists of relevant reference texts, resources in the library and freely accessible Internet sites will be provided.


Overview of Assessment

The assessment for this course comprises weekly on-line tests or tasks, a major assignment and a formal written exam.

Note: This course has no hurdle requirements.

Assessment Tasks

Assessment Task 1: Weekly On-line Tests or Tasks
These tests or tasks complement lecture topics and tutorial/laboratory sessions.
Weighting 20%
This assessment task supports CLOs 1, 2, 3 & 4

Assessment Task 2: Major Assignment
In this task, you will design, implement, critically analyse and report on a substantial big data project.
Weighting 30%
This assessment task supports CLOs 2 & 3

Assessment Task 3: End-of-semester Examination
The 2-hour end-of-semester examination will cover all course learning outcomes.
Weighting 50%
This assessment task supports CLOs 1, 2, 3 & 4