Course Title: Big Data Management

Part A: Course Overview

Credit Points: 12.00

Terms

Course Code	Campus	Career	School	Learning Mode	Teaching Period(s)
COSC2636	City Campus	Postgraduate	140H Computer Science & Information Technology	Face-to-Face	Sem 1 2016
COSC2636	City Campus	Postgraduate	171H School of Science	Face-to-Face	Sem 1 2017, Sem 1 2019, Sem 1 2020
COSC2636	City Campus	Postgraduate	175H Computing Technologies	Face-to-Face	Sem 1 2022

Course Coordinator: Dr. Zhifeng Bao

Course Coordinator Phone: +61 3 9925 1940

Course Coordinator Email: zhifeng.bao@rmit.edu.au

Course Coordinator Location: 14.9.10

Course Coordinator Availability: By appointment, by email

Pre-requisite Courses and Assumed Knowledge and Capabilities

Required Prior Study:

You should have satisfactorily completed the following courses or equivalent before you commence this course.

ISYS1055 Database Concepts
COSC1295 Advanced Programming

Required Concurrent Study:

You should undertake Algorithms and Analysis (COSC1285/2123) at the same time as this course, as it covers knowledge and skills that are applied together with this course in practice.

Alternatively, you may be able to demonstrate the required skills and knowledge before you start this course.

Contact your course coordinator if you think you may be eligible for recognition of prior learning.

Course Description

This course builds on skills gained in database management systems and gives students an in-depth understanding of a wide range of fundamental Big Data Management systems. In particular, it focuses on the “variety” dimension of the 3Vs of big data (volume, velocity, variety): how to store, index and query various types of data (structured, unstructured, geo-spatial and time series data) in real-world applications. The course also introduces end-to-end infrastructure, from front-end to back-end, for solving big data management problems, including data cleaning, data integration, data update, query processing (top-k query, k-nearest neighbour query, range query, point query), data visualization and data crowdsourcing. Students are expected to develop the skills to extract the core efficiency and scalability challenges from a real-life application scenario, in order to identify and address the bottlenecks of a big data management system.

This course establishes a strong working knowledge of the concepts, techniques and products associated with Big Data. The main focus is on specialized storage models, indexing techniques, and efficient and scalable algorithm designs for query processing over a variety of Big Data. Students will learn the core functionality of each major Big Data component and how the components integrate to form a coherent solution with business benefit. Hands-on programming and algorithm design exercises aim to provide insight into what the tools do, so that their role in Big Data systems can be understood. The course keeps a balance between algorithmic and systems issues. The algorithms discussed involve methods of organising big data so that complex computations over highly varied data can be performed efficiently.

Objectives/Learning Outcomes/Capability Development

This course contributes to the development of the following Program Learning Outcomes:

PLO1: Knowledge - Apply a broad and coherent set of knowledge and skills for developing user-centric computing solutions for contemporary societal challenges.

PLO2: Problem Solving - Apply systematic problem solving and decision-making methodologies to identify, design and implement computing solutions to real world problems, demonstrating the ability to work independently to self-manage processes and projects.

PLO4: Communication - Communicate effectively with diverse audiences, employing a range of communication methods in interactions to both computing and non-computing personnel.

On completion of this course you should have gained an understanding of Big Data concepts, including cloud and big data architectures and an overview of Big Data tools and platforms, and be able to apply these concepts using industry-standard tools and products. The key learning outcomes are:

Demonstrate knowledge of Big Data fundamentals, including the evolution of Big Data, the characteristics of Big Data and the challenges it introduces.
Characterize and formally define the usability of big data, and extract the core technical/research questions from a real-world problem.
Acquire and implement various efficient indexing schemes to manage different types of data (to cater for the “Variety” of data), including but not limited to geo-spatial data, spatial-textual data, multimedia data, time series data, high-dimensional structured data and crowdsourced data.
Design algorithms to achieve efficient query processing over heterogeneous data (on top of the index designed), and conduct theoretical analysis of the space and time complexity of algorithms applied to large-scale heterogeneous data.
Adopt an end-to-end approach to turn the theoretical analysis into the physical development of a system prototype that addresses real-life applications.

Overview of Learning Activities

Key concepts will be explained in pre-recorded lectures, in classes or online, where syllabus material will be presented and the subject matter illustrated with demonstrations and examples.
Tutorials, labs and/or group discussions (including online forums) focused on projects and problem solving will provide practice in applying theory and procedures, allow exploration of concepts with teaching staff and other students, and give feedback on your progress and understanding.
Assignments, as described in Overview of Assessment (below), will require an integrated understanding of the subject matter.
Private study will involve working through the course as presented in classes and learning materials, and gaining practice at solving conceptual and technical problems.

Overview of Learning Resources

You will make use of computer laboratories and relevant software provided by the School. You will be able to access course information and learning materials through Canvas and may be provided with copies of additional materials in class or via email. Lists of relevant reference texts, resources in the library and freely accessible Internet sites will be provided.

Use the RMIT Bookshops textbook list search page to find any recommended textbook(s).

See also the RMIT Library Guide at http://rmit.libguides.com/compsci

Overview of Assessment

Assessments:

Assessment task 1: Programming Assignment
Weighting: 16%
This assessment supports CLOs 1-5

Assessment task 2: Programming Assignment
Weighting: 18%
This assessment supports CLOs 1-5

Assessment task 3: Programming Assignment
Weighting: 24%
This assessment supports CLOs 1-5

Assessment task 4: Test
Weighting: 42%
This assessment supports CLOs 1-5
