Course Title: Big Data Management

Part A: Course Overview

Credit Points: 12.00

Terms

Course Code | Campus | Career | School | Learning Mode | Teaching Period(s)
COSC2632 | City Campus | Undergraduate | 171H School of Science | Face-to-Face | Sem 1 2018, Sem 1 2019, Sem 1 2020
COSC2632 | City Campus | Undergraduate | 175H Computing Technologies | Face-to-Face | Sem 1 2022

Course Coordinator: Dr Zhifeng Bao

Course Coordinator Phone: +61 3 9925 2793

Course Coordinator Email: zhifeng.bao@rmit.edu.au

Course Coordinator Location: 14.9.10

Course Coordinator Availability: By appointment, by email


Pre-requisite Courses and Assumed Knowledge and Capabilities

Required Prior Study:

You should have satisfactorily completed the following courses or equivalent before you commence this course.

  • ISYS1057 - Database Concepts
  • COSC1076 - Advanced Programming Techniques

Required Concurrent Study:

You should undertake Algorithms and Analysis (COSC1285/2123) at the same time as this course, as the two courses cover knowledge and skills that are applied together in practice.

Alternatively, you may be able to demonstrate the required skills and knowledge before you start this course.

Contact your course coordinator if you think you may be eligible for recognition of prior learning. 


Course Description

This course builds on skills gained in database management systems and gives students an in-depth understanding of a wide range of fundamental Big Data Management systems. In particular, it focuses on the “variety” dimension of the 3Vs of big data (volume, velocity and variety): how to store, index and query various types of data (structured, unstructured, geo-spatial and time series data) in a real-world application. The course also introduces end-to-end infrastructure for solving big data management problems, from front-end to back-end, including data cleaning, data integration, data update, query processing (top-k query, k-nearest neighbour query, range query, point query), data visualization and data crowdsourcing. Students are expected to develop the skills to extract the core efficiency and scalability challenges from a real-life application scenario in order to identify and address the bottlenecks of a big data management system.
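
As a rough illustration of the four query types named in the description, the following minimal Python sketch runs each of them over a tiny in-memory dataset. The data, the function names and the brute-force linear scans are assumptions made for illustration only and are not course material; the indexing techniques covered in the course aim to answer the same queries without scanning every record.

```python
import heapq
import math

# Hypothetical sample records: (name, x, y, rating)
PLACES = [
    ("cafe", 1.0, 1.0, 4.5),
    ("library", 2.0, 3.0, 4.8),
    ("museum", 5.0, 4.0, 4.2),
    ("park", 6.0, 1.0, 3.9),
]


def point_query(places, x, y):
    """Return names of records located exactly at (x, y)."""
    return [p[0] for p in places if p[1] == x and p[2] == y]


def range_query(places, x_min, y_min, x_max, y_max):
    """Return names of records inside an axis-aligned rectangle (inclusive)."""
    return [
        p[0]
        for p in places
        if x_min <= p[1] <= x_max and y_min <= p[2] <= y_max
    ]


def knn_query(places, x, y, k):
    """Return names of the k records closest to (x, y) by Euclidean distance."""
    nearest = heapq.nsmallest(
        k, places, key=lambda p: math.hypot(p[1] - x, p[2] - y)
    )
    return [p[0] for p in nearest]


def top_k_query(places, k):
    """Return names of the k records with the highest rating."""
    best = heapq.nlargest(k, places, key=lambda p: p[3])
    return [p[0] for p in best]
```

Each function scans all records, so its cost grows linearly with the size of the dataset. That cost is the kind of bottleneck an index is meant to remove.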

This course establishes a strong working knowledge of the concepts, techniques and products associated with Big Data. The main focus is on the specialized storage models, indexing techniques, and efficient and scalable algorithm designs for query processing that are needed to work with a variety of Big Data.

Students will learn the core functionality of each major Big Data component and how they integrate to form a coherent solution with business benefit. Hands-on programming and algorithm design exercises aim to provide insight into what the tools do so that their role in Big Data systems can be understood.

The course balances algorithmic and systems issues. The algorithms discussed involve methods of organising big data so that complex computations can be performed efficiently over highly varied data.


Objectives/Learning Outcomes/Capability Development

Program Learning Outcomes

This course contributes to the development of the following Program Learning Outcomes:

PLO2: Problem Solving - Apply systematic problem solving and decision-making methodologies to identify, design and implement computing solutions to real world problems, demonstrating the ability to work independently to self-manage processes and projects.

PLO4: Communication - Communicate effectively with diverse audiences, employing a range of communication methods in interactions with both computing and non-computing personnel.


On completion of this course you should have gained an understanding of Big Data concepts, including cloud and big data architectures, and an overview of Big Data tools and platforms, and be able to apply these concepts using industry-standard tools and products.

The course learning outcomes are:

  1. Be knowledgeable about Big Data fundamentals, including the evolution of Big Data, the characteristics of Big Data and the challenges it introduces.
  2. Be proficient in characterizing and formally defining the usability of big data, and in extracting the core technical/research questions from a real-world problem.
  3. Acquire and implement various efficient indexing schemes to manage different types of data (to cater for the “Variety” of data), including but not limited to geo-spatial data, spatial-textual data, multimedia data, time series data, high-dimensional structured data and crowdsourced data.
  4. Design algorithms that achieve efficient query processing over heterogeneous data (on top of the index designed), and conduct theoretical analysis of the space and time complexity of these algorithms when applied to large-scale heterogeneous data.


Overview of Learning Activities

The learning activities included in this course are:

  • Key concepts will be explained in pre-recorded lectures, classes or online, where syllabus material will be presented and the subject matter will be illustrated with demonstrations and examples;
  • Tutorials and/or labs and/or group discussions (including online forums) focused on projects and problem solving will provide practice in the application of theory and procedures, allow exploration of concepts with teaching staff and other students, and give you feedback on your progress and understanding;
  • Assignments, as described in Overview of Assessment (below), requiring an integrated understanding of the subject matter; and
  • Private study, working through the course as presented in classes and learning materials, and gaining practice at solving conceptual and technical problems.


Overview of Learning Resources

The course is supported by the learning management system which provides specific learning resources.

See also the RMIT Library Guide at http://rmit.libguides.com/compsci

You will make use of computer laboratories and relevant software provided by the School. You will be able to access course information and learning materials through Canvas and may be provided with copies of additional materials in class or via email. Lists of relevant reference texts, resources in the library and freely accessible Internet sites will be provided.

Use the RMIT Bookshops textbook list search page to find any recommended textbook(s).


Overview of Assessment

Assessments:

Assessment task 1: Programming Assignment
Weighting: 16%
This assessment supports CLOs 1-4

Assessment task 2: Programming Assignment
Weighting: 18%
This assessment supports CLOs 1-4 

Assessment task 3: Programming Assignment
Weighting: 24%
This assessment supports CLOs 1-4

Assessment task 4: Test
Weighting: 42%
This assessment supports CLOs 1-4