Course Title: Advanced Programming for Data Science
Part A: Course Overview
Credit Points: 12.00
Terms
Course Code | Campus | Career | School | Learning Mode | Teaching Period(s)
COSC2820 | City Campus | Postgraduate | 171H School of Science | Face-to-Face | Sem 2 2021
COSC2820 | City Campus | Postgraduate | 175H Computing Technologies | Face-to-Face | Sem 2 2022, Sem 1 2024, Sem 2 2024
COSC3015 | RMIT University Vietnam | Postgraduate | 175H Computing Technologies | Face-to-Face | Viet3 2022, Viet2 2024
Course Coordinator: Professor Feng Xia
Course Coordinator Phone: +61 9925
Course Coordinator Email: feng.xia@rmit.edu.au
Pre-requisite Courses and Assumed Knowledge and Capabilities
COSC1519 Introduction to Programming OR
COSC2676 Programming Fundamentals for Scientists OR
COSC2531 Programming Fundamentals OR
COSC2801 Programming Bootcamp 1
Note: it is a condition of enrolment at RMIT that you accept responsibility for ensuring that you have completed the prerequisite(s) and agree to concurrently enrol in co-requisite courses before enrolling in a course.
Course Description
This is an advanced programming course, designed specifically for students who are interested in the field of Data Science.
Advanced programming concepts and techniques for data processing (e.g., data parsing, cleansing and integration) will be taught, enabling more complex data pre-processing and preparing data for downstream analysis. These include, for example, handling data stored in different formats (e.g., CSV, JSON, XML), handling bad and missing data, and integrating data from different sources. The course will also introduce both fundamental and state-of-the-art techniques for text pre-processing, which convert raw natural language text into feature representations that can be used directly in downstream analysis. Finally, the course will explore a simple web app development framework, which enables students to deploy the data-driven applications they develop online.
A Python environment will be used for implementation throughout the course.
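As a rough illustration of the kind of task described above (not taken from the course materials), the following sketch parses small records from CSV, JSON and XML using only the Python standard library, maps empty fields to None, and combines the results. The sample data, field names and helper functions are all hypothetical.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical sample data: the same kind of record in three formats.
CSV_TEXT = "id,name,age\n1,Ana,34\n2,Binh,\n"
JSON_TEXT = '[{"id": 3, "name": "Chen", "age": 29}]'
XML_TEXT = "<people><person id='4'><name>Dee</name><age>41</age></person></people>"


def to_int_or_none(value):
    """Convert a raw field to int, mapping empty or absent values to None."""
    if value is None:
        return None
    text = str(value).strip()
    return int(text) if text else None


def parse_csv(text):
    # DictReader uses the first row as the field names.
    return [
        {"id": int(row["id"]), "name": row["name"], "age": to_int_or_none(row["age"])}
        for row in csv.DictReader(io.StringIO(text))
    ]


def parse_json(text):
    return [
        {"id": int(rec["id"]), "name": rec["name"], "age": to_int_or_none(rec.get("age"))}
        for rec in json.loads(text)
    ]


def parse_xml(text):
    root = ET.fromstring(text)
    return [
        {
            "id": int(person.get("id")),
            "name": person.findtext("name"),
            "age": to_int_or_none(person.findtext("age")),
        }
        for person in root.iter("person")
    ]


# Integrate the three sources into one list with a common schema.
records = parse_csv(CSV_TEXT) + parse_json(JSON_TEXT) + parse_xml(XML_TEXT)

# Flag records whose age is missing so they can be handled explicitly.
missing_age = [r["id"] for r in records if r["age"] is None]
print(records)
print("ids with missing age:", missing_age)
```

Representing missing values as None, rather than as an empty string or zero, keeps them distinguishable from real data in later analysis steps.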
Objectives/Learning Outcomes/Capability Development
This course is a core course in MC267 Master of Data Science and contributes to the following Program Learning Outcomes:
1. Enabling Knowledge:
You will gain skills as you apply knowledge with creativity and initiative to new situations. In doing so, you will demonstrate mastery of a body of knowledge that includes recent developments in computer science and information technology.
2. Critical Analysis:
You will learn to accurately and objectively examine, and critically investigate computer science and data science concepts, evidence, theories or situations, in particular to analyse and model complex requirements and constraints for the purpose of processing data and building computing and IT systems.
3. Problem Solving:
Your capability to analyse complex problems and synthesise suitable solutions will be extended as you learn to: design and implement software solutions that accommodate specified requirements and constraints, based on analysis or modelling or requirements specification.
4. Communication:
You will learn to communicate effectively with a variety of audiences through a range of modes and media, in particular to: interpret abstract theoretical propositions, choose methodologies, justify conclusions and defend professional decisions to both Data Science and non-Data Science personnel via technical reports of professional standard and technical presentations.
5. Responsibility:
You will be required to accept responsibility for your own learning and make informed decisions about judging and adopting appropriate behaviour in professional and social situations. This includes accepting the responsibility for independent life-long learning and a high level of accountability. Specifically, you will learn to:
- effectively apply relevant standards, ethical considerations, and an understanding of legal and privacy issues to designing software applications and IT systems.
Upon successful completion of this course, you should be able to:
- Programmatically parse data in the required format;
- Programmatically identify and resolve data quality issues;
- Programmatically integrate data from various sources for data enrichment;
- Pre-process natural language text data to generate effective feature representations;
- Document and maintain an editable transcript of the data pre-processing pipeline for professional reporting;
- Build small to medium scale data-driven applications using a Web development framework.
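To give a sense of what "feature representations" for text can mean in the outcomes above, here is a minimal bag-of-words sketch using only the Python standard library. It is one simple technique among many and is not claimed to be the method taught in the course; the stopword list, function names and sample documents are hypothetical.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list for illustration only.
STOPWORDS = {"the", "a", "is", "of", "and", "to"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def bag_of_words(docs):
    """Return a sorted vocabulary and one count vector per document."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for tokens in tokenized for t in tokens})
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Counter returns 0 for terms that do not occur in this document.
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


docs = ["The data is clean.", "Clean the data and parse the data!"]
vocab, vectors = bag_of_words(docs)
print(vocab)    # ['clean', 'data', 'parse']
print(vectors)  # [[1, 1, 0], [1, 2, 1]]
```

Each document becomes a fixed-length numeric vector over a shared vocabulary, which is the form that most downstream analysis tools expect.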
Overview of Learning Activities
The learning activities for this course include:
- Key concepts will be explained in pre-recorded videos, activity notebooks, lectures and workshops, where syllabus material will be presented and the subject matter illustrated via demonstrations and examples;
- Workshops will focus on hands-on activities and problem solving, allowing exploration of concepts with teaching staff and other students, to provide feedback on progress and understanding;
- Assignments, as described in Overview of Assessment (below), will provide simulation of workplace activities and an opportunity to demonstrate an integrated understanding of the subject matter; and
- Private study, working through the course materials (available online and in class) and gaining practice at solving conceptual and technical problems.
Total study hours: teacher-directed 48, student-directed 72.
Overview of Learning Resources
You will make extensive use of computer laboratories and relevant software provided by the School and/or available for download onto private laptops/machines. You will be able to access course information and learning materials via Canvas and may be provided with copies of additional materials in the library or via freely accessible internet sites. Use the RMIT Bookshop’s textbook list search page to find any recommended textbook(s).
Overview of Assessment
This course has no hurdle requirements.
The assessment for this course comprises an in-class coding exercise, two project assignments and a technical interview.
Assessment Task 1: In-class Coding Exercise
Weighting 15%
This assessment task supports CLOs 1, 2, 3
Assessment Task 2: Assignment 1
Weighting 20%
This assessment task supports CLOs 1, 2, 3, 5
Assessment Task 3: Assignment 2
Weighting 35%
This assessment task supports CLOs 4, 5, 6
Assessment Task 4: Technical Interview
Weighting 30%
This assessment task supports CLOs 1, 2, 3, 4