Course Title: Managing Semi-structured and Unstructured Data

Part A: Course Overview

Credit Points: 12.00

Terms

Course Code | Campus      | Career        | School                                         | Learning Mode | Teaching Period(s)
ISYS1078    | City Campus | Postgraduate  | 140H Computer Science & Information Technology | Face-to-Face  | Sem 2 2006, Sem 2 2007, Sem 2 2008, Sem 2 2009, Sem 2 2010, Sem 2 2011, Sem 2 2012, Sem 2 2013, Sem 2 2014, Sem 2 2015
ISYS1078    | City Campus | Postgraduate  | 171H School of Science                         | Face-to-Face  | Sem 1 2017, Sem 1 2018, Sem 2 2019
ISYS1079    | City Campus | Undergraduate | 140H Computer Science & Information Technology | Face-to-Face  | Sem 2 2006, Sem 2 2007, Sem 2 2008, Sem 2 2009, Sem 2 2010, Sem 2 2011, Sem 2 2012, Sem 2 2013, Sem 2 2014, Sem 2 2015
ISYS1079    | City Campus | Undergraduate | 171H School of Science                         | Face-to-Face  | Sem 1 2017, Sem 1 2018, Sem 2 2019

Course Coordinator: Dr. Falk Scholer

Course Coordinator Phone: +61 3 9925 9831

Course Coordinator Email: falk.scholer@rmit.edu.au

Course Coordinator Location: City Campus, Building 14, Level 9, Room 22

Course Coordinator Availability: By appointment


Pre-requisite Courses and Assumed Knowledge and Capabilities

You should be able to program in Java or C, and have knowledge of core data structures and algorithms, equivalent to

COSC1076 / COSC2207 Advanced Programming Techniques (formerly Programming Techniques)

OR

COSC1295 Advanced Programming (formerly Java for Programmers)


Course Description

The Internet is the world’s largest collection of information. Search engines are the key enabling technology to help users to find useful material among the billions of available resources. In this course you will learn about the techniques used to retrieve useful information from repositories such as the Web.

 

The course first introduces standard concepts in information retrieval such as documents, queries, collections, and relevance. Approaches for efficient indexing, which allow candidate answer documents to be identified quickly, are then considered. To find the best answers, a range of querying approaches, such as Boolean and ranked retrieval, are studied. The course also covers modern techniques for crawling data from the web, support functions such as query suggestion and spelling correction, and a selection of advanced application areas such as document summarisation, cross-lingual retrieval, and image search.
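
As a rough illustration of the indexing and Boolean querying ideas mentioned above, the following is a minimal sketch in Python. It is not course material: the function names and the three-document toy collection are hypothetical, and tokenisation is reduced to lowercasing and splitting on whitespace.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the sorted list of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Assumption: lowercase whitespace tokens stand in for real tokenisation.
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def boolean_and(index, query):
    """Return the ids of documents that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return []
    result = set(index.get(terms[0], []))
    for term in terms[1:]:
        result &= set(index.get(term, []))
    return sorted(result)


# Hypothetical toy collection, for illustration only.
docs = {
    1: "search engines index the web",
    2: "ranked retrieval orders documents by relevance",
    3: "web search uses ranked retrieval",
}
index = build_inverted_index(docs)
print(boolean_and(index, "web search"))
```

The index is built once, so a query only needs to intersect the short posting lists of its terms rather than scan every document. Ranked retrieval would replace the intersection with a scoring function over the same lists.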


Objectives/Learning Outcomes/Capability Development

Program Learning Outcomes

This course contributes to the following program learning outcomes:

  • PLO1: Knowledge - Apply a broad and coherent set of knowledge and skills for developing user-centric computing solutions for contemporary societal challenges.
  • PLO2: Problem Solving - Apply systematic problem solving and decision-making methodologies to identify, design and implement computing solutions to real world problems, demonstrating the ability to work independently to self-manage processes and projects.
  • PLO3: Cognitive and Technical Skill - Critically analyse and evaluate user requirements and design systems employing software development tools, techniques, and emerging technologies.
  • PLO4: Communication - Communicate effectively with diverse audiences, employing a range of communication methods in interactions with both computing and non-computing personnel.


Course Learning Outcomes

On completion of this course you should have gained a good understanding of the foundational concepts of information retrieval and be able to put these concepts into practice. Specifically, you should be able to:

  1. Apply information retrieval principles to locate relevant information in large collections of data
  2. Understand and deploy efficient techniques for the indexing of document objects that are to be retrieved
  3. Implement features of retrieval systems for web-based and other search tasks
  4. Analyse the performance of retrieval systems using test collections
  5. Make practical recommendations about deploying information retrieval systems in different search domains, including considerations for document management and querying.


Overview of Learning Activities

The learning activities included in this course are:

  • lectures, where key concepts are presented and illustrated through relevant demonstrations and examples;
  • tutorials and/or labs and/or group discussions (including online forums), which focus on analysis and problem solving as applied to specific projects and scenarios, give you practice in applying theory, let you explore concepts with teaching staff and peers, and provide feedback on your progress and understanding;
  • interaction with IT specialist teaching staff, in which you justify the design and implementation of your approaches;
  • review of current research literature in the area, which develops critical thinking and analysis;
  • private study, in which you work through the course as presented in classes and learning materials and gain practice at solving conceptual and technical problems.

 

Total Study Hours

Teacher Guided Hours: 36 per semester

Learner Directed Hours: 84 per semester

Attendance: While there is no compulsory minimum attendance standard, non-attendance is correlated with lack of success in this course. Where visa conditions apply, attendance is compulsory.

 

 


Overview of Learning Resources

You will make extensive use of computer laboratories and relevant software provided by the School. You will be able to access course information and learning materials through MyRMIT and may be provided with copies of additional materials in class or via email. Lists of relevant reference texts, resources in the library and freely accessible Internet sites will be provided.

Use the RMIT Bookshop’s textbook list search page to find any recommended textbook(s).


Overview of Assessment

The assessment for this course comprises practical project work and a final exam.  

  • Assessment task 1: Assignment 1, Document indexing. 20%. This assessment task supports CLOs 1 and 2.
  • Assessment task 2: Assignment 2, Document retrieval. 30%. This assessment task supports CLOs 1, 3, 4 and 5.
  • Assessment task 3: Examination, 2 hours. 50%. This assessment task supports CLOs 1 to 5.