Course Title: Internet and Intranet Document Engineering

Part A: Course Overview

Credit Points: 12.00


Course Code: COSC1168
Campus: City Campus
Career: Postgraduate
School: 140H Computer Science & Information Technology
Learning Mode: Face-to-Face
Teaching Period(s): Sem 2 2006, Sem 2 2007, Sem 2 2008, Sem 2 2009, Sem 2 2010, Sem 2 2011, Sem 2 2012, Sem 2 2013, Sem 2 2015

Course Code: COSC1169
Campus: City Campus
Career: Undergraduate
School: 140H Computer Science & Information Technology
Learning Mode: Face-to-Face
Teaching Period(s): Sem 2 2006, Sem 2 2007, Sem 2 2008, Sem 2 2009, Sem 2 2010, Sem 2 2011, Sem 2 2012, Sem 2 2013, Sem 2 2015

Course Coordinator: Dr. Zhifeng Bao

Course Coordinator Phone: +61 3 9925 1940

Course Coordinator Email: zhifeng.bao@rmit.edu.au


Pre-requisite Courses and Assumed Knowledge and Capabilities

You may not enrol in this course unless it is explicitly listed in your enrolment program summary, and you have confirmed with your program coordinator that it is an appropriate choice for your study plan.

The ability to write programs in Java to manipulate XML documents is essential. You should also be able to outline how data is managed by a database system and how queries are evaluated. An ability to use Unix tools is expected. A basic understanding of HTML is helpful. Completion of the following courses will satisfy these requirements.

Database Concepts


Course Description

 

The course covers the principles and cutting-edge practical techniques of data engineering on both the Internet and intranets. We focus on keyword search, the most user-friendly query paradigm, which benefits almost every Internet user. In terms of topics, we cover keyword query processing over semi-structured XML data, spatio-textual data such as Google Maps, social network data that combines short texts with a large social graph topology, and structured relational data. In terms of techniques, we cover various state-of-the-art technologies used to build a comprehensive search engine, including but not limited to visualisation of search results and data, user interaction design, personalized search, fuzzy search, type-ahead search, real-time search and keyword query suggestion.


Objectives/Learning Outcomes/Capability Development

This course contributes to the development of the following capabilities:

  • Critical analysis: Analyse and model requirements and constraints for the purpose of designing and implementing search engines over data with various structures.
  • Problem solving: Design and implement a comprehensive search engine that accommodates specified requirements and constraints, based on modeling or requirements specification.
  • Communication: Motivate and explain complex data engineering concepts, relevant alternatives and decision recommendations to IT specialists, via technical reports of professional standard, and technical presentations.


On completion of this course you should be able to: 

  1. Understand and distinguish the key challenges raised by search over different types of data, i.e. structured data (such as relational databases), semi-structured data (such as XML databases), spatio-textual data (such as Google Maps and Bing Maps) and social network data (such as Twitter and Facebook).
  2. Understand the difference between search over unstructured text documents and search over data with structure.
  3. Understand the importance of data and query result visualization and of user interaction in a search engine, and learn some common practical designs for visualization and interaction.
  4. Learn and understand how to support query suggestion in your search engine.
  5. Learn and understand how to support error tolerance in query processing to enable fuzzy search.
  6. Learn and understand how to support type-ahead search, which finds matching results even when users have not completed their query.
  7. Learn the challenges of supporting real-time search, which is especially important for social media search.
  8. Learn how personalized search is implemented in a search engine.
  9. Apply the above technologies to build a practical search tool over a certain type of data.


Overview of Learning Activities

The learning activities included in this course are:
 

  • key concepts will be explained in lectures, classes or online, where syllabus material will be presented and the subject matter will be illustrated with demonstrations and examples;
  • tutorials and/or labs and/or group discussions (including online forums) focussed on projects and problem solving will provide practice in the application of theory and procedures, allow exploration of concepts with teaching staff and other students, and give feedback on your progress and understanding;
  • assignments, as described in Overview of Assessment (below), requiring an integrated understanding of the subject matter; and
  • private study, working through the course as presented in classes and learning materials, and gaining practice at solving conceptual and technical problems.


Overview of Learning Resources

You will make extensive use of computer laboratories and relevant software provided by the School. You will be able to access course information and learning materials through myRMIT and may be provided with copies of additional materials in class or via email. Lists of relevant reference texts, resources in the library and freely accessible Internet sites will be provided.

Use the RMIT Bookshop’s textbook list search page to find any recommended textbook(s).
 


Overview of Assessment

The assessment for this course comprises practical work (including the development of computer programs and scripts using a combination of technologies such as XML, Java and key-value databases) and a final exam.


For standard assessment details, including deadlines, weightings, and hurdle requirements relating to Computer Science and IT courses see: http://www.rmit.edu.au/compsci/cgi