Part A: Course Overview

Course Title: Managing Semi-structured and Unstructured Data

Credit Points: 12.00

Course Coordinator: Dr. Falk Scholer

Course Coordinator Phone: +61 3 9925 9831

Course Coordinator Email:

Course Coordinator Location: City Campus, Building 14, Level 9, Room 22

Course Coordinator Availability: By appointment

Pre-requisite Courses and Assumed Knowledge and Capabilities

Enforced Pre-Requisite Courses 

Successful completion of the following course/s: 

Note: it is a condition of enrolment at RMIT that you accept responsibility for ensuring that you have completed the prerequisite/s and agree to concurrently enrol in co-requisite courses before enrolling in a course. 

For more information, go to the RMIT Course Requisites webpage.


If you have completed prior studies at RMIT or another institution that developed the skills and knowledge covered in the above course/s you may be eligible to apply for credit transfer.

Alternatively, if you have prior relevant work experience that developed the skills and knowledge covered in the above course/s you may be eligible for recognition of prior learning.

Please follow the link for further information on how to apply for credit for prior study or experience.

Course Description

The Internet is the world’s largest collection of information. Search engines are the key enabling technology to help users to find useful material among the billions of available resources. In this course you will learn about the techniques used to retrieve useful information from repositories such as the Web. 

The course first introduces standard concepts in information retrieval such as documents, queries, collections, and relevance. 

It then considers approaches for efficient indexing, which allow candidate answer documents to be identified quickly. To find the best answers, a range of querying approaches, such as Boolean and ranked retrieval, is studied. Modern techniques for crawling data from the web and support functions such as query suggestion and spelling correction are also covered, as well as a selection of advanced application areas such as document summarisation, cross-lingual retrieval, and image search.

Objectives/Learning Outcomes/Capability Development

Program Learning Outcomes

This course is an option course, so it is not required to contribute to the development of program learning outcomes (PLOs), though it may assist your achievement of several PLOs.

For more information on the program learning outcomes for your program, please see the program guide.

Upon successful completion of this course, you should have gained a good understanding of the foundational concepts of information retrieval and be able to put these concepts into practice. Specifically, you should be able to:

  1. Apply information retrieval principles to locate relevant information in large collections of data
  2. Understand and deploy efficient techniques for the indexing of document objects that are to be retrieved
  3. Implement features of retrieval systems for web-based and other search tasks
  4. Analyse the performance of retrieval systems using test collections
  5. Make practical recommendations about deploying information retrieval systems in different search domains, including considerations for document management and querying.

Overview of Learning Activities

The learning activities included in this course are:

  • lectures, where key concepts will be explained and illustrated through relevant demonstrations and examples;
  • tutorials and/or labs and/or group discussions (including online forums), focussed on analysis and problem solving as applied to specific projects and scenarios, which will provide practice in the application of theory, allow you to explore concepts with teaching staff and peers, and provide feedback on your progress and understanding;
  • interaction with IT specialist teaching staff to justify the design and implementation of approaches;
  • review of current research literature in the area, through which critical thinking and analysis will be developed;
  • private study, in which you work through the course as presented in classes and learning materials and gain practice at solving conceptual and technical problems.

Overview of Learning Resources

You will make extensive use of computer laboratories and relevant software provided by the School. You will be able to access course information and learning materials through MyRMIT and may be provided with copies of additional materials in class or via email. Lists of relevant reference texts, resources in the library and freely accessible Internet sites will be provided.

Use the RMIT Bookshop’s textbook list search page to find any recommended textbook(s).

Overview of Assessment

Assessment Tasks

Assessment task 1: Document indexing
Weighting 20%
This assessment task supports CLOs 1 and 2.

Assessment task 2: Document retrieval
Weighting 30%
This assessment task supports CLOs 1, 3, 4 and 5.

Assessment task 3: Examination
Weighting 50%
This assessment task supports CLOs 1-5.

If you have a long-term medical condition and/or disability, it may be possible to negotiate to vary aspects of the learning or assessment methods. You can contact the program coordinator or Equitable Learning Services if you would like to find out more.