Part A: Course Overview
Course Title: Managing Semi-structured and Unstructured Data
Credit Points: 12.00
Terms
Course Code | Campus | Career | School | Learning Mode | Teaching Period(s)
ISYS3476 | City Campus | Postgraduate | 175H Computing Technologies | Face-to-Face | Sem 2 2025
Course Coordinator: Dr. Zhuang Li
Course Coordinator Phone: +61 3 9925
Course Coordinator Email: zhuang.li@rmit.edu.au
Course Coordinator Availability: By appointment
Pre-requisite Courses and Assumed Knowledge and Capabilities
Enforced Pre-Requisite Courses
Successful completion of the following course/s:
- COSC1295 Advanced Programming (Course ID: 004316)
OR
- COSC2820 Advanced Programming for Data Science (Course ID: 054137)
Note: it is a condition of enrolment at RMIT that you accept responsibility for ensuring that you have completed the prerequisite/s and agree to concurrently enrol in co-requisite courses before enrolling in a course.
For more information, go to the RMIT Course Requisites webpage.
If you have completed prior studies at RMIT or another institution that developed the skills and knowledge covered in the above course/s you may be eligible to apply for credit transfer.
Alternatively, if you have prior relevant work experience that developed the skills and knowledge covered in the above course/s you may be eligible for recognition of prior learning.
Please follow the link for further information on how to apply for credit for prior study or experience.
Course Description
Large Language Models (LLMs) have transformed the way we access and generate information, but they remain limited by the static nature of their training data. Retrieval-Augmented Generation systems offer a powerful solution by combining language models with search capabilities. These systems dynamically retrieve relevant content from external sources, allowing intelligent agents to provide up-to-date and context-aware responses. As a result, search engines and retrieval modules have become integral to modern AI systems, enabling more accurate and grounded decision-making.
This course begins by introducing foundational concepts in information retrieval. You will explore the structure of documents, queries, and collections, and learn how to evaluate relevance in large-scale systems. Key topics include document indexing, Boolean and ranked retrieval models, query expansion techniques, and evaluation using standard benchmarks. These methods form the basis of effective search infrastructure and are critical to supporting dynamic reasoning in downstream applications.
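As a rough illustration of the indexing, Boolean retrieval, and ranked retrieval topics named above, the following minimal sketch builds an inverted index over a toy collection. The documents, function names, and the plain TF-IDF weighting are assumptions chosen for illustration; they are not taken from the course materials.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy collection; IDs and texts are illustrative only.
DOCS = {
    1: "boolean retrieval matches documents exactly",
    2: "ranked retrieval orders documents by relevance",
    3: "query expansion adds related terms to a query",
}


def tokenize(text):
    """Lowercase and split on whitespace (no stemming or stop-word removal)."""
    return text.lower().split()


def build_index(docs):
    """Build an inverted index: term -> {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, tf in Counter(tokenize(text)).items():
            index[term][doc_id] = tf
    return index


def boolean_and(index, query):
    """Boolean retrieval: IDs of documents containing every query term."""
    postings = [set(index.get(term, {})) for term in tokenize(query)]
    return set.intersection(*postings) if postings else set()


def rank_tfidf(index, query, n_docs):
    """Ranked retrieval: score documents by summed tf * idf, best first."""
    scores = defaultdict(float)
    for term in tokenize(query):
        postings = index.get(term, {})
        if not postings:
            continue  # term not in the collection
        idf = math.log(n_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    # Sort by descending score, breaking ties by document ID.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


INDEX = build_index(DOCS)
```

Here `boolean_and(INDEX, "retrieval documents")` returns an unordered set of matching IDs, while `rank_tfidf(INDEX, "ranked retrieval", len(DOCS))` returns the same kind of matches ordered by score, which is the practical difference between the two retrieval models.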
In the second half of the course, you will design and implement a lightweight multi-agent system that leverages LLMs to interact with structured and unstructured data sources. This system will demonstrate how LLM agents coordinate, share knowledge, and make decisions in real time. You will develop skills in LLM communication protocols, simple agent planning, and data-driven reasoning, preparing you to build intelligent systems capable of integrating retrieval into complex workflows. You will also explore retrieval-augmented generation (RAG), prompt engineering, agent memory, and tool use with LLM APIs.
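The retrieve-then-generate pattern behind RAG can be sketched in a few lines. This is a minimal sketch under stated assumptions: the passages, the word-overlap retriever, the prompt wording, and the `echo_llm` placeholder are all hypothetical, and no real LLM API is called. In practice the `llm` argument would wrap whichever model API is used.

```python
# Hypothetical passages standing in for an external data source.
PASSAGES = [
    "The course code is ISYS3476.",
    "Assessment Task 3 is a lightweight multi-agent system.",
    "Lectorials explain key IR and multi-agent concepts.",
]


def tokens(text):
    """Lowercased word set with basic punctuation removed."""
    return set(text.lower().replace(".", "").replace("?", "").split())


def retrieve(question, passages, k=2):
    """Return the k passages sharing the most words with the question.

    Word overlap is a stand-in for a real sparse or dense retriever.
    """
    q = tokens(question)
    ranked = sorted(passages, key=lambda p: len(q & tokens(p)), reverse=True)
    return ranked[:k]


def build_prompt(question, context):
    """Place the retrieved passages ahead of the question in one prompt."""
    lines = ["Answer using only the context below.", "Context:"]
    lines += [f"- {passage}" for passage in context]
    lines += [f"Question: {question}", "Answer:"]
    return "\n".join(lines)


def answer(question, passages, llm):
    """Retrieve context, build the prompt, and pass it to the model callable."""
    context = retrieve(question, passages)
    return llm(build_prompt(question, context))


def echo_llm(prompt):
    """Placeholder model: returns the first context line from the prompt."""
    for line in prompt.splitlines():
        if line.startswith("- "):
            return line[2:]
    return ""
```

Keeping the model behind a plain callable means the retrieval and prompt-building steps can be tested without network access, and the placeholder can later be swapped for a real client.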
Objectives/Learning Outcomes/Capability Development
Program Learning Outcomes
This course is an option course so it is not required to contribute to the development of program learning outcomes (PLOs) though it may assist your achievement of several PLOs.
For more information on the program learning outcomes for your program, please see the program guide.
Upon successful completion of this course, you should have a sound understanding of the foundational concepts of information retrieval and be able to apply them in practice. Specifically, you should be able to:
- Apply and critically evaluate information retrieval principles to extract relevant content from large-scale datasets using both classical methods and modern LLM APIs.
- Devise and optimise advanced indexing strategies and retrieval pipelines for semi-structured data, incorporating hybrid sparse-dense techniques and LLM-based embeddings.
- Analyse and evaluate the performance of retrieval systems using both standard test collections and advanced, context-sensitive evaluation methodologies.
- Design and implement cooperative LLM agents capable of executing structured queries and performing reasoning tasks, through dynamic inter-agent coordination and context management protocols.
- Develop and integrate robust retrieval and prompt-based LLM workflows into pipelines capable of processing unstructured or noisy data in complex, real-world environments.
Overview of Learning Activities
The learning activities in this course include:
- Lectorials to explain key IR and multi-agent concepts, supported by live coding or demonstrations.
- Labs and group discussions (including online forums) focused on problem-solving and implementation.
- Peer and teaching staff feedback on system designs and assignments.
- Literature reviews and research discussions to explore trends in IR and LLM agent-based systems.
- Independent study to reinforce theoretical and practical skills.
Overview of Learning Resources
You will make extensive use of computer laboratories and relevant software provided by the School. Course information and learning materials will be available through MyRMIT, and additional materials may be provided in class or via email.
Lists of relevant reference texts, library resources, and freely accessible Internet sites will be provided.
Use the RMIT Bookshop’s textbook list search page to find any recommended textbook(s).
Overview of Assessment
Assessment Tasks
Assessment Task 1: Document Pre-processing and Feature Engineering
Weighting: 20%
Supports CLOs 1 and 2
Assessment Task 2: Information Retrieval System and Evaluation
Weighting: 30%
Supports CLOs 2 and 3
Assessment Task 3: Lightweight Multi-Agent System
Weighting: 50%
Supports CLOs 1, 2, 3, 4 and 5
If you have a long-term medical condition and/or disability it may be possible to negotiate to vary aspects of the learning or assessment methods. You can contact the program coordinator or Equitable Learning Services if you would like to find out more.