Part A: Course Overview
Course Title: Big Data Management
Credit Points: 12.00
Terms
Course Code | Campus | Career | School | Learning Mode | Teaching Period(s) |
COSC2636 | City Campus | Postgraduate | 140H Computer Science & Information Technology | Face-to-Face | Sem 1 2016 |
COSC2636 | City Campus | Postgraduate | 171H School of Science | Face-to-Face | Sem 1 2017, Sem 1 2019, Sem 1 2020 |
COSC2636 | City Campus | Postgraduate | 175H Computing Technologies | Face-to-Face | Sem 1 2022, Sem 2 2024 |
Course Coordinator: Dr. Zhifeng Bao
Course Coordinator Phone: +61 3 9925 1940
Course Coordinator Email: zhifeng.bao@rmit.edu.au
Course Coordinator Location: 14.9.10
Course Coordinator Availability: By appointment, by email
Pre-requisite Courses and Assumed Knowledge and Capabilities
Recommended Prior Study
You should have satisfactorily completed or received credit for the following course/s before you commence this course:
If you have completed prior studies at RMIT or another institution that developed the skills and knowledge covered in the above course/s you may be eligible to apply for credit transfer.
Alternatively, if you have prior relevant work experience that developed the skills and knowledge covered in the above course/s you may be eligible for recognition of prior learning.
Please follow the link for further information on how to apply for credit for prior study or experience.
Recommended Concurrent Study
It is recommended that you undertake the following course/s at the same time as this course, as they contain areas of knowledge and skills which are implemented together in practice.
Alternatively, if you have the equivalent skills and knowledge covered in the above course/s you may be eligible for recognition of prior learning.
Please contact your course coordinator for further details.
Course Description
This course builds on skills gained in database management systems and gives students an in-depth understanding of a wide range of fundamental Big Data management systems. In particular, it focuses on the “variety” dimension of the 3Vs of big data, that is, how to store, index and query various types of data (structured, unstructured, geo-spatial and time series data) in real-world applications. The course also introduces end-to-end infrastructure, from front-end to back-end, for solving big data management problems, including data cleaning, data integration, data update, query processing (top-k, k-nearest neighbour, range and point queries), data visualization and data crowdsourcing. Students are expected to develop the skills to extract the core efficiency and scalability challenges from a real-life application scenario, in order to identify and address the bottlenecks of a big data management system.
This course establishes a strong working knowledge of the concepts, techniques and products associated with Big Data. The main focus is on specialized storage models, indexing techniques, and efficient and scalable algorithm designs for query processing over a variety of Big Data. Students will learn the core functionality of each major Big Data component and how the components integrate to form a coherent solution with business benefit. Hands-on programming and algorithm design exercises aim to provide insight into what the tools do so that their role in Big Data systems can be understood. The course balances algorithmic and systems issues. The algorithms discussed involve methods of organising big data so that complex computations can be carried out efficiently over data of great variety.
Objectives/Learning Outcomes/Capability Development
Program Learning Outcomes
This course is an option course so it is not required to contribute to the development of program learning outcomes (PLOs) though it may assist your achievement of several PLOs.
For more information on the program learning outcomes for your program, please see the program guide.
Upon successful completion of this course, you will be able to:
- Demonstrate knowledge of Big Data fundamentals, including the evolution of Big Data, the characteristics of Big Data and the challenges they introduce.
- Characterize and formally define the usability of big data, and extract the core technical and research questions from a real-world problem.
- Acquire and implement various efficient indexing schemes to manage different types of data (to cater for the “variety” of data), including but not limited to geo-spatial data, spatial-textual data, multimedia data, time series data, high-dimensional structured data and crowdsourced data.
- Design algorithms that achieve efficient query processing over heterogeneous data (on top of the indexes designed), and conduct theoretical analysis of the space and time complexity of algorithms applied to large-scale heterogeneous data.
- Adopt an end-to-end approach to turn theoretical analysis into the physical development of a system prototype that addresses real-life applications.
Overview of Learning Activities
- Key concepts will be explained in pre-recorded lectures, classes or online, where syllabus material will be presented and the subject matter will be illustrated with demonstrations and examples;
- Tutorials and/or labs and/or group discussions (including online forums) focused on projects and problem solving will provide practice in the application of theory and procedures, allow exploration of concepts with teaching staff and other students, and give feedback on your progress and understanding;
- Assignments, as described in Overview of Assessment (below), requiring an integrated understanding of the subject matter; and
- Private study, working through the course as presented in classes and learning materials, and gaining practice at solving conceptual and technical problems.
Overview of Learning Resources
You will make use of computer laboratories and relevant software provided by the School. You will be able to access course information and learning materials through Canvas and may be provided with copies of additional materials in class or via email. Lists of relevant reference texts, resources in the library and freely accessible Internet sites will be provided.
Use the RMIT Bookshops textbook list search page to find any recommended textbook(s).
See also the RMIT Library Guide at http://rmit.libguides.com/compsci
Overview of Assessment
Assessment Tasks:
Assessment task 1: Programming Assignment
Weighting: 16%
This assessment supports CLOs 1-5
Assessment task 2: Programming Assignment
Weighting: 18%
This assessment supports CLOs 1-5
Assessment task 3: Programming Assignment
Weighting: 24%
This assessment supports CLOs 1-5
Assessment task 4: Test
Weighting: 42%
This assessment supports CLOs 1-5
If you have a long-term medical condition and/or disability it may be possible to negotiate to vary aspects of the learning or assessment methods. You can contact the program coordinator or Equitable Learning Services if you would like to find out more.