|Unit level:||Level 6|
|Teaching period(s):||Semester 1|
|Offered by||School of Computer Science|
|Available as a free choice unit?:||Y|
This module will examine the entire data life cycle, including data creation, modelling, acquisition, representation, use, maintenance, preservation and disposal. As the majority of data is stored in databases, the module will examine various database engineering approaches to support data management, including database design, data warehousing, maintenance and analytics. Data standards and data quality will be examined and the challenge of "big datasets" will be considered.
Overview
The Harvard Business Review in October 2012 described the role of data scientist as 'the sexiest job of the 21st century'. The 'big data' phenomenon has become part of the vernacular, with the digital universe expected to grow by a factor of 44 between 2009 and 2020, to a trillion gigabytes [IDC Digital Universe Study, 2010]. This has led to recognition of a data life cycle and the need for its systematic management, covering both technical and societal issues. The unit focuses in particular on data standardisation, data quality, and data analytics (description and prediction) across all application domains.
Learning outcomes are detailed on the COMP60711 course unit syllabus page on the School of Computer Science's website for current students.
- Analytical skills
- Problem solving
- Written communication
- Written exam - 60%
- Written assignment (including essay) - 40%
- An overview of the data life cycle
- Data engineering, modelling and design techniques
- Data storage and warehousing
- Data access and maintenance
- Big Data, Map-Reduce, Hadoop
- Data analytics and visualisation
- Engineering non-traditional data types
- Data standards and data quality
COMP60711 reading list can be found on the School of Computer Science website for current students.
Feedback methods
Regular coursework, returned marked with feedback.
- Independent study hours - 150 hours