Quality and Cost-Based Query Processing in XML Databases

Primary supervisor

Contact admissions office

Project description

This research topic concerns processing and optimising queries over XML databases, taking the quality of data into account in addition to query response time. Efficient query processing and optimisation is relevant to most applications of computer science and has been studied extensively, yielding a sizeable pool of experience and results that can be re-examined and re-used in new contexts. Quality of data, by contrast, has gained attention only recently. Interest in it comes from enterprises seeking to become more competitive and to avoid profit losses, and from the need to integrate data stored in remote, different, and possibly unreliable data sources, especially in the e-Science domain. We believe that combining these research areas can lead to interesting insights and results in XML query optimisation aimed at achieving both optimal query response time and the best quality of results.

There are a number of challenges involved in the development of this research, including: extending XML-based query languages to express quality constraints; capturing quality characteristics at data creation/input; finding suitable mechanisms for annotating data with quality information and storing annotations in an efficient way; updating data as well as its associated annotations during query processing; propagating annotations as data sets get combined or integrated, avoiding redundancies and conflicts; removing or updating annotations as more quality information is made available; and using the available annotations during query optimisation to extend the query optimiser with rules and heuristics related to the selection of operators with quality-related functionality and to the selection of best-quality results.

Because the topic intersects with multiple areas of research, such as query processing and optimisation, data quality, and data annotation and management, it is a suitable project for students seeking to broaden their knowledge and develop expertise in combined research areas, while working on a multi-disciplinary project involving experts from the IMG group (query processing and optimisation, data quality, data annotation and management) and Manchester Business School (business applications and data quality problems).
