
Department of Computer Science

Data Lake Exploration with Modern Artificial Intelligence Techniques

Primary supervisor

Additional supervisors

  • Jiaoyan Chen



  • Competition Funded Project (Students Worldwide)

This research project is one of a number of projects at this institution. It is in competition for funding with one or more of these projects. Usually the project which receives the best applicant will be awarded the funding. Applications for this project are welcome from suitably qualified candidates worldwide. Funding may only be available to a limited set of nationalities and you should read the full department and project details for further information.

Project description

Data Lakes are emerging as data management infrastructures for storing data in various schemata and structural forms. Their goal is to serve as a single entry point for the data analysis process across highly heterogeneous datasets, supporting analytical tasks following a schema-on-read approach, in which data is discovered and integrated only when it is to be used. Due to their semantic and structural heterogeneity, Data Lakes bring integration challenges to a new scale of complexity. With the rapid development of Artificial Intelligence in recent years, modern techniques such as Large Language Models and Knowledge Graphs have proved effective on a wide range of problems, including those in data management and data science. These techniques provide a new and promising direction for addressing the challenges of Data Lakes.

The Information Management Group at the University of Manchester invites applications from PhD candidates in the area of Artificial Intelligence for Exploration of Data Lakes. PhD projects in this area will explore how contemporary techniques building on Language Models (such as Prompt Learning and Instruction Following Fine-tuning), Knowledge Engineering (such as Knowledge Graphs) and Data Engineering can be brought together to explore the deep semantics of tabular data for more efficient and effective Data Lake management.

Examples of research challenges include: 1) how to embed tables in a vector space with their schemas, instances and associated metadata; 2) how to combine semantics from Language Models and Knowledge Graphs for semantic table annotation and schema inference; and 3) how to characterize complex relationships between tables and table attributes and to use these to inform data integration.
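As a purely illustrative sketch of the first challenge, the Python below maps a table's schema, instances and metadata into a single vector and compares tables by cosine similarity. It is not the project's method: it swaps in a simple hashed bag-of-tokens in place of a Language Model, and the names `embed_table`, `cosine` and `DIM`, the weighting of column names, and the toy tables are all assumptions made for this example.

```python
import hashlib
import math
import re

DIM = 256  # hypothetical vector size, chosen only for illustration


def tokens(text):
    """Lower-case a value and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", str(text).lower())


def embed_table(columns, rows, metadata="", dim=DIM):
    """Hash tokens from schema, instances and metadata into one unit vector."""
    vec = [0.0] * dim
    # Column names count double; this weighting is an arbitrary assumption.
    weighted = [(2.0, column) for column in columns]
    weighted += [(1.0, cell) for row in rows for cell in row]
    weighted.append((1.0, metadata))
    for weight, text in weighted:
        for tok in tokens(text):
            digest = hashlib.md5(tok.encode("utf-8")).digest()
            index = int.from_bytes(digest[:4], "big") % dim
            vec[index] += weight
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def cosine(a, b):
    """Cosine similarity of two vectors that are already unit length."""
    return sum(x * y for x, y in zip(a, b))


# Toy tables invented for this example.
staff = embed_table(["name", "department"],
                    [["Ada", "Computer Science"]],
                    "university staff list")
people = embed_table(["name", "dept"],
                     [["Alan", "Computer Science"]],
                     "staff directory")
sales = embed_table(["invoice", "amount"],
                    [["A17", "250"]],
                    "quarterly sales")

print(cosine(staff, people), cosine(staff, sales))
```

Tables that share column names and values land closer together in this space than unrelated ones. A token-hashing scheme cannot tell that "dept" and "department" mean the same thing, which is the kind of gap that the Language Model and Knowledge Graph semantics in the second challenge would aim to close.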

Applicants are expected to have:

1. An excellent undergraduate degree in Computer Science or Mathematics (or a related discipline) and, preferably, a relevant MSc degree.
2. Confidence and independence in programming complex systems in Java or Python.
3. Previous academic or industry experience in at least one relevant topic, such as Machine Learning, Natural Language Processing, Semantic Web, Knowledge Engineering, Data Engineering or Data Science.
4. Excellent report writing and presentation skills.

Please note that applicants must additionally satisfy the standard requirements for postgraduate studies at the University of Manchester, such as a first-class or high upper-second-class degree (or an equivalent international qualification) and English language qualifications, as stated in the Postgraduate Research Degree guidelines.

Person specification



Applicants will be required to evidence the following skills and qualifications.

  • You must be capable of performing at a very high level.
  • You must have a self-driven interest in uncovering and solving unknown problems and be able to work hard and creatively without constant supervision.


  • You will have good time management.
  • You will possess determination, which is often more important than qualifications, although you will need a good amount of both.


Applicants will be required to address the following.

  • Comment on your transcript/predicted degree marks, outlining both strong and weak points.
  • Discuss your final-year undergraduate project work and, if appropriate, your MSc project work.
  • How well does your previous study prepare you for undertaking Postgraduate Research?
  • Why do you believe you are suitable for doing Postgraduate Research?