Department of Computer Science

Deep Learning Architectures for Complex Data Fusion and Integration

Primary supervisor

Additional supervisors

  • Andre Freitas


  • Competition Funded Project (Students Worldwide)

This research project is one of a number of projects at this institution. It is in competition for funding with one or more of these projects. Usually the project which receives the best applicant will be awarded the funding. Applications for this project are welcome from suitably qualified candidates worldwide. Funding may only be available to a limited set of nationalities and you should read the full department and project details for further information.

Project description
Data Lakes are emerging as data management infrastructures for storing data in various schemata and structural forms. Their goal is to serve as a single entry point for the data analysis process across highly heterogeneous datasets, supporting analytical tasks following a schema-on-read approach, in which data is discovered and integrated when it is to be used. Due to their semantic and structural heterogeneity, Data Lakes bring integration challenges to a new scale of complexity.

In this project we will explore the interface between emerging Deep Learning representation paradigms and heterogeneous dataspaces (structured, semi-structured and unstructured data), investigating how contemporary deep learning architectures and their induced embeddings can serve as a foundation for data integration, fusion and interpretation on data lakes. You will have the opportunity to design novel AI architectures exploring the space of contemporary methods such as transformers, variational autoencoders and graph neural networks.

Topics of interest include:

  • Design of novel neural and variational embeddings for tables.
  • Applications of table embeddings in inference tasks.
  • Embeddings as a supporting paradigm for data fusion.
  • (Semantically deep) program synthesis for data transformation (few-shot learning settings).
  • Explaining table differences (via explainable neural architectures).

Applicants are expected to have:

  • An excellent undergraduate degree in Computer Science or Mathematics (or related discipline), and preferably, a relevant M.Sc. degree.
  • Confidence and independence in programming complex systems in Java or Python.
  • Previous academic or industry experience in Natural Language Processing, Machine Learning or Data Science (desired).
  • Excellent report writing and presentation skills.

Qualified applicants are strongly encouraged to contact Norman Paton and Andre Freitas informally to discuss the application before applying.

Person specification

For information


Applicants will be required to evidence the following skills and qualifications.

  • You must be capable of performing at a very high level.
  • You must have a self-driven interest in uncovering and solving unknown problems and be able to work hard and creatively without constant supervision.


Applicants will be required to evidence the following skills and qualifications.

  • You will have good time management.
  • You will possess determination (which is often more important than qualifications) although you'll need a good amount of both.


Applicants will be required to address the following.

  • Comment on your transcript/predicted degree marks, outlining both strong and weak points.
  • Discuss your final-year undergraduate project work and, if appropriate, your MSc project work.
  • How well does your previous study prepare you for undertaking Postgraduate Research?
  • Why do you believe you are suitable for doing Postgraduate Research?