Data Integration & Exploration on Data Lakes
Primary supervisor
Additional supervisors
- Andre Freitas
Funding
- Self-Funded Students Only
If you have the correct qualifications and access to your own funding, either from your home country or your own finances, your application to work with this supervisor will be considered.
Project description
Data Lakes are emerging as data management infrastructures for storing data in various schemata and structural forms. Their goal is to serve as a single entry point for the data analysis process across highly heterogeneous datasets, supporting analytical tasks following a schema-on-read approach, in which data is discovered and integrated when it is to be used. Due to their semantic and structural heterogeneity, Data Lakes bring integration challenges to a new scale of complexity.
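As a rough illustration of the schema-on-read idea described above, the following minimal Python sketch keeps two sources in their raw formats and only applies a schema, and joins them, when an analysis asks for the data. The file names, field names and the `spend_per_customer` helper are hypothetical and chosen purely for illustration; they do not describe any particular Data Lake system or the project's own methods.

```python
import csv
import io
import json

# Hypothetical raw contents of a data lake: each source keeps its own format
# and field names; nothing is normalised when the data is stored.
RAW_SOURCES = {
    "customers.csv": "cust_id,full_name\n1,Ada Lovelace\n2,Alan Turing\n",
    "orders.json": (
        '[{"customer": 1, "total": "12.50"},'
        ' {"customer": 2, "total": "7.00"},'
        ' {"customer": 1, "total": "3.25"}]'
    ),
}


def read_customers(raw: str) -> dict[int, str]:
    """Apply a schema to the CSV source at read time: customer id -> name."""
    reader = csv.DictReader(io.StringIO(raw))
    return {int(row["cust_id"]): row["full_name"] for row in reader}


def read_orders(raw: str) -> list[tuple[int, float]]:
    """Apply a schema to the JSON source at read time: (customer id, total)."""
    return [(int(o["customer"]), float(o["total"])) for o in json.loads(raw)]


def spend_per_customer(sources: dict[str, str]) -> dict[str, float]:
    """Integrate the two sources only when the analysis needs them."""
    names = read_customers(sources["customers.csv"])
    totals: dict[str, float] = {}
    for customer_id, total in read_orders(sources["orders.json"]):
        name = names[customer_id]
        totals[name] = totals.get(name, 0.0) + total
    return totals


if __name__ == "__main__":
    print(spend_per_customer(RAW_SOURCES))
```

In this sketch the mapping between `cust_id` and `customer` is written by hand; discovering such correspondences automatically across many heterogeneous sources is the kind of integration problem the project is concerned with.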
The Information Management Group at the University of Manchester invites applications from PhD candidates in the area of data integration and exploration on Data Lakes. PhD projects in this area will explore how contemporary techniques in Natural Language Processing (such as Open Information Extraction, Distributional Semantics and Semantic Parsing) can be used as a foundation to support exploratory data analysis on real-world Data Lakes.
Examples of research challenges include:
- How to scale the integration of unstructured, semi-structured and structured datasets.
- How to support end-users in exploratory data analysis (for example, using natural language questions).
- How to use information embedded in large-scale corpora to support data integration.
- How to use contemporary techniques in one-shot machine learning to support data integration.
Applicants are expected to have:
- An excellent undergraduate degree in Computer Science or Mathematics (or a related discipline) and, preferably, a relevant MSc degree.
- Confidence and independence in programming complex systems in Java or Python.
- Previous academic or industry experience in Natural Language Processing or Data Science (desired).
- Excellent report writing and presentation skills.
Please note that applicants must additionally satisfy the standard requirements for postgraduate studies at the University of Manchester, such as a first-class or high upper-second-class degree (or an equivalent international qualification) and English language qualifications, as stated in the PGR guidelines.
Qualified applicants are strongly encouraged to contact Norman Paton (norman.paton@manchester.ac.uk) and Andre Freitas (andre.freitas@manchester.ac.uk) informally to discuss their application before applying.
Person specification
For information
- Candidates must hold a minimum of an upper Second Class UK Honours degree or international equivalent in a relevant science or engineering discipline.
- Candidates must meet the School's minimum English Language requirement.
- Candidates will be expected to comply with the University's policies and practices of equality, diversity and inclusion.
Essential
Applicants will be required to evidence the following skills and qualifications.
- You must be capable of performing at a very high level.
- You must have a self-driven interest in uncovering and solving unknown problems and be able to work hard and creatively without constant supervision.
Desirable
Applicants will be required to evidence the following skills and qualifications.
- You will have good time management.
- You will possess determination, which is often more important than qualifications, although you will need a good amount of both.
General
Applicants will be required to address the following.
- Comment on your transcript/predicted degree marks, outlining both strong and weak points.
- Discuss your final-year undergraduate project work and, if appropriate, your MSc project work.
- How well does your previous study prepare you for undertaking Postgraduate Research?
- Why do you believe you are suitable for doing Postgraduate Research?