Integrated text and table mining
Primary supervisor
Additional supervisors
- Viktor Schlegel
Funding
- Self-Funded Students Only
If you have the correct qualifications and access to your own funding, either from your home country or your own finances, your application to work with this supervisor will be considered.
Project description
The number of published scientific articles is growing exponentially, so researchers struggle to keep up with advances in their area. Text mining methods can help to make sense of these large corpora by automatically processing them and extracting relevant information. The need to process tables in scientific articles is well documented, as they are a source of succinct and detailed information, such as the performance of an approach or the details of an experiment. However, this information is often incomplete when considered in isolation, because important aspects needed to understand the table are elaborated in the full text of the article.
This project aims to develop technologies for hybrid extraction, linking and analysis of tabular and textual data, using a combination of natural language processing, text mining and machine learning techniques. Research will include developing hybrid approaches that operate jointly over unstructured and semi-structured representations (e.g. text and tables), curating appropriate corpora, and using them in downstream applications such as question answering or semantic retrieval. The project will be placed in the context of different scientific communities, including, for example, the (bio)medical, chemical or computer science domains.
The successful candidate will have an excellent first degree in Computer Science or Computational Linguistics, with clear interests and at least some relevant experience in text mining and language modelling. An understanding of machine learning, and preferably deep learning, is also expected.
Person specification
For information
- Candidates must hold a minimum of an upper Second Class UK Honours degree or international equivalent in a relevant science or engineering discipline.
- Candidates must meet the School's minimum English Language requirement.
- Candidates will be expected to comply with the University's policies and practices of equality, diversity and inclusion.
Essential
Applicants will be required to evidence the following skills and qualifications.
- You must be capable of performing at a very high level.
- You must have a self-driven interest in uncovering and solving unknown problems and be able to work hard and creatively without constant supervision.
Desirable
Applicants will be required to evidence the following skills and qualifications.
- You will have good time management.
- You will possess determination (which is often more important than qualifications), although you will need a good amount of both.
General
Applicants will be required to address the following.
- Comment on your transcript/predicted degree marks, outlining both strong and weak points.
- Discuss your final-year undergraduate project work and, if appropriate, your MSc project work.
- How well does your previous study prepare you for undertaking Postgraduate Research?
- Why do you believe you are suitable for doing Postgraduate Research?