Clinical text mining

  • Self-Funded Students Only
If you have the correct qualifications and access to your own funding, either from your home country or your own finances, your application to work with this supervisor will be considered.

Project description

Recent progress in making electronic health records (EHRs) available provides an opportunity to use the vast amounts of clinical information buried in textual form to facilitate personalised health care and improve the quality of clinical practice (e.g. through large-scale data sharing and integration that can be used to build clinical decision support systems). These data can also support medical research (e.g. identifying patients with specific conditions for clinical trials, or improving understanding of treatment benefits and harms). While key issues remain in the adoption of EHRs and in managing data confidentiality, automated processing of the available clinical data remains a major challenge, and manual identification of such information is time-consuming and often inconsistent and incomplete. This is particularly the case with clinical narratives, which are often the primary, preferred and richest source of patient information.

Our team has been involved in a number of projects with both local and national hospitals to develop methods for automated information extraction from clinical notes and letters. This project aims to continue that work. Specifically, it will develop a text mining methodology to extract and structure clinically relevant outcomes and place them in an appropriate context (e.g. temporally or with regard to disease status). One specific challenge will be extracting temporal relationships and dependencies between clinical events (e.g. "the patient to be sent for X if results of Y are below Z").
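As a rough illustration of the kind of dependency described above, the following Python sketch uses a single hand-written rule to turn the example sentence into a structured record. It is not the project's methodology: the pattern, the `ConditionalEvent` record and the `extract_conditional` function are hypothetical names introduced here, and the rule covers only this one phrasing.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical rule for ONE conditional phrasing ("sent for A if results of B
# are below/above C"). Real clinical narratives vary far more than this.
CONDITIONAL = re.compile(
    r"sent for (?P<action>.+?) if (?:the )?results? of (?P<test>.+?) "
    r"(?:is|are) (?P<comparator>below|above) (?P<threshold>[^\s,;]+)",
    re.IGNORECASE,
)


@dataclass
class ConditionalEvent:
    """A planned action that depends on the outcome of another clinical event."""
    action: str      # what the patient is to be sent for
    test: str        # the event whose result is checked
    comparator: str  # "below" or "above"
    threshold: str   # kept as text; units and numeric parsing are out of scope


def extract_conditional(sentence: str) -> Optional[ConditionalEvent]:
    """Return the conditional dependency found in the sentence, or None."""
    match = CONDITIONAL.search(sentence)
    if match is None:
        return None
    return ConditionalEvent(
        action=match.group("action"),
        test=match.group("test"),
        comparator=match.group("comparator").lower(),
        # Drop a sentence-final full stop that the threshold group may capture.
        threshold=match.group("threshold").rstrip("."),
    )


event = extract_conditional(
    "the patient to be sent for X if results of Y are below Z"
)
print(event)
```

A rule like this is brittle on its own, which is one reason to combine rules with learned models rather than rely on either alone.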

The methods will combine rule-based approaches with machine learning, in particular graph-based methods.

The project will be applied and evaluated in several case studies (e.g. cancer, brain injuries and rheumatology) within the newly established Health eResearch Centre, in collaboration with partners from local centres of excellence (e.g. The Christie hospital and Arthritis Research UK).

The successful candidate will have an excellent first degree in Computer Science or Computational Linguistics, with a clear interest and at least some relevant experience in text mining and language modelling. An understanding of machine learning will be a distinct advantage.
