Department of Computer Science

Data-Science Approaches to Better Understand Multimorbidity and Treatment Outcomes in Patients with Rheumatoid Arthritis

  • Self-Funded Students Only

If you have the correct qualifications and access to your own funding, either from your home country or your own finances, your application to work with this supervisor will be considered.

Project description

Chronic inflammatory diseases such as rheumatoid arthritis (RA) are potentially life-ruining. Not only is the condition itself associated with significant pain and disability, but patients are also more likely to be diagnosed with other co-morbid conditions. This multimorbidity can both be caused by, and influence, disease status (such as remission) and medications. How these comorbidities develop and cluster over time, and how they are associated with medications, is not well understood. Data-science approaches offer a new opportunity to explore this further, with the potential to reveal previously unrecognised patterns of illness over time.

The British Society for Rheumatology has been capturing significant clinical data from >30000 patients with RA since 2001. Free-text adverse event and comorbidity data (>140000 records) have been recorded and manually coded to the Medical Dictionary for Regulatory Activities (MedDRA); however, manual coding is laborious and introduces the potential for inconsistencies in disease identification over time. Natural language processing (NLP), using text-mining software to automatically "machine-code" adverse event and comorbidity data, offers a new opportunity to better harmonise outcome data over time. Unsupervised machine learning approaches, such as Latent Class Analysis and Topological Data Analysis, can subsequently be applied to these data to look for patterns of disease clustering and their relationship to medication and the underlying arthritis over time.
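As a rough illustration of what "machine-coding" free text could look like at its simplest, the sketch below matches phrases in a record against a small lookup table and returns standardised terms. It is only a minimal sketch under stated assumptions: the lexicon entries, the function names `normalise` and `code_record`, and the example sentences are all hypothetical and are not taken from MedDRA, from the study data, or from any text-mining software the project may use.

```python
import re

# Hypothetical lookup table: the phrases and preferred terms below are
# illustrative only and are not taken from MedDRA or the study data.
TERM_LEXICON = {
    "chest infection": "Lower respiratory tract infection",
    "pneumonia": "Pneumonia",
    "heart attack": "Myocardial infarction",
    "myocardial infarction": "Myocardial infarction",
    "high blood pressure": "Hypertension",
    "hypertension": "Hypertension",
}


def normalise(text):
    """Lower-case the text and collapse punctuation and whitespace to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def code_record(free_text, lexicon=TERM_LEXICON):
    """Return the sorted list of preferred terms whose phrases occur in the text."""
    # Pad with spaces so that only whole-phrase matches are counted.
    padded = f" {normalise(free_text)} "
    found = {term for phrase, term in lexicon.items() if f" {phrase} " in padded}
    return sorted(found)


if __name__ == "__main__":
    print(code_record("Pt admitted with chest infection; known Hypertension."))
    print(code_record("Heart-attack in 2010 (myocardial infarction)"))
```

A lookup of this kind maps spelling and phrasing variants to a single term, which is the harmonisation idea described above, but it ignores negation ("no pneumonia"), misspellings and context. Handling those is where more capable NLP methods would be needed.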

Proposed PhD Plan:

Year 1: Literature reviews on (1) multimorbidity in RA and (2) utility of NLP/ML for discovery of "disease status". Tailoring and application of text-mining software to the free-text data in the study.

Year 2: Clustering based on comorbidity patterns: application of latent class analysis (and other methods) to a cross-sectional snapshot of all accumulated events, relating the resulting clusters back to outcomes (drug exposure, remission status, etc.).

Year 3: Longitudinal analysis of morbidity patterns: identifying disease/adverse event sequences using ML approaches; write up and submit PhD.
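To make the Year 2 clustering step more concrete, the following is a minimal sketch of latent class analysis fitted by expectation-maximisation to binary comorbidity indicators (one row per patient, one column per condition). It is an illustration under assumptions, not the project's method: the function name `fit_latent_classes`, the smoothing constants, the choice of two classes and the toy data are all hypothetical, and real analyses would normally use an established statistical package and formal model selection.

```python
import math
import random


def fit_latent_classes(rows, n_classes=2, n_iter=100, seed=0):
    """Fit a latent class model to binary indicator rows with EM.

    Returns (weights, probs, resp): class weights, per-class item
    probabilities, and per-patient class membership probabilities.
    """
    rng = random.Random(seed)
    n_items = len(rows[0])
    weights = [1.0 / n_classes] * n_classes
    # Random starting item probabilities so that the classes differ.
    probs = [[rng.uniform(0.25, 0.75) for _ in range(n_items)]
             for _ in range(n_classes)]
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each class for each patient.
        resp = []
        for row in rows:
            log_post = []
            for k in range(n_classes):
                lp = math.log(weights[k])
                for x, p in zip(row, probs[k]):
                    lp += math.log(p if x else 1.0 - p)
                log_post.append(lp)
            top = max(log_post)
            unnorm = [math.exp(v - top) for v in log_post]
            total = sum(unnorm)
            resp.append([u / total for u in unnorm])
        # M-step: update weights and item probabilities; the small
        # constants (an assumption) keep every probability inside (0, 1).
        for k in range(n_classes):
            mass = sum(r[k] for r in resp)
            weights[k] = (mass + 1e-6) / (len(rows) + n_classes * 1e-6)
            for j in range(n_items):
                hits = sum(r[k] * row[j] for r, row in zip(resp, rows))
                probs[k][j] = (hits + 0.01) / (mass + 0.02)
    return weights, probs, resp


if __name__ == "__main__":
    # Toy data (hypothetical): two groups with opposite comorbidity profiles.
    rows = [[1, 1, 0, 0]] * 20 + [[0, 0, 1, 1]] * 20
    weights, probs, resp = fit_latent_classes(rows)
    labels = [max(range(2), key=lambda k: r[k]) for r in resp]
    print(labels)
```

The membership probabilities in `resp` are what would be related back to outcomes such as drug exposure or remission status. Class numbering is arbitrary, so only the grouping of patients is meaningful, not the labels themselves.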

This PhD would be attractive to candidates with experience and training in data science, and also to those from an epidemiological or statistical background looking to increase their knowledge and experience in data science. Applicants are expected to hold, or be about to obtain, a minimum upper second class undergraduate degree (or equivalent) in epidemiology, statistics, data science, computing or another related field. A Masters degree in a relevant subject and/or experience in a related subject area/discipline is desirable.

Entry Requirements:
Applicants must have obtained, or be about to obtain, at least an upper second class honours degree (or equivalent) in a relevant subject.

UK applicants interested in this project should make direct contact with the Primary Supervisor to arrange to discuss the project further as soon as possible. International applicants (including EU nationals) must ensure they meet the academic eligibility criteria (including English Language) as outlined before contacting potential supervisors to express an interest in their project. Eligibility can be checked via the University Country Specific information page.

If your country is not listed you must contact the Doctoral Academy Admissions Team providing a detailed CV (to include academic qualifications - stating degree classification(s) and dates awarded) and relevant transcripts.

Following the review of your qualifications and with support from potential supervisor(s), you will be informed whether you can submit a formal online application.

To be considered for this project you MUST submit a formal online application form. Full details on how to apply can be found on the MRC Doctoral Training Partnership (DTP) website.

Person specification


Applicants will be required to address the following.

  • Why do you want to do a PhD?
  • In terms of personality and temperament, why do you believe you are suitable for doing a PhD? Describe any experience that demonstrates your capacity to conduct research.
  • How did you become interested in the ideas you mentioned in your research proposal?
  • Outline the objectives of your research and explain the importance of this research in the context of your current knowledge.
  • From your degree transcript, what were your best and worst units and why?
  • What was your favourite unit and why?
  • What was the most difficult part of your final year project and how did you overcome it?
  • Describe how you have helped others with their learning, either informally or formally, and any service or leadership roles you have had, including extracurricular activities.
  • Describe any community activities that you have been part of, such as hackathons, academic societies, or other extracurricular community activities in which you have participated.
  • How do you see your future after the PhD?