Modulation of Medical Condition Likelihood by Patient History Similarity
Abstract
Background: An ambition of modern medical care is ‘personalised medicine’ – tailoring care regimens and treatment pathways according to the particular genotype of the patient and the subtype of the condition being treated. However, initial diagnosis can be less personalised, with decision support systems based on general rules, symptoms and the basic demographics of the patient presenting for care. We describe a system in which the general probabilities of conditions for a patient are modulated by comparing the patient’s longitudinal clinical history, including seemingly unrelated historical conditions, with the histories of similar patients who may or may not have gone on to develop other conditions.
Objective: Our objective is to determine whether the probability of a particular condition existing in an individual patient can be modulated by examining that patient’s longitudinal clinical history and comparing it with the histories of other patients who have had a similar history.
Methods: We took clinical event codes and dates from anonymised longitudinal clinical records for 26,000 patients, extracted from several primary care sources, and merged them into a single data set of 749,000 recorded events. In the UK, the majority of primary care records are coded using Read Codes, a hierarchical coding system developed in the 1980s. Where necessary, records using other coding systems (e.g. ICD-9/10, SNOMED CT) were translated into Read Codes; however, in principle, any standard coding system could be used as the target system.
Once the data set was established, we reserved a proportion of the records as a test set and, for each patient in this test set, held back their most recent conditions. These truncated test records were compared with records in the main record set to find the patients with the most similar histories. Information on the event histories of these ‘similar’ patients was then used to predict an increase or decrease in the probabilities of particular conditions for our test patients, using techniques adapted from recommender systems.
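The comparison step can be illustrated with a minimal sketch of neighbourhood-based collaborative filtering. This is not the authors' implementation: the Jaccard similarity measure, the value of `k`, the function names and the single-letter codes are all assumptions made for illustration, and the test history is assumed to have already had its most recent conditions held back.

```python
from typing import Dict, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity between two sets of clinical codes."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def modulated_probability(
    test_history: Set[str],
    reference: Dict[str, Set[str]],
    condition: str,
    k: int = 3,
) -> Tuple[float, float]:
    """Return (baseline, modulated) probability of `condition`.

    baseline:  share of all reference patients who have the condition.
    modulated: similarity-weighted share among the k most similar patients.
    """
    baseline = sum(condition in codes for codes in reference.values()) / len(reference)

    # Compare histories on everything except the condition being predicted.
    scored = [
        (jaccard(test_history, codes - {condition}), condition in codes)
        for codes in reference.values()
    ]
    top = sorted(scored, key=lambda s: s[0], reverse=True)[:k]
    total = sum(sim for sim, _ in top)
    if total == 0:
        return baseline, baseline  # no similar patients: leave the baseline unchanged
    modulated = sum(sim for sim, has_condition in top if has_condition) / total
    return baseline, modulated


# Hypothetical placeholder codes, not real Read Codes.
reference = {
    "p1": {"A", "B", "X"},
    "p2": {"A", "B", "C", "X"},
    "p3": {"A", "C"},
    "p4": {"D"},
    "p5": {"D", "E"},
    "p6": {"B", "D"},
}
baseline, modulated = modulated_probability({"A", "B"}, reference, "X", k=3)
print(f"baseline={baseline:.2f} modulated={modulated:.2f}")
```

In this toy data the patients most similar to the test patient mostly have condition `X`, so the modulated estimate rises above the population baseline; with dissimilar neighbours it would fall below it.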
Results: Using Read Codes to represent a patient’s sequence of diagnoses, we were able to generate association rules for particular combinations of conditions by collaborative filtering. Confidences in the results were improved by reducing the granularity of the coding, at the cost of less specific predictions.
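The trade-off between coding granularity and rule strength can be sketched as follows. The sketch assumes fixed-width hierarchical codes in which each successive character adds detail and unused positions are padded with `.`; the codes shown are made-up placeholders, and `truncate` and `rule_stats` are hypothetical helpers rather than the method used in the study.

```python
from collections import Counter
from itertools import permutations
from typing import Dict, List, Set, Tuple


def truncate(code: str, level: int, width: int = 5) -> str:
    """Keep the first `level` characters of a hierarchical code, padding with '.'."""
    return code[:level].ljust(width, ".")


def rule_stats(histories: List[Set[str]]) -> Dict[Tuple[str, str], Tuple[int, float]]:
    """For each ordered pair (a, b), return (support count, confidence of a -> b)."""
    single = Counter(code for h in histories for code in h)
    pair = Counter(p for h in histories for p in permutations(sorted(h), 2))
    return {(a, b): (n, n / single[a]) for (a, b), n in pair.items()}


# Made-up placeholder codes, one set of codes per patient.
raw = [
    {"AB1..", "XY1.."},
    {"AB2..", "XY2.."},
    {"AB1..", "XY2.."},
    {"AB2.."},
]

fine = rule_stats(raw)
coarse = rule_stats([{truncate(c, 2) for c in h} for h in raw])

print(fine[("AB1..", "XY1..")])    # rule at full granularity
print(coarse[("AB...", "XY...")])  # same rule family after truncation
```

Truncating the codes pools several specific rules into one broader rule, so each rule rests on more patients, but the prediction can then only name the broader category.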
Conclusions: We encountered challenges in merging records from different data sets, particularly records that used different coding systems. However, we were able to demonstrate some associations between diagnosis codes in longitudinal clinical histories. Our results suggest that this technique has the potential to refine the probabilities of individual patients having particular conditions, and to let individuals assess how lifestyle choices could affect their future health outcomes.

This work is licensed under a Creative Commons Attribution 3.0 License.