Precision Medicine – The Future of Healthcare

The Future of Precision Medicine

Author: Dr Matthew Alderdice, Head of Data Science

What is precision medicine?

Precision Medicine (PM) describes the process of combining clinical and molecular data to identify subgroups of patients associated with better responses to treatments and clinical interventions. These clinically relevant subgroups of patients are identified from big data using advanced analytics such as Machine Learning (ML) and Artificial Intelligence (AI). The subgroups are defined by the presence of distinguishing molecular profiles. The Precision Medicine Initiative was launched by United States President Barack Obama in 2015 with the aim of performing deep phenotyping on over one million volunteers. The findings of the initiative will inform clinical decision-making, identify new targeted therapies and ultimately improve patient care. 

What is a biomarker?

Biomarkers are one of the core concepts of precision medicine. The FDA describes a biomarker as “a defined characteristic that is measured as an indicator of normal biological processes, pathogenic processes, or response to an exposure or intervention, including therapeutic interventions”. In other words, it is a very broad term and may refer to a single quantitative measurement such as blood glucose level or a multitude of clinical and molecular data measurements with complex interdependencies. Applications of biomarkers include diagnosis of patients with disease, staging of disease, prognosis, and prediction and monitoring of treatment response.

Biomarker Discovery

Advances in molecular techniques in the 1990s coincided with the start of the Human Genome Project, which paved the way for a myriad of discoveries linking mutational status with disease and response to clinical intervention. Perhaps one of the most significant early discoveries, led by Dr Mary-Claire King, demonstrated the link between mutations in the BRCA1/2 genes and early-onset familial breast cancer. More recently, cancer treatment has been transformed by the development of targeted treatments alongside companion diagnostics for their molecular targets, such as imatinib and BCR-ABL, trastuzumab and HER2, and vemurafenib and BRAF V600. All of these advances were only possible because of the ability to perform deep phenotyping experiments, which generate large multi-omic data lakes in the process.

The Multiomics Revolution

At the beginning of the century, it cost approximately $100 million to sequence a single human genome. Since 2008, the cost of DNA sequencing and other profiling technologies has declined rapidly, to the point where we can now routinely generate multi-omic molecular profiles (e.g. genomics, transcriptomics and proteomics) for every patient. It is no coincidence that the emergence of precision medicine has tracked our ability to perform large-scale molecular profiling. As the data deluge continues, it becomes more challenging to generate insights, and we must apply advanced analytics to identify the most clinically and biologically relevant biomarkers.

The Curse of Dimensionality

The curse of dimensionality is a term often used to describe the situation where the number of features (e.g. genes) greatly outweighs the number of observations (e.g. patients). This is a very prominent characteristic of multi-omic data, and it makes finding the next ground-breaking biomarker, or any meaningful pattern, with simple statistics like looking for a needle in a haystack. There is a rich ecosystem of bioinformatics tools that are routinely used for exploratory data analysis in precision medicine to ease the discovery of novel biomarkers. We will explore a number of these key tools. Differential expression analysis and feature selection are core techniques used in the generation of molecular assays known as gene signatures.
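To make the problem of features outnumbering observations concrete, here is a minimal sketch using only the Python standard library. Everything in it is an illustrative assumption: the cohort size (20 patients), the number of genes (5,000), the random "expression" values and the two cutoffs are invented, and the data contain no real signal. The cutoff of 2.1 is roughly the conventional 5% threshold for groups of this size, while 6.0 stands in for the kind of much stricter threshold that a multiple-testing correction would demand; it is not a computed quantile.

```python
import random
import statistics


def welch_t(group_a, group_b):
    """Welch's t-statistic for two independent samples."""
    mean_diff = statistics.mean(group_a) - statistics.mean(group_b)
    standard_error = (
        statistics.variance(group_a) / len(group_a)
        + statistics.variance(group_b) / len(group_b)
    ) ** 0.5
    return mean_diff / standard_error


def noise_t_statistics(n_patients=20, n_genes=5000, seed=0):
    """Return |t| for each simulated gene; the data are pure noise by construction."""
    rng = random.Random(seed)
    half = n_patients // 2
    abs_t = []
    for _ in range(n_genes):
        # Hypothetical expression values for one gene across all patients.
        expression = [rng.gauss(0.0, 1.0) for _ in range(n_patients)]
        # First half are arbitrarily labelled "responders", the rest "non-responders".
        abs_t.append(abs(welch_t(expression[:half], expression[half:])))
    return abs_t


abs_t = noise_t_statistics()
loose_hits = sum(t > 2.1 for t in abs_t)   # naive per-gene cutoff
strict_hits = sum(t > 6.0 for t in abs_t)  # much stricter, illustrative cutoff

print(f"genes passing |t| > 2.1: {loose_hits} of {len(abs_t)}")
print(f"genes passing |t| > 6.0: {strict_hits} of {len(abs_t)}")
```

Because every gene is tested separately, a few hundred noise genes can be expected to pass the naive cutoff by chance alone, which is why differential expression tools pair each test with a correction for multiple comparisons.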

Gene Signatures

Gene signatures are lists of genes with unique expression patterns that characterise a clinical or biological phenotype. Gene signatures can provide insights into the biology that underpins disease using techniques such as pathway analysis. Most gene signatures are not clinically viable. However, there are a number of noteworthy commercially available gene signatures (e.g. MammaPrint, DDRD, Oncotype DX and Prosigna) that are used for prognostication and prediction. Prosigna, previously known as the PAM50 gene signature, assigns patients with early-stage breast cancer a prognostic score which is indicative of cancer recurrence. Gene signatures such as Prosigna are the culmination of years of exploratory data analysis, development and validation. The breast cancer subtypes that underpin the PAM50 gene signature were first published in Nature in 2000, where the authors used a technique called hierarchical clustering to identify four ‘molecular subtypes’ associated with complex biological pathways. Molecular subtyping is a term that has become synonymous with precision medicine, particularly in oncology, but what does it mean and how is it impacting health care?
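The idea behind hierarchical clustering can be sketched in a few lines. The example below is a generic average-linkage agglomerative clustering written with the Python standard library; it is not the analysis from the original publication, and the six "tumours", three "genes" and their expression values are made up for illustration.

```python
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two expression profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def average_linkage(cluster_a, cluster_b, samples):
    """Mean pairwise distance between the members of two clusters."""
    distances = [
        euclidean(samples[i], samples[j]) for i in cluster_a for j in cluster_b
    ]
    return sum(distances) / len(distances)


def agglomerative_clusters(samples, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    clusters = [[i] for i in range(len(samples))]
    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest average distance.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_linkage(
                clusters[pair[0]], clusters[pair[1]], samples
            ),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so the merged cluster at index a is unaffected
    return [sorted(cluster) for cluster in clusters]


# Hypothetical expression matrix: rows are tumours, columns are genes.
toy_expression = [
    [5.0, 5.2, 0.1],
    [4.8, 5.1, 0.3],
    [5.1, 4.9, 0.2],
    [0.2, 0.1, 4.9],
    [0.3, 0.2, 5.1],
    [0.1, 0.3, 5.0],
]

subtypes = agglomerative_clusters(toy_expression, n_clusters=2)
print(subtypes)
```

In practice this is usually done with an established library on thousands of genes, and the choice of distance metric and linkage method can change which subtypes emerge.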

Molecular Subtyping

Molecular subtyping refers to the use of multi-omics data to find clusters of samples that have shared biological traits. Molecular subtypes are often discovered using unsupervised machine learning techniques such as clustering, together with dimensionality reduction techniques (e.g. t-SNE), which are an incredibly powerful approach for cluster visualization. Breast cancer is undoubtedly the cancer type which has seen the most research in this area. However, there is now substantial evidence to suggest that molecular subtypes exist across most, if not all, tumour types. The Consensus Molecular Subtypes of Colorectal Cancer (CMS) were established more recently, in 2015, by Guinney et al., bringing together the findings of six independent subtyping studies. Using Markov Clustering on thousands of Colorectal Cancer (CRC) transcriptome profiles, they identified four molecular subtypes. The subtypes have clear molecular associations defined not only by gene expression but also by mutational, methylation, miRNA and histopathology profiles. The paper has been cited over 2000 times since 2015 and is now helping shape future clinical stratification strategies. The original CMS classifier was published as a Random Forest Classifier. Random Forests are a classical machine learning algorithm and form part of the wider Artificial Intelligence and ML ecosystem. ML techniques have been widely used by the data science and bioinformatics community in precision medicine for some time. However, over the last decade AI has seen a resurgence.
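Once subtypes have been defined, a classifier is needed to assign new samples to them. The sketch below shows that step with a nearest-centroid rule, chosen because it fits in a few lines of standard-library Python. It is deliberately not the Random Forest used for CMS, and the subtype names, genes and expression values are all hypothetical.

```python
def centroid(rows):
    """Per-gene mean across a list of expression profiles."""
    return [sum(column) / len(rows) for column in zip(*rows)]


def fit_centroids(profiles, labels):
    """Compute one centroid per subtype from labelled training profiles."""
    grouped = {}
    for profile, label in zip(profiles, labels):
        grouped.setdefault(label, []).append(profile)
    return {label: centroid(rows) for label, rows in grouped.items()}


def squared_distance(a, b):
    """Squared Euclidean distance between two expression profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign_subtype(profile, centroids):
    """Return the subtype whose centroid is closest to the profile."""
    return min(
        centroids, key=lambda label: squared_distance(profile, centroids[label])
    )


# Hypothetical training data: rows are samples, columns are genes.
training_profiles = [
    [5.0, 0.2, 1.0],
    [4.6, 0.4, 1.2],
    [0.3, 4.8, 1.1],
    [0.5, 5.2, 0.9],
]
training_labels = ["subtype_A", "subtype_A", "subtype_B", "subtype_B"]

centroids = fit_centroids(training_profiles, training_labels)
new_sample = [0.4, 4.5, 1.0]
predicted = assign_subtype(new_sample, centroids)
print(predicted)
```

A Random Forest replaces the single distance rule with many decision trees voting on the label, which copes better with noisy, high-dimensional data, but the workflow of training on labelled profiles and then predicting a subtype for a new sample is the same.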

AI in precision medicine

The renewed excitement around the application of AI in precision medicine is hard to ignore, but what is AI and how is it being routinely used in precision medicine? The term AI, or ‘narrow AI’, is used to describe a collection of machine learning and deep learning algorithms that are trained on vast amounts of data and that mimic aspects of human decision-making and behaviour. The precision medicine sector has been slower than others in its adoption of these algorithms. However, a recent review by Benjamens et al. showed that the number of publications using AI/ML in the field of life sciences rose from 596 in 2010 to 12,422 in 2019. Similarly, the number of clinical trials published on NCBI involving AI has grown dramatically over the last decade (see figure). The same review also showed that medical imaging is the area where AI has had the biggest impact in precision medicine, with 46% of the 29 official FDA-approved algorithms being applied to radiology data modalities such as CT, MRI and mammograms. The success of AI in medical imaging is partly due to the development of deep learning frameworks such as TensorFlow and PyTorch. These frameworks enable data scientists to train algorithms known as convolutional neural networks (CNNs), which are highly specialised for classifying images.

 

Over the next decade, we should expect to see the same pattern of adoption emerge in digital pathology, where AI applied to large gigapixel histopathology images such as H&E-stained slides will be used routinely in clinical decision-making. Large consortiums such as PathLAKE, which provide researchers with access to huge data lakes and computing infrastructure, are accelerating the adoption of AI in digital pathology. The next big step will be applying AI to molecular data. However, there needs to be a shift in our data management practices before this can be realised.

The Trajectory of Precision Medicine

The adoption of data best practices is one of the most important challenges to address if the true value of precision medicine is to be delivered. Multi-omic data is inherently diverse: there are many file types, bioinformatics pipelines and preferred analysis methods. Adopting a data-driven mentality will be key if AI is to be successfully applied to multi-omic data. The power of precision medicine is inextricably linked to data and our ability to store and analyse it. Organisations that undertake a digital transformation now are going to be the ones that make the most impact.
