Integrating Digital Pathology Data to uncover novel insights

Author: Paul O’Reilly, Head of Innovation | Reading time: 7 Minutes


High-throughput scanning technologies have matured, and digital slide scanners capable of whole slide imaging (WSI) have become more widely deployed in clinical and research labs. Digital Pathology (DP) is becoming a mainstream tool in the practice of pathology. It is in the process of being approved for use in primary diagnosis, more digital slides are being created routinely each year, and numerous solutions are now clinically approved for the acquisition and management of WSIs.


While the benefits of digital pathology workflow efficiencies are quantifiable, the digitisation of glass slides also opens up the opportunity to apply computational pathology and AI technologies to interpret the pathology information within the resulting high-resolution digital images. 


Understandably, the focus has been on developing Artificial Intelligence (AI)-assisted technologies for diagnostic purposes, since this potentially provides the most significant immediate benefit to clinicians and the greatest impact on patient care. However, using AI-enabled pathology in a research context also promises benefits, particularly in terms of automation, quantification and reproducibility of results.

AI in Computational Pathology

AI is currently being successfully used for a number of tasks in computational pathology, and applications based on AI have been validated to assist clinicians in a number of fundamental use cases.

Tumour Identification & Grading

Tumour identification, quantification and grading in H&E images have attracted considerable research attention, and some algorithms have received regulatory approval for clinical use. These algorithms are used in diverse ways, from quality assurance to diagnostic use cases.

Biomarker quantification

There are a number of well-established approaches for quantification and scoring of pathology images in research and clinical use. These mainly apply to the quantification of staining in immunohistochemical (IHC) images using computer algorithms for image analysis.


These algorithms are typically used to stratify cohorts on the basis of protein expression. In the clinic, scores for markers such as ER, PR, HER2, Ki67 and PD-L1 are increasingly used to predict response to treatment or as prognostic indicators. The inherent variability in pathologist scoring of markers has motivated the use of algorithms (previously Image Analysis algorithms, but increasingly AI-based algorithms) to provide consistent and accurate scores.


Outside the clinic, AI-based scoring has been used as part of the early discovery process across multiple markers and is used to help validate the utility of a multitude of predictive and prognostic biomarkers.

Molecular Status

Equally important but less developed is the use of AI to determine molecular status, including MSI and other molecular indications. Publications to date are purely in the research domain but have shown clinical-grade performance.


The use of AI-based biomarker quantification in pathology images is now well-established and widely used in tissue-based research. There are multiple commercial and open-source tools that can be used by researchers to replace or augment pathologist scores.


However, the majority of AI algorithms have tended to replicate pathologist-based scoring approaches exactly. This is a natural first step and has been needed to establish AI and gain acceptance within the research community. Reducing image data to such scores has allowed it to be used alongside similarly resolved data from other modalities in an integrated manner for research and discovery (as seen below), and to be managed and shared with such data outside Digital Pathology 'silos'.


Following this approach, though, potentially limits the application of AI to information identifiable to the human eye, and ignores the ability of AI algorithms to use information and patterns in images that a pathologist may disregard or fail to pick up. A number of more sophisticated approaches have become possible through the use of AI, and recently published work has shown that these can yield robust and novel insights into biological and biomedical data sets.


The success in determining the molecular status of cases from H&E slides, rather than IHC (as often required by pathologists) shows that there is information in WSIs that can be used algorithmically to identify cohorts, stratify patients and predict outcomes.

Features derived from AI outputs

One approach which has shown utility is the use of AI models to classify and identify tissue areas or cells. This has been used to identify CD8-positive lymphocytes, and hence to quantify their extent of infiltration within the tumour. This infiltration quantification was integrated with transcriptomic data to identify novel signatures with prognostic value.


In another recent study, so-called Human Interpretable Features, derived from tissue classifications from AI models, were shown to correlate with and be predictive of markers of the tumour microenvironment, and molecular signatures.

Integrating Image Features directly with other data

Recently, there has been increased interest in using AI-derived image features directly, as contrasted with applying post-processing to create features used in integrative analysis. Although in its early stages, such an approach potentially allows hypothesis-free mining of datasets which include pathology images and other data modalities. Using standard architectures and even pre-trained weights, these Deep Feature Extractors can be used to interrogate images and generate hypotheses linking clinical and other data to the tissue images.

Simple Example of Image Feature Integration

To illustrate the utility of raw image features, we have taken a set of slides from The Cancer Genome Atlas (TCGA) dataset, specifically colorectal cancer cases. A Region of Interest on each slide, corresponding to the area of tumour, was identified, and a set of 224x224-pixel patches (at 20x magnification) was extracted.
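As a rough sketch of the patch extraction step, the Python below tiles a tumour mask into non-overlapping 224x224 patches. The function name patch_grid, the boolean mask representation of the Region of Interest and the min_tumour_fraction threshold are all assumptions made for illustration; the original does not describe how patches were selected within the region.

```python
import numpy as np

PATCH = 224  # patch edge in pixels, as quoted in the text


def patch_grid(roi_mask, min_tumour_fraction=0.5):
    """Return (row, col) top-left corners of non-overlapping patches in the ROI.

    roi_mask is a 2-D boolean array at the working magnification; True marks
    tumour. The threshold is an illustrative assumption, not a source value.
    """
    corners = []
    rows, cols = roi_mask.shape
    for r in range(0, rows - PATCH + 1, PATCH):
        for c in range(0, cols - PATCH + 1, PATCH):
            tile = roi_mask[r:r + PATCH, c:c + PATCH]
            # Keep the tile only if enough of it lies inside the tumour area.
            if tile.mean() >= min_tumour_fraction:
                corners.append((r, c))
    return corners


# Toy mask of 3 x 4 tiles with tumour covering the left half.
mask = np.zeros((3 * PATCH, 4 * PATCH), dtype=bool)
mask[:, : 2 * PATCH] = True
corners = patch_grid(mask)
```

In practice the corner coordinates would be passed to a slide-reading library to pull the pixel data for each patch; that step is left out here.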


A deep Convolutional Neural Network with the EfficientNetB7 architecture, pre-trained on the ImageNet data set, was used to extract a 2560-dimensional vector of features for each patch, and the feature vectors were aggregated on a slide basis using simple median aggregation.
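The median aggregation step can be illustrated with a few lines of NumPy. This is a minimal sketch: the per-patch features here are random numbers standing in for the network's output, the feature extractor itself is not shown, and the names aggregate_slide, slide_A and slide_B are hypothetical.

```python
import numpy as np

FEATURE_DIM = 2560  # length of the per-patch feature vector quoted in the text


def aggregate_slide(patch_features):
    """Collapse an (n_patches, n_features) matrix into one slide-level vector."""
    patch_features = np.asarray(patch_features, dtype=float)
    if patch_features.ndim != 2 or patch_features.shape[0] == 0:
        raise ValueError("expected a non-empty (n_patches, n_features) array")
    # Element-wise median across patches: one value per feature.
    return np.median(patch_features, axis=0)


# Placeholder features; slides may contribute different numbers of patches.
rng = np.random.default_rng(0)
slides = {
    "slide_A": rng.normal(size=(50, FEATURE_DIM)),
    "slide_B": rng.normal(size=(8, FEATURE_DIM)),
}
slide_vectors = {name: aggregate_slide(f) for name, f in slides.items()}
```

The median gives every slide a vector of the same length regardless of how many patches it contains, and is less sensitive than the mean to a handful of atypical patches.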

These slide-level feature vectors were imported into Sonraí's Indra Engine alongside clinical information, as downloaded from TCGA, with the aim of performing a simple analysis integrating the image features with the clinical data.

In this example, using Indra's apps we can find the feature or features most significantly correlated with prognosis for Stage I & II cases. To test this, Indra provides a Survival Plot app, which shows that grouping based on the top feature is prognostic.
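How Indra's apps rank features is not described here, so the following is only one plausible way to reproduce the idea outside the platform: split the slides at the median of each feature, compare the two groups with a log-rank test, and sort features by p-value. The function names, the median split and the synthetic cohort (in which feature 3 is made to drive survival) are all assumptions for illustration, and no stage filtering is applied.

```python
import math

import numpy as np


def logrank_p(time, event, group):
    """Two-group log-rank test; returns the chi-square (1 df) p-value."""
    time, event, group = (np.asarray(a) for a in (time, event, group))
    observed_minus_expected = 0.0
    variance = 0.0
    for t in np.unique(time[event == 1]):
        at_risk = time >= t
        n = at_risk.sum()                       # total at risk just before t
        n1 = (at_risk & (group == 1)).sum()     # at risk in group 1
        died = (time == t) & (event == 1)
        d = died.sum()                          # events at t, both groups
        d1 = (died & (group == 1)).sum()        # events at t in group 1
        observed_minus_expected += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 1.0
    chi2 = observed_minus_expected ** 2 / variance
    # Survival function of a chi-square variable with one degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


def rank_features(features, time, event):
    """Split slides at each feature's median; sort features by log-rank p-value."""
    high = (features > np.median(features, axis=0)).astype(int)
    p_values = np.array(
        [logrank_p(time, event, high[:, j]) for j in range(features.shape[1])]
    )
    return np.argsort(p_values), p_values


# Synthetic cohort: 120 slides, 20 features, no censoring.
rng = np.random.default_rng(1)
features = rng.normal(size=(120, 20))
hazard = np.exp(2.0 * features[:, 3])           # feature 3 shortens survival
time = rng.exponential(1.0 / hazard)
event = np.ones(120, dtype=int)
order, p_values = rank_features(features, time, event)
```

With thousands of features, the p-values from a scan like this would need correction for multiple testing before any single feature is treated as a finding.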

Consider the slides with the lowest and highest aggregate values of this feature, and the patches in those slides with extreme values, as shown below.

The highest-scoring patches on the highest-scoring slide (i.e. with poor prognosis) consist of areas of poorly differentiated, high-grade tumour, as would be identified by a pathologist. Thus, the image features extracted by the Deep Neural Network can be seen to correlate with a recognised tumour phenotype, the pathological grade, and appear to have a similar prognostic ability.
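Picking out the slides and patches with extreme values is a simple sorting exercise once the per-patch features are kept alongside the slide-level aggregates. The sketch below assumes a dictionary of per-patch feature matrices keyed by slide name; the names, the feature index and the random data are hypothetical.

```python
import numpy as np


def extreme_patches(patch_features, feature_index, k=3):
    """Return indices of the k lowest and k highest patches for one feature."""
    order = np.argsort(patch_features[:, feature_index])
    return order[:k], order[-k:][::-1]  # lowest first, then highest first


# Placeholder data: slide_B is shifted so that it scores higher overall.
rng = np.random.default_rng(2)
slides = {
    "slide_A": rng.normal(loc=0.0, size=(30, 8)),
    "slide_B": rng.normal(loc=3.0, size=(30, 8)),
}
top_feature = 5  # index of the feature selected by the survival analysis

# Slide-level score is the median of the feature across the slide's patches.
slide_scores = {
    name: float(np.median(f[:, top_feature])) for name, f in slides.items()
}
highest_slide = max(slide_scores, key=slide_scores.get)
lowest_patches, highest_patches = extreme_patches(slides[highest_slide], top_feature)
```

The returned indices point back to patch coordinates, so the corresponding image regions can be retrieved and reviewed by a pathologist.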



Tumour Identification & Grading using AI


G. Campanella et al., “Clinical-grade computational pathology using weakly supervised deep learning on whole slide images,” Nat. Med.
B. Ehteshami Bejnordi et al., “Diagnostic Assessment of Deep Learning Algorithms for Detection of Lymph Node Metastases in Women With Breast Cancer,” JAMA
L. Pantanowitz et al., “An artificial intelligence algorithm for prostate cancer diagnosis in whole slide images of core needle biopsies: a blinded clinical validation and deployment study,” Lancet Digit. Health
R. Colling et al., “Artificial intelligence in digital pathology: a roadmap to routine use in clinical practice,” J. Pathol.


Determining Molecular Status from H&E slide images


J. N. Kather et al., “Deep learning can predict microsatellite instability directly from histology in gastrointestinal cancer,” Nat. Med.
A. Echle et al., “Clinical-Grade Detection of Microsatellite Instability in Colorectal Tumors by Deep Learning,” Gastroenterology
J. N. Kather et al., “Pan-cancer image-based detection of clinically actionable genetic alterations,” Nat. Cancer
P. Tsou and C.-J. Wu, “Mapping Driver Mutations to Histopathological Subtypes in Papillary Thyroid Carcinoma: Applying a Deep Convolutional Neural Network,” J. Clin. Med.

A. Echle, N. T. Rindtorff, T. J. Brinker, T. Luedde, A. T. Pearson, and J. N. Kather, “Deep learning in cancer pathology: a new generation of clinical biomarkers,” Br. J. Cancer


Using AI outputs to derive pathology features


M. Desbois et al., “Integrated digital pathology and transcriptome analysis identifies molecular mediators of T-cell exclusion in ovarian cancer,” Nat. Commun.

J. A. Diao et al., “Human-interpretable image features derived from densely mapped cancer pathology slides predict diverse molecular phenotypes,” Nat. Commun.


Using Image Features Directly


C. Abbet, I. Zlobec, B. Bozorgtabar, and J. P. Thiran, “Divide-and-Rule: Self-Supervised Learning for Survival Analysis in Colorectal Cancer,” in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

D. Bychkov et al., “Deep learning based tissue analysis predicts outcome in colorectal cancer,” Sci. Rep.

L. Lu and B. J. Daigle, “Prognostic analysis of histopathological images using pre-trained convolutional neural networks: Application to hepatocellular carcinoma,” PeerJ

H. Zeng, L. Chen, Y. Huang, Y. Luo, and X. Ma, “Integrative Models of Histopathological Image Features and Omics Data Predict Survival in Head and Neck Squamous Cell Carcinoma,” Front. Cell Dev. Biol.


Image Analysis Tools


HALO | Indica Labs
