AI-driven High Content Screening
Overview of HCS and its Applications
High Content Screening (HCS) is used to systematically analyse and extract information from many biological samples, such as cells or tissues, in a high-throughput manner. This process involves using automated microscopy and image analysis to extract quantitative data from microscopy images of samples, enabling the study of cellular phenotypes, functional responses, various biological processes, and the effects of drugs or compounds on cells. HCS is employed in multiple scientific and medical applications, including drug discovery and development, toxicity assessment, disease research, and precision medicine.
Importance of HCS in drug discovery
HCS plays a key role in drug discovery and development by allowing researchers to assess how potential drug candidates affect cells and biological processes. Compounds that promote beneficial cellular pathways or inhibit disease-related pathways can be identified and selected for further testing. A common strategy is to group compounds by mechanism of action (MoA): a compound whose cellular effects resemble those of a compound with a known, desirable MoA may itself be a candidate for further development.
Challenges of manual HCS analysis
Performing HCS analysis manually is prone to human error and inconsistency, and it limits throughput. Researchers who acquire and analyse the data by hand can introduce omissions, errors and interpretation bias, all of which make results less reproducible.
The main challenges associated with manual HCS analysis are presented below.
HCS generates large volumes of complex image data, making data management, storage, and analysis challenging.
Biological samples and imaging conditions can introduce variability, affecting the reproducibility and reliability of results.
Extracting meaningful information from HCS data requires advanced image analysis techniques and can be time-consuming.
As the demand for high-throughput screening increases, there’s a need for scalable and efficient HCS platforms.
Accelerating discovery and development with AI-enabled HCS
High-content screening data can be quickly analysed using AI to accelerate the identification of drug candidates of interest and reduce time from discovery to clinical trials.
Automated Image Analysis: AI algorithms can automate the analysis of HCS images, reducing the burden on researchers and improving accuracy.
Data Interpretation: AI can identify subtle patterns and correlations within HCS data that may be challenging for human analysis, leading to new discoveries.
Quality Control: AI algorithms can flag data quality issues, helping researchers ensure the reliability of results.
Predictive Modeling: AI can build predictive models based on HCS data, aiding in drug discovery, toxicity prediction, and disease modeling.
Speed and Efficiency: AI-powered HCS systems can process data faster, increasing throughput and scalability.
Personalized Medicine: AI can analyze HCS data to identify patient-specific responses, facilitating personalized treatment plans.
Data Integration: AI can integrate HCS data with other omics data (genomics, proteomics) for a more comprehensive understanding of biological systems.
Reduced Cost: By automating tasks and improving efficiency, AI can reduce the overall cost of HCS experiments.
Sonrai Example Scenario:
Data Collection and Preprocessing:
Researchers expose cells to various compounds and capture images of the cells’ responses in different wells of a microplate. Each well corresponds to a specific compound treatment. AI is used to preprocess the images to enhance their quality and consistency, such as normalizing lighting conditions and removing background noise.
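The preprocessing step can be illustrated with a minimal, dependency-free Python sketch. It assumes each image is a small grid of pixel intensities and that the background can be estimated as the median pixel value (reasonable only when cells cover less than half of the field). The function name, the pixel values and the median-based approach are illustrative assumptions, not a description of the Sonrai pipeline.

```python
from statistics import median


def preprocess(image):
    """Subtract an estimated background and rescale intensities to [0, 1].

    `image` is a list of rows of pixel intensities (a hypothetical
    stand-in for one well image). The background level is assumed to be
    the median pixel value.
    """
    pixels = [p for row in image for p in row]
    background = median(pixels)
    # Remove the background level and clip negative values to zero.
    corrected = [[max(p - background, 0.0) for p in row] for row in image]
    peak = max(max(row) for row in corrected)
    if peak == 0:
        return corrected  # blank image: nothing to rescale
    # Rescale so the brightest pixel is 1.0, whatever the exposure was.
    return [[p / peak for p in row] for row in corrected]


# The same made-up scene captured at two different exposure levels.
dim_well = [[10, 10, 10], [10, 30, 10], [10, 10, 50]]
bright_well = [[100, 100, 100], [100, 300, 100], [100, 100, 500]]

print(preprocess(dim_well))
print(preprocess(bright_well))
```

After preprocessing, the two wells yield identical pixel grids, so differences in lighting no longer look like differences in cell response.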
Convolutional Neural Networks (CNNs):
CNNs are employed to analyze the images. These deep learning models are particularly well-suited for image analysis tasks. They consist of multiple layers that automatically learn hierarchical features from the images.
The CNNs process the well images and extract visual features. These features capture important patterns in the images, such as cell morphology or staining intensity. They may also capture patterns that a human observer cannot distinguish but that still differentiate cell populations.
Once the features are extracted, clustering algorithms (e.g., k-means clustering) are applied to group compounds with similar MoA based on the similarity of their extracted visual features.
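As a rough illustration of the clustering step, the sketch below runs a plain k-means on made-up two-dimensional feature vectors using only the Python standard library. The compound names, the feature values, the choice of k = 2 and the farthest-first initialisation are all assumptions for the example; real feature vectors have many more dimensions, and a production workflow would typically rely on an established library implementation.

```python
import math


def farthest_first(vectors, k):
    """Deterministic initialisation: start from the first vector, then
    repeatedly add the vector farthest from the centroids chosen so far."""
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        nxt = max(vectors, key=lambda v: min(math.dist(v, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(vectors, k, max_iterations=100):
    """Plain k-means; returns one cluster index per input vector."""
    centroids = farthest_first(vectors, k)
    labels = None
    for _ in range(max_iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        new_labels = [
            min(range(k), key=lambda c: math.dist(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels


# Hypothetical per-compound feature vectors (values are made up).
features = {
    "compound_A": [0.9, 0.1],
    "compound_B": [1.0, 0.2],
    "compound_C": [0.8, 0.0],
    "compound_D": [0.1, 0.9],
    "compound_E": [0.0, 1.1],
    "compound_F": [0.2, 0.8],
}

names = list(features)
labels = kmeans([features[n] for n in names], k=2)

clusters = {}
for name, label in zip(names, labels):
    clusters.setdefault(label, []).append(name)
print(clusters)
```

Here compounds A to C end up in one cluster and D to F in the other. In practice the number of clusters is not known in advance and has to be chosen or estimated.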
The clustered compounds are then analyzed by researchers. Compounds within the same cluster are likely to have similar MoA, which can guide further experiments or drug development efforts.
How CNNs Work for Feature Extraction:
CNNs consist of layers that progressively extract features from input images:
Convolutional Layers: These layers apply filters (kernels) to the input image to detect patterns like edges, textures, or shapes.
Pooling Layers: Pooling layers reduce the spatial dimensions of the feature maps while retaining essential information.
Fully Connected Layers: These layers combine features learned in previous layers to make predictions.
CNNs learn to recognize and emphasize relevant patterns in the images during training. In the context of HCS, they learn to recognize cellular features indicative of different MoAs.
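The three layer types described above can be made concrete with a toy example in plain Python. This is only a sketch: the kernel and the fully connected weights are hand-picked for illustration, whereas a real CNN learns them during training and stacks many such layers, usually with a deep learning framework. The image is a made-up 5x5 grid with a dark region on the left and a bright region on the right.

```python
def conv2d(image, kernel):
    """Valid (no padding) 2-D cross-correlation, the operation performed
    by a convolutional layer with a single filter."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(fmap):
    """Element-wise non-linearity commonly applied after a convolution."""
    return [[max(0, x) for x in row] for row in fmap]


def max_pool(fmap, size=2):
    """Non-overlapping max pooling; rows or columns that do not fill a
    complete window are dropped."""
    return [
        [
            max(fmap[i + a][j + b] for a in range(size) for b in range(size))
            for j in range(0, len(fmap[0]) - size + 1, size)
        ]
        for i in range(0, len(fmap) - size + 1, size)
    ]


def fully_connected(features, weights, bias):
    """One output per weight row: weighted sum of the features plus a bias."""
    return [
        sum(f * w for f, w in zip(features, row)) + b
        for row, b in zip(weights, bias)
    ]


# Made-up image: dark on the left, bright on the right.
image = [[0, 0, 1, 1, 1] for _ in range(5)]

# Hand-picked kernel that responds to dark-to-bright vertical edges.
edge_kernel = [[-1, 1], [-1, 1]]

conv_out = relu(conv2d(image, edge_kernel))  # 4x4 feature map
pooled = max_pool(conv_out)                  # 2x2 feature map
feature_vector = [x for row in pooled for x in row]

# Hand-picked weights; a trained network would learn these values.
score = fully_connected(feature_vector, [[0.25, 0.25, 0.25, 0.25]], [0.0])

print(pooled)          # [[2, 0], [2, 0]]
print(feature_vector)  # [2, 0, 2, 0]
print(score)           # [1.0]
```

The pooled map is non-zero only where the edge lies, which shows how a filter response survives pooling while the spatial size shrinks. For feature extraction, a vector such as `feature_vector` is what gets passed on to later analysis.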
Automation: CNNs automate the process of feature extraction, making it faster and more consistent.
Accuracy: CNNs can identify subtle differences in cell responses that might be missed by manual analysis.
Scalability: CNNs can process a large volume of well images, enabling high-throughput screening.
Discovery: Clustering compounds based on visual features can reveal new insights and potential drug candidates.
HCS images can be used to group compounds with similar MoA. The CNN condenses the information in each image into a compact numerical representation of its patterns, known as a feature vector. Because these features are numerical vectors, they can be aggregated, processed, visualised with tools such as t-SNE, and used to predict MoA.
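To show what aggregating feature vectors can look like, the following sketch averages per-image vectors into one profile per compound and compares profiles with cosine similarity. The compound names, the three-dimensional vectors, and the choice of mean aggregation and cosine similarity are assumptions made for this example; other aggregation and similarity measures are equally possible.

```python
import math
from collections import defaultdict


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    return [sum(dim) / len(vectors) for dim in zip(*vectors)]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u) * sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical per-image feature vectors, tagged with the compound applied.
image_features = [
    ("compound_A", [1.0, 0.0, 2.0]),
    ("compound_A", [3.0, 0.0, 2.0]),
    ("compound_B", [2.0, 0.0, 2.0]),
    ("compound_C", [0.0, 4.0, 0.0]),
    ("compound_C", [0.0, 2.0, 0.0]),
]

# Group the image-level vectors by compound, then average each group.
by_compound = defaultdict(list)
for compound, vector in image_features:
    by_compound[compound].append(vector)
profiles = {c: mean_vector(vs) for c, vs in by_compound.items()}

print(profiles)
print(cosine_similarity(profiles["compound_A"], profiles["compound_B"]))
print(cosine_similarity(profiles["compound_A"], profiles["compound_C"]))
```

In this toy data, compounds A and B have near-identical profiles (similarity close to 1) while A and C share nothing (similarity 0), which is the kind of signal that clustering and visualisation tools then make visible.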
The proof of concept (PoC) uses a well-known benchmark dataset obtained by applying 113 compounds at 8 different concentrations to populations of breast cancer cell lines in vitro. When the AI-extracted features are visualised in the Sonrai Discovery t-SNE application, they form distinct clusters, which indicate similar MoA.
Using these extracted features, it was possible to train an XGBoost classifier to predict the MoA for unseen inputs. Although this is a simple workflow within Sonrai Discovery, it achieved a performance of 96% on the benchmark, compared with 99% for the state of the art, and outperformed other supervised learning approaches to the benchmark problem.
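The idea of training on labelled feature vectors and scoring predictions on unseen inputs can be sketched as follows. To keep the example dependency-free, a nearest-centroid classifier stands in for XGBoost; it is a much simpler technique and is not the model used in the PoC. The feature values, the placeholder labels `MoA_X` and `MoA_Y`, and the use of plain accuracy as the metric are all assumptions for illustration.

```python
import math


def train_centroids(examples):
    """Fit the stand-in classifier: one mean feature vector per MoA label.

    `examples` is a list of (feature_vector, moa_label) pairs.
    """
    grouped = {}
    for vector, label in examples:
        grouped.setdefault(label, []).append(vector)
    return {
        label: [sum(dim) / len(vectors) for dim in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def predict(centroids, vector):
    """Return the label whose centroid is closest to `vector`."""
    return min(centroids, key=lambda label: math.dist(vector, centroids[label]))


def accuracy(centroids, examples):
    """Fraction of examples whose predicted label matches the true label."""
    correct = sum(predict(centroids, v) == label for v, label in examples)
    return correct / len(examples)


# Made-up training data with placeholder MoA labels.
train = [
    ([0.9, 0.1], "MoA_X"),
    ([1.0, 0.3], "MoA_X"),
    ([0.1, 0.8], "MoA_Y"),
    ([0.3, 1.0], "MoA_Y"),
]

# Held-out examples that were not used for training.
held_out = [
    ([0.8, 0.2], "MoA_X"),
    ([0.2, 0.9], "MoA_Y"),
    ([0.6, 0.5], "MoA_Y"),  # borderline case that the stand-in misclassifies
    ([0.0, 1.0], "MoA_Y"),
]

centroids = train_centroids(train)
print(accuracy(centroids, held_out))  # 0.75
```

Scoring only on held-out examples is what makes a figure like this meaningful: it measures how well the classifier generalises rather than how well it memorises the training data.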
In this PoC, we have shown that by applying AI and CNNs in HCS, researchers can efficiently analyze large datasets of well images, identify compounds with similar MoA, and accelerate drug discovery and development. This approach can lead to the discovery of novel drugs and more effective treatments.