Case Studies | Computer Vision AI

Nuclear segmentation, classification and quantification within HE stained images of colon

PRR.AI is committed to excellence in pathology.

December 12, 2022

Written by PRR.AI

Objective of the challenge

The objective of CONIC grand challenge was to develop algorithms that perform segmentation, classification and counting of 6 different types of nuclei within the current largest known publicly available nuclei-level dataset in computational pathology (CPath), containing around half a million labelled nuclei.

This challenge consists of two separate tasks:

Task 1: Nuclear segmentation and classification:

The first task requires to segment nuclei within the tissue, while also classifying each nucleus into one of the following categories: epithelial, lymphocyte, plasma, eosinophil, neutrophil or connective tissue. 

Task 2: Prediction of cellular composition:

For the second task, predict how many nuclei of each class are present in each input image. 

The output of Task 1 can be directly used to perform Task 2, but these can be treated as independent tasks.

Introduction

Nuclear segmentation, classification, and quantification within H & E-stained histology images enable the extraction of interpretable cell-based features that can be used in downstream models in computational pathology (CPath). To help drive forward research and innovation for automatic nuclei recognition in CPath, The Colon Nuclei Identification and Counting (CoNIC) Challenge requires researchers to develop algorithms that perform segmentation, classification and counting of 6 different types of nuclei within the current largest known publicly available nuclei-level dataset in CPath, containing around half a million labelled nuclei.

Methods: 

Datasets used

Lizard dataset was used for this challenge, which is currently the largest known publicly available dataset for instance segmentation in Computational Pathology. The dataset consists of Hematoxylin and Eosin-stained histology images at 20x objective magnification (~0.5 microns/pixel) from 6 different data sources.

For each image, an instance segmentation and a classification mask are provided. Within the dataset, each nucleus is assigned to one of the following categories:

Epithelial
Lymphocyte
Plasma

Eosinophil
Neutrophil
Connective tissue

Training set and methods 

Lizard Datasets have 238 images of varying size, from where we extracted patches further.

204/238 images are used as the training images and 34/238 images are used as validation images.
We have extracted 244*244 size patches with the overlap of random size between 150 to 200 for each image.
We introduced mirror padding for edges to resize the patch images to 256*256 size.
We have used Reinhard color Normalization at the patch level to state color variation challenge.
We have also added H branch where H refers to Hematoxylin component.

In this case study, we aim to use the above unique characteristic of H&E stain as Hematoxylin-aware guidance for the segmentation network. To achieve this, we apply a color decomposition technique (Ruifrok et al., 2001) to decompose the Hematoxylin Component from the original RGB image. This approach is commonly utilized as a color normalization pre-processing in traditional methods due to its robustness of color inconsistency in the H&E stained WSI.

We assume that the grey level in each RGB channel is linear with light transmission rate T = I/I0, where I0 is the incident light, and I is the transmitted light. So, each specific stain will be characterized by a specific absorption factor c for the light in each of the three RGB channels. Then we can model the relationship between the among of stain and its absorption using Beer-Lambert’s Law (Parson, 2007).

Data augmentation techniques used are affine transformation, rotation, flipping and gaussian Blur. Some background less images are also introduced using the instance map label which contains only nuclei of different classes.

We used StepLR Scheduler to optimize the learning rate. Adam was used as an optimizer. Training was done in single split. Model is trained till 70 epochs after which we observed a constant loss rate.

Model’s Architecture

In our model, we have two encoder channels and five decoder channels, some of which are inter-connected with PDFA block.

We have used Preact-resnet as an encoder instead of resnet as a backbone network as it is updated variant of resent for the classification task.

A.1. Encoder channels:

We have two encoder channels as following:

a. RGB image channel: This channel is used to extract the raw image features.

b. Haematoxylin channel: This channel is used to collect the haematoxylin-aware contour feature.

c. PDFA blocks: This block is defined as in the Triple U-net models created to combine the raw image feature to haematoxylin-aware branch’s features.

A.2. Decoder channels:

a. Nuclei pixels segmentation channel

b. Horizontal and vertical map segmentation channel

c. Nuclei Classification channel

d. Hematoxylin output channel

Post Processing

We are combining nuclei pixel prediction with the horizontal and vertical map result to get the final segmentation result which is similar to Hovernet model post processing. We are discarding the haematoxylin output channel and RGB output channel. These channels are sharing feature extraction information with other channels with the PDFA blocks in training. For the nuclei composition prediction, we have used the nuclei classification branch prediction with segmentation branch and post-processing as given in HOVERNET model, where the nuclei classification prediction branch is used to get the class for the nuclei and nuclei pixel prediction branch is used to get the counts of nuclei classes.

Validation set and methods 

Validation is performed on 34 images of varying size from Lizard dataset. Patches are extracted from those images of size 244*244 with the overlap of random size between 150 to 200 for each image. We introduced mirror padding for edges to resize the patch images to 256*256 size. We have used Reinhard color Normalization.

Evaluation Metrics:

Metrics used is multi-class panoptic quality (`PQ`) to determine the performance of nuclear instance segmentation and classification.  For each type `t`, the `PQ` is defined as:

where `x` denotes a ground truth (GT) instance, `y` denotes a predicted instance, and IoU denotes intersection over union. Setting IoU(`x`,`y`)>0.5 will uniquely match `x` and `y`. This unique matching therefore splits all available instances of type `t` within the dataset into matched pairs (TP), unmatched GT instances (FN) and unmatched predicted instances (FP). Henceforth, we define the multi-class `PQ` (`mPQ`) as the task ranking metric, which takes averages the `PQ` over all classes:

For `mPQ` we calculate the statistics over all images to ensure there are no issues when a particular class is not present in a patch. This is different to `mPQ` calculation used in publications, such as PanNuke, MoNuSAC. This metric is referred to as `mPQ`+ in Conic Challenge.

For the second task, multi-class coefficient of determination to determine the correlation between the predicted and true counts is used. The statistic is calculated for each class independently and then the results are averaged. For each nuclear category `t` the correlation of determination is defined as follows:

Results

We have trained multiple models with different pre-processing, hyperparameters and patch samples. We have used PQ and mPQ metric defined as in the Conic challenge to evaluate the models.

The results for these models are as given: