Elsevier

European Journal of Cancer

Volume 118, September 2019, Pages 91-96
Original Research
Deep learning outperformed 11 pathologists in the classification of histopathological melanoma images

https://doi.org/10.1016/j.ejca.2019.06.012
Under a Creative Commons license
open access

Highlights

  • A convolutional neural network (CNN) was trained with 595 histopathologic images of melanoma and nevi.

  • In a direct comparison, the CNN and 11 histopathologists classified a test set of 100 additional histopathologic images (1:1 melanoma/nevi).

  • The CNN systematically outperformed the 11 histopathologists in terms of overall accuracy, sensitivity and specificity (p = 0.016).

Abstract

Background

The diagnosis of most cancers is made by a board-certified pathologist who examines a tissue biopsy under the microscope. Recent research reveals high discordance between individual pathologists. For melanoma, the literature reports 25–26% discordance in classifying a benign nevus versus malignant melanoma. A recent study indicated the potential of deep learning to lower this discordance. However, the performance of deep learning in classifying histopathologic melanoma images has never been compared directly with that of human experts. The aim of this study is to perform the first such direct comparison.

Methods

A total of 695 lesions were classified by an expert histopathologist in accordance with current guidelines (350 nevi/345 melanoma). Only the haematoxylin & eosin (H&E) slides of these lesions were digitised with a slide scanner and then randomly cropped. A total of 595 of the resulting images were used to train a convolutional neural network (CNN). The remaining 100 H&E image sections were used to test the CNN against 11 histopathologists. Three combined McNemar tests comparing the results of the CNN's test runs in terms of sensitivity, specificity and accuracy were predefined to test for significance (p < 0.05).
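For readers unfamiliar with the McNemar test, the following is a minimal sketch of a single exact (binomial) McNemar test on paired classifications of the same images. It is an illustration only: the function name `mcnemar_exact` and the discordant-pair counts are hypothetical, and the sketch does not reproduce how the study combined its three tests.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value.

    b: images rater A classified correctly and rater B incorrectly
    c: images rater B classified correctly and rater A incorrectly
    Concordant pairs (both right or both wrong) do not enter the test.
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    # Under the null hypothesis each discordant pair is a fair coin flip,
    # so the smaller count follows Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical counts, not taken from the study.
p_value = mcnemar_exact(9, 1)
print(f"p = {p_value:.4f}")
```

The exact form is used here because it needs only the standard library and remains valid when the number of discordant pairs is small.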

Findings

The CNN achieved a mean sensitivity/specificity/accuracy of 76%/60%/68% over 11 test runs. In comparison, the 11 pathologists achieved a mean sensitivity/specificity/accuracy of 51.8%/66.5%/59.2%. Thus, the CNN was significantly (p = 0.016) superior in classifying the cropped images.
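As a quick consistency check on these figures: because the test set is balanced 1:1 (taken here to mean 50 melanoma and 50 nevi among the 100 images), overall accuracy is simply the average of sensitivity and specificity. The sketch below, with the hypothetical helper `overall_accuracy`, reproduces that arithmetic from the reported means.

```python
def overall_accuracy(sensitivity: float, specificity: float,
                     n_melanoma: int = 50, n_nevi: int = 50) -> float:
    """Overall accuracy implied by sensitivity and specificity.

    The default class sizes assume the 1:1 split of the 100 test images.
    """
    correct = sensitivity * n_melanoma + specificity * n_nevi
    return correct / (n_melanoma + n_nevi)


cnn = overall_accuracy(0.76, 0.60)            # reported CNN means
pathologists = overall_accuracy(0.518, 0.665)  # reported pathologist means

print(f"CNN: {cnn:.1%}, pathologists: {pathologists:.2%}")
```

The CNN figure works out to 68%, matching the reported accuracy. The pathologist figure works out to about 59.15%, which agrees with the reported 59.2% once rounding of the mean sensitivity and specificity is taken into account.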

Interpretation

With limited image information available, a CNN was able to outperform 11 histopathologists in the classification of histopathological melanoma images and thus shows promise for assisting pathologists in melanoma diagnosis.

Keywords

Melanoma
Pathology
Histopathology
Deep learning
Artificial intelligence
