A team of European and US scientists showed an AI program more than 100,000 images of malignant melanomas and harmless moles.
While dermatologists detected 86.6 per cent of melanomas, the AI detected a whopping 95 per cent.
The deep learning convolutional neural network (CNN) was up against 58 dermatologists in the study. Researchers say the AI missed fewer melanomas and was less likely to misdiagnose benign moles as malignant.
“The CNN works like the brain of a child,” Professor Holger Haenssle, senior managing physician at the Department of Dermatology, University of Heidelberg, Germany, explained.
Professor Haenssle says this means it needs training – which is where the 100,000 images (each magnified 10 times) came in. As they were shown to the AI, it learned.
“After finishing the training, we created two test sets of images from the Heidelberg library that had never been used for training and therefore were unknown to the CNN,” Professor Haenssle says.
“One set of 300 images was built to solely test the performance of the CNN. Before doing so, 100 of the most difficult lesions were selected to test real dermatologists in comparison to the results of the CNN.”
Fifty-eight dermatologists from 17 countries took part in the study: 17 had less than two years' experience, 11 had two to five years' and 30 had more than five years'.
Oxford University Press explains how it went down:
The dermatologists were asked to first make a diagnosis of malignant melanoma or benign mole just from the dermoscopic images (level I) and make a decision about how to manage the condition (surgery, short-term follow-up, or no action needed). Then, four weeks later they were given clinical information about the patient (including age, sex and position of the lesion) and close-up images of the same 100 cases (level II) and asked for diagnoses and management decisions again.
In level I, the dermatologists accurately detected an average of 86.6% of melanomas, and correctly identified an average of 71.3% of lesions that were not malignant. However, when the CNN was tuned to the same level as the physicians to correctly identify benign moles (71.3%), the CNN detected 95% of melanomas. At level II, the dermatologists improved their performance, accurately diagnosing 88.9% of malignant melanomas and 75.7% that were not cancer.
“The CNN missed fewer melanomas, meaning it had a higher sensitivity than the dermatologists, and it misdiagnosed fewer benign moles as malignant melanoma, which means it had a higher specificity; this would result in less unnecessary surgery,” said Professor Haenssle.
“When dermatologists received more clinical information and images at level II, their diagnostic performance improved. However, the CNN, which was still working solely from the dermoscopic images with no additional clinical information, continued to out-perform the physicians’ diagnostic abilities.”
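To make the comparison concrete: sensitivity is the share of melanomas correctly flagged, and specificity is the share of benign moles correctly cleared. "Tuning" a classifier to the physicians' specificity can be pictured as picking the score cut-off at which the classifier clears the same share of benign moles, then reading off how many melanomas it catches at that cut-off. The Python sketch below illustrates that idea only; it is not the study's code. The function names, the toy labels and scores, and the 0.6 target are all hypothetical, chosen purely for illustration.

```python
import math

def sensitivity_specificity(labels, scores, threshold):
    """Return (sensitivity, specificity) at a given score cut-off.

    labels: 1 = melanoma, 0 = benign mole.
    scores: classifier's melanoma score for each image (higher = more suspicious).
    Assumes both classes are present, otherwise a division by zero occurs.
    """
    tp = fn = tn = fp = 0
    for label, score in zip(labels, scores):
        flagged = score >= threshold  # image is called "melanoma"
        if label == 1:
            tp += flagged
            fn += not flagged
        else:
            fp += flagged
            tn += not flagged
    return tp / (tp + fn), tn / (tn + fp)

def tune_to_specificity(labels, scores, target_specificity):
    """Find the lowest cut-off whose specificity reaches the target.

    Specificity never falls as the cut-off rises, so the lowest qualifying
    cut-off is also the one that keeps sensitivity as high as possible.
    """
    # math.inf flags nothing, so specificity is 1.0 and the loop always returns.
    for threshold in sorted(set(scores)) + [math.inf]:
        sens, spec = sensitivity_specificity(labels, scores, threshold)
        if spec >= target_specificity:
            return threshold, sens, spec

# Hypothetical toy data: 4 melanomas and 6 benign moles.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.6, 0.15, 0.7, 0.4, 0.2, 0.2, 0.1, 0.05]

# Hypothetical target standing in for the physicians' specificity.
threshold, sens, spec = tune_to_specificity(toy_labels, toy_scores, 0.6)
print(f"cut-off={threshold}, sensitivity={sens:.3f}, specificity={spec:.3f}")
```

Fixing specificity first and then comparing sensitivity is what allows a like-for-like reading of the 95 per cent figure against the dermatologists' 86.6 per cent: both are measured at the same rate of benign moles correctly cleared.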
The expert dermatologists performed better at level I than the less experienced dermatologists and were better at detecting malignant melanomas. However, their average ability to make the correct diagnosis was still worse than the CNN at both levels.
“These findings show that deep learning convolutional neural networks are capable of out-performing dermatologists, including extensively trained experts, in the task of detecting melanomas,” he said.
The incidence of malignant melanoma is increasing, with an estimated 232,000 new cases worldwide and around 55,500 deaths from the disease each year. It can be cured if detected early, but many cases are only diagnosed when the cancer is more advanced and harder to treat.
“This CNN may serve physicians involved in skin cancer screening as an aid in their decision whether to biopsy a lesion or not. Most dermatologists already use digital dermoscopy systems to image and store lesions for documentation and follow-up. The CNN can then easily and rapidly evaluate the stored image for an ‘expert opinion’ on the probability of melanoma. We are currently planning prospective studies to assess the real-life impact of the CNN for physicians and patients.”
The study has some limitations, which include the fact that the dermatologists were in an artificial setting where they knew they were not making “life or death” decisions; the test sets did not include the full range of skin lesions; there were fewer validated images from non-Caucasian skin types and genetic backgrounds; and the fact that doctors may not always follow the recommendation of a CNN they don’t trust.
“Currently, diagnostic accuracy for melanoma is dependent on the experience and training of the treating doctor,” Dr Victoria Mar of Monash University in Melbourne and Professor H Peter Soyer of The University of Queensland said of the study.
“This shows that artificial intelligence promises a more standardised level of diagnostic accuracy, such that all people, regardless of where they live or which doctor they see, will be able to access reliable diagnostic assessment.”
But before AI can become standard in clinics, two issues need to be addressed: the difficulty of imaging some melanomas on sites such as the fingers, toes and scalp, and how to train AI sufficiently to recognise atypical melanomas and ones that patients are unaware of.
“Currently, there is no substitute for a thorough clinical examination,” they say.
“However, 2D and 3D total body photography is able to capture about 90 to 95 per cent of the skin surface and given exponential development of imaging technology we envisage that sooner than later, automated diagnosis will change the diagnostic paradigm in dermatology.”