Published on 31.08.2023 in Vol 9 (2023)

Preprints (earlier versions) of this paper are available at https://preprints.jmir.org/preprint/49534.
Image Quality Assessment Using a Convolutional Neural Network for Clinical Skin Images

Abstract

School of Medicine, Duke University, Durham, NC, United States

Corresponding Author:

Meenal Kheterpal

School of Medicine, Duke University

Durham, NC

United States

Phone: 1 919 684 3432

Email: meenal.kheterpal@duke.edu


Background: The quality of the images received for teledermatology evaluation is often suboptimal, with up to 50% of patients providing images that are poorly lit, off-center, or blurry. To ensure a similar level of care to in-person consultations, high-quality images are essential.

Objective: The aim of this study is to develop an image quality analysis tool to assess patient- and primary care physician (PCP)–derived images using a deep learning model leveraging multiple instance learning and ordinal regression for model predictions.

Methods: The data set used for this study was acquired from patient-derived images submitted to the Department of Dermatology, Duke University, between August 21, 2018, and December 31, 2019, and PCP-derived images submitted between March 1, 2021, and June 30, 2022. Seven dermatology faculty members with a designation of professor, associate professor, or assistant professor each evaluated 400 images, and 2 dermatology residents evaluated 400 images, ensuring that each image had 4 different quality labels. We used a pretrained VGG16 architecture, which was fine-tuned by updating its weights on the input data. The images were taken with cell phones (patients) or cameras (PCPs) in RGB color, with a resolution of 76 pixels per inch in both height and width and an average image size of 2840×2793 pixels (SD 986×983; 1471 inch², SD 707 inch²). The optimal threshold was determined using the Youden index, which represents the best trade-off between sensitivity and specificity and balances the number of true positives and true negatives in the classification results. Once the model predicts the rank, the ordinal labels are transformed into binary labels by majority vote, as the goal is to distinguish between 2 distinct categories (good vs bad quality) rather than to predict quality as a continuous variable.
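The two decision steps described in the Methods, threshold selection with the Youden index and majority-vote binarization of ordinal labels, can be sketched as follows. This is a minimal illustration rather than the study's implementation: the function names, the rating scale, the `good_cutoff` parameter, and the rule that ties count as poor quality are all assumptions, since the abstract does not specify them.

```python
def youden_threshold(scores, labels):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    scores: predicted probabilities of "good quality" (hypothetical model output)
    labels: 1 = good quality, 0 = poor quality
    An image is predicted good when its score is >= threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes are required")
    best_t, best_j = None, -1.0
    # Every distinct score is a candidate threshold.
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        j = tp / positives + tn / negatives - 1.0
        if j > best_j:  # keep the lowest threshold when J is tied
            best_t, best_j = t, j
    return best_t, best_j


def majority_vote(ratings, good_cutoff):
    """Collapse the ordinal ratings of one image into a binary label.

    ratings: ordinal quality ranks (the scale is an assumption)
    good_cutoff: ranks >= good_cutoff count as a "good" vote (assumption)
    Ties are labeled poor quality (0); this tie rule is an assumption.
    """
    good_votes = sum(1 for r in ratings if r >= good_cutoff)
    return 1 if good_votes * 2 > len(ratings) else 0


# Toy data, not from the study.
threshold, j_stat = youden_threshold([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
label = majority_vote([3, 4, 2, 4], good_cutoff=3)
```

With an even number of raters per image, as here, a 2-2 split is possible, so any real pipeline needs an explicit tie rule.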

Results: Using the threshold selected by the Youden index to enhance clinical utility, we achieved a positive predictive value (PPV) of 0.906, implying that about 90% of the images the model predicts to be of good quality are indeed of good quality, while about 10% of them are actually of poor quality. The area under the receiver operating characteristic curve (AUC) for the test set was 0.885 (95% CI 0.838-0.933), and the sensitivity, specificity, and negative predictive value (NPV) were 0.829, 0.784, and 0.645, respectively. Further evaluation on an independent validation set consisting of 300 images from patients and 150 images from PCPs yielded AUCs of 0.864 (95% CI 0.818-0.909) and 0.902 (95% CI 0.85-0.95), respectively. The sensitivity, specificity, PPV, and NPV for the 300 patient images were 0.827, 0.800, 0.959, and 0.450, respectively.
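The four reported metrics all derive from a single confusion matrix, and computing them together shows how a high positive predictive value and a low NPV can coexist when most images are of good quality. The sketch below uses illustrative counts that are not taken from the study (the abstract does not report its confusion matrices); they were chosen only to be roughly consistent with the rounded values reported for the 300-image validation set.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute standard metrics, treating "good quality" as the positive class."""
    return {
        "sensitivity": tp / (tp + fn),  # good images correctly accepted
        "specificity": tn / (tn + fp),  # poor images correctly rejected
        "ppv": tp / (tp + fp),          # accepted images that are truly good
        "npv": tn / (tn + fn),          # rejected images that are truly poor
    }


# Hypothetical counts for 300 images (assumed: 255 good, 45 poor).
metrics = diagnostic_metrics(tp=211, fp=9, tn=36, fn=44)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this illustration, the 44 good images that are rejected outnumber the 36 poor images that are rejected, which is what pulls the NPV below 0.5 even though specificity is 0.8.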

Conclusions: This study shows a practical approach to improve image quality for clinical decision-making. While patients and PCPs may have to capture additional images (due to lower NPV), this is offset by the reduced workload and improved efficiency of clinical teams due to the receipt of higher-quality images. Additional images can also be useful if all images (good or poor) are transmitted to medical records. Future studies need to focus on real-time clinical validation of our results.

Conflicts of Interest: None declared.

iproc 2023;9:e49534

doi:10.2196/49534

Multimedia Appendix 1

Overview of the Image Quality Assessment (IQA) network architecture. The input images are partitioned into smaller regions, which are processed through the neural network architecture. The resulting outputs are aggregated, and a threshold criterion is applied to determine whether the image is accepted (good quality) or rejected (bad quality).
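The partition, score, aggregate, and threshold flow in this overview can be sketched in a few lines. This is only an illustration under stated assumptions: the patch size, the non-overlapping tiling, the mean as the aggregation rule, and the `score_patch` callable (a stand-in for the fine-tuned network) are all hypothetical, as the caption does not specify them.

```python
def patch_boxes(height, width, patch):
    """Yield (top, left, bottom, right) for non-overlapping square patches.

    Ragged edges that do not fill a whole patch are dropped (assumption).
    """
    for top in range(0, height - patch + 1, patch):
        for left in range(0, width - patch + 1, patch):
            yield (top, left, top + patch, left + patch)


def assess_image(image, patch, score_patch, threshold):
    """Score every patch, aggregate by the mean, and apply a threshold.

    image: list of rows of pixel values (a stand-in for a real image array)
    score_patch: callable returning a quality score for one patch
                 (hypothetical placeholder for the neural network)
    """
    height, width = len(image), len(image[0])
    scores = []
    for top, left, bottom, right in patch_boxes(height, width, patch):
        region = [row[left:right] for row in image[top:bottom]]
        scores.append(score_patch(region))
    if not scores:
        raise ValueError("image is smaller than one patch")
    image_score = sum(scores) / len(scores)  # mean aggregation (assumption)
    decision = "accept" if image_score >= threshold else "reject"
    return decision, image_score


def mean_pixel(region):
    """Toy scorer: average pixel value of the patch."""
    values = [v for row in region for v in row]
    return sum(values) / len(values)


toy_image = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
]
decision, score = assess_image(toy_image, patch=2, score_patch=mean_pixel, threshold=0.5)
```

Other aggregation rules, such as the minimum or a learned attention weighting over patches, fit the same structure by replacing the line that computes `image_score`.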

PNG File, 200 KB

Edited by A Oakley; submitted 01.06.23; peer-reviewed by JP Tirado-Pérez; accepted 06.08.23; published 31.08.23.

Copyright

©Hyeon Ki Jeong, Ricardo Henao, Christine Park, Simon Jiang, Matilda Nicholas, Suephy Chen, Meenal Kheterpal. Originally published in Iproceedings (https://www.iproc.org), 31.08.2023.

This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in Iproceedings, is properly cited. The complete bibliographic information, a link to the original publication on https://www.iproc.org/, as well as this copyright and license information must be included.