
      The development of “automated visual evaluation” for cervical cancer screening: The promise and challenges in adapting deep‐learning for clinical testing

Research article


          Abstract

There is limited access to effective cervical cancer screening programs in many resource‐limited settings, resulting in a continued high burden of cervical cancer. Human papillomavirus (HPV) testing is increasingly recognized as the preferable primary screening approach where affordable, owing to its superior long‐term reassurance when negative and its adaptability to self‐sampling. Visual inspection with acetic acid (VIA) is an inexpensive but subjective and inaccurate method widely used in resource‐limited settings, either for primary screening or for triage of HPV‐positive individuals. A deep learning (DL)‐based automated visual evaluation (AVE) of cervical images has been developed as an assistive technology to improve the accuracy and reproducibility of VIA. However, like any new clinical technology, AVE requires rigorous evaluation and proof of clinical effectiveness before it is implemented widely. In the current article, we outline essential clinical and technical considerations involved in building a validated DL‐based AVE tool for broad use as a clinical test.


          What's new?

          An emerging option for cervical cancer screening is deep learning‐based automated visual evaluation (AVE) of cervical images. Here, the authors lay out parameters for the successful development of deep learning‐based AVE. For instance, an algorithm should be trained on representative images from each of four distinct biological stages: normal cervix; infection with high‐risk HPV; precancer; and invasive cervical cancer. Characteristics that may lead to erroneous classification, such as cervicitis, should also be considered in the training. Introducing deep learning‐based methods prematurely threatens their eventual acceptance and best use.
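To make the four-stage training setup concrete, the following is a minimal sketch, not the authors' AVE pipeline: the directory layout, ResNet-50 backbone, image size, and hyperparameters are all assumptions for illustration, and only the four class labels come from the summary above.

# Minimal sketch (assumptions, not the authors' AVE pipeline) of training a
# four-class image classifier on the biological stages named above. The
# data layout (data/train/<class>/*.jpg), backbone, and hyperparameters
# are hypothetical.
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

# Four biological stages from the summary; ImageFolder assigns class
# indices from the alphabetically sorted folder names.
CLASSES = ["hr_hpv_infection", "invasive_cancer", "normal", "precancer"]

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],   # ImageNet statistics
                         std=[0.229, 0.224, 0.225]),
])
train_set = datasets.ImageFolder("data/train", transform=transform)
loader = torch.utils.data.DataLoader(train_set, batch_size=32, shuffle=True)

# Transfer learning: swap the pretrained backbone's final layer for a
# 4-way head. Known confusers (e.g., cervicitis) would need to be well
# represented inside the labeled training folders, per the summary above.
model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, len(CLASSES))

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

model.train()
for images, labels in loader:          # one pass over the training images
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()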


Most cited references (68)


          High-performance medicine: the convergence of human and artificial intelligence

          Eric Topol (2019)
          The use of artificial intelligence, and the deep-learning subtype in particular, has been enabled by the use of labeled big data, along with markedly enhanced computing power and cloud storage, across all sectors. In medicine, this is beginning to have an impact at three levels: for clinicians, predominantly via rapid, accurate image interpretation; for health systems, by improving workflow and the potential for reducing medical errors; and for patients, by enabling them to process their own data to promote health. The current limitations, including bias, privacy and security, and lack of transparency, along with the future directions of these applications will be discussed in this article. Over time, marked improvements in accuracy, productivity, and workflow will likely be actualized, but whether that will be used to improve the patient-doctor relationship or facilitate its erosion remains to be seen.

            Artificial intelligence in healthcare


              Variable generalization performance of a deep learning model to detect pneumonia in chest radiographs: A cross-sectional study

Background: There is interest in using convolutional neural networks (CNNs) to analyze medical imaging to provide computer-aided diagnosis (CAD). Recent work has suggested that image classification CNNs may not generalize to new data as well as previously believed. We assessed how well CNNs generalized across three hospital systems for a simulated pneumonia screening task.

Methods and findings: A cross-sectional design with multiple model training cohorts was used to evaluate model generalizability to external sites using split-sample validation. A total of 158,323 chest radiographs were drawn from three institutions: National Institutes of Health Clinical Center (NIH; 112,120 from 30,805 patients), Mount Sinai Hospital (MSH; 42,396 from 12,904 patients), and Indiana University Network for Patient Care (IU; 3,807 from 3,683 patients). These patient populations had an age mean (SD) of 46.9 years (16.6), 63.2 years (16.5), and 49.6 years (17) with a female percentage of 43.5%, 44.8%, and 57.3%, respectively. We assessed individual models using the area under the receiver operating characteristic curve (AUC) for radiographic findings consistent with pneumonia and compared performance on different test sets with DeLong's test. The prevalence of pneumonia was high enough at MSH (34.2%) relative to NIH and IU (1.2% and 1.0%) that merely sorting by hospital system achieved an AUC of 0.861 (95% CI 0.855–0.866) on the joint MSH–NIH dataset. Models trained on data from either NIH or MSH had equivalent performance on IU (P values 0.580 and 0.273, respectively) and inferior performance on data from each other relative to an internal test set (i.e., new data from within the hospital system used for training data; P values both <0.001). The highest internal performance was achieved by combining training and test data from MSH and NIH (AUC 0.931, 95% CI 0.927–0.936), but this model demonstrated significantly lower external performance at IU (AUC 0.815, 95% CI 0.745–0.885, P = 0.001). To test the effect of pooling data from sites with disparate pneumonia prevalence, we used stratified subsampling to generate MSH–NIH cohorts that only differed in disease prevalence between training data sites. When both training data sites had the same pneumonia prevalence, the model performed consistently on external IU data (P = 0.88). When a 10-fold difference in pneumonia rate was introduced between sites, internal test performance improved compared to the balanced model (10× MSH risk P < 0.001; 10× NIH P = 0.002), but this outperformance failed to generalize to IU (MSH 10× P < 0.001; NIH 10× P = 0.027). CNNs were able to directly detect hospital system of a radiograph for 99.95% NIH (22,050/22,062) and 99.98% MSH (8,386/8,388) radiographs. The primary limitation of our approach and the available public data is that we cannot fully assess what other factors might be contributing to hospital system–specific biases.

Conclusion: Pneumonia-screening CNNs achieved better internal than external performance in 3 out of 5 natural comparisons. When models were trained on pooled data from sites with different pneumonia prevalence, they performed better on new pooled data from these sites but not on external data. CNNs robustly identified hospital system and department within a hospital, which can have large differences in disease burden and may confound predictions.
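The internal-versus-external pattern above is easy to reproduce in miniature. The following toy sketch is not the study's code: synthetic features and a logistic model stand in for radiographs and a CNN (scikit-learn and NumPy assumed available), simply to make the internal-vs-external AUC comparison concrete. It trains on one simulated "site" and scores a held-out internal split against an external site whose disease signature differs.

# Toy sketch of internal-vs-external evaluation (not the study's code).
# Synthetic data and logistic regression stand in for radiographs and a CNN.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

def make_site(n, prevalence, effect):
    # One simulated hospital system: `effect` encodes which features the
    # disease shifts at that site, i.e., a site-specific signature.
    y = (rng.random(n) < prevalence).astype(int)
    X = rng.normal(size=(n, 16))
    X[y == 1] += effect
    return X, y

# Prevalences echo MSH (~34.2%) vs NIH (~1.2%); the partially disjoint
# signatures mimic the site-specific biases the CNNs were shown to exploit.
effect_a = 0.8 * np.r_[np.ones(8), np.zeros(8)]
effect_b = 0.8 * np.r_[np.zeros(8), np.ones(8)]
X_a, y_a = make_site(20000, 0.342, effect_a)
X_b, y_b = make_site(20000, 0.012, effect_b)

# Train on site A; evaluate internally (held-out A) and externally (B).
X_tr, X_int, y_tr, y_int = train_test_split(X_a, y_a, test_size=0.2,
                                            random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("internal AUC:", roc_auc_score(y_int, clf.predict_proba(X_int)[:, 1]))
print("external AUC:", roc_auc_score(y_b, clf.predict_proba(X_b)[:, 1]))
# Internal AUC is high; external AUC collapses toward 0.5 because the
# learned signature does not transfer, mirroring the paper's finding.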

                Author and article information

                Contributors
kanan.desai@nih.gov, @kanan_desai2004
Journal
International Journal of Cancer (Int J Cancer; IJC)
Publisher: John Wiley & Sons, Inc. (Hoboken, USA)
ISSN: 0020-7136 (print); 1097-0215 (electronic); journal DOI: 10.1002/(ISSN)1097-0215
Published: 06 December 2021 (online); 01 March 2022 (issue)
Volume 150, Issue 5 (issue DOI: 10.1002/ijc.v150.5), pages 741-752
Affiliations
[1] Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, Maryland, USA
[2] Information Management Services Inc., Calverton, Maryland, USA
[3] Department of Epidemiology, University of Washington School of Public Health, Seattle, Washington, USA
[4] US National Library of Medicine, Bethesda, Maryland, USA
[5] Center for Health Decision Science, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
[6] Division of Cancer Prevention, National Cancer Institute, Rockville, Maryland, USA
[7] Center for Global Health, National Cancer Institute, Rockville, Maryland, USA
[8] ISGlobal, Barcelona, Spain
Author notes
Correspondence: Kanan T. Desai, Clinical Genetics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, NIH, Rockville, MD, USA.
Email: kanan.desai@nih.gov

                Author information
                https://orcid.org/0000-0002-8992-5944
                https://orcid.org/0000-0003-3614-210X
                https://orcid.org/0000-0002-0840-4339
                https://orcid.org/0000-0002-5909-676X
Article
Publisher ID: IJC33879
DOI: 10.1002/ijc.33879
PMCID: PMC8732320
PMID: 34800038
                Published 2021. This article is a U.S. Government work and is in the public domain in the USA. International Journal of Cancer published by John Wiley & Sons Ltd on behalf of UICC.

This is an open access article under the terms of the Creative Commons Attribution 4.0 License (http://creativecommons.org/licenses/by/4.0/), which permits use, distribution and reproduction in any medium, provided the original work is properly cited.

History
Received: 07 July 2021; Revised: 24 September 2021; Accepted: 15 October 2021
                Page count
                Figures: 6, Tables: 0, Pages: 12, Words: 9269
Funding
Funded by: NCI Cancer Moonshot
Funded by: NCI/NIH (DOI: 10.13039/100000002), award ID T32CA09168
Funded by: NIH Intramural Research Program
                Categories
Special Report

Subject: Oncology & Radiotherapy
Keywords: artificial intelligence, cervical cancer screening, clinical validation, HPV tests, visual triage
