To the Editor: Congenital heart disease (CHD) is the most common severe congenital
malformation; it threatens the health of the fetus and is the primary cause of intrauterine
death. Prenatal detection of severe CHD could reduce neonatal mortality and morbidity.
The detection rate of prenatal ultrasonographic examination ranges between 25.0% and
59.7%.[1] New strategies are needed to improve the prenatal detection rate of CHD.
Deep-learning (DL)-based computer-aided diagnosis (CAD) has developed rapidly in recent
years. DL could improve diagnosis, reduce health care costs, and optimize clinical
workflow.[2] Neural networks are already used for standard plane detection in obstetric
ultrasound, as well as for automated imaging and tagging, but less often for screening
and diagnosis of the fetal heart. In this study, we selected two-dimensional (2D)
ultrasound images of the fetal heart in the four-chamber view (4CV) and applied CAD
methodology to identify the main elements and screen for CHD automatically. We used
a simple method to obtain the 4CV, aiming to improve basic fetal heart training and
CHD screening.
This was a retrospective study of images collected at our center between December 2017
and May 2020. We selected 2D images of the fetal heart from our database; all images
came from singleton pregnancies at 20–23+6 weeks’ gestation. The database included
normal heart and CHD images.
The inclusion criteria for normal heart images were as follows: (1) images displaying
the standard 4CV; (2) ultrasound images of singleton pregnancies at 20–23+6 weeks’
gestation; and (3) images from cases in which all ultrasound examinations during pregnancy
and postnatal echocardiography showed normal cardiac structure. The exclusion criteria
for normal heart images were as follows: (1) non-standard 4CV; and (2) images from
cases with cardiac abnormalities during pregnancy or with abnormal cardiac structure
on postnatal echocardiography. The inclusion criteria for CHD images were as follows:
(1) images displaying the standard 4CV; (2) ultrasound images of singleton pregnancies
at 20–23+6 weeks’ gestation; and (3) images from cases in which CHD was confirmed
by postnatal echocardiography or by autopsy after induction of labor. The exclusion
criterion for CHD images was as follows: images in which the cardiac findings in the
4CV were not caused by CHD. The standard 4CV images were obtained following the guidelines
of the International Society of Ultrasound in Obstetrics and Gynecology. The study
complied with the relevant ethical regulations and was approved by the Ethics Committee
of Beijing Obstetrics and Gynecology Hospital (No. 2018-KY-003-03).
In each image, we selected 11 ultrasonographic markers as regions of interest (ROI):
the thoracic region, cardiac region, heart axis angle, atrial septum, ventricular septum,
mitral valve, tricuspid valve, left atrium, left ventricle, right atrium, and right
ventricle. We used the model to annotate and identify these markers. We used a convolutional
neural network (CNN), a DL method, to detect and diagnose 4CV fetal heart images, with
image segmentation used to obtain the ROI.
All images were divided into a training set and a validation set in a 4:1 ratio. The
ultrasound doctors labeled the training-set images, which were then input into the
classification network; the network output a binary result (CHD present or absent)
and was optimized accordingly. The procedure ensured that similar images were not split
across the two sets. After training, we input the validation set into the classification
network and compared its output with the true labels of the corresponding images to
obtain the evaluation indices.
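One way to sketch a 4:1 split that keeps similar images together is to split by case rather than by image. The sketch below assumes that "similar images" means images from the same case, which is an interpretation rather than something stated in the text; the function name `split_by_case`, the identifiers, and the toy data are hypothetical.

```python
import random
from collections import defaultdict

def split_by_case(image_to_case, val_fraction=0.2, seed=0):
    """Split image IDs into (train, validation) lists.

    All images that share a case ID land in the same set, so near-duplicate
    frames from one examination cannot appear on both sides of the split.
    val_fraction=0.2 gives a 4:1 ratio at the case level; the ratio at the
    image level is only approximate when cases have unequal image counts.
    """
    by_case = defaultdict(list)
    for image_id, case_id in image_to_case.items():
        by_case[case_id].append(image_id)

    # Sort first so the shuffle is reproducible for a given seed.
    case_ids = sorted(by_case)
    random.Random(seed).shuffle(case_ids)

    n_val = max(1, round(len(case_ids) * val_fraction))
    val_cases = set(case_ids[:n_val])

    train = [img for c in case_ids if c not in val_cases for img in by_case[c]]
    val = [img for c in case_ids if c in val_cases for img in by_case[c]]
    return train, val

# Hypothetical toy data: 10 cases with 2 images each.
toy = {f"img_{c}_{k}": f"case_{c}" for c in range(10) for k in range(2)}
train_ids, val_ids = split_by_case(toy)
print(len(train_ids), len(val_ids))
```

Splitting at the case level trades an exact image-level ratio for the guarantee that no case contributes to both sets.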
After the doctors manually segmented the training-set images to obtain the ROI, data
augmentation (translation and horizontal and vertical flipping) was performed to enlarge
the training set and reduce overfitting. After processing, the ROI was scaled to a
fixed size and standardized before being input into the classification network. The
ROI obtained is shown in Figure 1.
Figure 1
ROI acquisition. (A) The 4CV, (B) thoracic region, (C) cardiac region, (D) right atrium,
left atrium, right ventricle, and left ventricle, (E) atrial septum, ventricular septum,
mitral valve, and tricuspid valve, (F) heart axis angle (red asterisk) and all the
markers we selected. ROI: Region of interest; 4CV: Four-chamber view.
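The augmentation and preprocessing steps described above (translation, horizontal and vertical flipping, scaling to a fixed size, and standardization) can be sketched as follows. This is an illustrative sketch only: it uses plain nested lists to stay dependency-free, nearest-neighbor resampling is an assumed choice, and the shift size and all function names are hypothetical rather than taken from the study.

```python
from statistics import fmean, pstdev

def flip_horizontal(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]

def flip_vertical(img):
    """Mirror the image top to bottom."""
    return [row[:] for row in img[::-1]]

def translate(img, dx, dy, fill=0):
    """Shift by dx columns (right) and dy rows (down), padding with `fill`."""
    h, w = len(img), len(img[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                out[ny][nx] = img[y][x]
    return out

def resize_nearest(img, out_h, out_w):
    """Scale to a fixed size with nearest-neighbor sampling (assumed method)."""
    h, w = len(img), len(img[0])
    return [[img[y * h // out_h][x * w // out_w] for x in range(out_w)]
            for y in range(out_h)]

def standardize(img):
    """Rescale pixel values to zero mean and unit standard deviation."""
    pixels = [p for row in img for p in row]
    mean = fmean(pixels)
    std = pstdev(pixels) or 1.0  # avoid division by zero on constant images
    return [[(p - mean) / std for p in row] for row in img]

def augment(img):
    """Return the original ROI plus three augmented variants."""
    return [img, flip_horizontal(img), flip_vertical(img), translate(img, 1, 0)]

# Hypothetical 2x2 "ROI" used only to show the transformations.
roi = [[1, 2], [3, 4]]
variants = augment(roi)
prepared = [standardize(resize_nearest(v, 4, 4)) for v in variants]
```

In practice an array library would replace the nested lists; the order shown (augment, resize, standardize) follows the description in the text.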
In our study, we mainly used a ResNet network pre-trained on the ImageNet dataset,[3]
and fine-tuned its convolutional layers and fully connected layer on our training set,
which mitigated the shortage of medical image data and shortened the training time.
Statistical analyses were performed using SPSS 22.0 software (IBM, SPSS, Chicago, United
States) and the scikit-learn library of the Python programming language, version 3.9.0
(Python Software Foundation). Data were expressed as frequencies or as mean and standard
deviation. Screening performance for different CHDs was evaluated by receiver operating
characteristic (ROC) curves. The Youden index was used to calculate the optimal threshold
(cut-off) for the effective diagnosis of CHDs.
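For reference, the Youden index is J = sensitivity + specificity - 1, and the optimal cut-off is the score threshold that maximizes J. The following is a dependency-free sketch with hypothetical labels and scores; it does not reproduce the SPSS or scikit-learn workflow used in the study. With scikit-learn, the same quantity can be read from the output of `roc_curve` as the maximum of `tpr - fpr`.

```python
def youden_threshold(labels, scores):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    labels: 1 = CHD, 0 = normal.
    An image is called positive when its score is >= threshold.
    Ties in J keep the lowest threshold.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_t, best_j = None, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t)
        tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < t)
        j = tp / n_pos + tn / n_neg - 1
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j

# Hypothetical model outputs for 4 normal and 4 CHD images.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.1, 0.2, 0.3, 0.6, 0.4, 0.7, 0.8, 0.9]
threshold, j = youden_threshold(labels, scores)
print(threshold, j)  # 0.4 0.75
```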
A total of 566 cases of CHD were diagnosed in singleton pregnancies at 20 to 23+6 weeks’
gestation at our center between December 2017 and May 2020. After applying the inclusion
and exclusion criteria, 219 CHD cases comprising 520 CHD images and 1000 normal cases
comprising 1002 normal images were included in this study.
The images were divided into a training set and a validation set in a 4:1 ratio. In
the binary experiment, we used the model to determine whether an image showed CHD or
a normal heart. The training set consisted of 1217 images (416 CHD images and 801 normal
images), and the validation set consisted of 305 images (104 CHD images and 201 normal
images). In the training set, the detection rate was 92.40%, the specificity was 87.44%,
and the accuracy was 89.15%. In the validation set, the detection rate was 82.83%,
the specificity was 96.60%, and the accuracy was 92.13%. The area under the ROC curve
(AUC) was 0.962 (95% confidence interval [CI]: 0.949–0.975) for the training set and
0.973 (95% CI: 0.950–0.996) for the validation set.
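The indices reported above follow the usual definitions: detection rate (sensitivity) = TP/(TP + FN), specificity = TN/(TN + FP), and accuracy = (TP + TN)/total, while the AUC equals the probability that a randomly chosen CHD image scores higher than a randomly chosen normal image. A minimal sketch follows; the counts and scores are hypothetical and are not the study data.

```python
def screening_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # detection rate among CHD images
    specificity = tn / (tn + fp)   # correct rejections among normal images
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

def auc(labels, scores):
    """AUC as the fraction of (CHD, normal) pairs ranked correctly.

    Tied scores count as half a correct pair (Mann-Whitney formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical counts: 100 CHD images and 200 normal images.
sens, spec, acc = screening_metrics(tp=80, fn=20, tn=190, fp=10)
print(f"{sens:.2%} {spec:.2%} {acc:.2%}")  # 80.00% 95.00% 90.00%

# Hypothetical scores for 4 normal and 4 CHD images.
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
toy_scores = [0.1, 0.2, 0.3, 0.6, 0.4, 0.7, 0.8, 0.9]
toy_auc = auc(toy_labels, toy_scores)
print(toy_auc)  # 0.9375
```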
For the outlet and arch lesion cases, we added 38 images, about 1/5 of each type, to
the training set of the binary experiment. The validation set for these cases had 145
images, of which the model classified 51.03% (74/145) as abnormal.
Among all the diagnosed diseases, we selected the five with the largest number of cases
for further analysis: ventricular septal defect (VSD), pulmonary stenosis (PS), atrioventricular
septal defect (AVSD), pulmonary atresia (PA), and hypoplastic left heart syndrome (HLHS).
For these five diseases, the detection rates were 84.21%, 48.00%, 77.27%, 70.59%, and
77.78%; the specificities were 94.32%, 99.74%, 98.23%, 95.83%, and 99.66%; and the
accuracies were 91.80%, 93.11%, 96.72%, 94.43%, and 99.02%, respectively. The ROC curves
of the model showed a good AUC for all five types of CHD.
In studies that detected CHD using conventional methods, the reported rate of prenatally
detected major CHD was 4.1% to 61.5%.[1] Our study had a higher sensitivity for CHD
detection and showed higher specificity and accuracy, which were >95%. We could detect
all CHDs with fewer false positives.
Twelve types of CHD can appear abnormal in the 4CV. We selected five of them for further
segmentation learning to extract key information. We chose VSD, PS, AVSD, PA, and HLHS
from our dataset because they may present with an abnormal 4CV, have a lower detection
rate, and require more experience and diagnostic time in clinical practice.
Although VSD is the most common CHD, its detection rate in previous studies was low,
at only 9.1% to 39%.[1,4] In our study, the model could diagnose VSD well. In other
studies, the detection rate of PS was about 1.6% to 30.0%, and that of PA was 6.0%
to 68.2%.[4,5] Both diseases need to be diagnosed in outflow views, yet with CAD methodology
the model could reach a sensible diagnosis with high sensitivity in the 4CV. The detection
rates of AVSD and HLHS were close to those of previous studies, in which the detection
rate of AVSD was about 15.1% to 83.4% and that of HLHS was about 15.0% to 95.8%.[1,4]
Although we could not detect all the malformations in our study from a single plane,
CAD could shorten the examination time, and the model could be used to judge images
once abnormalities are found.
CAD could improve the accuracy and efficiency of basic fetal heart screening and reduce
examination and waiting times for patients. In addition, our model could be applied
to all kinds of ultrasound instruments, and the images in the dataset can be retrieved
at any time for re-analysis. Spatio-temporal image correlation and speckle tracking
echocardiography could be combined with our model to improve the prenatal screening
and diagnosis of CHD. Furthermore, the model could help junior doctors identify more
fetal heart abnormalities during basic fetal heart screening. In primary hospitals
and remote areas, it could be used for fetal heart screening during mid-trimester sonography
and help refer CHD cases to higher-level hospitals in time.
In conclusion, this retrospective study presents a promising method for CHD screening.
The model can potentially be used as an effective and convenient tool for basic fetal
heart training and screening.
Acknowledgments
We thank the ultrasound doctors for collecting the image data for this study.
Funding
This work was supported by a grant from the National Key Research and Development
Program of China (No. 2016YFC1000104).
Conflicts of interest
None.