Introduction
Catheter ablation is an appropriate treatment to ameliorate the symptoms of patients with atrial fibrillation (AF) [1]. The basic post-ablation monitoring strategy recommended by the HRS/EHRA/ECAS expert consensus statement [2] includes a 12-lead electrocardiogram (ECG) at a minimum of three visits (at 3, 6, and 12 months) and 24-hour Holter monitoring at the end of the follow-up period (12 months). In addition to the scheduled ECG recording, timely medical contact whenever symptoms occur is recommended for the detection of symptomatic episodes [3–5]. However, symptoms may resolve before medical contact is made, thus delaying evaluation and possibly resulting in false negative reports. In this study, the temporal pattern of detected atrial fibrillation recurrence (AFR) and the clinical utility of symptom-driven monitoring with a deep learning (DL)-based handheld device were explored.
Methods
Study Design and Patients
This was a prospective, single-cohort study. Patients with paroxysmal AF who underwent radiofrequency catheter ablation (RFCA) were recruited and followed for 12 months. Individuals with prior AF ablation, cardiac implants (except for coronary artery stents), or conditions preventing use of a wearable device (e.g., skin diseases) were excluded from the study. Antiarrhythmic agents were discontinued five half-lives before RFCA and were not prescribed post-ablation. Therapeutic anticoagulation was prescribed to each patient. For routine scheduled monitoring, 24-hour Holter monitoring and 12-lead ECG were arranged every 3 months after the RFCA. For symptom-driven monitoring, patients were trained to use a DL-based handheld device to record real-time ECG signals whenever symptoms (e.g., palpitation, chest tightness, chest pain, dyspnea, dizziness, or fatigue) presented. An AFR was defined as an episode of AF of at least 30 seconds in duration after a 3-month blanking period. Approval was granted by the ethics committee of Sir Run Run Shaw Hospital, and written informed consent was provided by each participant.
Deep Learning-based Handheld Device
Through simulation-based training, the patients learned to manually initiate the recording by using an ECG card when symptoms presented. The ECG card used two metal-plate electrodes to form a differential electrode pair and provided a 5-minute standard lead-I ECG recording with a sampling rate of 500 Hz per manual trigger. The data recorded by the ECG card were sent to a secure server for further analysis.
After data collection, the ECG signals were preprocessed [6] to remove noise, then classified into four categories of AF, normal, other rhythms, and noise [7] by a DL model consisting of a 34-layer deep residual network (ResNet) [8]. The model was initially established on the basis of the open-source 2017 PhysioNet/CinC Challenge ECG dataset [9] and was further tuned through the transfer learning technique by using ECG recordings labeled through consensus by a committee of three experts. During the classification process, the ECG recordings were truncated into segments of 5 seconds and analyzed individually, and a dichotomous output (i.e., AFR vs. non-AFR) was derived via a majority vote. To ensure accuracy, the DL-derived output was reviewed independently by two qualified cardiologists blinded to any diagnosis; disagreements were settled by a third expert physician (senior consultant). An example recording of the sinus rhythm and the onset of AF is demonstrated in Figure 1.
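The segment-and-vote step described above can be sketched as follows. This is a minimal illustration, not the study's implementation: the function names, the rule for dropping a trailing partial segment, the mapping of the four classes onto AFR vs. non-AFR (here, "AF" against everything else), and the strict-majority threshold are all assumptions. The segment classifier is passed in as a placeholder for the ResNet model.

```python
from typing import Callable, List, Sequence

SAMPLING_RATE_HZ = 500  # sampling rate stated in the text
SEGMENT_SECONDS = 5     # segment length stated in the text


def split_into_segments(signal: Sequence[float]) -> List[Sequence[float]]:
    """Cut a recording into consecutive 5-second segments.

    Assumption: a trailing partial segment is dropped.
    """
    n = SAMPLING_RATE_HZ * SEGMENT_SECONDS
    return [signal[i:i + n] for i in range(0, len(signal) - n + 1, n)]


def classify_recording(
    signal: Sequence[float],
    classify_segment: Callable[[Sequence[float]], str],
) -> str:
    """Return 'AFR' if a strict majority of segments is labelled 'AF'.

    Assumption: 'normal', 'other', and 'noise' all count as non-AF votes.
    """
    segments = split_into_segments(signal)
    if not segments:
        return "non-AFR"  # assumption: too short to classify
    labels = [classify_segment(seg) for seg in segments]
    af_votes = sum(1 for label in labels if label == "AF")
    return "AFR" if 2 * af_votes > len(labels) else "non-AFR"


def toy_classifier(segment: Sequence[float]) -> str:
    """Hypothetical stand-in for the DL model, for demonstration only."""
    return "AF" if sum(segment) / len(segment) > 0.5 else "normal"


# Three segments: two labelled 'AF' by the toy rule, one labelled 'normal'.
demo_signal = [1.0] * 5000 + [0.0] * 2500
result = classify_recording(demo_signal, toy_classifier)
```

With the toy classifier, two of three segments vote "AF", so the recording is labelled AFR.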
Statistical Analyses
Continuous variables are presented as mean ± standard deviation or median (interquartile range [IQR]), whereas categorical variables are presented as count (percentage). For inter-group comparisons, Pearson’s chi-square test or Fisher’s exact test was used for categorical data, whereas the Wilcoxon rank-sum test was used for continuous data. Accuracy, sensitivity, and specificity were calculated to evaluate the performance of the DL-based handheld device. To better characterize the effect of symptom-driven monitoring on AFR detection, the frequency of ECG recording was entered as a discretized independent variable in the multivariate logistic regression and Cox proportional hazards models. Different sets of covariates were adjusted for in the sensitivity analyses. The odds ratios (ORs), hazard ratios (HRs), and 95% confidence intervals (CIs) were generated after adjustment for covariates, and P-values for time-to-event analyses were calculated with log-rank tests. Statistical analyses were performed in R version 3.1.0 (The R Foundation for Statistical Computing, Vienna, Austria).
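For reference, the three performance measures named above follow directly from a two-by-two confusion matrix against the expert-confirmed reference. The sketch below uses illustrative counts only; they are not the study's data, and the function name is hypothetical.

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Accuracy, sensitivity, and specificity from confusion-matrix counts.

    tp/fn: reference-positive cases classified as positive/negative.
    tn/fp: reference-negative cases classified as negative/positive.
    """
    total = tp + fp + tn + fn
    if total == 0 or (tp + fn) == 0 or (tn + fp) == 0:
        raise ValueError("each reference class needs at least one case")
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Illustrative counts only (not taken from the study).
metrics = diagnostic_metrics(tp=8, fp=5, tn=85, fn=2)
```

Because AF tracings are a small share of all tracings, accuracy is dominated by the specificity term, which is why sensitivity should be read alongside it.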
Results
Baseline Features and Deep Learning-based Identification
A total of 67 patients with a mean age of 59.7 years were enrolled in this study. More than half of the study population comprised men (53.7%). With respect to manually confirmed AFR as the gold standard, the DL-based handheld device had an accuracy of 98.2%, a specificity of 99.2%, and a sensitivity of 73.3% at the ECG tracing level, and an accuracy of 93.5%, a specificity of 88.2%, and a sensitivity of 100.0% at the patient level. With the scheduled monitoring (24-hour Holter) and the handheld device, a total of 22 patients with AFR were identified. No significant inter-group differences were identified (AFR vs. non-AFR, Table 1).
Table 1. Baseline Characteristics of the Study Population.
Characteristic | Total (n = 67) | AFR (n = 22) | Non-AFR (n = 45) | P-value |
---|---|---|---|---|
Demography | | | | |
Age (years) | 59.73 ± 8.40 | 59.56 ± 8.58 | 59.81 ± 8.40 | 0.91 |
Male sex | 36 (53.7%) | 13 (59.1%) | 23 (51.1%) | 0.54 |
Comorbidities | | | | |
Hypertension | 38 (56.7%) | 12 (54.5%) | 26 (57.8%) | 0.80 |
Coronary artery disease | 9 (13.4%) | 3 (13.6%) | 6 (13.3%) | 1.00 |
Diabetes mellitus | 8 (11.9%) | 2 (9.1%) | 6 (13.3%) | 1.00 |
Echocardiography | | | | |
Left ventricular ejection fraction (%) | 69.69 ± 6.60 | 69.70 ± 3.62 | 69.68 ± 7.77 | 0.99 |
Left atrial diameter (mm) | 34.73 ± 4.15 | 33.43 ± 3.68 | 35.41 ± 4.28 | 0.14 |
Abbreviation: AFR, atrial fibrillation recurrence.
Detection of Recurrence Across Modalities
With the symptom-driven recording, the DL-based handheld device detected 19 (28.4%) patients who experienced AFR during the follow-up, whereas the 24-hour Holter monitor detected AFR in eight (11.9%) patients (Figure 2). Five patients with AFR were detected by both modalities.

Figure 2. Scaled Venn Diagram of AFR Detection Through Different Modalities.
Abbreviations: DL, deep learning; AFR, atrial fibrillation recurrence. Note: the red circle indicates AFR detected by the DL-based handheld device; the green circle indicates AFR detected by 24-hour Holter monitoring; the overlap indicates AFR detected by both modalities.
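The patient counts reported for the two modalities can be reconciled with the overall total by inclusion-exclusion. The short calculation below uses only the counts stated in the text; the variable names are illustrative.

```python
# Patient counts reported in the text.
handheld = 19  # AFR detected by the DL-based handheld device
holter = 8     # AFR detected by 24-hour Holter monitoring
both = 5       # AFR detected by both modalities

# Inclusion-exclusion: patients detected by at least one modality.
either = handheld + holter - both           # 22
handheld_only = handheld - both             # 14
holter_only = holter - both                 # 3

# Share of all AFR patients that Holter monitoring alone did not capture.
missed_by_holter_alone = handheld_only / either
```

The union equals the 22 patients with AFR reported overall, of whom 14 were identified by the handheld device only and 3 by Holter monitoring only.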
Temporal Patterns of the ECG Recordings
Significantly more ECG recordings were documented for patients with AFR than for those without [362 (330) vs. 132 (133), P=0.01]. Among patients with AFR, a total of 9671 tracings of real-time ECG recordings with a duration exceeding 182 hours were documented, and AFR was detected in 486 tracings (5.0%) with a recording duration of 9 hours. The temporal distribution of the daily AFR is displayed in Figure 3. The density plot shows the smoothed distribution of AFR over the time of day. The highest concentration of AFR, indicated by the peak of the density plot, was between 18:00 and 24:00 (Figure 3).
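A density plot of this kind can be produced with a kernel density estimate over the clock time of each AFR tracing. The sketch below is an assumption-laden illustration: the smoothing method used for Figure 3 is not stated, so a Gaussian kernel with a 1-hour bandwidth is assumed, the clock times are hypothetical, and wrap-around at midnight is ignored.

```python
import math
from typing import List, Sequence


def gaussian_kde(samples: Sequence[float], grid: Sequence[float],
                 bandwidth: float) -> List[float]:
    """Evaluate a Gaussian kernel density estimate on the grid points."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
        for x in grid
    ]


# Hypothetical clock times (decimal hours) of AFR tracings; not study data.
afr_hours = [7.5, 12.0, 18.2, 19.0, 19.5, 20.1, 21.0, 21.4, 22.3, 23.0]

# Evaluate the density every half hour from 00:00 to 24:00.
grid = [0.5 * k for k in range(49)]
density = gaussian_kde(afr_hours, grid, bandwidth=1.0)

# Hour at which the smoothed density is highest.
peak_hour = grid[density.index(max(density))]
```

For data that cluster close to midnight, a circular kernel would be a more faithful choice than the linear one assumed here.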
Frequency of ECG Recording in Recurrence Detection
After correction for the covariates of demographic variables (age and sex) and comorbidities (hypertension, coronary artery disease, and diabetes mellitus), an increase in AFR detection was found with more frequent recordings in the multivariate logistic regression (OR=1.40, 95% CI 1.02–1.92, P=0.036, Table 2), as well as in sensitivity testing (after adjustment for covariates of demography and echocardiography: OR=1.53, 95% CI 1.01–2.33, P=0.047, Table 2).
Table 2. Deep Learning-based Handheld Device in the Detection of AFR.
Variable | Model 1 OR | Model 1 P-value | Model 1 95% CI | Model 2 OR | Model 2 P-value | Model 2 95% CI |
---|---|---|---|---|---|---|
Frequency of ECG recordings | 1.40 | 0.036 | 1.02, 1.92 | 1.53 | 0.047 | 1.01, 2.33 |
Demography | | | | | | |
Age | 0.97 | 0.443 | 0.90, 1.05 | 0.96 | 0.455 | 0.87, 1.07 |
Male sex | 0.89 | 0.850 | 0.27, 2.94 | 1.21 | 0.803 | 0.28, 5.26 |
Comorbidities | | | | | | |
Hypertension | 0.76 | 0.654 | 0.23, 2.53 | | | |
Coronary artery disease | 0.87 | 0.884 | 0.14, 5.44 | | | |
Diabetes mellitus | 1.01 | 0.993 | 0.16, 6.45 | | | |
Echocardiography | | | | | | |
LVEF | | | | 0.99 | 0.834 | 0.88, 1.11 |
Left atrial diameter | | | | 0.93 | 0.472 | 0.76, 1.13 |
Abbreviations: OR, odds ratio; CI, confidence interval; AFR, atrial fibrillation recurrence; LVEF, left ventricular ejection fraction.
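As a consistency check on the reported estimate for recording frequency in Table 2 (OR=1.40, 95% CI 1.02–1.92, P=0.036), the coefficient, standard error, and P-value can be back-calculated from the ratio and its interval. The sketch assumes a Wald-type interval on the log scale and a two-sided normal test, which the source does not state; the function name is hypothetical, and the inputs are rounded published values, so small discrepancies are expected.

```python
import math
from typing import Tuple


def wald_from_ratio(ratio: float, ci_low: float, ci_high: float,
                    z_crit: float = 1.96) -> Tuple[float, float, float, float]:
    """Recover (beta, se, z, p) from a ratio estimate and its 95% CI.

    Assumes the interval is exp(beta +/- z_crit * se).
    """
    beta = math.log(ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = beta / se
    # Two-sided P-value under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p


# Recording frequency, Model 1 (values as reported in Table 2).
beta, se, z, p = wald_from_ratio(1.40, 1.02, 1.92)
```

Under these assumptions the recovered P-value lies close to the reported 0.036, which suggests the estimate, interval, and P-value are mutually consistent.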
Cox proportional hazards regression was used to estimate the HR and 95% CI for incident AFR in relation to the recording frequency. The time interval from day 1 post-ablation to the date of first AFR capture was set as the survival duration, in which a shorter time to event indicated earlier detection of AFR. As the frequency increased, similar findings of shortened survival duration were found with both the Cox proportional hazards regression (after adjustment for covariates of demography and comorbidities: HR=1.58, 95% CI 1.16–2.16, P=0.004, P for global test=0.511, Table 3) and sensitivity testing (after adjustment for covariates of demography and echocardiography: HR=1.76, 95% CI 1.13–2.75, P=0.013, P for global test=0.086, Table 3).
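The survival duration defined above can be derived per patient as sketched below. Several details are assumptions rather than the study's stated procedure: the 3-month blanking period is approximated as 90 days, follow-up as 365 days, patients without AFR are treated as censored at the end of follow-up, and the function and constant names are hypothetical.

```python
from datetime import date, timedelta
from typing import Iterable, Tuple

BLANKING_DAYS = 90    # assumption: 3-month blanking period taken as 90 days
FOLLOW_UP_DAYS = 365  # assumption: 12-month follow-up taken as 365 days


def time_to_first_afr(ablation_date: date,
                      afr_dates: Iterable[date]) -> Tuple[int, bool]:
    """Return (duration_days, event_observed) for one patient.

    Duration runs from day 1 post-ablation to the first AFR captured after
    the blanking period, or to the end of follow-up if none was captured.
    """
    start = ablation_date + timedelta(days=1)
    end = ablation_date + timedelta(days=FOLLOW_UP_DAYS)
    eligible = sorted(
        d for d in afr_dates
        if BLANKING_DAYS < (d - ablation_date).days <= FOLLOW_UP_DAYS
    )
    if eligible:
        return (eligible[0] - start).days, True
    return (end - start).days, False  # censored


# Hypothetical patient: one episode inside the blanking period (ignored)
# and one after it (counted as the first AFR).
with_afr = time_to_first_afr(date(2020, 1, 1),
                             [date(2020, 2, 15), date(2020, 6, 1)])
without_afr = time_to_first_afr(date(2020, 1, 1), [])
```

The resulting pairs are the usual input to a Cox model: a duration and an indicator of whether the event was observed or the patient was censored.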
Table 3. Deep Learning-based Handheld Device in the Timing of AFR Detection.
Variable | Model 1 HR | Model 1 P-value | Model 1 95% CI | Model 2 HR | Model 2 P-value | Model 2 95% CI |
---|---|---|---|---|---|---|
Frequency of ECG recordings | 1.58 | 0.004 | 1.16, 2.16 | 1.76 | 0.013 | 1.13, 2.75 |
Demography | | | | | | |
Age | 0.95 | 0.111 | 0.89, 1.01 | 0.94 | 0.223 | 0.85, 1.04 |
Male sex | 0.67 | 0.509 | 0.20, 2.23 | 0.91 | 0.900 | 0.23, 3.71 |
Comorbidities | | | | | | |
Hypertension | 1.07 | 0.906 | 0.33, 3.44 | | | |
Coronary artery disease | 0.68 | 0.728 | 0.08, 6.02 | | | |
Diabetes mellitus | 0.79 | 0.837 | 0.08, 7.49 | | | |
Echocardiography | | | | | | |
LVEF | | | | 1.00 | 0.999 | 0.89, 1.12 |
Left atrial diameter | | | | 0.91 | 0.378 | 0.73, 1.12 |
Abbreviations: HR, hazard ratio; CI, confidence interval; AFR, atrial fibrillation recurrence; ECG, electrocardiogram; LVEF, left ventricular ejection fraction.
Discussion
AFR was detected in 22/67 (33%) patients over the 12-month follow-up. Symptom-driven monitoring with the DL-based handheld device detected AFR in more patients than scheduled 24-hour Holter monitoring (19 vs. 8 patients). The improved detection may be attributable to the frequent ECG recording enabled by the handheld device [362 (330) tracings for AFR vs. 132 (133) tracings for non-AFR, P=0.01]. Therefore, this device not only serves as a pragmatic tool for home-based monitoring, given that symptom-related AFR recordings were concentrated between 18:00 and 24:00, but also enables timely detection of AFR (HR=1.6, 95% CI 1.2–2.2, P<0.01).
Despite a substantial decrease in AF burden, a 53% time-to-first-recurrence success rate within 1 year after ablation has been reported in a large multicenter, randomized AF ablation trial [10], thus indicating the need for extensive monitoring coverage during follow-up. Expanding monitoring to the greatest extent possible would require community- or home-based rhythm recording, which is impractical with guideline-recommended scheduled monitoring. In addition, a prospective study has evaluated the safety of direct oral anticoagulants (DOACs) in patients with AF by investigating major and minor bleeding events associated with DOAC use during daily clinical practice. The results revealed 205 (11.4%) bleeding events, consisting of 34 (1.9%) major bleeding and 171 (9.4%) minor bleeding events [11]. Although DOACs are essential in preventing stroke in patients with AF, these findings highlight the importance of a well-designed monitoring strategy, in which patients with a validated absence of AFR might be spared the bleeding risk associated with DOACs.
The implantable loop recorder (ILR) remains the gold standard for the detection of AF [12]. Because ILRs monitor continuously, they can detect transient and silent episodes of AF [13]. However, ILRs are subject to noise and artifacts, which cause false episode detection [14]. ILR memory is also limited and may not be sufficient to store electrograms (EGMs) for more than a set number of episodes of suspected arrhythmia; moreover, ILRs are very expensive [15, 16]. In our study using a handheld device, more than 144 episodes of ECG signals with a duration of almost 3 hours per patient were documented, read, and sent to the server, which enabled large storage volumes and real-time evaluation. Continuous adhesive ECG patches may also identify asymptomatic episodes of AF, but they are often worn for limited time periods (7–30 days). In addition, the adhesive gels used in these wearable devices often cause skin allergies [17]. Furthermore, silent episodes detected by wearable ECG patches often coexist with symptomatic AFR [18]; thus, reporting of symptom-related episodes may serve as a rough surrogate for concomitant silent episodes [19]. Finally, trans-telephonic ECG technology has shown advantages in the monitoring of AF events [20–22]. However, these sensor cards have very limited memory and can often retain only three ECGs, each with a 30-second duration, which are then transmitted via telephone [20]. The symptom-driven recording with the handheld device in our study not only offered ease of use comparable to that of trans-telephonic devices, but also had much greater memory capacity and used automated transmission to minimize data loss. Furthermore, whereas trans-telephonic ECG recordings must be interpreted visually, our approach automatically classified more than 9000 tracings as free of AFR, thus saving time for the experts interpreting ECG readings.
Our findings indicated that a handheld ECG recording device identified many more patients with AFR than standard monitoring alone. The DL-based handheld device and 24-hour Holter monitor detected 19 and 8 patients with AFR, respectively, only five of whom were identified by both modalities. Therefore, a large percentage of AFR would have been missed if only the routinely scheduled monitoring [23] had been performed. Interestingly, large numbers of symptomatic AFR events were detected between 18:00 and 24:00, when contact with outpatient clinics is very limited, particularly for patients located far from medical facilities.
During the COVID-19 pandemic, post-ablation management for AF patients required remote rhythm monitoring combined with teleconsultations. An mHealth infrastructure (TeleCheck-AF), developed by the Cardiology Department of the Maastricht University Medical Centre, has been deployed and found to be effective in the continuous comprehensive monitoring of AF by using a mobile phone application [24]. Notably, TeleCheck-AF is a photoplethysmography-based technology, which has an inherently lower accuracy than ECG monitoring [25]. Our ECG-based approach is therefore not only easy to use but may also offer greater accuracy than photoplethysmography-based monitoring.
Limitations
This study has several limitations. First, only 5.0% of the potential symptom-associated recordings were eventually confirmed to be AFR episodes (486 of 9671), thus suggesting potentially low efficiency of this symptom-driven monitoring approach. However, the handheld device was easy to use, and reporting was facilitated by DL; despite the high volume of recordings, the approach did not inconvenience patients or physicians or lead to unnecessary medical interventions. Second, too many variables might have been included in the multivariate regression models, given the limited number of events. However, after adjustment for multiple sets of potential covariates in the sensitivity tests, the hazard ratios and confidence intervals remained similar, thereby indicating the robustness of our findings. Third, anxiety or depression may be provoked by awareness of AFR, thus resulting in patient noncompliance. Although the emotional state of our patients was not evaluated, most participants were able to complete the follow-up and comply with the medical arrangements.