-
摘要:
针对成像式光体积描记术(Image Photoplethysmography, IPPG)信号采集过程中易受到噪声干扰的问题,本文提出了一种针对IPPG噪声分布特性的去噪扩散概率模型(Denoising Diffusion Probability Model for IPPG, DDPM-IPPG),通过扩散和逆扩散阶段消除基线漂移与噪声,提升信号的信噪比和后续心率指标的准确性。首先,在扩散阶段对光体积描记术(Photoplethysmography, PPG)信号逐步添加高斯噪声,构建噪声序列,训练基于非线性交融模块和桥接模块的噪声预测器。其次,在逆扩散阶段利用训练完善的噪声预测器对初步提取的IPPG信号进行逐步去噪,恢复出形态相似于PPG的IPPG信号。本文提出的模型在PURE、UBFC-rPPG、UBFC-Phys和MMPD数据集上验证和对比分析。实验结果表明:与现有最高精度提取方法相比,DDPM-IPPG在PURE数据集上,信噪比提升1.06 dB,心率的平均绝对误差下降0.24 bpm,均方根误差下降0.41 bpm;在UBFC-IPPG数据集上信噪比提升1.50 dB。本文提出的DDPM-IPPG模型在IPPG信号消除基线漂移与噪声方面达到了当前先进水平,能够更精确地逼近真实信号,为生理健康评估与远程医疗监测提供了更加可靠的数据基础。
Abstract:Image Photoplethysmography (IPPG) signals are easily disturbed by noise during acquisition. To address the issue, this study proposes a denoising diffusion probability model for IPPG (DDPM-IPPG). This model eliminates baseline drift and noise through diffusion and reverse diffusion stages. This improves the signal-to-noise ratio and heart rate accuracy. First, Gaussian noise is gradually added to the photoplethysmography (PPG) signal during the diffusion phase to create a noise sequence. A noise predictor based on a nonlinear fusion module and a bridging module is trained. Subsequently, in the reverse diffusion phase, the well-trained noise predictor is employed. It performs step-by-step denoising on the initially extracted IPPG signal. Through this denoising, a signal with high signal-to-noise ratio is recovered. The model proposed in this paper is validated and compared on the PURE, UBFC-IPPG, UBFC-Phys, and MMPD datasets. The experimental results show the following. Compared with the existing highest-precision extraction method, DDPM-IPPG improves the signal-to-noise ratio by 1.06 dB on the PURE dataset. The mean absolute error of heart rate decreases by 0.24 bpm. The root mean square error of heart rate decreases by 0.41 bpm. On the UBFC-IPPG dataset, the signal-to-noise ratio improved by 1.50 dB. The proposed DDPM-IPPG model reaches the current advanced level. It excels in eliminating baseline drift and noise of IPPG signals. It can approximate real signals more accurately. In turn, it provides a more reliable data foundation. This foundation supports physiological health assessment and telemedicine monitoring.
-
Key words:
- imaging photoplethysmography /
- signal denoising /
- diffusion model /
- attention mechanism
-
图 3 IPPG信号初步提取方法。(a) IPPG信号提取流程图;(b) IPPG的图像处理模块;(c) RGB图;(d) 原始IPPG波形图;(e) 滤波后的IPPG波形图;(f) 截取部分帧的IPPG信号与PPG信号(黄色部分)进行对比
Figure 3. Preliminary extraction method for IPPG signals. (a) IPPG signal extraction flowchart; (b) IPPG image processing module; (c) RGB image; (d) Original IPPG waveform diagram; (e) Filtered IPPG waveform diagram; (f) Comparison of IPPG signals and PPG signals (yellow portion) from a portion of a frame
表 1 IPPG数据集
Table 1. IPPG dataset
数据集名称 受试者
数量视频
数量视频
时长帧率 分辨率 PPG
采样率采集设备与方式 采集场景/条件描述 PURE 10人 60 1 min 30 Hz 640×480 60 Hz Eco274CVGE 相机
CMS50E 脉搏血氧仪稳定、交谈、慢速平移、快速平移、
小范围旋转、中度旋转共6种条件UBFC-rPPG 42人 46 1 min 30 Hz 640×480 60 Hz LogitechC920相机
CMS50E血氧仪室内自然光,通过数字游戏诱导心率变化 UBFC-Phys 56人 168 3 min 35 Hz 1024 ×1024 64 Hz E0-23121CRGB相机
EmpaticaE4腕带室内自然光,压力诱导情境 MMPD 33人 660 1 min 30 Hz 320×240 30 Hz SamsungGalaxyS22手机
HKG-07C血氧仪4种照明(自然光、白炽灯、低LED、高LED)
和4种活动(静止、转头、说话、行走)表 2 IPPG提取方法性能对比(PURE数据集)
Table 2. Comparison of IPPG extraction methods (PURE dataset)
方法 评估指标 SNR HR AVNN SDNN (dB)↑ MAE(bpm)↓ RMSE(bpm)↓ r↑ MAE(ms)↓ RMSE(ms)↓ MAE(ms)↓ RMSE(ms)↓ CHROM[18] 5.35 4.85 16.27 0.72 40.19 − 42.18 − POS[19] − 2.45 3.88 0.85 10.83 16.42 21.56 30.86 ESA-rPPGNet[26] − − − − 8.92 − 11.75 − PhysFormer[23] 6.31 4.36 13.00 0.76 24.83 − 29.11 − DiffPhys[30] 10.07 1.46 5.88 0.90 13.75 − 21.10 − PulseGan[29] 6.69 2.01 6.87 0.87 24.04 − 45.63 − DeepPhys[20] 6.32 3.96 12.95 0.76 25.55 − 41.20 − rPPG-MAE[27] − 0.40 0.92 0.99 − − − − SiNC[28] − 0.61 1.84 1.00 − − − − STFPNet[31] − 0.47 0.69 0.99 − − − − RhythmFormer[25] − 0.27 0.47 0.99 − − − − Ours 11.13 0.03 0.06 0.99 8.90 9.66 42.67 49.66 表 3 IPPG提取方法性能对比(UBFC-rPPG数据集)
Table 3. Comparison of IPPG extraction methods (UBFC-rPPG dataset)
方法 评估指标 SNR HR AVNN SDNN (dB)↑ MAE(bpm)↓ RMSE(bpm)↓ r↑ MAE(ms)↓ RMSE(ms)↓ MAE(ms)↓ RMSE(ms)↓ POS[19] − 2.47 3.88 0.84 12.76 18.28 21.76 31.40 CHROM[18] 4.92 3.19 9.98 0.90 29.11 − 24.21 − ESA-rPPGNet[26] − − − − 5.14 − 13.76 − PhysFormer[23] 6.01 2.83 6.43 0.99 7.11 − 13.49 − DiffPhys[30] 7.98 1.05 1.63 0.99 7.11 − 13.49 − PulseGan[29] 7.90 1.19 2.10 0.97 7.52 − 18.36 − rPPG-MAE[27] − 0.17 0.21 0.99 − − − − STFPNet[31] − 0.41 0.95 0.99 − − − − RhythmFormer[25] − 0.50 0.78 0.99 − − − − Ours 9.48 0.50 0.74 0.99 7.46 10.64 21.71 30.90 表 4 IPPG提取方法性能对比(UBFC-Phys和MMPD数据集)
Table 4. Comparison of IPPG extraction methods (UBFC-Phys and MMPD dataset)
方法 评估指标 HR(UBFC-Phys) HR(MMPD) MAE(bpm)↓ RMSE(bpm)↓ r↑ MAE(bpm)↓ RMSE(bpm)↓ r↑ GREEN[40] 13.55 18.80 0.29 21.68 27.69 −0.01 ICA[17] 10.04 15.73 0.36 18.60 24.30 0.01 CHROM[18] 4.49 7.56 0.80 13.66 18.76 0.08 LGI[41] 6.27 10.41 0.70 17.08 23.32 0.04 PBV[42] 12.34 17.43 0.33 17.95 23.58 0.09 POS[19] 4.51 8.16 0.77 12.36 17.71 0.18 DeepPhys[20] − − − 22.27 28.92 −0.03 PhysNet[24] − − − 4.80 11.80 0.60 TS-CAN[21] − − − 9.71 17.22 0.44 PhysFomer[23] − − − 11.99 18.41 0.18 EfficientPhys[22] − − − 13.47 21.32 0.21 RhythmFormer[25] − − − 3.07 6.81 0.86 Ours 1.40 3.51 0.89 5.99 10.39 0.63 -
[1] ZHANG X B, XIA ZH Q, DAI J, et al. MSDN: a multistage deep network for heart-rate estimation from facial videos[J]. IEEE Transactions on Instrumentation and Measurement, 2023, 72: 5032415. [2] TAO X, SU L W, RAO ZH, et al. Facial video-based non-contact emotion recognition: a multi-view features expression and fusion method[J]. Biomedical Signal Processing and Control, 2024, 96: 106608. doi: 10.1016/j.bspc.2024.106608 [3] GUARDUCCI S, JAYOUSI S, CAPUTO S, et al. Key fundamentals and examples of sensors for human health: wearable, non-continuous, and non-contact monitoring devices[J]. Sensors, 2025, 25(2): 556. doi: 10.3390/s25020556 [4] LEE R J, SIVAKUMAR S, LIM K H. Review on remote heart rate measurements using photoplethysmography[J]. Multimedia Tools and Applications, 2024, 83(15): 44699-44728. [5] PEREPELKINA O, ARTEMYEV M, CHURIKOVA M, et al. HeartTrack: convolutional neural network for remote video-based heart rate monitoring[C]. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), IEEE, 2020: 1163-1171. [6] YU Z T, PENG W, LI X B, et al. Remote heart rate measurement from highly compressed facial videos: an end-to-end deep learning solution with video enhancement[C]. 2019 IEEE/CVF International Conference on Computer Vision (ICCV), IEEE, 2019: 151-160. [7] CASADO C Á, LÓPEZ M B. Face2PPG: an unsupervised pipeline for blood volume pulse extraction from faces[J]. IEEE Journal of Biomedical and Health Informatics, 2023, 27(11): 5530-5541. doi: 10.1109/JBHI.2023.3307942 [8] ALNAGGAR M, SIAM A I, HANDOSA M, et al. Video-based real-time monitoring for heart rate and respiration rate[J]. Expert Systems with Applications, 2023, 225: 120135. doi: 10.1016/j.eswa.2023.120135 [9] ZOU B CH, GUO Z ZH, HU X CH, et al. RhythmMamba: fast, lightweight, and accurate remote physiological measurement[C]. Proceedings of the 39th AAAI Conference on Artificial Intelligence, AAAI Press, 2025: 11077-11085. [10] WU B W, JIANG T, YU ZH X, et al. Proximity sensing electronic skin: principles, characteristics, and applications[J]. Advanced Science, 2024, 11(13): 2308560. doi: 10.1002/advs.202308560 [11] 饶治, 李炳霖, 隋雅茹, 等. 成像式光体积描记术精神压力检测[J]. 中国光学(中英文), 2022, 15(6): 1350-1359.RAO ZH, LI B L, SUI Y R, et al. Image photoplethysmography for mental stress detection[J]. Chinese Optics, 2022, 15(6): 1350-1359. (in Chinese). [12] 嵇晓强, 刘振瑶, 李炳霖, 等. 面部视频非接触式生理参数感知[J]. 中国光学, 2022, 15(2): 276-285. doi: 10.37188/CO.2021-0157JI X Q, LIU ZH Y, LI B L, et al. Non-contact perception of physiological parameters from videos of faces[J]. Chinese Optics, 2022, 15(2): 276-285. (in Chinese). doi: 10.37188/CO.2021-0157 [13] CASADO C Á, CAÑELLAS M L, LÓPEZ M B. Depression recognition using remote photoplethysmography from facial videos[J]. IEEE Transactions on Affective Computing, 2023, 14(4): 3305-3316. doi: 10.1109/TAFFC.2023.3238641 [14] ANIL A A, KARTHIK S, SIVAPRAKASAM M, et al. Dynamic ROI adaptation for accurate non-contact heart rate estimation using VGG-13 based encoder-decoder model and facial landmarks[C]. ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, 2025: 1-5. [15] TULYAKOV S, ALAMEDA-PINEDA X, RICCI E, et al. Self-adaptive matrix completion for heart rate estimation from face videos under realistic conditions[C]. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 2016: 2396-2404. [16] LI X B, CHEN J, ZHAO G Y, et al. Remote heart rate measurement from face videos under realistic situations[C]. 2014 IEEE Conference on Computer Vision and Pattern Recognition, IEEE, 2014: 4264-4271. [17] POH M Z, MCDUFF D J, PICARD R W. Advancements in noncontact, multiparameter physiological measurements using a webcam[J]. IEEE Transactions on Biomedical Engineering, 2011, 58(1): 7-11. doi: 10.1109/TBME.2010.2086456 [18] DE HAAN G, JEANNE V. Robust pulse rate from chrominance-based rPPG[J]. IEEE Transactions on Biomedical Engineering, 2013, 60(10): 2878-2886. doi: 10.1109/TBME.2013.2266196 [19] WANG W J, DEN BRINKER A C, STUIJK S, et al. Algorithmic principles of remote PPG[J]. IEEE Transactions on Biomedical Engineering, 2017, 64(7): 1479-1491. doi: 10.1109/TBME.2016.2609282 [20] CHEN W X, MCDUFF D. DeepPhys: video-based physiological measurement using convolutional attention networks[C]. Proceedings of the 15th European Conference on Computer Vision - ECCV 2018, Springer, 2018: 356-373. [21] LIU X, FROMM J, PATEL S, et al. Multi-task temporal shift attention networks for on-device contactless vitals measurement[C]. Proceedings of the 34th International Conference on Neural Information Processing Systems, Curran Associates Inc., 2020: 1627. [22] LIU X, HILL B, JIANG Z H, et al. EfficientPhys: enabling simple, fast and accurate camera-based cardiac measurement[C]. 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), IEEE, 2023: 4997-5006. [23] YU Z T, SHEN Y M, SHI J G, et al. PhysFormer: facial video-based physiological measurement with temporal difference transformer[C]. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 2022: 4176-4186. [24] YU Z T, LI X B, ZHAO G Y. Remote photoplethysmograph signal measurement from facial videos using spatio-temporal networks[C]. Proceedings of the 30th British Machine Vision Conference 2019, BWVA Press, 2019: 277. [25] ZOU B CH, GUO Z ZH, CHEN J SH, et al. RhythmFormer: extracting patterned rPPG signals based on periodic sparse attention[J]. Pattern Recognition, 2025, 164: 111511. doi: 10.1016/j.patcog.2025.111511 [26] KUANG H L, LV F B, MA X L, et al. Efficient spatiotemporal attention network for remote heart rate variability analysis[J]. Sensors, 2022, 22(3): 1010. doi: 10.3390/s22031010 [27] LIU X, ZHANG Y T, YU Z T, et al. rPPG-MAE: self-supervised pretraining with masked autoencoders for remote physiological measurements[J]. IEEE Transactions on Multimedia, 2024, 26: 7278-7293. doi: 10.1109/TMM.2024.3363660 [28] SPETH J, VANCE N, FLYNN P, et al. Non-contrastive unsupervised learning of physiological signals from video[C]. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 2023: 14464-14474. [29] SONG R CH, CHEN H, CHENG J, et al. PulseGAN: learning to generate realistic pulse waveforms in remote photoplethysmography[J]. IEEE Journal of Biomedical and Health Informatics, 2021, 25(5): 1373-1384. doi: 10.1109/JBHI.2021.3051176 [30] CHEN SH T, WONG K L, CHIN J W, et al. DiffPhys: enhancing signal-to-noise ratio in remote photoplethysmography signal using a diffusion model approach[J]. Bioengineering, 2024, 11(8): 743. doi: 10.3390/bioengineering11080743 [31] LI ZH P, XIAO H G, XIA Z Y, et al. STFPNet: a simple temporal feature pyramid network for remote heart rate measurement[J]. Measurement, 2025, 252: 117287. doi: 10.1016/j.measurement.2025.117287 [32] HUANG B, HU SH, LIU Z M, et al. Challenges and prospects of visual contactless physiological monitoring in clinical study[J]. npj Digital Medicine, 2023, 6(1): 231. doi: 10.1038/s41746-023-00973-x [33] LUGARESI C, TANG J Q, NASH H, et al. MediaPipe: a framework for building perception pipelines[J]. arXiv:, 1906, 08172: 2019. (查阅网上资料, 未能确认文献类型, 请确认). [34] HO J, JAIN A, ABBEEL P. Denoising diffusion probabilistic models[C]. Proceedings of the 34th International Conference on Neural Information Processing Systems, Curran Associates Inc., 2020: 574. [35] SOHL-DICKSTEIN J, WEISS E A, MAHESWARANATHAN N, et al. Deep unsupervised learning using nonequilibrium thermodynamics[C]. Proceedings of the 32nd International Conference on Machine Learning, JMLR. org, 2015: 2256-2265. [36] STRICKER R, MÜLLER S, GROSS H M. Non-contact video-based pulse rate measurement on a mobile service robot[C]. Proceedings of the 23rd IEEE International Symposium on Robot and Human Interactive Communication, IEEE, 2014: 1056-1062. [37] BOBBIA S, MACWAN R, BENEZETH Y, et al. Unsupervised skin tissue segmentation for remote photoplethysmography[J]. Pattern Recognition Letters, 2019, 124: 82-90. doi: 10.1016/j.patrec.2017.10.017 [38] TANG J K, CHEN K Q, WANG Y T, et al. MMPD: multi-domain mobile video physiology dataset[C]. 2023 45th Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), IEEE, 2023: 1-5. [39] SABOUR R M, BENEZETH Y, DE OLIVEIRA P, et al. UBFC-Phys: a multimodal database for psychophysiological studies of social stress[J]. IEEE Transactions on Affective Computing, 2023, 14(1): 622-636. doi: 10.1109/TAFFC.2021.3056960 [40] VERKRUYSSE W, SVAASAND L O, NELSON J S. Remote plethysmographic imaging using ambient light[J]. Optics Express, 2008, 16(26): 21434-21445. doi: 10.1364/OE.16.021434 [41] PILZ C S, ZAUNSEDER S, KRAJEWSKI J, et al. Local group invariance for heart rate estimation from face videos in the wild[C]. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), IEEE, 2018: 1254-1262. [42] DE HAAN G, VAN LEEST A. Improved motion robustness of remote-PPG by using the blood volume pulse signature[J]. Physiological Measurement, 2014, 35(9): 1913-1926. doi: 10.1088/0967-3334/35/9/1913 -