Towards passive heart health monitoring via smartphone camera

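Source: https://research.google/blog/towards-passive-heart-health-monitoring-via-smartphone-camera/
Summary:
Google introduces passive heart rate monitoring via smartphone camera, with accuracy holding across all skin tones
On June 4, 2026, Google Research announced a research system (PHRM) that uses a smartphone's front-facing camera to passively monitor heart rate and resting heart rate during everyday use. The accompanying paper is published in Nature.
The system takes the facial video captured by the front-facing camera in the few seconds after a face unlock and estimates heart rate with a deep learning model. Its mean absolute percentage error (MAPE) is below 10% against electrocardiogram-derived ground truth, which meets industry accuracy standards, and this holds for every skin tone group. For daily resting heart rate, its mean absolute error relative to a wearable tracker (a Fitbit Charge 6) is under 5 beats per minute.
The research team notes that heart rate is a key vital sign reflecting physiological status, and that resting heart rate is an important biomarker of cardiovascular health and long-term health risk. Wearables can already track these measures, but adoption has room to grow in low-resource settings and among people at high risk of cardiovascular disease. Around five billion people own a smartphone, so using its built-in sensors for health monitoring has a large potential reach.
The study is notable for its scale and diversity. Google developed the system using more than 350,000 video clips from nearly 700 participants, with minimum shares set for light, medium, and dark skin tone groups. In both laboratory and "free-living" real-world tests, PHRM's heart rate MAPE was significantly below 10% in every skin tone group, and it outperformed 15 leading published rPPG models on the same test sets. According to the authors, this is the first large-scale demonstration that passive analysis of facial video during everyday smartphone use can estimate heart rate and resting heart rate with accuracy close to that of wearables.
The researchers stress that the result could extend the benefits of heart health tracking through the devices people most commonly own. To support further work, Google is releasing what it describes as the largest and most diverse smartphone video dataset available for research, along with a pre-trained "PHRM-mini" model. Qualified researchers can apply for access.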
English source:
June 4, 2026
Eric S. Teasley, Product Manager, and Ming-Zher Poh, Staff Research Scientist, Google Research
We present a research system that passively measures heart rate and resting heart rate via facial video captured by the front-facing camera during everyday smartphone use.
Heart rate (HR), one of the cardinal vital signs, is a dynamic indicator of physiological status, influenced by everything from activity, to stress, to acute and chronic illness. Further, resting heart rate (RHR) is a key biomarker of cardiovascular health and long-term health risk. A higher RHR and increases in RHR over time are associated with major adverse cardiovascular events and all-cause mortality.
Wearables, such as Fitbit devices and the Pixel Watch, have made it possible to track these health markers throughout our daily lives. However, there is room to improve their adoption, especially in low-resource environments and among those most at risk for cardiovascular disease. Smartphones present a unique opportunity to broaden access to health tracking — today, around five billion people already own a device with powerful sensors capable of monitoring their health. In 2022, we demonstrated using smartphones for on-demand HR measurement via a finger placed over the camera, and subsequent Google research considered how the signal detected during that measurement could help predict cardiovascular disease.
In “Passive Heart Rate Monitoring During Smartphone Use in Everyday Life”, published in Nature, we introduce a research system (PHRM) that enables tracking of HR and RHR in the background during everyday smartphone use. PHRM leverages the front-facing camera to capture video of the user’s face in the seconds after face unlock events. It then applies deep learning to estimate HR with a mean absolute percentage error (MAPE) < 10% compared to electrocardiogram-derived ground truth, meeting industry accuracy standards for people of all skin tones. Finally, the system integrates HR measurements throughout the day into an estimate of daily RHR that matches the accuracy of wearables, with a mean absolute error (MAE) of < 5 beats per minute (bpm) compared to a wearable tracker. With our publication, we release the largest and most diverse dataset of smartphone videos publicly available for research along with a pre-trained “PHRM-mini” model. Qualified researchers can apply for access.
Like wearables, pulse oximeters, and our previous work, PHRM measures HR via photoplethysmography (PPG), i.e., by sensing the fluctuation in how light interacts with the skin each time blood pulses through it. We developed an on-device software pipeline that processes 8-second facial video clips and uses computationally efficient temporal shift convolutional neural networks to predict HR along with a confidence score. The pipeline further aggregates HR predictions over the day and leverages confidence scores and Kalman filtering to estimate a daily RHR.
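This post does not describe the internals of PHRM's networks. As a rough illustration of the temporal shift idea that such networks build on, the sketch below moves a fraction of the feature channels one frame forward or backward in time, so that an ordinary per-frame convolution applied afterwards sees information from neighbouring frames without extra multiplications. The function name `temporal_shift`, the `fold_div` parameter, and the toy feature values are assumptions made for illustration and are not taken from PHRM.

```python
def temporal_shift(frames, fold_div=4):
    """Shift channel groups along time for a list of per-frame feature vectors.

    frames: list of T lists, each holding C channel values.
    The first C // fold_div channels take their value from the next frame,
    the following C // fold_div channels from the previous frame, and the
    remaining channels are left unchanged. Positions without a neighbour
    are filled with 0.0.
    """
    num_frames = len(frames)
    num_channels = len(frames[0])
    fold = num_channels // fold_div
    shifted = [[0.0] * num_channels for _ in range(num_frames)]
    for t in range(num_frames):
        for c in range(num_channels):
            if c < fold:
                src = t + 1      # pull from the future frame
            elif c < 2 * fold:
                src = t - 1      # pull from the past frame
            else:
                src = t          # channel stays in place
            if 0 <= src < num_frames:
                shifted[t][c] = frames[src][c]
    return shifted


# Toy input (hypothetical): 3 frames, 4 channels; value = 10 * frame + channel.
features = [[10.0 * t + c for c in range(4)] for t in range(3)]
shifted_features = temporal_shift(features, fold_div=4)
```

With four channels and `fold_div=4`, channel 0 is taken from the following frame, channel 1 from the preceding frame, and channels 2 and 3 are untouched.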
While computer vision models for such “remote” PPG (rPPG) have existed for two decades, previous work involved smaller studies under controlled conditions, limiting generalizability. Additionally, previous studies vastly underrepresented people with dark skin, in whom melanin makes the PPG signal more challenging for cameras to detect. Only recently have researchers investigated rPPG model performance on dark-skinned study participants more thoroughly, finding significantly lower accuracy — a trajectory similar to what has occurred for pulse oximeters and other PPG-based technologies. The concerns about pulse oximeters spurred the FDA to draft guidance to ensure diverse skin tone representation in validation studies. Thus far, there is a lack of studies of rPPG that achieve similar standards.
We developed PHRM using over 350,000 video clips from nearly 700 diverse consented research participants in both laboratory and real-world settings, and we devoted more model training to the most challenging cases, as in our earlier work. We leveraged colorimetric methods and the Monk Skin Tone scale to ensure that participants with light (“Group 1,” Monk 1-4) and medium (“Group 2,” Monk 5-7) skin each comprised at least 25% of our datasets and that participants with dark (“Group 3,” Monk 8-10) skin comprised at least 33%. This sampling approach aligned with the skin tone cohorts later proposed by the FDA. Going further, with support from Google’s Health Optimization team, we developed a non-inferiority criterion stipulating that PHRM’s MAPE for HR for each group must differ from that of the others by < 5 percentage points. These efforts make our study of PHRM the largest and most diverse rPPG study to date and enabled us to develop inclusive models that perform accurately across the skin tone spectrum.
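The cohort minimums and the non-inferiority criterion above can be written down as two simple checks. The sketch below is a point-estimate reading only; the published analysis presumably relies on formal statistical tests that are not described in this post. The function names and the toy list of Monk tones are hypothetical, while the group boundaries, the minimum shares, the 5-percentage-point margin, and the free-living MAPE values are the ones reported in this post.

```python
from itertools import combinations

# Minimum dataset share per skin tone group, as stated in the post.
MIN_SHARE = {"group1": 0.25, "group2": 0.25, "group3": 0.33}
NON_INFERIORITY_MARGIN_PP = 5.0  # percentage points


def monk_to_group(monk_tone):
    """Map a Monk Skin Tone value (1-10) to the post's three groups."""
    if 1 <= monk_tone <= 4:
        return "group1"
    if 5 <= monk_tone <= 7:
        return "group2"
    if 8 <= monk_tone <= 10:
        return "group3"
    raise ValueError(f"Monk tone out of range: {monk_tone}")


def composition_ok(monk_tones):
    """Check that every group reaches its minimum share of participants."""
    counts = {group: 0 for group in MIN_SHARE}
    for tone in monk_tones:
        counts[monk_to_group(tone)] += 1
    total = len(monk_tones)
    return all(counts[g] / total >= MIN_SHARE[g] for g in MIN_SHARE)


def non_inferior(mape_by_group, margin_pp=NON_INFERIORITY_MARGIN_PP):
    """Check that every pair of group MAPEs differs by less than the margin."""
    return all(
        abs(a - b) < margin_pp
        for a, b in combinations(mape_by_group.values(), 2)
    )


# Free-living MAPEs (%) reported in the post for Groups 1, 2, and 3.
free_living_mape = {"group1": 5.04, "group2": 5.12, "group3": 7.84}
meets_margin = non_inferior(free_living_mape)

# Hypothetical ten-participant cohort, used only to exercise the check.
toy_cohort_ok = composition_ok([1, 2, 3, 5, 6, 7, 8, 9, 10, 10])
```

The largest gap among the reported free-living MAPEs is 7.84 − 5.04 = 2.80 percentage points, which is inside the 5-point margin.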
We trained PHRM to handle varied conditions in laboratory studies, recording face video and simultaneous electrocardiogram (ECG) data from 365 diverse study participants across different lighting conditions and activity states. On a separate 104-participant test set, after gating with a minimum confidence score, PHRM achieved MAPEs significantly < 10% across skin tone groups despite the range of conditions we tested. PHRM significantly outperformed 15 of the leading published rPPG models on the same test set, both before and after gating, and was the only model to achieve MAPE < 10% across all skin tones.
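To make "gating with a minimum confidence score" concrete, here is a minimal sketch that drops low-confidence clips before computing MAPE against the ECG reference. The clip values, the dictionary field names, and the 0.5 threshold are all made up for illustration; the post does not state the threshold PHRM uses.

```python
def mape(predicted, reference):
    """Mean absolute percentage error, in percent."""
    total = sum(abs(p - r) / r for p, r in zip(predicted, reference))
    return 100.0 * total / len(reference)


def gate(measurements, min_confidence):
    """Keep only measurements whose confidence reaches the threshold."""
    return [m for m in measurements if m["confidence"] >= min_confidence]


# Hypothetical clips: predicted HR, ECG-derived HR (bpm), model confidence.
clips = [
    {"hr_pred": 62.0, "hr_ecg": 60.0, "confidence": 0.95},
    {"hr_pred": 81.0, "hr_ecg": 80.0, "confidence": 0.90},
    {"hr_pred": 99.0, "hr_ecg": 75.0, "confidence": 0.30},
    {"hr_pred": 70.0, "hr_ecg": 72.0, "confidence": 0.85},
]

kept = gate(clips, min_confidence=0.5)  # assumed threshold

mape_all = mape([c["hr_pred"] for c in clips], [c["hr_ecg"] for c in clips])
mape_gated = mape([c["hr_pred"] for c in kept], [c["hr_ecg"] for c in kept])
coverage = len(kept) / len(clips)
```

Gating trades coverage for accuracy: in this toy data the one low-confidence clip carries most of the error, so removing it lowers MAPE while leaving three of the four clips.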
To train PHRM on real-world data, we conducted a first-of-its-kind “free-living” study. A total of 231 diverse study participants installed a custom data collection app on their personal phones and used the phones as normal for eight days while wearing an ECG chest strap and a Fitbit Charge 6 fitness tracker. Our app recorded 8-second video clips and ECG data immediately after each face unlock, capturing an average of 231 clips per day. At the end of each day, participants manually and affirmatively authorized uploads to our secure, encrypted servers after reviewing their clips to confirm exclusion of sensitive content and other faces.
On a held-out 101-participant validation subset, PHRM achieved an overall MAPE of 6.09% after confidence gating, with MAPEs of 5.04%, 5.12%, and 7.84% for Groups 1, 2, and 3, respectively. Each MAPE was significantly < 10% and met our pre-specified non-inferiority target. PHRM outperformed the same 15 leading rPPG models by an even wider margin under free-living conditions and remained the only model to achieve MAPE < 10% across all skin tones. Bland-Altman analysis showed that PHRM underestimated HR by only 0.64 bpm on average, with 95% limits of agreement between -11.3 and 10.3 bpm; measures with higher confidence scores had lower errors.
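For readers unfamiliar with Bland-Altman analysis, the sketch below computes its two summary quantities: the bias (mean of estimate minus reference) and the 95% limits of agreement (bias ± 1.96 standard deviations of the differences). It assumes that a negative bias means underestimation and treats every measurement as independent, which may differ from the exact procedure used in the paper. The heart rate values are invented.

```python
from statistics import mean, stdev


def bland_altman(estimates, references):
    """Return (bias, lower_limit, upper_limit) for paired measurements."""
    diffs = [e - r for e, r in zip(estimates, references)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # sample standard deviation of differences
    return bias, bias - spread, bias + spread


# Hypothetical paired readings in bpm: camera estimate vs. ECG reference.
camera_hr = [60.0, 72.0, 78.0, 91.0]
ecg_hr = [61.0, 70.0, 80.0, 92.0]

bias, lower, upper = bland_altman(camera_hr, ecg_hr)
```

Here the differences are -1, 2, -2, and -1 bpm, so the bias is -0.5 bpm (a slight underestimate) and the limits of agreement are roughly -3.9 to 2.9 bpm.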
We then applied PHRM’s RHR algorithm for participants who had at least 20 HR measurements on one or more days. For those 90 participants, PHRM successfully estimated RHR on 73.6% of the participant-days. PHRM RHR demonstrated an overall MAE of 4.39 bpm versus daily RHR from the Fitbit Charge 6, significantly less than our pre-specified 5-bpm target. Bland-Altman analysis showed that PHRM underestimated RHR by an average of 0.1 bpm, with 95% limits of agreement between -9.1 and 9.2 bpm; error decreased with increasing days of RHR measurements. The MAEs by skin tone group were significantly < 5 bpm for all but Group 3. However, MAE for all groups decreased over time as our RHR algorithm’s Kalman filter converged — Group 3’s MAE was significantly < 5 bpm from day three onwards.
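The post gives no details of the RHR algorithm beyond its use of confidence scores and Kalman filtering, so the following is only a sketch of how a scalar Kalman filter can smooth noisy daily values and why error tends to shrink as more days accumulate. Everything specific here is an assumption: reducing a day to the mean of its lowest 10% of HR readings, the random-walk state model, the variance values, and the omission of confidence weighting. Only the 20-measurement minimum comes from the post.

```python
def daily_observation(hr_values, min_count=20):
    """Reduce one day's HR readings to a crude resting value.

    Returns None when the day has fewer than min_count readings.
    The "mean of the lowest 10%" rule is an illustrative assumption.
    """
    if len(hr_values) < min_count:
        return None
    ordered = sorted(hr_values)
    lowest = ordered[: max(1, len(ordered) // 10)]
    return sum(lowest) / len(lowest)


def kalman_rhr(observations, process_var=1.0, obs_var=16.0):
    """Track RHR across days with a scalar random-walk Kalman filter.

    observations: one value per day, or None for days without an estimate.
    Returns the filtered estimate after each day (None before the first
    valid observation).
    """
    estimate, variance = None, None
    track = []
    for obs in observations:
        if estimate is None:
            if obs is not None:
                estimate, variance = obs, obs_var  # initialise from first day
        else:
            variance += process_var                # predict: uncertainty grows
            if obs is not None:
                gain = variance / (variance + obs_var)
                estimate += gain * (obs - estimate)  # update toward the new day
                variance *= 1.0 - gain
        track.append(estimate)
    return track


# Hypothetical four-day series; day 2 had too few measurements.
rhr_track = kalman_rhr([64.0, None, 60.0, 61.0])
example_day = daily_observation([float(h) for h in range(60, 80)])
```

Each update shrinks the filter's variance, so later days move the estimate less than early ones. That behaviour is consistent with the post's observation that MAE fell as the filter converged, although the actual filter design may differ.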
In confirming the validity of our approach, we further found that participants with higher PHRM-derived RHRs were more likely to have high body mass index (BMI) and poor cardiovascular fitness (low VO2max) after controlling for covariates, indicating that PHRM correctly captured the directionality of cardiovascular risk.
To our knowledge, PHRM marks the first large-scale demonstration of passive HR and daily RHR monitoring during everyday smartphone use. As the only rPPG method to meet HR accuracy standards for people of all skin tones — even in unpredictable real-world conditions — it sets a new standard for the field. It also represents the first use of rPPG to estimate daily RHR, achieving wearable-level accuracy across all skin tones. By combining an understanding of user habits with cutting-edge deep learning techniques and an inclusive design, we’ve developed a smartphone-based HR monitoring system that enables wearable-like heart health insights. As such, PHRM presents the opportunity to democratize the benefits of heart health tracking through our most ubiquitous devices. More broadly, it demonstrates how the devices we consult so frequently can in turn reflect insights into our health.
While PHRM met accuracy standards across skin tones, its HR measurement success rate was lower for Group 2 and lowest for Group 3, likely due to the difficulty of detecting the PPG signal in darker skin. Future efforts could explore optimizing camera exposure or triggering additional sampling attempts to improve measurement success rates. We additionally observed some outlier errors driven by participant talking and head motion. Improved video stabilization could mitigate these errors, and accelerometer-based gating could help to prioritize opportune at-rest moments. Finally, future systems could ensure data integrity and privacy by requiring face authentication and employing secure, on-device processing.
To catalyze further research, we are making our landmark data and modeling resources available to qualified researchers who possess Institutional Review Board (IRB) approval and meet our data protection requirements. To protect research participant privacy, all videos were collected under IRB approval and were processed according to explicit participant consent. This dataset is restricted entirely to non-commercial research use, and accessing researchers are strictly prohibited from attempting to re-identify any individuals or publicly displaying raw video assets. We invite the research community to leverage our resources to build on our work.
This work represents the culmination of more than 7 years of effort. We thank our paper co-authors Shun Liao, Paolo Di Achille, Jiang Wu, Silviu Borac, Jonathan Wang, Xin Liu, Lawrence Cai, Yuzhe Yang, Yun Liu, Daniel McDuff, Hao-Wei Su, Brent Winslow, Anupam Pathak, Mark Malhotra, Shwetak Patel, James A. Taylor, and Jameson K. Rogers. We thank key contributors including: Nikola Teslovich, Alex Mun, Jonathan Hsu, Xiaoxia Zheng, Derrick Vickers, Sam Mravca, Tracy Giest, Jason Guss, Florence Thng, Jiening Zhan, Julie Cannon, Mehr Kashyap, Jaspreet Pannu, Tiffany Kung, Ming Jack Po, Matthew Shore, Justin Tansuwan, Liwen Chen, Cristo Alanis Barrera, Anand Saxena, Jeremy Miles, Melissa Moran, Michael V. McConnell, Ivor Horn, Benny Ayalew, Jonelle Saunders, Jonathan Tsai, Heather Cole-Lewis, Ebony Respress, Perry Payne, Kamillah Wood, Nnamdi Ezeanochie, Magdala Chery, and Rich Gossweiler. We are grateful for leadership support from Lizzie Dorfman, Katherine Chou, Michael Howell, and Greg Corrado. Special thanks go to Jiemin Yang, Josh Grondie, Kenya Moore, and Katie Barton for animating our free-living study.