Private Analytics via Zero-Trust Aggregation

Source: https://research.google/blog/private-analytics-via-zero-trust-aggregation/

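Summary:

Google releases a new private analytics solution that combines cryptography with trusted hardware for zero-trust privacy protection

On May 27, 2026, Google Research announced a new private analytics solution. It combines a new cryptographic secure aggregation protocol with the transparency properties of trusted execution environments (TEEs), with the aim of providing state-of-the-art privacy and security guarantees for on-device data at scale.

As on-device AI (such as Android's SafetyCore) becomes more widespread, developers need to understand how their models actually perform across millions of different devices worldwide, without touching any individual user's raw data. Traditional cryptographic secure aggregation offers strong mathematical guarantees, but it has long been held back by complex multi-round interaction that requires devices to stay online for extended periods.

Google's new solution introduces a lattice-based, single-round protocol. A user device submits its encrypted information in one message, with no multi-round exchange with the server, which greatly lowers the barrier to deployment. The method guarantees mathematically that the server can decrypt only the aggregated, anonymized statistics and cannot reconstruct any individual's data.

More importantly, the solution follows a zero-trust principle. Within Google's confidential federated analytics system, it combines the efficient cryptographic protocol with TEE hardware isolation to build a multi-layered defense. Even if the TEE's security model is broken, the cryptographic layer still ensures that raw data is never exposed in any server memory, including hardware-protected regions. Plaintext computation happens only at the final stage, after the data has been aggregated and anonymized.

The solution also uses TEE remote attestation to give all participants high-assurance, verifiable evidence that the aggregation protocol is running correctly from publicly available code, which guards against tampered software.

As the first deployment, the Android system safety service SafetyCore will use this technology to evaluate key metadata, such as the "true positive" rate of its safety models, without collecting sensitive user content. Google says the technology lets its engineers refine models by observing anonymized trends across the global device fleet, keeping on-device safety protections effective while honoring the privacy commitment that user data stays on the device. The Google team says the integrated solution raises the security bar for private analytics, and that it will explore extending the approach to more types of computation.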

English source:

May 27, 2026
Adrià Gascón, Staff Research Scientist, Google Research, and Mariana Raykova, Senior Staff Research Scientist, Google
We introduce a private analytics solution which leverages a new cryptographic protocol for secure aggregation combined with the transparency properties of TEEs to achieve state-of-the-art privacy and security guarantees.
By processing data locally, on-device AI can provide enhanced protection and timely alerts while keeping user information private. For example, Android uses a system called SafetyCore to provide privacy-preserving on-device features and common infrastructure to protect users from unwanted content. When developing on-device technologies, teams need to understand how well their systems work across millions of individual smartphones, each with unique data distributions, varying hardware constraints, and different user behaviors. To achieve this in a way that reveals only collective trends without revealing individual user data, teams can leverage cryptographic secure aggregation as a key building block. Like all cryptographic protocols, secure aggregation uses advanced mathematical tools to provide its security assurance.
Today, we set a higher bar for efficient cryptographic aggregation in a private analytics service. We follow a zero-trust principle, which aims to reduce trust necessary in any single entity. We achieve this through a new security design that combines cryptographic and hardware protection mechanisms. Our solution leverages a new cryptographic aggregation method that provably guarantees only anonymized, aggregated insights about a population can be obtained by Google. Additionally, trusted execution environments (TEEs) are used to provide a strict layer of attestation and transparency.
When models are deployed locally on-device, simply knowing that a model is 'running' isn't enough to understand its behavior, effectiveness, or failure modes. This limits the ability to answer critical questions about how the model performs in practice.
This is where private analytics becomes the essential bridge, enabling anonymized, aggregated insights about a population without ever revealing individual user content.
Google teams use federated analytics for this kind of aggregated, private insight, with applications in Pixel Recorder, Gboard, and more. Federated analytics requires a private aggregation route, where the data from individual devices is protected until combined into a sum. Two paradigms have emerged to protect user data in this setting: hardware-based isolation (TEEs) and cryptographic protocols.
The hardware approach centers on TEEs, such as Intel TDX, AMD SEV-SNP and others. The core idea is to create a "secure enclave" — essentially a protected slice of the processor and memory that is isolated from the rest of the device. Inside this enclave, data can be decrypted and processed in plaintext, shielded even from a compromised operating system or a malicious hypervisor.
Through a process called attestation, TEEs can compute a hardware-backed cryptographic "fingerprint" of the exact firmware and software state running inside the enclave. For a user or an auditor, attestation offers a verifiable guarantee that the data is being handled by the specific, tamper-proof program they expect, rather than a modified version designed to leak information. Google has deployed TEE-backed differentially private aggregation for computing insights into AI systems in the Pixel Recorder app.
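
The verification step of attestation can be sketched in a few lines. This is a minimal illustration only: it assumes the "fingerprint" is a plain SHA-256 digest of a program image, and the names `measure`, `verify_measurement`, and `published_binary` are hypothetical. Real TEE attestation also involves a quote signed by a hardware-rooted key and covers firmware state, both of which are omitted here.

```python
import hashlib
import hmac


def measure(binary: bytes) -> str:
    """Return a SHA-256 hex digest standing in for an enclave measurement."""
    return hashlib.sha256(binary).hexdigest()


def verify_measurement(reported: str, expected: str) -> bool:
    """Compare a reported measurement with the published one in constant time."""
    return hmac.compare_digest(reported, expected)


# Hypothetical published build of the aggregation program (an assumption).
published_binary = b"aggregate-v1: sum encrypted reports, release noisy total"
expected = measure(published_binary)

# An enclave running the published build reports a matching measurement.
honest_report = measure(published_binary)

# A modified build, e.g. one that also logs raw inputs, reports a different one.
tampered_report = measure(published_binary + b"; also log raw inputs")

print(verify_measurement(honest_report, expected))
print(verify_measurement(tampered_report, expected))
```

An auditor who can rebuild the public code and recompute `expected` can then reject any enclave whose reported measurement differs.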
However, TEE isolation mechanisms are constantly evolving. Researchers regularly discover side-channel vulnerabilities that an attacker can leverage to invalidate either the TEE's own guarantees or application-specific guarantees [SNPeek, TDXray]. While the community is working towards hardening existing solutions against known side-channel attacks, new side-channel vulnerabilities are expected to be discovered. Therefore, in an ideal system, data would be protected by multiple layers of security so that even if a TEE’s security model fails, the data is not compromised.
On the other hand, cryptographic protocols rely on mathematical techniques which come with provable guarantees that individual data cannot be reconstructed and that the only value that becomes visible is the aggregated, anonymized output. Google has deployed two generations of secure aggregation protocols at scale (detailed in the initial blogpost and follow-up). However, their widespread use has been limited by the complexity of requiring user devices to remain online over extended periods of time for multi-round protocols.
Our new solution introduces a novel cryptographic protocol that allows user devices to securely submit their information in a single, one-shot message, overcoming the barriers of traditional interactive schemes. By enabling a single-message submission, we eliminate the need for devices to remain online for multiple rounds of interaction with a server.
We integrated this higher-efficiency protocol into Google’s confidential federated analytics system and combined it with execution within a TEE to create a multi-layered defense architecture. With this solution, confidentiality no longer relies entirely on hardware protection. The cryptographic layer ensures that individual raw data is never exposed or reconstructed in any server memory, not even within the hardware-protected perimeters. The only time unencrypted data is ever processed off-device is at the final stage, when the data has already been aggregated and anonymized. Furthermore, our solution leverages TEE attestation mechanisms to provide high-assurance, verifiable proof to all participants that the secure aggregation protocol is being executed exactly as intended, i.e., by correctly compiling and running publicly available code.
At its heart, our cryptographic solution is powered by an innovative lattice-based protocol. Clients encrypt their data so that aggregating the resulting ciphertexts also aggregates both the underlying messages and the encryption keys. The only thing the server then needs to obtain the aggregated values is a decryption key that can decrypt only the aggregate. To provide it, we form small committees among the clients; these committees hold hints that help unlock the aggregated value, which is masked with additional differential privacy noise. Clients serve on committees infrequently, according to their availability. This ensures that any decryption key is shared across a number of parties, each of which protects the confidentiality of the encrypted data.
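
The key-homomorphic idea can be illustrated with a toy additive masking scheme. This is not the lattice-based protocol from the post: it swaps in a one-time additive mask modulo `Q` and additive secret sharing, and it leaves out the differential privacy noise. The modulus, the function names, and the way key shares reach the committee are all assumptions made for illustration.

```python
import secrets

Q = 2**61 - 1  # modulus for all arithmetic (illustrative choice)


def encrypt(message: int, key: int) -> int:
    """Mask a message with a one-time key; adding ciphertexts adds messages and keys."""
    return (message + key) % Q


def share_key(key: int, n: int) -> list[int]:
    """Split a key into n additive shares that sum to the key modulo Q."""
    shares = [secrets.randbelow(Q) for _ in range(n - 1)]
    shares.append((key - sum(shares)) % Q)
    return shares


def aggregate(values) -> int:
    """Sum values modulo Q."""
    return sum(values) % Q


messages = [3, 5, 0, 7]  # one private value per client
committee_size = 3
ciphertexts = []
committee_inbox = [[] for _ in range(committee_size)]

for m in messages:
    k = secrets.randbelow(Q)  # fresh per-client key
    ciphertexts.append(encrypt(m, k))
    # Each committee member receives one share of this client's key.
    for member, share in enumerate(share_key(k, committee_size)):
        committee_inbox[member].append(share)

# Each member reveals only the sum of the shares it holds (its "hint").
hints = [aggregate(inbox) for inbox in committee_inbox]

# The hints combine into a key for the aggregate ciphertext only.
aggregate_key = aggregate(hints)
total = (aggregate(ciphertexts) - aggregate_key) % Q
print(total)
```

No single committee member learns any client's key, and the server only ever obtains the key for the sum, so it recovers the total of the messages without seeing any individual value.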
Android System SafetyCore is a Google system service for Android 9+ devices that provides privacy-preserving on-device support for Android safety features. In the realm of on-device safety, tools like SafetyCore play a critical role. However, for these tools to evolve, developers need to understand their real-world performance — specifically, which threats are being caught and where there are opportunities to further refine detection capabilities, all without ever compromising user privacy.
To bridge this gap, in partnership with the Android SafetyCore team, we’re using our state-of-the-art private analytics solution to improve the accuracy of classifiers and at the same time preserve privacy.
Relying on aggregate privacy-preserving, anonymized insights is essential here; it allows engineers to measure the "true positive" rate of safety models across a diverse global fleet without ever seeing the private, sensitive content that triggered a local alert. By observing these high-level trends, developers can refine model thresholds and deploy updates that better protect the user, ensuring the safety system remains effective against emerging threats while keeping the raw data private and strictly isolated on the device. Android SafetyCore will leverage our zero-trust private analytics to evaluate metadata indicative of the effectiveness of its tools while respecting its privacy commitment that user content stays only on device. We are excited to introduce a technology that aids Android’s broader mission to protect user safety while preserving their privacy.
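
As a small worked example of the kind of metric involved, a true positive rate can be computed from summed counters alone. The `Report` fields and the numbers below are hypothetical; the post does not describe what SafetyCore devices actually report or how a device would count missed detections. In a real deployment only the two totals would be revealed by the aggregation, and the per-device values would stay hidden.

```python
from dataclasses import dataclass


@dataclass
class Report:
    """Hypothetical per-device counters; only their sums would be revealed."""
    true_positives: int
    false_negatives: int


def true_positive_rate(tp_total: int, fn_total: int) -> float:
    """Compute TP / (TP + FN) from fleet-wide totals."""
    positives = tp_total + fn_total
    if positives == 0:
        raise ValueError("no positive cases were reported")
    return tp_total / positives


reports = [Report(4, 1), Report(0, 0), Report(9, 1), Report(2, 3)]

# These two sums stand in for the output of the secure aggregation.
tp_total = sum(r.true_positives for r in reports)
fn_total = sum(r.false_negatives for r in reports)

rate = true_positive_rate(tp_total, fn_total)
print(rate)
```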
Cryptographic techniques for secure computation bring strong security guarantees anchored in mathematical proofs. We showed how to design secure aggregation protocols in ways compatible with deployment in large-scale distributed systems. The resulting solution, integrated with existing security mechanisms, raises the security bar for private analytics. Going forward, we are exploring opportunities to expand the set of computations supported in this model.
The contents of this blogpost reflect the contributions of many people, including Bruno Alves, Carlos Balduz, Nacho Ballester Tester, James Bell-Clark, Oleg Chernyakhovskiy, Stanislav Chiknavaryan, Jim Choncholas, Stefan Dierauf, Emily Glanz, Shruthi Gorantala, Mira Holford, Mihaela Ion, Artem Lagzdin, Jean-Christophe Lilot, Peter Kairouz, Jonathan Katz, Baiyu Li, Ben Kreuter, Brett McLarnon, Mekhola Mukherjee, Amanda Nascimento, Timon Van Overveldt, Javed Ramjohn, Phillipp Schoppmann, Karn Seth, Debora Silva, Rakshita Tandon, and Pierre Tholoniat. We would like to thank Elie Bursztein, Bryant Gipson, Marco Gruteser, Alex Freire, Xavier Llorà, Dan Ramage, David Sehr, and Amanda Walker for their leadership and Corinna Cortes, Brian Roddy, Pankaj Rohatgi, and Eduardo Tejada for their continued support.
