A New Era of Innovation: Google Research at I/O 2026

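Source: https://research.google/blog/a-new-era-of-innovation-google-research-at-io-2026/

Summary:

Google I/O 2026 highlights: AI-driven scientific discovery and healthcare breakthroughs usher in a new agentic era

On May 28, 2026, Yossi Matias, Vice President at Google and GM of Google Research, published a recap of Google Research's highlights from Google I/O 2026, held the previous week. At the conference, Google showcased its most advanced technologies for users, developers and researchers. Many of the results build on years of research and mark AI's entry into a bolder, more agentic era.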

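Scientific discovery: AI accelerates research
Google introduced Gemini for Science, a suite of experimental tools intended to support the global scientific community. Two of the systems it builds on, Empirical Research Assistance (ERA) and Co-Scientist, were just published in Nature. ERA helps scientists write expert-level empirical software and has accelerated discoveries in fields from neuroscience to cosmology; recent results include predicting hospital admissions for respiratory illnesses and forecasting seasonal runoff across California's river basins. Co-Scientist, a multi-agent system built on Gemini, is being used on major scientific problems such as antimicrobial resistance, plant immunity and liver fibrosis. The new Computational Discovery tool generates and scores thousands of code variations in parallel, so exploration that would take months by hand can be done far faster. The Hypothesis Generation tool runs a multi-agent "idea tournament" to generate, debate and evaluate scientific hypotheses, with claims backed by clickable citations.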

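Healthcare: bringing AI into real care settings
Building on years of research, Google launched the new Google Health app and the Google Health Coach, and began rolling the app out to existing Fitbit users. In a blind comparison within a randomized, consented research study, independent clinicians preferred Symptom AI's differential diagnoses about twice as often as those from other clinicians. The latest results for the AMIE multi-agent system, published in Nature Medicine, show its capabilities on multimodal medical data, and Google is working with Beth Israel Deaconess Medical Center to test how it can reduce the burden of real-time history-taking before a patient's visit. The open model MedGemma has passed 5 million downloads and supports healthcare developers worldwide.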

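Disaster response and weather forecasting
During the 2025 hurricane season, Google's WeatherNext model predicted Hurricane Melissa's rapid intensification and landfall in Jamaica five days in advance, helping the Met Service in Jamaica warn the public early. For urban flash flood prediction, the new Groundsource methodology uses Gemini to turn 20 years of unstructured public news reports into a dataset of 2.6 million high-quality records, which was used to train advanced forecasting models. The forecasts are available on Flood Hub alongside riverine flood forecasts, which cover 2 billion people in 150 countries.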

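Generative AI and multilingual expansion
Research on factuality, multilinguality and efficiency continues to improve the Gemini models. Gemini now supports more than 70 languages across more than 230 countries, making it the most widely available AI assistant in the world. The new Ask Maps and Ask YouTube features let users ask more complex questions. Generative UI lets Search and the Gemini app present immersive experiences such as dynamic graphs, simulations and embedded videos.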

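Agent platform and developer tools
Google Antigravity 2.0, the agentic development platform, was launched with support for managing multiple agents in parallel; one demo showed a multi-agent system building a functional operating system from scratch. The open model Gemma V4 passed 100 million downloads within a month of release and is purpose-built for reasoning, coding and agentic workflows.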

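Quantum computing and the life sciences
Google is pursuing two approaches in parallel, superconducting qubits and neutral atom qubits. On the Willow chip, the Quantum Echoes algorithm demonstrated verifiable quantum advantage, running 13,000 times faster than the best classical algorithm on one of the world's fastest supercomputers. Google also launched the REPLIQA program, committing $10 million to five universities to apply quantum science and AI to the life sciences and improve human health.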

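User trust and data protection
As AI becomes more capable, Google continues to develop privacy-preserving technology, including differential privacy, on-device AI and insights from aggregate, anonymized data, to keep user data safe. Advanced risk management techniques are also being brought to the Gemini models to make the AI ecosystem more resilient.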

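Matias concluded: "The breakthroughs shared at I/O reflect a bold new agentic era of innovation." He added that AI is accelerating the "magic cycle" from research to reality, making the impossible possible.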

English source:

May 28, 2026
Yossi Matias, Vice President, Google & GM, Google Research
At Google I/O 2026 last week, Google teams showcased our most advanced technologies for users, developers and researchers. Here are some highlights from Google Research this year, often tapping into years-long efforts to realize the magic cycle of research.
This year’s breakthroughs at Google I/O reflect a new bold agentic era. With models that are more powerful than ever and an agentic coding platform, we’re making Google products substantially more helpful for everyone while transforming how researchers tackle the most pressing scientific and societal challenges. As research translates into tangible, real-world impact, we’re turning AI and technology into an amplifier of human ingenuity.
Here are a few key highlights from Google Research, done in close collaboration with many teams across Google and global partners.
AI is enabling a new era of scientific discovery. Google is building advanced AI-based tools designed to accelerate progress for the global scientific community. Our foundational technology is empowering researchers worldwide to drive breakthroughs across domains using the scientific method, from hypothesis generation to computational experimentation. At I/O, we announced Gemini for Science, which is built on our foundational research, including Empirical Research Assistance (ERA) and Co-Scientist, both published in Nature last week.
Empirical Research Assistance (ERA) is a research coding system developed to help scientists write expert-level empirical software. Last week’s ERA publication in Nature followed months of collaboration with academic partners to explore the system’s real-world applications. ERA has helped accelerate discoveries from neuroscience to cosmology. Our latest results include predicting hospital admissions for respiratory illnesses and forecasting seasonal runoff across California's river basins. These are available in our new GitHub directory. They signal the power of AI to unlock deeper insights with compute and accelerate discovery.
Co-Scientist is a multi-agent system based on Gemini which works as a collaborative AI partner. Our foundational research on Co-Scientist was published last week in Nature along with a blog highlighting testimonials from researchers. Our previous research and validation papers demonstrate how researchers are using Co-Scientist to tackle some of the most pressing scientific challenges, from antimicrobial resistance to plant immunity and liver fibrosis.
Gemini for Science is a suite of experimental tools designed to expand the scale and precision of scientific exploration — developed in close collaboration with teams from Google Cloud, Google DeepMind and Google Labs. One of the new tools in Gemini for Science, Computational Discovery, is an agentic research engine built with ERA and AlphaEvolve. The Computational Discovery prototype generates and scores thousands of code variations in parallel, enabling scientists to rapidly test multiple hypotheses and novel modeling approaches that would take months to explore manually.
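The generate-and-score pattern described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the "variants" are toy linear models rather than generated code, the scoring function is mean squared error, and the names `make_variant`, `score` and `rank_variants` are hypothetical. It is not how Computational Discovery, ERA or AlphaEvolve are implemented.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def make_variant(slope, intercept):
    """Build one candidate model; a stand-in for one generated code variation."""
    def model(x):
        return slope * x + intercept
    model.label = f"y = {slope}*x + {intercept}"
    return model


def score(model, data):
    """Mean squared error of a candidate on the observed data (lower is better)."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)


def rank_variants(variants, data, workers=8):
    """Score all candidates in parallel and return (score, label) pairs, best first."""
    variants = list(variants)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(lambda m: score(m, data), variants))
    return sorted(zip(scores, (m.label for m in variants)))


# Toy observations generated by y = 2x + 1.
observations = [(0, 1), (1, 3), (2, 5), (3, 7)]
candidates = [make_variant(s, i) for s, i in product(range(-3, 4), repeat=2)]
ranking = rank_variants(candidates, observations)
print(ranking[0])
```

Because each candidate is scored independently, the work parallelizes trivially; a real system would presumably replace the toy scoring function with an actual experiment or benchmark run.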
With millions of papers published annually, synthesizing all the scientific literature has become a monumental challenge. Another new tool, Hypothesis Generation, was built using Co-Scientist. It aims to bridge this gap by collaborating with scientists to define a research challenge and running a multi-agent “idea tournament” to generate, debate and evaluate hypotheses. To ensure scientific rigor, claims are supported by clickable citations.
Gemini for Science also features Literature Insights, built with NotebookLM, which helps synthesize findings across scientific literature and structure the results. Plus, anyone engaged in agentic coding on platforms like Google Antigravity could benefit from the Science Skills, a collection of agent skills that automatically allow researchers to perform complex workflows like structural bioinformatics and genomic analyses in minutes rather than hours.
We are gradually opening access to these tools and partnering with the global scientific community to responsibly advance science. To register your interest, visit labs.google/science.
As part of our broader efforts to work with the ecosystem and foster access to our latest experiments, we’re also piloting tools for agentic peer review and scientific validation. Leading scientific conferences like ICML, STOC and NeurIPS are exploring our Paper Assistant Tool (PAT). Across these venues, PAT reviewed over 10,000 papers in an experimental capacity — helping many authors identify critical theoretical gaps or run entirely new experiments based on the AI tool's feedback.
We’re also accelerating mathematical and scientific discovery with Gemini Deep Think and its advanced agentic reasoning. In collaboration with mathematicians, physicists, and computer scientists, we recently solved expert-level open research problems: resolving previously unsolved deadlocks in network puzzles, settling a decade-old optimization conjecture, explaining machine learning optimization anomalies, upgrading economic theory for auctions, and resolving physics singularities in cosmic strings.
In the hands of scientists and researchers, these new types of AI-based technologies could change how research is done and catalyze a new era of discoveries.
AI can be instrumental in helping people live longer, healthier lives. For years, we’ve been advancing AI research to address healthcare challenges, working closely with healthcare providers, scientists, public officials and academics to bring our clinical research to real-world care settings and ensure that our innovations are safe and helpful.
One area we’ve been researching is how AI can best support people throughout their health and wellness journeys, from learning about symptoms and preparing for a doctor’s visit to making sense of their medical records. That journey starts before people ever see a doctor and extends long after. Our foundational research is enabling the new Google Health app and the Google Health Coach. Last week, we began the rollout of the Google Health app to all existing Fitbit users, giving eligible users personalized, holistic, adaptive coaching.
This builds on our multi-year research effort, including research on how a personal health LLM could help with sleep and fitness. Our latest research includes Symptom AI, an investigational tool designed to study how AI can help reason about conversational data salient to a user’s symptoms. In a randomized consented research study via the Fitbit app, 13,917 participants interacted with experimental AI agents, capturing real world diverse communication styles and a realistic distribution of illnesses. In a blind comparison on a cohort of study participants, independent clinicians reviewed the same conversations and preferred Symptom AI’s differential diagnoses about twice as often as those from other clinicians. In our Plan for Care pilot research study, we examined how 1,779 participants used our system to prepare for their doctor's visit. When compared with baseline models, 15% more users felt better prepared and 13% more users felt confident that they could make the most of their visit. In our Personal Health Record (PHR) research, we evaluated the impact of PHR data in model context on answer quality and found that both auto-raters and clinicians judged the AI responses to be significantly more helpful.
Another significant research effort is around the potential of AI in clinical settings. In two previous Nature publications, we showed how AMIE — a research multi-agent system developed by Google Research and Google DeepMind — can interpret and reason about complex cases and medical conversational data. In new research published last week in Nature Medicine, we showcase its capabilities across multimodal data, including medical histories, lab results, and complex medical images. To evaluate the utility of the system in realistic settings, we’re collaborating with Beth Israel Deaconess Medical Center to test how the system can help reduce the burden of real-time history-taking before a patient’s visit. We’ve also partnered with Included Health to launch a first-of-its-kind, national-scale study to evaluate AI-driven telehealth care.
Advancing healthcare is a global effort. We’re empowering the global healthcare developer ecosystem with MedGemma, part of our Health AI Developer Foundations suite of open-weight foundation models for developers to build upon. MedGemma is specialized for multimodal medical text, clinical reasoning and imaging comprehension. Alongside it, MedASR provides specialized medical audio capabilities. These models are powering applications with a wide range of use cases, helping to democratize access to quality care. MedGemma now has more than 5M downloads to date.
We’re developing platforms and tools to help the hardware manufacturer ecosystem develop efficient edge applications. Coral NPU is an ML accelerator core that we developed for energy-efficient AI for edge applications like wearables and sensors. Based on open hardware, in partnership with deep silicon providers, this validated, open-source IP is available for commercial silicon integration, helping to create a standard architecture that accelerates the edge AI ecosystem.
At I/O last week, we launched the first Coralboard from Synaptics, designed for AI and ML engineers as well as equipment manufacturers to rapidly prototype and build devices. The board features the Gemma 3 270M open model and offers a rich set of hardware interfaces, including camera and display support, microphone inputs, and optional Wi-Fi / Bluetooth connectivity. Synaptics has invested in taking these solutions to market, bringing together the balance of power and performance from Coral with their Devboard intelligence.
The power of this industry-first implementation was illustrated in a unique pre-show experience: Coralboard was deployed to the Monterey Bay Aquarium for live on-device image detection of jellyfish and the movements were used to orchestrate the big screen experience. The Synaptics Coralboard will be generally available later this summer.
Natural disasters like tropical cyclones and floods can devastate communities and endanger lives. As part of our long-standing crisis resilience efforts, we’re generating accurate, AI-powered forecasts to help communities and organizations around the world stay safe and better prepare for crises.
Last year we announced our partnership with the National Hurricane Center to support their forecasts with cyclone predictions from our WeatherNext model, developed by teams from Google Research and Google DeepMind. At I/O, we showcased the impact of WeatherNext during the latest hurricane season. As Hurricane Melissa approached in October 2025, WeatherNext predicted the rapid intensification and Jamaican landfall with high confidence five days in advance. The Met Service in Jamaica was able to notify the public in advance, helping to save lives and livelihoods.
Another recent milestone was in urban flash flood prediction. To address a previously unsolved challenge in effective prediction due to data scarcity, we introduced Groundsource, a scalable novel methodology that leverages Gemini to turn 20 years of unstructured, public news reports into a high quality dataset of 2.6M records. This data enabled us to train advanced forecasting models for flash floods in urban areas. The forecasts are available on Flood Hub along with our riverine flood forecasts, which now cover 2B people in 150 countries for the most significant flood events.
WeatherNext and flood forecasting models are part of Google Earth AI, a collection of geospatial models and datasets, designed to transform planetary information into actionable intelligence. It is already helping enterprises, cities and nonprofits with challenges from environmental monitoring and disaster response to supporting public health. Recent updates on Earth AI include new insights on Roads Management, Population Dynamics and Aerial and Satellite Insights.
We keep advancing foundational research for generative AI. In collaboration with Google DeepMind, our work in areas spanning factuality, multilinguality and efficiency helps to advance Gemini model quality and performance, and expand global access to our products, to better meet the needs of users.
Our research on LLM factuality goes back to pioneering research on evaluating factual consistency in 2021 and an early benchmark in 2022. We continue to push Gemini and AI Mode forward, and publish cutting edge research to help the entire community provide factual information. We’ve published FACTS and extended it to allow robust benchmarking of factuality in LLMs, and techniques to improve factuality, including text-to-image, video generation, long-context and expressions of uncertainty.
At I/O, we saw that information journeys are becoming increasingly complex, where people engage in longer conversations to obtain what they need. This creates several challenges for LLMs, including being able to reason and analyze more relevant information in the context window, adhering to constraints that appeared early in the conversation, and using longer reinforcement learning trajectories. Google Research has pioneered work on all these challenges, and these advances fuel our Gemini models.
The new Ask Maps feature also allows people to ask complex, longer questions in Google Maps. We partnered with Ask Maps to upgrade its evaluation framework and redefine how map helpfulness is measured. By pinpointing complex edge cases involving model reasoning and tool execution, this collaboration established a vital feedback loop — critical for continuous improvement of Ask Maps' performance. We also drove research to improve the quality of Ask YouTube, a new feature which helps users find videos and information easily.
Generative AI is making tools and products far more accessible, and allowing technologies to finally meet users where they are. We’ve advanced multilinguality and localization capabilities for Gemini, including the publication of a benchmark which shows how LLMs operate in different languages, and in different locations, and open sourcing data in African languages, developed with the community. Our efforts helped enable the expansion of Gemini to more than 70 languages across more than 230 countries. This makes Gemini the most widely available AI assistant in the world.
Google builds its infrastructure to achieve low latency and high throughput, so that we can serve the needs of users, developers and enterprises around the world. Our research teams developed new techniques building on speculative decoding — including block verification and tree-structured drafting, which intelligently explores multiple candidate continuations at once and accepts more tokens per step. Our implementation is highly optimized for Google's TPU architecture, maximizing hardware utilization to deliver substantially faster responses with no loss in quality. This work enabled the current speed of Gemini 3.5 Flash, with the same models also powering Antigravity and AI Studio.
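To make the idea of accepting more than one token per step concrete, here is a minimal sketch of plain greedy speculative decoding, the baseline technique that the paragraph above builds on. It does not implement block verification or tree-structured drafting, and it is not Google's implementation. The toy "models" are hypothetical next-token functions, and a real system would verify the whole draft in one batched forward pass of the target model rather than calling it token by token.

```python
from typing import Callable, List

Token = int
NextToken = Callable[[List[Token]], Token]


def greedy_decode(target: NextToken, prompt: List[Token], n: int) -> List[Token]:
    """Reference decoder: one target-model call per generated token."""
    out = list(prompt)
    for _ in range(n):
        out.append(target(out))
    return out


def speculative_decode(target: NextToken, draft: NextToken, prompt: List[Token],
                       max_new_tokens: int, draft_len: int = 4) -> List[Token]:
    """Greedy speculative decoding: same output as greedy_decode on the target."""
    out = list(prompt)
    limit = len(prompt) + max_new_tokens
    while len(out) < limit:
        # 1) The cheap draft model proposes a short block of tokens.
        proposal: List[Token] = []
        for _ in range(draft_len):
            proposal.append(draft(out + proposal))
        # 2) Keep the longest prefix that matches the target's own greedy choice.
        accepted = 0
        for i, tok in enumerate(proposal):
            if target(out + proposal[:i]) != tok:
                break
            accepted += 1
        out.extend(proposal[:accepted])
        # 3) The target always adds one token (a correction or a bonus token),
        #    so every iteration makes progress even if nothing was accepted.
        out.append(target(out))
    return out[:limit]


def toy_target(ctx: List[Token]) -> Token:
    """Hypothetical target model: counts upward modulo 10."""
    return (ctx[-1] + 1) % 10


def toy_draft(ctx: List[Token]) -> Token:
    """Hypothetical draft model: agrees with the target except after token 5."""
    return 0 if ctx[-1] == 5 else (ctx[-1] + 1) % 10


print(speculative_decode(toy_target, toy_draft, [0], max_new_tokens=12))
```

Each loop iteration emits between one and `draft_len + 1` tokens, and every emitted token is one the target model would have chosen itself, so the output quality is unchanged. This is the property the paragraph refers to as faster responses with no loss in quality.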
Our research into generative UI laid the foundations for newly announced immersive experiences in Search and in the Gemini app. In Search, new generative UI features will be available for everyone this summer. Search can build the ideal response, in the right format for the question, giving users custom experiences including simulations, graphs, trackers and dashboards. And in the Gemini app, users will see interactive images, timelines and embedded videos. This is now rolling out globally, with the resulting experience feeling more fluid and natural.
As AI opens the door to new creative possibilities, users are looking for compelling, high-quality generated videos and images. Our research teams collaborated closely with Google DeepMind to help improve Gemini Omni, Google’s new model for creating anything from any input, starting with video. We helped improve the quality of the storytelling component of generated video clips, to make them more interesting and engaging, with a particular focus on improving the quality of human expressions in generated clips.
Google Antigravity 2.0, our improved agentic development platform, was launched at I/O last week. It allows users to manage multiple local agents in parallel and automate tasks. Our research teams collaborated with teams across Google to introduce /teamwork-preview agents in Antigravity, showing how agents on top of the new Flash model can perform complex long-horizon software and ML engineering tasks. This heralds a new era in developer productivity and collapses multi-day engineering efforts into hours. The /teamwork-preview command workflow invokes an agent that refines the user prompt, and then, following user approval, an orchestrator takes over, spawning dozens of specialized sub-agents to write, test, and debug code autonomously over extended, long-running sessions. At I/O, we demonstrated how this multi-agent system can build a functional operating system from scratch, with an autonomous team of agents writing every line of code from the scheduler to the memory management to the file system. Other demos include implementing the AlphaZero paper and building a competitive Go player via self-play.
Open-source software and open access datasets are drivers of modern science. They are key for powering the next generation of pioneering research and products. With open models like MedGemma and open datasets like Groundsource, mentioned above, along with our tools for genomics, neuroscience and more, we ensure that innovation is a catalyst for worldwide progress. In April, Google open-sourced Gemma V4 — our most advanced open model yet, purpose-built for reasoning, coding, and agentic workflows. At I/O, it was announced that Gemma V4 surpassed 100 million downloads in just one month. Our research teams launched architecture changes and training strategies, delivering higher model quality while maintaining the same efficient footprint. This means developers can run more sophisticated, autonomous agentic loops without requiring heavier compute resources.
In a world where agents can shop and make payments on your behalf and smart glasses can see and guide you everywhere you go, earning and maintaining user trust is paramount. As AI becomes more capable, privacy and data protection are a top priority. Over the years, we’ve developed privacy-preserving technology (PPT) to keep user data safe. PPT can help drive insights from aggregate, anonymized data to improve applications, while providing strong guarantees that individual privacy is protected. For example, we partnered with Google Search to produce privacy-preserving insights on AI Mode one-year usage, shared last week. Recent privacy innovations include privacy-preserving aggregate insight into how people use chatbots and on-device AI, as well as improved fundamentals of differential privacy for machine learning, LLMs, partition selection, and synthetic data generation.
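As one illustration of how aggregate insights can be released with a formal privacy guarantee, the sketch below applies the standard Laplace mechanism from the differential privacy literature to a simple count. It is a textbook example under stated assumptions (each user contributes at most `sensitivity` to the count) and the function names are hypothetical; the source does not say which mechanisms Google's systems use.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(true_count: float, epsilon: float, rng: random.Random,
                  sensitivity: float = 1.0) -> float:
    """Release a count with epsilon-differential privacy via the Laplace mechanism.

    `sensitivity` is the most any single user can change the count (assumed 1 here).
    Smaller epsilon means stronger privacy and more noise.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return true_count + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(0)
# Hypothetical example: 100 users triggered some feature today.
print(private_count(100, epsilon=0.5, rng=rng))
```

The noise is unbiased, so averages over many releases stay close to the truth, while any single release reveals little about whether one particular user is in the data.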
Alongside these data protections, we are leveraging our innovations in advanced risk management reasoning, bringing them to Gemini models to help secure the AI ecosystem, hardening our AI systems and making them more resilient to emergent risks and vulnerabilities.
We continue making progress on our quantum computing roadmap, bringing us closer to real-world applications of quantum computing.
We have pioneered the development of superconducting quantum bits (qubits), achieving milestones like error correction and verifiable quantum advantage. As published in Nature, with our Willow chip, we demonstrated the first algorithm to achieve verifiable quantum advantage, running the out-of-time-order correlator (OTOC) algorithm, which we call Quantum Echoes. It runs 13,000 times faster on Willow than the best classical algorithm on one of the world’s fastest supercomputers. Earlier this year, we expanded our world-leading quantum computing research to include neutral atom quantum computing, which uses individual atoms as qubits, alongside superconducting qubits. By investing in both, we can cross-pollinate research and engineering breakthroughs.
On stage at I/O, James Manyika and Hartmut Neven spoke about the intersection between quantum computing and AI. These are highly complementary technologies. AI is already accelerating progress in quantum computing on multiple fronts, from chip design to better error correction. They also discussed the significant potential for quantum computing to make AI more effective in the real world, as it can probe the quantum mechanics of how nature operates at a fundamental level more closely and accurately than classical computation can.
Last week, we launched the Research Program at the Intersection of Life Sciences & Quantum AI (REPLIQA), an initiative committing $10 million to five universities to apply advanced quantum science and AI to the life sciences, to improve human outcomes.
The breakthroughs shared at I/O reflect a bold new agentic era of innovation. Many of these advancements demonstrate the power of the magic cycle from research to reality, driving to make the impossible possible. With AI advancements, the magic cycle is accelerating, enabling research on bigger questions, with faster and greater impact on products, science and society.
With thanks to the many teams and collaborators who have contributed to this blog and to the work represented here.
