Four ways Google Research scientists have been using Empirical Research Assistance

Published by qimuai (first-hand compilation)

Source: https://research.google/blog/four-ways-google-research-scientists-have-been-using-empirical-research-assistance/

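Summary:

Four applications of Empirical Research Assistance by Google scientists: from epidemic forecasting to cosmic strings

April 29, 2026, The Google Research Science team

Since introducing Empirical Research Assistance (ERA) last fall, Google Research scientists have applied it to real-world problems in epidemiology, cosmology, atmospheric monitoring, and neuroscience, hinting at AI's potential to accelerate scientific discovery. The four applications are: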

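1. Public health: hospitalization forecasts for flu, COVID-19, and RSV
ERA was first used to predict U.S. COVID-19 hospitalizations; in retrospective analysis it matched or outperformed existing tools from the Centers for Disease Control and Prevention (CDC) and leading research institutions. The team then extended forecasting to influenza and respiratory syncytial virus (RSV) and now submits prospective forecasts every week. For the 2025-26 flu season, Google has submitted weekly hospitalization forecasts for every U.S. state at horizons of up to four weeks, and it also joined the CDC's year-round live COVID-19 forecasts and its RSV forecasting hub. Public leaderboards for flu and COVID-19 show Google at or near the top, which points to public health value in tracking newer diseases and covering more locations.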

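2. Cosmology: cosmic strings and gravitational radiation
The equations governing gravitational radiation from cosmic strings contain singularities, so the energy spectrum could not be computed with traditional approaches. Previously, only a partial solution existed, for the simplest case of a square loop. By combining ERA with Gemini Deep Think and systematically exploring mathematical techniques that can navigate the singularities, Google derived six general solutions and a concise formula for the asymptotic limit.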

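3. Climate and sustainability: monitoring CO2 with weather satellites
Existing CO2-monitoring satellites cover only a small part of Earth's surface, while geostationary weather satellites scan a whole hemisphere quickly but were not designed to measure CO2. Google scientists used ERA to develop a single-pixel, physics-guided neural network that extracts a column-averaged CO2 signal from the 16 wavelength bands of the GOES East satellite. Once trained, the model provides column-averaged CO2 estimates every 10 minutes across the satellite's field of view, at unprecedented spatial and temporal resolution. Comparisons against independent data confirm that it captures real CO2 variability, extracting additional value from existing instruments.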

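4. Neuroscience: uncovering how neural circuits work
Zebrafish are a popular model organism for studying how vertebrates sense and respond. Given the wiring diagram of a simplified zebrafish body and brain simulator, ERA proposed candidate circuits linking stimulus, neural activity, and motor response. Tests on new visual stimuli showed that these AI-hypothesized circuits are not statistical shortcuts but accurate neural mechanisms that generalize to similar situations. This goes beyond black-box modeling to interpretable, mechanistically accurate solutions and offers a blueprint for studying living brains.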

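Conclusion
These four projects show how LLM-backed systems can accelerate scientific discovery across theoretical math, data forecasting, and the analysis of observational data. ERA and other Google tools are helping to solve open problems, democratize access to computational modeling, and maximize the utility of existing observational data.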

English source:

Four ways Google Research scientists have been using Empirical Research Assistance
April 29, 2026
The Google Research Science team
Since introducing Empirical Research Assistance in the fall, Google Research scientists have been using it to address real-world applications in epidemiology, cosmology, atmospheric monitoring, and neuroscience, providing a hint of AI’s transformational capabilities to accelerate scientific discoveries.
AI’s capabilities to advance scientific discovery are growing every week, with outcomes that promise not just to enable breakthrough discoveries but to transform how science is done. In September, we released a preprint introducing Empirical Research Assistance (ERA) to help scientists generate expert-level empirical software. That included novel solutions to six diverse and challenging benchmark problems in fields ranging from cell biology to neuroscience.
Since then, Google scientists and our academic collaborators have been developing and using ERA to test its capabilities and explore potential applications. These efforts go beyond proof-of-concept tests to real-world scenarios in epidemiology, geospatial analysis, and more, revealing how AI can democratize access to the power of computational modeling, find solutions to unsolved problems, unlock deeper insights from existing data collections, and go beyond black-box modeling to discover interpretable, mechanistically accurate solutions.
It’s been inspiring to see the excitement of Google research scientists, visiting faculty researchers and academic collaborators as they experiment with ERA. We are thrilled to see these capabilities expand as it nears more widespread availability to support AI-assisted scientific discovery for global benefit.
Public health: Hospitalization forecasts for flu, COVID-19, and RSV
In the preprint, authors used ERA to predict U.S. hospitalizations for COVID-19, showing that it was able to retrospectively match or outperform existing tools from the Centers for Disease Control and Prevention (CDC) and leading research institutions. As a follow-on effort, the team has now expanded to generate forecasts not just for COVID, but also for influenza and respiratory syncytial virus (RSV), and has been submitting prospective forecasts in real time every week.
When the CDC’s long-running flu forecast challenge opened in November for the 2025-26 season, Google began submitting weekly forecasts for every U.S. state and at all time horizons, up to four weeks in the future. Late last year, Google also joined the CDC’s year-round live forecasts for state-level COVID-19 hospitalizations, as well as the CDC’s recently launched hub for forecasting RSV. Public leaderboards for flu and COVID-19 run by Nicholas Reich, a biostatistics professor at the University of Massachusetts Amherst and consultant on this project, show that Google has been performing at or near the top of both leaderboards during the time it has been submitting forecasts to each project (see figure). Although there is no public leaderboard for RSV, internal analyses show similarly strong performance.
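To make the idea of scoring forecasts across horizons concrete, here is a minimal sketch of one common way to score probabilistic forecasts: the pinball (quantile) loss, averaged over quantile levels. The article does not say which metric the leaderboards use, so the metric choice, the quantile levels, and every number below are assumptions for illustration only.

```python
from typing import Dict


def pinball_loss(level: float, predicted: float, observed: float) -> float:
    """Pinball (quantile) loss for one predicted quantile; lower is better."""
    if observed >= predicted:
        return level * (observed - predicted)
    return (1.0 - level) * (predicted - observed)


def mean_pinball_loss(quantiles: Dict[float, float], observed: float) -> float:
    """Average pinball loss across every quantile level in one forecast."""
    losses = [pinball_loss(q, v, observed) for q, v in quantiles.items()]
    return sum(losses) / len(losses)


# Hypothetical weekly admissions forecast for one state:
# horizon in weeks -> {quantile level: predicted admissions}
forecast = {
    1: {0.25: 90.0, 0.5: 100.0, 0.75: 110.0},
    2: {0.25: 85.0, 0.5: 105.0, 0.75: 125.0},
    3: {0.25: 80.0, 0.5: 110.0, 0.75: 140.0},
    4: {0.25: 75.0, 0.5: 115.0, 0.75: 155.0},
}
# Hypothetical admissions observed once each target week has passed.
observed = {1: 104.0, 2: 98.0, 3: 120.0, 4: 160.0}

# One score per horizon, so accuracy can be compared as lead time grows.
scores = {h: mean_pinball_loss(forecast[h], observed[h]) for h in forecast}
for horizon, score in sorted(scores.items()):
    print(f"horizon {horizon} weeks: mean pinball loss = {score:.2f}")
```

Scoring each horizon separately shows how accuracy changes with lead time; a leaderboard would typically aggregate such scores over many locations and weeks.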
An AI-powered tool that can meet or exceed the forecasting accuracy of leading public health agency tools promises huge public health benefit for tracking newer conditions and in broader locations, democratizing access to computational modeling for epidemiology for a wider range of infections and geographies.
Cosmology: Cosmic strings and gravitational energy radiation
Cosmic strings are theoretical defects in the fabric of spacetime, believed to have formed in the early universe and predicted to emit gravitational radiation. Calculating the spectrum of this emitted energy is an unsolved problem, largely because the governing equations contain singularities — mathematical points where values approach infinity and traditional models break down. Last fall, a paper used OpenAI’s GPT-5 to find a partial solution for the gravitational energy radiating from cosmic strings, but only for the simplest case of a square loop where the angle α = π/2, or 90 degrees. A unified exact solution — a single, complete mathematical formula that solved the integral perfectly — remained an open problem.
To address this, we combined ERA with Gemini Deep Think. By systematically exploring mathematical techniques capable of navigating these singularities, we successfully derived six general solutions and a concise formula for the asymptotic limit, which we shared in March. This illustrates the powerful potential of pairing ERA with advanced LLMs to unlock precise, novel solutions at the frontier of cosmology.
Climate and sustainability: Using weather satellites to monitor CO2
Regular observations of carbon dioxide (CO2) began at Hawaii’s Mauna Loa Observatory in the late 1950s, yielding the iconic Keeling Curve that documents rising global CO2 concentrations in Earth’s atmosphere. Mapping human greenhouse gas emissions and understanding how plants, trees, soils and oceans absorb those emissions requires us to track how CO2 varies across regions and over time. Current space-based CO2 sensors, like NASA’s Orbiting Carbon Observatory-2 (OCO-2), were designed to make high-precision observations, but they only map a tiny fraction of the Earth’s surface and return to each location just once every 16 days. Geostationary satellites, such as the GOES East satellite designed to support weather forecasting, orbit the Earth from a much higher altitude and can scan an entire hemisphere every 10 minutes. However, none of the existing geostationary satellites were designed to map CO2.
Google researchers used ERA to develop a single-pixel, physics-guided neural network to distill a column-averaged CO2 signal from the existing GOES East observations. To do so, the model combines data from 16 wavelength bands from GOES East with lower-troposphere meteorology, solar angles, and day of the year. After training on the sparse observations from OCO-2 and OCO-3, the model was then able to derive estimates of column-averaged CO2 everywhere and every 10 minutes.
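As a rough illustration of what "single-pixel" means here, the sketch below assembles one feature vector per pixel from the input groups named above and passes it through a tiny, untrained network. Only the count of 16 bands comes from the article. The number of meteorology variables, the angle and day-of-year encodings, the network size, and all names are assumptions, and the physics-guided part of the real model is not represented.

```python
import math
import random
from dataclasses import dataclass
from typing import List, Sequence

N_BANDS = 16  # stated in the article: 16 GOES East wavelength bands
N_MET = 3     # hypothetical count of lower-troposphere meteorology variables


def build_pixel_features(bands: Sequence[float], met: Sequence[float],
                         solar_zenith_deg: float, solar_azimuth_deg: float,
                         day_of_year: int) -> List[float]:
    """Concatenate every input for ONE pixel into a flat feature vector."""
    if len(bands) != N_BANDS:
        raise ValueError(f"expected {N_BANDS} band values, got {len(bands)}")
    if len(met) != N_MET:
        raise ValueError(f"expected {N_MET} meteorology values, got {len(met)}")
    season = 2.0 * math.pi * day_of_year / 365.25
    azimuth = math.radians(solar_azimuth_deg)
    return [
        *bands,
        *met,
        math.cos(math.radians(solar_zenith_deg)),
        math.sin(azimuth), math.cos(azimuth),  # assumed cyclic angle encoding
        math.sin(season), math.cos(season),    # assumed cyclic day-of-year encoding
    ]


@dataclass
class TinyMLP:
    """One-hidden-layer network mapping pixel features to a single scalar."""
    w1: List[List[float]]
    b1: List[float]
    w2: List[float]
    b2: float

    @classmethod
    def random_init(cls, n_in: int, n_hidden: int, seed: int = 0) -> "TinyMLP":
        rng = random.Random(seed)
        w1 = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_hidden)]
        w2 = [rng.gauss(0.0, 0.1) for _ in range(n_hidden)]
        return cls(w1=w1, b1=[0.0] * n_hidden, w2=w2, b2=0.0)

    def predict(self, features: Sequence[float]) -> float:
        hidden = [math.tanh(sum(w * x for w, x in zip(row, features)) + b)
                  for row, b in zip(self.w1, self.b1)]
        return sum(w * h for w, h in zip(self.w2, hidden)) + self.b2


# Placeholder, already-normalized inputs for a single pixel.
features = build_pixel_features(
    bands=[0.05 * i for i in range(N_BANDS)],
    met=[0.2, -0.1, 0.5],
    solar_zenith_deg=35.0,
    solar_azimuth_deg=140.0,
    day_of_year=119,
)
model = TinyMLP.random_init(n_in=len(features), n_hidden=8)
estimate = model.predict(features)  # untrained, so the value is meaningless
print(len(features), estimate)
```

Because each prediction depends on one pixel only, a model of this shape could be trained wherever sparse OCO-2 or OCO-3 labels exist and then applied to every pixel of every scan.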
Research shared at the International Workshop on Greenhouse Gas Measurements from Space shows that the AI-developed model is able to leverage the high spatial and temporal density of the GOES East observations to track column-averaged CO2 with unprecedented spatial and temporal resolution. Comparisons against independent data from additional years of OCO-2 observations, and from the ground-based Total Carbon Column Observing Network (TCCON), confirm the model’s ability to capture real CO2 variability.
These results show how an AI algorithm can extract additional value from existing observational instruments, especially for resource-intensive satellite research missions. This project is among several questions related to climate and greenhouse gases that Google researchers are exploring using ERA.
Neuroscience: Discovering mechanisms of neural circuits
Although we can now map tens of thousands of neurons in living brains, untangling the functional circuits is the next step. Google researchers used ERA to tackle this challenge in both real and simulated zebrafish, a popular model organism for studying how a vertebrate detects stimuli, processes information and responds. In natural settings, light passing through ripples on the water’s surface creates patterns of light and dark stripes on the seafloor or riverbed. Zebrafish have evolved to instinctively respond to changes in those stripes in order to stay in shallow water and avoid getting swept away.
In a new study, we looked at the zebrafish neural circuit corresponding to this environmental stimulus. We provided ERA with the wiring diagram of simZFish, a simplified zebrafish body and brain simulator. Guided by this information — revealing what cellular connections exist, but omitting the mathematical rules that govern them — ERA was able to propose circuits that connect stimulus to neural activity to motor response. Testing these AI-hypothesized circuits against new visual stimuli showed that they were not just statistical shortcuts, but accurate neural mechanisms that generalize to other, similar situations.
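One way to picture a wiring diagram that says which connections exist but not the rules governing them is a binary mask over candidate connection weights. The sketch below is an illustrative toy under that assumption, not ERA's method or the simZFish circuit: the four-unit layout, the tanh update rule, and the weight values are all hypothetical.

```python
import math
from typing import List

# Hypothetical 4-unit circuit: unit 0 = stimulus input, units 1 and 2 =
# interneurons, unit 3 = motor output.
# WIRING[i][j] == 1 means a connection from unit j to unit i exists.
WIRING = [
    [0, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 1, 1, 0],
]


def mask_weights(weights: List[List[float]],
                 wiring: List[List[int]]) -> List[List[float]]:
    """Zero every candidate weight the wiring diagram does not allow."""
    return [[w if c else 0.0 for w, c in zip(w_row, c_row)]
            for w_row, c_row in zip(weights, wiring)]


def step(activity: List[float], weights: List[List[float]],
         stimulus: float) -> List[float]:
    """One synchronous update; unit 0 is clamped to the stimulus."""
    updated = []
    for i, row in enumerate(weights):
        if i == 0:
            updated.append(stimulus)
        else:
            # Assumed update rule: tanh of the weighted input sum.
            updated.append(math.tanh(sum(w * a for w, a in zip(row, activity))))
    return updated


def motor_response(weights: List[List[float]], stimulus: float,
                   n_steps: int = 4) -> float:
    """Run the circuit from rest and return the motor unit's activity."""
    activity = [0.0] * len(weights)
    for _ in range(n_steps):
        activity = step(activity, weights, stimulus)
    return activity[-1]


# A dense candidate proposal, restricted to the connections that exist.
candidate = [[0.5] * 4 for _ in range(4)]
circuit = mask_weights(candidate, WIRING)
print(motor_response(circuit, stimulus=1.0))
print(motor_response(circuit, stimulus=0.0))
```

Restricting candidates to allowed connections keeps every proposed circuit readable as a pathway from stimulus to motor output, which is what makes it testable against new stimuli.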
This builds on results from the preprint, which showed that AI-developed models could outperform baseline methods at predicting the activity of over 70,000 neurons captured in the Zebrafish Activity Prediction Benchmark, ZAPBench, a dataset of neural activity from experiments that mimic typical environmental stimuli.
While ZAPBench proved ERA's ability to find state-of-the-art predictive solutions, the simulated environment reveals how it can go beyond black-box modeling. Equipped with structural information, ERA discovered interpretable, mechanistically accurate solutions, providing a powerful blueprint for addressing scientific grand challenges in living brains.
Conclusion: AI-assisted science
These four projects are among a growing list of results that show how LLM-backed systems can advance science and accelerate the pace of discovery. The examples span a range of fields and problem types, from theoretical math to data forecasting to analyzing data from observational instruments and simulation output. They also showcase the potential for AI-enabled science to solve open problems, democratize access to computational modeling, and maximize the utility of existing observational data. We’re excited about the progress being unlocked by ERA and other Google tools, including co-scientist and PAT, designed to accelerate scientific discovery.
Acknowledgments
We’d like to thank our collaborators on developing ERA, and all the scientists who are among the early adopters. The epidemiological forecasting work is led by Zahra Shamsi, Sarah Martinson, Nicholas Reich, Martyna Plomecka, and Brian Williams. The cosmological paper is authored by Michael Brenner, Vincent Cohen-Addad, and David Woodruff. The research on carbon dioxide monitoring is led by Aarón Sonabend-W, Sean Campbell, Renee Johnston, Vishal Batchu, Carl Elkin, Christopher Van Arsdale, John Platt, and Anna Michalak. The paper on neural circuits was authored by Jan-Matthis Lückmann, Viren Jain, and Michał Januszewski. We also acknowledge leadership support from John Platt, Michael Brenner, Lizzie Dorfman, Vip Gupta, Alison Lentz, Erica Brand, Katherine Chou, Ronit Levavi Morad, Yossi Matias, and James Manyika.
