Research into how AI can help users understand skin conditions

Posted by qimuai on 2026-06-13 08:00 · First-hand compilation

Source: https://research.google/blog/research-into-how-ai-can-help-users-understand-skin-conditions/

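Summary:

Google research: AI skin-information tools help the public name skin conditions, but next-step care decisions still need work

(Google Research, June 12, 2026) More than half of adults look up health information online and one-third turn to AI, so helping laypeople correctly understand and act on AI-provided health information has become a pressing question. Google Research scientists Rory Sayres and Yun Liu today presented findings from a series of studies on how well AI tools help the general public with their own skin-related questions.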

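Large-scale survey: AI assistance nearly triples condition-naming accuracy

In a randomized survey study of 2,345 participants published this week in JAMA Dermatology, over 62% of participants using the AI tool attempted to name the skin condition, compared with 41% of the control group using a standard search engine. Naming accuracy was nearly three times higher (23% in the AI arm vs 8% in the control arm). In the "Wizard of Oz" arm, where dermatologists rather than AI did the matching behind the same interface, accuracy reached 36%. The study also found that AI did little to help users choose a next step, such as home care versus an urgent clinic visit: only the "Wizard of Oz" arm showed a small gain in next-step accuracy (63.5% vs 60%), and participants in the AI arm were slightly more likely than the control group to suggest a less urgent next step than a dermatologist would (30% vs 27%). Identifying a condition is not the same as knowing what to do about it.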

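Field study: clinicians found the multilingual AI app helpful 92% of the time

In a separate study published last year at the ACM CHI conference, the team worked with the Stanford Healthcare AI Applied Research Team (HEA3RT) and the Santa Clara Family Health Plan (SCFHP) in California on a field study with 110 participants who had active skin concerns and spoke four primary languages. After using an AI skin app translated into their language, participants' ability to name their condition rose by 260%, although the absolute rate of correct guesses remained low. Clinicians found the app's predictions generally (86%) consistent with their own assessments and reported it as a helpful tool 92% of the time, in part because it served as a shared reference point for discussion during the consultation.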

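Next steps: multimodal search alongside human-centered design

The team notes that laypeople prefer multimodal search built on image similarity (image plus text) over text or images alone. For AI to genuinely support care decisions, tools still need richer "textbook" example images covering a range of skin tones, severities, and body parts, and actionable advice tailored to the user's actual question rather than only to the condition name. Google says it will continue human-centered research so that AI tools help people from different backgrounds interpret medical information effectively and support their healthcare journeys.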

English source:

June 12, 2026
Rory Sayres and Yun Liu, Research Scientists, Google Research
We present recent published findings on how dermatology AI tools may help laypeople with their own skin-related questions.
More than half of adults use the Internet for health information, and one-third turn to artificial intelligence (AI). However, access to information does not mean that it is easy to understand or correctly interpreted. In short, the human component of AI for health information remains an important research area if people are to benefit from better health information.
Specifically, this is important in the space of dermatology (skin, hair, nails; henceforth “skin” for brevity) because people have trouble looking for the right information online related to their skin concern. For instance, you may notice “red dots on legs,” but not have the background knowledge to specifically search for “palpable purpura”.
Over the years, we have built a technical foundation in this area, including developing AI models to inform differential diagnoses, performing validation of model generalization, and releasing datasets like SCIN to help clinicians and researchers. However, the most significant impact can only be realized by supporting the decision-making of people who have skin concerns through providing high-quality information.
To do this right, understanding how humans engage with AI to inform their decisions is critical. Previous studies evaluating non-AI tools have shown that while people might get better at identifying a condition using the internet, they don't necessarily get better at deciding what next steps to take. We need to ensure that as AI tools become available, we carefully study and improve upon the human factors to support people in making better decisions.
With the above in mind, today we share some of our recent and past research on consumer understanding of AI tools for their dermatology-related questions. These include a recent large-scale quantitative paper that demonstrates increased ability to name conditions with AI assistance, as well as some benefits in determining what next steps to take. It also includes an in-depth mixed-methods study addressing how people use these tools on their own skin concerns, and how the understanding they gain compares to that from conversation with doctors.
In “Consumer Understanding of Skin Concerns With an AI-Powered Informational Tool,” published this week in JAMA Dermatology, we investigated how structured AI assistance changes a user's ability to identify a condition and determine their next steps. We showed 2,345 survey participants retrospective, de-identified skin condition cases — complete with images and structured medical history — and asked them to imagine the cases were their own.
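Participants were randomized into three groups to research the cases:
Unassisted arm: used only a standard search engine with text search (e.g., Google Search).
AI arm: had access to a prototype application resembling a dermatology AI tool, which matched potential skin condition names to the images and medical history and showed images with highlighted regions, over-the-counter treatment options, and care advice.
"Wizard of Oz" arm: had access to a tool with the same interface as the AI arm, but with dermatologists (rather than AI) matching the images and filling in the card information.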
We found that AI assistance provided a statistically significant improvement for consumer understanding. When using the AI tool, participants were more willing to attempt to name the condition shown (over 62%) compared to the control group using standard search tools (41%).
More importantly, participants’ condition name guessing accuracy improved dramatically. Accuracy was nearly three times higher in the AI arm (23%) compared to the unassisted control arm (8%). In the "Wizard of Oz" arm, accuracy was about four times higher (36%), but still not near perfect. Having AI "cards" to display matching conditions also imparted significantly higher confidence in their condition guesses, and greater overall satisfaction with their search results and the time spent searching.
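As a quick check on the "nearly three times" and "about four times" figures, the ratios can be recomputed from the rounded percentages quoted above. This is a minimal sketch: the helper names `fold_change` and `point_difference` are illustrative, not from the paper, and because the inputs are rounded, the ratios may differ slightly from those computed on the paper's underlying counts.

```python
def fold_change(treatment_pct: float, control_pct: float) -> float:
    """Return how many times larger the treatment rate is than the control rate."""
    if control_pct <= 0:
        raise ValueError("control rate must be positive")
    return treatment_pct / control_pct


def point_difference(treatment_pct: float, control_pct: float) -> float:
    """Return the absolute difference in percentage points."""
    return treatment_pct - control_pct


# Condition-naming accuracy in percent, as quoted in the text (rounded values).
CONTROL_PCT = 8.0
AI_ARM_PCT = 23.0
WIZARD_OF_OZ_PCT = 36.0

ai_ratio = fold_change(AI_ARM_PCT, CONTROL_PCT)
woz_ratio = fold_change(WIZARD_OF_OZ_PCT, CONTROL_PCT)
ai_points = point_difference(AI_ARM_PCT, CONTROL_PCT)

print(f"AI arm vs control: {ai_ratio:.2f}x ({ai_points:.1f} percentage points)")
print(f"Wizard of Oz arm vs control: {woz_ratio:.2f}x")
```

Reporting the percentage-point difference alongside the ratio is useful here because a large ratio on a small base (8%) still leaves most guesses incorrect.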
To avoid being prescriptive, the AI in our study was designed to focus on matching images to possible conditions, leaving the user to interpret what should be done. Our goal was to enable users to search efficiently, not to be prescriptive or diagnostic. In addition, the treatment options and information given were written by dermatologists with access to authoritative sources, based purely on the condition name and not tailored to the specific severity of the condition in that case.
Perhaps because of the generality of information provided, deciding on the appropriate medical next steps, such as using a home remedy versus scheduling an urgent clinic visit, remained challenging for users. Our study found that while next-step accuracy increased by a small amount in the "Wizard of Oz" arm (63.5% vs 60% in control), the standard AI arm did not show a statistically significant improvement. Furthermore, participants in the AI arm were slightly more likely to suggest a less urgent next step than a dermatologist would, compared to the control group (30% vs 27%).
This reinforces that simply identifying the condition is not always enough. There is still progress to be made in designing tools that better inform laypeople about the safest and most appropriate next steps.
While large-scale survey studies are invaluable for understanding general trends, we also recognized the need to understand how people interpret information when it is directly relevant to their own concerns, rather than interpreting pictures of others’ conditions. To get this richer, more nuanced feedback, we sought deep, qualitative insights directly from the communities who stand to benefit most from these tools.
In "Navigating Skin Concerns with AI: A Human-Centered Investigation of a Dermatology App in a Diverse Community," published at the ACM CHI Conference on Human Factors in Computing Systems last year, we collaborated with the Stanford Healthcare AI Applied Research Team (HEA3RT) and the Santa Clara Family Health Plan (SCFHP). SCFHP serves members of the surrounding community, many of whom rely on a healthcare safety net, Medi-Cal. Our goal was to study how diverse, consented participants with active skin concerns actually used and reacted to information from a skin AI system in a real-world setting.
Crucially, we wanted to ensure we were building for this community; since the participants spoke four primary languages, the AI application was translated into their respective languages. Volunteers or staff fluent in the respective language were also present to facilitate communication.
In this real-world study, 110 consented participants used the app (and consulted with a clinician immediately after to clarify any concerns). Similar to the survey study above, using the app increased these participants’ ability to name their condition (an increase of 260%, though the correct guess rate was overall low). Participants heavily relied on visual matching of the textbook images to their condition, highlighting the importance of having images from a spectrum of skin tones, condition severities, and body parts to help them “pattern match”.
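To make the "increase of 260%" concrete: a relative increase of 260% means the rate after using the app is 3.6 times the rate before, not 2.6 times. The sketch below shows the arithmetic. The text does not report the baseline rate, so `HYPOTHETICAL_BASELINE_PCT` is a made-up value used only for illustration, and the function names are likewise assumptions.

```python
def multiplier_from_percent_increase(increase_pct: float) -> float:
    """Convert a relative increase in percent into a multiplicative factor."""
    return 1.0 + increase_pct / 100.0


def percent_increase(before_pct: float, after_pct: float) -> float:
    """Return the relative increase, in percent, from before_pct to after_pct."""
    if before_pct <= 0:
        raise ValueError("baseline rate must be positive")
    return (after_pct - before_pct) / before_pct * 100.0


REPORTED_INCREASE_PCT = 260.0    # quoted in the text
HYPOTHETICAL_BASELINE_PCT = 5.0  # NOT from the study; illustration only

multiplier = multiplier_from_percent_increase(REPORTED_INCREASE_PCT)
hypothetical_after_pct = HYPOTHETICAL_BASELINE_PCT * multiplier

print(f"A {REPORTED_INCREASE_PCT:.0f}% increase multiplies the rate by {multiplier:.1f}")
print(
    f"With a hypothetical {HYPOTHETICAL_BASELINE_PCT:.0f}% baseline, "
    f"the rate after would be {hypothetical_after_pct:.0f}%"
)
```

Under this made-up baseline the rate after using the app would still be well below half, which is consistent with the remark that the correct guess rate was low overall despite the large relative gain.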
The clinicians in the study felt the app’s predictions were generally (86%) consistent with their own assessments of the condition. Because the participants could open the app during the clinician consultation, the clinicians were also able to use it as a shared reference point for discussion and facilitate patient-doctor conversation. The clinicians reported the app as a helpful tool 92% of the time.
Our studies above focused on the use of image-based AI to help individuals with diverse backgrounds better understand skin conditions. Key findings for possible improvements include providing more "textbook" examples to guide user understanding and pattern matching, and including actionable information more specific to the actual user query (as opposed to the conditions). Additionally, our research using image-similarity-based tools supports the finding that laypersons prefer an image-and-text (i.e., multimodal) approach to AI-based skin condition information search over using either alone.
When we look at all of these studies collectively, a potential picture of the future of searching for skin condition information emerges. Providing a visual start lowers the barrier to entry, and more personalized AI guidance may help navigate complex medical information. However, building highly effective tools requires continuous, human-centered research to ensure that everyone can effectively interpret this information to help support healthcare journeys.
We thank Elyse Bagley, Trevor Crowell, Bhavna Daryani, Huy Doan, Morgan Du, Madison Elliott, Bea Erickson, Mat Fleck, Zoe Gan, Tammi Huynh, Yetunde Ibitoye, Yejin Jeong, Sergio Marquez, Jay Nayar, Kira Nguyen, Trang Nguyen, Javier Perez, Carola Ponce, Uriel Rivera, Sunny Virmani, Renee Wong, and Allan Ysunza for their contributions to the execution of the mixed-methods studies; and Michael Howell, Naama Hammel, Rajeev Rikhye, Abi Jones and Dave Steiner for valuable feedback on the papers.
