你实际上需要为转录软件付费吗?

qimuai 发布于 阅读:33 一手编译

你实际上需要为转录软件付费吗?

内容来源:https://www.wired.com/story/do-you-actually-need-to-pay-for-transcription-software/

内容总结:

AI语音转写工具Wispr Flow年费144美元,免费替代方案实测推荐

近期,AI语音转写工具Wispr Flow的广告频繁出现,宣称可通过语音输入实现“4倍于键盘的书写速度”,尤其吸引打字慢的用户。该工具的核心卖点并非简单转写,而是利用AI进行二次处理:先通过语音识别将口语转为文字,再借助大语言模型(LLM)去除语气词、自动分段,最终输出格式规整的文本。测试显示,该工具在电脑和手机任意文本框内均可调用,处理效果出色,设计简洁流畅。

然而,其价格成为痛点:免费试用期极短,后续年费144美元(约合人民币1000元),月费15美元。事实上,其底层技术——AI语音识别(如英伟达Canary、OpenAI Whisper)和LLM处理(如ChatGPT、Claude)均已广泛开源或免费可用。记者实测发现,以下免费替代方案表现不俗:

1. Spokenly(最佳免费选择)
支持macOS和Windows,完全免费下载且无需注册。若使用本地模型,或接入已有OpenAI/Groq等API密钥,可零成本体验。支持自定义后处理提示词,甚至可离线运行,数据不离开电脑,隐私性极佳。基础功能已接近Wispr Flow,推荐优先尝试。

2. Mac专属免费工具

3. Windows/Linux用户

记者观点:Wispr Flow的易用性和统一界面确有优势,但若预算有限,免费工具完全可满足需求。记者本人更倾向键盘输入:“写作即思考,打字、审视、修改本身就是创作过程。”不过,工具因人而异,市场上丰富的选择让用户总能找到适合的方案。

中文翻译:

我经常看到Wispr Flow的广告,这是一款由人工智能驱动的转录工具。它的卖点——通过大声说话而不是打字来更快地书写——很有吸引力,尤其是如果你打字速度较慢的话。营销承诺说,你可以“以思维的速度书写,比键盘快4倍”。

我的打字速度已经比我的思考速度快了。(是打字快,还是思考慢?你来判断。)但Wispr Flow的核心承诺不仅仅是转录——而是后期处理。这个工具分两步走。首先,现代人工智能转录工具将你的语音转换成文字;其次,一个大型语言模型会移除填充词,并将你的话整理成完整的句子和段落。其理念是,你可以说出你的想法,然后看着它们变成格式规整的文字。这在你的电脑或手机上的任何文本框中都能使用。

我测试过几次,不得不承认效果相当不错。苹果在所有设备上免费提供的听写功能效果已经足够好——谷歌在Pixel手机上的助手语音输入也是如此(它很快将迎来另一次人工智能升级)。但能够移除填充词并将所有内容整理成段落的软件确实有真正的价值。而且Wispr Flow设计非常流畅,通过精美的图形引导你完成设置过程。

那么问题在哪里?价格。Wispr Flow每年收费144美元(按年付费),或者在极其有限的免费试用后每月收费15美元。但Wispr Flow所基于的技术——基于人工智能的转录和大型语言模型——已经很普及了。在语音转文字方面,英伟达的Canary和OpenAI的Whisper都是开源的,这意味着它们可以在你自己的设备上完全免费运行。而且大多数人工智能爱好者已经为OpenAI、Claude或谷歌的Gemini付费了,这些服务中的任何一个都能处理Wispr Flow的后期处理部分。像Ollama、谷歌录音机或苹果智能这样的免费本地工具也可以做到。

考虑到这一切,我一直在想:有没有一个良好的、免费的、跨平台的Wispr Flow替代方案?我尝试了几款应用程序——以下是我的发现。

Spokenly:最佳免费替代方案

如果你想快速获得Wispr Flow的体验而无需订阅,Spokenly是个不错的选择,它同时支持macOS和Windows。它不是开源的,但可以免费下载,并且不需要账户即可使用。有一个Pro计划,每月10美元或每年100美元。只有当你使用Spokenly的云端模型时才需要付费计划。你可以选择使用本地模型,这是免费的。或者,如果你已经在为OpenAI或Groq等服务付费,你可以添加你的API密钥并将其用于转录——这在Spokenly上是免费的。

Spokenly提供了可选的后期转录格式化功能。你也可以为文本的后期转录格式化选择不同的大型语言模型提供商。作为Mac用户,我选择使用苹果智能——它完全免费,在我的测试中效果非常好。但它也支持OpenAI、Anthropic和Groq,以及一些其他大型语言模型提供商。该应用程序还允许你为后期转录处理编写任意数量的自定义提示,每个提示都有自己的键盘快捷键。

我最喜欢的一点是,Spokenly可以完全离线工作。如果你使用本地模型进行转录,并使用苹果智能这样的本地模型进行后期转录格式化,整个过程无需任何数据离开你的电脑。从隐私角度看这很好,从功能角度看,即使你的网络不稳定,该功能也能正常工作。

毫无疑问,这比设置Wispr Flow要麻烦一些。不过,当你设置完成后,你就拥有了一个无需月费的、可用的应用程序。我建议你试试。

其他几个免费替代方案

正如我之前所说:人工智能转录和大型语言模型都是广泛可用的技术。因此,现在市面上有很多Wispr Flow的替代方案也就不足为奇了。

对于Mac用户来说,完全免费且开源的MacParakeet是一个很好的选择。它是开源的,完全免费下载和使用,无需账户。应用程序内也没有任何升级推销。转录使用本地模型(Parakeet或Whisper)处理,格式化步骤支持多种大型语言模型——包括本地和在线模型。这是我找到的最接近Wispr Flow的完全免费应用。

VoiceInk,另一个仅限Mac的选择,如果你从GitHub下载代码并自行编译,它是开源且免费使用的。否则,该应用程序一次性收费25美元,之后你可以使用所有功能,无需持续付费。请注意,它的格式化步骤需要来自Gemini、Anthropic、OpenAI或Claude等服务的API密钥。

Windows和Linux用户应该看看FOSS Voquill,它是完全免费的开源软件,并且可以离线工作。它没有提供格式化步骤,这有点令人失望,但我把它列入其中,因为它是我找到的最佳免费Windows和Linux选项,没有任何烦人的升级推销。

出于任何原因不喜欢上述选项的Windows用户和Mac用户还有另一个选择:OpenWhispr。这个开源工具不需要账户(但你需要找到一个很小的“无需账户继续”按钮)。该应用程序提供订阅,但你可以选择设置本地模型和外部API密钥来避免付费。

你真的需要用语音打字吗?

Wispr Flow有其优点。首先,它很容易配置,并且拥有统一的用户界面。我能理解为什么有人会选择付费订阅。但如果现在预算紧张,也有免费的选项可供选择。

我很享受探索这个不断发展的领域,但我还是会坚持使用我的键盘。Wispr Flow以及类似的应用承诺让你以思维的速度书写,但我的打字速度比思考速度快。如果允许我稍微哲学一点,写作就是我的思考方式。打出一个句子,看着它,然后润色它,这不是写作过程中令人烦恼的部分——它本身就是写作过程。而且,我常常只有在花时间梳理思路后,才知道自己对某件事的看法。我不禁觉得,如果我只是对着电脑说话而不是打字,很多东西就会丢失。

但每个人的大脑都不同,这些工具可能很适合你。这也正是我很高兴市面上有这么多选择的原因。

英文来源:

I'm constantly seeing ads for Wispr Flow, an AI-powered transcription tool. The pitch—that you'll be able to write faster by talking out loud instead of typing—is compelling, especially if you're a slow typist. The marketing promises you'll be able to "write at the speed of thought, 4x faster than your keyboard."
I already type faster than I can think. (Fast typist, or slow thinker? You decide.) But Wispr Flow's core promise isn't just transcription—it's post-processing. The tool uses two steps. First, modern AI transcription tools turn your voice into text; second, a large language model (LLM) removes filler words and formats your words into complete sentences and paragraphs. The idea is that you can talk out your ideas and watch them turn into properly formatted text. This works inside any text box on your computer or phone.
I've tested this a few times and have to admit the results are pretty good. Apple's dictation feature, free on all its devices, works well enough—so does Google's Assistant Voice Typing on Pixel phones (which is getting another AI upgrade soon). But there's real value in software that removes filler words and formats everything into paragraphs. And Wispr Flow is sleekly designed, guiding you through the setup process with snappy graphics.
So what's the catch? Price. WisprFlow costs $144 per year (billed annually) or $15 a month after an extremely limited free trial. But the technology Wispr Flow is built around—AI-based transcription and LLMs—is widely available. On the speech-to-text side, Nvidia's Canary and OpenAI's Whisper are both open source, meaning they're completely free to run on your own device. And most AI enthusiasts are already paying for OpenAI, Claude, or Google's Gemini, any of which can handle the post-processing part of Wispr Flow. So can free local tools like Ollama, Google Recorder, or Apple Intelligence.
With all this in mind, I've been wondering: Is there a good, free platform-agnostic alternative to Wispr Flow? I tried out several applications—here's what I found.
Spokenly, the Best Free Alternative
If you want to get the benefits of Wispr Flow without a subscription quickly, you could do worse than Spokenly, available on both macOS and Windows. It's not open source, but it is free to download and does not require an account to use. There's a Pro plan that costs $10 a month or $100 a year. The paid plan is only necessary if you're using Spokenly's cloud models. You can opt to use a local model instead, which is free. Alternatively, if you're already paying for a service like OpenAI or Groq, you can add your API key and use that for transcribing—that's free with Spokenly.
Spokenly offers optional post-transcription formatting. You can also choose a different LLM provider for the post-transcription formatting of text. As a Mac user, I opted to use Apple Intelligence—it's totally free and worked really well in my tests. But it supports OpenAI, Anthropic, and Groq, plus a few other LLM providers. The application also allows you to write as many custom prompts for post-transcription processing as you like, each with its own keyboard shortcut.
One of my favorite things is that Spokenly can work entirely offline. If you use a local model for transcription and a local model like Apple Intelligence for the post-transcription formatting, the entire thing works without any data leaving your computer. That's nice from a privacy perspective, and from a functionality standpoint, the feature will work even when your internet is shaky.
This is, without a doubt, more work than setting up Wispr Flow. When you're done, though, you have a working application with no monthly subscription. I recommend trying it out.
A Few Other Free Alternatives
Like I said before: AI transcription and LLMs are both widely available technologies. It should be no surprise, then, that there are many Wispr Flow alternatives out there right now.
For Mac users, the completely free and open source MacParakeet is a great option. It's open source and completely free to download and use without an account. There's also no upselling in the application. Transcribing is handled using local models, either Parakeet or Whisper, and a variety of LLMs—both local and online—are supported for the formatting step. That's the closest completely free app to Wispr Flow I've found.
VoiceInk, another Mac-only option, is open source and free to use if you download the code from GitHub and compile it yourself. The app otherwise costs $25, one time, after which you can use all features without any ongoing payments. Note that the formatting step for this requires an API key from a service such as Gemini, Anthropic, OpenAI, or Claude.
Windows and Linux users should look into FOSS Voquill, which is completely free, open source software (hence the FOSS), and works offline. It doesn't offer a formatting step, which is disappointing, but I'm including it because it's the best free Windows and Linux option I've found without any annoying upselling.
Windows users and Mac users who don't like the above options for any reason have one more choice: OpenWhispr. This open source tool doesn't require an account (but you'll have to find a tiny "Continue without an account" button). The application offers a subscription, but you can opt to set up local models and external API keys instead to avoid paying.
Do You Really Need to Type With Your Voice?
Wispr Flow has its upsides. It's easy to configure, for one thing, and has a consistent user interface. I can understand why someone might opt to pay for a subscription. But if money is tight right now, there are free options available.
I had fun exploring this growing field, but I'm going to stick to my keyboard. Wispr Flow, and apps like it, promise to let you write at the speed of thought, but I type faster than I think. If I can be philosophical for a second, writing is how I think. Typing a sentence, looking at it, and refining it isn't an annoying part of the writing process—it is the writing process. And I often don't know what my opinion on something is until I take the time to refine my thoughts. I can't help but feel a lot of that would be lost if, instead of typing, I just talked to my computer.
But every brain is different, and these tools may work well for you. Which is why I'm glad there are so many options out there.

连线杂志AI最前沿

文章目录


    扫描二维码,在手机上阅读