The 9 Best Speech-to-Text Apps in 2023 (Tried & Tested)


Most speech recognition apps have no problem transcribing a native speaker recorded with a pro microphone in a quiet room. This isn’t a challenge.

So to test them more thoroughly, I created a nightmare recording of two non-native speakers with big-city background noise.

How did they fare?

Let’s find out.

Otter.ai homepage

Otter was one of the most often mentioned solutions when we asked for suggestions on Twitter and in the Ahrefs community. And for good reason. It is easy to set up, has an intuitive interface, and offers a wide range of pricing options.

Unique features

What stands out from the rest is the app’s ability to record online meetings and transcribe them, simply by pasting the meeting URL. But you can also import a video/audio file or record audio right in the app.

You can also connect your calendar so you never miss a meeting.

Transcript quality

I got decent results, but there was a lot to edit too.

It didn’t get some names right. But I can’t blame any tool for not picking up “Ahrefs” or “Tim Soulo” 100% of the time.

Otter.ai transcription results

One thing I found is that after it notified me the transcriptions were ready, it might still be doing something in the background (adjusting time stamps, tagging speakers, etc.). Like a student still scribbling on a test paper while handing it to the teacher.

Pricing

You can start for free and upgrade to a paid plan later. You can import up to three files and record 290 minutes of meetings before you need to upgrade (as of April 2023).

Rev.com homepage

Setting up an account was a no-brainer. I found the interface easy to navigate as well. One personal remark is that it felt a little too “cold” to use since I saw things like “Place Order,” “Billing,” and “Invoice” way too often.

You might get the impression that it was designed by an accounting team (as opposed to Descript, which comes next in this roundup).

Unique features

Besides auto-generated transcripts, Rev offers live captions for Zoom meetings. You also have the option to place an order for human transcriptions.

Transcript quality

Poor audio with city noise was a bit too much for Rev. Some words were missing, while others were misrecognized. As a result, some paragraphs didn’t make much sense, while others were fine.

Rev.com transcription results

Pricing

You can transcribe the first audio file (up to 45 minutes) for free. I got a bill for $1.25 with a discount that resulted in a total of $0.00. Thanks, accounting team. 😉

Rev also has a 14-day trial of its paid plan. But that was tricky to find: you need to go to the footer of the homepage and look for it under “Services.”

Footer of the homepage, via rev.com

Descript's homepage

Descript welcomed me by name (which was a nice coincidence). The main thing you have to know is that it is standalone software rather than a web service. It is much more than a speech-to-text converter. It’s basically a video editing tool. And there’s definitely a learning curve. But thankfully, onboarding is fun and engaging.

Descript's onboarding is interactive and engaging

Unique features

As I mentioned, Descript is more of a video editing tool that is good at transcribing. I’d call it “Canva for video/captions.” You can add B-rolls, effects, animations, and more.

You can easily drag and drop and basically produce a complete video with its help. But if you just need a transcript or captions of a video or audio, you can do that too.

Transcript quality

My sample audio produced rather muddy results. At times, it had trouble recognizing abbreviations (e.g., SEO). I also had a problem with removing filler words like uh and um.

I found that if I didn’t choose the option to remove them, they, um, just stayed there even though I didn’t need them most of the time. But if I did choose to remove them, it occasionally ate up parts of other words, causing even more trouble.

Also, it couldn’t recognize parts that a human being would have no problem understanding just from context, e.g., “Jack of all trades” became “jackal, trades.”

On the bright side, I believe you can still understand what the text is about.
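The filler-word trouble described above is easy to reproduce on plain text. The sketch below is not how Descript works internally (it edits audio as well as text); it only illustrates, on a transcript string, why matching whole words matters when stripping fillers. The filler list and the sample sentences are assumptions made up for illustration.

```python
import re

# Hypothetical filler list for illustration; real tools use their own.
FILLERS = ("uh", "um")

# \b anchors the match to whole words, so the "um" inside "umbrella"
# or "hummed" is left alone. Commas hugging the filler are removed too.
_FILLER_RE = re.compile(
    r",?\s*\b(?:" + "|".join(FILLERS) + r")\b,?",
    flags=re.IGNORECASE,
)


def remove_fillers(text: str) -> str:
    """Strip standalone filler words and tidy leftover whitespace."""
    cleaned = _FILLER_RE.sub("", text)
    return re.sub(r"\s{2,}", " ", cleaned).strip()


print(remove_fillers("They, um, just stayed there."))
print(remove_fillers("Um, the umbrella hummed."))
```

Without the `\b` word boundaries, the same pattern would also cut into words that merely contain "um" or "uh", which is the kind of damage you want to avoid.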

Descript transcription results

Pricing

You can start with basic functions for free and upgrade if needed.

MacWhisper app on gumroad.com

MacWhisper is a transcription tool powered by Whisper, an automatic speech recognition (ASR) system developed by OpenAI, the same company that brought us ChatGPT.

As OpenAI states on its website:

Whisper is trained on 680,000 hours of multilingual and multitask supervised data collected from the web.

Whisper is not something you can simply “run” as is. What’s more, it is pretty complex to set up if you do want to run it yourself. GitHub, Python: you get the gist.

Luckily, there are tools like MacWhisper that take this off your shoulders and let you use the power of AI in a simple user interface.
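For a sense of what running Whisper yourself involves, here is a minimal sketch. It assumes the open-source `openai-whisper` Python package is installed (`pip install openai-whisper`) along with ffmpeg; the function names `transcribe` and `format_timestamp` are my own, and this is not what MacWhisper does under the hood.

```python
from __future__ import annotations


def format_timestamp(seconds: float) -> str:
    """Render a segment start time in seconds as HH:MM:SS."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def transcribe(audio_path: str, model_name: str = "small") -> list[str]:
    """Transcribe a file with Whisper and return timestamped lines."""
    # Imported lazily so the helper above works without the package.
    import whisper  # assumption: installed via `pip install openai-whisper`

    # Smaller models load and run faster; "large" is slower.
    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path)
    return [
        f"[{format_timestamp(segment['start'])}] {segment['text'].strip()}"
        for segment in result["segments"]
    ]
```

Calling `transcribe("interview.mp3", model_name="large")` (the file name is a placeholder) would download the model on first use and return one line per recognized segment. Like MacWhisper, this gives you time stamps but no speaker tags.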

Unique features

Just plain speech-to-text recognition with time stamps. Unfortunately, it doesn’t auto-tag the speakers.

Transcript quality

When you run the tool, you have to choose a “model” to work with. Basically, the lighter the model, the quicker it will run. But larger models will produce better results. Also, in MacWhisper, those larger (better but slower) models are only available in the paid version.

I decided to start with the free “small” model, which was stated to have “normal speed with good accuracy.”

It was OK, but no better than the competitors. I assumed it would work well with high-quality audio, but not with the horrible examples I fed to it.

“AI is overrated,” I thought. But before closing the Mac and switching back to my beloved Windows PC, I decided to give the “large” model a try.

And you know what? AI is not overrated. I found the results to be much better than anything else.

The transcript was really, really good. It even got things like “Ahrefs” and “SaaS” right! Though still not 100% of the time.

MacWhisper transcription results

Pricing

You can run smaller models for free. For a large model, you’ll need to purchase a license.

AI Transcriptions by Riverside homepage

This tool is the easiest to use. Simply drag and drop your file, and it’s ready. It takes some time to process, though.

Unique features

Nothing besides downloading a transcription.

Transcript quality

My first impression was that the results were perfect because, visually, it delivered a confident-looking text:

AI Transcriptions by Riverside transcription results

But after proofreading, I realized that it simply did not include the parts it failed to recognize, sometimes several words in a row.
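Silent omissions like these are hard to spot by eye precisely because the text reads smoothly. If you have a hand-checked reference for even a short stretch of the recording, a word-level diff will surface them. The sketch below is not how I evaluated the tools in this review; it uses Python’s standard `difflib`, and the function names and sample sentences are made up for illustration.

```python
from __future__ import annotations

import re
from difflib import SequenceMatcher


def words(text: str) -> list[str]:
    """Lowercase the text and split it into words, ignoring punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def compare_transcripts(reference: str, hypothesis: str) -> dict:
    """List words the tool dropped, swapped, or added versus a reference."""
    ref, hyp = words(reference), words(hypothesis)
    report = {"dropped": [], "substituted": [], "inserted": []}
    matcher = SequenceMatcher(a=ref, b=hyp, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "delete":  # present in the reference, missing in the output
            report["dropped"].extend(ref[i1:i2])
        elif tag == "replace":  # misrecognized span
            report["substituted"].append(
                (" ".join(ref[i1:i2]), " ".join(hyp[j1:j2]))
            )
        elif tag == "insert":  # words the tool added
            report["inserted"].extend(hyp[j1:j2])
    return report


# Invented sample sentences, not taken from any tool's actual output.
report = compare_transcripts(
    "We talk about SEO and content marketing every day.",
    "We talk about content every day.",
)
print(report["dropped"])
```

Note that `difflib` produces a readable alignment rather than a guaranteed minimal edit distance, so treat the counts as a proofreading aid, not a formal accuracy score.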

Pricing

It’s free to use.

Adobe Premiere Pro homepage

Premiere Pro is not exactly a “transcription tool” but rather video editing software. I’m including it because I assume that some companies may already have it in their arsenal (like we do).

To get to the transcription feature in Premiere Pro, just go to the “Captions and graphics” workspace and click “Create transcription.”

Premiere Pro interface: you can create transcriptions in the "Captions and graphics" workspace

Unique features

If we take only speech recognition into account here, what it does well is creating precise time stamps, auto-tagging the speakers and, if needed, automatically adding an editable captions track to a video project.

Transcript quality

Let’s be straightforward: I found the noisy audio transcript to be a failure. I couldn’t comprehend what people were talking about in the first place.

Adobe Premiere Pro transcription results

Still, I think this feature can be really helpful if you are creating captions from high-quality audio. I used it myself several times and had nothing to complain about when the recording quality was good.

Pricing

You need an Adobe Creative Cloud subscription to use Premiere Pro.

Happyscribe.com homepage

While signing up and uploading files is rather straightforward, you have to spend some time answering questions about you and your company before you can finally get to the tool itself. And no, you can’t skip typing in your company name, your role, and your company size.

But once you get through this, the interface is clean and intuitive.

Unique features

You can create a transcript or captions for video or audio. There is also an option to request a manual review of the transcript. Alternatively, you can create subtitles in a different language, so you have transcription and translation in one click.

Happy Scribe features include transcription, subtitles, and foreign language subtitles

Transcript quality

Happy Scribe did a really good job transcribing the audio. It had no problem with words like “SEO” and “SaaS” (obviously the weakest point for many tools). It could also auto-tag the speakers, which might be helpful in certain situations.

Happy Scribe transcription results

Pricing

I could test one file for free. After that, I would need to buy credits, which are spent per minute of video or audio transcribed.

Sonix.ai homepage

Sonix is a tool for automatic transcriptions, translations, and integration with meeting apps.

Unique features

Besides meeting integration, which is almost a given for most tools, AI summary generation is an interesting feature (in beta as of April 2023). Even so, I already got great results from it.

AI summary from Sonix

You also get some extra tools to work with video captions: a timeline view and an option to split captions into several lines. You can also import an existing transcript, and Sonix will sync it with the audio.

Transcript quality

Sonix has a custom vocabulary feature. I found that it helped a bit with names like “Tim Soulo” and “Ahrefs,” but it didn’t work 100% of the time. It mostly did well. But at times, it mistook SEO for CEO and returned the word “Excel” seemingly out of nowhere.

The transcript made sense in general but would require quite a lot of edits if it needed to be perfect.

Sonix.ai transcription results

Pricing

Sonix has a free trial for 25 minutes of transcriptions. After that, you need to purchase pay-as-you-go credits or get a subscription.

Notta.ai homepage

Notta is yet another transcription service that works for both real-time meetings and existing recordings.

Unique features

Besides transcription, Notta focuses on streamlining certain workflows and offers features such as calendar sync and a scheduler (in beta as of April 2023).

Transcript quality

Background noise and poor audio quality were not deal breakers for Notta. The transcription results turned out mostly OK but still had some problems.

Notta.ai transcription results

Sentence structure was sometimes a bit weird, certain words went missing, and my favorite “Jack of all trades” part wasn’t that neat this time.

Inconsistency in Notta's transcription

Another thing worth noting is that, for some reason, it failed to recognize two speakers, and the whole interview was tagged as “Speaker 1.”

Pricing

You can start with a free basic subscription and try a three-day trial of the paid plan, Notta Pro.

Final thoughts

As you can see, there are plenty of tools to choose from. Still, it seems that OpenAI stirred things up a bit by releasing a free ASR (automatic speech recognition) system, which I found to be considerably more capable than the others.

But pure speech recognition quality is just one factor. Maybe you do need to record your Zoom meetings (Otter), work with captions in a large video project (Premiere Pro), or quickly create a Canva-style video (Descript).

Also, I need to stress that I was trying to push these tools to the edge by giving them a worst-case scenario recording. For more natural uses, the differences in the outcome might be much less noticeable.

It’s great to see that there are so many options out there, and I hope this review will help a bit in finding the one that is perfect for you.

Got questions? Ping me on Twitter.
