As journalists, we spend a lot of time transcribing audio recordings into text that is then used for articles. We’re not the only ones with this problem though: academics and researchers, students, and even people who attend a lot of meetings and need to keep everything organised will have ended up with a long transcription queue at some point.
Our normal workflow to deal with this has been to keep the audio file playing in QuickTime in the background, as we type in a text editor. There are a couple of obvious problems with this — for one, things like pausing and moving back and forward are needlessly complicated as you move between programs, and for another, controlling playback speed to suit your typing speed isn’t easy either. In short, it’s a really bad workflow.
As a result, we’re always on the lookout for a good app that can solve this problem because it would make life a lot easier — in one instance where the volume of work was too high, we actually resorted to getting someone from Freelancer.com to help transcribe a book’s worth of research notes, but that’s not a great solution if you are on a limited budget. We decided to ask people what they’re using, and check on tech sites and forums like Product Hunt and Reddit, to find out what the best options are. We came across a lot of recommendations, and then using some of our interview recordings, took them all for trial runs to see what could be a long term solution.
From there, we’ve narrowed things down to just a few options that we thought were the best, and the list includes some very different types of solutions. There are basically three ways to end up with a transcript. You can do it manually, using tools that make the process more efficient. You can get a computer generated transcript, which is going to be full of errors, but will at least get you started, and thus reduce the amount of time you spend on a project. Or you could pay someone to turn the transcript around for you, like we did with Freelancer.com. We focussed on the first two methods, and here are our top picks.
Transcribe by Wreally
The top recommendation across various platforms, Transcribe is an option we also liked for its simplicity and effectiveness. Transcribe is basically an audio player with a notes tool built in, that lets you listen to the recording and make your notes in the same place. You can use keyboard shortcuts for a number of important playback related features, and the combination is a serious step up from using a text editor with QuickTime in the background.
The tool runs on your computer in a browser window, but it also works offline. You can upload the audio, and save the text locally, without any issues. The audio file plays with controls on the top of the page, and there’s a text box below where you can enter the text, complete with formatting, and then export it as a .DOC file, if needed. Shortcuts using the function keys let you pause and play, speed up or slow down the audio, add a timestamp to the text, and so on. If you’re a Mac user, you’ll want to go to settings and have the keys work as function keys rather than controlling things like your brightness and volume, but otherwise it’s the same.
This is obviously a better solution than our normal transcription workflow, and using Transcribe by Wreally, we were able to convert a 30 minute recording into usable text in just over 45 minutes, something that used to take us an hour or a little bit longer.
There’s also an interesting workaround if you want to transcribe without typing. Although Transcribe won’t automatically convert an uploaded audio file into text, you can dictate the words and it’ll type them up for you. This only works in Chrome, so it’s possibly using Google’s speech to text APIs. Whatever the engine, the results are fairly accurate, although it’s not the best solution. For one thing, you get the occasional substitution, where «find» becomes «third», and «numerous» becomes «pneumatic». For another, it’s just not a great experience to keep repeating everything you’re hearing: you can either listen to the recording or say the words, so it’s hard to keep track, and it requires a lot of pausing and moving back and forth. We also had an issue where the cursor wouldn’t consistently move forwards. Despite these drawbacks, once you have used the dictation function for a while, you get used to its quirks, and it is fast and reliable enough.
Transcribe isn’t free though — the free trial lasts for a week, and after that you have to pay a $20 annual license. That’s a pretty good deal if you use it a lot, though it may feel a little expensive if you aren’t using it often.
You can try Transcribe out for yourself for a week and see if it’s a good fit. If you’re looking for a free alternative, check out oTranscribe. It’s a great option with almost all the same features, but it lacks the dictation mode, so you’ll have to type the whole text.
Trint
Trint is a pretty straightforward service: you upload an audio file, and it automatically transcribes it online. When it’s done (how long that takes depends on the length of the audio file), you’ll get an email to let you know, so you can close the window and do other work in the meantime. It didn’t take much time though; a 10 minute file took just about four minutes to process.
However, Trint doesn’t just provide a text file. Instead, after transcribing, it provides a powerful text editor that allows you to listen to the playback while editing the text, just like Transcribe. You can even tag different sections of text by speaker, or add highlights. You can also add strikethrough to text, which tells Trint to skip those parts when playing the audio. When you’re done, you can export the text as a .DOC file or a .SRT subtitle file, or if you only need parts of the file, you can choose to export only the highlights.
You can change the playback speed, show a timestamp for every paragraph, or navigate the text by moving back and forth through the audio file. As the audio plays, the related text is highlighted as well, so it’s very easy to keep track. It’s pretty great, though one limitation is that you can only use it on your computer — there are no iOS or Android apps.
The accuracy of the transcription also leaves something to be desired. «Go on and on» somehow turned into «they don’t», while «obnoxious, arrogant» became «block every». Our favourite though was «are the envy of» becoming «zombie yo». By and large though, the text is pretty clean, with around 70 percent of it being correct; and it can speed up the transcription a lot to have this as a starting point.
You’ll be charged at $12 per hour of audio, which isn’t a bad rate, particularly since the recording and the transcript (with all the edits that you make) are always available whenever you need them. You can try Trint for 30 minutes free and see how well it suits your needs. If you’re not interested in paying, you can also use Scribie, which offers unlimited free machine transcription.
Scribie is a little less accurate, and does best with very clear audio and an American accent. In our experience with the same interview text, it was probably around 60 percent accurate to Trint’s 70, although interestingly, the two made different mistakes. Some of the best slip ups were «students» becoming «Shodan», and «Ivy League» turning into «idli». The company says it takes up to 30 minutes to transcribe, though our 20 minute clip took between four and five minutes.
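The accuracy figures above are rough impressions, and the sketch below is not how they were produced. If you want to put a number on a tool for your own recordings, one possible approach (an assumption on our part, not something either service documents) is to hand-correct a short passage, then count how many of its words also appear, in the same order, in the machine transcript. Here is a minimal Python sketch of that idea; the function name word_accuracy is hypothetical.

```python
from difflib import SequenceMatcher


def word_accuracy(reference: str, machine: str) -> float:
    """Fraction of words in the hand-corrected reference that also
    appear, in the same order, in the machine transcript.

    This is a rough yardstick for illustration, not a formal
    word error rate.
    """
    # Compare word by word, ignoring case. Punctuation is left
    # attached to words, so strip it first if that matters to you.
    ref_words = reference.lower().split()
    machine_words = machine.lower().split()
    if not ref_words:
        return 0.0

    # autojunk=False stops difflib from discarding very common
    # words when the transcripts are long.
    matcher = SequenceMatcher(None, ref_words, machine_words, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(ref_words)
```

Calling word_accuracy(corrected_text, machine_text) returns a value between 0 and 1, so 0.7 would correspond to the "around 70 percent" kind of figure quoted for Trint. A short passage of a minute or two is enough to compare tools on the same recording.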
Scribie also offers a human-processed transcript, for which it charges $0.60 (roughly Rs. 40) per minute, with a maximum turnaround time of five days. A rush job has a 12-hour turnaround time, and is priced at $2.40 (just over Rs. 150) per minute.
Descript
If you liked the idea of Trint but thought that the interface left something to be desired, and didn’t like the idea of running an app in your browser, give Descript a shot instead. The app is free, and comes with 30 minutes of free transcription, after which you’ll pay $0.15 (roughly Rs. 10) per minute, which is pretty reasonable.
Descript has a great looking Mac app that lets you do all the things that Trint does, starting with an automatic transcription, and then letting you edit the text. You can mark text to be skipped during audio playback, correct errors, and create a smooth script that matches the audio perfectly. It’s really great and has all the features you need in an interface that we loved.
As you move through the text, it shows your place in the audio file as well, and allows you to publish the edited audio and text to the Web if you like. It’s powered by Google Speech, and it’s quite accurate, although there are obviously still some errors. We found it to be close to 80 percent accurate, as long as the audio was clear, without overlap, and ideally with American accents.
Descript also offers a monthly subscription plan, where you pay $10 per month up front, but your fee then drops to $0.07 per minute, which sounds like a good option for heavy users.
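To see where the subscription starts to pay off, here is a minimal Python sketch of the arithmetic, using only the Descript prices quoted above. It assumes a single month of use and ignores the 30 free minutes and any taxes. The function names are our own, purely for illustration, and prices are kept in whole cents so the sums stay exact.

```python
# Descript prices as quoted in this article, in US cents.
PAYG_CENTS_PER_MIN = 15      # $0.15 per minute, pay as you go
SUB_FEE_CENTS = 1000         # $10 per month, paid up front
SUB_CENTS_PER_MIN = 7        # $0.07 per minute on the subscription


def pay_as_you_go_cents(minutes: int) -> int:
    """Monthly cost without a subscription."""
    return PAYG_CENTS_PER_MIN * minutes


def subscription_cents(minutes: int) -> int:
    """Monthly cost with the subscription: flat fee plus reduced rate."""
    return SUB_FEE_CENTS + SUB_CENTS_PER_MIN * minutes


def break_even_minutes() -> float:
    """Minutes per month at which both plans cost the same."""
    return SUB_FEE_CENTS / (PAYG_CENTS_PER_MIN - SUB_CENTS_PER_MIN)


def cheaper_plan(minutes: int) -> str:
    """Name of the cheaper plan (pay-as-you-go wins ties)."""
    if pay_as_you_go_cents(minutes) <= subscription_cents(minutes):
        return "pay-as-you-go"
    return "subscription"
```

On these numbers the two plans cost the same at 125 minutes of audio in a month ($18.75 either way). Below that, paying per minute is cheaper, and above it the subscription wins.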
You can download Descript free, and try it out for a 30 minute file to get a sense of how it works, before either paying or signing up for a subscription. A Windows version is coming in January 2018. There is no mobile version for Descript either.
In our experience, Descript was probably the best tool of the bunch, though its per minute pricing isn’t fully convenient. As of now, we’re inclined towards Transcribe by Wreally, since it offers an annual subscription with no additional cost, and the dictation mode is a step up from oTranscribe. There were also a number of mobile apps that promised similar experiences, but in our testing they were limited. Transcription that involves a fair amount of typing on a touchscreen still leaves something to be desired, and it’s best to stick with these PC-based options instead.
What about you, which one do you think suits you best? Tell us, and the other readers, via the comments below.