By Brady Betzel
One of the most time-consuming parts of editing can be dealing with the pre-post, including organizing scripts and transcriptions of interviews. In the past, I have used and loved Avid’s ScriptSync and Phrase Find. These days, with people becoming more comfortable with other NLEs such as Adobe Premiere Pro, Apple FCP X and Blackmagic Resolve, there is a need for similar technology inside those apps, and that is where Digital Anarchy’s Transcriptive plug-in comes in.
Transcriptive is a Windows- and Mac OS-compatible plugin for Premiere Pro CC 2015.3 and above. For a price, it lets the editor have a sequence or multiple clips transcribed in the cloud by either IBM Watson or Speechmatics, then downloads the transcript to your system, in sync with the clips and sequences. From there you can search for specific words, label each speaker and sort by who is speaking, or just follow an interview along with a transcript.
Avid’s ScriptSync is an invaluable plugin, in my opinion, when working on shows with interviews, especially when combining multiple responses into one cohesive answer being covered by b-roll — often referred to as a Frankenbite. Transcriptive comes close to Avid’s ScriptSync within Premiere Pro, but has a few differences, and is priced at $299, plus the per-minute cost of transcription.
A Deeper Look
Within Premiere, Transcriptive lives under the Window menu > Extensions > Transcriptive. To get access to the online AI transcription services you will need an Internet connection as well as an account with Speechmatics and/or IBM’s Watson. You’ll really want to follow along with Digital Anarchy’s manual, which walks you step by step through setting up the Transcriptive plugin.
It is a little convoluted to get it all set up, but once you do you are ready to upload a clip and get transcribing. IBM’s Watson will get you going with 1,000 free minutes of transcription a month, and from there it goes from $.02/minute down to $.01/minute, depending on how much you need transcribed. If you need additional languages transcribed it will be up-charged $.03/minute. Speechmatics is another transcription service that runs roughly $.08 a minute (I say roughly because the price is in pounds and has fluctuated in the past) and it will go down if you do more than 1,000 minutes a month.
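To put those rates side by side, here is a minimal Python sketch of the monthly math. It assumes the figures quoted above (1,000 free Watson minutes, then a flat $.02 per minute, and a flat $.08 per minute for Speechmatics). The function names are mine, and the volume discounts are left out because the exact tier boundaries are not given here, so treat the results as upper bounds.

```python
def watson_cost(minutes, free_minutes=1000, rate=0.02):
    """Upper-bound monthly Watson cost in USD.

    Assumes the quoted 1,000 free minutes and the top $.02/minute rate;
    the discount toward $.01/minute is ignored (tier sizes unknown).
    """
    billable = max(0, minutes - free_minutes)
    return round(billable * rate, 2)


def speechmatics_cost(minutes, rate=0.08):
    """Rough monthly Speechmatics cost in USD at the quoted ~$.08/minute.

    The real price is set in pounds, so this is only an approximation.
    """
    return round(minutes * rate, 2)


# Compare a light, medium and heavy month of interview footage.
for minutes in (600, 1500, 5000):
    print(minutes, watson_cost(minutes), speechmatics_cost(minutes))
```

Under those assumptions, 1,500 minutes in a month comes to about $10 on Watson versus about $120 on Speechmatics, which is the disparity discussed next.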
Your first question should be why the disparity in price, and in this instance you get what you pay for. If you aren’t as strict on accuracy, then Watson is for you — it doesn’t quite get everything correct and can sometimes fail to see when a new person is talking, even on a very clear recording. Speechmatics was faster during my testing and more accurate. If free is a good price for you then Watson might do the job, and you should try it first. But in my opinion Speechmatics is where you need to be.
When editing interviews, accuracy is extremely important, especially when searching specific key words, and this is where Speechmatics came through. Neither service has complete accuracy, and if something is wrong you can’t kick it back like you could a traditional, human-based transcription service.
To test Transcriptive I downloaded a CNN interview between Anderson Cooper and Hillary Clinton, which in theory should have perfect audio. Even with “perfect audio” Watson had some trouble when one person would talk over the other. Speechmatics seemed to get each person labeled correctly when they spoke. I would guess it missed only about 5% of the words, so about 95% accurate, while Watson seemed to be about 70% accurate.
To get your file to these services you can send your media from a sequence, from multiple clips or from a folder of clips. I favor a specific folder of clips to transcribe, as it forces some organization and my OCD assistant editor brain feels a little more at home.
As a plugin, Transcriptive is an extension inside of Premiere Pro, as alluded to earlier. Inside Premiere you have to have the Transcriptive window active when doing edits or simply playing down a clip, otherwise you will be affecting the timeline (meaning if you hit undo you will be undoing your timeline work, so be careful). Transcriptions load differently for clips than for sequences. If you transcribe individual clips using the Batch Files command, the transcription will be loaded into the infamous Speech Analysis field of the file’s metadata. In this instance you can search in the metadata field instead of the Transcriptive window.
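My percentages are eyeball estimates. If you want a harder number, you can hand-correct a short stretch of transcript and score the machine version against it. The sketch below is my own illustration, not part of Transcriptive: it counts word-level edits (substitutions, insertions and deletions) with a standard Levenshtein distance and turns that into a rough accuracy figure.

```python
def word_errors(reference, hypothesis):
    """Return (edit_distance, reference_word_count) at the word level."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # Dynamic-programming Levenshtein distance, one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, 1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, 1):
            cost = 0 if ref_word == hyp_word else 1
            cur.append(min(prev[j] + 1,          # word missing from hypothesis
                           cur[j - 1] + 1,       # extra word in hypothesis
                           prev[j - 1] + cost))  # match or substitution
        prev = cur
    return prev[-1], len(ref)


def word_accuracy(reference, hypothesis):
    """Approximate accuracy as 1 minus the word error rate, floored at 0."""
    errors, total = word_errors(reference, hypothesis)
    if total == 0:
        raise ValueError("reference transcript is empty")
    return max(0.0, 1 - errors / total)


# One wrong word out of five in this made-up example.
print(word_accuracy("the quick brown fox jumps", "the quick brown box jumps"))
```

Punctuation is compared as part of each word here, so strip it first if you only care about the words themselves.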
One feature I really like is the ability to export a transcript as markers to be placed on the timeline. In addition, you can export many different closed captioning file types such as SMPTE-TT (XML file), which can be used inside of Premiere with its built-in caption integration. SRT and VTT are captioning file types to be uploaded alongside your video to services like YouTube, and JSON files allow you to send transcripts to other machines using the Transcriptive plugin. Besides searching inside of Transcriptive for any lines or speakers you want, you can also edit the transcript. This can be extremely useful if two speakers are combined or if there are some missed words that need to be corrected.
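For anyone curious what an SRT caption file actually holds, here is a small Python sketch that writes one from a list of timed, speaker-labeled lines. The tuple layout and function names are my own for illustration and say nothing about how Transcriptive stores or exports its data. The SRT layout itself is simple: a counter, a start and end time, the text, then a blank line.

```python
def srt_timestamp(seconds):
    """Format seconds as the HH:MM:SS,mmm timestamp used by SRT."""
    ms = int(round(seconds * 1000))
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments):
    """Build SRT text from (start_sec, end_sec, speaker, text) tuples."""
    blocks = []
    for index, (start, end, speaker, text) in enumerate(segments, 1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(start)} --> {srt_timestamp(end)}\n"
            f"{speaker}: {text}\n"
        )
    # A blank line separates one caption block from the next.
    return "\n".join(blocks)


# Two made-up lines from an interview.
print(to_srt([
    (0.0, 2.5, "Speaker 1", "Thanks for joining us."),
    (2.5, 5.0, "Speaker 2", "Happy to be here."),
]))
```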
To really explain how Transcriptive works, it is easiest to compare it to Avid’s ScriptSync. If you have used Avid’s ScriptSync and then given Transcriptive a try, you likely noticed some features that Transcriptive desperately needs in order to be the powerhouse that ScriptSync is. Transcriptive does, however, have the added ability to upload your files and process them in the cloud.
ScriptSync allows the editor or assistant editor to take a bunch of transcriptions, line them up, then, for example, have every clip from a particular person in one transcription file that could be searched or edited from. In addition, there is a physical representation of the transcriptions that can be organized in bins and accessed separately from the clips. These functions would be a huge upgrade to Transcriptive in the future, especially for editors who work on unscripted or documentary projects with multiple interviews from the same people. If you use an external transcription file and want to align it with clips you have in the system you must use (and pay for) Speechmatics, which for a lower price per minute will align the two files.
Updates Are Coming
After I had finished my initial review, Jim Tierney, president of Digital Anarchy, was kind enough to email me about some updates that were coming to Transcriptive as well as a really handy transcription workflow that I had missed my first time around.
He mentioned that they are working on a Power Search function that will allow for a search of all clips and transcripts inside the project. A window will then show all the search results and can be clicked on to open the corresponding clips in the source window or sequence in the record window. Once that update rolls in, Transcriptive will be much more powerful and easier to use.
The only thing that will be hard to differentiate is if you have multiple interviews from multiple people. For instance, I might want to limit the search to only my interviews and to a specific phrase. In the future, a way to Power Search a select folder of clips or sequences may be a great way to search isolated clips or sequences, or at least an easier one than searching all clips and sequences.
The other tidbit Jim mentioned was using YouTube’s built-in transcriptions in your own videos. Before you watch the tutorial keep in mind that this process isn’t flawless. While you can upload your video to YouTube in private mode, the uploading part may still turn away a few people who have security concerns. In addition, you will need to export a low-res proxy version of your clip to transcode, which can take time.
If you have the time, or have an assistant editor with time, this process through YouTube might be your saving grace. My two cents is that with some upfront bookkeeping like tape naming, and some corrections after transcribing, this could be one of the best solutions if you aren’t worried about security.
Regardless, check out the tutorial if you want a way to get supposedly very accurate transcriptions via YouTube’s transcriber. In the end it will produce a VTT transcription file that you import back into Transcriptive, where you will need to either leave it alone or spend time adjusting it, since the VTT file comes back without punctuation. The main benefit of the VTT file from YouTube is that the timecode is carried back to Transcriptive, so each word can be clicked on and the video will line up to it.
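To see why the timecode survives the round trip, it helps to look at what a VTT file holds: a WEBVTT header followed by cues, each with a start time, an end time and the text. The Python sketch below is my own rough reader for that basic layout, not Transcriptive’s importer. It ignores cue settings and does not handle any inline word-timing tags that a given file might contain.

```python
import re

# Matches WebVTT timestamps such as 00:01:02.500 or 01:02.500 (hours optional).
CUE_TIME = re.compile(r"(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})")


def to_seconds(stamp):
    """Convert a WebVTT timestamp string to seconds."""
    match = CUE_TIME.fullmatch(stamp)
    if not match:
        raise ValueError(f"not a WebVTT timestamp: {stamp!r}")
    hours, minutes, secs, ms = match.groups()
    return int(hours or 0) * 3600 + int(minutes) * 60 + int(secs) + int(ms) / 1000


def parse_vtt(text):
    """Return a list of (start_sec, end_sec, text) cues from WebVTT text."""
    cues = []
    # Blocks are separated by blank lines; the header block has no timing line.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        for idx, line in enumerate(lines):
            if "-->" in line:
                start, rest = line.split("-->", 1)
                end = rest.split()[0]  # drop cue settings such as align:start
                cue_text = " ".join(lines[idx + 1:])
                cues.append((to_seconds(start.strip()), to_seconds(end), cue_text))
                break
    return cues


# A tiny made-up file in the same shape.
sample = """WEBVTT

00:00:01.000 --> 00:00:03.500
hello and welcome

00:00:03.500 --> 00:00:05.000 align:start
thank you for having me
"""
print(parse_vtt(sample))
```

Each cue carries its own start and end time, which is what lets a transcript line be matched back to a spot in the video.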
All in all, there are only a few options when working with transcriptions inside of Premiere. Transcriptive did a good job at what it did: uploading my file to one of the transcription services, acquiring the transcript and aligning the clip to the timecoded transcript with identifying markers for speakers that can be changed if needed. Once the Power Search gets ironed out and put into a proper release, Transcriptive will get even closer to being the transcription powerhouse you need for Premiere editing.
If you work with tons of interviews or just want clips transcribed for easy search you should definitely download Digital Anarchy’s Transcriptive demo and give it a whirl.
You can also find a ton of good video tutorials on their site. Keep in mind that the Transcriptive plugin runs $299 and you have some free transcriptions available to you through IBM’s Watson. If you want very accurate transcriptions, though, you will need to pay for Speechmatics, or you can try YouTube’s built-in transcription service, which charges nothing.
Brady Betzel is an Emmy-nominated online editor at Margarita Mix in Hollywood, working on Life Below Zero and Cutthroat Kitchen. You can email Brady at firstname.lastname@example.org. Follow him on Twitter @allbetzroff.