How To Transcribe A Video Automatically In 2020
This past week we have had all hands on deck for a video project we are working on for the Department of Defense and the University of Dayton. We needed a way to transcribe up to 16 hours of video, fast. After some research, we came across this video, which explains how you can use a tool within Google Docs to transcribe audio “automatically”.
After some testing and tweaking, I have come up with a much more efficient way to transcribe a video automatically. In fact, unlike many other methods out there, this one actually is automated. It’s a simple five-step process that involves a computer, external speakers (or powerful headphones), and a Google Doc. So forget everything you thought you knew about automated video transcribing, and listen up.
Step 1: Two Users, One Doc
The first thing you have to do is open your browser of choice (so, Chrome), sign in to Google, and open a Google Doc. Let’s call this “Browser 1”. Now open a second browser (“Browser 2”), or pop open an incognito window, and repeat the process, except this time sign in to Google as a different user. Make sure the doc is shared with that second account so it can edit. You should now have two windows open to the same Google Doc, logged in as two different users. Simple enough.
Step 2: Connect a Speaker
This step is pretty straightforward. In Browser 1, go to Tools -> Voice Typing. This opens the Voice Typing tool, which lets you speak into the microphone of your computer or headphones and transcribes what you say. Now find a quality external speaker you can connect to. Quality matters here because the Voice Typing tool has trouble with recorded voices. I also found a substitute if you don’t happen to have external speakers. I used my Audio-Technica headphones as speakers by turning the volume all the way up. I assume any quality headphones (Beats, Bose, etc.) would work too.
Step 3: Let the Computer Talk to Itself
Now locate your computer’s microphone. Wherever it is, place the external speaker or headphones as close to it as possible. Find the video or audio file you would like to transcribe, click the microphone icon in the Voice Typing box so it starts listening, and press play. You will now see the Google Doc start typing out what is said. Of course, it’s not perfect. As other blogs have suggested, the Voice Typing tool gets about 3-5 words wrong per page, and that’s with a live voice. Transcribing over speakers was slightly worse in my tests, with 5-7 words wrong per page. What makes this process efficient is what comes next.
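To put those error counts in perspective, here is a rough sketch in Python. The 500 words per page figure is my own assumption, not something I measured for this post, so treat the percentages as ballpark numbers only.

```python
# Rough error-rate estimate from "mistakes per page" counts.
# ASSUMPTION: 500 words per page is a hypothetical figure, not from our tests.
WORDS_PER_PAGE = 500


def error_rate(mistakes_per_page: float, words_per_page: int = WORDS_PER_PAGE) -> float:
    """Return the fraction of words transcribed incorrectly."""
    return mistakes_per_page / words_per_page


# (low, high) mistakes per page reported above
scenarios = {
    "live voice": (3, 5),
    "over speakers": (5, 7),
}

for label, (low, high) in scenarios.items():
    print(f"{label}: {error_rate(low):.1%} to {error_rate(high):.1%} of words wrong")
```

Swap in your own words-per-page count if your doc uses a different font size or spacing.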
Step 4: Fix the Mistakes
This is where Browser 2 comes in. In the same Google Doc, start at the top and fix any mistakes the Voice Typing tool makes as it goes along. The problem with the original method is that you have to play the video or audio once while speaking it out loud, then play it again to find the mistakes. This new method cuts that time in half, since you only have to play the file once to both transcribe and fix the mistakes.
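The time savings come down to simple arithmetic, sketched here in Python. The 16 hours is the upper end of our project, and the pass counts follow the two methods as described above. The function name is just for illustration, and I am assuming that any cleanup done outside of playback takes the same time either way.

```python
# Compare total playback time for the two methods.
# ASSUMPTION: only playback time is counted; cleanup outside playback is ignored.


def playback_hours(video_hours: float, passes: int) -> float:
    """Total listening time if the recording is played `passes` times."""
    return video_hours * passes


video_hours = 16  # upper bound for our project

original = playback_hours(video_hours, passes=2)  # speak along, then re-check
two_browser = playback_hours(video_hours, passes=1)  # transcribe and fix at once
saved = original - two_browser

print(f"Original method:    {original} hours of playback")
print(f"Two-browser method: {two_browser} hours of playback")
print(f"Saved: {saved} hours ({saved / original:.0%})")
```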
Step 5: Don’t Forget Punctuation!
The Voice Typing tool does not punctuate automatically unless you say “period”, “comma”, etc. So once the video segment is over, or after a page is completed, pause the video. Go over that section and add punctuation where necessary; it should be apparent where it belongs. Whichever method you use for transcribing via Google’s Voice Typing, this step is always necessary.
That’s it! This is the only method I have seen that cuts as much as half the time and effort out of the already improved transcribing process. If you have a better way to transcribe a video, let us know in the comments below. We have been using this method lately with our videos, and it has saved a ton of time in post-production so far.