
Audio or Video Transcription: Which One Do You Need?


By: Kevin

15 Sep 2022

An old saying goes, “trust what you see, not what you hear,” and it neatly captures why audio and video transcription matters.

Transcription is a skilled process that involves listening to a recording, researching the topic to get an in-depth understanding of it, and then accurately typing the spoken language into text format. This can be done verbatim (exact word-for-word), or the transcriptionist can clean up the text to improve readability in certain parts of the speech.

What is audio or video transcription?

Transcription is the process of listening to a recording, researching the topic, understanding the context, and then accurately converting the speech into text. The transcript may be verbatim (word for word), or the transcriptionist may edit specific passages of the speech. Done well, the process can take a lot of time.

How do you transcribe audio or video?

Transcription is carried out by a qualified transcriber, and the approach depends on the intended use. If time codes are not needed, no specialized software is required and any skilled audio typist can produce the transcript. If the transcript is being made from a video for voice-over or subtitle translation, the transcriber will need the appropriate software and training. Because the time codes must be recorded in hours, minutes, seconds, and frames, subtitling transcription takes substantially longer than other types of transcription. The transcriber must also allow the viewer enough time to read each subtitle displayed on the screen.
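
For readers curious what a time code looks like in practice, here is a minimal Python sketch that turns a frame count into the hours, minutes, seconds, and frames layout described above. The function name `frames_to_timecode` and the default of 25 frames per second are assumptions chosen for illustration. Real subtitling software may use other frame rates or drop-frame counting, which this sketch does not handle.

```python
def frames_to_timecode(total_frames: int, fps: int = 25) -> str:
    """Format a frame count as HH:MM:SS:FF.

    Assumes a whole-number frame rate and non-drop-frame counting
    (both are illustrative assumptions, not taken from the article).
    """
    if total_frames < 0:
        raise ValueError("total_frames must not be negative")
    if fps <= 0:
        raise ValueError("fps must be positive")

    # Peel off frames, then seconds, then minutes; what is left is hours.
    total_seconds, frames = divmod(total_frames, fps)
    total_minutes, seconds = divmod(total_seconds, 60)
    hours, minutes = divmod(total_minutes, 60)

    return f"{hours:02d}:{minutes:02d}:{seconds:02d}:{frames:02d}"
```

For example, `frames_to_timecode(90137)` gives `01:00:05:12` at 25 frames per second.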

What influences the process, and how long does it take?

The total time needed to transcribe a recording depends on a number of factors. Let's look at the main ones.

The Recording's Topic - Any audio that requires research into terminology and spelling takes longer to transcribe. An hour-long interview about university education would generally take an experienced transcriber 4 hours to transcribe. By contrast, an hour-long session on hepatocarcinoma clinical trials would likely take 5–6 hours, because the transcriptionist needs to look up the medical terms to ensure accurate spelling.

Several Speakers - Recordings with multiple speakers usually incur an additional fee. This is reasonable because it takes a lot of work to accurately transcribe several speakers, such as a focus group in which 5–10 people speak quickly and frequently talk over one another.

Audio Quality - Outdoor recordings made without external microphones typically contain background noise or echo, making it difficult to make out every word that is spoken.

Style of Transcription - The transcription style determines the level of detail in the transcript. The most common styles are verbatim, intelligent verbatim, and true verbatim. True verbatim transcripts are extremely thorough and capture every nuance of the audio, including non-speech sounds such as laughter. This takes considerably more time than intelligent verbatim, which omits these details.
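
To make the topic figures above concrete, here is a rough Python sketch that scales them to a recording of any length. It assumes the work grows in direct proportion to the audio length, which the article does not claim, and the names `estimate_hours`, `GENERAL_RATIO`, and `SPECIALIST_RATIO_RANGE` are made up for this example. It ignores speakers, audio quality, and style, since no numbers are given for those.

```python
# Hours of transcription work per hour of audio, from the examples above.
GENERAL_RATIO = 4.0                  # e.g. a university education interview
SPECIALIST_RATIO_RANGE = (5.0, 6.0)  # e.g. a clinical trials session


def estimate_hours(audio_minutes: float, specialist: bool = False) -> tuple[float, float]:
    """Return a (low, high) estimate of working hours for a recording.

    Assumes effort scales linearly with audio length (an assumption
    made for illustration only).
    """
    if audio_minutes < 0:
        raise ValueError("audio_minutes must not be negative")

    audio_hours = audio_minutes / 60
    if specialist:
        low_ratio, high_ratio = SPECIALIST_RATIO_RANGE
    else:
        low_ratio = high_ratio = GENERAL_RATIO

    return (audio_hours * low_ratio, audio_hours * high_ratio)
```

Under these assumptions, a 90-minute specialist recording comes out at roughly 7.5 to 9 hours of work.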

When is transcription necessary?

There are many reasons why transcription of video or audio content is needed. Below, we have listed some of these reasons and their benefits.

Deaf and hard-of-hearing audiences. In this case, your transcripts can open up services to a new audience.
Voice-over purposes. These kinds of transcriptions are generally used for corporate and explainer videos.
Translation purposes. The first step in any translation process almost always involves transcription, and the transcript is used to create an array of foreign language versions of the original document.
Closed captions and subtitling. If you’re adding captions and subtitles to your video content, a professional transcription is the first step in the right direction.

Did you know that transcribing audio and video content can boost SEO?

It turns out that you can gain some immediate SEO advantages by transcribing material such as podcasts, webinars, and videos and posting the transcripts alongside your content. According to studies, pages with transcripts generated 16% more revenue on average than pages without them, and YouTube videos with captions received 7.32% more views overall.

You'll probably find that the best transcription tools are not cheap. The most sophisticated ones can cost between $50 and $150 each. If you need to transcribe a lot of recordings or confidential material, use a professional transcription service.

There is no authoritative study of the average worldwide cost per audio hour of transcription. Looking at basic transcription prices, you may find that most companies charge, on average, less than $3 per audio minute. Other companies charge by the line instead, at between 9 and 15 cents per line of transcription.
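
If you want to compare the two pricing models mentioned above, a small Python sketch like the following may help. The function names are hypothetical, the $3 per-minute rate is treated as an upper bound because the article says "less than $3", and the number of lines in a transcript is something you would have to estimate yourself. Amounts are kept in whole cents to avoid rounding surprises.

```python
def per_minute_cost_cents(audio_minutes: int, rate_cents: int = 300) -> int:
    """Upper-bound cost in cents when billed per audio minute.

    The 300-cent default reflects the "less than $3" figure above.
    """
    return audio_minutes * rate_cents


def per_line_cost_cents(lines: int, low_cents: int = 9, high_cents: int = 15) -> tuple[int, int]:
    """Return the (low, high) cost in cents when billed per transcript line."""
    return (lines * low_cents, lines * high_cents)


def format_dollars(cents: int) -> str:
    """Render a whole-cent amount as a dollar string, e.g. 18000 -> '$180.00'."""
    return f"${cents // 100}.{cents % 100:02d}"
```

For a 60-minute recording, per-minute billing would come to at most `format_dollars(per_minute_cost_cents(60))`, which is `$180.00`. A transcript assumed to run to 1,000 lines would cost between `$90.00` and `$150.00` under per-line billing.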

Factors Influencing The Transcription Process

The time it takes a transcriptionist to produce an audio or video file transcript depends on several factors. Here’s a look at the key factors that can impact transcription time:


Subject Matter

If the topic involves a lot of research, the file will naturally take longer to transcribe. A professional transcriber should need about four hours to transcribe an hour-long university educational video. But if the subject is clinical trials for something like cancer treatments, it would probably take up to six hours, because a fair amount of research is required to ensure the spelling in the transcript is accurate.

Number of Speakers

Multiple speakers in a conversation complicate the transcription process, so expect to pay more for your transcripts if several speakers are involved.

Transcription Style

The style of transcription determines how detailed the final transcript must be. You can choose between verbatim, intelligent verbatim, and true verbatim. True verbatim includes every detail, like laughter and ambient sounds. It takes much longer to produce a true verbatim transcript than an intelligent verbatim one, where background noises are left out.