FFmpeg Audio Video Sync

Last night, I came across a media file in which audio playback was way behind the original video. So, to resolve the issue, I took the liberty of getting my hands dirty with ffmpeg, instead of using a media player.

For those unaware of what FFmpeg is:

FFmpeg is the leading multimedia framework, able to decode, encode, transcode, mux, demux, stream, filter and play pretty much anything that humans and machines have created.

To put it in simple words, ffmpeg not only allows you to stream audio and video, but also enables you to convert and record them.

So, as I mentioned at the beginning of the post, the audio of the media file I was playing lagged behind the video by about 15 seconds. Yikes!

Normally you could use your favorite media player to resolve such issues. For instance, you can specify the number of seconds in the “Track Synchronization” tool of VLC media player to delay/hasten the audio. However, such a large correction on a low-spec machine often results in playback with no audio at all. Even increasing the file caching has no effect. Besides, I was looking for a more permanent solution.

Fetching Media Information

:: Info :: Since we won’t be looking into how to install FFmpeg on your system, you can go to the official downloads page and follow the instructions for your platform.

Once installed, run the following command to verify your installation:

$ ffmpeg -version

The first thing required is to get the track information of the media file. For that, you use the following command:

$ ffmpeg -i input_file.mkv

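-i is the input file flag. Since no output file is given, ffmpeg prints the information about the input and then complains that at least one output file must be specified, which is fine for our purpose. The output contains a lot of information, of which the following lines are the important ones: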

Input #0, matroska,webm, from input_file.mkv:
...
Stream #0:0: Video: hevc (Main), yuv420p(tv), 1280x720 [SAR 1:1 DAR 16:9], 23.98 fps, 23.98 tbr, 1k tbn, 23.98 tbc (default)
...
Stream #0:1(eng): Audio: opus, 48000 Hz, stereo, fltp (default)
...
...
Stream #0:3(eng): Subtitle: ass (default)
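
If the report is too noisy, it can be narrowed down to just the stream lines. The following is a minimal sketch for a POSIX shell; stream_lines is a hypothetical helper name (not part of ffmpeg), and the usage comment assumes that ffmpeg writes this report to standard error, hence the 2>&1.

```shell
# Hypothetical helper: keep only the "Stream #..." lines of an ffmpeg report
# read from standard input.
stream_lines() {
    grep 'Stream #'
}

# Demo on a shortened copy of the sample report (no ffmpeg needed here).
printf '%s\n' \
    'Input #0, matroska,webm, from input_file.mkv:' \
    'Stream #0:0: Video: hevc (Main), yuv420p(tv), 1280x720' \
    'Stream #0:1(eng): Audio: opus, 48000 Hz, stereo, fltp (default)' \
    'Stream #0:3(eng): Subtitle: ass (default)' |
    stream_lines

# With a real file (assumes ffmpeg is installed):
#   ffmpeg -hide_banner -i input_file.mkv 2>&1 | stream_lines
```

The demo prints the three Stream lines and drops the Input line.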

Exploring the Media Streams

From the above output, you can see that every stream is labelled in a common format:

Stream #infile_index:stream_index

Since we specified only a single input file, the infile_index returned is 0. What if we tried the same command with two input files? Let’s try that:

$ ffmpeg -i input_file.mkv -i input_file.mkv

You will see the following additional information in the result:

...
Input #1, matroska,webm, from input_file.mkv:
...
Stream #1:0: Video: hevc (Main), yuv420p(tv), 1280x720 [SAR 1:1 DAR 16:9], 23.98 fps, 23.98 tbr, 1k tbn, 23.98 tbc (default)
...
Stream #1:1(eng): Audio: opus, 48000 Hz, stereo, fltp (default)

You may have noticed that the Stream # differs between the two inputs. Did you also notice that I specified the same media file twice? Yes. It is perfectly valid.

:: Info :: When the media file has multiple audio tracks, each one gets its own stream line, with the language of the track (if tagged) shown in parentheses, e.g. Stream #0:1(eng)
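
To make the #infile_index:stream_index format concrete, here is a rough sketch that reduces stream lines to their index pair and stream type. stream_specs is a hypothetical helper, and the sed pattern only assumes the line layout seen in the sample output above; other ffmpeg versions may print extra details that it does not account for.

```shell
# Hypothetical helper: turn "Stream #0:1(eng): Audio: ..." lines from stdin
# into "0:1 Audio", i.e. "infile_index:stream_index type".
stream_specs() {
    sed -n 's/.*Stream #\([0-9][0-9]*\):\([0-9][0-9]*\)[^:]*: \([A-Za-z]*\):.*/\1:\2 \3/p'
}

printf '%s\n' \
    'Stream #0:0: Video: hevc (Main), yuv420p(tv), 1280x720' \
    'Stream #0:1(eng): Audio: opus, 48000 Hz, stereo, fltp (default)' \
    'Stream #0:3(eng): Subtitle: ass (default)' |
    stream_specs
# prints:
#   0:0 Video
#   0:1 Audio
#   0:3 Subtitle
```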

Set Media Offset

Now that we know which streams the media file contains, our next step is to change the offset of the audio stream (we must not stray from our original goal). To change the offset, the following command is used:

$ ffmpeg -itsoffset N -i input_file.mkv

The -itsoffset flag delays/hastens the whole input file that follows it. Here N is the number of seconds by which that input should be delayed/hastened. Fractional values are also allowed. Keep the sign of N in mind when changing the offset:

A positive value of N delays the stream
A negative value of N hastens the stream
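
FFmpeg’s documentation describes the offset as being added to the timestamps of the input. The toy sketch below only illustrates that arithmetic with awk on made-up timestamps; shift_timestamps is a hypothetical helper and does not touch any media file.

```shell
# Illustration only: add an offset N (seconds) to each timestamp on stdin,
# mimicking what -itsoffset does to the timestamps of the input after it.
shift_timestamps() {
    awk -v n="$1" '{ printf "%.1f\n", $1 + n }'
}

# Audio that plays 15 s late, pulled 15 s earlier with N = -15:
printf '%s\n' 15 16.5 18 | shift_timestamps -15
# prints:
#   0.0
#   1.5
#   3.0
```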

Here is where we hit our milestone: “How to hasten the audio of a media file by 15 seconds?”

The command to do the same is as follows:

$ ffmpeg -i input_file.mkv -itsoffset -15 -i input_file.mkv

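But wait. The above command would not give us what we want! Why? For one, there is no output file yet. More importantly, we haven’t told ffmpeg which stream from which input file should end up in the result, i.e. the hastened Stream #1:1. Left to its defaults, ffmpeg picks the streams on its own and may well take both video and audio from the first, unshifted input.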

How do we do that?

Media Stream Mappings

As we know, streams are identified in the format #infile_index:stream_index. We can use this to our advantage and tell ffmpeg which stream to take from which input file. Let me make this easier to understand:

Our goal is to hasten only the audio of the media file, not the video. The video stream should be kept as it is. To achieve this, we have the -map flag, which tells ffmpeg which stream of which input file should be exported to the output file. A command will make things clearer:

$ ffmpeg -i input_file.mkv -itsoffset -15 -i input_file.mkv -map 0:0 -map 1:1

In the above command, I have specified two mappings:

-map 0:0: Video stream of the 1st input file (without offset)
-map 1:1: Audio stream of the 2nd input file (with offset of -15 secs)

You must be wondering, “Why do we have to do this mapping?” The answer, as I mentioned earlier, is that the -itsoffset flag offsets the whole input file, so the shifted audio and the untouched video have to come from two separate inputs.

Exporting

We are almost done at this point. We found out:

which streams are present in our media file, -i
how to hasten the audio stream, -itsoffset
how to tell ffmpeg which streams to take from which input, -map

Now all that remains is to export the output to a file. This is done by passing the name of the output file at the end of the whole command:

$ ffmpeg -i input_file.mkv -itsoffset -15 -i input_file.mkv -map 0:0 -map 1:1 output_file.mkv

Exporting the file might take anywhere from a few seconds to a few minutes, depending on the size of the input media. On successful completion, you will see output_file.mkv in your working directory, with synchronized audio.
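
If you expect to do this more than once, the command can be wrapped in a small shell function. This is only a sketch: sync_audio and the DRY_RUN variable are hypothetical names of my own, and the stream indices 0:0 and 1:1 are assumed to match the file from this post.

```shell
# Hypothetical wrapper around the export command.
#   $1 = input file, $2 = offset in seconds, $3 = output file
# If DRY_RUN is set to a non-empty value, the command is printed, not run.
sync_audio() {
    ${DRY_RUN:+echo} ffmpeg -i "$1" -itsoffset "$2" -i "$1" \
        -map 0:0 -map 1:1 "$3"
}

# Preview the command in a subshell without touching any file:
( DRY_RUN=1; sync_audio input_file.mkv -15 output_file.mkv )
# prints: ffmpeg -i input_file.mkv -itsoffset -15 -i input_file.mkv -map 0:0 -map 1:1 output_file.mkv
```

Calling sync_audio with DRY_RUN unset runs ffmpeg for real, so it is worth previewing the command first.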

Video synchronization works in a similar way. Just choose the video stream mapping of the offset media file and you are good to go.
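
As a sketch of that swap, the helper below prints the -map arguments for either case. map_args is a hypothetical name, and it assumes the layout used throughout this post: stream 0 is the video, stream 1 is the audio, input 0 is the untouched file and input 1 is the offset one.

```shell
# Hypothetical helper: print the -map arguments for the stream to be shifted.
map_args() {
    case "$1" in
        audio) printf '%s\n' '-map 0:0 -map 1:1' ;;  # video as-is, audio shifted
        video) printf '%s\n' '-map 1:0 -map 0:1' ;;  # video shifted, audio as-is
        *) printf '%s\n' 'usage: map_args audio|video' >&2; return 1 ;;
    esac
}

map_args video
# prints: -map 1:0 -map 0:1
```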

Bonus

Skip Re-Encoding

By default, ffmpeg will re-encode both the video and audio streams of the input media before exporting the final result, which is a time-consuming task.

What if we want to skip this re-encoding process? It can be done using the following flags:

-acodec copy: do not re-encode, just copy the audio stream
-vcodec copy: do not re-encode, just copy the video stream

So, the final command would be something like this:

$ ffmpeg -i input_file.mkv -itsoffset -15 -i input_file.mkv -map 0:0 -map 1:1 -acodec copy -vcodec copy output_file.mkv
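
The same command can also be written with -c:v copy and -c:a copy, the current spellings of -vcodec copy and -acodec copy. Below is a minimal sketch as a shell function; sync_audio_copy and DRY_RUN are hypothetical names, and the stream indices are again assumed to match the file from this post.

```shell
# Hypothetical wrapper: shift the audio without re-encoding anything.
#   $1 = input file, $2 = offset in seconds, $3 = output file
# If DRY_RUN is set to a non-empty value, the command is printed, not run.
sync_audio_copy() {
    ${DRY_RUN:+echo} ffmpeg -i "$1" -itsoffset "$2" -i "$1" \
        -map 0:0 -map 1:1 -c:v copy -c:a copy "$3"
}

# Preview the command in a subshell without touching any file:
( DRY_RUN=1; sync_audio_copy input_file.mkv -15 output_file.mkv )
```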

Convert Media Format

As mentioned at the beginning, ffmpeg can be used to convert file formats. So, what if we want the output of our media to be in a different format?

It is possible. However, in such a case, re-encoding often cannot be skipped, since the target format may not support the codecs used by the streams of the input file.

To convert the format, just change the extension of the output file and you are good to go:

$ ffmpeg -i input_file.mkv -itsoffset -15 -i input_file.mkv -map 0:0 -map 1:1 output_file.avi
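
Since ffmpeg picks the output format from the extension of the output file name, a tiny helper can derive that name for you. This is just a sketch using plain shell parameter expansion; with_ext is a hypothetical name.

```shell
# Hypothetical helper: replace the extension of a file name.
#   $1 = file name, $2 = new extension (without the dot)
with_ext() {
    printf '%s.%s\n' "${1%.*}" "$2"
}

with_ext output_file.mkv avi
# prints: output_file.avi
```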

Hope you learned something interesting. Thank you for reading. See you soon.