FFmpeg Series: complex filters to trim and concat files

FFmpeg is a great tool and we can't emphasize enough how important it is in the modern video processing stack. As John Carmack himself said:


We are very grateful to all the contributors of FFmpeg, and we wanted to share some of the ffmpeg tips and tricks we use with MediaMachine. We're kicking off this series in the hope that it will be helpful to people just starting with FFmpeg.

Ready for an interesting bite-sized ffmpeg trick?

Split & Concat the same input files at different time-stamps

Here is a neat way to take snippets from different points in an input video and concat them into a single output.

ffmpeg \
-i "$INPUT_FILE" \
-filter_complex \
"[0:v]trim=start=5:duration=10,setpts=PTS-STARTPTS[v0]; \
[0:a]atrim=start=5:duration=10,asetpts=PTS-STARTPTS[a0]; \
[0:v]trim=start=20:duration=10,setpts=PTS-STARTPTS[v1]; \
[0:a]atrim=start=20:duration=10,asetpts=PTS-STARTPTS[a1]; \
[v0][a0][v1][a1]concat=n=2:v=1:a=1[final]" \
-map "[final]" concat-video.mp4

Let's break this down:

  • -i "$INPUT_FILE"

    • the -i flag sets an input file. You can pass -i several times; the inputs are numbered in the order they appear, starting from 0 (zero).
  • -filter_complex

    • this introduces a filter graph. Note: filters work on decoded (raw) frames, so any stream that passes through a filter cannot be stream-copied; ffmpeg has to decode and re-encode it (i.e. slower).
  • [0:v]

    • this is an input label written in ffmpeg's stream specifier syntax. Read this as: Select input 0, then select its (v)ideo stream.
  • trim=start=5:duration=10

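    • the trim filter keeps one continuous section of its input: here it starts at 5 seconds and keeps the next 10 seconds, so the segment covers 5s to 15s of the source. Note that : separates multiple options within the same filter, while , chains filters one after another (as in trim=...,setpts=...).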
  • atrim

    • as you might have guessed, atrim is an audio filter that works just like trim but on audio streams.
  • setpts & asetpts

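    • these filters rewrite the timestamps on the output stream (for video and audio streams respectively). trim and atrim keep the original timestamps, so a segment cut at 20 seconds would still start at 20 seconds. PTS-STARTPTS shifts each segment so that it starts at zero, which lets concat join the segments back to back for smooth playback.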
  • [v0][a0][v1][a1]concat=n=2:v=1:a=1[final]

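    • The trailing [v0], [a0], [v1], [a1] at the end of each filter line are labels we assign to the output of that filter chain.
    • This final line can be read as: take the inputs video0, audio0, video1, audio1, concat them as n=2 segments, and output 1 video stream (v=1) and 1 audio stream (a=1). Note that concat has two outputs here, and the single [final] label names only the first one, the video. The audio output is left unlabeled, and ffmpeg adds unlabeled filter outputs to the first output file, so concat-video.mp4 still gets both streams.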
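
Writing these graphs by hand gets tedious once you have more than a couple of segments. Here is a minimal sketch of how the same graph could be generated from a list of START:DURATION pairs. The helper name build_trim_concat_graph is our own invention for illustration (it is not part of ffmpeg), and the sketch labels both concat outputs explicitly as [outv] and [outa], which is an assumption about how you may want to map them rather than what the command above does.

```shell
#!/bin/sh
# Hypothetical helper (not part of ffmpeg): print a filter_complex graph that
# trims each START:DURATION segment from input 0 and concatenates them in order.
build_trim_concat_graph() {
  graph=""
  pads=""
  i=0
  for seg in "$@"; do
    start=${seg%%:*}      # text before the first colon
    duration=${seg#*:}    # text after the first colon
    graph="${graph}[0:v]trim=start=${start}:duration=${duration},setpts=PTS-STARTPTS[v${i}];"
    graph="${graph}[0:a]atrim=start=${start}:duration=${duration},asetpts=PTS-STARTPTS[a${i}];"
    pads="${pads}[v${i}][a${i}]"
    i=$((i + 1))
  done
  # concat produces one video and one audio output, so we give it two labels
  # (outv and outa are names chosen for this sketch).
  printf '%s%sconcat=n=%d:v=1:a=1[outv][outa]\n' "$graph" "$pads" "$i"
}

# The same two segments as the command above: 5s-15s and 20s-30s.
build_trim_concat_graph 5:10 20:10

# Possible use with ffmpeg (left commented so the sketch runs without it):
# ffmpeg -i "$INPUT_FILE" \
#   -filter_complex "$(build_trim_concat_graph 5:10 20:10)" \
#   -map "[outv]" -map "[outa]" concat-video.mp4
```

With both outputs labeled, each one needs its own -map, as in the commented command. Adding a third segment is then just one more argument, for example 40:5.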

Bonus - concat 3 different inputs

With our understanding of filters and naming, we can extend this idea to work with 3 files like this:

ffmpeg \
-i "$INPUT_FILE_1" \
-i "$INPUT_FILE_2" \
-i "$INPUT_FILE_3" \
-filter_complex \
"[0:v]trim=start=5:duration=10,setpts=PTS-STARTPTS[v0]; \
[0:a]atrim=start=5:duration=10,asetpts=PTS-STARTPTS[a0]; \
[1:v]trim=start=20:duration=10,setpts=PTS-STARTPTS[v1]; \
[1:a]atrim=start=20:duration=10,asetpts=PTS-STARTPTS[a1]; \
[2:v]trim=start=20:duration=10,setpts=PTS-STARTPTS[v2]; \
[2:a]atrim=start=20:duration=10,asetpts=PTS-STARTPTS[a2]; \
[v0][a0][v1][a1][v2][a2]concat=n=3:v=1:a=1[final]" \
-map "[final]" concat-video.mp4
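
One thing to watch for with different input files: the concat filter expects every segment to have the same frame size, so files with different resolutions need to be scaled first. Below is a rough sketch of a graph generator that takes one START:DURATION pair per input and adds scale and setsar to each video chain. The helper name build_multi_input_graph, the 1280x720 default target, and the [outv]/[outa] labels are all assumptions made for this example; pick values that fit your own files.

```shell
#!/bin/sh
# Assumed target frame size for every segment (override via the environment).
WIDTH=${WIDTH:-1280}
HEIGHT=${HEIGHT:-720}

# Hypothetical helper (not part of ffmpeg): the first START:DURATION argument
# applies to input 0, the second to input 1, and so on.
build_multi_input_graph() {
  graph=""
  pads=""
  i=0
  for seg in "$@"; do
    start=${seg%%:*}      # text before the first colon
    duration=${seg#*:}    # text after the first colon
    # Trim, reset timestamps, then bring the video to a common size and
    # sample aspect ratio so concat accepts segments from different files.
    graph="${graph}[${i}:v]trim=start=${start}:duration=${duration},setpts=PTS-STARTPTS,scale=${WIDTH}:${HEIGHT},setsar=1[v${i}];"
    graph="${graph}[${i}:a]atrim=start=${start}:duration=${duration},asetpts=PTS-STARTPTS[a${i}];"
    pads="${pads}[v${i}][a${i}]"
    i=$((i + 1))
  done
  printf '%s%sconcat=n=%d:v=1:a=1[outv][outa]\n' "$graph" "$pads" "$i"
}

# The same three segments as the command above.
build_multi_input_graph 5:10 20:10 20:10

# Possible use with ffmpeg (left commented so the sketch runs without it):
# ffmpeg -i "$INPUT_FILE_1" -i "$INPUT_FILE_2" -i "$INPUT_FILE_3" \
#   -filter_complex "$(build_multi_input_graph 5:10 20:10 20:10)" \
#   -map "[outv]" -map "[outa]" concat-video.mp4
```

Note that a plain scale to a fixed size will stretch inputs whose aspect ratio differs from the target, so this sketch is only a starting point for mixed sources.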
