

I encountered a similar problem and found a solution that worked, at least for me. Somehow concat + encoding fixed the issue, but I don't want to re-encode the videos and lose quality each time.

Detailed Sample Files

I recorded a sample video, added the commands to chop it up, then concatenated it, which resulted in the following data:

    ffprobe -v quiet -show_entries stream=start_time,duration output.MOV

[Image: the original (red) vs. the segment (green)]

[Link: full ffprobe output]

Since then I've tried to use -to and -t in different places along with -af apad -c:v copy, and I've still failed to get the durations to be the same.
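The ffprobe command above prints start_time and duration for every stream, so one quick way to check a clip for mismatched timings is to compare where each stream ends. Below is a minimal sketch of that check, not necessarily how anyone in this thread did it. The function name `stream_mismatch` is made up for illustration, the sample values are invented, and it assumes ffprobe's default `[STREAM]` ... `[/STREAM]` output layout. Values reported as N/A are treated as 0.

```shell
#!/bin/sh
# Hypothetical helper: reads ffprobe's default-format output on stdin and
# prints the gap (in seconds) between the latest and earliest stream end,
# where a stream's end is start_time + duration.
stream_mismatch() {
    awk -F= '
        /^\[STREAM\]/      { start = 0; dur = 0 }    # new stream section
        $1 == "start_time" { start = $2 }
        $1 == "duration"   { dur = $2 }
        /^\[\/STREAM\]/    {                         # section closed: record its end
            finish = start + dur
            if (n == 0 || finish > max) max = finish
            if (n == 0 || finish < min) min = finish
            n++
        }
        END { printf "%.3f\n", max - min }
    '
}

# Demo with invented values (video 10.010 s, audio 10.031 s).
printf '%s\n' \
    '[STREAM]' 'start_time=0.000000' 'duration=10.010000' '[/STREAM]' \
    '[STREAM]' 'start_time=0.000000' 'duration=10.031000' '[/STREAM]' \
    | stream_mismatch

# Real use would look like:
#   ffprobe -v quiet -show_entries stream=start_time,duration output.MOV | stream_mismatch
```

A result close to zero means the streams end together. Anything larger is worth a look before joining clips with -c copy.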

With re-encoding, the audio stays in sync, as ffmpeg does the full conversion calculations and seems to get everything right. However, simply concatenating the videos without any transformation or encoding results in a slowly increasing sync issue. Obviously, encoding the videos rather than simply joining them will result in a loss of information/quality, so I would rather find a way around this problem.

I've tried several flags to sort out this problem, which appears to be based on the timestamps:

    ffmpeg -f concat -fflags +genpts -async 1 -i segments.txt test.mov
    ffmpeg -auto_convert 1 -f concat -fflags +genpts -async 1 -i segments.txt -c copy test2.mov
    ffmpeg -f concat -i segments.txt -c copy -fflags +genpts test3.mp4
    ffmpeg -f concat -fflags +genpts -async 1 -i segments.txt -copyts test4.mov
    ffmpeg -f concat -i segments.txt -copyts test5.mov
    ffmpeg -f concat -i segments.txt -copyts -c copy test6.mov
    ffmpeg -f concat -fflags +genpts -i segments.txt -copyts -c copy test7.mov

None of these seem to correct the problem, though.

Note: all other questions that I could find on SO seem to "fix" the problem by simply encoding the videos over again.

I realized the concat wasn't the problem. The original set of clips had mismatched timestamps.
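To picture why a small per-clip mismatch turns into a slowly increasing sync issue, here is a rough sketch of the arithmetic. It is a simplified model, not a description of what ffmpeg does internally: it assumes each stream ends up laid back to back after a stream copy, so the difference between audio and video duration simply adds up clip by clip. The function name `cumulative_drift` and the durations are invented for illustration.

```shell
#!/bin/sh
# Hypothetical helper: reads one "<video_seconds> <audio_seconds>" pair per
# line and prints the running total of (audio - video) after each segment.
cumulative_drift() {
    awk '{
        drift += $2 - $1
        printf "segment %d: drift %.3f s\n", NR, drift
    }'
}

# Invented example: every clip has 10.010 s of video and 10.031 s of audio.
printf '%s\n' '10.010 10.031' '10.010 10.031' '10.010 10.031' | cumulative_drift
```

Under these assumed numbers the offset is only 21 ms after the first clip, but it keeps growing with every clip that is appended, which matches the "fine at first, worse later" symptom.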

Joining multiple files using ffmpeg concat seems to result in a mismatch of the timestamps or offsets for the audio. I've tried with several videos and noticed the same problem for H.264 / MP4. Using concat and encoding the video seems to work fine.
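For anyone reproducing this, the segments.txt passed to -f concat is a plain text list with one `file '<path>'` entry per clip. A minimal sketch of generating it follows. The helper name `make_concat_list` and the part1.mov etc. file names are placeholders, and it assumes the paths contain no single quotes, which would need escaping.

```shell
#!/bin/sh
# Hypothetical helper: print one concat-demuxer "file" entry per argument.
# Assumes no path contains a single quote.
make_concat_list() {
    for segment in "$@"; do
        printf "file '%s'\n" "$segment"
    done
}

# Placeholder segment names; substitute the real clips.
make_concat_list part1.mov part2.mov part3.mov > segments.txt

# The list is then used as in the commands from this post, for example:
#   ffmpeg -f concat -i segments.txt -c copy output.mov
```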
