Reordering frames, extracting I-frames, and other ffmpeg tricks
By Robert Russell
There are a bunch of ffmpeg commands I find myself copying and pasting over and over. Maybe you do too. Here are mine.
I’ll explain the flags for this first one in a lot of detail; if I skip a flag on a later example, it’s probably described here. And I’ll link to some more thorough docs at the end.
ffmpeg -i /mnt/nas/photos/PXL_20230128_171859165.mp4 -vf select='between(t\,200\,210)*eq(pict_type\,I)*gt(pts-prev_pts\,0.500)' -vsync 0 -frames:v 100 /mnt/nas/frames-PXL_20230128_171859165/img-%03d.jpg
Flags
- -i: input file. Should be a video since it has to have frames.
- -vf: filter video. Replace it with -filter:v unless your ffmpeg is old. The select is explained in detail below.
- -vsync 0: I copied this for years without knowing why; it’s passthrough mode, which keeps ffmpeg from duplicating frames to fill the timestamp gaps between the frames the filter selects. Newer versions spell it -fps_mode passthrough.
- -frames:v 100: take up to 100 video frames in total. Just like -vf, -vframes is a deprecated version of this flag.
Filter
I use the select filter in a lot of these. It chooses specific frames to pass on to the output based on criteria expressed in its own little query language.
select='between(t\,200\,210)*eq(pict_type\,I)*gt(pts-prev_pts\,0.500)'
between(), eq(), and gt() are combined with the *, which performs a logical AND. I think of it as multiplying the functions together: each one evaluates to 1 if its condition is true and 0 if it’s false. The select part means we only keep frames where the expression on the right side of the = is true.

- between(t\,200\,210) is true only for frames whose time is between 200 and 210 seconds. I write \, instead of a bare comma because ffmpeg’s filtergraph parser treats an unescaped comma as a separator between filters; the single quotes keep bash out of it, but you might need to escape other characters like & or > if they show up in your command line. t is the time of the current frame counting up from 0 seconds. It’s not the Presentation Timestamp, which comes from a metadata field in the video frame.
- eq(pict_type\,I) is true only when the frame being considered is an I-frame (or perhaps also an IDR frame).
- gt(pts-prev_pts\,0.500) uses two variables: pts is the Presentation Timestamp of the frame currently being considered, and prev_pts is the Presentation Timestamp of the previous frame. The overall condition is true when the difference between the current and previous PTS is over half a second.
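The multiply-as-AND trick is easy to check in plain shell arithmetic. The variable names below are just stand-ins I made up for the three conditions:

```shell
# Pretend each condition has already evaluated to 1 (true) or 0 (false)
between=1      # between(t,200,210)
is_iframe=1    # eq(pict_type,I)
far_enough=0   # gt(pts-prev_pts,0.500)

# Multiplying 0/1 values behaves exactly like a logical AND:
# the product is 1 only if every factor is 1
echo $(( between * is_iframe * far_enough ))   # prints 0: one false condition rejects the frame
```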
Description
Combining all of the conditions for this select, we get I-frames between 200 and 210 seconds into the video that are over half a second apart. Depending on how often I-frames show up in the stream, it could be one frame every half second or it could be longer between frames.
The select filter is explained around line 19,000 of man ffmpeg-filters along with some helpful examples.
The last part, with the img-%03d.jpg, is the output. Since this command will output a bunch of JPEG files, ffmpeg interprets it as a template. The %03d part is replaced with a 3-digit number which counts up with each file written. If you want more digits, make the 3 bigger. If you don’t want it zero-padded, get rid of the 0. If you need a literal % sign in the filename then try %% and maybe that’ll work. There’s not a lot more to the template string.
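ffmpeg’s output template uses the same %-style numbering as printf, so you can preview what the filenames will look like right in the shell:

```shell
# %03d pads to 3 digits with zeros, and grows wider when the number needs it
printf 'img-%03d.jpg\n' 1 2 100 1000
```

which prints img-001.jpg, img-002.jpg, img-100.jpg, and img-1000.jpg.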
Also, ffmpeg seems to figure out what the input and output filetypes are using the extensions mp4 and jpg. I’d guess that it interrogates the input stream too.
Extracting More I-Frames
ffmpeg -i PXL_20230414_142110054.mp4 -vf select='eq(pict_type\,I)' -vsync 0 -frames:v 100 img-%03d.jpg
This is similar but simpler. Take up to 100 I-frames from the video and write them out with names like img-001.jpg, img-002.jpg, img-003.jpg … img-100.jpg. (The numbering starts at 1 by default.)
Here’s a different select criterion.
ffmpeg -i PXL_20230414_142110054.mp4 -vf select='isnan(prev_selected_t)+gte(t-prev_selected_t\,0.4)' -vsync 0 -frames:v 200 img-%03d.jpg
Filter
select='isnan(prev_selected_t)+gte(t-prev_selected_t\,0.4)'

- isnan() handles the first frame, where prev_selected_t is NaN (“not a number”). The + acts as a logical OR here, the same way * acts as an AND.
- gte(t-prev_selected_t\,0.4) is easier to read as t - prev_selected_t >= 0.4. gte means “greater than or equal to”.
- prev_selected_t is the playing time (in seconds) of the last frame that the select passed to the output. “Last” as in most recent, or previous; not the frame at the end of the stream. It’s handy for cases exactly like this where we want to pick a frame based on the current frame’s t and the value of t from the most recent frame that we used.
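Here’s a little bash sketch of what that expression does over a made-up list of frame times. The timestamps are invented for the example, and they’re in milliseconds so the arithmetic stays integer:

```shell
# prev stands in for prev_selected_t; -1 plays the role of the initial NaN
prev=-1
kept=""
for t in 0 100 450 700 900 1000; do
  # isnan(prev_selected_t) + gte(t - prev_selected_t, 400ms)
  if (( prev < 0 || t - prev >= 400 )); then
    kept+="$t "
    prev=$t
  fi
done
echo "$kept"   # 0 450 900
```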
Extracting a frame every 5 seconds
ffmpeg -i source -vf fps=1,select='not(mod(t,5))' -vsync 0 z%d.jpg
Filter
select='not(mod(t,5))' selects a frame every 5 seconds. We need not() because mod(t,5) is 0 for the frames we want, and 0 means false; not() flips it to true. The fps=1 in front reduces the stream to one frame per second first, so t lands on whole seconds.
And the files will have names like z1.jpg, z2.jpg, z3.jpg …
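Once fps=1 has made t a whole number of seconds, mod(t,5) is just t % 5. A quick bash loop (with times I made up) shows which frames survive:

```shell
picked=""
for t in 0 1 4 5 9 10 15; do
  # not(mod(t,5)): true exactly when t divides evenly by 5
  if (( t % 5 == 0 )); then picked+="$t "; fi
done
echo "$picked"   # 0 5 10 15
```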
Getting just the video stream
Most general purpose video files these days are actually container formats. Sometimes you need to get just the video stream itself out of the container though. Suppose you have a Matroska file with an H.265 (HEVC) video stream in it. To copy just the frames of that stream out to a new file you can use a command like this.
ffmpeg -i containerin.mkv -c:v copy -bsf hevc_mp4toannexb justastream.h265
- -c:v copy: copy the video stream without transcoding it.
- -bsf hevc_mp4toannexb: which bitstream filter to use. Not all bitstream filters are available on all installations. Run ffmpeg -bsfs to see what’s available on the system you’re using. My laptop has h264_mp4toannexb but not hevc_mp4toannexb. My desktop does have the HEVC bitstream filter. 1
The resulting stream can be played by VLC but some other video players might balk at it.
Get 50 I-frames, convert to greyscale, and output a raw h265 stream
ffmpeg -i videoin.mp4 -bsf hevc_mp4toannexb -frames:v 50 -filter:v monochrome,select='eq(pict_type\,I)' -f hevc monochrome.h265
-f hevc forces the output to be HEVC format. I think I did that because ffmpeg won’t guess what I meant from the weird “.h265” file extension.
Getting single HEVC frames from the stream isn’t as easy as I’d like, but this very inefficient hack I came up with might be useful to you in some other way. I have 50 frames and my framerate is 10fps. So I used the GNU bc command to calculate the time when each frame should happen, and I supplied that to the -ss flag, which gives the start time offset.
for f in {00..49} ; do ffmpeg -y -i justastream.h265 -start_at_zero -ss 0$( echo "$f * 0.1" | bc ) -frames:v 1 justastream/frame-${f}.h265 ; done
-start_at_zero might not be needed here. The docs say it’s only relevant with -copyts.
How would you put these back together into a stream again? A stream is just a bunch of frames one after the other, so cat can do it.
cat justastream/*.h265 > justastream-all.h265
Though I’m relying on bash globbing to give filenames sorted in order here.
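Zero-padding is what makes that glob ordering safe. With unpadded numbers, lexicographic glob order and numeric order disagree. A quick demo with throwaway filenames in a temp dir:

```shell
dir=$(mktemp -d)
touch "$dir"/frame-9.h265 "$dir"/frame-10.h265

# Unpadded: "1" sorts before "9", so frame-10 lands first (wrong order)
(cd "$dir" && echo frame-*.h265)      # frame-10.h265 frame-9.h265

touch "$dir"/pad-09.h265 "$dir"/pad-10.h265
# Zero-padded names sort the same lexically and numerically
(cd "$dir" && echo pad-*.h265)        # pad-09.h265 pad-10.h265

rm -r "$dir"
```

The {00..49} in the extraction loop above is doing exactly this padding.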
Reordering frames in video
Finally, here’s a little weird experiment I did. Ed Yong’s book (I read it last summer) had an analogy about seeing frames of video swapped in a specific order. Apparently some birds sing their songs with sounds in the same order every time, but they seem to care more about hearing the complete set of sounds than about the order. Or something like that. The author made a comparison to watching a show where the order of each set of frames was swapped.
So I made a video like that. First extract the frames:
ffmpeg -i ducks-on-ice.mp4 frames/f-%04d.jpg
Then the author says for each set of frames {n, n+1, n+2} reorder as {n+2, n+1, n}. Here’s how I reassembled the frames with a messy bash command:
s=frames/;t=reordered-frames/;f="f-%04d.jpg"; for n in {1..898..3} ; do n0=$(printf $f $(( $n )) ); n1=$(printf $f $(( $n+1 )) ); n2=$(printf $f $(( $n+2 )) ); cp "${s}${n0}" "${t}${n2}" ; cp "${s}${n1}" "${t}${n1}" ; cp "${s}${n2}" "${t}${n0}" ; done
- for n in {1..898..3} counts 1, 4, 7, … up to 898. We add 2 later, so we use 900 frames.
- n0=$(printf $f $(( $n )) ) and the similar ones with n1 and n2 each use the format string "f-%04d.jpg" to make sets of three names like f-0010.jpg, f-0011.jpg, f-0012.jpg.
- After making the filenames, the three cp commands just copy the files with n0 and n2 swapped into the reordered-frames folder.
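To convince yourself the loop really swaps each triple, here’s the same arithmetic run on bare numbers instead of files:

```shell
# {n, n+1, n+2} becomes {n+2, n+1, n} for each group of three
out=""
for n in {1..9..3}; do
  out+="$((n+2)) $((n+1)) $n "
done
echo "$out"   # 3 2 1 6 5 4 9 8 7
```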
Now we just combine frames to create a video.
ffmpeg -framerate 30 -i reordered-frames/f-%04d.jpg reordered-ducks-on-ice.mp4
Getting more help
Get help on a specific filter with a command like:
ffmpeg -h filter=v360
This is useful since it’ll give you information from the version of ffmpeg that you’re actually using. Your system might have a different set of filters or hardware codecs compiled in. The help says
-h type=name -- print all options for the named decoder/encoder/demuxer/muxer/filter/bsf/protocol
Which I find confusing. It means that the word “type” can be replaced with any of the others. Like these examples:
ffmpeg -h decoder=mp3
ffmpeg -h encoder=text
ffmpeg -h demuxer=mpjpeg
ffmpeg -h muxer=rtsp
ffmpeg -h filter=trim # Why didn't I find this one earlier??
ffmpeg -h bsf=trace_headers
ffmpeg -h protocol=file
This post has been in my drafts folder for a long time but when I saw this cool site it reminded me I should tell people what I’ve figured out too.
And here are the ffmpeg docs, also cool. I’ve been using ffmpeg for years and I still feel like I’m just scratching the surface. Flipping through the docs once in a while helps understand what you can do with ffmpeg. There are multitudes of tools that integrate ffmpeg or just act as a frontend for it. Understanding ffmpeg makes it easier to use all those tools too.
1. Some of the bitstream filters are interesting but most are boring. If you want to see details on all of them, try this and get ready to scroll through a lot of output:
for b in $(ffmpeg -bsfs) ; do echo; echo ${b} ; ffmpeg -help bsf=${b} ; done
Or just ffmpeg -help full | less. ↩︎