If you've got user-generated videos on your site that you want users to engage with, you need them to make a good first impression.
Great thumbnails are the key to that, but a thumbnail has to do more than give the viewer an idea of the content: a great, engaging thumbnail can increase user engagement by almost 2x.
Too many sites are extracting thumbnails with one of these methodologies from the Hall of Lame:
- Select a frame from the first few seconds.
- Pick a random frame from the video.
Since a thumbnail is an image, we can apply objective visual quality measurement algorithms to it. This lets us build good filters for selecting great thumbnails instead of picking a random frame from the video and hoping it's not blank because someone covered the camera with their finger.
There are different objective measures that we can consider to define a good picture:
Luminance is a photometric measure of the luminous intensity per unit area of light traveling in a given direction. The "quick and easy" definition of brightness is the subjective, perceived luminance of an image.
We can use the more specific relative luminance with sRGB color spaces:
For humans, green light contributes the most to perceived intensity and blue light contributes the least. So this is a handy filter we can use while selecting great thumbnails.
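As a rough sketch of how such a luminance filter could look, the snippet below uses the standard sRGB relative luminance weights (Y = 0.2126 R + 0.7152 G + 0.0722 B on linearized channels). The function names and the idea of averaging over all pixels are illustrative assumptions, not a description of any particular product's filter.

```python
def srgb_to_linear(channel):
    """Undo the sRGB transfer curve for one 8-bit channel value (0-255)."""
    c = channel / 255
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(r, g, b):
    """Relative luminance of one sRGB pixel: 0.0 is black, 1.0 is white."""
    return (
        0.2126 * srgb_to_linear(r)
        + 0.7152 * srgb_to_linear(g)
        + 0.0722 * srgb_to_linear(b)
    )


def mean_luminance(pixels):
    """Average relative luminance over an iterable of (r, g, b) tuples."""
    values = [relative_luminance(r, g, b) for r, g, b in pixels]
    if not values:
        raise ValueError("image has no pixels")
    return sum(values) / len(values)
```

A frame whose mean luminance is close to 0 is almost certainly a black frame, so a simple threshold on this value is enough to reject it. Where to put that threshold is something you would tune on your own videos.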
Uniformity defines how much the intensity values of different pixels change within an image. Let's imagine that we represent the intensity value of a pixel as a function of its color and brightness:
Then we can find the difference between two pixels in the image:
We say an image has high uniformity if the pixel intensity differences are generally close to 0.
There are more sophisticated ways to measure the uniformity of images which involve aggregating the intensity into a histogram and then computing a distribution sum for the top N bins.
For example, the image on the left has high uniformity while the one on the right is less uniform.
Generally, highly uniform images have less useful information and tend to look bland, so we can discard them.
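The histogram approach mentioned above can be sketched in a few lines. This is a minimal illustration that assumes 8-bit grey intensities; the bin count and the number of top bins are hypothetical defaults, not recommended values.

```python
from collections import Counter


def top_bin_share(pixels, bins=16, top_n=2):
    """Fraction of pixels that fall in the top_n most populated histogram bins.

    pixels is a flat iterable of 8-bit intensities (0-255). A value near 1.0
    means almost every pixel has a similar intensity, i.e. high uniformity.
    """
    pixels = list(pixels)
    if not pixels:
        raise ValueError("image has no pixels")
    bin_width = 256 / bins
    histogram = Counter(min(int(p / bin_width), bins - 1) for p in pixels)
    top = sum(count for _, count in histogram.most_common(top_n))
    return top / len(pixels)


flat_patch = [120] * 100            # a single grey tone
spread_out = list(range(0, 256, 2))  # intensities spread over the full range
```

Here `top_bin_share(flat_patch)` is 1.0 while `top_bin_share(spread_out)` is 0.125, so a cutoff somewhere in between would discard the bland frame and keep the varied one.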
According to the formal definition, contrast is the difference in luminance or color that makes objects distinguishable from each other.
Contrast defines how objects look in the image (compared to other things in the same image). A great thumbnail should aim for a high contrast value. It means that all the objects in the thumbnail are visible and distinguishable with the appropriate brightness.
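There are several ways to put a number on contrast. One common choice, used here purely as an example and not necessarily the measure behind the results in this post, is RMS contrast: the standard deviation of the normalized pixel intensities.

```python
from statistics import pstdev


def rms_contrast(pixels):
    """RMS contrast of a flat iterable of 8-bit grey intensities (0-255).

    Returns 0.0 for a completely flat image and 0.5 for an image that is
    half pure black and half pure white.
    """
    values = [p / 255 for p in pixels]
    if not values:
        raise ValueError("image has no pixels")
    return pstdev(values)
```

A candidate frame with a very low score has little separation between its objects and the background, which makes it a weak thumbnail.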
Sharpness is the amount of detail an imaging system can reproduce. To select a great thumbnail, aim for high sharpness in the image. Of course, you can expect some blur in any image (photographers often play with blur to create some fantastic pictures).
To get a crisp thumbnail image, look for a good level of sharpness in the areas that matter most, like people's faces, text, etc.
On the other side of the spectrum is blur, or the lack of sharpness. There are different kinds of blur, but for videos, one of the most common is motion blur: the blur that you notice when the camera is moving. Blur is also highly pronounced when a scene transition happens. A blurry frame rarely makes a good thumbnail, unless you're trying to recreate the Blurry World of Claude Monet.
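A widely used heuristic for telling sharp frames from blurry ones is the variance of the Laplacian: sharp edges produce strong second-derivative responses, blurred ones do not. The sketch below is one possible implementation in plain Python, assuming the frame is already a grid of grey intensities; it is an illustration of the idea rather than the exact filter used for the thumbnails shown here.

```python
from statistics import pvariance


def laplacian_variance(image):
    """Variance of the 4-neighbour Laplacian over the interior of an image.

    image is a list of equally long rows of grey intensities and must be at
    least 3x3. Higher values indicate more fine detail (a sharper image).
    """
    height, width = len(image), len(image[0])
    if height < 3 or width < 3:
        raise ValueError("image must be at least 3x3 pixels")
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            responses.append(
                image[y - 1][x] + image[y + 1][x]
                + image[y][x - 1] + image[y][x + 1]
                - 4 * image[y][x]
            )
    return pvariance(responses)


hard_edge = [[0, 0, 0, 255, 255, 255] for _ in range(5)]       # crisp step
soft_ramp = [[0, 51, 102, 153, 204, 255] for _ in range(5)]    # smeared step
```

The hard edge scores far higher than the smooth ramp, which scores 0. In practice you would compute this on the regions you care about, such as faces or text, rather than on the whole frame.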
Now that we're equipped with some tools to evaluate thumbnails, let's see how to extract them from videos.
Automatically picking the right thumbnail from a video is not an easy task, especially if you have videos with black frames, lots of movement, etc.
Let's use this video as an example. It starts in darkness and has a bunch of motion blur:
One solution is to use ffmpeg's thumbnail filter to create a thumbnail.
Let's try the command:
We get the following image:
As you can see, the result looks like a black hole took a selfie. 🙁
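For reference, a typical way to call the thumbnail filter is sketched below as a small Python helper that builds the argument list. The file names are placeholders, and the flags shown are one common invocation, assumed here for illustration; they are not necessarily the exact command that produced the image above.

```python
def thumbnail_command(src, dst):
    """Build an ffmpeg call that writes one frame chosen by the thumbnail filter."""
    return [
        "ffmpeg",
        "-i", src,            # input video
        "-vf", "thumbnail",   # let ffmpeg pick a representative frame
        "-frames:v", "1",     # write a single video frame
        dst,                  # output image, e.g. a .png or .jpg
    ]


# To run it: subprocess.run(thumbnail_command("input.mp4", "thumb.png"), check=True)
```

Passing the arguments as a list rather than a shell string avoids quoting problems with file names that contain spaces.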
We can try to improve this by first trimming all the black frames using ffmpeg's blackframe filter.
First, we use ffprobe to detect all the black frames. This command returns a list of (start, end) pairs with information about when black frames start and end.
The output of this command is of the form:
We can pick 3.23657 as the start of our trimmed video and 32.9329 as the end.
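Picking those two numbers by eye does not scale, so here is a hedged sketch of doing it in code. It assumes the probe output contains `lavfi.black_start` and `lavfi.black_end` tags, which is the format written by ffmpeg's `blackdetect` filter; if your detection step prints something else, the regular expression needs adjusting. The sample text is hypothetical and only mirrors the timestamps mentioned above.

```python
import re

TAG = re.compile(r"lavfi\.black_(start|end)=([0-9.]+)")


def non_black_window(probe_output):
    """Return (start, end) in seconds of the first stretch without black frames.

    end is None if no later black section was reported; both values are None
    if the output contains no black section tags at all.
    """
    start = None
    for kind, value in TAG.findall(probe_output):
        t = float(value)
        if kind == "start":
            if start is not None:
                return start, t    # next black section closes the window
            if t > 0:
                return 0.0, t      # video opens with real content
        elif start is None:
            start = t              # first black section just ended
    return start, None


# Hypothetical sample, shaped like blackdetect metadata tags.
sample = """TAG:lavfi.black_start=0
TAG:lavfi.black_end=3.23657
TAG:lavfi.black_start=32.9329
"""
```

With this sample, `non_black_window(sample)` returns `(3.23657, 32.9329)`, the same trim points chosen above.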
The next step is to use ffmpeg to trim the video and get a new video with the black section removed.
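One way to express that trim, again as an assumed sketch rather than the exact command used here, is to seek to the start time with `-ss` and limit the output length with `-t`. The helper name and file names are placeholders.

```python
def trim_command(src, dst, start, end):
    """Build an ffmpeg call that keeps only the section between start and end (seconds)."""
    if end <= start:
        raise ValueError("end must be later than start")
    return [
        "ffmpeg",
        "-ss", f"{start:.6f}",        # seek the input to the first good frame
        "-i", src,
        "-t", f"{end - start:.6f}",   # keep this many seconds from there
        dst,
    ]


# To run it: subprocess.run(trim_command("input.mp4", "trimmed.mp4", 3.23657, 32.9329), check=True)
```

This re-encodes the video, which is slow on long inputs. Stream copying is faster but can only cut on keyframes, so the first frames of the result may still be black.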
We now get an output video that has the initial black section removed 🚀. Finally, we can get the thumbnail of the video without the black section.
Let's run one final incantation. If your input video is large, you can also use this opportunity to fry an egg on your laptop.
Did we get a better thumbnail? Let's see...
Even though it is an improvement compared to the first thumbnail, it still looks like the thumbnail of a video that will scare your dog.
We just ran through some magic incantations to avoid blank thumbnails. But there is a better way!
MediaMachine parses input videos and looks for candidate frames that give us a great thumbnail. The candidate frames are run through various image quality filters we talked about earlier and scene transitions are intelligently skipped over.
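To make the idea of filtering candidate frames concrete, here is a deliberately simplified sketch. It is not MediaMachine's actual algorithm: the `FrameScore` fields, the thresholds, and the tie-breaking rule are all hypothetical, and the per-frame metrics are assumed to have been computed beforehand.

```python
from dataclasses import dataclass


@dataclass
class FrameScore:
    timestamp: float    # position of the frame in the video, in seconds
    luminance: float    # mean relative luminance, 0.0 (black) to 1.0 (white)
    uniformity: float   # share of pixels in the top histogram bins, 0.0 to 1.0
    contrast: float     # any contrast measure where higher is better
    sharpness: float    # any sharpness measure where higher is better


def pick_thumbnail(frames, min_luminance=0.1, max_uniformity=0.9):
    """Drop dark or bland frames, then return the sharpest remaining one.

    Contrast is used to break ties between equally sharp frames. Returns None
    if every candidate was filtered out.
    """
    usable = [
        f for f in frames
        if f.luminance >= min_luminance and f.uniformity <= max_uniformity
    ]
    if not usable:
        return None
    return max(usable, key=lambda f: (f.sharpness, f.contrast))
```

The filters run first so that a black or blank frame can never win, no matter how it scores on the other measures.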
Compare our previous result with the thumbnail generated by MediaMachine for the same video:
Nowadays, most people use their smartphones to record videos. Sometimes these videos start with blank frames or contain blurry and shaky segments - recording with the phone in one hand and a burrito in the other ain't easy.
We showcased this problem in a previous blog post with a handheld video.
For this video, we get an extremely blurry image using the ffmpeg workflow. Compare it with the thumbnail MediaMachine generated:
|Blurry (Using ffmpeg thumbnail command)||Crisp (MediaMachine)|
Thumbnails can drive almost 2x more user engagement for your content. A great thumbnail can be the difference between users interacting with video content on your app or leaving to check what new foods might have materialized in the fridge.
We created MediaMachine to solve the problems we saw every day. The MediaMachine pipeline can automatically generate great video thumbnails at a low cost without manual intervention, and we are constantly learning and improving our selection algorithm.
Don't want to run magic incantations 🪄 to generate engaging video thumbnails? Give us a try. Sign up for an account with us (the first 10Gb are on us!).