I am looking for a VMAF-like objective, perception-based video quality scanner that works at scale. The use case is a Twitch-like streaming service where videos become eligible for on-demand playback after the live stream completes. We want some baseline of quality in the on-demand library without having to view every live stream. We encode the livestreams into HLS playlists after each stream completes, but using VMAF to compare the post-stream mp4 to the post-encode mp4s in the HLS playlist doesn't provide the information we need, since the original mp4 could itself be of low quality due to bandwidth issues during the live stream.
Clarification
I'm not sure I understand the question correctly. You want to measure the output quality of the transcoded video without using a reference video. Is that correct?
Answer
VMAF is a full-reference quality metric, which means it measures how much perceptual distortion was introduced into the transcoded video compared to the source video. It always needs a reference input video.
I think what you are looking for is a no-reference quality metric, where you can measure the "quality" of a video without a reference source video. There are many no-reference quality metrics, each intended to capture a different distortion artifact in the output video, for example blurring, blocking, and so on. You can then build an aggregated metric from these values, depending on what you want to measure.
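For example, newer ffmpeg builds (5.1 and later, as far as I know) ship blurdetect and blockdetect filters that score blurriness and blockiness without any reference. A minimal sketch, with a placeholder filename:
ffmpeg -i vod.mp4 -vf "blurdetect,blockdetect" -an -f null -
Each filter logs an average score at the end of the run, so a batch job could threshold those scores instead of someone watching every stream.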
Conclusion
So, if I were you, I would start by searching for no-reference quality metrics, and then look for tools that can measure those metrics efficiently. Hope that answers your question.
Related
I recently collected video data where the video was generated as image sequences. However, between different videos of the same length, different numbers of frames were acquired, which makes me think the image sequences have varying frame rates between videos. So my question is: how do I convert an image sequence back to video with accurate durations between frames? Is there a way to get that information programmatically from each file's creation date and time? I know ffmpeg seems to be the tool many people use.
I am not sure where to start. I am not very familiar with coding, so I already have trouble running the right commands.
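If it helps as a starting point: ffmpeg's image2 demuxer has a ts_from_file option that stamps each frame with the image file's modification time (which on most cameras is the capture time), so something like this might recover the real timing. A sketch, with a placeholder filename pattern:
ffmpeg -ts_from_file 1 -i img%04d.jpg -vsync vfr out.mp4
(-vsync vfr keeps the variable timestamps instead of forcing a constant frame rate; newer builds spell it -fps_mode vfr.)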
I need to measure the quality of a live stream, and I was thinking of measuring PSNR. So I was wondering how I can run this test, since it is a live stream and I do not have a reference stream or video. Should I look for a reference video of the same resolution and bitrate, for example Big Buck Bunny, and compare against it?
Thanks
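For what it's worth, PSNR is a full-reference metric too: both inputs must contain the same content, frame-aligned, so comparing against an unrelated clip like Big Buck Bunny won't produce meaningful numbers. When you do have the pre-transmission source, the ffmpeg psnr filter is run roughly like this (placeholder filenames, distorted input first):
ffmpeg -i received.mp4 -i source.mp4 -lavfi psnr -f null -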
I used to check and adjust the quality of my video encodings with an Avisynth script calculating the SSIM index. For some reason, since moving to Windows 10, the performance is much lower.
Then I found out about the ffmpeg ssim filter, which runs at least ten times faster.
But I have these issues:
- I can't find out what algorithm is used by either method, or what the differences are.
- More importantly, when I plot the results of the two methods for a number of video files, the correlation between them is virtually zero. With the Avisynth SSIM, I find a good correlation between SSIM and the H.264 CRF factor (as expected), but that correlation disappears with the ffmpeg SSIM.
Has anyone verified whether the ffmpeg SSIM filter really works?
EDIT: I need to investigate the second point further. I actually found no correlation for ffmpeg over a small sample of different videos; however, there is an excellent correlation between CRF and SSIM for both methods on a single video sample.
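For reference, a typical invocation of the ffmpeg ssim filter looks like this (placeholder filenames; the distorted file goes first, the reference second, and stats_file writes per-frame scores you can plot):
ffmpeg -i encoded.mp4 -i original.mp4 -lavfi "ssim=stats_file=ssim.log" -f null -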
I'm trying to extract raw streams from devices and files using ffmpeg. I notice the crucial frame information (Video: width, height, pixel format, color space, Audio: sample format) is stored both in the AVCodecContext and in the AVFrame. This means I can access it prior to the stream playing and I can access it for every frame.
How much do I need to account for these values changing frame-to-frame? I found https://ffmpeg.org/doxygen/trunk/demuxing__decoding_8c_source.html#l00081 which indicates that at least width, height, and pixel format may change frame to frame.
Will the color space and sample format also change frame to frame?
Will these changes be temporary (a single frame) or lasting (a significant block of frames) and is there any way to predict for this stream which behavior will occur?
Is there a way to find the most descriptive attributes that this stream is capable of producing, so that I can scale all the lower-quality frames up, but not offer a result that is pointlessly higher quality than the source, even if this is a device or a network stream where I cannot play all the frames in advance?
The fundamental question is: how do I reconcile the flexibility of this API with the restriction that raw streams (my output) have no way of specifying a change of stream attributes mid-stream? I imagine I will need to either predict the most descriptive attributes to give the stream, or start a new stream when the attributes change. Which choice to make depends on whether these values change rapidly or stay relatively stable.
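One way to check how volatile these attributes actually are for a given source, before committing to a design, is to dump them per frame with ffprobe; a sketch, with a placeholder input:
ffprobe -v error -select_streams v:0 -show_entries frame=width,height,pix_fmt,color_space -of csv=p=0 input.ts
(For audio, -select_streams a:0 with frame=sample_fmt,sample_rate does the same.)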
So, to add to what #szatmary says, the typical use case for stream parameter changes is adaptive streaming:
- Imagine you're watching YouTube on a laptop with various methods of internet connectivity, and suddenly bandwidth decreases. Your stream will automatically switch to a lower bandwidth. FFmpeg (which is used by Chrome) needs to support this.
- Alternatively, imagine a similar scenario in an RTC video chat.
The reason FFmpeg does what it does is that the API is essentially trying to accommodate the common denominator. Videos shot on a phone won't ever change resolution. Neither will most videos exported from video editing software. Even videos from youtube-dl will typically not switch resolution; that is a client-side decision, and youtube-dl simply won't do it. So what should you do? I'd just use the stream information from the first frame(s) and rescale all subsequent frames to that resolution. This will work for 99.99% of the cases. Whether you want to accommodate your service to the remaining 0.01% depends on what type of videos you think people will upload and whether resolution changes make any sense in that context.
Does the colorspace change? It could (theoretically) in software that mixes screen recordings with video fragments, but it's highly unlikely in practice. The sample format changes about as often as the video resolution: quite often in the adaptive scenario, but whether you care depends on your service and the types of videos you expect to get.
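The command-line analogue of that advice is to pin the output geometry and pixel format once, so any mid-stream change gets normalized by the scaler; a sketch with placeholder values:
ffmpeg -i input.ts -vf "scale=1280:720,format=yuv420p" -f rawvideo out.yuv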
Usually not often, or ever. However, this depends on the codec, and these are options chosen at encode time. I pass the decoded frames through swscale just in case.
Using ffmpeg I can take a number of still images and turn them into a video. I would like to do this to decrease the total size of all my timelapse photos. But I would also like to extract the still images for use at a later date.
In order to use this method:
- I will need to correlate the original still image against a frame number in the video.
- And I will need to extract a thumbnail of a given frame number in a video.
But before I go down this rabbit hole, I want to know if the requirements are possible using ffmpeg, and if so any hints on how to accomplish the task.
note: The still images are a timelapse from a single camera over a day, so temporal compression should give measurable savings compared to a stack of JPEGs.
When you use ffmpeg to create a video from a sequence of images, the images aren't affected in any way. You should still be able to use them for what you're trying to do, unless I'm misunderstanding your question.
Edit: You can use ffmpeg to create images from an existing video. I'm not sure how well it will work for your purposes, but the extracted images are pretty high quality, if not identical to the originals. You'd have to experiment to make sure the extracted images match the input images in sequential order and naming, but if you take the fps into account, it should work.
The command to do this (from the ffmpeg documentation) is as follows:
ffmpeg -i movie.mpg movie%d.jpg
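And to pull a single frame by frame number (the second requirement above), the select filter should do it; note that n is zero-based, and the frame number here is just an example:
ffmpeg -i movie.mpg -vf "select=eq(n\,1234)" -frames:v 1 frame1234.jpg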