How to filter motion vectors? - ffmpeg

My video is very noisy temporally. The video was taken under low light conditions at a high frame rate.
Currently I've tried
ffplay -flags2 +export_mvs -i test.mp4 -vf edgedetect=low=0.05:high=0.17,hqdn3d=4.0:3.0:6.0:4.5,codecview=mv=pf+bf+bb,"lutyuv=y='if(lt(val,19),0,val)'"
The motion vectors are tracking noise: in the near-dark areas, the vectors vary greatly in magnitude and angle.
How do I decimate or filter the displayed motion vectors based on magnitude and/or location?

Remember that codecview will display the motion vectors from the encoded file, so if you denoise that file after decoding (such as ffplay [..] -vf hqdn3d), then the motion vectors aren't actually affected by the denoising, because they come from an earlier part in the pipeline.
To change the motion vectors in the compressed file, you need to re-encode it and denoise/degrain before encoding. I don't remember if there's a way to generate motion vectors (post-decoding) within the filter chain.
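I'm not aware of a codecview option that thresholds vectors by magnitude or position, so one workaround is to export the vectors and filter them yourself before drawing. Below is a minimal sketch of that filtering step only. It assumes the vectors have already been extracted by some other tool into `(src_x, src_y, dst_x, dst_y)` tuples (a hypothetical layout that mirrors the coordinate fields of FFmpeg's `AVMotionVector` side data). The function name and the thresholds are made up for illustration.

```python
import math

# Hypothetical vector layout: (src_x, src_y, dst_x, dst_y), mirroring the
# coordinate fields of FFmpeg's AVMotionVector side data. Extraction from
# the file is assumed to have happened elsewhere.
def filter_vectors(vectors, min_mag=2.0, max_mag=64.0, region=None):
    """Keep vectors whose length lies in [min_mag, max_mag] and whose
    destination falls inside region = (x0, y0, x1, y1), if one is given."""
    kept = []
    for src_x, src_y, dst_x, dst_y in vectors:
        mag = math.hypot(dst_x - src_x, dst_y - src_y)
        if not (min_mag <= mag <= max_mag):
            continue  # too short (jitter) or too long (mismatch)
        if region is not None:
            x0, y0, x1, y1 = region
            if not (x0 <= dst_x < x1 and y0 <= dst_y < y1):
                continue  # outside the area of interest
        kept.append((src_x, src_y, dst_x, dst_y))
    return kept

vectors = [
    (10, 10, 10, 11),      # length 1: dropped as jitter
    (100, 100, 104, 103),  # length 5: kept
    (500, 400, 506, 408),  # length 10, but outside the region below
]
print(filter_vectors(vectors, region=(0, 0, 320, 240)))
```

The thresholds would need tuning per clip; in dark areas a higher `min_mag` or an explicit region is probably the simplest way to suppress vectors that only track noise.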

Related

Remove video noise from video with ffmpeg without producing blur / scale down effect

My videos are 1920x1080, recorded at high ISO (3200) on a smartphone (backlight scene mode, to get a bright view). This produces a lot of noise. I have tried many video filters, but all of them produce blur similar to what you get when you halve the resolution and then scale it back up.
Is there a good video noise filter that only removes noise without producing blur?
If it produces blur, I would prefer not to filter at all.
I have tried video filter:
nlmeans=s=30:r=3:p=1
vaguedenoiser=threshold=22:percent=100:nsteps=4
owdenoise=8:6:6
hqdn3d=100:0:50:0
bm3d=sigma=30:block=4:bstep=8:group=1:range=8:mstep=64:thmse=0:hdthr=5:estim=basic:planes=1
dctdnoiz=sigma=30:n=4
fftdnoiz=30:1:6:0.8
All of them produce blur, some worse than others. I have to use strong settings to remove even a moderate amount of noise. I ended up halving the resolution, applying removegrain, and then scaling back up. This works much better for me than all of the methods above (the pp filter is used to reduce size without reducing image detail):
scale=960:540,removegrain=3:0:0:0,pp=dr/fq|8,scale=1920:1080
code example
FOR %%G IN (*.jpg) DO "ffmpeg.exe" -y -i "%%G" -vf "nlmeans=s=30:r=3:p=1" -qmin 1 -qmax 1 -q:v 1 "%%G.jpg"
To help with blur, I always use unsharp to sharpen the image after nlmeans. Below are the parameters I find work best on old grainy movies, or 4K transfers of old movies that create unacceptable grain. It seems to work quite well. For 4K movies, it almost makes them as good as the 1080p Blu Ray versions.
nlmeans=s=1:p=7:pc=5:r=3:rc=3
unsharp=7:7:2.5
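To see why sharpening after denoising helps, here is a rough 1-D sketch of the unsharp-mask idea: add back a multiple of the difference between the signal and a blurred copy of it. This is only an illustration. It uses a plain box blur as a stand-in (an assumption; FFmpeg's unsharp filter has its own kernel), with a radius of 3 to mimic a 7-wide matrix and the amount 2.5 from the parameters above. The function names are hypothetical.

```python
def box_blur(signal, radius):
    """Average each sample with its neighbours within `radius` (edges clipped)."""
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - radius), min(len(signal), i + radius + 1)
        window = signal[lo:hi]
        out.append(sum(window) / len(window))
    return out

def unsharp(signal, radius=3, amount=2.5):
    """Unsharp mask: out = in + amount * (in - blurred), clamped to 0..255."""
    blurred = box_blur(signal, radius)
    return [min(255.0, max(0.0, s + amount * (s - b)))
            for s, b in zip(signal, blurred)]

flat = [100.0] * 10                  # no detail: nothing to sharpen
edge = [50.0] * 6 + [200.0] * 6      # a single step edge

print(unsharp(flat))
print(unsharp(edge))
```

Flat areas pass through unchanged, while the contrast across the edge is exaggerated. With an amount as high as 2.5 the values on both sides of the edge clip, which is the halo you may notice if the setting is pushed too far.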

perspective correction example

I have some videos taken of a display, with the camera not perfectly oriented, so that the result shows a strong trapezoidal effect.
I know that there is a perspective filter in ffmpeg https://ffmpeg.org/ffmpeg-filters.html#perspective, but I'm too dumb to understand how it works from the docs - and I cannot find a single example.
Can somebody show me how it works?
The following example extracts a trapezoidal perspective section from an input Matroska video to an output video.
An estimated coordinate had to be inserted to complete the trapezoidal pattern (out-of-frame coordinate x2=-60,y2=469).
The input video frame was 1280x720. Pixel interpolation was set to linear, which is also the default when it is not specified. Cubic interpolation bloats the output with no apparent improvement in video quality. The output frame size is the same as the input frame size.
The output video was viewable but of rough quality due to sampling error.
ffmpeg -hide_banner -i input.mkv -lavfi "perspective=x0=225:y0=0:x1=715:y1=385:x2=-60:y2=469:x3=615:y3=634:interpolation=linear" output.mkv
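If you find the option string hard to read, it may help to generate it from named corners. The filter takes the four points in the order top-left (x0,y0), top-right (x1,y1), bottom-left (x2,y2), bottom-right (x3,y3). The helper below is a hypothetical convenience function, not part of ffmpeg; it only assembles the string used in the command above.

```python
def perspective_filter(top_left, top_right, bottom_left, bottom_right,
                       interpolation="linear"):
    """Build a perspective filter string from four (x, y) corner points."""
    corners = [top_left, top_right, bottom_left, bottom_right]
    parts = [f"x{i}={x}:y{i}={y}" for i, (x, y) in enumerate(corners)]
    parts.append(f"interpolation={interpolation}")
    return "perspective=" + ":".join(parts)

# Corner coordinates taken from the example command above.
vf = perspective_filter((225, 0), (715, 385), (-60, 469), (615, 634))
print(vf)
```

The resulting string can be passed to -lavfi or -vf unchanged.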
You can also make use of ffplay (or any player which lets you access ffmpeg filters, like mpv) to preview the effect, or if you want to keystone-correct a display surface.
For example, if you have your TV above your fireplace mantle and you're sitting on the floor looking up at it, this will un-distort the image to a large extent:
ffplay video.mkv -vf 'perspective=W*.1:0:W*.9:0:-W*.1:H:W*1.1:H'
The above expands the top by 20% and compresses the bottom by 20%, cropping the top and infilling the bottom with the edge pixels.
Also handy for playing back video of a building you're standing in front of with the camera pointed up around 30 degrees.

FFmpeg film grain

I want to add a film grain effect using FFMPEG if possible.
Taking a nice clean computer rendered scene and filter for a gritty black and white 16mm film look. As an example something like Clerks https://www.youtube.com/watch?v=Mlfn5n-E2WE
According to Simulating TV noise, I should be able to use the following filter
-filter_complex "geq=random(1)*255:128:128;aevalsrc=-2+random(0)"
but when I add it to my ffmpeg command
ffmpeg.exe -framerate 30 -i XYZ%05d.PNG -vf format=yuv420p -dst_range 1 -color_range 2 -c:v libxvid -vtag xvid -q:v 1 -y OUTPUT.AVI
so the command is now
ffmpeg.exe -framerate 30 -i XYZ%05d.PNG -vf format=yuv420p -dst_range 1 -color_range 2 -c:v libxvid -vtag xvid -q:v 1 -y -filter_complex "geq=random(1)*255:128:128;aevalsrc=-2+random(0)" OUTPUT.AVI
I get the message
Filtergraph 'format=yuv420p' was specified through the -vf/-af/-filter option for output stream 0:0, which is fed from a complex filtergraph.
-vf/-af/-filter and -filter_complex cannot be used together for the same stream.
How can I change my ffmpeg command line so the grain filter works? Additionally, can I add a slight blur too? Old 16mm footage looks more blurred than grainy.
Thanks for any tips.
I just needed to make a film grain and wanted something "neater" than just randomizing every pixel. Here's what I came up with: FFmpeg film grain.
It starts with white noise:
Then it uses the "deflate" and "dilation" filters to cause certain features to expand out to multiple pixels:
The effect is pretty subtle but you can see that there are a few larger "blobs" of white and black in amongst the noise. This means that the features of the noise aren't just straight-up single pixels any more. Then, that image gets halved in resolution, because it was being rendered at twice the resolution of the target video.
The highest-resolution detail is now softened, and the clumps of pixels are reduced in size to be 1-2 pixels in size. So, this is the noise plane.
Then, I take the source video and do some processing on it.
Desaturate:
Filter luminance so that the closer an input pixel was to luminance level 75 (arrived at experimentally), the brighter the pixel is. If the input pixel was darker or brighter, the output pixel is uniformly darker. This creates "bands" of brightness where the luminance level is close to 75.
This is then scaled down, and this is where the level of noise is "tuned". This band selection means that we will be adding noise specifically in the areas of the frame where it will be most noticed. Not adding noise in other areas leaves more bits to encode the noise.
This scaled mask is then applied to the previously-computed noise. In this screenshot, I've removed the tuning so that the noise is easily visible:
The areas not selected by the band filter are greatly scaled down and are essentially black; the noise variation fades to nothing.
Here's what it looks like with a scaling factor of 0.32 -- pretty subtle:
I then invert this image, so that the parts with no noise are solid white, and then areas with noise pull down slightly from the white:
Finally, I pull another copy of the same source video, apply this computed image to it as an alpha channel and overlay it on black, so that the film grain dots, which are slightly less white, become slightly darker pixels.
The effect is pretty subtle, hard to see in a still like that when it's not moving, but if you tune the noise way up, you can get frames like this:
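The per-pixel arithmetic behind those steps can be sketched roughly as follows. This is not the actual filtergraph, just an illustration of the idea. The band centre 75 and the scaling factor 0.32 come from the description above; the linear falloff and its width of 60 are assumptions, since the exact response curve is not given.

```python
BAND_CENTER = 75    # luminance level where grain is strongest (from the text)
BAND_WIDTH = 60     # hypothetical: distance at which the mask reaches zero
NOISE_SCALE = 0.32  # tuning factor quoted in the text

def band_mask(luma):
    """1.0 at BAND_CENTER, falling linearly to 0.0 at BAND_WIDTH away."""
    return max(0.0, 1.0 - abs(luma - BAND_CENTER) / BAND_WIDTH)

def apply_grain(luma, noise):
    """Darken a source pixel by masked, scaled noise (noise in 0.0..1.0).

    The masked noise is inverted into an alpha value (1.0 = untouched),
    and the source is blended over black using that alpha."""
    alpha = 1.0 - noise * band_mask(luma) * NOISE_SCALE
    return luma * alpha

print(apply_grain(75, 1.0))   # inside the band: pixel is pulled darker
print(apply_grain(200, 1.0))  # far outside the band: pixel is unchanged
```

Raising `NOISE_SCALE` corresponds to tuning the noise "way up" as in the last frame mentioned above.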
The filtergraph "geq=random(1)*255:128:128;aevalsrc=-2+random(0)" generates white noise.
For "a gritty black and white 16mm film look", you want something like this instead:
-vf hue=s=0,boxblur=lr=1.2,noise=c0s=7:allf=t
The format you specified is a filter, and all filters applied on an input should be specified in a single chain, so it should be,
-vf hue=s=0,boxblur=lr=1.2,noise=c0s=7:allf=t,format=yuv420p
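Put together with the rest of the original command, it could look like the sketch below, written as a Python argument list so that the quoting stays unambiguous. The function name is hypothetical, and the remaining options are copied from the question as-is, not verified.

```python
def build_grain_command():
    """Assemble the question's ffmpeg command with one combined -vf chain."""
    filters = [
        "hue=s=0",             # desaturate to black and white
        "boxblur=lr=1.2",      # slight blur
        "noise=c0s=7:allf=t",  # temporal noise on the first plane
        "format=yuv420p",      # pixel format, last in the same chain
    ]
    return [
        "ffmpeg.exe", "-framerate", "30", "-i", "XYZ%05d.PNG",
        "-vf", ",".join(filters),
        "-dst_range", "1", "-color_range", "2",
        "-c:v", "libxvid", "-vtag", "xvid", "-q:v", "1",
        "-y", "OUTPUT.AVI",
    ]

cmd = build_grain_command()
print(" ".join(cmd))
# To run it: import subprocess; subprocess.run(cmd, check=True)
```

Because everything sits in a single -vf chain and there is no -filter_complex, the error from the question should no longer appear.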
See filter docs at https://ffmpeg.org/ffmpeg-filters.html for descriptions and list of parameters you can tweak.

Does simple rescaling from 1080p to frame height of 720 lead to 720p?

I want to convert a 1080p to 720p and also lower resolutions eventually.
I have been using ffmpeg for all my video processing activities so far, and would simply approach this task using the following command:
ffmpeg -i tos.mov -vf scale=-1:720 tos_0x720.mov
I understand that this will rescale my video to a new frame size having 720 pixels set as a fixed height and the width dynamically calculated.
What I am not sure about are the implications regarding the quality factors of the video when using ffmpeg this way.
Is it valid to assume that running this command will output a perfect HD 720p quality video?
What would be a benefit of using dedicated video conversion software to accomplish my goal compared to running the above command?
You can choose which scaling algorithm to use by setting the flags option in the scale filter. Some algorithms work better for up-scaling (bilinear) while others are better for down-sampling (bicubic, lanczos). Some are better for sharp graphics, others for gradual changes, some are faster and some are slower.
I think the default value of flags is bicubic, while some people recommend lanczos for downsampling.
To set the flag use:
-vf scale=-1:720:flags=lanczos
Commercial video conversion software uses the same algorithms. For example, Adobe Premiere used variable-radius bicubic for Maximum Render Quality. Such software might help you choose one or another depending on what you're after (speed vs. quality), and it may provide tweaks to reduce artifacts resulting from scaling.
There's a lot of literature covering the different algorithms.
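On the frame size itself: with scale=-1:720 the width is derived from the source aspect ratio, so a 16:9 source at 1080p does end up at 1280x720. Here is a small sketch of that arithmetic. The function is hypothetical, and its rounding may differ from ffmpeg's by a pixel. The `multiple` argument imitates scale=-2:720, which keeps the computed width divisible by 2, something many encoders require.

```python
def scaled_width(src_w, src_h, dst_h, multiple=1):
    """Width that preserves the aspect ratio at height dst_h,
    rounded to the nearest multiple of `multiple`."""
    exact = src_w * dst_h / src_h
    return int(round(exact / multiple)) * multiple

print(scaled_width(1920, 1080, 720))              # 16:9 source
print(scaled_width(1916, 1080, 720))              # slightly cropped source: odd width
print(scaled_width(1916, 1080, 720, multiple=2))  # rounded to an even width
```

For sources that are not exactly 16:9, the result has 720 lines but not the standard 1280 columns, so whether it counts as "720p" depends on how strict you want to be.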

Image similarity comparison

I originally asked this question on cstheory.stackexchange.com but was suggested to move it to stats.stackexchange.com.
Is there an existing algorithm that returns to me a similarity metric between two bitmap images? By "similar", I mean a human would say these two images were altered from the same photograph. For example, the algorithm should say the following 3 images are the same (original, position shifted, shrunken).
Same
I don't need to detect warped or flipped images. I also don't need to detect if it's the same object in different orientations.
Different
I would like to use this algorithm to prevent spam on my website. I noticed that the spammers are too lazy to change their spam images. It's not limited to faces. I know there are already many great facial recognition algorithms out there. The spam image could be anything from a URL to a soccer field to a naked body.
There is a discussion of image similarity algorithms on Stack Overflow. Since you don't need to detect warped or flipped images, the histogram approach may be sufficient, provided the image crop isn't too severe.
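As a rough idea of what the histogram approach involves, here is a minimal sketch using histogram intersection on grayscale pixel values. It assumes each image has already been decoded into a flat list of 0..255 values (decoding is not shown), and the bin count of 16 is an arbitrary choice.

```python
def histogram(pixels, bins=16):
    """Normalised histogram of 0..255 grayscale values."""
    counts = [0] * bins
    for p in pixels:
        counts[p * bins // 256] += 1
    total = len(pixels)
    return [c / total for c in counts]

def histogram_intersection(h1, h2):
    """1.0 for identical distributions, approaching 0.0 for disjoint ones."""
    return sum(min(a, b) for a, b in zip(h1, h2))

original = [10, 10, 200, 200, 120, 120, 30, 250]
shifted = original[3:] + original[:3]   # same pixels, different positions
other = [255] * 8                       # unrelated, all-white image

print(histogram_intersection(histogram(original), histogram(shifted)))
print(histogram_intersection(histogram(original), histogram(other)))
```

Because a histogram ignores pixel positions, shifting does not change the score. The flip side is that unrelated images with similar brightness distributions can also score high.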
You can use existing deep learning architectures like VGG to generate features from images and then use a similarity metric like cosine similarity to see if two images are essentially the same.
The whole pipeline is pretty easy to set up and you do not need to understand the neural network architecture (you can just treat it like a black box). Also, these features are pretty generic and can be applied to find similarity between any kind of objects, not just faces.
Here are a couple of blogs that walk you through the process.
http://blog.ethanrosenthal.com/2016/12/05/recasketch-keras/
https://erikbern.com/2015/09/24/nearest-neighbor-methods-vector-models-part-1.html
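The comparison step at the end of that pipeline is small. Here is a minimal sketch of cosine similarity, assuming the feature vectors have already been produced by a network such as VGG. The vectors below are made-up numbers, not real network output.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Hypothetical feature vectors for three images.
features_a = [0.9, 0.1, 0.4]
features_b = [1.8, 0.2, 0.8]   # same direction as a, different scale
features_c = [0.0, 1.0, 0.0]

print(cosine_similarity(features_a, features_b))
print(cosine_similarity(features_a, features_c))
```

A score close to 1.0 suggests the images are essentially the same; where to put the cut-off is something you would have to tune on your own spam samples.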
Amazon has a new API called Rekognition which allows you to compare two images for facial similarity. The API returns a similarity percentage for each pair of faces, along with the bounding box of each face.
Rekognition also includes APIs for Facial Analysis (returning the gender, approximate age, and other relevant facial details) and Object and Scene Detection (returning tags of objects that are within an image).
One good technique for calculating the similarity of images is "mean structural similarity" (SSIM).
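import cv2
from skimage.metrics import structural_similarity

# Read both images as grayscale; SSIM needs two arrays of identical shape.
img_1 = cv2.imread('img_1.png', cv2.IMREAD_GRAYSCALE)
img_2 = cv2.imread('img_2.png', cv2.IMREAD_GRAYSCALE)
# The score is at most 1.0; 1.0 means the images are identical.
print(structural_similarity(img_1, img_2))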
If you just want image similarity that's one thing, but facial similarity is quite another. Two very different individuals could appear in the same background and an analysis of image similarity show them to be the same while the same person could be shot in two different settings and the similarity analysis show them to be different.
If you need to do facial analysis you should search for algorithms specific to that. Calculating relative eye, nose and mouth size and position is often done in this kind of analysis.
Use https://github.com/Netflix/vmaf to compare the two sets of images.
First convert the images to yuv422p using ffmpeg and then run the test. Note the score difference. This can be used to tell whether the images are similar or different. For this sample, both pairs look quite similar...
ffmpeg -i .\different-pose-1.jpg -s 1920x1080 -pix_fmt yuv422p different-pose-1.yuv
ffmpeg -i .\different-pose-2.jpg -s 1920x1080 -pix_fmt yuv422p different-pose-2.yuv
.\vmafossexec.exe yuv422p 1920 1080 different-pose-1.yuv different-pose-2.yuv vmaf_v0.6.1.pkl --ssim --ms-ssim --log-fmt json --log different.json
Start calculating VMAF score...
Exec FPS: 0.772885
VMAF score = 2.124272
SSIM score = 0.424488
MS-SSIM score = 0.415149
ffmpeg.exe -i .\same-pose-1.jpg -s 1920x1080 -pix_fmt yuv422p same-pose-1.yuv
ffmpeg.exe -i .\same-pose-2.jpg -s 1920x1080 -pix_fmt yuv422p same-pose-2.yuv
.\vmafossexec.exe yuv422p 1920 1080 same-pose-1.yuv same-pose-2.yuv vmaf_v0.6.1.pkl --ssim --ms-ssim --log-fmt json --log same.json
Start calculating VMAF score...
Exec FPS: 0.773098
VMAF score = 5.421821
SSIM score = 0.285583
MS-SSIM score = 0.400130
References: How can I create a YUV422 frame from a JPEG or other image on Ubuntu
Robust Hash Functions do that. But there's still a lot of research going on in that domain. I'm not sure if there are already usable prototypes.
Hope that helps.
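For a feel of what such a hash looks like, here is a sketch of one very simple member of that family, the "average hash". It is a swapped-in illustration, not a specific algorithm from the research mentioned above. It assumes the image has already been converted to grayscale and shrunk to a tiny thumbnail (4x4 here; 8x8 is more common), which is not shown.

```python
def average_hash(gray):
    """One bit per pixel of a small grayscale thumbnail: 1 if above the mean."""
    flat = [p for row in gray for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]

def hamming(h1, h2):
    """Number of differing bits; small distances suggest the same picture."""
    return sum(a != b for a, b in zip(h1, h2))

thumb = [
    [10, 200, 30, 220],
    [15, 210, 25, 230],
    [12, 205, 35, 225],
    [18, 215, 28, 235],
]
brighter = [[p + 20 for p in row] for row in thumb]   # same picture, brightened
inverted = [[255 - p for p in row] for row in thumb]  # a different picture

print(hamming(average_hash(thumb), average_hash(brighter)))
print(hamming(average_hash(thumb), average_hash(inverted)))
```

Since the thumbnail step discards fine detail, resized or recompressed copies tend to hash close together. Position shifts and crops are a weak spot of this particular variant.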

Resources