Detecting a given image/frame in a video

I was given two inputs: one is an image (a frame taken from an .mp4 video file) and the other is a video (mostly in .ts format). The video mostly uses lossy encoding. I need to find the image in the video. I can't compare the raw frames of the video against the image directly, as they are encoded differently. To my knowledge, I need to find the frame in the video that most closely matches the image. Are there any tools/APIs to find the image in the video?

Detect features and try to establish a homography.
Then pick the frame with the most homography inliers (the cv::findHomography function has an output parameter named mask).
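A minimal sketch of that approach in Python with OpenCV; the choice of ORB, the 0.75 ratio test, the RANSAC threshold and the file names are my own assumptions, not part of the answer:

import cv2
import numpy as np

def count_inliers(query_img, frame):
    """Match ORB features and return the number of cv2.findHomography inliers."""
    orb = cv2.ORB_create(2000)
    kp1, des1 = orb.detectAndCompute(query_img, None)
    kp2, des2 = orb.detectAndCompute(frame, None)
    if des1 is None or des2 is None:
        return 0
    matches = cv2.BFMatcher(cv2.NORM_HAMMING).knnMatch(des1, des2, k=2)
    good = [m[0] for m in matches
            if len(m) == 2 and m[0].distance < 0.75 * m[1].distance]
    if len(good) < 4:                                  # a homography needs >= 4 points
        return 0
    src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    return int(mask.sum()) if mask is not None else 0

# Scan the video frame by frame and keep the frame with the most inliers.
query = cv2.imread("query.png", cv2.IMREAD_GRAYSCALE)   # placeholder file names
cap = cv2.VideoCapture("input.ts")
best_idx, best_score, idx = -1, 0, 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    score = count_inliers(query, cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY))
    if score > best_score:
        best_idx, best_score = idx, score
    idx += 1
cap.release()
print("best matching frame:", best_idx, "with", best_score, "inliers")

For long videos you would normally not score every frame; sampling every N-th frame first and refining around the best hit keeps the scan manageable.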

Related

Overlay video with moving images FFMPEG

I have a video and a few images. I know two places in the video where I want to paste these images, but they shouldn't have a fixed position and size. On the contrary, the images should move and change their tilt angle and scale. For example, imagine a closed book: you want to overlay its name as the book slowly opens.
I read the FFMPEG documentation but didn't find anything about this. Can FFMPEG do this? If not, which libraries or methods can do that?
The FFMPEG overlay filter can overlay one stream atop another.
It takes an expression which is evaluated per frame to determine the position.
https://ffmpeg.org/ffmpeg-filters.html#overlay-1
You may consider creating a filter chain to do the following (a rough command sketch follows the list).
1) Create a transparent image with the title of your book.
2) Use a 3D rotate filter to convert the single image into an animated sequence.
3) Use the overlay filter to apply the animated stream atop your book video.
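A rough sketch of step 3, using the per-frame position expressions mentioned above. The file names, timings and motion expressions here are placeholders, and the tilt/scale animation of step 2 would still need to be produced separately (e.g. with ffmpeg's rotate or perspective filters, or another tool):

ffmpeg -i book.mp4 -i title.png -filter_complex \
  "[0:v][1:v]overlay=x='(W-w)/2+80*sin(t)':y='H/4+20*t':enable='between(t,3,8)'" \
  out.mp4

Here W/H are the main video's dimensions, w is the overlay's width, t is the timestamp in seconds, and enable limits the overlay to the 3-8 second window.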

ffmpeg - Is it possible to create a video from image tiles

With ffmpeg, you can:
create a video from a list of images
create an image with tiles representing frames of a video
But how is it possible to create a video from the tiles in a picture that represent frames of a video?
If I have this command line:
ffmpeg -i test.mp4 -vf "scale=320:240,tile=12x25" out.png
I will get an image (out.png) made of 12x25 tiles of 320x240 pixels each.
I am trying to reverse the process and, from that image, generate a video.
Is it possible?
Edit with more details:
What I am really trying to achieve is to convert a video into a GIF preview. But in order to make an acceptable GIF, I need to build a common palette. So either I scan the movie twice, which would take very long since I have to do it for a large batch, or I make a tiled image with all the frames in a single image, then make a GIF with a palette computed from all the frames, which would be significantly faster... if possible.
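If the end goal is just the GIF with one palette computed over all frames, and your ffmpeg build has the palettegen and paletteuse filters, one possible sketch avoids both the double scan and the tile round-trip by splitting the decoded stream (the fps and scale values here are arbitrary):

ffmpeg -i test.mp4 -filter_complex \
  "[0:v]fps=10,scale=320:-1:flags=lanczos,split[a][b];[a]palettegen[p];[b][p]paletteuse" \
  out.gif

This decodes the movie only once because split duplicates the already-decoded frames for the palette pass; note, though, that the second branch is buffered until the palette is ready, so for very long inputs the classic two-pass variant may still be preferable.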

Why don't we use the original image instead of the decoded image for P-frames?

I'm trying to understand P-frames in MPEG.
I have a question about the reference image.
Why don't we use the original image instead of the decoded image to make a P-frame?
I-frames, B-frames and P-frames allow the video to be compressed.
Indeed, a video contains a lot of redundant information.
Think about a car moving across the screen: the pixels in the background do not change from one picture to the next; only those around the car are "moving". With the I-B-P frame trick, you encode the background once and then just signal the slight changes (the car moving) through motion vectors.
This way you have to carry less information than if you had to repeat the entire picture each time.
As for the reference: the encoder predicts from the decoded picture rather than the original because the decoder only has the decoded pictures available, so predicting from the original would make the encoder and decoder drift apart.
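A toy illustration of the redundancy point (synthetic frames and zlib here, not actual MPEG coding): compressing the difference between two consecutive pictures takes far fewer bytes than compressing the second picture from scratch.

import zlib
import numpy as np

# Synthetic example: a fixed background with a small "car" block that moves.
h, w = 240, 320
rng = np.random.default_rng(0)
background = rng.integers(0, 256, size=(h, w), dtype=np.uint8)

def frame_with_car(x):
    f = background.copy()
    f[100:120, x:x + 20] = 255          # the moving "car"
    return f

frame0 = frame_with_car(50)             # reference picture (think: I-frame)
frame1 = frame_with_car(54)             # next picture, car shifted by 4 px

full = zlib.compress(frame1.tobytes())                        # repeat the whole picture
diff = zlib.compress((frame1.astype(np.int16)
                      - frame0.astype(np.int16)).tobytes())   # signal only the change

print(len(full), "bytes for the full picture vs", len(diff), "bytes for the difference")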
See also:
Video compression
https://stackoverflow.com/a/24084121/3194340

Why are image sequences larger (in size) than the source videos?

When I'm using a command like this in ffmpeg (or any other program):
ffmpeg -i input.mp4 image%d.jpg
The combined file size of all the images always tends to be larger than the video itself. I've tried reducing the frames per second, lower compression settings, blurs, and everything else I can find, but the JPEGs always end up being larger in size (combined) afterwards.
I'm trying to understand why this happens, and what can be done to match the size of the video. Are there other compression formats I can use besides JPEG, or any settings or tricks I'm overlooking?
Is it even possible?
To simplify: when the video is encoded, only certain images (keyframes) are encoded as a full image, much like your JPEGs.
The rest are encoded as a difference relative to a reference image (typically a neighbouring frame), which for most scenes is much smaller than the whole image.
This is because in a video, compression is applied not only image by image, but in the time direction as well. So separate images will always be larger than the video. You can't do anything about that.
Lennart is correct, and if you want more detail you should take a look at http://en.wikipedia.org/wiki/Video_compression_picture_types#Summary
Basically, a sequence of images uses I-frames only, whereas videos can use I-frames, P-frames and B-frames depending on the codec and encoder settings, which greatly improves compression efficiency.
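If the goal is simply to shrink the exported pictures rather than truly match the video, the usual knobs are the extraction rate and the JPEG quality scale; a hedged example with arbitrary values:

ffmpeg -i input.mp4 -vf fps=1 -q:v 10 image%d.jpg

Here fps=1 keeps one frame per second and -q:v selects the JPEG quantiser (2 is best quality, 31 is smallest output). Even so, the combined size will normally stay well above the video's for the reasons described above.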

Finding YUV file format in Cocoa

I got a file in raw YUV format; all I know at this point is that the clip has a resolution of 176x144.
The Y plane is 176x144 = 25344 bytes, and the UV data is half of that. Now, I did some reading about YUV, and there are different formats corresponding to different ways the Y and UV planes are stored.
How can I perform some sort of check in Cocoa to find the raw YUV file format? Is there a file header in the YUV frame from which I can extract some information?
Thanks in advance to everyone.
Unfortunately, if it's just a raw YUV stream, it will just be the data for the frames written to disk, one after another. There probably won't be a header that indicates what specific format is being used.
It sounds like you have determined that it's a YUV 4:2:2 stream, so you just need to determine the interleaving order (the most common possibilities are listed here). In response to your previous question, I posted a function which converts a frame from the UYVY (Y422) YUV format to the 2VUY format used by Apple's YUV OpenGL extension. Your best bet may be to try that out and see how the images look, then modify the interleaving format until the colors and the image clear up.
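As a first check based purely on the numbers in the question, you can see which common frame layout divides the file size evenly before trying the conversion route. A small sketch (in Python rather than Cocoa, with a placeholder file name):

import os

W, H = 176, 144                      # resolution given in the question
Y = W * H                            # 25344 bytes per Y plane

# Frame sizes for the usual raw layouts (planar or interleaved).
candidates = {
    "4:2:0 (e.g. I420/NV12)": Y * 3 // 2,   # Y + U/4 + V/4
    "4:2:2 (e.g. UYVY/YUY2)": Y * 2,        # Y + U/2 + V/2
    "4:4:4": Y * 3,
}

size = os.path.getsize("clip.yuv")    # placeholder file name
for name, frame_size in candidates.items():
    if size % frame_size == 0:
        print(f"{name}: {size // frame_size} whole frames of {frame_size} bytes")

Whichever size matches, the interleaving order itself still has to be confirmed visually, as suggested above.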
