Implementing a picture zoom effect using RMagick and FFmpeg - Ruby

I have a picture and I need to get a zoom effect in the resulting video. I have almost achieved the desired result, but the resulting picture looks a bit shaky. This is because of rounding during cropping and resizing, so the centre of the picture shifts slightly with each conversion. What can I do about that? Or is there some other method to implement it?
As input I have:
picture, zoom_type, zoom_percent, zoom_duration, scene_duration
Here is the part of the code that does the job:
require 'rmagick'

img = Magick::ImageList.new(picture).first
width, height = img.columns.to_f, img.rows.to_f
img_fps = 30

if width >= height
  aspect_ratio = (width / height)
  zoom_small_size = ((height * (100 - zoom_percent)) / 100).to_f  # smaller dimension at full zoom
  small_size = height
else
  aspect_ratio = (height / width)
  zoom_small_size = ((width * (100 - zoom_percent)) / 100).to_f
  small_size = width
end

# Per-frame shrink step; lower the fps until the step is at least 2 pixels
factor = ((small_size - zoom_small_size) / (img_fps * zoom_duration)).to_f
while factor < 2
  img_fps -= 1
  factor = (small_size - zoom_small_size) / (img_fps * zoom_duration)
end

total_images = img_fps * scene_duration
zoom_images = img_fps * zoom_duration
new_width = width
new_height = height
zoom_changed_small_size = small_size

total_images.times do |i|
  if zoom_images > 0 && zoom_changed_small_size > zoom_small_size
    img_n = img.crop(new_width, new_height, true)
    new_width  = (width <= height) ? (new_width - factor).round  : (new_width - factor * aspect_ratio).round
    new_height = (width >= height) ? (new_height - factor).round : (new_height - factor * aspect_ratio).round
    zoom_changed_small_size = (width >= height) ? img_n.rows : img_n.columns
    img_n.resize_to_fill!(width, height)
    img_n.write("img_%04d.jpg" % (i + 1))
    zoom_images -= 1
    img = img_n.copy if zoom_images == 0 || zoom_changed_small_size <= zoom_small_size
    img_n.destroy!
  else
    img.write("img_%04d.jpg" % (i + 1))
    puts "Writing - #{img.filename}"
  end
end
Then I run:
ffmpeg -y -f image2 -r 30 -i img_%04d.jpg -crf 0 -preset ultrafast -tune stillimage -pix_fmt yuv420p out.mp4

The simplest, brute-force approach you could take is to resize your starting images at the beginning of the process to be 3 or 4 times larger than they need to be at maximum zoom. That will reduce the effect of integer pixel rounding once you resize back down for the video frame, to the point where it is (hopefully) not easily visible or at least not distracting. The advantage of this approach is that you can keep most of your existing code as-is. The disadvantage is that you will need to experiment to get the right scaling factor, and depending on your source material and target video size you might end up working with some very large images.
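As a rough illustration of that idea (my own sketch, not code from the original answer; the oversample factor is an assumption you would tune by eye against your source material and target video size):

require 'rmagick'

oversample = 3  # assumed factor; 3-4x is the ballpark suggested above
img = Magick::ImageList.new(picture).first
big = img.resize(img.columns * oversample, img.rows * oversample)
# ...run the existing crop/zoom loop against `big`, then resize each frame
# back down to the original width x height just before writing it, so the
# per-frame rounding error ends up well below one output pixel.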
If that simple patch is not to your liking, then it is possible to work with sub-pixel sampling zooms and pans in ImageMagick...
I had a look at using affine_transform, but in practice it is fiddly. Instead, here is something using the distort method, which seems designed for your needs. This example takes an image, a set of points that define a "view rectangle" (which can all be floating point), and the target width and height of the zoomed-in image (which should be integers):
def zoom_window(image, from_left, from_top, from_right, from_bottom, to_width, to_height)
  from_width  = (from_right - from_left).to_f
  from_height = (from_bottom - from_top).to_f
  from_centre_x = 0.5 * (from_left + from_right)
  from_centre_y = 0.5 * (from_top + from_bottom)
  scale_x = to_width / from_width
  scale_y = to_height / from_height

  zoomed_and_scaled_image = image.distort(
    Magick::ScaleRotateTranslateDistortion,
    [from_centre_x, from_centre_y, scale_x, scale_y, 0.0,
     0.5 * to_width, 0.5 * to_height]
  ) { |i| i.define("distort:viewport", "#{to_width}x#{to_height}+0+0") }

  zoomed_and_scaled_image
end
I have tested the output of this by progressively shrinking the from_ rectangle, then using a variation of your ffmpeg command - it resulted in a smooth zoom effect, coping with sub-pixel accuracy nicely, even on extreme zoom-ins (although of course these look blurry). To use it, you would need to calculate the Float co-ords for the zoom window and call the above method (or your variation of it) where you currently crop and resize img_n.
NB: I have not made much attempt to make this Ruby code "nice"; it's just a proof of concept.
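For illustration only, here is a minimal sketch of how the per-frame window could be computed as Floats and passed to zoom_window for the zoom-in portion of the scene. It is not part of the original answer; it assumes the width, height, small_size, zoom_small_size, img_fps and zoom_duration values from the question, with img still holding the full-size original image:

frames = img_fps * zoom_duration
frames.times do |i|
  t = frames > 1 ? i.to_f / (frames - 1) : 1.0   # progress 0.0 .. 1.0 over the zoom
  # Shrink the window with Float precision - no per-frame rounding.
  win_h = height - t * (small_size - zoom_small_size) * (height / small_size)
  win_w = win_h * (width / height)
  left  = 0.5 * (width - win_w)
  top   = 0.5 * (height - win_h)
  frame = zoom_window(img, left, top, left + win_w, top + win_h,
                      width.to_i, height.to_i)
  frame.write("img_%04d.jpg" % (i + 1))
  frame.destroy!
end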

Related

Three.js - scaling background image to fit window, without stretching it

When using scene.background the texture is stretched to fit the window size.
I'm trying to reproduce the CSS cover attribute behavior as described on the MDN page:
Scales the image as large as possible without stretching the image.
If the proportions of the image differ from the element, it is cropped
either vertically or horizontally so that no empty space remains.
I understand the repeat and offset texture attributes should be used, but I'm not sure how.
I had the same problem. After digging through the docs I resolved it. The code below should work for anyone struggling with this.
targetWidth is your surface or canvas width.
imageWidth is the width of the texture that needs to fit the target.
const targetAspect = targetWidth / targetHeight;
const imageAspect = imageWidth / imageHeight;
const factor = imageAspect / targetAspect;
// When factor is larger than 1, the texture is 'wider' than the target.
// We scale the texture height to the target height and then 'map' the
// center of the texture to the target, and vice versa.
scene.background.offset.x = factor > 1 ? (1 - 1 / factor) / 2 : 0;
scene.background.repeat.x = factor > 1 ? 1 / factor : 1;
scene.background.offset.y = factor > 1 ? 0 : (1 - factor) / 2;
scene.background.repeat.y = factor > 1 ? 1 : factor;

Three JS - Scaling texture to fit a (any size) Plane perfectly

In essence, I want to replicate the behaviour of how CSS background-size: cover works.
Looking here you can see the image is being scaled keeping its aspect ratio, but it's not really working correctly, as the image does not fill the Plane, leaving margins on either side - https://next.plnkr.co/edit/8650f9Ji6qWffTqE?preview
Code snippet (Lines 170 - 175) -
var geometryAspectRatio = 5/3;
var imageAspectRatio = 3264/2448;
textTile.wrapT = THREE.RepeatWrapping;
textTile.repeat.x = geometryAspectRatio / imageAspectRatio;
textTile.offset.x = 0.5 * ( 1 - textTile.repeat.x );
What I want to happen is for it to scale up and then reposition itself in the centre (much like how cover works).
var repeatX, repeatY;
repeatX = w * this.textureHeight / (h * this.textureWidth);
if (repeatX > 1) {
    // fill the width and adjust the height accordingly
    repeatX = 1;
    repeatY = h * this.textureWidth / (w * this.textureHeight);
    mat.map.repeat.set(repeatX, repeatY);
    mat.map.offset.y = (repeatY - 1) / 2 * -1;
} else {
    // fill the height and adjust the width accordingly
    repeatX = w * this.textureHeight / (h * this.textureWidth);
    repeatY = 1;
    mat.map.repeat.set(repeatX, repeatY);
    mat.map.offset.x = (repeatX - 1) / 2 * -1;
}
Updated https://next.plnkr.co/edit/LUk37xLG2yvv6hgg?preview
For anyone as confused by this as I was: the missing piece for me is that the .repeat.x and .repeat.y properties of any texture can be values less than one, and a value under 1 scales the image up by the inverse of the scale. Think about it: at scale 2 the texture in a sense repeats 0.5 times, because you only see half of the image.
So...
Something that is not supported by textures in THREE.js, but is common in some libraries, would be
.scaleX = 2; (not supported in THREE.js textures as of v1.30.1)
And the THREE.js texture equivalent would be
texture.repeat.x = .5;
To convert scale to "repeat", simply do the inverse of the scale
var desiredScaleX = 3;
var desiredRepeatX = 1 / desiredScaleX;
The repeat for scale 3 comes out to 1/3 = 0.3333. In other words, a 3x image would be cropped to show only 1/3 of the image, so it repeats 0.3333 times.
As for scaling to fit to cover, generally choosing the larger scale of the two will do the trick, something like:
var fitScaleX = targetWidth / actualWidth;
var fitScaleY = targetHeight / actualHeight;
var fitCoverScale = Math.max(fitScaleX,fitScaleY);
var repeatX = 1 / fitCoverScale;
var repeatY = 1 / fitCoverScale;

Zooming/scaling a tiled image anchoring the zoom point to the mouse cursor

I've got a project where I'm designing an image viewer for tiled images. Every image tile is 256x256 pixels. For each level of scaling, I'm increasing the size of each image by 5%. I represent the placement of the tiles by dividing the screen into tiles the same size as each image. An offset is used to precisely place each image where needed. When the scaling reaches a certain point (1.5), I switch over to a new layer of images that altogether has a greater resolution than the previous images. The zooming method itself looks like this:
def zoomer(self, mouse_pos, zoom_in):  # (tuple, bool)
    x, y = mouse_pos
    x_tile, y_tile = x / self.tile_size, y / self.tile_size
    old_scale = self.scale
    if self.scale > 0.75 and self.scale < 1.5:
        if zoom_in:
            self.scale += SCALE_STEP  # SCALE_STEP = 5% = 0.05
            ratio = (SCALE_STEP + 1)
        else:
            self.scale -= SCALE_STEP
            ratio = 1 / (SCALE_STEP + 1)
    else:
        if zoom_in:
            self.zoom += 1
            self.scale = 0.8
            ratio = (SCALE_STEP + 1)
        else:
            self.zoom -= 1
            self.scale = 1.45
            ratio = 1 / (SCALE_STEP + 1)
    # Results in x/y lengths of the relevant full image
    x_len = self.size_list[self.levels][0] / self.power()
    y_len = self.size_list[self.levels][1] / self.power()
    # Removing extra pixel if present
    x_len = x_len - (x_len % 2)
    y_len = y_len - (y_len % 2)
    # The tile's picture coordinates
    tile_x = self.origo_tile[0] + x_tile
    tile_y = self.origo_tile[1] + y_tile
    # The mouse's picture pixel address
    x_pic_pos = (tile_x * self.tile_size) - self.img_x_offset + (x % self.tile_size)
    y_pic_pos = (tile_y * self.tile_size) - self.img_y_offset + (y % self.tile_size)
    # Mouse percentile placement within the image
    mouse_x_percent = (x_pic_pos / old_scale) / x_len
    mouse_y_percent = (y_pic_pos / old_scale) / y_len
    # The mouse's new picture pixel address
    new_x = (x_len * self.scale) * mouse_x_percent
    new_y = (y_len * self.scale) * mouse_y_percent
    # Scaling tile size
    self.tile_size = int(TILE_SIZE * self.scale)
    # New mouse screen tile position
    new_mouse_x_tile = x / self.tile_size
    new_mouse_y_tile = y / self.tile_size
    # The mouse's new tile address
    new_tile_x = new_x / self.tile_size
    new_tile_y = new_y / self.tile_size
    # New tile offsets
    self.img_x_offset = (x % self.tile_size) - int(new_x % self.tile_size)
    self.img_y_offset = (y % self.tile_size) - int(new_y % self.tile_size)
    # New origo tile
    self.origo_tile = (int(new_tile_x) - new_mouse_x_tile,
                       int(new_tile_y) - new_mouse_y_tile)
Now, the issue arising from this is that the mouse_.._percent variables never seem to match up with the real position. For testing purposes, I feed the method a mouse position centered in the middle of the screen, with the picture centered in the middle too. As such, the resulting mouse_.._percent variables should, in a perfect world, always equal 50%. For the first level they do, but they quickly wander off when scaling. By the time I reach the first zoom breakpoint (self.scale == 1.5), the position has drifted to x = 48%, y = 42%.
self.origo_tile is a tuple containing the x/y coordinates of the tile to be drawn at screen tile (0, 0).
I've been staring at this for hours, but can't seem to find a remedy for it...
How the program works:
I apologize that I didn't have enough time to apply this to your code, but I wrote the following zooming simulator. The program allows you to zoom the same "image" multiple times, and it outputs the point of the image that would appear in the center of the screen, along with how much of the image is being shown.
The code:
from __future__ import division  # double underscores, defense against the sinister integer division

width = 256   # original image size
height = 256
posx = 128    # original display center, relative to the image
posy = 128

while 1:
    print "Display width: ", width
    print "Display height: ", height
    print "Center X: ", posx
    print "Center Y: ", posy
    anchx = int(raw_input("Anchor X: "))
    anchy = int(raw_input("Anchor Y: "))
    zmag = int(raw_input("Zoom Percent (0-inf): "))
    zmag /= 100  # convert from percent to decimal
    zmag = 1 / zmag
    width *= zmag
    height *= zmag
    posx = ((anchx - posx) * zmag) + posx
    posy = ((anchy - posy) * zmag) + posy
Sample output:
If this program outputs the following:
Display width: 32.0
Display height: 32.0
Center X: 72.0
Center Y: 72.0
Explanation:
This means the zoomed-in screen shows only a part of the image, that part being 32x32 pixels, with the center of that part at coordinates (72, 72). In this specific example it is therefore displaying pixels 56 to 88 of the image on both axes.
Solution/Conclusion:
Play around with that program a bit, and see if you can implement it in your own code. Keep in mind that different programs move the Center X and Y differently; change the program I gave if you do not like how it already works (though you probably will, it's a common way of doing it). Happy Coding!

Resize image by pixel amount

I tried to find out, but I couldn't.
An image of, for example, 241x76 has a total of 18,316 pixels (241 * 76).
The resize rule is that the number of pixels cannot exceed 10,000.
How, then, can I get the new size, keeping the aspect ratio while ending up with fewer than 10,000 pixels?
Pseudocode:
pixels = width * height
if (pixels > 10000) then
    ratio = width / height
    scale = sqrt(pixels / 10000)
    height2 = floor(height / scale)
    width2 = floor(ratio * height / scale)
    ASSERT width2 * height2 <= 10000
end if
Remember to use floating-point math for all calculations involving ratio and scale when implementing.
Python
import math

def capDimensions(width, height, maxPixels=10000):
    pixels = width * height
    if (pixels <= maxPixels):
        return (width, height)
    ratio = float(width) / height
    scale = math.sqrt(float(pixels) / maxPixels)
    height2 = int(float(height) / scale)
    width2 = int(ratio * height / scale)
    return (width2, height2)
An alternative function in C# which takes and returns an Image object:
using System.Drawing;
using System.Drawing.Drawing2D;

public Image resizeMaxPixels(int maxPixelCount, Image originalImg)
{
    Double pixelCount = originalImg.Width * originalImg.Height;
    if (pixelCount < maxPixelCount) // no downsize needed
    {
        return originalImg;
    }
    else
    {
        // EDIT: not actually needed - scaleRatio takes care of this
        // Double aspectRatio = originalImg.Width / originalImg.Height;
        // scale varies as the square root of the ratio (width x height):
        Double scaleRatio = Math.Sqrt(maxPixelCount / pixelCount);
        Int32 newWidth = (Int32)(originalImg.Width * scaleRatio);
        Int32 newHeight = (Int32)(originalImg.Height * scaleRatio);
        Bitmap newImg = new Bitmap(newWidth, newHeight);
        // this keeps the quality as good as possible when resizing
        using (Graphics gr = Graphics.FromImage(newImg))
        {
            gr.SmoothingMode = SmoothingMode.AntiAlias;
            gr.InterpolationMode = InterpolationMode.HighQualityBicubic;
            gr.PixelOffsetMode = PixelOffsetMode.HighQuality;
            gr.DrawImage(originalImg, new Rectangle(0, 0, newWidth, newHeight));
        }
        return newImg;
    }
}
with graphics code from the answer to Resizing an Image without losing any quality
EDIT: Calculating the aspect ratio is actually irrelevant here, as we're already scaling the width and height by the square root of the total pixel ratio. You could use it to calculate the newWidth based on the newHeight (or vice versa), but this isn't necessary.
Deestan's code works for square images, but in situations where the aspect ratio is different than 1, a square root won't do. You need to take scale to the power of aspect ratio divided by 2.
Observe (Python):
def capDimensions(width, height, maxPixels):
    pixels = width * height
    if (pixels <= maxPixels):
        return (width, height)
    ratio = float(width) / height
    scale = (float(pixels) / maxPixels) ** (width / (height * 2))
    height2 = round(float(height) / scale)
    width2 = round(ratio * height2)
    return (width2, height2)
Let's compare the results.
initial dimensions: 450x600
initial pixels: 270000
I'm trying to resize to get as close as possible to 119850 pixels.
with Deestan's algorithm:
capDimensions: 300x400
resized pixels: 67500
with the modified algorithm:
capDimensions: 332x442
resized pixels: 82668
width2 = int(ratio * height / scale)
would be better written as
width2 = int(ratio * height2)
because this would potentially preserve the aspect ratio better (as height2 has already been truncated).
Without introducing another variable like 'sc', one can write
new_height = floor(sqrt(m / r))
and
new_width = floor(sqrt(m * r))
given m = max_pixels (here: 10,000) and r = ratio = w/h (here: 241/76 ≈ 3.171). For the 241x76 example this gives new_width = floor(sqrt(10000 * 3.171)) = 178 and new_height = floor(sqrt(10000 / 3.171)) = 56, i.e. 9,968 pixels.
The two results are independent of each other! From either new value you can calculate the other dimension, with
(given: new_height) new_width = floor(new_height * r)
(given: new_width) new_height = floor(new_width / r)
Because the values are clipped (floor function), the two pairs of dimensions may differ in how close their ratio is to the original ratio; you'd choose the better pair.
Scaling images down to max number of pixels, while maintaining aspect ratio
This is what I came up with this afternoon, while trying to solve the math problem on my own, for fun. My code seems to work fine; I tested it with a few different shapes and sizes of images. Make sure to use floating-point variables or the math will break.
Pseudocode
orig_width = 1920
orig_height = 1080
orig_pixels = (orig_width * orig_height)
max_pixels = 180000

if (orig_pixels <= max_pixels) {
    # use original image
}
else {
    # scale image down
    ratio = sqrt(orig_pixels / max_pixels)
    new_width = floor(orig_width / ratio)
    new_height = floor(orig_height / ratio)
}
Example results
1920x1080 (1.77778 ratio) becomes 565x318 (1.77673 ratio, 179,670 pixels)
1000x1000 (1.00000 ratio) becomes 424x424 (1.00000 ratio, 179,776 pixels)
200x1200 (0.16667 ratio) becomes 173x1039 (0.16651 ratio, 179,747 pixels)

What's a more elegant rephrasing of this cropping algorithm? (in Python)

I want to crop a thumbnail image in my Django application, so that I get a quadratic image that shows the center of the image. This is not very hard, I agree.
I have already written some code that does exactly this, but somehow it lacks a certain ... elegance. I don't want to play code golf, but there must be a way to express this shorter and more pythonic, I think.
from PIL import Image

x = y = 200  # intended size
image = Image.open(filename)
width = image.size[0]
height = image.size[1]

if (width > height):
    crop_box = (((width - height) / 2), 0, ((width - height) / 2) + height, height)
    image = image.crop(crop_box)
elif (height > width):
    crop_box = (0, ((height - width) / 2), width, ((height - width) / 2) + width)
    image = image.crop(crop_box)

image.thumbnail([x, y], Image.ANTIALIAS)
Do you have any ideas, SO?
edit: explained x, y
I think this should do:
size = min(image.size)
originX = image.size[0] / 2 - size / 2
originY = image.size[1] / 2 - size / 2
cropBox = (originX, originY, originX + size, originY + size)
The fit() function in the PIL ImageOps module does what you want:
ImageOps.fit(image, (min(*image.size),) * 2, Image.ANTIALIAS, 0, (.5, .5))
width, height = image.size

if width > height:
    crop_box = # something 1
else:
    crop_box = # something 2

image = image.crop(crop_box)
image.thumbnail([x, x], Image.ANTIALIAS)  # explicitly show "square" thumbnail
I want to do a content analysis of a JPEG image. I wish to take a JPEG image, say 251x261, and pass it through an algorithm to crop it to, say, 96x87. Can this program do that, i.e. write an intelligent cropping algorithm, with a prompt to resize the image?
