Following: How to rotate a non-squared image in frequency domain
I just ran this exact code (copy-pasted from the author's code, with only an additional normalization of the resulting image between 0 and 255, as shown below), but I get horrible "aliasing" artifacts. How is this possible? I see that the OP shows nice, artifact-free images from the rotation in frequency space. I would be very curious to know how to obtain that; surely you did not show all your code?
import numpy as np
import cv2
from numpy.fft import fftshift as fftshift
from numpy.fft import ifftshift as ifftshift
angle = 30
M = cv2.imread("phantom.png")
M = cv2.cvtColor(M, cv2.COLOR_BGR2GRAY)
M = np.float32(M)

# Window the image to reduce edge effects
hanning = cv2.createHanningWindow((M.shape[1], M.shape[0]), cv2.CV_32F)
M = hanning * M

sM = fftshift(M)
rotation_center = (M.shape[1] / 2, M.shape[0] / 2)
rot_matrix = cv2.getRotationMatrix2D(rotation_center, angle, 1.0)

# Forward DFT, rotate the (shifted) spectrum, then inverse DFT
FsM = fftshift(cv2.dft(sM, flags=cv2.DFT_COMPLEX_OUTPUT))
rFsM = cv2.warpAffine(FsM, rot_matrix, (FsM.shape[1], FsM.shape[0]),
                      flags=cv2.INTER_LINEAR, borderMode=cv2.BORDER_CONSTANT)
IrFsM = ifftshift(cv2.idft(ifftshift(rFsM), flags=cv2.DFT_REAL_OUTPUT))

# Normalize the result to [0, 255] and save
x = IrFsM
x = ((x - np.min(x)) / (np.max(x) - np.min(x))) * 255.0
cv2.imwrite('rotated_phantom.png', x)
The output image is:
Also, I've always been told that it is impossible to correctly rotate an image in discrete (as opposed to continuous) Fourier space because of interpolation, so how do you explain that?
The DFT imposes a periodicity to the image in both the frequency and the spatial domain (some people disagree with this, but still agree that this view is a good way to explain just about everything that happens in the DFT...). So imagine that your input is not the Shepp-Logan phantom, but an infinite repetition of it. When manipulating the data in the frequency domain, you affect not just the one copy of the image you see, but all of them, and not always in intuitive ways.
One of the consequences is that the neighboring copies of your image in the spatial domain rotate and expand as well, and end up encroaching on your image.
The simplest way to avoid this is to pad the image with zeros to double its size.
import numpy as np
import cv2
from numpy.fft import fftshift as fftshift
from numpy.fft import ifftshift as ifftshift
angle = 30
M = cv2.imread("shepp-logan-small.tif")
M = cv2.cvtColor(M, cv2.COLOR_BGR2GRAY)
M = np.float32(M)
# Pad
v, h = M.shape
v //= 2
h //= 2
M = cv2.copyMakeBorder(M, v, v, h, h, cv2.BORDER_CONSTANT)
Now you can apply the rotation in the frequency domain like before. But note that the center of rotation should be the pixel at shape//2; do not use a true division! Also note that we no longer need to apply a window function.
sM = fftshift(M)
rotation_center = (M.shape[1]//2, M.shape[0]//2)
rot_matrix = cv2.getRotationMatrix2D(rotation_center, angle, 1.0)
FsM = fftshift(cv2.dft(sM, flags=cv2.DFT_COMPLEX_OUTPUT))
rFsM = cv2.warpAffine(FsM, rot_matrix, (FsM.shape[1], FsM.shape[0]),
                      flags=cv2.INTER_LINEAR, borderMode=cv2.BORDER_CONSTANT)
IrFsM = ifftshift(cv2.idft(ifftshift(rFsM), flags=cv2.DFT_REAL_OUTPUT))
Finally, crop the result back to its original size.
# Crop
IrFsM = IrFsM[v:-v, h:-h]
Do note that the result is not pretty. It is much better to rotate in the spatial domain because, as you said, interpolation in the frequency domain is not really meaningful.
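For reference, here is a minimal sketch of the equivalent spatial-domain rotation (my own illustration, assuming the same phantom.png input and output naming as in the question):

import cv2
import numpy as np

angle = 30
M = cv2.imread("phantom.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)

# Rotate directly in the spatial domain, around the image centre
center = (M.shape[1] // 2, M.shape[0] // 2)
rot_matrix = cv2.getRotationMatrix2D(center, angle, 1.0)
rotated = cv2.warpAffine(M, rot_matrix, (M.shape[1], M.shape[0]),
                         flags=cv2.INTER_LINEAR, borderMode=cv2.BORDER_CONSTANT)

cv2.imwrite("rotated_phantom_spatial.png", rotated)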
When generating polygons with buffer (here squares), the points used for generation end up with different coordinates than the ones returned by the .centroid method on the resulting polygons.
Here is an example with just one point.
from shapely.ops import transform
import geopandas as gpd
import shapely.wkt
import pyproj
from math import sqrt

def edge_size(area):
    return sqrt(area) * 1e3

point = "POINT (4379065.583907348 2872272.254645019)"
point = shapely.wkt.loads(point)
center = gpd.GeoSeries(point)

project = pyproj.Transformer.from_proj(
    pyproj.Proj('epsg:3395'),
    pyproj.Proj('epsg:4326'),
    always_xy=True)

center = center.apply(lambda p: transform(project.transform, p))
print(center.iloc[0])

square = point.buffer(
    edge_size(3), cap_style=3)  # buffer distance: edge length of a 3 km² square
square = gpd.GeoSeries(square)
square = square.apply(lambda p: transform(project.transform, p))
square = square.apply(lambda p: p.centroid)
print(square.iloc[0])

# POINT (39.33781544185747 25.11929860805248)
# POINT (39.33781544185747 25.11929777802279)
This leads to processing errors later on.
First of all, is this normal? And how can I solve this problem?
I have also reported the issue here. Thank you for your attention.
Copying my answer from GitHub for posterity.
This is not a bug but a misunderstanding of coordinate transformation. You have to keep in mind that a shape that is square in one projection is not square in another.
If you stick to the same CRS, the centroid of the buffer equals the initial point. But the centroid of a reprojected polygon is slightly off, precisely because the reprojection skewed the geometry in one direction.
How to overcome this problem?
Do all your operations in one CRS and reproject once you are done.
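For example, here is a minimal sketch of that workflow based on the code in the question: buffer and centroid are both computed in EPSG:3395, and the reprojection happens only once at the end.

from shapely.ops import transform
import shapely.wkt
import pyproj
from math import sqrt

def edge_size(area):
    return sqrt(area) * 1e3

point = shapely.wkt.loads("POINT (4379065.583907348 2872272.254645019)")

# Buffer and take the centroid in the projected CRS (EPSG:3395):
# here the centroid coincides with the original point.
square = point.buffer(edge_size(3), cap_style=3)
print(point.distance(square.centroid))  # ~0.0, up to floating point

# Reproject only once, at the very end.
project = pyproj.Transformer.from_proj(
    pyproj.Proj('epsg:3395'),
    pyproj.Proj('epsg:4326'),
    always_xy=True)
print(transform(project.transform, point))
print(transform(project.transform, square.centroid))  # should print the same coordinates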
An exciting animation was posted on twitter recently: https://twitter.com/thomas_rackow/status/1392509885883944960.
One of the authors explained how a single frame is created in this Jupyter notebook: https://nbviewer.jupyter.org/github/koldunovn/FESOM_SST_shaded_by_U/blob/main/FESOM_SST_shaded_by_U.ipynb
Related to the simple code shown in this notebook, my question is: when we call imshow twice on the same ax:
ax.imshow(np.flipud(sst.sst.values), cmap=cm.RdBu_r, vmin=12, vmax=24)
ax.imshow(np.flipud(u.u_surf.values), alpha=0.3, cmap=cm.gray, vmin=-.3, vmax=0.3)
what operations does matplotlib perform behind the scenes to produce the layered image?
I have worked with alpha blending in OpenCV-Python, but here we start with two arrays of the same shape (1000, 1000), and imshow, called twice on the same axes, displays the resulting image. I'd like to know how this is possible. What arithmetic operations between the images are involved?
I searched the matplotlib GitHub repository to understand what's going on, but I couldn't find anything relevant.
I managed to show that, behind the scenes, the two imshow calls amount to alpha blending of the two images.
import numpy as np
import xarray as xr
import matplotlib.pyplot as plt
import matplotlib.cm as cm
sst = xr.open_dataset('GF_FESOM2_testdata/sst.nc')
u = xr.open_dataset('GF_FESOM2_testdata/u_surf.nc')
v = xr.open_dataset('GF_FESOM2_testdata/v_surf.nc')
# Define the heatmap from the SST data and extract the array representing it as an image:
fig1, ax1 = plt.subplots(1, 1, constrained_layout=True, figsize=(10, 10))
f1 = ax1.imshow(np.flipud(sst.sst.values), cmap=cm.RdBu_r, vmin=12, vmax=24)
ax1.axis('off')
arr1 = f1.make_image('notebook')[0]  # array representing the above image

# Repeat the same procedure for the u data set:
fig2, ax2 = plt.subplots(1, 1, constrained_layout=True, figsize=(10, 10))
f2 = ax2.imshow(np.flipud(u.u_surf.values), cmap=cm.gray, vmin=-0.3, vmax=0.3)
ax2.axis('off')
arr2 = f2.make_image('notebook')[0]

# Alpha blending of the two images amounts to a convex combination of the associated arrays
alpha1 = 1    # background image alpha
alpha2 = 0.3  # foreground image alpha
arr = np.asarray((alpha2*arr2 + alpha1*(1-alpha2)*arr1) / (alpha2 + alpha1*(1-alpha2)),
                 dtype=np.uint8)

fig, ax = plt.subplots(1, 1, constrained_layout=True, figsize=(10, 10))
ax.imshow(np.flipud(arr))
ax.axis('off')
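The same blend can be reproduced without the FESOM data files. Below is a small self-contained sketch of my own, with random arrays standing in for the sst and u_surf fields; with a fully opaque background (alpha1 = 1), the formula above reduces to alpha2*arr2 + (1-alpha2)*arr1.

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm

rng = np.random.default_rng(0)
background = rng.uniform(12, 24, size=(100, 100))      # stand-in for the SST field
foreground = rng.uniform(-0.3, 0.3, size=(100, 100))   # stand-in for u_surf

fig, (ax_layered, ax_manual) = plt.subplots(1, 2, figsize=(8, 4))

# Layered: two imshow calls on the same axes, the second with alpha=0.3
ax_layered.imshow(background, cmap=cm.RdBu_r, vmin=12, vmax=24)
ax_layered.imshow(foreground, alpha=0.3, cmap=cm.gray, vmin=-0.3, vmax=0.3)

# Manual: map both arrays through their colormaps, then blend with the same formula
rgba1 = cm.RdBu_r(plt.Normalize(12, 24)(background))
rgba2 = cm.gray(plt.Normalize(-0.3, 0.3)(foreground))
alpha2 = 0.3
blended = alpha2 * rgba2[..., :3] + (1 - alpha2) * rgba1[..., :3]
ax_manual.imshow(blended)

plt.show()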
I'm trying to implement a perception-based image search engine that will allow users to find pictures containing objects of roughly the same or similar colour to a user-specified template (an object from a sample image).
The goal for now is not to match a precise object, but rather to find any significant areas that are close in colour to the template. I am stuck on indexing my dataset.
I have tried some clustering algorithms, such as k-means from sklearn.cluster (as suggested in this article), to select centroids from the sample image as my features, converted to the CIELab colour space for better perceptual uniformity. But it doesn't seem to work well: since the cluster centres are initialized randomly, I get poor metric results even when comparing an object against the very image it was extracted from!
As far as I can tell, the common approach in simple image search tools is to compare histograms, which is not acceptable here because I want to preserve perceptually valid colour differences; by that I mean that I can only compare two individual colours (plus perhaps some additional values) when computing a metric in the CIELab colour space. I am using my own implementation of the CMC l:c metric, and it has produced good results so far.
Maybe someone can help me and recommend an algorithm more suitable for my purpose.
Some code that I've done so far:
import cv2 as cv
import numpy as np
from sklearn.cluster import KMeans, MiniBatchKMeans
from imageproc.color_metrics import *

def feature_extraction(image, features_length=6):
    width, height, dimensions = tuple(image.shape)
    image = cv.cvtColor(image, cv.COLOR_BGR2LAB)
    image = cv.medianBlur(image, 7)
    image = np.reshape(image, (width * height, dimensions))
    clustering_handler = MiniBatchKMeans(n_init=40, tol=0.0, n_clusters=features_length,
                                         compute_labels=False, max_no_improvement=10,
                                         max_iter=200, reassignment_ratio=0.01)
    clustering_handler.fit(image)
    features = np.array(clustering_handler.cluster_centers_, dtype=np.float64)
    # Rescale OpenCV's 8-bit Lab encoding to the usual CIELab ranges
    features[:, :1] /= 255.0
    features[:, :1] *= 100.0
    features[:, 1:2] -= 128.0
    features[:, 2:3] -= 128.0
    return features
if __name__ == '__main__':
    first_image_name = object_image_name
    second_image_name = image_name

    sample_features = list()
    reference_features = list()
    for name, features in zip([first_image_name, second_image_name],
                              [sample_features, reference_features]):
        image = cv.imread(name)
        features.extend(feature_extraction(image, 6))

    distance_matrix = np.ndarray((6, 6))
    distance_mappings = {}
    for n, i in enumerate(sample_features):
        for k, j in enumerate(reference_features):
            distance_matrix[n][k] = calculate_cmc_distance(i, j)
            distance_mappings.update({distance_matrix[n][k]: (i, j)})

    minimal_distances = []
    for i in distance_matrix:
        minimal_distances.append(min(i))
    minimal_distances = sorted(minimal_distances)
    print(minimal_distances)

    for ii in minimal_distances:
        i, j = distance_mappings[ii]
        color_plate1 = np.zeros((300, 300, 3), np.float32)
        color_plate2 = np.zeros((300, 300, 3), np.float32)
        color1 = cv.cvtColor(np.float32([[i]]), cv.COLOR_LAB2BGR)[0][0]
        color2 = cv.cvtColor(np.float32([[j]]), cv.COLOR_LAB2BGR)[0][0]
        color_plate1[:] = color1
        color_plate2[:] = color2
        cv.imshow("s", np.hstack((color_plate1, color_plate2)))
        cv.waitKey()

    print(sum(minimal_distances))
The usual approach would be to cluster only once, with a representative sample from all images.
This is a preprocessing step, to generate your "dictionary".
Then, for feature extraction, you would map points to the fixed cluster centers that are now shared across all images. This is a simple nearest-neighbor mapping, not clustering.
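Here is a minimal sketch of that approach; the image paths and the number of clusters are arbitrary placeholders. The point is that MiniBatchKMeans is fitted once on a pooled pixel sample, and only predict (nearest-centre assignment) is used per image afterwards.

import cv2 as cv
import numpy as np
from sklearn.cluster import MiniBatchKMeans

def lab_pixels(path, sample=5000):
    """Load an image, convert to Lab, and return a random sample of its pixels."""
    image = cv.cvtColor(cv.imread(path), cv.COLOR_BGR2LAB)
    pixels = image.reshape(-1, 3).astype(np.float64)
    idx = np.random.choice(len(pixels), size=min(sample, len(pixels)), replace=False)
    return pixels[idx]

# 1. Preprocessing: build the shared colour "dictionary" once.
training_paths = ["img1.png", "img2.png"]   # hypothetical dataset paths
pooled = np.vstack([lab_pixels(p) for p in training_paths])
dictionary = MiniBatchKMeans(n_clusters=32, n_init=10).fit(pooled)

# 2. Feature extraction: assign each image's pixels to the fixed centres
#    (nearest-neighbour mapping, no re-clustering) and histogram them.
def describe(path):
    pixels = lab_pixels(path)
    labels = dictionary.predict(pixels)
    hist = np.bincount(labels, minlength=dictionary.n_clusters).astype(np.float64)
    return hist / hist.sum()

print(describe("img1.png"))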
I am using imsave() sequentially to make many PNGs that I will combine as an AVI and I would like to add moving text annotations. I use ImageJ to make AVIs or GIFs.
I don't want the axes, numbers, borders or anything, just the color image (as imsave() provides for example) with text (and maybe arrows) inside. These will change frame by frame. Pardon the use of jet.
I could use savefig() with ticks turned off and then crop as a post-processing step, but is there a more convenient, direct, or "matplotlibithic" way to do this that wouldn't be so hard on my hard drive? (The final thing will be pretty big.)
A code snippet, added by request:
import numpy as np
import matplotlib.pyplot as plt
nx, ny = 101, 101
phi = np.zeros((ny, nx), dtype = 'float')
do_me = np.ones_like(phi, dtype='bool')
x0, y0, r0 = 40, 65, 12
x = np.arange(nx, dtype = 'float')[None,:]
y = np.arange(ny, dtype = 'float')[:,None]
rsq = (x-x0)**2 + (y-y0)**2
circle = rsq <= r0**2
phi[circle] = 1.0
do_me[circle] = False
do_me[0,:], do_me[-1,:], do_me[:,0], do_me[:,-1] = False, False, False, False
n, nper = 100, 100
phi_hold = np.zeros((n+1, ny, nx))
phi_hold[0] = phi
for i in range(n):
    for j in range(nper):
        phi2 = 0.25*(np.roll(phi,  1, axis=0) +
                     np.roll(phi, -1, axis=0) +
                     np.roll(phi,  1, axis=1) +
                     np.roll(phi, -1, axis=1))
        phi[do_me] = phi2[do_me]
    phi_hold[i+1] = phi
change = phi_hold[1:] - phi_hold[:-1]
places = [(32, 20), (54,25), (11,32), (3, 12)]
plt.figure()
plt.imshow(change[50])
for (x, y) in places:
    plt.text(x, y, "WOW", fontsize=16)
plt.text(5, 95, "Don't use Jet!", color="white", fontsize=20)
plt.show()
Method 1
Using an excellent answer to another question as a reference, I came up with the following simplified variant which seems to work nicely - just make sure the figsize (which is given in inches) aspect ratio matches the size ratio of the plot data:
import numpy as np
import matplotlib.pyplot as plt
test_image = np.eye(100)
fig = plt.figure(figsize=(4,4))
ax = plt.axes(frameon=False, xticks=[],yticks=[])
ax.imshow(test_image)
plt.savefig('test.png', bbox_inches='tight', pad_inches=0)
Note that I am using imshow with a test_image, which might behave differently from other plotting functions... please let me know in a comment in case you'd like to do something else.
Also note that the image will be (re-) sampled, so the figsize will influence the resolution of the written image.
As pointed out in the comments, the figsize setting doesn't match the size of the output image (or the size on screen, for that matter). To overcome this, use...
Method 2
Reading the FAQ entry Move the edge of an axes to make room for tick labels, I found a way to make the figsize parameter set the output image size directly, by moving the axes' ticks out of the visible area:
import numpy as np
import matplotlib.pyplot as plt
test_image = np.eye(100)
fig = plt.figure(figsize=(4,4))
ax = fig.add_axes([0,0,1,1])
ax.imshow(test_image)
plt.savefig('test.png')
Note that savefig has a default DPI setting (100 in my case) which - in combination with figsize - determines the number of pixels in x and y directions of the saved image. You can override this with the dpi keyword argument to savefig.
If you want to display the image on screen rather than saving it (by using plt.show() instead of the plt.savefig line in the code above), the size of the figure is dependent on (apart from the already familiar figsize parameter) the figure's DPI setting, which also has a default (80 on my system). This value can be overridden by passing the dpi keyword argument to the plt.figure() call.
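For example, here is a small sketch of my own combining both points; the dpi value, annotation text, and output file name are arbitrary:

import numpy as np
import matplotlib.pyplot as plt

test_image = np.eye(100)

fig = plt.figure(figsize=(4, 4))   # 4 x 4 inches
ax = fig.add_axes([0, 0, 1, 1])    # axes filling the whole figure, as in Method 2
ax.imshow(test_image)
ax.text(5, 95, "frame 0001", color="red", fontsize=20)  # per-frame annotation

# 4 inches * 200 dpi = an 800 x 800 pixel PNG, independent of the on-screen DPI
plt.savefig('test_800px.png', dpi=200)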
I am plotting tiled images in a similar way to the working code shown below:
from PIL import Image
import matplotlib.pyplot as plt
import random
import numpy
def r():
    return random.randrange(50, 200)
imsize = 100
rngsize = 5
rng = range(rngsize)
for i in rng:
    for j in rng:
        im = Image.new('RGB', (imsize, imsize), (r(), r(), r()))
        plt.imshow(im, aspect='equal',
                   extent=numpy.array([i, i+1, j, j+1])*imsize)
plt.xlim(-5,imsize * rngsize + 5)
plt.ylim(-5,imsize * rngsize + 5)
plt.show()
The problem is that, as you pan and zoom, zoom-scale-independent white stripes appear between the image edges, which is very undesirable. I guess this has to do with resampling and antialiasing, but I have no idea how to solve it "the right way", especially since I don't know the exact implementation details of matplotlib's rendering engine.
With Cairo and HTML Canvas, you can draw "to the pixel corner" or "to the pixel center" (translating by 0.5 pixel) thus avoiding anti-aliasing effects. Would there be a way to do that with Matplotlib?
Thanks for any help!
You can simply fill the values into a larger numpy array and plot the entire composite image in one shot. I've adapted your code above for a minimal example, but with different-sized images you'll need to take a different step size (see the sketch after the code below).
# Use a uint8 array so imshow interprets the values as 0-255 RGB
F = numpy.zeros((imsize*rngsize, imsize*rngsize, 3), dtype=numpy.uint8)
for i in rng:
    for j in rng:
        F[i*imsize:(i+1)*imsize,
          j*imsize:(j+1)*imsize, :] = (r(), r(), r())

plt.imshow(F, interpolation='nearest')
plt.show()
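If the tiles are not all the same size, the same idea works with cumulative offsets instead of a fixed step. A small sketch, with arbitrarily chosen tile sizes:

import numpy
import matplotlib.pyplot as plt
import random

def r():
    return random.randrange(50, 200)

# Arbitrary per-column widths and per-row heights for the example
widths = [80, 120, 100]
heights = [60, 100, 140]
x_off = numpy.concatenate(([0], numpy.cumsum(widths)))
y_off = numpy.concatenate(([0], numpy.cumsum(heights)))

F = numpy.zeros((y_off[-1], x_off[-1], 3), dtype=numpy.uint8)
for i in range(len(heights)):
    for j in range(len(widths)):
        F[y_off[i]:y_off[i+1], x_off[j]:x_off[j+1], :] = (r(), r(), r())

plt.imshow(F, interpolation='nearest')
plt.show()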