I wish to pack an image as a low-quality image along with packets of "quality increments", such that patching the low-quality image with the "quality increments" increases its quality and brings it closer to the original image.
To put it more clearly,
I want to pack an image as a "base-image" (original image in poor quality, say 10%), and packets q1, q2, q3, .... qn
such that
base-image + q1 = original image at quality 20%
base-image + q1 + q2 = original image at quality 30%
...
base-image + q1 +q2 + .... qn = original image at quality 100%
My requirement is to pack an image and send it via a single-board computer (a Raspberry Pi). I need to reduce the file size as much as possible, but the image should not be so pixelated that it becomes unclear. With this "image quality in increments" approach, my idea is to receive a low-quality image plus only a few increments (say up to q3), deem it "acceptable", and stop sending/receiving any more data packets.
Please guide me on how to approach this.
Here's a quick toy example of what I mentioned in the comments
Original:
Rebuilt at 0.2 compression
import cv2
import numpy as np

# load image as grayscale
img = cv2.imread("image.jpg", cv2.IMREAD_GRAYSCALE)
height, width = img.shape[:2]
max_rank = min(width, height)

# do svd
columns, diags, rows = np.linalg.svd(img, full_matrices=False)

# rebuild image with reduced rank
rank = int(max_rank * 0.2)
rebuilt = np.dot(columns[:, :rank] * diags[:rank], rows[:rank, :])
rebuilt = rebuilt.astype(np.uint8)

# show images
cv2.imshow("Image", img)
cv2.imshow("Rebuilt", rebuilt)
cv2.waitKey(0)
The idea here is that you can send each column, singular value (the "diags" entry), and row one at a time. The number of complete sets you have received whenever you decide to stop waiting is the rank you'll use to reconstruct the image.
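If it helps to see what each packet contributes, the same reconstruction can be written as a running sum of rank-1 terms, so every received set just adds one outer product to whatever the client already holds. A minimal, self-contained sketch (a random matrix stands in for the image):

import numpy as np

# toy stand-in for the grayscale image from the example above
img = np.random.randint(0, 256, (64, 64)).astype(np.float64)
columns, diags, rows = np.linalg.svd(img, full_matrices=False)

# running reconstruction, updated one "packet" (rank) at a time
partial = np.zeros(img.shape)
for k in range(10):  # pretend we stopped after receiving 10 packets
    # each packet adds one rank-1 term: sigma_k * u_k * v_k^T
    partial += diags[k] * np.outer(columns[:, k], rows[k, :])

# clip before casting, since partial sums can fall outside [0, 255]
preview = np.clip(partial, 0, 255).astype(np.uint8)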
A more complete example
import cv2
import numpy as np

# receiver
client = [[], [], []]  # columns, diags, rows

def receive(column, value, row):
    # grab global
    global client
    # add new data
    client[0].append(column)
    client[1].append(value)
    client[2].append(row)

# load image as grayscale
img = cv2.imread("image.jpg", cv2.IMREAD_GRAYSCALE)
height, width = img.shape[:2]
max_rank = min(width, height)

# do svd
columns, diags, rows = np.linalg.svd(img, full_matrices=False)

# "send" data to client one rank at a time
for a in range(max_rank):
    # check progress
    print("Total Ranks Sent: " + str(a + 1))

    # get a single rank
    column = columns[:, a]
    value = diags[a]
    row = rows[a, :]

    # "send" to client
    receive(column, value, row)

    # rebuild image with current client-side data
    client_cols, client_diags, client_rows = client

    # convert to numpy
    client_cols = np.array(client_cols)
    client_diags = np.array(client_diags)
    client_rows = np.array(client_rows)
    client_cols = np.transpose(client_cols)

    # rebuild
    rebuilt = np.dot(client_cols * client_diags, client_rows)
    rebuilt = rebuilt.astype(np.uint8)

    # show
    cv2.imshow("Rebuilt", rebuilt)
    key = cv2.waitKey(0)

    # early quit
    if key == ord('q'):
        break
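One practical way to decide when the preview is "acceptable" is to track a quality metric per received rank and stop once it crosses a threshold. Note that a reference metric like PSNR needs the original image, so in practice this check would run on the sending side (or be replaced by a no-reference heuristic at the receiver). A minimal sketch; the 30 dB cut-off is only an illustrative number:

import numpy as np

def psnr(original, approx):
    # peak signal-to-noise ratio for 8-bit images
    mse = np.mean((original.astype(np.float64) - approx.astype(np.float64)) ** 2)
    return np.inf if mse == 0 else 10 * np.log10(255.0 ** 2 / mse)

# inside the loop above, after building `rebuilt`:
# if psnr(img, rebuilt) > 30:  # illustrative "good enough" threshold
#     break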
I am trying to find the SNR for a set of images I have, but my two methodologies give two different answers and I'm not sure which is right. Is one of them simply the wrong way to do this, or is neither correct?
I am trying to characterize the SNR of a set of images I'm processing. I have one data set containing images and darkfields. From these, I subtracted the darkfield from each image to get "corrected_images".
Since SNR is (mean of signal)/(std of noise), in my first methodology I worked with the corrected image and the background-noise image: I took the mean of every pixel on the spectrum (from the corrected image) with a value greater than 1 as the signal, and the overall standard deviation of the background-noise image as the noise. The plot for this methodology is in blue.
In my second methodology I used a single uncorrected image and considered every pixel above 50 as signal and every pixel below 50 as noise. This gives the orange SNR values.
# -*- coding: utf-8 -*-
"""
Spyder Editor

This is a temporary script file.
"""
from PIL import Image
from matplotlib import pyplot as plt
import numpy as np
import os

signals = []
name = r"interpolating_streaks/corrected"
name2 = r"interpolating_streaks/averages"
file = os.listdir(name)
file2 = os.listdir(name2)
wv1 = []
signal = []
snr = []
noise = []
x = 0

# first methodology: corrected image for signal, darkfield for noise
for i in file:
    wv = i[:3]
    wv1.append(wv)
    corrected_image = Image.open(name + "/" + i)  # opens the image
    streak = np.array(corrected_image)
    dark_image = Image.open(name2 + '/d' + wv + '_averaged.tif')
    dark = np.array(dark_image)
    darkavg = dark.mean(axis=0)
    avg = streak.mean(axis=0)
    for value in avg:
        if value >= 1:
            signal.append(value)
    noiser = np.std(darkavg)
    signalr = np.mean(signal)
    snr.append(signalr / noiser)
plt.plot(wv1, snr)

# second methodology: single uncorrected image, threshold at 50
signal = []
noise = []
snr = []
for i in file2:
    if i[0] != 'd':
        image = Image.open(name2 + '/' + i)
        im = np.array(image)
        im_avg = im.mean(axis=0)
        for value in im_avg:
            if value <= 50:
                noise.append(value)
            else:
                signal.append(value)
        snr.append(np.mean(signal) / np.std(noise))
plt.plot(wv1, snr)
I would expect the SNR values to be the same, and I know that for my camera the SNR has to be below 45 dB (although I'm pretty sure this methodology doesn't output decibels).
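If the ratio does need to be expressed in decibels, my understanding is that it is converted with a log10: 20x for amplitude-like quantities (such as these pixel means and standard deviations) and 10x for power quantities. A quick sketch of that assumption:

import numpy as np

snr_ratio = 120.0                            # example mean(signal) / std(noise) value
snr_db_amplitude = 20 * np.log10(snr_ratio)  # if the ratio is amplitude-like
snr_db_power = 10 * np.log10(snr_ratio)      # if the ratio is power-like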
Here are my current results:
https://imgur.com/a/Vgecyp1
I'm trying to build a simple image classifier using scikit-learn. I'm hoping to avoid having to resize and convert each image before training.
Question
Given two images of different formats and sizes (1.jpg and 2.png), how can I avoid a ValueError while fitting the model?
I have one example where I train using only 1.jpg, which fits successfully.
I have another example where I train using both 1.jpg and 2.png and a ValueError is produced.
This example will fit successfully:
import numpy as np
from sklearn import svm
import matplotlib.image as mpimg

target = [1, 2]
images = np.array([
    # target 1
    [mpimg.imread('./1.jpg'), mpimg.imread('./1.jpg')],
    # target 2
    [mpimg.imread('./1.jpg'), mpimg.imread('./1.jpg')],
])

n_samples = len(images)
data = images.reshape((n_samples, -1))

model = svm.SVC()
model.fit(data, target)
This example will raise a ValueError.
Note the different 2.png image in target 2.
import numpy as np
from sklearn import svm
import matplotlib.image as mpimg

target = [1, 2]
images = np.array([
    # target 1
    [mpimg.imread('./1.jpg'), mpimg.imread('./1.jpg')],
    # target 2
    [mpimg.imread('./2.png'), mpimg.imread('./1.jpg')],
])

n_samples = len(images)
data = images.reshape((n_samples, -1))

model = svm.SVC()
model.fit(data, target)
# ValueError: setting an array element with a sequence.
1.jpg
2.png
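For what it's worth, here is a standalone sketch (no image files needed, the shapes are made up) of why the error appears: with mismatched shapes, NumPy can only build an object array, which has no pixel axes left to flatten into a numeric feature matrix.

import numpy as np

# two dummy "images" standing in for 1.jpg (RGB) and 2.png (RGBA)
img_a = np.zeros((4, 4, 3))
img_b = np.zeros((5, 5, 4))

batch = np.array([img_a, img_b], dtype=object)  # ragged shapes -> object array
print(batch.shape)  # (2,) - the pixel dimensions are gone
# batch.reshape((len(batch), -1)) therefore cannot yield a numeric 2-D matrix,
# which is what makes scikit-learn raise
# "ValueError: setting an array element with a sequence."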
For this, I would really recommend using the tools in Keras that are specifically designed to preprocess images in a highly scalable and efficient way.
from keras.preprocessing.image import ImageDataGenerator
from PIL import Image
import matplotlib.pyplot as plt
import numpy as np
1 Determine the target size of your new pictures
h,w = 150,150 # desired height and width
batch_size = 32
N_images = 100 #total number of images
Keras works in batches, so batch_size just determines how many pictures at once will be processed (this does not impact your end result, just the speed).
2 Create your Image Generator
train_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
    'Pictures_dir',
    target_size=(h, w),
    batch_size=batch_size,
    class_mode='binary')
The object that is going to do the image extraction is ImageDataGenerator. It has the method flow_from_directory, which I believe might be useful for you here. It will read the content of the folder Pictures_dir and expects your images to be in folders by class (e.g. Pictures_dir/class0 and Pictures_dir/class1). The generator, when called, will then create images from these folders and also import their labels (in this example, 'class0' and 'class1').
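For reference, the directory layout it expects looks roughly like this (folder and file names here are only placeholders):

Pictures_dir/
    class0/
        img_001.jpg
        img_002.png
        ...
    class1/
        img_101.jpg
        ...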
There are plenty of other arguments to this generator, you can check them out in the Keras documentation (especially if you want to do data augmentation).
Note: this will take any image, be it PNG or JPG, as you requested
If you want to get the mapping from class names to label indices, do:
train_generator.class_indices
# {'class0': 0, 'class1': 1}
You can check what is going on with
plt.imshow(train_generator[0][0][0])
3 Extract all resized images from the Generator
Now you are ready to extract the images from the ImageGenerator:
def extract_images(generator, sample_count):
    images = np.zeros(shape=(sample_count, h, w, 3))
    labels = np.zeros(shape=(sample_count))
    i = 0
    for images_batch, labels_batch in generator:  # we are looping over batches
        images[i*batch_size : (i+1)*batch_size] = images_batch
        labels[i*batch_size : (i+1)*batch_size] = labels_batch
        i += 1
        if i*batch_size >= sample_count:
            # we must break after every image has been seen once,
            # because generators yield indefinitely in a loop
            break
    return images, labels
images, labels = extract_images(train_generator, N_images)
print(labels[0])
plt.imshow(images[0])
Now you have your images all at the same size in images, and their corresponding labels in labels, which you can then feed into any scikit-learn classifier of your choice.
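For example, to plug these straight into an SVM like the one in the question (a minimal sketch that reuses the images and labels arrays from above; each image is flattened to one feature vector per sample, which is what svm.SVC expects):

from sklearn import svm

# flatten each (h, w, 3) image into a single feature vector
X = images.reshape(len(images), -1)
y = labels

clf = svm.SVC()
clf.fit(X, y)
print(clf.predict(X[:5]))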
It's difficult because of the math operations behind the scenes (the details are out of scope here). Even if you managed to do it, say by building your own algorithm, you still would not get the desired result.
I had this issue once with faces of different sizes. Maybe this piece of code gives you a starting point.
from PIL import Image
import face_recognition

def face_detected(file_address=None, prefix='detect_'):
    if file_address is None:
        raise FileNotFoundError('File address required')
    image = face_recognition.load_image_file(file_address)
    face_location = face_recognition.face_locations(image)

    if face_location:
        # face_location is a (top, right, bottom, left) tuple; expand the box around it
        face_location = face_location[0]
        UP = int(face_location[0] - (face_location[2] - face_location[0]) / 2)
        DOWN = int(face_location[2] + (face_location[2] - face_location[0]) / 2)
        LEFT = int(face_location[3] - (face_location[3] - face_location[2]) / 2)
        RIGHT = int(face_location[1] + (face_location[3] - face_location[2]) / 2)

        # adjust the horizontal bounds so the crop is square
        if UP - DOWN != LEFT - RIGHT:
            height = UP - DOWN
            width = LEFT - RIGHT
            delta = width - height
            LEFT -= int(delta / 2)
            RIGHT += int(delta / 2)

        pil_image = Image.fromarray(image[UP:DOWN, LEFT:RIGHT, :])
        pil_image.thumbnail((50, 50), Image.ANTIALIAS)
        pil_image.save(prefix + file_address)
        return True

    pil_image = Image.fromarray(image)
    pil_image.thumbnail((200, 200), Image.ANTIALIAS)
    pil_image.save(prefix + file_address)
    return False
Note: I wrote this a long time ago, so it may not be best practice.
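If it helps, a minimal usage sketch (the file name photo.jpg is just a placeholder; the function saves its output next to the input with the detect_ prefix):

if face_detected('photo.jpg'):
    print('Face found; cropped thumbnail saved as detect_photo.jpg')
else:
    print('No face found; resized copy saved as detect_photo.jpg')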
I have a panorama image, and a smaller image of buildings seen within that panorama image. What I want to do is recognise if the buildings in that smaller image are in that panorama image, and how the 2 images line up.
For this first example, I'm using a cropped version of my panorama image, so the pixels are identical.
import cv2
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import numpy as np
import math
# Load images
cwImage = cv2.imread('cw1.jpg',0)
panImage = cv2.imread('pan1.jpg',0)
# Prepare for SURF image analysis
surf = cv2.xfeatures2d.SURF_create(4000)
# Find keypoints and point descriptors for both images
cwKeypoints, cwDescriptors = surf.detectAndCompute(cwImage, None)
panKeypoints, panDescriptors = surf.detectAndCompute(panImage, None)
Then I use OpenCV's FlannBasedMatcher to find good matches between the two images:
FLANN_INDEX_KDTREE = 0
index_params = dict(algorithm=FLANN_INDEX_KDTREE, trees=5)
search_params = dict(checks=50)

flann = cv2.FlannBasedMatcher(index_params, search_params)

# Find matches between the descriptors
matches = flann.knnMatch(cwDescriptors, panDescriptors, k=2)

good = []
for m, n in matches:
    if m.distance < 0.7 * n.distance:
        good.append(m)
So you can see that in this example, it perfectly matches the points between images. So then I find the homography, and apply a perspective warp:
cwPoints = np.float32([cwKeypoints[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
panPoints = np.float32([panKeypoints[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)

h, status = cv2.findHomography(cwPoints, panPoints)
warpImage = cv2.warpPerspective(cwImage, h, (panImage.shape[1], panImage.shape[0]))
Result is that it perfectly places the smaller image within the larger image.
Now, I want to do this where the smaller image isn't a pixel-perfect version of the larger image.
For the new smaller image, the keypoints look like this:
You can see that in some cases, it matches correctly, and in some cases it doesn't.
If I call findHomography with these matches, it's going to take all of these data points into account and come up with a nonsensical warp perspective, because it's based on both the correct and the incorrect matches.
What I'm looking for is a missing step in between detecting the good matches, and calling findHomography, where I can look at the relationship between the matches, and determine which matches are therefore correct.
I'm wondering if there's a function within OpenCV that I should be looking at for this step, or if this is something I'll need to work out on my own, and if so how I should go about doing that?
I wrote a blog post about finding an object in a scene last year (2017.11.11). Maybe it helps. Here is the link: https://zhuanlan.zhihu.com/p/30936804
Env: OpenCV 3.3 + Python 3.5
Found matches:
The found object in the scene:
The code:
#!/usr/bin/python3
# 2017.11.11 01:44:37 CST
# 2017.11.12 00:09:14 CST
"""
Use SIFT keypoint detection and matching to find a specific object in a scene.
"""
import cv2
import numpy as np

MIN_MATCH_COUNT = 4

imgname1 = "box.png"
imgname2 = "box_in_scene.png"

## (1) prepare data
img1 = cv2.imread(imgname1)
img2 = cv2.imread(imgname2)
gray1 = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY)
gray2 = cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY)

## (2) Create SIFT object
sift = cv2.xfeatures2d.SIFT_create()

## (3) Create flann matcher
matcher = cv2.FlannBasedMatcher(dict(algorithm=1, trees=5), {})

## (4) Detect keypoints and compute keypoint descriptors
kpts1, descs1 = sift.detectAndCompute(gray1, None)
kpts2, descs2 = sift.detectAndCompute(gray2, None)

## (5) knnMatch to get the top-2 candidates for each descriptor
matches = matcher.knnMatch(descs1, descs2, 2)
# Sort by their distance.
matches = sorted(matches, key=lambda x: x[0].distance)

## (6) Ratio test, to get good matches.
good = [m1 for (m1, m2) in matches if m1.distance < 0.7 * m2.distance]

canvas = img2.copy()

## (7) find homography matrix
## when there are enough robust match pairs (at least 4)
if len(good) > MIN_MATCH_COUNT:
    ## extract the corresponding point pairs from the matches
    ## (queryIdx for the small object, trainIdx for the scene)
    src_pts = np.float32([kpts1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst_pts = np.float32([kpts2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    ## find homography matrix with cv2.RANSAC using the good match points
    M, mask = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC, 5.0)
    ## mask marking the point pairs used when computing the homography
    #matchesMask2 = mask.ravel().tolist()
    ## project the outline of img1, i.e. its corresponding position in img2
    h, w = img1.shape[:2]
    pts = np.float32([[0, 0], [0, h-1], [w-1, h-1], [w-1, 0]]).reshape(-1, 1, 2)
    dst = cv2.perspectiveTransform(pts, M)
    ## draw the bounding box
    cv2.polylines(canvas, [np.int32(dst)], True, (0, 255, 0), 3, cv2.LINE_AA)
else:
    print("Not enough matches are found - {}/{}".format(len(good), MIN_MATCH_COUNT))

## (8) drawMatches
matched = cv2.drawMatches(img1, kpts1, canvas, kpts2, good, None)  # ,**draw_params)

## (9) Crop the matched region from scene
h, w = img1.shape[:2]
pts = np.float32([[0, 0], [0, h-1], [w-1, h-1], [w-1, 0]]).reshape(-1, 1, 2)
dst = cv2.perspectiveTransform(pts, M)
perspectiveM = cv2.getPerspectiveTransform(np.float32(dst), pts)
found = cv2.warpPerspective(img2, perspectiveM, (w, h))

## (10) save and display
cv2.imwrite("matched.png", matched)
cv2.imwrite("found.png", found)
cv2.imshow("matched", matched)
cv2.imshow("found", found)
cv2.waitKey()
cv2.destroyAllWindows()
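Side note: the mask returned by cv2.findHomography(..., cv2.RANSAC, 5.0) in step (7) marks which of the ratio-test matches were kept as inliers, so you can filter the match list with it if you only want to draw or reuse the geometrically consistent ones. A small sketch, continuing with the variables above (it assumes the homography branch ran):

# keep only the matches that RANSAC considered inliers
matchesMask = mask.ravel().tolist()
inliers = [m for m, keep in zip(good, matchesMask) if keep]
print("inliers: {}/{}".format(len(inliers), len(good)))

# e.g. draw only the inlier matches
matched_inliers = cv2.drawMatches(img1, kpts1, canvas, kpts2, inliers, None)
cv2.imwrite("matched_inliers.png", matched_inliers)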
Below is my current working code in Python, using PIL to highlight the difference between two images. But the rest of the image is blacked out.
I want to show the background as well, along with the highlighted differences.
Is there any way I can keep the background visible (just lighter) and only highlight the differences?
from PIL import Image, ImageChops

point_table = ([0] + ([255] * 255))

def black_or_b(a, b):
    diff = ImageChops.difference(a, b)
    diff = diff.convert('L')
    # diff = diff.point(point_table)
    h, w = diff.size
    new = diff.convert('RGB')
    new.paste(b, mask=diff)
    return new

a = Image.open('i1.png')
b = Image.open('i2.png')
c = black_or_b(a, b)
c.save('diff.png')
https://drive.google.com/file/d/0BylgVQ7RN4ZhTUtUU1hmc1FUVlE/view?usp=sharing
PIL does have some handy image-manipulation methods, but also a lot of shortcomings when one wants to start doing serious image processing. Most Python literature will recommend you to switch to using NumPy over your pixel data, which will give you full control. Other imaging libraries such as leptonica, gegl and vips all have Python bindings and a range of nice functions for image composition/segmentation.
In this case, the thing is to imagine how one would get to the desired output in an image manipulation program: you'd have a black (or other color) shade to place over the original image, and over this, paste the second image, but using a threshold of the differences (i.e. a pixel either is equal or is different - all intermediate values should be rounded to "different") as a mask for the second image.
I modified your function to create such a composition -
from PIL import Image, ImageChops, ImageDraw

point_table = ([0] + ([255] * 255))

def new_gray(size, color):
    img = Image.new('L', size)
    dr = ImageDraw.Draw(img)
    dr.rectangle((0, 0) + size, color)
    return img

def black_or_b(a, b, opacity=0.85):
    diff = ImageChops.difference(a, b)
    diff = diff.convert('L')
    # Hack: there is no threshold operation in ImageChops,
    # so we add the difference with itself to do
    # a poor man's thresholding of the mask
    # (the values for equal pixels - 0 - don't add up):
    thresholded_diff = diff
    for repeat in range(3):
        thresholded_diff = ImageChops.add(thresholded_diff, thresholded_diff)
    h, w = size = diff.size
    mask = new_gray(size, int(255 * (opacity)))
    shade = new_gray(size, 0)
    new = a.copy()
    new.paste(shade, mask=mask)
    # To have the original image show partially on the final result,
    # simply put "diff" instead of thresholded_diff below.
    new.paste(b, mask=thresholded_diff)
    return new

a = Image.open('a.png')
b = Image.open('b.png')
c = black_or_b(a, b)
c.save('c.png')
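As an aside, the thresholding can also be done with Image.point, which maps every non-zero difference straight to 255 - this is the mapping the (unused) point_table at the top encodes, and it can replace the repeated ImageChops.add loop inside black_or_b:

# inside black_or_b, instead of the repeated ImageChops.add loop:
thresholded_diff = diff.point(lambda p: 255 if p else 0)  # or diff.point(point_table)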
Here's a solution using libvips:
import sys
from gi.repository import Vips
a = Vips.Image.new_from_file(sys.argv[1], access = Vips.Access.SEQUENTIAL)
b = Vips.Image.new_from_file(sys.argv[2], access = Vips.Access.SEQUENTIAL)
# a != b makes an N-band image with 0/255 for false/true ... we have to OR the
# bands together to get a 1-band mask image which is true for pixels which
# differ in any band
mask = (a != b).bandbool("or")
# now pick pixels from a or b with the mask ... dim false pixels down
diff = mask.ifthenelse(a, b * 0.2)
diff.write_to_file(sys.argv[3])
With PNG images, most CPU time is spent in PNG read and write, so vips is only a bit faster than the PIL solution.
libvips does use a lot less memory, especially for large images. libvips is a streaming library: it can load, process and save the result all at the same time, it does not need to have the whole image loaded into memory before it can start work.
For a 10,000 x 10,000 RGB tif, libvips is about twice as fast and needs about 1/10th the memory.
If you're not wedded to the idea of using Python, there are a few really simple solutions using ImageMagick:
“Diff” an image using ImageMagick
I have a 3D array, of which the first two dimensions are spatial, so say (x,y). The third dimension contains point-specific information.
print H.shape # --> (200, 480, 640) spatial extents (200,480)
Now, by selecting a certain plane in the third dimension, I can display an image with
imdat = H[:,:,100] # shape (200, 480)
img = ax.imshow(imdat, cmap='jet',vmin=imdat.min(),vmax=imdat.max(), animated=True, aspect='equal')
I want to now rotate the cube, so that I switch from (x,y) to (y,x).
H = np.rot90(H) # could also use H.swapaxes(0,1) or H.transpose((1,0,2))
print H.shape # --> (480, 200, 640)
Now, when I call:
imdat = H[:,:,100] # shape (480,200)
img.set_data(imdat)
ax.relim()
ax.autoscale_view(tight=True)
I get weird behavior. Along the rows, the image displays data up to the 200th row and is then black until the end of the y-axis (480). The x-axis extends from 0 to 200 and shows the rotated data. After another 90-degree rotation, the image displays correctly (just rotated 180 degrees, of course).
It seems to me that after rotating the data, the axis limits (or the image extents?) are not refreshing correctly. Can somebody help?
PS: to indulge in bad hacking, I also tried to regenerate a new image (by calling ax.imshow) after each rotation, but I still get the same behavior.
Below I include a solution to your problem. The method resetExtent uses the data and the image to explicitly set the extent to the desired values. Hopefully I correctly emulated the intended outcome.
import matplotlib.pyplot as plt
import numpy as np

def resetExtent(data, im):
    """
    Using the data and axes from an AxesImage, im, force the extent and
    axis values to match shape of data.
    """
    ax = im.get_axes()
    dataShape = data.shape

    if im.origin == 'upper':
        im.set_extent((-0.5, dataShape[0]-.5, dataShape[1]-.5, -.5))
        ax.set_xlim((-0.5, dataShape[0]-.5))
        ax.set_ylim((dataShape[1]-.5, -.5))
    else:
        im.set_extent((-0.5, dataShape[0]-.5, -.5, dataShape[1]-.5))
        ax.set_xlim((-0.5, dataShape[0]-.5))
        ax.set_ylim((-.5, dataShape[1]-.5))

def main():
    fig = plt.gcf()
    ax = fig.gca()

    H = np.zeros((200, 480, 10))
    # make distinguishing corner of data
    H[100:, ...] = 1
    H[100:, 240:, :] = 2

    imdat = H[:, :, 5]
    datShape = imdat.shape

    im = ax.imshow(imdat, cmap='jet', vmin=imdat.min(),
                   vmax=imdat.max(), animated=True,
                   aspect='equal',
                   # origin='lower'
                   )
    resetExtent(imdat, im)
    fig.savefig("img1.png")

    H = np.rot90(H)
    imdat = H[:, :, 0]
    im.set_data(imdat)
    resetExtent(imdat, im)
    fig.savefig("img2.png")

if __name__ == '__main__':
    main()
This script produces two images:
First un-rotated:
Then rotated:
I thought just explicitly calling set_extent would do everything resetExtent does, because it should adjust the axes limits if 'autoscale' is True. But for some unknown reason, calling set_extent alone does not do the job.
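As an aside, if recreating the artist is acceptable, clearing the axes and calling imshow again after each rotation should also produce correct limits, because the fresh AxesImage derives its extent from the new data shape. A quick sketch of that alternative (the output file name is just a placeholder):

# alternative to resetExtent: drop the old AxesImage and draw a fresh one
ax.clear()
im = ax.imshow(imdat, cmap='jet', vmin=imdat.min(), vmax=imdat.max(),
               aspect='equal')
fig.savefig("img2_alternative.png")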