find template image in directory of images - windows

I have a directory of about 100,000 images and a template image. I know the directory contains a similar image, saved in a different format and scaled differently, but I don't know where.
I want to look for the image and find out its filename inside this directory.
I am looking for a mostly ready-made solution, which I couldn't find. I found OpenCV, but I would need to write code around it. Is there a project like that out there?
If there isn't, could you help me make a simple C# console app using OpenCV? I tried their templates but never managed to get SURF or CudaSURF working.
Thanks
Edited as per @Mark Setchell's comment

If the image is identical, the fastest way is to get the file size of the image you are looking for and compare it with the file sizes of the images amongst which you are searching.
I suggest this first because, as Christoph clarifies in the comments, it doesn't require reading the file at all - it is just metadata.
If that yields more than one matching answer, calculate a hash (MD5 or other) and pick the filename that produces the same hash.
Again, as mentioned by Christoph in the comments, this doesn't require decoding the image, or holding the decompressed image in RAM, just checksumming it.
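The two steps above (compare sizes first, then hash only the candidates) can be sketched in Python roughly as follows. This is a minimal sketch, not a finished tool: the helper names `file_md5` and `find_identical` are made up for illustration, and it only finds byte-identical files, so it will not find a copy that was re-encoded or rescaled.

```python
import hashlib
import os


def file_md5(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_identical(target_path, directory):
    """Return names of files in `directory` that are byte-identical to the target."""
    target_size = os.path.getsize(target_path)
    target_hash = None  # computed lazily, only if some file has the same size
    matches = []
    for entry in os.scandir(directory):
        if not entry.is_file():
            continue
        # Step 1: compare sizes (metadata only, the file is not read)
        if entry.stat().st_size != target_size:
            continue
        # Step 2: hash only the candidates whose size matches
        if target_hash is None:
            target_hash = file_md5(target_path)
        if file_md5(entry.path) == target_hash:
            matches.append(entry.name)
    return matches
```

Calling `find_identical("template.png", "C:/images")` (both paths are placeholders) would return the matching filenames, or an empty list if there is no identical file.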

In the end I used this site and modified the Python code there to search a directory instead of a single image. There is not much code, so the full thing is below:
import argparse
import cv2
from os import listdir
from os.path import isfile, join

ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", type=str, required=True,
    help="path to the directory of images to search")
ap.add_argument("-t", "--template", type=str, required=True,
    help="path to template image")
args = vars(ap.parse_args())

# load the template image from disk
print("[INFO] loading template...")
template = cv2.imread(args["template"])
cv2.namedWindow("Output")
cv2.startWindowThread()
# display the template
cv2.imshow("Output", template)
cv2.waitKey(0)
# convert the template to grayscale
templateGray = cv2.cvtColor(template, cv2.COLOR_BGR2GRAY)

imageFileNames = [f for f in listdir(args["image"]) if isfile(join(args["image"], f))]
for imageFileName in imageFileNames:
    try:
        # join() works whether or not the directory argument ends with a separator
        imagePath = join(args["image"], imageFileName)
        print("[INFO] Loading " + imagePath + " from disk...")
        image = cv2.imread(imagePath)
        print("[INFO] Converting " + imageFileName + " to grayscale...")
        imageGray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        print("[INFO] Performing template matching for " + imageFileName + "...")
        result = cv2.matchTemplate(imageGray, templateGray,
            cv2.TM_CCOEFF_NORMED)
        (minVal, maxVal, minLoc, maxLoc) = cv2.minMaxLoc(result)
        (startX, startY) = maxLoc
        endX = startX + template.shape[1]
        endY = startY + template.shape[0]
        if maxVal > 0.75:
            print("maxVal = " + str(maxVal))
            # draw the bounding box on the image
            cv2.rectangle(image, (startX, startY), (endX, endY), (255, 0, 0), 3)
            # show the output image
            cv2.imshow("Output", image)
            cv2.waitKey(0)
            cv2.imshow("Output", template)
    except KeyboardInterrupt:
        break
    except Exception:
        # unreadable files, or images smaller than the template, end up here
        print(imageFileName)
        print("Error")
cv2.destroyAllWindows()
The code above shows any image whose match value (which I take to be a measure of similarity between the source and the template) is greater than 0.75.
That threshold is probably still too low, so tweak it to your liking.
Note that this WILL NOT work if the image is rotated, and if, like me, you have a bright light source in the template, other light sources will come up as false positives.
As for time, it took me about 7 hours, with the script pausing about every 20 minutes for a false positive, until I found my image. I got through about 2/3 of all images.
As a side note, it took 10 minutes just to build the array of files in the directory, and it used about 500 MB of RAM once done.
This is not the best answer, so if anyone more qualified finds this, feel free to write another answer.
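Regarding the time and memory spent building the file list: one option that may help is to walk the directory lazily instead of building the full list up front. The sketch below uses `os.scandir` in a generator. The function name `iter_image_paths` and the extension list are assumptions for illustration, and I have not measured how much it saves on a directory of this size.

```python
import os


def iter_image_paths(directory, extensions=(".jpg", ".jpeg", ".png", ".bmp", ".tif", ".tiff")):
    """Yield paths of image-like files one at a time, without building a full list."""
    with os.scandir(directory) as entries:
        for entry in entries:
            # Skip subdirectories and files that do not look like images
            if entry.is_file() and entry.name.lower().endswith(extensions):
                yield entry.path
```

The matching loop could then iterate over `iter_image_paths(args["image"])` and pass each path straight to `cv2.imread`, so matching starts with the first file instead of after the whole directory has been listed.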

Related

Writing Macro in ImageJ to open, change color, adjust brightness and resave microscope images

I'm trying to write code in ImageJ that will:
Open all images in a folder whose names contain "488", each in a separate window.
Use lookup tables to convert the images to green and RGB color. In ImageJ, the commands are: run("Green"); and run("RGB Color");
Adjust the brightness and contrast with defined values for Min and Max (same values for each image).
I know that the code for that is:
//run("Brightness/Contrast..."); setMinAndMax(value min, value max); run("Apply LUT");
Save each image in the same, original folder, as TIFF, with the same name but ending in "processed".
I have no experience with Java and am very bad with coding. I tried to piece something together using code I found on stackoverflow and on the ImageJ website, but kept getting error codes. Any help is much appreciated!
I don't know if you still need it, but here is an example.
output_dir = "C:/Users/test/";
input_dir = "C:/Users/test/";
list = getFileList(input_dir);
listlength = list.length;
setBatchMode(true);
for (z = 0; z < listlength; z++) {
    if (endsWith(list[z], 'tif') == true) {
        if (list[z].contains("488")) {
            title = list[z];
            // strip the 4-character ".tif" extension
            end = lengthOf(title) - 4;
            out_path = output_dir + substring(title, 0, end) + "_processed.tif";
            open(input_dir + title);
            // add all the functions you want
            run("Brightness/Contrast...");
            setMinAndMax(1, 15);
            run("Apply LUT");
            saveAs("tif", "" + out_path + "");
            close();
        }
        run("Close All");
    }
}
setBatchMode(false);
I think it contains everything you need. It opens all the images in a specific folder whose names end with tif and contain 488. I didn't completely understand what you want to do with each image, so I just added your functions. You probably won't have problems adding more or different ones, since you can get them with the macro recorder.
The code is written to open tif files. If you have tiff files, be careful to change that and also change -4 to -5.

Pillow: converting a TIFF from greyscale 16 bit to 8 bit results in fully white image

I know that there are multiple similar questions on SO, but I have tried multiple proposed solutions to no avail.
I have the following TIFF image that opens in Pillow as type='I;16'.
Google Drive link
Based on this SO question, I wrote this code to convert it:
def tiff_force_8bit(image, **kwargs):
    if image.format == 'TIFF' and image.mode == 'I;16':
        array = np.array(image)
        normalized = (array.astype(np.uint16) - array.min()) * 255.0 / (array.max() - array.min())
        image = Image.fromarray(normalized.astype(np.uint8))
    return image
However, the result is a completely white image.
I have tried other solutions too, such as this:
table = [i/256 for i in range(65536)]
image = image.point(table, 'L')
with the same result: full white out.
Can anyone shed some light?
Thanks!
There's nothing wrong with your code. It works as expected if you run:
# Open image
im = Image.open('NGC 281 11-01-2021 Ha 1.15.tif')
# Force to 8-bit
res = tiff_force_8bit(im)
# Check min and max of result
res.getextrema() # prints (0,255) as expected
# Save as PNG
res.save('result.png')
# Display it
res.show()
I can only guess there is a problem with your installation or the way you display the result.
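To check the scaling step on its own, independent of the TIFF file and of how the result is displayed, the min-max normalization can be tried on a small synthetic array. This is only a sketch: `to_uint8` is a hypothetical helper, the sample values are made up, and the guard for a constant image is an addition that is not in the question's code.

```python
import numpy as np


def to_uint8(array):
    """Min-max scale an array to the 0-255 range and return it as uint8."""
    array = np.asarray(array, dtype=np.float64)
    lo, hi = array.min(), array.max()
    if hi == lo:
        # Constant image: avoid dividing by zero, return all zeros
        return np.zeros(array.shape, dtype=np.uint8)
    scaled = (array - lo) * 255.0 / (hi - lo)
    return scaled.round().astype(np.uint8)


# Synthetic 16-bit data standing in for the TIFF pixels
sample = np.array([[1000, 2000], [3000, 65535]], dtype=np.uint16)
result = to_uint8(sample)
print(result.min(), result.max())
```

If this prints the full 0 to 255 range on your machine but the saved image still looks white, the problem is more likely in the viewer or the installation than in the arithmetic.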

Pytorch: load dataset of grayscale images

I want to load a dataset of grayscale images. I used ImageFolder, but it doesn't load grayscale images by default because it converts images to RGB.
I found solutions that load images with ImageFolder and then convert them to grayscale, using:
transforms.Grayscale(num_output_channels=1)
or
ImageOps.grayscale(image)
Is that correct?
How can I load grayscale images without conversion? I tried ImageDataBunch, but I have problems importing fastai.vision.
Assuming the dataset is stored in the "Dataset" folder as given below, set the root directory as "Dataset":
Dataset
    class_1
        img1.png
        img2.png
    class_2
        img1.png
        img2.png
from torchvision.datasets import ImageFolder
from torch.utils.data import DataLoader, random_split
from torchvision import transforms
root = 'Dataset/'
data_transform = transforms.Compose([
    transforms.Grayscale(num_output_channels=1),
    transforms.ToTensor(),
])
dataset = ImageFolder(root, transform=data_transform)
For reference, train and test dataset are being split into 70% and 30% respectively.
# Split test and train dataset
train_size = int(0.7 * len(dataset))
test_size = len(dataset) - train_size
train_data, test_data = random_split(dataset, [train_size, test_size])
This dataset can then be wrapped in train and test data loaders, as shown below, to operate on batches.
Usually you will see a single batch_size used for both the train and test loaders, but I define them separately. DataLoader does not require the batch size to divide the dataset size: with the default drop_last=False the last batch is simply smaller. I still pick a batch size that is a factor of each split's size so that every batch has the same shape, which avoids errors in code that assumes a fixed batch size.
# Set batch size of train data loader
batch_size_train = 20
# Set batch size of test data loader
batch_size_test = 22
# load the split train and test data into batches via DataLoader()
train_loader = DataLoader(train_data, batch_size=batch_size_train, shuffle=True)
test_loader = DataLoader(test_data, batch_size=batch_size_test, shuffle=True)
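To see how a given batch size divides a split, the arithmetic can be sketched without PyTorch. The helper `batch_sizes` and the dataset size of 100 are made up for illustration. As far as I know, with DataLoader's default `drop_last=False` a remainder simply becomes a smaller final batch, and with `drop_last=True` it is discarded.

```python
def batch_sizes(num_samples, batch_size, drop_last=False):
    """Return the sizes of the batches produced for num_samples items."""
    full, remainder = divmod(num_samples, batch_size)
    sizes = [batch_size] * full
    if remainder and not drop_last:
        sizes.append(remainder)  # smaller final batch
    return sizes


# Hypothetical dataset of 100 images, split 70% / 30% as above
total = 100
train_size = int(0.7 * total)
test_size = total - train_size

print(batch_sizes(train_size, 20))
print(batch_sizes(test_size, 22))
```

If the last list entry differs from the others, the final batch is smaller, which matters only for code that assumes a fixed batch dimension.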
Yes, that is correct. AFAIK ImageFolder's default loader converts every image to RGB when opening it, see e.g. answers to this question. So with the default loader, converting back to grayscale is the only way, though it takes time of course.
Pure pytorch solution (if ImageFolder isn't appropriate)
You can roll your own data loading functionality. If I were you, I wouldn't go the fastai route, as it's pretty high level and takes control away from you (you might not need those functionalities anyway).
In principle, all you have to do is to create something like this below:
import pathlib
import torch
from PIL import Image

class ImageDataset(torch.utils.data.Dataset):
    def __init__(self, path: pathlib.Path, images_class: int, regex="*.png"):
        self.files = [file for file in path.glob(regex)]
        self.images_class: int = images_class

    def __len__(self):
        # needed so that datasets can be concatenated with `+`
        return len(self.files)

    def __getitem__(self, index):
        # "LA" is grayscale plus alpha; use "L" for a single channel
        return Image.open(self.files[index]).convert("LA"), self.images_class

# Assuming you have `png` images, can modify that with regex
final_dataset = (
    ImageDataset(pathlib.Path("/path/to/dogs/images"), 0)
    + ImageDataset(pathlib.Path("/path/to/cats/images"), 1)
    + ImageDataset(pathlib.Path("/path/to/turtles/images"), 2)
)
The above would load images from the paths provided, and each item would be returned with the class given for its path.
This gives you more flexibility (a different folder layout than torchvision.datasets.ImageFolder requires) for a few more lines.
Of course, you could add more of those, use a loop, or whatever else.
You could also apply torchvision.transforms, e.g. to transform the images above to tensors.
torchdata solution
Disclaimer: author here. If you are concerned about the loading times of your data and the grayscale transformation, you could use the torchdata third-party library for PyTorch.
Using it, one could create the same thing as above but use cache or map (to apply torchvision.transforms or other transformations easily) and some other things known e.g. from the tensorflow.data module, see below:
import pathlib
from PIL import Image
import torchdata

# Change inheritance
class ImageDataset(torchdata.Dataset):
    def __init__(self, path: pathlib.Path, images_class: int, regex="*.png"):
        super().__init__()  # And add constructor call and that's it
        self.files = [file for file in path.glob(regex)]
        self.images_class: int = images_class

    def __getitem__(self, index):
        return Image.open(self.files[index]), self.images_class

final_dataset = (
    ImageDataset(pathlib.Path("/path/to/dogs/images"), 0)
    + ImageDataset(pathlib.Path("/path/to/cats/images"), 1)
    + ImageDataset(pathlib.Path("/path/to/turtles/images"), 2)
).cache()  # will cache data in-memory after first pass
# You could apply transformations after caching for possible speed-up
torchvision ImageFolder loader
As correctly pointed out by @jodag in the comments, one can pass a loader callable that takes a single path argument to customize how files are opened, e.g. for grayscale it could be:
from PIL import Image
import torchvision
dataset = torchvision.datasets.ImageFolder(
    "/path/to/images", loader=lambda path: Image.open(path).convert("LA")
)
Please note you could also use it for other types of files; they don't have to be images.
Make custom loader, feed it to ImageFolder:
import numpy as np
from PIL import Image, ImageOps
from torchvision.datasets import ImageFolder

def gray_reader(image_path):
    im = Image.open(image_path)
    im2 = ImageOps.grayscale(im)
    im.close()
    return np.array(im2)  # return np array
    # return im2  # return PIL Image

some_dataset = ImageFolder(image_root_path, loader=gray_reader)
Edit:
The code below is much better than the previous version: load the color image and convert it to grayscale in the transform.
def get_transformer(h, w):
    valid_transform = transforms.Compose([
        transforms.ToPILImage(),
        transforms.Grayscale(num_output_channels=1),
        transforms.Resize((h, w)),
        transforms.ToTensor(),
        transforms.Normalize([0.5], [0.5]),
    ])
    return valid_transform

Working on more than one image in Matlab

I recently started learning Matlab and am trying to learn about classification. I want to classify my 23 images. In my function file I am using
I = imread('img.jpg');
a = rgb2gray(I);
bw = double(imread('mask_img.jpg'))/255;
b = rgb2gray(bw);
bwi = 1-b;
to work on the original image and its ground truth. I can handle one image, and I have a loop in my main file:
for i=1:original_images_db.Count
    original = original_images_db.ImageLocation(i);
    groundtruth = original_file;
    [x,y] = calculateFeatures(original, groundtruth, parameters);
    dataset.HorizonFeats{i} = features;
end
I linked original_images_db to the files with imageSet. When I run my main file, the function file naturally reads img.jpg every time, even though the main file can see the other images. My question is: how can I make a loop in my function file so that it processes all the other images?
Thank you
Create a cell array like this, containing the file paths of all the images:
fname = {'1.jpg','2.jpg','3.jpg'};
Now you can iterate over all the images:
for i = 1:length(fname)
    im = imread(fname{i});
end
Or use the dir function to list the files:
fnames = dir('image_directory_path');

Making a gif from images

I have a load of data in 100 .sdf files (labelled 0000.sdf to 0099.sdf), each of which contains a still image, and I'm trying to produce a .gif from these images.
The code I use to plot the figure (in the same directory as the sdf files) is:
q = GetDataSDF('0000.sdf');
imagesc(q.data');
I've attempted to write a for loop that would plot the figure and then save it with the same filename as the sdf file but to no avail, using:
for a = 1:100
    q=GetDataSDF('0000.sdf');
    fh = imagesc(q.dist_fn.x_px.Left.data');
    frm = getframe( fh );
    % save as png image
    saveas(fh, 'current_frame_%02d.jpg');
end
EDIT: I received the following errors when trying to run this code:
Error using hg.image/get
The name 'Units' is not an accessible property for an instance of class 'image'.
Error in getframe>Local_getRectanglesOfInterest (line 138)
if ~strcmpi(get(h, 'Units'), 'Pixels')
Error in getframe (line 56)
[offsetRect, absoluteRect, figPos, figOuterPos] = ...
Error in loop_code (line 4)
frm = getframe( fh );
How do I save these files using a for loop, and how do I then use those files to produce a movie?
The reason for the error is that you pass an image handle to getframe, but this function expects a figure handle.
Another problem is that you always load the same file, and that saveas will not work for gifs. (For saving figures as static images, maybe print is the better option?)
I tried to modify my own gif-writing loop so that it works with your data. I'll try to be extra explicit in the comments, since you seem to be starting out. Remember, you can always use help name_of_command to display a short Matlab help.
% Define a variable that holds the frames per second your "movie" should have
gif_fps = 24;
% Define string variable that holds the filename of your movie
video_filename = 'video.gif';
% Create figure 1, store the handle in a variable, you'll need it later
fh = figure(1);
for a = 0:99
    % Prepare file name so that you loop over the data (0000.sdf to 0099.sdf)
    q = GetDataSDF(['00' num2str(a,'%02d') '.sdf']);
    % Plot image
    imagesc(q.dist_fn.x_px.Left.data');
    % Force Matlab to actually do the plot (it sometimes gets lazy in loops)
    drawnow;
    % Take a "screenshot" of the figure fh
    frame = getframe(fh);
    % Turn screenshot into image
    im = frame2im(frame);
    % Turn image into indexed image (the gif format needs this)
    [imind,cm] = rgb2ind(im,256);
    % If first loop iteration: Create the file, else append to it
    if a == 0
        imwrite(imind,cm,video_filename,'gif', 'Loopcount',inf);
    else
        imwrite(imind,cm,video_filename,'gif','WriteMode','append','DelayTime',1/gif_fps);
    end
end
One more note: When the size of the data is the same for each plot, it makes sense to use the plot (or in this case, imagesc) command only once, and in later loop iterations replace it with set(ah,'YData',new_y_data) (or in this case set(ah,'CData',q.dist_fn.x_px.Left.data')), where ah is the handle returned by plot or imagesc (the graphics object itself, not the figure!). This is orders of magnitude faster than creating a whole new plot in each loop iteration. The downside is that the scaling (here, the color scaling) will be the same for each plot. But in every case that I have worked on so far, that was actually desirable.
