Pytorch: load dataset of grayscale images - image

I want to load a dataset of grayscale images. I used ImageFolder, but it doesn't load grayscale images by default because it converts images to RGB.
I found solutions that load the images with ImageFolder and then convert them to grayscale, using:
transforms.Grayscale(num_output_channels=1)
or
ImageOps.grayscale(image)
Is this correct?
How can I load grayscale images without conversion? I tried ImageDataBunch, but I have problems importing fastai.vision.

Assuming the dataset is stored in the "Dataset" folder as given below, set the root directory as "Dataset":
Dataset
    class_1
        img1.png
        img2.png
    class_2
        img1.png
        img2.png
from torchvision.datasets import ImageFolder
from torch.utils.data import DataLoader, random_split
from torchvision import transforms
root = 'Dataset/'
data_transform = transforms.Compose([transforms.Grayscale(num_output_channels=1),
                                     transforms.ToTensor()])
dataset = ImageFolder(root, transform=data_transform)
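ImageFolder takes each image's label from the name of its subfolder, assigning class indices in sorted order of the folder names. The helper below, `find_classes`, is a hypothetical stdlib-only sketch of that mapping, not torchvision's own code; on the real dataset the result is available as dataset.class_to_idx.

```python
import pathlib
import tempfile


def find_classes(root):
    """Map each subfolder name under `root` to an integer label (sorted order)."""
    names = sorted(p.name for p in pathlib.Path(root).iterdir() if p.is_dir())
    return {name: index for index, name in enumerate(names)}


# Build a throwaway copy of the folder layout shown above and inspect the mapping.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("class_2", "class_1"):
        (pathlib.Path(tmp) / name).mkdir()
    class_to_idx = find_classes(tmp)

print(class_to_idx)  # {'class_1': 0, 'class_2': 1}
```

Because the order is alphabetical, renaming a folder can change which integer a class receives.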
For reference, train and test dataset are being split into 70% and 30% respectively.
# Split test and train dataset
train_size = int(0.7 * len(dataset))
test_size = len(dataset) - train_size
train_data, test_data = random_split(dataset, [train_size, test_size])
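As a small illustration of the arithmetic above, the sketch below wraps the two size computations in a hypothetical helper, `split_sizes`. Computing the test size by subtraction, as the answer does, guarantees that the two sizes add up to the dataset length even when the 70% share is not a whole number.

```python
def split_sizes(n_items, train_fraction=0.7):
    """Return (train_size, test_size) that always sum to n_items."""
    train_size = int(train_fraction * n_items)  # truncates toward zero
    test_size = n_items - train_size            # remainder goes to the test split
    return train_size, test_size


print(split_sizes(101))  # (70, 31)
```

random_split raises an error when the lengths do not sum to len(dataset), which is why the remainder is not computed as a second rounded fraction.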
This dataset can be further divided into train and test data loaders as given below to perform operation in batches.
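Usually you will see a single batch_size used for both the train and test loaders, but here I define them separately. Choosing a batch_size that divides the size of the train/test split evenly keeps every batch the same size. Note that DataLoader itself does not require this: by default the last batch is simply smaller, and drop_last=True discards it. An error only appears if later code assumes a fixed batch size.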
# Set batch size of train data loader
batch_size_train = 20
# Set batch size of test data loader
batch_size_test = 22
# load the split train and test data into batches via DataLoader()
train_loader = DataLoader(train_data, batch_size=batch_size_train, shuffle=True)
test_loader = DataLoader(test_data, batch_size=batch_size_test, shuffle=True)
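To see what the chosen batch sizes mean in practice, here is a minimal sketch of how many batches a loader yields and how large the last one is. It assumes DataLoader's default behaviour (the final incomplete batch is kept unless drop_last=True); the helper name `batch_layout` and the sample counts 700 and 300 are made up for illustration.

```python
def batch_layout(n_samples, batch_size, drop_last=False):
    """Return (number_of_batches, size_of_last_batch)."""
    full, remainder = divmod(n_samples, batch_size)
    if remainder == 0 or drop_last:
        return full, (batch_size if full else 0)
    return full + 1, remainder


print(batch_layout(700, 20))                  # (35, 20): divides evenly
print(batch_layout(300, 22))                  # (14, 14): last batch is smaller
print(batch_layout(300, 22, drop_last=True))  # (13, 22): incomplete batch dropped
```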

Yes, that is correct. ImageFolder's default loader opens each file with Pillow and converts it to RGB, so grayscale files come back with three channels; see e.g. the answers to this question. Converting back to grayscale in the transform works, though it takes time of course.
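For reference, the grayscale conversion itself is a weighted sum of the three channels. Pillow documents the weights it uses for RGB to "L" as the ITU-R 601-2 luma transform, L = R * 299/1000 + G * 587/1000 + B * 114/1000. The function below is a plain-Python sketch of that formula for a single pixel; the name `rgb_to_luma` is hypothetical, and Pillow's internal integer rounding may differ by one gray level.

```python
def rgb_to_luma(r, g, b):
    """Approximate 8-bit gray value of an RGB pixel (ITU-R 601-2 weights)."""
    return round((299 * r + 587 * g + 114 * b) / 1000)


print(rgb_to_luma(255, 255, 255))  # 255
print(rgb_to_luma(255, 0, 0))      # 76
print(rgb_to_luma(0, 0, 0))        # 0
```

For an image that was gray to begin with (R = G = B), the weights sum to one, so converting to RGB and back should give the original values again.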
Pure pytorch solution (if ImageFolder isn't appropriate)
You can roll your own data loading functionality. If I were you, I wouldn't go the fastai route, as it's pretty high level and takes control away from you (you might not need those features anyway).
In principle, all you have to do is create something like this:
import pathlib
import torch
from PIL import Image
class ImageDataset(torch.utils.data.Dataset):
    def __init__(self, path: pathlib.Path, images_class: int, regex="*.png"):
        # `regex` is really a glob pattern handed to Path.glob
        self.files = [file for file in path.glob(regex)]
        self.images_class: int = images_class
    def __len__(self):
        # Needed so that `+` (ConcatDataset) and DataLoader know the dataset size
        return len(self.files)
    def __getitem__(self, index):
        # "LA" is grayscale plus alpha (2 channels); use "L" for a single channel
        return Image.open(self.files[index]).convert("LA"), self.images_class
# Assuming you have `png` images, can modify that with regex
final_dataset = (
    ImageDataset(pathlib.Path("/path/to/dogs/images"), 0)
    + ImageDataset(pathlib.Path("/path/to/cats/images"), 1)
    + ImageDataset(pathlib.Path("/path/to/turtles/images"), 2)
)
The above gets you the images from the provided paths, and each image is returned together with its class.
This gives you more flexibility (a different folder layout than torchvision.datasets.ImageFolder expects) for a few more lines.
Of course, you could add more of those, use a loop, or whatever else.
You could also apply torchvision.transforms, e.g. to transform the images above to tensors.
torchdata solution
Disclaimer: author here. If you are concerned about the loading times of your data and the grayscale transformation, you could use torchdata, a third-party library for PyTorch.
Using it, one could create the same thing as above but use cache or map (to apply torchvision.transforms or other transformations easily) and some other things known e.g. from the tensorflow.data module, see below:
import pathlib
from PIL import Image
import torchdata
# Change inheritance
class ImageDataset(torchdata.Dataset):
    def __init__(self, path: pathlib.Path, images_class: int, regex="*.png"):
        super().__init__()  # And add constructor call and that's it
        self.files = [file for file in path.glob(regex)]
        self.images_class: int = images_class
    def __len__(self):
        return len(self.files)
    def __getitem__(self, index):
        return Image.open(self.files[index]), self.images_class
final_dataset = (
    ImageDataset(pathlib.Path("/path/to/dogs/images"), 0)
    + ImageDataset(pathlib.Path("/path/to/cats/images"), 1)
    + ImageDataset(pathlib.Path("/path/to/turtles/images"), 2)
).cache()  # will cache data in-memory after first pass
# You could apply transformations after caching for possible speed-up
torchvision ImageFolder loader
As correctly pointed out by @jodag in the comments, one can pass a loader callable that takes a single path argument to customize how files are opened, e.g. for grayscale it could be:
from PIL import Image
import torchvision
dataset = torchvision.datasets.ImageFolder(
    "/path/to/images", loader=lambda path: Image.open(path).convert("LA")
)
Please notice you could also use it for other types of files; they don't have to be images.

Make a custom loader and feed it to ImageFolder:
import numpy as np
from PIL import Image, ImageOps
from torchvision.datasets import ImageFolder
def gray_reader(image_path):
    im = Image.open(image_path)
    im2 = ImageOps.grayscale(im)
    im.close()
    return np.array(im2)  # return np array
    # return im2  # return PIL Image
# image_root_path is the root folder of the dataset
some_dataset = ImageFolder(image_root_path, loader=gray_reader)
Edit:
The code below is much better than the previous version: load the color image and convert it to grayscale in the transform.
from torchvision import transforms
def get_transformer(h, w):
    valid_transform = transforms.Compose([
        transforms.ToPILImage(),  # expects a tensor or ndarray as input
        transforms.Grayscale(num_output_channels=1),
        transforms.Resize((h, w)),
        transforms.ToTensor(),
        transforms.Normalize([0.5], [0.5])])
    return valid_transform
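The last two steps of the transform above are plain arithmetic: ToTensor scales 8-bit pixel values to the range [0, 1], and Normalize([0.5], [0.5]) then computes (x - mean) / std for the single channel. The sketch below reproduces that calculation on scalar values; the function name `normalize` is only for illustration.

```python
def normalize(value, mean=0.5, std=0.5):
    """Per-channel normalization as applied by Normalize: (x - mean) / std."""
    return (value - mean) / std


print(normalize(0.0))  # -1.0 (black)
print(normalize(0.5))  # 0.0 (mid gray)
print(normalize(1.0))  # 1.0 (white)
```

With mean 0.5 and std 0.5, the [0, 1] range therefore maps to [-1, 1].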

Related

Why Python stops to work on GPU when using SimpleITK library in MONAI transforms?

I'm using Python 3.9 with Spyder 5.2.2 (Anaconda) for a U-Net segmentation task with MONAI. After importing all the images in a dictionary, I create these lines to define pre-process steps:
import numpy as np  # used by N4ITKTransform below
import SimpleITK as sitk
from monai.data import DataLoader, IterableDataset, PILReader  # used below
from monai.inferers import SimpleInferer
from monai.transforms import (
    AsDiscrete,
    DataStatsd,
    AddChanneld,
    Compose,
    Activations,
    LoadImaged,
    Resized,
    RandFlipd,
    ScaleIntensityRanged,
    DataStats,
    AsChannelFirstd,
    AsDiscreted,
    ToTensord,
    EnsureTyped,
    RepeatChanneld,
    EnsureType
)
from monai.transforms import Transform
monai_load = [
    LoadImaged(keys=["image", "segmentation"], image_only=False, reader=PILReader()),
    EnsureTyped(keys=["image", "segmentation"], data_type="numpy"),
    AddChanneld(keys=["segmentation", "image"]),
    RepeatChanneld(keys=["image"], repeats=3),
    AsChannelFirstd(keys=["image"], channel_dim=0),
]
monai_transforms = [
    AsDiscreted(keys=["segmentation"], threshold=0.5),
    ToTensord(keys=["image", "segmentation"]),
]
class N4ITKTransform(Transform):
    def __call__(self, image):
        filtered = []
        # Run N4 bias field correction on each channel separately
        for channel in image["image"]:
            inputImage = sitk.GetImageFromArray(channel)
            inputImage = sitk.Cast(inputImage, sitk.sitkFloat32)
            corrector = sitk.N4BiasFieldCorrectionImageFilter()
            outputImage = corrector.Execute(inputImage)
            filtered.append(sitk.GetArrayFromImage(outputImage))
        image["image"] = np.stack(filtered)
        return image
train_transforms = Compose(monai_load + [N4ITKTransform()] + monai_transforms)
When I apply these transforms to the training images through Compose, Python does not run on the GPU, even though
torch.cuda.is_available()
returns True.
These are the lines where I apply the transforms:
train_ds = IterableDataset(data = train_data, transform = train_transforms)
train_loader = DataLoader(dataset = train_ds, batch_size = batch_size, num_workers = 0, pin_memory = True)
When I define the U-Net model, I send it to 'cuda'.
The problem is in the SimpleITK transform. If I don't use them, Python works on GPU as usual.
Thank you in advance for getting back to me.
Federico
The answer is simple: SimpleITK uses CPU for processing.
I am not sure whether it is possible to get it to use some of the GPU-accelerated filters from ITK (its base library). If you use ITK Python, you have the possibility to use GPU-filters. But only a few filters have GPU implementations. N4BiasFieldCorrection does NOT have a GPU implementation. So if you want to use this filter, it needs to be done on the CPU.

find template image in directory of images

I have a directory of about 100,000 images, and I have an image that I know has a similar copy somewhere in this directory. The copy is saved in a different format and scaled differently, but I don't know where it is.
I want to look for the image and find out its filename inside this directory.
I am looking for a mostly ready-made solution, which I couldn't find. I found OpenCV, but I would need to write code around it. Is there a project like that out there?
If there isn't, could you help me make a simple C# console app using OpenCV? I tried their templates but never managed to get SURF or CudaSURF working.
Thanks
Edited as per @Mark Setchell's comment
If the image is identical, the fastest way is to get the file size of the image you are looking for and compare it with the file sizes of the images amongst which you are searching.
I suggest this first because, as Christoph clarifies in the comments, it doesn't require reading the file at all - it is just metadata.
If that yields more than one matching answer, calculate a hash (MD5 or other) and pick the filename that produces the same hash.
Again, as mentioned by Christoph in the comments, this doesn't require decoding the image, or holding the decompressed image in RAM, just checksumming it.
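The two steps above (compare file sizes first, hash only when several files share the size) can be sketched with the standard library alone. The function names `file_md5` and `find_identical` are hypothetical, and the demo files at the end are throwaway data created only to exercise the code.

```python
import hashlib
import os
import tempfile


def file_md5(path, chunk_size=1 << 20):
    """Checksum a file in chunks, without decoding it as an image."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_identical(target, directory):
    """Return paths in `directory` that are byte-identical candidates for `target`."""
    target_size = os.path.getsize(target)
    # Step 1: metadata only, no file contents are read.
    same_size = [
        entry.path
        for entry in os.scandir(directory)
        if entry.is_file() and entry.stat().st_size == target_size
    ]
    if len(same_size) <= 1:
        return same_size
    # Step 2: several candidates share the size, so compare checksums.
    target_hash = file_md5(target)
    return [path for path in same_size if file_md5(path) == target_hash]


# Demo on throwaway files.
with tempfile.TemporaryDirectory() as folder:
    library = os.path.join(folder, "library")
    os.mkdir(library)
    query = os.path.join(folder, "query.bin")
    files = {
        query: b"abc",
        os.path.join(library, "copy.bin"): b"abc",
        os.path.join(library, "other.bin"): b"abd",
        os.path.join(library, "small.bin"): b"xy",
    }
    for path, payload in files.items():
        with open(path, "wb") as handle:
            handle.write(payload)
    matches = sorted(os.path.basename(p) for p in find_identical(query, library))

print(matches)  # ['copy.bin']
```

This only finds byte-identical files; a copy that was re-encoded or rescaled has a different size and hash.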
So in the end I used this site and modified the python code used there for searching a directory instead of a single image. There is not much code so the full thing is below:
import argparse
import cv2
from os import listdir
from os.path import isfile, join
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", type=str, required=True,
    help="path to the directory of images where we'll apply template matching")
ap.add_argument("-t", "--template", type=str, required=True,
    help="path to template image")
args = vars(ap.parse_args())
# load the template image from disk
print("[INFO] loading template...")
template = cv2.imread(args["template"])
cv2.namedWindow("Output")
cv2.startWindowThread()
# Display an image
cv2.imshow("Output", template)
cv2.waitKey(0)
# convert the template to grayscale
templateGray = cv2.cvtColor(template, cv2.COLOR_BGR2GRAY)
imageFileNames = [f for f in listdir(args["image"]) if isfile(join(args["image"], f))]
for imageFileName in imageFileNames:
    try:
        # join() works whether or not the directory path ends with a separator
        imagePath = join(args["image"], imageFileName)
        print("[INFO] Loading " + imagePath + " from disk...")
        image = cv2.imread(imagePath)
        print("[INFO] Converting " + imageFileName + " to grayscale...")
        imageGray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        print("[INFO] Performing template matching for " + imageFileName + "...")
        result = cv2.matchTemplate(imageGray, templateGray,
            cv2.TM_CCOEFF_NORMED)
        (minVal, maxVal, minLoc, maxLoc) = cv2.minMaxLoc(result)
        (startX, startY) = maxLoc
        endX = startX + template.shape[1]
        endY = startY + template.shape[0]
        if maxVal > 0.75:
            print("maxVal = " + str(maxVal))
            # draw the bounding box on the image
            cv2.rectangle(image, (startX, startY), (endX, endY), (255, 0, 0), 3)
            # show the output image
            cv2.imshow("Output", image)
            cv2.waitKey(0)
            cv2.imshow("Output", template)
    except KeyboardInterrupt:
        break
    except Exception:
        # unreadable files and images smaller than the template end up here
        print(imageFileName)
        print("Error")
cv2.destroyAllWindows()
The code above shows any image whose match value (which I take to be a measure of how similar the source and template are) is greater than 0.75.
That is probably still too low, but if you want to use it, tweak it to your liking.
Note that this WILL NOT work if the image is rotated, and if, like me, you have a bright light source in the template, other light sources will come up as false positives.
As for time, it took me about 7 hours, with the script pausing about every 20 minutes for a false positive, until I found my image. I got through about 2/3 of all images.
As a side note, it took 10 minutes just to build the list of files inside the directory, and it took about 500 MB of RAM once done.
This is not the best answer, so if anyone more qualified finds this, feel free to write another answer.
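One possible way to shorten the time spent building the file list is to iterate over the directory lazily. os.scandir yields directory entries that usually carry the file-type information already, so is_file() often needs no extra system call per file, and a generator avoids holding all names in memory at once. This is a sketch with a hypothetical helper, `iter_image_paths`; the speed-up is an expectation, not something measured on the 100,000-image directory.

```python
import os
import tempfile


def iter_image_paths(directory, extensions=(".png", ".jpg", ".jpeg")):
    """Yield image file paths one at a time instead of building a full list."""
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file() and entry.name.lower().endswith(extensions):
                yield entry.path


# Demo on a throwaway directory.
with tempfile.TemporaryDirectory() as folder:
    for name in ("a.png", "b.JPG", "notes.txt"):
        open(os.path.join(folder, name), "wb").close()
    os.mkdir(os.path.join(folder, "sub.png"))  # directories are skipped
    found = sorted(os.path.basename(p) for p in iter_image_paths(folder))

print(found)  # ['a.png', 'b.JPG']
```

In the script above, the loop over imageFileNames could iterate over such a generator directly.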

Process depth image message from ROS with openCV

I am currently writing a Python script that is supposed to receive a ROS image message and then convert it to cv2 so I can do further processing. Right now the program just receives an image, displays it in a little window, and saves it as a PNG.
Here is my code:
#! /usr/bin/python
import rospy
from sensor_msgs.msg import Image
from cv_bridge import CvBridge, CvBridgeError
import cv2
bridge = CvBridge()
def image_callback(msg):
    print("Received an image!")
    print(msg.encoding)
    try:
        # Convert your ROS Image message to OpenCV2
        # Converting the rgb8 image of the front camera, works fine
        cv2_img = bridge.imgmsg_to_cv2(msg, 'rgb8')
        # Converting the depth images, does not work
        #cv2_img = bridge.imgmsg_to_cv2(msg, '32FC1')
    except CvBridgeError as e:
        print(e)
    else:
        # Save your OpenCV2 image as a png
        cv2.imwrite('camera_image.png', cv2_img)
        cv2.imshow('pic', cv2_img)
        cv2.waitKey(0)
def main():
    rospy.init_node('image_listener')
    #does not work:
    #image_topic = "/pepper/camera/depth/image_raw"
    #works fine:
    image_topic = "/pepper/camera/front/image_raw"
    rospy.Subscriber(image_topic, Image, image_callback)
    rospy.spin()
if __name__ == '__main__':
    main()
So my problem is that my code works perfectly fine with the data from the front camera but does not work for the depth images.
To make sure I get the correct encoding type, I print msg.encoding, which tells me the encoding of the current ROS message.
cv2.imshow works exactly as it should for the front camera pictures and shows me the same thing I would get from ROS image_view, but as soon as I try it with the depth image I just get a fully black or white picture, unlike what image_view shows me.
Here is the depth image I get when I use image_view
Here is the depth image I receive when I use the script and cv2.imshow
Does anyone have experience working on depth images with cv2 and can help me get it working with the depth images as well?
I would really appreciate any help :)
Best regards
You could try acquiring the depth images in the following way:
import rospy
from cv_bridge import CvBridge, CvBridgeError
from sensor_msgs.msg import Image
import numpy as np
import cv2
def convert_depth_image(ros_image):
    cv_bridge = CvBridge()
    try:
        depth_image = cv_bridge.imgmsg_to_cv2(ros_image, desired_encoding='passthrough')
    except CvBridgeError as e:
        print(e)
        return  # nothing to process if the conversion failed
    depth_array = np.array(depth_image, dtype=np.float32)
    np.save("depth_img.npy", depth_array)
    rospy.loginfo(depth_array)
    # To save image as png
    # Apply colormap on depth image (image must be converted to 8-bit per pixel first)
    depth_colormap = cv2.applyColorMap(cv2.convertScaleAbs(depth_image, alpha=0.03), cv2.COLORMAP_JET)
    cv2.imwrite("depth_img.png", depth_colormap)
    # Or you use
    # depth_array = depth_array.astype(np.uint16)
    # cv2.imwrite("depth_img.png", depth_array)
def pixel2depth():
    rospy.init_node('pixel2depth', anonymous=True)
    rospy.Subscriber("/pepper/camera/depth/image_raw", Image, callback=convert_depth_image, queue_size=1)
    rospy.spin()
if __name__ == '__main__':
    pixel2depth()
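A likely reason for the all-black or all-white window in the question is the value range: for 32-bit float images cv2.imshow maps [0, 1] to [0, 255], so depth values in metres or millimetres saturate to white, while zeros stay black. The sketch below rescales a depth array to 8 bits for display only. The helper name `depth_to_uint8` is hypothetical, and treating zero and NaN as "no measurement" is an assumption about the sensor.

```python
import numpy as np


def depth_to_uint8(depth):
    """Rescale valid depth values to 0..255 for display; invalid pixels become 0."""
    depth = np.asarray(depth, dtype=np.float32)
    valid = np.isfinite(depth) & (depth > 0)  # assumption: 0 and NaN mean "no data"
    image = np.zeros(depth.shape, dtype=np.uint8)
    if not valid.any():
        return image
    near = depth[valid].min()
    far = depth[valid].max()
    if far == near:
        image[valid] = 255
        return image
    scaled = (depth[valid] - near) / (far - near)
    image[valid] = (scaled * 255).astype(np.uint8)
    return image


sample = np.array([[0.0, 1.0], [2.0, 3.0], [np.nan, 3.0]], dtype=np.float32)
preview = depth_to_uint8(sample)
print(preview.tolist())  # [[0, 0], [127, 255], [0, 255]]
```

The result can be passed to cv2.imshow or cv2.imwrite directly. Keep the original float array (for example the saved .npy file) for any actual distance computation, since the rescaling discards the physical units.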

Is my topojson file structured properly in order to render a map on folium?

I've downloaded the us-counties shapefile from the US census bureau and converted it into topojson file using mapshaper.com. Unfortunately, I have to parse through the topojson quite a bit to get the FIPS county code. I'm using Folium to render the map but keep getting an error.
I've taken my dataframe and made it into a series of FIPS_codes and $amounts. Using the style_function, I call the FIPS_codes from the topojson file and compare that value to the series to render a map of us-counties.
import branca
colorscale = branca.colormap.linear.YlOrRd_09.scale(0, 50e3)
def style_function(feature):
    county_dict = cms_2017_grouped_series.get(
        features['objects']['tl_2017_us_county']['geometries']['properties']['GEOID'], None)
    return {
        'fillOpacity': 0.5,
        'weight': 0,
        'fillColor': '#black' if employed is None else colorscale(employed)
    }
The error I'm getting is AttributeError: 'list' object has no attribute 'get'
The rest of code needed to render the map is below
m = folium.Map(
    location=[48, -102],
    tiles='cartodbpositron',
    zoom_start=3
)
folium.TopoJson(
    json.load(open(county_geo)),
    'objects.tl_2017_us_county.geometries.properties.GEOID',
    style_function=style_function
).add_to(m)
I followed your steps to create the topojson and, good news, it checks out. You just need to change a couple of things in your code.
I created some mock user data first. I'm using geopandas and the topojson file to make it easy on myself, but you would just use your pandas dataframe that contains the county and employment numbers.
import geopandas as gpd
import numpy as np
gdf = gpd.read_file('tl_2017_us_county.json')
gdf['employed'] = np.random.randint(low=1, high=100000, size=len(gdf))
Create a Series using your dataframe. This will be used in the style func to "bind" your data to the map
cms_2017_grouped_series = gdf.set_index('GEOID')['employed']
print(cms_2017_grouped_series.head())
GEOID
31039    54221
53069    68374
35011     8477
31109     2278
31129    40247
Name: employed, dtype: int64
This is pretty close to your style function. I've just changed the line with the .get() to use the corrected dict keys of feature. Oh and I'm using the return value(employed) in the fillColor below
import branca
colorscale = branca.colormap.linear.YlOrRd_09.scale(0, 50e3)
def style_function(feature):
    employed = cms_2017_grouped_series.get(feature['properties']['GEOID'], None)
    return {
        'fillOpacity': 0.5,
        'weight': 0,
        # 'black' is the named color; '#black' is not a valid hex color
        'fillColor': 'black' if employed is None else colorscale(employed)
    }
Slight mod of the object_path is next. I'm also saving the map and then opening it in Chrome since it wouldn't render in my notebook due to the size
import folium
m = folium.Map(
    location=[48, -102],
    tiles='cartodbpositron',
    zoom_start=3
)
folium.TopoJson(open('tl_2017_us_county.json'), 'objects.tl_2017_us_county',
                style_function=style_function).add_to(m)
m.save('map.html')
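To see why the shorter object path works and the original one produced "'list' object has no attribute 'get'", here is a minimal sketch of resolving a dotted path through nested dictionaries. It is not folium's actual implementation; the `resolve_path` helper and the tiny `topo` structure are made up for illustration, with the same key names as the census file.

```python
def resolve_path(data, dotted_path):
    """Follow a dotted path through nested dicts, one key at a time."""
    node = data
    for key in dotted_path.split("."):
        node = node.get(key)  # fails as soon as `node` is a list
    return node


# Tiny stand-in for the TopoJSON layout: `geometries` is a list of features.
topo = {
    "objects": {
        "tl_2017_us_county": {
            "type": "GeometryCollection",
            "geometries": [
                {"type": "Polygon", "properties": {"GEOID": "31039"}},
            ],
        }
    }
}

collection = resolve_path(topo, "objects.tl_2017_us_county")
print(collection["type"])  # GeometryCollection

try:
    resolve_path(topo, "objects.tl_2017_us_county.geometries.properties.GEOID")
    error_message = None
except AttributeError as exc:
    error_message = str(exc)
print(error_message)  # 'list' object has no attribute 'get'
```

The path has to stop at the geometry collection; the per-county GEOID is then read inside style_function from feature['properties'].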

kivy: possible to use buffer as image source?

I've got code along the lines of the following which generates a new image out of some existing images.
from PIL import Image as pyImage
def create_compound_image(back_image_path, fore_image_path, fore_x_position):
    back_image_size = get_image_size(back_image_path)
    fore_image_size = get_image_size(fore_image_path)
    new_image_width = (fore_image_size[0] / 2) + back_image_size[0]
    new_image_height = fore_image_size[1] + back_image_size[1]
    new_image = create_new_image_canvas(new_image_width, new_image_height)
    back_image = pyImage.open(back_image_path)
    fore_image = pyImage.open(fore_image_path)
    new_image.paste(back_image, (0, 0), mask = None)
    new_image.paste(fore_image, (fore_x_position, back_image_size[1]), mask = None)
    return new_image
Later in the code, I've got something like this:
from kivy.uix.image import Image
img = Image(source = create_compound_image(...))
If I do the above, I get the message that Image.source only accepts string/unicode.
If I create a StringIO.StringIO() object from the new image, and try to use that as the source, the error message is the same as above. If I use the output of the StringIO object's getvalue() method as the source, the message is that the source must be encoded string without NULL bytes, not str.
What is the proper way to use the output of the create_compound_image() function as the source when creating a kivy Image object?
It seems you just want to combine two images into one. You can create a texture using Texture.create and blit the data to a particular position using Texture.blit_buffer.
from kivy.core.image import Image
from kivy.graphics.texture import Texture
bkimg = Image(bk_img_path)
frimg = Image(fr_img_path)
# integer division: texture sizes must be whole pixels
new_size = ((frimg.texture.size[0] // 2) + bkimg.texture.size[0],
            frimg.texture.size[1] + bkimg.texture.size[1])
tex = Texture.create(size=new_size)
tex.blit_buffer(pbuffer=bkimg.texture.pixels, pos=(0, 0), size=bkimg.texture.size)
tex.blit_buffer(pbuffer=frimg.texture.pixels, pos=(fore_x_position, bkimg.texture.size[1]), size=frimg.texture.size)
Now you can use this texture anywhere directly, like:
from kivy.uix.image import Image
image = Image()
image.texture = tex
source is a StringProperty and expects a path to a file. That's why you got errors when you tried to pass a PIL.Image object, a StringIO object, or the string representation of an image. It's not what the framework wants. As for getting an image from StringIO, it was discussed before here:
https://groups.google.com/forum/#!topic/kivy-users/l-3FJ2mA3qI
https://github.com/kivy/kivy/issues/684
You can also try a much simpler, quick and dirty method: just save your image as a temporary file and read it the normal way.
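The temporary-file route could look roughly like the sketch below. The helper `save_to_temp_file` is hypothetical: it creates a temporary file, lets any callable that accepts a path write into it, and returns the path. The demo writes placeholder bytes so that it runs without PIL or Kivy installed.

```python
import os
import tempfile


def save_to_temp_file(writer, suffix=".png"):
    """Create a temporary file, let `writer(path)` fill it, and return the path."""
    handle, path = tempfile.mkstemp(suffix=suffix)
    os.close(handle)  # the writer reopens the file by path
    writer(path)
    return path


def write_placeholder(path):
    # Stand-in for an image writer; these are just the PNG signature bytes.
    with open(path, "wb") as out:
        out.write(b"\x89PNG\r\n\x1a\n")


tmp_path = save_to_temp_file(write_placeholder)
has_png_suffix = tmp_path.endswith(".png")
with open(tmp_path, "rb") as saved:
    round_trip = saved.read()
os.remove(tmp_path)  # the caller is responsible for cleaning up
```

With the code from the question, the writer would be the PIL image's own save method, e.g. path = save_to_temp_file(create_compound_image(...).save), followed by Image(source=path). The file should only be deleted once Kivy has loaded it.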

Resources