Tensorflow 2 - Associating csv lines with image files - image

New to Tensorflow here, so sorry if the question may be basic.
I am trying to create a GAN that will generate images based on a small set of parameters plus a random vector.
In the training set, for each image, I have also one line in a CSV file that is related to such image.
The structure of the CSV file is like this:
Parameter1, Parameter2, Parameter3, ImageFile
4, 7, 2, Image221.png
6, 0, 8, Image044.png
1, 4, 2, Image179.png
I also have a folder with the image files with the given file names.
My problem: I would like to create a pipeline that does not have to load the entire data into memory at once for training (which is a behavior tf.data.Dataset does exhibit), but I need to combine each line in the CSV file with its corresponding image file.
I know how to use list_files to use the images and I know how to use make_csv_dataset in order to use the CSV, but how do I guarantee that each CSV line will be necessarily linked to its correct image file?

For those facing the same problem, I found the obvious solution: all you have to do is to create a map function that takes the file name, loads it and inserts the loaded image as a tensor in a column that replaces the column of the file name.
Ex (for one column with the file name and one with a class):
import PIL
def load_image(filename, class):
img = PIL.Image.Open(filename)
return img, class
dataset = dataset.map(load_image)
Notice that I am using the pillow library (PIL) in order to load the image and this is not mandatory. You can use whatever means you see fit for that.
What really matters here is to load the image in a function and map your dataset with that function.

Related

Tensorflow’s flow_from_directory returns erroneous Set

I am using ImageDataGenerator to create validation set ( for an image classification using TF (Keras) from a labeled directory of images. The directories are 0,1,2,3,4 corresponding to class of each image, they contain 488, 185, 130, 131, 91 images respectively.
train_image_generator = ImageDataGenerator(rescale=1./255) # Generator for our training data
# validation_image_generator = ImageDataGenerator(rescale=1./255) # Generator for our validation data
train_data_gen = train_image_generator.flow_from_directory(batch_size=batch_size,
directory=train_dir,
shuffle=True,
target_size=(IMG_HEIGHT, IMG_WIDTH),
class_mode='categorical')
returns
Found 0 images belonging to 5 classes.
and the code below
validation_data_generator = train_image_generator.flow_from_directory(
train_dir, # same directory as training data
target_size=(IMG_HEIGHT, IMG_WIDTH),
batch_size=batch_size,
class_mode='categorical',
subset='validation') # set as validation data
outputs:
Found 0 images belonging to 5 classes.
What is wrong please? I suppose the validation set has at least few images for last class!
Any help would be greatly appreciated.
CS
It's hard to say without seeing what your directory string is, but I suspect perhaps the problem could be with characters in your "train_dir" variable. You can try using os.path.exists(train_dir) to see if your code is actually pointing to the right spot, or use os.listdir to see if you're seeing the files that way.
Throwing this line in there before you create your data generators might solve it:
import os
train_dir=os.path.normpath(train_dir)
Usually I find the problem is with the slashes in a path. If you just want to use a string, you usually need to use double backslashes (\) or a single forward slash (/) instead of the single backslashes you usually see in a path. This is especially problematic when your filenames or sub-directories start with a number (as yours do).

How to use DBSCAN algorithm for a list of points in python

I am new to image processing and python coding.
I have detected a number of features in an image and have their respective pixel locations placed in a list format.
My_list = [(x1,y1),(x2,y2),......,(xn,yn)]
I would like to use DBSCAN algorithm to form clusters from the following points.
Currently using sklearn.cluster to import the build in DBSCAN function for python.
If the current format for the points is not compatible would like to know which is?
Error currently facing with the current format:
C:\Python\python.exe "F:/opencv_files/dbscan.py"
**Traceback (most recent call last):**
**File "**F:/opencv_files/dbscan.py**", line 83, in <module>
db = DBSCAN(eps=0.5, min_samples=5).fit(X) # metric=X)**
**File "**C:\Python\lib\site-packages\sklearn\cluster\dbscan_.py**", line 282, in fit
X = check_array(X, accept_sparse='csr')
File "**C:\Python\lib\site-packages\sklearn\utils\validation.py**", line 441, in check_array
"if it contains a single sample.".format(array))
ValueError: Expected 2D array, got 1D array instead:
array=[].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.**
Your data is a list of tuple. There is nothing in this structure that prevents you from doing crazy things with that, such as having different lengths in there. Plus, this is a very slow and memory inefficient way of keeping the data because everything is boxed as a Python object.
Just call data = numpy.array(data) to convert your data into an efficient multidimensional numeric array. This array will then have a shape.

Importing images to prep for keras

I am trying to import a bunch of images and get them ready for keras. The goal here is to have an array of the following dimensions. (length, 160,329,3). As you can see my reshape function is commented out. The "print(images.shape) line returns (8037,). Not sure how to proceed to get the right array dimensions. For reference the 1st column in the csv file is a list of paths to the image in question. I have a function below that combines the path of the image inside the folder and the path to the folder.
When I run the commented out reshape function I get the following error. "ValueError: cannot reshape array of size 8037 into shape (8037,160,320,3)"
import csv
import cv2
f = open('/Users/username/Desktop/data/driving_log.csv')
csv_f = csv.reader(f)
m=[]
for row in csv_f:
n=(row)
m.append(n)
images=[]
for i in range(len(m)):
img=(m[i][1])
img=img.lstrip()
path='/Users/username/Desktop/data/'
img=path+img
image=cv2.imread(img)
images.append(image)
item_num = len(images)
images=np.array(images)
#images=np.array(images).reshape(item_num, 160, 320, 3)
print(images.shape) #returns (8037,)
Can you print the shape of an image before it is appended to images to verify it is what you expect? Even better would be adding an imshow in the loop to make sure you're loading the images you expect (only need to do for one or two). cv2.imread does not throw an error if there isn't an image at the file path you give it, so your array might be all None which would yield the exact behavior you've described.
If that is the problem, check the img variable and make sure it's pointing exactly where you want it to.
Turns out it was including the first line of the CSV file which was heading. After I sorted that out it ran great. It gave me the requested shape.
images=[]
for i in range(1,len(labels)):
img=(m[i][1])
img=img.lstrip()
path='/Users/user/Desktop/data/'
img=path+img
image=cv2.imread(img)
images.append(image)

In Python 3, best way to open an image stored in a list as a file object?

Using python 3.4 in linux and windows, I'm trying to create qr code images from a list of string objects. I don't want to just store the image as a file because the list of strings may change frequently. I want to then tile all the objects and display the resulting image on screen for the user to scan with a barcode scanner. For the user to know which code to scan I need to add some text to the qr code image.
I can create the list of image objects correctly and they are in a list and calling .show on these objects displays them properly but I don't know how to treat these objects as a file object to open them. The object that is given to the open function, (img_list[0] in my case), in my add_text_to_img needs to support read, seek and tell methods. When I try this as is I get an attribute error. I've tried BytesIO and StringIO but I get an error message that Image.open does not support buffer interface. Maybe I am not doing that part correctly.
I'm sure there are several ways to do this, but what is the best way to open in memory objects as a file object?
from io import BytesIO
import qrcode
from PIL import ImageFont, ImageDraw, Image
def make_qr_image_list(code_list):
"""
:param code_list: a list of string objects to encode into QR code image
:return: a list of image or some type of other data objects
"""
img_list = []
for item in code_list:
qr = qrcode.QRCode(
version=None,
error_correction=qrcode.ERROR_CORRECT_L,
box_size=4,
border=10
)
qr.add_data(item)
qr_image = qr.make_image(fit=True)
img_list.append(qr_image)
return img_list
def add_text_to_img(text_list, img_list):
"""
While I was working on this, I am only saving the first image. Once
it's working, I'll save the rest of the images to a list.
:param text_list: a list of strings to add to the corresponding image.
:param img_list: the list containing the images already created from
the text_list
:return:
"""
base = Image.open(img_list[0])
# img = Image.frombytes(mode='P', size=(164,164), data=img_list[0])
text_img = Image.new('RGBA', base.size, (255,255,255,0))
font = ImageFont.truetype('sans-serif.ttf', 10)
draw = ImageDraw.Draw(text_img)
draw.text((0,-20),text_list[0], (0,0,255,128), font=font)
# include some method to save the images after the text
# has been added here. Shouldn't actually save to a file.
# Should be saved to memory/img_list
output = Image.alpha_composite(base,text_img)
output.show()
if __name__ == '__main__':
test_list = ['AlGaN','n-AlGaN','p-AlGaN','MQW','LED AlN-AlGaN']
image_list = make_qr_image_list(test_list)
add_text_to_img(test_list, image_list)
im = image_list[0]
im.save('/my_save_path/test_image.png')
im.show()
Edit: I've been using python for about a year and I feel like this is a pretty common thing to do but I'm not even sure that I'm looking up/searching for the right terms. What topics would you search for to answer this? If anyone can post a link or two to what I need to read up on regarding this, that would be very appreciated.
You already have PIL image objects; qr.make_image() returns the (a wrapper around) the right type of object and you do not need to open them again.
As such, all you need to do is:
base = img_list[0]
and go from there.
You do need to match image modes when compositing; QR codes are black-and-white images (mode 1), so either convert that or use the same mode in your text_img image object. The Image.alpha_composite() operation does require that both images have an alpha channel. Converting the base is easy:
base = img_list[0].convert('RGBA')

Caffe Multiple Input Images

I'm looking at implementing a Caffe CNN which accepts two input images and a label (later perhaps other data) and was wondering if anyone was aware of the correct syntax in the prototxt file for doing this? Is it simply an IMAGE_DATA layer with additional tops? Or should I use separate IMAGE_DATA layers for each?
Thanks,
James
Edit: I have been using the HDF5_DATA layer lately for this and it is definitely the way to go.
HDF5 is a key value store, where each key is a string, and each value is a multi-dimensional array. Thus, to use the HDF5_DATA layer, just add a new key for each top you want to use, and set the value for that key to store the image you want to use. Writing these HDF5 files from python is easy:
import h5py
import numpy as np
filelist = []
for i in range(100):
image1 = get_some_image(i)
image2 = get_another_image(i)
filename = '/tmp/my_hdf5%d.h5' % i
with hypy.File(filename, 'w') as f:
f['data1'] = np.transpose(image1, (2, 0, 1))
f['data2'] = np.transpose(image2, (2, 0, 1))
filelist.append(filename)
with open('/tmp/filelist.txt', 'w') as f:
for filename in filelist:
f.write(filename + '\n')
Then simply set the source of the HDF5_DATA param to be '/tmp/filelist.txt', and set the tops to be "data1" and "data2".
I'm leaving the original response below:
====================================================
There are two good ways of doing this. The easiest is probably to use two separate IMAGE_DATA layers, one with the first image and label, and a second with the second image. Caffe retrieves images from LMDB or LEVELDB, which are key value stores, and assuming you create your two databases with corresponding images having the same integer id key, Caffe will in fact load the images correctly, and you can proceed to construct your net with the data/labels of both layers.
The problem with this approach is that having two data layers is not really very satisfying, and it doesn't scale very well if you want to do more advanced things like having non-integer labels for things like bounding boxes, etc. If you're prepared to make a time investment in this, you can do a better job by modifying the tools/convert_imageset.cpp file to stack images or other data across channels. For example you could create a datum with 6 channels - the first 3 for your first image's RGB, and the second 3 for your second image's RGB. After reading this in using the IMAGE_DATA layer, you can split the stream into two images using a SLICE layer with a slice_point at index 3 along the slice_dim = 1 dimension. If further down the road, you decide that you want to load even more complex assortments of data, you'll understand the encoding scheme and can write your own decoding layer based off of src/caffe/layers/data_layer.cpp to gain full control of the pipeline.
You may also consider using HDF5_DATA layer with multiple "top"s

Resources