In Python 3, best way to open an image stored in a list as a file object?

Using Python 3.4 on Linux and Windows, I'm trying to create QR code images from a list of strings. I don't want to just store each image as a file, because the list of strings may change frequently. I want to tile all the images and display the result on screen for the user to scan with a barcode scanner. So the user knows which code to scan, I need to add some text to each QR code image.
I can create the list of image objects correctly, and calling .show() on these objects displays them properly, but I don't know how to treat these objects as file objects in order to open them. The object passed to the open function in my add_text_to_img (img_list[0] in my case) needs to support the read, seek and tell methods. When I try this as is, I get an AttributeError. I've tried BytesIO and StringIO, but I get an error that Image.open does not support the buffer interface; maybe I am not doing that part correctly.
I'm sure there are several ways to do this, but what is the best way to open in-memory objects as file objects?
from io import BytesIO
import qrcode
from PIL import ImageFont, ImageDraw, Image
def make_qr_image_list(code_list):
    """
    :param code_list: a list of string objects to encode into QR code image
    :return: a list of image or some type of other data objects
    """
    img_list = []
    for item in code_list:
        qr = qrcode.QRCode(
            version=None,
            error_correction=qrcode.ERROR_CORRECT_L,
            box_size=4,
            border=10
        )
        qr.add_data(item)
        qr_image = qr.make_image(fit=True)
        img_list.append(qr_image)
    return img_list
def add_text_to_img(text_list, img_list):
    """
    While I was working on this, I am only saving the first image. Once
    it's working, I'll save the rest of the images to a list.
    :param text_list: a list of strings to add to the corresponding image.
    :param img_list: the list containing the images already created from
        the text_list
    :return:
    """
    base = Image.open(img_list[0])
    # img = Image.frombytes(mode='P', size=(164,164), data=img_list[0])
    text_img = Image.new('RGBA', base.size, (255, 255, 255, 0))
    font = ImageFont.truetype('sans-serif.ttf', 10)
    draw = ImageDraw.Draw(text_img)
    draw.text((0, -20), text_list[0], (0, 0, 255, 128), font=font)
    # include some method to save the images after the text
    # has been added here. Shouldn't actually save to a file.
    # Should be saved to memory/img_list
    output = Image.alpha_composite(base, text_img)
    output.show()
if __name__ == '__main__':
    test_list = ['AlGaN', 'n-AlGaN', 'p-AlGaN', 'MQW', 'LED AlN-AlGaN']
    image_list = make_qr_image_list(test_list)
    add_text_to_img(test_list, image_list)
    im = image_list[0]
    im.save('/my_save_path/test_image.png')
    im.show()
Edit: I've been using Python for about a year, and I feel like this is a pretty common thing to do, but I'm not even sure I'm searching for the right terms. What topics would you search for to answer this? If anyone can post a link or two to what I need to read up on, that would be much appreciated.

You already have PIL image objects; qr.make_image() returns (a wrapper around) the right type of object, and you do not need to open them again.
As such, all you need to do is:
base = img_list[0]
and go from there.
You do need to match image modes when compositing; QR codes are black-and-white images (mode 1), so either convert that or use the same mode in your text_img image object. The Image.alpha_composite() operation requires that both images have an alpha channel. Converting the base is easy:
base = img_list[0].convert('RGBA')
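If you do still want an in-memory, file-like copy of an image (for example, to feed code that insists on a real file object), you can round-trip it through io.BytesIO. Below is a minimal sketch, not part of the original answer: the helper name annotate_in_memory is mine, it reuses the img_list/text_list names from the question, and it uses PIL's default font so it runs without a .ttf file on disk.

from io import BytesIO
from PIL import Image, ImageDraw

def annotate_in_memory(img_list, text_list):
    """Composite text onto each QR image, keeping all results in memory."""
    annotated = []
    for img, text in zip(img_list, text_list):
        base = img.convert('RGBA')  # match modes for alpha_composite
        text_img = Image.new('RGBA', base.size, (255, 255, 255, 0))
        draw = ImageDraw.Draw(text_img)
        draw.text((10, 10), text, fill=(0, 0, 255, 128))  # default font
        output = Image.alpha_composite(base, text_img)
        buf = BytesIO()              # in-memory buffer instead of a file on disk
        output.save(buf, format='PNG')
        buf.seek(0)                  # rewind so the buffer is ready for reading
        annotated.append(buf)
    return annotated

Each returned buffer supports read, seek and tell, so Image.open(annotated[0]) works if you ever do need to reopen one.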

Related

Tensorflow 2 - Associating csv lines with image files

New to TensorFlow here, so sorry if this question is basic.
I am trying to create a GAN that will generate images based on a small set of parameters plus a random vector.
In the training set, for each image, I have also one line in a CSV file that is related to such image.
The structure of the CSV file is like this:
Parameter1, Parameter2, Parameter3, ImageFile
4, 7, 2, Image221.png
6, 0, 8, Image044.png
1, 4, 2, Image179.png
I also have a folder with the image files with the given file names.
My problem: I would like to create a pipeline that does not load the entire dataset into memory at once for training (a behavior tf.data.Dataset supports), but I need to combine each line in the CSV file with its corresponding image file.
I know how to use list_files for the images and make_csv_dataset for the CSV, but how do I guarantee that each CSV line is linked to its correct image file?
For those facing the same problem, I found the obvious solution: all you have to do is create a map function that takes the file name, loads the image, and replaces the file-name column with the loaded image tensor.
Example (for one column with the file name and one with a class):
from PIL import Image

def load_image(filename, label):  # 'class' is a reserved word in Python, so use 'label'
    img = Image.open(filename)
    return img, label

dataset = dataset.map(load_image)
Notice that I am using the Pillow library (PIL) to load the image; this is not mandatory, and you can use whatever means you see fit for that.
What really matters here is to load the image in a function and map your dataset with that function.
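One caveat: Dataset.map traces its function with symbolic tensors, so a plain Python function that calls PIL generally needs to be wrapped in tf.py_function. A common alternative is to stay inside TensorFlow ops end to end. Here is a hedged sketch, assuming the CSV layout from the question; the 'training.csv' and 'images/' paths are illustrative:

import csv
import tensorflow as tf

# Parse the small CSV eagerly; only the images are loaded lazily, per element
params, filenames = [], []
with open('training.csv') as f:
    for row in csv.DictReader(f):
        params.append([int(row['Parameter1']), int(row['Parameter2']), int(row['Parameter3'])])
        filenames.append('images/' + row['ImageFile'].strip())

def load_image(param_vec, filename):
    raw = tf.io.read_file(filename)          # runs per element, not up front
    img = tf.io.decode_png(raw, channels=3)
    return param_vec, img

# Each CSV line stays paired with its image by construction
dataset = tf.data.Dataset.from_tensor_slices((params, filenames)).map(load_image)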

Reading and writing Windows "tags" with Python 3

In Windows, image files can be tagged. These tags can be viewed and edited by right-clicking on a file, clicking over to the Details tab, then clicking on the Tags property value cell.
I want to be able to read and write these tags using Python 3.
This is not EXIF data so EXIF solutions won't work. I believe it's part of the Windows Property System, but I can't find a reference in Dev Center. I looked into win32com.propsys and couldn't see anything in there either.
I wrote a program that does this once before, but I've since lost it, so I know it's possible. Previously I did it without pywin32, but any solution would be great. I think I used windll, but I can't remember.
Here is some sample code that's using the IPropertyStore interface through propsys:
import pythoncom
from win32com.propsys import propsys
from win32com.shell import shellcon
# get PROPERTYKEY for "System.Keywords"
pk = propsys.PSGetPropertyKeyFromName("System.Keywords")
# get property store for a given shell item (here a file)
ps = propsys.SHGetPropertyStoreFromParsingName("c:\\path\\myfile.jpg", None, shellcon.GPS_READWRITE, propsys.IID_IPropertyStore)
# read & print existing (or not) property value, System.Keywords type is an array of string
keywords = ps.GetValue(pk).GetValue()
print(keywords)
# build an array of string type PROPVARIANT
newValue = propsys.PROPVARIANTType(["hello", "world"], pythoncom.VT_VECTOR | pythoncom.VT_BSTR)
# write property
ps.SetValue(pk, newValue)
ps.Commit()
This code is pretty generic for any Windows property.
I'm using System.Keywords because that's what corresponds to the JPEG "Tags" property that you see in the property sheet.
The code works for JPEG and other formats for reading properties (GetValue), but not all Windows codecs support property writing (SetValue), so it doesn't work for writing extended properties back to a .png, for example.
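To make the pattern reusable, the read and write steps can be wrapped in small helpers. This sketch is built only from the calls shown above; the function names are mine and error handling is omitted:

import pythoncom
from win32com.propsys import propsys
from win32com.shell import shellcon

def read_property(path, prop_name):
    """Read any Windows property (e.g. 'System.Keywords') from a file."""
    pk = propsys.PSGetPropertyKeyFromName(prop_name)
    ps = propsys.SHGetPropertyStoreFromParsingName(
        path, None, shellcon.GPS_READWRITE, propsys.IID_IPropertyStore)
    return ps.GetValue(pk).GetValue()

def write_keywords(path, keywords):
    """Overwrite the 'Tags' field shown in Explorer (System.Keywords)."""
    pk = propsys.PSGetPropertyKeyFromName("System.Keywords")
    ps = propsys.SHGetPropertyStoreFromParsingName(
        path, None, shellcon.GPS_READWRITE, propsys.IID_IPropertyStore)
    newValue = propsys.PROPVARIANTType(keywords, pythoncom.VT_VECTOR | pythoncom.VT_BSTR)
    ps.SetValue(pk, newValue)
    ps.Commit()

print(read_property("c:\\path\\myfile.jpg", "System.Keywords"))
write_keywords("c:\\path\\myfile.jpg", ["hello", "world"])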

Using TensorFlow to train an image classifier on my own data with Inception and TFRecords

I followed the TensorFlow tutorial on how to train on your own data, on GitHub: https://github.com/tensorflow/models/tree/master/inception#how-to-construct-a-new-dataset-for-retraining.
I split my data (training and validation), created the labels as suggested, and managed to create the TFRecords using bazel-bin. Everything works, and now I have my own data as TFRecords.
Now I want to train an image classifier from scratch using the Inception-v3 model, and it seems I should use the script inception_train.py, but I am not sure. Is that right? https://github.com/tensorflow/models/blob/master/inception/inception/inception_train.py
If so, I have two questions:
1) How can I train it using my TFRecords? An example would be great.
2) Can I run it on a CPU, or is it only possible on GPUs?
Thank you very much.
Try the following sample code to read images and labels from your tfrecords:
import os
import glob
import tensorflow as tf
from matplotlib import pyplot as plt

def read_and_decode_file(filename_queue):
    # Create an instance of the TFRecord reader
    reader = tf.TFRecordReader()
    # Read from the generated filename queue
    _, serialized_reader = reader.read(filename_queue)
    # Extract the features you require from the tfrecord using their corresponding keys
    # In my example, all images were written with the 'image' key
    features = tf.parse_single_example(
        serialized_reader, features={
            'image': tf.FixedLenFeature([], tf.string),
            # parse_single_example only supports tf.float32, tf.int64 and tf.string
            'labels': tf.FixedLenFeature([], tf.int64)
        })
    # Decode the image bytes (this assumes the images were stored JPEG-encoded;
    # use tf.decode_raw plus a reshape instead if they were stored as raw bytes)
    img = tf.image.decode_jpeg(features['image'], channels=3)
    img_out = tf.image.resize_image_with_crop_or_pad(img, target_height=128, target_width=128)
    # Similarly extract the labels, being careful with the type
    label = features['labels']
    return img_out, label

if __name__ == "__main__":
    tf.reset_default_graph()
    # Path to your tfrecords
    path_to_tf_records = os.getcwd() + '/*.tfrecords'
    # Collect all tfrecords present in the records folder using glob
    list_of_tfrecords = sorted(glob.glob(path_to_tf_records))
    # Generate a tensorflow-readable filename queue by supplying it with
    # a list of tfrecords; optionally, it is recommended to shuffle your data
    # before feeding it into the network
    filename_queue = tf.train.string_input_producer(list_of_tfrecords, shuffle=False)
    # Supply the tensorflow-generated filename queue to the custom function above
    image, label = read_and_decode_file(filename_queue)
    # Create a new tf session to read the data
    sess = tf.Session()
    tf.train.start_queue_runners(sess=sess)
    # Arbitrary number of iterations
    for i in range(50):
        img = sess.run(image)
        # Show the image
        plt.imshow(img)
        plt.show()
There is also a function called tf.train.shuffle_batch that spawns multiple CPU threads to run this read function and returns images and labels batched to a user-specified batch size. You would need to set up the data and training pipelines so that they run concurrently.
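As a rough illustration of that batching step, using the image and label tensors from the code above (the numeric values are placeholders, not tuned recommendations):

# Batch and shuffle with background threads (TF1 queue-based input pipeline);
# shuffle_batch needs fully defined shapes, which the 128x128 crop/pad provides
image_batch, label_batch = tf.train.shuffle_batch(
    [image, label],
    batch_size=32,           # examples per training step
    capacity=2000,           # maximum elements buffered in the queue
    min_after_dequeue=1000,  # larger gives better shuffling, uses more memory
    num_threads=4)           # CPU threads filling the queue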
To answer your second question: yes, you can train your model on a CPU alone, but it will be slow and might take several hours or even days to achieve decent results. Remove the with tf.device('/gpu:{0}'): context manager wrapped around the creation of your Inception model, and TensorFlow will create the model on your CPU.
Hope this explanation helps.

Importing images to prep for keras

I am trying to import a bunch of images and get them ready for Keras. The goal is to end up with an array of dimensions (length, 160, 320, 3). As you can see, my reshape call is commented out. The print(images.shape) line returns (8037,), and I'm not sure how to proceed to get the right array dimensions. For reference, the first column in the CSV file is a list of paths to the images; the code below joins the folder path with each image's path inside the folder.
When I run the commented-out reshape I get the following error: "ValueError: cannot reshape array of size 8037 into shape (8037,160,320,3)"
import csv
import cv2
import numpy as np

f = open('/Users/username/Desktop/data/driving_log.csv')
csv_f = csv.reader(f)
m = []
for row in csv_f:
    m.append(row)
images = []
for i in range(len(m)):
    img = m[i][1]
    img = img.lstrip()
    path = '/Users/username/Desktop/data/'
    img = path + img
    image = cv2.imread(img)
    images.append(image)
item_num = len(images)
images = np.array(images)
# images = np.array(images).reshape(item_num, 160, 320, 3)
print(images.shape)  # returns (8037,)
Can you print the shape of an image before it is appended to images, to verify it is what you expect? Even better would be adding an imshow in the loop to make sure you're loading the images you expect (you only need to do this for one or two). cv2.imread does not throw an error if there isn't an image at the file path you give it, so your array might be all None, which would yield exactly the behavior you've described.
If that is the problem, check the img variable and make sure it's pointing exactly where you want it to.
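A quick way to apply that advice is to guard the load loop; this sketch reuses the variable names from the question:

image = cv2.imread(img)
if image is None:
    # cv2.imread silently returns None on a bad path, so fail loudly instead
    raise FileNotFoundError('could not read image: ' + img)
print(image.shape)  # e.g. (160, 320, 3); uniform shapes let np.array stack properly
images.append(image)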
Turns out it was including the first line of the CSV file, which was the heading. After I sorted that out, it ran great and gave me the requested shape.
images = []
for i in range(1, len(m)):  # start at 1 to skip the CSV header row
    img = m[i][1]
    img = img.lstrip()
    path = '/Users/user/Desktop/data/'
    img = path + img
    image = cv2.imread(img)
    images.append(image)
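An equivalent way to skip the header row is to advance the reader once before collecting rows; a small sketch with the same hypothetical CSV path:

import csv

with open('/Users/user/Desktop/data/driving_log.csv') as f:
    csv_f = csv.reader(f)
    next(csv_f)  # consume the header row instead of offsetting the index later
    m = [row for row in csv_f]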

Caffe Multiple Input Images

I'm looking at implementing a Caffe CNN which accepts two input images and a label (later perhaps other data) and was wondering if anyone was aware of the correct syntax in the prototxt file for doing this? Is it simply an IMAGE_DATA layer with additional tops? Or should I use separate IMAGE_DATA layers for each?
Thanks,
James
Edit: I have been using the HDF5_DATA layer lately for this and it is definitely the way to go.
HDF5 is a key value store, where each key is a string, and each value is a multi-dimensional array. Thus, to use the HDF5_DATA layer, just add a new key for each top you want to use, and set the value for that key to store the image you want to use. Writing these HDF5 files from python is easy:
import h5py
import numpy as np

filelist = []
for i in range(100):
    image1 = get_some_image(i)
    image2 = get_another_image(i)
    filename = '/tmp/my_hdf5%d.h5' % i
    with h5py.File(filename, 'w') as f:
        # Caffe expects channels-first (C, H, W) arrays
        f['data1'] = np.transpose(image1, (2, 0, 1))
        f['data2'] = np.transpose(image2, (2, 0, 1))
    filelist.append(filename)
with open('/tmp/filelist.txt', 'w') as f:
    for filename in filelist:
        f.write(filename + '\n')
Then simply set the source of the HDF5_DATA param to be '/tmp/filelist.txt', and set the tops to be "data1" and "data2".
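For reference, the corresponding layer definition might look like the following. This is a sketch in the newer prototxt syntax (the HDF5_DATA name above comes from the older all-caps V1 format); the batch size is a placeholder:

layer {
  name: "pair_data"
  type: "HDF5Data"
  top: "data1"
  top: "data2"
  hdf5_data_param {
    source: "/tmp/filelist.txt"
    batch_size: 32
  }
}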
I'm leaving the original response below:
====================================================
There are two good ways of doing this. The easiest is probably to use two separate IMAGE_DATA layers, one with the first image and label, and a second with the second image. Caffe retrieves images from LMDB or LEVELDB, which are key value stores, and assuming you create your two databases with corresponding images having the same integer id key, Caffe will in fact load the images correctly, and you can proceed to construct your net with the data/labels of both layers.
The problem with this approach is that having two data layers is not really very satisfying, and it doesn't scale very well if you want to do more advanced things like having non-integer labels for things like bounding boxes, etc. If you're prepared to make a time investment in this, you can do a better job by modifying the tools/convert_imageset.cpp file to stack images or other data across channels. For example you could create a datum with 6 channels - the first 3 for your first image's RGB, and the second 3 for your second image's RGB. After reading this in using the IMAGE_DATA layer, you can split the stream into two images using a SLICE layer with a slice_point at index 3 along the slice_dim = 1 dimension. If further down the road, you decide that you want to load even more complex assortments of data, you'll understand the encoding scheme and can write your own decoding layer based off of src/caffe/layers/data_layer.cpp to gain full control of the pipeline.
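A sketch of what that slice could look like in the newer prototxt syntax, assuming a 6-channel "data" blob as described; the layer name is mine:

layer {
  name: "slice_pair"
  type: "Slice"
  bottom: "data"
  top: "image1"
  top: "image2"
  slice_param {
    axis: 1        # the channel dimension (slice_dim in the older format)
    slice_point: 3 # first 3 channels -> image1, remaining 3 -> image2
  }
}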
You may also consider using HDF5_DATA layer with multiple "top"s
