Caffe Multiple Input Images - image

I'm looking at implementing a Caffe CNN which accepts two input images and a label (later perhaps other data) and was wondering if anyone was aware of the correct syntax in the prototxt file for doing this? Is it simply an IMAGE_DATA layer with additional tops? Or should I use separate IMAGE_DATA layers for each?
Thanks,
James

Edit: I have been using the HDF5_DATA layer lately for this and it is definitely the way to go.
HDF5 is a key-value store, where each key is a string and each value is a multi-dimensional array. Thus, to use the HDF5_DATA layer, just add a new key for each top you want to use, and set the value for that key to store the image you want to use. Writing these HDF5 files from Python is easy:
import h5py
import numpy as np

filelist = []
for i in range(100):
    # get_some_image / get_another_image are placeholders for your own loaders,
    # each returning an array in height x width x channel order.
    image1 = get_some_image(i)
    image2 = get_another_image(i)
    filename = '/tmp/my_hdf5%d.h5' % i
    with h5py.File(filename, 'w') as f:
        # Move the channel axis to the front: (H, W, C) -> (C, H, W)
        f['data1'] = np.transpose(image1, (2, 0, 1))
        f['data2'] = np.transpose(image2, (2, 0, 1))
    filelist.append(filename)

# Write the list of HDF5 files, one path per line.
with open('/tmp/filelist.txt', 'w') as f:
    for filename in filelist:
        f.write(filename + '\n')
Then simply set the source of the HDF5_DATA param to be '/tmp/filelist.txt', and set the tops to be "data1" and "data2".
I'm leaving the original response below:
====================================================
There are two good ways of doing this. The easiest is probably to use two separate IMAGE_DATA layers, one with the first image and label, and a second with the second image. Caffe retrieves images from LMDB or LEVELDB, which are key value stores, and assuming you create your two databases with corresponding images having the same integer id key, Caffe will in fact load the images correctly, and you can proceed to construct your net with the data/labels of both layers.
The problem with this approach is that having two data layers is not really very satisfying, and it doesn't scale very well if you want to do more advanced things like having non-integer labels for things like bounding boxes, etc. If you're prepared to make a time investment in this, you can do a better job by modifying the tools/convert_imageset.cpp file to stack images or other data across channels. For example you could create a datum with 6 channels - the first 3 for your first image's RGB, and the second 3 for your second image's RGB. After reading this in using the IMAGE_DATA layer, you can split the stream into two images using a SLICE layer with a slice_point at index 3 along the slice_dim = 1 dimension. If further down the road, you decide that you want to load even more complex assortments of data, you'll understand the encoding scheme and can write your own decoding layer based off of src/caffe/layers/data_layer.cpp to gain full control of the pipeline.
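The channel-stacking idea can be tried outside Caffe first. What follows is only a NumPy sketch, not Caffe code: the two small images are made up for illustration, and the split at channel index 3 along axis 1 is meant to mirror what the SLICE layer is described as doing above.

```python
import numpy as np

# Two hypothetical RGB images in height x width x channel order.
height, width = 4, 5
image1 = np.arange(height * width * 3, dtype=np.uint8).reshape(height, width, 3)
image2 = 255 - image1

# Move channels first (C, H, W), then stack both images along the channel axis.
stacked = np.concatenate(
    [np.transpose(image1, (2, 0, 1)), np.transpose(image2, (2, 0, 1))], axis=0
)

# Add a batch axis to mimic an N x C x H x W blob.
blob = stacked[np.newaxis, ...]

# Split at channel index 3 along axis 1, mirroring the SLICE layer described above.
first, second = np.split(blob, [3], axis=1)

print(blob.shape, first.shape, second.shape)
```

If the two halves come back equal to the original images, the encoding and the slice point agree with each other, which is worth checking before touching convert_imageset.cpp.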

You may also consider using HDF5_DATA layer with multiple "top"s

Related

Setting the power spectral density from a file

How does one set the power spectral density (PSD) from file and is it possible to use a different PSD for generating the data and for likelihood evaluation?
Question asked by Vivien Raymond by email.
Setting the PSD from file
To set the PSD from a file, first initialise a list of interferometers, here we just use Hanford:
>>> ifos = bilby.gw.detector.InterferometerList(['H1'])
Every element of the list is initialised with a default PSD using the advanced LIGO noise curve. To check this:
>>> ifos[0].power_spectral_density
PowerSpectralDensity(psd_file='/home/user1/miniconda3/lib/python3.6/site-packages/bilby-0.3.5-py3.6.egg/bilby/gw/noise_curves/aLIGO_ZERO_DET_high_P_psd.txt', asd_file='None')
Note, no data has yet been generated. To overwrite the PSD, simply create a new PowerSpectralDensity object and assign it (if you have multiple detectors, you'll need to do this for every element of the list):
ifos[0].power_spectral_density = bilby.gw.detector.PowerSpectralDensity(psd_file=PATH_TO_FILE)
Next, generate an instance of the strain data from the PSD:
ifos.set_strain_data_from_power_spectral_densities(
    sampling_frequency=4096, duration=4,
    start_time=-3)
You can check what the data looks like by doing
ifos[0].plot_data()
Note, you can also inject signals using the ifos.inject_signal method.
Using a different PSD for likelihood evaluation
Each ifo in the ifos list contains both the data and a PSD (or equivalent ASD). For inference, we pass that list into the bilby.gw.GravitationalWaveLikelihood object as the first argument and the PSD for each element of the list is used in calculating the likelihood.
So, if you want to use a different PSD for the likelihood evaluation, first generate the data (as above). Then assign the PSD you want to use for sampling to each element of ifos and pass that object into the likelihood. This won't overwrite the data (provided you don't call set_strain_data_from_power_spectral_densities again, of course).
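To see why the data and the PSD can be decoupled, here is a small NumPy-only sketch. It is not bilby code and does not reproduce bilby's implementation: the function names, the flat PSD values, and the normalisation (one-sided PSD, inner product 4 * df * Re(sum(conj(a) * b / psd))) are all assumptions chosen for illustration. The data is drawn once from one PSD and then weighted with a different one, without being regenerated.

```python
import numpy as np


def simulate_noise(psd, duration, rng):
    """Draw frequency-domain Gaussian noise whose variance follows a one-sided PSD."""
    sigma = np.sqrt(psd * duration / 4.0)  # std of the real and imaginary parts
    return sigma * (rng.standard_normal(psd.size) + 1j * rng.standard_normal(psd.size))


def noise_weighted_inner_product(a, b, psd, duration):
    """Compute 4 * df * Re(sum(conj(a) * b / psd)) with df = 1 / duration."""
    return 4.0 / duration * np.real(np.sum(np.conj(a) * b / psd))


duration = 4.0
frequencies = np.arange(20.0, 1024.0, 1.0 / duration)

# Hypothetical flat PSDs: one to generate the data, a different one for evaluation.
psd_for_data = np.full(frequencies.size, 1e-46)
psd_for_likelihood = 4.0 * psd_for_data

rng = np.random.default_rng(1)
data = simulate_noise(psd_for_data, duration, rng)

# Same data, two different weightings; the data itself is never regenerated.
matched = noise_weighted_inner_product(data, data, psd_for_data, duration)
mismatched = noise_weighted_inner_product(data, data, psd_for_likelihood, duration)
print(matched / frequencies.size, mismatched / frequencies.size)
```

With the matching PSD, the inner product per frequency bin should come out close to 2 under this convention; with the mismatched PSD it is scaled by the ratio of the two PSDs, which is the effect a different likelihood PSD has on the noise weighting.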

Does anybody know why the last value in input_shape must be 3 in Keras applications?

I want to use a pre-trained network such as VGG or ResNet. In Keras, input_shape must be given in the format (w, h, 3). If I want to set the number of channels to 1, is there a trick for doing so?
conv_vgg = keras.applications.VGG16(input_shape=(224,224,3))
I want to change the 3 to 1:
conv_vgg = keras.applications.VGG16(input_shape=(224,224,1))
Thanks in advance!
Pre-trained networks are trained on ImageNet or other image datasets. This means they were trained with RGB images, which is why using a pre-trained network requires three channels.
If you want to use a pre-trained network for single-channel images, you can repeat your channel three times and proceed: copy your 1-channel image two more times, going from shape (224,224,1) to shape (224,224,3), a 3-channel image.
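Repeating the channel can be done with NumPy before the image is fed to the network. This is a minimal sketch with a made-up random image standing in for real data; the variable names are only for illustration.

```python
import numpy as np

# Hypothetical single-channel image with shape (224, 224, 1).
gray = np.random.default_rng(0).random((224, 224, 1)).astype(np.float32)

# Copy the single channel three times along the last axis.
rgb_like = np.repeat(gray, 3, axis=-1)

print(gray.shape, rgb_like.shape)
```

All three channels then hold the same values, so the result has the shape the pre-trained network expects without adding any new information.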

Paraview rotate fields

I am using Paraview 5.0.1. If any solution requires updating, I can try.
I want to programmatically obtain field plots (and corresponding PlotOverLine) of displacements and stresses in rotated coordinate systems.
What are appropriate/convenient/possible ways of doing this?
So far, I have created one Calculator filter for each component of displacements and stresses.
For instance, I used Calculators in 2D with results
(displacement.iHat)*cos(0.7853981625)+(displacement.jHat)*sin(0.7853981625)
(stress_3-stress_0)*sin(45.0*3.14159265/180)*cos(45.0*3.14159265/180)+stress_1*((cos(45.0*3.14159265/180))^2-(sin(45.0*3.14159265/180))^2)
It works fine, but it is quite cumbersome, in several aspects:
Creating them (one filter per component).
Plotting several of them in a single XY plot
Exporting them (one export per component).
Is there a simple way to do this?
PS: The Transform filter does not accomplish this. It rotates the view, not the fields.
Two solutions:
Ugly, inefficient solution:
Use Transform and check "Transform All Input vectors".
Add a Calculator and add a dummy array.
Use Transform the other way around (the inverse transformation), without checking "Transform All Input vectors".
Correct solution:
Compute the transformation yourself in a Programmable Filter:
import vtk

# Runs inside a ParaView Programmable Filter, where `self` is the filter object.
grid_in = self.GetUnstructuredGridInput()
grid_out = self.GetUnstructuredGridOutput()
grid_out.ShallowCopy(grid_in)

data = grid_in.GetPointData().GetArray("YourArray")

vec = vtk.vtkDoubleArray()
vec.SetNumberOfComponents(3)
vec.SetName("TransformedVectors")

numPoints = grid_in.GetNumberOfPoints()
for i in range(numPoints):
    components = data.GetTuple(i)
    # GetTuple returns an immutable tuple, so transform() must return the new
    # components. Implement transform() yourself in Python.
    vec.InsertNextTuple(transform(components))

grid_out.GetPointData().AddArray(vec)
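For the transform itself, here is one possible sketch in plain Python. It assumes a rotation of the coordinate axes by 45 degrees about z, as in the Calculator expressions from the question, and it assumes stress_0, stress_1 and stress_3 stand for the xx, xy and yy components (inferred from the shear formula above, not confirmed). The names THETA and rotate_stress_2d are made up for this example.

```python
import math

THETA = math.radians(45.0)  # assumed rotation angle of the axes about z


def transform(components, theta=THETA):
    """Return vector components expressed in axes rotated by theta about z."""
    x, y, z = components
    c, s = math.cos(theta), math.sin(theta)
    return (c * x + s * y, -s * x + c * y, z)


def rotate_stress_2d(sxx, sxy, syy, theta=THETA):
    """Return (sxx, sxy, syy) of a plane stress state in the rotated axes."""
    c, s = math.cos(theta), math.sin(theta)
    sxx_r = sxx * c * c + syy * s * s + 2.0 * sxy * s * c
    syy_r = sxx * s * s + syy * c * c - 2.0 * sxy * s * c
    sxy_r = (syy - sxx) * s * c + sxy * (c * c - s * s)
    return (sxx_r, sxy_r, syy_r)


print(transform((1.0, 0.0, 0.0)))
print(rotate_stress_2d(1.0, 0.0, 0.0))
```

The first component returned by transform matches the displacement expression from the question, and the shear term in rotate_stress_2d matches the stress expression, so all components come out of a single filter instead of one Calculator each.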

How to use DBSCAN algorithm for a list of points in python

I am new to image processing and python coding.
I have detected a number of features in an image and have their respective pixel locations placed in a list format.
My_list = [(x1,y1),(x2,y2),......,(xn,yn)]
I would like to use the DBSCAN algorithm to form clusters from these points.
I am currently importing the built-in DBSCAN class from sklearn.cluster.
If the current format of the points is not compatible, I would like to know which format is.
This is the error I currently get with this format:
C:\Python\python.exe "F:/opencv_files/dbscan.py"
Traceback (most recent call last):
  File "F:/opencv_files/dbscan.py", line 83, in <module>
    db = DBSCAN(eps=0.5, min_samples=5).fit(X) # metric=X)
  File "C:\Python\lib\site-packages\sklearn\cluster\dbscan_.py", line 282, in fit
    X = check_array(X, accept_sparse='csr')
  File "C:\Python\lib\site-packages\sklearn\utils\validation.py", line 441, in check_array
    "if it contains a single sample.".format(array))
ValueError: Expected 2D array, got 1D array instead:
array=[].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.
Your data is a list of tuples. Nothing in this structure prevents you from doing crazy things with it, such as having tuples of different lengths in there. Plus, this is a very slow and memory-inefficient way of keeping the data, because everything is boxed as a Python object.
Just call data = numpy.array(data) to convert your data into an efficient multidimensional numeric array. This array will then have a shape.
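A quick way to see what shape you are actually passing is sketched below, using made-up pixel locations. Note that the traceback prints array=[], which suggests the list may have been empty when fit was called; an empty list converts to a 1D array of length 0, and that alone would trigger this error. The helper name as_points is hypothetical.

```python
import numpy as np

# Hypothetical pixel locations standing in for the detected features.
my_list = [(10, 12), (11, 13), (50, 52), (51, 53)]

points = np.array(my_list)
print(points.shape, points.ndim)

# An empty list converts to a 1D array of length 0, which is not a valid 2D input.
empty = np.array([])
print(empty.shape, empty.ndim)


def as_points(pairs):
    """Convert (x, y) pairs to an (n_samples, 2) array, rejecting empty input."""
    arr = np.asarray(pairs, dtype=float)
    if arr.size == 0:
        raise ValueError("no points were detected, nothing to cluster")
    return arr.reshape(-1, 2)
```

Passing as_points(My_list) to fit gives the clustering step a 2D array with one row per point, and fails early with a clearer message if feature detection returned nothing.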

In Python 3, best way to open an image stored in a list as a file object?

Using Python 3.4 on Linux and Windows, I'm trying to create QR code images from a list of string objects. I don't want to just store the image as a file because the list of strings may change frequently. I want to then tile all the objects and display the resulting image on screen for the user to scan with a barcode scanner. For the user to know which code to scan, I need to add some text to the QR code image.
I can create the list of image objects correctly and they are in a list and calling .show on these objects displays them properly but I don't know how to treat these objects as a file object to open them. The object that is given to the open function, (img_list[0] in my case), in my add_text_to_img needs to support read, seek and tell methods. When I try this as is I get an attribute error. I've tried BytesIO and StringIO but I get an error message that Image.open does not support buffer interface. Maybe I am not doing that part correctly.
I'm sure there are several ways to do this, but what is the best way to open in memory objects as a file object?
from io import BytesIO
import qrcode
from PIL import ImageFont, ImageDraw, Image


def make_qr_image_list(code_list):
    """
    :param code_list: a list of string objects to encode into QR code image
    :return: a list of image or some type of other data objects
    """
    img_list = []
    for item in code_list:
        qr = qrcode.QRCode(
            version=None,
            error_correction=qrcode.ERROR_CORRECT_L,
            box_size=4,
            border=10
        )
        qr.add_data(item)
        qr_image = qr.make_image(fit=True)
        img_list.append(qr_image)
    return img_list


def add_text_to_img(text_list, img_list):
    """
    While I was working on this, I am only saving the first image. Once
    it's working, I'll save the rest of the images to a list.
    :param text_list: a list of strings to add to the corresponding image.
    :param img_list: the list containing the images already created from
    the text_list
    :return:
    """
    base = Image.open(img_list[0])
    # img = Image.frombytes(mode='P', size=(164,164), data=img_list[0])
    text_img = Image.new('RGBA', base.size, (255,255,255,0))
    font = ImageFont.truetype('sans-serif.ttf', 10)
    draw = ImageDraw.Draw(text_img)
    draw.text((0,-20),text_list[0], (0,0,255,128), font=font)
    # include some method to save the images after the text
    # has been added here. Shouldn't actually save to a file.
    # Should be saved to memory/img_list
    output = Image.alpha_composite(base,text_img)
    output.show()


if __name__ == '__main__':
    test_list = ['AlGaN','n-AlGaN','p-AlGaN','MQW','LED AlN-AlGaN']
    image_list = make_qr_image_list(test_list)
    add_text_to_img(test_list, image_list)
    im = image_list[0]
    im.save('/my_save_path/test_image.png')
    im.show()
Edit: I've been using python for about a year and I feel like this is a pretty common thing to do but I'm not even sure that I'm looking up/searching for the right terms. What topics would you search for to answer this? If anyone can post a link or two to what I need to read up on regarding this, that would be very appreciated.
You already have PIL image objects; qr.make_image() returns (a wrapper around) the right type of object, so you do not need to open them again.
As such, all you need to do is:
base = img_list[0]
and go from there.
You do need to match image modes when compositing; QR codes are black-and-white images (mode 1), so either convert that or use the same mode in your text_img image object. The Image.alpha_composite() operation does require that both images have an alpha channel. Converting the base is easy:
base = img_list[0].convert('RGBA')
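Put together, the in-memory flow could look roughly like the sketch below. To keep it independent of the qrcode package and of any font file, it uses a blank 1-bit image as a stand-in for the QR code and Pillow's default font; the text position, the colour and the labelled_images list are assumptions for illustration only.

```python
from PIL import Image, ImageDraw, ImageFont

# Stand-in for a QR code: a white 1-bit image (mode '1'), converted to RGBA.
base = Image.new('1', (164, 164), 1).convert('RGBA')

# Transparent overlay of the same size and mode to hold the text.
text_img = Image.new('RGBA', base.size, (255, 255, 255, 0))
draw = ImageDraw.Draw(text_img)
font = ImageFont.load_default()  # avoids depending on a font file being installed
draw.text((4, 4), 'AlGaN', fill=(0, 0, 255, 255), font=font)

# Both images are RGBA and the same size, so they can be composited in memory.
output = Image.alpha_composite(base, text_img)
labelled_images = [output]  # keep results in a list instead of writing files

print(output.mode, output.size)
```

Nothing here touches the disk; the composited images stay in the list and can be tiled or shown with .show() later.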

Resources