Tensorflow’s flow_from_directory returns erroneous Set - image

I am using ImageDataGenerator to create validation set ( for an image classification using TF (Keras) from a labeled directory of images. The directories are 0,1,2,3,4 corresponding to class of each image, they contain 488, 185, 130, 131, 91 images respectively.
train_image_generator = ImageDataGenerator(rescale=1./255) # Generator for our training data
# validation_image_generator = ImageDataGenerator(rescale=1./255) # Generator for our validation data
train_data_gen = train_image_generator.flow_from_directory(batch_size=batch_size,
directory=train_dir,
shuffle=True,
target_size=(IMG_HEIGHT, IMG_WIDTH),
class_mode='categorical')
returns
Found 0 images belonging to 5 classes.
and the code below
validation_data_generator = train_image_generator.flow_from_directory(
train_dir, # same directory as training data
target_size=(IMG_HEIGHT, IMG_WIDTH),
batch_size=batch_size,
class_mode='categorical',
subset='validation') # set as validation data
outputs:
Found 0 images belonging to 5 classes.
What is wrong please? I suppose the validation set has at least few images for last class!
Any help would be greatly appreciated.
CS

It's hard to say without seeing what your directory string is, but I suspect perhaps the problem could be with characters in your "train_dir" variable. You can try using os.path.exists(train_dir) to see if your code is actually pointing to the right spot, or use os.listdir to see if you're seeing the files that way.
Throwing this line in there before you create your data generators might solve it:
import os
train_dir=os.path.normpath(train_dir)
Usually I find the problem is with the slashes in a path. If you just want to use a string, you usually need to use double backslashes (\) or a single forward slash (/) instead of the single backslashes you usually see in a path. This is especially problematic when your filenames or sub-directories start with a number (as yours do).

Related

Bullet list style for single parameter functions in RTD theme using autodoc and Sphinx?

I've noticed that when I use autodoc with the ReadTheDoc theme, if I have multiple arguments in my functions they are listed in a bullet list style:
arg1
arg2
...
but if there is only 1 argument then it is not using the bullet list style which is a bit silly to me since it breaks the continuity of the design.
I've found how to remove the disc via CSS to make things more uniform but I actually want to do the opposite and have the disk for the single argument functions.
At this point, I'm not sure it is a CSS change and I do not know how to do that.
I've also noticed the same thing in different docs.
Here is the rendered html:
Here are the 2 methods:
def add_attribute(self, name, index):
"""
:param name: The name attached to the attribute.
:param index: The position of the attribute within the list of attributes. """
print("")
def delete_attribute(self, name):
"""
:param name: The name of the attribute to delete."""
print("")
Here is the my .rst:
API
----------------
.. automodule:: my_module
:members:
Here is the conf.py
extensions = [
'sphinx_rtd_theme',
'sphinx.ext.autodoc',
'sphinx.ext.napoleon',
'sphinx.ext.coverage',
'sphinx.ext.autosummary',
]
templates_path = ['_templates']
language = 'python'
exclude_patterns = []
html_theme = "sphinx_rtd_theme"
html_static_path = ['_static']
autosummary_generate = True
Any idea?
Cheers!
After a lot of digging, I've found a partial workaround for this.
My solution involves manually editing the produced HTML files to insert the missing bullet points.
Required conf.py changes:
# Register hook to run when build is complete
def setup(app):
app.connect('build-finished', on_build_finished)
# Hook implementation
def on_build_finished(app, exception):
add_single_param_bullets("_build/html/index.html")
# Function to actually add the bullet points by overwriting the given HTML file
def add_single_param_bullets(file_path):
print('Add single parameter bullets in {:s}'.format(file_path))
if not os.path.exists(file_path):
print(' File not found, skipping...')
return
lines_enc = []
with open(file_path, 'rb') as f:
for l in f.readlines():
# Check for html that indicates single parameter function
if b'<dd class="field-odd"><p><strong>' in l:
# Work out the encoding if not defined
enc = None
if enc is None:
import chardet
enc = chardet.detect(l)['encoding']
# Decode html and get the parameter information that needs adding
l_dec = l.decode(enc)
l_insert = l_dec.replace('<dd class="field-odd">', '').replace('\r\n', '')
# Add new encoded lines to output
lines_enc.append('<dd class="field-odd"><ul class="simple">'.encode('utf=8'))
lines_enc.append('<li>{:s}</li>'.format(l_insert).encode(enc))
lines_enc.append('</ul>'.encode('utf=8'))
else:
lines_enc.append(l)
# Overwrite the original file with the new changes
with open(file_path, 'wb') as f:
for l in lines_enc:
f.write(l)
In my case, I only have single argument functions in index.html. However, you can register additional files in on_build_finished.
A few things to note:
This only edits the produced HTML files, and doesn't actually solve the underlying problem. I dug through the source for a bit but couldn't find why the bullet points aren't added for single parameter function.
The problem is not just for the RTD theme. It seems to occur with the basic theme as well. So I suspect it's a deeper problem with Sphinx rather than the RTD theme.
The code above somewhat deals with different encodings in the original HTML.
This does not work on the RTD website. As the HTML files are edited in place, and the RTD build outputs the HTML files to a different directory, this solution doesn't seem to work on the RTD website. This is quite annoying. A solution would be to somehow change the RTD build process, or tell RTD to use pre-built HTML sources rather than building its own, but I don't know how to do so.
After spending a few hours working all this out, I actually think it looks better without the bullet points...

In Pylint, how do I disable "Exactly one space after comma" for multidimensional array indices?

I like having PyLint check that commas are generally followed by spaces, except in one case: multidimensional indices. For example, I get the following warning from Pylint:
C: 31, 0: Exactly one space required after comma
num_features = len(X_train[0,:])
^ (bad-whitespace)
Is there a way to get rid of the warnings requiring spaces after commas for the case multidimensional arrays, but keep the space-checking logic the same for all other comma uses?
Thanks!
I am sure you figured this out by now but for anyone, like me, who happened upon this looking for an answer...
use # pylint: disable=C0326 on the line that is guilty of this. for instance:
num_features = len(X_train[0,:]) #pylint: disable=C0326
This applies to multiple kinds of space errors. See pylint wiki
You'll almost certainly want to disable this via the .pylintrc file for larger situations.
Example, say I have:
x111 = thing.abc(asdf)
x112_b = thing1.abc(asdf)
x112_b224 = thing.abc(asdf)
x112_f = thing1.abc(asdf)
... lots more
Now, presume I want to visually see the situation:
x111 = thing.abc(asdf)
x112_b = thing1.abc(asdf)
... lots more
so I add the following line to .pylintrc
disable=C0326,C0115,C0116
(note only the first one, c0326, counts, but I'm leaving two other docstring ones there so you can see you just add err messages you want to ignore.)

Importing images to prep for keras

I am trying to import a bunch of images and get them ready for keras. The goal here is to have an array of the following dimensions. (length, 160,329,3). As you can see my reshape function is commented out. The "print(images.shape) line returns (8037,). Not sure how to proceed to get the right array dimensions. For reference the 1st column in the csv file is a list of paths to the image in question. I have a function below that combines the path of the image inside the folder and the path to the folder.
When I run the commented out reshape function I get the following error. "ValueError: cannot reshape array of size 8037 into shape (8037,160,320,3)"
import csv
import cv2
f = open('/Users/username/Desktop/data/driving_log.csv')
csv_f = csv.reader(f)
m=[]
for row in csv_f:
n=(row)
m.append(n)
images=[]
for i in range(len(m)):
img=(m[i][1])
img=img.lstrip()
path='/Users/username/Desktop/data/'
img=path+img
image=cv2.imread(img)
images.append(image)
item_num = len(images)
images=np.array(images)
#images=np.array(images).reshape(item_num, 160, 320, 3)
print(images.shape) #returns (8037,)
Can you print the shape of an image before it is appended to images to verify it is what you expect? Even better would be adding an imshow in the loop to make sure you're loading the images you expect (only need to do for one or two). cv2.imread does not throw an error if there isn't an image at the file path you give it, so your array might be all None which would yield the exact behavior you've described.
If that is the problem, check the img variable and make sure it's pointing exactly where you want it to.
Turns out it was including the first line of the CSV file which was heading. After I sorted that out it ran great. It gave me the requested shape.
images=[]
for i in range(1,len(labels)):
img=(m[i][1])
img=img.lstrip()
path='/Users/user/Desktop/data/'
img=path+img
image=cv2.imread(img)
images.append(image)

SPSS syntax for naming individual analyses in output file outline

I have created syntax in SPSS that gives me 90 separate iterations of general linear model, each with slightly different variations fixed factors and covariates. In the output file, they are all just named as "General Linear Model." I have to then manually rename each analysis in the output, and I want to find syntax that will add a more specific name to each result that will help me identify it out of the other 89 results (e.g. "General Linear Model - Males Only: Mean by Gender w/ Weight covariate").
This is an example of one analysis from the syntax:
USE ALL.
COMPUTE filter_$=(Muscle = "BICEPS" & Subj = "S1" & SMU = 1 ).
VARIABLE LABELS filter_$ 'Muscle = "BICEPS" & Subj = "S1" & SMU = 1 (FILTER)'.
VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'.
FORMATS filter_$ (f1.0). FILTER BY filter_$.
EXECUTE.
GLM Frequency_Wk6 Frequency_Wk9
Frequency_Wk12 Frequency_Wk16
Frequency_Wk20
/WSFACTOR=Time 5 Polynomial
/METHOD=SSTYPE(3)
/PLOT=PROFILE(Time)
/EMMEANS=TABLES(Time)
/CRITERIA=ALPHA(.05)
/WSDESIGN=Time.
I am looking for syntax to add to this that will name this analysis as: "S1, SMU1 BICEPS, GLM" Not to name the whole output file, but each analysis within the output so I don't have to do it one-by-one. I have over 200 iterations at times that come out in a single output file, and renaming them individually within the output file is taking too much time.
Making an assumption that you are exporting the models to Excel (please clarify otherwise).
There is an undocumented command (OUTPUT COMMENT TEXT) that you can utilize here, though there is also a custom extension TEXT also designed to achieve the same but that would need to be explicitly downloaded via:
Utilities-->Extension Bundles-->Download And Install Extension Bundles--->TEXT
You can use OUTPUT COMMENT TEXT to assign a title/descriptive text just before the output of the GLM model (in the example below I have used FREQUENCIES as an example).
get file="C:\Program Files\IBM\SPSS\Statistics\23\Samples\English\Employee data.sav".
oms /select all /if commands=['output comment' 'frequencies'] subtypes=['comment' 'frequencies']
/destination format=xlsx outfile='C:\Temp\ExportOutput.xlsx' /tag='ExportOutput'.
output comment text="##Model##: This is a long/descriptive title to help me identify the next model that is to be run - jobcat".
freq jobcat.
output comment text="##Model##: This is a long/descriptive title to help me identify the next model that is to be run - gender".
freq gender.
output comment text="##Model##: This is a long/descriptive title to help me identify the next model that is to be run - minority".
freq minority.
omsend tag=['ExportOutput'].
You could use TITLE command here also but it is limited to only 60 characters.
You would have to change the OMS tags appropriately if using TITLE or TEXT.
Edit:
Given the OP wants to actually add a title to the left hand pane in the output viewer, a solution for this is as follows (credit to Albert-Jan Roskam for the Python code):
First save the python file "editTitles.py" to a valid Python search path (for example (for me anyway): "C:\ProgramData\IBM\SPSS\Statistics\23\extensions")
#editTitles.py
import tempfile, os, sys
import SpssClient
def _titleToPane():
"""See titleToPane(). This function does the actual job"""
outputDoc = SpssClient.GetDesignatedOutputDoc()
outputItemList = outputDoc.GetOutputItems()
textFormat = SpssClient.DocExportFormat.SpssFormatText
filename = tempfile.mktemp() + ".txt"
for index in range(outputItemList.Size()):
outputItem = outputItemList.GetItemAt(index)
if outputItem.GetDescription() == u"Page Title":
outputItem.ExportToDocument(filename, textFormat)
with open(filename) as f:
outputItem.SetDescription(f.read().rstrip())
os.remove(filename)
return outputDoc
def titleToPane(spv=None):
"""Copy the contents of the TITLE command of the designated output document
to the left output viewer pane"""
try:
outputDoc = None
SpssClient.StartClient()
if spv:
SpssClient.OpenOutputDoc(spv)
outputDoc = _titleToPane()
if spv and outputDoc:
outputDoc.SaveAs(spv)
except:
print "Error filling TITLE in Output Viewer [%s]" % sys.exc_info()[1]
finally:
SpssClient.StopClient()
Re-start SPSS Statistics and run below as a test:
get file="C:\Program Files\IBM\SPSS\Statistics\23\Samples\English\Employee data.sav".
title="##Model##: jobcat".
freq jobcat.
title="##Model##: gender".
freq gender.
title="##Model##: minority".
freq minority.
begin program.
import editTitles
editTitles.titleToPane()
end program.
The TITLE command will initially add a title to main output viewer (right hand side) but then the python code will transfer that text to the left hand pane output tree structure. As mentioned already, note TITLE is capped to 60 characters only, a warning will be triggered to highlight this also.
This editTitles.py approach is the closest you are going to get to include a descriptive title to identify each model. To replace the actual title "General Linear Model." with a custom title would require scripting knowledge and would involve a lot more code. This is a simpler alternative approach. Python integration required for this to work.
Also consider using:
SPLIT FILE SEPARATE BY <list of filter variables>.
This will automatically produce filter labels in the left hand pane.
This is easy to use for mutually exclusive filters but even if you have overlapping filters you can re-run multiple times (and have filters applied to get as close to your desired set of results).
For example:
get file="C:\Program Files\IBM\SPSS\Statistics\23\Samples\English\Employee data.sav".
sort cases by jobcat minority.
split file separate by jobcat minority.
freq educ.
split file off.

Save converted image into specific path in matlab?

Can someone explain to me how to save our converted image (from *.ppm into *.pgm) in specific directory in matlab..
Here is my code.
pathName = 'D:\Matlab\Training\PGM_Files';
% Create it if it doesn't exist.
if ~exist(pathName, 'dir')
mkdir(pathName);
% This will create to the full depth of the path.
% Upper folder levels don't have to exist yet.
end
fullFileName = fullfile(pathName);
pgm_File_Images(loop1,1)=imwrite((Train_Images,['data',num2str(loop1),'.pgm']),
fullFileName);
I want to save all my file into folder PGM_Files.
But everytime I got error
Expression or statement is incorrect--possibly unbalanced (, {, or [.
How to fix it??
The answer was already given in the comments, I will thus post it here to make sure the question does not remain unanswered:
You don't provide the right kind of input arguments to imwrite. That's
why the last ) claims to be unmatched.

Resources