TensorFlow image loading gives OutOfRangeError, regardless of which session initializer is used

I copied a test script to load a directory of images into Tensorflow:
# Typical setup to include TensorFlow.
import tensorflow as tf
from sys import argv
# Make a queue of file names including all the JPEG images files in the relative
# image directory.
filename_queue = tf.train.string_input_producer(
    tf.train.match_filenames_once(argv[1] + "/*.jpg"))
# Read an entire image file which is required since they're JPEGs, if the images
# are too large they could be split in advance to smaller files or use the Fixed
# reader to split up the file.
image_reader = tf.WholeFileReader()
# Read a whole file from the queue, the first returned value in the tuple is the
# filename which we are ignoring.
_, image_file = image_reader.read(filename_queue)
# Decode the image as a JPEG file, this will turn it into a Tensor which we can
# then use in training.
image_orig = tf.image.decode_jpeg(image_file)
image = tf.image.resize_images(image_orig, [224, 224])
image.set_shape((224, 224, 3))
# Start a new session to show example output.
with tf.Session() as sess:
However, when I ran the script, I hit an odd error:
OutOfRangeError (see above for traceback): FIFOQueue '_1_input_producer' is closed and has insufficient elements (requested 1, current size 0)
And when I tried to look up a solution, I got several different answers:
tf.initialize_all_variables().run()
tf.local_variables_initializer().run()
sess.run(tf.local_variables_initializer())
sess.run(tf.global_variables_initializer())
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(coord=coord)
I have tried all of those options, and all of them have failed. The original script (https://gist.github.com/eerwitt/518b0c9564e500b4b50f) has barely 40 lines. What solution am I missing?
UPDATE
I'm now running this:
# Start a new session to show example output.
with tf.Session() as sess:
    # Required to get the filename matching to run.
    sess.run(tf.local_variables_initializer())
    sess.run(tf.global_variables_initializer())

    # Coordinate the loading of image files.
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)

    # Get an image tensor and print its value.
    image_tensor = sess.run([image])
    print(image_tensor)

    # Finish off the filename queue coordinator.
    coord.request_stop()
    coord.join(threads)
And the error still occurs.

You need to initialize both the local and the global variables. The reason is that match_filenames_once returns a local variable, which is not initialized by tf.global_variables_initializer() alone.
So, for your problem, adding:
with tf.Session() as sess:
    sess.run(tf.local_variables_initializer())
    sess.run(tf.global_variables_initializer())
    # actual code should go here
    coord.request_stop()
    coord.join(threads)
should solve the problem.
tf.initialize_all_variables() is the old way of initialization, and I think it used to initialize both global and local variables back when it was the standard call. Nowadays it is deprecated and only initializes the global variables. That is why sources written in the old style run without any problem, while the same code breaks down on newer TensorFlow versions.
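For completeness, here is a minimal sketch of how the whole session block could look with both initializers in place. It assumes the queue and image tensors defined in the question, and the ordering (initializers before start_queue_runners) is what keeps the filename queue from being closed while still empty:
with tf.Session() as sess:
    # match_filenames_once creates a local variable, so both initializers are needed.
    sess.run(tf.global_variables_initializer())
    sess.run(tf.local_variables_initializer())

    # Start the queue runners only after the initializers have run; otherwise the
    # filename queue never gets filled and the reader raises OutOfRangeError.
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)

    image_tensor = sess.run([image])
    print(image_tensor)

    coord.request_stop()
    coord.join(threads)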

Related

How can I use SaveData to write an ASCII file in ParaView 3.98.1?

I am writing an automation script for an old project and I need some help with pvpython from ParaView 3.98.1. The function SaveData() does not exist in this version. I found its implementation here and moved it into my code. How can I save a file as ASCII? Calling it like SaveData(filename, proxy=px, FileType='Ascii') saves my files as binaries (awkward behavior).
I need this version because some of my scripts in the pipeline handle very specific VTK files. Using the SaveData() function of the latest versions ended up creating different metadata in my final files, and when I process them it destroys my results. At the moment it is easier to use an older version of ParaView than to modify all my scripts.
Edit:
The website is not working now, but it was working yesterday. Maybe it is an internal problem? Anyway, the code is attached below.
# -----------------------------------------------------------------------------
def SetProperties(proxy=None, **params):
    """Sets one or more properties of the given pipeline object. If an argument
    is not provided, the active source is used. Pass a list of property_name=value
    pairs to this function to set property values. For example::
        SetProperties(Center=[1, 2, 3], Radius=3.5)
    """
    if not proxy:
        proxy = active_objects.source
    properties = proxy.ListProperties()
    for param in params.keys():
        pyproxy = servermanager._getPyProxy(proxy)
        pyproxy.__setattr__(param, params[param])
# -----------------------------------------------------------------------------
def CreateWriter(filename, proxy=None, **extraArgs):
    """Creates a writer that can write the data produced by the source proxy in
    the given file format (identified by the extension). If no source is
    provided, then the active source is used. This doesn't actually write the
    data, it simply creates the writer and returns it."""
    if not filename:
        raise RuntimeError("filename must be specified")
    session = servermanager.ActiveConnection.Session
    writer_factory = servermanager.vtkSMProxyManager.GetProxyManager().GetWriterFactory()
    if writer_factory.GetNumberOfRegisteredPrototypes() == 0:
        writer_factory.UpdateAvailableWriters()
    if not proxy:
        proxy = GetActiveSource()
        if not proxy:
            raise RuntimeError("Could not locate source to write")
    writer_proxy = writer_factory.CreateWriter(filename, proxy.SMProxy, proxy.Port)
    writer_proxy.UnRegister(None)
    pyproxy = servermanager._getPyProxy(writer_proxy)
    if pyproxy and extraArgs:
        SetProperties(pyproxy, **extraArgs)
    return pyproxy
# -----------------------------------------------------------------------------
def SaveData(filename, proxy=None, **extraArgs):
    """Save data produced by 'proxy' in a file. If no proxy is specified the
    active source is used. Properties to configure the writer can be passed in
    as keyword arguments. Example usage::
        SaveData("sample.pvtp", source0)
        SaveData("sample.csv", FieldAssociation="Points")
    """
    writer = CreateWriter(filename, proxy, **extraArgs)
    if not writer:
        raise RuntimeError("Could not create writer for specified file or data type")
    writer.UpdateVTKObjects()
    writer.UpdatePipeline()
    del writer
# -----------------------------------------------------------------------------
The question is answered here (also my post). I used SaveData() to save a binary file with the proxy I need, and then used DataSetWriter() to change the FileType to ASCII. It is not a beautiful solution, since SaveData() is supposed to do that itself, but it does the job.
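As an illustration of that two-step workaround, a minimal sketch with plain VTK that re-writes the binary legacy file as ASCII might look like the following; the file names are made up, and the reader/writer classes assume a legacy-format .vtk dataset, so adjust them to your data:
# Hypothetical file names; assumes a legacy-format .vtk dataset.
from vtk import vtkDataSetReader, vtkDataSetWriter

reader = vtkDataSetReader()
reader.SetFileName('output_binary.vtk')   # the file produced by SaveData()
reader.Update()

writer = vtkDataSetWriter()
writer.SetFileName('output_ascii.vtk')
writer.SetFileTypeToASCII()               # switch the legacy writer from binary to ASCII
writer.SetInputConnection(reader.GetOutputPort())
writer.Write()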

How to get progress bar with tqdm in a for loop over directory

I am trying to conditionally load some files from a directory. I would like to have a progress bar from tqdm on the process. I am currently running this:
import os
from tqdm import tqdm

loaddir = r'D:\Folder'

# loop the files in the directory
print('Data load initiated')
for subdir, dirs, files in os.walk(loaddir):
    for name in tqdm(files):
        if name.startswith('Test'):
            #do things
which gives
Data load initiated
0%| | 0/6723 [00:00<?, ?it/s]
0%| | 26/6723 [00:00<00:28, 238.51it/s]
1%| | 47/6723 [00:00<00:31, 213.62it/s]
1%| | 72/6723 [00:00<00:30, 220.84it/s]
1%|▏ | 91/6723 [00:00<00:31, 213.59it/s]
2%|▏ | 115/6723 [00:00<00:30, 213.73it/s]
This has two problems:
When progress is updated, a new line appears in my IPython console in Spyder.
I am actually timing the loop over all the files, not just the files that start with 'Test', so the progress and remaining time are not accurate.
However, if I try this:
loaddir = r'D:\Folder'

# loop the files in the directory
print('Data load initiated')
for subdir, dirs, files in os.walk(loaddir):
    for name in files:
        if tqdm(name.startswith('Test')):
            #do things
I get the following error.
Traceback (most recent call last):
  File "<ipython-input-80-b801165d4cdb>", line 21, in <module>
    if tqdm(name.startswith('Probe')):
TypeError: 'NoneType' object cannot be interpreted as an integer
I would like to have a progress bar in only one line that updates whenever the startswith loop is activated.
----UPDATE----
I also found out here that it can also be used like this:
files = [f for f in tqdm(files) if f.startswith('Test')]
This allows tracking progress in a list comprehension by wrapping the iterable with tqdm. However, in Spyder this results in a separate line for each progress update.
----UPDATE2----
It actually works fine in Spyder. Sometimes, if the loop fails, it goes back to printing one line per progress update, but I haven't seen this very often after the latest updates.
Firstly, the answer:
loaddir = r'D:\surfdrive\COMSOL files\Batch folder\Current batch simulation files'

# loop the files in the directory
print('Data load initiated')
for subdir, dirs, files in os.walk(loaddir):
    files = [f for f in files if f.startswith('Test')]
    for name in tqdm(files):
        #do things
This will work in any decent environment (including a bare terminal). The solution is to not give tqdm the unused filenames. You may find https://github.com/tqdm/tqdm/wiki/How-to-make-a-great-Progress-Bar insightful.
Secondly, the issue with multiple lines of output is well known and is due to some environments being broken (https://github.com/tqdm/tqdm#faq-and-known-issues) in that they do not support carriage return (\r).
The correct links for this problem in Spyder are https://github.com/tqdm/tqdm/issues/512 and https://github.com/spyder-ide/spyder/issues/6172
(Spyder maintainer here) This is a known limitation of tqdm progress bars in Spyder. I'd recommend you open an issue about it in its GitHub repository.
Specify position=0 and leave=True like this:
for i in tqdm(range(10), position=0, leave=True):
    # Some code
Or in a list comprehension:
nums = [i for i in tqdm(range(10), position=0, leave=True)]
It's worth mentioning that you can make `position=0` and `leave=True` the default settings, so you won't need to specify them each time, like this:
from tqdm import tqdm
from functools import partial

tqdm = partial(tqdm, position=0, leave=True)  # this line does the magic

# for loop
for i in tqdm(range(10)):
    # Some code

# list comprehension
nums = [i for i in tqdm(range(10))]

VSC: temporarily turn off YAML linting

I'm trying to find a way to turn off the red lines temporarily, for that file only.
Maybe try to disable the yaml.schemaStore? Go into settings.json and add:
"yaml.schemaStore.enable": false
Since this is not valid YAML at all, but you want to edit it as YAML, you should make it into valid YAML. If you just turn off the errors, you would probably lose most of the advantages of the YAML editing mode.
If saltstate allows you to change the block_start_string and variable_start_string that jinja2 uses, you can change {% into #% (or ###%, if #% naturally occurs in your source), and also change {{ into <{ (or <<{, you get the idea). If you were calling jinja2 directly, you would then pass block_start_string='#%' and variable_start_string='<{' to the Environment built around the FileSystemLoader. If the above is possible, you only have to change your input file once; do that with an editor.
If you cannot get saltstate to do the sane thing, you are still not stuck, but you have to do a bit more work using Python, ruamel.yaml and some support packages (disclaimer: I am the author of those packages).
Install with:
pip install ruamel.yaml[jinja2] ruamel.std.pathlib
Then, before editing, run this program:
from ruamel.yaml import YAML
from ruamel.std.pathlib import Path

yamlj2 = YAML(typ='jinja2')
yamlrt = YAML()
yaml_flow_style = YAML()
yaml_flow_style.default_flow_style = True

in_file = Path('init.sls')
backup_file = Path('init.sls.org')
in_file.copy(backup_file)

data = yamlj2.load(in_file)
with in_file.open('w') as fp:
    # write the header with the info needed for the reverse step
    fp.write('# ruamel.yaml.jinja2: ')  # no EOL
    yaml_flow_style.dump(yamlj2._plug_in_jinja2, fp)
    yamlrt.dump(data, fp)
which changes the offending jinja2 sequences and adds a one-line header comment to the file with the actual patterns used. You should then be able to edit the init.sls file without getting all those errors.
Before calling saltstate, run the following:
from ruamel.yaml import YAML
from ruamel.std.pathlib import Path

in_file = Path('init.sls')
yamlj2 = YAML(typ='jinja2')
yamlrt = YAML()
yamlnort = YAML(typ='safe')

with in_file.open() as fp:
    yamlj2._plug_in_jinja2 = yamlnort.load(fp.readline().split(':', 1)[1])
    data = yamlrt.load(fp)
yamlj2.dump(data, in_file)
If you have multiple of these files, you probably want to take your filename from sys.argv[1]. You might actually call the saltstate program from this second Python program (i.e. decode and run).
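A minimal sketch of that idea, restoring each file named on the command line and then handing off to salt; the script name and the salt-call command line here are only placeholders, so substitute whatever invocation you actually use:
# restore_and_run.py (hypothetical name): decode the edited files, then run salt.
import subprocess
import sys
from ruamel.yaml import YAML
from ruamel.std.pathlib import Path

yamlj2 = YAML(typ='jinja2')
yamlrt = YAML()
yamlnort = YAML(typ='safe')

for name in sys.argv[1:] or ['init.sls']:
    in_file = Path(name)
    with in_file.open() as fp:
        # the first line holds the substitution patterns written by the edit-prep script
        yamlj2._plug_in_jinja2 = yamlnort.load(fp.readline().split(':', 1)[1])
        data = yamlrt.load(fp)
    yamlj2.dump(data, in_file)

# placeholder; replace with the actual salt command you use
subprocess.run(['salt-call', '--local', 'state.apply'], check=False)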

How to set skipping of uninteresting functions while stepping from gdbinit script?

I'm trying to set up a set of functions that gdb should skip when stepping, using commands like:
skip myfunction
But if I place them in ~/.gdbinit instead of typing them at the gdb prompt in the terminal, I get the error:
No function found named myfunction.
Ignore function pending future shared library load? (y or [n]) [answered N; input not from terminal]
So I need GDB to get a 'y' answer. I've tried what was suggested for breakpoints, as well as the set confirm off suggested in a comment to this question, but these don't help with the skip command.
How can I set skip in a .gdbinit script, answering 'y' about the future library load?
You can use Python to wait for the execution to start, which is equivalent to pending on:
import gdb

to_skip = []

def try_pending_skips(evt=None):
    for skip in list(to_skip):  # make a copy for safe remove
        try:
            # test if the function (aka symbol) is defined
            symb, _ = gdb.lookup_symbol(skip)
            if not symb:
                continue
        except gdb.error:
            # no frame ?
            continue
        # yes, we can skip it
        gdb.execute("skip {}".format(skip))
        to_skip.remove(skip)

    if not to_skip:
        # no more functions to skip
        try:
            gdb.events.new_objfile.disconnect(try_pending_skips)  # event fired when the binary is loaded
        except ValueError:
            pass  # was not connected

class cmd_pending_skip(gdb.Command):
    self = None

    def __init__(self):
        gdb.Command.__init__(self, "pending_skip", gdb.COMMAND_OBSCURE)

    def invoke(self, args, from_tty):
        global to_skip

        if not args:
            if not to_skip:
                print("No pending skip.")
            else:
                print("Pending skips:")
                for skip in to_skip:
                    print("\t{}".format(skip))
            return

        new_skips = args.split()
        to_skip += new_skips

        for skip in new_skips:
            print("Pending skip for function '{}' registered.".format(skip))

        try:
            gdb.events.new_objfile.disconnect(try_pending_skips)
        except ValueError:
            pass  # was not connected

        # new_objfile event fired when the binary and libraries are loaded in memory
        gdb.events.new_objfile.connect(try_pending_skips)

        # try right away, just in case
        try_pending_skips()

cmd_pending_skip()
Save this code into a Python file pending_skip.py (or surround it with python ... end in your .gdbinit), then:
source pending_skip.py
pending_skip fct1
pending_skip fct2 fct3
pending_skip # to list pending skips
Documentation references:
GDB Python TOC
Basic Python
Events in Python
Symbols in Python
This feature has been proposed here:
https://sourceware.org/ml/gdb-prs/2015-q2/msg00417.html
https://sourceware.org/bugzilla/show_bug.cgi?id=18531
So far, there's been no activity on that issue for 6 months though. As of writing this, the feature is not included in GDB 7.10.

Python multiprocessing stdin input

All code was written and tested on Python 3.4, Windows 7.
I was designing a console app and needed to use stdin from the command line (Windows) to issue commands and change the operating mode of the program. The program depends on multiprocessing to spread CPU-bound loads across multiple processors.
I am using stdout to monitor the status and some basic return information, and stdin to issue commands to load different sub-processes based on the returned console information.
This is where I found a problem. I could not get the multiprocessing module to accept stdin input, but stdout was working just fine. Then I found the following help on Stack Overflow, so I tested it and found that with the threading module this all works great, except that all output to stdout is paused each time until stdin is cycled, due to the GIL lock while stdin is blocking.
I will say I have been successful with a workaround implemented with msvcrt.kbhit(). However, I can't help but wonder whether there is some sort of bug in multiprocessing that keeps stdin from reading any data. I tried numerous ways and nothing worked with multiprocessing. I even attempted to use Queues, but I did not try Pools or any other methods from multiprocessing.
I also did not try this on my Linux machine, since I was focused on getting it to work on Windows.
Here is simplified test code that does not function as intended (reminder: this was written in Python 3.4 on Windows 7):
import sys
import time
from multiprocessing import Process

def function1():
    while True:
        print("Function 1")
        time.sleep(1.33)

def function2():
    while True:
        print("Function 2")
        c = sys.stdin.read(1)  # Does not appear to be waiting for read before continuing loop.
        sys.stdout.write(c)    # nothing in 'c'
        sys.stdout.write(".")  # checking to see if it works at all.
        print(str(c))          # trying something else, still nothing in 'c'
        time.sleep(1.66)

if __name__ == "__main__":
    p1 = Process(target=function1)
    p2 = Process(target=function2)
    p1.start()
    p2.start()
Hopefully someone can shed light on whether this is intended functionality, whether I didn't implement it correctly, or can offer some other useful bit of information.
Thanks.
When you take a look at Python's implementation of multiprocessing.Process._bootstrap(), you will see this:
if sys.stdin is not None:
    try:
        sys.stdin.close()
        sys.stdin = open(os.devnull)
    except (OSError, ValueError):
        pass
You can also confirm this by using:
>>> import sys
>>> import multiprocessing
>>> def func():
...     print(sys.stdin)
...
>>> p = multiprocessing.Process(target=func)
>>> p.start()
>>> <_io.TextIOWrapper name='/dev/null' mode='r' encoding='UTF-8'>
And reading from os.devnull immediately returns empty result:
>>> import os
>>> f = open(os.devnull)
>>> f.read(1)
''
You can work around this by using open(0):
file is either a string or bytes object giving the pathname (absolute or relative to the current working directory) of the file to be opened or an integer file descriptor of the file to be wrapped. (If a file descriptor is given, it is closed when the returned I/O object is closed, unless closefd is set to False.)
And "0 file descriptor":
File descriptors are small integers corresponding to a file that has been opened by the current process. For example, standard input is usually file descriptor 0, standard output is 1, and standard error is 2:
>>> def func():
...     sys.stdin = open(0)
...     print(sys.stdin)
...     c = sys.stdin.read(1)
...     print('Got', c)
...
>>> multiprocessing.Process(target=func).start()
>>> <_io.TextIOWrapper name=0 mode='r' encoding='UTF-8'>
Got a
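Putting that together with the question's script, a minimal sketch of function2 re-opening stdin could look like this; note that on Windows a spawned child may still not inherit a usable file descriptor 0, so treat this as a sketch rather than a guaranteed fix:
import sys
import time
from multiprocessing import Process

def function2():
    sys.stdin = open(0)        # re-attach the child process to file descriptor 0
    while True:
        print("Function 2")
        c = sys.stdin.read(1)  # now blocks until a character is available
        sys.stdout.write("Got: " + c + "\n")
        time.sleep(1.66)

if __name__ == "__main__":
    p2 = Process(target=function2)
    p2.start()
    p2.join()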
