Regarding using the pre-trained im2txt model

I have followed every step from here https://edouardfouche.com/Fun-with-Tensorflow-im2txt/
but I get the following error:
NotFoundError (see above for traceback): Tensor name "lstm/basic_lstm_cell/bias" not found in checkpoint files /home/asadmahmood72/Image_to_text/models/im2txt/model.ckpt-3000000
[[Node: save/RestoreV2_380 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_save/Const_0_0, save/RestoreV2_380/tensor_names, save/RestoreV2_380/shape_and_slices)]]
My OS is Ubuntu 16.04.
My TensorFlow version is 1.2.0.

This is a bit late, but hopefully this answer will help future people who encounter this problem.
Like Edouard mentioned, this error is caused by a change in the TensorFlow API. If you want to use a more recent version of TensorFlow, there are a few ways I know of to "update" your checkpoint:
Use the official checkpoint_convert.py utility included in TensorFlow (see the sketch after the code below), or
Use this solution written by 0xDFDFDF on GitHub to rename the offending variables:
import tensorflow as tf

OLD_CHECKPOINT_FILE = "model.ckpt-1000000"
NEW_CHECKPOINT_FILE = "model2.ckpt-1000000"

# Map the old variable names to the names expected by newer TensorFlow versions.
vars_to_rename = {
    "lstm/basic_lstm_cell/weights": "lstm/basic_lstm_cell/kernel",
    "lstm/basic_lstm_cell/biases": "lstm/basic_lstm_cell/bias",
}

new_checkpoint_vars = {}
reader = tf.train.NewCheckpointReader(OLD_CHECKPOINT_FILE)
for old_name in reader.get_variable_to_shape_map():
    if old_name in vars_to_rename:
        new_name = vars_to_rename[old_name]
    else:
        new_name = old_name
    # Re-create each variable under its (possibly renamed) name.
    new_checkpoint_vars[new_name] = tf.Variable(reader.get_tensor(old_name))

init = tf.global_variables_initializer()
saver = tf.train.Saver(new_checkpoint_vars)

with tf.Session() as sess:
    sess.run(init)
    saver.save(sess, NEW_CHECKPOINT_FILE)
I used option #2, and loading my checkpoint worked perfectly after that.
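For reference, option #1 is a command-line script that ships with TensorFlow 1.x (it lives under tensorflow/contrib/rnn/python/tools/ in the source tree). This is only a rough sketch, assuming the positional arguments are the old and new checkpoint paths; check the script's --help output for your version first:
python checkpoint_convert.py model.ckpt-1000000 model2.ckpt-1000000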

It looks like the TensorFlow API changed again, which makes it incompatible with the checkpoint model. I was using TensorFlow 0.12.1 in the article. Can you try whether it works with TensorFlow 0.12.1? Otherwise you will have to train the model yourself (expensive) or find a checkpoint file that was generated with a more recent version of TensorFlow...

Related

Unable to fix "RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False." MAC M1 MPS GPU

My Mac has the latest MPS GPU. I'm trying to reproduce some results from a colleague who read a pickle file using a local NVIDIA GPU.
First, I ensured that my MPS GPU is being used:
import torch
device = torch.device("mps") if torch.backends.mps.is_available() else "cpu"
Then I ran the following code:
import pickle

with open('patient_notes_agg.pickle', 'rb') as df:
    patient_notes_agg = pickle.load(df)

patient_notes_agg.head()
I received the following error:
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
So I tried to follow the suggestion in the message and use the CPU instead of the MPS GPU, with the code below:
model = torch.load('patient_notes_agg.pickle', map_location=torch.device('cpu'))
Unfortunately, I still received the same error.
To give more details:
macOS Monterey 12.3
Chip: Apple M1 Max
PyTorch version 1.13.0
Python 3.10.7
I am not using a conda environment.
I went through several posts with similar issues on Stack Overflow, but unfortunately they didn't help. Any suggestion will be appreciated.
I found a way to work around the issue: not using my M1 GPU, but loading onto the CPU instead with a custom unpickler. Thanks to this link.
import io
import pickle

import torch

class CPU_Unpickler(pickle.Unpickler):
    # Redirect torch storage deserialization to the CPU so that tensors
    # pickled on a CUDA machine can be loaded without a CUDA device.
    def find_class(self, module, name):
        if module == 'torch.storage' and name == '_load_from_bytes':
            return lambda b: torch.load(io.BytesIO(b), map_location='cpu')
        else:
            return super().find_class(module, name)

with open('patient_notes_agg.pickle', 'rb') as df:
    patient_notes_agg = CPU_Unpickler(df).load()

patient_notes_agg.head()
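As a follow-up, loading on the CPU does not stop you from using the M1 GPU afterwards. A minimal sketch, assuming the unpickled DataFrame has a column of tensors (the column name 'embedding' is hypothetical; adapt it to your own data):
import torch

# Pick the MPS device if it is available, otherwise fall back to the CPU.
device = torch.device("mps") if torch.backends.mps.is_available() else torch.device("cpu")

# Hypothetical example: move any torch.Tensor values in an 'embedding' column
# onto the chosen device after CPU unpickling.
patient_notes_agg['embedding'] = patient_notes_agg['embedding'].apply(
    lambda v: v.to(device) if isinstance(v, torch.Tensor) else v
)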

Time module: Couldn't find a version that satisfies the requirement

I've tried to run this code:
from time import clock

def f2():
    t1 = clock()
    res = ' ' * 10**6
    print('f2:', clock() - t1)
but got this traceback:
from time import clock
ImportError: cannot import name 'clock' from 'time' (unknown location)
Python doesn't see the time module in the standard library?
I tried to install this module manually via pip (Yes, I know that it should already be installed. But what else could I do?). I got the following error in response:
ERROR: Could not find a version that satisfies the requirement time
ERROR: No matching distribution found for time
Trying to install the module via PyCharm also failed - it just runs pip and gets the same error.
I found the answer.
The clock() function in the time module was deprecated in Python 3.3 and removed in 3.8; see issue 36895.
So I used time.time() instead:
import time

def f2():
    t1 = time.time()
    res = ' ' * 10**8
    t2 = time.time()
    print('f2:', t2 - t1)
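Note that the deprecation notice in the documentation actually recommends time.perf_counter() or time.process_time() as replacements rather than time.time(). As a minimal sketch of the same function using perf_counter(), which gives a monotonic, high-resolution wall-clock timer:
import time

def f2():
    t1 = time.perf_counter()  # monotonic, high-resolution timer
    res = ' ' * 10**8
    print('f2:', time.perf_counter() - t1)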
Strangely, while googling the problem I noticed that many people in 2019-2021 (after clock() was removed in Python 3.8) had this error, but no one wrote how to solve it.
So my answer might be really helpful.

Geopandas to_file() gives an error regarding fiona.drivers(). Is it possible to work around this?

I'm using geopandas to get WKT and coordinates from a database:
import pandas
import geopandas
from shapely import wkt  # assuming 'wkt' here refers to shapely's wkt module

df = pandas.read_sql(con=conn2, sql=test_query)
df['Coordinates'] = df['WKT'].apply(lambda x: wkt.loads(x.read()))
gdf = geopandas.GeoDataFrame(df, geometry='Coordinates')

loc = r"...\Layers\geopandastest2.shp"
gdf.to_file(loc)
When I use to_file() it gives me the following error:
C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\lib\site-packages\geopandas\io\file.py:108: FionaDeprecationWarning: Use fiona.Env() instead.
with fiona.drivers():
Is it possible to get around this and force to_file() to use fiona.Env() or do I need to wait for geopandas to be updated?
Relevant geopandas github issue: https://github.com/geopandas/geopandas/issues/845
It is just a warning; your file should be saved anyway. It is already fixed in GeoPandas master (https://github.com/geopandas/geopandas/pull/854), which should be released soon.
You don't have to do anything about it now; it does not affect your script.
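If the warning is noisy in the meantime, a minimal workaround is to filter it with the standard library's warnings module. This is only a sketch: the message pattern is taken from the warning text quoted above, and gdf and loc are the objects from the question.
import warnings

# Silence just this deprecation warning until the geopandas fix is released.
warnings.filterwarnings("ignore", message="Use fiona.Env")

gdf.to_file(loc)  # the shapefile is still written as before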

Version mismatch between H2O and R package, where to get the right one?

I am using H2O (basic version) and it works well. I want to try Deep Water for GPU support, so I carefully followed the instructions at
https://www.h2o.ai/deep-water/#try
to install Deep Water. However, it failed to run and showed this error:
Error in h2o.init(nthreads = -1, port = 54323, startH2O = FALSE) :
Version mismatch! H2O is running version 3.15.0.393 but h2o-R package is version 3.13.0.369.
Install the matching h2o-R version from - http://h2o-release.s3.amazonaws.com/h2o/(HEAD detached at c46596cad)
Where do I get the right version?
According to the deep-water link, it wants you to use 3.13.0. And your error message is saying you are using the 3.13.0.369 R package.
So, I think the problem is that you have 3.15.0.393 already running on this machine. Kill it and try again.
From inside your current R session, h2o.shutdown() might work. If not, and you are using Unix, do something like ps auxw | grep h2o to find its PID and kill it; if you are using Windows, search for h2o in the Task Manager. Or, cleanest of all, if you know which R (or Python, etc.) client started that 3.15.0 version of H2O, go and close that client.
You can force the connection by skipping the version check:
h2o.init(ip = Cluster_ip, port = Cluster_port,
         strict_version_check = FALSE,
         startH2O = FALSE)

Error while connecting sparklyr to remote sparkR in Rstudio

I tried the following command in my local RStudio session to connect to SparkR:
sc <- spark_connect(master = "spark://x.x.x.x:7077",
                    spark_home = "/home/hduser/spark-2.0.0-bin-hadoop2.7",
                    version = "2.0.0", config = list())
But I am getting the following error:
Error in start_shell(master = master, spark_home = spark_home, spark_version = version, :
SPARK_HOME directory '/home/hduser/spark-2.0.0-bin-hadoop2.7' not found
Any help?
Thanks in advance
May I ask, have you actually installed Spark into that folder?
Can you show the result of the ls command in the /home/hduser/ folder?
And the output of sessionInfo() in R?
Let me share how I am using a custom folder structure.
It is on Windows, not Ubuntu, but I guess that won't make much of a difference.
Using the most recent dev edition
If you check on GitHub, the RStudio folks are updating sparklyr almost every day, fixing numerous reported bugs:
devtools::install_github("rstudio/sparklyr")
In my case, only installing sparklyr_0.4.12 resolved the problem with Spark 2.0 under Windows.
Checking Spark availability
Please check whether the version you are after is available:
spark_available_versions()
You should see something like the line below, which indicates that the version you intend to use is actually available for your sparklyr package.
[13] 2.0.0 2.7 spark_install(version = "2.0.0", hadoop_version = "2.7")
Installation of Spark
Just to keep things organised, you may want to install Spark in a location other than RStudio's default cache folder:
options(spark.install.dir = "c:/spark")
Once you are sure the desired version is available, it is time to install Spark:
spark_install(version = "2.0.0", hadoop_version = "2.7")
I'd check whether it installed correctly (use ls instead of dir on Ubuntu):
cd c:/spark
dir
Now specify the location of the edition you want to use:
Sys.setenv(SPARK_HOME = 'C:/spark/spark-2.0.0-bin-hadoop2.7')
And finally, enjoy creating the connection:
sc <- spark_connect(master = "local")
I hope it helps.
