Tensorflow: How to take advantage of multi GPUs?

I have a CNN that runs well with 1 GPU. I have now moved to a computer that has 2 GPUs, and I would like to train my network using both of them to save time. How can I do that?
I read https://www.tensorflow.org/tutorials/using_gpu, but I think the example there is too simple and honestly I don't know how to apply it to my real network.
Could anyone give me a simple illustration based on my network? (It is an autoencoder.)
Thank you very much!
graphCNN = tf.Graph()
with graphCNN.as_default():
    # Input
    x = tf.placeholder(tf.float32, shape=(None, img_w, img_h, img_ch), name="X")  # X
    # Expected output
    y_ = tf.placeholder(tf.float32, shape=(None, img_w, img_h, img_ch), name="Y")  # Y_
    # Dropout
    dropout = tf.placeholder(tf.float32)

    ### Model
    def model(data):
        ### Encoder
        c64 = ConvLayer(data, depth_in=1, depth_out=64, name="c64", kernel_size=3, acti=True)
        c128 = ConvLayer(c64, depth_in=64, depth_out=128, name="c128", kernel_size=3, acti=True)
        c256 = ConvLayer(c128, depth_in=128, depth_out=256, name="c256", kernel_size=3, acti=True)
        c512_1 = ConvLayer(c256, depth_in=256, depth_out=512, name="c512_1", kernel_size=3, acti=True)
        c512_2 = ConvLayer(c512_1, depth_in=512, depth_out=512, name="c512_2", kernel_size=3, acti=True)
        c512_3 = ConvLayer(c512_2, depth_in=512, depth_out=512, name="c512_3", kernel_size=3, acti=True)
        c512_4 = ConvLayer(c512_3, depth_in=512, depth_out=512, name="c512_4", kernel_size=3, acti=True)
        c512_5 = ConvLayer(c512_4, depth_in=512, depth_out=512, name="c512_5", kernel_size=3, acti=True)
        ### Decoder
        dc512_5 = DeconvLayer(c512_5, depth_in=512, depth_out=512, name="dc512_5", kernel_size=3, acti=True)
        dc512_4 = DeconvLayer(dc512_5, depth_in=512, depth_out=512, name="dc512_4", kernel_size=3, acti=True)
        dc512_3 = DeconvLayer(dc512_4, depth_in=512, depth_out=512, name="dc512_3", kernel_size=3, acti=True)
        dc512_2 = DeconvLayer(dc512_3, depth_in=512, depth_out=512, name="dc512_2", kernel_size=3, acti=True)
        dc512_1 = DeconvLayer(dc512_2, depth_in=512, depth_out=512, name="dc512_1", kernel_size=3, acti=True)
        dc256 = DeconvLayer(dc512_1, depth_in=512, depth_out=256, name="dc256", kernel_size=3, acti=True)
        dc128 = DeconvLayer(dc256, depth_in=256, depth_out=128, name="dc128", kernel_size=3, acti=True)
        dc64 = DeconvLayer(dc128, depth_in=128, depth_out=64, name="dc64", kernel_size=3, acti=True)
        output = ConvLayer(dc64, depth_in=64, depth_out=1, name="conv_out", kernel_size=3, acti=True)
        return output

    # Predictions
    y = model(x)
    y_image = tf.reshape(y, [-1, img_w, img_h, 1])
    tf.summary.image('output', y_image, 6)

    # Loss
    loss = tf.reduce_sum(tf.pow(y - y_, 2)) / (img_w*img_h*img_ch)  # MSE
    loss_summary = tf.summary.scalar("Training_Loss", loss)

    # Optimizer
    with tf.name_scope("train"):
        train_step = tf.train.AdamOptimizer(learning_rate=learn_rate).minimize(loss)
In case you want to see more details:
def ConvLayer(input, depth_in, depth_out, name="conv", kernel_size=3, acti=True):
    with tf.name_scope(name):
        w = tf.Variable(tf.truncated_normal([kernel_size, kernel_size, depth_in, depth_out],
                                            stddev=0.1), name="W")
        b = tf.Variable(tf.constant(0.1, shape=[depth_out]), name="B")
        conv = tf.nn.conv2d(input, w, strides=[1, 1, 1, 1], padding="SAME")
        tf.summary.histogram("weights", w)
        tf.summary.histogram("biases", b)
        if acti:
            act = tf.nn.relu(conv + b)
            tf.summary.histogram("activations", act)
            result = act
        else:
            result = conv + b
        result_maxpooled = max_pool(result, 2)
        return result_maxpooled
def DeconvLayer(input, depth_in, depth_out, name="deconv", kernel_size=3, acti=True):
    with tf.name_scope(name):
        w = tf.Variable(tf.truncated_normal([kernel_size, kernel_size, depth_out, depth_in],
                                            stddev=0.1), name="W")
        b = tf.Variable(tf.constant(0.1, shape=[depth_out]), name="B")
        input_shape = tf.shape(input)
        output_shape = tf.stack([input_shape[0], input_shape[1]*2, input_shape[2]*2, input_shape[3]//2])
        deconv = tf.nn.conv2d_transpose(input, w, output_shape, strides=[1, 1, 1, 1], padding='SAME')
        tf.summary.histogram("weights", w)
        tf.summary.histogram("biases", b)
        if acti:
            act = tf.nn.relu(deconv + b)
            tf.summary.histogram("activations", act)
            result = act
        else:
            result = deconv + b
        return result

How to implement a CNN (Convolutional Neural Network) on multiple GPUs?
As quoted from "Training a Model Using Multiple GPU Cards" (a tutorial from TensorFlow):
Place an individual model replica on each GPU.
Update model parameters synchronously by waiting for all GPUs to finish processing a batch of data.
To boost performance by understanding the data flow between main memory, the CPU, and the GPU, have a look at this answer: "Why should preprocessing be done on CPU rather than GPU?": https://stackoverflow.com/a/44377741/4190159
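A minimal sketch of that replica-per-GPU pattern applied to a network like yours (the names make_parallel and num_gpus are mine, not from the tutorial). It assumes the layers create their weights with tf.get_variable so that the towers can share variables; your ConvLayer/DeconvLayer use tf.Variable, which would give each tower its own copy of the weights, so they would need to be switched over first:

def make_parallel(x, y_, num_gpus=2):
    # Give each GPU an equal slice of the batch.
    x_splits = tf.split(x, num_gpus, axis=0)
    y_splits = tf.split(y_, num_gpus, axis=0)
    optimizer = tf.train.AdamOptimizer(learning_rate=learn_rate)
    tower_grads, tower_losses = [], []
    for i in range(num_gpus):
        with tf.device('/gpu:%d' % i):
            # One model replica ("tower") per GPU, sharing the same variables.
            with tf.variable_scope(tf.get_variable_scope(), reuse=(i > 0)):
                y = model(x_splits[i])
                loss = tf.reduce_sum(tf.pow(y - y_splits[i], 2)) / (img_w*img_h*img_ch)
                tower_losses.append(loss)
                tower_grads.append(optimizer.compute_gradients(loss))
    # Synchronous update: average each variable's gradient over all towers.
    avg_grads = []
    for grads_and_vars in zip(*tower_grads):
        grads = [g for g, _ in grads_and_vars if g is not None]
        avg_grads.append((tf.reduce_mean(tf.stack(grads), axis=0), grads_and_vars[0][1]))
    train_step = optimizer.apply_gradients(avg_grads)
    mean_loss = tf.add_n(tower_losses) / num_gpus
    return train_step, mean_loss

You would then build train_step, loss = make_parallel(x, y_) inside graphCNN instead of the single-GPU loss/train_step above, and feed batches whose size is divisible by num_gpus.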

Related

Fine tuning Bert for NER attempt on Mac OS

I'm using a MacBook Air with macOS Monterey 12.5 (an update to Ventura 13.1 is available).
Python version 3.10.8 (I also tried 3.11).
Pylance pointed out that none of the imports I was trying to execute were being resolved, so I changed the VS Code interpreter to Python 3.10.
Anyway, here's the code:
import pandas as pd
import torch
import numpy as np
from tqdm import tqdm
from transformers import BertTokenizerFast
from transformers import BertForTokenClassification
from torch.utils.data import Dataset, DataLoader

df = pd.read_csv('ner.csv')

labels = [i.split() for i in df['labels'].values.tolist()]
unique_labels = set()
for lb in labels:
    [unique_labels.add(i) for i in lb if i not in unique_labels]
# print(unique_labels)
labels_to_ids = {k: v for v, k in enumerate(sorted(unique_labels))}
ids_to_labels = {v: k for v, k in enumerate(sorted(unique_labels))}
# print(labels_to_ids)

text = df['text'].values.tolist()
example = text[36]
# print(example)

tokenizer = BertTokenizerFast.from_pretrained('bert-base-uncased')
text_tokenized = tokenizer(example, padding='max_length', max_length=512, truncation=True, return_tensors='pt')
'''
print(text_tokenized)
print(tokenizer.decode(text_tokenized.input_ids[0]))
'''

def align_label_example(tokenized_input, labels):
    word_ids = tokenized_input.word_ids()
    previous_word_idx = None
    label_ids = []
    for word_idx in word_ids:
        if word_idx is None:
            label_ids.append(-100)
        elif word_idx != previous_word_idx:
            try:
                label_ids.append(labels_to_ids[labels[word_idx]])
            except:
                label_ids.append(-100)
        else:
            label_ids.append(labels_to_ids[labels[word_idx]] if label_all_tokens else -100)
        previous_word_idx = word_idx
    return label_ids

label = labels[36]
label_all_tokens = False
new_label = align_label_example(text_tokenized, label)
'''
print(new_label)
print(tokenizer.convert_ids_to_tokens(text_tokenized['input_ids'][0]))
'''

def align_label(texts, labels):
    tokenized_inputs = tokenizer(texts, padding='max_length', max_length=512, truncation=True)
    word_ids = tokenized_inputs.word_ids()
    previous_word_idx = None
    label_ids = []
    for word_idx in word_ids:
        if word_idx is None:
            label_ids.append(-100)
        elif word_idx != previous_word_idx:
            try:
                label_ids.append(labels_to_ids[labels[word_idx]])
            except:
                label_ids.append(-100)
        else:
            try:
                label_ids.append(labels_to_ids[labels[word_idx]] if label_all_tokens else -100)
            except:
                label_ids.append(-100)
        previous_word_idx = word_idx
    return label_ids

class DataSequence(torch.utils.data.Dataset):
    def __init__(self, df):
        lb = [i.split() for i in df['labels'].values.tolist()]
        txt = df['text'].values.tolist()
        self.texts = [tokenizer(str(i),
                                padding='max_length', max_length=512, truncation=True, return_tensors='pt') for i in txt]
        self.labels = [align_label(i, j) for i, j in zip(txt, lb)]

    def __len__(self):
        return len(self.labels)

    def get_batch_data(self, idx):
        # (restored: this method was missing from the pasted code but is
        # called by __getitem__ below)
        return self.texts[idx]

    def get_batch_labels(self, idx):
        return torch.LongTensor(self.labels[idx])

    def __getitem__(self, idx):
        batch_data = self.get_batch_data(idx)
        batch_labels = self.get_batch_labels(idx)
        return batch_data, batch_labels

df = df[0:1000]
df_train, df_val, df_test = np.split(df.sample(frac=1, random_state=42),
                                     [int(.8 * len(df)), int(.9 * len(df))])

class BertModel(torch.nn.Module):
    def __init__(self):
        super(BertModel, self).__init__()
        self.bert = BertForTokenClassification.from_pretrained('bert-base-cased', num_labels=len(unique_labels))

    def forward(self, input_id, mask, label):
        output = self.bert(input_ids=input_id, attention_mask=mask, labels=label, return_dict=False)
        return output

def train_loop(model, df_train, df_val):
    train_dataset = DataSequence(df_train)
    val_dataset = DataSequence(df_val)
    train_dataloader = DataLoader(train_dataset, num_workers=4, batch_size=BATCH_SIZE, shuffle=True)
    val_dataloader = DataLoader(val_dataset, num_workers=4, batch_size=BATCH_SIZE)
    use_cuda = torch.cuda.is_available()
    device = torch.device('cuda' if use_cuda else 'cpu')
    optimizer = torch.optim.SGD(model.parameters(), lr=LEARNING_RATE)
    if use_cuda:
        model = model.cuda()
    best_acc = 0
    best_loss = 1000
    for epoch_num in range(EPOCHS):
        total_acc_train = 0
        total_loss_train = 0
        model.train()
        for train_data, train_label in tqdm(train_dataloader):
            train_label = train_label.to(device)
            mask = train_data['attention_mask'].squeeze(1).to(device)
            input_id = train_data['input_ids'].squeeze(1).to(device)
            optimizer.zero_grad()
            loss, logits = model(input_id, mask, train_label)
            for i in range(logits.shape[0]):
                logits_clean = logits[i][train_label[i] != -100]
                label_clean = train_label[i][train_label[i] != -100]
                predictions = logits_clean.argmax(dim=1)
                acc = (predictions == label_clean).float().mean()
                total_acc_train += acc
                total_loss_train += loss.item()
            loss.backward()
            optimizer.step()
        model.eval()
        total_acc_val = 0
        total_loss_val = 0
        for val_data, val_label in val_dataloader:
            val_label = val_label.to(device)
            mask = val_data['attention_mask'].squeeze(1).to(device)
            input_id = val_data['input_ids'].squeeze(1).to(device)
            loss, logits = model(input_id, mask, val_label)
            for i in range(logits.shape[0]):
                logits_clean = logits[i][val_label[i] != -100]
                label_clean = val_label[i][val_label[i] != -100]
                predictions = logits_clean.argmax(dim=1)
                acc = (predictions == label_clean).float().mean()
                total_acc_val += acc
                total_loss_val += loss.item()
        val_accuracy = total_acc_val / len(df_val)
        val_loss = total_loss_val / len(df_val)
        print(
            f'Epochs: {epoch_num + 1} | Loss: {total_loss_train / len(df_train): .3f} | Accuracy: {total_acc_train / len(df_train): .3f} | Val_Loss: {total_loss_val / len(df_val): .3f} | Accuracy: {total_acc_val / len(df_val): .3f}')

LEARNING_RATE = 5e-3
EPOCHS = 5
BATCH_SIZE = 2

model = BertModel()
train_loop(model, df_train, df_val)
And the debugger says:
Exception has occurred: RuntimeError (note: full exception trace is shown but execution is paused at: <module>)
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
File "/Users/filipedonatti/Projects/pyCodes/second_try.py", line 141, in train_loop
for train_data, train_label in tqdm(train_dataloader):
File "/Users/filipedonatti/Projects/pyCodes/second_try.py", line 197, in <module>
train_loop(model, df_train, df_val)
File "<string>", line 1, in <module> (Current frame)
By the way,
Despite using a Mac, I have downloaded Anaconda Navigator; however, I've been writing and executing this code in VS Code. I've installed numpy, torch, datasets and other libraries through Brew with the pip3 command.
I'm at a loss. I can run the code in a Google Colab or Jupyter notebook, and I know training models like this on my humble Mac is not advisable, but I'm just practicing so I can train and use the model on a much more powerful machine.
Please help me with this issue, I've been trying to find a solution for days.
Peace and happy holidays.
I've tried solving the issue by writing:
if __name__ == '__main__':
    freeze_support()
I've also tried using this:
import parallelTestModule
extractor = parallelTestModule.ParallelExtractor()
extractor.runInParallel(numProcesses=2, numThreads=4)
So...
It turns out the way to solve this is to wrap the model creation and training loop in a function, like so:
def run():
    model = BertModel()
    torch.multiprocessing.freeze_support()
    print('loop')
    train_loop(model, df_train, df_val)

if __name__ == '__main__':
    run()
replacing the bare train_loop call at the end. Issue solved. For more, see this link: https://github.com/pytorch/pytorch/issues/5858
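A possible alternative, assuming the crash really does come from the DataLoader worker processes (macOS starts them with spawn rather than fork, which re-imports the main module): disable the workers entirely, so no child process is started at all. The two lines below are the ones from train_loop above with only num_workers changed:

train_dataloader = DataLoader(train_dataset, num_workers=0, batch_size=BATCH_SIZE, shuffle=True)
val_dataloader = DataLoader(val_dataset, num_workers=0, batch_size=BATCH_SIZE)

This trades data-loading throughput for simplicity, which is fine for a small local experiment like this one.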

Is there any way to retrieve real-time data from SQL Server in a GUI interface with wxPython?

I have a simple GUI application built with wxPython, and I'd like it to display real-time values from a database. Currently I can display a value from the database, but I cannot figure out how to make it REAL TIME; I mean that when the data changes in the database, the data shown in the interface should stay in sync with it. I hope that's clear. Can anyone help with this? Any suggestion will be appreciated.
Here is my code (Python 3.5, Windows 10):
# -*- coding: utf-8 -*-
import wx
import wx.adv
import wx.grid
import sys
import pyodbc
import time

bgcolor = (220, 220, 220)

class Mywin(wx.Frame):
    def __init__(self, parent, title):
        super(Mywin, self).__init__(parent, title=title, size=(600, 320))
        self.InitUI()

    def InitUI(self):
        nb = wx.Notebook(self)
        nb.AddPage(MyPanel3(nb), "Table")
        self.Centre()
        self.Show(True)

def retrieve_data_fromdb():
    # SQL Server configuration
    server = 'localhost'
    db = '***'
    user = 'sa'
    pwd = '******'
    src_db = pyodbc.connect(r'DRIVER={SQL Server Native Client 11.0};SERVER=' + server + ';DATABASE=' + db + ';UID=' + user + ';PWD=' + pwd)
    cur = src_db.cursor()
    select = 'select * from real_time_test'
    cur.execute(select)
    rows = cur.fetchone()
    wind_spd = rows[0]
    site_pwr = rows[1]
    acv_pr_setpnt = rows[2]
    park_pbl_cap = rows[3]
    tol_cur = rows[4]
    tol_non_prod = rows[5]
    data = []
    data.append(wind_spd)
    data.append(site_pwr)
    data.append(acv_pr_setpnt)
    data.append(park_pbl_cap)
    data.append(tol_cur)
    data.append(tol_non_prod)
    # clean up before returning (in the original these lines came after
    # the return statement and were never reached)
    cur.commit()
    cur.close()
    src_db.close()
    return data

class MyPanel3(wx.Panel):
    def __init__(self, parent):
        super(MyPanel3, self).__init__(parent)
        self.SetBackgroundColour(bgcolor)
        self.Bind(wx.EVT_PAINT, self.OnPaint)
        title_NDC = wx.StaticText(self, -1, " Real time signals ", (30, 22))
        title_NDC.SetForegroundColour((0, 0, 255))
        wx.StaticText(self, -1, "1. Wind Speed", (35, 75))
        wx.StaticText(self, -1, "2. Site Power", (35, 95))
        wx.StaticText(self, -1, "Instant", (300, 45))
        wx.StaticText(self, -1, "m/s", (340, 75))
        wx.StaticText(self, -1, "kW", (340, 95))
        a = retrieve_data_fromdb()
        wind_spd_val = wx.StaticText(self, -1, a[0], (300, 75))
        wind_spd_val.SetForegroundColour((0, 0, 255))

    def OnPaint(self, event):
        pdc = wx.PaintDC(self)
        gc = wx.GCDC(pdc)
        gc.Clear()
        brush_rec = wx.Brush(bgcolor)
        gc.SetBrush(brush_rec)
        gc.SetPen(wx.Pen("black", 2))
        x1 = 20
        y1 = 30
        w1 = 500
        h1 = 180
        radius = 3
        gc.DrawRoundedRectangle(x1, y1, w1, h1, radius)

ex = wx.App()
Mywin(None, 'My example')
ex.MainLoop()
An easy, if inefficient, way to achieve this is to use a simple Timer: read the database once every x amount of time and update the screen.
Forgive the messy code, but you'll get the idea.
Use the input box to update the value.
You'll have to fiddle with the database code, as I've used sqlite3, which is already on my box.
import wx
import sqlite3

class Frame(wx.Frame):
    def __init__(self):
        wx.Frame.__init__(self, None)
        # Dummy database
        self.data = [['1111', 'Lots'], ['2222', 'Fast'], ['3333', 'Low']]
        self.db = sqlite3.connect("test1.db")
        self.cursor = self.db.cursor()
        self.cursor.execute('CREATE TABLE IF NOT EXISTS TEST(Id Text, Value Text)')
        self.cursor.execute("select * from TEST")
        test = self.cursor.fetchone()
        if not test:
            for i in self.data:
                result = self.db.execute("insert into Test (Id, Value) values (?,?)", (i[0], i[1]))
            self.db.commit()
        wx.StaticText(self, -1, "1. Wind Speed", (35, 75))
        wx.StaticText(self, -1, "2. Site Power", (35, 95))
        wx.StaticText(self, -1, "Instant", (300, 45))
        wx.StaticText(self, -1, "m/s", (340, 75))
        wx.StaticText(self, -1, "kW", (340, 95))
        self.wind = wx.TextCtrl(self, -1, pos=(35, 200), style=wx.TE_RIGHT|wx.TE_PROCESS_ENTER)
        a = self.retrieve_data_fromdb()
        self.wind_spd_val = wx.StaticText(self, -1, a[1], (300, 75))
        # Bind the callbacks for the timer and the update function
        self.wind.Bind(wx.EVT_TEXT_ENTER, self.OnUpdate)
        self.dbtimer = wx.Timer(self)
        self.Bind(wx.EVT_TIMER, self.OnTimer, self.dbtimer)
        self.dbtimer.Start(1000)

    def retrieve_data_fromdb(self):
        self.cursor.execute("SELECT * FROM Test where Id = ?", ['2222'])
        a = self.cursor.fetchone()
        return a

    def OnUpdate(self, event):
        self.cursor.execute("update Test set Value=? where Id=?", (self.wind.GetValue(), '2222'))
        self.db.commit()

    def OnTimer(self, event):
        a = self.retrieve_data_fromdb()
        self.wind_spd_val.SetLabel(a[1])
        print("Updated with " + str(a[1]))

if __name__ == '__main__':
    app = wx.App()
    frame = Frame()
    frame.Show()
    app.MainLoop()

TensorFlow: after adding an RNN, the network no longer works

I downloaded an FCN implementation for image segmentation and it ran well. Now I want to add an RNN layer to refine the result, following the paper "ReSeg: A Recurrent Neural Network-Based Model for Semantic Segmentation". My code is shown below.
This part is for the inference:
def inference(image, keep_prob):
    """
    Semantic segmentation network definition
    :param image: input image. Should have values in range 0-255
    :param keep_prob:
    :return:
    """
    print("setting up vgg initialized conv layers ...")
    # model_data = utils.get_model_data(FLAGS.model_dir, MODEL_URL)
    model_data = scipy.io.loadmat("H:/Deep Learning/FCN.tensorflow-master/imagenet-vgg-verydeep-19.mat")
    mean = model_data['normalization'][0][0][0]
    mean_pixel = np.mean(mean, axis=(0, 1))
    weights = np.squeeze(model_data['layers'])
    processed_image = utils.process_image(image, mean_pixel)
    with tf.variable_scope("inference"):
        image_net = vgg_net(weights, processed_image)
        conv_final_layer = image_net["conv5_3"]
        pool5 = utils.max_pool_2x2(conv_final_layer)
        W6 = utils.weight_variable([7, 7, 512, 4096], name="W6")
        b6 = utils.bias_variable([4096], name="b6")
        conv6 = utils.conv2d_basic(pool5, W6, b6)
        relu6 = tf.nn.relu(conv6, name="relu6")
        if FLAGS.debug:
            utils.add_activation_summary(relu6)
        relu_dropout6 = tf.nn.dropout(relu6, keep_prob=keep_prob)
        W7 = utils.weight_variable([1, 1, 4096, 4096], name="W7")
        b7 = utils.bias_variable([4096], name="b7")
        conv7 = utils.conv2d_basic(relu_dropout6, W7, b7)
        relu7 = tf.nn.relu(conv7, name="relu7")
        if FLAGS.debug:
            utils.add_activation_summary(relu7)
        relu_dropout7 = tf.nn.dropout(relu7, keep_prob=keep_prob)
        W8 = utils.weight_variable([1, 1, 4096, NUM_OF_CLASSESS], name="W8")
        b8 = utils.bias_variable([NUM_OF_CLASSESS], name="b8")
        conv8 = utils.conv2d_basic(relu_dropout7, W8, b8)
        # annotation_pred1 = tf.argmax(conv8, dimension=3, name="prediction1")

        # now to upscale to actual image size
        deconv_shape1 = image_net["pool4"].get_shape()
        W_t1 = utils.weight_variable([4, 4, deconv_shape1[3].value, NUM_OF_CLASSESS], name="W_t1")
        b_t1 = utils.bias_variable([deconv_shape1[3].value], name="b_t1")
        conv_t1 = utils.conv2d_transpose_strided(conv8, W_t1, b_t1, output_shape=tf.shape(image_net["pool4"]))
        # fuse_1 = tf.add(conv_t1, image_net["pool4"], name="fuse_1")
        deconv_shape2 = image_net["pool3"].get_shape()
        W_t2 = utils.weight_variable([4, 4, deconv_shape2[3].value, deconv_shape1[3].value], name="W_t2")
        b_t2 = utils.bias_variable([deconv_shape2[3].value], name="b_t2")
        conv_t2 = utils.conv2d_transpose_strided(conv_t1, W_t2, b_t2, output_shape=tf.shape(image_net["pool3"]))
        # fuse_2 = tf.add(conv_t2, image_net["pool3"], name="fuse_2")
        shape = tf.shape(image)
        deconv_shape3 = tf.stack([shape[0], shape[1], shape[2], NUM_OF_CLASSESS])
        W_t3 = utils.weight_variable([16, 16, NUM_OF_CLASSESS, deconv_shape2[3].value], name="W_t3")
        b_t3 = utils.bias_variable([NUM_OF_CLASSESS], name="b_t3")
        conv_t3 = utils.conv2d_transpose_strided(conv_t2, W_t3, b_t3, output_shape=deconv_shape3, stride=8)

        # ---------- this is from where I added the RNN ----------
        shape_5 = tf.shape(image)
        W_a = 224
        H_a = 224
        p_size_a = NUM_OF_CLASSESS
        # x = tf.reshape(conv_t1, [shape_5[0], H_a, W_a, p_size_a])
        x = tf.transpose(conv_t3, perm=[0, 2, 1, 3])
        x = tf.reshape(x, [-1, H_a, p_size_a])
        mat = tf.unstack(x, H_a, 1)
        lstm_fw_cell = rnn.BasicLSTMCell(N_HIDDEN, forget_bias=1.0)
        lstm_bw_cell = rnn.BasicLSTMCell(N_HIDDEN, forget_bias=1.0)
        # with tf.variable_scope('rnn1_1'):
        try:
            outputs, _, _ = rnn.static_bidirectional_rnn(lstm_fw_cell, lstm_bw_cell, mat,
                                                         dtype=tf.float32, scope='rnn1_1')
        except Exception:  # Old TensorFlow version only returns outputs not states
            outputs = rnn.static_bidirectional_rnn(lstm_fw_cell, lstm_bw_cell, mat,
                                                   dtype=tf.float32)
        outputs1 = tf.reshape(outputs, [H_a, shape_5[0], W_a, 2 * N_HIDDEN])
        outputs1 = tf.transpose(outputs1, (1, 0, 2, 3))
        x_1 = tf.reshape(outputs1, [-1, W_a, 2 * N_HIDDEN])
        mat_1 = tf.unstack(x_1, W_a, 1)
        lstm_lw_cell = rnn.BasicLSTMCell(N_HIDDEN, forget_bias=1.0)
        lstm_rw_cell = rnn.BasicLSTMCell(N_HIDDEN, forget_bias=1.0)
        # with tf.variable_scope('rnn1_2'):
        try:
            outputs2, _, _ = rnn.static_bidirectional_rnn(lstm_lw_cell, lstm_rw_cell, mat_1,
                                                          dtype=tf.float32, scope='rnn1_2')
        except Exception:  # Old TensorFlow version only returns outputs not states
            outputs2 = rnn.static_bidirectional_rnn(lstm_lw_cell, lstm_rw_cell, mat_1,
                                                    dtype=tf.float32)
        outputs2 = tf.reshape(outputs, [W_a, shape_5[0], H_a, 2 * N_HIDDEN])
        outputs2 = tf.transpose(outputs2, (1, 2, 0, 3))
        # ---------- till here ----------

        annotation_pred = tf.argmax(outputs2, dimension=3, name="prediction")
    return tf.expand_dims(annotation_pred, dim=3), outputs2
And this part is for the training:
def train(loss_val, var_list):
    optimizer = tf.train.AdamOptimizer(FLAGS.learning_rate)
    grads = optimizer.compute_gradients(loss_val, var_list=var_list)
    if FLAGS.debug:
        # print(len(var_list))
        for grad, var in grads:
            utils.add_gradient_summary(grad, var)
    return optimizer.apply_gradients(grads)

def main(argv=None):
    keep_probability = tf.placeholder(tf.float32, name="keep_probabilty")
    image = tf.placeholder(tf.float32, shape=[None, IMAGE_SIZE, IMAGE_SIZE, 3], name="input_image")
    annotation = tf.placeholder(tf.int32, shape=[None, IMAGE_SIZE, IMAGE_SIZE, 1], name="annotation")
    pred_annotation, logits = inference(image, keep_probability)
    tf.summary.image("input_image", image, max_outputs=2)
    tf.summary.image("ground_truth", tf.cast(annotation, tf.uint8), max_outputs=2)
    tf.summary.image("pred_annotation", tf.cast(pred_annotation, tf.uint8), max_outputs=2)
    loss = tf.reduce_mean((tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits,
                                                                          labels=tf.squeeze(annotation, squeeze_dims=[3]),
                                                                          name="entropy")))
    tf.summary.scalar("entropy", loss)
    trainable_var = tf.trainable_variables()
    if FLAGS.debug:
        for var in trainable_var:
            utils.add_to_regularization_and_summary(var)
    train_op = train(loss, trainable_var)
    print("Setting up summary op...")
    summary_op = tf.summary.merge_all()
    print("Setting up image reader...")
    train_records, valid_records = scene_parsing.read_dataset(FLAGS.data_dir)
    print(len(train_records))
    print(len(valid_records))
    print("Setting up dataset reader")
    image_options = {'resize': True, 'resize_size': IMAGE_SIZE}
    if FLAGS.mode == 'train':
        train_dataset_reader = dataset.BatchDatset(train_records, image_options)
    validation_dataset_reader = dataset.BatchDatset(valid_records, image_options)
    sess = tf.Session()
    print("Setting up Saver...")
    saver = tf.train.Saver()
    summary_writer = tf.summary.FileWriter(FLAGS.logs_dir, sess.graph)
    sess.run(tf.global_variables_initializer())
    ckpt = tf.train.get_checkpoint_state(FLAGS.logs_dir)
    if ckpt and ckpt.model_checkpoint_path:
        saver.restore(sess, ckpt.model_checkpoint_path)
        print("Model restored...")
    if FLAGS.mode == "train":
        for itr in xrange(MAX_ITERATION):
            train_images, train_annotations = train_dataset_reader.next_batch(FLAGS.batch_size)
            feed_dict = {image: train_images, annotation: train_annotations, keep_probability: 0.85}
            sess.run(train_op, feed_dict=feed_dict)
            if itr % 10 == 0:
                train_loss, summary_str = sess.run([loss, summary_op], feed_dict=feed_dict)
                print("Step: %d, Train_loss:%g" % (itr, train_loss))
                summary_writer.add_summary(summary_str, itr)
            if itr % 500 == 0:
                valid_images, valid_annotations = validation_dataset_reader.next_batch(FLAGS.batch_size)
                valid_loss = sess.run(loss, feed_dict={image: valid_images, annotation: valid_annotations,
                                                       keep_probability: 1.0})
                print("%s ---> Validation_loss: %g" % (datetime.datetime.now(), valid_loss))
                saver.save(sess, FLAGS.logs_dir + "model.ckpt", itr)
    elif FLAGS.mode == "visualize":
        valid_images, valid_annotations = validation_dataset_reader.get_random_batch(FLAGS.batch_size)
        pred = sess.run(pred_annotation, feed_dict={image: valid_images, annotation: valid_annotations,
                                                    keep_probability: 1.0})
        valid_annotations = np.squeeze(valid_annotations, axis=3)
        pred = np.squeeze(pred, axis=3)
        for itr in range(FLAGS.batch_size):
            utils.save_image(valid_images[itr].astype(np.uint8), FLAGS.logs_dir, name="inp_" + str(5+itr))
            utils.save_image(valid_annotations[itr].astype(np.uint8), FLAGS.logs_dir, name="gt_" + str(5+itr))
            utils.save_image(pred[itr].astype(np.uint8), FLAGS.logs_dir, name="pred_" + str(5+itr))
            print("Saved image: %d" % itr)
The error was:
Not found: Key inference/rnn1_2/fw/basic_lstm_cell/weights not found in checkpoint
So I think there must be something wrong with the variables.
I'd be very grateful if someone could tell me how to fix it!
Looking forward to your help!
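The "not found in checkpoint" message suggests the Saver is trying to restore the newly added RNN variables from an old checkpoint written before they existed. A minimal sketch of one workaround, assuming TF1's checkpoint utilities (tf.train.NewCheckpointReader): build a Saver over only the variables whose names actually appear in the checkpoint, restore those, and leave the new rnn1_1/rnn1_2 variables at their fresh initializers:

# Hypothetical repair sketch: restore only the variables present in the
# old checkpoint; newly added RNN variables keep their initial values.
sess.run(tf.global_variables_initializer())
ckpt = tf.train.get_checkpoint_state(FLAGS.logs_dir)
if ckpt and ckpt.model_checkpoint_path:
    reader = tf.train.NewCheckpointReader(ckpt.model_checkpoint_path)
    saved_shapes = reader.get_variable_to_shape_map()
    restore_vars = [v for v in tf.global_variables()
                    if v.op.name in saved_shapes]
    tf.train.Saver(var_list=restore_vars).restore(sess, ckpt.model_checkpoint_path)
    print("Restored %d of %d variables" % (len(restore_vars), len(tf.global_variables())))

Once the model has been trained and saved again (now including the RNN variables), the default Saver should restore everything without this workaround.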

Loading test data using batch Tensorflow

The following code is my pipeline for reading images and labels from files:
import tensorflow as tf
import numpy as np
import tflearn.data_utils
from tensorflow.python.framework import ops
from tensorflow.python.framework import dtypes
import sys

# process labels in the input file
def process_label(label):
    info = np.zeros(6)
    ...
    return info

def read_label_file(file):
    f = open(file, "r")
    filepaths = []
    labels = []
    lines = []
    for line in f:
        tokens = line.split(",")
        filepaths.append([tokens[0], tokens[1], tokens[2]])
        labels.append(process_label(tokens[3:]))
        lines.append(line)
    return filepaths, np.vstack(labels), lines

def get_data_batches(params):
    # read labels and file paths
    train_filepaths, train_labels, train_line = read_label_file(params.train_info)
    test_filepaths, test_labels, test_line = read_label_file(params.test_info)
    # convert strings into tensors
    train_images = ops.convert_to_tensor(train_filepaths)
    train_labels = ops.convert_to_tensor(train_labels)
    train_line = ops.convert_to_tensor(train_line)
    test_images = ops.convert_to_tensor(test_filepaths)
    test_labels = ops.convert_to_tensor(test_labels)
    test_line = ops.convert_to_tensor(test_line)
    # create input queues
    train_input_queue = tf.train.slice_input_producer([train_images, train_labels, train_line], shuffle=params.shuffle)
    test_input_queue = tf.train.slice_input_producer([test_images, test_labels, test_line], shuffle=False)
    # process path and string tensors into an image and a label
    train_image = None
    for i in range(train_input_queue[0].get_shape()[0]):
        file_content = tf.read_file(params.path_prefix + train_input_queue[0][i])
        train_imageT = (tf.to_float(tf.image.decode_jpeg(file_content, channels=params.num_channels))) * (1.0/255)
        train_imageT = tf.image.resize_images(train_imageT, [params.load_size[0], params.load_size[1]])
        train_imageT = tf.random_crop(train_imageT, size=[params.crop_size[0], params.crop_size[1], params.num_channels])
        train_imageT = tf.image.random_flip_up_down(train_imageT)
        train_imageT = tf.image.per_image_standardization(train_imageT)
        if i == 0:
            train_image = train_imageT
        else:
            train_image = tf.concat([train_image, train_imageT], 2)
    train_label = train_input_queue[1]
    train_lineInfo = train_input_queue[2]
    test_image = None
    for i in range(test_input_queue[0].get_shape()[0]):
        file_content = tf.read_file(params.path_prefix + test_input_queue[0][i])
        test_imageT = tf.to_float(tf.image.decode_jpeg(file_content, channels=params.num_channels)) * (1.0/255)
        test_imageT = tf.image.resize_images(test_imageT, [params.load_size[0], params.load_size[1]])
        test_imageT = tf.image.central_crop(test_imageT, (params.crop_size[0]+0.0)/params.load_size[0])
        test_imageT = tf.image.per_image_standardization(test_imageT)
        if i == 0:
            test_image = test_imageT
        else:
            test_image = tf.concat([test_image, test_imageT], 2)
    test_label = test_input_queue[1]
    test_lineInfo = test_input_queue[2]
    # define tensor shapes
    train_image.set_shape([params.crop_size[0], params.crop_size[1], params.num_channels*3])
    train_label.set_shape([66])
    test_image.set_shape([params.crop_size[0], params.crop_size[1], params.num_channels*3])
    test_label.set_shape([66])
    # collect batches of images before processing
    train_image_batch, train_label_batch, train_lineno = tf.train.batch([train_image, train_label, train_lineInfo], batch_size=params.batch_size, num_threads=params.num_threads, allow_smaller_final_batch=True)
    test_image_batch, test_label_batch, test_lineno = tf.train.batch([test_image, test_label, test_lineInfo], batch_size=params.test_size, num_threads=params.num_threads, allow_smaller_final_batch=True)
    if params.loadSlice == 'all':
        return train_image_batch, train_label_batch, train_lineno, test_image_batch, test_label_batch, test_lineno
    elif params.loadSlice == 'train':
        return train_image_batch, train_label_batch
    elif params.loadSlice == 'test':
        return test_image_batch, test_label_batch
    elif params.loadSlice == 'train_info':
        return train_image_batch, train_label_batch, train_lineno
    elif params.loadSlice == 'test_info':
        return test_image_batch, test_label_batch, test_lineno
    else:
        return train_image_batch, train_label_batch, test_image_batch, test_label_batch
I want to use the same pipeline for loading the test data. The test set is huge and I cannot load all of it at once.
I have 20453 test examples, which is not an integer multiple of the batch size (here 512).
How can I read every test example via this pipeline exactly once and then measure the performance on them?
Currently I am using the following code for batching my test data, and it does not work: it always reads a full batch from the queue, even when I set allow_smaller_final_batch to True.
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    saver.restore(sess, "checkpoints2/snapshot-16")
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)
    more = True
    num_examples = 0
    while more:
        img_test, lbl_test, lbl_line = sess.run([test_image_batch, test_label_batch, test_lineno])
        print(lbl_test.shape)
        size = lbl_test.shape[0]
        num_examples += size
        if size < args.batch_size:
            more = False
    sess.close()
This is the code of my model:
from tflearn.layers.core import input_data, dropout, fully_connected
from tflearn.layers.conv import conv_2d, max_pool_2d
from tflearn.layers.normalization import local_response_normalization
from tflearn.layers.normalization import batch_normalization
from tflearn.layers.estimator import regression
from tflearn.activations import relu

def get_alexnet(x, num_output):
    network = conv_2d(x, 64, 11, strides=4)
    network = batch_normalization(network, epsilon=0.001)
    network = relu(network)
    network = max_pool_2d(network, 3, strides=2)
    network = conv_2d(network, 192, 5)
    network = batch_normalization(network, epsilon=0.001)
    network = relu(network)
    network = max_pool_2d(network, 3, strides=2)
    network = conv_2d(network, 384, 3)
    network = batch_normalization(network, epsilon=0.0001)
    network = relu(network)
    network = conv_2d(network, 256, 3)
    network = batch_normalization(network, epsilon=0.001)
    network = relu(network)
    network = conv_2d(network, 256, 3)
    network = batch_normalization(network, epsilon=0.001)
    network = relu(network)
    network = max_pool_2d(network, 3, strides=2)
    network = fully_connected(network, 4096)
    network = batch_normalization(network, epsilon=0.001)
    network = relu(network)
    network = dropout(network, 0.5)
    network = fully_connected(network, 4096)
    network = batch_normalization(network, epsilon=0.001)
    network = relu(network)
    network = dropout(network, 0.5)
    network1 = fully_connected(network, num_output)
    network2 = fully_connected(network, 12)
    network3 = fully_connected(network, 6)
    return network1, network2, network3
This can be achieved simply by setting num_epochs=1 on the input producer and allow_smaller_final_batch=True!
Another solution is to set batch_size to the size of the test set.
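A minimal sketch of the num_epochs approach, assuming the same queue-based pipeline as above (only the changed producer/batch lines and the evaluation loop are shown). With num_epochs=1 the producer serves each example exactly once and then raises tf.errors.OutOfRangeError, so the loop can simply drain the queue:

test_input_queue = tf.train.slice_input_producer(
    [test_images, test_labels, test_line],
    shuffle=False, num_epochs=1)  # each test example is served exactly once

# ... image decoding as above ...

test_image_batch, test_label_batch, test_lineno = tf.train.batch(
    [test_image, test_label, test_lineInfo],
    batch_size=params.test_size,
    num_threads=1,                   # single thread keeps ordering deterministic
    allow_smaller_final_batch=True)  # last batch may hold fewer examples

with tf.Session() as sess:
    # num_epochs is tracked in a local variable, so initialize locals as well
    sess.run([tf.global_variables_initializer(), tf.local_variables_initializer()])
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)
    num_examples = 0
    try:
        while not coord.should_stop():
            lbl_test = sess.run(test_label_batch)
            num_examples += lbl_test.shape[0]  # accumulate your metrics here too
    except tf.errors.OutOfRangeError:
        pass  # queue exhausted: exactly one full pass over the test set
    finally:
        coord.request_stop()
        coord.join(threads)
    print("evaluated %d examples" % num_examples)

With 20453 examples and batch_size=512, this yields 39 full batches plus a final batch of 485 (39*512 = 19968; 20453 - 19968 = 485).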

"socket.error: [Errno 98] Address already in use" in heterogeneous example on veins-lte

I'm trying to change the application type in the veins-lte "heterogeneous" example and I get the following error (in the SUMO log):
"socket.error: [Errno 98] Address already in use"
I have tried different traffic configurations in SUMO and different applications, but I always get the same error. I am able to run the example, but it stops after a few seconds without showing any errors in OMNeT++.
Here's my omnetpp.ini:
[General]
cmdenv-express-mode = true
cmdenv-autoflush = true
cmdenv-status-frequency = 10000000s
#tkenv-default-config = debug
#tkenv-default-run = 1
sim-time-limit = 30s
tkenv-image-path = bitmaps
ned-path = .
network = scenario
##########################################################
# Simulation parameters #
##########################################################
debug-on-errors = true
print-undisposed = false
**.scalar-recording = true
**.vector-recording = true
#record-eventlog = true
**.debug = false
**.coreDebug = false
*.playgroundSizeX = 20000m
*.playgroundSizeY = 20000m
*.playgroundSizeZ = 50m
##########################################################
# Annotation parameters #
##########################################################
*.annotations.draw = false
##########################################################
# Obstacle parameters #
##########################################################
*.obstacles.debug = false
##########################################################
# WorldUtility parameters #
##########################################################
*.world.useTorus = false
*.world.use2D = false
##########################################################
# TraCIScenarioManager parameters #
##########################################################
*.manager.updateInterval = 0.1s
*.manager.host = "localhost"
*.manager.port = 9999
*.manager.moduleType = "org.car2x.veins.modules.heterogeneous.HeterogeneousCar"
*.manager.moduleName = "node"
*.manager.moduleDisplayString = ""
*.manager.autoShutdown = true
*.manager.margin = 25
*.manager.launchConfig = xmldoc("heterogeneous.launchd.xml")
##########################################################
# 11p specific parameters #
# #
# NIC-Settings #
##########################################################
*.connectionManager.pMax = 20mW
*.connectionManager.sat = -89dBm
*.connectionManager.alpha = 2.0
*.connectionManager.carrierFrequency = 5.890e9 Hz
*.connectionManager.sendDirect = true
*.**.nic80211p.mac1609_4.useServiceChannel = false
*.**.nic80211p.mac1609_4.txPower = 20mW
*.**.nic80211p.mac1609_4.bitrate = 18Mbps
*.**.nic80211p.phy80211p.sensitivity = -89dBm
*.**.nic80211p.phy80211p.maxTXPower = 10mW
*.**.nic80211p.phy80211p.useThermalNoise = true
*.**.nic80211p.phy80211p.thermalNoise = -110dBm
*.**.nic80211p.phy80211p.decider = xmldoc("config.xml")
*.**.nic80211p.phy80211p.analogueModels = xmldoc("config.xml")
*.**.nic80211p.phy80211p.usePropagationDelay = true
##########################################################
# Mobility #
##########################################################
*.node[*].veinsmobilityType = "org.car2x.veins.modules.mobility.traci.TraCIMobility"
*.node[*].mobilityType = "TraCIMobility"
*.node[*].mobilityType.debug = true
*.node[*].veinsmobilityType.debug = true
*.node[*].veinsmobility.x = 0
*.node[*].veinsmobility.y = 0
*.node[*].veinsmobility.z = 1.895
*.node[*0].veinsmobility.accidentCount = 0
*.node[*0].veinsmobility.accidentStart = 75s
*.node[*0].veinsmobility.accidentDuration = 30s
###########################
# LTE specific parameters #
###########################
**.node[*].masterId = 1
**.node[*].macCellId = 1
**.eNodeB1.macCellId = 1
**.eNodeB1.macNodeId = 1
**.eNodeBCount = 1
**.configurator.config = xmldoc("topology-config.xml")
#*.server.numUdpApps = 1
#*.server.udpApp[0].typename = "SimpleServerApp"
#*.server.udpApp[0].localPort = 4242
#============= Application Setup =============
##########################################################
# WaveAppLayer #
##########################################################
*.node[*].applType = "UDPVideoStreamCli"
*.node[*].appl.serverAddress = "server" #
*.node[*].appl.localPort = 9999
*.node[*].appl.serverPort = 3088 #
*.node[*].appl.startTime = uniform(0s, 0.02s)
##########################################################
# RSU SETTINGS #
##########################################################
*.server.applType = "UDPVideoStreamSvr"
*.server.appl.videoSize = 10MiB
*.server.appl.localPort = 3088
*.server.appl.sendInterval = 20ms
*.server.appl.packetLen = ${packetLen = 1000B }
**.mtu = 10000B
##########################################################
# channel parameters #
##########################################################
**.channelControl.pMax = 10W
**.channelControl.alpha = 1.0
**.channelControl.carrierFrequency = 2100e+6Hz
################### RLC parameters #######################
#**.fragmentSize=75B
#**.timeout=50s
################### MAC parameters #######################
**.mac.queueSize = ${queue = 2MiB}
**.mac.maxBytesPerTti = ${maxBytesPerTti = 3MiB}
**.mac.macDelay.result-recording-modes = all
**.mac.macThroughput.result-recording-modes = all
# Schedulers
**.mac.schedulingDisciplineDl = ${scheduler = "MAXCI"} #MAXCI, DRR, PF
**.mac.schedulingDisciplineUl = ${scheduler}
################ PhyLayer parameters #####################
**.nic.phy.usePropagationDelay = true
**.nic.phy.channelModel=xmldoc("config_channel.xml")
################ Feedback parameters #####################
**.feedbackComputation = xmldoc("config_channel.xml")
# UEs
**.enableHandover = false
################# Deployer parameters #######################
# UEs attached to eNB
**.fbDelay = 1
# General
**.deployer.positionUpdateInterval = 0.1s
**.deployer.broadcastMessageInterval = 1s
# RUs
**.deployer.numRus = 0
**.deployer.ruRange = 50
**.deployer.ruTxPower = "50,50,50;"
**.deployer.ruStartingAngle = 0deg
**.deployer.antennaCws = "2;" # !!MACRO + RUS (numRus + 1)
# AMC
**.deployer.numRbDl = ${RB = 100}
**.deployer.numRbUl = ${RB}
**.deployer.rbyDl = 12
**.deployer.rbyUl = 12
**.deployer.rbxDl = 7
**.deployer.rbxUl = 7
**.deployer.rbPilotDl = 3
**.deployer.rbPilotUl = 0
**.deployer.signalDl = 1
**.deployer.signalUl = 1
**.deployer.numBands = 1
**.deployer.numPreferredBands = 1
############### AMC MODULE PARAMETERS ###############
**.rbAllocationType = "localized"
**.mac.amcMode = "AUTO"
**.feedbackType = "ALLBANDS"
**.feedbackGeneratorType = "IDEAL"
**.maxHarqRtx = 3
**.pfAlpha = 0.95
**.pfTmsAwareDL = false
**.numUe = ${numUEs=1000}
############### Transmission Power ##################
**.ueTxPower = 26
**.microTxPower = 20
**.eNodeBTxPower = 45
[Config nodebug]
description = "default settings"
**.debug = false
**.coreDebug = false
*.annotations.draw = false
[Config debug]
description = "(very slow!) draw and print additional debug information"
**.debug = true
**.coreDebug = true
*.annotations.draw = true
I'd appreciate any help; I really don't know how to solve this. Thanks in advance!
I get the following error (in the SUMO log):
"socket.error: [Errno 98] Address already in use"
If I am not mistaken, this is an error message from Python. It is not an OMNeT++ error, nor is it an error message that SUMO would output.
My guess is that you are getting the error message when you are trying to run sumo-launchd.py, the script that launches SUMO when needed by OMNeT++.
There are two possible reasons I can see:
You are trying to run two instances of sumo-launchd.py in parallel. This is not necessary. Having only one instance running is enough.
Some other program is using the same address and port (TCP port 9999). Whether you are running that program knowingly or it is malware, either shutting down the conflicting program or changing the sumo-launchd.py port number will help. See the sumo-launchd.py documentation for how to change its port number.
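To confirm the second case, here is a small diagnostic sketch in plain Python (the port number 9999 is taken from the *.manager.port setting in the omnetpp.ini above): it simply tries to bind the port before you start sumo-launchd.py:

# Quick check whether TCP port 9999 is already taken on this machine.
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    s.bind(("127.0.0.1", 9999))
    print("port 9999 is free")
except OSError as e:  # on Python 2 this is socket.error
    print("port 9999 is in use:", e)
finally:
    s.close()

If the bind fails, something is already listening there; find and stop it, or start sumo-launchd.py on another port and change *.manager.port to match.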
