Tensorflow debug or print statements


I am very new to TensorFlow and trying to learn it. I copied a program from a tutorial website, modified it, and now have issues that I need to debug. How can I print values such as cost and optimizer, so that I can see them being updated in each iteration? I understand that nodes cannot be printed directly, but I assume that cost and optimizer are inputs, which should be printable, right?
plt.ion()
n_observations = 100
xs = np.linspace(-3, 3, n_observations)
ys = np.sin(xs) + np.random.uniform(-0.5, 0.5, n_observations)
X = tf.placeholder(tf.float32)
Y = tf.placeholder(tf.float32)
Y_pred = tf.Variable(tf.random_normal([1]), name='bias')
for pow_i in range(1, 5):
    W = tf.Variable(tf.random_normal([1]), name='weight_%d' % pow_i)
    Y_pred = tf.add(tf.multiply(tf.pow(X, pow_i), W), Y_pred)
cost = tf.reduce_sum(tf.pow(Y_pred - Y, 2)) / (n_observations - 1)
d = tf.Print(cost, [cost, 2.0], message="Value of cost id:")
learning_rate = 0.01
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
n_epochs = 10
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    prev_training_cost = 0.0
    for epoch_i in range(n_epochs):
        for (x, y) in zip(xs, ys):
            print("Msg2 x, y ", x, y, cost);
            sess.run(optimizer, feed_dict={X: x, Y: y})
            sess.run(d)
            print("Msg3 x, y ttt ", x, y, optimizer);
        training_cost = sess.run(
            cost, feed_dict={X: xs, Y: ys})
        print(training_cost)
        print("Msg3 cost, xs ys", cost, xs, ys);
        if epoch_i % 100 == 0:
            ax.plot(xs, Y_pred.eval(
                feed_dict={X: xs}, session=sess),
                'k', alpha=epoch_i / n_epochs)
            fig.show()
            #plt.draw()
        # Allow the training to quit if we've reached a minimum
        if np.abs(prev_training_cost - training_cost) < 0.001:
            break
        prev_training_cost = training_cost
ax.set_ylim([-3, 3])
fig.show()
plt.waitforbuttonpress()

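In your example, cost and optimizer are nodes in the graph, not inputs to it: cost is a tensor, and optimizer is the training operation returned by minimize(). To print their Python values, they need to be fetched in a session.run call. For example, training_cost is the fetched value of cost, so print(training_cost) already prints the cost, whereas print(cost) only prints the symbolic tensor. Fetching optimizer runs one training step but returns None, because an operation has no value; to watch the progress, fetch cost in the same call, e.g. session.run([optimizer, cost], feed_dict=...). Note also that sess.run(d) needs the same feed_dict, since d depends on the placeholders X and Y.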
If you are interested in debugging and printing values check out:
tfdbg
tf.Print
Hope that helps!

Related

solving of equations by the Euler method in Prolog

I wrote this mini pseudo code that solves the equation using the Euler method:
// y'= x^2 - 5y
int n = 10; // count of steps
double h = 0.5; // step
double x = 0;
double y = 1;
for (int i = 0; i < n; i++) {
    y += h * (x * x - 5 * y);
    x += h;
}
write << y; //result
Now I am trying to write this in Prolog language
loop2(N,H,X,Y) :-
    N > 0,
    write(Y), nl,
    Y is Y + H * (X * X - 5 * Y),
    X is X + H,
    S is N - 1,
    loop2(S, H, X, Y).
Here I solved the example on a piece of paper and should get 62.5
But in Prolog my Output =
?- loop2(10, 0.5, 0, 1).
1
false.

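Prolog variables cannot be reassigned once they are bound. With Y = 1, H = 0.5 and X = 0, the goal Y is Y + H * (X * X - 5 * Y) becomes 1 is -1.5, which is false, so case closed as far as Prolog is concerned: it has found that your code is logically false and reports that to you. (X is X + H would fail the same way, since 0 is 0 + 0.5 is not true either.) You told it to write(Y) before that, so you still see 1 printed.
You need to use new variable names for the results of the calculations, as you did with S is N - 1, e.g. Ynext is Y + H * (X * X - 5 * Y) and Xnext is X + H, and pass those to the recursive call.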
The way you have shaped the countdown, S will eventually become 0 and then N > 0 will be false and the whole thing will fail then. You can probably get away with using this to write the values before it eventually fails, but a more normal way to end recursion is to have
loop2(0,_,_,_).
loop2(N,H,X,Y) :-
...
This says that when loop2 is called, the first clause is tried first, and if the first argument is 0 it succeeds. That ends the recursion, and the whole query succeeds.
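The two fixes described above (fresh names for the updated values, and a base case that ends the recursion) can be mirrored in a short Python sketch. This is not Prolog; the function name euler and the names x_next and y_next are illustrative assumptions, chosen only to show the same shape of recursion.

```python
def euler(n, h, x, y):
    """Euler steps for y' = x^2 - 5y, written recursively."""
    # Base case: no steps left, so the current y is the result
    # (this plays the role of the loop2(0,_,_,_) clause).
    if n == 0:
        return y
    # Bind the updated values to NEW names instead of reusing x and y.
    # y_next uses the old x, as in the original loop.
    y_next = y + h * (x * x - 5 * y)
    x_next = x + h
    return euler(n - 1, h, x_next, y_next)


result = euler(10, 0.5, 0.0, 1.0)
print(result)  # about 62.49, matching the hand calculation of roughly 62.5
```

In Prolog the result would be returned through an extra argument rather than a return value, since predicates do not return values.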

Pytorch: Memory Efficient weighted sum with weights shared along channels

Inputs:
1) I = Tensor of dim (N, C, X) (Input)
2) W = Tensor of dim (N, X, Y) (Weight)
Output:
1) O = Tensor of dim (N, C, Y) (Output)
I want to compute:
I = I.view(N, C, X, 1)
W = W.view(N, 1, X, Y)
PROD = I*W
O = PROD.sum(dim=2)
return O
without incurring N * C * X * Y memory overhead.
Basically I want to calculate the weighted sum of a feature map wherein the weights are the same along the channel dimension, without incurring memory overhead per channel.
Maybe I could use
from itertools import product
O = torch.zeros(N, C, Y)
for n, x, y in product(range(N), range(X), range(Y)):
    O[n, :, y] += I[n, :, x]*W[n, x, y]
return O
but that would be slower (no broadcasting) and I'm not sure how much memory overhead would be incurred by saving variables for the backward pass.

You can use torch.bmm (https://pytorch.org/docs/stable/torch.html#torch.bmm). Just do torch.bmm(I,W)
To verify the results :
import torch
N, C, X, Y= 100, 10, 9, 8
i = torch.rand(N,C,X)
w = torch.rand(N,X,Y)
o = torch.bmm(i,w)
# desired result code
I = i.view(N, C, X, 1)
W = w.view(N, 1, X, Y)
PROD = I*W
O = PROD.sum(dim=2)
print(torch.allclose(O,o)) # should output True if outputs are same.
EDIT: I would expect PyTorch's internal matrix multiplication to be efficient. You can also try measuring the memory usage with tracemalloc on CPU, though it only tracks allocations made through Python's memory allocator, so tensor storage allocated natively by PyTorch may not show up in its numbers. For GPU, see https://discuss.pytorch.org/t/measuring-peak-memory-usage-tracemalloc-for-pytorch/34067.
import torch
import tracemalloc
tracemalloc.start()
N, C, X, Y= 100, 10, 9, 8
i = torch.rand(N,C,X)
w = torch.rand(N,X,Y)
o = torch.bmm(i,w)
# output is a tuple indicating current memory and peak memory
print(tracemalloc.get_traced_memory())
You can do the same with other code and see the bmm implementation is indeed efficient.
import torch
import tracemalloc
tracemalloc.start()
N, C, X, Y= 100, 10, 9, 8
i = torch.rand(N,C,X)
w = torch.rand(N,X,Y)
I = i.view(N, C, X, 1)
W = w.view(N, 1, X, Y)
PROD = I*W
O = PROD.sum(dim=2)
# output is a tuple indicating current memory and peak memory
print(tracemalloc.get_traced_memory())
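As a back-of-the-envelope complement to the measurements, the size of the broadcast intermediate can be compared with the size of the output by simple arithmetic. The sketch below assumes float32 tensors (4 bytes per element, the default dtype of torch.rand) and counts only the tensor data, ignoring any allocator overhead.

```python
# Same sizes as in the snippets above.
N, C, X, Y = 100, 10, 9, 8
BYTES_PER_FLOAT32 = 4  # assumption: float32 elements

# The broadcast product I*W materializes a tensor of shape (N, C, X, Y).
prod_elems = N * C * X * Y
# The final output (and the result of torch.bmm) has shape (N, C, Y).
out_elems = N * C * Y

print("intermediate:", prod_elems, "elements,", prod_elems * BYTES_PER_FLOAT32, "bytes")
print("output:      ", out_elems, "elements,", out_elems * BYTES_PER_FLOAT32, "bytes")
print("ratio:", prod_elems // out_elems)  # equals X, the summed-over dimension
```

The intermediate is X times larger than the output, which is the overhead the question wants to avoid.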

Avoid Numpy Index For loop

Is there any way to avoid using a second for loop for an operation like this?
for x in range(Size_1):
    for y in range(Size_2):
        k[x,y] = np.sqrt(x+y) - y
Or is there a better way to optimize this? Right now it is incredibly slow for large sizes.

Here's a vectorized solution with broadcasting -
X,Y = np.ogrid[:Size_1,:Size_2]
k_out = np.sqrt(X+Y) - Y
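To see why this works, it may help to look at the shapes involved and to check the result against the original double loop. The sketch below uses small, arbitrarily chosen sizes (size_1 and size_2 are illustrative stand-ins for Size_1 and Size_2).

```python
import numpy as np

size_1, size_2 = 4, 5  # small illustrative sizes (assumed)

# Reference result from the original double loop.
k_loop = np.empty((size_1, size_2))
for x in range(size_1):
    for y in range(size_2):
        k_loop[x, y] = np.sqrt(x + y) - y

# np.ogrid returns "open" grids: X is a column, Y is a row.
X, Y = np.ogrid[:size_1, :size_2]
print(X.shape, Y.shape)  # (4, 1) (1, 5)

# Broadcasting expands X + Y to shape (size_1, size_2) without explicit loops.
k_vec = np.sqrt(X + Y) - Y
print(k_vec.shape)                 # (4, 5)
print(np.allclose(k_loop, k_vec))  # True
```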

Supplementing Divakar's solution: If Y and X are not new ranges but some preexisting vectors of numbers, use np.ix_:
Y, X = np.array([[1.3, 3.5, 2], [2.0, -1, 1]])
Y, X = np.ix_(Y, X) # does the same as Y = Y[:, None]; X = X[None, :]
out = np.sqrt(Y+X) - X

how to calculate a quadratic equation that best fits a set of given data

I have a vector X of 20 real numbers and a vector Y of 20 real numbers.
I want to model them as
y = ax^2+bx + c
How do I find the values of a, b and c that give the best-fit quadratic equation?
Given values:
X = (x1,x2,...,x20)
Y = (y1,y2,...,y20)
I need a formula or procedure to find the following values:
a = ???
b = ???
c = ???
Thanks in advance.

Everything @Bartoss said is right, +1. I figured I'd just add a practical implementation here, without QR decomposition. You want to find the values of a, b, c such that the distance between the measured and fitted data is minimal. As a measure you can pick the sum of squared residuals
sum((ax^2+bx + c -y)^2)
where the sum is over the elements of vectors x,y.
Then, a minimum implies that the derivative of this quantity with respect to each of a,b,c is zero:
d (sum((ax^2+bx + c -y)^2)) /da =0
d (sum((ax^2+bx + c -y)^2)) /db =0
d (sum((ax^2+bx + c -y)^2)) /dc =0
these equations are
2*sum((ax^2+bx + c -y)*x^2) =0
2*sum((ax^2+bx + c -y)*x) =0
2*sum(ax^2+bx + c -y) =0
Dividing by 2, the above can be rewritten as
a*sum(x^4) +b*sum(x^3) + c*sum(x^2) =sum(y*x^2)
a*sum(x^3) +b*sum(x^2) + c*sum(x) =sum(y*x)
a*sum(x^2) +b*sum(x) + c*N =sum(y)
where N=20 in your case. A simple code in python showing how to do so follows.
from numpy import random, array
from scipy.linalg import solve
import matplotlib.pylab as plt
a, b, c = 6., 3., 4.
N = 20
x = random.rand((N))
y = a * x ** 2 + b * x + c
y += random.rand((20)) #add a bit of noise to make things more realistic
x4 = (x ** 4).sum()
x3 = (x ** 3).sum()
x2 = (x ** 2).sum()
M = array([[x4, x3, x2], [x3, x2, x.sum()], [x2, x.sum(), N]])
K = array([(y * x ** 2).sum(), (y * x).sum(), y.sum()])
A, B, C = solve(M, K)
print('exact values ', a, b, c)
print('calculated values', A, B, C)
fig, ax = plt.subplots()
ax.plot(x, y, 'b.', label='data')
ax.plot(x, A * x ** 2 + B * x + C, 'r.', label='estimate')
ax.legend()
plt.show()
A quicker way to implement the solution is to use a nonlinear least squares algorithm. This will be faster to write, but not faster to run. Using the one provided by scipy,
from scipy.optimize import leastsq
def f(arg):
    a,b,c=arg
    return a*x**2+b*x+c-y
(A,B,C),_=leastsq(f,[1,1,1]) # you must provide a first guess to start with in this case.

That is a linear least squares problem. I think the easiest method which gives accurate results is QR decomposition using Householder reflections. It is not something that can be explained in a Stack Overflow answer, but I hope you will find all that is needed with these links.
If you have never heard of these before and don't know how they connect with your problem:
A = [[x1^2, x1, 1]; [x2^2, x2, 1]; ...]
Y = [y1; y2; ...]
Now you want to find v = [a; b; c] such that A*v is as close as possible to Y, which is exactly what least squares problem is all about.
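The matrix formulation above can be tried out with NumPy's built-in QR factorization. The following is a minimal sketch, not a hand-written Householder implementation: it relies on np.linalg.qr, and the data are hypothetical noise-free samples of y = 6x^2 + 3x + 4, chosen so that the recovered coefficients are easy to check.

```python
import numpy as np

# Hypothetical data: 20 noise-free samples of y = 6x^2 + 3x + 4.
x = np.linspace(0.0, 1.0, 20)
y = 6.0 * x**2 + 3.0 * x + 4.0

# Design matrix with rows [x_i^2, x_i, 1], as in the answer.
A = np.column_stack([x**2, x, np.ones_like(x)])

# Reduced QR factorization: A = Q R, with Q of shape (20, 3)
# having orthonormal columns and R of shape (3, 3) upper triangular.
Q, R = np.linalg.qr(A)

# The least-squares solution v = [a, b, c] satisfies R v = Q^T y.
a, b, c = np.linalg.solve(R, Q.T @ y)
print(a, b, c)  # close to 6, 3, 4
```

With noisy data the same three lines give the best-fit coefficients instead of the exact ones.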

Given an increasing polynomial, how do you efficiently find x values for fixed intervals of y?

Problem: Given a polynomial of degree n (with coefficients a_0 through a_{n-1}) that is guaranteed to be increasing from x = 0 to x_max, what is the most efficient algorithm to find the first m points with equally-spaced y values (i.e. y_i - y_{i-1} == c, for all i)?
Example: If I want the spacing to be c = 1, and my polynomial is f(x) = x^2, then the first three points would be at y=1 (x=1), y=2 (x~=1.4142), and y=3 (x~=1.7321).
I'm not sure if it will be significant, but my specific problem involves the cube of a polynomial with given coefficients. My intuition tells me that the most efficient solution should be the same, but I'm not sure.
I'm encountering this working through the problems in the ACM's problem set for the 2012 World Finals (problem B), so this is mostly because I'm curious.
Edit: I'm not sure if this should go on the Math SE?

You can find an X for a given Y using a binary search. Its running time is logarithmic in the size of the range of x values divided by your error tolerance.
def solveForX(polyFunc, minX, maxX, y, epsilon):
    midX = (minX + maxX) / 2.0
    if abs(polyFunc(midX) - y) < epsilon:
        return midX
    if polyFunc(midX) > y:
        return solveForX(polyFunc, minX, midX, y, epsilon)
    else:
        return solveForX(polyFunc, midX, maxX, y, epsilon)
print(solveForX(lambda x: x*x, 0, 100, 2, 0.01))
output:
1.416015625
Edit: to expand on an idea in the comments, if you know you will be searching for multiple X values, it's possible to narrow down the [minX, maxX] search range.
def solveForManyXs(polyFunc, minX, maxX, ys, epsilon):
    if len(ys) == 0:
        return []
    midIdx = len(ys) // 2
    midY = ys[midIdx]
    midX = solveForX(polyFunc, minX, maxX, midY, epsilon)
    lowYs = ys[:midIdx]
    highYs = ys[midIdx+1:]
    return solveForManyXs(polyFunc, minX, midX, lowYs, epsilon) + \
        [midX] + \
        solveForManyXs(polyFunc, midX, maxX, highYs, epsilon)
ys = [1, 2, 3]
print(solveForManyXs(lambda x: x*x, 0, 100, ys, 0.001))
output:
[1.0000884532928467, 1.41448974609375, 1.7318960977718234]
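The narrowing idea can also be applied sequentially: since the polynomial is increasing and the y targets are sorted, each solution can serve as the lower bound for the next search. The following is a minimal sketch under those assumptions. The names solve_for_x and solve_for_many_xs, and the choice of a tolerance on x rather than on y, are illustrative and not part of the answer above.

```python
import math


def solve_for_x(poly, lo, hi, y, tol=1e-9):
    """Bisection for an increasing function: find x in [lo, hi] with poly(x) close to y."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if poly(mid) < y:
            lo = mid  # root lies in the upper half
        else:
            hi = mid  # root lies in the lower half
    return (lo + hi) / 2.0


def solve_for_many_xs(poly, lo, hi, ys, tol=1e-9):
    """Solve for each y in ascending order, reusing the previous x as the new lower bound."""
    xs = []
    for y in ys:  # assumption: ys is sorted in ascending order
        x = solve_for_x(poly, lo, hi, y, tol)
        xs.append(x)
        # Step back by tol so the next root cannot fall just below the new bound.
        lo = max(lo, x - tol)
    return xs


xs = solve_for_many_xs(lambda x: x * x, 0.0, 100.0, [1, 2, 3])
print(xs)  # close to [1, sqrt(2), sqrt(3)]
```

This uses a loop instead of recursion, so the depth of the search is not limited by Python's recursion limit.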
