Performance comparison between Keras/TF-Slim and native TensorFlow? - performance

I have a network and I am choosing the framework to implement it in: Keras, TF-Slim, or native TensorFlow.
My question is: does performance (accuracy) decrease when we use Keras/TF-Slim instead of native TensorFlow? I found that the runtime may be lower when we use Keras (Native TF vs Keras TF performance comparison), but what about accuracy?

There should not be any difference in accuracy when using Keras (TF backend) or TF-Slim compared to native TF, as they are just high-level libraries written on top of native TF.
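For example, a Keras Dense layer and the same computation written in native TF ops build essentially the same graph; a minimal sketch (assuming TF 1.x with tf.keras available, shapes chosen arbitrarily):

```python
import tensorflow as tf

x = tf.placeholder(tf.float32, shape=[None, 128])  # arbitrary input size

# Keras (TF backend): high-level wrapper
keras_out = tf.keras.layers.Dense(64, activation='relu')(x)

# Native TF: the same computation written out by hand
w = tf.get_variable('w', shape=[128, 64])
b = tf.get_variable('b', shape=[64], initializer=tf.zeros_initializer())
native_out = tf.nn.relu(tf.matmul(x, w) + b)
```

Both paths end up as the same underlying matmul/bias/ReLU ops, so any accuracy difference would come from hyperparameters or initialization, not from the wrapper itself.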

Related

Can deep learning TensorFlow models be trained in Python, pickled and used for prediction in C++?

I have trained a neural-network model using TensorFlow in Python.
I would like to store this model in a file and be able to load it into memory in a C++ program to run predictions later.
I am doing a comparative study of my machine learning model versus a standard algorithm written in C++. For this reason, I would like to load the model and do the prediction in C++ since I don't want the internals of the programming language to cause differences in the runtimes of the implementations.
Are there other ways to keep the comparisons language-neutral?
Yes. I think you can do it using Bazel (Google's build tool for TensorFlow). Once you make sure that you are saving checkpoints, build the project; it will create an executable that you can use from C++.
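The "saving checkpoints" part happens on the Python side; a minimal TF 1.x sketch (the model, variable names, and paths here are placeholders):

```python
import tensorflow as tf

# Placeholder model: replace with your trained network.
x = tf.placeholder(tf.float32, shape=[None, 10], name='input')
w = tf.get_variable('w', shape=[10, 1])
y = tf.matmul(x, w, name='output')

saver = tf.train.Saver()
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # ... training would go here ...
    saver.save(sess, './model.ckpt')                           # weights
    tf.train.write_graph(sess.graph_def, '.', 'model.pbtxt')   # graph structure
```

The C++ side then rebuilds the graph and restores the checkpoint (or loads a frozen GraphDef) through the TensorFlow C++ API, which is what the Bazel build gives you access to.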

tensorflow pow function is slow

I am using TensorFlow to accelerate a stiff chemistry solver. In the process, I often have to calculate tf.pow(a, b), where a is a tensor and b is a constant. During profiling, I found that tf.pow was quite slow, even slower than tf.exp. That surprised me, so I computed the power as tf.exp(tf.log(a)*b) and timed it. To my surprise, the exp-of-log version was twice as fast as tf.pow. Why is that? It was quite unexpected.
I should mention that my tensors are single-precision floats, and that I'm running on Windows with Python 3.6 and TF v1.5 on a CPU, installed from the pip whl file in a conda environment.
I believe TensorFlow's exp and pow operations call Eigen's implementations. It appears that Eigen uses SIMD instructions for exp but not for pow: https://eigen.tuxfamily.org/dox/group__CoeffwiseMathFunctions.html
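A rough way to reproduce the observation is to time both formulations on the same tensor; a micro-benchmark sketch (TF 1.x graph mode, single precision, size and exponent chosen arbitrarily):

```python
import time
import numpy as np
import tensorflow as tf

a = tf.constant(np.random.rand(10**6).astype(np.float32) + 0.1)
b = 2.7  # arbitrary non-integer exponent

pow_op = tf.pow(a, b)
exp_log_op = tf.exp(tf.log(a) * b)  # only valid for a > 0

with tf.Session() as sess:
    for name, op in [('tf.pow', pow_op), ('exp(log(a)*b)', exp_log_op)]:
        sess.run(op)  # warm-up
        t0 = time.time()
        for _ in range(100):
            sess.run(op)
        print(name, time.time() - t0)
```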

h2o deeplearning vs other packages(tensorflow, theano etc)

How does the h2o implementation of deep learning differ from TensorFlow or Theano? This tutorial shows the TensorFlow implementation in h2o.
And what is the use of TensorFlow in h2o, when h2o itself can find the minimum of the NN's gradient-descent problem?
h2o's deeplearning functionality is missing features that are present in the TensorFlow implementation. For example, h2o's deeplearning function does not currently support convolutional neural networks.
See the first slide in the presentation here: https://github.com/h2oai/deepwater/blob/master/README.md

TensorFlow: Does it only have SGD algorithms? or does it also have others like LBFGS

Question about TensorFlow:
I was looking at the video and the model on the site, and it appeared to have only SGD as the machine learning algorithm. I was wondering whether other algorithms, such as L-BFGS, are also included in TensorFlow.
Thank you for your responses.
In TensorFlow jargon, algorithms such as Stochastic Gradient Descent (SGD) are called optimizers. The following optimizers are supported by TensorFlow:
GradientDescentOptimizer
AdadeltaOptimizer
AdagradOptimizer
AdamOptimizer
FtrlOptimizer
MomentumOptimizer
RMSPropOptimizer
You can also use the TensorFlow SciPy optimizer interface, which gives you access to optimizers such as L-BFGS.
Further, here and here are all the available TensorFlow optimizers you could use.
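For example, the SciPy interface lives in tf.contrib.opt in TF 1.x; a minimal sketch of running L-BFGS on a toy quadratic loss (values chosen arbitrarily):

```python
import tensorflow as tf

x = tf.Variable([2.0, 3.0])
loss = tf.reduce_sum(tf.square(x - tf.constant([1.0, -1.0])))  # toy quadratic

# Wraps scipy.optimize.minimize; 'L-BFGS-B' is one of its supported methods.
optimizer = tf.contrib.opt.ScipyOptimizerInterface(
    loss, method='L-BFGS-B', options={'maxiter': 100})

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    optimizer.minimize(sess)
    print(sess.run(x))  # should converge close to [1, -1]
```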

Integrating ODEs on the GPU using boost and python

I posted here not too long ago about a model I am trying to build using PyCUDA, which solves about 9000 coupled ODEs. My model is too slow, however, and an SO member suggested that memory transfers from host to GPU are probably the culprit.
Right now CUDA is being used only to calculate the rate of change of each of the 9000 species I am dealing with. Since I am passing an array from the host to the GPU to perform this calculation, and returning an array from the GPU to integrate on the host, I can see how this would slow things down.
Would Boost be the solution to my problem? From what I have read, Boost allows interoperability between C++ and Python. It also includes the C++ odeint library, which, from what I read, partnered with Thrust allows quick reduction and integration entirely on the GPU. Is my understanding correct?
Thank you,
Karsten
Yes, boost.odeint and boost.python should solve your problem. You can use odeint with Thrust. There are also some OpenCL libraries (VexCL, ViennaCL) which might be easier to use than Thrust. Have a look at this paper for a comparison and for use cases of odeint on GPUs.
Boost.python can handle the communication between the C++ application and Python. Another approach would be a very slim command-line application that solves the ODE (using boost.odeint) and is entirely controlled by your Python application.
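The second approach is straightforward to sketch from the Python side: the C++ solver (built around boost.odeint, with or without Thrust) becomes a standalone executable that Python drives. The executable name, flags, and file formats below are hypothetical:

```python
import subprocess
import numpy as np

# Hypothetical slim C++ command-line solver built around boost.odeint.
# Names, flags, and file formats are placeholders.
np.savetxt('state_in.txt', np.random.rand(9000))  # initial concentrations

subprocess.run(['./ode_solver',
                '--input', 'state_in.txt',
                '--output', 'state_out.txt',
                '--t-end', '1.0'],
               check=True)

result = np.loadtxt('state_out.txt')  # integrated state computed on the GPU
```

This keeps the whole integration inside the C++/GPU process, so Python only touches the data at the start and end instead of transferring arrays on every step.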
