Anaconda/IntelPython doesn't seem to offload to Xeon Phi

>>> import numpy
>>> numpy.show_config()
mkl_info:
libraries = ['mkl_rt', 'pthread']
library_dirs = ['/home/steph/anaconda3/envs/intel_py/lib']
define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
include_dirs = ['/home/steph/anaconda3/envs/intel_py/include']
blas_mkl_info:
libraries = ['mkl_rt', 'pthread']
library_dirs = ['/home/steph/anaconda3/envs/intel_py/lib']
define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
include_dirs = ['/home/steph/anaconda3/envs/intel_py/include']
blas_opt_info:
libraries = ['mkl_rt', 'pthread']
library_dirs = ['/home/steph/anaconda3/envs/intel_py/lib']
define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
include_dirs = ['/home/steph/anaconda3/envs/intel_py/include']
lapack_mkl_info:
libraries = ['mkl_rt', 'pthread']
library_dirs = ['/home/steph/anaconda3/envs/intel_py/lib']
define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
include_dirs = ['/home/steph/anaconda3/envs/intel_py/include']
lapack_opt_info:
libraries = ['mkl_rt', 'pthread']
library_dirs = ['/home/steph/anaconda3/envs/intel_py/lib']
define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
include_dirs = ['/home/steph/anaconda3/envs/intel_py/include']
>>>
This seems to indicate that I have the correct libraries.
My env settings are:
export MKL_MIC_ENABLE=1
export OFFLOAD_DEVICES=1,2
export OFFLOAD_ENABLE_ORSL=1
export MKL_HOST_WORKDIVISION=0,2
export MKL_MIC_WORKDIVISION=1
export MKL_MIC_1_WORKDIVISION=0.9
export MKL_MIC_2_WORKDIVISION=0.9
#export MKL_MIC_MAX_MEMORY=<value>
#export MKL_MIC_<number>_MAX_MEMORY=<value>
#For example: export MKL_MIC_0_MAX_MEMORY=2G
export MKL_MIC_REGISTER_MEMORY=1
#export MKL_MIC_RESOURCE_LIMIT=<value>
#For example: export MKL_MIC_RESOURCE_LIMIT=0.34
#export MIC_OMP_NUM_THREADS=<value>
#export MIC_<number>_OMP_NUM_THREADS=<value>
#For example: export MIC_0_OMP_NUM_THREADS=240
export OFFLOAD_REPORT=2
#For example: export OFFLOAD_REPORT=2
#export LD_LIBRARY_PATH="/opt/intel/mic/coi/host-linux-release/lib:${LD_LIBRARY_PATH}"
#export MIC_LD_LIBRARY_PATH="/opt/intel/mic/coi/device-linux-release/lib:${MKLROOT}/lib/mic:${MIC_LD_LIBRARY_PATH}"
#export MKL_MIC_THRESHOLDS_?GEMM="<N>,<M>,<K>"
#For example: export MKL_MIC_THRESHOLDS_?GEMM="2000,1000,500"
export OMP_NUM_THREADS=16
export MIC_OMP_NUM_THREADS=236
export KMP_AFFINITY=granularity=fine,compact,1,0
export MIC_KMP_AFFINITY=explicit,granularity=fine,proclist=[1-236:1]
export MIC_ENV_PREFIX=MIC_
Yet when I run fft.py, micsmc shows no activity on the Phis.
There is also no offload report.
Any idea what I'm doing wrong?
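A minimal sketch for sanity-checking the settings above. It assumes the *_WORKDIVISION variables are meant to hold fractions between 0.0 and 1.0, as the 0.9 values suggest; under that assumption a value such as 0,2 would be flagged. The function name invalid_workdivisions is made up for this illustration, and a clean result does not by itself explain why nothing is offloaded.

```python
import os


def invalid_workdivisions(env=None):
    """Return the names of *_WORKDIVISION variables whose value is not
    a number between 0.0 and 1.0 (assumed valid range, e.g. "0,2" fails)."""
    env = os.environ if env is None else env
    bad = []
    for name, value in sorted(env.items()):
        if not name.endswith("_WORKDIVISION"):
            continue
        try:
            fraction = float(value)
        except ValueError:
            # Not parseable as a number at all, e.g. a comma instead of a dot.
            bad.append(name)
            continue
        if not 0.0 <= fraction <= 1.0:
            bad.append(name)
    return bad


if __name__ == "__main__":
    # Check the current shell environment and report anything suspicious.
    for name in invalid_workdivisions():
        print(f"suspicious value: {name}={os.environ[name]}")
```

Running it in the same shell that exported the variables prints one line per value that does not parse as a fraction.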

Related

Can I use environment variables in paths to crates in `Cargo.toml`?

Is it possible to use environment variables in a Cargo.toml file?
Like so:
[package]
name = "tmp-xzxgxn"
version = "0.1.0"
edition = "2021"
[dependencies]
# bevy = {branch = "main", git = "https://github.com/bevyengine/bevy.git"}
# bevy = { path = "~/Documents/GitHub/bevy" }
bevy = { path = "%USERPROFILE%/Documents/GitHub/bevy" }
Note that ~ doesn't work either.

How to combine two shell.nix files?

I have the first shell.nix file:
{ pkgs ? import ./nix { }
, useClang ? false
, ae_name ? "ae"
}:
with pkgs;
(if useClang then tvb.aeClangStdenv else tvb.aeGccStdenv).mkDerivation rec {
  name = ae_name;
  nativeBuildInputs = tvb.cppNativeBuildInputs;
  buildInputs = tvb.rustBuildInputs
    ++ tvb.cppBuildInputs
    ++ tvb.rBuildInputs
  ;
  TZDIR = "${tzdata}/share/zoneinfo";
  LOCALE_ARCHIVE = "${glibcLocales}/lib/locale/locale-archive";
  out_dir = (toString ./out);
  cur_dir = (toString ./.);
  shellHook = ''
    export PS1="\[\033[38;5;10m\]\u#\h[${name} nix-shell]\[$(tput sgr0)\]\[\033[38;5;15m\]:\[$(tput sgr0)\]\[\033[38;5;39m\]\w\[$(tput sgr0)\]\\$\[$(tput sgr0)\] \[$(tput sgr0)\]"
    # for tools only bin paths are needed
    for prog in ${toString tvb.shellTools}; do
      export PATH="$prog/bin:$PATH"
    done
    export DEPLOY_CFG=${cur_dir}/.deploy.json
    export LD_LIBRARY_PATH="${out_dir}/lib:${fts5-snowball}/lib"
    export PATH="${cur_dir}/lua/bin:${out_dir}/bin:$PATH"
    export AE_SHARE="${out_dir}/share/ae"
    export AE_LIBEXEC="${out_dir}/libexec/ae"
    ## LuaJIT
    export LUA_PATH="$LUA_PATH;${cur_dir}/lua/lib/?.lua;${out_dir}/share/lua/5.1/?.lua;;"
    export LUA_CPATH="$LUA_CPATH;${out_dir}/lib/lua/5.1/?.so;;"
    ## Lua box
    export LUABOX_UNIT_PATH="${out_dir}/share/ae/box/units/?.lua;"
    ## Python
    export PYTHONPATH="${out_dir}/lib/python2.7:$PYTHONPATH"
  '';
}
and I have the second shell.nix file:
let
  jupyter = import (builtins.fetchGit {
    url = https://github.com/tweag/jupyterWith;
    rev = "37cd8caefd951eaee65d9142544aa4bd9dfac54f";
  }) {};
  iPython = jupyter.kernels.iPythonWith {
    name = "python";
    packages = p: with p; [ numpy ];
  };
  iHaskell = jupyter.kernels.iHaskellWith {
    extraIHaskellFlags = "--codemirror Haskell"; # for jupyterlab syntax highlighting
    name = "haskell";
    packages = p: with p; [ hvega formatting ];
  };
  jupyterEnvironment =
    jupyter.jupyterlabWith {
      kernels = [ iPython iHaskell ];
    };
in
  jupyterEnvironment.env
Firstly, I tried to append the second to the first one, but then I received the following error:
jbezdek#ubuntu:~$ nix-shell
error: syntax error, unexpected ID, expecting '{', at /home/jbezdek/shell.nix:51:3
After that, I tried many other ways of combining the two files, but none of them worked. Could you help me with that, please?
Merging two shell.nix files in full generality is tricky, and unlikely to be solved off the shelf.
To solve the case in point, I think you will just have to dig into the Nix expression language a little more and write a syntactically valid .nix file that contains the content of both files. Something along these lines could work:
{ pkgs ? import ./nix { }
, useClang ? false
, ae_name ? "ae"
}:
with pkgs;
let
  jupyter = import (builtins.fetchGit { ... })
  ...
  jupyterEnvironment = ...
in
{
  first_file = (if useClang ...).mkDerivation rec {
    name = ae_name;
    ...
  };
  second_file = jupyterEnvironment.env;
}

cx_freeze tkinter application not working with multiprocessing in Windows

I created a basic Tkinter GUI with a single button. Clicking it is supposed to use multiprocessing to open 5 Chrome instances of "https://www.google.com". When I compile the script with cx_freeze and run the resulting exe, the button does nothing. Here is the code:
main.py
from tkinter import *
from selenium import webdriver
import os, sys, multiprocessing  # sys is needed for sys.argv below
import numpy as np

def func():
    print("bot started")
    home = "https://www.google.com"
    chromeOptions = webdriver.ChromeOptions()
    user_txt_path = []
    user_txt_path.append('\\chromedriver.exe')
    # chromedriver.exe is expected next to the script / executable
    filename = os.path.dirname(sys.argv[0])
    path_to_chromedriver = str(os.path.abspath(filename) + user_txt_path[0])
    driver = webdriver.Chrome(chrome_options=chromeOptions, executable_path=path_to_chromedriver) #DRIVER DRIVER!
    driver.get_cookies()
    driver.get(home)

def callback():
    # Start one process per browser instance
    num = 5
    for n in np.arange(num):
        p = multiprocessing.Process(target=func)
        p.start()

master = Tk()
b = Button(master, text="OK", command=callback)
b.pack()
mainloop()
setup.py
from cx_Freeze import setup, Executable
import sys
import os
from pathlib import Path
user_txt_path = []
base = "Console"#"Win32GUI"
user_txt_path.append('chromedriver.exe')
import os.path
PYTHON_INSTALL_DIR = os.path.dirname(os.path.dirname(os.__file__))
os.environ['TCL_LIBRARY'] = os.path.join(PYTHON_INSTALL_DIR, 'tcl', 'tcl8.6')
os.environ['TK_LIBRARY'] = os.path.join(PYTHON_INSTALL_DIR, 'tcl', 'tk8.6')
buildOptions = {"include_files": [user_txt_path[0], 'tcl86t.dll', 'tk86t.dll'], "packages": ['encodings', "os"], "includes": ['numpy.core._methods', 'numpy.lib.format'], "excludes": []}
executables = [
Executable('main.py', base=base)
]
home_path = str(Path.home())
pathname = str(os.path.dirname(sys.argv[0]))
setup(name='numbers',
version = '1.0',
#description = 'test',
#shortcutName="TachySloth",
#shortcutDir="DesktopFolder",
options = dict(build_exe = buildOptions),
executables = executables
)
Note that all of the include_files need to be in the same directory as main.py.
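One thing generally required when a multiprocessing program is frozen into a Windows executable is a call to multiprocessing.freeze_support() under an if __name__ == "__main__": guard. The sketch below shows only that pattern. The names worker and run_workers are hypothetical stand-ins for func and callback above, and whether this alone makes the button work depends on the rest of the build.

```python
import multiprocessing


def worker(n, queue):
    # Stand-in for the real job (e.g. launching a browser); hypothetical.
    queue.put(n * n)


def run_workers(count):
    """Start `count` processes and collect one result from each."""
    queue = multiprocessing.Queue()
    procs = [multiprocessing.Process(target=worker, args=(i, queue))
             for i in range(count)]
    for p in procs:
        p.start()
    # Drain the queue before joining so no child blocks on a full pipe.
    results = sorted(queue.get() for _ in procs)
    for p in procs:
        p.join()
    return results


if __name__ == "__main__":
    # Needed in frozen Windows executables; has no effect otherwise.
    multiprocessing.freeze_support()
    print(run_workers(5))
```

In a GUI program, the Tk setup and mainloop() call would go inside the same guard, after freeze_support(), so that child processes do not re-run them when the module is imported again.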

Tensorflow: How to take advantage of multi GPUs?

I have a CNN which runs well on 1 GPU. I have now moved to another computer with 2 GPUs, and I would like to train my network on both GPUs to save time. How can I do that?
I read https://www.tensorflow.org/tutorials/using_gpu, but I think the example is too simple, and honestly I don't know how to apply it to my real network.
Could anyone give me a simple illustration on my network please? (I'm doing AutoEncoder).
Thank you very much!
graphCNN = tf.Graph()
with graphCNN.as_default():
    # Input
    x = tf.placeholder(tf.float32, shape=(None, img_w, img_h,img_ch), name="X") # X
    # Output expected
    y_ = tf.placeholder(tf.float32, shape=(None, img_w, img_h,img_ch), name="Y") # Y_
    # Dropout
    dropout = tf.placeholder(tf.float32)
    ### Model
    def model(data):
        ### Encoder
        c64 = ConvLayer(data, depth_in=1, depth_out=64, name="c64", kernel_size=3, acti=True)
        c128 = ConvLayer(c64, depth_in=64, depth_out=128, name="c128", kernel_size=3, acti=True)
        c256 = ConvLayer(c128, depth_in=128, depth_out=256, name="c256", kernel_size=3, acti=True)
        c512_1 = ConvLayer(c256, depth_in=256, depth_out=512, name="c512_1", kernel_size=3, acti=True)
        c512_2 = ConvLayer(c512_1, depth_in=512, depth_out=512, name="c512_2", kernel_size=3, acti=True)
        c512_3 = ConvLayer(c512_2, depth_in=512, depth_out=512, name="c512_3", kernel_size=3, acti=True)
        c512_4 = ConvLayer(c512_3, depth_in=512, depth_out=512, name="c512_4", kernel_size=3, acti=True)
        c512_5 = ConvLayer(c512_4, depth_in=512, depth_out=512, name="c512_5", kernel_size=3, acti=True)
        ### Decoder
        dc512_5 = DeconvLayer(c512_5, depth_in=512, depth_out=512, name="dc512_5", kernel_size=3, acti=True)
        dc512_4 = DeconvLayer(dc512_5, depth_in=512, depth_out=512, name="dc512_4", kernel_size=3, acti=True)
        dc512_3 = DeconvLayer(dc512_4, depth_in=512, depth_out=512, name="dc512_3", kernel_size=3, acti=True)
        dc512_2 = DeconvLayer(dc512_3, depth_in=512, depth_out=512, name="dc512_2", kernel_size=3, acti=True)
        dc512_1 = DeconvLayer(dc512_2, depth_in=512, depth_out=512, name="dc512_1", kernel_size=3, acti=True)
        dc256 = DeconvLayer(dc512_1, depth_in=512, depth_out=256, name="dc256", kernel_size=3, acti=True)
        dc128 = DeconvLayer(dc256, depth_in=256, depth_out=128, name="dc128", kernel_size=3, acti=True)
        dc64 = DeconvLayer(dc128, depth_in=128, depth_out=64, name="dc64", kernel_size=3, acti=True)
        output = ConvLayer(dc64, depth_in=64, depth_out=1, name="conv_out", kernel_size=3, acti=True)
        return output
    # Predictions
    y = model(x)
    y_image = tf.reshape(y, [-1, img_w, img_h, 1])
    tf.summary.image('output', y_image, 6)
    #Loss
    loss = tf.reduce_sum(tf.pow(y - y_,2))/(img_w*img_h*img_ch) # MSE
    loss_summary = tf.summary.scalar("Training_Loss", loss)
    # Optimizer.
    with tf.name_scope("train"):
        train_step = tf.train.AdamOptimizer(learning_rate=learn_rate).minimize(loss)
In case you want to see more details:
def ConvLayer(input, depth_in, depth_out, name="conv", kernel_size=3, acti=True):
    with tf.name_scope(name):
        w = tf.Variable(tf.truncated_normal([kernel_size, kernel_size, depth_in, depth_out],
                                            stddev=0.1), name="W")
        b = tf.Variable(tf.constant(0.1, shape=[depth_out]), name="B")
        conv = tf.nn.conv2d(input, w, strides=[1, 1, 1, 1], padding="SAME")
        tf.summary.histogram("weights", w)
        tf.summary.histogram("biases", b)
        if (acti==True):
            act = tf.nn.relu(conv + b)
            tf.summary.histogram("activations", act)
            result = act
        else:
            result = conv + b
        result_maxpooled = max_pool(result,2)
        return result_maxpooled

def DeconvLayer(input, depth_in, depth_out, name="deconv", kernel_size=3, acti=True):
    with tf.name_scope(name):
        w = tf.Variable(tf.truncated_normal([kernel_size, kernel_size, depth_out,depth_in],
                                            stddev=0.1), name="W")
        b = tf.Variable(tf.constant(0.1, shape=[depth_out]), name="B")
        input_shape = tf.shape(input)
        output_shape = tf.stack([input_shape[0], input_shape[1]*2, input_shape[2]*2, input_shape[3]//2])
        deconv = tf.nn.conv2d_transpose(input, w, output_shape, strides=[1, 1, 1, 1], padding='SAME')
        tf.summary.histogram("weights", w)
        tf.summary.histogram("biases", b)
        if (acti==True):
            act = tf.nn.relu(deconv + b)
            tf.summary.histogram("activations", act)
            result = act
        else:
            result = deconv + b
        return result
How to implement CNN (Convolutional Neural Network) on Multiple GPUs?
As quoted from "Training a Model Using Multiple GPU Cards" (TensorFlow tutorial):
Place an individual model replica on each GPU.
Update model parameters synchronously by waiting for all GPUs to finish processing a batch of data.
To boost performance further, it helps to understand the data flow between main memory, CPU, and GPU. Have a look at this answer: Why should preprocessing be done on CPU rather than GPU?: https://stackoverflow.com/a/44377741/4190159
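The two quoted steps (one replica per GPU, then a synchronous update once every replica has finished its share of the batch) can be illustrated without any framework. The sketch below is plain Python on a toy one-weight model y = w * x, not the TensorFlow API. The names grad_mse, split_batch and sync_step are made up for the illustration, and the "replicas" run one after another instead of on real GPUs.

```python
def grad_mse(w, xs, ys):
    """Gradient of the mean squared error for the toy model y = w * x."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def split_batch(xs, ys, num_replicas):
    """Split a batch into equal shards, one per simulated GPU.
    Samples that do not fit evenly are dropped in this sketch."""
    size = len(xs) // num_replicas
    return [(xs[i * size:(i + 1) * size], ys[i * size:(i + 1) * size])
            for i in range(num_replicas)]


def sync_step(w, xs, ys, num_replicas=2, lr=0.01):
    """One synchronous update: each replica computes a gradient on its
    shard, the gradients are averaged, and the shared weight is updated once."""
    shards = split_batch(xs, ys, num_replicas)
    # On real hardware each of these would be computed on its own GPU.
    grads = [grad_mse(w, sx, sy) for sx, sy in shards]
    avg_grad = sum(grads) / len(grads)
    return w - lr * avg_grad


if __name__ == "__main__":
    xs, ys = [1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]
    w = 0.0
    for _ in range(200):
        w = sync_step(w, xs, ys)
    print(w)  # should approach 2.0, the slope of the toy data
```

With equally sized shards, the averaged gradient equals the gradient of the full batch, which is why the synchronous scheme behaves like single-GPU training with a larger batch.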

How to change variable values when a target is called

I have the following in my GNU Makefile:
DLIB = $(DLIB_STATIC)
DLIBFLAGS = $(DLIB_STATIC)
DLIB_BUILDS = $(DLIB_STATIC) LIBDDUMMY
# DLIB = $(DLIB_SHARED)
# DLIBFLAGS = -Llib -lD
# DLIB_BUILDS = $(DLIB_SHARED)
all: BUILDALL TB
tgt2: BUILDALL TB
TB: $(DLIB_BUILDS)
I need to change the values of DLIB, DLIBFLAGS and DLIB_BUILDS as follows
DLIB = $(DLIB_SHARED)
DLIBFLAGS = -Llib -lD
DLIB_BUILDS = $(DLIB_SHARED)
when tgt2 is called.
I tried the following:
TEMP:
DLIB = $(DLIB_SHARED)
DLIBFLAGS = -Llib -lD
DLIB_BUILDS = $(DLIB_SHARED)
tgt2: TEMP BUILDALL
But it is not working. How can I do that?
GNU make supports target-specific variable values. Put each variable assignment where the target's prerequisites would normally go:
tgt2: DLIB = $(DLIB_SHARED)
tgt2: DLIBFLAGS = -Llib -lD
tgt2: DLIB_BUILDS = $(DLIB_SHARED)
tgt2: BUILDALL
That way, when the BUILDALL target is built via tgt2, it uses this special set of variable values.

Resources