h2o.xgboost - no GPUs, no multithreading - h2o

I wanted to add this as a comment to this question - is multi-cpu supported by h2o-xgboost? - but apparently my rep is too low.
I am using the latest stable version of h2o (3.14.06).
To try to solve this problem, I've made sure that gcc is installed in my Docker image (using apt-get install gcc):
dpkg -l | grep gcc
gcc 4:5.3.1-1ubuntu1 amd64 GNU C compiler
gcc-5 5.4.0-6ubuntu1~16.04.5 amd64 GNU C compiler
**output truncated**
Unfortunately, when the cluster is spun up it's still reporting:
INFO: Found XGBoost backend with library: xgboost4j
INFO: Your system supports only minimal version of XGBoost (no GPUs, no multithreading)!
Can anyone provide any insights? Clearly I'm missing a piece of the puzzle.

Right now, H2O bundles only the GPU-enabled and the minimal (no GPU, no OMP) versions of XGBoost. However, there is an experimental change in the branch mm/xgb_upgrade which contains an OMP-enabled version of XGBoost (instead of the minimal version): https://github.com/h2oai/h2o-3/tree/mm/xgb_upgrade

Building the mm/xgb_upgrade branch works. Which Jira ticket refers to this issue?

Related

Installing gcc on s390x

I need a C compiler on my s390, which runs RHEL 7.6. When I do "yum list | grep gcc", I have the following:
libgcc.s390x 4.8.5-36.el7
compat-gcc-44.s390x 4.4.7-8.el7
compat-gcc-44-c++.s390x 4.4.7-8.el7
gcc.s390x 4.8.5-16.el7
gcc-c++.s390x 4.8.5-16.el7
gcc-gfortran.s390x 4.8.5-16.el7
gcc-objc.s390x 4.8.5-16.el7
gcc-objc++.s390x 4.8.5-16.el7
libgcc.s390 4.8.5-16.el7
I then do: yum install gcc.s390x and I obtain the following error:
Error: Package: glibc-2.17-196.el7.s390
Requires: glibc-common = 2.17-196.el7
Installed: glibc-common-2.17-260.el7_6.3.s390x (@rhel-7-for-system-z-rpms)
glibc-common = 2.17-260.el7_6.3
Available: glibc-common-2.17-196.el7.s390x
glibc-common = 2.17-196.el7
What I read from this is that s390x package is installed but the one needed is the one that does not have the s390 extension.
How can I get around this? I tried pulling gcc directly from git, but when I run configure, the message says that a compiler needs to be installed.
Any help would be much appreciated. Thanks - C
This output line
Available: glibc-common-2.17-196.el7.s390x
shows that the configured repositories only contain glibc versions up to RHSA-2017:1916. This means that you have configured repositories for Red Hat Enterprise Linux 7.4 (and not even Extended Update Support). However, someone at one point upgraded glibc to a package version from Red Hat Enterprise Linux 7.6.
Installing GCC needs additional glibc components, and these have to match the already-installed version. Since the 7.6 packages are not available from the configured repositories, installation fails with a dependency error.
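To make the mismatch concrete, here is a minimal Python sketch that only parses the text of a yum dependency error like the one in the question and reports the required versus installed glibc-common version. The function name `glibc_common_mismatch` is made up for illustration, and the regular expressions assume the `Requires:` / `Installed:` layout shown in the question.

```python
import re

def glibc_common_mismatch(yum_error):
    """Return (required, installed) glibc-common versions if they differ, else None."""
    required = re.search(r"Requires:\s*glibc-common\s*=\s*(\S+)", yum_error)
    # The installed line looks like: glibc-common-<version>-<release>.<arch> (<repo>)
    installed = re.search(r"Installed:\s*glibc-common-(\S+)\.\w+(?=\s|$)", yum_error)
    if not required or not installed:
        return None
    req, inst = required.group(1), installed.group(1)
    return None if req == inst else (req, inst)

# Error text copied from the question (shortened to the relevant lines).
error_text = """\
Error: Package: glibc-2.17-196.el7.s390
Requires: glibc-common = 2.17-196.el7
Installed: glibc-common-2.17-260.el7_6.3.s390x (@rhel-7-for-system-z-rpms)
"""
print(glibc_common_mismatch(error_text))
```

For the error above this reports that release 196 (the 7.4-era package) is required while release 260 (from 7.6) is installed, which is the mismatch described in this answer.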
If you need assistance with subscription management, you should file a support ticket.

Broke yum on Centos 6.10, cannot install missing GLIBC in error due to missing libunwind

I was installing nvidia-drivers on CentOS 6.10, which involved a --skip-broken flag and may have broken yum. Whenever I run yum commands, this error pops up:
There was a problem importing one of the Python modules
required to run yum. The error leading to this problem was:
/lib64/libc.so.6: version `GLIBC_2.14' not found (required by /lib64/libgcc_s.so.1)
Please install a package which provides this module, or
verify that the module is installed correctly.
It's possible that the above module doesn't match the
current version of Python, which is:
2.6.6 (r266:84292, Jun 20 2019, 14:14:55)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-23)]
If you cannot solve this problem yourself, please go to
the yum faq at:
http://yum.baseurl.org/wiki/Faq
I stumbled upon this thread, which talks about installing the missing GLIBC version, but I ran into this error in step 8 (../configure --prefix=/opt/glibc-2.14):
checking for forced unwind support... no
configure: error: forced unwind support is required
That then took me to this forum thread, which states that I should install libunwind via yum. But a broken yum was my original problem, which leaves me at an impasse. What should I do?
You need to reinstall GCC, or more precisely the libgcc package. Something overwrote /lib64/libgcc_s.so.1 with an incompatible version. You should be able to download the libgcc RPM package from a mirror, and then run:
# rpm --reinstall libgcc-4.4.7-23.el6.x86_64.rpm
This should still work because RPM itself does not depend on libgcc_s.
In general, if you need newer versions of these core system libraries (glibc, libstdc++, libgcc_s), you need to upgrade the entire operating system. Even if you manage to replace them in a consistent fashion, you are running something that isn't very close to the original operating system anymore. At that point, it is more prudent to upgrade, because that will give you a consistent system that has been tested by many others.
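The loader error in the question names two files: the library that lacks the symbol version and the library that asks for it. As a rough illustration of how to read it, the Python sketch below splits the message into those parts; the function name `parse_version_error` is hypothetical, and the pattern assumes the exact message layout quoted in the question.

```python
import re

def parse_version_error(message):
    """Split a dynamic-loader 'version not found' message into its parts."""
    pattern = (
        r"(?P<provider>\S+): version `(?P<symbol>[^']+)' not found "
        r"\(required by (?P<consumer>[^)]+)\)"
    )
    match = re.search(pattern, message)
    return match.groupdict() if match else None

# Message copied from the yum error in the question.
msg = "/lib64/libc.so.6: version `GLIBC_2.14' not found (required by /lib64/libgcc_s.so.1)"
info = parse_version_error(msg)
print(info["consumer"], "expects", info["symbol"], "from", info["provider"])
```

Here the consumer is /lib64/libgcc_s.so.1, which is the file this answer suggests was overwritten with a build made for a newer glibc.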

There isn't much difference between AVX2 and AVX512 when using MKL?

CPU environment: Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz
First, I installed TensorFlow with pip install tensorflow==1.12.0 and downloaded tensorflow-benchmark.
Run 1: export MKL_VERBOSE=0; export MKL_ENABLE_INSTRUCTIONS=AVX512; python tf_cnn_benchmarks.py --device=cpu --data_format=NHWC --model=alexnet --batch_size=8
Run 2: export MKL_VERBOSE=0; export MKL_ENABLE_INSTRUCTIONS=AVX2; python tf_cnn_benchmarks.py --device=cpu --data_format=NHWC --model=alexnet --batch_size=8
The speed is almost the same! I also tried different models and batch sizes.
Second, I also tested Caffe compiled with MKL. I found that
MKL_ENABLE_INSTRUCTIONS=AVX512 is not much faster than MKL_ENABLE_INSTRUCTIONS=AVX2.
Why?
I assume your intention is to test TensorFlow accelerated with MKLDNN. Unlike the traditional MKL library, this library provides math acceleration only for deep learning operations. However, the terms MKL and MKLDNN are apparently used interchangeably in Intel-optimized TensorFlow, even though it is accelerated with Intel MKLDNN. So, to answer your question: the MKLDNN library does not yet support controlling ISA dispatching.
By the way, pip install tensorflow installs Google's official TensorFlow build, which doesn't come with MKL acceleration. To get Intel-optimized TensorFlow, please refer to the install guide: https://software.intel.com/en-us/articles/intel-optimization-for-tensorflow-installation-guide. To check whether MKLDNN is enabled in your build, use export MKLDNN_VERBOSE=1 instead of MKL_VERBOSE=1.
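Before comparing instruction sets, it can also help to confirm which ones the CPU actually advertises. The following is a small Python sketch under the assumption of an x86 Linux machine, where /proc/cpuinfo contains a `flags` line; the helper name `simd_support` is invented for this example, and it only reports CPU capabilities, not which code path MKLDNN selects.

```python
def simd_support(cpuinfo_text):
    """Report whether the first 'flags' line of /proc/cpuinfo-style text lists AVX2 and AVX-512F."""
    for line in cpuinfo_text.splitlines():
        name, _, value = line.partition(":")
        if name.strip() == "flags":
            flags = set(value.split())
            return {"avx2": "avx2" in flags, "avx512f": "avx512f" in flags}
    # No 'flags' line found (for example on non-x86 systems).
    return {"avx2": False, "avx512f": False}

if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print(simd_support(f.read()))
    except OSError:
        print("no /proc/cpuinfo on this platform")
```

If `avx512f` is reported as available and both runs still perform the same, that is consistent with the explanation above that the library in use does not honor the dispatch setting.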

Upstart causes kernel panic on embedded Linux

I am using ptxdist to create kernel and rootfs images for a Linux embedded system, running on an ARM Cortex A8 CPU.
I was trying to use a newer compiler (GCC 5+) and so was forced to upgrade several external packages that would not compile under the new GCC.
I compiled the following versions of Upstart and its immediate dependencies:
upstart: 1.13.2
libnih: 1.0.3
dbus: 1.11.2
json-c: 0.12.1
When I boot, I get the following message:
init: com.ubuntu.Upstart.c:3525: Assertion failed in control_emit_event_emitted: env != NULL
init: Caught abort, core dumped
Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000600
Searching online did not yield useful hints - the only relevant issue I found is this, but it is relevant to an older version of Upstart, and my libnih is of the correct version already.
According to comment #8 in the bug report you linked, it is not enough to use version 1.0.3 of libnih -- you have to specifically use the Ubuntu version, as this seems to include dbus fixes which could solve the problem you are seeing. From the bug report:
David Ireland (e-david) wrote on 2013-04-22: #7
I've built libnih 1.0.3 from source and also made sure that upstart builds with that version of the nih-dbus-tool. I'm still having this problem.
James Hunt (jamesodhunt) wrote on 2013-04-22: #8
Which problem? The crash? If so, you are still using the wrong version of libnih: you should be using the Ubuntu version (specifically 1.0.3-4ubuntu16) from here: https://code.launchpad.net/~ubuntu-branches/ubuntu/raring/libnih/raring
You do not need the --session flag to run a "Session Init" (yes, this is a little confusing but --session was added for testing a long time ago and is still required for that). A "Session Init" only requires "--user".

Compile nginx 1.8.1 with http_perl_module

I need help compiling nginx with the perl_module on my Mac:
System Software Overview:
System Version: OS X 10.11.1 (15B42)
Kernel Version: Darwin 15.0.0
Boot Volume: Macintosh HD
Boot Mode: Normal
Computer Name: Philipp
User Name: XXXXXXX XXXXXXXX (philipp)
Secure Virtual Memory: Enabled
Time since boot: 4 days 22:46
I configure nginx to be compiled with PCRE and PERL, i.e.,:
./configure --with-pcre=/Users/philipp/downloads/pcre-8.38 --with-http_perl_module --prefix=/servers/nginx
The output of the configure states:
checking for perl
+ perl version: This is perl 5, version 18, subversion 2 (v5.18.2)
built for darwin-thread-multi-2level
+ perl interpreter multiplicity found
If I execute make I run into the following error:
ld: symbol(s) not found for architecture i386
clang: error: linker command failed with exit code 1 (use -v to see invocation)
I googled and found a hint to export KERNEL_BITS=64, which did not solve the problem. Any suggestions on how to compile nginx with perl? The compilation succeeds without the --with-http_perl_module option, but in that case I cannot use perl in the nginx.conf (of course :)).
UPDATE:
I was not able to compile nginx with perl. I ended up using a pre-compiled package, which is kind of unsatisfying, because now I have to deal with a lot of packages I don't like. Anyways, if someone has a solution I'd be more than happy to know.
I'm not sure if this is off-topic or not, but it's about using perl within nginx (which I do, and it is awesome).
I had a similar problem when trying to build this module. The root of it was that I hadn't got the right LD flags.
The easiest way of doing this, IMO, is to install nginx from a package, run nginx -V to see what flags were used, copy them all, and include the extras. (And check that it wasn't already built into your distribution's package. It was in mine, which I think was a CentOS 7.2 package. I don't have it to hand, and I'm not sure it would necessarily help.)
You may also need to install a new perl version, to go with it though.
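As a sketch of the "run nginx -V and copy the flags" step, the Python snippet below pulls the configure arguments out of `nginx -V` output so they can be inspected or reused. The function name `configure_args` and the sample text are illustrative assumptions, not output from a real installation.

```python
import shlex

def configure_args(nginx_v_output):
    """Extract the configure arguments from `nginx -V` output as a list."""
    prefix = "configure arguments:"
    for line in nginx_v_output.splitlines():
        if line.startswith(prefix):
            # shlex keeps quoted values such as --with-ld-opt='-L /usr/local/lib' together.
            return shlex.split(line[len(prefix):])
    return []

# Illustrative sample only; real output lists the flags of your own package.
sample = (
    "nginx version: nginx/1.10.0\n"
    "configure arguments: --prefix=/usr/local/nginx --with-http_ssl_module "
    "--with-ld-opt='-L /usr/local/lib'\n"
)
args = configure_args(sample)
print("--with-http_perl_module" in args)
```

On a real system the text could be captured with `subprocess.run(["nginx", "-V"], capture_output=True, text=True).stderr`, since nginx writes this information to standard error.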
--with-pcre=../modules/pcre-8.38
--with-openssl=../modules/openssl-1.0.2g
--prefix=/usr/local/nginx
--with-http_perl_module
--with-http_ssl_module
--with-http_addition_module
--with-http_sub_module
--with-http_realip_module
--with-http_stub_status_module
--with-cc-opt='-I /usr/local/include'
--with-ld-opt='-L /usr/local/lib -L/usr/local/opt/perl518/lib'
I had the same problem. After about 3 hours of digging into this, I finally solved it. The options above are my configure options; I installed perl518 with brew, located in /usr/local/opt/perl518/lib, and used the newest nginx source code, version nginx/1.10.0.
Things seem to be as @Sobrique said: you need to give the right --with-ld-opt and --with-cc-opt.
Important: after ./configure, do NOT run make immediately. First open ./objs/Makefile (vi ./objs/Makefile) and change the openssl config from './config' to './Configure darwin-x86_64-cc' so that it builds openssl in x86_64 mode.
Hope it can help you!
