Build issue with LightGBM 2.2.4, Boost 1.64.0 on Power9 w/GPU - lightgbm

I am attempting to build LightGBM version 2.2.4 (git hash 5256cda69300d6b83b18180da2992a1e50a6b392) on an IBM Power9 system ("Witherspoon", CPU is a Power System AC922, 8335-GTH) running Red Hat Enterprise Server 7.5 (Maipo).
I am using the RHEL-packaged C compiler, gcc 4.8.5, a local version of cmake, version 3.13.1, and a local installation of Boost version 1.64.0. The system has CUDA 9.2 installed, and I have located the libOpenCL directories and include files.
My configuration operation is (from inside a newly-created build directory in the root of the unpacked LightGBM tree):
# export BOOST_ROOT=/share/sw/boost/1_64_0/
# cmake3 -DUSE_GPU=1 -DOpenCL_LIBRARY=/usr/lib64/nvidia/libOpenCL.so -DOpenCL_INCLUDE_DIR=/usr/include/CL/ ..
# make
The configuration step apparently succeeds, generating a runnable makefile.
The build fails at around 41% with errors from deep in the bowels of Boost:
[ 41%] Building CXX object CMakeFiles/_lightgbm.dir/src/treelearner/data_parallel_tree_learner.cpp.o
In file included from /share/sw/boost/1_64_0/include/boost/mpl/aux_/integral_wrapper.hpp:22:0,
from /share/sw/boost/1_64_0/include/boost/mpl/int.hpp:20,
from /share/sw/boost/1_64_0/include/boost/mpl/lambda_fwd.hpp:23,
from /share/sw/boost/1_64_0/include/boost/mpl/aux_/na_spec.hpp:18,
from /share/sw/boost/1_64_0/include/boost/mpl/identity.hpp:17,
from /share/sw/boost/1_64_0/include/boost/iterator/detail/enable_if.hpp:11,
from /share/sw/boost/1_64_0/include/boost/iterator/transform_iterator.hpp:11,
from /share/sw/boost/1_64_0/include/boost/algorithm/string/iter_find.hpp:17,
from /share/sw/boost/1_64_0/include/boost/algorithm/string/split.hpp:16,
from /wrk/user/src/lightgbm/LightGBM/compute/include/boost/compute/device.hpp:18,
from /wrk/user/src/lightgbm/LightGBM/compute/include/boost/compute/context.hpp:19,
from /wrk/user/src/lightgbm/LightGBM/compute/include/boost/compute/buffer.hpp:15,
from /wrk/user/src/lightgbm/LightGBM/compute/include/boost/compute/core.hpp:18,
from /wrk/user/src/lightgbm/LightGBM/src/treelearner/gpu_tree_learner.h:27,
from /wrk/user/src/lightgbm/LightGBM/src/treelearner/parallel_tree_learner.h:5,
from /wrk/user/src/lightgbm/LightGBM/src/treelearner/data_parallel_tree_learner.cpp:1:
/share/sw/boost/1_64_0/include/boost/mpl/vector.hpp:28:18: error: pasting ")" and "20" does not give a valid preprocessing token
BOOST_PP_CAT(vector, BOOST_MPL_LIMIT_VECTOR_SIZE).hpp \
^
/share/sw/boost/1_64_0/include/boost/preprocessor/cat.hpp:29:34: note: in definition of macro ‘BOOST_PP_CAT_I’
# define BOOST_PP_CAT_I(a, b) a ## b
^
/share/sw/boost/1_64_0/include/boost/mpl/vector.hpp:28:5: note: in expansion of macro ‘BOOST_PP_CAT’
BOOST_PP_CAT(vector, BOOST_MPL_LIMIT_VECTOR_SIZE).hpp \
^
/share/sw/boost/1_64_0/include/boost/mpl/vector.hpp:36:49: note: in expansion of macro ‘AUX778076_VECTOR_HEADER’
# include BOOST_PP_STRINGIZE(boost/mpl/vector/AUX778076_VECTOR_HEADER)
^
In file included from /share/sw/boost/1_64_0/include/boost/math/policies/policy.hpp:14:0,
from /share/sw/boost/1_64_0/include/boost/math/special_functions/math_fwd.hpp:28,
from /share/sw/boost/1_64_0/include/boost/math/special_functions/sign.hpp:17,
from /share/sw/boost/1_64_0/include/boost/lexical_cast/detail/inf_nan.hpp:34,
from /share/sw/boost/1_64_0/include/boost/lexical_cast/detail/converter_lexical_streams.hpp:63,
from /share/sw/boost/1_64_0/include/boost/lexical_cast/detail/converter_lexical.hpp:54,
from /share/sw/boost/1_64_0/include/boost/lexical_cast/try_lexical_convert.hpp:42,
from /share/sw/boost/1_64_0/include/boost/lexical_cast.hpp:32,
from /wrk/user/src/lightgbm/LightGBM/compute/include/boost/compute/detail/meta_kernel.hpp:23,
from /wrk/user/src/lightgbm/LightGBM/compute/include/boost/compute/iterator/buffer_iterator.hpp:26,
from /wrk/user/src/lightgbm/LightGBM/compute/include/boost/compute/algorithm/detail/copy_on_device.hpp:18,
from /wrk/user/src/lightgbm/LightGBM/compute/include/boost/compute/algorithm/copy.hpp:26,
from /wrk/user/src/lightgbm/LightGBM/compute/include/boost/compute/container/vector.hpp:32,
from /wrk/user/src/lightgbm/LightGBM/src/treelearner/gpu_tree_learner.h:28,
from /wrk/user/src/lightgbm/LightGBM/src/treelearner/parallel_tree_learner.h:5,
from /wrk/user/src/lightgbm/LightGBM/src/treelearner/data_parallel_tree_learner.cpp:1:
/share/sw/boost/1_64_0/include/boost/mpl/vector.hpp:36:73: fatal error: boost/mpl/__attribute__((altivec(vector__)))/__attribute__((altivec(vector__)))20.hpp: No such file or directory
# include BOOST_PP_STRINGIZE(boost/mpl/vector/AUX778076_VECTOR_HEADER)
From the messages, it looks like some preprocessor string manipulation has gone wrong. It is presumably trying to find the "vector20.hpp" file in the boost/mpl/vector include directory, but the BOOST_PP_CAT operation fails, so it cannot construct a proper filename. Also, "altivec" is implicated: the Power9 CPU is AltiVec-capable, so maybe an additional header or compiler switch is required?
I can successfully build (with warnings) on a Debian 9 "stretch" system with x86_64 architecture and CUDA 9.1 (for the libOpenCL stuff), with the Debian-packaged Boost version 1.62.
I also tried building the Power9 version against Boost 1.69, and against Boost 1.62 (the one that worked on Debian), and got the same errors in the same place.
Help?

This is addressed in an issue on the LightGBM GitHub, which I somehow missed on my initial search.
This build attempt is misguided.
The compilation problem is apparently an AltiVec/Boost interaction. In any case, there is no OpenCL GPU support on the Power architecture, and LightGBM's GPU support is OpenCL under the hood, so the effort is doomed.
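One way to check the AltiVec theory is to look at the compiler's predefined macros. The sketch below assumes that GCC on Power predefines vector (and __vector) as macros when AltiVec is enabled in a GNU dialect, which would explain why BOOST_PP_CAT(vector, 20) ends up pasting onto __attribute__((altivec(vector__))). The helper name altivec_macros is made up for illustration; it only filters a macro dump read from standard input.

```shell
#!/bin/sh
# altivec_macros: hypothetical helper. Reads a predefined-macro dump
# (the output of `g++ -dM -E`) on stdin and prints only the definitions
# of vector, __vector and __ALTIVEC__, if any are present.
altivec_macros() {
    grep -E '^#define (vector|__vector|__ALTIVEC__)( |$)' || true
}

# On a machine with g++ installed, dump the predefined macros and filter them.
# On x86_64 this is expected to print nothing.
if command -v g++ >/dev/null 2>&1; then
    echo | g++ -dM -E -x c++ - | altivec_macros
fi
```

If the assumption holds, running this on the Power9 host lists a definition for vector. Repeating the dump with a strict ISO dialect (for example adding -std=c++11 to the g++ command) would show whether the dialect changes that; this is a diagnostic idea, not a confirmed fix for the LightGBM build.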

Related

Which compiler settings should be used to compile Pantheios in current OSX?

Pantheios INSTALL.TXT says:
Open a command shell in the appropriate directory that matches your compiler: ...
My compiler is Clang:
> gcc --version
Apple LLVM version 5.1 (clang-503.0.40) ...
Target: x86_64-apple-darwin13.3.0
Which of the compiler settings files in Pantheios 1.0.1-beta214 is the most appropriate?
You can use Homebrew to install an older version of gcc (e.g., gcc-4.2) and use the matching Pantheios makefile, like this:
brew install gcc42
make CC=gcc-4.2
However, as far as I have been able to tell, Pantheios is not going to be buildable on a recently-updated system. For example, as of today, building on OSX looks like this:
04:29:23 ~/src/pantheios-1.0.1-beta214/build/gcc42.unix$ make CC=gcc-4.2
Ensuring all STLSoft C source files are in UNIX format
sed: RE error: illegal byte sequence
make: *** [/Users/username/src/stlsoft-1.9.118/include/stlsoft/internal/dos2unix.has.been.performed] Error 1
I have also tried building on Windows as recently as 6 months ago and lost a good bit of time on it before giving up.
The library hasn't been updated in a very long time and the author has very little online activity since then. I call that "abandoned software". Building will very likely require a non-trivial amount of work on your part. I'd highly recommend severing the dependency on STLSoft if you do because it also appears to be abandoned.
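The "sed: RE error: illegal byte sequence" message above is typical of the BSD sed shipped with OS X when it runs in a UTF-8 locale over files containing bytes that are not valid UTF-8. A minimal sketch of the usual workaround, forcing the C locale for the conversion step, follows. It assumes the failing step is a CR/LF to LF conversion (as the dos2unix.has.been.performed marker suggests); the function name strip_cr is hypothetical and this is not the command the Pantheios makefile actually runs.

```shell
#!/bin/sh
# strip_cr: hypothetical stand-in for the makefile's DOS-to-UNIX step.
# LC_ALL=C makes sed treat its input as raw bytes, so bytes that are not
# valid UTF-8 no longer abort it with "illegal byte sequence".
strip_cr() {
    cr=$(printf '\r')              # a literal carriage return
    LC_ALL=C sed "s/${cr}\$//"     # drop one trailing CR from each line
}

# Demonstration: a line containing a Latin-1 byte (0xE9) with CRLF endings.
printf 'caf\351\r\nok\r\n' | strip_cr | od -c
```

The same idea applied to the whole build would be LC_ALL=C make CC=gcc-4.2, though whether that alone is enough to get Pantheios building is untested here.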

Another GSL linking error in Windows

I've done everything, and it's paid off.
Trying to compile a mex file from MATLAB using the Windows 7.1 SDK.
~ I've created and compiled my C source code on GCC
~ I've created a MEX file that links and compiles fine via GCC on both Linux and OS X. Does not crash MATLAB, gateway function works fine
~ After much confusion, I switched my dev platform from 64-bit to x86 Win7
~ I've found .dll built files, but they do not link. Linking libs in MATLAB using MATLAB's linker flags will default to .lib, so...
~ I've found--after much googling--simple, pre-compiled x86 GSL .lib's and source files and linked them with MATLAB, eliminating any gsl_blas.h-and-its-dependencies unrecognized external symbol errors
~ I've re-written every single variable declaration in my source code such that it is C89 standard compatible
~ I've set linker flags appropriately to avoid LIBCMT and any other LIB conflicts
~ I've installed the 2010 and 2012 VC C Runtime libraries
~ I've checked to make sure I have msvcrt.dll and msvcp60.dll in my System files
~ I've followed multiple tutorials online on how to supposedly link everything together, most of which had nothing but broken links or un-replicable results. I didn't find much to go off of for Cygwin or MinGW.
~ I've tried using the Lcc-win32 2.4.1 compiler
If I was doing basic matrix and vector operations, I'd be set, but unfortunately the various decomposition routines I'm utilizing require parts from the cblas library, which I linked as well, but I get ~30 errors all reporting the same thing...
cblas.lib(ctrsv.obj) : error LNK2001: unresolved external symbol __libm_sse2_sqrt_precise
Here's my MATLAB command.
mex -largeArrayDims -IC:\gsl\include -LC:\gsl\lib -lgsl -lcblas LINKFLAGS="$LINKFLAGS /NODEFAULTLIB:libcmt.lib" file1.c file2.c
So, out of options and frustrated out of my mind, I (naturally) come to Stack Overflow. Anyone have any idea how to solve this one? The only thing I've found on Google points to WineHQ errors, not very helpful.
And, if possible, I'd rather not try to compile first on VS201X. I have access to whatever version I need, if necessary, but to me that just seems like a redundant step. Maybe I'm spoiled with Unix-based file system management and linking, though.
It's easy to compile the GSL library under MinGW; in fact, the process of compiling from sources is identical to that on Linux. Here are the steps I took:
Setup MinGW for Windows. I am using MinGW-w64 but there is also the popular TDM-GCC distribution which comes with a friendly web-installer.
Obtain GSL sources, and extract the tarball (gsl-1.16.tar.gz is the latest as of now)
Compile as usual, I've used the following commands:
$ ./configure --host=x86_64-w64-mingw32 --prefix=/mingw/local --enable-shared --enable-static
$ make
$ make install
It should take several minutes to finish. Maybe you can enable parallel builds to speed up compilation (make -j)
You'll end up with the necessary files installed in /mingw/local with the usual structure underneath (bin, lib, include).
Finally you can compile an example program with:
$ export PATH=/mingw/local/bin:$PATH
$ gcc `gsl-config --cflags` -o main main.c `gsl-config --libs`
Of course if you prefer using Visual C++ as compiler, people out there have prepared solutions to build GSL using Visual Studio (either manually created project files, or using a build system like CMake and the like). See this question for such projects.
A third option is using Cygwin.

Unable to build rapidjson tests on Mac OS X

I am trying to build the tests for rapidjson 0.11 (http://code.google.com/p/rapidjson/) on Mac OS X. It includes three projects: gtest (builds fine), unittest (build fails), and perftest (build fails), and when building, make errors out with Error 1 and Error 2.
The compiler output shows the following errors for both unittest and perftest which causes make to fail:
../../include/rapidjson/reader.h: In function ‘const char* rapidjson::SkipWhitespace_SIMD(const char*)’:
../../include/rapidjson/reader.h:116: error: ‘_SIDD_UBYTE_OPS’ was not declared in this scope
../../include/rapidjson/reader.h:116: error: ‘_SIDD_CMP_EQUAL_ANY’ was not declared in this scope
../../include/rapidjson/reader.h:116: error: ‘_SIDD_BIT_MASK’ was not declared in this scope
../../include/rapidjson/reader.h:116: error: ‘_SIDD_NEGATIVE_POLARITY’ was not declared in this scope
These pre-processor constants are related to SSE4 instructions. rapidjson can use SSE2 or SSE4.2 to speed it up, and it defaults to using SSE4.2 when building.
The makefile includes the -msse4.2 compiler switch to enable SSE4.2 support, and looking through the header files reveals that on OS X, both the __SSE4_1__ and __SSE4_2__ pre-processor constants need to be defined for the _SIDD_... constants to be defined. For some reason, these _SIDD_... constants aren't being defined.
Further research showed that the -msse4 switch enables support for both SSE4.1 and SSE4.2, so I tried changing the switch to -msse4, but it still errors out.
Not sure if the -msse4.2 switch automatically defines __SSE4_2__, but I tried manually defining it, and still no luck.
NOTE: If you want to try building it yourself on Mac, you will need to download a different premake script file, as the included one doesn't work. You can download the corrected script from the attachment on the second post here https://code.google.com/p/rapidjson/issues/detail?id=54
Any ideas on how to get it building successfully on OS X?
Short answer - I had an older version of gcc (4.2) which didn't support -msse4.2 flag (it was introduced in gcc 4.3).
After upgrading to the latest version of gcc, the above issue disappeared:
Check which version of gcc is active by opening a terminal and running gcc -v
Download the MacPorts installer for your version of OS X from http://www.macports.org/install.php and run the installer (easy way to upgrade GCC version)
Open a new terminal window (must be new as the PATH environment variable is updated after the MacPorts install)
Check which versions of gcc you already have installed with port select --list gcc (NOTE: you probably won't have some of the later versions installed already. See next step)
Install latest version of gcc (gcc47 at the moment) with sudo port install gcc47 (this will take a while to download)
Run port select --list gcc again and you should see the new version in the list (eg. mp-gcc47)
Select this latest version as active gcc version with sudo port select gcc mp-gcc47
Run gcc -v again to check the latest version is active
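Since the root cause was a compiler older than the release that introduced -msse4.2, a small guard can catch this before the build starts. The sketch below compares a dotted version string against a minimum using sh and awk. The function name version_at_least and the use of gcc -dumpversion to obtain the version are illustrative choices, not part of rapidjson's build scripts.

```shell
#!/bin/sh
# version_at_least HAVE WANT: succeed if dotted version HAVE >= WANT.
# Up to three numeric components are compared; missing ones count as 0.
version_at_least() {
    awk -v have="$1" -v want="$2" 'BEGIN {
        split(have, h, "."); split(want, w, ".")
        for (i = 1; i <= 3; i++) {
            if ((h[i] + 0) > (w[i] + 0)) exit 0
            if ((h[i] + 0) < (w[i] + 0)) exit 1
        }
        exit 0
    }'
}

# Check the active gcc, if there is one on PATH.
if command -v gcc >/dev/null 2>&1; then
    if version_at_least "$(gcc -dumpversion)" 4.3; then
        echo "gcc is at least 4.3, so -msse4.2 should be accepted"
    else
        echo "gcc is older than 4.3, so -msse4.2 is not available"
    fi
fi
```

One caveat: on systems where the gcc command is really clang, the reported version may not follow GCC's numbering, so the check is only meaningful for a real GCC.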
With the compiler sorted, the first attempt to build rapidjson for release32 gave me errors about the limits header file due to __int128 not being defined for 32-bit builds. GNU's official position is that you need to roll your own. See the answers at the following link for more info:
Compiling 32bit binary: expected unqualified-id before '__int128'
Building for release64 or debug64 solved this issue, but it still failed to build due to some warnings about casting away qualifiers in test/unittest/readertest.cpp:187:4. As the makefile included the compiler flag -Werror=cast-qual, these warnings were treated as errors. Removing this flag in both unittest and perftest makefiles solved this issue (not an ideal solution, but I just wanted to get it building).
There were still linker warnings as the /usr/lib64 folder didn't exist, and the makefiles included the flag -L/usr/lib64, but the build was still successful.
SUCCESS - Both unittest_release_x64_gmake and perftest_release_x64_gmake ran without problems!
NOTE: rapidjson build instructions are included in the readme file in the ZIP archive.

Boost 1.50 CMake 2.8 and Ogre 1.8 - win64 - dynamic linking

I've built the dynamic boost libraries required by the Ogre 3d engine (thread and date_time). My boost directory is in C:\boost , the lib is in C:\boost\lib and the include in C:\boost(\boost) as required by the standards.
If you're familiar with CMake and Ogre (since that's the simplest way to build any Ogre repository clone from sources), you know that there isn't much else to specify. That's not true in my case: Cmake always reports that it cannot find boost. And this happens only when I try to build the Ogre 1.8 version from their repository. When I use the Ogre 2.0 experimental unstable (at the time I wrote this question), boost is successfully found and so are its threading and date-time components.
Has anyone got any ideas? Preferably, has anyone tried to build the Ogre 1.8 sources this way?
I did try almost everything (even command line cmake), but with no positive results.
I've asked this question almost twice on the Ogre forums and nothing from those sources solved the problem for the stable release of Ogre.
What is it that makes Cmake derail so much when building one version over the other? How can I at least check for boost's existence in CMake (without creating a build solution or anything else)?
There must be a simple command line flag or a simple script to run with cmake, but apart from the FindBoost.cmake file, nothing else really helps (and that one is too big to make something out of it at a quick glance).
UPDATE
Using sakra's suggestion, I see that boost is recognized:
-- [ C:/Program Files (x86)/CMake 2.8/share/cmake-2.8/Modules/FindBoost.cmake:67 ] location of version.hpp: C:/boost/boost/version.hpp
-- [ C:/Program Files (x86)/CMake 2.8/share/cmake-2.8/Modules/FindBoost.cmake:76 ] version.hpp reveals boost 1.50.0
but although this section does reveal that boost is where it should be, the thread and date_time libraries are invisible to CMake.
The Boost_USE_STATIC_LIBS flag is set to OFF/FALSE, just in case..
UPDATE using the --find-package cmake command line argument:
C:\Ogre18\Build>cmake --find-package -DNAME=Boost -DCOMPILER_ID=GNU -DLANGUAGE=CXX -DMODE=EXIST
Boost found.
Ultimately, cmake doesn't find the required components. Can one check for specific libraries belonging to a boost installation?
Try invoking cmake with the variable Boost_DEBUG set to TRUE. This may give you some hints on why the FindBoost module does not find your Boost installation.
cmake -DBoost_DEBUG=TRUE .
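To answer the narrower question of whether specific Boost components are visible to CMake without configuring all of Ogre, one option is a throwaway project that asks FindBoost for just those components. The sketch below writes such a project. The function name write_boost_probe and the project name BoostProbe are invented for this example, and the paths in the commented invocation are the ones given in the question.

```shell
#!/bin/sh
# write_boost_probe DIR: create a minimal CMake project in DIR that only
# looks for Boost's thread and date_time components and reports the result.
write_boost_probe() {
    mkdir -p "$1" || return 1
    cat > "$1/CMakeLists.txt" <<'EOF'
cmake_minimum_required(VERSION 2.8)
project(BoostProbe CXX)
set(Boost_USE_STATIC_LIBS OFF)
find_package(Boost 1.50 COMPONENTS thread date_time)
message(STATUS "Boost_FOUND           = ${Boost_FOUND}")
message(STATUS "Boost_THREAD_FOUND    = ${Boost_THREAD_FOUND}")
message(STATUS "Boost_DATE_TIME_FOUND = ${Boost_DATE_TIME_FOUND}")
message(STATUS "Boost_LIBRARIES       = ${Boost_LIBRARIES}")
EOF
}

# Write the probe project into a fresh temporary directory.
probe_dir=$(mktemp -d) && write_boost_probe "$probe_dir" \
    && echo "probe project written to $probe_dir"

# Then, from inside that directory (requires cmake and a C++ compiler):
#   cmake -DBoost_DEBUG=TRUE -DBOOST_ROOT=C:/boost -DBOOST_LIBRARYDIR=C:/boost/lib .
```

The per-component Boost_THREAD_FOUND and Boost_DATE_TIME_FOUND variables show which library is missing, and the Boost_DEBUG output lists the library file names FindBoost searched for, which can be compared with the files actually present in C:\boost\lib.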

Unable to build Boost libraries with GCC

I am using Windows 7 64-bit, and want to compile the non-precompiled libraries (specifically, I need Filesystem) from the command line (I do not use MSVC). I have MinGW, but read on the Boost website that MSYS shell is not supported, so I'm trying to compile the libraries from the Windows command prompt.
First of all, running bootstrap.bat results in the following error:
Building Boost.Jam build engine
'cl' is not recognized as an internal or external command,
operable program or batch file.
Failed to build Boost.Jam build engine.
Please consult bjam.log for furter diagnostics.
You can try to obtain a prebuilt binary from
http://sf.net/project/showfiles.php?group_id=7586&package_id=72941
Also, you can file an issue at http://svn.boost.org
Please attach bjam.log in that case.
Plus, there is no bjam.log file anywhere in the boost_root directory.
Disregarding this error, and trying to run the downloaded bjam.exe file, I get another error:
c:/boost_1_45_0/tools/build/v2/build\configure.jam:145: in builds-raw
*** argument error
* rule UPDATE_NOW ( targets * : log ? : ignore-minus-n ? )
* called with: ( <pbin.v2\libs\regex\build\gcc-mingw-4.5.2\debug\address-model64\architecture-x86>has_icu.exe : : ignore-minus-n : ignore-minus-q )
* extra argument ignore-minus-q
(builtin):see definition of rule 'UPDATE_NOW' being called
c:/boost_1_45_0/tools/build/v2/build\configure.jam:179: in configure.builds
c:/boost_1_45_0/tools/build/v2/build\configure.jam:216: in object(check-target-builds-worker)#409.check
etc. with quite a lot of complaints. Setting the 'architecture' and 'address-model' options doesn't help.
Any suggestions?
@Andre
Following Andre's suggestion, I created minGW-bjam that was running for an hour and a half and built most of the libraries, but not the one I need at this moment: Filesystem.
Trying to compile only Filesystem, specifying version 2 with define="BOOST_FILESYSTEM_VERSION=2" and --disable-filesystem3 does not help. I get the following error:
gcc.compile.c++ bin.v2\libs\filesystem\build\gcc-mingw-4.5.2\debug\v3\src\operations.o
In file included from ./boost/filesystem/v3/operations.hpp:24:0,
from libs\filesystem\v3\src\operations.cpp:48:
./boost/filesystem/v3/config.hpp:16:5: error: #error Compiling Filesystem version 3 file with BOOST_FILESYSTEM_VERSION defined != 3
libs\filesystem\v3\src\operations.cpp:647:26: warning: '<unnamed>::create_symbolic_link_api' defined but not used
"g++" -ftemplate-depth-128 -O0 -fno-inline -Wall -g -DBOOST_ALL_NO_LIB=1 -DBOOST_FILESYSTEM_DYN_LINK=1 -DBOOST_FILESYSTEM_VERSION=2 -DBOOST_SYSTEM_DYN_LINK=1 -I"." -c -o "bin.v2\libs\filesystem\build\gcc-mingw-4.5.2\debug\v3\src\operations.o" "libs\filesystem\v3\src\operations.cpp"
etc. with a lot of ...failed statements.
Any hints here?
It's easy. Just use "bootstrap.bat gcc" to select GCC.
The bootstrap script assumes the msvc compiler is available. But you can build bjam by hand without the bootstrap script:
Step into the tools\build\v2\engine\src directory and call "build.bat mingw". It will create a bjam.exe. You can then put it in your %PATH% or perhaps in the root boost directory...
To be honest, I usually build bjam like this with the msvc compiler and use this "msvc-bjam" to build my mingw boost libraries.
So... the first part of the problem was solved by Andre's suggestion.
The second part was solved by setting the variable BOOST_FILESYSTEM_VERSION to 3 everywhere (the error above complains about incompatibility with what is set in file user.hpp). Although this is not the default option for Boost 1.45 that I'm using, it's the only thing that works (i.e. bjam wants to compile version 3 no matter what). So now I have version 3 of the filesystem library, and version 2 for all others, but that doesn't seem to be an issue for the moment.
I do have a problem with using Boost with OpenCV and Eigen libraries, though... off to the next challenge ;)
Since I can't comment yet, I want to add that I ran
bootstrap mingw
to generate b2 properly and then
b2 --build-dir="c:\boost_release" toolset=gcc --build-type=complete "c:\boost_release\stage"
The includes will be located at your boost root folder (boost_1_58_00/boost) and your binaries at the specified build folder.
