How to use distcc to preprocess and compile everything remotely only? - gcc

Background:
I have a 128-core server which I would like to use as a build server.
I have a bunch of client machines, each of which is a not-so-new and not-so-powerful PC. (Can't upgrade them! Not in my control.)
What I did:
I followed the distcc documentation.
I installed a virtual machine on the server with exactly the same compiler version and the same distcc version -- basically the same distribution as on the client machines.
After configuring the clients and the server, I can build remotely. I can verify this using the distccmon-text tool. On the server I can see 8 threads started by the distcc daemon, waiting for build jobs to come in. This was good as a first step. You can see the output below to be sure.
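Roughly, the checks look like this (a sketch; the exact process name may differ per distribution):
# on a client: watch jobs as distcc distributes them, refreshing every second
$ distccmon-text 1
# on the server: confirm the distccd daemon and its worker processes are running
$ ps aux | grep [d]istccd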
Second step: since the client machines are dual-core while the server offers 128 cores, and not all clients will be compiling at the same time, I wanted to offload as much of the build as possible to the build server.
Problems:
First problem: distcc, no matter how I try to configure it, always tries to distribute the build jobs equally between the client and the server, even though my configuration file looks as shown below:
# --- /etc/distcc/hosts -----------------------
# See the "Hosts Specification" section of
# "man distcc" for the format of this file.
#
# By default, just test that it works in loopback mode.
# 127.0.0.1
172.24.26.208/8,cpp,lzo
localhost/0
As per the distcc documentation, this should give higher priority to the build server and lower priority to localhost, since localhost comes later in the list. It should also allow 8 jobs on the build server and 0 jobs on localhost. But no, that doesn't happen: upon typing make -j8, it starts 4 threads on localhost and 4 on the remote host. Not good. You can see this in the image below.
Second problem: you would also notice that the pre-processing is being done on the local machine. For this there is a tool that comes with distcc, called "distcc-pump" or pump mode, which can be used like this:
time pump make CC="distcc gcc" CXX="distcc g++" -j8
Unfortunately, pump mode or not, all the pre-processing still happens on localhost, as you can see from the above image. Sad.
Note: at no point does distcc, with the configuration listed here, throw any errors or warnings, neither on the server nor on the clients.
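One way to rule out the host list being silently ignored is to ask distcc which hosts it actually sees, and to run a single compile with verbose client logging (a sketch; hello.c is just a stand-in source file):
# print the host list distcc will really use (DISTCC_HOSTS or the hosts file)
$ distcc --show-hosts
# compile one file with verbose client-side logging to see where the job goes
$ DISTCC_VERBOSE=1 distcc gcc -c hello.c -o hello.o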
Versions:
gcc 4.4.5
distcc 3.2rc1.2
(Before someone suggests "upgrade the software!": newer versions are most likely not possible for me. Anyway, this version of distcc offers the features I need. I could also upgrade the server virtual machine, but then there would be a compiler version mismatch between the clients and the server, and the clients I cannot upgrade.)
Any suggestions or feedback on how to improve this setup and fix these problems are most welcome.

EDIT: the solutions below do not work; I leave them in the answer so that nobody proposes them again.
Try:
removing the line concerning localhost in /etc/distcc/hosts, cf. https://superuser.com/questions/568133/force-most-compilation-to-a-remote-host-with-distcc
or maybe specifying 127.0.0.1 rather than localhost in /etc/distcc/hosts, cf. another problem solved by that substitution in https://distcc.github.io/faq.html

distcc actually differentiates between remote and local CPUs. But contrary to your interpretation, in the hosts file the IP address 127.0.0.1 is considered a remote CPU, and a distccd server is expected to be running there. Any number of jobs you define in the hosts file applies only to these server nodes.
According to the man page, "localhost" is interpreted specially. This is what seems to not work for you. An alternative syntax is --localslots=<int>. Have you tested this?
Additionally, distcc runs some jobs on the local host (the one where you start the driver program). First, all linking is done there. Second, when you specify a certain parallelism with make -jN, all jobs exceeding the available number of remote jobs are run on your local host too, in addition to the workload distribution part of distcc; the option --localslots limits these (the man page does not mention localhost explicitly here, see the sketch below). Finally, there are the jobs that fail on the server and are repeated locally.
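If the intent is to keep only linking and the unavoidable fallbacks on the client, these limits can be written directly into the hosts file, roughly like this (a sketch based on the man page's --localslots/--localslots_cpp options; the slot counts are illustrative, not tuned values):
# ~/.distcc/hosts (sketch)
--localslots=1 --localslots_cpp=16 172.24.26.208/8,cpp,lzo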
For the given 128-core server I would use the number of cores in the hosts file and start only that number of compile jobs:
$ cat ~/.distcc/hosts
172.24.26.208/128,cpp,lzo
$ make -j 128
...
Then I would expect to not see any compile jobs on the local machine.
The man page has some more words regarding recommended job numbers. Search for the section(s) starting with "distcc spreads the jobs across both local and remote CPUs".
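To also move the preprocessing off the client, the same hosts line can be combined with pump mode; the ,cpp option is what permits remote preprocessing in the first place (a sketch reusing the commands already shown above):
$ cat ~/.distcc/hosts
172.24.26.208/128,cpp,lzo
$ pump make CC="distcc gcc" CXX="distcc g++" -j128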

Related

Substrate genesis blocks not matching

I am currently doing this tutorial. When I follow it as described and run the alice and bob nodes on the same machine, it works as expected: the nodes connect and create and finalize blocks. Now I want to get the same working over the internet and on different machines, so I run the bootnode on my PC and the other node on my laptop. I compiled both from the same code and also forwarded the ports in my router, so I expect the same behavior as when running both on my local machine. When I execute them, I see network traffic printed in both consoles, but the bob node prints a warning: Bootnode with peer id '12D3KooWEyoppNCUx8Yx66oV9fJnriXwCcXwDDUA2kj6vnc6iDEp' is on a different chain (our genesis: 0xbfbd…3144 theirs: 0x8859…14c4), and the nodes are not connected (Idle 0 peers).
From the warning I conclude that they do not have the same genesis block, which is obviously necessary to run as a blockchain. But from my understanding, the joining node should copy the current state of the chain from the bootnode. How can I change bob's part to use alice's state of the chain?
Both machines are running rust version 1.50.0
Thanks for your help!
Rust compilations are not deterministic, thus the exact same code for the exact same blockchain compiled on two computers will unfortunately not have the same genesis hash. (Specifically because part of the genesis of that chain is the Wasm Runtime blob, which is compiled non-deterministically with Rust).
You need to create a chain spec file to use for all other nodes. Note that you want to generate it on one computer and pass the file to the other nodes (don't re-generate it, as the genesis would differ, which is what you ran into) before you start those nodes with the correct chain spec and manually specify the boot nodes.
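As a rough sketch of that workflow with the tutorial's node-template binary (the flag names are assumptions and may differ between Substrate versions):
# on machine A: generate a chain spec once, then convert it to the raw format
$ ./target/release/node-template build-spec --chain local > customSpec.json
$ ./target/release/node-template build-spec --chain customSpec.json --raw --disable-default-bootnode > customSpecRaw.json
# copy customSpecRaw.json to machine B, then start both nodes from the same file,
# pointing the second node at the first with --bootnodes
$ ./target/release/node-template --alice --chain customSpecRaw.json --name nodeA
$ ./target/release/node-template --bob --chain customSpecRaw.json --name nodeB --bootnodes /ip4/<machine-A-ip>/tcp/30333/p2p/<machine-A-peer-id>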

The JVM should have exited but did not

During distributed testing with JMeter 3.3 in non-GUI mode I'm getting the following error. How can I fix this?
I'm using the same version of JMeter and the JDK on the master as well as the slave machines.
The JVM should have exited but did not.
The following non-daemon threads are still running (DestroyJavaVM is OK):
Thread[main,5,main],
stackTrace:java.net.SocketInputStream#socketRead0
java.net.SocketInputStream#socketRead
java.net.SocketInputStream#read
java.net.SocketInputStream#read
java.io.BufferedInputStream#fill
java.io.BufferedInputStream#read
java.io.DataInputStream#readByte
sun.rmi.transport.StreamRemoteCall#executeCall
sun.rmi.server.UnicastRef#invoke
java.rmi.server.RemoteObjectInvocationHandler#invokeRemoteMethod
java.rmi.server.RemoteObjectInvocationHandler#invoke
com.sun.proxy.$Proxy19#rrunTest
org.apache.jmeter.engine.ClientJMeterEngine#runTest at line:149
org.apache.jmeter.engine.DistributedRunner#start at line:132
org.apache.jmeter.engine.DistributedRunner#start at line:149
org.apache.jmeter.JMeter#runNonGui at line:1005
org.apache.jmeter.JMeter#startNonGui at line:910
org.apache.jmeter.JMeter#start at line:538
sun.reflect.NativeMethodAccessorImpl#invoke0
sun.reflect.NativeMethodAccessorImpl#invoke
sun.reflect.DelegatingMethodAccessorImpl#invoke
java.lang.reflect.Method#invoke
org.apache.jmeter.NewDriver#main at line:248
I strongly recommend using this JMeter property:
jmeterengine.force.system.exit=true
documented here. These Chinese-language web pages (link, link) tipped me off.
You can add -Jjmeterengine.force.system.exit=true on the command line when launching JMeter, or add jmeterengine.force.system.exit=true to JMETER_HOME/bin/jmeter.properties.
How I Confirmed This Fix
With JMeter 5.1 and java version "1.8.0_231" on MS-Win10, we're using a customized version of this JMeter InfluxDB backend Listener.
After my 60-second test run from the command line (jmeter.bat -n -t plan.jtl), the command line hung after displaying this output (very similar to the OP's):
Tidying up ... # Wed Jan 29 14:41:04 CST 2020 (1580330464874)
... end of run
The JVM should have exited but did not.
The following non-daemon threads are still running (DestroyJavaVM is OK):
Thread[DestroyJavaVM,5,main], stackTrace:
Thread[pool-2-thread-3,5,main], stackTrace:sun.misc.Unsafe#park
java.util.concurrent.locks.LockSupport#parkNanos
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject#awaitNanos
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue#take
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue#take
java.util.concurrent.ThreadPoolExecutor#getTask
java.util.concurrent.ThreadPoolExecutor#runWorker
java.util.concurrent.ThreadPoolExecutor$Worker#run
java.lang.Thread#run
Thread[pool-2-thread-4,5,main], stackTrace:sun.misc.Unsafe#park
java.util.concurrent.locks.LockSupport#parkNanos
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject#awaitNanos
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue#take
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue#take
java.util.concurrent.ThreadPoolExecutor#getTask
java.util.concurrent.ThreadPoolExecutor#runWorker
java.util.concurrent.ThreadPoolExecutor$Worker#run
java.lang.Thread#run
Thread[pool-2-thread-1,5,main], stackTrace:sun.misc.Unsafe#park
java.util.concurrent.locks.LockSupport#parkNanos
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject#awaitNanos
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue#take
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue#take
java.util.concurrent.ThreadPoolExecutor#getTask
java.util.concurrent.ThreadPoolExecutor#runWorker
java.util.concurrent.ThreadPoolExecutor$Worker#run
java.lang.Thread#run
After modifying my command line as follows, jmeter.bat exited cleanly instead of hanging, and the ugly stack trace went away too:
jmeter.bat -n -Jjmeterengine.force.system.exit=true -t plan.jtl
To confirm that the problem was caused by our customized JMeter InfluxDB backend Listener, I removed it from the .jmx and I also removed the jmeterengine.force.system.exit=true. No hang, no ugly stacktrace (I actually love stacktraces).
I have not taken the next step to discover whether the problem is with the official JMeter InfluxDB backend Listener or with our customized variant, which is not (and will never be) available publicly.
I should mention one gap in this story. I feel this test conclusively points to our customized backend listener (or JMeter's). However, it's odd that none of the threads in the above thread dump seem to belong to the backend listener. I applaud JMeter for doing the right thing by dumping the stack trace; few other apps go to the extent of auto-dumping when it is appropriate for troubleshooting. But in this case, perhaps that JMeter auto-dump code needs to be enhanced, because it did not point to the culprit backend listener code. Anyone over there at Apache listening in on this?
Good luck.
Most probably your JMeter engine(s) is (are) overloaded and therefore cannot gracefully shut down the running threads when you ask them to.
Make sure you follow JMeter Best Practices
The very first "best practice" states "Always use the latest version of JMeter", so consider migrating to JMeter 5.0 or whatever the latest version available on the JMeter Downloads page is.
Make sure your JMeter instances have enough headroom to operate in terms of CPU, RAM and so on. You can use JMeter PerfMon Plugin for this if you don't have other monitoring software in place/in mind.
Take a thread dump and examine it - this way you will know where exactly your test is stuck
Introduce reasonable timeout values in the HTTP Request Defaults so that when the server fails to respond, JMeter doesn't wait indefinitely but instead fails with an error.
And finally (though I wouldn't recommend this), you can suppress this check by adding the following line to the user.properties file:
jmeter.exit.check.pause=-1
If you go for this, keep in mind that the JMeter slaves may still be trying to execute something even after your test ends, so you will need to kill and restart the processes manually or with a script.

Where does Cloudera Manager get the --hostname value for Impala commands?

I am working through the process for activating Kerberos on the Cloudera quickstart VM. The VM begins life with hostname = "quickstart.cloudera", but I had to change it to get it into our local DNS consistently. After changing the name I was able to get everything except Impala to come up. The manager is passing it --hostname=quickstart.cloudera even though everything else in the whole system knows the new name. I don't strictly need Impala running for the tests I have to run, but it's driving me nuts. Any clues?
I'm looking at a relatively fresh install of CM 5.3 with Impala 2.1 and I don't see the hostname being passed to the catalog server via cmdline flags.
Unless you're explicitly setting the hostname parameter in CM via a safety-valve configuration (I assume you're not doing this), Impala gets the hostname to use by calling the gethostname() stdlib function (see the gethostname() man page). The log output snippet you showed is somewhat deceiving, because when the flag isn't set, Impala actually sets that value itself and it shows up as if it were passed by the user.
Anyway, you should check that the hostname is properly changed on the box, which may depend on your OS. A few things to try: check that the hostname command returns the name you expect, and make sure you have restarted the OS.
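As a sketch of those checks (assuming the quickstart VM is a CentOS 6 style system; the new name below is a placeholder):
# the running hostname should already report the new name
$ hostname
$ hostname -f
# on CentOS 6 style systems the persistent name lives here and must survive a reboot
$ grep HOSTNAME /etc/sysconfig/network
# DNS/hosts lookup should agree with the new name as well
$ getent hosts quickstart.mynewdomain.example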

Erlang Nodes See Each Other Only After Ping

I am running some Erlang code on Mac OS X, and I have this weird issue. My application is a multi-node app where I have a single instance of a server that is shared between nodes (registered globally).
The code works perfectly, except for one annoying thing: the different Erlang nodes (I am running each node in a different terminal window) can only communicate with each other after a ping!
So if on terminalA I am starting the server, and on terminalB I am running
erl>global:registered_names().
terminalB will return an empty list, unless, before starting the server on terminalA, I have run a ping (from either one of the terminals).
For example, if I do this on either terminal before starting the server:
erl>net_adm:ping("terminalB").
then I start the server and from the second terminal I list the processes:
erl>global:registered_names().
This time I WILL see the registered process from the second terminal.
Is it possible that the mere net_adm:ping call does some kind of work (like DNS resolution or something similar) that enables the communication?
The nodes in a distributed Erlang system are loosely connected. The
first time the name of another node is used, for example if
spawn(Node,M,F,A) or net_adm:ping(Node) is called, a connection
attempt to that node will be made.
I found this at this link: http://www.erlang.org/doc/reference_manual/distributed.html#id85336
I think you should read this article.
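A minimal way to see this lazy connection behaviour (a sketch; the node names and cookie are placeholders, and both nodes need the same cookie):
# terminal A: start a named node
$ erl -sname a -setcookie demo
# terminal B: start a second named node with the same cookie
$ erl -sname b -setcookie demo
# in either Erlang shell, the first use of the other node's name, for example
# net_adm:ping('a@yourhost')., is what actually establishes the connection;
# only after that will global:registered_names(). show names registered on the other node.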

What is good/free software for monitoring IIS in Windows Vista?

I always forget to check what's going on in IIS on our webservers, and am wondering: is there some stupid applet or something that always runs locally that I can click on to check event logs and IIS logs on a remote machine?
Mark
You can set up Samurize to follow the log output on the local and remote machines, but it requires some setup.
You can use a remote shell utility such as OpenSSH to connect to remote machines securely.
One at a time. Compmgmt.msc -> connect to another computer.
But one at a time is boring. Monitoring dozens of machines? I've been using logparser from MS for my log monitoring needs. I run a query that dumps errors and warnings to a csv file a few times a day.
So far, I've only used it to aggregate event logs across the dozen servers in our QA environment, but it accepts many forms of log input, including IIS. A pseudo log-file query (I don't have samples with me):
logparser "Select [eventtype], [message] into output.csv FROM \\server1\system, \\server2\system" -i EVT
This shows that you can aggregate multiple servers, you tell it the input format (it supports a dozen log types), you can dump the output into a CSV file, and the query looks sort of like SQL. This article in SecurityFocus has an IIS log sample.
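As a rough IIS-flavoured example (a sketch; the UNC log path assumes a default IIS 7 layout and is not taken from the article):
logparser "SELECT cs-uri-stem, COUNT(*) AS hits INTO iis_top_urls.csv FROM \\server1\c$\inetpub\logs\LogFiles\W3SVC1\*.log GROUP BY cs-uri-stem ORDER BY hits DESC" -i:IISW3C -o:CSV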
I'm not an applet type of guy, so I haven't thought much about desktop widgets to do this.
