Heterogeneous nodes in OpenMPI - parallel-processing

I am new to OpenMPI. I heard that it supports heterogeneous nodes.
I have a couple of Raspberry Pis and an i7 machine with me. I have installed OpenMPI on all of them. I have also configured password-less ssh so that the master (the i7 PC) can launch a process on the Raspberry Pis.
When I run a simple hello_MPI.exe using the following command from the i7 machine,
mpiexec -machinefile machinefile -n 2 hello_MPI.exe
Nothing happens! It hangs. However, hello_MPI.exe executes properly when I work with only the 2 Raspberry Pis (one of the Pis is the master in this case; the i7 machine is not used as a computing node).
Additional information:
hello_MPI.exe is in the same directory on all the nodes (the 2 Raspberry Pis and the i7 machine). The machinefile contains the IP addresses of the 2 Raspberry Pis. The .exe file on the i7 machine and on the Pis is not the same binary, i.e. the one on each Pi is compiled on that Pi and the one on the i7 machine is compiled on the i7 PC.
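For reference, the machinefile follows Open MPI's usual hostfile format; a minimal sketch with placeholder addresses (the real file lists the two Pis' actual IPs):
# machinefile -- placeholder addresses, one line per Raspberry Pi
192.168.1.101 slots=1
192.168.1.102 slots=1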
It would be very helpful if anyone could tell me what's happening here.
Thanks!

Related

What happens if I use more cores in QEMU than total available cores in host

I am running the dhrystone benchmarking tool to see the performance of qemu-system-riscv64, which is running an Ubuntu 22.04 pre-installed image. The host machine has 2 cores with 1 thread each. I ran tests on qemu-system-riscv64 with 1, 2 and 4 cores (specified with the -smp flag). I observed that when I go from 1 core to 2 cores for qemu-system-riscv64 the dhrystones increase, but when I go from 2 cores to 4 cores the number of dhrystones becomes lower than with 2 cores. What can be the reason for this behavior? I am using the following command to boot Ubuntu 22.04:
qemu-system-riscv64 \
-machine virt -nographic -m 2048 -smp 4 \
-kernel $UBOOTPATH/u-boot.bin \
-device virtio-net-device,netdev=eth0 -netdev user,id=eth0,hostfwd=::<host_port>-:<VM_port> \
-drive file=ubuntu-22.04.1-preinstalled-server-riscv64+unmatched.img,format=raw,if=virtio
I also tried running make with the -j flag; the same behavior occurs with -j4 versus -j2, as described above.
The QEMU target riscv64-softmmu supports MTTCG, so every emulated guest core runs in a separate host thread, and guest performance is therefore bounded by the total host processing power. That is, with a guest capable of using all available guest cores on an otherwise idle host system, adding a guest core will increase overall guest performance as long as the total number of guest cores does not exceed the number of host cores. After that the host CPU load approaches 100% and adding further guest cores only increases contention for host CPU time.
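As a quick sanity check before choosing -smp, compare the requested guest core count with what the host actually has; a minimal sketch, assuming a Linux host:
# on the host: number of online hardware threads (2 in the setup above)
nproc
# keep the guest at or below that number, e.g. reuse the boot command above with -smp 2 instead of -smp 4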

Installing Hadoop over 5 hard drives on a desktop

I have been working on installing Hadoop. I followed the instructions in a Udemy course and installed Hadoop in pseudo-distributed mode on my laptop. It was fairly straightforward.
After that, I started to wonder if I could set up Hadoop on a desktop computer. So I went out and bought an empty case and put in a 64-bit, 8-core AMD processor, along with a 50 GB SSD and 4 inexpensive 500 GB hard drives. I installed Ubuntu 14.04 on the SSD and put virtual machines on the other drives.
I'm envisioning using my SSD as the master and using my 4 hard drives as nodes. Again, everything is living in the same case.
Unfortunately, I've been searching everywhere and I can't find any tutorials, guides, books, etc. that describe setting up Hadoop in this manner. It seems like almost everything I've found that details installation of Hadoop is either a simple pseudo-distributed setup (which I've already done), or else the instructions jump straight to large-scale commercial applications. I'm still learning the basics, clearly, but I'd like to play in this sort of in-between place.
Has anyone done this before, and/or come across any documentation / tutorials / etc that describe how to set Hadoop up in this way? Many thanks in advance for the help.
You can run Hadoop in different VMs located on different drives in the same system.
But you need to use the same configuration on all of the master and slave nodes.
Also ensure that all the VMs have different IP addresses.
You can get different IP addresses by connecting your machine to the LAN, or you may need to change the VMs' network settings (typically switching from NAT to bridged networking) so that each VM gets its own address.
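If the VMs run under VirtualBox (an assumption; other hypervisors have equivalent settings), switching a VM's adapter from NAT to bridged networking is what gives each VM its own address on the LAN; a sketch with a hypothetical VM name and host interface:
# attach the first NIC of the VM "hadoop-slave01" to the host interface eth0 in bridged mode
VBoxManage modifyvm "hadoop-slave01" --nic1 bridged --bridgeadapter1 eth0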
If you have done the Hadoop installation in pseudo-distributed mode, then the steps below may help you.
MULTINODE :
Configure the hosts in the network using the following settings in the hosts file. This has to be done on all machines (on the namenode too).
sudo vi /etc/hosts
add the following lines in the file:
yourip1 master
yourip2 slave01
yourip3 slave02
yourip4 slave03
yourip5 slave04
[Save and exit – type ESC then :wq ]
Change the hostname for the namenode and datanodes.
sudo vi /etc/hostname
For the master machine (namenode): master
For the other machines: slave01, slave02, slave03 and slave04
Restart the machines so that the network-related settings take effect.
sudo shutdown -r now
Copy the key from the master node to all datanodes, so that the machines can be accessed without asking for a password every time.
ssh-copy-id -i ~/.ssh/id_rsa.pub hduser@slave01
ssh-copy-id -i ~/.ssh/id_rsa.pub hduser@slave02
ssh-copy-id -i ~/.ssh/id_rsa.pub hduser@slave03
ssh-copy-id -i ~/.ssh/id_rsa.pub hduser@slave04
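If the master does not yet have a key pair, generate one first as the same user (a sketch; hduser matches the user assumed in the commands above):
# run as hduser on the master; creates ~/.ssh/id_rsa and ~/.ssh/id_rsa.pub with no passphrase
ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa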
Now we are about to adjust the Hadoop configuration settings, so navigate to the configuration folder.
cd ~/hadoop/etc/hadoop
Edit the slaves file within that directory.
vi ~/hadoop/etc/hadoop/slaves
And add the below :
master
slave01
slave02
slave03
slave04
Now update localhost to master in core-site.xml, hdfs-site.xml, mapred-site.xml and yarn-site.xml.
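For example, in core-site.xml the default filesystem entry ends up pointing at the master instead of localhost (a sketch; port 9000 is an assumption, keep whatever port the pseudo-distributed setup already used):
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://master:9000</value>
</property>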
Now copy the files in the hadoop/etc/hadoop folder from the master to the slave machines.
Then format the namenode and start the Hadoop services.
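Concretely, on the master that usually amounts to something like the following (a sketch; the paths assume the ~/hadoop install directory used above):
~/hadoop/bin/hdfs namenode -format
~/hadoop/sbin/start-dfs.sh
~/hadoop/sbin/start-yarn.sh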
Those are some clues for how to configure a Hadoop multinode cluster.
I have never tried it, but if you type ifconfig it gives you the same IP address on all the VMs on the different hard drives, so this may not be the best option to go with.
You can also try creating a Hadoop cluster on Amazon EC2 for free; there are step-by-step and video guides for this online.
Hope it helps!

DSE with Hadoop: Error in getting started

I am facing a problem in DSE with Hadoop.
Let me describe the setup, including steps in some details, for you to be able to help me.
I set up a three-node cluster of DSE, with the cluster name 'training'. All three machines are running Ubuntu 14.04, 64-bit, with 4 GB RAM.
DSE was installed using the GUI installer (run with sudo). After the installation, the cassandra.yaml file was modified to set
rpc_address: 0.0.0.0
One by one, the three nodes were started. A keyspace with replication_factor = 3 was created. Data was inserted and accessed from any other node successfully.
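For reference, creating such a keyspace from cqlsh looks roughly like this (a sketch; the keyspace name and SimpleStrategy are placeholders for whatever was actually used):
CREATE KEYSPACE training_ks WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};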
Then DSE was installed on a fourth machine (let's call it HadoopMachine), again with the same configuration, using the GUI installer (sudo).
/etc/default/dse was modified as follows:
HADOOP_ENABLED=1
Then, on this HadoopMachine, the following command is run:
sudo service dse start
So far so good.
Then from the installation directory:
bin/dse hadoop fs -mkdir /user/hadoop/wordcount
This fails and gives a very long series of error messages, running into hundreds of lines, ending with:
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1368)
at com.datastax.bdp.loader.SystemClassLoader.tryLoadClassInBackground(SystemClassLoader.java:163)
at com.datastax.bdp.loader.SystemClassLoader.loadClass(SystemClassLoader.java:117)
at com.datastax.bdp.loader.SystemClassLoader.loadClass(SystemClassLoader.java:81)
at com.datastax.bdp.loader.SystemClassLoader.loadClass(SystemClassLoader.java:75)
at java.util.ResourceBundle$RBClassLoader.loadClass(ResourceBundle.java:503)
at java.util.ResourceBundle$Control.newBundle(ResourceBundle.java:2640)
at java.util.ResourceBundle.loadBundle(ResourceBundle.java:1501)
at java.util.ResourceBundle.findBundle(ResourceBundle.java:1465)
at java.util.ResourceBundle.findBundle(ResourceBundle.java:1419)
at java.util.ResourceBundle.getBundleImpl(ResourceBundle.java:1361)
at java.util.ResourceBundle.getBundle(ResourceBundle.java:890)
at sun.util.resources.LocaleData$1.run(LocaleData.java:164)
at sun.util.resources.LocaleData$1.run(LocaleData.java:160)
FATAL ERROR in native method: processing of -javaagent failed
bin/dse: line 192: 12714 Aborted (core dumped) "$HADOOP_BIN/hadoop" "$HADOOP_CMD" $HADOOP_CREDENTIALS "${#:2}"
I don't know what the problem is or how to fix it.
I will appreciate any help. Thanks.
I managed to find the solution after a lot of struggle. I had been guessing all this time that the problem was one mis-step somewhere that was not very obvious, at least to me, and that's how it turned out.
So for the benefit of anybody else who may face the same problem, what the problem was and what worked is as follows.
The DSE documentation specifies that for DSE with integrated Hadoop you must have Oracle JRE 7. I, perhaps foolishly, assumed it meant Oracle JRE 7 or higher. So I had JRE 8 on my machine and never realized that that would be the issue. I removed JRE 8, installed JRE 7, and bingo, it worked.
I am amazed. Now I realize that since DSE uses Hadoop 1.0.4 (an ancient version), it works with JRE 7 only. JRE 8 must have come out after Hadoop 1.0.4, and something in JRE 8 must be incompatible with it, I guess.
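On Ubuntu, checking which JRE is active and switching between installed ones can be done with update-alternatives (a sketch; the exact entries listed depend on how the JREs were installed):
java -version                           # confirm which JRE is currently active
sudo update-alternatives --config java  # pick the Oracle JRE 7 entry from the list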

Spark: how to set worker-specific SPARK_HOME in standalone mode [duplicate]

This question already has answers here:
How to use start-all.sh to start standalone Worker that uses different SPARK_HOME (than Master)?
(3 answers)
Closed 4 months ago.
I'm setting up a [somewhat ad-hoc] cluster of Spark workers: namely, a couple of lab machines that I have sitting around. However, I've run into a problem when I attempt to start the cluster with start-all.sh: namely, Spark is installed in different directories on the various workers. But the master invokes $SPARK_HOME/sbin/start-all.sh on each one using the master's definition of $SPARK_HOME, even though the path is different for each worker.
Assuming I can't install Spark on identical paths on each worker to the master, how can I get the master to recognize the different worker paths?
EDIT #1 Hmm, found this thread in the Spark mailing list, strongly suggesting that this is the current implementation, which assumes $SPARK_HOME is the same for all workers.
I'm playing around with Spark on Windows (my laptop) and have two worker nodes running by starting them manually using a script that contains the following
set SPARK_HOME=C:\dev\programs\spark-1.2.0-worker1
set SPARK_MASTER_IP=master.brad.com
spark-class org.apache.spark.deploy.worker.Worker spark://master.brad.com:7077
I then create a copy of this script with a different SPARK_HOME defined to run my second worker from. When I kick off a spark-submit I see this on Worker_1
15/02/13 16:42:10 INFO ExecutorRunner: Launch command: ...C:\dev\programs\spark-1.2.0-worker1\bin...
and this on Worker_2
15/02/13 16:42:10 INFO ExecutorRunner: Launch command: ...C:\dev\programs\spark-1.2.0-worker2\bin...
So it works, and in my case I duplicated the Spark installation directory, but you may be able to get around this.
You might want to consider assigning the directory by changing the SPARK_WORKER_DIR line in the spark-env.sh file.
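That setting lives in conf/spark-env.sh on each worker; a minimal sketch with a placeholder path:
# conf/spark-env.sh on the worker
export SPARK_WORKER_DIR=/path/to/worker/scratch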
A similar question was asked here
The solution I used was to create a symbolic link mimicking the master node's installation path on each worker node, so that when start-all.sh, executing on the master node, does its SSH into the worker node, it sees identical paths for running the worker scripts.
For example, in my case I had 2 Macs and 1 Linux machine. Both Macs had Spark installed under /Users/<user>/spark, however the Linux machine had it under /home/<user>/spark. One of the Macs was the master node, so running start-all.sh would error each time on the Linux machine due to the path mismatch (error: /Users/<user>/spark does not exist).
The simple solution was to mimic the Mac's pathing on the Linux machine using a symbolic link:
Open a terminal, then:
cd /                    # go to the root of the filesystem
sudo ln -s home Users   # create a symlink "Users" pointing to the actual "home" directory

Hadoop cluster configuration with Ubuntu Master and Windows slave

Hi I am new to Hadoop.
Hadoop Version (2.2.0)
Goals:
Setup Hadoop standalone - Ubuntu 12 (Completed)
Setup Hadoop standalone - Windows 7 (cygwin being used for only sshd) (Completed)
Setup cluster with Ubuntu master and Windows 7 slave (this is mostly for learning purposes and setting up an env for development) (Stuck)
Setup in relation to the questions below:
Master running on Ubuntu with Hadoop 2.2.0
Slaves running on Windows 7 with a self-compiled version from the Hadoop 2.2.0 source. I am using Cygwin only for sshd.
Password-less login is set up and I am able to log in both ways using ssh from outside Hadoop. Since my Ubuntu and Windows machines have different usernames, I have set up a config file in the .ssh folder which maps hosts to users.
Questions:
In a cluster, does the username on the master need to be the same as on the slave? The reason I am asking is that after configuring the cluster, when I try to use start-dfs.sh the logs say that it is able to ssh into the slave nodes but was not able to find the location "/home/xxx/hadoop/bin/hadoop-daemon.sh" on the slave. The "xxx" is my master username and not the slave one. Also, since my slave is a pure Windows install, it lives under C:/hadoop/... Does the master look at the env variable $HADOOP_HOME to check where the install is on the slave? Are there any other env variables that I need to set?
My goal was to use the Windows Hadoop build on the slave since Hadoop officially supports Windows now. But is it better to run the Linux build under Cygwin to accomplish this? The question comes up because I see that start-dfs.sh is trying to execute hadoop-daemon.sh and not some *.cmd.
If this setup works out in future, a possible question I have is whether Pig, Mahout, etc. will run in this kind of setup, as I have not seen builds of Pig or Mahout for Windows. Do these components need to be present only on the master node or do they need to be on the slave nodes too? I saw 2 ways of running Mahout when experimenting with standalone mode: first using the mahout script, which I was able to use on Linux, and second using the yarn jar command, where I passed in the Mahout jar while using the Windows version. If Mahout/Pig (when using the provided sh script) assume that the slaves already have the jars in place, then the Ubuntu + Windows combo does not seem to work. Please advise.
As I mentioned, this is more of an experiment than an implementation plan. Our final env will be completely on Linux. Thank you for your suggestions.
You may have more success going with more standard ways of deploying Hadoop. Try using Ubuntu VMs for the master and slaves.
You can also try a pseudo-distributed deployment in which all of the processes run on a single VM, and thus avoid the need to even consider multiple OSes.
I have only worked with the same username. In general SSH allows you to log in with a different login name using the -l option, but this might get tricky. You have to list your slaves in the slaves file.
At least in the manual https://hadoop.apache.org/docs/r0.19.1/cluster_setup.html#Slaves I did not find anything about adding usernames. It might be worth trying to add -l login_name to the slave node entry in the slaves conf file and seeing if it works.
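Alternatively, the host-to-user mapping the question already mentions can live in ~/.ssh/config on the master, so that plain ssh (and therefore the start scripts) picks the right login per host without any -l flag; a sketch with placeholder names:
# ~/.ssh/config on the Ubuntu master (hostname and username are placeholders)
Host winslave
    HostName winslave.example.com
    User windowsuser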
