Cannot run the program with the "mpirun" command (CPU build)

I successfully built a program with Cygwin. However, when I try to run the .exe file with the "mpirun" command, as the program's tutorial says:
https://github.com/jalombar/starsmasher/blob/master/documentation/walkthroughs/star_star_flyby.md
the following error appears:
$ mpirun -np 4 test_cpu_sph
[Francyrad:00524] PMIX ERROR: INIT in file /cygdrive/d/cyg_pub/devel/openmpi/v3.1/openmpi-3.1.5-1.x86_64/src/openmpi-3.1.5/opal/mca/pmix/pmix2x/pmix/src/mca/gds/ds21/gds_ds21_lock_pthread.c at line 188
[Francyrad:00524] PMIX ERROR: SUCCESS in file /cygdrive/d/cyg_pub/devel/openmpi/v3.1/openmpi-3.1.5-1.x86_64/src/openmpi-3.1.5/opal/mca/pmix/pmix2x/pmix/src/mca/common/dstore/dstore_base.c at line 2432
--------------------------------------------------------------------------
Open MPI tried to fork a new process via the "execve" system call but
failed. Open MPI checks many things before attempting to launch a
child process, but nothing is perfect. This error may be indicative
of another problem on the target host, or even something as silly as
having specified a directory for your application. Your job will now
abort.
Local host: Francyrad
Application name: /cygdrive/c/Users/Francyrad/Desktop/starsmasher/GAM1.667_N1.5
Error: /cygdrive/c/Users/Francyrad/Desktop/starsmasher/GAM1.667_N1.5/test_cpu_sph
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun was unable to start the specified application as it encountered an
error:
Error name:
Node: (null)
when attempting to start process rank 34361314336.
--------------------------------------------------------------------------
4 total processes failed to start
[Francyrad:00524] 3 more processes have sent help message help-orte-odls-default.txt / execve error
[Francyrad:00524] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
I tried everything, including changing the syntax, but nothing worked. I have no idea how to get this application to run. What do I have to do?

Hmm, the executable seems to be the problem.
Was your executable built correctly? Do you see it when you run "ls"?
Did you try calling the executable with "mpirun -np 4 ./test_cpu_sph" instead of "mpirun -np 4 test_cpu_sph"?
Could you also give a bit more information, please? (Since you're using Cygwin, I assume you're running on Windows?)
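
The checks suggested above can be scripted before calling mpirun. This is only a minimal sketch, assuming a POSIX shell such as the one Cygwin provides; `check_exe` is a hypothetical helper name and is not part of Open MPI or StarSmasher. It separates the cases of a missing file, a directory passed by mistake (which the error text above mentions as a possible cause), and a file without execute permission.

```shell
#!/bin/sh
# check_exe (hypothetical helper): report whether a path looks like
# something mpirun could launch. Returns 0 only for an executable file.
check_exe() {
    exe="$1"
    if [ ! -e "$exe" ]; then
        echo "missing: $exe"
        return 1
    fi
    if [ -d "$exe" ]; then
        echo "directory, not a program: $exe"
        return 2
    fi
    if [ ! -x "$exe" ]; then
        echo "not executable: $exe"
        return 3
    fi
    echo "ok: $exe"
    return 0
}

# Usage, assuming the binary sits in the current directory:
#   check_exe ./test_cpu_sph && mpirun -np 4 ./test_cpu_sph
```

If the helper prints "ok" and mpirun still fails, the problem is more likely in the MPI installation than in the path to the program.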

Related

Running Windows commands remotely

I'm trying to run a command remotely.
Here is what I've tried
wmic /node:"my_server" /user:my_username /password:my_pass process call create "cmd.exe \c dir C:>C:\temp\x.txt"
I can see the returned process ID, and on the remote machine I see a terminal running with that process ID, but the process is stuck and the output file x.txt is not generated.
Any idea how to make it work?
Any idea why the process is running but not doing anything?
My goal is to get the output back, so writing it to a file is not strictly necessary.

Abnormal termination of GNU Octave script

On an i686 / 32-bit dual CPU with a fresh Debian Stretch installation, I've installed Octave 4.2.1 and run ./mytest after giving it execute permission:
#!/bin/bash
./mytest.m
where mytest.m reads
#!/usr/bin/octave
exit(0)
The result is:
Terminate called after throwing an instance of 'octave::exit_exception'
panic: Aborted -- stopping myself...
attempting to save variables to 'octave-workspace'..
save to 'octave-workspace' complete
octave exited with signal 6
but the program is intended to exit normally. Replacing exit with quit gives the same result, yet Octave terminates correctly when I start it with $ octave -q --no-gui and then type > quit. What's wrong here?
Update: In the meantime, this showed up: http://savannah.gnu.org/bugs/?49271, so now the question is: can a non-Octave configuration solve the problem? (Confirmed: Octave 4.0.0 does not reproduce the error.)

ExitCode of RunProgramInGuest in Jenkins job

I'm running a batch file in a virtual machine from a Jenkins job. I'm using the following command to run it:
..path..\vmrun.exe -T ws -gu username -gp password runProgramInGuest "c:\vm_image.vmx" -activeWindow -interactive "C:\Installer.bat"
The job runs correctly and installs the software (by running the batch file).
But sometimes it exits with exit code 2,
so Jenkins shows the job as failed.
What does exit code 2 mean in this job?
What are the other possible exit codes for this command, and what are their meanings?
How can I tell whether the job passed or failed?
If I understood correctly what you ran, the codes are:
0 – VIX_OK
The operation was successful.
1 – VIX_E_FAIL
Unknown error.
2 – VIX_E_OUT_OF_MEMORY
Memory allocation failed: out of memory.

Running Oprofile with MPI

I'm having issues using Oprofile to profile a parallel program that I call via mpirun. The command I'd like to use is:
$ operf mpirun -n 4 [program and arguments]
Unfortunately, when I do this, operf starts logging, but something funny happens when the MPI program is finished - operf seems to not recognize that it's returned (MPI-spawned processes no longer appear in htop, but operf still does), and things just hang waiting for me to interrupt them.
Is there an option I can pass to operf or mpirun which will make the two play nicely together? Failing that, is there a bash trick I can use to automatically kill operf when my MPI program is finished?
Edit: I previously thought that Oprofile wasn't always generating results, but it turns out that I was just confused and looking in the wrong location. The only problem is that operf doesn't recognize that the MPI program has terminated.
Try using this line; it works like a charm:
sudo operf mpirun --allow-run-as-root -x LD_LIBRARY_PATH="build/" -np 2 (path_to_the_file)

Running remotely Linux script from Windows and get execution result code

I have the following scenario to deal with:
I have to schedule the backup of my company's Linux-based server (running SUSE Linux) with ARCServe R15 (installed on Windows 2003R2SP2).
I know my backup software (ARCServe) lets me add pre/post execution scripts to my backup jobs.
If the script fails, ARCServe can be told NOT to run the backup job; if it succeeds, the job runs. I have no problem with this.
The problem is that I want to make a Windows script (to be launched by ARCServe) that executes a Linux script on the cluster:
- If this Linux script fails, I want my Windows script to fail, so my backup job in ARCServe won't run.
- If the Linux script succeeds, I want my Windows script to end normally with error code 0, so my ARCServe job runs normally.
I've tried creating this batch file (let's call it HPC.bat):
echo ON
start /wait "C:\Program Files\PUTTY\plink.exe" -v -l root -i "C:\IST\admin\scripts\HPC\pri.ppk" [cluster_name] /appli/admin/backup_admin
exit %errorlevel%
If I launch this .bat manually by double-clicking on it, or from a command prompt under Windows, it executes normally and then ends.
If ARCServe launches it, the script never seems to end.
My job stays in "waiting" status; it seems the exit code of the Linux script isn't returned to my batch file, and the batch file doesn't close.
My guess is that plink just opens the connection to the Linux machine, sends the script execution signal, and then closes the connection, so the exit code can't be returned to the batch file. Am I right?
Is what I want to do possible, or am I attempting something impossible?
If so, do I have to proceed differently?
Do I have to use PuTTY or Cygwin instead of plink?
Please help, it's giving me headaches...
If you install Cygwin, you can do it exactly as you would from Linux to Linux, i.e. run a command remotely with ssh someuser@remoteserver.com somecommand
This command returns the same exit code on the calling client as the command exited with on the remote end. If you use SSH keys for authentication instead of passwords, it can also be scripted without user interaction.
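
A minimal sketch of how that return code could be passed on from a wrapper script, assuming a POSIX shell such as the one Cygwin provides. `run_and_report` is a hypothetical helper name; the commented ssh line reuses the user, the `[cluster_name]` placeholder, and the script path from the question, and assumes key-based login is already set up.

```shell
#!/bin/sh
# run_and_report (hypothetical helper): run the given command, print its
# exit status, and return that same status to the caller.
run_and_report() {
    "$@"
    rc=$?
    echo "exit code: $rc"
    return "$rc"
}

# Remote use (the host placeholder is left as in the question):
#   run_and_report ssh root@[cluster_name] /appli/admin/backup_admin
#   exit $?
#
# Local stand-in that behaves like a failing remote script:
#   run_and_report sh -c 'exit 3'
```

Ending the wrapper with `exit $?` hands the remote script's status to whatever launched the wrapper, which is the behaviour the backup job needs in order to decide whether to run.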

Resources