Running parallel simulations (using the Command Line) - omnet++

How can I run the simulation with different configurations? I am using omnet++ version 4.6.
My omnetpp.ini file looks as below :
[General]
[Config Dcn2]
network = Dcn2
# leaf switch
#**.down_port = 2
**.up_port = 16 #12 # 4
# spine switch
**.port = 28 # 20 #2048
# crossconnect
**.cross_down_port = 28 # 20 #2048
**.cross_up_port = 28 # 20 #2048
# to set destination of packet
**.number_leaf_switch = 28 # 20 #2048
# link speed
#**.switch_switch_link_speed = 40 Mbps
**.interArrivalTime = ${exponential(.0001),exponential(0.0002),exponential(0.0003)}
**.batch_length = 10
**.buffer_length = 10
sim-time-limit = 1000s
I want to run the code with different values of interArrivalTime. But I can neither run with different configs (one after another), nor can I run individual runs in parallel on separate cores.
I have tried with cmdev option in run configurations but the different runs doesn't show up apart from the 1st one. When I try mentioning the number of processes to be more than one then also only the first run gets simulated. I really cannot find out the reason.

Config Examinataion
In your case you can perform config examination. OMNeT++ offers different options for that. They are explained under the Parameter Studies section of the OMNeT++ manual.
So you can try one of the following options to examine your configs and thus config file:
./run –a - will show all the configurations in the omnet.ini
./run -x <config_name> - will give more info about a specific config
./run -x <config_name> -g - see all the combinations of configs
First you will have to navigate to your example folder, and there execute one of the aforementioned commands.
I executed: ./run -x Dcn2 -g and got the following resuls
OMNeT++ Discrete Event Simulation (C) 1992-2014 Andras Varga, OpenSim Ltd.
Version: 4.6, build: 141202-f785492, edition: Academic Public License -- NOT FOR COMMERCIAL USE
See the license for distribution terms and warranty disclaimer
Setting up Tkenv...
Config: Dcn2
Number of runs: 3
Run 0: $0=exponential(.0001), $repetition=0
Run 1: $0=exponential(0.0002), $repetition=0
Run 2: $0=exponential(0.0003), $repetition=0
End.
This confirms indeed that you have 3 different runs for the simulation parameter you are trying to modify. However, variable name you are using for the interArrivalTime parameter is assigned to $0 by default because you have not specified it.
If you change the following line in your config:
**.interArrivalTime = ${exponential(.0001),exponential(0.0002),exponential(0.0003)}
to
**.interArrivalTime = ${interArrivalTime = exponential(0.0001),exponential(0.0002),exponential(0.0003)}
you will get a more descriptive output for ./run -x Dcn2 -g
Running different runs of a config:
Next step for you would be to run the different runs for your config. You can do that by navigating to your example directory and execute:
./run -c <config-name> -r <run-number> -u Cmdenv
Note that the <config-name> would be Dcn2 for you, and the -r specifies which of the runs given above you would like to execute.
In other words you can open three terminal windows and navigate to your example directory and do:
./run -c Dcn2 -r 0 -u Cmdenv - for interArrivalTime = exponential(0.0001)
./run -c Dcn2 -r 1 -u Cmdenv - for interArrivalTime = exponential(0.0002)
./run -c Dcn2 -r 2 -u Cmdenv - for interArrivalTime = exponential(0.0003)
Distinguishing Different run results
To be able to distinguish between the output result files of the different runs for your given config you can modify the default name of the output file.
The "how-to" is given in the 12.2.3 Result File Names section of the OMNeT++ manual.
output-vector-file = "${resultdir}/${configname}-${runnumber}.vec"
output-scalar-file = "${resultdir}/${configname}-${runnumber}.sca"
As you can see by default your output files will be distinguished by the ${runnumber} variable. You can further improve it by adding the interArrivalTime to the output file name.
Example:
output-scalar-file = "${resultdir}/${configname}-${runnumber}-IAtime=${interArrivalTime}.sca/vec"
I have not tested the final approach. So you might get some error along the path.

Related

Problems with Orca and OpenMPI for parallel jobs

Hello to the community:
I recently started to use ORCA software for some quantum calculation but I have been having a lot of problems to lunch a parallel calculation in the cluster of my University.
To install Orca I used the static version:
orca_4_2_1_linux_x86-64_openmpi314.tar.xz.
In a shared direction of the cluster (/data/shared/opt/ORCA/).
And putted in my ~/.bash_profile:
export PATH="/data/shared/opt/ORCA/orca_4_2_1_linux_x86-64_openmpi314:$PATH"
export LD_LIBRARY_PATH="/data/shared/opt/ORCA/orca_4_2_1_linux_x86-64_openmpi314:$LD_LIBRARY_PATH"
For the installation of the corresponding OpenMPI version (3.1.4)
tar -xvf openmpi-3.1.4.tar.gz
cd openmpi-3.1.4
./configure --prefix="/data/shared/opt/ORCA/openmpi314/"
make -j 10
make install
When I use the frontend server all is wonderful:
With a .sh like this:
#! /bin/bash
export PATH="/data/shared/opt/ORCA/openmpi314/bin:$PATH"
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/data/shared/opt/ORCA/openmpi314/lib"
$(which orca) test.inp > test.out
and an input like this:
# Computation of myjob at b3lyp/6-31+G(d,p)
%pal nprocs 10 end
%maxcore 8192
! RKS B3LYP 6-31+G(d,p)
! TightSCF Grid5 NoFinalGrid
! Opt
! Freq
%cpcm
smd true
SMDsolvent "water"
end
* xyz 0 1
C 0 0 0
O 0 0 1.5
*
The problem appears when I use the nodes:
.inp file:
#! Computation at RKS B3LYP/6-31+G(d,p) for cis1_bh267_m_Cell_152
%pal nprocs 12 end
%maxcore 8192
! RKS B3LYP 6-31+G(d,p)
! TightSCF Grid5 NoFinalGrid
! Opt
! Freq
%cpcm
smd true
SMDsolvent "water"
end
* xyz 0 1
C -4.38728130 0.21799058 0.17853303
C -3.02072869 0.82609890 -0.29733316
F -2.96869122 2.10937041 0.07179384
F -3.01136328 0.87651596 -1.63230798
C -1.82118365 0.05327804 0.23420220
O -2.26240947 -0.92805650 1.01540713
C -0.53557484 0.33394113 -0.05236121
C 0.54692198 -0.46942807 0.50027196
O 0.31128292 -1.43114232 1.22440290
C 1.93990391 -0.12927675 0.16510948
C 2.87355011 -1.15536140 -0.00858832
C 4.18738231 -0.82592189 -0.32880964
C 4.53045856 0.52514329 -0.45102225
N 3.63662927 1.52101319 -0.26705841
C 2.36381718 1.20228695 0.03146190
F -4.51788749 0.24084604 1.49796862
F -4.53935644 -1.04617745 -0.19111502
F -5.43718443 0.87033190 -0.30564680
H -1.46980819 -1.48461498 1.39034280
H -0.26291843 1.15748249 -0.71875720
H 2.57132559 -2.20300864 0.10283592
H 4.93858460 -1.60267627 -0.48060140
H 5.55483009 0.83859415 -0.70271364
H 1.67507560 2.05019549 0.17738396
*
.sh file (Slurm job):
#!/bin/bash
#SBATCH -p deflt #which partition I want
#SBATCH -o cis1_bh267_m_Cell_152_myjob.out #path for the slurm output
#SBATCH -e cis1_bh267_m_Cell_152_myjob.err #path for the slurm error output
#SBATCH -c 12 #number of cpu(logical cores)/task (task is normally an MPI process, default is one and the option to change it is -n)
#SBATCH -t 2-00:00 #how many time I want the resources (this impacts the job priority as well)
#SBATCH --job-name=cis1_bh267_m_Cell_152 #(to recognize your jobs when checking them with "squeue -u USERID")
#SBATCH -N 1 #number of node, usually 1 when no parallelization over nodes
#SBATCH --nice=0 #lowering your priority if >0
#SBATCH --gpus=0 #number of gpu you want
# This block is echoing some SLURM variables
echo "Jobid = $SLURM_JOBID"
echo "Host = $SLURM_JOB_NODELIST"
echo "Jobname = $SLURM_JOB_NAME"
echo "Subcwd = $SLURM_SUBMIT_DIR"
echo "SLURM_CPUS_PER_TASK = $SLURM_CPUS_PER_TASK"
# This block is for the execution of the program
export PATH="/data/shared/opt/ORCA/openmpi314/bin:$PATH"
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/data/shared/opt/ORCA/openmpi314/lib"
$(which orca) ${SLURM_JOB_NAME}.inp > ${SLURM_JOB_NAME}.log --use-hwthread-cpus
I used the --use-hwthread-cpus flag as a recommendation but the same problem appears with and without this flag.
All the error is:
There are not enough slots available in the system to satisfy the 12 slots that were requested by the application: /data/shared/opt/ORCA/orca_4_2_1_linux_x86-64_openmpi314/orca_gtoint_mpi
Either request fewer slots for your application, or make more slots available for use. A "slot" is the Open MPI term for an allocatable unit where we can launch a process. The number of slots available are defined by the environment in which Open MPI processes are run:
1. Hostfile, via "slots=N" clauses (N defaults to number of processor cores if not provided)
2. The --host command line parameter, via a ":N" suffix on the hostname (N defaults to 1 if not provided)
3. Resource manager (e.g., SLURM, PBS/Torque, LSF, etc.)
4. If none of a hostfile, the --host command line parameter, or an RM is present, Open MPI defaults to the number of processor cores In all the above cases, if you want Open MPI to default to the number
of hardware threads instead of the number of processor cores, use the --use-hwthread-cpus option.
Alternatively, you can use the --oversubscribe option to ignore the number of available slots when deciding the number of processes to launch.
*[file orca_tools/qcmsg.cpp, line 458]:
.... aborting the run*
When I go to the output of the calculation, it looks like start to run but when launch the parallel jobs fail and give:
ORCA finished by error termination in GTOInt
Calling Command: mpirun -np 12 --use-hwthread-cpus /data/shared/opt/ORCA/orca_4_2_1_linux_x86-64_openmpi314/orca_gtoint_mpi cis1_bh267_m_Cell_448.int.tmp cis1_bh267_m_Cell_448
[file orca_tools/qcmsg.cpp, line 458]:
.... aborting the run
We have two kind of nodes on the cluster:
A punch of them are:
Xeon 6-core E-2136 # 3.30GHz (12 logical cores) and Nvidia GTX 1070Ti
And the other ones:
AMD Epyc 24-core (24 logical cores) and 4x Nvidia RTX 2080Ti
Using the command scontrol show node the details of one node of each group are:
First Group:
NodeName=fang1 Arch=x86_64 CoresPerSocket=6
CPUAlloc=12 CPUTot=12 CPULoad=12.00
AvailableFeatures=(null)
ActiveFeatures=(null)
Gres=gpu:gtx1070ti:1
NodeAddr=fang1 NodeHostName=fang1 Version=19.05.5
OS=Linux 5.7.12-arch1-1 #1 SMP PREEMPT Fri, 31 Jul 2020 17:38:22 +0000
RealMemory=15923 AllocMem=0 FreeMem=171 Sockets=1 Boards=1
State=ALLOCATED ThreadsPerCore=2 TmpDisk=7961 Weight=1 Owner=N/A MCS_label=N/A
Partitions=deflt,debug,long
BootTime=2020-10-27T09:56:18 SlurmdStartTime=2020-10-27T15:33:51
CfgTRES=cpu=12,mem=15923M,billing=12,gres/gpu=1,gres/gpu:gtx1070ti=1
AllocTRES=cpu=12,gres/gpu=1,gres/gpu:gtx1070ti=1
CapWatts=n/a
CurrentWatts=0 AveWatts=0
ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
Second Group
NodeName=fang50 Arch=x86_64 CoresPerSocket=24
CPUAlloc=48 CPUTot=48 CPULoad=48.00
AvailableFeatures=(null)
ActiveFeatures=(null)
Gres=gpu:rtx2080ti:4
NodeAddr=fang50 NodeHostName=fang50 Version=19.05.5
OS=Linux 5.7.12-arch1-1 #1 SMP PREEMPT Fri, 31 Jul 2020 17:38:22 +0000
RealMemory=64245 AllocMem=0 FreeMem=807 Sockets=1 Boards=1
State=ALLOCATED ThreadsPerCore=2 TmpDisk=32122 Weight=1 Owner=N/A MCS_label=N/A
Partitions=deflt,long
BootTime=2020-12-15T10:09:43 SlurmdStartTime=2020-12-15T10:14:17
CfgTRES=cpu=48,mem=64245M,billing=48,gres/gpu=4,gres/gpu:rtx2080ti=4
AllocTRES=cpu=48,gres/gpu=4,gres/gpu:rtx2080ti=4
CapWatts=n/a
CurrentWatts=0 AveWatts=0
ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
I use in the script of Slurm the flag -c, --cpus-per-task = integer; and in the input for Orca the command %pal nprocs integer end. I tested different combinations of this two parameters in order to see if I am using more CPU than the available:
-c, --cpus-per-task = integer
%pal nprocs integer end
None
6
None
3
None
2
1
2
1
12
2
6
3
4
12
12
With different amount of memories: 8000 MBi and 2000 MBi (my total memory is around 15 GBi). And in all the cases the same error appears. I am not an expert user neither in ORCA non in informatic (but maybe you guess this for the extension of the question), so maybe the solution is simple but I really don’t have it, Idon't know what's going on!
A lot of thanks in advance,
Alejandro.
Faced the same issue.
Explicit declaration --prefix ${OMPI_HOME} directly as ORCA parameter and using of static linked ORCA version helps me:
export RSH_COMMAND="/usr/bin/ssh"
export PARAMS="--mca routed direct --oversubscribe -machinefile ${HOSTS_FILE} --prefix ${OMPI_HOME}"
$ORCA_DIR/orca $WORKDIR/$JOBFILE.inp "$PARAMS" > $WORKDIR/$JOBFILE.out
Also, It's better to build OpenMPI 3.1.x with --disable-builtin-atomics flag.
Thank you #Alexey for your answer. And sorry for the wrong Tag, like I said, I am pretty rookie on this stuff.
The problem was not in the Orca or OpenMPI configuration but in the bash script used for scheduled the Slurm job.
I thought that the entire Orca job itself was what Slurm call a "task". For that reason I declared the flag --cpus-per-task equal to the number of parallel jobs that I want to do with Orca. But the problem is that each parallel Orca job (that is launch using OpenMPI) is a task for Slurm. Therefore with my Slurm script I was reserving a node with at least 12 CPU, but when Orca launch their parallel jobs, each one ask for 12 CPU, so: "There are not enough slots available ..." because I needed 144 CPU.
The rest of the cases in the table of my Question fails for another reason. I was launching at the same time 5 different Orca calculation. Now, because --cpus-per-task could be None, 1, 2 or 3; the five calculation might enter in the same node or in another node with this amount of free CPU, but when Orca ask for the parallel jobs, fail again because there are not this amount of CPU on the node.
The solution that I found is pretty simple. On the .sh script for Slurm I putted this:
#SBATCH --mincpus=n*m
#SBATCH --ntasks=n
#SBATCH --cpus-per-task m
Instead of only:
#SBATCH --cpus-per-task m
Where n will be equal to the number of parallel jobs specified on the Orca input (%pal nprocs n end) and m the number of CPU that you want to use for each parallel Orca job.
In my case I used n = 12, m = 1. With the flag --mincpus I ensured to take a node with at least 12 CPU and allocated them. With the --cpus-per-task is pretty evident what this flag do (even for me :-) ), which, by the way, has a default value of 1 and I don't know if more than 1 CPU for each OpenMPI Orca job improve the velocity of the calculation. And --ntasks gives the information to Slurm of how many task you will do.
Of course if you know the number of task and the CPU per task is easy to know how many CPU you need to reserve, but I don't know if this is easy to Slurm too :-). So, to be sure that I allocate the correct number of CPU i used --mincpus flag, but maybe is not needed. The thing is that it works now ^_^.
It is also important to take into account the amount of memory that you declare in the input of Orca in order of do not exceed the available memory. For example, if you have 12 task and a RAM of 15000 MBi, the right amount of memory to declared should be no more than 15000/12 = 1250 MBi
I had a similar problem with parallel jobs before. The slurm also output not enough slots error.
My solution is to change parallel threads into parallel processes. For my system is to change
#SBATCH -c 24
into
#SBATCH -n 24
and everything works just fine.

Can a snakemake input rule be defined with different paths/wildcards

I want to know if one can define a input rule that has dependencies on different wildcards.
To elaborate, I am running this Snakemake pipeline on different fastq files using qsub which submits each job to a different node:
fastqc on original fastq - no downstream dependency on other jobs
adapter/quality trimming to generate trimmed fastq
fastqc_after on trimmed fastq (output from step 2) and no downstream dependency
star-rsem pipeline on trimmed fastq (output from step 2 above)
rsem and tximport (output from step 4)
Run multiqc
MultiQC - https://multiqc.info/ - runs on the results folder which has results from fastqc, star, rsem, etc. However, because each job runs on a different node, sometimes Step 3 (fastqc and/or fastqc_after) is still running on the nodes while other steps finish running (Steps 2, 4 and 5) OR vice-versa.
Currently, I can create a MultiQc rule which waits on results from Steps 2, 4, 5 because they are linked to each other by input/output rules.
I have attached my pipeline as png to this post. Any suggestions would help.
What I need: I want to create a "collating" step where I want MultiQC to wait till all steps (from 1 to 5) finish. In other words, using my attached png as guide, I want to define multiple input rules for MultiQC that also wait on results from fastqc
Thanks in advance.
Note: Based on comments I received from 'colin' and 'bli' after my original post, I have shared the code for the different rules here.
Step 1 - fastqc
rule fastqc:
input: "raw_fastq/{sample}.fastq"
output: "results/fastqc/{sample}_fastqc.zip"
log: "results/logs/fq_before/{sample}.fastqc.log"
params: ...
shell: ...
Step 2 - bbduk
rule bbduk:
input: R1 = "raw_fastq/{sample}.fastq"
output: R1 = "results/bbduk/{sample}_trimmed.fastq",
params: ...
log: "results/logs/bbduk/{sample}.bbduk.log"
priority:95
shell: ....
Step 3 - fastqc_after
rule fastqc_after:
input: "results/bbduk/{sample}_trimmed.fastq"
output: "results/bbduk/{sample}_trimmed_fastqc.zip"
log: "results/logs/fq_after/{sample}_trimmed.fastqc.log"
priority: 70
params: ...
shell: ...
Step 4 - star_align
rule star_align:
input: R1 = "results/bbduk/{sample}_trimmed.fastq"
output:
out_1 = "results/bam/{sample}_Aligned.toTranscriptome.out.bam",
out_2 = "results/bam/{sample}_ReadsPerGene.out.tab"
params: ...
log: "results/logs/star/{sample}.star.log"
priority:90
shell: ...
Step 5 - rsem_norm
rule rsem_norm:
input:
bam = "results/bam/{sample}_Aligned.toTranscriptome.out.bam"
output:
genes = "results/quant/{sample}.genes.results"
params: ...
threads = 16
priority:85
shell: ...
Step 6 - rsem_model
rule rsem_model:
input: "results/quant/{sample}.genes.results"
output: "results/quant/{sample}_diagnostic.pdf"
params: ...
shell: ...
Step 7 - tximport_rsem
rule tximport_rsem:
input: expand("results/quant/{sample}_diagnostic.pdf",sample=samples)
output: "results/rsem_tximport/RSEM_GeneLevel_Summarization.csv"
shell: ...
Step 8 - multiqc
rule multiqc:
input: expand("results/quant/{sample}.genes.results",sample=samples)
output: "results/multiqc/project_QS_STAR_RSEM_trial.html"
log: "results/log/multiqc"
shell: ...
If you want rule multiqc to happen only after fastqc completed, you can add the output of fastqc to the input of multiqc:
rule multiqc:
input:
expand("results/quant/{sample}.genes.results",sample=samples),
expand("results/fastqc/{sample}_fastqc.zip", sample=samples)
output: "results/multiqc/project_QS_STAR_RSEM_trial.html"
log: "results/log/multiqc"
shell: ...
Or, if you need to be able to refer to the output of rsem_norm in your shell section:
rule multiqc:
input:
rsem_out = expand("results/quant/{sample}.genes.results",sample=samples),
fastqc_out = expand("results/fastqc/{sample}_fastqc.zip", sample=samples)
output: "results/multiqc/project_QS_STAR_RSEM_trial.html"
log: "results/log/multiqc"
shell: "... {input.rsem_out} ..."
In one of your comments, you wrote:
MultiQC needs directory as input - I give it the 'results' directory in my shell command.
If I understand correctly, this means that results/quant/{sample}.genes.results are directories, and not plain files. If this is the case, you should make sure no downstream rule writes files inside those directories. Otherwise, the directories will be considered as having been updated after the output of multiqc, and multiqc will be re-run every time you run the pipeline.

Pyro4 configuration doesn't change

I put the Pyro4 configuration as this in the starting part of my code:
Pyro4.config.THREADPOOL_SIZE = 1
Pyro4.config.THREADPOOL_SIZE_MIN = 1
I check if I tried to run two client code at the same time, it will say ' rejected: no free workers, increase server threadpool size'. It looks like the setting is working, but when I open the console to check the pyro configuration using "python -m Pyro4.configuration", it returns:
THREADPOOL_SIZE = 40
THREADPOOL_SIZE_MIN = 4
Does someone know why?
When you run python -m Pyro4.configuration, it will simply print the default settings (influenced only by any environment variables you may have set). I'm not sure why you think that this should know about the settings you added in your own code.

command output not captured by shell script when invoked by snmp pass

The problem
SNMPD is correctly delegating SNMP polling requests to another program but the response from that program is not valid. A manual run of the program with the same arguments is responding correctly.
The detail
I've installed the correct LSI raid drivers on a server and want to configure SNMP. As per the instructions, I've added the following to /etc/snmp/snmpd.conf to redirect SNMP polling requests with a given OID prefix to a program:
pass .1.3.6.1.4.1.3582 /usr/sbin/lsi_mrdsnmpmain
It doesn't work correctly for SNMP polling requests:
snmpget -v1 -c public localhost .1.3.6.1.4.1.3582.5.1.4.2.1.2.1.32.1
I get the following response:
Error in packet
Reason: (noSuchName) There is no such variable name in this MIB.
Failed object: SNMPv2-SMI::enterprises.3582.5.1.4.2.1.2.1.32.1
What I've tried
SNMPD passes two arguments, -g and <oid> and expects a three line response <oid>, <data-type> and <data-value>.
If I manually run the following:
/usr/sbin/lsi_mrdsnmpmain -g .1.3.6.1.4.1.3582.5.1.4.2.1.2.1.32.0
I correctly get a correct three line response:
.1.3.6.1.4.1.3582.5.1.4.2.1.2.1.32.0
integer
30
This means that the pass command is working correctly and the /usr/sbin/lsi_mrdsnmpmain program is working correctly in this example
I tried replacing /usr/sbin/lsi_mrdsnmpmain with a bash script. The bash script delegates the call and logs the supplied arguments and output from the delegated call:
#!/bin/bash
echo "In: '$#" > /var/log/snmp-pass-test
RETURN=$(/usr/sbin/lsi_mrdsnmpmain $#)
echo "$RETURN"
echo "Out: '$RETURN'" >> /var/log/snmp-pass-test
And modified the pass command to redirect to the bash script. If I run the bash script manually /usr/sbin/snmp-pass-test -g .1.3.6.1.4.1.3582.5.1.4.2.1.2.1.32.0 I get the correct three line response as I did when I ran /usr/sbin/lsi_mrdsnmpmain manually and I get the following logged:
In: '-g .1.3.6.1.4.1.3582.5.1.4.2.1.2.1.32.0
Out: '.1.3.6.1.4.1.3582.5.1.4.2.1.2.1.32.0
integer
30'
When I rerun the snmpget test, I get the same Error in packet... error and the bash script's logging shows that the captured delegated call output is empty:
In: '-g .1.3.6.1.4.1.3582.5.1.4.2.1.2.1.32.0
Out: ''
If I modify the bash script to only echo an empty line I also get the same Error in packet... message.
I've also tried ensuring that the environment variables that are present when I manually call /usr/sbin/lsi_mrdsnmpmain are the same for the bash script but I get the same empty output.
Finally, my questions
Why would the bash script behave differently in these two scenarios?
Is it likely that the problem that exists with the bash scripts is the same as originally noticed (manually running program has different output to SNMPD run program)?
Updates
eewanco's suggestions
What user is running the program in each scenario?
I added echo "$(whoami)" > /var/log/snmp-pass-test to the bash script and root was added to the logs
Maybe try executing it in cron
Adding the following to root's crontab and the correct three line response was logged:
* * * * * /usr/sbin/lsi_mrdsnmpmain -g .1.3.6.1.4.1.3582.5.1.4.2.1.2.1.32.1 >> /var/log/snmp-test-cron 2>&1
Grisha Levit's suggestion
Try logging the stderr
There aren't any errors logged
Checking /var/log/messages
When I run it via SNMPD, I get MegaRAID SNMP AGENT: Error in getting Shared Memory(lsi_mrdsnmpmain) logged. When I run it directly, I don't. I've done a bit of googling and I may need lm_sensors installed; I'll try this.
I installed lm_sensors & compat-libstdc++-33.i686 (the latter because it said it was a pre-requisite from the instructions and I was missing it), uninstalled and reinstalled the LSI drivers and am experiencing the same issue.
SELinux
I accidently stumbled upon a page about extending snmpd with scripts and it says to check the script has the right SELinux context. I ran grep AVC /var/log/audit/audit.log | grep snmp before and after running a snmpget and the following entry is added as a direct result from running snmpget:
type=AVC msg=audit(1485967641.075:271): avc: denied { unix_read unix_write } for pid=5552 comm="lsi_mrdsnmpmain" key=558265 scontext=system_u:system_r:snmpd_t:s0 tcontext=system_u:system_r:initrc_t:s0 tclass=shm
I'm now assuming that SELinux is causing the call to fail; I'll dig further...see answer for solution.
strace (eewanco's suggestion)
Try using strace with and without snmp and see if you can catch a system call failure or some additional hints
For completeness, I wanted to see if strace would have hinted that SELinux was denying. I had to remove the policy packages using semodule -r <policy-package-name> to reintroduce the problem then ran the following:
strace snmpget -v1 -c public localhost .1.3.6.1.4.1.3582.5.1.4.2.1.2.1.32.1 >> strace.log 2>&1
The end of strace.log is as follows and unless I'm missing something, it doesn't seem to provide any hints:
...
sendmsg(3, {msg_name(16)={sa_family=AF_INET, sin_port=htons(161), sin_addr=inet_addr("127.0.0.1")}, msg_iov(1)= [{"0;\2\1\0\4\20public\240$\2\4I\264-m\2"..., 61}], msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_IP, cmsg_type=, ...}, msg_flags=0}, MSG_DONTWAIT|MSG_NOSIGNAL) = 61
select(4, [3], NULL, NULL, {0, 999997}) = 1 (in [3], left {0, 998475})
brk(0xab9000) = 0xab9000
recvmsg(3, {msg_name(16)={sa_family=AF_INET, sin_port=htons(161), sin_addr=inet_addr("127.0.0.1")}, msg_iov(1)= [{"0;\2\1\0\4\20public\242$\2\4I\264-m\2"..., 65536}], msg_controllen=0, msg_flags=0}, MSG_DONTWAIT) = 61
write(2, "Error in packet\nReason: (noSuchN"..., 81Error in packet
Reason: (noSuchName) There is no such variable name in this MIB.
) = 81
write(2, "Failed object: ", 15Failed object: ) = 15
write(2, "SNMPv2-SMI::enterprises.3582.5.1"..., 48SNMPv2- SMI::enterprises.3582.5.1.4.2.1.2.1.32.1
) = 48
write(2, "\n", 1
) = 1
brk(0xaa9000) = 0xaa9000
close(3) = 0
exit_group(2) = ?
+++ exited with 2 +++
It was SELinux that was denying snmpd a delegated call to /usr/sbin/lsi_mrdsnmpmain (and probably beyond).
To identify it, I ran grep AVC /var/log/audit/audit.log and for each entry, I ran the following:
echo "<grepped-output>" | audit2allow -a -M <filename>
This creates a SELinux policy package that should allow the delegated call through. The package is then loaded using the following:
semodule -i <filename>.pp
I had to do this 5 times as there were different causes of denial (unix_read unix_write, associate, read write). I'll look to combine the modules into one.
Now when I run snmpget I get the correct delegated output:
SNMPv2-SMI::enterprises.3582.5.1.4.2.1.2.1.32.1 = INTEGER: 34

Store the log output into a file with cmdenv-output-file

I need to recover the content of the show log module of Omnet++/Tkenv into a file, I added in the omnetpp.ini:
cmdenv-express-mode = false
cmdenv-output-file = log.txt
but I have two types of problems:
1) after the simulation, I did not find the "log.txt" If I do not create it
2) and when I created it before launching the simulation under ../omnetpp-4.6/log.txt also I find it empty
I used EV << to display the content of variables that I used, I need to resolve this problem in order to analyze the traffic so how can I do that please?
You have to start your simulation in Cmdenv mode. To do that go to Run | Run Configurations | select your configuration, then select Command line as User interface. The log file is created in simulations directory by default.

Resources