Trying to understand how VUs work in k6 - performance

As shown in the image I ran only 2 VUs but it returned “100” complete. What would that 100 be? Number of scripts run or vus? So for every 1 VU I have 50 scripts running, is that it?

You have configured k6 to run with 2 VUs for 2 minutes (--vus 2 --duration 2m).
You can see this in the "summary" as "2 looping VUs for 2m0s".
This means that you are running the constant-vus executor; the options you provided are just a shortcut for it. Arguably the "looping VUs" description explains it better: you have 2 VUs looping for 2 minutes.
As you can see above, you also ran 100 iterations, that is, 100 executions of the default function. Each takes around 2.42s, and 2 min = 120s, so 120 / 2.42 ≈ 49.6 iterations fit per VU. k6 also lets an iteration that has already started finish within a grace period (gracefulStop, as mentioned in the screenshot), so each VU completes around 50 iterations, which gives 100 in total.
If you just want 2 iterations, add --iterations 2 or use the shared-iterations executor directly; the option is basically a shortcut for that executor.
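The arithmetic in this answer can be checked with a small sketch. It assumes every iteration takes roughly the 2.42s average from the summary and that an iteration still in flight when the 2 minutes are up finishes within gracefulStop. estimate_iterations is a hypothetical helper written for illustration, not part of k6.

```shell
# Rough estimate of completed iterations for a constant-VU k6 run.
# Assumption: each iteration takes about the same time, and an iteration that is
# still running when the duration ends is allowed to finish (gracefulStop).
estimate_iterations() {
  # $1 = VUs, $2 = test duration in seconds, $3 = average iteration time in seconds
  awk -v vus="$1" -v dur="$2" -v iter="$3" 'BEGIN {
    per_vu = dur / iter                  # 120 / 2.42 = 49.58...
    if (per_vu > int(per_vu))            # a partly finished iteration still completes
      per_vu = int(per_vu) + 1
    print per_vu * vus
  }'
}

estimate_iterations 2 120 2.42   # prints 100
```

With 2 VUs this gives 50 iterations each, which matches the 100 shown in the summary.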

-> sudo apt-get update
-> sudo apt-get upgrade
if not, download it again with root manager


Problems with Orca and OpenMPI for parallel jobs

Hello to the community:
I recently started using the ORCA software for some quantum calculations, but I have been having a lot of problems launching a parallel calculation on my university's cluster.
To install ORCA I used the static version:
in a shared directory of the cluster (/data/shared/opt/ORCA/).
And I put this in my ~/.bash_profile:
export PATH="/data/shared/opt/ORCA/orca_4_2_1_linux_x86-64_openmpi314:$PATH"
export LD_LIBRARY_PATH="/data/shared/opt/ORCA/orca_4_2_1_linux_x86-64_openmpi314:$LD_LIBRARY_PATH"
To install the corresponding OpenMPI version (3.1.4) I ran:
tar -xvf openmpi-3.1.4.tar.gz
cd openmpi-3.1.4
./configure --prefix="/data/shared/opt/ORCA/openmpi314/"
make -j 10
make install
When I use the frontend server, everything works fine,
with a .sh like this:
#! /bin/bash
export PATH="/data/shared/opt/ORCA/openmpi314/bin:$PATH"
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/data/shared/opt/ORCA/openmpi314/lib"
$(which orca) test.inp > test.out
and an input like this:
# Computation of myjob at b3lyp/6-31+G(d,p)
%pal nprocs 10 end
%maxcore 8192
! RKS B3LYP 6-31+G(d,p)
! TightSCF Grid5 NoFinalGrid
! Opt
! Freq
smd true
SMDsolvent "water"
* xyz 0 1
C 0 0 0
O 0 0 1.5
The problem appears when I use the nodes:
.inp file:
#! Computation at RKS B3LYP/6-31+G(d,p) for cis1_bh267_m_Cell_152
%pal nprocs 12 end
%maxcore 8192
! RKS B3LYP 6-31+G(d,p)
! TightSCF Grid5 NoFinalGrid
! Opt
! Freq
smd true
SMDsolvent "water"
* xyz 0 1
C -4.38728130 0.21799058 0.17853303
C -3.02072869 0.82609890 -0.29733316
F -2.96869122 2.10937041 0.07179384
F -3.01136328 0.87651596 -1.63230798
C -1.82118365 0.05327804 0.23420220
O -2.26240947 -0.92805650 1.01540713
C -0.53557484 0.33394113 -0.05236121
C 0.54692198 -0.46942807 0.50027196
O 0.31128292 -1.43114232 1.22440290
C 1.93990391 -0.12927675 0.16510948
C 2.87355011 -1.15536140 -0.00858832
C 4.18738231 -0.82592189 -0.32880964
C 4.53045856 0.52514329 -0.45102225
N 3.63662927 1.52101319 -0.26705841
C 2.36381718 1.20228695 0.03146190
F -4.51788749 0.24084604 1.49796862
F -4.53935644 -1.04617745 -0.19111502
F -5.43718443 0.87033190 -0.30564680
H -1.46980819 -1.48461498 1.39034280
H -0.26291843 1.15748249 -0.71875720
H 2.57132559 -2.20300864 0.10283592
H 4.93858460 -1.60267627 -0.48060140
H 5.55483009 0.83859415 -0.70271364
H 1.67507560 2.05019549 0.17738396
.sh file (Slurm job):
#SBATCH -p deflt #which partition I want
#SBATCH -o cis1_bh267_m_Cell_152_myjob.out #path for the slurm output
#SBATCH -e cis1_bh267_m_Cell_152_myjob.err #path for the slurm error output
#SBATCH -c 12 #number of CPUs (logical cores) per task (a task is normally an MPI process; the default is one task, and the option to change it is -n)
#SBATCH -t 2-00:00 #how long I want the resources (this impacts the job priority as well)
#SBATCH --job-name=cis1_bh267_m_Cell_152 #(to recognize your jobs when checking them with "squeue -u USERID")
#SBATCH -N 1 #number of nodes, usually 1 when there is no parallelization over nodes
#SBATCH --nice=0 #lowers your priority if >0
#SBATCH --gpus=0 #number of GPUs you want
# This block is echoing some SLURM variables
echo "Jobid = $SLURM_JOBID"
echo "Jobname = $SLURM_JOB_NAME"
echo "Subcwd = $SLURM_SUBMIT_DIR"
# This block is for the execution of the program
export PATH="/data/shared/opt/ORCA/openmpi314/bin:$PATH"
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/data/shared/opt/ORCA/openmpi314/lib"
$(which orca) ${SLURM_JOB_NAME}.inp > ${SLURM_JOB_NAME}.log --use-hwthread-cpus
I used the --use-hwthread-cpus flag on a recommendation, but the same problem appears with and without it.
The full error is:
There are not enough slots available in the system to satisfy the 12 slots that were requested by the application: /data/shared/opt/ORCA/orca_4_2_1_linux_x86-64_openmpi314/orca_gtoint_mpi
Either request fewer slots for your application, or make more slots available for use. A "slot" is the Open MPI term for an allocatable unit where we can launch a process. The number of slots available are defined by the environment in which Open MPI processes are run:
1. Hostfile, via "slots=N" clauses (N defaults to number of processor cores if not provided)
2. The --host command line parameter, via a ":N" suffix on the hostname (N defaults to 1 if not provided)
3. Resource manager (e.g., SLURM, PBS/Torque, LSF, etc.)
4. If none of a hostfile, the --host command line parameter, or an RM is present, Open MPI defaults to the number of processor cores
In all the above cases, if you want Open MPI to default to the number of hardware threads instead of the number of processor cores, use the --use-hwthread-cpus option.
Alternatively, you can use the --oversubscribe option to ignore the number of available slots when deciding the number of processes to launch.
[file orca_tools/qcmsg.cpp, line 458]:
.... aborting the run
When I look at the output of the calculation, it seems to start running, but it fails when it launches the parallel jobs and gives:
ORCA finished by error termination in GTOInt
Calling Command: mpirun -np 12 --use-hwthread-cpus /data/shared/opt/ORCA/orca_4_2_1_linux_x86-64_openmpi314/orca_gtoint_mpi cis1_bh267_m_Cell_448
[file orca_tools/qcmsg.cpp, line 458]:
.... aborting the run
We have two kinds of nodes on the cluster:
A bunch of them are:
Xeon 6-core E-2136 @ 3.30GHz (12 logical cores) and Nvidia GTX 1070Ti
And the other ones:
AMD Epyc 24-core (24 logical cores) and 4x Nvidia RTX 2080Ti
Using the command scontrol show node, the details of one node from each group are:
First Group:
NodeName=fang1 Arch=x86_64 CoresPerSocket=6
CPUAlloc=12 CPUTot=12 CPULoad=12.00
NodeAddr=fang1 NodeHostName=fang1 Version=19.05.5
OS=Linux 5.7.12-arch1-1 #1 SMP PREEMPT Fri, 31 Jul 2020 17:38:22 +0000
RealMemory=15923 AllocMem=0 FreeMem=171 Sockets=1 Boards=1
State=ALLOCATED ThreadsPerCore=2 TmpDisk=7961 Weight=1 Owner=N/A MCS_label=N/A
BootTime=2020-10-27T09:56:18 SlurmdStartTime=2020-10-27T15:33:51
CurrentWatts=0 AveWatts=0
ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
Second Group
NodeName=fang50 Arch=x86_64 CoresPerSocket=24
CPUAlloc=48 CPUTot=48 CPULoad=48.00
NodeAddr=fang50 NodeHostName=fang50 Version=19.05.5
OS=Linux 5.7.12-arch1-1 #1 SMP PREEMPT Fri, 31 Jul 2020 17:38:22 +0000
RealMemory=64245 AllocMem=0 FreeMem=807 Sockets=1 Boards=1
State=ALLOCATED ThreadsPerCore=2 TmpDisk=32122 Weight=1 Owner=N/A MCS_label=N/A
BootTime=2020-12-15T10:09:43 SlurmdStartTime=2020-12-15T10:14:17
CurrentWatts=0 AveWatts=0
ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
In the Slurm script I use the flag -c, --cpus-per-task = integer, and in the ORCA input the command %pal nprocs integer end. I tested different combinations of these two parameters to see whether I was using more CPUs than are available:
-c, --cpus-per-task = integer
%pal nprocs integer end
I also tried different amounts of memory: 8000 MiB and 2000 MiB (my total memory is around 15 GiB). In all cases the same error appears. I am not an expert in ORCA or in computing (as you may have guessed from the length of the question), so maybe the solution is simple, but I really can't find it. I don't know what's going on!
Many thanks in advance,
I faced the same issue.
Explicitly passing --prefix ${OMPI_HOME} as an ORCA parameter and using the statically linked ORCA version helped me:
export RSH_COMMAND="/usr/bin/ssh"
export PARAMS="--mca routed direct --oversubscribe -machinefile ${HOSTS_FILE} --prefix ${OMPI_HOME}"
Also, it's better to build OpenMPI 3.1.x with the --disable-builtin-atomics flag.
Thank you @Alexey for your answer. And sorry for the wrong tag; like I said, I am pretty much a rookie at this stuff.
The problem was not in the ORCA or OpenMPI configuration but in the bash script used to schedule the Slurm job.
I thought that the entire ORCA job was what Slurm calls a "task". For that reason I set --cpus-per-task to the number of parallel processes I wanted ORCA to run. But each parallel ORCA process (launched through OpenMPI) is itself a task for Slurm. So my Slurm script reserved a node with at least 12 CPUs, but when ORCA launched its parallel processes, each one asked for 12 CPUs, hence "There are not enough slots available ...", because I would have needed 144 CPUs.
The rest of the cases in the table in my question fail for another reason. I was launching 5 different ORCA calculations at the same time. Because --cpus-per-task was unset, 1, 2 or 3, the five calculations could land on the same node, or on another node with that many free CPUs, but when ORCA asked for its parallel processes it failed again because the node did not have that many CPUs free.
The solution I found is pretty simple. In the .sh script for Slurm I put this:
#SBATCH --mincpus=n*m
#SBATCH --ntasks=n
#SBATCH --cpus-per-task m
Instead of only:
#SBATCH --cpus-per-task m
Here n is the number of parallel processes specified in the ORCA input (%pal nprocs n end) and m is the number of CPUs you want to use for each parallel ORCA process. Slurm does not evaluate the expression n*m, so write the product out as a number.
In my case I used n = 12 and m = 1. With the --mincpus flag I made sure to get a node with at least 12 CPUs and to allocate them. What --cpus-per-task does is pretty evident (even for me :-) ); by the way, it defaults to 1, and I don't know whether more than 1 CPU per OpenMPI ORCA process speeds up the calculation. And --ntasks tells Slurm how many tasks you will run.
Of course, if you know the number of tasks and the CPUs per task, it is easy to work out how many CPUs you need to reserve, but I don't know whether this is easy for Slurm too :-). So, to be sure that I allocate the correct number of CPUs, I used the --mincpus flag, but it may not be needed. The thing is that it works now ^_^.
It is also important to keep the amount of memory declared in the ORCA input within the available memory. For example, if you have 12 tasks and 15000 MiB of RAM, you should declare no more than 15000/12 = 1250 MiB per task (%maxcore).
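As a sketch of the bookkeeping in this answer, the helper below prints the Slurm directives and the matching ORCA settings for a given n, m and node memory. slurm_plan is a hypothetical name, and splitting the node memory evenly across the processes is an assumption taken from the 15000/12 example.

```shell
# Print the #SBATCH lines and the ORCA settings for
# n ORCA processes, m CPUs per process and mem_mb of node memory.
slurm_plan() {
  n=$1; m=$2; mem_mb=$3
  echo "#SBATCH --ntasks=$n"
  echo "#SBATCH --cpus-per-task=$m"
  echo "#SBATCH --mincpus=$((n * m))"    # the product has to be written out as a number
  echo "%pal nprocs $n end"
  echo "%maxcore $((mem_mb / n))"        # integer division: 15000 / 12 = 1250
}

slurm_plan 12 1 15000
```

The first three lines of the output go into the Slurm .sh script and the last two into the ORCA .inp file.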
I had a similar problem with parallel jobs before. Slurm also reported the "not enough slots" error.
My solution was to change parallel threads into parallel processes. On my system that meant changing
#SBATCH -c 24
to
#SBATCH -n 24
and everything works just fine.

sge All queues dropped because of overload or full

I'm trying to run a million batch jobs with SGE.
Approximately 10,000 jobs execute fine, but after about an hour the process slows down and eventually stops.
I can't find any errors in the error messages.
The only message I can see is the one below.
"All queues dropped because of overload or full"
How should I configure things so that the jobs run normally?
There is one master server and four clients, and files are shared using NFS.
Every system runs on Docker and Docker Swarm.
This is the qstat output from when job execution slowed down:
$ qstat -j
queue instance "peteris.q@sge00" dropped because it is full
queue instance "peteris.q@sge02" dropped because it is full
queue instance "peteris.q@sge03" dropped because it is full
queue instance "peteris.q@sge01" dropped because it is full
All queues dropped because of overload or full
Detailed messages:
$ qstat -j 1595799
job_number: 1595799
exec_file: job_scripts/1595799
submission_time: Sun May 27 08:08:10 2018
owner: root
uid: 0
group: root
gid: 0
sge_o_home: /root
sge_o_path: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
sge_o_workdir: /data/23andMe
sge_o_host: sge
account: sge
cwd: /data/23andMe
mail_list: root@sge
notify: FALSE
job_name: python3
jobshare: 0
script_file: python3
usage 1: cpu=00:00:02, mem=0.59503 GBs, io=0.03963, vmem=493.180M, maxvmem=493.180M
scheduling info: queue instance "peteris.q@sge00" dropped because it is full
queue instance "peteris.q@sge02" dropped because it is full
queue instance "peteris.q@sge03" dropped because it is full
queue instance "peteris.q@sge01" dropped because it is full
All queues dropped because of overload or full
sge config
algorithm default
schedule_interval 0:0:10
maxujobs 0
queue_sort_method load
job_load_adjustments np_load_avg=100.0
load_adjustment_decay_time 0:7:30
load_formula np_load_avg
schedd_job_info true
flush_submit_sec 2
flush_finish_sec 2
params none
reprioritize_interval 0:0:0
halftime 168
usage_weight_list cpu=1.000000,mem=0.000000,io=0.000000
compensation_factor 5.000000
weight_user 0.250000
weight_project 0.250000
weight_department 0.250000
weight_job 0.250000
weight_tickets_functional 0
weight_tickets_share 0
share_override_tickets TRUE
share_functional_shares TRUE
max_functional_jobs_to_schedule 200
report_pjob_tickets TRUE
max_pending_tasks_per_job 50
halflife_decay_list none
policy_hierarchy OFS
weight_ticket 0.500000
weight_waiting_time 0.278000
weight_deadline 3600000.000000
weight_urgency 0.500000
weight_priority 0.000000
max_reservation 0
default_duration INFINITY
sge queue config
qname peteris.q
hostlist @allhosts
seq_no 0
load_thresholds NONE
suspend_thresholds NONE
nsuspend 1
suspend_interval 00:00:05
priority 0
min_cpu_interval 00:00:05
processors UNDEFINED
ckpt_list NONE
pe_list make
rerun FALSE
slots 20
tmpdir /tmp
shell /bin/bash
prolog NONE
epilog NONE
shell_start_mode posix_compliant
starter_method NONE
suspend_method NONE
resume_method NONE
terminate_method NONE
notify 00:00:01
owner_list NONE
user_lists NONE
xuser_lists NONE
subordinate_list NONE
complex_values NONE
projects NONE
xprojects NONE
calendar NONE
initial_state default
s_fsize INFINITY
h_fsize INFINITY
s_stack INFINITY
h_stack INFINITY
It seems you have hit a practical limit on the number of active jobs the queue can handle at any given time. I cannot confirm where SGE defines the maximum, but it is likely the max_jobs parameter, which is documented as:
The number of active (not finished) jobs simultaneously allowed in Sun Grid Engine is controlled by this parameter. A value greater than 0 defines the limit. The default value 0 means "unlimited". If the max_jobs limit is exceeded by a job submission then the submission command exits with exit status 25 and an appropriate error message.
Changing max_jobs will take immediate effect.
This value is a global configuration parameter only. It cannot be overwritten by the execution host local configuration.
If this is correct, then the value is unlimited; however, SGE will likely not perform well when trying to manage ~1 million active jobs, which is probably the issue you are having. I would recommend using job arrays, since this is exactly what they are for, i.e. managing and running many near-identical tasks.
There are many resources online for job arrays in SGE.
I am happy to assist further if you edit your question with the specific requirements of each task. For example, does each of the ~1 million tasks require one or more parameters as input?
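A minimal sketch of such a job array script is shown below. The array range 1-10000, the input list params.txt (one line of arguments per task) and process.py are all placeholders, since the question does not say what each task needs. Only the queue name peteris.q comes from the question. Clusters usually cap the size of a single array, so a million tasks may have to be split over several submissions.

```shell
#!/bin/bash
#$ -N batch_array
#$ -cwd
#$ -q peteris.q
#$ -t 1-10000

# Hypothetical input list: one line of arguments per array task.
PARAMS_FILE="${PARAMS_FILE:-params.txt}"

# Print the line of the input list that belongs to the given task number.
params_for_task() {
  sed -n "${1}p" "$PARAMS_FILE"
}

# SGE sets SGE_TASK_ID for every array task; outside an array job it is
# expected to be unset or "undefined", in which case nothing is run.
if [ -n "${SGE_TASK_ID:-}" ] && [ "$SGE_TASK_ID" != "undefined" ]; then
  # process.py stands in for the real per-task program; the arguments are
  # deliberately left unquoted so that one line can carry several of them.
  python3 process.py $(params_for_task "$SGE_TASK_ID")
fi
```

Submitted once with qsub, this shows up as a single job with many tasks instead of one queue entry per task, which is much less work for the scheduler.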

Code shows error in cluster but works fine otherwise

Hello everyone,
I have a bash file which has the following code:
./lda --num_topics 15 --alpha 0.1 --beta 0.01 --training_data_file testdata/test_data.txt --model_file Model_Files/lda_model_t15.txt --burn_in_iterations 120 --total_iterations 150
This works perfectly fine when I run it normally, but when I run it on a cluster it does not load the data it is supposed to load from the connected .cc files. I have put #!/bin/bash in the header. What can I do to fix this? Please help!
You need to give the full path to the lda executable. Since you are not invoking it manually, the system will not know where to find the executable when the shell runs it. And since this is not a shell command, you don't necessarily even need the #!/bin/bash.
/<FullPath>/lda --num_topics 15 --alpha 0.1 --beta 0.01 --training_data_file testdata/test_data.txt --model_file Model_Files/lda_model_t15.txt --burn_in_iterations 120 --total_iterations 150
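One way to apply this in a job script is sketched below. It goes a step beyond the answer by also changing into the project directory, on the assumption that the batch system starts the job in a different working directory, so the relative testdata/ and Model_Files/ paths would otherwise not resolve. PROJECT_DIR and run_lda are hypothetical names.

```shell
# PROJECT_DIR is a placeholder: set it to the absolute path of the directory
# that holds the lda binary, testdata/ and Model_Files/.
PROJECT_DIR="${PROJECT_DIR:-/path/to/project}"

# Run lda by its full path from inside PROJECT_DIR, so relative data paths
# resolve as in an interactive session; the subshell leaves the caller's
# working directory untouched.
run_lda() {
  ( cd "$PROJECT_DIR" && "$PROJECT_DIR/lda" "$@" )
}
```

The job script would then call run_lda with the same options as before, for example run_lda --num_topics 15 --alpha 0.1 --beta 0.01 followed by the remaining arguments.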

ab (Apache Bench) error: apr_poll: The timeout specified has expired (70007) on Windows

I'm load testing IIS 7.5 (WinR2/SP1) from my Windows 7/SP1 client. I have a script that makes three ab calls like:
start /B cmd /c ab.exe -k -n 500 -c 50 http://rhvwr2vsu410/HelloWebAPI/Home/SyncProducts > SyncProducts.txt
When the concurrency is > 5, I soon get the error message
apr_poll: The timeout specified has expired (70007)
And ab stops making requests. I don't even get to Completed 100 requests.
This happens within 30 seconds of starting my script. The ab documentation page doesn't say much about it. There is a related Stack Overflow question and a related Server Fault question.
You must have version 2.4 of ab and use the -s timeout option.
Edit: - includes Apache 2.4.x Win32 and Win64.
Deprecated: I used to offer my own win32-x86 binary (compiled with Visual Studio 2008 from trunk on 8 Feb 2013), but it and the components I built it from are no longer available.
ab --help
-s timeout Seconds to max. wait for each response
Default is 30 seconds
Add the option -s 120 to the ab command, where 120 is the new timeout in seconds. If that is not enough, set it even higher.
ab --help
-s timeout Seconds to max. wait for each response
Default is 30 seconds
-k Use HTTP KeepAlive feature
It works for me
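Applied to the command from the question, the change could look like the sketch below. ab_with_timeout is a hypothetical helper that only prints the command line, so it can be inspected first. The -n, -c and -k values are the ones from the question.

```shell
# Print the ab command line with an explicit per-response timeout.
# -s needs ab 2.4 or later; drop the leading "echo" to actually run it.
ab_with_timeout() {
  # $1 = timeout in seconds, $2 = URL to test
  echo ab -s "$1" -k -n 500 -c 50 "$2"
}

ab_with_timeout 120 http://rhvwr2vsu410/HelloWebAPI/Home/SyncProducts
```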
Sounds like an ab bug.
I had a similar problem on OS X (now that you mention it happens on Windows, I feel more confident that ab is the culprit). I went around profiling and tracing my web application, but couldn't find anything. I then tested static pages served by nginx, and it still gave me the error. So I went and found a replacement: JMeter. It works great, but I would still like to know what the ab problem is.

Linpack sometimes starting, sometimes not, but nothing changed

I installed Linpack on a 2-Node cluster with Xeon processors. Sometimes if I start Linpack with this command:
mpiexec -np 28 -print-rank-map -f /root/machines.HOSTS ./xhpl_intel64
linpack starts and prints the output; sometimes I only see the MPI mappings printed and then nothing more. To me this seems like random behaviour, because I don't change anything between the calls and, as already mentioned, Linpack sometimes starts and sometimes doesn't.
In top I can see that xhpl_intel64 processes have been created and are using the CPU heavily, but when I watch the traffic between the nodes, iftop tells me that nothing is being sent.
I am using MPICH2 as MPI implementation. This is my HPL.dat:
# cat HPL.dat
HPLinpack benchmark input file
Innovative Computing Laboratory, University of Tennessee
HPL.out output file name (if any)
6 device out (6=stdout,7=stderr,file)
1 # of problems sizes (N)
10000 Ns
1 # of NBs
250 NBs
0 PMAP process mapping (0=Row-,1=Column-major)
1 # of process grids (P x Q)
2 Ps
14 Qs
16.0 threshold
1 # of panel fact
2 PFACTs (0=left, 1=Crout, 2=Right)
1 # of recursive stopping criterium
4 NBMINs (>= 1)
1 # of panels in recursion
1 # of recursive panel fact.
1 RFACTs (0=left, 1=Crout, 2=Right)
1 # of broadcast
1 BCASTs (0=1rg,1=1rM,2=2rg,3=2rM,4=Lng,5=LnM)
1 # of lookahead depth
1 DEPTHs (>=0)
2 SWAP (0=bin-exch,1=long,2=mix)
64 swapping threshold
0 L1 in (0=transposed,1=no-transposed) form
0 U in (0=transposed,1=no-transposed) form
1 Equilibration (0=no,1=yes)
8 memory alignment in double (> 0)
I then just let the program run for a while, and after 30 min it tells me:
# mpiexec -np 32 -print-rank-map -f /root/machines.HOSTS ./xhpl_intel64
Assertion failed in file ../../socksm.c at line 2577: (it_plfd->revents & 0x008) == 0
internal ABORT - process 0
Is this an MPI problem?
Do you know what type of problem this could be?
I figured out what the problem was: MPICH2 uses different random ports each time it starts, and if these are blocked your application won't start up correctly.
The solution for MPICH2 is to set the environment variable MPICH_PORT_RANGE to START:END, like this:
export MPICH_PORT_RANGE=50000:51000
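A small guard before the export can catch a mistyped range. The sketch below is only an illustration: valid_port_range is a hypothetical helper, and the lower bound of 1024 (staying out of the privileged ports) is an assumption, not an MPICH requirement. The same range also has to be open in the firewall between the nodes.

```shell
# Return success if $1 has the form START:END with 1024 <= START <= END <= 65535.
valid_port_range() {
  case "$1" in
    *[!0-9:]*|*:*:*|:*|*:|"") return 1 ;;   # stray characters or an empty side
    *:*) ;;                                 # exactly one colon, digits on both sides
    *) return 1 ;;                          # no colon at all
  esac
  start=${1%%:*}
  end=${1##*:}
  [ "$start" -ge 1024 ] && [ "$end" -le 65535 ] && [ "$start" -le "$end" ]
}

range="50000:51000"
if valid_port_range "$range"; then
  export MPICH_PORT_RANGE="$range"
  echo "MPICH_PORT_RANGE=$MPICH_PORT_RANGE"
fi
```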
