How to cycle through a small set of options within a for loop? - bash

I have a bunch of jobs I need to submit to a job queue. The queue has 8 different machines I can pick from, or I can submit to any available server. Sometimes a server may be faulty, so I would like to be able to cycle through the servers I send my jobs to. A barebones version is below:
# jobscript.sh
dir='some/directory/of/files/to/process'
for fn in $(ls $dir); do
    submit_job -q server#machine -x python script.py $fn
done
If I don't care what machine the job is sent to, I remove the #machine portion, so the command is just submit_job -q server -x python script.py $fn.
If I do want a specific machine, I append a number after machine: server#machine1, then on the next iteration server#machine2, then server#machine3, and so on. If I only use the first 3 servers, the output of the script would look like the following:
submit_job -q server#machine1 -x python script.py file1
submit_job -q server#machine2 -x python script.py file2
submit_job -q server#machine3 -x python script.py file3
submit_job -q server#machine1 -x python script.py file4
submit_job -q server#machine2 -x python script.py file5
submit_job -q server#machine3 -x python script.py file6
submit_job -q server#machine1 -x python script.py file7
submit_job -q server#machine2 -x python script.py file8
...
The list of available servers is [1, 2, 3, 4, 5, 6, 7, 8], but I would additionally like to specify, from the command line, a list of servers to ignore, so something like
$ bash jobscript.sh -skip 1,4,8
which would cycle through only 2, 3, 5, 6, 7 and produce the output
submit_job -q server#machine2 -x python script.py file1
submit_job -q server#machine3 -x python script.py file2
submit_job -q server#machine5 -x python script.py file3
submit_job -q server#machine6 -x python script.py file4
submit_job -q server#machine7 -x python script.py file5
submit_job -q server#machine2 -x python script.py file6
submit_job -q server#machine3 -x python script.py file7
submit_job -q server#machine5 -x python script.py file8
submit_job -q server#machine6 -x python script.py file9
...
If the -skip flag is not present, just run the command without the #machine portion, which lets the queue decide where to place each job; the commands then look like
submit_job -q server -x python script.py file1
submit_job -q server -x python script.py file2
submit_job -q server -x python script.py file3
submit_job -q server -x python script.py file4
submit_job -q server -x python script.py file5
submit_job -q server -x python script.py file6
submit_job -q server -x python script.py file7
submit_job -q server -x python script.py file8
submit_job -q server -x python script.py file9
...

Something like this should do most of the work for you:
#!/bin/bash
machines=(1 2 3 4 5 6 7 8)
skip_arr=(1 4 8)

# Keep only the machines that do not appear in skip_arr.
declare -a arr
for m in "${machines[@]}"; do
    if [[ ! " ${skip_arr[*]} " =~ " $m " ]]; then
        arr+=("$m")
    fi
done

arr_len="${#arr[@]}"
declare -i i=0
for f in *; do
    machine="${arr[i % arr_len]}"
    echo "file is $f, machine is $machine"
    (( i++ ))
done
Right now, I've set it up to go through the current directory, and just echo the values of the machine and filename. Obviously you'll want to change this to actually execute the commands from the right directory.
The last thing you need to do is build up skip_arr from the command line input, and then check if it's empty when you're executing your command.
Hopefully this gets you most of the way there. Let me know if you have any questions about anything I've done here.
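For the command-line piece, here is a minimal sketch of one way to fill skip_arr from the question's -skip 1,4,8 argument; the flag name and comma-separated format come from the question, and the rest is an assumption about how you invoke the script:
# Hypothetical parsing for the question's -skip 1,4,8 flag.
skip_arr=()
if [[ "$1" == "-skip" && -n "$2" ]]; then
    # Split the comma-separated list into an array.
    IFS=',' read -r -a skip_arr <<< "$2"
fi
If skip_arr ends up empty, branch and submit with plain -q server instead of -q server#machineN.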

Cycle through an array of machines
#!/bin/bash
rotate() {
    if [[ "$1" = "all" ]]; then
        machines=(1 2 3 4 5 6 7 8)
    else
        machines=("$@")
    fi
    idx=0
    max=${#machines[@]}
    for ((fn=0; fn<20; fn++)); do
        if (( max > 0 )); then
            servernr=${machines[idx]}
            (( idx = (idx + 1) % max ))
        else
            servernr=""
        fi
        echo "submit -q server${servernr} file${fn}"
    done
}
# test
echo "Rotate 0 machines"
rotate
echo "Rotate all machines"
rotate all
echo "Rotate some machines"
rotate 2 5 6
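To wire the same rotation into the question's actual submission loop, a sketch under the question's assumptions (submit_job, script.py, and $dir come from the question; the machine list here is whatever survives -skip):
# Hedged adaptation: the same modulo rotation, applied to real files.
machines=(2 3 5 6 7)   # e.g. the servers left after -skip 1,4,8
idx=0
for fn in "$dir"/*; do
    if (( ${#machines[@]} > 0 )); then
        suffix="#machine${machines[idx]}"
        (( idx = (idx + 1) % ${#machines[@]} ))
    else
        suffix=""   # empty list: let the queue decide
    fi
    submit_job -q "server${suffix}" -x python script.py "$fn"
done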

Related

how to use trickle to limit upload bandwidth from .sh file?

I want to limit the upload bandwidth of the Linux version of the 115.com webapp. This webapp is actually run by "sh /usr/local/115/115.sh". If I do
trickle -s -u 5 sh /usr/local/115/115.sh, the upload limit does not take effect.
The inside of /usr/local/115/115.sh is
#!/bin/sh
export LD_LIBRARY_PATH=/usr/local/115/lib:$LD_LIBRARY_PATH
export PATH=/usr/local/115:$PATH
/bin/bash -c "exec -a $0 /usr/local/115/115 > /dev/null 2>&1" $0
I feel I need to put the trickle command inside 115.sh. How exactly should I do it? Thanks.
I tried
trickle -s -u 5 /bin/bash -c "exec -a $0 /usr/local/115/115 > /dev/null 2>&1" $0
/bin/bash -c "exec -a trickle -s -u 5 $0 /usr/local/115/115 > /dev/null 2>&1" $0
and
/bin/bash -c "exec -a $0 trickle -s -u 5 /usr/local/115/115 > /dev/null 2>&1" $0
but still the speed limit is not effective.
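One thing worth checking first: trickle shapes traffic by preloading a shim library via LD_PRELOAD, so it can only affect dynamically linked executables. A quick diagnostic, with the binary path taken from the question:
# If this reports "statically linked", trickle's LD_PRELOAD shim cannot
# attach to the binary and no trickle invocation around it will help.
file /usr/local/115/115
ldd /usr/local/115/115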

Bash Upload file over Netcat

I'm trying to write a bash script that will curl a file and send it to my server over netcat, then sleep 10 seconds, send another file, and sleep for 1 hour before repeating the whole process.
The first file is uploaded successfully but the second file is not, and I don't know what is wrong with my code.
Any help will be appreciated.
#!/bin/bash
file="curl -L mydomain.net/file.txt -o file.php"
file2="curl -L mydomain.net/file2.txt -o file2.php"
while true
do
    if cat <(echo "${file}") | nc -u 120.0.0.1 4444 -w 1
       echo -e "\e[92m[*][INFO] file1 uploaded"
       sleep 10
    then
        cat <(echo "${file2}") | nc -u 120.0.0.1 4444 -w 1
        echo -e "\e[91m[*][INFO] file2 uploaded"
        sleep 3600
    fi
done
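One observation about the structure: everything between if and then is part of the condition list, so the branch is gated on the exit status of sleep 10 (always zero) rather than on the first upload. A hedged restructuring that tests each send directly (the host 120.0.0.1, port 4444, and the curl command strings are carried over verbatim from the question; 120.0.0.1 may be a typo for 127.0.0.1):
#!/bin/bash
file="curl -L mydomain.net/file.txt -o file.php"
file2="curl -L mydomain.net/file2.txt -o file2.php"

while true; do
    # Send the first command string over UDP and report success.
    if echo "${file}" | nc -u 120.0.0.1 4444 -w 1; then
        echo -e "\e[92m[*][INFO] file1 uploaded"
    fi
    sleep 10
    # Send the second command string, then wait an hour before repeating.
    if echo "${file2}" | nc -u 120.0.0.1 4444 -w 1; then
        echo -e "\e[91m[*][INFO] file2 uploaded"
    fi
    sleep 3600
done
Note that with UDP, nc's exit status does not guarantee the datagram was received, so the success messages are optimistic.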

bash can't capture output from aria2c to variable and stdout

I am trying to use aria2c to download a file. The command looks like this:
aria2c --http-user=$USER --http-passwd=$usepw -x 16 -s 100 $urlPath
The command works perfectly from the script when run this way. What I'm trying to do is capture the output from the command to a variable and still display it on the screen in real-time.
I have successfully captured the output to a variable by using:
VAR=$(aria2c --http-user=$USER --http-passwd=$usepw -x 16 -s 100 $urlPath)
With this scenario though, there's a long delay on the screen where there's no update while the download is happening. I have an echo command after this line in the script and $VAR has all of the aria2c download data captured.
I have tried using different combinations of 2>&1 and | tee /dev/tty at the end of the command, but nothing shows in the display in real time.
Example:
VAR=$(aria2c --http-user=$USER --http-passwd=$usepw -x 16 -s 100 $urlPath 2>&1)
VAR=$(aria2c --http-user=$USER --http-passwd=$usepw -x 16 -s 100 $urlPath 2>&1 | tee /dev/tty )
VAR=$(aria2c --http-user=$USER --http-passwd=$usepw -x 16 -s 100 $urlPath | tee /dev/tty )
VAR=$((aria2c --http-user=$USER --http-passwd=$usepw -x 16 -s 100 $urlPath) 2>&1)
VAR=$((aria2c --http-user=$USER --http-passwd=$usepw -x 16 -s 100 $urlPath) 2>&1 | tee /dev/tty )
VAR=$((aria2c --http-user=$USER --http-passwd=$usepw -x 16 -s 100 $urlPath) 2>&1 ) | tee /dev/tty )
I've been able to use the "2>&1 | tee" combination before with other commands, but for some reason I can't seem to capture aria2c output to both simultaneously. Has anyone had any luck doing this from a bash script?
Since aria2c seems to output to stdout, consider teeing that to stderr:
var=$(aria2c --http-user=$USER --http-passwd=$usepw -x 16 -s 100 $urlPath | tee /dev/fd/2)
The stdout ends up in var while tee duplicates it to stderr, which displays to your screen.
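As an aside, quoting the expansions guards against spaces or glob characters in the password or URL; the command is otherwise unchanged:
var=$(aria2c --http-user="$USER" --http-passwd="$usepw" -x 16 -s 100 "$urlPath" | tee /dev/fd/2)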

Shell script to batch download files using curl + cookie and merge those files

I have a list of urls to files that I want to download and join. Those can only be accessed when authenticated.
So first I call:
curl -c cookie.txt http://url.to.authenticate
Then I can download a file file1 using the cookie:
curl -b cookie.txt -O http://url.to.file1
At the end I would just use cat:
cat file1 file2 file3 ... > file_merged
I have 320 of those urls stored in a text file and want to create a script with these urls included in it, so all I need to do is copy the script to a remote computer and execute it.
I am not that good at shell scripting and would love it if someone could help me out a bit.
Maybe something a little more fail-proof than
#!/bin/sh
curl -c cookie.txt http://url.to.authenticate
curl -b cookie.txt -O http://url.to.file1
curl -b cookie.txt -O http://url.to.file2
curl -b cookie.txt -O http://url.to.file3
...
cat file1 file2 file3 ... file320 > file_merged
So, something like (if your list of files is stored in files.txt):
#!/bin/sh
curl -c cookie.txt http://url.to.authenticate
while IFS= read -r f; do
    curl -b cookie.txt -O http://url.to."$f"
    cat "$f" >> file_merged
    rm -f "$f"
done < files.txt
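Since the question asks for the URLs to live inside the script itself, here is a sketch of that variant; the URLs are placeholders carried over from the question, and -f makes curl fail on HTTP errors so a bad download aborts the merge instead of corrupting it:
#!/bin/sh
curl -c cookie.txt http://url.to.authenticate
while IFS= read -r u; do
    [ -n "$u" ] || continue
    # Without -O, curl writes the body to stdout, so it can be
    # appended straight to the merged file with no temporary files.
    curl -b cookie.txt -f "$u" >> file_merged || {
        echo "download failed: $u" >&2
        exit 1
    }
done <<'EOF'
http://url.to.file1
http://url.to.file2
http://url.to.file3
EOF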

How to tell curl to check file existence before download?

I use this command to download a series of images:
curl -O --max-time 10 --retry 3 --retry-delay 1 http://site.com/image[0-100].jpg
Some images are corrupted, so I delete them.
for i in *.jpg; do jpeginfo -c "$i" || rm "$i"; done
How to tell curl to check file existence before download?
I can use this command to prevent curl from overwriting existing images:
chmod 000 *.jpg
But I don't want to re-download them.
If the target resource is static, curl has an option, -z, that downloads the target only if it is newer than a given local copy.
Usage example:
curl -z image0.jpg http://site.com/image0.jpg
An example for your case:
for i in $(seq 0 100); do curl -z image$i.jpg -O --max-time 10 --retry 3 --retry-delay 1 http://site.com/image$i.jpg; done
No idea about doing it with curl alone, but you could check for the file with Bash before you run the curl command. Using the question's own URL pattern:
for i in $(seq 0 100); do
    if [[ ! -e "image$i.jpg" ]]; then
        curl -O --max-time 10 --retry 3 --retry-delay 1 "http://site.com/image$i.jpg"
    fi
done
