Shell Scripts: Nested for Loop iterating only once - shell

I'm trying to automate the download of multiple files from a server. A nested loop walks through the directories. The files in each directory are numbered, so I iterate through them with a counter. A 404 response indicates that a directory has no more files to fetch. However, the break statement inside the if seems to break out of both loops when that happens. I tried break 1 to indicate that only one loop should be stopped, but to no avail.
page=$1
chapStart=$2
chapEnd=$3
k=$2
for i in {$2..$3}
do
j=1
while true
do
if ((j < 10))
then
pg=0"$j"
else
pg=$j
fi
res=$(wget --timeout=10 http://www.somesite.com/content/content/"$page"/"$k"/0"$pg".file)
if(($? != 0))
then
break 1
fi
let ++j
done
let ++k
done
I need the break to exit just the while loop when encountered.
Edit
As per chepner's correct answer, the problem was in the for loop structure rather than the break statement, which functions rather beautifully. Updated, working code is:
page=$1
chapStart=$2
chapEnd=$3
k=$2
for((i=$2;i<=$3;i++));
do
j=1
while true
do
if ((j < 10))
then
pg=0"$j"
else
pg=$j
fi
res=$(wget --timeout=10 http://www.somesite.com/content/content/"$page"/"$k"/0"$pg".file)
if(($? != 0))
then
break 1
fi
let ++j
done
let ++k
done
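To check the behaviour of break in isolation, here is a minimal sketch (the loop bounds and counter names are made up for illustration). A plain break, which is the same as break 1, leaves only the innermost enclosing loop, so the outer for loop still runs all of its iterations.

```shell
#!/usr/bin/env bash
# Hypothetical bounds: 3 outer passes, inner loop stops after 2 passes.
outer_iterations=0
inner_total=0
for ((i = 1; i <= 3; i++)); do
    outer_iterations=$((outer_iterations + 1))
    j=1
    while true; do
        if ((j > 2)); then
            break            # same as "break 1": exits only the while loop
        fi
        inner_total=$((inner_total + 1))
        j=$((j + 1))
    done
done
echo "outer=$outer_iterations inner=$inner_total"
```

If break had left both loops, the outer counter would stop at 1 instead of reaching 3.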

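You cannot use parameter expansion inside brace expansions: bash performs brace expansion first, so {$2..$3} is not recognised as a sequence. Your outer loop therefore only has one iteration, with i set to a single literal word (for example {1..5}) rather than to each number in turn. You should use one of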
for ((i=$2; i<=$3; i++)); do
or
for i in $(seq $2 $3); do
instead.
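A small sketch of the difference, assuming bash and using the made-up variables start and end in place of $2 and $3. The brace form produces a single word, while the arithmetic form iterates over the range.

```shell
#!/usr/bin/env bash
start=3   # stands in for $2 (assumed value)
end=5     # stands in for $3 (assumed value)

# Brace expansion runs before parameter expansion, so no sequence is built.
literal_count=0
for i in {$start..$end}; do
    literal_count=$((literal_count + 1))
    literal_word=$i
done

# The arithmetic for loop evaluates the variables itself.
arith_count=0
for ((i = start; i <= end; i++)); do
    arith_count=$((arith_count + 1))
done

echo "brace form: $literal_count iteration, i='$literal_word'"
echo "arithmetic form: $arith_count iterations"
```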

Related

How to avoid skipping items in a list in zsh?

I am using zsh on Mac. I created a list, subjects, containing about 25 items.
Now I would like to run all possible pairwise comparisons between the items in the list, e.g. subject1 vs. subject2, but without repeating a pair in reverse order (such as subject1 vs. subject2 and then subject2 vs. subject1). Here is my code for this task:
subjects=(Subject1 Subject2 Subject4 Subject5 Subject6 Subject7 Subject8 Subject9 Subject10 Subject11 Subject12 Subject13 Subject14 Subject15 Subject16 Subject17 Subject18 Subject19 Subject20 Subject22 Subject23 Subject24 Subject25)
for i in $subjects
do
for j in $subjects
do
if [[ $i < $j ]]
then
echo "Processing pair $i - $j ..."
fi
done
done
The problem is that zsh skips the subjects from subject10 to subject19 and directly jumps to subject20 after comparing subject1 vs. subject9.
Where is the flaw in my code?
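I would not iterate over the array elements, but over the array indices, which allows us to use a numeric comparison. (Note that < inside [[ ]] compares strings lexicographically, so for example Subject10 sorts before Subject2.)
for i in {1..$#subjects}
do
for j in {1..$#subjects}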
do
if ((i < j))
then
echo Processing pair $subjects[i] and $subjects[j]
fi
done
done
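For comparison, the same index-based idea can be sketched in bash. This is not the zsh answer above but an adaptation: bash arrays are 0-based, and the three-element list is a shortened stand-in for the real one. Starting the inner index at i + 1 makes the explicit comparison unnecessary.

```shell
#!/usr/bin/env bash
subjects=(Subject1 Subject2 Subject10)   # shortened sample list (assumption)

# String comparison is lexicographic: "Subject10" sorts before "Subject2".
lexicographic=no
if [[ Subject10 < Subject2 ]]; then
    lexicographic=yes
fi

# Index-based pairing: every unordered pair appears exactly once.
pairs=()
for ((i = 0; i < ${#subjects[@]}; i++)); do
    for ((j = i + 1; j < ${#subjects[@]}; j++)); do
        pairs+=("${subjects[i]} - ${subjects[j]}")
    done
done
printf 'Processing pair %s ...\n' "${pairs[@]}"
```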

Fibonacci & for loop: how are the commands executed step by step?

#!/bin/bash
a=0
b=1
echo "give a number:"
read n
clear
echo "the fibonacci sequence until $n:"
for (( i=0; i<n; i++ ))
do
echo -n "$a "
c=$((a + b))
a=$b
b=$c
done
If I understand it correctly, this code echoes the value of $a on every pass of the loop, then shifts the variables as shown, and repeats on the next pass until i reaches n.
Question: if each pass computes a new c, why do we echo $a? I can see that a=$b, b=$c and c=$((a + b)), but I don't understand why the echo refers to $a.
Is there a more elegant solution?
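One way to see why the loop prints $a is to trace the variables on each pass. The sketch below wraps the loop from the question in a hypothetical function fib_trace and prints a and b before they are shifted: a always holds the next number to print, while b (and the freshly computed c) run one step ahead.

```shell
#!/usr/bin/env bash
# fib_trace N: print the loop state for the first N passes (illustrative helper).
fib_trace() {
    local -i n=$1 a=0 b=1 c i
    for ((i = 0; i < n; i++)); do
        printf 'i=%d a=%d b=%d\n' "$i" "$a" "$b"
        c=$((a + b))   # the number after b
        a=$b           # shift: a becomes the old b
        b=$c           # shift: b becomes the new number
    done
}
fib_trace 5
```

Echoing $c instead would skip the first two numbers of the sequence (0 and 1), because c is first computed as 0 + 1.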
You mean, “never ever calculate anything needlessly, ever”? It is possible, of course, but it depends on how much ugliness in the control logic you are willing to tolerate. In the example below, fibonacci1 calculates at most one extra element of the series that may not get printed out and fibonacci2 never calculates any extra series elements and everything makes it to the standard output.
Is any of that “elegant”? Probably not. This is actually a common problem most people encounter when coding (in languages other than purely functional ones): Most high(er)-level languages (unlike e.g. assemblers) provide predefined control structures that work great in typical and obvious cases (e.g. one control variable and one operation per iteration) but may become “suboptimal” in more complex scenarios.
A notoriously common example is a variable that stores a value from the previous iteration. Let’s assume you assign it at the very end of the loop. That works fine, but… Could you avoid the very last assignment (because it is useless), instead of leaving it to the compiler’s wisdom? Yes, you could, but then (e.g.) for ((init; condition; step)); do ...; ((previous = current)); done becomes (e.g.) for ((init;;)); do ...; ((step)); ((condition)) || break; ((previous = current)); done.
On one hand, a tiny bit of something (such as thin air) may have been “saved”. On the other hand, the code became assembler-like and harder to write, read and maintain.
To find a balance there^^^ and {not,} optimize when it {doesn’t,does} matter is a lifelong struggle. It may be something like CDO, which is like OCD, but sorted correctly.
fibonacci1() {
local -ai fib=(0 1)
local -i i
for ((i = $1; i > 2; i -= 2)) {
printf '%d %d ' "${fib[@]}"
fib=($((fib[0] + fib[1])) $((fib[0] + 2 * fib[1])))
}
echo "${fib[@]::i}"
}
fibonacci2() {
trap 'trap - return; echo' return
local -i a=0 b=1 i="$1"
((i)) || return 0
printf '%d' "$a"
((--i)) || return 0
printf ' %d' "$b"
for ((;;)); do
((--i)) || return 0
printf ' %d' "$((a += b))"
((--i)) || return 0
printf ' %d' "$((b += a))"
done
}
for ((i = 0; i <= 30; ++i)); do
for fibonacci in fibonacci{1,2}; do
echo -n "${fibonacci}(${i}): "
"$fibonacci" "$i"
done
done
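The loop rewrite described in this answer (moving the step and the condition into the body so that the final previous = current is skipped) can be sketched as follows. The squares and the counter names are placeholders invented for the illustration; the counters only exist to show that the second form performs one assignment fewer.

```shell
#!/usr/bin/env bash
# Conventional form: the assignment also runs on the last pass, where it is unused.
assignments_plain=0
for ((i = 0; i < 4; i++)); do
    current=$((i * i))
    previous=$current
    assignments_plain=$((assignments_plain + 1))
done

# Rewritten form: step and test live in the body, so the last assignment is skipped.
assignments_early=0
for ((i = 0;;)); do
    current=$((i * i))
    i=$((i + 1))
    ((i < 4)) || break
    previous=$current
    assignments_early=$((assignments_early + 1))
done

echo "plain=$assignments_plain early=$assignments_early"
```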

BASH: How to write values generated by a for loop to a file quickly

I have a for loop in bash that writes values to a file. However, because there are a lot of values, the process takes a long time, which I think can be saved by improving the code.
nk=1152
nb=24
for k in $(seq 0 $((nk-1))); do
for i in $(seq 0 $((nb-1))); do
for j in $(seq 0 $((nb-1))); do
echo -e "$k\t$i\t$j"
done
done
done > file.dat
I've moved the output action to after the entire loop is done rather than echo -e "$k\t$i\t$j" >> file.dat to avoid opening and closing the file many times. However, the speed the script writes to the file is still rather slow, ~ 10kbps.
Is there a better way to improve the IO?
Many thanks
Jacek
It looks like the seq calls are fairly punishing since that is a separate process. Try this just using shell math instead:
for ((k=0;k<=$nk-1;k++)); do
for ((i=0;i<=$nb-1;i++)); do
for ((j=0;j<=$nb-1;j++)); do
echo -e "$k\t$i\t$j"
done
done
done > file.dat
It takes just 7.5s on my machine.
Another way is to compute the sequences just once and use them repeatedly, saving a lot of shell calls:
nk=1152
nb=24
kseq=$(seq 0 $((nk-1)))
bseq=$(seq 0 $((nb-1)))
for k in $kseq; do
for i in $bseq; do
for j in $bseq; do
echo -e "$k\t$i\t$j"
done
done
done > file.dat
This is not really "better" than the first option, but it shows how much of the time is spent spinning up instances of seq versus actually getting stuff done.
Bash isn't always the best for this. Consider this Ruby equivalent which runs in 0.5s:
#!/usr/bin/env ruby
nk=1152
nb=24
nk.times do |k|
nb.times do |i|
nb.times do |j|
puts "%d\t%d\t%d" % [ k, i, j ]
end
end
end
The most time-consuming part is calling seq in a nested loop. Keep in mind that each time you call seq, the shell loads the command from disk, forks a process to run it, captures the output, and stores the whole output sequence in memory.
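As a rough back-of-the-envelope check, the number of seq invocations in the original script can be counted from the loop bounds given in the question: one call for the k loop, one per k value for the i loop, and one per (k, i) pair for the j loop.

```shell
#!/usr/bin/env bash
nk=1152
nb=24
# 1 call for k, nk calls for i, nk*nb calls for j
seq_calls=$((1 + nk + nk * nb))
# every (k, i, j) triple produces one output line
output_lines=$((nk * nb * nb))
echo "$seq_calls seq invocations for $output_lines output lines"
```

That is tens of thousands of process launches, which is consistent with the answers here attributing the slowness to seq rather than to file I/O.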
Instead of calling seq you could use an arithmetic loop:
#!/usr/bin/env bash
declare -i nk=1152
declare -i nb=24
declare -i i j k
for ((k=0; k<nk; k++)); do
for (( i=0; i<nb; i++)); do
for (( j=0; j<nb; j++)); do
printf '%d\t%d\t%d\n' "$k" "$i" "$j"
done
done
done > file.dat
Running seq in a subshell consumes most of the time.
Switch to a different language that provides all the needed features without shelling out. For example, in Perl:
#!/usr/bin/perl
use warnings;
use strict;
use feature qw{ say };
my $nk = 1152;
my $nb = 24;
for my $k (0 .. $nk - 1) {
for my $i (0 .. $nb - 1) {
for my $j (0 .. $nb - 1) {
say "$k\t$i\t$j"
}
}
}
The original bash solution runs for 22 seconds, the Perl one finishes in 0.1 seconds. The output is identical.
@Jacek : I don't think the I/O is the problem, but the number of child processes spawned. I would store the result of the seq 0 $((nb-1)) into an array and loop over the array, i.e.
nb_seq=( $(seq 0 $((nb-1))) )
...
for i in "${nb_seq[@]}"; do
for j in "${nb_seq[@]}"; do
seq is bad :) I once wrote this function specially for this case:
$ que () { printf -v _N %$1s; _N=(${_N// / 1}); printf "${!_N[*]}"; }
$ que 10
0 1 2 3 4 5 6 7 8 9
You can also try collecting everything in a variable first and then writing the whole variable to the file:
store+="$k\t$i\t$j\n"
printf "$store" > file
No, it's even worse like that.

Beginner Shell, can't find the issue (array sorting)

I'm working on a little script that puts random numbers into an array of size 10,000 and then sorts the array with the method asked for in the course.
I've written this code, and it seems to start sorting (when I test it, some "a" lines are printed, but not as many as there should be, and I don't understand why).
I believe the problem comes from my test on the val array. It's probably a beginner's error, but I don't know how to search for it on the web because I don't know which line is at fault.
I don't necessarily need an answer; some clues to help me find it would be good :)
Here is my code (I'm new to Stack Overflow, so I don't know how to format code properly; if anyone can show me, please do):
for i in `seq 1 10000`;
do
val[${i}]=$RANDOM
done
echo `date +"%M.%S.%3N"`
FLAG=0
until [ $FLAG -eq 1 ]
do
FLAG=1
for j in `seq 1 9999`;
do
if [ ${val[${j}]} -gt ${val[${j+1}]} ]
then
TMP=${val[${j}]}
val[${j}]=${val[${j+1}]}
val[${j+1}]=$TMP
FLAG=0
echo a
fi
done
done
echo `date +"%M.%S.%3N"`
As asked: I can't really show useful output, since all I want is the time before and after the sort operation. The sort is just supposed to order the values from lowest to highest by taking them two at a time and swapping them if necessary, repeating until no numbers are swapped.
Edit: I tried with manual number:
10 3 6 9 1
When I run it with echo ${val[*]} inside the for loop, it just prints the same list four times in the same order, so I'm guessing it doesn't work at all... Is my use of "if" wrong?
Edit 2: At the beginning, I did it in C#, and then I wanted to do it in shell, firstly because I wanted to practice shell and secondly because I wanted to compare efficiency and the time needed for the same task. Here is the working C# code.
Random random = new Random();
int[] _tab = new int[100000];
for (int i = 0; i < _tab.Length; i++)
{
_tab[i] = random.Next(1, _tab.Length);
}
bool perm;
int tmp;
DateTime dt = DateTime.Now;
do
{
perm = false;
for (int i = 0; i < (_tab.Length - 1); i++)
{
if (_tab[i] > _tab[i + 1])
{
tmp = _tab[i];
_tab[i] = _tab[i + 1];
_tab[i + 1] = tmp;
perm = true;
}
}
}
while (perm == true);
Console.WriteLine((DateTime.Now - dt).TotalMilliseconds);
Console.Read();
Thanks :)
If I understand correctly, you want to know why this script prints fewer "a" lines than expected, i.e. why the array produced in the first for loop is not being sorted. If so, here is a solution:
The syntax of your variable expansion is incorrect. ${var} cannot contain math operators inside the braces, because they have a different meaning there: ${j+1} means "expand to 1 if j is set", so it always yields 1 here rather than j plus one. In a normal non-associative array Zsh evaluates subscripts arithmetically, so you can use ${array[var+1]} instead of ${array[${var+1}]} as you previously did.
I suspect the reason this came about - complicated, error prone POSIX syntax - would have been avoided by using simplified Zsh syntax, but as stated in an earlier comment, it would not be portable to other shells.
Some shells support similar features: Bash supports most, but not bare subscripts ($array[var]). Strings may be ordered in Zsh in a similar manner, but the math-context brackets (( and )) would have to be replaced with normal test brackets [[ and ]] and the array $val might have to be defined with special typeset options to make the strings compare in the desired manner; that is, they might have to be padded and right or left aligned. For comparing enumeration types, like Jan - Feb, it gets a little more complicated with associative arrays and case-conversion.
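The difference between the two expansions can be checked with a short bash sketch. The five-element array and the value of j are sample data chosen for the illustration.

```shell
#!/usr/bin/env bash
val=(10 20 30 40 50)   # sample data (assumption)
j=3

wrong="${j+1}"         # "alternate value" expansion: 1 whenever j is set
right=$((j + 1))       # arithmetic expansion: 4

via_wrong="${val[${j+1}]}"   # always element 1
via_right="${val[j+1]}"      # bash evaluates the subscript arithmetically: element 4

echo "wrong index=$wrong ($via_wrong), right index=$right ($via_right)"
```

So in the original script every comparison and swap used val[1] as the "next" element, which is why the array never ended up sorted.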
Here is the script with the appropriate changes, then again in simplified Zsh:
#!/bin/bash
for i in `seq 1 10000`;
do
val[$((i))]=$RANDOM
done
echo `date +"%M.%S.%3N"`
FLAG=0
until [ $FLAG -eq 1 ]
do
FLAG=1
for j in `seq 1 9999`;
do
if [ ${val[$((j))]} -gt ${val[$((j+1))]} ]
then
TMP=${val[$((j))]}
val[$((j))]=${val[$((j+1))]}
val[$((j+1))]=$TMP
FLAG=0
echo a
fi
done
done
echo `date +"%M.%S.%3N"`
Zsh:
#!/bin/zsh
foreach i ( {1..10000} )
val[i]=$RANDOM
end
echo `date +"%M.%S.%3N"`
FLAG=0
until ((FLAG))
do
FLAG=1
foreach j ( {1..9999} )
if (( val[j] > val[j+1] ))
then
TMP=$val[j]
val[j]=$val[j+1]
val[j+1]=$TMP
FLAG=0
echo a
fi
end
done
echo `date +"%M.%S.%3N"`

bash shell script two variables in for loop

I am new to shell scripting. so kindly bear with me if my doubt is too silly.
I have png images in 2 different directories and an executable which takes an images from each directory and processes them to generate a new image.
I am looking for a for loop construct which can take two variables simultaneously..this is possible in C, C++ etc but how do I accomplish something of the following. The code is obviously wrong.
#!/bin/sh
im1_dir=~/prev1/*.png
im2_dir=~/prev3/*.png
index=0
for i,j in $im1_dir $im2_dir # i iterates in im1_dir and j iterates in im2_dir
do
run_black.sh $i $j
done
thanks!
If you are depending on the two directories to match up based on a locale sorted order (like your attempt), then an array should work.
im1_files=(~/prev1/*.png)
im2_files=(~/prev3/*.png)
for ((i=0;i<${#im1_files[@]};i++)); do
run_black.sh "${im1_files[i]}" "${im2_files[i]}"
done
Here are a few additional ways to do what you're looking for with notes about the pros and cons.
The following only works with filenames that do not include newlines. It pairs the files in lockstep. It uses an extra file descriptor to read from the first list. If im1_dir contains more files, the loop will stop when im2_dir runs out. If im2_dir contains more files, file1 will be empty for all unmatched file2. Of course if they contain the same number of files, there's no problem.
#!/bin/bash
im1_dir=(~/prev1/*.png)
im2_dir=(~/prev3/*.png)
exec 3< <(printf '%s\n' "${im1_dir[@]}")
while IFS=$'\n' read -r -u 3 file1; read -r file2
do
run_black "$file1" "$file2"
done < <(printf '%s\n' "${im2_dir[@]}")
exec 3<&-
You can make the behavior consistent so that the loop stops with only non-empty matched files no matter which list is longer by replacing the semicolon with a double ampersand like so:
while IFS=$'\n' read -r -u 3 file1 && read -r file2
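The effect of the semicolon versus the double ampersand can be seen with two small sample lists standing in for the directory listings (the names short_list and long_list and their contents are invented for the illustration). Here the first list is the shorter one.

```shell
#!/usr/bin/env bash
short_list=(a1 a2)        # plays the role of im1_dir (assumed data)
long_list=(b1 b2 b3)      # plays the role of im2_dir (assumed data)

# Semicolon: only the status of the second read decides whether the loop continues.
with_semicolon=()
while read -r -u 3 f1; read -r f2; do
    with_semicolon+=("[$f1|$f2]")
done 3< <(printf '%s\n' "${short_list[@]}") < <(printf '%s\n' "${long_list[@]}")

# Double ampersand: both reads must succeed.
with_and=()
while read -r -u 3 f1 && read -r f2; do
    with_and+=("[$f1|$f2]")
done 3< <(printf '%s\n' "${short_list[@]}") < <(printf '%s\n' "${long_list[@]}")

echo "semicolon: ${with_semicolon[*]}"
echo "and:       ${with_and[*]}"
```

With the semicolon, the third pass runs with an empty first value; with the double ampersand, the loop stops as soon as either list is exhausted.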
This version uses a for loop instead of a while loop. This one stops when the shorter of the two lists run out.
#!/bin/bash
im1_dir=(~/prev1/*.png)
im2_dir=(~/prev3/*.png)
for ((i = 0; i < ${#im1_dir[@]} && i < ${#im2_dir[@]}; i++))
do
run_black "${im1_dir[i]}" "${im2_dir[i]}"
done
This version is similar to the one immediately above, but if one of the lists runs out it wraps around to reuse the items until the other one runs out. It's very ugly and you could do the same thing another way more simply.
#!/bin/bash
im1_dir=(~/prev1/*.png)
im2_dir=(~/prev3/*.png)
for ((i = 0, j = 0,
n1 = ${#im1_dir[@]},
n2 = ${#im2_dir[@]},
s = n1 >= n2 ? n1 : n2,
is = 0, js = 0;
is < s && js < s;
i++, is = i, i %= n1,
j++, js = j, j %= n2))
do
run_black "${im1_dir[i]}" "${im2_dir[j]}"
done
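One possible simpler form of the wrap-around, offered only as a sketch and not as the answer author's method: run a single counter up to the longer length and index each array modulo its own length. The arrays a and b hold sample values instead of file names, and both are assumed to be non-empty.

```shell
#!/usr/bin/env bash
a=(x y z)   # sample stand-in for the first file list
b=(1 2)     # sample stand-in for the second file list

# Length of the longer list.
n=$(( ${#a[@]} > ${#b[@]} ? ${#a[@]} : ${#b[@]} ))

out=()
for ((k = 0; k < n; k++)); do
    # The modulo makes the shorter list start again from its first element.
    out+=("${a[k % ${#a[@]}]}${b[k % ${#b[@]}]}")
done
echo "${out[*]}"
```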
This version only uses an array for the inner loop (second directory). It will only execute as many times as there are files in the first directory.
#!/bin/bash
im1_dir=~/prev1/*.png
im2_dir=(~/prev3/*.png)
for file1 in $im1_dir
do
run_black "$file1" "${im2_dir[i++]}"
done
If you don't mind going off the beaten path (bash), the Tool Command Language (TCL) has such a loop construct:
#!/usr/bin/env tclsh
set list1 [glob dir1/*]
set list2 [glob dir2/*]
foreach item1 $list1 item2 $list2 {
exec command_name $item1 $item2
}
Basically, the loop reads: for each item1 taken from list1, and item2 taken from list2. You can then replace command_name with your own command.
This might be another way to use two variables in the same loop. But you need to know the total number of files (or, the number of times you want to run the loop) in the directory to use it as the value of iteration i.
Get the number of files in the directory:
ls /path/*.png | wc -l
Now run the loop:
im1_dir=(~/prev1/*.png)
im2_dir=(~/prev3/*.png)
for ((i = 0; i < 4; i++)); do run_black.sh ${im1_dir[i]} ${im2_dir[i]}; done
For more help please see this discussion.
I have this problem for a similar situation where I want a top and bottom range simultaneously. Here was my solution; it's not particularly efficient but it's easy and clean and not at all complicated with icky BASH arrays and all that nonsense.
SEQBOT=$(seq 0 5 $((PEAKTIME-5)))
SEQTOP=$(seq 5 5 $((PEAKTIME-0)))
IDXBOT=0
IDXTOP=0
for bot in $SEQBOT; do
IDXTOP=0
for top in $SEQTOP; do
if [ "$IDXBOT" -eq "$IDXTOP" ]; then
echo $bot $top
fi
IDXTOP=$((IDXTOP + 1))
done
IDXBOT=$((IDXBOT + 1))
done
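It is very simple: you can use two nested for loops for this problem. Note that this runs every file from the first directory against every file from the second, rather than pairing them one-to-one.
#!/bin/bash
index=0
for i in ~/prev1/*.png
do
for j in ~/prev3/*.png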
do
run_black.sh $i $j
done
done
The accepted answer can be further simplified using the ${!array[@]} syntax to iterate over the array's indexes:
a=(x y z); b=(q w e); for i in ${!a[@]}; do echo ${a[i]}-${b[i]}; done
Another solution. The two lists with filenames are pasted into one.
paste <(ls --quote-name ~/prev1/*.png) <(ls --quote-name ~/prev3/*.png) | \
while read args ; do
run_black $args
done
