How to continuously loop using for in shell scripting

for m in $count
do
`cat $op ${arr[$m]} > $op1`
`rm -f $op`
`touch $op`
`cat $op1 ${arr[$m+1]} > $op`
if [ $m ge $count ]; then
`rm -f $op1`
`touch $op1`
fi
m=$((m+1))
done
I want to loop continuously from the start count 2 to the end count 10 ($count is 10 here). But the above piece of code executes the for loop body only once.

Rainy Sunday - having much free time - long answer ;)
There are many issues with your script; below are some recommended solutions. Because you used the construction m=$((m+1)), I will assume bash as the "shell". (Consider adding the bash tag.)
For the cycle - several possibilities
count=10
m=2 #start with 2
while (( $m <= $count )) #while m is less or equal to 10
do #do
echo $m #this action
let m++ #increment m (add one to m)
done #end of while
or, if the count is a constant (not a variable), you can write
for m in {2..10} #REMEMBER: this will not work with variables, like {2..$count}
do
echo "$m"
done
another variant - using the seq (man seq) command for counting
for m in $(seq 2 ${count:=10}) # ${count:=10} - defaults the $count to 10 if it is undefined
do
echo $m
done
or a C-like for loop
let count=10
for ((m=2; m<=count; m++))
do
echo $m
done
All four loops produce:
2
3
4
5
6
7
8
9
10
So we now have a correct loop. Now add your specific actions.
The:
rm -f $op
touch $op
can be replaced by one command
echo -n > $op #echo nothing and write the "nothing" into the file
It is faster, because echo is a bash builtin (it doesn't start two external commands).
So your actions could look like
cat $op ${arr[$m]} > $op1
echo -n > $op
cat $op1 ${arr[$m+1]} > $op
In this case the echo is useless, because the second cat will write its output to $op anyway (and truncates the file to zero size before writing), so this result is identical to the above:
cat $op ${arr[$m]} > $op1
cat $op1 ${arr[$m+1]} > $op
Those two cat commands can be shortened to one, using bash's >> append-to-file redirection:
cat ${arr[$m]} ${arr[m+1]} >> $op
The whole script could look like the next
#making a testing environment
for f in $(seq 12) #create 12 files opdata-N
do
arr[$f]="opdata-$f" #store the filenames in the array "arr"
echo "data-$f" > ${arr[$f]} #each file contains one line "data-N"
done
#echo ${arr[@]}
#setting the $op and $op1 filenames
#consider choosing more descriptive variable names
op="file_op"
#op1="file_op1" #not needed
#add some initial (old) value to $op
echo "initial value" > $op
#end of creating the testing environment
#the script
count=10
for m in $(seq 2 $count)
do
cat ${arr[$m]} ${arr[m+1]} >> $op
done
at the end, file $op will contain:
initial value
data-2
data-3
data-3
data-4
data-4
data-5
data-5
data-6
data-6
data-7
data-7
data-8
data-8
data-9
data-9
data-10
data-10
data-11
BTW, are you sure about the result? Because if you only want to add file-2 .. file-10 to the end of $op (without duplicating entries), you can simply write:
cat file-{2..10} >> $op #the '>>' adds to the end of file...
or by using your array:
startpos=2
num=9 #9 elements: arr[2] .. arr[10]
cat ${arr[@]:$startpos:$num} >> $op
Ufff.. ;)
PS: it is usually good practice to enclose variables in double quotes, like "$filename" - in the examples above I omitted them for better readability.
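To illustrate that last point, here is a quick sketch of what goes wrong without quotes, using a made-up filename containing a space:

```shell
# Hypothetical demo: a filename containing a space.
tmp=$(mktemp -d)
f="$tmp/my file"
echo hello > "$f"       # quoted: one argument, works
cat "$f"                # prints: hello
# cat $f                # unquoted: word-splits into "$tmp/my" and "file" - fails
rm -rf "$tmp"
```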

Any loop needs a "condition to keep looping". When you use a
for m in $count
type of loop, the condition is "if there are more elements in the collection, pick the next one and keep going". Since $count expands to the single word 10, the body runs exactly once. This doesn't seem to be what you want. You are looking for the bash equivalent of
for(m = 0; m < 10; m++)
I think. The best way to do this is - with exactly that kind of loop (but note - an extra pair of parentheses, and a semicolon):
#!/bin/bash
# Display message 5 times
for ((i = 0 ; i < 5 ; i++)); do
echo "Welcome $i times."
done
see nix craft for original
I think you can extend this to your situation… if I understood your question correctly you need something like this:
for ((m = 2; m <= 10; m++))
do
cat $op ${arr[$m]} > $op1
rm -f $op
touch $op
cat $op1 ${arr[$m+1]} > $op
if [ $m -ge $count ]; then
rm -f $op1
touch $op1
fi
done

Use a while loop instead.
The for loop is for when you have multiple objects to iterate over. You have only one, i.e. $count.
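A minimal sketch of that suggestion, counting from 2 up to $count as in the question:

```shell
# Sketch: replace "for m in $count" with a while loop that
# actually counts from 2 up to $count.
count=10
m=2
while [ "$m" -le "$count" ]; do
    echo "$m"           # your per-iteration work goes here
    m=$((m+1))
done
```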

Related

How can I extract the last X folders from the path in k shell

Say I have a path name /server/user/folderA/folderB/folderC, how would I extract (to a variable) just the last few folders? I'm looking for something that will be flexible enough to give me folderC, or folderB/folderC, or folderA/folderB/folderC, etc.
I'm trying to use sed, but I'm not sure that's the best approach.
This would have to be in either ksh or csh (no bash on our machines, sadly)
This will get you started:
arr=( $(echo "/server/user/folderA/folderB/folderC" | sed 's#/# #g') )
echo ${#arr[*]}
echo ${arr[*]}
echo ${arr[3]}
echo "${arr[2]}/${arr[3]}/${arr[4]}"
output
5
server user folderA folderB folderC
folderB
folderA/folderB/folderC
IHTH
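As an aside (not part of the answer above): for simple cases, plain parameter expansion also works in ksh and avoids arrays entirely. A sketch using the question's path:

```shell
p=/server/user/folderA/folderB/folderC
echo "${p##*/}"         # last component: folderC
rest=${p%/*/*}          # strip the last two components: /server/user/folderA
echo "${p#"$rest"/}"    # last two components: folderB/folderC
```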
You can use arrays, but ksh88 (at least the one I tested with, on Solaris 8) uses the old Korn Shell syntax of set -A, and it doesn’t do (( i++ )) either, so this looks a bit more baroque than contemporary ksh93 or mksh code. On the other hand, I’m also giving you a function to extract the last n items ;)
p=/server/user/folderA/folderB/folderC
saveIFS=$IFS
IFS=/
set -A fullpath -- $p
echo all: "${fullpath[*]}"
unset fullpath[0] # leading slash
unset fullpath[1]
unset fullpath[2]
echo all but first two: "${fullpath[*]}"
IFS=$saveIFS
# example function to get the last n:
function pathlast {
typeset saveIFS parr i=0 n
saveIFS=$IFS
IFS=/
set -A parr -- $2
(( n = ${#parr[*]} - $1 ))
while (( i < n )); do
unset parr[i]
(( i += 1 ))
done
echo "${parr[*]}"
IFS=$saveIFS
}
for lst in 1 2 3; do
echo all but last $lst: $(pathlast $lst "$p")
done
Output:
tg@stinky:~ $ /bin/ksh x
all: /server/user/folderA/folderB/folderC
all but first two: folderA/folderB/folderC
all but last 1: folderC
all but last 2: folderB/folderC
all but last 3: folderA/folderB/folderC
Other than the first line setting $p, you can just copy the function part.
This could be done with perl if you've got it:
$ path=/server/user/folderA/folderB/folderC
$ X=3
$ echo $path|perl -F/ -ane '{print join "/",@F[(@F-'$X')..(@F-1)]}'
folderA/folderB/folderC

slow running script. How can I increase its speed?

How can I speed this up? It's taking about 5 minutes to make one file...
It runs correctly, but I have a little more than 100000 files to make.
Is my implementation of awk or sed slowing it down? I could break it down into several smaller loops and run it on multiple processors but one script is much easier.
#!/bin/zsh
#1000 configs per file
alpha=( a b c d e f g h i j k l m n o p q r s t u v w x y z )
m=1000 # number of configs per file
t=1 #file number
for (( i=1; i<=4; i++ )); do
for (( j=i; j<=26; j++ )); do
input="arc"${alpha[$i]}${alpha[$j]}
n=1 #line number
#length=`sed -n ${n}p $input| awk '{printf("%d",$1)}'`
#(( length= $length + 1 ))
length=644
for ((k=1; k<=$m; k++ )); do
echo "$hmbi" >> ~/Glycine_Tinker/configs/config$t.in
echo "jobtype = energy" >> ~/Glycine_Tinker/configs/config$t.in
echo "analyze_only = false" >> ~/Glycine_Tinker/configs/config$t.in
echo "qm_path = qm_$t" >> ~/Glycine_Tinker/configs/config$t.in
echo "mm_path = aiff_$t" >> ~/Glycine_Tinker/configs/config$t.in
cat head.in >> ~/Glycine_Tinker/configs/config$t.in
water=4
echo $k
for (( l=1; l<=$length; l++ )); do
natom=`sed -n ${n}p $input| awk '{printf("%d",$1)}'`
number=`sed -n ${n}p $input| awk '{printf("%d",$6)}'`
if [[ $natom -gt 10 && $number -gt 0 ]]; then
symbol=`sed -n ${n}p $input| awk '{printf("%s",$2)}'`
x=`sed -n ${n}p $input| awk '{printf("%.10f",$3)}'`
y=`sed -n ${n}p $input| awk '{printf("%.10f",$4)}'`
z=`sed -n ${n}p $input| awk '{printf("%.10f",$5)}'`
if [[ $water -eq 4 ]]; then
echo "--" >> ~/Glycine_Tinker/configs/config$t.in
echo "0 1 0.4638" >> ~/Glycine_Tinker/configs/config$t.in
water=1
fi
echo "$symbol $x $y $z" >> ~/Glycine_Tinker/configs/config$t.in
(( water= $water + 1 ))
fi
(( n= $n + 1 ))
done
cat tail.in >> ~/Glycine_Tinker/configs/config$t.in
(( t= $t + 1 ))
done
done
done
One thing that is going to be killing you here is the sheer number of processes being created. Especially when they are doing the exact same thing.
Consider doing the sed -n ${n}p $input once per loop iteration.
Also consider doing the equivalent of awk as a shell array assignment, then accessing the individual elements.
With these two things you should be able to get the 12 or so processes (and the shell invocation via back quotes) down to a single shell invocation and the backquote.
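A sketch of what those two changes collapse to (the demo input line and field positions are made up to match the question's code; note the question's script is zsh, whose arrays are 1-based, so the 0-based bash indices below shift by one there):

```shell
# One sed call and one array assignment per line, instead of
# six "sed | awk" pipelines. The input line here is fabricated
# purely for illustration.
tmp=$(mktemp)
echo "12 O 1.0 2.0 3.0 7" > "$tmp"
n=1
fields=( $(sed -n "${n}p" "$tmp") )
natom=${fields[0]} symbol=${fields[1]}
x=${fields[2]} y=${fields[3]} z=${fields[4]} number=${fields[5]}
echo "$symbol $x $y $z"
rm -f "$tmp"
```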
Obviously, Ed's advice is far preferable, but if you don't want to follow that, I had a couple of thoughts...
Thought 1
Rather than run echo 5 times and cat head.in onto the Glycine file, each of which causes the file to be opened, seeked (or sought maybe) to the end, and appended, you could do that in one go like this:
# Instead of
hmbi=3
echo "$hmbi" >> ~/Glycine_thing
echo "jobtype = energy" >> ~/Glycine_thing
echo "somethingelse" >> ~/Glycine_thing
echo ... >> ~/Glycine_thing
echo ... >> ~/Glycine_thing
cat ... >> ~/Glycine_thing
# Try this
{
echo "$hmbi"
echo "jobtype = energy"
echo "somethingelse"
echo
echo
cat head.in
} >> ~/Glycine_thing
# Or, better still, this
echo -e "$hmbi\njobtype = energy\nsomethingelse" >> Glycine_thing
# Or, use a here-document, as suggested by @mklement0
cat <<EOF >>Glycine
$hmbi
jobtype = energy
next thing
EOF
Thought 2
Rather than invoke sed and awk 5 times to find 5 parameters, just let awk do what sed was doing, and also do all 5 things in one go:
read symbol x y z < <(awk -v n=$n 'NR==n{printf "%s %.10f %.10f %.10f", $2, $3, $4, $5}' $input)

Tree hierarchy in Bash

I'm trying to implement a function in bash which displays a tree of files/directories for the given depth. It takes 3 arguments.
$1 = *current directory*
$2 = *current depth*
$3 = *lines*
For example, if my current directory is ".../living/" and my depth is 2, my function should output:
DIR .../living/
----DIR animals
--------FILE dog
--------FILE cat
----DIR plants
--------FILE flowers
As you can see, the number of lines is increased by 4 for each depth change. The type of file (DIR, FILE) is not the question of this thread.
Here's what I have so far:
function tree {
#some code to get the directory in variable cwd
...
a=$(getType $cwd)
echo "$a $cwd"
depth=3 #the value does not matter, it's just for you guys to see
drawTree $cwd $depth "----"
}
function drawTree {
if [[ $2 == 0 ]]; then
return
fi
dat=$1
list=$(ls $dat)
depth=$2
lines=$3
for d in $list; do
f="$dat/$d"
t=$(getType $f)
echo "$lines$t $d"
if [[ $t == "DIR" ]]; then
g=$(($depth-1))
l="$lines----"
if [[ $g > 00 ]]; then
drawTree $f $g $l
fi
fi
done
}
The output of this code is sadly wrong, and I have no idea why.
There are quite a few issues with that code.
The most serious is that your variables are not made local (see help local), which can be disastrous in a recursive function. In the loop in drawTree, the second iteration will see unwanted modifications to $depth and $lines, both of which will cause the output to be incorrect in different ways.
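To see why, here is a tiny hypothetical recursive function (not the question's code). With local, each call keeps its own copy of d; delete the local and every call shares one global, so every echo after the recursion prints the same final value:

```shell
# With "local", each recursive call gets its own d, so the
# echoes that run after the recursion unwinds print 0 1 2 3.
# Without "local", all calls share one d and every echo prints 0.
depth() {
    local d=$1
    if [ "$d" -gt 0 ]; then
        depth $((d - 1))
    fi
    echo "$d"
}
depth 3
```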
Also:
g=$(($depth-1))
l="$lines----"
if [[ $g > 00 ]]; then
drawTree $f $g $l
fi
would much better be written without so many unnecessary variables, and using arithmetic rather than string comparison:
if (( depth > 1 )); then
drawTree $f $((depth - 1)) ${lines}----
fi
Finally:
list=$(ls $dat)
for d in $list; do
will fail disastrously if there is whitespace or a shell metacharacter in a filepath. Much better is the use of a bash array and glob expansion (rather than the ls command):
# Create an array from a glob
list=("$dat"/*)
# Use the elements of the array, individually quoted:
for d in "${list[@]}"; do

How to write a tail script without the tail command

How would you achieve this in bash? It's a question I was asked in an interview, and I could think of answers in high-level languages but not in shell.
As I understand it, the real implementation of tail seeks to the end of the file and then reads backwards.
The main idea is to keep a fixed-size buffer and to remember the last lines. Here's a quick way to do a tail using the shell:
#!/bin/bash
SIZE=5
idx=0
while read line
do
arr[$idx]=$line
idx=$(( ( idx + 1 ) % SIZE ))
done < text
for ((i=0; i<SIZE; i++))
do
echo ${arr[$idx]}
idx=$(( ( idx + 1 ) % SIZE ))
done
If all commands other than tail are allowed, why not be whimsical?
#!/bin/sh
[ -r "$1" ] && exec < "$1"
tac | head | tac
Use wc -l to count the number of lines in the file. Subtract the number of lines you want from this, and add 1, to get the starting line number. Then use this with sed or awk to start printing the file from that line number, e.g.
sed -n "$start,\$p"
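Putting that recipe together as a self-contained sketch (the demo file and tail length are placeholders):

```shell
# Sketch of the "wc -l + sed" approach: last 5 lines of a demo file.
tmp=$(mktemp)
seq 10 > "$tmp"                  # demo file with lines 1..10
want=5
total=$(wc -l < "$tmp")
start=$(( total - want + 1 ))    # 10 - 5 + 1 = 6
[ "$start" -lt 1 ] && start=1    # don't run off the top of a short file
sed -n "${start},\$p" "$tmp"     # prints lines 6..10
rm -f "$tmp"
```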
There's this:
#!/bin/bash
readarray file
lines=$(( ${#file[@]} - 1 ))
for (( line=$(($lines-$1)), i=${1:-$lines}; (( line < $lines && i > 0 )); line++, i-- )); do
echo -ne "${file[$line]}"
done
Based on this answer: https://stackoverflow.com/a/8020488/851273
You pass in the number of lines at the end of the file you want to see then send the file via stdin, puts the entire file into an array, and only prints the last # lines of the array.
The only way I can think of in “pure” shell is to do a while read linewise on the whole file into an array variable with indexing modulo n, where n is the number of tail lines (default 10) — i.e. a circular buffer, then iterate over the circular buffer from where you left off when the while read ends. It's not efficient or elegant, in any sense, but it'll work and avoids reading the whole file into memory. For example:
#!/bin/bash
incmod() {
let i=$1+1
n=$2
if [ $i -ge $2 ]; then
echo 0
else
echo $i
fi
}
n=10
i=0
buffer=
while read line; do
buffer[$i]=$line
i=$(incmod $i $n)
done < $1
j=$i
echo ${buffer[$i]}
i=$(incmod $i $n)
while [ $i -ne $j ]; do
echo ${buffer[$i]}
i=$(incmod $i $n)
done
This script somehow imitates tail:
#!/bin/bash
shopt -s extglob
LENGTH=10
while [[ $# -gt 0 ]]; do
case "$1" in
--)
FILES+=("${@:2}")
break
;;
-+([0-9]))
LENGTH=${1#-}
;;
-n)
if [[ $2 != +([0-9]) ]]; then
echo "Invalid argument to '-n': $1"
exit 1
fi
LENGTH=$2
shift
;;
-*)
echo "Unknown option: $1"
exit 1
;;
*)
FILES+=("$1")
;;
esac
shift
done
PRINTHEADER=false
case "${#FILES[@]}" in
0)
FILES=("/dev/stdin")
;;
1)
;;
*)
PRINTHEADER=true
;;
esac
IFS=
for I in "${!FILES[@]}"; do
F=${FILES[I]}
if [[ $PRINTHEADER == true ]]; then
[[ I -gt 0 ]] && echo
echo "==> $F <=="
fi
if [[ LENGTH -gt 0 ]]; then
LINES=()
COUNT=0
while read -r LINE; do
LINES[COUNT++ % LENGTH]=$LINE
done < "$F"
for (( I = COUNT >= LENGTH ? LENGTH : COUNT; I; --I )); do
echo "${LINES[--COUNT % LENGTH]}"
done
fi
done
Example run:
> bash script.sh -n 12 <(yes | sed 20q) <(yes | sed 5q)
==> /dev/fd/63 <==
y
y
y
y
y
y
y
y
y
y
y
y
==> /dev/fd/62 <==
y
y
y
y
y
> bash script.sh -4 <(yes | sed 200q)
y
y
y
y
Here's the answer I would give if I were actually asked this question in an interview:
What environment is this where I have bash but not tail? Early boot scripts, maybe? Can we get busybox in there so we can use the full complement of shell utilities? Or maybe we should see if we can squeeze a stripped-down Perl interpreter in, even without most of the modules that would make life a whole lot easier. You know dash is much smaller than bash and perfectly good for scripting use, right? That might also help. If none of that is an option, we should check how much space a statically linked C mini-tail would need, I bet I can fit it in the same number of disk blocks as the shell script you want.
If that doesn't convince the interviewer that it's a silly question, then I go on to observe that I don't believe in using bash extensions, because the only good reason to write anything complicated in shell script nowadays is if total portability is an overriding concern. By avoiding anything that isn't portable even in one-offs, I don't develop bad habits, and I don't get tempted to do something in shell when it would be better done in a real programming language.
Now the thing is, in truly portable shell, arrays may not be available. (I don't actually know whether the POSIX shell spec has arrays, but there certainly are legacy-Unix shells that don't have them.) So, if you have to emulate tail using only shell builtins and it's got to work everywhere, this is the best you can do, and yes, it's hideous, because you're writing in the wrong language:
#! /bin/sh
a=""
b=""
c=""
d=""
e=""
f=""
while read x; do
a="$b"
b="$c"
c="$d"
d="$e"
e="$f"
f="$x"
done
printf '%s\n' "$a"
printf '%s\n' "$b"
printf '%s\n' "$c"
printf '%s\n' "$d"
printf '%s\n' "$e"
printf '%s\n' "$f"
Adjust the number of variables to match the number of lines you want to print.
The battle-scarred will note that printf is not 100% available either. Unfortunately, if all you have is echo, you are up a creek: some versions of echo cannot print the literal string "-n", and others cannot print the literal string "\n", and even figuring out which one you have is a bit of a pain, particularly as, if you don't have printf (which is in POSIX), you probably don't have user-defined functions either.
(N.B. The code in this answer, sans rationale, was originally posted by user 'Nirk' but then deleted under downvote pressure from people whom I shall charitably assume were not aware that some shells do not have arrays.)

How can I rewrite this BASH for loop not using the `seq` command?

I was trying to write a BASH loop of the form:
~/$ for i in {1..$(grep -c "match" file)} ; do echo $i ; done
{1..20}
where I was hoping it would produce counted output. So I tried this instead:
~/$ export LOOP_COUNT=$(grep -c "match" file)
~/$ for i in {1..$LOOP_COUNT} ; do echo $i ; done
{1..20}
What I fell back to using was:
~/$ for i in $(seq 1 1 $(grep -c "match" file)) ; do echo $i ; done
1
2
3
...
20
Perfect! But how can I get that behaviour without using seq?
Have you tried this?
max=$(grep -c "match" file)
for (( c=1; c <= $max; c++ ))
do
echo $c
done
According to the bash documentation:
A sequence expression takes the form {x..y[..incr]}, where x and y are
either integers or single characters, and incr, an optional
increment, is an integer.
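For example, the increment form works only with literal integers, which is exactly why the attempts with $LOOP_COUNT printed the braces verbatim:

```shell
echo {1..10..2}     # 1 3 5 7 9
echo {10..1..3}     # 10 7 4 1
n=10
echo {1..$n}        # prints {1..10} - brace expansion happens before
                    # parameter expansion, so $n is never a valid bound
```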
You can still use eval in other cases, but Mithrandir's advice is probably faster.
eval "for i in {1..$(grep -c 'match' file)} ; do echo \$i ; done"
Here is a recursive solution:
loop () {
i=$1
n=$2
echo $i
((i < n)) && loop $((i+1)) $n
}
LOOP_COUNT=$(grep -c "Int" sum.scala)
loop 1 $LOOP_COUNT
