Unexpected behaviour of for - bash

Script:
#!/bin/bash
IFS=','
i=0
for j in `cat database | head -n 1`; do
variables[$i]=$j
i=`expr $i + 1`
done
k=0
for l in `cat database | tail -n $(expr $(cat database | wc -l) - 1)`; do
echo -n $k
k=`expr $k + 1`
if [ $k -eq 3 ]; then
k=0
fi
done
Input file
a,b,c
d,e,f
g,e,f
Output
01201
Expected output
012012
The question is why the for skips last echo? It is weird, because if I change $k to $l echo will run 6 times.

Update:
#thom's analysis is correct. You can fix the problem by changing IFS=',' to IFS=$',\n'.
My original statements below may be of general interest, but do not address the specific problem.
If accidental shell expansions were a concern, here's how the loop could be rewritten (assuming it's practical to read everything into an array variable first):
IFS=$',\n' read -d '' -r -a fields < <(echo $'*,b,c\nd,e,f\ng,h,i')
for field in "${fields[#]}"; do
  # $field is '*' in 1st iteration, then 'b', 'c', 'd',...
done
Original statements:
Just a few general pointers:
You should use a while loop rather than for to read command output - see http://mywiki.wooledge.org/BashFAQ/001; the short of it: with for, the input lines are subject to various shell expansions.
A missing iteration typically stems from the last input line missing a terminating \n (or a separator as defined in $IFS). With a while loop, you can use the following approach to address this: while read -r line || [[ -n $line ]]; do …
For instance, your 2nd for loop could be rewritten as (using process substitution as input to avoid creating a subshell with a separate variable scope):
while read -r l || [[ -n $l ]]; do …; done < <(cat database | tail -n $(expr $(cat database | wc -l) - 1))
Finally, you could benefit from using modern bashisms: for instance,
k=`expr $k + 1`
could be rewritten much more succinctly as (( ++k )) (which will run faster, too).

Your code expects after EVERY read variable a comma but you only give this:
a,b,c
d,e,f
g,e,f
instead of this:
a,b,c,
d,e,f,
g,e,f,
so it reads:
d,e,f'\n'g,e,f
and that is equal to 5 values, not 6

Related

How to extend a variable outside a multi-threaded while loop

I am writing a shell script that contains a multi-threaded while loop. My loop iterates through the values of an array. Within the loop, I am calling a function. At the end of the function I am saving the results as a string variable. I want to add this string variable to an array on each iteration, and then be able to retrieve the contents of this array when the while loop completes.
From my understanding running the multi-threaded while loop, is what is causing for the array to be empty once the while loop completes. Each thread is ran in its own environment and the array value does not extend outside that environment. I would like to be able to extend this array value outside of the thread if possible. Currently I am just writing the string value to a temp file and then after the while loop, reading the contents of the temp file and saving that as my array. This method works, as the file generally isn't "too" large, but I would like to avoid writing to file if possible
My Code - doDeepLookup actually is a API call, but for the sake of argument lets just say it appends some text in-front of the read line from the while loop
#!/bin/bash
n=0
maxjobs=20
resultsArray=""
while IFS= read -r line
do
IPaddress="$(echo $line | sed 's/ /\n/g' | grep -E -o "([0-9]{1,3}[\.]){3}[0-9]{1,3}")"
doDeepLookup "$line" "$IPaddress" &
if(( $(($((++n)) % $maxjobs)) == 0 )) ; then
wait
fi
done <<< "$(printf '%s\n' "${SomeOtherArray[#]}")"
printf '%s\n' "${resultsArray[#]}" #Returns NULL
doDeepLookup() {
results="$(echo "help me : $line")"
resultsArray+=($results)
}
Thanks to William
#!/bin/bash
n=0
maxjobs=20
WhileLoopFunction() {
resultsArray=""
while IFS= read -r line
do
IPaddress="$(echo $line | sed 's/ /\n/g' | grep -E -o "([0-9]{1,3}[\.]){3}[0-9]{1,3}")"
doDeepLookup "$line" "$IPaddress" &
if(( $(($((++n)) % $maxjobs)) == 0 )) ; then
wait
fi
done <<< "$(printf '%s\n' "${SomeOtherArray[#]}")"
}
doDeepLookup() {
results="$(echo "help me : $line")"
echo $results
}
resultsArray=( $(WhileLoopFunction"${DeepArray[#]}") )
printf '%s\n' "${resultsArray[#]}"
With parset from GNU Parallel you would do something like:
parset resultsArray doDeepLookup ::: "${DeepArray[#]}"
printf '%s\n' "${resultsArray[#]}"

printing line numbers that are multiple of 5

Hi I am trying to print/echo line numbers that are multiple of 5. I am doing this in shell script. I am getting errors and unable to proceed. below is the script
#!/bin/bash
x=0
y=$wc -l $1
while [ $x -le $y ]
do
sed -n `$x`p $1
x=$(( $x + 5 ))
done
When executing above script i get below errors
#./echo5.sh sample.h
./echo5.sh: line 3: -l: command not found
./echo5.sh: line 4: [: 0: unary operator expected
Please help me with this issue.
For efficiency, you don't want to be invoking sed multiple times on your file just to select a particular line. You want to read through the file once, filtering out the lines you don't want.
#!/bin/bash
i=0
while IFS= read -r line; do
(( ++i % 5 == 0 )) && echo "$line"
done < "$1"
Demo:
$ i=0; while read line; do (( ++i % 5 == 0 )) && echo "$line"; done < <(seq 42)
5
10
15
20
25
30
35
40
A funny pure Bash possibility:
#!/bin/bash
mapfile ary < "$1"
printf "%.0s%.0s%.0s%.0s%s" "${ary[#]}"
This slurps the file into an array ary, which each line of the file in a field of the array. Then printf takes care of printing one every 5 lines: %.0s takes a field, but does nothing, and %s prints the field. Since mapfile is used without the -t option, the newlines are included in the array. Of course this really slurps the file into memory, so it might not be good for huge files. For large files you can use a callback with mapfile:
#!/bin/bash
callback() {
printf '%s' "$2"
ary=()
}
mapfile -c 5 -C callback ary < "$1"
We're removing all the elements of the array during the callback, so that the array doesn't grow too large, and the printing is done on the fly, as the file is read.
Another funny possibility, in the spirit of glenn jackmann's solution, yet without a counter (and still pure Bash):
#!/bin/bash
while read && read && read && read && IFS= read -r line; do
printf '%s\n' "$line"
done < "$1"
Use sed.
sed -n '0~5p' $1
This prints every fifth line in the file starting from 0
Also
y=$wc -l $1
wont work
y=$(wc -l < $1)
You need to create a subshell as bash will see the spaces as the end of the assignment, also if you just want the number its best to redirect the file into wc.
Dont know what you were trying to do with this ?
x=$(( $x + 5 ))
Guessing you were trying to use let, so id suggest looking up the syntax for that command. It would look more like
(( x = x + 5 ))
Hope this helps
There are cleaner ways to do it, but what you're looking for is this.
#!/bin/bash
x=5
y=`wc -l $1`
y=`echo $y | cut -f1 -d\ `
while [ "$y" -gt "$x" ]
do
sed -n "${x}p" "$1"
x=$(( $x + 5 ))
done
Initialize x to 5, since there is no "line zero" in your file $1.
Also, wc -l $1 will display the number of line counts, followed by the name of the file. Use cut to strip the file name out and keep just the first word.
In conditionals, a value of zero can be interpreted as "true" in Bash.
You should not have space between your $x and your p in your sed command. You can put them right next to each other using curly braces.
You can do this quite succinctly using awk:
awk 'NR % 5 == 0' "$1"
NR is the record number (line number in this case). Whenever it is a multiple of 5, the expression is true, so the line is printed.
You might also like the even shorter but slightly less readable:
awk '!(NR%5)' "$1"
which does the same thing.

Using bash, how to assign integer to variable using echo

I'd like to understand bash a bit better as I'm apparently horrible at it...
I'm trying to generate a sequence of constant width integers, but then test them to do something exceptional for particular values. Like so:
for n in $(seq -w 1 150)
do
# The next line does not work: doit.sh: line 9: XX: command not found
#decval= $( echo ${n} | sed 's/^0//g' | sed 's/^0//g' )
#if [[ ${decal} -eq 98 ]] ; then
if [[ $( echo ${n} | sed 's/^0//g' | sed 's/^0//g' ) -eq 98 ]] ; then
echo "Do something different for 98"
elif [[ $( echo ${n} | sed 's/^0//g' | sed 's/^0//g' ) -eq 105 ]] ; then
echo "Do something different for 98"
fi
done
This script works for my purposes, but if I try and make the assignment 'decval= $(…' I get an error 'command not found'. I don't understand this, can someone explain?
Also, is there an improvement I can make to this script if I have a large number of exceptions to prevent a long list of if ; then elif … ?
The problem is in the space between = and $:
decval= $(…
You should write without spaces:
decval=$(...
Because, if you write the space, your shell reads decval= as declval="" and treats the result of $(echo...) as the name of a command to execute, and obviously it doesn't find the command.
Also (just a small optimization), you can write:
sed 's/^0\+//'
instead of
sed 's/^0//g' | sed 's/^0//g'
Here:
0\+ means 0 one or more times;
g is removed, because g means replace all occurences in the string, and you have only one occurence (^ can be only one time in a string).
Also, you can check your variable even with leading zeros, without sed:
[[ "$n" =~ "0*98" ]]

Incrementing a variable inside a Bash loop

I'm trying to write a small script that will count entries in a log file, and I'm incrementing a variable (USCOUNTER) which I'm trying to use after the loop is done.
But at that moment USCOUNTER looks to be 0 instead of the actual value. Any idea what I'm doing wrong? Thanks!
FILE=$1
tail -n10 mylog > $FILE
USCOUNTER=0
cat $FILE | while read line; do
country=$(echo "$line" | cut -d' ' -f1)
if [ "US" = "$country" ]; then
USCOUNTER=`expr $USCOUNTER + 1`
echo "US counter $USCOUNTER"
fi
done
echo "final $USCOUNTER"
It outputs:
US counter 1
US counter 2
US counter 3
..
final 0
You are using USCOUNTER in a subshell, that's why the variable is not showing in the main shell.
Instead of cat FILE | while ..., do just a while ... done < $FILE. This way, you avoid the common problem of I set variables in a loop that's in a pipeline. Why do they disappear after the loop terminates? Or, why can't I pipe data to read?:
while read country _; do
if [ "US" = "$country" ]; then
USCOUNTER=$(expr $USCOUNTER + 1)
echo "US counter $USCOUNTER"
fi
done < "$FILE"
Note I also replaced the `` expression with a $().
I also replaced while read line; do country=$(echo "$line" | cut -d' ' -f1) with while read country _. This allows you to say while read var1 var2 ... varN where var1 contains the first word in the line, $var2 and so on, until $varN containing the remaining content.
Always use -r with read.
There is no need to use cut, you can stick with pure bash solutions.
In this case passing read a 2nd var (_) to catch the additional "fields"
Prefer [[ ]] over [ ].
Use arithmetic expressions.
Do not forget to quote variables! Link includes other pitfalls as well
while read -r country _; do
if [[ $country = 'US' ]]; then
((USCOUNTER++))
echo "US counter $USCOUNTER"
fi
done < "$FILE"
minimalist
counter=0
((counter++))
echo $counter
You're getting final 0 because your while loop is being executed in a sub (shell) process and any changes made there are not reflected in the current (parent) shell.
Correct script:
while read -r country _; do
if [ "US" = "$country" ]; then
((USCOUNTER++))
echo "US counter $USCOUNTER"
fi
done < "$FILE"
I had the same $count variable in a while loop getting lost issue.
#fedorqui's answer (and a few others) are accurate answers to the actual question: the sub-shell is indeed the problem.
But it lead me to another issue: I wasn't piping a file content... but the output of a series of pipes & greps...
my erroring sample code:
count=0
cat /etc/hosts | head | while read line; do
((count++))
echo $count $line
done
echo $count
and my fix thanks to the help of this thread and the process substitution:
count=0
while IFS= read -r line; do
((count++))
echo "$count $line"
done < <(cat /etc/hosts | head)
echo "$count"
USCOUNTER=$(grep -c "^US " "$FILE")
Incrementing a variable can be done like that:
_my_counter=$[$_my_counter + 1]
Counting the number of occurrence of a pattern in a column can be done with grep
grep -cE "^([^ ]* ){2}US"
-c count
([^ ]* ) To detect a colonne
{2} the colonne number
US your pattern
Using the following 1 line command for changing many files name in linux using phrase specificity:
find -type f -name '*.jpg' | rename 's/holiday/honeymoon/'
For all files with the extension ".jpg", if they contain the string "holiday", replace it with "honeymoon". For instance, this command would rename the file "ourholiday001.jpg" to "ourhoneymoon001.jpg".
This example also illustrates how to use the find command to send a list of files (-type f) with the extension .jpg (-name '*.jpg') to rename via a pipe (|). rename then reads its file list from standard input.

How to write a tail script without the tail command

How would you achieve this in bash. It's a question I got asked in an interview and I could think of answers in high level languages but not in shell.
As I understand it, the real implementation of tail seeks to the end of the file and then reads backwards.
The main idea is to keep a fixed-size buffer and to remember the last lines. Here's a quick way to do a tail using the shell:
#!/bin/bash
SIZE=5
idx=0
while read line
do
arr[$idx]=$line
idx=$(( ( idx + 1 ) % SIZE ))
done < text
for ((i=0; i<SIZE; i++))
do
echo ${arr[$idx]}
idx=$(( ( idx + 1 ) % SIZE ))
done
If all not-tail commands are allowed, why not be whimsical?
#!/bin/sh
[ -r "$1" ] && exec < "$1"
tac | head | tac
Use wc -l to count the number of lines in the file. Subtract the number of lines you want from this, and add 1, to get the starting line number. Then use this with sed or awk to start printing the file from that line number, e.g.
sed -n "$start,\$p"
There's this:
#!/bin/bash
readarray file
lines=$(( ${#file[#]} - 1 ))
for (( line=$(($lines-$1)), i=${1:-$lines}; (( line < $lines && i > 0 )); line++, i-- )); do
echo -ne "${file[$line]}"
done
Based on this answer: https://stackoverflow.com/a/8020488/851273
You pass in the number of lines at the end of the file you want to see then send the file via stdin, puts the entire file into an array, and only prints the last # lines of the array.
The only way I can think of in “pure” shell is to do a while read linewise on the whole file into an array variable with indexing modulo n, where n is the number of tail lines (default 10) — i.e. a circular buffer, then iterate over the circular buffer from where you left off when the while read ends. It's not efficient or elegant, in any sense, but it'll work and avoids reading the whole file into memory. For example:
#!/bin/bash
incmod() {
let i=$1+1
n=$2
if [ $i -ge $2 ]; then
echo 0
else
echo $i
fi
}
n=10
i=0
buffer=
while read line; do
buffer[$i]=$line
i=$(incmod $i $n)
done < $1
j=$i
echo ${buffer[$i]}
i=$(incmod $i $n)
while [ $i -ne $j ]; do
echo ${buffer[$i]}
i=$(incmod $i $n)
done
This script somehow imitates tail:
#!/bin/bash
shopt -s extglob
LENGTH=10
while [[ $# -gt 0 ]]; do
case "$1" in
--)
FILES+=("${#:2}")
break
;;
-+([0-9]))
LENGTH=${1#-}
;;
-n)
if [[ $2 != +([0-9]) ]]; then
echo "Invalid argument to '-n': $1"
exit 1
fi
LENGTH=$2
shift
;;
-*)
echo "Unknown option: $1"
exit 1
;;
*)
FILES+=("$1")
;;
esac
shift
done
PRINTHEADER=false
case "${#FILES[#]}" in
0)
FILES=("/dev/stdin")
;;
1)
;;
*)
PRINTHEADER=true
;;
esac
IFS=
for I in "${!FILES[#]}"; do
F=${FILES[I]}
if [[ $PRINTHEADER == true ]]; then
[[ I -gt 0 ]] && echo
echo "==> $F <=="
fi
if [[ LENGTH -gt 0 ]]; then
LINES=()
COUNT=0
while read -r LINE; do
LINES[COUNT++ % LENGTH]=$LINE
done < "$F"
for (( I = COUNT >= LENGTH ? LENGTH : COUNT; I; --I )); do
echo "${LINES[--COUNT % LENGTH]}"
done
fi
done
Example run:
> bash script.sh -n 12 <(yes | sed 20q) <(yes | sed 5q)
==> /dev/fd/63 <==
y
y
y
y
y
y
y
y
y
y
y
y
==> /dev/fd/62 <==
y
y
y
y
y
> bash script.sh -4 <(yes | sed 200q)
y
y
y
y
Here's the answer I would give if I were actually asked this question in an interview:
What environment is this where I have bash but not tail? Early boot scripts, maybe? Can we get busybox in there so we can use the full complement of shell utilities? Or maybe we should see if we can squeeze a stripped-down Perl interpreter in, even without most of the modules that would make life a whole lot easier. You know dash is much smaller than bash and perfectly good for scripting use, right? That might also help. If none of that is an option, we should check how much space a statically linked C mini-tail would need, I bet I can fit it in the same number of disk blocks as the shell script you want.
If that doesn't convince the interviewer that it's a silly question, then I go on to observe that I don't believe in using bash extensions, because the only good reason to write anything complicated in shell script nowadays is if total portability is an overriding concern. By avoiding anything that isn't portable even in one-offs, I don't develop bad habits, and I don't get tempted to do something in shell when it would be better done in a real programming language.
Now the thing is, in truly portable shell, arrays may not be available. (I don't actually know whether the POSIX shell spec has arrays, but there certainly are legacy-Unix shells that don't have them.) So, if you have to emulate tail using only shell builtins and it's got to work everywhere, this is the best you can do, and yes, it's hideous, because you're writing in the wrong language:
#! /bin/sh
a=""
b=""
c=""
d=""
e=""
f=""
while read x; do
a="$b"
b="$c"
c="$d"
d="$e"
e="$f"
f="$x"
done
printf '%s\n' "$a"
printf '%s\n' "$b"
printf '%s\n' "$c"
printf '%s\n' "$d"
printf '%s\n' "$e"
printf '%s\n' "$f"
Adjust the number of variables to match the number of lines you want to print.
The battle-scarred will note that printf is not 100% available either. Unfortunately, if all you have is echo, you are up a creek: some versions of echo cannot print the literal string "-n", and others cannot print the literal string "\n", and even figuring out which one you have is a bit of a pain, particularly as, if you don't have printf (which is in POSIX), you probably don't have user-defined functions either.
(N.B. The code in this answer, sans rationale, was originally posted by user 'Nirk' but then deleted under downvote pressure from people whom I shall charitably assume were not aware that some shells do not have arrays.)

Resources