I am trying to write a script that:
a) reads the content of a .csv file
b) sets a variable to the value in the first position (ie to the left of the comma)
c) compares the variable value to each element of an array. If the value is already in the array, it executes one command; if it isn't, it inserts the value into the first available slot in the array.
The .csv file is in the format:
co:7077,he17208am3200816internet.pdf,he17208am3200917internet.pdf
co:7077,he17208am3200817internet.pdf,he17208am3200918internet.pdf
co:7077,he17208am3200818internet.pdf,he17208am3200919internet.pdf
co:7077,he17208am3200819internet.pdf,he17208am3200915internet.pdf
co:7162,tra210051internet.pdf,tra21005101internet.pdf
co:7162,tra210051appinternet.pdf,tra21005102internet.pdf
co:7178,tra4157l11201021internet.pdf,tra4158l11201021internet.pdf
co:7178,tra4157l11201022internet.pdf,tra4158l11201022internet.pdf
My script so far looks like:
#!/bin/bash
declare -a array
anum=0
src=source.csv
pid=0
while read line;
do
pid=$( echo $line | awk '{print$1}' FS=",")
for n in "${array[@]}";
do
if [[ "$pid" = "$n" ]] ;
then
echo Duplicate value: "$pid";
else
array[$anum]="$pid"
anum=$(( $anum +1 ))
fi
done
done < $src
echo ${array[@]}
When the script is executed, pid is successfully set and reset with each iteration of the while loop, but apparently the nested for loop is never run.
From my googling I suspect it has something to do with the pipe in the pid line, but I'll be buggered if I can figure out how to make it work.
Any help is greatly appreciated.
You're not populating your array. The for loop is never executed because the array is empty, and the only statement that adds an element sits inside that loop.
Set a flag when a match is found instead of adding the array element in the else clause. After the for loop, add the array element if the flag was not set. Don't forget to reset the flag on each pass of the while loop.
You can write array[anum++]="$pid" and drop the next line, or use (( anum++ )) instead of anum=$(( $anum +1 )).
Also: while IFS=, read -r pid discard puts the first field straight into pid and the rest of the line into discard, if you don't need the rest (you could do it a little differently if you need it). Doing this, you won't need the echo and awk.
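A minimal sketch of the flag approach, for illustration only. The sample rows and the temporary file standing in for source.csv are assumptions, as are the names found, rest, and dupes; the flag is set when a match is found and checked after the inner loop.

```shell
#!/bin/bash
# Sample data standing in for source.csv (hypothetical temp file).
src=$(mktemp)
printf '%s\n' \
  'co:7077,a.pdf,b.pdf' \
  'co:7077,c.pdf,d.pdf' \
  'co:7162,e.pdf,f.pdf' \
  'co:7178,g.pdf,h.pdf' \
  'co:7178,i.pdf,j.pdf' > "$src"

declare -a array=()
dupes=0

# IFS=, splits each line at the commas: the first field lands in pid,
# the remainder of the line in the throwaway variable rest.
while IFS=, read -r pid rest; do
  found=0                       # reset the flag for every input line
  for n in "${array[@]}"; do
    if [[ $pid == "$n" ]]; then
      found=1
      break
    fi
  done
  if (( found )); then
    echo "Duplicate value: $pid"
    dupes=$((dupes + 1))
  else
    array+=("$pid")             # append to the next free slot
  fi
done < "$src"

rm -f "$src"
echo "${array[@]}"
```

Because the append happens after the inner loop rather than inside it, the first value is stored even though the array starts out empty.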
Why did you use double square brackets? You also used a single equals sign rather than a double one in the if.
Try these one-liners:
$ if [ "a" == "b" ] ; then echo hello ; fi
$ if [ "a" == "a" ] ; then echo hello ; fi
I am writing the bash script given below (please ignore the capitalized variable names; this is just my test file):
#!/bin/bash
create_nodes_directories(){
HOSTS=(192.168.110.165 192.168.110.166 192.168.110.167)
accounts=('accountnum11' 'accountnum12' 'accountnum13')
for i in "${!HOSTS[@]}"; do
read -r curhost _ < <(hostname -I)
printf 'Enter the key pair for the %s node\n' "${accounts[i]}"
printf "Enter public key\n"
read -r EOS_PUB_KEY
printf "Enter private key\n"
read -r EOS_PRIV_KEY
PRODUCER=${accounts[i]}
args=()
args+=("$curhost")
for j in "${!HOSTS[@]}"; do
if [[ "$i" != "$j" ]]; then
args+=("${HOSTS[$j]}")
else
continue;
fi
done
#echo 'Array before test:'"${args[*]}"
create_genesis_start_file "$EOS_PUB_KEY" "$EOS_PRIV_KEY" "${HOSTS[$i]}" "$PRODUCER" args
create_start_file "$EOS_PUB_KEY" "$EOS_PRIV_KEY" "${HOSTS[$i]}" "$PRODUCER" args
done
}
create_genesis_start_file(){
EOS_PUB_KEY=$1
EOS_PRIV_KEY=$2
CURRENTHOST=$3
PRODUCER=$4
peerags="$5[@]"
peers=("${!peerags}")
echo 'Genesis Currenthost is:'"$CURRENTHOST"
#echo "${peers[*]}"
VAR=""
length=${#peers[@]}
last=$((length - 1))
for i in "${!peers[@]}" ; do
if [[ "$i" == "$last" ]]; then
VAR+="--p2p-peer-address ${peers[$i]}:8888 \\"
else
VAR+=$"--p2p-peer-address ${peers[$i]}:8888 \\"$'\n\t'
fi
done
}
create_start_file(){
EOS_PUB_KEY=$1
EOS_PRIV_KEY=$2
CURRENTHOST=$3
PRODUCER=$4
peerags="$5[@]"
peers=("${!peerags}")
echo 'Start file Currenthost is:'"$CURRENTHOST"
#echo "${peers[*]}"
}
create_nodes_directories
For every iteration of the first for loop, I am displaying the third argument $CURRENTHOST which is passed to functions create_genesis_start_file and create_start_file.
For first iteration, output is:
Genesis Currenthost is:192.168.110.165
Start file Currenthost is:192.168.110.167
Second iteration:
Genesis Currenthost is:192.168.110.166
Start file Currenthost is:192.168.110.167
Third iteration,
Genesis Currenthost is:192.168.110.167
Start file Currenthost is:192.168.110.167
Genesis Currenthost is as expected, and Start file Currenthost should be the same. I don't understand why Start file Currenthost is always set to 192.168.110.167.
If I remove the below code from create_genesis_start_file it is working fine:
VAR=""
length=${#peers[@]}
last=$((length - 1))
for i in "${!peers[@]}" ; do
if [[ "$i" == "$last" ]]; then
VAR+="--p2p-peer-address ${peers[$i]}:8888 \\"
else
VAR+=$"--p2p-peer-address ${peers[$i]}:8888 \\"$'\n\t'
fi
done
I can't work out why the variable's value is getting changed. Please help.
The "$5[@]" looks odd to me. $5 is a scalar that holds only the array's name, so the function has to rely on indirect expansion to reach the elements.
It seems that you want to pass a whole array as a parameter. Since bash does not have a native way to do this, I suggest that on the calling side you pass "${args[@]}" as the last parameter, and inside your function you do
shift 4
peers=( "$@" )
Another possibility, which however violates the idea of encapsulation, is to treat peers as a global variable, which is accessible to all functions. With this approach, you would collect the information on the caller side directly in the variable peers instead of args.
As a matter of programming style, global variables (across function boundaries) are usually disliked for good reasons, but in my personal opinion, for simple shell scripting, I would find it an acceptable solution.
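One more thing worth checking, beyond how the array is passed: variables assigned inside a bash function are global unless declared with local. The for i loop inside create_genesis_start_file therefore overwrites the caller's i, and after that loop i holds the last peer index (2), which would explain why the next call always receives ${HOSTS[2]}. Below is a minimal sketch combining both points; build_peer_flags, producer$i, and seen are hypothetical names used only for illustration.

```shell
#!/bin/bash
# Hypothetical stand-in for create_genesis_start_file: the scalar
# arguments come first, and every remaining argument is a peer.
build_peer_flags() {
  local host=$1 producer=$2
  shift 2
  local peers=("$@")
  local i flags=""          # local: the caller's i stays untouched
  for i in "${!peers[@]}"; do
    flags+="--p2p-peer-address ${peers[$i]}:8888 "
  done
  printf '%s: %s\n' "$host" "${flags% }"
}

HOSTS=(192.168.110.165 192.168.110.166 192.168.110.167)
seen=()
for i in "${!HOSTS[@]}"; do
  args=()
  for j in "${!HOSTS[@]}"; do
    if [[ $i != "$j" ]]; then
      args+=("${HOSTS[$j]}")
    fi
  done
  build_peer_flags "${HOSTS[$i]}" "producer$i" "${args[@]}"
  seen+=("${HOSTS[$i]}")    # still indexed by the caller's own i
done
```

With local i in place, the outer loop visits all three hosts in order, whichever way the peers are handed to the function.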
I am horribly perplexed.
I've written a bash script to sort lines into categories based on substrings within that line.
Here's my example "lines.txt"
i am line1
i am line2
If a line contains "line1", then it should be sorted into group "l1". If it contains "line2", then it should be sorted into group "l2".
The problem is that the variable which holds the category isn't retaining its value, and I have no clue why. Here's the script.
#!/bin/bash
categories="l1 l2"
l1="
line1
"
l2="
line2
"
# match line1
cat lines.txt | while read fline
do
cate="no match"
for c in $categories
do
echo "${!c}" | while read location
do
if [ ! -z "$location" ] && [[ "$fline" =~ "$location" ]]
then
echo "we are selecting category $c"
cate="$c"
break
fi
done
if [ "$cate" != "no match" ]
then
echo "we found a match"
break
fi
done
echo "$cate:$fline"
done
exit 0
And when I run it, I see the output
we are selecting category l1
no match:i am line1
we are selecting category l2
no match:i am line2
This means that we are selecting the correct group, but we don't remember it when we exit the nested "while" loop.
Why is my variable not retaining its value, and how could I fix that?
The while loop is executed in a subshell because of the pipe. That means the name 'cate' really refers to two different variables: one outside the while loop, and one inside the loop, in the subshell. When the subshell exits, the value set inside it is lost.
A way to get around this is to use a redirect like this:
while read line; do
...
done < "$myfile"
If the input comes from a command rather than a file, you can use process substitution, which runs the command in a subshell but keeps the loop in the current shell (thanks to David Rankin for reminding me about this one).
while read -r line; do
...
done < <(find . -iname "*sh")
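Applied to the script in the question, the same idea might look like the sketch below. It is an illustration under assumptions: classify is a hypothetical helper, the here-document stands in for lines.txt, and a third input line is added to show the "no match" case. The inner loop reads from a here-string instead of a pipe, so the assignment to cate survives.

```shell
#!/bin/bash
categories="l1 l2"
l1="
line1
"
l2="
line2
"

# classify is a hypothetical helper: it prints "category:line" for one line.
classify() {
  local fline=$1 cate="no match" c location
  for c in $categories; do
    # The here-string feeds the loop without a pipe, so cate is
    # assigned in the current shell rather than in a subshell.
    while read -r location; do
      if [[ -n $location && $fline == *"$location"* ]]; then
        cate=$c
        break
      fi
    done <<< "${!c}"
    if [[ $cate != "no match" ]]; then
      break
    fi
  done
  echo "$cate:$fline"
}

# Sample input standing in for lines.txt.
while read -r fline; do
  classify "$fline"
done <<'EOF'
i am line1
i am line2
i am line3
EOF
```

This prints l1:i am line1, l2:i am line2, and no match:i am line3.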
Example
for FILE in $DIR/*
do
if(<is last File>)
doSomethingSpecial($FILE)
else
doSomethingRegular($FILE)
fi
done
What should I call for <is last File> to check whether the current file is the last one in the array?
Is there an easy built-in check that doesn't require me to check the array's length myself?
What should I call for <is last File> to check whether the current file is the last one in the array?
For a start, you are not using an array. If you were then it would be easy:
declare -a files
files=("$DIR"/*)
pos=$(( ${#files[@]} - 1 ))
last=${files[$pos]}
for FILE in "${files[@]}"
do
    if [[ $FILE == "$last" ]]
    then
        echo "$FILE is the last"
        break
    else
        echo "$FILE"
    fi
done
I know of no way to tell that you are processing the last element of a list in a for loop. However you could use an array, iterate over all but the last element, and then process the last element outside the loop:
files=($DIR/*)
for file in "${files[@]::${#files[@]}-1}" ; do
    doSomethingRegular "$file"
done
doSomethingSpecial "${files[@]: -1:1}"
The expansion ${files[@]:offset:length} evaluates to all the elements starting at offset (or the beginning if empty) for length elements. ${#files[@]}-1 is the number of elements in the array minus 1.
${files[@]: -1:1} evaluates to the last element: offset -1 counts from the end, and the length is 1. The space is necessary because :- is treated differently from : -.
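To see the two slices in isolation, here is a small sketch on a literal array rather than on files; the names items, all_but_last, and last are made up for the example.

```shell
#!/bin/bash
items=(alpha beta gamma delta)

# Everything except the final element: empty offset, length = count - 1.
all_but_last=("${items[@]::${#items[@]}-1}")

# Only the final element: offset -1 (note the space), length 1.
last="${items[@]: -1:1}"

printf 'regular: %s\n' "${all_but_last[@]}"
printf 'special: %s\n' "$last"
```

It is probably worth guarding against an empty array before slicing, since the length expression would then be negative, which bash rejects for array slices.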
Try this:
LAST_FILE=""
for f in *
do
    if [ -n "$LAST_FILE" ]
    then
        echo "Process file normally $LAST_FILE"
    fi
    LAST_FILE=$f
done
if [ -n "$LAST_FILE" ]
then
    echo "Process file as last file $LAST_FILE"
fi
Produces
bash[1051]: ls
1 2 3 4
bash[1052]: sh ../last_file.sh
Process file normally 1
Process file normally 2
Process file normally 3
Process file as last file 4
You can use find to count the total number of files.
Then, inside the loop, keep a counter and carry out your task when the counter equals the total, i.e. on the last file. The find expression must select the same files that the loop iterates over.
f=0
tot_files=$(find . -iname '*.txt' | wc -l)
for FILE in $DIR/*
do
    f=$((f + 1))
    if [[ $f -eq $tot_files ]]; then
        : # carry out your task here
    fi
done
Building on the current highest-voted answer from @cdarke (https://stackoverflow.com/a/12298757/415523), if looking at a general array of values (rather than specifically files on disk), the loop code would be as follows:
declare -a array
declare -i length current
array=( a b c d e c )
length=${#array[@]}
current=0
for VALUE in "${array[@]}"; do
current=$((current + 1))
if [[ "$current" -eq "$length" ]]; then
echo "$VALUE is the last"
else
echo "$VALUE"
fi
done
This yields the output:
a
b
c
d
e
c is the last
This ensures that only the last item in the array triggers the alternative action and that, if any other item in the array duplicates the last value, the alternative action is not called for the earlier duplicates.
In the case of an array of paths to files in a specific directory, e.g.
array=( $DIR/* )
...it is probably less of a concern, since individual filenames within the same directory are almost-certainly unique (unless you have a really odd filesystem!)
You can abuse the positional parameters, since they act similarly to an array,
but are a little easier to manipulate. You should either save the old positional
parameters, or execute in a subshell.
# Method 1: use a subshell. Slightly cleaner, but you can't always
# do this (for example, you may need to affect variables in the current
# shell).
files=( $DIR/* )
(
    set -- "${files[@]}"
    until (( $# == 1 )); do
        doSomethingRegular "$1"
        shift
    done
    doSomethingSpecial "$1"
)
# Method 2: save the positional parameters. A bit uglier, but
# executes everything in the same shell.
files=( $DIR/* )
oldPP=( "$@" )
set -- "${files[@]}"
until (( $# == 1 )); do
    doSomethingRegular "$1"
    shift
done
doSomethingSpecial "$1"
set -- "${oldPP[@]}"
What makes a file the last one? Is there something special about it? Is it the file with the greatest name when sorted by name?
Maybe you can take the file names backwards. Then it's the first file you want to treat specially, not the last. Figuring out the first is a much easier task than figuring out the last:
for file in $(ls -r1 $dir)
do
    if [ -z "$processedLast" ]
    then
        doSomethingSpecial "$file"
        processedLast=1
    else
        doSomethingRegular "$file"
    fi
done
No arrays needed. Actually, I like chepner's answer about using positional parameters.
It's an old question, but building on the answer from @GregReynolds, you can use this one-liner if the commands differ only by their parameters on the last pass. Ugly, ugly code for one-liner lovers:
( ff="" ; for f in * "" ; do [ -n "$ff" ] && echo $(${f:+false} && echo $ff alternate params here || echo normal params $ff ) ; ff=$f ; done )
normal params 1
normal params 2
normal params 3
4 alternate params here
Can anybody explain why the following bash code involving compound operators is not behaving as expected? Basically, nothing enters the if statement inside the for loop, even though I am passing it parameters that should match something when I run:
./my_bash_script 20100101 20120101
dates.txt is a list of all days since 2000
#!/bin/bash
old_IFS=$IFS
IFS=$'\n'
lines=($(cat dates.txt)) # array
IFS=$old_IFS
for (( i=1; i<${#lines[@]}; i++ ))
do
if [[ ${line[$i]} -ge $1 && ${line[$i]} -le $2 ]]; then
echo 0 > ${line[$i]} # redirect to file
echo ${line[$i]}
fi
done
The problem is that you've declared an array named lines, but then you try to access it as though it were named line. You need to change every occurrence of ${line[$i]} to ${lines[$i]}.
Better yet, you can dispense with the arithmetic for-loop, and write:
for line in "${lines[@]}" ; do
which will let you refer to the line as $line or "$line" rather than as ${lines[$i]}.
(By the way, how come you have that logic to modify $IFS? It seems like its default value would work just as well.)
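Put together, the corrected loop could look roughly like this. It is a sketch under assumptions: in_range is a hypothetical helper, the literal array stands in for the contents of dates.txt, and the redirect that writes a file per date is left out so the example has no side effects.

```shell
#!/bin/bash
# in_range is a hypothetical helper: it prints every date in the
# lines array that falls between the two bounds, inclusive.
in_range() {
  local from=$1 to=$2 line
  for line in "${lines[@]}"; do
    if [[ $line -ge $from && $line -le $to ]]; then
      echo "$line"
    fi
  done
}

# Sample data standing in for dates.txt (YYYYMMDD, one date per line).
lines=(20091231 20100101 20110615 20120101 20120102)
in_range 20100101 20120101
```

Here the array is named lines everywhere, and each element is referred to simply as $line inside the loop.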
I want to run a loop (for file in *) only 5 times (so it's not really a loop over everything anymore). Is there any way to do this?
Put the files in an array, then slice the array.
$ files=(*)
$ for file in "${files[@]::5}" ; do echo "$file" ; done
あいうえお
0000000000-11-005978.txt
0000000000-11-020832.txt
1
,123
This will only look at the first five items in the directory:
for file in $(ls | head -n 5)
As Ignacio Vazquez-Abrams points out, this only works if your filenames don't contain any whitespace. (They likely won't, but it's something to keep in mind.)
Assuming the variable i is undefined or 0 when you enter the loop and is not used in the loop, just add the line:
test $((++i)) -ge 5 && break
in the loop body. The loop will break out during the 5th iteration, so if you put the line at the end of the loop body, your commands will execute 5 times. If your shell supports it, you can also use the less portable
((++i >= 5)) && break
Here is yet another solution, which only uses bash and no external tools:
COUNT=0
for FILENAME in *
do
    echo "do something to $FILENAME"
    COUNT=$((COUNT + 1))
    if (( COUNT == 5 )); then
        break
    fi
done