BASH: Iterate over a range of numbers in a for loop

I want to create an array from a list of words, so I'm using this code:
for i in {1..$count}
do
array[$i]=$(cat file.txt | cut -d',' -f3 | sort -r | uniq | tail -n ${i})
done
but it fails on tail -n ${i}.
I already tried tail -n $i and tail -n $(i), but I can't pass tail the value of i.
Any ideas?

It fails because you cannot use a variable inside a brace-expansion range in the shell, i.e. {1..10} is fine but {1..$n} is not.
Since you're using BASH, you can use the C-style ((...)) loop instead:
for ((i=1; i<=count; i++)); do
array[$i]=$(cut -d',' -f3 file.txt | sort -r | uniq | tail -n $i)
done
Also note removal of useless use of cat from your command.
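If you prefer to keep a list-style loop, seq generates the range at runtime, so variables work (a small sketch, assuming seq is available, as it is on GNU coreutils systems):
count=5
for i in $(seq 1 "$count"); do
  echo "$i"   # seq prints 1 through $count, one number per line
done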

Your range is not evaluated the way you are thinking, e.g.:
$ x=10
$ echo {1..$x}
{1..10}
You're better off just using a for loop:
for ((i = 1; i <= count; i++))
do
# ...
done

Just to elaborate on the previous answers: this happens because brace expansion is the very first step of bash's parsing and is never repeated. When the braces are scanned, $count is still just a piece of text, so the braces are left as-is; by the time $count is expanded to a number, brace expansion has already run and never runs again.
If for some reason you want to force the brace expansion to happen again, you can use eval:
replace the {1..$count} with $(eval echo {1..${count}})
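For example, a minimal sketch of the eval round trip:
count=3
for i in $(eval echo "{1..$count}"); do   # eval re-runs brace expansion after $count is substituted
  echo "$i"                               # prints 1, 2, 3
done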
Better, in your case, to do as anubhava suggests.

Instead of reading the file numerous times, use the built-in mapfile command:
mapfile -t array < <(cut -d, -f3 file.txt | sort -r | uniq)
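You can then iterate the populated array directly, e.g. (a small sketch, assuming file.txt is the comma-separated file from the question):
for idx in "${!array[@]}"; do               # ${!array[@]} expands to the indices
  printf '%d: %s\n' "$idx" "${array[$idx]}"
done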

Find the occurrences of an element in array

arr=(7793 7793123 7793 37793 3214)
I'd like to count the occurrences of 7793. I tried: grep -o '7793' <<< "${arr[@]}" | wc -l
However, this also counts other elements that merely contain 7793 (e.g. 7793123, 37793).
printf '%s\n' "${arr[@]}" | grep -c '^7793$'
Explanation:
printf prints each item of the array on a new line
grep -c '^7793$' uses the start and end anchors to match 7793 exactly and outputs the count
With GNU grep (note the correct counting of elements containing newlines, refer to documentation for a description of options used):
arr=(7793 7793123 7793 37793 3214 7793$'\n'7793)
printf '%s\0' "${arr[@]}" | grep --null-data -cFxe 7793
Output:
2
This works because variables in bash cannot contain the NUL character.
You can use an anchored regex in this case, keeping the rest of your pipeline:
grep -e '^7793$'
To make a bash script efficient (in terms of CPU and memory consumption), avoid running subshells and external programs whenever possible. Hence, instead of using grep or any other program, here we can use a simple loop with a variable comparison and arithmetic:
#!/bin/bash
key=7793
arr=(7793 7793123 7793 37793 3214)
count=0
for i in "${arr[@]}"; do
  if [ "$i" = "$key" ]; then
    count=$((count+1))
  fi
done
echo $count
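A related pure-bash variant (a sketch, assuming bash 4+ for associative arrays) tallies every distinct element in one pass instead of counting a single key:
declare -A counts
for i in "${arr[@]}"; do
  (( counts[$i]++ ))        # each distinct value becomes its own key
done
echo "${counts[$key]}"      # prints 2 for key=7793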

How to process values from for loop in shell script

I have the below for loop in a shell script:
#!/bin/bash
#Get the year
curr_year=$(date +"%Y")
FILE_NAME=/test/codebase/wt.properties
key=wt.cache.master.slaveHosts=
prop_value=""
getproperty(){
prop_key=$1
prop_value=`cat ${FILE_NAME} | grep ${prop_key} | cut -d'=' -f2`
}
#echo ${prop_value}
getproperty ${key}
#echo "Key = ${key}; Value="${prop_value}
arr=( $prop_value )
for i in "${arr[@]}"; do
echo $i | head -n1 | cut -d "." -f1
done
The output I am getting is as below.
test1
test2
test3
I want to substitute test2 from the above results into the below command in place of 'ABCD':
grep test12345 /home/ptc/storage/'ABCD'/apache/$curr_year/logs/access.log* | grep GET > /tmp/test.access.txt
I tried several options but could not succeed, as I am new to shell scripting.
Ignoring the many bugs elsewhere and focusing on the one piece of code you say you want to change:
for i in "${arr[@]}"; do
val=$(echo "$i" | head -n1 | cut -d "." -f1)
grep test12345 /dev/null "/home/ptc/storage/$val/apache/$curr_year/logs/access.log"* \
| grep GET
done > /tmp/test.access.txt
Notes:
Always quote your expansions. "$i", "/path/with/$val/"*, etc. (The * should not be quoted on the assumption that you want it to be expanded).
for i in $prop_value would have the exact same (buggy) behavior; using arr buys you nothing. If you want arr to improve correctness, populate it correctly: read -r -a arr <<<"$prop_value" (see the sketch after these notes).
The redirection is moved outside the loop -- that way the second iteration through the loop doesn't overwrite the file written by the first one.
The extra /dev/null passed to grep ensures that its behavior is consistent regardless of the number of matches; otherwise, it would display filenames only if more than one matching log file existed, and not otherwise.
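Putting those notes together, a corrected sketch of the whole loop (paths kept from the question):
read -r -a arr <<<"$prop_value"    # word-split once, deliberately
for i in "${arr[@]}"; do
  val=${i%%.*}                     # parameter expansion replaces echo | head | cut
  grep test12345 /dev/null "/home/ptc/storage/$val/apache/$curr_year/logs/access.log"* \
    | grep GET
done > /tmp/test.access.txt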

shell - Characters contained in both strings

I want to compare two string variables and print the characters that appear in both. I'm not really sure how to do this; I was thinking of using comm or diff, but I'm not sure of the right parameters to print only the matching characters, and they take files while these are strings. Can anyone help?
Input:
a=$(echo "abghrsy")
b=$(echo "cgmnorstuvz")
Output:
"grs"
You don't need to do that much work to assign the $a and $b shell variables; you can just...
a=abghrsy
b=cdgmrstuvz
Now, there is a classic computer science problem called the longest common subsequence [1] that is similar to yours.
However, if you just want the common characters, one way would let Ruby do the work...
$ ruby -e "puts ('$a'.chars.to_a & '$b'.chars.to_a).join"
[1] Not to be confused with the different longest common substring problem.
Use Character Classes with GNU Grep
This isn't a widely-applicable solution, but it fits your particular use case quite well. The idea is to use the first variable as a character class to match against the second string. For example:
a='abghrsy'
b='cgmnorstuvz'
echo "$b" | grep --only-matching "[$a]" | xargs | tr --delete ' '
This produces grs as you expect. Note that the use of xargs and tr is simply to remove the newlines and spaces from the output; you can certainly handle this some other way if you prefer.
Set Intersection
What you're really looking for is a set intersection, though. While you can "wing it" in the shell, you'd be better off using a language like Ruby, Python, or Perl to do this.
A Ruby One-Liner
If you need to integrate with an existing shell script, a simple Ruby one-liner that uses Bash variables could be called like this inside your current script:
a='abghrsy'
b='cgmnorstuvz'
ruby -e "puts ('$a'.split(//) & '$b'.split(//)).join"
A Ruby Script
You could certainly make things more elegant by doing the whole thing in Ruby instead.
string1_chars = 'abghrsy'.split //
string2_chars = 'cgmnorstuvz'.split //
intersection = string1_chars & string2_chars
puts intersection.join
This certainly seems more readable and robust to me, but your mileage may vary. At least now you have some options to choose from.
Nice question +1.
You can use an awk trick to get this done.
a=abghrsy
b=cdgmrstuvz
comm -12 <(echo $a|awk -F"\0" '{for (i=1; i<=NF; i++) print $i}') <(echo $b|awk -F"\0" '{for (i=1; i<=NF; i++) print $i}')|tr -d '\n'
OUTPUT:
grs
Note the use of awk -F"\0", which breaks the input string character by character into separate awk fields. The rest is a pretty straightforward use of comm and tr.
PS: If your input string is not sorted, you need to pipe awk's output through sort (comm requires sorted input) or sort the array inside awk.
UPDATE: awk only solution (without comm):
echo "$a;$b" | awk -F"\0" '{scnd=0; for (i=1; i<=NF; i++) {if ($i!=";") {if (!scnd) arr1[$i]=$i; else if ($i in arr1) arr2[$i]=$i} else scnd=1}} END { for (a in arr2) printf("%s", a)}'
This assumes semicolon doesn't appear in your string (you can use any other character if that's not the case).
UPDATE 2: I think the simplest solution is using grep -o
(thanks to answer from #CodeGnome)
echo "$b" | grep -o "[$a]" | tr -d '\n'
Using GNU coreutils (inspired by #DigitalRoss)..
a="abghrsy"
b="cgmnorstuvz"
echo "$(comm -12 <(echo "$a" | fold -w1 | sort | uniq) <(echo "$b" | fold -w1 | sort | uniq) | tr -d '\n')"
will print grs. I assumed you only want unique characters.
UPDATE:
Modified for dash:
#!/bin/dash
string1=$(printf "$1" | fold -w1 | sort | uniq | tr -d '\n');
string2=$(printf "$2" | fold -w1 | sort | uniq | tr -d '\n');
while [ "$string1" != "" ]; do
c1=$(printf '%s\n' "$string1" | cut -c 1-1 )
string2=$(printf "$2" | fold -w1 | sort | uniq | tr -d '\n');
while [ "$string2" != "" ]; do
c2=$(printf '%s\n' "$string2" | cut -c 1-1 )
if [ "$c1" = "$c2" ]; then
echo "$c1\c"
fi
string2=$(printf '%s\n' "$string2" | cut -c 2- )
done
string1=$(printf '%s\n' "$string1" | cut -c 2- )
done
echo;
Note: I am just a beginner. There might be a better way of doing this.

Bash escaping and syntax

I have a small bash file which I intend to use to determine my current ping vs my average ping.
#!/bin/bash
output=($(ping -qc 1 google.com | tail -n 1))
echo "`cut -d/ -f1 <<< "${output[3]}"`-20" | bc
This outputs my ping minus 20 ms, which is the number I want. However, I also want to prepend a + if the number is positive and append "ms".
This brings me to my overarching problem: bash syntax regarding escaping and this sort of heavy nesting is kind of flaky.
While I'll be satisfied with an answer to what I asked, I'd also appreciate a link to, or an explanation of, how exactly bash syntax handles this sort of thing.
output=($(ping -qc 1 google.com | tail -n 1))
echo "${output[3]}" | awk -F/ '{printf "%+fms\n", $1-20}'
The + modifier in printf tells it to print the sign, whether it's positive or negative.
And since we're using awk, there's no need to use cut or bc to get a field or do arithmetic.
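The same flag works in bash's builtin printf, e.g. (a tiny sketch with made-up values):
printf '%+dms\n' 18    # -> +18ms
printf '%+dms\n' -5    # -> -5ms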
Escaping gets pretty awful in bash if you use the deprecated `..` style of command substitution. In that case you have to escape nested backticks, which means you also have to escape any other escapes. $(..) nests much better, since it doesn't add another layer of escaping.
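For example, compare the two nesting styles (a minimal sketch):
inner1=`echo \`hostname\``      # backticks: the nested pair must be escaped
inner2=$(echo $(hostname))      # $(): nests with no extra escaping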
In any case, I'd just do it directly:
ping -qc 1 google.com | awk -F'[=/ ]+' '{n=$6}
END { v=(n-20); if(v>0) printf("+"); print v}'
Here's my take on it, recognizing that the result from bc can be treated as a string:
output=($(ping -qc 1 google.com | tail -n 1))
output=$(echo "`cut -d/ -f1 <<< "${output[3]}"`-20" | bc)' ms'
[[ "$output" != -* ]] && output="+$output"
echo "$output"
Bash cannot handle floating-point arithmetic. A workaround is to use awk, like this:
#!/bin/bash
output=($(ping -qc 1 google.com | tail -n 1))
echo "`cut -d/ -f1 <<< "${output[3]}"`-20" | bc | awk '{if ($1 >= 0) printf "+%fms\n", $1; else printf "%fms\n", $1}'
Note that awk adds the leading + only to non-negative results; negative results already carry their own minus sign.
Output:
$ ./testping.sh
+18.209000ms
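For the record, here is the limitation that bc/awk work around (the exact error text may vary by bash version):
$ echo $((18.2 - 20))
bash: 18.2 - 20: syntax error: invalid arithmetic operator (error token is ".2 - 20")
$ echo "18.2 - 20" | bc
-1.8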

Randomizing arg order for a bash for statement

I have a bash script that processes all of the files in a directory using a loop like
for i in *.txt
do
ops.....
done
There are thousands of files, and they are always processed in alphanumerical order because of the '*.txt' expansion.
Is there a simple way to randomize the order while still ensuring that every file is processed exactly once?
Assuming the filenames do not have spaces, just substitute the output of List::Util::shuffle.
for i in `perl -MList::Util=shuffle -e'$,=$";print shuffle<*.txt>'`; do
....
done
If filenames do have spaces but don't have embedded newlines or backslashes, read a line at a time.
perl -MList::Util=shuffle -le'$,=$\;print shuffle<*.txt>' | while read i; do
....
done
To be completely safe in Bash, use NUL-terminated strings.
perl -MList::Util=shuffle -0 -le'$,=$\;print shuffle<*.txt>' |
while read -r -d '' i; do
....
done
Not very efficient, but it is possible to do this in pure Bash if desired. This is roughly what sort -R does internally.
declare -a a # create a (sparse) integer-indexed array
for i in *.txt; do
j=$RANDOM # find an unused slot
while [[ -n ${a[$j]} ]]; do
j=$RANDOM
done
a[$j]=$i # fill that slot
done
for i in "${a[@]}"; do # iterate in index order (which is random)
....
done
Or use a traditional Fisher-Yates shuffle.
a=(*.txt)
for ((i=${#a[*]}; i>1; i--)); do
  j=$((RANDOM % i))          # pick a random index in [0, i)
  tmp=${a[$j]}
  a[$j]=${a[$((i-1))]}
  a[$((i-1))]=$tmp
done
for i in "${a[@]}"; do
....
done
You could pipe your filenames through the sort command:
ls | sort --random-sort | xargs ....
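That one-liner breaks on filenames containing whitespace; a NUL-safe sketch (assuming GNU find, sort, and xargs) would be:
find . -maxdepth 1 -name '*.txt' -print0 |
  sort -z --random-sort |
  xargs -0 -n1 echo    # replace echo with your real per-file command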
Here's an answer that relies on very basic functions within awk so it should be portable between unices.
ls -1 | awk 'BEGIN{srand()}{print rand()*100, $0}' | sort -n | awk '{print $2}'
(The srand() call seeds awk's generator; without it, many awks produce the same ordering on every run.)
EDIT:
ephemient makes a good point that the above is not space-safe. Here's a version that is:
ls -1 | awk 'BEGIN{srand()}{print rand()*100, $0}' | sort -n | sed 's/^[0-9.]* //'
If you have GNU coreutils, you can use shuf:
while IFS= read -r -d '' f
do
# some stuff with $f
done < <(shuf -ze *)
This will work with files with spaces or newlines in their names.
Off-topic Edit:
To illustrate SiegeX's point in the comment:
$ a=42; echo "Don't Panic" | while read line; do echo $line; echo $a; a=0; echo $a; done; echo $a
Don't Panic
42
0
42
$ a=42; while read line; do echo $line; echo $a; a=0; echo $a; done < <(echo "Don't Panic"); echo $a
Don't Panic
42
0
0
The pipe causes the while to be executed in a subshell and so changes to variables in the child don't flow back to the parent.
Here's a solution with standard unix commands:
for i in $(ls); do echo $RANDOM-$i; done | sort | cut -d- -f 2-
Here's a Python solution, if it's available on your system:
import glob
import random

files = glob.glob("*.txt")
random.shuffle(files)    # shuffle() rearranges the list in place and returns None
for file in files:
    print file
