Numerically sorting strings from file - bash

Thing is, I would like to numerically sort the strings from a file without changing the file's contents; the file must not be modified by the sort. I want to use the lines for editing later, so my variable var should receive values starting with 0:wc through 200:wc.
Input:
11:wc
1:wc
0:wc
200:wc
Desired order:
0:wc
1:wc
11:wc
200:wc
I'm using this code, but it has no effect:
sort -k1n $1 | while read line
do
    if [[ ${line:0:1} != "#" ]]
    then
        var=$line
    fi
done <$1

Why not just
$ sort -k1n -t: file.txt
specifying the field separator as ':'.
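Given the sample input above, that produces exactly the desired order:
$ sort -k1n -t: file.txt
0:wc
1:wc
11:wc
200:wc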

You need to sort numerically on the first key, and if you need them for later, just read them into an array:
myarray=( $(sort -k1n <file) )
which will provide an array with sorted contents:
0:wc
1:wc
11:wc
200:wc
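As a hedged variant: the unquoted array assignment above is subject to globbing and word splitting if the lines ever contain spaces or wildcard characters. A sketch using mapfile (bash 4+), which reads exactly one line per element, with the same file name:
mapfile -t myarray < <(sort -k1n file)   # one array element per sorted line
printf '%s\n' "${myarray[@]}"            # print the sorted contents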

Two issues:
When you create a pipe, such as command | while read line; do ...; done, the individual commands in the pipe (command and while read line; do ...; done) run in subshells.
The subshells are created with copies of all the current variables, but are not able to reflect changes back into their parent. In this case line is only present in the subshell, and when the subshell terminates, it disappears with it.
You can use bash process substitution to avoid creating a subshell for one of the pipeline commands. For example, you could use:
while read line; do ...; done < <(command)
If you both pipe and redirect, the redirect wins.
So when you write command | while read line; do ...; done < input, the while loop actually reads from input, not from the output of command.
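Putting both points together, a minimal corrected sketch of the question's loop (keeping the original variable names and positional argument $1):
# Process substitution keeps the while loop in the current shell,
# so var survives after the loop; note there is no trailing <$1
while read -r line; do
    if [[ ${line:0:1} != "#" ]]; then
        var=$line
    fi
done < <(sort -t: -k1n "$1")
echo "$var"   # now holds the last non-comment line, e.g. 200:wc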

Related

File input getting consumed inside the while loop

I'm reading through a lookup file and performing a set of actions for each line in the file. However, the while loop only reads the first line in the file and exits. Here's the current code that I have.
sql_from_lkp $lookup

function sql_from_lkp {
    lkp=$1
    while read line; do
        sql_from_columns ${line}
        echo ${line}
    done < ${lkp}
}

function sql_from_columns {
    table_name=$1
    table_column_info_file=${table_name}_columns
    line_count=`cat $table_info_file | wc -l`
    ....
}
By selectively commenting the code, I found that if I comment the line_count line, the while loop goes through every line in the file and works fine. So the input is getting consumed by the cat statement.
I've checked other answers and understood that ssh usually consumes the file inputs inside while loops if the -n option is not used. But I'm not sure how to fix this case. Need some help.
You've mistyped a variable name: $table_info_file should be $table_column_info_file.
If you correct that, your problem will go away.
By referring to a nonexistent variable - the mistyped $table_info_file - you're essentially executing cat | wc -l (no filename argument passed to cat) in sql_from_columns(), which makes cat read from stdin.
Therefore - after having read the 1st line in the while loop - the cat command in sql_from_columns() consumes the entire rest of your input (< ${lkp}), which is why the while loop exits after the 1st iteration.
Generally,
You should double-quote all your variable references so as not to subject their values to word-splitting and globbing.
Bash won't allow you to call functions before they're defined, so as presented in your question, your code fundamentally couldn't work.
While the legacy `...` syntax for command substitutions is still supported, it has pitfalls that can be avoided with the modern $(...) syntax.
A more efficient way to count lines is to pass the input file to wc -l via < rather than via cat and a pipeline (wc also accepts filename operands directly, but it then prints the input filename after the counts).
Incidentally, you probably would have caught your mistyped variable reference more easily had you done that, as Bash would have reported an ambiguous redirect error in the absence of a filename following <.
Here's a reformulation that addresses all the issues:
function sql_from_lkp {
    lkp=$1
    while read line; do
        sql_from_columns "${line}"
        echo "${line}"
    done < "${lkp}"
}

function sql_from_columns {
    table_name=$1
    table_column_info_file=${table_name}_columns
    line_count=$(wc -l < "$table_column_info_file")
    # ...
}

sql_from_lkp "$lookup"
Note that I've only added double quotes where strictly needed to make the command robust; it wouldn't hurt to add them whenever a parameter (variable) is referenced.
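To illustrate that last point, this is roughly what bash reports when the word after < expands to nothing (hypothetical interactive session, using the mistyped variable name from the question):
$ unset table_info_file
$ wc -l < $table_info_file
bash: $table_info_file: ambiguous redirect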

Get tokens from a String until they are exhausted in shell script

I have a shell script that reads input strings from stdin and gets only part of the value from each input. The input string can have any number of key/value pairs and is in the following format:
{"input0":"name:/data/name0.csv",
"input1":"name:/data/name1.csv",
....}
So in the above example, I want to get these as the output of my script:
/data/name0.csv
/data/name1.csv
.....
I think I need two while loops: one to keep reading from stdin, and another to extract the values from the input until there are no more. Can someone let me know how to write the second loop block?
if you have
{"input0":"name:/data/name0.csv",
"input1":"name:/data/name1.csv",
....}
inside a file abc.in, then you can do the following to parse your input with a command called sed:
cat abc.in | sed 's/.*"input[0-9]\+":"name:\(\/data\/name[0-9]\+.csv\)".*$/\1/g'
It basically matches each line against a regular expression of the form: beginning of line, then anything, then "input and a number":"name:/data/name and a number.csv", then anything up to the end of line. The path captured by \(...\) then replaces the whole line.
The result is:
/data/name0.csv
/data/name1.csv
...
A simple BashFAQ #1 loop works here, with jq preprocessing your string into line-oriented content:
while read -r value; do
    echo "${value#name:}"
done < <(jq -r '.[]')
That said, you can actually do the whole thing just in jq with no bash at all; the following transforms your given input directly to your desired output (given jq 1.5 or newer):
jq -r '.[] | sub("name:"; "")'
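For instance, assuming the sample object (with the trailing ....} completed into valid JSON) is saved as abc.in, the filename used in the sed answer:
$ jq -r '.[] | sub("name:"; "")' abc.in
/data/name0.csv
/data/name1.csv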
If you really want to do things the fragile way rather than leveraging a JSON parser, you can do that too:
# This is evil: Will fail very badly if input formatting changes
content_re='"name:(.*)"'
while read -r line; do
    [[ $line =~ $content_re ]] && printf '%s\n' "${BASH_REMATCH[1]}"
done
There's still no inner loop required -- just a single loop iterating over lines of input, with the body determining how to process each line.

Shell file doesn't extract value properly [grep/cut] from file [bash]

I have a test.txt file which contains key/value pairs just like any other property file.
test.txt
Name="ABC"
Age="24"
Place="xyz"
I want to extract each key's value into a corresponding variable. For that I have written the following shell script
master.sh
file=test.txt
while read line; do
    value1=`grep -i 'Name' $file|cut -f2 -d'=' $file`
    value2=`grep -i 'Age' $file|cut -f2 -d'=' $file`
done <$file
but when I execute it, it doesn't run properly, giving me the entire line extracted by the grep part of the command as output. Can someone please point me to the error?
If I understood your question correctly, the following Bash script should do the trick:
#!/bin/bash
IFS="="
while read k v ; do
    test -z "$k" && continue # skip empty lines
    declare $k=$v
done <test.txt
echo $Name
echo $Age
echo $Place
Why is that working? Most information can be retrieved from bash's man page:
IFS is the "Internal Field Separator" which is used by bash's 'read' command to separate fields in each line. By default, IFS separates along spaces, but it is redefined to separate along the equal sign. It is a bash-only solution similar to the 'cut' command, where you define the equal sign as delimiter ('-d =').
The 'read' builtin reads two fields from a line. As only two variables are provided (k and v), the first field ends up in k, all remaining fields (i.e. after the equal sign) end up in v.
As the comment states, empty lines are skipped, i.e. those where the k variable is empty (test -z).
'declare' is a bash builtin as well; it declares a variable whose name is the expansion of $k and assigns it the value of $v, i.e. the statement becomes equivalent to Name="ABC" etc.
'<test.txt' after 'done' tells bash to read test.txt and to feed it line by line into the 'read' builtin further up.
The three 'echo' statements are simply to show that this solution did work.
The format of the file is valid sh syntax, so you could just source the file:
source test.txt
In any case, your code doesn't work because after the pipe you shouldn't specify the file again.
value1=$(grep -i 'Name' "$file" | cut -f2 -d'=')
would keep your logic
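Putting it together, a sketch of a corrected master.sh; note the while loop can be dropped entirely, since grep already scans the whole file:
#!/bin/bash
file=test.txt
value1=$(grep -i 'Name' "$file" | cut -f2 -d'=')   # yields "ABC" (quotes from the file included)
value2=$(grep -i 'Age' "$file" | cut -f2 -d'=')    # yields "24"
echo "$value1"
echo "$value2"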
This is a comment, but the comment box does not allow formatting. Consider rewriting this:
while read line; do
    value1=`grep -i 'Name' $file|cut -f2 -d'=' $file`
    value2=`grep -i 'Age' $file|cut -f2 -d'=' $file`
done <$file
as:
while IFS== read key value; do
    case $key in
        Name|name) value1=$value;;
        Age|age) value2=$value;;
    esac
done < $file
Parsing the line multiple times via cut is inefficient. This is slightly different from your version, since the comparison is case sensitive, but that is easily fixed if necessary. For example, you could preprocess the input file and convert everything to lower case. You can do the preprocessing on the fly, but be aware that this will put your while loop in a subprocess which will require some additional care, since the variable definitions will end with the pipeline. But running the entire file through grep twice for each line of the file is O(n^2), and ghastly! (Why are you reading the entire file anyway instead of just echoing the line?)
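As a sketch of the case-insensitivity fix that avoids a preprocessing pipeline (and thus the subshell issue), bash 4's ${key,,} expansion lowercases the key in place:
while IFS== read -r key value; do
    case ${key,,} in   # ${key,,} lowercases, so Name, NAME, and name all match
        name) value1=$value;;
        age)  value2=$value;;
    esac
done < "$file"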

How to pass array of values to shell script in command line and advice on loop

The goal is to get only the filenames from svn log based on the revision number. Every commit has a jira ticket number in the svn comment, so the svn revisions are got by looking for the jira ticket numbers.
The script so far works fine when I give only one jira ticket number, but I need to have it work when I give more than one jira ticket.
The issue with this script is that the output has only values from ticket-2. How can I have the output to include values from both ticket-1 and ticket-2?
I need some help on how to pass the ticket-1 and ticket-2 as arguments to the script rather than assign them in the script?
Code:
#!/bin/sh
src_url=$1
target_url=$2
jira_ticket=("ticket-1 ticket-2")
for i in $jira_ticket; do
    revs=(`svn log $1 --limit 10 | grep -B 2 $i | grep "^r" | cut -d"r" -f2 | cut -d" " -f1 | sort -r`)
done

for revisions in ${!revs[*]}; do
    files=(`svn log -v $1 -r ${revs[$revisions]} | awk '$1~/^[AMD]$/{for(i=2;i<=NF;i++)print $i}'`)
    for (( i = 0; i < ${#files[@]}; i++ )); do
        echo "${files[$i]} #" ${revs[$revisions]} " will be merged."
    done
done
tl;dr
Because the second loop (processing revs) is outside the first loop (setting revs). Move the second loop to within the first loop to fix this problem.
Detailed repairs
This script needs some serious fixing.
The array jira_ticket was declared incorrectly - it should be jira_ticket=("ticket-1" "ticket-2").
To loop over every element in an array, use "${array[@]}" (the quotes are important to avoid unintended word splitting, and using @ instead of * makes the expansion be split into one word per element, which is what you're after). $array is equivalent to ${array[0]}.
Same principle with looping over an array's keys: say "${!array[@]}" instead of ${!array[*]}.
Why loop over keys when you can loop over values and you don't need the keys?
Variable assignments in a loop are not guaranteed to be propagated out of it (they probably are here, but odd things happen in pipelines and such).
Did you mean to execute the second loop within the first loop, to use each copy of revs? (As it stands you're only processing the last copy.)
Please quote all your variable expansions ("$1", not $1).
Please use modern command substitution syntax $(command) instead of backquotes. It's much less error-prone.
You'll need to set IFS properly to properly split the command substitution results. I think you're after an IFS of $'\n'; I may be wrong.
Passing the tickets as arguments
Use shift after dealing with $1 to get rid of $1, then assign everything that's left to the jira_tickets array.
The script, repaired as best I can:
#!/bin/bash
# First argument is the source URL; remaining args are ticket numbers
src_url="$1"; shift
#target_url="$2"; shift # Never used

# Handy syntax hint: `for i in "$@"; do` == `for i; do`
for ticket; do
    # Fixed below $1 to be $src_url
    revs=($(IFS=$'\n'; svn log "$src_url" --limit 10 | grep -B 2 "$ticket" | grep "^r" | cut -d"r" -f2 | cut -d" " -f1 | sort -r))
    for revision in "${revs[@]}"; do # I think you meant to loop over the values here, not the keys
        files=($(IFS=$'\n'; svn log -v "$src_url" -r "$revision" | awk '$1~/^[AMD]$/{for(i=2;i<=NF;i++)print $i}'))
        for file in "${files[@]}"; do # Think you wanted to loop over the values here too
            echo "$file # $revision will be merged."
        done
    done
done
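Invocation then looks something like this (hypothetical script name and repository URL):
$ ./merge_list.sh http://svn.example.com/repos/trunk ticket-1 ticket-2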

Manipulating data text file with bash command?

I was given this text file, called stock.txt; the content of the text file is:
pepsi;drinks;3
fries;snacks;6
apple;fruits;9
baron;drinks;7
orange;fruits;2
chips;snacks;8
I will need to use a bash script to come up with this output:
Total amount for drinks: 10
Total amount for snacks: 14
Total amount for fruits: 11
Total of everything: 35
My gut tells me I will need to use sed, group, grep and something else.
Where should I start?
I would break the exercise down into steps
Step 1: Read the file one line at a time
while read -r line
do
    # do something with $line
done
Step 2: Pattern match (drinks, snacks, fruits) and do some simple arithmetic. This step requires that you tokenize each line, which I'll leave as an exercise for you to figure out (one possible approach is sketched after the snippet below).
if [[ "$line" =~ "drinks" ]]
then
echo "matched drinks"
.
.
.
fi
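For completeness, here is one way to do that tokenizing: let read split each line on ';' (essentially what the associative-array answer below builds on):
while IFS=';' read -r name cate price; do
    # $name, $cate, $price now hold the three fields of the line
    if [[ "$cate" == "drinks" ]]; then
        echo "matched drinks: $name costs $price"
    fi
done < stock.txt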
Pure Bash. A nice application for an associative array:
declare -A category # associative array
IFS=';'
while read name cate price ; do
    ((category[$cate]+=price))
done < stock.txt

sum=0
for cate in ${!category[@]}; do # loop over the indices
    printf "Total amount of %s: %d\n" $cate ${category[$cate]}
    ((sum+=${category[$cate]}))
done
printf "Total amount of everything: %d\n" $sum
There is a short description about processing comma-separated files in bash here:
http://www.cyberciti.biz/faq/unix-linux-bash-read-comma-separated-cvsfile/
You could do something similar. Just change IFS from comma to semicolon.
Oh yeah, and a general hint for learning bash: man is your friend. Use this command to see the manual pages for all (or most) commands and utilities.
Example: man read shows the manual page for read command. On most systems it will be opened in less, so you should exit the manual by pressing q (may be funny, but it took me a while to figure that out)
The easy way to do this is using a hash table, which is supported directly by bash 4.x and of course can be found in awk and perl. If you don't have a hash table then you need to loop twice: once to collect the unique values of the second column, once to total.
There are many ways to do this. Here's a fun one which doesn't use awk, sed or perl. The only external utilities I've used here are cut, sort and uniq. You could even replace cut with a little more effort. In fact lines 5-9 could have been written more easily with grep (grep "$kind" stock.txt), but I avoided that to show off the power of bash; a grep-based sketch appears after the discussion below.
for kind in $(cut -d\; -f 2 stock.txt | sort | uniq) ; do
    total=0
    while read d ; do
        total=$(( total+d ))
    done < <(
        while read line ; do
            [[ $line =~ $kind ]] && echo $line
        done < stock.txt | cut -d\; -f3
    )
    echo "Total amount for $kind: $total"
done
We lose the strict ordering of your original output here. An exercise for you might be to find a way not to do that.
Discussion:
The first line describes a sub-shell with a simple pipeline using cut. We read the second field from the stock.txt file, with fields delimited by ;, written \; here so the shell does not interpret it. The result is a newline-separated list of values from stock.txt. This is piped to sort, then uniq. This performs our "grouping" step, since the pipeline will output an alphabetic list of items from the second column but will only list each item once no matter how many times it appeared in the input file.
Also on the first line is a typical for loop: For each item resulting from the sub-shell we loop once, storing the value of the item in the variable kind. This is the other half of the grouping step, making sure that each "Total" output line occurs once.
On the second line total is initialized to zero so that it always resets whenever a new group is started.
The third line begins the 'totaling' loop, in which for the current kind we find the sum of its occurrences. Here we declare that we will read the variable d in from stdin on each iteration of the loop.
On the fourth line the totaling actually occurs: using shell arithmetic we add the value in d to the value in total.
Line five ends the while loop and then describes its input. We use shell input redirection via < to specify that the input to the loop, and thus to the read command, comes from a file. We then use process substitution to specify that the file will actually be the results of a command.
On the sixth line the command that will feed the while-read loop begins. It is itself another while-read loop, this time reading into the variable line. On the seventh line the test is performed via a conditional construct. Here we use [[ for its =~ operator, which is a pattern matching operator. We are testing to see whether $line matches our current $kind.
On the eighth line we end the inner while-read loop and specify that its input comes from the stock.txt file, then we pipe the output of the entire loop, which by now is simply all lines matching $kind, to cut and instruct it to show only the third field, which is the numeric field. On line nine we then end the process substitution command, the output of which is a newline-delimited list of numbers from lines which were of the group specified by kind.
Given that the total is now known and the kind is known it is a simple matter to print the results to the screen.
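For comparison, here is the grep-based simplification of lines 5-9 mentioned above; it replaces the inner while-read loop entirely:
for kind in $(cut -d\; -f2 stock.txt | sort | uniq); do
    total=0
    while read -r d; do
        total=$(( total + d ))
    done < <(grep "$kind" stock.txt | cut -d\; -f3)   # numeric field of matching lines only
    echo "Total amount for $kind: $total"
done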
The below answer is OP's. As it was edited in the question itself and OP hasn't come back for 6 years, I am editing out the answer from the question and posting it as wiki here.
My answer, to get the total price, I use this:
...
PRICE=0
IFS=";" # new field separator, the end of line
while read name cate price
do
    let PRICE=PRICE+$price
done < stock.txt
echo $PRICE
When I echo, it's 35, which is correct. Now I will move on to using awk to get the sub-category result.
Whole Solution:
Thanks guys, I managed to do it myself. Here is my code:
#!/bin/bash

INPUT=stock.txt
PRICE=0
DRINKS=0
SNACKS=0
FRUITS=0

old_IFS=$IFS # save the field separator
IFS=";"      # new field separator, the end of line

while read name cate price
do
    if [ $cate = "drinks" ]; then
        let DRINKS=DRINKS+$price
    fi
    if [ $cate = "snacks" ]; then
        let SNACKS=SNACKS+$price
    fi
    if [ $cate = "fruits" ]; then
        let FRUITS=FRUITS+$price
    fi
    # Total
    let PRICE=PRICE+$price
done < $INPUT

echo -e "Drinks: " $DRINKS
echo -e "Snacks: " $SNACKS
echo -e "Fruits: " $FRUITS
echo -e "Price " $PRICE

IFS=$old_IFS
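For reference, the awk hash-table approach mentioned earlier condenses the whole exercise into a single command (a sketch; note awk does not guarantee the category order):
awk -F';' '
    { cat[$2] += $3; sum += $3 }    # accumulate per category and overall
    END {
        for (c in cat) print "Total amount for " c ": " cat[c]
        print "Total of everything: " sum
    }
' stock.txt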
