First line of text file not read in shell - bash

Okay, so I've compiled this code to read a text file, and it successfully finds the sum of every column needed except for the first line! Hence it gives me the wrong summation, which excludes the value on the first line it reads in. It sets the value $line = ddsdfj:jdskf:1:fjf but never extracts the 1 from the first line. Any clues would be appreciated.
FILE=$1
while read line
do
awk -F: '{summation += $3;}END{print summation;}'
done < $FILE

The while loop is completely superfluous. It looks like what you want is
awk -F: '{s+=$3}END{print s}' "$1"
quite simply.
The code you had would read the first line with read, then the other lines as standard input to awk; hence, the behavior you were observing. Something like
while read line; do
awk -F: '{s+=$3}END{print s}' <<<"$line"
done <"$1"
would have actually used the value from line for something, but of course, that would just extract the third field from each line individually, not perform any actual addition of values from different lines.
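For completeness: if you did want to keep a shell loop, you could let bash do the accumulation itself instead of piping to awk. A minimal sketch, assuming the third field is always an integer:
sum=0
while IFS=: read -r _ _ value _; do
    sum=$((sum + value))   # bash arithmetic; fails on non-integer fields
done < "$1"
echo "$sum"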

Related

Replace a particular string (at fixed position) in a text file by line number

I built a script (bash) to replace a figure in a specific position on a .txt file.
This is the file, data.txt:
Date; Buy; Sell; Coupon; Fee
21/6/2019;0.0000000000;0.0000000000;0.0000000000
I want to replace the 12th character.
Find below the part of the code that makes the change: if the 12th character is equal to "0", change the figure to "59":
sed 's/^\(.\{11\}\)0/\159/' data.txt
All good so far. The problem is when I want to make the change only on the last row (#34). Imagine a file (data.txt) such as:
Date; Buy; Sell; Coupon; Fee
26/4/2006;0.0000000000;-200000000.0;0.0000000000
4/5/2006;0.0000000000;-100000000.0;0.0000000000
4/5/2006;0.0000000000;-300000000.0;0.0000000000
1/12/2006;0.0000000000;0.0000000000;-30000000.00
5/12/2006;0.0000000000;-250000000.0;0.0000000000
8/12/2006;0.0000000000;-250000000.0;0.0000000000
19/12/2006;0.0000000000;-650000000.0;0.000000000
18/1/2007;0.0000000000;-250000000.0;0.0000000000
1/2/2007;0.0000000000;-250000000.0;0.0000000000
2/2/2007;0.0000000000;-720000000.0;0.0000000000
28/3/2007;0.0000000000;-200000000.0;0.0000000000
28/3/2007;0.0000000000;-400000000.0;0.0000000000
3/5/2007;0.0000000000;-250000000.0;0.0000000000
3/5/2007;0.0000000000;-750000000.0;0.0000000000
3/5/2007;0.0000000000;-250000000.0;0.0000000000
5/6/2007;0.0000000000;-500000000.0;0.0000000000
3/7/2007;0.0000000000;-300000000.0;0.0000000000
3/12/2007;0.0000000000;0.0000000000;-281000000.0
1/12/2008;0.0000000000;0.0000000000;-281000000.0
1/12/2009;0.0000000000;0.0000000000;-281000000.0
1/12/2010;0.0000000000;0.0000000000;-281000000.0
5/4/2011;525000000.00;0.0000000000;0.0000000000
1/12/2011;0.0000000000;0.0000000000;-254750000.0
2/11/2012;1348000000.0;0.0000000000;0.0000000000
2/11/2012;840000000.00;0.0000000000;0.0000000000
3/12/2012;0.0000000000;0.0000000000;-145350000.0
2/12/2013;0.0000000000;0.0000000000;-145350000.0
1/12/2014;0.0000000000;0.0000000000;-145350000.0
1/12/2015;0.0000000000;0.0000000000;-145350000.0
1/12/2016;0.0000000000;0.0000000000;-145350000.0
1/12/2017;0.0000000000;0.0000000000;-145350000.0
3/12/2018;0.0000000000;0.0000000000;-145350000.0
21/6/2019;0.0000000000;0.0000000000;0.0000000000
I used the following:
#!/bin/bash
i=1
while read line;do
if((i==34));then
sed 's/^\(.\{11\}\)0/\159/' data.txt
fi
((i++))
done
It seems that there is a problem with the condition: the script never stops and produces no output, like an endless loop.
Try this:
sed '$s/^\([^;]\+;\)0\(.*\)/\159\2/' input
The address $ tells sed to work only on the last line in the file. Instead of replacing the 12th character it is probably wise to replace the character after the first ;.
It makes absolutely no sense to split the lines for sed with a bash loop. Looping over lines is one of sed's core features.
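If the goal is to modify data.txt itself rather than print the result, GNU sed's -i option does the edit in place (BSD/macOS sed needs an explicit suffix argument instead, e.g. -i ''):
sed -i '$s/^\([^;]\+;\)0\(.*\)/\159\2/' data.txt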
The following works with gawk, mawk, original-awk, and busybox:
awk 'END {if(int($2)==0) $2=59; print}' {O,}FS=\; data.txt
Outputs:
21/6/2019;59;0.0000000000;0.0000000000
On the last record (line), awk checks the integer value of the second semicolon-delimited field, changing it to 59 if it equals zero.
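For reference, {O,}FS=\; is shell brace expansion: it expands to the two assignments OFS=; and FS=;, which awk applies before reading data.txt. Spelled out without the trick, an equivalent form (relying on the same non-POSIX availability of the last record's fields in END) would be:
awk -F';' -v OFS=';' 'END {if (int($2)==0) $2=59; print}' data.txt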

How to find integer values and compare them then transfer the main files?

I have some output files (5000 files) of .log which are the results of QM computations. Inside each file there are two special lines indicating the number of electrons and orbitals, like the example below (with exact spacing as in the output files):
Number of electrons = 9
Number of orbitals = 13
I thought about a script (bash or Fortran) as a solution to this problem: one that greps these two lines (at the same time), gets the corresponding integer values (9 and 13, for instance), compares them to find the difference between the two values, and finally lists the results in a new text file with the corresponding filenames.
I would really appreciate any help given.
I am posting an attempt in GNU Awk, and have tested it with that only.
#!/bin/bash
for file in *.log
do
awk -F'=[[:blank:]]*' '/Number of/{printf "%s%s",$2,(NR%2?" ":RS)}' "$file" | awk 'function abs(v) {return v < 0 ? -v : v} {print abs($1-$2)}' >> output_"$file"
done
The reason I split the AWK logic in two was to avoid the complexity of doing it all in a single huge command. The first part extracts the numbers from your log file in a columnar format and the second computes their absolute difference.
I will break down the AWK logic:
-F'=[[:blank:]]*' is a multi-character delimiter consisting of = followed by any number of [[:blank:]] whitespace characters.
'/Number of/{printf "%s%s",$2,(NR%2?" ":RS)}' matches lines containing Number of and prints the values in a columnar fashion, i.e. as 9 13 for your sample file.
The second part is self-explanatory. I have written a function to take the absolute value of the difference between the two returned values and print it.
Each result is saved in a file named output_ followed by the log file's name, for you to process further.
Run the script from your command line as bash script.sh, where script.sh is the file containing the above lines.
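For example, with a log file holding the two sample lines from the question (molecule1.log is a hypothetical name here), 13 - 9 gives an absolute difference of 4:
$ bash script.sh
$ cat output_molecule1.log
4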
Update:
If you are interested in negative values too, i.e. without the absolute-value function, change the awk statement to
awk -F'=[[:blank:]]*' '/Number of/{printf "%s%s",$2,(NR%2?" ":RS)}' "$file" | awk '{print ($1-$2)}' >> output_"$file"
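For what it's worth, the two stages could also be folded into a single awk program; a sketch that does the same extraction and absolute difference in one pass:
awk -F'=[[:blank:]]*' '/Number of/{v[++n]=$2} END{d=v[1]-v[2]; print (d<0 ? -d : d)}' "$file"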
A bad way to do it (but it will work):
while read file
do
    first=$(awk -F= '/^Number/ {print $2}' "$file" | head -1)
    second=$(awk -F= '/^Number/ {print $2}' "$file" | tail -1)
    if [ "$first" -gt "$second" ]
    then
        echo $((first - second))
    else
        echo $((second - first))
    fi > "$file"_answer
done < list_of_files
This method picks up the values (in the awk one-liners) and compares them.
It then subtracts them to give you one value, which it saves in the file called "$file"_answer, i.e. the initial file name with '_answer' as a suffix.
You may need to tweak this code to fit your purposes exactly.

How to assign line number to a variable in a while loop

I have a file that contains some lines. Now I want to read the lines and get their line numbers, as below:
while read line
do
string=$line
number=`awk '{print NR}'` # This way is not right, gets all the line numbers.
done
Here is my scenario: I have one file containing some lines, such as below:
2015Y7M3D0H0Mi44S7941
2015Y7M3D22H24Mi3S7927
2015Y7M3D21H28Mi21S5001
I want to read each line of this file and print out the trailing characters starting with "S", together with the line number. It should look like:
1 S7941
2 S7927
3 S5001
So, what should I do to get this? Thanks. Can anyone help me out?
The UNIX shell is simply an environment from which to call tools and a language to sequence those calls. The UNIX general purpose text processing tool is awk so just use it:
$ awk '{sub(/.*S/,NR" S")}1' file
1 S7941
2 S7927
3 S5001
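The trailing 1 is an always-true pattern, so awk applies its default action (print) to every modified line. The same idea spelled out with match(), if the implicit style is unfamiliar (a sketch assuming the part you want begins at the last "S"), produces identical output:
$ awk 'match($0, /S[^S]*$/) {print NR, substr($0, RSTART)}' file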
If you're going to be doing any text manipulation, get the book Effective Awk Programming, 4th Edition, by Arnold Robbins.
I just asked one of my friends and found a simple way:
cat -n "$file" | while read line
do
    number=$(echo $line | cut -d " " -f 1)   # $line deliberately unquoted: word splitting turns cat -n's tab into a space for cut
    echo "$number"
done
That is, if we cannot get the line number from the file itself, we prepend one to each line with cat -n.
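For the record, here is a pure-bash sketch that keeps the line number in a variable, as the title asks, and strips everything up to the last "S" with parameter expansion (it assumes every line contains an "S"):
n=0
while IFS= read -r line; do
    n=$((n + 1))            # maintain the line number ourselves
    echo "$n S${line##*S}"  # ${line##*S} drops everything through the last S
done < "$file"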

awk to write different columns from different lines into single line of output file?

I am using a while loop to read from a file that contains a list of hostnames, run a command against each host, and write specific data from the results into a second file. I need the output to be line 33 column 3 and line 224 column 7, written to a single line in the second file. I can do it for either one or the other, but I'm having trouble getting it to work for both. Example:
while read i; do
/usr/openv/netbackup/bin/admincmd/bpgetconfig -M $i |\
awk -v j=33 -v k=3 'FNR == j {print $k}' > /tmp/clientversion.txt
done < /tmp/clientlist.txt
Any hints or help is greatly appreciated!
You could use something like this:
awk 'NR==33{a=$3}NR==224{print a,$7}'
This saves the value in the third column of line 33 to the variable a, then prints it out along with the seventh column of line 224.
However, you're currently overwriting the file /tmp/clientversion.txt every iteration of the while loop. Assuming you want the file to contain all of the output once the loop has run, you should move the redirection outside the loop:
while read -r i; do
/usr/openv/netbackup/bin/admincmd/bpgetconfig -M $i |\
awk 'NR==33{a=$3}NR==224{print a,$7}'
done < /tmp/clientlist.txt > /tmp/clientversion.txt
As a bonus, I have added the -r switch to read, which stops backslashes in the input from being treated as escape characters. Depending on the contents of your input file, you might also want to use double quotes around "$i" as well.
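One further small refinement (a sketch): since nothing after line 224 matters, telling awk to exit there saves reading the rest of each bpgetconfig report:
awk 'NR==33{a=$3} NR==224{print a,$7; exit}'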

Replacing a column in CSV file with another in bash

I have a csv file with a number of columns. I am trying to replace the second column with the second to last column from the same file.
For example, if I have a file, sample.csv
1,2,3,4,5,6
a,b,c,d,e,f
g,h,i,j,k,l
I want to output:
1,5,3,4,5,6
a,e,c,d,e,f
g,k,i,j,k,l
Can anyone help me with this task? Also note that I will be discarding the last two columns afterwards with the cut function, so I am open to splitting the csv file to begin with, so that I can replace the column in one csv file with a column from another csv file, whichever is easier to implement. Thanks in advance for any help.
How about this simpler awk:
awk 'BEGIN{FS=OFS=","} {$2=$(NF-1)} 1' sample.csv
EDIT: Noticed that you also want to discard the last 2 columns. Use this awk one-liner (shrinking NF to drop fields is honored by gawk, mawk, and BWK awk, though some very old implementations may not support it):
awk 'BEGIN{FS=OFS=","} {$2=$(NF-1); NF=NF-2} 1' sample.csv
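Applied to sample.csv above, the second command performs the substitution and then drops the last two columns, so it should print:
1,5,3,4
a,e,c,d
g,k,i,j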
In bash:
while IFS=, read -r -a arr; do
    arr[1]="${arr[4]}"   # index 4 is the second-to-last of the six-column sample
    printf -v output "%s," "${arr[@]}"
    printf "%s\n" "${output%,}"
done < sample.csv
Pure bash solution, using IFS in a funny way:
# Set globally the IFS, you'll see it's funny
IFS=,
while read -ra a; do
    a[1]=${a[@]: -2:1}
    echo "${a[*]}"
done < file.csv
Setting the IFS variable globally is used twice: once in the read statement, so that each field is split on a comma, and once in the line echo "${a[*]}", where "${a[*]}" expands to the fields of the array a separated by IFS... which is a comma!
Another special thing: you mentioned the second-to-last field, and that's exactly what ${a[@]: -2:1} expands to (mind the space between : and -2), so you don't have to count your number of fields.
Caveat: CSV files need a proper CSV parser, which is difficult to implement. This answer (and, I guess, all the other answers that do not use a genuine CSV parser) might break if a field contains a comma, e.g.,
1,2,3,4,"a field, with a comma",5
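If your data really does contain quoted fields with embedded commas and you have GNU awk, its FPAT feature (defining fields by what they match rather than by the separator) covers the simple quoted-field case. A sketch, not a full CSV parser (no escaped quotes or embedded newlines):
gawk -v FPAT='([^,]+)|("[^"]+")' -v OFS=',' '{$2=$(NF-1); print}' sample.csv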
If you want to discard the last two columns, don't use cut, but this instead:
IFS=,
while read -ra a; do
    (( ${#a[@]} >= 2 )) || continue # skip lines with fewer than two fields
    a[1]=${a[@]: -2:1}
    echo "${a[*]::${#a[@]}-2}"
done < file.csv
