Appending text from first line to last of a file through loop - bash

I want to append text to the end of each line of an existing file, from the first line to the last.
The file's contents are the input to the loop, and the loop's output has to be written back to the same file. I could not work out the logic for it.
FileName: sample
cat sample
Alex, Johston
Samuel, John
Vebron, Justus
Above is the content of the file. I want to append the first-column value of each line (Alex, Samuel and Vebron) to the end of that line, preceded by a comma.
My intended output:
Alex, Johston,Alex
Samuel, John,Samuel
Vebron, Justus,Vebron
My script I wrote to take the first column values:
while read LINE
do
fcol=$(echo $LINE|awk -F, '{ print $1 }')
done < sample
As the loop runs, the variable fcol holds the values Alex, Samuel and Vebron in turn. I need to append each value to the end of its own line.
Can someone show me how to alter the code above to produce the intended output?
Thanks!

awk -F, '{print $0","$1}' sample
The loop is not required: awk reads the input file line by line and applies the given command to each line.
Here the command is print $0","$1, which prints the whole line ($0), a comma, and the first field ($1).

You can use awk to do this:
awk -F "," '{print $0 "," $1}' sample

You can't do this without an intermediate file of some sort, unless you hold the file's contents in memory before processing them. That said, tools like sed -i hide that detail from you.
awk -F, -v OFS=, '{print $0,$1}' sample > sample.new
sed -i.orig -e 's/^\([^,]*\).*/&,\1/' sample
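The intermediate-file approach can be sketched as follows. This is a minimal example, assuming mktemp is available; the scratch directory and the variable names are made up for the demo, and the file contents mirror the question's sample.

```shell
#!/bin/bash
# Work in a scratch directory so the demo does not touch real files.
workdir=$(mktemp -d)
sample="$workdir/sample"
printf '%s\n' 'Alex, Johston' 'Samuel, John' 'Vebron, Justus' > "$sample"

# Write the transformed lines to a temporary file first, then replace
# the original only if awk succeeded.
tmp=$(mktemp "$workdir/sample.XXXXXX")
awk -F, -v OFS=, '{ print $0, $1 }' "$sample" > "$tmp" && mv "$tmp" "$sample"

result=$(cat "$sample")
printf '%s\n' "$result"
rm -rf "$workdir"
```

Writing to a separate file and renaming it afterwards means the original is never truncated while awk is still reading it.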

Initialize line_no to 0 before looping through the file.
Increment line_no by 1 before processing each line.
The following sed command appends $fcol to the end of line number $line_no:
sed -i "$line_no s/$/,$fcol/" sample
Final script:
line_no=0
while IFS= read -r LINE
do
line_no=$((line_no+1))
fcol=$(echo "$LINE" | awk -F, '{ print $1 }')
sed -i "$line_no s/$/,$fcol/" sample
done < sample
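If a shell loop is preferred, the same result can be produced without running sed once per line. The sketch below assumes bash. It takes the text before the first comma with parameter expansion and writes every line to a temporary file that then replaces the original. The scratch directory and the sample.new name are made up for the demo.

```shell
#!/bin/bash
workdir=$(mktemp -d)
sample="$workdir/sample"
printf '%s\n' 'Alex, Johston' 'Samuel, John' 'Vebron, Justus' > "$sample"

tmp="$workdir/sample.new"
while IFS= read -r line; do
    fcol=${line%%,*}                  # text before the first comma
    printf '%s,%s\n' "$line" "$fcol"  # original line, comma, first column
done < "$sample" > "$tmp"
mv "$tmp" "$sample"

result=$(cat "$sample")
printf '%s\n' "$result"
rm -rf "$workdir"
```

The loop reads the file once and the redirection after done collects all output, so the original is replaced in a single step at the end.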

My solution is
sed -r 's/([^,]*).*/&,\1/' sample

Related

Writing the output of a command to specific columns of a csv file, unix

I want to write the output of a command to specific columns (3rd and 5th) of a CSV file.
#!/bin/bash
echo -e "Value,1\nCount,1" >> file.csv
echo "Header1,Header2,Path,Header4,Value,Header6" >> file.csv
sed 'y/ /,/' input.csv >> file.csv
input.csv in the above snippet will look something like this
1234567890 /training/folder
0325435287 /training/newfolder
Current output of file.csv
Value,1
Count,1
Header1,Header2,Path,Header4,Value,Header6
1234567890,/training/folder
0325435287,/training/newfolder
Expected Output of file.csv
Value,1
Count,1
Header1,Header2,Path,Header4,Value,Header6
,,/training/folder,,1234567890,
,,/training/newfolder,,0325435287,
All the operations can be done in a single awk:
awk -v OFS=, -v pre="Value,1\nCount,1" -v hdr="Header1,Header2,Path,Header4,Value,Header6" '
BEGIN {print pre; print hdr}
{print "", "", $2, "", $1, ""}
' input.csv
Value,1
Count,1
Header1,Header2,Path,Header4,Value,Header6
,,/training/folder,,1234567890,
,,/training/newfolder,,0325435287,
With sed you could try the following, which uses sed's back-references.
sed -E 's/(^[^ ]*) +(.*$)/,,\2,,\1,/' Input_file
Explanation: the -E option enables ERE (extended regular expressions). The s command performs the substitution. Its pattern defines two capture groups: the first holds everything up to the first space, and the second holds the rest of the line. The replacement rebuilds the line as two commas, the second group (\2), two commas, the first group (\1), and a trailing comma.
You can use awk instead of sed:
awk '{print ",," $2 ",," $1 ","}' input.csv >> file.csv
awk processes its input line by line and splits each line into whitespace-separated fields ($1 and $2 in your case). print writes its arguments one after another, so the literal ",," and "," strings above supply the empty columns.
You can trivially add empty columns as part of your sed script.
sed 'y/ /,/;s/,/,,/;s/^/,,/;s/$/,/' input.csv >> file.csv
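This replaces the first comma with two, then adds two up front and one at the end. Note that it keeps the input order (value in column 3, path in column 5); swap the fields with capture groups if you need the layout shown in the question.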
Your expected output does not look like valid CSV, though. This is also brittle in that it will fail for any file names which contain a space or a comma.
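One way to reduce the brittleness with spaces is to split each line at the first space only. The sketch below assumes every line starts with the value, followed by at least one space and then the path. The second input line, with a space in the path, is made up for the demo.

```shell
#!/bin/bash
workdir=$(mktemp -d)
printf '%s\n' '1234567890 /training/folder' \
              '0325435287 /training/new folder' > "$workdir/input.csv"

rows=$(awk '{
    value = $1
    path = $0
    sub(/^[^ ]+ +/, "", path)          # drop the first field and the spaces after it
    printf ",,%s,,%s,\n", path, value  # path in column 3, value in column 5
}' "$workdir/input.csv")
printf '%s\n' "$rows"
rm -rf "$workdir"
```

A path that contains a comma would still need CSV quoting, which this sketch does not attempt.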

Comma-delimited text: read last two, add them, place them at end of each line

I have a file that looks like this:
GOES-15,167,170,+,3
GOES-14,150,146,-,4
GOES-13,100,100,-,0
GOES-WEST,-160,-170,-,10
I would like to read the last two elements of each line (for example + and 3 on the first line), join them side by side (+3), and put the result at the end of the line after a comma, like this:
GOES-15,167,170,+,3,+3
Here is what I am trying:
#!/bin/bash
file=weather_sats.txt
while read line
do
ADD=$(awk -F, '{print $4$5}')
sed -i 's/$/,$ADD/' $file
done < $file
exit 0
This doesn't work, since I get "$ADD" at end of each line.
This might do what you wanted.
awk -F, '{print $0","$(NF-1)$NF}' file.txt
Use pure awk:
awk -F, 'BEGIN { OFS="," } {print $0, $4$5 }' weather_sats.txt
That produces the required output.
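As for why the original loop wrote a literal $ADD: the shell does not expand variables inside single quotes. The short sketch below shows the difference, using a made-up one-line input instead of the real file. It assumes the appended text contains no characters that are special in a sed replacement, such as / or &.

```shell
#!/bin/bash
ADD='+3'
line='GOES-15,167,170,+,3'

# Single quotes: sed receives the text $ADD unchanged.
single=$(printf '%s\n' "$line" | sed 's/$/,$ADD/')

# Double quotes: the shell expands $ADD first; \$ keeps the end-of-line anchor literal.
double=$(printf '%s\n' "$line" | sed "s/\$/,$ADD/")

printf '%s\n' "$single" "$double"
```

Even with the quoting fixed, running sed -i once per input line would append to every line on every pass, which is why the awk one-liners above are the simpler route.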

Remove hyphen from duration format time

I need to remove the hyphens from a duration-format time, and I have not managed to do it with sed.
original output:
00:0-26:0-8
00:0-28:0-30
00:0-28:0-4
00:0-28:0-28
00:0-27:0-54
00:0-27:0-19
Expected output:
00:26:08
00:28:30
00:28:04
00:28:28
00:27:54
00:27:19
I tried the following command, but I am stuck.
sed 's/;/ /g' temp_file.txt | awk '{print $8}' | grep - | sed 's/-//g;s/00:0/0:/g'
Using sed:
sed 's/\<[0-9]\>/0&/g;s/:00-/:/g' file
The first command, s/\<[0-9]\>/0&/g, prefixes every single-digit number with a zero.
The second command, s/:00-/:/g, removes the 00- that now precedes each number.
With your shown sample only, the following awk may help.
awk -F":" '{for(i=1;i<=NF;i++){sub(/0-/,"",$i);$i=length($i)==1?0$i:$i}} 1' OFS=":" Input_file
If you want to save the output back into Input_file, append > temp_file && mv temp_file Input_file to the command above.
For the given example, this one-liner does the job:
awk -F':0-' '{printf "%02d:%02d:%02d\n",$1,$2,$3}' file
What if my output has two time columns, a timestamp and a duration? When I use one of the regexps above, it also adds a 0 to the timestamp (and to the IP addresses), which I don't want. Only column 7, the duration, in each ;-separated line should be modified.
01;12May2018 8:20:36;192.168.1.111;78787;192.168.1.111;78787;80:25:0-49;2018-05-12_111111;RO
02;14May2018 2:43:16;192.168.1.132;78787;192.168.1.111;78787;36:10:0-10;2018-05-12_111111;RO
03;15May2018 7:40:01;192.168.131.1;78787;192.168.1.111;78787;18:39:0-44;2018-05-12_111111;RO
04;15May2018 12:37:46;192.168.1.201;78787;192.168.1.111;78787;12:51:0-14;2018-05-12_111111;RO
Here is the output:
root#root> sed 's/\<[0-9]\>/0&/g;s/:00-/:/g' temp_file
01;12May2018 08:20:36;192.168.01.111;78787;192.168.01.111;78787;80:25:49;2018-05-12_111111;RO
02;14May2018 02:43:16;192.168.01.132;78787;192.168.01.111;78787;36:10:10;2018-05-12_111111;RO
03;15May2018 07:40:01;192.168.131.01;78787;192.168.01.111;78787;18:39:44;2018-05-12_111111;RO
04;15May2018 12:37:46;192.168.01.201;78787;192.168.01.111;78787;12:51:14;2018-05-12_111111;RO
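One way to limit the change to the duration column is to let awk split on ; and rewrite only field 7. The sketch below is not taken from the answers above. Its first input line comes from the sample, and the second is a made-up variant with a single-digit seconds value to show the zero padding.

```shell
#!/bin/bash
input='01;12May2018 8:20:36;192.168.1.111;78787;192.168.1.111;78787;80:25:0-49;2018-05-12_111111;RO
04;15May2018 12:37:46;192.168.1.201;78787;192.168.1.111;78787;12:51:0-8;2018-05-12_111111;RO'

fixed=$(printf '%s\n' "$input" | awk -F';' -v OFS=';' '{
    n = split($7, part, ":")             # break the duration into its parts
    out = ""
    for (i = 1; i <= n; i++) {
        sub(/^0-/, "", part[i])          # strip the leading 0- marker
        if (length(part[i]) == 1)        # pad single digits to two
            part[i] = "0" part[i]
        if (i > 1) out = out ":"
        out = out part[i]
    }
    $7 = out                             # only field 7 is rewritten
    print
}')
printf '%s\n' "$fixed"
```

Because only $7 is reassigned, the timestamp and the IP addresses pass through unchanged.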

How to write a bash script that dumps itself out to stdout (for use as a help file)?

Sometimes I want a bash script that's mostly a help file. There are probably better ways to do things, but sometimes I want to just have a file called "awk_help" that I run, and it dumps my awk notes to the terminal.
How can I do this easily?
Another idea: use #!/bin/cat as the shebang line. This literally answers the title of your question, since the shebang line is displayed as well.
Turns out it can be done as pretty much a one-liner, thanks to @CharlesDuffy for the suggestions!
Just put the following at the top of the file, and you're done. The exit stops bash from trying to run the notes that follow, and grep -v hides the line itself:
cat "$BASH_SOURCE" | grep -v EZREMOVEHEADER; exit
So for my awk_help example, it'd be:
cat "$BASH_SOURCE" | grep -v EZREMOVEHEADER; exit
# Basic form of all awk commands
awk search pattern { program actions }
# advanced awk
awk 'BEGIN {init} search1 {actions} search2 {actions} END { final actions }' file
# awk boolean example for matching "(me OR you) OR (john AND ! doe)"
awk '( /me|you/ ) || (/john/ && ! /doe/ )' /path/to/file
# awk - print # of lines in file
awk 'END {print NR,"coins"}' coins.txt
# Sum up gold ounces in column 2, and find out value at $425/ounce
awk '/gold/ {ounces += $2} END {print "value = $" 425*ounces}' coins.txt
# Print the last column of each line in a file, using a comma (instead of space) as a field separator:
awk -F ',' '{print $NF}' filename
# Sum the values in the first column and pretty-print the values and then the total:
awk '{s+=$1; print $1} END {print "--------"; print s}' filename
# functions available
length($0) > 72, toupper,tolower
# count the # of lines containing the word PASSED in the file /tmp/out
awk '/PASSED/ {n++} END {print n+0}' /tmp/out
# awk regex operators
https://www.gnu.org/software/gawk/manual/html_node/Regexp-Operators.html
I found another solution that works on Mac/Linux and works exactly as one would hope.
Just use the following as your "shebang" line, and it'll output everything from line 2 on down:
test.sh
#!/usr/bin/tail -n+2
hi there
how are you
Running this gives you what you'd expect:
$ ./test.sh
hi there
how are you
Another possible solution: use less, so that the file opens in a searchable pager.
#!/usr/bin/less
You can still grep it for something too, e.g.
$ ./test.sh | grep something
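Another option, not mentioned in the answers above, is to keep the notes in a quoted here-document, so that nothing in them is expanded or executed. A minimal sketch, assuming bash; the show_notes function name is made up and the two notes are shortened from the awk_help example.

```shell
#!/bin/bash
# Hypothetical awk_help script: the notes live in a quoted here-document.
show_notes() {
cat <<'EOF'
# awk - print # of lines in file
awk 'END {print NR}' coins.txt
# Print the last column, using a comma as the field separator
awk -F ',' '{print $NF}' filename
EOF
}

notes=$(show_notes)
printf '%s\n' "$notes"
```

Quoting the delimiter ('EOF') keeps $NF and the other awk variables literal, and the output can still be piped to grep.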

How to extract FASTA sequence using sequence ID (shell script)

I have the following sequences in FASTA format, each with a sequence header followed by its sequence.
How can I compare two files (Kcompare.pep and clade1i.txt) and extract the sequences whose headers appear in both?
Can anyone help me?
Kcompare.pep
>ztr:MYCGRDRAFT_45998
MAAPLHAEGPIRTPYTGVELLNTPYLNKGTAFPADERRVLGLTALLPTSVHTLDQQLQRA
WHQYQSRDNDLARNTFLTSLKEQNEVLYYRLVLDHLSEVFSIIYTPTEGEAIQRYSSLFR
>kal:KALB_5042
MTAEVAVVSDGSAIPGASPPATLPLLQDYAELVREHAGLSAVPLAVDSARLAAELCALPK
RFRAVFLTHTDPERAFQVQRAVAKAGGPLVITDQDTTAISLTASTLTTLARRGRSPSDSR
clade1i.txt
cpo:COPRO5265_0583
ble:BleG1_3845
kal:KALB_5042
expected output
>kal:KALB_5042
MTAEVAVVSDGSAIPGASPPATLPLLQDYAELVREHAGLSAVPLAVDSARLAAELCALPK
RFRAVFLTHTDPERAFQVQRAVAKAGGPLVITDQDTTAISLTASTLTTLARRGRSPSDSR
I tried to run this, but it produced neither an error nor any output.
for i in K*
do
echo $i
awk -F ' ' '{print $1}' $i/$i.pep > Kcompare.pep
mv Kcompare.pep $i
awk -F '_' '{print $2":"$3"_"$4}' $i/firstClade.txt > $i/clade1i.txt
awk 'NR==1{printf $0"\t";next}{printf /^>/ ? "\n"$0"\t" : $0}' $i/Kcompare.pep | awk -F"\t" 'BEGIN{while((getline k <"$i/clade1i.txt")>0)i[k]=1}{gsub("^>","",$0);if(i[$1]){print ">"$1"\n"$2}}' > $i/firsti.pep
done
Using awk:
awk 'NR==FNR{a[">"$0];next}/^>/{f=0;}($0 in a)||f{print;f=1}' clade1i.txt Kcompare.pep
Read clade1i.txt first and store each ID, prefixed with '>', as an array key.
Then read Kcompare.pep. Every line beginning with '>' resets the flag f; if that header is in the array, the line is printed and f is set, so the following lines keep printing until the next line beginning with '>'.
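The same flag logic can be written out with comments. The variant below compares only the first word of each header, on the assumption that real headers may carry a description after the ID. The descriptions and the shortened sequences are made up for the demo.

```shell
#!/bin/bash
workdir=$(mktemp -d)
printf '%s\n' '>ztr:MYCGRDRAFT_45998 hypothetical description' 'MAAPLHAEGP' 'WHQYQSRDND' \
              '>kal:KALB_5042 hypothetical description' 'MTAEVAVVSD' 'RFRAVFLTHT' \
              > "$workdir/Kcompare.pep"
printf '%s\n' 'cpo:COPRO5265_0583' 'ble:BleG1_3845' 'kal:KALB_5042' > "$workdir/clade1i.txt"

picked=$(awk '
    NR == FNR { wanted[">" $1]; next }   # first file: remember each ID as ">ID"
    /^>/      { keep = ($1 in wanted) }  # header: keep the record only if its first word is wanted
    keep                                 # print header and sequence lines while keep is true
' "$workdir/clade1i.txt" "$workdir/Kcompare.pep")
printf '%s\n' "$picked"
rm -rf "$workdir"
```

Both files are read once, so this scales better than rescanning the FASTA file for every ID.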
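Use this:
while IFS= read -r l; do
sed -n '/^>'"$l"'$/,/^>/{/^>'"$l"'$/p;/^>/!p;}' Kcompare.pep
done <clade1i.txt
The while loop reads clade1i.txt line by line.
sed -n suppresses automatic printing.
/regex/,/regex/ selects everything from the matching header up to the next line that starts with >.
Inside that range, the first p prints the matching header and /^>/!p prints the sequence lines, so the following header is not printed.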

Resources