Remove hyphen from duration format time - bash

I need to remove the hyphen from a duration time format, and I couldn't get the sed command to do what I intended.
Original output:
00:0-26:0-8
00:0-28:0-30
00:0-28:0-4
00:0-28:0-28
00:0-27:0-54
00:0-27:0-19
Expected output:
00:26:08
00:28:30
00:28:04
00:28:28
00:27:54
00:27:19
I tried the following command, but I am stuck:
sed 's/;/ /g' temp_file.txt | awk '{print $8}' | grep - | sed 's/-//g;s/00:0/0:/g'

Using sed:
sed 's/\<[0-9]\>/0&/g;s/:00-/:/g' file
The first command, s/\<[0-9]\>/0&/g, adds a leading zero to every single-digit number (\< and \> are GNU sed word boundaries).
The second command, s/:00-/:/g, removes the 00- that now precedes each number while keeping the colon.

For the sample shown, the following awk may help:
awk -F":" '{for(i=1;i<=NF;i++){sub(/0-/,"",$i);$i=length($i)==1?0$i:$i}} 1' OFS=":" Input_file
If you want to save the output back into Input_file, append > temp_file && mv temp_file Input_file to the command above.

For the given example, this one-liner does the job:
awk -F':0-' '{printf "%02d:%02d:%02d\n",$1,$2,$3}' file

What if my output has two time columns? When I try one of the regexps above, it also adds a "0" to the first time column (the timestamp), which I don't want. Only column 7 (duration_time; columns are separated by ;) should be modified.
01;12May2018 8:20:36;192.168.1.111;78787;192.168.1.111;78787;80:25:0-49;2018-05-12_111111;RO
02;14May2018 2:43:16;192.168.1.132;78787;192.168.1.111;78787;36:10:0-10;2018-05-12_111111;RO
03;15May2018 7:40:01;192.168.131.1;78787;192.168.1.111;78787;18:39:0-44;2018-05-12_111111;RO
04;15May2018 12:37:46;192.168.1.201;78787;192.168.1.111;78787;12:51:0-14;2018-05-12_111111;RO
Here is the output:
root#root> sed 's/\<[0-9]\>/0&/g;s/:00-/:/g' temp_file
01;12May2018 08:20:36;192.168.01.111;78787;192.168.01.111;78787;80:25:49;2018-05-12_111111;RO
02;14May2018 02:43:16;192.168.01.132;78787;192.168.01.111;78787;36:10:10;2018-05-12_111111;RO
03;15May2018 07:40:01;192.168.131.01;78787;192.168.01.111;78787;18:39:44;2018-05-12_111111;RO
04;15May2018 12:37:46;192.168.01.201;78787;192.168.01.111;78787;12:51:14;2018-05-12_111111;RO
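To leave the timestamp and IP addresses alone, one option is to split only the seventh ;-separated field on : inside awk and rebuild it. This is a minimal sketch, assuming the duration is always field 7 and the stray part is always a leading 0-; the function name fix_duration is made up for illustration.

```shell
# Hypothetical helper: normalise only field 7 (the duration) of ';'-separated input.
fix_duration() {
  awk 'BEGIN { FS = OFS = ";" }
  {
    n = split($7, t, ":")                       # break the duration into its parts
    for (i = 1; i <= n; i++) {
      sub(/^0-/, "", t[i])                      # drop the stray "0-" prefix
      if (length(t[i]) == 1) t[i] = "0" t[i]    # zero-pad single digits
    }
    d = t[1]
    for (i = 2; i <= n; i++) d = d ":" t[i]     # rebuild the duration
    $7 = d
  }
  1' "$@"
}

# Usage on one of the sample lines; the other fields pass through untouched.
sample='01;12May2018 8:20:36;192.168.1.111;78787;192.168.1.111;78787;80:25:0-49;2018-05-12_111111;RO'
printf '%s\n' "$sample" | fix_duration
```

Because every other field is only re-joined with the same ; separator, the timestamp in field 2 keeps its original 8:20:36 form.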

Related

Bash replace in CSV multiple columns

I have the following CSV format:
data_disk01,"/opt=920MB;4512;4917;0;4855","/=4244MB;5723;6041;0;6359","/tmp=408MB;998;1053;0;1109","/var=789MB;1673;1766;0;1859","/boot=53MB;656;692;0;729"
I would like to take from each column, except the first one, the last value from the array, like this:
data_disk01,"/opt=4855","/=6359","/tmp=1109","/var=1859","/boot=729"
I have tried something like:
awk 'BEGIN {FS=OFS=","} {if(NF==!1);gsub(/\=.*/,",")} 1'
Just the string, I managed to do it with:
string="/opt=920MB;4512;4917;0;4855"
echo $string | awk '{split($0,a,";"); print a[1],a[5]}' | sed 's#=.* #=#'
/opt=4855
But could not make it work for the whole CSV.
Any hints are appreciated.
If your input never contains commas in the quoted fields, a simple sed script should work:
sed 's/=[^"]*;/=/g' file.csv
The same substitution also works in awk:
awk '{gsub(/=[^"]*;/,"=")} 1' Input_file
If you want to save the output back into Input_file, append > temp_file && mv temp_file Input_file to the command above.
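A field-by-field variant, closer to the FS=OFS="," attempt in the question, can skip the first column explicitly. This is only a sketch, assuming no quoted field contains a comma; keep_last_value is a name invented here.

```shell
# Hypothetical helper: in every field except the first, keep only the value after the last ';'.
keep_last_value() {
  awk 'BEGIN { FS = OFS = "," }
  {
    for (i = 2; i <= NF; i++)
      sub(/=.*;/, "=", $i)   # greedy match: removes everything between "=" and the last ";"
  }
  1' "$@"
}

# Usage on a shortened version of the sample row.
printf '%s\n' 'data_disk01,"/opt=920MB;4512;4917;0;4855","/=4244MB;5723;6041;0;6359"' | keep_last_value
```

The greedy .* is what makes this work: within a single field it runs up to the last semicolon, so only the final value survives.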

Extract the last three columns from a text file with awk

I have a .txt file like this:
ENST00000000442 64073050 64074640 64073208 64074651 ESRRA
ENST00000000233 127228399 127228552 ARF5
ENST00000003100 91763679 91763844 CYP51A1
I want to get only the last 3 columns of each line.
As you can see, there are sometimes empty lines between two lines, and these must be ignored. Here is the output that I want to produce:
64073208 64074651 ESRRA
127228399 127228552 ARF5
91763679 91763844 CYP51A1
awk  '/a/ {print $1- "\t" $-2 "\t" $-3}'  file.txt.
It does not return what I want. Do you know how to correct the command?
The following awk may help:
awk 'NF{print $(NF-2),$(NF-1),$NF}' OFS="\t" Input_file
Output will be as follows.
64073208 64074651 ESRRA
127228399 127228552 ARF5
91763679 91763844 CYP51A1
EDIT: Adding an explanation of the command. (NOTE: the annotated version below is for explanation only; run the command above to get the results.)
awk 'NF ###Checking here condition NF(where NF is a out of the box variable for awk which tells number of fields in a line of a Input_file which is being read).
###So checking here if a line is NOT NULL or having number of fields value, if yes then do following.
{
print $(NF-2),$(NF-1),$NF###Printing values of $(NF-2) which means 3rd last field from current line then $(NF-1) 2nd last field from line and $NF means last field of current line.
}
' OFS="\t" Input_file ###Setting OFS(output field separator) as TAB here and mentioning the Input_file here.
You can use sed too (this relies on GNU sed for \t and assumes tab-separated input):
sed -E '/^$/d;s/.*\t(([^\t]*[\t|$]){2})/\1/' infile
With some piping:
$ cat file | tr -s '\n' | rev | cut -f 1-3 | rev
64073208 64074651 ESRRA
127228399 127228552 ARF5
91763679 91763844 CYP51A1
First, cat the file to tr to squeeze out repeated \ns and get rid of empty lines. Then reverse the lines, cut the first three fields and reverse again. You could replace the useless cat with the first rev.

How to print the csv file excluding first column till end using awk

I have a csv file with dynamic columns.
I've tried to use awk -F , 'NF>1' resul1.txt but it still prints all columns.
Because the number of columns varies, it is quite difficult to print each column explicitly with print.
Try this awk command:
awk -F, '{$1=""}1' input.txt | awk -vOFS=, '{$1=$1}1' > output.txt
Make the 1st field empty
Print out entire line again
Try the substr function:
substr(string, start [, length ])
Return a length-character-long substring of string, starting at character number start. The first character of a string is character number one. For example, substr("washington", 5, 3) returns "ing".
awk -F, '{print substr($0,length($1)+1+length(FS))}' file
You can use cut:
cut -d',' -f2- yourfile.csv > output.csv
Explanation:
-d - setting delimiter to ,
-f - fields to print
2- - from 2 field to end of line
With awk:
awk '{sub(/^[^,]*,/,"")}1' yourfile.csv > output.csv
With sed:
sed -i.bak 's/^[^,]*,//' yourfile.csv
-i.bak - edit the file in place, keeping a backup copy as yourfile.csv.bak
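As a quick check of the cut approach, the sketch below runs it on a few made-up rows, including one with an empty first field. The function name drop_first_column and the sample data are illustrative assumptions, and the input is assumed to have no quoted commas.

```shell
# Hypothetical wrapper around cut: print every field from the second one onward.
drop_first_column() {
  cut -d',' -f2- "$@"
}

# Rows with differing column counts and an empty first field.
printf '%s\n' 'id,name,city' '7,Ann,Paris,FR' ',x,y' | drop_first_column
# name,city
# Ann,Paris,FR
# x,y
```

Lines that contain no comma at all are printed unchanged by cut; adding -s suppresses them instead.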

Print last line of text file

I have a text file like this:
1.2.3.t
1.2.4.t
complete
I need to get the last two non-blank lines as two variables. The output should be:
a=1.2.4.t
b=complete
I tried this for last line:
b=awk '/./{line=$0} END{print line}' myfile
but I have no idea for a.
grep . file | tail -n 2 | sed 's/^ *//;1s/^/a=/;2s/^/b=/'
Output:
a=1.2.4.t
b=complete
awk to the rescue!
$ awk 'NF{a=b;b=$0} END{print "a="a;print "b="b}' file
a=1.2.4.t
b=complete
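Or, if you want real variable assignment (in bash, piping awk into read would by default set the variables in a subshell, so process substitution is used here):
$ read a b < <(awk 'NF{a=b;b=$0} END{print a, b}' file); echo "a=$a"; echo "b=$b"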
a=1.2.4.t
b=complete
You may need the -r option for read if the values contain backslashes.
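If a line may contain spaces, splitting with read will cut it apart. A sketch that avoids this uses one command substitution per variable, at the cost of reading the file twice. The temporary file below merely stands in for myfile.

```shell
# Build a throwaway sample file with blank lines mixed in (stand-in for myfile).
tmp=$(mktemp)
printf '1.2.3.t\n1.2.4.t\n\ncomplete\n\n' > "$tmp"

# grep . keeps only non-empty lines; tail/head then pick the last two.
a=$(grep . "$tmp" | tail -n 2 | head -n 1)   # second-to-last non-empty line
b=$(grep . "$tmp" | tail -n 1)               # last non-empty line

echo "a=$a"
echo "b=$b"
rm -f "$tmp"
```

Note that grep . treats a line holding only spaces as non-empty; grep '[^[:space:]]' would skip those as well.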

Concatenating characters on each field of CSV file

I am dealing with a CSV file which has the following form:
Dates;A;B;C;D;E
"1999-01-04";1391.12;3034.53;66.515625;86.2;441.39
"1999-01-05";1404.86;3072.41;66.3125;86.17;440.63
"1999-01-06";1435.12;3156.59;66.4375;86.32;441
Since the BLAS routine I need to implement on such data takes double-floats only, I guess the easiest way is to concatenate d0 at the end of each field, so that each line looks like:
"1999-01-04";1391.12d0;3034.53d0;66.515625d0;86.2d0;441.39d0
In pseudo-code, that would be:
For every line except the first line
For every field except the first field
Substitute ; with d0; and Substitute newline with d0 newline
My guess is that it should be something like
cat file.csv | awk -F; 'NR>1 & NF>1'{print line} | sed 's/;/d0\n/g' | sed 's/\n/d0\n/g'
Any input?
You could use this sed command:
sed '1!{s/\(;[^;]*\)/\1d0/g}' file
This skips the first line, then replaces each field that begins with ; (which excludes the first field) with itself followed by d0.
Output
Dates;A;B;C;D;E
"1999-01-04";1391.12d0;3034.53d0;66.515625d0;86.2d0;441.39d0
"1999-01-05";1404.86d0;3072.41d0;66.3125d0;86.17d0;440.63d0
"1999-01-06";1435.12d0;3156.59d0;66.4375d0;86.32d0;441d0
I would say:
$ awk 'BEGIN{FS=OFS=";"} NR>1 {for (i=2;i<=NF;i++) $i=$i"d0"} 1' file
Dates;A;B;C;D;E
"1999-01-04";1391.12d0;3034.53d0;66.515625d0;86.2d0;441.39d0
"1999-01-05";1404.86d0;3072.41d0;66.3125d0;86.17d0;440.63d0
"1999-01-06";1435.12d0;3156.59d0;66.4375d0;86.32d0;441d0
That is, set the field separator to ;. Starting on line 2, loop through all the fields from the 2nd one appending d0. Then, use 1 to print the line.
Your data format looks a bit unusual. The double quotes around the first column suggest that it may contain the delimiter, the semicolon, itself. I don't know the application that produces this data, but if that is the case, you can use the following GNU awk command:
awk 'NR>1{for(i=2;i<=NF;i++){$i=$i"d0"}}1' OFS=\; FPAT='("[^"]+")|([^;]+)' file
The key here is the FPAT variable. With it, you define what a field looks like instead of being limited to specifying a set of field delimiters.
big-prices.csv
Dates;A;B;C;D;E
"1999-01-04";1391.12;3034.53;66.515625;86.2;441.39
"1999-01-05";1404.86;3072.41;66.3125;86.17;440.63
"1999-01-06";1435.12;3156.59;66.4375;86.32;441
preprocess script
head -n 1 big-prices.csv 1>output.txt; \
tail -n +2 big-prices.csv | \
sed 's/;/d0;/g' | \
sed 's/$/d0/g' | \
sed 's/"d0/"/g' 1>>output.txt;
output.txt
Dates;A;B;C;D;E
"1999-01-04";1391.12d0;3034.53d0;66.515625d0;86.2d0;441.39d0
"1999-01-05";1404.86d0;3072.41d0;66.3125d0;86.17d0;440.63d0
"1999-01-06";1435.12d0;3156.59d0;66.4375d0;86.32d0;441d0
Note: the second sed needs a minor modification if the file has trailing whitespace at the ends of lines.
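One way to cope with trailing whitespace is to strip it before appending d0. The sketch below folds the same three substitutions into a single sed call that skips the header line; append_d0 is a name made up for this example, and the quoted first column is assumed never to contain a semicolon.

```shell
# Hypothetical helper: append d0 to every field except the first, on every line except the header.
append_d0() {
  sed '1!{
    s/[[:space:]]*$//
    s/;/d0;/g
    s/$/d0/
    s/"d0/"/g
  }' "$@"
}

# Usage: the data line ends in two stray spaces.
printf '%s\n' 'Dates;A;B' '"1999-01-04";1391.12;441  ' | append_d0
```

The [[:space:]] class also matches a carriage return, so files with Windows line endings would lose their \r before d0 is appended.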
Using awk
Input
$ cat file
Dates;A;B;C;D;E
"1999-01-04";1391.12;3034.53;66.515625;86.2;441.39
"1999-01-05";1404.86;3072.41;66.3125;86.17;440.63
"1999-01-06";1435.12;3156.59;66.4375;86.32;441
gsub (any awk)
$ awk 'FNR>1{ gsub(/;[^;]*/,"&d0")}1' file
Dates;A;B;C;D;E
"1999-01-04";1391.12d0;3034.53d0;66.515625d0;86.2d0;441.39d0
"1999-01-05";1404.86d0;3072.41d0;66.3125d0;86.17d0;440.63d0
"1999-01-06";1435.12d0;3156.59d0;66.4375d0;86.32d0;441d0
gensub (gawk)
$ awk 'FNR>1{ print gensub(/(;[^;]*)/,"\\1d0","g"); next }1' file
Dates;A;B;C;D;E
"1999-01-04";1391.12d0;3034.53d0;66.515625d0;86.2d0;441.39d0
"1999-01-05";1404.86d0;3072.41d0;66.3125d0;86.17d0;440.63d0
"1999-01-06";1435.12d0;3156.59d0;66.4375d0;86.32d0;441d0
