From a command (awk 'some expression') I get a result in the format:
Key:(white_space)Value
Key:(white_space)Value
...
How can I manipulate the result so that it is in the following format?
Key=Value
I need this because I want to put the information into a .properties file, which uses the key=value format.
In other words, I need to replace : with = and remove the whitespace.
Is there a command in awk that can achieve this?
You ask for awk, although sed provides an equally easy solution. awk makes it trivial with sub as well:
awk '{ sub(/:[ \t]*/,"=") }1'
Example
$ echo "Key: Value" | awk '{ sub(/:[ \t]*/,"=") }1'
Key=Value
Another awk approach, which works as long as the value itself contains no spaces or colons:
awk -F'[: ]' '{print $1 "=" $NF}' file.txt
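For comparison, here is a minimal sketch of the sed equivalent next to both awk variants. The sample input (including the "name: John Smith" line) and the variable names are made up for illustration. It suggests that the two substitution-based commands keep a value containing a space intact, while the -F'[: ]' version keeps only the last word.

```shell
# Hypothetical sample input: one plain value and one value containing a space.
input='Key: Value
name: John Smith'

# sed: replace the first colon and any following whitespace with "=".
sed_out=$(printf '%s\n' "$input" | sed 's/:[[:space:]]*/=/')

# awk with sub(): the same substitution, so the rest of the line is untouched.
sub_out=$(printf '%s\n' "$input" | awk '{ sub(/:[ \t]*/, "=") } 1')

# awk with -F'[: ]' and $NF: only the last space-separated word survives.
nf_out=$(printf '%s\n' "$input" | awk -F'[: ]' '{ print $1 "=" $NF }')

printf 'sed:\n%s\nawk sub:\n%s\nawk $NF:\n%s\n' "$sed_out" "$sub_out" "$nf_out"
```

If the values in the real data can contain spaces, the substitution forms are probably the safer choice for a .properties file.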
I have a file with 2 lines
123|456|789
abc|123|891
I need the output to look like below. Basically, I want to prepend the string "xyz-" to column 1 and add "xyz" as a new column 2:
xyz-123|xyz|456|789
xyz-abc|xyz|123|891
This is what I used
awk 'BEGIN{FS=OFS="fs-"$1}{print value OFS $0}' /tmp/b.log
I get
xyz-123|456|789
xyz-abc|123|891
I tried
awk 'BEGIN{FS=OFS="fs-"$1}{print value OFS $0}' /tmp/b.log|awk -F" " '{$2="fs" $0;}1' OFS=" "
In addition to the approach of updating awk fields ($1, $2, ...), we can also use substitution to do the job:
sed 's/^[^|]*/xyz-&|xyz/' file
If awk is a must:
awk '1+sub(/^[^|]*/, "xyz-&|xyz")' file
Both one-liners give the expected output.
Could you please try the following.
awk 'BEGIN{FS=OFS="|"} {$1="xyz-"$1;$2="xyz" OFS $2} 1' Input_file
OR, as per @Corentin Limier's comment, try:
awk 'BEGIN{FS=OFS="|"} {$1="xyz-" $1 OFS "xyz"} 1' Input_file
Output will be as follows.
xyz-123|xyz|456|789
xyz-abc|xyz|123|891
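The field-assignment idea can also be parameterised. The following is only a sketch: the awk variable name tag is an assumption for illustration, and the input is the two sample lines from the question. Assigning to $1 makes awk rebuild the record with OFS, which is why the new column appears between the old first and second fields.

```shell
# The two sample lines from the question, fed through a pipe.
out=$(printf '123|456|789\nabc|123|891\n' |
  awk -v tag='xyz' '
    BEGIN { FS = OFS = "|" }         # read and write pipe-separated fields
    { $1 = tag "-" $1 OFS tag }      # prefix column 1 and append the new column
    1                                # print the rebuilt record
  ')

printf '%s\n' "$out"
```

Passing the string with -v keeps the awk program itself unchanged when a different prefix is needed.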
I would use sed instead of awk as follows:
sed -e 's/^/xyz-/' -e 's/|/|xyz|/' Input_file
This prepends xyz- to the beginning of each line and changes the first | into |xyz|.
Another slight variation of sed:
sed 's/^/xyz-/;s/|/&xyz&/' file
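Both sed variants rely on two details: the replacement & stands for the text that was matched, and without the g flag only the first match on each line is replaced. A small sketch of the difference, using one of the sample lines (the variable names are made up for illustration):

```shell
line='123|456|789'

# No g flag: only the first "|" is rewritten; & re-inserts the matched "|".
first_only=$(printf '%s\n' "$line" | sed 's/|/&xyz&/')

# With g every "|" is rewritten, which is not what the question wants.
every=$(printf '%s\n' "$line" | sed 's/|/&xyz&/g')

printf '%s\n%s\n' "$first_only" "$every"
```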
I have a lot of *.csv files. I want to delete the content after a specific line: all lines after the date 20031231 should be removed.
How do I solve this problem with a few lines of shell script?
Test,20031231,000107,0.74843,0.74813
Test,20031231,000107,0.74838,0.74808
Test,20031231,000108,0.74841,0.74815
Test,20031231,000108,0.74835,0.74809
Test,20031231,000110,0.74842,0.74818
Test,20040101,000100,0.73342,0.744318
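Quick and dirty, with no other information about the constraints. Note that the range ends at the first line (from line 2 onward) containing 20031231, so later lines with the same date are dropped too: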
sed '1,/20031231/p;d' YourFile
If you want to use a shell script, awk is a good choice. This will do the trick:
awk 'BEGIN {FS=","} {if ($2 == "20031231") print $0}' input.csv > output.csv
This code writes only the lines whose second field is 20031231 to a different file.
The following ignores empty lines and unmatched data.
awk file:
$ cat awk.awk
{
    if ($2 <= "20031231" && $0 != "") {
        print $0
    } else {
        next
    }
}
execution:
$ awk -F',' -f awk.awk input
Test,20031231,000107,0.74843,0.74813
Test,20031231,000107,0.74838,0.74808
Test,20031231,000108,0.74841,0.74815
Test,20031231,000108,0.74835,0.74809
Test,20031231,000110,0.74842,0.74818
one-liner:
$ awk -F',' '{if($2<="20031231" && $0!=""){print $0}else{next}}' input
Test,20031231,000107,0.74843,0.74813
Test,20031231,000107,0.74838,0.74808
Test,20031231,000108,0.74841,0.74815
Test,20031231,000108,0.74835,0.74809
Test,20031231,000110,0.74842,0.74818
With Miller (http://johnkerl.org/miller/doc/):
mlr --nidx --fs "," filter '$2>20031231' input
gives you the lines after 20031231 (use $2<=20031231 instead to keep the lines up to that date):
Test,20040101,000100,0.73342,0.744318
With awk, please try:
awk -F, '$2<=20031231' input.csv
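Since the question mentions many *.csv files, the filter can be wrapped in a loop. This is a rough sketch under stated assumptions: the rows are sorted by date (as in the sample), the scratch directory and the variable name cutoff are made up for illustration, and each file is rewritten through a temporary file because awk cannot portably edit in place.

```shell
# Work in a scratch directory so the sketch does not touch real data.
workdir=$(mktemp -d)
printf '%s\n' \
  'Test,20031231,000107,0.74843,0.74813' \
  'Test,20031231,000110,0.74842,0.74818' \
  'Test,20040101,000100,0.73342,0.744318' > "$workdir/a.csv"
cp "$workdir/a.csv" "$workdir/b.csv"

cutoff=20031231   # hypothetical name for the last date to keep

for f in "$workdir"/*.csv; do
  # Stop at the first row past the cutoff; print every row before it.
  awk -F, -v cutoff="$cutoff" '$2 > cutoff { exit } 1' "$f" > "$f.tmp" &&
    mv "$f.tmp" "$f"
done

kept_a=$(cat "$workdir/a.csv")
kept_b=$(cat "$workdir/b.csv")
rm -rf "$workdir"

printf '%s\n' "$kept_a"
```

Exiting at the first later date avoids reading the rest of a large file, but it is only correct for sorted input; for unsorted data the plain $2<=20031231 filter is the safer form.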
I'm new to awk and I find it very useful for extracting data from columns. For example, in my file I had
Data: 1234 23434 31324
If I wanted the second column I used:
awk '/Data:/ {print $3}' file.txt
But next, I had some variables inside the file, let's say:
variable_1=1
variable_2=4
How can I extract only the value? And how can I extract the name of the variable if I know the value?
awk lets you specify the field delimiter with -F:
awk -F'=' '$1 == "variable_1" {print $2}' file
Prints:
1
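The reverse lookup asked about in the question (name from value) follows the same pattern with the fields swapped. A minimal sketch, assuming the sample file content shown in the question; the awk variable name want is made up for illustration:

```shell
# Sample content from the question, held in a variable instead of file.txt.
input='Data: 1234 23434 31324
variable_1=1
variable_2=4'

# Value by name: print field 2 where field 1 is variable_1.
value=$(printf '%s\n' "$input" | awk -F'=' '$1 == "variable_1" { print $2 }')

# Name by value: print field 1 where field 2 equals the wanted value.
name=$(printf '%s\n' "$input" | awk -F'=' -v want='4' '$2 == want { print $1 }')

printf 'value=%s name=%s\n' "$value" "$name"
```

The Data: line contains no = sign, so it has a single field and never matches either condition.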
You can do a lot of things with your file, what do you really want?
Get values:
source file.txt
echo "variable_1=${variable_1}"
echo "variable_2=${variable_2}"
Get the keys corresponding to value 2 (-n and the p flag print only the lines where the substitution happened):
sed -n '/=2$/ s/=.*//p' file.txt
I need to convert a 4-column file to 4 lines per entry. The file is tab-delimited.
The file is currently arranged in the following format, with each line representing one record/sequence (with millions of such lines):
#SRR1012345.1 NCAATATCGTGG #4=DDFFFHDHH HWI-ST823:136:C24YTACXX
#SRR1012346.1 GATTACAGATCT #4=DDFFFHDHH HWI-ST823:136:C22YTAGXX
I need to rearrange this such that the four columns are presented as 4 lines:
#SRR1012345.1
NCAATATCGTGG
#4=DDFFFHDHH
HWI-ST823:136:C24YTACXX
#SRR1012346.1
GATTACAGATCT
#4=DDFFFHDHH
HWI-ST823:136:C22YTAGXX
What would be the best way to go about doing this, preferably with a bash one-liner? Thank you for your assistance!
You can use tr:
< file tr '\t' '\n' > newfile
It is very clear to use awk here:
awk '{print $1; print $2; print $3; print $4}' file
Alternatively, set OFS to a newline and force awk to rebuild the record:
$ awk -v OFS='\n' '{$1=$1}1' file
#SRR1012345.1
NCAATATCGTGG
#4=DDFFFHDHH
HWI-ST823:136:C24YTACXX
#SRR1012346.1
GATTACAGATCT
#4=DDFFFHDHH
HWI-ST823:136:C22YTAGXX
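The awk commands above split on any run of spaces or tabs. If a column could ever contain a space, setting the input separator to a tab explicitly may be safer. A sketch, assuming the file really is tab-delimited as stated in the question (the sample record is built with printf so that the tabs are real):

```shell
# One sample record with real tab characters between the four columns.
record=$(printf '#SRR1012345.1\tNCAATATCGTGG\t#4=DDFFFHDHH\tHWI-ST823:136:C24YTACXX')

# Split on tabs only, join with newlines; $1 = $1 forces the record rebuild.
awk_out=$(printf '%s\n' "$record" | awk -F'\t' -v OFS='\n' '{ $1 = $1 } 1')

# tr gives the same result by translating every tab into a newline.
tr_out=$(printf '%s\n' "$record" | tr '\t' '\n')

printf '%s\n' "$awk_out"
```

For millions of lines, tr is likely the lighter tool, since it does no field splitting at all.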