Awk input and output file delimiter - bash

I am trying to parse a colon-delimited password file with awk, putting the hostname at the beginning followed by some of the fields. I need comma-separated output. This is what I tried:
/usr/xpg4/bin/awk -F':' MYHOST=$(hostname) 'BEGIN{OFS=",";} {print MYHOST, $1, $3, $4, $5;}' /etc/passwd
But this command didn't produce the output I wanted. This is a Solaris box; the regular awk didn't work, so I tried /usr/xpg4/bin/awk.

This may help you. Pass the hostname with -v; without it, awk takes MYHOST=$(hostname) as the program text:
/usr/xpg4/bin/awk -F':' -v MYHOST="$(hostname)" 'BEGIN{OFS=","} {print MYHOST, $1, $3, $4, $5;}' /etc/passwd
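As a minimal sketch of the same idea, the command can be wrapped in a small function and tried on a single sample line instead of the real /etc/passwd. The function name passwd_to_csv, the host name "myhost" and the sample line are illustrative assumptions, and plain awk is used here in place of /usr/xpg4/bin/awk.

```shell
# Hypothetical helper: prefix selected passwd fields with a host name.
# $1 = host name to prepend; passwd-style lines arrive on stdin.
passwd_to_csv() {
    awk -F':' -v MYHOST="$1" 'BEGIN { OFS = "," } { print MYHOST, $1, $3, $4, $5 }'
}

# Made-up sample line, not taken from a real system.
printf 'root:x:0:0:Super-User:/:/sbin/sh\n' | passwd_to_csv "myhost"
# → myhost,root,0,0,Super-User
```

On a real box the call would be passwd_to_csv "$(hostname)" < /etc/passwd.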

Related

Custom field and row separator in AWK (bash)

I have data in this order :
Person:Joe
Age:24
City:PH
---
Person:Joe
Age:22
City:NY
And I want to get the data in this format:
Joe|24|PH
Joe|22|NY
I tried with a custom RS and OFS but I can't get it to work properly.
$ awk -v RS= -F'[:[:space:]]+' -v OFS='|' '{print $2, $4, $6}' file
Joe|24|PH
Joe|22|NY
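The RS= form above relies on awk's paragraph mode, where records are separated by blank lines. If the records are instead separated by a literal --- line, as the sample in the question appears to show, a sketch along these lines may work. The function name records_to_rows is hypothetical, and the code assumes every data line has the form key:value with a non-empty value.

```shell
# Hypothetical helper: join the values of each "---"-separated block with "|".
records_to_rows() {
    awk -F':' '
        /^---$/ { if (row != "") print row; row = ""; next }  # separator line: flush the row
        { row = (row == "" ? $2 : row "|" $2) }               # append the value after the colon
        END { if (row != "") print row }                      # flush the last block
    '
}

printf 'Person:Joe\nAge:24\nCity:PH\n---\nPerson:Joe\nAge:22\nCity:NY\n' | records_to_rows
# → Joe|24|PH
# → Joe|22|NY
```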

Convert time from dd/mm/yyyy hh:mm:ss to unix timestamp in bash script

I have browsed through the similar threads and they helped me come closest to what I want to do but didn't fully answer my question.
I have a date in the format dd/mm/yyyy hh:mm:ss ($mydate = 26/12/2013 09:42:42) that I want to convert to a Unix timestamp with the command:
date -d $mydate +%s
But here the accepted format is this one: yyyy-mm-dd hh:mm:ss
So I did this transformation:
echo $mydate| awk -F' ' '{printf $1}'| awk -F/ '{printf "%s-%s-%s\n",$3,$2,$1}'
And I get this output:
2013-12-26
Which is good. Now I try to append the hour part before doing the conversion:
echo $mydate| awk -F' ' '{printf $1; $hour=$2}'| awk -F/ '{printf "%s-%s-%s %s\n",$3,$2,$1,$hour}'
But then I have this:
2013-12-26 26/12/2013
It seems not to keep the variable $hour.
I am new to awk; how could I do this?
In awk you can use a regex as a field separator. In your case instead of using awk twice you may want to do the following:
echo $mydate| awk -F' |/' '{printf "%s-%s-%s %s",$3,$2,$1,$4}'
With this we use both space and / as separators. First 3 parts are the date field, 4th one is time which lies after space.
The thing here is that you are "losing" the time block when you say awk '{print $1}'.
Instead, you can use a single awk command like:
awk -F'[: /]' '{printf "%d-%d-%d %d:%d:%d", $3, $2, $1, $4, $5, $6}'
This slices the record based on either :, / or space and then puts them back together in the desired format.
Test:
$ echo "26/12/2013 09:42:42" | awk -F'[: /]' '{printf "%d-%d-%d %d:%d:%d", $3, $2, $1, $4, $5, $6}'
2013-12-26 9:42:42
Then store it in a variable and pass it to date -d to get the timestamp: formatted_date=$(awk '...'); date -d "$formatted_date" +%s.
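Put together, the conversion might look like the sketch below. The function name to_iso is hypothetical. It splits only on space and /, and prints with %s rather than %d, so the time part stays in one field and leading zeros are kept. The final date -d step is left as a comment because it assumes GNU date.

```shell
# Hypothetical helper: dd/mm/yyyy hh:mm:ss -> yyyy-mm-dd hh:mm:ss
to_iso() {
    # Fields after splitting on space or "/": day, month, year, time
    printf '%s\n' "$1" | awk -F'[ /]' '{ printf "%s-%s-%s %s\n", $3, $2, $1, $4 }'
}

mydate='26/12/2013 09:42:42'
formatted_date=$(to_iso "$mydate")
echo "$formatted_date"
# → 2013-12-26 09:42:42

# With GNU date (an assumption), the timestamp would then come from:
#   date -d "$formatted_date" +%s
```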

enclose a string where missing double quotes

I have an input file like the one below. The file is pipe-delimited and the fields are optionally enclosed in double quotes. The closing quote is missing at the end of the third field, and as far as I can see this happens whenever the field length exceeds, say, 2.
"SER1828"|"ZXC"|"A1"|10002
"SER1878"|"IOP"|"B1"|98989
"SER1930"|"QWE"|"A2"|10301
"SER1930"|"QWE"|"Asdf2|10301 # 3rd field -> closing " missed out
The output should look like
"SER1828"|"ZXC"|"A1"|10002
"SER1878"|"IOP"|"B1"|98989
"SER1930"|"QWE"|"A2"|10301
"SER1930"|"QWE"|"Asdf2"|10301
I was trying with some awk commands but could not achieve it.
awk -F'|' -v q=\" '{$3=$3 q;}1' OFS=| temp
awk -F'|' -v q=\" '{if (length($3) > 2) ($3=$3;}1)}' OFS='|' temp
Using awk you can write,
awk -F'"?\\|' -vOFS='"|' '{print $1, $2, $3, $4}'
Example
awk -F'"?\\|' -vOFS='"|' '{print $1, $2, $3, $4}' input
"SER1828"|"ZXC"|"A1"|10002
"SER1878"|"IOP"|"B1"|98989
"SER1930"|"QWE"|"A2"|10301
"SER1930"|"QWE"|"Asdf2"|10301
What does it do?
-F'"?\\|' Sets the input field separator to either "| or |
-vOFS='"|' Sets the output field separator to "|. This is always used on output, whether the input separator was | or "|
Or you can also write
awk -F'"?\\|' -vOFS='"|' '{$1=$1}1' input
Here the assignment $1=$1 makes awk rebuild the record with the new OFS, and 1 always evaluates to true, so the rebuilt line is printed. A bare '1' is not enough: without an assignment to a field, awk prints the line unchanged.
See @Kent's comment.
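The effect of rebuilding the record can be checked on the broken line alone. This is a small sketch: the function name fix_quotes is hypothetical, and the separator is written as "?[|] (a bracket expression) instead of "?\\| purely to sidestep backslash-escaping differences; it is intended to match the same thing.

```shell
# Hypothetical helper: re-emit each line with "| between fields.
fix_quotes() {
    # $1 = $1 forces awk to rebuild $0 using OFS; the trailing 1 prints it.
    awk -F'"?[|]' -v OFS='"|' '{ $1 = $1 } 1'
}

printf '%s\n' '"SER1930"|"QWE"|"Asdf2|10301' | fix_quotes
# → "SER1930"|"QWE"|"Asdf2"|10301

# Without the assignment the record is never rebuilt, so the line comes back as-is.
printf '%s\n' '"SER1930"|"QWE"|"Asdf2|10301' | awk -F'"?[|]' -v OFS='"|' '1'
# → "SER1930"|"QWE"|"Asdf2|10301
```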
EDIT
If you want to add the quote only to the third field, based on its length, you can write something like
awk -F'|' -vOFS='|' '{print $1, $2, $3(length($3)>4 ? "\"" : ""), $4}'
This sed one-liner works for the given example:
sed 's/\([^"]\)|"/\1"|"/' file # this only works for the original example
This works for the original and current example:
sed 's/\([^"]\)|/\1"|/' file
awk '{sub(/Asdf2/,"Asdf2\"")}1' file
"SER1828"|"ZXC"|"A1"|10002
"SER1878"|"IOP"|"B1"|98989
"SER1930"|"QWE"|"A2"|10301
"SER1930"|"QWE"|"Asdf2"|10301

Print a column in awk if it matches, if not then still print the line (without that column)

I'm trying to do some filtering with awk but I'm currently running into an issue. I can make awk match a regex and print the line with the column but I cannot make it print the line without the column.
awk -v OFS='\t' '$6 ~ /^07/ {print $3, $4, $5, $6}' file
That is what I currently have. Can I make awk print the line without the sixth column if it doesn't match the regex?
Set $6 to the empty string if the regex doesn't match. As simple as that. This should do it:
awk -v OFS='\t' '{ if ($6 ~ /^07/) { print $3, $4, $5, $6 } else { $6 = ""; print $0; } }' file
Note that $0 is the entire line, including $1 and $2 (which you didn't seem to use). Because assigning to $6 makes awk rebuild $0, the columns are joined with OFS (a tab here) and an empty field is left where $6 was.
If you just want to print $3, $4 and $5 when there isn't a match, use this instead:
awk -v OFS='\t' '{ if ($6 ~ /^07/) print $3, $4, $5, $6; else print $3, $4, $5 }' file
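The same behaviour can also be written with a single print by appending the sixth column conditionally. This is only a compact variant of the command above, not a different technique; the function name filter_cols and the two sample lines are illustrative assumptions.

```shell
# Hypothetical helper: always print $3..$5, add $6 only when it starts with 07.
filter_cols() {
    awk -v OFS='\t' '{ print $3, $4, $5 ($6 ~ /^07/ ? OFS $6 : "") }'
}

# Made-up input: the first line matches, the second does not.
printf 'a b c d e 0712\na b c d e 0812\n' | filter_cols
# Line 1 comes out as c, d, e, 0712 separated by tabs; line 2 as c, d, e only.
```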

Adding new column using awk command adds the whole content to the last?

I am trying to append a current-date field with awk to every line of my pipe-separated content. Instead of the date, all 16 columns are appended again in the 17th (new) position. I tried changing things but it did not help. I think I am making some basic mistake. Can anyone help?
awk ' BEGIN{FS="|";}
{
$17=$(date +"%d-%m-%Y");
printf("%s|%s|%s|%s|%s|%s|%s|%s|%s|%s|%s|%s|%s|%s|%s|%s|%s\n", $1, $2, $3, $4, $5, $6, $7, $8, $9, $10,
$11, $12, $13, $14, $15, $16, $17);
}' /Users/temp/dispn/content3.txt | less > /Users/temp/dispn/content4.txt
Input(16 fields):
EX122YED| Buy online |example.com/EX122YED |example.com/EX122YED.jpg|White|new|00.00|in stock|XYZ|Accessories|Accessories|Male|30-40|y|EX122YED|0.0
My Output
EX122YED| Buy online |example.com/EX122YED |example.com/EX122YED.jpg|White|new|00.00|in stock|XYZ|Accessories|Accessories|Male|30-40|y|EX122YED|0.0|EX122YED| Buy online |example.com/EX122YED |example.com/EX122YED.jpg|White|new|00.00|in stock|XYZ|Accessories|Accessories|Male|30-40|y|EX122YED|0.0
Intended (17 fields):
EX122YED| Buy online |example.com/EX122YED |example.com/EX122YED.jpg|White|new|00.00|in stock|XYZ|Accessories|Accessories|Male|30-40|y|EX122YED|0.0|06-22-2014
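I guess the problem here is that $(date +"%d-%m-%Y") inside the single-quoted script is never expanded by bash, so awk evaluates it itself: date is an unset awk variable, the expression comes out as 0, and $(0) is $0, the whole line. To let the shell expand it, you can put the awk script in double quotes like this: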
awk "{print \$0\"|\"$(date +\"%d-%m-%Y\")}"
This requires escaping the $ and " characters to make it work.
If you do not need to use awk but could also use sed you could use:
sed "s/$/|$(date +"%d-%m-%Y")/"
Assign the date to a variable and print it with the entire line,
awk -F'|' -v var=$(date +"%d-%m-%Y") '{print $0,var}' OFS="|"
Example:
$ echo 'EX122YED| Buy online |example.com/EX122YED |example.com/EX122YED.jpg|White|new|00.00|in stock|XYZ|Accessories|Accessories|Male|30-40|y|EX122YED|0.0' | awk -F'|' -v var=$(date +"%d-%m-%Y") '{print $0,var}' OFS="|"
EX122YED| Buy online |example.com/EX122YED |example.com/EX122YED.jpg|White|new|00.00|in stock|XYZ|Accessories|Accessories|Male|30-40|y|EX122YED|0.0|22-06-2014
If you want month-date-year format then change the date command to date +"%m-%d-%Y"
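Both points can be reproduced on a short line. The sketch below first shows why the original script repeated the whole record, then wraps the -v approach in a function; the name append_field and the sample input are illustrative assumptions.

```shell
# Inside single quotes the shell never sees $(...), so awk evaluates it:
# "date" is an unset awk variable, the sum is 0, and $(0) is the whole line.
printf 'x|y\n' | awk '{ print $(date +"%d-%m-%Y") }'
# → x|y

# Hypothetical helper: append one value as a new last field.
# $1 = value to append; pipe-separated lines arrive on stdin.
append_field() {
    awk -v var="$1" 'BEGIN { FS = OFS = "|" } { print $0, var }'
}

# The shell expands $(date ...) before awk starts, so awk only sees the result.
printf 'a|b|c\n' | append_field "$(date +"%d-%m-%Y")"
```

Quoting the command substitution keeps the value in one piece even if the date format contains spaces.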
