I started scripting only a few weeks ago, or at least I'm trying to ...
bash-4.3# /usr/openv/netbackup/bin/admincmd/bperror -backstat -hoursago 72 \
| grep xxx1 \
| awk '{ print $1 "\t" $19 "\t" $12 "\t" $14 "\t" $16 }' >> test
bash-4.3# cat test
1535229470 0 xxx1 policy1 sched1
1535314239 0 xxx1 policy1 sched1
1535400749 0 xxx1 policy1 sched1
Now I want to transform the first entry (timestamp) into a readable date
date=$(awk 'NR == 1 {print $1}' test); bpdbm -ctime $date |awk '{ print $3 " " $4 " " $5 " " $6 " " $8 }'
Sat Aug 25 22:37:50 2018
How can I now replace the first entry on each line with this output, or change the first command so it does this directly?
thank you very much!
Using GNU awk:
awk '$1~/[0-9]+/{$1=strftime(PROCINFO["strftime"],$1)}1' file
This replaces the timestamp in the first field of the line with the corresponding readable date, using the strftime function.
The date format is the default one, PROCINFO["strftime"], as described in the gawk man page.
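For example, on the test file from the question, the output should look something like this (the timezone name depends on your system; gawk's default format is "%a %b %e %H:%M:%S %Z %Y", and note that assigning to $1 rebuilds the record with OFS, so the tabs become single spaces):
$ awk '$1~/[0-9]+/{$1=strftime(PROCINFO["strftime"],$1)}1' test
Sat Aug 25 22:37:50 CEST 2018 0 xxx1 policy1 sched1
Sun Aug 26 22:10:39 CEST 2018 0 xxx1 policy1 sched1
Mon Aug 27 22:12:29 CEST 2018 0 xxx1 policy1 sched1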
Looking to extract Specific Words from each line
Nov 2 11:25:51 imau03ftc CSCOacs_TACACS_Accounting 0687979272 1 0 2016-11-02 11:25:51.250 +13:00 0311976914 3300 NOTICE Tacacs-Accounting: TACACS+ Accounting with Command, ACSVersion=acs-5.6.0.22-B.225, ConfigVersionId=145, Device IP Address=10.107.32.53, CmdSet=[ CmdAV=show controllers <cr> ], RequestLatency=0, Type=Accounting, Privilege-Level=15, Service=Login, User=nc-rancid, Port=tty1, Remote-Address=172.26.200.204, Authen-Method=TacacsPlus, AVPair=task_id=8280, AVPair=timezone=NZDT, AVPair=start_time=1478039151, AVPair=priv-lvl=1, AcctRequest-Flags=Stop, Service-Argument=shell, AcsSessionID=imau03ftc/262636280/336371030, SelectedAccessService=Default Device Admin, Step=13006 , Step=15008 , Step=15004 , Step=15012 , Step=13035 , NetworkDeviceName=CASWNTHS133, NetworkDeviceGroups=All Devices:All Devices, NetworkDeviceGroups=Device Type:All Device Types:Corporate, NetworkDeviceGroups=Location:All Locations, Response={Type=Accounting; AcctReply-Status=Success; }
Looking to extract:
Nov 2 11:25:51 show controllers User=nc-rancid NetworkDeviceName=CASWNTHS133
I can use awk, grep or sed.
I have tried a few combinations like:
sudo tail -n 20 /var/log/tacacs/imau03ftc-accounting.log | grep -oP 'User=\K.*' & 'NetworkDeviceName=\K.*'
sudo tail -n 20 /var/log/tacacs/imau03ftc-accounting.log | sudo awk -F" " '{ print $1 " " $3 " " $9 " " $28}'
I can add a few more lines, but most of them have the same format.
thanks
Try to run this:
sudo tail -n 20 /var/log/tacacs/imau03ftc-accounting.log > tmpfile
Then execute this script:
#!/bin/sh
while read -r i
do
    # first three fields: month, day and time
    str="$(echo "$i" | awk '{print $1,$2,$3}')"
    # the command: text between "CmdAV=" and "<cr>", keeping only the value after "="
    str="$str $(echo "$i" | awk 'match($0, /CmdAV=([^<]+)/) { print substr($0, RSTART, RLENGTH) }' | awk -F "=" '{print $2}')"
    # user and device: keep the whole key=value pair up to the next comma
    str="$str $(echo "$i" | awk 'match($0, /User=([^,]+)/) { print substr($0, RSTART, RLENGTH) }')"
    str="$str $(echo "$i" | awk 'match($0, /NetworkDeviceName=([^,]+)/) { print substr($0, RSTART, RLENGTH) }')"
    # $str is left unquoted on purpose: word splitting squeezes the doubled spaces
    echo $str
done < tmpfile
Output:
Nov 2 11:25:51 show controllers User=nc-rancid NetworkDeviceName=CASWNTHS133
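If you'd rather not spawn a shell and several awk processes per line, the same extraction can be done in a single awk pass. This is just a sketch, assuming every line contains all three key=value fields as in your sample:
awk '{
    match($0, /CmdAV=[^<]+/)                  # command text up to "<cr>"
    cmd = substr($0, RSTART+6, RLENGTH-6)     # drop the "CmdAV=" prefix
    sub(/ +$/, "", cmd)                       # trim the space before "<"
    match($0, /User=[^,]+/)
    user = substr($0, RSTART, RLENGTH)
    match($0, /NetworkDeviceName=[^,]+/)
    dev = substr($0, RSTART, RLENGTH)
    print $1, $2, $3, cmd, user, dev
}' tmpfile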
I'm trying to format a date in a column of a csv.
The input is something like: 28 April 1966
And I'd like this output: 1966-04-28
which can be obtained with this command:
date -d "28 April 1966" +%F
So now I thought of combining awk and this command to format the entire column, but I can't figure out how.
Edit:
Example of input : (separators "|" are in fact tabs)
1 | 28 April 1966
2 | null
3 | null
4 | 30 June 1987
Expected output :
1 | 1966-04-28
2 | null
3 | null
4 | 1987-06-30
A simple way is
awk -F '\\| ' -v OFS='| ' '{ cmd = "date -d \"" $3 "\" +%F 2> /dev/null"; cmd | getline $3; close(cmd) } 1' filename
That is:
{
cmd = "date -d \"" $3 "\" +%F 2> /dev/null" # build shell command
cmd | getline $3 # run, capture output
close(cmd) # close pipe
}
1 # print
This works because date doesn't print anything to its stdout if the date is invalid, so the getline fails and $3 is not changed.
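You can see that behavior directly (assuming GNU date):
$ date -d "null" +%F; echo "exit status: $?"
date: invalid date 'null'
exit status: 1
The error message goes to stderr (which the command discards with 2> /dev/null), stdout stays empty, so the getline reads nothing and $3 keeps its old value.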
Caveats to consider:
For very large files, this will spawn a lot of shells and processes in those shells (one each per line). This can become a noticeable performance drag.
Be wary of code injection. If the CSV file comes from an untrustworthy source, this approach is difficult to defend against an attacker, and you're probably better off going the long way around, parsing the date manually with gawk's mktime and strftime.
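To illustrate that risk with a deliberately hostile (hypothetical) field value: if $3 were
28 April 1966" ; rm -rf "$HOME
the constructed command would become
date -d "28 April 1966" ; rm -rf "$HOME" +%F 2> /dev/null
and the rm would actually be executed by the shell.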
EDIT re: comment: To use tabs as delimiters, the command can be changed to
awk -F '\t' -v OFS='\t' '{ cmd = "date -d \"" $3 "\" +%F 2> /dev/null"; cmd | getline $3; close(cmd) } 1' filename
EDIT re: comment 2: If performance is a worry, as it appears to be, spawning processes for every line is not a good approach. In that case, you'll have to do the parsing manually. For example:
BEGIN {
OFS = FS
m["January" ] = 1
m["February" ] = 2
m["March" ] = 3
m["April" ] = 4
m["May" ] = 5
m["June" ] = 6
m["July" ] = 7
m["August" ] = 8
m["September"] = 9
m["October" ] = 10
m["November" ] = 11
m["December" ] = 12
}
$3 !~ /null/ {
split($3, a, " ")
$3 = sprintf("%04d-%02d-%02d", a[3], m[a[2]], a[1])
}
1
Put that in a file, say foo.awk, and run awk -F '\t' -f foo.awk filename.csv.
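Run against the sample input, it should print the rows with the date column rewritten in place:
1	1966-04-28
2	null
3	null
4	1987-06-30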
This should work with your given input:
awk -F'\\|' -vOFS="|" '!/null/{cmd="date -d \""$3"\" +%F";cmd | getline $3;close(cmd)}1' file
Output:
| 1 |1966-04-28
| 2 | null
| 3 | null
| 4 |1987-06-30
I would suggest using a language that supports parsing dates, like perl:
$ cat file
1 28 April 1966
2 null
3 null
4 30 June 1987
$ perl -F'\t' -MTime::Piece -lane 'print "$F[0]\t",
$F[1] eq "null" ? $F[1] : Time::Piece->strptime($F[1], "%d %B %Y")->strftime("%F")' file
1 1966-04-28
2 null
3 null
4 1987-06-30
The Time::Piece core module allows you to parse and format dates, using the standard format specifiers of strftime. This solution splits the input on a tab character and modifies the format if the second field is not "null".
This approach will be much faster than using system calls or invoking subprocesses, as everything is done in native perl.
Here is how you can do this in pure bash, avoiding system or getline calls from awk:
while IFS=$'\t' read -ra arr; do
[[ ${arr[1]} != "null" ]] && arr[1]=$(date -d "${arr[1]}" +%F)
printf "%s\t%s\n" "${arr[0]}" "${arr[1]}"
done < file
1 1966-04-28
2 null
3 null
4 1987-06-30
This needs only one date call, and no code injection is possible.
The script extracts the dates (using awk) into a temporary file, processes them with a single date call, and merges the results back (using awk).
Code
awk -F '\t' 'match($3,/null/) { $3 = "0000-01-01" } { print $3 }' input > temp.$$
date --file=temp.$$ +%F > dates.$$
awk -F '\t' -v OFS='\t' 'BEGIN {
while ( getline < "'"dates.$$"'" > 0 )
{
f1_counter++
if ($0 == "0000-01-01") {$0 = "null"}
date[f1_counter] = $0
}
}
{$3 = date[NR]}
1' input
One-liner using bash process substitution (no temporary files):
inputfile=/path/to/input
awk -F '\t' -v OFS='\t' 'BEGIN {while ( getline < "'<(date -f <(awk -F '\t' 'match($3,/null/) { $3 = "0000-01-01" } { print $3 }' "$inputfile") +%F)'" > 0 ){f1_counter++; if ($0 == "0000-01-01") {$0 = "null"}; date[f1_counter] = $0}}{$3 = date[NR]}1' "$inputfile"
Details
Here is how it can be used:
# configuration
input=/path/to/input
temp1=temp.$$
temp2=dates.$$
output=output.$$
# create the sample file (optional)
#printf "\t%s\n" $'1\t28 April 1966' $'2\tnull' $'3\tnull' $'4\t30 June 1987' > "$input"
# Extract all dates
awk -F '\t' 'match($3,/null/) { $3 = "0000-01-01" } { print $3 }' "$input" > "$temp1"
# transform the dates
date --file="$temp1" +%F > "$temp2"
# merge csv with transformed date
awk -F '\t' -v OFS='\t' 'BEGIN {while ( getline < "'"$temp2"'" > 0 ){f1_counter++; if ($0 == "0000-01-01") {$0 = "null"}; date[f1_counter] = $0}}{$3 = date[NR]}1' "$input" > "$output"
# print the output
cat "$output"
# cleanup
rm "$temp1" "$temp2" "$output"
#rm "$input"
Caveats
Using "0000-01-01" as a temporary placeholder for invalid (null) dates
The code should be faster than other methods calling "date" a lot of times, but it reads the input file two times.
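The line-by-line merge works because date --file prints exactly one output line per input line, and GNU date round-trips the placeholder unchanged:
$ printf '%s\n' "28 April 1966" "0000-01-01" | date --file=- +%F
1966-04-28
0000-01-01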
I have a csv file that needs a lot of manipulation. Maybe by using awk and sed?
input:
"Sequence","Fat","Protein","Lactose","Other Solids","MUN","SCC","Batch Name"
1,4.29,3.3,4.69,5.6,11,75,"35361305a"
2,5.87,3.58,4.41,5.32,10.9,178,"35361305a"
3,4.01,3.75,4.75,5.66,12.2,35,"35361305a"
4,6.43,3.61,3.56,4.41,9.6,275,"35361305a"
final output:
43330075995647
59360178995344
40380035995748
64360275964436
I'm able to get through some of it going step by step.
How do I test specific columns for a value over 9.9 and replace it with 9.9?
Also, is there a way to combine any of these steps?
remove first line:
tail -n +2 test.csv > test1.txt
remove commas:
sed 's/,/ /g' test1.txt > test2.txt
remove quotes:
sed 's/"//g' test2.txt > test3.txt
remove columns 1 and 8 and
reorder remaining columns as 1,2,6,5,4,3:
sort test3.txt | uniq -c | awk '{print $3 "\t" $4 "\t" $8 "\t" $7 "\t" $6 "\t" $5}' > test4.txt
test new columns 1,2,4,5,6 - if the value is over 9.9, replace it with 9.9
How should I do this step?
Solutions for the following parts were found in a previous question, reformatting a text file:
columns 1,2,4,5,6: round decimals to tenths
column 3: must be four characters long, zero-filled on the left
remove periods and spaces
awk '{$0=sprintf("%.1f%.1f%4s%.1f%.1f%.1f", $1,$2,$3,$4,$5,$6);gsub(/ /,"0");gsub(/\./,"")}1' test5.txt > test6.txt
This produces the output you want from the original file. Note that in the question you specified "column 4 round to whole number", but in the desired output you had rounded it to one decimal place instead:
awk -F'[,"]+' 'function m(x) { return x < 9.9 ? x : 9.9 }
NR > 1 {
s = sprintf("%.1f%.1f%04d%.1f%.1f%.1f", m($2),m($3),$7,m($6),m($5),m($4))
gsub(/\./, "", s)
print s
}' test.csv
I have specified the field separator as any number of commas and double quotes together, so this "parses" your CSV format for you without requiring any additional steps.
The function m returns the minimum of 9.9 and the number you pass to it.
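For example, the first data row 1,4.29,3.3,4.69,5.6,11,75,"35361305a" is assembled as m(4.29)=4.3, m(3.3)=3.3, 75 zero-padded to 0075, m(11)=9.9, m(5.6)=5.6 and m(4.69)=4.7; concatenated that is 4.33.300759.95.64.7, and removing the dots yields 43330075995647.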
Output:
43330075995647
59360178995344
40380035995748
64360275964436
The first three steps in one go:
awk -F, '{gsub(/"/,"");$1=$1} NR>1' test.csv
1 4.29 3.3 4.69 5.6 11 75 35361305a
2 5.87 3.58 4.41 5.32 10.9 178 35361305a
3 4.01 3.75 4.75 5.66 12.2 35 35361305a
4 6.43 3.61 3.56 4.41 9.6 275 35361305a
tail -n +2 file | sort -u | awk -F , '
{
$0 = $1 FS $2 FS $6 FS $5 FS $4 FS $3
for (i = 1; i <= 6; ++i)
if ($i > 9.9)
$i = 9.9
$0 = sprintf("%.1f%.1f%4s%.0f%.1f%.1f", $1, $2, $3, $4, $5, $6)
gsub(/ /, "0"); gsub(/[.]/, "")
print
}
'
Or
< file awk -F , '
NR > 1 {
$0 = $1 FS $2 FS $6 FS $5 FS $4 FS $3
for (i = 1; i <= 6; ++i)
if ($i > 9.9)
$i = 9.9
$0 = sprintf("%.1f%.1f%4s%.0f%.1f%.1f", $1, $2, $3, $4, $5, $6)
gsub(/ /, "0"); gsub(/[.]/, "")
print
}
'
Output:
104309964733
205909954436
304009964838
406409643636
I have a nawk command where I need to format the data based on its length. In every case I need to keep the first 6 digits and the last 4 digits and put x's in the middle. Can you help fine-tune the script below?
#!/bin/bash
FILES=/export/home/input.txt
cat $FILES | nawk -F '|' '{
if (length($3) >= 13 )
print $1 "|" $2 "|" substr($3,1,6) "xxxxxx" substr($3,13,4) "|" $4 "|" $5
else
print $1 "|" $2 "|" $3 "|" $4 "|" $5
}' > output.txt
input.txt
"2"|"X"|"A"|"ST"|"245552544555201"|"1111-11-11"|75.00
"6"|"Y"|"D"|"VT"|"245652544555200"|"1111-11-11"|95.00
"5"|"X"|"G"|"ST"|"3445625445552023"|"1111-11-11"|75.00
"3"|"Y"|"S"|"VT"|"24532254455524"|"1111-11-11"|95.00
output.txt
"X"|"ST"|"245552544555201"|"245552xxxxx5201"
"Y"|"VT"|"245652544555200"|"245652xxxxx5200"
"X"|"ST"|"3445625445552023"|"344562xxxxxx2023"
"Y"|"VT"|"24532254455524"|"245322xxxx5524"
Try this:
$ awk '
BEGIN {FS = OFS = "|"}
length($5)>=13 {                        # mask only long enough values (the quotes count toward the length)
    fld5 = $5                           # work on a copy of field 5
    start = substr($5,1,7)              # opening quote + first 6 digits
    end = substr($5,length($5)-4)       # last 4 digits + closing quote
    gsub(/./,"x",fld5)                  # turn every character into x
    sub(/^......./,start,fld5)          # put back the first 7 characters
    sub(/.....$/,end,fld5)              # put back the last 5 characters
    $1=$2; $2=$4; $3=$5; $4=fld5; NF-=3 # keep $2, $4, $5 and the masked copy
}1' file
"X"|"ST"|"245552544555201"|"245552xxxxx5201"
"Y"|"VT"|"245652544555200"|"245652xxxxx5200"
"X"|"ST"|"3445625445552023"|"344562xxxxxx2023"
"Y"|"VT"|"24532254455524"|"245322xxxx5524"
I am trying to write a script to capture and mask a specific column. I need the 4th column both in clear text and masked in the output file. I am not sure how to mask the same column.
Please help me rewrite the command below, or suggest a new one.
input.txt
---------
AA | BB | CC | 123456
output.txt
---------
BB | 123456 | 12xx56
Script I wrote
cat input.txt | nawk -F '|' '{print $2 "|" $4 "|" $4}' > output.txt
nawk -F '|' '{print $2 "|" $4 "|" substr($4, 1,3) "xx" substr($4,6,2)}' input.txt > output.txt
output
BB | 123456| 12xx56
Assuming you don't really need the leading and trailing spaces, I would make it
nawk -F '|' '{gsub(/ */, "", $0);print $2 "|" $4 "|" substr($4, 1,2) "xx" substr($4,5,2)}' input.txt > output.txt
cat output.txt
BB|123456|12xx56
final solution
echo "AA | BB | CC | 12345678" \
| awk -F '|' '{gsub(/ */, "", $0)
#dbg print "length$4=" (length($4)-4)
masking=sprintf("%"(length($4)-4)"s", " ") ; gsub(/ /, "x", masking)
print $2 "|" $4 "|" substr($4, 1,2) masking substr($4,(length($4)-1),2)
}'
BB|12345678|12xxxx78
I'm using echo "..." to simplify the testing process. You can take that out and replace it with input.txt > output.txt at the end of the line, and it will work as before.
I've added (length($4)-1) to make the position of the second-to-last character of $4 dynamic, based on the length of whatever value is in $4.
IHTH