Bash find/replace and run command on matching group

I'm trying to do a dynamic find/replace where a matching group from the find gets manipulated in the replace.
testfile:
…
other text
base64_encode_SOMEPATH_ something
other(stuff)
text base64_encode_SOMEOTHERPATH_
…
Something like this:
sed -i "" -e "s/(base64_encode_(.*)_)/cat MATCH | base64/g" testfile
Which would output something like:
…
other text
U09NRVNUUklORwo= something
other(stuff)
text U09NRU9USEVSU1RSSU5HCg==
…

Updated per your new requirement. Now using GNU awk for the 3rd arg to match() for convenience:
$ awk 'match($0,/(.*)base64_encode_([^_]+)_(.*)/,arr) {
    cmd = "base64 <<<" arr[2]
    if ( (cmd | getline rslt) > 0 ) {
        $0 = arr[1] rslt arr[3]
    }
    close(cmd)
} 1' file
…
other text
U09NRVNUUklORwo= something
other(stuff)
text U09NRU9USEVSU1RSSU5HCg==
…
Make sure you read and understand http://awk.info/?tip/getline if you're going to use getline.
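To summarize the safe idiom used above (variable names here are just the ones from the script): test getline's return value before trusting the result, and close() the command so it can be re-run on later lines:

# minimal sketch of the safe cmd|getline idiom
cmd = "base64 <<<" arr[2]
if ( (cmd | getline rslt) > 0 ) {   # only trust rslt when a line was actually read
    print rslt                      # on failure, rslt would silently keep its old value
}
close(cmd)                          # lets the command be spawned again for the next line

Note too that the <<< here-string is executed by whatever shell awk uses to run commands (normally /bin/sh), so it assumes that shell supports here-strings; piping from printf to base64 would be the more portable spelling.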
If you can't install GNU awk (but you really, REALLY would benefit from having it so do try) then something like this would work with any modern awk:
$ awk 'match($0,/base64_encode_[^_]+_/) {
    arr[1] = substr($0,1,RSTART-1)
    arr[2] = arr[3] = substr($0,RSTART+length("base64_encode_"))
    sub(/_.*$/,"",arr[2])
    sub(/^[^_]+_/,"",arr[3])
    cmd = "base64 <<<" arr[2]
    if ( (cmd | getline rslt) > 0 ) {
        $0 = arr[1] rslt arr[3]
    }
    close(cmd)
} 1' file
I say "something like" because you might need to tweak the substr() and/or sub() args if they're slightly off, I haven't tested it.

awk '!/^base64_encode_/ { print } /^base64_encode_/ { fflush(); sub("^base64_encode_", ""); sub("_$", ""); cmd = "base64"; print $0 | cmd; close(cmd); }' testfile > testfile.out
This says to print non-matching lines unaltered.
Matching lines get altered with the awk function sub() to extract the string to be encoded, which is then piped to the base64 command, which prints the result to stdout.
The fflush call is needed so that all the previous output from awk has been flushed before the base64 output appears, ensuring lines aren't re-ordered.
Edit:
As pointed out in the comment, testing every line twice, once for matching a pattern and once for not matching the same pattern, isn't very good. This single action handles all lines:
{
    if ($0 !~ "base64_encode_")
    {
        print;
        next;
    }
    fflush();
    sub("^.*base64_encode_", "");
    sub("_$", "");
    cmd = "base64";
    print $0 | cmd;
    close(cmd);
}
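To run it, save the action in a file (encode.awk is just a hypothetical name) and invoke it the same way as the one-liner above:

$ awk -f encode.awk testfile > testfile.out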

Related

Increment a value regarding a pattern in a file

I have a file like this:
"A";"1"
"A";""
"A";""
"B";"1"
"C";"1"
"C";""
"C";""
When the first field of the current line matches the first field of the previous line, I want to increment the second field of my line,
like this:
"A";"1"
"A";"2"
"A";"3"
"B";"1"
"C";"1"
"C";"2"
"C";"3"
In other words, if the second field is empty, I take the value from the previous line and increment it.
Do you have any idea how I can do this with a shell script, or maybe with an awk or sed command?
With perl:
$ perl -F';' -lane 'if ($F[1] =~ /"(\d+)"/) { $saved = $1; } else { $saved++; $F[1] = qq/"$saved"/; }
    print join(";", @F)' example.txt
"A";"1"
"A";"2"
"A";"3"
"B";"1"
"C";"1"
"C";"2"
"C";"3"
With awk:
$ awk -F';' -v OFS=';' '
$2 ~ /"[0-9]+"/ { saved = substr($2, 2, length($2) - 2) }
$2 == "\"\"" { $2 = "\"" ++saved "\"" }
{ print }' example.txt
"A";"1"
"A";"2"
"A";"3"
"B";"1"
"C";"1"
"C";"2"
"C";"3"

Appending result of function on another field into csv using shell script, awk

I have a csv file stored as a temporary variable in a shell script (*.sh).
Let's say the data looks like this:
Account,Symbol,Price
100,AAPL US,200
102,SPY US,500
I want to add a fourth column, "Type", which is the result of a shell function "foobar". Run from the command line or from a shell script, it behaves like this:
$ foobar "AAPL US"
"Stock"
$ foobar "SPY US"
"ETF"
How do I add this column to my csv, and populate it with calls to foobar which take the second column as an argument? To clarify, this is my ideal result post-script:
Account,Symbol,Price,Type
100,AAPL US,200,Common Stock
102,SPY US,500,ETF
I see many examples online involving such a column addition using awk, and populating the new column with fixed values, conditional values, mathematical derivations from other columns, etc. - but nothing that calls a function on another field and stores its output.
You may use this awk:
export -f foobar
awk 'BEGIN{FS=OFS=","} NR==1{print $0, "Type"; next} {
cmd = "foobar \"" $2 "\""; cmd | getline line; close(cmd);
print $0, line
}' file.csv
Account,Symbol,Price,Type
100,AAPL US,200,Common Stock
102,SPY US,500,ETF
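One caveat worth checking on your system: cmd | getline runs the command via /bin/sh, and export -f only makes foobar visible to bash children. If /bin/sh isn't bash, a workaround (an untested sketch) is to invoke bash explicitly inside the awk command string:

cmd = "bash -c \"foobar '" $2 "'\""

Exported functions travel through the environment, so they survive the intermediate sh.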
@anubhava's answer is a good approach, so please don't change the accepted answer; I'm only posting this as an answer because it's too big and in need of formatting to fit in a comment.
FWIW I'd write his awk script as:
awk '
BEGIN { FS=OFS="," }
NR==1 { type = "Type" }
NR > 1 {
    cmd = "foobar \047" $2 "\047"
    type = ((cmd | getline line) > 0 ? line : "ERROR")
    close(cmd)
}
{ print $0, type }
' file.csv
to:
better protect $2 from shell expansion, and
protect from silently printing the previous value if/when cmd | getline fails, and
consolidate the print statements to 1 line so it's easy to change for all output lines if/when necessary
awk to the rescue!
$ echo "Account,Symbol,Price
100,AAPL US,200
102,SPY US,500" |
awk -F, 'NR>1{cmd="foobar "$2; cmd | getline type} {print $0 FS (NR==1?"Type":type)}'
Not sure you need to quote the input to foobar.
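Since the Symbol field in the sample does contain a space ("AAPL US"), quoting likely does matter: unquoted, foobar would receive two arguments. A quoted variant of the same one-liner might look like this (untested sketch; close() added so the command is re-run per row):

awk -F, 'NR>1{cmd="foobar \"" $2 "\""; cmd | getline type; close(cmd)} {print $0 FS (NR==1?"Type":type)}'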
Another way not using awk:
paste -d, input.csv <({ read; printf "Type\n"; while IFS=, read -r _ s _; do foobar "$s"; done; } < input.csv)

Translating a sed one-liner into awk

I am parsing files containing lines of "key=value" pairs. An example could be this:
Normal line
Another normal line
[PREFIX] 1=Something 5=SomethingElse 26=42
Normal line again
I'd like to leave all lines not containing key=value pairs as they are, while transforming all lines containing key=value pairs as follows:
Normal line
Another normal line
[PREFIX]
AAA=Something
EEE=SomethingElse
ZZZ=42
Normal line again
Assume I have a valid dictionary for the translation.
What I do at the moment is passing the input to sed, where I turn spaces into newlines for the lines that match '^\['.
The output is then piped into this awk script:
BEGIN {
    dict[1] = "AAA"
    dict[5] = "EEE"
    dict[26] = "ZZZ"
    FS="="
}
{
    if (match($0, "[0-9]+=.+")) {
        key = ""
        if ($1 in dict) {
            key = dict[$1]
        }
        printf("%7s = %s\n", key, $2)
    }
    else {
        print
        next
    }
}
The overall command line then becomes:
cat input | sed '/^\(\[.*\)/s/ /\n/g' | awk -f script.awk
My question is: is there any way I can include the sed operation in the middle so to get rid of that additional step?
$ cat tst.awk
BEGIN {
    split("1 AAA 5 EEE 26 ZZZ",tmp)
    for (i=1; i in tmp; i+=2) {
        dict[tmp[i]] = tmp[i+1]
    }
    FS="[ =]"
    OFS="="
}
$1 == "[PREFIX]" {
    print $1
    for (i=2; i<NF; i+=2) {
        print " " ($i in dict ? dict[$i] : $i), $(i+1)
    }
    next
}
{ print }
$ awk -f tst.awk file
Normal line
Another normal line
[PREFIX]
AAA=Something
EEE=SomethingElse
ZZZ=42
Normal line again
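The trick is FS="[ =]": on the [PREFIX] line the fields then alternate between keys and values, which is what the loop over the even-numbered fields relies on:

$ echo '[PREFIX] 1=Something 5=SomethingElse 26=42' | awk -F'[ =]' '{for (i=1;i<=NF;i++) print i, $i}'
1 [PREFIX]
2 1
3 Something
4 5
5 SomethingElse
6 26
7 42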
In fact I could not force awk to read the file twice (once for the sed command, once for your algorithm), so I had to modify your algorithm.
BEGIN {
    dict[1] = "AAA"
    dict[5] = "EEE"
    dict[26] = "ZZZ"
    # FS="="
}
$0 !~ /[0-9]+=.+/ { print }
/[0-9]+=.+/ {
    nb = split($0, arr1);
    for (i=1; i<=nb; i++) {
        nbb = split(arr1[i], keyVal, "=");
        if ( (nbb==2) && (keyVal[1] in dict) ) {
            printf("%7s = %s\n", dict[keyVal[1]], keyVal[2])
        }
        else
            print arr1[i];
    }
}
When you have a lot to convert, you can first migrate your dict file into a sed script file. When your dict file has a fixed format, you can even convert it on the fly.
Suppose your dict file looks like
1=AAA
5=EEE
26=ZZZ
And your input file is
Normal line
Another normal line
[PREFIX] 1=Something 5=SomethingElse 26=42
Normal line again
You want to do something like
cat input | sed '/^\[/ s/ /\n/g' | sed 's/^1=/ AAA=/'
# Or eliminating the extra step with cat
sed '/^\[/ s/ /\n/g' input | sed 's/^1=/ AAA=/'
So your next step is converting your dict file into sed commands:
sed 's#\([^=]*\)=\(.*\)#s/^\1=/ \2=/#' dictfile
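For the dict file above, that conversion produces the following sed commands:

s/^1=/ AAA=/
s/^5=/ EEE=/
s/^26=/ ZZZ=/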
Now you can combine these with
sed '/^\[/ s/ /\n/g' input | sed -f <(
sed 's#\([^=]*\)=\(.*\)#s/^\1=/ \2=/#' dictfile
)

Overwriting a file in bash

I have a file, of which a part is shown below:
OUTPUT_FILENAME="out.Received.Power.x.0.y.1.z.0.41
X_TX=0
Y_TX=1
Z_TX=0.41
I would like to automatically change some part of it with BASH: every time I see OUTPUT_FILENAME I want to overwrite the name next to it and replace it with a new one. Then I want to do the same with the values X_TX, Y_TX and Z_TX: delete the value next to each and write a new one. For example, instead of X_TX=0 I want X_TX=0.3, or vice versa.
Do you think it's possible? Maybe with grep or so?
You can use sed like this:
i.e. to replace X_TX= with X_TX=123 you can do:
sed -i -e 's/X_TX=.*/X_TX=123/g' /tmp/file1.txt
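Extending the same idea to all four keys in one pass (the replacement values below are just placeholders):

sed -i -e 's/^OUTPUT_FILENAME=.*/OUTPUT_FILENAME="out.new.name"/' \
    -e 's/^X_TX=.*/X_TX=0.3/' \
    -e 's/^Y_TX=.*/Y_TX=1/' \
    -e 's/^Z_TX=.*/Z_TX=0.41/' /tmp/file1.txt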
One option using awk. Your values are passed as variables to the awk script and substituted whenever a match exists:
awk -v outfile="str_outfile" -v x_tx="str_x" -v y_tx="str_y" -v z_tx="str_z" '
    BEGIN { FS = OFS = "=" }
    $1 == "OUTPUT_FILENAME" { $2 = outfile; print; next }
    $1 == "X_TX" { $2 = x_tx; print $0; next }
    $1 == "Y_TX" { $2 = y_tx; print $0; next }
    $1 == "Z_TX" { $2 = z_tx; print $0; next }
    { print }  # pass any other lines through unchanged
' infile
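Note that standard awk has no equivalent of sed -i (GNU awk's -i inplace aside), so to overwrite the original file the usual pattern is to write to a temporary file and move it back (replace.awk is a hypothetical file holding the script above):

awk -v outfile="str_outfile" -v x_tx="str_x" -v y_tx="str_y" -v z_tx="str_z" \
    -f replace.awk infile > infile.tmp && mv infile.tmp infile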

Writing a shell wrapper script for awk

I want to embed an awk script inside a shell script, but I'm having trouble doing so, as I don't know where a statement needs to end with a ; and where not.
Here's my script
#!/bin/sh
awk='
BEGIN { FS = ",?+" }
# removes all backspaces preceded by any char except _
function format() {
    gsub("[^_]\b", "")
}
function getOptions() {
    getline
    format()
    print
}
{
    format()
    if ($0 ~ /^SYNOPSIS$/ {
        getOptions()
        next
    }
    if ($0 /^[ \t]+--?[A-Za-z0-9]+/) {
        print $0
    }
}
END { print "\n" }'
path='/usr/share/man/man1'
list=$(ls $path)
for item in $list
do
echo "Command: $item"
zcat $path$item | nroff -man | awk "$awk"
done > opts
I'm using nawk by the way.
Thanks in advance
There are several things wrong, as far as I can see:
You don't close the multi-line string being assigned to $awk. You need a single quote on the line after END { ... }
You don't seem to actually use $awk anywhere. Perhaps you meant on the invocation of awk inside the do loop.
Once you fix those issues, awk is usually fairly forgiving about semicolons, but any problems in that regard don't have anything to do with using it inside a shell script.
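For what it's worth, the rule of thumb in awk is that a newline ends a statement, so semicolons are only needed to separate statements that share a line:

# one statement per line: no semicolons needed
{
    format()
    print
}

# several statements on one line: semicolons required between them
{ format(); print }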
These three lines:
path='/usr/share/man/man1'
list=$(ls $path)
for item in $list
need to be changed into:
path='/usr/share/man/man1'
for item in $path/*
in case there are spaces in filenames and since ls is not intended to be used in this way.
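Note that with the glob, $item already holds the full path, so the zcat line needs adjusting as well; a sketch of the fixed loop:

path='/usr/share/man/man1'
for item in "$path"/*
do
    echo "Command: $item"
    zcat "$item" | nroff -man | awk "$awk"
done > opts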
I am not really sure what you meant, but if I understand you correctly, your showOpts.awk is that awk code at the beginning of your script, so you could do this:
path='/usr/share/man/man1'
list=$(ls $path)
for item in $list
do
echo "Command: $item"
zcat $path$item | nroff -man | nawk 'BEGIN { FS = ",?+" }
# removes all backspaces preceded by any char except _
function format() {
    gsub("[^_]\b", "")
}
function getOptions() {
    getline
    format()
    print
}
{
    format()
    if ($0 ~ /^SYNOPSIS$/) {
        getOptions()
        next
    }
    if ($0 ~ /^[ \t]+--?[A-Za-z0-9]+/) {
        print $0
    }
}
END { print "\n" }'
done >> opts
and you should probably use >> instead of >.
