Why are results different when passing an argument to a function from piping to it as a process? - bash

I found this thread with two solutions for trimming whitespace: piping to xargs and defining a trim() function:
trim() {
local var="$*"
# remove leading whitespace characters
var="${var#"${var%%[![:space:]]*}"}"
# remove trailing whitespace characters
var="${var%"${var##*[![:space:]]}"}"
echo -n "$var"
}
I prefer the second because of one comment:
This is overwhelmingly the ideal solution. Forking one or more external processes merely to trim whitespace from a single string is fundamentally insane – particularly when most shells (including bash) already provide native string munging facilities out-of-the-box.
I am getting, for example, the wifi SSID on macOS by piping to awk (when I get comfortable with regular expressions in bash, I won't fork an awk process), which includes a leading space:
$ /System/Library/PrivateFrameworks/Apple80211.framework/Resources/airport -I | awk -F: '/ SSID/{print $2}'
<some-ssid>
$ /System/Library/PrivateFrameworks/Apple80211.framework/Resources/airport -I | awk -F: '/ SSID/{print $2}' | xargs
<some-ssid>
$ /System/Library/PrivateFrameworks/Apple80211.framework/Resources/airport -I | awk -F: '/ SSID/{print $2}' | trim
$ wifi=$(/System/Library/PrivateFrameworks/Apple80211.framework/Resources/airport -I | awk -F: '/ SSID/{print $2}')
$ trim "$wifi"
<some-ssid>
Why does piping to the trim function fail and giving it an argument work?

It is because your trim() function expects its input as positional arguments; $* is the argument list passed to the function. In the case that fails, the function's standard input is connected to the read end of a pipe, so the data has to be fetched from standard input inside the function.
In that case, read from standard input with the read command and process the line you read instead of the argument list, i.e.:
trim() {
# read the first line received over the pipe into var
IFS= read -r var
# remove leading whitespace characters
var="${var#"${var%%[![:space:]]*}"}"
# remove trailing whitespace characters
var="${var%"${var##*[![:space:]]}"}"
echo -n "$var"
}
for which you can now do
$ echo " abc " | trim
abc
or use command substitution to run the command that fetches the string and pass its output to trim() with your original definition:
trim "$(/System/Library/PrivateFrameworks/Apple80211.framework/Resources/airport -I | awk -F: '/ SSID/{print $2}')"
In this case, the shell expands the $(..) by running the command inside it and substituting that command's output. The function is then invoked as trim <args>, receives the string as a positional argument, and runs the parameter expansions directly on it.
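If the function should work both ways, one option is to check whether any arguments were given and fall back to standard input otherwise. This is only a minimal sketch, not something from the thread: the name trim_any is made up for illustration, and like the read-based version above it only handles the first line of piped input.

```shell
# trim_any is a hypothetical name; it is not from the original thread.
trim_any() {
    local var=""
    if [ "$#" -gt 0 ]; then
        # Called as: trim_any "  some text  "
        var="$*"
    else
        # Called as: some_command | trim_any   (reads the first line only)
        IFS= read -r var || true
    fi
    # remove leading whitespace characters
    var="${var#"${var%%[![:space:]]*}"}"
    # remove trailing whitespace characters
    var="${var%"${var##*[![:space:]]}"}"
    printf '%s' "$var"
}
```

With this definition, both echo " abc " | trim_any and trim_any " abc " should print the string without the surrounding whitespace.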

Related

How do I cut everything on each line starting with a multi-character string in bash?

How do I cut everything starting with a given multi-character string using a common shell command?
e.g., given:
foo+=bar
I want:
foo
i.e., cut everything starting with +=
cut doesn't work because it only takes a single-character delimiter, not a multi-character string:
$ echo 'foo+=bar' | cut -d '+=' -f 1
cut: bad delimiter
If I can't use cut, I would consider using perl instead, or if there's another shell command that is more commonly installed.
cut only allows a single-character delimiter.
You may use bash string manipulation:
s='foo+=bar'
echo "${s%%+=*}"
foo
or use the more powerful awk:
awk -F '\\+=' '{print $1}' <<< "$s"
foo
'\\+=' is a regex that matches + followed by = character.
You can use the sed command to do this:
string='foo+=bar'
echo "${string}" | sed 's/+=.*//'
foo
or, if you're using the Bash shell, use the parameter expansion below (recommended), since it avoids an unnecessary pipeline and an extra sed process and is therefore more efficient:
echo "${string%%\+\=*}"
or
echo "${string%%[+][=]*}"
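Since the question asks about each line of the input, the parameter expansion can be wrapped in a read loop. The following is a rough sketch under that assumption; the function name cut_at_plus_equals and the sample input are invented for illustration.

```shell
# cut_at_plus_equals is a hypothetical helper name.
# It prints each input line with everything from the first "+=" onward removed.
cut_at_plus_equals() {
    # The "|| [ -n ... ]" part also handles a last line without a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        printf '%s\n' "${line%%+=*}"
    done
}

# Made-up sample input: two lines containing "+=", one without it.
printf 'foo+=bar\nbaz+=qux+=x\nplain\n' | cut_at_plus_equals
```

Lines that do not contain += pass through unchanged, because the pattern then matches nothing.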

How to remove the username/hostname line from an output on Korn Shell?

I run the command
df -gP /data1 /data2 | grep -v File | awk '{print $1}' |
awk -F/dev/ '$0=$2' | tr '\n' '
on the AIX shell (ksh) and it prints the output below:
lv_data01 lv_data02 root#testhost:/
However, I would like the output to be printed this way. Could someone help?
lv_data01 lv_data02
Using grep … | awk … | awk … is not necessary; a single awk could do the whole job. So could sed and it might even be easier. I'd be tempted to deal with the spacing by using:
x=$(df … | sed …); echo $x
The tr command, once corrected, replaces newlines with spaces, so the prompt follows without a newline before it. The ; echo suggestion adds the missing newline; the echo $x suggestion (note no double quotes) does too.
As for the sed command:
sed -n '/File/!{ s/[[:space:]].*//; s%^.*/dev/%%p; }'
Don't print anything by default
If the line doesn't match File (doing the work of grep -v):
remove the first space (blank or tab) and everything after it (doing the work of awk '{print $1}')
replace everything up to /dev/ with nothing and print (doing the work of awk -F/dev/ '$0=$2')
The command substitution and capture, followed by echo, deals with spaces and newlines.
So, my suggested solution is:
x=$(df -gP /data1 /data2 | sed -n '/File/!{ s/[[:space:]].*//; s%^.*/dev/%%p; }'); echo $x
You could add unset x after the echo if you are going to be using this directly in the shell and not in a shell script. If it'll be encapsulated in a shell script, you don't have to worry about it.
I'm blithely assuming the output from df -gP won't contain a path such as this, with two occurrences of /dev:
/who/knows/dev/lv_data01/dev/bin
If that's a real problem, you can fix the sed script, but I don't think it will be. It's one thing the second awk script in the question handles differently.
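For comparison, the remark that a single awk could do the whole job might look roughly like the sketch below. The sample_df function only fakes df -gP output with made-up values so the pipeline can be tried anywhere, and lv_names is a hypothetical helper name; neither comes from the original answer.

```shell
# Made-up lines standing in for the output of `df -gP /data1 /data2` on AIX.
sample_df() {
    printf '%s\n' \
        'Filesystem    GB blocks      Used Available Capacity Mounted on' \
        '/dev/lv_data01    100.00     40.00     60.00      40% /data1' \
        '/dev/lv_data02    200.00     50.00    150.00      25% /data2'
}

# lv_names is a hypothetical helper: skip the header line, strip everything
# up to /dev/ from the first field, and join the names with single spaces.
lv_names() {
    awk '!/File/ { sub(/^.*\/dev\//, "", $1); printf "%s%s", sep, $1; sep = " " }
         END     { print "" }'
}

sample_df | lv_names
```

The END block prints the final newline itself, so the shell prompt no longer ends up on the same line as the output.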

how to alter an awk variable with sed

I have a bash command that produces a list of files whose names I need to alter.
So I was thinking of using something like this:
mycommand | awk {mv $1 altered$1}
The problem is that the second $1 should be altered by applying some regular expressions with sed.
How can I apply sed to the second parameter?
I tried with $() and |, but it does not work.
I also tried
awk '{print $1 sed "s/[^A-Za-z0-9._-]/_/g" <<< $1}'
awk: cmd. line:1: Unexpected token
mv is not an awk command. You need shell. Try:
mycommand | while IFS= read -r f; do mv "$f" "${f//[^A-Za-z0-9._-]/_}"; done
This assumes that the file names are newline-separated. This is OK unless a file name contains a newline as part of its name. For better reliability, mycommand and the while loop should be modified to use NUL as the separator.
How it works:
while IFS= read -r f; do
This starts a loop that reads each line, in turn, into variable f.
IFS= tells the shell to keep any leading or trailing whitespace on a line. If mycommand produces superfluous leading or trailing whitespace, omit IFS= so that read strips it.
-r tells the shell to keep backslashes in the input just as they are.
mv "$f" "${f//[^A-Za-z0-9._-]/_}"
This renames the file.
done
This signals the end of the while loop.
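The NUL-separated variant mentioned above could be sketched as follows. It relies on bash's read -d '' and assumes that the producer can emit NUL-terminated names without directory components (a / would also be replaced by _). The name sanitize_names is hypothetical, and the commented printf line merely stands in for mycommand.

```shell
# sanitize_names is a hypothetical helper (bash only, because of read -d '').
# It renames NUL-separated file names read from stdin, replacing every
# character outside A-Za-z0-9._- with an underscore.
sanitize_names() {
    local f new
    while IFS= read -r -d '' f; do
        new="${f//[^A-Za-z0-9._-]/_}"
        # Skip names that are already clean so mv is never asked to
        # move a file onto itself.
        [ "$f" = "$new" ] || mv -- "$f" "$new"
    done
}

# Example invocation; printf stands in for mycommand, purely as an illustration:
# printf '%s\0' * | sanitize_names
```

Because the names are separated by NUL bytes, a newline inside a file name no longer splits it into two entries.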
If a command substitution is acceptable and mycommand prints a single file name, a simple approach is:
f=$(mycommand | awk '{print $1}'); mv "$f" "altered$f"
Use the Perl-based rename (on Debian-based distros typically available as the rename package; the rename from util-linux uses a different syntax):
rename 's/^/altered/' $(mycommand)

Grep last match of returned multi line result and assign to variable

Let's say that I have a command list kittens that returns something in this multi-line format in my terminal (in this exact layout):
[ 'fluffy'
'buster'
'bob1' ]
How can I fetch bob1 and assign to a variable for scripting use? Here's my non working try so far.
list kittens | grep "'([^']+)' \]"
I am not overly familiar with grepping on the cli and am running into issues of syntax with quotes and such.
If you know that bob1 will be on the last line, you can capture it like this:
myvar="$(list kittens | tail -n1 | grep -oP "'\K[^']+(?=')")"
This uses tail to find the last line and then grep with \K (which discards everything matched so far) and a lookahead in the regular expression to extract the part inside the quotes.
Edit: The above assumes that you are using GNU grep (for the -P mode). Here's an alternative with sed:
myvar="$(list kittens | tail -n1 | sed -e "s/^[^']*'//; s/'[^']*$//")"
Could be done by awk alone:
list kittens |awk 'END{gsub(/\047|[[:blank:]]|\]/,"");print $0}'
bob1
Example:
echo "$kit"
[ 'fluffy'
'buster'
'bob1' ]
echo "$kit" |awk 'END{gsub(/\047|[[:blank:]]|\]/,"");print $0}'
bob1
To Assign it to any variable:
var=$(list kittens |awk 'END{gsub(/\047|[[:blank:]]|\]/,"");print $0}')
Explanation:
END{}: The END block runs after all input has been read; in most awk implementations $0 still holds the last line there, which is the only line we are interested in.
gsub: This is awk's built-in function for search and replace. Here single quotes, blanks, and the closing bracket are removed. Note that \047 is the octal escape used to match a single quote.
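The quotes can also be stripped with shell parameter expansion once the last line has been captured, leaving tail as the only external command. This is a small sketch, assuming the output layout shown in the question; list_kittens is a stand-in function, since the real list kittens command is not available here.

```shell
# list_kittens is a stand-in for the `list kittens` command from the question.
list_kittens() {
    printf '%s\n' "[ 'fluffy'" "'buster'" "'bob1' ]"
}

# Keep only the last line, then strip everything outside the quotes.
last=$(list_kittens | tail -n 1)
last=${last#*\'}    # drop everything up to and including the first single quote
last=${last%\'*}    # drop the last single quote and everything after it
printf '%s\n' "$last"
```

The two expansions assume that the last line contains exactly one quoted name.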

How to pass a variable to a sed command in shell?

I have some variables:
begin=10
end=20
How can I pass them to a sed command?
sed -n '$begin,$endp' filename | grep word
sed -n '10,20p' filename | grep word
The reason this doesn't work is that single quotes in shell code prevent variable expansion. A safer way is to use awk, which receives the values as data through -v:
awk -v begin="$begin" -v end="$end" 'NR == begin, NR == end' filename
It is possible with sed if you use double quotes (in which shell variables are expanded):
sed -n "$begin,$end p" filename
However, this is subject to code injection vulnerabilities because sed cannot distinguish between code and data this way (unlike the awk code above). If a user manages to set, say, end="20 e rm -Rf /;", unpleasant things can happen.
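If sed is preferred anyway, one way to reduce the injection risk is to reject any bound that is not made up purely of digits before it reaches the sed script. The following is a minimal sketch of that idea; print_range is a hypothetical name and the error message is invented.

```shell
# print_range is a hypothetical wrapper: print lines begin..end of a file,
# refusing bounds that contain anything other than digits.
print_range() {
    local begin="$1" end="$2" file="$3"
    case "$begin" in
        ''|*[!0-9]*) echo "print_range: begin must be a number" >&2; return 1 ;;
    esac
    case "$end" in
        ''|*[!0-9]*) echo "print_range: end must be a number" >&2; return 1 ;;
    esac
    # Both values are known to be plain digits here, so they cannot smuggle
    # extra sed commands into the script.
    sed -n "${begin},${end}p" "$file"
}
```

A call such as print_range "$begin" "$end" filename | grep word then behaves like the double-quoted sed command, while a value like "20 e rm -Rf /;" is rejected before sed ever runs.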
