AWK - taking incorrect parameters when using $1, $2 etc - bash

I am using a script which launches with a number of parameters assigned to it. For example, you can imagine that this is what I am passing when launching the script from the command line: "/tmp" "/apps" "/var".
This script runs through a system which I will not delve into detail about. This system is the cause of my problem but I am powerless to make any changes to this.
So, when I use awk '{print $2}', I receive the 2nd parameter I passed to my script ("/apps") rather than the field in the command I am running.
My question: is there any alternative notation that I can use in AWK except for $1, $2 etc to signify field values?
Here is an example of one command within the script:
df | grep -e /$ | awk '{print $3/1024}'.
The problem is that the 3rd parameter is populated by "/var", as mentioned above.

Try \$1, \$2 etc. (The "system" seems to drop at least one level of quoting from the command, so let's add one.)
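Here is a minimal sketch of why the extra backslash helps. It assumes the "system" behaves like one extra round of shell evaluation, which is simulated with eval; the positional parameters 10, 20 and 30 are made-up values chosen so the substitution is easy to spot.

```shell
#!/usr/bin/env bash
# Hypothetical positional parameters standing in for the script's arguments.
set -- 10 20 30

# Assumption: the "system" adds one evaluation pass, simulated here with eval.
# Unescaped, $2 is replaced by the script's 2nd parameter before awk runs,
# so awk sees {print 20}.
unescaped=$(eval "echo 'a b c' | awk '{print $2}'")

# Escaped, the backslash survives the first pass as a literal $,
# so awk sees {print $2} and prints the 2nd field.
escaped=$(eval "echo 'a b c' | awk '{print \$2}'")

printf 'unescaped: %s\nescaped: %s\n' "$unescaped" "$escaped"
```

If the real system strips more than one level of quoting, more than one backslash may be needed.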


How do I trim whitespace, but not newlines, from subshell output in bash?

There are many tens, maybe a hundred or more previous questions that seem "identical" to this already here, but after extensive search, I found NOTHING that even came close to working - though I did learn quite a lot - and so I decided to just RTFM and figure this out on my own.
The Problem
I wanted to search the output of a ps auxwww command to find processes of interest, and the issue was that I can't just simply use cut to find the exact data from them that I wanted. ps, it turns out, tries to columnate the output, adding either extra spaces or tabs that get in the way of using cut to get the correct data.
So, since I'm not a master at bash, I did a search... The answers I found were all focused on either variables - a "backup strategy" from my point of view that itself didn't solve the whole problem - or they only trimmed leading or trailing space or all "whitespace" including newlines. NOPE, Won't Work For Cut! And, neither will removing trailing newlines and so forth.
So, restated, the question is, how do we efficiently end up with the white space defined as simply a single space between other characters without eliminating newlines?
Below, I will give my answer, but I welcome others to give theirs - who knows, maybe someone has a better answer?!
At least MY answer - please leave your own, too! - was to do this:
ps auxwww | grep <program> | tr -s [:blank:] | cut -d ' ' -f <field_of_interest>
This worked great!
Obviously, there are many ways to adapt this to other needs.
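For anyone who wants to try the squeeze step without a live ps, here is a small sketch on made-up input. It uses the two-set form of tr so that tabs are turned into spaces before squeezing, which is a slight variation on the command above; the sample text is invented.

```shell
#!/usr/bin/env bash
# Made-up two-line sample with runs of spaces and a tab between columns.
sample=$'alice   101  3.5\nbob  \t 202   1.25'

# Translate every blank (space or tab) to a space, then squeeze repeats.
# Newlines are not in [:blank:], so the line structure is untouched.
squeezed=$(printf '%s\n' "$sample" | tr -s '[:blank:]' ' ')

# With exactly one space between columns, cut can pick a field reliably.
fields=$(printf '%s\n' "$squeezed" | cut -d ' ' -f 2)

printf '%s\n' "$squeezed"
printf '%s\n' "$fields"
```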
As an alternative to all of the pipes and grep with cut, you could simply use awk. The benefit of using awk with the default field separator (FS), which breaks on whitespace, is that it treats any amount of whitespace between fields as a single separator.
So using awk does away with the need for tr -s to "squeeze" whitespace to define fields. Further, awk gives far greater control over field matching using regular expressions, rather than having to rely on a grep of the full line and cut to locate a pre-determined field number. (Though to some extent you will still have to tell awk which field of the ps output you are interested in.)
Using bash, you can also eliminate the pipe | by using process substitution to send the output of ps auxwww to awk on stdin using redirection, e.g. awk ... < <(ps auxwww) for a single tidy command line.
To get your "program" and "field_of_interest" into awk you have two options. You can initialize awk variables using the -v var=value option (there can be multiple -v options given), or you can use the BEGIN rule to initialize the variables. The only difference is that with -v you can provide a shell variable for value and no whitespace is allowed around the = sign, while within BEGIN any whitespace is ignored.
So in your case a couple of examples to get the virtual memory size for firefox processes, you could use:
awk -v prog="firefox" -v fnum="5" '
$11 ~ prog {print $fnum}
' < <(ps auxwww)
(above if you had myprog=firefox as a shell variable, you could use -v prog="$myprog" to initialize the prog variable for awk)
or using the BEGIN rule, you could do:
awk 'BEGIN {prog = "firefox"; fnum = "5"}
$11 ~ prog {print $fnum }
' < <(ps auxwww)
In each command above, it locates the COMMAND field from ps (field 11) and checks whether it contains firefox and if so it outputs field no. 5 the virtual memory size used by each process.
Both work fine as one-liners as well, e.g.
awk -v prog="firefox" -v fnum="5" '$11 ~ prog {print $fnum}' < <(ps auxwww)
Don't get me wrong, the pipeline is perfectly fine, it will just be slow. For short commands with limited output there won't be much difference, but when the output is large, awk will provide orders of magnitude improvement over having to tr and grep and cut reading over the same records three times.
The reason is that each pipe, and the command on either side of it, requires a separate process to be spawned by the shell. So minimizing their use improves the efficiency of what your script is doing. If the data is small, as are the processes, there isn't much of a difference. However, if you are reading a 3G file three times over, that is the difference in orders of magnitude: hours versus minutes or seconds.
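As a quick sanity check that the single awk command and the grep/tr/cut pipeline select the same data, here is a sketch run against canned text instead of a live ps. The process lines are invented; only the column layout mirrors ps auxwww.

```shell
#!/usr/bin/env bash
# Invented stand-in for `ps auxwww` output (11 columns, COMMAND last).
fake_ps='USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
alice    101   0.0  0.5  4610272  82320 ?  Sl  11:15  0:01 /usr/bin/firefox
alice    202   0.0  0.1   120000   4000 ?  S   11:16  0:00 /bin/bash'

# Three processes: filter lines, squeeze blanks, then cut field 5.
via_pipeline=$(printf '%s\n' "$fake_ps" | grep firefox | tr -s '[:blank:]' ' ' | cut -d ' ' -f 5)

# One process: match on the COMMAND field and print field 5.
via_awk=$(awk -v prog="firefox" -v fnum=5 '$11 ~ prog {print $fnum}' < <(printf '%s\n' "$fake_ps"))

printf 'pipeline: %s\nawk: %s\n' "$via_pipeline" "$via_awk"
```

Note that grep matches anywhere on the line, while the awk version only tests field 11, so on real data the two can differ when the program name also appears in another column.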
I had to use single quotes on CentOS Linux to get tr working as described above:
ps -o ppid= $$ | tr -d '[:space:]'
You can reduce the number of pipes using this Perl one-liner, which uses Perl regexes instead of a separate grep process. This combines grep, tr and cut in a single command, with an easy way to manipulate the output (@F is the array of fields, 0-indexed):
# Start an example process to provide the input for `ps` in the next commands:
/Applications/ --geometry 109x65 /tmp/foo &
# Print single space-delimited output of `ps` for all emacs processes:
ps auxwww | perl -lane 'print "@F" if $F[10] =~ /emacs/i'
# Prints:
# bar 72144 0.0 0.5 4610272 82320 s006 SN 11:15AM 0:01.31 /Applications/ --geometry 109x65 /tmp/foo
# Print emacs PID and file name opened with emacs:
ps auxwww | perl -lane 'print join "\t", @F[1, -1] if $F[10] =~ /emacs/i'
# Prints:
# 72144 /tmp/foo
The Perl one-liner uses these command line flags:
-e : Tells Perl to look for code in-line, instead of in a file.
-n : Loop over the input one line at a time, assigning it to $_ by default.
-l : Strip the input line separator ("\n" on *NIX by default) before executing the code in-line, and append it when printing.
-a : Split $_ into array @F on whitespace or on the regex specified in -F option.
perldoc perlrun: how to execute the Perl interpreter: command line switches
perldoc perlre: Perl regular expressions (regexes)

how to search from the command line?

I'd like to write a simple bash script to search from the command line and open the results in a browser. (Or maybe you wouldn't even call this a bash script at all since all it does is make a browser call.)
I created a file called goosh that contains one line:
open "$1"
This works, without the need to enclose the search in quotation marks, when the search term is just one word:
$ goosh monarch
But it fails when I use multiple search terms:
$ goosh monarch butterfly
This doesn't work:
open "$1 $2 $3 $4 $5 $6 $7 $8 $9"
How would I do this?
Spaces, hex value 0x20, are converted to %20 in a browser. You probably want to use sed on the input values:
echo 1 2 3 4 5 6 7 8 | sed -e 's/ /%20/g'
Will give you an output of
1%202%203%204%205%206%207%208
I'd suggest that you parse your script's "$@" inputs (everything provided to the script) in the same manner and see if that doesn't give you the results you're looking for.
Here's a simple function you can try:
function goosh() {
ARGS=$(echo "$@" | sed -e 's/ /%20/g');
echo "$ARGS";
}
This won't actually call the site; it will just dump the output to your console (so you can make sure the variables are correct). Simply replace echo with open and it should be fine.
$ goosh When will Apple buy stock in Microsoft
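The same idea can be sketched without sed by using bash parameter expansion. The function name goosh_url and the https://example.com/search?q= address are placeholders for illustration only; substitute whatever search URL you actually use. Only spaces are encoded here, so other special characters (&, ?, #) would still need handling.

```shell
#!/usr/bin/env bash
# Hypothetical helper: builds the URL instead of opening it, so the
# result can be inspected first.
goosh_url() {
    # "$*" joins all arguments into one string separated by single spaces.
    local query="$*"
    # ${query// /%20} replaces every space with %20.
    # The base URL below is a placeholder, not a real search endpoint.
    printf '%s\n' "https://example.com/search?q=${query// /%20}"
}

url=$(goosh_url monarch butterfly)
printf '%s\n' "$url"
```

Once the printed URL looks right, it can be passed to open in place of printf.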

how can I supply bash variables as fields for print in awk

I currently am trying to use awk to rearrange a .csv file that is similar to the following:
and the output would be:
(or any other order, just using this as an example)
and when it comes time to rearrange the csv file, I have been trying to use the following:
awk -v a=$first -v b=$second -v c=$third -v d=$fourth -v e=$fifth -F '^|,|$' '{print $a,$b,$c,$d,$e}' somefile.csv
with the intent of awk/print interpreting the $a,$b,$c,etc as field numbers, so it would come out to the following:
{print $2,$5,$1,$3,$4}
and print out the fields of the csv file in that order, but unfortunately I have not been able to get this to work correctly yet. I've tried several different methods, this seeming like the most promising, but unfortunately have not been able to get any solution to work correctly yet. Having said that, I was wondering if anyone could possibly give any suggestions or point out my flaw as I am stumped at this point in time, any help would be much appreciated, thanks!
Use simple numbers:
awk -v a=$first -v b=$second -v c=$third -v d=$fourth -v e=$fifth -F '^|,|$' \
'{print $a, $b, $c, $d, $e}' somefile.csv
Another way with a shorter example:
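awk -F '^|,|$' "{print \$$aa,\$$bb,\$$cc}" somefile.csv
(Here aa, bb and cc are shell variables holding the field numbers. Inside double quotes the shell expands them, and each \$ leaves a literal $ for awk.)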
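To make the two approaches concrete, here is a sketch on a made-up CSV row. It uses a plain comma as the field separator rather than the one from the question, and the variable names first and second are just for illustration.

```shell
#!/usr/bin/env bash
# Made-up sample row; the field numbers are arbitrary choices.
row='one,two,three,four,five'
first=2 second=5

# awk-side: a and b are awk variables set with -v, so $a means
# "the field whose number is stored in a".
via_v=$(printf '%s\n' "$row" | awk -F, -v a="$first" -v b="$second" '{print $a, $b}')

# shell-side: the shell pastes the numbers into the program text; \$ keeps
# a literal dollar sign, so awk ends up seeing {print $2, $5}.
via_shell=$(printf '%s\n' "$row" | awk -F, "{print \$$first, \$$second}")

printf '%s\n%s\n' "$via_v" "$via_shell"
```

The -v form is usually the safer of the two, since the awk program text stays constant and nothing from the shell is interpreted as code.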
You already got the answer to your specific question but have you considered just specifying the order as a string instead of each individual field? For example:
order="2 5 1 3 4"
awk -v order="$order" '
BEGIN{ FS=OFS=","; n=split(order,a," ") }
{ for (i=1;i<n;i++) printf "%s%s",$(a[i]),OFS; print $(a[i]) }
' somefile.csv
That way if you want to add/delete fields or change the order you just trivially rearrange the numbers in the first line instead of having to mess with a bunch of hard-coded variables, etc.
Note that I changed your FS as there was no need for it to be that complicated. Also, you don't need the shell variable "order"; you could just populate the awk variable of the same name explicitly. I only started with the shell variable since you had started with shell variables, so maybe you have a reason.
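If you do this often, the order-string approach wraps up nicely in a shell function. This is only a sketch: the name reorder_csv is made up, and the sample rows are invented since the original file isn't shown.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: first argument is the field order, any remaining
# arguments are files (with none given, awk reads standard input).
reorder_csv() {
    local order=$1
    shift
    awk -v order="$order" '
        BEGIN { FS = OFS = ","; n = split(order, a, " ") }
        {
            # Print all but the last requested field followed by OFS,
            # then the last one followed by a newline.
            for (i = 1; i < n; i++) printf "%s%s", $(a[i]), OFS
            print $(a[n])
        }
    ' "$@"
}

reordered=$(printf 'a,b,c,d,e\n1,2,3,4,5\n' | reorder_csv "2 5 1 3 4")
printf '%s\n' "$reordered"
```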

How do I get a user's friendly username on UNIX?

I want to get the "friendly" name, not the username, at least if such a string exists for the given user. Things I've tried:
id -un
id -p
uid jamesarosen
groups staff ...
id -P
jamesarosen:********:501:20::0:0:James A. Rosen:/Users/jamesarosen:/bin/bash
That last one has the information I'm looking for, but I'd prefer not to have to parse it out, particularly since I'm not terribly confident that the format (specifically the number of :s) will remain consistent across OSes.
Parse the GECOS Field for User's Full Name
The format of /etc/passwd and most of the GECOS field is extremely well standardized across Unix-like systems. If you find an exception, by all means let us know. Meanwhile, the easiest way to get what you want on a Unix-like system is to use getent and cut to parse the GECOS field. For example:
getent passwd $LOGNAME | cut -d: -f5 | cut -d, -f1
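The second cut is there because the GECOS field can hold several comma-separated sub-fields, with the full name first. A sketch on an invented passwd entry (the user jdoe and all of its details are made up):

```shell
#!/usr/bin/env bash
# Invented passwd entry; field 5 is the GECOS field with comma-separated parts.
entry='jdoe:x:1000:1000:Jane Doe,Room 12,555-0100:/home/jdoe:/bin/bash'

# First cut: take the 5th colon-separated field (the whole GECOS field).
gecos=$(printf '%s\n' "$entry" | cut -d: -f5)

# Second cut: keep only the part before the first comma (the full name).
full_name=$(printf '%s\n' "$gecos" | cut -d, -f1)

printf '%s\n%s\n' "$gecos" "$full_name"
```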
The only way that I know would be to parse it:
grep -P "^$(whoami):" /etc/passwd | cut -f5 -d:
You can be pretty certain of the format of /etc/passwd
You could use finger to obtain that information:
finger `id -un` | head -1 | cut -d: -f3-
which has the advantage (or disadvantage, depending on your requirements) that it will retrieve the information for non-local users as well.
If you only want to get the information from /etc/passwd, you'll most likely have to parse the file one way or the other, as others have already mentioned. Personally I'd prefer awk for this task:
awk -F: -vid=`id -u` '{if ($3 == id) print $5}' /etc/passwd
Take a look at the /etc/passwd file. This file shows you how user information is stored. Your user information may or may not be stored here (There are several different databases that Unix uses for storing users), but the format is the same.
Basically, Unix uses the User ID (UID) to store what user is what. The next entry was the old password entry, then the UID, the primary Group ID, the GECOS field, the $HOME directory, and the user's shell. (There are three extra entries displayed in the id -P command in MacOS. I don't know what they are, but they make the GECOS field the eighth field instead of the fifth field).
Using the id -P command on your system gave you this entry. Some systems use getent or even getpwent as a command. What you need to do is parse this entry. Each field is separated by colons, so you need either the fifth or the eighth entry (depending upon the command you had to use).
The awk and cut commands do this quite nicely. cut is probably more efficient, but awk is more common, so I tend to use that.
In awk, the standard field separator is white space, but you can use the -F parameter to change this. In Awk, each field in a line is given a number and preceded by a dollar sign. The $0 field is the entire line.
Using awk, you get:
id -P | awk -F: '{print $8}'
This says to take the output of the id -P command, use : as the field separator, and print out the eighth field. The curly braces surround the action of an AWK rule, and the single quotes are needed to keep the shell from interpreting the $8.
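You can check the field numbering against the id -P line quoted in the question without being on a Mac. This sketch just feeds that line to awk as text:

```shell
#!/usr/bin/env bash
# The id -P output line from the question, used here as plain sample text.
entry='jamesarosen:********:501:20::0:0:James A. Rosen:/Users/jamesarosen:/bin/bash'

# With : as the separator, NF is the number of fields on the line.
field_count=$(printf '%s\n' "$entry" | awk -F: '{print NF}')

# The empty field between 20 and 0 still counts, so the name is field 8.
full_name=$(printf '%s\n' "$entry" | awk -F: '{print $8}')

printf 'fields: %s\nname: %s\n' "$field_count" "$full_name"
```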
In BASH, you can use $( ) to run a command and capture its output, so you can set shell variables:
USER_NAME=$(id -P | awk -F: '{print $8}')
On macOS at least (and probably other *BSD-alikes), you may use: id -F to get just the full name.
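Putting the suggestions in this thread together, one possible approach is a small function that tries id -F first and falls back to the GECOS field. This is a sketch rather than a tested cross-platform solution; the function name friendly_name is made up, and it assumes that id rejects -F with a non-zero exit status on systems that don't support it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the full name for a user (default: current user).
friendly_name() {
    local user=${1:-$(id -un)} name=""

    # macOS and similar: id -F prints the full name directly.
    # Assumption: elsewhere this fails and prints nothing to stdout.
    name=$(id -F "$user" 2>/dev/null) || name=""

    # Otherwise read the GECOS field via getent, if getent is available.
    if [ -z "$name" ] && command -v getent >/dev/null 2>&1; then
        name=$(getent passwd "$user" | cut -d: -f5 | cut -d, -f1)
    fi

    # Last resort: fall back to the login name so something is always printed.
    printf '%s\n' "${name:-$user}"
}

friendly_name
```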

Bash script - For loop multiple arguments and awk

I'm trying to write a BASH script for patching my CentOS servers. My goal is that when I run the script, it outputs the package name, the currently installed version, and new package version. Something like this:
nspr.x86_64 / 4.8.8-3.el6 / 4.8.9-3.el6_2
This way, if I ever need to downgrade because a package broke something, I have a record of this.
The yum check-update command gives me the 1st piece of information (the package name) and the 3rd piece of information (the new version) listed above, and I can use awk to separate the two. It's easy after that; just run rpm -q {package} to get the 2nd piece of information, then concatenate them.
However, I'm running into trouble with the for loop in my code. I need to pass multiple arguments into the loop (the package name and newer version) so I can echo them later.
Here's my code:
for package in `/usr/bin/yum --cacheonly --noplugins check-update | awk '{print $1, $2}'`;
do
OLD_VER=`rpm -q ${package}` # ${package} should actually be $1 from the awk statement above
NEW_VER=${2} # This is $2 from the awk statement above
echo "${package} / ${OLD_VER} / ${NEW_VER}"
done
Pardon the obvious errors in the code; I'm new to BASH scripting. My confusion mostly stems from awk; I'm not very familiar with it, nor how it interacts with a BASH script. However, I think it's clear what I'm trying to do. Does package get passed as an array? Is there a better way to do this?
you want to read the whole line:
/usr/bin/yum --cacheonly --noplugins check-update |
while read -r line; do
set -- $line
echo "first: $1 second: $2"
done
Try this:
/usr/bin/yum --cacheonly --noplugins check-update \
| awk '{package=$1; newVer=$2; cmd = "rpm -q " package; cmd | getline oldVer; close(cmd); print package " / " oldVer " / " newVer}'
The issue with your script was that although both bash and awk use $ to reference variables, they are different vars and you can't reference awk's $2 from bash or vice versa. Writing everything in either awk or bash should solve the issue.
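For the pure-bash route, read can split each line into named variables directly, which avoids both awk and set. Here is a sketch that runs on canned text: the sample lines stand in for yum check-update output, and old_version is a made-up stub in place of rpm -q so the example works on any machine.

```shell
#!/usr/bin/env bash
# Invented sample standing in for `yum check-update` output (an assumption
# about its three-column layout: package, new version, repository).
sample_updates='nspr.x86_64 4.8.9-3.el6_2 updates
nss.x86_64 3.13.3-6.el6 updates'

# Hypothetical stub for `rpm -q`; swap in the real command on a live system.
old_version() { printf '%s-installed\n' "$1"; }

report=$(
    # read assigns the 1st word to package, the 2nd to new_ver,
    # and anything left over to the throwaway variable _.
    while read -r package new_ver _; do
        [ -n "$package" ] || continue   # skip blank lines
        printf '%s / %s / %s\n' "$package" "$(old_version "$package")" "$new_ver"
    done <<< "$sample_updates"
)
printf '%s\n' "$report"
```

On a real system the loop would read from the yum command through a pipe or process substitution instead of the here-string, and any header lines yum prints would need to be filtered out first.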
