Filter awk system output with awk? - bash

I need to use awk to see which users are logged in to the computer, create a file named after each of them, and inside that file print the PIDs of the processes they're running. I've used this, but it does not work:
who | awk '{for(i = 0; i < NR; i++)
system("ps -u " $1 "| tail +2 | awk '{print $1}' >" $1".log")
}'
Is there any way to do this?
Thanks a lot!

To achieve your goal of using awk to create those files, I would start with ps rather than with who. That way, ps does more of the work so that awk can do less. Here is an example that might work for you. (No guarantees, obviously!)
ps aux | awk 'NR>1 {system("echo " $2 " >> " $1 ".txt")}'
Discussion:
The command ps aux prints a table describing each active process, one line at a time. The first column of each line contains the name of the process's user, the second column its PID. The line also contains lots of other information, which you can play with as you improve your script. That's what you pipe into awk. (All this is true for Linux and the BSDs. In Cygwin, the format is different.)
Inside awk, the pattern NR>1 gets rid of the first line of the output, which contains the table headers. This line is useless for the files you want awk to generate.
For all other lines in the output of ps aux, awk appends the PID of the current process (i.e., $2) to the file username.txt, using $1 for username. Because we append with >> rather than overwrite with >, all PIDs of processes run by the user username end up listed, one per line, in the file username.txt.
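As an aside, awk can write to files on its own, so the same effect is possible without spawning a shell through system() for every line. A minimal sketch of that variant, under the same assumptions about the ps aux format:
ps aux | awk 'NR>1 {print $2 >> ($1 ".txt")}'
Inside awk, >> appends just as it does in the shell, and the parentheses make the concatenation $1 ".txt" the output filename.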
UPDATE (Alternative for when who is mandatory)
If using who is mandatory, as noted in a comment to the original post, I would use awk to strip needless lines and columns from the output of who and ps. (Note that who prints no header line, so only the output of ps needs NR>1; sort -u collapses duplicate entries for users with several sessions.)
for user in $(who | awk '{print $1}' | sort -u)
do
ps -u "$user" | awk 'NR>1 {print $1}' > "$user".txt
done
For readers who wonder what the double quotes around $user are about: they guard against globbing (if $user contains an asterisk, *) and word splitting (if $user contains whitespace).
I will let my original answer stand for the benefit of any readers with more freedom to choose the tools for their job.
Is that what you had in mind?

Related

bash script concatenates input arguments with pipe xargs arguments

I am trying to execute my script, but the $1 argument is concatenated with the arguments of the last pipe, resulting in the following output:
killProcess(){
ps aux |grep $1 | tr -s " " " " | awk "{printf \"%s \",\$2}" | tr " " "\n" | xargs -l1 echo $1
}
$ killProcess node
node 18780
node 965856
node 18801
node 909028
node 19000
node 1407472
node 19028
node 583620
node 837
node 14804
node 841
node 14260
But I just want the list of PIDs, without the node argument, so that I can kill those processes. This only happens when I put it in a script; on the command line it works normally for me, because I don't pass any arguments to the script and nothing gets concatenated.
The immediate problem is that you don't want the $1 at the end. In that context, $1 expands to the first argument to the function ("node", in your example), which then gets passed to xargs and treated as part of the command it should execute. That is, the last part of the pipeline expands to:
xargs -l1 echo node
...so when xargs receives "18780" as input, it runs echo node 18780, which of course prints "node 18780".
Solution: remove the $1, making the command just xargs -l1 echo, so when xargs receives "18780" as input, it runs echo 18780, which prints just "18780".
That'll fix it, but there's also a huge amount of simplification that can be done here. Many elements of the pipeline aren't doing anything useful, or are working at cross purposes with each other.
Start with the last command in the pipe, xargs. It's taking in PIDs, one per line, and printing them one per line. It's not really doing anything at all (that I can see anyway), so just leave it off. (Unless, of course, you actually want to use kill instead of echo -- in that case, leave it on.)
Now look at the next two commands from the end:
awk "{printf \"%s \",\$2}" | tr " " "\n"`
Here, awk is printing the PIDs with a space after each one, and then tr is turning the spaces into newlines. Why not just have awk print each one with a newline to begin with? You don't even need printf for this, you can just use print since it automatically adds a newline. It's also simpler to pass the script to awk in single-quotes, so you don't have to escape the double-quotes, dollar sign, and (maybe) backslash. So any of these would work:
awk "{printf \"%s\\n\",\$2}"
awk '{printf "%s\n",$2}'
awk '{print $2}'
Naturally, I recommend the last one.
Now, about the command before awk: tr -s " " " ". This "squeezes" runs of spaces into single spaces, but that's not needed since awk treats runs of spaces as (single) field delimiters. So, again, leave that command out.
At this point, we're down to the following pipeline:
ps aux | grep $1 | awk '{print $2}'
There are two more things I'd recommend here. First, you should (almost) always put double quotes around references to shell variables, parameters, etc., like $1. So use grep "$1" instead.
But don't do that, because awk is perfectly capable of searching; there's no need for both grep and awk. In fact, awk can be much more precise, searching only a specific field instead of the whole line. The downside is that it's a bit more complex to do, but knowing how to make awk do more complex things is useful. The best way to let awk work with a shell variable or parameter is to use its -v option to create an awk variable with the same value, and use that. You can then use the ~ operator to check for a regex match against the variable. Something like this:
awk -v proc="$1" '$11 ~ proc {print $2}'
Note: I'm assuming you want to search for $1 in the executable name, and that that's the 11th field of ps aux on your system. Searching that field only will keep it from matching in e.g. the username (killing all of a user's processes because their name contains some program name isn't polite). You might actually want to be even more specific, so that e.g. trying to kill node doesn't accidentally kill nodemon as well; that'll be a matter of using more specific search patterns.
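If you do want that extra precision, one option is to build an anchored regex from the variable inside the function. A hedged sketch, keeping the assumption that $11 holds the command (possibly with a leading path):
ps aux | awk -v proc="$1" '$11 ~ ("(^|/)" proc "$") {print $2}'
With proc set to node, the dynamic regex (^|/)node$ matches node or /usr/bin/node but not nodemon. The simpler version below is still a fine starting point.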
So, here's the final result:
killProcess(){
ps aux | awk -v proc="$1" '$11 ~ proc {print $2}'
}
To actually kill the processes, add back xargs -l1 kill at the end.
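For completeness, a sketch of the killing variant. Since kill accepts several PIDs at once, plain xargs kill also works; -l1 is accepted by GNU xargs but deprecated in favor of -L 1:
killProcess(){
ps aux | awk -v proc="$1" '$11 ~ proc {print $2}' | xargs kill
}
(With GNU xargs, adding -r avoids running kill with no arguments when nothing matched.)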

Find string in col 1, print col 2 in awk

I'm on a Mac, and I want to find a field in a CSV file adjacent to a search string
This is going to be a single file with a hard path; here's a sample of it:
84:a5:7e:6c:a6:b0, AP-ATC-151g84
84:a5:7e:6c:a6:b1, AP-A88-131g84
84:a5:7e:73:10:32, AP-AG7-133g56
84:a5:7e:73:10:30, AP-ADC-152g81
84:a5:7e:73:10:31, AP-D78-152e80
so if my search string is "84:a5:7e:73:10:32"
I want to get returned "AP-AG7-133g56"
I had been working within an Applescript, but maybe a shell script will do.
I just need the proper syntax for opening the file and having awk search it. Again, I'm weak conceptually on how shell commands run, how they must be executed, etc.
This errors out, giving me "command not found":
set the_file to "/Users/Paw/Desktop/AP-Decoder 3.app/Contents/Resources/BSSIDtable.csv"
set the_val to "70:56:81:cb:a2:dc"
do shell script "'awk $1 ~ the_val {print $2} the_file'"
Thank you for coddling me...
This is relatively simple:
awk '$1 == "70:56:81:cb:a2:dc," {print "The answer is "$2}' 'BSSIDtable.csv'
(the "The answer is " text can be omitted if you only wish to see only the data, but this shows you how to get more user-friendly output if desired).
The comma is included since awk uses whitespace as the field separator, so the comma becomes part of column 1.
If the thing you're looking for is in a shell variable, you can use -v to provide that to awk as an awk variable:
lookfor="70:56:81:cb:a2:dc,"
awk -v mac="$lookfor" '$1 == mac {print "The answer is "$2}' 'BSSIDtable.csv'
As an aside, your AppleScript solution is probably not working because the $1/$2 are being interpreted as shell variables rather than awk variables. If you insist on using AppleScript, you will have to figure out how to construct a shell command that quotes the awk commands correctly.
My advice is to just use the shell directly; the people proficient in that almost certainly far outnumber those proficient in AppleScript :-)
If sed is available (normally it is on a Mac, even if it isn't tagged in the OP):
Simple, but reads the whole file:
sed -n 's/84:a5:7e:73:10:32,[[:blank:]]*//p' YourFile
Quit after the first occurrence (so on average about 50% faster on a huge file):
sed -n -e '/84:a5:7e:73:10:32,[[:blank:]]*/!b' -e 's///p;q' YourFile
awk
awk '/^84:a5:7e:73:10:32/ {print $2}'
# OR using a variable for batch interaction
awk -v Src='84:a5:7e:73:10:32' '$1 == Src {print $2}'
# OR assuming that case is unknown (IGNORECASE is specific to GNU awk)
awk -v Src='84:a5:7e:73:10:32' 'BEGIN{IGNORECASE=1} $1 == Src {print $2}'
By default, a bare regex in the pattern is tested against $0; adding the ^ anchors the match to the start of the line, which here means the first field's content.
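Combining the two ideas (an awk variable plus the ^ anchor) might look like this sketch; the trailing comma is concatenated into the regex because awk's default field splitting leaves it attached to the first field:
awk -v Src='84:a5:7e:73:10:32' '$0 ~ ("^" Src ",") {print $2}' YourFile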

How do I write an awk print command in a loop?

I would like to write a loop that creates, for each input file, an output file containing that file's first column.
So I wrote
for i in $(\ls -d /home/*paired.isoforms.results)
do
awk -F"\t" {print $1}' $i > $i.transcript_ids.txt
done
As an example if there were 5 files in the home directory named
A_paired.isoforms.results
B_paired.isoforms.results
C_paired.isoforms.results
D_paired.isoforms.results
E_paired.isoforms.results
I would like to print the first column of each of these files into a separate output file, i.e. I would like to have 5 output files called
A.transcript_ids.txt
B.transcript_ids.txt
C.transcript_ids.txt
D.transcript_ids.txt
E.transcript_ids.txt
or any other name as long as it is 5 different names and I can still link them back to the original files.
I understand that there is a problem with the double usage of $ in both the awk command and the loop, but I don't know how to change that.
Is it possible to write a command like this in a loop?
This should do the job:
for file in /home/*paired.isoforms.results
do
base=${file##*/}
base=${base%%_*}
awk -F"\t" '{print $1}' $file > $base.transcript_ids.txt
done
I assume that there can be spaces in the first field, since you set the delimiter explicitly to tab. This runs awk once per file. There are ways to do it running awk once for all files, but I'm not convinced the benefit is significant. You could consider using cut instead of awk '{print $1}', too (sketched below). Note that using ls as you did is less satisfactory than using globbing directly; it runs afoul of file names with oddball characters (spaces, tabs, etc.) in the name.
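For the record, a sketch of that cut variant for the line inside the loop; cut uses the tab character as its default delimiter, so no -d option is needed here:
cut -f1 "$file" > "$base".transcript_ids.txt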
You can do that entirely in awk:
awk -F"\t" '{split(FILENAME,a,"_"); out=a[1]".transcript_ids.txt"; print $1 > out}' *_paired.isoforms.results
If your input files don't have names as indicated in the question, you'd have to split on something else (as well as use a different pattern match for the input files).
My original answer is actually doing extra name resolution every time something is printed. Here's a version that only updates the output filename when FILENAME changes:
awk -F"\t" 'FILENAME!=lf{split(FILENAME,a,"_"); out=a[1]".transcript_ids.txt"; lf=FILENAME} {print $1 > out}' *_paired.isoforms.results

Bash Output different from command line

I have tried all kinds of filters using grep to try and solve this but just cannot crack it.
cpumem="$(ps aux | grep -v 'grep' | grep 'firefox-bin' | awk '{printf $3 "\t" $4}'
I am extracting the CPU and Memory usage for a process and when I run it from the command line, I get the 2 fields outputted correctly:
ps aux | grep -v 'grep' | grep 'firefox-bin' | awk '{printf $3 "\t" $4}'
> 1.1 4.4
but the same command executed from within the bash script produces this:
cpumem="$(ps aux | grep -v 'grep' | grep 'firefox-bin' | awk '{printf $3 "\t" $4}')"
echo -e $cpumem
> 1.1 4.40.0 0.10.0 0.0
I am guessing that it is picking up 3 records, but I just don't know where from.
I am filtering out any other grep processes by using grep -v 'grep'; can someone offer any suggestions or a more reliable way?
Maybe you have 3 records because three firefox processes are running (or one is running and has spawned children or threads).
You can avoid the grep hassle by giving ps an option to select the processes, e.g. -C to select processes by name. With ps -C firefox-bin you get only the firefox processes. But this does not help at all when there is more than one process.
(You can also use ps options to output only the columns you want, so your line would look like
ps -C firefox-bin --no-headers -o %cpu,%mem
).
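Applied to the question's snippet, and assuming a procps-style ps that supports -C, --no-headers, and -o, the whole grep/awk pipeline collapses to something like:
cpumem="$(ps -C firefox-bin --no-headers -o %cpu,%mem)"
echo "$cpumem"
This still prints one line per matching process, so the multiple-process question discussed next remains.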
For the triple record, you must decide what should happen when more than one process is running. In a multiuser environment, with programs that fork or thread, there can always be situations where you have more than one process of a kind. There are many possible solutions, and no single one can be recommended without knowing what you are going to do with the values. You could select processes from only one user, or only the one with the lowest PID, or the process group leader; or you could change the enclosing bash script to loop over the multiple values, or to behave differently when ps returns multiple results.
I was not able to reproduce the problem, but to help you debug, try printing $11 in your awk command; that will tell you which process each record refers to:
cpumem="$(ps aux | grep -v 'grep' | grep 'firefox-bin' | awk '{printf $3 "\t" $4 "\t" $11 "\n"}')"
echo -e $cpumem
It's actually an easy fix for the output display; In your echo statement, wrap the variable in double-quotes:
echo -e "$cpumem"
Without double quotes, word splitting collapses the newlines into single spaces (or drops them); with quotes, the original text of the variable is preserved when output.
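A quick way to convince yourself of this:
var="$(printf 'a\nb')"
echo $var # word splitting collapses the newline: prints "a b"
echo "$var" # quoting preserves it: prints "a" and "b" on separate lines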
If your output contains multiple processes (i.e. - multiple lines), that means your grep actually matched multiple lines. There's a chance a child-process is running for firefox-bin, maybe a plugin/container? With ps aux, the 11th column will tell you what the actual process is, so you can update your awk to be the following (for debugging):
awk '{printf $3 "\t" $4 "\t" $11}'

How do I print a field from a pipe-separated file?

I have a file with fields separated by pipe characters and I want to print only the second field. This attempt fails:
$ cat file | awk -F| '{print $2}'
awk: syntax error near line 1
awk: bailing out near line 1
bash: {print $2}: command not found
Is there a way to do this?
Or just use one command:
cut -d '|' -f FIELDNUMBER
The key point here is that the pipe character (|) must be escaped from the shell. Use "\|" or "'|'" to protect it from shell interpretation and allow it to be passed to awk on the command line.
Reading the comments, I see that the original poster presents a simplified version of the original problem, which involved filtering the file before selecting and printing the fields. A pass through grep was used and the result piped into awk for field selection. That accounts for the wholly unnecessary cat file that appears in the question (it replaces the grep <pattern> file).
Fine, that will work. However, awk is largely a pattern matching tool on its own, and can be trusted to find and work on the matching lines without needing to invoke grep. Use something like:
awk -F\| '/<pattern>/{print $2;}{next;}' file
The /<pattern>/ bit tells awk to perform the action that follows on lines that match <pattern>.
The lost-looking {next;} is a default action skipping to the next line in the input. It does not seem to be necessary, but I have this habit from long ago...
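For instance, with two sample lines of pipe-separated input:
printf 'foo|bar\nbaz|qux\n' | awk -F'|' '/foo/{print $2}'
This prints bar: only the line matching /foo/ triggers the action, and $2 is that line's second pipe-delimited field.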
The pipe character needs to be escaped so that the shell doesn't interpret it. A simple solution:
$ awk -F\| '{print $2}' file
Another choice would be to quote the character:
$ awk -F'|' '{print $2}' file
Another way using awk
awk 'BEGIN { FS = "|" } ; { print $2 }'
As given, this command reads from standard input; 'file' is never opened, so it prints nothing. You should either feed it with 'cat file' or simply list the file after the awk program.
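With the file listed as an argument, the BEGIN version works as intended:
awk 'BEGIN { FS = "|" } { print $2 }' file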
