Convert slurm accounting output - bash

I'm looking for a way to get the elapsed time output to always include days. At the moment I can't see a way of defining an output format that does this, so I'm looking at using cut, awk, sed or similar command(s) after the output has been generated.
So, any ideas how I can change output such as:
JobID|Partition|User|State|Elapsed|
902464|interactive-a|bob|COMPLETED|10-00:10:40
968491|interactive-a|bob|COMPLETED|12:49:20
970801|interactive-a|sam|COMPLETED|07:00:46
912973|interactive-a|tom|COMPLETED|41-02:34:41
971356|interactive-a|mat|COMPLETED|04:36:35
971912|interactive-a|mat|COMPLETED|02:12:02
972668|interactive-a|mat|COMPLETED|00:09:06
Into this format (the last column has 0- added where needed)
JobID|Partition|User|State|Elapsed|
902464|interactive-a|bob|COMPLETED|10-00:10:40|
968491|interactive-a|bob|COMPLETED|0-12:49:20|
970801|interactive-a|sam|COMPLETED|0-07:00:46|
912973|interactive-a|tom|COMPLETED|41-02:34:41|
971356|interactive-a|mat|COMPLETED|0-04:36:35|
971912|interactive-a|mat|COMPLETED|0-02:12:02|
972668|interactive-a|mat|COMPLETED|0-00:09:06|
Thanks

$ sed 's/|\([0-9:]\{1,\}\)$/|0-\1/' file
JobID|Partition|User|State|Elapsed|
902464|interactive-a|bob|COMPLETED|10-00:10:40
968491|interactive-a|bob|COMPLETED|0-12:49:20
970801|interactive-a|sam|COMPLETED|0-07:00:46
912973|interactive-a|tom|COMPLETED|41-02:34:41
971356|interactive-a|mat|COMPLETED|0-04:36:35
971912|interactive-a|mat|COMPLETED|0-02:12:02
972668|interactive-a|mat|COMPLETED|0-00:09:06

In awk:
$ awk -F\| '$5 ~ /-|E/ || ($5 = "0-" $5) && gsub(/ /,"|")' file
-F\| sets FS to |
$5 ~ /-|E/ matches records with - or E in the fifth field (jobs that already show days, plus the header) and prints them unchanged
|| is a logical OR, i.e. if the previous test didn't match, then:
($5 = "0-" $5) prepends 0- to the fifth field
&& gsub(/ /,"|") AND replaces the spaces that awk put between the fields when it rebuilt the record with |s.
The gsub could be removed if -v OFS="|" was used:
$ awk -v OFS=\| -F\| '$5 ~ /-|E/ || ($5 = "0-" $5)' file
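As a quick way to check the OFS variant, here is a small sketch that feeds a few of the question's lines through it. The function name normalize_elapsed and the result variable are only illustrative, and the sketch assumes the elapsed time is always the fifth |-separated field.

```shell
# Hypothetical helper: prefix "0-" to elapsed times that lack a day part.
# Assumes Elapsed is field 5 of |-separated input on stdin.
normalize_elapsed() {
  awk -F'|' -v OFS='|' '$5 ~ /-|E/ || ($5 = "0-" $5)'
}

# Sample lines copied from the question.
result=$(printf '%s\n' \
  'JobID|Partition|User|State|Elapsed|' \
  '902464|interactive-a|bob|COMPLETED|10-00:10:40' \
  '968491|interactive-a|bob|COMPLETED|12:49:20' |
  normalize_elapsed)

printf '%s\n' "$result"
```

In practice the helper would sit at the end of whatever pipeline produces the accounting output, in place of the printf used here.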

Related

How to match a unique pattern using awk?

I have a text file called 'file.txt' with the content like,
test:one
test_test:two
test_test_test:three
If the pattern is test, then the expected output should be one and similarly for the other two lines.
This is what I have tried.
pattern=test && awk '{split($0,i,":"); if (i[1] ~ /'"$pattern"'$/) print i[2]}' file.txt
This command gives the output like,
one
two
three
and pattern=test_test && awk '{split($0,i,":"); if (i[1] ~ /'"$pattern"'$/) print i[2]}' file.txt
two
three
How can I match the unique pattern being "test" for "test" and not for "test_test" and so on.
Don't use a regex for comparing the value, just use equality:
awk -F: -v pat='test' '$1 == pat {print $2}' file
one
awk -F: -v pat='test_test' '$1 == pat {print $2}' file
two
If you really want to use regex, then use it like this with anchors:
awk -F: -v pat='test' '$1 ~ "^" pat "$" {print $2}' file
one
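One reason to prefer equality over a regex is that regex metacharacters in the key are then taken literally. The sketch below uses made-up keys (a.b and axb) and illustrative variable names to show the difference; it is not part of the original answer.

```shell
# Hypothetical sample: two keys that differ only where the regex has a dot.
keys() { printf '%s\n' 'a.b:dot' 'axb:letter'; }

# Exact string comparison: only the literal key a.b matches.
by_equality=$(keys | awk -F: -v pat='a.b' '$1 == pat {print $2}')

# Anchored regex: the dot matches any character, so axb matches too.
by_regex=$(keys | awk -F: -v pat='a.b' '$1 ~ "^" pat "$" {print $2}')

printf 'equality: %s\n' "$by_equality"
printf 'regex: %s\n' "$by_regex"
```

With equality only dot is printed, while the anchored regex prints both dot and letter.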
If you want to use a regex, you can build it dynamically: pattern, followed by zero or more repetitions of _ plus pattern, up to the : separator.
If that matches at the start of the line, print the second field.
awk -v pattern='test' -F: '
$0 ~ "^"pattern"(_"pattern")*:" {
print $2
}
' file
Output
one
two
three
Or if only matching the part before the first underscore is also ok, then splitting field 1 on _ and printing field 2:
awk -v pattern='test' -F: ' {
split($1, a, "_")
if(a[1] == pattern) print $2
}' file
Using GNU sed with word boundaries
$ sed -n '/\<test\>/s/[^:]*://p' input_file
one

Is it possible to change column header and filter a column in one command?

I'm using awk to filter interesting lines in a large text file before reading it with a statistical software.
Here is some dummy data
printf 'VEGETABLE_NAME,RECIPE_NAME,OBSCURE_CODE\ncarrot,cake,1\ncarrot,soup,1\npotato,cake,2\nspinach,soup,1' > dummydata.dat
I have managed to :
Change the column header
$ awk -F, 'NR==1 {$0="vegetable,recipe,code"} 1' dummydata.dat
vegetable,recipe,code
carrot,cake,1
carrot,soup,1
potato,cake,2
spinach,soup,1
Filter for product code 1
$ awk -F, '$3 ~ /^1/' dummydata.dat
carrot,cake,1
carrot,soup,1
spinach,soup,1
But when I try to combine both commands, the result doesn't include the column header:
$ awk -F, 'NR==1 {$0="vegetable,recipe,code"} $3 ~ /^1/' dummydata.dat
carrot,cake,1
carrot,soup,1
spinach,soup,1
In your approach, the column header is missing because awk only prints lines
for which the condition
$3 ~ /^1/
is true. Once the NR==1 action has replaced the header, its third field is code, which does not match /^1/, so the header is never printed.
Below is my try
awk -v FS="," 'BEGIN{print "vegetable,recipe,code"}NR>1 && $3==1' dummydata.dat
vegetable,recipe,code
carrot,cake,1
carrot,soup,1
spinach,soup,1
You are setting $0 for NR==1 but that record never gets printed anywhere.
You can make a small change in your script to make it:
awk -F, 'NR==1{print "vegetable,recipe,code"} $3 ~ /^1$/' dummydata.dat
vegetable,recipe,code
carrot,cake,1
carrot,soup,1
spinach,soup,1
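A variation on the answers above, sketched here under the assumption that the header is always the first line: print the new header, then use next so the filter is never applied to it. The temporary file and the variable names are only illustrative.

```shell
# Recreate the question's dummy data in a temporary file.
datafile=$(mktemp)
printf '%s\n' 'VEGETABLE_NAME,RECIPE_NAME,OBSCURE_CODE' \
  'carrot,cake,1' 'carrot,soup,1' 'potato,cake,2' 'spinach,soup,1' > "$datafile"

# Print the renamed header and skip the filter for that line;
# every other line is printed only if the third field equals 1.
filtered=$(awk -F, 'NR==1 {print "vegetable,recipe,code"; next} $3 == 1' "$datafile")

printf '%s\n' "$filtered"
rm -f "$datafile"
```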

How to do a if else match on pattern in awk

I've tried the below command:
awk '/search-pattern/ {print $1}'
How do I write the else part for the above command?
Classic way:
awk '{if ($0 ~ /pattern/) {then_actions} else {else_actions}}' file
$0 represents the whole input record.
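Another idiomatic way,
based on the ternary operator syntax selector ? if-true-exp : if-false-exp (wrap the whole expression in parentheses so that print parses it the same way in every awk):
awk '{print (($0 ~ /pattern/) ? text_for_true : text_for_false)}'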
awk '{x == y ? a[i++] : b[i++]}'
awk '{print (($0 ~ /two/) ? NR "yes" : NR "No")}' <<<$'one two\nthree four\nfive six\nseven two'
1yes
2No
3No
4yes
A straightforward method is,
/REGEX/ {action-if-matches...}
! /REGEX/ {action-if-does-not-match}
Here's a simple example,
$ cat test.txt
123
456
$ awk '/123/{print "O",$0} !/123/{print "X",$0}' test.txt
O 123
X 456
Equivalent to the above, but without repeating the pattern (DRY). Note the next, which stops the second block from also running on matching lines:
awk '/123/{print "O",$0; next}{print "X",$0}' test.txt
Without next, a matching line would be printed twice, once with O and once with X.
Depending what you want to do in the else part and other things about your script, choose between these options:
awk '/regexp/{print "true"; next} {print "false"}'
awk '{if (/regexp/) {print "true"} else {print "false"}}'
awk '{print (/regexp/ ? "true" : "false")}'
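To see that these three forms behave the same, the sketch below runs each of them over the same made-up input and stores the results. The sample lines and variable names are assumptions for illustration only.

```shell
# Hypothetical input: two lines contain "two", one does not.
sample() { printf '%s\n' 'one two' 'three four' 'seven two'; }

# Pattern/action with next, explicit if/else, and the ternary operator.
with_next=$(sample | awk '/two/{print "true"; next} {print "false"}')
with_if=$(sample | awk '{if (/two/) {print "true"} else {print "false"}}')
with_ternary=$(sample | awk '{print (/two/ ? "true" : "false")}')

printf '%s\n' "$with_next"
```

Which form to pick is mostly a matter of how much work the two branches do: next reads well when each branch is a block of statements, while the ternary is compact when only the printed value differs.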
The default action of awk is to print the line, so you're encouraged to use more idiomatic awk:
awk '/pattern/' filename
#prints all lines that contain the pattern.
awk '!/pattern/' filename
#prints all lines that do not contain the pattern.
# If you find if(condition){}else{} an overkill to use
awk '/pattern/{print "yes";next}{print "no"}' filename
# Same as if(pattern){print "yes"}else{print "no"}
This command checks whether the values in columns $1, $2 and $7 are greater than 1, 1 and 5 respectively.
Lines whose values do not match are dropped by the filter declared in awk.
(You can combine conditions with the logical operators && for AND and || for OR.)
awk '($1 > 1) && ($2 > 1) && ($7 > 5)'
You can monitor your system with the "vmstat 3" command, where "3" means a 3-second delay between updates:
vmstat 3 | awk '($1 > 1) && ($2 > 1) && ($7 > 5)'
I stressed my computer with a 13 GB copy between USB-connected hard disks while scrolling a YouTube video in the Chrome browser.
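Because vmstat output differs from machine to machine, here is a sketch of the same kind of filter on fixed, made-up rows, showing how && and || change which lines pass. The rows and variable names are hypothetical.

```shell
# Made-up rows with seven columns; only columns 1, 2 and 7 matter here.
rows() {
  printf '%s\n' \
    '2 3 0 0 0 0 9' \
    '0 3 0 0 0 0 9' \
    '2 3 0 0 0 0 1' \
    '0 0 0 0 0 0 0'
}

# AND: every condition must hold, so only the first row passes.
all_true=$(rows | awk '($1 > 1) && ($2 > 1) && ($7 > 5)')

# OR: one condition is enough, so the first three rows pass.
any_true=$(rows | awk '($1 > 1) || ($7 > 5)')

printf 'AND:\n%s\nOR:\n%s\n' "$all_true" "$any_true"
```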

Multiple pattern matching

I have an input file with columns separated by | as follows.
[3yu23yuoi]|$name
!$fjkdjl|[kkklkl]
$hjhj|$mmkj
I want the output as
0 $name
!$fjkdjl 0
$hjhj $mmkj
Whenever the string begins with $ or !$ or "any", I want it printed as is; otherwise 0.
I have tried the following command. It prints everything the same as the input file.
awk -F="|" '{if (($1 ~ /^.*\$/) || ($1 ~ /^.*\!$/) || ($1 ~ /^any/)) {print $1} else if ($1 ~ /^\[.*/){print "0"} else if (($2 ~ /^.*\$/) || ($2 ~ /^.*\!$/) || ($2 ~ /^any/)) {print $2} else if($2 ~ /^\[.*/){print "0"}}' input > output
This should do:
awk -F\| '{$1=$1;for (i=1;i<=NF;i++) if ($i!~/^(\$|!\$|any)/) $i=0}1' file
0 $name
!$fjkdjl 0
$hjhj $mmkj
If a field does not start with $, !$ or any, it is set to 0.
Or if you like tab as separator:
awk -F\| '{$1=$1;for (i=1;i<=NF;i++) if ($i!~/^(\$|!\$|any)/) $i=0}1' OFS="\t" file
0 $name
!$fjkdjl 0
$hjhj $mmkj
$1=$1 forces awk to rebuild every record with the output separator, so all lines have the same format even if no field is changed.
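The effect of $1=$1 is easiest to see on the third input line, where no field is replaced. The following sketch runs the command with and without the assignment on the question's data; the function and variable names are illustrative.

```shell
# The question's three input lines.
input() { printf '%s\n' '[3yu23yuoi]|$name' '!$fjkdjl|[kkklkl]' '$hjhj|$mmkj'; }

# With $1=$1 every record is rebuilt, so all lines use the output separator.
rebuilt=$(input | awk -F'|' '{$1=$1; for (i=1;i<=NF;i++) if ($i !~ /^(\$|!\$|any)/) $i=0} 1')

# Without it, the third line has no field assigned and keeps its original |.
not_rebuilt=$(input | awk -F'|' '{for (i=1;i<=NF;i++) if ($i !~ /^(\$|!\$|any)/) $i=0} 1')

printf 'with $1=$1:\n%s\nwithout:\n%s\n' "$rebuilt" "$not_rebuilt"
```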

AWK: How to use OFS ignoring blank and commented out lines

I'm trying to rewrite a file on the fly, like this:
10.213.20.173, mem_chld, p3b-aggr-103, c3.xlarge, db, mysql
#10.213.20.191, mem_leaf, p3b-leaf-101, r3.xlarge, db, mysql
10.213.20.192, mem_leaf, p3b-leaf-102, r3.xlarge, db, mysql
10.213.20.190, mem_leaf, p3b-leaf-103, r3.xlarge, db, mysql
.....
from the original , separated fields to : separated ones. So, I used this:
awk -F', ' 'BEGIN{OFS=":";} { $1=$1; print }'
which is pretty much working but that file also has some blank and commented out lines, which I also want to exclude. My attempt with:
awk -F', ' '!/^(#|$)/ {OFS=":";} { $1=$1; print }'
did not work as I expected. How can I do that? Best!
Using awk:
$ awk -F', ' 'BEGIN{OFS=":"} !/^#/ && NF{$1=$1; print}' file
10.213.20.173:mem_chld:p3b-aggr-103:c3.xlarge:db:mysql
10.213.20.192:mem_leaf:p3b-leaf-102:r3.xlarge:db:mysql
10.213.20.190:mem_leaf:p3b-leaf-103:r3.xlarge:db:mysql
alternatively you can set OFS like:
awk -F', ' -v OFS=':' '!/^#/ && NF{$1=$1; print}' file
or even
awk -F', ' '!/^#/ && NF{$1=$1; print}' OFS=':' file
As Ed Morton suggested in the comments, for an edge case where you might have space before the # it is best to use the following:
awk -F', ' 'BEGIN{OFS=":"} !/^[[:space:]]*#/ && NF{$1=$1; print}' file
Explanation:
$1=$1 rebuilds the $0 variable. It takes all the fields and concatenates them, separated by OFS which we have set to : instead of space which is the default.
What about:
awk -F', ' -v OFS=':' '/^[^#]/ {$1=$1; print}' datafile
This will ignore both empty lines and lines starting with a # sign.
If comments might be preceded by some spaces, you would prefer:
awk -F', ' -v OFS=':' '!/^[ \t]*(#.*)?$/ {$1=$1; print}' datafile
awk -F', ' -v OFS=: '/^[ \t]*(#|$)/{next}{$1=$1}1' file
Output:
10.213.20.173:mem_chld:p3b-aggr-103:c3.xlarge:db:mysql
10.213.20.192:mem_leaf:p3b-leaf-102:r3.xlarge:db:mysql
10.213.20.190:mem_leaf:p3b-leaf-103:r3.xlarge:db:mysql
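To check how the last command treats blank lines and indented comments, here is a sketch with shortened, made-up records in the same format. The function name hosts and the variable converted are only illustrative.

```shell
# Hypothetical input: data lines mixed with a comment, a blank line,
# and a comment that is indented with spaces.
hosts() {
  printf '%s\n' \
    '10.213.20.173, mem_chld, db' \
    '#10.213.20.191, mem_leaf, db' \
    '' \
    '   # indented comment' \
    '10.213.20.192, mem_leaf, db'
}

# Skip blank and comment lines, then rebuild the rest with : as OFS.
converted=$(hosts | awk -F', ' -v OFS=':' '/^[ \t]*(#|$)/ {next} {$1=$1} 1')

printf '%s\n' "$converted"
```

The bracket expression [ \t] is used here rather than [[:space:]] on the assumption that some older awk builds do not support POSIX character classes.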
