I have a set of wireless stats from various branches in the organization:
There are two entries for each branch: the first is internal Wi-Fi usage, the second is guest usage. I'd like to merge them into a single total, as we don't care whether the traffic comes from guests or staff.
Desired output should be:
The input file has a header and some markdown, so the script has to identify a match rather than assume the next line is related. The data could be cleaned first, but in my opinion matching on the branch name would make this more bulletproof.
Here is my approach: Remove the _guests and tally:
# file: tally.awk
BEGIN { FS = OFS = "," }
{
    sub(/_guests/, "", $1)   # Remove _guests
    stat[$1] += $2           # Tally
}
END {
    for (branch in stat) {
        printf "%s,%d\n", branch, stat[branch]
    }
}
Running the script:
awk -f tally.awk data.txt
In the BEGIN pattern, I set the field separator (FS) and output field separator (OFS) both to a comma.
Next, for each line, I remove the _guests part and tally the count.
Finally, at the end of the file, I print out the counts.
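The question mentions a header and some markdown in the input, which the tally above would count as a branch of its own. A minimal sketch of one way to skip such lines, assuming that real data rows always carry a plain non-negative integer in the second field; the function name tally, the branch names, and the header layout in the usage example are made up for illustration:

```shell
# Hypothetical helper: same tally as above, restricted to rows whose
# second field is a plain integer, so header and separator lines are ignored.
tally() {
    awk -F, '
    $2 ~ /^[0-9]+$/ {
        sub(/_guests$/, "", $1)   # london_guests -> london
        stat[$1] += $2            # staff and guest counts share one key
    }
    END {
        for (branch in stat)
            printf "%s,%d\n", branch, stat[branch]
    }' "$@" | sort                # for-in order is unspecified, so sort the result
}

# Usage with made-up sample rows (not the data from the question):
printf '%s\n' 'branch,clients' '---,---' \
    'london,10' 'london_guests,5' 'paris,7' 'paris_guests,3' | tally
```

Anchoring the substitution with $ means only a trailing _guests is removed, which is slightly stricter than the original sub() call.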
I have a file with two columns. The first column is a string, the second is a positive number. If the first field (the string) has no duplicate in the file (so the first field is unique within the file), I want to copy that line to (let's say) result.txt. If the first field does have a duplicate in the file, I want to subtract the second fields (the numbers) of those duplicated lines from each other. A name will have at most one duplicate, no more than that. I want to save that result in result.txt as well. So the output file will contain all lines with unique values of the first field, plus one line per duplicated name, whose second field is the difference of the duplicates' values. The file is not sorted. Here is an example:
hello 7
something 8
hey 9
hello 8
something 12
nathanforyou 23
OUTPUT FILE that I need (result.txt):
hello 1
something 4
hey 9
nathanforyou 23
I can't have negative numbers in the resulting file, so I have to subtract the smaller number from the bigger one. What have I tried so far? All kinds of sort (I figured out how to find the non-duplicate lines and put them in a separate file, but got stuck on subtracting the duplicates) and arrays in awk (I saved all the lines in an array and looped over it with "for"; the problem is that I don't know how to get the second field from an array element that holds a whole line). The real problem is more complicated than I described (I have four fields, the first two are the same, and so on), but in the end it comes down to this.
$ cat tst.awk
{ val[$1,++cnt[$1]] = $2 }
END {
    for (name in cnt) {
        if ( cnt[name] == 1 ) {
            print name, val[name,1]
        }
        else {
            val1 = val[name,1]
            val2 = val[name,2]
            print name, (val1 > val2 ? val1 - val2 : val2 - val1)
        }
    }
}
$ awk -f tst.awk file
hey 9
hello 1
nathanforyou 23
something 4
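The order of a for (name in cnt) loop is unspecified in awk, which is why the output above does not follow the order of the requested result.txt. If the order of first appearance matters, one possible variation is to record each name the first time it is seen. This is a sketch, not the script above; the function name diff_pairs and the order array are introduced here for illustration, and it relies on the stated guarantee that a name occurs at most twice:

```shell
# Hypothetical helper: absolute difference for duplicated names,
# printed in order of first appearance.
diff_pairs() {
    awk '
    !($1 in val) {              # first time this name is seen
        val[$1] = $2
        order[++n] = $1         # remember the input order
        next
    }
    {                           # second occurrence: keep the absolute difference
        d = val[$1] - $2
        val[$1] = (d < 0 ? -d : d)
    }
    END {
        for (i = 1; i <= n; i++)
            print order[i], val[order[i]]
    }' "$@"
}

# Usage with the sample input from the question:
printf '%s\n' 'hello 7' 'something 8' 'hey 9' \
    'hello 8' 'something 12' 'nathanforyou 23' | diff_pairs
```

With a file on disk, diff_pairs file > result.txt writes the result file directly. For the four-field case mentioned in the question, the key $1 would presumably become a combination such as $1 SUBSEP $2, but that depends on details the question does not give.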
I've got a rather large set of CSVs that I need to parse. Most of it is extremely easy; however, I've got some 'group' objects with embedded objects that I need to extract correctly.
The file looks something like this
So, as you can see, I need something along the lines of:
awk -F"[,|]" '{if ($2=="Group")
then - pseudo code->
print "create group",$1
print "add member in $7 to group found in $1 of first row"
continue until you reach next $2=="Group"), then loop
This is perplexing me greatly :)
It seems a lot of the values are somewhat bogus and contain '-' when they're blank instead of just being ,,
Something like
sed 's/\,\-\,/\,\,/g'
should replace them I'd think, however I think I need a leading wildcard.
New example:
Running the new string on it only produces:
create group grp-ext-test-test
You could try something like this and adapt it as required:
awk -F, '$2=="Group"{g=$1; print "create group",g}{print "add " $7 " to " g}' file
create group Test_GroupA
add NodeA to Test_GroupA
add NodeB to Test_GroupA
add NodeC to Test_GroupA
add NodeD to Test_GroupA
add NodeE to Test_GroupA
create group Test_GroupB
add NodeA to Test_GroupB
add NodeB to Test_GroupB
add NodeC to Test_GroupB
add NodeX to Test_GroupB
add NodeE to Test_GroupB
add NodeF to Test_GroupB
To check if the contents of $7 are valid you could try something like:
awk -F, '$2=="Group"{ g=$1; print "create group",g } $7!="-"{print "add " $7 " to " g}' file
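The $7!="-" test only guards the member column. If the '-' placeholders should be blanked in every column, as the sed attempt in the question tries to do, note that s/,-,/,,/g misses a '-' in the first or last field, and also every second one in a run like ,-,-, because matches cannot overlap. A sketch that compares whole fields instead, assuming a plain comma-separated file with no quoted commas; the function name blank_dashes and the sample row are made up for illustration:

```shell
# Hypothetical helper: empty every field that is exactly "-".
# Hyphens inside a value (e.g. grp-ext-test-test) are left alone.
blank_dashes() {
    awk 'BEGIN { FS = OFS = "," }
    {
        for (i = 1; i <= NF; i++)
            if ($i == "-") $i = ""
        print
    }' "$@"
}

# Usage with a made-up row:
printf '%s\n' '-,Group,-,-,grp-ext-test-test,-' | blank_dashes
```

The cleaned output can then be piped into the group script above, in which case the validity check would become $7!="" rather than $7!="-".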