How to find out uptime in minutes? - bash

How can I find out the uptime on Solaris in minutes and save it to a file, without cutting and converting it? Is there any elegant way of doing it? Thanks for answers

Here is a reliable and accurate way to get the number of minutes since last boot on Solaris:
kstat -n system_misc |
nawk '/boot_time/ {printf("%d minutes\n",(srand()-$2)/60)}'
Under nawk, the srand() function returns the number of seconds since the epoch, while the boot_time kstat statistic holds the number of seconds since the epoch at boot time. Subtracting the latter from the former gives the number of seconds since boot, and dividing by 60 gives the number of minutes since last boot.
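Since the question also asks about saving the value to a file, the same pipeline can simply be redirected (uptime_minutes.txt is a hypothetical file name):
kstat -n system_misc |
nawk '/boot_time/ {printf("%d minutes\n",(srand()-$2)/60)}' > uptime_minutes.txt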

On Solaris, the uptime(1) command is just a link to w(1). You can find its source code at https://github.com/illumos/illumos-gate/blob/master/usr/src/cmd/w/w.c.
There, you will find that uptime(1) gets the boot time as recorded in /var/run/utmpx. Just as they do, you can read the data from this file using the getutxent(3) family of functions, and look for a record with ut_type == BOOT_TIME. In this record, look at the ut_tv field, which is a struct timeval and contains the seconds and microseconds of the boot time. From this you can calculate how long the system has been up.
(Edit: I just noticed the shell tag. This solution would be more suitable for calling from a C program. Oh well, maybe it will be useful to someone.)

Not completely sure that this will work on Solaris but...
uptime | awk -F ',' ' {print $1} ' | awk ' {print $3} ' | awk -F ':' ' {hrs=$1; min=$2; print hrs*60 + min} '

A small improvement on @jilliagre's suggestion is to use kstat(1M) to extract more information for you before passing to nawk(1):
$ kstat -p -n system_misc -s boot_time | nawk '{printf("%d minutes\n",(srand()-$2)/60)}'
26085 minutes
-p means "parseable output", and -s selects the specific statistic beneath the name given by -n.

Related

Efficient search pattern in large CSV file

I recently asked how to use awk to filter and output based on a searched pattern. I received some very useful answers, the one by user @anubhava being the one I found most straightforward and elegant. For the sake of clarity I am going to repeat some information from the original question.
I have a large CSV file (around 5GB). I need to identify 30 categories (in the action_type column) and create a separate file containing only the rows matching each category.
My input file dataset.csv is something like this:
action,action_type, Result
up,1,stringA
down,1,stringB
left,2,stringC
I am using the following to get the results I want (again, this is thanks to @anubhava).
awk -F, 'NR > 1{fn = $2 "_dataset.csv"; print >> fn; close(fn)}' file
This works as expected. But I have found it quite slow. It has been running for 14 hours now and, based on the size of the output files compared to the original file, it is not even at 20% of the whole process.
I am running this on Windows 10 with an AMD Ryzen PRO 3500 200MHz, 4 Cores, 8 Logical Processors, with 16GB Memory and an SSD drive. I am using GNU Awk 5.1.0, API: 3.0 (GNU MPFR 4.1.0, GNU MP 6.2.0). My CPU is currently at 30% and Memory at 51%. I am running awk inside a Cygwin64 Terminal.
I would love to hear some suggestions on how to improve the speed. As far as I can see it is not a capacity problem. Could it be the fact that this is running inside Cygwin? Is there an alternative solution? I was thinking about Silver Searcher but could not quite work out how to do the same thing awk is doing for me.
As always, I appreciate any advice.
With sorting:
awk -F, 'NR > 1{if(!seen[$2]++ && fn) close(fn); fn = $2 "_dataset.csv"; print >> fn}' <(sort -t, -nk2 dataset.csv)
or with gawk (which allows an effectively unlimited number of open file descriptors):
gawk -F, 'NR > 1{fn = $2 "_dataset.csv"; print >> fn;}' dataset.csv
This is the right way to do it using any awk:
$ tail -n +2 file | sort -t, -k2,2n |
awk -F, '$2!=p{close(out); out=$2"_dataset.csv"; p=$2} {print > out}'
The reason I say this is the right approach is it doesn't rely on the 2nd field of the header line coming before the data values when sorted, doesn't require awk to test NR > 1 for every line of input, doesn't need an array to store $2s or any other values, and only keeps 1 output file open at a time (the more files open at once the slower any awk will run, especially gawk once you get past the limit of open files supported by other awks as gawk then has to start opening/closing the files in the background as needed). It also doesn't require you to empty existing output files before you run it, it will do that automatically, and it only does string concatenation to create the output file once per output file, not once per line.
Just like the currently accepted answer, the sort above could reorder input lines that have the same $2 value; add -s if that's undesirable and you have GNU sort. With other sorts you would need to replace the tail with a different awk command and add another sort argument.
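For reference, a sketch of that stable variant, assuming GNU sort:
$ tail -n +2 file | sort -s -t, -k2,2n |
awk -F, '$2!=p{close(out); out=$2"_dataset.csv"; p=$2} {print > out}'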

how rand() works in awk

I am trying to sample the 2nd column of a csv file (any number of samples is fine) using awk and rand(), but I noticed that I always end up with the same number of samples:
cat toy.txt | awk -F',' 'rand()<0.2 {print $2}' | wc -l
I explored a bit and it seems rand() is not working as I expected. For example, a in the following always seems to be 1:
cat toy.txt | awk -F',' 'a=rand() a<0.2 {print a}'
Why?
From the documentation:
CAUTION: In most awk implementations, including gawk, rand() starts generating numbers from the same starting number, or seed, each time you run awk. Thus, a program generates the same results each time you run it. The numbers are random within one awk run but predictable from run to run. This is convenient for debugging, but if you want a program to do different things each time it is used, you must change the seed to a value that is different in each run. To do this, use srand().
So, to apply what's been pointed out in the man page, and duplicated all over this forum and elsewhere on the Internet, I like to use:
awk -v rseed=$RANDOM 'BEGIN{srand(rseed);}{print rand()" "$0}'
The rseed variable is optional, but included here, because sometimes it helps me to have a deterministic/repeatable random series for simulations when other variables can change, etc.
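Applied to the sampling command from the question, that might look like the following sketch (it assumes a shell such as bash that provides $RANDOM, as in the snippet above):
awk -F',' -v rseed=$RANDOM 'BEGIN{srand(rseed)} rand()<0.2 {print $2}' toy.txt | wc -l
Because the seed now changes between runs, the number of sampled lines will also vary from run to run.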

Incrementing Numbers & Counting with sed syntax

I am trying to wrap my head around sed and thought it would be best to try using something simple yet useful. At work I want to keep count on a small LCD display each time a specific script is run by users. I am currently doing this with a total count using the following syntax:
oldnum=`cut -d ':' -f2 TotalCount.txt`
newnum=`expr $oldnum + 1`
sed -i "s/$oldnum\$/$newnum/g" TotalCount.txt
This modifies the file that has this one line in it:
Total Recordings:0
Now I want to elaborate a little and increment the numbers starting at midnight and resetting to zero at 23:59:59 each day. I created a secondary .txt file for the display to read from with only one single line in it:
Total Recordings Today:0
But the syntax is not going to be the same. How must the above sed syntax be changed to change the number in the dialog of the second file?
I can change and reset the files using sed/bash in conjunction with a simple cron job on a schedule. The problem is that I can't figure out the syntax of sed to replicate the same effect as I originally got to work. Can anyone help please, I have been reading for hours on this, finally decided to post this and just make a pot of coffee. I have a 4 line LCD and would love to track counts across schedules if it is easy enough to learn the syntax.
sed should work fine for incrementing either Total Recordings: or Total Recordings Today: in your files, since it's matching the same pattern. To reset it each day at a certain time I would recommend a cronjob.
0 0 * * * echo "Total Recordings Today:0" > /path/to/TotalCount.txt 2>/dev/null
The other thing I would encourage is to use the newer style $( ... ) syntax for command substitution, and to create a variable for your TotalCount.txt file.
#!/bin/bash
totals=/path/to/TotalCount.txt
oldnum=$(cut -d ':' -f2 "$totals")
newnum=$((oldnum + 1))
sed -i "s/$oldnum\$/$newnum/g" "$totals"
This way you can easily reuse it for whatever else you want to do with it, quote it properly and simplify your code. Note: on OS X, in-place editing needs an explicit (possibly empty) suffix argument, i.e. sed -i ''.
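As for the second, daily file from the question: since it has the same Name:Number line format, the same commands work unchanged when the variable simply points at that file instead (DailyCount.txt is a hypothetical name for it):
#!/bin/bash
totals=/path/to/DailyCount.txt   # contains "Total Recordings Today:0"
oldnum=$(cut -d ':' -f2 "$totals")
newnum=$((oldnum + 1))
sed -i "s/$oldnum\$/$newnum/g" "$totals"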
Whenever in doubt, http://shellcheck.net is a really nice tool to help find mistakes in your code.
Although you're looking for a sed solution, I can't resist posting how it can be done in awk:
$ awk -F: -v OFS=: '{$2++}1' file > temp && mv temp file
-F: sets the input field delimiter and -v OFS=: sets the output field delimiter to :; awk increments the second field by one, and 1 is shorthand for print (it can be replaced with any "true" value). The output is written to a temp file and, if that succeeds, the temp file overwrites the original input file (to mimic an in-place edit).
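If your awk is GNU awk 4.1 or newer, the temporary file can be avoided entirely with its inplace extension (a sketch of an alternative, not part of the original answer):
gawk -i inplace -F: -v OFS=: '{$2++}1' TotalCount.txt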
Sed is a fine tool, but notoriously not the best for arithmetic. You could make what you already have work by initializing the counter to zero prior to incrementing it, if the file was not last modified today (or does not exist):
[ `date +%Y-%m-%d` != "`stat --printf %z TotalCount.txt 2> /dev/null|cut -d ' ' -f 1`" ] && echo "Total Recordings Today:0" > TotalCount.txt
To do the same with shifts, you would likely calculate a shift "ordinal number" by subtracting the start of the first shift after midnight (say 7 * 3600 seconds) from the seconds since the epoch (the epoch itself being a midnight) and dividing by the length of a shift (8 * 3600 seconds), then initialize the counter whenever that number changes. Something like:
[ $(((`date +%s` - 7 * 3600) / (8 * 3600))) -gt $(((`stat --printf %Z TotalCount.txt 2> /dev/null` - 7 * 3600) / (8 * 3600))) ] && echo "Total Recordings This Shift:0" > TotalCount.txt

Convert Current Date to Unix timestamp but using Days

In Solaris, I am trying to write a shell script that converts the current date to the number of days since 1/1/1970. This is because /etc/shadow doesn't use epoch time in seconds but a 'days' format,
e.g. "root:G9yPfhFAqvlsI:15841::::::", where 15841 is a date.
So, in essence, what command do I use to find the epoch time for now and then convert that to days?
You probably don't have GNU tools, which might make things easier. This is simple enough though:
perl -le 'print int(time/86400)'
I found some pseudo code to calculate it from the basics (credit: http://www.unix.com/shell-programming-and-scripting/115449-how-convert-date-time-epoch-time-solaris.html):
if month > 2 then
month=month+1
else
month=month+13
year=year-1
fi
day=(year*365)+(year/4)-(year/100)+(year/400)+(month*306001/10000)+day
days_since_epoch=day-719591
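Transcribed into a runnable script, the pseudocode above might look like this (a minimal sketch, assuming a POSIX shell such as ksh on Solaris and that date understands %Y, %m and %d; ${var#0} strips a leading zero so months like 08 and 09 are not treated as octal inside $(( ))):
# split today's date into year, month and day
set -- `date '+%Y %m %d'`
year=$1 month=${2#0} day=${3#0}
# shift January/February to the end of the previous year, as in the pseudocode
if [ "$month" -gt 2 ]; then
  month=$((month + 1))
else
  month=$((month + 13))
  year=$((year - 1))
fi
# integer arithmetic; 719591 is the day number this formula gives for 1/1/1970
day=$((year*365 + year/4 - year/100 + year/400 + month*306001/10000 + day))
echo $((day - 719591))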
On the same forum thread, another poster said this would work in Solaris:
truss /usr/bin/date 2>&1 | grep ^time | awk -F"= " '{print $2}'
The Solaris date utility doesn't support the %s format. However, the nawk utility has an srand() function that returns the date in seconds when no parameter is passed to it.
nawk 'BEGIN {print srand()}'
Results in
1405529923
To get the days instead of seconds, you can divide the result by 86400.
nawk 'BEGIN {printf("%d", srand() / 86400)}'
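For the use case in the question, the result can be captured in a shell variable and compared against the third field of an /etc/shadow entry, which holds the day of the last password change (a sketch, not from the original answers; requires read access to the shadow file):
today=`nawk 'BEGIN {printf("%d", srand() / 86400)}'`
nawk -F: -v today="$today" 'NF > 1 && $3 != "" {print $1, today - $3, "days since last password change"}' /etc/shadow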

count the occurrences of a string in a log file per hour (with a shell script)

I want to make a script to count the occurrences of a specific string (domain name)
from a log file (mail log) per hour, in order to check how many emails they sent per hour.
I know there are many easy and different ways to find a string in a file (like grep etc.)
and count the lines (like wc -l),
but I don't know how to do it per hour.
Yes, I can call the script every 60 minutes via a cron job, but this would read the log file from the beginning till the moment the script was executed... and not just the lines written in the last 60 minutes, and I don't know how to overcome this.
Note:
the command that I'm using to show all the sent emails per domain is:
# cat /usr/local/psa/var/log/maillog | grep -i qmail-remote-handlers \
    | grep from | awk '{print $6}' | gawk -F# '{print $2}' \
    | sort | uniq -c | sort -n | tail
the result is like this:
8 domain1.tld
45 domain34.tld
366 domain80948.tld
etc etc
The main point of the question is this one:
Yes I can call the script every 60 minutes via a cron job but this would read the log file
from the beginning till the moment the script was executed... and not the lines made in the
last 60 minutes, and I don't know how to overcome this.
How could you solve the problem?
1. You could save the number of lines in the log file the last time you processed it, and on the next run skip those lines using sed.
2. The same as in 1, but save the number of bytes already processed; then skip them using dd.
3. You could rotate (rename) the file after processing it (this method has the disadvantage that you need to reconfigure how your system handles the log).
I personally would choose method 2. It is very efficient and very simple to implement.
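A sketch of method 2 might look like the following; /usr/local/psa/var/log/maillog is the log from the question, while /var/tmp/maillog.offset is a hypothetical state file holding the number of bytes already processed (bs=1 keeps the dd invocation simple at the cost of speed):
#!/bin/bash
log=/usr/local/psa/var/log/maillog
state=/var/tmp/maillog.offset
# bytes handled on the previous run (0 if this is the first run)
offset=$(cat "$state" 2>/dev/null || echo 0)
size=$(wc -c < "$log")
# feed only the bytes appended since the last run into the existing pipeline
dd if="$log" bs=1 skip="$offset" count=$((size - offset)) 2>/dev/null |
  grep -i qmail-remote-handlers | grep from |
  awk '{print $6}' | gawk -F# '{print $2}' |
  sort | uniq -c | sort -n | tail
# remember how far we got for the next run
echo "$size" > "$state"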
