Julian dates with as.Date - macOS

I am trying to run a simple regression in R (mac OSX), to see if the level of an environmental certification has improved over time - among other things. The data I downloaded offers a level from 1-4, and the dates in 1-Mar-12 format. I can't seem to get R to convert the dates, and I keep getting the same error message. The variables are the same length.
$ certification_date: chr "1-Aug-11" "1-Aug-11" "1-Aug-11" "1-Jul-11" ...
jday<-as.Date('certification_date',format='%d-%b-%y',"%j")
mod <- lm(Level_number ~ jday, data=data)
Error in model.frame.default(formula = Level_number ~ jday, data = data, :
variable lengths differ (found for 'jday')
summary(jday)
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
NA NA NA NA NA NA "1"
Can anyone spot where I've gone wrong?

You should remove the quotes around certification_date, as mentioned in the comment. But %b is the abbreviated month name in the current locale, so you may hit a second problem depending on your locale settings. Here is a locale-independent solution:
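## save the current time locale
loc <- Sys.getlocale('LC_TIME')
## switch to the C locale (English month names), since %b is locale-dependent
Sys.setlocale('LC_TIME','C')
## the column lives in the data frame, and no extra "%j" argument is needed
jday <- as.Date(data$certification_date, format='%d-%b-%y')
## restore the original locale
Sys.setlocale('LC_TIME',loc)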
The result is:
jday
[1] "2011-08-01" "2011-08-01" "2011-08-01" "2011-07-01"

Related

Multiple plots from a single text file (gnuplot)

I have a text file and I'd like to plot two different curves from it (x values from column 1, y values from columns 3 and 4). The plot should go to STDOUT since I'm working over ssh. The file I am working with looks like this (filename: tmp):
%Iter duration train_objective valid_objective difference
0 6.0 0.0195735 0.0610958 0.0415223
1 5.0 0.180216 0.191344 0.011128
2 5.0 0.223318 0.241081 0.017763
3 6.0 0.245895 0.262197 0.016302
4 6.0 0.25796 0.28056 0.0226
5 6.0 0.269223 0.291769 0.022546
6 5.0 0.281187 0.298474 0.017287
7 5.0 0.283891 0.305579 0.021688
8 5.0 0.296456 0.307381 0.010925
9 5.0 0.296856 0.315487 0.018631
10 5.0 0.295805 0.321391 0.025586
Total training time is 0:06:27
So far, I can only plot the values corresponding to the 3rd column using the following line:
cat tmp | gnuplot -e "set terminal dumb size 120, 30; set autoscale; plot '-' u 1:3 with lines notitle"
Could someone tell me how I could include the 4th column in the same plot? Is that possible?
Thanks!
There is nothing in your description that rules out the trivial answer:
gnuplot -e "plot 'tmp' u 1:3 with lines, '' u 1:4 with lines"
The terminal choice is not relevant: you used 'set term dumb', but it could just as easily be any other output terminal, and connecting via ssh does not prevent that. If you have additional constraints that require a more complicated solution, please add them to the question.
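Combining the answer's plot command with the dumb terminal from the question could look like the sketch below. The helper name plot_two_columns and the curve titles 'train' and 'valid' are assumptions added for illustration; only the file name tmp and the column choices come from the thread.

```shell
#!/bin/bash
# Build the gnuplot commands for an ASCII plot of columns 3 and 4 against column 1.
# plot_two_columns is a hypothetical helper name, not part of gnuplot.
plot_two_columns() {
    file=$1
    printf "set terminal dumb size 120, 30; "
    printf "plot '%s' u 1:3 with lines title 'train', " "$file"
    printf "'' u 1:4 with lines title 'valid'\n"
}

# Only run gnuplot when it is installed and the data file exists.
if command -v gnuplot >/dev/null 2>&1 && [ -f tmp ]; then
    gnuplot -e "$(plot_two_columns tmp)"
fi
```

Naming the file in the plot command, rather than piping it in as '-', is what lets gnuplot read the same data a second time for the 4th column.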

Calculate 15 minutes ago in shell

I have a time-stamp like 7:00:00, which means 7am.
I would like to write a short command that returns 06:45:00, or simply 06:45, preferably using date command so that I can avoid long shell script. Do you have any elegant solution?
I'm also looking for a 24h format. For example, 12:00:00 - 15 minutes = 11:45:00.
With GNU date, pass '7:00:00 AM - 15 minutes' as the -d (--date) string:
% date -d '7:00:00 AM - 15 minutes' '+%H:%M'
06:45
+%H:%M sets the output format as HH:MM.
On BSD variants, date has a -v flag which takes the current timestamp and displays the result of a positive or negative adjustment.
This will subtract 15 minutes from the current timestamp:
date -v -15M
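If neither the GNU nor the BSD flavour of date can be relied on, the subtraction can also be done with plain arithmetic. This is a different technique from the date-based answers above: a minimal sketch using awk, where minus_minutes is a hypothetical helper name and the input is assumed to be a 24h H:MM or H:MM:SS string.

```shell
#!/bin/bash
# minus_minutes TIME N: print the 24h time N minutes before TIME (H:MM[:SS]).
# Hypothetical helper; it does the arithmetic in awk instead of calling date.
minus_minutes() {
    echo "$1" | awk -F: -v n="$2" '{
        # seconds since midnight, minus the requested number of minutes
        t = ($1 * 3600 + $2 * 60 + $3 - n * 60) % 86400
        if (t < 0) t += 86400          # wrap around midnight
        printf "%02d:%02d:%02d\n", int(t / 3600), int(t % 3600 / 60), t % 60
    }'
}

minus_minutes 7:00:00 15     # prints 06:45:00
minus_minutes 12:00:00 15    # prints 11:45:00
```

Because it works on seconds since midnight, this ignores dates and daylight saving changes entirely, which is usually what is wanted for a bare time of day.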

How to resume reading a file?

I'm trying to find the best and most efficient way to resume reading a file from a given point.
The given file is being written frequently (this is a log file).
This file is rotated on a daily basis.
In the log file I'm looking for the pattern 'slow transaction'. Such lines end with a number in parentheses. I want the sum of those numbers.
Example of log line:
Jun 24 2015 10:00:00 slow transaction (5)
Jun 24 2015 10:00:06 slow transaction (1)
This is the easy part: with an awk command I can get the total of 6 for the example above.
Now my challenge is that I want to get the values from this file on a regular basis. I've an external system that polls a custom OID using SNMP. When hitting this OID the Linux host runs a couple of basic commands.
I want this SNMP polling event to get the number of events since the last polling only. I don't want to have the total every time, just the total of the newly added lines.
Note that only bash can be used, plus basic commands such as awk, sed, tail, etc. No Perl or other advanced programming language.
I hope my description is clear enough. Apologies if this is a duplicate. I did some research before posting but did not find anything that precisely corresponds to my need.
Thank you for any assistance
In addition to the methods in the comment link, you can also simply use stat to read the logfile size, save it, sleep 300, then check the size again. If the size has changed, skip over the old information with dd and read only the new information.
Note: the example also handles the case where the logfile is deleted and then restarted (i.e. if ((newsize < size)), read the whole file), and the case where the file was empty at the previous check, since dd does not accept bs=0.
Here is a short example with 5 minute intervals:
#!/bin/bash
lfn=${1:-/path/to/logfile}
size=$(stat -c "%s" "$lfn")                ## save original log size
while :; do
    newsize=$(stat -c "%s" "$lfn")         ## get new log size
    if ((size != newsize)); then           ## if change, use new info
        if ((newsize < size || size == 0)); then
            newtext=$(<"$lfn")             ## rotated or previously empty: read it all
        else
            ## use dd to skip over existing text to new text
            newtext=$(dd if="$lfn" bs="$size" skip=1 2>/dev/null)
        fi
        ## process newtext however you need
        printf "\nnewtext:\n\n%s\n" "$newtext"
        size=$newsize                      ## update size to newsize
    fi
    sleep 300
done
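Since the question describes an SNMP poll that runs a command on demand rather than a long-running loop, a one-shot variant that remembers the byte offset between calls may fit better. The following is only a sketch under assumptions: sum_new_slow and the state file are hypothetical names, and it swaps dd and stat for wc, tail, head and awk.

```shell
#!/bin/bash
# sum_new_slow LOG STATE: sum the numbers in parentheses on 'slow transaction'
# lines added to LOG since the previous call. STATE is a hypothetical file
# that stores the byte offset reached last time.
sum_new_slow() {
    log=$1
    state=$2
    offset=0
    size=$(( $(wc -c < "$log") ))          # current size in bytes
    if [ -f "$state" ]; then
        offset=$(cat "$state")
    fi
    # A file smaller than the saved offset means the log was rotated.
    if [ "$size" -lt "$offset" ]; then
        offset=0
    fi
    # Read only the bytes between the saved offset and the size snapshot.
    tail -c +"$((offset + 1))" "$log" | head -c "$((size - offset))" |
        awk -F'[()]' '/slow transaction/ { sum += $2 } END { print sum + 0 }'
    echo "$size" > "$state"
}
```

Called once per poll, for example `sum_new_slow /path/to/logfile /var/tmp/slowtx.offset`, it prints 0 when nothing new was logged. One caveat: if the writer is in the middle of a line when the size is sampled, that line is split across two polls and may be missed.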

Bash scripting: date -d won't accept my string format of hhmmss. I need a workaround

I need a Bash script to accept 1 argument representing a time in hhmmss format, and from that derive a second time 3 minutes before that.
I've been trying to use date -d:
#! /bin/bash
DATE=`date +%Y%m%d`
TIME=$1
NEWTIME=`date -d "$DATE $TIME - 3 minutes" +%H%M%S`
echo $NEWTIME
In action:
$ ./myscript.sh 123456
invalid date `20141022 123456 - 3 minutes'
It seems the problem is with the 6 character time format because 4 characters (eg 1234) works. The subtraction of the 3 minutes is not the problem because I get the same error when I remove it.
It has occurred to me I could parse the time into a more palatable format before sending it to date. I tried inserting delimiters by adding this line:
TIME=${TIME:0:2}:${TIME:2:2}:${TIME:4:2}
It accepted that format but the answer to the - 3 minutes part was inexplicably very wrong (it subtracted 2 hours and 1 minute):
$ ./myscript.sh 123456
103356
Vexing.
It has also occurred to me that I might be able to provide date with an input format, like strptime which I'm familiar with from Python. I've found references to strptime in the context of Bash but I've been unable to get it to do anything.
Does anyone have any suggestions on getting the hhmmss time-string to work? Any help is much appreciated.
FYI: I'm trying to avoid changing the 6 character input format because that would involve changing other scripts as well as getting certain human users to alter long-entrenched habits. I'm also trying to avoid outsourcing this task to another language. (I could easily do this in Python). I want a Bash solution to this problem, if there is one.
TIME=093000
TIME=${TIME:0:2}:${TIME:2:2}:${TIME:4:2} # your line
date -d "2014-10-20 $TIME 3 mins ago" +%H%M%S
Output:
092700
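Wrapped up as a function that takes the original hhmmss argument, the same idea might look like this. It is a sketch assuming GNU date; hhmmss_minus is a hypothetical name, and the fixed date plus -u (UTC) are my additions so that daylight saving changes cannot affect the result. The "N mins ago" phrasing is the one from the answer above, which avoids the "- N minutes" form that misbehaved in the question.

```shell
#!/bin/bash
# hhmmss_minus HHMMSS N: print the time N minutes before HHMMSS, in hhmmss form.
# Hypothetical helper; requires GNU date for the -d option.
hhmmss_minus() {
    t=$1
    mins=$2
    # Reject anything that is not exactly six digits.
    case $t in
        [0-9][0-9][0-9][0-9][0-9][0-9]) ;;
        *) echo "expected hhmmss, got '$t'" >&2; return 1 ;;
    esac
    # Insert the colons that date expects: 123456 -> 12:34:56
    t=$(printf '%s' "$t" | sed 's/^\(..\)\(..\)\(..\)$/\1:\2:\3/')
    # The date itself is arbitrary; only the time of day is printed.
    date -u -d "2000-01-01 $t $mins mins ago" +%H%M%S
}

hhmmss_minus 123456 3    # prints 123156
```

The callers keep passing the six-character format, so no other script or habit has to change.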

Text-Message Gateways & Incrementing Bash Variable Daily

I have a bash script that is sending me a text daily, for 100 days.
#! /bin/bash
EMAIL="my-phone-gateway#address.net"
MESSAGE="message_content.txt"
mail $EMAIL < $MESSAGE
Using crontab, I can have the static $MESSAGE sent to me every day.
Other than hard-coding 100 days of texts ;)
How could I implement a variable counter such that I can have my texts say:
"Today is Day #1" on the first day, "Today is Day #2" on the second day, etc. ?
Note: The location of the requested text within the $MESSAGE file doesn't matter. Last line, first line, middle, etc.
The only requirement for an answer here is that I know what day it is relative to the first, where the first day is the day the script was started.
Of course, bonus awesome points for the cleanest, simplest, shortest solution :)
For our nightly build systems, I wrote a C program that does the calculation (using local proprietary libraries that store dates as a number of days since a reference date). Basically, given a (non-changing) reference date, it reports the number of days since the reference date. So, the cron script would have a hard-wired first day in it, and the program would report the number of days since then.
The big advantage of this system is that the reference date doesn't change (very often), so the script doesn't change (very often), and there are no external files to store information in.
There probably are ways to achieve the same effect with standard Unix tools, but I've not sat down and worked out the portable solution. I'd probably think in terms of using Perl. (The C program only works up to 2999 CE; I left a note in the code for people to contact me about 50 years before it becomes a problem for the Y3K fix. It is probably trivial.)
You could perhaps work in terms of Unix timestamps...
Create a script 'days_since 1234567890' which treats the number as the reference date, gets the current time stamp (from date with appropriate format specification; on Linux, date '+%s' would do that job, and it works on Mac OS X too), takes the difference and divides by 86,400 (the number of seconds in a day).
refdate=1234567890
bc <<EOF
scale=0
($(date '+%s') - $refdate) / 86400
EOF
An example:
$ timestamp 1234567890
1234567890 = Fri Feb 13 15:31:30 2009
$ timestamp
1330027280 = Thu Feb 23 12:01:20 2012
$ refdate=1234567890
$ bc <<EOF
> scale=0
> ($(date '+%s') - $refdate) / 86400
> EOF
1104
$
So, if the reference date was 13th Feb 2009, today is day 1104. (The program bc is the calculator; its name has nothing to do with Anno Domini or Before Christ. The program timestamp is another homebrew of mine that prints timestamps according to a format that can be specified; it is a specialized variant of date originally written in the days before date had the functionality, by which I mean in the early 1980s.)
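The same calculation also fits in shell arithmetic without bc. In this sketch days_since is written as a function rather than the separate script the answer describes, and the optional second argument (the "now" timestamp) is an assumption added so the result can be checked against the example above.

```shell
#!/bin/bash
# days_since REF_EPOCH [NOW_EPOCH]: whole days elapsed since the reference timestamp.
# NOW_EPOCH defaults to the current time; it is a hypothetical extra parameter.
days_since() {
    ref=$1
    now=${2:-$(date '+%s')}
    echo $(( (now - ref) / 86400 ))
}

days_since 1234567890 1330027280    # prints 1104, matching the bc example
```

For the text message, `echo "Today is Day #$(( $(days_since 1234567890) + 1 ))"` would make the reference day itself Day #1.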
In a Perl one-liner (assuming you specify the reference date in your script):
perl -e 'printf "%d\n", int((time - 1234567890)/ 86400)'
or:
days=$(perl -e 'printf "%d\n", int((time - 1234567890)/ 86400)')
The only way to accomplish this would be to store the date in a file, and read from that file each day. I would suggest storing the epoch time.
today=$(date +%s)
# left unquoted so that ~ expands to the home directory
time_file=~/.first_time
if [[ -f $time_file ]]; then
    f_time=$(< "$time_file")
else
    f_time=$today
    echo "$f_time" > "$time_file"
fi
printf 'This is day: %s\n' "$((($today - $f_time) / 60 / 60 / 24))"
Considering that your script is running only once a day, something like this should work:
#!/bin/bash
EMAIL="my-phone-gateway#address.net"
MESSAGE="message_content.txt"
STFILE=/tmp/start.txt
start=0
[ -f "$STFILE" ] && start=$(<"$STFILE")
start=$((start+1))
echo "$start" > "$STFILE"
# send the static message followed by the day counter line
{ cat "$MESSAGE"; echo "Today is Day #${start}"; } | mail "$EMAIL"
A simple answer would be to export the current value to an external file, and read that back in again later.
So, for example, make a file called "CurrentDay.dat" that has the number 1 in it.
Then, in your bash script, read in the number and increment it.
e.g. your bash script could be:
#!/bin/bash
#Your stuff here.
DayCounter=$(<CurrentDay.dat)
#Use the value of DayCounter (i.e. $DayCounter) in your message.
DayCounter=$((DayCounter + 1))
echo $DayCounter > CurrentDay.dat
Of course, you may need to implement some additional checks to avoid something going wrong, but that should work as is.

Resources