Import sensor data to RRDtool DB - bash

I'm trying to import data into an RRDtool DB for a couple of temperature sensors, collected from an RFXtrx433e USB controller. The output goes to .txt files.
My database is created like this:
# Script to create rrd-file
# 24h with 2,5 min resolution
# 7d with 5 min resolution
# 1y with 10 min resolution
# 20y with 1h resolution
directory="/home/pi/temp/rrddata/"
filename="domoticz_temp.rrd"
# Check if file already exists
if [ ! -f "$directory$filename" ]
then
# File doesn't exist, create new rrd-file
echo "Creating RRDtool DB for outside temp sensor"
rrdtool create $directory$filename \
--step 120 \
DS:probe:GAUGE:120:-50:60 \
DS:xxxx1:GAUGE:120:-50:60 \
DS:vardagsrum:GAUGE:120:-50:60 \
RRA:AVERAGE:0.5:1:576 \
RRA:AVERAGE:0.5:2:2016 \
RRA:AVERAGE:0.5:4:52560 \
RRA:AVERAGE:0.5:24:175200 \
RRA:MAX:0.5:1:5760 \
RRA:MAX:0.5:2:2016 \
RRA:MAX:0.5:4:52560 \
RRA:MAX:0.5:24:175200 \
RRA:MIN:0.5:1:5760 \
RRA:MIN:0.5:2:2016 \
RRA:MIN:0.5:4:52560 \
RRA:MIN:0.5:24:175200
echo "Done!"
else
echo $directory$filename" already exists, delete it first."
fi
Import of sensor data
rrdtool update /home/pi/temp/rrddata/domoticz_temp.rrd --template probe N:`head -n 1 </home/pi/temp/output/temp_probe.txt`
The imported text file just contains one row with a number (the temperature collected from the sensor through a Lua script).
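For illustration, if temp_probe.txt held e.g. 21.5 (a made-up value), the command substitution would expand the update to:
rrdtool update /home/pi/temp/rrddata/domoticz_temp.rrd --template probe N:21.5
Because only the probe data source is named in the template, the other two data sources receive no value for that interval and remain unknown.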
The code for creating the graph:
rrdtool graph /home/pi/temp/output/img/test/hour.png \
-w 697 -h 287 -a PNG \
--slope-mode \
--start -6h --end now \
--vertical-label "Last 6 hour temperature" \
DEF:probe=/home/pi/temp/rrddata/domoticz_temp.rrd:probe:AVERAGE \
DEF:xxxx1=/home/pi/temp/rrddata/domoticz_temp.rrd:xxxx1:AVERAGE \
DEF:vardagsrum=/home/pi/temp/rrddata/domoticz_temp.rrd:vardagsrum:AVERAGE \
COMMENT:" Location Min Max Senaste\l" \
LINE1:probe#ff0000:"Utetemp" \
LINE1:0#ff0000: \
GPRINT:probe:MIN:" %5.1lf" \
GPRINT:probe:MAX:" %5.1lf" \
GPRINT:probe:LAST:" %5.1lf\n" \
LINE1:xxxx1#00ff00:"Xxxx1" \
LINE1:0#00ff00: \
GPRINT:xxxx1:MIN:" %5.1lf" \
GPRINT:xxxx1:MAX:" %5.1lf" \
GPRINT:xxxx1:LAST:" %5.1lf\n" \
LINE1:vardagsrum#0000ff:"vardagsrum" \
LINE1:0#0000ff: \
GPRINT:vardagsrum:MIN:" %5.1lf" \
GPRINT:vardagsrum:MAX:" %5.1lf" \
GPRINT:vardagsrum:LAST:" %5.1lf\n"
This gives me the following graph: http://i.imgur.com/lnFxTik.png
Now to my questions:
Have I created the database and the rest of the script in a correct way? I think I should get NaN on the values not in the DB?
How do I import the rest of the sensors? They are in several similar TXT files.
Should/can I collect data from the sensors in another, better way to get them into the RRDtool DB?
I hope someone can help me.
New info!
My Lua script for collecting sensor data:
commandArray = {}
if (devicechanged['Probe']) then
local file = io.open("/home/pi/temp/output/temp_probe.txt", "w")
file:write(tonumber(otherdevices_temperature['Probe']))
file:close()
end
if (devicechanged['Xxxx1']) then
local file = io.open("/home/pi/temp/output/temp_xxxx1.txt", "w")
file:write(tonumber(otherdevices_temperature['Xxxx1']))
file:close()
end
if (devicechanged['Vardagsrum']) then
local file = io.open("/home/pi/temp/output/temp_vardagsrum.txt", "w")
file:write(tonumber(otherdevices_temperature['Vardagsrum']))
file:close()
end
return commandArray

Yes, if a value is missing you get NaN. Your create statement looks OK ... although 20y with 1h resolution ... wow!
Importing from several text files would work like this:
A=`perl -ne 'chomp;print;exit' xx1.txt`
B=`perl -ne 'chomp;print;exit' xx2.txt`
rrdtool update domoticz_temp.rrd --template xx1:xx2 N:$A:$B
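Applied to the three files your Lua script writes, it could look roughly like this (just a sketch: it assumes all three files exist when the update runs, and falls back to U, i.e. unknown, if one is empty):
#!/bin/bash
# Read the latest value from each sensor file written by the Lua script
PROBE=$(head -n 1 /home/pi/temp/output/temp_probe.txt)
XXXX1=$(head -n 1 /home/pi/temp/output/temp_xxxx1.txt)
VARDAGSRUM=$(head -n 1 /home/pi/temp/output/temp_vardagsrum.txt)
# Fall back to U (unknown) so the update still succeeds if a file was empty
PROBE=${PROBE:-U}
XXXX1=${XXXX1:-U}
VARDAGSRUM=${VARDAGSRUM:-U}
# One update feeding all three data sources at the same timestamp
rrdtool update /home/pi/temp/rrddata/domoticz_temp.rrd \
--template probe:xxxx1:vardagsrum \
N:$PROBE:$XXXX1:$VARDAGSRUM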
Yes, instead of writing them to a file first, I would recommend updating the rrd file directly.

# 24h with 2,5 min resolution
# 7d with 5 min resolution
# 1y with 10 min resolution
# 20y with 1h resolution
...
rrdtool create $directory$filename \
--step 120 \
DS:probe:GAUGE:120:-50:60 \
DS:xxxx1:GAUGE:120:-50:60 \
DS:vardagsrum:GAUGE:120:-50:60 \
RRA:AVERAGE:0.5:1:576 \
RRA:AVERAGE:0.5:2:2016 \
RRA:AVERAGE:0.5:4:52560 \
RRA:AVERAGE:0.5:24:175200 \
OK, you seem to have a 2min step size, and your RRAs are consolidating 1, 2, 4 and 24 steps. This corresponds to 2min, 4min, 8min and 48min, not to 2.5, 5, 10 and 1h. Maybe your step should be 150? Also, the heartbeat on your DSs is the same as your step, which might cause you to lose data. Generally speaking, the heartbeat should be about 1.5 to 2 times the step size to allow for irregular data arrival.
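For example, a create statement that matches your comments could look like this (a sketch only, keeping your DS names and the AVERAGE RRA row counts, with a 150 s step and a 300 s heartbeat; the MIN/MAX RRAs would be adjusted the same way):
rrdtool create $directory$filename \
--step 150 \
DS:probe:GAUGE:300:-50:60 \
DS:xxxx1:GAUGE:300:-50:60 \
DS:vardagsrum:GAUGE:300:-50:60 \
RRA:AVERAGE:0.5:1:576 \
RRA:AVERAGE:0.5:2:2016 \
RRA:AVERAGE:0.5:4:52560 \
RRA:AVERAGE:0.5:24:175200
With a 150 s step, 576 rows of 1-step averages cover 24 h, 2016 rows of 2-step (5 min) averages cover 7 d, 52560 rows of 4-step (10 min) averages cover 1 y, and 175200 rows of 24-step (1 h) averages cover 20 y, so your existing row counts already line up with the comments. Note that the step of an existing RRD cannot simply be changed, so this means recreating the file.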
However, none of this relates to your 'unknown' question, much of which Tobi has already answered.
1. You will get 'unknown' on timeslots you have not loaded, yes.
2 and 3. Since you have a single RRD you need to have all the samples updated at the same timestamp, in the same operation. In this case, you're probably better off collecting them all at once and storing them into the same file, so that you can load them together and store into the RRD together. If this is an issue, and the sensors are probed independently, then I'd advise having a separate RRD for each sensor, so that you can update them independently. You can still generate a graph over all 3 together as you can define your graph DEFs to point to different RRD files no problem. This might be a better way to do it.
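A sketch of that multi-file variant (the per-sensor file names probe.rrd and vardagsrum.rrd, and their single DS names, are hypothetical; the point is just that each DEF can reference a different RRD):
rrdtool graph /home/pi/temp/output/img/test/hour.png \
--start -6h --end now \
DEF:probe=/home/pi/temp/rrddata/probe.rrd:probe:AVERAGE \
DEF:vardagsrum=/home/pi/temp/rrddata/vardagsrum.rrd:vardagsrum:AVERAGE \
LINE1:probe#ff0000:"Utetemp" \
LINE1:vardagsrum#0000ff:"Vardagsrum"
Each per-sensor RRD can then be updated on its own schedule without affecting the others.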
And Tobi's right about a 20y RRA possibly being somewhat excessive ;)

Related

Output training losses over iterations/epochs to file from trainer.py in HuggingFace Transformers

In HuggingFace's Transformers library framework, only the evaluation-step metrics are output to a file named eval_results_{dataset}.txt in the "output_dir" when running run_glue.py. The eval_results file contains the metrics associated with the dataset, e.g. accuracy for MNLI, and the evaluation loss.
Can a parameter be passed to run_glue.py to generate a training_results_{dataset}.txt file that tracks the training loss? Or would I have to build the functionality myself?
My file named run_python_script_glue.bash:
GLUE_DIR=../../huggingface/GLUE_SMALL/
TASK_NAME=MNLI
ID=OT
python3 run_glue.py \
--local_rank -1 \
--seed 42 \
--model_type albert \
--model_name_or_path albert-base-v2 \
--task_name $TASK_NAME \
--do_train \
--do_eval \
--data_dir $GLUE_DIR/$TASK_NAME \
--max_seq_length 128 \
--per_gpu_train_batch_size 8 \
--per_gpu_eval_batch_size 8 \
--gradient_accumulation_steps 2 \
--learning_rate 3e-5 \
--max_steps -1 \
--warmup_steps 1000 \
--doc_stride 128 \
--num_train_epochs 3.0 \
--save_steps 9999 \
--output_dir ./results/GLUE_SMALL/$TASK_NAME/ALBERT/$ID/ \
--do_lower_case \
--overwrite_output_dir \
--label_noise 0.2 \
--att_kl 0.01 \
--att_se_hid_size 16 \
--att_se_nonlinear relu \
--att_type soft_attention \
--adver_type ot \
--rho 0.5 \
--model_type whai \
--prior_gamma 2.70 \
--three_initial 0.0
In the trainer.py file in the Transformers library, the training loss variable during the training step is called tr_loss.
tr_loss = self._training_step(model, inputs, optimizer, global_step)
loss_scalar = (tr_loss - logging_loss) / self.args.logging_steps
logs["loss"] = loss_scalar
logging_loss = tr_loss
In the code, the training loss is first scaled by the logging steps and later passed to a logs dictionary. logs['loss'] is later printed to the terminal but not to a file. Is there a way to extend this to also write the value to a txt file?
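One shell-level workaround, rather than a run_glue.py parameter: since logs['loss'] is printed to the terminal, the whole run can be captured with tee and the loss lines filtered out afterwards. The log file name, the result file name and the grep pattern here are assumptions, not part of the Transformers API:
# Keep the full console output in a file while still seeing it live
bash run_python_script_glue.bash 2>&1 | tee train_log.txt
# Afterwards, pull out the lines that report a loss value
grep -i "loss" train_log.txt > training_results_mnli.txt
Otherwise you would likely have to add a file write next to the logs dictionary in trainer.py yourself.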

For loop values in a Makefile not being used as arguments to a shell command

I'm trying to use a Makefile to iterate over several date values and execute a python script for each one, here's the Makefile I'm using (Makefile.study.s1):
include Makefile
# Dates to test
SNAP_TST := 2019-10-12 2020-02-08 2020-10-10 2021-01-02 2021-07-24 2021-12-31 2022-05-27
buildDataset:
for date in $(SNAP_TST) ; do \
python src/_buildDataset.py --table $(TABLE) \
--nan-values $(NAN_CONFIG) \
--patterns-rmv $(PATTERNS_RMV) \
--target-bin $(TARGET_CLASS) \
--target-surv $(TARGET_SURV) \
--target-init $(TARGET_INIT) \
--train-file $(DATA_TRN) \
--test-file $(DIR_DATA)/main/churnvol_test_$$date.csv \
--train-date $(SNAP_TRN) \
--test-date $$date \
--config-input $(CONFIG_INPUT) \
--feats $(FEATS) ; \
done
.PHONY: buildDataset
When I run make -f Makefile.study.s1 buildDataset, it replaces the variable date with the literal string "$date" instead of one of the dates in SNAP_TST. Can you guys help me figure out what I did wrong here, and how I can fix this Makefile so that $$date is replaced with one of the dates in SNAP_TST? Thank you in advance.

Transformers fine tune model warning

python run_clm.py \
--model_name_or_path ctrl \
--train_file df_finetune_train.csv \
--validation_file df_finetune_test.csv \
--do_train \
--do_eval \
--preprocessing_num_workers 72 \
--block_size 256 \
--output_dir ./finetuned
I am trying to fine-tune the ctrl model on my own dataset, where each row represents a sample.
However, I get the warning below.
[WARNING|tokenization_utils_base.py:3213] 2021-03-25 01:32:22,323 >>
Token indices sequence length is longer than the specified maximum
sequence length for this model (934 > 256). Running this sequence
through the model will result in indexing errors
What is the cause of this? Any solutions?

Custom garmin map have no name

I created a Garmin map from my own OSM files (using JOSM and my own GPS records, no input from Openstreetmap).
The whole process runs well, but I have just a little problem: when I load the final map into Basecamp, the name of this map is empty (blank).
Any idea?
Here is the code. First, some variables:
PREFIX=640000
ORIGINALNAME=$(echo ${PREFIX}00)
NAME=$(echo ${PREFIX}01)
ID_PUBLIC=64
DIR="/home/Carto"
GMAPIBUILDER="/Applications/Carto/gmapi-builder.py"
MKGMAP="/Applications/Carto/mkgmap/mkgmap.jar"
First, create img files from the different layers:
for f in $DIR/src/public/*.osm ; do
g=$(basename $f .osm) ;
d=$(dirname $f)
java -Xmx2G -jar $MKGMAP \
--transparent --add-pois-to-areas \
--keep-going --draw-priority=$drawpriority \
--description="[iero] "$g \
--family-name="iero Congo" \
--series-name="iero Congo" \
--mapname=$NAME --family-id=$ID --product-id=$ID \
--country-name=Congo --country-abbr=CG \
--style-file=$DIR/styles --style=iero \
--copyright-message="[iero.org] Congo $DATE" \
--product-version=$VERSION \
--latin1 --output-dir=$DIR/output/imgs/public $f 1> /dev/null;
cp $DIR/output/imgs/public/${NAME}.img $DIR/output/imgs/public/${NAME}.img
let NAME++ ;
let nbfiles++ ;
let drawpriority++ ;
done
Next, concatenate those files into a single img file:
java -jar $MKGMAP --tdbfile --gmapsupp $DIR/output/imgs/public/*.img \
--keep-going \
--style-file=$DIR/styles --style=iero \
--family-name="iero Congo" \
--series-name="iero Congo" \
--description="[iero] Congo map" \
--mapname=$ORIGINALNAME --family-id=${ID_PUBLIC} --product-id=${ID_PUBLIC} \
--copyright-message="[iero.org] Congo $DATE" \
--product-version=$VERSION \
--output-dir=$DIR/output/gps/public 1> /dev/null;
Then, create gmapi files, ready for Basecamp:
python $GMAPIBUILDER -t $DIR/output/gps/public/osmmap.tdb -b $DIR/output/gps/public/osmmap.img -o $DIR/output/basecamp/mac/public $DIR/output/imgs/public/*.img
If you want to see the problem, the final files can be downloaded from my website: http://www.iero.org/blog/2014/06/carte-du-congo/
Thanks!
Greg
I have done testing and only get the blank names with versions of mkgmap after they introduced the overview map feature. I built a map with r2585 and the name showed correctly.

How to populate RRD database with CPU and MEM usage data?

I have a Lighttpd server (on CentOS) and would like to display 4 graphs: lighttpd traffic, lighttpd requests per second, CPU usage and MEM usage. I've set the location of the rrd database in the lighttpd config like this:
rrdtool.binary = "/usr/bin/rrdtool"
rrdtool.db-name = "/var/www/lighttpd.rrd"
And into my WWW cgi-bin I put an sh file that gets data from the lighttpd RRD file and creates graphs of traffic and requests per second, like this:
#!/bin/sh
RRDTOOL=/usr/bin/rrdtool
OUTDIR=/var/www/graphs
INFILE=/var/www/lighttpd.rrd
OUTPRE=lighttpd-traffic
WIDTH=400
HEIGHT=100
DISP="-v bytes --title TrafficWebserver \
DEF:binraw=$INFILE:InOctets:AVERAGE \
DEF:binmaxraw=$INFILE:InOctets:MAX \
DEF:binminraw=$INFILE:InOctets:MIN \
DEF:bout=$INFILE:OutOctets:AVERAGE \
DEF:boutmax=$INFILE:OutOctets:MAX \
DEF:boutmin=$INFILE:OutOctets:MIN \
CDEF:bin=binraw,-1,* \
CDEF:binmax=binmaxraw,-1,* \
CDEF:binmin=binminraw,-1,* \
CDEF:binminmax=binmaxraw,binminraw,- \
CDEF:boutminmax=boutmax,boutmin,- \
AREA:binmin#ffffff: \
STACK:binmax#f00000: \
LINE1:binmin#a0a0a0: \
LINE1:binmax#a0a0a0: \
LINE2:bin#efb71d:incoming \
GPRINT:bin:MIN:%.2lf \
GPRINT:bin:AVERAGE:%.2lf \
GPRINT:bin:MAX:%.2lf \
AREA:boutmin#ffffff: \
STACK:boutminmax#00f000: \
LINE1:boutmin#a0a0a0: \
LINE1:boutmax#a0a0a0: \
LINE2:bout#a0a735:outgoing \
GPRINT:bout:MIN:%.2lf \
GPRINT:bout:AVERAGE:%.2lf \
GPRINT:bout:MAX:%.2lf \
"
$RRDTOOL graph $OUTDIR/$OUTPRE-hour.png -a PNG --start -14400 $DISP -w $WIDTH -h $HEIGHT
$RRDTOOL graph $OUTDIR/$OUTPRE-day.png -a PNG --start -86400 $DISP -w $WIDTH -h $HEIGHT
$RRDTOOL graph $OUTDIR/$OUTPRE-month.png -a PNG --start -2592000 $DISP -w $WIDTH -h $HEIGHT
OUTPRE=lighttpd-requests
DISP="-v req --title RequestsperSecond -u 1 \
DEF:req=$INFILE:Requests:AVERAGE \
DEF:reqmax=$INFILE:Requests:MAX \
DEF:reqmin=$INFILE:Requests:MIN \
CDEF:reqminmax=reqmax,reqmin,- \
AREA:reqmin#ffffff: \
STACK:reqminmax#00f000: \
LINE1:reqmin#a0a0a0: \
LINE1:reqmax#a0a0a0: \
LINE2:req#00a735:requests"
$RRDTOOL graph $OUTDIR/$OUTPRE-hour.png -a PNG --start -14400 $DISP -w $WIDTH -h $HEIGHT
$RRDTOOL graph $OUTDIR/$OUTPRE-day.png -a PNG --start -86400 $DISP -w $WIDTH -h $HEIGHT
$RRDTOOL graph $OUTDIR/$OUTPRE-month.png -a PNG --start -2592000 $DISP -w $WIDTH -h $HEIGHT
Basically it's not my script, I got it from somewhere on the internet.
Now I would like to do the same for CPU usage and MEM usage.
I don't want to use any additional packages!
As you can see, lighttpd populates the lighttpd.rrd file with traffic data and requests per second. Now I would like the system to populate a second rrd file with CPU and MEM usage, so I can add code to the sh file to generate graphs for this data.
How can I populate an RRD file with CPU and MEM usage data?
Please, NO THIRD-PARTY tools!
The trick is to read data from files in the proc filesystem and use sed/awk to extract the actual values. Create an rrd file with a DS of type GAUGE to store the data ... look at the tutorials on www.rrdtool.org to get going.
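A minimal sketch of that idea (the rrd path, the DS names cpu and mem, and using the 1-minute load average instead of a true CPU percentage, which would need two samples of /proc/stat and a delta, are all assumptions):
#!/bin/sh
RRD=/var/www/system.rrd
# Create the RRD once: two GAUGE data sources and a 60 s step (add longer RRAs for day/month graphs as needed)
if [ ! -f "$RRD" ]; then
rrdtool create "$RRD" --step 60 \
DS:cpu:GAUGE:120:0:U \
DS:mem:GAUGE:120:0:100 \
RRA:AVERAGE:0.5:1:1440
fi
# CPU: 1-minute load average, first field of /proc/loadavg
CPU=$(awk '{print $1}' /proc/loadavg)
# MEM: percent of RAM not free, from /proc/meminfo (buffers/cache count as used here)
MEM=$(awk '/MemTotal/ {t=$2} /MemFree/ {f=$2} END {printf "%.1f", (t-f)*100/t}' /proc/meminfo)
rrdtool update "$RRD" N:$CPU:$MEM
Run it from cron every minute and point a graph script like the one above at the new file.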
