Weekly seasonality in R with Fourier terms - ARIMA

I have hourly data and I'd like to capture its weekly seasonality. How can I capture it with Fourier terms in R, so that I can add them as regressors to my ARIMA model? Or should I consider dummy variables instead?
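As a rough illustration of what "Fourier terms as regressors" means here, the sketch below builds sine/cosine pairs for a weekly period of 24 * 7 = 168 hours. It is written in Python with NumPy purely to show the construction; the function name fourier_terms and the choice K = 3 are assumptions made for this example. In R, the fourier() function of the forecast package produces terms of this kind, and the resulting matrix can be passed to Arima() or auto.arima() through the xreg argument.

```python
import numpy as np

def fourier_terms(n_obs, period, K, start=0):
    """Return an (n_obs, 2*K) matrix of sin/cos pairs for one seasonal period.

    `start` is the index of the first observation, so that regressors for a
    forecast horizon continue where the training regressors stopped.
    """
    t = np.arange(start, start + n_obs)
    columns = []
    for k in range(1, K + 1):
        angle = 2.0 * np.pi * k * t / period
        columns.append(np.sin(angle))
        columns.append(np.cos(angle))
    return np.column_stack(columns)

HOURS_PER_WEEK = 24 * 7  # weekly period for hourly data

# Four weeks of hourly observations, K = 3 harmonics (an arbitrary choice here).
xreg = fourier_terms(n_obs=4 * HOURS_PER_WEEK, period=HOURS_PER_WEEK, K=3)

# Regressors for the next 48 hours, continuing the same time index.
xreg_future = fourier_terms(n_obs=48, period=HOURS_PER_WEEK, K=3,
                            start=4 * HOURS_PER_WEEK)
```

With K harmonics this adds 2K columns, whereas a full set of hour-of-week dummy variables would need 167. In practice K is usually chosen by comparing an information criterion such as AICc across candidate values.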

Related

How to map features from two different data using regressor for classification?

I am trying to build a Keras model to implement the approach explained in this paper.
Context of my implementation:
I have two different kinds of data representing the same set of classes (labels) that need to be classified. The first kind is image data, and the second kind is EEG data (a time series sequence).
I know that to classify image data we can use CNN models like this:
model.add(Conv2D(filters=256, kernel_size=(11,11), strides=(1,1), padding='valid'))
model.add(Activation('relu'))
model.add(Dense(1000))
model.add(Activation('relu'))
model.add(Dropout(0.4))
# Batch Normalisation
model.add(BatchNormalization())
# Output Layer
model.add(Dense(40))
model.add(Activation('softmax'))
And to classify sequence data we can use LSTM models like this:
model.add(LSTM(units = 50, return_sequences = True))
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(32, activation='relu'))
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(40, activation='softmax'))
But the approach of the paper above shows that EEG feature vectors can be mapped with image vectors through regression like this:
The first approach is to train a CNN to map images to corresponding
EEG feature vectors. Typically, the first layers of CNN attempt to
learn the general (global) features of the images, which are common
between many tasks, thus we initialize the weights of these layers
using pre-trained models, and then learn the weights of the last
layers from scratch in an end-to-end setting. In particular, we used
the pre-trained AlexNet CNN, and modified it by replacing the
softmax classification layer with a regression layer (containing as
many neurons as the dimensionality of the EEG feature vectors),
using Euclidean loss as the objective function.
The second approach consists of extracting image features using
pre-trained CNN models and then employ regression methods to map
image features to EEG feature vectors. We used our fine-tuned
AlexNet as feature extractors by
reading the output of the last fully connected layer, and then
applied several regression methods (namely, k-NN regression, ridge
regression, random forest regression) to obtain the predicted
feature vectors
I am not able to work out how to code the above two approaches. I have never used a regressor for feature mapping and then done classification. Any leads on this are much appreciated.
In my understanding, the training data consists of (eeg_signal, image, class_label) triplets.
1. Train the LSTM model with input=eeg_signal, output=class_label. The loss is cross-entropy.
2. Peel off the last layer of the LSTM model. Let's say the output of the second-to-last layer is a vector of size 20. Let's call it eeg_representation.
3. Run this truncated model on all your eeg_signal inputs and save the eeg_representation outputs. You will get a tensor of shape [batch, 20].
4. Take the AlexNet mentioned in the paper (or any other image classifier) and peel off its last layer. Let's say the output of the second-to-last layer is a vector of size 30. Let's call it image_representation.
5. Stitch a linear layer onto the end of the truncated image model. This layer converts image_representation to eeg_representation, so it has 20 x 30 weights.
6. Train the stitched model on (image, eeg_representation) pairs. The loss is the Euclidean distance.
7. And now the fun part: stitch together the image model trained in step 6 and the last layer peeled off the LSTM model trained in step 1. If you input an image, you will get class predictions.
This may not sound like a big deal (because we do image classification all the time), but if this really works, it means that this is a "prediction that is running through our brains" :)
Thank you for bringing up this question and linking the paper.
I feel I have just repeated what's in your question and in the paper.
It would be beneficial to have a toy dataset in order to provide code examples.
Here's a TensorFlow tutorial on how to "peel off" the last layer of a pretrained image classification model.
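In the absence of real data, here is a minimal toy sketch of the regression-mapping idea using NumPy only. Everything in it is an assumption made for illustration: the random arrays stand in for the outputs of the two truncated networks, the sizes 30 and 20 are the ones used in the steps above, head_weights stands in for the layer peeled off the trained EEG classifier, and closed-form ridge regression replaces the trainable linear layer (ridge regression is one of the regressors named in the paper's second approach).

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, image_dim, eeg_dim, n_classes = 200, 30, 20, 40

# Hypothetical stand-ins for the outputs of the two truncated networks.
image_representation = rng.normal(size=(n_samples, image_dim))
true_map = rng.normal(size=(image_dim, eeg_dim))
eeg_representation = (image_representation @ true_map
                      + 0.01 * rng.normal(size=(n_samples, eeg_dim)))

# Ridge regression in closed form: W = (X^T X + alpha * I)^-1 X^T Y.
# W has shape (30, 20) because samples are stored as rows.
alpha = 1.0
X, Y = image_representation, eeg_representation
W = np.linalg.solve(X.T @ X + alpha * np.eye(image_dim), X.T @ Y)

# Hypothetical peeled-off last layer of the EEG classifier. In a real
# pipeline these weights would be copied from the trained LSTM model.
head_weights = rng.normal(size=(eeg_dim, n_classes))
head_bias = np.zeros(n_classes)

def softmax(z):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Image features -> predicted EEG features -> class probabilities.
predicted_eeg = X @ W
class_probs = softmax(predicted_eeg @ head_weights + head_bias)
predicted_class = class_probs.argmax(axis=1)
```

With real data, X would come from running the truncated image model on the images and Y from running the truncated EEG model on the matching signals. The fit should be evaluated on held-out pairs rather than on the training set as done here.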

AutoML Video Intelligence with custom labels - true negatives in training data

This question pertains to AutoML for Video Intelligence (custom labels).
When setting up training data, you are instructed to only label videos with your custom labels in them (and not videos that don’t have that label). How does the model train to identify true negatives for custom labels?
After applying the score threshold, the predictions made by your model will fall into one of four categories: true positive, false positive, true negative, or false negative.
We can use these categories to calculate precision and recall — metrics that help us gauge the effectiveness of our model.
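To make the role of the threshold concrete, here is a small sketch that assumes the four categories are the usual true positive, false positive, true negative and false negative counts. The scores and labels are made up for the example, and the function names are hypothetical. It says nothing about how AutoML itself is trained.

```python
def confusion_counts(scores, labels, threshold):
    """Count TP, FP, TN, FN for one label after applying a score threshold."""
    tp = fp = tn = fn = 0
    for score, is_positive in zip(scores, labels):
        predicted_positive = score >= threshold
        if predicted_positive and is_positive:
            tp += 1
        elif predicted_positive and not is_positive:
            fp += 1
        elif not predicted_positive and not is_positive:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Made-up model scores and ground-truth labels for a single custom label.
scores = [0.9, 0.8, 0.4, 0.3, 0.7, 0.2]
labels = [True, True, True, False, False, False]

tp, fp, tn, fn = confusion_counts(scores, labels, threshold=0.5)
precision, recall = precision_recall(tp, fp, fn)
```

Note that true negatives appear in neither formula: precision and recall depend only on TP, FP and FN.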

How to create a Single AUC, Confusion matrix, and ROC curve in H2O AutoML Python

I am using H2O's AutoML package and would like to know whether it is possible to get a single AUC, confusion matrix and ROC curve for all the methods combined. For instance, I have AUC values for the individual models (GLM, Stacked Ensemble, deep learning, etc.). Can you get these three outputs for all the methods combined? The goal is to be able to compare the AutoML package to other similar packages.

Exploration graphics in h2o

To whom it may concern,
Is it possible to plot an explanatory variable versus the target in h2o? I want to know whether it is possible to carry out basic data exploration in h2o, or whether it is not designed for that.
Many thanks in advance,
Kere
The main plotting functionality for an H2O frame is for histograms (hist() in Python and h2o.hist() in R).
Within Flow you can do basic data exploration: import your data frame, click on inspect, and next to the hyperlinked columns you'll see a plot button that gives you, for example, bar charts of counts, along with other plot types.
You can also easily convert the single columns you want to plot into a pandas or R data frame with
H2OFrame.as_data_frame() in Python
as.data.frame.H2OFrame in R
and then use the native Python and R plotting methods.

GIS polygon map overlay intersection operation

There are many algorithms for the binary map overlay operation in vector data format, which take two map layers and produce a resultant layer, i.e. the overlaid layer, as output. I am wondering whether there are any algorithms that take more than two layers, say three layers, simultaneously and produce the overlay result?
There are a variety of geographic computational overlay procedures available for multiple layers. These fall into the group of multiple criteria decision analysis, whereby multiple criteria (map) layers are standardized and combined (overlaid) to produce a resulting (map) layer. However, many of these are for raster data inputs!
If in fact you just want to combine vector data to produce an intersection, a procedural model would work best, as @Thomas has commented. This can be done via Python (standalone) or with ModelBuilder inside ArcGIS. There are also other methods that can be used to script the procedural overlay process.
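One way to picture such a procedural model is as a chain of binary overlays: intersect the first two layers, intersect the result with the third, and so on. The sketch below shows only that chaining. To stay self-contained it uses axis-aligned rectangles (xmin, ymin, xmax, ymax) as a deliberately simplified stand-in for polygons; real polygon layers would need a geometry library or GIS tool for the binary step. The function names are hypothetical.

```python
from functools import reduce

def intersect(a, b):
    """Binary overlay: intersection of two axis-aligned rectangles.

    Rectangles are (xmin, ymin, xmax, ymax). Returns None when the
    rectangles do not overlap (or when either input is already None).
    """
    if a is None or b is None:
        return None
    xmin, ymin = max(a[0], b[0]), max(a[1], b[1])
    xmax, ymax = min(a[2], b[2]), min(a[3], b[3])
    if xmin >= xmax or ymin >= ymax:
        return None
    return (xmin, ymin, xmax, ymax)

def overlay_all(layers):
    """N-ary overlay expressed as a left fold of the binary overlay."""
    return reduce(intersect, layers)

# Three made-up "layers", each reduced to a single rectangle.
layers = [(0, 0, 10, 10), (2, 3, 12, 8), (5, 1, 9, 20)]
result = overlay_all(layers)
```

Because intersection is associative and commutative, the order in which the layers are folded does not change the result, only the size of the intermediate geometries.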
I would like you to think about what exactly you're aiming to do. Let's think about the following scenarios:
You have a vector polygon of some City, and your goal is to overlay all the industrial, residential and commercial land usage. This would require you to subtract the different land uses from your City polygon, one by one. Or, you can merge your three land uses into one polygon and subtract that from your City polygon.
Given the wide range of multiple criteria decision analysis methodologies (e.g. weighted linear combination), a raster methodology might be suitable if you're looking for the "optimal location". For instance, if you were looking for a location in the City that has an optimal combination of industrial, commercial and retail land use, weighted linear combination could be used.
Let us define our land use weights as 20%, 40%, 40% (industrial, commercial, retail). We must also standardize our land use layer values between 0 and 1. The best possible score occurs where all three standardized layers equal 1, giving weighted values of 0.2, 0.4 and 0.4, which sum to 1.
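A small numeric sketch of weighted linear combination with the weights above follows. The 2 x 2 rasters are invented for illustration and are assumed to be already standardized to the range 0 to 1.

```python
import numpy as np

# Weights from the text: industrial 20%, commercial 40%, retail 40%.
weights = {"industrial": 0.2, "commercial": 0.4, "retail": 0.4}

# Hypothetical 2 x 2 rasters, already standardized to [0, 1].
layers = {
    "industrial": np.array([[1.0, 0.5], [0.0, 1.0]]),
    "commercial": np.array([[1.0, 0.5], [1.0, 0.0]]),
    "retail":     np.array([[1.0, 0.0], [0.5, 0.5]]),
}

# Weighted linear combination: cell-by-cell weighted sum of the layers.
suitability = sum(weights[name] * layers[name] for name in weights)

# Row/column index of the cell with the highest combined score.
best_cell = np.unravel_index(np.argmax(suitability), suitability.shape)
```

The top-left cell, where every layer equals 1, reaches the maximum score of 0.2 + 0.4 + 0.4 = 1. Every other cell scores lower in proportion to its weighted layer values.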

Resources