How to stack layers (slope, curvature, aspect, etc.) derived from a DEM to form multi-band images in QGIS/ArcGIS?

I made a DEM from Landsat imagery and derived the aspect, slope, curvature, TWI, etc. layers (generic flood conditioning factors). I now need to combine these layers and feed them to an LSTM neural network.
I am currently thinking of following this procedure:
1. Stack all the conditioning factors together to form a multi-band image.
2. Extract each pixel and its neighbouring pixels in a 3 × 3 window, and sort the 9 pixel vectors into sequential data based on spatial continuity.
3. Send the sequential data to the LSTM network.
How do I complete step 1?
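For step 1, assuming the conditioning-factor layers have already been exported as co-registered single-band GeoTIFFs (same extent, resolution, CRS and data type), here is a minimal sketch using the rasterio Python library; the file names are illustrative:
import rasterio

# Paths to the co-registered single-band rasters (hypothetical file names).
layer_paths = ["slope.tif", "aspect.tif", "curvature.tif", "twi.tif"]

# Copy the georeferencing metadata from the first layer and set the band count.
with rasterio.open(layer_paths[0]) as src:
    meta = src.meta.copy()
meta.update(count=len(layer_paths))

# Write each conditioning factor into its own band of one multi-band GeoTIFF.
with rasterio.open("stacked_factors.tif", "w", **meta) as dst:
    for band_idx, path in enumerate(layer_paths, start=1):
        with rasterio.open(path) as src:
            dst.write(src.read(1), band_idx)
A similar result can usually be obtained without code via gdal_merge.py with the -separate flag, the QGIS Merge tool (with "Place each input file into a separate band" checked), or the Composite Bands tool in ArcGIS.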

Related

How to map features from two different data types using a regressor for classification?

I am trying to build a Keras model to implement the approach explained in this paper.
Context of my implementation:
I have two different kinds of data representing the same set of classes (labels) that need to be classified. The first kind is image data, and the second kind is EEG data (a time-series sequence).
I know that to classify image data we can use CNN models like this:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, Activation, Flatten, Dense, Dropout, BatchNormalization

model = Sequential()
# input_shape is an example value for RGB images
model.add(Conv2D(filters=256, kernel_size=(11, 11), strides=(1, 1), padding='valid', input_shape=(227, 227, 3)))
model.add(Activation('relu'))
model.add(Flatten())
model.add(Dense(1000))
model.add(Activation('relu'))
model.add(Dropout(0.4))
# Batch normalisation
model.add(BatchNormalization())
# Output layer: 40 classes
model.add(Dense(40))
model.add(Activation('softmax'))
And to classify sequence data we can use LSTM models like this:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Flatten, Dense

model = Sequential()
# input_shape=(timesteps, channels) -- example values for the EEG windows
model.add(LSTM(units=50, return_sequences=True, input_shape=(100, 14)))
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(32, activation='relu'))
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(40, activation='softmax'))
But the approach in the paper above maps image features to EEG feature vectors through regression, like this:
The first approach is to train a CNN to map images to corresponding EEG feature vectors. Typically, the first layers of CNN attempt to learn the general (global) features of the images, which are common between many tasks, thus we initialize the weights of these layers using pre-trained models, and then learn the weights of the last layers from scratch in an end-to-end setting. In particular, we used the pre-trained AlexNet CNN, and modified it by replacing the softmax classification layer with a regression layer (containing as many neurons as the dimensionality of the EEG feature vectors), using Euclidean loss as the objective function.
The second approach consists of extracting image features using pre-trained CNN models and then employ regression methods to map image features to EEG feature vectors. We used our fine-tuned AlexNet as feature extractors by reading the output of the last fully connected layer, and then applied several regression methods (namely, k-NN regression, ridge regression, random forest regression) to obtain the predicted feature vectors.
I am not able to work out how to code the above two approaches. I have never used a regressor for feature mapping followed by classification. Any leads on this are much appreciated.
In my understanding, the training data consists of (eeg_signal, image, class_label) triplets.
1. Train the LSTM model with input = eeg_signal, output = class_label. The loss is cross-entropy.
2. Peel off the last layer of the LSTM model. Let's say the pre-last layer's output is a vector of size 20. Let's call it eeg_representation.
3. Run this truncated model on all your eeg_signal inputs and save the eeg_representation outputs. You will get a tensor of shape [batch, 20].
4. Take the AlexNet mentioned in the paper (or any other image classifier) and peel off its last layer. Let's say the pre-last layer's output is a vector of size 30. Let's call it image_representation.
5. Stitch a linear layer onto the end of the previous layer. This layer converts image_representation to eeg_representation; it has a 20 × 30 weight matrix.
6. Train the stitched model on (image, eeg_representation) pairs. The loss is the Euclidean distance.
7. And now the fun part: stitch together the model trained in the previous step and the peeled-off part of the LSTM model trained in step 1. If you input an image, you will get class predictions.
This sounds like no big deal (because we do image classification all the time), but if this really works, it means that this is a "prediction that is running through our brains" :)
Thank you for bringing up this question and linking the paper.
I feel I have just repeated what is in your question and in the paper.
It would be beneficial to have a toy dataset to provide complete code examples; a rough sketch of the stitching steps is given below.
Here's a TensorFlow tutorial on how to "peel off" the last layer of a pretrained image classification model.
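A minimal Keras sketch of steps 2 and 4 to 7, under loud assumptions: eeg_classifier.h5 is a hypothetical saved LSTM classifier whose last layer is the softmax Dense, ResNet50 stands in for AlexNet (which Keras does not ship), and the Euclidean loss is approximated with mean squared error:
import tensorflow as tf
from tensorflow.keras import layers, Model

# Step 2: truncate the trained EEG classifier just before its last (softmax) layer.
eeg_model = tf.keras.models.load_model("eeg_classifier.h5")         # hypothetical path
eeg_encoder = Model(eeg_model.input, eeg_model.layers[-2].output)   # outputs eeg_representation

# Steps 4-5: pretrained image backbone plus a linear layer mapping to the EEG representation size.
backbone = tf.keras.applications.ResNet50(include_top=False, pooling="avg", input_shape=(224, 224, 3))
backbone.trainable = False
eeg_dim = eeg_encoder.output_shape[-1]
image_in = layers.Input(shape=(224, 224, 3))
predicted_eeg = layers.Dense(eeg_dim)(backbone(image_in))           # the "regression" layer
regressor = Model(image_in, predicted_eeg)

# Step 6: Euclidean loss approximated by MSE on (image, eeg_representation) pairs.
regressor.compile(optimizer="adam", loss="mse")
# regressor.fit(train_images, eeg_encoder.predict(train_eeg), epochs=10)

# Step 7: chain the regressor with the peeled-off softmax head to classify images.
softmax_head = eeg_model.layers[-1]                                 # assumed to be Dense(40, softmax)
image_classifier = Model(image_in, softmax_head(predicted_eeg))
The paper's second approach would instead fit an off-the-shelf regressor (k-NN, ridge, random forest) on the extracted backbone features to predict the EEG feature vectors.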

YOLOv5: image detection without segmentation?

I have read a number of papers on YOLOv5 image detection techniques, but the papers don't refer to any segmentation step done by YOLOv5. Since I understood that it is not possible to do image classification without a segmentation process, I am asking the following question: does YOLOv5 perform any segmentation step in order to detect objects? If yes, which segmentation algorithm does it use?
Segmentation mainly uses a Fully Convolutional Network (FCN) architecture. An FCN is a CNN without fully connected (FC) layers. Segmentation can be thought of as an encoder followed by a decoder, where both the encoder and the decoder are FCNs.
Classification using a CNN is a set of convolutional layers (which extract high-level features of the input image) followed by one or more fully connected (FC) or dense layers. The last dense/FC layer classifies the input image into the various classes.
YOLO is a regression-based object detection algorithm built on a CNN architecture. In YOLO, the image is split into an S × S grid of cells. Each grid cell predicts only one object, i.e. a cell tries to predict an object whose centre falls inside that cell. For each grid cell the CNN predicts:
B bounding boxes (x, y, w, h), where (x, y) is the centre of a bounding box relative to the cell location. A confidence score is also calculated for each predicted bounding box: it is the IoU of the predicted bounding box and the ground-truth bounding box, and it represents how likely the bounding box is to contain an object.
C conditional class probabilities for each grid cell (one per class). The conditional class probability is the probability that a detected object belongs to a given class.
The shape of the CNN's prediction/output will be (S, S, B * 5 + C); the number 5 represents the x_center, y_center, width and height of a bounding box plus its confidence score.
If an image is divided into a 7 × 7 grid of cells, 2 bounding boxes are predicted per cell, and there are 3 classes in total, then the shape of the CNN output will be (7, 7, 13).
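As a quick sanity check of that output-shape formula, a few lines of Python with the example numbers above:
S, B, C = 7, 2, 3                 # grid size, boxes per cell, number of classes
output_shape = (S, S, B * 5 + C)  # 5 = x, y, w, h, confidence
print(output_shape)               # prints (7, 7, 13)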
Source

How to set the target vector for training images in a neural network?

I am training a neural network to recognize three different signs (stop sign, no-left sign and no-entry sign). I have taken 50 images for each class. Every picture is an 8 × 8 matrix, so my input will be a 150 × 64 matrix and the output a 3 × 1 vector per image. How do I assign the target for these images? Also, do I have to normalize the images before proceeding to the training part?
If the images do not have labels (the targets), you have to label them yourself, of course. Labeling 50 images per class should not take much time.
You also have to normalize the images in some way, either with min-max scaling or by subtracting the mean and dividing by the standard deviation; neural network training will fail if you don't.
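A minimal NumPy sketch of building one-hot targets and min-max normalising the inputs, assuming the 150 flattened images are ordered by class (the array names are illustrative):
import numpy as np

images = np.random.rand(150, 64)      # placeholder for the 150 flattened 8x8 images
labels = np.repeat([0, 1, 2], 50)     # class index per image: stop, no-left, no-entry

targets = np.eye(3)[labels]           # one-hot targets, shape (150, 3)

# Min-max normalisation to [0, 1].
images = (images - images.min()) / (images.max() - images.min())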

What is the recognition rate of PCA eigenfaces?

I used the Database of Faces (formerly the ORL Database) from AT&T Laboratories Cambridge. The database consists of 400 images with 10 images per person, i.e. there are 10 images of each of the 40 people.
I separated 5 images of each person for training and kept the remaining 5 images of each person for testing.
So I have 2 folders:
1) Training (5 images/person = 200 images)
2) Testing (5 images/person = 200 images)
The photos in the training folder are different from those in the testing folder.
The percentage recognition rate I got is only 80%.
But if I pre-process the images before recognition I get:
pre-processing with imadjust: 82%
pre-processing with sharpen: 83%
pre-processing with sharpen and imadjust: 84%
(If pre-processing is done, it is applied to both the training and testing images.)
For the number of eigenfaces used, all eigenvalues of the matrix L are sorted and those below a specified threshold are eliminated:
L_eig_vec = [];
for i = 1 : size(V,2)
    if D(i,i) > 1   % keep eigenvectors whose eigenvalue exceeds the threshold (here 1)
        L_eig_vec = [L_eig_vec V(:,i)];
    end
end
I use MATLAB to implement the face recognition system. Is it normal for the recognition rate to be this low?
The accuracy will depend on the classifier you use once you have the data in the PCA-projected space. In the original Turk/Pentland eigenfaces paper (http://www.face-rec.org/algorithms/PCA/jcn.pdf) they just use kNN with Euclidean distance, but a modern implementation might use SVMs, e.g. with an RBF kernel, as the classifier in the "face space", with the C and gamma parameters optimized using a grid search. LibSVM will do that for you, and there is a MATLAB wrapper available.
You should also register the faces first, i.e. warp the images so that facial landmarks such as the eyes, nose and mouth are in a harmonised position across the whole dataset. If the images aren't pre-registered you will take a performance loss. I would expect performance in the 90s on this dataset using eigenfaces with an SVM and pre-registration; that figure is a gut feeling based on the performance of past student projects. One thing to note, however, is that your number of training examples is very low: 5 points per person in a high-dimensional space is not much to train a classifier on.
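As a rough illustration of that pipeline (PCA projection, RBF-SVM, grid search over C and gamma), here is a sketch using scikit-learn as a stand-in for LibSVM/MATLAB; the face array is a placeholder, and 92 × 112 is the standard ORL image size:
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import GridSearchCV

faces = np.random.rand(200, 92 * 112)    # placeholder for the 200 flattened training images
labels = np.repeat(np.arange(40), 5)     # 40 people, 5 training images each

# PCA "face space" followed by an RBF-kernel SVM, tuned by grid search.
pipeline = make_pipeline(PCA(n_components=50), SVC(kernel="rbf"))
param_grid = {"svc__C": [1, 10, 100], "svc__gamma": ["scale", 1e-3, 1e-4]}
search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(faces, labels)
print(search.best_params_, search.best_score_)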

Image Processing - Detection of joint features in volumetric wire-like shapes (fibres)

I am doing research on the analysis of fibres in steel-fibre-reinforced concrete, and after a few months working on an automatic approach based on the analysis of the Hessian matrix for each pixel, I find myself stuck.
The above-mentioned analysis works, but it doesn't take into account that my fibres (which you can see in the pictures) have hooked ends, and those hooks ruin the whole analysis of the orientation tensor.
Now, using the information I am already extracting pixel by pixel, I would like to try to identify the 6 points of interest for each fibre (that is, the beginning, the end, and the 4 points where it bends) and then proceed with model matching.
I have a volumetric dataset describing the concrete volume and all the fibres inside it, smoothed with a Gaussian (each fibre is about 400 pixels long and 8 pixels thick).
Do you have any ideas or hints that could speed up my attempt to localize those feature keypoints, which I could later use for space-indexed model matching against a model of the fibre?
