How to do YOLOv4 inference from weights using a GUI (user interface)

I have trained a YOLOv4 model. Now I want to run inference by developing a GUI (using OpenCV/Python or any other tool) where I can give an image as input and get the detections.
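One possible approach (a minimal sketch, not a polished application): load the trained weights with OpenCV's DNN module and use a Tkinter file dialog as the "GUI" for picking the input image. The file names yolov4.cfg, yolov4.weights and classes.txt below are placeholders for your own trained files.

# Minimal sketch: Tkinter file picker + OpenCV DNN inference for a trained YOLOv4 model.
# Requires OpenCV >= 4.4; file paths below are placeholders for your own files.
import cv2
from tkinter import Tk, filedialog

CFG_PATH = "yolov4.cfg"          # your Darknet config
WEIGHTS_PATH = "yolov4.weights"  # your trained weights
NAMES_PATH = "classes.txt"       # one class name per line

# Load the trained network and wrap it in a detection helper
net = cv2.dnn.readNetFromDarknet(CFG_PATH, WEIGHTS_PATH)
model = cv2.dnn_DetectionModel(net)
model.setInputParams(size=(416, 416), scale=1 / 255.0, swapRB=True)

with open(NAMES_PATH) as f:
    class_names = [line.strip() for line in f if line.strip()]

# Simple "GUI": a native file picker instead of a hand-built window
Tk().withdraw()
image_path = filedialog.askopenfilename(title="Select an image")
image = cv2.imread(image_path)

class_ids, scores, boxes = model.detect(image, confThreshold=0.4, nmsThreshold=0.4)
for class_id, score, box in zip(class_ids, scores, boxes):
    x, y, w, h = box
    label = f"{class_names[int(class_id)]}: {float(score):.2f}"
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.putText(image, label, (x, y - 5), cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)

cv2.imshow("YOLOv4 detections", image)
cv2.waitKey(0)
cv2.destroyAllWindows()

For a richer interface you could swap the file dialog for a small Tkinter or PyQt window, but the inference part stays the same.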

Related

How to map features from two different data sources using a regressor for classification?

I am trying to build a Keras model to implement the approach explained in this paper.
Context of my implementation:
I have two different kinds of data representing the same set of classes (labels) that need to be classified. The first kind is image data, and the second kind is EEG data (a time-series sequence).
I know that to classify image data we can use CNN models like this:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, Activation, Flatten, Dense, Dropout, BatchNormalization

model = Sequential()
# Example input size; adjust to your image dimensions
model.add(Conv2D(filters=256, kernel_size=(11, 11), strides=(1, 1), padding='valid', input_shape=(224, 224, 3)))
model.add(Activation('relu'))
model.add(Flatten())
model.add(Dense(1000))
model.add(Activation('relu'))
model.add(Dropout(0.4))
# Batch normalisation
model.add(BatchNormalization())
# Output layer (40 classes)
model.add(Dense(40))
model.add(Activation('softmax'))
And to classify sequence data we can use LSTM models like this:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Flatten, Dense

model = Sequential()
# input_shape = (timesteps, channels) of the EEG sequence; the values here are just an example
model.add(LSTM(units=50, return_sequences=True, input_shape=(440, 128)))
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(32, activation='relu'))
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(40, activation='softmax'))
But the approach of the paper above shows that EEG feature vectors can be mapped with image vectors through regression like this:
The first approach is to train a CNN to map images to corresponding
EEG feature vectors. Typically, the first layers of CNN attempt to
learn the general (global) features of the images, which are common
between many tasks, thus we initialize the weights of these layers
using pre-trained models, and then learn the weights of the last
layers from scratch in an end-to-end setting. In particular, we used
the pre-trained AlexNet CNN, and modified it by replacing the
softmax classification layer with a regression layer (containing as
many neurons as the dimensionality of the EEG feature vectors),
using Euclidean loss as the objective function.
The second approach consists of extracting image features using
pre-trained CNN models and then employ regression methods to map
image features to EEG feature vectors. We used our fine-tuned
AlexNet as feature extractors by
reading the output of the last fully connected layer, and then
applied several regression methods (namely, k-NN regression, ridge
regression, random forest regression) to obtain the predicted
feature vectors
I am not able to work out how to code the above two approaches. I have never used a regressor for feature mapping followed by classification. Any leads on this are much appreciated.
In my understanding the training data consists of (eeg_signal, image, class_label) triplets.
1. Train the LSTM model with input=eeg_signal and output=class_label. The loss is cross-entropy.
2. Peel off the last layer of the LSTM model. Say the pre-last layer's output is a vector of size 20; call it eeg_representation.
3. Run this truncated model on all your eeg_signal inputs and save the eeg_representation outputs. You will get a tensor of shape [batch, 20].
4. Take the AlexNet mentioned in the paper (or any other image classifier) and peel off its last layer. Say the pre-last layer's output is a vector of size 30; call it image_representation.
5. Stitch a linear layer onto the end of that truncated network. This layer converts image_representation to eeg_representation, so it has 30 × 20 weights.
6. Train the stitched model on (image, eeg_representation) pairs. The loss is the Euclidean distance.
7. And now the fun part: stitch together the model trained in step 6 and the layer peeled off from the model trained in step 1. If you input an image, you will get class predictions.
This sounds like no big deal (because we do image classification all the time), but if this really works, it means we have a "prediction that is running through our brains" :)
Thank you for bringing up this question and linking the paper.
I feel I just repeated what's in your question and in the paper.
It would be beneficial to have a toy dataset to provide full code examples; a rough sketch of the idea is given below.
Here's a TensorFlow tutorial on how to "peel off" the last layer of a pretrained image classification model.
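For concreteness, here is a minimal Keras sketch of steps 4-7, assuming the truncated EEG model has already produced eeg_representation vectors of size 20 for every training image. The variable names (images, eeg_representations) are placeholders, and ResNet50 stands in for AlexNet because Keras does not ship AlexNet weights.

from tensorflow.keras import layers, models, applications

EEG_DIM = 20      # size of eeg_representation (step 2)
NUM_CLASSES = 40

# Step 4: pre-trained image classifier with its classification layer peeled off
backbone = applications.ResNet50(weights="imagenet", include_top=False,
                                 pooling="avg", input_shape=(224, 224, 3))
backbone.trainable = False  # keep the general (global) features frozen

# Step 5: linear layer mapping image_representation -> eeg_representation
image_input = layers.Input(shape=(224, 224, 3))
image_repr = backbone(image_input)
eeg_pred = layers.Dense(EEG_DIM, name="to_eeg")(image_repr)
regressor = models.Model(image_input, eeg_pred)

# Step 6: train on (image, eeg_representation) pairs; MSE plays the role of the
# Euclidean loss mentioned in the paper
regressor.compile(optimizer="adam", loss="mse")
# regressor.fit(images, eeg_representations, epochs=10, batch_size=32)

# Step 7: stitch the regressor to the peeled-off classification layer of the
# LSTM model (re-created here as a fresh softmax layer purely for illustration)
class_probs = layers.Dense(NUM_CLASSES, activation="softmax")(eeg_pred)
image_to_class = models.Model(image_input, class_probs)
# image_to_class.predict(images) now yields class predictions from images alone

For the paper's second approach you would instead compute image features once with backbone.predict(images) and fit a separate regressor (for example sklearn.linear_model.Ridge) from those features to the eeg_representations.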

Is it possible to classify more than 1000 objects using the Inception model in TensorFlow?

Is it possible to classify more than 1000 objects using the Inception model in TensorFlow? I want to classify more than 1000 objects with a transfer learning model using TensorFlow image classification.
Popular image classification models can be viewed as a convolutional feature extractor with a classifier on top. The bottom part takes your [208, 208, 3] image and turns it into a column of 2048 features, [1, 1, 2048] (all numbers are just examples). Typically a softmax classifier follows. The classifier is a fully connected layer with a single neuron for each object class, so with 1000 classes it has 1000*(2048+1) parameters. Note that only the classifier depends on the number of classes.
When doing transfer learning, one typically discards the existing classifier layer and retrains it from scratch; if the feature extractor is trained as well, it is called fine-tuning. While retraining the classifier you can choose an arbitrary number of classes.
In short: you are free to do transfer learning with any new number of object classes.
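A minimal Keras sketch of this, assuming an InceptionV3 backbone and a made-up class count of 5000 just to show that the new head can have any size:

from tensorflow.keras import layers, models, applications

NUM_CLASSES = 5000  # more than the original 1000 ImageNet classes

# Convolutional feature extractor: pre-trained bottom with the classifier chopped off
base = applications.InceptionV3(weights="imagenet", include_top=False,
                                pooling="avg", input_shape=(299, 299, 3))
base.trainable = False  # plain transfer learning; set to True later for fine-tuning

# New classifier head: one neuron per object class
inputs = layers.Input(shape=(299, 299, 3))
features = base(inputs)  # a 2048-dimensional feature column
outputs = layers.Dense(NUM_CLASSES, activation="softmax")(features)
model = models.Model(inputs, outputs)

model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_dataset, epochs=5)  # train_dataset is your own labelled data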

Detecting Targets using Python

I am new to Python and I want to develop code to detect the following targets.
[image: target to detect]
[image: target to detect 2]
Any links would be appreciated.
You can use a Convolutional Neural Network (CNN) to classify a given image as one of the two targets.
There are plenty of tutorials on image classification with CNNs on the net.
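A minimal Keras sketch of such a two-class classifier; the directory layout data/target_a and data/target_b and the image size are purely illustrative:

import tensorflow as tf
from tensorflow.keras import layers, models

# Expects images sorted into data/target_a/ and data/target_b/
train_ds = tf.keras.utils.image_dataset_from_directory(
    "data", image_size=(128, 128), batch_size=32)

model = models.Sequential([
    layers.Rescaling(1.0 / 255, input_shape=(128, 128, 3)),
    layers.Conv2D(16, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(2, activation="softmax"),  # one output per target class
])

model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_ds, epochs=10)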

Can we use a model trained with image classification to help in object detection in tensorflow?

I have used Tensorflow-for-poets to build an image classification model. However, I now want to use the trained model in an object detection model. Can I just import the .pb files directly or do I have to retrain the model?
I am getting this error when I try it
KeyError: "The name 'image_tensor:0' refers to a Tensor which does not exist. The operation, 'image_tensor', does not exist in the graph."
You cannot directly use the .pb model produced by image classification to perform object detection. You will have to obtain an object detection model, train it, and then use it to detect. There are pretrained object detection models in the TensorFlow object detection model zoo.
Detailed answer below:
Image classification and object detection are two different but very closely related tasks. In fact, Ross Girshick asked a similar question in the famous R-CNN paper:
To what extent do the CNN classification results on ImageNet generalize to object detection results on the PASCAL VOC Challenge?
This question basically means that an image classification model can be used to help object detection, but some more steps are needed. So you cannot just directly use a classification network to do an object detection task. (The error you got is something different; you could find the correct tensor name and fix it, but it simply does not make sense to use a classification network for object detection that way.)
There is a naive way to combine the two: slide a window of various sizes across the image and run the classifier on each window; this can perform object detection (a rough sketch follows below).
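A rough sketch of that sliding-window idea, assuming model is any trained Keras image classifier; the window sizes, stride and threshold are illustrative, not tuned values:

import numpy as np
import tensorflow as tf

def sliding_window_detect(image, model, window_sizes=(64, 128), stride=32,
                          input_size=(128, 128), threshold=0.9):
    """Run the classifier over windows of several sizes and keep confident hits."""
    detections = []
    h, w = image.shape[:2]
    for win in window_sizes:
        for y in range(0, h - win + 1, stride):
            for x in range(0, w - win + 1, stride):
                patch = image[y:y + win, x:x + win]
                patch = tf.image.resize(patch, input_size) / 255.0
                probs = model.predict(patch[None], verbose=0)[0]
                cls = int(np.argmax(probs))
                if probs[cls] >= threshold:
                    detections.append((x, y, win, win, cls, float(probs[cls])))
    # In practice you would apply non-maximum suppression to merge overlapping boxes
    return detections

This is slow and crude compared to a real detector, which is exactly why integrated architectures exist.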
Another solution is integrated. To give an example, Faster R-CNN is an object detection network which uses VGG as its feature extractor (in the original paper). Here VGG is an image classification network, pretrained on an image classification task.

What are the algorithms used behind filters in image editing softwares?

For example: What algorithm is used to generate the image by the fresco filter in Adobe Photoshop?
Do you know some place where I can read about the algorithms implemented in these filters?
Lode's Computer Graphics Tutorial
The source code for GIMP would be a good place to start. If the code for some filter doesn't make sense, at least you'll find jargon in the code and comments that can be googled.
The Photoshop algorithms can get very complex, and beyond simple blurring and sharpening, each one is a topic unto itself.
For the fresco filter, you might want to start with an SO question on how to cartoon-ify an image.
I'd love to read a collection of the more interesting algorithms, but I don't know of such a compilation.
Digital image processing is the use of computer algorithms to perform image processing on digital images. As a subcategory or field of digital signal processing, digital image processing has many advantages over analog image processing. It allows a much wider range of algorithms to be applied to the input data and can avoid problems such as the build-up of noise and signal distortion during processing. Since images are defined over two dimensions (perhaps more) digital image processing may be modeled in the form of multidimensional systems.
Digital image processing allows the use of much more complex algorithms, and hence, can offer both more sophisticated performance at simple tasks, and the implementation of methods which would be impossible by analog means.
In particular, digital image processing is the only practical technology for:
Classification
Feature extraction
Pattern recognition
Projection
Multi-scale signal analysis
Some techniques which are used in digital image processing include:
Pixelation
Linear filtering
Principal components analysis
Independent component analysis
Hidden Markov models
Anisotropic diffusion
Partial differential equations
Self-organizing maps
Neural networks
Wavelets
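To make one of the listed techniques concrete, here is a minimal sketch of linear filtering with OpenCV: convolving the image with a small sharpening kernel. The kernel values are just one common example, and input.jpg is a placeholder file name.

import cv2
import numpy as np

image = cv2.imread("input.jpg")

# A common 3x3 sharpening kernel; any other kernel (box blur, edge detector, ...)
# works the same way, which is what makes linear filtering such a general tool.
sharpen_kernel = np.array([[ 0, -1,  0],
                           [-1,  5, -1],
                           [ 0, -1,  0]], dtype=np.float32)

# filter2D applies the same linear kernel at every pixel (a 2D convolution)
sharpened = cv2.filter2D(image, -1, sharpen_kernel)
cv2.imwrite("sharpened.jpg", sharpened)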

Resources