Assume an ANN has been trained on 1GB of training data, which took a long time.
Do we need to retrain the network when just a few rows of the training data change?
Or is that a flaw in the design of the ANN?
In general, the answer would be yes. Here's why...
First of all, 1GB of training data is a relative measure, since it gives no indication of the number of training samples. Perhaps each training sample is 1MB in size (maybe an image), leaving you with only about 1,000 samples which may not be enough.
Secondly, it is important to know the architecture of the neural network in order to address the question of retraining fully. If the updated components of your training set correspond to nodes that are heavily influenced during use, then a retrain is most certainly in order. Of course, the converse is not necessarily true, as it may not be immediately apparent how the interconnectedness of the neural network is influenced by a change in the input.
Thirdly, a neural network is meant to represent a type of complex pattern-matcher, trained to recognize some input relationship, and produce a possible output relationship. From this naive point of view, a change in input should surely correspond to a change in output. As such, a change in the training data may very well correspond to a change in the expected output data. And, even if it doesn't, the input pattern has changed, which may imply that the output pattern also changed.
Let's consider the following example. Assume your neural network is trained to match paintings with artists, and it has been successful in making the following match to Leonardo da Vinci:
Now it may be trained well enough to also assert that the following images are "painted" by the same artist:
This may be because you trained your neural network on your favourite pastime of balloon and Lego figurines. However, now a number of your input samples change, specifically those associated with the Mona Lisa. Instead they resemble your new favourite pastime... freehand mouse-drawing:
Despite what you may say, in general the artistry of the above image doesn't really match that of the earlier ones. As such, your pattern-matcher may not appropriately recognize it as a piece of art made by Leonardo da Vinci. So it's fair to say that retraining on images of this sort would be in order.
You probably have a number of options:
Test how effective it is to retrain your neural network given the change in training data. That would allow you to answer the question yourself, and give some insight into the architecture of your neural network.
Retrain your neural network on just the changes, which could be considered new training data. The value of this might depend heavily on your architecture, the complexity of the samples as well as the training quantity (not size).
Perform a complete retrain and test the efficacy.
I saw that, while using the conv2d function of Theano, the filters were flipped both vertically and horizontally. Why is that? And does it matter in the case of a Convolutional Neural Network?
Because this is how convolution is defined mathematically. Without flipping the filter, the operation is called cross-correlation. The advantage of convolution is that it has nicer mathematical properties.
However in the context of Convolutional Neural Network it doesn't matter whether you use convolution or cross-correlation, they are equivalent. This is because the weights of filters are learned during the training, i.e. they are updated to minimize a cost function. In a CNN that uses the cross-correlation operation, the learned filters will be equal to the flipped learned filters of a CNN that uses the convolution operation (assuming exactly the same training conditions for both, i.e. same initialization, inputs, number of epochs etc.). So the outputs of such two CNNs will be the same for the same inputs.
Cross-correlation operation is slightly more intuitive and simpler to implement (because no flipping is performed) and that's probably the reason why other frameworks like Tensorflow and Pytorch use it instead of the actual convolution (they still call it convolution though, probably due to historical reasons or to be consistent in terminology with other frameworks that use the actual convolution).
Just to add to the posts above: even though we say we are using the "convolution" operation in CNNs, we actually use cross-correlation. Convolution and correlation produce different outcomes, and I did this exercise to actually see the difference.
When I researched this topic further, I found that the very first inception of CNNs is considered to originate from the Neocognitron paper, which is known to use a true convolution in its kernel operation. Later implementations of CNNs and most deep learning libraries use correlation filtering instead of convolutional filtering, but we kept the name Convolutional Neural Networks, as the complexity and performance of the algorithms remain almost the same.
If you want a detailed article and intuition on how both differ, please take a look at this: https://towardsdatascience.com/does-cnn-represent-convolutional-neural-networks-or-correlational-neural-networks-76c1625c14bd
Let me know if this helps. :)
I'm doing research to implement automatic segmentation based on MRI (Magnetic Resonance Imaging) modalities. In my case, the prostate region is the focus. To make this happen, I'm thinking about these steps: 1. Image acquisition (about 20+ patients' MRIs in DICOM format; each patient has around 15-30 slice images, and all these images will be the dataset for training; you can see an example of the dataset below)
From this dataset, I'm thinking of doing manual segmentation with the purpose of extracting the region of the prostate (the size of the prostate in each slice is not consistent), so I can get features of the prostate at any size, as you can see below. The green one is the central zone of the prostate, and the red one is the peripheral zone.
So now I have a feature dataset for all slices, and I'm ready to train on it to create a classifier model.
As I'm still green with MATLAB (sorry for this), I have no idea how to train the dataset to create a classifier that can detect the region of the prostate (at any size) and automatically draw a boundary around it. Should I use a classifier plus a segmentation algorithm (level set/active contour), or can a classifier algorithm alone get this done?
I'm learning about object detection algorithms such as Haar-like features, but I can't get all of this clear (yes, I'm screwed). I would be very grateful if anyone could help me form a clear idea and guide me to make this happen, please.
Thank you very much.
I imagine that you could get good segmentation using superpixels (SLIC is a good approach for generating these), especially so since your images do not seem particularly complex. A common approach for building a classifier using superpixels is to train a CRF to learn the correct segmentation. This is a fairly common approach in computer vision and looks like it should do well given your data. Further, there are good implementations of both of these approaches in Matlab.
What does it mean to train data (a set of images) in computer vision, and how is it done?
What are classifiers?
In machine learning, you have some sort of learning algorithm. A learning algorithm takes some data, called the training set, and produces a model of the world from that. So suppose you had a computer vision learning algorithm that attempted to classify images into two categories: it's a picture of a face, or it's not a picture of face.
To keep things simple, just assume the learning algorithm is very stupid - you feed it pictures of faces marked "face" and pictures of things that don't have faces marked "not a face." Our dumb algorithm then just calculates the average light intensity of pictures marked "face" and produces a model that says "if the average intensity of a picture is closer to the average intensity of pictures I previously saw marked "face" than pictures I previously saw that were marked "not a face", then predict that the picture I'm being shown is a face."
The pictures you showed it in order to calculate the average light intensity, both those marked "face" and those marked "not a face", are the training set.
The training set is contrasted with the testing set. It's not very impressive for an algorithm to produce a model that tells you a picture is of a face if you've already shown it that picture before. So usually when you have data you "hold out" (or set aside) a small portion to evaluate how good the model actually is.
The basic process of a machine learning task breaks down as follows:
Obtain, prepare, and format data so that it can be used as input for a model.
Through random choice, set aside some portion of the data to be designated as the "training set" and the other as the "testing set." Usually like 10% or so is set aside for testing.
Apply the learning algorithm to the training set.
The output of a learning algorithm is a model which accepts input and produces some output given that input. In a computer vision context, the input is almost always going to be a picture, but the output could be a classification, perhaps a 3-D map, maybe it finds certain things in the picture; it depends on what you're trying to do.
Determine the accuracy of the model by feeding it the data it's never seen before (i.e. the testing set) and compare the output of the model with what you know the output should be.
You use training data to build a classifier.
How is it done exactly? It varies. In a nutshell, you need some sort of distance measure, a simple rule to compare your samples against each other. You also need a decision rule that decides whether a given object belongs to class A or class B, say.
The trick is to find a distance measure that is very simple but varies a lot across images of different classes. For making decisions you can rely on information theory and select any of the available techniques, such as SVMs or random forests (RF).
I'm not entirely sure this is the correct stack exchange subsite to post this question to, but...
I'm looking for an algorithm that I can use to determine with a decent amount of certainty if a given piece of audio is music or not. Just a boolean result is fine, I don't need to know the key, bpm or anything like that, I just need to be able to determine if it appears to be music (as opposed to speech). Programming language is irrelevant, but I'll end up converting it to Python.
In a phrase, Fourier analysis. Look at the power of different frequencies over time. Here's speech, and here's violin playing. The former shows dramatic changes with every syllable; the 'flow' is very disjoint and could be picked up by an algorithm which took the derivative of the different frequency bands as a function of time. In paradigmatic music, on the other hand, the transitions are much smoother and the tones are purer (less 'blur' in the graph). See also the 'spectrogram' wikipedia page.
What you could do is set up a few Karplus-Strong resonance rings, run the audio through them, and just monitor the level of energy in each ring.
If it is Western music, it is pretty much all tuned to 12-TET, i.e. a logarithmic 12-tone scale based around concert pitch A4 = 440 Hz.
So just pick 3 or 4 notes equally spaced through the octave, e.g. C5 (omit C#, D, D#), E5 (omit F, F#, G), G#5 (omit A, A#, B),
and at least one of those rings will be flaring regularly; whichever key the music is in, it's probably going to hit one of those notes quite a lot.
Ideally do it for a bunch of notes, but if you need this in real time it can get a bit heavy, feeding your audio simultaneously into 50 rings.
Alternatively you could even use a pitch detector, catalogue the recorded pitches, and look at ratios of log(noteAfreq):log(noteBfreq) to see whether they arrange themselves into low-order fractions like 3:4 ± 0.5%. But I don't think anyone has built a decent polyphonic pitch detector; it is practically impossible.
Melodyne might have pulled it off.
If it's just a vocal signal you can e-mail me.
For some reason this question has attracted a large number of really bad answers.
Use pyAudioAnalysis. Also, Google "audio feature analysis".
On its surface, this sounds like a hard problem, but there's been an explosion of great work on classifiers in the past 20 years, so many well-documented solutions exist. Most classifiers today can figure this out with an error rate of only a few percent. Some can even figure out what genre of music it is.
Most current algorithms for doing this break down into detecting a bunch of statistical representations of the input audio (features), and then doing a set of automatic classifications of the inputs based on previous training data.
pyAudioAnalysis is one library for extracting these features and then training a kNN or other mixed model based on the detected features. There are many more comparable libraries, such as Essentia for C++. Essentia also has Python bindings.
An Introduction to Audio Content Analysis: Applications in Signal Processing and Music Informatics is a good introductory book.
Look for a small "first differential" over a sequence of FFTs that are in the range of musical tones (i.e. 1024 samples per chunk run through an FFT, then plot chunk1-chunk0, chunk2-chunk1, ...). As a first approximation, this should be enough to detect simple things.
This is the sort of algorithm that could be tweaked forever, even in genre-specific ways. Music itself is generally periodic as well, so it may be worth coming up with a way to run FFTs over the FFTs. And the idea of looking for a consistent twelfth-root-of-two spread of outstanding frequencies sounds really plausible.
I bet you were hoping to find this sitting in a free Python library, ready for you to simply drop a file into. :-)
If I have a large set of data that describes physical 'things', how could I go about measuring how well that data fits the 'things' that it is supposed to represent?
An example would be if I have a crate holding 12 widgets, and I know each widget weighs 1 lb, there should be some data quality 'check' making sure the case weighs 13 lbs maybe.
Another example would be that if I have a lamp and an image representing that lamp, it should look like a lamp. Perhaps the image dimensions should have the same ratio of the lamp dimensions.
With the exception of images, my data is 99% text (which includes height, width, color...).
I've studied AI in school, but have done very little outside of that.
Are standard AI techniques the way to go? If so, how do I map a problem to an algorithm?
Are some languages easier at this than others? Do they have better libraries?
Thanks.
Your question is somewhat open-ended, but it sounds like what you want is what is known as a "classifier" in the field of machine learning.
In general, a classifier takes a piece of input and "classifies" it, i.e. determines a category for the object. Many classifiers provide a probability with this determination, and some may even return multiple categories with probabilities on each.
Some examples of classifiers are Bayes nets, neural nets, decision lists, and decision trees. Bayes nets are often used for spam classification: emails are classified as either "spam" or "not spam" with a probability.
For your question, you'd want to classify your objects as "high quality" or "not high quality".
The first thing you'll need is a bunch of training data. That is, a set of objects where you already know the correct classification. One way to obtain this could be to get a bunch of objects and classify them by hand. If there are too many objects for one person to classify you could feed them to Mechanical Turk.
Once you have your training data you'd then build your classifier. You'll need to figure out what attributes are important to your classification. You'll probably need to do some experimentation to see what works well. You then have your classifier learn from your training data.
One approach that's often used for testing is to split your training data into two sets. Train your classifier using one of the subsets, and then see how well it classifies the other (usually smaller) subset.
AI is one path, natural intelligence is another.
Your challenge is a perfect match to Amazon's Mechanical Turk. Divvy your data space up into extremely small verifiable atoms and assign them as HITs on Mechanical Turk. Have some overlap to give yourself a sense of HIT answer consistency.
There was a shop with a boatload of component CAD drawings that needed to be grouped by similarity. They broke it up and set it loose on Mechanical Turk to very satisfying results. I could google for hours and not find that link again.
This is a tough one. For example, what defines a lamp? I could Google Images a picture of some crazy-looking lamps, or even look up the definition of a lamp (http://dictionary.reference.com/dic?q=lamp). There are no physical requirements for what a lamp must look like. That's the crux of the AI problem.
As for the data, you could set up unit testing on the project to ensure that 12 widget()s weigh less than 13 lbs in the widgetBox(). Regardless, you need to have the data at hand to be able to test things like that.
I hope I was able to answer your question somewhat. It's a bit vague, and my answers are broad, but hopefully it'll at least send you in a good direction.