Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 7 years ago.
What does it mean to train on data (a set of images) in computer vision, and how is it done?
What are classifiers?
In machine learning, you have some sort of learning algorithm. A learning algorithm takes some data, called the training set, and produces a model of the world from that. So suppose you had a computer vision learning algorithm that attempted to classify images into two categories: it's a picture of a face, or it's not a picture of a face.
To keep things simple, just assume the learning algorithm is very stupid: you feed it pictures of faces marked "face" and pictures of things that don't have faces marked "not a face." Our dumb algorithm then just calculates the average light intensity of the pictures in each group and produces a model that says: "if the average intensity of a picture is closer to the average intensity of the pictures I previously saw marked 'face' than to that of the pictures I previously saw marked 'not a face', then predict that the picture I'm being shown is a face."
The pictures you had to show it to calculate the average light intensity of images marked "face" and of images marked "not a face" are the training set.
The training set is contrasted with the testing set. It's not very impressive for an algorithm to produce a model that tells you a picture is of a face if you've already shown it that picture before. So usually when you have data you "hold out" (or set aside) a small portion to evaluate how good the model actually is.
The basic process of a machine learning task breaks down as follows:
Obtain, prepare, and format data so that it can be used as input for a model.
Through random choice, split the data into a "training set" and a "testing set." Typically around 10% of the data is set aside for testing.
Apply the learning algorithm to the training set.
The output of a learning algorithm is a model which accepts input and produces some output given that input. In a computer vision context, the input is almost always going to be a picture, but the output could be a classification, perhaps a 3-D map, maybe it finds certain things in the picture; it depends on what you're trying to do.
Determine the accuracy of the model by feeding it data it has never seen before (i.e. the testing set) and comparing the output of the model with what you know the output should be.
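To make the steps above concrete, here is a minimal sketch of the "dumb" average-intensity classifier with a random 90/10 split. The images are synthetic lists of pixel values (an assumption made purely for illustration: "face" images are generated brighter on average), and the helper names `train`, `predict` and `make_image` are hypothetical, not from any library.

```python
import random

def mean_intensity(image):
    """Average pixel value of an image given as a flat list of numbers."""
    return sum(image) / len(image)

def train(examples):
    """Learn the average intensity of each label from (image, label) pairs."""
    sums, counts = {}, {}
    for image, label in examples:
        sums[label] = sums.get(label, 0.0) + mean_intensity(image)
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def predict(model, image):
    """Pick the label whose learned average is closest to this image's average."""
    value = mean_intensity(image)
    return min(model, key=lambda label: abs(model[label] - value))

def make_image(rng, label):
    # Synthetic stand-in for real pictures (assumption): "face" images are brighter.
    centre = 170 if label == "face" else 90
    return [min(255, max(0, rng.gauss(centre, 40))) for _ in range(64)]

rng = random.Random(0)
data = [(make_image(rng, label), label)
        for label in ["face", "not a face"] for _ in range(100)]

# Step 2: random split, 90% for training and 10% held out for testing.
rng.shuffle(data)
split = int(0.9 * len(data))
training_set, testing_set = data[:split], data[split:]

# Steps 3-4: apply the learning algorithm; the output is the model.
model = train(training_set)

# Step 5: measure accuracy only on images the model has never seen.
correct = sum(predict(model, image) == label for image, label in testing_set)
accuracy = correct / len(testing_set)
print(f"accuracy on held-out images: {accuracy:.2f}")
```

A real learning algorithm would replace `train` and `predict` with something far smarter, but the split/train/evaluate skeleton stays the same.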
You use training data to build a classifier.
How exactly is it done? It varies. In a nutshell you need some sort of distance measure, a simple rule, to compare your samples against each other. You also need a rule for making decisions, one that decides whether this object belongs to class A or class B, let's say.
The trick is to find a distance measure that is very simple but varies a lot across images of different classes. For making decisions you can select any of the established techniques, such as support vector machines (SVM) or random forests (RF).
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
I'm doing research on implementing automatic segmentation based on MRI (Magnetic Resonance Imaging) modalities. In my case, the prostate region is the focus. To make this happen, I'm thinking about these steps: 1. Image acquisition (MRI in DICOM format from 20+ patients; each patient has around 15-30 slice images, and all these images will be the dataset for training; you can see an example of the dataset below).
On this dataset, I'm planning to do manual segmentation in order to get the region of the prostate (the size of the prostate is not consistent from slice to slice), so that I can get prostate features for any size, as you can see below. The green region is the central zone of the prostate, and the red region is the peripheral zone.
So now I have a feature dataset for all slices, and I'm ready to train on it to create a classifier model.
?
As I'm still new to MATLAB (sorry for this), I have no idea how to train on the dataset to create a classifier that can detect the region of the prostate (at any size) and automatically draw a boundary around it. Should I use a classifier plus a segmentation algorithm (level set / active contour) to get this done, or can a classifier algorithm alone do it?
I'm learning about object detection algorithms such as Haar-like features, but I can't get all of this clear (yes, I'm screwed). I would be very grateful if anyone could give me a clear idea and guide me to make this happen.
Thank you very much.
I imagine that you could get good segmentation using superpixels (SLIC is a good approach for generating these), especially since your images do not seem particularly complex. A common approach for building a classifier using superpixels is to train a CRF (conditional random field) to learn the correct segmentation. This is a fairly common approach in computer vision and looks like it should do well given your data. Further, there are good implementations of both of these approaches in MATLAB.
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 7 years ago.
Assume an ANN has been trained on 1 GB of training data, which took a long time.
Do we need to train on the data again when just a few rows in the training data have changed?
Or is that a flaw in the design of the ANN?
In general, the answer would be yes. Here's why...
First of all, 1GB of training data is a relative measure, since it gives no indication of the number of training samples. Perhaps each training sample is 1MB in size (maybe an image), leaving you with only about 1,000 samples which may not be enough.
Secondly, it is important to know the architecture of the neural network in order to address the question of retraining fully. If the components of your training set that were updated correspond to nodes that are heavily used, then a retrain is most certainly in order. Of course, the converse is not necessarily true, as it may not be immediately apparent how the interconnectedness of the neural network is influenced by a change in the input.
Thirdly, a neural network is meant to represent a type of complex pattern-matcher, trained to recognize some input relationship, and produce a possible output relationship. From this naive point of view, a change in input should surely correspond to a change in output. As such, a change in the training data may very well correspond to a change in the expected output data. And, even if it doesn't, the input pattern has changed, which may imply that the output pattern also changed.
Let's consider the following example. Assume your neural network is trained to match paintings with artists and it has been successful in making the following match to Leonardo da Vinci:
Now it may be trained well enough to also assert that the following images are "painted" by the same artist:
This may be because you trained your neural network on your favourite pastime of balloon and Lego figurines. However, now a number of your input samples change, specifically those associated with the Mona Lisa. Instead they resemble your new favourite pastime... freehand mouse-drawing:
Despite what you say, in general the artistry of the above image doesn't really match that of the earlier ones. As such, your pattern-matcher may not appropriately recognize this as a piece of art made by Leonardo da Vinci. So it's fair to say that retraining it on images of this sort should be in order.
You probably have a number of options:
Test how effective it is to retrain your neural network given the change in training data. That would allow you to answer the question yourself, and give some insight into the architecture of your neural network.
Retrain your neural network on just the changes, which could be considered new training data. The value of this might depend heavily on your architecture, the complexity of the samples as well as the training quantity (not size).
Perform a complete retrain and test the efficacy.
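The three options can be compared empirically. The sketch below is only an illustration under loud assumptions: it uses a tiny logistic-regression model instead of a real neural network, the data is synthetic (label is 1 when `x1 + x2 > 0`), and the "changed" rows are simply five freshly drawn rows. All function names are hypothetical. The point is the shape of the experiment, not the numbers.

```python
import math
import random

def sigmoid(z):
    # Clamp z so math.exp cannot overflow on extreme inputs.
    return 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, z))))

def train(rows, weights=None, epochs=30, lr=0.1):
    """Logistic regression trained by SGD; pass `weights` to continue from an old model."""
    w = list(weights) if weights is not None else [0.0, 0.0, 0.0]  # bias, w1, w2
    for _ in range(epochs):
        for (x1, x2), y in rows:
            error = y - sigmoid(w[0] + w[1] * x1 + w[2] * x2)
            w[0] += lr * error
            w[1] += lr * error * x1
            w[2] += lr * error * x2
    return w

def accuracy(w, rows):
    """Fraction of rows whose predicted class matches the label."""
    hits = sum((w[0] + w[1] * x1 + w[2] * x2 > 0) == (y == 1)
               for (x1, x2), y in rows)
    return hits / len(rows)

def make_rows(rng, n):
    # Hypothetical stand-in data: the label is 1 when x1 + x2 is positive.
    rows = []
    for _ in range(n):
        x1, x2 = rng.uniform(-1, 1), rng.uniform(-1, 1)
        rows.append(((x1, x2), 1 if x1 + x2 > 0 else 0))
    return rows

rng = random.Random(1)
original = make_rows(rng, 500)
held_out = make_rows(rng, 200)          # never used for training
changed_rows = make_rows(rng, 5)        # the "few rows" that changed
updated = changed_rows + original[5:]   # training data after the change

old_model = train(original)
results = {
    "old model, no retraining": accuracy(old_model, held_out),
    "old model + training on changes only":
        accuracy(train(changed_rows, weights=old_model), held_out),
    "full retrain on updated data": accuracy(train(updated), held_out),
}
for name, value in results.items():
    print(f"{name}: {value:.3f}")
```

If the three numbers are close on your own held-out data, the change probably did not matter much; if they diverge, that tells you which option is worth the training time.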
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 8 years ago.
How can I distinguish between two different users, for example two neighbours who live at the same address and go to the same office, but who have different driving patterns and different office schedules? I want to find the probability that two people behave more or less identically. Depending on the resolution of the map, I want to figure out where they are and how often they are there. Can I turn each driver's pattern into a signature from which their identity can be traced?
I assume, by the way that you asked your question, that you haven't had any plausible ideas yet. So I'll make an answer which is purely based on an idea that you might like to try out.
I initially thought of suggesting something along the lines of word-similarity metrics, but because order is not necessarily important here, maybe it's worth trying something simpler to start. In fact, if I ever find myself considering something complex when developing a model, I take a step back and try to simplify. It's quicker to code, and you don't get so attached to something that's a dead end.
So, how about histograms? If you divide up time and space into larger blocks, you can increment a value in the relevant location for each time interval. You get a 2D histogram of a person's location. You can use basic anti-aliasing to make the histograms more representative.
From there, it's down to histogram comparison. You could implement something really basic using only 1D strips. You know, like summing the similarity measure for each of the vertical and horizontal strips. Linear histogram comparison is super-easy, and just a few lines of code in a language like C. Good enough for a proof of concept. If it feels like you're on the right track, then start looking for trickier ideas...
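Here is a rough sketch of the histogram idea in Python. It is one possible reading of the "1D strips" suggestion, not the only one: each 2D location histogram is collapsed into its row totals and column totals, and the two pairs of 1D histograms are compared with histogram intersection. The grid size, the 100 m cell size, and the sample coordinates are all made up for illustration.

```python
def location_histogram(samples, grid_size, cell):
    """Count how many (x, y) samples fall in each cell of a square grid."""
    grid = [[0] * grid_size for _ in range(grid_size)]
    for x, y in samples:
        col = min(grid_size - 1, max(0, int(x // cell)))
        row = min(grid_size - 1, max(0, int(y // cell)))
        grid[row][col] += 1
    return grid

def normalise(values):
    """Scale a histogram so its bins sum to 1 (left unchanged if empty)."""
    total = sum(values)
    return [v / total for v in values] if total else list(values)

def intersection(a, b):
    """Histogram intersection of two 1D histograms: 1.0 identical, 0.0 disjoint."""
    return sum(min(p, q) for p, q in zip(normalise(a), normalise(b)))

def strip_similarity(grid_a, grid_b):
    """Average the similarity of the horizontal and vertical strip totals."""
    rows_a, rows_b = [sum(r) for r in grid_a], [sum(r) for r in grid_b]
    cols_a, cols_b = [sum(c) for c in zip(*grid_a)], [sum(c) for c in zip(*grid_b)]
    return 0.5 * (intersection(rows_a, rows_b) + intersection(cols_a, cols_b))

# One sample per hour over a day; coordinates in metres (hypothetical data).
home, office, gym, elsewhere = (120, 130), (910, 820), (510, 220), (700, 300)
alice = [home] * 14 + [office] * 10
bob = [home] * 10 + [office] * 8 + [gym] * 6   # same home and office, plus the gym
carol = [elsewhere] * 24

grids = {name: location_histogram(samples, grid_size=10, cell=100)
         for name, samples in [("alice", alice), ("bob", bob), ("carol", carol)]}

sim_alice_bob = strip_similarity(grids["alice"], grids["bob"])
sim_alice_carol = strip_similarity(grids["alice"], grids["carol"])
print(f"alice vs bob:   {sim_alice_bob:.2f}")
print(f"alice vs carol: {sim_alice_carol:.2f}")
```

Two neighbours with the same home and office still come out less than identical as soon as one of them spends time somewhere the other does not, which is the separation you are after.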
The next thing I'd do is further stratify my data, using days of the week and statutory holidays... Maybe even stratify further using seasonal variables. I've found it pretty effective for forecasting electricity load, which is as much about social patterns as it is about weather. The trends become much more distinct when you separate an influencing variable.
So, after stratification you get a stack of 2D 'slices', and your signature becomes a kind of 3D volume. I see nothing wrong with representing the entire planet as a grid, whether your squares represent 100 m or 1 km. It's easy to store this sparsely and prune out anything that's outside some number of standard deviations. You might choose only the most major events for the day and end up with a handful of locations.
You can then focus on the comparison metric. Maybe some kind of image-based gradient or cluster analysis. I'm sure there's loads of really great stuff out there. These are just the kinds of starting points I make, having done no research.
If you need to add some temporal information to introduce separation between people with very similar lives, you can maybe build some lags into the system... Such as "where they were an hour ago". At that point (or possibly before), you probably want to switch from my over-simplified approach of averaging out a person's daily activities, and instead use something like classification trees. This kind of thing is very easy and rapid to develop with a tool like MATLAB or R.
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
I'm not entirely sure this is the correct stack exchange subsite to post this question to, but...
I'm looking for an algorithm that I can use to determine with a decent amount of certainty if a given piece of audio is music or not. Just a boolean result is fine, I don't need to know the key, bpm or anything like that, I just need to be able to determine if it appears to be music (as opposed to speech). Programming language is irrelevant, but I'll end up converting it to Python.
In a phrase, Fourier analysis. Look at the power of different frequencies over time. Here's speech, and here's violin playing. The former shows dramatic changes with every syllable; the 'flow' is very disjoint and could be picked up by an algorithm which took the derivative of the different frequency bands as a function of time. In paradigmatic music, on the other hand, the transitions are much smoother and the tones are purer (less 'blur' in the graph). See also the 'spectrogram' wikipedia page.
What you could do is set up a few Karplus-Strong resonance rings, throw the audio through them, and just monitor the level of energy in each ring.
If it is Western music, it is pretty much all tuned to 12-TET, i.e. a logarithmic 12-tone scale based around concert pitch A4 = 440 Hz.
so just pick 3 or 4 notes equally spaced through the octave eg C5, (omit C# D D#) E5 (omit F F# G) G#5 (omit A A# B)
and at least one of those rings will be flaring regularly -- whichever key the music is in, it's probably going to hit one of those notes quite a lot
ideally do it for a bunch of notes, but if you need this real-time it can get a bit heavy feeding your audio simultaneously into 50 rings
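For reference, the 12-TET frequencies of the notes mentioned above follow directly from A4 = 440 Hz and a step of the twelfth root of two per semitone. This small sketch only computes the target frequencies and how far a measured frequency sits from the nearest 12-TET note; it does not implement the Karplus-Strong rings themselves, and the function names are made up for illustration.

```python
import math

A4_HZ = 440.0
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def note_frequency(name, octave):
    """Frequency in Hz of a note in 12-TET, with A4 tuned to 440 Hz."""
    semitones_from_a4 = (NOTE_NAMES.index(name) - NOTE_NAMES.index("A")
                         + 12 * (octave - 4))
    return A4_HZ * 2 ** (semitones_from_a4 / 12)

def cents_off_nearest_note(freq_hz):
    """Signed distance (in cents) from freq_hz to the nearest 12-TET note."""
    semitones = 12 * math.log2(freq_hz / A4_HZ)
    return 100 * (semitones - round(semitones))

# The three equally spaced notes suggested for the resonance rings.
ring_notes = {f"{name}{octave}": note_frequency(name, octave)
              for name, octave in [("C", 5), ("E", 5), ("G#", 5)]}
for label, freq in ring_notes.items():
    print(f"{label}: {freq:.2f} Hz")

print(f"450 Hz is {cents_off_nearest_note(450.0):+.1f} cents from the nearest note")
```

Consecutive notes in that set are four semitones apart, so each frequency is the previous one times 2^(4/12), roughly 1.26.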
Alternatively, you could even use a pitch detector, catalogue the recorded pitches, and look at the ratios noteAfreq:noteBfreq (equivalently, the differences log(noteAfreq) - log(noteBfreq)) to see whether they arrange themselves into low-order fractions like 3:4 ± 0.5%. But I don't think anyone has built a decent polyphonic pitch detector; it is practically impossible.
Melodyne might have pulled it off
If it's just a vocal signal you can e-mail me.
For some reason this question has attracted a large number of really bad answers.
Use pyAudioAnalysis. Also, Google "audio feature analysis".
On its surface, this sounds like a hard problem, but there's been an explosion of great work on classifiers in the past 20 years, so many well-documented solutions exist. Most classifiers today usually can figure this out with an error rate of only a few percent. Some classifiers can even figure out what genre of music it is.
Most current algorithms for doing this break down into detecting a bunch of statistical representations of the input audio (features), and then doing a set of automatic classifications of the inputs based on previous training data.
pyAudioAnalysis is one library for extracting these features and then training a kNN or other mixed model based on the detected features. There are many more comparable libraries, such as Essentia for C++. Essentia also has Python bindings.
An Introduction to Audio Content Analysis: Applications in Signal Processing and Music Informatics is a good introductory book.
Look for a small "First differential" over a sequence of FFTs that are in the range of music tones (ie: 1024 samples per chunk run through FFT, then plot chunk1-chunk0,chunk2-chunk1,...). As a first approximation, this should be enough to detect simple things.
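A bare-bones version of that "first differential" idea might look like the sketch below. To stay dependency-free it uses a naive DFT from the standard library and 64-sample chunks instead of 1024 (an assumption made only to keep the demo fast; with NumPy you would use a real FFT). The two test signals are synthetic: a steady tone and uniform noise. A real speech/music decision would still need a threshold tuned on actual recordings.

```python
import cmath
import math
import random

def magnitude_spectrum(chunk):
    """Naive DFT magnitudes for the lower half of the bins (O(n^2), demo only)."""
    n = len(chunk)
    return [abs(sum(chunk[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))) / n
            for k in range(n // 2)]

def mean_spectral_change(samples, chunk_size=64):
    """Average of |chunk[i+1] - chunk[i]| summed over bins, across the signal."""
    chunks = [samples[i:i + chunk_size]
              for i in range(0, len(samples) - chunk_size + 1, chunk_size)]
    spectra = [magnitude_spectrum(chunk) for chunk in chunks]
    diffs = [sum(abs(b - a) for a, b in zip(previous, current))
             for previous, current in zip(spectra, spectra[1:])]
    return sum(diffs) / len(diffs)

rng = random.Random(0)
n_samples = 512

# A steady tone: exactly 8 cycles per 64-sample chunk, so every chunk looks alike.
tone = [math.sin(2 * math.pi * 8 * t / 64) for t in range(n_samples)]
# Uniform noise: the spectrum jumps around from chunk to chunk.
noise = [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]

tone_change = mean_spectral_change(tone)
noise_change = mean_spectral_change(noise)
print(f"steady tone: {tone_change:.6f}")
print(f"noise:       {noise_change:.6f}")
```

The steady tone gives a change close to zero and the noise gives a much larger one; real music and speech will land somewhere in between.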
This is the sort of algorithm that could be tweaked forever, even in genre-specific ways. Music itself is generally periodic as well, so it may be worth coming up with a way to run FFTs over the FFTs. And the idea of looking for a consistent twelfth-root-of-two spread of outstanding frequencies sounds really plausible.
I bet you were hoping to find this sitting in a free Python library for you to simply drop a file into. :-)
If I have a large set of data that describes physical 'things', how could I go about measuring how well that data fits the 'things' that it is supposed to represent?
An example would be if I have a crate holding 12 widgets, and I know each widget weighs 1 lb, there should be some data quality 'check' making sure the case weighs 13 lbs maybe.
Another example would be that if I have a lamp and an image representing that lamp, it should look like a lamp. Perhaps the image dimensions should have the same ratio of the lamp dimensions.
With the exception of images, my data is 99% text (which includes height, width, color...).
I've studied AI in school, but have done very little outside of that.
Are standard AI techniques the way to go? If so, how do I map a problem to an algorithm?
Are some languages easier at this than others? Do they have better libraries?
thanks.
Your question is somewhat open-ended, but it sounds like what you want is what is known as a "classifier" in the field of machine learning.
In general, a classifier takes a piece of input and "classifies" it, i.e. determines a category for the object. Many classifiers provide a probability with this determination, and some may even return multiple categories with probabilities on each.
Some examples of classifiers are Bayes nets, neural nets, decision lists, and decision trees. Bayes nets are often used for spam classification. Emails are classified as either "spam" or "not spam" with a probability.
For your question you'd want to classify your objects as "high quality" or "not high quality".
The first thing you'll need is a bunch of training data. That is, a set of objects where you already know the correct classification. One way to obtain this could be to get a bunch of objects and classify them by hand. If there are too many objects for one person to classify you could feed them to Mechanical Turk.
Once you have your training data you'd then build your classifier. You'll need to figure out what attributes are important to your classification. You'll probably need to do some experimentation to see what works well. You then have your classifier learn from your training data.
One approach that's often used for testing is to split your training data into two sets. Train your classifier using one of the subsets, and then see how well it classifies the other (usually smaller) subset.
AI is one path, natural intelligence is another.
Your challenge is a perfect match to Amazon's Mechanical Turk. Divvy your data space up into extremely small verifiable atoms and assign them as HITs on Mechanical Turk. Have some overlap to give yourself a sense of HIT answer consistency.
There was a shop with a boatload of component CAD drawings that needed to be grouped by similarity. They broke it up and set it loose on Mechanical Turk to very satisfying results. I could google for hours and not find that link again.
See here for a related forum post.
This is a tough one to answer. For example, what defines a lamp? I could do a Google Images search and find pictures of some crazy-looking lamps. Or even look up the definition of a lamp (http://dictionary.reference.com/dic?q=lamp). There are no physical requirements for what a lamp must look like. That's the crux of the AI problem.
As for data, you could set up unit testing on the project to ensure that 12 widget() weighs less than 13 lbs in the widgetBox(). Regardless, you need to have the data at hand to be able to test things like that.
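A rule-based check along those lines could be sketched as follows. The record layout (field names such as `crate_empty_lb` or `image_width_px`) and the tolerances are hypothetical, chosen only to mirror the crate and lamp examples from the question; real data would need its own schema and thresholds.

```python
def check_crate_weight(record, tolerance_lb=0.05):
    """Return a list of problems with one crate record (empty list means it passes)."""
    problems = []
    expected = (record["crate_empty_lb"]
                + record["widget_count"] * record["widget_lb"])
    if abs(record["measured_lb"] - expected) > tolerance_lb:
        problems.append(
            f"weight is {record['measured_lb']} lb, expected about {expected} lb")
    return problems

def check_image_ratio(item, tolerance=0.05):
    """Flag an image whose aspect ratio differs from the physical item's ratio."""
    problems = []
    real_ratio = item["width_in"] / item["height_in"]
    image_ratio = item["image_width_px"] / item["image_height_px"]
    if abs(image_ratio - real_ratio) > tolerance * real_ratio:
        problems.append(
            f"image ratio {image_ratio:.2f} does not match item ratio {real_ratio:.2f}")
    return problems

# Hypothetical records: 12 widgets at 1 lb each in a 1 lb crate should weigh 13 lb.
good_crate = {"crate_empty_lb": 1.0, "widget_count": 12,
              "widget_lb": 1.0, "measured_lb": 13.0}
bad_crate = dict(good_crate, measured_lb=11.2)

good_lamp = {"width_in": 10, "height_in": 20,
             "image_width_px": 400, "image_height_px": 800}
bad_lamp = dict(good_lamp, image_width_px=800)

for name, problems in [("good crate", check_crate_weight(good_crate)),
                       ("bad crate", check_crate_weight(bad_crate)),
                       ("good lamp", check_image_ratio(good_lamp)),
                       ("bad lamp", check_image_ratio(bad_lamp))]:
    print(name, "->", problems or "ok")
```

Checks like these cover the text attributes cheaply; anything they cannot express (does the picture actually look like a lamp?) is where a classifier or human review comes in.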
I hope I was able to answer your question somewhat. It's a bit vague, and my answers are broad, but hopefully it'll at least send you in a good direction.