How to use a restricted Boltzmann machine for classification? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking us to recommend or find a tool, library or favorite off-site resource are off-topic for Stack Overflow as they tend to attract opinionated answers and spam. Instead, describe the problem and what has been done so far to solve it.
Closed 8 years ago.
I was reading about restricted Boltzmann machines (RBMs), an algorithm used in deep learning. I don't fully understand how an RBM can be used for classification. Could anybody provide an example of classifying with this algorithm?
From wikipedia:
RBMs have found applications in dimensionality reduction,
classification, collaborative filtering, feature learning and topic
modelling. They can be trained in either supervised or unsupervised
ways, depending on the task.[1]
[1] Larochelle, H.; Bengio, Y. (2008). "Classification using discriminative restricted Boltzmann machines". Proceedings of the 25th International Conference on Machine Learning - ICML '08. p. 536. doi:10.1145/1390156.1390224. ISBN 9781605582054.

An RBM is not a classification model; it is a model for unsupervised learning. There are at least two classification-related applications:
In deep learning, RBMs are used as preprocessing units, and on top of them you still build some "simple" linear model (like logistic regression, a perceptron, or an SVM).
In some side works (by Hinton in particular) you can create two RBM stacks and connect them with one RBM layer on top, where one stack is fed with inputs and the second one with labels. This way, during autoassociative learning, the RBM actually models the input->labels mapping (as well as the other way around).
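The first application (an RBM as a feature extractor feeding a simple linear classifier) can be sketched with scikit-learn's BernoulliRBM; the digits dataset and all hyperparameters here are illustrative choices, not from the original answer:

```python
# Sketch: RBM learns features unsupervised; logistic regression
# classifies on top of them (scikit-learn assumed available).
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import BernoulliRBM
from sklearn.pipeline import Pipeline

X, y = load_digits(return_X_y=True)
X = X / 16.0  # scale pixel intensities to [0, 1] for the Bernoulli RBM

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = Pipeline([
    ("rbm", BernoulliRBM(n_components=100, learning_rate=0.06,
                         n_iter=10, random_state=0)),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(X_train, y_train)
print("test accuracy: %.2f" % model.score(X_test, y_test))
```

The RBM is trained without looking at the labels; only the logistic regression on top is supervised, which is exactly the "preprocessing unit" role described above.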

Related

Looking for product reviews dataset [closed]

I'm working on a school project on product analysis based on sentiment analysis. I've been looking for a training dataset for quite some time now, and what I've found so far is a dataset for movie reviews. My question is: can I use this dataset to train the classifier, i.e. will it affect the accuracy of classification? If so, does anyone here know where I can get a free dataset of product reviews?
I am assuming you are using some textual model like bag of words.
From my experiments, you usually don't get good results when moving from one domain to another (even if both the training set and the test set are products, just of different categories!).
Think of it logically: an oven that gets hot quickly usually indicates a good product. Is the same true for laptops?
When I experimented with this a few years ago, I used Amazon comments both as the training set and to test my algorithms.
The comments are short and informative and were enough to get ~80% accuracy. The ground truth was the star rating, where 1-2 stars were 'negative', 3 stars 'neutral', and 4-5 stars 'positive'.
I used a Perl script from esuli.it to crawl Amazon's comments.
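The star-to-label mapping and a bag-of-words classifier in the spirit of the answer can be sketched as follows; the toy reviews are made up stand-ins for crawled Amazon data, and the choice of CountVectorizer with naive Bayes is one reasonable option, not the answer's exact setup:

```python
# Sketch: map star ratings to sentiment labels, then train a
# bag-of-words classifier on (review, label) pairs.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

def stars_to_label(stars):
    # The answer's ground-truth scheme: 1-2 negative, 3 neutral, 4-5 positive.
    if stars <= 2:
        return "negative"
    if stars == 3:
        return "neutral"
    return "positive"

# Toy stand-in for crawled review text and star ratings.
reviews = ["terrible, broke after a day", "works fine, nothing special",
           "absolutely love it", "waste of money",
           "great value, would buy again", "it is okay I guess"]
stars = [1, 3, 5, 2, 5, 3]
labels = [stars_to_label(s) for s in stars]

clf = Pipeline([("bow", CountVectorizer()), ("nb", MultinomialNB())])
clf.fit(reviews, labels)
print(clf.predict(["love this product"]))
```

With a real crawl of tens of thousands of reviews, this same pipeline is what yielded accuracies in the ~80% range described above.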

Are there any examples of applying machine learning to improve code performance? [closed]

I saw a talk by Keith Adams of Facebook comparing machine learning techniques to tuning code for improved performance in the real world. Are there examples of such automation techniques being applied in real projects?
I know of profile guided optimizations in certain compilers and also some techniques JIT compilers use to improve performance, but I am thinking of more fundamental ways to improve code performance that could require changing the code itself and not code generation. Things like:
Choosing the optimal buffer size in a particular network application, or choosing the right stack size for a particular application.
Selecting the struct layout in a multi-threaded application that improves local cache performance while reducing false sharing.
Selecting different data structures altogether for a certain algorithm.
I read a paper on Halide, an image processing framework that uses genetic algorithms to auto-tune image processing pipelines to improve performance. Examples like this one or any pointers to research would be useful.
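The buffer-size example from the list above can be illustrated with a minimal auto-tuning loop; the benchmark workload here is a hypothetical stand-in for a real measurement, and exhaustive search stands in for the genetic-algorithm search that tools like Halide's autotuner or Acovea use:

```python
# Sketch: measure each candidate buffer size and keep the fastest.
import timeit

def benchmark(buffer_size):
    # Stand-in workload: copy 1 MB of data in chunks of buffer_size.
    data = b"x" * (1 << 20)
    def run():
        out = bytearray()
        for i in range(0, len(data), buffer_size):
            out += data[i:i + buffer_size]
    # min of several repeats is the usual way to reduce timing noise.
    return min(timeit.repeat(run, number=3, repeat=3))

def tune(candidates):
    return min(candidates, key=benchmark)

best = tune([2 ** k for k in range(6, 17)])  # 64 B .. 64 KiB
print("best buffer size:", best)
```

A genetic or hill-climbing search replaces the exhaustive `min` once the parameter space is too large to enumerate, which is exactly the regime the question is asking about.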
Have a look at Remy: http://web.mit.edu/remy/
It uses a kind of genetic optimization approach to generate congestion-control algorithms for networks, significantly increasing network performance. You specify assumptions about the network being used, and Remy generates a control algorithm to be run on the data nodes of this network. The results are remarkable: Remy outperforms all optimization techniques developed by humans so far.
FFTW is a widely used software package which uses OCaml to generate optimized C code. This paper has more details on the process: http://vuduc.org/pubs/vuduc2000-fftw-dct.pdf
You might also look into Acovea, a genetic algorithm to optimize compiler flags: http://stderr.org/doc/acovea/html/index.html

How to classify a large collection of user entered company names? [closed]

Our site allows users to enter the company they work for as a free form text entry.
Historically we have gathered a few million unique entries. Since we put no constraints on the input, we ended up with a lot of variations and typos (e.g. over 1000 distinct entries just for McDonald's).
We realized we could provide our users with a great feature if only we could tie these variations together. We compiled a clean list of companies as a starting point using various online sources [Dictionary].
Now we're trying to find the best way to deal with the user data source. We thought about assigning some similarity score:
- comparing each entry with [Dictionary], calculating a lexical distance (possibly in a Hadoop job)
- taking advantage of some search database (e.g. Solr)
and associating the user-entered text this way.
What we're wondering is: has anyone gone through a similar "classification" exercise who could share any tips?
Thanks,
Piotr
I'd use simple Levenshtein distance (http://en.wikipedia.org/wiki/Levenshtein_distance).
A few million entries - you should be able to process them easily on one computer (no Hadoop or other heavyweight tools needed).
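A minimal Levenshtein distance implementation looks like this; for millions of entries a C extension or an existing library would be faster, but the two-row dynamic-programming version below is enough to show the idea:

```python
# Classic dynamic-programming edit distance, O(len(a) * len(b)) time,
# keeping only the previous row to use O(len(b)) memory.
def levenshtein(a, b):
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

print(levenshtein("McDonald's", "McDonalds"))  # → 1
```

Matching each user entry against the [Dictionary] then amounts to picking the dictionary name with the smallest distance, possibly with a cutoff above which the entry is left unmatched.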

Journal / Proceeding about comparing the similarity of 2 images? [closed]

I am a newbie in the Matlab field, and I want to learn more about methodologies for comparing two images to measure the similarity between them.
I need material from international journals, conference proceedings, books, or other reports that describe this; I will use it for my literature study.
Does anyone have suggestions for relevant journals, books, or proceedings that discuss this topic? If so, please include their titles and links.
Thank you for your attention.
For journals I would recommend the IEEE Transactions on Image Processing:
http://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=83
This is a good general intro from MIT:
http://www.mit.edu/~ka21369/Imaging2012/tannenbaum.pdf
You need to define "similarity" better.
In the image compression sense, similarity is a function of the pixel-wise difference between the images (PSNR, and other metrics).
In a computer vision sense, you would want to see if the two images contain similar content such as objects or scenes. I would recommend using Google Scholar for that.
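The pixel-wise sense of similarity mentioned above (PSNR over the per-pixel difference) can be sketched in a few lines; the random test images here are stand-ins for real data:

```python
# Sketch: mean squared error and PSNR between two equal-sized images.
import numpy as np

def psnr(img_a, img_b, max_value=255.0):
    """Peak signal-to-noise ratio in dB; higher means more similar."""
    mse = np.mean((img_a.astype(np.float64) - img_b.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)

a = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
# A lightly perturbed copy: each pixel shifted by at most +/-5.
noisy = np.clip(a.astype(int) + np.random.randint(-5, 6, a.shape),
                0, 255).astype(np.uint8)
print("PSNR:", psnr(a, noisy))
```

Note that PSNR only captures pixel-level agreement; for the computer-vision sense of similarity (same objects or scene) a feature- or content-based method is needed instead.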

Where can I learn about how to represent liquids mathematically for an HTML5 canvas? [closed]

I would like to learn how to represent liquid (water) in 2D and/or 3D mathematically to create a simulation using the HTML5 canvas. Are there any resources for this, or for representing other real-world materials in 2D or 3D mathematically?
Probably the best (simple) mathematical representation of incompressible fluid flow (of which water is an example) is the finite element method over a circulation field.
In simple cases the finite element method can work with a rectangular grid.
But in more complex cases (turbulence, cavitation, fluid/gas interaction) other methods may be needed, such as particle systems or other types of fields.
There can also be a combination of methods: FEM simulates the fluid itself while a particle system visualizes it (simulating dust of small particles floating in the fluid).
What you are looking for is called particle systems.
Here is a technique to model fluids, and it includes an implementation.
You can use a physics engine like Box2D for JavaScript to create thousands of small circles or squares to simulate a fluid, but I don't know if this is the way to go, since you may face performance issues with this approach.
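The core of the particle-system approach is a per-frame update loop; the sketch below (gravity plus a damped floor bounce, with explicit Euler integration) is an illustration of that loop, not a complete fluid model, and rendering to an HTML5 canvas would replace the final print:

```python
# Minimal particle-system step: gravity, integration, floor collision.
GRAVITY = (0.0, -9.8)
DT = 1.0 / 60.0  # one animation frame at 60 fps

def step(particles):
    """Advance each particle one frame with explicit Euler integration."""
    for p in particles:
        p["vx"] += GRAVITY[0] * DT
        p["vy"] += GRAVITY[1] * DT
        p["x"] += p["vx"] * DT
        p["y"] += p["vy"] * DT
        if p["y"] < 0.0:       # bounce off the floor, losing energy
            p["y"] = 0.0
            p["vy"] *= -0.5
    return particles

drops = [{"x": 0.0, "y": 1.0, "vx": 0.1, "vy": 0.0} for _ in range(3)]
for _ in range(120):  # simulate two seconds
    step(drops)
print(drops[0])
```

A fluid-like look comes from adding inter-particle forces (pressure and viscosity, as in smoothed-particle hydrodynamics) to this same loop, which is where the performance concerns with thousands of bodies come in.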
GPU Gems had a chapter on this. Might be a bit compute-intensive for HTML though.
