Collecting data for my master's thesis about gamification in Stack Overflow - data-collection

I am collecting data for my master's thesis, which studies how gamification motivates knowledge sharing, using Stack Overflow as an example.
The questionnaire is very short and will take about 5 minutes of your time.
The questionnaire is anonymous, so you don't need to worry about revealing personal information. Would you be interested in answering a few questions about participation and gamification in Stack Overflow? I've attached the survey below and would appreciate your help.
https://surveyhero.com/c/e94ee2db

Related

How to use a restricted Boltzmann machine to classify? [closed]

I was reading about restricted Boltzmann machines (RBMs), which are used in deep learning. I don't fully understand how an RBM can be used for classification. Could anybody provide an example of classifying with this algorithm?
From Wikipedia:
RBMs have found applications in dimensionality reduction, classification, collaborative filtering, feature learning and topic modelling. They can be trained in either supervised or unsupervised ways, depending on the task.[1]
[1] Larochelle, H.; Bengio, Y. (2008). "Classification using discriminative restricted Boltzmann machines". Proceedings of the 25th International Conference on Machine Learning - ICML '08. p. 536. doi:10.1145/1390156.1390224. ISBN 9781605582054.
An RBM is not a classification model; it is a model for unsupervised learning. There are at least two possible classification-related applications:
In deep learning, RBMs are used as preprocessing units, while on top of them you still build some "simple" linear model (like logistic regression, a perceptron, or an SVM); see the sketch after this list.
In some side works (by Hinton in particular) you can create two RBM stacks and connect them with one RBM layer on top, where one stack is fed the inputs and the other the labels. This way, during autoassociative learning, the RBM actually models the input->labels mapping (as well as the other way around).
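For the first option, here is a minimal sketch using scikit-learn's BernoulliRBM as an unsupervised feature extractor feeding logistic regression. The digits dataset and the hyperparameters are only illustrative, not tuned:

```python
# Minimal sketch: RBM features + a "simple" linear classifier on top.
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import BernoulliRBM
from sklearn.pipeline import Pipeline

X, y = load_digits(return_X_y=True)
X = X / 16.0  # BernoulliRBM expects inputs scaled to [0, 1]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = Pipeline([
    # Unsupervised feature learning (the "preprocessing unit").
    ("rbm", BernoulliRBM(n_components=100, learning_rate=0.06, n_iter=20)),
    # Simple supervised model trained on the learned features.
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))
```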

Looking for product reviews dataset [closed]

I'm working on a school project on product analysis based on sentiment analysis. I've been looking for a training dataset for quite some time now, and what I've found so far is a dataset of movie reviews. My question is: can I use this dataset to train the classifier, i.e. will it affect the classification accuracy? If so, does anyone here know where I can get a free dataset of product reviews?
I am assuming you are using some textual model like the bag-of-words model.
From my experiments, you usually don't get good results when changing from one domain to another (even if the training and test sets are both products, just of different categories!).
Think of it logically: an oven that gets hot quickly usually indicates a good product. Is the same true for laptops?
When I experimented with this a few years ago, I used Amazon reviews both as the training set and to test my algorithms.
The reviews are short and informative and were enough to get ~80% accuracy. The ground truth was the star rating, where 1-2 stars were 'negative', 3 stars 'neutral', and 4-5 stars 'positive'.
I used a Perl script from esuli.it to crawl Amazon's reviews.
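A minimal sketch of that star-to-label scheme with a bag-of-words classifier, using scikit-learn; the sample reviews here are hypothetical stand-ins for crawled data:

```python
# Map star ratings to sentiment labels, then train a bag-of-words model.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

def stars_to_label(stars: int) -> str:
    """1-2 stars -> negative, 3 -> neutral, 4-5 -> positive."""
    if stars <= 2:
        return "negative"
    if stars == 3:
        return "neutral"
    return "positive"

# Hypothetical stand-in data; a real run would load crawled reviews.
reviews = [
    ("Heats up fast and bakes evenly, love it", 5),
    ("Stopped working after two weeks", 1),
    ("Does the job, nothing special", 3),
    ("Terrible build quality, returned it", 2),
]
texts = [text for text, _ in reviews]
labels = [stars_to_label(stars) for _, stars in reviews]

model = Pipeline([
    ("bow", CountVectorizer()),   # bag-of-words features
    ("clf", MultinomialNB()),     # simple text classifier
])
model.fit(texts, labels)
print(model.predict(["great product, works perfectly"]))
```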

Are there any examples of applying machine learning to improve code performance? [closed]

I saw a talk by Keith Adams of Facebook comparing machine learning techniques to tuning code for improved performance in the real world. Are there examples of such automation techniques being applied in real projects?
I know of profile guided optimizations in certain compilers and also some techniques JIT compilers use to improve performance, but I am thinking of more fundamental ways to improve code performance that could require changing the code itself and not code generation. Things like:
Choosing the optimal buffer size in a particular network application, or choosing the right stack size for a particular application.
Selecting the struct layout in a multi-threaded application that improves local cache performance while reducing false sharing.
Selecting different data structures altogether for a certain algorithm.
I read a paper on Halide, an image processing framework that uses genetic algorithms to auto-tune image processing pipelines to improve performance. Examples like this one or any pointers to research would be useful.
Have a look at Remy: http://web.mit.edu/remy/
It uses a kind of genetic optimization approach to generate congestion-control algorithms for networks, significantly increasing network performance. You specify assumptions about the network being used, and Remy generates a control algorithm to run on that network's nodes. The results are impressive: Remy outperforms the human-designed techniques developed so far.
FFTW is a widely used software package which uses OCaml to generate optimized C code. This paper has more details on the process: http://vuduc.org/pubs/vuduc2000-fftw-dct.pdf
You might also look into Acovea, a genetic algorithm to optimize compiler flags: http://stderr.org/doc/acovea/html/index.html
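As a toy illustration of the buffer-size idea from the question: time an operation under several candidate parameter values and keep the fastest. Real auto-tuners (FFTW, Halide, Acovea) search far larger spaces with smarter strategies; this sketch only assumes a local file copy:

```python
# Toy auto-tuning: empirically pick the fastest read buffer size.
import os
import tempfile
import time

def copy_with_buffer(src: str, dst: str, bufsize: int) -> float:
    """Copy src to dst with a given buffer size; return elapsed seconds."""
    start = time.perf_counter()
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while chunk := fin.read(bufsize):
            fout.write(chunk)
    return time.perf_counter() - start

# Create a ~32 MB scratch file to copy.
fd, src = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(os.urandom(32 * 1024 * 1024))
dst = src + ".copy"

# Exhaustive search over a tiny candidate space of buffer sizes.
timings = {size: copy_with_buffer(src, dst, size)
           for size in (4096, 16384, 65536, 262144, 1048576)}
best = min(timings, key=timings.get)
print(f"fastest buffer size: {best} bytes ({timings[best]:.3f}s)")

os.remove(src)
os.remove(dst)
```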

Reviewing Data Structures and Algorithms [closed]

I was wondering if anyone knew of a website that provides a great review of data structures and algorithms. I would like it to be specifically geared toward interview questions on data structures and algorithms. Would implementing all of these data structures be a good thing to review?
Thanks!
This page is a good starting point:
This webpage covers the space and time Big-O complexities of common algorithms used in Computer Science. When preparing for technical interviews in the past, I found myself spending hours crawling the internet putting together the best, average, and worst case complexities for search and sorting algorithms so that I wouldn't be stumped when asked about them. Over the last few years, I've interviewed at several Silicon Valley startups, and also some bigger companies, like Yahoo, eBay, LinkedIn, and Google, and each time that I prepared for an interview, I thought to myself, "Why oh why hasn't someone created a nice Big-O cheat sheet?" So, to save all of you fine folks a ton of time, I went ahead and created one.
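For hands-on review, implementing the entries on such a sheet is worthwhile. A typical example is binary search, O(log n) time versus O(n) for a linear scan (an illustrative sketch, not taken from the cheat sheet itself):

```python
# Binary search on a sorted sequence: O(log n) time, O(1) space.
from typing import Sequence

def binary_search(items: Sequence[int], target: int) -> int:
    """Return the index of target in a sorted sequence, or -1 if absent."""
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2       # halve the search range each step
        if items[mid] == target:
            return mid
        if items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

assert binary_search([1, 3, 5, 7, 9], 7) == 3
assert binary_search([1, 3, 5, 7, 9], 4) == -1
```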

A beginner's guide to caching using memcacheD [closed]

There are a lot of tutorials on the Internet which claim to teach you how to use memcacheD, but most of them are actually about memcache (hence the emphasis on the d).
In PHP, memcached doesn't even have a connect method. Also, a lot of these tutorials just teach you how to connect and add values, but I can figure that out by reading the manual, so please help me create a one-stop reference for memcached. What strategies would you recommend? What are the best practices? How would you cache something like a forum, or a social site with ever-changing data?
The trouble I seem to have is, I know how to connect, add and remove values, but what exactly am I supposed to cache? (I'm just experimenting; this is not for a project, so I can't really give an example.)
but what exactly am I supposed to cache?
You're supposed to cache data that doesn't change often and is read many times. For example, let's take a forum: you'd cache the forum's front page, which displays the available forums, their descriptions, and the forum IDs that let you see topics under the various forum categories.
Since it's not likely that you create, delete or update forums every second, it's safe to assume that the read:write ratio favors reads, which allows you to cache that front page. By doing so, you alleviate the load on your database, since it doesn't have to be accessed for most visits to your site.
You can also take this caching one step further: cache everything your site has to offer and set the cache expiry time to 5 minutes. Assuming your database isn't huge (hundreds of gigabytes) and fits in available RAM, you'd effectively query your database only every 5 minutes.
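A minimal sketch of that cache-aside pattern with a 5-minute TTL. It uses Python's pymemcache client purely for illustration (the PHP Memcached class follows the same pattern), and fetch_forum_index() is a hypothetical stand-in for the real database query:

```python
# Cache-aside: try memcached first, fall back to the database on a miss.
import json
from pymemcache.client.base import Client

client = Client(("localhost", 11211))

def fetch_forum_index() -> list:
    # Hypothetical placeholder for the expensive database query.
    return [{"id": 1, "name": "General", "description": "Anything goes"}]

def get_forum_index() -> list:
    cached = client.get("forum_index")
    if cached is not None:
        return json.loads(cached)               # cache hit: skip the database
    data = fetch_forum_index()                  # cache miss: query the database
    client.set("forum_index", json.dumps(data), expire=300)  # 5-minute TTL
    return data
```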
Assuming you have a lot of visits per day (say, 20,000 unique visits), you can calculate how much this saves in database connections and data retrieval.
