Hot Or Not / Facemash algorithm - Why Elo's Rating Algo? [closed] - algorithm

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 3 years ago.
In the movie The Social Network, I saw that Mark used the Elo rating system.
But was the Elo rating system necessary?
Can anyone tell me what the advantage of using Elo's rating system was?
Could the problem be solved in the following way instead?
Is there any problem with the algorithm written below?
Table Structure

Name        — name of the woman
Pic_Name    — [pk] path to the picture
Impressions — number of times the image was shown
Votes       — number of times people selected her as hot
Now we randomly show 2 photos from the database, and the hottest woman is the one with the maximum number of votes.
Before voting to close this question, please write your reason.

But was that necessary?
No, there are several different ways of implementing such a system.
Can anyone tell me what the advantage of using Elo's rating system was?
The main advantage, and the central idea, of Elo's system is that when someone with a low rating wins over someone with a high rating, the ratings are updated by a larger amount than if the two had similar ratings to start with. This means that the ratings converge fairly quickly.
I don't really see how your approach is a good one. First of all, it seems to depend on how often a picture is randomly selected for potential upvoting. Even if you showed all pictures equally often, the property described above doesn't hold: if someone wins a comparison against a really hot girl, she still gets only a single upvote. This means your approach wouldn't converge as quickly as Elo's system. In fact, the approach you propose doesn't converge to steady rating values at all.
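For reference, the Elo update rule is only a few lines. This is a minimal sketch; the K-factor of 32 and the 400-point scale are conventional choices from chess, not anything specified in the question:

```python
K = 32  # K-factor: how strongly one result moves the ratings (a conventional choice)

def expected_score(rating_a, rating_b):
    """Probability that A beats B under the Elo model."""
    return 1 / (1 + 10 ** ((rating_b - rating_a) / 400))

def update(winner, loser, ratings):
    """Update ratings in place after `winner` beats `loser`."""
    exp_w = expected_score(ratings[winner], ratings[loser])
    ratings[winner] += K * (1 - exp_w)   # big gain if the win was unexpected
    ratings[loser]  -= K * (1 - exp_w)   # symmetric loss
```

This is the property described above: an underdog win moves both ratings by more than a win between equals does.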

Simply counting the number of votes and ranking women by that is not adequate and I can think of two reasons why:
What if a woman is average-looking, but by luck her picture gets displayed more often? Then she would get more votes and her ranking would rise inappropriately.
What if a woman is average-looking, but by luck your website always compares her to ugly women? Then she would get more votes and her ranking would rise inappropriately.
I don't know much about the Elo rating system but it probably doesn't suffer from problems like this.

It's a movie about geeks. Elo is a geeky way to rate competitors on the basis of the results of pairwise contests between them. Its association with chess adds extra geekiness. It's precisely the kind of thing that geeks in movies should be doing.
It may have happened exactly that way in real life too, in which case Zuckerberg probably chose Elo because it's a well-known algorithm for doing this, one that has been used in practice in several sports. Why go to the effort of inventing a worse algorithm?

Related

Does it have a mature algorithm or theory for a scenario about best distribution? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
Scenario:
We have 10 kinds of toys, and each kind includes 10 toys.
We will distribute the toys to 100 children. Every child has a different degree of satisfaction with each of the 10 kinds. Note: in the real project we will have 300,000+ children records in my database.
My question is: how do we measure and define the best solution for the distribution?
And how do we get the result? Please give me a hint.
Some friends suggested I try the KM (Kuhn-Munkres) algorithm, but I'm not sure it will work for me.
Thanks.
This problem is hard because you haven't decided what you want to optimize, and because many optimization methods will be expensive to run if you have 300K children - or customers - to worry about.
What do you want to optimize? If you try to optimize the sum of some set of per-child satisfaction scores, can you really compare the subjective satisfaction of two different children, let alone add them up to produce anything sensible? If you decide on such a system, can you prove that it cannot be distorted by children who decide to lie about their satisfaction, for instance by saying that they will be devastated if they don't get one particular toy?
What if somebody decides that the sum of satisfaction scores isn't the right metric, but instead that you should minimize the dis-satisfaction of the most dis-satisfied child?
What if somebody decides that inequality is the real problem, so if there is one very happy child, you should take away their toy and give it to somebody else to minimize the difference in satisfaction between the most and least satisfied child?
What if somebody decides that some children count more than other children, because of something their great-grandparents did, or didn't do?
Just to not be completely negative, here is a cheap scheme, and an attempt to prove a property about it. Put the children in random order and allocate the toys as if each child were to choose according to their preferences in this order - so each child would get the toy they most preferred according to the toys left when they came to choose.
One property you might want for a method of choosing is that, after the toys were distributed, children wouldn't find that they could trade toys amongst themselves to produce a better distribution, making you look silly (i.e., not a Pareto-optimal solution). Suppose that such a pattern of trades were possible among the children in this scheme. Consider the trading child who came first among these children in the initial randomization. They chose the toy they wanted most from all those available, so there is in fact nothing the other trading children could offer them that they would prefer. So this scheme is at least not vulnerable to later trades.
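The cheap scheme above (sometimes called random serial dictatorship) can be sketched in a few lines. The data shapes here are my own choices, not from the question:

```python
import random

def serial_dictatorship(preferences, supply, seed=0):
    """Allocate toys by random serial dictatorship: shuffle the children,
    then let each child take their most-preferred kind still in stock.

    preferences: {child: [kind, ...]} ranked best-first
    supply: {kind: count}
    Returns {child: kind or None}.
    """
    rng = random.Random(seed)
    order = list(preferences)
    rng.shuffle(order)                     # the random ordering of children
    stock = dict(supply)
    allocation = {}
    for child in order:
        # first kind on this child's list that is still available
        chosen = next((k for k in preferences[child] if stock.get(k, 0) > 0), None)
        if chosen is not None:
            stock[chosen] -= 1
        allocation[child] = chosen
    return allocation
```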

How do portfolio software generate their portfolio suggestions so quickly when the possibilities are huge? [closed]

Closed. As it currently stands, this question is not a good fit for the Q&A format. It is not currently accepting answers.
Closed 10 years ago.
I hope this does not get closed, because it is related to algorithms that I have not been able to figure out (it's also pretty long because I'm so confused about how it's being done). Many years back I used to work at a mutual fund, and we used different tools to select optimized portfolios as well as hedge existing ones. We would take these results, make our own modifications, and then sell them to clients. After my company downsized, I decided I wanted to give it a try (to create the software and include my customizations), but I have no clue how the combinations are actually generated by the software.
After 6 months of trying, I'm accepting that my approach is impossible. I was trying to use combination algorithms like the ones from Knuth's book, as well as bit combinations, to try to find every possible portfolio (I limited it to 30 stocks) on the NYSE (5,000+ stocks). But according to everyone I have spoken to, this would take billions of billions of years just to get one day's results (on a GPU, I stopped it after 2 days of straight processing).
So what am I missing? We would enter our risk tolerance and view of the market (stock market growth expectations, inflation expectations, fed funds expectations, etc.) and it would give us the ideal portfolio (in theory...) within a few seconds or minutes. With thousands of stocks and quadrillions of possible combinations of weights, how are they able to calculate results so quickly (or even at all)? As the admin of the system, I know we downloaded a file every day (less than 100 MB, loaded into an MSSQL database, probably just market data, so it's not as if we had every possibility precomputed; using my approach above, I would produce a 5 GB file in a minute of running my version of Knuth's combination algorithm), and the applications worked offline (so it must have been computing locally on the desktop/laptop CPU, not on a massive supercomputer somewhere; it took a minute or two to run, and 15 minutes was the longest, for a global fund that includes every stock in the world). It's confusing because their work required correlation across the entire fund (I don't think they were just returning pre-calculated top stocks, because everyone got different results). So if I wanted a 30-stock fund that gave me 2% returns, had a negative correlation with the market, and was 60% hedged, how could the software generate that portfolio out of billions of possibilities so quickly? Note: I'm not asking about the math or the finance part. I'm asking how it was able to pick 30 stocks from the entire market that gave 2% returns, when in order to do that it would seemingly need to know the returns of every possible 30-stock portfolio (that alone would make it run for billions of years, right? The other restrictions only make it more complex).
So how is this being done programmatically? I'm starting to believe they are not using Knuth's combination algorithm to generate every possibility, yet their results don't seem randomly selected, and selecting the stocks individually seems to miss the correlation part. How can so many investment software packages do things like this?
Such algorithms almost certainly don't generate every possibility - as you rightly observe that would be impractical.
Portfolio selection is however very easy to do with other techniques that will give you a very good answer. The two most likely are:
If you make simplifying assumptions around risk / return you can solve for an optimal portfolio mathematically (see http://en.wikipedia.org/wiki/Capital_asset_pricing_model for some of the maths)
A genetic algorithm which does mutation / crossover operations on randomised sample portfolios will find a very good solution pretty fast. You can combine this with Monte-Carlo style modelling approaches to understand the range of possible outcomes.
Personally, I'd probably suggest the genetic algorithm approach: although it's not as mathematically pure, it will give you good answers and should be able to handle any constraints you want to throw at it quite easily (e.g. a maximum number of stocks in a portfolio).
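To make the mutation/crossover idea concrete, here is a toy sketch of a genetic algorithm for picking a fixed-size portfolio. The fitness function is a deliberate placeholder (a sum of per-stock scores); a real one would combine expected return, risk, and correlation terms:

```python
import random

def fitness(portfolio, scores):
    """Placeholder fitness: sum of per-stock scores."""
    return sum(scores[s] for s in portfolio)

def evolve(stocks, scores, size=30, pop=50, generations=200, seed=0):
    """Evolve a population of candidate portfolios by selection,
    crossover, and mutation; return the fittest found."""
    rng = random.Random(seed)
    population = [rng.sample(stocks, size) for _ in range(pop)]
    for _ in range(generations):
        population.sort(key=lambda p: fitness(p, scores), reverse=True)
        survivors = population[: pop // 2]          # keep the fitter half
        children = []
        for _ in range(pop - len(survivors)):
            a, b = rng.sample(survivors, 2)
            # crossover: merge two parents, dedupe, truncate to size
            child = list(dict.fromkeys(a + b))[:size]
            if rng.random() < 0.3:                  # mutation: swap in a random stock
                child[rng.randrange(size)] = rng.choice(stocks)
                child = list(dict.fromkeys(child))
                while len(child) < size:
                    child.append(rng.choice([s for s in stocks if s not in child]))
            children.append(child)
        population = survivors + children
    return max(population, key=lambda p: fitness(p, scores))
```

Because the fittest survivors are carried over each generation, the best portfolio found never gets worse, and in practice it improves quickly even with a large universe of stocks.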
Modern Portfolio theory is a subject in its own right, with books such as "Modern Portfolio Theory and Investment Analysis", and an introduction at http://en.wikipedia.org/wiki/Modern_portfolio_theory.
One way to get problems you can actually solve is to treat it as a mathematical optimization problem. If you have a vector which gives you the amount of each stock you buy, then - under various assumptions - the return is a linear function of this vector, and the risk is a quadratic function of this vector. Maximising the return for given risk, or minimising the risk for given return, is a well-understood mathematical problem, even for very large numbers of stocks - http://en.wikipedia.org/wiki/Quadratic_programming.
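Written out, the mean-variance optimization that paragraph describes (with $w$ the vector of stock weights, $\mu$ the expected returns, $\Sigma$ the covariance matrix, and $\lambda$ a risk-aversion parameter; the constraint names are the standard textbook ones, not from the answer) is:

```latex
\max_{w} \; \mu^{\top} w \;-\; \lambda \, w^{\top} \Sigma \, w
\qquad \text{subject to} \qquad \mathbf{1}^{\top} w = 1, \quad w \ge 0
```

The objective is linear (return) minus quadratic (risk), which is exactly the quadratic-programming form linked above.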
One practical problem with this is that the answer you get will probably tell you to buy some fraction of almost all the stocks on the market. My guess is that real-life programs use some "secret sauce" heuristic that doesn't guarantee the perfect answer, subject to a constraint on the number of stocks you are actually prepared to buy, but works pretty well in practice. Returning the perfect answer appears to be a hard problem - see e.g. http://arxiv.org/ftp/arxiv/papers/1105/1105.3594.pdf

Team Question ranking game problem

We have a question game with yes-or-no answers. There are multiple teams participating, and every team has a different number of players. Each player answers questions. Players can join the game after a few of the questions have already ended. How do we count the overall score fairly for each team so that we can rank the teams?
I would just use the number of correct answers.
First things first: If you have more than one statistic, you have more than one metric. I see an almost infinite number of ranking possibilities. Here's the ones that jump out at me:
Use the average correct answer percentage for the players on the team.
If you have tournaments, have a ranking for tournament win percentage. (You could also use a chess-style ranking to determine the ranking of tournaments.)
Track how many people get each question right or wrong. A player's score for getting a question right is (1 - q), where q is the fraction of people who got that question right. If you get it wrong, you lose q points. This means that as other people answer questions, your score may go up or down (which it should, since the purpose is to make it relative to the other players).
I'll edit more in as I think of them (if I think of them). I really like option 3, though!
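Option 3 can be sketched like this (the data layout is my own choice, not from the question):

```python
def question_scores(responses):
    """Score each player relative to the crowd: a correct answer on a question
    that a fraction q of players got right is worth (1 - q); a wrong answer
    costs q. Rare correct answers are worth more; common mistakes cost more.

    responses: {player: {question: bool (answered correctly?)}}
    Returns {player: total score}.
    """
    # Fraction of players who answered each question correctly.
    questions = {q for answers in responses.values() for q in answers}
    q_right = {}
    for q in questions:
        answered = [a[q] for a in responses.values() if q in a]
        q_right[q] = sum(answered) / len(answered)

    scores = {}
    for player, answers in responses.items():
        scores[player] = sum(
            (1 - q_right[q]) if correct else -q_right[q]
            for q, correct in answers.items()
        )
    return scores
```

Note that a player's score changes as more people answer, since q is recomputed from all responses, which is exactly the relative behavior described above.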

algorithms to evaluate user responses

I'm working on a web application which will be used for classifying photos of automobiles. The users will be presented with photos of various vehicles, and will be asked to answer a series of questions about what they see. The results will be recorded to a database, averaged, and displayed.
I'm looking for algorithms to help me identify users which frequently don't vote with the group, indicating that they're probably either not paying attention to the photos, or that they're lying about what they see. I then want to exclude these users, and recalculate the results, such that I can say, with a known amount of confidence, that this particular photo shows a vehicle that is this and that.
This question goes out to all you computer science folks: where can I find such algorithms, or get the theoretical background to design them myself? I'm assuming I'm going to have to learn some probability and statistics, and maybe some data mining. Some book recommendations would be great. Thanks!
P.S. These are multiple choice questions.
All of these are good suggestions. Thank you! I wish there was a way on stack overflow to select multiple correct answers so more of you could be acknowledged for your contributions!!
Read The Elements of Statistical Learning, it is a great compendium on data mining.
You may be especially interested in unsupervised algorithms, for example clustering. Assuming that most people do not lie, the biggest cluster is right and the rest are wrong. Mark people accordingly, then apply some Bayesian statistics and you'll be done.
Of course, most data mining techniques are pretty experimental, so don't count on them always being right... or even being right in most cases.
I believe what you described is solved using outlier/anomaly detection.
A number of techniques exist:
statistical-based methods
distance-based methods
model-based methods
I suggest you take a look at these slides from the excellent book Introduction to Data Mining
If you know what answers you are expecting, why do you ask people to vote? By excluding some values, you basically turn the vote into something that you like. Automobiles make different impressions on different individuals. If 100 people loved a car, then when someone comes along and says that they don't like it, do you exclude the vote?
But anyway, considering that you still want to do this: first of all, you will need a large set of data from "trusted" voters. This will give you an idea of a "good" answer, and from that point you can choose the exclusion threshold.
Without an initial set of data you cannot apply any algorithm, because you will get false results. Consider having just one vote of 100 on a scale from 0 to 100. If the second vote is 1, you would exclude it because it is too far from the average.
I think a pretty simple algorithm could accomplish this for you. You could try and get fancier by calculating the standard deviations and such, but I wouldn't bother.
Here's a simple approach that should be sufficient:
For each of your users, calculate the number of questions they answered and the number of times they selected the most popular answer for each question. The users with the lowest ratio of popular answers to total answers are, you can guess, the ones providing bogus data.
You probably would not want to throw out the data from users who have only answered a small number of questions, because they have likely just disagreed on a few rather than putting in bogus data.
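That ratio is straightforward to compute. A sketch, where the vote layout is an assumption of mine:

```python
from collections import Counter

def popular_answer_ratio(votes):
    """For each user, the fraction of their answers that matched the most
    popular choice for that question.

    votes: {question: {user: choice}}
    Returns {user: ratio}.
    """
    agree = Counter()
    total = Counter()
    for answers in votes.values():
        popular = Counter(answers.values()).most_common(1)[0][0]
        for user, choice in answers.items():
            total[user] += 1
            agree[user] += (choice == popular)
    return {u: agree[u] / total[u] for u in total}
```

Users with a low ratio *and* a large number of total answers are the candidates for exclusion, per the caveat above.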
What kind of questions are they (Yes/No, or 1 to 10?).
You may be able to get away with not discarding anything by using a median instead of a mean. With a mean, extreme outliers in the responses can skew the result, but with a median you may get a better answer. So, for example, if you had 5 answers, order them and pick the middle one.
I think what you are saying is that you are concerned that certain people are "outliers", and they are adding noise to your data, making the categorizations less reliable. So, if you have a Chevy Camaro, and most people say it is either a pony car, a muscle car, or a sports car, but you have some goofball who says it's a family sedan, you would want to minimize the impact of his vote.
One thing you could do is provide a Stack Overflow-like reputation score for users:
The more a user is "in agreement" with other users, the better his or her score would be. For a given user (User X), this could be determined by a simple calculation of what percentage of users who responded to a question chose the same category as User X, then averaging this value over all questions answered.
You may want to multiply this value by the total number of questions answered to encourage people to answer as many questions as possible. (Note: if you choose to do this, it would be equivalent to just summing the percentage-agreement scores rather than averaging them.)
You could present the final reputation score to users, making sure to explain that they will be rewarded for how well their responses agree with those of other users. This will encourage people to answer more questions but also to take care in their answers.
Finally, you could calculate a certainty score for a given categorization by adding up the total reputation score of all people who chose a given category.
Some of these ideas may need some refinement, especially since I don't know your exact situation. Certainly, if people can see what other people chose before they vote, it would be way too easy to game the system.
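A sketch of the reputation and certainty calculations described above (function and variable names are my own; in particular I measure each user's agreement against the *other* respondents so a user's own vote doesn't inflate their score):

```python
def reputation_and_certainty(votes):
    """Reputation: a user's average agreement with the crowd, question by
    question. Certainty of a category: total reputation of its voters.

    votes: {question: {user: category}}
    Returns ({user: reputation}, {question: {category: certainty}}).
    """
    agreement = {}   # user -> per-question agreement fractions
    for answers in votes.values():
        n = len(answers)
        counts = {}
        for c in answers.values():
            counts[c] = counts.get(c, 0) + 1
        for user, c in answers.items():
            # fraction of *other* users who chose the same category
            frac = (counts[c] - 1) / (n - 1) if n > 1 else 0.0
            agreement.setdefault(user, []).append(frac)
    reputation = {u: sum(fs) / len(fs) for u, fs in agreement.items()}

    certainty = {}
    for q, answers in votes.items():
        certainty[q] = {}
        for user, c in answers.items():
            certainty[q][c] = certainty[q].get(c, 0.0) + reputation[user]
    return reputation, certainty
```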
If you were to collect votes like "on a scale from 1 to 10, how would you rate this car", you could probably use simple average and standard deviation: the smaller the standard deviation, the more unanimous the general consensus is among your voters, and you can flag users who are e.g. 3 standard devs from the average.
For multiple choice, you need to be more careful. Simply discarding all but the most-voted option will do nothing but disgruntle the voters. You need to establish a measure of how significant the winner is w.r.t. the other options, e.g. flag users who voted for options with less than 1/3 of the winning option's vote count.
Note that I wrote "flag users", not discard votes. If you discard votes, you can't tell how confident you are about the result ("91% voted this to be a Ford Mustang"). If a user has more than a certain percentage of his votes flagged - well, that's up to you.
Your trickiest problem, however, will probably be to collect sufficient votes. Depending on how easy the multiple choice problem is, you probably need several times the number of options as votes, per photo. Otherwise the statistics are meaningless.
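For the 1-to-10 rating case, the flagging rule is a few lines (k is the tunable cutoff in standard deviations; the 3-sigma default follows the suggestion above):

```python
import statistics

def flag_outlier_raters(ratings, k=3):
    """Flag users whose 1-10 rating is more than k standard deviations
    from the mean for this photo.

    ratings: {user: rating}
    Returns the set of flagged user names.
    """
    values = list(ratings.values())
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)   # population std dev of the votes
    if stdev == 0:
        return set()                    # unanimous: nothing to flag
    return {u for u, v in ratings.items() if abs(v - mean) > k * stdev}
```

As noted, this flags users rather than discarding votes, so the confidence figure for the result stays honest.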

What is the most effective way to present and communicate a performance improvement (e.g. percentages, raw data, graphics)? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 3 years ago.
Is it better to describe improvements using percentages or just the differences in the numbers? For example, if you improved the performance of a critical ETL SQL query from 4000 ms to 312 ms, how would you present it as an 'Accomplishment' on a performance review?
In currency. Money is the most effective medium for communicating value, which is what you're trying to use the performance review to demonstrate.
Person hours saved, (very roughly) estimated value of $NEW_THING_THE_COMPANY_CAN_DO_AS_RESULT, future hardware upgrades averted, etc.
You get the nice bonus that you show that you're sensitive to the company's financial position; a geek who can align himself with what the company is really about.
Take potato
Drench Potato in Lighter Fluid
Light potato on fire
Hand potato to boss
Make boss hold it for 4 seconds.
Ask boss how long those 4 seconds felt
Ask boss how much better half a second would have been
Bask in glory
It is always better to measure relative improvement.
So, if you brought it down from 4000 ms to 312 ms, then it is an improvement of 3688 ms, which is 92.2% of the original runtime. In other words, you reduced the runtime by 92.2%, bringing it down to only 7.8% of what it was originally.
Absolute numbers, on the other hand, are usually not that good, since they are not comparable. (If your original runtime was 4,000,000 ms, then an improvement of 3688 ms isn't that great.)
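The arithmetic, spelled out with the figures from the question:

```python
def relative_improvement(old_ms, new_ms):
    """Percentage reduction in runtime (positive means faster)."""
    return (old_ms - new_ms) / old_ms * 100

# 4000 ms -> 312 ms: a 92.2% reduction in runtime
speedup = relative_improvement(4000, 312)
```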
See this link for some nice chart suggestions.
Comparison to Requirements
If I have requirements (response time, throughput), I like to color code the absolute numbers like so:
Green: <= 80% of the requirement (response time); >= 120% of the requirement (throughput)
No formatting: meets the requirement.
Red: does not meet the requirement.
Comparisons are interesting, but only if we have enough data to see trends over time: is our performance steadily improving or degrading? Ultimately, the business only cares whether we're meeting the requirement. It's only when we don't that they ask for comparisons to previous releases.
Comparison of Benchmarks
If I'm comparing benchmarks to some baseline, then I like to use percentages, but only if the benchmark is a statistically significant change from the baseline.
Hardware Sizing
If I'm doing hardware sizing or capacity planning, then I like to express the performance as the absolute number plus the cost per transaction. For example:
System A: 1,000 transactions/second, $0.02/transaction
System B: 1,500 transactions/second, $0.04/transaction
Use whichever appears most impressive given the change. According to one method of calculation, that change sped up the query by 1,300%, which looks more impressive than 13x improvement, or
============= <-- old query
= <-- new query
Although the graph isn't a bad method.
If you can calculate the improvement in money, then go for that. One piece of software I wrote many years ago saved a few engineers a little bit of time each day. After figuring in the cost of salary, benefits, and overhead, it turned into savings of more than $12k per year for a small company.
-Adam
Rule of thumb: whichever sounds more impressive.
If you went from 10 tasks done in a period to 12, you could say you improved performance by 20%.
Saying you did two more tasks doesn't seem that impressive.
In your case, both numbers sound good, but try different representations and see what you get!
Sometimes graphics help a lot if the improvement is spread across a number of factors but the combined number somehow does not look that cool.
Example: You have 5 params, A, B, C, D, E. You could make a bar chart with those 5 params and "before and after" values side by side for each param. That sure will look impressive.
God, I'm starting to sound like my friend from marketing!
runs away screaming
You can make numbers and graphs say anything you want; the important thing is to make them say something meaningful and relevant to the audience you're presenting them to. If it's end users, you can show them the differences in screen refreshes (something they understand); for managers, perhaps the reduced number of servers they'll need in order to support the application ($ savings); for finance, it's all about how much money it saved them. A general rule: the less technical the group, the more graphical and dramatic you need to be.
