How to fairly rank teams in a question game

We have a question game with yes/no answers. Multiple teams participate, and each team has a different number of players. Each player answers questions, and players can join the game after some questions have already been completed. How can we fairly compute a total score for each team so that we can rank the teams?

I would just use the number of correct answers.

First things first: if you have more than one statistic, you have more than one metric. I see an almost infinite number of ranking possibilities. Here are the ones that jump out at me:
1. Use the average correct-answer percentage of the players on the team.
2. If you have tournaments, keep a ranking by tournament win percentage. (You could also use a chess-style rating to rank the tournaments themselves.)
3. Track how many people get each question right or wrong. A player's score for getting a question right is (1 - q), where q is the fraction of players who got that question right; getting it wrong costs q points. This means that as other people answer questions, your score may go up or down - which it should, since the purpose is to make scores relative to the other players. (See the sketch after this list.)
I'll edit more in as I think of them (if I think of them). I really like option 3, though!
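For illustration, here is a minimal Python sketch of option 3. The data shapes, function names, and the team-averaging step are my own assumptions, not anything from the original post:

```python
def question_difficulty(answers):
    """Fraction q of players who answered this question correctly."""
    correct = sum(1 for a in answers if a)
    return correct / len(answers) if answers else 0.0

def player_score(player_answers, all_answers):
    """
    player_answers: {question_id: True/False} for one player.
    all_answers:    {question_id: [True/False, ...]} for everyone.
    A right answer earns (1 - q), a wrong one costs q, so scores shift
    as more players answer - the ranking is relative by design.
    """
    score = 0.0
    for qid, correct in player_answers.items():
        q = question_difficulty(all_answers[qid])
        score += (1.0 - q) if correct else -q
    return score

def team_score(team, all_answers):
    """Assumption: averaging per-player scores keeps teams of
    different sizes (and late joiners) comparable."""
    scores = [player_score(p, all_answers) for p in team]
    return sum(scores) / len(scores) if scores else 0.0
```

Because q is recomputed from the current answer pool, re-running the scoring after each round gives exactly the "your score may go up or down" behaviour described above.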

Related

Is there a mature algorithm or theory for a best-distribution scenario? [closed]

Scenario:
We have 10 kinds of toys, and each kind includes 10 toys.
We will distribute the toys to 100 children. Every child has a different degree of satisfaction for each of the 10 kinds. Note: in the real project there will be 300,000+ children's records in my database.
My question is: how do we measure and define the best distribution, and how do we compute it? Please give me a hint.
Some friends suggested I try the KM (Kuhn-Munkres) algorithm, but I'm not sure it will work for me.
Thanks.
This problem is hard because you haven't decided what you want to optimize, and because many optimization methods will be expensive to run if you have 300K children - or customers - to worry about.
What do you want to optimize? If you try to optimize the sum of per-child satisfaction scores, can you really compare the subjective satisfaction of two different children, let alone add them up to produce anything sensible? If you decide on such a system, can you prove that it cannot be distorted by children who decide to lie about their satisfaction, for instance by claiming they will be devastated if they don't get one particular toy?
What if somebody decides that the sum of satisfaction scores isn't the right metric, but instead that you should minimize the dis-satisfaction of the most dis-satisfied child?
What if somebody decides that inequality is the real problem, so if there is one very happy child, you should take away their toy and give it to somebody else to minimize the difference in satisfaction between the most and least satisfied child?
What if somebody decides that some children count more than other children, because of something their great-grandparents did, or didn't do?
Just so as not to be completely negative, here is a cheap scheme, and an attempt to prove a property about it. Put the children in random order and allocate the toys as if each child chose in that order: each child gets the toy they most prefer among the toys still left when their turn comes. (A sketch follows below.)
One property you might want from a method of choosing is that, after the toys are distributed, children can't trade toys amongst themselves to produce a better distribution, making you look silly (i.e., the allocation is Pareto optimal). Suppose such a pattern of trades were possible among the children in this scheme. Consider the trading child who came first among those children in the initial randomization. They chose the toy they wanted most from all those still available, so there is in fact nothing the other trading children could offer them that they would prefer. So this scheme is at least not vulnerable to later trades.
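A minimal Python sketch of this scheme (sometimes called random serial dictatorship); the data shapes are illustrative assumptions:

```python
import random

def allocate(children, stock):
    """
    children: {child_id: [toy kinds, most preferred first]}
    stock:    {toy_kind: count remaining}, e.g. 10 kinds x 10 each.
    Returns {child_id: toy_kind or None if nothing is left}.
    """
    order = list(children)
    random.shuffle(order)              # the random priority order
    allocation = {}
    for child in order:
        # First toy kind on the child's preference list still in stock.
        choice = next((t for t in children[child] if stock.get(t, 0) > 0), None)
        if choice is not None:
            stock[choice] -= 1
        allocation[child] = choice
    return allocation
```

This runs in one pass over the children, which matters at the 300,000-record scale mentioned in the question.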

Distribution among users for collaborative voting algorithm

Users of my application (it's a game actually) answer questions to get points. Questions are supplied by other users. Due to volume, I cannot check everything myself, so I decided to crowd-source the filtering process to the users (players). The rules are simple:
each user is shown a question to rate as good/bad/unsure
when a question is rated 5 times as "bad", it is removed from the pool
when a question is rated 5 times as "good", it is removed from the pool and flagged to be played by other players who have not seen it
If everyone could see everything, this would be easy. However, in the later game phase users shouldn't get questions they have already seen. This means users should not see all of the questions during rating: exactly the questions they don't see are the ones they will get to play (answer) later in the game.
Total number of questions is much larger than number of players, new questions are added daily and new players come all the time, so I cannot just distribute in advance.
I'm looking for some algorithm that would maximize the number of rated playable (i.e. unseen) questions for all players.
I tried to google, but I'm not even sure which terms to put in the search box, and using stuff like "distribution", "voting", "collaborative filtering" gives very interesting but unusable results.
The ratio of good to bad questions is 1:3, i.e. 25% of questions are rated good. The number of already submitted unrated questions is over 10,000. The number of active users with the privilege to vote is around 150.
I'm currently considering splitting the question pool and the user base into 2 parts. One half of the user base would check the questions for the other half, and vice versa. Splitting the questions is easy (odd vs. even, for example). However, I'm still not sure how to divide the user base. I thought about using odd/even positions in the "top question checkers" list, but the positions on the list change daily as new questions are checked.
Update: I just asked a sequel to this question - I need to periodically remove a fixed number of questions from the pool.
I'm not aware of a specific, well-known algorithm for this. However, this would be my line of thinking:
"Maximize the number of rated playable (i.e. unseen) questions for all players" means both maximizing the number of questions rated "good" five times and maximizing the number of questions each player has not yet seen.
Whatever the algorithm is, its effectiveness is tied to both the quality of the questions submitted by the contributors and the willingness of other players to rate (unless you force them to rate questions).
The goal of your system should not be that all players have the same number of unseen questions [that is in fact irrelevant], but rather that each player always has a "reserve" of unseen questions that allows him/her to play at his/her normal game rate. For example, say you have two users A and B who play regularly on your site. A normally answers 80 quizzes per day, while B answers only 40. If your system on average gets 100 new approved questions daily, in principle you would want player A to never see more than 20 of those during rating each day, while player B could safely see 60 of them.
The ratio between submitted and approved questions also matters: if only every second submitted question is good, then users A and B above could rate 40 and 120 submitted questions daily, respectively.
So my approach to the final algorithm would be:
Keep track of the following:
Number of new questions submitted per day (F = Flow)
Ratio between good/total submitted questions (Q = quality)
Number of questions used (for playing, not for rating) by each player per day (GR = Game Rate)
Number of questions rated by each player on a given day (RC = Review Counter)
Establish a priority queue of questions to be rated. The goal here is to have approved questions as fast as possible. Give a bonus priority to both:
questions that have collected upvotes already
questions submitted by users whose earlier questions have already been accepted.
When a player is involved in rating, show him/her the first question in the queue.
Repeat step 3 as much as you want, making sure this condition is never met: Q * (F - RC) < GR. (A sketch of this check follows below.)
[The above can be further tweaked, considering the fact that when the user first register, there will be already a pool of approved but unseen questions on the site]
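To make steps 2-4 concrete, here is a minimal Python sketch; the field names and the bonus weighting are my assumptions, not anything specified above:

```python
import heapq

def priority(question, submitter_accept_rate):
    # Lower value = rated sooner; upvotes already collected and the
    # submitter's acceptance history both earn a bonus (weight of 2.0
    # for the history is an arbitrary assumption).
    return -(question["upvotes"] + 2.0 * submitter_accept_rate)

def push(queue, question, submitter_accept_rate):
    # The id breaks priority ties so heapq never compares dicts.
    heapq.heappush(queue, (priority(question, submitter_accept_rate),
                           question["id"], question))

def next_to_rate(queue, seen_by_player):
    """Pop the highest-priority question this player hasn't seen yet."""
    skipped, picked = [], None
    while queue:
        entry = heapq.heappop(queue)
        if entry[1] not in seen_by_player:
            picked = entry[2]
            break
        skipped.append(entry)
    for entry in skipped:          # restore the ones we skipped over
        heapq.heappush(queue, entry)
    return picked

def may_keep_rating(F, Q, RC, GR):
    # Enforce step 4: never let Q * (F - RC) drop below GR, i.e. the
    # player must keep enough unseen approved questions to play at
    # their normal game rate.
    return Q * (F - (RC + 1)) >= GR
```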
Of course you can heavily influence the behaviour of users by giving incentives for meritorious activity (badges and reputation points on SO are a self-explanatory example).
EDIT/ADDENDUM: The discussion in the comments clarifies that GR is fixed at one question per day. Furthermore, the OP states that there will be at least 1 new approved question in the system every 24 hours. This means the above algorithm can be simplified into one of two forms:
If the user can vote only AFTER he answered his daily question:
If there is at least one approved, unseen question in the system, let the user vote at will.
If the user can vote even BEFORE answering his daily question:
If there are at least two approved, unseen questions in the system, let the user vote at will.
This ensures that if a user votes on every votable question in the system and then answers his daily question at 23:59, there will still be a question available to answer at 00:00, plus 24 hours for the system to acquire a new question for the following day.
HTH!

Hot Or Not / Facemash algorithm - Why Elo's Rating Algo? [closed]

In the movie The Social Network, I saw that Mark used the Elo rating system.
But was the Elo rating system necessary?
Can anyone tell me the advantage of using Elo's rating system?
Could the problem be solved in the way below, too?
Is there any problem with this algorithm [written below]?
Table structure:
Name: the woman's name
Pic_Name [pk]: path to the picture
Impressions: number of times the image was shown
Votes: number of times people selected her as hot
Now we randomly show 2 photos from the database, and the hottest woman is the one with the maximum number of votes.
Before voting to close or downvoting, please write your reason.
But was that necessary?
No; there are several different ways to implement such a system.
Can anyone tell me the advantage of using Elo's rating system?
The main advantage and the central idea in Elo's system is that if someone with low rating wins over someone with high rating their ratings are updated by a larger number, than if the two had similar rating to start with. This means that the ratings will converge fairly quickly.
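For illustration, here is a minimal sketch of the standard Elo update; the K-factor of 32 is a common convention, not something from the question:

```python
def expected(r_a, r_b):
    """Probability that A beats B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def elo_update(r_winner, r_loser, k=32.0):
    e = expected(r_winner, r_loser)
    # An upset (low-rated winner) has a small e, so the adjustment
    # k * (1 - e) is large; a predictable result barely moves ratings.
    return r_winner + k * (1.0 - e), r_loser - k * (1.0 - e)

# Example: a 1400-rated player beating an 1800-rated player gains
# about 29 points, while beating a 1400-rated peer would gain only 16.
print(elo_update(1400, 1800))
```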
I don't really see how your approach is a good one. First of all, it seems to depend on how often a pic is randomly selected for potential upvoting. Even if you showed all pics equally many times, the property described above wouldn't hold: if someone wins over a really hot girl, the winner still gets only a single vote. This means your approach wouldn't converge as quickly as Elo's system; in fact, the approach you propose doesn't converge to steady rating values at all.
Simply counting the number of votes and ranking women by that is not adequate and I can think of two reasons why:
What if a woman is average-looking but by luck her picture gets displayed more often? Then she would get more votes and her ranking would rise inappropriately.
What if a woman is average-looking but by luck your website always compares her to ugly women? Then she would get more votes and her ranking would rise inappropriately.
I don't know much about the Elo rating system but it probably doesn't suffer from problems like this.
It's a movie about geeks. Elo is a geeky way to rate competitors on the basis of the results of pairwise contests between them. Its association with chess adds extra geekiness. It's precisely the kind of thing that geeks in movies should be doing.
It may have happened exactly that way in real life too, in which case Zuckerberg probably chose Elo because it's a well-known algorithm for doing this, one that has been used in practice in several sports. Why go to the effort of inventing a worse algorithm?

Calculating scores from incomplete league tables

When I was in high school and learning about matrices, we were shown a technique that would help in a situation like this:
There are a number of chess players in a league, and they need to determine a ranking for all of them, but they don't have enough time for every player to play every other player. If it ends up that Player A beats Player B, and Player B beats Player C, you can say with some level of certainty that Player A is better than Player C, and therefore award some points to Player A in lieu of them actually playing each other.
As I said, this was a little while ago and I can't remember how to actually perform the algorithm, but I think it was called something like a "domination matrix". Searching the web for that has been fruitless and scary at times, so I don't think that's right.
Can anyone give me some help? Ideally an algorithm I can use for this program I'm working on, but even just a pointer to some more information about the procedure.
It sounds like you are remembering a presentation of the Perron-Frobenius theorem - which is at least a safer search term :-). One such presentation is at http://www.math.utah.edu/~keener/lectures/rankings.pdf
Chess players use the Elo system, described at http://en.wikipedia.org/wiki/Elo_rating_system and http://www.chesselo.com/, which would be easier to implement. It is possible that there is no good ranking even if you know everything - see http://en.wikipedia.org/wiki/Nontransitive_dice. People modelling soccer games usually keep track of defensive and offensive strengths separately.
What you are describing sounds like a Swiss-system tournament, or a very similar variation, as described in the linked Wikipedia entry. Rather than calculating ratings from an incomplete tournament, though, it is a way to organize a tournament: it pairs the best players with the best and the worst with the worst, determining a ranking without the need for everyone to play everyone else.
Maybe some type of PageRank algorithm might work for you.
Imagine every person has a webpage in which they hyperlink to every person who defeated them.
Running the PageRank algorithm on this data would give you the steady state of your link matrix, which might indicate the relative importance of each person (I guess).
For example, a person who played only one game but in it defeated someone who defeated lots of people might have a higher PageRank than somebody who defeated 10 people who in turn have not won a single game.
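To make the idea concrete, here is a minimal power-iteration sketch in Python; the game data, the damping factor, and the handling of undefeated players are my assumptions:

```python
def page_rank(games, players, damping=0.85, iters=100):
    """games: list of (winner, loser) pairs."""
    links = {p: [] for p in players}      # loser "links to" their winners
    for winner, loser in games:
        links[loser].append(winner)
    rank = {p: 1.0 / len(players) for p in players}
    for _ in range(iters):
        new = {p: (1.0 - damping) / len(players) for p in players}
        for loser, winners in links.items():
            if winners:
                share = damping * rank[loser] / len(winners)
                for w in winners:
                    new[w] += share
            else:                          # undefeated: spread rank evenly
                for p in players:
                    new[p] += damping * rank[loser] / len(players)
        rank = new
    return rank

# Made-up results: A beat B; B beat C and D; C beat D.
games = [("A", "B"), ("B", "C"), ("B", "D"), ("C", "D")]
print(sorted(page_rank(games, "ABCD").items(), key=lambda kv: -kv[1]))
```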
Perhaps the minimax algorithm?

algorithms to evaluate user responses

I'm working on a web application which will be used for classifying photos of automobiles. The users will be presented with photos of various vehicles, and will be asked to answer a series of questions about what they see. The results will be recorded to a database, averaged, and displayed.
I'm looking for algorithms to help me identify users which frequently don't vote with the group, indicating that they're probably either not paying attention to the photos, or that they're lying about what they see. I then want to exclude these users, and recalculate the results, such that I can say, with a known amount of confidence, that this particular photo shows a vehicle that is this and that.
This question goes out to all you computer science folks: where can I find such algorithms, or get the theoretical background to design them myself? I'm assuming I'm going to have to learn some probability and statistics, and maybe some data mining. Some book recommendations would be great. Thanks!
P.S. These are multiple choice questions.
All of these are good suggestions. Thank you! I wish there was a way on stack overflow to select multiple correct answers so more of you could be acknowledged for your contributions!!
Read The Elements of Statistical Learning; it is a great compendium on data mining.
You may be especially interested in unsupervised algorithms, for example clustering. Assuming that most people do not lie, the biggest cluster is right and the rest are wrong. Mark people accordingly, then apply some Bayesian statistics and you'll be done. (See the sketch below.)
Of course, most data mining techniques are fairly experimental, so don't count on them always being right... or even being right in most cases.
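As a rough illustration only, here is a sketch using scikit-learn's KMeans. It assumes every user answered the same multiple-choice questions; the one-hot encoding and the two-cluster choice are simplifications, not recommendations:

```python
import numpy as np
from sklearn.cluster import KMeans

def flag_outlier_users(answers, user_ids, n_options):
    """answers: int array of shape (n_users, n_questions), where each
    entry is the index of the chosen multiple-choice option."""
    answers = np.asarray(answers)
    # One-hot encode so Euclidean distance compares answer patterns.
    X = np.eye(n_options)[answers].reshape(len(answers), -1)
    labels = KMeans(n_clusters=2, n_init=10).fit_predict(X)
    majority = np.bincount(labels).argmax()
    # Users outside the biggest cluster are the suspected outliers.
    return [u for u, lab in zip(user_ids, labels) if lab != majority]
```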
I believe what you described is solved using outlier/anomaly detection.
A number of techniques exist:
statistical-based methods
distance-based methods
model-based methods
I suggest you take a look at these slides from the excellent book Introduction to Data Mining.
If you know what answers you are expecting, why do you ask people to vote? By excluding some values you basically turn the vote into something that you like. Automobiles make different impressions on different individuals. If 100 people loved a car and then someone comes along and says he doesn't like it, do you exclude the vote?
But anyway, given that you still want to do this, first of all you will need a large set of data from "trusted" voters. This will give you an idea of what a "good" answer looks like, and from that point you can choose the exclusion threshold.
Without an initial set of data you cannot apply any algorithm, because you will get false results. Consider a first vote of 100 on a scale from 0 to 100. The second vote is 1. You would exclude this vote because it is too far from the average.
I think a pretty simple algorithm could accomplish this for you. You could try to get fancier by calculating standard deviations and such, but I wouldn't bother.
Here's a simple approach that should be sufficient:
For each of your users, calculate the number of questions they answered and the number of times they selected the most popular answer to each question. You can guess that the users with the lowest ratio of popular-answer picks to total answers are the ones providing bogus data. (See the sketch below.)
You probably would not want to throw out the data from users who have answered only a small number of questions, because they may simply have disagreed on a few rather than put in bogus data.
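A minimal Python sketch of this, with made-up data shapes; the min_answered cutoff implements the caveat above about users who have answered only a few questions:

```python
from collections import Counter

def agreement_ratios(votes, min_answered=10):
    """
    votes: {question_id: {user_id: answer}}.
    Returns {user_id: fraction of times the user picked the most
    popular answer}, skipping users below min_answered.
    """
    hits, totals = Counter(), Counter()
    for by_user in votes.values():
        popular, _ = Counter(by_user.values()).most_common(1)[0]
        for user, answer in by_user.items():
            totals[user] += 1
            hits[user] += answer == popular
    return {u: hits[u] / totals[u]
            for u in totals if totals[u] >= min_answered}
```

Users with the lowest ratios are the candidates to exclude before recalculating the results.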
What kind of questions are they (Yes/No, or 1 to 10?).
You may be able to get away with not discarding anything by using a median instead of a mean. With a mean, extreme outliers in the responses can skew the result, but with a median you may get a better answer. So, for example, if you had 5 answers, order them and pick the middle one.
I think what you are saying is that you are concerned that certain people are "outliers", and they are adding noise to your data, making the categorizations less reliable. So, if you have a Chevy Camaro, and most people say it is either a pony car, a muscle car, or a sports car, but you have some goofball who says it's a family sedan, you would want to minimize the impact of his vote.
One thing you could do is provide a Stack Overflow-like reputation score for users:
The more a user is "in agreement" with other users, the better his or her score would be. For a given user (User X), this could be determined by a simple calculation of what percentage of users who responded to a question chose the same category as User X, then averaging this value over all questions answered.
You may want to multiply this value by the total number of questions answered to encourage people to answer as many questions as possible. (Note: if you choose to do this, it would be equivalent to just summing the percentage-agreement scores rather than averaging them.)
You could present the final reputation score to users, making sure to explain that they will be rewarded for how well their responses agree with those of other users. This will encourage people to answer more questions but also to take care in their answers.
Finally, you could calculate a certainty score for a given categorization by adding up the total reputation score of all people who chose a given category.
Some of these ideas may need some refinement, especially since I don't know your exact situation. Certainly, if people can see what other people chose before they vote, it would be way too easy to game the system.
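Here is a minimal Python sketch of the reputation score described above; the data shapes are my assumptions, and it uses the summing variant noted in the parenthetical:

```python
from collections import Counter

def reputation(votes):
    """votes: {question_id: {user_id: category}} -> {user_id: score}."""
    rep = Counter()
    for by_user in votes.values():
        n = len(by_user)
        counts = Counter(by_user.values())
        for user, category in by_user.items():
            # Fraction of respondents who chose the same category,
            # summed over all questions the user answered.
            rep[user] += counts[category] / n
    return dict(rep)

def certainty(votes, rep, question_id):
    """Total reputation backing each category for one photo/question."""
    out = Counter()
    for user, category in votes[question_id].items():
        out[category] += rep.get(user, 0.0)
    return dict(out)
```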
If you were to collect votes like "on a scale from 1 to 10, how would you rate this car", you could probably use simple average and standard deviation: the smaller the standard deviation, the more unanimous the general consensus is among your voters, and you can flag users who are e.g. 3 standard devs from the average.
For multiple choice, you need to be more careful. Simply discarding all but the most-voted option will do nothing but disgruntle the voters. You need to establish a measure of how significant the winner is with respect to the other options, e.g. flag users who voted for options with less than 1/3 of the winning option's count.
Note that I wrote "flag users", not discard votes. If you discard votes, you can't tell how confident you are about the result ("91% voted this to be a Ford Mustang"). If a user has more than a certain percentage of his votes flagged - well, that's up to you.
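A minimal Python sketch of the two flagging rules; the data shapes are assumptions, while the thresholds (3 standard deviations, 1/3 of the winner's count) come from the text above:

```python
import statistics

def flag_numeric(user_vote, all_votes, n_sigma=3.0):
    """Flag a 1-to-10 style vote more than n_sigma std devs from the mean."""
    mu = statistics.mean(all_votes)
    sigma = statistics.stdev(all_votes)
    return sigma > 0 and abs(user_vote - mu) > n_sigma * sigma

def flag_choice(user_choice, counts):
    """Flag a multiple-choice vote for an option that drew less than
    1/3 of the winning option's count. counts: {option: votes}."""
    winner_count = max(counts.values())
    return counts.get(user_choice, 0) < winner_count / 3.0
```

Note that both functions flag votes without discarding them, so confidence statements like "91% voted this to be a Ford Mustang" remain computable.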
Your trickiest problem, however, will probably be to collect sufficient votes. Depending on how easy the multiple choice problem is, you probably need several times the number of options as votes, per photo. Otherwise the statistics are meaningless.
