vote algorithm to avoid cheat

vote algorithm to avoid cheat - algorithm

I would like to build a vote website, something like a +1 of facebook.
Is there some best practice to deal against cheat ?
Best way would be to use facebook, twitter and/or google to have unique user but I would like to let Anonymous user to vote.

You could ward against massive automated cheating using some form of a proof-of-work system. That would require browsers to invest some computation into the vote. A single vote shouldn't take too long, but bulk-voting would require serious computational efforts. Using JavaScript for this voting would also mean that any bulk voting mechanism has to come from a JavaScript-capable browser (or browser emulation), so simple scripts generating HTTP requests would be foiled as well.
A single user manually clicking multiple times won't be considerably delayed by this approach, though. So combine this requirement with logs of IP addresses, user agent strings, cookies, perhaps also flash cookies. The cookies would help to identify a single dial-in user trying to vote repeatedly using a sequence of connections. Although the measures just listed are easily circumvented with a large scale automated cheating attack, they should deal nicely (although not perfectly) with manual voting. So I believe that the two solutions should complement one another quite nicely.
You might want to block anonymity networks like tor. The block should only affect the voting, though, as you'll want to allow anonymous users to view your site, right?
You might also consider requiring unauthenticated users to solve a CAPTCHA. Depending on how important the votes are, and how many of your users are unauthenticated, your users might or might not be deterred from voting by this measure.

From my limited experience, if you don't require authentication, the system will most likely be cheated one way or another: there is no real bulletproof way to tell who is who, unless you record a lot of data pertaining to the client which might be statistically considered unique as a whole (but you can't achieve 100% certainty with this approach).
Of course, if you expect 10 users it's way easier than if you gotta handle millions. IP + user agent check will most likely be fine in the first case, but it would not be very good in the second (NATs come to mind).
Lastly, you have to consider the likelihood of cheating, or in other words: are your users really going to try and cheat the system, or you can get away with checking informations that can quite easily be forged ?

Related

Spam prevention in Rails

I've got a Rails app where users can send messages to other users. The problem is, it's the type of site that draws many spammers who send bogus messages.
I'm already aware of a couple spam services like Akismet (via rakismet) and Defensio (via defender). The problem with these is that it looks like they don't take into account messages the user has already sent. The type of spam I'm seeing on my site is where the user sends the same (or very similar) messages to many other users. As such, I'd like to be able to compare to at least a handful of past messages to ensure they're different enough to not be considered spam.
So far, the best thing I've come across is the Text::Levenshtein distance implementation, which calculates the number of differences between two strings. I suppose I could calculate the number of difference divided by the string length, and if it's above a certain threshold, then it's not considered spam.
One other thing I've come across is Classifier::Bayes, which makes a best guess as to what category something falls into. Still pondering on this one.
I feel like I might just be looking in the wrong place, and maybe there's already a better solution for something like this out there. Perhaps I'm searching for the wrong words to find something a little more useful.

Don't try and roll your own solution for this, it's much more complex than you would expect. It is infact one of those things, like encryption, where it is a much better idea to farm it out to someone/something that is really good at it. Here is some background for you.
Levenshtein distance is certainly a good thing to be aware of (you never know when a similarity metric will come in handy), but it is not the right thing to use for this particular problem.
A Bayesian classifier is much closer to what you're after. Infact spam detection is pretty much the canonical example of where a naive Bayesian classifier can do a tremendous job. Having said that you'd have to find a large collection of data (messages) that has been classified as spam and non-spam and that's similar to the types of messages you get on your site. You would then need to train your classifier and measure its performance. You'd need to tweak it and make sure you don't overfit it etc. While Classifier::Bayes is a decent basic implementation it will not give you a lot of support for this. Infact Ruby does suffer from a lack of good natural language processing libraries. There is nothing in Ruby to compare to python's NLTK.
Having said all of that, services like akismet will certainly have a bayesian classifier as one of the tools they use to determine if what you send them is spam or not. This classifier will likely be much more sophisticated than what you can build yourself, if for no other reason than the fact that they also have access to so much data. They likely also have other types of classifiers/algorithms that they will use, this is their core business after all.
Long story short, if I were you I would give something like Akismet another look. If you build a facility into your site where you or your users can flag messages as spam (for example via rakismet's spam! method), you'll be able to send this data to akismet and the service should learn pretty quickly that a particular kind of message is spammy. So if your users are sending many similar spammy messages, even if akismet doesn't pick this up straight away, after you flag a couple of these all the rest should be picked up automatically. If I were you I would be concentrating my efforts into experimenting in this direction rather than trying to roll your own solution.

What techniques might be successful in monitoring UI events for the purpose of identifying friction points regarding the usability of an application?

What techniques/algorithms might be successful in monitoring UI events for the purpose of identifying friction points regarding the usability of an application?
Most battle-tested, real-world software contains extensive error checking and logging. We frequently employ complex logging systems to help diagnose, and occasionally predict, failures before they happen. We generally focus on reporting the catastrophic failures on the server-side.
These failures are important of course, but I think there is another class of errors that are overlooked, and yet perhaps equally important. Whether you are using an iPhone, blackberry, laptop, desktop, or point-of-sale touch screen, user interaction is typically processed as discrete events. I suspect that identifying patterns of UI events can expose areas where the user is having difficulty regarding efficiently interacting with the application. I found an interesting academic paper on this subject here. I think the ideas presented in the paper are great, but perhaps other, simpler, techniques might yield nice results. What are your ideas and experiences in this area?

Interesting paper. One of the things I get out of it is that it's not easy to make sense out of user event logs unless you have a very specific hypothesis you are trying to test. They can be very useful, for example, if you know it's taking users too long to complete Task X or they're failing to complete it altogether. It's clearly a whole different ballgame to try to analyze sequences without any other supporting information and make sense out of them (though it can be done if you use the sophisticated techniques mentioned in the paper).
One simpler method would be to simply measure the total time to complete a given task that you know is common and important. If it's a shopping application, for example, the time to complete the check-out on a purchase would probably be something useful to collect. It's not quite that simple, though, because you'd have to at least take into account interruptions (e.g., the user's boss came into the room and he abandoned his shopping for actual work--not that I've ever done this :-P). You could have a simple rule that says, if there were no events logged for X seconds, assume the user is not paying attention to the screen.
Another simple thing you could do would be to check for obvious signs of errors, such as a user employing the "undo" facility or inputting information into an input box in a web app that causes a validation check to trigger (e.g., failing to enter required information and putting information in the wrong format). If certain input boxes result in a high number of errors, it might be a sign you should be more flexible in allowing different formats (e.g., allow users to enter a date as "6/28/09", "6-28-09", "june 28, 2009" instead of requiring a single format).
One other idea: if your application has contextual help, certainly count how many times people use it for each page/section/module of your application.
I doubt anything I'm saying is earth shattering, but maybe it will give you some ideas.
-Dan

I once wrote an app that tracked, for each command, whether it was accessed via the menu or a command-key equivalent; it gave us pretty good insight into which key-equivalents were expendable. We didn't have toolbars at the time, but the same kind of logging and analysis could of course apply to them, or to context-sensitive menus: anywhere you want to provide a limited set of valuable options.

You might want to Google usability testing. I've never heard of it being done by having a program recognize patterns of events, but rather by having humans watch humans use the program.

Should users see their own performance stats in real time

I know this questions seems extremely open ended. I will try to narrow scope.
I have been struggling for some time as to include or exclude real time user performance stats in an application gui.
Does anyone have any info on the harm vs gain in including these stats in an app?
i.e. number of emails answered, number of customer calls taken, average time per customer etc.
The users are begging for more info on their stats, as it is how they are rated. However there is concern that given access to see their performance real time or near real time it will negatively affect their work.
I can kind of equate it to being measured on how many lines of code I churned out in one day. Would this help me to be more productive or just teach me write code as fast as possible and most likely make a lot of mistakes.
In my application I can think of these scenario's
i.e. BAD: "I see I have spent 10 minutes on this issue already, I need to finish this up asap"
vs
i.e. GOOD: "I was able to help that customer quickly, My productivity is good today"

I think there may indeed be a phenomenon one could call “The Facebook Effect,” where by merely presenting a metric (e.g., number of “friends,” minutes per call), you imply that it’s important. However, I don’t think that’s an issue here because it’s clear that users already regard these metrics as very important, otherwise they wouldn’t be begging you for them.
I believe there is a wealth of psychology research showing that adding faster feedback on a metric will very likely improve user performance on that metric. Furthermore, it is emotionally stressful for people to be evaluated on a metric that they cannot track well themselves, which appears to be the case now.
I say show the metrics real time. Management obviously wants high performance on them otherwise they wouldn’t be rating the users on it. User want it, and it’ll reduce their job stress. Sounds like a win-win to me.
Of course, the customers probably lose out, so there’s a bit of an ethical dilemma. However, if these metrics are problem, then the problem lies in corporate policy, not in the user interface. By showing the metrics to the user, you will highlight whatever shortcomings this policy has to the managers. Maybe management will get a clue.
If it’s part of your role, you could propose measurements of customer satisfaction (e.g., after-call automated survey, random follow-up surveys by a human) so that managers can at least track the impacts of their policy. Assuming they care (how are they evaluated?).

KISS principle, is this information pertinent to what the users will be doing?
If no, then no do not add it. One thing that I like on web apps is the time it takes for a page to load after updating / submitting something (basically posting back data).
That way I / a user can tell if there is an issue with data going to the database or a caching issue.
The problem with displaying say for instance calls to customers is your average times may not be as accurate as you'd hope. You may have a customer who likes to chat, or is having technical issues that are irrelevant to your business but yet get reflected in your apps.
This data shouldn't be trusted because of these types of things. Another thing is, if you start displaying call times you end up having employee competitions to see who can get off the phone first..that's when you start hurting yourself more..bad customer service.
A few big names used to rate employees based on things like call volume, average time on the call, etc. Remember when Dell tried to outsource all their technical calls to India? Customers here in the US were frustrated and calls were either too long (not understanding) or too short (Customers did not want to deal with it). Well the big shooters thought hey call times are pretty inline with what we had forecasted and our costs are going down. But it hit rock bottom as time went on...

Your application is useful as a tool to convey this information, and you should definitely include it if users are clamoring for more visibility and transparency into the data it collects. A user's response to the information, however, will largely depend on what the organizational culture already does with it.
For example, does management aggressively encourage users to keep calls short and deal with as many customers as possible, and fire those who don't meet quotas? If so, that's going to provoke an equally strong reaction from users to keep their calls short when they discover they've been a bit on the longer side. ("Uh-oh, I'm getting to the 2-minute mark. I'd better hang up and fake a disconnection to avoid getting my average call length too high.")
Conversely, if management simply encourages everyone to do a good job and provide excellent customer service, this information can be synthesized by users in the overall context of this work. ("I've spent a lot of time on this customer -- I should see if I can wrap things up shortly, or escalate him if I can't fix this problem.")

How can I estimate a Web site build (refresh) when I don't know all of the site's features?

I know there are several estimating questions here, and I have read through most of them, but this one is slightly different. If you're doing a refresh on a Web site, it might include usability enhancements that increase the hours for page production and development. We'll never look at a Web site and say to ourselves the way it is now is the way it will be in the future. If that were the case, then our clients wouldn't be looking for our expertise. Should it always be a requirement to do a team brainstorm before responding to an RFP or creating a formal statement of work? What if those doing the brainstorming are not doing the final work? We can only inventory the current site to a certain extent, and I'm starting to think we should make estimates only for what we know, letting the potential client tell us where we're missing certain elements.

In your proposal, be fairly specific about what you saw on the current web site (how many pages/resources? are they low/med/high complex? What are the high level features you see already existing (i.e. search, security, AJAX, profiles, etc.) and what would you consider adding? Give ranges, rather than specific estimates.
The more detail you give about what you saw without knowing the requirements will help the client believe you didn't just shoot the proposal through an RFP chute, and that you're serious about the work, without tying you to a commitment to deliver more/faster than you can. Clients do understand that you can't really make a reliable estimate, but the more they believe you have made a serious attempt, the more they're likely to commit their time to helping you understand the requirements towards making a final proposal and SOW.

When you don't know what your estimating, expect the estimation to be inaccurate (and, at best, approximate).
State your assumptions ("X if we do Foo or Y if we don't").
State what you would need to reduce uncertainty ("we need to spend an hour with the client to gather requirements before we can provide any estimate").

Understanding Users - Does Performance Trump Looks?

It seems to me that whenever a GUI (Graphical User Interface) is involved, the look and feel of the interface nearly always trumps the performance of the application.
Is this a universal phenomenon?

Sufficiently bad looks trump any level of good performance.
Sufficiently bad performance trumps any level of good looks.

This boils down to the psychology of your target audience and about the architecture of your application. If the GUI reacts quickly and is laid out in such a way that it is intuitive to the user (as opposed to the developer), then the underlying layers may not need to perform so well. If however, the user wants to get data from a database and they're left hanging while the data loads, they're going to feel very differently. Compare 2 web applications just as an example:
Application one feels quite responsive but under the covers things take longer than it appears on the surface - it uses AJAX to talk to Web Services. The Web Services aren't the quickest, but everything happens on callback (asynchronously), so the user isn't held up while fields populate. It doesn't impede their workflow. On a bad day when network performance deteriorates the background performance, sure it's noticeable, but user activity isn't impeded any further than normal.
Application two doesn't feel quite so responsive. Everything happens on postback, there's no AJAX or Web Services, on a good day the page loads are fairly quick. Of course, on every postback the user's workflow is impeded while they wait for the page to reload. On a bad day, network performance causes performance to deteriorate noticeably, further impeding the user.
Application one is far less likely to get complaints because the user isn't held up even though fields aren't loaded so quickly. The user can enter data and move on. Everything is handled asynchronously. Of course, in the background, the Web Service process may actually be slower than the full page refresh but the user isn't going to care so much.
From many thousands of hours writing software and directly interacting with my users - frequently those who aren't necessarily as computer literate as your average 10 year old I've noted these points that are key to getting acceptance from just such an audience [written from a user perspective]:
It must do what I want how I want it: Don't just read the spec and expect your code to meet exactly what it says on the paper. Really read what it says on the paper and understand what the user meant by that. Design to the underlying meaning of the words not the black and white of the ink on the paper. If you don't understand exactly what I meant, come and talk to me and I'll explain it until you do. I'll be less happy if you deliver software that missed my whole point than I will by your questions. I'll feel much happier if I get the feeling you're on my side by really trying to understand me.
It must assist my workflow and not impede it: It's great if all I have to do is push one button to complete what would've taken me an hour to do before, but if it freezes my computer for the 20 minutes it takes to complete the task, I'm not going to be a happy camper.
It must be intuitive to use: That means I don't want to have to wade through the documentation you didn't provide me with in order to figure out how to use it. Neither do I want a 20 minute explanation that I'm going to forget 3 minutes after you walk out of the door. Design the software such that my 10 year old could figure it out as easily as they can program the PVR. It means that I should interact with it in a manner that seems logical to me as the person that will be using it day in day out. It doesn't matter if it's functionally correct, if I can't figure out how to use it, I'm not going to use it, much less pay for it.
It must be responsive: I don't want to have to click a button and then wait 10 seconds for a list to load and then select an item from that list and then wait for another screen to load before I can select an action to complete on that item which then takes 5 minutes to complete. Find a way to load the data quickly - if you can't load the data quickly in response to my action, then figure out a trick to make it feel like the data is loading quickly - perhaps by loading it in advance in the background and only displaying what I need displayed in response to my action... my point is, I don't care what you do, just make it appear like it's doing it quickly.
It must be robust: It doesn't matter what I throw at it, it should accept it and move on. If I do something wrong or put something incorrect in a field, tell me - IN PLAIN ENGLISH!! I don't care about buffer overflows or IOExceptions thrown at line 479 while attempting to open a file. Just handle it and tell me what I did wrong in language I understand.
Give me documentation: Okay, I know I'm not going to read it, and I'm more likely to pick the phone up and call you than remember where I put it when you gave it to me. But knowing its there makes me feel warm and fuzzy inside. It shows that you cared about the software enough - and me enough to write instructions that I can reference oustide business hours when you're not available.
Price: This depends entirely on your audience, but in my experience, if you met all of the above points, price tends to be far less of a concern than it might appear on the surface.

Although "you can't judge a book by its cover", people often do with software. I don't know if I would say this is "universal", but certainly common.

I don't think it's even a true phenomenon. I don't care how zippy the "look and feel" is, if it takes second to echo a keypress, the UI experience will suck. If it takes a long time to repaint the page for small changes, the UI will suck.
Now, as long as the response time of the application is less than some amount, then the look and feel will be a big part of the satisfaction.
Check out some of Jakob Neilsen's books on this.

Isn't it a bit of a false dichotomy? If the look and feel of an application isn't clean, well-organized and effective then you don't have a high-performing application. No matter how fast it may be.

I've found that the best combination is a snappy and easy-to-use GUI. This doesn't necessarily mean your app has to have great performance, but having the GUI freeze on you is a kiss of death. The iPhone's Safari does this well--you can continue to scroll around the screen even if the rendering engine can't keep up with you. Yeah, the no-content hatch marks are ugly, but at least the user knows he's in control.

I think it depends on the users. I work in a medium sized company in the IT department constructing web based software for consumption by the employees of said company. The users range from Human Resources, Manufacturing, R&D, Sales, Finance, to making applications for the CEO. Each of the different departments and users within those departments seem to have different criteria for what makes a quality application.
For instance, a Human Resource department usually deals with a lot of textual data. They spend heaps of time inputting things into forms like employee information, entitlements, recruiting etc. These types of users might opt for the look and feel of an application for this purpose, they want an aesthetically pleasing design that is easy to navigate and intuitive.
On the other hand a department like finance might favor performance in their reporting tools. I have had a few experiences with large SQL tables with complicated queries that took a considerable amount of time to execute. Users that run these kinds of reports many times a day soon get fed up of waiting and would gladly lose a bit of interface intuitiveness in exchange for time that could be spent on other tasks.
So, i would say that you can't make a blanket statement like "All users prefer a speedy application" or "All users like pretty applications". It really depends on the users preferences, their job requirements, and the applications purpose.

Balance is everything.
The UI needs to look respectable, professional and flow similar to other applications so the user has a common experience, thus little learning curve. It shouldn't have unecesssary whistles and bells unless specifically requested.
The performance should be at least tolerable. If you have extra time in a project, I would spend that time speeding it up unless the user specifically asks for UI enhancements. Many times, whistles and bells can compromise performance as some UI enhancements require additional CPU time AND sometimes add awkwardness to the app. At first glance, some of these apps look cool but long term usability suffers in favor of the NEATO factor.

Important for the user is that using the program is fun. The program should not only be able to do what I want it must feel good to use the program.
Making the user wait at moments the user does not understand or foresee isn't fun.
Crashing and making errors isn't fun either.
But looking good and helping me doing my task through the look and let me work fast and without work flow interruptions is fun.
Programmers often think that programs that are slow and use much memory are bad and they measure all their software on memory usage and the use of the processor. But most of you users won't start top or the windows task manager and look at the footprint of your program they will use it and if if feels good to use the program, and the rest of their computer with the program running they will fell good to.
One thing I read about often is the usage of as many CPUs as possible to get a task for the user done in the shortest time. Is this high performance? Your program is very fast. But the Computer is very slow at the moment and switching to the email program because I know the task will take its time is a pain in the ass. So sometimes you may want to free some resources to improve the feeling of your program. Even if that would slow down your own program.

The most important are price, functionality, compatibility, and reliability.
Looks and performance are both, relatively unimportant and in practice they are both therefore unable to "trump" anything:
Compatibility: for example, in the real world I use MS Word, not because it's fast or pretty but because it's compatible with everyone else who uses it.
Functionality: when I want to book a train in France, I use http://www.voyages-sncf.com/ not because it's fast or pretty (or even outstandingly reliable) but because it has the necessary functionality.
Reliability: if an application crashes then I'm probably not going to use it again, no matter how fast it crashes, or how nice it looked before it crashed.
Price: etc. (say no more).

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio