How should I deal with accessibility reports? - user-interface

I need to draw up an accessibility report that is imposed by the European Union for some UI/UX projects. The problem is that the guidelines are not exactly clear and I would like to have more explanations regarding the compilation and, as a consequence, the calculation of the resulting score.
In particular I was interested in knowing how to deal with items that can be considered as NA (Not Applicable); for example: let's suppose that the application for which the accessibility declaration is being drawn up does not contain pre-recorded audio or video elements, in in this case the points involved in the report ( AUDIO & VIDEO STATEMENT W3 CONSORTIUM ) are not applicable for evaluation; as such, can they be subtracted and excluded from the total score?
Does anyone have an answer? Understanding how to deal with this point would also allow me to adjust the score, which would also drastically change the compliancy based on the accessibility threshold.
I tried to contact the authorities in charge and they were unable to give me an answer; I have also searched from various sources, but the situation is not very clear.
Thanks!

Related

Practical image processing books - For the example of canny filter

I know there exist many many books about image processing but I need an advise for a particular good one giving practical hints for using the algorithms. I don't need background information about HOW an algorithm works, e.g. HoughTrafo or Canny Filter as I know that already from various books. But I need a good advise on how to use those filters efficiently and in particular on how to set the thresholds etc.
It currently gives me a huge headache on how to chose those values. When I set them to fixed values, they work for one picture and when changing the illumination slightly, the dont work anymore for various reasons. So I wonder on how to dynamically set them from image specific values. I read on SO to e.g. set the canny thresholds to:
low = 0.666*mean(img)
high = 1.333*mean(img)
(http://www.kerrywong.com/2009/05/07/canny-edge-detection-auto-thresholding/)
but somehow I havent had much success with it so far.
I'm interested in good advises for books etc in particular but included the special example on how to determine thresholds for canny to make it a valid SO-question :-)
If you want a practical book, typically the book will be targeted to a specific library.
Here you can find a list of most of the books about the OpenCV library.

How to detect how similar a speech recording is to another speech recording?

I would like to build a program to detect how close a user's audio recording is to another recording in order to correct the user's pronunciation. For example:
I record myself saying "Good morning"
I let a foreign student record "Good morning"
Compare his recording to mine to see if his pronunciation was good enough.
I've seen this in some language learning tools (I believe Rosetta Stone does this), but how is it done? Note we're only dealing with speech (and not, say, music). What are some algorithms or libraries I should look into?
A lot of people seem to be suggesting some sort of edit distance, which IMO is a totally wrong approach for determining the similarity of two speech patterns, especially for patterns as short as OP is implying. The specific algorithms used by speech-recognition in fact are nearly the opposite of what you would like to use here. The problem in speech recognition is resolving many similar pronunciations to the same representation. The problem here is to take a number of slightly different pronunciations and get some kind of meaningful distance between them.
I've done quite a bit of this stuff for large scale data science, and while I can't comment on exactly how proprietary programs do it, I can comment on how it's done in academia and provide a solution that is straightforward and will give you the power and flexibility that you want for this approach.
Firstly: Assuming that what you have is some chunk of audio without any filtering done on it. Just as it would be acquired from a microphone. The first step is to eliminate background noise. There are a number of different methods for this, but I'm going to assume that what you want is something that will work well without being incredibly difficult to implement.
Filter the audio using scipy's filtering module here. There are a lot of frequencies that microphones pick up that are simply not useful for categorizing speech. I would suggest either a Bessel or a Butterworth filter to ensure that your waveform is persevered through filtering. The fundamental frequencies for everyday speech are generally between 800 and 2000 Hz (reference) so a reasonable cutoff would be something like 300 to 4000 Hz, just to make sure you don't lose anything.
Look for the least active portion of speech and assume that is a reasonable representation of background noise. At this point you're going to want to run a series of fourier transforms along your data (or generate a spectrogram) and find the part of your speech recording that has the lowest average frequency response. Once you have that snapshot, you should subtract it from all other points in your audio sample.
At this point should should have an audio file that is mostly just your user's speech and should be ready to be compared to another file that has gone through this process. Now, we want to actually clip the sound and compare this clip to some master clip.
Secondly: You're going to want to come up with a distance metric between two speech patterns, there are a number of ways to do this, but I'm going to assume we have the output of part one and some master file that has been through similar processing.
Generate a spectrogram of the audio file in question (example). The output from this is ultimately going to be an image that can be represented as a 2-d array of frequency response values. A spectrogram is essentially a fourier transform over time where the colour corresponds to intensity.
Use OpenCV (has python bindings, example) to run blob detection on your spectrogram. Effectively this is going to look for the big colorful blob in the middle of your spectrogram, and give you some limits on this. Effectively, what this should do, is return a significantly more sparse version of the original 2d-array that solely represents the speech in question. (With the assumption that your audio file will have some trailing stuff on the front and back ends of recording)
Normalize the two blobs to account for differences in speech speed. Everyone talks at a different speeds, and as such your blobs will probably have different sizes along the x-axis (time). This will ultimately introduce a level of checks in your algorithm that you don't want for the speed of speech. This step isn't needed if you also want to make sure that they speak at the same speed as the master copy, but I would suggest it. Basically you want to stretch out the shorter version by multiplying it's time axis by some constant that's just the ratio of the lengths of your two blobs.
You should also normalize the two blobs based on maximum and minimum intensity to account for people that talk at different volumes. Again, this is up to your discretion, but to fix this you should find similar ratios for the total span of intensities that you have as well as the two recording's max intensities and make sure that these two values match up between your 2-d arrays.
Third: Now that you have 2-d arrays representing your two speech events, that should in theory contain all of their useful information it's time to directly compare them. Luckily, comparing two matrices is a well-solved problem and there are a number of ways to move forward.
Personally I would suggest using a metric like Cosine Similarity to determine the difference between your two blobs, but that's not the only solution and while it'll give you a quick validation, you can do better.
You could try subtracting one matrix from the other and get an evaluation of how much difference there is between them, which would probably be more accurate than simple cosine distance.
It might be overkill, but you could assume that there are certain regions of speech that matter more or less for evaluating difference between blobs (it might not matter if someone uses a long i instead of a short i, but a g instead of a k could be a different word entirely). For something like that you'd want to develop a mask for the difference array in the previous step and multiply all your values by that.
Whichever method you choose, you can now simply set some difference threshold and make sure that the difference between the two blobs is below your desired threshold. If it is, the captured speech is similar enough to be correct. Otherwise have them try again.
I hope that's helpful, and again, I can't assure you that this is the exact algorithm that a company uses since that information is hugely proprietary and not open for the public, but I can assure you that methods similar to these are used in the very best papers in academia and that these methods will get you a great balance of accuracy and ease of implementation. Let me know if you have any questions, and good luck with your future data science exploits!
The musicg api https://code.google.com/p/musicg/
has a audio fingerprint generator and scorer
along with source code to show how its done.
I think it looks for the most similar point in each track, then scores based on how far it can match.
It might look something like
import com.musicg.wave.Wave
com.musicg.fingerprint.FingerprintSimilarity
com.musicg.fingerprint.FingerprintSimilarityComputer
com.musicg.fingerprint.FingerprintManager
double score =
new FingerprintsSimilarity(
new Wave("voice1.wav").getFingerprint(),
new Wave("voice2.wav").getFingerprint() ).getSimilarity();
Idea:
The way biotechnologists align two protein sequences is as follows: Each sequence is represented as a string on an alphabet as(A/C/G/T - these are different types of proteins, irrelevant for us), where each letter (here, an entry) represents a particular amino acid. The quality of an alignment (its score) is calculated from the similarity of each pair of corresponding entries, and the number and length of the blank entries that need to be inserted to produce that alignment.
Same algorithm (http://en.wikipedia.org/wiki/Needleman-Wunsch_algorithm) can be used for pronunciation, from substitution frequencies in a set of alternate pronunciations. Then you can calculate alignment scores to measure the similarity between the two pronunciations in a way that is sensitive to the differences between phonemes. Measures of similarity that can be used here are Levenshtein distance, phoneme error rate, and word error rate.
Algorithms
The minimum number of insertions, deletions and substitutions required for transformation of one sequence into another is the Levenshtein distance. More info at http://php.net/manual/en/function.levenshtein.php
Phoneme error rate (PER) is the Levenshtein distance between a predicted pronunciation and the reference pronunciation, divided by the number of phonemes in the reference pronunciation.
Word error rate (WER) is the proportion of predicted pronunciations with at least one phoneme error to the total number of pronunciations.
Source: Did an Internship on this at UW-Madison
A carefully configured Levenshtein distance should do the trick.
I know this question is out of date but...
To solve a similar problem I used Google Speech Recognized API to check WHAT was said and visual compare scaled wave forms of volume changes to detect differences in rhythm.
Code & video of the result.
you can use Musicg https://code.google.com/p/musicg/ as roy zhang suggested. In android, just include musicg jar file in your android project and use it. A tested example:
import com.musicg.wave.Wave;
import com.musicg.fingerprint.FingerprintSimilarity;
//somewhere in your code add
String file1 = Environment.getExternalStorageDirectory().getAbsolutePath();
file1 += "/test.wav";
String file2 = Environment.getExternalStorageDirectory().getAbsolutePath();
file2 += "/test.wav";
Wave w1 = new Wave(file1);
Wave w2 = new Wave(file2);
FingerprintSimilarity fps = w1.getFingerprintSimilarity(w2);
float score = fps.getScore();
float sim = fps.getSimilarity();
Log.d("score", score+"");
Log.d("similarities", sim+"");
Good luck
If this is only to check the pronunciation [of course with different accent], you can do this :
Step 1 : Using some voice tool [say dragon dictation], you can have the text with you.
Step 2 : Compare the string or the word formed and compare it with the string that actually was meant to be pronounced.
Step 3 : If you find any discrepancy in the strings, means the word was not spelled correctly. And you can suggest the correct pronunciation.
You have to look into speech recognition algorithms. I understand that you don't need to translate speech to text (that is done by speech recognition algorithms), however, in your case many algorithms would be the same.
Probably, HMM would be helpful here (hidden markov models).
Also look into here: http://htk.eng.cam.ac.uk/

Algorithm to handle data aggregation from multiple error-prone sources

I'm aggregating concert listings from several different sources, none of which are both complete and accurate. Some of the data comes from users (such as on last.fm), and may be incorrect. Other data sources are highly accurate, but may not contain every event. I can use attributes such as the event date, and the city/state to try to match listings from disparate sources. I'd like to be reasonably certain that the events are valid. It seems like it would be a good strategy to consume as many different sources as possible to validate listings on error-prone sources.
I'm not sure what the technical term for this is, as I'd like to research it further. Is it data mining? Are there any existing algorithms? I understand a solution will never be completely accurate.
Here is an approach that locates it within statistics - specifically, it uses a Hidden Markov Model (http://en.wikipedia.org/wiki/Hidden_Markov_model):
1) Use your matching process to produce a cleaned list of possible events. Consider each event to be marked "true" or "bogus", even though the markings are hidden from you. You might imagine that some source of events produces them, generating them as either "true" or "bogus" according to a probability which is an unknown parameter.
2) Associate unknown parameters with each source of listings. These give the probability that this source will report a true event produced by the source of events, and the probability that it will report a bogus event produced by the source.
3) Notice that if you could see the markings of "true" or "bogus" you could easily work out the probabilities for each source. Unfortunately, of course, you can't see these hidden markings.
4) Let's call these hidden markings "Latent Variables" because then you can use the http://en.wikipedia.org/wiki/Em_algorithm to hillclimb to promising solutions for this problem, from random starts.
5) You can obviously make the problem more complicated by dividing events up into classes, and giving sources of listing parameters which make them more likely to report some classes of events than others. This might be useful if you have sources that are extremely reliable for some sorts of events.
I believe the term you are looking for is Record Linkage -
the process of bringing together two or more records relating to the same entity(e.g., person, family, event, community, business, hospital, or geographical area)
This presentation (PDF) looks like a nice introduction to the field. One algorithm you might use is Fellegi-Holt - a statistical method for editing records.
One potential search term is "fuzzy logic".
I'd use a float or double to store a probability (0.0 = disproved ... 1.0 = proven) of some event details being correct. As you encounter sources, adjust the probabilities accordingly. There's a lot for you to consider though:
attempting to recognise when multiple sources have copied from each other and reduce their impact
giving more weight to more recent data or data that explicitly acknowledges the old data (e.g. given a 100% reliable site saying "concert X to be held on 4th August", and a unknown blog alleging "concert X moved from 4th August to 9th", you might keep the probability of there being such a concert at 100% but have a list with both dates and whatever probabilities you think appropriate...)
beware assuming things are discrete; contradictory information may reflect multiple similar events, dual billing, same-surnamed performers etc. - the more confident you are that the same things are referenced, the more the data can combined to reinforce or negate each other
you should be able to "backtest" your evolving logic by using data related to a set of concerts where you now have full knowledge of their actual staging or lack thereof; process data posted before various cut-off dates prior to the events to see how the predictions you derive reflect the actual outcomes, tweak and repeat (perhaps automatically)
It may be most practical to start scraping from the sites you have, then consider the logical implications of the types of information you're seeing. Which aspects of the problem need to be handled using fuzzy logic can then be decided. An evolutionary approach may mean reworking things, but may end up faster than getting bogged down in a nebulous design phase.
Data mining is about finding information from structured sources like a database, or a post where the fields are separated for you. There's some text mining in here when you have to parse the information out of free text. In either case, you could keep track of how many data sources agree on a show as a confidence measure. Either display the confidence measure or use it to decide if your data is good enough. There's lots to play with. Having a list of legitimate cities, venues and acts can help you decide if a string represents a legitimate entity. Your lists might even be in a database that lets you compare city and venue for consistency.

POS UI design & development: what should be included & avoided? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 4 years ago.
Improve this question
I'm having to design & develop UI for a Point of Sale (POS) system.
There are obvious features that need to be included, like product selection & quantity, payment method, tender amount, user login (as many users will use one terminal), etc.
My question is related more towards the UI design aspect of developing this system.
How should UI features/controls be positioned, sized?
Is there a preferred layout?
Are their colours I should be avoiding?
If you know of any resources to guide me, that would also help.
The reason this is critical to me as I am aware of the pressurized environment in which POS systems are used & I want to make the process as (i) quick, (ii) simple to use and (iii) result driven as possible for the user to service customers.
All answers, info & suggestions welcome.
Thanks.
P.s. If you could mention the "playoff" between controls that would also be appreciated (i.e. if touch screen a keypad control is provided, but if also supporting keyboard & mouse input how do you manage the keypad & UI space effectively?)
A couple of thoughts from a couple of projects I've worked with:
For the touch screen ensure that each button can be pressed by someone with "fat fingers" as easily as smaller ones (some layouts encourage the use of thumbs at particular locations). Also highlight each button when it is pressed (with a slow-ish fade if you have spare CPU cycles).
Bigger grids are better than smaller ones. The numeric pad should always be in the same place (often the bottom right). The Enter/Tender/etc. "transaction" keys should be bigger than the individual numeric keys - (1) make it more obvious where it is, (2) it will be pressed more often than other screen areas and will wear out (a bigger area will last longer on average; this was more important with older style touch screens; newer technology is more resilient).
Allow functions/SKUs to be reassigned to different grid positions; the layout that works well for one store will likely be wrong for a slightly different one.
Group related functions by colour, but use excellent contrasts. Make sure that the fore/back combination looks good at all angles (some LCDs "bleed" colours left-to-right and/or top-to-bottom angles).
Positive touch screen feedback with sounds needs to have configurable volume and sound sets. Muted tones might be better in an quieter upmarket store, but "perky" sounds are better in a clothing store with louder background music/noise, etc.
Allow the grid size to be specified in percentages or "grid-block units" instead of pixels and draw everything with vectors, etc. since some hardware combinations may have LCDs with better resolution. (One system I worked on was originally specified as 640x480 but shipped at 1280x1024, so my design pre-planning saved a lot of rework later.)
And of course look at the ready-made solutions first (especially if you can get demo software/hardware for evaluation). Although they can be expensive they've often implemented a lot of things that'll you'll have to work through later, and may be cheaper in the long run, even after creating custom add-ons for your system.
Also:
Our UI supported a normal keyboard/mouse combo too (the touchable buttons were just standard button controls sized appropriately). If you pressed a number key it would trigger the same event as clicking the screen-pad button; other hotkeys were mapped to often used button commands (Enter, etc).
If run on a non-POS desktop (e.g. backoffice) the window could be resized too (the "POS desktop" maintained the same aspect ratio, adding dead space at the sides if needed). A standard top menu was available for additional administrative tasks, reporting, etc.
The design allowed everyone to build and test the UI before the associated hardware was finalized. And standard UI testing tools would work too.
Even More:
Our barcode scanners were serial/USB rather than keyboard-like, so each packet from the device raised a comms event. The selected "scanner type" driver class used the most secure formatting that the device could give us - some can supply prefix, suffix and/or checksum characters if programmed correctly - and then stripped this before handing the code up to the application.
The system made a "bzzzt" noise when the barcode couldn't be used (e.g. while cash drawer is open).
This design also avoided the need to set the keyboard focus to a specific entry area.
A tip: if the user is manually entering a barcode via the keypad, and hasn't completed it by pressing Enter, and then attempts to scan another barcode, it should beep instead, so the user can accept or cancel the pending item first.
Aggregated POS Design Guidelines
Based on the above and other literature, here is my list of guidelines for POS design.
[it would be nice if we grew this list further]
User Performance Priorities (in order):
efficiency (least time to transaction conclusion)
effectiveness (accurate info & output)
user satisfaction (based on first 2 in work context)
learning time (reduce time to learn system by making it simple)
GUIDELINES
Flexible Transaction Building - don't force a sequence to transaction wher possible. Place product orders in any order & allow them to be changed to a point.
Optimise Transaction Rate - allow a user to complete a transaction as quickly as possible (least clicks are not really the issue as more clicks could mean larger value of transaction, which makes business sense)
Support Handedness / Dexterity - most users have a dominant hand and a weaker hand in terms of dexterity. Allow the UI to be customised (on a single click) for handedness. my example: a L->R / R->L toggle button which moves easy features like "OK", "Cancel" in nearer proximity to weaker hand.
Constant Feedback - provide snapshot feedback which describes current state of the transaction and calculated result of transaction (NB: accounts) before & after committing a transaction.
Control "Volume" - control volume refers to the colour saturation/contrast, prominence of positioning and size of a control. Design more frequently used controls to have larger "volumes" relative to less frequently used controls. e.g. "Pay" button larger than "cancel" button. E.g. High contrast & greater colour saturation increase volume.
Target Findability - finding & selecting targets (item, numeric key) is key to efficiency. Group related controls (close proximity), place controls on screen edges (screen edge traps pointer), emphasise control amplitude (this dimension emphasises users normal plain of motion) and colour coding make finding & selecting targets more efficient.
Avoid Clutter - too many options limits control volume and reduces findability.
Use Plain Text - avoid abbreviations as much as possible (only use standard abbreviations e.g. size: S, M, L, etc.). This is especially true for product lookup.
Product Lookup - support shortcuts for regular orders (i.e. burger meal), categorised browsing & item name search (least ordered items). Consider include a special item: this is any item where the user types what is wanted (i.e. specific whiskey order) - this requires pricing though.
Avoid User Burden - the user should be able to read answers to customer questions from the UI. So provide regularly requested/prioritised feedback for transaction (i.e. customer asks: "what will be the the outstanding balance on my account if I buy this item?" It should appear in UI already)
Conversational Ordering - customer drives the ordering not the system. So allows item selection to be non-sequential.
Objective Focused - the purpose of POS is to conclude the transaction from a business perspective. Always make transaction conclusion possible immediately with "Pay" button. If clicked, any incomplete items will be un-done: user then read order back before requesting cash/credit card)
Personas - there are different categories (personas) of users of POS systems like (i) Clerk/Cashier and (ii) Manager. The UI should present the relevant options to that logged-in persona according to these guidelines i.e. Cashier: large volume on transaction building controls; Manager: large volume on transaction/user management controls.
Touch Screens - (i) allow for touch input with generally larger controls to supported a large finger tip as pointer. (ii) Provide proprioceptive feedback - this is feedback that indicate the control pushed (it should have a short delay on it fade: user finger will be in the way initially). (iii) Auditory Feedback (optional) - this helps with feedback especially with regards errors in pressurised environment.
User Training - users must be trained to understand business protocol & how the POS supports that protocol. They are the one's driving the system. Also, speak to POS users to design & enhance your system - again they are experienced users of the POS system
Context Analysis - a thorough analysis of the context of use for your POS system should be performed to best implement the POS heuristics mentioned above effectively. Understanding the user (human factors), the tasks (frequency, duration, stress factors, etc.) and environment (lighting, hardware, space layout, etc.) should be comprehensively conducted during design and should not be assumed. Get your hands dirty & get into the users work space!! That way you can develop something your specific users can use effectively, efficiently and satisfactorily
I hope this helps everyone.
To all respondents, I really appreciate your feedback! Please give me more wrt to this answer. Thanks
I ran across this question, and I thought I'd add my two cents since some of my work has been mentioned here.
I agree with most of what's been said, but it's important to remember that most everything mentioned represents heuristics. That means that while they're good principles to follow, there are likely times when (a) specific rules should be broken, and (b) there will be contradictions between rules. The trick is being able to weigh conflicting principles and apply them to the appropriate degrees (as you noted in a previous comment).
In the end, it's a matter of balancing the business requirements and user needs in a way that produces optimal results. And in the real world, I find that this can never be achieved through heuristics alone.
Here's an example: I recently finished POS designs for Subway, Wendy's, and Starbucks (see Case Studies at POSDesigns.com). All of these designs used solid heuristics, but all of them came out very, very different because of differences in the business goals and requirements, the users' needs and background, the environment in which they work, the technology being used, and a whole host of other differences.
You can never create a great design in a vacuum. For each of the clients mentioned above, I traveled around to a lot of different types of stores in multiple countries to get a feel for how the users' worked, how the systems would be used, how customers ordered, etc. All this information - along with sales and other data provided by the company - was invaluable in creating a highly usable solution.
Here's another example: Guideline #3 you provide previously ("Support Handedness / Dexterity") is fine as a heuristic (though I have to say that I question the conclusion of swapping simply OK/Cancel). But in visiting Subway stores, we discovered that in that context, the location of the register actually plays a greater role in the hand employees prefer.
In other words, registers that were squished against a wall on the right side tended to produce left-handed users, even when the users were right-handed for every other task. This had implications for the way we allowed the UI to be flip-flopped...and who had control of it. There are tons of examples like that, but we never could have achieved the gains that the user interfaces have produced - like 90% reduction in voids, near zero training, increased speed, accuracy, and check sizes, etc. - by following heuristics alone.
One more point (sorry...you've got me going now :-). Many times heuristics are incomplete without more data as to how to apply them. Consider your guideline #11, "Conversational Ordering". There's much more to this guideline than just providing flexibility in order entry. For instance, one of the many things you have to consider is that not all paths should be presented as equally probable.
We analyzed the way Starbucks' customers ordered in various locations across the United States and United Kingdom. Then, we optimized the system for the most commonly spoken patterns. If we had allowed all paths to have the same "volume", we would have sacrificed usability in other areas, since the design would have appeared more cluttered. The new POS system now supports almost all possible order patterns, but the most probable paths are presented at a higher "volume" than those that are less probable.
OK, it turned out to be more than two cents, but the bottom line is this: If you have a chance to visit the environments in which your POS will be used, analyze customer/employee interactions, etc. ...you should take it. Contextual observations and analysis are invaluable in correctly applying heuristics to your situation.
Good luck!
Dr. Kevin Scoresby
FYI - I'd enjoy talking further about this if you or anyone else in the group would like. My office phone number is on my "About Us" page at POSDesigns.com, or you can use the form to initiate an email conversation. Feel free to call anytime during business hours U.S., East Coast Time.
Devstuff already provides some great answers. In addition:
Create a prototype design (can be simply color-printed on paper) and test this in a scenario that is as realistic as possible, i.e. in a store, with a real future user. Enact a few common scenarios and ask the user to really 'use' your prototype as he / she would use the final product. Obtain feedback through interview and observation.
One way to evaluate your design is to check if you have applied the CRAP principles of design. This article discusses how this can relate to user interface design.
In addition to what has already been posted, here are some tips we picked up along the way.
We use two distinct UI's, one for touch-screen with large bold buttons and one for mouse/keyboard entry. the code behind them is the same just the layout is different.
For touch screens
Try not to have pop-up messages that take focus away from the main form, as users may not be looking at the screen, for example if they are chatting with the customer. we found that if this happen users will continues scanning products unaware that they are not been entered into the sale.
If using a bar code scanner be aware that they sometimes send an enter key after the bar code, that will active focused controls (saying yes/no to pop-ups). To help prevent this we disable the enter key-press on buttons, so only a mouse/finger press will fire the click event. we also turn tab stop to false (may be called different in you language), to stop controls that are touch only from getting focus.
As far as colours go we try to stick to bold button and font colours that can easily be distinguished/read in poorly lit rooms and on screens with glare, as most times users are not in the position to move the screen should they have problem reading it.
Anything you can do to speed up/ help the user is a good thing, for example on our payment screen, as well as having 0..9 keys for payment entry, we also have £1,£2,£5,£10 etc so users don't have to add up the money they are given, they can just press the key for each coin/note they received from the customer.
The best tip I can give is to remember that you are designing for a completely different environment form a desktop application, that would be used in an office. and that users may of never used a computer before. since POS systems are usually locked down, try to make it as easy to use out of the box as possible.
another thing to consider is personas (as introduced in cooper's "the inmates are running the asylum").
essentially, you make up a few canonical "users". give them names, hobbies, skills, a picture, and use them as the people you are designing for.
ie:
billy the cashier: has some computer experience (playing on his ps2). he's in high school, may go to community college. he's a primary user of the system, and wants to be able to learn the new system quickly.
cyrus the manager: needs to manage the cashiers. needs a way, with only his authorization, void transactions and be able to review logs of the sales for making reports as well as managing "shrinkage" (theft). he has 2 kids, lives out in the suburbs so has a 45 minute commute; therefore he doesn't want to spend extra time wrangling the system.
you may need three or four personas; any more than that and it becomes hard to design for.
I highly recommend the book "inmates are running the asylum", plus cooper wrote another book: "about face"; which I have yet to read.
good luck!
I would recommend doing some sort of usability survey amongst your current user group. There is no need for this to be a complicated or highly scientific survey. Present them with simple questions to determine:
The users priority when using the system (accuracy of task, speed, aesthetics)
Preferred input devices
Work flow through such a system
Level of education and domain knowledge
I have found that a lot can be learnt from a simple survey like this and can be applied to your UI design to ensure that the users usability experience is satisfactory.
Great comments from everyone else. I'll just add that there's also an article by Dr. Kevin Scoresby titled "How to Design a (POS) System that Everybody Hates" that discusses usability of POS systems and adds a few points to what people have already mentioned, such as:
Don't punish the employee for customer choices
Avoid creating conflicts with the real world
Avoid color coding (1 out of 10 males has a form of color blindness)
I've also discovered lots of helpful POS design tips at POSDesigns.com. One thing I found interesting is that by focusing too much on the number of button presses, you can actually impact speed--which is often a primary goal. There's also a tip titled "Five Factors that Influence Speed" that I found helpful.
Good luck!
Kyle
There are already some really good systems out there i.e. Tabtill for Win 8 http://www.tabtill.com or Shopkeep for iOS http://www.shopkeep.com. The fewest number of clicks your user need to do the better. As I am also involved in coding for such solutions and having worked with clients using various POS systems, some can be really frustrating. Remember watching cashiers in a bar tapping 10 times just to take payment for a couple of items, their fingers are hopelessly hovering over the screen trying to find the right colourful button. Keep it simple! Allow sorting of your visible product range, categorize them or use barcode reader. Keep at least 5% gap between buttons and don't let silly animations slow down your UI. Either invent your own or just copy what is already out there with your own twist.

Optimal Document Size for LSI Similarity Model

I'm using Gensim's excellent library to compute similarity queries on a corpus using LSI. However, I have a distinct feeling that the results could be better, and I'm trying to figure out whether I can adjust the corpus itself in order to improve the results.
I have a certain amount of control over how to split the documents. My original data has a lot of very short documents (mean length is 12 words in a document, but there exist documents that are 1-2 words long...), and there are a few logical ways to concatenate several documents into one. The problem is that I don't know whether it's worth doing this or not (and if so, to what extent). I can't find any material addressing this question, but only regarding the size of the corpus, and the size of the vocabulary. I assume this is because, at the end of the day, the size of a document is bounded by the size of the vocabulary. But I'm sure there are still some general guidelines that could help with this decision.
What is considered a document that is too short? What is too long? (I assume the latter is a function of |V|, but the former could easily be a constant value.)
Does anyone have experience with this? Can anyone point me in the direction of any papers/blog posts/research that address this question? Much appreciated!
Edited to add:
Regarding the strategy for grouping documents - each document is a text message sent between two parties. The potential grouping is based on this, where I can also take into consideration the time at which the messages were sent. Meaning, I could group all the messages sent between A and B within a certain hour, or on a certain day, or simply group all the messages between the two. I can also decide on a minimum or maximum number of messages grouped together, but that is exactly what my question is about - how do I know what the ideal length is?
Looking at number of words per document does not seem to me to be the correct approach. LSI/LSA is all about capturing the underlying semantics of the documents by detecting common co-occurrences.
You may want to read:
LSI: Probabilistic Analysis
Latent Semantic Analysis (particularly section 3.2)
A valid excerpt from 2:
An important feature of LSI is that it makes no assumptions
about a particular generative model behind the data. Whether
the distribution of terms in the corpus is “Gaussian”, Poisson, or
some other has no bearing on the effectiveness of this technique, at
least with respect to its mathematical underpinnings. Thus, it is
incorrect to say that use of LSI requires assuming that the attribute
values are normally distributed.
The thing I would be more concerned is if the short documents share similar co-occurring terms that will allow LSI to form an appropriate topic grouping all of those documents that for a human share the same subject. This can be hardly done automatically (maybe with a WordNet / ontology) by substituting rare terms with more frequent and general ones. But this is a very long shot requiring further research.
More specific answer on heuristic:
My best bet would be to treat conversations as your documents. So the grouping would be on the time proximity of the exchanged messages. Anything up to a few minutes (a quarter?) I would group together. There may be false positives though (strongly depending on the actual contents of your dataset). As with any hyper-parameter in NLP - your mileage will vary... so it is worth doing a few experiments.
Short documents are indeed a challenge when it comes to applying LDA, since the estimates for the word co-occurrence statistics are significantly worse for short documents (sparse data). One way to alleviate this issue is, as you mentioned, to somehow aggregate multiple short texts into one longer document by some heuristic measure.
One particularity nice test-case for this situation is topic modeling Twitter data, since it's limited by definition to 140 characters. In Empirical Study of Topic Modeling in Twitter (Hong et al, 2010), the authors argue that
Training a standard topic model on aggregated user messages leads to a
faster training process and better quality.
However, they also mention that different aggregation methods lead to different results:
Topics learned by using different aggregation strategies of
the data are substantially different from each other.
My recommendations:
If you are using your own heuristic for aggregating short messages into longer documents, make sure to experiment with different aggregation techniques (potentially all the "sensical" ones)
Consider using a "heuristic-free" LDA variant that is better tailored for short messages, e.g, Unsupervised Topic Modeling for Short Texts Using Distributed
Representations of Words

Resources