Closed. This question is off-topic. It is not currently accepting answers.
Want to improve this question? Update the question so it's on-topic for Stack Overflow.
Closed 11 years ago.
Improve this question
Why is speech recognition so difficult? What are the specific challenges involved? I've read through a question on speech recognition, which did partially answer some of my questions, but the answers were largely anecdotal rather than technical. It also still didn't really answer why we still can't just throw more hardware at the problem.
I've seen tools that perform automated noise reduction using neural nets and ambient FFT analysis with excellent results, so I can't see a reason why we're still struggling with noise except in difficult scenarios like ludicrously loud background noise or multiple speech sources.
Beyond this, isn't it just a case of using very large, complex, well-trained neural nets to do the processing, then throwing hardware at it to make it work fast enough?
I understand that strong accents are a problem and that we all have our colloquialisms, but these recognition engines still get basic things wrong when the person is speaking in a slow and clear American or British accent.
So, what's the deal? What technical problems are there that make it still so difficult for a computer to understand me?
Some technical reasons:
You need lots of tagged training data, which can be difficult to acquire once you take into account all the different accents, sounds etc.
Neural networks and similar gradient descent algorithms don't scale that well - just making them bigger (more layers, more nodes, more connections) doesn't guarantee that they will learn to solve your problem in a reasonable time. Scaling up machine learning to solve complex tasks is still a hard, unsolved problem.
Many machine learning approaches require normalised data (e.g. a defined start point, a standard pitch, a standard speed). They don't work well once you move outside these parameters. There are techniques such as convolutional neural networks etc. to tackle these problems, but they all add complexity and require a lot of expert fine-tuning.
Data size for speech can be quite large - the size of the data makes the engineering problems and computational requirements a little more challenging.
Speech data usually needs to be interpreted in context for full understanding - the human brain is remarkably good at "filling in the blanks" based on understood context. Missing informations and different interpretations are filled in with the help of other modalities (like vision). Current algorithms don't "understand" context so they can't use this to help interpret the speech data. This is particularly problematic because many sounds / words are ambiguous unless taken in context.
Overall, speech recognition is a complex task. Not unsolvably hard, but hard enough that you shouldn't expect any sudden miracles and it will certainly keep many reasearchers busy for many more years.....
Humans use more than their ears when listening, they use the knowledge they
have about the speaker and the subject. Words are not arbitrarily sequenced
together, there is a grammatical structure and redundancy that humans use
to predict words not yet spoken. Furthermore, idioms and how we ’usually’
say things makes prediction even easier.
In Speech Recognition we only have the speech signal. We can of course construct a
model for the grammatical structure and use some kind of statistical model
to improve prediction, but there are still the problem of how to model world
knowledge, the knowledge of the speaker and encyclopedic knowledge. We
can, of course, not model world knowledge exhaustively, but an interesting
question is how much we actually need in the ASR to measure up to human
comprehension.
Speech is uttered in an environment of sounds, a clock ticking, a computer
humming, a radio playing somewhere down the corridor, another human
speaker in the background etc. This is usually called noise, i.e., unwanted
information in the speech signal. In Speech Recognition we have to identify and filter out
these noises from the speech signal. Spoken language != Written language
1: Continuous speech
2: Channel variability
3: Speaker variability
4: Speaking style
5: Speed of speech
6: Ambiguity
All this points have to be considered while building a speech recognition, That's why its a quite difficult.
-------------Refered from http://www.speech.kth.se/~rolf/gslt_papers/MarkusForsberg.pdf
I suspect you are interested in 'countinuous' speech recognition, where the speaker speaks sentences (not single words) at normal speed.
The problem is not simply one of signal analysis, but there is a large natural language component as well. Most of us understand spoken language not by analyzing every single thing that we hear, as that would never work because each person speaks differently, phonemes are suppressed, pronunciations are different, etc. We just interpret a portion of what we hear and the rest is 'interpolated' by our brain once the context of what is being said is established. When you have no context, it is difficult to understand spoken language.
Lots of major problems in speech recognition are not directly related to the language itself:
different people (women, men, children, elders etc.) have different voices
sometimes the same person sounds different for example when the person has a cold
different background noises
everyday speech sometimes contains words from other languages (like you have the german word Kindergarden in the US/English)
some persons not from the country itself learned the language (they usually sound different)
some persons speak faster, others speak slower
quality of the microphone
etc.
Solving these things always is pretty hard... on top of that you have the language/pronounciation to take care of...
For reference see the Wikipedia article http://en.wikipedia.org/wiki/Speech_recognition - it has a good overview including some links and book references which are a good starting point...
From the technical POV the "audio preprocessing" is just one step in a long process... let's say the audio is "crytal clear", then several of the above mentioned aspects (like having a cold, having a mixup in languages etc.) still need to be solved.
All this means that for good speech recognition you need to have a model of the langauge(s) that is thorough enough to account for slight differences (like "ate" versus "eight") which usually involves some context-analysis (i.e. semantic and fact/world knowledge, see http://en.wikipedia.org/wiki/Semantic%5Fgap) etc.
Since almost all relevant languages have evolved and were not designed as mathematical models you basically need to "reverse engineer" the available implicit and explicit knowlegde about a language into a model which is a big challenge IMHO.
Having worked myself with neural nets I can assure you that while they provide good results in some cases they are not "magical tools"... almost always a good neural net has been carefully designed and optimized for the specific requirement... in this context it needs both extensive experience/knowledge of languages and neural nets PLUS extensive training to achieve usable results...
Its been a decade since I took a language class in college, but from what I recall language can be broken up into phonemes. Language processors do their best to identify these phonemes, but they are unique to every individual. Even once they are broken up they must then be reassembled into a meaningful construct.
Take this example, humans are quite capable of reading with no punctuation and no capital letters and no spaces. It takes a second, but we can do it quite readily. This is kind of what a computer has to look at when it gets a block of phonemes. However, computers are not nearly as good at parsing this data out. One of the reasons is it is difficult for computers to have context. Humans can even understand babies despite the fact that their phonemes can be completely wrong.
Even if you have all the phonemes correct, then arranging them into an order that makes sense is also difficult.
I am interesting in the raw (or composite) metrics used to get a handle on how well a person can program in a particular language.
Scenario: George knows a few programming languages and wants to learn "foobar", but He would like to know when he has a reasonable amount of experience in "foobar".
I am really interesting in something broader than just the LOC (lines of code) metric.
My hope for this question is to understand how engineers quantify the programming language experiences of others and if this can be mechanically measured.
Thanks in Advance!
In reply to the previous two posters, I'd guess that there is a way to get a handle on how well a person can program in a particular language: you can test how well someone knows English, or Maths, or Music, or Medicine, or Fine Art, so what's so special about a programming language?
In reply to the OP, I guess the tests must assess:
How well you can program
How well you can use the programming language
Therefore the metrics might be:
What's the goodness of the person's programming (and there are various dimensions of goodness such as bug-free, maintainable, quick/cheap to write, runs quickly, meets user requirements, etc.)?
Does the person use appropriate/idiomatic features of the programming language in question in order to do that good programming?
It would be difficult to make the test 'mechanical', though: most exams that I know of are graded by a human examiner. In the case of programming, part of the test could be graded mechanically (i.e. "does it run?") but part of it ("is it understandable and idiomatic?") is intended to benefit, and is better judged by, other human programmers.
The best indicator of your expertise in a particular language, in my opinion, is how productive you are in it.
Productivity is not just how fast you can work but, importantly, how few bugs you create and how little refactoring/rework is required later on.
For example, if you took two languages you have similar level of experience with, and were (in parallel universes) to build the same system with both, I would say the language you build the system with faster and with fewer defects/design flaws, is the language you have more expertise in.
Sorry it's not a "hard" metric for you, it's a more practical approach.
I don't believe that this can be "mechanically measured". I've thought about this a lot though.
Hang on...
Even the "LOC" of a program is a heavily disputed topic!
(Are we talking about the output of cat *.{h,c} | wc -l or some other mechnanism, for instance? What about blank lines? Comments? Are comments important? Is good code documented?)
Until you've realised how pointless a LOC comparison is, you've no hope of realising how pointless other metrics are.
It's a rather qualitative thing that is rarely measured with any great accuracy. It's like asking "how smart was Einstein?". Certification is one (and a reasonably thorough) quantitative indicator, but even it falls drastically short of identifying "good programmers" as many recruiters discover.
What are you ultimately trying to achieve? General programming aptitude can be more important than language expertise in some situations.
If you are language-focussed, taking on a challenge like Project Euler using that language may be a way to track progress.
How proficient they are in debugging complex problems in that language.
Ask them about projects they have worked on in the past, difficult problems they encountered and how they solved them. Ask them about debugging techniques they have used - you'll be surprised at what you'll hear, and you might even learn something new ;-)
A lot of places have a person or two who is a superstar in their field - the person everyone else goes to when they can't figure out what is wrong with their program. I'm guessing thats the person you are looking for :-)
Facility with a programming language is not enough. What is required is facility with a programming language in the context of a partiular suite of libraries on a particular platform
C++ on winapi on Windows 32bit
C++ on KDE on Linux
C++ on Symbian on a Nokia S60 phone
C# on MS .NET on Windows
C# on Mono on Linux
Within such a context, the measures of competence using the target language on the target platform are as follows:
The ability to express common
patterns succinctly and robustly.
The ability to debug common but subtle bugs like race conditions.
It would be possible to develope a suite of benchmark exercises for a programmer. One might also, once significant samples were available, determine the bell curve for ability. Preparing these things would take literally years and they would rapidly be obsoleted. This (and general tightness) is why organisations don't bother.
It would also be necessary to grade people in both "tool maker" and "tool user" modes. Tool makers are very different people with a much higher level of competence but they are often unsuited to monkey work, for which you really want a tool user.
John
There are a couple of ways to approach your question:
1) If you are interviewing candidates for a particular position requiring a particular language, then the only measure to compare candidates is 'how long has this person been writing in this language.' It's not perfect - it's not even very good - but it's reality. Unless you want to give the candidate a problem, a computer, and a compiler to test them on the spot there's no other measure. And then most programmer-types don't do well in "someone's watching you" scenarios.
2) I interpret your question to be more of 'when can I call MYSELF profecient in a language?' For this I would refer to levels of learning a non-native language: first level is you need to look up words/phrases in a dictionary (book) in order to say or understand anything; second level would be that you can understand hearing the language(or reading code) with only the occasional lookup in your trusted and now well-worn dictionary; third level you can now speak (or write code) with only the occasional lookup; fourth level is where you dream in the language; and the final levels is where fool native speakers into thinking that you're a native speaker also (in programming, other experts would think that you may have helped develop the language syntax).
Note that this doesn't help determine how good of a programmer you are - just like knowing English without having to look up words in the dictionary doesn't show "how gooder you is at writin' stuff" - that's subjective and has nothing to do with a particular language as people that are good at programming are good in any language you give them.
The phrase "a reasonable amount of experience" is dependent upon the language being considered and what that language can be used for.
A metric is the result of a measurement. Stevens (see wikipedia: Level Of Measurement) proposed that measurements use four different scale types: nominal (assigning a label), ordinal (assigning a ranking), interval (ordering the measurements) and ratio (having a non-arbitrary zero starting point). LOC is a ratio measurement. Although far from perfect, I think LOC is a relevant, objective number indicating how much experience you have in a language and can be compared to quantifiable values in the software industry. But, this begs the question: where do these industry values come from?
Personally, I would say that "George" will know that he has a reasonable amount of experience when he has designed, implemented and tested a project, maybe of his own choosing on his personal time on his home computer if need be. For example: database, business application, web page, GUI test tool, etc.
From the hiring managers point of view, I would start off by asking the programmer how good s/he is in the language, but this is not a metric. I have always thought that the best way to measure a persons ability to write programs is to give the programmer several small programming problems that are thought-out in advance and solved in a given amount of time, say, 5 minutes each. I have never objected to this being done to me in job interviews. Several metrics are available: Was the programmer able to solve the problem (yes or no - nominal)? How much time did it take (number of minutes - ratio)? How effective was their approach to solving the problem (good, fair, poor - ordinal)? You learn not only the persons ability to write code, but can observe several subjective things as well, such as their behaviour as they go about solving the problem, the questions s/he asks while solving the problem, the ability to work under pressure, etc, From a "quality" perspective though, remember that people do not like being measured.
Still, I believe there are some good metrics like the McCabe Cyclomatic Metric for cyclomatic complexity or the amount of useful commentary per block of code or even the average amount of code written between two consecutive tests.
I know of no such thing. I don't believe there's concensus on how to quantify experience or what "reasonable" means. Maybe I'll learn something too, but if I do it'll be a great surprise.
This may be pertinent.
I find that testing the ability to debug is a more accurate gauge of programming skill than any test aimed at straightforward programming problems that I have encountered. Given the source for a reasonably sized class or function with a stated (or unstated, in some cases) misbehavior, can the testee locate the problem?
Well, they try that in job interviews. There's no metric, but you can assess a person's abilities through questioning and quizzing.
WTF/s * LOC, smaller is best.
there are none; expertise can only be judged subjectively relative to others, or tested on specifics (which has its own level of inaccuracy)
see what is the fascination with code metrics for more information
As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 11 years ago.
Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
Some of us just have a hard time with the softer aspects of UI design (myself especially). Are "back-end coders" doomed to only design business logic and data layers? Is there something we can do to retrain our brain to be more effective at designing pleasing and useful presentation layers?
Colleagues have recommended a few books me including The Design of Sites, Don't make me think and Why Software sucks , but I am wondering what others have done to remove their deficiencies in this area?
Let me say it directly:
Improving on this does not begin with guidelines. It begins with reframing how you think about software.
Most hardcore developers have practically zero empathy with users of their software. They have no clue how users think, how users build models of software they use and how they use a computer in general.
It is a typical problem when an expert collides with a laymen: How on earth could a normal person be so dumb not to understand what the expert understood 10 years ago?
One of the first facts to acknowledge that is unbelievably difficult to grasp for almost all experienced developers is this:
Normal people have a vastly different concept of software than you have. They have no clue whatsoever of programming. None. Zero. And they don't even care. They don't even think they have to care. If you force them to, they will delete your program.
Now that's unbelievably harsh for a developer. He is proud of the software he produces. He loves every single feature. He can tell you exactly how the code behind it works. Maybe he even invented an unbelievable clever algorithm that made it work 50% faster than before.
And the user doesn't care.
What an idiot.
Many developers can't stand working with normal users. They get depressed by their non-existing knowledge of technology. And that's why most developers shy away and think users must be idiots.
They are not.
If a software developer buys a car, he expects it to run smoothly. He usually does not care about tire pressures, the mechanical fine-tuning that was important to make it run that way. Here he is not the expert. And if he buys a car that does not have the fine-tuning, he gives it back and buys one that does what he wants.
Many software developers like movies. Well-done movies that spark their imagination. But they are not experts in producing movies, in producing visual effects or in writing good movie scripts. Most nerds are very, very, very bad at acting because it is all about displaying complex emotions and little about analytics. If a developer watches a bad film, he just notices that it is bad as a whole. Nerds have even built up IMDB to collect information about good and bad movies so they know which ones to watch and which to avoid. But they are not experts in creating movies. If a movie is bad, they'll not go to the movies (or not download it from BitTorrent ;)
So it boils down to: Shunning normal users as an expert is ignorance. Because in those areas (and there are so many) where they are not experts, they expect the experts of other areas to have already thought about normal people who use their products or services.
What can you do to remedy it? The more hardcore you are as a programmer, the less open you will be to normal user thinking. It will be alien and clueless to you. You will think: I can't imagine how people could ever use a computer with this lack of knowledge. But they can. For every UI element, think about: Is it necessary? Does it fit to the concept a user has of my tool? How can I make him understand? Please read up on usability for this, there are many good books. It's a whole area of science, too.
Ah and before you say it, yes, I'm an Apple fan ;)
UI design is hard
To the question:
why is UI design so hard for most developers?
Try asking the inverse question:
why is programming so hard for most UI designers?
Coding a UI and designing a UI require different skills and a different mindset. UI design is hard for most developers, not some developers, just as writing code is hard for most designers, not some designers.
Coding is hard. Design is hard too. Few people do both well. Good UI designers rarely write code. They may not even know how, yet they are still good designers. So why do good developers feel responsible for UI design?
Knowing more about UI design will make you a better developer, but that doesn't mean you should be responsible for UI design. The reverse is true for designers: knowing how to write code will make them better designers, but that doesn't mean they should be responsible for coding the UI.
How to get better at UI design
For developers wanting to get better at UI design I have 3 basic pieces of advice:
Recognize design as a separate skill. Coding and design are separate but related. UI design is not a subset of coding. It requires a different mindset, knowledge base, and skill group. There are people out there who focus on UI design.
Learn about design. At least a little bit. Try to learn a few of the design concepts and techniques from the long list below. If you are more ambitious, read some books, attend a conference, take a class, get a degree. There are lot of ways to learn about design. Joel Spolky's book on UI design is a good primer for developers, but there's a lot more to it and that's where designers come into the picture.
Work with designers. Good designers, if you can. People who do this work go by various titles. Today, the most common titles are User Experience Designer (UXD), Information Architect (IA), Interaction Designer(ID), and Usability Engineer. They think about design as much as you think about code. You can learn a lot from them, and they from you. Work with them however you can. Find people with these skills in your company. Maybe you need to hire someone. Or go to some conferences, attend webinars, and spend time in the UXD/IA/ID world.
Here are some specific things you can learn. Don't try to learn everything. If you knew everything below you could call yourself an interaction designer or an information architect. Start with things near the top of the list. Focus on specific concepts and skills. Then move down and branch out. If you really like this stuff, consider it as a career path. Many developers move into managements, but UX design is another option.
Learn fundamental design concepts. You should know about affordances, visibility, feedback, mappings, Fitt's law, poka-yokes, and more. I recommend reading The Design of Everyday Things (Don Norman) and Universal Principles of Design (Lidwell, Holden, & Butler)
Learn about user experience. This is becoming the umbrella term for the human-centered design of web sites, applications, and any other digital artifact. The classic primer here is The elements of User Experience (Jesse James Garrett). You can get an overview and the first few chapters from the author's site.
Learn to sketch designs. Sketching is fast way to explore design options and find the right design, whereas usability testing is about getting the design right. Paper prototyping is fast, cheap, and effective during the early design stages. Much faster than coding a digital prototype. The key text here is Sketching User Experience: Getting the design right and the right design (Bill Buxton). Sketching is a particularly useful skill when working with IA/ID/UX designers. Your collaboration will be more effective. For a good primer on how and why designers sketch, watch the presentation How to be a UX team of one by Leah Buley from the 2008 IA Summit.
Learn paper prototyping. The fastest way to iteratively test an interface before you write code. Different from sketching and usability testing. The definitive book here is Paper Prototyping (Carolyn Snyder). You can get a good DVD on this from the Nielsen Norman Group.
Learn usability testing. Discount testing is easy and effective. But for many UIs, usability is hard to do well. You can learn the basics quickly, but good usability people are invaluable. If you want a book, the classic is The Handbook of Usability Testing (Jeffrey Rubin). It's older but offers thorough coverage of lab-based testing. The famous starter book is Don't Make Me Think (2nd Ed) (Steve Krug). I caution people about this one: Krug makes it sound easier than it is. But it is a good starting point. The user research books listed in the next point also cover this topic. And you can find piles about it online.
Learn about information architecture. The main book here is Information Architecture for the World Wide Web (3rd) (Louis Rosenfeld & Peter Morville). A good starter book is Information Architecture: Blueprints for the Web (Christina Wodtke). For more, visit the Information Architecture Institute or attend the annual Information Architecture Summit.
Learn about interaction design. The main book here is The Essentials of Interaction Design (3rd) (Alan Cooper, et al). A good starter book is Designing for interaction (Dan Saffer). For more, visit the Interaction Design Association (IxDA) or attend the annual Interaction Design conference.
Learn fundamentals of graphic design. Graphic design is not UI design, but concepts from graphic design can improve an interface. Graphic design introduces design principles for the visual presentation of information, such as proximity, alignment, and small multiples. I recommend reading The non-designer's design book (Robin Williams) and Envisioning Information (Edward Tufte)
Learn to do user research. Where usability tests an interface, user research tries to model users and their tasks through personas, scenarios, user journeys, and other documents. It's about understanding users and what they do, then using that to inform the design instead of guessing. Some techniques are interviews, surveys, diary studies, and cart sorting. Good books on this are Observing the User Experience (Mike Kuniavsky) and Understanding Your Users (Courage & Baxter)
Learn to do field research. Watching people in the lab under artificial conditions helps (ie: usability), but there is nothing like watching people use your code in context: their home, their office, or wherever they use it. Goes by various names, including ethnography, field studies, and contextual inquiry. Here is a good primer on field research. Two of the better known books here are Rapid Contextual Design (Karen Holtzblatt et al) and User and task analysis for interface design (Hackos & Redish).
Read UX design web sites. Some of the big ones are Boxes & Arrows, UX Mag, UX Matters, and Digital Web magazine.
Use UI pattern libraries. There are patterns for interfaces. For web sites, I recommend The Design of Sites, 2nd ed (Van Duyne, et al) and Homepage usability: 50 websites deconstructed (Jakob Nielsen & Marie Tahir). For desktop applications I recommend Designing interfaces (Jennifer Tidwell), and for web applications I recommend Designing Web Interfaces: Principles and Patterns for Rich Interactions (Bill Scott & Theresa Neil). Online you should check Welie pattern library, UI patterns, and Web UI patterns.
Attend UX design conferences. Some good annual conferences are: Information Architecture Summit, Interaction '09 (IxDA), User Interface, and UX week.
Attend a workshop or webinar. You can take workshops, webinars, and online courses. This is far from a comprehensive list, but you might try the UIE virtual seminars, Adaptive Path virtual seminars, and UX webinars from Rosenfeld Media.
Get a degree. A graduate degree in HCI is one approach, but these programs are mostly about writing coding. If you want to learn about the design of digital artifacts and devices, then you want a graduate program that's not in CS. Some options include Interaction Design at Carnegie Mellon, the d-School at Stanford, the ITP program at NYU, and Information Architecture & Knowledge Management at Kent State (disclosure: I'm on faculty at Kent; we are seeing more and more people with CS degrees moving into UX design instead of management, which is interesting, because management is the traditional path for developers who want to move away from writing code while staying in their field). There are many more programs. Each has their own perspective, areas of emphasis, and technical expectations. Some come out of the arts and visual design, others out of library and information science, and some from CS. Most are hybrids, but every hybrid has deeper roots in one or more fields. If this interests you, look around and try to understand the differences between these programs. Some offer online courses and certificate programs in addition to full-fledged degrees.
Why UI design is hard
Good UI design is hard because it involves 2 vastly different skills:
A deep understanding of the machine. People in this group worry about code first, people second. They have deep technological knowledge and skill. We call them developers, programmers, engineers, and so forth.
A deep understanding of people and design: People in this group worry about people first, code second. They have deep knowledge of how people interact with information, computers, and the world around them. We call them user experience designers, information architects, interaction designers, usability engineers, and so forth.
This is the essential difference between these 2 groups—between developers and designers:
Developers make it work. They implement the functionality on your TiVo, your iPhone, your favorite website, etc. They make sure it actually does what it is supposed to do. Their highest priority is making it work.
Designers make people love it. They figure out how to interact with it, how it should look, and how it should feel. They design the experience of using the application, the web site, the device. Their highest priority is making you fall in love with what developers make. This is what is meant by user experience, and it's not the same as brand experience.
Moreover, programming and design require different mindsets, not just different knowledge and different skills. Good UI design requires both mindsets, both knowledge bases, both skill groups. And it takes years to master either one.
Developers should expect to find UI design hard, just as UI designers should expect to find writing code hard.
What really helps me improve my design is to grab a fellow developer, one the QA guys, the PM, or anyone who happens to walk by and have them try out a particular widget or screen.
Its amazing what you will realize when you watch someone else use your software for the first time
Ultimately, it's really about empathy -- can you put yourself in the shoes of your user?
One thing that helps, of course, is "eating your own dogfood" -- using your applications as a real user yourself, and seeing what's annoying.
Another good idea is to find a way to watch a real user using your application, which may be as complicated as a usability lab with one-way mirrors, screen video capture, video cameras on the users, etc., or can be as simple as paper prototyping using the next person who happens to walk down the hall.
If all else fails, remember that it's almost always better for the UI to be too simple than too complicated. It's very very easy to say "oh, I know how to solve that, I'll just add a checkbox so the user can decide which mode they prefer". Soon your UI is too complicated. Pick a default mode and make the preference setting an advanced configuration option. Or just leave it out.
If you read a lot about design you can easily get hung up on dropped shadows and rounded corners and so forth. That's not the important stuff. Simplicity and discoverability are the important stuff.
Contrary to popular myth there are literally no soft aspects in UI design, at least no more than needed to design a good back end.
Consider the following; good back end design is based upon fairly solid principles and elements any good developer is familiar with:
low coupling
high cohesion
architectural patterns
industry best practices
etc
Good back end design is usually born through a number of interactions, where based on the measurable feedback obtained during tests or actual use the initial blueprint is gradually improved. Sometimes you need to prototype smaller aspects of back end and trial them in isolation etc.
Good UI design is based on the sound principles of:
visibility
affordance
feedback
tolerance
simplicity
consistency
structure
UI is also born through test and trial, through iterations but not with compiler + automated test suit, but people. Similarly to back end there are industry best practises, measurement and evaluation techniques, ways to think of UI and set goals in terms of user model, system image, designer model, structural model, functional model etc.
The skill set needed for designing UI is quite different from designing back-end and hence don’t expect to be able to do good UI without doing some learning first. However that both these activities have in common is the process of design. I believe that anyone who can design good software is capable of designing good UI as long as they spend some time learning how.
I recommend taking a course in Human Computer Interaction, check MIT and Yale site for example for online materials:
MIT User Interface Design and Implementation Course
Structural vs Functional Model in Understanding and Usage
The excellent earlier post by Thorsten79 brings up the topic of software development experts vs users and how their understanding of software differ. Human learning experts distinguish between functional and structural mental models. Finding way to your friend's house can be an excellent example of the difference between the two:
First approach includes a set of detailed instructions: take the first exit of the motorway, then after 100 yards turn left etc. This is an example of functional model: list of concrete steps necessary to achieve a certain goal. Functional models are easy to use, they do not require much thinking just a straight forward execution. Obviously there is a penalty for the simplicity: it might not be the most efficient route and any any exceptional situation (i.e. a traffic diversion) can easilly lead to a complete failure.
A different way to cope with the task is to build a structural mental model. In our example that would be a map that conveyes a lot of information about the internal structure of the "task object". From understanding the map and relative locations of our and friend's house we can deduct the functional model (the route). Obviously it's requires more effort, but much more reliable way of completing the task in spite of the possible deviations.
The choice between conveying functional or structural model through UI (for example, wizard vs advanced mode) is not that straight forward as it might seem from Thorsten79's post. Advanced and frequent users might well prefer the structural model, whereas occassional or less expirienced users — functional.
Google maps is a great example: they include both functional and structural model, so do many sat navs.
Another dimension of the problem is that the structural model presented through UI must not map to the structure of software, but rather naturally map to structure of the user task at hand or task object involved.
The difficulty here is that many developers will have a good structural model of their software internals, but only functional model of the user task the software aims to assist at. To build good UI one needs to understand the task/task object structure and map UI to that structure.
Anyway, I still can't recommend taking a formal HCI course strongly enough. There's a lot of stuff involved such as heuristics, principles derived from Gestalt phychology, ways humans learn etc.
I suggest you start by doing all your UI in the same way as you are doing now, with no focus on usability and stuff.
alt text http://www.stricken.org/uploaded_images/WordToolbars-718376.jpg
Now think of this:
A designer knows he has achieved perfection not when there is nothing left to add, but when there is nothing left to take away.
— Saint-Exupéry
And apply this in your design.
A lot of developers think that because they can write code, they can do it all. Designing an interface is a completely different skill, and it was not taught at all when I attended college. It's not just something that just comes naturally.
Another good book is The Design of Everyday Things by Donald Norman.
There's a huge difference between design and aesthetics, and they are often confused.
A beautiful UI requires artistic or at least aesthetic skills that many, including myself, are incapable of producing. Unfortunately, it is not enough and does not make the UI usable, as we can see in many heavyweight flash-based APIs.
Producing usable UIs requires an understanding of how humans interact with computers, some issues in psychology (e.g., Fitt's law, Hick's law), and other topics. Very few CS programs train for this. Very few developers that I know will pick a user-testing book over a JUnit book, etc.
Many of us are also "core programmers", tending to think of UIs as the facade rather than as a factor that could make or break the success of our project.
In addition, most UI development experience is extremely frustrating. We can either use toy GUI builders like old VB and have to deal with ugly glue code, or we use APIs that frustrate us to no end, like trying to sort out layouts in Swing.
Go over to Slashdot, and read the comments on any article dealing with Apple. You'll find a large number of people talking about how Apple products are nothing special, and ascribing the success of the iPod and iPhone to people trying to be trendy or hip. They will typically go through feature lists, and point out that they do nothing earlier MP3 players or smart phones didn't do.
Then there are people who like the iPod and iPhone because they do what the users want simply and easily, without reference to manuals. The interfaces are about as intuitive as interfaces get, memorable, and discoverable. I'm not as fond of the UI on MacOSX as I was on earlier versions, I think they've given up some usefulness in favor of glitz, but the iPod and iPhone are examples of superb design.
If you are in the first camp, you don't think the way the average person does, and therefore you are likely to make bad user interfaces because you can't tell them from good ones. This doesn't mean you're hopeless, but rather that you have to explicitly learn good interface design principles, and how to recognize a good UI (much as somebody with Asperger's might need to learn social skills explicitly). Obviously, just having a sense of a good UI doesn't mean you can make one; my appreciation for literature, for example, doesn't seem to extend to the ability (currently) to write publishable stories.
So, try to develop a sense for good UI design. This extends to more than just software. Don Norman's "The Design of Everyday Things" is a classic, and there's other books out there. Get examples of successful UI designs, and play with them enough to get a feel for the difference. Recognize that you may be having to learn a new way of thinking abou things, and enjoy it.
The main rule of thumb I hold to, is never try to do both at once. If I'm working on back-end code, I'll finish up doing that, take a break, and return with my UI hat on. If you try to work it in whilst you're doing code, you'll approach it with the wrong mindset, and end up with some horrible interfaces as a result.
I think it's definitely possible to be both a good back-end developer and a good UI designer, you just have to work at it, do some reading and research on the topic (everything from Miller's #7, to Nielsen's archives), and make sure you understand why UI design is of the utmost importance.
I don't think it's a case of needing to be creative but rather, like back-end development, it is a very methodical, very structured thing that needs to be learned. It's people getting 'creative' with UIs that creates some of the biggest usability monstrosities... I mean, take a look at 100% Flash websites, for a start...
Edit: Krug's book is really good... do take a read of it, especially if you're going to be designing for the Web.
There are many reasons for this.
(1) Developer fails to see things from the point of view of the user. This is the usual suspect: lack of empathy. But it is not usually true since developers are not as alien as people make them out to be.
(2) Another, more common reason is that the developer being so close to his own stuff, having stayed with his stuff for so long, fails to realize that his stuff may not be so familiar(a term better than intuitive) to other people.
(3) Still another reason is the developer lacks techniques.
MY BIG CLAIM: read any UI, human interection design, prototyping book. e.g. Designing the Obvious: A Common Sense Approach to Web Application Design, Don't Make Me Think: A Common Sense Approach to Web Usability, Designing the moment, whatever.
How do they discuss task flows? How do they describe decision points? That is, in any use case, there are at least 3 paths: success, failure/exception, alternative.
Thus, from point A, you can go to A.1, A.2, A.3.
From point A.1, you can get to A.1.1, A.1.2, A.1.3, and so on.
How do they show such drill-down task flow?
They don't. They just gloss over it.
Since even UI experst don't have a technique, developers have no chance.
He thinks it is clear in his head. But it is not even clear on paper, let alone clear in software implementation.
I have to use my own hand-made techniques for this.
I try to keep in touch with design-specific websites and texts. I found also the excellent Robin Williams book The Non-Designer's Design Book to be very interesting in these studies.
I believe that design and usability is a very important part of software engineering and we should learn it more and stop giving excuses that we are not supposed to do design.
Everyone can be a designer once in a while, as also everyone can be a programmer.
When approaching UI design, here are a few of the things I keep in mind throughout (by far not a complete list):
Communicating a model. The UI is a narrative that explains a mental model to the user. This model may be a business object, a set of relationships, what have you. The visual prominence, spatial placement, and workflow ordering all play a part in communicating this model to the user. For example, a certain kind of list vs another implies different things, as well as the relationship of what's in the list to the rest of the model. In general I find it best to make sure only one model is communicated at a time. Programmers frequently try to communicate more than one model, or parts of several, in the same UI space.
Consistency. Re-using popular UI metaphors helps a lot. Internal consistency is also very important.
Grouping of tasks. Users should not have to move the mouse all the way across the screen to verify or complete a related sequence of commands. Modal dialogs and flyout-menus can be especially bad in this area.
Knowing your audience. If your users will be doing the same activities over and over, they will quickly become power users at those tasks and be frustrated by attempts to lower the initial entry barrier. If your users do many different kinds of activities infrequently, it's best to ensure the UI holds their hand the whole time.
Read Apple Human Interface Guidelines.
I find the best tool in UI design is to watch a first-time User attempt to use the software. Take loads of notes and ask them some questions. Never direct them or attempt to explain how the software works. This is the job of the UI (and well written documentation).
We consistently adopt this approach in all projects. It is always fascinating to watch a User deal with software in a manner that you never considered before.
Why is UI design so hard? Well generally because the Developer and User never meet.
duffymo just reminded me why: Many Programmers think "*Design" == "Art".
Good UI design is absolutely not artistic. It follows solid principles, that can be backed up with data if you've got the time to do the research.
I think all programmers need to do is take the time to learn the principles. I think it's in our nature to apply best practice whenever we can, be it in code or in layout. All we need to do is make ourselves aware of what the best practices are for this aspect of our job.
What have I done to become better at UI design?
Pay attention to it!
It's like how ever time you see a chart on the news or an electronic bus sign and you wonder 'How did they get that data? Did they do that with raw sql or are they using LINQ?' (or insert your own common geek curiosity here).
You need to start doing that but with visual elements of all kinds.
But just like learning a new language, if you don't really throw yourself into it, you won't ever learn it.
Taken from another answer I wrote:
Learn to look, really look, at the world around you. Why do I like that UI but hate this one? Why is it so hard to find the noodle dishes in this restaurant menu? Wow, I knew what that sign meant before I even read the words. Why was that? How come that book cover looks so wrong? Learn to take the time to think about why you react the way you do to visual elements of all kinds, and then apply this to your work.
However you do it (and there are some great points above), it really helped me once I accepted that there is NO SUCH THING AS INTUITIVE....
I can hear the arguments rumbling on the horizon... so let me explain a little.
Intuitive: using what one feels to be right or true based on an unconscious method or feeling.
If (as Carl Sagan postulated) you accept that you cannot comprehend things that are absolutely unlike anything you have ever encountered then how could you possibly "know" how to use something if you have never used anything remotely like it?
Think about it: kids try to open doors not because they "know" how a doorknob works, but because they have seen someone else do it... often they turn the knob in the wrong direction, or pull too soon. They have to LEARN how a doorknob works. This knowledge then gets applied in different but similar instances: opening a window, opening a drawer, opening almost anything big with a big, knob-looking handle.
Even simple things that seem intuitive to us will not be intuitive at all to people from other cultures. If someone held their arm out in front of them and waived their hand up-and-down at the wrist while keeping the arm still.... are they waiving you away? Probably, unless you are in Japan. There, this hand signal can mean "come here". So who is right? Both, of course, in their own context. But if you travel to both, you need to know both... UI design.
I try to find the things that are already "familiar" to the potential users of my project and then build the UI around them: user-centric design.
Take a look at Apple's iPhone. Even if you hate it, you have to respect the amount of thought that went into it. Is it perfect? Of course not. Over time an object's perceived "intuitiveness" can grow or even fade away completely.
For example. Most everyone knows that a strip of black with two rows of holes along the top and bottom looks like a film strip... or do they?
Ask your average 9 or 10 year old what they think it is. You may be surprised how many kids right now will have a hard time identifying it as a film strip, even though it is something that is still used to represent Hollywood, or anything film (movie) related. Most movies for the past 20 years have been digitally shot. And when was the last time any of us held a piece of film of ANY kind, photos or film?
So, what it all boils down to for me is: Know your audience and constantly research to keep up with trends and changes in things that are "intuitive", target your main users and try not to do things that punish the inexperienced in favor of the advanced users or slow down the advanced users in order to hand-hold the novices.
Ultimately, every program will require a certain amount of training on the user's part to use it. How much training and for which level of user is part of the decisions that need to be made.
Some things are more or less familiar based on your target user's past experience level as a human being, or computer user, or student, or whatever.
I just shoot for the fattest part of the bell curve and try to get as many people as I can but realizing that I will never please everyone....
I know that Microsoft is rather inconsistent with their own guidelines, but I have found that reading their Windows design guidelines have really helped me. I have a copy on my website here, just scroll down a little the the Vista UX Guide. It has helped me with things such as colors, spacing, layouts, and more.
I believe the main problem has nothing to do with different talents or skillsets. The main problem is that as a developer, you know too much about what the application does and how it does it, and you automatically design your UI from the point of view of someone who has that knowledge.
Whereas a user typically starts out knowing absolutely nothing about the application and should never need to learn anything about its inner workings.
It is very hard, almost impossible, to not use knowledge that you have - and that's why an UI should not be designed by someone who's developing the app behind it.
"Designing from both sides of the screen" presents a very simple but profound reason as to why programmers find UI design hard: programmers are trained to think in terms of edge cases while UI designers are trained to think in terms of common cases or usage.
So going from one world to the other is certainly difficult if the default traning in either is the exact opposite of the other.
To say that programms suck at UI design is to miss the point. The point of the problem is that the formal training that most developers get go indepth with the technology. Human - Computer Interaction is not a simple topic. It is not something that I can "mind-meld" to you by providing a simple one line statement that makes you realize "oh the users will use this application more effectively if I do x instead of y."
This is because there is one part of UI design that you are missing. The human brain. In order to understand how to design a UI, you have to understand how the human mind interacts with machinery. There is an excellent course I took at the University of Minnesota on this topic taught by a professor of Psychology. It is named "Human - Machine Interaction". This describes many of the reasons of why UI design is so complicated.
Since Psychology is based on Correlations and not Causality you can never prove that a method of UI design will always work in any given situation. You can correlate that many users will find a particular UI design appealing or efficient, but you cannot prove that it will always generalize.
Additionally, there are two parts to UI design that many people seem to miss. There is the aesthetical appeal, and the functional workflow. If you go for a 100% aesthetical appeal, sure people will but your product. I highly doubt that aesthetics will ever reduce user frustration though.
There are several good books on this topic and course to take (like Bill Buxton's Sketching User Experiences, and Cognition in the Wild by Edwin Hutchins). There are graduate programs on Human - Computer Interaction at many universities.
The overall answer to this question though lies in how individuals are taught computer science. It is all math based, logic based and not based on the user experience. To get that, you need more than a generic 4 year computer science degree (unless your 4 year computer science degree had a minor in psychology and was emphasized in Human - Computer Interaction).
Let's turn your question around -
Are "ui designers" doomed to only design information architecture and presentation layers? Is there something they can do to retrain their brains to be more effective at designing pleasing and efficient system layers?
Seems like them "ui designers" would have to take a completely different perspective - they'd have to look from the inside of the box outwards; instead of looking in from outside the box.
Alan Cooper's "The Inmates are Running the Asylum" opinion is that we can't successfully take both perspectives - we can learn to wear one hat well but we can't just switch hats.
I think its because a good UI is not logical. A good UI is intuitive.
Software developers typically do bad on 'intuitive'
A useful framing is to actively consider what you're doing as designing a process of communication. In a very real sense, your interface is a language that the user must use to tell the computer what to do. This leads to considering a number of points:
Does the user already speak this language? Using a highly idiosyncratic interface is like communicating in a language you've never spoken before. So if your interface must be idiosyncratic at all, it had best introduce itself with the simplest of terms and few distractions. On the other hand, if your interface uses idioms that the user is accustomed to, they'll gain confidence from the start.
The enemy of communication is noise. Auditory noise interferes with spoken communication; visual noise interferes with visual communication. The more noise you can cut out of your interface, the easier communicating with it will be.
As in human conversation, it's often not what you say, it's how you say it. The way most software communicates is rude to a degree that would get it punched in the face if it were a person. How would you feel if you asked someone a question and they sat there and stared at you for several minutes, refusing to respond in any other way, before answering? Many interface elements, like progress bars and automatic focus selection, have the fundamental function of politeness. Ask yourself how you can make the user's day a little more pleasant.
Really, it's somewhat hard to determine what programmers think of interface interaction as being, other than a process of communication, but maybe the problem is that it doesn't get thought of as being anything at all.
There are a lot o good comments already, so I am not sure there is much I can add.
But still...
Why would a developer expect to be able to design good UI?
How much training did he had in that field?
How many books did he read?
How many things did he designed over how many years?
Did he had the opportunity to see the reaction of it's users?
We don't expect that a random "Joe the plumber" to be able to write good code.
So why would we expect the random "Joe the programmer" to design good UI?
Empathy helps. Separating the UI design and the programming helps. Usability testing helps.
But UI design is a craft that has to be learned, and practiced, like any other.
Developers are not (necessarily) good at UI design for the same reason they aren't (necessarily) good at knitting; it's hard, it takes practice, and it doesn't hurt to have someone show you how in the first place.
Most developers (me included) started "designing" UIs because it was a necessary part of writing software. Until a developer puts in the effort to get good at it, s/he won't be.
To improve just look around at existing sites. In addition to the books already suggested, you might like to have a look at Robin Williams's excellent book "The Non-designers Design Book" (sanitised Amazon link)
Have a look at what's possible in visual design by taking a look at the various submissions over at The Zen Garden as well.
UI design is definitely an art though, like pointers in C, some people get it and some people don't.
But at least we can have a chuckle at their attempts. BTW Thanks OK/Cancel for a funny comic and thanks Joel for putting it in your book "The Best Software Writing I" (sanitised Amazon link).
User interface isn't something that can be applied after the fact, like a thin coat of paint. It is something that needs to be there at the start, and based on real research. There's tons of Usability research available of course. It needs to not just be there at the start, it needs to form the core of the very reason you're making the software in the first place: There's some gap in the world out there, some problem, and it needs to be made more usable and more efficient.
Software is not there for its own sake. The reason for a peice of software to exist is FOR PEOPLE. It's absolutely ludicrous to even try to come up with an idea for a new peice of software, without understanding why anyone would need it. Yet this happens all the time.
Before a single line of code is written, you should go through paper versions of the interface, and test it on real people. This is kind of weird and silly, it works best with kids, and someone entertaining acting as "the computer".
The interface needs to take advantage of our natural cognitive facilities. How would a caveman use your program? For instance, we've evolved to be really good at tracking moving objects. That's why interfaces that use physics simulations, like the iphone, work better than interfaces where changes occur instantaneously.
We are good at certain kinds of abstraction, but not others. As programmers, we're trained to do mental gymnastics and backflips to understand some of the weirdest abstractions. For instance, we understand that a sequence of arcane text can represent and be translated into a pattern of electromagnetic state on a metal platter, which when encountered by a carefully designed device, leads to a sequence of invisible events that occur at lightspeed on an electronic circuit, and these events can be directed to produce a useful outcome. This is an incredibly unnatural thing to have to understand. Understand that while it's got a perfectly rational explanation to us, to the outside world, it looks like we're writing incomprehensible incantations to summon invisible sentient spirits to do our bidding.
The sorts of abstractions that normal humans understand are things like maps, diagrams, and symbols. Beware of symbols, because symbols are a very fragile human concept that take conscious mental effort to decode, until the symbol is learned.
The trick with symbols is that there has to be a clear relationship between the symbol, and the thing it represents. The thing it represents either has to be a noun, in which case the symbol should look VERY MUCH like the thing it represents. If a symbol is representing a more abstract concept, that has to be explained IN ADVANCE. See the inscrutable unlabled icons in msword's, or photoshop's toolbar, and the abstract concepts they represent. It has to be LEARNED that the crop tool icon in photoshop means CROP TOOL. it has to be understood what CROP even means. These are prerequisites to correctly using that software. Which brings up an important point, beware of ASSUMED knowledge.
We only gain the ability to understand maps around the age of 4. I think I read somewhere once that chimpanzees gain the ability to understand maps around the age of 6 or 7.
The reason that guis have been so successful to begin with, is that they changed a landscape of mostly textual interfaces to computers, to something that mapped the computer concepts to something that resembled a physical place. Where guis fail in terms of usability, is where they stop resembling something you'd see in real life. There are invisible, unpredictable, incomprehensible things that happen in a computer that bare no resemblance to anything you'd ever see in the physical world. Some of this is necessary, since there'd be no point in just making a reality simulator- The idea is to save work, so there has to be a bit of magic. But that magic has to make sense, and be grounded in an abstraction that human beings are well adapted to understanding. It's when our abstractions start getting deep, and layered, and mismatched with the task at hand that things break down. In other words, the interface doesn't function as a good map for the underlying software.
There are lots of books. The two I've read, and can therefore reccomend, are "The Design of Everyday Things" by donald norman, and "The Human Interface" by Jef Raskin.
I also reccomend a course in psychology. "The Design of Every day Things" talks about this a bit. A lot of interfaces break down because of a developer's "folk understanding" of psychology. This is similar to "folk physics". An object in motion stays in motion doesn't make any sense to most people. "You have to keep pushing it to keep it in motion!" thinks the physics novice. User testing doesn't make sense to most developers. "You can just ask the users what they want, and that should be good enough!" thinks the psychology novice.
I reccomend Discovering Psychology, a PBS documentary series, hosted by Philip Zimbardo. Failing that, try and find a good physics textbook. The expensive kind. Not the pulp fiction self help crap that you find in Borders, but the thick hardbound stuff you can only find in a university library. This is a necesesary foundation. You can do good design without it, but you'll only have an intuitive understanding of what's going on. Reading some good books will give you a good perspective.
If you read the book "Why software sucks" you would have seen Platt's answer, which is a simple one:
Developers prefere control over user-friendliness
Average people prefere user-friendliness over control
But another another answer to your question would be "why is dentistry so hard for some developers?" - UI design is best done by a UI designer.
http://dotmad.net/blog/2007/11/david-platt-on-why-software-sucks/
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 4 years ago.
Improve this question
A couple of weeks ago, my piano teacher and I were bouncing ideas off of each other concerning meta-composing music software. The idea was this:
There is a system taking midi input from a bunch of instruments, and pushes output to the speakers and lights. The software running on this system analyzes the midi data it's getting, and determines which sounds to use, based on triggers set up by the composer (when I play an F7 chord 3 times within 2 seconds, switch from the harpsichord sound to the piano sound), pedals, or actual real-time analysis of the music. It would control the lights based on the performance and sounds of the instruments in a similar fashion - the musician would only have to vaguely specify what they wanted - and real time analysis of their playing would do the rest. On the fly procedurally generated music could play along with the musician as well. Essentially, the software would play along with the performer, with one guiding the other. I imagine that it would take some practice to get use to such a system, but that it could have quite incredible results.
I'm a big fan of improv jazz. One characteristic of improv that is lacking from other art forms is the temporalness of it. A painting can be appreciated 10 or 1000 years after it has been painted, but music (especially extemporized music) is about the performance as it is the creation. I think that the software that I described would add a great deal to the performance, as with it, as playing the exact same piece would result in a completely different show each time.
So, now for the questions.
Am I crazy?
Does software to do any or all of this exist yet? I've done some research and haven't turned up anything. The key to this system is that it is running during the performance.
Were I to write something like this, would a scripting language such as Python be fast enough to do the computations that I need? Presumably it'd be running on a fairly quick system, and could take advantage of the 2^n core processors Intel keeps releasing.
Can any of you share your experience and advice concerning interfacing with musical instruments and lights and the like?
Have any ideas or suggestions? Cold and harsh criticism?
Thanks for your time in reading this, and for any and all advice!
(And sorry for the joke in the tags, I couldn't resist.)
People have used Max MSP to do this kind of thing with Midi and creating video accompaniment, or just Midi accompaniment. It's a completely domain specific app, that probably was inspired by small talk or something, which barely any real programmer could love, but musician-programmers do.
Despite the text on the site I just linked to, and the fact that 'everyone' uses the commercial version, it wasn't always a commercial product. Ircam eventually released it's own lineage. It's called jMax. PureData, mentioned in another post here is another rewrite of that lineage.
There's also CSound; which wasn't meant to be real-time, but is likely able to be pretty real-time now that you have a decent computer compared to where CSound started.
Some people have also hacked Macromedia Director extensions to allow for doing midi stuff in Lingo... That's very outdated, and hence some of them have moved to more modern Adobe environments.
Look at PureData. It can do extensive midi analysis and folks use it for performance.
Indeed, here's a video that flashes past a puredata screen. It shows someone interacting with a rather complex instrument using PD.
Also, look at CSounds.
I have used PyAudio quite extensively for dealing with raw audio inputs, and found it to be very unpythonic, acting much more like a very thin wrapper over C code. However, if you're dealing with midi, rather then raw waveforms, then your tasks are quite a bit simpler, and python should be quite fast enough, unless you play at 10000 beats per minute :)
Some of the issues: detecting simultaneity, harmonic (formal - i.e., chord structure) analysis.
This is also an 80/20 problem that if you restrict the chord progressions allowed, then it becomes quite a bit simpler. After all, what does "playing along" mean, anyway, right?
(Also, at electronic music conf's I've been too, there are lots of people doing various real-time accompaniment experiments based on input sound and movement). Good luck!
You might also look at ChucK and SuperCollider, the two most popular 'real' realtime music programming languages.
Also, you might be surprised at how much you can accomplish with Ableton Live racks.
(and it's CSound. No 's' at the end)
see also:
Keykit
Arx
I have no idea if the second one is actually real or worth looking at. Keykit, however, is.
You might contact Gary Lee Nelson in the TIMARA department at Oberlin. 20 years ago I did a project that auto-generated the rhythm section for 12 bar blues and I recall him describing a tool that he knew of that did essentially what you're describing.
You might be interested in GenJam
The answer to your question is no - you're not crazy.
Similar systems exist, but your description is pretty
vague to begin with so it's not much of a spec to judge against.
I suggest you start writing a prototype and see how it does.
Something extremely small and simple.
Existing systems be damned.
I'm using c++ on win32 api (no mfc).
Started writing my sequencer back on the Amiga500.
It doesn't do lights, but there's plenty to do in just music.
Good luck to you.
It's an EXTREMELY fun project.
I'd say -don't- pattern your project on how other projects work.
Because, if you ask me, they don't work so great ;)
And the fun is being able to do something different.