Making a spell check utility - data-structures

This idea just popped into my head, so I don't have any code to show for it, but I was curious to know the answer. How is spell check implemented on most major word processors? I'm most curious to know what kind of data structures would be used in the creation of such a utility. Also, references to algorithms would be nice answers as well.

For a basic guide in python, have a look here.
Also you might want to look at this past question
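On the data-structure side of the question, one structure that often comes up for dictionary lookup is a trie (prefix tree). The following is only a minimal sketch, not how any particular word processor does it; the `Trie` class, the `misspelled` helper and the tiny word list are all made up for illustration.

```python
class Trie:
    """Prefix tree storing a dictionary of known words."""

    def __init__(self):
        self.children = {}    # maps a character to the child Trie node
        self.is_word = False  # True if a stored word ends at this node

    def insert(self, word):
        node = self
        for ch in word:
            node = node.children.setdefault(ch, Trie())
        node.is_word = True

    def _find(self, prefix):
        """Return the node reached by following prefix, or None."""
        node = self
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def __contains__(self, word):
        node = self._find(word)
        return node is not None and node.is_word

    def with_prefix(self, prefix):
        """Return all stored words that start with prefix, sorted."""
        start = self._find(prefix)
        if start is None:
            return []
        found = []
        stack = [(start, prefix)]
        while stack:
            node, text = stack.pop()
            if node.is_word:
                found.append(text)
            for ch, child in node.children.items():
                stack.append((child, text + ch))
        return sorted(found)


def misspelled(text, dictionary):
    """Return the words in text that are not in the dictionary."""
    return [w for w in text.lower().split() if w not in dictionary]
```

A plain Python `set` would answer the membership question just as well; the trie mainly earns its keep when you also want prefix queries, for example to offer completions.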

Related

Text search question about implementation

Can someone explain to me how text searching algorithms work? I understand it's a huge field, but I am trying to understand it at a high level so that I can look up academic papers on it.
For example, spelling mistakes are a tough problem to solve, and of course Google solves it. When I misspell a search term on Google, it automatically suggests the correct spelling. How is indexing done for that? Using MapReduce I can see they index various entities. What do they, or anyone else, index and store? Maybe I am looking for a practical implementation of MapReduce, if I am thinking in the right direction at all.
Pav
I'm afraid this question really is too big, which probably explains why it has not seen an answer yet. As far as Google's spell-checker is concerned, Peter Norvig explains how it is done: How to Write a Spelling Corrector
The exact implementation in production use at Google surely looks quite a bit different and is far more complicated, but this might get you started.
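To give a feel for the general idea, here is a rough sketch of a frequency-based corrector: generate every string one edit away from the input and pick the candidate seen most often in some corpus. This is a simplified illustration and certainly not what Google runs; the toy corpus and the names `WORD_COUNTS`, `edits1` and `correct` are assumptions made for the example.

```python
from collections import Counter
import string

# Toy corpus (an assumption for illustration); a real corrector would
# count words from a large body of text.
WORD_COUNTS = Counter("the quick brown fox jumps over the lazy dog the fox".split())


def edits1(word):
    """All strings one delete, transpose, replace or insert away from word."""
    letters = string.ascii_lowercase
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in letters]
    inserts = [a + c + b for a, b in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)


def correct(word, counts=WORD_COUNTS):
    """Return the most frequent known word within one edit, else word itself."""
    if word in counts:
        return word
    candidates = [w for w in edits1(word) if w in counts]
    if not candidates:
        return word
    return max(candidates, key=lambda w: counts[w])
```

With this toy corpus, `correct("teh")` gives `"the"` and an unknown string with no close neighbour is returned unchanged.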

What human learning techniques can be applied to improve code layout?

Is it possible to use the results of studies made into human learning in order to identify how code might be laid out to improve comprehension?
Code layout wars almost always end up defending consistency and the prevailing style, but could there be ways of laying out code that are provably better than others?
What is Code Layout to you?
On one hand there are these evil things called coding conventions, which place everyone in a corset. I loathe these and I believe we're far behind schedule in eliminating them. We can parse code, and I do not understand why our IDEs still display code based on the very textual format it is stored in. What's so hard about allowing each user to set up his layout preferences and having the IDE display all source code accordingly? Most IDEs offer some kind of auto-format option, yet you often cannot customize how it works.
However, a far more interesting approach is whether our current point of view on source code is suitable for learning at all. Projects like Code Bubbles are pioneering a new way there. And then of course, we have model-based approaches which are often more accessible from a learner's point of view.
I'm afraid there is no definite answer to this question. In fact, if you can write down a detailed answer for it, don't forget to claim a PhD for it ;)
Could there be ways of laying out code that are provably better than others?
Yes. This problem was studied extensively in the 1980s. You could read all about it :-)
A good university library should have Human Factors and Typography for More Readable Programs by Ronald M. Baecker and Aaron Marcus, published by Addison-Wesley in 1990.
I think this comes down to personal preference. I prefer to have very little shorthand in my code. I think it's the best way for me to comprehend what's going on inside my code without having to remember which order shorthand works in; maybe my memory is bad.
Possibly it would be a good idea to use such studies on, say, a class of students learning to write code the same way, but everyone develops their own way of coding over time. There are already "provably better ways", as laid out by the best practice suggestions for each language.
Interesting question.
The biggest problem for me with understanding code is not code layout (however code should be formatted consistently) but following execution order. In complex OO source code it is hard to see the complete code involved in execution.
I think that IDE functions can help a lot for code understanding. For me (as a java developer) tools like the Call Hierarchy view in Eclipse and Mylyn are very useful.
An interesting (new) way of understanding code is shown in the Code Bubbles Project.
I expect more steps in these directions in the future.
I think teaching programming may have given me some skill in this area, because to get ideas across to students you have to keep things small and simple, and introduce only one concept at a time.
However, as one of my colleagues used to say to his students:
Teaching is my job.
Learning is yours.
As that applies to programming, I think it is the programmer's responsibility to write the code so as to educate others as to what he/she is trying to accomplish, but there is no code that will be clear to readers who do not put in effort.

What does a good programmer's code look like? [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
I am a hobbyist programmer (started with VBA to make excel quicker) and have been working with VB.NET / C#.NET and am trying to learn ADO.NET.
A facet of programming that has always frustrated me is what does 'good' look like? I am not a professional so have little to compare against. What makes a better programmer?
Is it:
They have a better understanding of all the objects / classes / methods in a given language?
Their programs are more efficient?
The design of their programs is much better in terms of better documentation, good choice of names for functions etc.?
Put another way, if I were to look at the code of a professional programmer, what is the first thing that I would notice about their code relative to mine? For example, I read books like 'Professional ASP.NET' by Wrox press. Are the code examples in that book 'world class'? Is that the pinnacle? Would any top-gun programmer look at that code and think it was good code?
The list below is not comprehensive, but these are the things that I thought of in considering your question.
Good code is well-organized. Data and operations in classes fit together. There aren't extraneous dependencies between classes. It does not look like "spaghetti."
Good code comments explain why things are done not what is done. The code itself explains what is done. The need for comments should be minimal.
Good code uses meaningful naming conventions for all but the most transient of objects. The name of something is informative about when and how to use the object.
Good code is well-tested. Tests serve as an executable specification of the code and examples of its use.
Good code is not "clever". It does things in straightforward, obvious ways.
Good code is developed in small, easy to read units of computation. These units are reused throughout the code.
I haven't read it yet, but the book I'm planning to read on this topic is Clean Code by Robert C. Martin.
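To make the point about comments explaining why rather than what a bit more concrete, here is a small made-up example. The function name and the retry scenario are hypothetical; only the contrast between the two comment styles matters.

```python
def retry_delay_seconds(attempt):
    # A "what" comment would just restate the code:
    #   "return 2 to the power of attempt, at most 60"
    #
    # A "why" comment records the reasoning the code cannot show:
    # doubling the wait keeps many clients from hammering a struggling
    # server at the same moment, and the 60 second cap keeps a long
    # outage from producing waits that look like a hang to the user.
    return min(2 ** attempt, 60)
```

The name and the one-line body already say what happens; the comment is only there for the part a reader could not work out from the code.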
The first thing you'd notice is that their code follows a consistent coding-style. They always write their structure blocks the same, indent religiously and comment where appropriate.
The second thing you'd notice is that their code is segmented into small methods / functions spanning no more than a couple dozen lines at the most. They also use self-describing method names, and generally their code is very readable.
The third thing you'd notice, after you messed around with the code a little is that the logic is easy to follow, easy to modify - and therefore easily maintainable.
After that, you'll need some knowledge and experience in software design techniques to understand the specific choices they took constructing their code architecture.
Regarding books, I haven't seen many books where the code could be considered "world-class". In books they try mostly to present simple examples, which might be relevant to solving very simple problems but aren't reflective of more complex situations.
Quoting Fowler, summarizing readability:
Any fool can write code that a computer can understand.
Good programmers write code that humans can understand.
'nough said.
Personally, I'll have to quote "The Zen of Python" by Tim Peters. It tells Python programmers what their code should look like, but I find that it applies to basically all code.
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than right now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
Code is poetry.
Start from this point of logic and you can derive many of the desirable qualities of code. Most importantly, observe that code is read far more than it is written, hence write code for the reader. Rewrite, rename, edit, and refactor for the reader.
A follow on corollary:
The reader will be you at time n from the code creation date. The payoff of writing code for the reader is a monotonically increasing function of n. A reader looking at your code for the first time is indicated by n == infinity.
In other words, the larger the gap of time from when you wrote the code to when you revisit the code, the more you will appreciate your efforts to write for the reader. Also, anyone you hand your code off to will gain great benefit from code written with the reader as the foremost consideration.
A second corollary:
Code written without consideration for the reader can be unnecessarily difficult to understand or use. When the consideration for the reader drops below a certain threshold, the reader derives less value from the code than the value gained by rewriting the code. When this occurs the previous code is thrown away and, tragically, much work is repeated during the rewrite.
A third corollary:
Corollary two has been known to repeat itself multiple times in a vicious cycle of poorly documented code followed by forced rewrites.
I've been programming for 28 years and I find this a tough question to answer. To me good code is a complete package. The code is cleanly written, with meaningful variable and method names. It has well placed comments that comment the intent of the code and doesn't just regurgitate the code you can already read. The code does what it is supposed to in an efficient manner, without wasting resources. It also has to be written with an eye towards maintainability.
The bottom line though is that it means different things to different people. What I might label as good code someone else might hate. Good code will have some common traits which I think I've identified above.
The best thing you can do is expose yourself to code. Look at other people's code. Open Source projects are a good source for that. You will find good code and bad code. The more you look at it, the better you will recognize what you determine to be good code and bad code.
Ultimately you will be your own judge. When you find styles and techniques you like adopt them, over time you will come up with your own style and that will change over time. There is no person on here that can wave a wand and say what is good and that anything else is bad.
Read the book Code Complete. It explains a lot of ideas about how to structure code and the reasons for doing so. Reading it should shorten the time it takes you to acquire the experience necessary to tell good from bad.
http://www.amazon.com/Code-Complete-Practical-Handbook-Construction/dp/0735619670/ref=pd_bbs_sr_1?ie=UTF8&s=books&qid=1229267173&sr=8-1
Having programmed for nearly 10 years now myself and having worked with others, I can say without bias that there is no difference between a good programmer's and an average programmer's code.
All programmers at a competent level:
Comment Correctly
Structure Efficiently
Document Cleanly
I once overheard a co-worker say "I've always been very logical and rational minded. I think that's why I enjoy developing"
That in my opinion, is the mind of an average programmer. One who sees the world in terms of rules and logic and ultimately obeys those rules when designing and writing a program.
The expert programmer understands the rules, but also their context. This ultimately leads to them coming up with new ideas and implementations, the mark of an expert programmer. Programming is ultimately an art form.
Succinctly put, a good programmer's code can be read and understood.
In my opinion, a good programmer's code is language-agnostic; well-written code can be read and understood in a short amount of time with minimal thinking, regardless of the programming language used. Whether the code is in Java, Python, C++ or Haskell, well-written code is understandable by people who don't even program in that particular language.
Some characteristics of code that is easy to read are well-named methods, the absence of "tricks" and convoluted "optimization", and well-designed classes, to name a few. As others have mentioned, the coding style is consistent, succinct and straightforward.
For example, the other day, I was taking a look at the code for TinyMCE to answer one of the questions on Stack Overflow. It is written in JavaScript, a language that I've hardly used. Yet, because of the coding style and the comments that are included, along with the structuring of the code itself, it was fairly understandable, and I was able to navigate through the code in a few minutes.
One book that was quite an eye-opener for me in the regard of reading good programmer's code is Beautiful Code. It has many articles written by authors of various programming projects in various programming languages. Yet, when I read it, I could understand what the author was writing in his code despite the fact that I've never even programmed in that particular language.
Perhaps what we should keep in mind is that programming is also about communication, not only to the computer but to people, so good programmer's code is almost like a well-written book, which can communicate to the reader about the ideas it wants to convey.
Easy to read
easy to write
easy to maintain
everything else is filigree
Good code should be easily understood.
It should be well commented.
Difficult parts should be even better commented.
Good code is readable. You'd have no trouble understanding what the code is doing on the first read through of code written by a good professional programmer.
Rather than repeat everyone else's great suggestions, I will instead suggest that you read the book Code Complete by Steve McConnell.
Essentially it is a book packed full of programming best practices for both functionality and style.
[Purely subjective answer]
For me, good code is a form of art, just like a painting. I might go further and say that it's actually a drawing that includes characters, colors, "form" or "structure" of code, and with all this being so readable/performant. The combination of readability, structure (i.e. columns, indentation, even variable names of the same length!), color (class names, variable names, comments, etc.) all make what I like to see as a "beautiful" picture that can make me either very proud or very detestful of my own code.
(As said before, very subjective answer. Sorry for my English.)
I second the recommendation of Bob Martin's "Clean Code".
"Beautiful Code" was highly acclaimed a couple of years ago.
Any of McConnell's books are worth reading.
Perhaps "The Pragmatic Programmer" would be helpful, too.
Just wanted to add my 2 cents... comments in your code -- and your code itself, generally -- should say what your code does, not how it does it. Once you have the concept of 'client' code, which is code that calls other code (simplest example is code that calls a method), you should always be most worried about making your code comprehensible from the "client's" perspective. As your code grows, you'll see that this is... uh, good.
A lot of the other stuff about good code is about the mental leaps that you'll make (definitely, if you pay attention)... 99% of them have to do with doing a bit more work now to spare you a ton of work later, and reusability. And also with doing things right: I almost always want to run the other way rather than using regular expressions, but every time I get into them, I see why everybody uses them in every single language I work in (they're abstruse, but work and probably couldn't be better).
Regarding whether to look at books, I would say definitely not in my experience. Look at APIs and frameworks and code conventions and other people's code and use your own instincts, and try to understand why stuff is the way it is and what the implications of things are. The thing that code in books almost never does is plan for the unplanned, which is what error checking is all about. This only pays off when somebody sends you an email and says, "I got error 321" instead of "hey, the app is broke, yo."
Good code is written with the future in mind, both from the programmer's perspective and the user's perspective.
This is answered pretty well in Fowler's book, "Refactoring". It's the absence of all the "smells" he describes throughout the book.
I haven't seen 'Professional ASP.NET', but I'd be surprised if it's better than OK. See this question for some books with really good code. (It varies, of course, but the accepted answer there is hard to beat.)
This seems to be (should be) a FAQ. There was an ACM article about beautiful code recently. There seems to be a lot of emphasis on easy to read/understand. I'd qualify this with "easy to read/understand by domain experts". Really good programmers tend to use the best algorithms (instead of naive, easy to understand O(n^2) algorithms) for any given problem, which could be hard to follow if you're not familiar with the algorithm, even if the good programmer gives a reference to the algorithm.
Nobody is perfect, including good programmers, but their code tends to strive for:
Correctness and efficiency with proven algorithms (instead of naive and adhoc hacks)
Clarity (comment for intent with reference to non-trivial algorithms)
Completeness to cover the basics (coding convention, versioning, documentation, unit tests etc.)
Succinctness (DRY)
Robustness (resilient to arbitrary input and disruption of change requests)
I second the recommendation for Uncle Bob's "Clean Code", but you may wish to take a look at http://www.amazon.com/Implementation-Patterns-Addison-Wesley-Signature-Kent/dp/0321413091 as I think this deals with your specific question a bit better. Good code should leap off the page and tell you what it does and how it works.
Jeff Atwood wrote a nice article about how coders are Typists first reference:
http://www.codinghorror.com/blog/archives/001188.html
When being a typist you always need to be elegant in your work; having structure and proper "grammar" is highly important. Carrying this over to "programming" typing leads to the same outcome.
Structure
Comments
Regions
I'm a software engineer, which means that during my education I've come across many different languages, but my programming always "feels" the same, as my writing does on fekberg.wordpress.com; I have a "special" way of typing.
Now, having programmed different applications in different languages, such as Java, C#, Assembler, C++ and C, I've arrived at a "standard" way of writing that I like.
I see everything as "boxes" or regions, and each region has its own explanatory comment. A region might be "class Person", and inside this region I have a couple of methods for properties, which I may call "Access Methods" or such, and each property and region has its own explanatory comment.
This is highly important. I always see the code I write as "being part of an API", and when creating an API, structure and elegance are VERY important.
Think about this. Also read my paper on communication issues when adapting outsourcing, which explains roughly how bad code can cause conflict. Interpret it as you like: http://fekberg.wordpress.com/2008/12/14/communication-issues-when-adapting-outsourcing/
Good code is easy to understand, easy to maintain, and easy to add to. Ideally, it is also as efficient as possible without sacrificing other indicators.
Great code to me is something that is simple to grasp yet sophisticated. The things that make you go, "wow, of course, why didn't I think of it that way?". Really good code is not hard to understand, it simply solves the problem at hand in a straight-forward way (or a recursive way, if that is even simpler).
Good code is where you know what the method does from the name. Bad code is where you have to work out what the code does, to make sense of the name.
Good code is where if you read it, you can understand what it's doing in not much more time than it takes to read it. Bad code is where you end up looking at it for ages trying to work out wtf it does.
Good code has things named in such a way as to make trivial comments unnecessary.
Good code tends to be short.
Good code can be reused to do what it does anywhere else, since it doesn't rely on stuff that is really unrelated to its purpose.
Good code is usually a set of simple tools to do simple jobs (put together in well organised ways to do more sophisticated jobs). Bad code tends to be huge multi-purpose tools that are easy to break and difficult to use.
Code is a reflection of a programmer's skills and mindset. Good programmers always have an eye on the future: how the code will function when requirements or circumstances are not exactly what they are today. How scalable will it be? How convenient will it be when I am not the one maintaining this code? How reusable will the code be, so that someone else doing similar stuff can reuse it and not write it again? What happens when someone else is trying to understand the code that I have written?
When a programmer has that mindset, all the other stuff falls in place nicely.
Note: A code base is worked on by many programmers over time and typically there is not a specific designation of code base to a programmer. Hence good code is a reflection of all the company's standards and quality of their workforce.
(I use "he" below because this is the person that I aspire to be, sometimes with success).
I believe that the core of a good programmer's philosophy is that he is always thinking "I am coding for myself in the future when I will have forgotten all about this task, why I was working on it, what were the risks and even how this code was supposed to work."
As such, his code has to:
Work (it doesn't matter how fast code gets to the wrong answer. There's no partial credit in the real world).
Explain how he knows that this code works. This is a combination of documentation (javadoc is my tool of choice), exception handling and test code. In a very real sense, I believe that, line for line, test code is more valuable than functional code if for no other reason than it explains "this code works, this is how it should be used, and this is why I should get paid."
Be maintained. Dead code is a nightmare. Legacy code maintenance is a chore but it has to be done (and remember, it's "legacy" the moment that it leaves your desk).
On the other hand, I believe that the good programmer should never do these things:
Obsess over formatting. There are plenty of IDEs, editors and pretty-printers that can format code to exactly the standard or personal preference that you feel is appropriate. I use Netbeans, I set up the format options once and hit alt-shift-F every now and then. Decide how you want the code to look, set up your environment and let the tool do the grunt work.
Obsess over naming conventions at the expense of human communication. If a naming convention is leading you down the road of naming your classes "IElephantProviderSupportAbstractManagerSupport" rather than "Zookeeper", change the standard before you make it harder for the next person.
Forget that he works as a team with actual human beings.
Forget that the primary source of coding errors is sitting at his keyboard right now. If there's a mistake or an error, he should look to himself first.
Forget that what goes around comes around. Any work that he does now to make his code more accessible to future readers will almost certainly benefit him directly (because who's going to be the first person asked to look at his code? He is).
It works
It has unit tests that prove that it works
The rest is icing...
The best code has a certain elegance that you recognise as soon as you see it.
It looks crafted, with care and attention to detail. It's obviously produced by someone with skill and has an art about it - you could say it looks sculpted and polished, rather than rough and ready.
It's consistent and reads easily.
It's split into small, highly cohesive functions, each of which does one thing and does it well.
It's minimally coupled, meaning that dependencies are few and strictly controlled, usually by...
Functions and classes have dependencies on abstractions rather than implementations.
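As a rough illustration of depending on an abstraction rather than an implementation, consider the sketch below. Every name in it (`Notifier`, `EmailNotifier`, `RecordingNotifier`, `OrderService`) is invented for the example, and `EmailNotifier` only pretends to send mail.

```python
from typing import Protocol


class Notifier(Protocol):
    """The abstraction: anything with a send(message) method."""

    def send(self, message: str) -> None: ...


class EmailNotifier:
    """One concrete implementation (stubbed: it only records an outbox)."""

    def __init__(self) -> None:
        self.outbox: list[str] = []

    def send(self, message: str) -> None:
        self.outbox.append("EMAIL: " + message)


class RecordingNotifier:
    """A second implementation, handy as a test double."""

    def __init__(self) -> None:
        self.messages: list[str] = []

    def send(self, message: str) -> None:
        self.messages.append(message)


class OrderService:
    """Depends only on the Notifier abstraction, not on email."""

    def __init__(self, notifier: Notifier) -> None:
        self.notifier = notifier

    def place_order(self, item: str) -> None:
        self.notifier.send(f"Order placed: {item}")
```

Because `OrderService` never names a concrete class, swapping email for something else, or for a recorder in a unit test, needs no change to `OrderService` itself.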
Ironically the better the programmer the less indispensable he/she becomes because the code produced is better maintainable by anyone (as stated by general consent by Eran Galperin).
My experience tells the opposite is also true. The worse the programmer the more difficult to maintain his/her code is, so more indispensable he/she becomes, since no other soul can understand the riddles produced.
I have a good example :
Read the GWT (Google Web Toolkit) source code; you will see that any fool can understand it (some English books are harder to read than this code).

Please recommend an open source project with quality comments in good English

English is not my mother tongue. However, I have to write comments in English. I want to improve my "comment English" by studying a piece of code which is commented in a good English. Please recommend an open source project which contains a lot of meaningful comments written by people with an excellent command of the language.
I can recommend Simon Tatham's Puzzle Collection.
The comments I've read look like correct English to me. Also, they're clearly written and contain useful information about why things are done the way they are.
The developer documentation is also easy to read and understand, and as a side benefit you'll be reading the documentation of a system with a good and simple architecture.
If you aim to read something and then emulate it without explicitly understanding why doing so is a good idea, I can definitely recommend ST's puzzles. If you want an explicit understanding of what good writing is, I think ST's puzzles will serve as a good example, but you really want to read something like Strunk and White.
I don't think comments are a good place to learn, even if you have to emulate this style. Comments are often not even correct English and good projects will not have too many comments anyway, hence, not much to learn from.
If, on the other hand, you take a project that is commented extensively you can be almost sure that it's not a great role model since the programmers were incapable of conveying terse meaning. This is a generalization, of course. However, I believe it's (almost?) always true. YMMV.
Instead, learn English by studying the experts. There has been a question about this. I recommended “On Writing Well” by William Zinsser and I will do so again.
(By the way, English isn't my mother tongue either.)
I've only ever looked at small pieces of it, but I hear the linux kernel is very well commented.
I've always thought Angband is well commented.
A fairly well commented open source project is Drupal. Check it out at Drupal.org. I've developed several portals in it and you can actually learn quite a bit by reading comments in the modules and themes. Actually they are sometimes more informative than the documentation.
Minix is quite well commented. They can be a bit terse in some places but the comments are very helpful.
Also, if you want to improve your writing... There are a lot of confusing things about the English language. How you master these points is what distinguishes a mediocre writer from an excellent writer. The best, smallest, and most concise book to get (everyone should have a copy!) is called The Elements of Style by Strunk and White. Note that you can get it for 2 bucks on Amazon. Best $2 you'll spend!

How do you tell someone they're writing bad code? [closed]

Closed. This question is off-topic. It is not currently accepting answers.
I've been working with a small group of people on a coding project for fun. It's an organized and fairly cohesive group. The people I work with all have various skill sets related to programming, but some of them use older or outright wrong methods, such as excessive global variables, poor naming conventions, and other things. While things work, the implementation is poor. What's a good way to politely ask or introduce them to use better methodology, without it coming across as questioning (or insulting) their experience and/or education?
Introduce questions to make them realise that what they are doing is wrong. For example, ask these sort of questions:
Why did you decide to make that a global variable?
Why did you give it that name?
That's interesting. I usually do mine this way because [Insert reason why you are better]
Does that way work? I usually [Insert how you would make them look silly]
I think the ideal way of going about this is subtly asking them why they code a certain way. You may find that they believe that there are benefits to other methods. Unless I knew the reason for their coding style was due to misinformation I would never judge my way as better without good reason. The best way to go about this is to just ask them why they chose that way; be sure to sound interested in their reasoning, because that is what you need to attack, not their ability.
A coding standard will definitely help, but if it were the answer to every software project then we'd all be sipping cocktails on our private islands in paradise. In reality, we're all prone to problems and software projects still have a low success rate. I think the problem would mostly stem from individual ability rather than a problem with convention, which is why I'd suggest working through the problems as a group when a problem rears its ugly head.
Most importantly, do NOT immediately assume that your way is better. In reality, it probably is, but we're dealing with another person's opinion and to them there is only one solution. Never say that your way is the better way of doing it unless you want them to see you as a smug loser.
Start doing code reviews or pair programming.
If the team won't go for those, try weekly design reviews. Each week, meet for an hour and talk about a piece of code. If people seem defensive, pick old code that no one is emotionally attached to any more, at least at the beginning.
As @JesperE said, focus on the code, not the coder.
When you see something you think should be different, but others don't see it the same way, then start by asking questions that lead to the deficiencies, instead of pointing them out. For example:
Globals: Do you think we'll ever want to have more than one of these? Do you think we will want to control access to this?
Mutable state: Do you think we'll want to manipulate this from another thread?
I also find it helpful to focus on my limitations, which can help people relax. For example:
long functions: My brain isn't big enough to hold all of this at once. How can we make smaller pieces that I can handle?
bad names: I get confused easily enough when reading clear code; when names are misleading, there's no hope for me.
Ultimately, the goal is not for you to teach your team how to code better. It's to establish a culture of learning in your team, where each person looks to the others for help in becoming a better programmer.
Introduce the idea of a code standard. The most important thing about a code standard is that it proposes the idea of consistency in the code base (ideally, all of the code should look like it was written by one person in one sitting) which will lead to more understandable and maintainable code.
You have to explain why your way is better.
Explain why a function is better than cutting & pasting.
Explain why an array is better than $foo1, $foo2, $foo3.
Explain why global variables are dangerous, and that local variables will make life easier.
Simply whipping out a coding standard and saying "do this" is worthless because it doesn't explain to the programmer why it's a good thing.
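As a rough sketch of the kind of side-by-side you could show (Python here, with made-up names like `with_tax` and `prices`; prices are in cents to keep the arithmetic exact), the first half repeats the same formula on numbered variables and the second half uses one function and one list:

```python
# Hypothetical illustration; the names and the 20% rate are assumptions.

# Copy-and-paste style with numbered variables: the formula appears
# three times, so a change to the tax rule must be made three times.
price1 = 1000
price2 = 2000
price3 = 3000
total1 = price1 + price1 * 20 // 100
total2 = price2 + price2 * 20 // 100
total3 = price3 + price3 * 20 // 100

# Function plus list: the rule lives in one place and works for
# any number of prices.
def with_tax(price_cents, tax_percent=20):
    """Return the price in cents including tax."""
    return price_cents + price_cents * tax_percent // 100

prices = [1000, 2000, 3000]
totals = [with_tax(p) for p in prices]
print(totals)
```

Adding a fourth price to the second version is one list entry; in the first version it is two more lines and another chance to mistype the formula.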
First, I'd be careful not to judge too quickly. It's easy to dismiss some code as bad, when there might be good reasons why it's so (eg: working with legacy code with weird conventions). But let's assume for the moment that they're really bad.
You could suggest establishing a coding standard, based on the team's input. But you really need to take their opinions into account then, not just impose your vision of what good code should be.
Another option is to bring technical books into the office (Code Complete, Effective C++, the Pragmatic Programmer...) and offer to lend them to others ("Hey, I'm finished with this, would anyone like to borrow it?")
If possible, make sure they understand that you're criticizing their code, not them personally.
Suggest a better alternative in a non-confrontational way.
"Hey, I think this way will work too. What do you guys think?" [Gesture to obviously better code on your screen]
Have code reviews, and start by reviewing YOUR code.
It will put people at ease with the whole code review process because you are beginning the process by reviewing your own code instead of theirs. Starting off with your code will also give them good examples of how to do things.
They may think your style stinks too. Get the team together to discuss a consistent set of coding style guidelines. Agree to something. Whether that fits your style isn't the issue, settling on any style as long as it's consistent is what matters.
By example. Show them the right way.
Take it slow. Don't thrash them for every little mistake right off the bat, just start with things that really matter.
The code standard idea is a good one.
But consider not saying anything, especially since it is for fun, with, presumably, people you are friends with. It's just code...
There's some really good advice in Gerry Weinberg's book "The Psychology of Computer Programming" - his whole notion of "egoless programming" is all about how to help people accept criticism of their code as distinct from criticism of themselves.
Bad naming practices: Always inexcusable.
And yes, do not always assume that your way is better... It can be difficult, but objectivity must be maintained.
I've had an experience with a coder who named functions so horribly that the code was worse than unreadable. The functions lied about what they did, and the code was nonsensical. And they were protective of their code and resistant to having someone else change it. When confronted very politely, they admitted it was poorly named, but wanted to retain their ownership of the code and would go back and fix it up "at a later date."
This is in the past now, but how do you deal with a situation where the error is ACKNOWLEDGED, but then protected? This went on for a long time and I had no idea how to break through that barrier.
Global variables: I myself am not THAT fond of global variables, but I know a few otherwise excellent programmers that like them A LOT. So much so that I've come to believe they are not actually all that bad in many situations, as they allow for clarity and ease of debugging. (please don't flame/downvote me :) ) It comes down to this: I've seen a lot of very good, effective, bug-free code that used global variables (not put in by me!) and a great deal of buggy, impossible to read/maintain/fix code that meticulously used proper patterns. Maybe there IS a place (though shrinking perhaps) for global variables? I'm considering rethinking my position based on evidence.
Start a wiki on your network using some wiki software.
Start a category on your site called "best practices" or "coding standards" or something.
Point everyone to it. Allow for feedback.
When you do releases of the software, have the person whose job it is to put code into the build push back on developers, pointing them to the Wiki pages on it.
I've done this in my organization and it took several months for people to really get into the hang of using the Wiki but now it's this indispensable resource.
If you have even a loose standard of coding, being able to point to that, or indicating that you can't follow the code because it's not the correct format may be worthwhile.
If you don't have a coding format, now would be a good time to get one in place. Something like the answers to this question may be helpful: https://stackoverflow.com/questions/4121/team-coding-styles
I always go with the line 'This is what I would do'. I don't try and lecture them and tell them their code is rubbish but just give an alternative viewpoint that can hopefully show them something that is obviously a bit neater.
Have the person(s) in question prepare a presentation to the rest of the group on the code for a representative module they have written, and let the Q&A take care of it (trust me, it will, and if it's a good group, it shouldn't even get ugly).
I do love code, and I never took a course in my life on anything related to computing. I started out very badly and learned from examples, but what I have always kept in mind since I read the "Gang Of Four" book is:
"Everyone can write code that is understood by a machine, but not all can write code that is understood by a human being"
with this in mind, there is a lot to be done in the code ;)
I can't emphasize patience enough. I've seen this exact sort of thing completely backfire mostly because someone wanted the changes to happen NOW. Quite a few environments need the benefits of evolution, not revolution. And by forcing change today, it can make for a very unhappy environment for all.
Buy-in is key. And your approach needs to take into account the environment you are in.
It sounds like you're in an environment that has a lot of "individuality" to it. So... I wouldn't suggest a set of coding standards. It will come across that you want to take this "fun" project and turn it into a highly structured work project (oh great, what's next... functional documents?). Instead, as someone else said, you'll have to deal with it to a certain extent.
Stay patient and work toward educating others in your direction. Start with the edges (points where your code interacts with others) and when interacting with their code try to take it as an opportunity to discuss the interface they've created and ask them if it would be okay with them if it was changed (by you or them). And fully explain why you want the change ("it will help deal with changing subsystem attributes better" or whatever). Don't nit-pick and try to change everything you see as being wrong. Once you interact with others on the edge, they should start to see how it would benefit them at the core of their code (and if you get enough momentum, go deeper and truly start to discuss modern techniques and the benefits of coding standards). If they still don't see it... maybe you'll need to deal with that within yourself (especially on a "fun" project).
Patience. Evolution, not revolution.
Good luck.
I don a toga and open a can of socratic method.
The Socratic Method, named after the Classical Greek philosopher Socrates, is a form of philosophical inquiry in which the questioner explores the implications of others' positions, to stimulate rational thinking and illuminate ideas. This dialectical method often involves an oppositional discussion in which the defense of one point of view is pitted against another; one participant may lead another to contradict himself in some way, strengthening the inquirer's own point.
A lot of the answers here relate to code formatting which these days is not particularly relevant, as most IDEs will reformat your code in the style you choose. What really matters is how the code works, and the poster is right to look at global variables, copy & paste code, and my pet peeve, naming conventions. There is such a thing as bad code and it has little to do with format.
The good part is that most of it is bad for a very good reason, and these reasons are generally quantifiable and explainable. So, in a non-confrontational way, explain the reasons. In many cases, you can even give the writer scenarios where the problems become obvious.
I'm not the lead developer on my project and therefore can't impose coding standards, but I have found that bad code usually causes an issue sooner rather than later, and when it does I'm there with a cleaner idea or solution.
By not interjecting at the time and taking a more natural approach, I've gained more trust with the lead, and he often turns to me for ideas and includes me on the architectural design and deployment strategy used for the project.
People writing bad code is just a symptom of ignorance (which is different from being dumb). Here are some tips for dealing with those people.
People's own experience leaves a stronger impression than anything you say.
Some people are not passionate about the code they produce and will not listen to anything you say.
Paired programming can help share ideas, but switch who's driving or they'll just be checking email on their phone.
Don't drown them with too much. I've found even Continuous Integration needed to be explained a few times to some older devs.
Get them excited again and they will want to learn. It could be something as simple as programming robots for a day.
TRUST YOUR TEAM. Coding standards, and tools that check them at build time, often go unread or are just annoying.
Remove code ownership. On some projects you will see code silos or ant hills where people say "that's my code and you can't change it". This is very bad, and you can use paired programming to remove it.
Instead of having them write code, have them maintain their code.
Until they have to maintain their steaming pile of spaghetti, they will never understand how bad they are at coding.
Nobody likes to listen to someone saying their work sucks, but any sane person would welcome mentoring and ways of avoiding unnecessary work.
One school of teaching even says that you should not point out mistakes, but focus on what is done right. For instance, instead of pointing out incomprehensible code as bad, you should point out where their code is particularly easy to read. In the first case you are priming others to think and act like crappy programmers. In the latter case you are priming them to think like skilled professionals.
I have a similar scenario with the guys I work with. They don't have as much exposure to coding as I do, but they are still useful at coding.
Rather than letting them do what they want and then going back and editing the whole thing, I usually just sit them down and show them two ways of doing things: their way and my way. From this we discuss the pros and cons of each method, and come to a better understanding and a better conclusion on how we should go about programming.
Here is the really surprising part. Sometimes they will come up with questions that even I don't have answers to, and after research we all get a better concept of methodology and structure.
Discuss.
Show them Why
Don't even think you are always right. Sometimes even they will teach you something new.
That's what I would do if I were you :D
Probably a bit late after the effect, but that's where an agreed coding standard is a good thing.
I frankly believe that someone's code is better when it's easier to change, debug, navigate, understand, configure, test and publish (whew).
That said, I think it is impossible to tell someone their code is bad without first having them explain what it does, or how anyone is supposed to enhance it afterwards (for example, by adding new functionality or debugging it).
Only then does it click, and anyone will be able to see that:
Changes to a global variable's value are almost always untrackable
Huge functions are hard to read and understand
Patterns make your code easier to enhance (as long as you obey their rules)
( etc...)
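If it helps the conversation, the point about global variables can be shown in a few lines. This is only a sketch in Python with invented names (`discount`, `apply_promo`, `price_explicit`); the idea is that the same call gives different answers depending on what some other function did to the global earlier, while the explicit version shows everything it depends on at the call site.

```python
# Hypothetical illustration; all names here are assumptions for the example.

discount = 0  # global, in percent

def apply_promo():
    global discount
    discount = 10  # hidden side effect: nothing at the call site shows this

def price_with_global(price):
    # The result silently depends on whoever last touched `discount`.
    return price * (100 - discount) // 100

def price_explicit(price, discount_percent):
    # Everything the result depends on is passed in.
    return price * (100 - discount_percent) // 100

before = price_with_global(200)
apply_promo()
after = price_with_global(200)
print(before, after)            # same call, two different results
print(price_explicit(200, 10))  # the dependency is visible in the call
```

Asking the author to walk through why `before` and `after` differ tends to make the case better than any rule in a standards document.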
Perhaps a session of pair programming should do the trick.
As for enforcing coding standards - it helps but they are too far away from really defining what is good code.
You probably want to focus on the impact of the bad code, rather than what might be dismissed as just your subjective opinion of whether it's good or bad style.
Privately inquire about some of the "bad" code segments with an eye toward the possibility that it is actually reasonable code, (no matter how predisposed you may be), or that there are perhaps extenuating circumstances. If you are still convinced that the code is just plain bad -- and that the source actually is this person -- just go away. One of several things may happen: 1) the person notices and takes some corrective action, 2) the person does nothing (is oblivious, or doesn't care as much as you do).
If #2 happens, or #1 does not result in sufficient improvement from your point of view, AND it is hurting the project, and/or impinging on you enough, then it may be time to start a campaign to establish/enforce standards within the team. That requires management buy-in, but is most effective when instigated from grass roots.
Good luck with that. I feel your pain, brother.

Resources