When is a new language the right tool for the job? - boo

For a long time I've been trying different languages to find the feature-set I want and I've not been able to find it. I have languages that fit decently for various projects of mine, but I've come up with an intersection of these languages that will allow me to do 99.9% of my projects in a single language. I want the following:
Built on top of .NET or has a .NET implementation
Has few dependencies on the .NET runtime both at compile-time and runtime (this is important since one of the major use cases is in embedded development where the .NET runtime is completely custom)
Has a compiler that is 100% .NET code with no unmanaged dependencies
Supports arbitrary expression nesting (see below)
Supports custom operator definitions
Supports type inference
Optimizes tail calls
Has explicit immutable/mutable definitions (nicety -- I've come to love this but can live without it)
Supports real macros for strong metaprogramming (absolute must-have)
The primary two languages I've been working with are Boo and Nemerle, but I've also played around with F#.
Main complaints against Nemerle: The compiler has horrid error reporting, the implementation is buggy as hell (compiler and libraries), the macros can only be applied inside a function or as attributes, and it's fairly heavy dependency-wise (although not enough that it's a dealbreaker).
Main complaints against Boo: No arbitrary expression nesting (dealbreaker), macros are difficult to write, no custom operator definition (potential dealbreaker).
Main complaints against F#: Ugly syntax, hard to understand metaprogramming, non-free license (epic dealbreaker).
So the more I think about it, the more I think about developing my own language.
Pros:
Get the exact syntax I want
Get a turnaround time that will be a good deal faster; difficult to quantify, but I wouldn't be surprised to see 1.5x developer productivity, especially due to the test infrastructures this can enable for certain projects
I can easily add custom functionality to the compiler to play nicely with my runtime
I get something that is designed and works exactly the way I want -- as much as this sounds like NIH, this will make my life easier
Cons:
Unless it can get popularity, I will be stuck with the burden of maintenance. I know I can at least get the Nemerle people over, since I think everyone wants something more professional, but it takes a village.
Due to the first con, I'm wary of using it in a professional setting. That said, I'm already using Nemerle and using my own custom modified compiler since they're not maintaining it well at all.
If it doesn't gain popularity, finding developers will be much more difficult, to an extent that Paul Graham might not even condone.
So based on all of this, what's the general consensus -- is this a good idea or a bad idea? And perhaps more helpfully, have I missed any big pros or cons?
Edit: Forgot to add the nesting example -- here's a case in Nemerle:
def foo =
if(bar == 5)
match(baz) { | "foo" => 1 | _ => 0 }
else bar;
Edit #2: Figured it wouldn't hurt to give an example of the type of code that will be converted to this language if it's to exist (S. Lott's answer alone may be enough to scare me away from doing it). The code makes heavy use of custom syntax (opcode, :=, quoteblock, etc), expression nesting, etc. You can check a good example out here: here.

Sadly, there's no metrics or stories around failed languages. Just successful languages. Clearly, the failures outnumber the successes.
What do I base this on? Two common experiences.
Once or twice a year, I have to endure a pitch for a product/language/tool/framework that will Absolutely Change Everything. My answer has been constant for the last 20 or so years. Show me someone who needs support and my company will support them. And that's that. Never hear from them again. Let's say I've heard 25 of these.
Once or twice each year, I have to work with a customer who has orphaned technology. At some point in the past, some clever programming built a tool/framework/library/package that was used internally for several projects. Then that programmer left. No one else can figure that darn thing out, and they want us to replace/rewrite it. Sadly, we can't figure it out either, and our proposal is to rewrite from scratch. And they complain that their genius built the set of apps in a period of weeks, it can't take us months to rewrite them in Java/Python/VB/C#. Let's say I've written 25 or so of these kinds of proposals.
That's just me, one consultant.
Indeed one particularly sad situation was a company who's entire IT software portfolio was written by one clever guy with a private language and tools. He hadn't left, but he'd realized that his language and toolset had fallen way behind the times -- the state of the art had moved on, and he hadn't.
And the move was -- of course -- in an unexpected direction. His language and tools were okay, but the world had started to adopt relational databases, and he had absolutely no way to upgrade his junk to move away from flat files. It was something he had not foreseen. Indeed, it was something he could not possibly foresee. [You won't fall into this trap, will you?]
So, we talked. He rewrote a lot of the applications in Plain-Old VAX Fortran (yes, this is a long time ago.) And he rewrote it to use plain old relational SQL stuff (Ingres, at the time.)
After a year of coding, they were having performance problems. They called me back to review all the great stuff they'd done in replacing the home-built language. Sadly, they'd done the worst possible relational database design. Worst possible. They'd taken their file copies, merges, sorts, and what-not, and implemented each low-level file system operation using SQL, duplicating database rows left, right and center.
He was so mired in his private vision of the perfect language, that he couldn't adapt to a relatively common, pervasive new technology.

I say go for it.
It would be an awesome experience regardless of weather it makes it to production or not.
If you make it compile down to IL then you do not have to worry about not being able to re-use your compiled assemblies with C#
If you believe that you have valid complaints about the languages you listed above, it is likely that many will think like you. Of course, for every 1000 interested person there might be 1 willing to help you maintain it - but that is always the risk
But here are a few things to be cautioned about:
Get your language specification IN STONE before development. Make sure any and all language features are figured out before hand - even things that you may only want in the future. In my opinion, C# is slowly falling into the "oh-just-one-more-language-extension" trap that will lead to its eventual doom.
Be sure to make it optimized. I dont know what you already know; but if you dont know then learn ;) Nobody will want a language that has nice syntax but runs as slow as IE's javascript implementation.
Good luck :D

When I first started my career in the early 90s, there seemed to be this craze of everyone developing their own in-house languages. My first 3 jobs were with companies that had done this. One company had even developed their own operating system!
From experience, I'd say this is a bad idea for the following reasons:
1) You will spend time debugging the language itself in addition to the code base on top of it
2) Any developers you hire will need to go through the learning curve of the language
3) It will be hard to attract and keep developers since working in a proprietary language is a dead-end for someone's career
The main reason I left those three jobs was because they had proprietary languages and you'll notice that not many companies take this route any more :).
An additional argument I'd make is that most languages have entire teams whose full time job it is to develop the language. Maybe you'd be an exception, but I'd be very surprised if you'd be able to match that level of development by only working on the language part-time.

Main complaints against Nemerle: The
compiler has horrid error reporting,
the implementation is buggy as hell
(compiler and libraries), the macros
can only be applied inside a function
or as attributes, and it's fairly
heavy dependency-wise (although not
enough that it's a dealbreaker).
I see your post has been written more than two years ago.
I advise you trying Nemerle language today.
The compiler is stable. There are no blocker bugs for today.
The VS integration has a lot of improvements , also there is SharpDevelop integration.
If you give it a chance, you won't be disappointed.

NEVER EVER develop your own language.
Developing your own language is a fool's trap, and worse it will limit you to what your imagination can provide, as well demanding that you work out both your development environment and the actual programme you're writing.
The cases in which this doesn't apply are pretty much if you're Larry Wall, the AWK guys, or part of a substantial group of people dedicated to testing the boundaries of programming. If you're in any of those categories, you don't need my advice, but I strongly doubt that you're targeting a niche where there is no suitable programming language for the task AND the characteristics of the people doing the task.

If you are as clever as you seem to be (a likely possibility), my advice is to go ahead and do the design of the language first, iterate a couple of times over it, ask some smart fellows you trust in smart programming language related communities about the concrete design you came up with and then take the decision.
You might realize in the process of creating the design that just a quick hack on Nemerle would give it all you need, for example. Many things can happen just when thinking hard about a problem, and the final solution might not be what you actually had in mind when beginning the project.
Worst case scenario, you're stuck with actually implementing the design, but by then you will have it proof read and mature, and you'll know with a high degree of certainty that it was a good path to take.
A related piece of advice, start small, just define the features you absolutely need and then build on them to get the rest.

Writing your own language is not a easy project.. Especially one to be used in any kind of "professional setting"
It is a huge amount of work, and I would doubt you could write your own language, and still write any big projects that use it - you will spend so long adding features that you need, fixing bugs, and general language-design stuff.
I would strongly recommend choosing a language that is closest to what you want, and extending it to do what you need. It'll never be exactly what you want, but compared to the time you'll spend writing your own language, I would say that's a small compromise..

Scala has a .NET compiler. I don't know the status of this though. It's kind of a second class citizen in the Scala world (which is more focused on the JVM). But it might be a good tradeof to adopt the .NET compiler instead of creating a new language from scratch.
Scala is kind of weak in the meta-programming department ATM. It's possible that the need for metaprogramming is somewhat reduced by other language features. In any case I don't think anyone would be sad if you were to implement metaprogramming features for it. Also there is a compiler plug-in infrastructure on the way.

I think most languages will never fit all of the bill.
You might want to combine your 2 favourite languages (in my case C# and Scheme) and use them together.
From a professional point of view, this probably not a good idea though.

It would be interesting to hear some of the things you feel you can't do in existing languages. What kind of projects are you working on that can't be done in C#?
I'm just curios!

Related

First time porting a library from one language to another

I am porting a library from C++ to Java. This is my first time and I am not sure
about what "porting" really means? Specifically, what if the author named a variable
as 'A' and I think that a better name would be 'B'. Same for methods, classes and namespaces.
Also, what if I think something can be done better? Does porting mean that I should try
to keep as much of the original code spirit as possible, but still allow myself freedom
to improve stuff?
Thanks
It doesn't necessarily have to be a one-to-one translation (and in many cases, it can't be done). Porting is just rewriting a piece of software in a different language/environment/etc. Sometimes porting will require you to tweak things and implement them in different ways altogether, so I think the last sentence of your post pretty much captures the gist of things.
I view it as comparable to translating a book from English to another language. There will be instances where judgment calls need to be made in terms of how to express the intent/function of the source material.
When porting from System A to System B, the world is your oyster. You can pretty much change anything if you believe it's an improvement. The only caveats to that are when dealing with interfaces. Say, you are porting an API, for example, it wouldn't be a good idea to name externally-available methods, as that would break something down the road. Tracing naming issues across multiple classes is a major pain.
As someone who's done a fair bit of porting from language to language, I would recommend sticking to implementation details first and foremost. A good engineering principle is to change one thing at any time. That way, when things don't run as expected, you'll know that it's your implementation that is to blame, and not some silly naming issue. And when you do come to renaming, I suppose it goes without saying, be very careful and backup often. This is one case where software versioning may save you hours of time.
When "porting" a library from one platform to another, you are porting functionality. You are not porting style of code. It isn't like in literature, where one must maintain the style of the piece, keeping in mind metaphors and iambic pentameter or what have you.

Besides "treat warnings as errors" and fixing memory leaks, what other ideas should we implement as part of our coding standards?

First let me say, I am not a coder but I help manage a coding team. No one on the team has more than about 5 years experience, and most of them have only worked for this company.. So we are flying a bit blind, hence the question.
We are trying to make our software more stable and are looking to implement some "best practices" and coding standards. Recently we started taking this very seriously as we determined that much of the instability in our product could be linked back to the fact that we allowed Warnings to go through without fixing when compiling. We also never bothered to take memory leaks seriously enough.
In reading through this site we are now quickly fixing this problem with our team but it begs the question, what other practices can we implement team wide that will help us?
Edit: We do fairly complex 2D/3D Graphics Software that is cross-platform Mac/Windows in C++.
Typically, the level of precision/exactingness in coding standards/process is directly connected to the safety level required. E.g., if you are working in aerospace, you will tightly control pretty much everything. But, on the other end of the spectrum, if you are working on a computer gaming forum site...if something breaks, no biggie. You can have slop. So YMMV, depending on your field.
The classic book on coding is Code Complete 2nd edition, by Steve McConnell. Have a team copy & strongly recommend your developers purchase it(or have the company get it for them). That will satisfy probably 70% of the stylistic questions. CC addresses the majority of development cases.
edit:
Graphics software, C++, Mac/Windows.
Since you're doing cross-platform work, I would recommend having an automated "compile-on-checkin" process for your Mac(10.4(maybe), 10.5, 10.6), and Windows(XP(maybe), Vista, 7). This ensures your software at the least compiles, and you know when it doesn't.
Your source control(which you are using, I assume), should support branching, and your branching strategy can reflect cross-platformy-ness as well. It's also advantageous to have mainline branches, dev branches, and experimental branches. YMMV; you will probably need to iterate on that and consult with with people who are familiar with configuration management.
Since it's C++, you will probably want to be running Valgrind or similar to know if there is a memory leak. There are some static analyzers which you can get: I don't know how effective they are at the modern C++ idiom. You can also invest in writing some wrappers to help watch memory allocations.
Regarding C++...The books Effective C++, More Effective C++, and Effective STL(all by Scott Meyers) should be on someone's shelf, as well as Modern C++ by Andrescu. You may find Lippman's book on the C++ object model useful as well, I don't know.
HTH.
There are a lot of consultants/companies who have coding rules to sell you, you should have no difficulty finding one. However, one that doesn't first ask you the field you are in (you didn't mention it in your question) is providing you with snake oil.
Test-Driven Development. TDD helps check for logic errors at the development phase.
Get everyone to read and discuss various standards and guidelines. I (as well as Stroustrup) suggest the Joint Strike Fighter coding standards. Ask your developers to classify the guidelines therein among
Already met
Could be met easily (few changes from current condition)
Should work toward in old code and follow in new development
Not worth it
Have the long technical discussions, and settle on a set for the team to adopt.
Code reviews have been shown to provide significant benefits to code quality, even more so than traditional testing. I would suggest getting in the habit of performing routine design and code reviews; the number of stages at which reviews are performed, the formality and detail of the reviews, and the percentage of work subject to review can all be set according to your business requirements. Coding standards can be useful when done right (and if everyone's code looks similar, it is also easier to review), but where you put your braces and how far you indent blocks isn't really going to affect defect rates.
Also, it's worth familiarizing yourself and your peers with the concept of technical debt and working bit by bit to redesign and improve parts of the system as you come in contact with them. However, unless you have comprehensive unit testing and/or processes in place to ensure high code quality, this may not help things.
Given that this is Stack Overflow, someone should reference The Joel Test. I like to automate as much as possible, so using Lint is also a must.
These basics are good for most any industry or team size:
Use Agile methodology (scrum is a good example).
http://www3.software.ibm.com/ibmdl/pub/software/rational/web/whitepapers/2003/rup_bestpractices.pdf
Use Test-driven development. http://www.agiledata.org/essays/tdd.html
Use consistent coding standards. Here is an example document:
http://www.dotnetspider.com/tutorials/BestPractices.aspx
Get your team familiar with good
design patterns.
http://www.dofactory.com/Patterns/Patterns.aspx
You can't go wrong with these basics. Build from there with new team members who have been there and done that. I'd strongly suggest pair programming once you've got those guys on the team. It is the best way to infect people with best practices.
Best of luck to you!
The first thing you need to consider when adding coding standards/best practices is the effect it will have on your team's morale and cohesiveness. Developers usually resent any practices that are imposed on them even if they are good ideas. The people issues have to be addressed for a big change to be successful.
You will need to involve your group in developing the standards and try to achieve consensus. That said, you will never get universal agreement on anything, so you will have to balance consensus and getting to standards. I've seen major fights over something as simple as tabs versus spaces in source.
The best book I've seen for C/C++ guidelines in complicated projects is Large Scale C++ Software Design. That book along with Code Complete (which is a must-read classic) are good starting points.
You don't mention any language, and while it is true that most of coding standards are language independent, it will also help you in your search. On most of the companies I had work they have different coding standards for different programming languages. So my advice will be:
Choose your language
Search the web since there are plenty of standards out there for your language
Gather all the standards you found
Divide your team into groups and give them a few of the documents to analyze. They should come with a list of things they think worthy to have in their new standards.
Have a meeting so each group present its findings to everybody (there will be a lot of redundancy between groups). That should be an open discussion and everybody's opinion should be accounted.
Compile a list of the standards that were selected by the majority of the coders and that should be your starting point.
Perform semi annual reviews of the standards, to add or remove things.
Now, The logic behind this is : Most of the problems from putting a coding standard from scratch is developer's acceptance. Each of us have a way of doing things and it sucks when somebody from the outside believes one way of doing things is better from another. So, if developers understand the logic and the purpose of the coding standards then you have half of the work done. The other thing is that standards should be design and created specifically for your company's needs. There will be some things that will made sense, and some that don't. With the above approach you could discriminate between those. The other thing is that standards should be able to change over time to reflect the company needs, so a coding standard should be a living document.
This blog post describes a lot of the common practices of mediocre programming. These are some of the potential issues you're team is having. It includes a quick explanation of the "best practice" for each one.
One thing you should have rules about is some kind of naming standard. It just makes life easier for people while not being really invasive.
Other than that, I'd have to say it depends on the level of your team. Some need more rules than others. The better people are, the less "support" they need from rules.
If you want a complete set of coding rules to control every little detail, you're going to spend lots of time arguing about rules and exceptions to rules and what you should write rules about. I'd go with something already written instead.
If you are concerned about quality then one thing you could do that really isn't about rules, is:
Automated building and testing. This has helped me a lot. Once you find a problem, it really helps to have an environment where you can write a test to verify the problem. Fix the problem and then easily add your test to an automatic test suite that makes sure that sort of problem can't come back without being spotted.
Then make sure these run often. Preferably every time someone checks something in.
If your framework requires certain rules to function well, put those in your coding standard.
If you decide to have coding standards, you want to be very careful about what you put in. If the document is too long or focuses on arbitrary stylistic details, it will just get ignored and nobody will bother to read it. Often a lot of what goes into coding standards is just the preferences of the person that wrote the document (or some standards that have been copied off the web!). If something is in the standard, it needs to be very clear to the reader how it improves quality and why it is important.
I would argue that a large proportion of what makes code readable is to do with design rather than the layout of the code. I have seen a lot of code that would adhere to the standards but still be difficult to read (really long methods, bad naming etc.) - you can't have everything it the standards, at some point it comes down to how skilled and disciplined your developers are - do what you can to increase their skills.
Perhaps rather than a coding standards document, try to get the team to learn about good design (easier said than done, I know). Make them aware of things like the SOLID principles, how to separate concerns, how to handle exceptions properly. If they design well, the code will be easy to read and it won't matter if there are enough white lines or the curly braces are in the right place.
Get some books about design principles (see a couple of recommendations below). Maybe get the the team to do some workshops to discuss some of the topics. Perhaps get them to collectively write a document on what principles might be important for their project. Whatever you do, make sure it is the team as a whole who decides what the standards / principles are.
http://www.amazon.co.uk/Principles-Patterns-Practices-Robert-Martin/dp/0131857258/
http://www.amazon.co.uk/Clean-Code-Handbook-Software-Craftsmanship/dp/0132350882
Don't write your own standards from scratch.
Chances are there are several out there that define what you want already, and are more complete than you could come up with on your own. That said, don't worry too much if you don't agree 100% with it on minor matters, you can swap in some parts of others, or call some infraction of it an warning rather than an error - depending on your own needs. (for example, some standards would throw a warning if the length of a line is more than 80 characters long, I prefer no more than 120 as a hard limit, but would make sure there was a good reason - readability & clarity for example - if there was > 80).
Also, do try to find automated methods of checking your code against the standard - including your own minor changes as required.
Besides books already recommended, I would also mention,
C++ Coding Standards: 101 Rules, Guidelines, and Best Practices by Herb Sutter and Andrei Alexandrescu (Paperback - Nov 4, 2004)
If you're programming on VB.NET, make sure Option Explicit and Option Strict are set to ON. This will save you a lot of grief tracking down mysterious bugs. These can be set at project level so that you never have to remember to to set them in your code files
I really like:
MISRA C standard (it's a little strict tho' but the ideas hold for C++)
and Hi-Integrity's http://www.codingstandard.com/HICPPCM/index.html C++ standard which borrows heavily from MISRA
LDRA (a static analysis tool) uses these standards to grade your work (this I don't use as it's expensive) but I can vouch for running cppcheck as a good 'free/libre' static analysis checker.

(When) Should I learn compilers?

According to this http://steve-yegge.blogspot.com/2007/06/rich-programmer-food.html article, I defnitely should.
Quote Gentle, yet insistent executive
summary: If you don't know how
compilers work, then you don't know
how computers work. If you're not 100%
sure whether you know how compilers
work, then you don't know how they
work.
I thought that it was a very interesting article, and the field of application is very useful (do yourself a favour and read it)
But then again, I have seen successful senior sw engineers that didn’t know compilers very well, or internal machine architecture for that matter,
but did know a thing or two of each item in the following list :
A programming paradigm (OO, functional,…)
A programming language API (C#, Java..) and at least 2 very different some say! (Java / Haskell)
A programming framework (Java, .NET)
An IDE to make you more productive (Eclipse, VisualStudio, Emacs,….)
Programming best practices (see fxcop rules for example)
Programming Principles (DRY, High Cohesion, Low Coupling, ….)
Programming methodologies (TDD, MDE)
Design patterns (Structural, Behavioural,….)
Architectural Basics (Tiers, Layers, Process Models (Waterfall, Agile,…)
A Testing Tool (Unit Testing, Model Testing, …)
A GUI technique (WPF, Swing)
A documenting tool (Javadoc, Sandcastle..)
A modelling languague (and tool maybe) (UML, VisualParadigm, Rational)
(undoubtedly forgetting very important stuff here)
Not all of these tools are necessary to be a good programmer (like a GUI when you just don’t need it)
but most of them are. Where do compilers come in, and are they really that important, since, as I mentioned,
lots of programmers seems to be doing fine without knowing them and especially, becoming a good programmer is seen the multitude of knowledge domains almost a lifetime achievement :-) , so even if compilers are extremely important, isn't there always stuff still more important?
Or should i order 'The Unleashed Compilers Unlimited Bible (in 24H..))) today?
For those who have read the article, and want to start studying right away :
Learning Resources on Parsers, Interpreters, and Compilers
If you just want to be a run-of-the-mill coder, and write stuff... you don't need to take compilers.
If you want to learn computer science and appreciate and really become a computer scientist, you MUST take compilers.
Compilers is a microcosm of computer science! It contains every single problem, including (but not limited to) AI (greedy algorithms & heuristic search), algorithms, theory (formal languages, automata), systems, architecture, etc.
You get to see a lot of computer science come together in an amazing way. Not only will you understand more about why programming languages work the way that they do, but you will become a better coder for having that understanding. You will learn to understand the low level, which helps at the high level.
As programmers, we very often like to talk about things being a "black box"... but things are a lot smoother when you understand a little bit about what's in the box. Even if you don't build a whole compiler, you will surely learn a lot. You will get to see the formalisms behind parsing (and realize it's not just a bunch of special cases hacked together), and a bunch of NP complete problems. You will see why the theory of computer science is so important to understand for practical things. (After all, compilers are extremely practical... and we wouldn't have the compilers we have today without formalisms).
I really hope you consider learning about them... it will help you get to the next level as a computer scientist :-).
You should learn about compilers, for the simple reason that implementing a compiler makes you a better programmer. The compiler will surely suck, but you will have learned a lot during the way. It is a great way of improving (or practising) your programming skill.
You do not need to understand compilers to be a good programmer, but it can help. One of the things I realized when learning about them, is that compiling is simply a translation.
If you have ever translated from one language to another, you have just done compiling.
So when should you learn about compilers?
When you want to, or need it to solve a problem.
Compiler theory is useful, but not essential.
Although there are some techniques which come in handy, like lexical analysis and parsing.
Another one is error handling. Compilers need a lot of these. User input can contain anything, even the unexpected. And you need to deal with all of these.
If you're going to be working at a high-enough level where you're worrying over UML and self-describing code, you could easily go your entire career without wanting or needing intimate details of how the compiler works.
But, if you're an in-the-trenches coder and have no aspirations to manage your friends, it's likely that one day, you'll realize you're waging war with your compiler. It could be a random bug that comes along or a hallway conversation about while-verses-for loops. You'll realize the assembly (or IL, likely, in the coming years) is just a bit to the left of what you were needing and another universe will unfold.
So, I suppose my answer is, just be aware of the compiler for now, that it's doing quite a lot, but don't worry over it too much.
The compilers courses usually focus on how the high-level code is analyzed and translated into machine code. That's very interesting, but not crucial. It's more important to understand what is this machine code that is generated by the compiler so that you understand how a computer works and what is the cost of each language construct.
So I'd rather say that you should know an assembly language (I mean a limited subset of assembly language for one architecture) to understand how a computer works and the latter is definitely required for a competent programmer so that he understands what segmenation fault is, when to optimize and when not and other similar low-level things.
If you intend to write extremely time-critical real-time code, you will benefit from understanding how the compiler optimises your code. However, you will actually benefit more from understanding the underlying architecture of your hardware.
From my experience, if you understand how the hardware works, and how the compiler interprets your code, you will be able to write code that does exactly what you intend it to do. I have been caught on several occasions, writing code that got optimised away by the compiler and made the hardware do something that I did not intend.
All in all, understanding the entire software-hardware stack is not essential to write good algorithms and code, but it will most certainly help!
From a practical perspective, general compiler theory is less of concern than a assembler, linker and loader to a specific platform. For example, I just consider the GCC compiler as a translator from my high-level C language to the low-level assembly language on a x86 platform. And more often than not, I manually refine ;) the code generated by the compiler.
From a scientific perspective, I would strongly suggest you learning the compiler theory, it will help you understand the great idea that computer is built upon. And even more, you will have a different eye upon the world.
Just my opinion, but I believe compilers is not given enough attention in CS courses, not in mine, and not in any others afaik. I think any CS major should do 2 things after a sabbatical or finishing their major: Re-learn if necessary finite automata and maybe a formal methods language. Apply it.
Write a simple compiler with this knowledge. Alex Aiken has a very useful online tutorial on writing a compiler for the COOL (Classroom Object Oriented Language) which is a subset of Scala as of 2013 ver. At least at time of writing.

What are some good strategies to fix bugs as code becomes more complex?

I'm "just" a hobbyist programmer, but I find that as my programs get longer and longer the bugs get more annoying--and harder to track. Just when everything seems to be running smoothly, some new problem will appear, seemingly spontaneously. It may take me a long time to figure out what caused the problem. Other times I'll add a line of code, and it'll break something in another unit. This can get kind of frustrating if I thought everything was working well.
Is this common to everyone, or is it more of a newbie kind of thing? I hear about "unit testing," "design frameworks," and various other concepts that sound like they would decrease bugginess, make my apps "robust," and everything easy to understand at a glance :)
So, how big a deal are bugs to people with professional training?
Thanks -- Al C.
The problem of "make a fix, cause a problem elsewhere" is very well known, and is indeed one of the primary motivations behind unit testing.
The idea is that if you write exhaustive tests for each small part of your system independently, and run them on the entire system every time you make a change anywhere, you will see the problem immediately. The main benefit, however, is that in the process of building these tests you'll also be improving your code to have less dependencies.
The typical solution to these sort of problems is to reduce coupling; make different parts less dependent on one another. More experienced developers sometimes have habits or design skills to build systems in this manner. For example, we use interfaces and implementations rather than classes; we use model-view-controller for user interfaces, etc. In addition, we can use tools that help further reduce dependencies, like "Dependency injection" and aspect oriented programming.
All programmers make mistakes. Good and experienced programmers build their programs so that it is easier to find the mistakes and restrict their effects.
And it is a big deal for everyone. Most companies spend more time on maintenance than on writing new code.
Are you automating your tests? If you do not, you're signing up creating bugs without finding them.
Are you adding tests for bugs as you fix them? If you do not, you are signing up for creating the same bugs over and over.
Are you writing unit tests? If not, you are signing up for long debugging sessions when a test fails.
Are you writing your unit tests first? If not, your unit tests will be hard to write when your units are tightly coupled.
Are you refactoring mercilessly? If not, every edit will become more difficult and more likely to introduce bugs. (But make sure you have good tests, first.)
When you fix a bug, are you fixing the entire class? Don't just fix the bug; don't just fix similar bugs throughout your code; change the game so you can never create that kind of bug again.
Bugs are a big deal to everyone. I've always found that the more I program, the more I learn about programming in general. I cringe at the code I wrote a few years back!! I started out as a hobbyist and liked it so much that I went to engineering college to get a Computer Science Engineering major (I am in my final semester). These are the things that I have learned :
I take time to actually design what I am going to write and document the design. It really eliminates a lot of problems down the line. Whether the design is as simple as writing down a few points on what I am going to write or full blown UML modeling (:( ) doesn't matter. Its the clarity of thought and purpose and having material to look back at when I come back to the code after a while that matter the most.
No matter what language I write in, keeping my code simple and readable is important. I think that it is extremely important not to over complicate the code and at the same time not to over simplify it. (Hard learned lesson!!)
Efficiency optimizations and fancy tricks should be applied at the end, only when necessary and only if they are needed. Another thing is that I apply them only If I really know what I am doing and I always test my code!
Learning language dependant details helps me keep my code bug free. For instance I learned that scanf() is evil in C!
Others have already commented on the zen of writing tests. I would like to add that you should always do regression tests. (i.e. Write new code, test all parts of your code to see if it breaks)
Keeping a mental picture of code is hard at times, so I always document my code.
I use methods to make sure that there is a bare minimum dependence between different parts of my code. Interfaces, class hierarchies etc. (Decoupled design)
Thinking before I code and being disciplined in whatever I write is another crucial skill. I know people who don't format their code so its readable (Shudder!).
Reading other peoples source to learn best practices is good. Making my own list is better!. When working in a team, there must be a common set of them.
Don't be paralyzed by analysis. Write tests, then code, then execute and test. Rinse wash repeat!
Learning to read over my own code and combing it for mistakes is important. Improving my arsenal of debugging skills was a great investment. I keep them sharp by helping my classmates fix bugs regularly.
When there is a bug in my code, I assume its my mistake, not the computers and work from there. That is a state of mind that really helps me.
A fresh pair of eyes aids in debugging. Programmers tend to miss even the most obvious errors in their own code when exhausted. Having someone to show your code to is great.
having someone to throw ideas at and not be judged is important. I talk to my mom (who is not a programmer) , throw ideas at her and find solutions. She helps me bounce my ideas back and forth and refine them. If she is unavailable, I talk to my pet cat.
I am not so be discouraged by bugs anymore. I've learned to love removing bugs almost as much as programming.
Using version control has really helped me manage different ideas I get while coding. That helps reduce errors. I recommend using git or any other version control system you might like.
As Jay Bazzuzi said - Refactor code. I just added this point after reading his answer, to keep my list complete. All credit goes to him.
Try to write reusable code. Reuse code, both yours and from libraries. Using libraries which are bug free to do some common tasks really reduces bugs (sometimes).
I think the following quote says it best - "If debugging is the art of removing bugs, programming must be the art of putting them in."
No offense to anyone who disagrees. I hope this answer helps.
Note
As others Peter has pointed out, use Object Oriented Programming if you are writing a large amount of code. There is a limit to code length after which it becomes harder and harder to manage if written procedurally. I like procedural for smaller stuff, like playing with algorithms.
There are two ways to write error-free programs; only the third one works. ~Alan J. Perlis
The only way for errors to occur in a program is by being put there by the author. No other mechanisms are known. Programs can't acquire bugs by sitting around with other buggy programs. ~Harlan Mills
Obviously, bugs are a big deal to any programmer. Just look through the list of questions on Stack Overflow to see this illustrated.
The difference between a hobbyist and an experienced professional is that the pro will be able to use his experience to code in a more "defensive" way, avoiding many types of bugs in the first place.
All the other answers are great. I'll add two things.
Source control is mandatory. I'm assuming you're on windows here. VisualSVN Server is free and maybe 4 clicks to install. TortoiseSVN is also free and it integrates into Windows Explorer, getting around the VS Express limitations of no add-ins. If you create too many bugs, you can revert your code and start over. Without source control, this is next to impossible. Plus you can sync your code if you have a laptop and a desktop.
People are going to recommend many techniques like unit testing, mocking, Inversion of Control, Test Driven Development, etc. These are great practices, but don't try to cram it all into your head too quickly. You have to write code to get better at writing code, so work these techniques slowly into your code writing. You have to crawl before you walk and walk before you can run.
Best of luck in your coding adventures!
This is a common newbie thing. As you get more experience, of course, you'll still have bugs, but they'll be easier to find and fix because you'll learn how to make your code more modular (so that changing one thing doesn't have ripple effects everywhere else), how to test it, and how to structure it to fail fast, close to the source of the problem, rather than in some arbitrary place. One very basic but useful thing that doesn't require complex infrastructure to implement is to check the inputs to all functions that have non-trivial precondtions with asserts. This has saved me several times in cases where I would have otherwise gotten weird segfaults and arbitrary behavior that would have been near impossible to debug.
If bugs weren't a problem then I'd be able to write a 100,000 line program in 10 minutes!
Your question is like, "As an amateur doctor, I worry about my patients' health: sometimes when I'm not careful enough, they sicken. Is patients' health a problem for you professional doctors too?"
Yes: it's the central problem, even the only problem (for any sufficiently all-inclusive definition of 'bug').
Bugs are common to everyone -- professional or not.
The larger and more distributed the project, the more careful one must be. One look at any open source bug database (ex: https://bugzilla.mozilla.org/ ) will confirm this for you.
The software industry has evolved various programming styles and standards, which when used right, make wrong code easier to spot or limited in its impact.
Therefore, training has a very positive on code quality... But at the end of the day, bugs still sneak through.
If you're just a hobbyist programmer, learning full bore TDD and OOP may involve more time than you're willing to put in. So, going on the assumption that you don't want to put in the time on them, a few easily digestible suggestions to cut down on bugs are:
Keep each function doing one thing. Be suspect of a function more than, say, 10 lines long. If you think you can break it into two functions, you probably should. Something that will help you control this is naming your functions according to exactly what they are doing. If you find that your names are long and unwieldy then you function is probably doing too many things.
Turn magic strings into constants. That is, instead of using:
people["mom"]
use instead
var mom = "mom";
people[mom]
Design your functions to either do something (command) or get something (query), but not both.
An extremely short and digestible take on OOP is here http://www.holub.com/publications/notes_and_slides/Everything.You.Know.is.Wrong.pdf. If you get this, you've got the gist of OOP and are quite frankly ahead of a lot of professional programmers.
The prevailing wisdom seems to be that the average programmer creates 12 bugs per 1000 lines of code - depends on who you ask for the exact number, but it's always per lines of code - so, the bigger the program, the more the bugs.
Subpar programmers tend to create way more bugs.
Newbies are often trapped by idiosyncrasies of the language, and lacking experience tends towards more bugs too. As you go on, you will get better, but never will you create bug-free code... well I still have bugs, even after 30 years, but that could be just me.
Nasty bugs happen to everyone from pros to hobbyists. Really good programmers get asked to track down really nasty bugs. It's part of the job. You'll know you've made it as a software developer when you stare at a nasty bug for two days and in frustration you shout, "Who wrote this crap!?!?" ... only to realize it was you. :-)
Part of the skill of a software developer is the ability to keep a large set of interrelated items straight in his/her head. It sounds like you're discovering what happens when your mental model of the system breaks down. With practice you will learn to design software that doesn't feel so brittle. There are tons of books, blogs, etc. out there on the subject of software design. And Stack Overflow of course for specific questions.
All that said, here's a couple of things you can do:
A good debugger is invaluable. Often you have to step through your code line by line to figure out what went wrong.
Use a garbage-collected language such as Python or Java if it makes sense for your project. GC will help you focus on making things work instead of getting bogged down by maddening memory errors.
If you write C++, learn to love RAII.
Write LOTS of code. Software is somewhat of an art form. Lots of practice will make you better at it.
Welcome to Stack Overflow!
What really changed my odds against code complexity and bugs was using a coding standart - how to place brackets an so on. It may seem like just boring and useless thing but it really unifies all the code and makes it much easier to read and maintain. So do you use a coding standart?
If you're not well organized, your codebase will become your very own Zebra Puzzle. Adding more code is like adding more people/animals/houses to your puzzle, and soon you have 150 various animals, people, houses and cigarette brands in your puzzle and you realize that it just took you a week to add 3 lines of code because everything is so inter-related that it takes forever to make sure the code still executes how you want it to.
The most popular organizational paradigm seems to be Object Oriented Programming, if you can break your logic down into small units which can be constructed and used independently of each other, then you will find bugs far less painful when they occur.

Do you think a software company should impose developers a coding-style? [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 11 years ago.
If you think it shouldn't, explain why.
If yes, how deep should the guidelines be in your opinion? For example, indentation of code should be included?
I think a team (rather than a company) need to agree on a set of guidelines for reasonably consistent style. It makes it more straightforward for maintenance.
How deep? As shallow as you can agree on. The shorter and clearer it is the more likely it is that all the team members can agree to it and will abide by it.
You want everybody reading and writing code in a standard way. There are two ways you can achieve this:
Clone a single developer several times and make sure they all go through the same training. Hopefully they should all be able to write the same codebase.
Give your existing developers explicit instruction on what you require. Tabs or spaces for indentation. Where braces sit. How to comment. Version-control commit guidelines.
The more you leave undefined, the higher the probability one of the developers will clash on style.
The company should impose that some style should be followed. What style that is and how deep the guidelines are should be decided collectively by the developer community in the company.
I'd definitely lay down guidelines on braces, indentation, naming etc...
You write code for readability and maintainability. Always assume someone else is going to read your code.
There are tools that will auto magically format your code , and you can mandate that everyone uses the tool.
If you are on .Net look at stylecop, fxcop and Resharper
Do you think a software company should impose developers a coding-style?
Not in a top-down manner. Developers in a software company should agree on a common coding style.
If yes, how deep should the guidelines be in your opinion?
They should only describe the differences from well-known conventions, trying to keep the deviation minimal. This is easy for languages like Python or Java, somewhat blurry for C/C++, and almost impossible for Perl and Ruby.
For example, indentation of code should be included?
Yes, it makes code much more readable. Keep indentation consistent in terms of spaces vs tabs and (if you opt for spaces) number of space characters. Also, agree on a margin (e.g. 76 chars or 120 chars) for long lines.
Yes, but within reason.
All modern IDEs offer one-keystroke code pretty-print, so the "indentation" point is quite irrelevant, in my opinion.
What is more important is to establish best practices: for example, use as little "out" or "ref" parameters as possible... In this example, you have 2 advantages: improves readability and also fixes a lot of mistakes (a lot of out parameters is a code smell and should probably be refactored).
Going beyond that is, in my honest opinion, a bit "anal" and unnecessarily annoying for the devs.
Good point by Hamish Smith:
Style is quite different from best
practices. It's a shame that 'coding
standards' tend to roll the two
together. If people could keep the
style part to a minimum and
concentrate on best practices that
would probably add more value.
I don't believe a dev team should have style guidelines they must follow as a general rule. There are exceptions, for example the use of <> vs. "" in #include statements, but these exceptions should come from necessity.
The most common reason I hear people use to explain why style guidelines are necessary is that code written in a common style is easier to maintain that code written in individual styles. I disagree. A professional programmer isn't going to be bogged down when they see this:
for( int n = 0; n < 42; ++42 ) {
// blah
}
...when they are used to seeing this:
for(int n = 0; n < 42; ++42 )
{
// blah
}
Moreover, I have found it's actually easier to maintain code in some cases if you can identify the programmer who wrote the original code by simply recognizing their style. Go ask them why they implemented the gizmo in such a convoluted way in 10 minutes instead of spending the better part of a day figuring out the very technical reason why they did something unexpected. True, the programmer should have commented the code to explain their reasoning, but in the real world programmers often don't.
Finally, if it takes Joe 10 minutes backspacing & moving his curly braces so that Bill can spend 3 fewer seconds looking at the code, did it really save any time to make Bill do something that doesn't come natural to him?
I believe having a consistent codebase is important. It increases the maintainability of ur code. If everyone expects the same kind of code, they can easily read and understand it.
Besides it is not much of a hassle given today's IDEs and their autoformatting capabilities.
P.S:
I have this annoying habit of putting my braces on the next line :). No one else seems to like it
I think that programmers should be able to adapt to the style of other programmers. If a new programmer is unable to adapt, that usually means that the new programmer is too stubborn to use the style of the company. It would be nice if we could all do our own thing; however, if we all code along some bast guideline, it makes debugging and maintenance easier. This is only true if the standard is well thought out and not too restrictive.
While I don't agree with everything, this book contains an excellent starting point for standards
The best solution would be for IDEs to regard such formatting as meta data. For example, the opening curly brace position (current line or next line), indentation and white space around operators should be configurable without changing the source file.
In my opinion I think it's highly necessary with standards and style guides. Because when your code-base grows you will want to have it consistent.
As a side note, that is why I love Python; because it already imposes quite a lot of rules on how to structure your applications and such. Compare that with Perl, Ruby or whatever where you have an extreme freedom(which isn't that good in this case).
There are plenty of good reasons for the standards to define the way the applications are developed and the way the code should look like. For example when everyone use the same standard an automatic style-checker could be used as a part of the project CI.
Using the same standards improve code readability and helps to reduce the tension between team members about re-factoring the same code in different ways.
Therefore:
All the code developed by the particular team should follow precisely the same standard.
All the code developed for a particular project should follow precisely the same standard.
It is desirable that teams belonging to the same company use the same standard.
In an outsourcing company an exception could be made for a team working for a customer if the customer wants to enforce a standard of their own. In this case the team adopts the customer's standard which could be incompatible with the one used by their company.
Like others have mentioned, I think it needs to be by engineering or by the team--the company (i.e. business units) should not be involved in that sort of decision.
But one other thing I'd add is any rules that are implemented should be enforced by tools and not by people. Worst case scenario, IMO, is some over-zealous grammar snob (yes, we exist; I know because we can smell our own) writes some documentation outlining a set of coding guidelines which absolutely nobody actually reads or follows. They become obsolete over time, and as new people are added to the team and old people leave, they simply become stale.
Then, some conflict arises, and someone is put in the uncomfortable position of having to confront someone else about coding style--this sort of confrontation should be done by tools and not by people. In short, this method of enforcement is the least desirable, in my opinion, because it is far too easy to ignore and simply begs programmers to argue about stupid things.
A better option (again, IMO) is to have warnings thrown at compile time (or something similar), so long as your build environment supports this. It's not hard to configure this in VS.NET, but I'm unaware of other development environments that have similar features.
Style guidelines are extremely important, whether they're for design or development, because they speed the communication and performance of people who work collaboratively (or even alone, sequentially, as when picking up the pieces of an old project). Not having a system of convention within a company is just asking people to be as unproductive as they can. Most projects require collaboration, and even those that don't can be vulnerable to our natural desire to exercise our programming chops and keep current. Our desire to learn gets in the way of our consistency - which is a good thing in and of itself, but can drive a new employee crazy trying to learn the systems they're jumping in on.
Like any other system that's meant for good and not evil, the real power of the guide lies in the hands of its people. The developers themselves will determine what the essential and useful parts are and then, hopefully, use them.
Like the law. Or the English language.
Style guides should be as deep as they want to be - if it comes up in the brainstorm session, it should be included. It's odd how you worded the question because at the end of the day there is no way to "impose" a style guide because it's only a GUIDE.
RTFM, then glean the good stuff and get on with it.
Yes, I think companies should. Developer may need to get used to the coding-style but in my opinion a good programmer should be able to work with any coding style. As Midhat said: It is important to have a consistent codebase.
I think this is also important for opensource projects, there is no supervisor to tell you how to write your code but many languages have specifications on how naming and organisation of your code should be. This helps a lot when integrating opensource components into your project.
Sure, guidelines are good, and unless it's badly-used Hungarian notation (ha!), it'll probably improve consistency and make reading other people's code easier. The guidelines should just be guidelines though, not strict rules enforced on programmers. You could tell me where to put my braces or not to use names like temp, but what you can't do is force me to have spaces around index values in array brackets (they tried once...)
Yes.
Coding standards are a common way of ensuring that code within a certain organization will follow the Principle of Least Surprise: consistency in standards starting from variable naming to indentation to curly brace use.
Coders having their own styles and their own standards will only produce a code-base that is inconsistent, confusing, and frustrating to read, especially on larger projects.
These are the coding standards for a company I used to work for. They're well defined, and, while it took me a while to get used to them, meant that the code was readable by all of us, and uniform all the way through.
I do think coding standards are important within a company, if none are set, there are going to be clashes between developers, and issues with readability.
Having the code uniform all the way through presents a better code to the end user (so it looks as if it's written by one person - which, from an End Users point of view, it should - that person being "the company" and it also helps with readability within the team...
A common coding style promotes consistency and makes it easy for different people to easily understand, maintain and expand the whole code base, not only their own pieces. It also makes it easier for new people to learn the code faster. Thus, any team should have a guidelines on how the code is expected to be written.
Important guidelines include (in no particular order):
whitespace and indentation
standard comments - file, class or method headers
naming convention - classes, interfaces, variables, namespaces, files
code annotations
project organization - folder structures, binaries
standard libraries - what templates, generics, containers and so on to use
error handling - exceptions, hresults, error codes
threading and synchronization
Also, be wary of programmers that can't or won't adapt to the style of the team, no matter how bright they might be. If they don't play by one of the team rules, they probably won't play by other team rules as well.
I would agree that consistency is key. You can't rely on IDE pretty-printing to save the day, because some of your developers may not like using an IDE, and because when you're trawling through a code base of thousands of source files, it's simply not feasible to pretty print all the files when you start working on them, and perform a roll-back afterwards so your VCS doesn't try to commit back all the changes (clogging the repository with needless updates that burden everyone).
I would suggest standardizing at least the following (in decreasing order of importance):
Whitespace (it's easiest if you choose a style that conforms to the automatic pretty-printing of some shared tool)
Naming (files and folders, classes, functions, variables, ...)
Commenting (using a format that allows automatic documentation generation)
My opinion:
Some basic rules are good as it helps everyone to read and maintain the code
Too many rules are bad as it stops developers innovating with clearer ways of laying out code
Individual style can be useful to determine the history of a code file. Diff/blame tools can be used but the hint is still useful
Modern IDEs let you define a formatting template. If there is a corporate standard, then develop a configuration file that defines all the formatting values you care about and make sure everyone runs the formatter before they check in their code. If you want to be even more rigorous about it you could add a commit hook for your version control system to indent the code before it is accepted.
Yes in terms of using a common naming standard as well as a common layout of classes and code behind files. Everything else is open.
Every company should. Consistent coding style ensures higher readibility and maintainability of the codebase across whole your team.
The shop I work at does not have a unified coding standard, and I can say we (as a team) vastly suffer from that. When there is no will from the individuals (like in case of some of my colleagues), the team leader has to bang his fist on the table and impose some form of standardised coding guidelines.
Ever language has general standards that are used by the community. You should follow those as well as possible so that your code can be maintained by other people used to the language, but there's no need to be dictatorial about it.
The creation of an official standard is wrong because a company coding standard is usually too rigid, and unable to flow with the general community using the language.
If you're having a problem with a team member really be out there in coding style, that's an excellent thing for the group to gently suggest is not a good idea at a code review.
Coding standards: YES. For reasons already covered in this thread.
Styling standards: NO. Your readable, is my bewildering junk, and vice versa. Good commenting and code factoring have a far greater benefit. Also gnu indent.
I like Ilya's answer because it incorporates the importance of readability, and the use of continuous integration as the enforcement mechanism. Hibri mentioned FxCop, and I think its use in the build process as one of the criteria for determining whether a build passes or fails would be more flexible and effective than merely documenting a standard.
I entirely agree that coding standards should be applied, and that it should almost always be at the team level. However there are a couple of exceptions.
If the team is writing code that is to be used by other teams (and here I mean that other teams will have to look at the source, not just use it as a library) then there are benefits to making common standards across all the teams using it. Similarly if the policy of the company is to frequently move programmers from one team to another, or is in a position where one team frequently wants to reuse code from another team then it is probably best to impose standards across the company.
There are two types of conventions.
Type A conventions: "please do these, it is better"
and Type B: "please drive on the right hand side of the road", while it is okay to drive on the other side, as long as everyone does the same way.
There's no such thing as a separate team. All code in a good firm is connected somehow, and style should be consistent. It's easier to get yourself used to one new style than to twenty different styles.
Also, a new developer should be able to respect the practices of existing codebase and to follow them.

Resources