(When) Should I learn compilers?

According to this article, I definitely should: http://steve-yegge.blogspot.com/2007/06/rich-programmer-food.html
"Gentle, yet insistent executive summary: If you don't know how compilers work, then you don't know how computers work. If you're not 100% sure whether you know how compilers work, then you don't know how they work."
I thought it was a very interesting article, and the field of application is very useful (do yourself a favour and read it).
But then again, I have seen successful senior software engineers who didn't know compilers very well, or internal machine architecture for that matter,
but did know a thing or two about each item in the following list:
A programming paradigm (OO, functional,…)
A programming language and its API (C#, Java, ...), and some say at least 2 very different ones! (Java / Haskell)
A programming framework (Java, .NET)
An IDE to make you more productive (Eclipse, VisualStudio, Emacs,….)
Programming best practices (see fxcop rules for example)
Programming Principles (DRY, High Cohesion, Low Coupling, ….)
Programming methodologies (TDD, MDE)
Design patterns (Structural, Behavioural,….)
Architectural basics (tiers, layers) and process models (Waterfall, Agile, ...)
A Testing Tool (Unit Testing, Model Testing, …)
A GUI technique (WPF, Swing)
A documenting tool (Javadoc, Sandcastle..)
A modelling language (and maybe a tool) (UML, VisualParadigm, Rational)
(undoubtedly forgetting very important stuff here)
Not all of these tools are necessary to be a good programmer (like a GUI technique when you just don't need one), but most of them are. Where do compilers come in, and are they really that important? As I mentioned, lots of programmers seem to be doing fine without knowing them, and given the multitude of knowledge domains, becoming a good programmer already looks like almost a lifetime achievement :-) , so even if compilers are extremely important, isn't there always something still more important?
Or should I order 'The Unleashed Compilers Unlimited Bible (in 24H...)' today?
For those who have read the article and want to start studying right away:
Learning Resources on Parsers, Interpreters, and Compilers

If you just want to be a run-of-the-mill coder, and write stuff... you don't need to take compilers.
If you want to learn computer science, really appreciate it, and become a computer scientist, you MUST take compilers.
Compilers are a microcosm of computer science! The field contains every single problem, including (but not limited to) AI (greedy algorithms & heuristic search), algorithms, theory (formal languages, automata), systems, architecture, etc.
You get to see a lot of computer science come together in an amazing way. Not only will you understand more about why programming languages work the way that they do, but you will become a better coder for having that understanding. You will learn to understand the low level, which helps at the high level.
As programmers, we very often like to talk about things being a "black box"... but things are a lot smoother when you understand a little bit about what's in the box. Even if you don't build a whole compiler, you will surely learn a lot. You will get to see the formalisms behind parsing (and realize it's not just a bunch of special cases hacked together), and a bunch of NP-complete problems. You will see why the theory of computer science is so important to understand for practical things. (After all, compilers are extremely practical... and we wouldn't have the compilers we have today without formalisms.)
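To make the "formalisms behind parsing" point concrete, here is a minimal recursive-descent sketch in Python (the grammar and every name in it are my own toy example, not something from the article). Each grammar rule becomes one function, which is the opposite of "a bunch of special cases hacked together":

    # Grammar:  expr   -> term (('+'|'-') term)*
    #           term   -> factor (('*'|'/') factor)*
    #           factor -> NUMBER | '(' expr ')'
    def parse(tokens):
        pos = 0

        def peek():
            return tokens[pos] if pos < len(tokens) else None

        def eat(expected=None):
            nonlocal pos
            tok = peek()
            if tok is None or (expected is not None and tok != expected):
                raise SyntaxError(f"expected {expected!r}, got {tok!r}")
            pos += 1
            return tok

        def factor():
            if peek() == '(':
                eat('(')
                node = expr()
                eat(')')
                return node
            return ('num', float(eat()))  # assume anything else is a number

        def term():
            node = factor()
            while peek() in ('*', '/'):
                node = (eat(), node, factor())
            return node

        def expr():
            node = term()
            while peek() in ('+', '-'):
                node = (eat(), node, term())
            return node

        tree = expr()
        if peek() is not None:
            raise SyntaxError(f"trailing input: {peek()!r}")
        return tree

    # Precedence falls straight out of the grammar:
    # ('+', ('num', 2.0), ('*', ('num', 3.0), ('num', 4.0)))
    print(parse(['2', '+', '3', '*', '4']))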
I really hope you consider learning about them... it will help you get to the next level as a computer scientist :-).

You should learn about compilers, for the simple reason that implementing one makes you a better programmer. The compiler will surely suck, but you will have learned a lot along the way. It is a great way of improving (or practising) your programming skills.

You do not need to understand compilers to be a good programmer, but it can help. One of the things I realized when learning about them is that compiling is simply a translation.
If you have ever translated from one language to another, you have just done compiling.
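To make that translation idea concrete, here is a toy sketch of my own in Python. Its standard ast module parses real source, and walking the tree while emitting instructions for an imaginary stack machine (the opcode names are made up) is already a tiny compiler:

    import ast

    OPS = {ast.Add: 'ADD', ast.Sub: 'SUB', ast.Mult: 'MUL', ast.Div: 'DIV'}

    def compile_expr(node):
        # Translate one language (Python expressions) into another
        # (instructions for a made-up stack machine).
        if isinstance(node, ast.Expression):
            return compile_expr(node.body)
        if isinstance(node, ast.Constant):
            return [f'PUSH {node.value}']
        if isinstance(node, ast.BinOp):
            return (compile_expr(node.left)
                    + compile_expr(node.right)
                    + [OPS[type(node.op)]])
        raise NotImplementedError(type(node).__name__)

    print(compile_expr(ast.parse('2 + 3 * 4', mode='eval')))
    # ['PUSH 2', 'PUSH 3', 'PUSH 4', 'MUL', 'ADD']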
So when should you learn about compilers?
When you want to, or need it to solve a problem.

Compiler theory is useful, but not essential, although some of its techniques come in handy, like lexical analysis and parsing.
Another one is error handling. Compilers need a lot of it: user input can contain anything, even the unexpected, and you need to deal with all of it.
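Here is a sketch of what both of those look like in Python (the token set is my invented example). The regular expressions are the lexical analysis, and the unexpected-character branch is the kind of error handling compilers are full of:

    import re

    # Token specification: first match wins, so order matters.
    TOKEN_SPEC = [
        ('NUMBER', r'\d+(?:\.\d+)?'),
        ('IDENT',  r'[A-Za-z_]\w*'),
        ('OP',     r'[+\-*/=()]'),
        ('SKIP',   r'\s+'),
    ]
    MASTER = re.compile('|'.join(f'(?P<{name}>{rx})' for name, rx in TOKEN_SPEC))

    def tokenize(text):
        pos = 0
        while pos < len(text):
            m = MASTER.match(text, pos)
            if m is None:
                # User input can contain anything, even the unexpected:
                # report what went wrong and where, don't just crash.
                raise SyntaxError(f'unexpected character {text[pos]!r} at column {pos}')
            if m.lastgroup != 'SKIP':
                yield (m.lastgroup, m.group())
            pos = m.end()

    print(list(tokenize('price = 2 * 9.99')))
    # [('IDENT', 'price'), ('OP', '='), ('NUMBER', '2'), ('OP', '*'), ('NUMBER', '9.99')]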

If you're going to be working at a high enough level that you're worrying over UML and self-describing code, you could easily go your entire career without wanting or needing intimate details of how the compiler works.
But if you're an in-the-trenches coder with no aspirations to manage your friends, it's likely that one day you'll realize you're waging war with your compiler. It could be a random bug that comes along, or a hallway conversation about while-versus-for loops. You'll realize the assembly (or IL, likely, in the coming years) is just a bit to the left of what you were needing, and another universe will unfold.
So, I suppose my answer is, just be aware of the compiler for now, that it's doing quite a lot, but don't worry over it too much.

Compilers courses usually focus on how high-level code is analyzed and translated into machine code. That's very interesting, but not crucial. It's more important to understand the machine code that the compiler generates, so that you understand how a computer works and what each language construct costs.
So I'd rather say that you should know an assembly language (I mean a limited subset of the assembly language of one architecture) to understand how a computer works, and the latter is definitely required of a competent programmer, so that they understand what a segmentation fault is, when to optimize and when not to, and other similar low-level things.
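You do not even have to go all the way down to real assembly to start building that intuition. As a small sketch (my own example, not from the answer above): Python's standard dis module shows the bytecode behind each construct, a rough analogue of reading the compiler's assembly output:

    import dis

    def total(xs):
        result = 0
        for x in xs:       # the loop setup and each iteration cost instructions
            result += x    # so does the in-place add
        return result

    # Prints one line per bytecode instruction -- a concrete view of
    # what each language construct costs.
    dis.dis(total)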

If you intend to write extremely time-critical real-time code, you will benefit from understanding how the compiler optimises your code. However, you will actually benefit more from understanding the underlying architecture of your hardware.
From my experience, if you understand how the hardware works, and how the compiler interprets your code, you will be able to write code that does exactly what you intend it to do. I have been caught on several occasions, writing code that got optimised away by the compiler and made the hardware do something that I did not intend.
All in all, understanding the entire software-hardware stack is not essential to write good algorithms and code, but it will most certainly help!

From a practical perspective, general compiler theory is less of a concern than the assembler, linker, and loader for a specific platform. For example, I just consider the GCC compiler a translator from my high-level C language to the low-level assembly language on an x86 platform. And more often than not, I manually refine ;) the code generated by the compiler.
From a scientific perspective, I would strongly suggest learning compiler theory; it will help you understand the great ideas that computers are built upon. And even more, you will see the world with different eyes.

Just my opinion, but I believe compilers are not given enough attention in CS courses, not in mine, and not in any others as far as I know. I think any CS major should do two things after a sabbatical or finishing their major: re-learn finite automata and maybe a formal-methods language if necessary, then apply that knowledge by writing a simple compiler.
Alex Aiken has a very useful online tutorial on writing a compiler for COOL (Classroom Object Oriented Language), which is a subset of Scala as of the 2013 version, at least at the time of writing.

Related

Is there any scripting language that's fast, easy to embed, and well-suited for high-level game-programming?

First off, I'm aware that there are many questions related to this, but none of them seemed to help my specific situation. In particular, lua and python don't fit my needs as well as I could hope. It may be that no language with my requirements exists, but before coming to that conclusion it'd be nice to hear a few more opinions. :)
As you may have guessed, I need such a language for a game engine I'm trying to create. The purpose of this game engine is to provide a user with the basic tools for building a game, while still giving her the freedom of creating many different types of games.
For this reason, the scripting language should be able to handle game concepts intuitively. Among other things, it should be easy to define a variety of types, sub-type them with slightly different properties, query and modify objects dynamically, and so on.
Furthermore, it should be possible for the game developer to handle every situation they come across in the scripting language. While basic components like the renderer and networking would be implemented in C++, game-specific mechanisms such as rotating a few hundred objects around a planet will be handled in the scripting language. This means that the scripting language has to be insanely fast, 1/10 C speed is probably the minimum.
Then there's the problem of debugging. Information about the function, the stack trace, and the variable states at the point where the error occurred should be accessible.
Last but not least, this is a project done by a single person. Even if I wanted to, I simply don't have the resources to spend weeks on just the glue code. Integrating the language with my project shouldn't be much harder than integrating lua.
Examining the two suggested languages, lua and python: lua is fast (luajit) and easy to integrate, but its standard debugging facilities seem to be lacking. What's even worse, lua by default has no type system at all. Of course you can implement one on your own, but the syntax will always be weird and unintuitive.
Python, on the other hand, is very comfortable to use and has a basic class system. However, it's not that easy to integrate, its paradigm doesn't really involve type-checking, and it's definitely not fast enough for more complex games. I'd again like to point out that everything would be done in Python. I'm well aware that Python would likely be fast enough for 90% of the code.
There's also Scala, which I haven't seen suggested so far. Scala seems to actually fulfill most of the requirements, but embedding the Java VM in C doesn't seem very easy, and it generally seems like Java expects you to build your application around Java rather than the other way around. I'm also not sure whether Scala's functional paradigm would be good for intuitive game development.
EDIT: Please note that this question isn't about finding a solution at any cost. If there isn't any language better than lua, I will simply compromise and use that(I actually already have the thing linked into my program). I just want to make sure I'm not missing something that'd be more suitable before doing so, seeing as lua is far from the perfect solution for me.
You might consider Mono. I only know of one success story for this approach, but it is a big one: a C++ engine with Mono scripting is the approach taken by Unity.
Try the Ring programming language
http://ring-lang.net
It's a general-purpose multi-paradigm scripting language that can be embedded in C/C++ projects, extended using C/C++ code, and/or used as a standalone language. The supported paradigms are imperative, procedural, object-oriented, functional, metaprogramming, declarative programming using nested structures, and natural programming.
The language is simple, tries to be natural, encourages organization, and comes with a transparent implementation. It has a compact syntax and a group of features that let the programmer create natural interfaces and declarative domain-specific languages in a fraction of the time. It is very small and fast, and comes with a smart garbage collector that puts memory under the programmer's control. It comes with useful, practical libraries and is designed for productivity and for developing high-quality solutions that can scale.
The compiler plus the virtual machine are 15,000 lines of C code.
Embedding Ring Interpreter in C/C++ Programs
https://en.wikibooks.org/wiki/Ring/Lessons/Embedding_Ring_Interpreter_in_C/C%2B%2B_Programs
For embeddability, you might look into Tcl, or if you're into Scheme, check out SIOD or Guile. I would suggest Lua or Python in general, of course, but your question precludes them.
Since no one seems to know a combination better than lua/luajit, I think I will leave it at that. Thanks for everyone's input on this. I personally find lua very lacking as a high-level language for game programming, but it's probably the best choice out there. So to whoever finds this question and has the same requirements (fast, easy to use, easy to embed): you'll either have to use lua/luajit or make your own. :)

Which will serve a budding programmer better: A classic book in scheme or a modern language like python?

I'm really interested in becoming a serious programmer, the type that people admire for hacker chops, as opposed to a corporate drone who can't even complete FizzBuzz.
Currently I've dabbled in a few languages, most of my experience is in Perl and Shell, and I've dabbled slightly in Ruby.
However, I can't help but feel that although I know bits and pieces of languages, I don't know how to program.
I'm really in no huge rush to immediately learn a language that can land me a job (though I'd like to do it soon), and I'm considering using PLT Scheme (now called Racket) to work through How to Design Programs or Structure and Interpretation of Computer Programs, essentially, one of the Scheme classics, because I have always heard that they teach people how to write high-quality, usable, readable code.
However, even MIT changed its introductory course from using SICP and Scheme to one in Python.
So, I ask for the sage advice of the many experienced programmers here regarding the following:
Does Scheme (and do those books) really teach one how to program well? If so, which of the two books do you recommend?
Is this approach to learning still relevant and applicable? Am I on the right track?
Am I better off spending my time learning a more practical/common language like Python?
Is Scheme (or lisp in general) really a language that one learns, only to never use? Or do those of you who know a lisp code in it often?
Thanks, and sorry for the rambling.
If you want to learn to really program, start doing it. Quit dabbling and write code. Pick a language and write code. Solve problems and release applications. Work with experienced programmers on open source projects, but get doing. A lot.
Does Scheme (and do those books) really teach one how to program well? If so, which of the two books do you recommend?
Probably. Probably better than any of the Learn X in Y Timespan books.
Is this approach to learning still relevant and applicable? Am I on the right track?
Yes.
Am I better off spending my time learning a more practical/common language like Python?
Only if you plan to get a job in it. Scheme will give you a better foundation though.
Is Scheme (or lisp in general) really a language that one learns, only to never use? Or do those of you who know a lisp code in it often?
I do emacs elisp fiddling to adjust my emacs. I also work with functional languages on the side to try to keep my mind flexible.
My personal opinion is that there are essentially two tracks that need to be walked before the student can claim to know something about programming. Track one is the machine itself, the computer. You should start with assembly here and learn how the computer works. After some work and understanding there - don't skimp - you should learn C and then C++, really getting an understanding of resource management and what actually happens. Track two is the very-high-level language track: Scheme, Prolog, Haskell, Perl, Python, C#, Java, and other languages that execute on a VM or interpreter. These, too, need to be studied to learn how problems can be abstracted and thought about in different ways that do not involve the fiddly bits of a real computer.
However, what will not work is being a language dilettante when learning to program. You will need to find a language - Scheme is acceptable, although I'd recommend starting at the low level first - and then stick with that language for a good year at least.
The most important parts of Scheme are the programming-language concepts you can pick up that modern languages are now just adopting or adding support for.
Lisp and Scheme have supported, before most other languages, features that were often revolutionary for the time: closures and first-class functions, continuations, hygienic macros, and others. C has none of these.
But they're appearing more and more often in programming languages that Get Stuff Done today. Why can you just declare functions seemingly anywhere in JavaScript? What happens to outside variables you reference from within a function? What are these new "closures" that PHP 5.3 is just now getting? What are "side effects" and why can they be bad for parallel computing? What are "continuations" in Ruby? How do LINQ functions work? What's a "lambda" in Python? What's the big deal with F#?
These are all questions that learning Scheme will answer but C won't.
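To answer two of those questions concretely in Python (my own illustrative snippet, not from the answer): outside variables referenced from within a function are captured by a closure, and a lambda is just an anonymous function expression:

    def make_counter():
        count = 0                 # a free variable, captured by the closure
        def increment():
            nonlocal count        # the inner function keeps 'count' alive
            count += 1
            return count
        return increment

    counter = make_counter()
    print(counter(), counter(), counter())   # 1 2 3 -- state survives between calls

    # A lambda is simply an anonymous function expression:
    print(list(map(lambda x: x * x, range(5))))   # [0, 1, 4, 9, 16]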
I'd say it depends on what you want to do.
If you want to get into programming, Python is probably better. It's an excellent first language, resembles most common programming languages, and is widely available. You'll find more libraries handy, and will be able to make things more easily.
If you want to get into computer science, I'd recommend Scheme along with SICP.
In either case, I'd recommend learning several very different languages eventually, to give you more ways to look at and solve problems. Getting reasonably proficient in Common Lisp, for example, will make you a better Java programmer. I'd take them one at a time, though.
The best languages to start with are probably:
a language you want to play/learn in
a language you want to work in
And probably in that order, too, unless the most urgent need is to feed yourself.
Here's the thing: the way to learn to program is to do it a lot. In order to do it a lot, you're going to need a lot of patience and more than a little bit of enthusiasm. This is more important than the specific language you pick.... but picking a language that you like working in (whether because you like the features or because you feel it'll teach you something) can be a big boost.
That said, here's a couple of comments on Scheme:
Does Scheme (and do those books) really teach one how to program well?
The thing about Scheme (or something like it) is that if you learn it, it'll teach you some very useful abstractions that a lot of programmers who never really come to grips with a functional programming language never learn. You'll think differently. The substance of programming languages and computing will look more fluid to you. You'll have a better idea of how to compose your own quasi-primitives out of a very small set of primitives, rather than relying on the generally static set of primitives offered in some other languages.
The problem is that a lot of what I'm saying might not mean much to you at the moment, and it's a bit more of a mind-bending road than coming into a common dynamic language like Perl, Python, or Ruby... or even a language like C which is close to the Von Neumann mechanics of the machine.
This doesn't mean it's necessarily a bad idea to start there: I've been part of an experiment where we taught Prolog of all things to first-time programmers, and it worked surprisingly well. Sometimes beginner's mind actually helps. :) But Scheme as a first language is definitely an unconventional path. I suspect Ruby or Python would be a gentler road.
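To give one small taste of that "quasi-primitives" idea in Python (my own illustration; Scheme makes this style even more natural): you can build new building blocks out of plain functions and then use them as if they were built-ins:

    from functools import reduce

    # 'compose' is not a Python primitive, but two lines make it one.
    def compose(*fns):
        return reduce(lambda f, g: lambda x: f(g(x)), fns)

    # A new "primitive" assembled from existing ones:
    normalize = compose(str.lower, str.strip)
    print(normalize('  Hello World  '))   # 'hello world'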
Is Scheme (or lisp in general) really a language that one learns, only to never use?
It's a language that you're unlikely to be hired to program in. However, while you're learning to program, and after you've learned and are doing it in your free time, you can write code in whatever you want, and because of the Internet, you'll probably be able to find people working on open source projects in whatever language you want. :)
I hate to tell ya, but nobody admires programmers for their "hacker chops". There are people who get shit done, and then there's everyone else. A great many of the former are the "corporate drones" you appear to hold in contempt.
Now, for your question: I personally love Lisp (and Scheme), but if you want something you're more likely to use in industry, "Beginning Python" might be better material for you, as Python is found more often in the wild. Or if you enjoy Ruby, find some good Ruby material and start producing working solutions (same with Java or .NET or whatever).
Really, either route will serve you well. The trick is to stick with it until you've internalized the concepts being taught.
Asking whether an approach to learning is relevant and applicable is tricky - there are many different learning styles, and it's a matter of finding out which ones apply to you personally. Bear in mind that the style you like best might not be the one that actually works best for you :-)
You've got plenty of time, and it sounds like you have enthusiasm to spare, so it's not a matter of which language you should learn, but which one you should learn first. Personally, I'd look at what you've learnt so far, what types of languages and paradigms you've got under your belt, and then go off on a wild tangent and choose one completely different.
I started programming at a very very young age. When I was in high school, I thought I was a good programmer. That's when I started learning about HOW and WHY the languages work rather than just the syntax.
Before learning the how and why, switching to a new language would have been hell. I had learned a language, but I hadn't learned to program. Now that I know the fundamental concepts well, I can apply them to virtually any language and pick it up with ease.
I would highly recommend a book (or even a school course, if you can afford it) that takes you through the process of coding without relying on a specific language.
Unfortunately I don't have any books to recommend, but if others agree with me and know of any, maybe they can offer a suggestion.
//Edit: After re-reading your question, I realize that I may have not actually answered any of them... Sorry about that. I think picking up a book that will take you in-depth with best-practices would be extremely helpful, regardless of the language you choose.
There are basic programming concepts (logic flow, data structures), which are easily taught by using languages like Python. However, there are much more complex programming concepts (design patterns, optimization, threading, etc.) which the classic languages don't abstract away for you.
If your search for knowledge leans more toward algorithm development and the science of programming, start with C. If your search is more for practical ends, I hear Ruby is a good starting point.
I agree with gruszczy. I'd start programming with C.
It may be kind of scary at first (at least it was for me :S), but in the long run you'd be grateful for it. I mean, I love Python, but because I learned C first, the learning curve for other languages wasn't very steep at all.
Start with C and make it so.
Just remember to practice, because you'll never improve at something by doing nothing. ;)
To a specific point in your question, the "classics" you mention will help you with exactly what the titles say. SICP is about the structure and interpretation of computer programs. It is not about learning Scheme (though you will learn Scheme). HtDP is about how to design programs, it is not about learning Scheme (though you will learn Scheme).
Scheme, in principle, is a very small and concise language with almost no gotchas. This makes it excellent for moving on to learning how to structure and interpret programs, or how to design them. More traditional "practical" languages like C, C++, Python, or Java do not have this quality. They are rife with syntax. Learning with these languages means you must simultaneously learn syntactical quirks while learning to think like a programmer. In my opinion, this is unfortunate. In some cases the quirks are good, in others they are accidents of history, but in all cases it is unfortunate.
Start coding in C. It may be a horror for you at first, but it teaches you the most important stuff: pointers, recursion, memory management. Try reading some classic books about programming, like The Art of Computer Programming by Donald Knuth. After you master that, you can think about learning object-oriented programming or functional programming. Basics first. If you manage to learn them, nothing will ever be hard for you again.

How to draw a tiger with just 3 lines?

Background:
An art teacher once gave me a design problem: draw a tiger using only 3 lines. The idea is that I study a tiger and learn which 3 lines to draw so that people can still tell it is a tiger.
The solution for this problem is to start with a full drawing of a tiger and remove elements until you get to the three parts that are most recognizable as a tiger.
I love this problem as it can be applied in multiple disciplines like software development, especially in removing complexity.
At work I deal with maintaining a large software system that has been hacked to death and is to the point of becoming unmaintainable. It is my job to remove the burdensome complexity that was caused by past developers.
Question:
Is there a set process for removing complexity in software systems - a kind of reduction process template to be applied to the problem?
Check out the book Refactoring by Martin Fowler, and his http://www.refactoring.com/ website.
Robert C. Martin's Clean Code is another good resource for reducing code complexity.
Unfortunately, the analogy with the tiger drawing may not work very well. With only three lines, a viewer can imagine the rest. In a software system, all the detail has to actually be there. You generally can't take much away without removing something essential.
Check out the book Anti-Patterns for a well-written book on the whole subject of moving from bad (or maladaptive) design to better. It provides ways to recover from a whole host of problems typically found in software systems. I would then add support to Kristopher's recommendation of Refactoring as an important second step.
Check out the book Working Effectively with Legacy Code.
The topics covered include
Understanding the mechanics of software change: adding features, fixing bugs, improving design, optimizing performance
Getting legacy code into a test harness
Writing tests that protect you against introducing new problems
Techniques that can be used with any language or platform—with examples in Java, C++, C, and C#
Accurately identifying where code changes need to be made
Coping with legacy systems that aren't object-oriented
Handling applications that don't seem to have any structure
This book also includes a catalog of twenty-four dependency-breaking techniques that help you work with program elements in isolation and make safer changes.
While intellectually stimulating, the concept of detail removal doesn't carry over very well (at least as-is) to software programs. The reason is that the drawing is re-evaluated by a human, with their ability to accept fuzzy input, whereas the program is re-evaluated by a CPU, which is very poor at "filling in the blanks". Another, more subtle reason is that programs convey a spatiotemporal narrative, whereas the drawing is essentially spatial.
Consequently, with software there is much less room for approximation, and for outright removal of particular sections of the code. Nevertheless, refactoring is the operational keyword and is sometimes applicable even to the most awkward legacy pieces. This discipline is, however, part art, part science, and doesn't have very many "quick tricks" that I know of.
Edit: One isn't however completely helpless against legacy code. See for example the excellent book references provided in Alex Baranosky and Kristopher Johnson's answers. These books provide many useful techniques, but on the whole I remain strong in my assertion that refactoring non-trivial legacy code is an iterative process that requires both art and science (and patience and ruthlessness and gentleness ;-) ).
This is a loaded question :-)
First, how do we measure "complexity"? Without a metric decided a priori, it may be hard to justify any "reduction" project.
Second, is the choice entirely yours? To take an example, assume that in some code base the hammer of "inheritance" is used to solve every other problem. While using inheritance is perfectly right for some cases, it may not be right for all of them. What do you do in such cases?
Third, can it be proved that behavior/functionality of the program did not change due to refactoring? (This gets more complex when the code is part of a shipping product.)
Fourth, you can start with simpler things like: (a) avoid global variables, (b) avoid macros, (c) use const pointers and const references as much as possible, (d) use const-qualified methods wherever it is the logical thing to do. I know these are not refactoring techniques, but I think they might help you proceed towards your goal.
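As a sketch of point (a), in Python rather than C++ (the names are invented for illustration): turning a hidden global into an explicit parameter is a small, safe first step that also makes the code testable:

    # Before: behaviour silently depends on hidden global state.
    TAX_RATE = 0.21

    def price_with_tax(price):
        return price * (1 + TAX_RATE)

    # After: the dependency is explicit, so callers (and tests) control it.
    def price_with_tax_explicit(price, tax_rate=0.21):
        return price * (1 + tax_rate)

    assert price_with_tax_explicit(100) == price_with_tax(100)
    assert price_with_tax_explicit(100, tax_rate=0.0) == 100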
Finally, in my humble opinion, any such refactoring project is more of a people issue than a technology issue. All programmers want to write good code, but the perception of good vs. bad is very subjective and varies across members of the same team. I would suggest establishing a "design convention" for the project (something like C++ Coding Standards). If you can achieve that, you are mostly done. The remaining part is to modify the parts of the code that do not follow the design convention. (I know this is very easy to say but much more difficult to do. Good wishes to you.)

Knowledge of Windows internals?

I wondered whether any of you have knowledge of the internal workings of Windows (kernel, interrupts, etc.), and whether you've found that you've become a better developer as a result?
Do you find that the more knowledge the better is a good motto to have as a developer?
I find myself studying a lot of things, thinking that with more understanding I'll be a better developer. Of course practice and experience also come into play.
This is a no-brainer - absolutely (assuming you're a developer primarily on the Windows platform, of course). A working knowledge of how the car's engine works will make a lot of common programming tasks (debugging, performance work, etc.) a lot easier.
Windows Internals is the standard reference.
I believe it is valuable to understand how things work underneath: CLR/.NET down to C++, native code down to ASM, ASM down to CPU architecture, registers and ops built from logic gates, logic gates from MOSFET transistors, transistors from quantum physics, and the latter from the respective mathematical apparatus (group theory, etc.).
Understanding the low level makes you not only think differently but also feel different - like you are in control of things, standing on the shoulders of giants.
More knowledge is always better, and having knowledge at many levels is a lot more valuable than just knowing whatever layer of abstraction you are working at.
A good rule of thumb is that you should have a good knowledge of the layer below the layer where you are working. So, for example, if you write a lot of .NET code, you should know how the CLR works. If you write a lot of web apps, you should understand HTTP. If you writing code that uses HTTP directly, then you should understand TCP/IP. If you are implementing a TCP/IP stack, then you need to understand how Ethernet works.
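For instance, if you normally use HTTP through a library, it is instructive to drop one layer down just once. A sketch in Python, speaking HTTP/1.1 over a raw TCP socket (example.com is simply a convenient public host):

    import socket

    # HTTP is only text flowing over a TCP connection -- the layer below the library.
    with socket.create_connection(('example.com', 80)) as sock:
        sock.sendall(b'GET / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n')
        response = b''
        while chunk := sock.recv(4096):
            response += chunk

    # Print just the status line and headers.
    print(response.decode('latin-1').split('\r\n\r\n', 1)[0])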
Knowledge of Windows internals is really helpful if you are writing native Win32 code, or if OS performance issues are critical to what you are doing. At higher levels of abstraction, it may be less helpful, but it never hurts.
I don't think one requires special or secret knowledge of internals, such as that extended to members of the Windows team or those with source access, but I absolutely contend that understanding internals helps you become a better developer.
Take threading, for instance: if you are going to build an application that uses threading in even a moderate way, then understanding how Windows works, how threading works, and how memory and processes work are all key to being able to do a good job with that code.
I agree to a point with your edict, but I would not agree that experience/practice/knowledge are mutually exclusive. The net-net of experience is that you have knowledge gained from it. There is also a wisdom component to experience and practice, but those are usually intangible, situational elements that you apply in the future to avoid mistakes. Bottom line: knowledge is a precipitate of experience.
Think of it this way: how many people do you know with 30+ years of experience in IT? Think of them and take the top two. Now go into that memory bank and think of the people you know in the industry who are super smart, who know so much about so many things, and pick the top two of those. You now have your final 4 - if you had to pick one to start a project with, who would it be? Invariably we pick the super-smart guy.
Yes, understanding Windows internals helped me become a better programmer. It also taught me a lot of bad practices, bad ideas, and poor design concepts.
I highly suggest studying OS X or Linux internals as an alternative. It'll take less time, make more sense, and be much more productive.
Read code. Read lots of code. Read lots of good code. jQuery, Django, AIR framework source, Linux kernel, compilers.
Try to learn programming languages that introduce you to new approaches, like Lisp, Ruby, Python, or Javascript. OOP is good, but .net and Java seem to take the brainwash approach on it, and elevate it to some kind of religious level, instead of it just being a good tool in your toolbox.
If you don't understand the code you are reading, it likely means you are on the right track, and learning new techniques.
I'd suggest getting a Mac, simply because you'll find yourself wanting to make your UIs simpler and easier. It's really important to have a good environment if you want to become a great programmer. Surround yourself with engineers better than yourself (if you can), work with frameworks and languages that take the 'engineer' approach vs. the 'experimenter' approach, and... use an operating system that contains code better than yours.
I'd also recommend the book "Coders at Work".
It depends. Many programmers who understand the internals of a system begin writing optimised code to exploit that knowledge. This has three very serious side-effects:
1.) It's harder for others without that knowledge to extend or support the code.
2.) System internals may change without notice, whereas interfaces are usually versioned and changes are discussed publicly.
3.) Interfaces are generally consistent across platform revisions and hardware; internals do not have this consistency.
In short, there's a lot of broken, unsupportable code out there that's borked because it relies on an internal process that the vendor changed without notice.
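A tiny Python illustration of that trap (this is CPython-specific behaviour - an internal, not part of the language's documented interface):

    # CPython happens to cache small integers, so identity ('is') can
    # accidentally "work" as an equality test -- an internal, not an interface.
    print(int('5') is int('5'))        # True on CPython, thanks to the int cache
    print(int('500') is int('500'))    # False -- outside the cache, the trick breaks
    print(int('500') == int('500'))    # True -- the documented interface always works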
The father of the C language said: "You don't need to learn all the features of a language to write great code. The better you understand the problem, the better you write the code." Having knowledge is always better.

When is a new language the right tool for the job?

For a long time I've been trying different languages to find the feature-set I want and I've not been able to find it. I have languages that fit decently for various projects of mine, but I've come up with an intersection of these languages that will allow me to do 99.9% of my projects in a single language. I want the following:
Built on top of .NET or has a .NET implementation
Has few dependencies on the .NET runtime both at compile-time and runtime (this is important since one of the major use cases is in embedded development where the .NET runtime is completely custom)
Has a compiler that is 100% .NET code with no unmanaged dependencies
Supports arbitrary expression nesting (see below)
Supports custom operator definitions
Supports type inference
Optimizes tail calls
Has explicit immutable/mutable definitions (nicety -- I've come to love this but can live without it)
Supports real macros for strong metaprogramming (absolute must-have)
The primary two languages I've been working with are Boo and Nemerle, but I've also played around with F#.
Main complaints against Nemerle: The compiler has horrid error reporting, the implementation is buggy as hell (compiler and libraries), the macros can only be applied inside a function or as attributes, and it's fairly heavy dependency-wise (although not enough that it's a dealbreaker).
Main complaints against Boo: No arbitrary expression nesting (dealbreaker), macros are difficult to write, no custom operator definition (potential dealbreaker).
Main complaints against F#: Ugly syntax, hard to understand metaprogramming, non-free license (epic dealbreaker).
So the more I think about it, the more I think about developing my own language.
Pros:
Get the exact syntax I want
Get a turnaround time that will be a good deal faster; difficult to quantify, but I wouldn't be surprised to see 1.5x developer productivity, especially due to the test infrastructures this can enable for certain projects
I can easily add custom functionality to the compiler to play nicely with my runtime
I get something that is designed and works exactly the way I want -- as much as this sounds like NIH, this will make my life easier
Cons:
Unless it can get popularity, I will be stuck with the burden of maintenance. I know I can at least get the Nemerle people over, since I think everyone wants something more professional, but it takes a village.
Due to the first con, I'm wary of using it in a professional setting. That said, I'm already using Nemerle and using my own custom modified compiler since they're not maintaining it well at all.
If it doesn't gain popularity, finding developers will be much more difficult, to an extent that Paul Graham might not even condone.
So based on all of this, what's the general consensus -- is this a good idea or a bad idea? And perhaps more helpfully, have I missed any big pros or cons?
Edit: Forgot to add the nesting example -- here's a case in Nemerle:
    def foo =
        if(bar == 5)
            match(baz) { | "foo" => 1 | _ => 0 }
        else
            bar;
Edit #2: Figured it wouldn't hurt to give an example of the type of code that will be converted to this language if it's to exist (S. Lott's answer alone may be enough to scare me away from doing it). The code makes heavy use of custom syntax (opcode, :=, quoteblock, etc.), expression nesting, and so on. You can check out a good example here.
Sadly, there are no metrics or stories around failed languages, just successful ones. Clearly, the failures outnumber the successes.
What do I base this on? Two common experiences.
Once or twice a year, I have to endure a pitch for a product/language/tool/framework that will Absolutely Change Everything. My answer has been constant for the last 20 or so years. Show me someone who needs support and my company will support them. And that's that. Never hear from them again. Let's say I've heard 25 of these.
Once or twice each year, I have to work with a customer who has orphaned technology. At some point in the past, some clever programmer built a tool/framework/library/package that was used internally for several projects. Then that programmer left. No one else can figure the darn thing out, and they want us to replace/rewrite it. Sadly, we can't figure it out either, and our proposal is to rewrite from scratch. And they complain that their genius built the set of apps in a period of weeks; it can't take us months to rewrite them in Java/Python/VB/C#. Let's say I've written 25 or so of these kinds of proposals.
That's just me, one consultant.
Indeed, one particularly sad situation was a company whose entire IT software portfolio was written by one clever guy with a private language and tools. He hadn't left, but he'd realized that his language and toolset had fallen way behind the times -- the state of the art had moved on, and he hadn't.
And the move was -- of course -- in an unexpected direction. His language and tools were okay, but the world had started to adopt relational databases, and he had absolutely no way to upgrade his junk to move away from flat files. It was something he had not foreseen. Indeed, it was something he could not possibly foresee. [You won't fall into this trap, will you?]
So, we talked. He rewrote a lot of the applications in Plain-Old VAX Fortran (yes, this is a long time ago.) And he rewrote it to use plain old relational SQL stuff (Ingres, at the time.)
After a year of coding, they were having performance problems. They called me back to review all the great stuff they'd done in replacing the home-built language. Sadly, they'd done the worst possible relational database design. Worst possible. They'd taken their file copies, merges, sorts, and what-not, and implemented each low-level file system operation using SQL, duplicating database rows left, right and center.
He was so mired in his private vision of the perfect language, that he couldn't adapt to a relatively common, pervasive new technology.
I say go for it.
It would be an awesome experience regardless of whether it makes it to production or not.
If you make it compile down to IL, then you do not have to worry about not being able to reuse your compiled assemblies with C#.
If you believe that you have valid complaints about the languages you listed above, it is likely that many people think like you. Of course, for every 1000 interested people there might be 1 willing to help you maintain it - but that is always the risk.
But here are a few things to be cautioned about:
Get your language specification set IN STONE before development. Make sure any and all language features are figured out beforehand - even things you may only want in the future. In my opinion, C# is slowly falling into the "oh-just-one-more-language-extension" trap that will lead to its eventual doom.
Be sure to optimize it. I don't know what you already know, but if you don't know, then learn ;) Nobody will want a language that has nice syntax but runs as slow as IE's JavaScript implementation.
Good luck :D
When I first started my career in the early 90s, there seemed to be this craze of everyone developing their own in-house languages. My first 3 jobs were with companies that had done this. One company had even developed their own operating system!
From experience, I'd say this is a bad idea for the following reasons:
1) You will spend time debugging the language itself in addition to the code base on top of it
2) Any developers you hire will need to go through the learning curve of the language
3) It will be hard to attract and keep developers since working in a proprietary language is a dead-end for someone's career
The main reason I left those three jobs was that they had proprietary languages, and you'll notice that not many companies take this route any more :).
An additional argument I'd make is that most languages have entire teams whose full time job it is to develop the language. Maybe you'd be an exception, but I'd be very surprised if you'd be able to match that level of development by only working on the language part-time.
"Main complaints against Nemerle: The compiler has horrid error reporting, the implementation is buggy as hell (compiler and libraries), the macros can only be applied inside a function or as attributes, and it's fairly heavy dependency-wise (although not enough that it's a dealbreaker)."
I see your post was written more than two years ago.
I advise you to try the Nemerle language today.
The compiler is stable, and there are no blocker bugs at the moment.
The VS integration has a lot of improvements, and there is also SharpDevelop integration.
If you give it a chance, you won't be disappointed.
NEVER EVER develop your own language.
Developing your own language is a fool's trap, and worse, it will limit you to what your imagination can provide, while also demanding that you work out both your development environment and the actual programme you're writing.
The cases in which this doesn't apply are pretty much if you're Larry Wall, the AWK guys, or part of a substantial group of people dedicated to testing the boundaries of programming. If you're in any of those categories, you don't need my advice, but I strongly doubt you're targeting a niche where no existing programming language suits both the task AND the characteristics of the people doing the task.
If you are as clever as you seem to be (a likely possibility), my advice is to go ahead and do the design of the language first, iterate a couple of times over it, ask some smart fellows you trust in smart programming language related communities about the concrete design you came up with and then take the decision.
You might realize in the process of creating the design that just a quick hack on Nemerle would give it all you need, for example. Many things can happen just when thinking hard about a problem, and the final solution might not be what you actually had in mind when beginning the project.
Worst case scenario, you're stuck with actually implementing the design, but by then you will have it proof read and mature, and you'll know with a high degree of certainty that it was a good path to take.
A related piece of advice, start small, just define the features you absolutely need and then build on them to get the rest.
Writing your own language is not an easy project, especially one to be used in any kind of professional setting.
It is a huge amount of work, and I doubt you could write your own language and still write any big projects that use it - you will spend so long adding features that you need, fixing bugs, and doing general language-design stuff.
I would strongly recommend choosing the language that is closest to what you want and extending it to do what you need. It'll never be exactly what you want, but compared to the time you'd spend writing your own language, I would say that's a small compromise.
Scala has a .NET compiler. I don't know the status of it, though; it's kind of a second-class citizen in the Scala world (which is more focused on the JVM). But it might be a good tradeoff to adopt the .NET compiler instead of creating a new language from scratch.
Scala is kind of weak in the metaprogramming department at the moment. It's possible that the need for metaprogramming is somewhat reduced by other language features. In any case, I don't think anyone would be sad if you were to implement metaprogramming features for it. Also, there is a compiler plug-in infrastructure on the way.
I think most languages will never fit the bill entirely.
You might want to combine your two favourite languages (in my case C# and Scheme) and use them together.
From a professional point of view, this is probably not a good idea though.
It would be interesting to hear some of the things you feel you can't do in existing languages. What kind of projects are you working on that can't be done in C#?
I'm just curious!
