What's the difference between Managed/Byte Code and Unmanaged/Native Code?

Sometimes it's difficult to describe some of the things that "us programmers" may think are simple to non-programmers and management types.
So...
How would you describe the difference between Managed Code (or Java Byte Code) and Unmanaged/Native Code to a Non-Programmer?

Managed Code == "Mansion House with an entire staff of Butlers, Maids, Cooks & Gardeners to keep the place nice"
Unmanaged Code == "Where I used to live in University"

Think of your desk. If you clean it up regularly, there's space to put what you're actually working on in front of you. If you don't clean it up, you run out of space.
That space is equivalent to computer resources like RAM, Hard Disk, etc.
Managed code allows the system to automatically choose when and what to clean up. Unmanaged Code makes the process "manual" - in that the programmer needs to tell the system when and what to clean up.

I'm astonished by what emerges from this discussion (well, not really but rhetorically). Let me add something, even if I'm late.
Virtual Machines (VMs) and Garbage Collection (GC) are decades old and two separate concepts. Garbage-collected languages compiled to native code exist, and have for decades (canonical example: ANSI Common Lisp; there is also at least one declarative language with compile-time garbage collection, Mercury - but apparently the masses scream at Prolog-like languages).
Suddenly GCed byte-code based VMs are a panacea for all IT diseases. Sandboxing of existing binaries? Principle of least authority (POLA)/capabilities-based security? Slim binaries (or its modern variant SafeTSA)? Region inference? No, sir: Microsoft & Sun do not authorize us to even think about such perversions. No, better rewrite our entire software stack for this wonderful(???) new(???) language§/API. As one of our hosts says, it's Fire and Motion all over again.
§ Don't be silly: I know that C# is not the only language that targets .Net/Mono; it's hyperbole.
Edit: it is particularly instructive to look at comments to this answer by S.Lott in the light of alternative techniques for memory management/safety/code mobility that I pointed out.
My point is that non-technical people don't need to be bothered with technicalities at this level of detail.
On the other hand, if they are impressed by Microsoft/Sun marketing, it is necessary to explain to them that they are being fooled - GCed byte-code based VMs are not the novelty they are claimed to be, they don't magically solve every IT problem, and alternatives to these implementation techniques exist (some are better).
Edit 2: Garbage Collection is a memory management technique and, like every implementation technique, needs to be understood to be used correctly. Look how, at ITA Software, they bypass GC to obtain good performance:
4 - Because we have about 2 gigs of static data we need rapid access to,
we use C++ code to memory-map huge
files containing pointerless C structs
(of flights, fares, etc), and then
access these from Common Lisp using
foreign data accesses. A struct field
access compiles into two or three
instructions, so there's not really
any performance penalty for accessing
C rather than Lisp objects. By doing
this, we keep the Lisp garbage
collector from seeing the data (to
Lisp, each pointer to a C object is
just a fixnum, though we do often
temporarily wrap these pointers in
Lisp objects to improve
debuggability). Our Lisp images are
therefore only about 250 megs of
"working" data structures and code.
...
9 - We can do 10 seconds of Lisp computation on a 800mhz box and cons
less than 5k of data. This is because
we pre-allocate all data structures we
need and die on queries that exceed
them. This may make many Lisp
programmers cringe, but with a 250 meg
image and real-time constraints, we
can't afford to generate garbage. For
example, rather than using cons, we
use "cons!", which grabs cells from an
array of 10,000,000 cells we've
preallocated and which gets reset
every query.
Edit 3: (to avoid misunderstanding) is GC better than fiddling directly with pointers? Most of the time, certainly, but there are alternatives to both. Is there a need to bother users with these details? I don't see any evidence that this is the case, besides dispelling some marketing hype when necessary.
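A rough sketch of the "cons!" idea from the ITA quote above, written in Python rather than Lisp: cells come out of an array that is allocated once, a query that needs more cells than the pool holds simply fails, and the whole pool is reclaimed in one step at the end of the query. The names `ConsPool` and `to_list` and the tiny capacity are made up for illustration; this is not ITA's code, and Python's own allocator is of course still at work underneath.

```python
class ConsPool:
    """Fixed-size pool of cons cells, preallocated once and reset per query."""

    def __init__(self, capacity):
        # Preallocate parallel arrays for the two fields of every cell.
        self.car = [None] * capacity
        self.cdr = [None] * capacity
        self.capacity = capacity
        self.next_free = 0

    def cons(self, head, tail):
        """Hand out the next preallocated cell; fail if the pool is exhausted."""
        if self.next_free >= self.capacity:
            raise MemoryError("query exceeded the preallocated cell pool")
        index = self.next_free
        self.car[index] = head
        self.cdr[index] = tail
        self.next_free += 1
        return index  # a cell is just an integer index into the pool

    def reset(self):
        """Reclaim every cell at once, as would happen at the end of a query."""
        self.next_free = 0


def to_list(pool, cell):
    """Walk a chain of cells and collect the stored values."""
    out = []
    while cell is not None:
        out.append(pool.car[cell])
        cell = pool.cdr[cell]
    return out


pool = ConsPool(capacity=4)
chain = pool.cons(1, pool.cons(2, pool.cons(3, None)))
print(to_list(pool, chain))  # [1, 2, 3]
pool.reset()                 # all cells are free again, no garbage generated
```

The trade-off is the one the quote describes: no garbage for the collector to chase, at the price of a hard upper bound per query.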

I'm pretty sure the basic interpretation is:
Managed = resource cleanup managed by runtime (i.e. Garbage Collection)
Unmanaged = clean up after yourself (i.e. malloc & free)
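To make that contrast concrete, here is a small Python sketch. Python itself is managed, so the "unmanaged" side is only simulated: `ManualHeap`, `Block` and the two functions are hypothetical names, and the toy heap just records which blocks nobody has freed yet.

```python
import gc
import weakref


class Block:
    """Stand-in for a chunk of memory."""


class ManualHeap:
    """Toy 'unmanaged' heap: blocks stay allocated until free() is called."""

    def __init__(self):
        self.live = {}
        self.next_id = 0

    def malloc(self):
        handle = self.next_id
        self.next_id += 1
        self.live[handle] = Block()
        return handle

    def free(self, handle):
        del self.live[handle]


def unmanaged_leak(heap):
    handle = heap.malloc()
    # Forgot heap.free(handle): the block stays allocated after we return.


def managed_no_leak():
    block = Block()
    probe = weakref.ref(block)  # observes the object without keeping it alive
    del block                   # drop the last reference
    gc.collect()                # ask the runtime to reclaim unreachable objects
    return probe() is None      # True once the runtime has reclaimed the object


heap = ManualHeap()
unmanaged_leak(heap)
print(len(heap.live))     # 1: the forgotten block is still allocated
print(managed_no_leak())  # True: the runtime cleaned up on its own
```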

Perhaps compare it with investing in the stock market.
You can buy and sell shares yourself, trying to become an expert in what will give the best risk/reward - or you can invest in a fund which is managed by an "expert" who will do it for you - at the cost of you losing some control, and possibly some commission. (Admittedly I'm more of a fan of tracker funds, and the stock market "experts" haven't exactly done brilliantly recently, but....)

Here's my Answer:
Managed (.NET) or Byte Code (Java) will save you time and money.
Now let's compare the two:
Unmanaged or Native Code
You need to do your own resource (RAM / Memory) allocation and cleanup. If you forget something, you end up with what's called a "Memory Leak". A Memory Leak is a term for when an application keeps using up (eating up) RAM/Memory without letting it go so the computer can use it for other applications; eventually this can cause the application, or even the computer, to crash.
In order to run your application on different Operating Systems (Mac OSX, Windows, etc.) you need to compile your code specifically for each Operating System, and possibly change a lot of code that is Operating System specific so it works on each Operating System.
.NET Managed Code or Java Byte Code
All the resource (RAM / Memory) allocation and cleanup are done for you and the risk of creating "Memory Leaks" is reduced to a minimum. This allows more time to code features instead of spending it on resource management.
In order to run your application on different Operating Systems (Mac OSX, Windows, etc.) you just compile once, and it'll run on each as long as they support the given Framework your app runs on top of (.NET Framework / Mono or Java).
In Short
Developing using the .NET Framework (Managed Code) or Java (Byte Code) makes it overall cheaper to build an application that can target multiple operating systems with ease, and allows more time to be spent building rich features instead of the mundane tasks of memory/resource management.
Also, before anyone points out that the .NET Framework doesn't support multiple operating systems, I need to point out that technically Windows 98, WinXP 32-bit, WinXP 64-bit, WinVista 32-bit, WinVista 64-bit and Windows Server are all different Operating Systems, but the same .NET app will run on each. And, there is also the Mono Project that brings .NET to Linux and Mac OSX.

Unmanaged code is a list of instructions for the computer to follow.
Managed code is a list of tasks for the computer to follow, where the computer is free to decide on its own how to accomplish them.
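One way to see the "list of tasks" side for yourself, using Python's own byte code as a stand-in for .NET or Java byte code (an assumption made for convenience, the idea is the same): the source is compiled once into instructions for a virtual machine, those instructions can be stored and reloaded, and the VM decides how to carry them out on whatever hardware it runs on. Note that Python's `marshal` format is tied to the interpreter version, so this is a sketch of the principle, not a portable file format.

```python
import marshal

source = "result = sum(n * n for n in range(4))"

# Compile once to a code object holding byte code for the Python VM,
# not machine code for a particular processor.
code = compile(source, "<example>", "exec")

# Serialize and reload the byte code, roughly what a cached .pyc file holds.
blob = marshal.dumps(code)
reloaded = marshal.loads(blob)

namespace = {}
exec(reloaded, namespace)   # the VM works out how to execute each instruction
print(namespace["result"])  # 14
```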

The big difference is memory management. With native code, you have to manage memory yourself. This can be difficult and is the cause of a lot of bugs and a lot of development time spent tracking down those bugs. With managed code, you still have problems, but a lot fewer of them and they're easier to track down. This normally means less buggy software, and less development time.
There are other differences, but memory management is probably the biggest.
If they were still interested I might mention how a lot of exploits are from buffer overruns and that you don't get that with managed code, or that code reuse is now easy, or that we no longer have to deal with COM (if you're lucky anyway). I'd probably stay away from COM, otherwise I'd launch into a tirade over how awful it is.
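The buffer overrun point is easy to demonstrate in any managed runtime. A minimal Python sketch (the helper `write_at` is just an illustrative name): every write is bounds-checked, so going one past the end raises an error instead of silently scribbling over whatever sits next to the buffer in memory.

```python
def write_at(buffer, index, value):
    """Write one byte; the runtime checks the bounds on every access."""
    buffer[index] = value


buffer = bytearray(8)      # eight bytes, valid indices 0..7
write_at(buffer, 7, 0xFF)  # in range: succeeds

try:
    write_at(buffer, 8, 0xFF)  # one past the end
    overran = True
except IndexError:
    overran = False  # the write was refused; nothing else was touched

print(overran)      # False
print(len(buffer))  # 8
```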

It's like the difference between playing pool with and without bumpers along the edges. Unless you and all the other players always make perfect shots, you need something to keep the balls on the table. (Ignore intentional ricochets...)
Or think of soccer with walls instead of sidelines and endlines, or baseball without a backstop, or hockey without a net behind the goal, or NASCAR without barriers, or football without helmets ...

"The specific term managed code is particularly pervasive in the Microsoft world."
Since I work in MacOS and Linux world, it's not a term I use or encounter.
The Brad Abrams "What is Managed Code" blog post has a definition that says things like ".NET Framework Common Language Runtime".
My point is this: it may not be appropriate to explain the term at all. If it's a bug, hack or work-around, it's not very important. Certainly not important enough to work up a sophisticated lay-person's description. It may vanish with the next release of some batch of MS products.


What is the reason why high level abstractions that use lock free programming deep down aren't popular?

From what I have gathered about lock-free programming, it is incredibly hard to do right... and I agree.
Just thinking about some problems makes my head hurt. But what I wonder is, why isn't
there widespread use of high-level wrappers around it (e.g. lock-free queues and similar stuff)?
For example boost has no lock-free library, although one was suggested as far as I know.
I mean, I guess there are a lot of applications where you can't avoid the fact that the critical
section is a big part of the load. So what are the reasons? Is it...
Patents - I heard that some stuff related to lock-free programming is patented.
Performance.
Google, and Microsoft have internal libraries like that but none of them are public...
Something else?
So my question is: Why are high level abstractions that use lock free programming deep down not very
popular, while at the same time "regular" multi-threaded programming is "in"?
EDIT: boost got a lockfree lib :)
There are few people who are familiar enough with the field to implement easy-to-use lock-free libraries. Of those few, even fewer publish work for free and of those almost none do the vital additional work to make the library useable - e.g. publish full API docs, etc. They tend to just release a zip file with code in, which is almost useless. Then of course you also need to find a library which is written in the language you want to use, compiles on the platform you're using and finally, word of the library has to get out, so people know it exists.
Patents are an issue, in that they limit what can be offered. There is, for example, to my knowledge no unpatented lock-free singly-linked list. All the skip list stuff is heavily patented, too.
A hero in this field is Cliff Click, who came up with a lock-free hash, which he has more-or-less placed in the public domain.
You can find my lock-free library here;
http://www.liblfds.org
Another is Samy Bahra's Concurrency Kit;
http://www.concurrencykit.org
FYI Microsoft's .Net framework gained some lock free classes in .Net 4.0. Namely container classes in the System.Collections.Concurrent namespace, which are:
ConcurrentDictionary
ConcurrentQueue
ConcurrentStack
I've looked into their implementation and they are relatively fiddly/complex under the hood, so they represent a significant amount of design and testing effort (threading issues are of course notoriously difficult to test to a high standard).
You can take a look at the libcds C++ library. It is a collection of lock-free containers (stacks, queues, sets and maps) and safe memory reclamation algorithms.
IMHO regarding C++ (I'm not advanced in other languages): the new C++ standard has just been released and compiler developers need time to implement its requirements. Today, not all compilers support the C++11 memory model entirely, since it requires significant changes to the compiler's optimization rules. Recently, Microsoft announced support for atomic operations, which are the basis of lock-free programming, in the VC++ 11 Developer Preview. It is good news for us. As far as I know, GCC is going to support it in 4.8 (or above).
The second problem is patents. Many interesting lock-free container algorithms are patented, which is a barrier to including them in vendors' libraries.
Third, the main part of lock-free containers is garbage collecting (safe memory reclamation). C++ is free from any GC (fortunately). There are a few GC algos (Hazard Pointer, Pass-the-Buck, epoch-based and so on) but most of them are patented too.
Fourth, there are not enough tools to prove the correctness of the memory fences applied in your lock-free implementation. I know of only one: Relacy (http://www.1024cores.net/home/relacy-race-detector).
I think after 2-3 years we’ll see many production-ready multiplatform C++ libraries of lock-free containers and algorithms. These libraries are being developed by vendors and enthusiasts.
However, in my opinion, our future is hardware transactional memory (HTM). Today AMD, Sun (sorry, Oracle), Intel (?) are investigating HTM with very interesting results. Let's wait.
There is at least one "lock free" framework that is somewhat popular: Erlang.
One major problem is that unless one uses an excessive number of memory barriers, it's hard to be certain that one has enough; if one does use an excessive number of memory barriers, performance is likely to be inferior to what one would have gotten using locks.
The biggest problem with locks is not performance, but robustness. If a thread gets waylaid while it holds a lock, the system dies. By contrast, if a thread which is accessing a lock-free data structure gets waylaid, it won't affect other threads' use thereof. In some situations, a lock-free data structure may be preferable to one using locks, even if performance is inferior, because one must protect the system from being brought down by a malfunctioning thread. For example, even if one was prepared to kill off a thread which hit a StackOverflowException without taking down the process, how would one protect against a thread putting a lot of stuff on its stack before calling a method that accesses a lock-protected data structure, such that the lock-guarded method hits a stack overflow? If one uses lock-free data structures, such risks aren't a problem.
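The robustness problem with locks is easy to reproduce. In this Python sketch (the function name and the 0.2 second timeout are arbitrary choices for the demo), one thread takes a lock and then ends without releasing it, standing in for a thread that crashed or stalled while inside the critical section. Every other thread that needs the lock is stuck from then on; the timeout is only there so the demo can observe that instead of hanging.

```python
import threading

lock = threading.Lock()


def waylaid_worker():
    lock.acquire()
    # The thread ends here without releasing, standing in for one that
    # crashed or stalled while holding the lock.


worker = threading.Thread(target=waylaid_worker)
worker.start()
worker.join()

# Anyone else who needs the lock now waits forever; use a timeout to observe it.
got_lock = lock.acquire(timeout=0.2)
print(got_lock)  # False
```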

Relation between language and scalability

I came across the following statement in Trapexit, an Erlang community website:
Erlang is a programming language used
to build massively scalable soft
real-time systems with requirements on
high availability.
Also I recall reading somewhere that Twitter switched from Ruby to Scala to address scalability problem.
Hence, I wonder what is the relation between a programming language and scalability?
I would think that scalability depends only on the system design, exception handling etc. Is it because of the way a language is implemented, the libraries, or some other reasons?
Hope for enlightenment. Thanks.
Erlang is highly optimized for a telecommunications environment, running at five-nines (99.999%) uptime or so.
It contains a set of libraries called OTP, and it is possible to reload code into the application 'on the fly' without shutting down the application! In addition, there is a framework of supervisor modules and so on, so that when something fails, it gets automatically restarted, or else the failure can gradually work itself up the chain until it gets to a supervisor module that can deal with it.
That would be possible in other languages too, of course. In C++, you can reload DLLs on the fly and load plugins. In Python you can reload modules. In C#, you can load code in on the fly, use reflection and so on.
It's just that that functionality is built in to Erlang, which means that:
it's more standard, any Erlang developer knows how it works
less stuff to re-implement oneself
That said, there are some fundamental differences between languages, to the extent that some are interpreted, some run off bytecode, some are native compiled, so the performance, and the availability of type information and so on at runtime differs.
Python (in its standard CPython implementation) has a global interpreter lock around its runtime, so threads within a single process cannot make use of SMP.
Erlang only recently had changes added to take advantage of SMP.
Generally I would agree with you in that I feel that a significant difference is down to the built-in libraries rather than a fundamental difference between the languages themselves.
Ultimately I feel that any project that gets very large risks getting 'bogged down' no matter what language it is written in. As you say I feel architecture and design are pretty fundamental to scalability and choosing one language over another will not I feel magically give awesome scalability...
Erlang comes from another culture in thinking about reliability and how to achieve it. Understanding the culture is important, since Erlang code does not become fault-tolerant by magic just because it's Erlang.
A fundamental idea is that high uptime does not only come from a very long mean-time-between-failures, it also comes from a very short mean-time-to-recovery, if a failure happened.
One then realizes that one needs automatic restarts when a failure is detected. And one realizes that at the first detection of something not being quite right, one should "crash" to cause a restart. The recovery needs to be optimized, and the possible information losses need to be minimal.
This strategy is followed by many successful pieces of software, such as journaling filesystems or transaction-logging databases. But overwhelmingly, software tends to only consider the mean-time-between-failures: it sends messages to the system log about error indications, then tries to keep on running until that is no longer possible, typically requiring a human to monitor the system and reboot it manually.
Most of these strategies are in the form of libraries in Erlang. The part that is a language feature is that processes can "link" and "monitor" each other. The first one is a bi-directional contract that "if you crash, then I get your crash message, which if not trapped will crash me", and the second is "if you crash, I get a message about it".
Linking and monitoring are the mechanisms that the libraries use to make sure that other processes have not crashed (yet). Processes are organized into "supervision" trees. If a worker process in the tree fails, the supervisor will attempt to restart it, or all workers at the same level of that branch in the tree. If that fails it will escalate up, etc. If the top level supervisor gives up the application crashes and the virtual machine quits, at which point the system operator should make the computer restart.
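The restart-then-escalate behaviour described above can be sketched in a few lines of Python. This is a deliberately simplified model, not how OTP supervisors are implemented: the `Supervisor` class, `make_flaky_worker` and the plain restart counter are all invented for illustration, and there are no real processes, links or monitors involved.

```python
class Supervisor:
    """Restarts a failing worker up to max_restarts times, then escalates."""

    def __init__(self, worker, max_restarts):
        self.worker = worker
        self.max_restarts = max_restarts
        self.restarts = 0

    def run(self):
        while True:
            try:
                return self.worker()
            except Exception as crash:
                if self.restarts >= self.max_restarts:
                    # Give up: let the failure travel up to our own supervisor.
                    raise RuntimeError("supervisor gave up") from crash
                self.restarts += 1  # "restart" the worker by calling it again


def make_flaky_worker(failures):
    """Hypothetical worker that crashes `failures` times before succeeding."""
    state = {"left": failures}

    def worker():
        if state["left"] > 0:
            state["left"] -= 1
            raise ValueError("worker crashed")
        return "done"

    return worker


child = Supervisor(make_flaky_worker(failures=2), max_restarts=3)
print(child.run(), child.restarts)  # done 2
```

Because `run` is itself just a callable that may raise, a supervisor can be handed to another supervisor as its worker, which gives a crude version of the supervision tree.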
The complete isolation between process heaps is another reason Erlang fares well. With few exceptions, it is not possible to "share values" between processes. This means that all processes are very self-contained and are often not affected by another process crashing. This property also holds between nodes in an Erlang cluster, so it is low-risk to handle a node failing out of the cluster. Replicate and send out change events rather than have a single point of failure.
The philosophies adopted by Erlang have many names: "fail fast", "crash-only system", "recovery oriented programming", "expose errors", "micro-restarts", "replication", ...
Erlang is a language designed with concurrency in mind. While most languages depend on the OS for multi-threading, concurrency is built into Erlang. Erlang programs can be made from thousands to millions of extremely lightweight processes that can run on a single processor, can run on a multicore processor, or can run on a network of processors. Erlang also has language level support for message passing between processes, fault-tolerance etc. The core of Erlang is a functional language and functional programming is the best paradigm for building concurrent systems.
In short, making a distributed, reliable and scalable system in Erlang is easy as it is a language designed specially for that purpose.
In short, the "language" primarily affects the vertical axis of scaling but not all aspects, as you already alluded to in your question. Two things here:
1) Scalability needs to be defined in relation to a tangible metric. I propose money.
S = # of users / cost
Without an adequate definition, we will be discussing this point ad vitam aeternam. Using my proposed definition, it becomes easier to compare system implementations. For a system to be scalable (read: profitable), then:
Scalability grows with S
2) A system can be made to scale along 2 primary axes:
a) Vertical
b) Horizontal
a) Vertical scaling relates to enhancing nodes in isolation i.e. bigger server, more RAM etc.
b) Horizontal scaling relates to enhancing a system by adding nodes. This process is more involved since it requires dealing with real world properties such as speed of light (latency), tolerance to partition, failures of many kinds etc.
(Node => physical separation, different "fate sharing" from another)
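Plugging numbers into the S = # of users / cost definition from point 1 shows how it lets you compare the two axes. All figures below are hypothetical, chosen only to illustrate the arithmetic; they say nothing about any real system.

```python
def scalability(users, cost):
    """S = number of users served / cost of serving them."""
    return users / cost


# Hypothetical numbers, for illustration only.
baseline = scalability(users=10_000, cost=1_000)    # one node
vertical = scalability(users=18_000, cost=2_500)    # one bigger, pricier node
horizontal = scalability(users=19_000, cost=2_000)  # two baseline nodes, some coordination overhead

print(baseline, vertical, horizontal)  # 10.0 7.2 9.5
```

With these made-up figures neither option keeps S at its baseline value, which is the point of measuring against cost rather than against raw user counts.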
The term scalability is too often abused unfortunately.
Too many times folks confuse language with libraries & implementation. These are all different things. What makes a language a good fit for a particular system has often more to do with the support around the said language: libraries, development tools, efficiency of the implementation (i.e. memory footprint, performance of builtin functions etc.)
In the case of Erlang, it just happens to have been designed with real world constraints (e.g. distributed environment, failures, need for availability to meet liquidated damages exposure etc.) as input requirements.
Anyways, I could go on for too long here.
First you have to distinguish between languages and their implementations. For instance, the Ruby language supports threads, but in the official implementation, threads will not make use of multicore chips.
Then, a language/implementation/algorithm is often termed scalable when it supports parallel computation (for instance via multithreading) AND it exhibits a good speedup as the number of CPUs goes up (see Amdahl's Law).
Some languages like Erlang, Scala, Oz etc. also have syntax (or nice libraries) that helps in writing clear and nice parallel code.
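For reference, Amdahl's Law says that if a fraction p of the work can run in parallel on n CPUs, the speedup is 1 / ((1 - p) + p / n). A quick Python sketch (the 90% parallel fraction is just an example value) shows how fast the curve flattens:

```python
def amdahl_speedup(parallel_fraction, cpus):
    """Amdahl's Law: speedup = 1 / ((1 - p) + p / n)."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cpus)


# With 90% of the work parallelizable, extra CPUs help less and less.
for cpus in (1, 2, 8, 64):
    print(cpus, round(amdahl_speedup(0.9, cpus), 2))
```

However many CPUs are added, the speedup for p = 0.9 stays below 10, because the serial 10% never gets any faster.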
In addition to the points made here about Erlang (Which I was not aware of) there is a sense in which some languages are more suited for scripting and smaller tasks.
Languages like ruby and python have some features which are great for prototyping and creativity but terrible for large scale projects. Arguably their best features are their lack of "formality", which hurts you in large projects.
For example, static typing is a hassle on small script-type things, and makes languages like java very verbose. But on a project with hundreds or thousands of classes you can easily see variable types. Compare this to maps and arrays that can hold heterogeneous collections, where as a consumer of a class you can't easily tell what kind of data it's holding. This kind of thing gets compounded as systems get larger. e.g. You can also do things that are really difficult to trace, like dynamically add bits to classes at runtime (which can be fun but is a nightmare if you're trying to figure out where a piece of data comes from) or call methods that raise exceptions without being forced by the compiler to declare the exception. Not that you couldn't solve these kinds of things with good design and disciplined programming - it's just harder to do.
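A small Python sketch of the same point. Python does not enforce annotations at run time, so the typed version only helps a reader or an external checker such as mypy; the `Account` class and the sample data are made up for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Account:
    name: str
    followers: int


def total_followers_typed(accounts: List[Account]) -> int:
    # The signature says exactly what each element holds.
    return sum(account.followers for account in accounts)


def total_followers_untyped(accounts):
    # Works only if every dict happens to have a numeric "followers" entry;
    # nothing at the call site tells the reader that.
    return sum(account["followers"] for account in accounts)


typed = [Account("a", 10), Account("b", 5)]
untyped = [{"name": "a", "followers": 10}, {"name": "b", "followers": "5"}]

print(total_followers_typed(typed))  # 15
try:
    total_followers_untyped(untyped)
except TypeError as error:
    print("failed at runtime:", error)
```

In a ten-line script the second style is quicker to write; across hundreds of classes, finding out where that stray string came from is the hard part.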
As an extreme case, you could (performance issues aside) build a large system out of shell scripts, and you could probably deal with some of the issues of the messiness, lack of typing and global variables by being very strict and careful with coding and naming conventions ( in which case you'd sort of be creating a static typing system "by convention"), but it wouldn't be a fun exercise.
Twitter switched some parts of their architecture from Ruby to Scala because when they started they used the wrong tool for the job. They were using Ruby on Rails—which is highly optimised for building green field CRUD Web applications—to try to build a messaging system. AFAIK, they're still using Rails for the CRUD parts of Twitter e.g. creating a new user account, but have moved the messaging components to more suitable technologies.
Erlang is at its core based on asynchronous communication (both for co-located and distributed interactions), and that is the key to the scalability made possible by the platform. You can program with asynchronous communication on many platforms, but Erlang the language and the Erlang/OTP framework provide the structure to make it manageable - both technically and in your head. For instance: without the isolation provided by Erlang processes, you will shoot yourself in the foot. With the link/monitor mechanism you can react to failures sooner.

Digital Circuit understanding

In my quest for getting some basics down before I start going into programming I am looking for essential knowledge about how the computer works down at the core level.
I have a theory that actually understanding what, for instance, a stack overflow is (let alone a stack), instead of relying on my sporadic knowledge about computer systems, will help me in the longer term.
Are there any books or sites that take you through how processors are structured, give a holistic overview, and relate that to what is good to know about digital logic?
Am I making sense?
Yes, you should read some topics of
John L. Hennessy & David A. Patterson, "Computer Architecture: A Quantitative Approach"
It covers microprocessor history and theory (starting with RISC architectures such as MIPS), pipelining, memory, storage, etc.
David Patterson is a Professor of Computer Science in the EECS Department at U.C. Berkeley. http://www.eecs.berkeley.edu/~pattrsn/
Hope it helps.
Tanenbaum's Structured Computer Organization is a good book about how computers work. You might find it hard to get through the book, but that's mostly due to the subject, not the author.
However, I'm not sure I would recommend taking this approach. Understanding how the computer works can certainly be useful, but if you don't really have any programming knowledge, you can't really put your knowledge to good use - and you probably don't need that knowledge yet anyway. You would be better off learning about topics like object-oriented programming and data structures to learn about program design, because unless you're looking at doing embedded programming on very limited systems, you'll find those skills far more useful than knowledge of a computer's inner workings.
In my opinion, 20 years ago it was possible to understand the whole spectrum from BASIC all the way through operating system, hardware, down to the transistor or even quantum level. I don't know that it's possible for one person to understand that whole spectrum with today's technology. (Years ago, everyone serviced their own car. Today it's too hard.)
Some of the "layers" that you might be interested in:
http://en.wikipedia.org/wiki/Boolean_logic (this will be helpful for programming)
http://en.wikipedia.org/wiki/Flip-flop_%28electronics%29
http://en.wikipedia.org/wiki/Finite-state_machine
http://en.wikipedia.org/wiki/Static_random_access_memory
http://en.wikipedia.org/wiki/Bus_%28computing%29
http://en.wikipedia.org/wiki/Microprocessor
http://en.wikipedia.org/wiki/Computer_architecture
It's pretty simple really - the cpu loads instructions and executes them, most of those instructions revolve around loading values into registers or memory locations, and then manipulating those values. Certain memory ranges are set aside for communicating with the peripherals that are attached to the machine, such as the screen or hard drive.
Back in the days of Apple ][ and Commodore 64 you could put a value directly in to a memory location and that would directly change a pixel on the screen - those days are long gone, it is abstracted away from you (the programmer) by several layers of code, such as drivers and the operating system.
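The load/execute cycle and the "put a value in a memory location and the screen changes" idea fit in a toy simulator. Everything here is invented for illustration (the four-instruction set, the single register `A`, the `SCREEN_BASE` address); it is not modelled on the 6502 or any real CPU.

```python
SCREEN_BASE = 0xF0  # hypothetical: addresses from here up are "the screen"


def run(program, memory_size=256):
    """Execute (opcode, operand) pairs until HALT and return the memory."""
    memory = [0] * memory_size
    registers = {"A": 0}
    pc = 0  # program counter
    while True:
        opcode, operand = program[pc]  # fetch
        pc += 1
        if opcode == "LOAD":           # decode and execute
            registers["A"] = operand
        elif opcode == "ADD":
            registers["A"] += operand
        elif opcode == "STORE":
            memory[operand] = registers["A"]
        elif opcode == "HALT":
            return memory
        else:
            raise ValueError(f"unknown opcode {opcode}")


program = [
    ("LOAD", 40),           # put 40 in register A
    ("ADD", 2),             # A = A + 2
    ("STORE", SCREEN_BASE), # write A to the first "screen" cell
    ("HALT", 0),
]
memory = run(program)
print(memory[SCREEN_BASE])  # 42
```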
You can learn about this sort of stuff, or assembly language (which I am a huge fan of), or AND/NAND gates at the hardware level, but knowing this sort of stuff is not going to help you code up a web application in ASP.NET MVC, or write a quick and dirty Python or Powershell script.
There are lots of resources sprinkled around the net that will give you insight into how the CPU and the rest of the hardware works, but if you want to get down and dirty I honestly think you should buy one of those older machines off eBay or somewhere, and learn its particular flavour of assembly language (I understand there are also a lot of programmable PIC controllers out there that might be good to learn on). Picking up an older machine is going to eliminate the software abstractions and make things way easier to learn. You learn way better when you get instant gratification, like making sprites move around a screen or generating sounds by directly toggling the speaker (or using a PIC controller to control a small robot). With those older machines, the schematics for an Apple ][ motherboard fit on a roughly A2 size sheet of paper that was folded into the back of one of the Apple manuals - I would hate to imagine what they look like these days.
While I agree with the previous answers insofar as it is incredibly difficult to understand the entire process, we can at least break it down into categories, from lowest (closest to electrons) to highest (closest to what you actually see).
Lowest
Solid State Device Physics (How transistors work physically)
Circuit Theory (How transistors are combined to create logic gates)
Digital Logic (How logic gates are put together to create digital functions or digital structures i.e. multiplexers, full adders, etc.)
Hardware Organization (How the data path is laid out in the CPU, the components of a von Neumann machine -> memory, processor, Arithmetic Logic Unit, fetch/decode/execute)
Microinstructions (Bit level programming)
Assembly (Programming with words, but directly specifying registers; it takes forever to program even simple things)
Interpreted/Compiled Languages (Programming languages that get compiled or interpreted to assembly; the operating system may be in one of these)
Operating System (Process scheduling, hardware interfaces, abstracts lower levels)
Higher level languages (these kind of appear twice; it depends on the language. Java is done at a very high level, but C goes straight to assembly, and the C compiler is probably written in C)
User Interfaces/Applications/Gui (Last step, making it look pretty)
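As a taste of the Digital Logic layer in the list above, here is a full adder built from nothing but NAND gates, then chained into a 4-bit adder. It is simulated in Python with 0/1 integers standing in for signal levels; the function names are my own, and real hardware would of course evaluate all gates in parallel rather than one call at a time.

```python
def nand(a, b):
    """The only primitive gate: output is 0 only when both inputs are 1."""
    return 1 - (a & b)


def xor(a, b):
    t = nand(a, b)
    return nand(nand(a, t), nand(b, t))


def and_(a, b):
    return nand(nand(a, b), nand(a, b))


def or_(a, b):
    return nand(nand(a, a), nand(b, b))


def full_adder(a, b, carry_in):
    """Add three bits; return (sum bit, carry out)."""
    partial = xor(a, b)
    total = xor(partial, carry_in)
    carry_out = or_(and_(a, b), and_(partial, carry_in))
    return total, carry_out


def add_4bit(x, y):
    """Ripple-carry adder: chain four full adders, lowest bit first."""
    carry = 0
    result = 0
    for i in range(4):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result, carry


print(add_4bit(5, 6))  # (11, 0)
print(add_4bit(9, 8))  # (1, 1): 17 does not fit in 4 bits, so the carry is set
```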
You can find out a lot about each of these. I'm only somewhat expert in the digital logic side of things. If you want a thorough tutorial on digital logic from the ground up, go to the electrical engineering menu of my website:
affablyevil.wordpress.com
I'm teaching the class, and adding online lessons as I go.

Porting Wii and/or PSOne Games to OpenGL ES

I have been asked to investigate porting Wii games and some (Sony) PSOne games to OpenGL ES (can you guess what platform?).
I have never undertaken a game port like this before (and will be hiring someone to do it) but I'd like to understand the process.
Does the Wii use OpenGL? If not what does it use and how easy is it to port to OpenGL / OpenGL ES?
Are there any resources/books/blogs that will help me in understanding the process?
Will my company have to become an official Wii developer? If so where do I start that process?
Porting from the Wii or the PSOne is a complex and involved task that can be broken down into multiple separate engineering efforts working in parallel to produce a working end product. The best possible thing you can do before moving to the target hardware is to compartmentalize all of the non-portable code while ensuring that the game continues to run as expected. When you commit to moving to the new platform, your effort switches to reimplementing the non-portable compartmentalized parts.
So, to answer your question, yes, you will need to become or work with a Sony and Nintendo licensed developer in order to take this approach. In the case of Sony, I don't even know if they offer a PSOne development program anymore which presents issues. Your Sony account rep can help clarify.
The major subsystems that are likely to be the focus of your porting effort are:
Rendering: Graphics code contains fundamental assumptions about the hardware it is being run on in order to perform optimally. API-level compatibility is superficial compatibility and does not get you as much as you may hope it does. Plan on finding the entry point to the renderer and determining what data you need to render a scene and rewriting all the render code from there for your target hardware.
Game Saving: Game state serialization and archival will need to be separated out. Older games often fwrite() structs with #pragma packed fields. Is that still going to work for you?
Networking: Wii games write to high level services that are unavailable on your target hardware. At the low level, sockets are still sockets. What network services do your Wii games rely on?
Controls: From where you are coming from to where you are going, anything short of a full redesign or reimagining of input will result in poor reviews of the software.
Memory Management: Console games often make fundamental assumptions about the rate the system software returns memory from the heap, how much fragmentation it will cause and the duration the game needs to operate under these conditions. These memory management assumptions are obsolete on the new platform. It is wise to write your own memory manager that provides a cushion from the operating system. Also, console games compiled for release are stripped of most error handling and don't gracefully handle running out of memory-- just a heads up.
Content: Your bottleneck will be system memory. Can you fit the necessary assets into memory? With textures, you can reduce mip where necessary and with graphics hardware timing, you can pull in the far clipping plane. With assets resident in memory, you may need a technical artist to go through and reduce the face density of your models or an animation programmer to implement a more size-friendly animation codec. This is very game specific.
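As a rough illustration of the memory management point above, one common shape for a manager that cushions the game from the operating system is a fixed-size arena: reserve one block at startup, hand it out linearly, and release everything at once (per level, say). This is only a sketch. The Arena class and its method names are hypothetical and do not come from any console SDK or from the games in question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical fixed-size arena: one up-front reservation, bump-pointer
// allocation, and a single reset() instead of per-object frees.
class Arena {
public:
    explicit Arena(std::size_t capacity)
        : storage_(new std::uint8_t[capacity]), capacity_(capacity), used_(0) {}

    // Returns nullptr when the arena is exhausted, so the caller can handle
    // running out of memory explicitly instead of crashing.
    void* allocate(std::size_t size,
                   std::size_t alignment = alignof(std::max_align_t)) {
        // Alignment must be a non-zero power of two.
        assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
        const std::uintptr_t base =
            reinterpret_cast<std::uintptr_t>(storage_.get());
        const std::uintptr_t current = base + used_;
        const std::uintptr_t aligned =
            (current + alignment - 1) & ~static_cast<std::uintptr_t>(alignment - 1);
        const std::size_t padding = static_cast<std::size_t>(aligned - current);
        const std::size_t left = capacity_ - used_;
        if (padding > left || size > left - padding) {
            return nullptr;
        }
        used_ += padding + size;
        return reinterpret_cast<void*>(aligned);
    }

    // Releases every allocation at once, e.g. at a level transition.
    void reset() { used_ = 0; }

    std::size_t used() const { return used_; }
    std::size_t remaining() const { return capacity_ - used_; }

private:
    std::unique_ptr<std::uint8_t[]> storage_;
    std::size_t capacity_;
    std::size_t used_;
};
```

Because the arena never returns memory piecemeal, fragmentation inside it cannot grow over a long play session, which is the kind of assumption the original console code may have relied on.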
You also run into the standard set of problems with things like bit compatibility (though the Wii and PSOne are both 32-bit), compiler idiosyncrasies, build script incompatibilities and proprietary compiler extensions.
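To make the game-saving and compiler-idiosyncrasy points concrete: instead of fwrite()-ing a packed struct, whose layout depends on the compiler's packing rules and the CPU's byte order, a port can write each field explicitly in a fixed byte order. The sketch below assumes a made-up SaveData record; the field names, sizes and the choice of little-endian are illustrative assumptions, not the format of any real game.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical save record; the fields are invented for illustration.
struct SaveData {
    std::uint32_t level;
    std::uint16_t health;
    std::uint8_t  lives;
};

// Append the low `bytes` bytes of `value`, least significant byte first.
inline void putLE(std::vector<std::uint8_t>& out, std::uint32_t value,
                  std::size_t bytes) {
    for (std::size_t i = 0; i < bytes; ++i) {
        out.push_back(static_cast<std::uint8_t>((value >> (8 * i)) & 0xFFu));
    }
}

// Read `bytes` little-endian bytes starting at `pos`, advancing `pos`.
inline std::uint32_t getLE(const std::vector<std::uint8_t>& in,
                           std::size_t& pos, std::size_t bytes) {
    assert(pos + bytes <= in.size());
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < bytes; ++i) {
        value |= static_cast<std::uint32_t>(in[pos + i]) << (8 * i);
    }
    pos += bytes;
    return value;
}

// Field-by-field encoding: the on-disk layout no longer depends on struct
// padding, #pragma pack settings, or the host CPU's endianness.
std::vector<std::uint8_t> serialize(const SaveData& s) {
    std::vector<std::uint8_t> out;
    putLE(out, s.level, 4);
    putLE(out, s.health, 2);
    putLE(out, s.lives, 1);
    return out;
}

SaveData deserialize(const std::vector<std::uint8_t>& in) {
    std::size_t pos = 0;
    SaveData s{};
    s.level  = getLE(in, pos, 4);
    s.health = static_cast<std::uint16_t>(getLE(in, pos, 2));
    s.lives  = static_cast<std::uint8_t>(getLE(in, pos, 1));
    return s;
}
```

If existing save files must stay readable, the byte order and field widths would have to match whatever the original platform wrote, which is something to confirm from the original code rather than assume.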
Games are relatively challenging to test. A good rule of thumb is you want to have enough testers on your team to run through the game in a maximum of two days, covering all major aspects of play. In games that take a long time to beat (RPGs with 30+ hours of gameplay), your testing team needs to be quite large to offer full coverage. Because you are just doing a port, you can come up with a testing plan that maximizes coverage of your new code without having a testing team punch every wall in your game to make sure it (still) has clipping. The game shipped once.
Becoming a licensed developer requires you to apply. The turnaround time, from experience, is not good. Generally speaking, priority is given to studios with shipped titles and organized offices with reasonably good security and the ability to buy the (relatively) expensive development kits. You may be better off working with a licensed developer if you do not meet these criteria.
Console and game development is challenging for people already experienced in it. There is no book that covers it all. My recommendation is to attempt to recruit an expert who has experience shipping titles in a position of systems or engine programmer. What types of programmers and skillsets exist in games is a whole different question for Stack, though.
Game consoles don't use OpenGL but their own custom libraries. The main reason is that consoles are relatively slow and have little RAM, so you need to squeeze out every drop of performance you can get. And that means: custom code. Usually, you get a framework with the developer kit which gets you started, and then you build your code from that. Eventually, you'll start replacing parts from the developer kit with your own special code to get all the speed and special effects you need.
There is a reason why PSOne games are so ugly on the PS3 despite the fact that the developers have access to the sources: the revenue just doesn't justify touching the code.
Which is one reason why game development is so expensive: Every game is (more or less) a completely new product. Sometimes, game companies can reuse a bit of code from the last version but more often than not, they have to develop everything again. They also don't talk much with each other.
In recent years, kits have become more complex and powerful and you can get complete game engines (with all kinds of effects and 3D support) but each engine is a completely different kind of beast, so you can't even copy code from engine A to B.
Today, media content (video, audio and render sequences) is so expensive that the actual game engine is often a minor detail, so this isn't going to change any time soon.
Net result: If you want to port a game, write an emulator for the hardware (which is usually pretty simple and allows you to run all kinds of games).
[EDIT] To develop software for the Wii, see here: http://www.warioworld.com/
For a Wii emulator, see http://wiiemulator.net/
I ported a couple of games, when I was a new game programmer, from working with one version of our engine to a newer version (where backwards compatibility was neither ignored nor pursued). Even copying (and possibly renaming) the files and placing them in a home in the new project was a bit of work. Following that, the procedure was:
recompile
fix many of the hundreds of errors [in many places, with the same error occurring over and over again]
and
"wire up" calls from the new game engine to the appropriate calls in the old code
"wire up" function calls from the old code into the new game engine
deal with other oddities (ex. in the old game engine, the 2d game would "swizzle" textures itself; in the new version, the engine did it (on specific platforms))
and, while I don't recall this clearly, it was probably mixed in with a bunch of #ifdeffing out portions of code so the thing would actually compile, and possibly creating function stubs to be filled in later.
As I recall, it was three or four days until I had something that compiled. (But, it did help when we ported other games from the old version to the new one!)
The magnitude of the task will come down to what the code you are getting is like. If it has generic 3D calls that you can intercept -- add a thunking layer to -- then you are in business. It depends on the level of abstraction in the code. If it is well-behaved and has things like "RenderModel" and "RenderWorld" calls, you can replace those functions, and even the structures that they work with. If drawing is occurring all over the place, and calls are more like "Draw Polygon" and "Draw Line" or "Draw using this highly optimised data structure", then you are likely in for a long slog.
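A thunking layer of the kind described above might look roughly like this, assuming the old code really does funnel drawing through calls named RenderWorld and RenderModel. The signatures, the Renderer interface and the RecordingRenderer stand-in are all hypothetical; a real port would put OpenGL ES calls where this sketch only records what was requested.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical interface that the new platform's renderer implements.
class Renderer {
public:
    virtual ~Renderer() = default;
    virtual void renderWorld() = 0;
    virtual void renderModel(int modelId) = 0;
};

// Stand-in backend that only records calls; a real port would issue
// OpenGL ES calls here instead.
class RecordingRenderer : public Renderer {
public:
    void renderWorld() override { log.push_back("world"); }
    void renderModel(int modelId) override {
        log.push_back("model " + std::to_string(modelId));
    }
    std::vector<std::string> log;
};

// The thunks: free functions with the names the old code already calls
// (signatures assumed), forwarding to whichever backend is installed.
static Renderer* g_renderer = nullptr;

void SetRenderer(Renderer* renderer) { g_renderer = renderer; }

void RenderWorld() {
    assert(g_renderer != nullptr);
    g_renderer->renderWorld();
}

void RenderModel(int modelId) {
    assert(g_renderer != nullptr);
    g_renderer->renderModel(modelId);
}

// Stand-in for untouched legacy game code.
void LegacyDrawFrame() {
    RenderWorld();
    RenderModel(1);
    RenderModel(2);
}
```

The legacy code keeps calling the same function names, so only the thunks and the backend change when the target platform does.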
You shouldn't need a Wii dev kit. Sometimes it is nice to verify that the code you are given does indeed compile in the original environment (and matches the shipping code!), but sometimes you can just take it on faith and make it work in its new environment.
Lastly, I don't think the Wii uses OpenGL, and I really don't know where to point you for further help.
What you may want to do is to start with designing the architecture of the game, write up a detailed specification for what the new game is like.
Once you have this, since you will be rewriting the code, you may find that some of the business logic that doesn't deal with the console can be ported over. But, anything dealing with I/O, user interaction or graphics/sounds will be rewritten, so you might as well do that from scratch.
A specification is very important, to make certain that you know how the current game is working so that the new port will give the same user experience, if that is what is desired.
You may want to keep the same bugs, if that is part of the experience, as, if I know that in the Wii I can jump down and bounce off the wall to safely land, then if I can't do that in the new version then that may be bothersome.
Well, porting a PS1 game to an iPhone would be quite a task; they work in very different ways. I'm sure it's doable, but it will be a LOT of work to replace all the fixed-point maths and to work around the lack of Z-buffer-based rendering when moving to a real graphics chip.
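To give a feel for the fixed-point part of that work, here is a small sketch of the conversions involved. It assumes a signed 32-bit format with 12 fractional bits; the actual format used by any particular game has to be read out of its code, and all names here are made up for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Assumed format: signed 32-bit value with 12 fractional bits.
using Fixed = std::int32_t;
constexpr int kFracBits = 12;
constexpr Fixed kOne = 1 << kFracBits;  // 4096 represents 1.0

// Convert a legacy fixed-point value to the float a GPU API expects.
constexpr float fixedToFloat(Fixed v) {
    return static_cast<float>(v) / static_cast<float>(kOne);
}

// Convert back, rounding to the nearest representable fixed-point value.
inline Fixed floatToFixed(float f) {
    assert(std::isfinite(f));
    return static_cast<Fixed>(std::lround(f * static_cast<float>(kOne)));
}

// The kind of multiply legacy code performs: widen, multiply, shift back.
// Right-shifting a negative value is arithmetic on mainstream compilers
// (and guaranteed to be from C++20 on).
constexpr Fixed fixedMul(Fixed a, Fixed b) {
    return static_cast<Fixed>((static_cast<std::int64_t>(a) * b) >> kFracBits);
}
```

In practice the conversion is rarely this local: vertex data, matrices and timing values may all be stored in fixed point, so each has to be found and converted at a well-chosen boundary.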
Wii would be a lot easier. The Wii API is very similar to OpenGL. However the Wii has some very nice fixed function features that just are not available on any other GL based platform. Should be doable, though ...
I'm not really sure I can say anything more than that. Have signed far too many NDAs over the years to be 100% sure of what I can and cannot say ;)
Still if you want to hire someone to do some porting work and are prepared to supply the required hardware then I might be free ;)

Scaling multithreaded applications on multicored machines

I'm working on a project where we need more performance. Over time we've continued to evolve the design to work more in parallel (both threaded and distributed). The latest step has been to move part of it onto a new machine with 16 cores. I'm finding that we need to rethink how we do things to scale to that many cores in a shared memory model. For example, the standard memory allocator isn't good enough.
What resources would people recommend?
So far I've found Sutter's column in Dr. Dobb's to be a good start.
I just got The Art of Multiprocessor Programming and the O'Reilly book on Intel Threading Building Blocks.
A couple of other books that are going to be helpful are:
Synchronization Algorithms and Concurrent Programming
Patterns for Parallel Programming
Communicating Sequential Processes by C. A. R. Hoare (a classic, free PDF at that link)
Also, consider relying less on sharing state between concurrent processes. You'll scale much, much better if you can avoid it because you'll be able to parcel out independent units of work without having to do as much synchronization between them.
Even if you need to share some state, see if you can partition the shared state from the actual processing. That will let you do as much of the processing in parallel, independently from the integration of the completed units of work back into the shared state. Obviously this doesn't work if you have dependencies among units of work, but it's worth investigating instead of just assuming that the state is always going to be shared.
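As a minimal sketch of that partitioning idea: each worker below processes its own slice of the input into a private partial result and touches the shared state exactly once, to merge. The function name parallelSum and the choice of a simple sum as the "work" are placeholders for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <numeric>
#include <thread>
#include <vector>

// Sums `data` using `workers` threads. The per-element work happens on
// thread-private state; the mutex is taken once per worker, only to
// integrate the finished partial result into the shared total.
long long parallelSum(const std::vector<int>& data, unsigned workers) {
    assert(workers > 0);
    long long total = 0;    // the shared state
    std::mutex totalMutex;  // guards only the merge step
    std::vector<std::thread> pool;
    const std::size_t chunk = (data.size() + workers - 1) / workers;

    for (unsigned w = 0; w < workers; ++w) {
        const std::size_t begin = std::min(data.size(), w * chunk);
        const std::size_t end = std::min(data.size(), begin + chunk);
        pool.emplace_back([&data, &total, &totalMutex, begin, end] {
            const auto first = data.begin() + static_cast<std::ptrdiff_t>(begin);
            const auto last = data.begin() + static_cast<std::ptrdiff_t>(end);
            const long long local = std::accumulate(first, last, 0LL);
            std::lock_guard<std::mutex> lock(totalMutex);
            total += local;
        });
    }
    for (std::thread& t : pool) {
        t.join();
    }
    return total;
}
```

The lock is held for one addition per worker rather than once per element, so contention stays roughly constant as the data grows.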
You might want to check out Google's Performance Tools. They've released their version of malloc they use for multi-threaded applications. It also includes a nice set of profiling tools.
Jeffrey Richter is into threading a lot. He has a few chapters on threading in his books and check out his blog:
http://www.wintellect.com/cs/blogs/jeffreyr/default.aspx.
As Monty Python would say, "and now for something completely different": you could try a language/environment that doesn't use threads, but processes and messaging (no shared state). One of the most mature ones is Erlang (and this excellent and fun book: http://www.pragprog.com/titles/jaerlang/programming-erlang). It may not be exactly relevant to your circumstances, but you can still learn a lot of ideas that you may be able to apply in other tools.
For other environments:
.Net has F# (to learn functional programming).
JVM has Scala (which has actors, very much like Erlang, and is a hybrid functional language). Also there is the "fork join" framework from Doug Lea for Java which does a lot of the hard work for you.
The allocator in FreeBSD recently got an update for FreeBSD 7. The new one is called jemalloc and is apparently much more scalable with respect to multiple threads.
You didn't mention which platform you are using, so perhaps this allocator is available to you. (I believe Firefox 3 uses jemalloc, even on Windows. So ports must exist somewhere.)
Take a look at Hoard if you are doing a lot of memory allocation.
Roll your own Lock Free List. A good resource is here - it's in C# but the ideas are portable. Once you get used to how they work you start seeing other places where they can be used and not just in lists.
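For a flavour of what such a structure looks like in C++, here is a deliberately limited sketch: a lock-free stack that supports concurrent push and a popAll that detaches the whole list in one atomic exchange. It is not the list from the linked resource. Restricting removal to popAll is a simplifying assumption that sidesteps the ABA and memory-reclamation problems a per-element pop would have to solve.

```cpp
#include <atomic>
#include <cassert>
#include <utility>
#include <vector>

template <typename T>
class LockFreeStack {
public:
    LockFreeStack() = default;
    LockFreeStack(const LockFreeStack&) = delete;
    LockFreeStack& operator=(const LockFreeStack&) = delete;

    ~LockFreeStack() {
        popAll();  // frees any remaining nodes
        assert(head_.load() == nullptr);
    }

    // Safe to call from many threads at once.
    void push(T value) {
        Node* node = new Node{std::move(value),
                              head_.load(std::memory_order_relaxed)};
        // On failure, compare_exchange_weak reloads the current head into
        // node->next, so the loop simply retries with fresh data.
        while (!head_.compare_exchange_weak(node->next, node,
                                            std::memory_order_release,
                                            std::memory_order_relaxed)) {
        }
    }

    // Detaches every node with one atomic exchange and returns the values,
    // newest first. Nodes are never unlinked one at a time.
    std::vector<T> popAll() {
        Node* node = head_.exchange(nullptr, std::memory_order_acquire);
        std::vector<T> out;
        while (node != nullptr) {
            out.push_back(std::move(node->value));
            Node* next = node->next;
            delete node;
            node = next;
        }
        return out;
    }

private:
    struct Node {
        T value;
        Node* next;
    };
    std::atomic<Node*> head_{nullptr};
};
```

A typical use is as a multi-producer work inbox: many threads push, and one consumer periodically calls popAll and processes the batch.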
I will have to check out Hoard, Google Perftools and jemalloc sometime. For now we are using scalable_malloc from Intel Threading Building Blocks and it performs well enough.
For better or worse, we're using C++ on Windows, though much of our code will compile with gcc just fine. Unless there's a compelling reason to move to Red Hat (the main Linux distro we use), I doubt it's worth the headache/political trouble to move.
I would love to use Erlang, but there's way too much here to redo now. If we think about the requirements around the development of Erlang in a telco setting, they are very similar to our world (electronic trading). Armstrong's book is on my to-read stack :)
In my testing to scale out from 4 cores to 16 cores I've learned to appreciate the cost of any locking/contention in the parallel portion of the code. Luckily we have a large portion that scales with the data, but even that didn't work at first because of an extra lock and the memory allocator.
I maintain a concurrency link blog that may be of ongoing interest:
http://concurrency.tumblr.com

Resources