What free, low-overhead (statistical) profilers can one use under Linux?

Preferably from Ubuntu repositories.

Others have mentioned OProfile; for full-system statistical profiling on modern Linux installations, it does indeed rock.
The more venerable tool (which doesn't require kernel support and thus will work under older versions of Linux or even non-Linux operating systems) is GNU gprof, included in binutils (and thus doubtless already installed in your development environment).
To use gprof, compile and link your application with gcc's -pg option; a file called gmon.out is written when your program exits normally, and gprof can then be used to analyze it (for example, gprof ./yourprogram gmon.out).

A simple but effective technique is to run the program under GDB and interrupt it by hand. While the program is running, send it SIGINT by pressing Ctrl-C; GDB stops the program, and while it is halted you record the call stack (the bt command). Do this a number of times, say 10 or 20, while the program is being subjectively slow. This will give you a very good idea of where the time goes.
This method does not give you precise timing, but it does precisely locate the instructions, including call instructions, that cost the most time.
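The same idea can be tried without GDB. Here's a minimal sketch in Python, assuming you can run the slow code in a worker thread of your own program: a second thread wakes up every few milliseconds and records the worker's call stack, much like pressing Ctrl-C repeatedly. The names sample_stacks, hot_functions, slow_part and workload are made up for the illustration.

```python
import collections
import sys
import threading
import time


def sample_stacks(target, samples=20, interval=0.01):
    """Run target() in a worker thread and record its call stack up to `samples` times."""
    worker = threading.Thread(target=target)
    worker.start()
    stacks = []
    while worker.is_alive() and len(stacks) < samples:
        time.sleep(interval)
        frame = sys._current_frames().get(worker.ident)
        names = []
        while frame is not None:
            # Walk from the innermost frame outwards, keeping function names only.
            names.append(frame.f_code.co_name)
            frame = frame.f_back
        if names:
            stacks.append(tuple(reversed(names)))
    worker.join()
    return stacks


def hot_functions(stacks):
    """Count in how many samples each function appears somewhere on the stack."""
    counts = collections.Counter()
    for stack in stacks:
        counts.update(set(stack))
    return counts


# Hypothetical workload: slow_part() is where the time goes.
def slow_part():
    deadline = time.monotonic() + 0.5
    while time.monotonic() < deadline:
        pass


def workload():
    slow_part()


if __name__ == "__main__":
    stacks = sample_stacks(workload)
    for name, hits in hot_functions(stacks).most_common(5):
        print(f"{name}: on the stack in {hits} of {len(stacks)} samples")
```

A function that shows up in most of the samples is where the time is going, whether it is doing the work itself or calling something that does.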
See also: How can I profile C++ code running in Linux?

Sysprof is a good profiler, similar to OProfile, with a GTK GUI, and it is available in the Ubuntu repository. It is a kernel-level profiler that, unlike gprof, requires a kernel module; also unlike gprof, it can profile multithreaded applications.

There is OProfile. It is not that difficult to use, but is somewhat buggy.

I've had good success with oprofile (http://oprofile.sourceforge.net/news/) which is available in Ubuntu repositories as well. It doesn't require recompilation, and doesn't have any limitations regarding shared objects or the like.


Is there a way to build CLI with no dependencies required?

Recently I thought about scaffolding a little CLI with Ruby, but was concerned about using it on a machine with no Ruby installed. I've searched for examples of popular CLIs and found that the Docker CLI is built with the Go language. I'm able to use this CLI on my computer with no Go installed. How can one build a tool that will not require you to install Ruby?
My guess is that there's a build process involved and it might be compiled to something present on most systems, like shell or something. Sorry if this is a lame guess/question!
(note: this is not a detailed answer, just a summary of how it works)
CLI programs are just like other programs; there is nothing special about them.
Go is a compiled language: a program called a "compiler" takes the Go code and translates it directly into machine code, following the conventions imposed by the operating system. The result is a plain executable that does not need Go installed in order to run. The main advantage is that it is self-contained, but you have to compile it separately for every architecture (32-bit x86, ARM, ...) and operating system (Windows, Linux, macOS) you want to support. It is the operating system that takes care of redirecting input and output on the command line.
Ruby, instead, is interpreted. There is a program called the "Ruby interpreter" which reads your code and executes it on the fly. It's a different, more "high-level" approach.
The advantage is that you don't need to recompile the code. However, the Ruby interpreter itself is a program written in a compiled language, and it has to be installed on every machine where your tool runs.
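To make that dependency concrete, here's a small sketch in Python, which stands in for Ruby here as another interpreted language. It shows that what actually gets executed is the interpreter binary, with the script merely handed to it as input. The helper names runtime_available and run_with_interpreter are made up for the illustration.

```python
import shutil
import subprocess
import sys


def runtime_available(name):
    """Return True if an executable called `name` can be found on the PATH."""
    return shutil.which(name) is not None


def run_with_interpreter(source):
    """Run `source` by passing it to the interpreter binary and return its output."""
    result = subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # The script text does nothing on its own: a separate program must interpret it.
    print(run_with_interpreter("print('hello from an interpreted script')"), end="")
    # A tool written in Ruby needs this lookup to succeed on every target machine.
    print("ruby on PATH:", runtime_available("ruby"))
```

A compiled Go binary skips that lookup entirely, which is why the Docker CLI runs on a machine with no Go installed.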

Tool displaying processor or core assignment for process in Windows?

In Windows 7, is there a tool that will let me see the CPU/core on which a process ran during a recent timeslice? I need to demonstrate that the threads of a particular application's process can, and do, land on different processors/cores in a multi-processor/core environment with default scheduling behavior.
Intel VTune for Windows may be what you're looking for.
As for the point you're trying to demonstrate, the answer is almost certainly yes, but it will depend on what else is happening in the system. You can of course take control of which core(s) a thread runs on using the affinity API routines (SetThreadAffinityMask, for example), but you have to work really hard to beat the OS's own judgement.
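If the application is one you can modify, or you only need a demonstration of the principle, a thread can also ask which processor it is currently running on. Below is a rough sketch in Python using ctypes. On Windows it calls GetCurrentProcessorNumber from kernel32; elsewhere it assumes a C library that exports sched_getcpu, as glibc on Linux does. The function names current_cpu and sample_cpus are made up for the illustration.

```python
import ctypes
import sys
import time


def current_cpu():
    """Return the index of the CPU the calling thread is running on right now."""
    if sys.platform == "win32":
        # kernel32 call, available on Windows Vista and later.
        return int(ctypes.windll.kernel32.GetCurrentProcessorNumber())
    # Assumption: the C library exports sched_getcpu (glibc on Linux does).
    return int(ctypes.CDLL(None).sched_getcpu())


def sample_cpus(samples=50, pause=0.01):
    """Record which CPU this thread is on at regular intervals."""
    seen = []
    for _ in range(samples):
        seen.append(current_cpu())
        time.sleep(pause)  # sleeping gives the scheduler a chance to move the thread
    return seen


if __name__ == "__main__":
    seen = sample_cpus()
    print("CPUs visited:", sorted(set(seen)))
```

Whether more than one CPU shows up depends on the load on the machine at the time, so a single run that stays on one core proves nothing either way.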
Under Solaris there's DTrace, and Linux has a comparable tracing facility called ftrace. I've used ftrace and it does exactly what you want. It might be worth Googling around for a DTrace equivalent for Windows. The Windows Performance Toolkit might be just that.

Set up a development environment on Linux targeting Linux and Windows

For a university course I have to write an HTTP server which is supposed to run on both Linux and Windows.
I have a humble Linux machine which I don't think can handle any kind of heavy virtual environment, nor am I willing to go through the hassle of installing one.
This is the first project of mine complex enough (I estimate ~1.5 months to develop) to require an environment sufficiently comfortable to alternate rapidly between short coding and testing sessions (the latter on both platforms, of course).
So, I was wondering what the best setup for this situation could be. I think testing it on Wine would be OK (it is not a real-world thing, after all), and I installed MinGW for the Windows-targeting part.
Basically, a simple well-written makefile could solve my problem... It should build both the Linux and Windows binaries and place them in the respective folders (the Windows one in the Wine sub-tree) and I'm all done! But I feel very inexperienced in this thing and I really don't know where to start. Maybe the make manual, ahah!:)
Thoughts, suggestions, anything I didn't think/know!
Thank you!
(PS. I'm planning to use emacs as editor, or maybe learn vim. Unless Eclipse provides some kind of skynet-like plugin that entirely solves this problem...:)
You're on the right track. It's not that complicated, really, thanks to MinGW. You basically need two things:
1. The code has to be portable across the OSes. MinGW has some POSIX support, but you'll probably need to either use Cygwin in order to be able to use the POSIX interface or have your own compatibility layer for interfacing with the OS. I'd probably go for Cygwin as then you can code only against POSIX and won't have to test and debug your compatibility layer. Also, make sure you don't use any external libraries that are OS specific. Non-portable code often results in a compile error, but make sure you test the application thoroughly anyway.
2. The toolchains for targeting Linux and Windows. You already have them, you just need to use them correctly. Normally you'd use a variable like $(CROSS_COMPILE) as a prefix when calling the toolchain during cross compilation. So when compiling for Linux, you call gcc, ld, etc. (leaving the CROSS_COMPILE variable empty), and when compiling for Windows you call e.g. i486-mingw32-gcc, i486-mingw32-ld etc., i.e. CROSS_COMPILE=i486-mingw32-. Or just define CC, LD etc. depending on the target.
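The prefix convention from the second point can be sketched in a few lines. This is written as a small Python build driver only so that it can be run as-is; in a Makefile the same thing would be done with $(CROSS_COMPILE) in front of the tool name. The file name server.c, the program name httpd and the build/ layout are assumptions for the example, and the MinGW prefix on your distribution may well differ from i486-mingw32-.

```python
import os
import subprocess

# Assumed targets: an empty prefix means the native toolchain.
TARGETS = {
    "linux": {"cross_compile": "", "exe_suffix": ""},
    "windows": {"cross_compile": "i486-mingw32-", "exe_suffix": ".exe"},
}


def compile_command(target, sources, name="httpd"):
    """Build the compiler command line for one target."""
    cfg = TARGETS[target]
    cc = cfg["cross_compile"] + "gcc"  # same role as $(CROSS_COMPILE)gcc in make
    output = f"build/{target}/{name}{cfg['exe_suffix']}"
    return [cc, "-Wall", "-o", output, *sources]


def build_all(sources, dry_run=True):
    """Print the build command for every target, and run it unless dry_run is set."""
    for target in TARGETS:
        cmd = compile_command(target, sources)
        print(" ".join(cmd))
        if not dry_run:
            os.makedirs(os.path.dirname(cmd[3]), exist_ok=True)
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    build_all(["server.c"])  # dry run: only shows what would be executed
```

The point is that the only thing that changes between the two targets is the prefix and the output name; the sources and flags stay the same.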
I wrote a small game on Linux and made it run on Windows as well. If you browse the code, you can see the code has next to no #ifdef jungle (basically just some extra debugging features enabled for Linux), and the Makefile is simple as well, with no complicated handling for cross-compilation, just the possibility to override CC etc. like it should be. As lots of important open source software is written this way (especially software that's used by the desktop and embedded devices), you should also be able to find lots of other examples on how to set up the build environment correctly.
As for testing the application on Windows, I think the best option is if you can find a real Windows machine somehow. If you do everything correctly, it should run the same as on Linux and you won't need to continuously test your application on both OSes. If testing on a Windows machine is not possible, a VM would be the next best choice, though it would probably be more difficult to set it up. Wine is a good backup plan, but I don't think you can be sure your application works well on Windows if you only tested it on Wine.

How are operating systems debugged?

How are operating systems typically debugged? They cannot be stepped through with a debugger like simple console programs, and the build times are too large to repeatedly make small changes and recompile the whole thing.
They aren't debugged as multi-gigabyte programs! :)
If you mean the individual user-mode components, they can mainly be debugged just like normal programs and libraries (because they are normal programs/libraries!).
For kernel-mode components, though, each OS has its own mechanism; here is some information regarding the way that we do kernel debugging in Windows. It can be done with the help of another machine connected to the machine you're debugging, via a serial port or something similar. I'm not familiar with the process itself, but that's the gist of how it works. (You need to set some boot loader options so that the system is ready for the debugger to be connected as early as possible.)
It depends on which part of the operating system you're talking about. When I worked at MSFT, I worked on the IE team. We debugged IE and the shell (Windows Explorer) in Visual Studio and stepped through them line by line all day long. Though, sometimes, it's easier to debug using a command line tool such as NTSD.
If, however, you want to debug anything in Kernel land such as the OS kernel or device drivers, which I suspect is really what you're asking, then you must use the Kernel debugger. For Windows that is a command line tool called kd, and generally you run the debugger on one machine and remotely debug the target.
There has been a whole set of techniques throughout history, from flashing lights on the console, to the use of hardware devices like an ICE, to more modern techniques utilizing fairly standard debuggers. One technique that is more common among OS developers than application developers is the analysis of a core dump. Look at something like mdb on Solaris for ideas about how Solaris kernel developers do some of their debugging. Tracing technologies are also used, anywhere from fairly straightforward logging packages to more modern techniques like DTrace.
Also note that the techniques used depend on the layer of software. Initial boot tends to be a fairly hard place to get your fingers into. But after that the environment of modern operating systems looks more and more like the application setting you are used to. In the end, it is all code :)

Debugging an Operating System

I was going through some general stuff about operating systems and hit upon a question. How does a developer debug when developing an operating system, i.e. debug the OS itself? What tools are available to the OS developer for debugging?
Debugging a kernel is hard, because you probably can't rely on the crashing machine to communicate what's going on. Furthermore, the code that is wrong is probably in scary places like interrupt handlers.
There are four primary methods of debugging an operating system of which I'm aware:
Sanity checks, together with output to the screen.
Kernel "Oops" messages and panics on Linux are a great example of this. The Linux folks wrote a function that prints out whatever it can find out (including a stack trace) and then, in the case of a panic, stops everything.
Even warnings are useful. Linux has guards set up for situations where you might accidentally go to sleep in an interrupt handler. The mutex_lock function, for instance, will check (in might_sleep) whether you're in an unsafe context and print a stack trace if you are.
Traditionally, when debugging, everything the machine under test does is sent over a serial line to a stable test machine. With the advent of virtual machines, you can now wire one VM's serial line to another program on the same physical machine, which is super convenient. Naturally, however, this requires that your operating system publish what it is doing and wait for a debugger connection. KGDB (Linux) and WinDbg (Windows) are debuggers of this kind. VMware supports this setup explicitly.
More recently, the VM developers have figured out how to debug a kernel without either a serial line or kernel extensions. VMware has implemented this in their recent products.
The problem with debugging in an operating system is (in my mind) related to the Uncertainty principle. Interrupts (where most of your hard errors are sure to be) are asynchronous, frequent and nondeterministic. If your bug relates to the overlapping of two interrupts in a particular way, you will not expose it with a debugger; the bug probably won't even happen. That said, it might, and then a debugger might be useful.
Deterministic Replay
When you get a bug that only seems to appear in production, you wish you could record what happened and replay it, like a security camera. Thanks to a professor I knew at Illinois, you can now do this in a VMware virtual machine. VMware and related folks describe it all better than I can, and they provide what looks like good documentation.
Deterministic replay is brand new on the scene, so thus far I'm unaware of any particularly idiomatic uses. They say it should be particularly useful for security bugs, too.
Moving everything to User Space.
In the end, things are still more brittle in the kernel, so there's a tremendous development advantage to following the Nucleus (or Microkernel) design, where you shave the kernel-mode components to their bare minimum. For everything else, you can use the myriad of user-space dev tools out there, and you'll be much happier. FUSE, a user-space filesystem extension, is the canonical example of this.
I like this last idea, because it's like you wrote the program to be writeable. Cyclic, no?
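The sanity-check idea from the first method is easy to try out in user space. Here's a rough Python sketch of a guard in the spirit of might_sleep: code marks a region where blocking is forbidden, and any function that could block checks for that and prints a stack trace. This is only an analogue for illustration, not how the kernel implements it; atomic_context and lock_that_may_block are made-up names, and might_sleep is borrowed from Linux only as a label.

```python
import contextlib
import sys
import threading
import traceback

_state = threading.local()  # per-thread flag, so threads don't affect each other


@contextlib.contextmanager
def atomic_context():
    """Mark the enclosed code as a place where blocking is not allowed."""
    previous = getattr(_state, "atomic", False)
    _state.atomic = True
    try:
        yield
    finally:
        _state.atomic = previous


def might_sleep(out=None):
    """Print a warning and a stack trace if called where blocking is forbidden."""
    if not getattr(_state, "atomic", False):
        return False
    if out is None:
        out = sys.stderr
    print("warning: blocking call attempted in atomic context", file=out)
    traceback.print_stack(file=out)
    return True


def lock_that_may_block(lock):
    """Take a lock, running the sanity check before the call that could block."""
    might_sleep()
    lock.acquire()


if __name__ == "__main__":
    lock = threading.Lock()
    with atomic_context():
        lock_that_may_block(lock)  # reports the offending call chain on stderr
    lock.release()
```

The check costs almost nothing when the context is fine, and when it is not, the stack trace points straight at the caller that got it wrong.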
In a bootstrap scenario (OS from scratch), you'd probably have to introduce remote debugging capabilities (memory dumping, logging, etc.) in the OS kernel early on, and use a separate machine. Or you could use a virtual machine/hypervisor.
Windows CE has a component called KITL - Kernel Independent Transport Layer. I guess the title speaks for itself.
You can use a VM, e.g. debug ring0 code with bochs/gdb,
or Debugging NetBSD kernel with qemu,
or a serial line with something like KDB.
printf logging
attach to process
serious unit tests
Remote debugging with kernel debuggers, which can also be done via virtualization.
Debugging an operating system is not for the faint of heart. Because it is the kernel itself that is being debugged, your options are quite limited. A copious amount of printf statements is one trick. Beyond that, it really depends on which part of the 'operating system' is being debugged; we could be talking about
Memory management
Raw Disk input/output
Screen input/output
Again, it is a widely varying exercise, because all of the above interact with one another. Things get even more complicated when you debug the kernel itself: how would you do it if the runtime environment is not yet properly set up? (By that I mean the kernel's responsibility for loading binary executables.)
Some kernels (not all of them) incorporate a simple debug monitor. If I recall correctly, in the book 'Developing Your Own 32-Bit Operating System' by Richard A. Burgess (Sams Publishing), the author built in a debug monitor which displays various states of the CPU, the registers and so on.
Again, take into account that binary executables, a gdb equivalent for example, require a certain loading mechanism; if the environment for loading binaries is not set up, your options are quite limited.
Using a copious amount of printf statements to send errors, logs, etc. to a separate terminal or to a file is the best line of debugging. It does sound like a nightmare, but it is worth the effort.
Hope this helps,
Best regards,

