Adding a new command to the Linux kernel

I have searched the web for a long time without success. How can I add a new command to the kernel command line? For example, a command like ver, with no arguments, that simply prints a "Hello World" message when executed.

The Linux kernel does not handle any commands (though the GRUB bootloader can pass some arguments to the booted kernel); it only handles system calls. You could add a new one, but learn about advanced Linux programming and syscalls(2) before hacking the kernel.
Perhaps you want to add a new command available to your shell. That is quite simple: put a new executable file (either an ELF executable, e.g. compiled from C code, or some script) in a directory listed in your PATH. You might also define a new shell function.
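For instance, a minimal ver command can be a one-line script placed on your PATH; no kernel change is involved (the ~/bin location below is just a common convention):

```shell
# Create a personal bin directory and a "ver" script inside it
mkdir -p "$HOME/bin"
printf '#!/bin/sh\necho "Hello World"\n' > "$HOME/bin/ver"
chmod +x "$HOME/bin/ver"

# Make the directory known to the shell's PATH, then run the new command
export PATH="$HOME/bin:$PATH"
ver
```

From now on, any shell with that PATH entry can run ver like any other command.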
The system call to create a new process is fork(2) (sometimes emulated through an appropriate call to clone(2)...). fork is difficult to understand, so take several hours to read about it. Almost every process (with a few exceptions, notably /sbin/init) is created by fork. The system call to execute a new program is execve(2). Your shell will often use fork & execve to run commands (except built-in ones, like cd). Use strace(1) to see which system calls a given program makes.
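You can see the effect of fork from the shell itself: an external command runs in a child process with its own PID, distinct from the shell's. A small sketch, assuming only POSIX sh:

```shell
# $$ expands to the PID of the current shell
echo "shell pid: $$"

# The external sh is fork'ed and execve'd, so it reports a different PID
# (single quotes keep $$ from being expanded by the parent shell)
sh -c 'echo "child pid: $$"'
```

Running the same line under strace -f would show the underlying clone/execve system calls.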
PS. Don't dare hacking your kernel before understanding quite well user-land Linux system programming....

Related

Difference between compiling code and executing code

So I write a program in some language, LanguageX, using a simple text pad.
Then I put that text into a compiler. The compiler outputs machine code (or assembly, which is then assembled into machine code).
My question is, who actually executes the compiled program.
Does the compiler execute it? Or do I need another "executor app" to execute it?
Or does the hardware execute the program directly? But who orders the hardware to do that?
I'm confused because the concepts of compiling a program, and executing a program, seem to be used interchangeably.
An example is HTML. I can write html code in a text file and save it as .html, open it with Firefox, and it will run. Is Firefox a compiler, an executor, both, neither?
Another example is a commercial app I buy and install. Whenever I click on the .exe, is the app compiled or executed? Both?
A program is data that describes how to perform a computation. You can read the program yourself to get a sense of what it should do, or give it to another program that can execute it. When a program is executed directly from its source, it is said to be "interpreted".
For example, your browser is interpreting HTML to render a page. When there is Javascript associated with a page, this is loaded and executed by a Javascript interpreter that has access to your page and its elements. The Javascript interpreter is part of your browser program, and your browser is a program that is executed by a processor.
A compiler is a program that transforms source code into another language: typically instructions that can be decoded by your CPU, but possibly bytecode that is not directly executable by a processor, only by a virtual machine (another program that knows how to interpret the bytecode).
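The difference is easy to see from a terminal. Here sh acts as the interpreter, executing the source text directly; a compiler such as cc would instead translate a source file into a separate machine-code executable that the CPU runs later (file names below are illustrative):

```shell
# Write a tiny program in the shell language
printf 'echo "Hello from the interpreter"\n' > hello.sh

# No compilation step: the interpreter (sh) reads and executes the source
sh hello.sh

# By contrast, a compiler would first produce a separate executable, e.g.:
#   cc hello.c -o hello && ./hello
```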
For some languages the compilation phase also involves a step called linking, but the resulting file is basically a bit of metadata plus a sequence of instructions your processor can understand.
In order to execute a program, you ask (through a shell or the graphical interface) your operating system to load the program: the kernel allocates resources for your process and puts the code in a portion of memory that is flagged as executable (there are many more details to it).
The kernel keeps track of processes and executes them on one or more processors. The code is fed directly to a processor, which can decode the instructions produced by compilation. Periodically, a process is suspended by the kernel to let other processes run (when the process is waiting for something, or due to a hardware interrupt). When you have multiple processors, multiple programs can execute truly in parallel.
See for example Linux Kernel Teaching for a lot more details.

CPU with more than 2 modes

I'm new to operating systems, so please don't blast me.
I studied the user mode and the kernel mode of the CPU. I've just discovered that some CPUs have intermediate modes between user and kernel mode.
But why is this necessary? Is it dangerous to always use kernel mode for privileged instructions? Or is it a matter of performance?
The VAX/VMS system is one that used four modes. VMS works quite differently from Eunuchs. In Eunuchs variants you have a shell process. Each time you run a program you create a new process. In fact, the shell in Eunuchs variants is just a program with nothing special about it.
In VMS, the command interpreter exists in the same process as the running program. Programs can (and often do) interact with the command interpreter. When your program ends, the command interpreter takes back control. Run another program and you remain in the same process, with a new executable loaded.
The command interpreter runs in "supervisor mode", which is one level higher than user mode. It is thus protected from user-mode code messing with it. At the same time, any bugs in the command interpreter will not cause the system to crash.
Also the debugger exists in supervisor mode within the process it is debugging.
People brought up under Windoze and Eunuchs cannot appreciate how primitive their file handling is. VMS, like most real non-toy operating systems, has multiple file structures. It supports stream files like Eunuchs and Windows. However, it also supports sequential file structures, fixed-record file structures, and files indexed on keys. The system services for managing these run in executive mode (above supervisor and below kernel). Again, that allows having protected system services that will not crash the entire operating system.
I should also mention that non-toy operating systems support file versions. If you open a document, edit it, and save it, you create a new version of the file with the same name. If you make a misteak or eror you can go back and fix it.
The general answer to your question is that these additional modes give the operating system a way to provide interfaces to services that are protected from users messing with them, yet will not bring down the entire operating system when something goes wrong.
Ideally, an operating system would do as little as possible in kernel mode. When you have operating systems that are quick and dirty and do very little, they just use kernel mode.

Is running a Windows .EXE inherently slower via Cygwin than via BAT?

We are running a Windows .EXE file via a Cygwin script and are encountering performance issues. I have seen various posts about Cygwin performance issues, including this one, one of whose answers delves enough into Cygwin internals to make me think there could be a problem. However, these posts do seem to be more about startup time, tab completion, etc. Before we launch on a benchmarking witch hunt, I was hoping to ask: is there any intrinsic reason why a Windows .EXE could run slower if kicked off from Cygwin vs. BAT?
Not the actual program, no.
Housekeeping before running the program may vary. Cmd probably calls CreateProcess directly. Cygwin's bash may first do argument parsing and wildcard expansion, fork via Cygwin's slow fork implementation, and call exec with the parsed arguments, which Cygwin has to piece back together into a single string to pass to CreateProcess. But in the end, a new process is created which has no ties to its parent anymore. So how fast your program runs depends entirely on that program, not on who launched it.

How to speed up Cygwin?

I have been running drush scripts (for Drupal) with Cygwin on my relatively fast Windows machine, but I still have to wait about a minute for any drush command (specifically drush cache clear) to execute.
I'm quite sure it has something to do with the speed of Cygwin since my fellow developers (who are running Linux) can run these scripts in about 5 seconds.
Is there a way to make Cygwin use more memory and/or CPU per terminal?
The problem you're running into is not some arbitrary limit in Cygwin that you can make go away with a settings change. It's an inherent aspect of the way Cygwin has to work to get the POSIX semantics programs built under it expect.
The POSIX fork() system call has no native equivalent on Windows, so Cygwin is forced to emulate it in a very inefficient way. Shell scripts cause a call to fork() every time they execute an external process, which happens quite a lot since the shell script languages are so impoverished relative to what we'd normally call a programming language. External programs are how shell scripts get anything of consequence done.
There are other inefficiencies in Cygwin, though if you profiled it, you'd probably find that this is the number one speed hit. In most places, the Cygwin layer between a program built against it and the underlying OS is pretty thin. The Cygwin developers take great pains to keep the layer as thin as possible while still providing correct POSIX semantics. The unusual thickness of the fork() emulation is unavoidable short of Microsoft adding a native fork()-type facility to their OS, and their incentives to do that aren't very good.
The solutions posted above as comments aren't bad.
Another possibility is to go through the drush script and see if there are calls to external programs you can replace with shell intrinsics or more efficient constructs. I wouldn't expect a huge speed improvement by doing that, but it has the nice property that you'll speed things up on the Linux side as well. (fork() is efficient on Linux, but starting external programs is still a big speed hit that you may not have to pay as often as you currently do.) For instance:
numlines=`grep somepattern $somefile | wc -l`
if [ $numlines -gt 0 ] ; then ...
would run faster as:
if grep -q somepattern $somefile ; then ...
The first version is arguably clearer, but it requires at least three external program invocations, and with primitive shells, four. (Do you see all of them?) The replacement requires only one external program invocation.
Also look at things that slow down Cygwin startup:
Trim down your Windows PATH (to the bare bones like %SystemRoot%\system32;%SystemRoot%)
Remove things you don't need from bashrc and bash_profile
Move things you only need in your terminal window from bashrc to bash_profile
One surprisingly large time suck in Cygwin is Bash completion. If you are using it (and you should because it's great), only source completion for the commands you need (rather than all of them which used to be the default). And, as mentioned above, source them from bash_profile, not bashrc.
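As a sketch of selective sourcing (the completion-script directory varies between Cygwin versions, so /etc/bash_completion.d here is an assumption; check your installation), a .bash_profile fragment might look like:

```shell
# ~/.bash_profile (sketch): load completion only for the commands you use,
# instead of sourcing the entire completion collection at startup
for cmd in git make; do
    [ -r "/etc/bash_completion.d/$cmd" ] && . "/etc/bash_completion.d/$cmd"
done
```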
You can give Cygwin a higher priority.
Write a new batch file, for example, "cygstart.bat" with the following content:
start "Cygwin" /high C:\cygwin\Cygwin.bat
The /high switch gives the shell a higher process priority.

chdir programmatically

In Windows -- and probably Unix for that matter -- calling the chdir() function in a (32-bit) program doesn't change the shell's current directory once the program exits. (It does in a 16-bit Windows program.)
Does anybody know how to do that in a Windows 32-bit program?
Uhm... IMHO that's exactly one of the things the OS must guarantee will not happen. The current directory is a per-process property; a child process usually inherits it from the parent process, but the reverse should not happen (and it doesn't).
To obtain what you want, the parent could actively watch some channel (a message, a file, shared memory...) in which the child process stores the new directory, and then call chdir() with the new value.
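In a Unix shell this idea is a one-liner: the child prints the target directory and the parent (the shell) performs the cd itself. A sketch, where pickdir stands in for your hypothetical directory-choosing program:

```shell
# Hypothetical child program that computes and prints a directory name
printf '#!/bin/sh\necho /tmp\n' > pickdir
chmod +x pickdir

# The parent shell captures the child's output and changes ITS OWN directory
cd "$(./pickdir)"
pwd
```

The child never changes anyone's directory; only the parent's cd (a shell built-in, i.e. a chdir() in the parent's own process) persists.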
As far as I know, Windows' cmd.exe doesn't have any mechanism like that.
Actually, by using code-injection techniques (e.g. CreateRemoteThread) on the parent process it might be possible to force it to do something like this, but it's a very dirty trick, neither clean nor general.
Win16 was different: there was a single "msdos" state for all the programs, but it was a limitation, not a feature.
It sounds like you're asking one process (your Win32 program) to change the CWD of another process (your shell). As far as I know, this is impossible without the latter process providing an API for such a purpose. The nearest I can come to any sort of reference for this assertion, however, is the following quote from MSDN:
A parent process can directly alter the environment variables of a child process during process creation. This is the only situation when a process can directly change the environment settings of another process.
Well yeah it's true the popular API calls to change directory change it for the process. ... BUT ...
(1.) 16-bit windows programs can change the global directory; probably because they run in the same process as the command.com thing. That's what I've been happily using for years; I assume XP somehow emulates this? ... But now Windows 7 64-bit won't run 16-bit programs anymore! (?)
(2.) Both the Windows and Unix "cd" commands can of course change the directory for the calling process -- presumably because they are built-in commands of the command shell. Later Windows shells manage to accomplish this too, or at least I hope PowerShell can. All built-ins?
(3.) The way I've wound-up doing it is modifying my programs that used to call the API to simply emit "cd \dst\directory" to stdout, then in a procedure do
chdirprogram >t~.bat
call T~.bat
Which works great. And of course the usual point of a change-directory program is to provide this functionality in a batch procedure with a computed destination. You can do that in Unix with Bash variables etc., but not in Windows batch files, although maybe in the numerous successor Windows scripting things, which I don't want to use. ... Since this functionality is obviously useful, I was hoping someone knew of a sneaky Windows call that would do it. The explanation that it's somehow wrong for a process to change the directory of its calling process is one of those bogus "you're not supposed to do that and I won't tell you why" excuses. ... But I guess I'll just stick to my pitiful little batch files.
Are you talking about the SetCurrentDirectory function of the Windows API? The article says that the function "changes the current directory for the current process". In Delphi, for instance, there is a function ChDir that actually calls this API function.