When and where is a process named in Linux - linux-kernel

I've been trying to follow the flow of process creation on Linux.
So far, I've put in a few debug printk calls to understand PID allocation in the Linux kernel.
However, now I wish to map PIDs to binaries as they are being created (or executed).
I know that Linux creates processes by forking off init and then doing an exec, or by doing an exec directly from init.
I'm trying to trace when and where the comm field of the new task_struct is filled in.
The comm field stores the name of the binary being executed.
So far, no matter where I print the comm field (except in the context_switch function), every process reports its name as khelper.
I've debugged the do_execve function extensively, but it doesn't seem to contain any code that changes the comm field.
Could someone point out where and when the comm field is assigned?

Correction: the function is setup_new_exec in fs/exec.c. It calls set_task_comm, which actually sets this field.

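I've found that setup_new_exec in fs/exec.c fills in the comm field of struct task_struct for most user processes.
However, this does not seem to happen for many of the processes that are started from within the kernel itself. A task that has not (or not yet) gone through exec still carries the comm copied from its parent at fork, which would explain why so many of them showed up as khelper.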
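The effect of exec on comm can also be watched from user space, without any printk. A minimal sketch, assuming a Linux system with procfs mounted at /proc and with sh and cat on the PATH: the same PID is sampled through /proc/<pid>/comm once while it is still the shell and once after it has exec'd cat.

```shell
#!/bin/sh
# Sample /proc/<pid>/comm for one PID before and after exec.
# $$ expands to the PID of the inner sh in both commands, so both reads
# look at the same task.
out=$(sh -c 'cat /proc/$$/comm; exec cat /proc/$$/comm')

before=$(printf '%s\n' "$out" | sed -n 1p)   # read by a child while the PID is still the shell
after=$(printf '%s\n' "$out" | sed -n 2p)    # read by the PID itself after it exec'd cat

echo "before exec: $before"
echo "after exec:  $after"
```

The second line should read cat: the PID is unchanged, but exec has replaced comm with the basename of the new binary.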

Related

Any way to tag Windows processes recursively?

Currently, I'm using environment variables to do just that. But, when processes get started using cmd.exe instead of cygwin's bash.exe, I cannot read their environment variables easily via /proc any more. Apart from that, there is also a race condition between reading a process' environment variables and killing it where the process gets replaced in the meantime with another process having the same process id.
Another solution would possibly be process groups, assuming the group IDs will be propagated to child processes and that I can somehow kill stuff explicitly by using the process group id. However, this is not applicable to processes that already have a process group id. Or can I give them a second group id?
A last-resort solution would be to assume that the parent process keeps running and cleans up its children. This is a pretty risky assumption, however, and here again we have a race condition between identifying child processes and actually killing them via their PIDs.
Is there any tooling provided by Windows to achieve just what I'm trying to do? Not necessarily using pgids or env vars.

How to access Linux kernel data structures?

I want to print information about each process and what that process is doing at runtime, i.e. which files it is continuously reading or writing.
For this I'm writing a kernel module.
Does anyone have an idea how to access this information, or the process table data structures, from my kernel module?
Pseudocode for the task would look like this:
1. Get each process from /proc.
2. Access the data structures of that process, i.e. the process table and so on.
3. Print what that process is doing, i.e. which file it is accessing (reading or writing) at runtime.
Please take a look at this example.
It specifically shows how to create a kernel module that prints the open files of a process (it relies on the task_struct obtained from the current macro I mentioned in my comment). It can be extended to far more complicated things that are reachable through the process's task_struct.
There is a macro called for_each_process declared in include/linux/sched.h:
http://lxr.free-electrons.com/source/include/linux/sched.h#L2621
Using this macro, it is possible to traverse the task_struct of every process.
http://lxr.free-electrons.com/source/include/linux/sched.h#L1343
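Before writing the module, the same information can be cross-checked from user space. This is not the for_each_process approach itself but a procfs-based stand-in, sketched under the assumption that /proc is mounted. It prints each PID, its comm, and the number of open file descriptors visible to the caller (descriptors of other users' processes are usually unreadable without root, so the count then shows as 0). The function name list_processes is made up for this example.

```shell
#!/bin/sh
# Walk the numeric directories in /proc, one per process.
list_processes() {
    for dir in /proc/[0-9]*; do
        pid=${dir#/proc/}
        # A process may exit while we iterate; skip it if comm is gone.
        comm=$(cat "$dir/comm" 2>/dev/null) || continue
        # Each entry in fd/ is one open descriptor (a symlink to the file).
        nfd=$(ls "$dir/fd" 2>/dev/null | wc -l)
        printf '%s\t%s\t%s\n' "$pid" "$comm" "$nfd"
    done
}

table=$(list_processes)
printf 'PID\tCOMM\tOPEN_FDS\n%s\n' "$table"
```

Running ls -l on /proc/<pid>/fd for a single process shows which files those descriptors point at.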

how to pass information to a background process in bash

I have created a bash script that runs in the background. Its PID is stored in a file, and I can use kill to send predefined signals to the process.
From time to time, however, I'd like to pass information to the process manually. Preferably, I would pass a string or an array of information that is captured through trap, and the forever loop inside the bash script would then process it. Is there an easy way to pass information into a background process?
Thanks
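You can create a FIFO (named pipe), have the main process write to it, and have the child read from it:
mkfifo link              # create the named pipe
run_sub < link &         # background reader takes its stdin from the pipe
generate_output > link   # writer sends its stdout into the pipe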
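A self-contained version of the same idea, as a rough sketch: worker stands in for the long-running script, and the FIFO path, log file, and message strings are made up for illustration. The writer keeps the FIFO open on file descriptor 3 for the whole session, so the worker only sees end-of-file once the writer is finished.

```shell
#!/bin/sh
dir=$(mktemp -d)
fifo="$dir/commands"     # hypothetical FIFO path
log="$dir/worker.log"    # hypothetical log written by the worker
mkfifo "$fifo"

# Stand-in for the background script: handle one line per iteration
# until the writing side closes the FIFO.
worker() {
    while IFS= read -r line; do
        echo "got: $line"
    done
}

worker < "$fifo" > "$log" &   # the open blocks until a writer appears
worker_pid=$!

exec 3> "$fifo"               # open the FIFO for writing and keep it open
echo "hello" >&3
echo "reload config" >&3
exec 3>&-                     # close it: the worker's read loop sees EOF

wait "$worker_pid"
received=$(cat "$log")
echo "$received"
rm -rf "$dir"
```

Holding the descriptor open avoids reopening the FIFO for every message, which would otherwise let the reader see end-of-file between messages.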
Have it listen on a socket and implement a protocol to meet your communication needs, though that is probably a bit much for bash.
Or have it read a particular file on receipt of a particular signal. For example, it is common for programs to re-read their configuration files on receipt of a HUP.
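The signal-plus-file variant might look like the sketch below. The inbox, log, and ready files and the daemon function are illustrative names, not anything from the question, and the fractional sleep assumes a sleep that accepts values like 0.1 (GNU coreutils and BusyBox do). The loop waits on a background sleep because a shell only runs its trap handler once the current foreground command returns, whereas wait is interrupted by a trapped signal.

```shell
#!/bin/sh
dir=$(mktemp -d)
inbox="$dir/inbox"    # hypothetical file the sender fills in before signalling
log="$dir/log"        # hypothetical record of what the daemon processed
ready="$dir/ready"    # marker file: the traps are installed

daemon() {
    trap 'cat "$inbox" >> "$log"' HUP    # on SIGHUP, pick up the message
    trap 'exit 0' TERM                   # on SIGTERM, leave the loop
    : > "$ready"
    while :; do
        sleep 1 &
        wait $!       # interrupted by a trapped signal, so the handler runs promptly
    done
}
daemon &
daemon_pid=$!

# Wait (bounded) until the handlers are in place.
tries=0
while [ ! -e "$ready" ] && [ "$tries" -lt 50 ]; do
    sleep 0.1
    tries=$((tries + 1))
done

echo "new setting" > "$inbox"
kill -HUP "$daemon_pid"

# Wait (bounded) until the daemon has processed the message.
tries=0
while [ ! -s "$log" ] && [ "$tries" -lt 50 ]; do
    sleep 0.1
    tries=$((tries + 1))
done

kill -TERM "$daemon_pid"
wait "$daemon_pid"
processed=$(cat "$log")
echo "$processed"
rm -rf "$dir"
```

The signal itself carries no payload; it only tells the script that the file is worth reading.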

Appending data to a file from the Linux Kernel

I'm trying to gather measurements of cycle counts for a particular system call (sys_clone) in the Linux kernel. That said, my process won't be the only one calling it, and I can't know my PID ahead of time, so I'll have to record every invocation of it for every PID.
The problem that I've got is that the only ways I can figure out how to output this data (debugfs, sysfs, procfs) involve statically sized buffers, which will be quickly overwritten with irrelevant data from other processes calling sys_clone.
So, does anyone know how to append an arbitrary number of lines to a user-space-accessible file in Linux?
You can take the printk()/klogd approach and use a circular buffer that is exported via /proc. A user-space process blocks on reading your /proc file, and whatever it reads is removed from the buffer. In fact, you could check whether klogd/syslogd can be modified to also read your /proc file, so that you wouldn't need to implement the user-space part.
If you are good with something simpler, just printk() your information in a normalized form with some prefix, and then just filter it out from your syslog using this prefix.
There are a few more possibilities (e.g. using netlink to send messages to userspace), but writing to a file from the kernel is not something I'd recommend.
You could stash the counts in the right task_struct, and make it visible through a per-process file in /proc/<pid>/.

How are PIDs generated on Ubuntu?

I've just written a program that forks one process. The child process just displays "HI" 200 times. The father process just says he's the father.
I've printed out both pids.
When I run my program multiple times, I see that the parent's pid stays the same, which is normal. What I don't understand is why the child's pid keeps getting incremented by 2, and exactly 2.
My question: Is this the standard method of pid generation in Ubuntu? Incrementing by 2?
PIDs happen to be handed out monotonically increasing in Linux 2.6, but why does it matter which you get? Don't rely on any specific behavior. If there is a skip of +2 it might simply be because another process happened to spawn a child. Or because +1 would have reached a PID that is already in use.
Found a reference here saying that vfork() consumes a PID as a byproduct of its operation. Also, in some cases, if you're forking from a shell script, the fork might spawn a new shell before your actual script gets involved, which would also consume a PID.
I'd suggest suspending your program between a couple of forks and checking whether another process is occupying those "missing" PIDs.
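Another way to get a feel for this is to record the PIDs of a few children started back to back and look at the gaps. A small sketch, assuming a POSIX shell; the loop count of 5 is arbitrary, and the actual numbers and gaps will differ from run to run and from system to system.

```shell
#!/bin/sh
pids=""
for i in 1 2 3 4 5; do
    sleep 0 &             # each background job is a new child process
    pids="$pids $!"       # $! is the PID of the child just started
done
wait

# Print each PID with its distance from the previous one.
prev=""
for pid in $pids; do
    if [ -n "$prev" ]; then
        echo "pid $pid (gap $((pid - prev)))"
    else
        echo "pid $pid"
    fi
    prev=$pid
done
```

A gap larger than 1 suggests that something else was created in between or that the intervening IDs were already in use; on Linux, threads draw their IDs from the same number space as processes.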
