Does every function end up in kernel mode? - windows

Can we say that while programming (showing something on output, adding values, etc.) we always interact with the system? I mean, does every function in an app finally end up in the kernel?
I don't know whether this varies from OS to OS, so I am asking about Windows specifically.
I appreciate your response, and I am sorry for my English.

No, adding two values together almost certainly does not use any system code.

You always interact with the system in that the CPU (or some other processor like a GPU) has to execute your code.
Not every instruction executed by the CPU will involve a kernel-mode operation, though.

No. For example, in Windows, messaging and COM objects do not necessarily end up in kernel mode, though they may use kernel-mode resources such as HANDLEs.
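To make the distinction in these answers concrete, here is a minimal C sketch. The function names add_values and print_sum are made up for illustration, and exactly where the switch to kernel mode happens depends on the C runtime and its buffering, so treat the comments as a rough picture rather than a precise trace.

```c
#include <stdio.h>

/* Pure computation: this compiles to a few CPU instructions that run
 * entirely in user mode. No system call is involved. */
int add_values(int a, int b)
{
    return a + b;
}

/* Output is different: printf formats into a user-mode buffer, and the
 * C runtime eventually asks the OS to write it out (on Windows that
 * typically ends in WriteFile, which transitions to kernel mode).
 * Returns the number of characters printed, or a negative value on error. */
int print_sum(int a, int b)
{
    int n = printf("%d\n", add_values(a, b));
    fflush(stdout);   /* push the buffered text out to the OS now */
    return n;
}
```

Calling add_values(2, 3) never leaves user mode, while print_sum(2, 3) has to go through the OS at some point to make the text appear.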

Related

Preventing from accessing process memory

I made an example that writes into process memory using task_for_pid() and mach_vm_write().
task_for_pid(mach_task_self(), pid, &target_task);
mach_vm_write(target_task, address, '?', local_size);
Is there a way to block other processes, such as Cheat Engine, from accessing the memory of a specific process on OS X?
How do I prevent another process from calling task_for_pid?
Nothing other than hooking comes to mind.
In OS X, calls to task_for_pid are regulated by taskgated. Basically, unless it's your own task, or you're root (or, on older systems, a member of the procview group), you won't get that elusive task port. But if you are allowed, then you have the port and can do basically anything you want.
Hooking won't help, since task_for_pid is a Mach trap: people can call it directly through the system call interface. iOS has much tighter controls on it (thanks to AppleMobileFileIntegrity.kext). If you want to control the trap, effectively the only way of doing so is to write a small kext to do the trick for you.

make_request and queue limits

I'm writing a linux kernel module that emulates a block device.
There are various calls that can be used to tell the kernel the block size, so that it aligns and sizes every request to the driver accordingly. This is well documented in the "Linux Device Drivers 3" book.
The book describes two methods of implementing a block device: using a "request" function, or using a "make_request" function.
It is not clear whether the queue limit calls apply when using the minimalistic "make_request" approach (which is also the more efficient one if the underlying device gets no real benefit from sequential over random I/O, which is the case for me).
I would really like to get the kernel to talk to me using 4K block sizes, but I see smaller bios hitting my make_request function.
My question: should the blk_queue_limit_* calls affect the bio size when using make_request?
Thank you in advance.
I think I've found enough evidence in the kernel code that if you use make_request, you'll get correctly sized and aligned bios.
The answer is:
You must call blk_queue_make_request first, because it sets queue limits to defaults. After this, set queue limits as you'd like.
It seems that every part of the kernel that submits bios does check them for validity, and that it is up to the submitter to do these checks. I found only incomplete validation in submit_bio and generic_make_request. But as long as no one does tricks, it's fine.
Submitting correct bios is a policy, but it is up to the submitter to follow it, and nothing in the middle enforces it. So I think I have to implement explicit checks and fail the wrong bios. Since it is a policy, failing on a violation is fine, and since the kernel does not enforce it, explicit checks are a good thing to have.
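The explicit check described above could look roughly like the following user-space sketch. It is not kernel code: struct bio_info is a hypothetical stand-in for the start sector and byte count a driver would read from the bio, the constant and function names are made up, and the 4K block size is the one from the question. It also assumes zero-length requests are handled elsewhere, so it simply rejects them.

```c
#include <stdbool.h>
#include <stdint.h>

#define KERNEL_SECTOR_SIZE 512u    /* the kernel counts sectors in 512-byte units */
#define DRIVER_BLOCK_SIZE  4096u   /* block size this hypothetical driver wants */

/* Hypothetical stand-in for the two bio fields that matter here. */
struct bio_info {
    uint64_t start_sector;  /* first sector, in 512-byte units */
    uint32_t size_bytes;    /* total payload length in bytes */
};

/* Returns true when the request starts on a 4K boundary and covers a
 * non-zero whole number of 4K blocks. */
bool bio_is_block_aligned(const struct bio_info *b)
{
    uint64_t sectors_per_block = DRIVER_BLOCK_SIZE / KERNEL_SECTOR_SIZE;  /* 8 */

    if (b->size_bytes == 0)
        return false;
    if (b->start_sector % sectors_per_block != 0)
        return false;
    return b->size_bytes % DRIVER_BLOCK_SIZE == 0;
}
```

In a driver, a request for which this returns false would be completed with an error instead of being processed.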
If you want to read a bit more on the story, see http://tlfabian.blogspot.com/2012/01/linux-block-device-drivers-queue-and.html.

Snoop interprocess communications

Has anyone tried to create a log file of interprocess communications? Could someone give me a little advice on the best way to achieve this?
The question is not quite clear, and comments make it less clear, but anyway...
The two things to try first are ipcs and strace -e trace=ipc.
If you want to log all IPC (which seems very intensive), you should consider instrumentation.
There are a lot of good tools for this. Check out PIN in particular, especially this section of the manual:
"In this example, we show how to do more selective instrumentation by examining the instructions. This tool generates a trace of all memory addresses referenced by a program. This is also useful for debugging and for simulating a data cache in a processor."
If you're doing some heavyweight tuning and analysis, check out TAU (Tuning and Analysis Utilities).
Communication to a kernel driver can take many forms. There is usually a special device file for communication, or there can be a special socket type, like NETLINK. If you are lucky, there's a character device to which read() and write() are the sole means of interaction - if that's the case then those calls are easy to intercept with a variety of methods. If you are unlucky, many things are done with ioctls or something even more difficult.
However, running 'strace' on the program that communicates with the kernel driver can reveal just about everything it does, though 'ltrace' might be more readable if the program happens to use libraries for the communication. By tuning the arguments to 'strace', you can probably get a dump which contains just the information you need:
First, just eyeball the calls and try to figure out the means of kernel communication.
Then, add filters to the strace call to log only the kernel communication calls.
Finally, make sure strace logs the full strings of all calls, so you don't have to deal with truncated data.
The answers which point to IPC debugging are probably not relevant, as communicating with the kernel almost never has anything to do with IPC (at least not the various UNIX IPC facilities).

How to understand asynchronous io in Windows?

1.How to understand asynchronous io in Windows??
2.If I write/read something to the file using asynchronous io :
WriteFile();
ReadFile();
WriteFile();
How many threads does the OS generate to accomplish these tasks?
Do the 3 tasks run simultaneously and in a multi-threaded way,
or run one after another, just in a different order?
3. Can I use multithreading, with each thread using asynchronous I/O
to read or write the same file?
1.How to understand asynchronous io in Windows??
Read the Win32 documentation. Search on the web. Don't expect an answer to such a large, broad question here in SO.
2.If I write/read something to the file using asynchronous io :
WriteFile();
ReadFile();
WriteFile();
How many threads does the OS generate to accomplish these tasks?
I don't think it generates any. It re-uses existing thread contexts to execute kernel function calls. Basically, the OS schedules the work and borrows a thread to do it, which is fine, since the kernel context is always the same.
3. Can I use multithreading, with each thread using asynchronous I/O to read or write the same file?
I believe so, yes. I don't know that the order of execution is guaranteed to match the order of submission, in which case you will obtain unpredictable results if you issue concurrent reads/writes on the same byte ranges.
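The hazard mentioned here only arises when two in-flight requests touch the same bytes. With overlapped I/O each request carries its own file offset (in its OVERLAPPED structure), so one way to stay safe is to check the byte ranges before issuing requests concurrently. The sketch below is plain C rather than Win32 code, and struct io_range and io_ranges_overlap are hypothetical names used only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of one pending request: a file offset and a
 * length, i.e. the half-open byte range [offset, offset + length). */
struct io_range {
    uint64_t offset;
    uint64_t length;
};

/* True when the two requests share at least one byte.
 * Assumes offset + length does not overflow 64 bits. */
bool io_ranges_overlap(struct io_range a, struct io_range b)
{
    if (a.length == 0 || b.length == 0)
        return false;
    return a.offset < b.offset + b.length &&
           b.offset < a.offset + a.length;
}
```

Requests whose ranges do not overlap can complete in any order without affecting each other's data. Overlapping ones need to be serialized by the application if the result matters.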
To your questions:
How many threads does the OS generate to accomplish these tasks?
It depends on whether you are using the Windows thread pool, I/O completion ports (IOCP), etc. Generally, you decide.
Do the 3 tasks run simultaneously and in a multi-threaded way, or run one after another, just in a different order?
This depends on your architecture. On a single-core machine, the 3 tasks would run one after another, in an order decided by the OS. On a multi-core machine they might run concurrently, depending on how the OS schedules the threads.
3. Can I use multithreading, with each thread using asynchronous I/O to read or write the same file?
That is beyond my knowledge, so someone else would need to answer that one.
I suggest getting a copy of Windows via C/C++ as that has a very large chapter on Asynchronous IO.
I guess it depends on which operating system you are using. But you shouldn't have to worry about this anyhow; it is transparent and should not affect how you write your code.
If you use the standard read and write calls in Windows, you don't have to care that the system may not write the data immediately, unless you are writing to the command line and waiting for the user to type some input. The OS is responsible for ensuring that what you write will eventually reach the hard drive, and it will do a much better job than you can anyway.
If you are working on some weird asynchronous io, then please reformat your question.
I suggest looking for Jeffrey Richter's books on Win32 programming. They are very well-written guides for just this sort of thing.
I think he has a newer book(s?) on C#, so watch out that you don't buy the wrong one.

How to determine the memory layout of a process in Windows?

How can I determine what memory is accessible by a process, other than calling ReadProcessMemory() on every single byte/page/whatever to see if it succeeds or fails?
(I know it must be possible as several tools show this sort of information, e.g. IDA Pro debugger, WinHex, Sysinternals' Process Monitor, ...)
VirtualQueryEx is likely the function you want.
