Driver proxy to avoid repeated code signing on Windows

Drivers on Windows should be signed and submitted to Microsoft for code signing, and this is a requirement on 64-bit systems. The problem is that every time you update the driver you have to submit it again, once for every release.
Is it possible to build some kind of proxy or shim driver, so that I have to sign and submit it only once and keep my actual code in a separate module?
Of course I can't just move the working code into a DLL, since dynamically loaded modules also have to be signed in order to be executed in 64-bit kernel mode. What if I put my code in a raw file, load it into memory (allocated with the execute flag enabled) and then run it? Any other ideas?

You don't have to submit a driver to Microsoft. You can sign your driver with a cross-signing certificate. You will get a nag screen this way, but it is not forbidden!
There were (and are) several attempts to do exactly what you want. They are barely tolerated, and such drivers may be banned at any time (by revoking your signature).
The process is very simple:
Load the signed driver.
Provide an IOCTL through which user-mode programs can send memory to the kernel.
Make that memory executable and simply call an address inside it (a sketch follows).
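For illustration, a bare sketch of those three steps (IOCTL_RUN_BLOB, the pool tag and the routine name are made up, error and security checks are stripped, and remember that doing this undermines kernel code signing and can get your certificate revoked):

    #include <ntddk.h>

    /* Made-up control code; must match the user-mode side. */
    #define IOCTL_RUN_BLOB CTL_CODE(FILE_DEVICE_UNKNOWN, 0x800, METHOD_BUFFERED, FILE_ANY_ACCESS)

    NTSTATUS DispatchDeviceControl(PDEVICE_OBJECT DeviceObject, PIRP Irp)
    {
        PIO_STACK_LOCATION stack = IoGetCurrentIrpStackLocation(Irp);
        NTSTATUS status = STATUS_INVALID_DEVICE_REQUEST;

        UNREFERENCED_PARAMETER(DeviceObject);

        if (stack->Parameters.DeviceIoControl.IoControlCode == IOCTL_RUN_BLOB) {
            ULONG len = stack->Parameters.DeviceIoControl.InputBufferLength;
            /* NonPagedPoolExecute: non-paged pool that is still executable */
            PVOID exec = ExAllocatePoolWithTag(NonPagedPoolExecute, len, 'bolB');
            if (exec != NULL) {
                /* METHOD_BUFFERED: the user buffer was copied into SystemBuffer */
                RtlCopyMemory(exec, Irp->AssociatedIrp.SystemBuffer, len);
                ((VOID (*)(VOID))exec)();          /* jump into the copied code */
                ExFreePoolWithTag(exec, 'bolB');
                status = STATUS_SUCCESS;
            } else {
                status = STATUS_INSUFFICIENT_RESOURCES;
            }
        }

        Irp->IoStatus.Status = status;
        Irp->IoStatus.Information = 0;
        IoCompleteRequest(Irp, IO_NO_INCREMENT);
        return status;
    }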

Related

Bootloader Strategy for Corrupt Applications

I've implemented a bootloader for a Kinetis ARM Cortex-M4 microcontroller.
The main application (starting at 0x10000) is re-programmed via the bootloader over a custom RS232 interface. I've implemented jumpToApplication and jumpToBootloader functions from the bootloader and application perspectives and all works fine so far.
One thing I'm keen to understand is what to do in the event of a corrupt main application.
The bootloader currently checks the stack pointer and program counter of the main application before deciding whether to jump. However, if the main application is corrupt then one of two issues will occur:
The main application will hang and make it difficult to re-program
The microcontroller will reboot and will be stuck in a bootloader > application > bootloader (etc) loop
I have a SharedData structure which allows me to share data (via a fixed RAM location) between both the bootloader and application. I have considered adding a rebootCounter to this structure which would be incremented upon the HardFaultInterrupt being triggered in the main application.
This value could be tested in the bootloader and, depending on the counter value, a decision could be made as to whether to stay in the bootloader or try to launch the application.
Are there more "industry standard" ways of dealing with this?
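For reference, here is roughly what I have in mind for the SharedData / rebootCounter part (a sketch only; the fixed address, field names and reboot limit are placeholders, and the reset call is CMSIS):

    #include <stdint.h>

    #define SHARED_DATA_ADDR   0x2000FF00u   /* assumed fixed RAM location      */
    #define SHARED_DATA_MAGIC  0x5AFEC0DEu   /* marks the struct as initialised */
    #define MAX_FAULT_REBOOTS  3u

    typedef struct {
        uint32_t magic;
        uint32_t rebootCounter;   /* bumped by the application's fault handler */
    } SharedData;

    #define SHARED_DATA ((volatile SharedData *)SHARED_DATA_ADDR)

    /* Application side: count the fault, then fall back to the bootloader. */
    void HardFault_Handler(void)
    {
        if (SHARED_DATA->magic == SHARED_DATA_MAGIC)
            SHARED_DATA->rebootCounter++;
        NVIC_SystemReset();   /* CMSIS, from the device header */
    }

    /* Bootloader side: refuse to jump once the app has crashed too often. */
    int application_allowed_to_run(void)
    {
        if (SHARED_DATA->magic != SHARED_DATA_MAGIC) {
            SHARED_DATA->magic = SHARED_DATA_MAGIC;   /* first boot: initialise */
            SHARED_DATA->rebootCounter = 0;
        }
        return SHARED_DATA->rebootCounter < MAX_FAULT_REBOOTS;
    }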
UPDATE
To clarify, the ultimate reason for asking this question is to cover the following scenario:
Bootloader is programmed into the device during production phase via JTAG
Main application (latest build) is loaded during testing phase
During the testing phase, there is a power-cut or connection issue and the device is only partially programmed
When power is applied again, the bootloader will "assume" that there is a valid program in the main part of flash and will "jump" to this application
The microcontroller is now stuck in no man's land with no way of re-loading flash via the bootloader again without opening up the product's enclosure and re-flashing the chip via JTAG - not something we can do when the product is in the field.
During the bootloader programming phase, the firmware is programmed and validated byte-by-byte to ensure that there is no corruption during the data transfer. If corruption occurs during this phase (bad packet due to USB hub issue, for example) then the bootloader will continue to accept re-programming commands.
UPDATE #2
The following post seems to be thinking along similar lines:
https://interrupt.memfault.com/blog/how-to-write-a-bootloader-from-scratch
First, I recommend adding a delay in your bootloader that waits for a firmware-update start indicator. I developed something similar: the desktop application sends a start byte periodically, and when you connect your device it enters bootloader mode and waits five more seconds for new firmware information, so it does not matter whether there is a valid main application in flash or not.
Another way to check for the existence of the main application is to use a specific flash sector for firmware information: erase that sector before a firmware update starts, and write specific data to it after a successful update. In the bootloader, read this sector and verify that there is a valid application in flash.
I would add some 'magic' value (say 0xDEAD00D) at the end of the application and only jump to the application if the magic value is there. You can have a pointer to that location at 0x10000.
To make things more robust, program the magic value only after the verify has completed.
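Roughly, the bootloader-side check could look like this (a sketch only: the slot that holds the pointer to the magic word, the address ranges and the CMSIS calls are assumptions to adapt to your linker script):

    #include <stdint.h>

    #define APP_BASE        0x00010000u
    #define APP_MAGIC       0x0DEAD00Du           /* written only after verify */
    #define MAGIC_PTR_SLOT  (APP_BASE + 0x400u)   /* assumed fixed slot        */

    typedef void (*app_entry_t)(void);

    static int app_is_valid(void)
    {
        uint32_t sp         = *(volatile uint32_t *)APP_BASE;   /* initial SP */
        uint32_t magic_addr = *(volatile uint32_t *)MAGIC_PTR_SLOT;

        if (sp < 0x1FFF0000u || sp > 0x20030000u)       /* assumed SRAM range */
            return 0;
        if (magic_addr <= APP_BASE || magic_addr >= 0x00100000u)  /* in flash? */
            return 0;
        return *(volatile uint32_t *)magic_addr == APP_MAGIC;
    }

    static void jump_to_application(void)
    {
        app_entry_t entry = (app_entry_t)(*(volatile uint32_t *)(APP_BASE + 4u));

        __set_MSP(*(volatile uint32_t *)APP_BASE);   /* CMSIS: load app's MSP */
        entry();
    }

    void boot(void)
    {
        if (app_is_valid())
            jump_to_application();
        /* otherwise stay in the bootloader and wait for a re-flash */
    }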

Is it possible to load a system driver from memory without it touching the filesystem directly?

Is it possible to load a signed Windows driver from memory without the file ever touching the disk? If it is possible, is it trivial to achieve, or are there obstacles to overcome? To clarify, the driver may exist on the disk at some point, but in an encoded state.
For example, I know that it's possible to decode a payload from memory and inject it into another running process, but since that's technically a Windows "Feature" I'm not sure whether things are as easy when you're loading things into the kernel.
If it is possible, bonus points for sources. All my searching has turned up is people calling each other idiots and malware authors, without actually getting into whether it's technically possible/feasible.
My use case is md5sum detection: to load drivers onto a 64-bit Windows system they must be signed, so the hash would be immutable. If drivers can be loaded from memory, then monitoring the file system wouldn't be sufficient for my needs.
No. Drivers have to have an entry in the Service Control Manager that points to some binary file; this entry is part of the services registry.
I know that WinDbg is able to replace the image of a driver, but that is done from a kernel debugger using map files.
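For illustration, a normally installed driver gets its service entry like this (a user-mode sketch; the names and path are placeholders), and the resulting key under HKLM\SYSTEM\CurrentControlSet\Services must point at a file on disk:

    #include <windows.h>

    BOOL InstallDriverService(void)
    {
        SC_HANDLE scm = OpenSCManagerW(NULL, NULL, SC_MANAGER_CREATE_SERVICE);
        SC_HANDLE svc;
        BOOL ok;

        if (!scm)
            return FALSE;

        svc = CreateServiceW(
            scm,
            L"MyDriver",                          /* service / registry key name */
            L"MyDriver",
            SERVICE_ALL_ACCESS,
            SERVICE_KERNEL_DRIVER,                /* kernel-mode driver          */
            SERVICE_DEMAND_START,
            SERVICE_ERROR_NORMAL,
            L"C:\\Windows\\System32\\drivers\\mydriver.sys",  /* must be on disk */
            NULL, NULL, NULL, NULL, NULL);

        ok = (svc != NULL);   /* entry lands under ...\Services\MyDriver */
        if (svc)
            CloseServiceHandle(svc);
        CloseServiceHandle(scm);
        return ok;
    }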

How to debug a PCI device and Linux driver

I am programming a PCI device in Verilog and also writing its driver.
I have probably introduced a bug in the hardware design, and when I load the driver with insmod the kernel just gets stuck and doesn't respond. Now I'm trying to figure out which driver code line is the last one to run before my computer hangs. I have inserted printk calls in all the relevant functions, like probe and init, but none of them get printed.
What other code runs when I use insmod, before it gets to my init function? (I guess the kernel gets stuck there.)
printks are often not useful for debugging such a problem. They are buffered enough that you won't see them in time if the system hangs shortly after printk is called.
It is far more productive to selectively comment out sections of your driver and by process of elimination determine which line is the (first) problem.
Begin by commenting out the module's entire init section, leaving only return 0;. Build it and load it. Does it hang? Reboot the system, re-enable the next few lines (class_create()?) and repeat.
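As a starting point, the stripped-down module could look like this (driver and function names are placeholders); re-enable one call at a time until the hang comes back:

    #include <linux/module.h>
    #include <linux/init.h>

    static int __init mypci_init(void)
    {
        /* pci_register_driver(&mypci_driver);   <-- re-enable step by step */
        return 0;   /* if loading this still hangs, the problem is elsewhere */
    }

    static void __exit mypci_exit(void)
    {
        /* pci_unregister_driver(&mypci_driver); */
    }

    module_init(mypci_init);
    module_exit(mypci_exit);
    MODULE_LICENSE("GPL");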
From what you describe, it looks like the Linux scheduler is being deadlocked by your driver. That means interrupts from the system timer either don't arrive or never get a chance to be handled by the kernel. There are three possible reasons:
You hang somewhere in your driver's interrupt handler (the handler starts its work but never finishes).
Your device creates an interrupt storm (it generates interrupts so frequently that your system does nothing but handle them).
You explicitly disable all interrupts in your driver but never re-enable them.
In all other cases the system will either crash, oops, or panic with the appropriate output, or will tolerate the misbehavior of your device.
I suspect printk won't work for such an extreme scenario as a hang in kernel mode. It is quite heavyweight and therefore an unreliable diagnostic tool for scenarios like yours.
Tracing by writing debug output directly to video memory only works in simpler environments, such as bootloaders or simpler kernels, where the system runs in a default low-end video mode and there is no need to synchronize access to the video memory. In such systems it can be great, and often the only usable debugging tool, but that is not the case for Linux.
From the software debugging point of view, the following techniques can be recommended:
Review your driver code, paying special attention to the interrupt handler and the places where you disable/enable interrupts for synchronization.
Commenting out all the driver logic and gradually uncommenting it can help a lot in localizing the issue.
You can try remote kernel debugging of your driver. I would advise using a virtual machine for this, but I'm not sure whether it allows passing the PCI device through to the VM.
You can try the in-memory tracing trick. The idea is to preallocate a memory chunk with well-known virtual and physical addresses and zero it, then modify your driver to write trace data into this chunk using its virtual address (for example, assign a unique integer value to each event you want to trace and write '1' at the corresponding index of the byte array in the preallocated chunk). When your system hangs, you can force a full memory dump and analyze the memory captured in the dump using the physical address of the chunk containing the traces. I have used this technique with a VMware Workstation VM on Windows: when the system hung, I just paused the VM instance and looked at the corresponding .vmem file, which contains the raw layout of the VM's physical memory. I'm not sure how easily, or even whether, this trick will work on Linux, but I would try it (a sketch follows this list).
Finally, you can try to trace the transactions on the PCI bus, but I'm not an expert in this area and not sure whether that can help in your case or not.
In general, kernel debugging is quite a tricky task, with a lot of tricks in use, each of which works only for a specific set of cases. :(
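A rough sketch of the in-memory tracing idea from point 4 (buffer size, event names and function names are invented; kmalloc'd memory is physically contiguous, so virt_to_phys() gives an address you can search for in a raw dump):

    #include <linux/types.h>
    #include <linux/slab.h>
    #include <linux/io.h>
    #include <linux/printk.h>

    #define TRACE_POINTS 64
    static u8 *trace_buf;

    enum { TRACE_PROBE_ENTER = 0, TRACE_BARS_MAPPED, TRACE_IRQ_REQUESTED };

    static int trace_init(void)
    {
        phys_addr_t pa;

        trace_buf = kzalloc(TRACE_POINTS, GFP_KERNEL);
        if (!trace_buf)
            return -ENOMEM;
        pa = virt_to_phys(trace_buf);
        /* note the physical address so the buffer can be found in a raw dump */
        pr_info("trace buffer at phys %pa\n", &pa);
        return 0;
    }

    static void trace_hit(int event)
    {
        if (trace_buf && event < TRACE_POINTS)
            trace_buf[event] = 1;   /* '1' means "this point was reached" */
    }

    /* Sprinkle trace_hit(TRACE_PROBE_ENTER) etc. through probe/init, then
     * locate the buffer in the memory dump after the hang. */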
I would put a logic analyzer on the bus lines (on an FPGA you could use ChipScope or similar). You'll then be able to tell which access is at fault (and fix the hardware). It will be useful anyway for debugging or analyzing future issues.
Another way would be to use the kernel crash dump utility, which has saved me some headaches in the past. Depending on your Linux distribution it may require installing (it is available by default in RH). See http://people.redhat.com/anderson/crash_whitepaper/
There isn't really anything that runs before your init. Bus enumeration is done at boot; if that goes by without a hitch, the earliest cause of freezing should be something in your driver's init, AFAIK.
You should be able to see printks as they are printed; they aren't buffered and should not get lost. That applies only in situations where you can see kernel output directly, such as on the text console or over a serial line. If there is some other application in the way, such as a terminal displaying the kernel logs in X11 or over ssh, it may not get a chance to read and display the logs before the computer freezes.
If for some other reason the printks still do not work for you, you can instead have your init function return early. Just test, and move the return later and later in the init until you find the point where it crashes.
It's hard to say what is causing your freezes, but interrupts is one of those things I would look at first. Make sure the device really doesn't signal interrupts until the driver enables them (that includes clearing interrupt enables on system reset) and enable them in the driver only after all handlers are registered (also, clear interrupt status before enabling interrupts).
The second thing to look at would be bus-master transfers; the same applies: make sure the device doesn't do anything until it's asked to, and have the driver make sure that no bus-master transfers are active before enabling bus mastering at the device level.
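As a sketch of that ordering in a PCI probe routine (register offsets, the bit mask and all names are invented, and error unwinding is omitted): register the handler first, clear stale status, then unmask interrupts, and enable bus mastering last.

    #include <linux/pci.h>
    #include <linux/interrupt.h>
    #include <linux/io.h>

    #define REG_IRQ_STATUS   0x04   /* assumed device register offsets */
    #define REG_IRQ_ENABLE   0x08
    #define IRQ_BITS_WANTED  0x01   /* assumed interrupt source mask   */

    static irqreturn_t mydev_isr(int irq, void *data)
    {
        /* acknowledge the device here */
        return IRQ_HANDLED;
    }

    static int mydev_probe(struct pci_dev *pdev, const struct pci_device_id *id)
    {
        void __iomem *regs;
        int ret;

        ret = pci_enable_device(pdev);
        if (ret)
            return ret;

        regs = pci_iomap(pdev, 0, 0);                       /* map BAR0 */
        if (!regs)
            return -ENOMEM;

        /* 1. handler registered before the device can raise anything */
        ret = request_irq(pdev->irq, mydev_isr, IRQF_SHARED, "mydev", pdev);
        if (ret)
            return ret;

        /* 2. clear stale interrupt status, 3. only then unmask at the device */
        iowrite32(0xffffffff, regs + REG_IRQ_STATUS);
        iowrite32(IRQ_BITS_WANTED, regs + REG_IRQ_ENABLE);

        /* 4. bus mastering last, once the driver is ready for DMA */
        pci_set_master(pdev);
        return 0;
    }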
The fact that the kernel gets stuck as soon as you install your driver module makes me wonder if some other driver (built into the kernel?) is already driving the device. I made this mistake once, which is why I am asking. I'd look for the string "kernel driver in use" in the output of 'lspci' before installing the module. In any case, your printks should be visible in the dmesg output.
In addition to Claudio's suggestion, a couple more debugging ideas:
1. Try kgdb (https://www.kernel.org/doc/htmldocs/kgdb/EnableKGDB.html)
2. Use a JTAG interface to connect debug tools (these, I think, vary between devices and vendors, so you'll have to figure out which debug tools you need for your particular hardware)

Registry Access hook to protect driver

I'm writing a driver for Windows NT that provides ring-0 access for a userspace application. I want to make a utility, with exclusive rights to execute any of the user's commands, that is protected from any external harmful influence.
Searching the Internet, I found that it is necessary to hook some native kernel functions, such as NtOpenProcess, NtTerminateProcess, NtDuplicateObject, etc. I've made a working driver that protects an application, but then I realized that it would be better to also protect the driver itself from external attempts to remove it or to prevent it from loading during OS startup, the way a firewall does. I divided the task into two parts: preventing physical removal of the driver from \system32\drivers\, and preventing changes to or removal of the registry key responsible for loading the driver (HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services).
The problem is that I do not understand how to hook access to the registry key from kernel space, and I am not even sure it is possible: all the registry functions in ntdll live in user space and are unavailable from kernel space. Also, any API hooks I can set from user space would live in a specific process's memory context, so I would need to inject a DLL into every process, existing or new.
Is there a way to hook all NT calls in one place without injecting a DLL into every process?
You are going about this the wrong way. Registry calls are also NT syscalls and reside in the SSDT (like the other Zw* syscalls), but hooking the SSDT is bad practice. Its major drawback is that it doesn't work on x64 systems because of PatchGuard. The right way is to use the OS's documented filtering mechanisms; for registry calls, these are the Configuration Manager callbacks. There are some caveats with the Windows XP version of these callbacks (some facilities are unimplemented or buggy), but XP is dead now =). They are very simple to use. You can start (and end =) ) with this guide: http://msdn.microsoft.com/en-us/library/windows/hardware/ff545879(v=vs.85).aspx
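A minimal sketch of registering such a callback (the altitude string is a placeholder, the actual filtering logic is left as a comment, and error handling is trimmed):

    #include <ntddk.h>

    static LARGE_INTEGER g_CmCookie;

    static NTSTATUS RegistryCallback(PVOID Context, PVOID Argument1, PVOID Argument2)
    {
        REG_NOTIFY_CLASS op = (REG_NOTIFY_CLASS)(ULONG_PTR)Argument1;

        UNREFERENCED_PARAMETER(Context);

        if (op == RegNtPreDeleteKey || op == RegNtPreSetValueKey) {
            /* Inspect Argument2 (e.g. REG_DELETE_KEY_INFORMATION) and, if the
             * key is your service key, block the operation. */
            /* return STATUS_ACCESS_DENIED;   -- uncomment to actually block */
        }
        return STATUS_SUCCESS;
    }

    NTSTATUS RegisterRegistryFilter(PDRIVER_OBJECT DriverObject)
    {
        UNICODE_STRING altitude = RTL_CONSTANT_STRING(L"360000"); /* placeholder */
        return CmRegisterCallbackEx(RegistryCallback, &altitude, DriverObject,
                                    NULL, &g_CmCookie, NULL);
    }

    /* In DriverUnload: CmUnRegisterCallback(g_CmCookie); */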

Is DLL loaded in kernel mode or user mode?

I was asked such a question in an interview:
In windows, suppose there is an exe which depends on some dlls, when you start
the exe, and then the dependent dlls will be loaded, are these dlls
loaded in kernel mode or user mode?
I am not quite sure about the question, not to mention the answer. Could you help explain?
Thanks.
I'm not an expert on how Windows works internally, but as far as I know the correct answer is user mode, simply because only code belonging to your operating system is admitted into kernel space: http://en.wikibooks.org/wiki/Windows_Programming/User_Mode_vs_Kernel_Mode
Basically, if it's not an OS component, it's going to be allocated in user space.
The question is very imprecise/ambiguous. "In Windows" suggests something, but it isn't clear what. Likely the interviewer was referring to the Win32 subsystem, i.e. the part of Windows that you usually get to see as an end user. The last part of the question is even more ambiguous.
Now, while process and section objects (which back memory-mapped files and loaded PE images such as .exe, .dll and .sys files) are indeed kernel objects and require some assistance from the underlying executive (memory manager etc.), the code in a DLL (including that in DllMain) behaves exactly the same as any other user-mode code when called from a user-mode process. That is, each thread running code from the DLL will eventually transition to kernel mode to make use of OS services (opening files, loading PE files, creating events etc.) or do its work in user mode whenever that is sufficient.
Perhaps the interviewer was even interested in the memory ranges that are sometimes referred to as "kernel space" and "user space", traditionally at the 2 GB boundary for 32bit. And yes, DLLs usually end up below the 2 GB boundary, i.e. in "user space", while other shared memory (memory mapped files, MMF) usually end up above that boundary.
It is even possible that the interviewer fell victim to a common misunderstanding about DLLs. The DLL itself is merely a dormant piece of memory, it isn't running anything on its own ever (and yes, this is also true for DllMain). Sure, the loader will take care of all kinds of things such as relocations, but in the end nothing will run without being called explicitly or implicitly (in the context of some thread of the process loading the DLL). So for all practical purposes the question would require you to ask back.
Define "in Windows".
Also "dlls loaded in kernel mode or user mode", does this refer to the code doing the loading or to the end result (i.e. where the code runs or in what memory range it gets loaded)? Parts of that code run in user mode, others in kernel mode.
I wonder whether the interviewer has a clear idea of the concepts s/he is asking about.
Let me add some more information. It seems from the comments on the other answer that people have the same misconception about drivers that exists about DLLs. Drivers are much closer to the idea of a DLL than to that of an EXE (or ultimately a "process"). The thing is that a driver doesn't do anything on its own most of the time (though it can create system threads to change that). Drivers are not processes, and they do not create processes.
The answer is quite obviously user mode for anybody who does any kind of significant application development for Windows. Let me explain two things.
DLL
A dynamic-link library is closely related to a regular old static link library, or .lib. When your application uses a .lib, the function definitions are pasted in just after compile time. You typically use a .lib to store APIs and to modify functions without having to rebuild the whole project: just paste a new .lib with the same name over the old one, and as long as the interface (function names and parameters) hasn't changed, it still works. Great modularity.
A .dll does exactly the same thing, but it doesn't require re-linking or any compilation. You can think of a .dll as essentially a .lib that gets compiled into its own binary, just like the applications that use it. Simply drop in a new .dll that shares the name and function signatures and it all just works; you can update your application simply by replacing .dlls. This is why most Windows software consists of .dlls and a few .exes.
A .dll is used in one of two ways:
Implicit linking
To link this way, if you had a DLL called userapplication.dll you would also have a userapplication.lib that defines all the entry points in the DLL. You simply link against that library and then include the .dll in the working directory.
Explicit linking
Alternatively, you can load the .dll programmatically by first calling LoadLibrary("userapplication.dll"), which returns a handle to the .dll, and then GetProcAddress(handle, "FunctionInUserApplicationDll"), which returns a function pointer you can use. This way your application can check things before attempting to use the DLL. C# is a little different, but easier.
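For example, a tiny program using the explicit path could look like this (the DLL name, export name and signature are just the placeholder names from above):

    #include <windows.h>
    #include <stdio.h>

    typedef int (*UserAppFn)(void);   /* assumed signature of the export */

    int main(void)
    {
        /* The DLL is mapped into this process's user-mode address space. */
        HMODULE dll = LoadLibraryW(L"userapplication.dll");
        if (!dll) {
            printf("LoadLibrary failed: %lu\n", GetLastError());
            return 1;
        }

        UserAppFn fn = (UserAppFn)GetProcAddress(dll, "FunctionInUserApplicationDll");
        if (fn)
            printf("result: %d\n", fn());

        FreeLibrary(dll);
        return 0;
    }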
USER/KERNEL MODES
Windows has two major modes of execution: user mode and kernel mode (the kernel side is further divided into system and session space). In user mode, physical memory addresses are opaque: user mode works with virtual memory that is mapped onto real memory. User-mode drivers are, incidentally, also .dlls. A user-mode application typically gets around 4 GB of virtual address space to work with. Two different applications cannot meaningfully use each other's addresses, because those addresses only have meaning within the context of their own application or process, and there is no way for a user-mode application to learn its physical memory addresses without falling back on a kernel-mode driver. This is basically everything you are used to programming (unless you develop drivers).
Kernel mode is protected from user-mode applications. Most hardware drivers run in the context of kernel mode, and essentially all Windows APIs are split into two categories, user and kernel. Kernel-mode drivers use kernel-mode APIs, not user-mode APIs, and hence don't use .dlls (you can't even print to a console, because that is a user-mode API set). Instead they use .sys files, which are drivers and essentially play the same role that .dlls do in user mode. A .sys is in PE format, so it is basically like an .exe, just as a .dll is like an .exe without a main() entry point.
So from the asker's perspective you have two groups:
[kernel/.sys] and [user/.dll or .exe]
There really aren't any .exes in the kernel, because the operating system does everything there, not user programs. When the system or another kernel component starts a driver, it does so by calling its DriverEntry() routine, so I guess that is like main().
So in this sense the question is quite simple.
