Force Windows onto one CPU, and then take over the rest - windows

I've seen various RTOSes that use this strategy: Windows boots on one or more CPUs, and real-time programs run on the rest of the CPUs. Any idea how this might be accomplished? Can I let the computer boot off two CPUs and then stop execution on the rest of the CPUs? What documentation should I start looking at? I have enough experience with the Linux kernel that I might be able to figure out how to do it under Linux, so if there's anything that maps well onto Linux that you could describe it in terms of, that'd be fantastic.

You can easily boot Windows on fewer CPUs than are available. Run msconfig.exe, go to the Boot tab, click the Advanced options... button, check the Number of processors box and set the desired number (this is for Windows 7; the exact location for Vista and XP may differ slightly).
But that's just a solution to a very small part of the problem.
You will need to implement a special kernel-mode driver to start those other CPUs (Windows won't let you do that sort of thing from non-kernel-mode code). You will also need to implement a thread scheduler for those CPUs and a bunch of other low-level things. You may want to take some physical memory (RAM) away from Windows and implement a memory manager for it too, and those two alone can be a very involved undertaking.
What to read? The Intel/AMD CPU documentation (specifically the APIC parts), the Intel MultiProcessor Specification, books on Windows drivers, the Windows Internals books, MSDN, etc.

You can't turn off Windows on one CPU and expect to run your program as usual, because syscalls are serviced by the same CPU that the issuing thread is running on. A syscall relies on kernel-mode-accessible per-thread data to be handled, and hence any thread (user-mode or kernel-mode) can only run on a CPU once Windows has performed its per-core initialization.
It seems likely that you're writing a super-double-mega-awesome app that really-definitely needs to run, like, super-fast and you want everyone else to get off the core, 'cos then, like, you'll be the totally fastest-est, but you're not really appreciating that if Windows isn't on your core, then you can't use ANY part of Windows on that core either.
If you really do want to do this, you'll have to run as a boot-driver. The boot-driver will be able to reserve one of the cores from being initialized during boot, preventing Windows from "seeing" that core. You can then manually construct your own thread of execution to run on that core, but you'll need to handle paging, memory allocation, scheduling, NUMA, NMI exceptions, page-faulting, and ACPI events yourself. You won't be able to call Windows from that core without bluescreening Windows. You'll be on your own.
What you probably want to do is lock your thread to a single processor (via SetThreadAffinityMask) and then raise the priority of your thread to the maximum value. When you do so, Windows still runs on your core to service things like page faults and hardware interrupts, but no lower-priority user-mode thread will run on that core (they'll all move to other cores unless they are also locked to your processor).
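For illustration, a minimal user-mode sketch of that approach (not from the original answer), assuming the Win32 calls SetThreadAffinityMask and SetThreadPriority:

/* Sketch: pin the current thread to logical CPU 0 and raise its priority.
 * Assumes a plain Win32 build linked against kernel32. */
#include <windows.h>
#include <stdio.h>

int main(void)
{
    HANDLE thread = GetCurrentThread();

    /* Bit 0 of the affinity mask = logical processor 0. */
    if (SetThreadAffinityMask(thread, 1) == 0) {
        printf("SetThreadAffinityMask failed: %lu\n", GetLastError());
        return 1;
    }

    /* Highest thread priority without entering the realtime priority class. */
    if (!SetThreadPriority(thread, THREAD_PRIORITY_TIME_CRITICAL)) {
        printf("SetThreadPriority failed: %lu\n", GetLastError());
        return 1;
    }

    /* ... the latency-sensitive work runs here, mostly undisturbed ... */
    return 0;
}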

I could not understand the question completely, but if you are asking about scheduling processes onto specific cores, Linux can accomplish this using CPU affinity. Follow this page:
http://www.kernel.org/doc/man-pages/online/pages/man2/sched_setaffinity.2.html
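A minimal sketch of that call, assuming glibc's cpu_set_t macros (pinning the calling process to CPU 0):

/* Sketch: restrict the calling process to logical CPU 0 via sched_setaffinity(2). */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    CPU_SET(0, &set);                       /* allow only logical CPU 0 */

    /* A pid of 0 means "the calling thread". */
    if (sched_setaffinity(0, sizeof(set), &set) != 0) {
        perror("sched_setaffinity");
        return 1;
    }

    /* ... work now stays on CPU 0 ... */
    return 0;
}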

Related

Immediate and latent effects of modifying my own kernel binary at runtime?

I'm more of a web developer and database guy, but severely inconvenient performance issues relating to kernel_task and temperature on my personal machine have made me interested in digging into the details of my Mac OS (I noticed some processes would trigger long-lasting spikes in kernel_task, despite consistently low CPU temperature and a newly re-imaged machine).
I am a root user on my own OS X machine. I can read /System/Library/Kernels/kernel. My understanding is that this is the Mach/XNU kernel of this machine (although I don't know a lot about those, and I'm surprised that it's only 13 MB).
What happens if I modify or delete /System/Library/Kernels/kernel?
I imagine that, since this is at run time, things might be okay until I try to reboot. If this is the case, would carefully modifying this file change the behavior of my OS, taking effect only on reboot, presuming it didn't cause a kernel panic? (Or is a kernel panic only a Linux thing?)
What happens if I modify or delete /System/Library/Kernels/kernel?
First off, you'll need to disable SIP (System Integrity Protection) in order to be able to modify or delete this file, as it's protected even from the root user by default for security reasons.
If you delete it, your system will no longer boot. If you replace it with a different xnu kernel, that kernel will in theory boot next time, assuming it's sufficiently matched to both the installed device drivers and other kexts, and the OS userland.
Note that you don't need to delete/replace the kernel file to boot a different one, you can have more than one installed at a time. For details, see the documentation that comes with Apple's Kernel Debug Kits (KDKs) which you can download from the Apple Developer Downloads Area.
I imagine that, since this is at run time, things might be okay until I try to reboot.
Yes, the kernel is loaded into memory by the bootloader early on during the boot process; the file isn't used past that, except for producing prelinked kernels when your device drivers change.
Finally, I feel like I should explain a little about what you actually seem to be trying to diagnose/fix:
but severely inconvenient performance issues relating to kernel_task and temperature on my personal machine have made me interested in digging into the details of my Mac OS
kernel_task runs more code than just the kernel core itself. Specifically, any kexts that are loaded (see kextstat command) - and there are a lot of those on a modern macOS system - are loaded into kernel space, meaning they are counted under kernel_task.
Long-running spikes of kernel CPU usage sound like they might be caused by file system self-maintenance, or volume encryption/decryption activity. They are almost certainly not basic programming errors in the xnu kernel itself. (Although I suppose stupid mistakes are easy to make.)
Other possible culprits are device drivers; GPU drivers especially are incredibly complex pieces of software, and of course they are busy even when your system is seemingly idle.
The first step to dealing with this problem - if there indeed is one - would be to find out what the kernel is actually doing with those CPU cycles. For that you'd want to do some profiling and/or tracing, and doing this on the running kernel most likely requires SIP to be disabled again. The Instruments.app that ships with Xcode is able to profile processes; I'm not sure whether it's still possible to profile kernel_task with it, but I think it at least used to be possible in earlier versions. Another possible option is DTrace. (There are entire books written on this topic.)

Programmatically detect if hardware virtualization is enabled on Windows 7

Background
I've been bouncing this around for a while and still haven't come up with an adequate solution; I'm hoping someone out there can point me in the right direction.
Essentially I need to identify whether I can run a 64-bit VM on a target machine (I'm working in Go, but happy to consider binding C code or some assembly, though I feel a bit out of my depth there).
In order to run a 64-bit VM, the system needs hardware virtualization support available and enabled in the BIOS (I'm only concerned with Intel/AMD at this time).
Journey so far
From Windows 8 onwards, Windows ships with Hyper-V, and there is a nice function in kernel32.dll, IsProcessorFeaturePresent, which you can call with an argument of PF_VIRT_FIRMWARE_ENABLED to find out whether hardware virtualization is enabled in firmware:
IsProcessorFeaturePresent
Now, I don't really like the way this behaves (it reports "not available" if Hyper-V is installed), but I can cope with that by checking whether Hyper-V is enabled through other means, so this pretty much does the job from Win8 upwards.
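For reference, a minimal sketch of that Win8+ check in C (the fallback #define is only needed when building against pre-Windows 8 SDK headers and uses the value from newer winnt.h):

#include <windows.h>
#include <stdio.h>

#ifndef PF_VIRT_FIRMWARE_ENABLED
#define PF_VIRT_FIRMWARE_ENABLED 21     /* value from newer SDK headers */
#endif

int main(void)
{
    /* Only meaningful on Windows 8+; reports FALSE once Hyper-V owns VT-x/AMD-V. */
    BOOL enabled = IsProcessorFeaturePresent(PF_VIRT_FIRMWARE_ENABLED);
    printf("Virtualization enabled in firmware: %s\n", enabled ? "yes" : "no");
    return 0;
}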
The problem is that this function always returns false on Win7 for some reason - even on a system on which I know hardware virtualization is enabled.
Coming from another angle, I have used this lib to determine what instruction sets are available: the Intel processor feature lib. This lets me know what type of virtualization instructions are available on the processor (if any).
But I'm still missing the final piece: knowing whether it's enabled in the BIOS on Win7. I figure in principle it should be easy from here - I should be able to call something which uses the virtualization extensions and see if it responds as expected. But unfortunately I have no idea how to do this.
Does anyone have any suggestions as to how I might do this?
Note: I'm happy to consider 3rd-party libs, but this would be used in commercial software, so licensing would have to allow for that (e.g. nothing from Microsoft).
I am afraid you won't be able to achieve what you want unless you are ready to provide a kernel driver, because checking whether the BIOS has enabled virtualization requires kernel privileges.
The Intel Software Developer's Manual describes a model-specific register (MSR) with number 3Ah, called IA32_FEATURE_CONTROL. Its bits 1 and 2 control whether VMX instructions are allowed in SMX and non-SMX modes. There is also bit zero which, when written with 1, locks the whole register's value, making it impossible to enable or disable these features until the next processor reset. This means that if the BIOS code has disabled VMX and locked the register, an OS that boots later will be unable to change that fact, only to observe it.
To read this or any other MSR, one has to use the machine instruction RDMSR, and this instruction is only available when CPL is zero, that is, inside the OS kernel. It will raise an exception if executed from application code.
Unless you find a programming interface that wraps RDMSR and exposes it to applications, you are out of luck. Typically that implies loading and running a dedicated kernel driver. I am aware of one for Linux, but cannot say whether there is anything similar for Windows.
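To make the shape of that concrete, here is a rough sketch of the bit test such a driver would perform, assuming the MSVC __readmsr intrinsic; it must run at CPL 0 (e.g. inside a Windows kernel-mode driver), and the helper name is made up:

#include <intrin.h>

#define IA32_FEATURE_CONTROL 0x3A

/* Hypothetical helper: returns nonzero if firmware left VMX usable. */
int VmxUsable(void)
{
    unsigned long long ctl = __readmsr(IA32_FEATURE_CONTROL);

    int locked          = (int)(ctl >> 0) & 1;  /* bit 0: register locked until reset */
    int vmx_in_smx      = (int)(ctl >> 1) & 1;  /* bit 1: VMX allowed inside SMX      */
    int vmx_outside_smx = (int)(ctl >> 2) & 1;  /* bit 2: VMX allowed outside SMX     */

    /* If the BIOS locked the register with both enable bits clear,
     * VMX is off and cannot be enabled until the next reset. */
    if (locked)
        return vmx_in_smx || vmx_outside_smx;

    return 1;   /* not locked: the OS could still enable VMX itself */
}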
As an extra note: if your code is already running inside a virtual machine, as it is on some Windows installations which enable a Hyper-V environment for the regular desktop, then you won't even be able to see the actual host MSR value. It will be up to the VMM to provide you with an emulated value, just as it will show you whatever CPUID values it wants you to see, not the ones from the host.

Can a Linux kernel run as an ARM TrustZone secure OS?

I am trying to run a Linux kernel as the secure OS on a TrustZone-enabled development board (Samsung Exynos 4412). I know somebody would say a secure OS should be small and simple, but I just want to try. And if it is possible, then writing or porting a trustlet application to this secure OS should be easy, especially for applications with a UI (trusted UI).
I bought the development board with a runnable secure OS based on xv6, and the normal OS is Android (Android version 4.2.2, kernel version 3.0.15). I have tried to replace the simple secure OS with the Android Linux kernel; that is, with a little assembly code up front (such as clearing the NS bit of the SCR register), I directly call the Linux kernel entry point (with the necessary kernel tagged list passed in).
The kernel decompression code executes correctly and the first C function of the kernel, start_kernel(), is also executed. Almost all the initialization functions run well until execution reaches calibrate_delay(). This function waits for jiffies to change:
/* wait for "start of" clock tick */
ticks = jiffies;
while (ticks == jiffies);
I guess the reason is that no clock interrupt is being generated (I print logs in the clock interrupt callback functions, and they are never reached). I have checked the CPSR state before and after the local_irq_enable() function; the IRQ and FIQ bits are set correctly. I also print some logs in the Linux kernel's IRQ handler defined in the interrupt vector table. Nothing is logged.
I know there may be some differences in the interrupt system between the secure world and the non-secure world, but I can't find those differences in any documentation. Can anybody point them out? And the most important question: as Linux is a very complicated OS, can the Linux kernel run as a TrustZone secure OS at all?
I am a newbie in Linux kernel and ARM TrustZone. Please help me.
Running Linux as a secure-world OS should work by default; i.e., the secure-world supervisor is the most trusted mode and can easily transition to the other modes. The secure world is an operating concept of the ARM CPU.
Note: Just because Linux runs in the secure world, doesn't make your system secure! TrustZone and the secure world are features that you can use to make a secure system.
I just want to try. And if it is possible, then writing or porting a trustlet application to this secure OS should be easy, especially for applications with a UI (trusted UI).
TrustZone allows partitioning of software. If you run both Linux and the trustlet application in the same world, there is no benefit; the trustlet is then just a normal application.
The normal model for trustlets is to set up a monitor vector page at boot and lock down physical access. The Linux kernel can then use the smc instruction to call routines in the trustlet to access DRM-type functionality, decrypt media, etc. In this model, Linux runs as the normal-world OS, but can call limited functionality within the secure world through the SMC API you define.
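As a rough sketch of what such a normal-world call might look like on ARMv7 with GCC inline assembly (the function ID and the argument convention are hypothetical; they are entirely defined by your monitor/secure-world code):

#define SECURE_FN_DECRYPT 0x83000001u       /* hypothetical secure-world function ID */

static inline unsigned int smc_call(unsigned int fn, unsigned int arg)
{
    register unsigned int r0 asm("r0") = fn;
    register unsigned int r1 asm("r1") = arg;

    asm volatile(
        ".arch_extension sec\n"
        "smc #0\n"                          /* trap into Monitor mode */
        : "+r"(r0)
        : "r"(r1)
        : "r2", "r3", "memory");

    return r0;                              /* whatever the secure world returns in r0 */
}

Normal-world kernel code would then call, say, smc_call(SECURE_FN_DECRYPT, buffer_handle) and interpret the return value according to whatever SMC API the secure world defines.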
Almost all the initialization functions run well until execution reaches calibrate_delay().
This is symptomatic of non-functioning interrupts. calibrate_delay() runs a tight loop while waiting for a tick count to increase via system timer interrupts. If you are running in the secure world, you may need to route interrupts there. The GICD_ISPENDR registers can be used to force an interrupt pending; you may use this to verify that the ARM GIC is functioning properly. Also, the kernel command-line option lpj=XXXXX (where XXXXX is some number) can skip this step.
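As an illustration of the GICD_ISPENDR idea, here is a rough sketch against a GICv2 distributor; gicd_base is a hypothetical, already-mapped (e.g. ioremap'd) pointer to the distributor registers:

/* GICv2: the GICD_ISPENDRn registers start at distributor offset 0x200; each
 * covers 32 interrupt IDs and is write-1-to-set. */
#define GICD_ISPENDR_BASE 0x200u

static void gic_force_pending(volatile unsigned char *gicd_base, unsigned int irq)
{
    volatile unsigned int *reg = (volatile unsigned int *)
        (gicd_base + GICD_ISPENDR_BASE + (irq / 32) * 4);

    *reg = 1u << (irq % 32);                /* mark interrupt `irq` as pending */
}

If a forced interrupt reaches your handler, the GIC path works and the problem is more likely in the timer setup.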
Most likely some peripheral interrupt routing, clock configuration or other interrupt/timer initialization is done by a boot loader in a normal system and you are missing this. Booting a new board is always difficult; even more so with TrustZone.
There is nothing technically preventing Linux from running in the Secure state of an ARM processor. But it defeats the entire purpose of TrustZone. A large, complex kernel and OS such as Linux is infeasible to formally verify to the point that it can be considered "Secure".
For further reading on that aspect, see http://www.ok-labs.com/whitepapers/sample/sel4-formal-verification-of-an-os-kernel
As for the specific problem you are facing - there should be nothing special about interrupt handling in Secure vs. Non-secure state (unless you explicitly configure it to be different). But it could be that the secure OS you have removed was performing some initial timer initializations that are now not taking place.
Also 3.0.15 is an absolutely ancient kernel - it was released 2.5 years ago, based on something released over 3 years ago.
There are several issues with what you are saying that need to be cleared up. First, are you trying to get the secure-world kernel running or the normal-world kernel? You said you wanted to run Linux in the SW and Android in the NW, but your question states, "I have tried to replace the simple secure OS with the Android Linux kernel". So which kernel are you having problems with?
Second, you mentioned clearing the NS-bit. This doesn't really make sense. If the NS-bit is set, clearing it means that you were already running in the NW (as the set bit would indicate), executed an SMC instruction, switched to Monitor mode, set the NS-bit to 0, and then restored the SW registers. Is this the case?
As far as interrupts are concerned, have you properly initialized the vector base address register for each execution mode, i.e. the Secure-world VBAR, the Normal-world VBAR, and MVBAR? All three need to be set up, in addition to setting the proper values in the NSACR and other registers, to ensure interrupts are routed to the correct execution world and not all simply handled by the SW. You also need separate exception vector tables and handlers for all three modes. You might be able to get away with only one set initially, but once you partition your memory system using the TZASC, you will need to separate everything.
TZ requires a lot of configuration and is not simply handled by setting or clearing the NS-bit. For the Exynos 4412, there are numerous TZ control registers that must be properly set in order to execute in the NW. Unfortunately, none of the information on them is covered in the public version of the User's Guide. You need the full version in order to get all the values and addresses necessary to actually run a SW and NW kernel on this processor.

Detecting HyperThreading without CPUID?

I'm working on a number-crunching application and I'm trying to squeeze all possible performance out of it that I can. I'm designing it to work for both Windows and *nix and even for multi-CPU machines.
The way I currently have it set up, it asks the OS how many cores there are, sets affinity on each core for a function that runs the CPUID instruction (yes, it'll get run multiple times on the same CPU; no biggie, it's just initialization code) and checks for HyperThreading in the feature flags returned by CPUID. From the responses to the CPUID instruction it calculates how many threads it should run. Of course, if a core/CPU supports HyperThreading, it will spawn two threads on a single core.
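Roughly, the check looks like this (a sketch using GCC/Clang's <cpuid.h>; the HTT flag is CPUID leaf 1, EDX bit 28, and it only says the package can expose multiple logical processors):

#include <cpuid.h>
#include <stdio.h>

int main(void)
{
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
        printf("CPUID leaf 1 not supported\n");
        return 1;
    }

    /* EDX bit 28 = HTT: the package supports multiple logical processors. */
    printf("HTT flag: %s\n", (edx & (1u << 28)) ? "set" : "clear");
    return 0;
}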
However, I ran into a corner case with my own machine. I run an HP laptop with a Core 2 Duo. I replaced the factory processor a while back with a better Core 2 Duo that supports HyperThreading. However, the BIOS does not support it, as the factory processor didn't. So even though the CPU reports that it has HyperThreading, it's not capable of utilizing it.
I'm aware that in Windows you can detect HyperThreading by simply counting the logical cores (as each physical HyperThreading-enabled core is split into two logical cores). However, I'm not sure if such a thing is available in *nix (particularly Linux; my test bed).
If HyperThreading is enabled on a dual-core processor, will the Linux function sysconf(_SC_NPROCESSORS_CONF) report four processors or just two?
If I can get a reliable count on both systems then I can simply skip the CPUID-based HyperThreading check (after all, it's possible that it is disabled or unavailable in the BIOS) and use what the OS reports, but unfortunately, because of my corner case, I'm not able to determine this.
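This is the kind of Linux-side query I have in mind (as I understand it, these counters report logical processors, so a dual-core with HT active should report 4 and one with HT off should report 2):

#include <unistd.h>
#include <stdio.h>

int main(void)
{
    long configured = sysconf(_SC_NPROCESSORS_CONF);   /* processors the OS knows about */
    long online     = sysconf(_SC_NPROCESSORS_ONLN);   /* processors currently online   */

    printf("configured: %ld, online: %ld\n", configured, online);
    return 0;
}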
P.S.: In my Windows section of the code I am parsing the return of GetLogicalProcessorInformation()
Bonus points: Anybody know how to mod a BIOS so I can actually HyperThread my CPU ;)? Motherboard is an HP 578129-001 with the AMD M96 chipset (yuck).

MS-Windows scheduler control (or otherwise) -- test application performance on slower CPU?

Is there a tool which allows one to control the MS-Windows (XP SP3 32-bit in my case) scheduler, such that a target application (which I'd like to test) operates as if it were running on a slower CPU? Say my physical host is a 2.4 GHz dual-core, but I'd like the application to run as if it were on an 800 MHz/1.0 GHz CPU.
I am aware of some such programs which allowed old DOS games to run slower, but AFAIK they take the approach of consuming CPU cycles to starve the application. I do not want such a thing, and would also like higher-precision control over the clock.
I don't believe you'll find software that directly emulates different CPUs, but something like Process Lasso would let you control a program's CPU usage, thus simulating, in a way, a slower clock speed.
I also found this blog entry with many other ways to throttle your CPU: Windows CPU throttling techniques
Additionally, if you have access to VMware, you could set up a resource pool with a limited CPU reservation.
