MS-Windows scheduler control (or otherwise) -- test application performance on slower CPU? - windows

Is there a tool that lets one control the MS-Windows scheduler (XP SP3 32-bit in my case) so that a target application, which I'd like to test, behaves as if it were running on a slower CPU? Say my physical host is a 2.4GHz dual-core, but I'd like the application to run as if it were on an 800MHz/1.0GHz CPU.
I am aware of some such programs which allowed old DOS games to run slower, but AFAIK, they take the approach of consuming CPU cycles to starve the application. I do not want such a thing, and also would like to have higher precision control on the clock.

I don't believe you'll find software that directly emulates a different CPU. But something like ProcessLasso would let you control a program's CPU usage, which simulates, in a way, a slower clock speed.
I also found this blog entry with many other ways to throttle your CPU: Windows CPU throttling techniques
Additionally, if you have access to VMware, you could set up a resource pool with a limited CPU reservation.
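When a tool throttles by capping CPU time rather than changing the clock, the cap has to be derived from the two clock speeds. The following is a minimal sketch of that arithmetic in Python; `cpu_cap` is a hypothetical helper, and it assumes that performance scales linearly with clock speed, which ignores cache, memory, and pipeline differences. It also shows why the number depends on whether the tool expresses the cap per core or for the whole machine.

```python
def cpu_cap(host_mhz, target_mhz, host_cores=1):
    """Estimate the CPU-time share that approximates target_mhz on the host.

    Assumption: speed scales linearly with clock frequency.
    """
    if not 0 < target_mhz <= host_mhz:
        raise ValueError("target must be positive and no faster than the host")
    if host_cores < 1:
        raise ValueError("host_cores must be at least 1")
    per_core = target_mhz / host_mhz
    return {
        # Cap for a single-threaded program, measured against one core.
        "share_of_one_core": per_core,
        # The same cap when the tool reports usage against all cores.
        "share_of_whole_machine": per_core / host_cores,
    }


cap = cpu_cap(host_mhz=2400, target_mhz=800, host_cores=2)
print(f"{cap['share_of_one_core']:.1%} of one core, "
      f"{cap['share_of_whole_machine']:.1%} of the machine")
```

For the 2.4GHz dual-core host and 800MHz target from the question, this gives roughly a third of one core.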

Related

High bandwidth Networking and the Windows "System Interrupts" Process

I am writing a massive UDP network application.
Running traffic at 10gigabits per second.
I have a very high "System Interrupts" CPU usage in task manager.
Reading about what this means, I see:
What Is the “System Interrupts” Process?
System Interrupts is an official part of Windows and, while it does
appear as a process in Task Manager, it’s not really a process in the
traditional sense. Rather, it’s an aggregate placeholder used to
display the system resources used by all the hardware interrupts
happening on your PC.
Most articles say that a high value corresponds with failing hardware.
However, since the "System Interrupts" entry correlates with high IRQ usage, maybe it should be high given my heavy UDP network usage.
Also, is all of this really happening on one CPU core, or is it an aggregate of everything happening across all CPU cores?
If you have many individual datagrams being sent over UDP, it's certainly going to cause a lot of hardware interrupts, and a lot of CPU usage. 10 Gb is certainly in the range of "lots of CPU" if your datagrams are relatively small.
Each CPU has its own hardware interrupts. You can see how spread out the load is over cores on the performance tab - the red line is the kernel CPU time, which includes hardware interrupts and other low-level socket handling by the OS.
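To put a number on "lots of datagrams", here is a rough Python sketch of the upper bound on datagrams per second for a given payload size. It assumes IPv4 over standard Ethernet without VLAN tags, and `datagrams_per_second` is a hypothetical helper. NICs usually coalesce interrupts, so the actual interrupt rate is typically lower than the datagram rate.

```python
# Bytes on the wire per datagram besides the UDP payload (assumed framing):
# 8 UDP + 20 IPv4 + 14 Ethernet header + 4 FCS + 8 preamble/SFD + 12 gap.
WIRE_OVERHEAD_BYTES = 8 + 20 + 14 + 4 + 8 + 12


def datagrams_per_second(link_bits_per_s, payload_bytes):
    """Upper bound on datagrams per second at full line rate.

    Ignores the minimum Ethernet frame size, so it overestimates the rate
    for payloads under 18 bytes.
    """
    if payload_bytes < 0:
        raise ValueError("payload_bytes must not be negative")
    return link_bits_per_s / (8 * (payload_bytes + WIRE_OVERHEAD_BYTES))


for payload in (64, 512, 1472):
    rate = datagrams_per_second(10e9, payload)
    print(f"{payload:>5} byte payload: {rate:,.0f} datagrams/s")
```

At 10 Gb/s this works out to roughly 9.6 million datagrams per second for 64-byte payloads and about 0.8 million for 1472-byte payloads, which is why small datagrams cost so much more CPU.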

Detecting HyperThreading without CPUID?

I'm working on a number-crunching application and I'm trying to squeeze all possible performance out of it that I can. I'm designing it to work for both Windows and *nix and even for multi-CPU machines.
The way I have it currently set up, it asks the OS how many cores there are, sets affinity on each core to a function that runs a CPUID ASM command (yes, it'll get run multiple times on the same CPU; no biggie, it's just initialization code) and checks for HyperThreading in the Features request of CPUID. From the responses to the CPUID command it calculates how many threads it should run. Of course, if a core/CPU supports HyperThreading it will spawn two on a single core.
However, I ran into a branch case with my own machine. I run an HP laptop with a Core 2 Duo. I replaced the factory processor a while back with a better Core 2 Duo that supports HyperThreading. However, the BIOS does not support it as the factory processor didn't. So, even though the CPU reports that it has HyperThreading it's not capable of utilizing it.
I'm aware that in Windows you can detect HyperThreading by simply counting the logical cores (as each physical HyperThreading-enabled core is split into two logical cores). However, I'm not sure if such a thing is available in *nix (particularly Linux; my test bed).
If HyperThreading is enabled on a dual-core processor, will the Linux function sysconf(_SC_NPROCESSORS_CONF) show that there are four processors or just two?
If I can get a reliable count on both systems then I can simply skip the CPUID-based HyperThreading checking (after all, it's a possibility that it is disabled/not available in BIOS) and use what the OS reports, but unfortunately because of my branch case I'm not able to determine this.
P.S.: In my Windows section of the code I am parsing the return of GetLogicalProcessorInformation()
Bonus points: Anybody know how to mod a BIOS so I can actually HyperThread my CPU ;)? Motherboard is an HP 578129-001 with the AMD M96 chipset (yuck).
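One way to check this on Linux without CPUID is to compare the logical processor count with the number of distinct physical cores listed in /proc/cpuinfo. The Python sketch below assumes the x86 layout of that file (the `processor`, `physical id`, and `core id` fields); other architectures may omit the last two, in which case the core count comes back as 0. `count_cpus` is a hypothetical helper. As far as I know, `_SC_NPROCESSORS_CONF` counts logical processors, so with HyperThreading actually enabled a dual-core would be reported as four.

```python
import os


def count_cpus(cpuinfo_text):
    """Return (logical_processors, physical_cores) from /proc/cpuinfo text."""
    logical = 0
    cores = set()
    physical_id = core_id = None
    # A trailing empty string flushes the last processor block.
    for line in cpuinfo_text.splitlines() + [""]:
        if not line.strip():
            # A blank line ends the block for one logical processor.
            if physical_id is not None and core_id is not None:
                cores.add((physical_id, core_id))
            physical_id = core_id = None
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "processor":
            logical += 1
        elif key == "physical id":
            physical_id = value
        elif key == "core id":
            core_id = value
    return logical, len(cores)


if os.path.exists("/proc/cpuinfo"):
    with open("/proc/cpuinfo") as f:
        logical, physical = count_cpus(f.read())
    print("logical:", logical, "physical cores:", physical)
    if "SC_NPROCESSORS_CONF" in getattr(os, "sysconf_names", {}):
        print("sysconf:", os.sysconf("SC_NPROCESSORS_CONF"))
```

If the logical count is larger than the physical core count, SMT is enabled in practice, regardless of what the CPUID feature flag claims.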

Force windows onto one CPU, and then take over the rest

I've seen various RTOSes that have this strategy that they have windows boot on one or more CPUs and then run realtime programs on the rest of the CPUs. Any idea how this might be accomplished? Can I let the computer boot off two CPUs and then stop execution on the rest of the CPUs? What documentation should I start looking at? I have enough experience with the linux kernel that I might be able to figure out how to do it under linux, so if there's anything that maps onto linux well that you could describe it in terms of, that'd be fantastic.
You can boot Windows on fewer CPUs than available easily. Run msconfig.exe, go to the Boot tab, click the Advanced options... button, check the number of processors box and set the desired number (this is for Windows 7, the exact location for Vista and XP might differ slightly).
But that's just a solution to a very small part of the problem.
You will need to implement a special kernel-mode driver to start those other CPUs (Windows won't let you do that sort of thing from non-kernel-mode code). And you will need to implement a thread scheduler for those CPUs and a bunch of other low-level things... You might also want to steal some physical memory (RAM) from Windows and implement a memory manager for it, and both of those can be very involved.
What to read? The Intel/AMD CPU documentation (specifically the APIC part), the x86 Multiprocessor specification from Intel, books on Windows drivers, Windows Internals books, MSDN, etc.
You can't turn off Windows on one CPU and expect to run your program as usual, because a syscall is serviced by the same CPU that the issuing thread is running on. The syscall handler relies on per-thread data accessible from kernel mode, and hence any thread (user-mode or kernel-mode) can only run once Windows has performed the per-core initialization of that CPU.
It seems likely that you're writing a super-double-mega-awesome app that really-definitely needs to run, like, super-fast and you want everyone else to get off the core, 'cos then, like, you'll be the totally fastest-est, but you're not really appreciating that if Windows isn't on your core, then you can't use ANY part of Windows on that core either.
If you really do want to do this, you'll have to run as a boot-driver. The boot-driver will be able to reserve one of the cores from being initialized during boot, preventing Windows from "seeing" that core. You can then manually construct your own thread of execution to run on that core, but you'll need to handle paging, memory allocation, scheduling, NUMA, NMI exceptions, page-faulting, and ACPI events yourself. You won't be able to call Windows from that core without bluescreening Windows. You'll be on your own.
What you probably want to do is to lock your thread to a single processor (via SetThreadAffinityMask) and then raise the priority of your thread to the maximum value. When you do so, Windows is still running on your core to service things like page faults and hardware interrupts, but no lower-priority user-mode thread will run on that core (they'll all move to other cores unless they are also locked to your processor).
I could not fully understand the question. But if you are asking about scheduling processes onto specific cores, Linux can do this by setting the CPU affinity. See this page:
http://www.kernel.org/doc/man-pages/online/pages/man2/sched_setaffinity.2.html
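As a small illustration of the affinity approach on Linux, the sketch below pins the calling process to one CPU using Python's `os.sched_setaffinity`, which wraps the system call documented at the link above. `pin_to_one_cpu` is a hypothetical helper, and the example is guarded because these functions are not available on every platform. It only sets affinity; raising the priority is a separate step that usually needs elevated privileges.

```python
import os


def pin_to_one_cpu(cpu=None):
    """Pin the calling process to a single CPU and return that CPU's index."""
    allowed = os.sched_getaffinity(0)  # CPUs we are currently allowed to use
    if cpu is None:
        cpu = min(allowed)  # arbitrary choice: lowest-numbered allowed CPU
    if cpu not in allowed:
        raise ValueError(f"CPU {cpu} is not in the allowed set {sorted(allowed)}")
    os.sched_setaffinity(0, {cpu})  # pid 0 means the caller itself
    return cpu


if hasattr(os, "sched_setaffinity"):
    original = os.sched_getaffinity(0)
    chosen = pin_to_one_cpu()
    print("pinned to CPU", chosen, "->", os.sched_getaffinity(0))
    os.sched_setaffinity(0, original)  # restore the previous affinity
```

Note that pinning only keeps this process on one core; it does not keep other processes off that core.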

Emulating a processor's (limited) resources, including clock speed

I would like a software environment in which I can test the speed of my software on hardware with specific resources. For example, how fast does this program run on an 800MHz x86 with 24 MB of RAM, when my host hardware is a 3GHz quad-core amd64 with 12GB of RAM? Emulators such as qemu make a great point of running "almost as fast" as the underlying hardware; I would like to make it run slower. Is there a way to do that?
I have never tried it, but perhaps you could achieve what you want to some extent by combining an emulator like QEMU or VirtualBox on Linux with something like this:
http://cpulimit.sourceforge.net/
If you can limit the CPU time available to the emulator you might be able to simulate the results of execution on a slower computer. Keep in mind, though, that this would only affect the execution speed (or so I hope, anyway).
The CPU instruction set and other system features would remain unchanged. This means that emulating a specific processor accurately would be difficult if not impossible.
In addition, using something like cpulimit, which works using SIGSTOP and SIGCONT to repeatedly stop/restart the emulator process might cause side-effects, such as timing inconsistencies, video display artifacts etc.
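The stop/restart idea can be sketched in a few lines of Python for POSIX systems. This is a fixed duty cycle, which is simpler than what cpulimit does (cpulimit measures the target's actual CPU usage and adapts); `slice_lengths` and `throttle` are hypothetical names, and the 0.1 s period is an arbitrary assumption. The coarse period is also where the timing side-effects mentioned above come from.

```python
import os
import signal
import subprocess
import sys
import time


def slice_lengths(cpu_fraction, period_s=0.1):
    """Split one period into (running, stopped) seconds for a CPU share."""
    if not 0 < cpu_fraction <= 1:
        raise ValueError("cpu_fraction must be in (0, 1]")
    run_s = cpu_fraction * period_s
    return run_s, period_s - run_s


def throttle(pid, cpu_fraction, duration_s, period_s=0.1):
    """Alternately resume and stop pid for duration_s seconds."""
    run_s, stop_s = slice_lengths(cpu_fraction, period_s)
    deadline = time.monotonic() + duration_s
    try:
        while time.monotonic() < deadline:
            os.kill(pid, signal.SIGCONT)
            time.sleep(run_s)
            os.kill(pid, signal.SIGSTOP)
            time.sleep(stop_s)
    except ProcessLookupError:
        return  # the target exited on its own
    finally:
        try:
            os.kill(pid, signal.SIGCONT)  # never leave the target frozen
        except ProcessLookupError:
            pass


if __name__ == "__main__" and hasattr(signal, "SIGSTOP"):
    # Demo: throttle a busy-looping child to about one third for half a second.
    child = subprocess.Popen([sys.executable, "-c", "while True: pass"])
    try:
        throttle(child.pid, cpu_fraction=1 / 3, duration_s=0.5)
    finally:
        child.terminate()
        child.wait()
```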
In your emulator, keep a virtual "clock" and increment it appropriately as you execute each instruction. From there you can simply report how long it took in virtual time to execute, or you can have your emulator sleep now and again to keep execution speed roughly where it would be in the target.
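A minimal Python sketch of such a virtual clock might look like this. `VirtualClock` and the per-instruction cycle costs are hypothetical; a real emulator would take its costs from the target CPU's documentation. The clock counts cycles as instructions execute and, when asked to sync, sleeps for however long the host has run ahead of the target.

```python
import time


class VirtualClock:
    """Track emulated cycles and optionally pace execution to a target speed."""

    def __init__(self, target_hz, sleep=time.sleep, now=time.monotonic):
        self.target_hz = target_hz
        self.cycles = 0
        self._sleep = sleep
        self._now = now
        self._start = now()

    def tick(self, cycles):
        """Account for the cycles consumed by one emulated instruction."""
        self.cycles += cycles

    @property
    def virtual_seconds(self):
        """Time the target CPU would have needed for the work so far."""
        return self.cycles / self.target_hz

    def sync(self):
        """Sleep if the host is ahead of the target; return seconds slept."""
        ahead = self.virtual_seconds - (self._now() - self._start)
        if ahead > 0:
            self._sleep(ahead)
            return ahead
        return 0.0


# Hypothetical cycle costs for a toy instruction set.
CYCLE_COST = {"add": 1, "mul": 3, "load": 4}

clock = VirtualClock(target_hz=800e6)
for op in ["load", "mul", "add"] * 1000:
    clock.tick(CYCLE_COST[op])
clock.sync()
print(f"virtual time: {clock.virtual_seconds:.6f} s")
```

Calling `sync` every few thousand instructions rather than after each one keeps the sleep overhead small while still holding the average speed near the target.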

Decreasing performance of dev machine to match end-user's specs

I have a web application, and my users are complaining about performance. I have been able to narrow it down to JavaScript in IE6 issues, which I need to resolve. I have found the excellent dynaTrace AJAX tool, but my problem is that I don't have any issues on my dev machine.
The problem is that my users' computers are ancient, so timings which are barely noticeable on my machine are perhaps 3-5 times longer on theirs, and suddenly the problem is a lot larger. Is it possible somehow to degrade the performance of my dev machine, or preferably of a VM running on my dev machine, to the specs of my customers' computers?
I don't know of any virtualization solutions that can do this, but I do know that the computer/CPU emulator Bochs allows you to specify a limit on the number of emulated instructions per second, which you can use to simulate slower CPUs.
I am not sure if you can limit the CPU, but in VirtualBox or Parallels, you can limit the memory usage. I assume if you only give it about 128MB then it will be very slow. You can also limit the throughput on the network with a lot of tools. I guess the only thing I am not sure about is the CPU. That's tricky. Curious to know what you find. :)
You could get a copy of VMWare Workstation and choke the CPU of your VM.
With most virtual PC software you can limit the amount of RAM, but you are not able to set the CPU to a slower speed as it does not emulate a CPU, but uses the host CPU.
You could go with some emulation software like Bochs that will let you set up an x86 processor environment.
You may try Fossil Toys
* PC Speed
PC CPU speed monitor / benchmark. With logging facility.
* Memory Load Test
Test application/operating system behaviour under low memory conditions.
* CPU Load Test
Test application/operating system behaviour under high CPU load conditions.
Although it doesn't simulate a specific CPU clock speed.
