I want to know whether timer interrupts, i.e. what we see with "cat /proc/interrupts | grep timer", are NMIs or something else.
I know that the watchdog timer is an NMI.
Normally this depends on the hardware, not the kernel; you should look at the documentation of your CPU. Most of the time there are only a few NMI lines available on a CPU, and they should be used as little as possible (only for life-threatening interrupts, like the watchdog timer :)).
And a quite useful read about NMIs: http://www.ganssle.com/articles/anmi.htm
I have a textbook statement that says disabling interrupts is not recommended in a multi-processor system and would take too much time. But I don't understand this; can anyone show me how a multi-processor system would go about disabling interrupts? Thanks
On x86 (and other architectures, AFAIK), enabling/disabling interrupts is per-core. You can't globally disable interrupts on all cores.
Software can communicate between cores with inter-processor interrupts (IPIs) or atomic shared variables, but even so it would be massively expensive to arrange for all cores to sit in a spin loop waiting for a notification from this core that they can re-enable interrupts. (Interrupts are disabled on the other cores, so you can't send them an IPI to let them know when you're done with your block of atomic operations.) You have to interrupt whatever all 7 other cores (e.g. on an 8-way SMP system) are doing, with many cycles of round-trip communication overhead.
It's basically ridiculous. It would be clearer to just say you can't globally disable interrupts across all cores, and that it wouldn't help anyway for anything other than interrupt handlers. It's theoretically possible, but it's not just "slow", it's impractical.
Disabling interrupts on one core doesn't make something atomic if other threads are running on other cores. Disabling interrupts works on uniprocessor machines because it makes a context-switch impossible. (Or it makes it impossible for the same interrupt handler to interrupt itself.)
But I think my confusion is that for me the difference between 1 core and 8 core is not a big number for me; why disabling all of them from interrupt is time consuming.
Anything other than uniprocessor is a fundamental qualitative difference, not quantitative. Even a dual-core system, like early multi-socket x86 and the first dual-core-in-one-socket x86 systems, completely changes your approach to atomicity. You need to actually take a lock or something instead of just disabling interrupts. (Early Linux, for example, had a "big kernel lock" that a lot of things depended on, before it had fine-grained locking for separate things that didn't conflict with each other.)
The fundamental difference is that on a UP system, only interrupts on the current CPU can cause things to happen asynchronously to what the current code is doing. (Or DMA from devices...)
On an SMP system, other cores can be doing their own thing simultaneously.
For multithreading, getting atomicity for a block of instructions by disabling interrupts on the current CPU is completely ineffective; threads could be running on other CPUs.
For atomicity of something in an interrupt handler, if this IRQ is set up to only ever interrupt this core, disabling interrupts on this core will work, because there's no threat of interference from other cores (assuming nothing running on another core touches the same data).
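To make that concrete, here's a minimal Linux-kernel-flavored sketch (the counter and function names are made up) contrasting the two approaches: disabling local interrupts only fends off this core's own interrupt handlers, while a spinlock taken with interrupts disabled is what actually gives SMP-safe atomicity.

    #include <linux/spinlock.h>
    #include <linux/irqflags.h>

    static DEFINE_SPINLOCK(counter_lock);
    static unsigned long counter;

    /* UP-style thinking: blocks interrupts on *this* core only.
       Another core can still run this function (or an IRQ handler)
       at the same time, so it is not atomic on SMP. */
    static void bump_irq_off_only(void)
    {
        unsigned long flags;

        local_irq_save(flags);
        counter++;
        local_irq_restore(flags);
    }

    /* SMP-safe: the spinlock excludes other cores, and disabling local
       interrupts also excludes an interrupt handler on this core that
       might take the same lock. */
    static void bump_locked(void)
    {
        unsigned long flags;

        spin_lock_irqsave(&counter_lock, flags);
        counter++;
        spin_unlock_irqrestore(&counter_lock, flags);
    }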
Suppose we're talking about a cloud Linux server.
For a project I have, how bad would it be to modify the timer interrupt so that on each tick the processor also checks 1-4 cached dwords?
Would that make the system totally unstable? Much slower?
Second, is the timer interrupt anywhere near the CPU's clock rate, or much slower?
(The system timer, not the RTC.)
Bad.
An OS does a lot of things on a timer interrupt. It sounds like what you are proposing to add is insignificant by comparison, but I still wouldn't recommend adding it to the timer interrupt handler itself. Interrupt handlers are tricky business.
You should use the mechanisms already in place in the kernel to schedule your task to run. (Sorry I can't be more specific, but if you are seriously considering changing a fundamental interrupt handler, then you should have no trouble figuring it out.)
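To give a flavor of what those in-kernel mechanisms look like, here's a rough sketch, assuming a Linux kernel module and the ordinary timer API (the names are made up): it runs a small check roughly once per tick without touching the timer interrupt handler itself.

    #include <linux/module.h>
    #include <linux/timer.h>
    #include <linux/jiffies.h>

    static struct timer_list poll_timer;

    static void poll_fn(struct timer_list *t)
    {
        /* The hypothetical "check 1-4 cached dwords" would go here.
           This runs in softirq context: keep it short and never sleep. */
        mod_timer(&poll_timer, jiffies + 1);   /* re-arm for the next tick */
    }

    static int __init poll_init(void)
    {
        timer_setup(&poll_timer, poll_fn, 0);
        mod_timer(&poll_timer, jiffies + 1);
        return 0;
    }

    static void __exit poll_exit(void)
    {
        del_timer_sync(&poll_timer);
    }

    module_init(poll_init);
    module_exit(poll_exit);
    MODULE_LICENSE("GPL");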
I'm having a hard time understanding this.
How does the scheduler know that a certain period of time has passed?
Does it use some sort of syscall or interrupt for that?
What's the point of using the constant HZ instead of seconds?
What does the system timer have to do with the scheduler?
How does the scheduler know that a certain period of time has passed?
The scheduler consults the system clock.
Does it use some sort of syscall or interrupt for that?
Since the system clock is updated frequently, it suffices for the scheduler to just read its current value. The scheduler is already in kernel mode so there is no syscall interface involved in reading the clock.
Yes, there are timer interrupts that trigger an ISR, an interrupt service routine, which reads hardware registers and advances the current value of the system clock.
What's the point of using the constant HZ instead of seconds?
Once upon a time there was significant cost to invoking the ISR, and on each invocation it performed a certain amount of bookkeeping, such as checking whether the scheduler quantum had expired and firing TCP RTO retransmit timers. The hardware had limited flexibility and could only invoke the ISR at fixed intervals, e.g. every 10 ms if HZ is 100. Higher HZ values made it more likely the ISR would run and find nothing to do, no events having occurred since the previous run, in which case the ISR was pure overhead, cycles stolen from a foreground user task. Lower HZ values hurt dispatch latency, leading to sluggish network and interactive response times. The HZ tuning tradeoff tended to wind up somewhere near 100 or 1000 on practical hardware.

APIs that reported system clock time could only do so in units of ticks, where each ISR invocation advances the clock by one tick, so callers needed to know the value of HZ in order to convert from tick units to S.I. units. Modern systems perform network tasks on a separately scheduled TCP kernel thread, and may support tickless kernels, which discard many of these outdated assumptions.
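As a concrete Linux-flavored illustration of that last point (the helper function here is hypothetical): kernel code with an interval measured in ticks either divides by HZ itself or uses the conversion helpers that encapsulate HZ.

    #include <linux/jiffies.h>
    #include <linux/printk.h>

    /* Hypothetical helper: report an interval the kernel measured in ticks. */
    static void report_interval(unsigned long ticks)
    {
        unsigned int ms = jiffies_to_msecs(ticks);   /* ticks -> milliseconds */

        pr_info("%lu ticks = %u ms (HZ = %d)\n", ticks, ms, HZ);
    }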
What does the system timer have to do with the scheduler?
The scheduler runs when the system timer fires an interrupt.
The nature of a pre-emptive scheduler is it can pause "spinning" usermode code, e.g. while (1) {}, and manipulate the run queue, even on a single-core system.
Additionally, the scheduler runs when a process voluntarily gives up its time slice, e.g. when issuing syscalls or taking page faults.
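A tiny userspace sketch of the difference (purely illustrative): a spin loop only loses the CPU when the timer interrupt fires, whereas a syscall such as sched_yield() hands control to the scheduler voluntarily.

    #include <sched.h>
    #include <stdio.h>

    int main(void)
    {
        /* Pure spin: this never enters the kernel on its own, so only the
           periodic timer interrupt lets the scheduler preempt it. */
        for (volatile unsigned long i = 0; i < 100000000UL; i++)
            ;

        /* A syscall is a voluntary entry into the kernel; the scheduler can
           run right away instead of waiting for the next tick. */
        sched_yield();

        puts("done");
        return 0;
    }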
I would like to get my Core i7 CPU to enter sleep state just momentarily, for one millisecond or so from a batch file or executable.
I know sleep can be induced with SetSuspendState, but I'm looking for a solution that does not put the entire system to sleep, but just the CPU momentarily.
The CPU is a Core i7 3632QM, and the OS is Windows 7 and 10.
Thanks
Based on your comment about defeating some kind of shutdown every 30 mins, it sounds like you need the whole CPU (all cores) to sleep. We need much more detail on that to do more than guess about which sleep states will serve your purpose and which won't.
Based on comments, it's likely that ACPI S3 sleep will be needed. Ross's comment about the hardware supporting an S1 sleep didn't mention an S2 (CPU actually powered down), so it's probably not even possible to power down just the CPU.
So your best bet is to look into programmatically doing a sleep/wake cycle, which is possible on at least some hardware. On Linux, the rtcwake command has an option to do that. I assume it programs a wakeup time into the BIOS's NVRAM before initiating a sleep. (I think there are only a few commonly-used formats/locations for storing this, so there's a good chance it's possible on your computer.)
Try a Google search for wake up laptop at a certain time or something to find a Windows equivalent of rtcwake. I didn't look at any of the hits, but they look promising.
I'm not an expert at this system power-state management stuff, but you probably need the system to enter an ACPI sleep state. S3 is the usual "suspend to RAM"; OSes that support suspend usually use this as their non-hibernate option.
For your use, maybe S1 or S2 will do (anything less than this, like CPU power-saving C-states, probably won't be sufficient, especially not states that are just per-core).
ACPI global sleep states (from Wikipedia). Systems are not required to implement all levels.
S1, Power on Suspend (POS): Processor caches are flushed, and the CPU(s) stops executing instructions. The power to the CPU(s) and RAM is maintained. Devices that do not indicate they must remain on may be powered off.
S2: CPU powered off. Dirty cache is flushed to RAM.
S3, commonly referred to as Standby, Sleep, or Suspend to RAM (STR): RAM remains powered, but hard drives and most other devices power down.
S4, hibernate
I'm not going to try to write the Windows API function calls to do this. I wouldn't be surprised if there's a program for requesting that Windows enter the S1 or S2 state (ideally with some kind of triggered wakeup).
@RossRidge says that the HM70 chipset does implement S1 sleep (and implies that it doesn't support an S2 sleep). Since S1 doesn't power down the CPU, it may not reset the timer. Even a hypothetical S2 sleep might not do the trick, because the timer may be external to the CPU and/or managed by the BIOS.
Software exists to program the BIOS to wake at a certain time. That's one possible way to trigger coming out of suspend. So it might be possible to write a script that programs a wakeup time for 2 seconds in the future, then initiates a sleep.
@MargaretBloom comments that Chapter 14 of the Intel manuals enumerates all the power-management capabilities (see the x86 tag wiki for links), and also that a totally different workaround may be possible using SMM.
Re: your followup question, which was downvoted into oblivion:
enter sleep state just momentarily, for one millisecond
1 ms is about 3 million core clock cycles (at ~3 GHz). That's not momentary for a computer, especially from an asm programming perspective.
You definitely don't want to write assembly by hand to enter these states. Instead, use your OS's existing ACPI interface. This is a big part of the reason that everyone downvoted the crap out of your followup question.
Other than short per-core sleeps from mwait, pause, and hlt insns, the OS needs to know what's going on. For more about pause, see this. There aren't specific instructions to enter deeper sleeps anyway; you program ACPI by writing to device registers in MMIO space.
When all cores are HLTed at the same time, the whole CPU can opportunistically power down more stuff until the next timer or other interrupt wakes it up again (this is, or at least is related to, ACPI C-states, as I understand it). But this happens all the time during normal operation, because modern OSes run HLT on cores that are idle. The only interesting thing you could do here is get the CPU to sleep like this occasionally even if the system was running some CPU-intensive processes (e.g. with some threads at non-idle priority that run HLT in a loop). Since HLT is a privileged instruction, this would require a kernel thread or a syscall. You probably can't actually raise the priority of the system idle process so it steals time from other runnable processes.
This may be an oversimplification: I haven't looked at kernel idle tasks recently to see if they still just run HLT when they want the current core to sleep until the next interrupt. For a while (when CPU power management was in its infancy) idle loops used to run some other stuff to enter a low-power C-state. But HLT may do that now.
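For what it's worth, a rough sketch of that kernel-thread idea, assuming a Linux module (illustrative only, not something I'd actually deploy):

    #include <linux/module.h>
    #include <linux/kthread.h>
    #include <linux/sched.h>
    #include <linux/err.h>

    static struct task_struct *hlt_thread;

    static int hlt_loop(void *data)
    {
        while (!kthread_should_stop()) {
            /* HLT is privileged, so this only works in ring 0.  The core
               sleeps until the next interrupt (e.g. the timer tick), then
               we loop around and halt again. */
            asm volatile("sti; hlt" ::: "memory");
            cond_resched();
        }
        return 0;
    }

    static int __init hlt_demo_init(void)
    {
        hlt_thread = kthread_run(hlt_loop, NULL, "hlt_demo");
        return IS_ERR(hlt_thread) ? PTR_ERR(hlt_thread) : 0;
    }

    static void __exit hlt_demo_exit(void)
    {
        kthread_stop(hlt_thread);
    }

    module_init(hlt_demo_init);
    module_exit(hlt_demo_exit);
    MODULE_LICENSE("GPL");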
Userspace processes are scheduled by the kernel scheduler to get processor time, but how do the different kernel tasks get CPU time? I mean, when no userspace process is requiring CPU time (so the CPU is idle, executing NOP instructions) but some kernel subsystem needs to carry out some task regularly, are timers and other hardware and software interrupts the common methods of getting CPU time in kernel space?
It's pretty much the same scheduler. The only difference I can think of is that kernel code has much more control over execution flow; for example, it can call the scheduler directly via schedule().
Also, in the kernel you have three execution contexts: hardware interrupt, softirq/bottom half, and process. In hard (and soft) interrupt context you can't sleep, so scheduling is not done while executing code in those contexts.
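To make that concrete: kernel subsystems that need to do regular work usually defer it out of interrupt context into something schedulable, such as a workqueue. A minimal sketch, assuming a Linux module (the names are made up):

    #include <linux/module.h>
    #include <linux/workqueue.h>

    static struct delayed_work demo_work;

    static void demo_work_fn(struct work_struct *work)
    {
        /* Runs in process context (a kworker thread), so sleeping is allowed. */
        pr_info("periodic kernel task ran\n");
        schedule_delayed_work(&demo_work, HZ);   /* re-arm: run again in ~1 s */
    }

    static int __init demo_init(void)
    {
        INIT_DELAYED_WORK(&demo_work, demo_work_fn);
        schedule_delayed_work(&demo_work, HZ);
        return 0;
    }

    static void __exit demo_exit(void)
    {
        cancel_delayed_work_sync(&demo_work);
    }

    module_init(demo_init);
    module_exit(demo_exit);
    MODULE_LICENSE("GPL");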