what happens to cache and DRAM when executing "a=5"?

If a process writes an immediate operand to an address:
int a;
a = 5;
what happens to the L1 data cache and DRAM?
Is "5" written to DRAM first, or to the L1 data cache first?

The compiler assigns a memory address to variable a (unless it keeps a in a register, in which case the store may never reach the cache at all). When a = 5 is executed on a multiprocessor system, the core issues a coherence request that invalidates any copies of that cache line held by other caches and gives the executing core the line in an exclusive, writable coherence state. The value 5 is then written to the L1 data cache. With a write-back cache, which is the common case, DRAM is not updated at this point; it receives the new value only later, when the modified line is evicted or explicitly written back. With a write-through cache, the store is also propagated to the next level of the hierarchy right away.
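To make the ordering concrete, here is a minimal toy model of that store, not a description of any real CPU. It assumes a single cache line holding one int and ignores the line fill and coherence traffic; the names toy_memory, toy_store and toy_evict are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: one L1 line holding one int, backed by one DRAM word. */
enum write_policy { WRITE_BACK, WRITE_THROUGH };

struct toy_memory {
    int  dram;      /* value currently stored in DRAM */
    int  l1;        /* value currently stored in the L1 line */
    bool l1_valid;  /* the line is present in L1 */
    bool l1_dirty;  /* L1 holds data newer than DRAM */
};

/* Model the store "a = value" under the given write policy. */
static void toy_store(struct toy_memory *m, enum write_policy p, int value)
{
    m->l1 = value;             /* the store always lands in L1 first */
    m->l1_valid = true;
    if (p == WRITE_THROUGH) {
        m->dram = value;       /* also propagated to DRAM right away */
        m->l1_dirty = false;
    } else {
        m->l1_dirty = true;    /* DRAM stays stale until write-back */
    }
}

/* Model eviction of the line: dirty data is written back to DRAM. */
static void toy_evict(struct toy_memory *m)
{
    assert(m->l1_valid);
    if (m->l1_dirty)
        m->dram = m->l1;
    m->l1_valid = false;
    m->l1_dirty = false;
}
```

In this model, after toy_store(&m, WRITE_BACK, 5) the L1 copy is 5 while dram still holds the old value; only toy_evict() brings DRAM up to date.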


what happens during context switch between two processes in linux?

Let's say process p1 is executing with its own address space (stack, heap, text). When a context switch happens, I understand that all the current CPU registers are pushed into the PCB before loading process p2. Then the TLB is flushed and loaded with p2's address mappings, and p2 starts executing with its own address space.
What I would like to know is the state of p1's address space. Will it be copied to disk, and its page table updated, before process p2 is loaded?
The specifics of a context switch depend upon the underlying hardware. However, context switches are basically the same, even among different systems.
The mistake you have is " i understand that all the current cpu registers are pushed into stack before loading process p2". The registers are stored in an area of memory that is usually called the process context block or process control block (PCB), whose layout is defined by the processor on some architectures and by the operating system on others. Most processors have instructions for saving the process context (i.e., its registers) into such a structure and loading it back. In the case of Intel, this can require multiple instructions saving to multiple blocks because of all the different register sets (e.g. FPU, MMX).
The outgoing process does not have to be written to disk. It may be paged out if the system needs more memory, but it is possible that it stays entirely in memory, ready to execute.
A context switch is simply the exchange of one process's saved register values for another's.
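As a rough sketch of that exchange, the following models the CPU registers and the per-process save area as plain structs. The register set, struct process and context_switch() are hypothetical and heavily simplified; a real kernel does this in architecture-specific code and also switches the address space.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register set; real CPUs have many more registers. */
struct cpu_regs {
    uint64_t pc;      /* program counter */
    uint64_t sp;      /* stack pointer */
    uint64_t gpr[4];  /* a few general-purpose registers */
};

/* Hypothetical per-process record kept in kernel memory. */
struct process {
    int pid;
    struct cpu_regs saved;  /* register values saved at the last switch-out */
};

/* Save the CPU state into 'prev', then load the state saved for 'next'. */
static void context_switch(struct cpu_regs *cpu,
                           struct process *prev, struct process *next)
{
    assert(prev != next);
    prev->saved = *cpu;   /* outgoing process: registers -> memory */
    *cpu = next->saved;   /* incoming process: memory -> registers */
}
```

Note that nothing here touches the outgoing process's memory pages: only its registers are saved, which is why p1's address space can stay resident in RAM.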

write a block in cache whose dirty bit was set

In computer architecture, if the processor wants to read a block in cache whose dirty bit is set, will the processor first write this block back to memory, or just read the block from the cache?
For reads, the data is read from the cache, as that is the latest updated data. For writes to the same block, the new data (to the same address) is updated in the cache and the dirty bit stays set. Only when the block is evicted, for example on a conflict miss (two different addresses mapping to the same cache block), or is explicitly flushed, is the data actually pushed to the next level of the memory hierarchy.
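A small direct-mapped, write-back, write-allocate cache model can illustrate this. It is only a sketch under simplifying assumptions (one word per line, four lines, a 16-word backing memory), and all names in it are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_LINES 4    /* direct-mapped: one line per index */
#define MEM_WORDS 16   /* size of the toy backing memory */

struct line { bool valid, dirty; uint32_t addr; int data; };

struct cache {
    struct line lines[NUM_LINES];
    int memory[MEM_WORDS];  /* next level of the hierarchy */
    int writebacks;         /* number of dirty lines written back */
};

/* Make sure 'addr' is resident, writing back a dirty victim if needed. */
static struct line *lookup(struct cache *c, uint32_t addr)
{
    assert(addr < MEM_WORDS);
    struct line *l = &c->lines[addr % NUM_LINES];
    if (l->valid && l->addr == addr)
        return l;                      /* hit: no traffic to memory */
    if (l->valid && l->dirty) {        /* conflict miss on a dirty victim */
        c->memory[l->addr] = l->data;
        c->writebacks++;
    }
    l->valid = true;                   /* fill from the next level */
    l->dirty = false;
    l->addr = addr;
    l->data = c->memory[addr];
    return l;
}

/* A read hit returns the cached data and leaves the dirty bit alone. */
static int cache_read(struct cache *c, uint32_t addr)
{
    return lookup(c, addr)->data;
}

/* A write updates the cached copy and marks the line dirty. */
static void cache_write(struct cache *c, uint32_t addr, int value)
{
    struct line *l = lookup(c, addr);  /* write-allocate */
    l->data = value;
    l->dirty = true;
}
```

Starting from a zeroed cache, writing address 1 and then reading it back never touches memory; only a later access to address 5, which maps to the same line, forces the dirty data for address 1 out to memory.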

Write in invalid state of MESI protocol

How is the write operation for a memory location that's not in the cache handled in the MESI protocol? The state diagrams I have seen mark it as Write Miss, but I can't follow what happens in reality.
I think this results in a load operation on the bus to ensure that the processor trying to do the write gets exclusive access to the location, and then the block is modified. Is this how it's done in reality, or is the handling of a write in the invalid state implementation defined?
If the policy is allocate on a write miss:
If the block was not present in any other cache but only in main memory, the block is fetched into the cache first, marked as M (modified) state, and then the write proceeds.
If the block was present in some other caches, its copy in those caches is first invalidated (a cache holding the block in the M state must also supply or write back its data), so that this cache gains the only copy of the block, and then the write proceeds.
If the policy is no allocate on a write miss: all write misses go directly to main memory. A copy is not fetched into the cache. If some other cache has a copy of the block, the other copies are first invalidated and the write then takes place in main memory.
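The write-allocate case can be sketched as a small state update over all caches. This is a simplified model, not a real implementation: it tracks a single two-word block, assumes a modified copy is written back to memory before the requester fills (real designs may transfer it cache to cache instead), and uses made-up names throughout.

```c
#include <assert.h>
#include <string.h>

#define NUM_CACHES 3
#define WORDS 2   /* words per block in this toy model */

enum mesi { INVALID, SHARED, EXCLUSIVE, MODIFIED };

struct mesi_system {
    enum mesi state[NUM_CACHES];    /* state of the block in each cache */
    int cached[NUM_CACHES][WORDS];  /* the block's data in each cache */
    int memory[WORDS];              /* the block's data in main memory */
};

/* Cache 'w' writes 'value' to one word of a block it holds as INVALID. */
static void write_miss(struct mesi_system *s, int w, int word, int value)
{
    assert(w >= 0 && w < NUM_CACHES && s->state[w] == INVALID);
    assert(word >= 0 && word < WORDS);

    /* Read-for-ownership: every other cache snoops the request. */
    for (int i = 0; i < NUM_CACHES; i++) {
        if (i == w || s->state[i] == INVALID)
            continue;
        if (s->state[i] == MODIFIED)  /* owner has the newest data */
            memcpy(s->memory, s->cached[i], sizeof s->memory);
        s->state[i] = INVALID;        /* all other copies are dropped */
    }

    memcpy(s->cached[w], s->memory, sizeof s->memory);  /* fetch the block */
    s->cached[w][word] = value;                         /* then write */
    s->state[w] = MODIFIED;
}
```

The fetch matters because the write only covers part of the block: the words that are not written must still hold the latest data, whether it came from memory or from a previous owner.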

Cleared RW (write protect) flag for PTEs of a process in kernel yet no segmentation fault on write

I implemented incremental process checkpointing at page level (I just dump the data from the process address space into a file).
The approach I used is as follows. I used two system calls:
Complete checkpoint: copy the entire address space. Also, if the write bit is set for a page, clear it.
Incremental checkpoint: for each page, check whether the write bit is set. If it is, dump the page data and clear the bit again.
Test program:
char a[10000];
From what I know, the kernel should take a page fault and handle the illegal write by killing the process with SIGSEGV. Yet the program is successfully checkpointed.
What exactly is happening here?
If you modify a PTE while it's still cached in the TLB, the effect of the modification may go unseen for a while (until the entry gets evicted from the TLB and has to be reread from the page table).
You need to invalidate the entry in the TLB with the invlpg instruction (I'm assuming x86) after modifying the PTE, and this has to be done on all CPUs that may have cached it. The kernel has dedicated functions for this purpose; in Linux, flush_tlb_page() and flush_tlb_range() are the usual ones.
Also it wouldn't hurt to double check that the compiler didn't reorder or throw away anything from the above code.
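To show why the write still succeeds, here is a toy userspace model of a TLB that caches permissions, not kernel code. The struct mmu, try_write() and tlb_invalidate() names are hypothetical, and the "TLB" is just one slot per page.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_PAGES 4

struct pte { bool present, writable; };

/* Cached translation: a private copy of the PTE taken at fill time. */
struct tlb_entry { bool valid; struct pte copy; };

struct mmu {
    struct pte page_table[NUM_PAGES];
    struct tlb_entry tlb[NUM_PAGES];  /* toy TLB: one slot per page */
};

/* Returns true if a write to 'page' is allowed, false if it would fault. */
static bool try_write(struct mmu *m, int page)
{
    assert(page >= 0 && page < NUM_PAGES);
    struct tlb_entry *e = &m->tlb[page];
    if (!e->valid) {                  /* TLB miss: walk the page table */
        e->copy = m->page_table[page];
        e->valid = true;
    }
    /* The permission check uses the TLB copy, not the page table. */
    return e->copy.present && e->copy.writable;
}

/* Toy stand-in for invalidating a single TLB entry (invlpg on x86). */
static void tlb_invalidate(struct mmu *m, int page)
{
    assert(page >= 0 && page < NUM_PAGES);
    m->tlb[page].valid = false;
}
```

In this model, clearing page_table[0].writable after a first write has no visible effect: try_write() keeps returning true until tlb_invalidate() forces the next access to reread the page table.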

Page fault and dirty pages

I have started reading about CPU caches and I have two questions:
Let's say the CPU receives a page fault and transfers control to the kernel handler. The handler decides to evict a frame in memory which is marked dirty. Let's say the CPU caches are write-back with valid and modified bits. Now the memory contents of this frame are stale and the cache contains the latest data. How does the kernel force the caches to flush?
The way the page table entry (PTE) gets marked as dirty is as follows: The TLB has a modify bit which is set when the CPU modifies the page's content. This bit is copied back to the PTE on context switch. If we get a page fault, the PTE might be non-dirty but the TLB entry might have the modified bit set (it has not been copied back yet). How is this situation resolved?
As for flushing the cache, that's just a privileged instruction. The OS executes the instruction and the hardware begins flushing. There's one instruction that invalidates all cached values immediately without writing them back, and another that tells the hardware to write modified data back before invalidating (on x86, for example, these are INVD and WBINVD). After the instruction is issued, the hardware (cache controller and I/O) takes over. There are also privileged instructions that tell the hardware to flush the TLB.
I'm not certain about your second question because it's been a while since I've taken an operating systems course, but my understanding is that in the event of a page fault the page is first brought into a physical frame and mapped in the page table. Which page gets removed depends on the available space as well as the page replacement algorithm used. Before the new page can be brought in, if the page it is replacing has the modified bit set, that page must be written out first, so an I/O is queued up. If it's not modified, the page is replaced immediately. The same goes for the TLB: if the modified bit is set there, the page must be written back out before it is replaced, so an I/O is queued up and you just have to wait.
