I am wondering if a process is moved to other NUMA nodes by linux kernel implicitly for the purpose of load balancing.
If it does, could you tell me when it happens?
Thank you.
Related
Once the Uboot loads the Linux kernel image (ZImage) onto the ram, it invokes it (could be using bootz, bootm or some other commands based on the type of the kernel Image) and then the control goes to booting the kernel. Does the uboot will be informed about the kernel boot result?, means, whether the kernel booting went through completely or got stuck in the middle because of errors?.
I looked at do_bootz, do_bootm_states and boot_selected_os api's in the uboot src code to see if there is any way to know about the final kernel boot result, but I couldn't able to figured it out.
Details:
U-boot Version: 2017.03-rc2
api's are available at: cmd/bootz.c and bootm.c files.
If any one in this community knows about it or have an idea about it, please explain to me or point me to the correct path.
Thanks in advance.
Regards
Vamsi Chagari
After bootm, booti, bootz transfer control to the kernel the memory formerly used by U-Boot will be reused by the operating system. As U-Boot is no longer in memory it cannot be informed about the operating system status.
If you use the bootefi command the U-Boot implementation of the UEFI runtime services stays in memory while the operating system is starting. The UEFI services can be called by the operating system. These include services relating to variables. One use of UEFI variables is the definition of the boot sequence.
Unfortunately UEFI variables are not yet completetly implemented in U-Boot (as of version v2018.07). They currently cannot be accessed after exiting boot services.
I'm reading the book and it tells that:
After U-Boot loads Linux kernel, the kernel will claim all the resources of U-Boot
What does this mean? Does it mean that all data structures that allocated in U-Boot will be discarded?
For example: during U-Boot, PCIE and Network Device will be initialized.
After booting Linux kernel, will the PCIE and Network Device data structure be discarded? Will the Linux kernel do PCIE and NEtwork initialize again? Or U-Boot will transfer some data to kernel?
It depends on your CPU architecture how the communication happens, but it is usually via a special place in RAM, flash or the filesystem. No data structures are transferred, they would be meaningless to the kernel and the memory space will be different between the two. Uboot generally passes boot parameters like what type of hardware is present, what memory to use for something, or which type of mode to use for a specific driver. So yes, the kernel will re-initialize the hardware. The exception may be some of the low level CPU specifics which the kernel may expect uboot or a BIOS to have setup already.
Depending on your architecture, there may be different mechanism for the u-boot to communicate with the Linux kernel.
Actually there may be some structures defined by u-boot which are transferred to and used by the kernel using ATAGS. The address in which these structure are passed is stored in r2 register on ARM. They convey information such as available RAM size and location, kernel command line, ...
Note that on some architectures (like ARM again) we have support for device-tree which intends for defining the hardware in which the kernel is going to be run as well as kernel command line, memory and other thins. Such description is usually created during kernel compile time, loaded into the memory by the u-boot and in case of ARM architecture, its address is transferred through r2 register.
The interesting thing about this (regarding your question) is that u-boot can change this device-tree structure before passing it to the kernel through device tree overlay mechanism. So this is a (relatively) new way of u-boot/kernel communication. Note that device-tree is not supported on some architectures.
And at the end, yes, the hardware is reinitialized by the kernel even in they have already initialized by the u-boot except for memory controller and some other very low level initialization, AFAIK.
I am developing an embedded system on a PowerPC processor and there is need for communication with an FPGA via PCIe. I wish to use Linux/embedded-Linux as a bootloader to leverage its PCIe initialization code and driver API for simplified PCIe driver development. However in the end I want to be running bare-metal code (no OS running). So I am looking at using PetitBoot/kexec to jump from Linux to my own code.
Is this possible?
My current understanding of PCIe drivers leads me to believe that once the device is initialized, so long as I have a pointer to the address space, I should be able to simply execute MMIO R/W operations directly to the memory space. So even if kexec overwrites the driver code I should be able to use the device because the driver has done its job already.
Is this correct?
If not, what are my alternatives?
I don't think this approach would be a good idea. Drivers that were written with the Linux OS in mind are going to assume that all of the OS's resources are available, not just memory allocations. For example, it may configure interrupt handlers, but when the OS is not longer available, your hardware may get hung because nothing is acknowledging and servicing its interrupt requests.
I'm skeptical of the memory initialization as well. I suppose you could theoretically allocate some DMA memory and pass the resulting physical address to your bare-metal application as it takes over, but the whole process seems sketchy. It would be very difficult to make sure everything in Linux is shut down cleanly while leaving the PCIe subsystem running. You'll have to look at the driver's shut down routines and see what it does to the card to make sure it doesn't shut down the device and make it unresponsive to your bare-metal code.
I would suggest that you instead go through the Linux-based driver and use it as a guide to construct a new bare-metal driver. Copy the initialization code that you need, and leave out the Linux-specific configuration details.
I wanted to know if there is a way to perform experiments specific to NUMA architecture
on a x86_64 machine, by some sort of simulation tools.
I cam across some resource on creating fake NUMA nodes but couldn't figure out how exactly
to use them. Virtual machines are also fine, if there is a way.
Thanks,
I'm trying to implement a "Semantic based memory sharing model" for Xen. As a part of my project, i'm trying to share kernel code pages across VMs. I' ve assumed that the code segments of linux kernels with similar version are 100% identical. But when i carry out some experiments using Virtual Machines running Debian Squeeze, i have found 3 memory pages are different in kernel code segment.
So my question is that, does the linux kernel modifies its code pages at runtime?
Yes, it can - for example, spinlocks can be dynamically patched out of the code if the kernel sees at runtime that it is running on a uniprocessor system. I do not know of an exhaustive list of such cases, you will need to inspect the code.
See the LWN article on SMP Alternatives for more information on one system that does runtime patching within the kernel.