While studying I/O in my OS class, I came across the following screenshot from a macOS-looking terminal, showing the total number of interrupts generated in 10 seconds: [image]
Can anyone suggest which command produces this output, especially on Macs with the new ARM chips?
On Intel Macs, the command you want is latency; see man latency. You can run it like so:
sudo latency -n /System/Library/Kernels/kernel
I'm not sure whether this command is available on ARM-based Macs.
I'm currently struggling to determine how I can get an emulated environment in QEMU to correctly display output on the command line. I have an environment that displays perfectly well using the virt reference board, a Cortex-A9 CPU, and the 4.1 Linux kernel cross-compiled for ARM. However, if I swap out the 4.1 kernel for 2.6 or 3.1, I can suddenly no longer see console output.
While solving this issue is my main goal, I feel I lack a critical understanding of how Linux and the hardware initially integrate, before userspace configuration via boot scripts and whatnot has a chance to execute. I am aware of the device tree and have a loose understanding of how it works, but the fact that a different kernel version broke console availability entirely confounds me. Can someone explain how Linux initially maps console output to a hardware device on the ARM architecture?
Thank you!
The answer depends quite a bit on which kernel version, what config options are set, what hardware, and also possibly on kernel command line arguments.
For modern kernels, the answer is that it looks in the device tree blob it is passed for descriptions of devices, some of which will be serial ports, and it initializes those. The kernel config or command line will specify which of those is to be used for the console. For earlier kernels, especially if you go all the way back to 2.6, use of device tree was less universal, and for some hardware the boot loader simply said "this is a versatile express board" (for instance) and the kernel had compiled-in data structures to tell it where the devices were for each board that it supported. As the transition to device tree progressed, boards were converted one by one, and sometimes a few devices at a time, so what exactly the situation was for any specific kernel version depends on which board you're using.
The other thing I rather suspect you're running into is that if the kernel crashes early in bootup (i.e. before it finds the serial port at all), it will never output anything. So if the kernel is simply too early to support the "virt" board properly, or if your kernel config is missing something important, the chances are good that it crashes in early boot without being able to print you a useful message. (Sometimes the "earlycon" or "earlyprintk" kernel arguments can help here, but not always.)
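To make the device-tree path concrete, here is an illustrative fragment. The node name and address follow the PL011 UART that QEMU's virt board describes, but treat the details as assumptions; your board's DTB will differ. The kernel initializes the UART it finds described here, and the chosen/stdout-path property (or a console= argument on the kernel command line) selects it as the console:

```dts
/ {
    chosen {
        /* roughly equivalent to passing console=ttyAMA0 on the command line */
        stdout-path = "/pl011@9000000";
    };

    pl011@9000000 {
        compatible = "arm,pl011", "arm,primecell";
        reg = <0x09000000 0x00001000>;
    };
};
```

If the kernel you boot predates device-tree support for your board, neither of these mechanisms exists and the console location comes from compiled-in board files instead.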
I've been looking for an OS X utility that shows CPU usage for each CPU. For example:
CPU 0 - 10%
CPU 1 - 2%
...
I know of many ways of getting this information on other Unix-like systems (/proc, mpstat, etc.), but none of them works on OS X. The most useful tool on the Mac is top, but it only shows total CPU usage. I need the application to be runnable from the shell so that I can log the usage over time. I also tried cpuwalk.d, but it only shows whether the application is running on one or more cores.
If you take a look at the Activity Monitor app, you will notice that it basically displays the same info as top, but with the addition of a graph showing each CPU's load.
If anyone has any idea of how to get the information I would appreciate it. Thanks.
You can try htop. If you have Homebrew installed, simply install it via "brew install htop"; once the installation finishes, type htop in the shell.
You can also download the htop-osx source from: https://github.com/max-horvath/htop-osx
Our product is a specialized device running minimal Ubuntu. Our C++ application on the device periodically scans I2C bus to detect if any new monitor/projector/etc. has been connected. This generally works well. However, once in two to three weeks, we see a random freeze.
As it happens randomly, we cannot consistently reproduce it.
From a coding perspective, I essentially scan the /dev/i2c-* files, open() each file, and try to read the first 128 bytes using ioctl().
I guess we do something similar to what the Linux tool i2cdetect does. The i2cdetect manpage states that its "read byte" mode is known to lock SMBus on various write-only chips. I'm wondering whether anyone knows if this could be the problem we are running into. Regards.
In Xcode's Instruments, there is a tool called Counters that exposes low-level counter information provided by the CPU, such as the number of instructions executed or number of cache misses:
This is similar to the Linux syscall perf_event_open introduced in Linux 2.6.32. On Linux, I can use perf_event_open then start/stop profiling around the section of my code I'm interested in. I'd like to record the same type of stats on OS X: counting the instructions (etc.) that a certain piece of code takes, and getting the result in an automated fashion. (I don't want to use the Instruments GUI to analyze the data.)
Are there any APIs that allow this (e.g. using dtrace or similar)? From some searching it sounds like the private AppleProfileFamily.framework might have the necessary hooks, but it's unclear how to go about linking to or using it.
On GNU/Linux I use Intel's PCM to monitor CPU utilization. I'm not sure whether it works on OS X, but as far as I know the source code includes a MacMSRDriver directory. I have no OS X device, so I've never tested it.
If the source compiles on your device, just run:
pcm.x -r -- your_program your_program_parameter
Or, if you want advanced profiling, use pcm-core.x instead, or build your own tool based on pcm-core.cpp.
I conducted the following benchmark under qemu and qemu-kvm, with this configuration:
CPU: AMD 4400 dual-core processor with SVM enabled, 2 GB RAM
Host OS: OpenSUSE 11.3 with latest Patch, running with kde4
Guest OS: FreeDos
Emulated Memory: 256M
Network: Nil
Language: Turbo C 2.0
Benchmark Program: Count from 0000000 to 9999999, displaying the counter on the screen
by directly accessing the screen memory (i.e. 0xB800:xxxx)
It only takes 6 sec when running in qemu.
But it takes 89 sec when running in qemu-kvm.
I ran the benchmark one by one, not in parallel.
I scratched my head the whole night, but I still have no idea why this happens. Can somebody give me some hints?
KVM uses QEMU as its device simulator; any device operation is simulated by the user-space QEMU program. When you write to 0xB8000, the graphic display is operated on, which involves the guest doing a CPU vmexit from guest mode back into the KVM module, which in turn sends a device-simulation request to the user-space QEMU backend.
In contrast, QEMU without KVM does everything in a single process, so apart from ordinary system calls there are fewer CPU context switches. Meanwhile, your benchmark code is a simple loop that requires code-block translation only once; that costs nothing compared to the vmexit and kernel-user communication on every iteration in the KVM case.
This should be the most probable cause.
Your benchmark is an I/O-intensive benchmark, and all the I/O devices are actually the same for qemu and qemu-kvm; in QEMU's source code they can be found under hw/*.
This explains why qemu-kvm is not necessarily faster than qemu here. I have no definitive answer for the slowdown, but the following explanation seems largely correct:
The qemu-kvm module uses the kvm kernel module in the Linux kernel. This runs the guest in x86 guest mode, which causes a trap on every privileged instruction. By contrast, qemu uses the very efficient TCG, which translates instructions the first time it sees them. I think the high cost of those traps is what shows up in your benchmark. This isn't true for all I/O patterns, though: an Apache benchmark would run better on qemu-kvm, because the library does the buffering and uses the fewest privileged instructions to do the I/O.
The reason is that too many VMEXITs take place.