I have several assumptions in mind please correct me if I'm wrong:
Without a real IOMMU a DMA-Transfer would be a security risk, because a guest could transfer garbage in Host Memory.
A valid DMA-Transfer between guest memory and passthrough device without a real IOMMU is not possible.
The Host-OS is not aware of any DMA-related things related to the passthrough device
An now some questions:
related to Point 3: Is there a way to get any information about a DMA-Transfer with a passthrough device?
If I don't have DMA-Remapping on, would KVM complain about it?
Is there a possibillity to deny any DMA-related stuff for the guest with the passthrough device?
Ran across this old question while doing some research and figured I'd post an answer for anyone interested.
About point 1. Without an IOMMU you can't do passthrough at all. I/O devices are in the kernel space. The guest kernel is a user process that only thinks it is running in kernel space.
As to the questions:
With an emulated device the emulator code intercepts all of the DMA setup and can make sure they are valid. With PCI passthrough, the register reads and writes are going directly to the device, or to a VF on the device for SR-IOV, so there is no opportunity for KVM or any other code outside the guest to validate, give errors or complain.
Related
So I understand that a kernel driver sits on top of HW device and then to communicate with the device via user space you will need to talk to the kernel driver via CreateFile && Read && Write. I have seen the design of Window's kernel drivers and their sample codes whether it is a USB or PCI or...
Now what I want to understand is how does the kernel drive communicate with the hardware? Where is the driver code would we usually find the code responsible for reading/writing registers on a certain device? What does the driver need to communicate with the device? I was told it is the BAR0 value that maps the HW to the virtual memory area which means any address we want to access on a physical device would start at that address. Is that correct? what If I have BAR0 = 0xfc500000, do i have to find addresses of certain register on the device and then ad it as an offset?
Driver need to get HW Resources from the OS. in a PCI device example, you will get MMIO Address and the interrupt vector. the MMIO Address is a physical address the PCI Controller and the BIOS map the device too.
the driver gets this value in EvtPepareHardware callback (in KMDF) and then need to map it to kernel Virtual address using MmMapIoSpace().
once you get a kernel Virtual address theoretically it is a "pointer" to the HW memory space and you can access the registers as you wish.
but it is recommended to use the HAL macros to void caching and other issues to access that memory. e.g. READ_REGISTER_ULONG64
to find the address of a registers you the Hardware device spec
for more info read this "Reading and Writing to Device Registers"
A registered netfilter hook can get the packet from the linux kernel. Here linux kernel gets the packet, looks for registered hooks, and passes the packet to them.
The general flow would be:
1. NIC receives the frame
2. Places it in DMA rx ring
3. Kernel's networking subsystem takes it from the DMA rx ring.
But is there a way to get the packet before it enters into linux networking subsystem (may be a big term, my intention is kernel networking code that takes the packet first). That is, my driver should get the packets before it goes into the linux networking stack.
I am a learner and trying to write a piece of code that does the packet processing in the kernel.
Please correct my understanding if wrong and help me with good pointers to read.
Just modify the network NIC card driver? It is the one who got the ISR first. You should be able to filter it there.
I've been trying to get into QEMU development in order to virtualize a not supported hardware.
I want to develop a new QEMU i2c device (qemu x86), that would get/send data to an application running on the guest. Thing is : I need these data onto the host, as a daemon will send/get the same kind of data to the guest.
My questions are : is it easy to get the data from this device ?
Are there any examples already in QEMU that can fit my needs ?
PS : my i2c device is only a "bridge" between the host and the guest. I need the application to use i2c (can't change that).
Typically, the qemu device will use the "chardev" abstraction to get data from a socket on the host. For example, something like -chardev socket,path=/tmp/foo.sock,server,nowait,id=foo -device myi2c,chardev=foo will connect your i2c device to a socket on the host.
There are many examples of devices that use chardevs in QEMU's hw/char directory. A simple example is digic-uart.c.
i have been doing the KVM stuff and have a couple of questions that can not figure out.
1> as we know, normally the external interrupt will cause VMexit and the hypervisor will inject a virtual interrupt if it is for guest. Then which irq will be injected (i mean the interrupt vector for indexing the guest IDT)? How does the KVM get to know about this (associate a host IRQ with a guest virtual IRQ)?
2> if for assigned device to the guest, the hypervisor will deliver that IRQ to the guest. by tracing the code, i found the host IRQ is different with the guest's (i mean the interrupt vector). how the KVM configure which interrupt vector the guest should use?
3> if we configure not exit on external interrupt by setting some field in VMCS, what will happen during the physical interrupts? will the CPU use the guest IDT for response interrupt? If so, can KVM redirect the CPU to use another IDT for guest (assuming modifying the IDTR)?
4> where is the guest IDT located? it this configured by the qemu while initializing the vcpu and registers (include the IDTR)?
I really hope someone can reply to my questions. I will be very appreciated.
Thanks
1-
2-
The code is in irq_comm.c and very complex. For the guest vector, the hypervisor traps and monitors the PCI configuration space of the guest (this is actually done in QEMU - see for instance kvm_msi_update - however a syscall to the KVM updates it with the data).
3- Yes. For setting another IDT - you need to change the IDTR field in the VMCS.
4- The guest IDT is configured by the guest code. QEMU/KVM is not directly involved in it. You need to configure the execution-controls to trap on LIDT in order to monitor changes for the guest IDTR.
Sounds like you are trying to implement ELI from ASPLOS'12.
Contact me offline (the second author of the paper - NA).
I'm writing an IPsec implementation for a microcontroller and I want to test it using a standard Linux box running Debian Lenny. Both devices should secure the communication between them using IPsec ESP in tunnel mode. The keys are setup manually using setkey. There's no (or at least should be no) user space program involved in processing an IPsec packet. Now I want to see how my created packets are processed by the Linux kernel. To see the raw packets I capture them using tcpdump and analyze them using wireshark.
What's the best way to obtain debug information about IPsec processing?
How can I figure out whether the packet is accepted by the kernel?
How can I view the reason for a packet to be dropped?
You can instrument the XFRM (or perhaps ipv4/esp.c) kernel code to print out debug messages at the right spots.
For example, in net/ipv4/esp.c there exists a function esp_input() which has some error cases, but you'll see most the interesting stuff is in the xfrm/*.c code.
That said, I didn't have a problem interoperating a custom IPSec with Linux. Following the 43xx specs and verifying the packets came out correctly via wireshark seemed to do well. If you're having issues and don't want to instrument the kernel then you can setup iptables rules and count the number of (various type of) packets at each point.
Finally, be sure you've actually added a security policy (SP) as well as a security association (SA) and setup firewall rules properly.