perf can inspect KVM guest execution, showing you exactly where your virtual machine is spending its CPU cycles, even when those cycles are spent in the hypervisor or in guest kernel code.

Here’s what that looks like in practice. Suppose a VM named my-vm-01 is running on your KVM host and you want to profile its QEMU process (qemu-kvm here; on many distributions the binary is named qemu-system-x86_64), whose vCPU threads execute the guest.

First, find the qemu-kvm process ID (PID) for your VM. You can usually do this with pgrep:

pgrep -f 'qemu-kvm.*my-vm-01'
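Because pgrep -f matches against the full command line, it can return more than one PID (or none). A minimal sketch of a guard, assuming you want to fail loudly unless exactly one process matched; pick_single_pid is a hypothetical helper name, not part of perf or pgrep:

```shell
# pick_single_pid: read candidate PIDs on stdin, one per line.
# Prints the PID if exactly one numeric line was given; otherwise
# reports the count on stderr and returns non-zero.
# Usage: pgrep -f 'qemu-kvm.*my-vm-01' | pick_single_pid
pick_single_pid() {
    awk '
        /^[0-9]+$/ { pid = $0; n++ }
        END {
            if (n != 1) {
                printf "expected exactly 1 matching PID, found %d\n", n > "/dev/stderr"
                exit 1
            }
            print pid
        }
    '
}
```

If the pattern matches several VMs, tighten it until the helper succeeds rather than picking a PID by hand.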

Let’s say that returns 12345. Now, you can start perf to record events, specifically focusing on the KVM-related events. A good starting point is to look at cycles, instructions, and a few key KVM events:

sudo perf record -p 12345 -e cycles,instructions,kvm:kvm_entry,kvm:kvm_exit,kvm:kvm_mmio,kvm:kvm_pio --call-graph dwarf

This command does a few things:

  • -p 12345: Tells perf to attach to the process with PID 12345.
  • -e cycles,instructions,kvm:kvm_entry,kvm:kvm_exit,kvm:kvm_mmio,kvm:kvm_pio: Specifies the events to record.
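    • cycles: CPU clock cycles, sampled to show where time is being spent.
    • instructions: Number of instructions retired.
    • kvm:kvm_entry: Fires each time KVM enters guest mode (a VM entry), that is, when a vCPU starts or resumes executing guest code.
    • kvm:kvm_exit: Fires each time the CPU leaves guest mode and returns control to KVM in the host kernel (a VM exit) because the guest did something KVM must handle, such as port or memory-mapped I/O, an external interrupt, or an instruction that requires emulation. Many exits are handled entirely in the kernel; only some continue out to QEMU in user space.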
    • kvm:kvm_mmio: Records MMIO (Memory-Mapped I/O) accesses by the guest that KVM intercepts.
    • kvm:kvm_pio: Records PIO (Port I/O) accesses by the guest that KVM intercepts.
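  • --call-graph dwarf: Records a call stack with each sample, which is crucial for understanding where these events occur within the qemu-kvm process and the kernel. DWARF unwinding applies to the user-space (QEMU) part of the stack and copies a chunk of the stack with every sample, so perf.data grows quickly; kernel frames are captured by the kernel’s own unwinder.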
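The command as written keeps recording until it is interrupted. A common idiom is to append a sleep workload so that perf stops by itself after a fixed time. A minimal sketch, assuming a hypothetical helper named kvm_record_cmd that only prints the command so you can inspect it before running it with sudo:

```shell
# kvm_record_cmd PID [SECONDS]: print (do not run) a time-bounded
# perf record command for the given QEMU PID. SECONDS defaults to 60.
# Usage: sudo sh -c "$(kvm_record_cmd 12345 60)"
kvm_record_cmd() {
    pid=$1
    seconds=${2:-60}
    events=cycles,instructions,kvm:kvm_entry,kvm:kvm_exit,kvm:kvm_mmio,kvm:kvm_pio
    # "-- sleep N" gives perf a workload that ends after N seconds,
    # while -p still attaches the events to the QEMU process.
    printf 'perf record -p %s -e %s --call-graph dwarf -- sleep %s\n' \
        "$pid" "$events" "$seconds"
}
```

The output still goes to perf.data in the current directory, so the analysis step is unchanged.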

After perf record has run for a sufficient period (for example, a minute or two while your VM is under load), stop it with Ctrl-C and analyze the data:

sudo perf report

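This will open an interactive TUI where you can navigate the profiled events. Because several events were recorded, you first choose which one to view (for example, cycles); you’ll then see the percentage of samples attributed to each function. The key is to look for: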

  1. kvm_entry and kvm_exit overhead: High counts here indicate frequent transitions between guest mode and the hypervisor. Some are unavoidable, but an excessive rate usually points to guest behavior that forces exits, such as I/O on emulated devices.
  2. Specific KVM exit handlers: Hot I/O paths in KVM, such as handle_io or kvm_io_bus_write (exact names vary with kernel version and CPU vendor), suggest the guest is performing a lot of I/O operations that KVM or QEMU has to emulate. This is often a bottleneck for storage or network-intensive workloads.
  3. The vCPU run loop: kvm_arch_vcpu_ioctl_run and, beneath it, vmx_vcpu_run on Intel or svm_vcpu_run on AMD are where KVM executes guest vCPUs. High time here is normal, but understanding why the guest keeps leaving it (i.e., what event caused each exit) is key.
  4. Guest kernel code: A plain perf record on the host does not resolve guest symbols. To attribute samples to guest kernel functions, use perf kvm with --guest and point it at copies of the guest’s /proc/kallsyms and /proc/modules via --guestkallsyms and --guestmodules. Then look for functions related to your guest’s workload.

The kvm:kvm_entry and kvm:kvm_exit events are fundamental to understanding VM performance. Every time a guest needs to do something that the hypervisor must manage (accessing emulated I/O devices, handling certain interrupts, or executing privileged instructions), the CPU leaves guest mode and control passes to KVM in the host kernel. Each such round trip incurs overhead. If these events are happening extremely frequently, your guest is spending a lot of time in transitions rather than executing its own code directly.
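To put a number on "extremely frequently", you can count the recorded tracepoints and divide by the recording time. A minimal sketch, assuming the text output of perf script is piped in and that event_rates is a hypothetical helper name; it matches event names by pattern rather than by column because vCPU thread names can contain spaces:

```shell
# event_rates SECONDS: read `perf script` output on stdin and print
# "event count rate-per-second" for each kvm:* tracepoint seen.
# Usage: sudo perf script | event_rates 60
event_rates() {
    awk -v secs="${1:?usage: event_rates SECONDS}" '
        # Tracepoint lines contain the event name followed by a colon.
        match($0, /kvm:kvm_[a-z0-9_]+:/) {
            name = substr($0, RSTART, RLENGTH - 1)
            count[name]++
        }
        END {
            for (name in count)
                printf "%s %d %.1f\n", name, count[name], count[name] / secs
        }
    ' | sort
}
```

What counts as too high depends on the workload and the number of vCPUs, so treat the rates as a baseline to compare against after a change rather than as an absolute threshold.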

The kvm:kvm_mmio and kvm:kvm_pio events are particularly interesting for I/O bottlenecks. If you see many of them, or significant time in the code paths that handle them, the guest is performing I/O operations that are being intercepted and emulated by QEMU/KVM. This is common for emulated devices (like old network cards or storage controllers). Using virtio drivers in the guest, which pass requests through shared-memory queues and need far fewer exits per operation, can drastically reduce this overhead. You’d see fewer kvm:kvm_mmio and kvm:kvm_pio events and a larger share of time spent running guest code.

When analyzing perf report, you can drill down into specific call stacks: expanding a hot entry shows the call chain that led to it. On the kernel side, functions such as kvm_io_bus_write or kvm_io_bus_read in the chain indicate that the exit was caused by device I/O. The exit reason itself is recorded as a field of each kvm:kvm_exit event, which perf script prints; perf kvm stat can also summarize exits by reason.
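For a quick breakdown of why the guest exited, the kvm:kvm_exit lines in perf script output can be tallied by reason. A minimal sketch, assuming the tracepoint prints the word "reason" followed by the reason name (as it does on x86; the exact field layout varies between kernel versions) and using a hypothetical helper name, exit_reasons:

```shell
# exit_reasons: read `perf script` output on stdin and count
# kvm:kvm_exit events by the token that follows "reason",
# most frequent first.
# Usage: sudo perf script | exit_reasons
exit_reasons() {
    awk '
        /kvm:kvm_exit:/ {
            for (i = 1; i < NF; i++)
                if ($i == "reason") { count[$(i + 1)]++; break }
        }
        END { for (r in count) print count[r], r }
    ' | sort -k1,1nr -k2,2
}
```

A list dominated by I/O-related reasons points at device emulation, while one dominated by halts or interrupts points elsewhere.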

A common culprit for high kvm_exit counts related to I/O is the guest’s use of emulated devices. For example, if your guest is using an emulated e1000 network card instead of a virtio-net device, every network packet will likely cause a kvm_exit for MMIO access.

To fix this, you would typically:

  1. Identify the guest device: In virt-manager or on the QEMU command line, check which virtual hardware is assigned to your VM.
  2. Switch to virtio: If using emulated devices for network or storage, change them to their virtio equivalents (e.g., virtio-net-pci for network, virtio-blk-pci or virtio-scsi-pci for storage). Ensure the guest OS has the corresponding virtio drivers installed.
  3. Re-profile: After making the change, re-run perf record and perf report. You should see fewer kvm_exit events and a larger share of time spent running guest code.
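The first step can be done from the host without opening a management tool, because the running QEMU process exposes its arguments in /proc. A minimal sketch, assuming devices were given in the classic "-device model,prop=value" form (newer libvirt versions may pass JSON instead, which this does not parse); list_device_models is a hypothetical helper name:

```shell
# list_device_models: read a NUL-separated command line on stdin
# (the format of /proc/PID/cmdline) and print each -device model,
# tagged "virtio" or "non-virtio".
# Usage: list_device_models < /proc/12345/cmdline
list_device_models() {
    tr '\0' '\n' | awk '
        prev == "-device" {
            split($0, parts, ",")      # model name precedes the first comma
            model = parts[1]
            tag = (model ~ /^virtio/) ? "virtio" : "non-virtio"
            print tag, model
        }
        { prev = $0 }
    '
}
```

Not every non-virtio entry is a problem (USB controllers and passthrough devices will show up too); the ones to look at are the network and storage models.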

This process of identifying specific KVM events that cause exits and then optimizing the guest’s device model is the core of tuning KVM performance with perf.

After you’ve optimized I/O and reduced KVM exits, you’ll likely start seeing more time spent within the guest kernel’s scheduler or application code. The next step would be to profile the guest itself using perf inside the guest to understand its internal bottlenecks.
