perf record can drop samples under high CPU load, leading to incomplete performance profiles.
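
Before working through the causes, it helps to confirm that samples are actually being lost. The sketch below is a minimal, hedged example: it assumes the warning that perf report and perf script print for recordings containing lost records has the form "Processed N events and lost M chunks!", which may vary between perf versions. The function name and the sample text are illustrative only.

```python
import re

# Assumed warning format (may differ across perf versions).
LOST_RE = re.compile(r"Processed (\d+) events and lost (\d+) chunks")


def parse_lost_chunks(text):
    """Return (processed_events, lost_chunks), or None if no warning is present."""
    match = LOST_RE.search(text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    sample = "Processed 48213 events and lost 17 chunks!\n\nCheck IO/CPU overload!\n"
    print(parse_lost_chunks(sample))
```

Feed it the stderr of perf report or perf script; a None result means the warning was not found, not that the profile is guaranteed complete.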

Common Causes and Fixes

  1. Sample Rate Too High:

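    • Diagnosis: Check the sample rate. By default perf record samples at a fixed frequency (4,000 Hz per CPU in recent versions); the :P modifier in cycles:P only requests the most precise sampling available and does not change the rate. A much higher rate, set with -F or with a small -c period, can overwhelm the kernel’s sampling mechanism. Messages in dmesg beginning with "perf: interrupt took too long" mean the kernel is already lowering the maximum rate.
    • Fix: Reduce the sample frequency. For example, perf record -F 999 -e cycles takes about 999 samples per second per CPU. Alternatively, set a fixed period with -c: perf record -e cycles -c 3000000 takes one sample every 3,000,000 cycles, roughly 1,000 samples per second on a busy 3 GHz core. A small period such as -c 50000 does the opposite: at 3 GHz it produces about 60,000 samples per second.
    • Why it works: Lowering the sampling frequency means perf interrupts the CPU less often, so less time is spent in the sampling interrupt handler and samples arrive at a rate the ring buffer and the perf writer can keep up with.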
  2. Insufficient Buffer Size:

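    • Diagnosis: perf record collects samples in per-CPU mmap ring buffers. If the default size is too small for the amount of data being generated, the kernel drops new records while the buffer is full. Check the buffer size in use (the -m / --mmap-pages option).
    • Fix: Increase the buffer size with -m. For example, perf record -m 1024 requests 1,024 pages per buffer (4 MiB with 4 KiB pages); the page count must be a power of two, and recent versions also accept a size with a unit suffix such as -m 16M. Note that -c sets the sample period, not the buffer size. For non-root users the total is capped by the kernel.perf_event_mlock_kb sysctl.
    • Why it works: A larger buffer allows perf to accumulate more samples in memory before the perf process drains them to disk, reducing the chance of overflow if the system experiences short bursts of high activity.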
  3. Kernel Module Interference:

    • Diagnosis: Certain kernel modules, especially those that heavily instrument the kernel or interact with tracing mechanisms, can conflict with perf. Look for messages in dmesg related to tracing or perf.
    • Fix: Temporarily unload the suspect kernel module, if that is safe on your system: find it with lsmod, then run sudo rmmod <module>. Tracepoints themselves are compiled into the kernel and cannot be unloaded, but tools that attach to them can be stopped. Test perf record again.
    • Why it works: Some kernel tracing mechanisms consume significant kernel resources or cause lock contention, interfering with perf’s ability to reliably capture samples.
  4. System-Wide vs. Per-Thread/Process Sampling:

    • Diagnosis: If you’re trying to profile a specific process but perf record is running system-wide (-a) without filtering, the samples you care about compete for buffer space with samples from every other process.
    • Fix: Target your profiling. Use perf record -p <PID> -e cycles:P to profile a specific process ID (<PID>). This focuses perf’s efforts on the target.
    • Why it works: By specifying a process ID, perf only collects samples related to that process’s execution, drastically reducing the volume of data and the system load it imposes.
  5. CPU Frequency Scaling:

    • Diagnosis: Aggressive CPU frequency scaling (e.g., the ondemand or powersave governors) can lead to unpredictable sampling intervals if the CPU speed changes rapidly during sampling.
    • Fix: Set the CPU governor to performance. You can do this temporarily for all CPUs with echo performance | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor. This keeps CPUs running at their maximum frequency.
    • Why it works: A constant CPU frequency keeps the wall-clock spacing of cycles samples predictable, so the sample rate does not spike when a core ramps up and is less likely to outrun the buffers.
  6. I/O Bottlenecks:

    • Diagnosis: If the disk where perf.data is being written is slow or heavily utilized, it can become a bottleneck, causing perf to drop samples because it can’t write them out fast enough.
    • Fix: Write perf.data to a faster storage device (e.g., an NVMe SSD instead of an HDD) or to a different disk that is less busy. You can specify the output file with perf record -o /path/to/faster/disk/perf.data ....
    • Why it works: By offloading the write operations to a faster or less contended I/O subsystem, perf can flush its internal buffers more quickly, preventing them from overflowing.
  7. Insufficient Permissions/Security Modules:

    • Diagnosis: Security modules like SELinux or AppArmor might restrict perf’s access to certain kernel features or memory regions necessary for accurate sampling, especially for kernel-level events. Check /var/log/audit/audit.log or dmesg for denials.
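    • Fix: Adjust SELinux/AppArmor policies to allow perf the necessary permissions. For SELinux, this might involve setting specific booleans or contexts. Boolean names depend on your policy and kernel version, so list the candidates first with getsebool -a | grep -i perf rather than assuming a particular name. Independently of these modules, check the kernel.perf_event_paranoid sysctl (sysctl kernel.perf_event_paranoid), which limits what unprivileged users may profile.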
    • Why it works: perf relies on specific kernel interfaces and memory access for sampling. Security modules can block these, leading to incomplete data or errors. Granting the necessary permissions allows perf to function correctly.
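
Several of the fixes above can be combined into one invocation. The following is a minimal sketch, not a definitive recipe: the helper names and default values are hypothetical, and it only builds the argument list, using -F for the sampling frequency, -m for the mmap buffer size in pages, -o for the output file, and -p or -a for the target. It also shows the arithmetic linking a fixed cycle period (-c) to a sample rate.

```python
def samples_per_second(cpu_hz, period):
    """Approximate samples per second on one busy CPU for a fixed cycle period (-c)."""
    return cpu_hz / period


def build_perf_record(pid=None, freq=999, mmap_pages=1024,
                      output="perf.data", event="cycles"):
    """Build an argv list for perf record with conservative settings (illustrative defaults)."""
    # perf requires the mmap page count to be a power of two.
    if mmap_pages <= 0 or mmap_pages & (mmap_pages - 1):
        raise ValueError("mmap_pages must be a power of two")
    argv = ["perf", "record", "-e", event, "-F", str(freq),
            "-m", str(mmap_pages), "-o", output]
    # Profile a single process when a PID is given, otherwise the whole system.
    argv += ["-p", str(pid)] if pid is not None else ["-a"]
    return argv


if __name__ == "__main__":
    print(samples_per_second(3_000_000_000, 50_000))  # 60000.0
    print(" ".join(build_perf_record(pid=1234)))
```

To run the result for a fixed duration, append ["--", "sleep", "10"] and pass the list to subprocess.run; building the list separately keeps the settings easy to inspect before anything is executed.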

Once dropped samples are fixed, the next problem you’ll likely hit is missing symbols: accurate function-level analysis requires perf to have access to the binaries’ symbol tables and debug information.


perf is designed to give you a window into your system’s execution, but it’s not a passive observer; it actively influences the very behavior it’s trying to measure.

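Let’s look at how perf record captures stack traces, and why a trace is more than a simple list of function calls. When perf samples, it doesn’t just record the instruction pointer at the moment of interruption. For stack traces, it walks the call stack at that instant. By default this means following the chain of saved frame pointers in memory to reconstruct the sequence of function calls that led to the current execution point. Code built without frame pointers (for example with -fomit-frame-pointer) breaks that chain and yields truncated stacks; perf record --call-graph dwarf is an alternative that unwinds using debug information instead.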
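
The frame-pointer walk can be illustrated with a small simulation. This is a sketch under stated assumptions, not perf’s actual implementation: memory is modelled as a dictionary, each frame is assumed to store the caller’s frame pointer at [fp] and the return address at [fp + 8] (the common x86-64 layout), the chain is assumed to end at a saved frame pointer of 0, and all addresses and function names are hypothetical.

```python
def walk_frame_pointers(memory, fp, ip, max_depth=64):
    """Reconstruct a call chain from a sampled instruction pointer and frame pointer.

    memory maps an address to the 8-byte word stored there.
    """
    chain = [ip]  # innermost entry: where the sample landed
    depth = 0
    while fp != 0 and depth < max_depth:
        saved_fp = memory.get(fp)         # caller's frame pointer
        return_addr = memory.get(fp + 8)  # address to return to in the caller
        if saved_fp is None or return_addr is None:
            break  # unreadable stack: the trace is truncated here
        chain.append(return_addr)
        fp = saved_fp
        depth += 1
    return chain


# Hypothetical stack: main -> handle_request -> parse_header (sampled here).
SYMBOLS = {0x401000: "parse_header", 0x402010: "handle_request", 0x403020: "main"}
memory = {
    0x7000: 0x7100, 0x7008: 0x402010,  # parse_header's frame
    0x7100: 0x0,    0x7108: 0x403020,  # handle_request's frame (chain ends at 0)
}
chain = walk_frame_pointers(memory, fp=0x7000, ip=0x401000)
print([SYMBOLS[addr] for addr in chain])
# ['parse_header', 'handle_request', 'main']
```

The early break mirrors what happens when a frame pointer is missing or invalid: the walk stops and the recorded stack is shorter than the real one.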

Consider a scenario where you’re profiling a web server under load. You run perf record -g -p <PID> -- sleep 10. The -g flag tells perf to record call graphs (stack traces).

# Example output snippet from perf script after recording
# (This is a simulated representation, actual output is more verbose)

# Event: cycles:P
