The perf command in Linux is a powerful tool for performance analysis, and a common use case is tracing kernel functions to understand what the operating system is doing under the hood.

Let’s say you’re seeing high CPU usage and want to pinpoint which kernel functions are consuming the most time. You can use perf to do this.

First, you need to identify the event you want to trace. For kernel functions, perf can sample based on hardware performance counters, but it can also trace specific kernel tracepoints or use dynamic probes (kprobes/uprobes). For tracing kernel functions directly, kprobes are often the most flexible.

To trace calls to a specific kernel function, like tcp_sendmsg, you’d use perf probe.

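sudo perf probe --add tcp_sendmsg

This command tells perf to insert a kprobe at the entry point of tcp_sendmsg in the running kernel. No image path is needed for a function-entry probe, because perf resolves the symbol from the running kernel's symbol table. The -x option is reserved for user-space binaries (uprobes), and a vmlinux file with debug info (passed with -k) only matters when you probe source lines or capture local variables. On success, perf prints the name of the new event, probe:tcp_sendmsg.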
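Before adding a probe, it can help to confirm that the function name is visible to the running kernel at all. The following is a minimal sketch, assuming the usual three-column layout of /proc/kallsyms (address, type, name); kernel_symbol_exists is a hypothetical helper, not part of perf. A symbol being listed does not guarantee it can be probed, since some functions are blacklisted for kprobes.

```shell
# Hypothetical helper (not part of perf): succeed if a function name appears
# in a kallsyms-style symbol table (columns: address, type, name, [module]).
kernel_symbol_exists() {
  sym=$1
  table=${2:-/proc/kallsyms}   # optional second argument: a saved copy of the table
  awk -v sym="$sym" '$3 == sym { found = 1 } END { exit !found }' "$table"
}

# Example use (commented out because it needs root and perf installed):
# kernel_symbol_exists tcp_sendmsg && sudo perf probe --add tcp_sendmsg
# sudo perf probe --list            # show the probes currently defined
# sudo perf probe -d tcp_sendmsg    # remove the probe when finished
```

Removing probes you no longer need keeps later recordings free of stale events.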

Once the probe is set, you can start recording events.

sudo perf record -e probe:tcp_sendmsg -a -- sleep 10

Here, -e probe:tcp_sendmsg selects the probe event in group:event form (probe is the default group for events created by perf probe). -a collects system-wide, across all CPUs. -- sleep 10 gives perf a command to run, so the recording stops when sleep exits after 10 seconds.

After the recording, you can view the results:

sudo perf report

This shows how the recorded probe hits are distributed, grouped by default by command, shared object, and symbol. Each sample is one call to tcp_sendmsg rather than a slice of CPU time, so the percentages tell you which processes called the function most often, not how long the calls took.
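For a quick non-interactive summary, the recorded samples can also be counted per command. This is a minimal sketch, assuming perf script -F comm prints one command name per sample (possibly padded with spaces); count_by_comm is a hypothetical helper name.

```shell
# Hypothetical helper: read one command name per line on stdin and print
# "<count> <command>" lines, most frequent first.
count_by_comm() {
  awk '
    {
      gsub(/^[ \t]+|[ \t]+$/, "")   # trim padding around the command name
      if ($0 != "") hits[$0]++
    }
    END { for (c in hits) printf "%d %s\n", hits[c], c }
  ' | sort -rn
}

# Example use against the perf.data written by perf record
# (commented out because it needs a recording and root):
# sudo perf script -F comm | count_by_comm
```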

You can also trace function calls and returns. To trace both entry and return points of tcp_sendmsg:

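sudo perf probe --add 'tcp_sendmsg%return'
sudo perf record -e probe:tcp_sendmsg -e probe:tcp_sendmsg__return -a -- sleep 10
sudo perf report

The %return suffix tells perf to place a probe at the function’s exit. Recent perf versions name the resulting event probe:tcp_sendmsg__return; perf prints the exact name when it adds the probe, and perf probe --list shows it again later.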
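With both probes recorded, per-call latency can be estimated by pairing each entry with the next return on the same thread. The sketch below assumes perf script -F tid,time,event prints lines shaped like "<tid> <seconds>: <event>:" and that the return event's name ends in "return"; both are assumptions about your perf version's output, and pair_latency is a hypothetical helper.

```shell
# Hypothetical helper: read "<tid> <time>: <event>:" lines on stdin and print
# "<tid> <microseconds>" for each entry/return pair seen on the same thread.
pair_latency() {
  awk '
    {
      tid = $1
      t = $2;  sub(/:$/, "", t)     # drop the trailing colon after the timestamp
      ev = $3; sub(/:$/, "", ev)    # drop the trailing colon after the event name
      if (ev ~ /return$/) {
        if (tid in start) {         # ignore returns whose entry was not recorded
          printf "%s %.0f\n", tid, (t - start[tid]) * 1000000
          delete start[tid]
        }
      } else {
        start[tid] = t              # remember when this thread entered the function
      }
    }'
}

# Example use (commented out because it needs a recording and root):
# sudo perf script -F tid,time,event | pair_latency
```

Keying on the thread ID keeps concurrent calls from different threads from being paired with each other.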

Beyond specific functions, perf can also trace kernel tracepoints. These are predefined points in the kernel code that are designed for instrumentation. You can list available tracepoints with:

sudo perf list tracepoint
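The full list is long, so it is often easier to narrow it to one subsystem. A minimal sketch, assuming each line of the listing starts with a subsystem:event name; tracepoints_in_subsystem is a hypothetical helper.

```shell
# Hypothetical helper: read a tracepoint listing on stdin and print only the
# event names that belong to the given subsystem (for example "net").
tracepoints_in_subsystem() {
  awk -v prefix="$1:" 'index($1, prefix) == 1 { print $1 }'
}

# Example use (commented out because it needs perf installed):
# sudo perf list tracepoint | tracepoints_in_subsystem net
```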

For example, to trace packets being queued for transmission:

sudo perf record -e net:net_dev_queue -a -- sleep 10
sudo perf report

This traces the net_dev_queue tracepoint, which fires when a network device is about to queue a packet.

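For a broader view, perf record -g records a call graph (stack trace) with each sample, so a system-wide run such as perf record -g -a -- sleep 10 shows which code paths were on-CPU most often. That is usually the right starting point for the high-CPU question above. It is sampling, not a trace of every function call; tracing every kernel function call in a time window is the job of the kernel's function tracer (ftrace) and can generate a massive amount of data. For simply counting or timing specific functions, probes are more targeted.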

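Function-entry probes only need the symbol names the running kernel exposes through /proc/kallsyms. Kernel debuginfo becomes necessary when you want perf probe to resolve source lines, local variables, or inlined functions. If perf probe cannot find what you ask for, install your kernel's debug symbols (e.g., linux-image-$(uname -r)-dbgsym on Ubuntu; other distributions use different package names) and, if perf does not locate the image on its own, point it at the vmlinux file with -k.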
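Where the debug vmlinux ends up depends on the distribution. The sketch below checks two locations commonly used by debug-symbol packages (Debian/Ubuntu style and Fedora/RHEL style); treat the paths as assumptions to verify on your system. find_debug_vmlinux is a hypothetical helper.

```shell
# Hypothetical helper: print the path of a debug vmlinux for a kernel release,
# or return non-zero if none of the candidate locations exists.
find_debug_vmlinux() {
  release=${1:-$(uname -r)}   # kernel release, defaults to the running kernel
  root=${2:-}                 # optional filesystem root prefix
  for candidate in \
    "$root/usr/lib/debug/boot/vmlinux-$release" \
    "$root/usr/lib/debug/lib/modules/$release/vmlinux"
  do
    if [ -f "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Example use (commented out because it needs root, perf, and debuginfo):
# vmlinux=$(find_debug_vmlinux) && sudo perf probe -k "$vmlinux" -L tcp_sendmsg
```

The -L option in the example lists the source lines of tcp_sendmsg that can be probed, which only works once debuginfo is available.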

One of the most surprising things about perf probe is its ability to dynamically insert probes into a running kernel without recompilation or a reboot. This is achieved through the kernel’s kprobes and uprobes infrastructure, allowing you to instrument specific points in kernel or user-space code on the fly.

The next step in performance analysis is often correlating kernel function activity with user-space application behavior, which can be done by combining perf probes with other tracing mechanisms or by using perf’s support for user-space probes.
