perf’s uprobe support lets you place dynamic probes in user-space code much as kprobes do for kernel code, giving you visibility into application behavior without modifying or recompiling the application itself.

Let’s see perf in action. Imagine we have a simple C program that does some work and we want to know how many times a specific function, do_work, is called.

#include <stdio.h>
#include <unistd.h>

void do_work(int id) {
    printf("Worker %d doing work...\n", id);
    usleep(100000); // Simulate some work
}

int main() {
    for (int i = 0; i < 5; ++i) {
        do_work(i);
    }
    return 0;
}

We can compile this:

gcc -g -o myapp myapp.c

Before placing a uprobe on do_work, it helps to confirm that the symbol is present in the binary and to note its address. We can use objdump for this:

objdump -t myapp | grep do_work

This might output something like:

0000000000401126 g     F .text  000000000000003c do_work
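If the lookup needs to be scripted, a symbol-table line like the one above can be split into its parts. The following is a minimal sketch that assumes the usual `objdump -t` column layout (value first, size and name last); the helper name parse_symbol_line is hypothetical, and lines with extra markers before the name are simply skipped.

```python
def parse_symbol_line(line):
    """Return (address, size, name) from one `objdump -t` line, or None."""
    fields = line.split()
    if len(fields) < 4:
        return None
    try:
        address = int(fields[0], 16)  # first column: symbol value in hex
        size = int(fields[-2], 16)    # second-to-last column: size in hex
    except ValueError:
        return None                   # header lines and unusual entries
    return address, size, fields[-1]


# Sample line copied from the objdump output shown in the text.
line = "0000000000401126 g     F .text  000000000000003c do_work"
address, size, name = parse_symbol_line(line)
print(f"{name}: address={address:#x} size={size} bytes")
# prints: do_work: address=0x401126 size=60 bytes
```

In practice the input would come from running objdump -t on the binary and feeding each output line to the helper.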

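The address is 0x401126, which confirms that do_work is present in the symbol table. perf record has no event syntax for attaching a uprobe directly; the probe is first registered with perf probe, which resolves the symbol by name, so the address does not need to be typed in:

sudo perf probe -x ./myapp do_work

perf probe reports the event it created, here probe_myapp:do_work. Now we can record that event while the program runs:

sudo perf record -e probe_myapp:do_work --call-graph dwarf ./myapp

Let’s break this down:

  • sudo: Creating probes and recording probe events go through kernel tracing mechanisms, which normally require root privileges.
  • perf probe -x ./myapp do_work: This defines the probe.
    • -x ./myapp: The path of the executable (or shared library) we’re probing.
    • do_work: The function whose entry point we’re probing.
  • -e probe_myapp:do_work: This is the event specification, in group:event form. perf probe derives the group from the binary’s name and the event from the function’s name.
  • --call-graph dwarf: This tells perf to capture call graph information, which is crucial for understanding the context of the function call. dwarf unwinds the stack using DWARF call-frame information; fp (frame pointers) is the alternative for binaries built with -fno-omit-frame-pointer.
  • ./myapp: The command to run the application we want to trace.

After myapp finishes, perf will have created a perf.data file. Because the file was written by root, the analysis commands need sudo as well:

sudo perf script

With call graphs enabled, every event is followed by its stack, so the output is long. To keep only the event lines for our do_work calls, we can use grep:

sudo perf script | grep probe_myapp:do_work

You’ll see output similar to this, with one line for each time do_work was entered:

...
myapp     12345 [000] 10000.000000: probe_myapp:do_work: (401126)
myapp     12345 [000] 10000.100000: probe_myapp:do_work: (401126)
...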
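Since the original goal was to count how often do_work is called, the event lines can also be tallied in a script. This is a rough sketch rather than a robust parser: it assumes each event line contains the event name followed by a colon, as in the sample output above. The function name count_event_lines, the sample text, and the paths in it are made up for illustration; substitute the event name that appears in your own output.

```python
def count_event_lines(perf_script_output, event_name):
    """Count event lines for event_name; stack-frame lines do not match."""
    marker = f" {event_name}:"
    return sum(1 for line in perf_script_output.splitlines() if marker in line)


# Hypothetical excerpt of `perf script` output with call stacks attached.
SAMPLE = """\
myapp     12345 [000] 10000.000000: probe_myapp:do_work: (401126)
                  401126 do_work+0x0 (/home/user/myapp)
                  401187 main+0x1c (/home/user/myapp)

myapp     12345 [000] 10000.100000: probe_myapp:do_work: (401126)
                  401126 do_work+0x0 (/home/user/myapp)
                  401187 main+0x1c (/home/user/myapp)
"""

print(count_event_lines(SAMPLE, "probe_myapp:do_work"))  # prints: 2
```

For the five-iteration loop in myapp, the same function applied to the full output should report five entries.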

The perf.data file also contains call graph data. To see which functions called do_work, we can use perf report (with sudo if the recording was made as root, since perf.data is then readable only by root):

sudo perf report

In the perf report TUI, select the do_work entry and expand it to see the call chain leading to it. In our simple case, it will show main as the caller.

This granular tracing allows you to pinpoint performance bottlenecks, understand complex application flows, and debug issues in libraries or third-party code without recompilation or runtime instrumentation libraries.

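uprobes can also be attached to functions within a shared library, not just an executable. To do this, pass the library’s path to perf probe along with the symbol name, then record the event it creates:

sudo perf probe -x /usr/lib/x86_64-linux-gnu/libc.so.6 __libc_malloc
sudo perf record -e probe_libc:__libc_malloc --call-graph dwarf -a ./myapp

Here, we’re tracing __libc_malloc within the C library. perf probe prints the name of the event it created, which is expected to be probe_libc:__libc_malloc. The -a flag makes the recording system-wide: events are collected on all CPUs, from every process that uses this libc, for as long as ./myapp runs.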

A common pitfall is forgetting to include debugging symbols (-g) when compiling the target application or library, or stripping the binary afterwards. perf can place a function-entry probe using only the symbol table, but a stripped binary has none, and without DWARF data perf cannot map addresses back to source lines, so output falls back to raw addresses, which are less informative.

perf resolves function names from the binary itself, so name-based probes work for both dynamically and statically linked executables as long as the symbol table is intact. When symbols are stripped, you’ll need to rely on addresses recovered by other means, such as an unstripped copy of the binary inspected with objdump or readelf.

perf probe works from function names, so the objdump lookup is a sanity check rather than a required step. To list the functions it can probe in a binary, use --funcs:

perf probe -x ./myapp --funcs

Any function in that list can be passed to sudo perf probe -x ./myapp, which looks up its address in the myapp executable and creates an event named probe_myapp:<function> for use with perf record -e.

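Beyond simple function entry tracing, uprobes can also trigger on function return (a uretprobe). With perf probe, this is requested by appending %return to the function name. It is useful for measuring function duration or observing what happens after a function completes.

sudo perf probe -x ./myapp 'do_work%return'
sudo perf record -e probe_myapp:do_work__return --call-graph dwarf ./myapp

The return probe is registered as probe_myapp:do_work__return. To measure how long each call takes, record the entry probe (probe_myapp:do_work, created with sudo perf probe -x ./myapp do_work) in the same session by adding a second -e option, then subtract each entry timestamp from the matching return timestamp.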
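Measuring duration comes down to pairing each entry event with the next return event on the same thread. The sketch below does that over perf script text. It assumes lines of the form "comm tid [cpu] timestamp: event: ...", and the event names, the sample lines, and the function name call_durations are illustrative assumptions, not output captured from a real run.

```python
import re

# comm, tid, [cpu], timestamp, then the event name terminated by a colon.
EVENT_LINE = re.compile(r"^\s*\S+\s+(\d+)\s+\[\d+\]\s+(\d+\.\d+):\s+(\S+):")


def call_durations(perf_script_output, entry_event, return_event):
    """Pair entry/return events per thread; return durations in seconds."""
    open_calls = {}  # tid -> stack of entry timestamps (handles nesting)
    durations = []
    for line in perf_script_output.splitlines():
        match = EVENT_LINE.match(line)
        if not match:
            continue  # stack frames, blank lines, anything unrecognized
        tid, timestamp, event = match.group(1), float(match.group(2)), match.group(3)
        if event == entry_event:
            open_calls.setdefault(tid, []).append(timestamp)
        elif event == return_event and open_calls.get(tid):
            durations.append(timestamp - open_calls[tid].pop())
    return durations


# Hypothetical output with both probes recorded in one session.
SAMPLE = """\
myapp 12345 [000] 10000.000000: probe_myapp:do_work: (401126)
myapp 12345 [000] 10000.100250: probe_myapp:do_work__return: (401126 <- 401187)
myapp 12345 [000] 10000.100300: probe_myapp:do_work: (401126)
myapp 12345 [000] 10000.200600: probe_myapp:do_work__return: (401126 <- 401187)
"""

for seconds in call_durations(SAMPLE, "probe_myapp:do_work", "probe_myapp:do_work__return"):
    print(f"{seconds * 1000:.3f} ms")
```

Keeping a per-thread stack of open entries means nested or recursive calls are matched innermost-first, and a return event with no recorded entry is ignored.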

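If you encounter "permission denied" errors, ensure you’re running perf with sudo, both when creating the probe and when recording. For users without root or CAP_PERFMON, the perf_event_paranoid kernel setting also restricts what perf may access. You can change it until the next reboot:

echo 1 | sudo tee /proc/sys/kernel/perf_event_paranoid

Lower values are more permissive, higher values are more restrictive. Per the kernel documentation, any value of 0 or above denies raw tracepoint access to unprivileged users and only -1 lifts that restriction, so uprobe tracing is generally simplest under sudo.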
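To make the levels concrete, the sketch below maps a perf_event_paranoid value to the restrictions the kernel’s sysctl documentation lists for users without CAP_PERFMON. The function names are hypothetical, and some distributions patch in additional levels above 2 that this table does not cover.

```python
def paranoid_restrictions(level):
    """List what users without CAP_PERFMON are denied at the given level."""
    rules = [
        (0, "raw tracepoint access and the ftrace function tracepoint"),
        (1, "CPU event access"),
        (2, "kernel profiling"),
    ]
    return [what for threshold, what in rules if level >= threshold]


def read_paranoid_level(path="/proc/sys/kernel/perf_event_paranoid"):
    """Read the current setting from procfs (Linux only)."""
    with open(path) as f:
        return int(f.read().strip())


if __name__ == "__main__":
    try:
        level = read_paranoid_level()
    except OSError:
        print("perf_event_paranoid is not available on this system")
    else:
        print(f"perf_event_paranoid = {level}")
        for denied in paranoid_restrictions(level):
            print(f"  unprivileged users are denied: {denied}")
```

At -1 the list is empty, which is why that value is the only one that opens tracepoint-style events to unprivileged users.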

The next step in mastering perf for user-space tracing is exploring tracepoints and kprobes for deeper system-level insights.
