Linux perf is a powerful tool, but letting any user run it with root-like privileges is a massive security hole.
Let’s see perf in action. Imagine we want to profile a simple C program that just loops a billion times.
#include <stdio.h>

int main(void) {
    long long sum = 0;
    /* One billion iterations: enough work for perf to measure. */
    for (long long i = 0; i < 1000000000; ++i) {
        sum += i;
    }
    printf("Sum: %lld\n", sum);
    return 0;
}
If we try to run perf stat as a regular user on a system where perf_event_paranoid is set restrictively (some distributions, Debian and Ubuntu among them, default to 3 or higher), we’ll get an error:
$ perf stat ./my_program
Error: The system has kernel configuration that prevents perf from accessing performance counters.
To allow access, please try to adjust 'perf_event_paranoid' in the kernel.
See also /proc/sys/kernel/perf_event_paranoid for more details.
This is because accessing performance counters, especially CPU-specific ones like instructions retired or cache misses, requires a level of system access that could be exploited. A malicious user could use perf to:
- Infer sensitive data: By observing precise instruction counts or cache behavior, an attacker might deduce information about cryptographic operations, password checks, or other sensitive data being processed by other processes.
- Bypass security controls: Understanding the exact execution flow and timing of system components could help an attacker find vulnerabilities or timing side-channels.
- Denial of Service: While less common, certain perf operations could theoretically be crafted to consume excessive resources.
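The perf_event_paranoid setting in the Linux kernel is the primary gatekeeper for this. It’s a single integer value that controls how much access unprivileged users have to performance events; root (or a process with CAP_PERFMON on recent kernels) is not restricted by it. Each level adds a restriction on top of the previous one:
- -1: Maximum access. Unprivileged users can use (almost) all events, including raw and ftrace function tracepoints. This is generally unsafe for multi-user systems.
- 0: Raw and ftrace function tracepoints are disallowed. System-wide CPU events and kernel profiling are still allowed.
- 1: System-wide CPU event access is disallowed. Users can still profile their own processes, including the time those processes spend in the kernel.
- 2: Kernel profiling is disallowed. Users can only take user-space measurements of their own processes. This is the upstream kernel default.
- 3 (or higher): Not part of the upstream kernel. Some distributions, such as Debian and Ubuntu, carry a patch that disallows all unprivileged use of perf at this level and ship it as their default.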
Configuring Safe Profiling Access
The goal is to set perf_event_paranoid to a value that balances the need for profiling with security requirements. On a development machine where you’re not worried about other users snooping, a setting of 1, or even 0, might be acceptable. On production systems or those shared by multiple untrusted users, keep it at 2 or higher; where the distribution supports it, 3 disables unprivileged profiling altogether.
Let’s say you want unprivileged users to profile their own processes, including the time spent in the kernel on their behalf, but not to collect system-wide CPU events. That corresponds to setting perf_event_paranoid to 1.
1. Check current setting:

       cat /proc/sys/kernel/perf_event_paranoid

   If it outputs 3 or higher, you’re currently in the most restricted mode.

2. Temporarily change setting (as root):

       echo 1 | sudo tee /proc/sys/kernel/perf_event_paranoid

   This change is immediate but will be lost on reboot. You should now be able to run perf stat on your own processes.

3. Permanently change setting (as root): To make the change persistent across reboots, you need to configure sysctl.
   - Create or edit a sysctl configuration file. A common place is /etc/sysctl.d/99-perf.conf.
   - Add the following line to the file:

         kernel.perf_event_paranoid = 1

   - Apply the setting immediately without rebooting:

         sudo sysctl --system

     or specifically:

         sudo sysctl -w kernel.perf_event_paranoid=1
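With perf_event_paranoid set to 1, our previous perf stat command should now work. The figures below are illustrative; real counts depend on the CPU and on how the program was compiled:
$ perf stat ./my_program
Performance counter stats for './my_program':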
1.001234567 seconds time elapsed
1,000,000,000 cpu_clock:u (83.33%)
0 task-clock # 0.00 CPUs utilized
0 context-switches # 0.00 K/sec
0 cpu-migrations # 0.00 K/sec
1,000 page-faults # 0.99 K/sec
2,000,000,000 instructions # 2.00 insns per cycle
1,000,000,000 cpu_cycles # 1.00 CPU cycles
0 stalled-cycles-frontend # 0.00% of all cycles
0 stalled-cycles-backend # 0.00% of all cycles
1.001357890 seconds user
0.000000000 seconds sys
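If you need to profile kernel functions (e.g., to see how your program interacts with the OS), unprivileged users need perf_event_paranoid set to 1 or lower. Going further, 0 also permits system-wide CPU events, and -1 additionally permits raw and ftrace function tracepoints. These lower values are generally discouraged on multi-user systems.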
Beyond perf_event_paranoid, there’s another related knob: perf_event_mlock_kb. This controls how much memory an unprivileged user can lock for perf’s ring buffers without it counting against the regular mlock limit. The default is 512 KB plus one page (516 on systems with 4 KB pages). If it’s too low for the buffer size you request, perf record can fail while mapping its buffers and suggest raising this value. You can increase it similarly:

    echo 2048 | sudo tee /proc/sys/kernel/perf_event_mlock_kb   # Allow 2048 KB of mlock

And for persistence, add:

    kernel.perf_event_mlock_kb = 2048

to your sysctl configuration.
The perf_event_paranoid setting is the primary mechanism for controlling what unprivileged users can measure with perf. By understanding its levels and configuring it appropriately using sysctl, you can enable necessary profiling while mitigating security risks. However, even with restrictive settings, the ability to observe execution details means that perf should always be used with an understanding of the potential for information leakage.
The next common issue you’ll encounter is perf failing with "Permission denied" even after adjusting perf_event_paranoid, which often points to insufficient perf_event_mlock_kb limits.