Intel Processor Trace (PT) is a hardware feature that records the execution flow of a program, allowing for incredibly detailed post-mortem analysis of what code actually ran.
Let’s see it in action. Imagine you’ve got a C program that’s misbehaving, maybe a race condition or a subtle bug that only appears under specific load. You want to know exactly which instructions were executed, in what order, leading up to the failure.
Here’s a simple C program:

    #include <stdio.h>
    #include <unistd.h>

    void worker(int id) {
        if (id % 2 == 0) {
            printf("Worker %d is doing something.\n", id);
        } else {
            printf("Worker %d is doing something else.\n", id);
        }
    }

    int main() {
        for (int i = 0; i < 5; ++i) {
            worker(i);
            usleep(10000); // sleep for 10ms
        }
        return 0;
    }

To trace this with perf, you need an Intel CPU that implements PT and a kernel whose perf subsystem exposes it. When both are in place, an `intel_pt` entry appears under `/sys/bus/event_source/devices/`. Then you'd run perf like this:

    sudo perf record -e intel_pt// -o trace.perf ./my_program

This command tells perf to:
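
* `sudo`: Run with root privileges, as tracing often requires it.
* `perf record`: The command to start recording performance data.
* `-e intel_pt//`: Specify the event to trace. `intel_pt` is the name of the Intel PT event source, and the slashes enclose its configuration terms; leaving them empty selects the defaults. No filtering is applied, so everything is traced. Appending a modifier, as in `intel_pt//u`, restricts tracing to user space.
* `-o trace.perf`: Save the trace data to a file named `trace.perf`.
* `./my_program`: The program to trace.
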
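If `perf record` rejects the `intel_pt` event, a quick first check is whether the running kernel exposes the PT event source in sysfs. The following is a minimal sketch, assuming the usual sysfs location `/sys/bus/event_source/devices`; the function name `pt_available` is made up for illustration and is not part of perf.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds if an intel_pt event source is listed under
# the given sysfs directory. The directory is a parameter so the check can be
# pointed at a test directory; the default is the usual sysfs location.
pt_available() {
    local sysfs_root="${1:-/sys/bus/event_source/devices}"
    [ -d "$sysfs_root/intel_pt" ]
}

if pt_available; then
    echo "intel_pt event source found"
else
    echo "intel_pt event source not found (unsupported CPU, or hidden by a VM or the kernel)"
fi
```

A missing directory does not say why PT is unavailable; it only tells you that `perf record -e intel_pt//` cannot work on this machine as it is currently booted.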
After the program finishes, you’ll have trace.perf. Now, you can analyze it. The raw PT data is a stream of packets indicating branches, calls, returns, and timing information. perf can decode this into a more human-readable format, often by associating it with the program’s symbols.

    perf script -i trace.perf > trace.txt

This perf script command takes the raw trace.perf file and outputs a textual representation of the execution flow to trace.txt. Opening trace.txt would show you lines like:
...
12345.678901: 12345 [000] 1000000 CPU 0/0:
my_program 400400 ...
Error: The system experienced a failure because the `cupsd` service failed to start. `cupsd` is the CUPS (Common Unix Printing System) daemon, which handles print jobs, so when it fails to start the entire printing subsystem is down.
Common Causes and Fixes:
1. **Corrupted CUPS Configuration:**
* **Diagnosis:** Check CUPS error logs. On most systems, this is `/var/log/cups/error_log`. Look for specific errors related to configuration parsing or loading.
```bash
sudo tail -n 50 /var/log/cups/error_log
```
* **Fix:** The most common culprit is a malformed `printers.conf` file. Back it up and let CUPS regenerate it.
```bash
sudo mv /etc/cups/printers.conf /etc/cups/printers.conf.bak
sudo systemctl restart cups
```
* **Why it works:** This removes any potentially corrupt user-defined configurations, forcing CUPS to start with a clean, default state.
2. **Permissions Issues on CUPS Directories:**
* **Diagnosis:** Verify ownership and permissions of CUPS-related directories, especially `/var/spool/cups` and `/var/cache/cups`. On most distributions both are owned by `root` with group `lp`, and neither is world-readable, because the spool holds users' print data.
```bash
ls -ld /var/spool/cups /var/cache/cups
```
Typical output for `/var/spool/cups`: `drwx--x--- 2 root lp 4096 Jan 1 10:00 /var/spool/cups`
Typical output for `/var/cache/cups`: `drwxrwx--- 2 root lp 4096 Jan 1 10:00 /var/cache/cups`
* **Fix:** Restore the ownership and permissions your distribution ships. The values below are the common defaults; compare with a working system or the package metadata if unsure.
```bash
sudo chown root:lp /var/spool/cups /var/cache/cups
sudo chmod 710 /var/spool/cups
sudo chmod 770 /var/cache/cups
sudo systemctl restart cups
```
* **Why it works:** CUPS needs to write temporary print job data and cache files; incorrect ownership or permissions can prevent this, leading to startup failures.
3. **Conflicting Print Drivers or Filters:**
* **Diagnosis:** Examine `/var/log/cups/error_log` for messages indicating problems loading specific drivers or filters, often mentioning `.so` files. Also, check `/etc/cups/ppd/` for PPD files that might be corrupt or incompatible.
* **Fix:** Remove or rename suspect PPD files or custom filters. If you recently added a printer or driver, try removing it.
```bash
# Example: If 'hp-laserjet.ppd' is suspect
sudo mv /etc/cups/ppd/hp-laserjet.ppd /etc/cups/ppd/hp-laserjet.ppd.bak
sudo systemctl restart cups
```
* **Why it works:** A bad driver or filter can crash the CUPS filter chain during initialization or when it tries to process a job, preventing the daemon from starting cleanly.
4. **Network Port Conflict (Less Common for `cupsd` itself):**
* **Diagnosis:** `cupsd` listens on port 631 by default, and only on `localhost` unless it has been configured for network access. If another process already holds that port, `cupsd` cannot bind to it. Check whether port 631 is already in use.
```bash
sudo ss -tulnp | grep ':631'
```
* **Fix:** If another process is using port 631, stop that process or reconfigure CUPS to use a different listening port (edit `/etc/cups/cupsd.conf`).
```bash
# Example: If another process is on 631, stop it.
# Then restart cups.
sudo systemctl restart cups
```
* **Why it works:** A service cannot bind to a port already in use, preventing `cupsd` from becoming available.
5. **Disk Space Full:**
* **Diagnosis:** Check available disk space, particularly on partitions hosting `/var/spool/cups` and `/var/cache/cups`.
```bash
df -h /var/spool/cups
```
* **Fix:** Free up disk space by deleting old print jobs, temporary files, or unnecessary data.
```bash
sudo systemctl restart cups
```
* **Why it works:** CUPS needs space to spool print jobs and store temporary data. A full disk prevents these operations, causing startup or runtime failures.
6. **SELinux/AppArmor Restrictions:**
* **Diagnosis:** Check system audit logs for SELinux or AppArmor denials related to `cupsd`.
```bash
# For SELinux:
sudo ausearch -m avc -ts recent | grep cupsd
# For AppArmor:
sudo grep cupsd /var/log/audit/audit.log
```
* **Fix:** Adjust SELinux/AppArmor policies to allow `cupsd` necessary access. This is system-specific, but often involves commands like `chcon` or `semanage` for SELinux, or editing profiles for AppArmor.
```bash
# Example SELinux fix for a common issue:
sudo semanage fcontext -a -t cupsd_var_run_t "/var/run/cups(/.*)?"
sudo restorecon -Rv /var/run/cups
sudo systemctl restart cups
```
* **Why it works:** Security modules can block `cupsd` from accessing required files or network sockets, even if standard file permissions are correct.
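The log and permission checks from the list above can be collected into a small read-only script. This is a sketch rather than anything shipped with CUPS: the function names `cups_recent_errors` and `dir_owner_mode` are invented here, the filter relies on CUPS prefixing each `error_log` line with a level letter, and `stat -c` assumes GNU coreutils.

```bash
#!/usr/bin/env bash
# Hypothetical read-only helpers for triaging a cupsd that will not start.
# Paths are parameters so each check can be pointed at a test directory.

# Print the most recent lines at error level or worse from a CUPS error log.
# Level letters: E = error, C = critical, A = alert, X = emergency.
cups_recent_errors() {
    local log="${1:-/var/log/cups/error_log}"
    local count="${2:-20}"
    if [ ! -r "$log" ]; then
        echo "cannot read $log" >&2
        return 1
    fi
    # grep exits non-zero when nothing matches; that is not a failure here.
    { grep -E '^[EACX] ' "$log" || true; } | tail -n "$count"
}

# Print "owner:group mode" for a path, e.g. "root:lp 710" (GNU stat syntax).
dir_owner_mode() {
    stat -c '%U:%G %a' "$1"
}
```

For example, `cups_recent_errors /var/log/cups/error_log 50` shows the last 50 serious log lines, and `dir_owner_mode /var/spool/cups` prints the ownership and mode to compare against your distribution's defaults. Both usually need to run as root, since the log and spool are often unreadable to ordinary users.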
After resolving the underlying issue, the next error you'll likely encounter is a "client-error-not-found" when trying to print, if you haven't re-added your printers yet after clearing configurations.