Compiler Optimization: From Source to Speed

The code a compiler handles fastest is often the code it barely has to touch.

Consider this C++ code. We want to see how quickly a compiler can turn it into machine code.

#include <iostream>
#include <vector>
#include <string>

// Function to process a vector of strings
void process_strings(const std::vector<std::string>& data) {
    for (const auto& str : data) {
        if (str.length() > 10) {
            std::cout << "Long string: " << str << std::endl;
        } else {
            std::cout << "Short string: " << str << std::endl;
        }
    }
}

int main() {
    std::vector<std::string> messages = {
        "Hello",
        "This is a much longer string than ten characters",
        "Short",
        "Another long one, definitely exceeding ten characters"
    };

    process_strings(messages);

    return 0;
}

To compile this, you’d typically use g++:

g++ -O2 main.cpp -o main

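The -O2 flag enables a broad set of optimizations. Before it can apply any of them, the compiler has to parse the three standard headers, instantiate std::vector<std::string> and the stream operators, and analyze the loop and the branch. Only then does it decide on transformations and generate machine instructions. All of this takes time.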

Now, what if we want to minimize compile time? The key is to reduce the amount of work the compiler has to do. This often means writing code that is simpler, more direct, and avoids constructs that require deep analysis or complex transformations.

Think about the process_strings function. It iterates, checks a condition, and prints. The compiler has to understand the loop, the if statement, and the potential paths. If we could express this more directly, or even eliminate the need for the compiler to generate the loop logic itself, we’d save time.

One way to drastically speed up compilation is to leverage pre-computation or static data. If the "processing" of strings could be done before compilation, the compiler’s job becomes trivial.
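A middle ground between the original program and a fully hardcoded one can be sketched as a static table. This is only an illustration under stated assumptions: the Entry struct, the kEntries table, and print_entries are hypothetical names that do not appear in the original program, and each label was chosen by hand ahead of time instead of being computed by a length check. The sketch uses plain character pointers and <cstdio>, which typically gives the compiler less to parse and no std::vector<std::string> to instantiate.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical record: each message sits next to a label that was chosen
// ahead of time (by hand here), so no length check is compiled in.
struct Entry {
    const char* label;
    const char* text;
};

// Static data: plain pointers to string literals, no dynamic allocation.
static const Entry kEntries[] = {
    {"Short string", "Hello"},
    {"Long string", "This is a much longer string than ten characters"},
    {"Short string", "Short"},
    {"Long string", "Another long one, definitely exceeding ten characters"},
};

static const std::size_t kEntryCount = sizeof(kEntries) / sizeof(kEntries[0]);

// Writes one "label: text" line per entry to the given stream.
void print_entries(std::FILE* out) {
    for (std::size_t i = 0; i < kEntryCount; ++i) {
        std::fprintf(out, "%s: %s\n", kEntries[i].label, kEntries[i].text);
    }
}
```

Calling print_entries(stdout) from main would reproduce the four output lines. The loop is still there, but the data it walks is fixed at build time and the branch is gone.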

Let’s imagine a scenario where the output is fixed. Instead of the compiler generating code to decide what to print, we can provide the exact output as a string literal.

#include <iostream>

int main() {
    // The entire output is predetermined and hardcoded.
    // Adjacent string literals are concatenated by the compiler,
    // so this is a single string passed to a single operator<< call.
    std::cout << "Short string: Hello\n"
                 "Long string: This is a much longer string than ten characters\n"
                 "Short string: Short\n"
                 "Long string: Another long one, definitely exceeding ten characters\n";
    return 0;
}

Compiling this version:

g++ -O0 main.cpp -o main

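Notice the -O0 flag. Here, we’re telling the compiler to skip nearly all optimizations (this is also GCC’s default level). Why? Because there’s virtually nothing to optimize. Keep in mind that this changes two things at once, the source and the flag; to isolate the effect of the simpler source, compile both versions at the same optimization level. The compiler’s task is reduced to: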

Parse the code.
See a single std::cout statement with multiple string literals concatenated.
Generate machine code to load that string and write it to standard output.

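This version should compile faster than the previous example, and the gap tends to grow with larger inputs or more complex logic. How large the gap is depends on the toolchain and is worth measuring: both versions still include <iostream>, and parsing that header accounts for a large share of the work in a file this small. What the compiler no longer needs to do is instantiate std::vector<std::string>, reason about loops and conditional branches, or generate code for memory management. It just has to print.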
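Since compile-time differences are easy to overestimate, it helps to measure them. The following is a minimal sketch of a timing helper, assuming a POSIX-like environment where std::system runs the command through a shell. The name time_command is hypothetical, and the helper measures wall-clock time for the whole command, including process startup and linking.

```cpp
#include <chrono>
#include <cstdlib>
#include <string>

// Runs a shell command via std::system and returns the elapsed wall-clock
// time in seconds. If exit_status is non-null, the value returned by
// std::system is stored there.
double time_command(const std::string& command, int* exit_status = nullptr) {
    const auto start = std::chrono::steady_clock::now();
    const int status = std::system(command.c_str());
    const auto stop = std::chrono::steady_clock::now();
    if (exit_status != nullptr) {
        *exit_status = status;
    }
    return std::chrono::duration<double>(stop - start).count();
}
```

For example, time_command("g++ -O2 main.cpp -o main") could be called once per version of main.cpp. Single runs are noisy, so repeating each measurement several times and comparing typical values gives a fairer picture.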

The "surprise" here is that often, the best compiler optimization you can ask for is to give it so little work that it barely needs to compile at all. This is achieved by shifting the computational burden away from runtime (and from the compiler’s own analysis) to an earlier build step, or even to manual coding.

When you encounter a long compilation time, ask yourself: "Can I pre-calculate this result? Can I represent this output directly as a static literal? Can I move the logic out of the code that the compiler needs to process?" The answer to these questions often leads to the fastest compile times.

The next step in understanding compiler performance is learning how to read the compiler’s timing reports and the generated assembly to see why certain code patterns are slow to compile, even without explicit optimization flags.