Protobuf nested messages are the primary way to compose complex data structures, but they’re often misunderstood as just a way to save space.

Let’s see this in action. Imagine we’re building a system to track user profiles, and each user has an address.

message Address {
  string street = 1;
  string city = 2;
  string state = 3;
  string zip_code = 4;
}

message UserProfile {
  string user_id = 1;
  string name = 2;
  Address home_address = 3; // Nested message
  repeated string favorite_colors = 4;
}

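When we serialize a UserProfile with a nested Address, no separate top-level Address object is written. Instead, the Address is encoded as a self-contained byte sequence embedded in the UserProfile’s serialized data: a key identifying home_address (field number 3), then the byte length of the encoded Address, then the Address fields themselves. The structure is not flattened, and nesting does not save space: each embedded message costs a few extra bytes for its key and length prefix. If home_address is never set, nothing is written for it at all.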

The core problem nested messages solve is schema organization and reusability. Instead of duplicating address fields (street, city, state, zip) in every message that needs an address, you define Address once and then reference it. This leads to:

  • Clarity: Your schema becomes more readable and less repetitive.
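  • Maintainability: If you need to add a field to an address (e.g., country), you only change it in one place (the Address message), and every message that embeds Address, including UserProfile, gains that field once the generated code is rebuilt. Data serialized before the change still parses; the new field is simply unset.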
  • Encapsulation: It logically groups related data.

Internally, Protobuf identifies each field on the wire by its field number (often called its tag). For a nested message such as home_address with field number 3, the serialized UserProfile contains:

  1. The field number for home_address (3).
  2. The wire type for a length-delimited field (2), since an embedded message is written as an opaque sequence of bytes. The field number and wire type are packed together into a single varint key: (3 << 3) | 2 = 26, or the byte 0x1A.
  3. The length in bytes of the encoded Address message, as a varint.
  4. The encoded Address message itself, containing its own fields with their respective keys and values.
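The layout above can be reproduced by hand. The following is a minimal sketch in pure Python, not the official library: the helper names (encode_varint, encode_len_field, encode_string) and the sample values are made up for illustration, and only length-delimited fields are handled.

```python
LENGTH_DELIMITED = 2  # wire type for strings, bytes, and embedded messages


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_len_field(field_number: int, payload: bytes) -> bytes:
    """Key (field number + wire type), then length, then the payload."""
    key = (field_number << 3) | LENGTH_DELIMITED
    return encode_varint(key) + encode_varint(len(payload)) + payload


def encode_string(field_number: int, text: str) -> bytes:
    return encode_len_field(field_number, text.encode("utf-8"))


# Address { street = 1, city = 2 } with illustrative values
address = encode_string(1, "1 Main St") + encode_string(2, "Springfield")

# UserProfile { user_id = 1, home_address = 3 }
profile = encode_string(1, "u-42") + encode_len_field(3, address)

print(profile.hex(" "))
```

After the six bytes for user_id, the output contains 1a (the key for field 3, wire type 2), then 18 (the Address is 24 bytes long), then the Address bytes unchanged. The nested message is embedded whole, not merged into the parent's fields.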

The levers you control are the message definitions themselves and the field numbers assigned to fields. Field numbers are what appear on the wire, so once a schema is in use they should not be changed or reused. When defining a nested message, you are essentially creating a sub-schema that will be embedded. The repeated keyword can also be used with nested messages, allowing you to have a list of complex objects, like repeated Address past_addresses = 5;. Each element is written as its own key, length, and payload, in order.
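To illustrate a repeated nested field such as past_addresses = 5, here is a small pure-Python sketch that writes two Address records under the same field number and reads them back. The helpers (decode_varint, iter_len_fields) and the city values are hypothetical, and the decoder only understands length-delimited fields.

```python
LENGTH_DELIMITED = 2


def encode_varint(value: int) -> bytes:
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(data: bytes, pos: int):
    """Return (value, position just past the varint)."""
    result, shift = 0, 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_len_field(field_number: int, payload: bytes) -> bytes:
    key = (field_number << 3) | LENGTH_DELIMITED
    return encode_varint(key) + encode_varint(len(payload)) + payload


def iter_len_fields(data: bytes):
    """Yield (field_number, payload) for each length-delimited field."""
    pos = 0
    while pos < len(data):
        key, pos = decode_varint(data, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type != LENGTH_DELIMITED:
            raise ValueError("this sketch only handles wire type 2")
        length, pos = decode_varint(data, pos)
        yield field_number, data[pos:pos + length]
        pos += length


# Two Address messages, each with only city = 2 set
old = encode_len_field(2, b"Shelbyville")
new = encode_len_field(2, b"Springfield")

# repeated Address past_addresses = 5: one record per element, same field number
profile = (encode_len_field(1, b"u-42")
           + encode_len_field(5, old)
           + encode_len_field(5, new))

past = [payload for number, payload in iter_len_fields(profile) if number == 5]
cities = [dict(iter_len_fields(p))[2].decode("utf-8") for p in past]
print(cities)
```

The decoder collects every record with field number 5 in the order it appears, which is how the list order is preserved. Unlike repeated numeric scalars, repeated messages are never packed into a single record.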

The most surprising thing about nested messages is how they interact with oneof. On the wire, a nested message inside a oneof is encoded exactly like a regular nested message field; nothing in the bytes marks it as part of a oneof. The constraint lives in the generated code, which tracks which member is currently set and clears the others when you set a new one. If a parser encounters more than one member of the same oneof in its input, the last one seen wins.
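A rough sketch of that behavior, again in pure Python rather than the real library. It assumes a hypothetical Contact message with oneof channel { string email = 1; Address postal = 2; }, which is not part of the schema above, and models only the "last member wins" rule.

```python
LENGTH_DELIMITED = 2
ONEOF_CHANNEL = {1: "email", 2: "postal"}  # hypothetical oneof members


def encode_varint(value: int) -> bytes:
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(data: bytes, pos: int):
    result, shift = 0, 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_len_field(field_number: int, payload: bytes) -> bytes:
    key = (field_number << 3) | LENGTH_DELIMITED
    return encode_varint(key) + encode_varint(len(payload)) + payload


def parse_contact(data: bytes):
    """Return (member_name, payload) for the oneof, or (None, None)."""
    which, value = None, None
    pos = 0
    while pos < len(data):
        key, pos = decode_varint(data, pos)
        if key & 0x07 != LENGTH_DELIMITED:
            raise ValueError("this sketch only handles wire type 2")
        length, pos = decode_varint(data, pos)
        payload = data[pos:pos + length]
        pos += length
        if key >> 3 in ONEOF_CHANNEL:
            # A later member replaces whatever was set before it.
            which, value = ONEOF_CHANNEL[key >> 3], payload
    return which, value


address = encode_len_field(2, b"Springfield")      # Address with city = 2
postal = encode_len_field(2, address)              # oneof member "postal"
email = encode_len_field(1, b"a@example.com")      # oneof member "email"

print(parse_contact(postal)[0])           # postal
print(parse_contact(postal + email)[0])   # email
```

The bytes for postal are identical to what a plain Address field numbered 2 would produce. Only the parsing side knows that fields 1 and 2 are mutually exclusive.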

If you’re using a Protobuf runtime that exposes its reflection API (as the Java and Python libraries do), you can inspect the structure of a nested message at run time through its descriptor, without needing to know its exact fields at compile time.
