Pulsar’s bookies, the nodes responsible for ledger storage, are surprisingly flexible in how they handle data on disk.
Let’s see this in action. Imagine a bookie with a couple of disks, /mnt/disk1 and /mnt/disk2. Pulsar can be configured to use both, and it’s not just about splitting the load; it’s about how it prioritizes and recovers.
Here’s a typical bookie configuration snippet showing how you’d define these storage locations. The key names below are YAML-style; in a stock conf/bookkeeper.conf the same settings are ledgerDirectories and journalDirectories, each a comma-separated list of paths:

    bookkeeper:
      # ... other bookkeeper settings
      storage:
        # List one path per disk to use multiple disks
        disk-dirs:
          - /mnt/disk1/bookkeeper/ledgers
          - /mnt/disk2/bookkeeper/ledgers
        # Optional: specify a journal directory if you want it separate from data
        # journal-dirs:
        #   - /mnt/disk1/bookkeeper/journal
        #   - /mnt/disk2/bookkeeper/journal

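For a bookie configured through conf/bookkeeper.conf, the same directory lists are written as comma-separated properties. The following is a minimal Python sketch that renders them. ledgerDirectories and journalDirectories are the BookKeeper property names, while render_dirs is a hypothetical helper invented for this illustration.

```python
from typing import Sequence


def render_dirs(key: str, paths: Sequence[str]) -> str:
    """Render one bookkeeper.conf line of the form key=path1,path2,..."""
    if not paths:
        raise ValueError(f"{key} needs at least one path")
    # The properties format is a plain comma-separated list: no spaces, no brackets.
    return f"{key}={','.join(paths)}"


ledger_dirs = ["/mnt/disk1/bookkeeper/ledgers", "/mnt/disk2/bookkeeper/ledgers"]
journal_dirs = ["/mnt/disk1/bookkeeper/journal", "/mnt/disk2/bookkeeper/journal"]

print(render_dirs("ledgerDirectories", ledger_dirs))
print(render_dirs("journalDirectories", journal_dirs))
```

Generating the lines from a list keeps the per-disk paths in one place, which helps when the same layout is rolled out to many bookies.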
When a bookie writes ledger data, it doesn’t simply round-robin individual entries across these directories. Instead, each ledger is assigned to a specific disk directory, so all of that ledger’s data on this bookie lives on one disk. If a disk fails, only the ledgers residing on that disk are affected, and those ledgers are then re-replicated from the copies held by other bookies in the ensemble. This isolation limits the damage a single disk failure can do to the bookie’s data.
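The per-ledger assignment and its effect on failure isolation can be sketched in a few lines of Python. The mapping rule used here (ledger ID modulo the number of directories) is an assumption made for illustration, and the actual rule depends on the ledger storage implementation in use. dir_for_ledger and ledgers_on are hypothetical names.

```python
from typing import List

LEDGER_DIRS = ["/mnt/disk1/bookkeeper/ledgers", "/mnt/disk2/bookkeeper/ledgers"]


def dir_for_ledger(ledger_id: int, dirs: List[str]) -> str:
    """Map a ledger to exactly one directory (assumed rule: ID modulo directory count)."""
    return dirs[ledger_id % len(dirs)]


def ledgers_on(failed_dir: str, ledger_ids: List[int], dirs: List[str]) -> List[int]:
    """Return the ledgers whose data lived on the failed directory."""
    return [lid for lid in ledger_ids if dir_for_ledger(lid, dirs) == failed_dir]


ledger_ids = [100, 101, 102, 103, 104]
affected = ledgers_on(LEDGER_DIRS[0], ledger_ids, LEDGER_DIRS)
print(affected)  # [100, 102, 104]
```

Because the mapping is per ledger rather than per entry, losing the first directory touches ledgers 100, 102, and 104 only; ledgers 101 and 103 are unaffected.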
The core problem bookies solve is ensuring data durability and availability for Pulsar topics. They achieve this by replicating ledger entries across multiple bookies. When a client publishes to a Pulsar topic, the broker appends the message as an entry to the topic’s current ledger and sends that entry in parallel to a subset of the ledger’s ensemble (the "write quorum"). Bookies do not forward entries to one another. Each bookie persists the entry to disk and acknowledges it, and the message is confirmed to the client only once enough of those bookies (the "ack quorum") have acknowledged the write.
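To make the quorum terms concrete, here is a small Python sketch of how an entry’s target bookies can be chosen and when the write counts as confirmed. It assumes round-robin striping of entries over the ensemble. write_set and is_confirmed are hypothetical helpers, and bookies are represented by their index in the ensemble.

```python
from typing import List, Set


def write_set(entry_id: int, ensemble_size: int, write_quorum: int) -> List[int]:
    """Ensemble indices of the bookies that receive this entry (round-robin striping)."""
    return [(entry_id + i) % ensemble_size for i in range(write_quorum)]


def is_confirmed(acked: Set[int], targets: List[int], ack_quorum: int) -> bool:
    """An entry is confirmed once ack_quorum of its target bookies have acknowledged."""
    return len(acked & set(targets)) >= ack_quorum


E, QW, QA = 3, 2, 2  # ensemble size, write quorum, ack quorum
targets = write_set(entry_id=2, ensemble_size=E, write_quorum=QW)
print(targets)                            # [2, 0]
print(is_confirmed({2}, targets, QA))     # False
print(is_confirmed({2, 0}, targets, QA))  # True
```

With an ack quorum smaller than the write quorum, a single slow bookie no longer delays confirmation, at the cost of fewer guaranteed copies at the moment the client is acknowledged.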
The storage.disk-dirs setting (ledgerDirectories in a stock bookkeeper.conf) tells the bookie which physical or logical storage locations to use. When you list more than one directory, the bookie distributes ledgers across them. For example, with disk-dirs: [/data/disk1, /data/disk2], the bookie creates its own subdirectories for ledger data under both /data/disk1 and /data/disk2.
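One of the most surprising aspects is how bookies handle disk problems. If /mnt/disk1 fills up or is flagged as unhealthy, the bookie does not have to lose everything: the ledgers residing on /mnt/disk1 are the ones affected, while the ledgers on /mnt/disk2 are untouched. Recovery for the lost ledgers then relies on the replication factor and the data present on other bookies. How much of this happens inside a running bookie depends on the failure type and on the BookKeeper version and configuration: a directory that fills up is taken out of the write path while the bookie keeps serving, whereas a hard I/O error on a directory may cause the bookie to shut down and leave the re-replication to auto-recovery. This granular handling is a large part of what makes Pulsar’s storage layer resilient.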
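The recovery step depends on knowing which ledgers dropped below their intended replica count. A minimal Python sketch of that bookkeeping follows. The data structures, the bookie names, and the under_replicated helper are all hypothetical and only model the idea; they are not BookKeeper’s auditor.

```python
from typing import Dict, List, Set


def under_replicated(replicas: Dict[int, Set[str]],
                     lost: Dict[str, List[int]],
                     write_quorum: int) -> Dict[int, Set[str]]:
    """Remove lost copies; return ledgers left with fewer than write_quorum replicas."""
    result: Dict[int, Set[str]] = {}
    for ledger_id, bookies in replicas.items():
        # A copy survives unless its bookie lost the disk that held this ledger.
        survivors = {b for b in bookies if ledger_id not in lost.get(b, [])}
        if len(survivors) < write_quorum:
            result[ledger_id] = survivors
    return result


replicas = {
    100: {"bookie-a", "bookie-b"},
    101: {"bookie-a", "bookie-c"},
    102: {"bookie-b", "bookie-c"},
}
# bookie-a lost the disk that stored ledger 100; its copy of ledger 101 was on another disk.
lost = {"bookie-a": [100]}

print(under_replicated(replicas, lost, write_quorum=2))  # {100: {'bookie-b'}}
```

Only ledger 100 needs re-replication, and bookie-b is the surviving source to copy from.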
You can monitor the health and usage of these disks with the BookKeeper shell that ships with Pulsar. Running bin/bookkeeper shell listledgers lists the ledgers known to the cluster, and bin/bookkeeper shell bookieinfo reports free and total disk space for each bookie. If you suspect a disk issue, check the bookie logs (for example /var/log/pulsar/bookkeeper.log; the exact path depends on your installation). Look for I/O errors, disk-full warnings, or messages indicating that a specific directory is no longer accessible.
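Alongside the shell and the logs, a quick check of the directories themselves can be scripted. The Python sketch below uses only the standard library. check_dirs is a hypothetical helper, and the 0.95 threshold is an assumed value that you would align with your bookie’s own disk-usage threshold setting.

```python
import os
import shutil
from typing import Dict, List


def check_dirs(dirs: List[str], threshold: float = 0.95) -> Dict[str, str]:
    """Classify each directory as 'ok', 'full', 'unknown', or 'inaccessible'."""
    status: Dict[str, str] = {}
    for d in dirs:
        if not os.path.isdir(d) or not os.access(d, os.W_OK):
            status[d] = "inaccessible"
            continue
        usage = shutil.disk_usage(d)
        if usage.total == 0:
            status[d] = "unknown"  # some virtual filesystems report no capacity
            continue
        used_fraction = usage.used / usage.total
        status[d] = "full" if used_fraction >= threshold else "ok"
    return status


if __name__ == "__main__":
    report = check_dirs([
        "/mnt/disk1/bookkeeper/ledgers",
        "/mnt/disk2/bookkeeper/ledgers",
    ])
    for path, state in report.items():
        print(f"{path}: {state}")
```

Run from cron or a monitoring agent, a check like this flags a directory that is filling up before the bookie itself starts refusing writes to it.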
The mechanics of assigning ledgers to disks are handled inside the bookie, not by the broker or the client. When a new ledger is created, the bookie picks an available disk-dir to host it. The bookie also checks its directories periodically: if a disk-dir crosses the configured usage threshold or becomes inaccessible, new ledgers are directed to the remaining writable directories.
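The selection step can be modeled as a filter followed by a choice. The Python sketch below skips directories that are inaccessible or over the usage threshold and then picks the least-used one. The least-used preference, the 0.95 threshold, and the DirState and pick_dir names are assumptions for illustration; the bookie’s real selection policy may differ.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DirState:
    path: str
    used_fraction: float  # 0.0 .. 1.0
    accessible: bool = True


def pick_dir(dirs: List[DirState], threshold: float = 0.95) -> Optional[str]:
    """Return the least-used writable directory, or None if no directory qualifies."""
    writable = [d for d in dirs if d.accessible and d.used_fraction < threshold]
    if not writable:
        return None  # no writable directory is left
    return min(writable, key=lambda d: d.used_fraction).path


dirs = [
    DirState("/mnt/disk1/bookkeeper/ledgers", used_fraction=0.97),
    DirState("/mnt/disk2/bookkeeper/ledgers", used_fraction=0.40),
]
print(pick_dir(dirs))  # /mnt/disk2/bookkeeper/ledgers
```

Returning None when nothing qualifies makes the "all directories full" case explicit, which is the point at which a bookie can no longer accept new writes.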
The next frontier for bookie storage management is understanding how Pulsar’s garbage collection mechanism interacts with disk space, particularly when dealing with ledger deletions and space reclamation.