Redpanda gracefully handles node removal, but it’s not as simple as just shutting down a server.
Let’s say you have a three-node Redpanda cluster running on node1, node2, and node3. You want to remove node3.
Here’s how you’d do it:
First, you need to tell Redpanda that node3 is leaving the cluster. Part of this is removing node3 from the seed list that the remaining brokers use to find their peers at startup. To do that, edit the redpanda.yaml file on one of the remaining nodes (node1 or node2).
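Below the cluster_id and node_id settings you’ll see the listeners and, further down, the seeds list. (In a stock Redpanda configuration file this list is named seed_servers and sits under the redpanda key; the example here calls it seeds.) Remove the entry for node3 from that list. For example, suppose your redpanda.yaml on node1 looks like this: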
# redpanda.yaml on node1
cluster_id: "some-unique-cluster-id"
node_id: 1
listeners:
  http:
    - name: rpc
      address: 0.0.0.0:9644
      require_tls: false
  kafka:
    - name: kafka
      address: 0.0.0.0:9092
      require_tls: false
  admin:
    - name: admin
      address: 0.0.0.0:9645
      require_tls: false
raft:
  tls:
    cert_file: ""
    key_file: ""
    truststore: ""
    keystore: ""
    require_client_auth: false
log_dir: "/var/lib/redpanda/kvs"
tune_networking: true
cloud_storage:
  enabled: true
  access_key: ""
  secret_key: ""
  endpoint: ""
  bucket: ""
  region: ""
  disable_tls: true
panda_proxy:
  enabled: true
  kafka_api_timeout_ms: 30000
  redpanda_api_timeout_ms: 30000
  external_listeners:
    - name: kafka_external
      address: 192.168.1.101:9092 # External IP for node1
      tls:
        enabled: false
  admin_api_timeout_ms: 30000
  admin_api_external_listeners:
    - name: admin_external
      address: 192.168.1.101:9645 # External IP for node1
      tls:
        enabled: false
seeds:
  - host: "node1"
    port: 9644
  - host: "node2"
    port: 9644
  - host: "node3"
    port: 9644
You would change the seeds section to:
seeds:
  - host: "node1"
    port: 9644
  - host: "node2"
    port: 9644
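If you need to make the same edit on several nodes, the change to the seed list can be scripted. The following is a minimal sketch in Python, assuming the file has already been parsed into a dictionary (for example with a YAML library) and uses the top-level seeds key from the example above. The helper name remove_seed is hypothetical.

```python
import copy


def remove_seed(config: dict, host: str) -> dict:
    """Return a copy of the parsed config without the seed entry for `host`."""
    updated = copy.deepcopy(config)  # leave the caller's dictionary untouched
    seeds = updated.get("seeds", [])
    updated["seeds"] = [seed for seed in seeds if seed.get("host") != host]
    return updated


# Parsed form of the seeds section shown above (other keys omitted for brevity).
config = {
    "node_id": 1,
    "seeds": [
        {"host": "node1", "port": 9644},
        {"host": "node2", "port": 9644},
        {"host": "node3", "port": 9644},
    ],
}

updated = remove_seed(config, "node3")
print([seed["host"] for seed in updated["seeds"]])  # ['node1', 'node2']
```

Returning a copy rather than editing in place makes it easy to compare the old and new seed lists before writing anything back to disk.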
After saving this change on node1, restart Redpanda on node1 so that it picks up the new configuration. redpanda.yaml is read when the service starts, so saving the file and restarting is enough:
sudo systemctl restart redpanda
You’ll then repeat this process on node2, editing its redpanda.yaml to remove node3 from the seeds and restarting Redpanda.
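Updating the configuration on the remaining nodes is only part of the removal. node3 stays a cluster member until it is decommissioned, which you can request with rpk redpanda admin brokers decommission followed by node3’s node ID. Redpanda then moves every partition replica stored on node3 to node1 and node2 so that each partition keeps its replication factor, which preserves data availability and durability. This can only complete if no topic has a replication factor higher than the number of remaining brokers (two in this example), because the replicas of a partition must live on different brokers.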
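As a sketch of the pre-check and the command involved: the replicas of one partition live on different brokers, so the data movement can only finish when every topic’s replication factor fits on the brokers that remain. The Python below flags topics that would block the move and builds an rpk redpanda admin brokers decommission invocation, the rpk subcommand for decommissioning a broker. The topic names, their replication factors, and the helper names are hypothetical, and node ID 3 for node3 is an assumption.

```python
import subprocess
from typing import Dict, List


def topics_blocking_removal(
    replication_factors: Dict[str, int], remaining_brokers: int
) -> List[str]:
    """Topics whose replication factor cannot fit on the remaining brokers."""
    return sorted(
        topic
        for topic, rf in replication_factors.items()
        if rf > remaining_brokers
    )


def decommission_command(node_id: int) -> List[str]:
    """Build the rpk invocation that asks the cluster to decommission a broker."""
    return ["rpk", "redpanda", "admin", "brokers", "decommission", str(node_id)]


def decommission(
    node_id: int, replication_factors: Dict[str, int], remaining_brokers: int
) -> None:
    """Run the decommission only if no topic would be left without a home."""
    blocked = topics_blocking_removal(replication_factors, remaining_brokers)
    if blocked:
        raise RuntimeError(
            f"replication factor too high for {remaining_brokers} brokers: {blocked}"
        )
    # Assumes rpk is on PATH and already configured to reach the cluster.
    subprocess.run(decommission_command(node_id), check=True)


# Hypothetical topics and replication factors, for illustration only.
topics = {"orders": 3, "metrics": 1}
print(topics_blocking_removal(topics, remaining_brokers=2))  # ['orders']
```

In this example the orders topic would have to be dealt with first, either by lowering its replication factor or by adding a replacement broker before node3 is removed.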
You can monitor the rebalancing progress using the rpk tool:
rpk cluster status
The output lists the brokers that are currently members of the cluster and the topics they host. For a health summary, including nodes that are down and partitions that are leaderless or under-replicated, run rpk cluster health.
After the rebalancing is complete and node3 is no longer listed in the rpk cluster status output, you can safely shut down and remove the physical or virtual machine that hosted node3.
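The wait for node3 to disappear from the broker list can be automated with a polling loop. The following is a minimal sketch: get_broker_ids is a hypothetical callable that returns the node IDs currently in the cluster (for example by parsing the output of rpk cluster status), and node ID 3 for node3 is an assumption.

```python
import time
from typing import Callable, Iterable


def wait_until_removed(
    node_id: int,
    get_broker_ids: Callable[[], Iterable[int]],
    timeout_s: float = 600.0,
    poll_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until `node_id` is gone from the broker list.

    Returns True once the broker is no longer listed, or False if the
    timeout expires first.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if node_id not in set(get_broker_ids()):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(poll_s)


# Demo with canned snapshots instead of a live cluster; sleeping is disabled.
snapshots = iter([[1, 2, 3], [1, 2, 3], [1, 2]])
print(wait_until_removed(3, lambda: next(snapshots), sleep=lambda _: None))  # True
```

Passing the broker lookup in as a callable keeps the loop independent of how the list is obtained, so it can be exercised without a running cluster.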
The final step is to ensure your application’s Kafka client configurations (bootstrap servers) are updated to reflect the remaining nodes. If your clients were pointing to node3, they’ll need to be updated to use node1 and node2.
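As an illustration of the client-side change, the sketch below drops one host from a comma-separated bootstrap servers string. The helper name drop_broker and the port 9092 entries are assumptions for this example, and it handles plain host:port entries only.

```python
def drop_broker(bootstrap_servers: str, removed_host: str) -> str:
    """Remove every entry for `removed_host` from a bootstrap servers string."""
    kept = []
    for entry in bootstrap_servers.split(","):
        entry = entry.strip()
        if not entry:
            continue  # skip empty items left by stray commas
        host = entry.rsplit(":", 1)[0]  # text before the last colon
        if host != removed_host:
            kept.append(entry)
    return ",".join(kept)


print(drop_broker("node1:9092,node2:9092,node3:9092", "node3"))
# node1:9092,node2:9092
```

Comparing the whole host name, rather than testing for a substring, avoids accidentally dropping a different broker such as node30.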
The next thing you’ll likely encounter is needing to add a new node to the cluster, which involves a similar configuration update process but with the new node’s details.