Understanding Quorum – When the Raspberry Pi comes in
- How does it work? A quorum is achieved when a group of connected nodes represents more than half of the total nodes in the cluster.
- You might ask, ‘What is a quorum in a Galera cluster?’ Think of it as a majority vote. To prevent a ‘split-brain’—where different nodes accept conflicting writes—the cluster must have a quorum to allow changes to the data.
- In a 3-node cluster, you need at least 2 nodes online.
- In a 4-node cluster, you need at least 3 nodes online (2 out of 4 is exactly half, which is not a majority).
- In a 5-node cluster, you need at least 3 nodes online.
- What happens if nodes fail and quorum is lost? If a network problem splits the cluster, only the group with the majority (the ‘Primary Component’) will continue to allow reads and writes.
- Any node or group of nodes that ends up in the minority automatically freezes itself. It won’t even accept reads, because the data could be out of date. This is a critical safety feature that protects your data’s integrity. So it doesn’t just become ‘read-only’ – for all practical purposes, it becomes non-operational until it can rejoin the main cluster.
- So far we have deployed 4 Galera nodes, two on each of our two Proxmox nodes. What if one of the Proxmox nodes goes down? Only 2 of the 4 Galera nodes would remain, which is exactly half and not a majority, so our Galera cluster would simply stop working. To prevent this, we need a 5th voting member that lives outside both Proxmox nodes. We can add another tiny element to our infrastructure – a spare Raspberry Pi (or another mini server) that is connected to the same network segment and acts as a so-called Arbitrator.
- The Arbitrator serves as the deciding member of the quorum even though it does NOT hold any database data. It needs next to no CPU or storage, but it does receive the same replication traffic as the other nodes, so give it a reliable network link. This way, if we have 2 LXCs on Proxmox node 1 and 2 LXCs on Proxmox node 2 and one Proxmox node goes down, the Arbitrator together with the 2 surviving Galera nodes still holds 3 of 5 votes, which is a quorum.
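- To make the arithmetic above concrete, here is a minimal sketch of the majority rule. It assumes every member carries one equal vote and that the missing members failed uncleanly; the function names `quorum_needed` and `has_quorum` are my own and not part of Galera.

```shell
#!/bin/sh
# Smallest number of votes that is strictly more than half of TOTAL ($1).
quorum_needed() {
    echo $(( $1 / 2 + 1 ))
}

# has_quorum ONLINE TOTAL: succeeds when ONLINE votes are a majority of TOTAL.
has_quorum() {
    [ "$1" -ge "$(quorum_needed "$2")" ]
}

for total in 3 4 5; do
    echo "$total votes in the cluster: need $(quorum_needed "$total") online"
done

# Two Proxmox nodes with 2 Galera nodes each, one Proxmox node goes down.
if has_quorum 2 4; then
    echo "4 Galera nodes, no Arbitrator: quorum kept"
else
    echo "4 Galera nodes, no Arbitrator: quorum lost"
fi

# Same failure with the Arbitrator: 2 surviving Galera nodes + garbd = 3 of 5.
if has_quorum 3 5; then
    echo "4 Galera nodes + Arbitrator: quorum kept"
else
    echo "4 Galera nodes + Arbitrator: quorum lost"
fi
```

- The point of the two checks at the end: an even split can never win a majority vote, so the fifth vote has to sit outside both Proxmox nodes.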
Hardware Setup for the Pi
- Grab a Raspberry Pi (any version will do if it’s not already busy with other tasks).
- Ideally, the Pi should be connected to the same switch and the same network as the Proxmox nodes for low latency. I would not recommend a WiFi connection here.
- The OS is up to you; a basic headless Raspberry Pi OS Lite will do. In this guide, I assume it is a Debian-based distro.
- Ideally, have it connected to a UPS. Either use one that covers several devices (such as your Proxmox nodes + router(s) + the Pi) or one that is dedicated to the Pi, such as a power bank with UPS capabilities. Ping me for tips if you are struggling to find one 🙂
Installation Process (Garb)
- Connect to the Pi via SSH and run the following:
    # Update / upgrade the OS to its newest version
    sudo apt update && sudo apt upgrade -y
    # Install the Galera Arbitrator only (no need for a MariaDB server)
    sudo apt install galera-arbitrator-4 -y
- Let’s configure the Arbitrator:
    sudo nano /etc/default/garb

    # A comma-separated list of other node addresses (IPs) in your Galera cluster.
    # At least one of these must be contactable at startup.
    GALERA_NODES="<node1_ip>:4567,<node2_ip>:4567,<node3_ip>:4567,<node4_ip>:4567"
    # The Galera cluster name such as ClusterA
    GALERA_GROUP="my_galera_cluster"
    # Optional: log file for garbd. Leave it commented out or
    # you will run into permission issues.
    # LOG_FILE=""
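- Before starting the service, it can help to confirm that the Pi can actually reach the addresses you just put into GALERA_NODES. The following is only a sketch: the helper names `split_nodes` and `check_nodes` are made up for this guide, it assumes IPv4 addresses or hostnames (no IPv6 literals), and it assumes `bash` and `timeout` are available on the Pi.

```shell
#!/bin/sh
# Split a GALERA_NODES-style value into one "host port" pair per line.
# Port 4567 is assumed when an entry carries no explicit port.
split_nodes() {
    printf '%s\n' "$1" | tr ',' '\n' | while IFS= read -r entry; do
        [ -n "$entry" ] || continue
        case $entry in
            *:*) host=${entry%%:*}; port=${entry##*:} ;;
            *)   host=$entry; port=4567 ;;
        esac
        printf '%s %s\n' "$host" "$port"
    done
}

# Attempt a TCP connection to every node and report the result.
check_nodes() {
    split_nodes "$1" | while read -r host port; do
        # bash's /dev/tcp pseudo-device opens a TCP connection; give up after 3 s.
        if timeout 3 bash -c "exec 3<>/dev/tcp/$host/$port" </dev/null 2>/dev/null; then
            echo "reachable:   $host:$port"
        else
            echo "unreachable: $host:$port"
        fi
    done
}

# Usage on the Pi, reading the value from the file edited above:
#   . /etc/default/garb && check_nodes "$GALERA_NODES"
```

- Only one node has to be contactable at startup, but an unreachable entry usually points to a typo or a firewall rule worth fixing now.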
- Save and exit, then start and enable the garbd service:
    # Start the garbd service
    sudo systemctl start garbd
    # Verify the service is up and running
    sudo systemctl status garbd
    # Set it up to auto-start on boot
    sudo systemctl enable garbd
    # Monitor the log
    # tail -f /var/log/garbd.log
- Log into any of your actual MariaDB nodes and check the cluster size. The Arbitrator counts as a member, so with 4 database nodes it should now report 5.
mysql -u root -p -e "SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size';"
- Now our cluster is resilient to a split-brain scenario if one of the two Proxmox nodes is down.
- In my case, I use the Pi also as a Proxmox backup server with a large 1 TB micro SD A2 card + as a quorum member for the Proxmox nodes + a quorum member for the Galera cluster. Quite neat!
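- If you want to turn the cluster size check into something scriptable, here is a rough sketch. The function name `cluster_ok` is hypothetical; it only parses the tab-separated output that the `mysql` client prints with `-N -B`, and it treats the cluster as healthy when `wsrep_cluster_status` is `Primary` and `wsrep_cluster_size` equals the number you expect.

```shell
#!/bin/sh
# cluster_ok EXPECTED_SIZE
# Reads "name<TAB>value" status lines on stdin. Succeeds only if this node is
# in the Primary Component and sees the expected number of members.
cluster_ok() {
    expected=$1
    size=""
    status=""
    while read -r name value; do
        case $name in
            wsrep_cluster_size)   size=$value ;;
            wsrep_cluster_status) status=$value ;;
        esac
    done
    echo "size=$size status=$status"
    [ "$status" = "Primary" ] && [ "$size" = "$expected" ]
}

# Usage on a MariaDB node (4 database nodes + the Arbitrator = 5):
#   mysql -u root -p -N -B -e "SHOW GLOBAL STATUS LIKE 'wsrep_cluster_%';" \
#       | cluster_ok 5 && echo "cluster healthy"
```

- A size below the expected number with a `Primary` status means a member is missing but the cluster still has a quorum; any other status means this node is outside the Primary Component.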
