In my previous role, my primary focus was on PostgreSQL databases, specifically database setup, configuration, and infrastructure rather than SQL-level work. For instance, one of my tasks was establishing streaming replication between a master and two standby database servers.

I found this work incredibly exciting. I vividly remember the satisfaction of getting streaming replication working during my first week on the job. The ability to configure a system from scratch and see it replicate an entire database from one machine to another was exhilarating!

However, as you’re likely aware, working with such systems comes with its challenges.

When the master server (node) fails, one of the standby servers needs to take over as the new master. Afterwards, the old master has to be rebuilt: it rejoins the cluster as a standby first and can only later be promoted back to master. In my experience, this rebuild can be slow, particularly for the large databases common in big enterprises. In a business environment where time is money, long rebuild times are a real problem.

During my time in this role, a former colleague introduced me to an application running on something called Kubernetes. He demonstrated a feature where intentionally destroying one of the pods resulted in its automatic replacement, with no manual intervention and no SSH-ing into the pod to revive it.
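To make the idea concrete, here is a toy sketch in Python of the "desired state versus actual state" loop that, as I understand it, sits behind this behaviour. It is not the Kubernetes API or its real implementation: `ToyCluster`, `reconcile`, `kill`, and the `app-N` pod names are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ToyCluster:
    """A toy stand-in for a cluster; hypothetical, not a real Kubernetes API."""
    desired_replicas: int
    pods: set = field(default_factory=set)
    _ids: count = field(default_factory=lambda: count(1), repr=False)

    def reconcile(self):
        """Create pods until the actual count matches the desired count."""
        created = []
        while len(self.pods) < self.desired_replicas:
            name = f"app-{next(self._ids)}"
            self.pods.add(name)
            created.append(name)
        return created

    def kill(self, name):
        """Simulate a pod being destroyed."""
        self.pods.discard(name)


cluster = ToyCluster(desired_replicas=3)
cluster.reconcile()                 # starts app-1, app-2, app-3
cluster.kill("app-2")               # destroy one pod on purpose
replacements = cluster.reconcile()  # the loop notices the gap and fills it
print(replacements, sorted(cluster.pods))
# prints: ['app-4'] ['app-1', 'app-3', 'app-4']
```

The point of the sketch is that nobody "revives" the dead pod. The loop only compares how many pods should exist with how many do, and starts a fresh one to close the gap.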

Coming from a database background, I found this fascinating. And that was just the tip of the Kubernetes iceberg.

And so down the rabbit hole I go...