I recently gave a presentation at Percona Live 2015 in Santa Clara, CA. In this presentation I originally wanted to simply show MySQL replication running on Kubernetes, first asynchronous replication and, more importantly, a Galera cluster, and in doing so demonstrate how useful Kubernetes is.
Why?
The talk was a good chance to introduce the MySQL community (developers, DBAs, sysadmins, and others) to what Kubernetes is and what it means for MySQL.
A bit of learning
At the time I submitted my synopsis, I thought the talk would be straightforward. About two to three months ago, I started working on the setup I would use for the demonstration. My goal was to use a stock CoreOS cluster with the necessary Kubernetes components installed and running as a cluster.
The reality was that there was a bit more to it than that. Isn't that always the way with complex systems? To make a long story short, I tried the Vagrant setup for CoreOS using the cloud-init scripts in the Kubernetes documentation, but I could never get a Kubernetes cluster running completely this way. Hence the blog post I recently published covering my basic setup.
Finally, using the process outlined in that [post], I had a Kubernetes cluster that, for the most part, worked consistently. One gotcha was that upon launching the cluster, the cloud-init scripts had to download the various binaries needed to run Kubernetes and set up networking; a slow network connection caused the launch to fail because of this timing, something I plan to fix and contribute back to the community.
Asynchronous Replication
With a working Kubernetes cluster, I decided it was time to start with regular MySQL asynchronous replication, since it presents a simpler proof of concept. The approach was essentially to modify the standard MySQL Docker container to have a master and a slave variant. At a higher level of abstraction, there are two pods: a master pod and a slave pod. The master pod runs only one container; the slave pod could run one or more containers.
The master container is built using a Dockerfile that specifies an entrypoint shell script. This is the basic pattern the stock MySQL container uses, albeit only to set up essential MySQL settings, particularly the root user and password. The modifications to this entrypoint script set up the replication user privileges (name, password, and host to allow). To do this, when the container is started, environment variables are passed in from the pod configuration file supplying the MySQL root password, replication username, and replication password. The host to allow connections from uses 10.x.x.x, since that is the IP range Kubernetes assigns to pods; this range covers any container in the slave pod(s) that would need to connect as a slave. With these environment variables, the entrypoint script builds up an SQL script containing these privilege modifications, which is run with MySQL in insecure mode (for initialization) using mysqld --initialize-insecure=on. Additionally, a script called “random.sh” runs to set the server-id value in my.cnf.

Once the master pod is running, a master service called mysql_master is started. Thanks to a great piece of Kubernetes functionality, this service is made available as the environment variables MYSQL_MASTER_SERVICE_HOST and MYSQL_MASTER_SERVICE_PORT in any container launched thereafter, including the slave container.
From the entrypoint script:
echo "GRANT REPLICATION SLAVE, REPLICATION CLIENT on *.* TO '$MYSQL_REPLICATION_USER'@'10.100.%' IDENTIFIED BY '$MYSQL_REPLICATION_PASSWORD';" >> "$tempSqlFile"
The slave container is built similarly to the master container, with the Dockerfile specifying an entrypoint script, except that instead of setting up privileges, it sets up replication by running CHANGE MASTER ... in the SQL script that is built up using the aforementioned environment variables, both the ones passed explicitly and MYSQL_MASTER_SERVICE_HOST, provided by Kubernetes, which identifies the master the slave reads from.
From the entrypoint script:
if [ ! -z "$MYSQL_MASTER_SERVICE_HOST" ]; then
  echo "STOP SLAVE;" >> "$tempSqlFile"
  echo "CHANGE MASTER TO master_host='$MYSQL_MASTER_SERVICE_HOST', master_user='$MYSQL_REPLICATION_USER', master_password='$MYSQL_REPLICATION_PASSWORD';" >> "$tempSqlFile"
  echo "START SLAVE;" >> "$tempSqlFile"
fi
This actually is quite straightforward and worked the first time I prototyped it. I first ran it as two separate containers, passing the environment variables explicitly, as in the example below:
docker run -e MYSQL_ROOT_PASSWORD=c-kr1t capttofu/mysql_master_kubernetes
docker run -e MYSQL_MASTER_SERVICE_HOST=x.x.x.x -e MYSQL_ROOT_PASSWORD=c-kr1t capttofu/mysql_slave_kubernetes
Once I verified this, it was a matter of creating master and slave pod files (follow the links to view them).
This proved the basic concept worked: using an entrypoint script to set up the database in advance.
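To give an idea of how those environment variables are wired in, here is a rough sketch of what a master pod definition could look like in the v1beta1 API of the time. It is not the actual pod file from the repository, and the credential values are placeholders:

# Sketch of a v1beta1 master pod; the real files are linked above and may differ.
id: mysql-master
kind: Pod
apiVersion: v1beta1
desiredState:
  manifest:
    version: v1beta1
    id: mysql-master
    containers:
      - name: mysql-master
        image: capttofu/mysql_master_kubernetes
        ports:
          - containerPort: 3306
        env:
          # These values are read by the entrypoint script at startup.
          - name: MYSQL_ROOT_PASSWORD
            value: c-kr1t
          - name: MYSQL_REPLICATION_USER
            value: repl
          - name: MYSQL_REPLICATION_PASSWORD
            value: repl_pass
labels:
  name: mysql-master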
Galera replication
For Galera replication, it seemed it might actually be simpler, since when setting up Galera replication one need not be concerned with the binary log position or how to get a snapshot of the data; Galera handles that via SST (state snapshot transfer) when a node joins. The difficulty was that services can only have a single port and IP in the version of Kubernetes I had to use for my demo, while Galera replication requires four ports: 3306, 4444, 4567, and 4568. (Newer versions of Kubernetes support multiple ports per service.) The way I got around this was to take advantage of the read-only Kubernetes API, whose host is found in the environment variable $KUBERNETES_RO_SERVICE_HOST on every container Kubernetes starts in a pod. The Kubernetes client kubectl is included on the Docker image. The entrypoint script runs kubectl and parses the output for each pod named pxc-node1 through pxc-node3, iterating from 1 to 3 in a loop and building up the string used for wsrep_cluster_address. Of course, if the container is launched with the environment variable WSREP_CLUSTER_ADDRESS set to gcomm://, then that value is used instead; this is the case for the pxc-node1 pod, the “bootstrap” pod.
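That discovery logic works roughly along these lines. This is a simplified sketch, not the exact script from the repository; the config file path and the column parsing are assumptions based on the kubectl get pods output shown later:

# Sketch: build wsrep_cluster_address from the read-only Kubernetes API.
# KUBERNETES_RO_SERVICE_HOST is injected by Kubernetes into every container.
if [ "$WSREP_CLUSTER_ADDRESS" = "gcomm://" ]; then
  # Bootstrap node (pxc-node1): start a brand new cluster.
  cluster_address="gcomm://"
else
  cluster_address="gcomm://"
  for i in 1 2 3; do
    # Look up each pxc-nodeN pod's IP in the 'kubectl get pods' listing
    # (POD is column 1, IP is column 2) and append it to the address list.
    ip=$(kubectl --server="http://${KUBERNETES_RO_SERVICE_HOST}:80" get pods \
         | awk -v pod="pxc-node$i" '$1 == pod { print $2 }')
    [ -n "$ip" ] && cluster_address="${cluster_address}${ip},"
  done
  cluster_address="${cluster_address%,}"
fi
echo "wsrep_cluster_address=${cluster_address}" >> /etc/mysql/conf.d/cluster.cnf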
Galera replication is pretty simple once you know which hosts will be part of the cluster. In this case, the pattern is to launch the pxc-node1 pod as the bootstrap pod, then pxc-node2 and pxc-node3. When this is completed, there should be a cluster.
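For illustration, the main difference between the bootstrap pod and the joiner pods is the WSREP_CLUSTER_ADDRESS environment variable. A bootstrap pod definition might look something like the sketch below; again this is v1beta1 and not the actual pxc-node1.yaml from the repository:

# Sketch of the bootstrap pod in v1beta1 (the real pxc-node1.yaml may differ).
id: pxc-node1
kind: Pod
apiVersion: v1beta1
desiredState:
  manifest:
    version: v1beta1
    id: pxc-node1
    containers:
      - name: pxc-node1
        image: capttofu/percona_xtradb_cluster_5_6
        ports:
          # MySQL plus the Galera ports for SST, group communication, and IST.
          - containerPort: 3306
          - containerPort: 4444
          - containerPort: 4567
          - containerPort: 4568
        env:
          # gcomm:// with no node list tells this node to bootstrap a new
          # cluster; the joiner pods instead discover their peers via kubectl.
          - name: WSREP_CLUSTER_ADDRESS
            value: gcomm://
labels:
  name: pxc-node1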
Actual steps
First, set up a Kubernetes cluster per my blog post.
Pre-reqs
Build the Kubernetes client program, kubectl:
$ git clone https://github.com/GoogleCloudPlatform/kubernetes
$ cd kubernetes
kubernetes $ make
kubernetes $ sudo cp cmd/kubectl /usr/local/bin
Clone the mysql_replication_kubernetes repository:
$ git clone https://github.com/CaptTofu/mysql_replication_kubernetes.git
$ cd mysql_replication_kubernetes
mysql_replication_kubernetes $ git submodule init
mysql_replication_kubernetes $ git submodule update
Create pxc-node1 pod
mysql_replication_kubernetes $ cd galera_sync_replication
galera_sync_replication $ kubectl create -f pxc-node1.yaml
pxc-node1
Verify pod is running
galera_sync_replication $ kubectl get pods
POD IP CONTAINER(S) IMAGE(S) HOST LABELS STATUS CREATED
pxc-node1 10.244.78.2 pxc-node1 capttofu/percona_xtradb_cluster_5_6:latest 172.16.230.131/172.16.230.131 name=pxc-node1 Pending 5 Seconds
In the example above, the status is Pending. Once the status is Running, create the remaining pods.
Create pxc-node2 and pxc-node3 pods
Once pxc-node1 has a status of Running, create pxc-node2 and pxc-node3:
galera_sync_replication $ kubectl create -f pxc-node2.yaml
pxc-node2
galera_sync_replication $ kubectl create -f pxc-node3.yaml
pxc-node3
Create a service for pxc-node1
From before, recall that pxc-node1 is running on the Kubernetes minion/node with an IP address of 172.16.230.131. Edit the configuration file for the pxc-node1 service so that it is possible to connect to the pxc-node1 pod using that address via publicIPs. Edit pxc-node1-service.yaml:
---
id: pxc-node1
kind: Service
apiVersion: v1beta1
port: 3306
containerPort: 3306
selector:
  name: pxc-node1
labels:
  name: pxc-node1
publicIPs:
  - 172.16.230.131
Once this file is ready, create the service:
galera_sync_replication $ kubectl create -f pxc-node1-service.yaml
pxc-node1
Verify everything is running
All three pods should be running (status Running) and there should be a single pxc-node1 service:
galera_sync_replication $ kubectl get pods,services
POD IP CONTAINER(S) IMAGE(S) HOST LABELS STATUS CREATED
pxc-node1 10.244.78.2 pxc-node1 capttofu/percona_xtradb_cluster_5_6:latest 172.16.230.131/172.16.230.131 name=pxc-node1 Running About an hour
pxc-node2 10.244.75.2 pxc-node2 capttofu/percona_xtradb_cluster_5_6:latest 172.16.230.139/172.16.230.139 name=pxc-node2 Running About an hour
pxc-node3 10.244.11.2 pxc-node3 capttofu/percona_xtradb_cluster_5_6:latest 172.16.230.144/172.16.230.144 name=pxc-node3 Running 54 minutes
NAME LABELS SELECTOR IP PORT
kubernetes component=apiserver,provider=kubernetes <none> 10.100.0.2 443
kubernetes-ro component=apiserver,provider=kubernetes <none> 10.100.0.1 80
pxc-node1 name=pxc-node1 name=pxc-node1 10.100.43.123 3306
The output above shows that everything is up and running, so it's time to connect to the database!
Access the pxc-node1 service
Services are created immediately, so the database can be accessed right away:
$ mysql -u root -p -h 172.16.230.131
Enter password:
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MySQL connection id is 6
Server version: 5.6.22-72.0-56 Percona XtraDB Cluster (GPL), Release rel72.0, Revision 978, WSREP version 25.8, wsrep_25.8.r4150
Copyright (c) 2000, 2015, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MySQL [(none)]> show status like 'wsrep_inc%'
-> ;
+--------------------------+----------------------------------------------------+
| Variable_name | Value |
+--------------------------+----------------------------------------------------+
| wsrep_incoming_addresses | 10.244.78.2:3306,10.244.11.2:3306,10.244.75.2:3306 |
+--------------------------+----------------------------------------------------+
1 row in set (0.01 sec)
This output shows that all three Galera nodes are up and running!
Summary
With this proof of concept in place, there is much more to do. Most of all, it would be good to use replication controllers instead of simple pods to create the three Galera single-container pods; that way there is a means of ensuring that all pods continue to run. It would also be good to demonstrate this proof of concept's value by launching an application that uses the Galera cluster. At least at this point, there is something very useful to start with!
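As a starting point for that next step, a replication controller for one of the nodes could look roughly like the sketch below, in v1beta1, with one controller per node and replicas set to 1 since each node keeps its own identity in the cluster. This is an assumption about how it might be structured, not code from the repository:

# Sketch: wrap pxc-node1 in a replication controller so Kubernetes restarts it
# if it dies. pxc-node2 and pxc-node3 would each get a similar controller.
id: pxc-node1
kind: ReplicationController
apiVersion: v1beta1
desiredState:
  replicas: 1
  replicaSelector:
    name: pxc-node1
  podTemplate:
    desiredState:
      manifest:
        version: v1beta1
        id: pxc-node1
        containers:
          - name: pxc-node1
            image: capttofu/percona_xtradb_cluster_5_6
            ports:
              - containerPort: 3306
    labels:
      name: pxc-node1
labels:
  name: pxc-node1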
Special thanks to Kelsey Hightower, Tim Hockin, Daniel Smith, and others in #google-containers for their patience and excellent help!