He is also a former MySQL operations and maintenance DBA. Currently, he works on RDS-related products at a vendor (Party B). When dealing with the DBA operations or development departments of customers (Party A), one can feel the pain and difficulty that DBA operations staff face in the current wave of cloud computing, containerization and microservices.
Generally, DBA operations staff have relatively weak R&D skills and no engineering project experience. Automated operations with auxiliary shell or Python scripts are certainly enough to manage small-scale RDS clusters (10-20). However, as a company grows and its application scenarios multiply, it rarely runs just one kind of database: MySQL, Oracle, SQL Server, PostgreSQL and others may coexist.
The employer then either requires DBAs to master the characteristics and capabilities of every database, or hires dedicated DBAs for each database type. In fact, relational databases share common usage scenarios across products, such as high availability, scaling the RDS cluster, changing compute and storage resources, backup and recovery, and monitoring and alerting.
There are indeed barriers between R&D and operations. We often see that after R&D staff release an application, a hardware and network environment has to be provided for its deployment. Operations staff generally do not care how well or how fast the software runs; they only care about monitoring indicators for physical servers and the network.
In addition to these indicators, DBA operations staff also need to care about how well and how fast the database software itself runs: for example, whether application tables have reasonable indexes, how much storage space the physical machine has, and whether SQL statements violate good practice, such as select * from table1, table2 ..., which can saturate the I/O of the storage medium. When enterprises start building microservices and DevOps, they demand service agility and rapid delivery. Relational databases are a special application scenario. Some large enterprises set up a DBA department responsible for operating, maintaining and developing database instances. As products evolve quickly to keep up with the market, the speed and capacity of database instance delivery gradually becomes a bottleneck.
Containerization is the only way
So which scenarios suit running a database in a container? In customer scenarios, it is typically when an enterprise starts building its own DevOps practice and needs to deliver RDS services quickly, or when each DBA is responsible for more than 50 database instances (a ratio above 1:50).
On the value of container databases
A so-called container is just an ordinary process. What makes this process special is that: 1) it may live in different namespaces (NS); a container process can be placed in different namespaces with the clone / unshare / setns system calls. 2) its use of resources (CPU, memory, disk I/O, etc.) may be limited by cgroup resource control.
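The two points above can be observed from the Linux /proc filesystem. The following is a minimal sketch that assumes a Linux host; the function names list_namespaces and parse_cgroup_file are made up for this illustration and are not part of any tool mentioned in the article.

```python
import os


def list_namespaces(pid="self"):
    """Return {namespace_name: identifier} for a process.

    Reads the symlinks under /proc/<pid>/ns, so it only works on Linux.
    """
    ns_dir = f"/proc/{pid}/ns"
    return {name: os.readlink(os.path.join(ns_dir, name))
            for name in sorted(os.listdir(ns_dir))}


def parse_cgroup_file(text):
    """Parse the contents of /proc/<pid>/cgroup.

    Each line has the form 'hierarchy-ID:controller-list:cgroup-path'.
    Returns a list of (hierarchy_id, controllers, path) tuples.
    """
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        hierarchy_id, controllers, path = line.split(":", 2)
        entries.append((int(hierarchy_id),
                        controllers.split(",") if controllers else [],
                        path))
    return entries


if __name__ == "__main__":
    # Two processes in the same namespace report the same identifier;
    # a containerized process usually reports different ones than a host shell.
    if os.path.isdir("/proc/self/ns"):
        for name, ident in list_namespaces().items():
            print(name, ident)
        with open("/proc/self/cgroup") as fh:
            print(parse_cgroup_file(fh.read()))
```

Running the script once on the host and once inside a container, then comparing the namespace identifiers and cgroup paths, shows that the container is the same kind of process with different namespace and cgroup membership.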
Thanks to the container graphdriver, when a DBA runs multiple instances on a single machine, database instances of the same version share the files they need to run through a common base image, which greatly saves storage space on the physical machine.
DBAs care most about the basics: data integrity and security. As mentioned above, a container is just an ordinary process that uses features of the Linux kernel to present itself as a virtual OS environment. The graphdriver solves image file sharing through the copy-on-write (CoW) mechanism.
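The sharing idea can be pictured with a toy model. The sketch below is not how any real graphdriver is implemented; it only imitates the layering semantics with Python's ChainMap, where each container gets a private writable layer on top of one shared read-only base. The file paths and contents are placeholders.

```python
from collections import ChainMap
from types import MappingProxyType

# Read-only base image layer shared by every container (file path -> content).
# Paths and contents are placeholders for illustration only.
BASE_IMAGE = MappingProxyType({
    "/usr/sbin/mysqld": "mysqld binary",
    "/etc/my.cnf": "default config",
})


def new_container_fs(base=BASE_IMAGE):
    """Return a layered view: a private writable layer on top of the shared base."""
    return ChainMap({}, base)


c1 = new_container_fs()
c2 = new_container_fs()

# A write in c1 lands in c1's own top layer; the base and c2 are untouched.
c1["/etc/my.cnf"] = "tuned config"

print(c1["/etc/my.cnf"])
print(c2["/etc/my.cnf"])
# Unmodified files are served from the single shared base object.
print(c1["/usr/sbin/mysqld"] is c2["/usr/sbin/mysqld"])
```

Only the modified file occupies extra space in the first container's layer, which is the storage saving described above. It also shows why database data should not live in this layer: it belongs to the container, not to the host.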
To run a relational database container, the data must be persisted through another interface rather than through the graphdriver. Containers themselves provide a way to persist data. For example, run a containerized MySQL instance with the host directory /opt mounted at the container's /var/lib/mysql directory, so that the data generated by the MySQL instance inside the container is written to the /opt directory of the host.
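A mount like the one just described could be expressed as follows. This is a sketch under assumptions: the image tag mysql:8.0, the container name mysql-demo and the password value are placeholders not taken from the article, and the helper mysql_run_command is hypothetical. It only builds and prints the docker run command; it does not start anything.

```python
import shlex


def mysql_run_command(host_dir="/opt", image="mysql:8.0", name="mysql-demo",
                      root_password="change-me"):
    """Build the argv for `docker run` with a host directory as the MySQL data dir."""
    return [
        "docker", "run", "-d",
        "--name", name,
        # host_dir on the host is mounted at MySQL's data directory in the container
        "-v", f"{host_dir}:/var/lib/mysql",
        # the official mysql image needs a root password setting on first start
        "-e", f"MYSQL_ROOT_PASSWORD={root_password}",
        image,
    ]


if __name__ == "__main__":
    # Print the command instead of executing it; pass the list to
    # subprocess.run(...) to actually launch the container.
    print(shlex.join(mysql_run_command()))
```

Because the data directory lives on the host, removing and recreating the container leaves the database files in /opt intact.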
We can check the runtime spec of the database container with the docker inspect <container ID> command and pick out the key information.
We can see that Docker binds the host directory into the container as a volume-type mount. Data safety for the container database can then be ensured by OS file systems such as ext4 and XFS, or by distributed storage volumes.
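To pull the mount information out of docker inspect programmatically, something like the sketch below may help. The helper extract_mounts is hypothetical, and SAMPLE is a hand-written fragment for illustration, not real output from the author's environment. The Type field can be bind or volume depending on how the mount was created.

```python
import json


def extract_mounts(inspect_output):
    """Return (type, source, destination) for every mount in `docker inspect` JSON."""
    containers = json.loads(inspect_output)  # docker inspect prints a JSON array
    mounts = []
    for container in containers:
        for mount in container.get("Mounts", []):
            mounts.append((mount.get("Type"),
                           mount.get("Source"),
                           mount.get("Destination")))
    return mounts


# Hand-written sample, NOT real output: only the fields used above are shown.
SAMPLE = """
[{"Id": "example", "Mounts": [
  {"Type": "bind", "Source": "/opt", "Destination": "/var/lib/mysql", "RW": true}
]}]
"""

if __name__ == "__main__":
    print(extract_mounts(SAMPLE))
```

In practice the input would come from capturing the output of docker inspect for the container in question, for example with subprocess.run(..., capture_output=True, text=True).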
Kubernetes is the best practice for container database clusters
It is no exaggeration to say that Kubernetes has unified the container world. Cloud native, microservices, PaaS, IoT, DevOps: almost every architecture built on a container technology stack takes k8s as its first choice. The product I am responsible for also uses k8s as its container database orchestration framework to provide multiple types of RDS services. At present, the largest single cluster holds 1000+ RDS instances. The Kubernetes architecture allows enterprises to deploy containerized databases by extending it with their own custom resource types. Of course, they also need to solve the data persistence, scheduling strategy, network solution and service exposure problems of container databases according to their own business scenarios.
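As a rough idea of what extending Kubernetes with a custom resource type involves, the sketch below builds a CustomResourceDefinition manifest. Everything specific in it is an assumption for illustration: the group rds.example.com, the kind MySQLCluster, the version v1alpha1 and the spec fields replicas, version and storageSize are invented here and do not describe the author's product.

```python
import json


def build_crd(group="rds.example.com", kind="MySQLCluster",
              plural="mysqlclusters", singular="mysqlcluster"):
    """Build a CustomResourceDefinition manifest (apiextensions.k8s.io/v1) as a dict."""
    return {
        "apiVersion": "apiextensions.k8s.io/v1",
        "kind": "CustomResourceDefinition",
        # The CRD name must be <plural>.<group>.
        "metadata": {"name": f"{plural}.{group}"},
        "spec": {
            "group": group,
            "scope": "Namespaced",
            "names": {"kind": kind, "plural": plural, "singular": singular},
            "versions": [{
                "name": "v1alpha1",   # hypothetical version name
                "served": True,
                "storage": True,
                "schema": {"openAPIV3Schema": {
                    "type": "object",
                    "properties": {"spec": {
                        "type": "object",
                        # Hypothetical fields a database resource might carry.
                        "properties": {
                            "replicas": {"type": "integer", "minimum": 1},
                            "version": {"type": "string"},
                            "storageSize": {"type": "string"},
                        },
                    }},
                }},
            }],
        },
    }


if __name__ == "__main__":
    # kubectl accepts JSON manifests as well as YAML.
    print(json.dumps(build_crd(), indent=2))
```

The definition only registers the new resource type. A custom controller would still be needed to turn each such resource into running database containers, storage and services, which is where the persistence, scheduling and network choices mentioned above come in.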
Source: editor in charge of mass news: Chen Tiqiang_ NB6485