MySQL 8.0.12 innodb cluster 高可用集群部署运维管理手册之二 集群部署作者 方连超Innodb cluster 原理介绍Innodb cluster 利用组复制的 pxos
MySQL 8.0.12 innodb cluster 高可用集群部署运维管理手册之二 集群部署
作者 方连超
Innodb cluster 原理介绍
Innodb cluster 利用组复制的 pxos 协议,保障数据一致性,组复制支持单主模式和多主模式,
本文推荐的高可用架构为单主模式下,基于mysqlrouter的 innodbcluster:
上图是自己画的图,尽管丑陋了写,表达的别官方更清晰,这里注意利用JDBC驱动 进行多路由负载,
基础环境
系统:centos 7.5
Mysql:8.0.11 二进制包
Mysqlshell: 8.0.11 rpm 包
Mysql router: 8.0.11 二进制包
架构:
192.168.181.101 myrouter1 Keepalived、MySQL-shell、MySQL-Router、MySQL-client
192.168.181.102 myrouter2 Keepalived、MySQL-shell、MySQL-Router、MySQL-client
192.168.181.103 mysql3 MySQL服务端、MySQL-shell
192.168.181.104 mysql4 MySQL服务端、MySQL-shell
192.168.181.105 mysql5 MySQL服务端、MySQL-shell
准备工作:
修改/etc/hosts (5个节点都要做), 要和主机名一致
192.168.181.101 mysql1
192.168.181.102 mysql2
192.168.181.103 mysql3
192.168.181.104 mysql4
192.168.181.105 ysql5
软件包依赖
yum -y install gcc glibc libaio libstdc++ libstdc libncurses ld-linux
防火墙和selinux关闭:
(1)、关闭SElinux
setenforce 0
修改/etc/selinux/config
vim /etc/selinux/config SELINUX=disabled
(2)、关闭防火墙
systemctl stop firewalld systemctl disable firewalld
sysctl.conf 优化:
cat>>/etc/sysctl.conf <<EOFfs.aio-max-nr = 1048576fs.file-max = 681574400kernel.shmmax = 137438953472 kernel.shmmni = 4096kernel.sem = 250 32000 100 200net.ipv4.ip_local_port_range = 9000 65000net.core.rmem_default = 262144net.core.rmem_max = 4194304net.core.wmem_default = 262144net.core.wmem_max = 1048586EOF
limit 优化:
cat>>/etc/security/limits.conf <<EOFmysql soft nproc 65536mysql hard nproc 65536mysql soft nofile 65536mysql hard nofile 65536EOF
cat>>/etc/pam.d/login <<EOFsession required /lib/security/pam_limits.sosession required pam_limits.soEOF
cat>>/etc/profile<<EOFif [ $USER = "mysql" ]; thenulimit -u 16384 -n 65536fiEOF
source /etc/profile
安装mysql软件(sql 节点)
环境变量
useradd mysqlcat >>/home/mysql/.bash_profile<<EOFexport PATH=/usr/local/mysql/bin:/usr/local/mysql/lib:$PATHEOF
安装mysql 软件(sql 节点):
cd /datatar -xzvf mysql-8.0.11-el7-x86_64.tar.gz -C /usr/localmv /usr/local/mysql-8.0.11-el7-x86_64 /usr/local/mysqlchown -R mysql.mysql /usr/local/mysql
初始化mysql(一个主节点)
mkdir -p /data/mysql_3306_test/{data,log,binlog,conf,tmp}chown -R mysql.mysql /data/mysql_3306_test
参数文件(自己优化)
su – mysql
Vim /data/mysql_3306_test/conf/my.cnf
[mysqld]lower_case_table_names = 1user = mysqlserver_id = 1104 #各个实例全局唯一port = 3310enforce_gtid_consistency = ONgtid_mode = ONbinlog_checksum = nonedefault_authentication_plugin = mysql_native_password #为了兼容以前的驱动 datadir = /data/mysql_3310_test/datapid-file = /data/mysql_3310_test/tmp/mysqld.pidsocket = /data/mysql_3310_test/tmp/mysqld.socktmpdir = /data/mysql_3310_test/tmp/skip-name-resolve = ONtable_open_cache = 2000#################innodb########################innodb_data_home_dir = /data/mysql_3310_test/datainnodb_data_file_path = ibdata1:512M;ibdata2:512M:autoextendinnodb_buffer_pool_size = 2000Minnodb_flush_log_at_trx_commit = 1innodb_io_capacity = 600innodb_lock_wait_timeout = 120innodb_log_buffer_size = 8Minnodb_log_file_size = 200Minnodb_log_files_in_group = 3innodb_max_dirty_pages_pct = 85innodb_read_io_threads = 8innodb_write_io_threads = 8innodb_thread_concurrency = 32innodb_file_per_tableinnodb_rollback_on_timeoutinnodb_undo_directory = /data/mysql_3310_test/datainnodb_log_group_home_dir = /data/mysql_3310_test/data###################session###########################join_buffer_size = 8Mkey_buffer_size = 256Mbulk_insert_buffer_size = 8Mmax_heap_table_size = 96Mtmp_table_size = 96Mread_buffer_size = 8Msort_buffer_size = 2Mmax_allowed_packet = 64Mread_rnd_buffer_size = 32M############log set###################log-error = /data/mysql_3310_test/log/mysqld.errlog-bin = /data/mysql_3310_test/binlog/binloglog_bin_index = /data/mysql_3310_test/binlog/binlog.indexmax_binlog_size = 500Mslow_query_log_file = /data/mysql_3310_test/log/slow.logslow_query_log = 1long_query_time = 10log_queries_not_using_indexes = ONlog_throttle_queries_not_using_indexes = 10log_slow_admin_statements = ONlog_output = FILE,TABLEmaster_info_file = /data/mysql_3310_test/binlog/master.info##########################mgr set##############################mysqlx_port=33101 #要设置,因为组复制协议走的就是这个端口,多实例下要调整mysqlx_socket=/data/mysql_3310_test/tmp/mysqlx.sockloose-group_replication_group_name="aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"loose-group_replication_start_on_boot=offloose-group_replication_local_address= "mysql4:33101"loose-group_replication_group_seeds= "mysql3:33101,mysql4:33101,mysql5:33101"loose-group_replication_bootstrap_group= offloose-group_replication_ip_whitelist="192.168.181.0/24"loose-group_replication_single_primary_mode = on #设置为单主模式,不要多主,原因看后面loose-group_replication_auto_increment_increment=1 #这个默认是7,我们单主模式设置为1
单主模式和多主模式注意
多主模式下:
loose-group_replication_single_primary_mode = off
操作流程:业务端连接IP处理 -> GROUP内成员逐个依次主动退出GROUP (全部退出才行)-> 关闭 group_replication_single_primary_mode参数-> 逐个启动GROUP内的SERVER
Set global group_replication_single_primary_mode=off
初始化部署:
mysqld --defaults-file=/data/mysql_3306_test/conf/my.cnf --initialize-insecure --user=mysqlmysqld --defaults-file=/data/mysql_3310_test/conf/my.cnf --initialize-insecure --user=mysql
设置密码先要启动:
mysqladmin --defaults-file=my.cnf password 跟密码进入数据库mysql -uroot -p -S /data/mysql_3306_test/tmp/mysqld.sock
创建本地用户
create user root@'127.0.0.1' identified by 'password'; GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, DROP, RELOAD, SHUTDOWN, PROCESS, FILE, REFERENCES, INDEX, ALTER, SHOW DATABASES, SUPER, CREATE TEMPORARY TABLES, LOCK TABLES, EXECUTE, REPLICATION SLAVE, REPLICATION CLIENT, CREATE VIEW, SHOW VIEW, CREATE ROUTINE, ALTER ROUTINE, CREATE USER, EVENT, TRIGGER, CREATE TABLESPACE, CREATE ROLE, DROP ROLE ON *.* TO `root`@`127.0.0.1` WITH GRANT OPTION;GRANT PROXY ON ''@'' TO 'root'@'127.0.0.1' WITH GRANT OPTION;GRANT BACKUP_ADMIN,BINLOG_ADMIN,CONNECTION_ADMIN,ENCRYPTION_KEY_ADMIN,GROUP_REPLICATION_ADMIN,PERSIST_RO_VARIABLES_ADMIN,REPLICATION_SLAVE_ADMIN,RESOURCE_GROUP_ADMIN,RESOURCE_GROUP_USER,ROLE_ADMIN,SET_USER_ID,SYSTEM_VARIABLES_ADMIN,XA_RECOVER_ADMIN ON *.* TO `root`@`127.0.0.1` WITH GRANT OPTION;create user root@'192.168.181.%' identified by 'password';GRANT all on *.* TO root@'192.168.181.%' WITH GRANT OPTION;GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, DROP, RELOAD, SHUTDOWN, PROCESS, FILE, REFERENCES, INDEX, ALTER, SHOW DATABASES, SUPER, CREATE TEMPORARY TABLES, LOCK TABLES, EXECUTE, REPLICATION SLAVE, REPLICATION CLIENT, CREATE VIEW, SHOW VIEW, CREATE ROUTINE, ALTER ROUTINE, CREATE USER, EVENT, TRIGGER, CREATE TABLESPACE, CREATE ROLE, DROP ROLE ON *.* TO `root`@`192.168.181.%` WITH GRANT OPTION;GRANT BACKUP_ADMIN,BINLOG_ADMIN,CONNECTION_ADMIN,ENCRYPTION_KEY_ADMIN,GROUP_REPLICATION_ADMIN,PERSIST_RO_VARIABLES_ADMIN,REPLICATION_SLAVE_ADMIN,RESOURCE_GROUP_ADMIN,RESOURCE_GROUP_USER,ROLE_ADMIN,SET_USER_ID,SYSTEM_VARIABLES_ADMIN,XA_RECOVER_ADMIN ON *.* TO root@'192.168.181.%' WITH GRANT OPTION;
将实例复制拷贝到其他节点:
关闭主库节点
打包
传递
解压
安装mysql shell route
rpm -ivh mysql-shell-8.0.11-1.el7.x86_64.rpm
rpm -ivh mysql-router-8.0.11-1.el7.x86_64
配置集群
使用mysql命令
(1)、检查并配置实例(每个mysql节点)
mysqlsh --log-level=DEBUG3 日志位置 ~/.mysqlsh/mysqlsh.log
检查实例
dba.configureLocalInstance('[email protected]:3306');dba.chekInstanceConfiguration('[email protected]:3306')
(2)、创建cluster集群(确认每个SQL节点的实例都完成上述实例配置并且验证成功)
在任意一台mysql实例节点执行以下命令:
mysqlshshell.connect('root@mysql1:3306')var cluster = dba.createCluster('qwCluster');
如果创建成功输出的信息中会有类似“Cluster successfully created.”的语句
将另外两个节点加入到Cluster集群中
cluster.addInstance('root@mysql4:3306');cluster.addInstance('root@mysql5:3306');
集群验证
mysql.shell 查看
cluster.status()
navacat 连接mysql 查看
select @@hostname
问题
多实例环境不要用 3306端口
多实例环境下,某个实例采用了默认的3306端口,会导致经常性的误操作。切记
本环境建议的端口
3300 3310 3320 … 3390
一台主机最多部署10个实例
如何重置实验环境
- 主节点:
dba.dropMetadataSchema() mysqlshell 清空集群mysql> stop group_replication;mysql> reset master; (清空日志,确保和从库的表没有冲突奥,)mysql> reset slave
- 其他节点(主要清理和主库的主从信息, 确保主库和从库的表没有冲突奥)
mysql> stop group_replication;mysql> reset master;mysql> reset slave
主机名和 /etc/hosts中名字不一致
[Repl] Slave I/O for channel 'group_replication_recovery': error connecting to master 'mysql_innodb_cluster_r0430970923@mysql3:3306' - retry-time: 60 retries: 1, Error_code: MY-002005
[ERROR] [MY-011582] [Repl] Plugin group_replication reported: 'There was an error when connecting to the donor server. Please check that group_replication_recovery channel credentials and all MEMBER_HOST column values of performance_schema.replication_group_members table are correct and DNS resolvable.'
[ERROR] [MY-011583] [Repl] Plugin group_replication reported: 'For details please check performance_schema.replication_connection_status table and error log messages of Slave I/O for channel group_replication_recovery.'
这个问题郁闷了我很久,
Hostname 是
mysql4
mysql5
mysql6
而我在 /etc/hosts中是
192.168.181.103 mysql-1
192.168.181.103 mysql-2
192.168.181.103 mysql-3
主库的日志应用卡在某个位置无法应用到从库,
[ERROR] [MY-010586] [Repl] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'binlog.000007' position 151
[ERROR] [MY-010584] [Repl] Slave SQL for channel 'group_replication_applier': Error executing row event: 'Unknown database 'mysql_innodb_cluster_metadata'', Error_code: MY-001049
重建master:MySQL>reset master
组复制的限制
- 事物锁缺失问题:
组复制建议,事物隔离级别,read commit - 序列化隔离级别:多主模式不支持
- 并发DDL和DML: 多主模式下,不支持 一边对一个表进行DDL,另一边进行更新,这样对于DDL在其他实例上操作有未检出的风险
- 外键级联约束:多主模式下,多级外键依赖对引起多级操作, 因此可能导致未知冲突,建议打开 group_replication_enforce_update_everywhere_checks=ON
- 大事物,超过5秒未提交,会导致组通信失败,
- 多主模式下:select * for update 会导致 死锁。因为这个锁并非全组共享。
- 部分复制不支持:组复制下,设置部分复制,会过滤事物,导致组事物不一致。
- Mysql 8.0.11 group_replication_enforce_update_everywhere_checks=ON 多主模式下不支持。
- 停止复制的情况下,某个节点执行命令后再启动,会因为本地有私有事物,无法加入集群。需要全局 reset master 重新开始集群复制。