Installing a Kubernetes Cluster with kubeadm

Cluster environment

| Hostname | IP            | Role   |
| -------- | ------------- | ------ |
| node1    | 192.168.66.11 | master |
| node2    | 192.168.66.12 | node   |
| node3    | 192.168.66.13 | node   |
Upgrade the system kernel to 4.4

The stock 3.10.x kernel shipped with CentOS 7.x has known bugs that make Docker and Kubernetes unstable, so upgrade the kernel first.
```bash
rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-3.el7.elrepo.noarch.rpm
yum --enablerepo=elrepo-kernel install -y kernel-lt kernel-lt-devel kernel-lt-headers
# After installation, check that the menuentry for the new kernel in /boot/grub2/grub.cfg
# contains an initrd16 entry; if it does not, install the kernel again.
grep "initrd16" /boot/grub2/grub.cfg
kernel_version=$(grep "^menuentry" /boot/grub2/grub.cfg | cut -d "'" -f2 | grep "4.4")
# Boot from the new kernel by default
grub2-set-default "$kernel_version"
# Confirm the change
grub2-editenv list
```
Disable NUMA
```bash
cp /etc/default/grub{,.bak}
vim /etc/default/grub   # add `numa=off` to the GRUB_CMDLINE_LINUX line, as shown in the diff below
diff /etc/default/grub.bak /etc/default/grub
6c6
< GRUB_CMDLINE_LINUX="crashkernel=auto rd.lvm.lv=centos/root rhgb quiet"
---
> GRUB_CMDLINE_LINUX="crashkernel=auto rd.lvm.lv=centos/root rhgb quiet numa=off"
cp /boot/grub2/grub.cfg{,.bak}
grub2-mkconfig -o /boot/grub2/grub.cfg
```
Reboot the server and confirm that the kernel upgrade took effect.
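A minimal post-reboot check, assuming the 4.4 kernel and the `numa=off` parameter were configured exactly as above:

```bash
reboot
# After the machine comes back up:
uname -r                             # should report a 4.4.x kernel
grep -o "numa=off" /proc/cmdline     # confirms the NUMA kernel parameter is active
```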
Initial configuration

Set the hostnames
```bash
hostnamectl set-hostname node1   # on node1
hostnamectl set-hostname node2   # on node2
hostnamectl set-hostname node3   # on node3
# On all three nodes:
cat >> /etc/hosts <<EOF
192.168.66.11 node1
192.168.66.12 node2
192.168.66.13 node3
EOF
```
Stop unneeded system services
```bash
systemctl stop postfix && systemctl disable postfix
```
Install dependencies
```bash
yum install -y conntrack ntpdate ntp ipvsadm ipset jq iptables curl sysstat libseccomp wget vim net-tools git lrzsz
```
Adjust the firewall
```bash
systemctl stop firewalld && systemctl disable firewalld
yum -y install iptables-services && systemctl start iptables && systemctl enable iptables
iptables -F && iptables -X && iptables -F -t nat && iptables -X -t nat && service iptables save
```
Disable swap and SELinux
```bash
swapoff -a && sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
setenforce 0 && sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
```
Tune kernel parameters
```bash
cat > /etc/sysctl.d/kubernetes.conf <<EOF
# Required: let bridged traffic pass through iptables/ip6tables
net.bridge.bridge-nf-call-iptables=1
net.bridge.bridge-nf-call-ip6tables=1
# Required: enable IP forwarding
net.ipv4.ip_forward=1
# Required: disable IPv6
net.ipv6.conf.all.disable_ipv6=1
net.ipv4.tcp_tw_recycle=0
# Avoid swap; only use it when the system is about to OOM
vm.swappiness=0
# Do not check whether enough physical memory is available
vm.overcommit_memory=1
# Do not panic on OOM; let the OOM killer handle it
vm.panic_on_oom=0
fs.inotify.max_user_instances=8192
fs.inotify.max_user_watches=1048576
fs.file-max=52706963
fs.nr_open=52706963
net.netfilter.nf_conntrack_max=2310720
EOF
sysctl -p /etc/sysctl.d/kubernetes.conf
```
Set the system time zone
```bash
# Set the system time zone to Asia/Shanghai
timedatectl status                        # show the current time settings
timedatectl set-timezone Asia/Shanghai    # this updates /etc/localtime -> ../usr/share/zoneinfo/Asia/Shanghai
timedatectl set-local-rtc 0               # keep the hardware clock in UTC
# Restart services that depend on the system time
systemctl restart rsyslog
systemctl restart crond
```
Configure rsyslogd and systemd-journald
```bash
mkdir /var/log/journal          # directory for persistent logs
mkdir /etc/systemd/journald.conf.d
cat > /etc/systemd/journald.conf.d/99-prophet.conf <<EOF
[Journal]
# Persist logs to disk
Storage=persistent
# Compress historical logs
Compress=yes
SyncIntervalSec=5m
RateLimitInterval=30s
RateLimitBurst=1000
# Cap total disk usage at 10G
SystemMaxUse=10G
# Cap a single log file at 200M
SystemMaxFileSize=200M
# Keep logs for 2 weeks
MaxRetentionSec=2week
# Do not forward logs to syslog
ForwardToSyslog=no
EOF
systemctl restart systemd-journald
```
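A quick way to confirm that journald restarted cleanly and is writing to disk, using only standard systemd tooling:

```bash
systemctl status systemd-journald --no-pager   # service should be active (running)
journalctl --disk-usage                        # shows how much space the persistent journal uses
```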
Install Docker

```bash
# Step 1: install required system tools
yum install -y yum-utils device-mapper-persistent-data lvm2
# Step 2: add the Docker CE repository
yum-config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
# Step 3: refresh the cache and install Docker CE
yum makecache fast
yum -y install docker-ce
# Configure the Docker daemon
mkdir /etc/docker
cat > /etc/docker/daemon.json <<EOF
{
  "registry-mirrors": ["https://it8jkcyv.mirror.aliyuncs.com"],
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  }
}
EOF
mkdir -p /etc/systemd/system/docker.service.d
# Restart the Docker service
systemctl daemon-reload && systemctl restart docker && systemctl enable docker
```
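To confirm that Docker actually picked up the systemd cgroup driver and the registry mirror from daemon.json, a quick check with the plain docker CLI:

```bash
docker info | grep -i "cgroup driver"       # should print: Cgroup Driver: systemd
docker info | grep -A1 "Registry Mirrors"   # should list the Aliyun mirror configured above
```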
Install the Kubernetes components

A prerequisite for kube-proxy to use IPVS mode is that LVS is installed and its kernel modules are loaded (ipvsadm was already installed in the dependencies step).
```bash
# ipvsadm was installed with the dependencies above.
modprobe br_netfilter
cat > /etc/sysconfig/modules/ipvs.modules <<EOF
#!/bin/bash
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack_ipv4
EOF
chmod 755 /etc/sysconfig/modules/ipvs.modules && bash /etc/sysconfig/modules/ipvs.modules && lsmod | grep -e ip_vs -e nf_conntrack_ipv4
```
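Loading the modules only satisfies the prerequisite; kube-proxy still defaults to iptables mode. A minimal sketch of switching it to IPVS once the cluster is up, assuming the default `kube-proxy` ConfigMap and the standard `k8s-app=kube-proxy` pod label created by kubeadm:

```bash
# Run on the master after "kubeadm init" has completed.
kubectl -n kube-system edit configmap kube-proxy   # in the editor, set: mode: "ipvs"
# Recreate the kube-proxy pods so the DaemonSet picks up the new mode
kubectl -n kube-system delete pod -l k8s-app=kube-proxy
# IPVS rules should now be programmed
ipvsadm -Ln
```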
Install kubeadm (master and worker nodes)

```bash
# Add the Kubernetes yum repository
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
# Install kubeadm, kubectl and kubelet
yum -y install kubeadm kubectl kubelet
# Start kubelet on boot
systemctl enable kubelet.service
```
Initialize the control plane with kubeadm init.
```bash
kubeadm init --apiserver-advertise-address=192.168.66.11 --kubernetes-version=v1.18.0 --pod-network-cidr=10.244.0.0/16 --service-cidr=10.96.0.0/12 | tee kubeadm-init.log

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.66.11:6443 --token tdew8c.7b303fgva3zy3dub \
    --discovery-token-ca-cert-hash sha256:ad7e4f4153a1647392e37c8d1daa4ecc5e2619c06ca166310303313fbe8efd81
```
Run the commands from the output above:
```bash
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
```
```bash
kubectl get nodes
NAME    STATUS     ROLES    AGE    VERSION
node1   NotReady   master   4m5s   v1.18.5
# NotReady: the pod network is not installed yet; deploy flannel next.
```
Deploy the pod network

```bash
wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
kubectl apply -f kube-flannel.yml
```
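To watch the flannel DaemonSet come up before re-checking node status — a small sketch assuming the `app=flannel` label used by the stock kube-flannel.yml (adjust if your manifest differs):

```bash
# Wait until every kube-flannel-ds pod is Running; the nodes should then flip to Ready
kubectl -n kube-system get pods -l app=flannel -w
```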
Join node2 and node3 to the cluster
```bash
[root@node2 ~]# kubeadm join 192.168.66.11:6443 --token tdew8c.7b303fgva3zy3dub \
    --discovery-token-ca-cert-hash sha256:ad7e4f4153a1647392e37c8d1daa4ecc5e2619c06ca166310303313fbe8efd81
```
```bash
[root@node3 ~]# kubeadm join 192.168.66.11:6443 --token tdew8c.7b303fgva3zy3dub \
    --discovery-token-ca-cert-hash sha256:ad7e4f4153a1647392e37c8d1daa4ecc5e2619c06ca166310303313fbe8efd81
```
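The bootstrap token in the init output expires (24 hours by default). If a node needs to join later, a fresh join command can be generated on the master with standard kubeadm tooling:

```bash
# Run on node1 (the master); prints a complete "kubeadm join ..." command with a new token
kubeadm token create --print-join-command
```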
Confirm the nodes are Ready
```bash
kubectl get nodes
NAME    STATUS   ROLES    AGE   VERSION
node1   Ready    master   22h   v1.18.5
node2   Ready    <none>   21h   v1.18.5
node3   Ready    <none>   21h   v1.18.5
```
Confirm all pods are Running
```bash
kubectl get pods --all-namespaces
NAMESPACE     NAME                            READY   STATUS    RESTARTS   AGE
kube-system   coredns-66bff467f8-6wtcf        1/1     Running   3          21h
kube-system   coredns-66bff467f8-zqt6t        1/1     Running   3          21h
kube-system   etcd-node1                      1/1     Running   3          21h
kube-system   kube-apiserver-node1            1/1     Running   3          21h
kube-system   kube-controller-manager-node1   1/1     Running   3          21h
kube-system   kube-flannel-ds-amd64-88z8h     1/1     Running   2          20h
kube-system   kube-flannel-ds-amd64-9jmjr     1/1     Running   2          20h
kube-system   kube-flannel-ds-amd64-rk9kj     1/1     Running   4          21h
kube-system   kube-proxy-77z8g                1/1     Running   2          20h
kube-system   kube-proxy-h76lv                1/1     Running   3          21h
kube-system   kube-proxy-pvpdr                1/1     Running   2          20h
kube-system   kube-scheduler-node1            1/1     Running   3          21h
```
Troubleshooting

Failure 1: after a server reboot, none of the containers would start.
Cause: the swap change made during the ansible-driven initialization did not take effect, so swap was enabled again after the reboot.
Fix: disable swap and reboot; the problem was resolved.
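A quick sketch for verifying that swap is really gone both now and across reboots, using only standard util-linux/coreutils commands:

```bash
swapoff -a                              # turn swap off immediately
free -h | grep -i swap                  # the Swap line should show 0B total / 0B used
grep -v '^#' /etc/fstab | grep swap     # should print nothing if the fstab entry was commented out
```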
Failure 2: node2 could not reach the API server.
Cause: after installing iptables-services, the preloaded default rules were never flushed.
Fix:

```bash
iptables -F && iptables -X && iptables -F -t nat && iptables -X -t nat && service iptables save
```
References

https://blog.csdn.net/qq_40806970/article/details/97645628