Big Little Ant

不被嘲笑的梦想,是不值得去实现的

实战

wget https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0/aio/deploy/recommended.yaml
mv recommended.yaml dashboard-deploy.yaml
kubectl apply -f dashboard-deploy.yaml

查看运行状态

[root@node1 ~]# kubectl get pods -n kubernetes-dashboard
NAME READY STATUS RESTARTS AGE
dashboard-metrics-scraper-c79c65bb7-x9pgz 1/1 Running 0 88s
kubernetes-dashboard-56484d4c5-lflxh 1/1 Running 0 88s

[root@node2 ~]# kubectl get svc -n kubernetes-dashboard
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
dashboard-metrics-scraper ClusterIP 10.254.57.95 <none> 8000/TCP 3m26s
kubernetes-dashboard ClusterIP 10.254.14.214 <none> 443/TCP 3m26s

访问 dashboard

  1. 通过nginx Ingress 访问。
  2. 通过 kubectl port-forward 访问。

通过 nginx Ingress 访问 dashboard

cat > dashboard-ingress.yaml <<EOF
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: dashboard-deploy
  namespace: kubernetes-dashboard
  annotations:
    nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
spec:
  rules:
  - host: dashboard.biglittleant.cn
    http:
      paths:
      - path: /
        backend:
          serviceName: kubernetes-dashboard
          servicePort: 443
EOF
kubectl apply -f dashboard-ingress.yaml

[root@node1 ~]# kubectl get ingress -n kubernetes-dashboard
NAME HOSTS ADDRESS PORTS AGE
dashboard-deploy dashboard.biglittleant.cn 192.168.66.11,192.168.66.12 80 4m33s

浏览器访问 URL:https://dashboard.biglittleant.cn

需要自己配置 hosts 解析,或者用下面的命令直接验证结果:curl -I https://dashboard.biglittleant.cn -x 192.168.66.11:80
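如果想用浏览器直接访问,可以先在本机的 hosts 文件里加一条解析记录(下面的 IP 以本文的 ingress 节点 192.168.66.11 为例,属于示例写法,按实际环境调整):

# 把 dashboard 域名指向任意一台运行 ingress-nginx 的节点
echo "192.168.66.11 dashboard.biglittleant.cn" >> /etc/hosts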

通过 port forward 访问 dashboard

启动端口转发:

[root@node2 ~]# kubectl port-forward -n kubernetes-dashboard  svc/kubernetes-dashboard 4443:443 --address 0.0.0.0
Forwarding from 0.0.0.0:4443 -> 8443
Handling connection for 4443

浏览器访问 URL:https://192.168.66.12:4443/

如果 Chrome 浏览器提示证书不受信任而无法访问,可以在该页面上直接用键盘敲入这 11 个字符:thisisunsafe

创建登录 Dashboard 的 token 和 kubeconfig 配置文件

dashboard 默认只支持 token 认证(不支持 client 证书认证),所以如果使用 Kubeconfig 文件,需要将 token 写入到该文件。

创建登录 token

kubectl create sa dashboard-admin -n kube-system
kubectl create clusterrolebinding dashboard-admin --clusterrole=cluster-admin --serviceaccount=kube-system:dashboard-admin
ADMIN_SECRET=$(kubectl get secrets -n kube-system | grep dashboard-admin | awk '{print $1}')
DASHBOARD_LOGIN_TOKEN=$(kubectl describe secret -n kube-system ${ADMIN_SECRET} | grep -E '^token' | awk '{print $2}')
echo ${DASHBOARD_LOGIN_TOKEN}

使用输出的 token 登录 Dashboard。
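补充一点:如果集群版本较新(假设在 1.24 及以上),ServiceAccount 不会再自动生成对应的 Secret,上面 grep secrets 的方式可能取不到 token。这种情况下可以参考用下面的命令直接为 ServiceAccount 签发一个临时 token(示例,有效期按需调整):

# Kubernetes 1.24+ 可以直接为 ServiceAccount 签发 token
kubectl -n kube-system create token dashboard-admin --duration=24h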

创建使用 token 的 KubeConfig 文件

# node1 上操作
export KUBE_APISERVER=https://192.168.66.11:6443
# 设置集群参数
kubectl config set-cluster kubernetes \
--certificate-authority=/etc/kubernetes/cert/ca.pem \
--embed-certs=true \
--server=${KUBE_APISERVER} \
--kubeconfig=dashboard.kubeconfig

# 设置客户端认证参数,使用上面创建的 Token
kubectl config set-credentials dashboard_user \
--token=${DASHBOARD_LOGIN_TOKEN} \
--kubeconfig=dashboard.kubeconfig

# 设置上下文参数
kubectl config set-context default \
--cluster=kubernetes \
--user=dashboard_user \
--kubeconfig=dashboard.kubeconfig

# 设置默认上下文
kubectl config use-context default --kubeconfig=dashboard.kubeconfig

用生成的 dashboard.kubeconfig 登录 Dashboard。

[root@node1 ~]# sz -y dashboard.kubeconfig
rz
Starting zmodem transfer. Press Ctrl+C to cancel.
Transferring dashboard.kubeconfig...
100% 2 KB 2 KB/sec 00:00:01 0 Errors

访问成功的界面

(截图:dashboard nodes 页面)

参考文档

Fail to login - Access Control is not helping
Organizing Cluster Access Using kubeconfig Files
Cannot access dashboard with no error
Can’t sign in into dashboard
基于kubernetes集群部署DashBoard
Kubernetes Dashboard
在开启TLS的Kubernetes1.6集群上安装Dashboard
Kubernetes Dashboard 1.7.0部署二三事

报错汇总

k8s自动启动dashboard

"dial tcp 10.0.0.1:443: getsockopt: no route to host"
systemctl restart flanneld docker

Ingress 访问 dashboard服务

点击登录,不能实现跳转

Let me have a summary:
If you use the recommended YAML to deploy the dashboard, you should only access the dashboard over HTTPS, and you should generate your certs (refer to the guide).
Then you can run kubectl proxy --address='0.0.0.0' --accept-hosts='^*$' to visit the dashboard at "http://localhost:8001/ui". This page needs a token to log in (generate one as described above). You can also add a NodePort to your YAML and access it via NodeIP:NodePort.

If you deploy with the alternative HTTP method, you can only access the dashboard via NodeIP:NodePort; remember to add it to the YAML first!
After deployment, you should also generate your token and add the header Authorization: Bearer <token> for every request.

The official wiki is a little bit confusing, so I reordered it here.

解决办法:使用https登陆。

ingress 简介

Ingress 公开了从集群外部到集群内服务的 HTTP 和 HTTPS 路由。 流量路由由 Ingress 资源上定义的规则控制。

internet
|
[ Ingress ]
--|-----|--
[ Services ]

可以将 Ingress 配置为服务提供外部可访问的 URL、负载均衡流量、终止 SSL/TLS,以及提供基于名称的虚拟主机等能力。 Ingress 控制器 通常负责通过负载均衡器来实现 Ingress,尽管它也可以配置边缘路由器或其他前端来帮助处理流量。

Ingress 控制器借助 Service 的服务发现机制动态更新自身配置,从而实现对后端 Pod 的负载均衡。由于涉及到配置的动态更新,目前社区的 Ingress Controller 大体包含两种类型的控制器:

传统的七层负载均衡如 Nginx、HAProxy,通过开发适应微服务应用的插件来支持动态更新,具有成熟、稳定、高性能等优点;
新型微服务负载均衡如 Traefik、Envoy、Istio,专门适用于微服务+容器化应用场景,天生具有动态更新的特点;

类型 常见类型 优点 缺点
传统负载均衡 nginx,haproxy 成熟,稳定,高性能 动态更新需reload配置文件
微服务负载均衡 Traefik,Envoy,Istio 天生为微服务而生,动态更新 性能还有待提升

安装

使用 NodePort的方式部署 ingress-nginx-controller

wget https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v0.35.0/deploy/static/provider/baremetal/deploy.yaml
mv deploy.yaml ingress-deploy-hostnetwork.yaml
kubectl apply -f ingress-deploy-hostnetwork.yaml

检查服务启动是否正常(拉镜像需要等待一段时间)

kubectl get pods -n ingress-nginx -l app.kubernetes.io/name=ingress-nginx

检查服务是否正常

POD_NAMESPACE=ingress-nginx
POD_NAME=$(kubectl get pods -n $POD_NAMESPACE -l app.kubernetes.io/name=ingress-nginx --field-selector=status.phase=Running -o jsonpath='{.items[0].metadata.name}')

kubectl exec -it $POD_NAME -n $POD_NAMESPACE -- /nginx-ingress-controller --version
-------------------------------------------------------------------------------
NGINX Ingress controller
Release: v0.35.0
Build: 54ad65e32bcab32791ab18531a838d1c0f0811ef
Repository: https://github.com/kubernetes/ingress-nginx
nginx version: nginx/1.19.2

-------------------------------------------------------------------------------

配置本地网络启动

apiVersion: apps/v1
kind: DaemonSet
metadata:
labels:
.....
name: ingress-nginx-controller
namespace: ingress-nginx
spec:
selector:
matchLabels:
.....
revisionHistoryLimit: 10
minReadySeconds: 0
template:
metadata:
labels:
.....
spec:
hostNetwork: true
dnsPolicy: ClusterFirstWithHostNet

修改内容:

  1. kind: DaemonSet 配置服务启动方式为DaemonSet(原来是deployment)。
  2. 配置 spec.hostNetwork: true,本地网络启动。
  3. 配置 dns的策略为: ClusterFirstWithHostNet, 与hostNetwork 配合使用,参考官方文档pod-s-dns-policy

重新同步一下pod文件

kubectl apply -f ingress-deploy-hostnetwork.yaml

配置标签,让ingress 只调度到Master节点上

为了让网关的性能更好,我们需要给 master 节点打上标签,让 ingress 只调度到 master 机器上;同时给 master 打上污点,禁止其他 pod 调度到 master 节点上。

kubectl get pods -n ingress-nginx  -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
ingress-nginx-admission-create-z7sts 0/1 Completed 0 2d16h 172.30.199.211 node7 <none> <none>
ingress-nginx-admission-patch-4qgtz 0/1 Completed 0 2d16h 172.30.139.14 node6 <none> <none>
ingress-nginx-controller-h5fdd 1/1 Running 1 2d16h 192.168.66.16 node6 <none> <none>
ingress-nginx-controller-j6n9z 1/1 Running 0 2d16h 192.168.66.11 node1 <none> <none>
ingress-nginx-controller-kfnxt 1/1 Running 1 2d16h 192.168.66.17 node7 <none> <none>
ingress-nginx-controller-n5pl4 1/1 Running 0 2d16h 192.168.66.12 node2 <none> <none>
ingress-nginx-controller-qzbb5 1/1 Running 0 2d16h 192.168.66.13 node3 <none> <none>

对 node1 node2 打标签

kubectl label nodes node1 api/ingress-controller=true
kubectl label nodes node2 api/ingress-controller=true

查看标签

# 为了好看,去掉相同的标签
kubectl get nodes --show-labels
NAME STATUS ROLES AGE VERSION LABELS
node1 Ready <none> 3h53m v1.16.6 api/ingress-controller=true,kubernetes.io/hostname=node1
node2 Ready <none> 3h53m v1.16.6 api/ingress-controller=true,kubernetes.io/hostname=node2
node3 Ready <none> 3h53m v1.16.6 kubernetes.io/hostname=node3
node6 Ready <none> 3d4h v1.16.6 kubernetes.io/hostname=node6
node7 Ready <none> 3d4h v1.16.6 kubernetes.io/hostname=node7

增加标签选择,及容忍污点

# vim ingress-deploy-hostnetwork.yaml
spec:
hostNetwork: true
dnsPolicy: ClusterFirstWithHostNet
nodeSelector:
api/ingress-controller: "true"
tolerations:
- key: "node-role.kubernetes.io/master"
operator: "Equal"
value: ""
effect: "NoSchedule"

在刚才修改 hostNetwork 网络配置的位置后面,继续追加以上内容。

重新同步一下配置

kubectl apply -f ingress-deploy-hostnetwork.yaml
[root@node1 ~]# kubectl get pods -n ingress-nginx  -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
ingress-nginx-admission-create-z7sts 0/1 Completed 0 2d16h 172.30.199.211 node7 <none> <none>
ingress-nginx-admission-patch-4qgtz 0/1 Completed 0 2d16h 172.30.139.14 node6 <none> <none>
ingress-nginx-controller-6s2lp 1/1 Running 0 57m 192.168.66.12 node2 <none> <none>
ingress-nginx-controller-j6n9z 1/1 Running 0 57m 192.168.66.11 node1 <none> <none>

为了避免 master 节点被调度新的 pod,需要给 master 节点打上 Taints(污点)。
如果一个节点被打上了 Taints,除非 pod 被标识为可以容忍该污点,否则不会被调度到这个节点上。
这里使用的影响策略是 NoSchedule,只会影响新的 pod 调度。

kubectl taint nodes node1 node-role.kubernetes.io/master=:NoSchedule
kubectl taint nodes node2 node-role.kubernetes.io/master=:NoSchedule
[root@node1 ~]# kubectl describe node node1 |grep Taints
Taints: node-role.kubernetes.io/master:NoSchedule
[root@node1 ~]# kubectl describe node node2 |grep Taints
Taints: node-role.kubernetes.io/master:NoSchedule
[root@node1 ~]#
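顺便记录一下回滚方法:如果之后需要恢复 master 节点的调度,可以在污点的 key 后面加一个减号来删除它(示例):

# 删除 node1 上的 NoSchedule 污点,末尾的 "-" 表示删除
kubectl taint nodes node1 node-role.kubernetes.io/master:NoSchedule-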
[root@node1 ~]# kubectl get pods -n ingress-nginx  -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
ingress-nginx-admission-create-z7sts 0/1 Completed 0 2d19h 172.30.199.211 node7 <none> <none>
ingress-nginx-admission-patch-4qgtz 0/1 Completed 0 2d19h 172.30.139.14 node6 <none> <none>
ingress-nginx-controller-6s2lp 1/1 Running 0 3h6m 192.168.66.12 node2 <none> <none>
ingress-nginx-controller-kv4xs 1/1 Running 0 3h5m 192.168.66.11 node1 <none> <none>
[root@node1 ~]#

汇总一下配置文件的修改内容

apiVersion: apps/v1
kind: DaemonSet
metadata:
labels:
.....
name: ingress-nginx-controller
namespace: ingress-nginx
spec:
selector:
matchLabels:
.....
revisionHistoryLimit: 10
minReadySeconds: 0
template:
metadata:
labels:
.....
spec:
hostNetwork: true
dnsPolicy: ClusterFirstWithHostNet
nodeSelector:
api/ingress-controller: "true"
tolerations:
- key: "node-role.kubernetes.io/master"
operator: "Equal"
value: ""
effect: "NoSchedule"

修改参数如下:

  • kind: DaemonSet #由原来的 Deployment 修改为 DaemonSet
  • hostNetwork: true #添加该字段让容器使用物理机网络,在物理机上直接暴露服务端口(80),注意物理机的 80 端口不能被占用
  • dnsPolicy: ClusterFirstWithHostNet #使用 hostNetwork 后容器默认会使用物理机的网络和 DNS,导致无法解析集群内部的 service,使用此参数让容器继续使用 K8S 的 DNS
  • nodeSelector: api/ingress-controller: "true" #添加节点标签选择
  • tolerations: 添加对指定节点污点的容忍度

验证ingress 是否正常

cat > nginx-test.yaml <<EOF
#deploy
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-test
spec:
selector:
matchLabels:
app: nginx-test
replicas: 1
template:
metadata:
labels:
app: nginx-test
spec:
containers:
- name: nginx-test
image: nginx
ports:
- containerPort: 80
---
#service
apiVersion: v1
kind: Service
metadata:
name: nginx-test
spec:
ports:
- port: 80
protocol: TCP
targetPort: 80
selector:
app: nginx-test

---
#ingress
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
name: nginx-test
spec:
rules:
- host: nginx.biglittleant.cn
http:
paths:
- path: /
backend:
serviceName: nginx-test
servicePort: 80
EOF
kubectl apply -f nginx-test.yaml

检查执行结果

[root@node1 ~]# kubectl get pods
NAME READY STATUS RESTARTS AGE
nginx-test-54789bbd4-tplgz 1/1 Running 0 2m27s
[root@node1 ~]# kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.254.0.1 <none> 443/TCP 18h
nginx-test ClusterIP 10.254.51.46 <none> 80/TCP 3m57s
[root@node1 ~]# kubectl get ingress
NAME HOSTS ADDRESS PORTS AGE
nginx-test nginx.biglittleant.cn 192.168.66.11,192.168.66.12 80 4m2s

本地访问确认ingress工作正常

 curl -I nginx.biglittleant.cn -x 192.168.66.11:80
HTTP/1.1 200 OK
Server: nginx/1.19.2
Date: Wed, 30 Sep 2020 03:39:53 GMT
Content-Type: text/html
Content-Length: 612
Connection: keep-alive
Vary: Accept-Encoding
Last-Modified: Tue, 11 Aug 2020 14:50:35 GMT
ETag: "5f32b03b-264"
Accept-Ranges: bytes

在NGINX容器中新建一个文件,确认访问是否正常。

[root@node1 ~]# kubectl exec -it nginx-test-54789bbd4-tplgz /bin/bash
root@nginx-test-54789bbd4-tplgz:/# echo 1 > /usr/share/nginx/html/1.html
root@nginx-test-54789bbd4-tplgz:/# cat /usr/share/nginx/html/1.html
1
root@nginx-test-54789bbd4-tplgz:/# exit
exit
[root@node1 ~]#
 curl nginx.biglittleant.cn/1.html -x 192.168.66.11:80
1

参考文章

官方仓库
官方文档
k8s配置ingress
nginx ingress的安装配置和使用

什么是 Helm

在没使用 helm 之前,向 kubernetes 部署应用,我们要依次部署 deployment、svc 等,步骤较繁琐。况且随着很多项目微服务化,复杂的应用在容器中的部署和管理显得较为复杂。helm 通过打包的方式,支持发布的版本管理和控制,很大程度上简化了 Kubernetes 应用的部署和管理。

Helm 本质就是让 K8s 的应用管理(Deployment、Service 等)可配置、能动态生成:通过动态生成 K8s 资源清单文件(deployment.yaml、service.yaml),然后调用 kubectl 自动执行 K8s 资源部署。

Helm 是官方提供的类似于 YUM 的包管理器,是对部署环境流程的封装。Helm 有两个重要的概念:chart 和 release。

  • chart 是创建一个应用的信息集合,包括各种 Kubernetes 对象的配置模板、参数定义、依赖关系、文档说明等。chart 是应用部署的自包含逻辑单元。可以将 chart 想象成 apt、yum 中的软件安装包
  • release 是 chart 的运行实例,代表了一个正在运行的应用。当 chart 被安装到 Kubernetes 集群,就生成一个 release。chart 能够多次安装到同一个集群,每次安装都是一个 release

Helm 包含两个组件:Helm 客户端和 Tiller 服务器,如下图所示
缺图

Helm 客户端负责 chart 和 release 的创建和管理以及和 Tiller 的交互。Tiller 服务器运行在 Kubernetes 集群中,它会处理 Helm 客户端的请求,并与 Kubernetes API Server 交互。

Helm 部署

ntpdate ntp1.aliyun.com
wget https://get.helm.sh/helm-v2.16.5-linux-amd64.tar.gz
tar -zxvf helm-v2.16.5-linux-amd64.tar.gz
cd linux-amd64/
cp helm /usr/local/bin/

为了安装服务端 tiller,还需要在这台机器上配置好 kubectl 工具和 kubeconfig 文件,确保 kubectl 工具可以在这台机器上访问 apiserver 且正常使用。这里的 node1 节点已经配置好了 kubectl。

因为 Kubernetes APIServer 开启了 RBAC 访问控制,所以需要创建 tiller 使用的 service account:tiller,并分配合适的角色给它。详细内容可以查看 helm 文档中的 Role-based Access Control。这里简单起见,直接分配 cluster-admin 这个集群内置的 ClusterRole 给它。创建 rbac-config.yaml 文件:

cat > rbac-config.yaml <<EOF
apiVersion: v1
kind: ServiceAccount
metadata:
name: tiller
namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: tiller
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: cluster-admin
subjects:
- kind: ServiceAccount
name: tiller
namespace: kube-system
EOF
kubectl create -f rbac-config.yaml
serviceaccount/tiller created
clusterrolebinding.rbac.authorization.k8s.io/tiller created

拉取镜像

docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/tiller:v2.16.6
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/tiller:v2.16.6 k8s.gcr.io/tiller:v2.16.6
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/tiller:v2.16.6

安装tiller

helm init --service-account tiller --history-max 200 --upgrade --tiller-image k8s.gcr.io/tiller:v2.16.6 --stable-repo-url https://cnych.github.io/kube-charts-mirror/

tiller 默认被部署在 k8s 集群中的 kube-system 这个 namespace 下

kubectl get pod -n kube-system -l app=helm
NAME READY STATUS RESTARTS AGE
tiller-deploy-55479b584d-4kc4b 1/1 Running 2 21h
helm version
Client: &version.Version{SemVer:"v2.16.9", GitCommit:"8ad7037828e5a0fca1009dabe290130da6368e39", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.16.6", GitCommit:"dd2e5695da88625b190e6b22e9542550ab503a47", GitTreeState:"clean"}

Helm 自定义模板

# 创建文件夹
mkdir ./hello-world
cd ./hello-world
# 创建自描述文件 Chart.yaml , 这个文件必须有 name 和 version 定义
cat <<'EOF' > ./Chart.yaml
name: hello-world
version: 1.0.0
EOF
# 创建模板文件, 用于生成 Kubernetes 资源清单(manifests)
mkdir ./templates
cat <<'EOF' > ./templates/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: hello-world
labels:
app: hello-world
spec:
replicas: 3
selector:
matchLabels:
app: hello-world
template:
metadata:
labels:
app: hello-world
spec:
containers:
- name: nginx
image: nginx:latest
ports:
- containerPort: 80
EOF

cat <<'EOF' > ./templates/service.yaml
apiVersion: v1
kind: Service
metadata:
name: hello-world
spec:
type: NodePort
ports:
- name: http
protocol: TCP
port: 80
selector:
app: hello-world
EOF
# 使用命令 helm install RELATIVE_PATH_TO_CHART 创建一次Release
$ helm install .
# 查看 hello-world的部署情况
helm list
NAME REVISION UPDATED STATUS CHART APP VERSION NAMESPACE
wistful-ferret 1 Thu Jul 2 18:42:42 2020 DEPLOYED hello-world-1.0.0 default
# 这个文件中定义的值,在模板文件中可以通过 .Values 对象访问到
cat <<'EOF' > ./templates/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: hello-world
labels:
app: hello-world
spec:
replicas: 3
selector:
matchLabels:
app: hello-world
template:
metadata:
labels:
app: hello-world
spec:
containers:
- name: nginx
image: {{ .Values.image.repository }}:{{ .Values.image.tag }}
ports:
- containerPort: 80
EOF

创建一个 values.yaml 定义相关值

# 配置体现在配置文件 values.yaml
cat <<'EOF' > ./values.yaml
image:
  repository: nginx
  tag: '1.18'
EOF
# 查看当前的版本信息
helm list
NAME REVISION UPDATED STATUS CHART APP VERSION NAMESPACE
wistful-ferret 1 Thu Jul 2 18:42:42 2020 DEPLOYED hello-world-1.0.0 default

# 升级版本
helm upgrade -f values.yaml wistful-ferret .

## REVISION 版本号变成了2
helm list
NAME REVISION UPDATED STATUS CHART APP VERSION NAMESPACE
wistful-ferret 2 Fri Jul 3 15:16:24 2020 DEPLOYED hello-world-1.0.0 default

## 查看镜像号已经变更成功
kubectl get pods hello-world-8689856fc-v9s2m -o yaml |grep "\- image: "
- image: nginx:1.18

也可以通过命令行显式指定版本号

# 在 values.yaml 中的值可以被部署 release 时用到的参数 --values YAML_FILE_PATH 或 --set key1=value1, key2=value2 覆盖掉
$ helm install --set image.tag='latest' .

helm 常用命令

# 列出已经部署的 Release
$ helm ls
# 查询一个特定的 Release 的状态
$ helm status RELEASE_NAME
# 移除所有与这个 Release 相关的 Kubernetes 资源
$ helm delete cautious-shrimp
# helm rollback RELEASE_NAME REVISION_NUMBER
$ helm rollback cautious-shrimp 1
# 使用 helm delete --purge RELEASE_NAME 移除所有与指定 Release 相关的 Kubernetes 资源和所有这个 Release 的记录
$ helm delete --purge cautious-shrimp
$ helm ls --deleted

Debug

# 使用模板动态生成K8s资源清单,非常需要能提前预览生成的结果。
# 使用--dry-run --debug 选项来打印出生成的清单文件内容,而不执行部署
helm install . --dry-run --debug --set image.tag=latest

帮助文档

官方文档
helm 安装 dashboard
ingress-nginx deploy
ingress-nginx

下载和配置 coredns

cd /data/apps/k8s/work/
git clone https://github.com/coredns/deployment.git
mv deployment coredns-deployment

创建 coredns

cd /data/apps/k8s/work/coredns-deployment/kubernetes
export CLUSTER_DNS_SVC_IP="10.254.0.2"
export CLUSTER_DNS_DOMAIN="cluster.local"
./deploy.sh -i ${CLUSTER_DNS_SVC_IP} -d ${CLUSTER_DNS_DOMAIN} > coredns.yaml
kubectl apply -f coredns.yaml

Sep 27 15:35:22 node1 kube-scheduler[8420]: I0927 15:35:22.343021 8420 scheduler.go:667] pod kube-system/coredns-759df9d7b-td7rr is bound successfully on node “node7”, 2 nodes evaluated, 2 nodes were found feasible. Bound node resource: “Capacity: CPU<2>|Memory<4046008Ki>|Pods<220>|StorageEphemeral<41921540Ki>; Allocatable: CPU<2>|Memory<3943608Ki>|Pods<220>|StorageEphemeral<38634891201>.”.

检查 coredns 功能

$ kubectl get all -n kube-system -l k8s-app=kube-dns
NAME READY STATUS RESTARTS AGE
pod/coredns-759df9d7b-td7rr 1/1 Running 0 2m7s

NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kube-dns ClusterIP 10.254.0.2 <none> 53/UDP,53/TCP,9153/TCP 2m7s

NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/coredns 1/1 1 1 2m7s

NAME DESIRED CURRENT READY AGE
replicaset.apps/coredns-759df9d7b 1 1 1 2m7s

新建一个 Deployment:

cd /data/apps/k8s/work/
cat > my-nginx.yaml <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
name: my-nginx
spec:
replicas: 2
selector:
matchLabels:
run: my-nginx
template:
metadata:
labels:
run: my-nginx
spec:
containers:
- name: my-nginx
image: nginx:1.7.9
ports:
- containerPort: 80
EOF
kubectl create -f my-nginx.yaml

expose 该 Deployment,生成 my-nginx 服务:

$ kubectl expose deploy my-nginx
service "my-nginx" exposed

$ kubectl get services my-nginx -o wide
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
my-nginx ClusterIP 10.254.67.218 <none> 80/TCP 5s run=my-nginx

创建另一个 Pod,查看 /etc/resolv.conf 是否包含 kubelet 配置的 --cluster-dns 和 --cluster-domain,是否能够将服务 my-nginx 解析到上面显示的 Cluster IP 10.254.67.218
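如果只是想快速验证 DNS 解析,也可以不建 DaemonSet,先临时起一个 Pod 来测试(示例,busybox:1.28 镜像自带 nslookup,Pod 退出后自动删除):

# 临时 Pod 验证 service 解析
kubectl run -it --rm dns-test --image=busybox:1.28 --restart=Never -- nslookup my-nginx

下面再用 dnsutils DaemonSet 的方式做更完整的验证。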

cd /data/apps/k8s/work/
cat > dnsutils-ds.yml <<EOF
apiVersion: v1
kind: Service
metadata:
name: dnsutils-ds
labels:
app: dnsutils-ds
spec:
type: NodePort
selector:
app: dnsutils-ds
ports:
- name: http
port: 80
targetPort: 80
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: dnsutils-ds
labels:
addonmanager.kubernetes.io/mode: Reconcile
spec:
selector:
matchLabels:
app: dnsutils-ds
template:
metadata:
labels:
app: dnsutils-ds
spec:
containers:
- name: my-dnsutils
image: tutum/dnsutils:latest
command:
- sleep
- "3600"
ports:
- containerPort: 80
EOF
kubectl create -f dnsutils-ds.yml
$ kubectl get pods -l app=dnsutils-ds -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
dnsutils-ds-t5hqb 1/1 Running 0 4m25s 172.30.139.3 node6 <none> <none>
dnsutils-ds-zxzhf 1/1 Running 0 4m25s 172.30.199.195 node7 <none> <none>
$ kubectl -it exec dnsutils-ds-t5hqb  cat /etc/resolv.conf
search default.svc.cluster.local svc.cluster.local cluster.local
nameserver 10.254.0.2
options ndots:5

查看一下现有的 service

$ kubectl get services
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
dnsutils-ds NodePort 10.254.242.169 <none> 80:31128/TCP 6m36s
kubernetes ClusterIP 10.254.0.1 <none> 443/TCP 146m
my-nginx ClusterIP 10.254.2.160 <none> 80/TCP 8m27s

nslookup 验证一下解析信息

$ kubectl -it exec dnsutils-ds-t5hqb nslookup kubernetes
Server: 10.254.0.2
Address: 10.254.0.2#53

Name: kubernetes.default.svc.cluster.local
Address: 10.254.0.1
$ kubectl -it exec dnsutils-ds-t5hqb nslookup my-nginx
Server: 10.254.0.2
Address: 10.254.0.2#53

Name: my-nginx.default.svc.cluster.local
Address: 10.254.2.160

nslookup 验证一下外网域名

$ kubectl -it exec dnsutils-ds-t5hqb  nslookup www.baidu.com
Server: 10.254.0.2
Address: 10.254.0.2#53

Non-authoritative answer:
www.baidu.com canonical name = www.a.shifen.com.
Name: www.a.shifen.com
Address: 39.156.66.18
Name: www.a.shifen.com
Address: 39.156.66.14

参考

  1. https://community.infoblox.com/t5/Community-Blog/CoreDNS-for-Kubernetes-Service-Discovery/ba-p/8187
  2. https://coredns.io/2017/03/01/coredns-for-kubernetes-service-discovery-take-2/
  3. https://www.cnblogs.com/boshen-hzb/p/7511432.html
  4. https://github.com/kubernetes/kubernetes/tree/master/cluster/addons/dns

keepalived 服务配置

keepalived 起初是专为 LVS 设计的,专门用来监控 LVS 集群系统中各个服务节点的状态,后来又加入了 VRRP 的功能,因此除了配合 LVS 使用外,也可以作为其他服务(nginx、haproxy)的高可用软件。
VRRP(Virtual Router Redundancy Protocol,虚拟路由器冗余协议)的目的就是为了解决静态路由的单点故障问题。
keepalived 两大功能:healthcheck & failover。

主机规划

主机名 IP VIP 功能
node8 192.168.66.18 192.168.66.250(DR) lvs+keepalived(主节点)
node9 192.168.66.19 192.168.66.250(DR) lvs+keepalived(备节点)
node1 192.168.66.11 nginx
node2 192.168.66.12 nginx

安装 lvs 软件

两台机器上都要操作

yum install ipvsadm -y

开启内核转发

cat /proc/sys/net/ipv4/ip_forward
echo 1 > /proc/sys/net/ipv4/ip_forward
cat /proc/sys/net/ipv4/ip_forward

keepalived 安装

两台机器上都要操作

yum install curl gcc openssl-devel libnl3-devel net-snmp-devel -y

源码编译安装

cd /usr/local/src
wget https://www.keepalived.org/software/keepalived-2.1.5.tar.gz
tar -zxf keepalived-2.1.5.tar.gz
cd keepalived-2.1.5
./configure --prefix=/usr/local/src/keepalived-2.1.5_bin
make && make install
cd /usr/local/src/keepalived-2.1.5_bin
cp sbin/keepalived /usr/sbin/
cp bin/genhash /usr/bin/

实战

添加 keepalived 启动参数配置文件

两台机器上都要操作

vim /etc/sysconfig/keepalived
# Options for keepalived. See `keepalived --help' output and keepalived(8) and
# keepalived.conf(5) man pages for a list of all options. Here are the most
# common ones :
#
# --vrrp -P Only run with VRRP subsystem.
# --check -C Only run with Health-checker subsystem.
# --dont-release-vrrp -V Dont remove VRRP VIPs & VROUTEs on daemon stop.
# --dont-release-ipvs -I Dont remove IPVS topology on daemon stop.
# --dump-conf -d Dump the configuration data.
# --log-detail -D Detailed log messages.
# --log-facility -S 0-7 Set local syslog facility (default=LOG_DAEMON)
#

KEEPALIVED_OPTIONS="-D"

添加 keepalived 主配置文件

在 node8 上操作

mkdir -p /etc/keepalived/
vim /etc/keepalived/keepalived.conf
! Configuration File for keepalived

global_defs {
notification_email {
biglittleant@admin.com
}
notification_email_from Alexandre.Cassen@firewall.loc
smtp_server 192.168.200.1
smtp_connect_timeout 30
router_id node8 #修改为本机名
vrrp_skip_check_adv_addr
vrrp_strict
vrrp_garp_interval 0
vrrp_gna_interval 0
}

vrrp_instance VI_1 {
state MASTER # 主节点
interface eth1
virtual_router_id 51
priority 100 # 要大于备节点的值
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
192.168.66.250
}
}

include keepalived.d/*.conf

备节点配置文件

在 node9 上操作

mkdir -p /etc/keepalived/
vim /etc/keepalived/keepalived.conf
! Configuration File for keepalived

global_defs {
notification_email {
biglittleant@admin.com
}
notification_email_from Alexandre.Cassen@firewall.loc
smtp_server 192.168.200.1
smtp_connect_timeout 30
router_id node9 #修改为本机名
vrrp_skip_check_adv_addr
vrrp_strict
vrrp_garp_interval 0
vrrp_gna_interval 0
}

vrrp_instance VI_1 {
state BACKUP # 备节点
interface eth1
virtual_router_id 51
priority 99 # 要小于主节点的值
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
192.168.66.250
}
}

include keepalived.d/*.conf

添加 virtual_server 配置文件

mkdir -p /etc/keepalived/keepalived.d
vim /etc/keepalived/keepalived.d/vs-192.168.66.250_80.conf
virtual_server 192.168.66.250 80 {
delay_loop 6
lb_algo wrr
lb_kind DR
persistence_timeout 50
protocol TCP

real_server 192.168.66.11 80 {
weight 1
}

real_server 192.168.66.12 80 {
weight 2
}
}

添加 keepalived 启动文件

两台机器上都要操作

# 添加keepalived 启动文件
vim /usr/lib/systemd/system/keepalived.service
[Unit]
Description=LVS and VRRP High Availability Monitor
After=syslog.target network-online.target

[Service]
Type=forking
PIDFile=/var/run/keepalived.pid
KillMode=process
EnvironmentFile=-/etc/sysconfig/keepalived
ExecStart=/usr/sbin/keepalived $KEEPALIVED_OPTIONS
ExecReload=/bin/kill -HUP $MAINPID

[Install]
WantedBy=multi-user.target

# 启动keepalvied & 加入开机自启动
systemctl daemon-reload
systemctl enable keepalived
Created symlink from /etc/systemd/system/multi-user.target.wants/keepalived.service to /usr/lib/systemd/system/keepalived.service.
systemctl start keepalived

验证服务启动

node8上操作

Sep 15 17:10:34 node8 Keepalived_vrrp[30389]: Using LinkWatch kernel netlink reflector...
Sep 15 17:10:34 node8 Keepalived_vrrp[30389]: VRRP sockpool: [ifindex(3), proto(112), unicast(0), fd(10,11)]
Sep 15 17:10:34 node8 Keepalived_healthcheckers[30388]: Got SIGHUP, reloading checker configuration
Sep 15 17:10:34 node8 Keepalived_healthcheckers[30388]: Initializing ipvs
Sep 15 17:10:34 node8 Keepalived_healthcheckers[30388]: Opening file '/etc/keepalived/keepalived.conf'.
Sep 15 17:10:34 node8 Keepalived_healthcheckers[30388]: Opening file 'keepalived.d/vs-192.168.66.250_80.conf'.
Sep 15 17:10:34 node8 Keepalived_healthcheckers[30388]: service [192.168.66.13]:80 no longer exist
Sep 15 17:10:34 node8 Keepalived_healthcheckers[30388]: Gained quorum 1+0=1 <= 2 for VS [192.168.66.250]:80
Sep 15 17:10:34 node8 systemd[1]: Reloaded LVS and VRRP High Availability Monitor.
Sep 15 17:10:35 node8 Keepalived_vrrp[30389]: VRRP_Instance(VI_1) Transition to MASTER STATE

[root@node8 ~]# ip addr list eth1
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 08:00:27:89:7e:4b brd ff:ff:ff:ff:ff:ff
inet 192.168.66.18/24 brd 192.168.66.255 scope global noprefixroute eth1
valid_lft forever preferred_lft forever
inet 192.168.66.250/32 scope global eth1
valid_lft forever preferred_lft forever
inet6 fe80::a00:27ff:fe89:7e4b/64 scope link
valid_lft forever preferred_lft forever

[root@node8 ~]# ipvsadm -Ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 192.168.66.250:80 wrr persistent 50
-> 192.168.66.11:80 Route 1 0 0
-> 192.168.66.12:80 Route 2 0 0

手动关闭主节点,验证虚拟IP自动漂移

node8上操作

[root@node8 ~]# systemctl stop keepalived
# 发现主节点lvs配置没了
[root@node8 ~]# ipvsadm -Ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
[root@node8 ~]#

node9上操作

# 通过日志,查看Keepalived主备切换过程
# 期间备节点发送ARP广播,让所有客户端更新本地的ARP表,以便客户端访问新接管VIP服务的节点。

Sep 15 17:33:21 node9 Keepalived_vrrp[30272]: VRRP_Instance(VI_1) Entering BACKUP STATE
Sep 15 17:33:21 node9 Keepalived_vrrp[30272]: VRRP sockpool: [ifindex(3), proto(112), unicast(0), fd(10,11)]
Sep 15 17:33:21 node9 systemd[1]: Reloaded LVS and VRRP High Availability Monitor.
Sep 15 17:34:48 node9 Keepalived_vrrp[30272]: VRRP_Instance(VI_1) Transition to MASTER STATE
Sep 15 17:34:49 node9 Keepalived_vrrp[30272]: VRRP_Instance(VI_1) Entering MASTER STATE
Sep 15 17:34:49 node9 Keepalived_vrrp[30272]: VRRP_Instance(VI_1) setting protocol iptable drop rule
Sep 15 17:34:49 node9 Keepalived_vrrp[30272]: VRRP_Instance(VI_1) setting protocol VIPs.
Sep 15 17:34:49 node9 Keepalived_vrrp[30272]: Sending gratuitous ARP on eth1 for 192.168.66.250
Sep 15 17:34:49 node9 Keepalived_vrrp[30272]: VRRP_Instance(VI_1) Sending/queueing gratuitous ARPs on eth1 for 192.168.66.250
Sep 15 17:34:49 node9 Keepalived_vrrp[30272]: Sending gratuitous ARP on eth1 for 192.168.66.250
Sep 15 17:34:49 node9 Keepalived_vrrp[30272]: Sending gratuitous ARP on eth1 for 192.168.66.250
Sep 15 17:34:49 node9 Keepalived_vrrp[30272]: Sending gratuitous ARP on eth1 for 192.168.66.250
Sep 15 17:34:49 node9 Keepalived_vrrp[30272]: Sending gratuitous ARP on eth1 for 192.168.66.250
Sep 15 17:34:54 node9 Keepalived_vrrp[30272]: Sending gratuitous ARP on eth1 for 192.168.66.250
Sep 15 17:34:54 node9 Keepalived_vrrp[30272]: VRRP_Instance(VI_1) Sending/queueing gratuitous ARPs on eth1 for 192.168.66.250
Sep 15 17:34:54 node9 Keepalived_vrrp[30272]: Sending gratuitous ARP on eth1 for 192.168.66.250
Sep 15 17:34:54 node9 Keepalived_vrrp[30272]: Sending gratuitous ARP on eth1 for 192.168.66.250
Sep 15 17:34:54 node9 Keepalived_vrrp[30272]: Sending gratuitous ARP on eth1 for 192.168.66.250
Sep 15 17:34:54 node9 Keepalived_vrrp[30272]: Sending gratuitous ARP on eth1 for 192.168.66.250

为什么会连续发送 5 个免费 ARP?因为 vrrp_garp_master_repeat 的默认值为 5。
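如果想调整这个行为,可以在 global_defs 里显式配置下面两个参数(示例,非必需):

global_defs {
    ...
    vrrp_garp_master_repeat 5     # 切换为 MASTER 后连续发送的免费 ARP 数量,默认 5
    vrrp_garp_master_refresh 0    # 周期性重发免费 ARP 的间隔秒数,0 表示不周期重发
}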

[root@node9 ~]# ip addr list eth1
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 08:00:27:e5:bd:89 brd ff:ff:ff:ff:ff:ff
inet 192.168.66.19/24 brd 192.168.66.255 scope global noprefixroute eth1
valid_lft forever preferred_lft forever
inet 192.168.66.250/32 scope global eth1
valid_lft forever preferred_lft forever
inet6 fe80::a00:27ff:fee5:bd89/64 scope link
valid_lft forever preferred_lft forever

[root@node9 ~]# ipvsadm -Ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 192.168.66.250:80 wrr persistent 50
-> 192.168.66.11:80 Route 1 0 0
-> 192.168.66.12:80 Route 2 0 0

重新启动主节点,验证是否切回

# 由于主节点优先级高于备节点,因此主节点抢占为MASTER,同时备节点成为BACKUP,并且移除VIP。

Sep 15 17:38:41 node8 Keepalived_vrrp[1214]: VRRP_Instance(VI_1) Entering BACKUP STATE
Sep 15 17:38:41 node8 Keepalived_vrrp[1214]: VRRP sockpool: [ifindex(3), proto(112), unicast(0), fd(10,11)]
Sep 15 17:38:41 node8 Keepalived_vrrp[1214]: VRRP_Instance(VI_1) forcing a new MASTER election
Sep 15 17:38:42 node8 Keepalived_vrrp[1214]: VRRP_Instance(VI_1) Transition to MASTER STATE
Sep 15 17:38:43 node8 Keepalived_vrrp[1214]: VRRP_Instance(VI_1) Entering MASTER STATE
Sep 15 17:38:43 node8 Keepalived_vrrp[1214]: VRRP_Instance(VI_1) setting protocol iptable drop rule
Sep 15 17:38:43 node8 Keepalived_vrrp[1214]: VRRP_Instance(VI_1) setting protocol VIPs.
Sep 15 17:38:43 node8 Keepalived_vrrp[1214]: Sending gratuitous ARP on eth1 for 192.168.66.250
Sep 15 17:38:43 node8 Keepalived_vrrp[1214]: VRRP_Instance(VI_1) Sending/queueing gratuitous ARPs on eth1 for 192.168.66.250
Sep 15 17:38:43 node8 Keepalived_vrrp[1214]: Sending gratuitous ARP on eth1 for 192.168.66.250
Sep 15 17:38:43 node8 Keepalived_vrrp[1214]: Sending gratuitous ARP on eth1 for 192.168.66.250
Sep 15 17:38:43 node8 Keepalived_vrrp[1214]: Sending gratuitous ARP on eth1 for 192.168.66.250
Sep 15 17:38:43 node8 Keepalived_vrrp[1214]: Sending gratuitous ARP on eth1 for 192.168.66.250
Sep 15 17:38:48 node8 Keepalived_vrrp[1214]: Sending gratuitous ARP on eth1 for 192.168.66.250
Sep 15 17:38:48 node8 Keepalived_vrrp[1214]: VRRP_Instance(VI_1) Sending/queueing gratuitous ARPs on eth1 for 192.168.66.250
Sep 15 17:38:48 node8 Keepalived_vrrp[1214]: Sending gratuitous ARP on eth1 for 192.168.66.250
Sep 15 17:38:48 node8 Keepalived_vrrp[1214]: Sending gratuitous ARP on eth1 for 192.168.66.250
Sep 15 17:38:48 node8 Keepalived_vrrp[1214]: Sending gratuitous ARP on eth1 for 192.168.66.250
Sep 15 17:38:48 node8 Keepalived_vrrp[1214]: Sending gratuitous ARP on eth1 for 192.168.66.250


[root@node8 ~]# ipvsadm -Ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 192.168.66.250:80 wrr persistent 50
-> 192.168.66.11:80 Route 1 0 0
-> 192.168.66.12:80 Route 2 0 0
[root@node8 ~]# ip addr list eth1
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 08:00:27:89:7e:4b brd ff:ff:ff:ff:ff:ff
inet 192.168.66.18/24 brd 192.168.66.255 scope global noprefixroute eth1
valid_lft forever preferred_lft forever
inet 192.168.66.250/32 scope global eth1
valid_lft forever preferred_lft forever
inet6 fe80::a00:27ff:fe89:7e4b/64 scope link
valid_lft forever preferred_lft forever

报错汇总

Unable to load ipset library - libipset.so.11: cannot open shared object file: No such file or directory

yum 安装会报错,查看ipset的包,发现库文件是libipset.so.11, 怀疑是yum版本太旧导致。使用src安装最新版本,未发现此报错。
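一个可行的处理思路(示例,按实际环境验证):先补装 ipset 相关的库,再按前文的源码方式重新编译 keepalived,让它链接当前系统的 libipset。

yum install -y ipset ipset-devel
# 回到源码目录重新编译,使其链接系统当前的 libipset
cd /usr/local/src/keepalived-2.1.5
./configure --prefix=/usr/local/src/keepalived-2.1.5_bin
make && make install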

参考文档

git仓库
官方文档

vagrant+virtualbox 构建mac虚拟机环境

vagrant 简介

什么是vagrant

Vagrant是一个基于Ruby的工具,用于创建和部署虚拟化开发环境。它使用Oracle的开源VirtualBox虚拟化系统,使用 Chef创建自动化虚拟环境。

Vagrant的运行是需要依赖某个虚拟化平台的,如上面安装的virtualBox

它可以实现的功能有:

  • 建立和删除虚拟机
  • 配置虚拟机运行参数
  • 管理虚拟机运行状态
  • 自动配置和安装开发环境
  • 打包和分发虚拟机运行环境

在Vagrant体系中,有个box(箱子)的概念,这点类似于docker体系中的image(镜像)。基于同一个box,不同的人可以运行得到相同的内容

开始安装基础软件

主要安装以下三个软件

  • Vagrant 2.2 (brew cask install vagrant)
  • VirtualBox 6.0 (brew cask install virtualbox)
  • VirtualBox Guest Additions (vagrant plugin install vagrant-vbguest)
brew cask install vagrant
brew cask install virtualbox
vagrant plugin install vagrant-vbguest
brew install gnu-tar ## 增强型的打包工具 GNU tar,可以不安装

自己制作 centos7 的镜像

第一步下载官方的centos7的镜像

wget https://cloud.centos.org/centos/7/vagrant/x86_64/images/CentOS-7-x86_64-Vagrant-2004_01.VirtualBox.box

centos 镜像下载页面: https://cloud.centos.org/centos/7/vagrant/x86_64/images/

下载完成后开始导入到vagrant中,导入完成后会有一个centos7-2004的box可以使用。

vagrant box add centos7-2004  CentOS-7-x86_64-Vagrant-2004_01.VirtualBox.box
## 查看导入的镜像
vagrant box list
centos7-2004 (virtualbox, 0)

init创建Vagrantfile 文件

vagrant init centos7-2004

编辑 Vagrantfile,执行base/common 的role来初始化系统。

Vagrant.configure("2") do |config|
config.vm.box = "centos7-2004"
config.vm.network "private_network", type: "dhcp"
config.vm.synced_folder ".", "/vagrant"
config.vm.provision "ansible_local" do |ansible|
ansible.playbook = "playbook.yml"
end
end
cat playbook.yml
---
- hosts: all # All Vagrant VMs
vars:
become: yes
become_user: 'root'
roles:
- base/common

执行vagrant up 启动虚拟机

vagrant up
Bringing machine 'default' up with 'virtualbox' provider...
==> default: Importing base box 'centos7-2004'...
==> default: Matching MAC address for NAT networking...
...

vagrant ssh 登录机器,确认 role 的执行结果。

机器初始化完成后,先关闭虚拟机(vagrant halt),然后执行打包命令(vagrant package)。后面把打包出来的 box 作为模板使用,在模板之上只需再执行相关的 role 即可。
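打包命令的一个最小示例如下(box 文件名可以通过 --output 自定义,不指定时默认生成 package.box):

vagrant halt
vagrant package --output package.box
# 打包完成后当前目录会生成 package.box,后续用 vagrant box add 导入即可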

基于模板的 box(package.box),安装 nginx 配置

第一步:先将打包完成的镜像导入。

vagrant box add centos7-init package.box

第二步:初始化配置文件

vagrant init centos7-init

编辑Vagrantfile,执行nginx 的role来安装nginx。

Vagrant.configure("2") do |config|
config.vm.box = "centos7-init"
config.vm.network "private_network", type: "dhcp"
config.vm.provision "ansible_local" do |ansible|
ansible.playbook = "playbook.yml"
end
end
cat playbook.yml
---
- hosts: all # All Vagrant VMs
roles:
- nginx

vagrant up 启动nginx虚拟机
vagrant ssh 连接上服务器,确认服务器的ip和nginx服务是否正常。

[root@localhost ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 52:54:00:4d:77:d3 brd ff:ff:ff:ff:ff:ff
inet 10.0.2.15/24 brd 10.0.2.255 scope global noprefixroute dynamic eth0
valid_lft 86183sec preferred_lft 86183sec
inet6 fe80::5054:ff:fe4d:77d3/64 scope link
valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 08:00:27:ae:57:2e brd ff:ff:ff:ff:ff:ff
inet 172.28.128.9/24 brd 172.28.128.255 scope global noprefixroute dynamic eth1
valid_lft 382sec preferred_lft 382sec
inet6 fe80::a00:27ff:feae:572e/64 scope link
valid_lft forever preferred_lft forever
[root@localhost ~]# curl -I 127.0.01 ## 验证nginx服务是否正常。
HTTP/1.1 200 OK
Server: nginx/1.17.6
Date: Wed, 27 May 2020 03:46:09 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Wed, 27 May 2020 03:15:14 GMT
Connection: keep-alive
ETag: "5ecddb42-264"
Accept-Ranges: bytes

[root@localhost ~]#

在本机执行curl命令,确认nginx服务。

curl 172.28.128.9 -I
HTTP/1.1 200 OK
Server: nginx/1.17.6
Date: Wed, 27 May 2020 03:47:06 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Wed, 27 May 2020 03:15:14 GMT
Connection: keep-alive
ETag: "5ecddb42-264"
Accept-Ranges: byte

基于VBoxManage 打包box镜像

如果你想将 VirtualBox 中已有的虚拟机打包导入到 Vagrant 中,可以使用如下命令

  • 查询虚拟机名称: VBoxManage list vms
  • 指定虚拟机名称来创建 Box: vagrant package --base centos7 --output centos7.box
  • 添加创建的Box到Vagrant环境中: vagrant box add centos7 centos7.box
  • 初始化运行环境并设置Vagrantfile: vagrant init centos7
  • 使用Vagrant运行虚拟机,vagrant up
$ VBoxManage list vms
"centos7" {28ec4326-eb24-4fa6-b867-e49955e18c1d}

$ vagrant package --base centos7 --output centos7.box
==> centos7: Exporting VM...
==> centos7: Compressing package to: /Users/niu/centos7.box

$ vagrant box add centos7 centos7.box
==> box: Box file was not detected as metadata. Adding it directly...
==> box: Adding box 'centos7' (v0) for provider:
box: Unpacking necessary files from: file:///Users/niu/centos7.box
==> box: Successfully added box 'centos7' (v0) for 'virtualbox'!

通过vm.define 管理多个虚拟机

编辑Vagrantfile配置文件。增加 vm.define 的配置

Vagrant.configure("2") do |config|
config.vm.box = "centos7-init"

config.vm.define "nginx" do |nginx|
nginx.vm.hostname = "nginx-01"
nginx.vm.network "private_network", type: "dhcp"
nginx.vm.provision "ansible_local" do |ansible|
ansible.playbook = "playbooks/nginx.yml"
end
nginx.vm.provider "virtualbox" do |v|
v.name = "v-nginx-01" ##用来在virtualbox 定义标识名称,默认是随机名称
v.memory = 512
v.cpus = 1
end
end
end
vagrant up nginx
Bringing machine 'nginx' up with 'virtualbox' provider...
==> nginx: Importing base box 'centos7-init'...
==> nginx: Matching MAC address for NAT networking...
==> nginx: Setting the name of the VM: jenkins-master
==> nginx: Clearing any previously set network interfaces...
==> nginx: Preparing network interfaces based on configuration...
nginx: Adapter 1: nat
nginx: Adapter 2: hostonly
==> nginx: Forwarding ports...
nginx: 22 (guest) => 2222 (host) (adapter 1)
==> nginx: Running 'pre-boot' VM customizations...
==> nginx: Booting VM...
==> nginx: Waiting for machine to boot. This may take a few minutes...
nginx: SSH address: 127.0.0.1:2222
nginx: SSH username: vagrant
nginx: SSH auth method: private key
==> nginx: Machine booted and ready!
[nginx] GuestAdditions 6.1.6 running --- OK.
==> nginx: Checking for guest additions in VM...
==> nginx: Setting hostname...
==> nginx: Configuring and enabling network interfaces...
==> nginx: Mounting shared folders...
nginx: /vagrant => /Users/niu/git/oschina/playbooks
==> nginx: Running provisioner: ansible_local...
nginx: Running ansible-playbook...

PLAY [all] *********************************************************************

TASK [Gathering Facts] *********************************************************
ok: [nginx]

TASK [nginx : add nginx group] *************************************************
ok: [nginx]

TASK [nginx : add nginx user] **************************************************
ok: [nginx]

TASK [nginx : install lib] *****************************************************
ok: [nginx]

TASK [nginx : add nginx tar file to /usr/local/src] ****************************
changed: [nginx]

TASK [nginx : Extract nginx.tar.gz into /usr/local/src/] **********************
changed: [nginx]

TASK [nginx : install nginx] ***************************************************
changed: [nginx]

TASK [nginx : change nginx directory mode] *************************************
changed: [nginx]

TASK [nginx : link nginx] ******************************************************
changed: [nginx]

TASK [nginx : add nginx path] **************************************************
changed: [nginx]

TASK [nginx : create nginx configure directory] ********************************
changed: [nginx]

TASK [nginx : copy nginx z-default.conf configure file] ************************
changed: [nginx]

TASK [nginx : copy nginx gzip.conf configure file] *****************************
changed: [nginx]

TASK [nginx : copy nginx 1-logformat configure file] ***************************
changed: [nginx]

TASK [nginx : copy nginx z-default.conf configure file] ************************
changed: [nginx]

TASK [nginx : copy nginx service file] *****************************************
changed: [nginx]

TASK [nginx : nginx service state] *********************************************
changed: [nginx]

RUNNING HANDLER [nginx : reload nginx] *****************************************
changed: [nginx]

RUNNING HANDLER [nginx : reload systemd] ***************************************
ok: [nginx]

PLAY RECAP *********************************************************************
nginx : ok=19 changed=14 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
vagrant ssh nginx
Last login: Wed May 27 03:01:07 2020 from 10.0.2.2
[vagrant@nginx-01 ~]$ hostname
nginx-01
[vagrant@nginx-01 ~]$

补充 vagrant 常见命令

命令 参数
vagrant init <名称> 初始化box的操作
vagrant box list 显示当前已经添加的box列表
vagrant box add <虚拟机名> <box文件名> 添加box的操作
vagrant box remove 名称 删除相应的box
vagrant up 启动虚拟机的操作
vagrant halt 关机
vagrant destroy 停止当前正在运行的虚拟机并销毁所有创建的资源
vagrant ssh 登录虚拟机的操作,也可以指定hostname登录
vagrant status 获取当前虚拟机的状态,也可以查看指定hostname
vagrant suspend 挂起当前的虚拟机
vagrant resume 恢复前面被挂起的状态
vagrant reload 重新启动虚拟机,主要用于重新载入配置文件
vagrant plugin 用于安装卸载插件
vagrant package 打包命令,可以把当前的运行的虚拟机环境进行打包
vagrant global-status 查看所有虚拟机的ID号
vagrant ssh-config 查看ssh登录信息,可以把这些信息 保存到.ssh文件下config中,先用vagrant ssh 登录,然后把宿主机的ssh公钥保存到虚拟机的authorized_keys文件里,然后在宿主机ssh <名称> 就可以免密码登录

报错汇总

    default: SSH address: 127.0.0.1:2222
default: SSH username: vagrant
default: SSH auth method: private key
Timed out while waiting for the machine to boot. This means that
Vagrant was unable to communicate with the guest machine within
the configured ("config.vm.boot_timeout" value) time period.

If you look above, you should be able to see the error(s) that
Vagrant had when attempting to connect to the machine. These errors
are usually good hints as to what may be wrong.

If you're using a custom box, make sure that networking is properly
working and you're able to connect to the machine. It is a common
problem that networking isn't setup properly in these boxes.
Verify that authentication configurations are also setup properly,
as well.

If the box appears to be booting properly, you may want to increase
the timeout ("config.vm.boot_timeout") value.

因为镜像是自己制作的,就导致了没法登录(默认使用公钥和私钥的方式)。

解决办法一:把官方的 vagrant 公钥导入到虚拟机的 authorized_keys 中

sudo -u vagrant wget https://raw.githubusercontent.com/mitchellh/vagrant/master/keys/vagrant.pub -O   /home/vagrant/.ssh/authorized_keys

vagrant 仓库地址: https://github.com/hashicorp/vagrant/tree/master/keys

解决办法二:自定义使用自己配置的用户名和密码

config.ssh.username = "root"
config.ssh.password = "redhat"

安装插件提示超时

timed out (https://rubygems.org/specs.4.8.gz)

Source: https://rubygems.org/

解决办法:

$ gem update --system # 这里请翻墙一下
$ gem -v
2.6.3

$ gem sources --add https://gems.ruby-china.com/ --remove https://rubygems.org/
$ gem sources -l
https://gems.ruby-china.com
# 确保只有 gems.ruby-china.com

vagrant plugin install vagrant-vbguest --plugin-clean-sources --plugin-source https://gems.ruby-china.com/

参考文档

官方镜像站
如果你要其他系统的镜像,可以来这里下载
virtualbox+vagrant学习-1-环境安装及vagrantfile的简单配置-Mac系统
Vagrant入门
利用Ansible将开发环境纳入版本管理
centos7 virtualbox 镜像下载
国内开源镜像站
Vagrant搭建虚拟机集群
制作 Vagrant Box
使用vagrant和vitrualBox搭建虚拟开发环境
Vagrant快速入门

使用 kubeadm 安装 kubernetes 集群

集群环境

主机名 IP 服务
node1 192.168.66.11 master
node2 192.168.66.12 nodes
node3 192.168.66.13 nodes

升级系统内核为 4.4

CentOS 7.x 系统自带的 3.10.x 内核存在一些 Bugs,导致运行的 Docker、Kubernetes 不稳定,我们需要先升级一下内核版本。

rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-3.el7.elrepo.noarch.rpm

yum --enablerepo=elrepo-kernel install -y kernel-lt kernel-lt-devel kernel-lt-headers

# 安装完成后检查 /boot/grub2/grub.cfg 中对应内核 menuentry 中是否包含 initrd16 配置,如果没有,再安装 一次!
grep "initrd16" /boot/grub2/grub.cfg
kernel_version=`grep "^menuentry" /boot/grub2/grub.cfg | cut -d "'" -f2 | grep "4.4"`

# 设置开机从新内核启动
grub2-set-default "$kernel_version"

# 确认修改成功
grub2-editenv list

关闭 NUMA

cp /etc/default/grub{,.bak}
vim /etc/default/grub # 在 GRUB_CMDLINE_LINUX 一行添加 `numa=off` 参数,如下所示:
diff /etc/default/grub.bak /etc/default/grub
6c6
< GRUB_CMDLINE_LINUX="crashkernel=auto rd.lvm.lv=centos/root rhgb quiet"
---
> GRUB_CMDLINE_LINUX="crashkernel=auto rd.lvm.lv=centos/root rhgb quiet numa=off"
cp /boot/grub2/grub.cfg{,.bak}
grub2-mkconfig -o /boot/grub2/grub.cfg

重启一下服务器确认内核版本升级成功
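例如可以这样确认(示例):

reboot
# 重启完成后检查当前内核版本是否已经是 4.4.x
uname -r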

初始化配置

修改主机名

hostnamectl set-hostname node1  # 在 node1 上执行
hostnamectl set-hostname node2  # 在 node2 上执行
hostnamectl set-hostname node3  # 在 node3 上执行


cat >> /etc/hosts<<EOF
192.168.66.11 node1
192.168.66.12 node2
192.168.66.13 node3
EOF

关闭系统不需要的服务

systemctl stop postfix && systemctl disable postfix

安装依赖

yum install -y conntrack ntpdate ntp ipvsadm ipset jq iptables curl sysstat libseccomp wget vim net-tools git lrzsz

修改防火墙

systemctl stop firewalld && systemctl disable firewalld
yum -y install iptables-services && systemctl start iptables && systemctl enable iptables
iptables -F && iptables -X && iptables -F -t nat && iptables -X -t nat && service iptables save

关闭swap和selinux

swapoff -a && sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
setenforce 0 && sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config

修改内核参数

cat > /etc/sysctl.d/kubernetes.conf <<EOF
net.bridge.bridge-nf-call-iptables=1 # 必备 开启桥接
net.bridge.bridge-nf-call-ip6tables=1 # 必备 开启桥接
net.ipv4.ip_forward=1 # 必备
net.ipv6.conf.all.disable_ipv6=1 # 必备 禁用ip 6
net.ipv4.tcp_tw_recycle=0
vm.swappiness=0 # 禁止使用 swap 空间,只有当系统 OOM 时才允许使用它
vm.overcommit_memory=1 # 不检查物理内存是否够用
vm.panic_on_oom=0 # 开启 OOM
fs.inotify.max_user_instances=8192
fs.inotify.max_user_watches=1048576
fs.file-max=52706963
fs.nr_open=52706963
net.netfilter.nf_conntrack_max=2310720
EOF

sysctl -p /etc/sysctl.d/kubernetes.conf

调整系统时区

# 设置系统时区为 中国/上海
timedatectl status # 查看时区状态
timedatectl set-timezone Asia/Shanghai # 设置时区为 Asia/Shanghai,实际修改的是 /etc/localtime 软链接: /etc/localtime -> ../usr/share/zoneinfo/Asia/Shanghai
timedatectl set-local-rtc 0 # 将硬件时钟设置为协调世界时(UTC)
# 重启依赖于系统时间的服务
systemctl restart rsyslog
systemctl restart crond

设置 rsyslogd 和 systemd journald

mkdir /var/log/journal # 持久化保存日志的目录
mkdir /etc/systemd/journald.conf.d
cat > /etc/systemd/journald.conf.d/99-prophet.conf <<EOF
[Journal]
# 持久化保存到磁盘
Storage=persistent
# 压缩历史日志
Compress=yes
SyncIntervalSec=5m
RateLimitInterval=30s
RateLimitBurst=1000
# 最大占用空间 10G
SystemMaxUse=10G
# 单日志文件最大 200M
SystemMaxFileSize=200M
# 日志保存时间 2 周
MaxRetentionSec=2week
# 不将日志转发到 syslog
ForwardToSyslog=no
EOF
systemctl restart systemd-journald

安装 Docker 软件

# step 1: 安装必要的一些系统工具
yum install -y yum-utils device-mapper-persistent-data lvm2
# Step 2: 添加软件源信息
yum-config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
# Step 3: 更新并安装Docker-CE
yum makecache fast
yum -y install docker-ce

## 创建 /etc/docker 目录 # 配置 daemon.

mkdir /etc/docker

cat > /etc/docker/daemon.json <<EOF
{
"registry-mirrors": ["https://it8jkcyv.mirror.aliyuncs.com"],
"exec-opts": ["native.cgroupdriver=systemd"],
"log-driver": "json-file",
"log-opts": {
"max-size": "100m"
}
}
EOF
mkdir -p /etc/systemd/system/docker.service.d
# 重启docker服务
systemctl daemon-reload && systemctl restart docker && systemctl enable docker
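docker 重启完成后,建议确认一下 cgroup driver 已经切换为 systemd(示例):

docker info 2>/dev/null | grep -i "cgroup driver"
# 期望输出: Cgroup Driver: systemd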

kubernetes 服务安装

kube-proxy开启ipvs的前置条件是安装lvs服务并启用。

# ipvsadm 服务在刚才安装过了。

modprobe br_netfilter
cat > /etc/sysconfig/modules/ipvs.modules <<EOF
#!/bin/bash
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack_ipv4
EOF

chmod 755 /etc/sysconfig/modules/ipvs.modules && bash /etc/sysconfig/modules/ipvs.modules && lsmod | grep -e ip_vs -e nf_conntrack_ipv4

安装 Kubeadm (主从配置)

# 添加yum源
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
### 安装服务
yum -y install kubeadm kubectl kubelet
## 默认启动服务
systemctl enable kubelet.service

通过 kubeadm init 初始化k8s环境。

kubeadm init --apiserver-advertise-address=192.168.66.11 --kubernetes-version=v1.18.0 --pod-network-cidr=10.244.0.0/16 --service-cidr=10.96.0.0/12 | tee kubeadm-init.log

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.66.11:6443 --token tdew8c.7b303fgva3zy3dub \
--discovery-token-ca-cert-hash sha256:ad7e4f4153a1647392e37c8d1daa4ecc5e2619c06ca166310303313fbe8efd81

基于输出的内容 执行命令

mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
kubectl get nodes
NAME STATUS ROLES AGE VERSION
node1 NotReady master 4m5s v1.18.5
# NotReady: 缺少网络环境,开始安装flannel

部署网络

wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

kubectl apply -f kube-flannel.yml

将 node2 和 node3 加入到集群中

[root@node2 ~]# kubeadm join 192.168.66.11:6443 --token tdew8c.7b303fgva3zy3dub \
--discovery-token-ca-cert-hash sha256:ad7e4f4153a1647392e37c8d1daa4ecc5e2619c06ca166310303313fbe8efd81
[root@node3 ~]# kubeadm join 192.168.66.11:6443 --token tdew8c.7b303fgva3zy3dub \
--discovery-token-ca-cert-hash sha256:ad7e4f4153a1647392e37c8d1daa4ecc5e2619c06ca166310303313fbe8efd81

确认nodes正常

kubectl get nodes
NAME STATUS ROLES AGE VERSION
node1 Ready master 22h v1.18.5
node2 Ready <none> 21h v1.18.5
node3 Ready <none> 21h v1.18.5

确认pods都正常

kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-66bff467f8-6wtcf 1/1 Running 3 21h
kube-system coredns-66bff467f8-zqt6t 1/1 Running 3 21h
kube-system etcd-node1 1/1 Running 3 21h
kube-system kube-apiserver-node1 1/1 Running 3 21h
kube-system kube-controller-manager-node1 1/1 Running 3 21h
kube-system kube-flannel-ds-amd64-88z8h 1/1 Running 2 20h
kube-system kube-flannel-ds-amd64-9jmjr 1/1 Running 2 20h
kube-system kube-flannel-ds-amd64-rk9kj 1/1 Running 4 21h
kube-system kube-proxy-77z8g 1/1 Running 2 20h
kube-system kube-proxy-h76lv 1/1 Running 3 21h
kube-system kube-proxy-pvpdr 1/1 Running 2 20h
kube-system kube-scheduler-node1 1/1 Running 3 21h

故障汇总

故障一 服务器重启后,所有的容器不能启动

原因:使用ansible执行初始化的时候,swap分区没修改成功。导致重启后swap分区还在启用。
解决办法: 关闭swap分区,重启。问题解决。
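对应的修复命令和前文初始化步骤一致,示例如下:

# 关闭当前 swap 并注释掉 fstab 中的 swap 挂载,防止重启后再次启用
swapoff -a && sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
free -m   # 确认 Swap 一行已全部为 0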

故障二 node2 服务不能连接到api服务

原因: 安装完IPtable以后没有清空自带的规则信息。

解决办法: iptables -F && iptables -X && iptables -F -t nat && iptables -X -t nat && service iptables save

参考文档

https://blog.csdn.net/qq_40806970/article/details/97645628

ipvsadm是LVS在应用层的管理命令,我们可以通过这个命令去管理LVS的配置。

命令介绍

向LVS系统中添加一个用于负载均衡的virtual server(VS)

基本用法:

ipvsadm COMMAND [protocol] service-address
[scheduling-method] [persistence options]

命令:

  • -A, –add-service: 为ipvs虚拟服务器添加一个虚拟服务,即添加一个需要被负载均衡的虚拟地址。虚拟地址需要是ip地址,端口号,协议的形式。
  • -E, –edit-service: 修改一个虚拟服务。
  • -D, –delete-service: 删除一个虚拟服务。
  • -a, –add-server: 为虚拟服务添加一个real server(RS)
  • -e, –edit-server: 修改RS记录
  • -d, –delete-server: 删除RS记录
  • -C, –clear: 清除所有虚拟服务。
  • -R, –restore: 从标准输入获取ipvsadm命令。一般结合下边的-S使用。
  • -S, –save: 从标准输出输出虚拟服务器的规则。可以将虚拟服务器的规则保存,在以后通过-R直接读入,以实现自动化配置。
  • -L, -l, –list: 列出虚拟服务表中的所有虚拟服务。可以指定地址。添加-c显示连接表。
  • -Z, –zero: 将所有数据相关的记录清零。这些记录一般用于调度策略。
  • –set tcp tcpfin udp: 修改协议的超时时间。
  • –start-daemon state: 设置虚拟服务器的备服务器,用来实现主备服务器冗余。(注: 该功能只支持ipv4)
  • –stop-daemon: 停止备服务器。
  • -h, –help: 帮助。
ipvsadm command [protocol] service-address
server-address [packet-forwarding-method]
[weight options]

用来添加或修改VS的配置,service address用来指定涉及的虚拟服务即虚拟地址,server-address指定涉及的真实地址。

参数:

  • -t, --tcp-service service-address: 指定虚拟服务为tcp服务。service-address要是host[:port]的形式。端口是0表示任意端口。如果需要将端口设置为0,还需要加上-p选项(持久连接)。

  • -u, --udp-service service-address: 使用udp服务,其他同上。

  • -f, --fwmark-service integer: 用firewall mark取代虚拟地址来指定要被负载均衡的数据包,可以通过这个命令把不同地址、端口的虚拟地址整合成一个虚拟服务,让虚拟服务器同时截获处理去往多个不同地址的数据包。fwmark可以通过iptables命令指定。如果用在ipv6需要加上-6。

  • -s, --scheduler scheduling-method: 指定调度算法。常用的调度算法有以下10种: rr(轮询),wrr(加权轮询),lc(最少连接),wlc(加权最少连接),lblc(基于局部性的最少连接),lblcr(带复制的基于局部性最少连接),dh(目的地址哈希),sh(源地址哈希),sed(最短期望延迟),nq(永不排队)。

  • -p, --persistent [timeout]: 设置持久连接,这个模式可以使来自同一客户的多个请求被送到同一个真实服务器,通常用于ftp或者ssl中。

  • -M, --netmask netmask: 指定客户地址的子网掩码。用于将同属一个子网的客户的请求转发到相同服务器。

  • -r, --real-server server-address: 为虚拟服务指定数据可以转发到的真实服务器的地址。可以添加端口号。如果没有指定端口号,则等效于使用虚拟地址的端口号。

  • -w, --weight weight: 设置权重。权重是0~65535的整数。如果将某个真实服务器的权重设置为0,那么它不会收到新的连接,但是已有连接还会继续维持(这点和直接把某个真实服务器删除是不同的)。

  • -x, --u-threshold uthreshold: 设置一个服务器可以维持的连接上限,取值0~65535。设置为0表示没有上限。

  • -y, --l-threshold lthreshold: 设置一个服务器的连接下限。当服务器的连接数低于此值的时候服务器才可以重新接收连接。如果此值未设置,则当服务器的连接数连续三次低于uthreshold时服务器才可以接收到新的连接。(PS: 笔者以为此设定可能是为了防止服务器在能否接收连接这两个状态上频繁变换)

  • --mcast-interface interface: 指定主备服务器同步时使用的组播接口。

  • --syncid syncid: 指定syncid,同样用于主备服务器的同步。

  • [packet-forwarding-method]: 此选项指定某个真实服务器所使用的数据转发模式。需要对每个真实服务器分别指定模式。

    • -g, --gatewaying: 使用网关(即直接路由),此模式是默认模式。
    • -i, --ipip: 使用ipip隧道模式。
    • -m, --masquerading: 使用NAT模式。

以下选项用于list命令:

  • -c, --connection: 列出当前的IPVS连接。
  • --timeout: 列出tcp、tcpfin、udp的超时时间。
  • --daemon: 列出连接同步守护进程(主/备)的状态。
  • --stats: 列出统计信息。
  • --rate: 列出速率信息(每秒连接数、包数、字节数)。
  • --thresholds: 列出阈值。
  • --persistent-conn: 列出持久连接。
  • --sort: 对列表进行排序。
  • --nosort: 不排序。
  • -n, --numeric: 以数字形式显示,不对ip地址做dns反向解析。
  • --exact: 显示精确值,不做单位换算。
  • -6: 如果fwmark用的是ipv6地址需要指定此选项。

其他注意事项

  • 如果使用IPv6地址,需要在地址两端加上"[]"。例如: ipvsadm -A -t [2001:db8::80]:80 -s rr
  • 可以通过设置以下虚拟文件的值来防御DoS攻击(示例见下): /proc/sys/net/ipv4/vs/drop_entry /proc/sys/net/ipv4/vs/drop_packet /proc/sys/net/ipv4/vs/secure_tcp
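
下面是打开这几个防御开关的一个简单示意(默认值都是 0 即关闭,具体取值含义请参考内核 ipvs 文档):

echo 1 > /proc/sys/net/ipv4/vs/drop_entry
echo 1 > /proc/sys/net/ipv4/vs/drop_packet
echo 1 > /proc/sys/net/ipv4/vs/secure_tcp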

示例

使用DR模式

主机列表及IP地址划分

通过访问192.168.66.250,利用DR模式把请求在node2、node3两台后端机器上轮询。

主机名    IP               VIP                   功能
node1     192.168.66.11    192.168.66.250(DR)    lvs
node2     192.168.66.12    -                     nginx
node3     192.168.66.13    -                     nginx

node1 机器配置
安装 lvs软件

1
yum install ipvsadm -y

开启内核转发

1
2
3
cat /proc/sys/net/ipv4/ip_forward
echo 1 > /proc/sys/net/ipv4/ip_forward
cat /proc/sys/net/ipv4/ip_forward

开始添加配置
添加192.168.66.250的vip配置,指定轮询方式为rr

1
ipvsadm -A -t 192.168.66.250:80 -s rr

查看配置

1
2
3
4
5
ipvsadm -Ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 192.168.66.250:80 rr

添加真实服务器,指定传输模式为DR

1
2
ipvsadm -a -t 192.168.66.250:80 -r 192.168.66.12:80 -g
ipvsadm -a -t 192.168.66.250:80 -r 192.168.66.13:80 -g

tips: DR 模式通过修改mac地址来实现路由转发,所以vip的端口,必须和后端服务器的端口保持一致。

查看配置

1
2
3
4
5
6
7
ipvsadm -Ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 192.168.66.250:80 rr
-> 192.168.66.12:80 Route 1 0 0
-> 192.168.66.13:80 Route 1 0 0

node2 node3 服务器配置
安装 NGINX 软件

1
2
yum install nginx -y
systemctl start nginx

创建一个HTML文件用来验证

1
echo $HOSTNAME > /usr/share/nginx/html/1.html

结果确认

1
2
3
4
curl 192.168.66.12/1.html
node2
curl 192.168.66.13/1.html
node3

抑制ARP响应:

1
2
3
4
echo "1" > /proc/sys/net/ipv4/conf/lo/arp_ignore
echo "2" > /proc/sys/net/ipv4/conf/lo/arp_announce
echo "1" > /proc/sys/net/ipv4/conf/all/arp_ignore
echo "2" > /proc/sys/net/ipv4/conf/all/arp_announce

绑定vip地址

1
2
ifconfig lo:0 192.168.66.250/32 up
route add -host 192.168.66.250 dev lo
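
注意:上面的 echo、ifconfig、route 配置重启后都会丢失。如果希望开机自动生效,可以把这些命令追加到 /etc/rc.local(仅为示意,生产环境一般用脚本或 keepalived 统一管理):

cat >> /etc/rc.local <<'EOF'
echo 1 > /proc/sys/net/ipv4/conf/lo/arp_ignore
echo 2 > /proc/sys/net/ipv4/conf/lo/arp_announce
echo 1 > /proc/sys/net/ipv4/conf/all/arp_ignore
echo 2 > /proc/sys/net/ipv4/conf/all/arp_announce
ifconfig lo:0 192.168.66.250/32 up
route add -host 192.168.66.250 dev lo
EOF
chmod +x /etc/rc.local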

node1 验证结果

1
2
3
4
[root@node1 vagrant]# curl http://192.168.66.250/1.html
node2
[root@node1 vagrant]# curl http://192.168.66.250/1.html
node3

回滚配置

node1删除真实服务器地址

1
2
ipvsadm -d -t 192.168.66.250:80 -r 192.168.66.12:80
ipvsadm -d -t 192.168.66.250:80 -r 192.168.66.13:80

node1删除vip配置

1
ipvsadm -D -t 192.168.66.250:80

查看配置

1
2
3
4
ipvsadm -Ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn

node2,node3 关闭arp抑制

1
2
3
4
echo "0" > /proc/sys/net/ipv4/conf/lo/arp_ignore
echo "0" > /proc/sys/net/ipv4/conf/lo/arp_announce
echo "0" > /proc/sys/net/ipv4/conf/all/arp_ignore
echo "0" > /proc/sys/net/ipv4/conf/all/arp_announce

node2,node3 删除网卡及路由规则

1
2
ifconfig lo:0 down
route del 192.168.66.250

使用NAT模式

主机列表及IP地址划分

通过访问192.168.66.251,利用NAT模式把请求在node2、node3两台后端机器上轮询。

主机名    IP               VIP                    功能
node1     192.168.66.11    192.168.66.251(NAT)    lvs
node2     192.168.66.12    -                      nginx
node3     192.168.66.13    -                      nginx

node1 上操作

添加地址为192.168.66.251:80的虚拟服务,指定调度算法为轮转。

1
ipvsadm -A -t 192.168.66.251:80 -s rr

添加真实服务器,指定传输模式为NAT

1
2
ipvsadm -a -t 192.168.66.251:80 -r 192.168.66.12:80 -m
ipvsadm -a -t 192.168.66.251:80 -r 192.168.66.13:80 -m

查看配置

1
2
3
4
5
6
7
ipvsadm -Ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 192.168.66.251:80 rr
-> 192.168.66.12:80 Masq 1 0 0
-> 192.168.66.13:80 Masq 1 0 0

NAT模式是lvs三种模式中配置最简单的一种。此模式下调度服务器需要开启内核转发;一般还要把真实服务器的默认网关指向调度服务器,让响应流量经过LVS返回(见下面的示意)。
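
在真实环境里,一般需要在 node2、node3 上把默认网关指向调度器 192.168.66.11,响应流量才会经过 LVS 返回;本实验的客户端就是 node1 本身,所以不改网关也能通。下面仅为示意(会修改默认路由,请谨慎执行):

# node2、node3 上执行(示意)
ip route replace default via 192.168.66.11
# node1 上确认内核转发已开启
cat /proc/sys/net/ipv4/ip_forward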

node1 验证结果

1
2
3
4
5
[root@node1 vagrant]# curl http://192.168.66.251/1.html
node3
[root@node1 vagrant]# curl http://192.168.66.251/1.html
node2
[root@node1 vagrant]#

查询lvs的配置信息

1
2
3
4
5
6
7
ipvsadm -Ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 192.168.66.251:80 rr
-> 192.168.66.12:80 Masq 1 0 0
-> 192.168.66.13:80 Masq 1 0 0
1
2
3
4
5
6
7
ipvsadm -L --stats
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Conns InPkts OutPkts InBytes OutBytes
-> RemoteAddress:Port
TCP 192.168.66.251:http 2 12 8 808 912
-> node2:http 1 6 4 404 456
-> node3:http 1 6 4 404 456

说明:

  1. Conns (connections scheduled) 已经转发过的连接数
  2. InPkts (incoming packets) 入包个数
  3. OutPkts (outgoing packets) 出包个数
  4. InBytes (incoming bytes) 入流量(字节)
  5. OutBytes (outgoing bytes) 出流量(字节)
1
2
3
4
5
6
7
ipvsadm -L --rate
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port CPS InPPS OutPPS InBPS OutBPS
-> RemoteAddress:Port
TCP 192.168.66.251:http 0 0 0 0 0
-> node2:http 0 0 0 0 0
-> node3:http 0 0 0 0 0

说明:

  1. CPS (current connection rate) 每秒连接数
  2. InPPS (current in packet rate) 每秒的入包个数
  3. OutPPS (current out packet rate) 每秒的出包个数
  4. InBPS (current in byte rate) 每秒入流量(字节)
  5. OutBPS (current out byte rate) 每秒出流量(字节)

保存和重载规则

-S:使用输出重定向进行规则保存

1
2
3
4
5
ipvsadm -S > lvs_`date +%F`.txt
cat lvs_2020-09-03.txt
-A -t 192.168.66.251:http -s rr
-a -t 192.168.66.251:http -r node2:http -m -w 1
-a -t 192.168.66.251:http -r node3:http -m -w 1

-R:使用输入重定向载入规则

1
2
3
4
5
6
7
8
ipvsadm -R < lvs_2020-09-03.txt
ipvsadm -Ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 192.168.66.251:80 rr
-> 192.168.66.12:80 Masq 1 0 0
-> 192.168.66.13:80 Masq 1 0 0
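
如果希望规则开机自动加载,CentOS 的 ipvsadm 软件包一般自带 ipvsadm 服务,启动时会从 /etc/sysconfig/ipvsadm 恢复规则(以实际系统为准,下面仅为示意):

ipvsadm-save -n > /etc/sysconfig/ipvsadm
systemctl enable ipvsadm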

清空所有配置

-Z:清空计数器
-C:清空ipvs规则

1
2
3
4
5
6
ipvsadm -Z
ipvsadm -C
ipvsadm -Ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn

参考文档

负载均衡集群ipvsadm命令及基本用法

第一步创建ansible自定义模块路径

1
2
cd /data/db/playbooks/
mkdir -p library

vim ansible.cfg 增加如下内容:

1
2
[defaults]
library = ./library

下面我们开始第一个模块开发

创建第一个模块

vim library/myinfo.py

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Time : 2018/11/12 12:00 PM
# @Author : biglittle
# @Contact : biglittleant@hotmail.com
# @Site :
# @File : myinfo.py
# @Software: PyCharm
# @Desc : python file
# @license : Copyright(C), Your Company

from ansible.module_utils.basic import *

# 实例化一个module,因为不需要参数所以argument_spec初始化参数为空字典。
module = AnsibleModule(
argument_spec = dict(),
)
output="hello word!"

result = dict(module='myinfo',stdout=output,changed=False,rc=0)

module.exit_json(**result)

1
2
3
4
5
6
7
8
9
10
ansible -i inventory/devlop linux-node1 -m myinfo
linux-node1 | SUCCESS => {
"changed": false,
"module": "myinfo",
"rc": 0,
"stdout": "hello word!",
"stdout_lines": [
"hello word!"
]
}

创建一个带参数的脚本

vim library/myinfo_args.py

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Time : 2018/11/12 12:00 PM
# @Author : biglittle
# @Contact : biglittleant@hotmail.com
# @Site :
# @File : myinfo_args.py
# @Software: PyCharm
# @Desc : python file
# @license : Copyright(C), Your Company

from ansible.module_utils.basic import *

module = AnsibleModule(
argument_spec = dict(
msg=dict(required=True),
),
)
msg = module.params['msg']

result = dict(module='myinfo_args',stdout=msg,changed=False,rc=0)

module.exit_json(**result)

执行验证

1
2
3
4
5
6
7
8
9
10
ansible -i inventory/devlop linux-node1 -m myinfo_args -a "msg='new word'"
linux-node1 | SUCCESS => {
"changed": false,
"module": "myinfo_args",
"rc": 0,
"stdout": "new word",
"stdout_lines": [
"new word"
]
}

来一个不带参数的执行

1
2
3
4
5
ansible -i inventory/devlop linux-node1 -m myinfo_args
linux-node1 | FAILED! => {
"changed": false,
"msg": "missing required arguments: msg"
}

自己实现一个shell模块

vim library/myshell.py

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Time : 2020年01月10日16:39:09
# @Author : biglittle
# @Contact : biglittleant@hotmail.com
# @Site :
# @File : myshell.py
# @Software: PyCharm
# @Desc : python file
# @license : Copyright(C), Your Company
from ansible.module_utils.basic import AnsibleModule
import commands


def main():
    """
    run shell
    """
    changed = False
    module = AnsibleModule(
        argument_spec = dict(
            cmd = dict(type='str', required=True),
        ),
    )
    cmd = module.params['cmd']

    code,output = commands.getstatusoutput(cmd)
    if code == 0:
        # 按照ansible 的返回格式定义返回内容,stdout为标准输出,changed代表系统有没有东西被变更,rc=0代表执行成功
        result = dict(stdout=output,changed=changed,rc=0)
        # 使用ansible规则的module实例下的exit_json返回正常内容
        module.exit_json(**result)
    else:
        # 当调用失败返回错误信息的时候,数据字典只要传递msg信息就可了,然后调用module实例的fail_json方法给返回
        result = dict(msg=output,rc=code)
        module.fail_json(**result)


if __name__ == '__main__':
    main()

执行一个正确的命令

1
2
3
4
5
6
7
8
9
ansible -i inventory/devlop linux-node1 -m myshell -a cmd='pwd'
linux-node1 | SUCCESS => {
"changed": false,
"rc": 0,
"stdout": "/home/niu",
"stdout_lines": [
"/home/niu"
]
}
1
2
3
4
5
6
ansible -i inventory/devlop linux-node1 -m myshell -a cmd='pws'
linux-node1 | FAILED! => {
"changed": false,
"msg": "sh: pws: command not found",
"rc": 32512
}

如果在失败分支里不定义 result = dict(msg=output,rc=code),result 变量就不存在,模块会在调用 fail_json 前抛出 UnboundLocalError,ansible 会给出一个默认的错误返回,输出类似下面的情况。

1
2
3
4
5
6
7
8
ansible -i inventory/devlop linux-node1 -m myshell -a cmd='pws'
linux-node1 | FAILED! => {
"changed": false,
"module_stderr": "Shared connection to linux-node1 closed.\r\n",
"module_stdout": "\r\nTraceback (most recent call last):\r\n File \"/home/niu/.ansible/tmp/ansible-tmp-1578648957.53-161256530271090/AnsiballZ_myshell.py\", line 113, in <module>\r\n _ansiballz_main()\r\n File \"/home/niu/.ansible/tmp/ansible-tmp-1578648957.53-161256530271090/AnsiballZ_myshell.py\", line 105, in _ansiballz_main\r\n invoke_module(zipped_mod, temp_path, ANSIBALLZ_PARAMS)\r\n File \"/home/niu/.ansible/tmp/ansible-tmp-1578648957.53-161256530271090/AnsiballZ_myshell.py\", line 48, in invoke_module\r\n imp.load_module('__main__', mod, module, MOD_DESC)\r\n File \"/tmp/ansible_myshell_payload_XylgIF/__main__.py\", line 40, in <module>\r\n File \"/tmp/ansible_myshell_payload_XylgIF/__main__.py\", line 33, in main\r\nUnboundLocalError: local variable 'result' referenced before assignment\r\n",
"msg": "MODULE FAILURE\nSee stdout/stderr for the exact error",
"rc": 1
}

参数解释

argument_spec 支持的参数

例子:

1
2
3
4
5
6
7
module = AnsibleModule(
    argument_spec = dict(
        name = dict(type='str', required=True),
        cwd = dict(type='str', required=False),
        shell = dict(type='bool', default=True),
    ),
)

官方的ping模块分析

模块路径:https://github.com/ansible/ansible/blob/devel/lib/ansible/modules/system/ping.py

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
from ansible.module_utils.basic import AnsibleModule


def main():
    # 实例化了一个类
    module = AnsibleModule(
        argument_spec=dict(
            data=dict(type='str', default='pong'),
        ),
        supports_check_mode=True
    )
    # 判断参数是否为crash,如果是就抛出异常
    if module.params['data'] == 'crash':
        raise Exception("boom")
    # 正常情况下,定义个字典。ping=data,data默认是pong
    result = dict(
        ping=module.params['data'],
    )
    # 返回结果
    module.exit_json(**result)


if __name__ == '__main__':
    main()
1
2
3
4
5
6
7
8
9
10
11
12
ansible -i inventory/devlop  linux-node1 -m ping
linux-node1 | SUCCESS => {
"changed": false,
"ping": "pong"
}

ansible -i inventory/devlop linux-node1 -m ping -a "data=abcd"
linux-node1 | SUCCESS => {
"changed": false,
"ping": "abcd"
}

ansible模块中的输出定制

changed: 平时我们使用别人的模块,会发现返回里有 changed 字段,有时是 true,有时是 false。ansible 的约定是:对系统做了更改的 changed 为 true,未做更改的为 false,它本质上只是一个记录值。要改变 changed 的值,只需要在返回的字典里设置 changed 即可,例如 result = dict(changed=False, stdout=output)。

ansible模块中的退出状态处理

  • 正常退出:module.exit_jons

  • 错误退出:module.fail_json

错误退出比较不一样的是,你要传递的参数是msg: result = dict(msg=output,rc=code)

报错汇总

1
ERROR! this task 'myinfo_args' has extra params, which is only allowed in the following modules: shell, win_shell, include_vars, add_host, raw, include_role, meta, set_fact, include, import_tasks, script, import_role, include_tasks, group_by, command, win_command

命令执行错了 ansible -i inventory/devlop linux-node1 -m myinfo_args -a 'new word'

正确的命令:ansible -i inventory/devlop linux-node1 -m myinfo_args -a "msg='new word'"

参考文档

Ansible模块开发-自定义模块

如何自定义变量

1
2
3
- hosts: webservers
  vars:
    http_port: 80

可以在temple 文件中使用 {{ }} 来使用变量

1
My amp goes to {{ max_amp_value }}

也可以在写playbook的时候使用变量

1
template: src=foo.cfg.j2 dest={{ remote_install_path }}/foo.cfg

这里有个小技巧,使用变量时,要用双引号引用。
错误写法

1
2
3
- hosts: app_servers
  vars:
    app_path: {{ base_path }}/22

正确写法

1
2
3
- hosts: app_servers
  vars:
    app_path: "{{ base_path }}/22"

facts: 系统的信息

从远程节点搜集到的系统信息称为facts。facts包括远程主机的IP地址、操作系统类型、磁盘等相关信息。

执行下边的命令来查看都有哪些信息:

1
- debug: var=ansible_facts
1
ansible test -m setup

返回内容类似下边这样:

1
2
3
4
5
6
7
8
9
10
11
12
ansible linux-node2 -m setup  |head
linux-node2 | SUCCESS => {
"ansible_facts": {
"ansible_all_ipv4_addresses": [
"10.0.2.12",
"192.168.56.12"
],
"ansible_all_ipv6_addresses": [
"fe80::250:56ff:fe3f:cc94",
"fe80::250:56ff:fe30:8f96"
],
......省略若干行内容......

可以在temple模板中使用上边返回的值:

1
2
{{ ansible_devices.sda.model }}
{{ ansible_facts['devices']['xvda']['model'] }}

要获取系统的主机名:

1
{{ ansible_hostname }}

facts会经常在条件语句及模板中使用。如,根据不同的linux发行版,执行不同的包管理程序。
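
比如可以先用 setup 模块过滤出发行版相关的 facts,再在 playbook 的 when 条件里引用它们(示意):

ansible linux-node2 -m setup -a "filter=ansible_os_family"
ansible linux-node2 -m setup -a "filter=ansible_distribution*"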

关闭facts

如果你确信不需要主机的任何facts信息,而且对远程节点主机都了解的很清楚,那么可以将其关闭。远程操作节点较多的时候,关闭facts会提升ansible的性能。

只需要在play中设置如下:

1
2
- hosts: whatever
  gather_facts: no

本地facts(facts.d)

如果远程节点系统上存在 /etc/ansible/facts.d 目录,这个目录下以 .fact 为后缀的文件,内容可以用 JSON 或 ini 格式书写,也可以是一个能返回 json 格式数据的可执行文件,都可以用来提供本地 facts 信息。

在远程节点创建一个/etc/ansible/facts.d/preferences.fact的文件,内容如下:

1
2
3
4
5
6
mkdir -p /etc/ansible/facts.d
cat >>/etc/ansible/facts.d/preferences.fact<<EOF
[general]
name=linux
test=True
EOF

在控制节点获取自定义的信息:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
ansible linux-node2 -m setup -a "filter=ansible_local"
linux-node2 | SUCCESS => {
"ansible_facts": {
"ansible_local": {
"preferences": {
"general": {
"name": "linux",
"test": "True"
}
}
}
},
"changed": false
}

可以在playbooks或者模板中这样使用获取到的信息:

1
{{ ansible_local.preferences.general.name }}

linux-node2 服务器配置

1
2
3
4
5
6
7
8
mkdir /etc/ansible/facts.d -p

cat /etc/ansible/facts.d/ini.fact
[mytest]
niu=key

cat /etc/ansible/facts.d/niu.fact
{ "name":"shencan" , "list":["three","one","two"], "Dict": {"A":"B"} }

结果展示

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
ansible -i inventory/ linux-node2 -m setup -a "filter=ansible_local"
linux-node2 | SUCCESS => {
"ansible_facts": {
"ansible_local": {
"ini": {
"mytest": {
"niu": "key"
}
},
"niu": {
"Dict": {
"A": "B"
},
"list": [
"three",
"one",
"two"
],
"name": "shencan"
},
"preferences": {
"general": {
"name": "linux",
"test": "True"
}
}
}
},
"changed": false
}

缓存 facts数据

编辑 ansible.cfg 添加如下内容,让数据缓存到redis中。

1
2
3
4
5
[defaults]
gathering = smart
fact_caching = redis
fact_caching_timeout = 86400
# seconds

需要安装相关的依赖程序

1
2
3
yum install redis
service redis start
pip install redis
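
配置完成后可以先跑一次 setup,再到 redis 里确认缓存是否写入(key 的具体命名以实际 ansible 版本为准,下面仅为示意):

ansible all -m setup > /dev/null
redis-cli keys 'ansible*'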

使用facter扩展facts信息

使用过Puppet的读者都熟悉facter是Puppet里面一个负责收集主机静态信息的组件,Ansible的facts功能也一样。Ansible的facts组件也会判断被控制机器上是否安装有facter和ruby-json包,如果存在的话,Ansible的facts也会采集facter信息。我们来查看以下机器信息:

1
2
3
ansible linux-node2 -m shell -a 'rpm -qa ruby-json facter'

ansible linux-node2 -m facter

当然,如果直接运行setup模块也会采集facter信息
所有facter信息在ansible_facts下以facter_开头,这些信息的引用方式跟Ansible自带facts组件收集的信息引用方式一致。
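
如果查询发现机器上还没有安装 facter,可以先用 ansible 批量装上再验证(包名以发行版仓库为准,仅为示意):

ansible linux-node2 -m yum -a "name=facter state=present"
ansible linux-node2 -m facter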

使用ohai扩展facts信息

ohai是Chef配置管理工具中检测节点属性的工具,Ansible的facts也支持ohai信息的采集。当然需要被管机器上安装ohai包。下面介绍ohai相关信息的采集:

1
ansible linux-node2 -m shell -a 'gem list|grep ohai'

如果主机上没有安装ohai包,可以使用gem方式进行安装。如果存在ohai包,可以直接运行ohai模块查看ohai属性:

1
ansible linux-node2 -m ohai

如果直接运行setup模块,也会采集ohai信息

register 获取执行命令的输出

register方式用于在task之间传递变量。

刚开始使用 ansible-playbook 做应用程序部署时,部署过程中会用 command 或 shell 模块执行一些自定义脚本,这些脚本都会有输出,用来表示执行正常还是失败。如果是自己写脚本做部署,这很好实现;那么用 Ansible 时,要怎样获取 playbook 中 command 模块的输出呢?Ansible 提供的解决办法就是 register 关键字:它可以把指定任务的输出结果存储到一个自定义的变量中,访问这个变量就能拿到命令的输出。register 的使用很方便,只需要在 task 中声明 register 关键字,并自定义一个变量名即可。如下:

1
2
3
4
5
6
7
8
9
10
- hosts: web_servers

  tasks:

    - shell: /usr/bin/foo
      register: foo_result
      ignore_errors: True

    - shell: /usr/bin/bar
      when: foo_result.rc == 5

register 中存放的数据其实是 ansible stdout 返回的 json 数据,拿到 json 数据后就可以做相关使用和处理了。可以使用 -v 查看上一个 task 的 stdout 数据。

1
2
3
4
5
6
7
- name: echo date
  command: date
  register: date_output

- name: echo date_output
  command: echo "30"
  when: date_output.stdout.split(' ')[2] == "30"

这里第 1 个 task 执行了 date 命令,register 关键字把 date 命令的输出存储到 date_output 变量中。第 2 个 task 对输出进行分析,并用 when 关键字对分析结果做判断:匹配则执行这个 task,不匹配就不执行。这里要重点说一下,register 获取到的输出内容都是字符串,而 ansible 又是 python 写的,所以可以直接用 python 字符串的方法对其做处理,比如本文中使用的 split,还可以使用 find 方法,非常灵活方便。

注册变量

1
register: xxx
  • xxx.stdout.find('some_string')
  • xxx.stdout_lines

使用debug模块查看register中的数据

1
2
3
4
5
6
7
8
9
---
- hosts: all
  gather_facts: False
  tasks:
    - name: register variable
      shell: hostname
      register: info
    - name: display variable
      debug: msg="The variable is {{ info }}"

内置变量

ansible 默认内置了以下变量,可以直接使用:hostvars、groups、group_names 和 inventory_hostname。

hostvars

hostvars 是用来调用指定主机变量的,需要传入主机信息,返回结果是一个 JSON 字符串,同样可以直接引用 JSON 字符串内的指定信息。如果主机执行过 facts 采集,则 hostvars 中也会包含该主机的 facts 数据。

1
2
3
4
5
- name: Get the masters IP
  set_fact: dns_master="{{ hostvars.ns1.ansible_default_ipv4.address }}"

- name: Configure BIND
  template: dest=/etc/named.conf src=templates/named.conf.j2
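
也可以用 debug 模块在命令行里直接查看某台主机的 hostvars,确认里面到底有哪些键(inventory 路径按实际情况替换,仅为示意):

ansible -i inventory/devlop linux-node1 -m debug -a "var=hostvars[inventory_hostname]"
ansible -i inventory/devlop linux-node1 -m debug -a "var=hostvars['linux-node1'].group_names"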

hostvars 例子

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
{
'linux-node1': {
'inventory_file': '/data/db/playbooks/inventory/open_falcon',
'ansible_playbook_python': '/usr/bin/python2',
'falcon_mysql_port': 3306,
'ansible_check_mode': True,
'falcon_mysql_passwd': 123456,
'ansible_diff_mode': True,
'open_falcon_path': '/data/app/open-falcon',
'groups': {
'ungrouped': [],
'judge': ['linux-node2_judge1', 'linux-node2_judge2'],
'all': ['linux-node1', 'linux-node2_judge1', 'linux-node2_judge2'],
'open_falcon': ['linux-node2_judge1', 'linux-node2_judge2', 'linux-node1'],
'api': ['linux-node1']
},
'ansible_forks': 5,
'ansible_facts': {},
'inventory_hostname': 'linux-node1',
'ansible_inventory_sources': ['/data/db/playbooks/inventory/open_falcon'],
'target': 'judge',
'inventory_hostname_short': 'linux-node1',
'playbook_dir': '/data/db/playbooks/playbooks',
'omit': '__omit_place_holder__38ba14590a3d51c234a21046ca130054f96cc570',
'ansible_skip_tags': [],
'inventory_dir': '/data/db/playbooks/inventory',
'ansible_verbosity': 0,
'role': 'test',
'group_names': ['api', 'open_falcon'],
'falcon_mysql_host': '192.168.56.12',
'ansible_run_tags': ['test_vars'],
'ansible_version': {
'major': 2,
'full': '2.7.1',
'string': '2.7.1',
'minor': 7,
'revision': 1
},
'falcon_mysql_name': 'falcon'
},
'linux-node2_judge2': {
'ansible_hostname': 'linux-node2',
'ansible_ssh_port': 52113,
'ansible_ssh_host': 'linux-node2',
'ansible_system': 'Linux',
'playbook_dir': '/data/db/playbooks/playbooks',
'ansible_run_tags': ['test_vars'],
'judge_http_port': 6081,
'ansible_python_version': '2.7.5'
}
}

groups

groups 变量是一个全局变量,引用了 inventory 文件里所有的主机以及主机组信息,它返回的是一个 JSON 字符串。
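
可以用 debug 模块快速查看 groups 的内容,输出就类似下面这个 JSON(inventory 路径按实际情况替换):

ansible -i inventory/devlop linux-node1 -m debug -a "var=groups"
ansible -i inventory/devlop linux-node1 -m debug -a "var=groups['all']"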

1
2
3
4
5
6
7
{
    'ungrouped': [],
    'judge': ['linux-node2_judge1', 'linux-node2_judge2'],
    'all': ['linux-node1', 'linux-node2_judge1', 'linux-node2_judge2'],
    'open_falcon': ['linux-node2_judge1', 'linux-node2_judge2', 'linux-node1'],
    'api': ['linux-node1']
}

playbook 中可以通过如下几种方式来引用:

1
2
3
- name: Create a user for all app servers
  mysql_user: name=kate password=test host={{ hostvars[item].ansible_eth0.ipv4.address }} state=present
  with_items: "{{ groups['appservers'] }}"
1
2
3
{% for host in groups['app_servers'] %}
{{ hostvars[host]['ansible_facts']['eth0']['ipv4']['address'] }}
{% endfor %}
1
2
3
4
5
etcd_initial_cluster: |-
  {% for item in groups['etcd'] -%}
  {{ hostvars[item]['node_name'] }}=http://{{ hostvars[item]['ansible_default_ipv4']['address'] }}:2380{% if not loop.last %},{% endif %}
  {%- endfor %}
etcd_initial_cluster_state: "new"

group_names

group_names引用当前主机所在的group的名称

1
2
3
{% if 'webserver' in group_names %}
# some part of a configuration file that only applies to webservers
{% endif %}
1
2
3
4
5
6
7
- name: For secure machines
  set_fact: sshconfig=files/ssh/sshd_config_secure
  when: "'secure' in group_names"

- name: For non-secure machines
  set_fact: sshconfig=files/ssh/sshd_config_default
  when: "'secure' not in group_names"

  • inventory_hostname 变量保存了在设备配置清单中服务器的主机名
  • inventory_hostname_short 变量跟inventory_hostname一样,只是去掉域名,比如inventory_hostname 是host.example 那么inventory_hostname_short就是host
  • inventory_dir 是设备清单文件的路径
  • inventory_file 是设备清单文件的文件名

通过 vars_files 指定变量文件

1
2
3
4
5
6
7
8
9
10
11
12
13
---

- hosts: all
  remote_user: root
  vars:
    favcolor: blue
  vars_files:
    - /vars/external_vars.yml

  tasks:

    - name: this is just a placeholder
      command: /bin/echo foo

/vars/external_vars.yml 文件内容

1
2
3
4
---
# in the above example, this would be vars/external_vars.yml
somevar: somevalue
password: magic

使用 --extra-vars(或 -e)命令行传入变量

1
ansible-playbook release.yml --extra-vars "version=1.23.45 other_variable=foo"
1
ansible-playbook release.yml --extra-vars "@some_file.json"
1
2
ansible-playbook release.yml --extra-vars '{"version":"1.23.45","other_variable":"foo"}'
ansible-playbook arcade.yml --extra-vars '{"pacman":"mrs","ghosts":["inky","pinky","clyde","sue"]}'

变量读取的优先级

按读取顺序倒序排列,最后一行的优先级最高。

  • command line values (eg “-u user”)
  • role defaults [1]
  • inventory file or script group vars [2]
  • inventory group_vars/all [3]
  • playbook group_vars/all [3]
  • inventory group_vars/* [3]
  • playbook group_vars/* [3]
  • inventory file or script host vars [2]
  • inventory host_vars/* [3]
  • playbook host_vars/* [3]
  • host facts / cached set_facts [4]
  • play vars
  • play vars_prompt
  • play vars_files
  • role vars (defined in role/vars/main.yml)
  • block vars (only for tasks in block)
  • task vars (only for the task)
  • include_vars
  • set_facts / registered vars
  • role (and include_role) params
  • include params
  • extra vars (always win precedence)

定义一个 role 来测试执行顺序 vim roles/test/tasks/main.yaml

1
2
- debug:
    msg: test_vars {{ test_vars }}

playbook 中定义 vars vim playbooks/role.yaml

1
2
3
4
5
- hosts: "{{ target }}"
  roles:
    - "{{ role }}"
  vars:
    - test_vars: playbook

执行roles ansible-playbook -i inventory/ -e target=localhost -e role=test playbooks/role.yaml -t test_vars -CD

1
2
3
4
TASK [test : debug] **************************************************************************************************************************************************************************************************
ok: [localhost] => {
"msg": "test_vars playbook"
}

第二步 在roles/vars中定义变量

1
2
vim roles/test/vars/main.yaml
test_vars: test_vars

执行roles ansible-playbook -i inventory/ -e target=localhost -e role=test playbooks/role.yaml -t test_vars -CD

1
2
3
4
TASK [test : debug] **************************************************************************************************************************************************************************************************
ok: [localhost] => {
"msg": "test_vars test_vars"
}

第三步 在 role 的任务文件中定义 vim roles/test/tasks/main.yaml

1
2
3
4
- debug:
    msg: test_vars {{ test_vars }}
  vars:
    - test_vars: block

执行roles ansible-playbook -i inventory/ -e target=localhost -e role=test playbooks/role.yaml -t test_vars -CD

1
2
3
4
TASK [test : debug] **************************************************************************************************************************************************************************************************
ok: [localhost] => {
"msg": "test_vars block"
}

第四步 命令行中使用 -e 传入变量

1
ansible-playbook -i inventory/ -e target=localhost -e role=test playbooks/role.yaml  -t test_vars -CD  -e "test_vars=cmdline"
1
2
3
4
TASK [test : debug] **************************************************************************************************************************************************************************************************
ok: [localhost] => {
"msg": "test_vars cmdline"
}

参考文档

Using Variables
