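Containers deployed by hand are hard to maintain; container orchestration exists to solve the problems that come with manual deployment.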
[Figure: Kubernetes architecture]
[Figure: Worker node architecture]
[Figure: Master node architecture]
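Pod: the main object that Kubernetes manages; it consists of one container or a group of containers that share resources.
kubelet: the agent on each node; it handles the communication between the worker node and the master node.
kube-proxy: runs on each worker node and manages network communication for Nodes and Pods.
API Server: serves the Kubernetes API.
Scheduler: selects the worker node on which a Pod runs.
Controller: monitors the number of Pods and controls the worker nodes.
Worker node: a physical or virtual machine that runs Pods.
Master node: a physical or virtual machine that runs the Control Plane.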
$ kubectl create deployment web-app --image caddy
$ kubectl get deployment web-app
$ kubectl describe deployment web-app
$ kubectl scale --replicas 2 deployment.apps/web-app
$ kubectl set image deployment/web-app caddy=nginx
$ kubectl rollout status deployment web-app
$ kubectl rollout undo deployment web-app
$ kubectl rollout history deployment web-app
$ kubectl rollout undo deployment web-app --to-revision=1
$ kubectl apply -f deployment.yaml
$ kubectl delete deployments,services -l app=web-app
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  selector:
    matchLabels:
      app: web-app
  replicas: 1
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
        - name: web-app-container
          image: nginx
          livenessProbe:
            httpGet:
              path: /
              port: 80
            periodSeconds: 10
            initialDelaySeconds: 5
$ kubectl expose deployment web-app --type=NodePort --name=web-app-service --port=80
$ kubectl port-forward service/web-app-service 30000:80
$ kubectl apply -f service.yaml
apiVersion: v1
kind: Service
metadata:
  name: web-app-service
spec:
  selector:
    app: web-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
  type: NodePort
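A Kubernetes volume such as emptyDir shares the lifetime of its Pod: it is removed when the Pod goes away, but a container restart does not affect it. To give a Pod a volume, add the following definition under pod.spec:
volumes:
  - name: web-app-volume
    emptyDir: {}
A container in the Pod that needs this volume must also mount it in its own configuration:
volumeMounts:
  - mountPath: /usr/share/nginx/html
    name: web-app-volume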
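To show where the two fragments above sit relative to each other, the following sketch writes a complete Pod manifest to a file. The Pod name `web-app-pod` and the file name `pod-with-volume.yaml` are made up for illustration; the volume name, mount path and image are the ones used above.

```shell
# Write a Pod manifest that declares an emptyDir volume and mounts it.
# The Pod name and the file name are hypothetical.
cat > pod-with-volume.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: web-app-pod
spec:
  containers:
    - name: web-app-container
      image: nginx
      volumeMounts:
        # the name must match an entry under spec.volumes
        - mountPath: /usr/share/nginx/html
          name: web-app-volume
  volumes:
    - name: web-app-volume
      emptyDir: {}
EOF

# On a machine with access to a cluster:
# kubectl apply -f pod-with-volume.yaml
```

The volume is declared once at Pod level and referenced by name from each container that mounts it.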
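A PV (Persistent Volume) does not depend on any Pod or Node. It survives the deletion of the Pods that use it, and Pods on another node can share it as long as the underlying storage is reachable from that node (a hostPath volume like the one below is local to a single node). A Pod uses a persistent volume by way of a PVC (Persistent Volume Claim).
Definition of the PV:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: host-pv
spec:
  capacity:
    storage: 4Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  storageClassName: standard
  hostPath:
    path: /data
    type: DirectoryOrCreate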
Definition of the PVC:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: host-pvc
spec:
  volumeName: host-pv
  accessModes:
    - ReadWriteOnce
  storageClassName: standard
  resources:
    requests:
      storage: 1Gi
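The claim only becomes useful once a Pod references it. A minimal sketch of such a Pod follows; the Pod name, volume name, mount path and file name are assumptions chosen for illustration, and only the claim name `host-pvc` comes from the definition above.

```shell
# Write a Pod manifest that mounts the PVC "host-pvc".
# Pod name, volume name, mount path and file name are hypothetical.
cat > pod-with-pvc.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: web-app-pod
spec:
  containers:
    - name: web-app-container
      image: nginx
      volumeMounts:
        - mountPath: /usr/share/nginx/html
          name: web-app-volume
  volumes:
    - name: web-app-volume
      persistentVolumeClaim:
        claimName: host-pvc   # the PVC defined above
EOF

# On a machine with access to a cluster:
# kubectl apply -f pod-with-pvc.yaml
```

Compared with an emptyDir volume, only the entry under `volumes` changes; the container's `volumeMounts` stay the same.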
A StatefulSet starts its Pods in order and gives each Pod a name with a fixed ordinal. A StatefulSet can be created with a definition like this:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  selector:
    matchLabels:
      app: nginx # has to match .spec.template.metadata.labels
  serviceName: "nginx"
  replicas: 3 # by default is 1
  minReadySeconds: 10 # by default is 0
  template:
    metadata:
      labels:
        app: nginx # has to match .spec.selector.matchLabels
    spec:
      terminationGracePeriodSeconds: 10
      containers:
        - name: nginx
          image: registry.k8s.io/nginx-slim:0.8
          ports:
            - containerPort: 80
              name: web
          volumeMounts:
            - name: www
              mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
    - metadata:
        name: www
      spec:
        accessModes: [ "ReadWriteOnce" ]
        storageClassName: "my-storage-class"
        resources:
          requests:
            storage: 1Gi
A Service load-balances requests across Pods. Other Pods reach it through the Service's DNS name, for example mysql.default.svc.cluster.local, but they cannot address one particular Pod that way: a Pod's DNS name is derived from its IP address, for example 10-40-2-8.default.pod.cluster.local, and the IP changes every time the Pod is destroyed and recreated. When another Pod has to reach one specific Pod behind a Service, such as the MySQL master Pod, a headless service is needed. A headless service gives each Pod a stable DNS name of the form podname.headless-svc-name.namespace.svc.cluster-domain.example, for example mysql-1.mysql-h-svc.default.svc.cluster.local. Creating a headless service is simple: set clusterIP to None, for example:
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  ports:
    - port: 80
      name: web
  clusterIP: None
  selector:
    app: nginx
In the StatefulSet, serviceName must name this headless service.
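As a rough illustration of the naming scheme, the sketch below prints the per-Pod DNS names for the StatefulSet `web` behind the headless service `nginx`. The function name `statefulset_pod_dns` is hypothetical, and the cluster domain is assumed to be the default `cluster.local`.

```shell
# Print the stable DNS name of each Pod in a StatefulSet.
# Arguments: StatefulSet name, headless Service name, namespace, replica count.
# Assumes the default cluster domain "cluster.local".
statefulset_pod_dns() {
  sts_name=$1
  svc_name=$2
  namespace=$3
  replicas=$4
  i=0
  # Pod ordinals start at 0 and run up to replicas - 1.
  while [ "$i" -lt "$replicas" ]; do
    echo "${sts_name}-${i}.${svc_name}.${namespace}.svc.cluster.local"
    i=$((i + 1))
  done
}

statefulset_pod_dns web nginx default 3
# web-0.nginx.default.svc.cluster.local
# web-1.nginx.default.svc.cluster.local
# web-2.nginx.default.svc.cluster.local
```

Because the names depend only on the StatefulSet name, the ordinal and the Service, they stay the same when a Pod is recreated with a new IP.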
If every Pod in a StatefulSet needs its own PV and PVC, specify volumeClaimTemplates in the StatefulSet's spec. The StatefulSet makes sure that each Pod mounts its original PVC again after it is recreated:
volumeClaimTemplates:
  - metadata:
      name: data-volume
    spec:
      accessModes:
        - ReadWriteOnce
      storageClassName: google-storage
      resources:
        requests:
          storage: 500Mi
The ENTRYPOINT of a Dockerfile is set in Kubernetes with command: [], and the corresponding CMD is set with args: [].
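A small sketch of the mapping, assuming a throwaway Pod called `sleeper` (the Pod name, file name and the `sleep 10` command are illustrative, not taken from the text):

```shell
# Write a Pod manifest that overrides both ENTRYPOINT and CMD of the image.
# Pod name, file name and the command itself are hypothetical.
cat > pod-command-args.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: sleeper
spec:
  containers:
    - name: sleeper
      image: ubuntu
      command: ["sleep"]   # replaces the image's ENTRYPOINT
      args: ["10"]         # replaces the image's CMD
  restartPolicy: Never
EOF

# On a machine with access to a cluster:
# kubectl apply -f pod-command-args.yaml
```

The container then runs `sleep 10` regardless of what the image itself declares.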
A ConfigMap can supply the environment variables a container runs with:
apiVersion: v1
kind: ConfigMap
metadata:
  name: web-app-config-map
data:
  folder: story
  user: user
It is then referenced in the container definition:
env:
  - name: STORY_FOLDER
    valueFrom:
      configMapKeyRef:
        name: web-app-config-map
        key: folder
A Secret can be defined with the following YAML. Values under data are base64-encoded, not encrypted:
apiVersion: v1
kind: Secret
metadata:
  name: web-app-secret
data:
  password: cGFzc3dvcmQ=
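The value in the data field can be produced and checked with the standard `base64` tool. The sketch below assumes a `base64` command that accepts `--decode`, as GNU coreutils does.

```shell
# Encode a plain-text value for the data field of a Secret.
# printf is used because echo would append a newline to the value.
printf '%s' 'password' | base64
# prints: cGFzc3dvcmQ=

# Decode a stored value back to plain text.
printf '%s' 'cGFzc3dvcmQ=' | base64 --decode
# prints: password
```

Since anyone who can read the Secret can decode it this way, access to Secrets should be restricted.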
A container can then consume the Secret as environment variables:
envFrom:
  - secretRef:
      name: web-app-secret
or:
env:
  - name: STORY_FOLDER
    valueFrom:
      secretKeyRef:
        name: web-app-secret
        key: password
Alternatively, the Secret can be used as a volume:
volumes:
  - name: app-secret-volume
    secret:
      secretName: web-app-secret
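A securityContext can be defined in the Pod spec or on a container to harden the container. Note that capabilities can only be set at the container level. For example:
securityContext:
  runAsUser: 1000
  capabilities:
    add: ["MAC_ADMIN"]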
In the container definition, the following YAML requests CPU and memory and sets limits on the container's resource usage:
resources:
  requests:
    memory: "4Gi"
    cpu: 2
  limits:
    memory: "8Gi"
    cpu: 4
A LimitRange object sets the default resource requests and limits for a namespace:
apiVersion: v1
kind: LimitRange
metadata:
  name: cpu-constraint
spec:
  limits:
    - default:
        cpu: 500m
      defaultRequest:
        cpu: 500m
      max:
        cpu: 1
      min:
        cpu: 100m
      type: Container
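The LimitRange above mixes two CPU notations: plain cores (`1`) and millicores (`500m`, where 1000m equals one core). The sketch below converts both to millicores so they can be compared. The helper name `cpu_to_millicores` is hypothetical and it only handles these two common forms, not every quantity format Kubernetes accepts.

```shell
# Convert a Kubernetes CPU quantity to millicores.
# Handles only "<n>m" and a plain number of cores.
cpu_to_millicores() {
  case $1 in
    *m) echo "${1%m}" ;;
    *)  awk -v cores="$1" 'BEGIN { printf "%d\n", cores * 1000 }' ;;
  esac
}

# Values taken from the LimitRange above.
min=$(cpu_to_millicores 100m)
default=$(cpu_to_millicores 500m)
max=$(cpu_to_millicores 1)
echo "min=${min}m default=${default}m max=${max}m"
# prints: min=100m default=500m max=1000m

# Sanity check: the default should lie between min and max.
if [ "$min" -le "$default" ] && [ "$default" -le "$max" ]; then
  echo "limits are consistent"
fi
```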
A ResourceQuota object caps the resources of a whole namespace:
apiVersion: v1
kind: ResourceQuota
metadata:
  name: my-resource-quota
spec:
  hard:
    requests.cpu: 4
    requests.memory: 4Gi
    limits.cpu: 10
    limits.memory: 10Gi
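A taint marks a Node so that Pods are kept off it, and a toleration lets a Pod run on a Node that carries a matching taint. Pods without a matching toleration cannot run on the tainted node.
To taint a Node:
$ kubectl taint nodes {node-name} {key=value}:{taint-effect}
To give a Pod a toleration:
tolerations:
  - key: "app"
    operator: "Equal"
    value: "blue"
    effect: NoSchedule
The taint effect can be NoSchedule, PreferNoSchedule or NoExecute.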
The following command puts a label on a node:
$ kubectl label nodes {node-name} size=Large
The Pod definition can then pick that node with a node selector:
nodeSelector:
  size: Large
Node selectors only support simple matching; Node Affinity is more expressive. In the Pod definition, affinity selects the node:
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: size
              operator: NotIn
              values:
                - Small
Labels tag Kubernetes objects, and selectors use those labels to filter quickly for the objects you are looking for. To label a Pod, for example, define labels in its metadata:
metadata:
  name: web-app
  labels:
    app: App1
    function: front-end
Finding the matching Pods with a selector:
$ kubectl get pods --selector app=App1
NAME                       READY   STATUS    RESTARTS   AGE
web-app-6cc4887574-pcld4   1/1     Running   0          26s
web-app-6cc4887574-w2mx6   1/1     Running   0          26s
web-app-6cc4887574-xhbkf   1/1     Running   0          26s
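Annotations attach other information to Kubernetes objects, for example details used by integrations.
There are two ways to move a Service to a new version of a Deployment:
Create a new Deployment whose Pods are labelled version: v2, then change the Service's selector from version: v1 to version: v2.
Create a new Deployment whose Pods are labelled version: v2 and leave the Service unchanged, so that it keeps selecting on a label both versions share, for example app: web-app. Increase the new Deployment's replica count step by step until it reaches the required value, then delete the original Deployment.
A Job can be defined with the following YAML: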
apiVersion: batch/v1
kind: Job
metadata:
  name: add-job
spec:
  completions: 3
  parallelism: 3
  template:
    spec:
      containers:
        - name: math-add
          image: ubuntu
          command: ['expr', '3', '+', '2']
      restartPolicy: Never
A CronJob can be defined with the following YAML:
apiVersion: batch/v1
kind: CronJob
metadata:
  name: add-cron-job
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      completions: 3
      parallelism: 3
      template:
        spec:
          containers:
            - name: math-add
              image: ubuntu
              command: ['expr', '3', '+', '2']
          restartPolicy: Never
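The environment variable {SERVICE_NAME}_SERVICE_HOST holds the Cluster IP of the Service (not of a Pod). When you set an environment variable yourself, the value "service_name.default", that is the Service name followed by the namespace, is a DNS name that points to the Service's Cluster IP.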
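The variable name is derived from the Service name by upper-casing it and turning dashes into underscores. A minimal sketch of that rule follows; the helper name `service_host_var` is hypothetical, and `web-app-service` is the Service used earlier in these notes.

```shell
# Derive the name of the environment variable that holds a Service's cluster IP:
# upper-case the Service name, replace dashes with underscores, append _SERVICE_HOST.
service_host_var() {
  name=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]' | sed 's/-/_/g')
  printf '%s_SERVICE_HOST\n' "$name"
}

service_host_var web-app-service
# prints: WEB_APP_SERVICE_SERVICE_HOST
```

These variables are only injected for Services that already exist when the Pod starts, which is one reason the DNS name is usually the more convenient choice.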
An Ingress can be seen as Kubernetes' abstraction of a reverse proxy: it can, for example, route traffic for different hosts and paths to different Services.
An Ingress can be created with the following YAML:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ingress-wildcard-host
spec:
  rules:
    - host: "foo.bar.com"
      http:
        paths:
          - pathType: Prefix
            path: "/bar"
            backend:
              service:
                name: service1
                port:
                  number: 80
    - host: "*.foo.com"
      http:
        paths:
          - pathType: Prefix
            path: "/foo"
            backend:
              service:
                name: service2
                port:
                  number: 80
A Network Policy controls which Pods may talk to each other. As an example, we create the following Network Policy:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: test-network-policy
  namespace: default
spec:
  podSelector:
    matchLabels:
      role: db
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - ipBlock:
            cidr: 172.17.0.0/16
            except:
              - 172.17.1.0/24
        - namespaceSelector:
            matchLabels:
              project: myproject
        - podSelector:
            matchLabels:
              role: frontend
      ports:
        - protocol: TCP
          port: 6379
  egress:
    - to:
        - ipBlock:
            cidr: 10.0.0.0/24
      ports:
        - protocol: TCP
          port: 5978
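When starting the Kubernetes API server, --basic-auth-file can point to a static password file, for example user-details.csv (this flag was removed in Kubernetes 1.19, so it only applies to older clusters):
password1,user1,user_id1,group1
password2,user2,user_id2,group2
...
A static token file can be specified with --token-auth-file, for example user-tokens.csv:
token1,user1,user_id1,group1
token2,user2,user_id2,group2
...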
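A line for the static token file could be produced along these lines. This is only a sketch: the column order (token, user name, user id, group) is the one shown above, while the way the token is generated and the `user1`, `user_id1` and `group1` values are placeholders.

```shell
# Generate a random 32-character hex token from 16 random bytes.
# Any hard-to-guess string would do; hex is just convenient to handle.
token=$(head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n')

# Write one line in the format token,user,uid,group.
# user1, user_id1 and group1 are placeholder values.
printf '%s,%s,%s,%s\n' "$token" user1 user_id1 group1 > user-tokens.csv

cat user-tokens.csv
```

The resulting file would then be passed to the API server with --token-auth-file.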
A service account can be seen as an account whose principal is a program running inside Kubernetes. For example, the following definition creates an SA for the Kubernetes dashboard and grants it permissions; the matching token is created afterwards.
apiVersion: v1
kind: ServiceAccount
metadata:
  namespace: default
  name: dashboard-sa
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: demo-role
rules:
  - apiGroups: ["*"]
    resources: ["*"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: cluster-role-binding
  namespace: default
subjects:
  - kind: ServiceAccount
    name: dashboard-sa
    namespace: default
roleRef:
  kind: ClusterRole
  name: demo-role
  apiGroup: rbac.authorization.k8s.io
Create the token for the SA:
$ kubectl create token dashboard-sa
API versions mature in the order v1alpha1 -> v1beta1 -> v1.