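Prometheus was originally built at SoundCloud as an open-source system monitoring and alerting toolkit. It is now a standalone open-source project, and in 2016 it joined the CNCF as the second hosted project after Kubernetes.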
Its features are as follows:
It consists of the following components:
The architecture is as follows:
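The flow is simple: the Prometheus server scrapes metrics from its targets directly, or collects them through the Pushgateway, and stores them in its TSDB. It then evaluates rules against the stored data and either sends alerts through Alertmanager or makes the data available for display in tools such as Grafana.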
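The push path in this flow can be illustrated with a short shell sketch. It assumes a Pushgateway is listening on localhost:9091 (its default port) and that Prometheus is configured to scrape it. The metric name demo_batch_duration_seconds and the job name demo_batch are made up for illustration.

```shell
#!/bin/sh
# Assumption: a Pushgateway is reachable here (9091 is its default port).
PUSHGATEWAY_URL="${PUSHGATEWAY_URL:-http://localhost:9091}"

# Hypothetical sample in the Prometheus text exposition format.
# The payload must end with a newline.
printf 'demo_batch_duration_seconds 12.5\n' > payload.prom

# Push the sample under the job label "demo_batch" (hypothetical name).
# Prometheus later scrapes it from the Pushgateway like any other target.
if curl -fsS --max-time 2 --data-binary @payload.prom \
    "$PUSHGATEWAY_URL/metrics/job/demo_batch"; then
  echo "pushed"
else
  echo "Pushgateway not reachable at $PUSHGATEWAY_URL" >&2
fi
```

Pushing suits short-lived batch jobs that may exit before a scrape happens; long-running services are normally scraped directly.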
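Running Prometheus outside a container is straightforward: download the appropriate release and start the binary.
Download page: https://prometheus.io/download/
It can then be started with the following command: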
./prometheus --config.file=prometheus.yml
Here prometheus.yml is the main configuration file. Its key settings are as follows:
global:
  scrape_interval: 15s
  evaluation_interval: 15s
rule_files:
  # - "first.rules"
  # - "second.rules"
scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets: ['localhost:9090']
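The configuration above has three main sections: global, rule_files, and scrape_configs.
(1) global defines settings that apply to the whole Prometheus server, such as how often targets are scraped (scrape_interval) and how often rules are evaluated (evaluation_interval).
(2) rule_files lists the rule files to load. Prometheus uses rules either to precompute new time series (recording rules) or to raise alerts (alerting rules).
(3) scrape_configs controls which targets are monitored.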
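To make the role of rule_files more concrete, here is a minimal sketch of a rule file that could be listed there. The file name first.rules mirrors the commented-out entry in the configuration above; the group name, alert name, duration, and labels are illustrative assumptions. Prometheus 2.x expects rule files in YAML. The validation step only runs if promtool, which ships in the Prometheus release archive, is on the PATH.

```shell
#!/bin/sh
# Write an example alerting rule file (all names below are illustrative).
cat > first.rules <<'EOF'
groups:
  - name: example
    rules:
      # Fire when a scrape target has been unreachable for 5 minutes.
      - alert: InstanceDown
        expr: up == 0
        for: 5m
        labels:
          severity: page
        annotations:
          summary: "Instance {{ $labels.instance }} is down"
EOF

# Validate the file if promtool is installed.
if command -v promtool >/dev/null 2>&1; then
  promtool check rules first.rules
fi
```

To load the file, uncomment the matching entry under rule_files in the main configuration and restart or reload Prometheus.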
By default Prometheus scrapes metrics from the /metrics path of each target. Prometheus exposes its own metrics the same way, so running curl http://localhost:9090/metrics lists them.
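The /metrics output mixes "# HELP" and "# TYPE" comment lines with the samples themselves. As a small sketch, the helper below (the function name metric_samples is made up here) keeps only the sample lines. It assumes Prometheus is listening on localhost:9090, as in the configuration above.

```shell
#!/bin/sh
# Drop "# HELP" / "# TYPE" comment lines, keeping only metric samples.
metric_samples() {
  grep -v '^#'
}

# Show the first few samples exposed by a locally running Prometheus.
# "|| true" keeps the script going if nothing is listening on that port.
curl -s --max-time 2 http://localhost:9090/metrics | metric_samples | head -n 5 || true
```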
Deploying Prometheus inside a Kubernetes cluster takes the following steps.
(1) Create the namespace:
# kubectl create ns kube-ops
(2) Create a ConfigMap that holds the main configuration file, prometheus.yaml, so that later configuration changes only require updating this ConfigMap.
prom-configmap.yaml:
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: kube-ops
data:
  prometheus.yaml: |
    global:
      scrape_interval: 15s
      scrape_timeout: 15s
    scrape_configs:
      - job_name: 'prometheus'
        static_configs:
          - targets: ['localhost:9090']
Create the resource:
# kubectl apply -f prom-configmap.yaml
configmap/prometheus-config created
# kubectl get configmap -n kube-ops
NAME DATA AGE
prometheus-config 1 16s
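Before moving on, it can be worth confirming that the ConfigMap holds exactly the configuration you intended. The following is a sketch, assuming kubectl points at the target cluster and, optionally, that promtool is installed locally. The helper name cm_key_jsonpath and the output file name are made up for this example.

```shell
#!/bin/sh
# Build a jsonpath expression for one key under .data.
# Dots inside the key name have to be escaped for kubectl's jsonpath.
cm_key_jsonpath() {
  escaped=$(printf '%s' "$1" | sed 's/\./\\./g')
  printf '{.data.%s}' "$escaped"
}

if command -v kubectl >/dev/null 2>&1; then
  # Dump the prometheus.yaml key of the ConfigMap to a local file.
  kubectl get configmap prometheus-config -n kube-ops \
    -o jsonpath="$(cm_key_jsonpath prometheus.yaml)" > prometheus-from-cm.yaml \
    || echo "could not read the ConfigMap" >&2
fi

# Validate the extracted file if it is non-empty and promtool is available.
if [ -s prometheus-from-cm.yaml ] && command -v promtool >/dev/null 2>&1; then
  promtool check config prometheus-from-cm.yaml
fi
```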
(3) Create the Prometheus Deployment.
prom-deploy.yaml:
# extensions/v1beta1 Deployments were removed in Kubernetes 1.16; use apps/v1 there.
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: prometheus-deploy
  namespace: kube-ops
  labels:
    app: prometheus
spec:
  selector:
    matchLabels:
      app: prometheus
  replicas: 1
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      serviceAccountName: prometheus-sa
      containers:
        - name: prometheus
          image: prom/prometheus:v2.14.0
          imagePullPolicy: IfNotPresent
          command:
            - "/bin/prometheus"
          args:
            - "--config.file=/etc/prometheus/prometheus.yaml"
            - "--storage.tsdb.path=/data/prometheus"
            # Deprecated alias of --storage.tsdb.retention.time.
            - "--storage.tsdb.retention=24h"
            # Enables the admin HTTP API.
            - "--web.enable-admin-api"
            # Enables reload and shutdown over HTTP (POST /-/reload).
            - "--web.enable-lifecycle"
          ports:
            - name: http
              protocol: TCP
              containerPort: 9090
          volumeMounts:
            - name: data
              mountPath: "/data/prometheus"
              subPath: prometheus
            - name: prometheus-config
              mountPath: "/etc/prometheus"
          resources:
            requests:
              cpu: 100m
              memory: 500Mi
            limits:
              cpu: 100m
              memory: 500Mi
      securityContext:
        runAsUser: 0
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: prometheus
        - name: prometheus-config
          configMap:
            name: prometheus-config
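The ConfigMap defined above is mounted into the container as a volume. The data directory also needs persistent storage, so a PVC is defined next.
(4) Create the PV and PVC.
prom-pvc.yaml: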
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: prometheus-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Recycle
  nfs:
    server: xx.xx.xx.xx
    path: /data/k8s/prometheus
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: prometheus
  namespace: kube-ops
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
(5) Configure RBAC. The Deployment template references a serviceAccount, so that ServiceAccount and its RBAC permissions must be defined as well.
prom-rbac.yaml:
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus-sa
  namespace: kube-ops
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
  - apiGroups:
      - ""
    resources:
      - nodes
      - services
      - endpoints
      - pods
      - nodes/proxy
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - ""
    resources:
      - configmaps
      - nodes/metrics
    verbs:
      - get
  - nonResourceURLs:
      - /metrics
    verbs:
      - get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
  - kind: ServiceAccount
    name: prometheus-sa
    namespace: kube-ops
(6) Create a Service to expose Prometheus.
prom-service.yaml:
apiVersion: v1
kind: Service
metadata:
  name: prometheus-svc
  namespace: kube-ops
spec:
  type: NodePort
  selector:
    app: prometheus
  ports:
    - name: prometheus-web
      port: 9090
      targetPort: http
(7) Apply the manifests.
Create the PV and PVC:
# kubectl apply -f prom-pvc.yaml
persistentvolume/prometheus-pv created
persistentvolumeclaim/prometheus created
# kubectl get pv -n kube-ops
NAME            CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                 STORAGECLASS   REASON   AGE
prometheus-pv   10Gi       RWO            Recycle          Bound    kube-ops/prometheus                           7s
# kubectl get pvc -n kube-ops
NAME         STATUS   VOLUME          CAPACITY   ACCESS MODES   STORAGECLASS   AGE
prometheus   Bound    prometheus-pv   10Gi       RWO                           13s
Create the RBAC resources:
# kubectl apply -f prom-rbac.yaml
serviceaccount/prometheus-sa created
clusterrole.rbac.authorization.k8s.io/prometheus created
clusterrolebinding.rbac.authorization.k8s.io/prometheus created
# kubectl get clusterrole -n kube-ops | grep prometheus
prometheus 35s
# kubectl get clusterrolebinding -n kube-ops | grep prometheus
prometheus 46s
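One way to confirm that the binding took effect is to ask the API server whether the service account may perform the verbs granted above. The following is a sketch using kubectl auth can-i with impersonation. It assumes your own user is allowed to impersonate service accounts, which cluster administrators normally are. The helper name sa_subject is made up for this example.

```shell
#!/bin/sh
# Build the user name Kubernetes assigns to a service account:
#   system:serviceaccount:<namespace>:<name>
sa_subject() {
  printf 'system:serviceaccount:%s:%s' "$1" "$2"
}

SUBJECT=$(sa_subject kube-ops prometheus-sa)

if command -v kubectl >/dev/null 2>&1; then
  # Each check prints "yes" or "no".
  kubectl auth can-i list pods --as="$SUBJECT" || true
  kubectl auth can-i watch endpoints --as="$SUBJECT" || true
  kubectl auth can-i get /metrics --as="$SUBJECT" || true
fi
```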
Create the Deployment:
# kubectl apply -f prom-deploy.yaml
deployment.extensions/prometheus-deploy created
# kubectl get deploy -n kube-ops
NAME READY UP-TO-DATE AVAILABLE AGE
prometheus-deploy 1/1 1 0 10s
# kubectl get pod -n kube-ops
NAME READY STATUS RESTARTS AGE
prometheus-deploy-694446b7cb-ssdqm 1/1 Running 0 18s
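Because the container is started with --web.enable-lifecycle, a changed configuration can be reloaded without restarting the pod. The following is a sketch, assuming kubectl access to the cluster and a free local port 9090; the helper name reload_url is made up here. Note that an updated ConfigMap can take a little while to appear in the mounted volume, so the reload may need to be repeated.

```shell
#!/bin/sh
# Compose the reload endpoint from a base URL, tolerating a trailing slash.
reload_url() {
  printf '%s/-/reload' "${1%/}"
}

# Local address used through the port-forward below (an assumption).
PROM_URL="${PROM_URL:-http://localhost:9090}"

if command -v kubectl >/dev/null 2>&1; then
  # Inspect the most recent log lines of the Deployment's pod.
  kubectl logs -n kube-ops deploy/prometheus-deploy --tail=20 || true

  # Forward local port 9090 to the pod in the background.
  kubectl port-forward -n kube-ops deploy/prometheus-deploy 9090:9090 >/dev/null 2>&1 &
  PF_PID=$!
  sleep 2

  # Ask Prometheus to re-read /etc/prometheus/prometheus.yaml.
  if curl -fsS --max-time 5 -X POST "$(reload_url "$PROM_URL")"; then
    echo "reload requested"
  else
    echo "reload failed (is the port-forward up?)" >&2
  fi

  kill "$PF_PID" 2>/dev/null || true
fi
```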
Create the Service:
# kubectl apply -f prom-service.yaml
service/prometheus-svc created
# kubectl get svc -n kube-ops
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
prometheus-svc NodePort 10.68.254.74 <none> 9090:23050/TCP 6
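The web UI can now be opened in a browser at http://<node-ip>:<nodePort>, where the NodePort is the one shown in the Service output above (23050 here).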
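As a sketch, the NodePort can also be read from the Service with a jsonpath query instead of being copied from the table. NODE_IP below is a placeholder that must be replaced with the address of one of your nodes, and the helper name web_url is made up for this example.

```shell
#!/bin/sh
# Placeholder: replace with the IP of any cluster node.
NODE_IP="${NODE_IP:-192.0.2.10}"

# Compose the web UI address from a node IP and a NodePort.
web_url() {
  printf 'http://%s:%s' "$1" "$2"
}

if command -v kubectl >/dev/null 2>&1; then
  # Read the NodePort that Kubernetes assigned to prometheus-svc.
  NODE_PORT=$(kubectl get svc prometheus-svc -n kube-ops \
    -o jsonpath='{.spec.ports[0].nodePort}' 2>/dev/null)
fi

if [ -n "${NODE_PORT:-}" ]; then
  URL=$(web_url "$NODE_IP" "$NODE_PORT")
  echo "Web UI: $URL"
  # /-/healthy is Prometheus' built-in health check endpoint.
  curl -fsS --max-time 5 "$URL/-/healthy" || echo "not reachable from here" >&2
fi
```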
End.