Traefik metrics working for Prometheus, but Grafana dashboards are empty

Stack Overflow user
Asked on 2020-04-02 06:58:38
Answers: 1 · Views: 2.5K · Followers: 0 · Votes: 2

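I have configured Traefik (v1.7.15) with the Prometheus Operator, installed from the stable Helm chart (chart version 8.2.4).

However, I cannot see any metrics data in the Grafana dashboards; they are empty.

I can see the metrics on the pod IP, port 8080, with curl. See the metrics excerpt and the few relevant configuration manifests below.

I can also see that the Traefik ServiceMonitor target is UP in Prometheus. I used the same approach for the MongoDB, Postgres, and RabbitMQ metrics, and those Grafana dashboards work well and show rich data.

I would really appreciate it if someone could guide me to fix this so that the scraped Traefik ingress controller metrics show up in Grafana, and also tell me the cause.

I am using the Grafana dashboards below, and none of them shows data. The dashboard IDs are 4475, 8214, 11741, and 6293.

Thanks

Traefik configuration:

Deployment YAML args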

    ports:
    - name: http
      containerPort: 80
    - name: admin
      containerPort: 8080
    - name: https
      containerPort: 443
    args:
    #- --api
    - --web
    - --web.metrics.prometheus
    - --kubernetes
    - --logLevel=INFO
    - --configfile=/config/traefik.toml
    volumeMounts:
    - mountPath: /config
      name: config
    - mountPath: /ssl
      name: ssl

ConfigMap TOML file

  traefik.toml: |
    # traefik.toml
    logLevel = "INFO"
    defaultEntryPoints = ["http","https"]
    [entryPoints]
      [entryPoints.http]
      address = ":80"
      [entryPoints.http.redirect]
      entryPoint = "https"
      [entryPoints.https]
      address = ":443"
      [entryPoints.https.tls]
        [[entryPoints.https.tls.certificates]]
        CertFile = "/ssl/tls.crt"
        KeyFile = "/ssl/tls.key"
    [metrics]
      [metrics.prometheus]
        buckets = [0.1,0.3,1.2,5.0]

Prometheus ServiceMonitor YAML

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
    name: traefik-sm
    labels:
        release: my-prometheus
spec:
    selector:
      matchLabels:
        k8s-app: traefik-ingress-lb
    namespaceSelector:
      any: true
    endpoints:
    - port: admin-ui
      name: traefik-ingress-service
      targetPort: 8080
      path: /metrics
      interval: 10s
      honorLabels: true

Traefik metrics with curl

ubuntu@k8s-node1:~$ curl http://10.96.1.141:8080/metrics
# HELP go_gc_duration_seconds A summary of the GC invocation durations.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 1.3978e-05
go_gc_duration_seconds{quantile="0.25"} 1.86e-05
go_gc_duration_seconds{quantile="0.5"} 2.3194e-05
go_gc_duration_seconds{quantile="0.75"} 5.2525e-05
go_gc_duration_seconds{quantile="1"} 0.090356709
go_gc_duration_seconds_sum 12.978064956
go_gc_duration_seconds_count 3774
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 64
# HELP go_memstats_alloc_bytes Number of bytes allocated and still in use.
# TYPE go_memstats_alloc_bytes gauge
go_memstats_alloc_bytes 8.322768e+06
# HELP go_memstats_alloc_bytes_total Total number of bytes allocated, even if freed.
# TYPE go_memstats_alloc_bytes_total counter
go_memstats_alloc_bytes_total 2.7448991752e+10
# HELP go_memstats_buck_hash_sys_bytes Number of bytes used by the profiling bucket hash table.
# TYPE go_memstats_buck_hash_sys_bytes gauge
go_memstats_buck_hash_sys_bytes 1.579943e+06
# HELP go_memstats_frees_total Total number of frees.
# TYPE go_memstats_frees_total counter
go_memstats_frees_total 2.5932029e+08
# HELP go_memstats_gc_cpu_fraction The fraction of this program's available CPU time used by the GC since the program started.
# TYPE go_memstats_gc_cpu_fraction gauge
go_memstats_gc_cpu_fraction 0.00037814152889298634
# HELP go_memstats_gc_sys_bytes Number of bytes used for garbage collection system metadata.
# TYPE go_memstats_gc_sys_bytes gauge
go_memstats_gc_sys_bytes 2.4064e+06
# HELP go_memstats_heap_alloc_bytes Number of heap bytes allocated and still in use.
# TYPE go_memstats_heap_alloc_bytes gauge
go_memstats_heap_alloc_bytes 8.322768e+06
# HELP go_memstats_heap_idle_bytes Number of heap bytes waiting to be used.
# TYPE go_memstats_heap_idle_bytes gauge
go_memstats_heap_idle_bytes 5.3641216e+07
# HELP go_memstats_heap_inuse_bytes Number of heap bytes that are in use.
# TYPE go_memstats_heap_inuse_bytes gauge
go_memstats_heap_inuse_bytes 1.261568e+07
# HELP go_memstats_heap_objects Number of allocated objects.
# TYPE go_memstats_heap_objects gauge
go_memstats_heap_objects 54120
# HELP go_memstats_heap_released_bytes Number of heap bytes released to OS.
# TYPE go_memstats_heap_released_bytes gauge
go_memstats_heap_released_bytes 4.636672e+07
# HELP go_memstats_heap_sys_bytes Number of heap bytes obtained from system.
# TYPE go_memstats_heap_sys_bytes gauge
go_memstats_heap_sys_bytes 6.6256896e+07
# HELP go_memstats_last_gc_time_seconds Number of seconds since 1970 of last garbage collection.
# TYPE go_memstats_last_gc_time_seconds gauge
go_memstats_last_gc_time_seconds 1.5858102844353108e+09
# HELP go_memstats_lookups_total Total number of pointer lookups.
# TYPE go_memstats_lookups_total counter
go_memstats_lookups_total 0
# HELP go_memstats_mallocs_total Total number of mallocs.
# TYPE go_memstats_mallocs_total counter
go_memstats_mallocs_total 2.5937441e+08
# HELP go_memstats_mcache_inuse_bytes Number of bytes in use by mcache structures.
# TYPE go_memstats_mcache_inuse_bytes gauge
go_memstats_mcache_inuse_bytes 3472
# HELP go_memstats_mcache_sys_bytes Number of bytes used for mcache structures obtained from system.
# TYPE go_memstats_mcache_sys_bytes gauge
go_memstats_mcache_sys_bytes 16384
# HELP go_memstats_mspan_inuse_bytes Number of bytes in use by mspan structures.
# TYPE go_memstats_mspan_inuse_bytes gauge
go_memstats_mspan_inuse_bytes 180000
# HELP go_memstats_mspan_sys_bytes Number of bytes used for mspan structures obtained from system.
# TYPE go_memstats_mspan_sys_bytes gauge
go_memstats_mspan_sys_bytes 245760
# HELP go_memstats_next_gc_bytes Number of heap bytes when next garbage collection will take place.
# TYPE go_memstats_next_gc_bytes gauge
go_memstats_next_gc_bytes 1.6043632e+07
# HELP go_memstats_other_sys_bytes Number of bytes used for other system allocations.
# TYPE go_memstats_other_sys_bytes gauge
go_memstats_other_sys_bytes 666961
# HELP go_memstats_stack_inuse_bytes Number of bytes in use by the stack allocator.
# TYPE go_memstats_stack_inuse_bytes gauge
go_memstats_stack_inuse_bytes 851968
# HELP go_memstats_stack_sys_bytes Number of bytes obtained from system for stack allocator.
# TYPE go_memstats_stack_sys_bytes gauge
go_memstats_stack_sys_bytes 851968
# HELP go_memstats_sys_bytes Number of bytes obtained from system.
# TYPE go_memstats_sys_bytes gauge
go_memstats_sys_bytes 7.2024312e+07
# HELP go_threads Number of OS threads created
# TYPE go_threads gauge
go_threads 11
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 553.04
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 1.048576e+06
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 11
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 6.9451776e+07
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1.58573313806e+09
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 1.90099456e+08
# HELP traefik_backend_server_up Backend server is up, described by gauge value of 0 or 1.
# TYPE traefik_backend_server_up gauge
traefik_backend_server_up{backend="auth-jooqa.abc.com/",url="http://192.168.22.77:8180"}
# HELP traefik_config_last_reload_failure Last config reload failure
# TYPE traefik_config_last_reload_failure gauge
traefik_config_last_reload_failure 0
# HELP traefik_config_last_reload_success Last config reload success
# TYPE traefik_config_last_reload_success gauge
traefik_config_last_reload_success 1.585741581e+09
# HELP traefik_config_reloads_failure_total Config failure reloads
# TYPE traefik_config_reloads_failure_total counter
traefik_config_reloads_failure_total 0
# HELP traefik_config_reloads_total Config reloads
# TYPE traefik_config_reloads_total counter
traefik_config_reloads_total 4

1 Answer

Stack Overflow user

Answered on 2020-04-03 07:14:09

Too few metrics exported by Traefik

If you check the exported metrics, you will see that there are very few of them:

$ curl -s http://10.96.1.141:8080/metrics | grep -P '^traefik_'

traefik_backend_server_up{backend="auth-jooqa.abc.com/",url="http://192.168.22.77:8180"}
traefik_config_last_reload_failure 0
traefik_config_last_reload_success 1.585741581e+09
traefik_config_reloads_failure_total 0
traefik_config_reloads_total 4

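It is hard to find a ready-made Grafana dashboard that matches your set of metrics.

Let's grep the expr fields of the dashboards you mentioned (4475, 8214, 11741, 6293 (https://grafana.com/grafana/dashboards/6293)):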

# List the traefik_* metric names referenced by the first target of each panel.
for dashboard_url in 'https://grafana.com/api/dashboards/4475/revisions/4/download' 'https://grafana.com/api/dashboards/6293/revisions/2/download' 'https://grafana.com/api/dashboards/8214/revisions/1/download' 'https://grafana.com/api/dashboards/11741/revisions/1/download' ; do
  # printf expands \t in every shell; plain echo does not in bash.
  printf '\t = Dashboard: %s = \n' "$dashboard_url"
  curl -s "$dashboard_url" | jq '.panels[].targets[0].expr' | grep -Po 'traefik_[a-z_]+' | sort | uniq
done

The command above returns, for each dashboard, the list of traefik_* metrics used in its expr fields:

         = Dashboard: https://grafana.com/api/dashboards/4475/revisions/4/download =
traefik_backend_request_duration_seconds_sum
traefik_backend_requests_total
traefik_backend_server_up
traefik_config_reloads_total
traefik_entrypoint_requests_total
         = Dashboard: https://grafana.com/api/dashboards/6293/revisions/2/download =
traefik_backend_open_connections
traefik_backend_request_duration_seconds_sum
traefik_backend_requests_total
traefik_entrypoint_open_connections
traefik_entrypoint_request_duration_seconds_sum
traefik_entrypoint_requests_total
         = Dashboard: https://grafana.com/api/dashboards/8214/revisions/1/download =
traefik_backend_request_duration_seconds_sum
traefik_backend_requests_total
traefik_entrypoint_request_duration_seconds_sum
traefik_entrypoint_requests_total
         = Dashboard: https://grafana.com/api/dashboards/11741/revisions/1/download =
traefik_entrypoint_open_connections
traefik_entrypoint_request_duration_seconds_sum
traefik_entrypoint_requests_total
traefik_service_open_connections
traefik_service_request_duration_seconds_count
traefik_service_request_duration_seconds_sum
traefik_service_requests_total

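As you can see, the dashboards use only two of your five metrics (traefik_backend_server_up and traefik_config_reloads_total, both only in dashboard 4475).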
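The overlap can be checked mechanically. The following is a minimal sketch that hard-codes the two lists shown above (the exported metric names from the curl output and the union of the names found in the four dashboards) and prints the exported metrics that at least one dashboard references. The variable and function names are illustrative, not part of any tool.

```shell
# Metric names exported by this Traefik instance (from the curl output above).
exported='traefik_backend_server_up
traefik_config_last_reload_failure
traefik_config_last_reload_success
traefik_config_reloads_failure_total
traefik_config_reloads_total'

# Union of the metric names referenced by dashboards 4475, 6293, 8214 and 11741.
used_by_dashboards='traefik_backend_open_connections
traefik_backend_request_duration_seconds_sum
traefik_backend_requests_total
traefik_backend_server_up
traefik_config_reloads_total
traefik_entrypoint_open_connections
traefik_entrypoint_request_duration_seconds_sum
traefik_entrypoint_requests_total
traefik_service_open_connections
traefik_service_request_duration_seconds_count
traefik_service_request_duration_seconds_sum
traefik_service_requests_total'

# Print each exported metric that at least one dashboard references.
# grep -F: fixed string, -x: match the whole line, -q: no output, exit status only.
common_metrics() {
  for metric in $exported; do
    if printf '%s\n' "$used_by_dashboards" | grep -Fxq -- "$metric"; then
      printf '%s\n' "$metric"
    fi
  done
}

common_metrics
```

With the lists above this prints only traefik_backend_server_up and traefik_config_reloads_total; every other panel queries a series that this Traefik instance does not export, so it stays empty.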

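Let's try to find a suitable dashboard

Since these 4 dashboards do not fit your set of metrics, let's try to find a suitable one on GitHub:

  • traefik_backend_server_up: 8 code results
  • traefik_backend_server_up OR traefik_config_reloads_total: 11 code results
  • traefik_config_last_reload_failure OR traefik_config_last_reload_success OR traefik_config_reloads_failure_total: 1 code result

Suggestions

So I would suggest:

  • either try to update Traefik so that it exposes a more complete set of metrics
  • or create your own dashboard, which is simple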
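For the second option, here is a minimal sketch of a dashboard skeleton that queries only the metrics this Traefik instance exports. The file name, dashboard title, panel titles, panel type, and PromQL expressions are illustrative assumptions, not taken from an existing dashboard. Depending on the Grafana version, the JSON may need more fields (for example a datasource) before it imports cleanly.

```shell
# Write a minimal dashboard skeleton; all names and the layout are assumptions.
dashboard_file="${TMPDIR:-/tmp}/traefik-minimal-dashboard.json"

cat > "$dashboard_file" <<'EOF'
{
  "title": "Traefik (minimal)",
  "panels": [
    {"id": 1, "type": "graph", "title": "Backend servers up",
     "gridPos": {"h": 8, "w": 12, "x": 0, "y": 0},
     "targets": [{"expr": "sum(traefik_backend_server_up) by (backend)"}]},
    {"id": 2, "type": "graph", "title": "Config reloads per hour",
     "gridPos": {"h": 8, "w": 12, "x": 12, "y": 0},
     "targets": [{"expr": "increase(traefik_config_reloads_total[1h])"}]},
    {"id": 3, "type": "graph", "title": "Failed config reloads",
     "gridPos": {"h": 8, "w": 12, "x": 0, "y": 8},
     "targets": [{"expr": "traefik_config_reloads_failure_total"}]},
    {"id": 4, "type": "graph", "title": "Seconds since last successful reload",
     "gridPos": {"h": 8, "w": 12, "x": 12, "y": 8},
     "targets": [{"expr": "time() - traefik_config_last_reload_success"}]}
  ]
}
EOF

# Same kind of check as the one run against the public dashboards:
# list the traefik_* metric names the file references.
grep -oE 'traefik_[a-z_]+' "$dashboard_file" | sort -u
```

Every metric listed by the final command is one that appears in the curl output from the question, so each panel has a series to query once Prometheus scrapes the target.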

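P.S. grafana-dashboard-builder makes it easier to create Grafana dashboards

There is an open-source tool that makes it easier to create dashboards:

jakubplichta/grafana-dashboard-builder: Generate Grafana dashboards with YAML

Currently, it supports three datastores:

  • Graphite
  • Prometheus
  • InfluxDB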
Votes: 0
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/60985835
