gala-gopher/add-configuration-instructions.patch
Zhen Chen 4a36a5936f sync bugfix patches from openeuler/gala-gopher
- fix ksliprobe get invalid args occasionally at startup
- fix error print when starting gala-gopher
- add system_uuid field to distinguish client when post to pyroscope server
- repair stackprobe caused cpu rush
- add support to pyroscope
- bugfix: add check if thread is 0
- fix stackprobe memory allocation and deallocation errors
- normalize time format in flamegraph svg filename

(cherry picked from commit 6aef5cc8e4e2a34324c3f01663d2b61c0462f4ac)
2023-01-17 22:29:44 +08:00

274 lines
9.4 KiB
Diff
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

From 61431063cc0d52d4cee266cf5af6caccb6bc7803 Mon Sep 17 00:00:00 2001
From: wo_cow <niuqianqian@huawei.com>
Date: Mon, 12 Dec 2022 16:47:57 +0800
Subject: [PATCH] add configuration instructions
---
doc/conf_introduction.md | 70 +++++++++++++++++++
src/common/util.c | 8 +--
.../cadvisor.probe/cadvisor_probe.conf | 4 +-
.../cadvisor.probe/cadvisor_probe.meta | 4 +-
.../python.probe/cadvisor.probe/readme.md | 49 +++++++++++++
.../python.probe/pg_stat.probe/readme.md | 23 ++++++
6 files changed, 150 insertions(+), 8 deletions(-)
create mode 100644 src/probes/extends/python.probe/cadvisor.probe/readme.md
create mode 100644 src/probes/extends/python.probe/pg_stat.probe/readme.md
diff --git a/doc/conf_introduction.md b/doc/conf_introduction.md
index ce15a92..a35a70a 100644
--- a/doc/conf_introduction.md
+++ b/doc/conf_introduction.md
@@ -5,10 +5,20 @@ gala-gopher启动必须的外部参数通过配置文件`gala-gopher.conf`定义
gala-gopher支持用户配置观测的应用范围即支持用户设置关注的、需要监测的具体应用此项配置是在`gala-gopher-app.conf`配置文件中配置。
+部分extend探针有自己的配置文件开启该探针前需要设置好探针的配置文件。
+
## 配置介绍
配置文件归档在[config目录](../config)。
+extend探针配置文件归档在探针同级目录下。目前有配置文件的探针有:
+
+[stackprobe](../src/probes/extends/ebpf.probe/src/stackprobe)
+
+[cadvisor.probe](../src/probes/extends/python.probe/cadvisor.probe)
+
+[pg_stat.probe](../src/probes/extends/python.probe/pg_stat.probe)
+
### gala-gopher.conf
`gala-gopher.conf`文件的安装路径为 `/opt/gala-gopher/gala-gopher.conf`。该文件配置项说明如下:
@@ -84,6 +94,66 @@ gala-gopher支持用户配置观测的应用范围即支持用户设置关注
配置示例参见 [gala-gopher-app.conf示例](#gala-gopher-app.conf示例) 。
+### stackprobe.conf
+
+`stackprobe.conf`文件的安装路径为 `/opt/gala-gopher/extend_probes/stackprobe.conf`。该文件配置项说明如下:
+
+- general通用设置
+ - period火焰图生成周期
+ - log_dirstackprobe探针日志路径
+ - svg_dirsvg格式火焰图存储路径
+ - flame_dir堆栈信息存储路径
+ - debug_dir调试信息文件路径
+- flame_name各类型火焰图开关
+ - oncpuoncpu火焰图开关
+ - offcpuoffcpu火焰图开关
+ - ioio火焰图开关
+ - memleak内存泄漏火焰图开关
+- application暂未使用
+
+
+### cadvisor_probe.conf
+
+`cadvisor_probe.conf`文件的安装路径为 `/opt/gala-gopher/extend_probes/cadvisor_probe.conf`。该文件配置项说明如下:
+
+- version配置文件版本号
+- measurements待集成到gala-gopher的观测指标
+ - table_name: 数据表名称
+ - entity_name: 观测对象名称
+ - fields数据字段
+ - description数据字段描述信息
+ - type数据字段类型需和cAdvisor上报数据类型一致
+ - name数据字段名称需和cAdvisor上报数据名称一致
+
+> 说明cadvisor_probe.conf和cadvisor_probe.meta的字段需要一致。例外若conf中type字段为counter在meta中对应type字段应为gauge
+
+
+### pg_stat_probe.conf
+
+`pg_stat_probe.conf`文件的安装路径为 `/opt/gala-gopher/extend_probes/pg_stat_probe.conf`。该文件配置项说明如下:
+
+- serversPostgreSQL服务端配置
+ - ip服务端IP
+ - port服务端端口
+ - dbname服务端任意数据库名称
+ - user用户名
+ - password用户密码
+
+上述配置用户需能够访问pg_stat_database视图配置最小权限的命令如下
+
+PostgreSQL
+```shell
+grant SELECT ON pg_stat_database to <USER>;
+grant pg_monitor to <USER>;
+```
+
+GaussDB
+```shell
+grant usage on schema dbe_perf to <USER>;
+grant select on pg_stat_replication to <USER>;
+```
+
+
## 启动参数介绍
diff --git a/src/common/util.c b/src/common/util.c
index 1575546..e25e9ee 100644
--- a/src/common/util.c
+++ b/src/common/util.c
@@ -24,7 +24,7 @@
char *get_cur_date(void)
{
- /* return date str, ex: 2021/5/17 */
+ /* return date str, ex: 2021/05/17 */
static char tm[TM_STR_LEN] = {0};
struct tm *tmp_ptr = NULL;
time_t t;
@@ -34,7 +34,7 @@ char *get_cur_date(void)
tmp_ptr = localtime(&t);
(void)snprintf(tm,
TM_STR_LEN,
- "%d-%d-%d",
+ "%d-%02d-%02d",
(1900 + tmp_ptr->tm_year),
(1 + tmp_ptr->tm_mon),
tmp_ptr->tm_mday);
@@ -43,7 +43,7 @@ char *get_cur_date(void)
char *get_cur_time(void)
{
- /* return time str, ex: 2021/5/17 19:56:03 */
+ /* return time str, ex: 2021/05/17 19:56:03 */
static char tm[TM_STR_LEN] = {0};
struct tm *tmp_ptr = NULL;
time_t t;
@@ -53,7 +53,7 @@ char *get_cur_time(void)
tmp_ptr = localtime(&t);
(void)snprintf(tm,
TM_STR_LEN,
- "%d-%d-%d-%02d-%02d-%02d",
+ "%d-%02d-%02d-%02d-%02d-%02d",
(1900 + tmp_ptr->tm_year),
(1 + tmp_ptr->tm_mon),
tmp_ptr->tm_mday,
diff --git a/src/probes/extends/python.probe/cadvisor.probe/cadvisor_probe.conf b/src/probes/extends/python.probe/cadvisor.probe/cadvisor_probe.conf
index 215bb70..3027d4f 100644
--- a/src/probes/extends/python.probe/cadvisor.probe/cadvisor_probe.conf
+++ b/src/probes/extends/python.probe/cadvisor.probe/cadvisor_probe.conf
@@ -189,12 +189,12 @@ measurements:
name: "container_id",
},
{
- description: "...",
+ description: "failure type",
type: "label",
name: "failure_type",
},
{
- description: "...",
+ description: "scope",
type: "label",
name: "scope",
},
diff --git a/src/probes/extends/python.probe/cadvisor.probe/cadvisor_probe.meta b/src/probes/extends/python.probe/cadvisor.probe/cadvisor_probe.meta
index 598d585..178c750 100644
--- a/src/probes/extends/python.probe/cadvisor.probe/cadvisor_probe.meta
+++ b/src/probes/extends/python.probe/cadvisor.probe/cadvisor_probe.meta
@@ -189,12 +189,12 @@ measurements:
name: "container_id",
},
{
- description: "...",
+ description: "failure type",
type: "label",
name: "failure_type",
},
{
- description: "...",
+ description: "scope",
type: "label",
name: "scope",
},
diff --git a/src/probes/extends/python.probe/cadvisor.probe/readme.md b/src/probes/extends/python.probe/cadvisor.probe/readme.md
new file mode 100644
index 0000000..62a5532
--- /dev/null
+++ b/src/probes/extends/python.probe/cadvisor.probe/readme.md
@@ -0,0 +1,49 @@
+# cadvisor 探针开发说明
+
+## 功能描述
+
+集成容器性能分析工具[cAdvisor](https://github.com/google/cadvisor)的统计数据。支持的功能有:
+
+- 设置cAdvisor监听端口必需
+
+ 通过-p参数设置无默认值示例
+
+ `python3 cadvisor_probe.py -p 8080`
+
+ 表示监控cAdvisor输出若cAdvisor未启动则通过`cadvisor -port 8080`启动cAdvisor
+
+- 设置观测周期
+
+ 通过-d参数设置单位为秒默认值5示例
+
+ `python3 cadvisor_probe.py -p 8080 -d 5`
+
+ 表示每隔5s输出统计信息
+
+- 开启观测白名单
+
+ 通过-F参数设置配置为`task`表示按照`gala-gopher-app.conf`过滤配置为具体进程的pid表示仅监控此进程不配置则观测所有进程默认不配置示例
+
+ `python3 cadvisor_probe.py -p 8080 -F task`
+
+ 表示只观测`gala-gopher-app.conf`中的进程
+
+ `python3 cadvisor_probe.py -p 8080 -F 1234`
+
+ 表示只观测pid为1234的进程
+
+- 设置容器观测指标
+
+ 通过cadvisor_probe.conf和cadvisor_probe.meta配置二者需对应。配置方法详见[conf_introduction.md](../../../../../doc/conf_introduction.md#cadvisor_probe.conf)
+
+- 容器运行信息监控,具体的观测指标信息参见`cadvisor_probe.meta`。
+
+## 采集方案
+
+拉起cAdvisor进程并监控[cAdvisor原始Prometheus统计数据](https://github.com/google/cadvisor/blob/master/docs/storage/prometheus.md)
+采集cadvisor_probe.conf中配置的统计项将数据格式转换后按照cadvisor_probe.meta输出为gala-gopher框架支持的格式。
+
+## 约束条件
+
+- 需要预先安装cAdvisor
+
diff --git a/src/probes/extends/python.probe/pg_stat.probe/readme.md b/src/probes/extends/python.probe/pg_stat.probe/readme.md
new file mode 100644
index 0000000..1ddd6b7
--- /dev/null
+++ b/src/probes/extends/python.probe/pg_stat.probe/readme.md
@@ -0,0 +1,23 @@
+# pg_stat 探针开发说明
+
+## 功能描述
+
+获取PostgreSQL Sever的TPS统计数据。支持的功能有
+
+- 设置被观测服务端信息
+
+ 通过pg_stat_probe.conf设置支持多服务端配置方法详见[conf_introduction.md](../../../../../doc/conf_introduction.md#pg_stat_probe.conf)
+
+- 设置观测周期
+
+ 通过-d参数设置单位为秒默认值5示例
+
+ `python3 pg_stat_probe.py -d 5`
+
+ 表示每隔5s输出统计信息
+
+- 观测PostgreSQL Sever中各数据库的TPS统计数据具体的观测指标信息参见`pg_stat_probe.meta`
+
+## 采集方案
+
+通过计算数据库已提交的事务数在单位时间内的增长来计算TPS
--
2.33.0