From 4c64eac61570b4cfd4e77766639f144a8a93f713 Mon Sep 17 00:00:00 2001
From: vegbir
Date: Sat, 10 Jun 2023 11:41:04 +0800
Subject: [PATCH 04/13] rubik: add psi design documentation

Signed-off-by: vegbir
---
 CHANGELOG/CHANGELOG-2.0.0.md        | 29 +++++++--
 docs/design/psi.md                  | 94 +++++++++++++++++++++++++++++
 docs/images/psi/PSI_designation.svg | 16 +++++
 docs/images/psi/PSI_implement.svg   |  4 ++
 4 files changed, 139 insertions(+), 4 deletions(-)
 create mode 100644 docs/design/psi.md
 create mode 100644 docs/images/psi/PSI_designation.svg
 create mode 100644 docs/images/psi/PSI_implement.svg

diff --git a/CHANGELOG/CHANGELOG-2.0.0.md b/CHANGELOG/CHANGELOG-2.0.0.md
index 5cc2cb8..b46fa3d 100644
--- a/CHANGELOG/CHANGELOG-2.0.0.md
+++ b/CHANGELOG/CHANGELOG-2.0.0.md
@@ -1,16 +1,37 @@
-1. Architecture optimization:
+# CHANGELOG
+
+## v2.0.1
+
+### New Feature
+
+Before June 30, 2023
+
+1. **dynMemory** (asynchronous memory classification recovery): implement fssr strategy
+2. **psi**: interference detection based on PSI index
+3. **quotaTurbo**: elastic cpu limit user mode solution
+
+## v2.0.0
+
+### Architecture optimization
+
 refactor rubik through `informer-podmanager-services` mechanism, decoupling modules and improving performance
-2. Interface change:
+
+### Interface change
+
 - configuration file changes
 - use the list-watch mechanism to get the pod instead of the http interface
-3. Feature enhancements:
+
+### Feature enhancements
+
 - support elastic cpu limit user mode scheme-quotaturbo
 - support psi index observation
 - support memory asynchronous recovery feature (fssr optimization)
 - support memory access bandwidth and LLC limit
 - optimize the absolute preemption
 - optimize the elastic cpu limiting kernel mode scheme-quotaburst
-4. Other optimizations:
+
+### Other optimizations
+
 - document optimization
 - typo fix
 - compile option optimization
diff --git a/docs/design/psi.md b/docs/design/psi.md
new file mode 100644
index 0000000..674a8e0
--- /dev/null
+++ b/docs/design/psi.md
@@ -0,0 +1,94 @@
+# [Requirement Design] Interference Detection Based on PSI Metrics
+
+## Requirement Design Diagram
+
+![PSI_designation](../images/psi/PSI_designation.svg)
+
+## Implementation Approach
+
+### Introduction to PSI
+
+PSI stands for Pressure Stall Information. It measures the pressure on the three basic hardware resources of a system: CPU, memory, and I/O. As the name suggests, a task stalls when it cannot obtain the resources it needs to run, and PSI quantifies how long tasks spend stalled.
+
+### Enabling PSI for cgroup v1
+
+First, check whether PSI is enabled for cgroup v1. There are two ways to do so: check whether the pressure files exist, or check whether the kernel boot command line contains the PSI options.
+
+```bash
+cat /proc/cmdline | grep "psi=1 psi_v1=1"
+```
+
+If the options are missing, add them to the kernel boot command line:
+
+```bash
+# Check the kernel version
+uname -a
+# Linux openEuler 5.10.0-136.12.0.86.oe2203sp1.x86_64 #1
+# Locate the kernel image under /boot
+ls /boot/vmlinuz-5.10.0-136.12.0.86.oe2203sp1.x86_64
+# Add the boot parameters
+grubby --update-kernel="/boot/vmlinuz-5.10.0-136.12.0.86.oe2203sp1.x86_64" --args="psi=1 psi_v1=1"
+# Reboot
+reboot
+```
+
+After the reboot, the three PSI files can be used to observe pressure data under cgroup v1.
+For example, the directory `/sys/fs/cgroup/cpu,cpuacct/kubepods/burstable//` contains the following files:
+
+- cpu.pressure
+- memory.pressure
+- io.pressure
+
+### Workflow
+
+For PSI data, `some avg10` is used as the observed metric. It is the share of time, averaged over the last 10 s, during which at least one task was stalled.
+
+Users configure a threshold to keep resources available to online Pods and to preserve their performance. Specifically, when the stall percentage exceeds the threshold (5% by default), rubik evicts offline Pods according to a defined policy and releases the corresponding resources.
+
+Online and offline workloads are distinguished by the annotation `volcano.sh/preemptable="true"/"false"`.
+
+```yaml
+annotations:
+  volcano.sh/preemptable: "true"
+```
+
+When the CPU or memory utilization of online Pods is high, rubik evicts the offline workload that currently consumes the most CPU or memory, respectively. When offline workload I/O is high, rubik evicts the offline workload that consumes the most CPU.
+> Note 1: cgroup currently offers only limited means of controlling I/O bandwidth, so it is hard to determine precisely which workload's eviction would reduce I/O; CPU utilization is therefore used as the criterion for now.
+>
+> Note 2: The cadvisor library is used to obtain the CPU utilization, memory usage, I/O bandwidth, and other information of offline workloads in real time, sorted in descending order of each metric.
+
+When a suspicious object needs to be handled, the event-handling request is passed along using the chain-of-responsibility design pattern, and the corresponding action is executed.
+
+## Implementation Design
+
+![PSI_implement](../images/psi/PSI_implement.svg)
+
+## Interface Design
+
+```yaml
+data:
+  config.json: |
+    {
+      "agent": {
+        "enabledFeatures": [
+          "psi"
+        ]
+      },
+      "psi": {
+        "resource": [
+          "cpu",
+          "memory",
+          "io"
+        ],
+        "interval": 10
+      }
+    }
+```
+
+The `psi` field holds the configuration of the PSI-based interference detection feature. The feature currently supports monitoring CPU, memory, and I/O; users can configure the field as needed to monitor the PSI values of these resources individually or in combination.
+
+| Key[=default] | Type | Description | Allowed values |
+| --------------- | ---------- | -------------------------------- | ----------- |
+| interval=10 | int | PSI monitoring interval (unit: seconds) | [10,30] |
+| resource=[] | string array | Resource types; declares which resources are monitored | cpu, memory, io |
+| avg10Threshold=5.0 | float | Threshold on the PSI `some` avg10 stall percentage (unit: %); offline workloads are evicted when it is exceeded | [5.0,100] |
diff --git a/docs/images/psi/PSI_designation.svg b/docs/images/psi/PSI_designation.svg
new file mode 100644
index 0000000..8b829e8
--- /dev/null
+++ b/docs/images/psi/PSI_designation.svg
@@ -0,0 +1,16 @@
+[Figure: PSI design flowchart. Labels: Start; Iterate over the online Pod list; Read and parse Pod PSI metrics; Is the cgroup v1 PSI interface supported?; /sys/fs/cgroup/cpuacct/cpu.pressure; ...io.pressure; ...memory.pressure; Record the maximum PSI values as cpu_max, mem_max, io_max; cpu_max >= threshold?; mem_max >= threshold?; io_max >= threshold?; Sort offline workloads by CPU utilization; Sort offline workloads by memory usage; Sort offline workloads by I/O bandwidth; Handle suspicious objects; Log alert; Evict application]
\ No newline at end of file
diff --git a/docs/images/psi/PSI_implement.svg b/docs/images/psi/PSI_implement.svg
new file mode 100644
index 0000000..9704504
--- /dev/null
+++ b/docs/images/psi/PSI_implement.svg
@@ -0,0 +1,4 @@
+
+
+
+cadvisor

<<Interface>>
Metric
  + Update() error
  + AddTrigger(...Trigger) Metric

BaseMetric
  attributes
    trigger []Trigger
  operations
    AddTrigger(...Trigger) Metric
    Update() error

Manager
  attributes
    PSIConfig
    Viewer
  operations
    IsRunner() bool
    Run(context.Context)
    SetConfig(helper.ConfigHandler) error
    PreStart(api.Viewer) error
    Terminate(api.Viewer) error

<<Singleton>>
expulsionExec

<<Singleton>>
resourceAnalysisExec

PSIConfig
  + Interval: int
  + Resource: []string

<<Interface>>
Analyzer
  + MaxCPUUtil([]*PodInfo) *PodInfo
  + MaxIOBandWidth([]*PodInfo) *PodInfo
  + MaxMemUtil([]*PodInfo) *PodInfo

<<Interface>>
Trigger
  + Execute(TriggerFactor) error
  + SetNext(...Trigger) Trigger
  + Name() string

<<Interface>>
TriggerFactor
  + Message() string
  + TargetPods() map[string]*typedef.PodInfo

Manager
  attributes
    manager.Manager
  operations
    + New()
    + Start() error
    + ContainerInfoV2()

BasePSIMetric
  attributes
    *metric.BaseMetric
    avg10Threshold float64
    resources []string
    conservation map[string]*typedef.PodInfo
    suspicion map[string]*typedef.PodInfo
  operations
    Update() error

Connector label: Use
\ No newline at end of file
-- 
2.41.0
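
The workflow described in docs/design/psi.md in this patch (read `some avg10`, compare it with the threshold, then pick the offline workload with the highest CPU utilization) can be illustrated with a minimal Go sketch. It is not part of the patch and is not rubik's implementation: `parseSomeAvg10`, `offlinePod`, and `pickVictim` are hypothetical names, the pod data is invented for the example, and the `>=` comparison follows the flowchart labels.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// parseSomeAvg10 extracts the avg10 value from the "some" line of a
// PSI pressure file such as cpu.pressure. (Hypothetical helper.)
func parseSomeAvg10(content string) (float64, error) {
	for _, line := range strings.Split(content, "\n") {
		fields := strings.Fields(line)
		if len(fields) == 0 || fields[0] != "some" {
			continue
		}
		for _, f := range fields[1:] {
			if strings.HasPrefix(f, "avg10=") {
				return strconv.ParseFloat(strings.TrimPrefix(f, "avg10="), 64)
			}
		}
	}
	return 0, fmt.Errorf("no \"some avg10\" entry found")
}

// offlinePod is a simplified, hypothetical stand-in for rubik's PodInfo.
type offlinePod struct {
	Name    string
	CPUUtil float64 // CPU utilization in percent
}

// pickVictim returns the offline pod with the highest CPU utilization,
// or nil when there are no candidates. The input slice is not modified.
func pickVictim(pods []offlinePod) *offlinePod {
	if len(pods) == 0 {
		return nil
	}
	sorted := append([]offlinePod(nil), pods...)
	sort.Slice(sorted, func(i, j int) bool {
		return sorted[i].CPUUtil > sorted[j].CPUUtil
	})
	return &sorted[0]
}

func main() {
	const avg10Threshold = 5.0 // default from the design document

	// Sample content in the PSI file format; the numbers are made up.
	pressure := "some avg10=7.50 avg60=3.10 avg300=1.02 total=123456\n" +
		"full avg10=0.00 avg60=0.00 avg300=0.00 total=0\n"

	avg10, err := parseSomeAvg10(pressure)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	if avg10 >= avg10Threshold {
		victim := pickVictim([]offlinePod{
			{Name: "batch-a", CPUUtil: 35},
			{Name: "batch-b", CPUUtil: 80},
		})
		fmt.Printf("some avg10=%.2f reached threshold %.1f, eviction candidate: %s\n",
			avg10, avg10Threshold, victim.Name)
	}
}
```

Sorting a copy keeps the caller's slice untouched; in the design above, the ordering data would come from cadvisor rather than from a hard-coded list.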