!40 galera: allow joiner to report non-Primary during initial IST
From: @bixiaoyan1 Reviewed-by: @xiangbudaomz Signed-off-by: @xiangbudaomz
commit d824e75531
galera-allow-joiner-to-report-non-Primary-during-ini.patch (new file, 79 lines)
@@ -0,0 +1,79 @@
From 4357f0dbb8668ac4090cd7070c2ea195e5683326 Mon Sep 17 00:00:00 2001
From: Damien Ciabrini <dciabrin@redhat.com>
Date: Wed, 24 Jan 2024 13:27:26 +0100
Subject: [PATCH 05/20] galera: allow joiner to report non-Primary during
 initial IST

It seems that with recent galera versions, when a galera node
joins a cluster, there is a small time window where the node is
connected to the primary component of the galera cluster, but it
might still be preparing its IST. During this time, it can report
itself as being 'not ready' and in 'non-primary' state.

Update the galera resource agent to allow the node to be in
non-primary state, but only if running a "promote" operation. Any
network partition during the promotion will be caught by the
promote timeout.

In reworking the promotion code, we move the check for primary
partition into the "galera_monitor" function. The check works
as before for regular "monitor" or "probe" operations.

Related-Bug: rhbz#2255414
---
 heartbeat/galera.in | 25 +++++++++++++++++--------
 1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/heartbeat/galera.in b/heartbeat/galera.in
index 6aed3e4b..b518595c 100755
--- a/heartbeat/galera.in
+++ b/heartbeat/galera.in
@@ -822,6 +822,11 @@ galera_promote()
         return $rc
     fi
 
+    # At this point, the mysql pidfile is created on disk and the
+    # mysql server is reacheable via its UNIX socket. If we are a
+    # joiner, SST transfers (rsync) have finished, but an IST may
+    # still be requested or ongoing
+
     galera_monitor
     rc=$?
     if [ $rc != $OCF_SUCCESS -a $rc != $OCF_RUNNING_MASTER ]; then
@@ -835,12 +840,6 @@ galera_promote()
         return $OCF_ERR_GENERIC
     fi
 
-    is_primary
-    if [ $? -ne 0 ]; then
-        ocf_exit_reason "Failure. Master instance started, but is not in Primary mode."
-        return $OCF_ERR_GENERIC
-    fi
-
     if ocf_is_true $bootstrap; then
         promote_everyone
         clear_bootstrap_node
@@ -991,8 +990,18 @@ galera_monitor()
         fi
         rc=$OCF_RUNNING_MASTER
     else
-        ocf_exit_reason "local node <${NODENAME}> is started, but not in primary mode. Unknown state."
-        rc=$OCF_ERR_GENERIC
+        # It seems that with recent galera (26.4+), a joiner that is
+        # connected to a Primary component and is preparing its IST
+        # request might still temporarily report its state as
+        # Non-Primary. Do not fail in this case as the promote
+        # operation will loop until the IST finishes or the promote
+        # times out.
+        if [ "$__OCF_ACTION" = "promote" ] && ! ocf_is_true $(is_bootstrap); then
+            ocf_log info "local node <${NODENAME}> is receiving a State Transfer."
+        else
+            ocf_exit_reason "local node <${NODENAME}> is started, but not in primary mode. Unknown state."
+            rc=$OCF_ERR_GENERIC
+        fi
     fi
 
     return $rc
-- 
2.25.1

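The non-Primary handling that this patch moves into `galera_monitor` can be summarised as a small decision table. The sketch below is illustrative only and is not code from `galera.in`: the helper name `monitor_rc_sketch` and its three arguments are hypothetical, the real agent reads the action from `$__OCF_ACTION` and calls `is_bootstrap` and `is_primary` instead of taking parameters, and the sketch assumes that `rc` still holds `OCF_SUCCESS` from the earlier server status check when the joiner case is tolerated. The numeric values are the standard OCF return codes.

```shell
#!/bin/sh
# Standard OCF return codes referenced by the patch.
OCF_SUCCESS=0
OCF_ERR_GENERIC=1
OCF_RUNNING_MASTER=8

# Hypothetical helper, not part of galera.in: print the return code that
# the patched monitor logic is expected to settle on.
#   $1: action being run ("promote", "monitor", ...)
#   $2: "true" if the local node is the bootstrap node
#   $3: cluster status reported by the server ("Primary", "non-Primary", ...)
monitor_rc_sketch() {
    action=$1
    bootstrap=$2
    cluster_status=$3

    if [ "$cluster_status" = "Primary" ]; then
        # Node is in the Primary component: report it as a running master.
        echo "$OCF_RUNNING_MASTER"
    elif [ "$action" = "promote" ] && [ "$bootstrap" != "true" ]; then
        # Joiner that may still be preparing or receiving its IST:
        # tolerated, the promote loop or its timeout decides the outcome.
        echo "$OCF_SUCCESS"
    else
        # Regular monitor/probe, or a bootstrap node: non-Primary is an error.
        echo "$OCF_ERR_GENERIC"
    fi
}

# Example calls (arguments are made up for illustration):
monitor_rc_sketch promote false non-Primary   # joiner during promote
monitor_rc_sketch monitor false non-Primary   # regular monitor
```

Only the combination "promote action on a non-bootstrap node" is relaxed; every other non-Primary case keeps failing as it did before the patch, which matches the commit message's statement that regular "monitor" and "probe" operations behave as before.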
@@ -1,7 +1,7 @@
 Name: resource-agents
 Summary: Open Source HA Reusable Cluster Resource Scripts
 Version: 4.13.0
-Release: 18
+Release: 19
 License: GPLv2+ and LGPLv2+
 URL: https://github.com/ClusterLabs/resource-agents
 Source0: https://github.com/ClusterLabs/resource-agents/archive/v%{version}.tar.gz
@@ -23,6 +23,7 @@ Patch0014: portblock-remove-write-to-tcp_tw_recycle.patch
 Patch0015: findifsh-fix-corner-cases.patch
 Patch0016: fix-OCF_SUCESS-name-in-db2_notify.patch
 Patch0017: docs-writing-python-agents-update-required-Python-ve.patch
+Patch0018: galera-allow-joiner-to-report-non-Primary-during-ini.patch
 Obsoletes: heartbeat-resources <= %{version}
 Provides: heartbeat-resources = %{version}
 BuildRequires: automake autoconf pkgconfig gcc perl-interpreter perl-generators python3-devel
@@ -120,6 +121,9 @@ export CFLAGS="$(echo '%{optflags}')"
 %{_mandir}/man8/{ocf-tester.8*,ldirectord.8*}
 
 %changelog
+* Tue Apr 22 2024 bixiaoyan <bixiaoyan@kylinos.cn> - 4.13.0-19
+- galera: allow joiner to report non-Primary during initial IST
+
 * Mon Apr 22 2024 zouzhimin <zouzhimin@kylinos.cn> - 4.13.0-18
 - docs: writing-python-agents: update required Python version to 3.6+
 