Compare commits

12 Commits

Author SHA1 Message Date
openeuler-ci-bot 4b3428d210 !88 Fix: libcrmcommon: avoid file descriptor leak in IPC client with async connection 2024-04-29 08:46:10 +00:00
    From: @bixiaoyan1
    Reviewed-by: @xiangbudaomz
    Signed-off-by: @xiangbudaomz
bixiaoyan 2113920964 Fix: libcrmcommon: avoid file descriptor leak in IPC client with async connection 2024-04-29 16:18:51 +08:00
openeuler-ci-bot cc56cd2bd1 !87 Fix: tools: crm_mon segfaults when fencer connection is lost 2024-04-29 06:50:42 +00:00
    From: @xiangbudaomz
    Reviewed-by: @bixiaoyan1
    Signed-off-by: @bixiaoyan1
openeuler-ci-bot 8597de790f !86 Fix: cibsecret: Use 'ps axww' to avoid truncating issue 2024-04-29 01:11:13 +00:00
    From: @xiangbudaomz
    Reviewed-by: @bixiaoyan1
    Signed-off-by: @bixiaoyan1
zouzhimin 41b01acb16 Fix: tools: crm_mon segfaults when fencer connection is lost 2024-04-28 23:02:01 +08:00
zouzhimin 636afb638d Fix: cibsecret: Use 'ps axww' to avoid truncating issue 2024-04-28 16:52:38 +08:00
openeuler-ci-bot 34cace3d14 !78 Fixed the warning message during installation of pacemaker-cli 2024-04-01 07:05:26 +00:00
    From: @xiangbudaomz
    Reviewed-by: @jxy_git
    Signed-off-by: @jxy_git
zouzhimin f3465c1747 Fixed the warning message during installation of pacemaker-cli 2024-04-01 14:45:24 +08:00
openeuler-ci-bot 3fbaa7c887 !76 Improve pacemaker-attrd cache management and logging 2024-03-27 05:59:27 +00:00
    From: @xiangbudaomz
    Reviewed-by: @jxy_git
    Signed-off-by: @jxy_git
openeuler-ci-bot aa726fd440 !74 Pacemaker Remote nodes can validate against later schema versions 2024-03-26 01:22:52 +00:00
    From: @xiangbudaomz
    Reviewed-by: @jxy_git
    Signed-off-by: @jxy_git
zouzhimin 8a8fc628ce Improve pacemaker-attrd cache management and logging 2024-03-21 07:48:56 +08:00
zouzhimin ab5458c1dc Pacemaker Remote nodes can validate against later schema versions 2024-03-21 07:15:03 +08:00
6 changed files with 3628 additions and 6 deletions

1986
002-schema-transfer.patch Normal file

File diff suppressed because it is too large

Fix-cibsecret-Use-ps-axww-to-avoid-truncating-issue.patch Normal file

@@ -0,0 +1,37 @@
From 581e1bf3850a5e6a972ea02198bbbf2d99b29873 Mon Sep 17 00:00:00 2001
From: xin liang <xliang@suse.com>
Date: Wed, 6 Mar 2024 17:07:16 +0800
Subject: [PATCH] Fix: cibsecret: Use 'ps axww' to avoid truncating issue
When a Python program calls cibsecret with a small terminal width,
the command `ps -ef | grep '[p]acemaker-controld'` returns 1:

>>> cmd = "ps -ef | grep '[p]acemaker-controld' >/dev/null"
>>> # When the terminal width is small
>>> subprocess.call(cmd, shell=True)
1
>>> # When the terminal is wide enough
>>> subprocess.call(cmd, shell=True)
0

Using 'ps axww' avoids this issue and also works in BSD environments.
---
tools/cibsecret.in | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/cibsecret.in b/tools/cibsecret.in
index 4569863af..9df420126 100644
--- a/tools/cibsecret.in
+++ b/tools/cibsecret.in
@@ -171,7 +171,7 @@ check_env() {
else
fatal $CRM_EX_NOT_INSTALLED "please install pssh, pdsh, or ssh to run $PROG"
fi
- ps -ef | grep '[p]acemaker-controld' >/dev/null ||
+ ps axww | grep '[p]acemaker-controld' >/dev/null ||
fatal $CRM_EX_UNAVAILABLE "pacemaker not running? $PROG needs pacemaker"
}
--
2.25.1
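
The truncation failure described in this commit message can be modelled without calling ps at all. The sketch below is a Python approximation, not code from the patch: SAMPLE_LINE is a made-up process-table line, and truncate() and controld_running() are hypothetical helpers standing in for ps cutting its output to the terminal width and for the grep check in check_env().

```python
import re

# Hypothetical process-table line (not captured from a real system).
SAMPLE_LINE = (
    "haclust+ 1234 1200 0 10:00 ? 00:00:01 "
    "/usr/libexec/pacemaker/pacemaker-controld"
)


def truncate(lines, width):
    """Model ps cutting every output line to `width` columns."""
    return [line[:width] for line in lines]


def controld_running(lines):
    """Apply the same pattern that cibsecret passes to grep."""
    pattern = re.compile(r"[p]acemaker-controld")
    return any(pattern.search(line) for line in lines)


if __name__ == "__main__":
    # Narrow terminal: the executable name is cut off, so the check fails.
    print(controld_running(truncate([SAMPLE_LINE], 60)))
    # Unlimited width (what 'ww' requests): the full command line is visible.
    print(controld_running([SAMPLE_LINE]))
```

With a 60-column limit the name of the executable is cut off and the check fails, which mirrors the exit status 1 in the transcript above. The `ww` modifier asks ps for unlimited width, so the match no longer depends on the terminal size.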

Fix-libcrmcommon-avoid-file-descriptor-leak-in-IPC-c.patch Normal file

@@ -0,0 +1,48 @@
From 47d6055bf418f7049fc716745be95374f465eb77 Mon Sep 17 00:00:00 2001
From: "Gao,Yan" <ygao@suse.com>
Date: Wed, 7 Feb 2024 11:21:23 +0100
Subject: [PATCH] Fix: libcrmcommon: avoid file descriptor leak in IPC client
with async connection
Previously if qb_ipcc_connect_async() succeeded but the following poll()
failed, the file descriptor would leak.

In that case, given that disconnect function is not registered yet,
qb_ipcc_disconnect() won't clean up the socket. In any case, call
qb_ipcc_connect_continue() here so that it may fail and do the cleanup
for us.

Issue introduced in 2.1.3 by 4b60aa100.
---
lib/common/ipc_client.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/lib/common/ipc_client.c b/lib/common/ipc_client.c
index 4635d38d8..df6697cee 100644
--- a/lib/common/ipc_client.c
+++ b/lib/common/ipc_client.c
@@ -1623,13 +1623,17 @@ pcmk__ipc_is_authentic_process_active(const char *name, uid_t refuid,
do {
poll_rc = poll(&pollfd, 1, 2000);
} while ((poll_rc == -1) && (errno == EINTR));
- if ((poll_rc <= 0) || (qb_ipcc_connect_continue(c) != 0)) {
+
+ /* If poll() failed, given that disconnect function is not registered yet,
+ * qb_ipcc_disconnect() won't clean up the socket. In any case, call
+ * qb_ipcc_connect_continue() here so that it may fail and do the cleanup
+ * for us.
+ */
+ if (qb_ipcc_connect_continue(c) != 0) {
crm_info("Could not connect to %s IPC: %s", name,
(poll_rc == 0)?"timeout":strerror(errno));
rc = pcmk_rc_ipc_unresponsive;
- if (poll_rc > 0) {
- c = NULL; // qb_ipcc_connect_continue cleaned up for us
- }
+ c = NULL; // qb_ipcc_connect_continue cleaned up for us
goto bail;
}
#endif
--
2.25.1
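
The control-flow change in this patch can be illustrated with a small Python model. This is a sketch under stated assumptions, not libqb or Pacemaker code: FakeIpcConnection, connect_old() and connect_new() are hypothetical names, and a pipe stands in for the IPC socket. The only point it tries to show is that the old short-circuit condition skipped the call that performs cleanup whenever poll() did not report readiness.

```python
import os


class FakeIpcConnection:
    """Hypothetical stand-in for an IPC client connection owning descriptors."""

    def __init__(self):
        # A pipe provides real descriptors that can be leaked or released.
        self.read_fd, self.write_fd = os.pipe()
        self.closed = False

    def close(self):
        if not self.closed:
            os.close(self.read_fd)
            os.close(self.write_fd)
            self.closed = True

    def connect_continue(self, ready):
        """Finish connecting; on failure, release the descriptors."""
        if ready:
            return 0
        self.close()
        return -1


def connect_old(conn, poll_rc):
    # Short-circuit: when poll_rc <= 0, connect_continue() never runs,
    # so nothing releases the descriptors.
    if poll_rc <= 0 or conn.connect_continue(True) != 0:
        return False
    return True


def connect_new(conn, poll_rc):
    # Always call connect_continue() so that its failure path cleans up.
    if conn.connect_continue(poll_rc > 0) != 0:
        return False
    return True
```

In this model, connect_old() with a poll result of 0 returns failure while the descriptors stay open, whereas connect_new() reaches the cleanup path on every failure.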

Fix-tools-crm_mon-segfaults-when-fencer-connection-is-lost.patch Normal file

@@ -0,0 +1,84 @@
From 401f5d971f12db7792971aeec3aaba9f52d67626 Mon Sep 17 00:00:00 2001
From: Reid Wahl <nrwahl@protonmail.com>
Date: Thu, 18 Jan 2024 00:11:17 -0800
Subject: [PATCH] Fix: tools: crm_mon segfaults when fencer connection is lost
This is easiest to observe when Pacemaker is stopping.

When crm_mon is running in interactive mode (the default) and the
cluster is stopped, crm_mon crashes with a segmentation fault. This is a
regression that was introduced in Pacemaker 2.1.0 by commit bc91cc5.

However, for some reason the crash doesn't happen on all platforms. In
particular, I can reproduce the crash on Fedora 38 and 39, but not on
RHEL 9.3 or Fedora 37. This is independent of the Pacemaker version.

The cause is a use-after-free. In detail, it is as follows:
1. crm_mon registers a notification via its stonith API client for
   disconnect events. This notification will call either
   mon_st_callback_event() or mon_st_callback_display(), depending on
   the CLI options. Both of these callbacks call
   mon_cib_connection_destroy() for disconnect notifications, so it
   doesn't matter which one is used.
2. When the fencer connection is lost, the mainloop calls the stonith
   API client's destroy callback (stonith_connection_destroy()).
3. stonith_connection_destroy() sets the state to stonith_disconnected
   and calls foreach_notify_entry(..., stonith_send_notification, blob),
   where blob contains a disconnect notification.
4. foreach_notify_entry() loops over all the registered notify entries,
   calling stonith_send_notification(entry, blob) for each notify entry.
5. For each notify client that's subscribed to disconnect notifications,
   stonith_send_notification() calls the registered callback function.
6. Based on the registration in step (1), stonith_send_notification()
   synchronously calls mon_st_callback_event()/display() for crm_mon.
7. mon_st_callback_event()/display() calls mon_cib_connection_destroy().
8. mon_cib_connection_destroy() calls stonith_api_delete(), which frees
   the stonith API client and its members, including the notification
   table.
9. Control returns to stonith_send_notification() and then back to
   foreach_notify_entry().
10. foreach_notify_entry() moves to the next entry in the list. But the
    entire list was freed in step (8). So when it tries to access a
    member of one of the entries, we get a segmentation fault.

Commit bc91cc5 introduced the regression by deleting the stonith API
client in mon_cib_connection_destroy(). Prior to that,
mon_cib_connection_destroy() only disconnected the client and marked its
notify entries for removal.

I audited the other uses of stonith_api_delete() in crm_mon and
elsewhere, and I believe they're safe in the sense that they're never
called while we're processing stonith notify callbacks. A function
should never be allowed to call stonith_api_delete() if the stonith API
client might be sending out notifications. If there are more
notifications in the table, attempts to access them will be a
use-after-free.

Fixes T751

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
---
tools/crm_mon.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/tools/crm_mon.c b/tools/crm_mon.c
index 7789bfebf..19a2ead89 100644
--- a/tools/crm_mon.c
+++ b/tools/crm_mon.c
@@ -854,8 +854,12 @@ mon_cib_connection_destroy(gpointer user_data)
/* the client API won't properly reconnect notifications if they are still
* in the table - so remove them
*/
- stonith_api_delete(st);
- st = NULL;
+ if (st != NULL) {
+ if (st->state != stonith_disconnected) {
+ st->cmds->disconnect(st);
+ }
+ st->cmds->remove_notification(st, NULL);
+ }
if (cib) {
cib->cmds->signoff(cib);
--
2.25.1
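
The ten-step sequence in this commit message can be approximated by a Python analogy. Python has no use-after-free, so this sketch only models the shape of the bug: FakeStonithClient and the two destroy functions are hypothetical names, and setting the notify table to None stands in for freeing it. None of this is Pacemaker's API.

```python
class FakeStonithClient:
    """Hypothetical stand-in for a fencer API client and its notify table."""

    def __init__(self):
        self.entries = []
        self.connected = True

    def add_notification(self, callback):
        self.entries.append({"callback": callback, "delete": False})

    def remove_notifications(self):
        # Safe during dispatch: entries are only marked, then swept later.
        for entry in self.entries:
            entry["delete"] = True

    def delete(self):
        # Models freeing the client together with its notify table.
        self.entries = None

    def notify_disconnect(self):
        self.connected = False
        index = 0
        # len(None) raises TypeError once the table is gone; that stands in
        # for the invalid memory access in the C code.
        while index < len(self.entries):
            entry = self.entries[index]
            if not entry["delete"]:
                entry["callback"](self)
            index += 1
        # Sweep the entries that were marked while dispatching.
        self.entries = [e for e in self.entries if not e["delete"]]


def destroy_unsafe(client):
    # Pre-patch behaviour: tear the client down from inside a callback.
    client.delete()


def destroy_safe(client):
    # Post-patch behaviour: disconnect if needed and mark entries only.
    if client.connected:
        client.connected = False
    client.remove_notifications()
```

Registering destroy_unsafe() ahead of any other entry makes notify_disconnect() fail as soon as it moves to the next entry, which corresponds to step 10. Registering destroy_safe() instead lets the loop finish and leaves an empty table, matching the disconnect-and-mark approach the patch restores.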

Improve-pacemaker-attrd-cache-management-and-logging.patch Normal file

File diff suppressed because it is too large

pacemaker.spec

@@ -17,7 +17,7 @@
 ## can be incremented to build packages reliably considered "newer"
 ## than previously built packages with the same pcmkversion)
 %global pcmkversion 2.1.7
-%global specversion 5
+%global specversion 11
 ## Upstream commit (full commit ID, abbreviated commit ID, or tag) to build
 %global commit 0f7f88312f7a1ccedee60bf768aba79ee13d41e0
@@ -96,6 +96,7 @@
 %global pkgname_procps procps-ng
 %global pkgname_glue_libs cluster-glue-libs
 %global pkgname_pcmk_libs %{name}-libs
+%global hacluster_id 189
 ## Distro-specific configuration choices
@@ -148,10 +149,15 @@ Url: https://www.clusterlabs.org/
 # You can use "spectool -s 0 pacemaker.spec" (rpmdevtools) to show final URL.
 Source0: https://codeload.github.com/%{github_owner}/%{name}/tar.gz/%{archive_github_url}
 Source1: https://codeload.github.com/%{github_owner}/%{nagios_name}/tar.gz/%{nagios_archive_github_url}
-Source2: pacemaker.sysusers
 Patch0: Add_replace_for_PCMK__REMOTE_SCHEMA_DIR.patch
 Patch1: 001-schema-glib.patch
 Patch2: Doc-HealthSMART-fix-the-description-of-temp_lower.patch
+Patch3: 002-schema-transfer.patch
+Patch4: Improve-pacemaker-attrd-cache-management-and-logging.patch
+Patch5: Fix-cibsecret-Use-ps-axww-to-avoid-truncating-issue.patch
+Patch6: Fix-tools-crm_mon-segfaults-when-fencer-connection-is-lost.patch
+Patch7: Fix-libcrmcommon-avoid-file-descriptor-leak-in-IPC-c.patch
 Requires: resource-agents
 Requires: %{pkgname_pcmk_libs} = %{version}-%{release}
 Requires: %{name}-cluster-libs = %{version}-%{release}
@@ -490,8 +496,6 @@ find %{buildroot} -name '*.la' -type f -print0 | xargs -0 rm -f
 rm -f %{buildroot}/%{_sbindir}/fence_legacy
 rm -f %{buildroot}/%{_mandir}/man8/fence_legacy.*
-install -p -D -m 0644 %{SOURCE2} %{buildroot}%{_sysusersdir}/pacemaker.conf
 %post
 %systemd_post pacemaker.service
@@ -553,7 +557,10 @@ fi
 %systemd_postun_with_restart crm_mon.service
 %pre -n %{pkgname_pcmk_libs}
-%sysusers_create_compat %{SOURCE2}
+# @TODO Use sysusers.d:
+# https://fedoraproject.org/wiki/Changes/Adopting_sysusers.d_format
+getent group %{gname} >/dev/null || groupadd -r %{gname} -g %{hacluster_id}
+getent passwd %{uname} >/dev/null || useradd -r -g %{gname} -u %{hacluster_id} -s /sbin/nologin -c "cluster user" %{uname}
 exit 0
 %ldconfig_scriptlets -n %{pkgname_pcmk_libs}
@@ -669,7 +676,6 @@ exit 0
 %dir %attr (770, %{uname}, %{gname}) %{_var}/log/pacemaker/bundles
 %files -n %{pkgname_pcmk_libs} %{?with_nls:-f %{name}.lang}
-%{_sysusersdir}/pacemaker.conf
 %{_libdir}/libcib.so.*
 %{_libdir}/liblrmd.so.*
 %{_libdir}/libcrmservice.so.*
@@ -758,6 +764,24 @@ exit 0
 %license %{nagios_name}-%{nagios_hash}/COPYING
 %changelog
+* Mon Apr 29 2024 bixiaoyan <bixiaoyan@kylinos.cn> - 2.1.7-11
+- Fix: libcrmcommon: avoid file descriptor leak in IPC client with async connection
+* Mon Apr 29 2024 zouzhimin <zouzhimin@kylinos.cn> - 2.1.7-10
+- Fix: tools: crm_mon segfaults when fencer connection is lost
+* Sun Apr 28 2024 zouzhimin <zouzhimin@kylinos.cn> - 2.1.7-9
+- Fix: cibsecret: Use 'ps axww' to avoid truncating issue
+* Mon Apr 01 2024 zouzhimin <zouzhimin@kylinos.cn> - 2.1.7-8
+- Fixed the warning message during installation of pacemaker-cli
+* Tue Mar 26 2024 zouzhimin <zouzhimin@kylinos.cn> - 2.1.7-7
+- Improve pacemaker-attrd cache management and logging
+* Mon Mar 25 2024 zouzhimin <zouzhimin@kylinos.cn> - 2.1.7-6
+- Pacemaker Remote nodes can validate against later schema versions
 * Thu Mar 21 2024 bixiaoyan <bixiaoyan@kylinos.cn> - 2.1.7-5
 - Doc: HealthSMART:fix the description of temp_lower_limit