From 56cbf3f0716b556c815487d719abe86021125925 Mon Sep 17 00:00:00 2001
From: shimin <shimin@kuaishou.com>
Date: Wed, 10 Apr 2024 09:10:04 +0800
Subject: [PATCH] mon: fix mds metadata loss in one case

In most cases, a peon's pending_metadata is inconsistent with the mon's db.
When a peon becomes leader and, at the same time, an active mds stops,
the new leader may flush wrong mds metadata into the db. So we need to
update mds metadata from the db at every fsmap change.

This phenomenon can be reproduced as follows:
A cluster with 3 mons and 3 mds (one active, the other two standby), 6 osds.
step 1. stop the two standby mds;
step 2. restart all mons; (makes pending_metadata consistent with the db)
step 3. start the two standby mds;
step 4. stop the leader mon;
step 5. run "ceph mds metadata" to check the mds metadata;
step 6. stop the active mds;
step 7. run "ceph mds metadata" to check the mds metadata again.

In step 7, we find that mds metadata has been lost.

Fixes: https://tracker.ceph.com/issues/63166
Signed-off-by: shimin <shimin@kuaishou.com>
---
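The failure described in the commit message can be modelled outside Ceph. The
following is a toy sketch only, not Ceph code: ToyMon, MetadataMap,
remove_and_flush and entries_after_failover are hypothetical names, and the
sketch assumes that a leader handling a stopped mds writes its whole in-memory
map back to the db. It replays steps 2 to 6 once without and once with a
reload from the db on every fsmap change.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Toy stand-ins for illustration only; these are NOT the real Ceph types.
using Metadata = std::map<std::string, std::string>;  // key/value pairs of one mds
using MetadataMap = std::map<int, Metadata>;          // keyed by a made-up mds id

struct ToyMon {
  MetadataMap pending_metadata;  // in-memory copy held by this mon
  bool reload_on_update;         // true models the one-line change in this patch

  // Called on every fsmap change committed through paxos.
  void update_from_paxos(const MetadataMap& db) {
    if (reload_on_update)
      pending_metadata = db;  // models load_metadata(pending_metadata)
  }

  // Assumed leader behaviour: drop the stopped mds, then write the whole
  // in-memory map back to the db.
  void remove_and_flush(int mds_id, MetadataMap& db) {
    pending_metadata.erase(mds_id);
    db = pending_metadata;
  }
};

// Replays steps 2-6 and returns how many mds entries remain in the db.
std::size_t entries_after_failover(bool reload_on_update) {
  MetadataMap db;
  db[1]["hostname"] = "host-a";       // only the active mds is up
  ToyMon peon{db, reload_on_update};  // step 2: peon's copy matches the db

  // step 3: two standbys start; the old leader records them in the db.
  db[2]["hostname"] = "host-b";
  db[3]["hostname"] = "host-c";
  peon.update_from_paxos(db);

  // step 4: the old leader stops and the peon becomes leader.
  // step 6: the active mds (id 1) stops.
  assert(peon.pending_metadata.count(1) == 1);
  peon.remove_and_flush(1, db);
  return db.size();
}
```

Calling entries_after_failover(false) leaves the toy db empty, so the standby
metadata is lost. Calling entries_after_failover(true) keeps both standby
entries.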
 src/mon/MDSMonitor.cc | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mon/MDSMonitor.cc b/src/mon/MDSMonitor.cc
index 4b27d828c..0ac5060f7 100644
--- a/src/mon/MDSMonitor.cc
+++ b/src/mon/MDSMonitor.cc
@@ -136,6 +136,7 @@ void MDSMonitor::update_from_paxos(bool *need_bootstrap)
            << ", my e " << get_fsmap().epoch << dendl;
   ceph_assert(version > get_fsmap().epoch);
 
+  load_metadata(pending_metadata);
   load_health();
 
   // read and decode
--
2.27.0