ceph/0007-bluestore-use-direct-write-for-bdevlabel.patch
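luo rixin a7112a309a bluestore: fix OSD failure to read the superblock
On aarch64 with a 64K page size, the OSD uses a direct write to write the superblock to 8K-12K
and a buffered write to write the device label to 0-4K. The OS flushes buffered writes aligned to page size,
and the superblock and the device label sit on the same page, so the flush overwrites the superblock and the correct data cannot be read back.
Change the device label write to a direct write to fix this.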

Signed-off-by: luo rixin <luorixin@huawei.com>
2022-09-28 14:05:24 +08:00
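
The overlap described in the commit message can be checked with a little arithmetic. The following is a simplified sketch, not code from Ceph: the names `ByteRange`, `page_aligned_range`, `overlaps`, and `label_writeback_covers_superblock` are hypothetical, and it models write-back as "the whole page containing the buffered write is flushed", which is an assumption about the page cache rather than something the patch spells out.

```cpp
#include <cstdint>

// Half-open byte range [begin, end) on the block device.
struct ByteRange {
  uint64_t begin;
  uint64_t end;
};

// Expand [offset, offset + length) outward to page boundaries.
// page_size is assumed to be a power of two.
inline ByteRange page_aligned_range(uint64_t offset, uint64_t length,
                                    uint64_t page_size) {
  const uint64_t mask = page_size - 1;
  ByteRange r;
  r.begin = offset & ~mask;
  r.end = (offset + length + mask) & ~mask;
  return r;
}

// True when the two half-open ranges share at least one byte.
inline bool overlaps(ByteRange a, ByteRange b) {
  return a.begin < b.end && b.begin < a.end;
}

// Does flushing the buffered label write (0x0~1000) touch the
// superblock (0x2000~1000) for the given kernel page size?
inline bool label_writeback_covers_superblock(uint64_t page_size) {
  const ByteRange superblock = {0x2000, 0x3000};
  return overlaps(page_aligned_range(0x0, 0x1000, page_size), superblock);
}
```

Under this model, `label_writeback_covers_superblock(65536)` is true and `label_writeback_covers_superblock(4096)` is false, which matches the report that the failure appears with 64K pages.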


From 7672ceb4f09c81ee7a2d5e8672e2c402c3206b4e Mon Sep 17 00:00:00 2001
From: luo rixin <luorixin@huawei.com>
Date: Wed, 14 Sep 2022 19:50:01 +0800
Subject: [PATCH] os/bluestore: use direct write in
BlueStore::_write_bdev_label
On AArch64 with a 64K kernel page size, deploying an OSD occasionally
fails with "OSD::init(): unable to read osd superblock".
BlueStore uses a direct write to write the superblock at 0x2000~1000,
while BlueStore::_write_bdev_label uses a buffered write to write the
label at 0x0~1000. The OS flushes buffered writes aligned to the page
size, so the flush overwrites the superblock (0x2000~1000). Use a
direct write to avoid overwriting the superblock.
Fixes: https://tracker.ceph.com/issues/57537
Signed-off-by: luo rixin <luorixin@huawei.com>
---
src/os/bluestore/BlueStore.cc | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/src/os/bluestore/BlueStore.cc b/src/os/bluestore/BlueStore.cc
index 8b893be79d1..534fe780f27 100644
--- a/src/os/bluestore/BlueStore.cc
+++ b/src/os/bluestore/BlueStore.cc
@@ -5104,13 +5104,14 @@ int BlueStore::_write_bdev_label(CephContext *cct,
z.zero();
bl.append(std::move(z));
- int fd = TEMP_FAILURE_RETRY(::open(path.c_str(), O_WRONLY|O_CLOEXEC));
+ int fd = TEMP_FAILURE_RETRY(::open(path.c_str(), O_WRONLY|O_CLOEXEC|O_DIRECT));
if (fd < 0) {
fd = -errno;
derr << __func__ << " failed to open " << path << ": " << cpp_strerror(fd)
<< dendl;
return fd;
}
+ bl.rebuild_aligned_size_and_memory(BDEV_LABEL_BLOCK_SIZE, BDEV_LABEL_BLOCK_SIZE, IOV_MAX);
int r = bl.write_fd(fd);
if (r < 0) {
derr << __func__ << " failed to write to " << path
--
2.20.1.windows.1
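
The patch pairs `O_DIRECT` with a call that realigns the buffer because direct I/O generally requires the buffer address and the transfer length to be block-aligned. Outside of Ceph's `bufferlist`, the same idea can be sketched with plain POSIX calls. This is a Linux-only illustration under stated assumptions: `kBlockSize`, `make_aligned_block`, and `write_block_direct` are hypothetical names, and 4096 is assumed as the block size to match the 0x0~1000 label range in the commit message.

```cpp
#include <fcntl.h>
#include <unistd.h>
#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Assumed block size, standing in for BDEV_LABEL_BLOCK_SIZE.
constexpr size_t kBlockSize = 4096;

// Return a zero-filled, kBlockSize-aligned buffer of kBlockSize bytes with
// the payload copied to its start, or nullptr if the payload does not fit
// or allocation fails. The caller releases it with free().
inline void* make_aligned_block(const void* payload, size_t len) {
  if (len > kBlockSize) return nullptr;
  void* buf = nullptr;
  if (posix_memalign(&buf, kBlockSize, kBlockSize) != 0) return nullptr;
  std::memset(buf, 0, kBlockSize);
  std::memcpy(buf, payload, len);
  return buf;
}

// Write one aligned block at offset 0 of path, bypassing the page cache.
// Returns 0 on success or a negative errno value.
inline int write_block_direct(const char* path, const void* payload,
                              size_t len) {
  void* buf = make_aligned_block(payload, len);
  if (!buf) return -EINVAL;
  int fd = ::open(path, O_WRONLY | O_CLOEXEC | O_DIRECT);
  if (fd < 0) {
    int err = -errno;
    std::free(buf);
    return err;
  }
  ssize_t n = ::pwrite(fd, buf, kBlockSize, 0);
  int r = 0;
  if (n < 0) {
    r = -errno;
  } else if (static_cast<size_t>(n) != kBlockSize) {
    r = -EIO;  // short write
  }
  ::close(fd);
  std::free(buf);
  return r;
}
```

Some filesystems reject `O_DIRECT` at open or write time, so a caller would need to handle a negative return from `write_block_direct`.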