From 021ba3614f285e62d9f480b438ff374694bea11d Mon Sep 17 00:00:00 2001
From: Wenchao Hao
Date: Tue, 17 Jan 2023 17:27:06 +0800
Subject: [PATCH] iscsid: stop connection for recovery if error is not timeout in iscsi_login_eh

While iscsid is reopening a connection, the reopen process may have
successfully called bind_conn and reached iscsi_session_set_params() to
set parameters. If the iSCSI target triggers another error event at this
point (for example, by closing the socket connection between initiator
and target), the kernel runs its error handler, sets the connection's
state to ISCSI_CONN_FAILED, and sets the ISCSI_CLS_CONN_BIT_CLEANUP bit
in the kernel's iscsi_cls_conn->flags. This makes iscsid's
iscsi_session_set_params() fail with ENOTCONN, so iscsid calls
iscsi_login_eh() to handle the error.

iscsid then sees that conn->state is ISCSI_CONN_STATE_XPT_WAIT and
session->r_stage is R_STAGE_SESSION_REOPEN, so it calls
session_conn_reopen() with do_stop set to 0. That does not make the
kernel call iscsi_if_stop_conn(), so the ISCSI_CLS_CONN_BIT_CLEANUP bit
in iscsi_cls_conn->flags is never cleared.

The reopen then falls into an infinite cycle that looks like this:

iscsi_conn_connect -> bind_conn(failed with ENOTCONN)

        ^                       |
        |                       |
        |                       v

  session_conn_reopen(with do_stop set to 0)

The symptom is that iscsid keeps logging "can't bind conn x:0 to
session x, retcode -107 (115)" and the session never recovers.

Fix this by checking the error type in iscsi_login_eh(): if the error
is not a timeout, call session_conn_reopen() with do_stop set to
STOP_CONN_RECOVER.

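For illustration only, the flag selection described above can be modelled
as a small standalone C fragment. This is a sketch under stated
assumptions, not the code in usr/initiator.c: the DEMO_* names and their
numeric values are stand-ins for the real open-iscsi definitions, and
pick_stop_flag() is a hypothetical helper that does not exist in the tree.

```c
/*
 * Stand-in values for illustration only; the real definitions of
 * ISCSI_ERR_TRANS_TIMEOUT and STOP_CONN_RECOVER live in the
 * open-iscsi headers and may use different numbers.
 */
enum demo_err {
	DEMO_ERR_TRANS_TIMEOUT = 1,	/* models ISCSI_ERR_TRANS_TIMEOUT */
	DEMO_ERR_NOT_CONNECTED = 2	/* models a non-timeout failure (ENOTCONN case) */
};

#define DEMO_STOP_NONE		0	/* models do_stop == 0 */
#define DEMO_STOP_CONN_RECOVER	3	/* models STOP_CONN_RECOVER */

/*
 * Hypothetical helper: choose the do_stop argument that would be passed
 * to session_conn_reopen(). Only a timeout keeps the old behaviour of
 * reopening without stopping the connection first.
 */
int pick_stop_flag(int err)
{
	int stop_flag = DEMO_STOP_NONE;

	if (err != DEMO_ERR_TRANS_TIMEOUT)
		stop_flag = DEMO_STOP_CONN_RECOVER;
	return stop_flag;
}
```

With these stand-ins, a timeout leaves the flag at DEMO_STOP_NONE, while
any other error selects DEMO_STOP_CONN_RECOVER, which is the case that
lets the kernel clear its cleanup bit before the next bind attempt.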
Signed-off-by: Wenchao Hao
---
 ...ection-for-recovery-if-error-is-not-.patch | 68 +++++++++++++++++++
 open-iscsi.spec                               |  7 +-
 2 files changed, 73 insertions(+), 2 deletions(-)
 create mode 100644 0027-iscsid-stop-connection-for-recovery-if-error-is-not-.patch

diff --git a/0027-iscsid-stop-connection-for-recovery-if-error-is-not-.patch b/0027-iscsid-stop-connection-for-recovery-if-error-is-not-.patch
new file mode 100644
index 0000000..5b1ccd3
--- /dev/null
+++ b/0027-iscsid-stop-connection-for-recovery-if-error-is-not-.patch
@@ -0,0 +1,68 @@
+From 9f2074568e6c39f85c9d948cb3b869f4fc774695 Mon Sep 17 00:00:00 2001
+From: Wenchao Hao <73930449+wenchao-hao@users.noreply.github.com>
+Date: Thu, 12 Jan 2023 11:10:05 +0800
+Subject: iscsid: stop connection for recovery if error is not
+ timeout in iscsi_login_eh (#388)
+
+When iscsid is reopening a connection, and the reopen process has succeed
+to call bind_conn and comes to iscsi_session_set_params() to set
+parameters. If the iscsi target trigger another error event(such as
+close the socket connection between initiator and target) at this time,
+kernel would perform the error handler and set connection's state to
+ISCSI_CONN_FAILED, and set kernel iscsi_cls_conn->flags'
+ISCSI_CLS_CONN_BIT_CLEANUP bit. Which would make iscsid's
+iscsi_session_set_params() failed with ENOTCONN, so iscsi_login_eh()
+would be called by iscsid to handle this error.
+
+Now iscsid see conn->state is ISCSI_CONN_STATE_XPT_WAIT and
+session->r_stage is R_STAGE_SESSION_REOPEN, so it would call
+session_conn_reopen() with do_stop set to 0, which would not trigger
+kernel to call iscsi_if_stop_conn() to clear kernel data struct
+iscsi_cls_conn->flags' ISCSI_CLS_CONN_BIT_CLEANUP bit.
+
+The reopen would fall into an infinite cycle which looks like
+following:
+
+iscsi_conn_connect -> bind_conn(failed with ENOTCONN)
+
+        ^                       |
+        |                       |
+        |                       v
+
+  session_conn_reopwn(with do_stop set to 0)
+
+The phenomenon is iscsid would always report log "can't bind conn x:0
+to session x, retcode -107 (115)" and the session would not recovery.
+
+Fix this issue by checking error type in iscsi_login_eh(), if the error
+type is not timeout, make sure we would call session_conn_reopen() with
+do_stop set to STOP_CONN_RECOVER.
+
+Signed-off-by: Wenchao Hao
+---
+ usr/initiator.c | 9 +++++++--
+ 1 file changed, 7 insertions(+), 2 deletions(-)
+
+diff --git a/usr/initiator.c b/usr/initiator.c
+index 56bf38b..9c48dd5 100644
+--- a/usr/initiator.c
++++ b/usr/initiator.c
+@@ -735,8 +735,13 @@ static void iscsi_login_eh(struct iscsi_conn *conn, struct queue_task *qtask,
+ 				session_conn_shutdown(conn, qtask, err);
+ 				break;
+ 			}
+-			/* timeout during reopen connect. try again */
+-			session_conn_reopen(conn, qtask, 0);
++			/*
++			 * stop connection for recovery if error is not
++			 * timeout
++			 */
++			if (err != ISCSI_ERR_TRANS_TIMEOUT)
++				stop_flag = STOP_CONN_RECOVER;
++			session_conn_reopen(conn, qtask, stop_flag);
+ 			break;
+ 		case R_STAGE_SESSION_CLEANUP:
+ 			session_conn_shutdown(conn, qtask, err);
+-- 
+2.35.3
+
diff --git a/open-iscsi.spec b/open-iscsi.spec
index 591a7be..98b13ef 100644
--- a/open-iscsi.spec
+++ b/open-iscsi.spec
@@ -4,7 +4,7 @@
 
 Name: open-iscsi
 Version: 2.1.5
-Release: 11
+Release: 12
 Summary: ISCSI software initiator daemon and utility programs
 License: GPLv2+ and BSD
 URL: http://www.open-iscsi.com
@@ -35,7 +35,7 @@ patch23: 0023-Remove-unused-fwparam_ibft.-ch-files-in-fwparam_ibft.patch
 patch24: 0024-Fix-a-possible-passing-null-pointer-in-usr-iface.c-3.patch
 patch25: 0025-iscsid-iscsiuio-fix-OOM-adjustment-377.patch
 patch26: 0026-iscsid-clear-scanning-thread-s-PR_SET_IO_FLUSHER-fla.patch
-
+patch27: 0027-iscsid-stop-connection-for-recovery-if-error-is-not-.patch
 BuildRequires: flex bison doxygen kmod-devel systemd-units gcc git isns-utils-devel systemd-devel
 BuildRequires: autoconf automake libtool libmount-devel openssl-devel pkg-config
 
@@ -162,6 +162,9 @@ fi
 %{_mandir}/man8/*
 
 %changelog
+* Tue Jan 17 2023 haowenchao - 2.1.5-12
+- iscsid: stop connection for recovery if error is not timeout in iscsi_login_eh
+
 * Tue Jan 17 2023 haowenchao - 2.1.5-11
 - iscsid: clear scanning thread's PR_SET_IO_FLUSHER flag