Wenchao Hao 021ba3614f iscsid: stop connection for recovery if error is not timeout in iscsi_login_eh
When iscsid reopens a connection, the reopen process may succeed in
calling bind_conn and proceed to iscsi_session_set_params() to set
parameters. If the iSCSI target triggers another error event at this
point (for example, by closing the socket connection between initiator
and target), the kernel runs its error handler, sets the connection's
state to ISCSI_CONN_FAILED, and sets the ISCSI_CLS_CONN_BIT_CLEANUP bit
in the kernel's iscsi_cls_conn->flags. This makes iscsid's
iscsi_session_set_params() fail with ENOTCONN, so iscsid calls
iscsi_login_eh() to handle the error.

At this point iscsid sees that conn->state is ISCSI_CONN_STATE_XPT_WAIT
and session->r_stage is R_STAGE_SESSION_REOPEN, so it calls
session_conn_reopen() with do_stop set to 0. With do_stop set to 0, the
kernel is not asked to call iscsi_if_stop_conn(), so the
ISCSI_CLS_CONN_BIT_CLEANUP bit in the kernel's iscsi_cls_conn->flags is
never cleared.

The reopen then falls into an infinite cycle that looks like the
following:

iscsi_conn_connect -> bind_conn(failed with ENOTCONN)
         ^                     |
         |                     |
         |                     v
    session_conn_reopen(with do_stop set to 0)

The symptom is that iscsid keeps logging "can't bind conn x:0 to
session x, retcode -107 (115)" and the session never recovers.
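
The cycle can be illustrated with a small standalone model. This is only
a sketch, not open-iscsi or kernel code: struct model_conn,
model_bind_conn(), model_session_conn_reopen() and
attempts_until_bound() are hypothetical names, and the single
cleanup_pending flag stands in for the ISCSI_CLS_CONN_BIT_CLEANUP bit.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the kernel-side connection state. */
struct model_conn {
    bool cleanup_pending; /* stands in for ISCSI_CLS_CONN_BIT_CLEANUP */
};

/* Binding fails with -ENOTCONN while the cleanup bit is still set. */
int model_bind_conn(const struct model_conn *conn)
{
    return conn->cleanup_pending ? -ENOTCONN : 0;
}

/* Only a nonzero do_stop makes the (modelled) kernel stop the
 * connection, which is the step that clears the cleanup bit. */
void model_session_conn_reopen(struct model_conn *conn, int do_stop)
{
    if (do_stop)
        conn->cleanup_pending = false;
}

/* Returns the attempt number on which bind succeeds, or -1 if it
 * never succeeds within max_attempts. */
int attempts_until_bound(int do_stop, int max_attempts)
{
    struct model_conn conn = { .cleanup_pending = true };

    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        if (model_bind_conn(&conn) == 0)
            return attempt;
        model_session_conn_reopen(&conn, do_stop);
    }
    return -1;
}
```

In this model, attempts_until_bound(0, n) never succeeds for any n,
while a nonzero do_stop lets the second bind attempt succeed.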

Fix this by checking the error type in iscsi_login_eh(): if the error
is not a timeout, call session_conn_reopen() with do_stop set to
STOP_CONN_RECOVER.
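
The decision described above might be sketched as follows. This is an
illustration under stated assumptions, not the actual patch: enum
login_err, its members, DO_STOP_NONE and choose_do_stop() are
hypothetical, and the numeric value given to STOP_CONN_RECOVER here is
illustrative only.

```c
/* Hypothetical error classification; the real iscsid error codes
 * are not reproduced here. */
enum login_err {
    LOGIN_ERR_TIMEOUT,
    LOGIN_ERR_OTHER
};

/* DO_STOP_NONE is a hypothetical name for "do_stop set to 0".
 * The value of STOP_CONN_RECOVER below is illustrative only. */
enum do_stop_mode {
    DO_STOP_NONE = 0,
    STOP_CONN_RECOVER
};

/* Pick the do_stop argument for session_conn_reopen() when a login
 * error arrives during a session reopen: a timeout keeps the old
 * behaviour, any other error asks for the connection to be stopped
 * for recovery so the kernel cleanup state is reset. */
int choose_do_stop(enum login_err err)
{
    if (err == LOGIN_ERR_TIMEOUT)
        return DO_STOP_NONE;
    return STOP_CONN_RECOVER;
}
```

A caller in this sketch would pass the result directly as the do_stop
argument, so only the timeout case still reopens without stopping the
connection.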

Signed-off-by: Wenchao Hao <haowenchao@huawei.com>
2023-01-17 17:28:21 +08:00
2020-05-08 20:09:33 +08:00