From: Greg KH on
2.6.33-stable review patch. If anyone has any objections, please let us know.

------------------

From: Mike Christie <michaelc(a)cs.wisc.edu>

commit 4ae0a6c15efcc37e94e3f30e3533bdec03c53126 upstream.

We could be failing/stopping a connection because libiscsi started
recovery/cleanup while the xmit path or the scsi eh thread path
is dropping the connection at the same time.
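
A simplified, single-threaded replay of that interleaving may help when
reviewing. This is only a sketch: every model_* name is hypothetical, and
the check in model_conn_failure() is an assumption about how the
connection-failure path behaves, since that path is not part of this patch.

```c
/* Hypothetical model, not libiscsi code. */
enum model_state { MODEL_LOGGED_IN, MODEL_IN_RECOVERY, MODEL_FAILED };

#define MODEL_STOP_RECOVER 1

struct model_conn {
	enum model_state session_state;
	int stop_stage;		/* 0 means no stop is in progress */
};

/* First locked section of the recovery path. */
static void model_recovery_begin(struct model_conn *c, int patched)
{
	c->session_state = MODEL_IN_RECOVERY;
	if (patched)
		c->stop_stage = MODEL_STOP_RECOVER;	/* where the patch sets it */
}

/* Second locked section, entered after the lock was dropped and retaken. */
static void model_recovery_finish(struct model_conn *c, int patched)
{
	if (!patched)
		c->stop_stage = MODEL_STOP_RECOVER;	/* pre-patch location */
}

/*
 * Assumed behaviour of the concurrent failure path: mark the session
 * failed only when no stop is already in progress.
 */
static void model_conn_failure(struct model_conn *c)
{
	if (c->stop_stage == 0)
		c->session_state = MODEL_FAILED;
}

/* Replay: begin, failure path runs in the unlocked window, finish. */
static enum model_state model_replay(int patched)
{
	struct model_conn c = { MODEL_LOGGED_IN, 0 };

	model_recovery_begin(&c, patched);
	model_conn_failure(&c);
	model_recovery_finish(&c, patched);
	return c.session_state;
}
```

Under these assumptions model_replay(0) ends in MODEL_FAILED, while
model_replay(1) stays in MODEL_IN_RECOVERY.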

As a result, session->state gets set to failed instead of in
recovery. The session is then never blocked, so the replacement
timeout never gets started, and the IO is only failed when
scsi_softirq_done sees that the cmd has been running for
(cmd->allowed + 1) * rq->timeout secs.
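
To give a feel for the size of that delay, the expression can be evaluated
with sample numbers. The helper name below is hypothetical, and the values
5 and 30 are illustrative assumptions, not values taken from this patch.

```c
/*
 * Worst-case seconds before the command is failed when the session
 * was never blocked: (allowed + 1) * timeout, as in the text above.
 */
static unsigned int worst_case_delay_secs(unsigned int allowed,
					  unsigned int timeout_secs)
{
	return (allowed + 1) * timeout_secs;
}
```

With the assumed values, worst_case_delay_secs(5, 30) evaluates to 180
seconds.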

We used to fail the IO right away, so users now see a long
delay when using dm-multipath. This problem was introduced in
2.6.28.

Signed-off-by: Mike Christie <michaelc(a)cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley(a)suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)suse.de>

---
drivers/scsi/libiscsi.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -3027,14 +3027,15 @@ static void iscsi_start_session_recovery
 		session->state = ISCSI_STATE_TERMINATE;
 	else if (conn->stop_stage != STOP_CONN_RECOVER)
 		session->state = ISCSI_STATE_IN_RECOVERY;
+
+	old_stop_stage = conn->stop_stage;
+	conn->stop_stage = flag;
 	spin_unlock_bh(&session->lock);
 
 	del_timer_sync(&conn->transport_timer);
 	iscsi_suspend_tx(conn);
 
 	spin_lock_bh(&session->lock);
-	old_stop_stage = conn->stop_stage;
-	conn->stop_stage = flag;
 	conn->c_stage = ISCSI_CONN_STOPPED;
 	spin_unlock_bh(&session->lock);


