|
From: Stefan Roscher on 7 May 2008 07:30 This is necessary because, in a multicore environment, a race between uverbs async handler and destroy QP could occur. Signed-off-by: Stefan Roscher <stefan.roscher at de.ibm.com> --- We are not sure if this should be fixed in the driver or in uverbs itself. Roland, what's your opinion about this? drivers/infiniband/hw/ehca/ehca_classes.h | 2 ++ drivers/infiniband/hw/ehca/ehca_irq.c | 4 ++++ drivers/infiniband/hw/ehca/ehca_qp.c | 5 +++++ 3 files changed, 11 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/hw/ehca/ehca_classes.h b/drivers/infiniband/hw/ehca/ehca_classes.h index 00bab60..1e9e99a 100644 --- a/drivers/infiniband/hw/ehca/ehca_classes.h +++ b/drivers/infiniband/hw/ehca/ehca_classes.h @@ -192,6 +192,8 @@ struct ehca_qp { int mtu_shift; u32 message_count; u32 packet_count; + atomic_t nr_events; /* events seen */ + wait_queue_head_t wait_completion; }; #define IS_SRQ(qp) (qp->ext_type == EQPT_SRQ) diff --git a/drivers/infiniband/hw/ehca/ehca_irq.c b/drivers/infiniband/hw/ehca/ehca_irq.c index ca5eb0c..ce1ab05 100644 --- a/drivers/infiniband/hw/ehca/ehca_irq.c +++ b/drivers/infiniband/hw/ehca/ehca_irq.c @@ -204,6 +204,8 @@ static void qp_event_callback(struct ehca_shca *shca, u64 eqe, read_lock(&ehca_qp_idr_lock); qp = idr_find(&ehca_qp_idr, token); + if (qp) + atomic_inc(&qp->nr_events); read_unlock(&ehca_qp_idr_lock); if (!qp) @@ -223,6 +225,8 @@ static void qp_event_callback(struct ehca_shca *shca, u64 eqe, if (fatal && qp->ext_type == EQPT_SRQBASE) dispatch_qp_event(shca, qp, IB_EVENT_QP_LAST_WQE_REACHED); + if (atomic_dec_and_test(&qp->nr_events)) + wake_up(&qp->wait_completion); return; } diff --git a/drivers/infiniband/hw/ehca/ehca_qp.c b/drivers/infiniband/hw/ehca/ehca_qp.c index 18fba92..d550200 100644 --- a/drivers/infiniband/hw/ehca/ehca_qp.c +++ b/drivers/infiniband/hw/ehca/ehca_qp.c @@ -566,6 +566,8 @@ static struct ehca_qp *internal_create_qp( return ERR_PTR(-ENOMEM); } + atomic_set(&my_qp->nr_events, 0); + init_waitqueue_head(&my_qp->wait_completion); spin_lock_init(&my_qp->spinlock_s); spin_lock_init(&my_qp->spinlock_r); my_qp->qp_type = qp_type; @@ -1934,6 +1936,9 @@ static int internal_destroy_qp(struct ib_device *dev, struct ehca_qp *my_qp, idr_remove(&ehca_qp_idr, my_qp->token); write_unlock_irqrestore(&ehca_qp_idr_lock, flags); + /* now wait until all pending events have completed */ + wait_event(my_qp->wait_completion, !atomic_read(&my_qp->nr_events)); + h_ret = hipz_h_destroy_qp(shca->ipz_hca_handle, my_qp); if (h_ret != H_SUCCESS) { ehca_err(dev, "hipz_h_destroy_qp() failed h_ret=%li " -- 1.5.5 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Roland Dreier on 7 May 2008 11:40 > We are not sure if this should be fixed in the driver or in uverbs itself. > Roland, what's your opinion about this? Would be nice to be able to fix it in uverbs but I don't see how. In particular a kernel consumer has to have the same guarantee that no async events will come in after destroy QP returns. And I don't see any way generic code can provide a guarantee about what low-level driver code may do internally. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Stefan Roscher on 7 May 2008 12:10 On Wednesday 07 May 2008 17:32:03 Roland Dreier wrote: > > We are not sure if this should be fixed in the driver or in uverbs itself. > > Roland, what's your opinion about this? > > Would be nice to be able to fix it in uverbs but I don't see how. In > particular a kernel consumer has to have the same guarantee that no > async events will come in after destroy QP returns. And I don't see any > way generic code can provide a guarantee about what low-level driver > code may do internally. > I agree, that's why I posted the driver fix first. So, will you apply it next? Regards Stefan -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Roland Dreier on 7 May 2008 14:10 > I agree, that's why I posted the driver fix first. Glad we agree :) I thought about it a little more and really convinced myself that there is no good way for generic code to handle this race. > So, will you apply it next? Yes, will apply it shortly. - R. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Roland Dreier on 7 May 2008 15:20 So I applied this, but thinking about it further, do you (theoretically at least) have the same problem with CQ and SRQ async events? - R. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
|
Pages: 1 Prev: [PATCH 1/9] viafb: VIA Frame Buffer Device Driver Next: [PATCH 3/9] Fix a printk call |