[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH] xen-netback: fix race condition on XenBus disconnect
On 03/03/17 09:18, Paul Durrant wrote: >> -----Original Message----- >> From: Igor Druzhinin [mailto:igor.druzhinin@xxxxxxxxxx] >> Sent: 02 March 2017 22:57 >> To: netdev@xxxxxxxxxxxxxxx; xen-devel@xxxxxxxxxxxxxxxxxxxx >> Cc: Paul Durrant <Paul.Durrant@xxxxxxxxxx>; jgross@xxxxxxxx; Wei Liu >> <wei.liu2@xxxxxxxxxx>; Igor Druzhinin <igor.druzhinin@xxxxxxxxxx> >> Subject: [PATCH] xen-netback: fix race condition on XenBus disconnect >> >> In some cases during XenBus disconnect event handling and subsequent >> queue resource release there may be some TX handlers active on >> other processors. Use RCU in order to synchronize with them. >> >> Signed-off-by: Igor Druzhinin <igor.druzhinin@xxxxxxxxxx> >> --- >> drivers/net/xen-netback/interface.c | 13 ++++++++----- >> drivers/net/xen-netback/xenbus.c | 17 +++++++---------- >> 2 files changed, 15 insertions(+), 15 deletions(-) >> >> diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen- >> netback/interface.c >> index a2d32676..32e2cc6 100644 >> --- a/drivers/net/xen-netback/interface.c >> +++ b/drivers/net/xen-netback/interface.c >> @@ -164,7 +164,7 @@ static int xenvif_start_xmit(struct sk_buff *skb, struct >> net_device *dev) >> { >> struct xenvif *vif = netdev_priv(dev); >> struct xenvif_queue *queue = NULL; >> - unsigned int num_queues = vif->num_queues; > > Do you not need an rcu_read_lock() around this and use of the num_queues > value (as you have below)? > >> + unsigned int num_queues = rcu_dereference(vif)->num_queues; >> u16 index; >> struct xenvif_rx_cb *cb; >> >> @@ -221,18 +221,21 @@ static struct net_device_stats >> *xenvif_get_stats(struct net_device *dev) >> { >> struct xenvif *vif = netdev_priv(dev); >> struct xenvif_queue *queue = NULL; >> + unsigned int num_queues; >> u64 rx_bytes = 0; >> u64 rx_packets = 0; >> u64 tx_bytes = 0; >> u64 tx_packets = 0; >> unsigned int index; >> >> - spin_lock(&vif->lock); >> - if (vif->queues == NULL) >> + rcu_read_lock(); >> + >> + num_queues = rcu_dereference(vif)->num_queues; >> + if (num_queues < 1) >> goto out; >> >> /* Aggregate tx and rx stats from each queue */ >> - for (index = 0; index < vif->num_queues; ++index) { >> + for (index = 0; index < num_queues; ++index) { >> queue = &vif->queues[index]; >> rx_bytes += queue->stats.rx_bytes; >> rx_packets += queue->stats.rx_packets; >> @@ -241,7 +244,7 @@ static struct net_device_stats >> *xenvif_get_stats(struct net_device *dev) >> } >> >> out: >> - spin_unlock(&vif->lock); >> + rcu_read_unlock(); >> >> vif->dev->stats.rx_bytes = rx_bytes; >> vif->dev->stats.rx_packets = rx_packets; >> diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen- >> netback/xenbus.c >> index d2d7cd9..76efb01 100644 >> --- a/drivers/net/xen-netback/xenbus.c >> +++ b/drivers/net/xen-netback/xenbus.c >> @@ -495,26 +495,23 @@ static void backend_disconnect(struct >> backend_info *be) >> struct xenvif *vif = be->vif; >> >> if (vif) { >> + unsigned int num_queues = vif->num_queues; >> unsigned int queue_index; >> - struct xenvif_queue *queues; >> >> xen_unregister_watchers(vif); >> #ifdef CONFIG_DEBUG_FS >> xenvif_debugfs_delif(vif); >> #endif /* CONFIG_DEBUG_FS */ >> xenvif_disconnect_data(vif); >> - for (queue_index = 0; >> - queue_index < vif->num_queues; >> - ++queue_index) >> - xenvif_deinit_queue(&vif->queues[queue_index]); >> >> - spin_lock(&vif->lock); >> - queues = vif->queues; >> vif->num_queues = 0; >> - vif->queues = NULL; >> - spin_unlock(&vif->lock); >> + synchronize_net(); > > So, num_queues is your RCU protected value, rather than the queues pointer, > in which case I think you probably need to change code such as > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/net/xen-netback/netback.c?id=refs/tags/v4.10#n216 > > to be gated on num_queues. > > Also shouldn't xenvif_up(), xenvif_down() and xenvif_get_ethtool_stats() not > be using rcu_read_lock() and rcu_dereference() of num_queues as well? > > Paul > xenvif_up() and xenvif_down() are called under rtnl_lock. xenvif_get_ethtool_stats() is a good candidate. I'm not sure about xenvif_fatal_tx_err. Is it actually called from already protected regions? Igor >> >> - vfree(queues); >> + for (queue_index = 0; queue_index < num_queues; >> ++queue_index) >> + xenvif_deinit_queue(&vif->queues[queue_index]); >> + >> + vfree(vif->queues); >> + vif->queues = NULL; >> >> xenvif_disconnect_ctrl(vif); >> } >> -- >> 1.8.3.1 > _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx https://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |