[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH] xen-netback: fix race condition on XenBus disconnect
> -----Original Message----- > From: Igor Druzhinin > Sent: 03 March 2017 13:54 > To: Paul Durrant <Paul.Durrant@xxxxxxxxxx>; netdev@xxxxxxxxxxxxxxx; xen- > devel@xxxxxxxxxxxxxxxxxxxx > Cc: jgross@xxxxxxxx; Wei Liu <wei.liu2@xxxxxxxxxx> > Subject: Re: [PATCH] xen-netback: fix race condition on XenBus disconnect > > On 03/03/17 09:18, Paul Durrant wrote: > >> -----Original Message----- > >> From: Igor Druzhinin [mailto:igor.druzhinin@xxxxxxxxxx] > >> Sent: 02 March 2017 22:57 > >> To: netdev@xxxxxxxxxxxxxxx; xen-devel@xxxxxxxxxxxxxxxxxxxx > >> Cc: Paul Durrant <Paul.Durrant@xxxxxxxxxx>; jgross@xxxxxxxx; Wei Liu > >> <wei.liu2@xxxxxxxxxx>; Igor Druzhinin <igor.druzhinin@xxxxxxxxxx> > >> Subject: [PATCH] xen-netback: fix race condition on XenBus disconnect > >> > >> In some cases during XenBus disconnect event handling and subsequent > >> queue resource release there may be some TX handlers active on > >> other processors. Use RCU in order to synchronize with them. > >> > >> Signed-off-by: Igor Druzhinin <igor.druzhinin@xxxxxxxxxx> > >> --- > >> drivers/net/xen-netback/interface.c | 13 ++++++++----- > >> drivers/net/xen-netback/xenbus.c | 17 +++++++---------- > >> 2 files changed, 15 insertions(+), 15 deletions(-) > >> > >> diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen- > >> netback/interface.c > >> index a2d32676..32e2cc6 100644 > >> --- a/drivers/net/xen-netback/interface.c > >> +++ b/drivers/net/xen-netback/interface.c > >> @@ -164,7 +164,7 @@ static int xenvif_start_xmit(struct sk_buff *skb, > struct > >> net_device *dev) > >> { > >> struct xenvif *vif = netdev_priv(dev); > >> struct xenvif_queue *queue = NULL; > >> - unsigned int num_queues = vif->num_queues; > > > > Do you not need an rcu_read_lock() around this and use of the > num_queues value (as you have below)? > > > >> + unsigned int num_queues = rcu_dereference(vif)->num_queues; > >> u16 index; > >> struct xenvif_rx_cb *cb; > >> > >> @@ -221,18 +221,21 @@ static struct net_device_stats > >> *xenvif_get_stats(struct net_device *dev) > >> { > >> struct xenvif *vif = netdev_priv(dev); > >> struct xenvif_queue *queue = NULL; > >> + unsigned int num_queues; > >> u64 rx_bytes = 0; > >> u64 rx_packets = 0; > >> u64 tx_bytes = 0; > >> u64 tx_packets = 0; > >> unsigned int index; > >> > >> - spin_lock(&vif->lock); > >> - if (vif->queues == NULL) > >> + rcu_read_lock(); > >> + > >> + num_queues = rcu_dereference(vif)->num_queues; > >> + if (num_queues < 1) > >> goto out; > >> > >> /* Aggregate tx and rx stats from each queue */ > >> - for (index = 0; index < vif->num_queues; ++index) { > >> + for (index = 0; index < num_queues; ++index) { > >> queue = &vif->queues[index]; > >> rx_bytes += queue->stats.rx_bytes; > >> rx_packets += queue->stats.rx_packets; > >> @@ -241,7 +244,7 @@ static struct net_device_stats > >> *xenvif_get_stats(struct net_device *dev) > >> } > >> > >> out: > >> - spin_unlock(&vif->lock); > >> + rcu_read_unlock(); > >> > >> vif->dev->stats.rx_bytes = rx_bytes; > >> vif->dev->stats.rx_packets = rx_packets; > >> diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen- > >> netback/xenbus.c > >> index d2d7cd9..76efb01 100644 > >> --- a/drivers/net/xen-netback/xenbus.c > >> +++ b/drivers/net/xen-netback/xenbus.c > >> @@ -495,26 +495,23 @@ static void backend_disconnect(struct > >> backend_info *be) > >> struct xenvif *vif = be->vif; > >> > >> if (vif) { > >> + unsigned int num_queues = vif->num_queues; > >> unsigned int queue_index; > >> - struct xenvif_queue *queues; > >> > >> xen_unregister_watchers(vif); > >> #ifdef CONFIG_DEBUG_FS > >> xenvif_debugfs_delif(vif); > >> #endif /* CONFIG_DEBUG_FS */ > >> xenvif_disconnect_data(vif); > >> - for (queue_index = 0; > >> - queue_index < vif->num_queues; > >> - ++queue_index) > >> - xenvif_deinit_queue(&vif->queues[queue_index]); > >> > >> - spin_lock(&vif->lock); > >> - queues = vif->queues; > >> vif->num_queues = 0; > >> - vif->queues = NULL; > >> - spin_unlock(&vif->lock); > >> + synchronize_net(); > > > > So, num_queues is your RCU protected value, rather than the queues > pointer, in which case I think you probably need to change code such as > > > > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/ne > t/xen-netback/netback.c?id=refs/tags/v4.10#n216 > > > > to be gated on num_queues. > > > > Also shouldn't xenvif_up(), xenvif_down() and xenvif_get_ethtool_stats() > not be using rcu_read_lock() and rcu_dereference() of num_queues as well? > > > > Paul > > > > xenvif_up() and xenvif_down() are called under rtnl_lock. > xenvif_get_ethtool_stats() is a good candidate. I'm not sure about > xenvif_fatal_tx_err. Is it actually called from already protected regions? > IIRC it's called out of the NAPI worker so I'd hope turning carrier off would be sufficient. I still thing that testing for !num_queues everywhere rather than !queues in some places would be better, even if you don't actually need to rcu protect all the tests. Paul > Igor > > >> > >> - vfree(queues); > >> + for (queue_index = 0; queue_index < num_queues; > >> ++queue_index) > >> + xenvif_deinit_queue(&vif->queues[queue_index]); > >> + > >> + vfree(vif->queues); > >> + vif->queues = NULL; > >> > >> xenvif_disconnect_ctrl(vif); > >> } > >> -- > >> 1.8.3.1 > > _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx https://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |