[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH net-next 16/24] net: netkit, veth, tun, virt*: Use nested-BH locking for XDP redirect.
- To: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>, linux-kernel@xxxxxxxxxxxxxxx, netdev@xxxxxxxxxxxxxxx
- From: Daniel Borkmann <daniel@xxxxxxxxxxxxx>
- Date: Mon, 18 Dec 2023 09:52:05 +0100
- Cc: "David S. Miller" <davem@xxxxxxxxxxxxx>, Boqun Feng <boqun.feng@xxxxxxxxx>, Eric Dumazet <edumazet@xxxxxxxxxx>, Frederic Weisbecker <frederic@xxxxxxxxxx>, Ingo Molnar <mingo@xxxxxxxxxx>, Jakub Kicinski <kuba@xxxxxxxxxx>, Paolo Abeni <pabeni@xxxxxxxxxx>, Peter Zijlstra <peterz@xxxxxxxxxxxxx>, Thomas Gleixner <tglx@xxxxxxxxxxxxx>, Waiman Long <longman@xxxxxxxxxx>, Will Deacon <will@xxxxxxxxxx>, "K. Y. Srinivasan" <kys@xxxxxxxxxxxxx>, "Michael S. Tsirkin" <mst@xxxxxxxxxx>, Alexei Starovoitov <ast@xxxxxxxxxx>, Andrii Nakryiko <andrii@xxxxxxxxxx>, Dexuan Cui <decui@xxxxxxxxxxxxx>, Haiyang Zhang <haiyangz@xxxxxxxxxxxxx>, Hao Luo <haoluo@xxxxxxxxxx>, Jesper Dangaard Brouer <hawk@xxxxxxxxxx>, Jiri Olsa <jolsa@xxxxxxxxxx>, John Fastabend <john.fastabend@xxxxxxxxx>, Juergen Gross <jgross@xxxxxxxx>, KP Singh <kpsingh@xxxxxxxxxx>, Martin KaFai Lau <martin.lau@xxxxxxxxx>, Nikolay Aleksandrov <razor@xxxxxxxxxxxxx>, Song Liu <song@xxxxxxxxxx>, Stanislav Fomichev <sdf@xxxxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Wei Liu <wei.liu@xxxxxxxxxx>, Willem de Bruijn <willemdebruijn.kernel@xxxxxxxxx>, Xuan Zhuo <xuanzhuo@xxxxxxxxxxxxxxxxx>, Yonghong Song <yonghong.song@xxxxxxxxx>, bpf@xxxxxxxxxxxxxxx, virtualization@xxxxxxxxxxxxxxx, xen-devel@xxxxxxxxxxxxxxxxxxxx
- Delivery-date: Mon, 18 Dec 2023 09:58:37 +0000
- List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
Hi Sebastian,
On 12/15/23 6:07 PM, Sebastian Andrzej Siewior wrote:
The per-CPU variables used during bpf_prog_run_xdp() invocation and
later during xdp_do_redirect() rely on disabled BH for their protection.
Without locking in local_bh_disable() on PREEMPT_RT these data structure
require explicit locking.
This is a follow-up on the previous change which introduced
bpf_run_lock.redirect_lock and uses it now within drivers.
The simple way is to acquire the lock before bpf_prog_run_xdp() is
invoked and hold it until the end of function.
This does not always work because some drivers (cpsw, atlantic) invoke
xdp_do_flush() in the same context.
Acquiring the lock in bpf_prog_run_xdp() and dropping in
xdp_do_redirect() (without touching drivers) does not work because not
all driver, which use bpf_prog_run_xdp(), do support XDP_REDIRECT (and
invoke xdp_do_redirect()).
Ideally the minimal locking scope would be bpf_prog_run_xdp() +
xdp_do_redirect() and everything else (error recovery, DMA unmapping,
free/ alloc of memory, …) would happen outside of the locked section.
[...]
drivers/net/hyperv/netvsc_bpf.c | 1 +
drivers/net/netkit.c | 13 +++++++----
drivers/net/tun.c | 28 +++++++++++++----------
drivers/net/veth.c | 40 ++++++++++++++++++++-------------
drivers/net/virtio_net.c | 1 +
drivers/net/xen-netfront.c | 1 +
6 files changed, 52 insertions(+), 32 deletions(-)
[...]
Please exclude netkit from this set given it does not support XDP, but
instead only accepts tc BPF typed programs.
Thanks,
Daniel
diff --git a/drivers/net/netkit.c b/drivers/net/netkit.c
index 39171380ccf29..fbcf78477bda8 100644
--- a/drivers/net/netkit.c
+++ b/drivers/net/netkit.c
@@ -80,8 +80,15 @@ static netdev_tx_t netkit_xmit(struct sk_buff *skb, struct
net_device *dev)
netkit_prep_forward(skb, !net_eq(dev_net(dev), dev_net(peer)));
skb->dev = peer;
entry = rcu_dereference(nk->active);
- if (entry)
- ret = netkit_run(entry, skb, ret);
+ if (entry) {
+ scoped_guard(local_lock_nested_bh, &bpf_run_lock.redirect_lock)
{
+ ret = netkit_run(entry, skb, ret);
+ if (ret == NETKIT_REDIRECT) {
+ dev_sw_netstats_tx_add(dev, 1, len);
+ skb_do_redirect(skb);
+ }
+ }
+ }
switch (ret) {
case NETKIT_NEXT:
case NETKIT_PASS:
@@ -95,8 +102,6 @@ static netdev_tx_t netkit_xmit(struct sk_buff *skb, struct
net_device *dev)
}
break;
case NETKIT_REDIRECT:
- dev_sw_netstats_tx_add(dev, 1, len);
- skb_do_redirect(skb);
break;
case NETKIT_DROP:
default:
|