
Re: Issue: Networking performance in Xen VM on Arm64



On Mon, 24 Oct 2022, Leo Yan wrote:
> > If you are really running with the NULL scheduler, then I would
> > investigate why the vCPU has is_running == 0 because it should not
> > happen.
> 
> A correction on this: it was my mistake that I hadn't actually enabled
> the NULL scheduler in my code base.  After I enabled the NULL scheduler,
> the latency from context switching went away.
> 
>  8963              pub-338   [002]   217.777652: bprint: xennet_tx_setup_grant: id=60 ref=1340 offset=2 len=1514 TSC: 7892178799
>  8964              pub-338   [002]   217.777662: bprint: xennet_tx_setup_grant: id=82 ref=1362 offset=2050 len=1006 TSC: 7892179043
>  8965     ksoftirqd/12-75    [012]   255.466914: bprint: xenvif_tx_build_gops.constprop.0: id=60 ref=1340 offset=2 len=1514 TSC: 7892179731
>  8966     ksoftirqd/12-75    [012]   255.466915: bprint: xenvif_tx_build_gops.constprop.0: id=82 ref=1362 offset=2050 len=1006 TSC: 7892179761
>  8967              pub-338   [002]   217.778057: bprint: xennet_tx_setup_grant: id=60 ref=1340 offset=2050 len=1514 TSC: 7892188930
>  8968              pub-338   [002]   217.778072: bprint: xennet_tx_setup_grant: id=53 ref=1333 offset=2 len=1514 TSC: 7892189293
>  8969       containerd-2965  [012]   255.467304: bprint: xenvif_tx_build_gops.constprop.0: id=60 ref=1340 offset=2050 len=1514 TSC: 7892189479
>  8970       containerd-2965  [012]   255.467306: bprint: xenvif_tx_build_gops.constprop.0: id=53 ref=1333 offset=2 len=1514 TSC: 7892189533

I am having difficulty following the messages. Are the two points [a]
and [b] as described in the previous email shown here?


> So the xennet (Xen net frontend driver) and xenvif (net backend driver)
> work in parallel.  Please note, I didn't see any networking performance
> improvement after changing to the NULL scheduler.
> 
> Now I will compare the duration for the two directions: one direction is
> sending data from xennet to xenvif, and the other is the reverse.  It's
> very likely the two directions show a significant difference when sending
> data with grant tables; as you can see in the above log, it takes 20~30us
> to transmit a data block (we can use the id number and the grant table's
> ref number to match a data block between the xennet driver and the xenvif
> driver).
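
Pairing the frontend and backend entries by (id, ref) and computing the
TSC delta for each data block is easy to script. Below is a rough,
untested Python sketch; it assumes the exact bprint format shown above,
one record per line, a combined log ordered by TSC as in your excerpt,
and COUNTER_HZ is only a guess that should be replaced with the
platform's actual CNTFRQ_EL0 value:

#!/usr/bin/env python3
# Rough sketch: pair xennet (frontend) and xenvif (backend) trace entries
# by (id, grant ref) and report the TSC delta per data block.
import re
import sys

PATTERN = re.compile(
    r'(?P<func>xennet_tx_setup_grant|xenvif_tx_build_gops\S*):\s+'
    r'id=(?P<id>\d+)\s+ref=(?P<ref>\d+).*?TSC:\s+(?P<tsc>\d+)')

COUNTER_HZ = 25_000_000  # assumed generic counter frequency, adjust

def main(path):
    pending = {}  # (id, ref) -> TSC at which the frontend set up the grant
    with open(path) as trace:
        for line in trace:
            m = PATTERN.search(line)
            if not m:
                continue
            key = (m['id'], m['ref'])
            tsc = int(m['tsc'])
            if m['func'] == 'xennet_tx_setup_grant':
                pending[key] = tsc                 # frontend side
            elif key in pending:
                delta = tsc - pending.pop(key)     # backend picked it up
                print(f"id={key[0]} ref={key[1]}: {delta} ticks "
                      f"(~{delta * 1e6 / COUNTER_HZ:.1f} us)")

if __name__ == '__main__':
    main(sys.argv[1])

For the four blocks in the excerpt above this gives deltas of 932, 718,
549 and 240 ticks respectively, which you can convert to time once the
counter frequency is known.
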
> 
> > Now regarding the results, I can see the timestamp 3842008681 for
> > xennet_notify_tx_irq, 3842008885 for vgic_inject_irq, and 3842008935 for
> > vcpu_kick. Where is the corresponding TSC for the domain receiving the
> > notification?
> > 
> > Also for the other case, starting at 3842016505, can you please
> > highlight the timestamp for vgic_inject_irq, vcpu_kick, and also the one
> > for the domain receiving the notification?
> > 
> > The most interesting timestamps would be the timestamp for vcpu_kick in
> > "notification sending domain" [a], the timestamp for receiving the
> > interrupt in the Xen on pCPU for the "notification receiving domain"
> > [b], and the timestamp for the "notification receiving domain" getting
> > the notification [c].
> > 
> > If context switching really is the issue, then the interesting latency
> > would be between [a] and [b].
> 
> Understood.  I agree that I haven't dug into more detail; the main
> reason is that the Xen dmesg buffer becomes fragile once more logs are
> added.  For example, after I added a log in the function
> gicv3_send_sgi(), Xen got stuck during the boot phase, and adding logs
> in leave_hypervisor_to_guest() produces a huge amount of output (so I
> had to trace only the first 16 CPUs to mitigate the log flood).
> 
> I think it would be better to enable xentrace for the profiling on my
> side.  If I have any further data, I will share it.

Looking forward to it. Without more details it is impossible to identify
the source of the problem and fix it.
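
Once you have the trace data, the first numbers worth extracting are the
[a] -> [b] and [b] -> [c] latencies described above. Assuming all three
values come from the same Arm generic counter, it is just a matter of
subtracting them; here is a minimal sketch (only the [a] value is taken
from your earlier log, the [b] and [c] values are made up for
illustration, and COUNTER_HZ is again an assumption):

COUNTER_HZ = 25_000_000  # assumed counter frequency, adjust

def deltas(events):
    # events: list of (label, raw counter value) in chronological order
    for (prev_label, prev), (label, cur) in zip(events, events[1:]):
        ticks = cur - prev
        print(f"{prev_label} -> {label}: {ticks} ticks "
              f"(~{ticks * 1e6 / COUNTER_HZ:.1f} us)")

deltas([
    ("[a] vcpu_kick in the sending domain",     3842008935),  # from your log
    ("[b] IRQ taken in Xen on receiver's pCPU", 3842009500),  # hypothetical
    ("[c] receiving domain sees the event",     3842010100),  # hypothetical
])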



 

