[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Memory ordering question in the shutdown deferral code





On 21/09/2020 13:55, Durrant, Paul wrote:
-----Original Message-----
From: Julien Grall <julien@xxxxxxx>
Sent: 21 September 2020 12:41
To: Jan Beulich <jbeulich@xxxxxxxx>; Stefano Stabellini 
<sstabellini@xxxxxxxxxx>;
andrew.cooper3@xxxxxxxxxx; George Dunlap <george.dunlap@xxxxxxxxxx>; Durrant, 
Paul
<pdurrant@xxxxxxxxxxxx>
Cc: Xia, Hongyan <hongyxia@xxxxxxxxxx>; xen-devel@xxxxxxxxxxxxxxxxxxxx
Subject: RE: [EXTERNAL] Memory ordering question in the shutdown deferral code

CAUTION: This email originated from outside of the organization. Do not click 
links or open
attachments unless you can confirm the sender and know the content is safe.



(+ Xen-devel)

Sorry I forgot to CC xen-devel.

On 21/09/2020 12:38, Julien Grall wrote:
Hi all,

I have started to look at the deferral code (see
vcpu_start_shutdown_deferral()) because we need it for LiveUpdate and
Arm will soon use it.

The current implementation is using an smp_mb() to ensure ordering
between a write then a read. The code looks roughly (I have slightly
adapted it to make my question more obvious):

domain_shutdown()
      d->is_shutting_down = 1;
      smp_mb();
      if ( !vcpu0->defer_shutdown )
      {
        vcpu_pause_nosync(v);
        v->paused_for_shutdown = 1;
      }

vcpu_start_shutdown_deferral()
      vcpu0->defer_shutdown = 1;
      smp_mb();
      if ( unlikely(d->is_shutting_down) )
        vcpu_check_shutdown(v);

      return vcpu0->defer_shutdown;

smp_mb() should only guarantee ordering (this may be stronger on some
arch), so I think there is a race between the two functions.

It would be possible to pause the vCPU in domain_shutdown() because
vcpu0->defer_shutdown wasn't yet seen.

Equally, vcpu_start_shutdown_deferral() may not see d->is_shutting_down
and therefore Xen may continue to send the I/O. Yet the vCPU will be
paused so the I/O will never complete.


The barrier enforces global order, right?

It is not clear to me what you mean by "global ordering". This seems to suggest a very expensive synchronization barrier between all the processors.

From an arch-agnostic PoV, smp_mb() will enforce an ordering between loads/stores but it doesn't guarantee *when* they will be observed.

So, if domain_shutdown() pauses the vcpu then is_shutting_down must necessarily 
be visible all cpus.

That's not the guarantee provided by smp_mb() (see above).

Thus vcpu_start_shutdown referral will execute vcpu_check_shutdown(), so I'm 
having a hard time seeing the race.

I am not fully familiar with the IOREQ code, but it sounds to me this is
not the behavior that was intended. Can someone more familiar with the
code confirm it?


No indeed. I think emulation should complete before the vcpu pauses.

I think this part is racy at least on non-x86 platform as x86 seems to implement smp_mb() with a strong memory barrier (mfence).

Cheers,

--
Julien Grall



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.