|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: Memory ordering question in the shutdown deferral code
On Mon, 21 Sep 2020, Julien Grall wrote:
> On 21/09/2020 13:55, Durrant, Paul wrote:
> > > (+ Xen-devel)
> > >
> > > Sorry I forgot to CC xen-devel.
> > >
> > > On 21/09/2020 12:38, Julien Grall wrote:
> > > > Hi all,
> > > >
> > > > I have started to look at the deferral code (see
> > > > vcpu_start_shutdown_deferral()) because we need it for LiveUpdate and
> > > > Arm will soon use it.
> > > >
> > > > The current implementation is using an smp_mb() to ensure ordering
> > > > between a write then a read. The code looks roughly (I have slightly
> > > > adapted it to make my question more obvious):
> > > >
> > > > domain_shutdown()
> > > > d->is_shutting_down = 1;
> > > > smp_mb();
> > > > if ( !vcpu0->defer_shutdown )
> > > > {
> > > > vcpu_pause_nosync(v);
> > > > v->paused_for_shutdown = 1;
> > > > }
> > > >
> > > > vcpu_start_shutdown_deferral()
> > > > vcpu0->defer_shutdown = 1;
> > > > smp_mb();
> > > > if ( unlikely(d->is_shutting_down) )
> > > > vcpu_check_shutdown(v);
> > > >
> > > > return vcpu0->defer_shutdown;
> > > >
> > > > smp_mb() should only guarantee ordering (this may be stronger on some
> > > > arch), so I think there is a race between the two functions.
> > > >
> > > > It would be possible to pause the vCPU in domain_shutdown() because
> > > > vcpu0->defer_shutdown wasn't yet seen.
> > > >
> > > > Equally, vcpu_start_shutdown_deferral() may not see d->is_shutting_down
> > > > and therefore Xen may continue to send the I/O. Yet the vCPU will be
> > > > paused so the I/O will never complete.
> > > >
> >
> > The barrier enforces global order, right?
>
> It is not clear to me what you mean by "global ordering". This seems to
> suggest a very expensive synchronization barrier between all the processors.
>
> From an arch-agnostic PoV, smp_mb() will enforce an ordering between
> loads/stores but it doesn't guarantee *when* they will be observed.
>
> > So, if domain_shutdown() pauses the vcpu then is_shutting_down must
> > necessarily be visible all cpus.
>
> That's not the guarantee provided by smp_mb() (see above).
I simplified the code further to help us reason about it:
thread#1 | thread#2
|
1) WRITE A | WRITE B
2) BARRIER | BARRIER
3) READ B | READ A
I think it is (theoretically) possible for thread#1 to be at 1) and
about to do 2), while thread#2 goes ahead and does 1) 2) 3). By the time
thread#1 does 2), thread#2 has already completed the entire sequence.
If thread#2 has already done 2), and thread#1 is about to do 3), then I
think we are guaranteed that thread#1 will see the new value of B. Or
is this the core of the issue we are discussing?
If it works the way I wrote, then it would confirm Paul's view.
For your information I went to check what the Linux memory model has to
say about this. It says:
"smp_mb() guarantees to restore sequential consistency among accesses that use
READ_ONCE, WRITE_ONCE(), or stronger. For example, the following Linux-kernel
code would forbid non-SC outcomes"
It is interesting that they chose the words "restore sequential
consistency". It would be difficult to come up with an example that has
"sequential consistency" but doesn't work the way described earlier.
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |