Re: [Xen-devel] Guest start issue on ARM (maybe related to Credit2) [Was: Re: [xen-unstable test] 113807: regressions - FAIL]



On Mon, 2017-09-25 at 17:23 +0100, Julien Grall wrote:
> On 09/25/2017 03:07 PM, Dario Faggioli wrote:
> > Hey,
> 
> Hi Dario,
> 
Hi!

> > I don't see much in the logs, TBH, but both `xl vcpu-list' and the 'r'
> > debug key seem to suggest that vCPU 0 is running, while the other vCPUs
> > have never run... like it was an issue with secondary (v)CPU bringup.
> > 
> It definitely rings a bell: I saw a similar trace in July and I have
> been working on a potential fix since then.
> 
> Most of the time, when guest-start/debian.repeat fails, vCPU 0 is in a
> data/prefetch abort state. My guess is a latent cache bug that Credit2
> appears to expose.
> 
> Indeed, the arm32 kernel uses set/way cache flush instructions at
> boot time. They are used to clean each level of cache, one by one, on
> each CPU.
> 
> At the moment, Xen does not trap those instructions. As you know, a
> cache may not be private to a given physical processor. So if you happen
> to migrate the vCPU to another physical CPU, you may hit stale data.
> 
Ah, yes, I remember "hearing" you talking about this. We've also talked
about it a bit together... I just didn't recognise it as what's
biting us here.
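
To make the failure mode concrete, here is a toy model of it. This is only
a sketch under loud assumptions: it is not Xen or kernel code, every name
in it (PCPU, clean_by_set_way, boot_sequence) is made up for illustration,
and it ignores hardware coherency entirely. It only shows that a set/way
clean acts on the cache of whichever physical CPU executes it, so a
migration just before the clean can leave a dirty line behind.

```python
class PCPU:
    """A physical CPU with a private write-back cache (toy model)."""

    def __init__(self, name):
        self.name = name
        self.cache = {}  # addr -> dirty value not yet written to memory

    def cached_write(self, addr, value):
        # The write lands in this CPU's cache only.
        self.cache[addr] = value

    def clean_by_set_way(self, memory):
        # Set/way operations walk this CPU's own cache, nothing else.
        memory.update(self.cache)
        self.cache.clear()


def boot_sequence(migrate_before_flush):
    """Return what an uncached read sees after the guest's set/way clean."""
    memory = {0x1000: "old"}
    pcpus = [PCPU("pCPU0"), PCPU("pCPU1")]

    current = pcpus[0]
    current.cached_write(0x1000, "new")   # guest writes with caches on

    if migrate_before_flush:
        current = pcpus[1]                # hypervisor migrates the vCPU

    current.clean_by_set_way(memory)      # guest cleans "its" cache
    return memory[0x1000]                 # read with caches off


print(boot_sequence(migrate_before_flush=False))  # new
print(boot_sequence(migrate_before_flush=True))   # old (stale data)
```

In this model the guest does nothing wrong from its own point of view; the
stale read only shows up when the vCPU moves between the write and the
clean, which would fit a bug that depends on how the scheduler places
vCPUs.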

> I am still cleaning up my work and hopefully can post a couple of
> series soon. This is not targeting Xen 4.10, and I am not even sure it
> would fix the problem here. But that's my best guess.
> 
Well, yes, now that you mention it, it indeed sounds plausible.

So, I was mainly curious about whether this was something affecting, or
directly caused by, Credit2, or something that Credit2 can help
diagnose, reproduce and fix.

Since we already have a candidate, and you're already working on the
(difficult! :-( ) fix, well, let's see, once you have it, whether it
actually cures the problem.

We'll jump back on it if it does not.

Thanks and regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
