[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH 2/2 V2] iommu/amd: Workaround for erratum 787
>>> On 10.06.13 at 14:43, Tim Deegan <tim@xxxxxxx> wrote: > At 11:53 +0100 on 10 Jun (1370865209), Jan Beulich wrote: >> > Oh I see, the idea is that we use the control register to mask >> > interrupts instead of relying on the status register? That seems >> > better. But doesn't this IOMMU already mask further interrupts when the >> > pending bit in the status register is set? I can't see any wording >> > about that in the IOMMU doc but the erratum implies it. Suravee, do you >> > know if this is the case? >> >> Actually, the documentation has a subtle but perhaps important >> difference int the wording here: For EventLogInt and ComWaitInt >> is read "An interrupt is generated when <status bit> = 1b and MMIO >> Offset 0018h[<control bit>] = 1b", where as for PPRInt and GAInt >> it says "An interrupt is generated when <status bit> changes from 0b >> to 1b and MMIO Offset 0018h[<control bit>] = 1b". >> >> So I'd like to be one the safe side and assume further interrupts can >> be generated in all cases. > > Fair enough. > > - see also the emulation behavior in >> iommu_guest.c which - afaict - always raises an interrupt, not just on >> a 0 -> 1 transition. > > Well, if it were a documented beahviour, we ought to change that. :) I'll leave that to Suravee though, pending clarification on whether the difference in wording is actually meaningful. >> > In either case, the while () loop worries me; I think it would be better >> > to schedule the tasklet again if we see the bit is set; a 'while >> > (pending is set) { clear pending bit; }' loop might never exit the >> > tasklet at all. >> >> That could only be due to a hardware bug worse than the one we're >> working around here, and I don't think is worth dealing with. > > Well, there's runaway guest hardware. If we reschedule the tasklet then > pci_check_disable_device() will eventually catch and suppress the > misbehaviour; if we spin here it won't get a chance. I guess the > argument is that it will eventually overflow the log buffer and stop > setting the log-interrupt bit -- that needs at least a comment. > > I was also a bit worried about the erratum-787 event getting delayed > (since there's no interrupt to cause us to actually process it), but I > just realised that's a more general problem: we ought to be resetting > the 'pending' bits _before_ scanning the log, or any new entries that > arrive between the log scan and the RW1C write won't be seen until the > _next_ log entry causes an interrupt. > > How about: > write-to-clear status.pending > process the log > if (status.pending) > reschedule the tasklet > else > unmask the interrupt. > > Since we have to do the extra read anyway for erratum 787, we might as > well save ourselves the bother of taking an interrupt in the other case. Yes, that's a very reasonable approach. Will be a v4 then here soon too, except that the locking there looks bogus too (and hence may need fixing along the way)... Jan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |