[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Issue with shared information page on Xen/ARM 4.17


  • To: Julien Grall <julien@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Elliott Mitchell <ehem+xen@xxxxxxx>
  • From: Oleksandr Tyshchenko <Oleksandr_Tyshchenko@xxxxxxxx>
  • Date: Wed, 4 Oct 2023 11:42:32 +0000
  • Accept-language: en-US, ru-RU
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=epam.com; dmarc=pass action=none header.from=epam.com; dkim=pass header.d=epam.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=XizltJT/Q7SMTKvLAWgE+cyvFXW0KjW65Ifj+u3b0bI=; b=FEpB1FLc3RW67lZBh5b7XOd1bx40rSqANNcXa78r8R0iCJfRnp8018o6YKK9Yu+IIvc2sg55jeb67yOgN9zXhiDDldBJEdJCal5tY7llBhQ6KuFPahE6x8AllTXf37Jeg4YRhnn0uOPL0vrVhivUFcUHpxRzaghLcID015yhuUucCg2Rfq3kIKXxXLc+kloNjPW8SNep8xwbnMKzx6V1uo3AcTI/n++aV0pT1s4eYAmSoyLEv31zl3W4qe0OQzXmnH8WIMgfD1GwLi7iAMmy3Oalwdg1MQzL18eg+sZL3cQ+gz++WXxaItpBK4QSJM4SQj+h5iIN4gUxa4C9FuQKsw==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=Gv/FKPOjZIe+qJKcHRbTJtxCON70HWKgsd9Xf9mfRAa6biDrtpUQKrRK4IivfKVSpr7WEDhkXfQ9V5Y4Sg10H6kAhJ+TNQGnDN0BJPfLHrDwR6M2s3G6eOwi7OEvHHauHJhO5AuRA2fbsD3AZtv4pxe9U3i+9N1xzoCcwy/pxQDFctPpFvIaDE7iKc0Lj4oA9tOFxy2sp+DaGvPKfiCvlq9nOiELsdSyKuUMEQ8EoRk+eauzZqi9blkFIftIWsMKfCEdsn2lOn8K/DqinXhMc1d+36ZrOTl+VeTuMYoLg/RZbLKMFw53Nlb0NZCWgcoaB2nC+TzZdpG9OurmkO+wkw==
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Bertrand Marquis <bertrand.marquis@xxxxxxx>, Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>
  • Delivery-date: Wed, 04 Oct 2023 11:42:57 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-index: AQHZ9dNZbcd5CVq6q0CaRs8d4dlf/7A4cSiAgADYhwCAAC0hgIAADTmA
  • Thread-topic: Issue with shared information page on Xen/ARM 4.17


On 04.10.23 13:55, Julien Grall wrote:

Hello all.

> Hi Roger,
> 
> On 04/10/2023 09:13, Roger Pau Monné wrote:
>> On Tue, Oct 03, 2023 at 12:18:35PM -0700, Elliott Mitchell wrote:
>>> On Tue, Oct 03, 2023 at 10:26:28AM +0200, Roger Pau Monné wrote:
>>>> On Thu, Sep 28, 2023 at 07:49:18PM -0700, Elliott Mitchell wrote:
>>>>> I'm trying to get FreeBSD/ARM operational on Xen/ARM.  Current 
>>>>> issue is
>>>>> the changes with the handling of the shared information page appear to
>>>>> have broken things for me.
>>>>>
>>>>> With a pre-4.17 build of Xen/ARM things worked fine.  Yet with a build
>>>>> of the 4.17 release, mapping the shared information page doesn't work.
>>>>
>>>> This is due to 71320946d5edf AFAICT.
>>>
>>> Yes.  While the -EBUSY line may be the one triggering, I'm unsure why.
>>> This seems a fairly reasonable change, so I had no intention of asking
>>> for a revert (which likely would have been rejected).  There is also a
>>> real possibility the -EBUSY comes from elsewhere.  Could also be
>>> 71320946d5edf caused a bug elsewhere to be exposed.
>>
>> A good way to know would be to attempt to revert 71320946d5edf and see
>> if that fixes your issue.
>>
>> Alternatively you can try (or similar):
>>
>> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
>> index 6ccffeaea57d..105ef3faecfd 100644
>> --- a/xen/arch/arm/mm.c
>> +++ b/xen/arch/arm/mm.c
>> @@ -1424,6 +1424,8 @@ int xenmem_add_to_physmap_one(
>>                   page_set_xenheap_gfn(mfn_to_page(mfn), gfn);
>>           }
>>           else
>> +        {
>> +            printk("%u already mapped\n", space);
>>               /*
>>                * Mandate the caller to first unmap the page before 
>> mapping it
>>                * again. This is to prevent Xen creating an unwanted 
>> hole in
>> @@ -1432,6 +1434,7 @@ int xenmem_add_to_physmap_one(
>>                * to unmap it afterwards.
>>                */
>>               rc = -EBUSY;
>> +        }
>>           p2m_write_unlock(p2m);
>>       }
>>
>>>>> I'm using Tianocore as the first stage loader.  This continues to work
>>>>> fine.  The build is using tag "edk2-stable202211", commit fff6d81270.
>>>>> While Tianocore does map the shared information page, my reading of 
>>>>> their
>>>>> source is that it properly unmaps the page and therefore shouldn't 
>>>>> cause
>>>>> trouble.
>>>>>
>>>>> Notes on the actual call is gpfn was 0x0000000000040072.  This is 
>>>>> outside
>>>>> the recommended address range, but my understanding is this is 
>>>>> supposed
>>>>> to be okay.
>>>>>
>>>>> The return code is -16, which is EBUSY.
>>>>>
>>>>> Ideas?
>>>>
>>>> I think the issue is that you are mapping the shared info page over a
>>>> guest RAM page, and in order to do that you would fist need to create
>>>> a hole and then map the shared info page.  IOW: the issue is not with
>>>> edk2 not having unmapped the page, but with FreeBSD trying to map the
>>>> shared_info over a RAM page instead of a hole in the p2m.  x86
>>>> behavior is different here, and does allow mapping the shared_info
>>>> page over a RAM gfn (by first removing the backing RAM page on the
>>>> gfn).
>>>
>>> An interesting thought.  I thought I'd tried this, but since I didn't 
>>> see
>>> such in my experiments list.  What I had tried was removing all the 
>>> pages
>>> in the suggested mapping range.  Yet this failed.
>>
>> Yeah, I went too fast and didn't read the code correctly, it is not
>> checking that the provided gfn is already populated, but whether the
>> mfn intended to be mapped is already mapped at a different location.
>>
>>> Since this seemed reasonable, I've now tried and found it fails.  The
>>> XENMEM_remove_from_physmap call returns 0.
>>
>> XENMEM_remove_from_physmap returning 0 is fine, but it seems to me
>> like edk2 hasn't unmapped the shared_info page.  The OS has no idea
>> at which position the shared_info page is currently mapped, and hence
>> can't do anything to attempt to unmap it in order to cover up for
>> buggy firmware.
>>
>> edk2 should be the entity to issue the XENMEM_remove_from_physmap
>> against the gfn where it has the shared_info page mapped.  Likely
>> needs to be done as part of ExitBootServices() method.
>>
>> FWIW, 71320946d5edf is an ABI change, and as desirable as such
>> behavior might be, a new hypercall should have introduced that had the
>> behavior that the change intended to retrofit into
>> XENMEM_add_to_physmap.
> I can see how you think this is an ABI change but the previous behavior 
> was incorrect. Before this patch, on Arm, we would allow the shared page 
> to be mapped twice. As we don't know where the firmware had mapped it 
> this could result to random corruption.
> 
> Now, we could surely decide to remove the page as x86 did. But this 
> could leave a hole in the RAM. As the OS would not know where the hole 
> is, this could lead to page fault randomly during runtime.


+1.

In addition to what Julien has already said, I would like to say the 
same issue was faced due to U-Boot (running as a part of Xen guest 
before OS) didn't perform a cleanup before jumping to OS. This is 
already fixed to follow the current behavior. I didn't find 
corresponding U-Boot mail thread, but can point to already upstreamed 
commit in the main repo.

https://github.com/u-boot/u-boot/commit/0001a964b840a62c66da42a89a10a2656831aa4b

[snip]

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.