[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Issue with shared information page on Xen/ARM 4.17


  • To: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • From: Oleksandr Tyshchenko <Oleksandr_Tyshchenko@xxxxxxxx>
  • Date: Wed, 4 Oct 2023 17:18:42 +0000
  • Accept-language: en-US, ru-RU
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=epam.com; dmarc=pass action=none header.from=epam.com; dkim=pass header.d=epam.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=Wm3I36RDY0zfDwhCiXvT/6o0lRf1GQ4jDZbzk1FPHl8=; b=NZHLWnsHDnK3p+M9SJU8lMlewehpyx86U/q6XDBaLw4tT3gF+yJth6hiNfnjKdFyh1vf/ve5uloufqtcAjxgGv+RFM8IKOfAyHl3/Fyl/JGNeAdH2Vtkzs5zeC+V5CT0x4F4YgKRCzA3WcDKiamhYO9NAaq57EXfbkC/PjY4kYreMnl6AMQXSn7EG68dXpfq+yzbUHjK6TH9CnO819SGOzxwn31DtlNn9fsem5ywfVpNA1fj3WUukMclKMNmMka2+xtx8tukaXhGO/jGdQstJ5+YpQuo3hjjENpGrUPePQFde3Hk1KZ1oMuXVB2xn8qG+AkuChQvdlPOohukKjomcA==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=NjIB0teCLGGM+xXCkjmEEpPPVS/sy/eLoHWFPXlwRgxjtc/5b7iy1u7Zi8uD9TzzLq+gG2hkynQFaDg/oB/bN4BbsNFomRfp1CpKCWY+2VZYmZ+kqTIggzzakpRCZiB+PSwUB2zKNJmlQgBg2QvKi/d3S+0W2R+KvTnZikzib6F4IpS8eoA5ns+dquvgmNM9qn+Rk8qGGQnEwpWpIW0U6pMy5q3f8JfIm/VawzokGMdSESaxhlzE+VedyBcl2X5gIt2YadN7RvSg/+hV2V1rJJsKvKG0oAfYavnO/o77pZA+bT/g2zhycwrlGUebxD5Dx8/myvjtvHTsFzHF3TJaXw==
  • Cc: Julien Grall <julien@xxxxxxx>, Elliott Mitchell <ehem+xen@xxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Bertrand Marquis <bertrand.marquis@xxxxxxx>, Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>
  • Delivery-date: Wed, 04 Oct 2023 17:19:19 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-index: AQHZ9dNZbcd5CVq6q0CaRs8d4dlf/7A4cSiAgADYhwCAAC0hgIAADTmAgAAVaACAAEiNAA==
  • Thread-topic: Issue with shared information page on Xen/ARM 4.17


On 04.10.23 15:59, Roger Pau Monné wrote:

Hello Roger



> On Wed, Oct 04, 2023 at 11:42:32AM +0000, Oleksandr Tyshchenko wrote:
>>
>>
>> On 04.10.23 13:55, Julien Grall wrote:
>>
>> Hello all.
>>
>>> Hi Roger,
>>>
>>> On 04/10/2023 09:13, Roger Pau Monné wrote:
>>>> On Tue, Oct 03, 2023 at 12:18:35PM -0700, Elliott Mitchell wrote:
>>>>> On Tue, Oct 03, 2023 at 10:26:28AM +0200, Roger Pau Monné wrote:
>>>>>> On Thu, Sep 28, 2023 at 07:49:18PM -0700, Elliott Mitchell wrote:
>>>>>>> I'm trying to get FreeBSD/ARM operational on Xen/ARM.  Current
>>>>>>> issue is
>>>>>>> the changes with the handling of the shared information page appear to
>>>>>>> have broken things for me.
>>>>>>>
>>>>>>> With a pre-4.17 build of Xen/ARM things worked fine.  Yet with a build
>>>>>>> of the 4.17 release, mapping the shared information page doesn't work.
>>>>>>
>>>>>> This is due to 71320946d5edf AFAICT.
>>>>>
>>>>> Yes.  While the -EBUSY line may be the one triggering, I'm unsure why.
>>>>> This seems a fairly reasonable change, so I had no intention of asking
>>>>> for a revert (which likely would have been rejected).  There is also a
>>>>> real possibility the -EBUSY comes from elsewhere.  Could also be
>>>>> 71320946d5edf caused a bug elsewhere to be exposed.
>>>>
>>>> A good way to know would be to attempt to revert 71320946d5edf and see
>>>> if that fixes your issue.
>>>>
>>>> Alternatively you can try (or similar):
>>>>
>>>> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
>>>> index 6ccffeaea57d..105ef3faecfd 100644
>>>> --- a/xen/arch/arm/mm.c
>>>> +++ b/xen/arch/arm/mm.c
>>>> @@ -1424,6 +1424,8 @@ int xenmem_add_to_physmap_one(
>>>>                    page_set_xenheap_gfn(mfn_to_page(mfn), gfn);
>>>>            }
>>>>            else
>>>> +        {
>>>> +            printk("%u already mapped\n", space);
>>>>                /*
>>>>                 * Mandate the caller to first unmap the page before
>>>> mapping it
>>>>                 * again. This is to prevent Xen creating an unwanted
>>>> hole in
>>>> @@ -1432,6 +1434,7 @@ int xenmem_add_to_physmap_one(
>>>>                 * to unmap it afterwards.
>>>>                 */
>>>>                rc = -EBUSY;
>>>> +        }
>>>>            p2m_write_unlock(p2m);
>>>>        }
>>>>
>>>>>>> I'm using Tianocore as the first stage loader.  This continues to work
>>>>>>> fine.  The build is using tag "edk2-stable202211", commit fff6d81270.
>>>>>>> While Tianocore does map the shared information page, my reading of
>>>>>>> their
>>>>>>> source is that it properly unmaps the page and therefore shouldn't
>>>>>>> cause
>>>>>>> trouble.
>>>>>>>
>>>>>>> Notes on the actual call is gpfn was 0x0000000000040072.  This is
>>>>>>> outside
>>>>>>> the recommended address range, but my understanding is this is
>>>>>>> supposed
>>>>>>> to be okay.
>>>>>>>
>>>>>>> The return code is -16, which is EBUSY.
>>>>>>>
>>>>>>> Ideas?
>>>>>>
>>>>>> I think the issue is that you are mapping the shared info page over a
>>>>>> guest RAM page, and in order to do that you would fist need to create
>>>>>> a hole and then map the shared info page.  IOW: the issue is not with
>>>>>> edk2 not having unmapped the page, but with FreeBSD trying to map the
>>>>>> shared_info over a RAM page instead of a hole in the p2m.  x86
>>>>>> behavior is different here, and does allow mapping the shared_info
>>>>>> page over a RAM gfn (by first removing the backing RAM page on the
>>>>>> gfn).
>>>>>
>>>>> An interesting thought.  I thought I'd tried this, but since I didn't
>>>>> see
>>>>> such in my experiments list.  What I had tried was removing all the
>>>>> pages
>>>>> in the suggested mapping range.  Yet this failed.
>>>>
>>>> Yeah, I went too fast and didn't read the code correctly, it is not
>>>> checking that the provided gfn is already populated, but whether the
>>>> mfn intended to be mapped is already mapped at a different location.
>>>>
>>>>> Since this seemed reasonable, I've now tried and found it fails.  The
>>>>> XENMEM_remove_from_physmap call returns 0.
>>>>
>>>> XENMEM_remove_from_physmap returning 0 is fine, but it seems to me
>>>> like edk2 hasn't unmapped the shared_info page.  The OS has no idea
>>>> at which position the shared_info page is currently mapped, and hence
>>>> can't do anything to attempt to unmap it in order to cover up for
>>>> buggy firmware.
>>>>
>>>> edk2 should be the entity to issue the XENMEM_remove_from_physmap
>>>> against the gfn where it has the shared_info page mapped.  Likely
>>>> needs to be done as part of ExitBootServices() method.
>>>>
>>>> FWIW, 71320946d5edf is an ABI change, and as desirable as such
>>>> behavior might be, a new hypercall should have introduced that had the
>>>> behavior that the change intended to retrofit into
>>>> XENMEM_add_to_physmap.
>>> I can see how you think this is an ABI change but the previous behavior
>>> was incorrect. Before this patch, on Arm, we would allow the shared page
>>> to be mapped twice. As we don't know where the firmware had mapped it
>>> this could result to random corruption.
>>>
>>> Now, we could surely decide to remove the page as x86 did. But this
>>> could leave a hole in the RAM. As the OS would not know where the hole
>>> is, this could lead to page fault randomly during runtime.
>>
>>
>> +1.
>>
>> In addition to what Julien has already said, I would like to say the
>> same issue was faced due to U-Boot (running as a part of Xen guest
>> before OS) didn't perform a cleanup before jumping to OS. This is
>> already fixed to follow the current behavior. I didn't find
>> corresponding U-Boot mail thread, but can point to already upstreamed
>> commit in the main repo.
> 
> What about other bootloaders?
> 
> There might be other loaders that didn't unmap shared_info either, but
> that removed the page where the shared_info is mapped from the
> reported RAM ranges to the guest.  In such case not doing the
> unmapping was benign, as the guest would never use that gfn as RAM
> anyway.

I got your point, this way, I agree, not doing the
unmapping wouldn't be a big problem. I am wondering whether such 
bootloaders that instead of unmapping indeed hiding/reserving the gfn 
(where shared_info is mapped to) are really present in the world (Arm)? 
Patch was merged more than 1 year ago, and the issue hasn't been noticed 
until now with EDK2. Let's wait for the addition input on how the EDK2 
is really dealing with shared_info.


> 
> Thanks, Roger.

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.