[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Event delivery and "domain blocking" on PVHv2


  • To: Martin Lucina <martin@xxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxxx>, <mirageos-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Fri, 19 Jun 2020 00:43:55 +0100
  • Authentication-results: esa5.hc3370-68.iphmx.com; dkim=none (message not signed) header.i=none
  • Delivery-date: Thu, 18 Jun 2020 23:44:15 +0000
  • Ironport-sdr: uFy2yF6A3Sz1dmKVsQj8ZhJG1SABuqaEaoPwHhB8Y6/iNzyUA/U/xV0ZpN9d06cQS2bdfVIWUl bJauUP46LSy9HaJd9th/lLF/rusg4AEYsYC3H3U3qZSnhLmyUsXEBzhX60YNVoA7JT9g+2tbAY aXLXo3bBlfCVkM2VNolaXd54+MZl/cAY7DOx+9NcGqeV/paOHuf3GYs+ehwff2bdptf4xVmY3U Ni3Mx7ShySsIofO22N9sHbMdVgGtEUW1RpTfVDBJf0ZkQW48nJALoxJRXZFaqQFu/5mc+nlXXH geE=
  • List-id: Developer list for MirageOS <mirageos-devel.lists.xenproject.org>

On 18/06/2020 11:13, Martin Lucina wrote:
> On Monday, 15.06.2020 at 17:58, Andrew Cooper wrote:
>> On 15/06/2020 15:25, Martin Lucina wrote:
>>> Hi,
>>>
>>> puzzle time: In my continuing explorations of the PVHv2 ABIs for the
>>> new MirageOS Xen stack, I've run into some issues with what looks like
>>> missed deliveries of events on event channels.
>>>
>>> While a simple unikernel that only uses the Xen console and
>>> effectively does for (1..5) { printf("foo"); sleep(1); } works fine,
>>> once I plug in the existing OCaml Xenstore and Netfront code, the
>>> behaviour I see is that the unikernel hangs in random places, blocking
>>> as if an event that should have been delivered has been missed.
>> You can see what is going on, event channel wise, with the 'e'
>> debug-key.  This will highlight cases such as the event channel being
>> masked and pending, which is a common guest bug ending up in this state.
> Ok, based on your and Roger's suggestions I've made some changes:
>
> 1. I've dropped all the legacy PIC initialisation code from the Solo5
> parts, written some basic APIC initialisation code and switched to using
> HVMOP_set_evtchn_upcall_vector for upcall registration, along with setting
> HVM_PARAM_CALLBACK_IRQ to 1 as suggested by Roger and done by Xen when
> running as a guest. Commit at [1], nothing controversial there.

Well...

    uint64_t apic_base = rdmsrq(MSR_IA32_APIC_BASE);
    wrmsrq(MSR_IA32_APIC_BASE,
            apic_base | (APIC_BASE << 4) | MSR_IA32_APIC_BASE_ENABLE);
    apic_base = rdmsrq(MSR_IA32_APIC_BASE);
    if (!(apic_base & MSR_IA32_APIC_BASE_ENABLE)) {
        log(ERROR, "Solo5: Could not enable APIC or not present\n");
        assert(false);
    }

The only reason Xen doesn't crash your guest on that WRMSR is because
0xfee00080ull | (0xfee00000u << 4) == 0xfee00080ull, due to truncation
and 0xfe | 0xee == 0xfe.

Either way, the logic isn't correct.

Xen doesn't support moving the APIC MMIO window (and almost certainly
never will, because the only thing which changes it is malware).  You
can rely on the default state being correct, because it is
architecturally specified.

~Andrew



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.