Re: [Xen-devel] [PATCH v5 4/9] xen/x86: populate PVHv2 Dom0 physical memory map
>>> On 27.01.17 at 13:23, <roger.pau@xxxxxxxxxx> wrote:
> On Thu, Jan 26, 2017 at 05:41:58AM -0700, Jan Beulich wrote:
>> >>> On 19.01.17 at 18:29, <roger.pau@xxxxxxxxxx> wrote:
>> > @@ -43,6 +44,9 @@ static long __initdata dom0_nrpages;
>> > static long __initdata dom0_min_nrpages;
>> > static long __initdata dom0_max_nrpages = LONG_MAX;
>> >
>> > +/* Size of the VM86 TSS for virtual 8086 mode to use. */
>> > +#define HVM_VM86_TSS_SIZE 128
>>
>> I continue to be puzzled by this value. Why 128? I think this really
>> needs to be clarified in the comment.
>
> Given the recent comments by Tim, and that this is starting to look like a can
> of worms, I would like to leave this as-is for the moment, on the grounds that
> it's what hvmloader does (I'm not saying it's right), and that this issue
> should be treated independently from this patch series.
Well, for the purpose of this patch it would be sufficient if the
comment referred to hvmloader. But then I think I saw you set the
TSS limit to 0x67, which is neither in line with the value above nor
- according to what Tim said (but I didn't check myself yet) - the
255 used in hvmloader. I.e. if you clone hvmloader code, all
aspects of it should match.
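As a sketch of the numbers under discussion (the constants and names below are illustrative, not taken from the patch or from hvmloader):

/*
 * Illustrative only: the architectural 32-bit TSS is 104 (0x68) bytes,
 * so a segment limit of 0x67 (size - 1) covers just the bare TSS.  The
 * patch reserves 128 bytes, while the limit Tim mentioned for hvmloader
 * is 255, i.e. it spans more bytes than either of the other two values.
 */
#define X86_TSS_BYTES        104                  /* architectural size */
#define X86_TSS_MIN_LIMIT    (X86_TSS_BYTES - 1)  /* 0x67 */
#define HVM_VM86_TSS_SIZE    128                  /* allocation in the patch */
#define HVMLOADER_TSS_LIMIT  255                  /* per Tim's remark */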
> Alternatively, I can just remove setting HVM_PARAM_VM86_TSS for a PVHv2 Dom0.
> IIRC I've tried that before (without unrestricted mode support) and it was
> working fine.
Now if that's the case, then why bother with the TSS?
>> > @@ -333,7 +338,9 @@ static unsigned long __init compute_dom0_nr_pages(
>> >              avail -= max_pdx >> s;
>> >      }
>> >
>> > -    need_paging = opt_dom0_shadow || (is_pvh_domain(d) && !iommu_hap_pt_share);
>> > +    need_paging = opt_dom0_shadow ||
>> > +                  (has_hvm_container_domain(d) && (!iommu_hap_pt_share ||
>> > +                   !paging_mode_hap(d)));
>>
>> What is the !paging_mode_hap() part good for? It's being taken care
>> of by checking opt_dom0_shadow already, isn't it? Alternatively, to
>> make the distinction more obvious, I'd suggest
>>
>>     need_paging = has_hvm_container_domain(d)
>>                   ? !iommu_hap_pt_share || !paging_mode_hap(d)
>>                   : opt_dom0_shadow;
>
> AFAICT it *might* be possible to run a PVHv2 Dom0 on a box with no EPT, but
> with an IOMMU? Does that exist? In that case opt_dom0_shadow won't be set, but
> paging_mode_hap would be false. Maybe that's just an impossible combination in
> any case...
At least when running Xen itself virtualized, I wouldn't dare to assume
this is an impossible combination. However, I can't see how that case
would be handled any differently by the original or the suggested
replacement expression: need_paging would get set either way afaict.
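To spell out the case Roger describes (HVM container, HAP unavailable, Dom0 shadow not requested), a worked evaluation of both expressions, not taken from the thread:

/*
 * has_hvm_container_domain(d) == 1, paging_mode_hap(d) == 0,
 * opt_dom0_shadow == 0; !paging_mode_hap(d) is already true, so the
 * value of iommu_hap_pt_share no longer matters:
 *
 *   original:  0 || (1 && (!iommu_hap_pt_share || 1)) == 1
 *   suggested: 1 ? (!iommu_hap_pt_share || 1) : 0     == 1
 *
 * need_paging ends up set either way.
 */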
>> > @@ -608,8 +617,22 @@ static __init void pvh_setup_e820(struct domain *d, unsigned long nr_pages)
>> >              continue;
>> >          }
>> >
>> > -        *entry_guest = *entry;
>> > -        pages = PFN_UP(entry_guest->size);
>> > +        /*
>> > +         * Make sure the start and length are aligned to PAGE_SIZE, because
>> > +         * that's the minimum granularity of the 2nd stage translation. Since
>> > +         * the p2m code uses PAGE_ORDER_4K internally, also use it here in
>> > +         * order to prevent this code from getting out of sync.
>> > +         */
>> > +        start = ROUNDUP(entry->addr, _AC(1,L) << PAGE_ORDER_4K << PAGE_SHIFT);
>>
>> You definitely don't need to use _AC() in C code. But the whole thing
>> can anyway simply be
>>
>> start = ROUNDUP(entry->addr, PAGE_SIZE << PAGE_ORDER_4K);
>>
>> (albeit I'd like to note that if anything we'd have to be prepared
>> for page sizes > 4k, not smaller ones, and the whole idea of
>> PAGE_ORDER_4K breaks in that case).
>
> Thanks, I will change as per your recommendation above, although I'm not sure
> what to do with the PAGE_ORDER_4K thing. Are you fine with leaving it like you
> suggest?
Yes, there's far more broken code in that case, and hence the remark
was in parentheses in an attempt to make clear it's really just a remark.
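For context, a minimal sketch of how the clamping in that hunk could look with the simplified expression; entry, entry_guest and pages come from the quoted code, while the end-of-range handling and the local declarations are assumptions rather than quotes from the patch:

/* Sketch only: clamp a guest e820 RAM entry to the 4k p2m granularity. */
paddr_t start, end;

start = ROUNDUP(entry->addr, PAGE_SIZE << PAGE_ORDER_4K);
end = (entry->addr + entry->size) & ~((paddr_t)(PAGE_SIZE << PAGE_ORDER_4K) - 1);
if ( start >= end )
    continue;                       /* nothing usable once aligned */
entry_guest->addr = start;
entry_guest->size = end - start;
pages = PFN_DOWN(entry_guest->size);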
>> > +static int __init pvh_setup_vmx_realmode_helpers(struct domain *d)
>> > +{
>> > +    p2m_type_t p2mt;
>> > +    uint32_t rc, *ident_pt;
>> > +    uint8_t *tss;
>> > +    mfn_t mfn;
>> > +    paddr_t gaddr;
>> > +    unsigned int i;
>> > +
>> > +    /*
>> > +     * Steal some space from the last found RAM region. One page will be
>> > +     * used for the identity page tables, and the remaining space for the
>> > +     * VM86 TSS. Note that after this not all e820 regions will be aligned
>> > +     * to PAGE_SIZE.
>> > +     */
>> > +    if ( pvh_steal_ram(d, PAGE_SIZE + HVM_VM86_TSS_SIZE, ULONG_MAX, &gaddr) )
>> > +    {
>> > +        printk("Unable to find memory to stash the identity map and TSS\n");
>> > +        return -ENOMEM;
>> > +    }
>> > +
>> > +    /*
>> > +     * Identity-map page table is required for running with CR0.PG=0
>> > +     * when using Intel EPT. Create a 32-bit non-PAE page directory of
>> > +     * superpages.
>> > +     */
>> > +    ident_pt = map_domain_gfn(p2m_get_hostp2m(d), _gfn(PFN_DOWN(gaddr)),
>> > +                              &mfn, &p2mt, 0, &rc);
>> > +    if ( ident_pt == NULL )
>> > +    {
>> > +        printk("Unable to map identity page tables\n");
>> > +        return -ENOMEM;
>> > +    }
>> > +    for ( i = 0; i < PAGE_SIZE / sizeof(*ident_pt); i++ )
>> > +        ident_pt[i] = ((i << 22) | _PAGE_PRESENT | _PAGE_RW | _PAGE_USER |
>> > +                       _PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_PSE);
>> > +    unmap_domain_page(ident_pt);
>> > +    put_page(mfn_to_page(mfn_x(mfn)));
>> > +    d->arch.hvm_domain.params[HVM_PARAM_IDENT_PT] = gaddr;
>> > +    gaddr += PAGE_SIZE;
>> > +    ASSERT(IS_ALIGNED(gaddr, PAGE_SIZE));
>>
>> This comes too late - the page table setup above also requires
>> page alignment (and with that, adding PAGE_SIZE would not break
>> the alignment requirement). Even more, the code below doesn't
>> strictly require page alignment; it only requires that the range
>> not cross a page boundary.
>
> Given the change that you requested in pvh_steal_ram, the start of the
> memory area returned by it is not going to be page-aligned, so I will have
> to perform the TSS setup first, and then the identity page tables.
Or simply pass the required alignment.
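A possible shape for that, purely as a sketch; the extra parameter and this prototype are assumptions, not something posted in the series:

/* Hypothetical variant: let callers state the alignment they need. */
static int __init pvh_steal_ram(struct domain *d, unsigned long size,
                                unsigned long align, paddr_t limit,
                                paddr_t *addr);

/*
 * The realmode helper could then request a page-aligned block covering
 * the identity page table followed by the VM86 TSS, keeping the order
 * used in the quoted code.
 */
if ( pvh_steal_ram(d, PAGE_SIZE + HVM_VM86_TSS_SIZE, PAGE_SIZE, ULONG_MAX,
                   &gaddr) )
    return -ENOMEM;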
Jan