
Re: [Xen-devel] [PATCHv1] xen/balloon: disable memory hotplug in PV guests



On 18/03/15 13:57, Juergen Gross wrote:
> On 03/18/2015 11:36 AM, David Vrabel wrote:
>> On 16/03/15 10:31, Juergen Gross wrote:
>>> On 03/16/2015 11:03 AM, Daniel Kiper wrote:
>>>> On Mon, Mar 16, 2015 at 06:35:04AM +0100, Juergen Gross wrote:
>>>>> On 03/11/2015 04:40 PM, Boris Ostrovsky wrote:
>>>>>> On 03/11/2015 10:42 AM, David Vrabel wrote:
>>>>>>> On 10/03/15 13:35, Boris Ostrovsky wrote:
>>>>>>>> On 03/10/2015 07:40 AM, David Vrabel wrote:
>>>>>>>>> On 09/03/15 14:10, David Vrabel wrote:
>>>>>>>>>> Memory hotplug doesn't work with PV guests because:
>>>>>>>>>>
>>>>>>>>>>      a) The p2m cannot be expanded to cover the new sections.
>>>>>>>>> Broken by 054954eb051f35e74b75a566a96fe756015352c8 (xen: switch to
>>>>>>>>> linear virtual mapped sparse p2m list).
>>>>>>>>>
>>>>>>>>> This one would be non-trivial to fix.  We'd need a sparse set
>>>>>>>>> of vm_areas for the p2m or similar.
>>>>>>>>>
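For the record, a minimal sketch of what such a scheme might look like
(purely illustrative; alloc_vm_area() is an existing helper, but using
it to reserve per-section p2m chunks like this is hypothetical):

    /* Hypothetical sketch: reserve a disjoint chunk of kernel virtual
     * address space for one hotplugged section's p2m entries. */
    struct vm_struct *p2m_chunk;

    p2m_chunk = alloc_vm_area(PAGES_PER_SECTION * sizeof(unsigned long),
                              NULL);
    if (!p2m_chunk)
        return -ENOMEM;     /* no virtual space left for this chunk */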
>>>>>>>>>>      b) add_memory() builds page tables for the new sections,
>>>>>>>>>>         which means the new pages must have valid p2m entries
>>>>>>>>>>         (or a BUG occurs).
>>>>>>>>> After some more testing this appears to be broken by:
>>>>>>>>>
>>>>>>>>> 25b884a83d487fd62c3de7ac1ab5549979188482 (x86/xen: set regions
>>>>>>>>> above the end of RAM as 1:1), included in 3.16.
>>>>>>>>>
>>>>>>>>> This one can be trivially fixed by setting the new sections in
>>>>>>>>> the p2m to INVALID_P2M_ENTRY before calling add_memory().
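A minimal sketch of that trivial fix (the hook point and the variable
names here are assumptions, not the actual patch):

    /* Sketch: mark the new section's PFNs invalid in the p2m before
     * add_memory() builds page tables over them. */
    unsigned long pfn;

    for (pfn = start_pfn; pfn < start_pfn + nr_pages; pfn++)
        if (!set_phys_to_machine(pfn, INVALID_P2M_ENTRY))
            return -ENOMEM;     /* p2m update failed */

    rc = add_memory(nid, PFN_PHYS(start_pfn), PFN_PHYS(nr_pages));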
>>>>>>>> Have you tried 3.17? As I said yesterday, it worked for me
>>>>>>>> (with 4.4 Xen).
>>>>>>> No.  But there are three bugs that prevent it from working in
>>>>>>> 3.16+, so I'm really not sure how you had it working in a 3.17
>>>>>>> PV guest.
>>>>>>
>>>>>> This is what I have:
>>>>>>
>>>>>> [build@build-mk2 linux-boris]$ ssh root@tst008 cat /mnt/lab/bootstrap-x86_64/test_small.xm
>>>>>> extra="console=hvc0 debug earlyprintk=xen "
>>>>>> kernel="/mnt/lab/bootstrap-x86_64/vmlinuz"
>>>>>> ramdisk="/mnt/lab/bootstrap-x86_64/initramfs.cpio.gz"
>>>>>> memory=1024
>>>>>> maxmem = 4096
>>>>>> vcpus=1
>>>>>> maxvcpus=3
>>>>>> name="bootstrap-x86_64"
>>>>>> on_crash="preserve"
>>>>>> vif = [ 'mac=00:0F:4B:00:00:68, bridge=switch' ]
>>>>>> vnc=1
>>>>>> vnclisten="0.0.0.0"
>>>>>> disk=['phy:/dev/guests/bootstrap-x86_64,xvda,w']
>>>>>> [build@build-mk2 linux-boris]$ ssh root@tst008 xl create /mnt/lab/bootstrap-x86_64/test_small.xm
>>>>>> Parsing config from /mnt/lab/bootstrap-x86_64/test_small.xm
>>>>>> [build@build-mk2 linux-boris]$ ssh root@tst008 xl list |grep bootstrap-x86_64
>>>>>> bootstrap-x86_64                             2  1024     1     -b----       5.4
>>>>>> [build@build-mk2 linux-boris]$ ssh root@g-pvops uname -r
>>>>>> 3.17.0upstream
>>>>>> [build@build-mk2 linux-boris]$ ssh root@g-pvops dmesg|grep paravirtualized
>>>>>> [    0.000000] Booting paravirtualized kernel on Xen
>>>>>> [build@build-mk2 linux-boris]$ ssh root@g-pvops grep MemTotal /proc/meminfo
>>>>>> MemTotal:         968036 kB
>>>>>> [build@build-mk2 linux-boris]$ ssh root@tst008 xl mem-set bootstrap-x86_64 2048
>>>>>> [build@build-mk2 linux-boris]$ ssh root@tst008 xl list |grep bootstrap-x86_64
>>>>>> bootstrap-x86_64                             2  2048     1     -b----       5.7
>>>>>> [build@build-mk2 linux-boris]$ ssh root@g-pvops grep MemTotal /proc/meminfo
>>>>>> MemTotal:        2016612 kB
>>>>>> [build@build-mk2 linux-boris]$
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Regardless, it definitely doesn't work now because of the linear p2m
>>>>>>> change.  What do you want to do about this?
>>>>>>
>>>>>> Since backing out the p2m changes is not an option, I guess your
>>>>>> patch is the only short-term alternative.
>>>>>>
>>>>>> But this still looks like a regression so perhaps Juergen can take a
>>>>>> look to see how it can be fixed.
>>>>>
>>>>> Hmm, the p2m list is allocated for the maximum memory size of the
>>>>> domain, which is obtained from the hypervisor. In the case of Dom0
>>>>> it is read via XENMEM_maximum_reservation; for a domU it is based
>>>>> on the E820 memory map read via XENMEM_memory_map.
>>>>>
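For the Dom0 case this boils down to a single hypercall; roughly (a
sketch with error handling elided; the fallback shown is hypothetical):

    /* Sketch: ask Xen for this domain's maximum reservation, in pages. */
    domid_t domid = DOMID_SELF;
    long max_pages;

    max_pages = HYPERVISOR_memory_op(XENMEM_maximum_reservation, &domid);
    if (max_pages <= 0)
        max_pages = fallback_max_pages;  /* hypothetical fallback */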
>>>>> I just tested it with a 4.0-rc1 domU kernel with 512MB initial memory
>>>>> and 4GB of maxmem. The E820 map looked like this:
>>>>>
>>>>> [    0.000000] Xen: [mem 0x0000000000000000-0x000000000009ffff] usable
>>>>> [    0.000000] Xen: [mem 0x00000000000a0000-0x00000000000fffff] reserved
>>>>> [    0.000000] Xen: [mem 0x0000000000100000-0x00000000ffffffff] usable
>>>>>
>>>>> So the complete 4GB range was included, as it should be. The
>>>>> resulting p2m list is allocated at the needed size:
>>>>>
>>>>> [    0.000000] p2m virtual area at ffffc90000000000, size is 800000
>>>>>
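That size checks out: 4 GB / 4 KiB pages = 0x100000 PFNs, at 8 bytes
per p2m entry = 0x800000 bytes, i.e. the 8 MB reported above.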
>>>>> So what is your problem here? Can you post the E820 map and the
>>>>> p2m map info for your failing domain, please?
>>>>
>>>> If you use memory hotplug then maxmem is not a limit from the guest
>>>> kernel's point of view (the host still must allow the operation, but
>>>> that is a separate, unrelated issue). The problem is that the p2m
>>>> must be dynamically expandable to support it. The earlier
>>>> implementation supported this, and memory hotplug worked without
>>>> any issue.
>>>
>>> Okay, now I get it.
>>>
>>> The problem with the earlier p2m implementation was that it could
>>> only be expanded to cover up to 512GB of RAM. So we need some way to
>>> tell the kernel how much virtual memory it should reserve for the
>>> p2m list if memory hotplug is enabled. We could:
>>>
>>> a) use a configurable maximum (e.g. for 512GB RAM as today)
>>
>> I would set the p2m virtual area to cover up to 512 GB (needs 1 GB of
>> virt space) for a 64-bit guest and up to 64 GB (needs 64 MB of virt
>> space) for a 32-bit guest.
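The arithmetic behind those figures: 512 GB / 4 KiB pages = 128M PFNs
at 8 bytes per p2m entry on 64-bit = 1 GB of virtual space; 64 GB /
4 KiB = 16M PFNs at 4 bytes per entry on 32-bit = 64 MB.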
> 
> Is 64 GB a sensible default for 32-bit guests? It would need more than
> 10% of the available kernel virtual address space (taking the fixmap
> etc. into account), and a 64 GB 32-bit domain is hardly usable anyway
> (you have to play dirty tricks to get it running at all).
> 
> I'd rather use a default of 4 GB, which could be changed via a Kconfig
> option. For 64 bits the default of 512 GB is okay, but it should be
> configurable as well.
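What Juergen proposes could look roughly like this (a sketch; the
Kconfig symbol name is an assumption):

    /* Sketch: size the p2m virtual reservation from a configurable
     * hotplug limit in GB (CONFIG_XEN_MEMORY_HOTPLUG_LIMIT is assumed,
     * defaulting to 512 on 64 bit and 4 on 32 bit). */
    #define P2M_LIMIT_PFNS \
            ((unsigned long)CONFIG_XEN_MEMORY_HOTPLUG_LIMIT << (30 - PAGE_SHIFT))
    #define P2M_VIRT_SIZE   (P2M_LIMIT_PFNS * sizeof(unsigned long))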

Ok.

David
