[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH RFCv2 0/4] mm/memory_hotplug: Introduce memory block types

On 20.12.18 14:08, Michal Hocko wrote:
> On Thu 20-12-18 13:58:16, David Hildenbrand wrote:
>> On 30.11.18 18:59, David Hildenbrand wrote:
>>> This is the second approach, introducing more meaningful memory block
>>> types and not changing online behavior in the kernel. It is based on
>>> latest linux-next.
>>> As we found out during dicussion, user space should always handle onlining
>>> of memory, in any case. However in order to make smart decisions in user
>>> space about if and how to online memory, we have to export more information
>>> about memory blocks. This way, we can formulate rules in user space.
>>> One such information is the type of memory block we are talking about.
>>> This helps to answer some questions like:
>>> - Does this memory block belong to a DIMM?
>>> - Can this DIMM theoretically ever be unplugged again?
>>> - Was this memory added by a balloon driver that will rely on balloon
>>>   inflation to remove chunks of that memory again? Which zone is advised?
>>> - Is this special standby memory on s390x that is usually not automatically
>>>   onlined?
>>> And in short it helps to answer to some extend (excluding zone imbalances)
>>> - Should I online this memory block?
>>> - To which zone should I online this memory block?
>>> ... of course special use cases will result in different anwers. But that's
>>> why user space has control of onlining memory.
>>> More details can be found in Patch 1 and Patch 3.
>>> Tested on x86 with hotplugged DIMMs. Cross-compiled for PPC and s390x.
>>> Example:
>>> $ udevadm info -q all -a /sys/devices/system/memory/memory0
>>>     KERNEL=="memory0"
>>>     SUBSYSTEM=="memory"
>>>     DRIVER==""
>>>     ATTR{online}=="1"
>>>     ATTR{phys_device}=="0"
>>>     ATTR{phys_index}=="00000000"
>>>     ATTR{removable}=="0"
>>>     ATTR{state}=="online"
>>>     ATTR{type}=="boot"
>>>     ATTR{valid_zones}=="none"
>>> $ udevadm info -q all -a /sys/devices/system/memory/memory90
>>>     KERNEL=="memory90"
>>>     SUBSYSTEM=="memory"
>>>     DRIVER==""
>>>     ATTR{online}=="1"
>>>     ATTR{phys_device}=="0"
>>>     ATTR{phys_index}=="0000005a"
>>>     ATTR{removable}=="1"
>>>     ATTR{state}=="online"
>>>     ATTR{type}=="dimm"
>>>     ATTR{valid_zones}=="Normal"
>>> RFC -> RFCv2:
>>> - Now also taking care of PPC (somehow missed it :/ )
>>> - Split the series up to some degree (some ideas on how to split up patch 3
>>>   would be very welcome)
>>> - Introduce more memory block types. Turns out abstracting too much was
>>>   rather confusing and not helpful. Properly document them.
>>> Notes:
>>> - I wanted to convert the enum of types into a named enum but this
>>>   provoked all kinds of different errors. For now, I am doing it just like
>>>   the other types (e.g. online_type) we are using in that context.
>>> - The "removable" property should never have been named like that. It
>>>   should have been "offlinable". Can we still rename that? E.g. boot memory
>>>   is sometimes marked as removable ...
>> Any feedback regarding the suggested block types would be very much
>> appreciated!
> I still do not like this much to be honest. I just didn't get to think
> through this properly. My fear is that this is conflating an actual API
> with the current implementation and as such will cause problems in
> future. But I haven't really looked into your patches closely so I might
> be wrong. Anyway I won't be able to look into it by the end of year.

I guess as long as we have memory block devices and we expect user space
to make a decision we will have this API and the involved problems.

I am open for alternatives, and as I said, any feedback on how to sort
this out will be highly appreciated.

I'll be on vacation for the next two weeks, so this can wait. Just
wanted to note that I am still interested in feedback :)



David / dhildenb

Xen-devel mailing list



Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.