[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 2/3] mm/memory_hotplug: Introduce MHP_NO_FIRMWARE_MEMMAP


  • To: "Eric W. Biederman" <ebiederm@xxxxxxxxxxxx>, Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
  • From: David Hildenbrand <david@xxxxxxxxxx>
  • Date: Thu, 30 Apr 2020 20:43:39 +0200
  • Autocrypt: addr=david@xxxxxxxxxx; prefer-encrypt=mutual; keydata= mQINBFXLn5EBEAC+zYvAFJxCBY9Tr1xZgcESmxVNI/0ffzE/ZQOiHJl6mGkmA1R7/uUpiCjJ dBrn+lhhOYjjNefFQou6478faXE6o2AhmebqT4KiQoUQFV4R7y1KMEKoSyy8hQaK1umALTdL QZLQMzNE74ap+GDK0wnacPQFpcG1AE9RMq3aeErY5tujekBS32jfC/7AnH7I0v1v1TbbK3Gp XNeiN4QroO+5qaSr0ID2sz5jtBLRb15RMre27E1ImpaIv2Jw8NJgW0k/D1RyKCwaTsgRdwuK Kx/Y91XuSBdz0uOyU/S8kM1+ag0wvsGlpBVxRR/xw/E8M7TEwuCZQArqqTCmkG6HGcXFT0V9 PXFNNgV5jXMQRwU0O/ztJIQqsE5LsUomE//bLwzj9IVsaQpKDqW6TAPjcdBDPLHvriq7kGjt WhVhdl0qEYB8lkBEU7V2Yb+SYhmhpDrti9Fq1EsmhiHSkxJcGREoMK/63r9WLZYI3+4W2rAc UucZa4OT27U5ZISjNg3Ev0rxU5UH2/pT4wJCfxwocmqaRr6UYmrtZmND89X0KigoFD/XSeVv jwBRNjPAubK9/k5NoRrYqztM9W6sJqrH8+UWZ1Idd/DdmogJh0gNC0+N42Za9yBRURfIdKSb B3JfpUqcWwE7vUaYrHG1nw54pLUoPG6sAA7Mehl3nd4pZUALHwARAQABtCREYXZpZCBIaWxk ZW5icmFuZCA8ZGF2aWRAcmVkaGF0LmNvbT6JAlgEEwEIAEICGwMFCQlmAYAGCwkIBwMCBhUI AgkKCwQWAgMBAh4BAheAFiEEG9nKrXNcTDpGDfzKTd4Q9wD/g1oFAl3pImkCGQEACgkQTd4Q 9wD/g1o+VA//SFvIHUAvul05u6wKv/pIR6aICPdpF9EIgEU448g+7FfDgQwcEny1pbEzAmiw zAXIQ9H0NZh96lcq+yDLtONnXk/bEYWHHUA014A1wqcYNRY8RvY1+eVHb0uu0KYQoXkzvu+s Dncuguk470XPnscL27hs8PgOP6QjG4jt75K2LfZ0eAqTOUCZTJxA8A7E9+XTYuU0hs7QVrWJ jQdFxQbRMrYz7uP8KmTK9/Cnvqehgl4EzyRaZppshruKMeyheBgvgJd5On1wWq4ZUV5PFM4x II3QbD3EJfWbaJMR55jI9dMFa+vK7MFz3rhWOkEx/QR959lfdRSTXdxs8V3zDvChcmRVGN8U Vo93d1YNtWnA9w6oCW1dnDZ4kgQZZSBIjp6iHcA08apzh7DPi08jL7M9UQByeYGr8KuR4i6e RZI6xhlZerUScVzn35ONwOC91VdYiQgjemiVLq1WDDZ3B7DIzUZ4RQTOaIWdtXBWb8zWakt/ ztGhsx0e39Gvt3391O1PgcA7ilhvqrBPemJrlb9xSPPRbaNAW39P8ws/UJnzSJqnHMVxbRZC Am4add/SM+OCP0w3xYss1jy9T+XdZa0lhUvJfLy7tNcjVG/sxkBXOaSC24MFPuwnoC9WvCVQ ZBxouph3kqc4Dt5X1EeXVLeba+466P1fe1rC8MbcwDkoUo65Ag0EVcufkQEQAOfX3n0g0fZz Bgm/S2zF/kxQKCEKP8ID+Vz8sy2GpDvveBq4H2Y34XWsT1zLJdvqPI4af4ZSMxuerWjXbVWb T6d4odQIG0fKx4F8NccDqbgHeZRNajXeeJ3R7gAzvWvQNLz4piHrO/B4tf8svmRBL0ZB5P5A 2uhdwLU3NZuK22zpNn4is87BPWF8HhY0L5fafgDMOqnf4guJVJPYNPhUFzXUbPqOKOkL8ojk CXxkOFHAbjstSK5Ca3fKquY3rdX3DNo+EL7FvAiw1mUtS+5GeYE+RMnDCsVFm/C7kY8c2d0G NWkB9pJM5+mnIoFNxy7YBcldYATVeOHoY4LyaUWNnAvFYWp08dHWfZo9WCiJMuTfgtH9tc75 7QanMVdPt6fDK8UUXIBLQ2TWr/sQKE9xtFuEmoQGlE1l6bGaDnnMLcYu+Asp3kDT0w4zYGsx 5r6XQVRH4+5N6eHZiaeYtFOujp5n+pjBaQK7wUUjDilPQ5QMzIuCL4YjVoylWiBNknvQWBXS lQCWmavOT9sttGQXdPCC5ynI+1ymZC1ORZKANLnRAb0NH/UCzcsstw2TAkFnMEbo9Zu9w7Kv AxBQXWeXhJI9XQssfrf4Gusdqx8nPEpfOqCtbbwJMATbHyqLt7/oz/5deGuwxgb65pWIzufa N7eop7uh+6bezi+rugUI+w6DABEBAAGJAiUEGAECAA8FAlXLn5ECGwwFCQlmAYAACgkQTd4Q 9wD/g1qA6w/+M+ggFv+JdVsz5+ZIc6MSyGUozASX+bmIuPeIecc9UsFRatc91LuJCKMkD9Uv GOcWSeFpLrSGRQ1Z7EMzFVU//qVs6uzhsNk0RYMyS0B6oloW3FpyQ+zOVylFWQCzoyyf227y GW8HnXunJSC+4PtlL2AY4yZjAVAPLK2l6mhgClVXTQ/S7cBoTQKP+jvVJOoYkpnFxWE9pn4t H5QIFk7Ip8TKr5k3fXVWk4lnUi9MTF/5L/mWqdyIO1s7cjharQCstfWCzWrVeVctpVoDfJWp 4LwTuQ5yEM2KcPeElLg5fR7WB2zH97oI6/Ko2DlovmfQqXh9xWozQt0iGy5tWzh6I0JrlcxJ ileZWLccC4XKD1037Hy2FLAjzfoWgwBLA6ULu0exOOdIa58H4PsXtkFPrUF980EEibUp0zFz GotRVekFAceUaRvAj7dh76cToeZkfsjAvBVb4COXuhgX6N4pofgNkW2AtgYu1nUsPAo+NftU CxrhjHtLn4QEBpkbErnXQyMjHpIatlYGutVMS91XTQXYydCh5crMPs7hYVsvnmGHIaB9ZMfB njnuI31KBiLUks+paRkHQlFcgS2N3gkRBzH7xSZ+t7Re3jvXdXEzKBbQ+dC3lpJB0wPnyMcX FOTT3aZT7IgePkt5iC/BKBk3hqKteTnJFeVIT7EC+a6YUFg=
  • Cc: virtio-dev@xxxxxxxxxxxxxxxxxxxx, linux-hyperv@xxxxxxxxxxxxxxx, Michal Hocko <mhocko@xxxxxxxx>, Baoquan He <bhe@xxxxxxxxxx>, linux-mm@xxxxxxxxx, Wei Yang <richard.weiyang@xxxxxxxxx>, linux-s390@xxxxxxxxxxxxxxx, linux-nvdimm@xxxxxxxxxxxx, linux-kernel@xxxxxxxxxxxxxxx, virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx, linux-acpi@xxxxxxxxxxxxxxx, "Michael S . Tsirkin" <mst@xxxxxxxxxx>, Pankaj Gupta <pankaj.gupta.linux@xxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx, Michal Hocko <mhocko@xxxxxxxxxx>, linuxppc-dev@xxxxxxxxxxxxxxxx
  • Delivery-date: Thu, 30 Apr 2020 18:44:07 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

 >>> If the class of memory is different then please by all means let's mark
>>> it differently in struct resource so everyone knows it is different.
>>> But that difference needs to be more than hotplug.
>>>
>>> That difference needs to be the hypervisor loaned us memory and might
>>> take it back at any time, or this memory is persistent and so it has
>>> these different characteristics so don't use it as ordinary ram.
>>
>> Yes, and I think kmem took an excellent approach of explicitly putting
>> that "System RAM" into a resource hierarchy. That "System RAM" won't
>> show up as a root node under /proc/iomem (see patch #3), which already
>> results in kexec-tools to treat it in a special way. I am thinking about
>> doing the same for virtio-mem.
> 
> Reading this and your patch cover letters again my concern is that
> the justification seems to be letting the tail wag the dog.
> 
> You want kexec-tools to behave in a certain way so you are changing the
> kernel.
> 
> Rather it should be change the kernel to clearly reflect reality and if
> you can get away without a change to kexec-tools that is a bonus.
> 

Right, because user space has to have a way to figure out what to do.

But talking about the firmware memmap, indicating something via a "raw
firmware-provided memory map", that is not actually in the "raw
firmware-provided memory map" feels wrong to me. (below)


>>> That information is also useful to other people looking at the system
>>> and seeing what is going on.
>>>
>>> Just please don't muddle the concepts, or assume that whatever subset of
>>> hotplug memory you are dealing with is the only subset.
>>
>> I can certainly rephrase the subject/description/comment, stating that
>> this is not to be used for ordinary hotplugged DIMMs - only when the
>> device driver is under control to decide what to do with that memory -
>> especially when kexec'ing.
>>
>> (previously, I called this flag MHP_DRIVER_MANAGED, but I think
>> MHP_NO_FIRMWARE_MEMMAP is clearer, we just need a better description)
>>
>> Would that make it clearer?
> 
> I am not certain, but Andrew Morton deliberately added that
> firmware_map_add_hotplug call.  Which means that there is a reason
> for putting hotplugged memory in the firmware map.
> 
> So the justification needs to take that reason into account.  The
> justification can not be it is hotplugged therefore it should not belong
> in the firmware memory map.  Unless you can show that
> firmware_map_add_hotplug that was actually a bug and should be removed.
> But as it has been that way since 2010 that seems like a long shot.
> 
> So my question is what is right for the firmware map?

We have documentation for that since 2008. Andrews patch is from 2010.

Documentation/ABI/testing/sysfs-firmware-memmap

It clearly talks about "raw firmware-provided memory map" and why the
interface was introduced at all ("on most architectures that
firmware-provided memory map is modified afterwards by the kernel itself").

> 
> Why does the firmware map support hotplug entries?

I assume:

The firmware memmap was added primarily for x86-64 kexec (and still, is
mostly used on x86-64 only IIRC). There, we had ACPI hotplug. When DIMMs
get hotplugged on real HW, they get added to e820. Same applies to
memory added via HyperV balloon (unless memory is unplugged via
ballooning and you reboot ... the the e820 is changed as well). I assume
we wanted to be able to reflect that, to make kexec look like a real reboot.

This worked for a while. Then came dax/kmem. Now comes virtio-mem.


But I assume only Andrew can enlighten us.

@Andrew, any guidance here? Should we really add all memory to the
firmware memmap, even if this contradicts with the existing
documentation? (especially, if the actual firmware memmap will *not*
contain that memory after a reboot)

-- 
Thanks,

David / dhildenb




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.