
Re: [Xen-devel] Draft NVDIMM proposal




> On May 15, 2018, at 11:05 AM, Roger Pau Monne <roger.pau@xxxxxxxxxx> wrote:
> 
> Just some replies/questions to some of the points raised below.
> 
> On Fri, May 11, 2018 at 09:33:10AM -0700, Dan Williams wrote:
>> [ adding linux-nvdimm ]
>> 
>> Great write up! Some comments below...
>> 
>> On Wed, May 9, 2018 at 10:35 AM, George Dunlap <george.dunlap@xxxxxxxxxx> wrote:
>>>> To use a namespace, an operating system needs at a minimum two pieces
>>>> of information: The UUID and/or Name of the namespace, and the SPA
>>>> range where that namespace is mapped; and ideally also the Type and
>>>> Abstraction Type to know how to interpret the data inside.
>> 
>> Not necessarily, no. Linux supports "label-less" mode where it exposes
>> the raw capacity of a region in a 1:1 mapped namespace without a label.
>> This is how Linux supports "legacy" NVDIMMs that do not support
>> labels.
> 
> In that case, how does Linux know which area of the NVDIMM it should
> use to store the page structures?

The answer to that is right here:

>>>> `fsdax` and `devdax` mode are both designed to make it possible for
>>>> user processes to have direct mapping of NVRAM.  As such, both are
>>>> only suitable for PMEM namespaces (?).  Both also need to have kernel
>>>> page structures allocated for each page of NVRAM; this amounts to 64
>>>> bytes for every 4k of NVRAM.  Memory for these page structures can
>>>> either be allocated out of normal "system" memory, or inside the PMEM
>>>> namespace itself.
>>>> 
>>>> In both cases, an "info block", very similar to the BTT info block, is
>>>> written to the beginning of the namespace when created.  This info
>>>> block specifies whether the page structures come from system memory or
>>>> from the namespace itself.  If from the namespace itself, it contains
>>>> information about what parts of the namespace have been set aside for
>>>> Linux to use for this purpose.

That is, each fsdax / devdax namespace has a superblock that, in part, defines 
which parts are reserved for Linux's page structures and which parts hold data.  
Or to put it a different way: Linux decides which parts of a namespace to use 
for page structures, and records that choice in the metadata starting in the 
first page of the namespace.
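
To give a feel for how much space is involved, here is a rough Python sketch of
the "64 bytes for every 4k" arithmetic from the quoted text.  The function names
are made up for illustration, and the split ignores the size of the info block
itself and any alignment padding, so treat the numbers as approximate rather
than as what Linux actually reserves.

```python
PAGE_SIZE = 4096         # 4k pages, as in the quoted text
PAGE_STRUCT_SIZE = 64    # bytes of page structure per page, as in the quoted text


def page_struct_bytes(namespace_bytes):
    """Bytes of page structures needed to cover a namespace of the given size."""
    pages = namespace_bytes // PAGE_SIZE
    return pages * PAGE_STRUCT_SIZE


def split_namespace(namespace_bytes, structs_in_namespace):
    """Return (reserved_bytes, data_bytes) for a namespace.

    Hypothetical helper: if the page structures live in system memory,
    nothing is reserved inside the namespace.  Otherwise the page
    structures are carved out of the namespace itself.  The info block
    and alignment padding are ignored (a simplifying assumption).
    """
    if not structs_in_namespace:
        return 0, namespace_bytes
    reserved = page_struct_bytes(namespace_bytes)
    return reserved, namespace_bytes - reserved


if __name__ == "__main__":
    one_tib = 1 << 40
    reserved, data = split_namespace(one_tib, structs_in_namespace=True)
    # 2^40 / 2^12 pages * 2^6 bytes = 2^34 bytes = 16 GiB
    print(reserved // (1 << 30), "GiB reserved,", data // (1 << 30), "GiB for data")
```

So the overhead is 1/64 of the namespace (about 1.6%), which for a 1 TiB
namespace comes to roughly 16 GiB, whether that comes out of system memory or
out of the namespace.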


>>>> 
>>>> Linux has also defined "Type GUIDs" for these two types of namespace
>>>> to be stored in the namespace label, although these are not yet in the
>>>> ACPI spec.
>> 
>> They never will be. One of the motivations for GUIDs is that an OS can
>> define private ones without needing to go back and standardize them.
>> Only GUIDs that are needed for inter-OS / pre-OS compatibility would
>> need to be defined in ACPI, and there is no expectation that other
>> OSes understand Linux's format for reserving page structure space.
> 
> Maybe it would be helpful to somehow mark those areas as
> "non-persistent" storage, so that other OSes know they can use this
> space for temporary data that doesn't need to survive across reboots?

In theory there’s no reason another OS couldn’t learn Linux’s format, discover 
where the blocks were, and use those blocks for its own purposes while Linux 
wasn’t running.

But that won’t help Xen, as we want to use those blocks while Linux *is* 
running.

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 

