[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Minios-devel] [PATCH 29/40] arm64: init the memory system before console/xenbus/gic



On Thu, Nov 16, 2017 at 11:27:29AM +0000, Julien Grall wrote:
> 
> 
> On 16/11/17 09:35, Huang Shijie wrote:
> > On Thu, Nov 16, 2017 at 08:47:02AM +0000, Julien Grall wrote:
> > > Hi Shijie,
> > > 
> > > On 11/16/2017 08:35 AM, Huang Shijie wrote:
> > > > On Tue, Nov 14, 2017 at 01:40:58PM +0000, Julien Grall wrote:
> > > > > Hi Shijie,
> > > > > 
> > > > > On 13/11/17 06:05, Huang Shijie wrote:
> > > > > > On Thu, Nov 09, 2017 at 11:37:39AM +0000, Julien Grall wrote:
> > > > > > > Hi Shijie,
> > > > > > > 
> > > > > > > On 09/11/17 05:33, Huang Shijie wrote:
> > > > > > > > On Wed, Nov 08, 2017 at 10:38:09AM +0000, Julien Grall wrote:
> > > > > > > > > Hi Shijie,
> > > > > > > > > 
> > > > > > > > > On 08/11/17 08:54, Huang Shijie wrote:
> > > > > > > > > > On Mon, Nov 06, 2017 at 05:06:58PM +0000, Julien Grall 
> > > > > > > > > > wrote:
> > > > > > > > > > > Hi Shijie,
> > > > > > > > > > > 
> > > > > > > > > > > On 03/11/17 03:12, Huang Shijie wrote:
> > > > > > > > > > > > The mappings for console/xenbus/gic need several pages 
> > > > > > > > > > > > to setup
> > > > > > > > > > > > the page tables.
> > > > > > > > > > > > 
> > > > > > > > > > > > So we init the memory system before initiating the 
> > > > > > > > > > > > console/xenbus/gic,
> > > > > > > > > > > > and then we can allocate the pages which are needed by
> > > > > > > > > > > > the console/xenbus/gic mapping.
> > > > > > > > > > > > 
> > > > > > > > > > > > we use the 2M memory block for the memory.
> > > > > > > > > > > 
> > > > > > > > > > > Why 2MB and not 1GB mapping when possible? Also, what if 
> > > > > > > > > > > the guest is using
> > > > > > > > > > 2MB is a proper size for most of the memory. For example, 
> > > > > > > > > > we can use 2M
> > > > > > > > > > for 512MB, it is not proper to use 1GB mapping.
> > > > > > > > > 
> > > > > > > > > And you can be clever and mixing 2MB/1GB mapping depending on 
> > > > > > > > > what you need to do.
> > > > > > > > Yes, we can. but the code looks a little ugly.
> > > > > > > I am sorry but avoiding using switch between 1GB/2MB because the 
> > > > > > > code might
> > > > > > > look ugly is not an excuse.
> > > > > > > 
> > > > > > > Xen is able to decide what is the best page size to use (see 
> > > > > > > p2m_set_entry)
> > > > > > > and the code is readable and fairly clean.
> > > > > > I read the p2m_set_entry, I am think it is not clear to me. :(
> > > > > > 
> > > > > > Can we just keep the current code for page table, we have other 
> > > > > > things
> > > > > > to do which have high prioiries.
> > > > > > 
> > > > > > I prefer to keep the current code, and if we really need to change 
> > > > > > the
> > > > > > page table, we can do it the sync exception.
> > > > > 
> > > > > I don't get the "we can do it the sync exception"... Anyway, as I 
> > > > > said, the
> > > > When there is a page table issue(such as page fault), fix it by the 
> > > > sync exception, just as
> > > > the linux kernel does.
> > > 
> > > There are no need for that in Mini-OS... AFAIK, in the Linux kernel is
> > > because usespace memory is not always mapped (imagine paging-out 
> > > bits,...).
> > > 
> > > > 
> > > > > way you implemented the page-tables is not going to work well. You 
> > > > > don't
> > > > > support removing mapping for instance.
> > > > I have not implemented the unmapping yet..
> > > > 
> > > > > 
> > > > > I don't much care whether you follow the idea in p2m_set_entry, but I 
> > > > > would
> > > > > like to see page-table correctly handled from the beginning. 
> > > > > Otherwise, how
> > > > > are you going to use Mini-OS correctly?
> > > > Can we do it gradually?  First, make it runs;secondly, make it runs 
> > > > better;
> > > > the last, run perfect.
> > > 
> > > Are you aware that every steps you mention will require some significant
> > > rework in the page-tables? I am not going to review 4 times the 
> > > page-tables
> > I do not think we should rework it significantly.. :)
> > 
> > > as this is going to waste much more time than doing it correctly once and
> > > for all. So rather than try to argue, you should do it.
> > Please persuade me, not force me.
> 
> Well, everytime I tried to persuade you replied back with "It is not
> necessary now" or "It is not significant work"... So at this point, you
Yes, I need a solid reason to change my mind...

> should look back at my e-mails and ask yourself why I keep asking for
> getting the page-tables code complete right now. I am not doing that for
> fun.
> 
> You can boot Mini-OS with your current series, congrats for that. However,
> it is not yet in a state you could use the PV drivers and even some bits of
> the libc. One of the reason is because the page-tables is not ready, for
> instance, to remove mapping.
> 
> Anyway, in your answer to the cover letters you said you would test the PV
> drivers. So I expect the page-tables to change in the next version.
I will test the PV drivers in next version. If I should change it if
we really need.

> 
> > 
> > > 
> > > > 
> > > > 
> > > > > 
> > > > > [...]
> > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > > +
> > > > > > > > > > > > +static unsigned long first_free_pfn;
> > > > > > > > > > > 
> > > > > > > > > > > Likely this variable and get_new_page needs more 
> > > > > > > > > > > explanation in the code.
> > > > > > > > > > okay, I stole this code from the x86 in mini-os.
> > > > > > > > > 
> > > > > > > > > Sorry but I can't find this code in x86... The variable is 
> > > > > > > > > here but seem to
> > > > > > > > > be used differently. Would you mind giving a pointer?
> > > > > > > > in arch/x86/mm.c, search first_free_pfn.
> > > > > > > 
> > > > > > > Have you read what I wrote? I am perfectly happy to explain if 
> > > > > > > you don't
> > > > > > > understand something, but it is annoying me to see you don't seen 
> > > > > > > to bother
> > > > > > > considering what I say. It is already the second time in this 
> > > > > > > e-mail...
> > > > > > > 
> > > > > > > I know where the first_free_pfn is used in x86 code... My point 
> > > > > > > was that the
> > > > > > > usage is completely different from here. It makes sense over 
> > > > > > > there, here it
> > > > > > > looks quite hackish.
> > > > > > I can add more comment in the __setup_initial_pgtable().
> > > > > > I have already make at least 2M space for the page table:
> > > > > > 
> > > > > >       ---------------------------------------------------------
> > > > > >        /* Add 2 MiB for rounding above */
> > > > > >        add     x2, x2, #1                 /* x2 := total number of 
> > > > > > entries */
> > > > > >       ---------------------------------------------------------
> > > > > > 
> > > > > > And I think it is a good way to allocate page to setup the page 
> > > > > > table
> > > > > > during the boot.
> > > > > 
> > > > > I don't think your statement: "I have already make at least 2M space 
> > > > > for the
> > > > > page table" is true.
> > > > > 
> > > > > If you get the whole calculation:
> > > > > 
> > > > > /* Find the size of the kernel */
> > > > > ldr     x0, =_text                 /* x0 := vaddr(_text)            */
> > > > > ldr     x1, =_end                  /* x1 := vaddr(_end)             */
> > > > > sub     x2, x1, x0
> > > > > /* Get the number of l2 pages to allocate, rounded down */
> > > > > lsr     x2, x2, #L2_SHIFT
> > > > > /* Add 2 MiB for rounding above */
> > > > > add     x2, x2, #1                 /* x2 := total number of entries */
> > > > > 
> > > > > You first get the number of 2MB pages to allocate, rounded down, then 
> > > > > add 1.
> > > > > 
> > > > > So if your kernel is less than 2MB, you will allocate only one 2MB 
> > > > > page. And
> > > > > hence have less than 2MB space for the page table.
> > > > > 
> > > > > As you can see this is really fragile, because you depend on the size 
> > > > > of the
> > > > > kernel which may change depending on the application linked with.
> > > > okay, I can change the code like this:
> > > >     -------------------------------------------------------
> > > >          /* Add 4 MiB for rounding above */
> > > >          add     x2, x2, #2                 /* x2 := total number of 
> > > > entries */
> > > >     -------------------------------------------------------
> > > > 
> > > > Add extra 2M memory for the page table.
> > > 
> > > How do you know this is enough? This solution is just fragile and does not
> > > scale easily. You should just use BSS as suggested below.
> > 
> > One page will span 1G memory.
> > so 2M memory is 512 pages which will span 512G memory.
> > Most of the time, we only pass less 1G memory for the mini-os, so we
> > actually use one page.
> 
> Hu? Where does this statement come from?
No matter how much memory you pass to the mini-os, the 2M memory is
enough for 39bit address space.

> 
> > 
> > The minios supports 39bit now which is 512G, so the 2M is enough for the
> > 39bit now.
> > 
> > > 
> > > > 
> > > > > 
> > > > > > 
> > > > > > 
> > > > > > If you think it is not good, please give me a better one.
> > > > > 
> > > > > A better solution is to statically allocate in BSS another 
> > > > > page-table. It
> > > > How many pages we should reserved in the BSS? :)
> > > 
> > > You surely must now the number of pages you know given you mapped an extra
> > > "2MB" in your solution....
> > I think we do _not_ know how many pages we should reserved in the BSS.
> > 
> > You can pass 512M to mini-os, we can also pass 10G to mini-os, how can
> > we know the pages we should reserve for them?
> > 
> > How can we set it the arm64.lds ?
> 
> The same way you allocate space for boot_l1_pgtable.
> 
> > 
> > If we reserve 2M in the arm64.lds, the minios will bigger then 2M
> > then...
> 
> No you don't need to reserve 2MB. You just need to reserve one more table to
> bootstrap yourself. This will cover the first mapping in the case the L2
> page-table is not present. Once that is done, page-table could be allocated
> normally.
The mini-os will copy the device-tree to the end of the memory:
  
-----------------------------------------------------------------------------------
    /* The device tree is probably in memory that we're about to hand over to 
the page
     * allocator, so move it to the end and reserve that space.
     */
    fdt_size = fdt_totalsize(device_tree);
    new_device_tree = to_virt(((*max_pfn_p << PAGE_SHIFT) - fdt_size) & 
PAGE_MASK);
    if (new_device_tree != device_tree) {
        memmove(new_device_tree, device_tree, fdt_size);
    }
    device_tree = new_device_tree;
    *max_pfn_p = to_phys(new_device_tree) >> PAGE_SHIFT;
  
-----------------------------------------------------------------------------------

We should setup the mapping for that memory which will store the
device_tree. Do you mean we reserve one page for the device_tree?

Thanks
Huang Shijie

_______________________________________________
Minios-devel mailing list
Minios-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/cgi-bin/mailman/listinfo/minios-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.