
Re: [Xen-devel] [RFC v3 1/2] x86/xen: add xen_is_preemptible_hypercall()



On Thu, Jan 22, 2015 at 09:44:12PM +0000, Andrew Cooper wrote:
> On 22/01/2015 21:09, Luis R. Rodriguez wrote:
> > On Thu, Jan 22, 2015 at 12:01:50PM -0800, Andy Lutomirski wrote:
> >> On Thu, Jan 22, 2015 at 11:30 AM, Luis R. Rodriguez <mcgrof@xxxxxxxx> 
> >> wrote:
> >>> On Wed, Jan 21, 2015 at 07:07:36PM -0800, Andy Lutomirski wrote:
> >>>> On Wed, Jan 21, 2015 at 6:17 PM, Luis R. Rodriguez
> >>>> <mcgrof@xxxxxxxxxxxxxxxx> wrote:
> >>>>> From: "Luis R. Rodriguez" <mcgrof@xxxxxxxx>
> >>>>>
> >>>>> On kernels with voluntary or no preemption we can run
> >>>>> into situations where a hypercall issued through userspace
> >>>>> will linger around as it addresses sub-operations in kernel
> >>>>> context (multicalls). Such operations can trigger soft lockup
> >>>>> detection.
> >>>>>
> >>>>> We want a way to let the kernel voluntarily preempt such calls
> >>>>> even on non-preempt kernels. To address this we first need to
> >>>>> distinguish which hypercalls fall under this category. This
> >>>>> implements xen_is_preemptible_hypercall(), which lets us do
> >>>>> just that by adding a secondary hypercall page; calls made via
> >>>>> the new page may be preempted.
> >>>>>
> >>>>> Andrew had originally submitted a version of this work [0].
> >>>>>
> >>>>> [0] http://lists.xen.org/archives/html/xen-devel/2014-02/msg01056.html
> >>>>>
> >>>>> Based on original work by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> >>>>>
> >>>>> Cc: Andy Lutomirski <luto@xxxxxxxxxxxxxx>
> >>>>> Cc: Borislav Petkov <bp@xxxxxxx>
> >>>>> Cc: David Vrabel <david.vrabel@xxxxxxxxxx>
> >>>>> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> >>>>> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> >>>>> Cc: "H. Peter Anvin" <hpa@xxxxxxxxx>
> >>>>> Cc: x86@xxxxxxxxxx
> >>>>> Cc: Steven Rostedt <rostedt@xxxxxxxxxxx>
> >>>>> Cc: Masami Hiramatsu <masami.hiramatsu.pt@xxxxxxxxxxx>
> >>>>> Cc: Jan Beulich <JBeulich@xxxxxxxx>
> >>>>> Cc: linux-kernel@xxxxxxxxxxxxxxx
> >>>>> Signed-off-by: Luis R. Rodriguez <mcgrof@xxxxxxxx>
> >>>>> ---
> >>>>>  arch/x86/include/asm/xen/hypercall.h | 20 ++++++++++++++++++++
> >>>>>  arch/x86/xen/enlighten.c             |  7 +++++++
> >>>>>  arch/x86/xen/xen-head.S              | 18 +++++++++++++++++-
> >>>>>  3 files changed, 44 insertions(+), 1 deletion(-)
> >>>>>
> >>>>> diff --git a/arch/x86/include/asm/xen/hypercall.h b/arch/x86/include/asm/xen/hypercall.h
> >>>>> index ca08a27..221008e 100644
> >>>>> --- a/arch/x86/include/asm/xen/hypercall.h
> >>>>> +++ b/arch/x86/include/asm/xen/hypercall.h
> >>>>> @@ -84,6 +84,22 @@
> >>>>>
> >>>>>  extern struct { char _entry[32]; } hypercall_page[];
> >>>>>
> >>>>> +#ifndef CONFIG_PREEMPT
> >>>>> +extern struct { char _entry[32]; } preemptible_hypercall_page[];
> >>>> A comment somewhere explaining why only non-preemptible kernels have
> >>>> preemptible hypercalls might be friendly to some future reader. :)
> >>> Good idea. Since this section is arch-specific, I'll instead add a blurb
> >>> explaining this on the upcall.
> >>>
> >>>>> +
> >>>>> +static inline bool xen_is_preemptible_hypercall(struct pt_regs *regs)
> >>>>> +{
> >>>>> +       return !user_mode_vm(regs) &&
> >>>>> +               regs->ip >= (unsigned long)preemptible_hypercall_page &&
> >>>>> +               regs->ip < (unsigned long)preemptible_hypercall_page + PAGE_SIZE;
> >>>>> +}
> >>>> This makes it seem like the page is indeed one page long, but I don't
> >>>> see what actually allocates a whole page for it.  What am I missing?
> >>> arch/x86/xen/xen-head.S
> >>>
> >>> .pushsection .text
> >>>         .balign PAGE_SIZE
> >>> ENTRY(hypercall_page)
> >>>
> >>> #ifndef CONFIG_PREEMPT
> >>> ENTRY(preemptible_hypercall_page)
> >>>         .skip PAGE_SIZE
> >>> #endif /* CONFIG_PREEMPT */
> >>>
> >>> Does that suffice to be sure?
> >> This looks like hypercall_page and preemptible_hypercall_page will
> >> both be page-aligned but will be the same page.  Should there be
> >> another .skip PAGE_SIZE in there?
> > I think the trick here was that since hypercall_page is already aligned,
> > and we are just allocating PAGE_SIZE, we are essentially pegging
> > preemptible_hypercall_page right after hypercall_page.
> >
> > Andrew, David, can you confirm?
> 
> Your version is different to my original one (observe the lack of
> NEXT_HYPERCALL()s),

I don't get what is missing; they should be pretty identical. It had:

--- a/arch/x86/xen/xen-head.S
+++ b/arch/x86/xen/xen-head.S
@@ -85,9 +85,18 @@ ENTRY(xen_pvh_early_cpu_init)
 .pushsection .text
        .balign PAGE_SIZE
 ENTRY(hypercall_page)
+
+#ifdef CONFIG_PREEMPT
+#  define PREEMPT_HYPERCALL_ENTRY(x)
+#else
+#  define PREEMPT_HYPERCALL_ENTRY(x) \
+       .global xen_hypercall_##x ## _p ASM_NL \
+       .set preemptible_xen_hypercall_##x, xen_hypercall_##x + PAGE_SIZE ASM_NL
+#endif
 #define NEXT_HYPERCALL(x) \
        ENTRY(xen_hypercall_##x) \
-       .skip 32
+       .skip 32 ASM_NL \
+       PREEMPT_HYPERCALL_ENTRY(x)

Is that what you mean?
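
To make it concrete, taking sched_op as an example, my manual expansion of
NEXT_HYPERCALL(sched_op) with that applied and CONFIG_PREEMPT unset
(expanding the macros exactly as written above, and showing each ASM_NL as
a line break) comes out roughly as:

ENTRY(xen_hypercall_sched_op)
        .skip 32
        .global xen_hypercall_sched_op_p
        .set preemptible_xen_hypercall_sched_op, xen_hypercall_sched_op + PAGE_SIZE

so each 32-byte hypercall slot also gets an alias defined exactly PAGE_SIZE
above itself, which is what the preemptible_hypercall_page symbol is meant
to label.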

> and I would agree that it would appear as if in your
> version, hypercall_page and preemptible_hypercall_page are symbols with
> the same address.
> 
> nm should give you a quick confirmation one way or another.

symtab:

mcgrof@ergon ~/linux (git::hypercall-preemption-v3)$ grep hypercall_page /boot/System.map-3.19.0-rc4-00040-g9c6fb2a
ffffffff81001000 T hypercall_page
ffffffff81002000 T preemptible_hypercall_page

They are one page apart.

ffffffff81ae7b50 R __ksymtab_hypercall_page
ffffffff81aea610 R __ksymtab_preemptible_hypercall_page
ffffffff81af86c0 r __kcrctab_hypercall_page
ffffffff81af9c20 r __kcrctab_preemptible_hypercall_page
ffffffff81afbaa8 r __kstrtab_preemptible_hypercall_page
ffffffff81afbac3 r __kstrtab_hypercall_page
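
To spell out how this is meant to be consumed (only a rough sketch of the
idea behind how patch 2/2 is expected to use the check in the upcall; the
helper name and exact call site below are made up for illustration, not
the series code):

#include <linux/sched.h>        /* need_resched(), cond_resched() */
#include <linux/ptrace.h>       /* struct pt_regs */
#include <asm/xen/hypercall.h>  /* xen_is_preemptible_hypercall() */

/*
 * Sketch only: if the event channel upcall interrupted a hypercall that
 * was issued through the preemptible page, take a voluntary scheduling
 * point before returning to it, so that long-running multicalls no
 * longer trip the soft lockup detector on !CONFIG_PREEMPT kernels.
 */
static void xen_maybe_preempt_hypercall(struct pt_regs *regs)
{
        if (xen_is_preemptible_hypercall(regs) && need_resched())
                cond_resched();
}

On CONFIG_PREEMPT kernels none of this is needed, which is also why the
preemptible page and the check are only built for !CONFIG_PREEMPT.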

  Luis

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel