[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] Migration bug added by commit 2df1aa01bef7366798248ac6d03cfb42048b003d



On 06/29/15 10:03, Paul Durrant wrote:
-----Original Message-----
From: Paul Durrant
Sent: 29 June 2015 13:54
To: Paul Durrant; Don Slutz; xen-devel@xxxxxxxxxxxxx; Jan Beulich
Subject: RE: [Xen-devel] Migration bug added by commit
2df1aa01bef7366798248ac6d03cfb42048b003d

-----Original Message-----
From: xen-devel-bounces@xxxxxxxxxxxxx [mailto:xen-devel-
bounces@xxxxxxxxxxxxx] On Behalf Of Paul Durrant
Sent: 29 June 2015 10:42
To: Don Slutz; xen-devel@xxxxxxxxxxxxx; Jan Beulich
Subject: Re: [Xen-devel] Migration bug added by commit
2df1aa01bef7366798248ac6d03cfb42048b003d

-----Original Message-----
From: Don Slutz [mailto:don.slutz@xxxxxxxxx]
Sent: 27 June 2015 22:02
To: xen-devel@xxxxxxxxxxxxx; Paul Durrant; Jan Beulich
Subject: Migration bug added by commit
2df1aa01bef7366798248ac6d03cfb42048b003d

commit 2df1aa01bef7366798248ac6d03cfb42048b003d
Author: Paul Durrant <paul.durrant@xxxxxxxxxx>
Date:   Tue Jun 23 18:07:49 2015 +0200

      x86/hvm: remove hvm_io_pending() check in hvmemul_do_io()

...
-            rc = X86EMUL_RETRY;
-            if ( !hvm_send_assist_req(s, &p) )
+            rc = hvm_send_assist_req(s, &p);
+            if ( rc != X86EMUL_RETRY )
...
       if ( unlikely(!vcpu_start_shutdown_deferral(curr)) )
-        return 0; /* implicitly bins the i/o operation */
+        return X86EMUL_OKAY;

So now X86EMUL_OKAY is returned from hvmemul_do_io() during
shutdown.

  From Jan Beulich about this:


   Re: [Xen-devel] [PATCH 3/5] hvmemul_do_io: If the send to the ioreq
   server failed do not retry.
   <https://www.mail-archive.com/search?l=xen-
devel@xxxxxxxxxxxxx&q=subject:%22Re%5C%3A+%5C%5BXen%5C-

devel%5C%5D+%5C%5BPATCH+3%5C%2F5%5C%5D+hvmemul_do_io%5C%3
A+If+the+send+to+the+ioreq+server+failed+do+not+retry.%22&o=newest

Jan Beulich
<https://www.mail-archive.com/search?l=xen-
devel@xxxxxxxxxxxxx&q=from:%22Jan+Beulich%22>
Fri, 30 Jan 2015 02:24:55 -0800
<https://www.mail-archive.com/search?l=xen-
devel@xxxxxxxxxxxxx&q=date:20150130>


On 30.01.15 at 01:52, <dsl...@xxxxxxxxxxx> wrote:
I.E. do just what no backing DM does.

_If_ this is correct, the if() modified here should be folded with the
one a few lines up. But looking at the description of the commit that
introduced this (bac0999325 "x86 hvm: Do not incorrectly retire an
instruction emulation...", almost immediately modified by f20f3c8ece
"x86 hvm: On failed hvm_send_assist_req(), io emulation...") I doubt
this is really what we want, or at the very least your change
description should explain what was wrong with the original commit.

Jan



going up the call stack:

in handle_pio when hvmemul_do_pio_buffer() returns X86EMUL_OKAY,
it
returns 1.

     svm_vmexit_handler 2578 if ( handle_pio(port, bytes, dir) )
or
     vmx_vmexit_handler 3178 if ( handle_pio(port, bytes, dir) )

both update the IP in this case which is wrong during shutdown.

If this is indeed the problem then it seems to me that the correct fix is to
add
a check in handle_pio() and return 0 if the cpu is shutting down.

Actually, with some more thought, it sounds like we essentially want to
unwind the I/O emulation if we're shutting down and I think this could be
achieved by having hvm_send_assist_req() return X86EMUL_RETRY if it hits
the shutdown deferral case, having hvm_emul_do_io() reset the emulation
state back to 'none' if the domain is shutting down, and then having
handle_pio() return 0 if sees X86EMUL_RETRY without any form of
completion being requested.

I think this patch should do it for now:

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index a4d7225..cc6130c 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -183,7 +183,7 @@ static int hvmemul_do_io(
          else
          {
              rc = hvm_send_assist_req(s, &p);
-            if ( rc != X86EMUL_RETRY )
+            if ( rc != X86EMUL_RETRY || curr->domain->is_shutting_down )

I do not know enough about "is_shutting_down" to agree. What is clear is that
this test is not the same as "!vcpu_start_shutdown_deferral(curr)".

    -Don Slutz

                  vio->io_state = HVMIO_none;
              else if ( data_is_addr || dir == IOREQ_WRITE )
                  rc = X86EMUL_OKAY;
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 8226d8c..aba4ef6 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2547,7 +2547,7 @@ int hvm_send_assist_req(struct hvm_ioreq_server *s, 
ioreq_t *proto_p)

      ASSERT(s);
      if ( unlikely(!vcpu_start_shutdown_deferral(curr)) )
-        return X86EMUL_OKAY;
+        return X86EMUL_RETRY;

      list_for_each_entry ( sv,
                            &s->ioreq_vcpu_list,

   Paul

   Paul

So I think:

              rc = hvm_send_assist_req(s, &p);
              if ( rc != X86EMUL_RETRY )
                   vio->io_state = HVMIO_none;

needs to change to:

                  rc = hvm_send_assist_req(s, &p);
                  if ( rc != X86EMUL_RETRY )
                  {
                      vio->io_state = HVMIO_none;
                      if ( rc == X86EMUL_OKAY )
                          rc = X86EMUL_RETRY;
                  }


     -Don Slutz
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.