
Re: [Xen-devel] questions of vm save/restore on arm64



Hello,

On 13/06/16 01:55, Chenxiao Zhao wrote:


On 6/12/2016 11:31 PM, Julien Grall wrote:
On 12/06/2016 10:46, Chenxiao Zhao wrote:
I finally got save/restore working on arm64, but it only works when I
assign a single vCPU to the VM. If I set vcpus=4 in the configuration file,
the restored VM does not work properly.

Can you describe what you mean by "does not work properly"? What are the
symptoms?

After restoring a VM with more than one vCPU, the VM stays in the "b" state.

This happens if all the vCPUs of the guest are waiting on an event. For instance, if the guest is executing the WFI instruction, the vCPU will be blocked until an interrupt arrives.

I would not worry about this.

[   32.530490] Xen: initializing cpu0
[   32.530490] xen:grant_table: Grant tables using version 1 layout
[   32.531034] PM: noirq restore of devices complete after 0.382 msecs
[   32.531382] PM: early restore of devices complete after 0.300 msecs
[   32.531430] Xen: initializing cpu1
[   32.569028] PM: restore of devices complete after 24.663 msecs
[   32.569304] Restarting tasks ...
[   32.569903] systemd-journal[800]: undefined instruction: pc=0000007fa37dd4c8
[   32.569975] Code: 2947b0cb d50339bf b94038c9 d5033fdf (d53be04f)
[   32.571530] done.
[   32.571631] systemd[1]: undefined instruction: pc=0000007f8a9ea4c8
[   32.571650] Code: 2947b0cb d50339bf b94038c9 d5033fdf (d53be04f)
[   32.573527] auditd[1365]: undefined instruction: pc=0000007f8aca24c8
[   32.573553] Code: 2947b0cb d50339bf b94038c9 d5033fdf (d53be04f)
[   32.636573] systemd-cgroups[2210]: undefined instruction: pc=0000007f99ad14c8
[   32.636633] Code: 2947b0cb d50339bf b94038c9 d5033fdf (d53be04f)
[   32.636726] audit: *NO* daemon at audit_pid=1365
[   32.636741] audit: audit_lost=1 audit_rate_limit=0 audit_backlog_limit=320
[   32.636755] audit: auditd disappeared
[   32.638545] systemd-logind[1387]: undefined instruction: pc=0000007f86e5b4c8
[   32.638594] Code: 2947b0cb d50339bf b94038c9 d5033fdf (d53be04f)

[...]

[   32.638673] audit: type=1701 audit(68.167:214): auid=4294967295 uid=0 gid=0 ses=4294967295 subj=system_u:system_r:systemd_logind_t:s0 pid=1387 comm="systemd-logind" exe="/usr/lib/systemd/systemd-logind" sig=4
[   32.647972] systemd-cgroups[2211]: undefined instruction: pc=0000007fa7f414c8
[   32.648017] Code: 2947b0cb d50339bf b94038c9 d5033fdf (d53be04f)
[   32.648087] audit: type=1701 audit(68.177:215): auid=4294967295 uid=0 gid=0 ses=4294967295 subj=system_u:system_r:init_t:s0 pid=2211 comm="systemd-cgroups" exe="/usr/lib/systemd/systemd-cgroups-agent" sig=4
[   61.401838] do_undefinstr: 8 callbacks suppressed
[   61.401882] crond[1550]: undefined instruction: pc=0000007f8d15d4c8
[   61.401903] Code: 2947b0cb d50339bf b94038c9 d5033fdf (d53be04f)

[...]


Also, I would start by debugging with 2 vCPUs and then increasing the
number step by step.

It's the same issue when restoring a VM with more than one vCPU. What I
see is that the guest reports "undefined instruction" with a random PC that
depends on the save point.

My point was that it is easier to debug with 2 vCPUs than 4 vCPUs. There is less concurrency involved.

The PC is the program counter of the application, which might be fully randomized.


Can you advise how I should start debugging this issue?

The undefined instruction is always the same in your log (d53be04f).
This is the encoding for "mrs x15, cntvct_el0". This register is only accessible at EL0 if CNTKCTL_EL1.EL0VCTEN is set.
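For readers who want to check the decoding themselves, here is a minimal sketch in Python that splits an A64 MRS opcode into its system-register fields. The function name decode_mrs and the small name table are illustrative only (they are not part of Xen or Linux), and the table covers just three generic timer registers.

```python
# Minimal sketch: decode an AArch64 "MRS Xt, <sysreg>" opcode.
# decode_mrs and KNOWN_SYSREGS are illustrative names, not an existing API.

# (op0, op1, CRn, CRm, op2) -> register name, generic timer registers only.
KNOWN_SYSREGS = {
    (3, 3, 14, 0, 0): "cntfrq_el0",
    (3, 3, 14, 0, 1): "cntpct_el0",
    (3, 3, 14, 0, 2): "cntvct_el0",
}


def decode_mrs(insn):
    """Return a disassembly string for an MRS opcode, or None otherwise."""
    # MRS has bits [31:20] equal to 0xD53 (bit 21 is the "read" flag).
    if (insn & 0xFFF00000) != 0xD5300000:
        return None
    op0 = 2 + ((insn >> 19) & 0x1)
    op1 = (insn >> 16) & 0x7
    crn = (insn >> 12) & 0xF
    crm = (insn >> 8) & 0xF
    op2 = (insn >> 5) & 0x7
    rt = insn & 0x1F
    # Fall back to the generic S<op0>_<op1>_C<n>_C<m>_<op2> spelling.
    name = KNOWN_SYSREGS.get(
        (op0, op1, crn, crm, op2),
        "s%d_%d_c%d_c%d_%d" % (op0, op1, crn, crm, op2),
    )
    return "mrs x%d, %s" % (rt, name)


if __name__ == "__main__":
    # The faulting word shown in parentheses on every "Code:" line of the log.
    print(decode_mrs(0xD53BE04F))
```

Feeding it the word in parentheses from the "Code:" lines gives the instruction quoted above.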

I guess that this register has not been saved/restored correctly.
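One way to test that guess is to compare the value of CNTKCTL_EL1 before the save and after the restore and look at the EL0 access bits. The sketch below only interprets a raw register value. How the value is obtained (for example from a debug print) is left open, and the helper name is hypothetical.

```python
# Minimal sketch: interpret the EL0 counter-access bits of a CNTKCTL_EL1 value.
# el0_counter_access is a hypothetical helper, not an existing Xen/Linux API.

EL0PCTEN = 1 << 0  # EL0 may read CNTPCT_EL0 and CNTFRQ_EL0
EL0VCTEN = 1 << 1  # EL0 may read CNTVCT_EL0 and CNTFRQ_EL0


def el0_counter_access(cntkctl):
    """Return which counters EL0 may read for a given CNTKCTL_EL1 value."""
    return {
        "physical": bool(cntkctl & EL0PCTEN),
        "virtual": bool(cntkctl & EL0VCTEN),
    }


if __name__ == "__main__":
    # Example values only: 0x2 has EL0VCTEN set, 0x0 has it clear.
    print(el0_counter_access(0x2))
    print(el0_counter_access(0x0))
```

If the "virtual" flag is set before the save but clear after the restore, a userspace read of cntvct_el0 would trap as an undefined instruction, which would match the log.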

Regards,

--
Julien Grall


 

