
Re: Flask vs paging mempool - Was: [xen-unstable test] 174809: regressions - trouble: broken/fail/pass


  • To: Jason Andryuk <jandryuk@xxxxxxxxx>
  • From: Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx>
  • Date: Mon, 21 Nov 2022 11:37:34 +0000
  • Accept-language: en-GB, en-US
  • Cc: Roger Pau Monne <roger.pau@xxxxxxxxxx>, Henry Wang <Henry.Wang@xxxxxxx>, Anthony Perard <anthony.perard@xxxxxxxxxx>, Daniel Smith <dpsmith@xxxxxxxxxxxxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Mon, 21 Nov 2022 11:37:58 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-topic: Flask vs paging mempool - Was: [xen-unstable test] 174809: regressions - trouble: broken/fail/pass

On 18/11/2022 21:10, Jason Andryuk wrote:
> On Fri, Nov 18, 2022 at 12:22 PM Andrew Cooper
> <Andrew.Cooper3@xxxxxxxxxx> wrote:
>> On 18/11/2022 14:39, Roger Pau Monne wrote:
>>> Nov 18 01:55:11.753936 (XEN) arch/x86/mm/hap/hap.c:304: d1 failed to allocate from HAP pool
>>> Nov 18 01:55:18.633799 (XEN) Failed to shatter gfn 7ed37: -12
>>> Nov 18 01:55:18.633866 (XEN) d1v0 EPT violation 0x19c (--x/rw-) gpa 0x0000007ed373a1 mfn 0x33ed37 type 0
>>> Nov 18 01:55:18.645790 (XEN) d1v0 Walking EPT tables for GFN 7ed37:
>>> Nov 18 01:55:18.645850 (XEN) d1v0  epte 9c0000047eba3107
>>> Nov 18 01:55:18.645893 (XEN) d1v0  epte 9c000003000003f3
>>> Nov 18 01:55:18.645935 (XEN) d1v0  --- GLA 0x7ed373a1
>>> Nov 18 01:55:18.657783 (XEN) domain_crash called from arch/x86/hvm/vmx/vmx.c:3758
>>> Nov 18 01:55:18.657844 (XEN) Domain 1 (vcpu#0) crashed on cpu#8:
>>> Nov 18 01:55:18.669781 (XEN) ----[ Xen-4.17-rc  x86_64  debug=y  Not tainted ]----
>>> Nov 18 01:55:18.669843 (XEN) CPU:    8
>>> Nov 18 01:55:18.669884 (XEN) RIP:    0020:[<000000007ed373a1>]
>>> Nov 18 01:55:18.681711 (XEN) RFLAGS: 0000000000010002   CONTEXT: hvm guest (d1v0)
>>> Nov 18 01:55:18.681772 (XEN) rax: 000000007ed373a1   rbx: 000000007ed3726c   rcx: 0000000000000000
>>> Nov 18 01:55:18.693713 (XEN) rdx: 000000007ed2e610   rsi: 0000000000008e38   rdi: 000000007ed37448
>>> Nov 18 01:55:18.693775 (XEN) rbp: 0000000001b410a0   rsp: 0000000000320880   r8:  0000000000000000
>>> Nov 18 01:55:18.705725 (XEN) r9:  0000000000000000   r10: 0000000000000000   r11: 0000000000000000
>>> Nov 18 01:55:18.717733 (XEN) r12: 0000000000000000   r13: 0000000000000000   r14: 0000000000000000
>>> Nov 18 01:55:18.717794 (XEN) r15: 0000000000000000   cr0: 0000000000000011   cr4: 0000000000000000
>>> Nov 18 01:55:18.729713 (XEN) cr3: 0000000000400000   cr2: 0000000000000000
>>> Nov 18 01:55:18.729771 (XEN) fsb: 0000000000000000   gsb: 0000000000000000   gss: 0000000000000002
>>> Nov 18 01:55:18.741711 (XEN) ds: 0028   es: 0028   fs: 0000   gs: 0000   ss: 0028   cs: 0020
>>>
>>> It seems to be related to the paging pool; adding Andrew and Henry so
>>> that they are aware.
>> Summary of what I've just given on IRC/Matrix.
>>
>> This crash is caused by two things.  First
>>
>>   (XEN) FLASK: Denying unknown domctl: 86.
>>
>> because I completely forgot to wire up Flask for the new hypercalls.
>> But so did the original XSA-409 fix (as SECCLASS_SHADOW is behind
>> CONFIG_X86), so I don't feel quite as bad about this.
> Broken for ARM, but not for x86, right?

Specifically, the original XSA-409 fix broke Flask (on ARM only) by
introducing shadow domctl to ARM without making flask_shadow_control()
common.
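
Roughly the shape of the problem, condensed from memory of xsm/flask/hooks.c
rather than quoted verbatim (flask_shadow_control(), SECCLASS_SHADOW and
avc_unknown_permission() are the real names; the bodies are abbreviated):

#ifdef CONFIG_X86   /* hook and SHADOW vectors not built for ARM, so not common */
static int flask_shadow_control(struct domain *d, uint32_t op)
{
    /* The real code maps each XEN_DOMCTL_SHADOW_OP_* to its own vector. */
    return current_has_perm(d, SECCLASS_SHADOW, SHADOW__ENABLE);
}
#endif

static int flask_domctl(struct domain *d, int cmd)
{
    switch ( cmd )
    {
    /* ... per-domctl access checks ... */

    default:
        /* Anything not wired up lands here, hence
         * "FLASK: Denying unknown domctl: 86." */
        return avc_unknown_permission("domctl", cmd);
    }
}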

I "fixed" that by removing ARM's use of shadow domctl, and broke it
differently by not adding Flask controls for the new common hypercalls.

> I think SECCLASS_SHADOW is available in the policy bits - it's just
> whether or not the hook functions are available?

I suspect so.

>> And second because libxl ignores the error it gets back, and blindly
>> continues onward.  Anthony has posted "libs/light: Propagate
>> libxl__arch_domain_create() return code" to fix the libxl half of the
>> bug, and I posted a second libxl bugfix to fix an error message.  Both
>> are very simple.
>>
>>
>> For Flask, we need new access vectors because this is a common
>> hypercall, but I'm unsure how to interlink it with x86's shadow
>> control.  This will require a bit of pondering, but it is probably
>> easier to just leave them unlinked.
> It sort of seems like it could go under domain2 since domain/domain2
> have most of the memory stuff, but it is non-PV.  shadow has its own
> set of hooks.  It could go in hvm which already has some memory stuff.

Having looked at all the proposed options, I'm going to put it in
domain/domain2.
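
For concreteness, a sketch of what the wiring could look like, slotted into the
switch in flask_domctl().  The domctl names are the new ones from this series;
reusing the GETDOMAININFO / SETDOMAINMAXMEM vectors is only an illustration of
"put it in domain/domain2", not necessarily what the final patch will use:

    case XEN_DOMCTL_get_paging_mempool_size:
        return current_has_perm(d, SECCLASS_DOMAIN, DOMAIN__GETDOMAININFO);

    case XEN_DOMCTL_set_paging_mempool_size:
        return current_has_perm(d, SECCLASS_DOMAIN, DOMAIN__SETDOMAINMAXMEM);

(If new vectors end up being preferred instead, the policy's access_vectors
file needs a matching update too.)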

This new hypercall is intentionally common, and applicable to all domain
types (eventually - x86 PV guests use this memory pool during migration).
Furthermore, it needs backporting along with all the other fixes to try to
make XSA-409 work.
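
To show how the common interface gets consumed from the toolstack side, here
is a minimal sketch.  It assumes the libxenctrl wrappers from this series are
named xc_get_paging_mempool_size() / xc_set_paging_mempool_size() with the
obvious signatures; treat the names as placeholders if the final naming
differs:

#include <inttypes.h>
#include <stdio.h>
#include <xenctrl.h>

/* Resize a domain's paging pool and read the value back. */
static int set_paging_pool(uint32_t domid, uint64_t bytes)
{
    xc_interface *xch = xc_interface_open(NULL, NULL, 0);
    uint64_t cur = 0;
    int rc;

    if ( !xch )
        return -1;

    /* Unlike the libxl path being fixed, don't ignore the return value. */
    rc = xc_set_paging_mempool_size(xch, domid, bytes);
    if ( rc == 0 && xc_get_paging_mempool_size(xch, domid, &cur) == 0 )
        printf("d%" PRIu32 " paging pool is now %" PRIu64 " bytes\n",
               domid, cur);

    xc_interface_close(xch);
    return rc;
}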

~Andrew

 

