
Re: [patch 00/37] cpu/hotplug, x86: Reworked parallel CPU bringup


  • To: Peter Zijlstra <peterz@xxxxxxxxxxxxx>, Thomas Gleixner <tglx@xxxxxxxxxxxxx>
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Mon, 17 Apr 2023 11:44:06 +0100
  • Cc: LKML <linux-kernel@xxxxxxxxxxxxxxx>, x86@xxxxxxxxxx, David Woodhouse <dwmw2@xxxxxxxxxxxxx>, Brian Gerst <brgerst@xxxxxxxxx>, Arjan van de Veen <arjan@xxxxxxxxxxxxxxx>, Paolo Bonzini <pbonzini@xxxxxxxxxx>, Paul McKenney <paulmck@xxxxxxxxxx>, Tom Lendacky <thomas.lendacky@xxxxxxx>, Sean Christopherson <seanjc@xxxxxxxxxx>, Oleksandr Natalenko <oleksandr@xxxxxxxxxxxxxx>, Paul Menzel <pmenzel@xxxxxxxxxxxxx>, "Guilherme G. Piccoli" <gpiccoli@xxxxxxxxxx>, Piotr Gorski <lucjan.lucjanov@xxxxxxxxx>, David Woodhouse <dwmw@xxxxxxxxxxxx>, Usama Arif <usama.arif@xxxxxxxxxxxxx>, Juergen Gross <jgross@xxxxxxxx>, Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx, Russell King <linux@xxxxxxxxxxxxxxx>, Arnd Bergmann <arnd@xxxxxxxx>, linux-arm-kernel@xxxxxxxxxxxxxxxxxxx, Catalin Marinas <catalin.marinas@xxxxxxx>, Will Deacon <will@xxxxxxxxxx>, Guo Ren <guoren@xxxxxxxxxx>, linux-csky@xxxxxxxxxxxxxxx, Thomas Bogendoerfer <tsbogend@xxxxxxxxxxxxxxxx>, linux-mips@xxxxxxxxxxxxxxx, "James E.J. Bottomley" <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx>, Helge Deller <deller@xxxxxx>, linux-parisc@xxxxxxxxxxxxxxx, Paul Walmsley <paul.walmsley@xxxxxxxxxx>, Palmer Dabbelt <palmer@xxxxxxxxxxx>, linux-riscv@xxxxxxxxxxxxxxxxxxx, Mark Rutland <mark.rutland@xxxxxxx>, Sabin Rapan <sabrapan@xxxxxxxxxx>
  • Delivery-date: Mon, 17 Apr 2023 10:44:38 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 17/04/2023 11:30 am, Peter Zijlstra wrote:
> On Sat, Apr 15, 2023 at 01:44:13AM +0200, Thomas Gleixner wrote:
>
>> Background
>> ----------
>>
>> The reason why people are interested in parallel bringup is to shorten
>> the (kexec) reboot time of cloud servers to reduce the downtime of the
>> VM tenants. There are obviously other interesting use cases for this
>> like VM startup time, embedded devices...
> ...
>
>>   There are two issues there:
>>
>>     a) The death by MCE broadcast problem
>>
>>        Quite some (contemporary) x86 CPU generations are affected by
>>        this:
>>
>>          - MCE can be broadcast to all CPUs and not only issued locally
>>            to the CPU which triggered it.
>>
>>          - Any CPU which has CR4.MCE == 0, even if it sits in a wait
>>            for INIT/SIPI state, will cause an immediate shutdown of the
>>            machine if a broadcast MCE is delivered.
> When doing kexec, CR4.MCE should already have been set to 1 by the prior
> kernel, no?

No(ish).  Purgatory can't take #MC, or NMIs for that matter.

It's cleaner to explicitly disable CR4.MCE and let the system reset
(with all the MC banks properly preserved), than it is to take #MC while
the IDT isn't in sync with the handlers, and wander off into the weeds.
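
For illustration only, here is a minimal sketch of what "disable CR4.MCE"
means at the bit level. It works on a plain integer copy of a CR4 value and
does not touch the real register, which needs ring 0. The names CR4_MCE,
cr4_without_mce() and cr4_mce_enabled() are hypothetical and are not taken
from the kernel or from purgatory; the only hard fact assumed is that MCE is
bit 6 of CR4.

```c
#include <stdbool.h>
#include <stdint.h>

/* CR4.MCE (Machine-Check Enable) is bit 6 of CR4.
 * Hypothetical name, chosen for this sketch only. */
#define CR4_MCE (UINT64_C(1) << 6)

/* Return a copy of the given CR4 value with the MCE bit cleared.
 * Hypothetical helper: it only does the bit arithmetic and never
 * reads or writes the real control register. */
static inline uint64_t cr4_without_mce(uint64_t cr4)
{
	return cr4 & ~CR4_MCE;
}

/* Report whether the MCE bit is set in the given CR4 value. */
static inline bool cr4_mce_enabled(uint64_t cr4)
{
	return (cr4 & CR4_MCE) != 0;
}
```

With the bit clear, a #MC is not delivered through the IDT at all; the CPU
shuts down instead, which is the "let the system reset" behaviour described
above.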

~Andrew



 

