RE: oxenstored performance issue when starting VMs in parallel
> -----Original Message-----
> From: Jürgen Groß <jgross@xxxxxxxx>
> Sent: 22 September 2020 15:18
> To: paul@xxxxxxx; 'Edwin Torok' <edvin.torok@xxxxxxxxxx>; sstabellini@xxxxxxxxxx; 'Anthony Perard' <anthony.perard@xxxxxxxxxx>; xen-devel@xxxxxxxxxxxxxxxxxxxx
> Cc: xen-users@xxxxxxxxxxxxxxxxxxxx; jerome.leseinne@xxxxxxxxx; julien@xxxxxxx
> Subject: Re: oxenstored performance issue when starting VMs in parallel
>
> On 22.09.20 15:42, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: Edwin Torok <edvin.torok@xxxxxxxxxx>
> >> Sent: 22 September 2020 14:29
> >> To: sstabellini@xxxxxxxxxx; Anthony Perard <anthony.perard@xxxxxxxxxx>; xen-devel@xxxxxxxxxxxxxxxxxxxx; paul@xxxxxxx
> >> Cc: xen-users@xxxxxxxxxxxxxxxxxxxx; jerome.leseinne@xxxxxxxxx; julien@xxxxxxx
> >> Subject: Re: oxenstored performance issue when starting VMs in parallel
> >>
> >> On Tue, 2020-09-22 at 15:17 +0200, jerome leseinne wrote:
> >>> Hi,
> >>>
> >>> Edwin you rock! This call in qemu is effectively the culprit!
> >>> I have disabled this xen_bus_add_watch call and re-run the test on our
> >>> big server:
> >>>
> >>> - oxenstored is now between 10% and 20% CPU usage (previously it was
> >>>   at 100% all the time)
> >>> - All our VMs are responsive
> >>> - All our VMs start in less than 10 seconds (before the fix some VMs
> >>>   could take more than one minute to be fully up)
> >>> - Dom0 is more responsive
> >>>
> >>> Disabling the watch may not be the ideal solution (I'll let the qemu
> >>> experts answer this and the possible side effects),
> >>
> >> Hi,
> >>
> >> CC-ed the Qemu maintainer of the Xen code; please see this discussion about
> >> scalability issues with the backend watching code in qemu 4.1+.
> >>
> >> I think the scalability issue is due to this code in qemu, which causes
> >> an instance of qemu to see watches from all devices (even those
> >> belonging to other qemu instances), such that adding a single device
> >> causes N watches to be fired on each of the N instances of qemu:
> >>
> >>     xenbus->backend_watch =
> >>         xen_bus_add_watch(xenbus, "", /* domain root node */
> >>                           "backend", xen_bus_backend_changed,
> >>                           &local_err);
> >>
> >> I can understand that for backwards compatibility you might need this
> >> code, but is there a way that an up-to-date (xl) toolstack could tell
> >> qemu what it needs to look at (e.g. via QMP, or other keys in xenstore)
> >> instead of relying on an overly broad watch?
> >
> > I think this could be made more efficient. The call to
> > "module_call_init(MODULE_INIT_XEN_BACKEND)" just prior to this watch
> > will register backends that do auto-creation, so we could register
> > individual watches for the various backend types instead of this
> > single one.
>
> The watch should be on guest domain level, e.g. for:
>
> /local/domain/0/backend/vbd/5
>
> We have one qemu process per guest, after all.
>

I'll see if I can spin a patch this afternoon.

  Paul

>
> Juergen
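For illustration only, a rough sketch (not the patch Paul refers to) of the idea discussed above: replace the single watch on the "backend" root node with one narrower watch per backend type, scoped to the guest's own backend subtree (e.g. /local/domain/0/backend/vbd/5). The backend_types[] list, the domid parameter and the helper name are assumptions made up for this example; only the shape of the xen_bus_add_watch() call is taken from the snippet quoted in the thread, and the sketch assumes it sits inside hw/xen/xen-bus.c next to that call.

    /*
     * Hypothetical sketch only -- not the actual patch. Instead of one
     * broad watch on "backend" under the domain root (which fires for
     * every device of every guest), register one watch per backend type
     * for a single guest, e.g. /local/domain/0/backend/vbd/5.
     */
    static const char *const backend_types[] = { "vbd", "vif", "9pfs" };

    static void xen_bus_add_backend_watches(XenBus *xenbus, unsigned int domid,
                                            Error **errp)
    {
        unsigned int i;

        for (i = 0; i < ARRAY_SIZE(backend_types); i++) {
            Error *local_err = NULL;
            char key[64];

            /* Watch e.g. "backend/vbd/5" relative to the domain root
             * rather than the whole "backend" area. */
            snprintf(key, sizeof(key), "backend/%s/%u",
                     backend_types[i], domid);

            /* Same call shape as the snippet quoted above; the returned
             * watch handles would also need to be kept (e.g. in a list on
             * the XenBus) so they can be removed later -- omitted here. */
            xen_bus_add_watch(xenbus, "", /* domain root node */
                              key, xen_bus_backend_changed, &local_err);

            if (local_err) {
                error_propagate(errp, local_err);
                return;
            }
        }
    }

With the watch scoped like this, each qemu instance is only woken for its own guest's devices, so adding one device no longer causes every running qemu to re-scan the whole backend tree and oxenstored no longer has to deliver a watch event per device to every instance.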