Re: [Xen-devel] [PATCH 04/10] xen/blkfront: separate ring information to an new struct
On 19/02/15 11:08, Roger Pau Monné wrote:
> On 19/02/15 at 3.05, Bob Liu wrote:
>>
>>
>> On 02/19/2015 02:08 AM, Felipe Franciosi wrote:
>>>> -----Original Message-----
>>>> From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@xxxxxxxxxx]
>>>> Sent: 18 February 2015 17:38
>>>> To: Roger Pau Monne
>>>> Cc: Bob Liu; xen-devel@xxxxxxxxxxxxx; David Vrabel; linux-kernel@xxxxxxxxxxxxxxx; Felipe Franciosi; axboe@xxxxxx; hch@xxxxxxxxxxxxx; avanzini.arianna@xxxxxxxxx
>>>> Subject: Re: [PATCH 04/10] xen/blkfront: separate ring information to an new struct
>>>>
>>>> On Wed, Feb 18, 2015 at 06:28:49PM +0100, Roger Pau Monné wrote:
>>>>> On 15/02/15 at 9.18, Bob Liu wrote:
>>>>> AFAICT you seem to have a list of persistent grants, indirect pages
>>>>> and a grant table callback for each ring; isn't this supposed to be
>>>>> shared between all rings?
>>>>>
>>>>> I don't think we should be going down that route, or else we can hoard
>>>>> a large amount of memory and grants.
>>>>
>>>> It does remove the lock that would have to be accessed by each ring thread to
>>>> access those. Those values (grants) can be limited to a smaller value such
>>>> that the overall number is the same as it was with the previous version. As in:
>>>> each ring has = MAX_GRANTS / nr_online_cpus().
>>>>>
>>>
>>> We should definitely be concerned with the amount of memory consumed on the
>>> backend for each plugged virtual disk. We have faced several problems in
>>> XenServer around this area before; it drastically affects VBD scalability
>>> per host.
>>>
>>
>> Right, so we have to keep both the lock and the amount of memory
>> consumed in mind.
>>
>>> This makes me think that all the persistent grants work was done as a
>>> workaround while we were facing several performance problems around
>>> concurrent grant un/mapping operations. Given all the recent submissions
>>> made around this (grant ops) area, is this something we should perhaps
>>> revisit and discuss whether we want to continue offering persistent grants
>>> as a feature?
>>>
>>
>> Agree. Life would be easier if we could remove the persistent-grant feature.
>
> I was thinking about this yesterday, and IMHO I think we should remove
> persistent grants now while it's not too entangled; leaving it for later
> will just make our lives more miserable.
>
> While it's true that persistent grants provide a throughput increase by
> avoiding grant table operations and TLB flushes, they have several
> problems that cannot be avoided:
>
> - Memory/grant hoarding: we need to reserve the same amount of memory
> as the amount of data that we want to have in-flight. While this is not
> so critical for memory, it is for grants, since using too many grants
> can basically deadlock other PV interfaces. There's no way to avoid this,
> since it's the design behind persistent grants.
>
> - Memcopy: the guest needs to perform a memcopy of all data that goes
> through blkfront. While not so critical, Felipe found systems where
> memcopy was more expensive than grant map/unmap in the backend (IIRC
> those were AMD systems).
>
> - Complexity/interactions: when persistent grants were designed, the number
> of requests was limited to 32 and each request could only contain 11
> pages. This meant we had to use 352 pages/grants, which was fine. Now
> that we have indirect IO and multiqueue on the horizon, this number has
> gone up by orders of magnitude, and I don't think it is viable/useful any
> more.
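[Editor's note: to put rough numbers on the last point above and on Konrad's per-ring split, here is a small illustrative C sketch. The 32-slot ring and 11-segment figures come from the thread itself; the 256-segment indirect maximum, the 4-ring multiqueue count, and the 1024-grant budget are assumptions chosen only to show the scaling, not authoritative blkif protocol constants.]

/* Illustrative arithmetic only; the constants below are assumptions
 * taken from the discussion, not authoritative blkif protocol limits. */
#include <stdio.h>

int main(void)
{
    const unsigned ring_slots    = 32;   /* requests per ring (original design) */
    const unsigned segs_direct   = 11;   /* segments per request, no indirect descriptors */
    const unsigned segs_indirect = 256;  /* hypothetical indirect-descriptor maximum */
    const unsigned nr_rings      = 4;    /* hypothetical multiqueue ring count */
    const unsigned max_grants    = 1024; /* hypothetical per-device grant budget */

    /* Original worst case mentioned above: 32 * 11 = 352 persistent grants. */
    printf("classic worst case: %u grants\n", ring_slots * segs_direct);

    /* With indirect descriptors and one ring per queue the worst case
     * grows by orders of magnitude: 32 * 256 * 4 = 32768 grants. */
    printf("indirect + multiqueue worst case: %u grants\n",
           ring_slots * segs_indirect * nr_rings);

    /* Konrad's suggestion: split a fixed budget evenly across rings so the
     * device-wide total stays the same as with a single ring. */
    printf("per-ring cap: %u grants\n", max_grants / nr_rings);

    return 0;
}

[Even with such a per-ring cap, the device-wide total still multiplies across every plugged VBD, which is the host-level scalability concern Felipe raises above.]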
>
> If Konrad/Bob agree, I would like to send a patch to remove persistent
> grants and then have the multiqueue series rebased on top of that.

I agree with this. I think we can get better performance/scalability gains
with improvements to grant table locking and TLB flush avoidance.

David

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel