[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v4] altp2m: Allow the hostp2m entries to be of type p2m_ram_shared



On Tue, Jul 19, 2016 at 5:39 AM, George Dunlap <george.dunlap@xxxxxxxxxx> wrote:
> On 18/07/16 18:06, Tamas K Lengyel wrote:
>>>> Anyhow, at this point I'm
>>>> going to start carrying out-of-tree patches for Xen in my project and
>>>> just resign from my mem_sharing maintainership as I feel like it's
>>>> pretty pointless.
>>>
>>> I'm sorry that you're discouraged; all I can say is that I hope you
>>> reconsider.  I'm not trying to block you, and I'm not ignoring your use
>>> case; it's the job of a maintainer to look at *everyone's* use cases and
>>> try to make sure that they are all accommodated in so far as it is
>>> possible.
>>>
>>> I'm also trying to make sure that the code you end up using in your
>>> project is robust and reliable.  It seems to me like if the current
>>> implementation was fixed, your life would be a lot easier than if we
>>> accept your patch as it is -- your sharing code could just worry about
>>> sharing, your altp2m code could just worry about whatever it's trying to
>>> do, without having to carefully avoid corner cases or manually fix
>>> things up when corner cases happen.  A bit less sharing would happen,
>>> because fewer pages would be eligible to be shared, but overall you'd
>>> have a lot less bugs and headache.
>>>
>>> I invested a lot of my very limited time carefully going through both
>>> sets of code before I answered your e-mail, and I spent a lot of time
>>> trying to explain the kinds of interactions I think will be a problem.
>>> I could have just acked the patch without doing that; but I think that
>>> would have been both less good for you, and less good for the project as
>>> a whole.
>>
>> I certainly appreciate your time spent on this. However, I don't see
>> the point of being maintainer if my opinion on what constitutes an
>> improvement of the system just gets overruled.
>
> You're not being overruled; you're just being asked to make a case for a
> change you want to make to an area of code that I maintain (the p2m
> code).  And the discussion is by no means over; I started the most
> recent discussion by saying "Correct me if I'm wrong", and it looks like
> there are still a number of places where we have different views of the
> facts of the matter.  Once we've established those we may end up with
> closer opinions.
>
> Working together means that sometimes you have to spend the time and
> effort to understand where other people are coming from -- what they
> think is important, what they think is true; and then working with that
> -- correcting them on places where they have misconceptions (or
> double-checking your own beliefs to make sure that you're not mistaken);
> communicating what it is that you think is important, and then trying to
> come up with a way forward that takes everyone's values into account, or
> convincing someone that a particular way really is the best way forward
> (which may mean convincing them to change their priorities somewhat).
>
> I am sorry that the tone of this conversation has heated up.  But the
> reason I've been "raising my voice" as it were is because I've been
> trying to ask questions and raise potential issues, and I feel like
> you've been just hand-waving them away.  You may be 100% right, but it
> is my duty as the maintainer of the p2m code to not accept code until
> I'm reasonably convinced that it's a good way forward.

By no means I meant to heat-up the conversation or hand-wave your
concerns away. I do understand what it takes to work with the
community and that it takes cooperation for that to happen. I was not
hand-waving your concerns away but describing how the two systems
could interact safely together while agreeing with you that yes, there
are still scenarios where it would not be wise to turn two
experimental systems on together.

>
>> I would like to hear the
>> other maintainers opinion on this matter as well but I'm not
>> interested in arguing endlessly or initiating or vote, so if the patch
>> is not allowed in I will accept that decision but I will see no point
>> in continuing as maintainer of the system.
>
> At a basic level, the other maintainers will agree that I shouldn't
> accept code unless I am convinced it's for the good of the project.  But
> since this is a technical issue, before anyone would express an opinion
> to ask me to change my mind, they would want a more complete view of the
> facts of the matter -- facts which you and I are still in the process of
> sorting out.

Both of these systems are fairly complicated and not many people have
been looking at them in-depth, so I most certainly appreciate the time
you spent on reviewing thus far. But your conclusion that there is a
"long way to go here" tells me that an arbitrary criteria is getting
pushed on me that I don't even know how to address. The altp2m system
got merged while it was known that it crashes the hypervisor when
mem_sharing is used, but now when an incremental fix introduces at
least one setup where they can be used together, all of a sudden that
is regression. I understand that it would be better to address all
cases and make the system as robust as possible. It's just way beyond
what I can do here.

>
>> As pretty much my
>> project is the only use-case where these two systems would be used
>> together at this point, and since I already require my users to
>> compile Xen from source it is just easier to go this route then what
>> you suggest and exploring and remedying all possible ways the two
>> systems could be misused when setup in ways they were not intended. If
>> these were considered stable features and not experimental I would
>> agree, but that's just not the case. So I think both of our time is
>> better spent doing other things then arguing.
>
> So a lot of points here.
>
> You have no idea what other projects are doing.  Lots of people take the
> Xen code, do something with it internally, and and we never hear from
> them.  Or maybe they're in a start-up in stealth mode and will announce
> themselves in due course.  If you step down from being a maintainer and
> stop engaging with the community you'll be in the same position.

Surely, I didn't imply there would or could be noone else who would be
interested in these two systems working together. All I meant was that
my project is the only known opensource project at this time that even
uses mem_sharing and altp2m. If anyone else would have been interested
in getting the two systems working together othen then me they
probably would have complained already that hey this crashes the
hypervisor. My point being that at this point the impact of this patch
is likely really low.

>
> There's a very obvious other use case which I've been talking about from
> the beginning: A host administrator / cloud provider / user wants to
> both 1) use page sharing to improve memory efficiency, and 2) allow
> tools using the altp2m feature to run internally within the guest.
> (Given that I have brought it up several times very explicitly, I am in
> fact a bit at a loss to understand why you would say your project is the
> "only use case where these two systems would be used together".)
>
> Even it were true that your project is the only use case, making the
> code so that it only works for your use case is a sure-fire way to
> guarantee that there never are any other use cases.

Well, I do make sure of that in my project by way of a custom XSM
policy that disables the guest from being able to do altp2m on itself.
I don't know what other way you could envision to enforce such
separation then documenting the known behaviors of the system and
trusting that the brave enough administrator who turns on experimental
features is ready to experiment with them.

> I'm not insisting that you either leave the code broken or make sharing
> and altp2m over the same gfn ranges fully robust.  What I've been
> suggesting you do is to fix the code so that it works as it is
> *intended* to work currently -- that is, that gfns refered to by altp2ms
> will be prevented from being shared.  Maybe that's not possible, or
> maybe that's not suitable for your use case; but you haven't yet argued
> either of those points.

OK, now this is a technical point that can be addressed and has not
been articulated as such. From my perspective this is fine, my usecase
assumes the domain is completely memshared before any altp2m
operations are done so I see no problem with this.

>
> As I've said, it seems to me like the option you've chosen isn't even
> the best option *for your project*.  You're giving *yourself* an
> interface which is inherently fragile, and for which you as the consumer
> of that interface need to think carefully about all the corner cases and
> then either avoid them or use mem_sharing to work around them.  Is that
> actually more difficult than just doing the work up front to make sure
> that the corner cases are handled in a robust manner by Xen?
>
> Finally, yes one of the benefits of being a maintainer is that you can
> shepherd and develop the code that you maintain to make sure that it
> doesn't degrade and continues to work for use cases that you care about.
>  But one of the responsibilities of that is looking not only for your
> own use case, but to try to make it a useful tool for the community as a
> whole.
>
> Working with the community certainly is a lot of work, but it's a *lot*
> less work than writing your own hypervisor.  It's also less work, in the
> end, than keeping a massive patchset out-of-tree and having to rebase
> every release.  And it often ends up improving and sharpening the code
> you wrote anyway.
>

Yes, I absolutely agree and that's why I've been working hard with the
community this past years to improve a lot of half - or completely -
abandoned and undocumented portions of the system. I certainly don't
like carrying out-of-tree patches and it's not my first choice.
However, it is unreasonable to expect complete reworks and/or
addressing all possible misuse cases on experimental parts that have
very low utility for most people.

Describing and fixing all possible ways how experiments can go wrong
with experimental systems is unreasonable and is not what I agreed to
when I accepted to maintain mem_sharing. I've accepted it with the
understanding that I can do "odd fixes" and incremental improvements
here and there. I understand if this may not be acceptable for your
components but since mem_sharing is inherently entangled with it, then
that means that doing odd fixes is not possible so keeping that hat as
is pointless. I certainly hope we can do incremental improvements as I
don't see anyone picking this up anytime in the near future.

Thanks,
Tamas

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.