So I tried to do it myself for fun, but the first problem I encountered
was counting how much memory is consumed on each node for each VM, which
a more accurate NUMA scheduling & load-balancing could depend on such

Yes, I remember talking with you about this, but I do not remember that
the showstopper was knowing how many pages of a certain domain are
allocated on a certain NUMA node. That is quite straightforward to tell
and, at present, never changes after domain creation!
I think the number of pages allocated to a domain on a node does change
during the domain's runtime. That's what the balloon driver in the guest
is all about. When ballooning out, pages are taken from the guest and
their MFNs are marked as invalid.
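
To make that concrete, here is a toy model of the accounting, not Xen
code. The flat frame table, the INVALID_MFN sentinel and the node MFN
ranges are all assumptions made up for the example; it only shows that
a per-node page count computed at creation time can go stale once the
guest balloons pages out.

```python
from collections import Counter

# Hypothetical sentinel for a frame whose page was ballooned out.
INVALID_MFN = -1


def pages_per_node(p2m, node_ranges):
    """Count how many of a domain's pages sit on each NUMA node.

    p2m         -- list of MFNs indexed by guest frame number (toy layout)
    node_ranges -- {node_id: (first_mfn, end_mfn_exclusive)}, assumed
                   contiguous per node for simplicity
    """
    counts = Counter({node: 0 for node in node_ranges})
    for mfn in p2m:
        if mfn == INVALID_MFN:
            continue  # ballooned out: no longer backed by any node
        for node, (start, end) in node_ranges.items():
            if start <= mfn < end:
                counts[node] += 1
                break
    return dict(counts)


# Two nodes with 1000 frames each, and a five-page domain.
nodes = {0: (0, 1000), 1: (1000, 2000)}
p2m = [10, 11, 1500, 12, 1501]

before = pages_per_node(p2m, nodes)

# The guest balloons out one page that lived on node 0.
p2m[1] = INVALID_MFN
after = pages_per_node(p2m, nodes)

print(before, after)
```

In this model the count for node 0 drops after the balloon-out, so any
scheduler or load balancer relying on the numbers would have to
recompute or update them at runtime.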

If, OTOH, you mean how frequently a certain page is _accessed_ from a
CPU of a certain node, that's indeed a different story. But it's not
what is being asked here.


