|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v5 05/14] tools/xenstore: use accounting buffering for node accounting
Hi Juergen, On 11/05/2023 06:25, Juergen Gross wrote: On 10.05.23 23:31, Julien Grall wrote:On 10/05/2023 13:54, Juergen Gross wrote:On 09.05.23 20:46, Julien Grall wrote:Hi Juergen, On 08/05/2023 12:47, Juergen Gross wrote:Add the node accounting to the accounting information buffering in order to avoid having to undo it in case of failure. This requires to call domain_nbentry_dec() before any changes to the data base, as it can return an error now. Signed-off-by: Juergen Gross <jgross@xxxxxxxx> --- V5: - add error handling after domain_nbentry_dec() calls (Julien Grall) --- tools/xenstore/xenstored_core.c | 29 +++++++---------------------- tools/xenstore/xenstored_domain.h | 4 ++-- 2 files changed, 9 insertions(+), 24 deletions(-)diff --git a/tools/xenstore/xenstored_core.c b/tools/xenstore/xenstored_core.cindex 8392bdec9b..22da434e2a 100644 --- a/tools/xenstore/xenstored_core.c +++ b/tools/xenstore/xenstored_core.c@@ -1454,7 +1454,6 @@ static void destroy_node_rm(struct connection *conn, struct node *node) It is not very clear from the code. And that's why comments are always useful to clarify why corrupt() is the right call. We couldn't allocate a small amount of memory (around 64 bytes)! So long this is the only reason then... Xenstored will probably die within milliseconds. Using the big hammer in such a situation is fine IMO. It will maybe result in solving the problem by freeing of memory (quite unlikely, though), but it won't leave xenstored in a worse state than with your suggestion. ... this might be OK. But in the past, we had a place where corrupt() could be reliably triggered by a guest. If you think that's not possible, then it should be properly documented. And calling acc_commit() here wouldn't really help, as accounting data couldn't be recorded, so there are missing updates anyway due to the failed call of domain_nbentry_dec(). We are removing the node after the accounting is updated. So if the accounting fail, then it should still be correct for anything that was removed before. Also, I think a comment would be warrant to explain why we are returning WALK_TREE_ERROR_STOP here when...+ /* In case of error stop the walk. */ if (!ret && do_tdb_delete(conn, &key, &node->acc)) return WALK_TREE_SUCCESS_STOP;... this is not the case when do_tdb_delete() fails for some reasons.The main idea was that the remove is working from the leafs towards the root.In case one entry can't be removed, we should just stop.OTOH returning WALK_TREE_ERROR_STOP might be cleaner, as this would make sure that accounting data is repaired afterwards. Even if do_tdb_delete() can'treally fail in normal configurations, as the only failure reasons are:- the node isn't found (quite unlikely, as we just read it before enteringdelnode_sub()), or- writing the updated data base failed (in normal configurations it is inalready allocated memory, so no way to fail that) I think I'll switch to return WALK_TREE_ERROR_STOP here.See above for a different proposal.Without deleting the node in the data base this would be another accounting data inconsistency, so calling corrupt() is the correct cleanup measure. Hmmm... I read this as this is already a bug rather than one introduced by this patch. IIUC, then this should be done in a new commit. Cheers, -- Julien Grall
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |