Xen project Mailing List

Re: [Xen-devel] [PATCH 5/5] xen: Write CR0, CR3 and CR4 in arch_set_info_guest()

To: Tamas K Lengyel <tamas.lengyel@xxxxxxxxxxxx>

From: Razvan Cojocaru <rcojocaru@xxxxxxxxxxxxxxx>

Date: Tue, 19 May 2015 15:10:08 +0300

Cc: Ian Jackson <ian.jackson@xxxxxxxxxxxxx>, "Tian, Kevin" <kevin.tian@xxxxxxxxx>, "wei.liu2@xxxxxxxxxx" <wei.liu2@xxxxxxxxxx>, Ian Campbell <ian.campbell@xxxxxxxxxx>, Stefano Stabellini <stefano.stabellini@xxxxxxxxxxxxx>, Jun Nakajima <jun.nakajima@xxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, "Dong, Eddie" <eddie.dong@xxxxxxxxx>, Tim Deegan <tim@xxxxxxx>, "Aravind.Gopalakrishnan@xxxxxxx" <Aravind.Gopalakrishnan@xxxxxxx>, Jan Beulich <JBeulich@xxxxxxxx>, Keir Fraser <keir@xxxxxxx>, Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxx>, "suravee.suthikulpanit@xxxxxxx" <suravee.suthikulpanit@xxxxxxx>

Comment: DomainKeys? See http://domainkeys.sourceforge.net/

Delivery-date: Tue, 19 May 2015 12:09:36 +0000

Domainkey-signature: a=rsa-sha1; q=dns; c=nofws; s=default; d=bitdefender.com; b=xBnX6Bm9UpYh/YPbWXBx5o0OWBd+L1Vh7aEL1l2FrHfz8PtkRkF9CZ90Vqn++m75yRc9seYfRgsCxYq2xDmITLxrj4ObNb+pG2vFdFgzSYl2E+oGjueJUSaBS6UpdF6yCXWwyW7wLiD59CiftZhtvabE/h36dKpiaLbXNJOqRH+qJqtad4Mg+JUCm9+y00EWV46y9JfMG9iFFWPLH6/exIaIPEvEaxY9F2TTSgLkpyJ+W22wIc8/WJ5EcCTx3DwmlPzSfKPAhYuKdf9jydSdTQDqk1J6Rb9W6uubRFrB3OV+i4lRLZyxJrV9V+bIp555nRXYBUJrWSxrpdtRLJ6rFw==; h=Received:Received:Received:Received:Received:Message-ID:Date:From:User-Agent:MIME-Version:To:CC:Subject:References:In-Reply-To:Content-Type:Content-Transfer-Encoding:X-BitDefender-Scanner:X-BitDefender-Spam:X-BitDefender-SpamStamp:X-BitDefender-CF-Stamp;

List-id: Xen developer discussion <xen-devel.lists.xen.org>

On 05/19/2015 01:14 PM, Tamas K Lengyel wrote: >>> Isn't the event from the guest's perspective guaranteed to be >>> pre-write already? IMHO there is not much point in having two distinct >> >> The CR events are not pre-write: >> >> 3289 int hvm_set_cr3(unsigned long value) >> 3290 { >> 3291 struct vcpu *v = current; >> 3292 struct page_info *page; >> 3293 unsigned long old; >> 3294 >> 3295 if ( hvm_paging_enabled(v) && !paging_mode_hap(v->domain) && >> 3296 (value != v->arch.hvm_vcpu.guest_cr[3]) ) >> 3297 { >> 3298 /* Shadow-mode CR3 change. Check PDBR and update refcounts. */ >> 3299 HVM_DBG_LOG(DBG_LEVEL_VMMU, "CR3 value = %lx", value); >> 3300 page = get_page_from_gfn(v->domain, value >> PAGE_SHIFT, >> 3301 NULL, P2M_ALLOC); >> 3302 if ( !page ) >> 3303 goto bad_cr3; >> 3304 >> 3305 put_page(pagetable_get_page(v->arch.guest_table)); >> 3306 v->arch.guest_table = pagetable_from_page(page); >> 3307 >> 3308 HVM_DBG_LOG(DBG_LEVEL_VMMU, "Update CR3 value = %lx", value); >> 3309 } >> 3310 >> 3311 old=v->arch.hvm_vcpu.guest_cr[3]; >> 3312 v->arch.hvm_vcpu.guest_cr[3] = value; >> 3313 paging_update_cr3(v); >> 3314 hvm_event_cr(VM_EVENT_X86_CR3, value, old); >> 3315 return X86EMUL_OKAY; >> 3316 >> 3317 bad_cr3: >> 3318 gdprintk(XENLOG_ERR, "Invalid CR3\n"); >> 3319 domain_crash(v->domain); >> 3320 return X86EMUL_UNHANDLEABLE; >> 3321 } >> >> The line numbers assume my patch has been applied, but the only relevant >> change here is that hvm_event_cr(VM_EVENT_X86_CR3, value, old); replaced >> the old hvm_event_cr3(value, old); at line 3314. >> >> As you can see, first CR3 is being updated, and the events is being sent >> afterwards. This applies to CR4, etc. also. > > Well, from the guest's perspective it is still pre-write as the guest > hasn't executed yet with the new CR value set. It's post-write only > from the hypervisor's perspective right now. So where we actually send > the event is somewhat arbitrary IMHO as the response processing would > have to call hvm_set_cr3 again if the deny/change_value flag is set. I beg to differ, the time when that control register is set in the HV makes a world of difference. For one, you can see in the code above that changes to the guest _are_ being made before sending out the event (see also the function that sets CR4). Any changes to the guest during hvm_set_crX() will be seen by the vm_event handler application, which can (and does, in our case) query the guest state. So, whatever consequences a CR write might trigger (not least of which is the actual new register value) will be immediately visible to the monitoring application. Secondly, if the vm_event client decides that the write is not wanted, it has to issue a call to the HV to change the value back to the old one. This implies an extra context switch. When you're dealing with _many_ events, an extra context switch can make or break your client application. (This is why we've submitted the initial series patch that fills the registers and puts them in the vm_event request - that could have been done with an extra libxc call from the application, but it made a world of difference to not need to do that.) As for the DENY thing, the idea with the MSR write deny patch is that the work needed to set the MSR is done only once, in hvm_do_resume(). For the vm_event-enabled case, hvm_msr_write_intercept() becomes basically a no-op, so things are _not_ being set twice. >>> event types (PRE/POST). I'm not particularly happy with the "deny >>> change" flag idea either. Why not let the user specify what value he >>> wants to set the register to in such a case? It could the one that was >>> already there (old_value) in "deny change" style, but it might as well >>> be something else. I'm thinking we could keep the existing vm_event >>> hooks in place and simply let the vm_event response specify the value >>> the register should be set to (in the new_value field). >> >> You mean not have a distinct DENY vm_event response flag, but if rsp's >> new_value != req's new value set that one? Sure, that'll work, but it's >> less explicit, and thus, IMHO, more error-prone (it's easy for a >> vm_event consumer to just create the response on the stack, forget (or >> not know) that this might happen, and have the guest just write garbage >> to some register). > > You can have a response flag for it to tell Xen to look at the > new_value. What I meant is why restrict the feature to be DENY only. > You might as well let the user choose the value he wants to see in the > register. The flag would still be necessary to be defined for the > response as to avoid accidentally changing new_value without realizing > the consequence of it. Would you find that useful? All the use cases we have only involve not allowing a malicious write to go through, as previously discussed here: http://lists.xen.org/archives/html/xen-devel/2014-10/msg00254.html It could, of course, be done, but we haven't needed anything like that so far. Maybe there are valid use cases for that though, I'm certainly open to discussion. Thanks, Razvan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.