[Xen-ia64-devel] RE: ar.unat[patch] fixed this ar.uant issue.[patch] fixed ar.unat save/restore issue
> There should be not register nat bit fault when running itp,

Could you explain why this is true? (what is itp?)

> When nat page fault happens, it is usually caused by an
> instruction which is accessing a page whose page attribute is
> nat page, so it must be ld or st instruction, it is

What if a privileged instruction is on a NaT page and
Xen needs to emulate that instruction?

Thanks,
Dan

> -----Original Message-----
> From: Xu, Anthony [mailto:anthony.xu@xxxxxxxxx]
> Sent: Monday, November 14, 2005 3:37 AM
> To: Magenheimer, Dan (HP Labs Fort Collins)
> Cc: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
> Subject: RE: ar.unat[patch] fixed this ar.uant issue.[patch] fixed ar.unat save/restore issue
>
> Yes, this patch may make dom0 go through ltp test,
> Your logic to handle nat consumption fault is
>     If( register nat bit fault)
>         Inject nat consumption fault to guest;
>     Else(means this nat page fault)
>         Attempting to handle as privop
>         If( it is privop)
>             Return;
>         Else
>             Inject nat consumption fault to guest
>
> When nat page fault happens, it is usually caused by an
> instruction which is accessing a page whose page attribute is
> nat page, so it must be ld or st instruction, it is
> definitely not privop instruction. So it is not necessary to
> attempt to handle nat fault as privop, we should inject it to
> guest directly.
> There should be not register nat bit fault when running itp,
> So the logic in my mind is,
>     If(register nat bit fault)
>         Panic();
>     Else
>         Inject nat consumption fault to guest.
>
> If it panics, there should be some places nearby where
> ar.unat is not correctly handled. We should take this chance
> to fix all ar.unat related bugs.
>
> >I am still not sure about the use of eml_unat. I commented
> >out your code (in ia64_handle_reflection) that sets it to zero
>
> yes, you can comment this code, it was used for debugging
> ar.unat fault.
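The two dispatch policies discussed above can be put side by side in a small C model. This is only a sketch under stated assumptions: the enum names and the functions existing_policy, proposed_policy and example_usage are hypothetical and are not Xen's ia64_handle_reflection or priv_emulate. It shows where the two policies differ: for a privileged instruction that sits on a NaT page, the handling Anthony describes emulates the instruction, while the proposed handling injects the fault into the guest, which is the case Dan asks about.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical classification of a NaT consumption fault. */
enum nat_fault_kind { REGISTER_NAT, INSTRUCTION_NAT_PAGE, DATA_NAT_PAGE };

/* What the hypervisor decides to do with the fault. */
enum nat_action { INJECT_TO_GUEST, EMULATE_PRIVOP, PANIC_HYPERVISOR };

/* Handling as summarized in the message above: register NaT faults go to
 * the guest; NaT page faults are first tried as a privileged operation. */
enum nat_action existing_policy(enum nat_fault_kind kind, bool insn_is_privop)
{
    if (kind == REGISTER_NAT)
        return INJECT_TO_GUEST;
    return insn_is_privop ? EMULATE_PRIVOP : INJECT_TO_GUEST;
}

/* Proposed handling: a register NaT fault is treated as a hypervisor bug,
 * and every NaT page fault is injected without trying emulation. */
enum nat_action proposed_policy(enum nat_fault_kind kind, bool insn_is_privop)
{
    (void)insn_is_privop; /* deliberately ignored by this policy */
    if (kind == REGISTER_NAT)
        return PANIC_HYPERVISOR;
    return INJECT_TO_GUEST;
}

/* The case raised in the reply: a privileged instruction on a NaT page. */
void example_usage(void)
{
    assert(existing_policy(INSTRUCTION_NAT_PAGE, true) == EMULATE_PRIVOP);
    assert(proposed_policy(INSTRUCTION_NAT_PAGE, true) == INJECT_TO_GUEST);
}
```

For ordinary ld/st accesses to a NaT page (insn_is_privop false) both policies inject the fault, so they only disagree on register NaT faults and on privileged instructions located on a NaT page.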
>
> Thanks
> -Anthony
>
> >-----Original Message-----
> >From: Magenheimer, Dan (HP Labs Fort Collins) [mailto:dan.magenheimer@xxxxxx]
> >Sent: 2005-11-12 3:30
> >To: Xu, Anthony
> >Cc: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
> >Subject: RE: ar.unat[patch] fixed this ar.uant issue.[patch] fixed ar.unat save/restore issue
> >
> >Anthony --
> >
> >I just committed a fix to allow nat consumption faults to
> >be delivered again. I think this is now necessary after
> >the region0 virtual address fixes needed for ltp-mmap09.
> >Without these nat fixes, ltp-getpeername01 reproducibly
> >goes into an infinite loop reporting NaT errors (because
> >the "return" in the reflection code doesn't result in
> >the NaT getting reflected to the guest).
> >
> >I have left the printfs so any code that results in
> >a inst/data page nat consumption fault (e.g. certain
> >situations where the zero page is accessed) will be
> >very chatty, but I think that's OK for now until we
> >are sure we have fixed all NaT problems.
> >
> >I am still not sure about the use of eml_unat. I commented
> >out your code (in ia64_handle_reflection) that sets it to zero
> >and Tony's checker program and getpeername01 still work.
> >If this (setting eml_unat to zero) is handling some
> >special case that I am not testing for, please let me
> >know.
> >
> >Thanks,
> >Dan
> >
> >> -----Original Message-----
> >> From: Xu, Anthony [mailto:anthony.xu@xxxxxxxxx]
> >> Sent: Monday, November 07, 2005 6:30 PM
> >> To: Magenheimer, Dan (HP Labs Fort Collins)
> >> Cc: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
> >> Subject: RE: ar.unat[patch] fixed this ar.uant issue.[patch] fixed ar.unat save/restore issue
> >>
> >> See my comments,
> >>
> >> >-----Original Message-----
> >> >From: Magenheimer, Dan (HP Labs Fort Collins) [mailto:dan.magenheimer@xxxxxx]
> >> >Sent: 2005-11-08 2:07
> >> >To: Xu, Anthony
> >> >Cc: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
> >> >Subject: RE: ar.unat[patch] fixed this ar.uant issue.[patch] fixed ar.unat save/restore issue
> >> >
> >> >Another NaT question...
> >> >
> >> >>I recall that some time ago (around the time of the merge)
> >> >>you submitted some patches related to fixing ar.unat saving
> >> >>and restoring.
> >> >
> >> >Another part of your earlier patch was a change in
> >> >ia64_handle_reflection. I still periodically get the
> >> >message:
> >> >
> >> > NaT fault... attempting to handle as privop
> >> >
> >> >Since your latest fix, Tony's regcheck tool no longer
> >> >reports ar.unat as being saved/restored incorrectly.
> >> >I was hoping that the above message would go away also,
> >> >but it has not. I see it a couple times at boot and
> >> >a couple times for every linux compile (at the end so
> >> >it is probably the linker or some other link-related
> >> >tool). I have also seen programs segfault after printing
> >> >this message. So I went to look at the Xen/ia64 code where
> >> >this is printed.
> >> >
> >>
> >> I have not seen nat consumption and segmentation faults for
> >> a long time in your build test and ltp test. Otherwise, I'll
> >> definitely try to fix that.
> >>
> >> >It doesn't look right to me. There are two issues:
> >> >
> >> >1) Your patch added a "return"... I think this means that
> >> >   NaT faults will never get reflected to a guest (even
> >> >   Register NaT Consumption faults).
> >>
> >> Yes, you are right, we should inject NaT Consumption faults
> >> to the guest, but as far as I know there should be no NaT
> >> consumption faults in linux, so I simply added a "return".
> >> I think the best way is to add "panic" at this place; this
> >> will force us to debug this issue rather than temporarily
> >> work around it.
> >>
> >> >2) Since an Instruction NaTPage Consumption fault has higher
> >> >   priority than a Privileged Operation fault, I think the
> >> >   original printf/priv_emulate code was intended to catch
> >> >   this case and properly emulate a privileged instruction
> >> >   on a NaTPage. I think it may also be necessary if a Data
> >> >   NaTPage Consumption fault is incurred when the privop
> >> >   emulation code fetches the instruction. (The code in
> >> >   ia64_handle_reflection should probably check the ISR to
> >> >   avoid calling priv_emulate for other kinds of NaT
> >> >   Consumption though.)
> >>
> >> I had been curious why the emulate function is used to handle
> >> NaT consumption.
> >> Now I understand, thank you for your detailed explanation. Maybe
> >> we need to put more comments in confusing places like this.
> >>
> >> >You know more about NaT's than I do... could you recheck
> >> >this code in ia64_handle_reflection please? Do you have
> >> >any test code that provokes any of these NaT faults?
> >> >
> >>
> >> It is very kind of you to say that; unfortunately I have not
> >> seen those issues. What I suspect is that dom0 does a bank
> >> switch on the shared page but does not consider ar.unat.
> >>
> >> Anyway, I'll try to provoke this fault. If I find it, I'll
> >> definitely fix it.
> >>
> >> >Thanks.
> >> >Dan
> >> >
> >> >> -----Original Message-----
> >> >> From: Xu, Anthony [mailto:anthony.xu@xxxxxxxxx]
> >> >> Sent: Friday, November 04, 2005 12:10 AM
> >> >> To: Magenheimer, Dan (HP Labs Fort Collins)
> >> >> Cc: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
> >> >> Subject: RE: ar.unat[patch] fixed this ar.uant issue.[patch] fixed ar.unat save/restore issue
> >> >>
> >> >> >I am curious about the use of B1NATS in the code
> >> >> >around this patch. Under what circumstances does
> >> >> >this get set/used?
> >> >>
> >> >> 1. emulate bsw1, bsw0
> >> >> 2. emulate rfi.
> >> >> 3. inject fault to guest.
> >> >>
> >> >> >There is similar unat code in
> >> >> >fast_tick (default off) and fast_reflect (default on)
> >> >> >and I am wondering if similar unat changes are needed
> >> >> >and whether it is now OK to turn on HANDLE_AR_UNAT
> >> >> >(which is now default off).
> >> >>
> >> >> You are right, in the above two cases you should also save
> >> >> ar.unat to XSI_B1NATS_OFS after spilling the guest bank1 to
> >> >> the shared page. I had handled all this in C code. I didn't
> >> >> look into the fast hypercall code; it's hard for me to read
> >> >> because I am not good at assembly code. The principle of
> >> >> handling ar.unat is obvious: every time you spill a banked
> >> >> register you must save the corresponding ar.unat after it,
> >> >> and every time you fill a banked register you must restore
> >> >> the corresponding ar.unat before it.
> >> >>
> >> >> We don't need to clear all guest b0 registers and their nat
> >> >> bits. Because r16~r23 are preserved regs and r24~r31 are
> >> >> scratch regs, we only need to restore r16~r23 rather than
> >> >> clear r16~r23 to 0.
> >> >>
> >> >> Next time you enable some functions like hyper_ssm_i, when
> >> >> you save bank1 regs you should also save bank1 unat.
> >> >>
> >> >> Below patch enables HANDLE_AR_UNAT.
> >> >>
> >> >> Signed-off-by Anthony Xu <Anthony.xu@xxxxxxxxx>
> >> >>
> >> >> Thanks,
> >> >> Anthony.
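The spill/fill ordering rule stated in the message above can be modeled in plain C. This is a minimal sketch, not Xen code: struct cpu_model, struct save_area, spill_bank and fill_bank are hypothetical names, and the NaT bit position is taken from the register slot index for simplicity, whereas st8.spill and ld8.fill select the ar.unat bit from bits 8:3 of the memory address. The nats field plays the role of the B1NATS slot mentioned in the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BANK_REGS 16 /* models r16..r31 */

/* Hypothetical model of the CPU state involved; not Xen's structures. */
struct cpu_model {
    uint64_t gr[BANK_REGS];  /* values of the banked registers */
    bool     nat[BANK_REGS]; /* their NaT bits */
    uint64_t ar_unat;        /* models ar.unat */
};

/* Hypothetical save area, standing in for the shared page. */
struct save_area {
    uint64_t regs[BANK_REGS];
    uint64_t nats;           /* plays the role of the B1NATS slot */
};

/* Models st8.spill: store the value, record the NaT bit in ar.unat. */
static void spill_one(struct cpu_model *cpu, struct save_area *sa, int i)
{
    assert(i >= 0 && i < BANK_REGS);
    sa->regs[i] = cpu->gr[i];
    if (cpu->nat[i])
        cpu->ar_unat |= (1ULL << i);
    else
        cpu->ar_unat &= ~(1ULL << i);
}

/* Models ld8.fill: load the value, take the NaT bit from ar.unat. */
static void fill_one(struct cpu_model *cpu, const struct save_area *sa, int i)
{
    assert(i >= 0 && i < BANK_REGS);
    cpu->gr[i] = sa->regs[i];
    cpu->nat[i] = ((cpu->ar_unat >> i) & 1ULL) != 0;
}

/* Spill the bank, then save ar.unat AFTER the spills. */
void spill_bank(struct cpu_model *cpu, struct save_area *sa)
{
    for (int i = 0; i < BANK_REGS; i++)
        spill_one(cpu, sa, i);
    sa->nats = cpu->ar_unat;
}

/* Restore ar.unat BEFORE the fills, then fill the bank. */
void fill_bank(struct cpu_model *cpu, const struct save_area *sa)
{
    cpu->ar_unat = sa->nats;
    for (int i = 0; i < BANK_REGS; i++)
        fill_one(cpu, sa, i);
}
```

If the restore of ar_unat at the top of fill_bank were skipped, any code that changed ar_unat between the spill and the fill would leave the filled registers with the wrong NaT bits.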
> >> >>
> >> >> >-----Original Message-----
> >> >> >From: Magenheimer, Dan (HP Labs Fort Collins) [mailto:dan.magenheimer@xxxxxx]
> >> >> >Sent: 2005-11-03 22:42
> >> >> >To: Xu, Anthony
> >> >> >Cc: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
> >> >> >Subject: RE: ar.unat[patch] fixed this ar.uant issue.
> >> >> >
> >> >> >Hi Anthony --
> >> >> >
> >> >> >I am curious about the use of B1NATS in the code
> >> >> >around this patch. Under what circumstances does
> >> >> >this get set/used? There is similar unat code in
> >> >> >fast_tick (default off) and fast_reflect (default on)
> >> >> >and I am wondering if similar unat changes are needed
> >> >> >and whether it is now OK to turn on HANDLE_AR_UNAT
> >> >> >(which is now default off).
> >> >> >
> >> >> >Thanks,
> >> >> >Dan
> >> >> >
> >> >> >> -----Original Message-----
> >> >> >> From: Xu, Anthony [mailto:anthony.xu@xxxxxxxxx]
> >> >> >> Sent: Thursday, November 03, 2005 1:08 AM
> >> >> >> To: Magenheimer, Dan (HP Labs Fort Collins)
> >> >> >> Cc: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
> >> >> >> Subject: RE: ar.unat[patch] fixed this ar.uant issue.
> >> >> >>
> >> >> >> Dan,
> >> >> >> Last time, I used the ar.unat register to restore the guest
> >> >> >> general register nat bits in the hyper_rfi function to
> >> >> >> eliminate the nat bit consumption fault, but did not restore
> >> >> >> ar.unat.
> >> >> >>
> >> >> >> Signed-off-by Anthony Xu <Anthony.xu@xxxxxxxxx>
> >> >> >>
> >> >> >> Thanks,
> >> >> >> Anthony.
> >> >> >>
> >> >> >> >-----Original Message-----
> >> >> >> >From: Magenheimer, Dan (HP Labs Fort Collins) [mailto:dan.magenheimer@xxxxxx]
> >> >> >> >Sent: 2005-11-03 11:54
> >> >> >> >To: Xu, Anthony
> >> >> >> >Cc: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
> >> >> >> >Subject: RE: ar.unat
> >> >> >> >
> >> >> >> >> I can take a look at this, please send me the regcheck utility.
> >> >> >> >>
> >> >> >> >> Thanks
> >> >> >> >> Anthony
> >> >> >> >
> >> >> >> >Great, thanks! Here's where I got Tony's regcheck tool. If
> >> >> >> >it's not still there, perhaps Tony can post it.
> >> >> >> >
> >> >> >> >By the way, if anyone tries this on a domU, Matt Chapman
> >> >> >> >has a pending fix that resolves a FP save/restore issue.
> >> >> >> >
> >> >> >> >Thanks,
> >> >> >> >Dan
> >> >> >> >
> >> >> >> >> -----Original Message-----
> >> >> >> >> From: linux-ia64-owner@xxxxxxxxxxxxxxx [mailto:linux-ia64-owner@xxxxxxxxxxxxxxx] On Behalf Of Luck, Tony
> >> >> >> >> Sent: Tuesday, March 01, 2005 4:33 PM
> >> >> >> >> To: linux-ia64@xxxxxxxxxxxxxxx
> >> >> >> >> Subject: RE: [patch 2.6.11-rc3-bk4] Correctly dereference ia64_mca_data
> >> >> >> >>
> >> >> >> >> Back on February 9th, I wrote:
> >> >> >> >> >I wrote a test program that loads up random values into registers
> >> >> >> >> >(just r1-r31, a bunch of stacked registers, and f2-f127 for now)
> >> >> >> >> >and then checks that all the registers haven't changed value a
> >> >> >> >> >few thousand times, before reloading with a new set of random
> >> >> >> >> >values.
> >> >> >> >>
> >> >> >> >> A few people asked whether I could post the program ... it took
> >> >> >> >> a while to get sign-off ... but that gave me time to add "branch",
> >> >> >> >> "predicate" and half a dozen "application" registers to the mix,
> >> >> >> >> plus make it print the name of the register that was nuked (instead
> >> >> >> >> of a number that required manual translation).
> >> >> >> >>
> >> >> >> >> I've tested it by using a debugger to zap one of each class
> >> >> >> >> of register that is being monitored to check that it works.
> >> >> >> >>
> >> >> >> >> http://www.kernel.org/pub/linux/kernel/people/aegl/ia64regcheck.tgz
> >> >> >> >>
> >> >> >> >> Usage ... compile, and run a few copies. If they all "exit(0)" (which
> >> >> >> >> may take a couple of days) the test passed. Otherwise you should see
> >> >> >> >> the name of the register printed to stderr, and exit code 1.
> >> >> >> >>
> >> >> >> >> Apart from the MCA case, I haven't seen it report a problem
> >> >> >> >> yet ... but I've only run a few hours.
> >> >> >> >>
> >> >> >> >> -Tony

_______________________________________________
Xen-ia64-devel mailing list
Xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-ia64-devel