[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] 2/3: MCA/MCE correctable error handling



>>> "Christoph Egger" <Christoph.Egger@xxxxxxx> 22.08.07 10:47 >>>
>On Tuesday 21 August 2007 17:53:45 Jan Beulich wrote:
>> >+} __attribute__((packed));
>>
>> I think it was a general agreement that it is not a good idea (non-portable
>> to non-GNU compilers) to put things like this in public headers. I think by
>> properly ordering things you can get away without this (and almost without
>> padding).
>
>Oh, you're right. I should use #pragma pack(1)  instead.
>Is this fine with you?

I'm afraid that wouldn't work with the compat mode header generation, as the
original use of #pragma pack(push, ...) was banned for (Sun) compatibility
reasons. Consequently, I believe the only way of doing this properly is to avoid
depending on compiler behavior by arranging things appropriately (including
padding members if needed) and avoiding #pragma pack() altogether.

>>
>> >+struct mcinfo_global {
>> >...
>> >+    uint16_t mc_socketid;
>> >+    uint16_t mc_coreid;
>> >+    uint16_t mc_vcpu_id;
>>
>> Unless mc_vcpu_id is intended for that purpose, this neglects
>> hyperthreading (I know, AMD doesn't use this, but the interface should be
>> vendor neutral).
>>
>> If mc_vcpu_id is meant for that purpose, its naming is ambiguous.
>>
>> If mc_vcpu_id is meant as a vcpuid, then ordering things os that mc_domid
>> and mc_vcpu_id are contiguous would seem more natural (making eventual
>> examination in raw memory less confusing).
>
>mc_coreid is the physical core that reported the machine check information.
>mc_socketid is the physical socket the physical core is in. This is useful
>to find all other affected physical cores, when an error in the L3-Cache is 
>reported that is shared over all cores in the chip.
>
>mc_vcpu_id is the id of the active vcpu for the domain, that reported the
>machine check information. Inside Xen, this is current->vcpu_id.
>mc_vcpu_id is needed to deal properly with the upcoming NUMA support
>my collegue is working on.

Okay, but then you're really lacking a thread id here, for being filled by
respective Intel code (AMD code would set this to zero).

>> >+/* sizeof(struct mcinfo_global) + 6 * sizeof(struct mcinfo_bank) == 200.
>> >+ * This is enough space to store mc information of up to six banks.
>> >+ */
>> >+#define MCINFO_MAXSIZE (204 - sizeof(uint32_t))
>>
>> Why don't you use the sizeof()-s from the description? If this is really
>> needed for some reason, then having 200 in the description and 204 in the
>> macro is at least confusing...
>
>MCINFO_MAXSIZE is the size for the content of the mi_data array.
>MCINFO_MAXSIZE + sizeof(mi_nentries) == 204. That is where is number comes
>from.

I concluded that, but pointed out that seeing two different numbers made
me think of why this is so, whereas identical numbers would have avoided
that (and will likely keep others from asking later, too).

Also, you don't say what was the reason for you to use constants instead
of sizeof() here.

>> >     /* Frame containing list of mfns containing list of mfns containing
>> > p2m. */ xen_pfn_t     pfn_to_mfn_frame_list_list;
>> >     unsigned long nmi_reason;
>> >+    struct arch_mc_info mc_info; /* machine check information */
>> >     uint64_t pad[32];
>> > };
>>
>> Are you sure it is appropriate to add a member here without reducing the
>> padding member?
>
>You want me to replace "uint64_t pad[32];" with "uint32_t pad[13];" ?

It would be my understanding that that's the right thing to do (assuming you
calculated the numbers correctly), but I'd rely on Keir to give a final word on
this.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.