[Xen-devel] RFE: Detect NUMA misconfigurations and prevent machine freezes
When playing with NUMA support recently, I noticed that a host would always hang when I tried to create a cpupool for the second NUMA node in the system. I was using the following commands:

# xl cpupool-create name="Pool-1" sched="credit2"
# xl cpupool-cpu-remove Pool-0 node:1
# xl cpupool-cpu-add Pool-1 node:1

After the last command, the system would hang, requiring a hard reset of the machine to recover.

I tried a different variation with the same result:

# xl cpupool-create name="Pool-1" sched="credit2"
# xl cpupool-cpu-remove Pool-0 node:1
# xl cpupool-cpu-add Pool-1 12

It turns out that the RAM was installed sub-optimally in this machine: node 1 has no memory attached to it at all. A partial output from 'xl info -n' shows:

numa_info              :
node:    memsize    memfree    distances
   0:      67584      62608      10,21
   1:          0          0      21,10

A machine where we could get this working every time shows:

node:    memsize    memfree    distances
   0:      34816      30483      10,21
   1:      32768      32125      21,10

Since a RAM misconfiguration like this can be detected from the NUMA information, I believe we should check that the RAM configuration / layout is sane *before* attempting to split the system, and print a warning if it is not. This would prevent a hard system freeze in this scenario.

-- 
Steven Haigh

📧 netwiz@xxxxxxxxx 💻 https://www.crc.id.au
📞 +61 (3) 9001 6090 📱 0412 935 897

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel
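For illustration, the kind of sanity check requested above could look roughly like the sketch below. It is not part of xl or Xen; the function names parse_numa_info and memoryless_nodes are hypothetical, and the row layout (node: memsize memfree distances) is assumed from the 'xl info -n' excerpts quoted in this message, so real output may need a more forgiving parser. It only flags nodes that report a memsize of 0.

```python
import re

# Matches rows such as "   1:   0   0   21,10" (node: memsize memfree distances).
# ASSUMPTION: the layout is taken from the excerpts quoted in this message.
NODE_ROW = re.compile(r"^\s*(\d+):\s+(\d+)\s+(\d+)\s+([\d,]+)\s*$")


def parse_numa_info(text):
    """Return {node_id: (memsize, memfree)} from 'xl info -n' style text."""
    nodes = {}
    in_numa = False
    for line in text.splitlines():
        # Only look at rows that follow the "numa_info" header, so that
        # other tables in the output are not picked up by mistake.
        if line.strip().startswith("numa_info"):
            in_numa = True
            continue
        if not in_numa:
            continue
        m = NODE_ROW.match(line)
        if m:
            nodes[int(m.group(1))] = (int(m.group(2)), int(m.group(3)))
    return nodes


def memoryless_nodes(text):
    """Return the sorted list of NUMA nodes that report no memory."""
    return sorted(
        node for node, (memsize, _) in parse_numa_info(text).items()
        if memsize == 0
    )


# The two excerpts from this message, used as sample input.
BAD = """numa_info              :
node:    memsize    memfree    distances
   0:      67584      62608      10,21
   1:          0          0      21,10
"""

GOOD = """numa_info              :
node:    memsize    memfree    distances
   0:      34816      30483      10,21
   1:      32768      32125      21,10
"""

if __name__ == "__main__":
    for name, sample in (("bad", BAD), ("good", GOOD)):
        empty = memoryless_nodes(sample)
        if empty:
            print(f"{name}: WARNING: NUMA node(s) {empty} have no memory")
        else:
            print(f"{name}: every NUMA node has memory")
```

A wrapper script could run such a check before 'xl cpupool-cpu-remove' and refuse to continue (or just warn) when the list is non-empty. A check inside the toolstack or hypervisor would of course use the NUMA data directly rather than parsing text.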