Issues with Linux and large NUMA/COMA factor architectures
Excerpt
This is a large-SMP BoF in which the speaker/moderator will present his experience with performance issues, and the solutions employed, on large SMP/NUMA systems based on ScaleMP vSMP Foundation. These issues are relevant to other large SMP/NUMA architectures as well.
Description
ScaleMP vSMP Foundation is a form of aggregation virtualization. vSMP Foundation is a distributed virtual machine monitor (VMM) that aggregates multiple similar x86-64 systems into a single large shared memory system. The systems are interconnected with a fast commodity interconnect (currently InfiniBand). The vSMP Foundation VMM takes care of aggregating all the constituent hardware, which means that it also maintains memory and cache coherency among the constituent systems. Currently, Linux is the only guest operating system supported by vSMP Foundation.
The vSMP Foundation VMM implements multiple inter-node coherency mechanisms. The resulting shared memory architecture is both “NUMA” and “COMA” in nature. The coherency mechanism chosen is transparent to the guest kernel (applications can explicitly choose a coherency mechanism, though). Because of the software approach to coherency and the speeds of existing commodity interconnects, the NUMA factor of the aggregated system is fairly large. The COMA coherency domain results in a large inter-node cacheline size, usually 4 kB. For these two reasons, cache misses and cacheline ping-pongs are a major issue.
This BoF will focus on solutions employed in the kernel and in applications to overcome the performance penalties caused by the NUMA factor and the large cacheline. The effects of the large cacheline show up in different ways: from classical false-sharing cases, where traditional padding-based solutions can be employed, to true-sharing and lock-contention cases, where workarounds based on Linux kernel features and userspace libraries (hugetlb, libhugetlbfs, arena-based allocation, third-party malloc replacements) make more sense. This BoF will summarize the workarounds and techniques used to date, describe some of the techniques we plan to use, and solicit suggestions on some unsolved issues.
Tags
NUMA, COMA, SMP, VMM, OS, virtualization, AGGREGATION
Speaker
Ravikiran Thirumalai
ScaleMP Inc.
Website: http://www.scalemp.com/
Biography
Ravikiran works for ScaleMP as the lead Linux developer. Kiran (as he likes to be called) maintains the ScaleMP-related bits in the Linux kernel and works on the scalability aspects of Linux and its interactions with the ScaleMP vSMP Foundation VMM.