Scaling Monitoring and Discovery above the 100K systems range
This proposal has been rejected.
One Line Summary
We give an overview of algorithms for reliable exception monitoring and control that extend beyond the 100K-system range while minimizing network traffic.
Abstract
The Assimilation Project has implemented algorithms for scaling certain classes of system management work (discovery and monitoring) far beyond other known implementations.
The key portion of this work is how to reliably distribute work to systems without the central control system participating any more than it has to.
The overwhelming majority of normal monitoring and discovery is replaced by an O(1) method, resulting in a highly scalable and reliable “no news is good news” implementation.
This code is useful in many contexts, including managing large cloud infrastructures.
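To make the "no news is good news" idea concrete, here is a minimal sketch in Python. It is not the Assimilation Project's code. It assumes a ring topology in which every system checks only its two neighbours and contacts the central system only when a neighbour fails; the names Collector, ring_neighbors and monitoring_round are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Collector:
    """Hypothetical central system: it only ever hears about exceptions."""
    reports: list = field(default_factory=list)

    def report(self, watcher, failed):
        self.reports.append((watcher, failed))


def ring_neighbors(nodes):
    """Map each node to its (predecessor, successor) in a ring.

    The ring is an assumption of this sketch, not something the abstract states.
    """
    n = len(nodes)
    return {
        node: (nodes[(i - 1) % n], nodes[(i + 1) % n])
        for i, node in enumerate(nodes)
    }


def monitoring_round(nodes, alive, collector):
    """Run one round: each live node checks its two neighbours.

    Only failures are sent to the collector. Returns the number of
    peer-to-peer checks performed, which is at most two per node.
    """
    checks = 0
    for node, peers in ring_neighbors(nodes).items():
        if node not in alive:
            continue  # a dead node cannot watch anyone
        for peer in peers:
            checks += 1
            if peer not in alive:
                collector.report(node, peer)
    return checks


if __name__ == "__main__":
    hosts = [f"host{i}" for i in range(6)]
    central = Collector()
    monitoring_round(hosts, set(hosts), central)
    print(central.reports)  # healthy ring: nothing reaches the central system
```

In this sketch each system does a constant amount of work per round regardless of how many systems exist, and a healthy ring sends the central system nothing at all. A failed system is reported independently by both of its neighbours, so losing a single report does not hide the failure.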
Tags
monitoring, command-and-control, discovery
Speaker
Alan Robertson
Assimilation Systems Limited
- Website: http://assimproj.org/
- Blog: http://techthoughts.typepad.com/
- Twitter: OSSalanr
Biography
Alan is a frequently requested speaker on high availability, discovery, monitoring and scalability. He is the founder and leader of the Assimilation Project, which provides extremely scalable IT discovery and monitoring. Before founding the Assimilation Project, Alan founded the Linux-HA project (currently known as Pacemaker), which he managed for about 10 years. Alan is currently employed by Assimilation Systems Limited. Before that he worked for IBM, SuSE and Bell Labs.