Scaling Monitoring and Discovery above the 100K systems range

This proposal has been rejected.


One Line Summary

We give an overview of algorithms for reliable exception monitoring and control that scale beyond the 100K-system range while minimizing network traffic.


The Assimilation Project has implemented algorithms for scaling certain classes of system management work (discovery and monitoring) far beyond other known implementations.

The key to this work is reliably distributing the workload to the managed systems themselves, so that the central control system participates no more than it has to.

The overwhelming majority of normal monitoring and discovery is replaced by an O(1) method – resulting in a highly scalable and reliable “no news is good news” implementation.
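The abstract does not spell out the mechanism, so the following is only a minimal sketch of one way a “no news is good news” scheme can keep the work per system constant. It assumes the systems are arranged in a ring and that each one exchanges heartbeats with its two neighbours, contacting the central system only when a neighbour goes silent. The names `Collector`, `ring_neighbors` and `run_round` are hypothetical and are not taken from the Assimilation Project code.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Collector:
    """Hypothetical central system: it only ever hears about exceptions."""
    reports: list = field(default_factory=list)

    def report_dead(self, reporter: str, suspect: str) -> None:
        self.reports.append((reporter, suspect))


def ring_neighbors(names: list[str]) -> dict[str, tuple[str, str]]:
    """Give each system its two ring neighbours (assumes at least 3 names)."""
    n = len(names)
    return {
        names[i]: (names[(i - 1) % n], names[(i + 1) % n])
        for i in range(n)
    }


def run_round(neighbors: dict[str, tuple[str, str]],
              alive: set[str],
              collector: Collector) -> int:
    """Simulate one heartbeat interval; return how many heartbeats were sent."""
    sent = 0
    for node, peers in neighbors.items():
        if node not in alive:
            continue  # a dead system neither sends nor checks anything
        for peer in peers:
            sent += 1  # heartbeat goes to the neighbour, never to the collector
            if peer not in alive:
                # Assumption: a silent neighbour is reported immediately.
                collector.report_dead(node, peer)
    return sent


if __name__ == "__main__":
    names = [f"host{i}" for i in range(1000)]
    neighbors = ring_neighbors(names)
    collector = Collector()
    run_round(neighbors, set(names), collector)
    print(len(collector.reports))  # reports received while everything is up
    run_round(neighbors, set(names) - {"host5"}, collector)
    print(collector.reports)  # reports filed by the neighbours of host5
```

In this sketch each system sends two heartbeats per interval regardless of how many systems exist, and the central collector receives nothing while every system is up. When one system fails, only its two neighbours file a report.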

This code is useful in many contexts, including managing large cloud infrastructures.


monitoring, command-and-control, discovery


  • Alan Robertson

    Assimilation Systems Limited


    Alan is a frequently requested speaker on high availability, discovery, monitoring and scalability. He is the founder and leader of the Assimilation Project, which provides extremely scalable IT discovery and monitoring. Before founding the Assimilation Project, Alan founded the Linux-HA project (currently known as Pacemaker), which he managed for about 10 years. Alan is currently employed by Assimilation Systems Limited. Before that he worked for IBM, SuSE and Bell Labs.
