Efficient operation of large-scale distributed systems such as enterprise networks, grid systems, and sensor networks is a grand challenge. A key reason is the lack of current, accurate knowledge of the system's global state, including both network and individual machine attributes. Most commercial network management systems monitor only a subset of system metrics, and only at relatively coarse-grained timescales, so that the measurements can be collected and processed at a central location. This impedes decision making at very fine timescales, which is important for several emerging applications such as interactive multimedia services and early anomaly detection systems. We provide a Scalable Sensing Service (S3) to which the network management subsystem, as well as individual applications and services, can subscribe in order to securely obtain customized information at very fine timescales suited to their purposes. For scalable operation, S3 provides the sensing service in a decentralized manner, eliminates unnecessary duplicate measurements by consolidating the sensing requirements of different applications, and provides inference engines that estimate network metrics with high accuracy while avoiding the quadratic load of all-pairs measurements. S3 can be used for a number of management tasks, such as detecting failures or anomalous behavior, locating and placing resources, and setting up network routes for optimal performance.
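To make the scaling argument concrete, the following is a minimal sketch, not S3's actual interface. It assumes each application submits requests as hypothetical `(app, src, dst, metric)` tuples; the function names `all_pairs_probe_count` and `consolidate` are illustrative only. It shows how quickly an all-pairs measurement plan grows with the number of nodes, and how identical requests from different applications could be merged into a single measurement with several subscribers.

```python
def all_pairs_probe_count(n: int) -> int:
    """Number of unordered node pairs a full all-pairs measurement would probe."""
    return n * (n - 1) // 2


def consolidate(requests):
    """Merge per-application requests into a deduplicated measurement plan.

    Each request is a hypothetical (app, src, dst, metric) tuple. The result
    maps each distinct (src, dst, metric) measurement to the set of
    applications subscribed to it, so the measurement is taken only once.
    """
    plan = {}
    for app, src, dst, metric in requests:
        plan.setdefault((src, dst, metric), set()).add(app)
    return plan


# Illustrative requests only; the application and node names are made up.
requests = [
    ("video", "a", "b", "latency"),
    ("anomaly", "a", "b", "latency"),  # duplicate of the request above
    ("anomaly", "a", "c", "loss"),
]
plan = consolidate(requests)

print(all_pairs_probe_count(1000))  # pairs in a 1000-node all-pairs plan
print(len(plan))                    # distinct measurements after merging
```

In this sketch, three requests collapse into two measurements, and the pair count for 1000 nodes illustrates why estimating metrics by inference is attractive compared with probing every pair directly.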


This work is supported in part by DARPA Contract N66001-05-9-8904.

