Information about http://www.planet-lab.org/files/pdn/PDN-02-008/pdn-02-008.pdf

Created: Tue May 13 09:41:24 2003
InfoSpect: Using a Logic Language for
System Health Monitoring in Distributed Systems
Timothy Roscoe
Intel Research - Berkeley

Richard Mortier
Microsoft Research - Cambridge

Paul Jardetzky
Sprint Labs

Steven Hand
University of Cambridge Computer Laboratory



PDN-02-008
December 2002




Appears in Proceedings of the SIGOPS European Workshop, Saint-Emilion,
France, September 2002.
InfoSpect: Using a Logic Language for
System Health Monitoring in Distributed Systems

Timothy Roscoe
Intel Research at Berkeley
2150 Shattuck Avenue, Suite 1300
Berkeley, CA, 94704, USA
troscoe@intel-research.net

Richard Mortier
Microsoft Research
7 J.J. Thompson Avenue
Cambridge CB3 0FB, UK
mort@microsoft.com

Paul Jardetzky
Sprint Labs
1 Adrian Court
Burlingame, CA 94010, USA
pjardetzky@sprintlabs.com

Steven Hand
Univ. Cambridge Computer Laboratory
15 J.J. Thompson Avenue
Cambridge CB3 0FD, UK
steven.hand@cl.cam.ac.uk


Abstract

Dependable systems cannot be built without a monitoring and management
component. In this paper we propose using a wide variety of
information gathering tools coupled with custom scripts and a Prolog
language engine to aggregate information from multiple sources.
Complex queries, difficult to express in standard database languages,
can then be used to answer questions about the system (e.g. the health
of individual components) or to discover contradictions (e.g.
inconsistent configurations). We describe our prototype implementation
and present some early results.

1. Introduction

Today's distributed systems are highly complex dynamic environments
where components are strongly dependent on each other for their
correct behavior. This situation becomes further complicated as we
move toward pervasive and mobile computing, since systems must now
cope with a huge variety of devices and device software versions. In
such systems change is no longer rare: indeed, the state of individual
components may be changing rapidly enough that inconsistency and
instability are the norm. This may be an unavoidable emergent property
of sufficiently large and complex distributed systems (see e.g. recent
work on Internet routing stability [7, 9, 14]).

Yet it is crucial that these dynamic and intricate environments be
managed: service providers need to sell a dependable service to
customers; administrators need to provide a solid infrastructure to
users; and users need their ubiquitous networks to be unobtrusively
reliable. System management, always a difficult problem, becomes
particularly acute when the manager has so little control over so many
aspects of the environment, or little idea about how it all fits
together [2].

This paper addresses the problem of dependability of complex, dynamic
distributed systems. We specifically look at the problem of system
health monitoring, answering questions like "Is the system functioning
correctly?", "What is wrong with the system?", "Is the state of the
system in line or at odds with our expectations?", and "What needs to
be fixed to ensure correct functioning of the system?"

We start from an assumption that the system will always admit
inconsistency (at least transiently), and from the belief that the
goal of dependability can be furthered by capturing and reasoning
about these inconsistencies. We use a logic programming language to
reason about the state of the system, at least in part because the
flexible data representation afforded by such languages is rather more
appropriate for dealing with inconsistencies than a rigid database
schema.

System information is gathered by using whatever ad hoc or
off-the-shelf tools are available; our approach is to build upon
existing system and network management tools, not to replace them.
Indeed, a key benefit of our approach is the ability to use tools with
overlapping areas of applicability, explicitly record the origin of
each piece of information gathered, and then reproduce this when
resolving contradictions.

In addressing the above problems in this paper, we are explicitly not
attempting to provide a mechanism for actively controlling or
configuring networks. Neither is InfoSpect intended to provide the
kind of rapid response offered by, for example, network intrusion
detection systems. Instead, we focus on providing useful diagnostic
information to a management system (which still involves a human
being), integrated from a variety of sources, and which can
effectively deal with unforeseen and/or contradictory conditions in
the system being monitored.

2. Current approaches

Current approaches to determining system state may be split into
network-related and host-related schemes. The former tend to be
marketed by networking equipment or big-iron vendors and address the
traditional FCAPS objectives of network management. Examples include
Cisco's CiscoWorks2000, Micromuse's Netcool, Riversoft's OpenRiver,
IBM's Tivoli Netview, and the HP OpenView suite.

These systems help operators control their network by providing
simplified interfaces to topology discovery, service provisioning and
equipment maintenance checks. The Internet community also provides
some network-related inspection tools, particularly for checking
consistency or syntax of router configuration, for example RPSL at
RIPE [13], or the ISI RaToolSet [6].

Commercial host-related systems management software typically focuses
on the PC/server oriented enterprise space: e.g. IBM's Tivoli Suite,
or Computer Associates' UniCenter. Microsoft's SMS is targeted at
smaller systems composed of Windows-based machines.

All these commercial systems, while useful, are insufficient to solve
the problems of managing a dependable computing environment since they
assume prior knowledge of the correct state of a semi-static set of
actors. The diverse, dynamic and ad hoc systems of the future are not
well served by such simplifying assumptions.

In the research community, there are efforts to increase system
dependability by building operating systems and architectures for
distributed and/or ubiquitous computing (e.g. EROS [15], Xenoservers
[12], one.world [4], JX [3]). This work is valuable, but we see it as
largely orthogonal to our work: no matter how reliable or flexible
individual components are, there will always be a need for a
distributed monitoring and management function.

To summarize: reliability management based on assumptions of total
control within well-defined perimeters starts to look very fragile in
the context of dynamically evolving networks of devices and
peer-to-peer software systems. To cope with such an environment,
management software must itself be ad hoc and constantly evolving. The
remainder of our paper presents our design rationale in further
detail, and introduces our prototype implementation and initial
results.

3. Our approach: Logic languages

We have codified our motivation for investigating logic languages into
a series of (overlapping) design principles which we present below:

Decouple health monitoring from system operation

Most system management tools manage `before the fact': they tightly
integrate the functions of monitoring and control. The emphasis is on
deciding on a desired system state, and then making the system
elements consistent with that state. While this approach can work well
in a highly centralized and controlled environment, complete
consistency is an unrealistic goal in a large and complex system.
There are several reasons for this:

 1. The time taken for the system to converge to the desired state may
    be comparable with the interval between configuration changes.
    This has been observed to be the case with many networks and
    peer-to-peer systems [8, 17].

 2. Changes are frequently made independently of the management
    entity. In many systems, a central management solution simply does
    not scale socially, since the demands of users for changes to the
    infrastructure exceed the capacity of the management organization
    to implement them in a timely manner. As a result, users (whether
    individuals or organizations) take matters into their own hands.
    This is not an unusual state of affairs in networking research
    laboratories, for example. More generally, this is the normal
    state of affairs in a pervasive computing system.

 3. There may be no clearly defined notion of a central management
    entity anyway, because the system is in constant interaction with
    others whose configurations are themselves changing. This is again
    the case for mobile users in a future pervasive computing
    environment, but is also true for large ISP networks which have
    peering relationships with other carriers.

An alternative approach to system health monitoring is to manage
`after the fact': construct a view of what the configuration of the
system actually is, and then allow the operator to manage the system
in order to achieve what is desired. This option is more appropriate
in the kinds of open and dynamic environments we are interested in
here, and our approach falls squarely into this category.
Logic languages such as Prolog [1] seem to have many advantages over
the relational databases used in modern system monitoring packages.
Prolog is a declarative language in which a program consists of facts
about objects in a system, inference rules which define relationships
between objects and allow the derivation of new facts, and queries
which ask questions about objects and their relationships. Facts in
Prolog are free-form logical propositions; there is no predefined data
schema.

Make it easy to integrate diverse information sources

As well as integrated system management environments, a wide variety
of excellent ad hoc system monitoring tools are available both
commercially and non-commercially. Such tools work well because they
focus on specific functionality: port scanning, SNMP traps, etc.
Effective distributed system diagnosis, on the other hand, requires
correlating and aggregating information from many sources.

We want to use as many of these sources of information as possible: as
systems evolve over time, a system health monitor which can usefully
combine the results from other tools will win over an integrated
solution which tries to do everything itself.

There are two challenges here: providing a common representation of
the results from disparate tools so that they can be unified, and the
software engineering problem of interfacing our monitoring system to
these tools.

Prolog performs well in both roles here: in contrast to monolithic
software architectures, logic languages are highly effective at
unifying the output of other tools via the use of inference rules, and
in contrast to relational database management systems (RDBMSs), the
absence of a predefined schema makes it easy to add new information
sources.

Expect the unexpected

In large distributed systems, contradictions frequently exist between
what the managers believe the system configuration to be and what is
independently observed to be the actual system state. These
contradictions are usually symptomatic of security problems, faults,
or misconfigurations, yet they are unlikely to be detected by
monolithic systems based on a well-defined schema, since the idea of a
schema itself always presupposes some consistency of state.

For example, a database which models domain name records as a series
of Unix hostent-like structures will have no way of representing a
situation in which two replica domain name servers have conflicting
A-records.

Put simply: relational database systems do not handle contradictions
well. The separation between observed system facts and inferences
about system state distinguishes InfoSpect from database-oriented
monitoring systems.

We note that in this example, as in most others, a relational schema
clearly could be extended to deal with the situation. Our point is
that for unexpected inconsistencies this must be done after the fact,
a difficult operation in databases, especially as important
information may have already been thrown away. With our approach there
is no need to throw away anything. There is no requirement that the
set of facts we have be consistent. It is not even required that we
have some a priori notion of consistency.

Bind assumptions about state as late as possible

The preceding goals and assumptions lead to perhaps our most important
design principle: avoid binding any assumptions about the state of the
system until the last possible moment. The flexible knowledge
representations allowed by declarative logic languages such as Prolog
benefit us greatly in this respect.

The same flexibility that allows us to easily integrate diverse tools
also allows us to accept data from the tools "without prejudice", and
only later interpret these facts within a particular model of behavior
(and check that they are consistent with the model).

This is in contrast to the use of relational databases, where an
explicit schema is imposed on observations from the network or system.
The process known as "data cleaning" (in data warehousing
terminology), applied when outputs from tools are first fed into the
RDBMS, has a tendency to throw away precisely the anomalous
observations that are most interesting. The use of a schema also makes
it hard to evolve the system as new network characteristics become
important.

With Prolog, the schema is implicitly present only in the queries and
inference rules written alongside the data from network tools, and so
can be changed at will. In fact, we can view the process of making a
query as temporarily binding a schema to the data for the duration of
the query. This late binding is essential in a highly dynamic and
evolving environment.

Of course, the nature of the discovery tools we use also imposes
something of an a priori schema on the data. Note, however, that this
is also the case with systems based on the relational model.
Furthermore, by using a diverse array of different discovery tools and
delaying the unification of their results until query time, InfoSpect
can mitigate the effect of these tool-specific representations of
network state.

Finally, we note that the source of a particular fact about the system
is often as useful as the fact itself in determining state anomalies.
In InfoSpect all facts are labeled with where they were obtained. This
kind of information is typically discarded in database-oriented
systems.
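The labelling and late-binding principles above can be illustrated
with a short sketch. It is written in Python, the language the
prototype's SNMP and DNS walkers are written in, and is an
illustration under stated assumptions rather than the InfoSpect
implementation: the helper names quote_atom, make_fact and
conflicting_a_records, the 10.0.0.x addresses and the name
host.example.com. are all hypothetical. Each observation is rendered
as a Prolog fact whose predicate name carries the tool that produced
it, nothing is rejected on the way in, and the consistency check is
applied only at query time.

```python
from collections import defaultdict


def quote_atom(value):
    """Render a value as a Prolog term: integers bare, anything else a quoted atom."""
    if isinstance(value, int):
        return str(value)
    # Inside a quoted atom, backslashes and single quotes are escaped by doubling.
    escaped = str(value).replace("\\", "\\\\").replace("'", "''")
    return "'" + escaped + "'"


def make_fact(tool, relation, *args):
    """Build one fact; the predicate name is prefixed with the observing tool."""
    return "{}_{}({}).".format(tool, relation, ",".join(quote_atom(a) for a in args))


# Hypothetical observations: two replica DNS servers disagree about one name.
observations = [
    ("dns", "record", ("10.0.0.1", "A", "host.example.com.", "10.0.1.5")),
    ("dns", "record", ("10.0.0.2", "A", "host.example.com.", "10.0.1.9")),
    ("nmap", "tcp_port_open", ("10.0.0.1", 53)),
]

# No validation or "cleaning" on the way in: contradictory facts are both kept.
facts = [make_fact(tool, rel, *args) for tool, rel, args in observations]


def conflicting_a_records(obs):
    """Query-time check: names for which servers report different A values."""
    seen = defaultdict(set)
    for tool, rel, args in obs:
        if (tool, rel) == ("dns", "record") and args[1] == "A":
            server, _, name, value = args
            seen[name].add((server, value))
    return {
        name: sorted(pairs)
        for name, pairs in seen.items()
        if len({value for _, value in pairs}) > 1
    }


for fact in facts:
    print(fact)
print(conflicting_a_records(observations))
```

Both disagreeing A records survive as facts, each still tied to the
server that supplied it; the disagreement surfaces only when
conflicting_a_records is asked for it, which is the sense in which a
schema is bound to the data for the duration of a query.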
4. Implementation

The InfoSpect prototype consists of a central Prolog knowledge base, a
set of driver scripts, and a corresponding set of ad hoc or
off-the-shelf discovery tools.

[Figure 1. System Data Flow in InfoSpect: the discovery tools (Nmap,
Nessus, SNMP, RtCfg, and other tools) each produce Prolog facts, which
flow into the Prolog Knowledge Base alongside inference rules and
other facts.]

Figure 1 illustrates the flow of information through the system. The
"main loop" of the system queries the knowledge base for a list of
driver scripts to be run, and then invokes each of these (preferably
in parallel) to gather information. A driver script queries the
knowledge base for information (e.g. the set of possible routers)
which it uses to drive a discovery tool. The driver script translates
the output of the discovery tool into Prolog facts which are then
added to the knowledge base.

This system is highly extensible since it may be updated to use new
tools without interrupting normal operation by adding to the knowledge
base a set of Prolog facts that describe the set of discovery scripts
to be run. These new or modified tools will be run in subsequent
iterations.

Entries in the knowledge base come in three flavors:

 1. Facts input from external sources (such as a list of well-known
    Trojan horse ports, or vendor tags for Ethernet MAC addresses);

 2. Facts directly observed from some discovery tool (which themselves
    retain information as to which tool they originate from); and

 3. Inference rules which are used to derive new facts from existing
    ones.

For example, entries of the first kind might include the fact that
port 53 is reserved for DNS traffic, entries of the second kind could
include which hosts were observed by Nmap to be listening on TCP port
53, and entries of the third kind could include the idea that a
machine listening on port 53 is likely to be a DNS server.

4.1. Example tools

Some concrete examples of driver scripts and discovery tools used in
our current implementation are:

Network discovery tools: We have a simple SNMP walker written in
Python which writes topology and routing information to the knowledge
base, and a network consistency checker which queries the knowledge
base for likely routers and fetches the running configurations from
those routers. The knowledge base includes the routers' bootstrap
configurations and so the checker can determine what, if anything, has
changed.

Our experience has been that automatically interpreting router
configurations obtained in the traditional way using expect scripts is
easier if the information is converted into Prolog as early as
possible in the process, and the rest of the job implemented as Prolog
rules.

Using this information, predicates can be written to (for example)
dynamically check for consistent BGP filters and policies. Another
application is determining whether it is possible to transmit an IP
packet out of an intranet without traversing a firewall box, a useful
feature in a network testing lab.

Host system scanners: Several driver scripts are used for discovering
hosts and checking host system security, including the network mapper
Nmap [5], and the remote security scanner Nessus [16], fed with the
facts derived from the network discovery tools. The results are facts
like:

    nmap_ipaddr('10.64.201.201').
    nmap_ipaddr('10.64.201.209').
    ...
    nmap_os('10.64.201.201','Solaris 2.6 - 2.7').
    nmap_os('10.64.201.209','Foundry ServerIron XL Switch Version 06.0.00T12').
    ...
    nmap_tcp_port_open('10.64.201.201',22).

These indicate, among other things, that according to Nmap a host with
IP address 10.64.201.201 exists, is listening on the ssh port, and
probably runs Solaris (Nmap includes an operating system fingerprint
capability, used here).

From these results, predicates can be derived to locate security
anomalies in a variety of end systems. A simple example might be to
ask the system for all Windows machines running a vulnerable version
of IIS.

These facts are used by the system in other ways as well. A heuristic
for discovering routers incorporates operating system information and
would include the machine
10.64.201.209 above as a result. We will see in the next section how
heuristics like this can be very concisely expressed.

4.2. A detailed example: DNS walker

In this section we present a much more detailed discussion of one
example driver script, to give a more concrete feel for how the system
works.

The purpose of the DNS walker is to acquire as many DNS records as
possible from as many DNS servers as possible in the local network,
and add this information to the knowledge base. As well as requesting
zone transfers from DNS servers, the walker repeatedly and recursively
makes forward and reverse name lookups for all the host names and
addresses it can find. The walker is fairly straightforward, and
written in Python; we concentrate here on the Prolog operations
related to it.

The walker starts by requesting all values which match the predicate:

    likely_dns_server(X).

This predicate is a pre-defined heuristic for spotting DNS servers; it
is simply defined as:

    likely_dns_server(Machine) :-
        server(Machine,domain).

The server predicate is a similarly-defined heuristic for spotting
servers in general; it is a little more interesting:

    % Service given by name: look up its TCP port, then check Nmap.
    server(Machine,Svc) :-
        ipservice(Svc,Port,tcp,_),
        nmap_tcp_port_open(Machine,Port).
    % Service given directly as a port number.
    server(Machine,Svc) :-
        nmap_tcp_port_open(Machine,Svc).

This allows us to specify services by name (as in the example above)
or port number. The upshot of this is that the system regards a
machine as a potential DNS server if Nmap saw it listening on the
domain port.

The level of indirection afforded by the likely_dns_server heuristic
is important. It gives us a way to flexibly integrate tools without
introducing fragile dependencies between them: the DNS walker does not
need to be aware of the operation of Nmap, and we could remove Nmap
and substitute some other source of port information if we wanted, or
indeed do without automatic discovery of DNS servers altogether, and
manage with a set of user assertions about which machines should be
queried for DNS records.

Conversely, the wrapper for Nmap does not need to concern itself with
outputting facts in a form friendly to the DNS walker. The Prolog
predicates we have listed above give a flavor of the conciseness and
flexibility in tool integration that the use of a logic language
affords us.

For each potential DNS server, the walker starts from an initial list
of host IP addresses, obtained in a similar manner from the results of
running previous tools, and performs its walk over the record space of
the server. The output of the walker consists of two types of Prolog
facts. The first simply confirms that a DNS server was found at a
particular address; it is therefore a stronger statement than
likely_dns_server(X). For example:

    dns_working('10.64.201.201').
    dns_working('10.64.209.2').
    ...

The second type of fact indicates the existence of a DNS record of a
particular type, in a particular server, with particular name and
value. For example:

    dns_record('10.64.201.201','A',
               'kristeva.smoke.sprintlabs.com.',
               '10.64.202.54').

This indicates that the DNS server 10.64.201.201 has an A record
specifying that kristeva.smoke.sprintlabs.com has IP address
10.64.202.54. All information is kept, including the address of the
server which supplied the record. Even the name of the predicate
indicates that the DNS walker, and not some other driver script, was
the source of the information.

These facts are now available as input to other discovery tools (for
example, a driver script might require a list of machines pointed to
by MX records), and can also be queried for inconsistent or anomalous
configurations. For instance, the following Prolog predicate will
uncover servers with inconsistent A-records:

    dns_has_two_arecs(Server,V,N1,N2) :-
        dns_record(Server,'A',N1,V),
        dns_record(Server,'A',N2,V),
        % The ordering test reports each pair of distinct names once.
        N1 @> N2.

More sophisticated queries are also possible. For instance, this
Prolog predicate succeeds if a DNS server has an inconsistent pair of
PTR and A records (i.e., forward and reverse address/name bindings):

    dns_ptr_conflict(Server,Ptr,DNS,A) :-
        dns_record(Server,'A',DNS,A),
        dns_record(Server,'PTR',Ptr,DNS),
        not(ip_arpa_equiv(A,Ptr)).

The ip_arpa_equiv(A,Ptr) predicate succeeds if Ptr is the
".in-addr.arpa" representation of A (or vice versa).

4.3. Performance

We ran InfoSpect in the Sprint Labs internal network of about 300
hosts, more than 20 of which are IP routers. With
our current suite of discovery tools, this results in a knowledge base of
around 25,000 entries.

   There are two aspects to the performance of the system: the time taken to
perform a query against the knowledge base, and the time taken to collect
information from the network. Together these determine how up-to-date the
information in the knowledge base can be, and how quickly this information
can be interpreted to diagnose problems.

   Query performance is good: a typical query (for example, to determine
routing table inconsistencies) takes only a few milliseconds to execute on a
modern workstation. We have not addressed the issue of formulating queries
so as to optimize performance at the Prolog level, since this has not been a
problem so far. Recent work on high-performance declarative languages such
as Mercury [10] may help to address this below the level of the language.

   Runtime of discovery tools, on the other hand, is dominated by network
communication latencies and, more significantly, by the need to throttle
some network probes to prevent undue load on the systems being monitored.
Many of our driver scripts (such as the port scanners) take on the order of
minutes to complete. Consequently, the knowledge base is always a few
minutes behind the state of the system. On this basis, InfoSpect does not
fall into the "real-time anomaly detection" class of applications, but the
speed of queries does allow complex, unanticipated questions to be asked and
answered in a timely fashion.

5. Ongoing work

   There is much short-term work left to do in InfoSpect. The addition of
more discovery tools to the collection we have now will not only extend the
scope of the system but also allow us to gain more experience with
constructing the Prolog functions that integrate tools and interpret the
knowledge base.

   A larger unresolved issue is how best to represent time in InfoSpect: the
knowledge base at present holds only observations about the network from the
loosely-defined "recent past". A natural first step is to add a timestamp to
every fact in the knowledge base which is directly generated from a
discovery tool. Note that this change can in fact be made to a running
InfoSpect system by adding a trivial rule for every class of observation
which removes the timestamp. Inference rules and queries could then be
formulated over a set of similar facts with different timestamps, but we
would prefer a more structured approach to the problem. Two promising
avenues to explore are the kind of timing techniques employed in high-level
hardware design languages, and temporal logics. A challenge will be to
implement a framework for handling time without unduly expanding the state
space of facts we have to deal with.

   While we are unaware of other work using logic languages to integrate
network monitoring tools, the use of rule- and/or event-based systems in
network monitoring is well established. A recent development has been to
apply techniques from data mining and machine learning to derive a set of
empirical rules about the "normal" behavior of a distributed system or
network, and use these in turn to detect anomalies [11]. Such work is
complementary to ours and operates at a higher level of abstraction; we
speculate that InfoSpect's knowledge base offers a rich foundation on which
to build such tools.

6. Conclusion

   Network management and configuration is an increasingly important and
complex part of any corporation's business. Logic languages appear to offer
significant advantages in simplifying the difficult tasks of network and
security administration. We have built a prototype system which has been
used to monitor the local intranet, and in doing so has uncovered hitherto
unknown inconsistencies and potential security vulnerabilities.

   Our experience so far has been good: wrapping existing network tools so
that they are both driven from the contents of the knowledge base and
deposit their results into the knowledge base has proved relatively simple.
By decoupling our approach from the infrastructure as much as possible, we
avoid jeopardizing the dependability of the system we are trying to manage.

   Prolog has proved to be highly effective at unifying the results of
disparate tools. Furthermore, Prolog queries to uncover inconsistencies in
the network state (for example, duplicate or potentially forged DNS records)
are remarkably concise and intuitive, typically three or four lines.

   We are currently extending the work on router and routing policy
configurations to provide the kinds of answers network operators need in the
daily running of complex networks and services, including backbone networks.

7. Acknowledgments

   We would like to thank John Larson and the Sprint Labs Infrastructure
Group for letting us loose on the Lab network in the early stages of this
work. We are also grateful to the anonymous reviewers of this paper for
several insights and useful suggestions.

References

 [1] W. Clocksin and C. Mellish. Programming in Prolog. Springer Verlag,
     1984.
 [2] P. Dourish, D. Swinehart, and M. Theimer. The Doctor Is In: Helping End
     Users Understand the Health of Distributed Systems. In Proceedings of
     the IFIP/IEEE International Workshop on Distributed Systems: Operations
     and Management, December 2000.
 [3] M. Golm and J. Kleinoeder. Ubiquitous Computing and the Need for a New
     Operating System Architecture. Online at
     http://www4.informatik.uni-erlangen.de/Projects/JX/Papers/ubitools01.pdf,
     2001.
 [4] R. Grimm, T. Anderson, B. Bershad, and D. Wetherall. A System
     Architecture for Pervasive Computing. In Proceedings of the 9th ACM
     SIGOPS European Workshop, Kolding, Denmark, pages 177-182, September
     2000.
 [5] Insecure.org. The Nmap stealth port scanner.
     http://insecure.org/nmap/, 2001.
 [6] ISI. RAToolSet. http://www.isi.edu/ra/RAToolSet/, 2001.
 [7] C. Labovitz, A. Ahuja, A. Bose, and F. Jahanian. Delayed Internet
     Routing Convergence. In Proceedings of ACM SIGCOMM 2000, pages 175-187,
     2000.
 [8] D. Liben-Nowell, H. Balakrishnan, and D. Karger. Observations on the
     dynamic evolution of peer-to-peer networks. In Proceedings of the 1st
     International Workshop on Peer-to-Peer Systems (IPTPS '02), Cambridge,
     MA, USA, March 2002.
 [9] R. Mahajan, D. Wetherall, and T. Anderson. Understanding BGP
     Misconfiguration. In Proceedings of ACM SIGCOMM 2002, August 2002.
[10] The Mercury Project. http://www.cs.mu.oz.au/research/mercury, 1996.
[11] M. Núñez, R. Morales, and F. Triguero. Automatic discovery of rules for
     predicting network management events. IEEE Journal on Selected Areas in
     Communications, 20(4):736-745, May 2002.
[12] D. Reed, I. Pratt, P. Menage, S. Early, and N. Stratford. Xenoservers:
     Accounted Execution of Untrusted Code. In Proceedings of the Seventh
     Workshop on Hot Topics in Operating Systems (HotOS-VII), 1999.
[13] RIPE. RPSL.
     http://www.ripe.net/ripencc/pub-services/db/irrtoolset/documentation/,
     2000.
[14] A. Shaikh, L. Kalampoukas, R. Dube, and A. Varma. Routing Stability in
     Congested Networks: Experimentation and Analysis. In Proceedings of ACM
     SIGCOMM 2000, pages 163-174, 2000.
[15] J. S. Shapiro, S. J. Muir, J. M. Smith, and D. J. Farber. Operating
     System Support for Active Networks. Technical Report MS-CIS-97-03,
     University of Pennsylvania, February 1997.
[16] The Nessus Project. http://www.nessus.org/intro.html, 2000.
[17] B. Wilcox-O'Hearn. Experiences deploying a large-scale emergent
     network. In Proceedings of the 1st International Workshop on
     Peer-to-Peer Systems (IPTPS '02), Cambridge, MA, USA, March 2002.




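As an illustration of the dns_ptr_conflict query from Section 4 and of the
timestamp-stripping rule suggested in Section 5, the following is a minimal
sketch of how the pieces might fit together. It assumes SWI-Prolog and that
addresses and names are stored as atoms. The predicates dns_record_at and
arpa_name, the sample facts, and this particular definition of
ip_arpa_equiv are our own illustrative assumptions; the paper does not give
InfoSpect's actual definitions.

```prolog
% Sketch only (SWI-Prolog assumed). dns_record_at/5, arpa_name/2, the sample
% facts and this ip_arpa_equiv/2 are hypothetical, not InfoSpect's own code.
:- use_module(library(lists)).
:- dynamic dns_record_at/5.

% Hypothetical timestamped observations, as a discovery tool might assert:
%   dns_record_at(UnixTime, Server, Type, Name, Value)
dns_record_at(1030000000, ns1, 'A',   'hostA.example.com', '10.1.2.3').
dns_record_at(1030000060, ns1, 'PTR', '3.2.1.10.in-addr.arpa', 'hostA.example.com').
dns_record_at(1030000000, ns1, 'A',   'hostB.example.com', '10.1.2.4').
dns_record_at(1030000060, ns1, 'PTR', '9.2.1.10.in-addr.arpa', 'hostB.example.com').

% The "trivial rule" idea from Section 5: drop the timestamp so that rules
% written against dns_record/4 keep working unchanged.
dns_record(Server, Type, Name, Value) :-
    dns_record_at(_Time, Server, Type, Name, Value).

% arpa_name(+Addr, ?Ptr): Ptr is the in-addr.arpa name of dotted-quad Addr.
arpa_name(Addr, Ptr) :-
    atom(Addr),
    atomic_list_concat(Octets, '.', Addr),    % split '10.1.2.3' on dots
    length(Octets, 4),                        % only accept four labels
    reverse(Octets, Reversed),
    append(Reversed, ['in-addr', arpa], Labels),
    atomic_list_concat(Labels, '.', Ptr).

% Succeeds if either argument is the in-addr.arpa form of the other.
% Comparison is case-sensitive, which real DNS names are not.
ip_arpa_equiv(A, Ptr) :- arpa_name(A, Ptr), !.
ip_arpa_equiv(A, Ptr) :- arpa_name(Ptr, A).

% The query from Section 4, with \+ in place of not/1.
dns_ptr_conflict(Server, Ptr, DNS, A) :-
    dns_record(Server, 'A', DNS, A),
    dns_record(Server, 'PTR', Ptr, DNS),
    \+ ip_arpa_equiv(A, Ptr).
```

With these sample facts, the query dns_ptr_conflict(S, Ptr, Name, Addr)
succeeds only for hostB.example.com, whose PTR name does not correspond to
its A record. Keeping the timestamp in a separate dns_record_at relation is
one possible design; it leaves existing queries untouched while allowing
later rules to reason over observation times.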