Information about http://www.cs.washington.edu/homes/djw/papers/hotnets-reverse-final.pdf

Reverse Engineering the Internet …

Tags: client populations, david wetherall, department of computer science, dns names, engineering university, geog, greater than the sum, internet operation, link bandwidth, mea, measurement techniques, measurement tools, neil spring, network measurement, raphy, reverse engineering, routing policies, thomas anderson, tomography, ure,
Pages: 6
Language: english
Created: Sat Nov 1 22:55:00 2003
Display cached document
Page 1
image
Page 2
image
Page 3
image
Page 4
image
Page 5
image
Page 6
image
                                   Reverse Engineering the Internet
                         Neil Spring, David Wetherall, and Thomas Anderson
               Department of Computer Science and Engineering, University of Washington


    Abstract-- To provide insight into Internet operation and          ficient network measurement techniques make this pos-
performance, recent efforts have measured various aspects              sible. For example, tomography can be used to mea-
of the Internet, developing and improving measurement                  sure link loss passively [29], correlating BGP routing up-
tools in the process. In this paper, we argue that these in-           dates can expose topology [5], tailgating approaches mea-
dependent advances present the community with a startling              sure link bandwidth using few packets [21, 31], decoding
opportunity: the collaborative reverse-engineering of the In-
                                                                       DNS names can expose geography [30], and avoiding re-
ternet. By this, we mean annotating a map of the Inter-
                                                                       dundant measurements can make topology studies practi-
net with properties such as: client populations, features and
workloads; network ownership, capacity, connectivity, geog-            cal [37].
raphy and routing policies; patterns of loss, congestion, fail-           The challenge is to use these techniques to produce new
ure and growth; and so forth. This combination of properties           measurement tools that are efficient enough to run quickly
is greater than the sum of its parts, and exposes the attributes       at the scale of the Internet. Skitter [9] shows that an
of network design easily overlooked by simpler, uncorrelated           individual property, IP connectivity, can be measured at
models. We argue that reverse engineering the Internet is
                                                                       Internet-scale with reasonable accuracy and on a regular
feasible based on continuing improvements in measurement
                                                                       basis. We seek to extend this map with a richer set of prop-
techniques, the potential to infer new properties from exter-
nal measurements, and an accounting of the resources re-               erties. We describe promising approaches to measuring
quired to complete the process.                                        and inferring missing properties of individual routers or
                                                                       links including failure, utilization, packet duplication and
                                                                       corruption, layer 2 topologies, and the location of clients.
1     Introduction                                                     Measurement tools often probe the links along a single
Reverse engineering is the process of learning the design              network path; we describe how to optimize these tools to
of an object by studying its implementation. For the Inter-            measure regions of the network efficiently.
net, we take "design" to mean how its components (links,                  The benefits of a collaborative effort over existing
routers, clients, and networks) are assembled and config-              piecemeal projects are significant. First, coordinated mea-
ured. An individual ISP can use private information to                 surements expose relationships between properties that
study its own network, sometimes publishing the result,                cannot otherwise be analyzed. For example, studies of
but the heterogeneity and number of different ISPs pre-                the IP-level topology have shown heavy-tailed degree dis-
vents us from applying the same techniques to the whole                tributions with the implication that the IP graph may have
Internet or assuming that the results from a single ISP                a low attack-tolerance [1]. However, observing router role
would generalize.                                                      together with node degree shows this implication is mis-
   The importance of quantitative data on the composition              leading because most of the high outdegree nodes are rel-
and operation of the Internet has been broadly recognized              atively unimportant access routers rather than part of an
in the research community [11], yet the challenges of ob-              ISP backbone. Second, relationships between properties
taining it by reverse-engineering are daunting. The Inter-             can be used to derive or check further properties. For ex-
net comprises millions of hosts and routers and is con-                ample, a model of network routing results from a set of
stantly adapting to an ever-changing workload of appli-                chosen paths combined with a topology [36]. Third, co-
cations and traffic, so the process of reverse-engineering             ordinated measurement can save substantial redundant or
must be fast enough to observe a consistent picture. Be-               unnecessary work. For example, measurements that show
cause ISPs provide little support for observing the struc-             that router IP addresses are in different geographic loca-
ture and state of their networks directly, new techniques              tions also show that those IP addresses are not aliases,
must be devised to infer internal details from limited ex-             which would make the alias resolution process more ef-
ternally available information.                                        ficient. Lastly, a community effort will facilitate the vali-
   We argue that, with a community effort, it is possible to           dation that is needed before results can be used with confi-
reverse-engineer the Internet in a single day of concerted             dence. Those with internal information, such as ISP oper-
measurement. These measurements can be repeated on                     ators, can detect errors and provide feedback about where
a regular basis to provide an ongoing view of the struc-               tools work and where they fail. And with a "live" repos-
ture and operation of the Internet. Recent advances in ef-             itory that supports the ongoing collection and analysis of

                                                                   1
measurements, different researchers can compare differ-               2.2    Measured Link Properties
ent methods and repeat experiments at different times and             Loss Packet loss has been inferred in a large scale study
over different ISP networks.                                          by Padmanabhan, et al. [29] using tomography. Tomogra-
   Table 1 is a roadmap for this paper. It lists the properties       phy assigns loss rates to links using measured routing and
that expose network design ­ what we think constitutes a              end-to-end observed loss at a busy Web server. This tech-
reverse-engineering of the Internet. In the next section,             nique holds promise for locating other rare events like du-
we outline the state-of-the-art in Internet measurement:              plication and corruption. Although efficient, tomography
those properties that can be measured today. In Section 3,            requires passive observation points that may be difficult to
we sketch approaches to estimating the remaining proper-              find, so active probing using Tulip [23] may be required
ties. We argue in Section 4 that these measurements can               instead, at least to measure some links.
be made efficient enough that they can be run every sin-              Reordering Although not a performance problem in it-
gle day. Finally, we summarize and discuss collaborative              self, reordering indicates fine-grained multi-path routing
Internet measurement in Section 5.                                    or a topology that includes parallel links. Tulip [23] builds
                                                                      on the techniques of Sting [6] to measure link reordering.
2     State of the Art                                                Delay End-to-end one-way trip times (OTT) can be mea-
                                                                      sured with synchronized clocks, but for the delay between
In this section, we briefly describe what can be measured             routers, geographic distance is an inexpensive approxima-
and analyzed today. While the methodologies we list have              tion [36, 39]. Measurements may refine these estimates,
been published and shown to work, they have generally                 but must use routing knowledge to account for asymmet-
been tested only on a few paths, in simulation, or on sub-            ric return paths.
sets of the topology. A general challenge for reverse-                Delay variation The variation in delay caused by queu-
engineering is to validate these techniques and generalize            ing in contention with other flows is both a problem for
them to the scale and heterogeneity of the Internet.                  time-critical traffic and an indication of congestion in the
                                                                      network. The cing [4] tool uses ICMP timestamps to esti-
2.1    Measured Node Properties                                       mate the delay variation of path segments.
IP aliases Traceroute lists interface addresses, and differ-          Capacity Two methods can measure the capacity of a
ent interface addresses for the same router must be re-               link: variable packet size and tailgating packet pair. Vari-
solved. The Rocketfuel tool, Ally [37], builds on the Mer-            able packet size methods (pchar [22], clink [12]) are tra-
cator [15] alias work, using topology and DNS to guide                ditional but require thousands of packets. Packet quar-
the search for alias pairs.                                           tets [31] and cartouche probes [17] refine nettimer [21]
                                                                      and measure capacity more cheaply. We see advances in
Geography The placement of routers in cities exposes                  capacity measurement as evidence that network measure-
POP structure and backbone interconnections. GeoClus-                 ment techniques are constantly improving.
ter [30] uses databases to infer node location. DNS names
work for routers, but the database is tedious to maintain.            2.3    Measured Topology Properties
                                                                      Topology Four levels of Internet connectivity have been
Owner Isolating networks by their administration allows
                                                                      studied. RouteViews [26] data includes the graph of con-
separately engineered networks to be analyzed indepen-
                                                                      nections between autonomous systems. Skitter [9] peri-
dently. Mao et al. [24] provide techniques to determine
                                                                      odically collects the topology between IP addresses. Mer-
the autonomous system responsible for a router and host.
                                                                      cator [15] resolves IP aliases to construct a router-level
Router role Routers serve different purposes: some are                topology. Finally, Rocketfuel [37] groups routers by ge-
backbone routers that connect to other POPs, others are               ography to provide a POP-level (backbone) topology.
access routers that aggregate customers. Rocketfuel used              Routing Routing policy affects how packets are directed,
DNS and topological ordering to identify roles.                       and policies that differ from the default indicate traffic
                                                                      engineering. Policy is applied both at the IP level [36],
Implementation features TCP features, which are                       which relates directly to performance and traffic engineer-
tracked by tbit [28], and services supported, which can               ing, and at the autonomous system level [14], which ex-
be measured with nmap, affect the composition of traffic              presses the business relationships in the network. Both
and show how quickly features are deployed. We do                     policies can be inferred by observing alternate paths that
not expect it to be practical to run nmap to every host in            are not chosen, either because of intra-domain link met-
today's security conscious Internet. It is an open question           rics or inter-domain policy.
whether such techniques can be extended to infer router               Workload Traffic matrices ­ how much traffic is ex-
configuration parameters (e.g. for RED).                              changed between network edges ­ shape the evolution of
                                                                      the network. The tomogravity [40] approach provides a

                                                                  2
 Property                  Tool               Technique                                       Packets and dependencies §4
 Node (Host and Router) Properties
 IP aliases              ally [37]            Four-packet test for same IP ID counter,             5×A + 0.005 × 4(0.9A)2
                                              to TTL-nearby addresses
 Geography                GeoCluster [30]     Groups prefixes by topology, BGP                                  database: 0
 Router role              Rocketfuel [37]     DNS feature extraction, topology                                  database: 0
 Owner                    Mao et al. [24]     DNS, BGP, whois                                                   database: 0
 Implementation              tbit [28]        TCP behavior inference                                               126 ×N
 features                   nmap [13]         TCP port scan                                                       1658 ×N
 Node failure                  §3.1           Probe IP-ID every minute                              24 hours × 60 min. × N
 Link Properties
 Loss                    tomography [29]      Observe TCP retransmissions                                    tomographic: 0
                            Tulip [23]        Observe IP ID sequences                        ( 200 (small) + 100 (big) ) × L
 Reordering                 Tulip [23]        Observe IP ID ordering in responses                                   200 × L
 Delay                    geography [39]      Geographic distance ÷ speed of light           Geography
 Delay variation             cing [4]         ICMP timestamp requests sent to routers                 (estimated) 1000 × L
 Capacity                  quartets [31]      Tailgating-based packet train                            40 (big packets) × L
 Link failure                  §3.1           Correlate route changes with policy to                     Routing, Topology
                                              find infinite cost links
 Idle capacity                  §3.2          Extend pathload [20], chirps [35]                   5 retries × 100 (big) × L
                                §3.2          Model queuing                                  Delay variation, Capacity
 Utilization                    §3.2          Capacity ­ Idle capacity                       Capacity, Idle capacity
 Layer 2 Connection             §3.3          Point-to-point vs. multiple access by          Topology
                                              address allocation
 Layer 2 Switches               §3.3          Difference [34] between capacity               Capacity
                                              measurements
 Duplication               §3.4               Observe repeated IP identifiers                                 tomographic: 0
 Corruption                §3.4               Observe bad TCP checksums                                       tomographic: 0
 Topology (Graph) Properties
 Topology (AS)      RouteViews [26]       Archive BGP tables                                              database: 0
 Topology (IP)         Skitter [9]        Traceroute to all destinations                                    2×S ×A
 Topology (Router)    Mercator [15]       IP aliases and IP topology                  Topology (IP), IP aliases
 Routing (IP)        Rocketfuel [36]      Constraint solver infers metrics            Topology (Router)
 Routing (AS)        Gao et al. [14]      Find AS relationships in BGP tables                             database: 0
 Client location           §3.5           Analyze prefix density [25] passively,                       tomographic: 0
                                          with BGP dynamics [5] or topology
 Workload                tomogravity [40] Traffic matrix estimation using gravity     Client location, Utilization,
                                          and tomography models                       Capacity, Routing
 Overall                       3224N + 1400L + 640(big packets)L + 0.016A2 + 2SA = 5 billion packets, 803 GB
 Per source                                       27 million packets, 4.0 GB, 372 Kbits/s = $37 per month per source

Table 1: Properties that expose network design and approaches to their measurement. L is the number of router to router
or router to host links to measure; there are 786,000 IP to IP links in Skitter, and we estimate there are two-thirds as
many (500,000) router to router links. A represents the addresses, 419,752 in Skitter, and N the number left after alias
resolution (we assume N =A). S is the number of sources or vantage points, 200 combining Skitter, Surveyor, PlanetLab,
RON, and NIMI. Tomographic analyses are passive combined with routing policy and topology. Database measurements
require at most A packets but usually zero. We list tools to show that properties can be measured, not to endorse certain
tools over others. Section numbers represent research challenges discussed in this paper. The totals at the bottom are
calculated without optimizations discussed in Section 4, and the cost estimate is based on 1Mbps  $100 per month [16].


technique to estimate workload from measurable quanti-             3    Inferring New Properties
ties, but first we must measure link utilization and the lo-
cation of clients, described in the next section.                  The next challenge we discuss is estimating the remaining
                                                                   properties in Table 1. We argue that it is possible to mea-
                                                                   sure failure, utilization, layer 2 topology, duplication, cor-

                                                               3
ruption, and client location, although much research re-              approach that models a link as an M/D/1/K queue, and
mains to address practical issues and validate techniques.            uses queue length (which can be sampled by delay varia-
                                                                      tion), loss rate, and capacity to solve for the capacity of the
3.1    Failure and Evolution                                          queue and link utilization. Results using these techniques
Failures generally last only a few minutes [18], so exter-            can be validated against known workloads from cooperat-
nally observing failures, let alone determining where they            ing ISPs.
occur, is daunting. However, as yet another instance of
heterogeneity in the Internet, some links are more failure            3.3    Below IP
prone than others [18], which means that if we can iden-              Internet mapping efforts that use traceroute are limited in
tify the links and routers that are likely to fail, we can            their ability to discover the real, physical topology under-
focus measurements on those.                                          lying the IP-level topology: MPLS, multi-access LANs,
   Several primitives can detect node and link failures. A            and other switched networks can appear as a mesh of vir-
simple approach to detecting long term failure is to ana-             tual point-to-point links. It is important to identify these
lyze changes in the measured network: removal and re-                 switched networks so that they do not hide the true prop-
placement of a link or router suggests a temporary failure.           erties of the media.
To measure short-term failures, however, new approaches                  The immediate goal in understanding layer two topolo-
are needed. First, when packets traverse a path that is in-           gies is to simplify measured network maps that include
consistent with routing policy, a failure may be the cause.           switched networks. Common address allocation policies
Inferring the routing policy during the failure may show a            may provide a solution. Point-to-point links in the core
link (or set of links around a failed node) as having "infi-          of the Internet (not dial-up PPP) are assigned addresses
nite" cost ­ exposing the failed component. This extends              from /30 prefixes ­ address blocks with one address for
the approach of Chandra et al. [8]. To detect node fail-              either end of the link, a network address, and a broad-
ure explicitly, probe traffic could monitor routers to ob-            cast address. As a result, in a network full of point to
serve the IP identifier. The IP identifier (a unique value            point IP links, one half of the addresses will be observed
that helps fragment reassembly) is generally implemented              as unavailable (those that end in 0 and 3 modulo 4). Ob-
as a counter [7], which may be cleared when a router is               serving an interface address that is not part of a point to
rebooted or powered off. In practice, the rate of change              point link indicates that a larger address block (such as a
of the IP identifier varies from router to router, and large          /29 or /28) is being used by a shared or switched network.
spikes (perhaps due to routing protocol exchanges ­ an-               This method would not detect a "switch" in the middle of
other indication of change) may conceal a fast reboot. We             a point-to-point link, but the result should simplify mea-
believe that future measurement studies can resolve these             sured maps.
issues to detect failure.                                                Also of interest is the underlying network switch topol-
                                                                      ogy. Prasad et al. [34] showed that store-and-forward
3.2    Utilization and Workload                                       switches have an observable effect on pathchar-like tools.
The utilization of a network link is the difference between           The extra store and forward stage represented by an "in-
its total capacity and the available (idle) capacity. As uti-         visible" switch doubles the extra serialization delay ob-
lization alone has limited value, we focus on measuring               served when increasing the size of single packet probes.
it along with total link capacity. Tools exist to measure             This makes the link appear only half as fast as it is. If
link capacity (e.g. pchar [22]) and the bottleneck avail-             any tool measures greater available bandwidth than link
able capacity of a path (e.g. pathload [20]), but not yet the         capacity, a switch is indicated. Similarly, one might
available capacity of individual links.                               use traffic designed to contend for only certain seg-
   As a dynamic property with troublesome statistical                 ments of a hypothesized switched link, as proposed by
properties [32], the utilization of a link is likely to be dif-       Coates et al. [10]. Similarly, differences between geo-
ficult to measure. There are two possible approaches to               graphic latency and measured link latency can imply de-
estimating utilization. First, pathload may be extended               tour routing at the link layer.
using tailgating. Pathload uses an adaptive search to de-
tect when it barely fills the idle capacity of a path. The            3.4    Link Pathologies
available capacity of links before the bottleneck may be              Duplication and corruption are so rare in the network that
measured by modifying pathload's link-filling traffic to              their active measurement is likely to be too expensive.
expire at various hops in the network while small tailgat-            Passive measurement with tomography provides an alter-
ing packets continue on to a receiver. Links downstream               native. Although the outbound path from a Web server can
of the bottleneck may be mapped by combining the results              be measured with traceroute and loss observed through its
of other vantage points. Second, tools such as cing [4] and           recovery in TCP (as by Padmanabhan [29]), duplication
tulip [23] measure delay variation and loss to a router. It           and corruption are only visible on inbound traffic. With
may be possible to analyze the distribution of these sam-             the collaborative reverse-engineering we propose, the path
ples to estimate utilization. Alouf et al. [3] present an             that inbound duplicated or corrupted traffic traversed can

                                                                  4
be inferred from other measurements. This means that                    Alias resolution is the most expensive analysis in Ta-
a modest number of passive packet monitors at selected               ble 1, consuming 53% of the packets even after simple
sites may be able to infer link pathologies, enhancing the           improvements. In Rocketfuel [37], we used the TTL re-
completeness of the reverse engineering effort.                      maining in responses from routers as evidence that two
                                                                     addresses are not aliases. With additional vantage points,
3.5    Client location                                               more alias pairs can be disproven cheaply ­ using five van-
Clients create a demand for traffic. Although network                tage points, we were able to eliminate 99.5% of alias pairs
mapping projects focus on collecting an accurate picture             within an ISP, but looking at the entire Internet or using
of the core of the network, clients ultimately shape the net-        more vantage points may eliminate more. Both Merca-
work ­ more clients create more demand, inducing ISPs                tor and Rocketfuel observed that 10% of router addresses
to add capacity. Andersen et al. [5] showed that prefixes            were unresponsive to alias resolution ­ this further re-
that share BGP behavior mirror the underlying topology.              duces the number of pairs that must be tested to (0.9A)2 .
BGP updates are already collected and stored at several              We expect that the number of addresses considered can be
sites. The traceroutes collected as part of network map-             reduced further by, for example, removing all addresses
ping could also be used to determine where clients attach            with Web server hostnames. Finally, topology and aliases
in the topology.                                                     change together, so changes in topology may drive which
   Knowing how much address space lies in a location                 addresses are tested as aliases.
alone is insufficient for suggesting their demand on the                The second largest term, 3224N , or 25% of the pack-
network. For example, smaller prefixes appear to be more             ets, are spent in learning the properties of nodes. Because
densely populated with clients and servers [25]. As a                these properties are slowly changing, it may be sufficient
result, knowing where prefixes connect should be aug-                to verify that they are unchanged ­ for tbit, that a host still
mented with a estimate of the number of active hosts at-             does not support ECN or for nmap, that the Web server is
tached to the network ­ potentially from an unobtrusive,             still available.
but possibly biased, passive analysis at Web servers.                   Measuring link properties uses 19% of the packets but
                                                                     65% of the bytes. We assume that network link mea-
                                                                     surement tools are easily modified to measure individual
4     Efficient Measurement at Scale                                 links or trees leaving a source without repetition. Most of
Once we can approximate each of the properties in Ta-                the overhead of link measurement is the cost of running
ble 1, the next challenge will be to scale to the entire In-         pathload for each link in the network. It may be possi-
ternet. The amount of traffic generated by each vantage              ble to guide (and therefore shorten) pathload's adaptive
point limits how quickly or how often reverse-engineering            search for a sustainable rate based on past link utilization
may be run. Table 1 provides a starting point for consider-          and recently measured capacity.
ing this workload. It shows that with present techniques,               With these techniques ­ verifying past information in-
roughly 25 million packets are needed per vantage point.             stead of measuring anew, avoiding redundant measure-
Surprisingly, this number is already within reach: it is             ment, and guiding the analysis with measurements of
only 4 times the traffic sent by Skitter today. Our goal is to       other properties ­ we hope to improve the efficiency of
make reverse-engineering clearly practical and acceptable            reverse engineering to be comparable to that of Skitter.
by reducing this workload until it is comparable to that of
Skitter. There is much scope for doing so because existing
techniques were typically not designed to be efficient for
                                                                     5    Looking Ahead
Internet mapping.                                                    In this paper, we have argued that reverse-engineering the
   Table 1 shows that it is inexpensive to estimate those            Internet is within reach of the research community, and
properties that use passive analysis, database lookups, the          that a collective effort would achieve significantly more
results of other tools, or modest traffic to each address.           than the isolated efforts of individual researchers. We next
The properties to be concerned about are topology, IP                describe what such a collective effort would look like the
aliases, and link properties.                                        challenges it would face.
   Topology can be measured using fewer packets than the                We envision a "measurement blackboard" that super-
6.5 million used by Skitter, opening the door for a richer           vises the execution of measurements, archives and ag-
map to be measured with comparable traffic. Instead of               gregates their results, and exports raw data and sum-
running a basic traceroute at each vantage point to each of          mary views including network topologies. Unlike Route
500,000 destinations, if each vantage point ran its trace-           Views and Skitter, which have shown the value of continu-
route backwards until it merged with an already discov-              ously archived wide-area measurements, the measurement
ered part of the tree, it would take only 700,000 packets            blackboard includes such diverse information as the per-
(the number of traceroute destinations plus the number of            formance of links, the location of nodes, and routing poli-
addresses a source discovers). Our goal is to estimate the           cies, as well as raw measurements in the form of packet
remaining properties using the 5.8 million packets saved.            traces and traceroute output. Unlike the Internet Traffic

                                                                 5
Archive [19] and the proposed SIMR system [2] for in-                 [11] Computer Science and Telecommunications Board, Na-
dexing and archiving diverse Internet measurements, new                    tional Research Council. Looking Over the Fence at Net-
                                                                           works: A Neighbor's View of Networking Research. The
measurements will build on the fresh, intermediate results                 National Academies Press, 2001.
of others. To support such tightly coupled measurement,               [12] A. B. Downey. Using pathchar to estimate Internet link
the measurement blackboard will execute measurement                        characteristics. In ACM SIGCOMM, 1999.
                                                                      [13] Fyodor. NMAP: The network mapper. http://www.
and inference programs that researchers write to a com-                    insecure.org/nmap/.
mon API. Scriptroute [38] running on PlanetLab [33] pro-              [14] L. Gao. On inferring autonomous system relationships in
vides a suitable large-scale measurement platform. This                    the Internet. In IEEE Global Internet Symposium, 2000.
                                                                      [15] R. Govindan and H. Tangmunarunkit. Heuristics for Inter-
architecture of researchers contributing extensions to a                   net map discovery. In IEEE INFOCOM, 2000.
narrow core is similar to the way the network simulator               [16] J. Gray.      Distributed computing economics.         IEEE
ns [27] has facilitated improved experimentation and com-                  TFCC Newsletter, 2003. http://www.clustercomputing.
                                                                           org/content/tfcc-5-1-gray.html.
parison.                                                              [17] K. Harfoush, A. Bestavros, and J. Byers. Measuring bot-
   The main challenge we have explored in this paper is                    tleneck bandwidth of targeted path segments. In IEEE IN-
the design of techniques that can estimate network prop-                   FOCOM, 2003.
                                                                      [18] G. Iannaccone, et al. Analysis of link failures in an IP
erties efficiently at the scale of the Internet. Another chal-             backbone. In ACM SIGCOMM Internet Measurement
lenge is the construction of the measurement blackboard                    Workshop, 2002.
for exchanging and archiving measurements. Beyond this                [19] The Internet Traffic Archive. http://ita.ee.lbl.gov/.
                                                                      [20] M. Jain and C. Dovrolis. End-to-end available bandwidth:
work by researchers, there are challenges that arise from                  measurement methodology, dynamics, and relation with
others who are inadvertently involved. Site administrators                 TCP throughput. In ACM SIGCOMM, 2002.
using intrusion detection systems may mistake measure-                [21] K. Lai and M. Baker. Nettimer: A tool for measuring bot-
                                                                           tleneck link bandwidth. In USITS, 2001.
ment traffic for the prelude to an attack, even when traffic          [22] B. Mah. Estimating bandwidth and other network proper-
is sent responsibly and poses no real threat. The challenge                ties. In Internet Statistics and Metrics Analysis, 2000.
for researchers is to engineer a consensus as to what con-            [23] R. Mahajan, N. Spring, D. Wetherall, and T. Anderson.
                                                                           User-level Internet path diagnosis. In SOSP, 2003.
stitutes responsible measurement. Although ISPs stand                 [24] Z. M. Mao, J. Rexford, J. Wang, and R. Katz. Towards
to benefit from reverse-engineering efforts through im-                    an accurate AS-level traceroute tool. In ACM SIGCOMM,
proved diagnostic tools and operational knowledge, they                    2003.
                                                                      [25] S. McCreary and k. claffy. IP v4 address space uti-
continue to be wary about the competitive disadvantages                    lization. http://www.caida.org/outreach/resources/learn/
of publishing what is considered proprietary information.                  ipv4space/, 1998.
The challenge for researchers is to provide sufficient oper-          [26] D. Meyer. Routeviews. http://www.routeviews.org.
                                                                      [27] UCB/LBNL/VINT network simulator - ns, 2000.
ational benefit to discourage ISPs from blocking measure-             [28] J. Padhye and S. Floyd. Identifying the TCP behavior of
ments and encourage them to publish more data on their                     Web servers. In ACM SIGCOMM, 2001.
own.                                                                  [29] V. N. Padmanabhan, L. Qiu, and H. J. Wang. Passive net-
                                                                           work tomography using bayesian inference. In ACM SIG-
                                                                           COMM Internet Measurement Workshop, 2002.
References                                                            [30] V. N. Padmanabhan and L. Subramanian. An investigation
                                                                           of geographic mapping techniques for Internet hosts. In
 [1] R. Albert, H. Jeong, and A.-L. Barab´ si. Error and attack
                                           a                               ACM SIGCOMM, 2001.
     tolerance of complex networks. In Nature, vol. 406, pp.          [31] A. P´ sztor and D. Veitch. Active probing using packet
                                                                                 a
     378­382, 2000.                                                        quartets. In ACM SIGCOMM Internet Measurement Work-
 [2] M. Allman, E. Blanron, and W. M. Eddy. A scalable sys-                shop, 2002.
     tem for sharing Internet measurement. In Passive & Active        [32] V. Paxson and S. Floyd. Wide-area traffic: The failure of
     Measurement (PAM), 2002.                                              poisson modeling. IEEE/ACM Transactions on Network-
 [3] S. Alouf, P. Nain, and D. Towsley. Inferring network char-            ing, 3(3), 1995.
     acteristics via moment-based estimators. In IEEE INFO-           [33] L. Peterson, T. Anderson, D. Culler, and T. Roscoe. A
                                                                           blueprint for introducing disruptive technology into the In-
     COM, 2001.                                                            ternet. In HotNets-I, 2002.
 [4] K. G. Anagnostakis, M. B. Greenwald, and R. S. Ryger.            [34] R. S. Prasad, C. Dovrolis, and B. A. Mah. The effect
     cing: Measuring network-internal delays using only exist-             of layer-2 switches on pathchar-like tools. In ACM SIG-
     ing infrastructure. In IEEE INFOCOM, 2003.                            COMM Internet Measurement Workshop, 2002.
 [5] D. G. Andersen, N. Feamster, S. Bauer, and H. Balakrish-         [35] V. Ribeiro, et al. Multifractal cross-traffic estimation. In
     nan. Topology inference from BGP routing dynamics. In                 ITC Specialist Seminar on IP Traffic Measurement, Mod-
     ACM SIGCOMM Internet Measurement Workshop, 2002.                      eling and Management, 2000.
                                                                      [36] N. Spring, R. Mahajan, and T. Anderson. Quantifying the
 [6] J. Bellardo and S. Savage. Measuring packet reordering. In            causes of path inflation. In ACM SIGCOMM, 2003.
     ACM SIGCOMM Internet Measurement Workshop, 2002.                 [37] N. Spring, R. Mahajan, and D. Wetherall. Measuring ISP
 [7] C. Bormann, et al. Robust header compression. RFC                     topologies with Rocketfuel. In ACM SIGCOMM, 2002.
     3095, 2001.                                                      [38] N. Spring, D. Wetherall, and T. Anderson. Scriptroute: A
 [8] B. Chandra, M. Dahlin, L. Gao, and A. Nayate. End-to-                 public Internet measurement facility. In USITS, 2003.
     end WAN service availability. In USITS, 2001.                    [39] L. Subramanian, V. N. Padmanabhan, and R. H. Katz. Ge-
 [9] k. claffy, T. E. Monk, and D. McRobb. Internet tomogra-               ographic properties of Internet routing. In USENIX Annual
     phy. In Nature, 1999.                                                 Technical Conference, 2002.
                                                                      [40] Y. Zhang, M. Roughan, N. Duffield, and A. Greenberg.
[10] M. Coates, R. Castro, and R. Nowak. Maximum likelihood                Fast accurate computation of large-scale IP traffic matrices
     network topology identification from edge-based unicast               from link loads. In ACM SIGMETRICS, 2003.
     measurements. In ACM SIGMETRICS, 2002.


                                                                  6