RFC 3040

Network Working Group                                          I. Cooper
Request for Comments: 3040                                 Equinix, Inc.
Category: Informational                                         I. Melve
                                                                 UNINETT
                                                            G. Tomlinson
                                                          CacheFlow Inc.
                                                            January 2001


             Internet Web Replication and Caching Taxonomy

Status of this Memo

   This memo provides information for the Internet community. It does
   not specify an Internet standard of any kind. Distribution of this
   memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (2001). All Rights Reserved.

Abstract

   This memo specifies standard terminology and the taxonomy of web
   replication and caching infrastructure as deployed today. It
   introduces standard concepts, and protocols used today within this
   application domain. Currently deployed solutions employing these
   technologies are presented to establish a standard taxonomy. Known
   problems with caching proxies are covered in the document titled
   "Known HTTP Proxy/Caching Problems", and are not part of this
   document. This document presents open protocols and points to
   published material for each protocol.
39
Table of Contents

   1.      Introduction
   2.      Terminology
   2.1     Base Terms
   2.2     First order derivative terms
   2.3     Second order derivatives
   2.4     Topological terms
   2.5     Automatic use of proxies
   3.      Distributed System Relationships
   3.1     Replication Relationships
   3.1.1   Client to Replica
   3.1.2   Inter-Replica
   3.2     Proxy Relationships
   3.2.1   Client to Non-Interception Proxy
55
56
57
58Cooper, et al. Informational [Page 1]
59
60
61RFC 3040 Internet Web Replication & Caching Taxonomy January 2001
62
63
   3.2.2   Client to Surrogate to Origin Server
   3.2.3   Inter-Proxy
   3.2.3.1 (Caching) Proxy Meshes
   3.2.3.2 (Caching) Proxy Arrays
   3.2.4   Network Element to Caching Proxy
   4.      Replica Selection
   4.1     Navigation Hyperlinks
   4.2     Replica HTTP Redirection
   4.3     DNS Redirection
   5.      Inter-Replica Communication
   5.1     Batch Driven Replication
   5.2     Demand Driven Replication
   5.3     Synchronized Replication
   6.      User Agent to Proxy Configuration
   6.1     Manual Proxy Configuration
   6.2     Proxy Auto Configuration (PAC)
   6.3     Cache Array Routing Protocol (CARP) v1.0
   6.4     Web Proxy Auto-Discovery Protocol (WPAD)
   7.      Inter-Proxy Communication
   7.1     Loosely coupled Inter-Proxy Communication
   7.1.1   Internet Cache Protocol (ICP)
   7.1.2   Hyper Text Caching Protocol
   7.1.3   Cache Digest
   7.1.4   Cache Pre-filling
   7.2     Tightly Coupled Inter-Cache Communication
   7.2.1   Cache Array Routing Protocol (CARP) v1.0
   8.      Network Element Communication
   8.1     Web Cache Control Protocol (WCCP)
   8.2     Network Element Control Protocol (NECP)
   8.3     SOCKS
   9.      Security Considerations
   9.1     Authentication
   9.1.1   Man in the middle attacks
   9.1.2   Trusted third party
   9.1.3   Authentication based on IP number
   9.2     Privacy
   9.2.1   Trusted third party
   9.2.2   Logs and legal implications
   9.3     Service security
   9.3.1   Denial of service
   9.3.2   Replay attack
   9.3.3   Stupid configuration of proxies
   9.3.4   Copyrighted transient copies
   9.3.5   Application level access
   10.     Acknowledgements
           References
           Authors' Addresses
           Full Copyright Statement
112
1. Introduction
122
   Since its introduction in 1990, the World-Wide Web has evolved from a
   simple client-server model into a complex distributed architecture.
   This evolution has been driven largely by the scaling problems
   associated with exponential growth. Distinct paradigms and solutions
   have emerged to satisfy specific requirements. Two core
   infrastructure components being employed to meet the demands of this
   growth are replication and caching. In many cases, there is a need
   for web caches and replicated services to be able to coexist.
131
132 This memo specifies standard terminology and the taxonomy of web
133 replication and caching infrastructure deployed in the Internet
134 today. The principal goal of this document is to establish a common
135 understanding and reference point of this application domain.
136
137 It is also expected that this document will be used in the creation
138 of a standard architectural framework for efficient, reliable, and
139 predictable service in a web which includes both replicas and caches.
140
141 Some of the protocols which this memo examines are specified only by
142 company technical white papers or work in progress documents. Such
143 references are included to demonstrate the existence of such
144 protocols, their experimental deployment in the Internet today, or to
145 aid the reader in their understanding of this technology area.
146
   There are many protocols, both open and proprietary, employed in web
   replication and caching today. The open protocols include DNS [8],
   Cache Digests [21][10], CARP [14], HTTP [1], ICP [2], PAC [12],
   SOCKS [7], WPAD [13], and WCCP [18][19]. These protocols, and their
   use within the caching and replication environments, are discussed
   below.
153
2. Terminology
155
156 The following terminology provides definitions of common terms used
157 within the web replication and caching community. Base terms are
158 taken, where possible, from the HTTP/1.1 specification [1] and are
159 included here for reference. First- and second-order derivatives are
160 constructed from these base terms to help define the relationships
161 that exist within this area.
162
163 Terms that are in common usage and which are contrary to definitions
164 in RFC 2616 and this document are highlighted.
165
2.1 Base Terms
179
180 The majority of these terms are taken as-is from RFC 2616 [1], and
181 are included here for reference.
182
183 client (taken from [1])
184 A program that establishes connections for the purpose of sending
185 requests.
186
187 server (taken from [1])
188 An application program that accepts connections in order to
189 service requests by sending back responses. Any given program may
190 be capable of being both a client and a server; our use of these
191 terms refers only to the role being performed by the program for a
192 particular connection, rather than to the program's capabilities
193 in general. Likewise, any server may act as an origin server,
194 proxy, gateway, or tunnel, switching behavior based on the nature
195 of each request.
196
197 proxy (taken from [1])
198 An intermediary program which acts as both a server and a client
199 for the purpose of making requests on behalf of other clients.
200 Requests are serviced internally or by passing them on, with
201 possible translation, to other servers. A proxy MUST implement
202 both the client and server requirements of this specification. A
203 "transparent proxy" is a proxy that does not modify the request or
204 response beyond what is required for proxy authentication and
205 identification. A "non-transparent proxy" is a proxy that
206 modifies the request or response in order to provide some added
207 service to the user agent, such as group annotation services,
208 media type transformation, protocol reduction, or anonymity
209 filtering. Except where either transparent or non-transparent
210 behavior is explicitly stated, the HTTP proxy requirements apply
211 to both types of proxies.
212
   Note: The term "transparent proxy" refers to a semantically
   transparent proxy as described in [1], not what is commonly
   understood within the caching community. We recommend that the term
   "transparent proxy" always be prefixed to avoid confusion (e.g.,
   "network transparent proxy"). However, see the definition of
   "interception proxy" below.
219
220 The above condition requiring implementation of both the server and
221 client requirements of HTTP/1.1 is only appropriate for a non-network
222 transparent proxy.
223
235 cache (taken from [1])
236 A program's local store of response messages and the subsystem
237 that controls its message storage, retrieval, and deletion. A
238 cache stores cacheable responses in order to reduce the response
239 time and network bandwidth consumption on future, equivalent
240 requests. Any client or server may include a cache, though a
241 cache cannot be used by a server that is acting as a tunnel.
242
   Note: The term "cache", used alone, often means "caching proxy".
244
245 Note: There are additional motivations for caching, for example
246 reducing server load (as a further means to reduce response time).
247
   cacheable (taken from [1])
      A response is cacheable if a cache is allowed to store a copy of
      the response message for use in answering subsequent requests.
      The rules for determining the cacheability of HTTP responses are
      defined in section 13 of [1]. Even if a resource is cacheable,
      there may be additional constraints on whether a cache can use the
      cached copy for a particular request.
255
256 gateway (taken from [1])
257 A server which acts as an intermediary for some other server.
258 Unlike a proxy, a gateway receives requests as if it were the
259 origin server for the requested resource; the requesting client
260 may not be aware that it is communicating with a gateway.
261
262 tunnel (taken from [1])
263 An intermediary program which is acting as a blind relay between
264 two connections. Once active, a tunnel is not considered a party
265 to the HTTP communication, though the tunnel may have been
266 initiated by an HTTP request. The tunnel ceases to exist when
267 both ends of the relayed connections are closed.
268
269 replication
270 "Creating and maintaining a duplicate copy of a database or file
271 system on a different computer, typically a server." - Free
272 Online Dictionary of Computing (FOLDOC)
273
274 inbound/outbound (taken from [1])
275 Inbound and outbound refer to the request and response paths for
276 messages: "inbound" means "traveling toward the origin server",
277 and "outbound" means "traveling toward the user agent".
278
279 network element
280 A network device that introduces multiple paths between source and
281 destination, transparent to HTTP.
282
2.2 First order derivative terms
293
294 The following terms are constructed taking the above base terms as
295 foundation.
296
297 origin server (taken from [1])
298 The server on which a given resource resides or is to be created.
299
300 user agent (taken from [1])
301 The client which initiates a request. These are often browsers,
302 editors, spiders (web-traversing robots), or other end user tools.
303
304 caching proxy
305 A proxy with a cache, acting as a server to clients, and a client
306 to servers.
307
308 Caching proxies are often referred to as "proxy caches" or simply
309 "caches". The term "proxy" is also frequently misused when
310 referring to caching proxies.
311
312 surrogate
313 A gateway co-located with an origin server, or at a different
314 point in the network, delegated the authority to operate on behalf
315 of, and typically working in close co-operation with, one or more
316 origin servers. Responses are typically delivered from an
317 internal cache.
318
319 Surrogates may derive cache entries from the origin server or from
320 another of the origin server's delegates. In some cases a
321 surrogate may tunnel such requests.
322
323 Where close co-operation between origin servers and surrogates
324 exists, this enables modifications of some protocol requirements,
325 including the Cache-Control directives in [1]. Such modifications
326 have yet to be fully specified.
327
328 Devices commonly known as "reverse proxies" and "(origin) server
329 accelerators" are both more properly defined as surrogates.
330
331 reverse proxy
332 See "surrogate".
333
334 server accelerator
335 See "surrogate".
336
2.3 Second order derivatives
350
351 The following terms further build on first order derivatives:
352
353 master origin server
354 An origin server on which the definitive version of a resource
355 resides.
356
357 replica origin server
358 An origin server holding a replica of a resource, but which may
359 act as an authoritative reference for client requests.
360
361 content consumer
362 The user or system that initiates inbound requests, through use of
363 a user agent.
364
365 browser
366 A special instance of a user agent that acts as a content
367 presentation device for content consumers.
368
2.4 Topological terms
370
371 The following definitions are added to describe caching device
372 topology:
373
374 user agent cache
375 The cache within the user agent program.
376
377 local caching proxy
378 The caching proxy to which a user agent connects.
379
380 intermediate caching proxy
381 Seen from the content consumer's view, all caches participating in
382 the caching mesh that are not the user agent's local caching
383 proxy.
384
385 cache server
386 A server to requests made by local and intermediate caching
387 proxies, but which does not act as a proxy.
388
389 cache array
390 A cluster of caching proxies, acting logically as one service and
391 partitioning the resource name space across the array. Also known
392 as "diffused array" or "cache cluster".
393
   caching mesh
      A loosely coupled set of co-operating proxy- and (optionally)
      caching-servers, or clusters, acting independently but sharing
      cacheable content between themselves using inter-cache
      communication protocols.
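
   As an illustration of the "cache array" definition above, the
   following is a minimal sketch of partitioning the resource name space
   across an array. It assumes plain hash-modulo placement, chosen only
   for brevity; it is not the CARP algorithm listed in Section 6.3. The
   function name and host names are hypothetical.

```python
import hashlib
from typing import Sequence


def pick_array_member(url: str, members: Sequence[str]) -> str:
    """Map a URL to exactly one member of a cache array.

    Assumption: hash-modulo partitioning, used here for illustration.
    """
    if not members:
        raise ValueError("cache array has no members")
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    index = int.from_bytes(digest[:8], "big") % len(members)
    return members[index]


# Hypothetical array of three caching proxies.
ARRAY = ["cache-a.example.net", "cache-b.example.net", "cache-c.example.net"]
```

   Because every party computes the same mapping, a given URL is always
   directed to the same caching proxy, so the array behaves logically
   as one service. With hash-modulo placement, changing the size of the
   array remaps most URLs.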
411
2.5 Automatic use of proxies
413
414 Network administrators may wish to force or facilitate the use of
415 proxies by clients, enabling such configuration within the network
416 itself or within automatic systems in user agents, such that the
417 content consumer need not be aware of any such configuration issues.
418
419 The terms that describe such configurations are given below.
420
421 automatic user-agent proxy configuration
422 The technique of discovering the availability of one or more
423 proxies and the automated configuration of the user agent to use
424 them. The use of a proxy is transparent to the content consumer
425 but not to the user agent. The term "automatic proxy
426 configuration" is also used in this sense.
427
428 traffic interception
429 The process of using a network element to examine network traffic
430 to determine whether it should be redirected.
431
432 traffic redirection
433 Redirection of client requests from a network element performing
434 traffic interception to a proxy. Used to deploy (caching) proxies
435 without the need to manually reconfigure individual user agents,
436 or to force the use of a proxy where such use would not otherwise
437 occur.
438
439 interception proxy (a.k.a. "transparent proxy", "transparent cache")
440 The term "transparent proxy" has been used within the caching
441 community to describe proxies used with zero configuration within
442 the user agent. Such use is somewhat transparent to user agents.
443 Due to discrepancies with [1] (see definition of "proxy" above),
444 and objections to the use of the word "transparent", we introduce
445 the term "interception proxy" to describe proxies that receive
446 redirected traffic flows from network elements performing traffic
447 interception.
448
      Interception proxies receive inbound traffic flows through the
      process of traffic redirection. (Such proxies are deployed by
      network administrators to facilitate or require the use of
      appropriate services offered by the proxy.) Problems associated
      with the deployment of interception proxies are described in the
      document "Known HTTP Proxy/Caching Problems" [23]. The use of
      interception proxies requires zero configuration of the user
      agent, which acts as though communicating directly with an origin
      server.
466
3. Distributed System Relationships
468
469 This section identifies the relationships that exist in a distributed
470 replication and caching environment. Having defined these
471 relationships, later sections describe the communication protocols
472 used in each relationship.
473
3.1 Replication Relationships
475
476 The following sections describe relationships between clients and
477 replicas and between replicas themselves.
478
3.1.1 Client to Replica
480
481 A client may communicate with one or more replica origin servers, as
482 well as with master origin servers. (In the absence of replica
483 servers the client interacts directly with the origin server as is
484 the normal case.)
485
      ------------------     -----------------     ------------------
      | Replica Origin |     | Master Origin |     | Replica Origin |
      |     Server     |     |    Server     |     |     Server     |
      ------------------     -----------------     ------------------
               \                     |                     /
                \                    |                    /
                 -----------------------------------------
                                     |                  Client to
                             -----------------           Replica Server
                             |    Client     |
                             -----------------
497
498 Protocols used to enable the client to use one of the replicas can be
499 found in Section 4.
500
3.1.2 Inter-Replica
502
503 This is the relationship between master origin server(s) and replica
504 origin servers, to replicate data sets that are accessed by clients
505 in the relationship shown in Section 3.1.1.
506
      ------------------     -----------------     ------------------
      | Replica Origin |-----| Master Origin |-----| Replica Origin |
      |     Server     |     |    Server     |     |     Server     |
      ------------------     -----------------     ------------------
524
525 Protocols used in this relationship can be found in Section 5.
526
3.2 Proxy Relationships
528
529 There are a variety of ways in which (caching) proxies and cache
530 servers communicate with each other, and with user agents.
531
3.2.1 Client to Non-Interception Proxy
533
   A client may communicate with zero or more proxies for some or all
   requests. Where this communication results in no proxy being used,
   the relationship is between client and (replica) origin server (see
   Section 3.1.1).
538
      -----------------     -----------------     -----------------
      |     Local     |     |     Local     |     |     Local     |
      |     Proxy     |     |     Proxy     |     |     Proxy     |
      -----------------     -----------------     -----------------
              \                     |                     /
               \                    |                    /
                -----------------------------------------
                                    |
                            -----------------
                            |    Client     |
                            -----------------
550
551 In addition, a user agent may interact with an additional server -
552 operated on behalf of a proxy for the purpose of automatic user agent
553 proxy configuration.
554
555 Schemes and protocols used in these relationships can be found in
556 Section 6.
557
3.2.2 Client to Surrogate to Origin Server
559
560 A client may communicate with zero or more surrogates for requests
561 intended for one or more origin servers. Where a surrogate is not
562 used, the client communicates directly with an origin server. Where
563 a surrogate is used the client communicates as if with an origin
564 server. The surrogate fulfills the request from its internal cache,
565 or acts as a gateway or tunnel to the origin server.
566
      --------------    --------------    --------------
      |   Origin   |    |   Origin   |    |   Origin   |
      |   Server   |    |   Server   |    |   Server   |
      --------------    --------------    --------------
                    \         |          /
                     \        |         /
                      -----------------
                      |   Surrogate   |
                      |               |
                      -----------------
                              |
                              |
                         ------------
                         |  Client  |
                         ------------
592
3.2.3 Inter-Proxy
594
595 Inter-Proxy relationships exist as meshes (loosely coupled) and
596 clusters (tightly coupled).
597
3.2.3.1 (Caching) Proxy Meshes
599
600 Within a loosely coupled mesh of (caching) proxies, communication can
601 happen at the same level between peers, and with one or more parents.
602
                  ---------------------     ---------------------
       -----------|    Intermediate   |     |    Intermediate   |
       |          | Caching Proxy (D) |     | Caching Proxy (E) |
       |(peer)    ---------------------     ---------------------
 --------------             | (parent)        / (parent)
 |   Cache    |             |          ------/
 | Server (C) |             |         /
 --------------             |        /
  (peer) |            -----------------       ---------------------
         -------------| Local Caching |-------|    Intermediate   |
                      |   Proxy (A)   | (peer)| Caching Proxy (B) |
                      -----------------       ---------------------
                              |
                              |
                          ----------
                          | Client |
                          ----------

            Client included for illustration purposes only
622
634 An inbound request may be routed to one of a number of intermediate
635 (caching) proxies based on a determination of whether that parent is
636 better suited to resolving the request.
637
638 For example, in the above figure, Cache Server C and Intermediate
639 Caching Proxy B are peers of the Local Caching Proxy A, and may only
640 be used when the resource requested by A already exists on either B
641 or C. Intermediate Caching Proxies D & E are parents of A, and it is
642 A's choice of which to use to resolve a particular request.
643
644 The relationship between A & B only makes sense in a caching
645 environment, while the relationships between A & D and A & E are also
646 appropriate where D or E are non-caching proxies.
647
648 Protocols used in these relationships can be found in Section 7.1.
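
   The peer and parent rules described above can be sketched as
   follows. This is a minimal illustration, not a protocol
   implementation: it assumes that Local Caching Proxy A already knows
   which resources each peer holds (in practice this is learned through
   the protocols of Section 7.1), and that A simply prefers the first
   configured parent. The function and parameter names are
   hypothetical.

```python
from typing import Dict, Optional, Sequence, Set


def choose_next_hop(url: str,
                    peer_holdings: Dict[str, Set[str]],
                    parents: Sequence[str]) -> Optional[str]:
    """Return the neighbour that proxy A should ask for url, if any."""
    # A peer may only be used when the resource already exists on it.
    for peer in sorted(peer_holdings):
        if url in peer_holdings[peer]:
            return peer
    # Otherwise a parent resolves the request; which parent is A's choice.
    # Assumption: take the first configured parent.
    return parents[0] if parents else None
```

   With peers B and C and parents D and E as in the figure, a request
   for a resource held by B is sent to B, while a request for a
   resource held by neither peer goes to a parent.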
649
3.2.3.2 (Caching) Proxy Arrays
651
652 Where a user agent may have a relationship with a proxy, it is
653 possible that it may instead have a relationship with an array of
654 proxies arranged in a tightly coupled mesh.
655
656 ----------------------
657 ---------------------- |
658 --------------------- | |
659 | (Caching) Proxy | |-----
660 | Array |----- ^ ^
661 --------------------- ^ ^ | |
662 ^ ^ | |--- |
663 | |----- |
664 --------------------------
665
666 Protocols used in this relationship can be found in Section 7.2.
667
3.2.4 Network Element to Caching Proxy
669
670 A network element performing traffic interception may choose to
671 redirect requests from a client to a specific proxy within an array.
672 (It may also choose not to redirect the traffic, in which case the
673 relationship is between client and (replica) origin server, see
674 Section 3.1.1.)
675
      -----------------     -----------------     -----------------
      | Caching Proxy |     | Caching Proxy |     | Caching Proxy |
      |     Array     |     |     Array     |     |     Array     |
      -----------------     -----------------     -----------------
               \                    |                    /
                -----------------------------------------
                                    |
                             --------------
                             |  Network   |
                             |  Element   |
                             --------------
                                    |
                                   ///
                                    |
                              ------------
                              |  Client  |
                              ------------
708
   The interception proxy may be directly in line with the flow of
   traffic, in which case the intercepting network element and
   interception proxy form parts of the same hardware system, or may be
   out-of-path, requiring the intercepting network element to redirect
   traffic to another network segment. In this latter case,
   communication protocols enable the intercepting network element to
   stop and start redirecting traffic when the interception proxy
   becomes (un)available. Details of these protocols can be found in
   Section 8.
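
   The out-of-path case can be sketched as a small model of the
   redirect decision. This is an illustration under stated assumptions:
   only traffic to port 80 is examined, availability is a simple flag
   set by whatever control protocol is in use, and the class and method
   names are hypothetical rather than taken from any protocol in
   Section 8.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class InterceptingElement:
    """Toy model of a network element performing traffic interception."""

    # Maps proxy address -> availability, as last reported by the
    # element-to-proxy control protocol (not modelled here).
    proxies: Dict[str, bool] = field(default_factory=dict)

    def set_available(self, proxy: str, available: bool) -> None:
        """Record that a proxy has become available or unavailable."""
        self.proxies[proxy] = available

    def redirect_target(self, dst_port: int) -> Optional[str]:
        """Return the proxy to redirect a flow to, or None to leave the
        flow on its normal path towards the origin server."""
        if dst_port != 80:  # assumption: only port 80 traffic is examined
            return None
        for proxy in sorted(self.proxies):
            if self.proxies[proxy]:
                return proxy
        return None
```

   When the only proxy is marked unavailable, redirect_target() returns
   None and the client's traffic continues to the (replica) origin
   server as described in Section 3.1.1.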
717
4. Replica Selection
719
720 This section describes the schemes and protocols used in the
721 cooperation and communication between client and replica origin web
722 servers. The ideal situation is to discover an optimal replica
723 origin server for clients to communicate with. Optimality is a
724 policy based decision, often based upon proximity, but may be based
725 on other criteria such as load.
726
4.1 Navigation Hyperlinks
728
729 Best known reference:
730 This memo.
731
732 Description:
733 The simplest of client to replica communication mechanisms. This
734 utilizes hyperlink URIs embedded in web pages that point to the
735 individual replica origin servers. The content consumer manually
736 selects the link of the replica origin server they wish to use.
737
748 Security:
749 Relies on the protocol security associated with the appropriate
750 URI scheme.
751
752 Deployment:
753 Probably the most commonly deployed client to replica
754 communication mechanism. Ubiquitous interoperability with humans.
755
756 Submitter:
757 Document editors.
758
4.2 Replica HTTP Redirection
760
761 Best known reference:
762 This memo.
763
764 Description:
765 A simple and commonly used mechanism to connect clients with
766 replica origin servers is to use HTTP redirection. Clients are
767 redirected to an optimal replica origin server via the use of the
768 HTTP [1] protocol response codes, e.g., 302 "Found", or 307
769 "Temporary Redirect". A client establishes HTTP communication
770 with one of the replica origin servers. The initially contacted
771 replica origin server can then either choose to accept the service
772 or redirect the client again. Refer to section 10.3 in HTTP/1.1
773 [1] for information on HTTP response codes.
774
775 Security:
776 Relies entirely upon HTTP security.
777
778 Deployment:
779 Observed at a number of large web sites. Extent of usage in the
780 Internet is unknown.
781
782 Submitter:
783 Document editors.
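
   The redirection described in this section can be sketched with the
   Python standard library. The replica host name and the helper
   function are hypothetical, and the choice of replica is fixed here,
   whereas a real deployment would apply its own selection policy.

```python
from http import HTTPStatus
from http.server import BaseHTTPRequestHandler

# Hypothetical replica origin server chosen by some selection policy.
REPLICA_BASE = "http://replica1.example.com"


def build_redirect(path: str, replica_base: str = REPLICA_BASE,
                   temporary: bool = True):
    """Return (status, location) sending the client to a replica.

    307 "Temporary Redirect" is used by default; 302 "Found" otherwise.
    """
    status = HTTPStatus.TEMPORARY_REDIRECT if temporary else HTTPStatus.FOUND
    return status, replica_base.rstrip("/") + path


class RedirectToReplica(BaseHTTPRequestHandler):
    """Answers every GET with a redirect to the replica origin server."""

    def do_GET(self):
        status, location = build_redirect(self.path)
        self.send_response(status)
        self.send_header("Location", location)
        self.send_header("Content-Length", "0")
        self.end_headers()
```

   The handler could be served with http.server.HTTPServer. The replica
   that receives the redirected request may in turn accept it or
   redirect the client again.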
784
4.3 DNS Redirection
786
787 Best known references:
788
   * RFC 1794 DNS Support for Load Balancing [8]
790
791 * This memo
792
793 Description:
      The Domain Name System (DNS) provides a more sophisticated client
      to replica communication mechanism. This is accomplished by DNS
      servers that sort resolved IP addresses based upon quality of
      service policies. When a client resolves the name of an origin
      server, the enhanced DNS server sorts the available IP addresses
      of the replica origin servers, starting with the most optimal
      replica and ending with the least optimal replica.
810
811 Security:
812 Relies entirely upon DNS security, and other protocols that may be
813 used in determining the sort order.
814
815 Deployment:
816 Observed at a number of large web sites and large ISP web hosted
817 services. Extent of usage in the Internet is unknown, but is
818 believed to be increasing.
819
820 Submitter:
821 Document editors.
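
   The sorting step described in this section can be sketched as
   follows. The sketch assumes the enhanced DNS server already holds a
   per-address cost (for example a proximity or load figure, lower
   being better); how that cost is obtained is outside this example,
   and the function name is hypothetical.

```python
from typing import Dict, List


def sort_replicas(addresses: List[str], cost: Dict[str, float]) -> List[str]:
    """Order replica addresses from most to least optimal."""
    # Addresses with no known cost sort last. sorted() is stable, so
    # ties keep their original relative order.
    return sorted(addresses, key=lambda a: cost.get(a, float("inf")))
```

   A client that uses the first address in the returned list is thereby
   directed to the replica that the policy considers best.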
822
5. Inter-Replica Communication
824
   This section describes the cooperation and communication between
   master and replica origin servers, as used to replicate data sets
   between origin servers.
828
5.1 Batch Driven Replication
830
831 Best known reference:
832 This memo.
833
834 Description:
      The replica origin server to be updated initiates communication
      with a master origin server. The communication is established at
      intervals based upon queued transactions which are scheduled for
      deferred processing. Scheduling policies vary, but transfers
      generally recur at a specified time. Once communication is
      established, data sets are copied to the initiating replica origin
      server.
842
843 Security:
844 Relies upon the protocol being used to transfer the data set. FTP
845 [4] and RDIST are the most common protocols observed.
846
847 Deployment:
848 Very common for synchronization of mirror sites in the Internet.
849
850 Submitter:
851 Document editors.
852
5.2 Demand Driven Replication
863
864 Best known reference:
865 This memo.
866
867 Description:
868 Replica origin servers acquire content as needed due to client
869 demand. When a client requests a resource that is not in the data
870 set of the replica origin server/surrogate, an attempt is made to
871 resolve the request by acquiring the resource from the master
872 origin server, returning it to the requesting client.

   Security:
      Relies upon the protocol being used to transfer the resources. FTP
      [4], Gopher [5], HTTP [1] and ICP [2] are the most common
      protocols observed.

   Deployment:
      Observed at several large web sites. Extent of usage in the
      Internet is unknown.

   Submitter:
      Document editors.

5.3 Synchronized Replication

   Best known reference:
      This memo.

   Description:
      Replicated origin servers cooperate using synchronized strategies
      and specialized replica protocols to keep the replica data sets
      coherent. Synchronization strategies range from tightly coherent
      (a few minutes) to loosely coherent (a few hours or more). Updates
      occur between replicas based upon the synchronization time
      constraints of the coherency model employed and are generally in
      the form of deltas only.
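
      The idea of exchanging deltas only can be illustrated with a
      small Python sketch. It is not the mechanism of any particular
      replica protocol; the dictionary-based data sets and the function
      names compute_delta and apply_delta are assumptions made for
      illustration, and transport and key exchange are left out.

```python
from typing import Dict, Optional

DataSet = Dict[str, bytes]
Delta = Dict[str, Optional[bytes]]  # None marks a deletion


def compute_delta(master: DataSet, replica: DataSet) -> Delta:
    """Return only the entries that differ between master and replica."""
    delta: Delta = {}
    for name, content in master.items():
        if replica.get(name) != content:
            delta[name] = content      # new or changed resource
    for name in replica:
        if name not in master:
            delta[name] = None         # removed on the master
    return delta


def apply_delta(replica: DataSet, delta: Delta) -> None:
    """Bring the replica up to date by applying the delta in place."""
    for name, content in delta.items():
        if content is None:
            replica.pop(name, None)
        else:
            replica[name] = content


master = {"a.html": b"v2", "b.html": b"v1"}
replica = {"a.html": b"v1", "c.html": b"old"}
delta = compute_delta(master, replica)
apply_delta(replica, delta)
```

      After the delta is applied the replica data set equals that of
      the master, while unchanged resources were never transferred.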

   Security:
      All of the known protocols utilize strong cryptographic key
      exchange methods, which are either based upon the Kerberos shared
      secret model or the public/private key RSA model.

   Deployment:
      Observed at a few sites, primarily at university campuses.

   Submitter:
      Document editors.

   Note:
      The editors are aware of at least two open source protocols - AFS
      and CODA - as well as the proprietary NRS protocol from Novell.

6. User Agent to Proxy Configuration

   This section describes the configuration, cooperation and
   communication between user agents and proxies.

6.1 Manual Proxy Configuration

   Best known reference:
      This memo.

   Description:
      Each user must configure her user agent by supplying information
      pertaining to proxied protocols and local policies.

   Security:
      The potential for misconfiguration is high, since each user
      individually sets preferences.

   Deployment:
      Widely deployed, used in all current browsers. Most browsers also
      support additional options.

   Submitter:
      Document editors.

6.2 Proxy Auto Configuration (PAC)

   Best known reference:
      "Navigator Proxy Auto-Config File Format" [12]

   Description:
      A JavaScript script retrieved from a web server is executed for
      each URL accessed to determine the appropriate proxy (if any) to
      be used to access the resource. User agents must be configured to
      request this script upon startup. There is no bootstrap
      mechanism; manual configuration is necessary.

      Despite manual configuration, the process of proxy configuration
      is simplified by centralizing it within a script at a single
      location.
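
      The per-URL decision made by such a script can be modelled as
      below. Real PAC files are JavaScript; this Python version is only
      a sketch of the decision logic. The host names, the port and the
      internal domain suffix are hypothetical, and the returned strings
      follow the "DIRECT" / "PROXY host:port" convention used by PAC
      scripts.

```python
from urllib.parse import urlsplit

# Hypothetical names, used only for illustration.
INTERNAL_SUFFIX = ".example.com"
PROXY = "PROXY proxy.example.com:3128"


def find_proxy_for_url(url: str) -> str:
    """Python model of the per-URL decision a PAC script makes."""
    host = urlsplit(url).hostname or ""
    if "." not in host or host.endswith(INTERNAL_SUFFIX):
        # Unqualified and internal host names bypass the proxy.
        return "DIRECT"
    # Try the proxy first, then fall back to direct access.
    return PROXY + "; DIRECT"


internal = find_proxy_for_url("http://host.example.com/a")
external = find_proxy_for_url("http://www.example.org/")
```

      Because the policy lives in one function at one location, an
      administrator can change it without touching each user agent.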

   Security:
      A common policy per organization is possible, but it still requires
      initial manual configuration. PAC is better than "manual proxy
      configuration" since PAC administrators may update the proxy
      configuration without further user intervention.

      Interoperability of PAC files is not high, since different
      browsers have slightly different interpretations of the same
      script, possibly leading to undesired effects.

   Deployment:
      Implemented in Netscape Navigator and Microsoft Internet Explorer.

   Submitter:
      Document editors.

6.3 Cache Array Routing Protocol (CARP) v1.0

   Best known references:

      * "Cache Array Routing Protocol" [14] (work in progress)

      * "Cache Array Routing Protocol (CARP) v1.0 Specifications" [15]

      * "Cache Array Routing Protocol and Microsoft Proxy Server 2.0"
        [16]

   Description:
      User agents may use CARP directly as a hash function based proxy
      selection mechanism. They need to be configured with the location
      of the cluster information.

   Security:
      Security considerations are not covered in the specification works
      in progress.

   Deployment:
      Implemented in Microsoft Proxy Server and Squid. Implemented in
      user agents via PAC scripts.

   Submitter:
      Document editors.

6.4 Web Proxy Auto-Discovery Protocol (WPAD)

   Best known reference:
      "The Web Proxy Auto-Discovery Protocol" [13] (work in progress)

   Description:
      WPAD uses a collection of pre-existing Internet resource discovery
      mechanisms to perform web proxy auto-discovery.

      The only goal of WPAD is to locate the PAC URL [12]. WPAD does
      not specify which proxies will be used. WPAD supplies the PAC
      URL, and the PAC script then operates as defined above to choose
      proxies per resource request.

      The WPAD protocol specifies the following:

      * how to use each mechanism for the specific purpose of web proxy
        auto-discovery

      * the order in which the mechanisms should be performed

      * the minimal set of mechanisms which must be attempted by a WPAD
        compliant user agent

      The resource discovery mechanisms utilized by WPAD are as follows:

      * Dynamic Host Configuration Protocol (DHCP)

      * Service Location Protocol (SLP)

      * "Well Known Aliases" using DNS A records

      * DNS SRV records

      * "service: URLs" in DNS TXT records
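
      The notion of trying discovery mechanisms in a fixed order until
      one yields a PAC URL can be sketched as follows. This is an
      illustration only: the probes are stubs rather than real DHCP,
      SLP or DNS queries, the order of the list above is assumed to be
      the probe order, and the URL returned by the stub is
      hypothetical.

```python
from typing import Callable, List, Optional, Tuple

Probe = Callable[[], Optional[str]]


def discover_pac_url(
    probes: List[Tuple[str, Probe]]
) -> Optional[Tuple[str, str]]:
    """Try each mechanism in order; return (mechanism, PAC URL)."""
    for name, probe in probes:
        url = probe()
        if url:
            # The first mechanism that yields a URL wins.
            return name, url
    return None


# Stub probes: real ones would query DHCP, SLP or DNS.
probes = [
    ("DHCP", lambda: None),
    ("SLP", lambda: None),
    ("DNS A (well known alias)", lambda: "http://wpad.example.com/wpad.dat"),
    ("DNS SRV", lambda: None),
    ("DNS TXT", lambda: None),
]
result = discover_pac_url(probes)
```

      The discovery step ends once a PAC URL is known; choosing a proxy
      for each request remains the job of the PAC script.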

   Security:
      Relies upon DNS and HTTP security.

   Deployment:
      Implemented in some user agents and caching proxy servers. More
      than two independent implementations.

   Submitter:
      Josh Cohen

7. Inter-Proxy Communication

7.1 Loosely coupled Inter-Proxy Communication

   This section describes the cooperation and communication between
   caching proxies.

7.1.1 Internet Cache Protocol (ICP)

   Best known reference:
      RFC 2186 Internet Cache Protocol (ICP), version 2 [2]

   Description:
      ICP is used by proxies to query other (caching) proxies about web
      resources, to see if the requested resource is present on the
      other system.

      ICP uses UDP. Since UDP is an uncorrected network transport
      protocol, an estimate of network congestion and availability may
      be calculated from ICP packet loss. This rudimentary loss
      measurement provides, together with round trip times, a load
      balancing method for caches.
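
      A proxy could combine loss and round trip time roughly as in the
      sketch below. The sketch does not implement ICP itself; the
      PeerStats class, the pick_peer function and the 50% loss
      threshold are assumptions chosen for illustration, and actual
      implementations may weigh these measurements differently.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class PeerStats:
    sent: int = 0
    answered: int = 0
    rtt_total: float = 0.0

    def record(self, rtt: Optional[float]) -> None:
        """Record one query; rtt is None when no reply arrived."""
        self.sent += 1
        if rtt is not None:
            self.answered += 1
            self.rtt_total += rtt

    @property
    def loss(self) -> float:
        if self.sent == 0:
            return 1.0
        return 1.0 - self.answered / self.sent

    @property
    def mean_rtt(self) -> float:
        if self.answered == 0:
            return float("inf")
        return self.rtt_total / self.answered


def pick_peer(peers: Dict[str, PeerStats],
              max_loss: float = 0.5) -> Optional[str]:
    """Prefer the lowest mean RTT among peers with acceptable loss."""
    usable = {n: s for n, s in peers.items() if s.loss <= max_loss}
    if not usable:
        return None
    return min(usable, key=lambda n: usable[n].mean_rtt)


peers = {"near": PeerStats(), "far": PeerStats(), "flaky": PeerStats()}
for rtt in (0.010, 0.012):
    peers["near"].record(rtt)
for rtt in (0.080, 0.090):
    peers["far"].record(rtt)
for rtt in (0.001, None, None, None):
    peers["flaky"].record(rtt)
best = pick_peer(peers)
```

      The "flaky" peer answers quickly when it answers at all, but its
      loss rate excludes it, so the choice falls on the peer with the
      lowest round trip time among the remaining ones.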

   Security:
      See RFC 2187 [3].

      ICP does not convey information about HTTP headers associated with
      resources. HTTP headers may include access control and cache
      directives. Since proxies ask for the availability of resources,
      and subsequently retrieve them using HTTP, false cache hits may
      occur (object present in cache, but not accessible to a sibling is
      one example).

      ICP suffers from all the security problems of UDP.

   Deployment:
      Widely deployed. Most current caching proxy implementations
      support ICP in some form.

   Submitter:
      Document editors.

   See also:
      "Internet Cache Protocol Extension" [17] (work in progress)

7.1.2 Hyper Text Caching Protocol

   Best known reference:
      RFC 2756 Hyper Text Caching Protocol (HTCP/0.0) [9]

   Description:
      HTCP is a protocol for discovering HTTP caching proxies and cached
      data, managing sets of HTTP caching proxies, and monitoring cache
      activity.

      HTCP requests include HTTP header material, while ICPv2 does not,
      enabling HTCP replies to more accurately describe the behaviour
      that would occur as a result of a subsequent HTTP request for the
      same resource.

   Security:
      Optionally uses HMAC-MD5 [11] shared secret authentication.
      Protocol is subject to attack if authentication is not used.
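
      Shared secret authentication with HMAC-MD5 works along the lines
      of the sketch below, which uses the hmac and hashlib modules of
      the Python standard library. The message is a placeholder and
      does not follow the HTCP wire format; the key and the function
      names sign and verify are assumptions made for illustration.

```python
import hashlib
import hmac


def sign(secret: bytes, message: bytes) -> bytes:
    """Return the HMAC-MD5 tag of message under the shared secret."""
    return hmac.new(secret, message, hashlib.md5).digest()


def verify(secret: bytes, message: bytes, tag: bytes) -> bool:
    """Constant-time check that tag matches message."""
    return hmac.compare_digest(sign(secret, message), tag)


secret = b"shared-secret"               # hypothetical key known to both peers
message = b"example inter-cache query"  # placeholder, not real HTCP data
tag = sign(secret, message)
accepted = verify(secret, message, tag)
tampered = verify(secret, message + b"x", tag)
```

      A receiver that knows the secret accepts the unmodified message
      and rejects the altered one; without such a check any host could
      inject messages.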

   Deployment:
      HTCP is implemented in Squid and the "Web Gateway Interceptor".

   Submitter:
      Document editors.

7.1.3 Cache Digest

   Best known references:

      * "Cache Digest Specification - version 5" [21]

      * "Summary Cache: A Scalable Wide-Area Web Cache Sharing
        Protocol" [10] (see note)

   Description:
      Cache Digests are a response to the problems of latency and
      congestion associated with previous inter-cache communication
      mechanisms such as the Internet Cache Protocol (ICP) [2] and the
      Hyper Text Caching Protocol (HTCP) [9]. Unlike these protocols,
      Cache Digests support peering between caching proxies and cache
      servers without a request-response exchange taking place for each
      inbound request. Instead, a summary of the contents in cache (the
      Digest) is fetched by other systems that peer with it. Using
      Cache Digests it is possible to determine with a relatively high
      degree of accuracy whether a given resource is cached by a
      particular system.

      Cache Digests are both an exchange protocol and a data format.
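
      Summaries of this kind are commonly built as Bloom filters, which
      explains the "relatively high degree of accuracy": a lookup may
      report a false hit but never a false miss for an added URL. The
      Python sketch below is a simplified Bloom filter and not the data
      format of [21]; the sizes, the use of MD5 to derive bit positions
      and the class name Digest are assumptions made for illustration.

```python
import hashlib


class Digest:
    """Simplified Bloom-filter summary of the URLs held by a cache."""

    def __init__(self, bits: int = 1024, hashes: int = 4) -> None:
        # bits must be a multiple of 8; hashes may be at most 4 here
        # because one 16-byte MD5 value is split into 4-byte chunks.
        self.bits = bits
        self.hashes = hashes
        self.bitmap = bytearray(bits // 8)

    def _positions(self, url: str):
        md5 = hashlib.md5(url.encode("utf-8")).digest()
        for i in range(self.hashes):
            chunk = md5[4 * i:4 * i + 4]
            yield int.from_bytes(chunk, "big") % self.bits

    def add(self, url: str) -> None:
        for pos in self._positions(url):
            self.bitmap[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, url: str) -> bool:
        """False means definitely absent; True may be a false hit."""
        return all(self.bitmap[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(url))


digest = Digest()
digest.add("http://www.example.org/a.html")
hit = digest.might_contain("http://www.example.org/a.html")
```

      A peer that has fetched the bitmap can answer such lookups
      locally, without a per-request exchange with the cache that
      published it.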

   Security:
      If the contents of a Digest are sensitive, they should be
      protected. Any methods which would normally be applied to secure
      an HTTP connection can be applied to Cache Digests.

      A 'Trojan horse' attack is currently possible in a mesh: System A
      can build a fake peer Digest for system B and serve it to B's
      peers if requested. This way A can direct traffic toward/from B.
      The impact of this problem is minimized by the 'pull' model of
      transferring Cache Digests from one system to another.

      Cache Digests provide knowledge about peer cache content on a URL
      level. Hence, they do not dictate a particular level of policy
      management and can be used to implement various policies on any
      level (user, organization, etc.).

   Deployment:
      Cache Digests are supported in Squid.

      Cache Meshes: NLANR Mesh; TF-CACHE Mesh (European Academic
      networks)

   Submitter:
      Alex Rousskov for [21], Pei Cao for [10].

   Note: The technology of Summary Cache [10] is patent pending by the
      University of Wisconsin-Madison.

7.1.4 Cache Pre-filling

   Best known reference:
      "Pre-filling a cache - A satellite overview" [20] (work in
      progress)

   Description:
      Cache pre-filling is a push-caching implementation. It is
      particularly well adapted to IP-multicast networks because it
      allows preselected resources to be simultaneously inserted into
      caches within the targeted multicast group. Different
      implementations of cache pre-filling already exist, especially in
      satellite contexts. However, there is still no standard for this
      kind of push-caching and vendors propose solutions either based on
      dedicated equipment or public domain caches extended with a
      pre-filling module.

   Security:
      Relies on the inter-cache protocols being employed.

   Deployment:
      Observed in two commercial content distribution service providers.

   Submitter:
      Ivan Lovric

7.2 Tightly Coupled Inter-Cache Communication

7.2.1 Cache Array Routing Protocol (CARP) v1.0

   Also see Section 6.3

   Best known references:

      * "Cache Array Routing Protocol" [14] (work in progress)

      * "Cache Array Routing Protocol (CARP) v1.0 Specifications" [15]

      * "Cache Array Routing Protocol and Microsoft Proxy Server 2.0"
        [16]

   Description:
      CARP is a hashing function for dividing URL-space among a cluster
      of proxies. Included in CARP is the definition of a Proxy Array
      Membership Table, and ways to download this information.

      A user agent which implements CARP v1.0 can allocate and
      intelligently route requests for the URLs to any member of the
      Proxy Array. Due to the resulting sorting of requests through
      these proxies, duplication of cache contents is eliminated and
      global cache hit rates may be improved.
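
      Dividing URL-space by hashing can be illustrated with a generic
      highest-score selection, sketched below in Python. This is not
      the hash function defined by CARP v1.0: MD5 is used as a
      stand-in, member weights are ignored, and the proxy names are
      hypothetical.

```python
import hashlib
from typing import List


def _score(proxy: str, url: str) -> int:
    # Combine member name and URL into one value. MD5 is a stand-in
    # and is not the hash function defined by CARP v1.0.
    data = (proxy + "|" + url).encode("utf-8")
    return int.from_bytes(hashlib.md5(data).digest(), "big")


def select_proxy(members: List[str], url: str) -> str:
    """Pick the array member with the highest score for this URL."""
    return max(members, key=lambda proxy: _score(proxy, url))


array = ["proxy1.example.com", "proxy2.example.com", "proxy3.example.com"]
urls = ["http://www.example.org/page%d" % i for i in range(100)]
before = {u: select_proxy(array, u) for u in urls}
after = {u: select_proxy(array[:2], u) for u in urls}
```

      Every URL maps to exactly one member, so no two members cache the
      same resource. With this scheme, removing a member only reassigns
      the URLs that member was responsible for; all other URLs keep
      their proxy.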

   Security:
      Security considerations are not covered in the specification works
      in progress.

   Deployment:
      Implemented in caching proxy servers. More than two independent
      implementations.

   Submitter:
      Document editors.

8. Network Element Communication

   This section describes the cooperation and communication between
   proxies and network elements. Examples of such network elements
   include routers and switches. This communication is generally used
   for deploying interception proxies and/or diffused arrays.

8.1 Web Cache Control Protocol (WCCP)

   Best known references:
      "Web Cache Control Protocol" [18][19] (work in progress)

   Note: The name used for this protocol varies, sometimes referred
      to as the "Web Cache Coordination Protocol", but frequently just
      "WCCP" to avoid confusion.

   Description:
      WCCP V1 runs between a router functioning as a redirecting network
      element and out-of-path interception proxies. The protocol allows
      one or more proxies to register with a single router to receive
      redirected traffic. It also allows one of the proxies, the
      designated proxy, to dictate to the router how redirected traffic
      is distributed across the array.

      WCCP V2 additionally runs between multiple routers and the
      proxies.

   Security:
      WCCP V1 has no security features.
      WCCP V2 provides optional authentication of protocol packets.

   Deployment:
      Network elements: WCCP is deployed on a wide range of Cisco
      routers.
      Caching proxies: WCCP is deployed on a number of vendors' caching
      proxies.

   Submitter:
      David Forster
      Document editors.

8.2 Network Element Control Protocol (NECP)

   Best known reference:
      "NECP: The Network Element Control Protocol" [22] (work in
      progress)

   Description:
      NECP provides methods for network elements to learn about server
      capabilities, availability, and hints as to which flows can and
      cannot be serviced. This allows network elements to perform load
      balancing across a farm of servers, redirection to interception
      proxies, and cut-through of flows that cannot be served by the
      farm.

   Security:
      Optionally uses HMAC-SHA-1 [11] shared secret authentication along
      with complex sequence numbers to provide moderately strong
      security. Protocol is subject to attack if authentication is not
      used.

   Deployment:
      Unknown at present; several network element and caching proxy
      vendors have expressed intent to implement the protocol.

   Submitter:
      Gary Tomlinson

8.3 SOCKS

   Best known reference:
      RFC 1928 SOCKS Protocol Version 5 [7]

   Description:
      SOCKS is primarily used as a caching proxy to firewall protocol.
      Although firewalls don't conform to the narrowly defined network
      element definition above, they are an integral part of the network
      infrastructure. When used in conjunction with a firewall, SOCKS
      provides an authenticated tunnel between the caching proxy and the
      firewall.

   Security:
      An extensive framework provides for multiple authentication
      methods. Currently, SSL, CHAP, DES, 3DES are known to be
      available.

   Deployment:
      SOCKS is widely deployed in the Internet.

   Submitter:
      Document editors.

9. Security Considerations

   This document provides a taxonomy for web caching and replication.
   Recommended practice, architecture and protocols are not described in
   detail.

   By definition, replication and caching involve the copying of
   resources. There are legal implications of making and keeping
   transient or permanent copies; these are not covered here.

   Information on security of each protocol referred to by this memo is
   provided in the preceding sections, and in their accompanying
   documentation. HTTP security is discussed in section 15 of RFC 2616
   [1], the HTTP/1.1 specification, and to a lesser extent in RFC 1945
   [6], the HTTP/1.0 specification. RFC 2616 contains security
   considerations for HTTP proxies.

   Caching proxies have the same security issues as other application
   level proxies. Application level proxies are not covered in these
   security considerations. IP number based authentication is
   problematic when a proxy is involved in the communications. Details
   are not discussed here.

9.1 Authentication

   Requests for web resources, and responses to such requests, may be
   directed to replicas and/or may flow through intermediate proxies.
   The integrity of communication needs to be preserved to ensure
   protection from both loss of access and from unintended change.

9.1.1 Man in the middle attacks

   HTTP proxies are men-in-the-middle, the perfect place for a
   man-in-the-middle attack. A discussion of this is found in section
   15 of RFC 2616 [1].

9.1.2 Trusted third party

   A proxy must either be trusted to act on behalf of the origin server
   and/or client, or it must act as a tunnel. When presenting cached
   objects to clients, the clients need to trust the caching proxy to
   act on behalf of the origin server.

   A replica may get accreditation from the origin server.

9.1.3 Authentication based on IP number

   Authentication based on the client's IP number is problematic when
   connecting through a proxy, since the authenticating device only has
   access to the proxy's IP number. One (problematic) solution to this
   is for the proxy to spoof the client's IP number for inbound
   requests.

   Authentication based on IP number assumes that the end-to-end
   properties of the Internet are preserved. This is typically not the
   case for environments containing interception proxies.

9.2 Privacy

9.2.1 Trusted third party

   When using a replication service, one must trust both the replica
   origin server and the replica selection system.

   Redirection of traffic - either by automated replica selection
   methods, or within proxies - may introduce third parties the end user
   and/or origin server must trust. In the case of interception
   proxies, such third parties are often unknown to both end points of
   the communication. Unknown third parties may have security
   implications.

   Both proxies and replica selection services may have access to
   aggregated access information. A proxy typically knows about
   accesses by each client using it, information that is more sensitive
   than the information held by a single origin server.

9.2.2 Logs and legal implications

   Logs from proxies should be kept secure, since they provide
   information about users and their patterns of behaviour. A proxy's
   log is even more sensitive than a web server log, as every request
   from the user population goes through the proxy. Logs from replica
   origin servers may need to be amalgamated to get aggregated
   statistics from a service, and transporting logs across borders may
   have legal implications. Log handling is restricted by law in some
   countries.

   Requirements for object security and privacy are the same in a web
   replication and caching system as they are in the Internet at large.
   The only reliable solution is strong cryptography. End-to-end
   encryption frequently makes resources uncacheable, as in the case of
   SSL encrypted web sessions.

9.3 Service security

9.3.1 Denial of service

   Any redirection of traffic is susceptible to denial of service
   attacks at the redirect point, and both proxies and replica selection
   services may redirect traffic.

   By attacking a proxy, access to all servers may be denied for a large
   set of clients.

   It has been argued that introduction of an interception proxy is a
   denial of service attack, since the end-to-end nature of the Internet
   is destroyed without the content consumer's knowledge.

9.3.2 Replay attack

   A caching proxy is by definition a replay attack.

9.3.3 Stupid configuration of proxies

   It is quite easy to have a stupid configuration which will harm
   service for content consumers. This is the most common security
   problem with proxies.

9.3.4 Copyrighted transient copies

   The legislative forces of the world are considering the question of
   whether transient copies, like those kept in replication and caching
   systems, are legal. The legal implications of replication and
   caching are subject to local law.

   Caching proxies need to preserve the protocol output, including
   headers. Replication services need to preserve the source of the
   objects.

9.3.5 Application level access

   Caching proxies are application level components in the traffic flow
   path, and may give intruders access to information that was
   previously only available at the network level in a proxy-free world.
   Some network level equipment may have required physical access to get
   sensitive information. Introduction of application level components
   may require additional system security.

10. Acknowledgements

   The editors would like to thank the following for their assistance:
   David Forster, Alex Rousskov, Josh Cohen, John Martin, John Dilley,
   Ivan Lovric, Joe Touch, Henrik Nordstrom, Patrick McManus, Duane
   Wessels, Wojtek Sylwestrzak, Ted Hardie, Misha Rabinovich, Larry
   Masinter, Keith Moore, Roy Fielding, Patrik Faltstrom, Hilarie Orman,
   Mark Nottingham and Oskar Batuner.

References

   [1]  Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Masinter, L.,
        Leach, P. and T. Berners-Lee, "Hypertext Transfer Protocol --
        HTTP/1.1", RFC 2616, June 1999.

   [2]  Wessels, D. and K. Claffy, "Internet Cache Protocol (ICP),
        Version 2", RFC 2186, September 1997.

   [3]  Wessels, D. and K. Claffy, "Application of Internet Cache
        Protocol (ICP), Version 2", RFC 2187, September 1997.

   [4]  Postel, J. and J. Reynolds, "File Transfer Protocol (FTP)", STD
        9, RFC 959, October 1985.

   [5]  Anklesaria, F., McCahill, M., Lindner, P., Johnson, D., Torrey,
        D. and B. Alberti, "The Internet Gopher Protocol", RFC 1436,
        March 1993.

   [6]  Berners-Lee, T., Fielding, R. and H. Frystyk, "Hypertext
        Transfer Protocol -- HTTP/1.0", RFC 1945, May 1996.

   [7]  Leech, M., Ganis, M., Lee, Y., Kuris, R., Koblas, D. and L.
        Jones, "SOCKS Protocol Version 5", RFC 1928, March 1996.

   [8]  Brisco, T., "DNS Support for Load Balancing", RFC 1794, April
        1995.

   [9]  Vixie, P. and D. Wessels, "Hyper Text Caching Protocol
        (HTCP/0.0)", RFC 2756, January 2000.

   [10] Fan, L., Cao, P., Almeida, J. and A. Broder, "Summary Cache: A
        Scalable Wide-Area Web Cache Sharing Protocol", Proceedings of
        ACM SIGCOMM'98 pp. 254-265, September 1998.

   [11] Krawczyk, H., Bellare, M. and R. Canetti, "HMAC: Keyed-Hashing
        for Message Authentication", RFC 2104, February 1997.

   [12] Netscape, Inc., "Navigator Proxy Auto-Config File Format",
        March 1996,
        <URL:http://www.netscape.com/eng/mozilla/2.0/relnotes/demo/proxy-live.html>.

   [13] Gauthier, P., Cohen, J., Dunsmuir, M. and C. Perkins, "The Web
        Proxy Auto-Discovery Protocol", Work in Progress.

   [14] Valloppillil, V. and K. Ross, "Cache Array Routing Protocol",
        Work in Progress.

   [15] Microsoft Corporation, "Cache Array Routing Protocol (CARP)
        v1.0 Specifications, Technical Whitepaper", August 1999,
        <URL:http://www.microsoft.com/Proxy/Guide/carpspec.asp>.

   [16] Microsoft Corporation, "Cache Array Routing Protocol and
        Microsoft Proxy Server 2.0, Technical White Paper", August
        1998,
        <URL:http://www.microsoft.com/proxy/documents/CarpWP.exe>.

   [17] Lovric, I., "Internet Cache Protocol Extension", Work in
        Progress.

   [18] Cieslak, M. and D. Forster, "Cisco Web Cache Coordination
        Protocol V1.0", Work in Progress.

   [19] Cieslak, M., Forster, D., Tiwana, G. and R. Wilson, "Cisco Web
        Cache Coordination Protocol V2.0", Work in Progress.

   [20] Goutard, C., Lovric, I. and E. Maschio-Esposito, "Pre-filling a
        cache - A satellite overview", Work in Progress.

   [21] Hamilton, M., Rousskov, A. and D. Wessels, "Cache Digest
        specification - version 5", December 1998,
        <URL:http://www.squid-cache.org/CacheDigest/cache-digest-v5.txt>.

   [22] Cerpa, A., Elson, J., Beheshti, H., Chankhunthod, A., Danzig,
        P., Jalan, R., Neerdaels, C., Shroeder, T. and G. Tomlinson,
        "NECP: The Network Element Control Protocol", Work in Progress.

   [23] Cooper, I. and J. Dilley, "Known HTTP Proxy/Caching Problems",
        Work in Progress.

Authors' Addresses

   Ian Cooper
   Equinix, Inc.
   2450 Bayshore Parkway
   Mountain View, CA 94043
   USA

   Phone: +1 650 316 6065
   EMail: icooper@equinix.com


   Ingrid Melve
   UNINETT
   Tempeveien 22
   Trondheim N-7465
   Norway

   Phone: +47 73 55 79 07
   EMail: Ingrid.Melve@uninett.no


   Gary Tomlinson
   CacheFlow Inc.
   12034 134th Ct. NE, Suite 201
   Redmond, WA 98052
   USA

   Phone: +1 425 820 3009
   EMail: gary.tomlinson@cacheflow.com

Full Copyright Statement

   Copyright (C) The Internet Society (2001). All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph are
   included on all such copies and derivative works. However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Acknowledgement

   Funding for the RFC Editor function is currently provided by the
   Internet Society.