Hierarchical DHTs

From P2P Wiki

Jump to: navigation, search

Introduction

Peer-to-peer (p2p) computing is attracting increasing attention, and many p2p systems have rapidly become basic platforms for users to share information and resources (that is, CPU cycles, memory, storage space and network bandwidth) over the Internet. In essence, a “pure” p2p system can be described as a decentralized network system in which all participant computers (also called peers) have symmetric duties and responsibilities. In simplistic terms, all participant nodes act as both clients and servers to one another, yielding a large pool of information sources and computing power. Key characteristics of p2p systems are decentralization, self-organization, dynamism, and fault-tolerance, all of which make p2p paradigm very attractive for information storage and retrieval.

However, as the system size increases, the weak peers can seriously compromise the scalability of the whole network, resulting in inefficient group communication. This is especially visible on bandwidth multicast services. To tackle this flaw, the traditional approach revolves around organizing the nodes into a multi-layer architecture. By partitioning the flat network into smaller regions, the system complexity reduces as well. To wit, a typical example is the Domain Name System (DNS), which achieves scalability through a hierarchical design.

For this purpose, recently, there have appeared in the literature many hierarchical P2P designs, specially for Distributed Hash Tables (DHTs) which strive to exploit (to their benefit) the potential advantages obtained of multi-layer designs when the most “reliable” peers are at the top layers, the peers are organized into clusters according to topological proximity, and when popular files are cached within the clusters. In this setting, one can classify existing hierarchical designs for DHTs into two main families: The homogenous design, in which all nodes act equal roles, against the superpeer design, in which a relatively small subset of the participating peers (the nodes with relatively long lifetime and large capacities), behave as proxies, interconnecting clusters of instable peers.

According to this classification, examples of homogenous designs include CORAL [1], Canon [2] and Cyclone [3], while HONET [4] and HIERAS [5] are part of the superpeer design category.

In what follows, we review some important works on hierarchical DHT systems.

Related Work

  1. M. J. Freedman and D. Mazières. Sloppy Hashing and Self-Organizing Clusters. Proceedings of the second International Workshop on Peer-to-Peer Systems (IPTPS'03), Berkeley, CA, February 2003. Coral is a p2p CDN that organizes computers into a hierarchy of clusters of increasing network diameter, where the diameter of a cluster is the maximum round-trip-time between any two computers it contains. The basic idea of Coral is to use three levels of clusters, with the diameters of level-2, -1, and-0 clusters being increasingly larger. This way any member of a cluster can locate content, without querying computers farther away than the cluster’s diameter. It uses Kademlia and it doesn’t perform more routing hops than a single level Kademlia, as a query always continues from the point at which it left off in the identifier space of the previous cluster. Thus, all queries at the beginning are fast, ensuring that nearby copies of data can be located without querying distant nodes.
  2. P. Ganesan, K. Gummadi, and H. Garcia-Molina. Canon in G Major: Designing DHTs with Hierarchical Structure. Proceedings of the 24th International Conference on Distributed Computing Systems (ICDCS 2004), Tokyo, Japan, March 2004. Canon is a generic technique to build a hierarchical DHT from the flat version of a DHT (e.g. Chord, Pastry ...) in such a way that the new design inherits the homogeneity of load and functionality offered by the original flat design, but with the advantages of possessing a hierarhical structure. A better fault isolation, more effective bandwidth utilization, and better adaptation to the physical Internet are the main arguments the author use to justify such a new approach of constructing hierarchical DHT systems.
  3. M. S. Artigas, P. G. López, J. P. Ahulló, and A. F. Skarmeta. Cyclone: A novel design schema for hierarchical dhts. Proceedings of the fifth IEEE International Conference on Peer-to-Peer Computing (P2P’05), Konstanz, Germany, September 2005. Cyclone is a generic technique to construct hierarchical DHTs from their flat DHT versions, in such a way that one can obtain the best of both worlds, without inheriting the disadvantages of either. To be specific, Cyclone produces hierarchical DHTs with all peers assuming equal roles, thereby maximizing load balancing, which is the main shortcoming of the superpeer design. To do it, it uses nodeId suffixes (instead of prefixes) to organize peers into a logical hierarchy of clusters, which brings to the hierarhical versions interesting features such as load balancing when the population in the clusters is skewed, effective replication and so on.
  4. R. Tian, Y. Xiong, Q. Zhang, B. Li, B. Y. Zhao, and X. Li. Hybrid overlay structure based on random walks. Proceedings of the fourth International Workshop on Peer-to-Peer Systems (IPTPS’05), Ithaca, NY, USA, February 2005. HONet proposes a locality-aware hybrid overlay by taking the advantages of both structured and unstructured overlays in a hierarchical structure, and using random connections between nodes across clusters, which serve as shortcuts to reduce communication consumption.
  5. Z. Xu, R. Min, and Y. Hu. HIERAS: A DHT based hierarchical p2p routing algorithm. Proceedings of the 2003 International Conference on Parallel Processing (ICPP'03), Kaohsiung, Taiwan, October 2003. HIERAS is a multi-layer DHT that organizes the participating peers into multiple overlay networks (rings) at the different layers. Each of these rings contains a subset of all system peers. The lower the layer of a ring, the smaller is the average link latency between two peers participating in that ring. In HIERAS, routing requests are first executed in the lowest layer rings which have smaller link latencies. Thus a lower total routing latency can be achieved.
  6. L. Garcés-Erice, E. Biersack, K. W. Ross, P. A. Felber, and G. Urvoy-Keller. Hierarchical P2P Systems. Proceedings of ACM/IFIP Conference on Parallel and Distributed Computing (Euro-Par), Klagenfurt, Austria, August 2003. This paper investigates the potential advantages a hierarchical system can offer when the most “reliable” peers are located at the top layers of the hierarchy, assuming that remaining peers are organized into clusters according to topological proximity and popular files are cached within the clusters.
  7. M. S. Artigas, P. G. López, and A. F. Skarmeta. A Comparative Study of Hierarchical DHT Systems. Proceedings of the 32nd IEEE Conference on Local Computer Networks (LCN 2007), Dublin, Ireland, October 2007. In this article, the authors propose a novel analytical cost framework aimed at evaluating hierarchical DHTs. Using this framework, we compare the two main families of hierarchical DHTs: The homogenous design, in which all nodes act equal roles, against the superpeer design, in which a small subset of peers (i.e., the most powerful and stable), behave as proxies, interconnecting clusters with highly dynamic membership. More specifically, their results demonstrate that no design is “universally” better, and to that effect, they suggest that their cost-based model can be very useful in identifying the advantages and disadvantages of a design against multiple alternatives for a specific problem.
  8. S. Zoels, Z. Despotovic, and W. Kellerer. Cost-Based Analysis of Hierarchical DHT Design. Proceedings of the sixth IEEE International Conference on Peer-to-Peer Computing (P2P’06), Cambridge, UK, September 2006. In this paper, the authors propose a cost-based model to analyze when a superpeer architecture is better than a flat architecture, specially emphasizing the existence of a natural trade-off between minimizing total system costs (routing and maintenance cost) and minimizing the costs for the highest loaded peer in the network.
  9. B. Yang and H. Garcia-Molina. Designing a Super-Peer Network. Proceedings of 19th IEEE International Conference on Data Engineering (ICDE), Bangalore, India, March 2003. A practical study that, besides performance trade-offs, examines the effect of redundancy and topology variations in superpeer design. It provides some practical rules of thumb, despite not contemplating the homogenous alternative.
Personal tools