Distributed hash table
From P2P Wiki
Distributed hash tables (DHTs) are a class of decentralized distributed systems that provide a lookup service similar to a hash table: (name, value) pairs are stored in the DHT.
Contents |
Introduction (Hash Tables)
A hash table associates data with keys so that the hash of each key returns the position of the bucket associated with this key. It is necessary to guarantee that each bucket holds a number of items close or equal to #items/#buckets, that is, the data are stored homogeneously along the table.
In a similar way, in a distributed hash table(DHT) nodes act like the buckets in the classic hash tables, so the hash of the key is used to obtain the identifier of the node responsible of a certain data. Equally, it is required that distribution of data is uniform among the nodes.
Any participating node can efficiently retrieve the value associated with a given name. Responsibility for maintaining the mapping from names to values is distributed among the nodes, in such a way that a change in the set of participants causes a minimal amount of disruption. This allows DHTs to scale to extremely large numbers of nodes and to handle continual node arrivals, departures, and failures.
DHTs form an infrastructure that can be used to build more complex services, such as distributed file systems, peer-to-peer, file sharing and content distribution systems, cooperative web caching, multicast, anycast, domain name services, and instant messaging. Maybe the main example of a distributed network that uses DHTs is the eDonkey network.
History
DHT research was originally motivated, in part, by peer-to-peer systems such as Napster, Gnutella, and Freenet, which took advantage of resources distributed across the Internet to provide a single useful application. In particular, they took advantage of increased bandwidth and hard disk capacity to provide a file sharing service.
These systems differed in how they found the data their peers contained. Napster had a central index server: each node, upon joining, would send a list of locally held files to the server, which would perform searches and refer the querier to the nodes that held the results. This central component left the system vulnerable to attacks and lawsuits. Gnutella and similar networks moved to a flooding query model—in essence, each search would result in a message being broadcast to every other machine in the network. While avoiding a single point of failure, this method was significantly less efficient than Napster. Finally, Freenet was also fully distributed, but employed a heuristic key based routing in which each file was associated with a key, and files with similar keys tended to cluster on a similar set of nodes. Queries were likely to be routed through the network to such a cluster without needing to visit many peers. However, Freenet did not guarantee that data would be found.
Distributed hash tables use a more structured key based routing in order to attain both the decentralization of Gnutella and Freenet, and the efficiency and guaranteed results of Napster. One drawback is that, like Freenet, DHTs only directly support exact-match search, rather than keyword search, although that functionality can be layered on top of a DHT.
The first four DHTs—Content addressable network, Chord, Pastry, and Tapestry were introduced about the same time in 2001. Since then this area of research has been quite active. Outside academia, DHT technology has been adopted as a component of BitTorrent and in the Coral Content Distribution Network.
Structure
The structure of a DHT can be decomposed into several main components. The foundation is an abstract keyspace, such as the set of 160-bit strings. A keyspace partitioning scheme splits ownership of this keyspace among the participating nodes. An overlay network then connects the nodes, allowing them to find the owner of any given key in the keyspace.
Once these components are in place, a typical use of the DHT for storage and retrieval might proceed as follows. Suppose the keyspace is the set of 160-bit strings. To store a file with given filename and data in the DHT, the SHA1 hash of filename is found, producing a 160-bit key (k) and a message put(k, data) is sent to any node participating in the DHT. The message is forwarded from node to node through the overlay network until it reaches the single node responsible for key k as specified by the keyspace partitioning, where the pair(k, data) is stored. Any other client can then retrieve the contents of the file by again hashing filename to produce k and asking any DHT node to find the data associated with k with a message get(k). The message will again be routed through the overlay to the node responsible for k, which will reply with the stored data.
The keyspace partitioning and overlay network components are described below with the goal of capturing the principal ideas common to most DHTs; many designs differ in the details.
Keyspace partitioning
The addition or removal of nodes can become a problem, due to the fact that the hash of two different keys can be considered to return a virtually different location:
- hask(k) mod m != hash(k) mod (m+1) != hask(k) mod(m-1)
The solution most DHTs employ is the usage of consistent hashing to map keys to nodes. This technique employs a function delta which defines an abstract notion of the distance from a given key k1 to another key k2, which is unrelated to geographical distance or network latency. Each node is assigned a single key called its identifier (ID). A node with ID i owns all the keys for which i is the closest ID, measured according to delta.
Consistent hashing has the essential property that removal or addition of one node changes only the set of keys owned by the nodes with adjacent IDs, and leaves all other nodes unaffected. Contrast this with a traditional hash table in which addition or removal of one bucket causes nearly the entire keyspace to be remapped. Since any change in ownership typically corresponds to bandwidth-intensive movement of objects stored in the DHT from one node to another, minimizing such reorganization is required to efficiently support high rates of churn (node arrival and failure).
Overlay network
Each node maintains a set of links to other nodes (its neighbors or routing table). Together these links form the overlay network. A node picks its neighbors according to a certain structure, called the network's topology.
All DHT topologies share some variant of the most essential property: for any key k, the node either owns k or has a link to a node that is closer to k in terms of the keyspace distance defined above. It is then easy to route a message to the owner of any key k using the following greedy algorithm: at each step, forward the message to the neighbor whose ID is closest to k. When there is no such neighbor, then we must have arrived at the closest node, which is the owner of k as defined above. This style of routing is sometimes called key based routing.
Beyond basic routing correctness, two key constraints on the topology are to guarantee that the maximum number of hops in any route (route length) is low, so that requests complete quickly; and that the maximum number of neighbors of any node degree is low, so that maintenance overhead is not excessive. Of course, having shorter routes requires higher maximum degree.
Maximum route length is closely related to diameter: the maximum number of hops in any shortest path between nodes. Clearly the network's route length is at least as large as its diameter, so DHTs are limited by the degree/diameter tradeoff which is fundamental in graph theory. Route length can be greater than diameter since the greedy routing algorithm may not find shortest paths.
Properties
In an efficient DHT, hosts are configured into a structured network so that mapping table lookups require a small number of hops. Designing a practical scheme along these lines is challenging because of the following desiderata:
- Scalability: the protocol should work for a range of networks of arbitrary size.
- Stability: the protocol should work for hosts with arbitrary arrival and departure times, typically with small lifetimes. This means that nodes joining and leaving should be gracefully handled, which implies repartitioning the affected keys over existing nodes, reorganizing the neighbour sets, and providing bootstrap mechanisms to connect new nodes into the DHT.
- Performance: the protocol should provide low latency for hash lookups and low maintenance cost in the presence of frequent joins and leaves.
- Small diameter: a consequence of the previous property. The node(s) responsible for each object should be reachable via a short path. In fact, the xisting DHT models fundamentally differ only in the routing approach.
- Flexibility: the protocol should impose few restrictions on the remainder of the system. It should allow for smooth trade-offs between performance and state management complexity.
- Small degree: a consequence of the flexibility property. There should be a reasonable number of neighbours for each node.
- Decentralized routing: DHT routing mechanisms should be decentralized, thus avoiding any single point of failure or bottleneck.
- Low stretch: necessary if our DHT wishes to perform well, minimizing the ratio of DHT routing versus unicast latency.
- Simplicity: the protocol should be easy to understand, code, debug and deploy.
A key technique used to achieve these goals is that any one node needs to coordinate with only a few other nodes in the system so that only a limited amount of work needs to be done for each change in membership.
Some DHT designs seek to be secure against malicious participants and to allow participants to remain anonymous, though this is less common than in many other peer-to-peer (especially file sharing) systems.
Finally, DHTs must deal with more traditional distributed systems issues such as load balance, data integrity, and performance (in particular, ensuring that operations such as routing and data storage or retrieval complete quickly).
The DHT abstraction provides a minimal access interface, which is mainly data-centric. It naturally supports a wide range of applications, because it imposes very few restrictions: keys have no semantic meaning, and values are application dependent. Therefore, DHTs can be used as a decentralized data insertion and location facility. It is important to note that DHTs are not meant for storing data: they provide the means to insert it and locate it in a decentralized fashion. However, data storage logic can be built on top of the DHTs, by using its principal programming interface: put (key, data) and get (key) → data.
Examples
Many systems have adopted this scheme, this is a list of the most known ones:
Applications employing DHTs
- BitTorrent: File distribution. BitTorrent optionally uses a DHT as a distributed tracker to provide rendezvous between clients downloading a particular file.
- The Circle: File sharing and chat
- Codeen: Web caching
- Coral Content Distribution Network
- CSpace: Secure communications platform
- Dijjer: Freenet-like distribution network
- eMule: File sharing
- FAROO: Peer-to-peer web search engine
- GNUnet [1]: Freenet-like distribution network
- I2P: Anonymous network
- JXTA: Opensource P2P platform
- NEOnet: File sharing
- Overnet: File sharing
- Warez P2P: File sharing
- YaCy: distributed search engine
- uTorrent: File sharing
Chronology
Graphic showing the evolution of DHTs, since the appear of the theoric research paper to the implementation of commercial applications based in DHTs.
Papers and links
- Ratnasamy, S.; Shenker, S. and Stoica, I. Routing Algorithms for DHTs: Some Open Questions
- Iamnitchi, A; Ripeanu, M. and Foster, I. Locating Data in (Small-World?) Peer-to-Peer Scientific Collaborations
- Distributed Hash Tables, Part 1 by Brandon Wiley.
- Tangosol Coherence includes a structure similar to a DHT, though all nodes have knowledge of the other participants
- Hermanni Hyytiälä A survey of implemented DHT overlays as of July 2003.
- Harren, M.; Hellerstein, J.M.; Huebsch, R; et al. Complex Queries in DHT-based Peer-to-Peer Networks
- Zhang, Z.; Shi, S.M. and Zhu, J. SOMO: Self-Organized Metadata Overlay for Resource Management in P2P DHT
- Freedman, M.J. and Mazières, D. Sloppy hashing and self-organizing clusters
- Kaashoek, M.F. and Karger, D.R.Koorde Koorde: A simple degree-optimal distributed hash table
- I. Stoica, R. Morris, D. Karger, M. F. Kaashoek, and H. Balakrishnan. Chord: A scalable peer-to-peer lookup service for internet applications. In Proceedings of SIGCOMM 2001, August 2001.

