P2P Backup
From P2P Wiki
Introduction
The goal of P2P backup systems is to use the free hard drive space available on today's machines to build a network of interested peers that back up their data to one another (see A Survey of P2P Backup Networks).
The interesting backup systems are the ones that promote the exchange of backup data between peers. In these systems all peers have an interest in participating in the network. However, they face a set of problems:
Save the data
When two peers exchange data, both parties have to make sure that the other side actually stores the data. Most systems accomplish this by having the peers challenge each other regularly.
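The challenge exchange can be sketched as follows. This is a minimal illustration, not the protocol of any particular system: it assumes a nonce-plus-hash scheme in which the owner still has a copy of the block to compare against, and the function names (`make_challenge`, `answer_challenge`, `verify_answer`) are hypothetical.

```python
import hashlib
import secrets


def make_challenge() -> bytes:
    """Owner side: pick a fresh random nonce so old answers cannot be replayed."""
    return secrets.token_bytes(16)


def answer_challenge(stored_block: bytes, nonce: bytes) -> str:
    """Holder side: hash the nonce together with the block it claims to store."""
    return hashlib.sha256(nonce + stored_block).hexdigest()


def verify_answer(reference_block: bytes, nonce: bytes, answer: str) -> bool:
    """Owner side: recompute the expected answer from its own copy and compare."""
    expected = hashlib.sha256(nonce + reference_block).hexdigest()
    return secrets.compare_digest(expected, answer)
```

Because the nonce changes on every challenge, the holder cannot answer from a cached hash and has to keep the block itself. An owner that no longer holds the block would need another arrangement, for example answers precomputed before the data was handed over.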
Incentivizing peers to participate in the recovery
If a peer fails, it loses its own data and the data that other peers have stored on it. When it wants to recover its data, the other peers have no incentive to participate in the recovery, because the failed peer no longer holds any of their data.
Grace period attack
In P2P backup systems it is impossible to assume that peers will be connected to the network 100% of the time. For this reason, peers need to keep the data of their exchange partners even if a partner fails a data challenge. Most of the time, peers give their partners a grace period before cancelling the partnership. Selfish peers can exploit this grace period to store their data and replicate it to another node before the grace period expires.
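The grace period bookkeeping described above can be sketched like this. The class name `Partnership`, its methods, and the one-week value of `GRACE_PERIOD_SECONDS` are assumptions made for illustration; the systems listed on this page choose their own policies.

```python
from dataclasses import dataclass
from typing import Optional

GRACE_PERIOD_SECONDS = 7 * 24 * 3600  # assumed value, not taken from any system


@dataclass
class Partnership:
    """Tracks one exchange partner and the start of its current failure streak."""
    first_failure_at: Optional[float] = None

    def record_challenge(self, passed: bool, now: float) -> None:
        if passed:
            self.first_failure_at = None   # partner answered, streak is over
        elif self.first_failure_at is None:
            self.first_failure_at = now    # first failure starts the grace period

    def should_cancel(self, now: float) -> bool:
        """True once the partner has been failing for longer than the grace period."""
        if self.first_failure_at is None:
            return False
        return now - self.first_failure_at > GRACE_PERIOD_SECONDS
```

The interval between `first_failure_at` and the moment `should_cancel` becomes true is exactly the window a selfish peer can exploit: its data stays stored during that time even though it is no longer answering challenges.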
Peer-to-Peer Data Trading to Preserve Information
Data Trading to Preserve Information
pStore
- Uses Chord as the routing backend.
Cooperative Internet Backup Scheme
Cooperative Internet Backup Scheme
Pastiche and Samsara
Pastiche
Pastiche aims to back up an entire filesystem to a set of peers (buddies). Each peer tries to choose as its buddies the peers that share the largest amount of data with it (overlapping data).
Pastiche uses two Pastry overlays. The first is a normal Pastry overlay that uses the latency between peers as the metric for building the routing table; the second is a modified Pastry overlay that uses the data overlap between peers as the metric.
To compute the overlap between peers, Pastiche computes the hashes (signatures) of a small, random subset of files, called an abstract, and sends it to other peers. Each of these peers computes the difference between the received abstract and its own content and returns the result to the first peer, so that potential backup buddies can approximate their overlap. Pastiche is able to limit the size of the abstract by taking advantage of the fact that arbitrary, small pieces of larger logical entities are almost always unique and can, therefore, stand for the whole. This allows machines with common installations to find suitable buddies with very little effort. Machines with uncommon installations may need to use a Pastry overlay with a new routing metric, coverage rate.
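The abstract comparison can be illustrated with a short sketch. It is not Pastiche's implementation: SHA-256 as the signature, the helper names (`signature`, `make_abstract`, `coverage`), and returning the result as a covered fraction are all assumptions made for the example.

```python
import hashlib
import random
from typing import Iterable, Set


def signature(content: bytes) -> str:
    """Signature of one piece of content (SHA-256 is an assumed choice)."""
    return hashlib.sha256(content).hexdigest()


def make_abstract(contents: Iterable[bytes], sample_size: int,
                  rng: random.Random) -> Set[str]:
    """Pick a small random subset of signatures to stand for the whole file system."""
    signatures = sorted({signature(c) for c in contents})
    k = min(sample_size, len(signatures))
    return set(rng.sample(signatures, k))


def coverage(abstract: Set[str], local_contents: Iterable[bytes]) -> float:
    """Candidate buddy side: fraction of the received abstract it already stores."""
    if not abstract:
        return 0.0
    local = {signature(c) for c in local_contents}
    return len(abstract & local) / len(abstract)
```

A peer would send the result of `make_abstract` to candidate buddies and prefer those that report the highest `coverage`, since data a buddy already holds does not need to be transferred again.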
Samsara
Samsara extends Pastiche with a bilateral, equal exchange of data between peers and with probabilistic punishment, in order to address the recovery incentive problem.
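One way to picture probabilistic punishment is the sketch below. It assumes that, after a partner fails a check, each of its stored blocks is dropped independently with some probability; the function name `punish`, the `drop_probability` parameter, and the per-block policy are illustrative assumptions, not details confirmed by this page.

```python
import random
from typing import Dict


def punish(stored_blocks: Dict[str, bytes], drop_probability: float,
           rng: random.Random) -> int:
    """After a failed check, drop each of the partner's blocks independently
    with the given probability. Returns how many blocks were dropped."""
    dropped = [block_id for block_id in sorted(stored_blocks)
               if rng.random() < drop_probability]
    for block_id in dropped:
        del stored_blocks[block_id]
    return len(dropped)
```

Under this assumption a block survives n consecutive failed checks with probability (1 - p)^n, where p is the drop probability. A partner that is briefly offline therefore loses little, while one that stays unresponsive eventually loses almost everything it stored.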
Collaborative Backup for Self-Interested Hosts
PeerStore
- Uses Chord as the routing backend, only for the metadata.
- Data blocks are backed up over an unstructured overlay, using a broadcast protocol to find partners.
Others
P2P Backup Blog with alternatives

