P2P Backup

From P2P Wiki

Introduction

The goal of P2P backup systems is to use the free hard drive space available on today's machines to build a network of interested peers that want to back up their data on other peers (see A Survey of P2P Backup Networks).

The interesting backup systems are the ones that promote the exchange of backup data between peers. In these systems every peer has an interest in participating in the network. However, they face a set of problems:

Save the data

When two peers exchange data, each party has to verify that its data is actually stored on the other side. Most systems accomplish this by having the partners challenge each other regularly.
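The challenge mechanism can be sketched as follows. This is only an illustration, not the protocol of any particular system listed here: it assumes the owner precomputes a few (nonce, expected answer) pairs before handing the data over, so it can later check answers without keeping a local copy. The function names and the choice of SHA-256 are assumptions.

```python
import hashlib
import hmac
import os


def make_challenges(data: bytes, count: int) -> list:
    """Owner side: precompute (nonce, expected_answer) pairs before handing the data over."""
    challenges = []
    for _ in range(count):
        nonce = os.urandom(16)  # fresh random value, so old answers cannot be replayed
        expected = hashlib.sha256(nonce + data).digest()
        challenges.append((nonce, expected))
    return challenges


def answer_challenge(stored_data: bytes, nonce: bytes) -> bytes:
    """Holder side: computing the answer requires the full data."""
    return hashlib.sha256(nonce + stored_data).digest()


def verify(expected: bytes, answer: bytes) -> bool:
    """Owner side: compare the holder's answer with the precomputed value."""
    return hmac.compare_digest(expected, answer)
```

Each precomputed pair should be used only once, since a holder that has seen a nonce could cache the answer and discard the data.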

Incentivize peers to participate in the recovery

If a peer fails, it loses both its own data and the data that other peers have stored on it. When it tries to recover its data, the other peers have no incentive to participate in the recovery, because the failed peer no longer holds any of their data.

Grace period attack

In P2P backup systems it is impossible to assume that peers will be connected to the network 100% of the time. For this reason, peers need to keep the data of their exchange partners even when those partners fail a data challenge. Most of the time, peers give their partners a grace period before cancelling the partnership. A selfish peer can exploit this grace period to store data and replicate it to another node before the grace period expires.
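A minimal sketch of how a peer might track a grace period for one partner. The class name, its fields, and the rule "cancel once failures have lasted longer than the grace period" are assumptions made for illustration; real systems may count failures differently. Timestamps are passed in explicitly so the logic is easy to follow.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Partnership:
    grace_period: float                    # seconds a partner may keep failing (hypothetical)
    failing_since: Optional[float] = None  # time of the first failure in the current streak
    active: bool = True

    def record_challenge(self, passed: bool, now: float) -> None:
        """Update the partnership state after one data challenge."""
        if not self.active:
            return
        if passed:
            self.failing_since = None      # partner is back, reset the streak
            return
        if self.failing_since is None:
            self.failing_since = now       # first failure starts the grace period
        elif now - self.failing_since > self.grace_period:
            self.active = False            # grace period exhausted, cancel the partnership
```

Until `active` becomes false the peer keeps holding its partner's data, which is exactly the window a selfish peer can abuse.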

Peer-to-Peer Data Trading to Preserve Information

Data Trading to Preserve Information

pStore

  • Chord as the routing backend.

Cooperative Internet Backup Scheme

Pastiche and Samsara

Pastiche

Pastiche aims to back up an entire filesystem to a set of peers (buddies). Each peer tries to choose as its buddies the peers that share the largest amount of data with it (overlapping data).

Pastiche uses two Pastry overlays. The first is a standard Pastry overlay that uses the latency between peers as the metric for building its routing table; the second is a modified Pastry overlay that uses the data overlap between peers as the metric.

To compute the overlap between peers, Pastiche computes the hashes (signatures) of a small, random subset of files, called an abstract, and sends it to other peers. Each of these peers compares the received abstract with its own data and returns the result to the first peer, which uses it to approximate the overlap with that potential backup buddy. Pastiche can limit the size of the abstract because arbitrary, small pieces of larger logical entities are almost always unique and can therefore stand for the whole. This allows machines with common installations to find suitable buddies with very little effort. Machines with uncommon installations may need to use a Pastry overlay with a different routing metric, the coverage rate.
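The abstract comparison can be illustrated with a short sketch. It is not Pastiche's actual code: the function names, the use of SHA-256 as the signature, and the definition of coverage as "fraction of the abstract's signatures that the other peer also holds" are assumptions.

```python
import hashlib
import random
from typing import Optional


def signature(item: bytes) -> str:
    """Content hash used as the signature of one item (assumed: SHA-256)."""
    return hashlib.sha256(item).hexdigest()


def make_abstract(items: list, size: int, seed: Optional[int] = None) -> set:
    """Pick a small random subset of the local items and return their signatures."""
    rng = random.Random(seed)
    sample = rng.sample(items, min(size, len(items)))
    return {signature(item) for item in sample}


def coverage(abstract: set, local_items: list) -> float:
    """Candidate side: fraction of the received abstract that is also stored locally."""
    if not abstract:
        return 0.0
    local = {signature(item) for item in local_items}
    return len(abstract & local) / len(abstract)
```

A peer would send `make_abstract(...)` to each candidate and prefer the candidates that report the highest `coverage(...)`.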

Samsara

Samsara extends Pastiche with a bilateral, equal exchange of data between peers and with probabilistic punishment, in order to solve the problem of giving peers an incentive to take part in recovery.
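As a rough illustration of probabilistic punishment, the sketch below assumes that after each failed check the holder discards every block of the misbehaving partner independently with a fixed probability. The function names and the probability parameter are hypothetical and are not taken from Samsara itself.

```python
import random


def punish(blocks: list, drop_probability: float, rng: random.Random) -> list:
    """Return the blocks kept after one failed check; each is dropped independently."""
    return [block for block in blocks if rng.random() >= drop_probability]


def survival_probability(drop_probability: float, failed_checks: int) -> float:
    """Chance that a single block is still stored after the given number of failed checks."""
    return (1.0 - drop_probability) ** failed_checks
```

Under this assumption a short outage costs a peer only a small share of its blocks, while a partner that keeps failing eventually loses all of them.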

Collaborative Backup for Self-Interested Hosts

PeerStore

  • Uses Chord as the routing backend, but only for the metadata.
  • Data blocks are backed up through an unstructured overlay, using a broadcast protocol to find partners.

Others

iDIBS

MyriadStore

WuaLa

DisPairSe Project

P2P Backup Blog with alternatives

Social P2P Backup: Zoogmo

Social F2F Backup: BuddyBackup

F2F Backup: Crashplan

P2P Backup idea and discussion
