Ceph Virtual Machine Backup

If you happen to run a Ceph cluster as your VM storage backend, and you store your virtual machine filesystems directly on RADOS block devices (without a partition table), you can do efficient backups by snapshotting the image and - for example - rsyncing it away. My strategy is to use a dedicated backup virtual machine that instructs the host to create a Ceph snapshot (be sure to use --format 2 when creating your RBDs). A writable copy-on-write clone of that snapshot is attached to the backup VM, which then mounts the filesystem of the attached device, performs journal replay and an fsck on it if necessary, and rsyncs the data away. The clone needs to be writable because journal replay and fsck write to the device, which a read-only snapshot would not allow. Once rsyncing is done, the clone is detached and removed together with the snapshot. This is the script I use within the backup virtual machine to achieve this:
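
For reference, a minimal sketch of creating a format 2 image suitable for this workflow (pool and image names are made up for illustration):

rbd create --size 20480 --format 2 mypool/myvm

Layering - the copy-on-write cloning used below - only works on format 2 images, which is why the flag matters.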

#!/bin/bash
# wogri@wogri.com

# connects to a ceph host and creates a snapshot, a clone ("child") of which is attached to this VM.
# this VM then mounts the clone, rsyncs the data away and unmounts it again.
# as a last action the child and the snapshot are removed on the ceph host.

if [ $# -ne 2 ]
then
 echo "Usage: backup.sh pool-name rbd-name"
 exit 1
fi

pool=$1
rbd=$2
ceph_host=1.2.3.4
rsync_host=1.2.3.5
export RSYNC_PASSWORD=your_secret_password

function backup_problems() {
 echo "ERROR: could backup $1/$2. Will be left in a inconsistent state. Please remove child, unprotect snapshot and remove snapshot."
 exit 1
}

set -x

ssh root@$ceph_host /usr/local/sbin/attach_snapshot.sh $pool $rbd &&\
backup_path=/mnt/backup/$pool/$rbd &&\
mkdir -p $backup_path &&\
mount /dev/vdb $backup_path &&\
rsync -ax --delete --numeric-ids $backup_path $rsync_host::backup/$pool &&\
umount $backup_path &&\
ssh root@$ceph_host /usr/local/sbin/remove_snapshot.sh $pool $rbd ||\
backup_problems $pool $rbd
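
The rsync target above is an rsync daemon module on the backup host. As a sketch, a matching rsyncd.conf on that machine could look like this (the path and user name are assumptions, adapt them to your setup):

[backup]
 path = /srv/backup
 read only = false
 auth users = backup
 secrets file = /etc/rsyncd.secrets

Note that rsync defaults to your login name for daemon authentication; if the daemon expects a different user, use the user@host::module form, and make sure RSYNC_PASSWORD matches that user's entry in the secrets file.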

Attaching the clone is done with libvirt on the host; the attach script looks like this:

#!/bin/bash

# wogri@wogri.com
# ceph snapshotter and backupper


if [ $# -ne 2 ]
then
 echo "Usage: backup.sh pool-name rbd-name"
 exit 1
fi

set -x

pool=$1
rbd=$2
snap=${rbd}@snap # snapshot name
child=${rbd}_child # name of snapshot-copy-on-write-copy
backup=backup # name of backup machine

function snapshot_problems() {
 echo "ERROR: could not create snapshot of $1/$2. Will be left in a inconsistent state. Please remove child, unprotect snapshot and remove snapshot."
 exit 1
}

rbd snap create $pool/$snap && \
rbd snap protect $pool/$snap && \
rbd clone $pool/$snap $pool/$child || \
snapshot_problems $pool $snap

virsh_xml="<disk type='network' device='disk'>
 <driver name='qemu' type='raw'/>
 <source protocol='rbd' name='$pool/$child:rbd_cache=1'>
 <host name='my.ceph.mon' port='6789'/>
 </source>
 <target dev='vdb' bus='virtio'/>
 </disk>"

xml=/tmp/$pool.$child.xml

echo "$virsh_xml" > $xml
virsh attach-device $backup $xml
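
For sanity checking while the clone is attached, the snapshot and its children can be listed on the ceph host with standard rbd commands (shown with the script's variables):

rbd snap ls $pool/$rbd
rbd children $pool/$snap

Inside the backup VM the clone appears as /dev/vdb, matching the target device in the XML above and the mount in the first script.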

Detaching is done like this:

#!/bin/bash

# wogri@wogri.com
# ceph snapshot detacher and remover
if [ $# -ne 2 ]
then
 echo "Usage: snapshot_remover.sh pool-name rbd-name"
 exit 1
fi

set -x

pool=$1
rbd=$2
snap=${rbd}@snap # snapshot name
child=${rbd}_child # name of snapshot-copy-on-write-copy
backup=backup # name of backup machine

function snapshot_problems() {
 echo "ERROR: could not remove snapshot of $1/$2. Will be left in a inconsistent state. Please remove child, unprotect snapshot and remove snapshot."
 exit 1
}

xml=/tmp/$pool.$child.xml

virsh detach-device $backup $xml &&\
rm $xml &&\
rbd rm $pool/$child &&\
rbd snap unprotect $pool/$snap &&\
rbd snap rm $pool/$snap ||\
snapshot_problems $pool $snap
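
To tie everything together, the first script runs from cron inside the backup VM. A sketch of a nightly crontab entry, assuming the script is installed as /usr/local/sbin/backup.sh (the path, pool and image names are assumptions):

# back up mypool/myvm every night at 02:00
0 2 * * * /usr/local/sbin/backup.sh mypool myvm
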
Last modified: 2014