We’re excited to announce a new feature in OpenNebula 6.10.1: support for incremental Ceph backups. This feature allows you to back up VMs using RBD images without having to save the full disk contents each time; only the changes between backups are stored.
Previously, Ceph datastores only supported full backups, which meant starting from scratch each time: exporting and saving the entire disk regardless of how much had changed, often duplicating data.
Why Incremental Backups?
The incremental backup approach, already available with qcow2 images on KVM, offers several advantages:
- Smaller backup sizes. While the initial backup may be the same size as a full backup, subsequent backups only store the changes, dramatically reducing the amount of storage needed.
- Faster backups. Once the initial backup is done, all future backups are quicker, saving both network and disk IOPS.
- Less IO load. Incremental backups place a lighter IO burden on your Ceph cluster, improving performance overall.
How to Use Incremental Backups
Once this feature is available, you can start using it by selecting INCREMENT as the backup mode. You can do this via FireEdge Sunstone or the OpenNebula CLI (onevm backupmode <VMID> INCREMENT). From that point on, the next backup will create a new image, and subsequent backups will add only the increments.
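For example, from the CLI, assuming a hypothetical VM 42 and backup datastore 100:

```
# Switch the VM to incremental backup mode
onevm backupmode 42 INCREMENT

# The first backup creates the base image; later runs add only increments
onevm backup 42 --datastore 100
```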
Implementation Details
To track changes between points in time for an RBD image, we leverage Ceph snapshot functionality. RBD snapshots can be created in a crash-consistent way, protected, exported, compared (diff’ed) against one another, and deleted when they are no longer needed.
The process involves:
- Initial backup: create a snapshot and export it as a base image containing the full disk.
- Subsequent backups: create another snapshot and export only the diff (incremental changes) between it and the previous snapshot.
For exported files, we’ve decided to use the .rbd2 extension for the initial snapshot and the .rbdiff extension for the incremental files. The “2” in .rbd2 refers to the --export-format 2 flag, which ensures the file retains the RBD metadata necessary for restoration.
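As a rough illustration (the pool, image, and snapshot names here are examples, not the driver’s actual naming):

```
# Full export of snapshot s1, keeping RBD metadata (format 2)
rbd export --export-format 2 one/disk0@s1 disk0.s1.rbd2

# Delta between snapshots s1 and s2, stored as an incremental file
rbd export-diff --from-snap s1 one/disk0@s2 disk0.s2.rbdiff
```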
Backup and Restore Process
From the OpenNebula perspective, the main change from the full backup process is the need to store multiple files in different formats (.rbd2 and .rbdiff), instead of a single qcow2 image. This difference impacts the entire backup and restore workflow. Even if we wanted to quickly “consolidate” the base export with the various diff files (such as through the downloader.sh logic), Ceph doesn’t allow this except directly on a live Ceph cluster.
As a result, this approach differs from full backups in that it will be asymmetric: instead of sending and receiving a single qcow2 file to and from the backup storage system, we’ll send .rbd2 and .rbdiff files and receive a .tar.gz file containing all of them.
This is the final process we will follow for incremental backups (each phase is also sketched as a shell sequence below):
Initial Backup:
- create and protect RBD snapshot (s1) (rbd snap [create, protect])
- export s1 to file s1.rbd2 (rbd export --export-format 2)
- s1.rbd2 is uploaded to backup DS_MAD (e.g., rsync)
- s1.rbd2 is deleted
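A minimal sketch of this phase, assuming a hypothetical image one/disk0 and an rsync backup host:

```
rbd snap create one/disk0@s1                 # crash-consistent snapshot
rbd snap protect one/disk0@s1                # guard it against removal
rbd export --export-format 2 one/disk0@s1 disk0.s1.rbd2
rsync disk0.s1.rbd2 backup-host:/backups/    # upload to the backup datastore
rm disk0.s1.rbd2                             # local copy no longer needed
```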
Subsequent (Incremental) Backups:
- create and protect RBD snapshot (s2) (rbd snap [create, protect])
- export diff between s1 and s2 as incremental file s2.rbdiff (rbd export-diff)
- previous snapshot (s1) is deleted, as it’s no longer useful
- s2.rbdiff is uploaded to backup DS_MAD
- s2.rbdiff is deleted
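Under the same hypothetical names, this phase would look roughly like:

```
rbd snap create one/disk0@s2
rbd snap protect one/disk0@s2
rbd export-diff --from-snap s1 one/disk0@s2 disk0.s2.rbdiff
rbd snap unprotect one/disk0@s1              # s1 is no longer needed
rbd snap rm one/disk0@s1
rsync disk0.s2.rbdiff backup-host:/backups/
rm disk0.s2.rbdiff
```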
VM Restore Process (illustrated in Figure 1):
- obtain disk list from DS.ls, and for each of them:
  - fetch, as a .tar.gz file, both s1.rbd2 and 0+ incremental diffs sN.rbdiff
  - restore base image s1.rbd2 (rbd import)
  - apply incremental diffs sN.rbdiff in the same order (rbd import-diff)
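Sketched with the same hypothetical names, assuming two increments:

```
# Unpack the bundle received from the backup datastore
tar xzf disk0.backup.tar.gz

# Recreate the base image from the full export
rbd import --export-format 2 disk0.s1.rbd2 one/disk0

# Replay each increment, in order
rbd import-diff disk0.s2.rbdiff one/disk0
rbd import-diff disk0.s3.rbdiff one/disk0
```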
Managing Different Backup Types
Another challenge we encountered is the need to differentiate between two types of Ceph backup images moving forward:
- Full Backup Images. These will continue to store a single qcow2 file, representing both previous and future full backups.
- Incremental Backup Images. These will consist of the previously described combination of .rbd2 and .rbdiff files, resulting from current incremental backups.
To distinguish between these two image types, particularly during the restore process, we’ll use the image’s FORMAT attribute. Currently, this attribute is set to the value raw for all images. To ensure backwards compatibility with existing backups, we’ll retain this value for the first type (qcow2), and set it to rbd for the new incremental backups.
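For illustration, a hypothetical excerpt of oneimage show for one backup image of each type (the IDs and exact output layout are assumptions):

```
$ oneimage show 15 | grep FORMAT
FORMAT : raw     # full backup: single qcow2 file

$ oneimage show 16 | grep FORMAT
FORMAT : rbd     # incremental backup: .rbd2 + .rbdiff files
```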
It’s important to note that the relationship between full-qcow2 and incremental-rbd images is based on our current approach and may evolve in the future. For instance, it’s reasonable to expect that full backups might also be stored as rbd in order to streamline the logic and potentially allow features like converting existing full backups into incremental ones.
Changes to the Backup API
Additionally, this change necessitated updates to the API of the DS.backup operation, which is called by the VMM_MAD exec script (src/vmm_mad/exec/one_vmm_exec.rb). This operation will now be responsible for determining whether the new image will be in raw or rbd format and will propagate that information back to oned. The changes made include:
- Adding a third value to the return output to indicate the image format (raw/rbd); see the sketch after this list.
- Modifying the input format to include the VM information, which is essential for making the format decision.
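As a loose illustration only (the exact field layout of the driver protocol is internal to OpenNebula, so everything below is an assumption):

```
# Hypothetical shape of a DS.backup driver's success output, where the
# third value tells oned which format the new backup image uses:
#   <backup source> <size> <format>
echo "$BACKUP_SOURCE $BACKUP_SIZE rbd"    # "raw" for full backups
```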
We’re thrilled to introduce this new functionality, and we hope it makes your Ceph backup strategies more efficient and practical. As always, we welcome your feedback and are happy to answer any questions you may have. Stay tuned for more exciting OpenNebula features!
🇪🇺 Part of the new functionality has been funded by ONENextGen (Grant Agreement UNICO IPCEI-2023-003), supported by the Spanish Ministry for Digital Transformation and Civil Service through the UNICO IPCEI Program, co-funded by the European Union–NextGenerationEU through the Recovery and Resilience Facility (RRF).