Backup Methods
K8up currently implements three ways to do backups:
- PVC backups, which work by mounting the underlying PV of each PVC and reading its files.
- Backup commands, which print the content that should be backed up to STDOUT.
- PreBackupPod, which is run as part of a backup.
This guide briefly explains each backup method.
PVC Backup
This is the most straightforward backup approach.
K8up looks at all the PVC definitions in a namespace.
For every PVC, it creates a Job in that same namespace.
That Job launches a Pod, which mounts the respective PV.
All the content of that PV is then backed up.
The advantage of this method is that it is very easy to use and works for a surprisingly large number of use cases. It does not work in every case, though. More precisely, it does not work well when files are kept open for long periods of time, as databases do. Nor does it work for content that is not stored in the cluster, such as a managed database offered by a service provider.
If the PVC has the RWO (ReadWriteOnce) access mode, the backup Pod must be scheduled onto the same node as the Pod that uses that PVC.
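
The per-PVC fan-out and the RWO scheduling constraint described above can be sketched roughly as follows. This is an illustrative Python model, not K8up's actual implementation: the `Pvc` type, the `plan_backup_jobs` function, and the layout of the job dictionaries are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Pvc:
    """Hypothetical, simplified view of a PVC (not a Kubernetes API type)."""
    name: str
    namespace: str
    access_mode: str                    # e.g. "ReadWriteOnce" or "ReadWriteMany"
    used_on_node: Optional[str] = None  # node of the Pod currently using the PVC, if any


def plan_backup_jobs(pvcs: list[Pvc], namespace: str) -> list[dict]:
    """Plan one backup job per PVC found in the given namespace."""
    jobs = []
    for pvc in pvcs:
        if pvc.namespace != namespace:
            continue  # only PVCs of the backed-up namespace are considered
        job = {"name": f"backup-{pvc.name}", "namespace": namespace, "mounts": pvc.name}
        # An RWO volume can only be mounted on one node at a time, so the
        # backup Pod has to run on the node where the volume is already in use.
        if pvc.access_mode == "ReadWriteOnce" and pvc.used_on_node is not None:
            job["node"] = pvc.used_on_node
        jobs.append(job)
    return jobs


if __name__ == "__main__":
    pvcs = [
        Pvc("data", "shop", "ReadWriteOnce", used_on_node="node-a"),
        Pvc("media", "shop", "ReadWriteMany"),
        Pvc("logs", "other", "ReadWriteOnce", used_on_node="node-b"),
    ]
    for job in plan_backup_jobs(pvcs, "shop"):
        print(job)
```

In this model, only the RWO volume that is already in use gets pinned to a node; the other job can be scheduled anywhere.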
Read Annotations to learn more about how the backup process can be influenced.
Application-Aware Backups
The backup command is defined as an annotation on your Pod.
When K8up does the backup, it starts that command in the context of your Pod.
It uses the same mechanism as running kubectl exec POD -- COMMAND ARGS.
K8up then collects everything that is written to STDOUT.
The collected content is then stored in the configured backup storage as a file.
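
The "collect STDOUT and store it as a file" step can be illustrated locally. The following Python sketch is only an analogy: it runs a local child process instead of executing a command in a Pod, and it writes to a local file instead of the configured backup storage. The function name `capture_stdout_to_file` and the fake dump command are hypothetical.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def capture_stdout_to_file(command: list[str], target: Path) -> int:
    """Run `command`, stream its STDOUT into `target`, and return the byte count."""
    written = 0
    with open(target, "wb") as out, subprocess.Popen(command, stdout=subprocess.PIPE) as proc:
        # Read in chunks so that large dumps never have to fit into memory.
        while chunk := proc.stdout.read(64 * 1024):
            out.write(chunk)
            written += len(chunk)
    # Leaving the `with` block waits for the process, so returncode is set here.
    if proc.returncode != 0:
        raise RuntimeError(f"backup command failed with exit code {proc.returncode}")
    return written


if __name__ == "__main__":
    # Stand-in for a real dump tool: a child process that prints some "data".
    fake_dump = [sys.executable, "-c", "import sys; sys.stdout.write('CREATE TABLE t;')"]
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "backup.sql"
        size = capture_stdout_to_file(fake_dump, target)
        print(size, target.read_text())
```

Everything the command prints ends up in the file, so the command should write only the backup data to STDOUT and send diagnostics elsewhere.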
This method is especially useful for backing up databases, because you get a consistent view of the database.
This is usually not guaranteed with a file-based backup, as the database may not yet have written some content to disk when K8up reads the file.
Or worse: the database may modify the file while K8up is reading it.
This can be prevented by relying on your database's own tooling to take backups.
For PostgreSQL, for example, this tool is pg_dump.
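
As a rough illustration, the sketch below builds Pod metadata that carries a backup command calling pg_dump. It is written in Python only to keep the example self-contained; in practice this would be part of a Pod manifest. The annotation keys `k8up.io/backupcommand` and `k8up.io/file-extension` are assumptions about the K8up version in use and should be checked against the Annotations reference. The Pod name, database user, and database name are made up.

```python
import json


def pod_metadata_with_backup_command(pod_name: str, command: str, extension: str) -> dict:
    """Return Pod metadata with the backup command attached as annotations."""
    return {
        "name": pod_name,
        "annotations": {
            # Assumed annotation keys; verify them for your K8up version.
            "k8up.io/backupcommand": command,
            "k8up.io/file-extension": extension,
        },
    }


if __name__ == "__main__":
    metadata = pod_metadata_with_backup_command(
        pod_name="postgres-0",  # hypothetical Pod name
        # pg_dump writes the dump to STDOUT by default; authentication is
        # assumed to be configured inside the container already.
        command="sh -c 'pg_dump -U postgres -d shop'",
        extension=".sql",
    )
    print(json.dumps({"metadata": metadata}, indent=2))
```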
The advantage of this method is that it has access to everything that your Pod has access to.
This means that it can connect to internal (and also external) endpoints to fetch data.
It also works with a variety of database systems and with similar programs that keep (some portion of) their data in memory.
The drawback is that it transfers data via STDOUT, which is less efficient.
The reason is that Kubernetes has to relay that data from the executed command to K8up.
Another drawback is that your Pod must provide the tools to execute the command.
If it does not already contain that command, for example because it is a distroless container, then the next method might be for you.
Read How to create Application-Aware Backups to learn more about how to use this backup method.
PreBackupPod
The PreBackupPod builds upon the concept of the aforementioned backup command.
It is essentially a special Pod that is created by K8up for every backup run.
You have all the flexibility of the Kubernetes Pod definition, like defining a special image or accessing a Secret.
After the Pod has started, K8up runs the given backupCommand in that Pod as described above.
When the command has finished, K8up removes the Pod again.
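
The lifecycle described above (create the Pod, run the command in it, remove the Pod) can be modeled with a small Python sketch. It does not talk to a real cluster: `FakeCluster`, `pre_backup_pod`, and `run_backup` are hypothetical names, and the image and command are placeholders. The text above only states that the Pod is removed once the command has finished; removing it after a failed command as well is an assumption of this sketch.

```python
from contextlib import contextmanager


class FakeCluster:
    """In-memory stand-in for a cluster; it only records what happens to it."""

    def __init__(self):
        self.pods = set()
        self.events = []

    def create_pod(self, name, image):
        self.pods.add(name)
        self.events.append(f"create {name} ({image})")

    def exec_in_pod(self, name, command):
        if name not in self.pods:
            raise KeyError(f"pod {name} does not exist")
        self.events.append(f"exec {name}: {command}")
        return b"-- fake dump --"  # pretend STDOUT of the backup command

    def delete_pod(self, name):
        self.pods.discard(name)
        self.events.append(f"delete {name}")


@contextmanager
def pre_backup_pod(cluster, name, image):
    """Create the Pod for one backup run and remove it afterwards."""
    cluster.create_pod(name, image)
    try:
        yield name
    finally:
        # Assumption: the Pod is cleaned up even if the command raised.
        cluster.delete_pod(name)


def run_backup(cluster, name, image, backup_command):
    """Run the backup command inside a short-lived Pod and return its output."""
    with pre_backup_pod(cluster, name, image) as pod:
        return cluster.exec_in_pod(pod, backup_command)


if __name__ == "__main__":
    cluster = FakeCluster()
    run_backup(cluster, "dump", "postgres:16", "pg_dump -U postgres -d shop")
    for event in cluster.events:
        print(event)
```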
The advantage of this method is its flexibility.
You can provide a special image which contains all the commands you need.
You can access your Secrets.
You can connect to services inside or outside the cluster, like a managed database.
The drawback, in contrast to the pure backup command, is that this Pod runs in its own context and cannot access services that are internal to another Pod.
You may also need to keep the PreBackupPod in sync with your main Deployment or with an external dependency.
When you update the database in your main Deployment, for example, you may need to update your PreBackupPod as well.
Read How to use Pre-Backup pods to learn more about using this backup method and about further use cases for it.