How to Backup Kubernetes?

  • July 8, 2020, Rob Morrison

While the assumption that Kubernetes is only for DevOps was once largely correct, many companies are now actively deploying containers in operational environments. They are also increasingly choosing containers over traditional VMs, thanks to the advantages in flexibility, performance and cost that containers can often provide. However, as containers move into the operations side of the IT environment, there is increasing concern about their security in a mission-critical environment, including how their persistent data is handled by backup and restore processes.

Originally, the overwhelming majority of containerized apps were stateless, which made them much easier to deploy on a public cloud. That changed over time, and far more stateful applications are now deployed in containers than before. This change is why backup and recovery in Kubernetes is now an important topic for a lot of organizations.

Important features of a competent Kubernetes backup solution

The specific nature of Kubernetes environments makes it harder for more traditional backup systems and techniques to work well in the context of Kubernetes nodes and applications. Both RPO and RTO may need to be far stricter, since applications are often expected to be up and running constantly, and some of them are especially critical.

This leads us to three features that are highly recommended for every enterprise in general, and a clear necessity when it comes to best-practice Kubernetes backups:

  • Disaster recovery;
  • Backup and restore;
  • Local high availability.

It’s also important to go into a little more detail about these three features, since their context in a Kubernetes environment may slightly change from the regular definition of a “backup and restore” feature.

Local high availability is about protection against failures within a specific data center or across availability zones (in the cloud, for example). A “local” failure is one that occurs in the infrastructure, node or app used to run the application. In a perfect scenario, your Kubernetes backup solution should be able to react to this failure by keeping the app working, essentially meaning no downtime for the end user. One of the most common examples of a local failure is a stuck cloud volume that happens after a node failure.

From this perspective, local high availability can be considered a foundation of the overall data protection system. To perform such a task, your solution needs to offer some sort of local data replication, and it also has to be in the data path in the first place. It is important to mention that providing local availability by restoring from a backup still counts as backup and restore and not as local high availability, due to the overall recovery time.

Backup and restore is another important part of a Kubernetes backup system. In most use cases it backs up the entire application offsite from a local Kubernetes cluster. The context of Kubernetes also brings up another important consideration: whether the backup software “understands” what a Kubernetes app consists of:

  • App configuration;
  • Kubernetes resources;
  • Data.

A correct Kubernetes backup needs to save all of the parts above as a single unit for it to be useful in the Kubernetes system after a restore. Targeting specific VMs, servers or disks does not mean that the software “understands” Kubernetes apps. Ideally, Kubernetes backup software should be able to back up specific applications, specific groups of applications, as well as an entire Kubernetes namespace. That’s not to say that it is completely different from the regular backup process. Kubernetes backups can also benefit greatly from the familiar features of a conventional backup, including retention, scheduling, encryption, tiering, and so on.
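As a rough illustration of the “single unit” idea, the sketch below bundles the three parts of an app backup together and reports which ones are missing. It is not taken from any particular product: the `AppBackup` class, its field names and the sample values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AppBackup:
    """Hypothetical record that keeps one app's backup together as a unit."""
    app: str
    namespace: str
    config: Dict[str, str] = field(default_factory=dict)    # app configuration
    resources: List[dict] = field(default_factory=list)     # Kubernetes resource manifests
    data_volumes: List[str] = field(default_factory=list)   # references to persistent data

    def missing_parts(self) -> List[str]:
        """Return the names of the parts that were not captured."""
        parts = {
            "config": self.config,
            "resources": self.resources,
            "data": self.data_volumes,
        }
        return [name for name, value in parts.items() if not value]


if __name__ == "__main__":
    # A disk-level backup that only captured the volume, with no app context.
    disk_only = AppBackup(app="cart", namespace="shop", data_volumes=["pvc-cart-db"])
    print(disk_only.missing_parts())
```

A backup that reports missing parts here is the kind that targets disks or VMs without “understanding” the application that owns them.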

Disaster recovery (DR) is the last one of the three, but definitely not the least important. First, DR needs to “understand” the context of Kubernetes backups, just like backup and restore. It can also have different levels of both RTO and RPO, with different levels of protection to match. For example, there is the strict zero RPO requirement, under which no data loss at all is tolerated, and there is the 15-minute RPO, with somewhat less strict requirements. It’s also not uncommon for different apps to have completely different RTO and RPO within the same cluster.
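To make the RPO figures more concrete, here is a minimal sketch that assumes backups run at a fixed interval, so the worst-case data loss is roughly one interval. The function names are hypothetical and the model ignores how long the backup itself takes.

```python
from datetime import timedelta


def worst_case_data_loss(backup_interval: timedelta) -> timedelta:
    """A failure just before the next backup loses everything written
    since the previous one, i.e. up to one full interval."""
    return backup_interval


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """True if the schedule's worst-case loss fits within the RPO."""
    return worst_case_data_loss(backup_interval) <= rpo


if __name__ == "__main__":
    fifteen_minutes = timedelta(minutes=15)
    print(meets_rpo(timedelta(minutes=10), fifteen_minutes))   # True
    print(meets_rpo(timedelta(hours=1), fifteen_minutes))      # False
    print(meets_rpo(timedelta(minutes=10), timedelta(0)))      # False
```

Under this simple model no periodic schedule can satisfy a zero RPO, which is why that requirement usually calls for replication in the data path rather than for more frequent backups.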

Another important distinction of a Kubernetes-specific disaster recovery system is that it should also be able to work with metadata to some extent (labels, app replicas, etc.). Without this capability, a recovery can easily end up disjointed, leading to data loss or additional downtime.

More examples of the Kubernetes backup solution market

In the context of these three important features, let’s look at a few examples of Kubernetes backup and recovery solutions. The examples we use here are Kasten, Portworx, Cohesity, OpenEBS and Rancher Longhorn.

Kasten (recently acquired by Veeam) is a backup and restore solution that also takes pride in its mobility and disaster recovery systems. Local high availability is not available with it, since Kasten does not directly support replication within a single cluster and relies on the underlying data storage systems instead. Disaster recovery is also only partially “there”, since Kasten can’t cover zero RPO cases due to the lack of a data path component. Also noteworthy is the fact that Kasten’s backups are asynchronous only, which typically means an additional gap between operations.

Portworx is a data management company that develops a cloud-based storage platform to manage and access the database for Kubernetes projects. It is another example of a data management solution, and despite its limitations as such, one of the key benefits of using Portworx is the high availability of data. Backup and recovery operations, an understanding of Kubernetes apps, local high availability and disaster recovery, among other features, make Portworx a good solution for Kubernetes backup if you’re looking for one that specializes in Kubernetes-related tasks.

Cohesity is a relatively popular competitor in the field of general backup and recovery, but its Kubernetes-related capabilities still have some room to grow. Kubernetes is a relatively new addition for Cohesity, and although the “understanding” of Kubernetes apps was there from the get-go, it only works for all of the applications within a namespace at once: you can’t protect specific apps within that namespace. The other two mentioned features, disaster recovery and backup and restore, are not yet available for Kubernetes from Cohesity.

OpenEBS is another example of a solution that has managed to achieve some results with only one of the three features from our list, and in this case it’s all about local high availability. At the same time, OpenEBS can also integrate with Velero, creating a combined Kubernetes solution that excels in Kubernetes data migration. OpenEBS on its own can only back up individual applications (the direct opposite of what Cohesity does). However, this may not cover every user’s needs, since some users might need namespace-level backups in specific use cases.

Rancher Longhorn is the last of our examples, and probably the least known of them all. Its community is relatively small for an open-source solution, and it does not allow for complete Kubernetes backups with the metadata and resources needed for app-aware recovery. However, it has one unique feature that stands out, called DR Volume. A DR Volume can be set up as both a source and a destination, making the volume active in a new cluster based on the latest backed-up data.

As is clear from this blog, the topic of Kubernetes is still relatively new, and the market is still trying to catch up to the full list of features that any Kubernetes-based system demands from the get-go. The entire nature of Kubernetes makes apps into a very different animal from what they were before, and this brings us to the current list of solutions that excel in one thing and struggle to catch up in others.

Clearly, Kubernetes is a rapidly growing technology area, so it’s safe to say that there will soon be more solutions coming along, with the current ones likely becoming even better than they are right now. One example of a new, powerful Kubernetes solution is Bacula Enterprise.

Bacula Enterprise as a Kubernetes backup solution

The very nature of Kubernetes environments makes them at once very dynamic and potentially complex. Backing up a Kubernetes cluster should not add unnecessarily to that complexity. And of course it is usually important, if not critical, for System Administrators and other IT personnel to have centralized control over the complete backup and recovery system of the entire organization, including any Kubernetes environments. In this way, factors such as compliance, manageability, speed, efficiency and business continuity become much more realistic. At the same time, the agile approach of development teams should not be compromised in any way.

Bacula Enterprise is unique in this space because it is a comprehensive enterprise solution for complete IT environments (not just Kubernetes) that also offers natively integrated Kubernetes backup and recovery, including multiple clusters, whether the applications or data reside outside or inside a specific cluster. Every company’s Operations Department recognizes the need to have a proper recovery strategy when it comes to cluster recovery, upgrades and other situations. A cluster that is in an unrecoverable state can be reverted to a stable state with Bacula if both the configuration files and the persistent volumes of the cluster were backed up correctly beforehand.

Another way of showing Bacula’s working methods is by using the picture below:

[Figure: Bacula Enterprise Kubernetes module schematic]

One of the prime advantages of Bacula’s Kubernetes module is the ability to back up various Kubernetes resources, including:

  • Pods;
  • Services;
  • Deployments;
  • Persistent volumes.

Features of Bacula Enterprise’s Kubernetes module

The solution itself is not a part of the Kubernetes environment. Instead, it accesses the relevant data inside the cluster via Bacula pods that are attached to single Kubernetes nodes in a cluster. The deployment of these pods is automatic and works on an “as needed” basis.

Some other features that the Kubernetes backup module provides are:

  • Kubernetes backup and restore for persistent volumes;
  • Restoration of a single Kubernetes configuration resource;
  • The ability to restore configuration files and/or data from persistent volumes to the local directory;
  • The ability to back up the resource configuration of Kubernetes clusters.

It’s also worth noting that Bacula readily supports multiple cloud storage platforms simultaneously, including the likes of AWS, Google, Glacier, Oracle Cloud and Azure, at the level of native integration. Hybrid cloud capabilities are thus built in, including advanced cloud management and automated cloud caching features, allowing for an easy integration of either public or private cloud services to support various tasks.

Solution flexibility is particularly important nowadays, with a lot of companies and enterprises becoming ever more complex in terms of different hypervisor families and containers. At the same time, this significantly raises the demand for flexibility from vendors. Bacula’s capabilities in this regard are extensive, combining its broad compatibility list with various technologies to reach especially high flexibility standards without locking you in to one vendor.

The complexity of an organization’s IT operations keeps rising, and more often than not it is easier and more cost-efficient to use one solution for the entire IT environment rather than several solutions at once. Bacula is designed to do exactly this, and is also able to provide both a traditional web-based interface for your configuration needs and the classic command-line type of control. These two interfaces can even be used simultaneously.

Bacula’s Kubernetes backup plugin allows for two main target types for restore operations:

  • Restore to a local directory;
  • Restore to cluster.

Regular and/or automated backups are highly recommended to ensure the best possible backup and recovery environment for containers. Testing your backups from time to time should be mandatory for your System Administrator as well. In the next picture, you’ll see a part of the restoration process, namely the Restore Selection part, in which you can choose which files and/or directories you want to restore:

[Figure: Restore selection area]

Another part of the restoration process that you’ll encounter is the advanced restore options page, which looks like this:

[Figure: Advanced restore options]

Here you can specify multiple different options, such as output format, K8S config file path, endpoint port, and more.

It’s also easy to follow the entire restoration process once the customization is complete, thanks to the restore job log page, which records every action one by one:

[Figure: Restore job log]

Another important capability of the Kubernetes module is the Plugin Listing feature, which offers plenty of useful information about your available Kubernetes resources, including namespaces, persistent volumes, and so on. To do that, the module uses a special .ls command with a specific plugin=<plugin> parameter.

Bacula’s Kubernetes module offers a variety of features, some of which are:

  • Fast and efficient cluster resource redeployment;
  • Kubernetes cluster state safeguarding;
  • Saving configurations to be used in other operations;
  • Keeping amended configurations as secure as possible and restoring the exact same state as before.

Although it is common practice, paying your vendor based on data volume is best avoided. It makes no sense to be held to ransom now or in the future by a provider that is ready to take advantage of your organization in this way. Instead, take a close look at Bacula Systems’ licensing models, which remove customers’ exposure to data growth charges while making it far easier for customers’ procurement departments to forecast future costs, too. This more reasonable approach from Bacula comes from its open source roots and resonates well in a DevOps environment.

Bacula Enterprise and Docker Container Environments

There’s a lot that Bacula Enterprise aims to offer in the container backup and recovery department. For example, Docker container environments are also covered by Bacula, using a specific Docker module that allows for the backup of multiple high-level resources such as services, deployments, replication controllers, external volumes, and so on.

The module itself is integrated with the Docker API, making both backup and restore for containers far easier than before, without the need to install agents inside each and every container beforehand.

The entire container image is saved using the Docker module, including both read-only and writable layers at once. Docker volumes that belong to a container can also be backed up using this module.

Basically, the Docker module contacts the Docker service via the API to read and save all of the contents of any requested system image, with the ability to dump it via that same API afterwards. This method is much faster than the one that uses agents on each of the Docker containers, and it also consumes fewer resources. There’s no need to create snapshots, either, so no additional free space is required for this feature to work properly.

Velero & Bacula Enterprise: What’s the difference?

That’s not to say there are no other solutions on the market, both premium and free-of-charge. Velero is one example.

Velero (previously called Heptio Ark) is a free open-source backup and restore solution that mainly focuses on working with Kubernetes clusters and persistent volumes. It can work with a number of different cloud platforms via specific plugins, and you can choose whether to run it on premises or within the public cloud platform of your choosing.

The three main use cases for Velero are:

  • Production cluster replication for the purpose of testing or development;
  • General backup and restore capabilities for Kubernetes clusters;
  • Cluster migration.

Velero consists of two main parts: a server that runs within your cluster and a local command-line client for your operating needs. It is also rather unusual in the way it works with Kubernetes clusters.

Velero uses the Kubernetes API to capture the specific state of clusters and to perform the restoration process when necessary. This is different from what the majority of other solutions do: they access the Kubernetes etcd database directly and interact with the data in question through that (Bacula Pods is one such example). The advantages of doing everything via the API are as follows:

  • Even if the resources that are exposed via the API are stored in a separate database, they can still be quickly and efficiently backed up and/or restored;
  • Backups can be selective, capturing specific subsets of a cluster’s resources, filtered by resource type, namespace, etc. This provides much more flexibility regarding the data that you want to back up;
  • It’s not a rare occurrence for users of managed Kubernetes offerings to have no access to the underlying etcd database, which makes direct backups and restores basically impossible and forces them to use various workarounds.
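As an illustration of the selective backups mentioned in the list above, here is a small sketch that assembles a `velero backup create` command line with namespace, resource type and label filters. The helper function and the sample names (`shop-nightly`, `shop`, `app=cart`) are hypothetical; the flags are the ones Velero’s CLI documents at the time of writing, so check `velero backup create --help` for your version before relying on them.

```python
import shlex
from typing import List, Optional, Sequence


def velero_backup_command(
    name: str,
    namespaces: Sequence[str] = (),
    resources: Sequence[str] = (),
    selector: Optional[str] = None,
) -> List[str]:
    """Build (but do not run) a Velero CLI invocation for a filtered backup."""
    cmd = ["velero", "backup", "create", name]
    if namespaces:
        cmd += ["--include-namespaces", ",".join(namespaces)]
    if resources:
        cmd += ["--include-resources", ",".join(resources)]
    if selector:
        cmd += ["--selector", selector]
    return cmd


if __name__ == "__main__":
    cmd = velero_backup_command(
        "shop-nightly",
        namespaces=["shop"],
        resources=["deployments", "persistentvolumeclaims"],
        selector="app=cart",
    )
    # Print the command for review instead of executing it.
    print(shlex.join(cmd))
```

Building the argument list separately from running it keeps the example safe to try on a machine without a cluster; passing the list to `subprocess.run` would be the natural next step on a host where the Velero client is installed.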

When it comes to a direct comparison between Velero and Bacula, it’s safe to say that each has its own advantages.

Bacula is much more comprehensive as a broad, enterprise backup and recovery solution, and offers the especially wide range of features and technologies that you would expect from a heavy-lifting, enterprise-grade product. As a result, Bacula offers a complete single-platform backup solution for medium and large enterprises. Bacula also has BWeb, a comprehensive web interface to the many features that it provides. Bacula is probably the solution an IT Director would choose when he or she needs to back up complex, changing IT environments using a single, modern platform.

Velero, on the other hand, is specific in the sense that it doesn’t try to cover every aspect of backing up all applications, data and storage types, but instead focuses only on working with Kubernetes. Some users might find that more attractive than an all-in-one solution. Then there’s also the unusual approach that Velero takes to working with data and backups, namely via the API. And last, but definitely not least, it’s free and open source. For all of its advantages, Bacula is designed to be a high-end solution for medium and large enterprises, and that, of course, is not representative of all users of Kubernetes.

About the author

Rob Morrison is the marketing director at Bacula Systems. He started his IT marketing career with Silicon Graphics in Switzerland, performing strongly in various marketing management roles for almost 10 years. In the next 10 years Rob also held various marketing management positions at JBoss, Red Hat and Pentaho, ensuring market share growth for these well-known companies. He is a graduate of Plymouth University, holds an Honours degree in Digital Media and Communications, and completed an Overseas Studies Program.