Contents
- What is PostgreSQL and why does it need to be backed up?
- What methods of backup are built-in to Postgres?
- SQL Dump
- File System Level Backup
- Continuous Archiving PostgreSQL Backup
- What products exist on the PostgreSQL backup software market, and what they can provide?
- Bacula Enterprise capabilities in PostgreSQL backup and restore
What is PostgreSQL and why does it need to be backed up?
PostgreSQL is a plugin designed to simplify backup and restore procedures. There are many reasons why your databases might be at risk. For example, device failure (which can lead to all instances becoming unavailable), data corruption (which can also spread across replicas), software error, human error, ransomware or malware attack, and compliance requirements (such as data retention periods and storage of data).
If you are administering a database, it’s critical to keep regular backups so you don’t risk losing valuable information and hours of work. Having a solid database backup and recovery strategy is a crucial part of data management for any operation.
A plugin like PostgreSQL backup can also help you avoid costly failures that could be grounds for legal action. If something happens that disrupts your database (due to innocent mistakes or malicious attacks), you can restore your data and have it running again quickly. This article will explain how to easily backup and restore Postgres databases.
What methods of backup are built-in to Postgres?
PostgreSQL can help admins protect data integrity, developers build more resilient and secure environments, and programmers create new applications. The plugin supports three backup techniques: SQL dump, file system level backup, and continuous archiving. Let’s go through each in more detail.
SQL Dump
The goal of this dump method is to generate a file with SQL commands that can be fed back to the server. This information can help it recreate the database in the same state it was when the dump was made. You can run this procedure from any remote host, as long as it has access to the database (you will need to have read access to all tables you want to back up, so it’s best to run the dump as a database superuser). Dumps created using PostgreSQL backup are internally consistent and represent a snapshot of the database as it was when the dump began.
The basic command for creating a dump is:
pg_dump dbname > dumpfile
And for restoring one:
psql dbname < dumpfile
One advantage of this backup method is that you can generally reupload this output into new versions of PostgreSQL and transfer a database to a different machine architecture (for example, from a 32-bit to a 64-bit server).
File System Level Backup
You can also backup and restore a PostgreSQL database by copying the files directly. For example, by using the command:
tar -cf backup.tar /usr/local/pgsql/data
This method is considered slightly more impractical or inferior compared to a dump, though, because:
- The database server needs to be shut down to get a usable backup. You can’t just disallow connections; you need to stop the server altogether. The reason is that tools like tar take a snapshot of not only the state of the file system but also of the internal buffering).
- You cannot back up or restore only selected individual tables or databases from their directories and files because the information they contain can’t be used without their respective commit log files.
File system level backups only work for complete database cluster backup and restoration. You need to create a ‘consistent snapshot’ or ‘frozen snapshot’ of the entire volume containing the database – then copy the whole data directory to a backup device and release the frozen snapshot.
This backup procedure can be done while the server is running; just be aware the backup will save the database files in a state as if the server was not properly shut down (you can perform a checkpoint right before taking the snapshot to reduce the time to recovery). Keep in mind also that a file system backup will probably be larger in size than an SQL dump.
Continuous Archiving PostgreSQL Backup
PostgreSQL maintains a continuous write ahead log in the pg_wal/ subdirectory at all times, recording every change made to the data files in the database. The goal of this log is to make sure that, in the case of a system crash, the database can be restored to consistency. The way PostgreSQL does this is by ‘replaying’ the log entries that were made since the last checkpoint.
This backup technique can be combined with the file system level backup one to backup the WAL files. Then, if you need to make a recovery, you can restore the file system and then replay from the WAL files to bring it to the current state. The log replay will correct any internal inconsistencies because you can combine a long sequence of WAL files and achieve continuous backup by archiving the WAL files. Additionally, if you feed these series of files to another machine that has the same base backup file, you can have a warm standby system with a nearly-current copy of the database.
Just keep in mind this backup technique can only restore entire database clusters and not subsets. It also uses a lot of archival storage because it archives not just the base database but also all the WAL traffic. However, many people prefer this manner due to its higher reliability.
What products exist on the PostgreSQL backup software market, and what they can provide?
PostgreSQL is an incredibly popular database management system (Stack Overflow actually ranked it second ahead of Microsoft SQL Server, Mongo, and Oracle).
Due to its capabilities and increasing use, several companies offer this system as part of their service pack. Some of these include:
This simple and convenient tool handles disaster recovery and data storage effectively. It also allows users to create different strategies according to various jobs. It can cover a multitude of tasks, including backup modernization, ransomware protection, and cloud acceleration. It works with a variety of different systems, from the cloud and virtual environments to physical storage locations and apps. It can work with specific backup targets and it can also be a centralized platform that manages all of your data no matter what type it is. However, customers have reported long response times from the support team, and additional thought needs to be given when security is paramount.
Dell EMC Data Protection is one of the few legitimately famous PostgreSQL backup tools on the market. You can use their Data Protection central as a consolidated dashboard for the overall health of all your backups using Networker and Avamar. Dell EMC offers multiple different data protection solutions, as well as a multitude of physical appliances. It is a modern data protection solution that is easy to use, and it also provides a secure infrastructure for your data to work with in the face of today’s many kinds of potential cyber threats. The product, however, is considered slightly antiquated, hasn’t evolved much in recent years and as a result, has its limitations.
Commvault lets you restore data through a centrally controlled management console, eliminating the need for admins to work in the field. It can simplify backup systems, no matter if they are in the cloud or on-premise, by offering a centralized dashboard for backup monitoring and management purposes. Their “Edge Drive” feature also allows users to use the system as a private cloud. Some customers, however, have reported severe issues with underlying infrastructure problems. For example, a power outage may cause corruption of deduplication databases.
Percona offers a collection of tools to help you manage your PostgreSQL database system by installing PostgreSQL and complementing it with extensions to solve essential practical tasks and allow client programs to send and receive queries to and from the PostgreSQL backend server. It is free to use and can transfer backups to and from cloud storage locations, as well as regular physical locations. It supports Azure, S3, AWS, and GCP. Unfortunately, it has limited integration with other technologies and lacks any kind of Windows application whatsoever.
Veritas NetBackup is an enterprise data protection tool that addresses legacy systems and allows you to expand your existing automation, governance, and management processes to the EDB Postgres Advanced Server and PostgreSQL databases. It performs backups by taking snapshots of configured PostgreSQL databases – which is different from the usual process of dumping the entire database into an external file and backing that file up. This kind of approach should theoretically make backup and recovery processes more effective as a whole, although compatibility with some technologies is limited and prices can be very high.
MicroFocus is one of the world’s largest enterprise software providers, offering several backup options for SQL Server, Oracle, and PostgreSQL. It offers advanced cyber resiliency options, and improves core business applications, offering a number of different solutions for a variety of purposes. Their cyber resiliency solution helps with evolving your security system by detecting and responding to various threats to your data, while also protecting data, applications, and even identities using a variety of methods. They can also make recommendations for backing up databases used by server configurations.
Cohesity is a fast and feature-rich PostgreSQL backup software that simplifies and streamlines data management across distributed and relational databases. This option is commonly chosen by enterprises with large, expanding data volumes. Cohesity uses a Remote Adapter feature to back up PostgreSQL DBs and WAL logs. It is a backup and recovery solution that is simple to use – and it also has features such as backup automation, fast granular and mass data restore, extensive data encryption, data deduplication, and it can also come in several different forms – as a software subscription or as a Backup-as-a-Service. Support and security levels could be better, though.
Rubrik provides protection against hackers and can help companies recover quickly from attacks. The platform supports app awareness for Oracle, SAP HANA, PostgreSQL, Microsoft SQL Server, MySQL, and NoSQL. It offers fast recovery times, quick search through your backups, backup automation capabilities, being able to save your backups to a cloud storage provider of your choice, and more. It can prevent your data from being encrypted by ransomware, can act as a centralized hub in terms of data protection throughout your entire system, etc.However, there are still a number of technologies that Rubrik cannot back up as well as some weaknesses in its reporting capabilities..
One of the biggest draws of N-able Cove Data Protector on this list of PostgreSQL backup solutions is its centralized dashboard which acts as a great helper when it comes to tracking and protecting all of your data throughout the system. It may not have a dedicated PostgreSQL backup functionality, but it does offer easier ways for the software to create backups of SQL Server, Oracle or MySQL databases. Replication and data migration are two of many tasks that N-able Cove Data Protection can do when it comes to PostgreSQL databases – it also has data encryption, granular backup/recovery, and so on.
Ottomatik is a PostgreSQL backup and recovery tool built specifically for automatization purposes. Its main goal is to make all of the time-consuming backup operations easier to manage and monitor. It has a rather user-friendly interface, making it accessible to any user, even the ones without a technical background. The recovery processes are just as simple, but with a lot of customization, if necessary. Ottomatik also supports a variety of backup storage options – Google Cloud, Dropbox, S3, and even Ottomatik’s own cloud servers. It is a great tool for backup management in small and medium-sized companies that work with PostgreSQL databases.
NetVault Plus is a versatile enterprise backup solution that offers a lot of scalability to its users. It supports a variety of virtual and physical environments that it can create backups of – covering physical servers, cloud servers, NAS devices, and even databases (MySQL, Oracle, SQL Server, etc.). It also offers vast deduplication and encryption capabilities, as well as a dedicated plugin specifically for PostgreSQL. This plugin offers both full and incremental backup types, as well as the capability to back up only data, only configuration files, or even just global objects. It also offers several backup formats, has a user-friendly web UI, and is capable of scheduling operations.
Iperuis Backup is a relatively unknown backup solution that was originally created for Windows systems with a rather small list of supported data to be backed up. However, it managed to experience a tremendous growth, and now it supports a wide variety of data sources that you can protect, from disk images to PostgreSQL databases. It does not have a limitation on how many target locations you can back up within a single license, and it supports a lot of other database types – MariaDB, SQL Server, Oracle, MySQL, and more. It also has data encryption, data compression and a wide selection of backup storage locations, such as NAS, S3, Google Drive, FTP server, etc.
IBM Spectrum Protect is a full-scale data protection solution that aims to simplify data protection as a process, no matter if the said data is hosted in a physical or virtual location, or even in a cloud environment. It can make sure that your data is well-protected by offering simplified backup administration, extensive scalability, great general efficiency of the solution as a whole, and a host of other capabilities that are more advanced and case-specific. It is also capable of working with virtual machines via IBM Spectrum Protect Plus, offering superior VM coverage in terms of data protection.
AMANDA is an acronym for Advanced Maryland Automatic Network Disk Archiver, it is a straightforward backup system that lets the LAN administrator to create a backup server so that it could create backups of multiple hosts to either a disk drive or a tape drive. AMANDA is a rather versatile backup solution that uses native tools such as dump and GNUtar to create backups of a variety of workstations that run either Windows, Unix or Mac OS X. It can also be used to create backups of PostgreSQL databases, as well as a variety of other features, but the knowledge required to perform such operations is significantly higher than that of a regular computer professional.
SimpleBackups is a service and website that offers database backup automation capabilities. It was built to work with cloud storage specifically, and it is rather user-friendly in both its initial setup and day-to-day usage. It also has the ability to extend its functionality via third-party addons and plugins. The service itself can create backups of entire databases while excluding specific tables, and the backup process itself is extremely easy to set up with no bash scripts required whatsoever. SimpleBackups also protects all backups with AES-256 encryption and is a generally great solution for most small to middle-sized companies out there.
Barman is one of several different PostgreSQL backup tools that are completely open-source and thus highly customizable. Barman itself offers data backup and restore features for PostgreSQL databases, while also offering disaster recovery features and a high level of business continuity. It can perform incremental backups, supports hot backups via Rsync, is capable of performing backups remotely, and even has the ability to continuously stream WAL files to a backup location as the means of creating backups for this specific part of a database (Write-Ahead Logging files).
As a PostgreSQL backup solution with a rather telling name, SQLBackupAndFTP is a backup solution that focuses solely on working with different forms of SQL. It has everything you would expect from this kind of backup solution by this point – support for MySQL, PostgreSQL, and SQL Server database backups and the capability to send backups to a vast variety of locations such as FTP, NAS, local storage, network storage, Backblaze B2, Yandex.Disk, etc. There are also customizable notifications, easy backup restoration processes, data encryption, data compression, and so on.
Handy Backup is a multifunctional tool that covers a large number of different backup targets, from regular customer-level backup for Windows folders, pictures, and videos to backing up websites and databases such as PostgreSQL. It supports a lot of different backup storage locations, including cloud, NAS, FTP, external drives, local drives, etc. It also has backup scheduling, different backup types, data encryption, activity logging, detailed notifications, and more. As for PostgreSQL specifically, it helps with database mirroring and cloning, can save the data within the database without stopping it, and has many other options available.
Adding another free open-source solution to the list, Bacula Community is a free community-driven version of the existing Bacula Enterprise solution that has a lot of useful functionality completely free of charge – including PostgreSQL backup capabilities. It does require a moderate level of knowledge in the field to be able to operate properly, just like most open-source solutions, but in return you get a multifunctional, high performance, deep backup solution at zero cost.
Bacula Enterprise capabilities in PostgreSQL backup and restore
Bacula Enterprise is a subscription-based enterprise data backup solution that helps data centers handle high data volume at a very low cost. Customers have rated the company consistently high, as it’s an extremely customizable enterprise-grade backup solution with solid performance and excellent scalability. Bacula offers especially high levels of security, being relied on by some of the largest western military organizations and, also crucial for larger organizations, very high performance at scale.
Bacula offers several advantages over its competitors: It can work with physical, virtual, container, and Cloud environments, snapshots, advanced deduplication, restores, and data verification – all in one platform. You can easily use PostgreSQL with Bacula Enterprise.
You can choose between two major techniques to backup your Postgres database with Bacula Enterprise: PITR (point-in-time recovery) or Dump. Configuring the plugin is straightforward.
First, you will specify the Plugin Directory directive in the FileDaemon resource of the bacula-fd.conf file.
To configure PITR, you will need to enable WAL Archiving to use the Point In Time Recovery feature of PostgreSQL. The configuration varies slightly depending on your cluster version (you will have to either configure the archive_command and archive_mode, or the archive_command, wal_level, and archive_mode). You also need to enable the Accurate Option in your Job resource because the plugin uses Accurate mode information to handle Differential backups.
The plugin can perform the following actions in PITR mode:
- Full Backup: Backup the entire data directory and all WAL files.
- Incremental Backup: Force the switch of the current WAL and backup all WAL files that were generated since the last backup.
- Differential Backup: Backup data files that changed since the last Full backup and WAL files generated since the last backup.
You won’t be able to perform Incremental Backups or Differential Backups when using the Dump mode, but the plugin will detect all databases of your server.
You can also restore users and roles to your PostgreSQL cluster (just select the roles.sql file), a single database (select the database directory in the restore command, which should contain the data file and the database creation script), as well as Dump files to a Directory, restore a single table, or a complete cluster using PITR. In this case, you will need to:
- Stop the server if it’s running.
- Copy the whole cluster data directory and any tablespaces to a temporary location (as a precaution).
- Clean out all existing files and subdirectories.
- Restore the database files from your backup dump.
- Remove files present in pg_xlog, copy the unarchived WAL segment file into the file.
- Edit the recovery command file in the cluster data directory and rename it as recovery.conf.
- Start the server.
Bacula uses a connection service that allows PostgreSQL parameters to be associated with a single service name – which can be specified by a PostgreSQL connection and its associated settings. You can test database access using the estimate command to make sure the plugin is well configured.
The PostgreSQL plugin is available as a Bacula Enterprise package for all supported platforms and supports PostgreSQL versions 8.4, 9.0, 9.1, 9.2, 9.3, 9.4, and 9.5. You just need to install it on the Client where your PostgreSQL cluster resides.
With Bacula, you can backup and restore PostgreSQL databases fast and without the need for complex scripts. The software takes care of all the essential information while supporting dump and Point in Time Recovery (or PITR) backup and restore techniques, including incremental PostgreSQL DB backup and differential PostgreSQL database backup.
Getting started with Bacula Enterprise is easy; just fill in the form and download the trial. Bacula has very high security levels, has no data volume cost, a subscription licensing model, and no vendor lock-in. The company includes not just the best data backup software but also support, patches, and updates.