- What is PostgreSQL and why does it need to be backed up?
- What methods of backup are built-in to Postgres?
- SQL Dump
- File System Level Backup
- Continuous Archiving PostgreSQL Backup
- What products exist on the PostgreSQL backup market, and what they can provide?
- Bacula Enterprise capabilities in Postgres backup and restore
What is PostgreSQL and why does it need to be backed up?
PostgreSQL is a plugin designed to simplify backup and restore procedures. There are many reasons why your databases might be at risk. For example, device failure (which can lead to all instances becoming unavailable), data corruption (which can also spread across replicas), software error, human error, ransomware or malware attack, and compliance requirements (such as data retention periods and storage of data).
If you are administering a database, it’s critical to keep regular backups so you don’t risk losing valuable information and hours of work. Having a solid database backup and recovery strategy is a crucial part of data management for any operation.
A plugin like PostgreSQL backup can also help you avoid costly failures that could be grounds for legal action. If something happens that disrupts your database (due to innocent mistakes or malicious attacks), you can restore your data and have it running again quickly. This article will explain how to easily backup and restore Postgres databases.
What methods of backup are built-in to Postgres?
PostgreSQL can help admins protect data integrity, developers build more resilient and secure environments, and programmers create new applications. The plugin supports three backup techniques: SQL dump, file system level backup, and continuous archiving. Let’s go through each in more detail.
The goal of this dump method is to generate a file with SQL commands that can be fed back to the server. This information can help it recreate the database in the same state it was when the dump was made. You can run this procedure from any remote host, as long as it has access to the database (you will need to have read access to all tables you want to back up, so it’s best to run the dump as a database superuser). Dumps created using PostgreSQL backup are internally consistent and represent a snapshot of the database as it was when the dump began.
The basic command for creating a dump is:
pg_dump dbname > dumpfile
And for restoring one:
psql dbname < dumpfile
One advantage of this backup method is that you can generally reupload this output into new versions of PostgreSQL and transfer a database to a different machine architecture (for example, from a 32-bit to a 64-bit server).
File System Level Backup
You can also backup and restore a PostgreSQL database by copying the files directly. For example, by using the command:
tar -cf backup.tar /usr/local/pgsql/data
This method is considered slightly more impractical or inferior compared to a dump, though, because:
- The database server needs to be shut down to get a usable backup. You can’t just disallow connections; you need to stop the server altogether. The reason is that tools like tar take a snapshot of not only the state of the file system but also of the internal buffering).
- You cannot back up or restore only selected individual tables or databases from their directories and files because the information they contain can’t be used without their respective commit log files.
File system level backups only work for complete database cluster backup and restoration. You need to create a ‘consistent snapshot’ or ‘frozen snapshot’ of the entire volume containing the database – then copy the whole data directory to a backup device and release the frozen snapshot.
This backup procedure can be done while the server is running; just be aware the backup will save the database files in a state as if the server was not properly shut down (you can perform a checkpoint right before taking the snapshot to reduce the time to recovery). Keep in mind also that a file system backup will probably be larger in size than an SQL dump.
Continuous Archiving PostgreSQL Backup
PostgreSQL maintains a continuous write ahead log in the pg_wal/ subdirectory at all times, recording every change made to the data files in the database. The goal of this log is to make sure that, in the case of a system crash, the database can be restored to consistency. The way PostgreSQL does this is by ‘replaying’ the log entries that were made since the last checkpoint.
This backup technique can be combined with the file system level backup one to backup the WAL files. Then, if you need to make a recovery, you can restore the file system and then replay from the WAL files to bring it to the current state. The log replay will correct any internal inconsistencies because you can combine a long sequence of WAL files and achieve continuous backup by archiving the WAL files. Additionally, if you feed these series of files to another machine that has the same base backup file, you can have a warm standby system with a nearly-current copy of the database.
Just keep in mind this backup technique can only restore entire database clusters and not subsets. It also uses a lot of archival storage because it archives not just the base database but also all the WAL traffic. However, many people prefer this manner due to its higher reliability.
What products exist on the PostgreSQL backup market, and what they can provide?
PostgreSQL is an incredibly popular database management system (Stack Overflow actually ranked it second ahead of Microsoft SQL Server, Mongo, and Oracle).
Due to its capabilities and increasing use, several companies offer this system as part of their service pack. Some of these include:
- Veeam: This simple and convenient tool handles disaster recovery and data storage effectively. It also allows users to create different strategies according to various jobs. However, customers have reported long response times from the support team.
- Dell Technologies: You can use their Data Protection central as a consolidated dashboard for the overall health of all your backups using Networker and Avamar. The product, however, is considered slightly antiquated and hasn’t evolved much in the last years.
- Commvault Backup and Recovery: Commvault lets you restore data through a centrally controlled management console, eliminating the need for admins to work in the field. Their “Edge Drive” feature also allows users to use the system as a private cloud. Some customers, however, have reported severe issues with underlaying infrastructure problems. For example, a power outage causes the corruption of deduplication databases.
- Percona PostgreSQL distribution: Percona offers a collection of tools to help you manage your PostgreSQL database system by installing PostgreSQL and complementing it with extensions to solve essential practical tasks and allow client programs to send and receive queries to and from the PostgreSQL backend server.
- Veritas: Veritas is an enterprise data protection tool that addresses legacy systems and allows you to expand your existing automation, governance, and management processes to the EDB Postgres Advanced Server and PostgreSQL databases.
- MicroFocus: MicroFocus is one of the world’s largest enterprise software providers, offering several backup options for SQL Server, Oracle, and PostgreSQL. They can also make recommendations for backing up databases used by server configurations.
- Cohesity: Cohesity simplifies and streamlines data management across distributed and relational databases. This option is commonly chosen by enterprises with large, expanding data volumes. Cohesity uses a Remote Adapter feature to backup PostgreSQL DBs and WAL logs.
- Rubrik: Rubrik provides protection against hackers and can help companies recover quickly from attacks. The platform supports app awareness for Oracle, SAP HANA, PostgreSQL, Microsoft SQL Server, MySQL, and NoSQL.
Bacula Enterprise capabilities in Postgres backup and restore
Bacula is a subscription-based enterprise data backup solution that helps data centres handle high data volume at a very low cost. Customers have rated the company consistently high, as it’s an extremely customizable enterprise-grade backup solution with solid performance and excellent scalability.
Bacula offers several advantages over its competitors: It can work with physical, virtual, container, and Cloud environments, snapshots, advanced deduplication, restores, and data verification – all in one platform. You can easily use PostgreSQL with Bacula Enterprise.
You can choose between two major techniques to backup your Postgres database with Bacula Enterprise: PITR (point-in-time recovery) or Dump. Configuring the plugin is straightforward.
First, you will specify the Plugin Directory directive in the FileDaemon resource of the bacula-fd.conf file.
To configure PITR, you will need to enable WAL Archiving to use the Point In Time Recovery feature of PostgreSQL. The configuration varies slightly depending on your cluster version (you will have to either configure the archive_command and archive_mode, or the archive_command, wal_level, and archive_mode). You also need to enable the Accurate Option in your Job resource because the plugin uses Accurate mode information to handle Differential backups.
The plugin can perform the following actions in PITR mode:
- Full Backup: Backup the entire data directory and all WAL files.
- Incremental Backup: Force the switch of the current WAL and backup all WAL files that were generated since the last backup.
- Differential Backup: Backup data files that changed since the last Full backup and WAL files generated since the last backup.
You won’t be able to perform Incremental Backups or Differential Backups when using the Dump mode, but the plugin will detect all databases of your server.
You can also restore users and roles to your PostgreSQL cluster (just select the roles.sql file), a single database (select the database directory in the restore command, which should contain the data file and the database creation script), as well as Dump files to a Directory, restore a single table, or a complete cluster using PITR. In this case, you will need to:
- Stop the server if it’s running.
- Copy the whole cluster data directory and any tablespaces to a temporary location (as a precaution).
- Clean out all existing files and subdirectories.
- Restore the database files from your backup dump.
- Remove files present in pg_xlog, copy the unarchived WAL segment file into the file.
- Edit the recovery command file in the cluster data directory and rename it as recovery.conf.
- Start the server.
Bacula uses a connection service that allows PostgreSQL parameters to be associated with a single service name – which can be specified by a PostgreSQL connection and its associated settings. You can test database access using the estimate command to make sure the plugin is well configured.
The PostgreSQL plugin is available as a Bacula Enterprise package for all supported platforms and supports PostgreSQL versions 8.4, 9.0, 9.1, 9.2, 9.3, 9.4, and 9.5. You just need to install it on the Client where your PostgreSQL cluster resides.
With Bacula, you can backup and restore PostgreSQL databases fast and without the need for complex scripts. The software takes care of all the essential information while supporting dump and Point in Time Recovery (or PITR) backup and restore techniques, including incremental PostgreSQL DB backup and differential PostgreSQL database backup.
Getting started with Bacula Enterprise is easy; just fill in the form and download the trial. Bacula has very high security levels, has no data volume cost, a subscription licensing model, and no vendor lock-in. The company includes not just the best data backup software but also a support, patches, and updates.