- 1 Testing backup and restore
- 2 Config files, version control and test before reload/restart
- 3 Test Tape Autochanger/Drives
- 3.1 Testing Tape Autochanger for the best performance
- 3.2 Testing a Single Device or Tape Autochanger
- 3.3 Performing a Writing/Reading Device Check
- 4 System Monitoring
1 Testing backup and restore
1.1 Verify your configuration
After creating a new job, we strongly recommend testing it: run the backup, then restore its data, to confirm that everything is configured correctly and can be documented in your disaster recovery procedures.
Important information can be gathered during this test:
◾ How long does the job take?
◾ What is the load on the client, the network and the Storage Daemon?
◾ Does the job run successfully?
◾ Can the data be restored?
1.2 How to test
To test your backup job, start it from bconsole with the run command and follow the on-screen options.
Once the job has finished successfully, use the restore command in the same console to restore its data.
Running a test backup and restore is important: it shows the impact on the Storage Daemon and the network, and it gives you the chance to ask for help if issues arise during such tests.
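A test run can also be scripted by piping commands into bconsole. The following is a minimal Python sketch, not an official procedure: the bconsole path `/opt/bacula/bin/bconsole` and the helper names `build_test_script` and `run_test_backup` are assumptions made for illustration, and the job name must be replaced with one of your own jobs.

```python
import subprocess

# Assumed install path (not given in this document); adjust to your system.
BCONSOLE = "/opt/bacula/bin/bconsole"


def build_test_script(job_name: str) -> str:
    """Return the bconsole commands for a non-interactive test run of one job."""
    commands = [
        f'run job="{job_name}" yes',  # start the job without the confirmation prompt
        "wait",                       # block until the running jobs have finished
        "messages",                   # print the pending job report
        "quit",
    ]
    return "\n".join(commands) + "\n"


def run_test_backup(job_name: str, bconsole: str = BCONSOLE) -> str:
    """Pipe the command script into bconsole and return everything it printed."""
    result = subprocess.run(
        [bconsole],
        input=build_test_script(job_name),
        capture_output=True,
        text=True,
        check=True,  # raise if bconsole itself exits with an error
    )
    return result.stdout
```

Calling `run_test_backup("DisasterRecovery-job")` would return the console output for later inspection. The restore step is deliberately left interactive here, since the file selection usually needs a human decision.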
2 Config files, version control and test before reload/restart
You will make many modifications to your configuration files in /opt/bacula/etc over time, so it is good practice to:
◾ include this folder in your backup policy (see chapter 3.5)
◾ and/or put this folder under a revision control system of your choice, e.g. CVS, SVN, Git or Mercurial
This protects the integrity of your configuration, especially if several system administrators are involved in the backup process.
2.2 Configuration Backup Job
Here is a simple FileSet/Job definition to do the configuration backup:
FileSet {
  Name = "DisasterRecovery-fs"
  Include {
    Options {
      Signature = MD5
    }
    File = "/opt/bacula/working/bacula.sql"  # where the Bacula catalog dump goes
    File = "/opt/bacula/etc"
    # you can add other files like keys, content of /etc to make this FileSet
    # more complete and adapted to your environment
  }
}
Job {
  Name = "DisasterRecovery-job"
  Type = "Backup"
  Client = "baculaServer"  # change to the name of the fd running on the Bacula DIR
  Fileset = "DisasterRecovery-fs"
  JobDefs = "Default-jd"
  Level = "Full"  # full backup is preferable
  Messages = "Standard"
  Pool = "DisasterRecovery-pool"  # the pool we just defined to hold all config and catalog dumps
  Priority = 10  # Adjust to your priorities so this job runs last, after all jobs of your backup window
  RunScript {
    Command = "/opt/bacula/scripts/make_catalog_backup bacula bacula"
    RunsOnClient = no
    RunsWhen = Before
  }
  RunScript {
    Command = "/opt/bacula/scripts/delete_catalog_backup"
    RunsOnClient = no
    RunsWhen = After
  }
  Schedule = "NightAfterJoba"
  Storage = "OnDisk"
  WriteBootstrap = "/opt/bacula/working/catalog-backup.bsr"  # very important, set this to be able to send it via email afterwards
}
2.3 Check for configuration errors
After each modification, always check the configuration to avoid issues when reloading or restarting the Bacula Director process:
# /opt/bacula/bin/bacula-dir -t -u bacula -g bacula
Then correct any errors that are displayed, or contact Support.
The Director will not start if there are parsing errors in your configuration. The same will happen if you use reload in bconsole while there are errors in your configuration. Always test your configuration after each modification.
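The check above can be wrapped so that a reload or restart is only attempted when the syntax test passes. This is a minimal Python sketch under one assumption: that a non-zero exit status of `bacula-dir -t` signals a configuration error. The helper name `config_is_valid` is hypothetical; the command itself is the one shown in this section.

```python
import subprocess
import sys
from typing import Sequence

# Syntax-check command from this section; adjust user and group to your setup.
CHECK_CMD = ["/opt/bacula/bin/bacula-dir", "-t", "-u", "bacula", "-g", "bacula"]


def config_is_valid(cmd: Sequence[str] = CHECK_CMD) -> bool:
    """Run the syntax check and report whether it exited with status 0.

    Assumption: a non-zero exit status means the configuration has errors.
    """
    result = subprocess.run(list(cmd), capture_output=True, text=True)
    if result.returncode != 0:
        # Show whatever the check printed so the errors can be corrected.
        print(result.stdout + result.stderr, file=sys.stderr)
    return result.returncode == 0
```

A deployment script could call `config_is_valid()` first and skip the reload, leaving the running Director untouched, whenever it returns `False`.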
3 Test Tape Autochanger/Drives
3.1 Testing Tape Autochanger for the best performance
Testing the tape autochanger is a very important task in every Bacula Enterprise Edition (BEE) setup and should be done before running any production backup because:
◾ it will identify any connectivity, hardware or configuration issues that prevent data from being backed up efficiently and safely
◾ it will ensure the best performance of Bacula Enterprise Edition with the tested tape autochanger
◾ it will confirm the best settings, so that no later changes are required that would make the overall Bacula Enterprise Edition configuration more complex.
3.2 Testing a Single Device or Tape Autochanger
Tests should be performed with the btape utility to verify the Autochanger configuration and that the mt and mtx commands run correctly. Preferably, the btape tests are done before going to production. Additionally, a special “speed” test writes raw data and random data to your tape device with your current configuration to measure the performance of the device.
3.3 Performing a Writing/Reading Device Check
Before running the btape tests, a working SD configuration that connects to the Tape Library must be in place. Please refer to the Main Manual for how to set this up.
The Storage Daemon connected to the Tape Library needs to be shut down before running the btape test, so that no backup jobs interfere. As an example, the first tape drive (LTO-drive0) will be used for this test.
Important Note: A blank tape that has never been used with third-party or legacy backup tools must be used for these tests.
The commands used below are examples only; you will need to adapt them to your current settings and environment.
service bacula-sd stop
Then, the status of the tape library should be reviewed:
mtx -f /dev/tape/by-serial/changer-device status
The tape drive to test, for example drive “0”, should be empty. If it is not, please unload it with mtx or your tape library interface. Then load a new tape (e.g. from slot 22 in the command below) into the tape drive that will be tested (drive “0”, index=0 in the Device resource in bacula-sd.conf or in the related BWeb resource):
mtx -f /dev/tape/by-serial/changer load 22 0
Then, btape should be run:
/opt/bacula/bin/btape -v -c /opt/bacula/etc/bacula-sd.conf LTO-drive0
where “LTO-drive0” is the name of the tape drive to test in your SD configuration, at index=0 in this example. Finally, the “Autochanger Test” inside btape should be started.
If this first test reports any error, it must be fixed. Once it is fixed, please continue with the next step, a speed check. The file_size parameter is important here, because the test must write a file bigger than the Maximum File Size defined in the Device resource.
Two directives in the device configuration can be fine-tuned to speed up writing to a tape drive.
Maximum File Size:
For LTO-5 tapes, a value between 8GiB and 12GiB is recommended, and between 8GiB and 24GiB for LTO-6. The bigger the file is, the faster the writing is, as an EOF mark has to be written less often. However, the bigger the file is, the longer it takes to restore a single file, as the drive has to read the whole tape file before extracting it.
Maximum Block Size:
The following values should be tested: 128K, 256K, 512K. The largest value should not be exceeded in order to avoid Kernel problems. For LTO-5 and LTO-6, 256K and 512K are usually appropriate.
◾ If the “Maximum Block Size” setting is lowered after media have been written with production data, those media will become incompatible and I/O errors will occur.
◾ Minimum Block Size should never be used, as it will just waste tape space. If you think you should use it, please contact us.
◾ Organize your tests in pairs of directives, for example “Maximum File Size = 8GiB; Maximum Block Size = 131072”, “Maximum File Size = 8GiB; Maximum Block Size = 262144”, etc. (8GiB-524288, 12GiB-131072, 12GiB-262144, 12GiB-524288, etc.)
◾ When the test plan is defined, both directives should be changed accordingly in the Device resource of the Storage Daemon configuration file for the chosen drive.
◾ Once modified, please run the btape “Speed” tests according to the Maximum File Size of your selected pair. After each test with a pair you will need to restart the btape utility to load the configuration changes. Please find this example with a Maximum File Size given in GiB:
Note: the file_size value should be bigger than the Maximum File Size directive.
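The pairs of directives described above can be enumerated mechanically, so that no combination is forgotten. The following Python sketch is illustrative only: the helper name `build_test_plan` is hypothetical, and the candidate values are the LTO-5 file sizes and the block sizes mentioned in this section.

```python
from itertools import product

FILE_SIZES_GIB = [8, 12]           # Maximum File Size candidates (LTO-5 range)
BLOCK_SIZES_KIB = [128, 256, 512]  # Maximum Block Size candidates


def build_test_plan(file_sizes_gib=FILE_SIZES_GIB, block_sizes_kib=BLOCK_SIZES_KIB):
    """Return every (Maximum File Size, Maximum Block Size in bytes) pair."""
    return [
        (f"{size}GiB", block * 1024)  # block size converted from KiB to bytes
        for size, block in product(file_sizes_gib, block_sizes_kib)
    ]


for file_size, block_size in build_test_plan():
    print(f"Maximum File Size = {file_size}; Maximum Block Size = {block_size}")
```

With the default values this lists six pairs, starting with `Maximum File Size = 8GiB; Maximum Block Size = 131072`. Each pair still has to be written into the Device resource by hand before btape is restarted.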
To find the “best pair” of settings for conditions close to a production environment, calculate the average throughput of the test results “Test with zero data and Bacula block structure” and the following “Test with random data, should give the minimum throughput” across all three single tests.
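One possible way to do this calculation is sketched below in Python. It rests on assumptions: the throughput figures are typed in by hand from the btape output, the zero-data and random-data averages are weighted equally, and the names `average_throughput` and `best_pair` are hypothetical. The numbers in the usage note are made up for illustration and are not measurements.

```python
from statistics import mean


def average_throughput(runs):
    """Average the throughput of one directive pair over its speed tests.

    runs: list of (zero_data_mb_s, random_data_mb_s) tuples, one per test.
    The two averages are weighted equally, which is an assumption.
    """
    zero_avg = mean(run[0] for run in runs)
    random_avg = mean(run[1] for run in runs)
    return mean([zero_avg, random_avg])


def best_pair(results):
    """Return the pair with the highest average throughput.

    results: dict mapping (file size, block size) to its list of runs.
    """
    return max(results, key=lambda pair: average_throughput(results[pair]))
```

For example, with the invented values `{("8GiB", 131072): [(100, 80), (110, 90), (120, 100)], ("8GiB", 262144): [(150, 100), (150, 100), (150, 100)]}`, `best_pair` returns `("8GiB", 262144)`.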
Please ﬁnd more details about the btape command in our Utility manual “Volume Utility Tools” in your download area or by contacting us.
If you have any questions, please write to Bacula Systems Support Team.
4 System Monitoring
4.1 Things to Monitor
As a central part of your system, Bacula services will report issues with, for example, your network, your disk space or the load on your database.
Two parallel tasks are needed to ensure correct backup operations:
◾ monitor space
◾ monitor jobs
You need to monitor space because, before several parallel jobs are launched, Bacula cannot check for you whether there is enough room on tape or disk to store them. You must ensure that the available space is adequate for storing the jobs, and also on the Director, as a full filesystem will prevent backups from running correctly.
For example, data spooling uses /opt/bacula/working by default, and so do the spooled attributes. If the partition is full, these jobs will not succeed. You might want to use a tool like Nagios to raise alarms when the pool space or the partition space for /opt is low. Also remember that filesystems filled above 80% are prone to performance issues.
Job monitoring is helpful when the Backup Administrator needs to decide what to do if a job fails or a warning occurs, as well as for sorting and prioritizing backup issues so that they can be resolved automatically in accordance with your environment. Joblogs and the Bacula log file can be parsed by a monitoring tool in order to re-run a job or take any other action you consider necessary (email, scripts, snapshots, rescheduling the job) if an error or a warning occurs.
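The space check on the working directory could be scripted along the following lines. This is a minimal Python sketch, not a replacement for a monitoring tool such as Nagios: the helper names `usage_ratio` and `needs_alarm` are hypothetical, and the 80% threshold is the figure mentioned above.

```python
import shutil


def usage_ratio(path: str) -> float:
    """Return the used fraction (0.0 to 1.0) of the filesystem holding path."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def needs_alarm(ratio: float, threshold: float = 0.80) -> bool:
    """Report whether the filesystem is filled above the given threshold."""
    return ratio > threshold
```

A cron job or a monitoring plugin could evaluate `needs_alarm(usage_ratio("/opt/bacula/working"))` and send a notification when it returns `True`.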
4.2 Example output
When you parse joblogs you can see:
Non-fatal FD errors: 0
SD Errors: 0
Non-fatal FD errors: 0 AND SD Errors: 0
Here everything went smoothly: there are no issues to report, and probably no action needs to be taken.
In the next case, you can see 1 non-fatal FD error and a “Backup OK -- with warnings” termination:
Non-fatal FD errors: 1
SD Errors: 0
FD termination status: OK
SD termination status: OK
Termination: Backup OK -- with warnings
Human intervention is needed on this job.
It should help to have a look at the joblog (llist joblog jobid=
time: 2016-12-16 12:24:24
logtext: debianserver-fd JobId 18: Could not stat "/home/bacula": ERR=No such file or directory
Here we can see that the /home/bacula directory specified in the Job’s FileSet does not exist on the Client’s filesystem, so it could not be backed up.
Check all your joblogs regularly to find potential problems. When a new kind of error appears, update your error-parsing scripts or software to take it into account and take the appropriate action. A job has completed correctly when its Termination is “Backup OK” and the non-fatal FD and SD error counts are both 0.
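The success criterion above can be checked automatically. The following Python sketch assumes the joblog text contains the summary fields exactly as they appear in the example output of this section; the helper names `parse_summary` and `job_is_clean` are hypothetical.

```python
import re

# Matches the three summary fields used by the success criterion.
SUMMARY_RE = re.compile(
    r"^[ \t]*(Non-fatal FD errors|SD Errors|Termination):[ \t]*(.+?)[ \t]*$",
    re.MULTILINE,
)


def parse_summary(joblog: str) -> dict:
    """Extract the summary fields from a joblog as a {field: value} dict."""
    return {field: value for field, value in SUMMARY_RE.findall(joblog)}


def job_is_clean(joblog: str) -> bool:
    """True only for 'Backup OK' with zero non-fatal FD and SD errors."""
    summary = parse_summary(joblog)
    return (
        summary.get("Termination") == "Backup OK"
        and summary.get("Non-fatal FD errors") == "0"
        and summary.get("SD Errors") == "0"
    )
```

A job that ends with warnings, like the second example above, makes `job_is_clean` return `False`, which a monitoring script could use to trigger an email or a re-run.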