1. Introduction

1.1. Disclaimer

No liability for the contents of this document can be accepted. Use the concepts, examples and other content at your own risk. As this is a new edition of this document, there may be errors and inaccuracies that could damage your system. Proceed with caution; although damage is highly unlikely, the author does not take any responsibility for it.

All copyrights are held by their respective owners, unless specifically noted otherwise. Use of a term in this document should not be regarded as affecting the validity of any trademark or service mark.

Naming of particular products or brands should not be seen as endorsements.

1.2. Formats

This document is available in the following formats:

A tarball containing all the above is also available:

1.3. Problem description

AMANDA (Advanced Maryland Automatic Network Disk Archiver) is a network backup manager that is designed to use the operating system's dump or tar program to archive a collection of filesystems or directories from different computers (called clients) on one backup system (located on the server). For a detailed introduction to AMANDA, see the AMANDA section of Unix Backup and Recovery written by John R. Jackson.

AMANDA is a very flexible backup manager, designed to run under a variety of UNIX-like systems. It is not a backup program in itself; rather, it is a backup manager. It uses the information provided by the tools of the operating system (like tar or dump) to estimate the size of the backup images due to be made and then decides on an optimal schedule, taking a number of parameters into account, such as the desired period between two full backups of the same filesystem or directory (the so-called dumpcycle), the number of AMANDA runs in this period (the runspercycle) and the number of tapes in rotation (the tapecycle).

The schedule is optimal in the sense that it tries to balance the backup load across the whole dumpcycle. This is done by choosing the appropriate backup level for each filesystem each time AMANDA is run. Since a full backup is usually much larger than an incremental one, a proper mix of full and incremental backups will distribute the load evenly, so that on each AMANDA run the amount of data put on tape will be more or less the same. Finding such an optimal mix by hand is a big headache for the system administrator, even with a small number of systems. AMANDA takes this burden off the administrator's shoulders by deciding itself for which filesystems to do a full backup and, where an incremental backup is better, which level it should be.
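The balancing idea described above can be illustrated with a small sketch. It is not AMANDA's planner: it is a simple greedy heuristic, written here only to show how spreading the full backups over the runs of a dumpcycle evens out the amount of data per run. The function name, the filesystem names and all sizes are hypothetical.

```python
def balance_full_backups(full_sizes, incr_sizes, runs_per_cycle):
    """Assign each filesystem's full backup to one run of the cycle.

    Greedy heuristic (an assumption for illustration, not AMANDA's
    algorithm): place the largest full backup on the least-loaded run.
    """
    # Baseline: every run carries an incremental of every filesystem.
    baseline = sum(incr_sizes.values())
    load = [baseline] * runs_per_cycle
    schedule = {}
    for name in sorted(full_sizes, key=full_sizes.get, reverse=True):
        run = load.index(min(load))
        # On that run the full backup replaces the filesystem's incremental.
        load[run] += full_sizes[name] - incr_sizes[name]
        schedule[name] = run
    return schedule, load


# Hypothetical sizes in GB; three AMANDA runs per dumpcycle.
full_sizes = {"/home": 40, "/var": 20, "/usr": 15, "/etc": 5}
incr_sizes = {"/home": 4, "/var": 2, "/usr": 1, "/etc": 1}
schedule, load = balance_full_backups(full_sizes, incr_sizes, runs_per_cycle=3)
print(schedule)  # run index chosen for each filesystem's full backup
print(load)      # data written per run
```

With these made-up numbers, doing every full backup on the same run would put 80 GB on one tape and 8 GB on each of the others. The sketch also shows why the size estimates matter: if an incremental were estimated to be nearly as large as its full backup, the computed loads would be wrong for every run.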

Windows shares can also be backed up with AMANDA using SAMBA.

To compute the backup levels correctly, AMANDA needs an accurate estimate of the size of each backup image. For this purpose, AMANDA calls the operating system's tar or dump tools with appropriate parameters and parses their output to infer the size of the image. During this stage nothing is backed up. It is only after the estimation and scheduling have taken place that tar or dump is called again to perform the real task.

If the filesystem to be backed up is a Windows vfat filesystem, tar has to be used to estimate the sizes of the various backup levels. Due to the way the Linux kernel interacts with a vfat filesystem and the parameters that AMANDA uses in the tar call for the size estimate, it may happen that the estimated size of an incremental backup turns out to be almost as large as that of a full backup, even if very little data has changed since the last full backup was made. This document describes a way to circumvent this problem through some very simple changes in AMANDA's source code.
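To make the estimate step more concrete, here is a minimal sketch of how a size could be read from tar's output. It assumes that the estimate run uses GNU tar with the --totals option, which prints a "Total bytes written: N" summary line; the text above only says that tar is called with appropriate parameters, so the exact call is an assumption. The function names, the 0.9 threshold and the sample output lines are hypothetical.

```python
import re

# GNU tar --totals summary line, e.g. "Total bytes written: 10240 (10KiB, ...)".
TOTALS_RE = re.compile(r"^Total bytes written: (\d+)", re.MULTILINE)


def parse_tar_totals(output):
    """Return the byte count from a tar --totals summary, or None if absent."""
    match = TOTALS_RE.search(output)
    return int(match.group(1)) if match else None


def incremental_looks_inflated(full_bytes, incr_bytes, threshold=0.9):
    """Flag an incremental estimate that is nearly as large as the full one.

    The threshold is an arbitrary illustrative value, not an AMANDA setting.
    """
    return full_bytes > 0 and incr_bytes / full_bytes >= threshold


# Made-up sample output for a full and an incremental estimate run.
sample_full = "Total bytes written: 1048576000 (1000MiB, 50MiB/s)\n"
sample_incr = "Total bytes written: 996147200 (950MiB, 48MiB/s)\n"

full_bytes = parse_tar_totals(sample_full)
incr_bytes = parse_tar_totals(sample_incr)
print(incremental_looks_inflated(full_bytes, incr_bytes))
```

A check of this kind is one way to notice the vfat symptom described above: an incremental estimate close to the full size on a filesystem where little has changed.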