Using AMANDA to backup vfat filesystems

Chris Karakas

Due to the interaction of a vfat filesystem with the Linux kernel, it is not always possible for AMANDA (Advanced Maryland Automatic Network Disk Archiver) to use tar to compute incremental backups of SAMBA shares correctly. This document describes a way to circumvent this problem with some simple modifications of the AMANDA source code and a custom gtar-wrapper file.

Copyright © 2001 Chris Karakas. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.1 or any later version published by the Free Software Foundation; with no Invariant Sections, with no Front-Cover Texts, and with no Back-Cover Texts. A copy of the license can be found at the Free Software Foundation.

-------------------------------------------------------------------------------

Introduction

Disclaimer

No liability for the contents of this document can be accepted. Use the concepts, examples and other content at your own risk. As this is a new edition of this document, there may be errors and inaccuracies that could damage your system. Proceed with caution; although damage is highly unlikely, the author takes no responsibility for it.

All copyrights are held by their respective owners, unless specifically noted otherwise. Use of a term in this document should not be regarded as affecting the validity of any trademark or service mark.

Naming of particular products or brands should not be seen as endorsements.

-------------------------------------------------------------------------------

Formats

This document is available in the following formats:

  * HTML (HyperText Markup Language)
  * TXT (ASCII Text)
  * RTF (Rich Text Format)
  * PDF (Portable Document Format)
  * PS.GZ (Compressed Postscript)
  * SGML (Standard Generalized Markup Language)
  * LYX (LaTeX frontend LyX)

A tarball containing all the above is also available:

  * TAR.GZ (Compressed TAR Archive), All files

-------------------------------------------------------------------------------

Problem description

AMANDA (Advanced Maryland Automatic Network Disk Archiver) is a network backup manager that is designed to use the operating system's dump or tar program to archive a collection of filesystems or directories from different computers (called clients) on one backup system (located on the server). For a detailed introduction to AMANDA see the AMANDA section of Unix Backup and Recovery written by John R. Jackson.

AMANDA is a very flexible backup manager, designed to run under a variety of UNIX-like systems. It is not a backup program in itself; rather, it is a backup manager. It uses the information provided by the tools of the operating system (like tar or dump) to estimate the size of the backup images due to be made and then decides on an optimal schedule, taking many parameters into account, such as the desired period between two full backups of the same filesystem or directory (the so-called dumpcycle), the number of AMANDA runs in this period (the runspercycle) and the number of tapes in rotation (the tapecycle).

The schedule is optimal in the sense that it tries to balance the backup load across the whole dumpcycle. This is done by choosing the appropriate backup level for each filesystem, each time AMANDA is run.
Since a full backup is usually much larger than an incremental one, a proper mix of full and incremental backups will distribute the load evenly, so that on each AMANDA run the amount of data put on tape will be more or less the same. Clearly, finding such an optimal mix by hand is a big headache for the system administrator, even with a small number of systems. AMANDA takes this burden off the administrator's shoulders by deciding herself for which filesystem to do a full backup and, if an incremental backup is better, which level it should be. Windows shares can also be backed up with AMANDA using SAMBA.

To compute the backup levels correctly, a correct estimate of their sizes is absolutely necessary. For this purpose, AMANDA calls the operating system's tar or dump tools with appropriate parameters and parses their output to infer the size of the image. During this stage nothing is backed up. It is only after the estimation and scheduling have taken place that tar or dump are called again to perform the real task.

If the filesystem to be backed up is a Windows vfat filesystem, tar has to be used to estimate the sizes of the various backup levels. Due to the way the Linux kernel interacts with a vfat filesystem and the parameters that AMANDA uses in the tar call for the size estimate, it may happen that the estimated size of an incremental turns out to be almost as large as that of a full backup, even if very little data has changed since the last full backup was made.

This document describes a way to circumvent this problem through some very simple changes in AMANDA's source code.
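
The size estimate described above can be illustrated with a small shell sketch. It is not AMANDA's own parsing code (that lives in the client sources and is not reproduced here); it only shows how the "Total bytes written" line that tar prints with --totals can be turned into a size in KB. The function name total_kb is a hypothetical helper introduced for this illustration.

```shell
#!/bin/sh
# Sketch only: convert tar's "--totals" line into a size in kilobytes.
# total_kb is a hypothetical helper name, not part of AMANDA.
total_kb() {
    # Extract the byte count from a line such as
    #   Total bytes written: 220958720 (211MB, 8.8MB/s)
    bytes=$(sed -n 's/^Total bytes written: \([0-9][0-9]*\).*/\1/p')
    if [ -n "$bytes" ]; then
        # Integer division: bytes -> KB
        echo $((bytes / 1024))
    else
        return 1
    fi
}

# Example (not executed here): estimate a dump of a directory without
# writing an archive. GNU tar prints the totals line on stderr, hence 2>&1.
#   tar --create --directory /dos/f --totals --file /dev/null . 2>&1 | total_kb
```

Fed with the totals line for 220958720 bytes, the function prints 215780, which is the figure in KB that appears in the sendsize results quoted later in this document.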

-------------------------------------------------------------------------------

Solutions that did not work

Here are some attempts at a solution that did not work:

-------------------------------------------------------------------------------

Install tar version 1.13.18

I installed tar-1.13.18.tar.gz from GNU. I also installed the source rpm for the already installed version 1.12 (in SuSE this is in the package base.spm), in order to be able to see (in the /usr/src/packages/SOURCES/.dif file) what values SuSE uses for prefix, exec_prefix etc. I used

    CFLAGS='-Wall -O2 -pipe' ./configure --prefix=/usr --bindir=/bin
    make
    make install

for the tar-1.13.18 installation. But the new tar did not solve my problem.

-------------------------------------------------------------------------------

Use --incremental

In a message, Conrad Hughes wrote:

    If I understand correctly (and there's no guarantee that I do), the vfat change was to ensure that a file on a vfat FS would have the same inode number for the duration of a single mount; inodes need to be constructed in some manner on vfat because it doesn't actually have real inodes, and the previous mechanism meant that a file's inode wouldn't be constant (for example a rename would change it; this caused much gnashing of teeth among one crowd of people). This new mechanism means inodes are fixed for the duration of a mount, but if you umount and remount then you have no guarantee of continuity; this is now causing gnashing of teeth among another crowd. Since tar --listed-incremental seems to record inodes, it gets very confused if the machine umounts and mounts a vfat system between backups (as would inevitably be the case if you rebooted for example).

I had the same problem, since I reboot my machine at least once a day, i.e. between backup runs. So my solution was to avoid using tar with the --listed-incremental=FILE option and to use --incremental instead.
To achieve this I had to change the file config/config.h near line 628, undefining GNUTAR_LISTED_INCREMENTAL_DIR as shown below, and recompile.

    #ifdef GNUTAR
    /* Used in sendbackup-gnutar.c */
    /*#define GNUTAR_LISTED_INCREMENTAL_DIR "/usr/local/var/amanda/gnutar-lists"*/
    #undef GNUTAR_LISTED_INCREMENTAL_DIR
    /* #undef ENABLE_GNUTAR_ATIME_PRESERVE */
    #endif

Still, there were days when I did not reboot and the incrementals on the vfat filesystems were unjustifiably large.

-------------------------------------------------------------------------------

Use --incremental and --newer

In config/config.h I did:

    /* Changed by root.
     * I undefine GNUTAR_LISTED_INCREMENTAL_DIR, so that,
     * instead of --listed-incremental, the options --incremental and --newer
     * will be used in sendsize.c
    #define GNUTAR_LISTED_INCREMENTAL_DIR "/var/lib/amanda/gnutar-lists"
    */
    #undef GNUTAR_LISTED_INCREMENTAL_DIR
    /* #undef ENABLE_GNUTAR_ATIME_PRESERVE */
    #endif

After recompiling, I had to:

    cp /scsi/DynaMo/linux/amanda/amcleanup /usr/bin/amcleanup
    cp /scsi/DynaMo/linux/amanda/amandates /etc/amandates

John R. Jackson pointed out:

    I think (but am not 100% certain) you can do this when you ./configure by adding --without-gnutar-listdir. Make sure you do a "make distclean" before re-running ./configure. Check config/config.h for GNUTAR_LISTED_INCREMENTAL_DIR after running ./configure to see if it's defined or not for this (re)build.

He is right - but I didn't do it that way.

Anyway, it does not work! Look what happens to /dos/f, which was fully backed up the day before:

    sendsize: getting size via gnutar for /dos/f level 0
    sendsize: running "/usr/lib/amanda/runtar --create --directory /dos/f --incremental --newer 1970-01-01 0:00:00 GMT --sparse --one-file-system --ignore-failed-read --totals --file /dev/null ."
    Total bytes written: 220958720 (211MB, 8.8MB/s)
    .....
    sendsize: getting size via gnutar for /dos/f level 1
    sendsize: running "/usr/lib/amanda/runtar --create --directory /dos/f --incremental --newer 2000-12-06 4:36:16 GMT --sparse --one-file-system --ignore-failed-read --totals --file /dev/null ."
    Total bytes written: 194088960 (185MB, 14MB/s)

And:

    got result for host bacchus disk /dos/f: 0 -> 215780K, 1 -> 189540K, -1 -> -1K

It is as if almost all of /dos/f changed in just one day (from Dec. 6th to Dec. 7th)! But f was not touched at all!

I decided to look at the ctime of a directory in /dos/f (with ll -c):

    /dos/f/original/adaptec/other:
    total 1064
    -rwxrwxr-x   1 chris    windows     74649 Feb  9  2035 1542ccfg.exe
    -rwxrwxr-x   1 chris    windows    288433 Feb  9  2035 aspi32.exe
    -rwxrwxr-x   1 chris    windows    115200 Feb  9  2035 aspichk.exe
    -rwxrwxr-x   1 chris    windows     49654 Feb  9  2035 cf154x.exe
    -rwxrwxr-x   1 chris    windows    536645 Feb  9  2035 dosdrvr.exe
    drwxrwxr-x   2 chris    windows      4096 Dec 30  1979 .
    drwxrwxr-x   7 chris    windows      4096 Dec 30  1979 ..

See this? Somehow the ctime has been set to a date in 2035. Since this is later than any current date, these files always look changed to --newer, so --newer will not work. There are 2 solutions to this:

 1. Change the ctime to something more usable.
 2. Use --newer-mtime (see below).

-------------------------------------------------------------------------------

Solution

The idea of the solution is the following: let AMANDA use a wrapper program instead of tar. The wrapper will still call tar internally to do the backup job, but for vfat filesystems it will pass different parameters to tar than for the ext2 ones.

-------------------------------------------------------------------------------

Modifying the AMANDA code

I decided to use --newer-mtime (which worked fine in a test run) for the vfat filesystems and --listed-incremental for the rest. In order to do this I needed to write my own gtar-wrapper script. I also needed _both_ options to be passed to gtar-wrapper, which would then decide which one to pass to tar.
For this, the following has been entered in 2 places in client-src/sendsize.c:

    #ifdef GNUTAR_LISTED_INCREMENTAL_DIR
                      "--listed-incremental", incrname,
    /* Changed by root.
     * I want to use my own gtar-wrapper script.
     * For this I need _all_ the options relevant to
     * incrementals, so I just commented this "#else" here.
    #else
    */
                      "--incremental", "--newer-mtime", dumptimestr,
    #endif
                      "--sparse","--one-file-system",

(just search for GNUTAR_LISTED_INCREMENTAL_DIR).

In sendbackup-gnutar.c:

        {
            char *format_buf;
            char *a00, *a01, *a02, *a03, *a04, *a05;
            char *a06, *a07, *a08, *a09, *a10, *a11;
    /* Changed by root.
     * incr1 and incr2 instead of just incr.
     * They are both needed in the print statement below.
     */
            char *incr1;
            char *incr2;
            char *tarcmd;

and:

    #ifdef GNUTAR_LISTED_INCREMENTAL_DIR
    /* Changed by root. Original code:
            a03 = " --listed-incremental %s";
    #else
            a03 = " --incremental --newer-mtime %s";
    */
            a03 = " --listed-incremental %s --incremental --newer-mtime %s";
    #endif

and:

    /* Changed by root.
     * incr1 and incr2, instead of just incr1
     */
    #ifdef GNUTAR_LISTED_INCREMENTAL_DIR
            incr1 = incrname;
    /* #else */
            incr2 = dumptimestr;
    #endif
    /* Changed by root.
     * incr1 and incr2, instead of just incr1
     */
            dbprintf((format_buf, dumppid, cmd, dirname, incr1, incr2, efile ? efile : "."));
            amfree(format_buf);
        }

Finally, in /usr/src/packages/SOURCES/amanda-2.4.1p1.dif:

    --with-gnutar=/usr/local/bin/gtar-wrapper

Do not forget: also in config/config.h:

    #define GNUTAR "/usr/local/bin/gtar-wrapper"

(this is necessary, since I don't run configure every time).

-------------------------------------------------------------------------------

The gtar-wrapper script

The wrapper script, called "gtar-wrapper", is a direct modification of the "gtar-wrapper" script that comes with AMANDA. It determines if the directory is on a vfat or an ext2 filesystem and then eliminates the parameters in the argument list that do not fit.
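
The filtering step just described can be sketched in POSIX shell as follows. This is not the author's gtar-wrapper, only a minimal illustration under stated assumptions: the function names run_filtered, fs_type_of and dir_from_args are hypothetical, the filesystem type is taken from GNU df with the -T option, the REAL_TAR variable is an invented override that defaults to /bin/gtar, and every option that takes a value is assumed to be followed by that value as a separate argument.

```shell
#!/bin/sh
# Sketch only, not the author's script. All function names are hypothetical.

# Assumption: GNU df is available; with -P -T the second column of the
# second output line is the filesystem type (e.g. vfat or ext2).
fs_type_of() {
    df -PT "$1" | awk 'NR == 2 { print $2 }'
}

# Print the value that follows --directory in the argument list.
dir_from_args() {
    while [ $# -gt 0 ]; do
        if [ "$1" = --directory ]; then
            echo "$2"
            return 0
        fi
        shift
    done
    return 1
}

# run_filtered FSTYPE ARGS...
# Drops --listed-incremental FILE on vfat, and --incremental plus
# --newer-mtime DATE on everything else, then runs the real tar.
run_filtered() {
    fstype=$1
    shift
    n=$#                      # original arguments still to be examined
    while [ "$n" -gt 0 ]; do
        arg=$1
        shift
        n=$((n - 1))
        case $arg in
            --listed-incremental|--newer-mtime)
                # Both options carry their value in the next argument.
                val=$1
                shift
                n=$((n - 1))
                if { [ "$arg" = --newer-mtime ] && [ "$fstype" = vfat ]; } ||
                   { [ "$arg" = --listed-incremental ] && [ "$fstype" != vfat ]; }; then
                    set -- "$@" "$arg" "$val"
                fi
                ;;
            --incremental)
                if [ "$fstype" = vfat ]; then
                    set -- "$@" "$arg"
                fi
                ;;
            *)
                # Everything else is passed through unchanged.
                set -- "$@" "$arg"
                ;;
        esac
    done
    # REAL_TAR is a hypothetical override, handy for dry runs.
    "${REAL_TAR:-/bin/gtar}" "$@"
}

# A wrapper built on these pieces would end with something like:
#   run_filtered "$(fs_type_of "$(dir_from_args "$@")")" "$@"
```

Kept arguments are appended to the end of the positional parameters while the originals are shifted off the front, so an argument containing spaces, such as the date passed to --newer-mtime, survives as a single word.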
Remember that AMANDA passes --listed-incremental as well as --incremental, --newer-mtime and the date to gtar-wrapper, due to the changes in the code above. So gtar-wrapper's job is to call tar with --listed-incremental eliminated from the parameter list if the filesystem is vfat, and to call tar with --incremental, --newer-mtime and the date eliminated from the parameter list if it is ext2.

You will find my gtar-wrapper script here.

-------------------------------------------------------------------------------

How to test

Don't wait for AMANDA's run each night to test! Just do something like

    su amanda
    cd /dumps/amanda1
    /usr/lib/amanda/runtar --create --directory /dos/e --listed-incremental /var/lib/amanda/gnutar-lists/midas_dos_e.new --incremental --newer-mtime "2000-12-23 2:00:53 GMT" --sparse --one-file-system --ignore-failed-read --totals --file /dev/null .

(don't forget the "." at the end, it's essential!).

To test tar itself, do:

    /bin/gtar --create --directory /dos/c --incremental --newer-mtime "2000-12-24 6:02:05 GMT" --sparse --one-file-system --ignore-failed-read --totals --file /dev/null .

and then, if you wrote the archive to a file rather than to /dev/null,

    tar -tf _dos_f.1.tar

to see the contents of the tar file. This is a nice method to test the effects of --listed-incremental, --newer and --newer-mtime between successive mounts/unmounts of the vfat filesystem (/dos/e in the above example).

Also check the log file /var/lib/amanda/gtar-wrapper.log (the location of the log file is set in the gtar-wrapper script). Here is a part of it - you can see that gtar-wrapper itself is called with all options, but tar is called by gtar-wrapper with the appropriate set, depending on the filesystem.
For /usr/src:

    gtar-wrapper: start: Thu Nov 15 14:05:02 CET 2001
    gtar-wrapper: args: --create --directory /usr/src --listed-incremental /var/lib/amanda/gnutar-lists/midas_usr_src_0.new --incremental --newer-mtime 1970-01-01 0:00:00 GMT --sparse --one-file-system --ignore-failed-read --totals --file /dev/null --exclude-from=/var/lib/amanda/gnutar-lists/exclude-list-3 .
    ...
    gtar-wrapper: running /bin/gtar --create --directory /usr/src --listed-incremental /var/lib/amanda/gnutar-lists/midas_usr_src_0.new --sparse --one-file-system --ignore-failed-read --totals --file /dev/null --exclude-from=/var/lib/amanda/gnutar-lists/exclude-list-3 .

And for /dos/h:

    gtar-wrapper: args: --create --directory /dos/h --listed-incremental /var/lib/amanda/gnutar-lists/midas_dos_h_1.new --incremental --newer-mtime 2001-11-07 22:22:11 GMT --sparse --one-file-system --ignore-failed-read --totals --file - .
    ...
    gtar-wrapper: running /bin/gtar --create --directory /dos/h --incremental --newer-mtime 2001-11-07 22:22:11 GMT --sparse --one-file-system --ignore-failed-read --totals --file - .

To see the inode numbers do:

    ls -lsi /dos/g/original/

-------------------------------------------------------------------------------

Remaining problems

Suddenly, on Jan 6, 2001 between 06:28:14 GMT and 06:28:15 GMT (i.e. in only one second!), 236MB of /dos/d (which is 253MB in total, so almost all of /dos/d!) changed its modification time! Look at the following output:

    bacchus:~ # /bin/gtar --create --directory /dos/d --incremental --newer-mtime "2001-01-06 06:28:14 GMT" --sparse --one-file-system --ignore-failed-read --totals --file /dev/null .
    Total bytes written: 247214080 (236MB, 39MB/s)
    bacchus:~ # /bin/gtar --create --directory /dos/d --incremental --newer-mtime "2001-01-06 06:28:15 GMT" --sparse --one-file-system --ignore-failed-read --totals --file /dev/null .
    Total bytes written: 286720 (280kB, 93kB/s)

Of course, AMANDA has correctly backed up a level 1 of 236+ MB... But why should this one filesystem change mtime almost as a whole in just one second? Probably the directory entry of /dos/d changed somehow, causing tar to dump everything beneath it. But I could swear I didn't touch /dos/d...

Here are some words of caution from the tar info file:

    *Please Note:* `--after-date=DATE' (`--newer=DATE', `-N DATE') and `--newer-mtime=DATE' should not be used for incremental backups. Some files (such as those in renamed directories) are not selected properly by these options.

I have used the described solution for almost a year without experiencing any other surprises. It works for me. I hope it will work for you too.
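
A jump in modification times like the one seen on /dos/d can be narrowed down by listing the files whose mtime falls between the two dates used in the tar calls. The following is a minimal sketch, assuming a GNU find that supports the -newermt test; the function name changed_between is a hypothetical helper, not part of AMANDA or tar.

```shell
#!/bin/sh
# Sketch only. Assumes GNU find with the -newermt test.
# changed_between DIR AFTER UPTO
# Lists regular files whose mtime is later than AFTER but not later than UPTO.
changed_between() {
    dir=$1
    after=$2
    upto=$3
    find "$dir" -type f -newermt "$after" ! -newermt "$upto" -print
}

# Example (not executed here), using the two dates from the tar calls:
#   changed_between /dos/d "2001-01-06 06:28:14 GMT" "2001-01-06 06:28:15 GMT"
```

This only shows which files carry the suspicious timestamp; it does not explain why the timestamps changed.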