September 27th, 2007
Nothing kinky here. Amanda is not a girl, but rather a really nice piece of software: the Advanced Maryland Automatic Network Disk Archiver. In short, a backup program. Details can be found at http://www.amanda.org, or at http://www.zmanda.com (with documentation at http://wiki.zmanda.com). I’ve been following Amanda for a couple of years, and started using it intensively last January. I’ve been extremely happy with my decision to implement Amanda, so I decided to list some of the reasons.
1. Amanda is a simple, elegant and powerful tool. It is a planner and scheduler for backups. It does what it does better than any other program, and it doesn’t try to reinvent the wheel or throw in the kitchen sink (see 2 & 3). There is depth, maturity and complexity to Amanda, because it has been around for quite a while (over 15 years) and the task it undertakes has some complexity to it. But it has stayed focused and true to its original, fundamentally sound design concepts.
2. Amanda uses native tools to do the actual backups rather than trying to write its own. This has a variety of implications. (a) It is using tools that have a broader base and more extensive testing than would be possible if Amanda developers wrote their own backup code. (b) It gives the user a choice. For example, on Solaris you can use ufsdump, and on Linux you can use GNU tar. (c) It avoids a whole class of potential bugs that could creep into the program (gee, we didn’t get those ACLs on Mac OS X quite right). (d) It makes it relatively straightforward to extract and recover data even if Amanda is unavailable to assist. (e) It makes it easier to support a broader selection of platforms.
3. Amanda is minimalist in that it doesn’t require the installation of other significant packages (beyond what’s typically on a Unix or Linux system by default) for functionality that is critical to Amanda’s operation. For example, it doesn’t store records and indexes of its backups in an SQL database. So, you don’t need to get anything else running to run Amanda, you don’t have the system load associated with other software (or need an additional server to run it), and you aren’t crippled in a disaster recovery situation by having to get more stuff up and running before you have full functionality.
4. Amanda is really good at multitasking. During my testing, I had terminal sessions open to several of my servers. They each had top running with the command to display jobs for user amanda. When I launched amdump on the server, I almost immediately saw multiple jobs firing up on all the servers. Amanda had estimates going on all of them, and soon had data streaming in from several of them. I had two spool drives, and I could watch data accumulating on them and then going out to tape. With adequate resources and tuning parameters, Amanda can orchestrate quite a show.
5. Amanda’s algorithm for scheduling backups is awesome; and, as far as I know, there is nothing like it in any other backup software. If you’ve been doing backups with other software for years, it might be a little difficult to get your head around this, but it really feels good when you do. Basically, you choose a dump cycle, say, a week, and runs per cycle, say, 5. Amanda guarantees that you will have at least one full backup of every disk list entry during a dump cycle. On the other days it will schedule incrementals. The distribution of fulls and incrementals across the dump cycle is planned so that the amount of data being transferred and put on tape is about the same every day. The benefits of smoothed out network load, smoothed out demand on other servers, and even use of tape from day to day are tremendous. And there really is no downside. You’ve always got incrementals, so you can always recover to any day. No worries. Just let Amanda plan it.
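The idea of spreading fulls across the dump cycle can be illustrated with a toy sketch. This is not Amanda’s actual planner, which also weighs incrementals, estimates and many tunable parameters; it is just a greedy balancer, and the entry names, sizes and the function name spread_fulls are made up for illustration.

```python
def spread_fulls(full_sizes, runs_per_cycle):
    """Assign each entry's full backup to one run, keeping run totals even.

    full_sizes: {entry name: estimated size of its full backup}
    Returns (schedule, totals): schedule maps entry -> run index,
    totals lists the amount of full-backup data planned for each run.
    """
    totals = [0] * runs_per_cycle
    schedule = {}
    # Place the largest fulls first, each into the currently lightest run.
    by_size = sorted(full_sizes.items(), key=lambda kv: kv[1], reverse=True)
    for name, size in by_size:
        run = totals.index(min(totals))
        schedule[name] = run
        totals[run] += size
    return schedule, totals


# Hypothetical disk list entries with full-backup sizes in GB.
sizes = {
    "web:/var/www": 40, "db:/var/lib": 90, "home:/home": 70,
    "mail:/var/mail": 30, "build:/srv": 50, "etc:/etc": 10,
    "logs:/var/log": 20,
}
schedule, totals = spread_fulls(sizes, runs_per_cycle=5)
for name, run in sorted(schedule.items(), key=lambda kv: kv[1]):
    print(f"run {run}: full backup of {name}")
print("GB of fulls per run:", totals)
```

Every entry gets exactly one full per cycle, and no run carries much more than its share. On the runs where an entry is not getting its full, it would get an incremental.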
In a more general sense, Amanda’s backup levels cover the full range from 0 to 9. In common language, a level 0 backup is a full backup, a level 1 backup is what some people call a differential backup, and anything from level 1 to 9 can be called an incremental. A backup at any level will back up all the files under the disk list entry that have been modified since the last backup at a lower level. Amanda chooses the backup level to optimize the balance between backup efficiency and ease of recovery. And all of this is tunable by configuration parameters.
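The level rule is easier to see with a small example. The sketch below only models the rule as stated above (files changed since the most recent backup at any lower level); it does not call Amanda or any dump program, and the function name, file names and timestamps are invented for illustration.

```python
def files_to_back_up(mtimes, level, history):
    """Return the files a backup at `level` would include.

    mtimes:  {path: modification time}
    history: list of (level, time) pairs for earlier backups
    """
    if level == 0:
        return sorted(mtimes)  # a full backup takes everything
    # Times of all earlier backups at a strictly lower level.
    lower = [when for lvl, when in history if lvl < level]
    if not lower:
        raise ValueError("no lower-level backup to base this level on")
    since = max(lower)  # the most recent of those backups
    return sorted(path for path, mtime in mtimes.items() if mtime > since)


# Hypothetical history: full at t=100, level 1 at t=200, level 2 at t=300.
history = [(0, 100), (1, 200), (2, 300)]
mtimes = {"a": 50, "b": 150, "c": 250, "d": 350}

for level in range(4):
    print(level, files_to_back_up(mtimes, level, history))
```

A level 1 here picks up everything changed since the full, while a level 3 only picks up what changed since the level 2. Higher levels mean smaller backups but more pieces to restore, which is the trade-off Amanda is balancing.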
6. Amanda is extremely robust against error situations. It typically does the right thing, and administrator intervention is only required when it really makes sense. For example, I had a tape drive failure several months ago. For a variety of reasons, it was out of service for a couple of days. My attention was focused on the tape drive, and I really didn’t want to have to think about a lot of other things. Amanda was scheduled to run daily backups. At the scheduled time, Amanda “surveyed the situation” and made an operational decision. No tape drive is available. Backups must be run. Holding disk space is available. Holding disk space should be used conservatively, since it can’t be flushed to tape. So Amanda dropped back and did incremental backups on all the disk list entries and saved it all to the holding disk. It sent me an appropriate status report with a notice about the tape failure at the top. This went on for a couple of days. Then, when I got the drive repaired and reinstalled, Amanda said, “Oh Joy”, scheduled an appropriate mix of fulls and incrementals, and flushed everything out to tape. I never had to say anything to Amanda during this episode that spanned a couple of days. It just worked.
7. Amanda is a collection of command line utilities. You configure Amanda in part by making cron entries to run these utilities (amcheck and amdump). When no backup is running, there are no processes or daemons on the system, and there is no need to monitor the system to make sure the daemon is still running. Typically, cron is configured to run amcheck during the afternoon when sysadmins are around. It checks everything and notifies sysadmins only if there is a problem that needs attention. Cron is configured to run amdump after the end of the day or in the middle of the night. This follows on item 2 by using the native system scheduling tools (cron) combined with Amanda’s backup scheduling algorithm. On the client, Amanda is triggered by inetd (or xinetd), so, again, it is only running when a backup is happening and a request comes in from the Amanda server.
8. It is extremely easy to get appropriate information about what is going on with Amanda. Typically, a routine amcheck is scheduled in the afternoon to check the readiness for daily backups. It checks the configuration, the backup media, the connections to the clients, and only reports if there is a problem. Because amcheck is a command line utility, you can run it at any time. Whenever amdump is run, it sends a report with its results. In addition, if you come in in the morning and haven’t gotten the daily amanda backup report yet, you can run amstatus and find out exactly what it is doing. There is also amreport, which is a versatile reporting tool for accessing information about existing backups and tapes.
9. The community support for Amanda is very strong. Open source projects typically rely on community support, but the results are highly variable. Amanda falls into the group of projects that have exceptionally good support. The activity on the Amanda users list shows a large number of respondents answering questions on a variety of platforms; and, according to Preston*, over 250 programmers have contributed to the Amanda code base.
10. Amanda has commercial support through Zmanda. Normally, managers have to choose (typically, sysadmins only get to recommend) between open source or commercial software, and can’t balance the virtues of the two. Significant open source projects often have a community of consultants and freelancers who can be hired; but, with Amanda, there’s also commercial scale support, if that’s what a manager wants, with up to 24/7 and unlimited support calls. And, because Zmanda strongly supports the open source model, and has paid programmers on full-time staff, the overall level of development and support for Amanda is richer. The open source community has access to this through the participation of the Zmanda staff in the user community and their contribution of code.
*Preston, W. Curtis. 2006. Backup & Recovery. O’Reilly, Sebastopol, CA. http://www.oreilly.com/catalog/9780596102463/