Dr. Kuhlmann Software Dr. Kuhlmann Beratung Software-Engineering

Backup strategies

This text describes some general backup strategies and then the strategies of KaTeker.

A backup of data is useful when a system has crashed and has to be restored. This is a rare case. Much more often a backup is used to restore individual lost files.

General backup strategies

There are several strategies and types of data backup.

These types are not mutually exclusive, and some can be used at the same time. Finally there are the strategies of KaTeker.

Saving the whole system

Saving all data of the system is the ideal goal of data backup. This is often not possible because the amount of data is too large to fit onto a data medium. In most cases it is too expensive. If the data has to be saved periodically, the cost escalates. Some data is not immediately accessible or is expected at specific positions of the file system.

Saving partitions

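Saving whole partitions is another variant of saving all files of the system. This solves the problem of saving data that is not immediately accessible or that is expected at specific positions of the file system. On the other hand, a lot of the backup media is wasted on the unused space of the partitions.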

Saving selected data

Saving selected data can save a lot of backup media. In most cases user data and system configuration data are selected. Data that can be restored from other sources is not saved. It can be difficult to restore the whole system because the data has to be collected from several sources. Often only the system data is left out: the system can be reinstalled and then the backup restored.

Full backup

In the context of backup, a full backup means saving all selected data. In rare cases it is a backup of all system data or partitions.

Saving versions of files

Saving versions of files is not a principal task of backup. A periodic backup may happen to save several versions of a file, but the versions will be incomplete in most cases. Use special versioning software like rcs or cvs for this task. The repositories of rcs and cvs should of course be included in the backup.

Saving data changed

Saving changed data can be used together with other types of backup. In most cases the amount of data changed in a day is small compared to the whole data. The changed data refers to a base set of data that can be restored from other sources. This strategy can be used to save system data: restoring the system then simply means reinstalling it and restoring the changed data. In most cases the changed data refers to a full backup. There are two strategies for saving changed data. An incremental backup saves only the data that has changed since the last backup. A differential backup saves the data changed since the last full backup. An incremental backup can thus be seen as a differential backup relative to the previous backup instead of the last full backup. The drawback of incremental and differential backups is that restoring data is difficult. Another problem is determining where a particular file is saved.
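
The difference between the two strategies comes down to the reference time used when selecting files. The following is a minimal sketch in Python, not taken from any backup tool; the names FileInfo, changed_since, incremental and differential, the file paths and the time stamps are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    path: str
    mtime: float  # modification time, seconds since the epoch


def changed_since(files, reference_time):
    """Return the paths of all files modified after reference_time."""
    return sorted(f.path for f in files if f.mtime > reference_time)


def incremental(files, last_backup_time):
    # Reference point: the most recent backup of any kind.
    return changed_since(files, last_backup_time)


def differential(files, last_full_backup_time):
    # Reference point: the most recent full backup.
    return changed_since(files, last_full_backup_time)


# Hypothetical example: full backup at t=100, last daily backup at t=300.
files = [
    FileInfo("/home/a.txt", 50),   # unchanged since the full backup
    FileInfo("/home/b.txt", 200),  # changed after the full backup
    FileInfo("/home/c.txt", 350),  # changed after the last daily backup
]
print(incremental(files, 300))   # ['/home/c.txt']
print(differential(files, 100))  # ['/home/b.txt', '/home/c.txt']
```

This also shows why restoring is harder than with a full backup: a restore from incremental backups needs the full backup plus every increment made since, while a restore from a differential backup needs the full backup plus the latest differential.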

Saving data periodically

A periodic backup is necessary because nobody knows when data will be lost. A daily backup is a good idea because, if data is lost, no more than one day of work is lost. Often a full backup is used in conjunction with some kind of incremental or differential backup. An incremental or differential backup should not be based on a very old full backup, so full backups are made periodically too. The period of the full backup is longer than that of the incremental or differential backup. In many cases the full backups are weekly or monthly, while the incremental or differential backups are daily. The period of the full backups depends on how much the data changes and on the size of the incremental or differential backup.
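
A weekly full backup with daily differential backups can be sketched as follows. This is only an illustration: the choice of Saturday as the full-backup day and the function names backup_type and last_full_backup are assumptions, not part of any tool described here.

```python
import datetime

FULL_BACKUP_WEEKDAY = 5  # assumption: Saturday (Monday is 0)


def backup_type(day):
    """Return 'full' on the weekly full-backup day, else 'differential'."""
    return "full" if day.weekday() == FULL_BACKUP_WEEKDAY else "differential"


def last_full_backup(day):
    """Return the date of the most recent full backup on or before day."""
    offset = (day.weekday() - FULL_BACKUP_WEEKDAY) % 7
    return day - datetime.timedelta(days=offset)


wednesday = datetime.date(2004, 1, 7)
print(backup_type(wednesday))       # differential
print(last_full_backup(wednesday))  # 2004-01-03
```

With this schedule a differential backup is never based on a full backup older than six days.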

Saving data of distinct days

Keeping the backups of specific days is good practice. These days may be every Friday or the last day of every month. Often these backups are full backups, used in conjunction with a daily incremental or differential backup.

Saving the backup media

The simplest backup saves the backup data on the same medium as the data itself. This is adequate for restoring data lost through user errors and the like. It is fatal if the medium itself fails. Saving data onto another HD is a better choice. However, if the computer system itself is lost, the data is lost too. The next choice is to save the data on external media. If these media are kept in the same room or building as the system, all data will be lost in case of a fire, for instance. It is therefore better to store the backup media outside the building, possibly at a bank.

Another aspect is the security of the data. External backup media have a lot of advantages. On the other hand, they are a potential security risk. Everybody who has access to the backup media has access to all data saved on them. Therefore the backup media should be stored in a safe place. Ideally the data on the backup media is encrypted. If the media are encrypted, however, restoring data becomes more difficult.

Reuse of backup media

In most cases it is not necessary to keep every snapshot of the data. The older the data, the coarser the snapshots can be. It is good practice to keep the daily backups for a week and then keep only one backup per week. The media are rotated. Another strategy is to rotate the media monthly. I rotate the media monthly. The media written on a Saturday are kept for three months and then reused.
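
One possible reading of such a rotation rule, written as a small Python check, is shown below. It is only a sketch: the function name can_reuse, the seven-day period for daily media and the treatment of "three months" as 90 days are assumptions made for the example.

```python
import datetime

DAILY_RETENTION = datetime.timedelta(days=7)    # daily media kept for a week
WEEKLY_RETENTION = datetime.timedelta(days=90)  # assumption: three months as 90 days
WEEKLY_WEEKDAY = 5                              # Saturday (Monday is 0)


def can_reuse(written_on, today):
    """Return True if a medium written on written_on may be overwritten today."""
    age = today - written_on
    if written_on.weekday() == WEEKLY_WEEKDAY:
        # Saturday media are the long-lived weekly snapshots.
        return age >= WEEKLY_RETENTION
    return age >= DAILY_RETENTION


today = datetime.date(2004, 4, 10)
print(can_reuse(datetime.date(2004, 4, 1), today))  # Thursday, 9 days old
print(can_reuse(datetime.date(2004, 3, 6), today))  # Saturday, 35 days old
```

The first medium is a daily one older than a week and may be reused; the second is a Saturday medium that is still within its three months.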

KaTeker

KaTeker is not designed to save the whole system or partitions. It is not designed to save data onto sequential media like tapes.

KaTeker is designed to save selected data periodically in conjunction with a differential backup. The differential backup always refers to a full backup. The full backup is called the master, while the differential backup is called the update. It is also possible to save only changed data; in that case there is still a master, but it is empty. The size of the backup is limited to a few CDs for the master and one CD for the update. If larger backup media are available, the size can be increased. The master and the update are saved onto HD. They can and should be copied onto external media such as CDR or CDRW. The master and the update should not be on the same HD as the data. They can be saved onto another HD or via NFS onto another computer. KaTeker supports saving the master and the update via NFS onto a server which writes these archives onto CDR(W).

KaTeker provides a database with information about which file is saved on which medium, including the master, the update and all external media. It supports the cyclic reuse of media such as CDRW. No media rotation strategy is implemented in KaTeker itself. KaTeker provides a table of all media. The table can be sorted by several criteria. A short description can be attached to each medium.

KaTeker can be configured to save any data you want. The default setting saves the data of the following directories:

   /etc
   /usr/local
   /boot
   /root
   /var without /var/log
   /home

The contents of these directories are split into a set of archives.
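
The default selection above, including the exclusion of /var/log, can be expressed as a simple path test. The following Python sketch only illustrates the rule; it is not KaTeker's configuration format or code, and the names DEFAULT_INCLUDE, DEFAULT_EXCLUDE and is_selected are made up for the example.

```python
import posixpath

# Directories and exclusion taken from the default setting listed above.
DEFAULT_INCLUDE = ["/etc", "/usr/local", "/boot", "/root", "/var", "/home"]
DEFAULT_EXCLUDE = ["/var/log"]


def _is_under(path, directory):
    """Return True if path is the directory itself or lies below it."""
    return path == directory or path.startswith(directory.rstrip("/") + "/")


def is_selected(path):
    """Return True if path belongs to the default selection."""
    path = posixpath.normpath(path)
    # Exclusions win over inclusions, so /var/log is skipped although /var is saved.
    if any(_is_under(path, d) for d in DEFAULT_EXCLUDE):
        return False
    return any(_is_under(path, d) for d in DEFAULT_INCLUDE)


print(is_selected("/etc/fstab"))         # True
print(is_selected("/var/log/messages"))  # False
print(is_selected("/usr/bin/ls"))        # False
```

Comparing against the directory name plus a trailing slash avoids treating a path such as /var/logs as part of /var/log.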

KaTeker supports restoring specific versions of files. To do so, it is necessary to determine the medium holding the file in question. Without further help, a lot of media would have to be scanned for this. The database integrated in KaTeker makes the scan unnecessary, which results in a fast search.
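
The idea of such a catalogue can be illustrated with a small lookup table. The sketch below uses Python's sqlite3 module; the table layout, the medium labels and the function media_holding are hypothetical and say nothing about how KaTeker's own database is organised.

```python
import sqlite3

# Hypothetical catalogue: one row per saved version of a file.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE saved_files (path TEXT, medium TEXT, saved_on TEXT)")
conn.execute("CREATE INDEX idx_path ON saved_files (path)")
conn.executemany(
    "INSERT INTO saved_files VALUES (?, ?, ?)",
    [
        ("/etc/fstab", "master-01", "2004-01-03"),
        ("/etc/fstab", "update-07", "2004-01-10"),
        ("/home/anna/notes.txt", "update-07", "2004-01-10"),
    ],
)


def media_holding(path):
    """Return (medium, saved_on) pairs for every saved version, newest first."""
    rows = conn.execute(
        "SELECT medium, saved_on FROM saved_files WHERE path = ? "
        "ORDER BY saved_on DESC",
        (path,),
    )
    return rows.fetchall()


print(media_holding("/etc/fstab"))
# [('update-07', '2004-01-10'), ('master-01', '2004-01-03')]
```

A single indexed query replaces scanning every medium, and the result tells the user which CD to insert for the wanted version.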

KaTeker does not support the encryption of data on external media. This function will be provided in the future.

© 2003-2004 Dr. Heiner Kuhlmann: www.dr-kuhlmann-software.de, kateker@dr-kuhlmann-software.de