GNU tar: an archiver tool (*DRAFT*)

What `tar` Does

The tar program is used to create and manipulate tar archives. An archive is a single file which contains within it the contents of many files. In addition, the archive identifies the names of the files, their owner, and so forth. (Archives record access permissions, user and group, size in bytes, and last modification time. Some archives also record the file names in each archived directory, as well as other file and directory information.)
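As a rough illustration of the information an archive records, the sketch below builds a small archive and prints a verbose listing of it. It assumes GNU tar and a POSIX shell; the directory and file names (project, hello.txt, notes.txt) are invented for the example.

```shell
set -eu
# Work in a throwaway directory so nothing outside it is touched.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

# Hypothetical sample files, used only for this illustration.
mkdir project
printf 'hello\n' > project/hello.txt
printf 'notes\n' > project/notes.txt

# -c creates an archive; -f names the archive file.
tar -cf project.tar project

# -t lists the archive's contents; -v adds the recorded permissions,
# owner and group, size in bytes, and last modification time.
listing=$(tar -tvf project.tar)
printf '%s\n' "$listing"
```

Each line of the listing describes one entry stored in the archive, including the directory itself.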

The files inside an archive are called members. Within this manual, we use the term file to refer only to files accessible in the normal ways (by ls, cat, and so forth), and the term members to refer only to the members of an archive. Similarly, a file name is the name of a file, as it resides in the filesystem, and a member name is the name of an archive member within the archive.

Initially, tar archives were used to store files conveniently on magnetic tape. The name `tar' comes from this use; it stands for tape archiver. Despite the utility's name, tar can direct its output to any available device, store it in a file, or send it to another program via a pipe. tar can even access remote devices or files as archives.
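To make the point about pipes concrete, here is a minimal sketch in which tar writes the archive to standard output and another program consumes it. It assumes GNU tar, a POSIX shell, and that gzip is installed; the names docs and README are made up for the example.

```shell
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

# Hypothetical sample content.
mkdir docs
printf 'readme\n' > docs/README

# "-f -" sends the archive to standard output instead of a named file;
# here gzip reads it from the pipe and writes a compressed copy.
tar -cf - docs | gzip > docs.tar.gz

# In the other direction, "-f -" makes tar read the archive from
# standard input, so it can list what another program hands it.
piped_members=$(gzip -dc docs.tar.gz | tar -tf -)
printf '%s\n' "$piped_members"
```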

You can use tar archives in many ways. We want to stress a few of them: storage, backup, and transportation.

Storage

Often, tar archives are used to store related files for convenient file transfer over a network. For example, the GNU Project distributes its software bundled into tar archives, so that all the files relating to a particular program (or set of related programs) can be transferred as a single unit.

A magnetic tape can store several files in sequence, but it has no names for them, only their relative positions on the tape. A tar archive, or something like it, is one way to store several files on one tape and retain their names. Even when the basic transfer mechanism can keep track of names, as FTP can, the nuisance of handling multiple files, directories, and multiple links makes tar archives an attractive method.

Archive files are also used for long-term storage, which you can think of as transportation from one time to another.

Backup

Because the archive created by tar is capable of preserving file information and directory structure, tar is commonly used for performing full and incremental backups of disks. A backup gathers together a set of files, possibly belonging to many users and different projects, to guard against accidental destruction of those disks.

The GNU version of tar has special features that allow it to be used to make incremental and full dumps of all the files in a filesystem.
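One way to picture a full dump followed by an incremental one is the sketch below, which uses GNU tar's --listed-incremental option. It assumes GNU tar (other tar implementations may not support this option) and a POSIX shell; the snapshot file name data.snar and the sample files are invented for the example.

```shell
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

# Hypothetical data to back up.
mkdir data
printf 'old\n' > data/old.txt

# Full dump: the snapshot file does not exist yet, so tar archives
# everything and records the state of the tree in data.snar.
tar --listed-incremental=data.snar -cf full.tar data

# Pause so the next file is clearly newer than the full dump, even on
# filesystems with one-second timestamps.
sleep 1
printf 'new\n' > data/new.txt

# Incremental dump: with the snapshot present, tar archives only what
# changed since the previous run.
tar --listed-incremental=data.snar -cf incr.tar data

full_members=$(tar -tf full.tar)
incr_members=$(tar -tf incr.tar)
printf 'full dump:\n%s\n' "$full_members"
printf 'incremental dump:\n%s\n' "$incr_members"
```

Restoring would mean extracting the full dump first and then each incremental dump in order.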

Transportation

Archive files can be used for transporting a group of files from one system to another: put all relevant files into an archive on one computer system, transfer the archive to another, and extract the contents there. The basic transfer medium might be magnetic tape, Internet FTP, or even electronic mail (though you must encode the archive with uuencode or some functional equivalent in order to transport it properly by mail). The two machines do not have to use the same operating system, as long as both support the tar program.

Piping one tar to another is an easy way to copy a directory's contents from one disk to another, while preserving the dates, modes, owners and link structure of all the files therein. tar is also ideal for transferring directories over networks. We sometimes see one copy of tar packing many files into an archive on one machine and sending that archive through a pipe over the network to another copy of tar on another machine, which reads the archive from the pipe and unpacks all the files there.
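The local case of piping one tar to another can be sketched as follows. This assumes GNU tar and a POSIX shell; the src and dst directories and the files in them are invented for the example. For the network case, the second tar would typically be started on the other machine by a remote shell, which is not shown here.

```shell
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Hypothetical source tree with a subdirectory and a hard link.
mkdir -p "$work/src/sub" "$work/dst"
printf 'data\n' > "$work/src/sub/file.txt"
chmod 640 "$work/src/sub/file.txt"
ln "$work/src/sub/file.txt" "$work/src/hardlink.txt"

# First tar:  -C enters src, -c creates, "-f -" writes to the pipe.
# Second tar: -C enters dst, -x extracts, -p keeps permissions,
#             "-f -" reads the archive from the pipe.
tar -C "$work/src" -cf - . | tar -C "$work/dst" -xpf -

ls -lR "$work/dst"
```

No intermediate archive file is written; the archive exists only as a stream between the two processes.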

The tar program provides the ability to create tar archives, as well as to manipulate them in various other ways. For example, you can use tar on previously created archives to extract files, to store additional files, or to update or list files already stored. The term extraction is used to refer to the process of copying an archive member into a file in the filesystem. One might speak of extracting a single member. Extracting all the members of an archive is often called extracting the archive. Also, the term unpack is used to refer to the extraction of many or all the members of an archive.
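The operations named above might look like this in practice. The sketch assumes GNU tar and a POSIX shell; demo.tar, one.txt, two.txt, and the out directory are made-up names, and the update operation (-u) is left out to keep the example short.

```shell
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

# Hypothetical sample files.
printf 'one\n' > one.txt
printf 'two\n' > two.txt

# Create an archive holding a single member.
tar -cf demo.tar one.txt

# Store an additional file in the existing archive (append).
tar -rf demo.tar two.txt

# List the member names now stored in the archive.
members=$(tar -tf demo.tar)
printf '%s\n' "$members"

# Extract a single member into another directory; naming no member
# at all would extract the whole archive.
mkdir out
tar -xf demo.tar -C out two.txt
```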

Conventionally, tar archives are given names ending with `.tar'. This is not necessary for tar to operate properly, but this manual follows the convention in order to get the reader used to seeing it.

Occasionally, tar archives are referred to as tar files, and archive members are referred to as files or entries. For people familiar with the operation of tar, this causes no difficulty. However, this manual consistently uses the terminology above in referring to archives and archive members, to make it easier to learn how to use tar.
