The data in an archive is grouped into records, each 512 bytes long. Records are read and written in whole-number multiples called blocks. The number of records in a block (i.e., the size of a block in units of 512 bytes) is called the blocking factor. The `--block-size=512-size' (`-b 512-size') option specifies the blocking factor of an archive. The default blocking factor is typically 20 (i.e., 10240 bytes), but it can be changed at installation. To find out the blocking factor of an existing archive, use `tar --list --file=archive-name'. This may not work on some devices.
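The relationship between records, blocks, and the blocking factor is simple arithmetic, sketched here in Python. This fragment is only an illustration and is not part of tar; the names `RECORD_SIZE' and `block_size_bytes' are invented for the example.

```python
RECORD_SIZE = 512  # bytes per record, as defined above


def block_size_bytes(blocking_factor):
    """Return the size in bytes of one block for a given blocking factor."""
    if blocking_factor < 1:
        raise ValueError("blocking factor must be at least 1")
    return blocking_factor * RECORD_SIZE


print(block_size_bytes(20))   # 10240, the typical default
print(block_size_bytes(126))  # 64512
```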
Blocks are separated by gaps, which waste space on the archive media.
If you are archiving on magnetic tape, using a larger blocking factor
(and therefore larger blocks) provides faster throughput and allows
you to fit more data on a tape (because there are fewer gaps). If you
are archiving on cartridge, a very large blocking factor (say 126 or
more) greatly increases performance. A smaller blocking factor, on the
other hand, may be useful when archiving small files, to avoid
archiving lots of nulls as tar fills out the archive to the end of the
block. In general, the ideal block size depends on the size of the
inter-block gaps on the tape you are using, and the average size of
the files you are archiving.
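The trade-off described above can be made concrete with a rough calculation. The Python sketch below assumes you already know how many 512-byte records the archive occupies before padding (headers and end-of-archive marker included); the function name `blocks_and_padding' is hypothetical, and the number of gaps is taken to be roughly the number of blocks.

```python
RECORD_SIZE = 512  # bytes per record


def blocks_and_padding(archive_records, blocking_factor):
    """Return (number of blocks, null padding in bytes) for an archive
    that occupies archive_records records before padding."""
    blocks = -(-archive_records // blocking_factor)  # ceiling division
    padding_records = blocks * blocking_factor - archive_records
    return blocks, padding_records * RECORD_SIZE


# A small archive of 25 records: a large factor wastes a lot of space.
print(blocks_and_padding(25, 20))      # (2, 7680)
print(blocks_and_padding(25, 126))     # (1, 51712)

# A large archive of 10000 records: a large factor means far fewer blocks.
print(blocks_and_padding(10000, 20))   # (500, 0)
print(blocks_and_padding(10000, 126))  # (80, 40960)
```

For the small archive, the larger blocking factor mostly adds nulls. For the large one, it cuts the number of blocks (and so the number of gaps) by about a factor of six at the cost of a little padding.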
FIXME: xref Creating Archives, for information on writing archives.
FIXME: need example of using a cartridge with blocksize=126 or more
Archives with blocking factors larger than 20 cannot be read by very
old versions of tar, or by some newer versions of tar running on old
machines with small address spaces. With GNU tar, the blocking factor
of an archive is limited only by the maximum block size of the device
containing the archive, or by the amount of available virtual memory.
If you use a non-default blocking factor when you create an archive,
you must specify the same blocking factor when you modify that
archive. Some archive devices will also require you to specify the
blocking factor when reading that archive; however, this is not
typically the case. Usually, you can use `--list' (`-t') without
specifying a blocking factor: tar reports a non-default block size and
then lists the archive members as it normally would. To extract files
from an archive with a non-standard blocking factor (particularly if
you're not sure what the blocking factor is), you can usually use the
`--read-full-blocks' (`-B') option while specifying a blocking factor
larger than the blocking factor of the archive (e.g., `tar --extract
--read-full-blocks --block-size=300').
FIXME: xref Listing Contents, for more information on the `--list' (`-t') operation.
FIXME: xref read-full-blocks, for a more detailed explanation of that option.
Device blocking
-b blocks
--block-size=blocks
This option is used to specify a blocking factor for the archive.
When reading or writing the archive, tar will do reads and writes of
the archive in blocks of blocks*512 bytes. The default blocking factor
is set when tar is compiled, and is typically 20.
Blocking factors larger than 20 cannot be read by very old versions
of tar, or by some newer versions of tar running on old machines with
small address spaces.
With a magnetic tape, larger blocks give faster throughput and fit more data on a tape (because there are fewer inter-block gaps). If the archive is in a disk file or a pipe, you may want to specify a smaller blocking factor, since a large one will result in a large number of null bytes at the end of the archive.
When writing cartridge or other streaming tapes, a much larger blocking factor (say 126 or more) will greatly increase performance. However, you must specify the same blocking factor when reading or updating the archive.
With GNU tar, the blocking factor is limited only by the maximum
block size of the device containing the archive, or by the amount of
available virtual memory.
--block-compress
-i
--ignore-zeros
The `--ignore-zeros' (`-i') option causes tar to ignore blocks of
zeros in the archive. Normally a block of zeros indicates the end of
the archive, but when reading a damaged archive, or one which was
created by cat-ing several archives together, this option allows tar
to read the entire archive. This option is not on by default because
many versions of tar write garbage after the zeroed blocks.
Note that this option causes tar to read to the end of the archive
file, which may sometimes avoid problems when multiple files are
stored on a single physical tape.
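The effect of skipping zeroed blocks can be observed without tar itself. Python's standard `tarfile' module offers a comparable `ignore_zeros' setting, so the sketch below uses it as a stand-in; it is not GNU tar, and the helper names, member names, and contents are made up for the example.

```python
import io
import tarfile


def make_archive(name, data):
    """Build a one-member tar archive in memory and return its bytes."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        info = tarfile.TarInfo(name)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def member_names(archive_bytes, ignore_zeros):
    """List member names, optionally skipping blocks of zeros."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:",
                      ignore_zeros=ignore_zeros) as tar:
        return tar.getnames()


# Equivalent to cat-ing two archives together.
combined = make_archive("a.txt", b"first") + make_archive("b.txt", b"second")

print(member_names(combined, ignore_zeros=False))  # ['a.txt']
print(member_names(combined, ignore_zeros=True))   # ['a.txt', 'b.txt']
```

Without the setting, reading stops at the zeroed blocks that end the first archive. With it, the reader continues past them and finds the member of the second archive as well.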
-B
--read-full-blocks
If `--read-full-blocks' (`-B') is used, tar will not panic if an
attempt to read a block from the archive does not return a full block.
Instead, tar will keep reading until it has obtained a full block.
This option is turned on by default when tar is reading an archive
from standard input, or from a remote machine. This is because on BSD
Unix systems, a read of a pipe will return however much happens to be
in the pipe, even if it is less than tar requested. If this option
were not used, tar would fail as soon as it read an incomplete block
from the pipe.
This option is also useful with the commands for updating an archive.
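The idea of reading until a full block has been obtained can be sketched as a simple loop. The Python below is an illustration only and makes no claim about how tar implements the option; `read_full_block' and the `ShortReader' class, which imitates a pipe that returns less than was requested, are hypothetical names.

```python
import io


def read_full_block(stream, block_size):
    """Keep reading until block_size bytes are collected or input ends."""
    chunks = []
    remaining = block_size
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:  # end of input: return whatever was collected
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


class ShortReader:
    """Stand-in for a pipe: returns at most `limit` bytes per read."""

    def __init__(self, data, limit):
        self._buf = io.BytesIO(data)
        self._limit = limit

    def read(self, n):
        return self._buf.read(min(n, self._limit))


data = b"x" * 10240  # one block at a blocking factor of 20

short = ShortReader(data, limit=4096).read(10240)
full = read_full_block(ShortReader(data, limit=4096), 10240)

print(len(short))  # 4096: a single read returns an incomplete block
print(len(full))   # 10240: the loop assembles the whole block
```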
Tape blocking
FIXME: Appropriate options should be moved here from elsewhere.
When handling various tapes or cartridges, you have to take care to select a proper blocking, that is, the number of disk blocks you put together as a single tape block on the tape, without intervening tape gaps. A tape gap is a small landing area on the tape with no information on it, used for decelerating the tape to a full stop, and for later regaining the reading or writing speed. When the driver starts reading a tape block, the tape block has to be read whole without stopping, as a tape gap is needed to stop the tape motion without losing information.
Using higher blocking (putting more disk blocks per tape block) will use
the tape more efficiently as there will be fewer tape gaps. But reading
such tapes may be more difficult for the system, as more memory will be
required to receive the whole block at once. Further, if there is a
reading error on a huge tape block, it is less likely that the system
will succeed in recovering the information. So, blocking should be
neither too low nor too high. tar uses by default a blocking of 20 for
historical reasons, and it does not really matter when reading or
writing to disk. Current tape technology would easily accommodate
higher blockings. Sun recommends a blocking of 126 for Exabytes and 96
for DATs. Other manufacturers may make different recommendations for
the same tapes. This might also depend on the buffering techniques
used inside modern tape controllers. Some impose a minimum blocking,
or a maximum blocking. Others require blocking to be some power of
two.
So, there is no fixed rule for blocking. But blocking at read time should ideally be the same as the blocking used at write time. At one place I know, with a wide variety of equipment, they found it best to use a blocking of 32 to guarantee that their tapes are fully interchangeable.
I was also told that, for recycled tapes, prior erasure (by the same drive unit that will be used to create the archives) sometimes lowers the error rates observed at rewriting time.