This blog post guides you through the basics of managing tar files in Linux, including how to create, list, and extract tar archives. It also covers how to create and extract compressed tar archives.
What are tar files?
A tar file is an archive format that bundles multiple files into a single file. This is useful for a variety of purposes, such as backing up data, distributing files, or preparing a set of files for compression.
Tar files can be compressed using a variety of compression algorithms, such as gzip, bzip2, and xz. This can help to reduce the size of the tar file, making it easier to store and distribute.
Tar files are a popular way to archive files in Linux and Unix-like operating systems. They are also supported by many operating systems, including Windows and macOS.
Here are some of the benefits of using tar files:
- Tar files are a convenient way to archive multiple files into a single file.
- Tar files can be compressed to reduce their size.
- Tar files are supported by many operating systems.
- Tar files are a standard format for archiving files, which makes them easy to share with others.
Overall, tar files are a versatile and powerful tool for archiving and distributing files.
Managing compressed tar files
After completing this article, you should be able to archive files and directories into a compressed file using tar and extract the contents of an existing tar archive.
The tar Command
Archiving and compressing files are useful when creating backups and transferring data across a network. One of the oldest and most common commands for creating and working with backup archives is the tar command.
With tar, users can gather large sets of files into a single file (archive). A tar archive is a structured sequence of file data, with metadata about each file stored alongside it so that individual files can be extracted. The archive can be compressed using gzip, bzip2, or xz compression.
The tar command can list the contents of archives or extract their files to the current system.
Selected tar options
tar command options are divided into operations (the action you want to take), general options, and compression options. The tables below show common options, their long versions, and their descriptions:
Overview of tar Operations
OPTION | DESCRIPTION |
-c, --create | Create a new archive. |
-x, --extract | Extract from an existing archive. |
-t, --list | List the table of contents of an archive. |
Selected tar General Options
OPTION | DESCRIPTION |
-v, --verbose | Verbose. Shows which files get archived or extracted. |
-f, --file= | File name. This option must be followed by the file name of the archive to use or create. |
-p, --preserve-permissions | Preserve the permissions of files and directories when extracting an archive, without subtracting the umask. |
Overview of tar Compression Options
OPTION | DESCRIPTION |
-z, --gzip | Use gzip compression (.tar.gz). |
-j, --bzip2 | Use bzip2 compression (.tar.bz2). bzip2 typically achieves a better compression ratio than gzip. |
-J, --xz | Use xz compression (.tar.xz). The xz compression typically achieves a better compression ratio than bzip2. |
Listing options of the tar command
The tar command expects one of the three following options:
- Use the -c or --create option to create an archive.
- Use the -t or --list option to list the contents of an archive.
- Use the -x or --extract option to extract an archive.
Other commonly used options are:
- Use the -f or --file= option with a file name as an argument of the archive to operate.
- Use the -v or --verbose option for verbosity; useful to see which files get added to or extracted from the archive.
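As a small illustration of how an operation combines with the -v and -f options, the sketch below creates and then lists an archive in a scratch directory. The directory and the file names (file1, file2, archive.tar) are placeholders chosen for this example only.

```shell
# Hypothetical scratch directory and file names, used only for illustration.
work=$(mktemp -d)
cd "$work" || exit 1
printf 'one\n' > file1
printf 'two\n' > file2

# -c creates, -v reports each member as it is added, -f names the archive.
tar -cvf archive.tar file1 file2

# -t lists the members that are now stored in the archive.
listed=$(tar -tf archive.tar)
printf '%s\n' "$listed"
```

The listing should show the two member names in the order they were added.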
Note
The tar command also supports a third, older option style that uses the standard single-letter options with no leading -. It is still commonly encountered, and you might run into this syntax when working with other people's instructions or commands. The info tar 'old options' command discusses how this differs from normal short options in some detail.
You can ignore old options for now and focus on the standard short and long options syntax.
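For recognition purposes only, here is a minimal sketch that creates the same archive with each of the three option styles. The scratch directory and the file and archive names are assumptions made for this example.

```shell
# Hypothetical scratch directory and names, used only for illustration.
work=$(mktemp -d)
cd "$work" || exit 1
printf 'sample\n' > file1

tar -cf short.tar file1               # short options
tar --create --file=long.tar file1    # long options
tar cf old.tar file1                  # old style, no leading dash

# Each archive should list the same single member.
for archive in short.tar long.tar old.tar; do
    printf '%s: %s\n' "$archive" "$(tar -tf "$archive")"
done
```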
ARCHIVING FILES AND DIRECTORIES
The first option to use when creating a new archive is the c option, followed by the f option, then a single space, then the file name of the archive to be created, and finally the list of files and directories that should get added to the archive. The archive is created in the current directory unless specified otherwise.
WARNING
Before creating a tar archive, verify that there is no other archive in the directory with the same name as the new archive to be created. The tar command overwrites an existing archive without warning.
The following command creates an archive named archive.tar with the contents of file1, file2, and file3 in the user’s home directory.
[user@host ~]$ tar -cf archive.tar file1 file2 file3
[user@host ~]$ ls archive.tar
archive.tar
The above tar command can also be executed using the long version options.
[user@host ~]$ tar --file=archive.tar --create file1 file2 file3
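Because tar overwrites an existing archive without asking, a script might check for the file first. The following is only a sketch of one possible guard; the create_archive function, the scratch directory, and the file names are hypothetical and not part of tar.

```shell
# Hypothetical scratch directory and names, used only for illustration.
work=$(mktemp -d)
cd "$work" || exit 1
printf 'data\n' > file1
archive=archive.tar

# Hypothetical helper: refuse to run tar when the target archive already exists.
create_archive() {
    if [ -e "$archive" ]; then
        echo "refusing to overwrite $archive" >&2
        return 1
    fi
    tar -cf "$archive" "$@"
}

# First call: no archive exists yet, so it is created.
if create_archive file1; then first=created; else first=refused; fi
# Second call: archive.tar is already there, so the helper declines.
if create_archive file1; then second=created; else second=refused; fi
echo "first=$first second=$second"
```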
NOTE
When archiving files by absolute path names, the leading / of the path is removed from the file name by default. Removing the leading / of the path helps users avoid overwriting important files when extracting the archive. The tar command extracts files relative to the current working directory.
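The effect of the removed leading / can be observed safely in a scratch directory. This sketch uses a hypothetical file named notes.txt and a hypothetical restore directory; it archives the file by absolute path and then extracts it somewhere else.

```shell
# Hypothetical scratch directory and names, used only for illustration.
work=$(mktemp -d)
printf 'hello\n' > "$work/notes.txt"

# Archive the file by its absolute path; the warning on stderr is silenced here.
tar -cf "$work/abs.tar" "$work/notes.txt" 2>/dev/null

# The stored member name should no longer start with /.
members=$(tar -tf "$work/abs.tar")
printf '%s\n' "$members"

# Extraction recreates the path underneath the current directory.
mkdir "$work/restore"
(cd "$work/restore" && tar -xf "$work/abs.tar")
find "$work/restore" -name notes.txt
```

The restored copy lands below the restore directory rather than on top of the original file.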
For tar to be able to archive the selected files, it is mandatory that the user executing the tar command can read the files. For example, creating a new archive of the /etc folder and all of its content requires root privileges, because only the root user is allowed to read all of the files present in the /etc directory. An unprivileged user can create an archive of the /etc directory, but the archive omits files which do not include read permission for the user, and it omits directories which do not include both read and execute permission for the user.
To create the tar archive named /root/etc.tar, with the /etc directory as its content, as user root:
[root@host ~]# tar -cf /root/etc.tar /etc
tar: Removing leading `/' from member names
[root@host ~]#
IMPORTANT
Some advanced file attributes, such as ACLs and SELinux contexts, are not automatically stored in a tar archive. Use the --xattrs option when creating an archive to store extended attributes in the tar archive. GNU tar also provides the --acls and --selinux options for those attributes specifically.
LISTING CONTENTS OF AN ARCHIVE
The t option directs tar to list the contents (table of contents, hence t) of the archive. Use the f option with the name of the archive to be queried. For example:
[root@host ~]# tar -tf /root/etc.tar
etc/
etc/fstab
etc/crypttab
etc/mtab
...output omitted...
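Two listing variations can be handy: adding -v for details, and naming a member to narrow the listing. The sketch below assumes GNU tar and uses a hypothetical project directory in a scratch location.

```shell
# Hypothetical scratch directory and names, used only for illustration.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir -p project/docs
printf 'readme\n' > project/README
printf 'guide\n' > project/docs/guide.txt
tar -cf project.tar project

# -v adds permissions, owner, size, and timestamp to each listed entry.
tar -tvf project.tar

# Naming a member after the archive restricts the listing to that member.
only_docs=$(tar -tf project.tar project/docs)
printf '%s\n' "$only_docs"
```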
EXTRACTING FILES FROM AN ARCHIVE
A tar archive should usually be extracted in an empty directory to ensure it does not overwrite any existing files. When root extracts an archive, the tar command preserves the original user and group ownership of the files. If a regular user extracts files using tar, the file ownership belongs to the user extracting the files from the archive.
To restore files from the /root/etc.tar archive to the /root/etcbackup directory, run:
[root@host ~]# mkdir /root/etcbackup
[root@host ~]# cd /root/etcbackup
[root@host etcbackup]# tar -tf /root/etc.tar
etc/
etc/fstab
etc/crypttab
etc/mtab
...output omitted...
[root@host etcbackup]# tar -xf /root/etc.tar
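As an alternative to changing directory first, tar accepts a -C option that switches to a directory before it archives or extracts. The -C option is not covered in the text above, so treat this as a supplementary sketch; the directory and file names are hypothetical.

```shell
# Hypothetical scratch directory and names, used only for illustration.
work=$(mktemp -d)
mkdir "$work/src" "$work/restore" "$work/partial"
printf 'a\n' > "$work/src/a.conf"
printf 'b\n' > "$work/src/b.conf"

# -C on creation: store members relative to $work, so names begin with src/.
tar -cf "$work/src.tar" -C "$work" src

# -C on extraction: unpack into the target directory without running cd first.
tar -xf "$work/src.tar" -C "$work/restore"

# Naming a member extracts only that file.
tar -xf "$work/src.tar" -C "$work/partial" src/a.conf

ls -R "$work/restore" "$work/partial"
```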
By default, when files get extracted from an archive, the umask is subtracted from the permissions of archive content. To preserve the permissions of an archived file, use the p option when extracting an archive.
In this example, an archive named /root/myscripts.tar is extracted in the /root/scripts directory while preserving the permissions of the extracted files:
[root@host ~]# mkdir /root/scripts
[root@host ~]# cd /root/scripts
[root@host scripts]# tar -xpf /root/myscripts.tar
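To see the umask effect without touching real scripts, the sketch below extracts the same archive twice under a restrictive umask. It assumes GNU tar and GNU stat; the --no-same-permissions option is used so that the umask is applied even when the commands run as root, and all file names are hypothetical.

```shell
# Hypothetical scratch directory and names, used only for illustration.
work=$(mktemp -d)
cd "$work" || exit 1
printf '#!/bin/sh\n' > script.sh
chmod 777 script.sh
tar -cf scripts.tar script.sh
mkdir plain kept

(
    umask 077
    # Regular-user default behavior, stated explicitly: the umask is subtracted.
    tar -xf scripts.tar -C plain --no-same-permissions
    # With -p the archived mode is restored as stored.
    tar -xpf scripts.tar -C kept
)

plain_mode=$(stat -c '%a' plain/script.sh)
kept_mode=$(stat -c '%a' kept/script.sh)
echo "without -p: $plain_mode, with -p: $kept_mode"
```

Under these assumptions the first copy should end up with mode 700 and the second with mode 777.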
CREATING A COMPRESSED ARCHIVE
The tar command supports three compression methods. The gzip compression is the fastest and oldest one, and is the most widely available across distributions and even across platforms. The bzip2 compression creates smaller archive files than gzip but is less widely available, while the xz compression method is relatively new but usually offers the best compression ratio of the three.
NOTE
The effectiveness of any compression algorithm depends on the type of data that is compressed. Data files that are already compressed, such as compressed picture formats or RPM files, usually lead to a low compression ratio.
It is good practice to use a single top-level directory, which can contain other directories and files, to simplify the extraction of the files in an organized way.
Use one of the following options to create a compressed tar archive:
- -z or --gzip for gzip compression (filename.tar.gz or filename.tgz)
- -j or --bzip2 for bzip2 compression (filename.tar.bz2)
- -J or --xz for xz compression (filename.tar.xz)
To create a gzip compressed archive named /root/etcbackup.tar.gz, with the contents from the /etc directory on host:
[root@host ~]# tar -czf /root/etcbackup.tar.gz /etc
tar: Removing leading `/' from member names
To create a bzip2 compressed archive named /root/logbackup.tar.bz2, with the contents from the /var/log directory on host:
[root@host ~]# tar -cjf /root/logbackup.tar.bz2 /var/log
tar: Removing leading `/' from member names
To create an xz compressed archive named /root/sshconfig.tar.xz, with the contents from the /etc/ssh directory on host:
[root@host ~]# tar -cJf /root/sshconfig.tar.xz /etc/ssh
tar: Removing leading `/' from member names
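To get a feel for how the three methods compare, one could archive the same sample data with each of them and look at the sizes. The sketch below generates repetitive text in a hypothetical data directory, so the ratios it shows are not representative of real files; compressors that are not installed are skipped.

```shell
# Hypothetical scratch directory and names, used only for illustration.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir data

# Repetitive text compresses well; real data will behave differently.
i=0
while [ "$i" -lt 2000 ]; do
    echo "line $i of highly repetitive sample text"
    i=$((i + 1))
done > data/sample.txt

tar -cf data.tar data
# One compressed archive per method, skipping tools that are not installed.
if command -v gzip  >/dev/null; then tar -czf data.tar.gz  data; fi
if command -v bzip2 >/dev/null; then tar -cjf data.tar.bz2 data; fi
if command -v xz    >/dev/null; then tar -cJf data.tar.xz  data; fi

# Compare the resulting sizes in bytes.
wc -c data.tar*
```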
After creating an archive, verify the content of the archive using the tf options. It is not mandatory to specify the compression option when listing the content of a compressed archive file. For example, to list the content archived in the /root/etcbackup.tar.gz file, which uses the gzip compression, use the following command:
[root@host ~]# tar -tf /root/etcbackup.tar.gz
etc/
etc/fstab
etc/crypttab
etc/mtab
...output omitted...
EXTRACTING A COMPRESSED ARCHIVE
The first step when extracting a compressed tar archive is to determine where the archived files should be extracted to, then create and change to the target directory. The tar command determines which compression was used, so it is usually not necessary to specify the compression option that was used when creating the archive. It is valid to add the decompression option to the tar command. If you do so, you must use the correct one; otherwise, tar reports an error because the specified decompression type does not match the compression of the file.
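Both behaviors can be tried on a throwaway archive. The sketch below assumes GNU tar with gzip installed and uses hypothetical names; it extracts a gzip archive once with no compression option and once with the mismatched -j option.

```shell
# Hypothetical scratch directory and names, used only for illustration.
work=$(mktemp -d)
cd "$work" || exit 1
printf 'payload\n' > file1
tar -czf sample.tar.gz file1
mkdir auto wrong

# No compression option: tar detects gzip on its own when reading.
tar -xf sample.tar.gz -C auto

# Mismatched option: -j asks for bzip2, so the extraction is expected to fail.
if tar -xjf sample.tar.gz -C wrong 2>/dev/null; then
    wrong_result=extracted
else
    wrong_result=failed
fi
echo "auto: $(ls auto), wrong option: $wrong_result"
```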
To extract the contents of a gzip compressed archive named /root/etcbackup.tar.gz in the /tmp/etcbackup directory:
[root@host ~]# mkdir /tmp/etcbackup
[root@host ~]# cd /tmp/etcbackup
[root@host etcbackup]# tar -tf /root/etcbackup.tar.gz
etc/
etc/fstab
etc/crypttab
etc/mtab
...output omitted...
[root@host etcbackup]# tar -xzf /root/etcbackup.tar.gz
To extract the contents of a bzip2 compressed archive named /root/logbackup.tar.bz2 in the /tmp/logbackup directory:
[root@host ~]# mkdir /tmp/logbackup
[root@host ~]# cd /tmp/logbackup
[root@host logbackup]# tar -tf /root/logbackup.tar.bz2
var/log/
var/log/lastlog
var/log/README
var/log/private/
var/log/wtmp
var/log/btmp
...output omitted...
[root@host logbackup]# tar -xjf /root/logbackup.tar.bz2
To extract the contents of an xz compressed archive named /root/sshbackup.tar.xz in the /tmp/sshbackup directory:
[root@host ~]# mkdir /tmp/sshbackup
[root@host ~]# cd /tmp/sshbackup
[root@host sshbackup]# tar -tf /root/sshbackup.tar.xz
etc/ssh/
etc/ssh/moduli
etc/ssh/ssh_config
etc/ssh/ssh_config.d/
etc/ssh/ssh_config.d/05-redhat.conf
etc/ssh/sshd_config
...output omitted...
[root@host sshbackup]# tar -xJf /root/sshbackup.tar.xz
Listing a compressed tar archive works in the same way as listing an uncompressed tar archive.
NOTE
Additionally, gzip, bzip2, and xz can be used independently to compress single files. For example, the gzip etc.tar command results in the etc.tar.gz compressed file, while the bzip2 abc.tar command results in the abc.tar.bz2 compressed file, and the xz myarchive.tar command results in the myarchive.tar.xz compressed file.
The corresponding commands to decompress are gunzip, bunzip2, and unxz. For example, the gunzip /tmp/etc.tar.gz command results in the etc.tar uncompressed tar file, while the bunzip2 abc.tar.bz2 command results in the abc.tar uncompressed tar file, and the unxz myarchive.tar.xz command results in the myarchive.tar uncompressed tar file.
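A quick round trip shows that these tools replace the file they operate on. This sketch uses gzip and gunzip on a small, hypothetical etc.tar created in a scratch directory.

```shell
# Hypothetical scratch directory and names, used only for illustration.
work=$(mktemp -d)
cd "$work" || exit 1
printf 'content\n' > file1
tar -cf etc.tar file1

# gzip replaces etc.tar with etc.tar.gz.
gzip etc.tar
after_gzip=$(ls)

# gunzip reverses the step and restores etc.tar.
gunzip etc.tar.gz
after_gunzip=$(ls)

printf 'after gzip:\n%s\nafter gunzip:\n%s\n' "$after_gzip" "$after_gunzip"
```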
Example:
►Use the tar command with the -czf options to create an archive of the /etc directory using gzip compression. Save the archive file as /tmp/etc.tar.gz.
[root@servera ~]# tar -czf /tmp/etc.tar.gz /etc
tar: Removing leading `/' from member names
[root@servera ~]#
►Use the tar command with the -tzf options to verify that the etc.tar.gz archive contains the files from the /etc directory.
[root@servera ~]# tar -tzf /tmp/etc.tar.gz
etc/
etc/mtab
etc/fstab
etc/crypttab
etc/resolv.conf
...output omitted...
►On servera, create a directory named /backuptest. Verify that the etc.tar.gz backup file is a valid archive by decompressing the file to the /backuptest directory.
• Create the /backuptest directory.
[root@servera ~]# mkdir /backuptest
• Change to the /backuptest directory.
[root@servera ~]# cd /backuptest
[root@servera backuptest]#
• List the contents of the etc.tar.gz archive before extracting.
[root@servera backuptest]# tar -tzf /tmp/etc.tar.gz
etc/
etc/mtab
etc/fstab
etc/crypttab
etc/resolv.conf
...output omitted...
• Extract the /tmp/etc.tar.gz archive to the /backuptest directory.
[root@servera backuptest]# tar -xzf /tmp/etc.tar.gz
[root@servera backuptest]#
• List the content of the /backuptest directory. Verify that the directory contains the files from the /etc directory.
[root@servera backuptest]# ls -l
total 12
drwxr-xr-x. 95 root root 8192 Feb 8 10:16 etc
[root@servera backuptest]# cd etc
[root@servera etc]# ls -l
total 1204
-rw-r--r--. 1 root root 16 Jan 16 23:41 adjtime
-rw-r--r--. 1 root root 1518 Sep 10 17:21 aliases
drwxr-xr-x. 2 root root 169 Feb 4 21:58 alternatives
-rw-r--r--. 1 root root 541 Oct 2 21:01 anacrontab
...output omitted...
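The visual check in the last step could also be automated by comparing the extracted tree with the original. The following is a sketch under stated assumptions, not part of the exercise: it works on a small, hypothetical source directory in a scratch location instead of /etc, and it uses the -C option and diff -r, neither of which the steps above rely on.

```shell
# Hypothetical scratch directory and names, used only for illustration.
work=$(mktemp -d)
mkdir -p "$work/source/conf.d" "$work/backuptest"
printf 'key=value\n' > "$work/source/main.conf"
printf 'extra=1\n' > "$work/source/conf.d/extra.conf"

# Create the compressed archive, then unpack it into the test directory.
tar -czf "$work/source.tar.gz" -C "$work" source
tar -xzf "$work/source.tar.gz" -C "$work/backuptest"

# diff -r exits with status 0 only when both trees have identical contents.
if diff -r "$work/source" "$work/backuptest/source"; then
    verdict="backup matches source"
else
    verdict="backup differs from source"
fi
echo "$verdict"
```

Note that diff -r compares file contents only; ownership and permissions would need a separate check.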