Archiving & Compression in Linux


Introduction

Archiving bundles multiple files and directories into a single container. Compression then reduces that container’s size. In Linux, tar handles archiving — and optionally compression — while zip and unzip handle the cross-platform format common in mixed environments. This note covers both toolchains.


Creating TAR Archives

tar (tape archive) is the standard archiving utility on Linux and macOS. It bundles files into a .tar container. With the -z flag, it applies gzip compression on top, producing .tar.gz archives.

Creating an Archive

tar -czvf archive.tar.gz file1.txt file2.txt directory/
| Flag | Purpose |
|------|---------|
| -c | Create a new archive |
| -z | Compress with gzip |
| -v | Verbose: list files as they’re processed |
| -f | Specify the archive filename |

Extracting an Archive

tar -xzvf archive.tar.gz                    # Extract to current directory
tar -xzvf archive.tar.gz -C /opt/restore/   # Extract to a specific path

The -x flag extracts. The -C flag sets the destination directory, which must already exist; without it, files land in the current working directory.
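
A minimal round trip, sketched under assumptions: the scratch directory and file contents are hypothetical, and -C is also used on the create side (not shown in the commands above) so that member names are stored relative to the source directory.

```shell
#!/bin/sh
set -eu

# Scratch area so the demo never touches real data (hypothetical paths).
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/src/directory" "$work/restore"
echo "alpha" > "$work/src/file1.txt"
echo "beta"  > "$work/src/directory/file2.txt"

# Create: -C changes into src/ first, so members are stored as relative names.
tar -czf "$work/archive.tar.gz" -C "$work/src" file1.txt directory

# Extract: the -C target must already exist; tar does not create it.
tar -xzf "$work/archive.tar.gz" -C "$work/restore"

cat "$work/restore/directory/file2.txt"    # prints: beta
```

Dropping -v keeps the output quiet, which tends to suit scripts better than interactive use.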

Listing Archive Contents

Before extracting an unknown archive, inspect its contents:

tar -tzvf archive.tar.gz

This lists every member without writing anything to disk, which is useful for confirming that the archive is readable and for spotting unexpected entries, such as absolute paths or ../ components, before extraction.
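
The pre-extraction check can be scripted. The sketch below is one possible approach, not a complete defense: it builds a hypothetical sample archive, lists the member names, and flags absolute paths or ".." components. The grep pattern and the variable names are assumptions made for illustration, and the check does not cover symlink tricks.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/src/directory"
echo "alpha" > "$work/src/file1.txt"
echo "beta"  > "$work/src/directory/file2.txt"
tar -czf "$work/archive.tar.gz" -C "$work/src" file1.txt directory

# -t without -v prints bare member names, one per line, which is easy to filter.
names=$(tar -tzf "$work/archive.tar.gz")

# Flag absolute paths and any ".." path component.
if printf '%s\n' "$names" | grep -Eq '^/|(^|/)\.\.(/|$)'; then
    verdict="suspicious"
else
    verdict="ok"
fi
echo "$verdict"
```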

Appending Files

Additional files can be added to an existing archive:

tar -rvf archive.tar additional_file.txt

Note: Appending to a compressed (.tar.gz) archive requires decompression first. The -r flag works on uncompressed .tar containers only.
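
For a .tar.gz, the decompress, append, recompress cycle described in the note might look like the following. File names are hypothetical and the demo runs in a scratch directory.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"
echo "alpha" > file1.txt
echo "extra" > additional_file.txt
tar -czf archive.tar.gz file1.txt

gzip -d archive.tar.gz                     # leaves archive.tar
tar -rf archive.tar additional_file.txt    # append to the plain .tar
gzip archive.tar                           # recompress to archive.tar.gz

tar -tzf archive.tar.gz                    # lists both members
```

For large archives this rewrites the whole file, so creating a fresh archive is sometimes the simpler choice.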

Extracting Specific Files

To extract only certain files from an archive, name them after the archive:

tar -xzvf archive.tar.gz file1.txt directory/file2.txt
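
Member names have to match the form in which they are stored, so it can help to take the name from the listing rather than type it from memory. A small sketch, assuming a hypothetical archive built from "." (which stores names with a leading ./):

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"
mkdir -p src/directory out
echo "alpha" > src/file1.txt
echo "beta"  > src/directory/file2.txt

# Archiving "." stores members with a leading "./" (./file1.txt, ...).
tar -czf archive.tar.gz -C src .

# Look the name up in the listing instead of guessing its stored form.
member=$(tar -tzf archive.tar.gz | grep '/file2\.txt$')
tar -xzf archive.tar.gz -C out "$member"

echo "$member"
```

Only the named member is unpacked; file1.txt stays inside the archive.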

Working with ZIP Archives

zip is the standard for cross-platform archives — common when sharing files between Linux, macOS, and Windows systems.

Creating a ZIP Archive

zip archive.zip file1.txt file2.txt

Without -r, zip stores a directory argument as an empty entry and skips its contents. Add the -r flag to recurse into directories:

zip -r archive.zip directory/

Extracting a ZIP Archive

unzip archive.zip                     # Extract to current directory
unzip archive.zip -d /opt/restore/    # Extract to a specific path

Listing Contents

unzip -l archive.zip

Adding Files to an Existing Archive

zip -r archive.zip additional_file.txt    # Adds or updates entries; -r only matters for directories

Extracting Specific Files

unzip archive.zip file1.txt directory/file2.txt

Integrity note: When handling untrusted archives, list the contents first, then extract into an isolated scratch directory (or stream a single member to stdout with unzip -p) rather than unpacking directly over production files.
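
One way to follow that advice is sketched below. The scratch paths and the status and preview variables are hypothetical, and because zip and unzip are separate packages that minimal installs often lack, the sketch skips itself when either is missing.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

status="skipped"
preview=""
if command -v zip >/dev/null 2>&1 && command -v unzip >/dev/null 2>&1; then
    mkdir directory
    echo "alpha" > file1.txt
    echo "beta"  > directory/file2.txt
    zip -qr archive.zip file1.txt directory/

    unzip -l archive.zip                        # inspect names before unpacking
    scratch=$(mktemp -d "$work/scratch.XXXXXX")
    unzip -q archive.zip -d "$scratch"          # unpack into the isolated directory
    preview=$(unzip -p archive.zip file1.txt)   # stream one member; nothing is written
    status="extracted"
fi
echo "$status"
```

After reviewing the scratch directory, only the files that are actually wanted need to be copied into place.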


Compression Utilities

tar delegates compression to external tools. You can also use them independently on single files.

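| Utility | Compress | Decompress | Decompress to stdout |
|---------|----------|------------|----------------------|
| gzip | gzip file | gzip -d file.gz | gzip -dc file.gz |
| bzip2 | bzip2 file | bzip2 -d file.bz2 | bzip2 -dc file.bz2 |
| tar (archive only) | tar -cf archive.tar files/ | tar -xf archive.tar | tar -xOf archive.tar |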

Decompressing to stdout (-dc or -xO) is useful when piping output into another command without writing intermediate files.
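
As an illustration, the sketch below searches a compressed log and a tar member without ever writing a decompressed copy. The log contents and file names are made up for the demo.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"
printf 'ok\nerror: disk full\nok\nerror: timeout\n' > app.log
gzip app.log                           # replaces app.log with app.log.gz

# Count matching lines straight from the compressed stream.
errors=$(gzip -dc app.log.gz | grep -c '^error')
echo "$errors"                         # prints: 2

# Same idea for an archive member: -O sends the member's bytes to stdout.
tar -cf logs.tar app.log.gz
timeouts=$(tar -xOf logs.tar app.log.gz | gzip -dc | grep -c 'timeout')
echo "$timeouts"                       # prints: 1
```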

Inspecting Nested Archives

When dealing with nested compression layers — common in malware samples or packaged exploits — decompress one layer at a time:

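file layer1.gz                 # Confirm the outer layer really is gzip
gzip -dc layer1.gz > layer2
file layer2                    # Identify the next layer before unpacking it
gzip -dc layer2 > layer3
file layer3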

This allows a step-by-step audit of each layer rather than blindly unpacking everything at once.
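
The same layer-by-layer approach can be looped. This sketch swaps the file check for gzip -t, which tests whether the input is a valid gzip stream, and caps the depth at an arbitrary 10 layers. The sample data, variable names, and the cap are assumptions, and it only handles gzip layers.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

# Hypothetical sample: plain text wrapped in two gzip layers.
echo "payload" | gzip | gzip > layer1.gz

current=layer1.gz
depth=0
# Reading from stdin sidesteps gzip's filename-suffix checks.
while [ "$depth" -lt 10 ] && gzip -t < "$current" 2>/dev/null; do
    depth=$((depth + 1))
    next="layer$((depth + 1))"
    gzip -dc < "$current" > "$next"    # peel exactly one layer
    current=$next
done
echo "peeled $depth layers; innermost file: $current"
```

Each intermediate file stays on disk, so every layer can still be examined individually afterwards.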


TAR vs ZIP — When to Use Which

| Consideration | TAR | ZIP |
|---------------|-----|-----|
| Platform | Linux and macOS native | Cross-platform (Linux, macOS, Windows) |
| Compression | Delegated to gzip, bzip2, xz, etc. | Built-in (deflate) |
| Preserves permissions | Yes: stores full Unix metadata | Limited: permissions often stripped |
| Streaming | Designed for pipes (with -f -, tar reads or writes the archive on stdin/stdout) | Not stream-friendly |
| Best for | System backups, Linux-native workflows | File sharing with non-Linux systems |

Practical rule: Use tar for anything internal to Linux — backups, deployment packages, forensic evidence archives. Use zip when the recipient is on Windows or expects a universally extractable format.
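
The streaming and permission points can be seen in a short sketch. It copies a tree through a pipe with no intermediate archive file, then writes a compressed backup the same way. Paths and file contents are hypothetical; -p asks tar to restore the stored permission bits on extraction.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/src" "$work/dst"
printf '#!/bin/sh\necho hi\n' > "$work/src/run.sh"
chmod 750 "$work/src/run.sh"

# "-f -" makes the writer emit the archive on stdout and the reader take it from stdin.
tar -cf - -C "$work/src" . | tar -xpf - -C "$work/dst"

# Piping into gzip by hand gives the same kind of .tar.gz that -z produces.
tar -cf - -C "$work/src" . | gzip > "$work/backup.tar.gz"

ls -l "$work/dst/run.sh"
```

The executable bit survives the copy, which is the behaviour backups and deployment packages rely on.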


Command Reference

| Task | tar | zip |
|------|-----|-----|
| Create | tar -czvf archive.tar.gz files/ | zip -r archive.zip files/ |
| Extract | tar -xzvf archive.tar.gz | unzip archive.zip |
| List contents | tar -tzvf archive.tar.gz | unzip -l archive.zip |
| Extract to path | tar -xzvf archive.tar.gz -C /path/ | unzip archive.zip -d /path/ |
| Extract specific files | tar -xzvf archive.tar.gz file1 | unzip archive.zip file1 |
| Append | tar -rvf archive.tar file | zip -r archive.zip file |