Btrfs supports transparent file compression. Two algorithms are available, ZLIB and LZO. Compression is applied on a per-file basis: a single btrfs mount point can hold some files that are uncompressed, some compressed with LZO and some with ZLIB (though you may not *want* it that way, it is supported).
How do I enable compression?
Mount with compress or compress-force (see Mount options for details). Then write (or re-write) files and they will be transparently compressed. Files that do not compress well are typically written uncompressed and not recompressed later; see the "what happens to incompressible files?" section below.
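For example, assuming a btrfs filesystem on a hypothetical device /dev/sdb1 mounted at /mnt/data, compression could be enabled like this:

```shell
# Enable LZO compression for newly written data (device and path are examples)
mount -o compress=lzo /dev/sdb1 /mnt/data

# Or force compression even for data that looks incompressible
mount -o compress-force=zlib /dev/sdb1 /mnt/data

# Existing files stay as they are; rewriting them compresses them, e.g.:
btrfs filesystem defragment -r -clzo /mnt/data
```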
What is the default compression method?
As of kernel 3.6, it is ZLIB.
What are the differences between compression methods?
There's a speed/ratio trade-off:
- ZLIB -- slower, higher compression ratio (btrfs uses the zlib level 3 setting)
- LZO -- faster compression and decompression than ZLIB, worse compression ratio; designed to be fast
The differences depend on the actual data set and cannot be expressed by a single number or recommendation. Do your own benchmarks. LZO seems to give satisfying results for general use.
Are there other compression methods supported?
Currently no, but there is work in progress to add LZ4. The plan includes its high compression (HC) mode, which compresses slowly but achieves a better ratio; the decompressor remains the same, so decompression speed is unaffected.
Snappy support (compresses slower than LZO but decompresses much faster) has also been proposed. Some work has been done toward adding lzma (very slow, high compression ratio) support as well. The current status is unknown.
Can a file's data be compressed with different methods?
Yes. The compression algorithm is stored per-extent.
How can I determine compressed size of a file?
There is a patch adding support for that, but it is not yet merged. You can roughly estimate a file's compressed size by comparing the output of the df command before and after writing the file, if that is practical for you.
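A crude version of that guess, assuming nothing else is writing to the filesystem at the same time:

```shell
# Estimate the on-disk cost of a new file by sampling df before and after.
# Only a rough guess: any concurrent filesystem activity skews the result.
dir=$(mktemp -d)
before=$(df --output=used -k "$dir" | tail -1)
dd if=/dev/urandom of="$dir/sample" bs=1M count=8 conv=fsync 2>/dev/null
sync
after=$(df --output=used -k "$dir" | tail -1)
echo "approx on-disk size: $((after - before)) KiB"
rm -rf "$dir"
```

On a compressed btrfs mount, compressible data shows a smaller df delta than its nominal size; random data (as above) should show roughly its full size.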
Why doesn't du report the compressed size?
Traditionally, UNIX/Linux filesystems did not support compression, and no field in the stat data structure was allocated for such a purpose. The file size denotes the nominal size, independent of the space actually allocated on disk. For the allocated space, stat.st_blocks contains the number of blocks allocated, which may be smaller than the nominal size would suggest, e.g. in the case of sparse files. When compression is involved, however, the actually allocated size may be smaller than the nominal size even though the file is not sparse.
There are utilities that detect sparseness by comparing the nominal and block-allocated sizes; this behaviour could cause bugs if st_blocks contained the post-compression amount.
Another backward-compatibility issue is that st_blocks has so far always contained the uncompressed number of blocks, and it is unclear what would happen with a mix of files interpreting the value differently. The proposed solution is to add a separate special call for that (via an ioctl), but this may not be the ideal solution.
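The sparse-file case described above can be seen on any filesystem; here a file has a 1 MiB nominal size but (almost) no allocated blocks:

```shell
# st_size is the nominal size; st_blocks counts allocated 512-byte units.
f=$(mktemp)
truncate -s 1M "$f"      # extends the file with a hole, allocating no data
stat -c 'nominal=%s bytes, allocated=%b blocks of %B bytes' "$f"
rm -f "$f"
```

With compression the situation is reversed in spirit: the file is fully written, yet the allocated size on disk is smaller than the nominal size.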
Can I determine what compression method was used on a file?
Not directly, but it would be possible for a userspace tool to determine this without any special kernel support (the code just has not been written yet).
Can I force compression on a file without using the compress mount option?
Yes. The chattr utility supports setting the file attribute c, which marks the inode so that newly written data will be compressed.
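For example (the path is hypothetical; the flag only affects data written after it is set):

```shell
# Mark a file so that newly written data is compressed
touch /mnt/data/logfile
chattr +c /mnt/data/logfile
lsattr /mnt/data/logfile      # the 'c' flag should now be listed

# Data already in the file is not recompressed; rewrite it to compress it
```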
Can I disable compression on a file?
Not directly at the moment. There is no way to do that from userspace; it would require extending the file-attribute interface. Enabling compression via chattr +c works only because that attribute has existed for a long time and a bit is reserved for it in the file attribute ioctl.
Can I set compression per-subvolume?
Currently no, but this is planned. You can simulate it by enabling compression on the subvolume's top-level directory; files and directories created inside will inherit the compression flag.
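A sketch of that workaround (paths are examples):

```shell
# Approximate per-subvolume compression: set +c on the subvolume's top
# directory, and newly created files and subdirectories inherit the flag.
btrfs subvolume create /mnt/data/compressed-svol
chattr +c /mnt/data/compressed-svol
touch /mnt/data/compressed-svol/newfile
lsattr /mnt/data/compressed-svol/newfile    # should show the inherited 'c' flag
```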
What's the precedence of all the options affecting compression?
Compression to newly written data happens:
- always -- if the filesystem is mounted with compress-force
- never -- if the NOCOMPRESS flag is set per-file/-directory
- if possible -- if the COMPRESS per-file flag (aka chattr +c) is set, but it may get converted to NOCOMPRESS eventually
- if possible -- if the compress mount option is specified
Note that mounting with compress will not set the +c file attribute.
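The rules above can be sketched as a small decision function (a hypothetical illustration of the precedence, not kernel code):

```shell
# Decide whether newly written data gets compressed, following the
# precedence listed above. Each argument is "yes" or "no".
compress_decision() {
    force_mount=$1; nocompress_attr=$2; compress_attr=$3; compress_mount=$4
    if [ "$force_mount" = yes ]; then echo always; return; fi
    if [ "$nocompress_attr" = yes ]; then echo never; return; fi
    if [ "$compress_attr" = yes ] || [ "$compress_mount" = yes ]; then
        echo "if possible"; return
    fi
    echo never
}

compress_decision no no yes no     # chattr +c only -> prints: if possible
compress_decision no yes no yes    # NOCOMPRESS beats compress -> prints: never
```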
How does compression interact with direct IO or COW?
Compression does not work with direct IO (DIO), does work with COW, and does not work for NOCOW files. If a compressed file is opened for DIO, it falls back to buffered IO.
Are there speed penalties when doing random access to a compressed file?
Yes. Compression processes ranges of a file up to 128 KiB in size and compresses each 4 KiB (or page-sized) block separately. Accessing a byte in the middle of such a 128 KiB range requires decompressing the whole range. This is not optimal and is subject to optimizations and further development.
What happens to incompressible files?
There is a simple decision logic: if the first portion of data being compressed does not come out smaller than the original, compression of the file is disabled -- unless the filesystem is mounted with compress-force, in which case the data is always compressed regardless of its compressibility. This is not optimal and is subject to optimizations and further development.
New and unanswered questions
Add your question here. It'll be moved to previous section when answered.