Gotchas


This page lists problems one might face when trying btrfs. Some of these are not really bugs, but rather inconveniences about things not yet implemented, or design decisions that are not yet documented.

Please add new items below, and don't forget to note which version you observed the problem on.

Issues

  • Files with a lot of random writes can become heavily fragmented (10000+ extents), causing thrashing on HDDs and excessive multi-second spikes of CPU load on systems with an SSD or a large amount of RAM.
    • On servers and workstations this affects databases and virtual machine images.
      • The nodatacow mount option may be of use here, with associated gotchas.
    • On desktops this primarily affects application databases (including Firefox and Chromium profiles, GNOME Zeitgeist, Ubuntu Desktop Couch, Banshee, and Evolution's datastore.)
      • Workarounds include manually defragmenting your home directory using btrfs fi defragment. Auto-defragment (mount option autodefrag) should solve this problem in 3.0.
    • Symptoms include btrfs-transacti and btrfs-endio-wri taking up a lot of CPU time (in spikes, possibly triggered by syncs). You can use filefrag to locate heavily fragmented files.
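    A minimal check-and-fix sequence might look like the following sketch (the Firefox profile path and the /home mountpoint are only illustrations; adjust to your setup):
      filefrag ~/.mozilla/firefox/*/places.sqlite                      # report the number of extents per file
      btrfs filesystem defragment ~/.mozilla/firefox/*/places.sqlite   # defragment the worst offenders in place
      mount -o remount,autodefrag /home                                # 3.0+: enable automatic defragmentation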
  • btrfs defragmenting
    • does not seem to respond to Ctrl-C (kernel 2.6.35, btrfs-progs git version v0.19-16-g075587c-dirty).
    • defragments COW data once for each copy, so instead of one shared defragmented data stream it creates multiple separate defragmented data streams, wasting space
  • "mount /dev/DEVICE" fails if this this device had a another name the previous time it was mounted (discovered on 2.6.32)
  • mount -o nodatacow also disables compression
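    For illustration (device and mountpoint hypothetical), a compress option passed together with nodatacow will not take effect:
      mount -o nodatacow,compress /dev/sdb1 /mnt    # data written here is neither COWed nor compressed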
  • Using mkfs.btrfs -l and -n with sizes other than 4096 is not supported (supported as of kernel 3.4; note that -n is an alias for -l)
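    On kernel 3.4 and later, a larger leaf/node size can be requested at mkfs time; a sketch with a hypothetical device:
      mkfs.btrfs -l 16384 -n 16384 /dev/sdb1    # 16 KiB leaves/nodes instead of the default 4096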
  • If a filesystem consisting of several drives is used as the root filesystem, it may be necessary to modprobe (and then rmmod) scsi-wait-scan to work around a race condition.
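    The workaround is roughly the following (typically run from the initramfs before mounting the root filesystem):
      modprobe scsi-wait-scan    # block until asynchronous SCSI device scanning has finished
      rmmod scsi-wait-scan
      btrfs device scan          # make sure all member devices are registered before mounting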
  • btrfs volumes on top of dm-crypt block devices (and possibly LVM) required write-caching to be turned off on the underlying HDD. Failing to do so, in the event of a power failure, may have resulted in corruption not yet handled by btrfs code. As of 3.2 kernels and later, btrfs on top of dm-crypt is now deemed safe.
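    On pre-3.2 kernels the drive write cache could be disabled with hdparm (the device name is hypothetical):
      hdparm -W 0 /dev/sda    # turn off the drive's write-caching feature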
  • Certain corruption will cause btrfs to oops (see above). (2.6.33)
  • Failed checksums will not in any way initiate a repair. Aside from copying and deleting the offending file, a rebalance is effectively the only operation that will serve as a repair (2.6.33)
    • Scrub should deal with this, from 2.6.39.
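    A scrub of a mounted filesystem can be started and monitored like this (the mountpoint is hypothetical):
      btrfs scrub start /mnt     # read all data and metadata, repairing from a good copy where redundancy allows
      btrfs scrub status /mnt    # show progress and the number of corrected errors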
  • A rebalance is expensive, as it reads and rewrites every block whether or not it needs to do so to become balanced (2.6.33)
    • Because of the above inefficiency, a rebalance will also serve to re-replicate raid1 data after losing a device (2.6.33) -- this is by design
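    A full rebalance of a mounted filesystem (the mountpoint is hypothetical):
      btrfs filesystem balance /mnt    # reads and rewrites every block, re-replicating raid1 data in the process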
  • on a -d raid1 volume, df will show total raw space, space used by data (not counting duplication), and the difference between those two numbers. This can be surprising since you'll run out of space before Use% reaches 50%. (up to and including 2.6.33 -- reporting changed in 2.6.34)
    • In 2.6.34, df will show total raw space, space used by data (factoring in duplication for raid1), and raw free space, taking metadata and data into consideration. You can still hit an "out of space" condition well before free space reaches 0, especially if, under raid1, your allocated space is unbalanced, and the system finds it impossible to find space for your new write on two separate disks.
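    Comparing the two views (the mountpoint is hypothetical):
      df -h /mnt                  # raw space, interpreted as described above (differs between 2.6.33 and 2.6.34)
      btrfs filesystem df /mnt    # per-type breakdown: Data, Metadata and System totals and usage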
  • On a multi-device btrfs filesystem, mistakenly re-adding a block device that is already part of the btrfs filesystem with btrfs device add results in an error and brings btrfs into an inconsistent state. In striping mode, this causes data loss and a kernel oops. The btrfs userland tools need to do more checking to prevent these easy mistakes. (2.6.35, btrfs v0.19)
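    Checking which devices already belong to a filesystem before adding one (device name and mountpoint are hypothetical):
      btrfs filesystem show             # list every btrfs filesystem and its member devices
      btrfs device add /dev/sdc /mnt    # only add a device that is not already listed above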