From btrfs Wiki
Revision as of 16:39, 31 March 2019 by HugoMills (Talk | contribs)



The parity RAID code has a specific issue with regard to data integrity: see "write hole", below. It should not be used for metadata. For data, it should be safe as long as a scrub is run immediately after any unclean shutdown.

As of kernel 3.19, the recovery and rebuild code has been integrated. The one missing piece, from a reliability point of view, is that the code is still vulnerable to the parity RAID "write hole", where a partial write caused by a power failure leaves the parity data inconsistent with the data it protects.

  • Parity may be inconsistent after a crash (the "write hole"). The problem arises when a disk failure follows an unclean shutdown. These are *two* distinct failures, and together they break the btrfs RAID5 redundancy. If you run a scrub after an unclean shutdown (with no disk failure in between), data that matches its checksum can still be read, while data that does not match its checksum is lost for good.
  • Parity data is not checksummed. Checksumming for parity is not necessary. See Talk:Status
  • No support for discard? (possibly -- needs confirmation with cmason)
  • The algorithm uses as many devices as are available: there is no support for a fixed-width stripe (see note, below)

The first two of these problems mean that the parity RAID code is not suitable for any system which might encounter unplanned shutdowns (power failure, kernel lock-up), and it should not be considered production-ready.
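
The write hole described above can be illustrated with a small toy model. The following Python sketch is not btrfs code: the block contents, the variable names and the use of zlib.crc32 as a stand-in for the data checksum are all assumptions made for illustration. It shows why an unclean shutdown followed by a disk failure rebuilds the wrong data, and why a scrub between the two failures avoids that.

```python
import zlib
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Bytewise XOR of equally sized blocks (single parity, RAID5-style)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# A consistent stripe: two data blocks and their parity.
d0, d1 = b"AAAA", b"BBBB"
csum_d1 = zlib.crc32(d1)          # stand-in for the stored data checksum
parity = xor_blocks(d0, d1)

# First failure (unclean shutdown): the new d0 reached the disk,
# but the matching parity update did not.
d0_new = b"CCCC"
stale_parity = parity

# Second failure: the device holding d1 is lost, so d1 has to be
# rebuilt from the surviving data block and the stale parity.
rebuilt_d1 = xor_blocks(d0_new, stale_parity)
rebuild_is_wrong = rebuilt_d1 != d1
detected_by_checksum = zlib.crc32(rebuilt_d1) != csum_d1

# Alternative: a scrub runs before any disk fails. All data is still
# readable, so the parity can be recomputed and later rebuilds are correct.
scrubbed_parity = xor_blocks(d0_new, d1)
rebuilt_after_scrub = xor_blocks(d0_new, scrubbed_parity)
```

In this model the rebuild from stale parity returns garbage for d1; the data checksum detects the damage but cannot repair it, which matches the "lost for good" case in the list above.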

If you'd like to learn btrfs raid5/6 and rebuilds by example (based on kernel 3.14), you can look at Marc MERLIN's page about btrfs raid 5/6.

On 1 August 2017, an RFC patch to fix the write hole was posted to the mailing list: "Btrfs: Add journal for raid5/6 writes".


Using as many devices as are available means that there will be a performance issue for filesystems with large numbers of devices. It also means that filesystems with different-sized devices will end up with differing-width stripes as the filesystem fills up, and some space may be wasted when the smaller devices are full.

Both of these issues could be addressed by specifying a fixed-width stripe, always running over exactly the same number of devices. This capability is not yet implemented, though.
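
The effect of stripe width on mixed-size devices can be explored with a toy allocator. This Python sketch is a simplified model, not the btrfs chunk allocator: the function name simulate_allocation, the one-unit-per-device allocation round and the fixed_width parameter are hypothetical, and the fixed-width mode only models the unimplemented capability mentioned above.

```python
from collections import Counter


def simulate_allocation(free, parity=1, fixed_width=None):
    """Toy chunk allocator: each round takes one unit from every chosen device.

    free        -- free space per device, in arbitrary allocation units
    parity      -- parity strips per stripe (1 for RAID5, 2 for RAID6)
    fixed_width -- hypothetical fixed stripe width; None means "use every
                   device that still has free space"

    Returns (Counter of stripe widths, usable data units, unallocatable units).
    """
    free = list(free)
    min_width = parity + 1        # at least one data strip per stripe
    widths = Counter()
    usable = 0
    while True:
        # Devices that still have room, the ones with most free space first.
        candidates = sorted(
            (i for i, f in enumerate(free) if f > 0),
            key=lambda i: free[i],
            reverse=True,
        )
        width = len(candidates) if fixed_width is None else fixed_width
        if width < min_width or len(candidates) < width:
            break
        for i in candidates[:width]:
            free[i] -= 1
        widths[width] += 1
        usable += width - parity
    return widths, usable, sum(free)


# Variable width: stripes narrow as the smaller devices fill up.
variable = simulate_allocation([4, 4, 2, 2])

# One large device with two small ones: space is left over on the large device.
wasted = simulate_allocation([6, 2, 2])

# Hypothetical fixed width of 2 on the same devices.
fixed = simulate_allocation([6, 2, 2], fixed_width=2)
```

Under these assumptions, devices of sizes 4, 4, 2 and 2 produce two 4-wide stripes followed by two 2-wide stripes. Sizes 6, 2 and 2 leave 4 units unallocatable with variable width, and 2 units with a fixed width of 2, where every stripe has the same width.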
