Status

From btrfs Wiki

''Revision as of 15:05, 7 June 2018''

{{PageProtected|edits must be approved, this page reflects status of the whole project}}
= Overview =

For a list of features by their introduction, please see the table [[Changelog#By_feature]].

The table below aims to serve as an overview of the stability status of the features BTRFS supports. While a feature may be functionally safe and reliable, that does not necessarily mean it is useful, for example in meeting your performance expectations for your specific workload. Combinations of features can vary in performance; the table does not cover all possibilities.

'''The table is based on the latest released Linux kernel: 4.17'''

The columns for each feature reflect the status of the implementation in the following ways:

''Stability'' - completeness of the implementation, usecase coverage<br>
''Status since'' - kernel version when the status was last changed<br>
''Performance'' - how much it could be improved until the inherent limits are hit<br>
''Notes'' - short description of known issues, or other information related to the status

''Legend:''
* '''OK''': should be safe to use, no known major deficiencies
* '''mostly OK''': safe for general use, there are some known problems that do not affect the majority of users
* '''Unstable''': do not use for other than testing purposes; known severe problems, missing implementation of some core parts
  
 
{| class="wikitable" border=1
|-
! Feature !! Stability !! Status since !! Performance !! Notes
|-
| colspan="5" | '''Performance'''
|-
| Trim (aka. discard)
| style="background: lightgreen;" | OK
| tbd
| style="background: lightgreen;" | OK
| ''fstrim'' and mounted with ''-o discard'' (has performance implications)
|-
| Autodefrag
| style="background: lightgreen;" | OK
| tbd
| style="background: lightgreen;" | OK
|-
| [[#Defrag|Defrag]]
| style="background: orange;" | mostly OK
| tbd
| style="background: lightgreen;" | OK
| extents get unshared ''[[Status#Defrag|(see below)]]''
|-
| colspan="5" | '''Compression, deduplication'''
|-
| [[Compression]]
| style="background: lightgreen;" | OK
| 4.14
| style="background: lightgreen;" | OK
|-
| Out-of-band dedupe
| style="background: lightgreen;" | OK
| tbd
| style="background: orange;" | mostly OK
| (reflink), heavily referenced extents have a noticeable performance hit ''[[Status#Out_of_band_dedupe|(see below)]]''
|-
| File range cloning
| style="background: lightgreen;" | OK
| tbd
| style="background: orange;" | mostly OK
| (reflink), heavily referenced extents have a noticeable performance hit ''[[Status#File_range_cloning|(see below)]]''
|-
| colspan="5" | '''Reliability'''
|-
| Auto-repair
| style="background: lightgreen;" | OK
| tbd
| style="background: lightgreen;" | OK
| automatically repairs from a correct spare copy if possible (dup, raid1, raid10)
|-
| [[Manpage/btrfs-scrub|Scrub]]
| style="background: lightgreen;" | OK
| tbd
| style="background: lightgreen;" | OK
|-
| Scrub + RAID56
| style="background: orange;" | mostly OK
| tbd
| style="background: orange;" | mostly OK
|-
| nodatacow
| style="background: lightgreen;" | OK
| tbd
| style="background: lightgreen;" | OK
| also see [[Manpage/btrfs(5)]]
|-
| Device replace
| style="background: orange;" | mostly OK
| tbd
| style="background: orange;" | mostly OK
| [[Status#Device_replace|see below]]
|-
| Degraded mount
| style="background: orange;" | mostly OK
| tbd
| n/a
| applies to raid levels with redundancy: always needs at least two available devices; can get stuck in irreversible read-only mode if only one device is present [https://www.spinics.net/lists/linux-btrfs/msg63370.html] [https://www.spinics.net/lists/linux-btrfs/msg63435.html]
|-
| colspan="5" | [[Manpage/mkfs.btrfs#PROFILES|'''Block group profile''']]
|-
| Single (block group profile)
| style="background: lightgreen;" | OK
| tbd
| style="background: lightgreen;" | OK
|-
| DUP (block group profile)
| style="background: lightgreen;" | OK
| tbd
| style="background: lightgreen;" | OK
|-
| RAID0
| style="background: lightgreen;" | OK
| tbd
| style="background: lightgreen;" | OK
|-
| RAID1
| style="background: lightgreen;" | OK
| tbd
| style="background: orange;" | mostly OK
| reading from mirrors in parallel can be optimized further ''[[Status#RAID1,_RAID10|(see below)]]''
|-
| RAID10
| style="background: lightgreen;" | OK
| tbd
| style="background: orange;" | mostly OK
| reading from mirrors in parallel can be optimized further ''[[Status#RAID1,_RAID10|(see below)]]''
|-
| [[RAID56]]
| style="background: FireBrick;" | Unstable
| tbd
| n/a
| write hole still exists ''[[Status#RAID56|(see below)]]''
|-
| Mixed block groups
| style="background: lightgreen;" | OK
| tbd
| style="background: lightgreen;" | OK
| see documentation
|-
| colspan="5" | '''Administration'''
|-
| Filesystem resize
| style="background: lightgreen;" | OK
| tbd
| style="background: lightgreen;" | OK
| shrink, grow
|-
| Balance
| style="background: lightgreen;" | OK
| tbd
| style="background: lightgreen;" | OK
| balance + qgroups can be slow when there are many snapshots
|-
| Offline UUID change
| style="background: lightgreen;" | OK
| tbd
| style="background: lightgreen;" | OK
|-
| [[SysadminGuide#Subvolumes|Subvolumes, snapshots]]
| style="background: lightgreen;" | OK
| tbd
| style="background: lightgreen;" | OK
|-
| [[Manpage/btrfs-send|Send]]
| style="background: lightgreen;" | OK
| tbd
| style="background: lightgreen;" | OK
|-
| [[Manpage/btrfs-receive|Receive]]
| style="background: lightgreen;" | OK
| tbd
| style="background: lightgreen;" | OK
|-
| Seeding
| style="background: lightgreen;" | OK
| tbd
| style="background: lightgreen;" | OK
| needs to be better documented
|-
| [[Quota_support|Quotas, qgroups]]
| style="background: orange;" | mostly OK
| tbd
| style="background: orange;" | mostly OK
| qgroups with many snapshots slow down balance
|-
| colspan="5" | '''Misc'''
|-
| [[#Free_space_tree|Free space tree]]
| style="background: lightgreen;" | OK
| 4.9
| style="background: lightgreen;" | OK
| ''[[Status#Free_space_tree|(see below)]]''
|-
| [[Manpage/mkfs.btrfs#FILESYSTEM_FEATURES|no-holes]]
| style="background: lightgreen;" | OK
| tbd
| style="background: lightgreen;" | OK
| see documentation for compatibility
|-
| [[Manpage/mkfs.btrfs#FILESYSTEM_FEATURES|skinny-metadata]]
| style="background: lightgreen;" | OK
| tbd
| style="background: lightgreen;" | OK
| see documentation for compatibility
|-
| [[Manpage/mkfs.btrfs#FILESYSTEM_FEATURES|extended-refs]]
| style="background: lightgreen;" | OK
| tbd
| style="background: lightgreen;" | OK
| see documentation for compatibility
|}
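As a concrete example of the ''Trim'' row above, there are two common ways to apply discard; the device and mount point names below are examples only, and the commands require root:

```shell
# One-shot trim of the free space on a mounted btrfs filesystem:
fstrim -v /mnt/data

# Alternatively, enable continuous discard at mount time
# (has performance implications, as noted in the table):
mount -o discard /dev/sdb1 /mnt/data
```

Running ''fstrim'' periodically (e.g. from a timer or cron job) is commonly preferred over ''-o discard'' to avoid the runtime overhead of discarding on every extent free.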
  
'''Note to editors:'''

This page reflects status of the whole project and edits need to be approved by one of the maintainers ([[User_talk:kdave|kdave]]).

Suggest edits if:

* there's a known missing entry
* a particular feature combination has a different status and is worth mentioning separately
* you know of a bug that lowers the feature status
* a reference could be enhanced by an actual link to documentation (wiki, manual pages)
 
  
 
== Details that do not fit the table ==

=== Defrag ===

The data affected by the defragmentation process will be newly written and will consume new space; the links to the original extents will not be kept. See also [[Manpage/btrfs-filesystem]]. Though autodefrag affects newly written data, it can read a few adjacent blocks (up to 64k) and write the contiguous extent to a new location. The adjacent blocks will be unshared. This happens on a smaller scale than on-demand defrag and doesn't have the same impact.
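A minimal command sketch of the two defragmentation modes described above (the path and device names are examples):

```shell
# On-demand, recursive defragmentation; on snapshotted or reflinked
# files this rewrites the extents and therefore unshares them:
btrfs filesystem defragment -r -v /mnt/data

# Autodefrag is enabled per mount and acts on newly written data:
mount -o autodefrag /dev/sdb1 /mnt/data
```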
 
=== Free space tree ===

* btrfs-progs support is read-only, ie. fsck can check the filesystem but is not able to keep the FST consistent and thus cannot run in repair mode
* the free space tree can be cleared using 'btrfs check --clear-space-cache v2' and will be rebuilt at next mount

Compatibility and historical references:

* btrfs-progs versions before v4.7.3 might accidentally write to the filesystem, but since there's no way to invalidate the FST, this causes inconsistency and possible corruption (using a piece of space twice). ''If'' you have made changes (btrfstune, repair, ...) to an FST-enabled filesystem with btrfs-progs, then mount with ''clear_cache,space_cache=v2'' and hope the space written to was not reused yet (see [https://www.spinics.net/lists/linux-btrfs/msg59110.html Status of free-space-tree feature])
* (fixed in linux 4.9) runtime support: fine on little-endian machines (x86*), known to be broken on big-endian (sparc64), see [https://bugzilla.kernel.org/show_bug.cgi?id=152111 sparc64: btrfs module fails to load on big-endian machines]
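The clearing step mentioned above can be sketched as follows; the device and mount point names are examples, and the filesystem must be unmounted while ''btrfs check'' runs:

```shell
# Clear the free space tree; it will be rebuilt on the next mount:
btrfs check --clear-space-cache v2 /dev/sdb1

# Mount again with the v2 space cache (the free space tree) enabled:
mount -o space_cache=v2 /dev/sdb1 /mnt/data
```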
 
=== Out of band dedupe ===

File extents can be shared either due to snapshotting or reflink. As the number of owners of an extent grows, the time to process or modify the extent will also grow.

Deduplication increases the level of sharing and reduces data usage.
 
=== File range cloning ===

See the intro paragraph in Out of band dedupe. Range cloning increases extent sharing.
 
=== RAID1, RAID10 ===

The simple redundancy RAID levels utilize the mirrors in a way that does not achieve the maximum possible performance. The logic can be improved so that reads are spread over the mirrors evenly or based on device congestion.
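A toy model (plain shell arithmetic, not kernel code) of the limitation described above, assuming the mirror is picked from a per-process value — the implementation of this era reportedly uses the submitting process ID modulo the number of mirrors — so a single-threaded reader never touches the second mirror, while a round-robin policy would:

```shell
# Mirror selection by pid parity (a single process always gets the same mirror):
pick_mirror_pid() {  # $1 = pid of the reading process
  echo $(( $1 % 2 ))
}

# A round-robin policy (one possible improvement) alternates per read:
pick_mirror_rr() {   # $1 = read index
  echo $(( $1 % 2 ))
}

# Four reads from one process (pid 1234) all land on mirror 0:
for i in 0 1 2 3; do pick_mirror_pid 1234; done
# The same four reads under round-robin use both mirrors (0 1 0 1):
for i in 0 1 2 3; do pick_mirror_rr "$i"; done
```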
 
=== RAID56 ===

Some fixes went into 4.12, namely scrub and auto-repair fixes. The feature is marked as ''mostly OK'' for now.

Further fixes to raid56-related code are applied in each release. The ''write hole'' is the last missing part; preliminary patches have been posted but need to be reworked. The ''parity not checksummed'' note has been removed.
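The ''write hole'' can be illustrated with a toy XOR-parity model (plain shell arithmetic on small integers, not btrfs code): parity of a stripe is the XOR of its data blocks, so if a crash leaves new data on disk while the matching parity update is lost, reconstruction from the stale parity silently returns garbage.

```shell
# A two-block stripe and its parity:
d0=23; d1=42
parity=$(( d0 ^ d1 ))

# While the stripe is consistent, d1 can be rebuilt from d0 and parity:
[ $(( d0 ^ parity )) -eq "$d1" ] && echo "reconstruction ok"

# Crash scenario: d0 was rewritten on disk, the parity update was lost.
d0_new=99
# Rebuilding d1 from the new data and stale parity gives a wrong value:
[ $(( d0_new ^ parity )) -ne "$d1" ] && echo "stale parity: d1 rebuilt wrong"
```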
 
=== Device replace ===

Device ''replace'' and device ''delete'' insist on being able to read or reconstruct all data. If any read fails due to an IO error, the delete/replace operation is aborted and the administrator must remove or replace the damaged data before trying again.
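For reference, the replace operation is driven by the ''btrfs replace'' subcommand; the device and mount point names below are examples:

```shell
# Start replacing a present (possibly failing) device with a new one:
btrfs replace start /dev/sdb /dev/sdc /mnt/data

# The operation runs in the background; check its progress:
btrfs replace status /mnt/data
```

If the source device is failing, the ''-r'' option of ''replace start'' avoids reading from it wherever another good mirror holds the data.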
  
 
= Other =

== On-disk format ==

The filesystem disk format is stable. This means it is not expected to change unless there are very strong reasons to do so. If there is a format change, filesystems which implement the previous disk format will continue to be mountable and usable by newer kernels.

The core of the on-disk format that comprises the building blocks of the filesystem:

* layout of the main data structures, eg. superblock, b-tree nodes, b-tree keys, block headers
* the COW mechanism, based on the original design of Ohad Rodeh's paper "Shadowing and clones"

Newly introduced features build on top of the above and may add specific structures. If backward compatibility cannot be maintained, a bit in the filesystem superblock denotes that, along with the level of incompatibility (full, or read-only mount possible).
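The compatibility bits mentioned above can be inspected with btrfs-progs; the device name is an example, and the exact field names in the output may vary between progs versions:

```shell
# Dump the superblock and show the feature flag fields:
btrfs inspect-internal dump-super /dev/sdb1 | grep -i flags
```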
