|
|
Line 1: |
Line 1: |
− | =btrfs-balance(8) manual page=
| |
| {{GeneratedManpage | | {{GeneratedManpage |
| |name=btrfs-balance}} | | |name=btrfs-balance}} |
− |
| |
− | ==NAME==
| |
− | btrfs-balance - balance block groups on a btrfs filesystem
| |
− |
| |
− | ==SYNOPSIS==
| |
− |
| |
− | <p><b>btrfs balance</b> <em><subcommand></em> <em><args></em></p>
| |
− | ==DESCRIPTION==
| |
− |
| |
− | <p>The primary purpose of the balance feature is to spread block groups across
| |
− | all devices so they match constraints defined by the respective profiles. See
| |
− | [[Manpage/mkfs.btrfs|mkfs.btrfs(8)]] section <em>PROFILES</em> for more details.
| |
− | The scope of the balancing process can be further tuned by use of filters that
| |
− | can select the block groups to process. Balance works only on a mounted
| |
− | filesystem. Extent sharing is preserved and reflinks are not broken.
| |
− | Files are not defragmented nor recompressed, file extents are preserved
| |
− | but the physical location on devices will change.</p>
| |
− | <p>The balance operation is cancellable by the user. The on-disk state of the
| |
− | filesystem is always consistent so an unexpected interruption (eg. system crash,
| |
− | reboot) does not corrupt the filesystem. The progress of the balance operation
| |
− | is temporarily stored as an internal state and will be resumed upon mount,
| |
− | unless the mount option <em>skip_balance</em> is specified.</p>
| |
− | <blockquote><b>Warning:</b>
| |
− | running balance without filters will take a lot of time as it basically
| |
− | move data/metadata from the whol filesystem and needs to update all block pointers.</blockquote>
| |
− | <p>The filters can be used to perform following actions:</p>
| |
− | <ul>
| |
− | <li>
| |
− | <p>
| |
− | convert block group profiles (filter <em>convert</em>)
| |
− | </p>
| |
− | </li>
| |
− | <li>
| |
− | <p>
| |
− | make block group usage more compact (filter <em>usage</em>)
| |
− | </p>
| |
− | </li>
| |
− | <li>
| |
− | <p>
| |
− | perform actions only on a given device (filters <em>devid</em>, <em>drange</em>)
| |
− | </p>
| |
− | </li>
| |
− | </ul>
| |
− | <p>The filters can be applied to a combination of block group types (data,
| |
− | metadata, system). Note that changing only the <em>system</em> type needs the force
| |
− | option. Otherwise <em>system</em> gets automatically converted whenever <em>metadata</em>
| |
− | profile is converted.</p>
| |
− | <p>When metadata redundancy is reduced (eg. from RAID1 to single) the force option
| |
− | is also required and it is noted in system log.</p>
| |
− | <blockquote><b>Note:</b>
| |
− | the balance operation needs enough work space, ie. space that is
| |
− | completely unused in the filesystem, otherwise this may lead to ENOSPC reports.
| |
− | See the section <em>ENOSPC</em> for more details.</blockquote>
| |
− | ==COMPATIBILITY==
| |
− |
| |
− | <blockquote><b>Note:</b>
| |
− | The balance subcommand also exists under the <b>btrfs filesystem</b>
| |
− | namespace. This still works for backward compatibility but is deprecated and
| |
− | should not be used any more.</blockquote>
| |
− | <blockquote><b>Note:</b>
| |
− | A short syntax <b>btrfs balance <em><path></em></b> works due to backward compatibility
| |
− | but is deprecated and should not be used any more. Use <b>btrfs balance start</b>
| |
− | command instead.</blockquote>
| |
− | ==PERFORMANCE IMPLICATIONS==
| |
− |
| |
− | <p>Balancing operations are very IO intensive and can also be quite CPU intensive,
| |
− | impacting other ongoing filesystem operations. Typically large amounts of data
| |
− | are copied from one location to another, with corresponding metadata updates.</p>
| |
− | <p>Depending upon the block group layout, it can also be seek heavy. Performance
| |
− | on rotational devices is noticeably worse compared to SSDs or fast arrays.</p>
| |
− | ==SUBCOMMAND==
| |
− |
| |
− | <dl>
| |
− | <dt>
| |
− | <b>cancel</b> <em><path></em>
| |
− | <dd>
| |
− | <p>
| |
− | cancels a running or paused balance, the command will block and wait until the
| |
− | current blockgroup being processed completes
| |
− | </p>
| |
− | <p>Since kernel 5.7 the response time of the cancellation is significantly
| |
− | improved, on older kernels it might take a long time until currently
| |
− | processed chunk is completely finished.</p>
| |
− |
| |
− | <dt>
| |
− | <b>pause</b> <em><path></em>
| |
− | <dd>
| |
− | <p>
| |
− | pause running balance operation, this will store the state of the balance
| |
− | progress and used filters to the filesystem
| |
− | </p>
| |
− |
| |
− | <dt>
| |
− | <b>resume</b> <em><path></em>
| |
− | <dd>
| |
− | <p>
| |
− | resume interrupted balance, the balance status must be stored on the filesystem
| |
− | from previous run, eg. after it was paused or forcibly interrupted and mounted
| |
− | again with <em>skip_balance</em>
| |
− | </p>
| |
− |
| |
− | <dt>
| |
− | <b>start</b> [options] <em><path></em>
| |
− | <dd>
| |
− | <p>
| |
− | start the balance operation according to the specified filters, without any filters
| |
− | the data and metadata from the whole filesystem are moved. The process runs in
| |
− | the foreground.
| |
− | </p>
| |
− | <blockquote><b>Note:</b>
| |
− | the balance command without filters will basically move everything in the
| |
− | filesystem to a new physical location on devices (ie. it does not affect the
| |
− | logical properties of file extents like offsets within files and extent
| |
− | sharing). The run time is potentially very long, depending on the filesystem
| |
− | size. To prevent starting a full balance by accident, the user is warned and
| |
− | has a few seconds to cancel the operation before it starts. The warning and
| |
− | delay can be skipped with <em>--full-balance</em> option.</blockquote>
| |
− | <p>Please note that the filters must be written together with the <em>-d</em>, <em>-m</em> and
| |
− | <em>-s</em> options, because they’re optional and bare <em>-d</em> and <em>-m</em> also work and
| |
− | mean no filters.</p>
| |
− | <blockquote><b>Note:</b>
| |
− | when the target profile for conversion filter is <em>raid5</em> or <em>raid6</em>,
| |
− | there’s a safety timeout of 10 seconds to warn users about the status of the feature</blockquote>
| |
− | <p><tt>Options</tt></p>
| |
− | <dl>
| |
− | <dt>
| |
− | -d[<em><filters></em>]
| |
− | <dd>
| |
− | <p>
| |
− | act on data block groups, see <tt>FILTERS</tt> section for details about <em>filters</em>
| |
− | </p>
| |
− |
| |
− | <dt>
| |
− | -m[<em><filters></em>]
| |
− | <dd>
| |
− | <p>
| |
− | act on metadata chunks, see <tt>FILTERS</tt> section for details about <em>filters</em>
| |
− | </p>
| |
− |
| |
− | <dt>
| |
− | -s[<em><filters></em>]
| |
− | <dd>
| |
− | <p>
| |
− | act on system chunks (requires <em>-f</em>), see <tt>FILTERS</tt> section for details about <em>filters</em>.
| |
− | </p>
| |
− |
| |
− | <dt>
| |
− | -f
| |
− | <dd>
| |
− | <p>
| |
− | force a reduction of metadata integrity, eg. when going from <em>raid1</em> to
| |
− | <em>single</em>, or skip safety timeout when the target conversion profile is <em>raid5</em>
| |
− | or <em>raid6</em>
| |
− | </p>
| |
− |
| |
− | <dt>
| |
− | --background|--bg
| |
− | <dd>
| |
− | <p>
| |
− | run the balance operation asynchronously in the background, uses [http://man7.org/linux/man-pages/man2/fork.2.html fork(2)] to
| |
− | start the process that calls the kernel ioctl
| |
− | </p>
| |
− |
| |
− | <dt>
| |
− | --enqueue
| |
− | <dd>
| |
− | <p>
| |
− | wait if there’s another exclusive operation running, otherwise continue
| |
− | </p>
| |
− |
| |
− | <dt>
| |
− | -v
| |
− | <dd>
| |
− | <p>
| |
− | (deprecated) alias for global <em>-v</em> option
| |
− | </p>
| |
− |
| |
− | </dl>
| |
− |
| |
− | <dt>
| |
− | <b>status</b> [-v] <em><path></em>
| |
− | <dd>
| |
− | <p>
| |
− | Show status of running or paused balance.
| |
− | </p>
| |
− | <p><tt>Options</tt></p>
| |
− | <dl>
| |
− | <dt>
| |
− | -v
| |
− | <dd>
| |
− | <p>
| |
− | (deprecated) alias for global <em>-v</em> option
| |
− | </p>
| |
− |
| |
− | </dl>
| |
− |
| |
− | </dl>
| |
− | ==FILTERS==
| |
− |
| |
− | <p>From kernel 3.3 onwards, btrfs balance can limit its action to a subset of the
| |
− | whole filesystem, and can be used to change the replication configuration (e.g.
| |
− | moving data from single to RAID1). This functionality is accessed through the
| |
− | <em>-d</em>, <em>-m</em> or <em>-s</em> options to btrfs balance start, which filter on data,
| |
− | metadata and system blocks respectively.</p>
| |
− | <p>A filter has the following structure: <em>type</em>[=<em>params</em>][,<em>type</em>=…]</p>
| |
− | <p>The available types are:</p>
| |
− | <dl>
| |
− | <dt>
| |
− | <b>profiles=<em><profiles></em></b>
| |
− | <dd>
| |
− | <p>
| |
− | Balances only block groups with the given profiles. Parameters
| |
− | are a list of profile names separated by "<em>|</em>" (pipe).
| |
− | </p>
| |
− |
| |
− | <dt>
| |
− | <b>usage=<em><percent></em></b>
| |
− | <dt>
| |
− | <b>usage=<em><range></em></b>
| |
− | <dd>
| |
− | <p>
| |
− | Balances only block groups with usage under the given percentage. The
| |
− | value of 0 is allowed and will clean up completely unused block groups, this
| |
− | should not require any new work space allocated. You may want to use <em>usage=0</em>
| |
− | in case balance is returning ENOSPC and your filesystem is not too full.
| |
− | </p>
| |
− | <p>The argument may be a single value or a range. The single value <em>N</em> means <em>at
| |
− | most N percent used</em>, equivalent to <em>..N</em> range syntax. Kernels prior to 4.4
| |
− | accept only the single value format.
| |
− | The minimum range boundary is inclusive, maximum is exclusive.</p>
| |
− |
| |
− | <dt>
| |
− | <b>devid=<em><id></em></b>
| |
− | <dd>
| |
− | <p>
| |
− | Balances only block groups which have at least one chunk on the given
| |
− | device. To list devices with ids use <b>btrfs filesystem show</b>.
| |
− | </p>
| |
− |
| |
− | <dt>
| |
− | <b>drange=<em><range></em></b>
| |
− | <dd>
| |
− | <p>
| |
− | Balance only block groups which overlap with the given byte range on any
| |
− | device. Use in conjunction with <em>devid</em> to filter on a specific device. The
| |
− | parameter is a range specified as <em>start..end</em>.
| |
− | </p>
| |
− |
| |
− | <dt>
| |
− | <b>vrange=<em><range></em></b>
| |
− | <dd>
| |
− | <p>
| |
− | Balance only block groups which overlap with the given byte range in the
| |
− | filesystem’s internal virtual address space. This is the address space that
| |
− | most reports from btrfs in the kernel log use. The parameter is a range
| |
− | specified as <em>start..end</em>.
| |
− | </p>
| |
− |
| |
− | <dt>
| |
− | <b>convert=<em><profile></em></b>
| |
− | <dd>
| |
− | <p>
| |
− | Convert each selected block group to the given profile name identified by
| |
− | parameters.
| |
− | </p>
| |
− | <blockquote><b>Note:</b>
| |
− | starting with kernel 4.5, the <em>data</em> chunks can be converted to/from the
| |
− | <em>DUP</em> profile on a single device.</blockquote>
| |
− | <blockquote><b>Note:</b>
| |
− | starting with kernel 4.6, all profiles can be converted to/from <em>DUP</em> on
| |
− | multi-device filesystems.</blockquote>
| |
− |
| |
− | <dt>
| |
− | <b>limit=<em><number></em></b>
| |
− | <dt>
| |
− | <b>limit=<em><range></em></b>
| |
− | <dd>
| |
− | <p>
| |
− | Process only given number of chunks, after all filters are applied. This can be
| |
− | used to specifically target a chunk in connection with other filters (<em>drange</em>,
| |
− | <em>vrange</em>) or just simply limit the amount of work done by a single balance run.
| |
− | </p>
| |
− | <p>The argument may be a single value or a range. The single value <em>N</em> means <em>at
| |
− | most N chunks</em>, equivalent to <em>..N</em> range syntax. Kernels prior to 4.4 accept
| |
− | only the single value format. The range minimum and maximum are inclusive.</p>
| |
− |
| |
− | <dt>
| |
− | <b>stripes=<em><range></em></b>
| |
− | <dd>
| |
− | <p>
| |
− | Balance only block groups which have the given number of stripes. The parameter
| |
− | is a range specified as <em>start..end</em>. Makes sense for block group profiles that
| |
− | utilize striping, ie. RAID0/10/5/6. The range minimum and maximum are
| |
− | inclusive.
| |
− | </p>
| |
− |
| |
− | <dt>
| |
− | <b>soft</b>
| |
− | <dd>
| |
− | <p>
| |
− | Takes no parameters. Only has meaning when converting between profiles.
| |
− | When doing convert from one profile to another and soft mode is on,
| |
− | chunks that already have the target profile are left untouched.
| |
− | This is useful e.g. when half of the filesystem was converted earlier but got
| |
− | cancelled.
| |
− | </p>
| |
− | <p>The soft mode switch is (like every other filter) per-type.
| |
− | For example, this means that we can convert metadata chunks the "hard" way
| |
− | while converting data chunks selectively with soft switch.</p>
| |
− |
| |
− | </dl>
| |
− | <p>Profile names, used in <em>profiles</em> and <em>convert</em> are one of: <em>raid0</em>, <em>raid1</em>,
| |
− | <em>raid1c3</em>, <em>raid1c4</em>, <em>raid10</em>, <em>raid5</em>, <em>raid6</em>, <em>dup</em>, <em>single</em>. The mixed
| |
− | data/metadata profiles can be converted in the same way, but it’s conversion
| |
− | between mixed and non-mixed is not implemented. For the constraints of the
| |
− | profiles please refer to [[Manpage/mkfs.btrfs|mkfs.btrfs(8)]], section <em>PROFILES</em>.</p>
| |
− | ==ENOSPC==
| |
− |
| |
− | <p>The way balance operates, it usually needs to temporarily create a new block
| |
− | group and move the old data there, before the old block group can be removed.
| |
− | For that it needs the work space, otherwise it fails for ENOSPC reasons.
| |
− | This is not the same ENOSPC as if the free space is exhausted. This refers to
| |
− | the space on the level of block groups, which are bigger parts of the filesystem
| |
− | that contain many file extents.</p>
| |
− | <p>The free work space can be calculated from the output of the <b>btrfs filesystem show</b>
| |
− | command:</p>
| |
− | <pre> Label: 'BTRFS' uuid: 8a9d72cd-ead3-469d-b371-9c7203276265
| |
− | Total devices 2 FS bytes used 77.03GiB
| |
− | devid 1 size 53.90GiB used 51.90GiB path /dev/sdc2
| |
− | devid 2 size 53.90GiB used 51.90GiB path /dev/sde1</pre>
| |
− | <p><em>size</em> - <em>used</em> = <em>free work space</em><br/>
| |
− | <em>53.90GiB</em> - <em>51.90GiB</em> = <em>2.00GiB</em></p>
| |
− | <p>An example of a filter that does not require workspace is <em>usage=0</em>. This will
| |
− | scan through all unused block groups of a given type and will reclaim the
| |
− | space. After that it might be possible to run other filters.</p>
| |
− | <p><b>CONVERSIONS ON MULTIPLE DEVICES</b></p>
| |
− | <p>Conversion to profiles based on striping (RAID0, RAID5/6) require the work
| |
− | space on each device. An interrupted balance may leave partially filled block
| |
− | groups that consume the work space.</p>
| |
− | ==EXAMPLES==
| |
− |
| |
− | <p>A more comprehensive example when going from one to multiple devices, and back,
| |
− | can be found in section <em>TYPICAL USECASES</em> of [[Manpage/btrfs-device|btrfs-device(8)]].</p>
| |
− | ===MAKING BLOCK GROUP LAYOUT MORE COMPACT===
| |
− |
| |
− | <p>The layout of block groups is not normally visible; most tools report only
| |
− | summarized numbers of free or used space, but there are still some hints
| |
− | provided.</p>
| |
− | <p>Let’s use the following real life example and start with the output:</p>
| |
− | <pre>$ btrfs filesystem df /path
| |
− | Data, single: total=75.81GiB, used=64.44GiB
| |
− | System, RAID1: total=32.00MiB, used=20.00KiB
| |
− | Metadata, RAID1: total=15.87GiB, used=8.84GiB
| |
− | GlobalReserve, single: total=512.00MiB, used=0.00B</pre>
| |
− | <p>Roughly calculating for data, <em>75G - 64G = 11G</em>, the used/total ratio is
| |
− | about <em>85%</em>. How can we can interpret that:</p>
| |
− | <ul>
| |
− | <li>
| |
− | <p>
| |
− | chunks are filled by 85% on average, ie. the <em>usage</em> filter with anything
| |
− | smaller than 85 will likely not affect anything
| |
− | </p>
| |
− | </li>
| |
− | <li>
| |
− | <p>
| |
− | in a more realistic scenario, the space is distributed unevenly, we can
| |
− | assume there are completely used chunks and the remaining are partially filled
| |
− | </p>
| |
− | </li>
| |
− | </ul>
| |
− | <p>Compacting the layout could be used on both. In the former case it would spread
| |
− | data of a given chunk to the others and removing it. Here we can estimate that
| |
− | roughly 850 MiB of data have to be moved (85% of a 1 GiB chunk).</p>
| |
− | <p>In the latter case, targeting the partially used chunks will have to move less
| |
− | data and thus will be faster. A typical filter command would look like:</p>
| |
− | <pre># btrfs balance start -dusage=50 /path
| |
− | Done, had to relocate 2 out of 97 chunks
| |
− |
| |
− | $ btrfs filesystem df /path
| |
− | Data, single: total=74.03GiB, used=64.43GiB
| |
− | System, RAID1: total=32.00MiB, used=20.00KiB
| |
− | Metadata, RAID1: total=15.87GiB, used=8.84GiB
| |
− | GlobalReserve, single: total=512.00MiB, used=0.00B</pre>
| |
− | <p>As you can see, the <em>total</em> amount of data is decreased by just 1 GiB, which is
| |
− | an expected result. Let’s see what will happen when we increase the estimated
| |
− | usage filter.</p>
| |
− | <pre># btrfs balance start -dusage=85 /path
| |
− | Done, had to relocate 13 out of 95 chunks
| |
− |
| |
− | $ btrfs filesystem df /path
| |
− | Data, single: total=68.03GiB, used=64.43GiB
| |
− | System, RAID1: total=32.00MiB, used=20.00KiB
| |
− | Metadata, RAID1: total=15.87GiB, used=8.85GiB
| |
− | GlobalReserve, single: total=512.00MiB, used=0.00B</pre>
| |
− | <p>Now the used/total ratio is about 94% and we moved about <em>74G - 68G = 6G</em> of
| |
− | data to the remaining blockgroups, ie. the 6GiB are now free of filesystem
| |
− | structures, and can be reused for new data or metadata block groups.</p>
| |
− | <p>We can do a similar exercise with the metadata block groups, but this should
| |
− | not typically be necessary, unless the used/total ratio is really off. Here
| |
− | the ratio is roughly 50% but the difference as an absolute number is "a few
| |
− | gigabytes", which can be considered normal for a workload with snapshots or
| |
− | reflinks updated frequently.</p>
| |
− | <pre># btrfs balance start -musage=50 /path
| |
− | Done, had to relocate 4 out of 89 chunks
| |
− |
| |
− | $ btrfs filesystem df /path
| |
− | Data, single: total=68.03GiB, used=64.43GiB
| |
− | System, RAID1: total=32.00MiB, used=20.00KiB
| |
− | Metadata, RAID1: total=14.87GiB, used=8.85GiB
| |
− | GlobalReserve, single: total=512.00MiB, used=0.00B</pre>
| |
− | <p>Just 1 GiB decrease, which possibly means there are block groups with good
| |
− | utilization. Making the metadata layout more compact would in turn require
| |
− | updating more metadata structures, ie. lots of IO. As running out of metadata
| |
− | space is a more severe problem, it’s not necessary to keep the utilization
| |
− | ratio too high. For the purpose of this example, let’s see the effects of
| |
− | further compaction:</p>
| |
− | <pre># btrfs balance start -musage=70 /path
| |
− | Done, had to relocate 13 out of 88 chunks
| |
− |
| |
− | $ btrfs filesystem df .
| |
− | Data, single: total=68.03GiB, used=64.43GiB
| |
− | System, RAID1: total=32.00MiB, used=20.00KiB
| |
− | Metadata, RAID1: total=11.97GiB, used=8.83GiB
| |
− | GlobalReserve, single: total=512.00MiB, used=0.00B</pre>
| |
− | ===GETTING RID OF COMPLETELY UNUSED BLOCK GROUPS===
| |
− |
| |
− | <p>Normally the balance operation needs a work space, to temporarily move the
| |
− | data before the old block groups gets removed. If there’s no work space, it
| |
− | ends with <em>no space left</em>.</p>
| |
− | <p>There’s a special case when the block groups are completely unused, possibly
| |
− | left after removing lots of files or deleting snapshots. Removing empty block
| |
− | groups is automatic since 3.18. The same can be achieved manually with a
| |
− | notable exception that this operation does not require the work space. Thus it
| |
− | can be used to reclaim unused block groups to make it available.</p>
| |
− | <pre># btrfs balance start -dusage=0 /path</pre>
| |
− | <p>This should lead to decrease in the <em>total</em> numbers in the <b>btrfs filesystem df</b> output.</p>
| |
− | ==EXIT STATUS==
| |
− |
| |
− | <p>Unless indicated otherwise below, all <b>btrfs balance</b> subcommands
| |
− | return a zero exit status if they succeed, and non zero in case of
| |
− | failure.</p>
| |
− | <p>The <b>pause</b>, <b>cancel</b>, and <b>resume</b> subcommands exit with a status of
| |
− | <b>2</b> if they fail because a balance operation was not running.</p>
| |
− | <p>The <b>status</b> subcommand exits with a status of <b>0</b> if a balance
| |
− | operation is not running, <b>1</b> if the command-line usage is incorrect
| |
− | or a balance operation is still running, and <b>2</b> on other errors.</p>
| |
− | ==AVAILABILITY==
| |
− |
| |
− | <p><b>btrfs</b> is part of btrfs-progs.
| |
− | Please refer to the btrfs wiki http://btrfs.wiki.kernel.org for
| |
− | further details.</p>
| |
− | ==SEE ALSO==
| |
− |
| |
− | <p>[[Manpage/mkfs.btrfs|mkfs.btrfs(8)]],
| |
− | [[Manpage/btrfs-device|btrfs-device(8)]]</p>
| |
− | [[Category:Manpage]]
| |