Talk:Project ideas

From btrfs Wiki

Please post project ideas on the mailing list, so they can be discussed by a broader audience.


Encryption support

What I'm missing is any kind of encryption support. It would be nice to build encryption support into btrfs (while the on-disk format isn't finalized yet), so that one key can easily cover multiple-device (raidX) targets. For now I would have to set up LUKS below btrfs, which is very suboptimal for multiple-device (raid) configurations.

experimental raid5/6 support is available now, see FAQ#Can_I_use_RAID.5B56.5D_on_my_Btrfs_filesystem.3F -- André 07:55, 18 July 2008 (UTC)

RAID 5/6

It would be nice to see somewhat intelligent or flexible handling of devices of different sizes with RAID 5/6.

For example, if I had two 250GB drives and two 500GB drives and made a RAID5 of them, the two 250GB devices could be first "combined" to make one 500GB volume and then make the RAID5 on top of the three 500GB volumes. The resulting RAID5 volume would then have ~1000GB of usable space (with distributed parity).
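The arithmetic behind this example can be sketched as follows. This is only an illustration: it assumes the usual RAID5 rule that every member contributes as much as the smallest member and that one member's worth of space goes to parity. The function name `raid5_usable` is made up for this sketch.

```python
def raid5_usable(member_sizes_gb):
    """Usable capacity of a RAID5 array (assumed rule: each member
    contributes only as much as the smallest member, and one member's
    worth of space is consumed by distributed parity)."""
    if len(member_sizes_gb) < 3:
        raise ValueError("RAID5 needs at least three members")
    return (len(member_sizes_gb) - 1) * min(member_sizes_gb)


# RAID5 directly over all four drives: limited by the 250GB drives.
naive = raid5_usable([250, 250, 500, 500])

# The two 250GB drives combined into one 500GB member first.
combined = raid5_usable([250 + 250, 500, 500])

print(naive, combined)
```

Under these assumptions the combined layout yields the ~1000GB mentioned above, against 750GB for a plain four-member RAID5.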

In theory, I could make a similar setup by first combining the two 250GB drives with md RAID0, but this seems suboptimal. --Cg 13:40, 1 February 2009 (UTC)

Maintenance

If we are going to have a fs for massive allocation, administration and maintenance will be an issue. I would like to see an active manager system that can be clustered across PCs, ideally splitting the work into two dynamically changing, balanced workloads: guest time and host time.

I envision host time being used for automated fs maintenance tasks that use up to 90% of disk activity but leave a 10% window for incoming guest work. Guest time is for live file reads/writes from the user/guest system(s). As the guest workload increases, the host workload progressively shuts down, dropping as far as 0% and allowing guest time to rise to 90% or more on the fly. As guest load drops, the host load can rise back to full.

Defragmentation and filesystem integrity checking should be left to this manager, releasing the admin from this burden while still allowing the drives to be utilised at full speed when required.

Perhaps this clustered manager system could offer an aggression config option, allowing host disk access to be moderated from a lazy 10% to a rampant 90%, with aggression levels in between, if this is deemed more prudent for drive life expectancy. The admin can then alter this value during the day or night (should a change be required for some reason) and worry about little else.
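The host/guest balancing described above could be sketched roughly like this. It is a toy model, not an existing btrfs mechanism: `host_share`, `guest_load` and `aggression` are hypothetical names, and loads are expressed as fractions of total disk time.

```python
def host_share(guest_load, aggression=0.90):
    """Fraction of disk time granted to background (host) maintenance.

    guest_load: current guest demand as a fraction, 0.0 to 1.0
    aggression: upper bound on the host share, clamped to the
                10% (lazy) .. 90% (rampant) range from the proposal
    """
    if not 0.0 <= guest_load <= 1.0:
        raise ValueError("guest_load must be between 0.0 and 1.0")
    aggression = min(max(aggression, 0.10), 0.90)
    # Host work only uses what the guest leaves free, capped by aggression.
    return max(0.0, min(aggression, 1.0 - guest_load))


idle = host_share(0.0)                        # guest idle, host runs at its cap
busy = host_share(1.0)                        # guest saturates the disk
lazy = host_share(0.5, aggression=0.10)       # admin selected the lazy level
```

With the default cap, an idle guest leaves the host at 90%, and a fully loaded guest pushes it down to 0%.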

Anyway, such a manager may also allow improvements in drive speed by separating essential disk writes from less essential ones.

--Relic 17:37, 10 March 2009 (UTC)

RAID from Drives of Differing Capacities

It would be awesome to see intelligent handling of storage devices with various capacities, akin to Drobo's Beyondraid. The ability to expand a raid with newer, larger and cheaper drives as needed would be very useful for power users, small-to-medium business users and (with a friendly enough GUI) average users.

The Beyondraid system uses the total redundant space as a pool in which it stores data from multiple virtual volumes. These volumes have a predetermined size, usually 16TB, which is well beyond the physically installed capacity; users expand the physical capacity as needed for their data. Redundancy can be dynamically switched between 1 or 2 drives. The system is plug and play, self healing, data-aware, fully automated and near infinitely expandable.

Drobo's Beyondraid from [http://en.wikipedia.org/wiki/Non-standard_RAID_levels]

           Drives
 | 100 GB | 200 GB | 400 GB | 500 GB |

                            ----------
                            |   x    | unusable space (100 GB)
                            ----------
                   -------------------
                   |   A1   |   A1   | RAID 1 set (2× 100 GB)
                   -------------------
                   -------------------
                   |   B1   |   B1   | RAID 1 set (2× 100 GB)
                   -------------------
          ----------------------------
          |   C1   |   C2   |   Cp   | RAID 5 array (3× 100 GB)
          ----------------------------
 -------------------------------------
 |   D1   |   D2   |   D3   |   Dp   | RAID 5 array (4× 100 GB)
 -------------------------------------

1200 GB drive capacity
Beyondraid: 700 GB usable
RAID 5: 300 GB usable
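The totals above can be reproduced with a small sketch of the layering shown in the diagram. This only models the diagram (RAID 5 where three or more drives reach a layer, RAID 1 where two do, nothing where only one does). It is not Drobo's actual algorithm, and the function names are made up for illustration.

```python
def layered_usable(drive_sizes_gb):
    """Usable space when drives are sliced into horizontal layers,
    as in the diagram: each layer spans every drive tall enough to
    reach it."""
    sizes = sorted(drive_sizes_gb)
    usable = 0
    previous = 0
    for index, size in enumerate(sizes):
        height = size - previous        # thickness of this layer
        members = len(sizes) - index    # drives that reach this layer
        if members >= 3:
            usable += (members - 1) * height   # RAID 5: one member for parity
        elif members == 2:
            usable += height                   # RAID 1: mirrored pair
        # members == 1: no redundancy possible, space is left unused
        previous = size
    return usable


def plain_raid5_usable(drive_sizes_gb):
    """Conventional RAID 5: every drive is cut down to the smallest one."""
    return (len(drive_sizes_gb) - 1) * min(drive_sizes_gb)


drives = [100, 200, 400, 500]
print(layered_usable(drives), plain_raid5_usable(drives))
```

For the four drives in the diagram this gives 700 GB for the layered layout and 300 GB for plain RAID 5.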

Differentiators for btrfs:

Adding drives to a "dynamic raid (pool)" would be a mkfs option and could be integrated into a GUI application.

The "dynamic raid" system would be able to work with any block device and deal intelligently with partitions from the same drive, treating them as one large non-contiguous drive so that redundancy information is not kept on the same drive. This will allow users to use any capacity they have, no matter where it's located.

What Beyondraid doesn't allow is for users to use the entire first drive if it is not paired with a second drive. Allowing this would offer an easy path into redundancy and expansion. New installs create a dynamic raid on one drive (unless more are available). Fill up the first drive; add a second drive and get redundancy; add a third or more and get extra space. Perhaps the first drive of a "dynamic raid" could be created from an existing btrfs partition.

--Ddfitzy 11:33, 1 August 2010 (UTC)

Sharing

NFS sharing is OK, but there's no auto-discovery, unlike AoE, where any shared devices show up in the Nautilus device area.

It would be nice if btrfs allowed read-only sharing/mounting of the device over AoE (ATA over Ethernet), or even sharing of subvolumes/snapshots.

The main reason I'm looking for this is to share local repo mirrors read-only with anyone on my local network. Also, since AoE is not TCP/IP, it tends to get past VPN restrictions, so I can install software while connected to a VPN that blocks local access.

--MikeyCarter 17:43, 8 August 2011 (UTC)

Advanced Replication

Right now I have the Documents/Code I work on copied to my home computer, laptop and server. The reason is that when my laptop is offline I still want to be able to work on my files. The other case is that if, for some reason, I need to work on a different computer, I can turn one machine off and still access my files. The problem now is keeping three locations in sync.

It would be nice if btrfs had a way to link two or more copies of a subvolume in remote locations. This could also serve as a backup if we're keeping checksums, and it could be used to repair problems.

Which brings me to my second idea. What about a feature where, instead of mirroring the whole filesystem (RAID-1), you can mirror a single subvolume? That way, if a problem is detected, the mirror can be used. I have a lot of cases where I only want some subvolumes mirrored and not the entire FS (e.g. if I make a backup copy of a DVD, I don't need RAID-1 on it as I have the original; the same goes for things downloaded off the web).


--MikeyCarter 15:17, 15 August 2011 (UTC)

Conversion from other Filesystems

Complex conversion from md. Right now many have an ext4/mdraid1 disk setup. Using btrfs-convert to get to a btrfs-on-mdraid1 is "okay" but not ideal. The ideal conversion would replace the md with btrfs's builtin raid. Other raid-types could also benefit from the same treatment. Brendan M. Hide 18:14, 5 November 2012 (UTC)

Lots of I/O errors (mark drive as unreliable feature)

I think "a lot of I/O errors" should be defined more precisely, because there are cases that produce a lot of I/O errors without meaning that a large part of the drive is corrupted. Examples:


-intensive operations on the same damaged part of the disk

-use of flash memory over a long time:

   When a file is written, the sectors allocated to the file are written, as well as some sectors in the FS metadata.
   This means that for each random write access, metadata sectors are written again.
   Flash disks have a limited number of write operations. This leads to some metadata sectors becoming corrupted while most of the drive is OK.
   I often deal with this case on USB drives.


In the second case I think the drive should not be taken offline, because that would probably lead users to try to unmount it and then remount it.

But it would never mount again: the FS type could no longer be recognized. Or even worse (in the case where the damage is at the start of the partition, in the boot sector), the whole partition table could not be recognized.

It is important in this case to warn the user and let them copy their unsaved data. --Ytrezq 22:02, 1 March 2013 (UTC)

automatic online I/O error recovery

Some filesystems can try to extract the data from an allocated damaged sector, in addition to marking the sector as damaged.

I know this would probably mean changing certain things in Linux drive management:

 example: testdisk can make a dump of a damaged disk.
 When I use it on Linux, the log fills up with "I/O error on device", and it skips damaged sectors that could possibly have been rescued.
 When I use testdisk on Windows, it still skips some sectors, but some others are rescued (I modified some registry entries).

Once those issues are fixed, btrfs could go further:

When a damaged sector is found while the volume is mounted, the kernel (while still online) could try to extract the data present on the sector (if it is allocated and there is no other copy of it) to a valid place on the disk. --Ytrezq 22:03, 1 March 2013 (UTC)

BTRFS previous options support

I often use my filesystems with multiple operating system installations.

tune2fs, from previous filesystems, allowed mount options to be stored on the drive. But I think it could be done in another way.


It would be useful to have four mount options to handle this feature:

  • one that tells the filesystem to remember the option set for the next n mounts, where n is a number
  • another that records the option set for every mount
  • another that reverts a previously stored option set to the defaults (erases it) for n mounts
  • another that reverts the filesystem to the default options permanently. --Ytrezq 22:04, 1 March 2013 (UTC)

Advanced deduplication Support

Currently, it seems that data deduplication will 'only' compare files.

Although the reasons for not using block-based or extent-based deduplication are valid, I don't understand why a larger unit could not be used.

Why not use groups of extents/blocks with a size of a few MB?

This would make deduplication useful for non-backup usage, by identifying identical data in files of (completely) different types.

It would also allow deduplication within a single file.

The size used for deduplication could be adjusted according to parameters such as (examples):

  • the medium type (RAM,flash,hard_disk...)
  • the number of files used on the FS
  • the fragmentation level deduplication implies ...

This type of deduplication may co-exist with the current deduplication method.
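A rough sketch of what chunk-level duplicate detection might look like is given below. It is only an illustration under simplifying assumptions: chunks are fixed-size and aligned, files are plain in-memory byte strings, and `find_duplicate_chunks` is a made-up name rather than part of bedup or btrfs. A real implementation would also have to compare the bytes before sharing anything.

```python
import hashlib


def find_duplicate_chunks(files, chunk_size=1024 * 1024):
    """Map each chunk digest to the (file name, offset) locations where
    that chunk occurs, keeping only chunks seen more than once.

    files: dict of file name -> bytes (stand-in for real file contents)
    chunk_size: deduplication unit; "a few MB" in the proposal
    """
    seen = {}
    for name, data in files.items():
        for offset in range(0, len(data), chunk_size):
            chunk = data[offset:offset + chunk_size]
            if len(chunk) < chunk_size:
                continue  # skip the short tail of the file
            digest = hashlib.sha256(chunk).hexdigest()
            seen.setdefault(digest, []).append((name, offset))
    return {d: locs for d, locs in seen.items() if len(locs) > 1}


# Tiny demonstration with a 4-byte chunk size so the data stays readable.
files = {
    "image.bin": b"AAAABBBBAAAA",   # "AAAA" repeats inside the same file
    "notes.txt": b"CCCCBBBB",       # "BBBB" is shared with image.bin
}
duplicates = find_duplicate_chunks(files, chunk_size=4)
for locations in duplicates.values():
    print(locations)
```

The demonstration finds one chunk repeated inside a single file and one chunk shared by two files of different types, which are the two cases mentioned above.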


Otherwise, the current deduplication with bedup sounds like an improved kind of hard link or a cp --reflink copy. --Ytrezq 22:03, 1 March 2013 (UTC)

Adjust compression level

Eventually, add a parameter to the compress-force option that allows tuning the compression level (between best and fast, or 1 to 9 ...), or setting it as a percentage of processor time usage. --Ytrezq 22:05, 1 March 2013 (UTC)
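To illustrate the kind of fast-versus-best trade-off such a parameter would expose, here is a small sketch using the level argument of Python's zlib module. It says nothing about how btrfs itself would implement levels; the helper name `compressed_sizes` and the sample data are made up for the example.

```python
import zlib


def compressed_sizes(data, levels=(1, 6, 9)):
    """Compress the same data at several zlib levels (1 = fastest,
    9 = best compression) and return the resulting sizes in bytes."""
    sizes = {}
    for level in levels:
        compressed = zlib.compress(data, level)
        # Whatever the level, decompression must give back the input.
        assert zlib.decompress(compressed) == data
        sizes[level] = len(compressed)
    return sizes


sample = b"btrfs project ideas " * 2000
for level, size in compressed_sizes(sample).items():
    print(f"level {level}: {size} bytes (from {len(sample)})")
```

Higher levels generally spend more CPU time to save space, which is the knob the suggestion asks for.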

clear allocated space

Normal users sometimes make mistakes and delete the wrong file with no backup.

Wiping free space could prevent them from using recovery software.

I think this should be an option that is enabled by default only for typical industrial configurations. --Ytrezq 22:05, 1 March 2013 (UTC)

Automatic "supercache" for performance increase

Suggestion: btrfs should consider implementing an optional automatic speed-up cache "area", where the user can reserve a bit of space that would automatically be configured as a stripe (raid0) utilizing all the fastest disks available for a potential speedup.

The idea is that when the filesystem encounters often used data it transparently duplicates it on the raid0 (speedup) area. If a read should fail btrfs can always try the original location of the data.

Why?:

In a filesystem consisting of many old and new disks where some are (significantly) faster than others it makes sense to utilize the fastest ones that might possibly be idle as a potential source of increased performance.

raid0 (stripe) is normally the fastest configuration you can have, provided that it is not slowed down by bad/slow disks. Since btrfs knows if a stripe is broken, it can recover by retrieving the data from a (possibly redundant and) possibly slower location.

How?

- User reserves space for a "supercache"
- Btrfs transparently detects the fastest group/configuration of disks
- Btrfs duplicates often-read (hot) data to the "supercache", where data is always striped across disks.
- Btrfs retrieves data from its original location (and possibly redundant configuration) if the "supercache" fails, or if (enough of) the disks in the "supercache" are busy and the disks containing the original data are not (i.e. they are idle).
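The steps above could be modelled roughly as follows. This is a toy sketch of the read path only: dictionaries stand in for the disks, `SuperCache`, `hot_threshold` and `cache_ok` are hypothetical names, and nothing here reflects actual btrfs internals.

```python
class SuperCache:
    """Toy model: hot blocks are duplicated into a fast area, and reads
    fall back to the original location when the fast area is unusable."""

    def __init__(self, capacity, hot_threshold=3):
        self.capacity = capacity            # blocks the user reserved
        self.hot_threshold = hot_threshold  # reads before a block is "hot"
        self.read_counts = {}
        self.cache = {}                     # stands in for the raid0 area

    def read(self, key, backing, cache_ok=True):
        """Return (data, source). cache_ok=False simulates a failed or
        busy supercache, forcing a read from the original location."""
        if cache_ok and key in self.cache:
            return self.cache[key], "supercache"
        data = backing[key]                 # original (possibly redundant) copy
        self.read_counts[key] = self.read_counts.get(key, 0) + 1
        is_hot = self.read_counts[key] >= self.hot_threshold
        if is_hot and key not in self.cache and len(self.cache) < self.capacity:
            self.cache[key] = data          # duplicate, never move, the data
        return data, "original"


backing = {"block-1": b"often read", "block-2": b"rarely read"}
sc = SuperCache(capacity=1)
sources = [sc.read("block-1", backing)[1] for _ in range(4)]
print(sources)
```

Because hot data is only duplicated, losing the supercache never loses data: the original copy stays authoritative.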

Pros?

- Can possibly perform better in many situations compared to a standard setup.
- Another layer of places to retrieve the data from in case of a corruption.
- If exclusively configured for data or metadata it can (hopefully) increase performance for specific workloads

Cons?

- Adds complexity
- Might cause irregular performance sometimes
- Suggested by a user without deep knowledge of btrfs ;)
- Perhaps not significantly better than hot data relocation suggested for the VFS layer.

--Svein Engelsgjerd (talk) 19:21, 30 December 2013 (UTC)
