linux/fs/xfs
Linus Torvalds 35a891be96 xfs: reflink update for 4.9-rc1
< XFS has gained super CoW powers! >
  ----------------------------------
         \   ^__^
          \  (oo)\_______
             (__)\       )\/\
                 ||----w |
                 ||     ||
 
 Included in this update:
 - unshare range (FALLOC_FL_UNSHARE) support for fallocate
 - copy-on-write extent size hints (FS_XFLAG_COWEXTSIZE) for fsxattr interface
 - shared extent support for XFS
 - copy-on-write support for shared extents
 - copy_file_range support
 - clone_file_range support (implements reflink)
 - dedupe_file_range support
 - defrag support for reverse mapping enabled filesystems
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJX/hrZAAoJEK3oKUf0dfodpwcQAKkTerNPhhDcthqWUJ2+jC7w
 JIuhKUg2GYojJhIJ4+Ue1knmuBeIusda+PzGls+6gdy7GDGdux/esRIJSe1W7A5G
 RNeumiSKVX5iYsZNUEX35O2a/SwUM1Sm5mcIFs4CxUwIRwE/cayNby6vrlVExvz7
 Ns6YYOI2bldUHLsxedg8MLG0it1JGTADB9gwGgb98bxQ3bD/UBn3TF9xTlj+ZH22
 ebnWsogSJOnUigOOSGeaQsmy1pJAhRIhvt+f481KuZak1pdQcK2feL4RcKw0NpNt
 15LCYRqX6RexC684VYgJZxXB4EKyfS2Bma71q41A7dz1x36kw7+wG18xasBqU++p
 GZwwL6si02rIGPMz1oD8xxZ0F97ADCGRmkgUHsCJKbP5UmGiP08K6GEN3osr5hAN
 xAmn9AxcprXVnV3WmnFxpBeWY/qCEsvSQqJuKSThYqAilqUc8wN2u5g/eEpE6mmg
 KEEhzaq5P4ovS/HOIQJWdBu1j5E9Mg2o/ncy87Q6uE+9Fa5AAP6GBWOtGcMwdFSU
 adbN7dqjgoHMyNHFrmePqyJYtOZ2hZovDlVndxnYysl5ZBfiBEEDISmr+x6KcSlo
 3kyOltYQLjEVu1sLOT3COCddn0jt5Lr1QhGeVepnrMlU2E1h4461viCNMDinJRIp
 OYoMOS+J83G2FEFwgXYM
 =Sa+Y
 -----END PGP SIGNATURE-----

Merge tag 'xfs-reflink-for-linus-4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs

    < XFS has gained super CoW powers! >
     ----------------------------------
            \   ^__^
             \  (oo)\_______
                (__)\       )\/\
                    ||----w |
                    ||     ||

Pull XFS support for shared data extents from Dave Chinner:
 "This is the second part of the XFS updates for this merge cycle.  This
  pullreq contains the new shared data extents feature for XFS.

  Given the complexity and size of this change I am expecting - like the
  addition of reverse mapping last cycle - that there will be some
  follow-up bug fixes and cleanups around the -rc3 stage for issues that
  I'm sure will show up once the code hits a wider userbase.

  What it is:

  At the most basic level we are simply adding shared data extents to
  XFS - i.e. a single extent on disk can now have multiple owners. To do
  this we have to add new on-disk features to both track the shared
  extents and the number of times they've been shared. This is done by
  the new "refcount" btree that sits in every allocation group. When we
  share or unshare an extent, this tree gets updated.

  Along with this new tree, the reverse mapping tree needs to be updated
  to track each owner or a shared extent. This also needs to be updated
  ever share/unshare operation. These interactions at extent allocation
  and freeing time have complex ordering and recovery constraints, so
  there's a significant amount of new intent-based transaction code to
  ensure that operations are performed atomically from both the runtime
  and integrity/crash recovery perspectives.

  We also need to break sharing when writes hit a shared extent - this
  is where the new copy-on-write implementation comes in. We allocate
  new storage and copy the original data along with the overwrite data
  into the new location. We only do this for data as we don't share
  metadata at all - each inode has it's own metadata that tracks the
  shared data extents, the extents undergoing CoW and it's own private
  extents.

  Of course, being XFS, nothing is simple - we use delayed allocation
  for CoW similar to how we use it for normal writes. ENOSPC is a
  significant issue here - we build on the reservation code added in
  4.8-rc1 with the reverse mapping feature to ensure we don't get
  spurious ENOSPC issues part way through a CoW operation. These
  mechanisms also help minimise fragmentation due to repeated CoW
  operations. To further reduce fragmentation overhead, we've also
  introduced a CoW extent size hint, which indicates how large a region
  we should allocate when we execute a CoW operation.

  With all this functionality in place, we can hook up .copy_file_range,
  .clone_file_range and .dedupe_file_range and we gain all the
  capabilities of reflink and other vfs provided functionality that
  enable manipulation to shared extents. We also added a fallocate mode
  that explicitly unshares a range of a file, which we implemented as an
  explicit CoW of all the shared extents in a file.

  As such, it's a huge chunk of new functionality with new on-disk
  format features and internal infrastructure. It warns at mount time as
  an experimental feature and that it may eat data (as we do with all
  new on-disk features until they stabilise). We have not released
  userspace suport for it yet - userspace support currently requires
  download from Darrick's xfsprogs repo and build from source, so the
  access to this feature is really developer/tester only at this point.
  Initial userspace support will be released at the same time the kernel
  with this code in it is released.

  The new code causes 5-6 new failures with xfstests - these aren't
  serious functional failures but things the output of tests changing
  slightly due to perturbations in layouts, space usage, etc. OTOH,
  we've added 150+ new tests to xfstests that specifically exercise this
  new functionality so it's got far better test coverage than any
  functionality we've previously added to XFS.

  Darrick has done a pretty amazing job getting us to this stage, and
  special mention also needs to go to Christoph (review, testing,
  improvements and bug fixes) and Brian (caught several intricate bugs
  during review) for the effort they've also put in.

  Summary:

   - unshare range (FALLOC_FL_UNSHARE) support for fallocate

   - copy-on-write extent size hints (FS_XFLAG_COWEXTSIZE) for fsxattr
     interface

   - shared extent support for XFS

   - copy-on-write support for shared extents

   - copy_file_range support

   - clone_file_range support (implements reflink)

   - dedupe_file_range support

   - defrag support for reverse mapping enabled filesystems"

* tag 'xfs-reflink-for-linus-4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs: (71 commits)
  xfs: convert COW blocks to real blocks before unwritten extent conversion
  xfs: rework refcount cow recovery error handling
  xfs: clear reflink flag if setting realtime flag
  xfs: fix error initialization
  xfs: fix label inaccuracies
  xfs: remove isize check from unshare operation
  xfs: reduce stack usage of _reflink_clear_inode_flag
  xfs: check inode reflink flag before calling reflink functions
  xfs: implement swapext for rmap filesystems
  xfs: refactor swapext code
  xfs: various swapext cleanups
  xfs: recognize the reflink feature bit
  xfs: simulate per-AG reservations being critically low
  xfs: don't mix reflink and DAX mode for now
  xfs: check for invalid inode reflink flags
  xfs: set a default CoW extent size of 32 blocks
  xfs: convert unwritten status of reverse mappings for shared files
  xfs: use interval query for rmap alloc operations on shared files
  xfs: add shared rmap map/unmap/convert log item types
  xfs: increase log reservations for reflink
  ...
2016-10-13 20:28:22 -07:00
..
libxfs xfs: rework refcount cow recovery error handling 2016-10-10 17:23:07 +11:00
Kconfig xfs: implement iomap based buffered write path 2016-06-21 09:53:44 +10:00
kmem.c xfs: improve kmem_realloc 2016-04-06 09:47:01 +10:00
kmem.h xfs: improve kmem_realloc 2016-04-06 09:47:01 +10:00
Makefile xfs: introduce the CoW fork 2016-10-04 18:06:40 -07:00
mrlock.h
uuid.c
uuid.h
xfs_acl.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 20:16:43 -07:00
xfs_acl.h
xfs_aops.c xfs: convert COW blocks to real blocks before unwritten extent conversion 2016-10-11 09:03:19 +11:00
xfs_aops.h xfs: allocate delayed extents in CoW fork 2016-10-04 18:06:41 -07:00
xfs_attr_inactive.c xfs: make several functions static 2016-06-01 17:38:15 +10:00
xfs_attr_list.c xfs: make several functions static 2016-06-01 17:38:15 +10:00
xfs_attr.h xfs: remove put_value from attr ->put_listent context 2016-04-06 07:57:45 +10:00
xfs_bmap_item.c xfs: when replaying bmap operations, don't let unlinked inodes get reaped 2016-10-04 11:05:44 -07:00
xfs_bmap_item.h xfs: log bmap intent items 2016-10-04 11:05:44 -07:00
xfs_bmap_util.c xfs: implement swapext for rmap filesystems 2016-10-05 16:26:32 -07:00
xfs_bmap_util.h xfs: change xfs_bmap_{finish,cancel,init,free} -> xfs_defer_* 2016-08-03 11:18:10 +10:00
xfs_buf_item.c xfs: normalize "infinite" retries in error configs 2016-09-14 07:51:30 +10:00
xfs_buf_item.h
xfs_buf.c xfs: prevent dropping ioend completions during buftarg wait 2016-08-26 16:01:59 +10:00
xfs_buf.h xfs: track and serialize in-flight async buffers against unmount 2016-07-20 11:15:28 +10:00
xfs_dir2_readdir.c xfs: return an error when an inline directory is too small 2016-10-03 09:11:15 -07:00
xfs_discard.c xfs: rmap btree requires more reserved free space 2016-08-03 11:38:24 +10:00
xfs_discard.h
xfs_dquot_item.c xfs: allocate log vector buffers outside CIL context lock 2016-07-22 09:52:35 +10:00
xfs_dquot_item.h
xfs_dquot.c xfs: rename flist/free_list to dfops 2016-08-03 11:19:29 +10:00
xfs_dquot.h
xfs_error.c Merge branch 'xfs-4.8-misc-fixes-3' into for-next 2016-07-20 11:51:08 +10:00
xfs_error.h xfs: simulate per-AG reservations being critically low 2016-10-05 16:26:31 -07:00
xfs_export.c xfs: abstract block export operations from nfsd layouts 2016-07-15 15:31:29 -04:00
xfs_export.h
xfs_extent_busy.c xfs: remote attribute blocks aren't really userdata 2016-09-26 08:21:28 +10:00
xfs_extent_busy.h
xfs_extfree_item.c xfs: remove unnecessary parentheses from log redo item recovery functions 2016-08-03 12:29:32 +10:00
xfs_extfree_item.h xfs: refactor redo intent item processing 2016-08-03 11:23:49 +10:00
xfs_file.c xfs: reflink update for 4.9-rc1 2016-10-13 20:28:22 -07:00
xfs_filestream.c Merge branch 'xfs-4.9-log-recovery-fixes' into for-next 2016-10-03 09:56:28 +11:00
xfs_filestream.h
xfs_fsops.c xfs: preallocate blocks for worst-case btree expansion 2016-10-05 16:26:27 -07:00
xfs_fsops.h xfs: preallocate blocks for worst-case btree expansion 2016-10-05 16:26:27 -07:00
xfs_globals.c xfs: garbage collect old cowextsz reservations 2016-10-05 16:26:28 -07:00
xfs_icache.c xfs: garbage collect old cowextsz reservations 2016-10-05 16:26:28 -07:00
xfs_icache.h xfs: garbage collect old cowextsz reservations 2016-10-05 16:26:28 -07:00
xfs_icreate_item.c
xfs_icreate_item.h
xfs_inode_item.c xfs: create a separate cow extent size hint for the allocator 2016-10-05 16:26:26 -07:00
xfs_inode_item.h xfs: remove timestamps from incore inode 2016-02-09 16:54:58 +11:00
xfs_inode.c xfs: reflink update for 4.9-rc1 2016-10-13 20:28:22 -07:00
xfs_inode.h xfs: set a default CoW extent size of 32 blocks 2016-10-05 16:26:31 -07:00
xfs_ioctl32.c xfs: don't pass ioflags around in the ioctl path 2016-07-20 11:29:35 +10:00
xfs_ioctl32.h
xfs_ioctl.c xfs: reflink update for 4.9-rc1 2016-10-13 20:28:22 -07:00
xfs_ioctl.h xfs: don't pass ioflags around in the ioctl path 2016-07-20 11:29:35 +10:00
xfs_iomap.c xfs: create a separate cow extent size hint for the allocator 2016-10-05 16:26:26 -07:00
xfs_iomap.h xfs: create a separate cow extent size hint for the allocator 2016-10-05 16:26:26 -07:00
xfs_iops.c xfs: reflink update for 4.9-rc1 2016-10-13 20:28:22 -07:00
xfs_iops.h xfs: Propagate dentry down to inode_change_ok() 2016-09-22 10:56:19 +02:00
xfs_itable.c xfs: create a separate cow extent size hint for the allocator 2016-10-05 16:26:26 -07:00
xfs_itable.h
xfs_linux.h xfs: garbage collect old cowextsz reservations 2016-10-05 16:26:28 -07:00
xfs_log_cil.c xfs: allocate log vector buffers outside CIL context lock 2016-07-22 09:52:35 +10:00
xfs_log_priv.h xfs: rework log recovery to submit buffers on LSN boundaries 2016-09-26 08:22:16 +10:00
xfs_log_recover.c xfs: when replaying bmap operations, don't let unlinked inodes get reaped 2016-10-04 11:05:44 -07:00
xfs_log.c Merge branch 'xfs-4.8-buf-fixes' into for-next 2016-07-20 11:53:35 +10:00
xfs_log.h xfs: make several functions static 2016-06-01 17:38:15 +10:00
xfs_message.c
xfs_message.h
xfs_mount.c xfs: garbage collect old cowextsz reservations 2016-10-05 16:26:28 -07:00
xfs_mount.h xfs: garbage collect old cowextsz reservations 2016-10-05 16:26:28 -07:00
xfs_mru_cache.c
xfs_mru_cache.h
xfs_ondisk.h xfs: define the on-disk refcount btree format 2016-10-03 09:11:18 -07:00
xfs_pnfs.c xfs: introduce refcount btree definitions 2016-10-03 09:11:16 -07:00
xfs_pnfs.h xfs: abstract block export operations from nfsd layouts 2016-07-15 15:31:29 -04:00
xfs_qm_bhv.c
xfs_qm_syscalls.c xfs: better xfs_trans_alloc interface 2016-04-06 09:19:55 +10:00
xfs_qm.c xfs: better xfs_trans_alloc interface 2016-04-06 09:19:55 +10:00
xfs_qm.h
xfs_quota.h
xfs_quotaops.c
xfs_refcount_item.c xfs: store in-progress CoW allocations in the refcount btree 2016-10-05 16:26:05 -07:00
xfs_refcount_item.h xfs: log refcount intent items 2016-10-03 09:11:21 -07:00
xfs_reflink.c xfs: fix error initialization 2016-10-10 16:49:18 +11:00
xfs_reflink.h xfs: garbage collect old cowextsz reservations 2016-10-05 16:26:28 -07:00
xfs_rmap_item.c xfs: convert unwritten status of reverse mappings for shared files 2016-10-05 16:26:29 -07:00
xfs_rmap_item.h xfs: convert RUI log formats to use variable length arrays 2016-09-19 10:24:27 +10:00
xfs_rtalloc.c xfs: rename flist/free_list to dfops 2016-08-03 11:19:29 +10:00
xfs_rtalloc.h xfs: make several functions static 2016-06-01 17:38:15 +10:00
xfs_stats.c xfs: introduce refcount btree definitions 2016-10-03 09:11:16 -07:00
xfs_stats.h xfs: introduce refcount btree definitions 2016-10-03 09:11:16 -07:00
xfs_super.c xfs: check inode reflink flag before calling reflink functions 2016-10-10 16:47:32 +11:00
xfs_super.h xfs: quiesce the filesystem after recovery on readonly mount 2016-09-26 08:21:44 +10:00
xfs_symlink.c xfs: rename flist/free_list to dfops 2016-08-03 11:19:29 +10:00
xfs_symlink.h
xfs_sysctl.c xfs: garbage collect old cowextsz reservations 2016-10-05 16:26:28 -07:00
xfs_sysctl.h xfs: garbage collect old cowextsz reservations 2016-10-05 16:26:28 -07:00
xfs_sysfs.c xfs: normalize "infinite" retries in error configs 2016-09-14 07:51:30 +10:00
xfs_sysfs.h xfs: configurable error behavior via sysfs 2016-05-18 10:58:51 +10:00
xfs_trace.c xfs: rework xfs_bmap_free callers to use xfs_defer_ops 2016-08-03 11:15:38 +10:00
xfs_trace.h xfs: reflink update for 4.9-rc1 2016-10-13 20:28:22 -07:00
xfs_trans_ail.c
xfs_trans_bmap.c xfs: implement deferred bmbt map/unmap operations 2016-10-04 11:05:44 -07:00
xfs_trans_buf.c xfs: remove XBF_STALE flag wrapper macros 2016-02-10 15:01:11 +11:00
xfs_trans_dquot.c
xfs_trans_extfree.c xfs: set up per-AG free space reservations 2016-09-19 10:30:52 +10:00
xfs_trans_inode.c fs: Replace current_fs_time() with current_time() 2016-09-27 21:06:22 -04:00
xfs_trans_priv.h
xfs_trans_refcount.c xfs: connect refcount adjust functions to upper layers 2016-10-03 09:11:22 -07:00
xfs_trans_rmap.c xfs: add shared rmap map/unmap/convert log item types 2016-10-05 16:26:29 -07:00
xfs_trans.c Merge branch 'xfs-4.9-reflink-prep' into for-next 2016-10-03 09:52:31 +11:00
xfs_trans.h xfs: implement deferred bmbt map/unmap operations 2016-10-04 11:05:44 -07:00
xfs_xattr.c Make __xfs_xattr_put_listen preperly report errors. 2016-09-14 07:40:35 +10:00
xfs.h