70781 Commits

Author SHA1 Message Date
Darrick J. Wong
8b943d21d4 xfs: assorted fixes for 5.14, part 1
This branch contains the first round of various small fixes for 5.14.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEUzaAxoMeQq6m2jMV+H93GTRKtOsFAmC5YvwACgkQ+H93GTRK
 tOuDTxAAmW+Mc4oPx/Fsa2xqvkDYCGaM1QtkIY9McrGsMdzF0JbB7bR5rrUET2+N
 /FsFTIQ31vZfMEJ69HM6NjQ4xXIKhdEj+PlGsjvg62BqhpPBRuFH7dLBWChiRcf2
 M5SN1P9kcAjIoMR2tKzh8FMDmJTcJlJxxBi9kxb3YK1+UesUHsPS8/jSW3OuIov3
 AJzAZ08SrA8ful5uCMy9Mf5uBgfRuAUjDzA5CM5kA1xqlHREQi+oHl62E81N33mu
 RR8tdHQzvPO5rHyX84GzV5cu2CsmDuOPF2nA5SUxRhIZFfMo5mEezA/nxqqACYti
 rnRGxZVwIG9YYBESVxXFQXIjn5lHoQ2jomk/CraszeVEJCteNRnpbVGzd1xczM3u
 0iuDuHy+aVB/3QhfA6/0vjfttCzkMEld9U9c3WEjIaw5iUCxe531yfrvVEyF9blx
 NBjnQyHGbt+y26BzBjD33NJEdDoZqS3UIQ/rmb4f2mitGN5d9faAcJ754uRJt3o4
 K9HXGjuR+iOH/tCZKDL1hBc4M/pFgNdeBWyFYdQhh8eSj9HSCTCG58zAyQ2WPOSr
 6D/f4BMqivKzzz0HicZPJAoazrtcKGrWTHTxidHIkI4le367NOwv6YJquJ0pFMBs
 8L7OdRqYT3yw6+qErCjEn03WkP9O7V8lHf8hFxt7+dNxDukIj40=
 =6eZB
 -----END PGP SIGNATURE-----

Merge tag 'assorted-fixes-5.14-1_2021-06-03' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-5.14-merge2

xfs: assorted fixes for 5.14, part 1

This branch contains the first round of various small fixes for 5.14.

* tag 'assorted-fixes-5.14-1_2021-06-03' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
  xfs: don't take a spinlock unconditionally in the DIO fastpath
  xfs: mark xfs_bmap_set_attrforkoff static
  xfs: Remove redundant assignment to busy
  xfs: sort variable alphabetically to avoid repeated declaration
2021-06-08 09:22:34 -07:00
Darrick J. Wong
f52edf6c54 xfs: various unit conversions
Crafting the realtime file extent size hint fixes revealed various
 opportunities to clean up unit conversions, so now that gets its own
 series.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEUzaAxoMeQq6m2jMV+H93GTRKtOsFAmC5YvYACgkQ+H93GTRK
 tOtRihAAh5wKVV9Fwafkk3l2zTCw+K/SfciWvZqF8BClAjqzWUDaw4DJ2HS0lSmj
 ezOG+/HhJzZB4udC9WDM4TSFA/elR1pJBu5lSBpAt/K78Mpw/i1aFs1SyAfw6Qsp
 1A8um80jaCyzvcSxU1Hx/35CN0MQwpjJ23LziEDUVBvj/qV7Qj8DmI61CfWWYzs9
 jEJFzIvbKoRqs8Wqws7n4a7fw3VOyqC2EQV290xzumkYKOSAmpWQGkbiU2/YtqKY
 7w/GvjS1zRE7o1cGeKBYRA/35zbw5mqoHD5vJmdlUIsWK22fnN8h5vqYo5dpFPiz
 nFyK+MHVrV6WxKBKECHtgm7Jya+jjH0TFu/9js4k31ehe4LJc/nhEqmSrFiH3eGS
 4AXrhhZqtDZ1skjcPym83dCayW1uE2cWbHqYJ+ztQW5PrLafoWQ9Ld/vtrbL+b/U
 VIgh3LQmF371jGcE0twERNQPIb5F96w9mS6F+vq0JuvrGftxoa4tKVjE8jrNWJjo
 750/KT0Nupti0AYKq9WMD47BiSf5BbdnhYTN15X0mc5TyHo/EyXIl/I/hFfEMFp9
 AH/qWaQlt3LjMXvr1xF6R4RNSI+xxai7PHfW69Zi7tBea+7k9wTZjubgc2FFaMOQ
 Jf8n5L07IXVYDy0mGUGaokenQb2KPvfXVK7yYtaMj1qDh7Sjqjw=
 =HGL1
 -----END PGP SIGNATURE-----

Merge tag 'unit-conversion-cleanups-5.14_2021-06-03' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-5.14-merge2

xfs: various unit conversions

Crafting the realtime file extent size hint fixes revealed various
opportunities to clean up unit conversions, so now that gets its own
series.

* tag 'unit-conversion-cleanups-5.14_2021-06-03' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
  xfs: remove unnecessary shifts
  xfs: clean up open-coded fs block unit conversions
2021-06-08 09:21:24 -07:00
Dave Chinner
9ba0889e22 xfs: drop the AGI being passed to xfs_check_agi_freecount
From: Dave Chinner <dchinner@redhat.com>

Stephen Rothwell reported this compiler warning from linux-next:

fs/xfs/libxfs/xfs_ialloc.c: In function 'xfs_difree_finobt':
fs/xfs/libxfs/xfs_ialloc.c:2032:20: warning: unused variable 'agi' [-Wunused-variable]
 2032 |  struct xfs_agi   *agi = agbp->b_addr;

Which is fallout from agno -> perag conversions that were done in
this function. xfs_check_agi_freecount() is the only user of "agi"
in xfs_difree_finobt() now, and it only uses the agi to get the
current free inode count. We hold that in the perag structure, so
there's not need to directly reference the raw AGI to get this
information.

The btree cursor being passed to xfs_check_agi_freecount() has a
reference to the perag being operated on, so use that directly in
xfs_check_agi_freecount() rather than passing an AGI.

Fixes: 7b13c5155182 ("xfs: use perag for ialloc btree cursors")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-06-08 09:19:22 -07:00
Darrick J. Wong
c3eabd3650 xfs: initial agnumber -> perag conversions for shrink
If we want to use active references to the perag to be able to gate
 shrink removing AGs and hence perags safely, we've got a fair bit of
 work to do actually use perags in all the places we need to.
 
 There's a lot of code that iterates ag numbers and then
 looks up perags from that, often multiple times for the same perag
 in the one operation. If we want to use reference counted perags for
 access control, then we need to convert all these uses to perag
 iterators, not agno iterators.
 
 [Patches 1-4]
 
 The first step of this is consolidating all the perag management -
 init, free, get, put, etc into a common location. THis is spread all
 over the place right now, so move it all into libxfs/xfs_ag.[ch].
 This does expose kernel only bits of the perag to libxfs and hence
 userspace, so the structures and code is rearranged to minimise the
 number of ifdefs that need to be added to the userspace codebase.
 The perag iterator in xfs_icache.c is promoted to a first class API
 and expanded to the needs of the code as required.
 
 [Patches 5-10]
 
 These are the first basic perag iterator conversions and changes to
 pass the perag down the stack from those iterators where
 appropriate. A lot of this is obvious, simple changes, though in
 some places we stop passing the perag down the stack because the
 code enters into an as yet unconverted subsystem that still uses raw
 AGs.
 
 [Patches 11-16]
 
 These replace the agno passed in the btree cursor for per-ag btree
 operations with a perag that is passed to the cursor init function.
 The cursor takes it's own reference to the perag, and the reference
 is dropped when the cursor is deleted. Hence we get reference
 coverage for the entire time the cursor is active, even if the code
 that initialised the cursor drops it's reference before the cursor
 or any of it's children (duplicates) have been deleted.
 
 The first patch adds the perag infrastructure for the cursor, the
 next four patches convert a btree cursor at a time, and the last
 removes the agno from the cursor once it is unused.
 
 [Patches 17-21]
 
 These patches are a demonstration of the simplifications and
 cleanups that come from plumbing the perag through interfaces that
 select and then operate on a specific AG. In this case the inode
 allocation algorithm does up to three walks across all AGs before it
 either allocates an inode or fails. Two of these walks are purely
 just to select the AG, and even then it doesn't guarantee inode
 allocation success so there's a third walk if the selected AG
 allocation fails.
 
 These patches collapse the selection and allocation into a single
 loop, simplifies the error handling because xfs_dir_ialloc() always
 returns ENOSPC if no AG was selected for inode allocation or we fail
 to allocate an inode in any AG, gets rid of xfs_dir_ialloc()
 wrapper, converts inode allocation to run entirely from a single
 perag instance, and then factors xfs_dialloc() into a much, much
 simpler loop which is easy to understand.
 
 Hence we end up with the same inode allocation logic, but it only
 needs two complete iterations at worst, makes AG selection and
 allocation atomic w.r.t. shrink and chops out out over 100 lines of
 code from this hot code path.
 
 [Patch 22]
 
 Converts the unlink path to pass perags through it.
 
 There's more conversion work to be done, but this patchset gets
 through a large chunk of it in one hit. Most of the iterators are
 converted, so once this is solidified we can move on to converting
 these to active references for being able to free perags while the
 fs is still active.
 -----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEmJOoJ8GffZYWSjj/regpR/R1+h0FAmC3HUgUHGRhdmlkQGZy
 b21vcmJpdC5jb20ACgkQregpR/R1+h2yaw/+P0JzpI+6n06Ei00mjgE/Du/WhMLi
 0JQ93Grlj+miuGGT9DgGCiRpoZnefhEk+BH6JqoEw1DQ3T5ilmAzrHLUUHSQC3+S
 dv85sJduheQ6yHuoO+4MzkaSq6JWKe7E9gZwAsVyBul5aSjdmaJaQdPwYMTXSXo0
 5Uqq8ECFkMcaHVNjcBfasgR/fdyWy2Qe4PFTHTHdQpd+DNZ9UXgFKHW2og+1iry/
 zDIvdIppJULA09TvVcZuFjd/1NzHQ/fLj5PAzz8GwagB4nz2x3s78Zevmo5yW/jK
 3/+50vXa8ldhiHDYGTS3QXvS0xJRyqUyD47eyWOOiojZw735jEvAlCgjX6+0X1HC
 k3gCkQLv8l96fRkvUpgnLf/fjrUnlCuNBkm9d1Eq2Tied8dvLDtiEzoC6L05Nqob
 yd/nIUb1zwJFa9tsoheHhn0bblTGX1+zP0lbRJBje0LotpNO9DjGX5JoIK4GR7F8
 y1VojcdgRI14HlxUnbF3p8wmQByN+M2tnp6GSdv9BA65bjqi05Rj/steFdZHBV6x
 wiRs8Yh6BTvMwKgufHhRQHfRahjNHQ/T/vOE+zNbWqemS9wtEUDop+KvPhC36R/k
 o/cmr23cF8ESX2eChk7XM4On3VEYpcvp2zSFgrFqZYl6RWOwEis3Htvce3KuSTPp
 8Xq70te0gr2DVUU=
 =YNzW
 -----END PGP SIGNATURE-----

Merge tag 'xfs-perag-conv-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs into xfs-5.14-merge2

xfs: initial agnumber -> perag conversions for shrink

If we want to use active references to the perag to be able to gate
shrink removing AGs and hence perags safely, we've got a fair bit of
work to do actually use perags in all the places we need to.

There's a lot of code that iterates ag numbers and then
looks up perags from that, often multiple times for the same perag
in the one operation. If we want to use reference counted perags for
access control, then we need to convert all these uses to perag
iterators, not agno iterators.

[Patches 1-4]

The first step of this is consolidating all the perag management -
init, free, get, put, etc into a common location. THis is spread all
over the place right now, so move it all into libxfs/xfs_ag.[ch].
This does expose kernel only bits of the perag to libxfs and hence
userspace, so the structures and code is rearranged to minimise the
number of ifdefs that need to be added to the userspace codebase.
The perag iterator in xfs_icache.c is promoted to a first class API
and expanded to the needs of the code as required.

[Patches 5-10]

These are the first basic perag iterator conversions and changes to
pass the perag down the stack from those iterators where
appropriate. A lot of this is obvious, simple changes, though in
some places we stop passing the perag down the stack because the
code enters into an as yet unconverted subsystem that still uses raw
AGs.

[Patches 11-16]

These replace the agno passed in the btree cursor for per-ag btree
operations with a perag that is passed to the cursor init function.
The cursor takes it's own reference to the perag, and the reference
is dropped when the cursor is deleted. Hence we get reference
coverage for the entire time the cursor is active, even if the code
that initialised the cursor drops it's reference before the cursor
or any of it's children (duplicates) have been deleted.

The first patch adds the perag infrastructure for the cursor, the
next four patches convert a btree cursor at a time, and the last
removes the agno from the cursor once it is unused.

[Patches 17-21]

These patches are a demonstration of the simplifications and
cleanups that come from plumbing the perag through interfaces that
select and then operate on a specific AG. In this case the inode
allocation algorithm does up to three walks across all AGs before it
either allocates an inode or fails. Two of these walks are purely
just to select the AG, and even then it doesn't guarantee inode
allocation success so there's a third walk if the selected AG
allocation fails.

These patches collapse the selection and allocation into a single
loop, simplifies the error handling because xfs_dir_ialloc() always
returns ENOSPC if no AG was selected for inode allocation or we fail
to allocate an inode in any AG, gets rid of xfs_dir_ialloc()
wrapper, converts inode allocation to run entirely from a single
perag instance, and then factors xfs_dialloc() into a much, much
simpler loop which is easy to understand.

Hence we end up with the same inode allocation logic, but it only
needs two complete iterations at worst, makes AG selection and
allocation atomic w.r.t. shrink and chops out out over 100 lines of
code from this hot code path.

[Patch 22]

Converts the unlink path to pass perags through it.

There's more conversion work to be done, but this patchset gets
through a large chunk of it in one hit. Most of the iterators are
converted, so once this is solidified we can move on to converting
these to active references for being able to free perags while the
fs is still active.

* tag 'xfs-perag-conv-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs: (23 commits)
  xfs: remove xfs_perag_t
  xfs: use perag through unlink processing
  xfs: clean up and simplify xfs_dialloc()
  xfs: inode allocation can use a single perag instance
  xfs: get rid of xfs_dir_ialloc()
  xfs: collapse AG selection for inode allocation
  xfs: simplify xfs_dialloc_select_ag() return values
  xfs: remove agno from btree cursor
  xfs: use perag for ialloc btree cursors
  xfs: convert allocbt cursors to use perags
  xfs: convert refcount btree cursor to use perags
  xfs: convert rmap btree cursor to using a perag
  xfs: add a perag to the btree cursor
  xfs: pass perags around in fsmap data dev functions
  xfs: push perags through the ag reservation callouts
  xfs: pass perags through to the busy extent code
  xfs: convert secondary superblock walk to use perags
  xfs: convert xfs_iwalk to use perag references
  xfs: convert raw ag walks to use for_each_perag
  xfs: make for_each_perag... a first class citizen
  ...
2021-06-08 09:13:13 -07:00
Darrick J. Wong
ebf2e33723 xfs: buffer cache bulk page allocation
This patchset makes use of the new bulk page allocation interface to
 reduce the overhead of allocating large numbers of pages in a
 loop.
 
 The first two patches are refactoring buffer memory allocation and
 converting the uncached buffer path to use the same page allocation
 path, followed by converting the page allocation path to use bulk
 allocation.
 
 The rest of the patches are then consolidation of the page
 allocation and freeing code to simplify the code and remove a chunk
 of unnecessary abstraction. This is largely based on a series of
 changes made by Christoph Hellwig.
 -----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEmJOoJ8GffZYWSjj/regpR/R1+h0FAmC+6OwUHGRhdmlkQGZy
 b21vcmJpdC5jb20ACgkQregpR/R1+h21QQ/8C0f7wq1OKwNI2oRubf6J8jtttiRS
 SD2TA03AP2OIOKx4y2G0h0dJeX9tgnbLerIlpfT80nHDoBgKHbZCYSEHQT0DscYo
 fuwQTVR8RCklKspjUlCGR+Gbm6vI8HakK1lAppw168e4c6t8wX1KiSibwaVTQdaZ
 NaXUqTUzGiNq+iiLS6fW3mJ3PKWFJYyrDOSR2jIPbUGIJdejRCGe0xVnu+hIsz+y
 c2gSGCB+j3cYaazhlJTDYPGja3Wq3eR+Ya9i1GcA1tJiJLsu0ZjaVQ69Bl4dud2F
 c3OyhFK0El1VMSEVb3hY8gTpAO02jNWSnB2Zlidt0h4ZJVAxKus0xe2w3eS4uST2
 hcMI3lwjdzRQuoBwOgXQ+CpYVv2wI8HPNLTSR+NYcC2IZaCNieFRWdTYwXrAJBB3
 H09m04GT/7TkkrYHFD1zRtIedP4DZ6MZn/33bufNxEt1NRCFw5AFAEUFfjDA317A
 4nByCmU6XjmmpI/XLixwu0BYCfKVB4UsrgOyzXBy7ZU0+pIser+ynP1V4d9Bb43Y
 xVQ8S0QirT7gqXjx75mD4B4qkXZ5nrz5Z7fSn6YU4TwqsYtZYlsBauLlWmmHp9MT
 CP4PA4j+CQORhfZzWXw2ViXYGoIssc1cw5i4JB6a4u/OaDi19dYkE6SO8P3b9GSm
 khHqWgcTC4VGpmc=
 =JsrV
 -----END PGP SIGNATURE-----

Merge tag 'xfs-buf-bulk-alloc-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs into xfs-5.14-merge2

xfs: buffer cache bulk page allocation

This patchset makes use of the new bulk page allocation interface to
reduce the overhead of allocating large numbers of pages in a
loop.

The first two patches are refactoring buffer memory allocation and
converting the uncached buffer path to use the same page allocation
path, followed by converting the page allocation path to use bulk
allocation.

The rest of the patches are then consolidation of the page
allocation and freeing code to simplify the code and remove a chunk
of unnecessary abstraction. This is largely based on a series of
changes made by Christoph Hellwig.

* tag 'xfs-buf-bulk-alloc-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs:
  xfs: merge xfs_buf_allocate_memory
  xfs: cleanup error handling in xfs_buf_get_map
  xfs: get rid of xb_to_gfp()
  xfs: simplify the b_page_count calculation
  xfs: remove ->b_offset handling for page backed buffers
  xfs: move page freeing into _xfs_buf_free_pages()
  xfs: merge _xfs_buf_get_pages()
  xfs: use alloc_pages_bulk_array() for buffers
  xfs: use xfs_buf_alloc_pages for uncached buffers
  xfs: split up xfs_buf_allocate_memory
2021-06-08 09:10:01 -07:00
Dave Chinner
8bcac7448a xfs: merge xfs_buf_allocate_memory
It only has one caller and is now a simple function, so merge it
into the caller.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2021-06-07 11:50:48 +10:00
Christoph Hellwig
170041f715 xfs: cleanup error handling in xfs_buf_get_map
Use a single goto label for freeing the buffer and returning an
error.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
2021-06-07 11:50:47 +10:00
Dave Chinner
289ae7b48c xfs: get rid of xb_to_gfp()
Only used in one place, so just open code the logic in the macro.
Based on a patch from Christoph Hellwig.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2021-06-07 11:50:17 +10:00
Christoph Hellwig
934d1076bb xfs: simplify the b_page_count calculation
Ever since we stopped using the Linux page cache to back XFS buffers
there is no need to take the start sector into account for
calculating the number of pages in a buffer, as the data always
start from the beginning of the buffer.

Signed-off-by: Christoph Hellwig <hch@lst.de>
[dgc: modified to suit this series]
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2021-06-07 11:50:00 +10:00
Christoph Hellwig
54cd3aa6f8 xfs: remove ->b_offset handling for page backed buffers
->b_offset can only be non-zero for _XBF_KMEM backed buffers, so
remove all code dealing with it for page backed buffers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
[dgc: modified to fit this patchset]
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2021-06-07 11:49:50 +10:00
Dave Chinner
977ec4ddf0 xfs: don't take a spinlock unconditionally in the DIO fastpath
Because this happens at high thread counts on high IOPS devices
doing mixed read/write AIO-DIO to a single file at about a million
iops:

   64.09%     0.21%  [kernel]            [k] io_submit_one
   - 63.87% io_submit_one
      - 44.33% aio_write
         - 42.70% xfs_file_write_iter
            - 41.32% xfs_file_dio_write_aligned
               - 25.51% xfs_file_write_checks
                  - 21.60% _raw_spin_lock
                     - 21.59% do_raw_spin_lock
                        - 19.70% __pv_queued_spin_lock_slowpath

This also happens of the IO completion IO path:

   22.89%     0.69%  [kernel]            [k] xfs_dio_write_end_io
   - 22.49% xfs_dio_write_end_io
      - 21.79% _raw_spin_lock
         - 20.97% do_raw_spin_lock
            - 20.10% __pv_queued_spin_lock_slowpath

IOWs, fio is burning ~14 whole CPUs on this spin lock.

So, do an unlocked check against inode size first, then if we are
at/beyond EOF, take the spinlock and recheck. This makes the
spinlock disappear from the overwrite fastpath.

I'd like to report that fixing this makes things go faster. It
doesn't - it just exposes the the XFS_ILOCK as the next severe
contention point doing extent mapping lookups, and that now burns
all the 14 CPUs this spinlock was burning.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-06-02 15:00:38 -07:00
Christoph Hellwig
5a981e4ea8 xfs: mark xfs_bmap_set_attrforkoff static
xfs_bmap_set_attrforkoff is only used inside of xfs_bmap.c, so mark it
static.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-06-02 14:58:59 -07:00
Jiapeng Chong
9673261c32 xfs: Remove redundant assignment to busy
Variable busy is set to false, but this value is never read as it is
overwritten or not used later on, hence it is a redundant assignment
and can be removed.

Clean up the following clang-analyzer warning:

fs/xfs/libxfs/xfs_alloc.c:1679:2: warning: Value stored to 'busy' is
never read [clang-analyzer-deadcode.DeadStores].

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-06-02 14:56:29 -07:00
Shaokun Zhang
5f7fd75086 xfs: sort variable alphabetically to avoid repeated declaration
Variable 'xfs_agf_buf_ops', 'xfs_agi_buf_ops', 'xfs_dquot_buf_ops' and
'xfs_symlink_buf_ops' are declared twice, so sort these variables
alphabetically and remove the repeated declaration.

Cc: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-06-02 14:54:09 -07:00
Dave Chinner
509201163f xfs: remove xfs_perag_t
Almost unused, gets rid of another typedef.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2021-06-02 10:48:51 +10:00
Dave Chinner
f40aadb2bb xfs: use perag through unlink processing
Unlinked lists are held in the perag, and freeing of inodes needs to
be passed a perag, too, so look up the perag early in the unlink
processing and use it throughout.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2021-06-02 10:48:51 +10:00
Dave Chinner
8237fbf53d xfs: clean up and simplify xfs_dialloc()
Because it's a mess.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2021-06-02 10:48:24 +10:00
Dave Chinner
309161f660 xfs: inode allocation can use a single perag instance
Now that we've internalised the two-phase inode allocation, we can
now easily make the AG selection and allocation atomic from the
perspective of a single perag context. This will ensure AGs going
offline/away cannot occur between the selection and allocation
steps.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2021-06-02 10:48:24 +10:00
Dave Chinner
b652afd937 xfs: get rid of xfs_dir_ialloc()
This is just a simple wrapper around the per-ag inode allocation
that doesn't need to exist. The internal mechanism to select and
allocate within an AG does not need to be exposed outside
xfs_ialloc.c, and it being exposed simply makes it harder to follow
the code and simplify it.

This is simplified by internalising xf_dialloc_select_ag() and
xfs_dialloc_ag() into a single xfs_dialloc() function and then
xfs_dir_ialloc() can go away.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2021-06-02 10:48:24 +10:00
Dave Chinner
89b1f55a29 xfs: collapse AG selection for inode allocation
xfs_dialloc_select_ag() does a lot of repetitive work. It first
calls xfs_ialloc_ag_select() to select the AG to start allocation
attempts in, which can do up to two entire loops across the perags
that inodes can be allocated in. This is simply checking if there is
spce available to allocate inodes in an AG, and it returns when it
finds the first candidate AG.

xfs_dialloc_select_ag() then does it's own iterative walk across
all the perags locking the AGIs and trying to allocate inodes from
the locked AG. It also doesn't limit the search to mp->m_maxagi,
so it will walk all AGs whether they can allocate inodes or not.

Hence if we are really low on inodes, we could do almost 3 entire
walks across the whole perag range before we find an allocation
group we can allocate inodes in or report ENOSPC.

Because xfs_ialloc_ag_select() returns on the first candidate AG it
finds, we can simply do these checks directly in
xfs_dialloc_select_ag() before we lock and try to allocate inodes.
This reduces the inode allocation pass down to 2 perag sweeps at
most - one for aligned inode cluster allocation and if we can't
allocate full, aligned inode clusters anywhere we'll do another pass
trying to do sparse inode cluster allocation.

This also removes a big chunk of duplicate code.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2021-06-02 10:48:24 +10:00
Dave Chinner
4268547305 xfs: simplify xfs_dialloc_select_ag() return values
The only caller of xfs_dialloc_select_ag() will always return
-ENOSPC to it's caller if the agbp returned from
xfs_dialloc_select_ag() is NULL. IOWs, failure to find a candidate
AGI we can allocate inodes from is always an ENOSPC condition, so
move this logic up into xfs_dialloc_select_ag() so we can simplify
the return logic in this function.

xfs_dialloc_select_ag() now only ever returns 0 with a locked
agbp, or an error with no agbp.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2021-06-02 10:48:24 +10:00
Dave Chinner
50f02fe333 xfs: remove agno from btree cursor
Now that everything passes a perag, the agno is not needed anymore.
Convert all the users to use pag->pag_agno instead and remove the
agno from the cursor. This was largely done as an automated search
and replace.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2021-06-02 10:48:24 +10:00
Dave Chinner
7b13c51551 xfs: use perag for ialloc btree cursors
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2021-06-02 10:48:24 +10:00
Dave Chinner
289d38d22c xfs: convert allocbt cursors to use perags
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2021-06-02 10:48:24 +10:00
Dave Chinner
a81a06211f xfs: convert refcount btree cursor to use perags
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2021-06-02 10:48:24 +10:00
Dave Chinner
fa9c3c1973 xfs: convert rmap btree cursor to using a perag
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2021-06-02 10:48:24 +10:00
Dave Chinner
be9fb17d88 xfs: add a perag to the btree cursor
Which will eventually completely replace the agno in it.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2021-06-02 10:48:24 +10:00
Dave Chinner
58d43a7e32 xfs: pass perags around in fsmap data dev functions
Needs a [from, to] ranged AG walk, and the perag to be stuffed into
the info structure for callouts to use.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2021-06-02 10:48:24 +10:00
Dave Chinner
30933120ad xfs: push perags through the ag reservation callouts
We currently pass an agno from the AG reservation functions to the
individual feature accounting functions, which in future may have to
do perag lookups to access per-AG state. Instead, pre-emptively
plumb the perag through from the highest AG reservation layer to the
feature callouts so they won't have to look it up again.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2021-06-02 10:48:24 +10:00
Dave Chinner
45d0662117 xfs: pass perags through to the busy extent code
All of the callers of the busy extent API either have perag
references available to use so we can pass a perag to the busy
extent functions rather than having them have to do unnecessary
lookups.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2021-06-02 10:48:24 +10:00
Dave Chinner
7f8d3b3ca6 xfs: convert secondary superblock walk to use perags
Clean up the last external manual AG walk.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2021-06-02 10:48:24 +10:00
Dave Chinner
6f4118fc64 xfs: convert xfs_iwalk to use perag references
Rather than manually walking the ags and passing agnunbers around,
pass the perag for the AG we are currently working on around in the
iwalk structure.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2021-06-02 10:48:24 +10:00
Dave Chinner
934933c3ee xfs: convert raw ag walks to use for_each_perag
Convert the raw walks to an iterator, pulling the current AG out of
pag->pag_agno instead of the loop iterator variable.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2021-06-02 10:48:24 +10:00
Dave Chinner
f250eedcf7 xfs: make for_each_perag... a first class citizen
for_each_perag_tag() is defined in xfs_icache.c for local use.
Promote this to xfs_ag.h and define equivalent iteration functions
so that we can use them to iterate AGs instead to replace open coded
perag walks and perag lookups.

We also convert as many of the straight forward open coded AG walks
to use these iterators as possible. Anything that is not a direct
conversion to an iterator is ignored and will be updated in future
commits.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2021-06-02 10:48:24 +10:00
Dave Chinner
07b6403a68 xfs: move perag structure and setup to libxfs/xfs_ag.[ch]
Move the xfs_perag infrastructure to the libxfs files that contain
all the per AG infrastructure. This helps set up for passing perags
around all the code instead of bare agnos with minimal extra
includes for existing files.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2021-06-02 10:48:24 +10:00
Dave Chinner
61aa005a5b xfs: prepare for moving perag definitions and support to libxfs
The perag structures really need to be defined with the rest of the
AG support infrastructure. The struct xfs_perag and init/teardown
has been placed in xfs_mount.[ch] because there are differences in
the structure between kernel and userspace. Mainly that userspace
doesn't have a lot of the internal stuff that the kernel has for
caches and discard and other such structures.

However, it makes more sense to move this to libxfs than to keep
this separation because we are now moving to use struct perags
everywhere in the code instead of passing raw agnumber_t values
about. Hence we shoudl really move the support infrastructure to
libxfs/xfs_ag.[ch].

To do this without breaking userspace, first we need to rearrange
the structures and code so that all the kernel specific code is
located together. This makes it simple for userspace to ifdef out
the all the parts it does not need, minimising the code differences
between kernel and userspace. The next commit will do the move...

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2021-06-02 10:48:24 +10:00
Dave Chinner
9bbafc7191 xfs: move xfs_perag_get/put to xfs_ag.[ch]
They are AG functions, not superblock functions, so move them to the
appropriate location.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2021-06-02 10:48:24 +10:00
Darrick J. Wong
20bd8e63f3 xfs: remove unnecessary shifts
The superblock verifier already validates that (1 << blocklog) ==
blocksize, so use the value directly instead of doing math.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
2021-06-01 12:53:59 -07:00
Darrick J. Wong
a7bcb147fe xfs: clean up open-coded fs block unit conversions
Replace some open-coded fs block unit conversions with the standard
conversion macro.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
2021-06-01 12:53:59 -07:00
Dave Chinner
e7d236a6fe xfs: move page freeing into _xfs_buf_free_pages()
Rather than open coding it just before we call
_xfs_buf_free_pages(). Also, rename the function to
xfs_buf_free_pages() as the leading underscore has no useful
meaning.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2021-06-01 13:40:36 +10:00
Dave Chinner
02c5117386 xfs: merge _xfs_buf_get_pages()
Only called from one place now, so merge it into
xfs_buf_alloc_pages(). Because page array allocation is dependent on
bp->b_pages being null, always ensure that when the pages array is
freed we always set bp->b_pages to null.

Also convert the page array to use kmalloc() rather than
kmem_alloc() so we can use the gfp flags we've already calculated
for the allocation context instead of hard coding KM_NOFS semantics.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2021-06-01 13:40:36 +10:00
Dave Chinner
c9fa563072 xfs: use alloc_pages_bulk_array() for buffers
Because it's more efficient than allocating pages one at a time in a
loop.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2021-06-01 13:40:36 +10:00
Dave Chinner
07b5c5add4 xfs: use xfs_buf_alloc_pages for uncached buffers
Use the newly factored out page allocation code. This adds
automatic buffer zeroing for non-read uncached buffers.

This also allows us to greatly simply the error handling in
xfs_buf_get_uncached(). Because xfs_buf_alloc_pages() cleans up
partial allocation failure, we can just call xfs_buf_free() in all
error cases now to clean up after failures.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2021-06-01 13:40:35 +10:00
Dave Chinner
0a683794ac xfs: split up xfs_buf_allocate_memory
Based on a patch from Christoph Hellwig.

This splits out the heap allocation and page allocation portions of
the buffer memory allocation into two separate helper functions.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
2021-06-01 13:40:02 +10:00
Linus Torvalds
75b9c727af Fixes for 5.13-rc4:
- Fix a bug where unmapping operations end earlier than expected, which
   can cause chaos on multi-block directory and symlink shrink
   operations.
 - Fix an erroneous assert that can trigger if we try to transition a
   bmap structure from btree format to extents format with zero extents.
   This was exposed by xfs/538.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEUzaAxoMeQq6m2jMV+H93GTRKtOsFAmCvxs0ACgkQ+H93GTRK
 tOsfAQ//fAtDZkjKYKHhWUFoyG6kYNsIZr7wf+kow8jJgeWUibwtUYYQQV/RCRtJ
 zR+Tiys9ZorAReYpzq69s1LbZg/Zz1GT4bPgq/9Icni9x8EXIS6MVaWJHjnkFtKD
 7IqDztV3tC3XgSuAEsjey5PA1V1xpSgxxVtaT1Q2BcY8zqf2bnEPzM/rpKdmE++x
 jlTYrgLBctI24nbmTX2Y/+Te1UWXjM4QiV/EBHiUPedAqJZhwA0hU7hJJv/9I/EG
 /GOjisxhAonKR7fr7wPE+LwJMaxxK4LAt3aLZmGKpm3smSYX8O6sGnJv9VI/stsS
 wRD9c3wzLvfmqL5MXeAYq83u3s5DuFsfqmYD2U49xHlFF9tvLTT5S0Pdi/Qiq962
 n3wabi0slBCdzeY3xXXr9M4cCLL6utYY8Vfi7KvBiDHdtCRZUU33/SAwxZzvhHQv
 0XN+2sqnIn3jM9xg342+/BZbi4+SX7h28qixmgxCo+hez96GHuwhdN5GUVa5lF0r
 4uRPn+VVaOJPcNRx69/iTkrJ1R4YPqedCkgLShs6lZX5Ct92UtANLzqmm0xCZ6U5
 Pe7WjXO6aVAugEsVd9qnPdx4o/+sabs6CEQ65BbYyKvOBVg1XWH0yUEeKyGRYDt9
 21ol7QSKJ6+HIg473j8A3omM5s2XKUT8NZMmRm4EojNI+j4O3R4=
 =y7+K
 -----END PGP SIGNATURE-----

Merge tag 'xfs-5.13-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs fixes from Darrick Wong:
 "This week's pile mitigates some decades-old problems in how extent
  size hints interact with realtime volumes, fixes some failures in
  online shrink, and fixes a problem where directory and symlink
  shrinking on extremely fragmented filesystems could fail.

  The most user-notable change here is to point users at our (new) IRC
  channel on OFTC. Freedom isn't free, it costs folks like you and me;
  and if you don't kowtow, they'll expel everyone and take over your
  channel. (Ok, ok, that didn't fit the song lyrics...)

  Summary:

   - Fix a bug where unmapping operations end earlier than expected,
     which can cause chaos on multi-block directory and symlink shrink
     operations.

   - Fix an erroneous assert that can trigger if we try to transition a
     bmap structure from btree format to extents format with zero
     extents. This was exposed by xfs/538"

* tag 'xfs-5.13-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: bunmapi has unnecessary AG lock ordering issues
  xfs: btree format inode forks can have zero extents
  xfs: add new IRC channel to MAINTAINERS
  xfs: validate extsz hints against rt extent size when rtinherit is set
  xfs: standardize extent size hint validation
  xfs: check free AG space when making per-AG reservations
2021-05-29 17:47:19 -10:00
Linus Torvalds
e1a9e3db3b Driver core fixes for 5.13-rc4
Here are 3 small driver core / debugfs fixes for 5.13-rc4:
   - debugfs fix for incorrect "lockdown" mode for selinux accesses
   - 2 device link changes, one bugfix and one cleanup
 
 All of these have been in linux-next for over a week with no reported
 problems.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYLJMrQ8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ynf8ACgvsZCX7Wi3GYtFovfomHsCRKpZBsAn0sqfSAL
 TXHePEnj2tJ5c22TSqSt
 =Zx6Z
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-5.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core fixes from Greg KH:
 "Here are three small driver core / debugfs fixes for 5.13-rc4:

   - debugfs fix for incorrect "lockdown" mode for selinux accesses

   - two device link changes, one bugfix and one cleanup

  All of these have been in linux-next for over a week with no reported
  problems"

* tag 'driver-core-5.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  drivers: base: Reduce device link removal code duplication
  drivers: base: Fix device link removal
  debugfs: fix security_locked_down() call for SELinux
2021-05-29 06:33:28 -10:00
Linus Torvalds
b3dbbae609 io_uring-5.13-2021-05-28
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmCxY4wQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpqJnD/sEHg2ZVzc3CUtvLI11C+O4nkqzUpetOD8I
 iKtvCYKYNTATOPLGQjsznNTTVcUhN4Mud9XWHjyR3nli98fwRrzLuK3EfJjuq1cL
 v6DZVuYKq4k6s0QN6K8yTMslYBQTmk85l8rvXs06jVqDadnnVc+JdfWWBDducs0e
 56Wtmlse18PhzfDjqtsjAOQBjpv4bhQaJTrYOHcEIqFiih2ZpSvyP3SLED7/nvoe
 Q8MNF0Htff/oVbUEzp/NfhHoOFIZ17wwPV3fRC7zat2Dp4R9ZxpScmozLn8PkdO9
 DW+rKpuCbYTYwY1p11cQ5EhiNWNfPMxX4YXovUP9z+M2cgGUK1IhWQRM83L9bAXt
 r/9Md5WjnNpeDr6/YW6uMe1lOrrEy2ZJfNJ2JJbiXo6CWiz+g2qfHLOxwVsEnfoy
 vZoSbDD8ItZDooaXDFGEp1PLpkka4vt/6Ebg0fUtEeG8QQ48eG5L9xpPMSjm90y9
 /UKZdS1pvSl/x6he+RDPg4aVGBWIhGJhv+Q22hNTO3g5u5QE+hXLvFh0QvoOkDQK
 FGlhIa431EiOdm3rdFCG2I4kH1QzQTO6XLHpoVabGXJULPvS2ztnHCz3pYqOU9w1
 Mh12t1RtWzvcTkyOutfsjVqszV3kTl6O6GkI8CiqqjomnbbfORj6CDsi7h9RFZI+
 HtnY2GbSJg==
 =dfLl
 -----END PGP SIGNATURE-----

Merge tag 'io_uring-5.13-2021-05-28' of git://git.kernel.dk/linux-block

Pull io_uring fixes from Jens Axboe:
 "A few minor fixes:

   - Fix an issue with hashed wait removal on exit (Zqiang, Pavel)

   - Fix a recent data race introduced in this series (Marco)"

* tag 'io_uring-5.13-2021-05-28' of git://git.kernel.dk/linux-block:
  io_uring: fix data race to avoid potential NULL-deref
  io-wq: Fix UAF when wakeup wqe in hash waitqueue
  io_uring/io-wq: close io-wq full-stop gap
2021-05-28 14:35:55 -10:00
Linus Torvalds
7c0ec89d31 3 SMB3 fixes, two for stable, and the other fixes a problem pointed out with a recently added ioctl
-----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmCxbv4ACgkQiiy9cAdy
 T1GkPwwAqq0tvNDZnQ6aAur1jwHMiOIAydvpgNKXlKkiXYu+qQbMRgOdUsOtjM00
 Idi8PHKkID2X63HeiwLwoQTfTXGcs6I4UM1iOslEs2ZaX+Fkgo5PG25lIFRTAsqR
 tqYGqGi6yL6TWE7PlVJxr3QuwGeMr7B8X9A0lTZ7YJwslhByK8ymasPdF+jSgPQI
 zDOuAeXiYvlph8sCftWX7gF34aBfKgiH8LhA6M2SY5S16g7LwtXUJjq1PJactoD3
 +nEPyCtRoN6ohScKNVnM8JDpOKIrM+mJ42RG28ZLo6//8so0SFcUdC8VhECBxOWL
 9WkoL2GxRV0LoRnzCZS30EpAi/eQU+QlTrPueGp+n8GjauJMDPoxJ2l6UXox6CLm
 8CqwxKATG6WbrdcGhaVbIxVbAWC7Ze271C/7L61R5K+RmDTXc6jI4vAIw1Pib4o+
 CG6XtxHya5PM0zvyLgU28M6aY+WExbwnkSQKvI2FJZkOVG0xdFCy2O1QLLRBChmn
 a6hsA05a
 =8nZ2
 -----END PGP SIGNATURE-----

Merge tag '5.13-rc4-smb3' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs fixes from Steve French:
 "Three SMB3 fixes.

  Two for stable, and the other fixes a problem pointed out with a
  recently added ioctl"

* tag '5.13-rc4-smb3' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: change format of CIFS_FULL_KEY_DUMP ioctl
  cifs: fix string declarations and assignments in tracepoints
  cifs: set server->cipher_type to AES-128-CCM for SMB3.0
2021-05-28 14:15:47 -10:00
Linus Torvalds
5ff2756afd NFS client bugfixes for Linux 5.13
Highlights include:
 
 Stable fixes
 - Fix v4.0/v4.1 SEEK_DATA return -ENOTSUPP when set NFS_V4_2 config
 - Fix Oops in xs_tcp_send_request() when transport is disconnected
 - Fix a NULL pointer dereference in pnfs_mark_matching_lsegs_return()
 
 Bugfixes
 - Fix instances where signal_pending() should be fatal_signal_pending()
 - fix an incorrect limit in filelayout_decode_layout()
 - Fixes for the SUNRPC backlogged RPC queue
 - Don't corrupt the value of pg_bytes_written in nfs_do_recoalesce()
 - Revert commit 586a0787ce35 ("Clean up rpcrdma_prepare_readch()")
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEESQctxSBg8JpV8KqEZwvnipYKAPIFAmCvomgACgkQZwvnipYK
 APLg6xAAqlR/HLNYOLAToQ6d9wzrL6Po3x8Lx7VURjCkFaoKB/jMq3Zbu/K8mZ+X
 CC6/XFtB9AikloK7sle6lRPuwwPL6y+vOML0Ais/dkYPNkbhe9ylf0rsYQiPljXT
 8PAcqn8FXTZ9fKpKU8Quw24X1Jfkk6zUEeMy50HYDBfTx+gYEojMEKa6cl4URGzO
 2JpuBO4Ku/vWDOPj7bWBX9wi7wkrJjGSYDnx1A5SOgUdV87H8VJkbTo9vVdEwFoE
 OtE8MQmFhdton0u9+MKImFQdVfxoYLB1Ig1G45NXGHee91dwfYU0U05THj7E/xP9
 RQWtmJcKdvY1w8sRK/PNEHo43Vkow4usffSrIWNBZ6aO5EkbQFn1tmKMSDtsrkZ2
 ONMfKBiEhhQSy+QRXMR/RC86t4dsQ8SApu62qQT4VuuXqzYhrBum2DqkW0X6Zcti
 gi17+PfjRbgWNvul2yegBvDU016H324aCeT9nfWe0D9iwF7tPK4xsuNTYrWwbFOA
 YFAecIXoyBRtbIV6NZ95/+P5HEFBLAYewEVLpdAOBGQ9fjO023ERiC2sitl5P+ku
 v6V+4HAtBgcPfm/8BZwUYUYBpXnnTZFTizqdJdGGydPXeC671gANPJe6e4xOttCK
 frXFGd9OOqPSXdsRZLUVvhczTOFOGa/UVVG0GxIr4ggy8oKjDMk=
 =66ks
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-5.13-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:
"Stable fixes:
   - Fix v4.0/v4.1 SEEK_DATA return -ENOTSUPP when set NFS_V4_2 config
   - Fix Oops in xs_tcp_send_request() when transport is disconnected
   - Fix a NULL pointer dereference in pnfs_mark_matching_lsegs_return()

  Bugfixes:
   - Fix instances where signal_pending() should be fatal_signal_pending()
   - fix an incorrect limit in filelayout_decode_layout()
   - Fixes for the SUNRPC backlogged RPC queue
   - Don't corrupt the value of pg_bytes_written in nfs_do_recoalesce()
   - Revert commit 586a0787ce35 ("Clean up rpcrdma_prepare_readch()")"

* tag 'nfs-for-5.13-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  nfs: Remove trailing semicolon in macros
  xprtrdma: Revert 586a0787ce35
  NFSv4: Fix v4.0/v4.1 SEEK_DATA return -ENOTSUPP when set NFS_V4_2 config
  NFS: Clean up reset of the mirror accounting variables
  NFS: Don't corrupt the value of pg_bytes_written in nfs_do_recoalesce()
  NFS: Fix an Oopsable condition in __nfs_pageio_add_request()
  SUNRPC: More fixes for backlog congestion
  SUNRPC: Fix Oops in xs_tcp_send_request() when transport is disconnected
  NFSv4: Fix a NULL pointer dereference in pnfs_mark_matching_lsegs_return()
  SUNRPC in case of backlog, hand free slots directly to waiting task
  pNFS/NFSv4: Remove redundant initialization of 'rd_size'
  NFS: fix an incorrect limit in filelayout_decode_layout()
  fs/nfs: Use fatal_signal_pending instead of signal_pending
2021-05-28 08:53:19 -10:00
Aurelien Aptel
1bb5681067 cifs: change format of CIFS_FULL_KEY_DUMP ioctl
Make CIFS_FULL_KEY_DUMP ioctl able to return variable-length keys.

* userspace needs to pass the struct size along with optional
  session_id and some space at the end to store keys
* if there is enough space kernel returns keys in the extra space and
  sets the length of each key via xyz_key_length fields

This also fixes the build error for get_user() on ARM.

Sample program:

	#include <stdlib.h>
	#include <stdio.h>
	#include <stdint.h>
	#include <sys/fcntl.h>
	#include <sys/ioctl.h>

	struct smb3_full_key_debug_info {
	        uint32_t   in_size;
	        uint64_t   session_id;
	        uint16_t   cipher_type;
	        uint8_t    session_key_length;
	        uint8_t    server_in_key_length;
	        uint8_t    server_out_key_length;
	        uint8_t    data[];
	        /*
	         * return this struct with the keys appended at the end:
	         * uint8_t session_key[session_key_length];
	         * uint8_t server_in_key[server_in_key_length];
	         * uint8_t server_out_key[server_out_key_length];
	         */
	} __attribute__((packed));

	#define CIFS_IOCTL_MAGIC 0xCF
	#define CIFS_DUMP_FULL_KEY _IOWR(CIFS_IOCTL_MAGIC, 10, struct smb3_full_key_debug_info)

	void dump(const void *p, size_t len) {
	        const char *hex = "0123456789ABCDEF";
	        const uint8_t *b = p;
	        for (int i = 0; i < len; i++)
	                printf("%c%c ", hex[(b[i]>>4)&0xf], hex[b[i]&0xf]);
	        putchar('\n');
	}

	int main(int argc, char **argv)
	{
	        struct smb3_full_key_debug_info *keys;
	        uint8_t buf[sizeof(*keys)+1024] = {0};
	        size_t off = 0;
	        int fd, rc;

	        keys = (struct smb3_full_key_debug_info *)&buf;
	        keys->in_size = sizeof(buf);

	        fd = open(argv[1], O_RDONLY);
	        if (fd < 0)
	                perror("open"), exit(1);

	        rc = ioctl(fd, CIFS_DUMP_FULL_KEY, keys);
	        if (rc < 0)
	                perror("ioctl"), exit(1);

	        printf("SessionId      ");
	        dump(&keys->session_id, 8);
	        printf("Cipher         %04x\n", keys->cipher_type);

	        printf("SessionKey     ");
	        dump(keys->data+off, keys->session_key_length);
	        off += keys->session_key_length;

	        printf("ServerIn Key   ");
	        dump(keys->data+off, keys->server_in_key_length);
	        off += keys->server_in_key_length;

	        printf("ServerOut Key  ");
	        dump(keys->data+off, keys->server_out_key_length);

	        return 0;
	}

Usage:

	$ gcc -o dumpkeys dumpkeys.c

Against Windows Server 2020 preview (with AES-256-GCM support):

	# mount.cifs //$ip/test /mnt -o "username=administrator,password=foo,vers=3.0,seal"
	# ./dumpkeys /mnt/somefile
	SessionId      0D 00 00 00 00 0C 00 00
	Cipher         0002
	SessionKey     AB CD CC 0D E4 15 05 0C 6F 3C 92 90 19 F3 0D 25
	ServerIn Key   73 C6 6A C8 6B 08 CF A2 CB 8E A5 7D 10 D1 5B DC
	ServerOut Key  6D 7E 2B A1 71 9D D7 2B 94 7B BA C4 F0 A5 A4 F8
	# umount /mnt

	With 256 bit keys:

	# echo 1 > /sys/module/cifs/parameters/require_gcm_256
	# mount.cifs //$ip/test /mnt -o "username=administrator,password=foo,vers=3.11,seal"
	# ./dumpkeys /mnt/somefile
	SessionId      09 00 00 00 00 0C 00 00
	Cipher         0004
	SessionKey     93 F5 82 3B 2F B7 2A 50 0B B9 BA 26 FB 8C 8B 03
	ServerIn Key   6C 6A 89 B2 CB 7B 78 E8 04 93 37 DA 22 53 47 DF B3 2C 5F 02 26 70 43 DB 8D 33 7B DC 66 D3 75 A9
	ServerOut Key  04 11 AA D7 52 C7 A8 0F ED E3 93 3A 65 FE 03 AD 3F 63 03 01 2B C0 1B D7 D7 E5 52 19 7F CC 46 B4

Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-05-27 15:26:32 -05:00