IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
- Allow unsharing time namespace on vfork+exec (Andrei Vagin)
- Replace usage of deprecated kmap APIs (Fabio M. De Francesco)
- Fix spelling mistake (Zhang Jiaming)
-----BEGIN PGP SIGNATURE-----
iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmLoDyAWHGtlZXNjb29r
QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJh0mEACL07hj3eT3rWg6ohZx9sCTcAjY
/tG+zxLQ7xu717nM1a4j7CI5kdNNpYbsCqG71ikDDRrOCeEutu7M8zE1emctjtHv
oh853D6BKhV2Hvsiuk1oM2ZHR1bmgiW1eFNAJcCLz6rE6wYu564R0wYJV0h418fH
Rjk+Y989A7Srs9t/9GQSktjX3Q039/PG28avhA5q144/ZNycr5FnLFOf4RlmzEUz
7E8TfGsftX8eRAfxW/dPiWuIKMuYPLqspca9pT3aFj3ze2qKnldjNV3c9M5ajL5Q
q7KKWeWzunKyYHMaRzIxkHyhs396ZGKFN2PbcNYyml+NBItyc3fCHishMF7bW0Vb
nyZbmYJslBloYmrSJYgqCfxyjUuhe0cMMk9iMzDVp6ROwtLgFFLwfwunM6RwRmnr
dAmM8QGwSE3qYLhVnLEcRqpgdXzVd+S0TGhB5k5AyI3628/mLxhE66/eWq0X8QF5
los5zku1GagMkylt6SOGb3TME4JZe6ZdZpU4fe/ilM22qw852xgbF3+6Zap6IBbD
AdzXVCHyU/obORfIxx5KTF213m4KpkWBBi3N1/vVlxIAFAUy1WdXDM1o2RPMD7hw
DeHe8sgfTZxLmSqfWLuX+3qC94IvrbDPFaRCIMj1QNK0ltM8I9oHRPcUFyZMaV0O
xHN/5QtmgVDfKA3mTw==
=82SS
-----END PGP SIGNATURE-----
Merge tag 'execve-v5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull execve updates from Kees Cook:
- Allow unsharing time namespace on vfork+exec (Andrei Vagin)
- Replace usage of deprecated kmap APIs (Fabio M. De Francesco)
- Fix spelling mistake (Zhang Jiaming)
* tag 'execve-v5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
exec: Call kmap_local_page() in copy_string_kernel()
exec: Fix a spelling mistake
selftests/timens: add a test for vfork+exit
fs/exec: allow to unshare a time namespace on vfork+exec
- Migrate to modern acomp crypto interface (Ard Biesheuvel)
- Use better return type for "rcnt" (Dan Carpenter)
-----BEGIN PGP SIGNATURE-----
iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmLoDWAWHGtlZXNjb29r
QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJmSWD/9BKLN+f7kpYfGKk0q9nnceo1CU
q9uFHIW6eqFcInHTGIzc68iig1SbtQePjhrXafLez8ZKeX/L3yy/G4R4+vKj2XDA
Lkjd9Xrj1d1nhh/mu/SSkDlsIUXNkcxbpfQX7qUXPfNc7Wln7r+aWqHFT5cziDYf
G58TOlmn4WsdGWyhadsBxcPE9UA7eGsoqya+PD/bKgDCqp6ArhzZd3bFK/OLkSxF
xR2n26sS2Qfg7ApbTrBSyG4lJkuIxZILFF8a8CsmYj78xCGSrwPlOFRHW5oKHCdI
JEQ3CR/I3HofP1ajQCti07XT/dDbwzQaHkF/sjZmhAh9OnPXClGxYlczT+Cw9ffS
R6o/XWldz8KgG9m3K6j1NmkBdwF0zukznNRrEBaeAONAbquG36b8fcH9GTRWBRAS
x57bTHOCws9RLvjCGoxdEOYzs2Sm/23lUuF7nv0wRaDmlH+6eG6LU26ZBJpC1ple
xdvnJxUI6tU5ojSUbaqQ4jyel/H5NZTb0l32gAmh5nZy/2U7jlLyeGFg814Hz9Jy
ga8IWfumYezYMZfY5zTnmzhxF5mcLB34c/B5Qkb5SgER/WwnrTkRuJAXfbPE1j5y
h8360yT5HPg5WSUe+N7S78LFqMWscu4iEPMdXbnKbLLp0Ytzz6xTlKi++teOfVle
PI2xO97HN6oLA02IBw==
=jJ4i
-----END PGP SIGNATURE-----
Merge tag 'pstore-v5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull pstore updates from Kees Cook:
- Migrate to modern acomp crypto interface (Ard Biesheuvel)
- Use better return type for "rcnt" (Dan Carpenter)
* tag 'pstore-v5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
pstore/zone: cleanup "rcnt" type
pstore: migrate to crypto acomp interface
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmLko3gQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpmQaD/90NKFj4v8I456TUQyg1jimXEsL+e84E6o2
ALWVb6JzQvlPVQXNLnK5YKIunMWOTtTMz0nyB8sVRwVJVJO0P5d7QopAkZM8fkyU
MK5OCzoryENw4DTc2wJS4in6cSbGylIuN74wMzlf7+M67JTImfoZQhbTMcjwzZfn
b3OlL6sID7zMXwGcuOJPZyUJICCpDhzdSF9JXqKma5PQuG2SBmQyvFxJAcsoFBPc
YetnoRIOIN6yBvsIZaPaYq7XI9MIvF0e67EQtyCEHj4tHpyVnyDWkeObVFULsISU
gGEKbkYPvNUzRAU5Q1NBBHh1tTfkf/MaUxTuZwoEwZ/s04IGBGMmrZGyfvdfzYo6
M7NwSEg/TrUSNfTwn65mQi7uOXu1pGkJrqz84Flm8u9Qid9Vd7LExLG5p/ggnWdH
5th93MDEmtEg29e9DXpEAuS5d0t3TtSvosflaKpyfNNfr+P0rWCN6GM/uW62VUTK
ls69SQh/AQJRbg64jU4xper6WhaYtSXK7TKEnxJycoEn9gYNyCcdot2uekth0xRH
ChHGmRlteiqe/y4uFWn/2dcxWjoleiHbFjTaiRL75WVl8wIDEjw02LGuoZ61Ss9H
WOV+MT7KqNjBGe6lreUY+O/PO02dzmoR6heJXN19p8zr/pBuLCTGX7UpO7rzgaBR
4N1HEozvIw==
=celk
-----END PGP SIGNATURE-----
Merge tag 'for-5.20/block-2022-07-29' of git://git.kernel.dk/linux-block
Pull block updates from Jens Axboe:
- Improve the type checking of request flags (Bart)
- Ensure queue mapping for a single queues always picks the right queue
(Bart)
- Sanitize the io priority handling (Jan)
- rq-qos race fix (Jinke)
- Reserved tags handling improvements (John)
- Separate memory alignment from file/disk offset aligment for O_DIRECT
(Keith)
- Add new ublk driver, userspace block driver using io_uring for
communication with the userspace backend (Ming)
- Use try_cmpxchg() to cleanup the code in various spots (Uros)
- Finally remove bdevname() (Christoph)
- Clean up the zoned device handling (Christoph)
- Clean up independent access range support (Christoph)
- Clean up and improve block sysfs handling (Christoph)
- Clean up and improve teardown of block devices.
This turns the usual two step process into something that is simpler
to implement and handle in block drivers (Christoph)
- Clean up chunk size handling (Christoph)
- Misc cleanups and fixes (Bart, Bo, Dan, GuoYong, Jason, Keith, Liu,
Ming, Sebastian, Yang, Ying)
* tag 'for-5.20/block-2022-07-29' of git://git.kernel.dk/linux-block: (178 commits)
ublk_drv: fix double shift bug
ublk_drv: make sure that correct flags(features) returned to userspace
ublk_drv: fix error handling of ublk_add_dev
ublk_drv: fix lockdep warning
block: remove __blk_get_queue
block: call blk_mq_exit_queue from disk_release for never added disks
blk-mq: fix error handling in __blk_mq_alloc_disk
ublk: defer disk allocation
ublk: rewrite ublk_ctrl_get_queue_affinity to not rely on hctx->cpumask
ublk: fold __ublk_create_dev into ublk_ctrl_add_dev
ublk: cleanup ublk_ctrl_uring_cmd
ublk: simplify ublk_ch_open and ublk_ch_release
ublk: remove the empty open and release block device operations
ublk: remove UBLK_IO_F_PREFLUSH
ublk: add a MAINTAINERS entry
block: don't allow the same type rq_qos add more than once
mmc: fix disk/queue leak in case of adding disk failure
ublk_drv: fix an IS_ERR() vs NULL check
ublk: remove UBLK_IO_F_INTEGRITY
ublk_drv: remove unneeded semicolon
...
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmLkm7UQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpldTEADTg/96R+eq78UZBNZmdifY9/qwQD+kzNiK
ACDoYZFSbWUjMeOWqRxbYr6mXBKHnGHyTGlraTTpLDzhpB1xwoWfgOK9uOYXW/Ik
eWfgTujPW/8v/l/z86khE+GH9b/maGCRqNZgS6uLVLzhxG6oCkoYTyOh1iHaF1VM
Rma4nbJ8GSEDtiXNDl0Bznnyks/pzwoz/9slwzZ7PxtFwZsBxKuxgMUR5HIXdRp7
5iUoFJhZrGWyi/dbQZUsK/9VYVVnKkcBCz2pb4GEmC+3dS/vlPEoeWUpPHInNyd1
9NB9v8c+KFmQaWnCxuxcdHvCfmRRQrX8Pr8/OBNZKO6McYrKWKA+lurp4EGClE3m
cZdK+P/9FS/Eeua8hum9UnbPAqsJPqLTbpbrySeBdd4iFA6u7rRqDX2+nz3PNe9U
1b7V1bWBIEY/Rsw/PKo59oIeV0auD8v9OCHJ0lF2pv6dRln2/W0y1Qfd1DI18xFG
+9bBnQzhF7R0O8UP5ApVayQCYrd906YsSVUOqAiLmUs/BoOgRq6g/0BqSOVVKE2u
5iq8zTsVMkxY0ZpExwZST/700JwkPIV4SVPEYRC6QssFTcylvlisIek6XYSS9HX4
Z6gzMwJW1H47bEfG4JolTI8uBjp0hQLCPX0O0XFLVnbHQwN0kjIBmv3axAwJO2NV
qrrHXjf09w==
=hV7G
-----END PGP SIGNATURE-----
Merge tag 'for-5.20/io_uring-buffered-writes-2022-07-29' of git://git.kernel.dk/linux-block
Pull io_uring buffered writes support from Jens Axboe:
"This contains support for buffered writes, specifically for XFS. btrfs
is in progress, will be coming in the next release.
io_uring does support buffered writes on any file type, but since the
buffered write path just always -EAGAIN (or -EOPNOTSUPP) any attempt
to do so if IOCB_NOWAIT is set, any buffered write will effectively be
handled by io-wq offload. This isn't very efficient, and we even have
specific code in io-wq to serialize buffered writes to the same inode
to avoid further inefficiencies with thread offload.
This is particularly sad since most buffered writes don't block, they
simply copy data to a page and dirty it. With this pull request, we
can handle buffered writes a lot more effiently.
If balance_dirty_pages() needs to block, we back off on writes as
indicated.
This improves buffered write support by 2-3x.
Jan Kara helped with the mm bits for this, and Stefan handled the
fs/iomap/xfs/io_uring parts of it"
* tag 'for-5.20/io_uring-buffered-writes-2022-07-29' of git://git.kernel.dk/linux-block:
mm: honor FGP_NOWAIT for page cache page allocation
xfs: Add async buffered write support
xfs: Specify lockmode when calling xfs_ilock_for_iomap()
io_uring: Add tracepoint for short writes
io_uring: fix issue with io_write() not always undoing sb_start_write()
io_uring: Add support for async buffered writes
fs: Add async write file modification handling.
fs: Split off inode_needs_update_time and __file_update_time
fs: add __remove_file_privs() with flags parameter
fs: add a FMODE_BUF_WASYNC flags for f_mode
iomap: Return -EAGAIN from iomap_write_iter()
iomap: Add async buffered write support
iomap: Add flags parameter to iomap_page_create()
mm: Add balance_dirty_pages_ratelimited_flags() function
mm: Move updates of dirty_exceeded into one place
mm: Move starting of background writeback into the main balancing loop
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmLkm5gQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpmKMD/4l3QIrLbjYIxlfrzQcHbmYuUkbQtj3SbZg
6ejbnGVhCs1P9DdXH8MgE2BxgpiXQE0CqOK7vbSoo5ep2n2UTLI2DIxAl74SMIo7
0wmJXtUJySuViKr3NYVHqlN180MkQYddBz0nGElhkQBPBCMhW8CrtPCeURr/YyHp
2RxSYBXiUx2gRyig+klnp6oPEqelcBZJUyNHdA9yVrgl/RhB/t2rKj7D++8ukQM3
Zuyh8WIkTeTfUz9hdGG7fuCEdZN4DlO2CCEc7uy0cKi6VRCKH4hYUCqClJ+/cfd2
43dUI2O7B6D1t/ObFh8AGIDXBDqVA6ePQohQU6gooRkfQiBPKkc9d0ts4yIhRqca
AjkzNM+0Eve3A01loJ8J84w8oZnvNpYEv5n8/sZVLWcyU3UIs0I88nC2OBiFtoRq
d77CtFLwOTo+r3STtAhnZOqez90rhS6BqKtqlUP346PCuFItl6/MbGtwdTbLYEFj
CVNIb2pERWSr2NxGv4lFyXaX/cRwruxojWH7yc3rRYjr4Ykevd1pe/fMGNiMAnKw
5em/3QU3qq0ZVcXLMihksKeHHFIQwGDRMuyuv/fktV10+yYXQ0t16WzkJT3aR8Xo
cqs0r8+6Jnj3uYcOMzj/FoLcpEPr21hnwAtzLto1mG1Wh4JRn/D7Nx5zqxPLxcW+
NiU6VihPOw==
=gxeV
-----END PGP SIGNATURE-----
Merge tag 'for-5.20/io_uring-2022-07-29' of git://git.kernel.dk/linux-block
Pull io_uring updates from Jens Axboe:
- As per (valid) complaint in the last merge window, fs/io_uring.c has
grown quite large these days. io_uring isn't really tied to fs
either, as it supports a wide variety of functionality outside of
that.
Move the code to io_uring/ and split it into files that either
implement a specific request type, and split some code into helpers
as well. The code is organized a lot better like this, and io_uring.c
is now < 4K LOC (me).
- Deprecate the epoll_ctl opcode. It'll still work, just trigger a
warning once if used. If we don't get any complaints on this, and I
don't expect any, then we can fully remove it in a future release
(me).
- Improve the cancel hash locking (Hao)
- kbuf cleanups (Hao)
- Efficiency improvements to the task_work handling (Dylan, Pavel)
- Provided buffer improvements (Dylan)
- Add support for recv/recvmsg multishot support. This is similar to
the accept (or poll) support for have for multishot, where a single
SQE can trigger everytime data is received. For applications that
expect to do more than a few receives on an instantiated socket, this
greatly improves efficiency (Dylan).
- Efficiency improvements for poll handling (Pavel)
- Poll cancelation improvements (Pavel)
- Allow specifiying a range for direct descriptor allocations (Pavel)
- Cleanup the cqe32 handling (Pavel)
- Move io_uring types to greatly cleanup the tracing (Pavel)
- Tons of great code cleanups and improvements (Pavel)
- Add a way to do sync cancelations rather than through the sqe -> cqe
interface, as that's a lot easier to use for some use cases (me).
- Add support to IORING_OP_MSG_RING for sending direct descriptors to a
different ring. This avoids the usually problematic SCM case, as we
disallow those. (me)
- Make the per-command alloc cache we use for apoll generic, place
limits on it, and use it for netmsg as well (me).
- Various cleanups (me, Michal, Gustavo, Uros)
* tag 'for-5.20/io_uring-2022-07-29' of git://git.kernel.dk/linux-block: (172 commits)
io_uring: ensure REQ_F_ISREG is set async offload
net: fix compat pointer in get_compat_msghdr()
io_uring: Don't require reinitable percpu_ref
io_uring: fix types in io_recvmsg_multishot_overflow
io_uring: Use atomic_long_try_cmpxchg in __io_account_mem
io_uring: support multishot in recvmsg
net: copy from user before calling __get_compat_msghdr
net: copy from user before calling __copy_msghdr
io_uring: support 0 length iov in buffer select in compat
io_uring: fix multishot ending when not polled
io_uring: add netmsg cache
io_uring: impose max limit on apoll cache
io_uring: add abstraction around apoll cache
io_uring: move apoll cache to poll.c
io_uring: consolidate hash_locked io-wq handling
io_uring: clear REQ_F_HASH_LOCKED on hash removal
io_uring: don't race double poll setting REQ_F_ASYNC_DATA
io_uring: don't miss setting REQ_F_DOUBLE_POLL
io_uring: disable multishot recvmsg
io_uring: only trace one of complete or overflow
...
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCYufiMwAKCRCRxhvAZXjc
os2iAQDr3tK9e2EUZDZ3Vgu3tvmTLKiU7W7f4U/ZAjJE5snBOwD+OqK8r1RdvXf8
TatkVFFNZYlINDN6JrS5yGSKBm1+RwE=
=8eZE
-----END PGP SIGNATURE-----
Merge tag 'fs.idmapped.overlay.acl.v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux
Pull acl updates from Christian Brauner:
"Last cycle we introduced support for mounting overlayfs on top of
idmapped mounts. While looking into additional testing we realized
that posix acls don't really work correctly with stacking filesystems
on top of idmapped layers.
We already knew what the fix were but it would require work that is
more suitable for the merge window so we turned off posix acls for
v5.19 for overlayfs on top of idmapped layers with Miklos routing my
patch upstream in 72a8e05d4f ("Merge tag 'ovl-fixes-5.19-rc7' [..]").
This contains the work to support posix acls for overlayfs on top of
idmapped layers. Since the posix acl fixes should use the new
vfs{g,u}id_t work the associated branch has been merged in. (We sent a
pull request for this earlier.)
We've also pulled in Miklos pull request containing my patch to turn
of posix acls on top of idmapped layers. This allowed us to avoid
rebasing the branch which we didn't like because we were already at
rc7 by then. Merging it in allows this branch to first fix posix acls
and then to cleanly revert the temporary fix it brought in by commit
4a47c6385b ("ovl: turn of SB_POSIXACL with idmapped layers
temporarily").
The last patch in this series adds Seth Forshee as a co-maintainer for
idmapped mounts. Seth has been integral to all of this work and is
also the main architect behind the filesystem idmapping work which
ultimately made filesystems such as FUSE and overlayfs available in
containers. He continues to be active in both development and review.
I'm very happy he decided to help and he has my full trust. This
increases the bus factor which is always great for work like this. I'm
honestly very excited about this because I think in general we don't
do great in the bringing on new maintainers department"
For more explanations of the ACL issues, see
https://lore.kernel.org/all/20220801145520.1532837-1-brauner@kernel.org/
* tag 'fs.idmapped.overlay.acl.v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
Add Seth Forshee as co-maintainer for idmapped mounts
Revert "ovl: turn of SB_POSIXACL with idmapped layers temporarily"
ovl: handle idmappings in ovl_get_acl()
acl: make posix_acl_clone() available to overlayfs
acl: port to vfs{g,u}id_t
acl: move idmapped mount fixup into vfs_{g,s}etxattr()
mnt_idmapping: add vfs[g,u]id_into_k[g,u]id()
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCYufP6AAKCRCRxhvAZXjc
omzRAQCGJ11r7T0C7t1kTdQiFSs5XN9ksFa86Hfj3dHEBIj+LQEA+bZ2/LLpElDz
zPekgXkFQqdMr+FUL8sk94dzHT0GAgk=
=BcK/
-----END PGP SIGNATURE-----
Merge tag 'fs.idmapped.vfsuid.v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux
Pull fs idmapping updates from Christian Brauner:
"This introduces the new vfs{g,u}id_t types we agreed on. Similar to
k{g,u}id_t the new types are just simple wrapper structs around
regular {g,u}id_t types.
They allow to establish a type safety boundary in the VFS for idmapped
mounts preventing confusion betwen {g,u}ids mapped into an idmapped
mount and {g,u}ids mapped into the caller's or the filesystem's
idmapping.
An initial set of helpers is introduced that allows to operate on
vfs{g,u}id_t types. We will remove all references to non-type safe
idmapped mounts helpers in the very near future. The patches do
already exist.
This converts the core attribute changing codepaths which become
significantly easier to reason about because of this change.
Just a few highlights here as the patches give detailed overviews of
what is happening in the commit messages:
- The kernel internal struct iattr contains type safe vfs{g,u}id_t
values clearly communicating that these values have to take a given
mount's idmapping into account.
- The ownership values placed in struct iattr to change ownership are
identical for idmapped and non-idmapped mounts going forward. This
also allows to simplify stacking filesystems such as overlayfs that
change attributes In other words, they always represent the values.
- Instead of open coding checks for whether ownership changes have
been requested and an actual update of the inode is required we now
have small static inline wrappers that abstract this logic away
removing a lot of code duplication from individual filesystems that
all open-coded the same checks"
* tag 'fs.idmapped.vfsuid.v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
mnt_idmapping: align kernel doc and parameter order
mnt_idmapping: use new helpers in mapped_fs{g,u}id()
fs: port HAS_UNMAPPED_ID() to vfs{g,u}id_t
mnt_idmapping: return false when comparing two invalid ids
attr: fix kernel doc
attr: port attribute changes to new types
security: pass down mount idmapping to setattr hook
quota: port quota helpers mount ids
fs: port to iattr ownership update helpers
fs: introduce tiny iattr ownership update helpers
fs: use mount types in iattr
fs: add two type safe mapping helpers
mnt_idmapping: add vfs{g,u}id_t
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEES8DXskRxsqGE6vXTAA5oQRlWghUFAmLnshQTHGpsYXl0b25A
a2VybmVsLm9yZwAKCRAADmhBGVaCFa1ZEACWzjP9gDRO+b5HuovRofO5gCfi0LNK
jQAnUQmFBbV28MuRBr8lzjZFsn52C3nEz/unHpl2NXrg1dErdXmTZIUYkZIoESQl
0hyA2lhdm/pvqfj5t9xwt9lK9xts7G+Q1Q2JsT53QlpGd7q9VOq0CFrFTuIe+HmZ
qw9Sy/3rfP/rPALv1OzIlGDdBuslfPuijJJZq0wYx4WupA6vlGGSZXn+LxF2dHW9
Ex/Z+n6o5mzEuPedopBBsCvdMTO2/sVmz33puqM0KBb/gmL47i15o1XXdg1O0cbL
7LxIDOfaIm6gFsznUwrJV54WrL8zISQd/BhXbQOrbE8kmnNii1kfIyJHYx55Sa4X
y6TmqVbYERXIwCFquO78Uywt8UgjRjuxG8SRe0AmqsxvIn/IxTjqMn5yaMURCTxA
uyOmXHxLss3Jf2LNfd6nnrK5qKpOnPOBAn8I/4UY+eNdJGqesLyKoVPZ9O6K1dr3
+jZJ8Ju4TVs7L3fljq6pHvbhAWivM3JEZmYrv+y8QKSRZBV0XqHagwDGHUaaLe9H
6eHgU5yxCb+fj8EXbwxzKnJkhHXJikd4bbPOaJC+QZEKPCJJMo/pyXmDkCVwhJ73
pjO4W0w6TGmCHinlVX6dkyYrCvWYjWglHyO5BnTY2F0Ub87/59KmepZz4dh81hi+
ZdOIvHoF6uca3A==
=wDtt
-----END PGP SIGNATURE-----
Merge tag 'filelock-v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux
Pull file locking updates from Jeff Layton:
"Just a couple of flock() patches from Kuniyuki Iwashima.
The main change is that this moves a file_lock allocation from the
slab to the stack"
* tag 'filelock-v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux:
fs/lock: Rearrange ops in flock syscall.
fs/lock: Don't allocate file_lock in flock_make_lock().
- Add Yue Hu and Jeffle Xu as reviewers;
- Add the missing wake_up when updating lzma streams;
- Avoid consecutive detection for Highmem memory;
- Prepare for multi-reference pclusters and get rid of PG_error;
- Fix ctx->pos update for NFS export;
- minor cleanups.
-----BEGIN PGP SIGNATURE-----
iIcEABYIAC8WIQThPAmQN9sSA0DVxtI5NzHcH7XmBAUCYub3chEceGlhbmdAa2Vy
bmVsLm9yZwAKCRA5NzHcH7XmBMMSAP9P7kMPLuc0RP9AjoiQXKNAfWqIbGnbkI5C
ACUUu5tZEgD/T7HhkDYIs/wAZzYB7qTkpepkY/XzuwnlodhaSTwnPQ8=
=/vU1
-----END PGP SIGNATURE-----
Merge tag 'erofs-for-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs
Pull erofs updates from Gao Xiang:
"First of all, we'd like to add Yue Hu and Jeffle Xu as two new
reviewers. Thank them for spending time working on EROFS!
There is no major feature outstanding in this cycle, mainly a patchset
I worked on to prepare for rolling hash deduplication and folios for
compressed data as the next big features. It kills the unneeded
PG_error flag dependency as well.
Apart from that, there are bugfixes and cleanups as always. Details
are listed below:
- Add Yue Hu and Jeffle Xu as reviewers
- Add the missing wake_up when updating lzma streams
- Avoid consecutive detection for Highmem memory
- Prepare for multi-reference pclusters and get rid of PG_error
- Fix ctx->pos update for NFS export
- minor cleanups"
* tag 'erofs-for-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs: (23 commits)
erofs: update ctx->pos for every emitted dirent
erofs: get rid of the leftover PAGE_SIZE in dir.c
erofs: get rid of erofs_prepare_dio() helper
erofs: introduce multi-reference pclusters (fully-referenced)
erofs: record the longest decompressed size in this round
erofs: introduce z_erofs_do_decompressed_bvec()
erofs: try to leave (de)compressed_pages on stack if possible
erofs: introduce struct z_erofs_decompress_backend
erofs: get rid of `z_pagemap_global'
erofs: clean up `enum z_erofs_collectmode'
erofs: get rid of `enum z_erofs_page_type'
erofs: rework online page handling
erofs: switch compressed_pages[] to bufvec
erofs: introduce `z_erofs_parse_in_bvecs'
erofs: drop the old pagevec approach
erofs: introduce bufvec to store decompressed buffers
erofs: introduce `z_erofs_parse_out_bvecs()'
erofs: clean up z_erofs_collector_begin()
erofs: get rid of unneeded `inode', `map' and `sb'
erofs: avoid consecutive detection for Highmem memory
...
Pull fsnotify updates from Jan Kara:
- support for FAN_MARK_IGNORE which untangles some of the not well
defined corner cases with fanotify ignore masks
- small cleanups
* tag 'fsnotify_for_v5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
fsnotify: Fix comment typo
fanotify: introduce FAN_MARK_IGNORE
fanotify: cleanups for fanotify_mark() input validations
fanotify: prepare for setting event flags in ignore mask
fs: inotify: Fix typo in inotify comment
Pull ext2 and reiserfs updates from Jan Kara:
"A fix for ext2 handling of a corrupted fs image and cleanups in ext2
and reiserfs"
* tag 'fs_for_v5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
ext2: Add more validity checks for inode counts
fs/reiserfs/inode: remove dead code in _get_block_create_0()
fs/ext2: replace ternary operator with min_t()
Changes in this set of commits:
. Delay the cleanup of interrupted posix lock requests until the
user space result arrives. Previously, the immediate cleanup
would lead to extraneous warnings when the result arrived.
. Tracepoint improvements, e.g. adding the lock resource name.
. Delay the completion of lockspace creation until one full recovery
cycle has completed. This allows more error cases to be returned to
the caller.
. Remove warnings from the locking layer about delayed network replies.
The recently added midcomms warnings are much more useful.
. Begin the process of deprecating two unused lock-timeout-related
features. These features now require enabling via a Kconfig option,
and enabling them triggers deprecation warnings. We expect to
remove the code in v6.2.
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJi5+RGAAoJEDgbc8f8gGmq1RwP/2xZaVKiTPYcQ0GfcmefCnnG
8WxpMNv4ZPkjKVv7csBA8mcQyNuQqA4yLb3P+jEgkWDOKesJQNeTvfrittXCyfhG
C7uvbUe3OCg9m+dIrzNKBu+2WtFu6tKa3aSlpPUF3Bhhe8IhwRmAkyd/Ky0VCGr9
5jQWvy8D1p2pNoFsGKqhkfolqovmeTxgYtGxd/eHtiApo6tNwzbgcQAZw4vquCjk
FSPO7s5HyINik0nQQ9b8MCjywmF6HG6UZjcd/qYHTUmcBZkgpegCKZRnYwQklnBD
6BYj6X+w7WxgVsHgYBAtgd8oLRN5CtCmPljvnPTCjvgx6N9FTl8RJV8rwMqZ9C8U
9+w7WosLxQFSyRm7KxHmKaatkOa3Baqg7cPXSwaZnsA3vBpitHWKs9cyDKwA0j3/
sUWZFw+3VSuf7AJkSA848tC8Xs8G6YXvZgzvxzNEvtTJgO3X7sXB2lavZDyI0S26
nwgXgs/Dt6QcOoQKGv8WgRSOMrFxtq/gX+f3gwPCHvM3panttPevXwKKQW2UtVOn
u/BF3Oe9bGhf+J0o58Zp3gjtfDIz+c3yPkxeQqAc3pC/o1Lw7AMV2WxlxULoBLsv
aErKwT0UemrQYRZnBmlGPaV4H1KyXzwC/fA1N8YAObJ/Ohe6x7oCKioWWMA4ggiD
A4mOIY95o24rm++lNUkD
=Dnn4
-----END PGP SIGNATURE-----
Merge tag 'dlm-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm
Pull dlm updates from David Teigland:
- Delay the cleanup of interrupted posix lock requests until the user
space result arrives. Previously, the immediate cleanup would lead to
extraneous warnings when the result arrived.
- Tracepoint improvements, e.g. adding the lock resource name.
- Delay the completion of lockspace creation until one full recovery
cycle has completed. This allows more error cases to be returned to
the caller.
- Remove warnings from the locking layer about delayed network replies.
The recently added midcomms warnings are much more useful.
- Begin the process of deprecating two unused lock-timeout-related
features. These features now require enabling via a Kconfig option,
and enabling them triggers deprecation warnings. We expect to remove
the code in v6.2.
* tag 'dlm-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
fs: dlm: move kref_put assert for lkb structs
fs: dlm: don't use deprecated timeout features by default
fs: dlm: add deprecation Kconfig and warnings for timeouts
fs: dlm: remove timeout from dlm_user_adopt_orphan
fs: dlm: remove waiter warnings
fs: dlm: fix grammar in lowcomms output
fs: dlm: add comment about lkb IFL flags
fs: dlm: handle recovery result outside of ls_recover
fs: dlm: make new_lockspace() wait until recovery completes
fs: dlm: call dlm_lsop_recover_prep once
fs: dlm: update comments about recovery and membership handling
fs: dlm: add resource name to tracepoints
fs: dlm: remove additional dereference of lksb
fs: dlm: change ast and bast trace order
fs: dlm: change posix lock sigint handling
fs: dlm: use dlm_plock_info for do_unlock_close
fs: dlm: change plock interrupted message to debug again
fs: dlm: add pid to debug log
fs: dlm: plock use list_first_entry
The unhold_lkb() function decrements the lock's kref, and
asserts that the ref count was not the final one. Use the
kref_put release function (which should not be called) to
call the assert, rather than doing the assert based on the
kref_put return value. Using kill_lkb() as the release
function doesn't make sense if we only want to assert.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
This patch will disable use of deprecated timeout features if
CONFIG_DLM_DEPRECATED_API is not set. The deprecated features
will be removed in upcoming kernel release v6.2.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
This patch adds a CONFIG_DLM_DEPRECATED_API Kconfig option
that must be enabled to use two timeout-related features
that we intend to remove in kernel v6.2. Warnings are
printed if either is enabled and used. Neither has ever
been used as far as we know.
. The DLM_LSFL_TIMEWARN lockspace creation flag will be
removed, along with the associated configfs entry for
setting the timeout. Setting the flag and configfs file
would cause dlm to track how long locks were waiting
for reply messages. After a timeout, a kernel message
would be logged, and a netlink message would be sent
to userspace. Recently, midcomms messages have been
added that produce much better logging about actual
problems with messages. No use has ever been found
for the netlink messages.
. The userspace libdlm API has allowed the DLM_LKF_TIMEOUT
flag with a timeout value to be set in lock requests.
The lock request would be cancelled after the timeout.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
erofs_readdir update ctx->pos after filling a batch of dentries
and it may cause dir/files duplication for NFS readdirplus which
depends on ctx->pos to fill dir correctly. So update ctx->pos for
every emitted dirent in erofs_fill_dentries to fix it.
Also fix the update of ctx->pos when the initial file position has
exceeded nameoff.
Fixes: 3e917cc305 ("erofs: make filesystem exportable")
Signed-off-by: Hongnan Li <hongnan.li@linux.alibaba.com>
Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Link: https://lore.kernel.org/r/20220722082732.30935-1-jefflexu@linux.alibaba.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
The use of kmap_atomic() is being deprecated in favor of kmap_local_page().
With kmap_local_page(), the mappings are per thread, CPU local and not
globally visible. Furthermore, the mappings can be acquired from any
context (including interrupts).
Therefore, replace kmap_atomic() with kmap_local_page() in
copy_string_kernel(). Instead of open-coding local mapping + memcpy(),
use memcpy_to_page(). Delete a redundant call to flush_dcache_page().
Tested with xfstests on a QEMU/ KVM x86_32 VM, 6GB RAM, booting a kernel
with HIGHMEM64GB enabled.
Suggested-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220724212523.13317-1-fmdefrancesco@gmail.com
issues or are too minor to warrant backporting
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCYuCV7gAKCRDdBJ7gKXxA
jrK2AQDeoayQKXJFTcEltKAUTooXM/BoRf+O3ti/xrSWpwta8wEAjaBIJ8e7UlCj
g+p6u/pd38f226ldzI5w3bIBSPCbnwU=
=3rO0
-----END PGP SIGNATURE-----
Merge tag 'mm-hotfixes-stable-2022-07-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"Thirteen hotfixes.
Eight are cc:stable and the remainder are for post-5.18 issues or are
too minor to warrant backporting"
* tag 'mm-hotfixes-stable-2022-07-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
mailmap: update Gao Xiang's email addresses
userfaultfd: provide properly masked address for huge-pages
Revert "ocfs2: mount shared volume without ha stack"
hugetlb: fix memoryleak in hugetlb_mcopy_atomic_pte
fs: sendfile handles O_NONBLOCK of out_fd
ntfs: fix use-after-free in ntfs_ucsncmp()
secretmem: fix unhandled fault in truncate
mm/hugetlb: separate path for hwpoison entry in copy_hugetlb_page_range()
mm: fix missing wake-up event for FSDAX pages
mm: fix page leak with multiple threads mapping the same page
mailmap: update Seth Forshee's email address
tmpfs: fix the issue that the mount and remount results are inconsistent.
mm: kfence: apply kmemleak_ignore_phys on early allocated pool
Commit 824ddc601a ("userfaultfd: provide unmasked address on
page-fault") was introduced to fix an old bug, in which the offset in the
address of a page-fault was masked. Concerns were raised - although were
never backed by actual code - that some userspace code might break because
the bug has been around for quite a while. To address these concerns a
new flag was introduced, and only when this flag is set by the user,
userfaultfd provides the exact address of the page-fault.
The commit however had a bug, and if the flag is unset, the offset was
always masked based on a base-page granularity. Yet, for huge-pages, the
behavior prior to the commit was that the address is masked to the
huge-page granulrity.
While there are no reports on real breakage, fix this issue. If the flag
is unset, use the address with the masking that was done before.
Link: https://lkml.kernel.org/r/20220711165906.2682-1-namit@vmware.com
Fixes: 824ddc601a ("userfaultfd: provide unmasked address on page-fault")
Signed-off-by: Nadav Amit <namit@vmware.com>
Reported-by: James Houghton <jthoughton@google.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: James Houghton <jthoughton@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Add checks verifying number of inodes stored in the superblock matches
the number computed from number of inodes per group. Also verify we have
at least one block worth of inodes per group. This prevents crashes on
corrupted filesystems.
Reported-by: syzbot+d273f7d7f58afd93be48@syzkaller.appspotmail.com
Signed-off-by: Jan Kara <jack@suse.cz>
Since commit 27b3a5c51b ("kill-the-bkl/reiserfs: drop the fs race
watchdog from _get_block_create_0()"), which removed a label that may
have the pointer 'p' touched in its control flow, related if statements
now eval to constant value now. Just remove them.
Assigning value NULL to p here
293 char *p = NULL;
In the following conditional expression, the value of p is always NULL,
As a result, the kunmap() cannot be executed.
308 if (p)
309 kunmap(bh_result->b_page);
355 if (p)
356 kunmap(bh_result->b_page);
366 if (p)
367 kunmap(bh_result->b_page);
Also, the kmap() cannot be executed.
399 if (!p)
400 p = (char *)kmap(bh_result->b_page);
[JK: Removed unnecessary initialization of 'p' to NULL]
Signed-off-by: Zeng Jingxiang <linuszeng@tencent.com>
Signed-off-by: Kairui Song <kasong@tencent.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220720083029.1065578-1-zengjx95@gmail.com
This adds the async buffered write support to XFS. For async buffered
write requests, the request will return -EAGAIN if the ilock cannot be
obtained immediately.
Signed-off-by: Stefan Roesch <shr@fb.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Link: https://lore.kernel.org/r/20220623175157.1715274-15-shr@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This patch changes the helper function xfs_ilock_for_iomap such that the
lock mode must be passed in.
Signed-off-by: Stefan Roesch <shr@fb.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Link: https://lore.kernel.org/r/20220623175157.1715274-14-shr@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This adds a file_modified_async() function to return -EAGAIN if the
request either requires to remove privileges or needs to update the file
modification time. This is required for async buffered writes, so the
request gets handled in the io worker of io-uring.
Signed-off-by: Stefan Roesch <shr@fb.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Link: https://lore.kernel.org/r/20220623175157.1715274-11-shr@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This splits off the functions inode_needs_update_time() and
__file_update_time() from the function file_update_time().
This is required to support async buffered writes.
No intended functional changes in this patch.
Signed-off-by: Stefan Roesch <shr@fb.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Link: https://lore.kernel.org/r/20220623175157.1715274-10-shr@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This adds the function __remove_file_privs, which allows the caller to
pass the kiocb flags parameter.
No intended functional changes in this patch.
Signed-off-by: Stefan Roesch <shr@fb.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Link: https://lore.kernel.org/r/20220623175157.1715274-9-shr@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This introduces the flag FMODE_BUF_WASYNC. If devices support async
buffered writes, this flag can be set. It also modifies the check in
generic_write_checks to take async buffered writes into consideration.
Signed-off-by: Stefan Roesch <shr@fb.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Link: https://lore.kernel.org/r/20220623175157.1715274-8-shr@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
If iomap_write_iter() encounters -EAGAIN, return -EAGAIN to the caller.
Signed-off-by: Stefan Roesch <shr@fb.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Link: https://lore.kernel.org/r/20220623175157.1715274-7-shr@fb.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
[axboe: make the suggested ternary edit]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This adds async buffered write support to iomap.
This replaces the call to balance_dirty_pages_ratelimited() with the
call to balance_dirty_pages_ratelimited_flags. This allows to specify if
the write request is async or not.
In addition this also moves the above function call to the beginning of
the function. If the function call is at the end of the function and the
decision is made to throttle writes, then there is no request that
io-uring can wait on. By moving it to the beginning of the function, the
write request is not issued, but returns -EAGAIN instead. io-uring will
punt the request and process it in the io-worker.
By moving the function call to the beginning of the function, the write
throttling will happen one page later.
Signed-off-by: Stefan Roesch <shr@fb.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Link: https://lore.kernel.org/r/20220623175157.1715274-6-shr@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Add the kiocb flags parameter to the function iomap_page_create().
Depending on the value of the flags parameter it enables different gfp
flags.
No intended functional changes in this patch.
Signed-off-by: Stefan Roesch <shr@fb.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Link: https://lore.kernel.org/r/20220623175157.1715274-5-shr@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
In preparation for splitting io_uring up a bit, move it into its own
top level directory. It didn't really belong in fs/ anyway, as it's
not a file system only API.
This adds io_uring/ and moves the core files in there, and updates the
MAINTAINERS file for the new location.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Rather than have two giant switches for doing request preparation and
then for doing request issue, add a prep and issue handler for each
of them in the io_op_defs[] request definition.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmLaEsMQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgprg0EADcHfSh4bM+37B1rHV3urn+Cmpei69/QUPo
XP9hLHFeNsokZC30JFaEN2oNiHic5oGOO9aoHSrWifnXGKScaN1ZmveGE5ctwA6B
n4mLKVjKNUAWzAG2MWFHCykxW2gXaLSfrMQNl/FIGrjnUwEVjS7b9BsORhSfeM86
AMl2he38Btr9n1PUw63RHf0tC/p0/nBg/dta95Bwu1cjhC/7gv3wBe739PKslo4c
oYimvLPobDkX60LcMXBsZbu9pwT7UR+WKcgOYtgqX9/Dq1G3KtO/mN+cOKJiTwAz
yStCQepnDqYHS9ltvBeh2a6BEtl09YFGMK8WkAVUPo21+AdmzZtAxAZtsuK9erSL
w9i3Z524rLR+PNjo0oW4QGIaLLPrQt152Q+KhCGtD69qdxf/BecO6jOiNMauWZjp
LWJW7I9bNQI8rxiSYIA4mgS8oS/8GXae2N+UYsjaMSXtJY1GMtO+ldI05y01rcJp
8/0MIqC3uhR5uFZ6VFLVGnzZuy0Tsw4KfN/Z/JMzTM5iWoCZvNXHgNTAnNbzLP6u
M3yUGlO3qEhg9UerqNSRi+rOyVNScXL6joP8nKjy+fSxOtS9uYegBKlL4fjrDAQr
fYuYnq+r9dk/Kqm6uQSuowZj8+nEwEEPbYErTLQT1GRO9A+/rIJ5a34FzKeRYWPU
bJAuWCu+BQ==
=lo96
-----END PGP SIGNATURE-----
Merge tag 'io_uring-5.19-2022-07-21' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
"Fix for a bad kfree() introduced in this cycle, and a quick fix for
disabling buffer recycling for IORING_OP_READV.
The latter will get reworked for 5.20, but it gets the job done for
5.19"
* tag 'io_uring-5.19-2022-07-21' of git://git.kernel.dk/linux-block:
io_uring: do not recycle buffer in READV
io_uring: fix free of unallocated buffer list
Let's introduce multi-reference pclusters at runtime. In details,
if one pcluster is requested by multiple extents at almost the same
time (even belong to different files), the longest extent will be
decompressed as representative and the other extents are actually
copied from the longest one in one round.
After this patch, fully-referenced extents can be correctly handled
and the full decoding check needs to be bypassed for
partial-referenced extents.
Acked-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20220715154203.48093-17-hsiangkao@linux.alibaba.com
Currently, `pcl->length' records the longest decompressed length
as long as the pcluster itself isn't reclaimed. However, such
number is unneeded for the general cases since it doesn't indicate
the exact decompressed size in this round.
Instead, let's record the decompressed size for this round instead,
thus `pcl->nr_pages' can be completely dropped and pageofs_out is
also designed to be kept in sync with `pcl->length'.
Acked-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20220715154203.48093-16-hsiangkao@linux.alibaba.com
For the most cases, small pclusters can be decompressed with page
arrays on stack.
Try to leave both (de)compressed_pages on stack if possible as before.
Acked-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20220715154203.48093-14-hsiangkao@linux.alibaba.com
Let's introduce struct z_erofs_decompress_backend in order to pass
on the decompression backend context between helper functions more
easier and avoid too many arguments.
Acked-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20220715154203.48093-13-hsiangkao@linux.alibaba.com
In order to introduce multi-reference pclusters for compressed data
deduplication, let's get rid of the global page array for now since
it needs to be re-designed then at least.
Acked-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20220715154203.48093-12-hsiangkao@linux.alibaba.com
`enum z_erofs_collectmode' is really ambiguous, but I'm not quite
sure if there are better naming, basically it's used to judge whether
inplace I/O can be used due to the current status of pclusters in
the chain.
Rename it as `enum z_erofs_pclustermode' instead.
Acked-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20220715154203.48093-11-hsiangkao@linux.alibaba.com
Since all decompressed offsets have been integrated to bvecs[], this
patch avoids all sub-indexes so that page->private only includes a
part count and an eio flag, thus in the future folio->private can have
the same meaning.
In addition, PG_error will not be used anymore after this patch and
we're heading to use page->private (later folio->private) and
page->mapping (later folio->mapping) only.
Acked-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20220715154203.48093-9-hsiangkao@linux.alibaba.com
Convert compressed_pages[] to bufvec in order to avoid using
page->private to keep onlinepage_index (decompressed offset)
for inplace I/O pages.
In the future, we only rely on folio->private to keep a countdown
to unlock folios and set folio_uptodate.
Acked-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20220715154203.48093-8-hsiangkao@linux.alibaba.com
`z_erofs_decompress_pcluster()' is too long therefore it'd be better
to introduce another helper to parse compressed pages (or laterly,
compressed bvecs.)
BTW, since `compressed_bvecs' is too long as a part of the function
name, `in_bvecs' is used here instead.
Acked-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20220715154203.48093-7-hsiangkao@linux.alibaba.com
Remove the old pagevec approach but keep z_erofs_page_type for now.
It will be reworked in the following commits as well.
Also rename Z_EROFS_NR_INLINE_PAGEVECS as Z_EROFS_INLINE_BVECS with
the new value 2 since it's actually enough to bootstrap.
Acked-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20220715154203.48093-6-hsiangkao@linux.alibaba.com
For each pcluster, the total compressed buffers are determined in
advance, yet the number of decompressed buffers actually vary. Too
many decompressed pages can be recorded if one pcluster is highly
compressed or its pcluster size is large. That takes extra memory
footprints compared to uncompressed filesystems, especially a lot of
I/O in flight on low-ended devices.
Therefore, similar to inplace I/O, pagevec was introduced to reuse
page cache to store these pointers in the time-sharing way since
these pages are actually unused before decompressing.
In order to make it more flexable, a cleaner bufvec is used to
replace the old pagevec stuffs so that
- Decompressed offsets can be stored inline, thus it can be used
for the upcoming feature like compressed data deduplication.
It's calculated by `page_offset(page) - map->m_la';
- Towards supporting large folios for compressed inodes since
our final goal is to completely avoid page->private but use
folio->private only for all page cache pages.
Acked-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20220715154203.48093-5-hsiangkao@linux.alibaba.com
`z_erofs_decompress_pcluster()' is too long therefore it'd be better
to introduce another helper to parse decompressed pages (or laterly,
decompressed bvecs.)
BTW, since `decompressed_bvecs' is too long as a part of the function
name, `out_bvecs' is used instead.
Reviewed-by: Yue Hu <huyue2@coolpad.com>
Acked-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20220715154203.48093-4-hsiangkao@linux.alibaba.com