1252748 Commits

Author SHA1 Message Date
Serge Semin
361dd531a1 net: pcs: xpcs: Explicitly return error on caps validation
If an unsupported interface is passed to the PCS validation callback there
is no need in further link-modes calculations since the resultant array
will be initialized with zeros which will be perceived by the phylink
subsystem as error anyway (see phylink_validate_mac_and_pcs()). Instead
let's explicitly return the -EINVAL error to inform the caller about the
unsupported interface as it's done in the rest of the pcs_validate
callbacks.

Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-26 13:09:09 +00:00
Serge Semin
f5151005d3 net: pcs: xpcs: Return EINVAL in the internal methods
In particular the xpcs_soft_reset() and xpcs_do_config() functions
currently return -1 if invalid auto-negotiation mode is specified. That
value might be then passed to the generic kernel subsystems which require
a standard kernel errno value. Even though the erroneous conditions are
very specific (memory corruption or buggy driver implementation) using a
hard-coded -1 literal doesn't seem correct anyway especially when it comes
to passing it higher to the network subsystem or printing to the system
log.  Convert the hard-coded error values to -EINVAL then.

Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Tested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-26 13:09:08 +00:00
Serge Semin
e26802ebd2 net: pcs: xpcs: Drop redundant workqueue.h include directive
There is nothing CM workqueue-related in the driver. So the respective
include directive can be dropped.

While at it add an empty line delimiter between the generic and local path
include directives to visually separate them.

Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Tested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-26 13:09:08 +00:00
Serge Semin
0ffc3292c0 net: pcs: xpcs: Drop sentinel entry from 2500basex ifaces list
There are currently only two methods (xpcs_find_compat() and
xpcs_get_interfaces()) defined in the driver which loop over the available
interfaces. All of them rely on the xpcs_compat::num_interfaces field
value to get the total number of supported interfaces. Thus the interface
arrays are supposed to be filled with actual interface IDs and there is no
need in the dummy terminating ID placed at the end of the arrays.

Based on the above drop the PHY_INTERFACE_MODE_MAX entry from the
xpcs_2500basex_interfaces array and the PHY_INTERFACE_MODE_MAX-based
conditional statement from the xpcs_get_interfaces() method as redundant.

Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-26 13:09:08 +00:00
Eric Dumazet
10bfd453da ipv6: fix potential "struct net" leak in inet6_rtm_getaddr()
It seems that if userspace provides a correct IFA_TARGET_NETNSID value
but no IFA_ADDRESS and IFA_LOCAL attributes, inet6_rtm_getaddr()
returns -EINVAL with an elevated "struct net" refcount.

Fixes: 6ecf4c37eb3e ("ipv6: enable IFA_TARGET_NETNSID for RTM_GETADDR")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Ahern <dsahern@kernel.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-26 11:56:23 +00:00
David S. Miller
5fc3903c46 Merge branch 'rtnetlink-reduce-rtnl-pressure'
Eric Dumazet says:

====================
rtnetlink: reduce RTNL pressure for dumps

This series restarts the conversion of rtnl dump operations
to RCU protection, instead of requiring RTNL.

In this new attempt (prior one failed in 2011), I chose to
allow a gradual conversion of selected operations.

After this series, "ip -6 addr" and "ip -4 ro" no longer
need to acquire RTNL.

I refrained from changing inet_dump_ifaddr() and inet6_dump_addr()
to avoid merge conflicts because of two fixes in net tree.

I also started the work for "ip link" future conversion.

v2: rtnl_fill_link_ifmap() always emit IFLA_MAP (Jiri Pirko)
    Added "nexthop: allow nexthop_mpath_fill_node()
           to be called without RTNL" to avoid a lockdep splat (Ido Schimmel)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-26 11:46:13 +00:00
Eric Dumazet
0ec4e48c3a rtnetlink: provide RCU protection to rtnl_fill_prop_list()
We want to be able to run rtnl_fill_ifinfo() under RCU protection
instead of RTNL in the future.

dev->name_node items are already rcu protected.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-26 11:46:13 +00:00
Eric Dumazet
74808e72e0 rtnetlink: make rtnl_fill_link_ifmap() RCU ready
Use READ_ONCE() to read the following device fields:

	dev->mem_start
	dev->mem_end
	dev->base_addr
	dev->irq
	dev->dma
	dev->if_port

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-26 11:46:13 +00:00
Eric Dumazet
4ce5dc9316 inet: switch inet_dump_fib() to RCU protection
No longer hold RTNL while calling inet_dump_fib().

Also change return value for a completed dump:

Returning 0 instead of skb->len allows NLMSG_DONE
to be appended to the skb. User space does not have
to call us again to get a standalone NLMSG_DONE marker.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-26 11:46:13 +00:00
Eric Dumazet
0ac3fa0c3b nexthop: allow nexthop_mpath_fill_node() to be called without RTNL
nexthop_mpath_fill_node() will be potentially called
from contexts holding rcu_lock instead of RTNL.

Suggested-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/all/ZdZDWVdjMaQkXBgW@shredder/
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-26 11:46:12 +00:00
Eric Dumazet
22e36ea9f5 inet: allow ip_valid_fib_dump_req() to be called with RTNL or RCU
Add a new field into struct fib_dump_filter, to let callers
tell if they use RTNL locking or RCU.

This is used in the following patch, when inet_dump_fib()
no longer holds RTNL.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-26 11:46:12 +00:00
Eric Dumazet
69fdb7e411 ipv6: switch inet6_dump_ifinfo() to RCU protection
No longer hold RTNL while calling inet6_dump_ifinfo()

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-26 11:46:12 +00:00
Eric Dumazet
386520e0ec rtnetlink: add RTNL_FLAG_DUMP_UNLOCKED flag
Similarly to RTNL_FLAG_DOIT_UNLOCKED, this new flag
allows dump operations registered via rtnl_register()
or rtnl_register_module() to opt-out from RTNL protection.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-26 11:46:12 +00:00
Eric Dumazet
e39951d965 rtnetlink: change nlk->cb_mutex role
In commit af65bdfce98d ("[NETLINK]: Switch cb_lock spinlock
to mutex and allow to override it"), Patrick McHardy used
a common mutex to protect both nlk->cb and the dump() operations.

The override is used for rtnl dumps, registered with
rntl_register() and rntl_register_module().

We want to be able to opt-out some dump() operations
to not acquire RTNL, so we need to protect nlk->cb
with a per socket mutex.

This patch renames nlk->cb_def_mutex to nlk->nl_cb_mutex

The optional pointer to the mutex used to protect dump()
call is stored in nlk->dump_cb_mutex

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-26 11:46:12 +00:00
Eric Dumazet
b559027006 netlink: hold nlk->cb_mutex longer in __netlink_dump_start()
__netlink_dump_start() releases nlk->cb_mutex right before
calling netlink_dump() which grabs it again.

This seems dangerous, even if KASAN did not bother yet.

Add a @lock_taken parameter to netlink_dump() to let it
grab the mutex if called from netlink_recvmsg() only.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-26 11:46:12 +00:00
Eric Dumazet
6647b338fc netlink: fix netlink_diag_dump() return value
__netlink_diag_dump() returns 1 if the dump is not complete,
zero if no error occurred.

If err variable is zero, this means the dump is complete:
We should not return skb->len in this case, but 0.

This allows NLMSG_DONE to be appended to the skb.
User space does not have to call us again only to get NLMSG_DONE.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-26 11:46:12 +00:00
Eric Dumazet
ac14ad9755 ipv6: use xarray iterator to implement inet6_dump_ifinfo()
Prepare inet6_dump_ifinfo() to run with RCU protection
instead of RTNL and use for_each_netdev_dump() interface.

Also properly return 0 at the end of a dump, avoiding
an extra recvmsg() system call and RTNL acquisition.

Note that RTNL-less dumps need core changes, coming later
in the series.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-26 11:46:12 +00:00
Eric Dumazet
8afc7a78d5 ipv6: prepare inet6_fill_ifinfo() for RCU protection
We want to use RCU protection instead of RTNL
for inet6_fill_ifinfo().

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-26 11:46:12 +00:00
Eric Dumazet
4ad2681364 ipv6: prepare inet6_fill_ifla6_attrs() for RCU
We want to no longer hold RTNL while calling inet6_fill_ifla6_attrs()
in the future. Add needed READ_ONCE()/WRITE_ONCE() annotations.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-26 11:46:12 +00:00
Eric Dumazet
e353ea9ce4 rtnetlink: prepare nla_put_iflink() to run under RCU
We want to be able to run rtnl_fill_ifinfo() under RCU protection
instead of RTNL in the future.

This patch prepares dev_get_iflink() and nla_put_iflink()
to run either with RTNL or RCU held.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-26 11:46:12 +00:00
David S. Miller
5f6000aa24 Merge branch 'dp83826'
Jérémie Dautheribes says:

====================
Add support for TI DP83826 configuration

This short patch series introduces the possibility of overriding
some parameters which are latched by default by hardware straps on the
TI DP83826 PHY.

The settings that can be overridden include:
  - Configuring the PHY in either MII mode or RMII mode.
  - When in RMII mode, configuring the PHY in RMII slave mode or RMII
  master mode.

The RMII master/slave mode is TI-specific and determines whether the PHY
operates from a 25MHz reference clock (master mode) or from a 50MHz
reference clock (slave mode).

While these features should be supported by all the TI DP8382x family,
I have only been able to test them on TI DP83826 hardware.  Therefore,
support has been added specifically for this PHY in this patch series.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-26 11:38:45 +00:00
Jérémie Dautheribes
2844a0d7cf net: phy: dp83826: support configuring RMII master/slave operation mode
The TI DP83826 PHY can operate between two RMII modes:
	- master mode (PHY operates from a 25MHz clock reference)
        - slave mode (PHY operates from a 50MHz clock reference)

By default, the operation mode is configured by hardware straps.

Add support to configure the operation mode from within the driver.

Signed-off-by: Jérémie Dautheribes <jeremie.dautheribes@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-26 11:38:44 +00:00
Jérémie Dautheribes
d2ed0774b6 net: phy: dp83826: Add support for phy-mode configuration
The TI DP83826 PHY can operate in either MII mode or RMII mode.
By default, it is configured by straps.
It can also be configured by writing to the bit 5 of register 0x17 - RMII
and Status Register (RCSR).

When phydev->interface is rmii, rmii mode must be enabled, otherwise
mii mode must be set.
This prevents misconfiguration of hw straps.

Signed-off-by: Jérémie Dautheribes <jeremie.dautheribes@bootlin.com>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-26 11:38:44 +00:00
Jérémie Dautheribes
95f4fa1f45 dt-bindings: net: dp83822: support configuring RMII master/slave mode
Add property ti,rmii-mode to support selecting the RMII operation mode
between:
	- master mode (PHY operates from a 25MHz clock reference)
	- slave mode (PHY operates from a 50MHz clock reference)

If not set, the operation mode is configured by hardware straps.

Signed-off-by: Jérémie Dautheribes <jeremie.dautheribes@bootlin.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-26 11:38:44 +00:00
Oleksij Rempel
a7f08029e2 net: dsa: microchip: Add support for bridge port isolation
Implement bridge port isolation for KSZ switches. Enabling the isolation
of switch ports from each other while maintaining connectivity with the
CPU and other forwarding ports. For instance, to isolate swp1 and swp2
from each other, use the following commands:
- bridge link set dev swp1 isolated on
- bridge link set dev swp2 isolated on

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-26 11:35:34 +00:00
Jakub Kicinski
1a825e4cdf selftests: net: veth: test syncing GRO and XDP state while device is down
Test that we keep GRO flag in sync when XDP is disabled while
the device is closed.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-26 11:34:13 +00:00
Jakub Kicinski
fe9f801355 net: veth: clear GRO when clearing XDP even when down
veth sets NETIF_F_GRO automatically when XDP is enabled,
because both features use the same NAPI machinery.

The logic to clear NETIF_F_GRO sits in veth_disable_xdp() which
is called both on ndo_stop and when XDP is turned off.
To avoid the flag from being cleared when the device is brought
down, the clearing is skipped when IFF_UP is not set.
Bringing the device down should indeed not modify its features.

Unfortunately, this means that clearing is also skipped when
XDP is disabled _while_ the device is down. And there's nothing
on the open path to bring the device features back into sync.
IOW if user enables XDP, disables it and then brings the device
up we'll end up with a stray GRO flag set but no NAPI instances.

We don't depend on the GRO flag on the datapath, so the datapath
won't crash. We will crash (or hang), however, next time features
are sync'ed (either by user via ethtool or peer changing its config).
The GRO flag will go away, and veth will try to disable the NAPIs.
But the open path never created them since XDP was off, the GRO flag
was a stray. If NAPI was initialized before we'll hang in napi_disable().
If it never was we'll crash trying to stop uninitialized hrtimer.

Move the GRO flag updates to the XDP enable / disable paths,
instead of mixing them with the ndo_open / ndo_close paths.

Fixes: d3256efd8e8b ("veth: allow enabling NAPI even without XDP")
Reported-by: Thomas Gleixner <tglx@linutronix.de>
Reported-by: syzbot+039399a9b96297ddedca@syzkaller.appspotmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-26 11:34:13 +00:00
Linus Torvalds
d206a76d7d Linux 6.8-rc6 v6.8-rc6 2024-02-25 15:46:06 -08:00
Linus Torvalds
e231dbd452 bcachefs fixes for 6.8-rc6
Some more mostly boring fixes, but some not
 
 User reported ones:
  - the BTREE_ITER_FILTER_SNAPSHOTS one fixes a really nasty performance
    bug; user reported an unter initially taking 2 seconds and then ~2
    minutes
 
  - kill a __GFP_NOFAIL in the buffered read path; this was a leftover
    from the trickier fix to kill __GFP_NOFAIL in readahead, where we
    can't return errors (and have to silently truncate the read
    ourselves).
 
    bcachefs can't use GFP_NOFAIL for folio state unlike iomap based
    filesystems because our folio state is just barely too big, 2MB
    hugepages cause us to exceed the 2 page threshhold for GFP_NOFAIL.
 
    additionally, the flags argument was just buggy, we weren't supplying
    GFP_KERNEL previously (!).
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKnAFLkS8Qha+jvQrE6szbY3KbnYFAmXbqqMACgkQE6szbY3K
 bnYjnhAApY0vT6eVIYrZ7JGR6tw++xw02xRkcNW4zFE8INAvxQor5TXMEKkJs9Ui
 owh8WZjydXe0FJPE+pROcHMfxkkup4yP2SafgzR8DGERBwZbV9x7hvUbdG90EngY
 V/MevV+vr6UaV7133sY70K8BqUA/yAlCmmtOQVFgGRprEtEPS4Ur3vYR5+IzA0N7
 OhNXu6LxzkYbrNp9qroCN2UEVgRDJ/Mtda6uHfIUrqOQMUhiq2og9kvzJXzIrW9l
 URxm4eFQtJe0Yz09Ppypve+FutJIbtuDEYbcMJNT9Ig7BosD5vDjy9nhp8A5Q1Uk
 oDWBbCJhDdSYSVC/EQY8bv0AaCkyCa7vshSoKq0fDCFJ8k+nQ1YMF5wNhfgJhtU9
 Tl2Qytphp9/dxkvpIsR/5iNhLply9xTka1Wkp3G+3QJk0c17Dftpvz0/WhKI0P2B
 d6y4mz/hfCtWoSQOJbJl3fM/ZVpjH54VHDmb7sGyb5f+bTUkX6OUoJ4os8MNKGcS
 GdpEoWt/IAQj69c7w8aama5TXJ4kYe0XtXwbHTRE4j1PIQJA5SPvVt+32spRtb6i
 1gIa94uWKYMuG2U0XGxookHfZZZaMQkl79oXJOYRiC589YVyZC1Lp5iqr027jHEQ
 1HacrWPekPfmrhchyIzpH1mHOgaS+FKoD7eKrkvj0QSxpwfwpbI=
 =KNWR
 -----END PGP SIGNATURE-----

Merge tag 'bcachefs-2024-02-25' of https://evilpiepirate.org/git/bcachefs

Pull bcachefs fixes from Kent Overstreet:
 "Some more mostly boring fixes, but some not

  User reported ones:

   - the BTREE_ITER_FILTER_SNAPSHOTS one fixes a really nasty
     performance bug; user reported an untar initially taking two
     seconds and then ~2 minutes

   - kill a __GFP_NOFAIL in the buffered read path; this was a leftover
     from the trickier fix to kill __GFP_NOFAIL in readahead, where we
     can't return errors (and have to silently truncate the read
     ourselves).

     bcachefs can't use GFP_NOFAIL for folio state unlike iomap based
     filesystems because our folio state is just barely too big, 2MB
     hugepages cause us to exceed the 2 page threshhold for GFP_NOFAIL.

     additionally, the flags argument was just buggy, we weren't
     supplying GFP_KERNEL previously (!)"

* tag 'bcachefs-2024-02-25' of https://evilpiepirate.org/git/bcachefs:
  bcachefs: fix bch2_save_backtrace()
  bcachefs: Fix check_snapshot() memcpy
  bcachefs: Fix bch2_journal_flush_device_pins()
  bcachefs: fix iov_iter count underflow on sub-block dio read
  bcachefs: Fix BTREE_ITER_FILTER_SNAPSHOTS on inodes btree
  bcachefs: Kill __GFP_NOFAIL in buffered read path
  bcachefs: fix backpointer_to_text() when dev does not exist
2024-02-25 15:31:57 -08:00
Kent Overstreet
5197728f81 bcachefs: fix bch2_save_backtrace()
Missed a call in the previous fix.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-02-25 15:45:36 -05:00
Linus Torvalds
70ff1fe626 Two documentation build fixes:
- The XFS online fsck documentation uses incredibly deeply nested
   subsection and list nesting; that broke the PDF docs build.  Tweak a
   parameter to tell LaTeX to allow the deeper nesting.
 
 - Fix a 6.8 PDF-build regression
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEIw+MvkEiF49krdp9F0NaE2wMflgFAmXbi5QPHGNvcmJldEBs
 d24ubmV0AAoJEBdDWhNsDH5YZSMH/RIZh48S/Jh5mhjzqnKhGf1sFn6lSk8sFY3I
 uJqML/LPo6GYzX8WvYKlfyP9+UvrLiDcQF0Er6MeIhK6mhKE1Lp7w1YvRgeXcgFR
 H9DtxA4fJSGWlAaMqZBwsXjF2EFwjyxHtHUeNyaJ+YocHfrT6L9Cp9uBEvdT3Iye
 F191VpjWLrFD0DJEh64CcmNd3rggN5jeD/n24dbNOmnem1cak2brIIUeltdkUmQG
 48Hr27xqYF1QyVckfoRtnT/C3AyaCKbxRbTxeAjwUDjU+7nCsHf1MKltiFAZHnFs
 7ZLsOboLhmR+y9xiZUg7OlpRaVj1C+7JSYC+WSaNjwRkkIfJUu4=
 =MEzm
 -----END PGP SIGNATURE-----

Merge tag 'docs-6.8-fixes3' of git://git.lwn.net/linux

Pull two documentation build fixes from Jonathan Corbet:

 - The XFS online fsck documentation uses incredibly deeply nested
   subsection and list nesting; that broke the PDF docs build. Tweak a
   parameter to tell LaTeX to allow the deeper nesting.

 - Fix a 6.8 PDF-build regression

* tag 'docs-6.8-fixes3' of git://git.lwn.net/linux:
  docs: translations: use attribute to store current language
  docs: Instruct LaTeX to cope with deeper nesting
2024-02-25 10:58:12 -08:00
Linus Torvalds
c46ac50ebe USB fixes for 6.8-rc6
Here are some small USB fixes for 6.8-rc6 to resolve some reported
 problems.  These include:
   - regression fixes with typec tpcm code as reported by many
   - cdnsp and cdns3 driver fixes
   - usb role setting code bugfixes
   - build fix for uhci driver
   - ncm gadget driver bugfix
   - MAINTAINERS entry update
 
 All of these have been in linux-next all week with no reported issues
 and there is at least one fix in here that is in Thorsten's regression
 list that is being tracked.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZdtGEA8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ymzsgCg2IsWqIR72XUGsa5rrbRnskOP/G4An24BmUb6
 t34d0VjiHagZTFlfRx6g
 =eOL1
 -----END PGP SIGNATURE-----

Merge tag 'usb-6.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
 "Here are some small USB fixes for 6.8-rc6 to resolve some reported
  problems. These include:

   - regression fixes with typec tpcm code as reported by many

   - cdnsp and cdns3 driver fixes

   - usb role setting code bugfixes

   - build fix for uhci driver

   - ncm gadget driver bugfix

   - MAINTAINERS entry update

  All of these have been in linux-next all week with no reported issues
  and there is at least one fix in here that is in Thorsten's regression
  list that is being tracked"

* tag 'usb-6.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  usb: typec: tpcm: Fix issues with power being removed during reset
  MAINTAINERS: Drop myself as maintainer of TYPEC port controller drivers
  usb: gadget: ncm: Avoid dropping datagrams of properly parsed NTBs
  Revert "usb: typec: tcpm: reset counter when enter into unattached state after try role"
  usb: gadget: omap_udc: fix USB gadget regression on Palm TE
  usb: dwc3: gadget: Don't disconnect if not started
  usb: cdns3: fix memory double free when handle zero packet
  usb: cdns3: fixed memory use after free at cdns3_gadget_ep_disable()
  usb: roles: don't get/set_role() when usb_role_switch is unregistered
  usb: roles: fix NULL pointer issue when put module's reference
  usb: cdnsp: fixed issue with incorrect detecting CDNSP family controllers
  usb: cdnsp: blocked some cdns3 specific code
  usb: uhci-grlib: Explicitly include linux/platform_device.h
2024-02-25 10:41:57 -08:00
Linus Torvalds
1e592e9536 TTY/Serial driver fixes for 6.8-rc6
Here are 3 small serial/tty driver fixes for 6.8-rc6 that resolve the
 following reported errors:
   - riscv hvc console driver fix that was reported by many
   - amba-pl011 serial driver fix for RS485 mode
   - stm32 serial driver fix for RS485 mode
 
 All of these have been in linux-next all week with no reported problems.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZdtGnA8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ymrqwCfSIsUj9GLazXJTTTgMz1I94HXLrQAnjq9QOtg
 EFt6xmUGcF4zFhnfSLal
 =/k5+
 -----END PGP SIGNATURE-----

Merge tag 'tty-6.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty/serial driver fixes from Greg KH:
 "Here are three small serial/tty driver fixes for 6.8-rc6 that resolve
  the following reported errors:

   - riscv hvc console driver fix that was reported by many

   - amba-pl011 serial driver fix for RS485 mode

   - stm32 serial driver fix for RS485 mode

  All of these have been in linux-next all week with no reported
  problems"

* tag 'tty-6.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  serial: amba-pl011: Fix DMA transmission in RS485 mode
  serial: stm32: do not always set SER_RS485_RX_DURING_TX if RS485 is enabled
  tty: hvc: Don't enable the RISC-V SBI console by default
2024-02-25 10:35:41 -08:00
Linus Torvalds
1eee4ef38c - Make sure clearing CPU buffers using VERW happens at the latest possible
point in the return-to-userspace path, otherwise memory accesses after
   the VERW execution could cause data to land in CPU buffers again
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmXbG7IACgkQEsHwGGHe
 VUoEEg//d1qt/PEWCC23wMO6gLMl4J/e4ZQAuGOKGed/jUmOaQKpHJmpDMRc0li5
 llRYDdfE0ikmtQT3t9vQDs3xbWfT5bLMsijliRimb193FaS1HGlHMMS1nxhfjyfv
 MecbWfkwzX2JnrxJpsbfue+7kks3HyIXYsXV7kSFiHavk4F3GFQXYLO11pKbNQwN
 9UfjJDeVsrcWPGCHhoPKF5NHUnQKIA8ZC6g8yBq894AtdWOhFY7ePKBZefUWQQ1n
 myc5GJ3dKFICMCZvkMABtHYCmHU/W3y/6tPtnrz3kT8GdCIAHG+K9VRUfY1ml94H
 x327GoM3sEzHLsPizKy00/Uao+j6FOtv631LoDLsO2MF3sHoTZDaSgg5y2D/ZC7t
 IZdK3mUGtdINRhGiWWpdxyaMfkQ62cdZk8FkeYkRAewYS6WYSdMX3cPqFNy4Ss5u
 r3reMOD3JcxAatcqhHMXjARMfY+N08gQBpxBul3ejgH8t8aY7xJx6Vggty5kBlHZ
 7urV9jIRxSXfbBmOcYu6HP1ucFLWNSUQCBn7Imrh+5zbE1XVv7NaAWvT4Nmgb0/X
 57fHoYYSVwaJ0k3zWWM7QYEdcuJ7IZnVgTCQYx26Ec2AOQRxE9ose+awTLYtTbp1
 T+XaOlItHKMRzx9K46D7xHwmC5qiokFki3exp5vfGZxGyT3+t/c=
 =n5us
 -----END PGP SIGNATURE-----

Merge tag 'x86_urgent_for_v6.8_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Borislav Petkov:

 - Make sure clearing CPU buffers using VERW happens at the latest
   possible point in the return-to-userspace path, otherwise memory
   accesses after the VERW execution could cause data to land in CPU
   buffers again

* tag 'x86_urgent_for_v6.8_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  KVM/VMX: Move VERW closer to VMentry for MDS mitigation
  KVM/VMX: Use BT+JNC, i.e. EFLAGS.CF to select VMRESUME vs. VMLAUNCH
  x86/bugs: Use ALTERNATIVE() instead of mds_user_clear static key
  x86/entry_32: Add VERW just before userspace transition
  x86/entry_64: Add VERW just before userspace transition
  x86/bugs: Add asm helpers for executing VERW
2024-02-25 10:22:21 -08:00
Linus Torvalds
8c46ed3740 - Make sure GICv4 always gets initialized to prevent a kexec-ed kernel
from silently failing to set it up
 
 - Do not call bus_get_dev_root() for the mbigen irqchip as it always
   returns NULL - use NULL directly
 
 - Fix hardware interrupt number truncation when assigning MSI interrupts
 
 - Correct sending end-of-interrupt messages to disabled interrupts lines on
   RISC-V PLIC
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmXbGgcACgkQEsHwGGHe
 VUqrig//ay2UcLEi8CwxobHXIpuUq+pMt1pLhdDtyehTKx+T44GCwMFXGML8H27A
 8CszKTEJsRxXuUP1iXquECfYqYqGmOZHcIMCX0vDodezRriJXq3m549zdoVY6LIy
 m7x5mN4rfc8xaK/krSz0IKgCn7TZ7Nugw8zHE9PEJ7hj/exIA6EH2f1p0dbDc2z8
 PRWsexi39mVLEstOl7yf5+hys6RN07a+9+PFJrEWCC0bO5We9Z+m7gnpu3zUrwcO
 LlDAU6UwWhVc+xFipW9SFYEhCqtprdfUftf1OW2BLe1TM7pHxdvA3OwlT5ZxxN90
 h4wmQ084v08hcn8YpUkaK5fWEtT+1isD3/8dVUMSRQQ4jcjLiEAVdVOvKOmJNEeJ
 +MYqAktCoyay9ZYCrpZRRIVYfC4/FLMEPExPwILFM6nfMMVEBfkDqFyjGyw3uls7
 QT7eHSo121kQsZc5V/8SuU30f64w/vJbtaZIYOjSR9+hQMQ+i+8cMuV0RSvmDIa2
 1vGdhOLvG5Rk7Wy9xd9ITXbeq+z9KD/tH8BadyfARosvQTrg6aMhhqQRHWJtS6Pf
 Vg50yS6D8ETzNNZSPFABrjBKiqQmEo3ILlUpMbR8jGBaLggqZhTs8eBRj3+XIXp8
 UxgB2b47qKmZI9eaFzne9scnOGmQSRuZ7x8IrwworFGzChSzwpU=
 =cMur
 -----END PGP SIGNATURE-----

Merge tag 'irq_urgent_for_v6.8_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fixes from Borislav Petkov:

 - Make sure GICv4 always gets initialized to prevent a kexec-ed kernel
   from silently failing to set it up

 - Do not call bus_get_dev_root() for the mbigen irqchip as it always
   returns NULL - use NULL directly

 - Fix hardware interrupt number truncation when assigning MSI
   interrupts

 - Correct sending end-of-interrupt messages to disabled interrupts
   lines on RISC-V PLIC

* tag 'irq_urgent_for_v6.8_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/gic-v3-its: Do not assume vPE tables are preallocated
  irqchip/mbigen: Don't use bus_get_dev_root() to find the parent
  PCI/MSI: Prevent MSI hardware interrupt number truncation
  irqchip/sifive-plic: Enable interrupt if needed before EOI
2024-02-25 10:14:12 -08:00
Linus Torvalds
4ca0d9894f Change since last update:
- Fix page refcount leak when looking up specific inodes
    introduced by metabuf reworking.
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEQ0A6bDUS9Y+83NPFUXZn5Zlu5qoFAmXbQekRHHhpYW5nQGtl
 cm5lbC5vcmcACgkQUXZn5Zlu5qpkyg//cjKnuzI2gHL3ff1o0jiTMD11Y/EmwzQS
 C+JgE8WDMjUlVQpbwYnFWo/6se8qvk/fwTXPcRc45piF8YNZVchwsagxO626Kab7
 0xTX7ZUgjuth6ahrkAJZkJNMgO4mYf928uYB6EVWaTb4iF0iT6glEOsSVAR/cssL
 wYKbtgv3OP9t6fQcN/XL31hRkqr2MPk0y5Q27KyT4zo4lrx3xyah7Ndo3aEK/RcM
 +6FUwqRiDsgDF/Ga65ylDvEp9eA03OFNHBn4DrORe3B9KV75NkmSJf/8QVEceNV/
 9D072Hvt7/iyOq53AxWH3Jvp7aro/i0rvAHPbXZX4RVyqcJxaLYCQyrBFvQL0/Ie
 B+793Iua8zbkQCbZ85LpTGrxAb5WydlSJp10AuHTr2MO+wS8bBqf96Jp9x1MuS9D
 vqq7jbjwuZfnFUjpzu49GF6htG+WRVgY5TDzU6IAr3izXuqVxmz+zw5zExbGMmKm
 2S5lb+q68DOBP4YaelAQHh97k0pYDW3eQ6GhDD2FA4P/DOr8p3vszZFBeaqaL70v
 WS8z1wzOgEaleTRN4iRFlMCTvRLjOtDEaUNquohcHqJLe7w/DF7gzCU+0//8dVjD
 cZieFX8uZDtzzcjrlYWeTo8oHd0q2WYixbCN8P92/YXUFIVpfdl85ugyr9KvkZxd
 jKzpAy2LbSw=
 =/Bfj
 -----END PGP SIGNATURE-----

Merge tag 'erofs-for-6.8-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs

Pull erofs fix from Gao Xiang:

 - Fix page refcount leak when looking up specific inodes
   introduced by metabuf reworking

* tag 'erofs-for-6.8-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
  erofs: fix refcount on the metabuf used for inode lookup
2024-02-25 09:53:13 -08:00
Linus Torvalds
66a97c2ec9 We still have some races in filesystem methods when exposed to RCU
pathwalk.  This series is a result of code audit (the second round
 of it) and it should deal with most of that stuff.  Exceptions: ntfs3
 ->d_hash()/->d_compare() and ceph_d_revalidate().  Up to maintainers (a
 note for NTFS folks - when documentation says that a method may not block,
 it *does* imply that blocking allocations are to be avoided.  Really).
 
 Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCZdroDAAKCRBZ7Krx/gZQ
 60dKAQCzp6rYr3ye4nylho9Rzu8LEpH04TuNf3C6JuyUaNHxHwEAvNLatZsyFnmV
 Ksp2Rg/IlKPNtQgYJ8xPxv9DFmNe8gI=
 =47Un
 -----END PGP SIGNATURE-----

Merge tag 'pull-fixes.pathwalk-rcu-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull RCU pathwalk fixes from Al Viro:
 "We still have some races in filesystem methods when exposed to RCU
  pathwalk. This series is a result of code audit (the second round of
  it) and it should deal with most of that stuff.

  Still pending: ntfs3 ->d_hash()/->d_compare() and ceph_d_revalidate().
  Up to maintainers (a note for NTFS folks - when documentation says
  that a method may not block, it *does* imply that blocking allocations
  are to be avoided. Really)"

[ More explanations for people who aren't familiar with the vagaries of
  RCU path walking: most of it is hidden from filesystems, but if a
  filesystem actively participates in the low-level path walking it
  needs to make sure the fields involved in that walk are RCU-safe.

  That "actively participate in low-level path walking" includes things
  like having its own ->d_hash()/->d_compare() routines, or by having
  its own directory permission function that doesn't just use the common
  helpers.  Having a ->d_revalidate() function will also have this issue.

  Note that instead of making everything RCU safe you can also choose to
  abort the RCU pathwalk if your operation cannot be done safely under
  RCU, but that obviously comes with a performance penalty. One common
  pattern is to allow the simple cases under RCU, and abort only if you
  need to do something more complicated.

  So not everything needs to be RCU-safe, and things like the inode etc
  that the VFS itself maintains obviously already are. But these fixes
  tend to be about properly RCU-delaying things like ->s_fs_info that
  are maintained by the filesystem and that got potentially released too
  early.   - Linus ]

* tag 'pull-fixes.pathwalk-rcu-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  ext4_get_link(): fix breakage in RCU mode
  cifs_get_link(): bail out in unsafe case
  fuse: fix UAF in rcu pathwalks
  procfs: make freeing proc_fs_info rcu-delayed
  procfs: move dropping pde and pid from ->evict_inode() to ->free_inode()
  nfs: fix UAF on pathwalk running into umount
  nfs: make nfs_set_verifier() safe for use in RCU pathwalk
  afs: fix __afs_break_callback() / afs_drop_open_mmap() race
  hfsplus: switch to rcu-delayed unloading of nls and freeing ->s_fs_info
  exfat: move freeing sbi, upcase table and dropping nls into rcu-delayed helper
  affs: free affs_sb_info with kfree_rcu()
  rcu pathwalk: prevent bogus hard errors from may_lookup()
  fs/super.c: don't drop ->s_user_ns until we free struct super_block itself
2024-02-25 09:29:05 -08:00
Linus Torvalds
9b24349279 A couple of fixes - revert of regression from this cycle
and a fix for erofs failure exit breakage (had been there since
 way back).
 
 Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCZdrkZAAKCRBZ7Krx/gZQ
 67D8AP0eM68yZvbThA/Hb5iElDh3Aogt1AW/QAu9/alkDVHr+wD+PKqhamC8WXGk
 b1QZ5AOHQFwzkzdF4738fdbujquBWQE=
 =Ra0D
 -----END PGP SIGNATURE-----

Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull vfs fixes from Al Viro:
 "A couple of fixes - revert of regression from this cycle and a fix for
  erofs failure exit breakage (had been there since way back)"

* tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  erofs: fix handling kern_mount() failure
  Revert "get rid of DCACHE_GENOCIDE"
2024-02-25 09:17:15 -08:00
Al Viro
9fa8e282c2 ext4_get_link(): fix breakage in RCU mode
1) errors from ext4_getblk() should not be propagated to caller
unless we are really sure that we would've gotten the same error
in non-RCU pathwalk.
2) we leak buffer_heads if ext4_getblk() is successful, but bh is
not uptodate.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2024-02-25 02:10:32 -05:00
Al Viro
0511fdb4a3 cifs_get_link(): bail out in unsafe case
->d_revalidate() bails out there, anyway.  It's not enough
to prevent getting into ->get_link() in RCU mode, but that
could happen only in a very contrieved setup.  Not worth
trying to do anything fancy here unless ->d_revalidate()
stops kicking out of RCU mode at least in some cases.

Reviewed-by: Christian Brauner <brauner@kernel.org>
Acked-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2024-02-25 02:10:32 -05:00
Al Viro
053fc4f755 fuse: fix UAF in rcu pathwalks
->permission(), ->get_link() and ->inode_get_acl() might dereference
->s_fs_info (and, in case of ->permission(), ->s_fs_info->fc->user_ns
as well) when called from rcu pathwalk.

Freeing ->s_fs_info->fc is rcu-delayed; we need to make freeing ->s_fs_info
and dropping ->user_ns rcu-delayed too.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2024-02-25 02:10:32 -05:00
Al Viro
e31f0a57ae procfs: make freeing proc_fs_info rcu-delayed
makes proc_pid_ns() safe from rcu pathwalk (put_pid_ns()
is still synchronous, but that's not a problem - it does
rcu-delay everything that needs to be)

Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2024-02-25 02:10:32 -05:00
Al Viro
47458802f6 procfs: move dropping pde and pid from ->evict_inode() to ->free_inode()
that keeps both around until struct inode is freed, making access
to them safe from rcu-pathwalk

Acked-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2024-02-25 02:10:32 -05:00
Al Viro
c1b967d03c nfs: fix UAF on pathwalk running into umount
NFS ->d_revalidate(), ->permission() and ->get_link() need to access
some parts of nfs_server when called in RCU mode:
	server->flags
	server->caps
	*(server->io_stats)
and, worst of all, call
	server->nfs_client->rpc_ops->have_delegation
(the last one - as NFS_PROTO(inode)->have_delegation()).  We really
don't want to RCU-delay the entire nfs_free_server() (it would have
to be done with schedule_work() from RCU callback, since it can't
be made to run from interrupt context), but actual freeing of
nfs_server and ->io_stats can be done via call_rcu() just fine.
nfs_client part is handled simply by making nfs_free_client() use
kfree_rcu().

Acked-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2024-02-25 02:10:32 -05:00
Al Viro
10a973fc4f nfs: make nfs_set_verifier() safe for use in RCU pathwalk
nfs_set_verifier() relies upon dentry being pinned; if that's
the case, grabbing ->d_lock stabilizes ->d_parent and guarantees
that ->d_parent points to a positive dentry.  For something
we'd run into in RCU mode that is *not* true - dentry might've
been through dentry_kill() just as we grabbed ->d_lock, with
its parent going through the same just as we get to into
nfs_set_verifier_locked().  It might get to detaching inode
(and zeroing ->d_inode) before nfs_set_verifier_locked() gets
to fetching that; we get an oops as the result.

That can happen in nfs{,4} ->d_revalidate(); the call chain in
question is nfs_set_verifier_locked() <- nfs_set_verifier() <-
nfs_lookup_revalidate_delegated() <- nfs{,4}_do_lookup_revalidate().
We have checked that the parent had been positive, but that's
done before we get to nfs_set_verifier() and it's possible for
memory pressure to pick our dentry as eviction candidate by that
time.  If that happens, back-to-back attempts to kill dentry and
its parent are quite normal.  Sure, in case of eviction we'll
fail the ->d_seq check in the caller, but we need to survive
until we return there...

Acked-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2024-02-25 02:10:31 -05:00
Al Viro
275655d320 afs: fix __afs_break_callback() / afs_drop_open_mmap() race
In __afs_break_callback() we might check ->cb_nr_mmap and if it's non-zero
do queue_work(&vnode->cb_work).  In afs_drop_open_mmap() we decrement
->cb_nr_mmap and do flush_work(&vnode->cb_work) if it reaches zero.

The trouble is, there's nothing to prevent __afs_break_callback() from
seeing ->cb_nr_mmap before the decrement and do queue_work() after both
the decrement and flush_work().  If that happens, we might be in trouble -
vnode might get freed before the queued work runs.

__afs_break_callback() is always done under ->cb_lock, so let's make
sure that ->cb_nr_mmap can change from non-zero to zero while holding
->cb_lock (the spinlock component of it - it's a seqlock and we don't
need to mess with the counter).

Acked-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2024-02-25 02:10:31 -05:00
Al Viro
af072cf683 hfsplus: switch to rcu-delayed unloading of nls and freeing ->s_fs_info
->d_hash() and ->d_compare() use those, so we need to delay freeing
them.

Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2024-02-25 02:10:31 -05:00
Al Viro
a13d1a4de3 exfat: move freeing sbi, upcase table and dropping nls into rcu-delayed helper
That stuff can be accessed by ->d_hash()/->d_compare(); as it is, we have
a hard-to-hit UAF if rcu pathwalk manages to get into ->d_hash() on a filesystem
that is in process of getting shut down.

Besides, having nls and upcase table cleanup moved from ->put_super() towards
the place where sbi is freed makes for simpler failure exits.

Acked-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2024-02-25 02:10:31 -05:00
Al Viro
529f89a9e4 affs: free affs_sb_info with kfree_rcu()
one of the flags in it is used by ->d_hash()/->d_compare()

Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2024-02-25 02:10:31 -05:00
Al Viro
cdb67fdeed rcu pathwalk: prevent bogus hard errors from may_lookup()
If lazy call of ->permission() returns a hard error, check that
try_to_unlazy() succeeds before returning it.  That both makes
life easier for ->permission() instances and closes the race
in ENOTDIR handling - it is possible that positive d_can_lookup()
seen in link_path_walk() applies to the state *after* unlink() +
mkdir(), while nd->inode matches the state prior to that.

Normally seeing e.g. EACCES from permission check in rcu pathwalk
means that with some timings non-rcu pathwalk would've run into
the same; however, running into a non-executable regular file
in the middle of a pathname would not get to permission check -
it would fail with ENOTDIR instead.

Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2024-02-25 02:10:31 -05:00