Commit Graph

997580 Commits

Author SHA1 Message Date
Miaohe Lin
d85aecf284 hugetlb_cgroup: fix imbalanced css_get and css_put pair for shared mappings
The current implementation of hugetlb_cgroup for shared mappings could
have different behavior.  Consider the following two scenarios:

 1.Assume initial css reference count of hugetlb_cgroup is 1:
  1.1 Call hugetlb_reserve_pages with from = 1, to = 2. So css reference
      count is 2 associated with 1 file_region.
  1.2 Call hugetlb_reserve_pages with from = 2, to = 3. So css reference
      count is 3 associated with 2 file_region.
  1.3 coalesce_file_region will coalesce these two file_regions into
      one. So css reference count is 3 associated with 1 file_region
      now.

 2.Assume initial css reference count of hugetlb_cgroup is 1 again:
  2.1 Call hugetlb_reserve_pages with from = 1, to = 3. So css reference
      count is 2 associated with 1 file_region.

Therefore, we might have one file_region while holding one or more css
reference counts. This inconsistency could lead to imbalanced css_get()
and css_put() pair. If we do css_put one by one (i.g. hole punch case),
scenario 2 would put one more css reference. If we do css_put all
together (i.g. truncate case), scenario 1 will leak one css reference.

The imbalanced css_get() and css_put() pair would result in a non-zero
reference when we try to destroy the hugetlb cgroup. The hugetlb cgroup
directory is removed __but__ associated resource is not freed. This
might result in OOM or can not create a new hugetlb cgroup in a busy
workload ultimately.

In order to fix this, we have to make sure that one file_region must
hold exactly one css reference. So in coalesce_file_region case, we
should release one css reference before coalescence. Also only put css
reference when the entire file_region is removed.

The last thing to note is that the caller of region_add() will only hold
one reference to h_cg->css for the whole contiguous reservation region.
But this area might be scattered when there are already some
file_regions reside in it. As a result, many file_regions may share only
one h_cg->css reference. In order to ensure that one file_region must
hold exactly one css reference, we should do css_get() for each
file_region and release the reference held by caller when they are done.

[linmiaohe@huawei.com: fix imbalanced css_get and css_put pair for shared mappings]
  Link: https://lkml.kernel.org/r/20210316023002.53921-1-linmiaohe@huawei.com

Link: https://lkml.kernel.org/r/20210301120540.37076-1-linmiaohe@huawei.com
Fixes: 075a61d07a ("hugetlb_cgroup: add accounting for shared mappings")
Reported-by: kernel test robot <lkp@intel.com> (auto build test ERROR)
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Wanpeng Li <liwp.linux@gmail.com>
Cc: Mina Almasry <almasrymina@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-03-25 09:22:55 -07:00
Potnuri Bharat Teja
3408be145a RDMA/cxgb4: Fix adapter LE hash errors while destroying ipv6 listening server
Not setting the ipv6 bit while destroying ipv6 listening servers may
result in potential fatal adapter errors due to lookup engine memory hash
errors. Therefore always set ipv6 field while destroying ipv6 listening
servers.

Fixes: 830662f6f0 ("RDMA/cxgb4: Add support for active and passive open connection with IPv6 address")
Link: https://lore.kernel.org/r/20210324190453.8171-1-bharat@chelsio.com
Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-25 10:25:58 -03:00
Rich Wiley
20109a859a arm64: kernel: disable CNP on Carmel
On NVIDIA Carmel cores, CNP behaves differently than it does on standard
ARM cores. On Carmel, if two cores have CNP enabled and share an L2 TLB
entry created by core0 for a specific ASID, a non-shareable TLBI from
core1 may still see the shared entry. On standard ARM cores, that TLBI
will invalidate the shared entry as well.

This causes issues with patchsets that attempt to do local TLBIs based
on cpumasks instead of broadcast TLBIs. Avoid these issues by disabling
CNP support for NVIDIA Carmel cores.

Signed-off-by: Rich Wiley <rwiley@nvidia.com>
Link: https://lore.kernel.org/r/20210324002809.30271-1-rwiley@nvidia.com
[will: Fix pre-existing whitespace issue]
Signed-off-by: Will Deacon <will@kernel.org>
2021-03-25 10:00:23 +00:00
Maninder Singh
baa96377bc arm64/process.c: fix Wmissing-prototypes build warnings
Fix GCC warnings reported when building with "-Wmissing-prototypes":

  arch/arm64/kernel/process.c:261:6: warning: no previous prototype for '__show_regs' [-Wmissing-prototypes]
      261 | void __show_regs(struct pt_regs *regs)
          |      ^~~~~~~~~~~
  arch/arm64/kernel/process.c:307:6: warning: no previous prototype for '__show_regs_alloc_free' [-Wmissing-prototypes]
      307 | void __show_regs_alloc_free(struct pt_regs *regs)
          |      ^~~~~~~~~~~~~~~~~~~~~~
  arch/arm64/kernel/process.c:365:5: warning: no previous prototype for 'arch_dup_task_struct' [-Wmissing-prototypes]
      365 | int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
          |     ^~~~~~~~~~~~~~~~~~~~
  arch/arm64/kernel/process.c:546:41: warning: no previous prototype for '__switch_to' [-Wmissing-prototypes]
      546 | __notrace_funcgraph struct task_struct *__switch_to(struct task_struct *prev,
          |                                         ^~~~~~~~~~~
  arch/arm64/kernel/process.c:710:25: warning: no previous prototype for 'arm64_preempt_schedule_irq' [-Wmissing-prototypes]
      710 | asmlinkage void __sched arm64_preempt_schedule_irq(void)
          |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~

Link: https://lore.kernel.org/lkml/202103192250.AennsfXM-lkp@intel.com
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Maninder Singh <maninder1.s@samsung.com>
Link: https://lore.kernel.org/r/1616568899-986-1-git-send-email-maninder1.s@samsung.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-03-25 09:50:16 +00:00
Linus Torvalds
e138138003 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller:
 "Various fixes, all over:

   1) Fix overflow in ptp_qoriq_adjfine(), from Yangbo Lu.

   2) Always store the rx queue mapping in veth, from Maciej
      Fijalkowski.

   3) Don't allow vmlinux btf in map_create, from Alexei Starovoitov.

   4) Fix memory leak in octeontx2-af from Colin Ian King.

   5) Use kvalloc in bpf x86 JIT for storing jit'd addresses, from
      Yonghong Song.

   6) Fix tx ptp stats in mlx5, from Aya Levin.

   7) Check correct ip version in tun decap, fropm Roi Dayan.

   8) Fix rate calculation in mlx5 E-Switch code, from arav Pandit.

   9) Work item memork leak in mlx5, from Shay Drory.

  10) Fix ip6ip6 tunnel crash with bpf, from Daniel Borkmann.

  11) Lack of preemptrion awareness in macvlan, from Eric Dumazet.

  12) Fix data race in pxa168_eth, from Pavel Andrianov.

  13) Range validate stab in red_check_params(), from Eric Dumazet.

  14) Inherit vlan filtering setting properly in b53 driver, from
      Florian Fainelli.

  15) Fix rtnl locking in igc driver, from Sasha Neftin.

  16) Pause handling fixes in igc driver, from Muhammad Husaini
      Zulkifli.

  17) Missing rtnl locking in e1000_reset_task, from Vitaly Lifshits.

  18) Use after free in qlcnic, from Lv Yunlong.

  19) fix crash in fritzpci mISDN, from Tong Zhang.

  20) Premature rx buffer reuse in igb, from Li RongQing.

  21) Missing termination of ip[a driver message handler arrays, from
      Alex Elder.

  22) Fix race between "x25_close" and "x25_xmit"/"x25_rx" in hdlc_x25
      driver, from Xie He.

  23) Use after free in c_can_pci_remove(), from Tong Zhang.

  24) Uninitialized variable use in nl80211, from Jarod Wilson.

  25) Off by one size calc in bpf verifier, from Piotr Krysiuk.

  26) Use delayed work instead of deferrable for flowtable GC, from
      Yinjun Zhang.

  27) Fix infinite loop in NPC unmap of octeontx2 driver, from
      Hariprasad Kelam.

  28) Fix being unable to change MTU of dwmac-sun8i devices due to lack
      of fifo sizes, from Corentin Labbe.

  29) DMA use after free in r8169 with WoL, fom Heiner Kallweit.

  30) Mismatched prototypes in isdn-capi, from Arnd Bergmann.

  31) Fix psample UAPI breakage, from Ido Schimmel"

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (171 commits)
  psample: Fix user API breakage
  math: Export mul_u64_u64_div_u64
  ch_ktls: fix enum-conversion warning
  octeontx2-af: Fix memory leak of object buf
  ptp_qoriq: fix overflow in ptp_qoriq_adjfine() u64 calcalation
  net: bridge: don't notify switchdev for local FDB addresses
  net/sched: act_ct: clear post_ct if doing ct_clear
  net: dsa: don't assign an error value to tag_ops
  isdn: capi: fix mismatched prototypes
  net/mlx5: SF, do not use ecpu bit for vhca state processing
  net/mlx5e: Fix division by 0 in mlx5e_select_queue
  net/mlx5e: Fix error path for ethtool set-priv-flag
  net/mlx5e: Offload tuple rewrite for non-CT flows
  net/mlx5e: Allow to match on MPLS parameters only for MPLS over UDP
  net/mlx5: Add back multicast stats for uplink representor
  net: ipconfig: ic_dev can be NULL in ic_close_devs
  MAINTAINERS: Combine "QLOGIC QLGE 10Gb ETHERNET DRIVER" sections into one
  docs: networking: Fix a typo
  r8169: fix DMA being used after buffer free if WoL is enabled
  net: ipa: fix init header command validation
  ...
2021-03-24 18:16:04 -07:00
Ido Schimmel
e43accba9b psample: Fix user API breakage
Cited commit added a new attribute before the existing group reference
count attribute, thereby changing its value and breaking existing
applications on new kernels.

Before:

 # psample -l
 libpsample ERROR psample_group_foreach: failed to recv message: Operation not supported

After:

 # psample -l
 Group Num       Refcount        Group Seq
 1               1               0

Fix by restoring the value of the old attribute and remove the
misleading comments from the enumerator to avoid future bugs.

Cc: stable@vger.kernel.org
Fixes: d8bed686ab ("net: psample: Add tunnel support")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reported-by: Adiel Bidani <adielb@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-24 16:44:31 -07:00
David S. Miller
bf45947864 math: Export mul_u64_u64_div_u64
Fixes: f51d7bf1db ("ptp_qoriq: fix overflow in ptp_qoriq_adjfine() u64 calcalation")
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-24 16:42:54 -07:00
Arnd Bergmann
6f235a69e5 ch_ktls: fix enum-conversion warning
gcc points out an incorrect enum assignment:

drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c: In function 'chcr_ktls_cpl_set_tcb_rpl':
drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c:684:22: warning: implicit conversion from 'enum <anonymous>' to 'enum ch_ktls_open_state' [-Wenum-conversion]

This appears harmless, and should apparently use 'CH_KTLS_OPEN_SUCCESS'
instead of 'false', with the same value '0'.

Fixes: efca3878a5 ("ch_ktls: Issue if connection offload fails")
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-24 12:35:39 -07:00
Colin Ian King
9e0a537d06 octeontx2-af: Fix memory leak of object buf
Currently the error return path when lfs fails to allocate is not free'ing
the memory allocated to buf. Fix this by adding the missing kfree.

Addresses-Coverity: ("Resource leak")
Fixes: f788409714 ("octeontx2-af: Formatting debugfs entry rsrc_alloc.")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-24 12:33:06 -07:00
Yangbo Lu
f51d7bf1db ptp_qoriq: fix overflow in ptp_qoriq_adjfine() u64 calcalation
Current calculation for diff of TMR_ADD register value may have
64-bit overflow in this code line, when long type scaled_ppm is
large.

adj *= scaled_ppm;

This patch is to resolve it by using mul_u64_u64_div_u64().

Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-24 12:10:03 -07:00
Linus Torvalds
4ee998b0ef Three fixes for the Qualcomm clk driver, two for regressions this merge
window and one for a long standing problem that only popped up now that
 eMMC is being used.
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEE9L57QeeUxqYDyoaDrQKIl8bklSUFAmBbg6ARHHNib3lkQGtl
 cm5lbC5vcmcACgkQrQKIl8bklSUSnBAAz9crwEFRI5iZu+yubSStfNCNdXbH6eev
 UEMfi0G21EodS5D5qG2YcPmT4gDkpdkMGO/UosJWrTeFA09dImmmj0TeQ8S2KwHH
 GcOfoWCnMkC/qg/v8aSLVtbj6IORup/fq+oMyd9LdNRcNXg5DZrifzoJWcCXpXMX
 Q1dLYj1aL/JeLh842HxUH0YQI7CxlO/R2hLhYmCjO/ZFHDWpBUbjefv79P40ykV/
 jjCrU1roNPJipmS40puYbyMvPQTaGcXKAKq9n+fdBzuFUP5Sp4/bNPgA3rGO6ABw
 bSenFTfEuvEvSLds6oczSZk/hRhpBmcd865ryLG9ZiAerDX9cb21us0kIkvI6hwZ
 ywLzqRbWDPBrxXHZuUzoLbu4yIqY5wGCqpLmxH5CYoGcit7edlkdnaJPTCXBIen7
 +whoapOFGf5Mgh6hi7zKR9m53GtKTUt5MScVx3nk/iBmQ+OPKQ+DnukhYXXXggEj
 E7XzF8RWqEMMHd//V39RSAAJqNCS7K1t8XKpr0wYc1FP8YsPoiHP/tMNFnqoeptY
 hBQunoVkrDLIyKm/bL3VWFUJaOqEZajkrTvG9jKry+mzIVFjCboNFwDMZ5srEWuu
 XzqdoVvQEjOh1arLdK2KY2Y9xGPQAM/nrIMY8h/6CLHB10tniEP+Dl5y1r1yxtn3
 SJTAjGvN8GQ=
 =e9G9
 -----END PGP SIGNATURE-----

Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux

Pull clk fixes from Stephen Boyd:
 "Three fixes for the Qualcomm clk driver: two for regressions this
  merge window and one for a long-standing problem that only popped up
  now that eMMC is being used"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: qcom: gcc-sc7180: Use floor ops for the correct sdcc1 clk
  clk: qcom: rcg2: Rectify clk_gfx3d rate rounding without mux division
  clk: qcom: rpmh: Update the XO clock source for SC7280
2021-03-24 11:26:50 -07:00
Linus Torvalds
a0a4df6a9e platform-drivers-x86 for v5.12-2
Summary:
  - dell-wmi-sysman: A set of probe-error-exit-handling fixes to fix some systems
    which advertise the WMI GUIDs, but are not compatible, not booting
  - intel-vbtn/intel-hid: Misc. bugfixes
  - intel_pmc: Bug-fixes + a quirk to lower suspend power-consumption on Tiger Lake
  - thinkpad_acpi: Misc. bugfixes
 
 The following is an automated git shortlog grouped by driver:
 
 dell-wmi-sysman:
  -  Cleanup create_attributes_level_sysfs_files()
  -  Make sysman_init() return -ENODEV of the interfaces are not found
  -  Cleanup sysman_init() error-exit handling
  -  Fix release_attributes_data() getting called twice on init_bios_attributes() failure
  -  Make it safe to call exit_foo_attributes() multiple times
  -  Fix possible NULL pointer deref on exit
  -  Fix crash caused by calling kset_unregister twice
 
 intel-hid:
  -  Support Lenovo ThinkPad X1 Tablet Gen 2
 
 intel-vbtn:
  -  Stop reporting SW_DOCK events
 
 intel_pmc_core:
  -  Ignore GBE LTR on Tiger Lake platforms
  -  Update Kconfig
 
 intel_pmt_class:
  -  Initial resource to 0
 
 intel_pmt_crashlog:
  -  Fix incorrect macros
 
 thinkpad_acpi:
  -  Disable DYTC CQL mode around switching to balanced mode
  -  Allow the FnLock LED to change state
  -  check dytc version for lapmode sysfs
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEEuvA7XScYQRpenhd+kuxHeUQDJ9wFAmBbbL0UHGhkZWdvZWRl
 QHJlZGhhdC5jb20ACgkQkuxHeUQDJ9yYEgf/dWwTip21gYoi02mdsHsPduaL0Mtu
 grcZpnRSEuWvgl5P26zttmLjAK4rTyySePVBWsxDwH/4qBqY2DCicSQfeQke2/9c
 PJU3i8zXTQxlBUWFrjM8vqFKdTypFXJwpdoBGQD3JJAh8LcSQj5xkhhDQVJYIXLQ
 HIxVM44gPLZc/lHOFGUEtREc2/k2/A09pER6udvVGxSy/Vz1w646G3u9f5edi1jz
 jX5HIlEtEYpZ55E8bQSUcMIVpiv6HLAu5qQXQ+1xeQXXwM7mM6gRpG8Qr9Cy70Aq
 us0AA5AjYd4IudlgFtUQ7NOB5YYEs2WHiFx4+ck0DSE7CMzcamnUNNp7Tg==
 =ybep
 -----END PGP SIGNATURE-----

Merge tag 'platform-drivers-x86-v5.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86

Pull x86 platform drivers fixes from Hans de Goede:
 "A set of bug-fixes and some model specific quirks.

  Summary:

   - dell-wmi-sysman: A set of probe-error-exit-handling fixes to fix
     some systems which advertise the WMI GUIDs, but are not compatible,
     not booting

   - intel-vbtn/intel-hid: Misc. bugfixes

   - intel_pmc: Bug-fixes + a quirk to lower suspend power-consumption
     on Tiger Lake

   - thinkpad_acpi: misc bugfixes"

* tag 'platform-drivers-x86-v5.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
  platform/x86: intel_pmc_core: Ignore GBE LTR on Tiger Lake platforms
  platform/x86: intel_pmc_core: Update Kconfig
  platform/x86: intel_pmt_crashlog: Fix incorrect macros
  platform/x86: intel_pmt_class: Initial resource to 0
  platform/x86: intel-vbtn: Stop reporting SW_DOCK events
  platform/x86: dell-wmi-sysman: Cleanup create_attributes_level_sysfs_files()
  platform/x86: dell-wmi-sysman: Make sysman_init() return -ENODEV of the interfaces are not found
  platform/x86: dell-wmi-sysman: Cleanup sysman_init() error-exit handling
  platform/x86: dell-wmi-sysman: Fix release_attributes_data() getting called twice on init_bios_attributes() failure
  platform/x86: dell-wmi-sysman: Make it safe to call exit_foo_attributes() multiple times
  platform/x86: dell-wmi-sysman: Fix possible NULL pointer deref on exit
  platform/x86: dell-wmi-sysman: Fix crash caused by calling kset_unregister twice
  platform/x86: thinkpad_acpi: Disable DYTC CQL mode around switching to balanced mode
  platform/x86: thinkpad_acpi: Allow the FnLock LED to change state
  platform/x86: thinkpad_acpi: check dytc version for lapmode sysfs
  platform/x86: intel-hid: Support Lenovo ThinkPad X1 Tablet Gen 2
2021-03-24 11:21:01 -07:00
Linus Torvalds
8a9d2e133e cachefiles, afs: mm wait fixes
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEqG5UsNXhtOCrfGQP+7dXa6fLC2sFAmBaVsAACgkQ+7dXa6fL
 C2u/7w/8DU9UZN3IRgZzR47xw3qYlgNMWRoiJ2RwSHYDJcsFqziJ/6jN/MDr7vzc
 eo1XQnDUH1Ok02WNxI6iVIfkX6cC/SidCWs6mNevQ6ksn9ei8tG0ZUWLcUl1IA+O
 HzXxvouyL9aJB+aNTQXttoi8JaSuoW/HBV3MbjOLywsy41AicCpt0gI0AJgXHKe8
 nEz3mqWZpCywRTkVkt9sWFOMX2shUzy8SoFgLMNpDUgyMD4r98XVJdIH8X4Em3zE
 syLg92aOnxxTEOAAYefcOSsgDBIkxLqW6F/K884cTPgLC24RJ/LO+M4GoOWX1Cmj
 Gqy9DZ3TGTu9yXr6Cm32OMl6t1Y0rYnktNl1Z4OT0XibK4gxgohZEr811A1/pHHu
 OfPBIUAotKRS4o/scs8Au0+XMT0/R7qfsGZe+TUGzWG1CRzf+tOLMrgXPxWnh2fV
 E2eNfOzy2Ry5v0XB4Lb4tb0JVPM2WOBTbswgUIHUOLz7fT6+mVaFYK/8eDDu6EJH
 zmDxs7HLZvI6X6XB2DOCDDWJbzKk9Jo27raGV5o6QCwAKENIr8XAvgZBEg5+Quvc
 feNBNSWTplgB5ROPlRWgmy/Xh4Y4+uRMCzMN+q9FtC810bDCE5rY5TRnayxmx9ni
 XugpJnoMBM8QcbtHNxropGOg+gQpABYfSfZMmcNPd+Oyix3SbtQ=
 =/IaF
 -----END PGP SIGNATURE-----

Merge tag 'afs-cachefiles-fixes-20210323' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

Pull cachefiles and afs fixes from David Howells:
 "Fixes from Matthew Wilcox for page waiting-related issues in
  cachefiles and afs as extracted from his folio series[1]:

   - In cachefiles, remove the use of the wait_bit_key struct to access
     something that's actually in wait_page_key format. The proper
     struct is now available in the header, so that should be used
     instead.

   - Add a proper wait function for waiting killably on the page
     writeback flag. This includes a recent bugfix[2] that's not in the
     afs code.

   - In afs, use the function added in (2) rather than using
     wait_on_page_bit_killable() which doesn't provide the
     aforementioned bugfix"

Link: https://lore.kernel.org/r/20210320054104.1300774-1-willy@infradead.org[1]
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c2407cf7d22d0c0d94cf20342b3b8f06f1d904e7 [2]
Link: https://lore.kernel.org/r/20210323120829.GC1719932@casper.infradead.org/ # v1

* tag 'afs-cachefiles-fixes-20210323' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
  afs: Use wait_on_page_writeback_killable
  mm/writeback: Add wait_on_page_writeback_killable
  fs/cachefiles: Remove wait_bit_key layout dependency
2021-03-24 10:22:00 -07:00
Christian Brauner
bf1c82a538 cachefiles: do not yet allow on idmapped mounts
Based on discussions (e.g. in [1]) my understanding of cachefiles and
the cachefiles userspace daemon is that it creates a cache on a local
filesystem (e.g. ext4, xfs etc.) for a network filesystem. The way this
is done is by writing "bind" to /dev/cachefiles and pointing it to the
directory to use as the cache.

Currently this directory can technically also be an idmapped mount but
cachefiles aren't yet fully aware of such mounts and thus don't take the
idmapping into account when creating cache entries. This could leave
users confused as the ownership of the files wouldn't match to what they
expressed in the idmapping. Block cache files on idmapped mounts until
the fscache rework is done and we have ported it to support idmapped
mounts.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: linux-cachefs@redhat.com
Link: https://lore.kernel.org/lkml/20210303161528.n3jzg66ou2wa43qb@wittgenstein [1]
Link: https://lore.kernel.org/r/20210316112257.2974212-1-christian.brauner@ubuntu.com/ # v1
Link: https://listman.redhat.com/archives/linux-cachefs/2021-March/msg00044.html # v2
Link: https://lore.kernel.org/r/20210319114146.410329-1-christian.brauner@ubuntu.com/ # v3
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-03-24 10:20:22 -07:00
Steffen Klassert
b1e3a56070 xfrm: Fix NULL pointer dereference on policy lookup
When xfrm interfaces are used in combination with namespaces
and ESP offload, we get a dst_entry NULL pointer dereference.
This is because we don't have a dst_entry attached in the ESP
offloading case and we need to do a policy lookup before the
namespace transition.

Fix this by expicit checking of skb_dst(skb) before accessing it.

Fixes: f203b76d78 ("xfrm: Add virtual xfrm interfaces")
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2021-03-24 10:00:24 +01:00
Xin Long
68dc022d04 xfrm: BEET mode doesn't support fragments for inner packets
BEET mode replaces the IP(6) Headers with new IP(6) Headers when sending
packets. However, when it's a fragment before the replacement, currently
kernel keeps the fragment flag and replace the address field then encaps
it with ESP. It would cause in RX side the fragments to get reassembled
before decapping with ESP, which is incorrect.

In Xiumei's testing, these fragments went over an xfrm interface and got
encapped with ESP in the device driver, and the traffic was broken.

I don't have a good way to fix it, but only to warn this out in dmesg.

Reported-by: Xiumei Mu <xmu@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2021-03-24 09:58:19 +01:00
Vladimir Oltean
6ab4c3117a net: bridge: don't notify switchdev for local FDB addresses
As explained in this discussion:
https://lore.kernel.org/netdev/20210117193009.io3nungdwuzmo5f7@skbuf/

the switchdev notifiers for FDB entries managed to have a zero-day bug.
The bridge would not say that this entry is local:

ip link add br0 type bridge
ip link set swp0 master br0
bridge fdb add dev swp0 00:01:02:03:04:05 master local

and the switchdev driver would be more than happy to offload it as a
normal static FDB entry. This is despite the fact that 'local' and
non-'local' entries have completely opposite directions: a local entry
is locally terminated and not forwarded, whereas a static entry is
forwarded and not locally terminated. So, for example, DSA would install
this entry on swp0 instead of installing it on the CPU port as it should.

There is an even sadder part, which is that the 'local' flag is implicit
if 'static' is not specified, meaning that this command produces the
same result of adding a 'local' entry:

bridge fdb add dev swp0 00:01:02:03:04:05 master

I've updated the man pages for 'bridge', and after reading it now, it
should be pretty clear to any user that the commands above were broken
and should have never resulted in the 00:01:02:03:04:05 address being
forwarded (this behavior is coherent with non-switchdev interfaces):
https://patchwork.kernel.org/project/netdevbpf/cover/20210211104502.2081443-1-olteanv@gmail.com/
If you're a user reading this and this is what you want, just use:

bridge fdb add dev swp0 00:01:02:03:04:05 master static

Because switchdev should have given drivers the means from day one to
classify FDB entries as local/non-local, but didn't, it means that all
drivers are currently broken. So we can just as well omit the switchdev
notifications for local FDB entries, which is exactly what this patch
does to close the bug in stable trees. For further development work
where drivers might want to trap the local FDB entries to the host, we
can add a 'bool is_local' to br_switchdev_fdb_call_notifiers(), and
selectively make drivers act upon that bit, while all the others ignore
those entries if the 'is_local' bit is set.

Fixes: 6b26b51b1d ("net: bridge: Add support for notifying devices about FDB add/del")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-23 14:39:41 -07:00
Marcelo Ricardo Leitner
8ca1b090e5 net/sched: act_ct: clear post_ct if doing ct_clear
Invalid detection works with two distinct moments: act_ct tries to find
a conntrack entry and set post_ct true, indicating that that was
attempted. Then, when flow dissector tries to dissect CT info and no
entry is there, it knows that it was tried and no entry was found, and
synthesizes/sets
                  key->ct_state = TCA_FLOWER_KEY_CT_FLAGS_TRACKED |
                                  TCA_FLOWER_KEY_CT_FLAGS_INVALID;
mimicing what OVS does.

OVS has this a bit more streamlined, as it recomputes the key after
trying to find a conntrack entry for it.

Issue here is, when we have 'tc action ct clear', it didn't clear
post_ct, causing a subsequent match on 'ct_state -trk' to fail, due to
the above. The fix, thus, is to clear it.

Reproducer rules:
tc filter add dev enp130s0f0np0_0 ingress prio 1 chain 0 \
	protocol ip flower ip_proto tcp ct_state -trk \
	action ct zone 1 pipe \
	action goto chain 2
tc filter add dev enp130s0f0np0_0 ingress prio 1 chain 2 \
	protocol ip flower \
	action ct clear pipe \
	action goto chain 4
tc filter add dev enp130s0f0np0_0 ingress prio 1 chain 4 \
	protocol ip flower ct_state -trk \
	action mirred egress redirect dev enp130s0f1np1_0

With the fix, the 3rd rule matches, like it does with OVS kernel
datapath.

Fixes: 7baf2429a1 ("net/sched: cls_flower add CT_FLAGS_INVALID flag support")
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-23 14:32:26 -07:00
Matthew Wilcox (Oracle)
75b6979961 afs: Use wait_on_page_writeback_killable
Open-coding this function meant it missed out on the recent bugfix
for waiters being woken by a delayed wake event from a previous
instantiation of the page[1].

[DH: Changed the patch to use vmf->page rather than variable page which
 doesn't exist yet upstream]

Fixes: 1cf7a1518a ("afs: Implement shared-writeable mmap")
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: kafs-testing@auristor.com
cc: linux-afs@lists.infradead.org
cc: linux-mm@kvack.org
Link: https://lore.kernel.org/r/20210320054104.1300774-4-willy@infradead.org
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c2407cf7d22d0c0d94cf20342b3b8f06f1d904e7 [1]
2021-03-23 20:54:37 +00:00
Matthew Wilcox (Oracle)
e5dbd33218 mm/writeback: Add wait_on_page_writeback_killable
This is the killable version of wait_on_page_writeback.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: kafs-testing@auristor.com
cc: linux-afs@lists.infradead.org
cc: linux-mm@kvack.org
Link: https://lore.kernel.org/r/20210320054104.1300774-3-willy@infradead.org
2021-03-23 20:54:29 +00:00
Matthew Wilcox (Oracle)
39f985c8f6 fs/cachefiles: Remove wait_bit_key layout dependency
Cachefiles was relying on wait_page_key and wait_bit_key being the
same layout, which is fragile.  Now that wait_page_key is exposed in
the pagemap.h header, we can remove that fragility

A comment on the need to maintain structure layout equivalence was added by
Linus[1] and that is no longer applicable.

Fixes: 6290602709 ("mm: add PageWaiters indicating tasks are waiting for a page bit")
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: kafs-testing@auristor.com
cc: linux-cachefs@redhat.com
cc: linux-mm@kvack.org
Link: https://lore.kernel.org/r/20210320054104.1300774-2-willy@infradead.org/
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3510ca20ece0150af6b10c77a74ff1b5c198e3e2 [1]
2021-03-23 20:54:29 +00:00
David E. Box
d1635448f1 platform/x86: intel_pmc_core: Ignore GBE LTR on Tiger Lake platforms
Due to a HW limitation, the Latency Tolerance Reporting (LTR) value
programmed in the Tiger Lake GBE controller is not large enough to allow
the platform to enter Package C10, which in turn prevents the platform from
achieving its low power target during suspend-to-idle.  Ignore the GBE LTR
value on Tiger Lake. LTR ignore functionality is currently performed solely
by a debugfs write call. Split out the LTR code into its own function that
can be called by both the debugfs writer and by this work around.

Signed-off-by: David E. Box <david.e.box@linux.intel.com>
Reviewed-by: Sasha Neftin <sasha.neftin@intel.com>
Cc: intel-wired-lan@lists.osuosl.org
Reviewed-by: Rajneesh Bhardwaj <irenic.rajneesh@gmail.com>
Link: https://lore.kernel.org/r/20210319201844.3305399-2-david.e.box@linux.intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2021-03-23 21:50:14 +01:00
David E. Box
269b04a509 platform/x86: intel_pmc_core: Update Kconfig
The intel_pmc_core driver is mostly used as a debugging driver for Intel
platforms that support SLPS0 (S0ix). But the driver may also be used to
communicate actions to the PMC in order to ensure transition to SLPS0 on
some systems and architectures. As such the driver should be built on all
platforms it supports. Indicate this in the Kconfig. Also update the list
of supported features.

Signed-off-by: David E. Box <david.e.box@linux.intel.com>
Suggested-by: Mario Limonciello <mario.limonciello@dell.com>
Link: https://lore.kernel.org/r/20210319201844.3305399-1-david.e.box@linux.intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2021-03-23 21:50:08 +01:00
David E. Box
10c931cdfe platform/x86: intel_pmt_crashlog: Fix incorrect macros
Fixes off-by-one bugs in the macro assignments for the crashlog control
bits. Was initially tested on emulation but bug revealed after testing on
silicon.

Fixes: 5ef9998c96 ("platform/x86: Intel PMT Crashlog capability driver")
Signed-off-by: David E. Box <david.e.box@linux.intel.com>
Link: https://lore.kernel.org/r/20210317024455.3071477-2-david.e.box@linux.intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2021-03-23 21:50:02 +01:00
David E. Box
7547deff8a platform/x86: intel_pmt_class: Initial resource to 0
Initialize the struct resource in intel_pmt_dev_register to zero to avoid a
fault should the char *name field be non-zero.

Signed-off-by: David E. Box <david.e.box@linux.intel.com>
Link: https://lore.kernel.org/r/20210317024455.3071477-1-david.e.box@linux.intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2021-03-23 21:49:56 +01:00
Linus Torvalds
7acac4b319 linux-kselftest-kunit-fixes-5.12-rc5.1
This KUnit update for Linux 5.12-rc5 consists of two fixes to kunit
 tool from David Gow.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmBaFeoACgkQCwJExA0N
 Qxz0Gg//Y3mX/SmBWYCB32Vm3nXOIqEPxDuTxn4V6l1XoSOjViDlqPjVMIeQ1/Cv
 R8Hx0rXjzU9e9tuQaHSCnp89vqjnhIB3ROPWABsL2784EDsCpIqlT2ivey9wwBva
 VImaTDFn4xlJjJG1L/vmlBzagczk02qSsvadbwC/+vLQSfSZJTZWD9c6TpmfF82T
 P+FbWs8+WgpilsPOCYqXRvH8HDSWR/oXFOFxJTg64N3c5Vj2JIgnBeaCs+yahvEA
 KnAt8988uhHH15ROsa93g6dIvQ4C7NT74IKZW8VLvkkUxAk86clQYRcZZdCBbZhc
 yL3X1RhP/8a452df2LO0xraZTaxc2XI8H1F2ZhsJmmeBL2r2dyEe/4mWQD944e0h
 ZGQqIi4+WiMV/+WyTwp05KxvPBaiG0GFFLznNEPQQLuP9Iwh8QBXnBpg+pw58pQ1
 rk0fi8H0IaDjQjbAKiILnxRuINMAhoMOaYuYpMy+mD60AwyPOCAv1dHRInS8SG9D
 UZxTDFyOQDJqSmLueqULbaWQ/21Jg8Jui4Mowf6DpXSXdPpJt4wHffBnWVW1OHqC
 pYTCFAZ+S6bSoEeTvnyHXKO9MtsS0kxS1O5V5LS+Sc0sUcf3sxr83gU9E7Jvnmum
 Os4HyFr5H4AAS4Yif2vgmB7Y2kIIgWViIcKzcGfXha8im8vcooM=
 =uSNy
 -----END PGP SIGNATURE-----

Merge tag 'linux-kselftest-kunit-fixes-5.12-rc5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull KUnit fixes from Shuah Khan:
 "Two fixes to the kunit tool from David Gow"

* tag 'linux-kselftest-kunit-fixes-5.12-rc5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  kunit: tool: Disable PAGE_POISONING under --alltests
  kunit: tool: Fix a python tuple typing error
2021-03-23 10:18:08 -07:00
Hans de Goede
bd83a2fc05 brcmfmac: p2p: Fix recently introduced deadlock issue
Commit a05829a722 ("cfg80211: avoid holding the RTNL when calling the
driver") replaced the rtnl_lock parameter passed to various brcmf
functions with just lock, because since that commit it is not just
about the rtnl_lock but also about the wiphy_lock .

During this search/replace the "if (!rtnl_locked)" check in brcmfmac/p2p.c
was accidentally replaced with "if (locked)", dropping the inversion of
the check. This causes the code to now call rtnl_lock() while already
holding the lock, causing a deadlock.

Add back the "!" to the if-condition to fix this.

Cc: Johannes Berg <johannes.berg@intel.com>
Fixes: a05829a722 ("cfg80211: avoid holding the RTNL when calling the driver")
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210313143635.109154-1-hdegoede@redhat.com
2021-03-23 11:37:15 +02:00
Lorenzo Bianconi
8f6a70fd71 mt76: mt7921: fix airtime reporting
Fix {tx,rx}_airtime reporting for mt7921 driver. Wrong register definitions
trigger a tx hangs before resetting airtime stats.

Fixes: 163f4d22c1 ("mt76: mt7921: add MAC support")
Tested-by: Leon Yen <leon.yen@mediatek.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Acked-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/c05333be0e3e85a50a71bb2de81034fe425e3701.1615284335.git.lorenzo@kernel.org
2021-03-23 11:35:50 +02:00
Jiri Kosina
2800aadc18 iwlwifi: Fix softirq/hardirq disabling in iwl_pcie_enqueue_hcmd()
It's possible for iwl_pcie_enqueue_hcmd() to be called with hard IRQs
disabled (e.g. from LED core). We can't enable BHs in such a situation.

Turn the unconditional BH-enable/BH-disable code into
hardirq-disable/conditional-enable.

This fixes the warning below.

 WARNING: CPU: 1 PID: 1139 at kernel/softirq.c:178 __local_bh_enable_ip+0xa5/0xf0
 CPU: 1 PID: 1139 Comm: NetworkManager Not tainted 5.12.0-rc1-00004-gb4ded168af79 #7
 Hardware name: LENOVO 20K5S22R00/20K5S22R00, BIOS R0IET38W (1.16 ) 05/31/2017
 RIP: 0010:__local_bh_enable_ip+0xa5/0xf0
 Code: f7 69 e8 ee 23 14 00 fb 66 0f 1f 44 00 00 65 8b 05 f0 f4 f7 69 85 c0 74 3f 48 83 c4 08 5b c3 65 8b 05 9b fe f7 69 85 c0 75 8e <0f> 0b eb 8a 48 89 3c 24 e8 4e 20 14 00 48 8b 3c 24 eb 91 e8 13 4e
 RSP: 0018:ffffafd580b13298 EFLAGS: 00010046
 RAX: 0000000000000000 RBX: 0000000000000201 RCX: 0000000000000000
 RDX: 0000000000000003 RSI: 0000000000000201 RDI: ffffffffc1272389
 RBP: ffff96517ae4c018 R08: 0000000000000001 R09: 0000000000000000
 R10: ffffafd580b13178 R11: 0000000000000001 R12: ffff96517b060000
 R13: 0000000000000000 R14: ffffffff80000000 R15: 0000000000000001
 FS:  00007fc604ebefc0(0000) GS:ffff965267480000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 000055fb3fef13b2 CR3: 0000000109112004 CR4: 00000000003706e0
 Call Trace:
  ? _raw_spin_unlock_bh+0x1f/0x30
  iwl_pcie_enqueue_hcmd+0x5d9/0xa00 [iwlwifi]
  iwl_trans_txq_send_hcmd+0x6c/0x430 [iwlwifi]
  iwl_trans_send_cmd+0x88/0x170 [iwlwifi]
  ? lock_acquire+0x277/0x3d0
  iwl_mvm_send_cmd+0x32/0x80 [iwlmvm]
  iwl_mvm_led_set+0xc2/0xe0 [iwlmvm]
  ? led_trigger_event+0x46/0x70
  led_trigger_event+0x46/0x70
  ieee80211_do_open+0x5c5/0xa20 [mac80211]
  ieee80211_open+0x67/0x90 [mac80211]
  __dev_open+0xd4/0x150
  __dev_change_flags+0x19e/0x1f0
  dev_change_flags+0x23/0x60
  do_setlink+0x30d/0x1230
  ? lock_is_held_type+0xb4/0x120
  ? __nla_validate_parse.part.7+0x57/0xcb0
  ? __lock_acquire+0x2e1/0x1a50
  __rtnl_newlink+0x560/0x910
  ? __lock_acquire+0x2e1/0x1a50
  ? __lock_acquire+0x2e1/0x1a50
  ? lock_acquire+0x277/0x3d0
  ? sock_def_readable+0x5/0x290
  ? lock_is_held_type+0xb4/0x120
  ? find_held_lock+0x2d/0x90
  ? sock_def_readable+0xb3/0x290
  ? lock_release+0x166/0x2a0
  ? lock_is_held_type+0x90/0x120
  rtnl_newlink+0x47/0x70
  rtnetlink_rcv_msg+0x25c/0x470
  ? netlink_deliver_tap+0x97/0x3e0
  ? validate_linkmsg+0x350/0x350
  netlink_rcv_skb+0x50/0x100
  netlink_unicast+0x1b2/0x280
  netlink_sendmsg+0x336/0x450
  sock_sendmsg+0x5b/0x60
  ____sys_sendmsg+0x1ed/0x250
  ? copy_msghdr_from_user+0x5c/0x90
  ___sys_sendmsg+0x88/0xd0
  ? lock_is_held_type+0xb4/0x120
  ? find_held_lock+0x2d/0x90
  ? lock_release+0x166/0x2a0
  ? __fget_files+0xfe/0x1d0
  ? __sys_sendmsg+0x5e/0xa0
  __sys_sendmsg+0x5e/0xa0
  ? lockdep_hardirqs_on_prepare+0xd9/0x170
  do_syscall_64+0x33/0x80
  entry_SYSCALL_64_after_hwframe+0x44/0xae
 RIP: 0033:0x7fc605c9572d
 Code: 28 89 54 24 1c 48 89 74 24 10 89 7c 24 08 e8 da ee ff ff 8b 54 24 1c 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 33 44 89 c7 48 89 44 24 08 e8 2e ef ff ff 48
 RSP: 002b:00007fffc83789f0 EFLAGS: 00000293 ORIG_RAX: 000000000000002e
 RAX: ffffffffffffffda RBX: 000055ef468570c0 RCX: 00007fc605c9572d
 RDX: 0000000000000000 RSI: 00007fffc8378a30 RDI: 000000000000000c
 RBP: 0000000000000010 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000000
 R13: 00007fffc8378b80 R14: 00007fffc8378b7c R15: 0000000000000000
 irq event stamp: 170785
 hardirqs last  enabled at (170783): [<ffffffff9609a8c2>] __local_bh_enable_ip+0x82/0xf0
 hardirqs last disabled at (170784): [<ffffffff96a8613d>] _raw_read_lock_irqsave+0x8d/0x90
 softirqs last  enabled at (170782): [<ffffffffc1272389>] iwl_pcie_enqueue_hcmd+0x5d9/0xa00 [iwlwifi]
 softirqs last disabled at (170785): [<ffffffffc1271ec6>] iwl_pcie_enqueue_hcmd+0x116/0xa00 [iwlwifi]

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM/Clang v12.0.0-rc3
Acked-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/nycvar.YFH.7.76.2103021125430.12405@cbobk.fhfr.pm
2021-03-23 11:34:57 +02:00
Andy Shevchenko
a61f4661fb mfd: intel_quark_i2c_gpio: Revert "Constify static struct resources"
The structures are used as place holders, so they are modified at run-time.
Obviously they may not be constants.

  BUG: unable to handle page fault for address: d0643220
  ...
  CPU: 0 PID: 110 Comm: modprobe Not tainted 5.11.0+ #1
  Hardware name: Intel Corp. QUARK/GalileoGen2, BIOS 0x01000200 01/01/2014
  EIP: intel_quark_mfd_probe+0x93/0x1c0 [intel_quark_i2c_gpio]

This partially reverts the commit c4a164f415.

While at it, add a comment to avoid similar changes in the future.

Fixes: c4a164f415 ("mfd: Constify static struct resources")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Tested-by: Tong Zhang <ztong0001@gmail.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
2021-03-23 09:14:12 +00:00
George McCollister
e0c755a45f net: dsa: don't assign an error value to tag_ops
Use a temporary variable to hold the return value from
dsa_tag_driver_get() instead of assigning it to dst->tag_ops. Leaving
an error value in dst->tag_ops can result in deferencing an invalid
pointer when a deferred switch configuration happens later.

Fixes: 357f203bb3 ("net: dsa: keep a copy of the tagging protocol in the DSA switch tree")

Signed-off-by: George McCollister <george.mccollister@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-22 17:24:42 -07:00
David S. Miller
8fb16e80cb mlx5-fixes-2021-03-22
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAmBY+ykACgkQSD+KveBX
 +j7oyAgAy6RtOHXTRPvMU2H6iTOO48fTUiPVQpQEZEQdk0GuhsBhbPG19u8GFqJu
 LlVBc90c8ZCbve84u9BRrBkZJUM9mVKuHOXsqFc7SeUuedSnaBtziAOYThWnmPkq
 uO5KvBS4Rbjbh6PXeSmjPwuPzEWBZlKYEbbCyrO7kSm3p9HWjhudHqpd/fLQ+Kxc
 NaMqieD3O2HkMKO1+RmZSanokLixmhF1h25uIVNwIVnCniz4qsLy02fQt7lzw0l5
 VZeEgZBME3rwrrsGtgl29oSbZEGIV10bPw0k2OoCUVX1yrX8zHCZzYdmIUVRUtqP
 hlLC1xuwqFq5Xyd6RMCb3FRbYZv17g==
 =mbVa
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-fixes-2021-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5 fixes 2021-03-22

This series introduces some fixes to mlx5 driver.
Please pull and let me know if there is any problem.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-22 17:00:48 -07:00
Arnd Bergmann
5ee7d4c7fb isdn: capi: fix mismatched prototypes
gcc-11 complains about a prototype declaration that is different
from the function definition:

drivers/isdn/capi/kcapi.c:724:44: error: argument 2 of type ‘u8 *’ {aka ‘unsigned char *’} declared as a pointer [-Werror=array-parameter=]
  724 | u16 capi20_get_manufacturer(u32 contr, u8 *buf)
      |                                        ~~~~^~~
In file included from drivers/isdn/capi/kcapi.c:13:
drivers/isdn/capi/kcapi.h:62:43: note: previously declared as an array ‘u8[64]’ {aka ‘unsigned char[64]’}
   62 | u16 capi20_get_manufacturer(u32 contr, u8 buf[CAPI_MANUFACTURER_LEN]);
      |                                        ~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/isdn/capi/kcapi.c:790:38: error: argument 2 of type ‘u8 *’ {aka ‘unsigned char *’} declared as a pointer [-Werror=array-parameter=]
  790 | u16 capi20_get_serial(u32 contr, u8 *serial)
      |                                  ~~~~^~~~~~
In file included from drivers/isdn/capi/kcapi.c:13:
drivers/isdn/capi/kcapi.h:64:37: note: previously declared as an array ‘u8[8]’ {aka ‘unsigned char[8]’}
   64 | u16 capi20_get_serial(u32 contr, u8 serial[CAPI_SERIAL_LEN]);
      |                                  ~~~^~~~~~~~~~~~~~~~~~~~~~~

Change the definition to make them match.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-22 16:51:11 -07:00
Parav Pandit
7c1ef1959b net/mlx5: SF, do not use ecpu bit for vhca state processing
Device firmware doesn't handle ecpu bit for vhca state processing
events and commands. Instead device firmware refers to the unique
function id to distinguish SF of different PCI functions.

When ecpu bit is used, firmware returns a syndrome.

mlx5_cmd_check:780:(pid 872): MODIFY_VHCA_STATE(0xb0e) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x263211)
mlx5_sf_dev_table_create:248:(pid 872): SF DEV table create err = -22

Hence, avoid using ecpu bit.

Fixes: 8f01054186 ("net/mlx5: SF, Add port add delete functionality")
Fixes: 90d010b863 ("net/mlx5: SF, Add auxiliary device support")
Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Vu Pham <vuhuong@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-03-22 13:16:41 -07:00
Maxim Mikityanskiy
846d6da1fc net/mlx5e: Fix division by 0 in mlx5e_select_queue
mlx5e_select_queue compares num_tc_x_num_ch to real_num_tx_queues to
determine if HTB and/or PTP offloads are active. If they are, it
calculates netdev_pick_tx() % num_tc_x_num_ch to prevent it from
selecting HTB and PTP queues for regular traffic. However, before the
channels are first activated, num_tc_x_num_ch is zero. If
ndo_select_queue gets called at this point, the HTB/PTP check will pass,
and mlx5e_select_queue will attempt to take a modulo by num_tc_x_num_ch,
which equals to zero.

This commit fixes the bug by assigning num_tc_x_num_ch to a non-zero
value before registering the netdev.

Fixes: 214baf2287 ("net/mlx5e: Support HTB offload")
Reported-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-03-22 13:16:41 -07:00
Aya Levin
4eacfe72e3 net/mlx5e: Fix error path for ethtool set-priv-flag
Expose error value when failing to comply to command:
$ ethtool --set-priv-flags eth2 rx_cqe_compress [on/off]

Fixes: be7e87f92b ("net/mlx5e: Fail safe cqe compressing/moderation mode setting")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-03-22 13:16:41 -07:00
Dima Chumak
96b5b45858 net/mlx5e: Offload tuple rewrite for non-CT flows
Setting connection tracking OVS flows and then setting non-CT flows that
use tuple rewrite action (e.g. mod_tp_dst), causes the latter flows not
being offloaded.

Fix by using a stricter condition in modify_header_match_supported() to
check tuple rewrite support only for flows with CT action. The check is
factored out into standalone modify_tuple_supported() function to aid
readability.

Fixes: 7e36feeb04 ("net/mlx5e: CT: Don't offload tuple rewrites for established tuples")
Signed-off-by: Dima Chumak <dchumak@nvidia.com>
Reviewed-by: Paul Blakey <paulb@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-03-22 13:16:40 -07:00
Alaa Hleihel
7d6c86e3cc net/mlx5e: Allow to match on MPLS parameters only for MPLS over UDP
Currently, we support hardware offload only for MPLS over UDP.
However, rules matching on MPLS parameters are now wrongly offloaded
for regular MPLS, without actually taking the parameters into
consideration when doing the offload.
Fix it by rejecting such unsupported rules.

Fixes: 72046a91d1 ("net/mlx5e: Allow to match on mpls parameters")
Signed-off-by: Alaa Hleihel <alaa@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-03-22 13:16:40 -07:00
Huy Nguyen
a07231084d net/mlx5: Add back multicast stats for uplink representor
The multicast counter got removed from uplink representor due to the
cited patch.

Fixes: 47c97e6b10 ("net/mlx5e: Fix multicast counter not up-to-date in "ip -s"")
Signed-off-by: Huy Nguyen <huyn@nvidia.com>
Reviewed-by: Daniel Jurgens <danielj@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-03-22 13:16:40 -07:00
Vladimir Oltean
a50a151e31 net: ipconfig: ic_dev can be NULL in ic_close_devs
ic_close_dev contains a generalization of the logic to not close a
network interface if it's the host port for a DSA switch. This logic is
disguised behind an iteration through the lowers of ic_dev in
ic_close_dev.

When no interface for ipconfig can be found, ic_dev is NULL, and
ic_close_dev:
- dereferences a NULL pointer when assigning selected_dev
- would attempt to search through the lower interfaces of a NULL
  net_device pointer

So we should protect against that case.

The "lower_dev" iterator variable was shortened to "lower" in order to
keep the 80 character limit.

Fixes: f68cbaed67 ("net: ipconfig: avoid use-after-free in ic_close_devs")
Fixes: 46acf7bdbc ("Revert "net: ipv4: handle DSA enabled master network devices"")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Heiko Thiery <heiko.thiery@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-22 12:57:51 -07:00
Jonathan Neuschäfer
6debc0fd71 MAINTAINERS: Combine "QLOGIC QLGE 10Gb ETHERNET DRIVER" sections into one
There ended up being two sections with the same title. Combine the two
into one section.

Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Cc: Manish Chopra <manishc@marvell.com>
Cc: Coiby Xu <coiby.xu@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-22 12:37:26 -07:00
Linus Torvalds
8419639062 selinux/stable-5.12 PR 20210322
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmBYx3gUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXPSSw/+MnJxbBEfxMXll2LwCRXvyW0Q/F++
 sSLPKZL9B5E7jANbTBlkUW+tMwsckTS7euPvRuJj2+mrSujRnSTl158JAAcn34gd
 lpiGQpttFZD75Eh9sLNg0OZ7PflwQvAzHt52EweD8/OE5O8BLBg7o56SYMr3LkGu
 Up9YcZPHNlj+NhfvWebv3jSB6dv392cG33iZoqmW81wSzmlXHGdzS5UTiIFnsp3X
 kbhLKaZWDSBHuAVMuAxtx3x3sQO1ElfFHxKRYM1fzfl0BMy30Wv6YnXHW2nn08Hr
 oT26968C0Rl9carTnA+G60Nj4WoTWW2dF20Mih+05vkpqFLjdMtFra7fFndbmfNi
 f7Gj5DJNrbunX1dMFJkyPnO/1x74RFUhZbCKm5ffvmF8AcYVivbbsyUAy/xduPWo
 m9hjXDVZLUbWxGBUFxyJD6qQw/wuz+qII8B7SBCKaDdCtM74TlXBVug8prrPcWHV
 tO3ljjbxEjBJ6zsFIJ9IlV3rJTL0v4RbAELXXp5qcZOJpnUtuH8cxj0Ryzo3yCY5
 g/m6IHhm5OfJ5TBSc5UIj2NJQi7sJ+Yv/++lms+RB2MVopx4lJ+UK7140gCA40iC
 1EPOGXCnB/b1k5F38dqdpI5MD+/uAzOMusQvPfL4x0xoQidzsqDmqgaS+V8pIYl6
 nisL4eEe2K7PWX4=
 =mFaE
 -----END PGP SIGNATURE-----

Merge tag 'selinux-pr-20210322' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux

Pull selinux fixes from Paul Moore:
 "Three SELinux patches:

   - Fix a problem where a local variable is used outside its associated
     function. Thankfully this can only be triggered by reloading the
     SELinux policy, which is a restricted operation for other obvious
     reasons.

   - Fix some incorrect, and inconsistent, audit and printk messages
     when loading the SELinux policy.

  All three patches are relatively minor and have been through our
  testing with no failures"

* tag 'selinux-pr-20210322' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
  selinuxfs: unify policy load error reporting
  selinux: fix variable scope issue in live sidtab conversion
  selinux: don't log MAC_POLICY_LOAD record on failed policy load
2021-03-22 11:34:31 -07:00
Andre Przywara
7011d72588 kselftest/arm64: sve: Do not use non-canonical FFR register value
The "First Fault Register" (FFR) is an SVE register that mimics a
predicate register, but clears bits when a load or store fails to handle
an element of a vector. The supposed usage scenario is to initialise
this register (using SETFFR), then *read* it later on to learn about
elements that failed to load or store. Explicit writes to this register
using the WRFFR instruction are only supposed to *restore* values
previously read from the register (for context-switching only).
As the manual describes, this register holds only certain values, it:
"... contains a monotonic predicate value, in which starting from bit 0
there are zero or more 1 bits, followed only by 0 bits in any remaining
bit positions."
Any other value is UNPREDICTABLE and is not supposed to be "restored"
into the register.

The SVE test currently tries to write a signature pattern into the
register, which is *not* a canonical FFR value. Apparently the existing
setups treat UNPREDICTABLE as "read-as-written", but a new
implementation actually only stores canonical values. As a consequence,
the sve-test fails immediately when comparing the FFR value:
-----------
 # ./sve-test
Vector length:  128 bits
PID:    207
Mismatch: PID=207, iteration=0, reg=48
        Expected [cf00]
        Got      [0f00]
Aborted
-----------

Fix this by only populating the FFR with proper canonical values.
Effectively the requirement described above limits us to 17 unique
values over 16 bits worth of FFR, so we condense our signature down to 4
bits (2 bits from the PID, 2 bits from the generation) and generate the
canonical pattern from it. Any bits describing elements above the
minimum 128 bit are set to 0.

This aligns the FFR usage to the architecture and fixes the test on
microarchitectures implementing FFR in a more restricted way.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviwed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210319120128.29452-1-andre.przywara@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-03-22 12:49:57 +00:00
Pavel Tatashin
ee7febce05 arm64: mm: correct the inside linear map range during hotplug check
Memory hotplug may fail on systems with CONFIG_RANDOMIZE_BASE because the
linear map range is not checked correctly.

The start physical address that linear map covers can be actually at the
end of the range because of randomization. Check that and if so reduce it
to 0.

This can be verified on QEMU with setting kaslr-seed to ~0ul:

memstart_offset_seed = 0xffff
START: __pa(_PAGE_OFFSET(vabits_actual)) = ffff9000c0000000
END:   __pa(PAGE_END - 1) =  1000bfffffff

Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
Fixes: 58284a901b ("arm64/mm: Validate hotplug range before creating linear mapping")
Tested-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20210216150351.129018-2-pasha.tatashin@soleen.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-03-22 12:47:40 +00:00
Pavel Tatashin
141f8202cf arm64: kdump: update ppos when reading elfcorehdr
The ppos points to a position in the old kernel memory (and in case of
arm64 in the crash kernel since elfcorehdr is passed as a segment). The
function should update the ppos by the amount that was read. This bug is
not exposed by accident, but other platforms update this value properly.
So, fix it in ARM64 version of elfcorehdr_read() as well.

Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
Fixes: e62aaeac42 ("arm64: kdump: provide /proc/vmcore file")
Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Link: https://lore.kernel.org/r/20210319205054.743368-1-pasha.tatashin@soleen.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-03-22 12:46:38 +00:00
Bhaskar Chowdhury
d1296f1265 arm64: cpuinfo: Fix a typo
s/acurate/accurate/

Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/r/20210319222848.29928-1-unixbhaskar@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-03-22 12:45:13 +00:00
Tom Saeger
e14a371f73 Documentation: arm64/acpi : clarify arm64 support of IBFT
In commit 94bccc3407 ("iscsi_ibft: make ISCSI_IBFT dependson ACPI instead
of ISCSI_IBFT_FIND") Kconfig was disentangled to make ISCSI_IBFT selection
not depend on x86.

Update arm64 acpi documentation, changing IBFT support status from
"Not Supported" to "Optional".
Opportunistically re-flow paragraph for changed lines.

Link: https://lore.kernel.org/lkml/1563475054-10680-1-git-send-email-thomas.tai@oracle.com/

Signed-off-by: Tom Saeger <tom.saeger@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Link: https://lore.kernel.org/r/9efc652df2b8d6b53d9acb170eb7c9ca3938dfef.1615920441.git.tom.saeger@oracle.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-03-22 12:43:20 +00:00
Mark Rutland
c607ab4f91 arm64: stacktrace: don't trace arch_stack_walk()
We recently converted arm64 to use arch_stack_walk() in commit:

  5fc57df2f6 ("arm64: stacktrace: Convert to ARCH_STACKWALK")

The core stacktrace code expects that (when tracing the current task)
arch_stack_walk() starts a trace at its caller, and does not include
itself in the trace. However, arm64's arch_stack_walk() includes itself,
and so traces include one more entry than callers expect. The core
stacktrace code which calls arch_stack_walk() tries to skip a number of
entries to prevent itself appearing in a trace, and the additional entry
prevents skipping one of the core stacktrace functions, leaving this in
the trace unexpectedly.

We can fix this by having arm64's arch_stack_walk() begin the trace with
its caller. The first value returned by the trace will be
__builtin_return_address(0), i.e. the caller of arch_stack_walk(). The
first frame record to be unwound will be __builtin_frame_address(1),
i.e. the caller's frame record. To prevent surprises, arch_stack_walk()
is also marked noinline.

While __builtin_frame_address(1) is not safe in portable code, local GCC
developers have confirmed that it is safe on arm64. To find the caller's
frame record, the builtin can safely dereference the current function's
frame record or (in theory) could stash the original FP into another GPR
at function entry time, neither of which are problematic.

Prior to this patch, the tracing code would unexpectedly show up in
traces of the current task, e.g.

| # cat /proc/self/stack
| [<0>] stack_trace_save_tsk+0x98/0x100
| [<0>] proc_pid_stack+0xb4/0x130
| [<0>] proc_single_show+0x60/0x110
| [<0>] seq_read_iter+0x230/0x4d0
| [<0>] seq_read+0xdc/0x130
| [<0>] vfs_read+0xac/0x1e0
| [<0>] ksys_read+0x6c/0xfc
| [<0>] __arm64_sys_read+0x20/0x30
| [<0>] el0_svc_common.constprop.0+0x60/0x120
| [<0>] do_el0_svc+0x24/0x90
| [<0>] el0_svc+0x2c/0x54
| [<0>] el0_sync_handler+0x1a4/0x1b0
| [<0>] el0_sync+0x170/0x180

After this patch, the tracing code will not show up in such traces:

| # cat /proc/self/stack
| [<0>] proc_pid_stack+0xb4/0x130
| [<0>] proc_single_show+0x60/0x110
| [<0>] seq_read_iter+0x230/0x4d0
| [<0>] seq_read+0xdc/0x130
| [<0>] vfs_read+0xac/0x1e0
| [<0>] ksys_read+0x6c/0xfc
| [<0>] __arm64_sys_read+0x20/0x30
| [<0>] el0_svc_common.constprop.0+0x60/0x120
| [<0>] do_el0_svc+0x24/0x90
| [<0>] el0_svc+0x2c/0x54
| [<0>] el0_sync_handler+0x1a4/0x1b0
| [<0>] el0_sync+0x170/0x180

Erring on the side of caution, I've given this a spin with a bunch of
toolchains, verifying the output of /proc/self/stack and checking that
the assembly looked sound. For GCC (where we require version 5.1.0 or
later) I tested with the kernel.org crosstool binares for versions
5.5.0, 6.4.0, 6.5.0, 7.3.0, 7.5.0, 8.1.0, 8.3.0, 8.4.0, 9.2.0, and
10.1.0. For clang (where we require version 10.0.1 or later) I tested
with the llvm.org binary releases of 11.0.0, and 11.0.1.

Fixes: 5fc57df2f6 ("arm64: stacktrace: Convert to ARCH_STACKWALK")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chen Jun <chenjun102@huawei.com>
Cc: Marco Elver <elver@google.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: <stable@vger.kernel.org> # 5.10.x
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210319184106.5688-1-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2021-03-22 12:42:49 +00:00
Lang Cheng
af06b628a6 RDMA/hns: Fix bug during CMDQ initialization
When reloading driver, the head/tail pointer of CMDQ may be not at
position 0. Then during initialization of CMDQ, if head is reset first,
the firmware will start to handle CMDQ because the head is not equal to
the tail. The driver can reset tail first since the firmware will be
triggerred only by head. This bug is introduced by changing macros of
head/tail register without changing the order of initialization.

Fixes: 292b3352bd ("RDMA/hns: Adjust fields and variables about CMDQ tail/head")
Link: https://lore.kernel.org/r/1615602611-7963-1-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-22 09:25:57 -03:00
Xin Long
154deab6a3 esp: delete NETIF_F_SCTP_CRC bit from features for esp offload
Now in esp4/6_gso_segment(), before calling inner proto .gso_segment,
NETIF_F_CSUM_MASK bits are deleted, as HW won't be able to do the
csum for inner proto due to the packet encrypted already.

So the UDP/TCP packet has to do the checksum on its own .gso_segment.
But SCTP is using CRC checksum, and for that NETIF_F_SCTP_CRC should
be deleted to make SCTP do the csum in own .gso_segment as well.

In Xiumei's testing with SCTP over IPsec/veth, the packets are kept
dropping due to the wrong CRC checksum.

Reported-by: Xiumei Mu <xmu@redhat.com>
Fixes: 7862b4058b ("esp: Add gso handlers for esp4 and esp6")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2021-03-22 07:43:31 +01:00