Commit Graph

1265643 Commits

Author SHA1 Message Date
Linus Torvalds
d5cf50dafc Kconfig: add some hidden tabs on purpose
Commit d96c36004e ("tracing: Fix FTRACE_RECORD_RECURSION_SIZE Kconfig
entry") removed a hidden tab because it apparently showed breakage in
some third-party kernel config parsing tool.

It wasn't clear what tool it was, but let's make sure it gets fixed.
Because if you can't parse tabs as whitespace, you should not be parsing
the kernel Kconfig files.

In fact, let's make such breakage more obvious than some esoteric ftrace
record size option.  If you can't parse tabs, you can't have page sizes.

Yes, tab-vs-space confusion is sadly a traditional Unix thing, and
'make' is famous for being broken in this regard.  But no, that does not
mean that it's ok.

I'd add more random tabs to our Kconfig files, but I don't want to make
things uglier than necessary.  But it *might* be necessary if it turns
out we see more of this kind of silly tooling.
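
To illustrate the point about tabs, here is a minimal sketch of a tokenizer that accepts either spaces or tabs between a Kconfig-style keyword and its argument. It is not the kernel's Kconfig parser; the helper name `split_kconfig_line` and the sample option name used in the usage note are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: split "keyword<whitespace>argument" in place.
 * Both ' ' and '\t' count as separators. The keyword is moved to the
 * start of the buffer and NUL-terminated; the returned pointer is the
 * argument (an empty string if there is none).
 */
static char *split_kconfig_line(char *line)
{
	char *p = line + strspn(line, " \t");   /* skip leading indentation */
	size_t kw = strcspn(p, " \t");          /* length of the keyword */
	char *arg = p + kw;

	if (*arg != '\0') {
		*arg++ = '\0';                   /* terminate the keyword */
		arg += strspn(arg, " \t");       /* skip the separator run */
	}
	memmove(line, p, kw + 1);               /* keyword now starts at line[0] */
	return arg;
}
```

With this approach, a line such as `"config\tEXAMPLE_OPTION"` and the same line written with a space both yield the keyword `config` and the argument `EXAMPLE_OPTION`.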

Fixes: d96c36004e ("tracing: Fix FTRACE_RECORD_RECURSION_SIZE Kconfig entry")
Link: https://lore.kernel.org/lkml/CAHk-=wj-hLLN_t_m5OL4dXLaxvXKy_axuoJYXif7iczbfgAevQ@mail.gmail.com/
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-04-12 10:05:10 -07:00
Linus Torvalds
5939d45155 Tracing fixes for 6.9:
- Fix the buffer_percent accounting as it is dependent on three variables:
   1) pages_read - number of subbuffers read
   2) pages_lost - number of subbuffers lost due to overwrite
   3) pages_touched - number of pages that a writer entered
   These three counters only increment, and to know how many active pages
   there are on the buffer at any given time, the pages_read and
   pages_lost are subtracted from pages_touched. But pages_touched
   was incremented whenever any writer went to the next subbuffer, even
   when more than one writer moved to the same subbuffer, so it was
   incremented more often than it should be. That made the count of
   subbuffers that currently have content too high, which caused the
   waiters that buffer_percent holds until the ring buffer is filled
   to a given percentage to wake up early.
 
 - Fix warning of unused functions when PERF_EVENTS is not configured
 
 - Replace bad tab with space in Kconfig for FTRACE_RECORD_RECURSION_SIZE
 
 - Fix to some kerneldoc function comments in eventfs code.

Merge tag 'trace-v6.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing fixes from Steven Rostedt:

 - Fix the buffer_percent accounting as it is dependent on three
   variables:

     1) pages_read - number of subbuffers read
     2) pages_lost - number of subbuffers lost due to overwrite
     3) pages_touched - number of pages that a writer entered

   These three counters only increment, and to know how many active
   pages there are on the buffer at any given time, the pages_read and
   pages_lost are subtracted from pages_touched.

   But pages_touched was incremented whenever any writer went to the
   next subbuffer, even when more than one writer moved to the same
   subbuffer, so it was incremented more often than it should be. That
   made the count of subbuffers that currently have content too high,
   which caused the waiters that buffer_percent holds until the ring
   buffer is filled to a given percentage to wake up early.

 - Fix warning of unused functions when PERF_EVENTS is not configured

 - Replace bad tab with space in Kconfig for FTRACE_RECORD_RECURSION_SIZE

 - Fix to some kerneldoc function comments in eventfs code.

* tag 'trace-v6.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  ring-buffer: Only update pages_touched when a new page is touched
  tracing: hide unused ftrace_event_id_fops
  tracing: Fix FTRACE_RECORD_RECURSION_SIZE Kconfig entry
  eventfs: Fix kernel-doc comments to functions
2024-04-12 09:02:24 -07:00
Linus Torvalds
e00011a146 Fix for syscall_get_nr() to make it work even if tracing is disabled

Merge tag 'mips-fixes_6.9_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux

Pull MIPS fix from Thomas Bogendoerfer:
 "Fix for syscall_get_nr() to make it work even if tracing is disabled"

* tag 'mips-fixes_6.9_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
  MIPS: scall: Save thread_info.syscall unconditionally on entry
2024-04-12 08:46:58 -07:00
Linus Torvalds
d1c13e8004 drm fixes for 6.9-rc4
client:
 - Protect connector modes with mode_config mutex
 
 ast:
 - Fix soft lockup
 
 host1x:
 - Do not setup DMA for virtual addresses
 
 ivpu:
 - Fix deadlock in context_xa
 - PCI fixes
 - Fixes to error handling
 
 nouveau:
 - gsp: Fix OOB access
 - Fix casting
 
 panfrost:
 - Fix error path in MMU code
 
 qxl:
 - Revert "drm/qxl: simplify qxl_fence_wait"
 
 vmwgfx:
 - Enable DMA for SEV mappings
 
 i915:
 - Couple CDCLK programming fixes
 - HDCP related fix
 - 4 Bigjoiner related fixes
 - Fix for a circular locking around GuC on reset+wedged case
 
 xe:
 - Fix double display mutex initializations
 - Fix u32 -> u64 implicit conversions
 - Fix RING_CONTEXT_CONTROL not marked as masked
 
 msm:
 - DP refcount leak fix on disconnect
 - Add missing newlines to prints in msm_fb and msm_kms
 - fix dpu debugfs entry permissions
 - Fix the interface table for the catalog of X1E80100
 - fix irq message printing
 - Bindings fix to add DP node as child of mdss for mdss node
 - Minor typo fix in DP driver API which handles port status change
 - fix CHRASHDUMP_READ()
 - fix HHB (highest bank bit) for a619 to fix UBWC corruption
 
 amdgpu:
 - GPU reset fixes
 - Fix some confusing logging
 - UMSCH fix
 - Aborted suspend fix
 - DCN 3.5 fixes
 - S4 fix
 - MES logging fixes
 - SMU 14 fixes
 - SDMA 4.4.2 fix
 - KASAN fix
 - SMU 13.0.10 fix
 - VCN partition fix
 - GFX11 fixes
 - DWB fixes
 - Plane handling fix
 - FAMS fix
 - DCN 3.1.6 fix
 - VSC SDP fixes
 - OLED panel fix
 - GFX 11.5 fix
 
 amdkfd:
 - GPU reset fixes
 - fix ioctl integer overflow

Merge tag 'drm-fixes-2024-04-12' of https://gitlab.freedesktop.org/drm/kernel

Pull drm fixes from Dave Airlie:
 "Looks like everyone woke up after holidays, this week's pull has a
  bunch of stuff all over, 2 weeks' worth of amdgpu is a lot of it, then
  i915/xe have a few, a bunch of msm fixes, then some scattered driver
  fixes.

  I expect things will settle down for rc5.

  client:
   - Protect connector modes with mode_config mutex

  ast:
   - Fix soft lockup

  host1x:
   - Do not setup DMA for virtual addresses

  ivpu:
   - Fix deadlock in context_xa
   - PCI fixes
   - Fixes to error handling

  nouveau:
   - gsp: Fix OOB access
   - Fix casting

  panfrost:
   - Fix error path in MMU code

  qxl:
   - Revert "drm/qxl: simplify qxl_fence_wait"

  vmwgfx:
   - Enable DMA for SEV mappings

  i915:
   - Couple CDCLK programming fixes
   - HDCP related fix
   - 4 Bigjoiner related fixes
   - Fix for a circular locking around GuC on reset+wedged case

  xe:
   - Fix double display mutex initializations
   - Fix u32 -> u64 implicit conversions
   - Fix RING_CONTEXT_CONTROL not marked as masked

  msm:
   - DP refcount leak fix on disconnect
   - Add missing newlines to prints in msm_fb and msm_kms
   - fix dpu debugfs entry permissions
   - Fix the interface table for the catalog of X1E80100
   - fix irq message printing
   - Bindings fix to add DP node as child of mdss for mdss node
   - Minor typo fix in DP driver API which handles port status change
   - fix CHRASHDUMP_READ()
   - fix HHB (highest bank bit) for a619 to fix UBWC corruption

  amdgpu:
   - GPU reset fixes
   - Fix some confusing logging
   - UMSCH fix
   - Aborted suspend fix
   - DCN 3.5 fixes
   - S4 fix
   - MES logging fixes
   - SMU 14 fixes
   - SDMA 4.4.2 fix
   - KASAN fix
   - SMU 13.0.10 fix
   - VCN partition fix
   - GFX11 fixes
   - DWB fixes
   - Plane handling fix
   - FAMS fix
   - DCN 3.1.6 fix
   - VSC SDP fixes
   - OLED panel fix
   - GFX 11.5 fix

  amdkfd:
   - GPU reset fixes
   - fix ioctl integer overflow"

* tag 'drm-fixes-2024-04-12' of https://gitlab.freedesktop.org/drm/kernel: (65 commits)
  amdkfd: use calloc instead of kzalloc to avoid integer overflow
  drm/xe: Label RING_CONTEXT_CONTROL as masked
  drm/xe/xe_migrate: Cast to output precision before multiplying operands
  drm/xe/hwmon: Cast result to output precision on left shift of operand
  drm/xe/display: Fix double mutex initialization
  drm/amdgpu: differentiate external rev id for gfx 11.5.0
  drm/amd/display: Adjust dprefclk by down spread percentage.
  drm/amd/display: Set VSC SDP Colorimetry same way for MST and SST
  drm/amd/display: Program VSC SDP colorimetry for all DP sinks >= 1.4
  drm/amd/display: fix disable otg wa logic in DCN316
  drm/amd/display: Do not recursively call manual trigger programming
  drm/amd/display: always reset ODM mode in context when adding first plane
  drm/amdgpu: fix incorrect number of active RBs for gfx11
  drm/amd/display: Return max resolution supported by DWB
  amd/amdkfd: sync all devices to wait all processes being evicted
  drm/amdgpu: clear set_q_mode_offs when VM changed
  drm/amdgpu: Fix VCN allocation in CPX partition
  drm/amd/pm: fix the high voltage issue after unload
  drm/amd/display: Skip on writeback when it's not applicable
  drm/amdgpu: implement IRQ_STATE_ENABLE for SDMA v4.4.2
  ...
2024-04-12 08:27:09 -07:00
Dave Airlie
3b0daecfea amdkfd: use calloc instead of kzalloc to avoid integer overflow
This uses calloc instead of doing the multiplication which might
overflow.
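
As a rough userspace illustration of why a calloc-style call is safer than multiplying by hand, the sketch below checks the product before allocating. The helper names `mul_overflows` and `alloc_array` are hypothetical and are not taken from the patch; in the kernel, array allocators such as `kcalloc()` perform a comparable check internally.

```c
#include <stdint.h>
#include <stdlib.h>

/* Returns 1 when count * size does not fit in size_t. */
static int mul_overflows(size_t count, size_t size)
{
	return size != 0 && count > SIZE_MAX / size;
}

/*
 * Hypothetical helper: allocate a zeroed array of count elements.
 * A hand-written malloc(count * size) would silently wrap around for
 * large counts and return a buffer that is too small; here the request
 * is rejected instead.
 */
static void *alloc_array(size_t count, size_t size)
{
	if (mul_overflows(count, size))
		return NULL;
	return calloc(count, size);
}
```

The explicit check is redundant with C libraries whose `calloc()` already rejects overflowing requests, but it makes the failure mode visible.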

Cc: stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
2024-04-12 11:11:59 +10:00
Dave Airlie
6d8372713c Merge tag 'drm-msm-next-2024-04-11' of https://gitlab.freedesktop.org/drm/msm into drm-fixes
Fixes for v6.9

Display:
- Fixes for PM refcount leak when DP goes to disconnected state and
  also when link training fails. This is also one of the issues found
  with the pm runtime series
- Add missing newlines to prints in msm_fb and msm_kms
- Change permissions of some dpu debugfs entries which write to const
  data from catalog to read-only to avoid protection faults
- Fix the interface table for the catalog of X1E80100. This is an
  important fix to bringup DP for X1E80100.
- Logging fix to print the callback symbol in the invalid IRQ message
  case rather than printing when it's known to be NULL.
- Bindings fix to add DP node as child of mdss for mdss node
- Minor typo fix in DP driver API which handles port status change

GPU:
- fix CHRASHDUMP_READ()
- fix HHB (highest bank bit) for a619 to fix UBWC corruption

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rob Clark <robdclark@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGvFwRUcHGWva7oDeydq1PTiZMduuykCD2MWaFrT4iMGZA@mail.gmail.com
2024-04-12 11:01:45 +10:00
Linus Torvalds
586b5dfb51 cxl fixes for v6.9-rc4
- Fix index of Clear Event Record handles in cxl_clear_event_record().
 - Fix use before init of map->reg_type in cxl_decode_regblock().
 - Fix initialization of mbox_cmd.size_out in cxl_mem_get_records_log().
 - Series fixing CXL path access_coordinate computation.
   - Remove unneeded check of iter in loop.
   - Fix retrieval of access_coordinate in PCI topology walk.
   - Fix incorrect region access_coordinate data calculation.
   - Consolidate access_coordinates attached to downstream port
     context.
   - Add check of access_coordinate validity to prevent
     incorrect data being exposed via sysfs.

Merge tag 'cxl-fixes-6.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull cxl fixes from Dave Jiang:

 - Fix index of Clear Event Record handles in cxl_clear_event_record()

 - Fix use before init of map->reg_type in cxl_decode_regblock()

 - Fix initialization of mbox_cmd.size_out in cxl_mem_get_records_log()

 - Fix CXL path access_coordinate computation:
     - Remove unneeded check of iter in loop
     - Fix retrieval of access_coordinate in PCI topology walk
     - Fix incorrect region access_coordinate data calculation
     - Consolidate access_coordinates attached to downstream port
       context
     - Add check of access_coordinate validity to prevent
       incorrect data being exposed via sysfs

* tag 'cxl-fixes-6.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
  cxl: Add checks to access_coordinate calculation to fail missing data
  cxl: Consolidate dport access_coordinate ->hb_coord and ->sw_coord into ->coord
  cxl: Fix incorrect region perf data calculation
  cxl: Fix retrieving of access_coordinates in PCIe path
  cxl: Remove checking of iter in cxl_endpoint_get_perf_coordinates()
  cxl/core: Fix initialization of mbox_cmd.size_out in get event
  cxl/core/regs: Fix usage of map->reg_type in cxl_decode_regblock() before assigned
  cxl/mem: Fix for the index of Clear Event Record Handle
2024-04-11 16:49:11 -07:00
Linus Torvalds
52e5070f60 hyperv-fixes for v6.9-rc4

Merge tag 'hyperv-fixes-signed-20240411' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux

Pull hyperv fixes from Wei Liu:

 - Some cosmetic changes (Erni Sri Satya Vennela, Li Zhijian)

 - Introduce hv_numa_node_to_pxm_info() (Nuno Das Neves)

 - Fix KVP daemon to handle IPv4 and IPv6 combination for keyfile format
   (Shradha Gupta)

 - Avoid freeing decrypted memory in a confidential VM (Rick Edgecombe
   and Michael Kelley)

* tag 'hyperv-fixes-signed-20240411' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
  Drivers: hv: vmbus: Don't free ring buffers that couldn't be re-encrypted
  uio_hv_generic: Don't free decrypted memory
  hv_netvsc: Don't free decrypted memory
  Drivers: hv: vmbus: Track decrypted status in vmbus_gpadl
  Drivers: hv: vmbus: Leak pages if set_memory_encrypted() fails
  hv/hv_kvp_daemon: Handle IPv4 and Ipv6 combination for keyfile format
  hv: vmbus: Convert sprintf() family to sysfs_emit() family
  mshyperv: Introduce hv_numa_node_to_pxm_info()
  x86/hyperv: Cosmetic changes for hv_apic.c
2024-04-11 16:23:56 -07:00
Steven Rostedt (Google)
ffe3986fec ring-buffer: Only update pages_touched when a new page is touched
The "buffer_percent" logic, which the ring buffer splice code uses to wake
up waiting tasks only once the buffer has filled to the percentage given
in the "buffer_percent" file, depends on three variables that determine
the amount of data that is in the ring buffer:

 1) pages_read - incremented whenever a new sub-buffer is consumed
 2) pages_lost - incremented every time a writer overwrites a sub-buffer
 3) pages_touched - incremented when a write goes to a new sub-buffer

The percentage is the calculation of:

  (pages_touched - (pages_lost + pages_read)) / nr_pages

Basically, the amount of data is the total number of sub-bufs that have been
touched, minus the number of sub-bufs lost and sub-bufs consumed. This is
divided by the total count to give the buffer percentage. When the
percentage is greater than the value in the "buffer_percent" file, it
wakes up splice readers waiting for that amount.
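
The calculation above can be sketched in plain C as follows. This is only an illustration of the arithmetic, not the ring buffer code itself; the struct `rb_counters` and both function names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical snapshot of the three counters, which only ever increase. */
struct rb_counters {
	size_t pages_touched;  /* sub-buffers a writer has entered */
	size_t pages_lost;     /* sub-buffers lost to overwrite */
	size_t pages_read;     /* sub-buffers consumed by a reader */
};

/* Sub-buffers that currently hold unread data. */
static size_t rb_dirty_pages(const struct rb_counters *c)
{
	return c->pages_touched - (c->pages_lost + c->pages_read);
}

/* Fill level as an integer percentage of nr_pages (0 if nr_pages is 0). */
static size_t rb_percent_full(const struct rb_counters *c, size_t nr_pages)
{
	if (nr_pages == 0)
		return 0;
	return rb_dirty_pages(c) * 100 / nr_pages;
}
```

For example, with 60 pages touched, 10 lost, 20 read and 50 pages in total, the fill level is 60%. If pages_touched were over-counted as 70, the same buffer would appear 80% full, which is the kind of error that wakes readers too early.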

It was observed that over time, the amount read from the splice was
constantly decreasing the longer the trace was running. That is, if one
asked for 60%, it would read over 60% when it first starts tracing, but
then it would be woken up at under 60% and would slowly decrease the
amount of data read after being woken up, where the amount becomes much
less than the buffer percent.

This was due to an accounting error in how pages_touched is incremented.
This value is incremented whenever a writer transfers to a new sub-buffer. But
the place where it was incremented was incorrect. If a writer overflowed
the current sub-buffer it would go to the next one. If it gets preempted
by an interrupt at that time, and the interrupt performs a trace, it too
will end up going to the next sub-buffer. But only one should increment
the counter. Unfortunately, that was not the case.

Change the cmpxchg() that does the real switch of the tail-page into a
try_cmpxchg(), and on success, perform the increment of pages_touched. This
increments the counter only once when the writer moves to a new sub-buffer,
and not a second time when a writer and its preempting writer race to move
to the same new sub-buffer.
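
The idea of counting only on a successful exchange can be sketched in userspace with C11 atomics. This is an analogy rather than the kernel code: `atomic_compare_exchange_strong()` stands in for try_cmpxchg(), and the struct `toy_buffer` and function `toy_advance_tail` are hypothetical.

```c
#include <stdatomic.h>
#include <stddef.h>

struct toy_buffer {
	_Atomic size_t tail_page;      /* index of the current tail sub-buffer */
	_Atomic size_t pages_touched;  /* sub-buffers ever written to */
};

/*
 * Try to move the tail from old_tail to next_tail. Only the caller whose
 * compare-and-exchange succeeds counts the new sub-buffer; a racing caller
 * sees the exchange fail and leaves the counter alone.
 * Returns 1 if this caller performed the move, 0 otherwise.
 */
static int toy_advance_tail(struct toy_buffer *b, size_t old_tail,
			    size_t next_tail)
{
	size_t expected = old_tail;

	if (atomic_compare_exchange_strong(&b->tail_page, &expected, next_tail)) {
		atomic_fetch_add(&b->pages_touched, 1);
		return 1;
	}
	return 0;
}
```

If two callers both attempt the move from page 0 to page 1, the second attempt fails because the tail is no longer 0, so pages_touched ends up at 1 rather than 2.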

Link: https://lore.kernel.org/linux-trace-kernel/20240409151309.0d0e5056@gandalf.local.home

Cc: stable@vger.kernel.org
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Fixes: 2c2b0a78b3 ("ring-buffer: Add percentage of ring buffer full to wake up reader")
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-04-11 17:49:57 -04:00
Arnd Bergmann
5281ec8345 tracing: hide unused ftrace_event_id_fops
When CONFIG_PERF_EVENTS is disabled, a 'make W=1' build produces a warning
about the unused ftrace_event_id_fops variable:

kernel/trace/trace_events.c:2155:37: error: 'ftrace_event_id_fops' defined but not used [-Werror=unused-const-variable=]
 2155 | static const struct file_operations ftrace_event_id_fops = {

Hide this in the same #ifdef as the reference to it.
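
The general pattern can be sketched as follows: a static const object is defined under the same preprocessor guard as its only user, so it never exists without a reference. `EXAMPLE_FEATURE`, `example_ops_name` and `register_example` are hypothetical names standing in for the config symbol, the variable and its user; this is not the patched kernel code.

```c
#include <stdio.h>

/* EXAMPLE_FEATURE stands in for a config symbol such as CONFIG_PERF_EVENTS. */
#ifdef EXAMPLE_FEATURE
/* Defined only when its sole user below is also compiled. */
static const char *const example_ops_name = "example_ops";
#endif

/* Returns 1 if the optional feature was registered, 0 otherwise. */
static int register_example(void)
{
	int registered = 0;
#ifdef EXAMPLE_FEATURE
	printf("registering %s\n", example_ops_name);
	registered = 1;
#endif
	return registered;
}
```

Without the first guard, a build that leaves EXAMPLE_FEATURE undefined would define a constant that nothing references, which is what -Wunused-const-variable reports.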

Link: https://lore.kernel.org/linux-trace-kernel/20240403080702.3509288-7-arnd@kernel.org

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Zheng Yejian <zhengyejian1@huawei.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Ajay Kaher <akaher@vmware.com>
Cc: Jinjie Ruan <ruanjinjie@huawei.com>
Cc: Clément Léger <cleger@rivosinc.com>
Cc: Dan Carpenter <dan.carpenter@linaro.org>
Cc: "Tzvetomir Stoyanov (VMware)" <tz.stoyanov@gmail.com>
Fixes: 620a30e97f ("tracing: Don't pass file_operations array to event_create_dir()")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-04-11 17:46:55 -04:00
Prasad Pandit
d96c36004e tracing: Fix FTRACE_RECORD_RECURSION_SIZE Kconfig entry
Fix the FTRACE_RECORD_RECURSION_SIZE entry by replacing a tab with
a space character. This helps Kconfig parsers read the file
without error.

Link: https://lore.kernel.org/linux-trace-kernel/20240322121801.1803948-1-ppandit@redhat.com

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Fixes: 773c167050 ("ftrace: Add recording of functions that caused recursion")
Signed-off-by: Prasad Pandit <pjp@fedoraproject.org>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-04-11 17:45:18 -04:00
Yang Li
a8fa658eeb eventfs: Fix kernel-doc comments to functions
This commit fixes the kernel-doc style comments, adding complete parameter
descriptions for lookup_file(), lookup_dir_entry() and
lookup_file_dentry().

Link: https://lore.kernel.org/linux-trace-kernel/20240322062604.28862-1-yang.lee@linux.alibaba.com

Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-04-11 17:42:09 -04:00
Dave Airlie
1bafeaf262 - Fix double display mutex initializations
- Fix u32 -> u64 implicit conversions
 - Fix RING_CONTEXT_CONTROL not marked as masked

Merge tag 'drm-xe-fixes-2024-04-11' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes

- Fix double display mutex initializations
- Fix u32 -> u64 implicit conversions
- Fix RING_CONTEXT_CONTROL not marked as masked

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ewvvtgcb2gonxvccws6nt6fqswoyfp4g43t5ex24vpqwtrxdzm@hgjoz5uirmxx
2024-04-12 05:37:23 +10:00
Dave Airlie
1b24b3cd1a Short summary of fixes pull:
ast:
 - Fix soft lockup
 
 client:
 - Protect connector modes with mode_config mutex
 
 host1x:
 - Do not setup DMA for virtual addresses
 
 ivpu:
 - Fix deadlock in context_xa
 - PCI fixes
 - Fixes to error handling
 
 nouveau:
 - gsp: Fix OOB access
 - Fix casting
 
 panfrost:
 - Fix error path in MMU code
 
 qxl:
 - Revert "drm/qxl: simplify qxl_fence_wait"
 
 vmwgfx:
 - Enable DMA for SEV mappings

Merge tag 'drm-misc-fixes-2024-04-11' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes

Short summary of fixes pull:

ast:
- Fix soft lockup

client:
- Protect connector modes with mode_config mutex

host1x:
- Do not setup DMA for virtual addresses

ivpu:
- Fix deadlock in context_xa
- PCI fixes
- Fixes to error handling

nouveau:
- gsp: Fix OOB access
- Fix casting

panfrost:
- Fix error path in MMU code

qxl:
- Revert "drm/qxl: simplify qxl_fence_wait"

vmwgfx:
- Enable DMA for SEV mappings

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240411073403.GA9895@localhost.localdomain
2024-04-12 05:35:46 +10:00
Linus Torvalds
00dcf5d862 ACPI fixes for 6.9-rc4
- Modify the ACPI device enumeration code to avoid counting
    dependencies that have been met already as unmet (Hans de Goede).
 
  - Make _UID matching take the integer value of 0 into account as
    appropriate (Raag Jadav).

Merge tag 'acpi-6.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI fixes from Rafael Wysocki:
 "These fix the handling of dependencies between devices in the ACPI
  device enumeration code and address a _UID matching regression from
  the 6.8 development cycle.

  Specifics:

   - Modify the ACPI device enumeration code to avoid counting
     dependencies that have been met already as unmet (Hans de Goede)

   - Make _UID matching take the integer value of 0 into account as
     appropriate (Raag Jadav)"

* tag 'acpi-6.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: bus: allow _UID matching for integer zero
  ACPI: scan: Do not increase dep_unmet for already met dependencies
2024-04-11 12:03:43 -07:00
Linus Torvalds
136eb5fd6a Power management fix for 6.9-rc4
Fix the suspend-to-idle core code to guarantee that timers queued on
 CPUs other than the one that has first left the idle state, which should
 expire directly after resume, will be handled (Anna-Maria Behnsen).

Merge tag 'pm-6.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fix from Rafael Wysocki:
 "Fix the suspend-to-idle core code to guarantee that timers queued on
  CPUs other than the one that has first left the idle state, which
  should expire directly after resume, will be handled (Anna-Maria
  Behnsen)"

* tag 'pm-6.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM: s2idle: Make sure CPUs will wakeup directly on resume
2024-04-11 12:00:25 -07:00
Linus Torvalds
2ae9a8972c Including fixes from bluetooth.
Current release - new code bugs:
 
   - netfilter: complete validation of user input
 
   - mlx5: disallow SRIOV switchdev mode when in multi-PF netdev
 
 Previous releases - regressions:
 
   - core: fix u64_stats_init() for lockdep when used repeatedly in one file
 
   - ipv6: fix race condition between ipv6_get_ifaddr and ipv6_del_addr
 
   - bluetooth: fix memory leak in hci_req_sync_complete()
 
   - batman-adv: avoid infinite loop trying to resize local TT
 
   - drv: geneve: fix header validation in geneve[6]_xmit_skb
 
   - drv: bnxt_en: fix possible memory leak in bnxt_rdma_aux_device_init()
 
   - drv: mlx5: offset comp irq index in name by one
 
   - drv: ena: avoid double-free clearing stale tx_info->xdpf value
 
   - drv: pds_core: fix pdsc_check_pci_health deadlock
 
 Previous releases - always broken:
 
   - xsk: validate user input for XDP_{UMEM|COMPLETION}_FILL_RING
 
   - bluetooth: fix setsockopt not validating user input
 
   - af_unix: clear stale u->oob_skb.
 
   - nfc: llcp: fix nfc_llcp_setsockopt() unsafe copies
 
   - drv: virtio_net: fix guest hangup on invalid RSS update
 
   - drv: mlx5e: Fix mlx5e_priv_init() cleanup flow
 
   - dsa: mt7530: trap link-local frames regardless of ST Port State
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Merge tag 'net-6.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from bluetooth.

  Current release - new code bugs:

   - netfilter: complete validation of user input

   - mlx5: disallow SRIOV switchdev mode when in multi-PF netdev

  Previous releases - regressions:

   - core: fix u64_stats_init() for lockdep when used repeatedly in one
     file

   - ipv6: fix race condition between ipv6_get_ifaddr and ipv6_del_addr

   - bluetooth: fix memory leak in hci_req_sync_complete()

   - batman-adv: avoid infinite loop trying to resize local TT

   - drv: geneve: fix header validation in geneve[6]_xmit_skb

   - drv: bnxt_en: fix possible memory leak in
     bnxt_rdma_aux_device_init()

   - drv: mlx5: offset comp irq index in name by one

   - drv: ena: avoid double-free clearing stale tx_info->xdpf value

   - drv: pds_core: fix pdsc_check_pci_health deadlock

  Previous releases - always broken:

   - xsk: validate user input for XDP_{UMEM|COMPLETION}_FILL_RING

   - bluetooth: fix setsockopt not validating user input

   - af_unix: clear stale u->oob_skb.

   - nfc: llcp: fix nfc_llcp_setsockopt() unsafe copies

   - drv: virtio_net: fix guest hangup on invalid RSS update

   - drv: mlx5e: Fix mlx5e_priv_init() cleanup flow

   - dsa: mt7530: trap link-local frames regardless of ST Port State"

* tag 'net-6.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (59 commits)
  net: ena: Set tx_info->xdpf value to NULL
  net: ena: Fix incorrect descriptor free behavior
  net: ena: Wrong missing IO completions check order
  net: ena: Fix potential sign extension issue
  af_unix: Fix garbage collector racing against connect()
  net: dsa: mt7530: trap link-local frames regardless of ST Port State
  Revert "s390/ism: fix receive message buffer allocation"
  net: sparx5: fix wrong config being used when reconfiguring PCS
  net/mlx5: fix possible stack overflows
  net/mlx5: Disallow SRIOV switchdev mode when in multi-PF netdev
  net/mlx5e: RSS, Block XOR hash with over 128 channels
  net/mlx5e: Do not produce metadata freelist entries in Tx port ts WQE xmit
  net/mlx5e: HTB, Fix inconsistencies with QoS SQs number
  net/mlx5e: Fix mlx5e_priv_init() cleanup flow
  net/mlx5e: RSS, Block changing channels number when RXFH is configured
  net/mlx5: Correctly compare pkt reformat ids
  net/mlx5: Properly link new fs rules into the tree
  net/mlx5: offset comp irq index in name by one
  net/mlx5: Register devlink first under devlink lock
  net/mlx5: E-switch, store eswitch pointer before registering devlink_param
  ...
2024-04-11 11:46:31 -07:00
Linus Torvalds
ab4319fdbc SCSI fixes on 20240411
The most important fix is the sg one because the regression it fixes
 (spurious warning and use after final put) is already backported to
 stable.  The next biggest impact is the target fix for wrong
 credentials used to load a module because it's affecting new kernels
 installed on selinux based distributions.  The other three fixes are
 an obvious off by one and SATA protocol issues.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "The most important fix is the sg one because the regression it fixes
  (spurious warning and use after final put) is already backported to
  stable.

  The next biggest impact is the target fix for wrong credentials used
  to load a module because it's affecting new kernels installed on
  selinux based distributions.

  The other three fixes are an obvious off by one and SATA protocol
  issues"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: qla2xxx: Fix off by one in qla_edif_app_getstats()
  scsi: hisi_sas: Modify the deadline for ata_wait_after_reset()
  scsi: hisi_sas: Handle the NCQ error returned by D2H frame
  scsi: target: Fix SELinux error when systemd-modules loads the target module
  scsi: sg: Avoid race in error handling & drop bogus warn
2024-04-11 11:42:11 -07:00
Linus Torvalds
5de6b46799 LoongArch fixes for v6.9-rc4

Merge tag 'loongarch-fixes-6.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson

Pull LoongArch fixes from Huacai Chen:

 - make {virt, phys, page, pfn} translation work with KFENCE for
   LoongArch (otherwise NVMe and virtio-blk cannot work with KFENCE
   enabled)

 - update dts files for Loongson-2K series to make devices work
   correctly

 - fix a build error

* tag 'loongarch-fixes-6.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
  LoongArch: Include linux/sizes.h in addrspace.h to prevent build errors
  LoongArch: Update dts for Loongson-2K2000 to support GMAC/GNET
  LoongArch: Update dts for Loongson-2K2000 to support PCI-MSI
  LoongArch: Update dts for Loongson-2K2000 to support ISA/LPC
  LoongArch: Update dts for Loongson-2K1000 to support ISA/LPC
  LoongArch: Make virt_addr_valid()/__virt_addr_valid() work with KFENCE
  LoongArch: Make {virt, phys, page, pfn} translation work with KFENCE
  mm: Move lowmem_page_address() a little later
2024-04-11 11:30:42 -07:00
Linus Torvalds
e1dc191dbf bcachefs fixes for v6.9-rc4
Notable user impacting bugs
 
 - On multi device filesystems, recovery was looping in
   btree_trans_too_many_iters(). This checks if a transaction has touched
   too many btree paths (because of iteration over many keys), and issues
   a restart to drop unneeded paths. But it's now possible for some paths
   to exceed the previous limit without iteration in the interior btree
   update path, since the transaction commit will do alloc updates for
   every old and new btree node, and during journal replay we don't use
   the btree write buffer for locking reasons and thus those updates use
   btree paths when they wouldn't normally.
 
 - Fix a corner case in rebalance when moving extents on a durability=0
   device. This wouldn't be hit when a device was formatted with
   durability=0 since in that case we'll only use it as a write through
   cache (only cached extents will live on it), but durability can now be
   changed on an existing device.
 
 - bch2_get_acl() could rarely forget to handle a transaction restart;
   this manifested as the occasional missing acl that came back after
   dropping caches.
 
 - Fix a major performance regression on high iops multithreaded write
   workloads (only since 6.9-rc1); a previous fix for a deadlock in the
   interior btree update path to check the journal watermark introduced a
   dependency on the state of btree write buffer flushing that we didn't
   want.
 
 - Assorted other repair paths and recovery fixes.

Merge tag 'bcachefs-2024-04-10' of https://evilpiepirate.org/git/bcachefs

Pull more bcachefs fixes from Kent Overstreet:
 "Notable user impacting bugs

   - On multi device filesystems, recovery was looping in
     btree_trans_too_many_iters(). This checks if a transaction has
     touched too many btree paths (because of iteration over many keys),
     and issues a restart to drop unneeded paths.

     But it's now possible for some paths to exceed the previous limit
     without iteration in the interior btree update path, since the
     transaction commit will do alloc updates for every old and new
     btree node, and during journal replay we don't use the btree write
     buffer for locking reasons and thus those updates use btree paths
     when they wouldn't normally.

   - Fix a corner case in rebalance when moving extents on a
     durability=0 device. This wouldn't be hit when a device was
     formatted with durability=0 since in that case we'll only use it as
     a write through cache (only cached extents will live on it), but
     durability can now be changed on an existing device.

   - bch2_get_acl() could rarely forget to handle a transaction restart;
     this manifested as the occasional missing acl that came back after
     dropping caches.

   - Fix a major performance regression on high iops multithreaded write
     workloads (only since 6.9-rc1); a previous fix for a deadlock in
     the interior btree update path to check the journal watermark
     introduced a dependency on the state of btree write buffer flushing
     that we didn't want.

   - Assorted other repair paths and recovery fixes"

* tag 'bcachefs-2024-04-10' of https://evilpiepirate.org/git/bcachefs: (25 commits)
  bcachefs: Fix __bch2_btree_and_journal_iter_init_node_iter()
  bcachefs: Kill read lock dropping in bch2_btree_node_lock_write_nofail()
  bcachefs: Fix a race in btree_update_nodes_written()
  bcachefs: btree_node_scan: Respect member.data_allowed
  bcachefs: Don't scan for btree nodes when we can reconstruct
  bcachefs: Fix check_topology() when using node scan
  bcachefs: fix eytzinger0_find_gt()
  bcachefs: fix bch2_get_acl() transaction restart handling
  bcachefs: fix the count of nr_freed_pcpu after changing bc->freed_nonpcpu list
  bcachefs: Fix gap buffer bug in bch2_journal_key_insert_take()
  bcachefs: Rename struct field swap to prevent macro naming collision
  MAINTAINERS: Add entry for bcachefs documentation
  Documentation: filesystems: Add bcachefs toctree
  bcachefs: JOURNAL_SPACE_LOW
  bcachefs: Disable errors=panic for BCH_IOCTL_FSCK_OFFLINE
  bcachefs: Fix BCH_IOCTL_FSCK_OFFLINE for encrypted filesystems
  bcachefs: fix rand_delete unit test
  bcachefs: fix ! vs ~ typo in __clear_bit_le64()
  bcachefs: Fix rebalance from durability=0 device
  bcachefs: Print shutdown journal sequence number
  ...
2024-04-11 11:24:55 -07:00
Linus Torvalds
346668f02a chrome-platform fixes for v6.9-rc4
Fix a NULL pointer dereference.

Merge tag 'tag-chrome-platform-fixes-for-v6.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux

Pull chrome platform fix from Tzung-Bi Shih:
 "Fix a NULL pointer dereference"

* tag 'tag-chrome-platform-fixes-for-v6.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux:
  platform/chrome: cros_ec_uart: properly fix race condition
2024-04-11 11:15:09 -07:00
Rafael J. Wysocki
d7da7e7cec Merge branch 'acpi-bus'
* acpi-bus:
  ACPI: bus: allow _UID matching for integer zero
2024-04-11 19:36:35 +02:00
Ashutosh Dixit
f76646c83f drm/xe: Label RING_CONTEXT_CONTROL as masked
RING_CONTEXT_CONTROL is a masked register.

v2: Also clean up setting register value (Lucas)

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240404161256.3852502-1-ashutosh.dixit@intel.com
(cherry picked from commit dc30c6e714)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-04-11 08:41:54 -05:00
Himal Prasad Ghimiray
9cb46b31f3 drm/xe/xe_migrate: Cast to output precision before multiplying operands
Address a potential overflow in the result of multiplying two lower
precision (u32) operands before the result is widened to higher
precision (u64).
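
The overflow can be illustrated in user-space C. This is a minimal sketch
and not the xe_migrate code; both helper names are hypothetical and only
contrast the order of the cast and the multiplication.

```c
#include <stdint.h>

/*
 * Hypothetical helper: the product is computed in 32 bits, wraps modulo
 * 2^32, and only then is widened to 64 bits.
 */
static uint64_t sketch_mul_then_widen(uint32_t a, uint32_t b)
{
	uint32_t product = a * b;	/* unsigned wraparound, upper bits lost */

	return product;
}

/*
 * Hypothetical helper: one operand is cast first, so the multiplication
 * itself is carried out in 64 bits.
 */
static uint64_t sketch_widen_then_mul(uint32_t a, uint32_t b)
{
	return (uint64_t)a * b;
}
```

With both operands equal to 0x10000, the first helper returns 0 while the
second returns 0x100000000.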

-v2
Fix commit message and description. (Rodrigo)

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240401175300.3823653-1-himal.prasad.ghimiray@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit 34820967ae)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-04-11 08:41:53 -05:00
Karthik Poosa
a8ad871547 drm/xe/hwmon: Cast result to output precision on left shift of operand
Address potential overflow in result of left shift of a
lower precision (u32) operand before assignment to higher
precision (u64) variable.
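
A minimal user-space sketch of the same pattern, assuming a shift count
below 32. The helper names are hypothetical and this is not the xe hwmon
code.

```c
#include <stdint.h>

/*
 * Hypothetical helper: the shift is done in 32 bits, so bits moved past
 * bit 31 are discarded before the value is assigned to a 64-bit type.
 */
static uint64_t sketch_shift_then_widen(uint32_t value, unsigned int shift)
{
	uint32_t shifted = value << shift;	/* requires shift < 32 */

	return shifted;
}

/*
 * Hypothetical helper: the operand is cast first, so the shift is done
 * in 64 bits and the high bits survive.
 */
static uint64_t sketch_widen_then_shift(uint32_t value, unsigned int shift)
{
	return (uint64_t)value << shift;
}
```

For value 0x80000001 and shift 4, the first helper returns 0x10 and the
second returns 0x800000010.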

v2:
 - Update commit message. (Himal)

Fixes: 4446fcf220 ("drm/xe/hwmon: Expose power1_max_interval")
Signed-off-by: Karthik Poosa <karthik.poosa@intel.com>
Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com>
Cc: Badal Nilawar <badal.nilawar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240405130127.1392426-5-karthik.poosa@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 883232b47b)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-04-11 08:41:53 -05:00
Lucas De Marchi
50a9b7fc15 drm/xe/display: Fix double mutex initialization
All of these mutexes are already initialized by the display side since
commit 3fef3e6ff8 ("drm/i915: move display mutex inits to display
code"), so the xe shouldn't initialize them.

Fixes: 44e694958b ("drm/xe/display: Implement display support")
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Arun R Murthy <arun.r.murthy@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240405200711.2041428-1-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 117de185ed)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-04-11 08:41:53 -05:00
Paolo Abeni
4e1ad31ce3 Merge branch 'ena-driver-bug-fixes'
David Arinzon says:

====================
ENA driver bug fixes

From: David Arinzon <darinzon@amazon.com>

This patchset contains multiple bug fixes for the
ENA driver.
====================

Link: https://lore.kernel.org/r/20240410091358.16289-1-darinzon@amazon.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-04-11 11:21:05 +02:00
David Arinzon
36a1ca01f0 net: ena: Set tx_info->xdpf value to NULL
The patch mentioned in the `Fixes` tag removed the explicit assignment
of tx_info->xdpf to NULL with the justification that there's no need
to set tx_info->xdpf to NULL and tx_info->num_of_bufs to 0 in case
of a mapping error. Both values won't be used once the mapping function
returns an error, and their values would be overridden by the next
transmitted packet.

While both values do indeed get overridden in the next transmission
call, the value of tx_info->xdpf is also used to check whether a TX
descriptor's transmission has been completed (i.e. a completion for it
was polled).

An example scenario:
1. Mapping failed, tx_info->xdpf wasn't set to NULL
2. A VF reset occurred leading to IO resource destruction and
   a call to ena_free_tx_bufs() function
3. Although the descriptor whose mapping failed was freed by the
   transmission function, it still passes the check
     if (!tx_info->skb)

   (skb and xdp_frame are in a union)
4. The xdp_frame associated with the descriptor is freed twice

This patch restores the assignment of NULL to tx_info->xdpf so that the
cleaning function knows that the descriptor is already freed.
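
The scenario can be modelled in a few lines of user-space C. This is only
a sketch under stated assumptions: every sketch_* name is hypothetical, a
counter stands in for the real free routines, and the structures keep only
the union that the description relies on.

```c
#include <stddef.h>

/* Hypothetical stand-in for an skb or xdp_frame. */
struct sketch_frame {
	int free_count;		/* how many times the frame was released */
};

/* Hypothetical stand-in for the TX descriptor bookkeeping. */
struct sketch_tx_info {
	union {
		struct sketch_frame *skb;	/* regular TX queues */
		struct sketch_frame *xdpf;	/* XDP TX queues */
	};
};

static void sketch_release(struct sketch_frame *frame)
{
	frame->free_count++;
}

/*
 * Models a transmit attempt whose mapping fails. The error path releases
 * the frame; clear_on_error selects whether the descriptor then forgets
 * the pointer.
 */
static void sketch_xmit_map_failure(struct sketch_tx_info *info,
				    struct sketch_frame *frame,
				    int clear_on_error)
{
	info->xdpf = frame;
	sketch_release(frame);
	if (clear_on_error)
		info->xdpf = NULL;	/* descriptor is marked as freed */
}

/*
 * Models the cleanup after a reset: a descriptor that still holds a
 * pointer is treated as uncompleted and released.
 */
static void sketch_free_tx_bufs(struct sketch_tx_info *info)
{
	if (!info->skb)		/* same storage as xdpf */
		return;
	sketch_release(info->skb);
	info->skb = NULL;
}
```

Without the NULL assignment the frame is released twice (once by the
error path, once by the cleanup); with it, the cleanup skips the
descriptor.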

Fixes: 504fd6a539 ("net: ena: fix DMA mapping function issues in XDP")
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-04-11 11:21:02 +02:00
David Arinzon
bf02d9fe00 net: ena: Fix incorrect descriptor free behavior
ENA has two types of TX queues:
- queues which only process TX packets arriving from the network stack
- queues which only process TX packets forwarded to it by XDP_REDIRECT
  or XDP_TX instructions

The ena_free_tx_bufs() cycles through all descriptors in a TX queue
and unmaps + frees every descriptor that hasn't been acknowledged yet
by the device (uncompleted TX transactions).
The function assumes that the processed TX queue is necessarily from
the first category listed above and ends up using napi_consume_skb()
for descriptors belonging to an XDP specific queue.

This patch solves a bug in which, in case of a VF reset, the
descriptors aren't freed correctly, leading to crashes.

Fixes: 548c4940b9 ("net: ena: Implement XDP_TX action")
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-04-11 11:21:02 +02:00
David Arinzon
f7e4171806 net: ena: Wrong missing IO completions check order
Missing IO completions check is called every second (HZ jiffies).
This commit fixes several issues with this check:

1. Duplicate queues check:
   Max of 4 queues are scanned on each check due to monitor budget.
   Once reaching the budget, this check exits under the assumption that
   the next check will continue to scan the remainder of the queues,
   but in practice, next check will first scan the last already scanned
   queue which is not necessary and may cause the full queue scan to
   last a couple of seconds longer.
   The fix is to start every check with the next queue to scan.
   For example, on 8 IO queues:
   Bug: [0,1,2,3], [3,4,5,6], [6,7]
   Fix: [0,1,2,3], [4,5,6,7]

2. Unbalanced queues check:
   In case the number of active IO queues is not a multiple of budget,
   there will be checks which don't utilize the full budget
   because the full scan exits when reaching the last queue id.
   The fix is to run every TX completion check with exact queue budget
   regardless of the queue id.
   For example, on 7 IO queues:
   Bug: [0,1,2,3], [4,5,6], [0,1,2,3]
   Fix: [0,1,2,3], [4,5,6,0], [1,2,3,4]
   The budget may be lowered in case the number of IO queues is less
   than the budget (4) to make sure there are no duplicate queues on
   the same check.
   For example, on 3 IO queues:
   Bug: [0,1,2,0], [1,2,0,1]
   Fix: [0,1,2], [0,1,2]
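
The fixed scan order can be sketched as a round-robin walk that remembers
where the previous check stopped. This is an illustration in user-space C,
not the driver code; the sketch_* names and the state structure are
hypothetical, and the budget of 4 is taken from the description above.

```c
#include <stddef.h>

#define SKETCH_MONITOR_BUDGET 4	/* queues scanned per check */

/* Hypothetical state: first queue id the next check will scan. */
struct sketch_monitor {
	unsigned int next_queue;
};

/*
 * Records the queue ids visited by one check in 'scanned' and returns
 * how many were visited. The budget is lowered when there are fewer
 * queues than the budget, so no queue is visited twice in one check.
 */
static size_t sketch_check_once(struct sketch_monitor *mon,
				unsigned int num_queues,
				unsigned int scanned[SKETCH_MONITOR_BUDGET])
{
	unsigned int budget = SKETCH_MONITOR_BUDGET;
	size_t n = 0;

	if (num_queues == 0)
		return 0;
	if (num_queues < budget)
		budget = num_queues;
	if (mon->next_queue >= num_queues)
		mon->next_queue = 0;	/* guard in case the queue count shrank */

	while (budget--) {
		scanned[n++] = mon->next_queue;
		/* wrap around instead of stopping at the last queue id */
		mon->next_queue = (mon->next_queue + 1) % num_queues;
	}
	return n;
}
```

Starting from next_queue = 0, three consecutive calls with 7 queues visit
[0,1,2,3], [4,5,6,0] and [1,2,3,4], matching the "Fix" rows above.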

Fixes: 1738cd3ed3 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Amit Bernstein <amitbern@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-04-11 11:21:01 +02:00
David Arinzon
713a85195a net: ena: Fix potential sign extension issue
Small unsigned types are promoted to larger signed types in
the case of multiplication, the result of which may overflow.
In case the result of such a multiplication has its MSB
turned on, it will be sign extended with '1's.
This changes the multiplication result.

Code example of the phenomenon:
-------------------------------
u16 x, y;
size_t z1, z2;

x = y = 0xffff;
printk("x=%x y=%x\n",x,y);

z1 = x*y;
z2 = (size_t)x*y;

printk("z1=%lx z2=%lx\n", z1, z2);

Output:
-------
x=ffff y=ffff
z1=fffffffffffe0001 z2=fffe0001

The expected result of ffff*ffff is fffe0001, and without the
explicit casting to avoid the unwanted sign extension we got
fffffffffffe0001.

This commit adds an explicit casting to avoid the sign extension
issue.
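
The two halves of the phenomenon can be separated in a small user-space
sketch. It assumes a 32-bit int and uses uint64_t where the example above
uses a 64-bit size_t; the helper names are hypothetical. The overflowing
int multiplication itself is not performed here, because signed overflow
is undefined in ISO C; instead the negative 32-bit value it yields in
practice is passed in directly.

```c
#include <stdint.h>

/*
 * Widening a negative 32-bit value to an unsigned 64-bit type
 * replicates the sign bit into the upper 32 bits.
 */
static uint64_t sketch_widen_signed(int32_t product)
{
	return (uint64_t)product;
}

/*
 * Casting one operand first keeps the whole multiplication in an
 * unsigned 64-bit type, so there is no sign bit to extend.
 */
static uint64_t sketch_mul_cast_first(uint16_t x, uint16_t y)
{
	return (uint64_t)x * y;
}
```

The 32-bit pattern fffe0001 read as a signed value is -131071, and
sketch_widen_signed(-131071) gives fffffffffffe0001, whereas
sketch_mul_cast_first(0xffff, 0xffff) gives fffe0001.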

Fixes: 689b2bdaaa ("net: ena: add functions for handling Low Latency Queues in ena_com")
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-04-11 11:21:01 +02:00
Paolo Abeni
fe3eb40672 bluetooth pull request for net:
- L2CAP: Don't double set the HCI_CONN_MGMT_CONNECTED bit
   - Fix memory leak in hci_req_sync_complete
   - hci_sync: Fix using the same interval and window for Coded PHY
   - Fix not validating setsockopt user input

Merge tag 'for-net-2024-04-10' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth

Luiz Augusto von Dentz says:

====================
bluetooth pull request for net:

  - L2CAP: Don't double set the HCI_CONN_MGMT_CONNECTED bit
  - Fix memory leak in hci_req_sync_complete
  - hci_sync: Fix using the same interval and window for Coded PHY
  - Fix not validating setsockopt user input

* tag 'for-net-2024-04-10' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
  Bluetooth: l2cap: Don't double set the HCI_CONN_MGMT_CONNECTED bit
  Bluetooth: hci_sock: Fix not validating setsockopt user input
  Bluetooth: ISO: Fix not validating setsockopt user input
  Bluetooth: L2CAP: Fix not validating setsockopt user input
  Bluetooth: RFCOMM: Fix not validating setsockopt user input
  Bluetooth: SCO: Fix not validating setsockopt user input
  Bluetooth: Fix memory leak in hci_req_sync_complete()
  Bluetooth: hci_sync: Fix using the same interval and window for Coded PHY
  Bluetooth: ISO: Don't reject BT_ISO_QOS if parameters are unset
====================

Link: https://lore.kernel.org/r/20240410191610.4156653-1-luiz.dentz@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-04-11 10:42:43 +02:00
Michal Luczaj
47d8ac011f af_unix: Fix garbage collector racing against connect()
The garbage collector does not take into account the risk of an embryo
getting enqueued during garbage collection. If such an embryo has a peer
that carries SCM_RIGHTS, two consecutive passes of scan_children() may see
a different set of children, leading to an incorrectly elevated inflight
count and then a dangling pointer within the gc_inflight_list.

sockets are AF_UNIX/SOCK_STREAM
S is an unconnected socket
L is a listening in-flight socket bound to addr, not in fdtable
V's fd will be passed via sendmsg(), gets inflight count bumped

connect(S, addr)	sendmsg(S, [V]); close(V)	__unix_gc()
----------------	-------------------------	-----------

NS = unix_create1()
skb1 = sock_wmalloc(NS)
L = unix_find_other(addr)
unix_state_lock(L)
unix_peer(S) = NS
			// V count=1 inflight=0

 			NS = unix_peer(S)
 			skb2 = sock_alloc()
			skb_queue_tail(NS, skb2[V])

			// V became in-flight
			// V count=2 inflight=1

			close(V)

			// V count=1 inflight=1
			// GC candidate condition met

						for u in gc_inflight_list:
						  if (total_refs == inflight_refs)
						    add u to gc_candidates

						// gc_candidates={L, V}

						for u in gc_candidates:
						  scan_children(u, dec_inflight)

						// embryo (skb1) was not
						// reachable from L yet, so V's
						// inflight remains unchanged
__skb_queue_tail(L, skb1)
unix_state_unlock(L)
						for u in gc_candidates:
						  if (u.inflight)
						    scan_children(u, inc_inflight_move_tail)

						// V count=1 inflight=2 (!)

If there is a GC-candidate listening socket, lock/unlock its state. This
makes GC wait until the end of any ongoing connect() to that socket. After
flipping the lock, a possibly SCM-laden embryo is already enqueued. And if
there is another embryo coming, it can not possibly carry SCM_RIGHTS. At
this point, unix_inflight() can not happen because unix_gc_lock is already
taken. Inflight graph remains unaffected.

Fixes: 1fd05ba5a2 ("[AF_UNIX]: Rewrite garbage collector, fixes race.")
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20240409201047.1032217-1-mhal@rbox.co
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-04-11 09:46:15 +02:00
Arınç ÜNAL
17c5601132 net: dsa: mt7530: trap link-local frames regardless of ST Port State
In Clause 5 of IEEE Std 802-2014, two sublayers of the data link layer
(DLL) of the Open Systems Interconnection basic reference model (OSI/RM)
are described; the medium access control (MAC) and logical link control
(LLC) sublayers. The MAC sublayer is the one facing the physical layer.

In 8.2 of IEEE Std 802.1Q-2022, the Bridge architecture is described. A
Bridge component comprises a MAC Relay Entity for interconnecting the Ports
of the Bridge, at least two Ports, and higher layer entities with at least
a Spanning Tree Protocol Entity included.

Each Bridge Port also functions as an end station and shall provide the MAC
Service to an LLC Entity. Each instance of the MAC Service is provided to a
distinct LLC Entity that supports protocol identification, multiplexing,
and demultiplexing, for protocol data unit (PDU) transmission and reception
by one or more higher layer entities.

It is described in 8.13.9 of IEEE Std 802.1Q-2022 that in a Bridge, the LLC
Entity associated with each Bridge Port is modeled as being directly
connected to the attached Local Area Network (LAN).

On the switch with CPU port architecture, CPU port functions as Management
Port, and the Management Port functionality is provided by software which
functions as an end station. Software is connected to an IEEE 802 LAN that
is wholly contained within the system that incorporates the Bridge.
Software provides access to the LLC Entity associated with each Bridge Port
by the value of the source port field on the special tag on the frame
received by software.

We call frames that carry control information to determine the active
topology and current extent of each Virtual Local Area Network (VLAN),
i.e., spanning tree or Shortest Path Bridging (SPB) and Multiple VLAN
Registration Protocol Data Units (MVRPDUs), and frames from other link
constrained protocols, such as Extensible Authentication Protocol over LAN
(EAPOL) and Link Layer Discovery Protocol (LLDP), link-local frames. They
are not forwarded by a Bridge. Permanently configured entries in the
filtering database (FDB) ensure that such frames are discarded by the
Forwarding Process. In 8.6.3 of IEEE Std 802.1Q-2022, this is described in
detail:

Each of the reserved MAC addresses specified in Table 8-1
(01-80-C2-00-00-[00,01,02,03,04,05,06,07,08,09,0A,0B,0C,0D,0E,0F]) shall be
permanently configured in the FDB in C-VLAN components and ERs.

Each of the reserved MAC addresses specified in Table 8-2
(01-80-C2-00-00-[01,02,03,04,05,06,07,08,09,0A,0E]) shall be permanently
configured in the FDB in S-VLAN components.

Each of the reserved MAC addresses specified in Table 8-3
(01-80-C2-00-00-[01,02,04,0E]) shall be permanently configured in the FDB
in TPMR components.

The FDB entries for reserved MAC addresses shall specify filtering for all
Bridge Ports and all VIDs. Management shall not provide the capability to
modify or remove entries for reserved MAC addresses.

The addresses in Table 8-1, Table 8-2, and Table 8-3 determine the scope of
propagation of PDUs within a Bridged Network, as follows:

  The Nearest Bridge group address (01-80-C2-00-00-0E) is an address that
  no conformant Two-Port MAC Relay (TPMR) component, Service VLAN (S-VLAN)
  component, Customer VLAN (C-VLAN) component, or MAC Bridge can forward.
  PDUs transmitted using this destination address, or any other addresses
  that appear in Table 8-1, Table 8-2, and Table 8-3
  (01-80-C2-00-00-[00,01,02,03,04,05,06,07,08,09,0A,0B,0C,0D,0E,0F]), can
  therefore travel no further than those stations that can be reached via a
  single individual LAN from the originating station.

  The Nearest non-TPMR Bridge group address (01-80-C2-00-00-03), is an
  address that no conformant S-VLAN component, C-VLAN component, or MAC
  Bridge can forward; however, this address is relayed by a TPMR component.
  PDUs using this destination address, or any of the other addresses that
  appear in both Table 8-1 and Table 8-2 but not in Table 8-3
  (01-80-C2-00-00-[00,03,05,06,07,08,09,0A,0B,0C,0D,0F]), will be relayed
  by any TPMRs but will propagate no further than the nearest S-VLAN
  component, C-VLAN component, or MAC Bridge.

  The Nearest Customer Bridge group address (01-80-C2-00-00-00) is an
  address that no conformant C-VLAN component, MAC Bridge can forward;
  however, it is relayed by TPMR components and S-VLAN components. PDUs
  using this destination address, or any of the other addresses that appear
  in Table 8-1 but not in either Table 8-2 or Table 8-3
  (01-80-C2-00-00-[00,0B,0C,0D,0F]), will be relayed by TPMR components and
  S-VLAN components but will propagate no further than the nearest C-VLAN
  component or MAC Bridge.
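
The table membership quoted above can be expressed as three bitmasks over
the final octet. This is a sketch for illustration only, not code from the
mt7530 driver; the sketch_* names are hypothetical and the masks are
transcribed from the Table 8-1, Table 8-2 and Table 8-3 address lists
given earlier in this message.

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_TABLE_8_1 0x1u
#define SKETCH_TABLE_8_2 0x2u
#define SKETCH_TABLE_8_3 0x4u

/* One bit per final octet 0x00..0x0F. */
#define SKETCH_MASK_8_1 0xFFFFu	/* 00 through 0F */
#define SKETCH_MASK_8_2 0x47FEu	/* 01 through 0A, and 0E */
#define SKETCH_MASK_8_3 0x4016u	/* 01, 02, 04, 0E */

/*
 * Returns a bitmask of the tables that list the given destination
 * address, or 0 if it is not a reserved 01-80-C2-00-00-0X address.
 */
static unsigned int sketch_reserved_tables(const uint8_t mac[6])
{
	static const uint8_t prefix[5] = { 0x01, 0x80, 0xC2, 0x00, 0x00 };
	unsigned int bit, tables = 0;

	if (memcmp(mac, prefix, sizeof(prefix)) != 0 || mac[5] > 0x0F)
		return 0;

	bit = 1u << mac[5];
	if (SKETCH_MASK_8_1 & bit)
		tables |= SKETCH_TABLE_8_1;
	if (SKETCH_MASK_8_2 & bit)
		tables |= SKETCH_TABLE_8_2;
	if (SKETCH_MASK_8_3 & bit)
		tables |= SKETCH_TABLE_8_3;
	return tables;
}
```

An address found in all three tables (such as 01-80-C2-00-00-0E) is
filtered by every component type, one found only in Table 8-1 (such as
01-80-C2-00-00-00) is relayed by TPMR and S-VLAN components, and so on.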

Because the LLC Entity associated with each Bridge Port is provided via CPU
port, we must not filter these frames but forward them to CPU port.

In a Bridge, the transmission Port is mainly decided by ingress and egress
rules, FDB, and spanning tree Port State functions of the Forwarding
Process. For link-local frames, only CPU port should be designated as
destination port in the FDB, and the other functions of the Forwarding
Process must not interfere with the decision of the transmission Port. We
call this process trapping frames to CPU port.

Therefore, on the switch with CPU port architecture, link-local frames must
be trapped to CPU port, and certain link-local frames received by a Port of
a Bridge comprising a TPMR component or an S-VLAN component must be
excluded from it.

A Bridge of the switch with CPU port architecture cannot comprise a
Two-Port MAC Relay (TPMR) component as a TPMR component supports only a
subset of the functionality of a MAC Bridge. A Bridge comprising two Ports
(Management Port doesn't count) of this architecture will either function
as a standard MAC Bridge or a standard VLAN Bridge.

Therefore, a Bridge of this architecture can only comprise S-VLAN
components, C-VLAN components, or MAC Bridge components. Since there's no
TPMR component, we don't need to relay PDUs using the destination addresses
specified in the Nearest non-TPMR section, and the portion of the
Nearest Customer Bridge section where they must be relayed by TPMR
components.

One option to trap link-local frames to CPU port is to add static FDB
entries with CPU port designated as destination port. However, because
Independent VLAN Learning (IVL) is being used on every VID, each entry only
applies to a single VLAN Identifier (VID). For a Bridge comprising a MAC
Bridge component or a C-VLAN component, there would have to be 16 times
4096 entries. This switch intellectual property can only hold a maximum of
2048 entries. Using this option, there also isn't a mechanism to prevent
link-local frames from being discarded when the spanning tree Port State of
the reception Port is discarding.
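
The capacity argument above can be checked with a few lines of arithmetic.
This is only an illustrative sketch: the constant and function names are
made up for this example and do not come from the driver. The factor of 16
and the 2048-entry limit are taken from the text above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values stated in the commit message; names are hypothetical. */
#define ENTRIES_PER_VID	16u	/* the "16 times" factor above */
#define VID_COUNT	4096u	/* one entry per VID under IVL */
#define FDB_CAPACITY	2048u	/* maximum entries the switch can hold */

/* Static FDB entries needed when every address needs one entry per VID. */
static uint32_t static_entries_needed(uint32_t per_vid, uint32_t vids)
{
	return per_vid * vids;
}

/* True when the requested number of entries fits in the FDB. */
static bool fits_in_fdb(uint32_t needed)
{
	return needed <= FDB_CAPACITY;
}
```

With the values above, 65536 entries would be needed, which is 32 times
the stated capacity.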

The remaining option is to utilise the BPC, RGAC1, RGAC2, RGAC3, and RGAC4
registers. Whilst this applies to every VID, it doesn't contain all of the
reserved MAC addresses without affecting the remaining Standard Group MAC
Addresses. The REV_UN frame tag utilised using the RGAC4 register covers
the remaining 01-80-C2-00-00-[04,05,06,07,08,09,0A,0B,0C,0D,0F] destination
addresses. It also includes the 01-80-C2-00-00-22 to 01-80-C2-00-00-FF
destination addresses which may be relayed by MAC Bridges or VLAN Bridges.
The latter option provides better but not complete conformance.

This switch intellectual property also does not provide a mechanism to trap
link-local frames with specific destination addresses to CPU port by
Bridge, to conform to the filtering rules for the distinct Bridge
components.

Therefore, regardless of the type of the Bridge component, link-local
frames with these destination addresses will be trapped to CPU port:

01-80-C2-00-00-[00,01,02,03,0E]

In a Bridge comprising a MAC Bridge component or a C-VLAN component:

  Link-local frames with these destination addresses won't be trapped to
  CPU port which won't conform to IEEE Std 802.1Q-2022:

  01-80-C2-00-00-[04,05,06,07,08,09,0A,0B,0C,0D,0F]

In a Bridge comprising an S-VLAN component:

  Link-local frames with these destination addresses will be trapped to CPU
  port which won't conform to IEEE Std 802.1Q-2022:

  01-80-C2-00-00-00

  Link-local frames with these destination addresses won't be trapped to
  CPU port which won't conform to IEEE Std 802.1Q-2022:

  01-80-C2-00-00-[04,05,06,07,08,09,0A]
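
As a rough illustration of the resulting behaviour, the set of destination
addresses that is always trapped to CPU port can be written as a small
predicate. This is a sketch only: the function name is hypothetical and the
code models the address list above, not the register programming.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * Returns true for 01-80-C2-00-00-[00,01,02,03,0E], the addresses the
 * commit message lists as trapped regardless of Bridge component type.
 */
static bool is_trapped_link_local(const uint8_t mac[6])
{
	static const uint8_t prefix[5] = { 0x01, 0x80, 0xC2, 0x00, 0x00 };

	if (memcmp(mac, prefix, sizeof(prefix)) != 0)
		return false;

	switch (mac[5]) {
	case 0x00:
	case 0x01:
	case 0x02:
	case 0x03:
	case 0x0E:
		return true;
	default:
		return false;
	}
}
```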

Currently on this switch intellectual property, if the spanning tree Port
State of the reception Port is discarding, link-local frames will be
discarded.

To trap link-local frames regardless of the spanning tree Port State, make
the switch regard them as Bridge Protocol Data Units (BPDUs). This switch
intellectual property only lets the frames regarded as BPDUs bypass the
spanning tree Port State function of the Forwarding Process.

With this change, the only remaining interference is the ingress rules.
When the reception Port has no PVID assigned in software, VLAN-untagged
frames won't be allowed in. There doesn't seem to be a mechanism on the
switch intellectual property to have link-local frames bypass this function
of the Forwarding Process.

Fixes: b8f126a8d5 ("net-next: dsa: add dsa support for Mediatek MT7530 switch")
Reviewed-by: Daniel Golle <daniel@makrotopia.org>
Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Link: https://lore.kernel.org/r/20240409-b4-for-net-mt7530-fix-link-local-when-stp-discarding-v2-1-07b1150164ac@arinc9.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-04-11 09:26:41 +02:00
Gerd Bayer
d51dc8dd6a Revert "s390/ism: fix receive message buffer allocation"
This reverts commit 58effa3476.
Review was not finished on this patch. So it's not ready for
upstreaming.

Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
Link: https://lore.kernel.org/r/20240409113753.2181368-1-gbayer@linux.ibm.com
Fixes: 58effa3476 ("s390/ism: fix receive message buffer allocation")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-04-11 09:18:18 +02:00
Daniel Machon
33623113a4 net: sparx5: fix wrong config being used when reconfiguring PCS
The wrong port config is being used if the PCS is reconfigured. Fix this
by correctly using the new config instead of the old one.

Fixes: 946e7fd505 ("net: sparx5: add port module support")
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20240409-link-mode-reconfiguration-fix-v2-1-db6a507f3627@microchip.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-04-11 08:30:24 +02:00
Dave Airlie
b4589db566 amd-drm-fixes-6.9-2024-04-10:
amdgpu:
- GPU reset fixes
- Fix some confusing logging
- UMSCH fix
- Aborted suspend fix
- DCN 3.5 fixes
- S4 fix
- MES logging fixes
- SMU 14 fixes
- SDMA 4.4.2 fix
- KASAN fix
- SMU 13.0.10 fix
- VCN partition fix
- GFX11 fixes
- DWB fixes
- Plane handling fix
- FAMS fix
- DCN 3.1.6 fix
- VSC SDP fixes
- OLED panel fix
- GFX 11.5 fix

amdkfd:
- GPU reset fixes

Merge tag 'amd-drm-fixes-6.9-2024-04-10' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240411013425.6431-1-alexander.deucher@amd.com
2024-04-11 14:47:39 +10:00
Dave Airlie
aaf00e6150 Display fixes:
- Couple CDCLK programming fixes (Ville)
- HDCP related fix (Suraj)
- 4 Bigjoiner related fixes (Ville)

Core fix:
- Fix for a circular locking around GuC on reset+wedged case (John)

Merge tag 'drm-intel-fixes-2024-04-10' of https://anongit.freedesktop.org/git/drm/drm-intel into drm-fixes

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZhcJxlzc6zLMC1c-@intel.com
2024-04-11 13:52:35 +10:00
Arnd Bergmann
fe87922cee net/mlx5: fix possible stack overflows
A couple of debug functions use a 512 byte temporary buffer and call another
function that has another buffer of the same size, which in turn exceeds the
usual warning limit for excessive stack usage:

drivers/net/ethernet/mellanox/mlx5/core/steering/dr_dbg.c:1073:1: error: stack frame size (1448) exceeds limit (1024) in 'dr_dump_start' [-Werror,-Wframe-larger-than]
dr_dump_start(struct seq_file *file, loff_t *pos)
drivers/net/ethernet/mellanox/mlx5/core/steering/dr_dbg.c:1009:1: error: stack frame size (1120) exceeds limit (1024) in 'dr_dump_domain' [-Werror,-Wframe-larger-than]
dr_dump_domain(struct seq_file *file, struct mlx5dr_domain *dmn)
drivers/net/ethernet/mellanox/mlx5/core/steering/dr_dbg.c:705:1: error: stack frame size (1104) exceeds limit (1024) in 'dr_dump_matcher_rx_tx' [-Werror,-Wframe-larger-than]
dr_dump_matcher_rx_tx(struct seq_file *file, bool is_rx,

Rework these so that each of the various code paths only ever has one of
these buffers in it, and exactly the functions that declare one have
the 'noinline_for_stack' annotation that prevents them from all being
inlined into the same caller.
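
The pattern described above can be sketched in user-space C. This is not
the mlx5 code: the function names and buffer contents are invented, and
`noinline_for_stack` is defined here via the GCC/Clang `noinline` attribute
as an assumption so the example builds outside the kernel.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define DUMP_BUF_SIZE 512
/* Assumed user-space stand-in for the kernel annotation of the same name. */
#define noinline_for_stack __attribute__((noinline))

/* Append src to dst (capacity cap), keeping dst NUL-terminated. */
static void append(char *dst, size_t cap, const char *src)
{
	size_t used = strlen(dst);

	if (used + 1 < cap)
		snprintf(dst + used, cap - used, "%s", src);
}

/* Each function that owns a 512 byte buffer is kept out of line. */
static noinline_for_stack void dump_header(char *out, size_t cap, int id)
{
	char buf[DUMP_BUF_SIZE];

	snprintf(buf, sizeof(buf), "domain %d;", id);
	append(out, cap, buf);
}

static noinline_for_stack void dump_entry(char *out, size_t cap, int idx)
{
	char buf[DUMP_BUF_SIZE];

	snprintf(buf, sizeof(buf), "entry %d;", idx);
	append(out, cap, buf);
}

/* The caller declares no large buffer, so only one is live at a time. */
static void dump_all(char *out, size_t cap, int id, int entries)
{
	int i;

	dump_header(out, cap, id);
	for (i = 0; i < entries; i++)
		dump_entry(out, cap, i);
}
```

Because the buffer owners cannot be inlined into `dump_all()`, their frames
are not merged into one large caller frame.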

Fixes: 917d1e799d ("net/mlx5: DR, Change SWS usage to debug fs seq_file interface")
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/all/20240219100506.648089-1-arnd@kernel.org/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20240408074142.3007036-1-arnd@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-10 19:56:12 -07:00
Jakub Kicinski
186abfcda0 Merge branch 'mlx5-misc-fixes'
Tariq Toukan says:

====================
mlx5 misc fixes

This patchset provides bug fixes to mlx5 driver.

This is V2 of the series previously submitted as PR by Saeed:
https://lore.kernel.org/netdev/20240326144646.2078893-1-saeed@kernel.org/T/

Series generated against:
commit 237f3cf13b ("xsk: validate user input for XDP_{UMEM|COMPLETION}_FILL_RING")
====================

Link: https://lore.kernel.org/r/20240409190820.227554-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-10 19:50:26 -07:00
Tariq Toukan
7772dc7460 net/mlx5: Disallow SRIOV switchdev mode when in multi-PF netdev
Adaptations need to be made for the auxiliary device management in the
core driver level. Block this combination for now.

Fixes: 678eb44805 ("net/mlx5: SD, Implement basic query and instantiation")
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Link: https://lore.kernel.org/r/20240409190820.227554-12-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-10 19:48:14 -07:00
Carolina Jubran
49e6c93870 net/mlx5e: RSS, Block XOR hash with over 128 channels
When supporting more than 128 channels, the RQT size is
calculated by multiplying the number of channels by 2
and rounding up to the nearest power of 2.

The index of the RQT is derived from the RSS hash
calculations. If XOR8 is used as the RSS hash function,
there are only 256 possible hash results, and therefore,
only 256 indexes can be reached in the RQT.

Block setting the RSS hash function to XOR when the number
of channels exceeds 128.
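
The arithmetic behind the 128-channel limit can be sketched as follows.
The helper names are hypothetical and the sizing rule is only the one
stated above (channels times 2, rounded up to a power of 2); the driver's
actual helpers may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* An 8-bit XOR hash can produce only 256 distinct results. */
#define XOR8_HASH_RESULTS 256u

/* Round v up to the nearest power of two (1 <= v <= 2^31). */
static uint32_t roundup_pow_of_two_u32(uint32_t v)
{
	uint32_t p = 1;

	assert(v >= 1 && v <= (1u << 31));
	while (p < v)
		p <<= 1;
	return p;
}

/* RQT size per the rule above: channels * 2, rounded up to a power of 2. */
static uint32_t rqt_size_for_channels(uint32_t channels)
{
	return roundup_pow_of_two_u32(channels * 2);
}

/* True when every RQT index is reachable with an 8-bit hash result. */
static bool xor8_reaches_whole_rqt(uint32_t channels)
{
	return rqt_size_for_channels(channels) <= XOR8_HASH_RESULTS;
}
```

Under this rule, 128 channels give a 256-entry table, while 129 channels
already give 512 entries, half of which XOR8 could never select.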

Fixes: 74a8dadac1 ("net/mlx5e: Preparations for supporting larger number of channels")
Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20240409190820.227554-11-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-10 19:48:14 -07:00
Rahul Rameshbabu
86b0ca5b11 net/mlx5e: Do not produce metadata freelist entries in Tx port ts WQE xmit
Free Tx port timestamping metadata entries in the NAPI poll context and
consume metadata entries in the WQE xmit path. Do not free a Tx port
timestamping metadata entry in the WQE xmit path even in the error path to
avoid a race between two metadata entry producers.

Fixes: 3178308ad4 ("net/mlx5e: Make tx_port_ts logic resilient to out-of-order CQEs")
Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20240409190820.227554-10-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-10 19:48:14 -07:00
Carolina Jubran
2f436f1869 net/mlx5e: HTB, Fix inconsistencies with QoS SQs number
When creating a new HTB class while the interface is down,
the variable that tracks the number of QoS SQs (htb_max_qos_sqs)
may not be consistent with the number of HTB classes.

Previously, we compared these two values to ensure that
the node_qid is lower than the number of QoS SQs, and we
allocated stats for that SQ when they are equal.

Change the check to compare the node_qid with the current
number of leaf nodes and fix the checking conditions to
ensure allocation of stats_list and stats for each node.

Fixes: 214baf2287 ("net/mlx5e: Support HTB offload")
Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20240409190820.227554-9-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-10 19:48:14 -07:00
Carolina Jubran
ecb829459a net/mlx5e: Fix mlx5e_priv_init() cleanup flow
When mlx5e_priv_init() fails, the cleanup flow calls mlx5e_selq_cleanup(),
which calls mlx5e_selq_apply(), which in turn asserts that
`priv->state_lock` is held using lockdep_is_held().

Acquire the state_lock in mlx5e_selq_cleanup().

Kernel log:
=============================
WARNING: suspicious RCU usage
6.8.0-rc3_net_next_841a9b5 #1 Not tainted
-----------------------------
drivers/net/ethernet/mellanox/mlx5/core/en/selq.c:124 suspicious rcu_dereference_protected() usage!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1
2 locks held by systemd-modules/293:
 #0: ffffffffa05067b0 (devices_rwsem){++++}-{3:3}, at: ib_register_client+0x109/0x1b0 [ib_core]
 #1: ffff8881096c65c0 (&device->client_data_rwsem){++++}-{3:3}, at: add_client_context+0x104/0x1c0 [ib_core]

stack backtrace:
CPU: 4 PID: 293 Comm: systemd-modules Not tainted 6.8.0-rc3_net_next_841a9b5 #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x8a/0xa0
 lockdep_rcu_suspicious+0x154/0x1a0
 mlx5e_selq_apply+0x94/0xa0 [mlx5_core]
 mlx5e_selq_cleanup+0x3a/0x60 [mlx5_core]
 mlx5e_priv_init+0x2be/0x2f0 [mlx5_core]
 mlx5_rdma_setup_rn+0x7c/0x1a0 [mlx5_core]
 rdma_init_netdev+0x4e/0x80 [ib_core]
 ? mlx5_rdma_netdev_free+0x70/0x70 [mlx5_core]
 ipoib_intf_init+0x64/0x550 [ib_ipoib]
 ipoib_intf_alloc+0x4e/0xc0 [ib_ipoib]
 ipoib_add_one+0xb0/0x360 [ib_ipoib]
 add_client_context+0x112/0x1c0 [ib_core]
 ib_register_client+0x166/0x1b0 [ib_core]
 ? 0xffffffffa0573000
 ipoib_init_module+0xeb/0x1a0 [ib_ipoib]
 do_one_initcall+0x61/0x250
 do_init_module+0x8a/0x270
 init_module_from_file+0x8b/0xd0
 idempotent_init_module+0x17d/0x230
 __x64_sys_finit_module+0x61/0xb0
 do_syscall_64+0x71/0x140
 entry_SYSCALL_64_after_hwframe+0x46/0x4e
 </TASK>

Fixes: 8bf30be750 ("net/mlx5e: Introduce select queue parameters")
Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20240409190820.227554-8-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-10 19:48:14 -07:00
Carolina Jubran
ee3572409f net/mlx5e: RSS, Block changing channels number when RXFH is configured
Changing the channels number after configuring the receive flow hash
indirection table may affect the RSS table size. The previous
configuration may no longer be compatible with the new receive flow
hash indirection table.

Block changing the channels number when RXFH is configured and changing
the channels number requires resizing the RSS table size.

Fixes: 74a8dadac1 ("net/mlx5e: Preparations for supporting larger number of channels")
Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20240409190820.227554-7-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-10 19:48:14 -07:00
Cosmin Ratiu
9eca93f4d5 net/mlx5: Correctly compare pkt reformat ids
struct mlx5_pkt_reformat contains a naked union of a u32 id and a
dr_action pointer which is used when the action is SW-managed (when
pkt_reformat.owner is set to MLX5_FLOW_RESOURCE_OWNER_SW). Using id
directly in that case is incorrect, as it maps to the least significant
32 bits of the 64-bit pointer in mlx5_fs_dr_action and not to the pkt
reformat id allocated in firmware.

For the purpose of comparing whether two rules are identical,
interpreting the least significant 32 bits of the mlx5_fs_dr_action
pointer as an id mostly works... until it breaks horribly and produces
the outcome described in [1].

This patch fixes mlx5_flow_dests_cmp to correctly compare ids using
mlx5_fs_dr_action_get_pkt_reformat_id for the SW-managed rules.
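
The aliasing problem can be illustrated with a simplified model. The types
and helpers below are hypothetical stand-ins, not the mlx5 definitions;
they only show why the id has to be fetched through the owner-specific
path before comparing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum reformat_owner { OWNER_FW, OWNER_SW };

/* Hypothetical stand-in for the SW-managed action object. */
struct sw_action {
	uint32_t reformat_id;
};

struct pkt_reformat {
	enum reformat_owner owner;
	union {
		uint32_t id;              /* valid when owner == OWNER_FW */
		struct sw_action *action; /* valid when owner == OWNER_SW */
	};
};

/* Fetch the id through the member that is valid for this owner. */
static uint32_t pkt_reformat_id(const struct pkt_reformat *r)
{
	assert(r != NULL);
	if (r->owner == OWNER_SW)
		return r->action->reformat_id;
	return r->id;
}

/* Compare two reformat objects by their real ids, never by raw union bits. */
static bool pkt_reformat_same(const struct pkt_reformat *a,
			      const struct pkt_reformat *b)
{
	return pkt_reformat_id(a) == pkt_reformat_id(b);
}
```

In this model, two SW-managed objects that point at different action
structures with the same id compare equal, whereas comparing the raw `id`
member would compare pointer bits instead.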

Link: https://lore.kernel.org/netdev/ea5264d6-6b55-4449-a602-214c6f509c1e@163.com/T/#u [1]

Fixes: 6a48faeeca ("net/mlx5: Add direct rule fs_cmd implementation")
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20240409190820.227554-6-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-10 19:48:14 -07:00
Cosmin Ratiu
7c6782ad49 net/mlx5: Properly link new fs rules into the tree
Previously, add_rule_fg would only add newly created rules from the
handle into the tree when they had a refcount of 1. On the other hand,
create_flow_handle tries hard to find and reference already existing
identical rules instead of creating new ones.

These two behaviors can result in a situation where create_flow_handle
1) creates a new rule and references it, then
2) in a subsequent step during the same handle creation references it
   again,
resulting in a rule with a refcount of 2 that is not linked into the
tree, will have a NULL parent and root and will result in a crash when
the flow group is deleted because del_sw_hw_rule, invoked on rule
deletion, assumes node->parent is != NULL.

This happened in the wild, due to another bug related to incorrect
handling of duplicate pkt_reformat ids, which led to the code in
create_flow_handle incorrectly referencing a just-added rule in the same
flow handle, resulting in the problem described above. Full details are
at [1].

This patch changes add_rule_fg to add new rules without parents into
the tree, properly initializing them and avoiding the crash. This makes
it more consistent with how rules are added to an FTE in
create_flow_handle.
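
The difference between the two linking conditions can be shown with a toy
model. The structure and function names are invented for illustration and
do not mirror the real fs_core types.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy tree node: parent stays NULL until the node is linked into the tree. */
struct tree_node {
	struct tree_node *parent;
	int refcount;
};

/* Old condition as described above: link only rules with a refcount of 1. */
static bool should_link_by_refcount(const struct tree_node *rule)
{
	return rule->refcount == 1;
}

/* New condition as described above: link any rule that has no parent yet. */
static bool should_link_by_parent(const struct tree_node *rule)
{
	return rule->parent == NULL;
}

/* Attach the rule under its flow table entry. */
static void link_rule(struct tree_node *rule, struct tree_node *fte)
{
	rule->parent = fte;
}
```

A rule that was created and then referenced a second time during the same
handle creation has a refcount of 2 and no parent: the refcount test skips
it, while the parent test still links it.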

Fixes: 74491de937 ("net/mlx5: Add multi dest support")
Link: https://lore.kernel.org/netdev/ea5264d6-6b55-4449-a602-214c6f509c1e@163.com/T/#u [1]
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20240409190820.227554-5-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-10 19:48:13 -07:00
Michael Liang
9f7e8fbb91 net/mlx5: offset comp irq index in name by one
The mlx5 comp irq name scheme changed slightly between
commit 3663ad34bc ("net/mlx5: Shift control IRQ to the last index")
and commit 3354822cde ("net/mlx5: Use dynamic msix vectors allocation").
The index in the comp irq name used to start from 0 but now it starts
from 1. There is nothing critical here, but it's harmless to change
back to the old behavior, i.e. starting from 0.

Fixes: 3354822cde ("net/mlx5: Use dynamic msix vectors allocation")
Reviewed-by: Mohamed Khalfella <mkhalfella@purestorage.com>
Reviewed-by: Yuanyuan Zhong <yzhong@purestorage.com>
Signed-off-by: Michael Liang <mliang@purestorage.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20240409190820.227554-4-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-10 19:48:13 -07:00
Shay Drory
c6e77aa9dd net/mlx5: Register devlink first under devlink lock
If the device hits a non-fatal FW error during probe, the
driver reports the error to the user via devlink. This triggers
a WARN_ON, since mlx5 calls devlink_register() last.
In order to avoid the WARN_ON[1], change mlx5 to invoke devl_register()
first under devlink lock.

[1]
WARNING: CPU: 5 PID: 227 at net/devlink/health.c:483 devlink_recover_notify.constprop.0+0xb8/0xc0
CPU: 5 PID: 227 Comm: kworker/u16:3 Not tainted 6.4.0-rc5_for_upstream_min_debug_2023_06_12_12_38 #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Workqueue: mlx5_health0000:08:00.0 mlx5_fw_reporter_err_work [mlx5_core]
RIP: 0010:devlink_recover_notify.constprop.0+0xb8/0xc0
Call Trace:
 <TASK>
 ? __warn+0x79/0x120
 ? devlink_recover_notify.constprop.0+0xb8/0xc0
 ? report_bug+0x17c/0x190
 ? handle_bug+0x3c/0x60
 ? exc_invalid_op+0x14/0x70
 ? asm_exc_invalid_op+0x16/0x20
 ? devlink_recover_notify.constprop.0+0xb8/0xc0
 devlink_health_report+0x4a/0x1c0
 mlx5_fw_reporter_err_work+0xa4/0xd0 [mlx5_core]
 process_one_work+0x1bb/0x3c0
 ? process_one_work+0x3c0/0x3c0
 worker_thread+0x4d/0x3c0
 ? process_one_work+0x3c0/0x3c0
 kthread+0xc6/0xf0
 ? kthread_complete_and_exit+0x20/0x20
 ret_from_fork+0x1f/0x30
 </TASK>

Fixes: cf53021740 ("devlink: Notify users when objects are accessible")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20240409190820.227554-3-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-10 19:48:13 -07:00