Commit Graph

935002 Commits

Author SHA1 Message Date
Qu Wenruo
fa91e4aa17 btrfs: qgroup: fix data leak caused by race between writeback and truncate
[BUG]
When running tests like generic/013 on test device with btrfs quota
enabled, it can normally lead to data leak, detected at unmount time:

  BTRFS warning (device dm-3): qgroup 0/5 has unreleased space, type 0 rsv 4096
  ------------[ cut here ]------------
  WARNING: CPU: 11 PID: 16386 at fs/btrfs/disk-io.c:4142 close_ctree+0x1dc/0x323 [btrfs]
  RIP: 0010:close_ctree+0x1dc/0x323 [btrfs]
  Call Trace:
   btrfs_put_super+0x15/0x17 [btrfs]
   generic_shutdown_super+0x72/0x110
   kill_anon_super+0x18/0x30
   btrfs_kill_super+0x17/0x30 [btrfs]
   deactivate_locked_super+0x3b/0xa0
   deactivate_super+0x40/0x50
   cleanup_mnt+0x135/0x190
   __cleanup_mnt+0x12/0x20
   task_work_run+0x64/0xb0
   __prepare_exit_to_usermode+0x1bc/0x1c0
   __syscall_return_slowpath+0x47/0x230
   do_syscall_64+0x64/0xb0
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  ---[ end trace caf08beafeca2392 ]---
  BTRFS error (device dm-3): qgroup reserved space leaked

[CAUSE]
In the offending case, the offending operations are:
2/6: writev f2X[269 1 0 0 0 0] [1006997,67,288] 0
2/7: truncate f2X[269 1 0 0 48 1026293] 18388 0

The following sequence of events could happen after the writev():
	CPU1 (writeback)		|		CPU2 (truncate)
-----------------------------------------------------------------
btrfs_writepages()			|
|- extent_write_cache_pages()		|
   |- Got page for 1003520		|
   |  1003520 is Dirty, no writeback	|
   |  So (!clear_page_dirty_for_io())   |
   |  gets called for it		|
   |- Now page 1003520 is Clean.	|
   |					| btrfs_setattr()
   |					| |- btrfs_setsize()
   |					|    |- truncate_setsize()
   |					|       New i_size is 18388
   |- __extent_writepage()		|
   |  |- page_offset() > i_size		|
      |- btrfs_invalidatepage()		|
	 |- Page is clean, so no qgroup |
	    callback executed

This means, the qgroup reserved data space is not properly released in
btrfs_invalidatepage() as the page is Clean.

[FIX]
Instead of checking the dirty bit of a page, call
btrfs_qgroup_free_data() unconditionally in btrfs_invalidatepage().

As qgroup rsv are completely bound to the QGROUP_RESERVED bit of
io_tree, not bound to page status, thus we won't cause double freeing
anyway.

Fixes: 0b34c261e2 ("btrfs: qgroup: Prevent qgroup->reserved from going subzero")
CC: stable@vger.kernel.org # 4.14+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-21 22:08:32 +02:00
Paweł Gronowski
38e0c89a19 drm/amdgpu: Fix NULL dereference in dpm sysfs handlers
NULL dereference occurs when string that is not ended with space or
newline is written to some dpm sysfs interface (for example pp_dpm_sclk).
This happens because strsep replaces the tmp with NULL if the delimiter
is not present in string, which is then dereferenced by tmp[0].

Reproduction example:
sudo sh -c 'echo -n 1 > /sys/class/drm/card0/device/pp_dpm_sclk'

Signed-off-by: Paweł Gronowski <me@woland.xyz>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2020-07-21 16:00:01 -04:00
Qiu Wenbo
88bb16ad99 drm/amd/powerplay: fix a crash when overclocking Vega M
Avoid kernel crash when vddci_control is SMU7_VOLTAGE_CONTROL_NONE and
vddci_voltage_table is empty. It has been tested on Intel Hades Canyon
(i7-8809G).

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=208489
Fixes: ac7822b002 ("drm/amd/powerplay: add smumgr support for VEGAM (v2)")
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Qiu Wenbo <qiuwenbo@phytium.com.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2020-07-21 15:59:32 -04:00
Filipe Manana
580c079b57 btrfs: fix double free on ulist after backref resolution failure
At btrfs_find_all_roots_safe() we allocate a ulist and set the **roots
argument to point to it. However if later we fail due to an error returned
by find_parent_nodes(), we free that ulist but leave a dangling pointer in
the **roots argument. Upon receiving the error, a caller of this function
can attempt to free the same ulist again, resulting in an invalid memory
access.

One such scenario is during qgroup accounting:

btrfs_qgroup_account_extents()

 --> calls btrfs_find_all_roots() passes &new_roots (a stack allocated
     pointer) to btrfs_find_all_roots()

   --> btrfs_find_all_roots() just calls btrfs_find_all_roots_safe()
       passing &new_roots to it

     --> allocates ulist and assigns its address to **roots (which
         points to new_roots from btrfs_qgroup_account_extents())

     --> find_parent_nodes() returns an error, so we free the ulist
         and leave **roots pointing to it after returning

 --> btrfs_qgroup_account_extents() sees btrfs_find_all_roots() returned
     an error and jumps to the label 'cleanup', which just tries to
     free again the same ulist

Stack trace example:

 ------------[ cut here ]------------
 BTRFS: tree first key check failed
 WARNING: CPU: 1 PID: 1763215 at fs/btrfs/disk-io.c:422 btrfs_verify_level_key+0xe0/0x180 [btrfs]
 Modules linked in: dm_snapshot dm_thin_pool (...)
 CPU: 1 PID: 1763215 Comm: fsstress Tainted: G        W         5.8.0-rc3-btrfs-next-64 #1
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
 RIP: 0010:btrfs_verify_level_key+0xe0/0x180 [btrfs]
 Code: 28 5b 5d (...)
 RSP: 0018:ffffb89b473779a0 EFLAGS: 00010286
 RAX: 0000000000000000 RBX: ffff90397759bf08 RCX: 0000000000000000
 RDX: 0000000000000001 RSI: 0000000000000027 RDI: 00000000ffffffff
 RBP: ffff9039a419c000 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000000 R11: ffffb89b43301000 R12: 000000000000005e
 R13: ffffb89b47377a2e R14: ffffb89b473779af R15: 0000000000000000
 FS:  00007fc47e1e1000(0000) GS:ffff9039ac200000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007fc47e1df000 CR3: 00000003d9e4e001 CR4: 00000000003606e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Call Trace:
  read_block_for_search+0xf6/0x350 [btrfs]
  btrfs_next_old_leaf+0x242/0x650 [btrfs]
  resolve_indirect_refs+0x7cf/0x9e0 [btrfs]
  find_parent_nodes+0x4ea/0x12c0 [btrfs]
  btrfs_find_all_roots_safe+0xbf/0x130 [btrfs]
  btrfs_qgroup_account_extents+0x9d/0x390 [btrfs]
  btrfs_commit_transaction+0x4f7/0xb20 [btrfs]
  btrfs_sync_file+0x3d4/0x4d0 [btrfs]
  do_fsync+0x38/0x70
  __x64_sys_fdatasync+0x13/0x20
  do_syscall_64+0x5c/0xe0
  entry_SYSCALL_64_after_hwframe+0x44/0xa9
 RIP: 0033:0x7fc47e2d72e3
 Code: Bad RIP value.
 RSP: 002b:00007fffa32098c8 EFLAGS: 00000246 ORIG_RAX: 000000000000004b
 RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fc47e2d72e3
 RDX: 00007fffa3209830 RSI: 00007fffa3209830 RDI: 0000000000000003
 RBP: 000000000000072e R08: 0000000000000001 R09: 0000000000000003
 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000000003e8
 R13: 0000000051eb851f R14: 00007fffa3209970 R15: 00005607c4ac8b50
 irq event stamp: 0
 hardirqs last  enabled at (0): [<0000000000000000>] 0x0
 hardirqs last disabled at (0): [<ffffffffb8eb5e85>] copy_process+0x755/0x1eb0
 softirqs last  enabled at (0): [<ffffffffb8eb5e85>] copy_process+0x755/0x1eb0
 softirqs last disabled at (0): [<0000000000000000>] 0x0
 ---[ end trace 8639237550317b48 ]---
 BTRFS error (device sdc): tree first key mismatch detected, bytenr=62324736 parent_transid=94 key expected=(262,108,1351680) has=(259,108,1921024)
 general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6b6b: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC PTI
 CPU: 2 PID: 1763215 Comm: fsstress Tainted: G        W         5.8.0-rc3-btrfs-next-64 #1
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
 RIP: 0010:ulist_release+0x14/0x60 [btrfs]
 Code: c7 07 00 (...)
 RSP: 0018:ffffb89b47377d60 EFLAGS: 00010282
 RAX: 6b6b6b6b6b6b6b6b RBX: ffff903959b56b90 RCX: 0000000000000000
 RDX: 0000000000000001 RSI: 0000000000270024 RDI: ffff9036e2adc840
 RBP: ffff9036e2adc848 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000000 R11: 0000000000000000 R12: ffff9036e2adc840
 R13: 0000000000000015 R14: ffff9039a419ccf8 R15: ffff90395d605840
 FS:  00007fc47e1e1000(0000) GS:ffff9039ac600000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007f8c1c0a51c8 CR3: 00000003d9e4e004 CR4: 00000000003606e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Call Trace:
  ulist_free+0x13/0x20 [btrfs]
  btrfs_qgroup_account_extents+0xf3/0x390 [btrfs]
  btrfs_commit_transaction+0x4f7/0xb20 [btrfs]
  btrfs_sync_file+0x3d4/0x4d0 [btrfs]
  do_fsync+0x38/0x70
  __x64_sys_fdatasync+0x13/0x20
  do_syscall_64+0x5c/0xe0
  entry_SYSCALL_64_after_hwframe+0x44/0xa9
 RIP: 0033:0x7fc47e2d72e3
 Code: Bad RIP value.
 RSP: 002b:00007fffa32098c8 EFLAGS: 00000246 ORIG_RAX: 000000000000004b
 RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fc47e2d72e3
 RDX: 00007fffa3209830 RSI: 00007fffa3209830 RDI: 0000000000000003
 RBP: 000000000000072e R08: 0000000000000001 R09: 0000000000000003
 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000000003e8
 R13: 0000000051eb851f R14: 00007fffa3209970 R15: 00005607c4ac8b50
 Modules linked in: dm_snapshot dm_thin_pool (...)
 ---[ end trace 8639237550317b49 ]---
 RIP: 0010:ulist_release+0x14/0x60 [btrfs]
 Code: c7 07 00 (...)
 RSP: 0018:ffffb89b47377d60 EFLAGS: 00010282
 RAX: 6b6b6b6b6b6b6b6b RBX: ffff903959b56b90 RCX: 0000000000000000
 RDX: 0000000000000001 RSI: 0000000000270024 RDI: ffff9036e2adc840
 RBP: ffff9036e2adc848 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000000 R11: 0000000000000000 R12: ffff9036e2adc840
 R13: 0000000000000015 R14: ffff9039a419ccf8 R15: ffff90395d605840
 FS:  00007fc47e1e1000(0000) GS:ffff9039ad200000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007f6a776f7d40 CR3: 00000003d9e4e002 CR4: 00000000003606e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

Fix this by making btrfs_find_all_roots_safe() set *roots to NULL after
it frees the ulist.

Fixes: 8da6d5815c ("Btrfs: added btrfs_find_all_roots()")
CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-21 21:59:15 +02:00
Serge Semin
551e553f0d serial: 8250_mtk: Fix high-speed baud rates clamping
Commit 7b668c064e ("serial: 8250: Fix max baud limit in generic 8250
port") fixed limits of a baud rate setting for a generic 8250 port.
In other words since that commit the baud rate has been permitted to be
within [uartclk / 16 / UART_DIV_MAX; uartclk / 16], which is absolutely
normal for a standard 8250 UART port. But there are custom 8250 ports,
which provide extended baud rate limits. In particular the Mediatek 8250
port can work with baud rates up to "uartclk" speed.

Normally that and any other peculiarity is supposed to be handled in a
custom set_termios() callback implemented in the vendor-specific
8250-port glue-driver. Currently that is how it's done for the most of
the vendor-specific 8250 ports, but for some reason for Mediatek a
solution has been spread out to both the glue-driver and to the generic
8250-port code. Due to that a bug has been introduced, which permitted the
extended baud rate limit for all even for standard 8250-ports. The bug
has been fixed by the commit 7b668c064e ("serial: 8250: Fix max baud
limit in generic 8250 port") by narrowing the baud rates limit back down to
the normal bounds. Unfortunately by doing so we also broke the
Mediatek-specific extended bauds feature.

A fix of the problem described above is twofold. First since we can't get
back the extended baud rate limits feature to the generic set_termios()
function and that method supports only a standard baud rates range, the
requested baud rate must be locally stored before calling it and then
restored back to the new termios structure after the generic set_termios()
finished its magic business. By doing so we still use the
serial8250_do_set_termios() method to set the LCR/MCR/FCR/etc. registers,
while the extended baud rate setting procedure will be performed later in
the custom Mediatek-specific set_termios() callback. Second since a true
baud rate is now fully calculated in the custom set_termios() method we
need to locally update the port timeout by calling the
uart_update_timeout() function. After the fixes described above are
implemented in the 8250_mtk.c driver, the Mediatek 8250-port should
get back to normally working with extended baud rates.

Link: https://lore.kernel.org/linux-serial/20200701211337.3027448-1-danielwinkler@google.com

Fixes: 7b668c064e ("serial: 8250: Fix max baud limit in generic 8250 port")
Reported-by: Daniel Winkler <danielwinkler@google.com>
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Cc: stable <stable@vger.kernel.org>
Tested-by: Claire Chang <tientzu@chromium.org>
Link: https://lore.kernel.org/r/20200714124113.20918-1-Sergey.Semin@baikalelectronics.ru
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-21 21:23:18 +02:00
Yang Yingliang
f4c23a140d serial: 8250: fix null-ptr-deref in serial8250_start_tx()
I got null-ptr-deref in serial8250_start_tx():

[   78.114630] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[   78.123778] Mem abort info:
[   78.126560]   ESR = 0x86000007
[   78.129603]   EC = 0x21: IABT (current EL), IL = 32 bits
[   78.134891]   SET = 0, FnV = 0
[   78.137933]   EA = 0, S1PTW = 0
[   78.141064] user pgtable: 64k pages, 48-bit VAs, pgdp=00000027d41a8600
[   78.147562] [0000000000000000] pgd=00000027893f0003, p4d=00000027893f0003, pud=00000027893f0003, pmd=00000027c9a20003, pte=0000000000000000
[   78.160029] Internal error: Oops: 86000007 [#1] SMP
[   78.164886] Modules linked in: sunrpc vfat fat aes_ce_blk crypto_simd cryptd aes_ce_cipher crct10dif_ce ghash_ce sha2_ce sha256_arm64 sha1_ce ses enclosure sg sbsa_gwdt ipmi_ssif spi_dw_mmio sch_fq_codel vhost_net tun vhost vhost_iotlb tap ip_tables ext4 mbcache jbd2 ahci hisi_sas_v3_hw libahci hisi_sas_main libsas hns3 scsi_transport_sas hclge libata megaraid_sas ipmi_si hnae3 ipmi_devintf ipmi_msghandler br_netfilter bridge stp llc nvme nvme_core xt_sctp sctp libcrc32c dm_mod nbd
[   78.207383] CPU: 11 PID: 23258 Comm: null-ptr Not tainted 5.8.0-rc6+ #48
[   78.214056] Hardware name: Huawei TaiShan 2280 V2/BC82AMDC, BIOS 2280-V2 CS V3.B210.01 03/12/2020
[   78.222888] pstate: 80400089 (Nzcv daIf +PAN -UAO BTYPE=--)
[   78.228435] pc : 0x0
[   78.230618] lr : serial8250_start_tx+0x160/0x260
[   78.235215] sp : ffff800062eefb80
[   78.238517] x29: ffff800062eefb80 x28: 0000000000000fff
[   78.243807] x27: ffff800062eefd80 x26: ffff202fd83b3000
[   78.249098] x25: ffff800062eefd80 x24: ffff202fd83b3000
[   78.254388] x23: ffff002fc5e50be8 x22: 0000000000000002
[   78.259679] x21: 0000000000000001 x20: 0000000000000000
[   78.264969] x19: ffffa688827eecc8 x18: 0000000000000000
[   78.270259] x17: 0000000000000000 x16: 0000000000000000
[   78.275550] x15: ffffa68881bc67a8 x14: 00000000000002e6
[   78.280841] x13: ffffa68881bc67a8 x12: 000000000000c539
[   78.286131] x11: d37a6f4de9bd37a7 x10: ffffa68881cccff0
[   78.291421] x9 : ffffa68881bc6000 x8 : ffffa688819daa88
[   78.296711] x7 : ffffa688822a0f20 x6 : ffffa688819e0000
[   78.302002] x5 : ffff800062eef9d0 x4 : ffffa68881e707a8
[   78.307292] x3 : 0000000000000000 x2 : 0000000000000002
[   78.312582] x1 : 0000000000000001 x0 : ffffa688827eecc8
[   78.317873] Call trace:
[   78.320312]  0x0
[   78.322147]  __uart_start.isra.9+0x64/0x78
[   78.326229]  uart_start+0xb8/0x1c8
[   78.329620]  uart_flush_chars+0x24/0x30
[   78.333442]  n_tty_receive_buf_common+0x7b0/0xc30
[   78.338128]  n_tty_receive_buf+0x44/0x2c8
[   78.342122]  tty_ioctl+0x348/0x11f8
[   78.345599]  ksys_ioctl+0xd8/0xf8
[   78.348903]  __arm64_sys_ioctl+0x2c/0xc8
[   78.352812]  el0_svc_common.constprop.2+0x88/0x1b0
[   78.357583]  do_el0_svc+0x44/0xd0
[   78.360887]  el0_sync_handler+0x14c/0x1d0
[   78.364880]  el0_sync+0x140/0x180
[   78.368185] Code: bad PC value

SERIAL_PORT_DFNS is not defined on each arch, if it's not defined,
serial8250_set_defaults() won't be called in serial8250_isa_init_ports(),
so the p->serial_in pointer won't be initialized, and it leads a null-ptr-deref.
Fix this problem by calling serial8250_set_defaults() after init uart port.

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200721143852.4058352-1-yangyingliang@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-21 21:23:18 +02:00
Johan Hovold
707631ce63 serial: tegra: drop bogus NULL tty-port checks
The struct tty_port is part of the uart state and will never be NULL in
the receive helpers. Drop the bogus NULL checks and rename the
pointer-variables "port" to differentiate them from struct tty_struct
pointers (which can be NULL).

Fixes: 962963e4ee ("serial: tegra: Switch to using struct tty_port")
Signed-off-by: Johan Hovold <johan@kernel.org>
Acked-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20200710135947.2737-3-johan@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-21 20:59:38 +02:00
Johan Hovold
b374c562ee serial: tegra: fix CREAD handling for PIO
Commit 33ae787b74 ("serial: tegra: add support to ignore read") added
support for dropping input in case CREAD isn't set, but for PIO the
ignore_status_mask wasn't checked until after the character had been
put in the receive buffer.

Note that the NULL tty-port test is bogus and will be removed by a
follow-on patch.

Fixes: 33ae787b74 ("serial: tegra: add support to ignore read")
Cc: stable <stable@vger.kernel.org>     # 5.4
Cc: Shardar Shariff Md <smohammed@nvidia.com>
Cc: Krishna Yarlagadda <kyarlagadda@nvidia.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Acked-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20200710135947.2737-2-johan@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-21 20:58:25 +02:00
Jason Gunthorpe
a862192e92 RDMA/mlx5: Prevent prefetch from racing with implicit destruction
Prefetch work in mlx5_ib_prefetch_mr_work can be queued and able to run
concurrently with destruction of the implicit MR. The num_deferred_work
was intended to serialize this, but there is a race:

       CPU0                                          CPU1

    mlx5_ib_free_implicit_mr()
      xa_erase(odp_mkeys)
      synchronize_srcu()
      __xa_erase(implicit_children)
                                      mlx5_ib_prefetch_mr_work()
                                        pagefault_mr()
                                         pagefault_implicit_mr()
                                          implicit_get_child_mr()
                                           xa_cmpxchg()
                                        atomic_dec_and_test(num_deferred_mr)
      wait_event(imr->q_deferred_work)
      ib_umem_odp_release(odp_imr)
        kfree(odp_imr)

At this point in mlx5_ib_free_implicit_mr() the implicit_children list is
supposed to be empty forever so that destroy_unused_implicit_child_mr()
and related are not and will not be running.

Since it is not empty the destroy_unused_implicit_child_mr() flow ends up
touching deallocated memory as mlx5_ib_free_implicit_mr() already tore down the
imr parent.

The solution is to flush out the prefetch wq by driving num_deferred_work
to zero after creation of new prefetch work is blocked.

Fixes: 5256edcb98 ("RDMA/mlx5: Rework implicit ODP destroy")
Link: https://lore.kernel.org/r/20200719065435.130722-1-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-21 13:51:35 -03:00
Helmut Grohne
22a82fa7d6 tty: xilinx_uartps: Really fix id assignment
The problems started with the revert (18cc7ac8a2). The
cdns_uart_console.index is statically assigned -1. When the port is
registered, Linux assigns consecutive numbers to it. It turned out that
when using ttyPS1 as console, the index is not updated as we are reusing
the same cdns_uart_console instance for multiple ports. When registering
ttyPS0, it gets updated from -1 to 0, but when registering ttyPS1, it
already is 0 and not updated.

That led to 2ae11c46d5. It assigns the index prior to registering
the uart_driver once. Unfortunately, that ended up breaking the
situation where the probe order does not match the id order. When using
the same device tree for both uboot and linux, it is important that the
serial0 alias points to the console. So some boards reverse those
aliases. This was reported by Jan Kiszka. The proposed fix was reverting
the index assignment and going back to the previous iteration.

However such a reversed assignement (serial0 -> uart1, serial1 -> uart0)
was already partially broken by the revert (18cc7ac8a2). While the
ttyPS device works, the kmsg connection is already broken and kernel
messages go missing. Reverting the id assignment does not fix this.

>From the xilinx_uartps driver pov (after reverting the refactoring
commits), there can be only one console. This manifests in static
variables console_pprt and cdns_uart_console. These variables are not
properly linked and can go out of sync. The cdns_uart_console.index is
important for uart_add_one_port. We call that function for each port -
one of which hopefully is the console. If it isn't, the CON_ENABLED flag
is not set and console_port is cleared. The next cdns_uart_probe call
then tries to register the next port using that same cdns_uart_console.

It is important that console_port and cdns_uart_console (and its index
in particular) stay in sync. The index assignment implemented by
Shubhrajyoti Datta is correct in principle. It just may have to happen a
second time if the first cdns_uart_probe call didn't encounter the
console device. And we shouldn't change the index once the console uart
is registered.

Reported-by: Shubhrajyoti Datta <shubhrajyoti.datta@xilinx.com>
Reported-by: Jan Kiszka <jan.kiszka@web.de>
Link: https://lore.kernel.org/linux-serial/f4092727-d8f5-5f91-2c9f-76643aace993@siemens.com/
Fixes: 18cc7ac8a2 ("Revert "serial: uartps: Register own uart console and driver structures"")
Fixes: 2ae11c46d5 ("tty: xilinx_uartps: Fix missing id assignment to the console")
Fixes: 76ed2e1057 ("Revert "tty: xilinx_uartps: Fix missing id assignment to the console"")
Signed-off-by: Helmut Grohne <helmut.grohne@intenta.de>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200713073227.GA3805@laureti-dev
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-21 18:12:03 +02:00
Tetsuo Handa
ce684552a2 vt: Reject zero-sized screen buffer size.
syzbot is reporting general protection fault in do_con_write() [1] caused
by vc->vc_screenbuf == ZERO_SIZE_PTR caused by vc->vc_screenbuf_size == 0
caused by vc->vc_cols == vc->vc_rows == vc->vc_size_row == 0 caused by
fb_set_var() from ioctl(FBIOPUT_VSCREENINFO) on /dev/fb0 , for
gotoxy(vc, 0, 0) from reset_terminal() from vc_init() from vc_allocate()
 from con_install() from tty_init_dev() from tty_open() on such console
causes vc->vc_pos == 0x10000000e due to
((unsigned long) ZERO_SIZE_PTR) + -1U * 0 + (-1U << 1).

I don't think that a console with 0 column or 0 row makes sense. And it
seems that vc_do_resize() does not intend to allow resizing a console to
0 column or 0 row due to

  new_cols = (cols ? cols : vc->vc_cols);
  new_rows = (lines ? lines : vc->vc_rows);

exception.

Theoretically, cols and rows can be any range as long as
0 < cols * rows * 2 <= KMALLOC_MAX_SIZE is satisfied (e.g.
cols == 1048576 && rows == 2 is possible) because of

  vc->vc_size_row = vc->vc_cols << 1;
  vc->vc_screenbuf_size = vc->vc_rows * vc->vc_size_row;

in visual_init() and kzalloc(vc->vc_screenbuf_size) in vc_allocate().

Since we can detect cols == 0 or rows == 0 via screenbuf_size = 0 in
visual_init(), we can reject kzalloc(0). Then, vc_allocate() will return
an error, and con_write() will not be called on a console with 0 column
or 0 row.

We need to make sure that integer overflow in visual_init() won't happen.
Since vc_do_resize() restricts cols <= 32767 and rows <= 32767, applying
1 <= cols <= 32767 and 1 <= rows <= 32767 restrictions to vc_allocate()
will be practically fine.

This patch does not touch con_init(), for returning -EINVAL there
does not help when we are not returning -ENOMEM.

[1] https://syzkaller.appspot.com/bug?extid=017265e8553724e514e8

Reported-and-tested-by: syzbot <syzbot+017265e8553724e514e8@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200712111013.11881-1-penguin-kernel@I-love.SAKURA.ne.jp
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-21 18:07:15 +02:00
Thomas Gleixner
b4a25fb0e6 - Fix kernel panic at suspend / resume time on TI am3/am4 (Tony Lindgren)
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGn3N4YVz0WNVyHskqDIjiipP6E8FAl8W9DAACgkQqDIjiipP
 6E9HUwf+Of/ePXWAO59IGMsiRdh+yYawc7blshVxtuw9vgfFfXSUXWyoAO8WH6zS
 VOzmlDoze+jdG4VKAr0elFOkXAHjdgfwsVoLAU1aJz1r/rcpt3j8oc8JPKTxck7I
 Yq5L9+I/36sULZt7Pa8VaendoswbrKDbHtQFqQaT+h2LtCdK2kEeRp44Xr3fqIPc
 fuHAzetjWJ20Iy7YRkI3jzL9DAcoXZWdctSoC8FYEWtP7RaGVZVnxBPyaoiGFVJc
 WnZApwxW6Ju03Uav6ypy7fP5A3utFi119pRuyadQYCJT1n9NlzZMmCA+fCCx09+Y
 fvuXIayg6bj7++YD1zLPPFn4rgTY/g==
 =vRtM
 -----END PGP SIGNATURE-----

Merge tag 'timers-v5.8-rc7' of https://git.linaro.org/people/daniel.lezcano/linux into timers/urgent

Pull a timer chip fix from Daniel Lezcano:

 - Fix kernel panic at suspend / resume time on TI am3/am4 (Tony Lindgren)
2020-07-21 17:18:32 +02:00
John David Anglin
be6577af0c parisc: Add atomic64_set_release() define to avoid CPU soft lockups
Stalls are quite frequent with recent kernels. I enabled
CONFIG_SOFTLOCKUP_DETECTOR and I caught the following stall:

watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [cc1:22803]
CPU: 0 PID: 22803 Comm: cc1 Not tainted 5.6.17+ #3
Hardware name: 9000/800/rp3440
 IAOQ[0]: d_alloc_parallel+0x384/0x688
 IAOQ[1]: d_alloc_parallel+0x388/0x688
 RP(r2): d_alloc_parallel+0x134/0x688
Backtrace:
 [<000000004036974c>] __lookup_slow+0xa4/0x200
 [<0000000040369fc8>] walk_component+0x288/0x458
 [<000000004036a9a0>] path_lookupat+0x88/0x198
 [<000000004036e748>] filename_lookup+0xa0/0x168
 [<000000004036e95c>] user_path_at_empty+0x64/0x80
 [<000000004035d93c>] vfs_statx+0x104/0x158
 [<000000004035dfcc>] __do_sys_lstat64+0x44/0x80
 [<000000004035e5a0>] sys_lstat64+0x20/0x38
 [<0000000040180054>] syscall_exit+0x0/0x14

The code was stuck in this loop in d_alloc_parallel:

    4037d414:   0e 00 10 dc     ldd 0(r16),ret0
    4037d418:   c7 fc 5f ed     bb,< ret0,1f,4037d414 <d_alloc_parallel+0x384>
    4037d41c:   08 00 02 40     nop

This is the inner loop of bit_spin_lock which is called by hlist_bl_unlock in
d_alloc_parallel:

static inline void bit_spin_lock(int bitnum, unsigned long *addr)
{
        /*
         * Assuming the lock is uncontended, this never enters
         * the body of the outer loop. If it is contended, then
         * within the inner loop a non-atomic test is used to
         * busywait with less bus contention for a good time to
         * attempt to acquire the lock bit.
         */
        preempt_disable();
#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK)
        while (unlikely(test_and_set_bit_lock(bitnum, addr))) {
                preempt_enable();
                do {
                        cpu_relax();
                } while (test_bit(bitnum, addr));
                preempt_disable();
        }
#endif
        __acquire(bitlock);
}

After consideration, I realized that we must be losing bit unlocks.
Then, I noticed that we missed defining atomic64_set_release().
Adding this define fixes the stalls in bit operations.

Signed-off-by: Dave Anglin <dave.anglin@bell.net>
Cc: stable@vger.kernel.org
Signed-off-by: Helge Deller <deller@gmx.de>
2020-07-21 17:16:37 +02:00
Linus Torvalds
8c26c87b05 sound fixes for 5.8-rc7
This PR became fairly large, containing mostly the collection of
 ASoC fixes that slipped from the previous request, so I sent now
 a bit earlier than usual.  But all changes look small and mostly
 device-specific, hence nothing to worry too much.
 
 Majority of changes are for x86 based platforms and their CODEC
 drivers, in order to address some issues hit by their recent tests
 and fuzzing.  The rest are other ASoC device-specific fixes (imx,
 qcom, wm8974, amd, rockchip) as well as a trivial fix for a kernel
 WARNING hit by syzkaller.
 -----BEGIN PGP SIGNATURE-----
 
 iQJCBAABCAAsFiEEIXTw5fNLNI7mMiVaLtJE4w1nLE8FAl8WqvQOHHRpd2FpQHN1
 c2UuZGUACgkQLtJE4w1nLE8KEA/+MZ+3jkFFinrq+mPPPZOJdMe10XxT1moZrjie
 H+cA6623iVhnKMB9JKMSncroSpgw7OaFlYP1tGfsQkD32rglubUhAqAs7Up3ve9O
 tzVLqyaxwNmy6I38n7g2TUlEIvJyRCiC2pR97XrtqiAmsRDheYBsn8lEN2Ie6eUF
 uAftr6DHJ5lHeYMWFBwN1fjbg4vMZGAFEtK4czme05b0n2gHo4AnXGfpnxYwYhfN
 5WRQm12rtjDsWPC9Rk32auZBH9qnHeGALRCYWlRje4XfbaFnSgbID9/NWNodXjR7
 m92Tw5bEV9SQx+0kNd3+ibGp0RLrgfMitp3hlv2as5GHTlQi2nfLWnWmWUzWFflR
 TKbcpwOANwncMx/KrfEkcqEt0cozMRL3MgkSaXvbarv8ZAyzJGYYIvNXLpA+AHLu
 ryj02Cc7wyTO5Axv7fqF9yNM53mfu6TEPkdtTGOjszTkkf2OknZYjiB4ci47elZm
 3db303JZVmW09b3qNcrNJ273LWxIRaGbFOe1KExfHB1lsBnlufNSVY3+spDl35Cu
 u5w96oCF6KO/j2f3iKUdMT88XfRxsheN9GnU7g3iso8Ewdk7szzhfE8Ynlz+Zyca
 74fyNbowjBrU/FZzY4VopvqQqfNG9mYazQEH+EIBSLUIoGXaKSz+Xnrfz7BgZvCu
 TEb1sB4=
 =iAZG
 -----END PGP SIGNATURE-----

Merge tag 'sound-5.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound into master

Pull sound fixes from Takashi Iwai:
 "This became fairly large, containing mostly the collection of ASoC
  fixes that slipped from the previous request, so I sent now a bit
  earlier than usual. But all changes look small and mostly
  device-specific, hence nothing to worry too much.

  Majority of changes are for x86 based platforms and their CODEC
  drivers, in order to address some issues hit by their recent tests and
  fuzzing. The rest are other ASoC device-specific fixes (imx, qcom,
  wm8974, amd, rockchip) as well as a trivial fix for a kernel WARNING
  hit by syzkaller"

* tag 'sound-5.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (28 commits)
  ALSA: hda/realtek: Fixed ALC298 sound bug by adding quirk for Samsung Notebook Pen S
  ALSA: info: Drop WARN_ON() from buffer NULL sanity check
  ASoC: rt5682: Report the button event in the headset type only
  ASoC: Intel: bytcht_es8316: Add missed put_device()
  ASoC: rt5682: Enable Vref2 under using PLL2
  ASoC: rt286: fix unexpected interrupt happens
  ASoC: wm8974: remove unsupported clock mode
  ASoC: wm8974: fix Boost Mixer Aux Switch
  ASoC: SOF: core: fix null-ptr-deref bug during device removal
  ASoc: codecs: max98373: remove Idle_bias_on to let codec suspend
  ASoC: codecs: max98373: Removed superfluous volume control from chip default
  ASoC: topology: fix tlvs in error handling for widget_dmixer
  ASoC: topology: fix kernel oops on route addition error
  ASoC: SOF: imx: add min/max channels for SAI/ESAI on i.MX8/i.MX8M
  ASoC: Intel: bdw-rt5677: fix non BE conversion
  ASoC: soc-dai: set dai_link dpcm_ flags with a helper
  MAINTAINERS: Add Shengjiu to reviewer list of sound/soc/fsl
  ASoC: core: Remove only the registered component in devm functions
  MAINTAINERS: Change Maintainer for some at91 drivers
  ASoC: dt-bindings: simple-card: Fix 'make dt_binding_check' warnings
  ...
2020-07-21 08:06:45 -07:00
Tony Lindgren
6cfcd5563b clocksource/drivers/timer-ti-dm: Fix suspend and resume for am3 and am4
Carlos Hernandez <ceh@ti.com> reported that we now have a suspend and
resume regresssion on am3 and am4 compared to the earlier kernels. While
suspend and resume works with v5.8-rc3, we now get errors with rtcwake:

pm33xx pm33xx: PM: Could not transition all powerdomains to target state
...
rtcwake: write error

This is because we now fail to idle the system timer clocks that the
idle code checks and the error gets propagated to the rtcwake.

Turns out there are several issues that need to be fixed:

1. Ignore no-idle and no-reset configured timers for the ti-sysc
   interconnect target driver as otherwise it will keep the system timer
   clocks enabled

2. Toggle the system timer functional clock for suspend for am3 and am4
   (but not for clocksource on am3)

3. Only reconfigure type1 timers in dmtimer_systimer_disable()

4. Use of_machine_is_compatible() instead of of_device_is_compatible()
   for checking the SoC type

Fixes: 52762fbd1c ("clocksource/drivers/timer-ti-dm: Add clockevent and clocksource support")
Reported-by: Carlos Hernandez <ceh@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Tested-by: Carlos Hernandez <ceh@ti.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20200713162601.6829-1-tony@atomide.com
2020-07-21 15:48:33 +02:00
Forest Crossman
dbb0897e80 usb: xhci: Fix ASM2142/ASM3142 DMA addressing
The ASM2142/ASM3142 (same PCI IDs) does not support full 64-bit DMA
addresses, which can cause silent memory corruption or IOMMU errors on
platforms that use the upper bits. Add the XHCI_NO_64BIT_SUPPORT quirk
to fix this issue.

Signed-off-by: Forest Crossman <cyrozap@gmail.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200717112734.328432-1-cyrozap@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-21 14:05:49 +02:00
Chunfeng Yun
5ce1a24dd9 usb: xhci-mtk: fix the failure of bandwidth allocation
The wMaxPacketSize field of endpoint descriptor may be zero
as default value in alternate interface, and they are not
actually selected when start stream, so skip them when try to
allocate bandwidth.

Cc: stable <stable@vger.kernel.org>
Fixes: 0cbd4b34cd ("xhci: mediatek: support MTK xHCI host controller")
Signed-off-by: Chunfeng Yun <chunfeng.yun@mediatek.com>
Link: https://lore.kernel.org/r/1594360672-2076-1-git-send-email-chunfeng.yun@mediatek.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-21 14:05:49 +02:00
Thomas Richter
3d3af181d3 s390/cpum_cf,perf: change DFLT_CCERROR counter name
Change the counter name DLFT_CCERROR to DLFT_CCFINISH on IBM z15.
This counter counts completed DEFLATE instructions with exit code
0, 1 or 2. Since exit code 0 means success and exit code 1 or 2
indicate errors, change the counter name to avoid confusion.
This counter is incremented each time the DEFLATE instruction
completed regardless if an error was detected or not.

Fixes: d68d5d51dc ("s390/cpum_cf: Add new extended counters for IBM z15")
Fixes: e7950166e4 ("perf vendor events s390: Add new deflate counters for IBM z15")
Cc: stable@vger.kernel.org # v5.7
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-07-21 13:53:56 +02:00
Liam Beguin
b344d6a83d parisc: add support for cmpxchg on u8 pointers
The kernel test bot reported[1] that using set_mask_bits on a u8 causes
the following issue on parisc:

	hppa-linux-ld: drivers/phy/ti/phy-tusb1210.o: in function `tusb1210_probe':
	>> (.text+0x2f4): undefined reference to `__cmpxchg_called_with_bad_pointer'
	>> hppa-linux-ld: (.text+0x324): undefined reference to `__cmpxchg_called_with_bad_pointer'
	hppa-linux-ld: (.text+0x354): undefined reference to `__cmpxchg_called_with_bad_pointer'

Add support for cmpxchg on u8 pointers.

[1] https://lore.kernel.org/patchwork/patch/1272617/#1468946

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Liam Beguin <liambeguin@gmail.com>
Tested-by: Dave Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
2020-07-21 08:11:12 +02:00
Vincent Chen
4cb699d044
riscv: kasan: use local_tlb_flush_all() to avoid uninitialized __sbi_rfence
It fails to boot the v5.8-rc4 kernel with CONFIG_KASAN because kasan_init
and kasan_early_init use uninitialized __sbi_rfence as executing the
tlb_flush_all(). Actually, at this moment, only the CPU which is
responsible for the system initialization enables the MMU. Other CPUs are
parking at the .Lsecondary_start. Hence the tlb_flush_all() is able to be
replaced by local_tlb_flush_all() to avoid using uninitialized
__sbi_rfence.

Signed-off-by: Vincent Chen <vincent.chen@sifive.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2020-07-20 21:14:51 -07:00
Tung Nguyen
6ef9dcb780 tipc: allow to build NACK message in link timeout function
Commit 02288248b0 ("tipc: eliminate gap indicator from ACK messages")
eliminated sending of the 'gap' indicator in regular ACK messages and
only allowed to build NACK message with enabled probe/probe_reply.
However, necessary correction for building NACK message was missed
in tipc_link_timeout() function. This leads to significant delay and
link reset (due to retransmission failure) in lossy environment.

This commit fixes it by setting the 'probe' flag to 'true' when
the receive deferred queue is not empty. As a result, NACK message
will be built to send back to another peer.

Fixes: 02288248b0 ("tipc: eliminate gap indicator from ACK messages")
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 20:11:22 -07:00
Ilya Ponetayev
db415f7aae exfat: fix name_hash computation on big endian systems
On-disk format for name_hash field is LE, so it must be explicitly
transformed on BE system for proper result.

Fixes: 370e812b3e ("exfat: add nls operations")
Cc: stable@vger.kernel.org # v5.7
Signed-off-by: Chen Minqiang <ptpt52@gmail.com>
Signed-off-by: Ilya Ponetayev <i.ponetaev@ndmsystems.com>
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
2020-07-21 10:44:19 +09:00
Hyeongseok Kim
41e3928f8c exfat: fix wrong size update of stream entry by typo
The stream.size field is updated to the value of create timestamp
of the file entry. Fix this to use correct stream entry pointer.

Fixes: 29bbb14bfc ("exfat: fix incorrect update of stream entry in __exfat_truncate()")
Signed-off-by: Hyeongseok Kim <hyeongseok@gmail.com>
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
2020-07-21 10:44:15 +09:00
Namjae Jeon
d2fa0c337d exfat: fix wrong hint_stat initialization in exfat_find_dir_entry()
We found the wrong hint_stat initialization in exfat_find_dir_entry().
It should be initialized when cluster is EXFAT_EOF_CLUSTER.

Fixes: ca06197382 ("exfat: add directory operations")
Cc: stable@vger.kernel.org # v5.7
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
2020-07-21 10:44:10 +09:00
Namjae Jeon
43946b7049 exfat: fix overflow issue in exfat_cluster_to_sector()
An overflow issue can occur while calculating sector in
exfat_cluster_to_sector(). It needs to cast clus's type to sector_t
before left shifting.

Fixes: 1acf1a564b ("exfat: add in-memory and on-disk structures and headers")
Cc: stable@vger.kernel.org # v5.7
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
2020-07-21 10:44:06 +09:00
Bixuan Cui
db44c60c45 net: neterion: vxge: reduce stack usage in VXGE_COMPLETE_VPATH_TX
Fix the warning: [-Werror=-Wframe-larger-than=]

drivers/net/ethernet/neterion/vxge/vxge-main.c:
In function'VXGE_COMPLETE_VPATH_TX.isra.37':
drivers/net/ethernet/neterion/vxge/vxge-main.c:119:1:
warning: the frame size of 1056 bytes is larger than 1024 bytes

Dropping the NR_SKB_COMPLETED to 16 is appropriate that won't
have much impact on performance and functionality.

Signed-off-by: Bixuan Cui <cuibixuan@huawei.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 18:38:27 -07:00
Ming Lei
3f0dcfbcd2 scsi: core: Run queue in case of I/O resource contention failure
I/O requests may be held in scheduler queue because of resource contention.
The starvation scenario was handled properly in the regular completion
path but we failed to account for it during I/O submission. This lead to
the hang captured below. Make sure we run the queue when resource
contention is encountered in the submission path.

[   39.054963] scsi 13:0:0:0: rejecting I/O to dead device
[   39.058700] scsi 13:0:0:0: rejecting I/O to dead device
[   39.087855] sd 13:0:0:1: [sdd] Synchronizing SCSI cache
[   39.088909] scsi 13:0:0:1: rejecting I/O to dead device
[   39.095351] scsi 13:0:0:1: rejecting I/O to dead device
[   39.096962] scsi 13:0:0:1: rejecting I/O to dead device
[  247.021859] INFO: task scsi-stress-rem:813 blocked for more than 122 seconds.
[  247.023258]       Not tainted 5.8.0-rc2 #8
[  247.024069] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  247.025331] scsi-stress-rem D    0   813    802 0x00004000
[  247.025334] Call Trace:
[  247.025354]  __schedule+0x504/0x55f
[  247.027987]  schedule+0x72/0xa8
[  247.027991]  blk_mq_freeze_queue_wait+0x63/0x8c
[  247.027994]  ? do_wait_intr_irq+0x7a/0x7a
[  247.027996]  blk_cleanup_queue+0x4b/0xc9
[  247.028000]  __scsi_remove_device+0xf6/0x14e
[  247.028002]  scsi_remove_device+0x21/0x2b
[  247.029037]  sdev_store_delete+0x58/0x7c
[  247.029041]  kernfs_fop_write+0x10d/0x14f
[  247.031281]  vfs_write+0xa2/0xdf
[  247.032670]  ksys_write+0x6b/0xb3
[  247.032673]  do_syscall_64+0x56/0x82
[  247.034053]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  247.034059] RIP: 0033:0x7f69f39e9008
[  247.036330] Code: Bad RIP value.
[  247.036331] RSP: 002b:00007ffdd8116498 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[  247.037613] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f69f39e9008
[  247.039714] RDX: 0000000000000002 RSI: 000055cde92a0ab0 RDI: 0000000000000001
[  247.039715] RBP: 000055cde92a0ab0 R08: 000000000000000a R09: 00007f69f3a79e80
[  247.039716] R10: 000000000000000a R11: 0000000000000246 R12: 00007f69f3abb780
[  247.039717] R13: 0000000000000002 R14: 00007f69f3ab6740 R15: 0000000000000002

Link: https://lore.kernel.org/r/20200720025435.812030-1-ming.lei@redhat.com
Cc: linux-block@vger.kernel.org
Cc: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-07-20 21:38:20 -04:00
Huang Guobin
befc113c56 net: ag71xx: add missed clk_disable_unprepare in error path of probe
The ag71xx_mdio_probe() forgets to call clk_disable_unprepare() when
of_reset_control_get_exclusive() failed. Add the missed call to fix it.

Fixes: d51b6ce441 ("net: ethernet: add ag71xx driver")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Huang Guobin <huangguobin4@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 18:37:38 -07:00
wenxu
ae372cb175 net/sched: act_ct: fix restore the qdisc_skb_cb after defrag
The fragment packets do defrag in tcf_ct_handle_fragments
will clear the skb->cb which make the qdisc_skb_cb clear
too. So the qdsic_skb_cb should be store before defrag and
restore after that.
It also update the pkt_len after all the
fragments finish the defrag to one packet and make the
following actions counter correct.

Fixes: b57dc7c13e ("net/sched: Introduce action ct")
Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 18:36:37 -07:00
Navid Emamdoost
1e8fd3a97f nfc: s3fwrn5: add missing release on skb in s3fwrn5_recv_frame
The implementation of s3fwrn5_recv_frame() is supposed to consume skb on
all execution paths. Release skb before returning -ENODEV.

Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 18:31:33 -07:00
Vinay Kumar Yadav
30d9e5057a crypto/chtls: correct net_device reference count
ip_dev_find() call holds net_device reference which is not needed,
use __ip_dev_find() which does not hold reference.

v1->v2:
- Correct submission tree.
- Add fixes tag.

Fixes: cc35c88ae4 ("crypto : chtls - CPL handler definition")
Signed-off-by: Vinay Kumar Yadav <vinay.yadav@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 18:28:04 -07:00
Vinay Kumar Yadav
c271042eb6 crypto/chtls: fix tls alert messages corrupted by tls data
When tls data skb is pending for Tx and tls alert comes , It
is wrongly overwrite the record type of tls data to tls alert
record type. fix the issue correcting it.

v1->v2:
- Correct submission tree.
- Add fixes tag.

Fixes: 6919a8264a ("Crypto/chtls: add/delete TLS header in driver")
Signed-off-by: Vinay Kumar Yadav <vinay.yadav@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 18:27:40 -07:00
David S. Miller
4932893924 Merge branch 'ionic-locking-and-filter-fixes'
Shannon Nelson says:

====================
ionic: locking and filter fixes

These patches address an ethtool show regs problem, some locking sightings,
and issues with RSS hash and filter_id tracking after a managed FW update.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 18:09:38 -07:00
Shannon Nelson
0925e9db4d ionic: use mutex to protect queue operations
The ionic_wait_on_bit_lock() was a open-coded mutex knock-off
used only for protecting the queue reset operations, and there
was no reason not to use the real thing.  We can use the lock
more correctly and to better protect the queue stop and start
operations from cross threading.  We can also remove a useless
and expensive bit operation from the Rx path.

This fixes a case found where the link_status_check from a link
flap could run into an MTU change and cause a crash.

Fixes: beead698b1 ("ionic: Add the basic NDO callbacks for netdev support")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 18:09:38 -07:00
Shannon Nelson
bdff46665e ionic: keep rss hash after fw update
Make sure the RSS hash key is kept across a fw update by not
de-initing it when an update is happening.

Fixes: c672412f61 ("ionic: remove lifs on fw reset")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 18:09:38 -07:00
Shannon Nelson
cc4428c4de ionic: update filter id after replay
When we replay the rx filters after a fw-upgrade we get new
filter_id values from the FW, which we need to save and update
in our local filter list.  This allows us to delete the filters
with the correct filter_id when we're done.

Fixes: 7e4d47596b ("ionic: replay filters after fw upgrade")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 18:09:38 -07:00
Shannon Nelson
cbec2153a9 ionic: fix up filter locks and debug msgs
Add in a couple of forgotten spinlocks and fix up some of
the debug messages around filter management.

Fixes: c1e329ebec ("ionic: Add management of rx filters")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 18:09:38 -07:00
Shannon Nelson
f85ae16f92 ionic: use offset for ethtool regs data
Use an offset to write the second half of the regs data into the
second half of the buffer instead of overwriting the first half.

Fixes: 4d03e00a21 ("ionic: Add initial ethtool support")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 18:09:38 -07:00
Murali Karicheri
5d93518e06 net: hsr: check for return value of skb_put_padto()
skb_put_padto() can fail. So check for return type and return NULL
for skb. Caller checks for skb and acts correctly if it is NULL.

Fixes: 6d6148bc78 ("net: hsr: fix incorrect lsdu size in the tag of HSR frames for small frames")

Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 18:02:28 -07:00
Guillaume Nault
34b85adad8 Documentation: bareudp: update iproute2 sample commands
bareudp.rst was written before iproute2 gained support for this new
type of tunnel. Therefore, the sample command lines didn't match the
final iproute2 implementation.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 17:57:20 -07:00
Liu Jian
5dbaeb87f2 mlxsw: destroy workqueue when trap_register in mlxsw_emad_init
When mlxsw_core_trap_register fails in mlxsw_emad_init,
destroy_workqueue() shouled be called to destroy mlxsw_core->emad_wq.

Fixes: d965465b60 ("mlxsw: core: Fix possible deadlock")
Signed-off-by: Liu Jian <liujian56@huawei.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 17:55:23 -07:00
Liu Jian
6790711f8a dpaa_eth: Fix one possible memleak in dpaa_eth_probe
When dma_coerce_mask_and_coherent() fails, the alloced netdev need to be freed.

Fixes: 060ad66f97 ("dpaa_eth: change DMA device")
Signed-off-by: Liu Jian <liujian56@huawei.com>
Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 17:54:08 -07:00
David S. Miller
d49e2c9e54 Merge branch 'smc-fixes'
Karsten Graul says:

====================
net/smc: fixes 2020-07-20

Please apply the following patch series for smc to netdev's net tree.

Patch 1 fixes a problem with a buffer that is not put back when the
connection was killed in the meantime.
Patch 2 fixes a wrong behaviour when the maximum dmb buffer count
exceeded.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 17:52:25 -07:00
Karsten Graul
a9e4450295 net/smc: fix dmb buffer shortage
There is a current limit of 1920 registered dmb buffers per ISM device
for smc-d. One link group can contain 255 connections, each connection
is using one dmb buffer. When the connection is closed then the
registered buffer is held in a queue and is reused by the next
connection. When a link group is 'full' then another link group is
created and uses an own buffer pool. The link groups are added to a
list using list_add() which puts a new link group to the first position
in the list.
In the situation that many connections are opened (>1920) and a few of
them stay open while others are closed quickly we end up with at least 8
link groups. For a new connection a matching link group is looked up,
iterating over the list of link groups. The trailing 7 link groups
all have registered dmb buffers which could be reused, while the first
link group has only a few dmb buffers and then hit the 1920 limit.
Because the first link group is not full (255 connection limit not
reached) it is chosen and finally the connection falls back to TCP
because there is no dmb buffer available in this link group.
There are multiple ways to fix that: using list_add_tail() allows
to scan older link groups first for free buffers which ensures that
buffers are reused first. This fixes the problem for smc-r link groups
as well. For smc-d there is an even better way to address this problem
because smc-d does not have the 255 connections per link group limit.
So fix the problem for smc-d by allowing large link groups.

Fixes: c6ba7c9ba4 ("net/smc: add base infrastructure for SMC-D and ISM")
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 17:52:25 -07:00
Karsten Graul
2bced6aefa net/smc: put slot when connection is killed
To get a send slot smc_wr_tx_get_free_slot() is called, which might
wait for a free slot. When smc_wr_tx_get_free_slot() returns there is a
check if the connection was killed in the meantime. In that case don't
only return an error, but also put back the free slot.

Fixes: b290098092 ("net/smc: cancel send and receive for terminated socket")
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 17:52:25 -07:00
David Howells
639f181f0e rxrpc: Fix sendmsg() returning EPIPE due to recvmsg() returning ENODATA
rxrpc_sendmsg() returns EPIPE if there's an outstanding error, such as if
rxrpc_recvmsg() indicating ENODATA if there's nothing for it to read.

Change rxrpc_recvmsg() to return EAGAIN instead if there's nothing to read
as this particular error doesn't get stored in ->sk_err by the networking
core.

Also change rxrpc_sendmsg() so that it doesn't fail with delayed receive
errors (there's no way for it to report which call, if any, the error was
caused by).

Fixes: 17926a7932 ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 17:47:10 -07:00
David S. Miller
7d85396c4c Merge tag 'ieee802154-for-davem-2020-07-20' of git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan
Stefan Schmidt says:

====================
pull-request: ieee802154 for net 2020-07-20

An update from ieee802154 for your *net* tree.

A potential memory leak fix for adf7242 from Liu Jian,
and one more HTTPS link change from Alexander A. Klimov.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 17:04:35 -07:00
Zhang Changzhong
53a92889ec net: bcmgenet: add missed clk_disable_unprepare in bcmgenet_probe
The driver forgets to call clk_disable_unprepare() in error path after
a success calling for clk_prepare_enable().

Fix to goto err_clk_disable if clk_prepare_enable() is successful.

Fixes: c80d36ff63 ("net: bcmgenet: Use devm_clk_get_optional() to get the clocks")
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Acked-by: Doug Berger <opendmb@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 17:03:45 -07:00
Zhang Changzhong
24a63fe6d4 net: bcmgenet: fix error returns in bcmgenet_probe()
The driver forgets to call clk_disable_unprepare() in error path after
a success calling for clk_prepare_enable().

Fix to goto err_clk_disable if clk_prepare_enable() is successful.

Fixes: 99d55638d4 ("net: bcmgenet: enable NETIF_F_HIGHDMA flag")
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Acked-by: Doug Berger <opendmb@gmail.com>
Acked-by: Florian fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 16:57:53 -07:00
Linus Torvalds
4fa640dc52 VFIO fixes for v5.8-rc7
- Fix race with eventfd ctx cleared outside of mutex (Zeng Tao)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.14 (GNU/Linux)
 
 iQIcBAABAgAGBQJfFapKAAoJECObm247sIsiow0P/iAD64mdb/oRaaIPms+wtSGi
 9d8EDcEJaMRW1VKwLF/ocB58QbsMzvYOqxmAJnVjvv7JmGU4+TFxVojwE0UzHQjB
 aPp6rrKZzM9KWSsTDPiQT5kA6IHrpZwpFlxeLJvh/j04RizvYFjBfw7EUSLoNdOl
 LqHFTxQIh1tdLZPsmcT0Fhfo9m6/ToxxUdpP3AVR/tVXm3OOwHz5yZqYH+1k5nDk
 vIW7WTWk5iCwx8fubbNwwAlF2x51QRHGExqbHPGcKCf+0Vadk5JxjNpewIulMPqW
 f+T9Y7Gl4Ljapw6T1AlX6l1e9Js0b5I5m1afP83Kuvp8nj7pgSkuUpjlEJivsj1G
 E7bRHUjkFEBpKhB8txN269E8nOOB3grE0iwMBKgGNh8UjOqqAg1mKIQSDlIiBVjQ
 OJ9QXHNr1HMKD27QpH36YgkHXXQpjipLQBxUpvBgcp+xYS17T+kfqYJ2a0oAHD0d
 8uXOfoJE7CWSMOxnuqm6S6NwFOM0IDtRj6TSRA8qojBGLamqFFUOdavN3+50uPA7
 TfKz0RnizsRFxSzDY+31bmDGraqgtbJ6ct2NGRhGE7n86cCdhR+PjSiy66LDwrPm
 lSCA104gxu/m3Rg3fhZrt6To7CnPdpuhP7B1dTheHPQmvxbbM55H0X27QwqRPX1t
 NpsULU/13uNZC70siuDV
 =+prG
 -----END PGP SIGNATURE-----

Merge tag 'vfio-v5.8-rc7' of git://github.com/awilliam/linux-vfio into master

Pull VFIO fix from Alex Williamson:
 "Fix race with eventfd ctx cleared outside of mutex (Zeng Tao)"

* tag 'vfio-v5.8-rc7' of git://github.com/awilliam/linux-vfio:
  vfio/pci: fix racy on error and request eventfd ctx
2020-07-20 13:30:59 -07:00