6477 Commits

Author SHA1 Message Date
Coly Li
383ff2183a bcache: check CACHE_SET_IO_DISABLE bit in bch_journal()
When too many I/O errors happen on cache set and CACHE_SET_IO_DISABLE
bit is set, bch_journal() may continue to work because the journaling
bkey might be still in write set yet. The caller of bch_journal() may
believe the journal still work but the truth is in-memory journal write
set won't be written into cache device any more. This behavior may
introduce potential inconsistent metadata status.

This patch checks CACHE_SET_IO_DISABLE bit at the head of bch_journal(),
if the bit is set, bch_journal() returns NULL immediately to notice
caller to know journal does not work.

Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-28 07:39:15 -06:00
Coly Li
e775339e1a bcache: check CACHE_SET_IO_DISABLE in allocator code
If CACHE_SET_IO_DISABLE of a cache set flag is set by too many I/O
errors, currently allocator routines can still continue allocate
space which may introduce inconsistent metadata state.

This patch checkes CACHE_SET_IO_DISABLE bit in following allocator
routines,
- bch_bucket_alloc()
- __bch_bucket_alloc_set()
Once CACHE_SET_IO_DISABLE is set on cache set, the allocator routines
may reject allocation request earlier to avoid potential inconsistent
metadata.

Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-28 07:39:15 -06:00
Coly Li
bd9026c8a7 bcache: remove unncessary code in bch_btree_keys_init()
Function bch_btree_keys_init() initializes b->set[].size and
b->set[].data to zero. As the code comments indicates, these code indeed
is unncessary, because both struct btree_keys and struct bset_tree are
nested embedded into struct btree, when struct btree is filled with 0
bits by kzalloc() in mca_bucket_alloc(), b->set[].size and
b->set[].data are initialized to 0 (a.k.a NULL) already.

This patch removes the redundant code, and add comments in
bch_btree_keys_init() and mca_bucket_alloc() to explain why it's safe.

Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-28 07:39:15 -06:00
Coly Li
0b13efecf5 bcache: add return value check to bch_cached_dev_run()
This patch adds return value check to bch_cached_dev_run(), now if there
is error happens inside bch_cached_dev_run(), it can be catched.

Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-28 07:39:14 -06:00
Alexandru Ardelean
89e0341af0 bcache: use sysfs_match_string() instead of __sysfs_match_string()
The arrays (of strings) that are passed to __sysfs_match_string() are
static, so use sysfs_match_string() which does an implicit ARRAY_SIZE()
over these arrays.

Functionally, this doesn't change anything.
The change is more cosmetic.

It only shrinks the static arrays by 1 byte each.

Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-28 07:39:14 -06:00
Coly Li
f960facb39 bcache: remove unnecessary prefetch() in bset_search_tree()
In function bset_search_tree(), when p >= t->size, t->tree[0] will be
prefetched by the following code piece,
 974                 unsigned int p = n << 4;
 975
 976                 p &= ((int) (p - t->size)) >> 31;
 977
 978                 prefetch(&t->tree[p]);

The purpose of the above code is to avoid a branch instruction, but
when p >= t->size, prefetch(&t->tree[0]) has no positive performance
contribution at all. This patch avoids the unncessary prefetch by only
calling prefetch() when p < t->size.

Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-28 07:39:14 -06:00
Coly Li
08ec1e6282 bcache: add io error counting in write_bdev_super_endio()
When backing device super block is written by bch_write_bdev_super(),
the bio complete callback write_bdev_super_endio() simply ignores I/O
status. Indeed such write request also contribute to backing device
health status if the request failed.

This patch checkes bio->bi_status in write_bdev_super_endio(), if there
is error, bch_count_backing_io_errors() will be called to count an I/O
error to dc->io_errors.

Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-28 07:39:14 -06:00
Coly Li
578df99b1b bcache: ignore read-ahead request failure on backing device
When md raid device (e.g. raid456) is used as backing device, read-ahead
requests on a degrading and recovering md raid device might be failured
immediately by md raid code, but indeed this md raid array can still be
read or write for normal I/O requests. Therefore such failed read-ahead
request are not real hardware failure. Further more, after degrading and
recovering accomplished, read-ahead requests will be handled by md raid
array again.

For such condition, I/O failures of read-ahead requests don't indicate
real health status (because normal I/O still be served), they should not
be counted into I/O error counter dc->io_errors.

Since there is no simple way to detect whether the backing divice is a
md raid device, this patch simply ignores I/O failures for read-ahead
bios on backing device, to avoid bogus backing device failure on a
degrading md raid array.

Suggested-and-tested-by: Thorsten Knabe <linux@thorsten-knabe.de>
Signed-off-by: Coly Li <colyli@suse.de>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-28 07:39:14 -06:00
Coly Li
e6dcbd3e6c bcache: avoid flushing btree node in cache_set_flush() if io disabled
When cache_set_flush() is called for too many I/O errors detected on
cache device and the cache set is retiring, inside the function it
doesn't make sense to flushing cached btree nodes from c->btree_cache
because CACHE_SET_IO_DISABLE is set on c->flags already and all I/Os
onto cache device will be rejected.

This patch checks in cache_set_flush() that whether CACHE_SET_IO_DISABLE
is set. If yes, then avoids to flush the cached btree nodes to reduce
more time and make cache set retiring more faster.

Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-28 07:39:14 -06:00
Coly Li
695277f16b Revert "bcache: set CACHE_SET_IO_DISABLE in bch_cached_dev_error()"
This reverts commit 6147305c73e4511ca1a975b766b97a779d442567.

Although this patch helps the failed bcache device to stop faster when
too many I/O errors detected on corresponding cached device, setting
CACHE_SET_IO_DISABLE bit to cache set c->flags was not a good idea. This
operation will disable all I/Os on cache set, which means other attached
bcache devices won't work neither.

Without this patch, the failed bcache device can also be stopped
eventually if internal I/O accomplished (e.g. writeback). Therefore here
I revert it.

Fixes: 6147305c73e4 ("bcache: set CACHE_SET_IO_DISABLE in bch_cached_dev_error()")
Reported-by: Yong Li <mr.liyong@qq.com>
Signed-off-by: Coly Li <colyli@suse.de>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-28 07:39:14 -06:00
Coly Li
0ae49cb7aa bcache: fix return value error in bch_journal_read()
When everything is OK in bch_journal_read(), finally the return value
is returned by,
	return ret;
which assumes ret will be 0 here. This assumption is wrong when all
journal buckets as are full and filled with valid journal entries. In
such cache the last location referencess read_bucket() sets 'ret' to
1, which means new jset added into jset list. The jset list is list
'journal' in caller run_cache_set().

Return 1 to run_cache_set() means something wrong and the cache set
won't start, but indeed everything is OK.

This patch changes the line at end of bch_journal_read() to directly
return 0 since everything if verything is good. Then a bogus error
is fixed.

Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-28 07:39:14 -06:00
Coly Li
b387e9b586 bcache: check c->gc_thread by IS_ERR_OR_NULL in cache_set_flush()
When system memory is in heavy pressure, bch_gc_thread_start() from
run_cache_set() may fail due to out of memory. In such condition,
c->gc_thread is assigned to -ENOMEM, not NULL pointer. Then in following
failure code path bch_cache_set_error(), when cache_set_flush() gets
called, the code piece to stop c->gc_thread is broken,
         if (!IS_ERR_OR_NULL(c->gc_thread))
                 kthread_stop(c->gc_thread);

And KASAN catches such NULL pointer deference problem, with the warning
information:

[  561.207881] ==================================================================
[  561.207900] BUG: KASAN: null-ptr-deref in kthread_stop+0x3b/0x440
[  561.207904] Write of size 4 at addr 000000000000001c by task kworker/15:1/313

[  561.207913] CPU: 15 PID: 313 Comm: kworker/15:1 Tainted: G        W         5.0.0-vanilla+ #3
[  561.207916] Hardware name: Lenovo ThinkSystem SR650 -[7X05CTO1WW]-/-[7X05CTO1WW]-, BIOS -[IVE136T-2.10]- 03/22/2019
[  561.207935] Workqueue: events cache_set_flush [bcache]
[  561.207940] Call Trace:
[  561.207948]  dump_stack+0x9a/0xeb
[  561.207955]  ? kthread_stop+0x3b/0x440
[  561.207960]  ? kthread_stop+0x3b/0x440
[  561.207965]  kasan_report+0x176/0x192
[  561.207973]  ? kthread_stop+0x3b/0x440
[  561.207981]  kthread_stop+0x3b/0x440
[  561.207995]  cache_set_flush+0xd4/0x6d0 [bcache]
[  561.208008]  process_one_work+0x856/0x1620
[  561.208015]  ? find_held_lock+0x39/0x1d0
[  561.208028]  ? drain_workqueue+0x380/0x380
[  561.208048]  worker_thread+0x87/0xb80
[  561.208058]  ? __kthread_parkme+0xb6/0x180
[  561.208067]  ? process_one_work+0x1620/0x1620
[  561.208072]  kthread+0x326/0x3e0
[  561.208079]  ? kthread_create_worker_on_cpu+0xc0/0xc0
[  561.208090]  ret_from_fork+0x3a/0x50
[  561.208110] ==================================================================
[  561.208113] Disabling lock debugging due to kernel taint
[  561.208115] irq event stamp: 11800231
[  561.208126] hardirqs last  enabled at (11800231): [<ffffffff83008538>] do_syscall_64+0x18/0x410
[  561.208127] BUG: unable to handle kernel NULL pointer dereference at 000000000000001c
[  561.208129] #PF error: [WRITE]
[  561.312253] hardirqs last disabled at (11800230): [<ffffffff830052ff>] trace_hardirqs_off_thunk+0x1a/0x1c
[  561.312259] softirqs last  enabled at (11799832): [<ffffffff850005c7>] __do_softirq+0x5c7/0x8c3
[  561.405975] PGD 0 P4D 0
[  561.442494] softirqs last disabled at (11799821): [<ffffffff831add2c>] irq_exit+0x1ac/0x1e0
[  561.791359] Oops: 0002 [#1] SMP KASAN NOPTI
[  561.791362] CPU: 15 PID: 313 Comm: kworker/15:1 Tainted: G    B   W         5.0.0-vanilla+ #3
[  561.791363] Hardware name: Lenovo ThinkSystem SR650 -[7X05CTO1WW]-/-[7X05CTO1WW]-, BIOS -[IVE136T-2.10]- 03/22/2019
[  561.791371] Workqueue: events cache_set_flush [bcache]
[  561.791374] RIP: 0010:kthread_stop+0x3b/0x440
[  561.791376] Code: 00 00 65 8b 05 26 d5 e0 7c 89 c0 48 0f a3 05 ec aa df 02 0f 82 dc 02 00 00 4c 8d 63 20 be 04 00 00 00 4c 89 e7 e8 65 c5 53 00 <f0> ff 43 20 48 8d 7b 24 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48
[  561.791377] RSP: 0018:ffff88872fc8fd10 EFLAGS: 00010286
[  561.838895] bcache: bch_count_io_errors() nvme0n1: IO error on writing btree.
[  561.838916] bcache: bch_count_io_errors() nvme0n1: IO error on writing btree.
[  561.838934] bcache: bch_count_io_errors() nvme0n1: IO error on writing btree.
[  561.838948] bcache: bch_count_io_errors() nvme0n1: IO error on writing btree.
[  561.838966] bcache: bch_count_io_errors() nvme0n1: IO error on writing btree.
[  561.838979] bcache: bch_count_io_errors() nvme0n1: IO error on writing btree.
[  561.838996] bcache: bch_count_io_errors() nvme0n1: IO error on writing btree.
[  563.067028] RAX: 0000000000000000 RBX: fffffffffffffffc RCX: ffffffff832dd314
[  563.067030] RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000297
[  563.067032] RBP: ffff88872fc8fe88 R08: fffffbfff0b8213d R09: fffffbfff0b8213d
[  563.067034] R10: 0000000000000001 R11: fffffbfff0b8213c R12: 000000000000001c
[  563.408618] R13: ffff88dc61cc0f68 R14: ffff888102b94900 R15: ffff88dc61cc0f68
[  563.408620] FS:  0000000000000000(0000) GS:ffff888f7dc00000(0000) knlGS:0000000000000000
[  563.408622] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  563.408623] CR2: 000000000000001c CR3: 0000000f48a1a004 CR4: 00000000007606e0
[  563.408625] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  563.408627] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  563.904795] bcache: bch_count_io_errors() nvme0n1: IO error on writing btree.
[  563.915796] PKRU: 55555554
[  563.915797] Call Trace:
[  563.915807]  cache_set_flush+0xd4/0x6d0 [bcache]
[  563.915812]  process_one_work+0x856/0x1620
[  564.001226] bcache: bch_count_io_errors() nvme0n1: IO error on writing btree.
[  564.033563]  ? find_held_lock+0x39/0x1d0
[  564.033567]  ? drain_workqueue+0x380/0x380
[  564.033574]  worker_thread+0x87/0xb80
[  564.062823] bcache: bch_count_io_errors() nvme0n1: IO error on writing btree.
[  564.118042]  ? __kthread_parkme+0xb6/0x180
[  564.118046]  ? process_one_work+0x1620/0x1620
[  564.118048]  kthread+0x326/0x3e0
[  564.118050]  ? kthread_create_worker_on_cpu+0xc0/0xc0
[  564.167066] bcache: bch_count_io_errors() nvme0n1: IO error on writing btree.
[  564.252441]  ret_from_fork+0x3a/0x50
[  564.252447] Modules linked in: msr rpcrdma sunrpc rdma_ucm ib_iser ib_umad rdma_cm ib_ipoib i40iw configfs iw_cm ib_cm libiscsi scsi_transport_iscsi mlx4_ib ib_uverbs mlx4_en ib_core nls_iso8859_1 nls_cp437 vfat fat intel_rapl skx_edac x86_pkg_temp_thermal coretemp iTCO_wdt iTCO_vendor_support crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel ses raid0 aesni_intel cdc_ether enclosure usbnet ipmi_ssif joydev aes_x86_64 i40e scsi_transport_sas mii bcache md_mod crypto_simd mei_me ioatdma crc64 ptp cryptd pcspkr i2c_i801 mlx4_core glue_helper pps_core mei lpc_ich dca wmi ipmi_si ipmi_devintf nd_pmem dax_pmem nd_btt ipmi_msghandler device_dax pcc_cpufreq button hid_generic usbhid mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect xhci_pci sysimgblt fb_sys_fops xhci_hcd ttm megaraid_sas drm usbcore nfit libnvdimm sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua efivarfs
[  564.299390] bcache: bch_count_io_errors() nvme0n1: IO error on writing btree.
[  564.348360] CR2: 000000000000001c
[  564.348362] ---[ end trace b7f0e5cc7b2103b0 ]---

Therefore, it is not enough to only check whether c->gc_thread is NULL,
we should use IS_ERR_OR_NULL() to check both NULL pointer and error
value.

This patch changes the above buggy code piece in this way,
         if (!IS_ERR_OR_NULL(c->gc_thread))
                 kthread_stop(c->gc_thread);

Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-28 07:39:13 -06:00
Coly Li
141df8bb5d bcache: don't set max writeback rate if gc is running
When gc is running, user space I/O processes may wait inside
bcache code, so no new I/O coming. Indeed this is not a real idle
time, maximum writeback rate should not be set in such situation.
Otherwise a faster writeback thread may compete locks with gc thread
and makes garbage collection slower, which results a longer I/O
freeze period.

This patch checks c->gc_mark_valid in set_at_max_writeback_rate(). If
c->gc_mark_valid is 0 (gc running), set_at_max_writeback_rate() returns
false, then update_writeback_rate() will not set writeback rate to
maximum value even c->idle_counter reaches an idle threshold.

Now writeback thread won't interfere gc thread performance.

Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-28 07:39:13 -06:00
Linus Torvalds
65ee21eb63 - Fix incorrect uses of kstrndup and DM logging macros in DM's early
init code.
 
 - Fix DM log-writes target's handling of super block sectors so updates
   are made in order through use of completion.
 
 - Fix DM core's argument splitting code to avoid undefined behaviour
   reported as a side-effect of UBSAN analysis on ppc64le.
 
 - Fix DM verity target to limit the amount of error messages that can
   result from a corrupt block being found.
 -----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCAAxFiEEJfWUX4UqZ4x1O2wixSPxCi2dA1oFAl0VBp4THHNuaXR6ZXJA
 cmVkaGF0LmNvbQAKCRDFI/EKLZ0DWlLKB/9wnB1FWhVIwt5j0e3YpqToThptT22N
 sWO/vBKtIR8nbbohXeam7mgh15+D61SpY+8jdkgAlIQonPzCwFrgSWdtpbbveYw6
 gFWAdd013R+piGX40l0EztwAQ9IawwFl7JP+fBevxFQdGAVT6SpaJfBWH0XcIBbb
 udEtoeW0dVp06SFsokGq6TPSRCdxwMh9JpRplgtLmz0yhfsWeFivjw2AMIen5QqB
 Z0zcJnL8UMRdJq0EEqjQ3z9CH/Th8Nryxibo8YLnADPBfv7RyR2+wCjrbode8LAj
 LZHjHikxzt39F9PpJIPvVuvNMzrBdQPDF61O5MK7dOMl49GfIoz2ybvH
 =oEWd
 -----END PGP SIGNATURE-----

Merge tag 'for-5.2/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper fixes from Mike Snitzer:

 - Fix incorrect uses of kstrndup and DM logging macros in DM's early
   init code.

 - Fix DM log-writes target's handling of super block sectors so updates
   are made in order through use of completion.

 - Fix DM core's argument splitting code to avoid undefined behaviour
   reported as a side-effect of UBSAN analysis on ppc64le.

 - Fix DM verity target to limit the amount of error messages that can
   result from a corrupt block being found.

* tag 'for-5.2/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm verity: use message limit for data block corruption message
  dm table: don't copy from a NULL pointer in realloc_argv()
  dm log writes: make sure super sector log updates are written in order
  dm init: remove trailing newline from calls to DMERR() and DMINFO()
  dm init: fix incorrect uses of kstrndup()
2019-06-28 08:48:21 +08:00
David Howells
2e12256b9a keys: Replace uid/gid/perm permissions checking with an ACL
Replace the uid/gid/perm permissions checking on a key with an ACL to allow
the SETATTR and SEARCH permissions to be split.  This will also allow a
greater range of subjects to represented.

============
WHY DO THIS?
============

The problem is that SETATTR and SEARCH cover a slew of actions, not all of
which should be grouped together.

For SETATTR, this includes actions that are about controlling access to a
key:

 (1) Changing a key's ownership.

 (2) Changing a key's security information.

 (3) Setting a keyring's restriction.

And actions that are about managing a key's lifetime:

 (4) Setting an expiry time.

 (5) Revoking a key.

and (proposed) managing a key as part of a cache:

 (6) Invalidating a key.

Managing a key's lifetime doesn't really have anything to do with
controlling access to that key.

Expiry time is awkward since it's more about the lifetime of the content
and so, in some ways goes better with WRITE permission.  It can, however,
be set unconditionally by a process with an appropriate authorisation token
for instantiating a key, and can also be set by the key type driver when a
key is instantiated, so lumping it with the access-controlling actions is
probably okay.

As for SEARCH permission, that currently covers:

 (1) Finding keys in a keyring tree during a search.

 (2) Permitting keyrings to be joined.

 (3) Invalidation.

But these don't really belong together either, since these actions really
need to be controlled separately.

Finally, there are number of special cases to do with granting the
administrator special rights to invalidate or clear keys that I would like
to handle with the ACL rather than key flags and special checks.


===============
WHAT IS CHANGED
===============

The SETATTR permission is split to create two new permissions:

 (1) SET_SECURITY - which allows the key's owner, group and ACL to be
     changed and a restriction to be placed on a keyring.

 (2) REVOKE - which allows a key to be revoked.

The SEARCH permission is split to create:

 (1) SEARCH - which allows a keyring to be search and a key to be found.

 (2) JOIN - which allows a keyring to be joined as a session keyring.

 (3) INVAL - which allows a key to be invalidated.

The WRITE permission is also split to create:

 (1) WRITE - which allows a key's content to be altered and links to be
     added, removed and replaced in a keyring.

 (2) CLEAR - which allows a keyring to be cleared completely.  This is
     split out to make it possible to give just this to an administrator.

 (3) REVOKE - see above.


Keys acquire ACLs which consist of a series of ACEs, and all that apply are
unioned together.  An ACE specifies a subject, such as:

 (*) Possessor - permitted to anyone who 'possesses' a key
 (*) Owner - permitted to the key owner
 (*) Group - permitted to the key group
 (*) Everyone - permitted to everyone

Note that 'Other' has been replaced with 'Everyone' on the assumption that
you wouldn't grant a permit to 'Other' that you wouldn't also grant to
everyone else.

Further subjects may be made available by later patches.

The ACE also specifies a permissions mask.  The set of permissions is now:

	VIEW		Can view the key metadata
	READ		Can read the key content
	WRITE		Can update/modify the key content
	SEARCH		Can find the key by searching/requesting
	LINK		Can make a link to the key
	SET_SECURITY	Can change owner, ACL, expiry
	INVAL		Can invalidate
	REVOKE		Can revoke
	JOIN		Can join this keyring
	CLEAR		Can clear this keyring


The KEYCTL_SETPERM function is then deprecated.

The KEYCTL_SET_TIMEOUT function then is permitted if SET_SECURITY is set,
or if the caller has a valid instantiation auth token.

The KEYCTL_INVALIDATE function then requires INVAL.

The KEYCTL_REVOKE function then requires REVOKE.

The KEYCTL_JOIN_SESSION_KEYRING function then requires JOIN to join an
existing keyring.

The JOIN permission is enabled by default for session keyrings and manually
created keyrings only.


======================
BACKWARD COMPATIBILITY
======================

To maintain backward compatibility, KEYCTL_SETPERM will translate the
permissions mask it is given into a new ACL for a key - unless
KEYCTL_SET_ACL has been called on that key, in which case an error will be
returned.

It will convert possessor, owner, group and other permissions into separate
ACEs, if each portion of the mask is non-zero.

SETATTR permission turns on all of INVAL, REVOKE and SET_SECURITY.  WRITE
permission turns on WRITE, REVOKE and, if a keyring, CLEAR.  JOIN is turned
on if a keyring is being altered.

The KEYCTL_DESCRIBE function translates the ACL back into a permissions
mask to return depending on possessor, owner, group and everyone ACEs.

It will make the following mappings:

 (1) INVAL, JOIN -> SEARCH

 (2) SET_SECURITY -> SETATTR

 (3) REVOKE -> WRITE if SETATTR isn't already set

 (4) CLEAR -> WRITE

Note that the value subsequently returned by KEYCTL_DESCRIBE may not match
the value set with KEYCTL_SETATTR.


=======
TESTING
=======

This passes the keyutils testsuite for all but a couple of tests:

 (1) tests/keyctl/dh_compute/badargs: The first wrong-key-type test now
     returns EOPNOTSUPP rather than ENOKEY as READ permission isn't removed
     if the type doesn't have ->read().  You still can't actually read the
     key.

 (2) tests/keyctl/permitting/valid: The view-other-permissions test doesn't
     work as Other has been replaced with Everyone in the ACL.

Signed-off-by: David Howells <dhowells@redhat.com>
2019-06-27 23:03:07 +01:00
Dan Carpenter
16d4b74654 md/raid1: Fix a warning message in remove_wb()
The WARN_ON() macro doesn't take an error message, it just takes a
condition.  I've changed this to use WARN(1, "...") instead.

Fixes: 3e148a320979 ("md/raid1: fix potential data inconsistency issue with write behind device")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
2019-06-26 12:18:48 -07:00
Milan Broz
2eba4e640b dm verity: use message limit for data block corruption message
DM verity should also use DMERR_LIMIT to limit repeat data block
corruption messages.

Signed-off-by: Milan Broz <gmazyland@gmail.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-06-25 14:09:14 -04:00
Jerome Marchand
a065192655 dm table: don't copy from a NULL pointer in realloc_argv()
For the first call to realloc_argv() in dm_split_args(), old_argv is
NULL and size is zero. Then memcpy is called, with the NULL old_argv
as the source argument and a zero size argument. AFAIK, this is
undefined behavior and generates the following warning when compiled
with UBSAN on ppc64le:

In file included from ./arch/powerpc/include/asm/paca.h:19,
                 from ./arch/powerpc/include/asm/current.h:16,
                 from ./include/linux/sched.h:12,
                 from ./include/linux/kthread.h:6,
                 from drivers/md/dm-core.h:12,
                 from drivers/md/dm-table.c:8:
In function 'memcpy',
    inlined from 'realloc_argv' at drivers/md/dm-table.c:565:3,
    inlined from 'dm_split_args' at drivers/md/dm-table.c:588:9:
./include/linux/string.h:345:9: error: argument 2 null where non-null expected [-Werror=nonnull]
  return __builtin_memcpy(p, q, size);
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/md/dm-table.c: In function 'dm_split_args':
./include/linux/string.h:345:9: note: in a call to built-in function '__builtin_memcpy'

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-06-25 14:09:13 -04:00
zhangyi (F)
211ad4b733 dm log writes: make sure super sector log updates are written in order
Currently, although we submit super bios in order (and super.nr_entries
is incremented by each logged entry), submit_bio() is async so each
super sector may not be written to log device in order and then the
final nr_entries may be smaller than it should be.

This problem can be reproduced by the xfstests generic/455 with ext4:

  QA output created by 455
 -Silence is golden
 +mark 'end' does not exist

Fix this by serializing submission of super sectors to make sure each
is written to the log disk in order.

Fixes: 0e9cebe724597 ("dm: add log writes target")
Cc: stable@vger.kernel.org
Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
Suggested-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-06-25 14:09:13 -04:00
Stephen Boyd
10c9c8e7c0 dm init: remove trailing newline from calls to DMERR() and DMINFO()
These printing macros already add a trailing newline, so having another
one here just makes for blank lines when these prints are enabled.
Remove these needless newlines.

Fixes: 6bbc923dfcf5 ("dm: add support to directly boot to a mapped device")
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-06-25 13:43:09 -04:00
Gen Zhang
dec7e6494e dm init: fix incorrect uses of kstrndup()
Fix 2 kstrndup() calls with incorrect argument order.

Fixes: 6bbc923dfcf5 ("dm: add support to directly boot to a mapped device")
Cc: stable@vger.kernel.org # v5.1
Signed-off-by: Gen Zhang <blackgod016574@gmail.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-06-25 13:34:52 -04:00
Guoqing Jiang
d494549ac8 md: add bitmap_abort label in md_run
Now, there are two places need to consider about
the failure of destroy bitmap, so move the common
part between bitmap_abort and abort label.

Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
2019-06-20 16:36:00 -07:00
Guoqing Jiang
617b194a13 md-bitmap: create and destroy wb_info_pool with the change of bitmap
The write-behind attribute is part of bitmap, since bitmap
can be added/removed dynamically with the following.

1. mdadm --grow /dev/md0 --bitmap=none
2. mdadm --grow /dev/md0 --bitmap=internal --write-behind

So we need to destroy wb_info_pool in md_bitmap_destroy,
and create the pool before load bitmap.

Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
2019-06-20 16:36:00 -07:00
Guoqing Jiang
10c92fca63 md-bitmap: create and destroy wb_info_pool with the change of backlog
Since we can enable write-behind mode by write backlog node,
so create wb_info_pool if the mode is just enabled, also call
call md_bitmap_update_sb to make user aware the write-behind
mode is enabled. Conversely, wb_info_pool should be destroyed
when write-behind mode is disabled.

Beside above, it is better to update bitmap sb if we change
the number of max_write_behind.

Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
2019-06-20 16:36:00 -07:00
Guoqing Jiang
963c555e75 md: introduce mddev_create/destroy_wb_pool for the change of member device
Previously, we called rdev_init_wb to avoid potential data
inconsistency when array is created.

Now, we need to call the function and create mempool if a
device is added or just be flaged as "writemostly". So
mddev_create_wb_pool is introduced and called accordingly.
And for safety reason, we mark implicit GFP_NOIO allocation
scope for create mempool during mddev_suspend/mddev_resume.

And mempool should be removed conversely after remove a
member device or its's "writemostly" flag, which is done
by call mddev_destroy_wb_pool.

Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
2019-06-20 16:36:00 -07:00
Guoqing Jiang
3e148a3209 md/raid1: fix potential data inconsistency issue with write behind device
For write-behind mode, we think write IO is complete once it has
reached all the non-writemostly devices. It works fine for single
queue devices.

But for multiqueue device, if there are lots of IOs come from upper
layer, then the write-behind device could issue those IOs to different
queues, depends on the each queue's delay, so there is no guarantee
that those IOs can arrive in order.

To address the issue, we need to check the collision among write
behind IOs, we can only continue without collision, otherwise wait
for the completion of previous collisioned IO.

And WBCollision is introduced for multiqueue device which is worked
under write-behind mode.

But this patch doesn't handle below cases which could have the data
inconsistency issue as well, these cases will be handled in later
patches.

1. modify max_write_behind by write backlog node.
2. add or remove array's bitmap dynamically.
3. the change of member disk.

Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
2019-06-20 16:35:59 -07:00
Christoph Hellwig
14ccb66b3f block: remove the bi_phys_segments field in struct bio
We only need the number of segments in the blk-mq submission path.
Remove the field from struct bio, and return it from a variant of
blk_queue_split instead of that it can passed as an argument to
those functions that need the value.

This also means we stop recounting segments except for cloning
and partial segments.

To keep the number of arguments in this how path down remove
pointless struct request_queue arguments from any of the functions
that had it and grew a nr_segs argument.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-20 10:29:22 -06:00
Mariusz Tkaczyk
9642fa73d0 md: fix for divide error in status_resync
Stopping external metadata arrays during resync/recovery causes
retries, loop of interrupting and starting reconstruction, until it
hit at good moment to stop completely. While these retries
curr_mark_cnt can be small- especially on HDD drives, so subtraction
result can be smaller than 0. However it is casted to uint without
checking. As a result of it the status bar in /proc/mdstat while stopping
is strange (it jumps between 0% and 99%).

The real problem occurs here after commit 72deb455b5ec ("block: remove
CONFIG_LBDAF"). Sector_div() macro has been changed, now the
divisor is casted to uint32. For db = -8 the divisior(db/32-1) becomes 0.

Check if db value can be really counted and replace these macro by
div64_u64() inline.

Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
2019-06-18 08:02:25 -07:00
Guoqing Jiang
e9eeba28a1 md/raid10: read balance chooses idlest disk for SSD
Andy reported that raid10 array with SSD disks has poor
read performance. Compared with raid1, RAID-1 can be 3x
faster than RAID-10 sometimes [1].

The thing is that raid10 chooses the low distance disk
for read request, however, the approach doesn't work
well for SSD device since it doesn't have spindle like
HDD, we should just read from the SSD which has less
pending IO like commit 9dedf60313fa4 ("md/raid1: read
balance chooses idlest disk for SSD").

So this commit selects the idlest SSD disk for read if
array has none rotational disk, otherwise, read_balance
uses the previous distance priority algorithm. With the
change, the performance of raid10 gets increased largely
per Andy's test [2].

[1]. https://marc.info/?l=linux-raid&m=155915890004761&w=2
[2]. https://marc.info/?l=linux-raid&m=155990654223786&w=2

Tested-by: Andy Smith <andy@strugglers.net>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-15 01:37:35 -06:00
Marcos Paulo de Souza
c7afa8034b md: raid1-10: Unify r{1,10}bio_pool_free
Avoiding duplicated code, since they just execute a kfree.

Signed-off-by: Marcos Paulo de Souza <marcos.souza.org@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-15 01:37:35 -06:00
Gustavo A. R. Silva
8cf05a7841 md: raid10: Use struct_size() in kmalloc()
One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along
with memory for some number of elements for that array. For example:

struct foo {
   int stuff;
   struct boo entry[];
};

instance = kmalloc(size, GFP_KERNEL);

Instead of leaving these open-coded and prone to type mistakes, we can
now use the new struct_size() helper:

instance = kmalloc(struct_size(instance, entry, count), GFP_KERNEL);

This code was detected with the help of Coccinelle.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-15 01:37:34 -06:00
Yufen Yu
ebfeb444fa md/raid1: get rid of extra blank line and space
This patch get rid of extra blank line and space, and
add necessary space for code.

Signed-off-by: Yufen Yu <yuyufen@huawei.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-15 01:37:34 -06:00
Yufen Yu
e5b521ee9b md: fix spelling typo and add necessary space
This patch fix a spelling typo and add necessary space for code.
In addition, the patch get rid of the unnecessary 'if'.

Signed-off-by: Yufen Yu <yuyufen@huawei.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-15 01:37:34 -06:00
Marcos Paulo de Souza
168b305b0c md: md.c: Return -ENODEV when mddev is NULL in rdev_attr_show
Commit c42d3240990814eec1e4b2b93fa0487fc4873aed
("md: return -ENODEV if rdev has no mddev assigned") changed
rdev_attr_store to return -ENODEV when rdev->mddev is NULL, now do the
same to rdev_attr_show.

Signed-off-by: Marcos Paulo de Souza <marcos.souza.org@gmail.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-15 01:37:34 -06:00
Xiao Ni
d9771f5ec4 raid5-cache: Need to do start() part job after adding journal device
commit d5d885fd514f ("md: introduce new personality funciton start()")
splits the init job to two parts. The first part run() does the jobs that
do not require the md threads. The second part start() does the jobs that
require the md threads.

Now it just does run() in adding new journal device. It needs to do the
second part start() too.

Fixes: d5d885fd514f ("md: introduce new personality funciton start()")
Cc: stable@vger.kernel.org #v4.9+
Reported-by: Michal Soltys <soltys@ziu.info>
Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-15 01:37:34 -06:00
Marcos Paulo de Souza
3f677f9c99 drivers: md: Unify common definitions of raid1 and raid10
These definitions are being moved to raid1-10.c.

Signed-off-by: Marcos Paulo de Souza <marcos.souza.org@gmail.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-15 01:37:34 -06:00
Mauro Carvalho Chehab
f0ba43774c docs: convert docs to ReST and rename to *.rst
The conversion is actually:
  - add blank lines and indentation in order to identify paragraphs;
  - fix tables markups;
  - add some lists markups;
  - mark literal blocks;
  - adjust title markups.

At its new index.rst, let's add a :orphan: while this is not linked to
the main index.rst file, in order to avoid build warnings.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2019-06-14 14:21:04 -06:00
Coly Li
1f0ffa6734 bcache: only set BCACHE_DEV_WB_RUNNING when cached device attached
When people set a writeback percent via sysfs file,
  /sys/block/bcache<N>/bcache/writeback_percent
current code directly sets BCACHE_DEV_WB_RUNNING to dc->disk.flags
and schedules kworker dc->writeback_rate_update.

If there is no cache set attached to, the writeback kernel thread is
not running indeed, running dc->writeback_rate_update does not make
sense and may cause NULL pointer deference when reference cache set
pointer inside update_writeback_rate().

This patch checks whether the cache set point (dc->disk.c) is NULL in
sysfs interface handler, and only set BCACHE_DEV_WB_RUNNING and
schedule dc->writeback_rate_update when dc->disk.c is not NULL (it
means the cache device is attached to a cache set).

This problem might be introduced from initial bcache commit, but
commit 3fd47bfe55b0 ("bcache: stop dc->writeback_rate_update properly")
changes part of the original code piece, so I add 'Fixes: 3fd47bfe55b0'
to indicate from which commit this patch can be applied.

Fixes: 3fd47bfe55b0 ("bcache: stop dc->writeback_rate_update properly")
Reported-by: Bjørn Forsman <bjorn.forsman@gmail.com>
Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: Bjørn Forsman <bjorn.forsman@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-13 03:09:15 -06:00
Coly Li
31b90956b1 bcache: fix stack corruption by PRECEDING_KEY()
Recently people report bcache code compiled with gcc9 is broken, one of
the buggy behavior I observe is that two adjacent 4KB I/Os should merge
into one but they don't. Finally it turns out to be a stack corruption
caused by macro PRECEDING_KEY().

See how PRECEDING_KEY() is defined in bset.h,
437 #define PRECEDING_KEY(_k)                                       \
438 ({                                                              \
439         struct bkey *_ret = NULL;                               \
440                                                                 \
441         if (KEY_INODE(_k) || KEY_OFFSET(_k)) {                  \
442                 _ret = &KEY(KEY_INODE(_k), KEY_OFFSET(_k), 0);  \
443                                                                 \
444                 if (!_ret->low)                                 \
445                         _ret->high--;                           \
446                 _ret->low--;                                    \
447         }                                                       \
448                                                                 \
449         _ret;                                                   \
450 })

At line 442, _ret points to address of a on-stack variable combined by
KEY(), the life range of this on-stack variable is in line 442-446,
once _ret is returned to bch_btree_insert_key(), the returned address
points to an invalid stack address and this address is overwritten in
the following called bch_btree_iter_init(). Then argument 'search' of
bch_btree_iter_init() points to some address inside stackframe of
bch_btree_iter_init(), exact address depends on how the compiler
allocates stack space. Now the stack is corrupted.

Fixes: 0eacac22034c ("bcache: PRECEDING_KEY()")
Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: Rolf Fokkens <rolf@rolffokkens.nl>
Reviewed-by: Pierre JUHEN <pierre.juhen@orange.fr>
Tested-by: Shenghui Wang <shhuiw@foxmail.com>
Tested-by: Pierre JUHEN <pierre.juhen@orange.fr>
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Cc: Nix <nix@esperi.org.uk>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-06-13 03:09:14 -06:00
Thomas Gleixner
55716d2643 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 428
Based on 1 normalized pattern(s):

  this file is released under the gplv2

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 68 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Armijn Hemel <armijn@tjaldur.nl>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190531190114.292346262@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-05 17:37:16 +02:00
Thomas Gleixner
2025cf9e19 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 288
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms and conditions of the gnu general public license
  version 2 as published by the free software foundation this program
  is distributed in the hope it will be useful but without any
  warranty without even the implied warranty of merchantability or
  fitness for a particular purpose see the gnu general public license
  for more details

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 263 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190529141901.208660670@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-05 17:36:37 +02:00
Thomas Gleixner
1a59d1b8e0 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 156
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license as published by
  the free software foundation either version 2 of the license or at
  your option any later version this program is distributed in the
  hope that it will be useful but without any warranty without even
  the implied warranty of merchantability or fitness for a particular
  purpose see the gnu general public license for more details you
  should have received a copy of the gnu general public license along
  with this program if not write to the free software foundation inc
  59 temple place suite 330 boston ma 02111 1307 usa

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-or-later

has been chosen to replace the boilerplate/reference in 1334 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Richard Fontana <rfontana@redhat.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190527070033.113240726@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-30 11:26:35 -07:00
Thomas Gleixner
2874c5fd28 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license as published by
  the free software foundation either version 2 of the license or at
  your option any later version

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-or-later

has been chosen to replace the boilerplate/reference in 3029 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-30 11:26:32 -07:00
Linus Torvalds
b2ad81363f libnvdimm fixes v5.2-rc2
- Fix a regression that disabled device-mapper dax support
 
 - Remove unnecessary hardened-user-copy overhead (>30%) for dax
   read(2)/write(2).
 
 - Fix some compilation warnings.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJc6WWQAAoJEB7SkWpmfYgCpVwP/0Vfq/3ChbH5T7s4x2MkpLX+
 metYwCyzPJK32mVMbAmizGWEBn8Np+eZcU7jvKYpDXJLWdbUUz4oZD04RYmgkYp7
 SHmjn9VdpfMSziWUx6zrrbyAtBq04x7GT7IIkCzlGIuNVCYqXBnRSVGz06tDFEEd
 pU9HtZr32C425pdFK5D4sorJED2JKG7CwLPdSVHayuyHmg7jp78T7U5Y31WgOhSw
 +JF6UwQIJ+UPg30PYBPG32Zmh8E7Fv/AaYF3JGbp4xRS+B/xbakZhJtYuBzWRjlp
 BlwUg9nUaVgEnjE9KpTcJk8VlXDz6ZjpYXXdY4Hv5g+PPWm5kdZBhPYjaymrtI3o
 7DjtKmNd4F5qhU06oTXtFoBbgoiOBM7fOqsyVZ6tsNguVojlt8lnUvkTKqvznw4n
 K4TGzi0Zgu511umMumF1Q/d0BlNXz+gptcC4qwuEUyQa7sEPSWSfcC66SvY/Y5ym
 VGG4roO3Jz6p3JniuFEXakifzU57vPPv7OxGD3d0PKUSDHVU5yPjWRpJju8wJeVW
 DmTZ+SBo2Q/YP9vDlULPqxGJNkP31SaRg/9PnB8W1z2yqyuA+Pjv+Qjt1X618PFq
 1c2+ufeJoOb1Zc3k6Jw1bovilpb2GDW+4QucC3J0/zFtK00PYcGyyqo3jWlUgINf
 QWPgwBIW/yFcb7xOazFS
 =nko1
 -----END PGP SIGNATURE-----

Merge tag 'libnvdimm-fixes-5.2-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Pull libnvdimm fixes from Dan Williams:

 - Fix a regression that disabled device-mapper dax support

 - Remove unnecessary hardened-user-copy overhead (>30%) for dax
   read(2)/write(2).

 - Fix some compilation warnings.

* tag 'libnvdimm-fixes-5.2-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  libnvdimm/pmem: Bypass CONFIG_HARDENED_USERCOPY overhead
  dax: Arrange for dax_supported check to span multiple devices
  libnvdimm: Fix compilation warnings with W=1
2019-05-25 10:11:23 -07:00
Linus Torvalds
86c2f5d653 SPDX update for 5.2-rc2, round 2
Here is another set of reviewed patches that adds SPDX tags to different
 kernel files, based on a set of rules that are being used to parse the
 comments to try to determine that the license of the file is
 "GPL-2.0-or-later".  Only the "obvious" versions of these matches are
 included here, a number of "non-obvious" variants of text have been
 found but those have been postponed for later review and analysis.
 
 These patches have been out for review on the linux-spdx@vger mailing
 list, and while they were created by automatic tools, they were
 hand-verified by a bunch of different people, all whom names are on the
 patches are reviewers.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXOgmlw8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+yk4rACfRqxGOGVLR/t6E9dDzOZRAdEz/mYAoJLZmziY
 0YlSSSPtP5HI6JDh65Ng
 =HXQb
 -----END PGP SIGNATURE-----

Merge tag 'spdx-5.2-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pule more SPDX updates from Greg KH:
 "Here is another set of reviewed patches that adds SPDX tags to
  different kernel files, based on a set of rules that are being used to
  parse the comments to try to determine that the license of the file is
  "GPL-2.0-or-later".

  Only the "obvious" versions of these matches are included here, a
  number of "non-obvious" variants of text have been found but those
  have been postponed for later review and analysis.

  These patches have been out for review on the linux-spdx@vger mailing
  list, and while they were created by automatic tools, they were
  hand-verified by a bunch of different people, all whom names are on
  the patches are reviewers"

* tag 'spdx-5.2-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (85 commits)
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 125
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 123
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 122
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 121
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 120
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 119
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 118
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 116
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 114
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 113
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 112
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 111
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 110
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 106
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 105
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 104
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 103
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 102
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 101
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 98
  ...
2019-05-24 14:31:58 -07:00
Thomas Gleixner
af1a8899d2 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 47
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license as published by
  the free software foundation either version 2 or at your option any
  later version you should have received a copy of the gnu general
  public license for example usr src linux copying if not write to the
  free software foundation inc 675 mass ave cambridge ma 02139 usa

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-or-later

has been chosen to replace the boilerplate/reference in 20 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190520170858.552543146@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-24 17:27:13 +02:00
Thomas Gleixner
8d7c56d08f treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 45
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license as published by
  the free software foundation either version 2 or at your option any
  later version

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-or-later

has been chosen to replace the boilerplate/reference in 11 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190520170858.370933192@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-24 17:27:12 +02:00
Linus Torvalds
86f9e56d08 - Fix a particularly glaring oversight in a DM core commit from 5.1 that
doesn't properly trim special IOs (e.g. discards) relative to
   corresponding target's max_io_len_target_boundary().
 -----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCAAxFiEEJfWUX4UqZ4x1O2wixSPxCi2dA1oFAlzkicQTHHNuaXR6ZXJA
 cmVkaGF0LmNvbQAKCRDFI/EKLZ0DWuRcCADKc4zew1EAhcHjwanfWoQ2d+5ONdxv
 Ir1yiQX3hqK96G1Vu4IROsxWPDgNBI+60HwWI0z3/nMdv4F9a9Tfowl9fcKgZNey
 jGrzP7x1+9GPtJY0BIpFUQ895qC75wXdd0c6HjmM5IaN0DRrv783ZMMBNhS6yx84
 2vDThiEpGiDKUXHTP9b1khUbZzvkpTmlk293UlkFDftgejfW5mq1FqbjLfACMNVI
 Nh7D6A7MjFrEKHjvNbiGgeFn93iW1+XqGbsFobuCV8Z4A0ImD85H78Lesri2qrRG
 nxGfTZMyq++SGSOV/JTzZ4k/qkZfkrDyljYiPaTgZpCi1mZYQ6wH8+7u
 =oKZ5
 -----END PGP SIGNATURE-----

Merge tag 'for-5.2/dm-fix-1' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper fix from Mike Snitzer:
 "Fix a particularly glaring oversight in a DM core commit from 5.1 that
  doesn't properly trim special IOs (e.g. discards) relative to
  corresponding target's max_io_len_target_boundary()"

* tag 'for-5.2/dm-fix-1' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm: make sure to obey max_io_len_target_boundary
2019-05-22 08:10:35 -07:00
Michael Lass
51b86f9a8d dm: make sure to obey max_io_len_target_boundary
Commit 61697a6abd24 ("dm: eliminate 'split_discard_bios' flag from DM
target interface") incorrectly removed code from
__send_changing_extent_only() that is required to impose a per-target IO
boundary on IO that exceeds max_io_len_target_boundary().  Otherwise
"special" IO (e.g. DISCARD, WRITE SAME, WRITE ZEROES) can write beyond
where allowed.

Fix this by restoring the max_io_len_target_boundary() limit in
__send_changing_extent_only()

Fixes: 61697a6abd24 ("dm: eliminate 'split_discard_bios' flag from DM target interface")
Cc: stable@vger.kernel.org # 5.1+
Signed-off-by: Michael Lass <bevan@bi-co.net>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-05-21 19:15:20 -04:00
Thomas Gleixner
ec8f24b7fa treewide: Add SPDX license identifier - Makefile/Kconfig
Add SPDX license identifiers to all Make/Kconfig files which:

 - Have no license information of any form

These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:

  GPL-2.0-only

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-21 10:50:46 +02:00