IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Sort the cells in logical block order before processing each cell in
process_thin_deferred_cells(). This significantly improves the ondisk
layout on rotational storage, whereby improving read performance.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
This use of direct submission in process_shared_bio() reduces latency
for submitting bios in the shared cell by avoiding adding those bios to
the deferred list and waiting for the next iteration of the worker.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
This use of direct submission in process_prepared_mapping() reduces
latency for submitting bios in a cell by avoiding adding those bios to
the deferred list and waiting for the next iteration of the worker.
But this direct submission exposes the potential for a race between
releasing a cell and incrementing deferred set. Fix this by introducing
dm_cell_visit_release() and refactoring inc_remap_and_issue_cell()
accordingly.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
This avoids dropping the cell, so increases the probability that other
bios will collect within the cell, rather than being passed individually
to the worker.
Also add required process_cell and process_discard_cell error handling
wrappers and set associated pool-mode function pointers accordingly.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
When processing a discard bio, if the block is already quiesced do the
discard immediately rather than adding the mapping to a list for the
next iteration of the worker thread.
Discarding a fully provisioned 100G thin volume with 64k block size goes
from 860s to 95s with this change.
Clearly there's something wrong with the worker architecture, more
investigation needed.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Introduce thin_merge so that any additional constraints from the data
volume may be taken into account when determing the maximum number of
sectors that can be issued relative to the specified logical offset.
This is particularly important if/when the data volume is layered ontop
of a more sophisticated device (e.g. dm-raid or some other DM target).
Reviewed-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
These code changes do not introduce a functional change.
But bio_add_page() will never attempt to build up a bio larger than
queue_max_sectors(). Similarly, bio_get_nr_vecs() is also bound by
queue_max_sectors(). Therefore, there is no point in allowing
dm_merge_bvec() to answer "how many sectors can a bio have at this
offset?" with anything larger than queue_max_sectors(). Using
queue_max_sectors() rather than BIO_MAX_SECTORS serves to more
accurately convey the limits that are being imposed.
Also, use unlikely() to clarify the fact that the defensive code in
dm_merge_bvec() relative to max_size going negative shouldn't ever
happen -- if it does happen there is a bug in the block layer for
requesting larger than dm_merge_bvec()'s initial response for a given
offset. Also, update a comment in dm_merge_bvec() relative to
max_hw_sectors_kb. And fix empty newline whitespace.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Allows for filesystems to submit bios that are a factor of the thinp
blocksize, improving dm-thinp efficiency (particularly when the data
volume is RAID).
Also set io_min to max_sectors_kb if it is a factor of the thinp
blocksize.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Throttle IO based on the time it's taking the worker to do one loop.
There were reports of hung task timeouts occuring and it was observed
that the excessively long avgqu-sz (as reported by iostat) was
contributing to these hung tasks.
Throttling definitely helps dm-thinp perform better under heavy IO load
(without being detremental by being overzealous). It reduces avgqu-sz
drastically, e.g.: from 60K to ~6K, and even as low as 150 once metadata
is cached by bufio, when dirty_ratio=5, dirty_background_ratio=2. And
avgqu-sz stays at or below 30K even with dirty_ratio=20,
dirty_background_ratio=10.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Prefetch metadata at the start of the worker thread and then again every
128th bio processed from the deferred list.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Introduce the dm_tm_issue_prefetches interface. If you're using a
non-blocking clone the tm will build up a list of requested blocks that
weren't in core. dm_tm_issue_prefetches will request those blocks to be
prefetched.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
This change is a prerequisite for allowing metadata to be prefetched.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Previously it was using a fixed sized hash table. There are times
when very many concurrent cells are held (such as when processing a very
large discard). When this happens the hash table performance becomes
very poor.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
These changes help keep metadata backed by dm-bufio in-core longer which
fixes reports of metadata churn in the face of heavy random IO workloads.
Before, bufio evicted all buffers older than DM_BUFIO_DEFAULT_AGE_SECS.
Having a device (e.g. dm-thinp or dm-cache) lose all metadata just
because associated buffers had been idle for some time is unfriendly.
Now, the user may now configure the number of bytes that bufio retains
using the 'retain_bytes' module parameter. The default is 256K.
Also, the DM_BUFIO_WORK_TIMER_SECS and DM_BUFIO_DEFAULT_AGE_SECS
defaults were quite low so increase them (to 30 and 300 respectively).
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Converting over to using an rbtree eliminates a fixed 8MB allocation
from vmalloc space for the hash table.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
The walk code was using a 'ro_spine' to hold it's locked btree nodes.
But this data structure is designed for the rolling lock scheme, and
as such automatically unlocks blocks that are two steps up the call
chain. This is not suitable for the simple recursive walk algorithm,
which retraces its steps.
This code is only used by the persistent array code, which in turn is
only used by dm-cache. In order to trigger it you need to have a
mapping tree that is more than 2 levels deep; which equates to 8-16
million cache blocks. For instance a 4T ssd with a very small block
size of 32k only just triggers this bug.
The fix just places the locked blocks on the stack, and stops using
the ro_spine altogether.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
Avoids normal IO racing with discard.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
Commit 48cf06bc5f ("dm raid: add discard support for RAID levels 4, 5
and 6") did not properly handle missing metadata device(s). A failing
read of the superblock causes the metadata and data devices to be
removed from the dev array in struct raid_set, setting references to
both devices to NULL. configure_discard_support() nonetheless tries to
access the data dev unconditionally causing an oops.
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
The dm-raid superblock (struct dm_raid_superblock) is padded to 512
bytes and that size is being used to read it in from the metadata
device into one preallocated page.
Reading or writing this on a 512-byte sector device works fine but on
a 4096-byte sector device this fails.
Set the dm-raid superblock's size to the logical block size of the
metadata device, because IO at that size is guaranteed too work. Also
add a size check to avoid silent partial metadata loss in case the
superblock should ever grow past the logical block size or PAGE_SIZE.
[includes pointer math fix from Dan Carpenter]
Reported-by: "Liuhua Wang" <lwang@suse.com>
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
bioset_create_nobvec() interface when creating the DM's bioset
. fix a few bugs in dm-bufio and dm-log-userspace
. add DM core support for a DM multipath use-case that requires loading
DM tables that contain devices that have failed (by allowing active
and inactive DM tables to share dm_devs)
. add discard support to the DM raid target; like MD raid456 the user
must opt-in to raid456 discard support be specifying the
devices_handle_discard_safely=Y module param.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJUQWcdAAoJEMUj8QotnQNaHaQH/RD0xf54AnaQ0tEGuNQXFwtx
Gc/3s+VEcKlmTvk9nm2FWNvVagPn8uBQ0O2eid4UJk9AyfPJnPwGUoVqxbKhKK9i
G5/O5s8opLlItk14h/btw/zB8RNC1bg8NGnBrGYDudiwHm+Gv4jlnHErp2JMHv9F
nonb+QoG23wlEJkBafzBNYhthkNDq1ZFrDjhqG7dNySkXh8VZAW8VcZ/ZfskkhOa
C8CDl3TKL1BBJHQKesvqHQbCSqh8Ujzs63bLA3heaSMExkhmUgdfpnbHK4hzPNJP
rtmVEW57mVI+O5Cfva1p9RClT5EjiO+5VufHkpRJSIsfsH5PMaQ7vW8gKmwd5JA=
=z+Yz
-----END PGP SIGNATURE-----
Merge tag 'dm-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device-mapper updates from Mike Snitzer:
"I rebased the DM tree ontop of linux-block.git's 'for-3.18/core' at
the beginning of October because DM core now depends on the newly
introduced bioset_create_nobvec() interface.
Summary:
- fix DM's long-standing excessive use of memory by leveraging the
new bioset_create_nobvec() interface when creating the DM's bioset
- fix a few bugs in dm-bufio and dm-log-userspace
- add DM core support for a DM multipath use-case that requires
loading DM tables that contain devices that have failed (by
allowing active and inactive DM tables to share dm_devs)
- add discard support to the DM raid target; like MD raid456 the user
must opt-in to raid456 discard support be specifying the
devices_handle_discard_safely=Y module param"
* tag 'dm-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm log userspace: fix memory leak in dm_ulog_tfr_init failure path
dm bufio: when done scanning return from __scan immediately
dm bufio: update last_accessed when relinking a buffer
dm raid: add discard support for RAID levels 4, 5 and 6
dm raid: add discard support for RAID levels 1 and 10
dm: allow active and inactive tables to share dm_devs
dm mpath: stop queueing IO when no valid paths exist
dm: use bioset_create_nobvec()
dm: remove nr_iovecs parameter from alloc_tio()
Pull block layer driver update from Jens Axboe:
"This is the block driver pull request for 3.18. Not a lot in there
this round, and nothing earth shattering.
- A round of drbd fixes from the linbit team, and an improvement in
asender performance.
- Removal of deprecated (and unused) IRQF_DISABLED flag in rsxx and
hd from Michael Opdenacker.
- Disable entropy collection from flash devices by default, from Mike
Snitzer.
- A small collection of xen blkfront/back fixes from Roger Pau Monné
and Vitaly Kuznetsov"
* 'for-3.18/drivers' of git://git.kernel.dk/linux-block:
block: disable entropy contributions for nonrot devices
xen, blkfront: factor out flush-related checks from do_blkif_request()
xen-blkback: fix leak on grant map error path
xen/blkback: unmap all persistent grants when frontend gets disconnected
rsxx: Remove deprecated IRQF_DISABLED
block: hd: remove deprecated IRQF_DISABLED
drbd: use RB_DECLARE_CALLBACKS() to define augment callbacks
drbd: compute the end before rb_insert_augmented()
drbd: Add missing newline in resync progress display in /proc/drbd
drbd: reduce lock contention in drbd_worker
drbd: Improve asender performance
drbd: Get rid of the WORK_PENDING macro
drbd: Get rid of the __no_warn and __cond_lock macros
drbd: Avoid inconsistent locking warning
drbd: Remove superfluous newline from "resync_extents" debugfs entry.
drbd: Use consistent names for all the bi_end_io callbacks
drbd: Use better variable names
- a few minor bug fixes
- quite a lot of code tidy-up and simplification
- remove PRINT_RAID_DEBUG ioctl. I'm fairly sure
it is unused, and it isn't particularly useful.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIVAwUAVD9k1jnsnt1WYoG5AQKCaw/9F7jE9PDYtEfJ8ShEWMM0CWNsCKmgqfpV
i4RaeVKe1IoA5JOurYk+wvdbWSXGfz5XQ9GX8ptRl9X7ZoG4aJ65v9GHpBamPLrc
mB2Lz3zR9AZVYrMDeZym9cSZ6FpZNzzXpJE2O2RslXq3gFI03MObyM8xyeh8ybOD
45nhH+CJ17OFNn5OzzFLhYEAoYDeOS97zAwInWeFlUp14Jl403xnZ3srF2YJ78TR
PjcCpxo1IhGEnYE8rDYqH/UjDPzEfAdYrqM5k3NEnuPiqn+KxYNSsbAQGdeMrGUc
DO0H8dt6U1U2tq/t/qN8n01uQ7AJ3S3JrTsQxSW/UC1SVfgpztK/a78eA/YSy/zs
iZzPP7CpLfF4T945jaQionevZOBFRM+gbrMqeoQTPO2QtfrSGe4Awoht7Z+no3RR
dCX0ScO16kHkcAcSXXGZkGtC1DcteEwUfufSdako12exo1k3efc98DsyMw2VfzOM
EJcQD1JGYVW+czZM58EEue92TT5jvWnhU5s3PEUMTZrDgSWwTVQC3oNCgDGFKI4X
eebpWlG3gEjNrnL5givbBwC2LCfI59R70gpnGhavdKtt9AtpfsMJnj8E3cCqHE9I
xR6YPF161KSmKGOG47RK/VJJnq5SmZbxxeShT101uq3SVeqImit6ql3JfAM9HoMD
RI2iWG9yma4=
=2QEJ
-----END PGP SIGNATURE-----
Merge tag 'md/3.18' of git://neil.brown.name/md
Pull md updates from Neil Brown:
- a few minor bug fixes
- quite a lot of code tidy-up and simplification
- remove PRINT_RAID_DEBUG ioctl. I'm fairly sure it is unused, and it
isn't particularly useful.
* tag 'md/3.18' of git://neil.brown.name/md: (21 commits)
lib/raid6: Add log level to printks
md: move EXPORT_SYMBOL to after function in md.c
md: discard PRINT_RAID_DEBUG ioctl
md: remove MD_BUG()
md: clean up 'exit' labels in md_ioctl().
md: remove unnecessary test for MD_MAJOR in md_ioctl()
md: don't allow "-sync" to be set for device in an active array.
md: remove unwanted white space from md.c
md: don't start resync thread directly from md thread.
md: Just use RCU when checking for overlap between arrays.
md: avoid potential long delay under pers_lock
md: simplify export_array()
md: discard find_rdev_nr in favour of find_rdev_nr_rcu
md: use wait_event() to simplify md_super_wait()
md: be more relaxed about stopping an array which isn't started.
md/raid1: process_checks doesn't use its return value.
md/raid5: fix init_stripe() inconsistencies
md/raid10: another memory leak due to reshape.
md: use set_bit/clear_bit instead of shift/mask for bi_flags changes.
md/raid1: minor typos and reformatting.
...
The shrinker uses gfp flags to indicate what kind of operation can the
driver wait for. If __GFP_IO flag is present, the driver can wait for
block I/O operations, if __GFP_FS flag is present, the driver can wait on
operations involving the filesystem.
dm-bufio tested for __GFP_IO. However, dm-bufio can run on a loop block
device that makes calls into the filesystem. If __GFP_IO is present and
__GFP_FS isn't, dm-bufio could still block on filesystem operations if it
runs on a loop block device.
The change from __GFP_IO to __GFP_FS supposedly fixes one observed (though
unreproducible) deadlock involving dm-bufio and loop device.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
Pull percpu consistent-ops changes from Tejun Heo:
"Way back, before the current percpu allocator was implemented, static
and dynamic percpu memory areas were allocated and handled separately
and had their own accessors. The distinction has been gone for many
years now; however, the now duplicate two sets of accessors remained
with the pointer based ones - this_cpu_*() - evolving various other
operations over time. During the process, we also accumulated other
inconsistent operations.
This pull request contains Christoph's patches to clean up the
duplicate accessor situation. __get_cpu_var() uses are replaced with
with this_cpu_ptr() and __this_cpu_ptr() with raw_cpu_ptr().
Unfortunately, the former sometimes is tricky thanks to C being a bit
messy with the distinction between lvalues and pointers, which led to
a rather ugly solution for cpumask_var_t involving the introduction of
this_cpu_cpumask_var_ptr().
This converts most of the uses but not all. Christoph will follow up
with the remaining conversions in this merge window and hopefully
remove the obsolete accessors"
* 'for-3.18-consistent-ops' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (38 commits)
irqchip: Properly fetch the per cpu offset
percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t -fix
ia64: sn_nodepda cannot be assigned to after this_cpu conversion. Use __this_cpu_write.
percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t
Revert "powerpc: Replace __get_cpu_var uses"
percpu: Remove __this_cpu_ptr
clocksource: Replace __this_cpu_ptr with raw_cpu_ptr
sparc: Replace __get_cpu_var uses
avr32: Replace __get_cpu_var with __this_cpu_write
blackfin: Replace __get_cpu_var uses
tile: Use this_cpu_ptr() for hardware counters
tile: Replace __get_cpu_var uses
powerpc: Replace __get_cpu_var uses
alpha: Replace __get_cpu_var
ia64: Replace __get_cpu_var uses
s390: cio driver &__get_cpu_var replacements
s390: Replace __get_cpu_var uses
mips: Replace __get_cpu_var uses
MIPS: Replace __get_cpu_var uses in FPU emulator.
arm: Replace __this_cpu_ptr with raw_cpu_ptr
...
Replaced the use of a Variable Length Array In Struct (VLAIS) with a C99
compliant equivalent. This patch allocates the appropriate amount of memory
using a char array using the SHASH_DESC_ON_STACK macro.
The new code can be compiled with both gcc and clang.
Signed-off-by: Jan-Simon Möller <dl9pf@gmx.de>
Signed-off-by: Behan Webster <behanw@converseincode.com>
Reviewed-by: Mark Charlebois <charlebm@gmail.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Cc: pageexec@freemail.hu
Cc: gmazyland@gmail.com
Cc: "David S. Miller" <davem@davemloft.net>
All the interesting information printed by this ioctl
is provided in /proc/mdstat and/or sysfs.
So it isn't needed and isn't used and would be best if it didn't
exist.
Signed-off-by: NeilBrown <neilb@suse.de>
Most of the places that call this are doing so pointlessly.
A couple of the others a best replaced with WARN_ON().
Signed-off-by: NeilBrown <neilb@suse.de>
unknown ioctls no longer get this deep into md_ioctl since
md_ioctl_valid() was introduced in 3.14.
So remove the test and the misleading comment.
Signed-off-by: NeilBrown <neilb@suse.de>
If an array is active, devices can be marked 'faulty', but simply
removing the 'sync' flag is wrong. That only makes sense
for an array which is not active (and is probably only useful
for testing anyway).
Signed-off-by: NeilBrown <neilb@suse.de>
The main 'md' thread is needed for processing writes, so if it blocks
write requests could be delayed.
Starting a new thread requires some GFP_KERNEL allocations and so can
wait for writes to complete. This can deadlock.
So instead, ask a workqueue to start the sync thread.
There is no particular rush for this to happen, so any work queue
will do.
MD_RECOVERY_RUNNING is used to ensure only one thread is started.
Reported-by: BillStuff <billstuff2001@sbcglobal.net>
Signed-off-by: NeilBrown <neilb@suse.de>
We don't really need the full mddev_lock here, and having to
drop it is messy.
RCU is enough to protect these lists.
Signed-off-by: NeilBrown <neilb@suse.de>
printk may cause long time lapse if value of printk_delay in sysctl is
configured large by user. If register_md_personality takes long time to print in
spinlock pers_lock, we may encounter high CPU usage rate when there are other
pers_lock competitors who may be blocked to spin.
We can avoid this condition by moving printk out of coverage of pers_lock
spinlock.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: NeilBrown <neilb@suse.de>
In general we don't allow an array to be stopped if it is in use.
However if the array hasn't really been started yet, then any
apparent use is an anomily, probably due to 'udev' or similar
having a look to see what is there.
This means that if something goes wrong while assembling an array
it cannot reliably be un-assembled - STOP_ARRAY could fail.
There is no value here, so change do_md_stop() to succeed
despite concurrent opens if the array has not yet been
activated. i.e. if ->pers is NULL.
Reported-by: "Baldysiak, Pawel" <pawel.baldysiak@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
raid5: fix init_stripe() inconsistencies
1) remove_hash() is not necessary. We will only be called right after
get_free_stripe(). There we have already a call to remove_hash().
2) Tracing prints out the sector of the freed stripe and not the sector
that we want to initialize.
Signed-off-by: NeilBrown <neilb@suse.de>
Pull scheduler updates from Ingo Molnar:
"The main changes in this cycle were:
- Optimized support for Intel "Cluster-on-Die" (CoD) topologies (Dave
Hansen)
- Various sched/idle refinements for better idle handling (Nicolas
Pitre, Daniel Lezcano, Chuansheng Liu, Vincent Guittot)
- sched/numa updates and optimizations (Rik van Riel)
- sysbench speedup (Vincent Guittot)
- capacity calculation cleanups/refactoring (Vincent Guittot)
- Various cleanups to thread group iteration (Oleg Nesterov)
- Double-rq-lock removal optimization and various refactorings
(Kirill Tkhai)
- various sched/deadline fixes
... and lots of other changes"
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (72 commits)
sched/dl: Use dl_bw_of() under rcu_read_lock_sched()
sched/fair: Delete resched_cpu() from idle_balance()
sched, time: Fix build error with 64 bit cputime_t on 32 bit systems
sched: Improve sysbench performance by fixing spurious active migration
sched/x86: Fix up typo in topology detection
x86, sched: Add new topology for multi-NUMA-node CPUs
sched/rt: Use resched_curr() in task_tick_rt()
sched: Use rq->rd in sched_setaffinity() under RCU read lock
sched: cleanup: Rename 'out_unlock' to 'out_free_new_mask'
sched: Use dl_bw_of() under RCU read lock
sched/fair: Remove duplicate code from can_migrate_task()
sched, mips, ia64: Remove __ARCH_WANT_UNLOCKED_CTXSW
sched: print_rq(): Don't use tasklist_lock
sched: normalize_rt_tasks(): Don't use _irqsave for tasklist_lock, use task_rq_lock()
sched: Fix the task-group check in tg_has_rt_tasks()
sched/fair: Leverage the idle state info when choosing the "idlest" cpu
sched: Let the scheduler see CPU idle states
sched/deadline: Fix inter- exclusive cpusets migrations
sched/deadline: Clear dl_entity params when setscheduling to different class
sched/numa: Kill the wrong/dead TASK_DEAD check in task_numa_fault()
...
Fix a potential struct stripe_c leak that would occur if the
chunk_size exceeded the maximum allowed by dm_set_target_max_io_len
(UINT_MAX). However, in practice there is no possibility of this
occuring given that chunk_size is of type uint32_t. But it is good to
fix this to future-proof in case dm_set_target_max_io_len's
implementation were to change.
Signed-off-by: Pavitra Kumar <pavitrak@nvidia.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
If two threads call bitmap_unplug at the same time, then
one might schedule all the writes, and the other might
decide that it doesn't need to wait. But really it does.
It rarely hurts to wait when it isn't absolutely necessary,
and the current code doesn't really focus on 'absolutely necessary'
anyway. So just wait always.
This can potentially lead to data corruption if a crash happens
at an awkward time and data was written before the bitmap was
updated. It is very unlikely, but this should go to -stable
just to be safe. Appropriate for any -stable.
Signed-off-by: NeilBrown <neilb@suse.de>
Cc: stable@vger.kernel.org (please delay until 3.18 is released)
If cn_add_callback() fails in dm_ulog_tfr_init(), it does not
deallocate prealloced memory but calls cn_del_callback().
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Reviewed-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org