* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
vmwgfx: fix incorrect VRAM size check in vmw_kms_fb_create()
drm/radeon/kms: bail on BTC parts if MC ucode is missing
commit 6373a9a286 (netem: use vmalloc for distribution table) added a
regression, since vfree() is called while holding a spinlock and with BH
disabled.
Fix this by doing the pointer swap inside the critical section and
freeing only after the spinlock is released.
Also add __GFP_NOWARN to the kmalloc() attempt, since we fall back to
vmalloc().
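Roughly, the pattern looks like this (a hedged sketch with hypothetical names; dist_holder/dist_replace are not the actual netem code): allocate outside the lock with a quiet kmalloc() and a vmalloc() fallback, swap the pointer inside the critical section, and free the old table only after the lock (and BH protection) has been dropped.

#include <linux/slab.h>
#include <linux/vmalloc.h>
#include <linux/spinlock.h>
#include <linux/mm.h>

struct dist_holder {
    spinlock_t lock;
    void *table;
};

static int dist_replace(struct dist_holder *h, size_t size)
{
    void *new, *old;

    /* quiet kmalloc() attempt first, vmalloc() fallback for big tables */
    new = kmalloc(size, GFP_KERNEL | __GFP_NOWARN);
    if (!new)
        new = vmalloc(size);
    if (!new)
        return -ENOMEM;

    spin_lock_bh(&h->lock);
    old = h->table;            /* swap inside the critical section ... */
    h->table = new;
    spin_unlock_bh(&h->lock);

    if (is_vmalloc_addr(old))  /* ... but free only after unlock */
        vfree(old);
    else
        kfree(old);
    return 0;
}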
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes a scheduling-while-atomic error:
[ 385.565186] ctnetlink v0.93: registering with nfnetlink.
[ 385.565349] BUG: scheduling while atomic: lt-expect_creat/16163/0x00000200
It can be triggered with utils/expect_create included in
libnetfilter_conntrack if the FTP helper is not loaded.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This fixes a bogus error that is returned to user-space:
libnetfilter_conntrack/utils# ./expect_get
TEST: get expectation (-1)(Unknown error 18446744073709551504)
This patch includes the correct handling for EAGAIN (nfnetlink
uses this error value to restart the operation after module
auto-loading).
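As a hedged, generic illustration of that convention (struct my_request and send_request() are made up, not nfnetlink API): -EAGAIN is treated as "retry after module auto-loading", never as a real error to report.

#include <linux/errno.h>

struct my_request;
int send_request(struct my_request *req);   /* made-up helper */

/* Generic, hypothetical illustration -- not nfnetlink code. */
static int send_with_retry(struct my_request *req)
{
    int ret;

    do {
        ret = send_request(req);   /* -EAGAIN: a module was auto-loaded, retry */
    } while (ret == -EAGAIN);

    return ret;                    /* only real errors reach the caller */
}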
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
[media] omap3isp: Fix crash caused by subdevs now having a pointer to devnodes
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: call d_instantiate after all ops are setup
Btrfs: fix worker lock misuse in find_worker
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
netfilter: xt_connbytes: handle negation correctly
net: relax rcvbuf limits
rps: fix insufficient bounds checking in store_rps_dev_flow_table_cnt()
net: introduce DST_NOPEER dst flag
mqprio: Avoid panic if no options are provided
bridge: provide a mtu() method for fake_dst_ops
Add a req_running field to the pl330_thread to track which request (if
any) has been submitted to the DMA. This mechanism replaces the old
one, in which we tried to infer the same by looking at the DMA's PC,
which could prevent the driver from sending more requests if the guess
was wrong.
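A much-simplified, hypothetical sketch of the idea (the real pl330_thread and its helpers look different): record what was submitted instead of inferring it from the DMA's program counter.

#include <linux/errno.h>

struct dma_request;
void start_dma(struct dma_request *rq);   /* made-up helper that kicks the DMA */

struct dma_thread {
    struct dma_request *req[2];   /* ping-pong request slots */
    int req_running;              /* index submitted to the DMA, -1 if none */
};

static int thread_submit(struct dma_thread *thrd, int idx)
{
    if (thrd->req_running != -1)  /* hardware already busy with a request */
        return -EBUSY;

    start_dma(thrd->req[idx]);
    thrd->req_running = idx;      /* remember exactly what we submitted */
    return 0;
}

static void thread_done(struct dma_thread *thrd)
{
    thrd->req_running = -1;       /* hardware is idle again */
}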
Reference: <1323631637-9610-1-git-send-email-javi.merino@arm.com>
Signed-off-by: Javi Merino <javi.merino@arm.com>
Acked-by: Jassi Brar <jaswinder.singh@linaro.org>
Tested-by: Tushar Behera <tushar.behera@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Since Linux 2.6.36 the writeback code has introduced various measures for
livelock prevention during sync(). Unfortunately some of these are
actively harmful for the XFS model, where the inode gets marked dirty for
metadata from the data I/O handler.
The older_than_this checks are now more strictly enforced since
"writeback: avoid livelocking WB_SYNC_ALL writeback",
which only calls into __writeback_inodes_sb and thus samples the
current cut-off time only once. But on slow enough devices the previous
asynchronous sync pass might not have fully completed yet, and thus XFS
might mark metadata dirty only after the cut-off time for the blocking
pass has already been sampled. I have not reproduced this myself on a
real system, but by introducing artificial delays into the XFS I/O
completion workqueues it can be reproduced easily.
Fix this by iterating over all XFS inodes in ->sync_fs and logging all
that are dirty. This might log inodes that only got redirtied after the
previous pass, but given how cheap delayed logging of inodes is, it
isn't a major concern for performance.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Tested-by: Mark Tinguely <tinguely@sgi.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
If the writeback code writes back an inode because it has expired we currently
use the non-blocking ->write_inode path. This means any inode that is pinned
is skipped. With delayed logging and a workload that otherwise has very little
log traffic, it is very likely that an inode that gets constantly
written to is always pinned, and thus we keep refusing to write it. The VM
writeback code at that point redirties it and doesn't try to write it again
for another 30 seconds. This means that under certain scenarios time-based
metadata writeback never happens.
Fix this by calling into xfs_log_inode for kupdate in addition to data
integrity syncs, and thus transferring the inode to the log ASAP.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Tested-by: Mark Tinguely <tinguely@sgi.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Activation conditions for a workaround should not be encoded in the
workaround's direct dependencies if this makes otherwise reasonable
configuration choices impossible.
This patch uses the SMP/UP patching facilities instead to compile
out the workaround if the configuration means that it is definitely
not needed.
This means that configs for buggy silicon can simply select
ARM_ERRATA_751472, without preventing a UP kernel from being built
or duplicating knowledge about when to activate the workaround.
This seems like the correct way to do things, because the erratum is a
property of the silicon, irrespective of what the kernel config
happens to be.
Signed-off-by: Dave Martin <dave.martin@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Activation conditions for a workaround should not be encoded in the
workaround's direct dependencies if this makes otherwise reasonable
configuration choices impossible.
The workaround for erratum 720789 only affects a code path which is
not active in UP kernels; hence it should be safe to turn on in UP
kernels, without penalty.
This patch simply removes the extra dependency on SMP from Kconfig.
This means that configs for buggy silicon can simply select
ARM_ERRATA_720789, without preventing a UP kernel from being built
or duplicating knowledge about when to activate the workaround.
Signed-off-by: Dave Martin <dave.martin@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This reverts commit e1b6eb3ccb.
This was causing a delay of 10 seconds in the resume process of a Thinkpad
laptop. I'm afraid this could affect more devices once 3.2 is released.
Reported-by: Tomáš Janoušek <tomi@nomi.cz>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
The current perf scripting facility only supports tracepoints. This
patch implements a generic Perl handler to also support events other
than tracepoints.
This patch introduces a function process_event() that is called by perf
for each sample. The function is called with byte streams as arguments
containing information about the event, its attributes, the sample and
raw data. Perl's unpack() function can easily be used for byte decoding.
The following is the default implementation for process_event() that can
also be generated with perf script:
# Packed byte string args of process_event():
#
# $event: union perf_event util/event.h
# $attr: struct perf_event_attr linux/perf_event.h
# $sample: struct perf_sample util/event.h
# $raw_data: perf_sample->raw_data util/event.h
sub process_event
{
    my ($event, $attr, $sample, $raw_data) = @_;

    my @event    = unpack("LSS", $event);
    my @attr     = unpack("LLQQQQQLLQQ", $attr);
    my @sample   = unpack("QLLQQQQQLL", $sample);
    my @raw_data = unpack("C*", $raw_data);

    use Data::Dumper;
    print Dumper \@event, \@attr, \@sample, \@raw_data;
}
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1323969824-9711-4-git-send-email-robert.richter@amd.com
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch introduces the for_each_set_bit() macro and modifies feature
implementation to use it.
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1323248577-11268-8-git-send-email-robert.richter@amd.com
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The features HEADER_TRACE_INFO and HEADER_BUILD_ID are handled
differently when writing the feature section. All other features are
simply disabled on failure, and writing the section goes on without
returning an error. There is no reason for these special cases. This
patch unifies handling of the features.
This should be ok since all features can be parsed independently. The
offset and size of a feature's block are stored in struct
perf_file_section right after the data block of perf.data (see
perf_session__write_header()). Thus, if a feature does not exist, other
features can still be processed.
Also move the special code for HEADER_BUILD_ID out to write_build_id().
v2:
* perf record now throws an error if buildids cannot be generated;
this can be disabled with the --no-buildid option.
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1323248577-11268-6-git-send-email-robert.richter@amd.com
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The default input file for perf report is not handled the same way as
perf record handles its output file. This leads to unexpected behavior
of perf report etc., e.g.:
# perf record -a -e cpu-cycles sleep 2 | perf report | cat
failed to open perf.data: No such file or directory (try 'perf record' first)
While perf record writes to a fifo, perf report expects to read
perf.data. This patch changes the tools to also accept a fifo as input
file.
Applies to the following commands:
perf annotate
perf buildid-list
perf evlist
perf kmem
perf lock
perf report
perf sched
perf script
perf timechart
Also fixes char const* -> const char* type declaration for filename
strings.
v2:
* Prevent potential null pointer access to input_name in
builtin-report.c. Needed due to removal of patch "perf report: Setup
browser if stdout is a pipe"
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1323248577-11268-5-git-send-email-robert.richter@amd.com
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If filename is NULL there is an out-of-bounds access to struct
perf_session if it were used with perf_session__open(). This shouldn't
actually happen in the current implementation, as filename is always
non-NULL. Fix this by always null-terminating filename.
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1323248577-11268-3-git-send-email-robert.richter@amd.com
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
A feature may be unknown if perf.data is created and parsed with
different perf tool versions. This should not stop the header from
being processed; instead, continue processing it.
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1323248577-11268-2-git-send-email-robert.richter@amd.com
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reduce duplication and line size by deriving the print and write
function names from a single base name.
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1323248577-11268-7-git-send-email-robert.richter@amd.com
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now that we automatically point users at it, let's provide them some
guidance so that they hopefully don't just get mysterious EINVAL's
from the kernel.
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1324301972-22740-4-git-send-email-nelhage@nelhage.com
Signed-off-by: Nelson Elhage <nelhage@nelhage.com>
[ committer note: Made it work after 50a682c ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This failure is most likely due to running up against the
kernel.perf_event_mlock_kb sysctl, so we can tell the user what to do to
fix the issue.
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1324301972-22740-3-git-send-email-nelhage@nelhage.com
Signed-off-by: Nelson Elhage <nelhage@nelhage.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
I get truncated annotation results like this in 'perf top':
: Disassembly of section .text: ▒
: ▒
: ffffffff810966a8 <nr_iowait_cpu>: ▒
4.94 : ffffffff810966a8: movslq %edi,%rdi ▒
3.70 : ffffffff810966ab: mov $0x13700,%rax ▒
0.00 : ffffffff810966b2: add -0x7e32cb00(,%rdi,8),%rax ▒
8.64 : ffffffff810966ba: mov 0x7e0(%rax),%eax ▒
82.72 : ffffffff810966c0: cltq ▒
Note the missing 'retq' which is there in the original function:
ffffffff810966a8 <nr_iowait_cpu>:
ffffffff810966a8: 48 63 ff movslq %edi,%rdi
ffffffff810966ab: 48 c7 c0 00 37 01 00 mov $0x13700,%rax
ffffffff810966b2: 48 03 04 fd 00 35 cd add -0x7e32cb00(,%rdi,8),%rax
ffffffff810966b9: 81
ffffffff810966ba: 8b 80 e0 07 00 00 mov 0x7e0(%rax),%eax
ffffffff810966c0: 48 98 cltq
ffffffff810966c2: c3 retq
ffffffff810966c3 <this_cpu_load>:
I'm using a fairly recent binutils:
GNU objdump version 2.21.51.0.6-2.fc16 20110118
AFAICS the bug is simply that sym->end points to the last byte
of the symbol in question - while objdump's --stop-address
expects the last byte plus 1 to disassemble the full range.
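A hedged sketch of the consequence (names and buffer handling simplified, not the exact perf code): when building the objdump command line, the stop address has to be the symbol's last byte plus 1.

#include <inttypes.h>
#include <stdio.h>

/* --stop-address is exclusive, so disassembling the whole symbol
 * needs last byte + 1; passing sym_end drops the last instruction. */
static void build_objdump_cmd(char *cmd, size_t len, const char *filename,
                              uint64_t sym_start, uint64_t sym_end)
{
    snprintf(cmd, len,
             "objdump --start-address=0x%016" PRIx64
             " --stop-address=0x%016" PRIx64 " -dS %s",
             sym_start,
             sym_end + 1,
             filename);
}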
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20111223130804.GA24305@elte.hu
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This handles multithreaded processes with named threads when doing
system-wide profiling: the comm for each thread is looked up, allowing
it to differ from that of the thread group leader.
v2:
- fixed sizeof arg to perf_event__get_comm_tgid
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1324578603-12762-3-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf does not properly handle monitoring of processes with named threads.
For example:
$ ps -C myapp -L
PID LWP TTY TIME CMD
25118 25118 ? 00:00:00 myapp
25118 25119 ? 00:00:00 myapp:worker
perf record -e cs -c 1 -fo /tmp/perf.data -p 25118 -- sleep 10
perf report --stdio -i /tmp/perf.data
100.00% myapp:worker [kernel.kallsyms] [k] perf_event_task_sched_out
The process name is set to the name of the last thread it finds for the
process.
The Problem:
perf-top and perf-record both create a thread_map of threads to be
monitored. That map is used in perf_event__synthesize_thread_map which
loops over the entries in thread_map and calls __event__synthesize_thread
to generate COMM and MMAP events.
__event__synthesize_thread calls perf_event__synthesize_comm which opens
/proc/pid/status, reads the name of the task and its thread group id.
That's all fine. The problem is that it then reads /proc/pid/task and
generates COMM events for each task it finds - but using the name found
in /proc/pid/status where pid is the thread of interest.
The end result (looping over thread_map + synthesizing comm events for
each thread each time) means the name of the last thread processed sets
the name for all threads in the process - which is not good for
multithreaded processes with named threads.
The Fix:
perf_event__synthesize_comm has an input argument (full) that decides
whether to process task entries for each pid it is passed. It is
currently never set to 0 (perf_event__synthesize_comm has a single
caller, and it always passes the value 1). Let's fix that.
Add the full input argument to __event__synthesize_thread, which passes
it to perf_event__synthesize_comm. For thread/process monitoring set
full to 0, which means COMM and MMAP events are only generated for the
pid passed in. For system-wide monitoring set full to 1 so that COMM
events are generated for all threads in a process.
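For illustration only, a small standalone program (not perf code) that does what the "full" pass is meant to achieve: read each thread's own name from /proc/<pid>/task/<tid>/comm instead of reusing the leader's name for every thread.

#include <dirent.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    char path[256], comm[64];
    struct dirent *d;
    DIR *dir;
    FILE *f;

    if (argc < 2) {
        fprintf(stderr, "usage: %s <pid>\n", argv[0]);
        return 1;
    }

    snprintf(path, sizeof(path), "/proc/%s/task", argv[1]);
    dir = opendir(path);
    if (!dir) {
        perror(path);
        return 1;
    }

    while ((d = readdir(dir)) != NULL) {
        if (d->d_name[0] == '.')
            continue;   /* skip "." and ".." */
        snprintf(path, sizeof(path), "/proc/%s/task/%s/comm",
                 argv[1], d->d_name);
        f = fopen(path, "r");
        if (!f)
            continue;
        if (fgets(comm, sizeof(comm), f))
            printf("tid %s: %s", d->d_name, comm);   /* comm ends in '\n' */
        fclose(f);
    }
    closedir(dir);
    return 0;
}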
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1324578603-12762-2-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Broken in commit d1dc698a54
Signed-off-by: Joachim Eastwood <joachim.eastwood@jotron.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Use raw_spin_unlock_irqrestore() as the counterpart to
raw_spin_lock_irqsave().
Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1324646665-13334-1-git-send-email-robert.richter@amd.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
"! --connbytes 23:42" should match if the packet/byte count is not in range.
As there is no explicit "invert match" toggle in the match structure,
userspace swaps the from and to arguments
(i.e., as if "--connbytes 42:23" were given).
However, "what <= 23 && what >= 42" will always be false.
Change things so we use "||" in case "from" is larger than "to".
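A hedged sketch of the comparison, simplified from the xt_connbytes logic (types and names here are not the kernel's): when the bounds arrive swapped to express negation, the range test must use "||".

#include <stdbool.h>
#include <stdint.h>

static bool connbytes_match(uint64_t what, uint64_t from, uint64_t to)
{
    if (from <= to)                         /* normal match: from:to */
        return what >= from && what <= to;
    /* inverted: userspace passed e.g. 42:23 to mean "! --connbytes 23:42" */
    return what < to || what > from;
}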
This change may look like it breaks backwards compatibility when "to" is 0.
However, older iptables binaries will refuse "connbytes 42:0",
and current releases treat it as meaning "! --connbytes 0:42",
so we should be fine.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This closes races where btrfs is calling d_instantiate too soon during
inode creation. All of the callers of btrfs_add_nondir are updated to
instantiate after the inode is fully set up in memory.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Dan Carpenter noticed that we were doing a double unlock on the worker
lock, and sometimes picking a worker thread without the lock held.
This fixes both errors.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
This change fixes a linking problem that happens if oprofile
is selected to be built in:
`oprofile_arch_exit' referenced in section `.init.text' of
arch/arm/oprofile/built-in.o: defined in discarded section
`.exit.text' of arch/arm/oprofile/built-in.o
The problem appeared after commit 87121ca504, which
introduced oprofile_arch_exit() calls from an __init function. Note
that the aforementioned commit has been backported to stable
branches, and the problem is known to be reproduced at least
with 3.0.13 and 3.1.5 kernels.
Signed-off-by: Vladimir Zapolskiy <vladimir.zapolskiy@nokia.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: oprofile-list <oprofile-list@lists.sourceforge.net>
Cc: <stable@kernel.org>
Link: http://lkml.kernel.org/r/20111222151540.GB16765@erda.amd.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Make sure that the mutex is released upon register-write failure.
This fixes the boot freeze observed on the ARM-based OLPC
(http://dev.laptop.org/ticket/11357).
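A generic, hypothetical sketch of the pattern being fixed (struct my_dev, write_register() and verify_register() are made up, not the sentelic driver code): the error path must drop the mutex just like the success path.

#include <linux/mutex.h>

struct my_dev {
    struct mutex lock;
};

int write_register(struct my_dev *dev, int reg, int val);   /* made-up */
int verify_register(struct my_dev *dev, int reg, int val);  /* made-up */

static int write_reg_locked(struct my_dev *dev, int reg, int val)
{
    int ret;

    mutex_lock(&dev->lock);
    ret = write_register(dev, reg, val);
    if (ret)
        goto out;    /* previously this path returned without unlocking */
    ret = verify_register(dev, reg, val);
out:
    mutex_unlock(&dev->lock);   /* released on both paths now */
    return ret;
}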
Signed-off-by: Paul Fox <pgf@laptop.org>
Signed-off-by: Tai-hwa Liang <avatar@sentelic.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
skb->truesize might be big even for a small packet.
It's even bigger after commit 87fb4b7b53 (net: more accurate skb
truesize) and with a big MTU.
We should allow queueing at least one packet per receiver, even with a
low RCVBUF setting.
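A hedged sketch of the idea (not necessarily the exact kernel code): compare what is already queued against sk_rcvbuf, rather than queued plus the new skb's truesize, so the receiver can always hold at least one skb even when its truesize alone exceeds a small SO_RCVBUF.

#include <net/sock.h>

static bool rcvbuf_has_room(const struct sock *sk)
{
    /* the first packet is always accepted; only refuse once something
     * is already queued and the budget is exhausted */
    return atomic_read(&sk->sk_rmem_alloc) < sk->sk_rcvbuf;
}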
Reported-by: Michal Simek <monstr@monstr.eu>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Setting a large rps_flow_cnt like (1 << 30) on a 32-bit platform will
cause a kernel oops due to insufficient bounds checking.
    if (count > 1<<30) {
        /* Enforce a limit to prevent overflow */
        return -EINVAL;
    }
    count = roundup_pow_of_two(count);
    table = vmalloc(RPS_DEV_FLOW_TABLE_SIZE(count));
Note that the macro RPS_DEV_FLOW_TABLE_SIZE(count) is defined as:
... + (count * sizeof(struct rps_dev_flow))
where sizeof(struct rps_dev_flow) is 8. (1 << 30) * 8 will overflow
32 bits.
This patch replaces the magic number (1 << 30) with a symbolic bound.
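A hedged sketch of what such a bound can look like (the macro name and exact expression here are illustrative, not necessarily what the patch uses): derive the limit from the structure sizes so the multiplication inside RPS_DEV_FLOW_TABLE_SIZE() cannot overflow.

    #define RPS_DEV_FLOW_TABLE_MAX \
        ((ULONG_MAX - sizeof(struct rps_dev_flow_table)) / \
         sizeof(struct rps_dev_flow) / 2)

    if (count > RPS_DEV_FLOW_TABLE_MAX)
        return -EINVAL;   /* would overflow after rounding up */

    count = roundup_pow_of_two(count);
    table = vmalloc(RPS_DEV_FLOW_TABLE_SIZE(count));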
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Userspace may not provide TCA_OPTIONS; in fact, tc currently does
not do so if no arguments are specified on the command line.
Return EINVAL instead of panicking.
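A hedged sketch of the check (the surrounding mqprio_init() code is elided): reject a missing or too-short TCA_OPTIONS attribute up front instead of dereferencing it later.

    if (!opt || nla_len(opt) < sizeof(struct tc_mqprio_qopt))
        return -EINVAL;

    qopt = nla_data(opt);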
Signed-off-by: Thomas Graf <tgraf@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 618f9bc74a (net: Move mtu handling down to the protocol
depended handlers) forgot the bridge netfilter case, adding a NULL
dereference in ip_fragment().
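A hedged sketch of what such a method can look like (the other field values are illustrative): give the fake dst_ops an mtu() callback so dst_mtu() on the fake rtable no longer hits a NULL pointer.

#include <net/dst.h>
#include <linux/if_ether.h>

static unsigned int fake_mtu(const struct dst_entry *dst)
{
    return dst->dev->mtu;   /* just report the underlying device MTU */
}

static struct dst_ops fake_dst_ops = {
    .family   = AF_INET,
    .protocol = cpu_to_be16(ETH_P_IP),
    .mtu      = fake_mtu,
    /* ...the existing callbacks stay as they are... */
};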
Reported-by: Chris Boot <bootc@bootc.net>
CC: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'for-linus' of git://neil.brown.name/md:
md/bitmap: It is OK to clear bits during recovery.
md: don't give up looking for spares on first failure-to-add
md/raid5: ensure correct assessment of drives during degraded reshape.
md/linear: fix hot-add of devices to linear arrays.
commit d0a4bb4927 introduced a
regression which is annoying but fairly harmless.
When writing to an array that is undergoing recovery (a spare
is being integrated into the array), the write will
set bits in the bitmap, but they will not be cleared when the
write completes.
For bits covering areas that have not been recovered yet this is not a
problem, as the recovery will clear the bits. However, bits set in an
already-recovered region will stay set and never be cleared.
This doesn't risk data integrity. The only negatives are:
- next time there is a crash, more resyncing than necessary will
be done.
- the bitmap doesn't look clean, which is confusing.
While an array is recovering we don't want to update the
'events_cleared' setting in the bitmap, but we do still want to clear
bits that have very recently been set - provided they were written to
the recovering device.
So split those two needs - which previously both depended on 'success' -
and always clear the bit if the write went to all devices.
Signed-off-by: NeilBrown <neilb@suse.de>
Before performing a recovery we try to remove any spares that
might not be working, then add any that might have become relevant.
Currently we abort on the first spare that cannot be added.
This is a false optimisation.
It is conceivable that - depending on rules in the personality - a
subsequent spare might be accepted.
Also the loop does other things like count the available spares and
reset the 'recovery_offset' value.
If we abort early these might not happen properly.
So remove the early abort.
In particular, if you have an array that is undergoing recovery and
which has extra spares, then the recovery may not restart after a
reboot, as the count of 'spares' might end up as zero.
Reported-by: Anssi Hannula <anssi.hannula@iki.fi>
Signed-off-by: NeilBrown <neilb@suse.de>
While reshaping a degraded array (as when reshaping a RAID0 by first
converting it to a degraded RAID4) we currently get confused about
which devices are in_sync. In most cases we get it right, but in the
region that is being reshaped we need to treat non-failed devices as
in-sync when we have the data but haven't actually written it out yet.
Reported-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>