IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
In switched_from_dl() we have to issue a resched if we successfully
pulled some task from other cpus. This patch also aligns the behavior
with -rt.
Suggested-by: Juri Lelli <juri.lelli@arm.com>
Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Kirill Tkhai <ktkhai@parallels.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1414708776-124078-5-git-send-email-wanpeng.li@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This patch pushes task away if the dealine of the task is equal
to current during wake up. The same behavior as rt class.
Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@arm.com>
Cc: Kirill Tkhai <ktkhai@parallels.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1414708776-124078-4-git-send-email-wanpeng.li@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The yield semantic of deadline class is to reduce remaining runtime to
zero, and then update_curr_dl() will stop it. However, comsumed bandwidth
is reduced from the budget of yield task again even if it has already been
set to zero which leads to artificial overrun. This patch fix it by make
sure we don't steal some more time from the task that yielded in update_curr_dl().
Suggested-by: Juri Lelli <juri.lelli@arm.com>
Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Kirill Tkhai <ktkhai@parallels.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1414708776-124078-2-git-send-email-wanpeng.li@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This patch checks if current can be pushed/pulled somewhere else
in advance to make logic clear, the same behavior as dl class.
- If current can't be migrated, useless to reschedule, let's hope
task can move out.
- If task is migratable, so let's not schedule it and see if it
can be pushed or pulled somewhere else.
Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@arm.com>
Cc: Kirill Tkhai <ktkhai@parallels.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1414708776-124078-1-git-send-email-wanpeng.li@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
As per commit f10e00f4bf36 ("sched/dl: Use dl_bw_of() under
rcu_read_lock_sched()"), dl_bw_of() has to be protected by
rcu_read_lock_sched().
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1414497286-28824-1-git-send-email-juri.lelli@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Idle cpu is idler than non-idle cpu, so we needn't search for least_loaded_cpu
after we have found an idle cpu.
Signed-off-by: Yao Dongdong <yaodongdong@huawei.com>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1414469286-6023-1-git-send-email-yaodongdong@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Currently used hrtimer_try_to_cancel() is racy:
raw_spin_lock(&rq->lock)
... dl_task_timer raw_spin_lock(&rq->lock)
... raw_spin_lock(&rq->lock) ...
switched_from_dl() ... ...
hrtimer_try_to_cancel() ... ...
switched_to_fair() ... ...
... ... ...
... ... ...
raw_spin_unlock(&rq->lock) ... (asquired)
... ... ...
... ... ...
do_exit() ... ...
schedule() ... ...
raw_spin_lock(&rq->lock) ... raw_spin_unlock(&rq->lock)
... ... ...
raw_spin_unlock(&rq->lock) ... raw_spin_lock(&rq->lock)
... ... (asquired)
put_task_struct() ... ...
free_task_struct() ... ...
... ... raw_spin_unlock(&rq->lock)
... (asquired) ...
... ... ...
... (use after free) ...
So, let's implement 100% guaranteed way to cancel the timer and let's
be sure we are safe even in very unlikely situations.
rq unlocking does not limit the area of switched_from_dl() use, because
this has already been possible in pull_dl_task() below.
Let's consider the safety of of this unlocking. New code in the patch
is working when hrtimer_try_to_cancel() fails. This means the callback
is running. In this case hrtimer_cancel() is just waiting till the
callback is finished. Two
1) Since we are in switched_from_dl(), new class is not dl_sched_class and
new prio is not less MAX_DL_PRIO. So, the callback returns early; it's
right after !dl_task() check. After that hrtimer_cancel() returns back too.
The above is:
raw_spin_lock(rq->lock); ...
... dl_task_timer()
... raw_spin_lock(rq->lock);
switched_from_dl() ...
hrtimer_try_to_cancel() ...
raw_spin_unlock(rq->lock); ...
hrtimer_cancel() ...
... raw_spin_unlock(rq->lock);
... return HRTIMER_NORESTART;
... ...
raw_spin_lock(rq->lock); ...
2) But the below is also possible:
dl_task_timer()
raw_spin_lock(rq->lock);
...
raw_spin_unlock(rq->lock);
raw_spin_lock(rq->lock); ...
switched_from_dl() ...
hrtimer_try_to_cancel() ...
... return HRTIMER_NORESTART;
raw_spin_unlock(rq->lock); ...
hrtimer_cancel(); ...
raw_spin_lock(rq->lock); ...
In this case hrtimer_cancel() returns immediately. Very unlikely case,
just to mention.
Nobody can manipulate the task, because check_class_changed() is
always called with pi_lock locked. Nobody can force the task to
participate in (concurrent) priority inheritance schemes (the same reason).
All concurrent task operations require pi_lock, which is held by us.
No deadlocks with dl_task_timer() are possible, because it returns
right after !dl_task() check (it does nothing).
If we receive a new dl_task during the time of unlocked rq, we just
don't have to do pull_dl_task() in switched_from_dl() further.
Signed-off-by: Kirill Tkhai <ktkhai@parallels.com>
[ Added comments]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Juri Lelli <juri.lelli@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1414420852.19914.186.camel@tkhai
Signed-off-by: Ingo Molnar <mingo@kernel.org>
In some cases this can trigger a true flood of output.
Requested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The kauditd_thread wait loop is a bit iffy; it has a number of problems:
- calls try_to_freeze() before schedule(); you typically want the
thread to re-evaluate the sleep condition when unfreezing, also
freeze_task() issues a wakeup.
- it unconditionally does the {add,remove}_wait_queue(), even when the
sleep condition is false.
Use wait_event_freezable() that does the right thing.
Reported-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Eric Paris <eparis@redhat.com>
Cc: oleg@redhat.com
Cc: Eric Paris <eparis@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20141002102251.GA6324@worktop.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
There is a race between kthread_stop() and the new wait_woken() that
can result in a lack of progress.
CPU 0 | CPU 1
|
rfcomm_run() | kthread_stop()
... |
if (!test_bit(KTHREAD_SHOULD_STOP)) |
| set_bit(KTHREAD_SHOULD_STOP)
| wake_up_process()
wait_woken() | wait_for_completion()
set_current_state(INTERRUPTIBLE) |
if (!WQ_FLAG_WOKEN) |
schedule_timeout() |
|
After which both tasks will wait.. forever.
Fix this by having wait_woken() check for kthread_should_stop() but
only for kthreads (obviously).
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Peter Hurley <peter@hurleysoftware.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
sched_move_task() is the only interface to change sched_task_group:
cpu_cgrp_subsys methods and autogroup_move_group() use it.
Everything is synchronized by task_rq_lock(), so cpu_cgroup_attach()
is ordered with other users of sched_move_task(). This means we do no
need RCU here: if we've dereferenced a tg here, the .attach method
hasn't been called for it yet.
Thus, we should pass "true" to task_css_check() to silence lockdep
warnings.
Fixes: eeb61e53ea19 ("sched: Fix race between task_group and sched_task_group")
Reported-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Kirill Tkhai <ktkhai@parallels.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1414473874.8574.2.camel@tkhai
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Commit 38706bc5a29a (rcutorture: Add callback-flood test) vmalloc()ed
a bunch of RCU callbacks, but failed to free them. This commit fixes
that oversight.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
Add early boot self tests for RCU under CONFIG_PROVE_RCU.
Currently the only test is adding a dummy callback which increments a counter
which we then later verify after calling rcu_barrier*().
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
A long string of get_online_cpus() with each followed by a
put_online_cpu() that fails to acquire cpu_hotplug.lock can result in
overflow of the cpu_hotplug.puts_pending counter. Although this is
perhaps improbably, a system with absolutely no CPU-hotplug operations
will have an arbitrarily long time in which this overflow could occur.
This commit therefore adds overflow checks to get_online_cpus() and
try_get_online_cpus().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
The "cpu" argument to rcu_cleanup_after_idle() is always the current
CPU, so drop it. This moves the smp_processor_id() from the caller to
rcu_cleanup_after_idle(), saving argument-passing overhead. Again,
the anticipated cross-CPU uses of these functions has been replaced
by NO_HZ_FULL.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
The "cpu" argument to rcu_prepare_for_idle() is always the current
CPU, so drop it. This in turn allows two of the uses of "cpu" in
this function to be replaced with a this_cpu_ptr() and the third by
smp_processor_id(), replacing that of the call to rcu_prepare_for_idle().
Again, the anticipated cross-CPU uses of these functions has been replaced
by NO_HZ_FULL.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
The "cpu" argument to rcu_needs_cpu() is always the current CPU, so drop
it. This in turn allows the "cpu" argument to rcu_cpu_has_callbacks()
to be removed, which allows the uses of "cpu" in both functions to be
replaced with a this_cpu_ptr(). Again, the anticipated cross-CPU uses
of these functions has been replaced by NO_HZ_FULL.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
The "cpu" argument to rcu_note_context_switch() is always the current
CPU, so drop it. This in turn allows the "cpu" argument to
rcu_preempt_note_context_switch() to be removed, which allows the sole
use of "cpu" in both functions to be replaced with a this_cpu_ptr().
Again, the anticipated cross-CPU uses of these functions has been
replaced by NO_HZ_FULL.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
Because rcu_preempt_check_callbacks()'s argument is guaranteed to
always be the current CPU, drop the argument and replace per_cpu()
with __this_cpu_read().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
Because rcu_pending()'s argument is guaranteed to always be the current
CPU, drop the argument and replace per_cpu_ptr() with this_cpu_ptr().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
The "cpu" argument was kept around on the off-chance that RCU might
offload scheduler-clock interrupts. However, this offload approach
has been replaced by NO_HZ_FULL, which offloads -all- RCU processing
from qualifying CPUs. It is therefore time to remove the "cpu" argument
to rcu_check_callbacks(), which this commit does.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
The rcu_data per-CPU variable has a number of fields that are atomically
manipulated, potentially by any CPU. This situation can result in false
sharing with per-CPU variables that have the misfortune of being allocated
adjacent to rcu_data in memory. This commit therefore changes the
DEFINE_PER_CPU() to DEFINE_PER_CPU_SHARED_ALIGNED() in order to avoid
this false sharing.
Reported-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Christoph Lameter <cl@linux.com>
Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
For some functions in kernel/rcu/tree* the rdtp parameter is always
this_cpu_ptr(rdtp). Remove the parameter if constant and calculate the
pointer in function.
This will have the advantage that it is obvious that the address are
all per cpu offsets and thus it will enable the use of this_cpu_ops in
the future.
Signed-off-by: Christoph Lameter <cl@linux.com>
[ paulmck: Forward-ported to rcu/dev, whitespace adjustment. ]
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
This patch migrates swsusp_show_speed and its callers to using ktime_t instead
of 'struct timeval' which suffers from the y2038 problem.
Changes to swsusp_show_speed:
- use ktime_t for start and stop times
- pass start and stop times by value
Calling functions affected:
- load_image
- load_image_lzo
- save_image
- save_image_lzo
- hibernate_preallocate_memory
Design decisions:
- use ktime_t to preserve same granularity of reporting as before
- use centisecs logic as before to avoid 'div by zero' issues caused by
using seconds and nanoseconds directly
- use monotonic time (ktime_get()) since we only care about elapsed time.
Signed-off-by: Tina Ruchandani <ruchandani.tina@gmail.com>
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
- Fix a crash on r8a7791/koelsch during resume from system suspend
caused by a recent cpufreq-dt commit (Geert Uytterhoeven).
- Fix an MFD enumeration problem introduced by a recent commit
adding ACPI support to the MFD subsystem that exposed a weakness
in the ACPI core causing ACPI enumeration to be applied to all
devices associated with one ACPI companion object, although it
should be used for one of them only (Mika Westerberg).
- Fix an ACPI EC regression introduced during the 3.17 cycle
causing some Samsung laptops to misbehave as a result of a
workaround targeted at some Acer machines. That includes
a revert of a commit that went too far and a quirk for the
Acer machines in question. From Lv Zheng.
- Fix a regression in the system suspend error code path introduced
during the 3.15 cycle that causes it to fail to take errors from
asychronous execution of "late" suspend callbacks into account
(Imre Deak).
- Fix a long-standing bug in the hibernation resume error code path
that fails to roll back everything correcty on "freeze" callback
errors and leaves some devices in a "suspended" state causing more
breakage to happen subsequently (Imre Deak).
- Make the cpufreq-dt driver disable operation performance points
that are not supported by the VR connected to the CPU voltage
plane with acceptable tolerance instead of constantly failing
voltage scaling later on (Lucas Stach).
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJUVAuPAAoJEILEb/54YlRxGfQP/0nFfTqyuDN8cPA2qRIzDIoi
8PTOzlhrRuzUlpMkYdsDijxwFcK2/59LomwtuAKHi7309N6UzUa8vAkb8WrzpY7m
XUU+fhsLEkDnEczMfgmbP5ljtP75eJSSWRO0WIBuk4k79qcsutLNtgGpJV7feYSv
+t7OE9DrBPM8lSpBKM/4qs5gnXzdaWmi4xGH7upQWyxAC6RG9GosKdDUZxVxSJQt
oy/y0O4oxwyjg+8EvPwd22JtoFJ6axoEwCJXXlkn7NbIQNGtxrMR9zcMglsuOklg
bG93g1xJl4YCwLXV8sKfPU2kQkQ1ISY3rYIkwIjvBNIY4QFsQpCg3GYt08OJI0bO
4wDD7kH8C51aD9Zfi9luCdE4MsMyGB7SeNvQJul5uMujuG9ZeI61a8d7P6fmXu5X
lk+GeNl/rMujaESwqQlNgm3DvSYfc5FFEDC6F4Wcu4koomSlJwj//lMlOg2ajIgz
p5En6FeC8yGTuobGqo2dT7yYjmxm+kdX+gTStsto+hkxWA7beNjI1iXXWwPrQa/F
7pzneSrdbTZVdzZ1F9eR9AcGljhRMLBxs2XembXgkviCv+IVjw4qHWWKveDQKkhG
CVtcd3jrFSRHeAaqVNnbsoMu2nOLRY2W+f2+FNEfYKc+13aDJYm7pyAOIjujY7ns
Q1jSP7ZZQBVlxP5j5W5x
=g4QU
-----END PGP SIGNATURE-----
Merge tag 'pm+acpi-3.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI and power management fixes from Rafael Wysocki:
"These are fixes received after my previous pull request plus one that
has been in the works for quite a while, but its previous version
caused problems to happen, so it's been deferred till now.
Fixed are two recent regressions (MFD enumeration and cpufreq-dt),
ACPI EC regression introduced in 3.17, system suspend error code path
regression introduced in 3.15, an older bug related to recovery from
failing resume from hibernation and a cpufreq-dt driver issue related
to operation performance points.
Specifics:
- Fix a crash on r8a7791/koelsch during resume from system suspend
caused by a recent cpufreq-dt commit (Geert Uytterhoeven).
- Fix an MFD enumeration problem introduced by a recent commit adding
ACPI support to the MFD subsystem that exposed a weakness in the
ACPI core causing ACPI enumeration to be applied to all devices
associated with one ACPI companion object, although it should be
used for one of them only (Mika Westerberg).
- Fix an ACPI EC regression introduced during the 3.17 cycle causing
some Samsung laptops to misbehave as a result of a workaround
targeted at some Acer machines. That includes a revert of a commit
that went too far and a quirk for the Acer machines in question.
From Lv Zheng.
- Fix a regression in the system suspend error code path introduced
during the 3.15 cycle that causes it to fail to take errors from
asychronous execution of "late" suspend callbacks into account
(Imre Deak).
- Fix a long-standing bug in the hibernation resume error code path
that fails to roll back everything correcty on "freeze" callback
errors and leaves some devices in a "suspended" state causing more
breakage to happen subsequently (Imre Deak).
- Make the cpufreq-dt driver disable operation performance points
that are not supported by the VR connected to the CPU voltage plane
with acceptable tolerance instead of constantly failing voltage
scaling later on (Lucas Stach)"
* tag 'pm+acpi-3.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / EC: Fix regression due to conflicting firmware behavior between Samsung and Acer.
Revert "ACPI / EC: Add support to disallow QR_EC to be issued before completing previous QR_EC"
cpufreq: cpufreq-dt: Restore default cpumask_setall(policy->cpus)
PM / Sleep: fix recovery during resuming from hibernation
PM / Sleep: fix async suspend_late/freeze_late error handling
ACPI: Use ACPI companion to match only the first physical device
cpufreq: cpufreq-dt: disable unsupported OPPs
Pull networking fixes from David Miller:
"A bit has accumulated, but it's been a week or so since my last batch
of post-merge-window fixes, so...
1) Missing module license in netfilter reject module, from Pablo.
Lots of people ran into this.
2) Off by one in mac80211 baserate calculation, from Karl Beldan.
3) Fix incorrect return value from ax88179_178a driver's set_mac_addr
op, which broke use of it with bonding. From Ian Morgan.
4) Checking of skb_gso_segment()'s return value was not all
encompassing, it can return an SKB pointer, a pointer error, or
NULL. Fix from Florian Westphal.
This is crummy, and longer term will be fixed to just return error
pointers or a real SKB.
6) Encapsulation offloads not being handled by
skb_gso_transport_seglen(). From Florian Westphal.
7) Fix deadlock in TIPC stack, from Ying Xue.
8) Fix performance regression from using rhashtable for netlink
sockets. The problem was the synchronize_net() invoked for every
socket destroy. From Thomas Graf.
9) Fix bug in eBPF verifier, and remove the strong dependency of BPF
on NET. From Alexei Starovoitov.
10) In qdisc_create(), use the correct interface to allocate
->cpu_bstats, otherwise the u64_stats_sync member isn't
initialized properly. From Sabrina Dubroca.
11) Off by one in ip_set_nfnl_get_byindex(), from Dan Carpenter.
12) nf_tables_newchain() was erroneously expecting error pointers from
netdev_alloc_pcpu_stats(). It only returna a valid pointer or
NULL. From Sabrina Dubroca.
13) Fix use-after-free in _decode_session6(), from Li RongQing.
14) When we set the TX flow hash on a socket, we mistakenly do so
before we've nailed down the final source port. Move the setting
deeper to fix this. From Sathya Perla.
15) NAPI budget accounting in amd-xgbe driver was counting descriptors
instead of full packets, fix from Thomas Lendacky.
16) Fix total_data_buflen calculation in hyperv driver, from Haiyang
Zhang.
17) Fix bcma driver build with OF_ADDRESS disabled, from Hauke
Mehrtens.
18) Fix mis-use of per-cpu memory in TCP md5 code. The problem is
that something that ends up being vmalloc memory can't be passed
to the crypto hash routines via scatter-gather lists. From Eric
Dumazet.
19) Fix regression in promiscuous mode enabling in cdc-ether, from
Olivier Blin.
20) Bucket eviction and frag entry killing can race with eachother,
causing an unlink of the object from the wrong list. Fix from
Nikolay Aleksandrov.
21) Missing initialization of spinlock in cxgb4 driver, from Anish
Bhatt.
22) Do not cache ipv4 routing failures, otherwise if the sysctl for
forwarding is subsequently enabled this won't be seen. From
Nicolas Cavallari"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (131 commits)
drivers: net: cpsw: Support ALLMULTI and fix IFF_PROMISC in switch mode
drivers: net: cpsw: Fix broken loop condition in switch mode
net: ethtool: Return -EOPNOTSUPP if user space tries to read EEPROM with lengh 0
stmmac: pci: set default of the filter bins
net: smc91x: Fix gpios for device tree based booting
mpls: Allow mpls_gso to be built as module
mpls: Fix mpls_gso handler.
r8152: stop submitting intr for -EPROTO
netfilter: nft_reject_bridge: restrict reject to prerouting and input
netfilter: nft_reject_bridge: don't use IP stack to reject traffic
netfilter: nf_reject_ipv6: split nf_send_reset6() in smaller functions
netfilter: nf_reject_ipv4: split nf_send_reset() in smaller functions
netfilter: nf_tables_bridge: update hook_mask to allow {pre,post}routing
drivers/net: macvtap and tun depend on INET
drivers/net, ipv6: Select IPv6 fragment idents for virtio UFO packets
drivers/net: Disable UFO through virtio
net: skb_fclone_busy() needs to detect orphaned skb
gre: Use inner mac length when computing tunnel length
mlx4: Avoid leaking steering rules on flow creation error flow
net/mlx4_en: Don't attempt to TX offload the outer UDP checksum for VXLAN
...
Pull scheduler fixes from Ingo Molnar:
"Various scheduler fixes all over the place: three SCHED_DL fixes,
three sched/numa fixes, two generic race fixes and a comment fix"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/dl: Fix preemption checks
sched: Update comments for CLONE_NEWNS
sched: stop the unbound recursion in preempt_schedule_context()
sched/fair: Fix division by zero sysctl_numa_balancing_scan_size
sched/fair: Care divide error in update_task_scan_period()
sched/numa: Fix unsafe get_task_struct() in task_numa_assign()
sched/deadline: Fix races between rt_mutex_setprio() and dl_task_timer()
sched/deadline: Don't replenish from a !SCHED_DEADLINE entity
sched: Fix race between task_group and sched_task_group
Pull perf fixes from Ingo Molnar:
"Mostly tooling fixes, plus on the kernel side:
- a revert for a newly introduced PMU driver which isn't complete yet
and where we ran out of time with fixes (to be tried again in
v3.19) - this makes up for a large chunk of the diffstat.
- compilation warning fixes
- a printk message fix
- event_idx usage fixes/cleanups"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf probe: Trivial typo fix for --demangle
perf tools: Fix report -F dso_from for data without branch info
perf tools: Fix report -F dso_to for data without branch info
perf tools: Fix report -F symbol_from for data without branch info
perf tools: Fix report -F symbol_to for data without branch info
perf tools: Fix report -F mispredict for data without branch info
perf tools: Fix report -F in_tx for data without branch info
perf tools: Fix report -F abort for data without branch info
perf tools: Make CPUINFO_PROC an array to support different kernel versions
perf callchain: Use global caching provided by libunwind
perf/x86/intel: Revert incomplete and undocumented Broadwell client support
perf/x86: Fix compile warnings for intel_uncore
perf: Fix typos in sample code in the perf_event.h header
perf: Fix and clean up initialization of pmu::event_idx
perf: Fix bogus kernel printk
perf diff: Add missing hists__init() call at tool start
Pull futex fixes from Ingo Molnar:
"This contains two futex fixes: one fixes a race condition, the other
clarifies shared/private futex comments"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
futex: Fix a race condition between REQUEUE_PI and task death
futex: Mention key referencing differences between shared and private futexes
Pull core fixes from Ingo Molnar:
"The tree contains two RCU fixes and a compiler quirk comment fix"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
rcu: Make rcu_barrier() understand about missing rcuo kthreads
compiler/gcc4+: Remove inaccurate comment about 'asm goto' miscompiles
rcu: More on deadlock between CPU hotplug and expedited grace periods
Pull timer fixes from Thomas Gleixner:
"As you requested in the rc2 release mail the timer department serves
you a few real bug fixes:
- Fix the probe logic of the architected arm/arm64 timer
- Plug a stack info leak in posix-timers
- Prevent a shift out of bounds issue in the clockevents core"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
ARM/ARM64: arch-timer: fix arch_timer_probed logic
clockevents: Prevent shift out of bounds
posix-timers: Fix stack info leak in timer_create()
tracing system does not support that and without checks, it can cause
an oops to be reported.
Rabin Vincent added checks in the return code on syscall events to make
sure that the system call number is within the range that tracing
knows about, and if not, simply ignores the system call.
The system call tracing infrastructure needs to be rewritten to handle these
cases better, but for now, to keep from oopsing, this patch will do.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJUUt+4AAoJEEjnJuOKh9ld3HgH/0RL7neY1tp05+v0GRvABmGr
6T47GEmZi9NiQOWjFC4SxNHLQSjpQX7eLD2CC6bljDfFpgKiIqarWHegEBUoBF9K
Dlg2jPpCwwwKbTXlAKTmv9QTGzvBEYyVZxhSC7mEbziV4Rbt7CVZJlogVdeYP5y0
4mWyHJg11Dt9SiZJCIv8sIrx2Xka2eX+Aq30dwYd9JGco3vVCH8NZ09ZgYBHaxIm
YrL6yUVnHP3nqKiEL4qCMUqUzexzdwUhrGPddLANaSRTWT+EAGYPD113bA76jAKc
cd3eaFwFkmCA0yfmjjBSb23FsPvKHc7j6BtZA6Q3uKPZUVlX+DyVNisUfEnaLQs=
=9NTR
-----END PGP SIGNATURE-----
Merge tag 'trace-fixes-v3.18-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fix from Steven Rostedt:
"ARM has system calls outside the NR_syscalls range, and the generic
tracing system does not support that and without checks, it can cause
an oops to be reported.
Rabin Vincent added checks in the return code on syscall events to
make sure that the system call number is within the range that tracing
knows about, and if not, simply ignores the system call.
The system call tracing infrastructure needs to be rewritten to handle
these cases better, but for now, to keep from oopsing, this patch will
do"
* tag 'trace-fixes-v3.18-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing/syscalls: Ignore numbers outside NR_syscalls' range
The file /sys/kernel/debug/tracing/eneabled_functions is used to debug
ftrace function hooks. Add to the output what function is being called
by the trampoline if the arch supports it.
Add support for this feature in x86_64.
Cc: H. Peter Anvin <hpa@linux.intel.com>
Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Tested-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
The current method of handling multiple function callbacks is to register
a list function callback that calls all the other callbacks based on
their hash tables and compare it to the function that the callback was
called on. But this is very inefficient.
For example, if you are tracing all functions in the kernel and then
add a kprobe to a function such that the kprobe uses ftrace, the
mcount trampoline will switch from calling the function trace callback
to calling the list callback that will iterate over all registered
ftrace_ops (in this case, the function tracer and the kprobes callback).
That means for every function being traced it checks the hash of the
ftrace_ops for function tracing and kprobes, even though the kprobes
is only set at a single function. The kprobes ftrace_ops is checked
for every function being traced!
Instead of calling the list function for functions that are only being
traced by a single callback, we can call a dynamically allocated
trampoline that calls the callback directly. The function graph tracer
already uses a direct call trampoline when it is being traced by itself
but it is not dynamically allocated. It's trampoline is static in the
kernel core. The infrastructure that called the function graph trampoline
can also be used to call a dynamically allocated one.
For now, only ftrace_ops that are not dynamically allocated can have
a trampoline. That is, users such as function tracer or stack tracer.
kprobes and perf allocate their ftrace_ops, and until there's a safe
way to free the trampoline, it can not be used. The dynamically allocated
ftrace_ops may, although, use the trampoline if the kernel is not
compiled with CONFIG_PREEMPT. But that will come later.
Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Tested-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
ARM has some private syscalls (for example, set_tls(2)) which lie
outside the range of NR_syscalls. If any of these are called while
syscall tracing is being performed, out-of-bounds array access will
occur in the ftrace and perf sys_{enter,exit} handlers.
# trace-cmd record -e raw_syscalls:* true && trace-cmd report
...
true-653 [000] 384.675777: sys_enter: NR 192 (0, 1000, 3, 4000022, ffffffff, 0)
true-653 [000] 384.675812: sys_exit: NR 192 = 1995915264
true-653 [000] 384.675971: sys_enter: NR 983045 (76f74480, 76f74000, 76f74b28, 76f74480, 76f76f74, 1)
true-653 [000] 384.675988: sys_exit: NR 983045 = 0
...
# trace-cmd record -e syscalls:* true
[ 17.289329] Unable to handle kernel paging request at virtual address aaaaaace
[ 17.289590] pgd = 9e71c000
[ 17.289696] [aaaaaace] *pgd=00000000
[ 17.289985] Internal error: Oops: 5 [#1] PREEMPT SMP ARM
[ 17.290169] Modules linked in:
[ 17.290391] CPU: 0 PID: 704 Comm: true Not tainted 3.18.0-rc2+ #21
[ 17.290585] task: 9f4dab00 ti: 9e710000 task.ti: 9e710000
[ 17.290747] PC is at ftrace_syscall_enter+0x48/0x1f8
[ 17.290866] LR is at syscall_trace_enter+0x124/0x184
Fix this by ignoring out-of-NR_syscalls-bounds syscall numbers.
Commit cd0980fc8add "tracing: Check invalid syscall nr while tracing syscalls"
added the check for less than zero, but it should have also checked
for greater than NR_syscalls.
Link: http://lkml.kernel.org/p/1414620418-29472-1-git-send-email-rabin@rab.in
Fixes: cd0980fc8add "tracing: Check invalid syscall nr while tracing syscalls"
Cc: stable@vger.kernel.org # 2.6.33+
Signed-off-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Add a space between subj= and feature= fields to make them parsable.
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
verifier keeps track of register state spilled to stack.
registers are 8-byte wide and always aligned, so instead of tracking them
in every byte-sized stack slot, use MAX_BPF_STACK / 8 array to track
spilled register state.
Though verifier runs in user context and its state freed immediately
after verification, it makes sense to reduce its memory usage.
This optimization reduces sizeof(struct verifier_state)
from 12464 to 1712 on 64-bit and from 6232 to 1112 on 32-bit.
Note, this patch doesn't change existing limits, which are there to bound
time and memory during verification: 4k total number of insns in a program,
1k number of jumps (states to visit) and 32k number of processed insn
(since an insn may be visited multiple times). Theoretical worst case memory
during verification is 1712 * 1k = 17Mbyte. Out-of-memory situation triggers
cleanup and rejects the program.
Suggested-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull two RCU fixes from Paul E. McKenney:
" - Complete the work of commit dd56af42bd82 (rcu: Eliminate deadlock
between CPU hotplug and expedited grace periods), which was
intended to allow synchronize_sched_expedited() to be safely
used when holding locks acquired by CPU-hotplug notifiers.
This commit makes the put_online_cpus() avoid the deadlock
instead of just handling the get_online_cpus().
- Complete the work of commit 35ce7f29a44a (rcu: Create rcuo
kthreads only for onlined CPUs), which was intended to allow
RCU to avoid allocating unneeded kthreads on systems where the
firmware says that there are more CPUs than are really present.
This commit makes rcu_barrier() aware of the mismatch, so that
it doesn't hang waiting for non-existent CPUs. "
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Found this in the message log on a s390 system:
BUG kmalloc-192 (Not tainted): Poison overwritten
Disabling lock debugging due to kernel taint
INFO: 0x00000000684761f4-0x00000000684761f7. First byte 0xff instead of 0x6b
INFO: Allocated in call_usermodehelper_setup+0x70/0x128 age=71 cpu=2 pid=648
__slab_alloc.isra.47.constprop.56+0x5f6/0x658
kmem_cache_alloc_trace+0x106/0x408
call_usermodehelper_setup+0x70/0x128
call_usermodehelper+0x62/0x90
cgroup_release_agent+0x178/0x1c0
process_one_work+0x36e/0x680
worker_thread+0x2f0/0x4f8
kthread+0x10a/0x120
kernel_thread_starter+0x6/0xc
kernel_thread_starter+0x0/0xc
INFO: Freed in call_usermodehelper_exec+0x110/0x1b8 age=71 cpu=2 pid=648
__slab_free+0x94/0x560
kfree+0x364/0x3e0
call_usermodehelper_exec+0x110/0x1b8
cgroup_release_agent+0x178/0x1c0
process_one_work+0x36e/0x680
worker_thread+0x2f0/0x4f8
kthread+0x10a/0x120
kernel_thread_starter+0x6/0xc
kernel_thread_starter+0x0/0xc
There is a use-after-free bug on the subprocess_info structure allocated
by the user mode helper. In case do_execve() returns with an error
____call_usermodehelper() stores the error code to sub_info->retval, but
sub_info can already have been freed.
Regarding UMH_NO_WAIT, the sub_info structure can be freed by
__call_usermodehelper() before the worker thread returns from
do_execve(), allowing memory corruption when do_execve() failed after
exec_mmap() is called.
Regarding UMH_WAIT_EXEC, the call to umh_complete() allows
call_usermodehelper_exec() to continue which then frees sub_info.
To fix this race the code needs to make sure that the call to
call_usermodehelper_freeinfo() is always done after the last store to
sub_info->retval.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Following up the arm testing of gcov, turns out gcov on ARM64 works fine
as well. Only change needed is adding ARM64 to Kconfig depends.
Tested with qemu and mach-virt
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Acked-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 35ce7f29a44a (rcu: Create rcuo kthreads only for onlined CPUs)
contains checks for the case where CPUs are brought online out of
order, re-wiring the rcuo leader-follower relationships as needed.
Unfortunately, this rewiring was broken. This apparently went undetected
due to the tendency of systems to bring CPUs online in order. This commit
nevertheless fixes the rewiring.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
If a no-CBs CPU were to post an RCU callback with interrupts disabled
after it entered the idle loop for the last time, there might be no
deferred wakeup for the corresponding rcuo kthreads. This commit
therefore adds a set of calls to do_nocb_deferred_wakeup() after the
CPU has gone completely offline.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
PREEMPT_RCU and TREE_PREEMPT_RCU serve the same function after
TINY_PREEMPT_RCU has been removed. This patch removes TREE_PREEMPT_RCU
and uses PREEMPT_RCU config option in its place.
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Rename CONFIG_RCU_BOOST_PRIO to CONFIG_RCU_KTHREAD_PRIO and use this
value for both the per-CPU kthreads (rcuc/N) and the rcu boosting
threads (rcub/n).
Also, create the module_parameter rcutree.kthread_prio to be used on
the kernel command line at boot to set a new value (rcutree.kthread_prio=N).
Signed-off-by: Clark Williams <clark.williams@gmail.com>
[ paulmck: Ported to rcu/dev, applied Paul Bolle and Peter Zijlstra feedback. ]
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
__cleanup_sighand() frees sighand without RCU grace period. This is
correct but this looks "obviously buggy" and constantly confuses the
readers, add the comments to explain how this works.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
The kill_pid_info() can potentially loop indefinitely if tasks are created
and deleted sufficiently quickly, and if this happens, this function
will remain in a single RCU read-side critical section indefinitely.
This commit therefore exits the RCU read-side critical section on each
pass through the loop. Because a race must happen to retry the loop,
this should have no performance impact in the common case.
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
During the 3.18 merge period additional __get_cpu_var uses were
added. The patch converts these to this_cpu_ptr().
Signed-off-by: Christoph Lameter <cl@linux.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Tejun Heo <tj@kernel.org>