IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmBXahgQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgppMVEAC+Kn8AmNPbV7/AX3jfZYEh1UwyPetpJQ2m
FiWkXnuG85kM3UD12S5RYEYkHxzSob2d1yfZ+kL1TAkVJaz3FVoUU9ms0guXfCNb
l8k5fgK2zlegCyBIsPnouR/zV4Y/GJjf+tY0/c1e2Ovfl1zjCW486PvwjJzjMy8b
rXUi3MMKB3JPltML152qi9S1lJJuIHMB22ZUdTiyX+u4RtCzvGHGZmlpb4sw73RF
IRN7qBDYy5Pth+PCUBrhveIPmF/QSKhPHTarczIkgqSw/fSslsgEdBe88fxBDfbf
+WIaYifwqDongT4wkboXFUPTkSUlA+TbvnMW6dRZJTJvRspKz0SV4l+xC/QvT231
JqHqvRk2FkdVlpfXBvdVz94jLFiBJSl02QqTseQGbRdFY4BvxqkC15z4HkPdldJ8
QM2+6ZfzVWbzZkssgK42kTuDq9EX5Ks/+rOkIM/z2L5D00sbeeCVGCeNXf3uS7So
s7pskeTOLoXSvTpwzzEBEpJ6ebU698B1hx++Hjuy95Zifs2holkHXu36wvYmWFDm
CmxZ48waSQJq/emjbOSYfJthKc/TmaUzocsnMvSA5eoCmP445OUQJJTfifEj50if
/k0+XTi1DOrYHyy8R7a8T7xXDJIlMGY7fZyvmzopfRlJHnaHkeBfpbSaPCZXoAiJ
8T/mkYohAw==
=xaEf
-----END PGP SIGNATURE-----
Merge tag 'io_uring-5.12-2021-03-21' of git://git.kernel.dk/linux-block
Pull io_uring followup fixes from Jens Axboe:
- The SIGSTOP change from Eric, so we properly ignore that for
PF_IO_WORKER threads.
- Disallow sending signals to PF_IO_WORKER threads in general, we're
not interested in having them funnel back to the io_uring owning
task.
- Stable fix from Stefan, ensuring we properly break links for short
send/sendmsg recv/recvmsg if MSG_WAITALL is set.
- Catch and loop when needing to run task_work before a PF_IO_WORKER
threads goes to sleep.
* tag 'io_uring-5.12-2021-03-21' of git://git.kernel.dk/linux-block:
io_uring: call req_set_fail_links() on short send[msg]()/recv[msg]() with MSG_WAITALL
io-wq: ensure task is running before processing task_work
signal: don't allow STOP on PF_IO_WORKER threads
signal: don't allow sending any signals to PF_IO_WORKER threads
plus a DocBook fix.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmBXMJIRHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1jl4BAAoDqrifzbrgC3wylJALpEAYnPJ1uPdAKP
tE1O1wPoJhb9P2b5ktWUiRzrAx9wpRD3Z3nIxsGgUAC1G/StJ9mF/XgigF0QSAFl
rn49iey6XljcB9prBpFnFkS9C4LmYX4P+0KDImerriSI2rHE/jlhBZrhlQRKTfcj
tHssqsu4i0ZH/O2xmOd0wOeDXiF/EkQX1FFekjfxFa+1xACW979Ucf8RTWjfhkVl
Dtvort/WC/VDzDXH+B0uPVGornTjZL6U6YcsmXu8EmXNo2htgHSkUBvLDMEs/T1q
vtkoTzoz4nrndSCDzSLZJOgp/qCn8Nf2iYesxzV8EICOj6ZDSqpOFIBH/dI0Swvi
8mUzzLRJ4Tb/ng806DBBxZw80q3SWt5VngBZjW37cSyIDtFRvdsp8F/VavBTvPx8
7rleLF0vftWTVVSiBluzZQiIb7wYqr/zQT9Umne/DfvPCqZi9GnJLcBU50Sg/fEB
cAMc8D6jYkoHiYT3eHr/O7QxNyyf7kaMfNMZV0Io71WTYudCvQOPTF055fWLD1+w
zc0MTuIWl+wkLlV9XQ8y9ol/frpN97tHRBOHSiukcci+7YVQwB4J6hla7094GpLl
6zNqQza2QrGtAX9lbwLlXGdnAqOQExyu+sGHZS7IdUUgj2z047iFzOPepWqqYimL
RHO/DJLSGqI=
=IkEX
-----END PGP SIGNATURE-----
Merge tag 'irq-urgent-2021-03-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fix from Ingo Molnar:
"A change to robustify force-threaded IRQ handlers to always disable
interrupts, plus a DocBook fix.
The force-threaded IRQ handler change has been accelerated from the
normal schedule of such a change to keep the bad pattern/workaround of
spin_lock_irqsave() in handlers or IRQF_NOTHREAD as a kludge from
spreading"
* tag 'irq-urgent-2021-03-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq: Disable interrupts for force threaded handlers
genirq/irq_sim: Fix typos in kernel doc (fnode -> fwnode)
devicetree-node lookups.
- Restore the IRQ2 ignore logic
- Fix get_nr_restart_syscall() to return the correct restart syscall number.
Split in a 4-patches set to avoid kABI breakage when backporting to dead
kernels.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmBXJu0ACgkQEsHwGGHe
VUrCkQ/9Et5W76HMQfHccluks2i2yNXgd7nROhIt0iMS1Ph86AWYJZmMZ2dbaqW8
nORU20ziHme+9PScmcJb2LdJxIRDtYNs1J811IYeKNpvj8KHXtV2VYCVG9UcL21E
FmUlZf5oINiDMzu3q4SuqHw9t7X6RCItolQIRmQHDXqPraFhBxji2VOFXDIg+qhf
a4sBz6UfxA4a/b7d/KxHxNvuQE5Cluc9gninhtaYh1b7OQZJX4+vTa3W5V4kK0df
ohOH5pnJp9V7qH2CmB3UcGWJTxHeLbm4E0KYkyasnKG9M0KmIvJ6jNARlRAo3hAF
hn9D4xLtsnIWjtO6xEVdF7kSizkYZRPay5kX88quvlSa0FkkPnsUvFtW79Yi3ZNy
vL2NAu2biqNQyo7ZWVffJns2DrJwYZ6KOGA6oUBwTUBfieF9KMdDew8IXRUMYNdO
LzW87Irf9eZj9c+b7Rtr0VofmKgRYwy1Lo8eVT+VGkV+nOTOB9rlAll2lYBq3aNA
W6ei0S5/1zaRF5aU6Qmnap4eb1X/tp845q6CPYa9kIsZwVyGFOa7iLeYcNn9qHdB
G6RW6CUh97A7wwxUYt5VGUscjYV2V9Ycv9HvIwrG/T7aezWnhI9ODtggzDgCnbls
og6N/+heLZ9G/DyxAEmHuazV2ItDPJq69gag/POHhXJaSUGbdbA=
=WfC4
-----END PGP SIGNATURE-----
Merge tag 'x86_urgent_for_v5.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
"The freshest pile of shiny x86 fixes for 5.12:
- Add the arch-specific mapping between physical and logical CPUs to
fix devicetree-node lookups
- Restore the IRQ2 ignore logic
- Fix get_nr_restart_syscall() to return the correct restart syscall
number. Split in a 4-patches set to avoid kABI breakage when
backporting to dead kernels"
* tag 'x86_urgent_for_v5.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/apic/of: Fix CPU devicetree-node lookups
x86/ioapic: Ignore IRQ2 again
x86: Introduce restart_block->arch_data to remove TS_COMPAT_RESTART
x86: Introduce TS_COMPAT_RESTART to fix get_nr_restart_syscall()
x86: Move TS_COMPAT back to asm/thread_info.h
kernel, fs: Introduce and use set_restart_fn() and arch_set_restart_data()
Just like we don't allow normal signals to IO threads, don't deliver a
STOP to a task that has PF_IO_WORKER set. The IO threads don't take
signals in general, and have no means of flushing out a stop either.
Longer term, we may want to look into allowing stop of these threads,
as it relates to eg process freezing. For now, this prevents a spin
issue if a SIGSTOP is delivered to the parent task.
Reported-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
They don't take signals individually, and even if they share signals with
the parent task, don't allow them to be delivered through the worker
thread. Linux does allow this kind of behavior for regular threads, but
it's really a compatability thing that we need not care about for the IO
threads.
Reported-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Since this message is printed when dynamically allocated spinlocks (e.g.
kzalloc()) are used without initialization (e.g. spin_lock_init()),
suggest to developers to check whether initialization functions for objects
were called, before making developers wonder what annotation is missing.
[ mingo: Minor tweaks to the message. ]
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210321064913.4619-1-penguin-kernel@I-love.SAKURA.ne.jp
Signed-off-by: Ingo Molnar <mingo@kernel.org>
With interrupt force threading all device interrupt handlers are invoked
from kernel threads. Contrary to hard interrupt context the invocation only
disables bottom halfs, but not interrupts. This was an oversight back then
because any code like this will have an issue:
thread(irq_A)
irq_handler(A)
spin_lock(&foo->lock);
interrupt(irq_B)
irq_handler(B)
spin_lock(&foo->lock);
This has been triggered with networking (NAPI vs. hrtimers) and console
drivers where printk() happens from an interrupt which interrupted the
force threaded handler.
Now people noticed and started to change the spin_lock() in the handler to
spin_lock_irqsave() which affects performance or add IRQF_NOTHREAD to the
interrupt request which in turn breaks RT.
Fix the root cause and not the symptom and disable interrupts before
invoking the force threaded handler which preserves the regular semantics
and the usefulness of the interrupt force threading as a general debugging
tool.
For not RT this is not changing much, except that during the execution of
the threaded handler interrupts are delayed until the handler
returns. Vs. scheduling and softirq processing there is no difference.
For RT kernels there is no issue.
Fixes: 8d32a307e4 ("genirq: Provide forced interrupt threading")
Reported-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Johan Hovold <johan@kernel.org>
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/20210317143859.513307808@linutronix.de
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmBVI8cQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpuFOD/494N0khk5EpLnoq0+/uyRpnqnTjL3n+iWc
fviiodL2/eirKWML/WbNUaKOWMs76iBwRqvTFnmCuyVexM9iPq3BXHocNYESYFni
0EfuL+jzs/LjQLVJgCxyYUyafDtCGZ5ct/3ilfGWSY13ngfYdUVT1p+u9NK94T63
4SrT6KKqEnpStpA1kjCw+doL17Tx2jrcrnX8gztIm0IarTnJGusiNZboy1IBMcqf
Lw7CEePn4b9/0wKJa8sDYIFtI8Rvj2Jk86c4DDpGgoPU6I9fGPnp3oMGrxlwectT
uTguzTlKAvbSu6v+2jqHCcXpkOG3aQJJM+YaNZmWOKwkLdyzLLIDT7SPlNHlacDF
yBj+Ou3FbKvVUrYldUHlQoLZIAgp7AQO1JBilijNNibXsH0M4Gaw3aGPFmhEFfeJ
/y+DXEfi2TGC6Yo+Ogub9Rh3gd2kgATu9Qbbnxi5TmYFc6WASBHP3OQEMVpVkD6F
IZxZDvIKMj3DoYX3Can0vlqiWhmL5o7gyaRTkmxc4A21CR+AHstupDNTHbR23IsY
dVxWmfrU25VFcIUAUOUgzPayDRn5KevexXjpkC8MVPQUqe/8FgI18eigDWTwlkcG
0AZUraswv8uT5b0oLj9cawtAU9Dlit7niI6r9I3dtoUAD3JY4+yDp7oZp2TTOV2z
+rgS+5zjug==
=aPxz
-----END PGP SIGNATURE-----
Merge tag 'io_uring-5.12-2021-03-19' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
"Quieter week this time, which was both expected and desired. About
half of the below is fixes for this release, the other half are just
fixes in general. In detail:
- Fix the freezing of IO threads, by making the freezer not send them
fake signals. Make them freezable by default.
- Like we did for personalities, move the buffer IDR to xarray. Kills
some code and avoids a use-after-free on teardown.
- SQPOLL cleanups and fixes (Pavel)
- Fix linked timeout race (Pavel)
- Fix potential completion post use-after-free (Pavel)
- Cleanup and move internal structures outside of general kernel view
(Stefan)
- Use MSG_SIGNAL for send/recv from io_uring (Stefan)"
* tag 'io_uring-5.12-2021-03-19' of git://git.kernel.dk/linux-block:
io_uring: don't leak creds on SQO attach error
io_uring: use typesafe pointers in io_uring_task
io_uring: remove structures from include/linux/io_uring.h
io_uring: imply MSG_NOSIGNAL for send[msg]()/recv[msg]() calls
io_uring: fix sqpoll cancellation via task_work
io_uring: add generic callback_head helpers
io_uring: fix concurrent parking
io_uring: halt SQO submission on ctx exit
io_uring: replace sqd rw_semaphore with mutex
io_uring: fix complete_post use ctx after free
io_uring: fix ->flags races by linked timeouts
io_uring: convert io_buffer_idr to XArray
io_uring: allow IO worker threads to be frozen
kernel: freezer should treat PF_IO_WORKER like PF_KTHREAD for freezing
Sites that match init_section_contains() get marked as INIT. For
built-in code init_sections contains both __init and __exit text. OTOH
kernel_text_address() only explicitly includes __init text (and there
are no __exit text markers).
Match what jump_label already does and ignore the warning for INIT
sites. Also see the excellent changelog for commit: 8f35eaa5f2
("jump_label: Don't warn on __exit jump entries")
Fixes: 9183c3f9ed ("static_call: Add inline static call infrastructure")
Reported-by: Sumit Garg <sumit.garg@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Tested-by: Sumit Garg <sumit.garg@linaro.org>
Link: https://lkml.kernel.org/r/20210318113610.739542434@infradead.org
The intent is to avoid writing init code after init (because the text
might have been freed). The code is needlessly different between
jump_label and static_call and not obviously correct.
The existing code relies on the fact that the module loader clears the
init layout, such that within_module_init() always fails, while
jump_label relies on the module state which is more obvious and
matches the kernel logic.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Tested-by: Sumit Garg <sumit.garg@linaro.org>
Link: https://lkml.kernel.org/r/20210318113610.636651340@infradead.org
It turns out that static_call_set_init() does not preserve the other
flags; IOW. it clears TAIL if it was set.
Fixes: 9183c3f9ed ("static_call: Add inline static call infrastructure")
Reported-by: Sumit Garg <sumit.garg@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Tested-by: Sumit Garg <sumit.garg@linaro.org>
Link: https://lkml.kernel.org/r/20210318113610.519406371@infradead.org
This reverts commit d60cd06331.
This patch causes a panic when rebooting my Dell Poweredge r440. I do
not have the full panic log as it's lost at that stage of the reboot and
I do not have a serial console. Reverting this patch makes my system
able to reboot again.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The fexit/fmod_ret programs can be attached to kernel functions that can sleep.
The synchronize_rcu_tasks() will not wait for such tasks to complete.
In such case the trampoline image will be freed and when the task
wakes up the return IP will point to freed memory causing the crash.
Solve this by adding percpu_ref_get/put for the duration of trampoline
and separate trampoline vs its image life times.
The "half page" optimization has to be removed, since
first_half->second_half->first_half transition cannot be guaranteed to
complete in deterministic time. Every trampoline update becomes a new image.
The image with fmod_ret or fexit progs will be freed via percpu_ref_kill and
call_rcu_tasks. Together they will wait for the original function and
trampoline asm to complete. The trampoline is patched from nop to jmp to skip
fexit progs. They are freed independently from the trampoline. The image with
fentry progs only will be freed via call_rcu_tasks_trace+call_rcu_tasks which
will wait for both sleepable and non-sleepable progs to complete.
Fixes: fec56f5890 ("bpf: Introduce BPF trampoline")
Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Paul E. McKenney <paulmck@kernel.org> # for RCU
Link: https://lore.kernel.org/bpf/20210316210007.38949-1-alexei.starovoitov@gmail.com
Given we know the max possible value of ptr_limit at the time of retrieving
the latter, add basic assertions, so that the verifier can bail out if
anything looks odd and reject the program. Nothing triggered this so far,
but it also does not hurt to have these.
Signed-off-by: Piotr Krysiuk <piotras@gmail.com>
Co-developed-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Instead of having the mov32 with aux->alu_limit - 1 immediate, move this
operation to retrieve_ptr_limit() instead to simplify the logic and to
allow for subsequent sanity boundary checks inside retrieve_ptr_limit().
This avoids in future that at the time of the verifier masking rewrite
we'd run into an underflow which would not sign extend due to the nature
of mov32 instruction.
Signed-off-by: Piotr Krysiuk <piotras@gmail.com>
Co-developed-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
retrieve_ptr_limit() computes the ptr_limit for registers with stack and
map_value type. ptr_limit is the size of the memory area that is still
valid / in-bounds from the point of the current position and direction
of the operation (add / sub). This size will later be used for masking
the operation such that attempting out-of-bounds access in the speculative
domain is redirected to remain within the bounds of the current map value.
When masking to the right the size is correct, however, when masking to
the left, the size is off-by-one which would lead to an incorrect mask
and thus incorrect arithmetic operation in the non-speculative domain.
Piotr found that if the resulting alu_limit value is zero, then the
BPF_MOV32_IMM() from the fixup_bpf_calls() rewrite will end up loading
0xffffffff into AX instead of sign-extending to the full 64 bit range,
and as a result, this allows abuse for executing speculatively out-of-
bounds loads against 4GB window of address space and thus extracting the
contents of kernel memory via side-channel.
Fixes: 979d63d50c ("bpf: prevent out of bounds speculation on pointer arithmetic")
Signed-off-by: Piotr Krysiuk <piotras@gmail.com>
Co-developed-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
The purpose of this patch is to streamline error propagation and in particular
to propagate retrieve_ptr_limit() errors for pointer types that are not defining
a ptr_limit such that register-based alu ops against these types can be rejected.
The main rationale is that a gap has been identified by Piotr in the existing
protection against speculatively out-of-bounds loads, for example, in case of
ctx pointers, unprivileged programs can still perform pointer arithmetic. This
can be abused to execute speculatively out-of-bounds loads without restrictions
and thus extract contents of kernel memory.
Fix this by rejecting unprivileged programs that attempt any pointer arithmetic
on unprotected pointer types. The two affected ones are pointer to ctx as well
as pointer to map. Field access to a modified ctx' pointer is rejected at a
later point in time in the verifier, and 7c69673262 ("bpf: Permit map_ptr
arithmetic with opcode add and offset 0") only relevant for root-only use cases.
Risk of unprivileged program breakage is considered very low.
Fixes: 7c69673262 ("bpf: Permit map_ptr arithmetic with opcode add and offset 0")
Fixes: b2157399cc ("bpf: prevent out-of-bounds speculation")
Signed-off-by: Piotr Krysiuk <piotras@gmail.com>
Co-developed-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
The use_ww_ctx flag is passed to mutex_optimistic_spin(), but the
function doesn't use it. The frequent use of the (use_ww_ctx && ww_ctx)
combination is repetitive.
In fact, ww_ctx should not be used at all if !use_ww_ctx. Simplify
ww_mutex code by dropping use_ww_ctx from mutex_optimistic_spin() an
clear ww_ctx if !use_ww_ctx. In this way, we can replace (use_ww_ctx &&
ww_ctx) by just (ww_ctx).
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
Link: https://lore.kernel.org/r/20210316153119.13802-2-longman@redhat.com
The following sequence of commands:
register_ftrace_direct(ip, addr1);
modify_ftrace_direct(ip, addr1, addr2);
unregister_ftrace_direct(ip, addr2);
will cause the kernel to warn:
[ 30.179191] WARNING: CPU: 2 PID: 1961 at kernel/trace/ftrace.c:5223 unregister_ftrace_direct+0x130/0x150
[ 30.180556] CPU: 2 PID: 1961 Comm: test_progs W O 5.12.0-rc2-00378-g86bc10a0a711-dirty #3246
[ 30.182453] RIP: 0010:unregister_ftrace_direct+0x130/0x150
When modify_ftrace_direct() changes the addr from old to new it should update
the addr stored in ftrace_direct_funcs. Otherwise the final
unregister_ftrace_direct() won't find the address and will cause the splat.
Fixes: 0567d68091 ("ftrace: Add modify_ftrace_direct()")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Link: https://lore.kernel.org/bpf/20210316195815.34714-1-alexei.starovoitov@gmail.com
Preparation for fixing get_nr_restart_syscall() on X86 for COMPAT.
Add a new helper which sets restart_block->fn and calls a dummy
arch_set_restart_data() helper.
Fixes: 609c19a385 ("x86/ptrace: Stop setting TS_COMPAT in ptrace code")
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210201174641.GA17871@redhat.com
Fix typos in kernel doc, otherwise validation script complains:
.../irq_sim.c:170: warning: Function parameter or member 'fwnode' not described in 'irq_domain_create_sim'
.../irq_sim.c:170: warning: Excess function parameter 'fnode' description in 'irq_domain_create_sim'
.../irq_sim.c:240: warning: Function parameter or member 'fwnode' not described in 'devm_irq_domain_create_sim'
.../irq_sim.c:240: warning: Excess function parameter 'fnode' description in 'devm_irq_domain_create_sim'
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210302161453.28540-1-andriy.shevchenko@linux.intel.com
Doing a
prctl(PR_SET_MM, PR_SET_MM_AUXV, addr, 1);
will copy 1 byte from userspace to (quite big) on-stack array
and then stash everything to mm->saved_auxv.
AT_NULL terminator will be inserted at the very end.
/proc/*/auxv handler will find that AT_NULL terminator
and copy original stack contents to userspace.
This devious scheme requires CAP_SYS_RESOURCE.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- Make the GENERIC_IRQ_MULTI_HANDLER configuration correct
- Add a missing DT compatible string fir tge Ingenic driver
- Remove the pointless debugfs_file pointer from struct irqdomain
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmBOLisTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYocIsD/oCUvQdR3WK2R73R4+ecJk9dpIG+J+m
dexJ2QZ8gc8qnGqfZznrw9+JnymYfbUxUzWNM+qKUJCfpGYrf0++iopJwdHcMexh
8dyptcZDGvw65QXUxaA1L+oKDBtFUouC3pie+AGpFHSX6FlWHdTS26fQ63UZy4uO
o4+sbHgiy1hEZZKB20k+WTF3e72+YPquo6VwP4lGcGjOsIq4PABmbeattF5E3Woa
wXXhC40qaSpA/JDWNaaknLzyEJgDORPDflWxMJQdo/A+SqRnHCbPjOmi0rGyn3dx
Ae17++DH/XsTzlLcIEe2ZeNdhIPfqNXSIssCzP8VZwLpseIJ22Ou0SRaQ0lUYutM
WrgAVT5+/iSQgX8Zu5Oncr56EOwrJLSupsRd+lXvEYLBLzlBhQx5UgodnxlKP+Go
PazdG52tJBapwH3Rh3Q8rJySxhfWpUUzFY/scb9IyyuqcxqFnFo7/EJqUukvJ6lA
hSFr8L5jYK6U3guKySChQuDGsFkz4xInoGuTWiL21lbbV3Y86kCZ3M5Aon8maM82
nxY73u+QTj8Xj2ElXgPa/sJiw26uszcFkgEWaeBM0OtUoaEJR7O1fy3s9SRwKlLG
smt92iFehSQoDJWJlujsyDewUacF1I3DS6DUlOit62P8FvWC+fEyn92aocStOtYz
AlRhB/V8WDFjbg==
=PG58
-----END PGP SIGNATURE-----
Merge tag 'irq-urgent-2021-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Thomas Gleixner:
"A set of irqchip updates:
- Make the GENERIC_IRQ_MULTI_HANDLER configuration correct
- Add a missing DT compatible string for the Ingenic driver
- Remove the pointless debugfs_file pointer from struct irqdomain"
* tag 'irq-urgent-2021-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/ingenic: Add support for the JZ4760
dt-bindings/irq: Add compatible string for the JZ4760B
irqchip: Do not blindly select CONFIG_GENERIC_IRQ_MULTI_HANDLER
ARM: ep93xx: Select GENERIC_IRQ_MULTI_HANDLER directly
irqdomain: Remove debugfs_file from struct irq_domain
lack of reevaluation of the timers which expire in softirq context under
certain circumstances, e.g. when the clock was set.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmBOLKkTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoeypD/wL+NjxFXzmAaSsy/sehpEmkavixQlE
BCfW+pVIvj4Hs4OQyzhJVdRIos/hzU+P/xmZ8Mk+yBU6+SY6n8N/CODzhV2IbaXT
V90MFqyB4U0/eWlILpAoVNxl+3SHvX1HxkrQn1uEz5+643tS9gnatlCAUGwHDzLD
i0Jykvpd9ytHi7VRPconzIA0wsG/DGkgQ7yzX+lLrhg6J/D04uTwT3j4nw9pgCH4
lsc3Snv+RoGwrcgNbgueRXxdIExPw0NfDOC2dM7SWWcgHXGK7MOt/WkrvD8xHF6c
CaF1Q2MXgZDjBynYzjFgSsHwk6iUc6X4EdxgA2fskQnSD8GhI88H875hIQJ2bF1r
jZS2UyDXKnaddOjKhigx3tQs3F2TjArKBxreP3oIzfTGCDE7t5tAo8siPvsHSB0E
FvuhSf3wojVCoLbsd+ByGH/Deh2Qe13eG8arG2pell7OBzCj/wU5Luw6K4uHAmFh
1cMnmOt8zeUkm7HPZX8fiZZlRDKqldBOZ5Mc7kEJ1sOzxtmxMYHRqGlFKhvByDrH
x/41WiJMskK+L/aqBOQZz5Yqn1PRGWDvLUpgFXFGeQeJaDDNB6dFvlXTfR6hUbdd
LKdrNMQk+E/o5+tZwhymz6+OXYlzoUZTU2FljwL8dLog7wRNhtFbsESFuI5nkJuN
MIZm1+5Lr4TNVg==
=GZmg
-----END PGP SIGNATURE-----
Merge tag 'timers-urgent-2021-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fix from Thomas Gleixner:
"A single fix in for hrtimers to prevent an interrupt storm caused by
the lack of reevaluation of the timers which expire in softirq context
under certain circumstances, e.g. when the clock was set"
* tag 'timers-urgent-2021-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
hrtimer: Update softirq_expires_next correctly after __hrtimer_get_next_event()
- Prevent a NULL pointer dereference in the migration_stop_cpu()
mechanims
- Prevent self concurrency of affine_move_task()
- Small fixes and cleanups related to task migration/affinity setting
- Ensure that sync_runqueues_membarrier_state() is invoked on the current
CPU when it is in the cpu mask
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmBOLHQTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoWqTEACQLldMda63sEBPzEh4s0y+s4BqUsUM
Pn5wVK1J91PZg1ofv4vLjIzKfjNuIbNNTswhux9kfb29LO0/KBd9BTYi442q4A+P
chMi0Amfp4AGYlwo5+RNwEFNDFr33TD2Ax83cJ6FIDlJzLj8DRfzyxtwBvXfBG5R
EZLTtKL30g20Y8N3nmQjvCInGvh0J1igr4lWXKtmvist7Ie3hW5jpvc8hF+VI0f+
C1JfHg6GRw2eSCVFaF9EEeqX8+Wce+MrWIjwwB363vIX82lc/XC2XVbvrsgpA2P8
sJaZz4KsOcXJLg9DWcN/OrpiMsgjnpKdMMsa3H2Uza8V3URtshpacb0wBWUpa1IA
R55oCv4aRst6hNcCW1ayOLSEOcR2A2qAW2/ktiWYDqerIqkSCezMktunmrOc/vrW
tmnEjlkYf0TNV54XREQ0Hr6OEnSIxqc9WrjbHUFbpv50YURqOCaHr19L0aOsemMJ
g1pJCNkQhv4gZSenM6Fgo5ucbWB2Nvzu/Y6g7B2VFcpa3K7fmRJZW2uU5FvhwbeQ
3ngvEwxMf3Rb6D7SpJyU41TYV9SqdOmoO4/UAFJ8YOlKp8biHCPmGh4+/QYza4Ky
BIfPKtpr7MnSuYayo0wYYcKG0nE+rRJrj0Y0MAtz+6SfRCEc5Vd0NfIPQeAxDTvp
oAITUrOuePiBrA==
=WSUq
-----END PGP SIGNATURE-----
Merge tag 'sched-urgent-2021-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Thomas Gleixner:
"A set of scheduler updates:
- Prevent a NULL pointer dereference in the migration_stop_cpu()
mechanims
- Prevent self concurrency of affine_move_task()
- Small fixes and cleanups related to task migration/affinity setting
- Ensure that sync_runqueues_membarrier_state() is invoked on the
current CPU when it is in the cpu mask"
* tag 'sched-urgent-2021-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/membarrier: fix missing local execution of ipi_sync_rq_state()
sched: Simplify set_affinity_pending refcounts
sched: Fix affine_move_task() self-concurrency
sched: Optimize migration_cpu_stop()
sched: Collate affine_move_task() stoppers
sched: Simplify migration_cpu_stop()
sched: Fix migration_cpu_stop() requeueing
- A fix for the static_call mechanism so it handles unaligned
addresses correctly.
- Make u64_stats_init() a macro so every instance gets a seperate lockdep
key.
- Make seqcount_latch_init() a macro as well to preserve the static
variable which is used for the lockdep key.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmBOK+ETHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYofjwD/0YlskydvnAOKeO8yjdBNiTtpw4aX5B
jTFTGXTgsmXeRfLPUt5Fte/9DX/tF2hKNYdy9bbLTK9Xf6+NLqTPf99OQwONB3Dn
3vRYPGMBeq7zzKgdH9n3H408YgmsON9IikvPUWDIxDvOsniCUnS2UIHmGefK/uTh
yuqnv+YhKBDLZz9XWiYm12Y163i7IsAurmyw95sI0G23HU0ityf7o42mXcFj2nkD
ET5xH6b+cHz1JUzmciLW2MFhx85IyaLN2ZfEAZSXgU2YwlCGPSOSp+MV3UOpa8YM
a6qW09L4rUsfWiB8SNMIaYyH7GHH5dvn9LrNP9/qF2QkAPeMisyTEkW2gyA/xLWG
xPv5T8QSWkWpgTc3BkSl6A6Y+o3YOoHaTcT3v1/FU6ZfYbdT5sPvLyA/MplRxhzd
thzrx9qSJvBzNiCNXgNdtICEuGTepuTb5ZbJTNmF4pMlNTB3Hbsl9EteAXD7V2pV
BDE7ckdLZnnd5pAtV3bxqETqftvU0GYA+s4Wp+UT8c4NQIm1XDxAV5UuK01LigQi
eAr5ja3TUGZWWr3uCM6QKZv6iYgldf9WtEQiovQaJIRUYZodmQ73AFA/mpeViPZF
ZQGMiSX7UBySv52J9GLR5pe+G8go1VNlYPuGw9qMBUysVpZ0104ccgvqlJgnFlCH
SA15mhCfEvZ0og==
=iE+t
-----END PGP SIGNATURE-----
Merge tag 'locking-urgent-2021-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixes from Thomas Gleixner:
"A couple of locking fixes:
- A fix for the static_call mechanism so it handles unaligned
addresses correctly.
- Make u64_stats_init() a macro so every instance gets a seperate
lockdep key.
- Make seqcount_latch_init() a macro as well to preserve the static
variable which is used for the lockdep key"
* tag 'locking-urgent-2021-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
seqlock,lockdep: Fix seqcount_latch_init()
u64_stats,lockdep: Fix u64_stats_init() vs lockdep
static_call: Fix the module key fixup
properly handle PID/TID for large PEBS.
- Handle the case properly when there's no PMU and therefore return an
empty list of perf MSRs for VMX to switch instead of reading random
garbage from the stack.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmBOBHoACgkQEsHwGGHe
VUoYHBAAmSY3P4Q91ZS+Sz1orGGX0LufQ0ZVWxnNUD9sFibz5Y2MxyJpQPm6Ae4U
1nO0+QyzbQPwuWKcQxlLHOJXkypkFSdRyR3cpAE5BOIXvqna07xBg/zuTFaOoDek
qn42RHLs5TQB1MNKY+0dyJAfjEHBFrm0CsO27L99TRv5nEsdECM/vUswvasc+QMC
dTS9sMHoiDVwJ8DFn6qmJ8AqkNxmcZgvNOD62TAt8Ac6u6zTGqq1oN+BMpQFRo9a
j/Fu+5PZS4bH/pMlpL0OR6AlmR1PPJ8e1Ik+1Dk0brCrSNdiXtS3DSTllbGxNFi6
4d5oSoStAyDNrihwPm2dw+VofFC03PEVZN095WVq7Iqn9cK/nxFgBEpaIe6fiwa2
MrZ2YiDxrOAin0hxUSu8oLwgOwxmedaSQwo1tyzZswVtXSqc7p4JawzBiIo93RAJ
UHpXI9zwgEyOGUJ95qcbezJVgILJqExjN+SOxaNjoqkAX8Hfgrf4aKDIMrcMC02Z
ZFW86MXL2Rwk+WspAKlWtPgAGuU5sljXeyDK0MRcHwAom8cX+Fod80ocI+xjX8JB
R73cd9dE2iWzIADikCItixzka+HuUBgWDqVT85yTzBt/KqwbIeE7kn6VCJmoJBbw
c9aRcyqEBky8FO6EpD0vIP2jcnlbvUnoq5wG0KV9KXaQDhxtZfk=
=djiL
-----END PGP SIGNATURE-----
Merge tag 'perf_urgent_for_v5.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Borislav Petkov:
- Make sure PMU internal buffers are flushed for per-CPU events too and
properly handle PID/TID for large PEBS.
- Handle the case properly when there's no PMU and therefore return an
empty list of perf MSRs for VMX to switch instead of reading random
garbage from the stack.
* tag 'perf_urgent_for_v5.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/perf: Use RET0 as default for guest_get_msrs to handle "no PMU" case
perf/x86/intel: Set PERF_ATTACH_SCHED_CB for large PEBS and LBR
perf/core: Flush PMU internal buffers for per-CPU events
Merge misc fixes from Andrew Morton:
"28 patches.
Subsystems affected by this series: mm (memblock, pagealloc, hugetlb,
highmem, kfence, oom-kill, madvise, kasan, userfaultfd, memcg, and
zram), core-kernel, kconfig, fork, binfmt, MAINTAINERS, kbuild, and
ia64"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (28 commits)
zram: fix broken page writeback
zram: fix return value on writeback_store
mm/memcg: set memcg when splitting page
mm/memcg: rename mem_cgroup_split_huge_fixup to split_page_memcg and add nr_pages argument
ia64: fix ptrace(PTRACE_SYSCALL_INFO_EXIT) sign
ia64: fix ia64_syscall_get_set_arguments() for break-based syscalls
mm/userfaultfd: fix memory corruption due to writeprotect
kasan: fix KASAN_STACK dependency for HW_TAGS
kasan, mm: fix crash with HW_TAGS and DEBUG_PAGEALLOC
mm/madvise: replace ptrace attach requirement for process_madvise
include/linux/sched/mm.h: use rcu_dereference in in_vfork()
kfence: fix reports if constant function prefixes exist
kfence, slab: fix cache_alloc_debugcheck_after() for bulk allocations
kfence: fix printk format for ptrdiff_t
linux/compiler-clang.h: define HAVE_BUILTIN_BSWAP*
MAINTAINERS: exclude uapi directories in API/ABI section
binfmt_misc: fix possible deadlock in bm_register_write
mm/highmem.c: fix zero_user_segments() with start > end
hugetlb: do early cow when page pinned on src mm
mm: use is_cow_mapping() across tree where proper
...
When a new mm is created, its PASID should be cleared, i.e. the PASID is
initialized to its init state 0 on both ARM and X86.
This patch was part of the series introducing mm->pasid, but got lost
along the way [1]. It still makes sense to have it, because each address
space has a different PASID. And the IOMMU code in
iommu_sva_alloc_pasid() expects the pasid field of a new mm struct to be
cleared.
[1] https://lore.kernel.org/linux-iommu/YDgh53AcQHT+T3L0@otcwcpicx3.sc.intel.com/
Link: https://lkml.kernel.org/r/20210302103837.2562625-1-jean-philippe@linaro.org
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: Jacob Pan <jacob.jun.pan@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With the freezer using the proper signaling to notify us of when it's
time to freeze a thread, we can re-enable normal freezer usage for the
IO threads. Ensure that SQPOLL, io-wq, and the io-wq manager call
try_to_freeze() appropriately, and remove the default setting of
PF_NOFREEZE from create_io_thread().
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Don't send fake signals to PF_IO_WORKER threads, they don't accept
signals. Just treat them like kthreads in this regard, all they need
is a wakeup as no forced kernel/user transition is needed.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmBLtdcQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpqK9D/9sE6QDAmLCvW4+wsFawf+Md9tCE3F15quC
Tptsa6IoR2UB01d06uavLJ5sGo0LeVQQP8+Nygz0TM7jSV39Odmr8geP8wyqSQwP
ZHLasrnz3LGINFOmxwMz/xQbrYUXEhRah+nx9Me0ROWmtQ46MRBZlpjsxffKccC9
SdkS6R8chfc/6HT6oQXMRRDtB4U4SjDdeX6VFIW5E2Z62h0xjhZrmY42fPmChjXR
mmAa2medSmajlwKrmp/+6sCfu2vVRR7bZ5FbS/SoQyo3ZvMabXI3lWicSgtu1wAK
iK9NFJEuJ34Fj4RxTSwQrj0eRX5BqZpWHUJ/1ecxc4tDRtaIXZuzPtblYrZ5fwYe
5pBzXXNpVwhat1AvGp9BFH/4P3kxJDszUAuL7zRut6nHu8xFGDGbNJHezCtws/uZ
i+90Qt5sfoYyXgMDAZuXS7AkJXKbdnajpwjXmZheL3MEj2EsVylcTVaW0MBdVjx1
y0eAtOGUVj2rNOSthDT0ZlKql7PY9N3dhkRxJIzRlIIfBfg73UWkis7zOlFE8CCz
y0rtsu+v/u22mU17v6gdVnTls/vbfiGSg4SutEK2Rv/Qqbjr+po+RXK14BJKBJR9
JknAkQlBjagZmLZKlzRfCDqa62aFYwxC/eOeLGxSpInj0ncgKmWNpnFjXSyRBdPq
stOCQF5aHQ==
=40h0
-----END PGP SIGNATURE-----
Merge tag 'io_uring-5.12-2021-03-12' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
"Not quite as small this week as I had hoped, but at least this should
be the end of it. All the little known issues have been ironed out -
most of it little stuff, but cancelations being the bigger part. Only
minor tweaks and/or regular fixes expected beyond this point.
- Fix the creds tracking for async (io-wq and SQPOLL)
- Various SQPOLL fixes related to parking, sharing, forking, IOPOLL,
completions, and life times. Much simpler now.
- Make IO threads unfreezable by default, on account of a bug report
that had them spinning on resume. Honestly not quite sure why
thawing leaves us with a perpetual signal pending (causing the
spin), but for now make them unfreezable like there were in 5.11
and prior.
- Move personality_idr to xarray, solving a use-after-free related to
removing an entry from the iterator callback. Buffer idr needs the
same treatment.
- Re-org around and task vs context tracking, enabling the fixing of
cancelations, and then cancelation fixes on top.
- Various little bits of cleanups and hardening, and removal of now
dead parts"
* tag 'io_uring-5.12-2021-03-12' of git://git.kernel.dk/linux-block: (34 commits)
io_uring: fix OP_ASYNC_CANCEL across tasks
io_uring: cancel sqpoll via task_work
io_uring: prevent racy sqd->thread checks
io_uring: remove useless ->startup completion
io_uring: cancel deferred requests in try_cancel
io_uring: perform IOPOLL reaping if canceler is thread itself
io_uring: force creation of separate context for ATTACH_WQ and non-threads
io_uring: remove indirect ctx into sqo injection
io_uring: fix invalid ctx->sq_thread_idle
kernel: make IO threads unfreezable by default
io_uring: always wait for sqd exited when stopping SQPOLL thread
io_uring: remove unneeded variable 'ret'
io_uring: move all io_kiocb init early in io_init_req()
io-wq: fix ref leak for req in case of exit cancelations
io_uring: fix complete_post races for linked req
io_uring: add io_disarm_next() helper
io_uring: fix io_sq_offload_create error handling
io-wq: remove unused 'user' member of io_wq
io_uring: Convert personality_idr to XArray
io_uring: clean R_DISABLED startup mess
...
Daniel Borkmann says:
====================
pull-request: bpf 2021-03-10
The following pull-request contains BPF updates for your *net* tree.
We've added 8 non-merge commits during the last 5 day(s) which contain
a total of 11 files changed, 136 insertions(+), 17 deletions(-).
The main changes are:
1) Reject bogus use of vmlinux BTF as map/prog creation BTF, from Alexei Starovoitov.
2) Fix allocation failure splat in x86 JIT for large progs. Also fix overwriting
percpu cgroup storage from tracing programs when nested, from Yonghong Song.
3) Fix rx queue retrieval in XDP for multi-queue veth, from Maciej Fijalkowski.
4) Fix bpf_check_mtu() helper API before freeze to have mtu_len as custom skb/xdp
L3 input length, from Jesper Dangaard Brouer.
5) Fix inode_storage's lookup_elem return value upon having bad fd, from Tal Lossos.
6) Fix bpftool and libbpf cross-build on MacOS, from Georgi Valkov.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The io-wq threads were already marked as no-freeze, but the manager was
not. On resume, we perpetually have signal_pending() being true, and
hence the manager will loop and spin 100% of the time.
Just mark the tasks created by create_io_thread() as PF_NOFREEZE by
default, and remove any knowledge of it in io-wq and io_uring.
Reported-by: Kevin Locke <kevin@kevinlocke.name>
Tested-by: Kevin Locke <kevin@kevinlocke.name>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pull networking fixes from David Miller:
1) Fix transmissions in dynamic SMPS mode in ath9k, from Felix Fietkau.
2) TX skb error handling fix in mt76 driver, also from Felix.
3) Fix BPF_FETCH atomic in x86 JIT, from Brendan Jackman.
4) Avoid double free of percpu pointers when freeing a cloned bpf prog.
From Cong Wang.
5) Use correct printf format for dma_addr_t in ath11k, from Geert
Uytterhoeven.
6) Fix resolve_btfids build with older toolchains, from Kun-Chuan
Hsieh.
7) Don't report truncated frames to mac80211 in mt76 driver, from
Lorenzop Bianconi.
8) Fix watcdog timeout on suspend/resume of stmmac, from Joakim Zhang.
9) mscc ocelot needs NET_DEVLINK selct in Kconfig, from Arnd Bergmann.
10) Fix sign comparison bug in TCP_ZEROCOPY_RECEIVE getsockopt(), from
Arjun Roy.
11) Ignore routes with deleted nexthop object in mlxsw, from Ido
Schimmel.
12) Need to undo tcp early demux lookup sometimes in nf_nat, from
Florian Westphal.
13) Fix gro aggregation for udp encaps with zero csum, from Daniel
Borkmann.
14) Make sure to always use imp*_ndo_send when necessaey, from Jason A.
Donenfeld.
15) Fix TRSCER masks in sh_eth driver from Sergey Shtylyov.
16) prevent overly huge skb allocationsd in qrtr, from Pavel Skripkin.
17) Prevent rx ring copnsumer index loss of sync in enetc, from Vladimir
Oltean.
18) Make sure textsearch copntrol block is large enough, from Wilem de
Bruijn.
19) Revert MAC changes to r8152 leading to instability, from Hates Wang.
20) Advance iov in 9p even for empty reads, from Jissheng Zhang.
21) Double hook unregister in nftables, from PabloNeira Ayuso.
22) Fix memleak in ixgbe, fropm Dinghao Liu.
23) Avoid dups in pkt scheduler class dumps, from Maximilian Heyne.
24) Various mptcp fixes from Florian Westphal, Paolo Abeni, and Geliang
Tang.
25) Fix DOI refcount bugs in cipso, from Paul Moore.
26) One too many irqsave in ibmvnic, from Junlin Yang.
27) Fix infinite loop with MPLS gso segmenting via virtio_net, from
Balazs Nemeth.
* git://git.kernel.org:/pub/scm/linux/kernel/git/netdev/net: (164 commits)
s390/qeth: fix notification for pending buffers during teardown
s390/qeth: schedule TX NAPI on QAOB completion
s390/qeth: improve completion of pending TX buffers
s390/qeth: fix memory leak after failed TX Buffer allocation
net: avoid infinite loop in mpls_gso_segment when mpls_hlen == 0
net: check if protocol extracted by virtio_net_hdr_set_proto is correct
net: dsa: xrs700x: check if partner is same as port in hsr join
net: lapbether: Remove netif_start_queue / netif_stop_queue
atm: idt77252: fix null-ptr-dereference
atm: uPD98402: fix incorrect allocation
atm: fix a typo in the struct description
net: qrtr: fix error return code of qrtr_sendmsg()
mptcp: fix length of ADD_ADDR with port sub-option
net: bonding: fix error return code of bond_neigh_init()
net: enetc: allow hardware timestamping on TX queues with tc-etf enabled
net: enetc: set MAC RX FIFO to recommended value
net: davicom: Use platform_get_irq_optional()
net: davicom: Fix regulator not turned off on driver removal
net: davicom: Fix regulator not turned off on failed probe
net: dsa: fix switchdev objects on bridge master mistakenly being applied on ports
...
There's no need to keep around a dentry pointer to a simple file that
debugfs itself can look up when we need to remove it from the system.
So simplify the code by deleting the variable and cleaning up the logic
around the debugfs file.
Cc: Marc Zyngier <maz@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/YCvYV53ZdzQSWY6w@kroah.com
bpf_fd_inode_storage_lookup_elem() returned NULL when getting a bad FD,
which caused -ENOENT in bpf_map_copy_value. -EBADF error is better than
-ENOENT for a bad FD behaviour.
The patch was partially contributed by CyberArk Software, Inc.
Fixes: 8ea636848a ("bpf: Implement bpf_local_storage for inodes")
Signed-off-by: Tal Lossos <tallossos@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: KP Singh <kpsingh@kernel.org>
Link: https://lore.kernel.org/bpf/20210307120948.61414-1-tallossos@gmail.com
The syzbot got FD of vmlinux BTF and passed it into map_create which caused
crash in btf_type_id_size() when it tried to access resolved_ids. The vmlinux
BTF doesn't have 'resolved_ids' and 'resolved_sizes' initialized to save
memory. To avoid such issues disallow using vmlinux BTF in prog_load and
map_create commands.
Fixes: 5329722057 ("bpf: Assign ID to vmlinux BTF and return extra info for BTF in GET_OBJ_INFO")
Reported-by: syzbot+8bab8ed346746e7540e8@syzkaller.appspotmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210307225248.79031-1-alexei.starovoitov@gmail.com
hrtimer_force_reprogram() and hrtimer_interrupt() invokes
__hrtimer_get_next_event() to find the earliest expiry time of hrtimer
bases. __hrtimer_get_next_event() does not update
cpu_base::[softirq_]_expires_next to preserve reprogramming logic. That
needs to be done at the callsites.
hrtimer_force_reprogram() updates cpu_base::softirq_expires_next only when
the first expiring timer is a softirq timer and the soft interrupt is not
activated. That's wrong because cpu_base::softirq_expires_next is left
stale when the first expiring timer of all bases is a timer which expires
in hard interrupt context. hrtimer_interrupt() does never update
cpu_base::softirq_expires_next which is wrong too.
That becomes a problem when clock_settime() sets CLOCK_REALTIME forward and
the first soft expiring timer is in the CLOCK_REALTIME_SOFT base. Setting
CLOCK_REALTIME forward moves the clock MONOTONIC based expiry time of that
timer before the stale cpu_base::softirq_expires_next.
cpu_base::softirq_expires_next is cached to make the check for raising the
soft interrupt fast. In the above case the soft interrupt won't be raised
until clock monotonic reaches the stale cpu_base::softirq_expires_next
value. That's incorrect, but what's worse it that if the softirq timer
becomes the first expiring timer of all clock bases after the hard expiry
timer has been handled the reprogramming of the clockevent from
hrtimer_interrupt() will result in an interrupt storm. That happens because
the reprogramming does not use cpu_base::softirq_expires_next, it uses
__hrtimer_get_next_event() which returns the actual expiry time. Once clock
MONOTONIC reaches cpu_base::softirq_expires_next the soft interrupt is
raised and the storm subsides.
Change the logic in hrtimer_force_reprogram() to evaluate the soft and hard
bases seperately, update softirq_expires_next and handle the case when a
soft expiring timer is the first of all bases by comparing the expiry times
and updating the required cpu base fields. Split this functionality into a
separate function to be able to use it in hrtimer_interrupt() as well
without copy paste.
Fixes: 5da7016046 ("hrtimer: Implement support for softirq based hrtimers")
Reported-by: Mikael Beckius <mikael.beckius@windriver.com>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Mikael Beckius <mikael.beckius@windriver.com>
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210223160240.27518-1-anna-maria@linutronix.de
Sometimes the PMU internal buffers have to be flushed for per-CPU events
during a context switch, e.g., large PEBS. Otherwise, the perf tool may
report samples in locations that do not belong to the process where the
samples are processed in, because PEBS does not tag samples with PID/TID.
The current code only flush the buffers for a per-task event. It doesn't
check a per-CPU event.
Add a new event state flag, PERF_ATTACH_SCHED_CB, to indicate that the
PMU internal buffers have to be flushed for this event during a context
switch.
Add sched_cb_entry and perf_sched_cb_usages back to track the PMU/cpuctx
which is required to be flushed.
Only need to invoke the sched_task() for per-CPU events in this patch.
The per-task events have been handled in perf_event_context_sched_in/out
already.
Fixes: 9c964efa43 ("perf/x86/intel: Drain the PEBS buffer during context switches")
Reported-by: Gabriel Marin <gmx@google.com>
Originally-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20201130193842.10569-1-kan.liang@linux.intel.com
Provided the target address of a R_X86_64_PC32 relocation is aligned,
the low two bits should be invariant between the relative and absolute
value.
Turns out the address is not aligned and things go sideways, ensure we
transfer the bits in the absolute form when fixing up the key address.
Fixes: 73f44fe19d ("static_call: Allow module use without exposing static_call_key")
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Link: https://lkml.kernel.org/r/20210225220351.GE4746@worktop.programming.kicks-ass.net
The function sync_runqueues_membarrier_state() should copy the
membarrier state from the @mm received as parameter to each runqueue
currently running tasks using that mm.
However, the use of smp_call_function_many() skips the current runqueue,
which is unintended. Replace by a call to on_each_cpu_mask().
Fixes: 227a4aadc7 ("sched/membarrier: Fix p->mm->membarrier_state racy load")
Reported-by: Nadav Amit <nadav.amit@gmail.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: stable@vger.kernel.org # 5.4.x+
Link: https://lore.kernel.org/r/74F1E842-4A84-47BF-B6C2-5407DFDD4A4A@gmail.com
Now that we have set_affinity_pending::stop_pending to indicate if a
stopper is in progress, and we have the guarantee that if that stopper
exists, it will (eventually) complete our @pending we can simplify the
refcount scheme by no longer counting the stopper thread.
Fixes: 6d337eab04 ("sched: Fix migrate_disable() vs set_cpus_allowed_ptr()")
Cc: stable@kernel.org
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Link: https://lkml.kernel.org/r/20210224131355.724130207@infradead.org
Consider:
sched_setaffinity(p, X); sched_setaffinity(p, Y);
Then the first will install p->migration_pending = &my_pending; and
issue stop_one_cpu_nowait(pending); and the second one will read
p->migration_pending and _also_ issue: stop_one_cpu_nowait(pending),
the _SAME_ @pending.
This causes stopper list corruption.
Add set_affinity_pending::stop_pending, to indicate if a stopper is in
progress.
Fixes: 6d337eab04 ("sched: Fix migrate_disable() vs set_cpus_allowed_ptr()")
Cc: stable@kernel.org
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Link: https://lkml.kernel.org/r/20210224131355.649146419@infradead.org
When the purpose of migration_cpu_stop() is to migrate the task to
'any' valid CPU, don't migrate the task when it's already running on a
valid CPU.
Fixes: 6d337eab04 ("sched: Fix migrate_disable() vs set_cpus_allowed_ptr()")
Cc: stable@kernel.org
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Link: https://lkml.kernel.org/r/20210224131355.569238629@infradead.org
The SCA_MIGRATE_ENABLE and task_running() cases are almost identical,
collapse them to avoid further duplication.
Fixes: 6d337eab04 ("sched: Fix migrate_disable() vs set_cpus_allowed_ptr()")
Cc: stable@kernel.org
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Link: https://lkml.kernel.org/r/20210224131355.500108964@infradead.org
When affine_move_task() issues a migration_cpu_stop(), the purpose of
that function is to complete that @pending, not any random other
p->migration_pending that might have gotten installed since.
This realization much simplifies migration_cpu_stop() and allows
further necessary steps to fix all this as it provides the guarantee
that @pending's stopper will complete @pending (and not some random
other @pending).
Fixes: 6d337eab04 ("sched: Fix migrate_disable() vs set_cpus_allowed_ptr()")
Cc: stable@kernel.org
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Link: https://lkml.kernel.org/r/20210224131355.430014682@infradead.org
When affine_move_task(p) is called on a running task @p, which is not
otherwise already changing affinity, we'll first set
p->migration_pending and then do:
stop_one_cpu(cpu_of_rq(rq), migration_cpu_stop, &arg);
This then gets us to migration_cpu_stop() running on the CPU that was
previously running our victim task @p.
If we find that our task is no longer on that runqueue (this can
happen because of a concurrent migration due to load-balance etc.),
then we'll end up at the:
} else if (dest_cpu < 1 || pending) {
branch. Which we'll take because we set pending earlier. Here we first
check if the task @p has already satisfied the affinity constraints,
if so we bail early [A]. Otherwise we'll reissue migration_cpu_stop()
onto the CPU that is now hosting our task @p:
stop_one_cpu_nowait(cpu_of(rq), migration_cpu_stop,
&pending->arg, &pending->stop_work);
Except, we've never initialized pending->arg, which will be all 0s.
This then results in running migration_cpu_stop() on the next CPU with
arg->p == NULL, which gives the by now obvious result of fireworks.
The cure is to change affine_move_task() to always use pending->arg,
furthermore we can use the exact same pattern as the
SCA_MIGRATE_ENABLE case, since we'll block on the pending->done
completion anyway, no point in adding yet another completion in
stop_one_cpu().
This then gives a clear distinction between the two
migration_cpu_stop() use cases:
- sched_exec() / migrate_task_to() : arg->pending == NULL
- affine_move_task() : arg->pending != NULL;
And we can have it ignore p->migration_pending when !arg->pending. Any
stop work from sched_exec() / migrate_task_to() is in addition to stop
works from affine_move_task(), which will be sufficient to issue the
completion.
Fixes: 6d337eab04 ("sched: Fix migrate_disable() vs set_cpus_allowed_ptr()")
Cc: stable@kernel.org
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Link: https://lkml.kernel.org/r/20210224131355.357743989@infradead.org
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmBCYeIQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpisOD/9bSFR7gRqO9oIy6/PEveRI4PWDujjcXgRZ
6jxQnfFUrNQsXcXIlHO4HUDG7DVX/isxdk/YVGhVfuKoco/a0XyYAALH5SVy77T+
hDdWCIJBXgxnfAvv+xMBQDEwlz+pdaOLfOVaGMRAp3akuVTBMA+ZE940Lc81kBaU
bTGev+BzPUsUE7n6ebPdhIQDA6LB02e7kaBZsRDwjsABJuD3o4O1jOAtZyqpPRsW
nADvxsrlMxB3RN97iokinBXV426iAQ/nBDYVDVnWpbckD7Ti4f6r2ohku0qEdhZS
XrTF+1mzEqdmvMLl1YQ/GGpH7ReOLHN78aj4BaG49+pryfkaFe50AHr7frGqKLms
DWymTJnpdJSTNT0Z2GRLNrnWHa3YgeuPMdhlIPfihnZBXhZ7p6X5iNpQ69jd93P3
zLXMJ0RKpkl6bmV+Pk4kCqUfz1BV3sUqG9euLdTq+3uBRA0/B5ktPosyH2DGqUYa
n9aEUHslwHUF+Deu/S9RmVzhTjuD0IRbURSeayimFFe71kHhKsHShOKQMUkhu6zQ
AMsQRq9VrWy/3x3C+qpcbEJ3BIqyGLbiQByOBx96kg9Zk14io3GEmSlqZcxbsKTq
/JXjanaEcUwtKKccOC6g+O+G7VlskO9gLi/Fj/x98R92UBEqpEtVZb8MLCdpiLY/
SHJHbC7Fpw==
=w0Sf
-----END PGP SIGNATURE-----
Merge tag 'io_uring-5.12-2021-03-05' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
"A bit of a mix between fallout from the worker change, cleanups and
reductions now possible from that change, and fixes in general. In
detail:
- Fully serialize manager and worker creation, fixing races due to
that.
- Clean up some naming that had gone stale.
- SQPOLL fixes.
- Fix race condition around task_work rework that went into this
merge window.
- Implement unshare. Used for when the original task does unshare(2)
or setuid/seteuid and friends, drops the original workers and forks
new ones.
- Drop the only remaining piece of state shuffling we had left, which
was cred. Move it into issue instead, and we can drop all of that
code too.
- Kill f_op->flush() usage. That was such a nasty hack that we had
out of necessity, we no longer need it.
- Following from ->flush() removal, we can also drop various bits of
ctx state related to SQPOLL and cancelations.
- Fix an issue with IOPOLL retry, which originally was fallout from a
filemap change (removing iov_iter_revert()), but uncovered an issue
with iovec re-import too late.
- Fix an issue with system suspend.
- Use xchg() for fallback work, instead of cmpxchg().
- Properly destroy io-wq on exec.
- Add create_io_thread() core helper, and use that in io-wq and
io_uring. This allows us to remove various silly completion events
related to thread setup.
- A few error handling fixes.
This should be the grunt of fixes necessary for the new workers, next
week should be quieter. We've got a pending series from Pavel on
cancelations, and how tasks and rings are indexed. Outside of that,
should just be minor fixes. Even with these fixes, we're still killing
a net ~80 lines"
* tag 'io_uring-5.12-2021-03-05' of git://git.kernel.dk/linux-block: (41 commits)
io_uring: don't restrict issue_flags for io_openat
io_uring: make SQPOLL thread parking saner
io-wq: kill hashed waitqueue before manager exits
io_uring: clear IOCB_WAITQ for non -EIOCBQUEUED return
io_uring: don't keep looping for more events if we can't flush overflow
io_uring: move to using create_io_thread()
kernel: provide create_io_thread() helper
io_uring: reliably cancel linked timeouts
io_uring: cancel-match based on flags
io-wq: ensure all pending work is canceled on exit
io_uring: ensure that threads freeze on suspend
io_uring: remove extra in_idle wake up
io_uring: inline __io_queue_async_work()
io_uring: inline io_req_clean_work()
io_uring: choose right tctx->io_wq for try cancel
io_uring: fix -EAGAIN retry with IOPOLL
io-wq: fix error path leak of buffered write hash map
io_uring: remove sqo_task
io_uring: kill sqo_dead and sqo submission halting
io_uring: ignore double poll add on the same waitqueue head
...
As pointed out by Ilya and explained in the new comment, there's a
discrepancy between x86 and BPF CMPXCHG semantics: BPF always loads
the value from memory into r0, while x86 only does so when r0 and the
value in memory are different. The same issue affects s390.
At first this might sound like pure semantics, but it makes a real
difference when the comparison is 32-bit, since the load will
zero-extend r0/rax.
The fix is to explicitly zero-extend rax after doing such a
CMPXCHG. Since this problem affects multiple archs, this is done in
the verifier by patching in a BPF_ZEXT_REG instruction after every
32-bit cmpxchg. Any archs that don't need such manual zero-extension
can do a look-ahead with insn_is_zext to skip the unnecessary mov.
Note this still goes on top of Ilya's patch:
https://lore.kernel.org/bpf/20210301154019.129110-1-iii@linux.ibm.com/T/#u
Differences v5->v6[1]:
- Moved is_cmpxchg_insn and ensured it can be safely re-used. Also renamed it
and removed 'inline' to match the style of the is_*_function helpers.
- Fixed up comments in verifier test (thanks for the careful review, Martin!)
Differences v4->v5[1]:
- Moved the logic entirely into opt_subreg_zext_lo32_rnd_hi32, thanks to Martin
for suggesting this.
Differences v3->v4[1]:
- Moved the optimization against pointless zext into the correct place:
opt_subreg_zext_lo32_rnd_hi32 is called _after_ fixup_bpf_calls.
Differences v2->v3[1]:
- Moved patching into fixup_bpf_calls (patch incoming to rename this function)
- Added extra commentary on bpf_jit_needs_zext
- Added check to avoid adding a pointless zext(r0) if there's already one there.
Difference v1->v2[1]: Now solved centrally in the verifier instead of
specifically for the x86 JIT. Thanks to Ilya and Daniel for the suggestions!
[1] v5: https://lore.kernel.org/bpf/CA+i-1C3ytZz6FjcPmUg5s4L51pMQDxWcZNvM86w4RHZ_o2khwg@mail.gmail.com/T/#t
v4: https://lore.kernel.org/bpf/CA+i-1C3ytZz6FjcPmUg5s4L51pMQDxWcZNvM86w4RHZ_o2khwg@mail.gmail.com/T/#t
v3: https://lore.kernel.org/bpf/08669818-c99d-0d30-e1db-53160c063611@iogearbox.net/T/#t
v2: https://lore.kernel.org/bpf/08669818-c99d-0d30-e1db-53160c063611@iogearbox.net/T/#t
v1: https://lore.kernel.org/bpf/d7ebaefb-bfd6-a441-3ff2-2fdfe699b1d2@iogearbox.net/T/#t
Reported-by: Ilya Leoshkevich <iii@linux.ibm.com>
Fixes: 5ffa25502b ("bpf: Add instructions for atomic_[cmp]xchg")
Signed-off-by: Brendan Jackman <jackmanb@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Tested-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Provide a generic helper for setting up an io_uring worker. Returns a
task_struct so that the caller can do whatever setup is needed, then call
wake_up_new_task() to kick it into gear.
Add a kernel_clone_args member, io_thread, which tells copy_process() to
mark the task with PF_IO_WORKER.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
insn_has_def32() returns false for 32-bit BPF_FETCH insns. This makes
adjust_insn_aux_data() incorrectly set zext_dst, as can be seen in [1].
This happens because insn_no_def() does not know about the BPF_FETCH
variants of BPF_STX.
Fix in two steps.
First, replace insn_no_def() with insn_def_regno(), which returns the
register an insn defines. Normally insn_no_def() calls are followed by
insn->dst_reg uses; replace those with the insn_def_regno() return
value.
Second, adjust the BPF_STX special case in is_reg64() to deal with
queries made from opt_subreg_zext_lo32_rnd_hi32(), where the state
information is no longer available. Add a comment, since the purpose
of this special case is not clear at first glance.
[1] https://lore.kernel.org/bpf/20210223150845.1857620-1-jackmanb@google.com/
Fixes: 5ffa25502b ("bpf: Add instructions for atomic_[cmp]xchg")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Brendan Jackman <jackmanb@google.com>
Link: https://lore.kernel.org/bpf/20210301154019.129110-1-iii@linux.ibm.com
If tracing is disabled for some reason (traceoff_on_warning, command line,
etc), the ftrace selftests are guaranteed to fail, as their results are
defined by trace data in the ring buffers. If the ring buffers are turned
off, the tests will fail, due to lack of data.
Because tracing being disabled is for a specific reason (warning, user
decided to, etc), it does not make sense to enable tracing to run the self
tests, as the test output may corrupt the reason for the tracing to be
disabled.
Instead, simply skip the self tests and report that they are being skipped
due to tracing being disabled.
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
When the CONFIG_RING_BUFFER_VALIDATE_TIME_DELTAS is enabled, and the time
stamps are detected as not being valid, it reports information about the
write stamp, but does not show the before_stamp which is still useful
information. Also, it should give a warning once, such that tests detect
this happening.
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Part of the logic of the new time stamp code depends on the before_stamp and
the write_stamp to be different if the write_stamp does not match the last
event on the buffer, as it will be used to calculate the delta of the next
event written on the buffer.
The discard logic depends on this, as the next event to come in needs to
inject a full timestamp as it can not rely on the last event timestamp in
the buffer because it is unknown due to events after it being discarded. But
by changing the write_stamp back to the time before it, it forces the next
event to use a full time stamp, instead of relying on it.
The issue came when a full time stamp was used for the event, and
rb_time_delta() returns zero in that case. The update to the write_stamp
(which subtracts delta) made it not change. Then when the event is removed
from the buffer, because the before_stamp and write_stamp still match, the
next event written would calculate its delta from the write_stamp, but that
would be wrong as the write_stamp is of the time of the event that was
discarded.
In the case that the delta change being made to write_stamp is zero, set the
before_stamp to zero as well, and this will force the next event to inject a
full timestamp and not use the current write_stamp.
Cc: stable@vger.kernel.org
Fixes: a389d86f7f ("ring-buffer: Have nested events still record running time stamp")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
A declaration of function "int trace_empty(struct trace_iterator *iter)"
shows up twice in the header file kernel/trace/trace.h
Link: https://lkml.kernel.org/r/20210304092348.208033-1-y.karadz@gmail.com
Signed-off-by: Yordan Karadzhov (VMware) <y.karadz@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmA6njIQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgprolD/9zWti9LsZvA7yE+PhVwrwF3CsNzLfQlClw
99HaA7HxtAc/VLJrnD/SubhCAPdBC5B2xPv6faajdwF2iUR3Rr1Uc93CQ3uP2KKq
kvm6ALTpzPTMI6YSABhY74sg9BkkoDbMo54JQYVQPleiE+5eDLbuFZck6ObfUHyY
a4aaImlndWp/t14GzrClL4hucF+5KJy846P+QCVclkh0yl8xSsqZ5LIFU7tu3iQb
HpZ5HKLT/2ma/EOr3wknnsIe97AUZQU0q5aMparhYlm+qR511eop3QXx850FL/oC
tEGceKLij6qazmkiocKVzML8Fs+Y9/a4vCMjLCScWJmzDlmKdlH2uudeahN6b9Hm
15qRQHOjl1Hc2bdr5ZVn87nq9RWhSm18C+SRMwOKHCOnEhwxqM3RjRfAgj4BJ6QB
PFbFqdY+8Y1YLPFmn9hph72ePaEcN4L2IXW6TI/WX8mot8ODAnkq9Hr38dKwzO+i
0mon6DVyJKKho6XwvVu5IYurkR2beQprjeVUxwZjjT6DxUgsc+J6itK5LDHFSkeZ
qZlXn5Di8MkiXg0DFJYDQiFXnO0Z5GlRWOGPVfBaOr3x+1dqzDdHGw4oz1oGqvnr
GNNYCsYIpDGm7eauX5lqL5MUFpjqRCceXy5JSHPhnWWw617nYkr4H9jdsV9HiTX1
tQFx05QW3w==
=ccMs
-----END PGP SIGNATURE-----
Merge tag 'block-5.12-2021-02-27' of git://git.kernel.dk/linux-block
Pull more block updates from Jens Axboe:
"A few stragglers (and one due to me missing it originally), and fixes
for changes in this merge window mostly. In particular:
- blktrace cleanups (Chaitanya, Greg)
- Kill dead blk_pm_* functions (Bart)
- Fixes for the bio alloc changes (Christoph)
- Fix for the partition changes (Christoph, Ming)
- Fix for turning off iopoll with polled IO inflight (Jeffle)
- nbd disconnect fix (Josef)
- loop fsync error fix (Mauricio)
- kyber update depth fix (Yang)
- max_sectors alignment fix (Mikulas)
- Add bio_max_segs helper (Matthew)"
* tag 'block-5.12-2021-02-27' of git://git.kernel.dk/linux-block: (21 commits)
block: Add bio_max_segs
blktrace: fix documentation for blk_fill_rw()
block: memory allocations in bounce_clone_bio must not fail
block: remove the gfp_mask argument to bounce_clone_bio
block: fix bounce_clone_bio for passthrough bios
block-crypto-fallback: use a bio_set for splitting bios
block: fix logging on capacity change
blk-settings: align max_sectors on "logical_block_size" boundary
block: reopen the device in blkdev_reread_part
block: don't skip empty device in in disk_uevent
blktrace: remove debugfs file dentries from struct blk_trace
nbd: handle device refs for DESTROY_ON_DISCONNECT properly
kyber: introduce kyber_depth_updated()
loop: fix I/O error on fsync() in detached loop devices
block: fix potential IO hang when turning off io_poll
block: get rid of the trace rq insert wrapper
blktrace: fix blk_rq_merge documentation
blktrace: fix blk_rq_issue documentation
blktrace: add blk_fill_rwbs documentation comment
block: remove superfluous param in blk_fill_rwbs()
...
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmA4JRkQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpoWqD/9dbbqe8L701U6May1A/4hRsqL4THTA2flx
vNCNRBl6XV3l/wBCtL6waKy6tyO4lyM8XdUdEvo3Kxl2kGPb8eVfpyYL/+77HqyH
ctT4RMrs+84Mxn+5N6cM97hS1qVI2moTxxyvOEl/JTB7BYrutz9gvAoeY3/Dto47
J66oSaPeuqJ32TyihxfQHVxQopJcqFzDjyoYHGDu6ATio1PXfaIdTu8ywVYSECAh
pWI4rwnqdurGuHMNpxyL1bA6CT/jC7s+sqU7bUYUCgtYI3eG0u3V0bp5gAQQIgl9
5sxxE3DidYGAkYZsosrelshBtzGddLdz4Qrt2ungMYv8RsGNpFQ095jDPKDwFaZj
bSvSsfplCo7iFsJByb1TtpNEOW8eAwi81PmBDVQ9Oq5P5ygTYno9GBDc/20ql0Fk
q6wcX28coE3IBw44ne0hIwvBOtXV4WJyluG/gqOxfbTH+kOy3pDsN8lWcY/P4X0U
yzdU2MLHe8BNMyYlUiBF47Amzt4ltr85P4XD3WZ4bX71iwri6HvrdGWLuuKwX+Ie
66QiIDDQIYZQ6NMMJWS9DGW3y3DBizpSXGxONbOw1J2bQdNmtToR0D2UnK/9UnKp
msnvkUNk8fkYGS4aptpJ6HxbmjMEG5YtbiGlPj6fz5/7MTvhRjPxt7A0LWrUIdqR
f88+sHUMqg==
=oc8u
-----END PGP SIGNATURE-----
Merge tag 'io_uring-worker.v3-2021-02-25' of git://git.kernel.dk/linux-block
Pull io_uring thread rewrite from Jens Axboe:
"This converts the io-wq workers to be forked off the tasks in question
instead of being kernel threads that assume various bits of the
original task identity.
This kills > 400 lines of code from io_uring/io-wq, and it's the worst
part of the code. We've had several bugs in this area, and the worry
is always that we could be missing some pieces for file types doing
unusual things (recent /dev/tty example comes to mind, userfaultfd
reads installing file descriptors is another fun one... - both of
which need special handling, and I bet it's not the last weird oddity
we'll find).
With these identical workers, we can have full confidence that we're
never missing anything. That, in itself, is a huge win. Outside of
that, it's also more efficient since we're not wasting space and code
on tracking state, or switching between different states.
I'm sure we're going to find little things to patch up after this
series, but testing has been pretty thorough, from the usual
regression suite to production. Any issue that may crop up should be
manageable.
There's also a nice series of further reductions we can do on top of
this, but I wanted to get the meat of it out sooner rather than later.
The general worry here isn't that it's fundamentally broken. Most of
the little issues we've found over the last week have been related to
just changes in how thread startup/exit is done, since that's the main
difference between using kthreads and these kinds of threads. In fact,
if all goes according to plan, I want to get this into the 5.10 and
5.11 stable branches as well.
That said, the changes outside of io_uring/io-wq are:
- arch setup, simple one-liner to each arch copy_thread()
implementation.
- Removal of net and proc restrictions for io_uring, they are no
longer needed or useful"
* tag 'io_uring-worker.v3-2021-02-25' of git://git.kernel.dk/linux-block: (30 commits)
io-wq: remove now unused IO_WQ_BIT_ERROR
io_uring: fix SQPOLL thread handling over exec
io-wq: improve manager/worker handling over exec
io_uring: ensure SQPOLL startup is triggered before error shutdown
io-wq: make buffered file write hashed work map per-ctx
io-wq: fix race around io_worker grabbing
io-wq: fix races around manager/worker creation and task exit
io_uring: ensure io-wq context is always destroyed for tasks
arch: ensure parisc/powerpc handle PF_IO_WORKER in copy_thread()
io_uring: cleanup ->user usage
io-wq: remove nr_process accounting
io_uring: flag new native workers with IORING_FEAT_NATIVE_WORKERS
net: remove cmsg restriction from io_uring based send/recvmsg calls
Revert "proc: don't allow async path resolution of /proc/self components"
Revert "proc: don't allow async path resolution of /proc/thread-self components"
io_uring: move SQPOLL thread io-wq forked worker
io-wq: make io_wq_fork_thread() available to other users
io-wq: only remove worker from free_list, if it was there
io_uring: remove io_identity
io_uring: remove any grabbing of context
...
Pull misc vfs updates from Al Viro:
"Assorted stuff pile - no common topic here"
* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
whack-a-mole: don't open-code iminor/imajor
9p: fix misuse of sscanf() in v9fs_stat2inode()
audit_alloc_mark(): don't open-code ERR_CAST()
fs/inode.c: make inode_init_always() initialize i_ino to 0
vfs: don't unnecessarily clone write access for writable fds
Pull swiotlb updates from Konrad Rzeszutek Wilk:
"Two memory encryption related patches (SWIOTLB is enabled by default
for AMD-SEV):
- Add support for alignment so that NVME can properly work
- Keep track of requested DMA buffers length, as underlaying hardware
devices can trip SWIOTLB to bounce too much and crash the kernel
And a tiny fix to use proper APIs in drivers"
* 'stable/for-linus-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb:
swiotlb: Validate bounce size in the sync/unmap path
nvme-pci: set min_align_mask
swiotlb: respect min_align_mask
swiotlb: don't modify orig_addr in swiotlb_tbl_sync_single
swiotlb: refactor swiotlb_tbl_map_single
swiotlb: clean up swiotlb_tbl_unmap_single
swiotlb: factor out a nr_slots helper
swiotlb: factor out an io_tlb_offset helper
swiotlb: add a IO_TLB_SIZE define
driver core: add a min_align_mask field to struct device_dma_parameters
sdhci: stop poking into swiotlb internals
Alexei Starovoitov says:
====================
pull-request: bpf 2021-02-26
1) Fix for bpf atomic insns with src_reg=r0, from Brendan.
2) Fix use after free due to bpf_prog_clone, from Cong.
3) Drop imprecise verifier log message, from Dmitrii.
4) Remove incorrect blank line in bpf helper description, from Hangbin.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
selftests/bpf: No need to drop the packet when there is no geneve opt
bpf: Remove blank line in bpf helper description comment
tools/resolve_btfids: Fix build error with older host toolchains
selftests/bpf: Fix a compiler warning in global func test
bpf: Drop imprecise log message
bpf: Clear percpu pointers in bpf_prog_clone_free()
bpf: Fix a warning message in mark_ptr_not_null_reg()
bpf, x86: Fix BPF_FETCH atomic and/or/xor with r0 as src
====================
Link: https://lore.kernel.org/r/20210226193737.57004-1-alexei.starovoitov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
- Fix lockdep false alarm on resume-from-cpuidle path
- Fix memory leak in kexec_file
- Fix module linker script to work with GDB
- Fix error code when trying to use uprobes with AArch32 instructions
- Fix late VHE enabling with 64k pages
- Add missing ISBs after TLB invalidation
- Fix seccomp when tracing syscall -1
- Fix stacktrace return code at end of stack
- Fix inconsistent whitespace for pointer return values
- Fix compiler warnings when building with W=1
-----BEGIN PGP SIGNATURE-----
iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmA40kUQHHdpbGxAa2Vy
bmVsLm9yZwAKCRC3rHDchMFjNLMUB/93o3Ucd3SeLLmOziyZMWjxCNcuzXAXDhFH
z0q0Zq8U5+xHaCH+jPASNwS7gT6dMX8E60SlXcvVaHuBaH5zsrZnOtpJ5mZQAQ7E
nR1M5ANfusMJ8uRpDHhy5ymJ4IcE/yn74rapBIeGs1e4vWF60Lb6nSVrEJMNRada
zbRr2z9bMecQPGX+KSWpgYg4dLRpyTo8oSYJiYmyoSczGvXhrFHlnIJeaKrJuvGt
IIhil8l9uZd5j0ucVWGiYgAcAuqzgkH2yEiNbkGRwn0nMK+4HGbXpEuzUm/90p3y
lRLQSvx/hKwerIlodUYbFDx4FMXoFfMRQm/8/6tCBrUn/4exDslZ
=wuLk
-----END PGP SIGNATURE-----
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"The big one is a fix for the VHE enabling path during early boot,
where the code enabling the MMU wasn't necessarily in the identity map
of the new page-tables, resulting in a consistent crash with 64k
pages. In fixing that, we noticed some missing barriers too, so we
added those for the sake of architectural compliance.
Other than that, just the usual merge window trickle. There'll be more
to come, too.
Summary:
- Fix lockdep false alarm on resume-from-cpuidle path
- Fix memory leak in kexec_file
- Fix module linker script to work with GDB
- Fix error code when trying to use uprobes with AArch32 instructions
- Fix late VHE enabling with 64k pages
- Add missing ISBs after TLB invalidation
- Fix seccomp when tracing syscall -1
- Fix stacktrace return code at end of stack
- Fix inconsistent whitespace for pointer return values
- Fix compiler warnings when building with W=1"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: stacktrace: Report when we reach the end of the stack
arm64: ptrace: Fix seccomp of traced syscall -1 (NO_SYSCALL)
arm64: Add missing ISB after invalidating TLB in enter_vhe
arm64: Add missing ISB after invalidating TLB in __primary_switch
arm64: VHE: Enable EL2 MMU from the idmap
KVM: arm64: make the hyp vector table entries local
arm64/mm: Fixed some coding style issues
arm64: uprobe: Return EOPNOTSUPP for AARCH32 instruction probing
kexec: move machine_kexec_post_load() to public interface
arm64 module: set plt* section addresses to 0x0
arm64: kexec_file: fix memory leakage in create_dtb() when fdt_open_into() fails
arm64: spectre: Prevent lockdep splat on v4 mitigation enable path
Currently breakpoints in kernel .init.text section are not handled
correctly while allowing to remove them even after corresponding pages
have been freed.
Fix it via killing .init.text section breakpoints just prior to initmem
pages being freed.
Doug: "HW breakpoints aren't handled by this patch but it's probably
not such a big deal".
Link: https://lkml.kernel.org/r/20210224081652.587785-1-sumit.garg@linaro.org
Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
Suggested-by: Doug Anderson <dianders@chromium.org>
Acked-by: Doug Anderson <dianders@chromium.org>
Acked-by: Daniel Thompson <daniel.thompson@linaro.org>
Tested-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Drop repeated words in kernel/events/.
{if, the, that, with, time}
Drop repeated words in kernel/locking/.
{it, no, the}
Drop repeated words in kernel/sched/.
{in, not}
Link: https://lkml.kernel.org/r/20210127023412.26292-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Will Deacon <will@kernel.org> [kernel/locking/]
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Apart from subsystem specific .proc_handler handler, all ctl_tables with
extra1 and extra2 members set should use proc_dointvec_minmax instead of
proc_dointvec, or the limit set in extra* never work and potentially echo
underflow values(negative numbers) is likely make system unstable.
Especially vfs_cache_pressure and zone_reclaim_mode, -1 is apparently not
a valid value, but we can set to them. And then kernel may crash.
# echo -1 > /proc/sys/vm/vfs_cache_pressure
Link: https://lkml.kernel.org/r/20201223105535.2875-1-linf@wangsu.com
Signed-off-by: Lin Feng <linf@wangsu.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "Add error_report_end tracepoint to KFENCE and KASAN", v3.
This patchset adds a tracepoint, error_repor_end, that is to be used by
KFENCE, KASAN, and potentially other bug detection tools, when they print
an error report. One of the possible use cases is userspace collection of
kernel error reports: interested parties can subscribe to the tracing
event via tracefs, and get notified when an error report occurs.
This patch (of 3):
Introduce error_report_end tracepoint. It can be used in debugging tools
like KASAN, KFENCE, etc. to provide extensions to the error reporting
mechanisms (e.g. allow tests hook into error reporting, ease error report
collection from production kernels). Another benefit would be making use
of ftrace for debugging or benchmarking the tools themselves.
Should we need it, the tracepoint name leaves us with the possibility to
introduce a complementary error_report_start tracepoint in the future.
Link: https://lkml.kernel.org/r/20210121131915.1331302-1-glider@google.com
Link: https://lkml.kernel.org/r/20210121131915.1331302-2-glider@google.com
Signed-off-by: Alexander Potapenko <glider@google.com>
Suggested-by: Marco Elver <elver@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The size of the buffer being bounced is not checked if it happens
to be larger than the size of the mapped buffer. Because the size
can be controlled by a device, as it's the case with virtio devices,
this can lead to memory corruption.
This patch saves the remaining buffer memory for each slab and uses
that information for validation in the sync/unmap paths before
swiotlb_bounce is called.
Validating this argument is important under the threat models of
AMD SEV-SNP and Intel TDX, where the HV is considered untrusted.
Signed-off-by: Martin Radev <martin.b.radev@gmail.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Respect the min_align_mask in struct device_dma_parameters in swiotlb.
There are two parts to it:
1) for the lower bits of the alignment inside the io tlb slot, just
extent the size of the allocation and leave the start of the slot
empty
2) for the high bits ensure we find a slot that matches the high bits
of the alignment to avoid wasting too much memory
Based on an earlier patch from Jianxiong Gao <jxgao@google.com>.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jianxiong Gao <jxgao@google.com>
Tested-by: Jianxiong Gao <jxgao@google.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
- Fix false-positive build warnings for ARCH=ia64 builds
- Optimize dictionary size for module compression with xz
- Check the compiler and linker versions in Kconfig
- Fix misuse of extra-y
- Support DWARF v5 debug info
- Clamp SUBLEVEL to 255 because stable releases 4.4.x and 4.9.x
exceeded the limit
- Add generic syscall{tbl,hdr}.sh for cleanups across arches
- Minor cleanups of genksyms
- Minor cleanups of Kconfig
-----BEGIN PGP SIGNATURE-----
iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmA3zhgVHG1hc2FoaXJv
eUBrZXJuZWwub3JnAAoJED2LAQed4NsG0C4P/A5hUNFdkYI+EffAWZiHn69t0S8j
M1GQkZildKu/yOfm6hp3mNwgHmYgw0aAuch1htkJuv+5rXRtoK77yw0xKbUqNHyO
VqkJWQPVUXJbWIDiu332NaETHbFTWCnPZKGmzcbVOBHbYsXUJPp17gROQ9ke0fQN
Ae6OV5WINhoS8UnjESWb3qOO87MdQTZ+9mP+NMnVh4kV1SUeMAXLFwFll66KZTkj
GXB330N3p9L0wQVljhXpQ/YPOd76wJNPhJWJ9+hKLFbWsedovzlHb+duprh1z1xe
7LLaq9dEbXxe1Uz0qmK76lupXxilYMyUupTW9HIYtIsY8br8DIoBOG0bn46LVnuL
/m+UQNfUFCYYePT7iZQNNc1DISQJrxme3bjq0PJzZTDukNnHJVahnj9x4RoNaF8j
Dc+JME0r2i8Ccp28vgmaRgzvSsb8Xtw5icwRdwzIpyt1ubs/+tkd/GSaGzQo30Q8
m8y1WOjovHNX7OGnOaOWBGoQAX/2k/VHeAediMsPqWUoOxwsLHYxG/4KtgwbJ5vc
gu/Fyk1GRDklZPpLdYFVvz8TGnqSDogJgF+7WolJ6YvPGAUIDAfd5Ky2sWayddlm
wchc3sKDVyh3lov23h0WQVTvLO9xl+NZ6THxoAGdYeQ0DUu5OxwH8qje/UpWuo1a
DchhNN+g5pa6n56Z
=sLxb
-----END PGP SIGNATURE-----
Merge tag 'kbuild-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:
- Fix false-positive build warnings for ARCH=ia64 builds
- Optimize dictionary size for module compression with xz
- Check the compiler and linker versions in Kconfig
- Fix misuse of extra-y
- Support DWARF v5 debug info
- Clamp SUBLEVEL to 255 because stable releases 4.4.x and 4.9.x
exceeded the limit
- Add generic syscall{tbl,hdr}.sh for cleanups across arches
- Minor cleanups of genksyms
- Minor cleanups of Kconfig
* tag 'kbuild-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (38 commits)
initramfs: Remove redundant dependency of RD_ZSTD on BLK_DEV_INITRD
kbuild: remove deprecated 'always' and 'hostprogs-y/m'
kbuild: parse C= and M= before changing the working directory
kbuild: reuse this-makefile to define abs_srctree
kconfig: unify rule of config, menuconfig, nconfig, gconfig, xconfig
kconfig: omit --oldaskconfig option for 'make config'
kconfig: fix 'invalid option' for help option
kconfig: remove dead code in conf_askvalue()
kconfig: clean up nested if-conditionals in check_conf()
kconfig: Remove duplicate call to sym_get_string_value()
Makefile: Remove # characters from compiler string
Makefile: reuse CC_VERSION_TEXT
kbuild: check the minimum linker version in Kconfig
kbuild: remove ld-version macro
scripts: add generic syscallhdr.sh
scripts: add generic syscalltbl.sh
arch: syscalls: remove $(srctree)/ prefix from syscall tables
arch: syscalls: add missing FORCE and fix 'targets' to make if_changed work
gen_compile_commands: prune some directories
kbuild: simplify access to the kernel's version
...
The irq stack switching was moved out of the ASM entry code in course of
the entry code consolidation. It ended up being suboptimal in various
ways.
- Make the stack switching inline so the stackpointer manipulation is not
longer at an easy to find place.
- Get rid of the unnecessary indirect call.
- Avoid the double stack switching in interrupt return and reuse the
interrupt stack for softirq handling.
- A objtool fix for CONFIG_FRAME_POINTER=y builds where it got confused
about the stack pointer manipulation.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmA21OcTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoaX0D/9S0ud6oqbsIvI8LwhvYub63a2cjKP9
liHAJ7xwMYYVwzf0skwsPb/QE6+onCzdq0upJkgG/gEYm2KbiaMWZ4GgHdj0O7ER
qXKJONDd36AGxSEdaVzLY5kPuD/mkomGk5QdaZaTmjruthkNzg4y/N2wXUBIMZR0
FdpSpp5fGspSZCn/DXDx6FjClwpLI53VclvDs6DcZ2DIBA0K+F/cSLb1UQoDLE1U
hxGeuNa+GhKeeZ5C+q5giho1+ukbwtjMW9WnKHAVNiStjm0uzdqq7ERGi/REvkcB
LY62u5uOSW1zIBMmzUjDDQEqvypB0iFxFCpN8g9sieZjA0zkaUioRTQyR+YIQ8Cp
l8LLir0dVQivR1bHghHDKQJUpdw/4zvDj4mMH10XHqbcOtIxJDOJHC5D00ridsAz
OK0RlbAJBl9FTdLNfdVReBCoehYAO8oefeyMAG12nZeSh5XVUWl238rvzmzIYNhG
cEtkSx2wIUNEA+uSuI+xvfmwpxL7voTGvqmiRDCAFxyO7Bl/GBu9OEBFA1eOvHB+
+wTmPDMswRetQNh4QCRXzk1JzP1Wk5CobUL9iinCWFoTJmnsPPSOWlosN6ewaNXt
kYFpRLy5xt9EP7dlfgBSjiRlthDhTdMrFjD5bsy1vdm1w7HKUo82lHa4O8Hq3PHS
tinKICUqRsbjig==
=Sqr1
-----END PGP SIGNATURE-----
Merge tag 'x86-entry-2021-02-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 irq entry updates from Thomas Gleixner:
"The irq stack switching was moved out of the ASM entry code in course
of the entry code consolidation. It ended up being suboptimal in
various ways.
This reworks the X86 irq stack handling:
- Make the stack switching inline so the stackpointer manipulation is
not longer at an easy to find place.
- Get rid of the unnecessary indirect call.
- Avoid the double stack switching in interrupt return and reuse the
interrupt stack for softirq handling.
- A objtool fix for CONFIG_FRAME_POINTER=y builds where it got
confused about the stack pointer manipulation"
* tag 'x86-entry-2021-02-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool: Fix stack-swizzle for FRAME_POINTER=y
um: Enforce the usage of asm-generic/softirq_stack.h
x86/softirq/64: Inline do_softirq_own_stack()
softirq: Move do_softirq_own_stack() to generic asm header
softirq: Move __ARCH_HAS_DO_SOFTIRQ to Kconfig
x86: Select CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK
x86/softirq: Remove indirection in do_softirq_own_stack()
x86/entry: Use run_sysvec_on_irqstack_cond() for XEN upcall
x86/entry: Convert device interrupts to inline stack switching
x86/entry: Convert system vectors to irq stack macro
x86/irq: Provide macro for inlining irq stack switching
x86/apic: Split out spurious handling code
x86/irq/64: Adjust the per CPU irq stack pointer by 8
x86/irq: Sanitize irq stack tracking
x86/entry: Fix instrumentation annotation
Here is the "big" driver core and debugfs update for 5.12-rc1
This set of driver core patches caused a bunch of problems in linux-next
for the past few weeks, when Saravana tried to set fw_devlink=on as the
default functionality. This caused a number of systems to stop booting,
and lots of bugs were fixed in this area for almost all of the reported
systems, but this option is not ready to be turned on just yet for the
default operation based on this testing, so I've reverted that change at
the very end so we don't have to worry about regressions in 5.12. We
will try to turn this on for 5.13 if testing goes better over the next
few months.
Other than the fixes caused by the fw_devlink testing in here, there's
not much more:
- debugfs fixes for invalid input into debugfs_lookup()
- kerneldoc cleanups
- warn message if platform drivers return an error on their
remove callback (a futile effort, but good to catch).
All of these have been in linux-next for a while now, and the
regressions have gone away with the revert of the fw_devlink change.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYDZhzA8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ylS2wCfU28FxDWNwcWhPFVfRT8Mb3OxZ50An1sR4lNR
t5Ie4aztMUjVJhI9bq6g
=3NSB
-----END PGP SIGNATURE-----
Merge tag 'driver-core-5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core / debugfs update from Greg KH:
"Here is the "big" driver core and debugfs update for 5.12-rc1
This set of driver core patches caused a bunch of problems in
linux-next for the past few weeks, when Saravana tried to set
fw_devlink=on as the default functionality. This caused a number of
systems to stop booting, and lots of bugs were fixed in this area for
almost all of the reported systems, but this option is not ready to be
turned on just yet for the default operation based on this testing, so
I've reverted that change at the very end so we don't have to worry
about regressions in 5.12
We will try to turn this on for 5.13 if testing goes better over the
next few months.
Other than the fixes caused by the fw_devlink testing in here, there's
not much more:
- debugfs fixes for invalid input into debugfs_lookup()
- kerneldoc cleanups
- warn message if platform drivers return an error on their remove
callback (a futile effort, but good to catch).
All of these have been in linux-next for a while now, and the
regressions have gone away with the revert of the fw_devlink change"
* tag 'driver-core-5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (35 commits)
Revert "driver core: Set fw_devlink=on by default"
of: property: fw_devlink: Ignore interrupts property for some configs
debugfs: do not attempt to create a new file before the filesystem is initalized
debugfs: be more robust at handling improper input in debugfs_lookup()
driver core: auxiliary bus: Fix calling stage for auxiliary bus init
of: irq: Fix the return value for of_irq_parse_one() stub
of: irq: make a stub for of_irq_parse_one()
clk: Mark fwnodes when their clock provider is added/removed
PM: domains: Mark fwnodes when their powerdomain is added/removed
irqdomain: Mark fwnodes when their irqdomain is added/removed
driver core: fw_devlink: Handle suppliers that don't use driver core
of: property: Add fw_devlink support for optional properties
driver core: Add fw_devlink.strict kernel param
of: property: Don't add links to absent suppliers
driver core: fw_devlink: Detect supplier devices that will never be added
driver core: platform: Emit a warning if a remove callback returned non-zero
of: property: Fix fw_devlink handling of interrupts/interrupts-extended
gpiolib: Don't probe gpio_device if it's not the primary device
device.h: Remove bogus "the" in kerneldoc
gpiolib: Bind gpio_device to a driver to enable fw_devlink=on by default
...
- add support to emulate processing delays in the DMA API benchmark
selftest (Barry Song)
- remove support for non-contiguous noncoherent allocations,
which aren't used and will be replaced by a different API
-----BEGIN PGP SIGNATURE-----
iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAmA2A7gLHGhjaEBsc3Qu
ZGUACgkQD55TZVIEUYMebw//bkSZ1v1FvGgMd+AQKKnNz+iNHH0MJAlEDhPCynFM
QCPg6OtU9IU/5nmyQlO3rgZ1IW+qABCF36TqjPZar6STuTv3dzfvv9xydyOqdPNA
ekFzc9FnjvWt4wzL+1pXiB/EfjKDudGAjlMyLhghl653HcLnLvE3LxgpfBMrUHbH
DfSBTXt4fTK4ck8ZO6FW2LXOtLgmJvk+qglO1vs9GQv/zcRHXYkIyvqMYTlHwBlh
Ltfl+kJzFHQ3taIo3utCeS5Qzctd6tbxy/Me4OHl2VydNAi8awQz4HX4yZyWYxl5
WpIGhHfD9ROKnGroaEhetUO4OczOXiqYdkt6tt5iAAUW2TFA+mgbvph3+Di/zxgl
4IxOQyhdWA38IA00YmNsoPafuuqC7WwASUfCufg+30MgHR3bpM7GyY5X84DIh3tm
wlPJBMl2RqWnfxmmvjPYxV2wtN3TkA8KJN/xVcUE8aWL2mV50l1/nDdlvCbmjg60
pQt1cGP8A2hODYwLHTzadm67xc0cLrkC8nQbrnDo/FAKGmDD3aHhS95TAIr+ZoeK
cgSFHNkJ1UcJ6nosCB3/MPlIJo1noAIeJnmuOIfhJn0uIof4CGQ5XQgWmJeHFLqO
GlwtJAN3F3db4dxMQNn5br049wob7fgFWqMPfTGy51bZ5BClUKWGSpEonavpUMd1
oKM=
=papz
-----END PGP SIGNATURE-----
Merge tag 'dma-mapping-5.12' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping updates from Christoph Hellwig:
- add support to emulate processing delays in the DMA API benchmark
selftest (Barry Song)
- remove support for non-contiguous noncoherent allocations, which
aren't used and will be replaced by a different API
* tag 'dma-mapping-5.12' of git://git.infradead.org/users/hch/dma-mapping:
dma-mapping: remove the {alloc,free}_noncoherent methods
dma-mapping: benchmark: pretend DMA is transmitting
Add missing ":" after rwbs function parameter documentation that fixes
following warning :-
./kernel/trace/blktrace.c:1877: warning: Function parameter or member 'rwbs' not described in 'blk_fill_rwbs'
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: 1f83bb4b49 ("blktrace: add blk_fill_rwbs documentation comment")
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Now it is possible for global function to have a pointer argument that
points to something different than struct. Drop the irrelevant log
message and keep the logic same.
Fixes: e5069b9c23 ("bpf: Support pointers in global func args")
Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20210223090416.333943-1-me@ubique.spb.ru
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEqG5UsNXhtOCrfGQP+7dXa6fLC2sFAmAj3ncACgkQ+7dXa6fL
C2s7eQ/+Obr0Mp9mYJhht/LN3YAIgFrgyPCgwsmYsanc0j8cdECDMoz6b287/W3g
69zHQUv7iVqHPIK+NntBSSpHKlCapfUKikt5c9kfPNuDn3aT3ZpTBr1t3DYJX1uO
K6tMUXNDNoi1O70yqsVZEq4Qcv2+1uQXP+F/GxjNkd/brID1HsV/VENKCLSRbyP/
iazgXx/hChQSdu0YbZwMCkuVErEAJvRWU75l9D1v1Uaaaqro5QdelMdz9DZeO4E5
CirXXA5d9zAA9ANj0T7odyg79vhFOz8yc0lFhybc/EPNYSHeOV1o8eK3h4ZIZ+hl
BShwe7feHlmxkQ5WQBppjAn+aFiBtw7LKIptS3YpMI5M7clgT1THDPhgOdVWmbZk
sBbD0bToP8sst6Zi/95StbqawjagR3uE6YBXRVSyTefGQdG1q1c0u9FM/8bZTc3B
q4iDTbvfYdUFN6ywQZhh09v6ljZLdNSv0ht1wLcgByBmgdBvzmBgfczEKtAZcxfY
cLBRvjc8ZjWpfqjrvmmURGQaqwVlO9YBGRzJJwALH9xib1IQbuVmUOilaIGTcCiE
W1Qd4YLPh8Gv1B9GDY2HMw56IGp75QHD56KwIbf93c8JeEB08/iWSuH+kKwyup8+
h5xXpzt5NKAx4GQesWeBjWvt+AmZ+uJDtt4dNb/j91gmbh3POTI=
=HCrJ
-----END PGP SIGNATURE-----
Merge tag 'keys-misc-20210126' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
Pull keyring updates from David Howells:
"Here's a set of minor keyrings fixes/cleanups that I've collected from
various people for the upcoming merge window.
A couple of them might, in theory, be visible to userspace:
- Make blacklist_vet_description() reject uppercase letters as they
don't match the all-lowercase hex string generated for a blacklist
search.
This may want reconsideration in the future, but, currently, you
can't add to the blacklist keyring from userspace and the only
source of blacklist keys generates lowercase descriptions.
- Fix blacklist_init() to use a new KEY_ALLOC_* flag to indicate that
it wants KEY_FLAG_KEEP to be set rather than passing KEY_FLAG_KEEP
into keyring_alloc() as KEY_FLAG_KEEP isn't a valid alloc flag.
This isn't currently a problem as the blacklist keyring isn't
currently writable by userspace.
The rest of the patches are cleanups and I don't think they should
have any visible effect"
* tag 'keys-misc-20210126' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
watch_queue: rectify kernel-doc for init_watch()
certs: Replace K{U,G}IDT_INIT() with GLOBAL_ROOT_{U,G}ID
certs: Fix blacklist flag type confusion
PKCS#7: Fix missing include
certs: Fix blacklisted hexadecimal hash string check
certs/blacklist: fix kernel doc interface issue
crypto: public_key: Remove redundant header file from public_key.h
keys: remove trailing semicolon in macro definition
crypto: pkcs7: Use match_string() helper to simplify the code
PKCS#7: drop function from kernel-doc pkcs7_validate_trust_one
encrypted-keys: Replace HTTP links with HTTPS ones
crypto: asymmetric_keys: fix some comments in pkcs7_parser.h
KEYS: remove redundant memset
security: keys: delete repeated words in comments
KEYS: asymmetric: Fix kerneldoc
security/keys: use kvfree_sensitive()
watch_queue: Drop references to /dev/watch_queue
keys: Remove outdated __user annotations
security: keys: Fix fall-through warnings for Clang
- Generate __mcount_loc in objtool (Peter Zijlstra)
- Support running objtool against vmlinux.o (Sami Tolvanen)
- Clang LTO enablement for x86 (Sami Tolvanen)
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmA1fn8ACgkQiXL039xt
wCbswQ//Zmnq912Ubyn5uPe9SOS/kumGDoqtxGzlZwo/pSB3qFArhD6G07sJ49XD
nu/05ZcOda760wubnhcuK91n2fY5i/eGLXMSjfgtdVcco4Q67nPQydc+LGdhuDco
FlhL8TAIwqYN1f2nJK1IggZpZFxz5r/r1Pq8q1S0oQRqDenxDBQwNtBba4B1OIxw
/FE/1Hp3xwRnuJEP2jREBeY1yQ+Y1n859pZcDgSOWlTArcp8EVUi5hIWJ9DwIe73
mqnx6PcFWEYB0zLNZmZz2gpEac+ncGyme6ChayeuQfInbL5dhx97jFGt3S6/+NSY
mF2zyaR/+JsGGuM8dVqH3izKCJXCEAGirrdMO1ndb9HdwS3KnYEiag2ciNWL0wm3
UEM4r0i2B14sU3pkyotKgsJdOSgorMKkQUPb2wW+OUfnkZNEWKLqylMgNXBD80l4
WG5vYQRwwFN9jRBik6Z5YFGnwGsNIoGg1F1GRNMjh6h51adYQeBN/1QJE1FJ5L4D
iKzmZYqimKUINXWfI6TNyqiv9TctOt65pxnRyq+MHxfTDzHGyc3MUeCeCiR1a1yI
S5QhcgfSnC/NjDA0+oYC6yRlcBtfhjtUqFTGoZ4q4q/LF1BVU1bPyIXZrROLc05s
LNMMBcWbJetJxFtm/gYfiVFuNitYtxbBV1krVtsWznCA2nKGJ9w=
=htKJ
-----END PGP SIGNATURE-----
Merge tag 'clang-lto-v5.12-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull more clang LTO updates from Kees Cook:
"Clang LTO x86 enablement.
Full disclosure: while this has _not_ been in linux-next (since it
initially looked like the objtool dependencies weren't going to make
v5.12), it has been under daily build and runtime testing by Sami for
quite some time. These x86 portions have been discussed on lkml, with
Peter, Josh, and others helping nail things down.
The bulk of the changes are to get objtool working happily. The rest
of the x86 enablement is very small.
Summary:
- Generate __mcount_loc in objtool (Peter Zijlstra)
- Support running objtool against vmlinux.o (Sami Tolvanen)
- Clang LTO enablement for x86 (Sami Tolvanen)"
Link: https://lore.kernel.org/lkml/20201013003203.4168817-26-samitolvanen@google.com/
Link: https://lore.kernel.org/lkml/cover.1611263461.git.jpoimboe@redhat.com/
* tag 'clang-lto-v5.12-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
kbuild: lto: force rebuilds when switching CONFIG_LTO
x86, build: allow LTO to be selected
x86, cpu: disable LTO for cpu.c
x86, vdso: disable LTO only for vDSO
kbuild: lto: postpone objtool
objtool: Split noinstr validation from --vmlinux
x86, build: use objtool mcount
tracing: add support for objtool mcount
objtool: Don't autodetect vmlinux.o
objtool: Fix __mcount_loc generation with Clang's assembler
objtool: Add a pass for generating __mcount_loc
- Address cpufreq regression introduced in 5.11 that causes
CPU frequency reporting to be distorted on systems with CPPC
that use acpi-cpufreq as the scaling driver (Rafael Wysocki).
- Fix regression introduced during the 5.10 development cycle
related to CPU hotplug and policy recreation in the
qcom-cpufreq-hw driver (Shawn Guo).
- Fix recent regression in the operating performance points (OPP)
framework that may cause frequency updates to be skipped by
mistake in some cases (Jonathan Marek).
- Simplify schedutil governor code and remove a misleading comment
from it (Yue Hu).
- Fix kerneldoc comment typo in the cpufreq core (Yue Hu).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmA1UtMSHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxezIP/2oBj9fFBSLEB6NL24hO1O7Te2Jbdmpq
RZbGu712eVeB+2dp7jApofwIaBuqRIB9gZBPwyIpEl9c4PbvQ8xARBfxUTBneWuG
0+y8t9YDHnTxTz2erh6/OkbCEfqijnpWqHtt9A5OiFvPT2zyjCRZ2W/+UJ66QF+O
Dl79CyiDotwbMlZnYGTJSxRTia4OFT3U9qc5H0KBCDIWKCE47XpwnLDAuPu9ClY+
YW3Tp58yc/3eRcYIexjovmHN/TAF6yFMhVX2q/EGdmAraMM5+bQvymbjtA5LvQlj
q68wSRa92KBxf+VVQf3Bv9gyFCgfZLz3lYSRCf/xKs4xcsA3j1PdV8QGO15rFtuG
paJ+T74YAzOm4ntihU+QusCJwYpXMn87BKpCEdsVkV3bJLNWlC/9wDwlXgNvOi+0
/pzNGSCfJjyG6vXb5G2WC+iDLX1BKdLS3+adCzfMHgU2dS3kUjCUDDA400xYmW/B
yNpjU6hUOqNLA2LWRgteuKP/psjJEQH6mwWWXuXsjFf+wGCHIc0U2t73LYR+JCgZ
K43VsxIu2J7QWjSV9Nzff1yVNpJBlMnXr0jVQuvHh9Rkc4qvk2yU0SHEeuCXexFL
rcapniJ3/1DbBK93+1ObENjbtq4XF/1FQhNRhcQew7Do54NmjuGRc1lEu+q3hbcs
5Gldg/M97C62
=PT0e
-----END PGP SIGNATURE-----
Merge tag 'pm-5.12-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more power management updates from Rafael Wysocki:
"These are fixes and cleanups on top of the power management material
for 5.12-rc1 merged previously.
Specifics:
- Address cpufreq regression introduced in 5.11 that causes CPU
frequency reporting to be distorted on systems with CPPC that use
acpi-cpufreq as the scaling driver (Rafael Wysocki).
- Fix regression introduced during the 5.10 development cycle related
to CPU hotplug and policy recreation in the qcom-cpufreq-hw driver
(Shawn Guo).
- Fix recent regression in the operating performance points (OPP)
framework that may cause frequency updates to be skipped by mistake
in some cases (Jonathan Marek).
- Simplify schedutil governor code and remove a misleading comment
from it (Yue Hu).
- Fix kerneldoc comment typo in the cpufreq core (Yue Hu)"
* tag 'pm-5.12-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: Fix typo in kerneldoc comment
cpufreq: schedutil: Remove update_lock comment from struct sugov_policy definition
cpufreq: schedutil: Remove needless sg_policy parameter from ignore_dl_rate_limit()
cpufreq: ACPI: Set cpuinfo.max_freq directly if max boost is known
cpufreq: qcom-hw: drop devm_xxx() calls from init/exit hooks
opp: Don't skip freq update for different frequency
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCYCegywAKCRCRxhvAZXjc
ouJ6AQDlf+7jCQlQdeKKoN9QDFfMzG1ooemat36EpRRTONaGuAD8D9A4sUsG4+5f
4IU5Lj9oY4DEmF8HenbWK2ZHsesL2Qg=
=yPaw
-----END PGP SIGNATURE-----
Merge tag 'idmapped-mounts-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux
Pull idmapped mounts from Christian Brauner:
"This introduces idmapped mounts which has been in the making for some
time. Simply put, different mounts can expose the same file or
directory with different ownership. This initial implementation comes
with ports for fat, ext4 and with Christoph's port for xfs with more
filesystems being actively worked on by independent people and
maintainers.
Idmapping mounts handle a wide range of long standing use-cases. Here
are just a few:
- Idmapped mounts make it possible to easily share files between
multiple users or multiple machines especially in complex
scenarios. For example, idmapped mounts will be used in the
implementation of portable home directories in
systemd-homed.service(8) where they allow users to move their home
directory to an external storage device and use it on multiple
computers where they are assigned different uids and gids. This
effectively makes it possible to assign random uids and gids at
login time.
- It is possible to share files from the host with unprivileged
containers without having to change ownership permanently through
chown(2).
- It is possible to idmap a container's rootfs and without having to
mangle every file. For example, Chromebooks use it to share the
user's Download folder with their unprivileged containers in their
Linux subsystem.
- It is possible to share files between containers with
non-overlapping idmappings.
- Filesystem that lack a proper concept of ownership such as fat can
use idmapped mounts to implement discretionary access (DAC)
permission checking.
- They allow users to efficiently changing ownership on a per-mount
basis without having to (recursively) chown(2) all files. In
contrast to chown (2) changing ownership of large sets of files is
instantenous with idmapped mounts. This is especially useful when
ownership of a whole root filesystem of a virtual machine or
container is changed. With idmapped mounts a single syscall
mount_setattr syscall will be sufficient to change the ownership of
all files.
- Idmapped mounts always take the current ownership into account as
idmappings specify what a given uid or gid is supposed to be mapped
to. This contrasts with the chown(2) syscall which cannot by itself
take the current ownership of the files it changes into account. It
simply changes the ownership to the specified uid and gid. This is
especially problematic when recursively chown(2)ing a large set of
files which is commong with the aforementioned portable home
directory and container and vm scenario.
- Idmapped mounts allow to change ownership locally, restricting it
to specific mounts, and temporarily as the ownership changes only
apply as long as the mount exists.
Several userspace projects have either already put up patches and
pull-requests for this feature or will do so should you decide to pull
this:
- systemd: In a wide variety of scenarios but especially right away
in their implementation of portable home directories.
https://systemd.io/HOME_DIRECTORY/
- container runtimes: containerd, runC, LXD:To share data between
host and unprivileged containers, unprivileged and privileged
containers, etc. The pull request for idmapped mounts support in
containerd, the default Kubernetes runtime is already up for quite
a while now: https://github.com/containerd/containerd/pull/4734
- The virtio-fs developers and several users have expressed interest
in using this feature with virtual machines once virtio-fs is
ported.
- ChromeOS: Sharing host-directories with unprivileged containers.
I've tightly synced with all those projects and all of those listed
here have also expressed their need/desire for this feature on the
mailing list. For more info on how people use this there's a bunch of
talks about this too. Here's just two recent ones:
https://www.cncf.io/wp-content/uploads/2020/12/Rootless-Containers-in-Gitpod.pdfhttps://fosdem.org/2021/schedule/event/containers_idmap/
This comes with an extensive xfstests suite covering both ext4 and
xfs:
https://git.kernel.org/brauner/xfstests-dev/h/idmapped_mounts
It covers truncation, creation, opening, xattrs, vfscaps, setid
execution, setgid inheritance and more both with idmapped and
non-idmapped mounts. It already helped to discover an unrelated xfs
setgid inheritance bug which has since been fixed in mainline. It will
be sent for inclusion with the xfstests project should you decide to
merge this.
In order to support per-mount idmappings vfsmounts are marked with
user namespaces. The idmapping of the user namespace will be used to
map the ids of vfs objects when they are accessed through that mount.
By default all vfsmounts are marked with the initial user namespace.
The initial user namespace is used to indicate that a mount is not
idmapped. All operations behave as before and this is verified in the
testsuite.
Based on prior discussions we want to attach the whole user namespace
and not just a dedicated idmapping struct. This allows us to reuse all
the helpers that already exist for dealing with idmappings instead of
introducing a whole new range of helpers. In addition, if we decide in
the future that we are confident enough to enable unprivileged users
to setup idmapped mounts the permission checking can take into account
whether the caller is privileged in the user namespace the mount is
currently marked with.
The user namespace the mount will be marked with can be specified by
passing a file descriptor refering to the user namespace as an
argument to the new mount_setattr() syscall together with the new
MOUNT_ATTR_IDMAP flag. The system call follows the openat2() pattern
of extensibility.
The following conditions must be met in order to create an idmapped
mount:
- The caller must currently have the CAP_SYS_ADMIN capability in the
user namespace the underlying filesystem has been mounted in.
- The underlying filesystem must support idmapped mounts.
- The mount must not already be idmapped. This also implies that the
idmapping of a mount cannot be altered once it has been idmapped.
- The mount must be a detached/anonymous mount, i.e. it must have
been created by calling open_tree() with the OPEN_TREE_CLONE flag
and it must not already have been visible in the filesystem.
The last two points guarantee easier semantics for userspace and the
kernel and make the implementation significantly simpler.
By default vfsmounts are marked with the initial user namespace and no
behavioral or performance changes are observed.
The manpage with a detailed description can be found here:
1d7b902e28
In order to support idmapped mounts, filesystems need to be changed
and mark themselves with the FS_ALLOW_IDMAP flag in fs_flags. The
patches to convert individual filesystem are not very large or
complicated overall as can be seen from the included fat, ext4, and
xfs ports. Patches for other filesystems are actively worked on and
will be sent out separately. The xfstestsuite can be used to verify
that port has been done correctly.
The mount_setattr() syscall is motivated independent of the idmapped
mounts patches and it's been around since July 2019. One of the most
valuable features of the new mount api is the ability to perform
mounts based on file descriptors only.
Together with the lookup restrictions available in the openat2()
RESOLVE_* flag namespace which we added in v5.6 this is the first time
we are close to hardened and race-free (e.g. symlinks) mounting and
path resolution.
While userspace has started porting to the new mount api to mount
proper filesystems and create new bind-mounts it is currently not
possible to change mount options of an already existing bind mount in
the new mount api since the mount_setattr() syscall is missing.
With the addition of the mount_setattr() syscall we remove this last
restriction and userspace can now fully port to the new mount api,
covering every use-case the old mount api could. We also add the
crucial ability to recursively change mount options for a whole mount
tree, both removing and adding mount options at the same time. This
syscall has been requested multiple times by various people and
projects.
There is a simple tool available at
https://github.com/brauner/mount-idmapped
that allows to create idmapped mounts so people can play with this
patch series. I'll add support for the regular mount binary should you
decide to pull this in the following weeks:
Here's an example to a simple idmapped mount of another user's home
directory:
u1001@f2-vm:/$ sudo ./mount --idmap both:1000:1001:1 /home/ubuntu/ /mnt
u1001@f2-vm:/$ ls -al /home/ubuntu/
total 28
drwxr-xr-x 2 ubuntu ubuntu 4096 Oct 28 22:07 .
drwxr-xr-x 4 root root 4096 Oct 28 04:00 ..
-rw------- 1 ubuntu ubuntu 3154 Oct 28 22:12 .bash_history
-rw-r--r-- 1 ubuntu ubuntu 220 Feb 25 2020 .bash_logout
-rw-r--r-- 1 ubuntu ubuntu 3771 Feb 25 2020 .bashrc
-rw-r--r-- 1 ubuntu ubuntu 807 Feb 25 2020 .profile
-rw-r--r-- 1 ubuntu ubuntu 0 Oct 16 16:11 .sudo_as_admin_successful
-rw------- 1 ubuntu ubuntu 1144 Oct 28 00:43 .viminfo
u1001@f2-vm:/$ ls -al /mnt/
total 28
drwxr-xr-x 2 u1001 u1001 4096 Oct 28 22:07 .
drwxr-xr-x 29 root root 4096 Oct 28 22:01 ..
-rw------- 1 u1001 u1001 3154 Oct 28 22:12 .bash_history
-rw-r--r-- 1 u1001 u1001 220 Feb 25 2020 .bash_logout
-rw-r--r-- 1 u1001 u1001 3771 Feb 25 2020 .bashrc
-rw-r--r-- 1 u1001 u1001 807 Feb 25 2020 .profile
-rw-r--r-- 1 u1001 u1001 0 Oct 16 16:11 .sudo_as_admin_successful
-rw------- 1 u1001 u1001 1144 Oct 28 00:43 .viminfo
u1001@f2-vm:/$ touch /mnt/my-file
u1001@f2-vm:/$ setfacl -m u:1001:rwx /mnt/my-file
u1001@f2-vm:/$ sudo setcap -n 1001 cap_net_raw+ep /mnt/my-file
u1001@f2-vm:/$ ls -al /mnt/my-file
-rw-rwxr--+ 1 u1001 u1001 0 Oct 28 22:14 /mnt/my-file
u1001@f2-vm:/$ ls -al /home/ubuntu/my-file
-rw-rwxr--+ 1 ubuntu ubuntu 0 Oct 28 22:14 /home/ubuntu/my-file
u1001@f2-vm:/$ getfacl /mnt/my-file
getfacl: Removing leading '/' from absolute path names
# file: mnt/my-file
# owner: u1001
# group: u1001
user::rw-
user:u1001:rwx
group::rw-
mask::rwx
other::r--
u1001@f2-vm:/$ getfacl /home/ubuntu/my-file
getfacl: Removing leading '/' from absolute path names
# file: home/ubuntu/my-file
# owner: ubuntu
# group: ubuntu
user::rw-
user:ubuntu:rwx
group::rw-
mask::rwx
other::r--"
* tag 'idmapped-mounts-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: (41 commits)
xfs: remove the possibly unused mp variable in xfs_file_compat_ioctl
xfs: support idmapped mounts
ext4: support idmapped mounts
fat: handle idmapped mounts
tests: add mount_setattr() selftests
fs: introduce MOUNT_ATTR_IDMAP
fs: add mount_setattr()
fs: add attr_flags_to_mnt_flags helper
fs: split out functions to hold writers
namespace: only take read lock in do_reconfigure_mnt()
mount: make {lock,unlock}_mount_hash() static
namespace: take lock_mount_hash() directly when changing flags
nfs: do not export idmapped mounts
overlayfs: do not mount on top of idmapped mounts
ecryptfs: do not mount on top of idmapped mounts
ima: handle idmapped mounts
apparmor: handle idmapped mounts
fs: make helpers idmap mount aware
exec: handle idmapped mounts
would_dump: handle idmapped mounts
...
* pm-cpufreq:
cpufreq: Fix typo in kerneldoc comment
cpufreq: schedutil: Remove update_lock comment from struct sugov_policy definition
cpufreq: schedutil: Remove needless sg_policy parameter from ignore_dl_rate_limit()
cpufreq: ACPI: Set cpuinfo.max_freq directly if max boost is known
cpufreq: qcom-hw: drop devm_xxx() calls from init/exit hooks
* pm-opp:
opp: Don't skip freq update for different frequency
Summary of modules changes for the 5.12 merge window:
- Retire EXPORT_UNUSED_SYMBOL() and EXPORT_SYMBOL_GPL_FUTURE(). These export
types were introduced between 2006 - 2008. All the of the unused symbols have
been long removed and gpl future symbols were converted to gpl quite a long
time ago, and I don't believe these export types have been used ever since.
So, I think it should be safe to retire those export types now. (Christoph Hellwig)
- Refactor and clean up some aged code cruft in the module loader (Christoph Hellwig)
- Build {,module_}kallsyms_on_each_symbol only when livepatching is enabled, as
it is the only caller (Christoph Hellwig)
- Unexport find_module() and module_mutex and fix the last module
callers to not rely on these anymore. Make module_mutex internal to
the module loader. (Christoph Hellwig)
- Harden ELF checks on module load and validate ELF structures before checking
the module signature (Frank van der Linden)
- Fix undefined symbol warning for clang (Fangrui Song)
- Fix smatch warning (Dan Carpenter)
Signed-off-by: Jessica Yu <jeyu@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEVrp26glSWYuDNrCUwEV+OM47wXIFAmA0/KMQHGpleXVAa2Vy
bmVsLm9yZwAKCRDARX44zjvBcu0uD/4nmRp18EKAtdUZivsZHat0aEWGlkmrVueY
5huYw6iwM8b/wIAl3xwLki1Iv0/l0a83WXZhLG4ekl0/Nj8kgllA+jtBrZWpoLMH
CZusN5dS9YwwyD2vu3ak83ARcehcDEPeA9thvc3uRFGis6Hi4bt1rkzGdrzsgqR4
tybfN4qaQx4ZAKFxA8bnS58l7QTFwUzTxJfM6WWzl1Q+mLZDr/WP+loJ/f1/oFFg
ufN31KrqqFpdQY5UKq5P4H8FVq/eXE1Mwl8vo3HsnAj598fznyPUmA3D/j+N4GuR
sTGBVZ9CSehUj7uZRs+Qgg6Bd+y3o44N29BrdZWA6K3ieTeQQpA+VgPUNrDBjGhP
J/9Y4ms4PnuNEWWRaa73m9qsVqAsjh9+T2xp9PYn9uWLCM8BvQFtWcY7tw4/nB0/
INmyiP/tIRpwWkkBl47u1TPR09FzBBGDZjBiSn3lm3VX+zCYtHoma5jWyejG11cf
ybDrTsci9ANyHNP2zFQsUOQJkph78PIal0i3k4ODqGJvaC0iEIH3Xjv+0dmE14rq
kGRrG/HN6HhMZPjashudVUktyTZ63+PJpfFlQbcUzdvjQQIkzW0vrCHMWx9vD1xl
Na7vZLl4Nb03WSJp6saY6j2YSRKL0poGETzGqrsUAHEhpEOPHduaiCVlAr/EmeMk
p6SrWv8+UQ==
=T29Q
-----END PGP SIGNATURE-----
Merge tag 'modules-for-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux
Pull module updates from Jessica Yu:
- Retire EXPORT_UNUSED_SYMBOL() and EXPORT_SYMBOL_GPL_FUTURE(). These
export types were introduced between 2006 - 2008. All the of the
unused symbols have been long removed and gpl future symbols were
converted to gpl quite a long time ago, and I don't believe these
export types have been used ever since. So, I think it should be safe
to retire those export types now (Christoph Hellwig)
- Refactor and clean up some aged code cruft in the module loader
(Christoph Hellwig)
- Build {,module_}kallsyms_on_each_symbol only when livepatching is
enabled, as it is the only caller (Christoph Hellwig)
- Unexport find_module() and module_mutex and fix the last module
callers to not rely on these anymore. Make module_mutex internal to
the module loader (Christoph Hellwig)
- Harden ELF checks on module load and validate ELF structures before
checking the module signature (Frank van der Linden)
- Fix undefined symbol warning for clang (Fangrui Song)
- Fix smatch warning (Dan Carpenter)
* tag 'modules-for-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
module: potential uninitialized return in module_kallsyms_on_each_symbol()
module: remove EXPORT_UNUSED_SYMBOL*
module: remove EXPORT_SYMBOL_GPL_FUTURE
module: move struct symsearch to module.c
module: pass struct find_symbol_args to find_symbol
module: merge each_symbol_section into find_symbol
module: remove each_symbol_in_section
module: mark module_mutex static
kallsyms: only build {,module_}kallsyms_on_each_symbol when required
kallsyms: refactor {,module_}kallsyms_on_each_symbol
module: use RCU to synchronize find_module
module: unexport find_module and module_mutex
drm: remove drm_fb_helper_modinit
powerpc/powernv: remove get_cxl_module
module: harden ELF info handling
module: Ignore _GLOBAL_OFFSET_TABLE_ when warning for undefined symbols
- Clang LTO build infrastructure and arm64-specific enablement (Sami Tolvanen)
- Recursive build CC_FLAGS_LTO fix (Alexander Lobakin)
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmA0OEYACgkQiXL039xt
wCYGJw/8CcyvQUGmXYEZVDLMahKz93RYijiGuSTVnhl0pNAyfOojaZ8Z//eD1VNA
s82azW1XybbA6RnPGD7YQzYz27cSF2qUFDmplwVfE4mwBnPXzRxtVBDLSxksP1HS
77sCOu91QhbovPCWET4dSHLJB3DVc78FiW4lVlRgrglyAz+dut1iXYar5e7VNoS0
S4MwnqwteHC6YXP619rubhpdDoj7njuw1uxRIaodt9S/zRSpl5MCUgHmzQusgezs
yWDdPHPWHnF7xxKgwSvE7AKZPdOnIxKxRi6Yd6vUIyrYB3qLZkFe75nUsgMroAhs
/Bgrn69U2McMiJsOdh0ERzP2VNYfvMacBQ308nb45j83Bgv5l6uj8QOZU4ZogmXV
PsDzsfUe9GsxgYexfozGX61rpd6JinzQKVyoDW3oTT54fbBxO3uDqT8kOBw72dPT
9nkOxTzyb+UO0dpb/MhXLGkGcv8+lTA5ffVIKUx5UxKngRbukc3dxwVJgO4HmucK
bwVQGD83D+/if5/JL9WtQRjDwFEn+IFmdv+3cAXkRo4IIS18LPZB1MJncTeWr8Z9
HlkuDXlJOncUWCABGd1IKu1j0S2HpXV4qhqQXJ6PdfOvUPEaD9qgqEAjD5FxxyXF
wpaV2MWya5i1FGwD5UKhi8hVnAFJyF0/w+enjiPwlmIbjdyEVXE=
=6peY
-----END PGP SIGNATURE-----
Merge tag 'clang-lto-v5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull clang LTO updates from Kees Cook:
"Clang Link Time Optimization.
This is built on the work done preparing for LTO by arm64 folks,
tracing folks, etc. This includes the core changes as well as the
remaining pieces for arm64 (LTO has been the default build method on
Android for about 3 years now, as it is the prerequisite for the
Control Flow Integrity protections).
While x86 LTO enablement is done, it depends on some pending objtool
clean-ups. It's possible that I'll send a "part 2" pull request for
LTO that includes x86 support.
For merge log posterity, and as detailed in commit dc5723b02e
("kbuild: add support for Clang LTO"), here is the lt;dr to do an LTO
build:
make LLVM=1 LLVM_IAS=1 defconfig
scripts/config -e LTO_CLANG_THIN
make LLVM=1 LLVM_IAS=1
(To do a cross-compile of arm64, add "CROSS_COMPILE=aarch64-linux-gnu-"
and "ARCH=arm64" to the "make" command lines.)
Summary:
- Clang LTO build infrastructure and arm64-specific enablement (Sami
Tolvanen)
- Recursive build CC_FLAGS_LTO fix (Alexander Lobakin)"
* tag 'clang-lto-v5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
kbuild: prevent CC_FLAGS_LTO self-bloating on recursive rebuilds
arm64: allow LTO to be selected
arm64: disable recordmcount with DYNAMIC_FTRACE_WITH_REGS
arm64: vdso: disable LTO
drivers/misc/lkdtm: disable LTO for rodata.o
efi/libstub: disable LTO
scripts/mod: disable LTO for empty.c
modpost: lto: strip .lto from module names
PCI: Fix PREL32 relocations for LTO
init: lto: fix PREL32 relocations
init: lto: ensure initcall ordering
kbuild: lto: add a default list of used symbols
kbuild: lto: merge module sections
kbuild: lto: limit inlining
kbuild: lto: fix module versioning
kbuild: add support for Clang LTO
tracing: move function tracer options to Kconfig
These debugfs dentries do not need to be saved for anything as the whole
directory and everything in it is properly cleaned up when the parent
directory is removed. So remove them from struct blk_trace and don't
save them when created as it's not necessary.
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-block@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
- replace mm/frame_vector.c by get_user_pages in misc/habana and
drm/exynos drivers, then move that into media as it's sole user
- close race in generic_access_phys
- s390 pci ioctl fix of this series landed in 5.11 already
- properly revoke iomem mappings (/dev/mem, pci files)
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEb4nG6jLu8Y5XI+PfTA9ye/CYqnEFAmAzgywACgkQTA9ye/CY
qnFPbA//RUHB5bD7vwnEglfJhonKSi/Vt3dNQwUI+pCFK8muWvvPyTkGXKjjT2dI
uAOY2F23wymtIexV3fNLgnMez7kMcupOLkdxJic4GiO+HJn1jnkshdX7/dGtUW7O
G3yfnf/D27i912tT3j6PN7dVnasAYYtndCgImM027Zigzn4ibY+02tnzd5XTj1F8
yq8Swx88oqF8v10HxfpF3RLShqT3S17mFmd9dTv0GkZX497Pe75O44XcXzkD33Bj
wasH2Tz8gMEQx6TNAGlJe13dzDHReh2cG0z2r+6PTA6KnaMMxbEIImHNuhWOmHb/
nf8Jpu9uMOLzB+3hG3TzISTDBhAgPfoJ8Ov40VJCWMtCVBnyMyPJr28Oobb8Dj3V
SXvjSVlLeobOLt+E9vAS+Rmas07LCGBdNP9sexxV7S/sveSQ5W+FptaQW03EghwA
nBYEUC68WqpX99lJCFPmv5zmy5xkecjpU6mLHZljtV1ORzktqWZdVhmC8njHMAMY
Hi/emnPxEX1FpOD38rr7F9KUUSsy4t/ZaCgVaLcxCcbglCHXSHC41R09p9TBRSJo
G6Lksjyj4aa+UL5dZDAtLY0shg0bv2u93dGQNaDAC+uzj6D0ErBBzDK570zBKjp/
75+nqezJlD0d7I6rOl6FwiEYeSrYXJxYEveKVUr8CnH6sfeBlwo=
=lQoR
-----END PGP SIGNATURE-----
Merge tag 'topic/iomem-mmap-vs-gup-2021-02-22' of git://anongit.freedesktop.org/drm/drm
Pull follow_pfn() updates from Daniel Vetter:
"Fixes around VM_FPNMAP and follow_pfn:
- replace mm/frame_vector.c by get_user_pages in misc/habana and
drm/exynos drivers, then move that into media as it's sole user
- close race in generic_access_phys
- s390 pci ioctl fix of this series landed in 5.11 already
- properly revoke iomem mappings (/dev/mem, pci files)"
* tag 'topic/iomem-mmap-vs-gup-2021-02-22' of git://anongit.freedesktop.org/drm/drm:
PCI: Revoke mappings like devmem
PCI: Also set up legacy files only after sysfs init
sysfs: Support zapping of binary attr mmaps
resource: Move devmem revoke code to resource framework
/dev/mem: Only set filp->f_mapping
PCI: Obey iomem restrictions for procfs mmap
mm: Close race in generic_access_phys
media: videobuf2: Move frame_vector into media subsystem
mm/frame-vector: Use FOLL_LONGTERM
misc/habana: Use FOLL_LONGTERM for userptr
misc/habana: Stop using frame_vector helpers
drm/exynos: Use FOLL_LONGTERM for g2d cmdlists
drm/exynos: Stop using frame_vector helpers
drm userspaces uses this, systemd uses this, makes sense to pull it
out from the checkpoint-restore bundle. Kees reviewed this from
security pov and is happy with the final version.
LWN coverage: https://lwn.net/Articles/845448/
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEb4nG6jLu8Y5XI+PfTA9ye/CYqnEFAmAzaXIACgkQTA9ye/CY
qnH5FQ//eL/7a/PDICuCRIN2p2aQwHoe9d12q+01RolAgce6F9mR9SFiKGSCR+t7
daw4G/BaGxSYzvz1IqWbXDMhN87jAXV/IGs9k4OkSIcbnDmMY78EKMZB1c1t7AZo
zmeAuQvmTAcBogTwC6IE9N1JwhH3fmudq4p8zZ4zLojJNSPjrwCvF/xQI/Yaw52V
CTfni8mrjYJ+pZ1qn9XP3IceAFEEI27ubZj2DJU+5xpRJAdIAobo0XbVOf8XQ0uc
/BRLyXFS66EDsY1wWHT6y6UXDNZgbLic0olC1aielaBJh+Wq6bQHgephxpasU5y7
cZX7XTX2N1q8j8NmgzWLYRgERqtXv0CPHKdimTs8SaUcPDGhxcnwPR6hmdQEC+i6
IjntWMERjfuyD+s6qVuc7s8WS7+Ry9OxgdVskHASqGpBvsSliXN1o02Am6WUuGsB
HZxTjCe967FyL4LGU0YjobMTUUSWfYQkOBKABlvYUySNZ0ZHnSygHIWiWjC6b89A
KmXiHJoocNfDlKwX6bf3OWQ+dGGFu2wo5wYzldIiqYJVidp50xdOosdRE1R6WwuG
IOLCdNKdqDgtig+90/fFZ06liXZvqUdDafWgUs/g6lLquFrcq5aAIiSdR6PcPKB0
MwfWcCglLtYrxgDHvNaqnW18yRQq2TGbe+A65aXzLPp45pKP8Hk=
=uiSj
-----END PGP SIGNATURE-----
Merge tag 'topic/kcmp-kconfig-2021-02-22' of git://anongit.freedesktop.org/drm/drm
Pull kcmp kconfig update from Daniel Vetter:
"Make the kcmp syscall available independently of checkpoint/restore.
drm userspaces uses this, systemd uses this, so makes sense to pull it
out from the checkpoint-restore bundle.
Kees reviewed this from security pov and is happy with the final
version"
Link: https://lwn.net/Articles/845448/
* tag 'topic/kcmp-kconfig-2021-02-22' of git://anongit.freedesktop.org/drm/drm:
kcmp: Support selection of SYS_kcmp without CHECKPOINT_RESTORE
Pull qorkqueue updates from Tejun Heo:
"Tracepoint and comment updates only"
* 'for-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: Use %s instead of function name
workqueue: tracing the name of the workqueue instead of it's address
workqueue: fix annotation for WQ_SYSFS
Pull cgroup updates from Tejun Heo:
"Nothing interesting. Just two minor patches"
* 'for-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cpuset: fix typos in comments
cgroup: cgroup.{procs,threads} factor out common parts
- Update to the way irqs and preemption is tracked via the trace event PC field
- Fix handling of unregistering event failing due to allocate memory.
This is only triggered by failure injection, as it is pretty much guaranteed
to have less than a page allocation succeed.
- Do not show the useless "filter" or "enable" files for the "ftrace" trace
system, as they have no effect on doing anything.
- Add a warning if kprobes are registered more than once.
- Synthetic events now have their fields parsed by semicolons.
Old formats without semicolons will still work, but new features will
require them.
- New option to allow trace events to show %p without hashing in trace file.
The trace file can only be read by root, and reading the raw event buffer
did not have any pointers hashed, so this does not expose anything new.
- New directory in tools called tools/tracing, where a new tool that reads
sequential latency reports from the ftrace latency tracers.
- Other minor fixes and cleanups.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCYDL2wBQccm9zdGVkdEBn
b29kbWlzLm9yZwAKCRAp5XQQmuv6qti6AP0RUcSU5U1onx8DcwPQLC5Xr3CPqJkm
RvKeJDdgFP+sVgEAiMTFsy2UMc0gmlHZMFd5nZLSiJCu1I2hHmhS5yKbHgY=
=fD9+
-----END PGP SIGNATURE-----
Merge tag 'trace-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing updates from Steven Rostedt:
- Update to the way irqs and preemption is tracked via the trace event
PC field
- Fix handling of unregistering event failing due to allocate memory.
This is only triggered by failure injection, as it is pretty much
guaranteed to have less than a page allocation succeed.
- Do not show the useless "filter" or "enable" files for the "ftrace"
trace system, as they have no effect on doing anything.
- Add a warning if kprobes are registered more than once.
- Synthetic events now have their fields parsed by semicolons. Old
formats without semicolons will still work, but new features will
require them.
- New option to allow trace events to show %p without hashing in trace
file. The trace file can only be read by root, and reading the raw
event buffer did not have any pointers hashed, so this does not
expose anything new.
- New directory in tools called tools/tracing, where a new tool that
reads sequential latency reports from the ftrace latency tracers.
- Other minor fixes and cleanups.
* tag 'trace-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (33 commits)
kprobes: Fix to delay the kprobes jump optimization
tracing/tools: Add the latency-collector to tools directory
tracing: Make hash-ptr option default
tracing: Add ptr-hash option to show the hashed pointer value
tracing: Update the stage 3 of trace event macro comment
tracing: Show real address for trace event arguments
selftests/ftrace: Add '!event' synthetic event syntax check
selftests/ftrace: Update synthetic event syntax errors
tracing: Add a backward-compatibility check for synthetic event creation
tracing: Update synth command errors
tracing: Rework synthetic event command parsing
tracing/dynevent: Delegate parsing to create function
kprobes: Warn if the kprobe is reregistered
ftrace: Remove unused ftrace_force_update()
tracepoints: Code clean up
tracepoints: Do not punish non static call users
tracepoints: Remove unnecessary "data_args" macro parameter
tracing: Do not create "enable" or "filter" files for ftrace event subsystem
kernel: trace: preemptirq_delay_test: add cpu affinity
tracepoint: Do not fail unregistering a probe due to memory failure
...
swiotlb_tbl_map_single currently nevers sets a tlb_addr that is not
aligned to the tlb bucket size. But we're going to add such a case
soon, for which this adjustment would be bogus.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jianxiong Gao <jxgao@google.com>
Tested-by: Jianxiong Gao <jxgao@google.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Split out a bunch of a self-contained helpers to make the function easier
to follow.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jianxiong Gao <jxgao@google.com>
Tested-by: Jianxiong Gao <jxgao@google.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Another fairly small set of changes of changes this cycle. The most
significant functional change is a fix to better manage the flags
when allocating memory.
Additionally there is the removal of some unused code (which is
slightly more dramatic than it sounds given it means there are now no
tasklets in kgdb) together with a tidy up of the debug prints and some
spelling corrections for the documentation.
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEELzVBU1D3lWq6cKzwfOMlXTn3iKEFAmAz6HIACgkQfOMlXTn3
iKFUoRAAqFOqqeMltvNxS/UFCl9N8THyc1jtLZp3eszQGfZ0+bFRqO7ucXBmA+vL
V2YyWU6f4p63NnEY/lj5f0xhK59Xc3qYGh3f/cbwZO90Ul8qHCh2UU93LkSWx7Xl
1NqV7CrLZudRpDFx5cGl+PLeK6N0y8jpu+/O/o+mfJHUw4l9ElfpqPytyiXfgGA/
6t6U9jU95oHH196/Y5fzW9GyO4xJ1ZQMIaEpd2JdM+F4mBG3cMDMTRPyGNLk8Yvd
AfKUVCFQcnL/BJTwGNiovlv5APN1cksk9MECkSE2yFE4I1y5L4/GxtazG0MktVVZ
oVW+CWJdnnmx7M1PddE3womgaG5lL0IZW8h0QE34EDcLtjJrfkaG2kzTmmrvBazA
8MIHSNbA1oWxv30GmAQ0vQa5ddBEyqrnaYr/ArYDETUV+HPM7V79c1wvUSmPEEwC
PDsx4bfVqzWXGADMFbtRdMzKeK93KdFKY5CaGBVgtnU38wZkd8yXGPMB7468utn0
RmdBqYMbzfAFQvcJig0eGJtS8wvXFWS4rxQqUIOwH9SKrWwGjv5S/AMe/rjyTnrp
8Nv8wbi/N0C/gX7an6o5R8lxMfdgz9TQN3BrB2osINltadbrhRuVcFIyRbyIOBki
m6P18RQX7lN9d0sS0E0Pz2WCQk3axXBjAuxRANbbnPllo1Ds49w=
=fQWC
-----END PGP SIGNATURE-----
Merge tag 'kgdb-5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux
Pull kgdb updates from Daniel Thompson:
"Another fairly small set of changes of changes this cycle. The most
significant functional change is a fix to better manage the flags when
allocating memory.
Additionally there is the removal of some unused code (which is
slightly more dramatic than it sounds given it means there are now no
tasklets in kgdb) together with a tidy up of the debug prints and some
spelling corrections for the documentation"
* tag 'kgdb-5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux:
kgdb: Remove kgdb_schedule_breakpoint()
kdb: Make memory allocations more robust
kdb: kdb_support: Fix debugging information problem
kernel: debug: fix typo issue
kgdb: rectify kernel-doc for kgdb_unregister_io_module()
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAmAzp8UACgkQUqAMR0iA
lPKHvxAAhB7XsfLaQkpDrqqTswssl85ouQwxwPc6EO52CJx/O5gdZ576vG6Xa1e+
0+79LwutQupTdYpM19mszkopdNr2NDov9ClQB0yGiiwsFlWLe1FvITe3SO4QzxsX
Wl78uYPCXWmnj3FnKLgfz6+mIGD4wvNrrFAztPiZ1GNHqjo48RFD6RybIXa3hR/j
Fx4C7R5eKnbIBophKT4bt1FE0ci9HonDhVYYGyHC6aYNlpTHGYENig32fbkZh6nI
qdyBvtAyfRbihyOTJrsKlXXb3mb27oWVY6e0+tTabBC3tWBmtorpBbFG8HcBEoS2
a5UDLtv2m6adFyFTc1ulF9+IPvLqUx8cweGkM1e/XNYZmZAvoUVKyFeiUNBcKhpm
5ZXYcAZPfWzf2MtFo4mMeLubkPAxk01FWTplt54az2T0B+DnuRieDYarcjrts9ib
4qvyljqEZ5/uvtoi2O+MRje7roOgx3Hb6JgvhIpObY5XV7MMeeMoFQpGKRUxosE7
J8f1fhr37OeD2BRwcqMuf7NNBUISFZnzynaTOghXpBSRAKoa+GPzKhOLalKg1nhI
7LAFGq39CeV9DU59AuWLOmqXCRv7bmjs05vEJtCVv3p+vlvBCiKMhGz6RuGj3OaV
L2pHXBpUxfSIDtl8wVqA+004J8G4n7i77cWpymiNm+yS5WjoVO0=
=CZnB
-----END PGP SIGNATURE-----
Merge tag 'printk-for-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux
Pull printk updates from Petr Mladek:
- New "no_hash_pointers" kernel parameter causes that %p shows raw
pointer values instead of hashed ones. It is intended only for
debugging purposes. Misuse is prevented by a fat warning message that
is inspired by trace_printk().
- Prevent a possible deadlock when flushing printk_safe buffers during
panic().
- Fix performance regression caused by the lockless printk ringbuffer.
It was visible with huge log buffer and long messages.
- Documentation fix-up.
* tag 'printk-for-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
lib/vsprintf: no_hash_pointers prints all addresses as unhashed
kselftest: add support for skipped tests
lib: use KSTM_MODULE_GLOBALS macro in kselftest drivers
printk: avoid prb_first_valid_seq() where possible
printk: fix deadlock when kernel panic
printk: rectify kernel-doc for prb_rec_init_wr()
The WARN_ON() argument is a condition, not an error message. So this
code will print a stack trace but will not print the warning message.
Fix that and also change it to only WARN_ONCE().
Fixes: 4ddb74165a ("bpf: Extract nullable reg type conversion into a helper function")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/YCzJlV3hnF%2Ft1Pk4@mwanda