linux

iv/linux

Author	SHA1	Message	Date
Muchun Song	8a8109f303	printk: fix deadlock when kernel panic printk_safe_flush_on_panic() caused the following deadlock on our server: CPU0: CPU1: panic rcu_dump_cpu_stacks kdump_nmi_shootdown_cpus nmi_trigger_cpumask_backtrace register_nmi_handler(crash_nmi_callback) printk_safe_flush __printk_safe_flush raw_spin_lock_irqsave(&read_lock) // send NMI to other processors apic_send_IPI_allbutself(NMI_VECTOR) // NMI interrupt, dead loop crash_nmi_callback printk_safe_flush_on_panic printk_safe_flush __printk_safe_flush // deadlock raw_spin_lock_irqsave(&read_lock) DEADLOCK: read_lock is taken on CPU1 and will never get released. It happens when panic() stops a CPU by NMI while it has been in the middle of printk_safe_flush(). Handle the lock the same way as logbuf_lock. The printk_safe buffers are flushed only when both locks can be safely taken. It can avoid the deadlock _in this particular case_ at expense of losing contents of printk_safe buffers. Note: It would actually be safe to re-init the locks when all CPUs were stopped by NMI. But it would require passing this information from arch-specific code. It is not worth the complexity. Especially because logbuf_lock and printk_safe buffers have been obsoleted by the lockless ring buffer. Fixes: cf9b1106c81c ("printk/nmi: flush NMI messages on the system panic") Signed-off-by: Muchun Song <songmuchun@bytedance.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Cc: <stable@vger.kernel.org> Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20210210034823.64867-1-songmuchun@bytedance.com	2021-02-10 13:57:06 +01:00
Daniel Borkmann	e88b2c6e5a	bpf: Fix 32 bit src register truncation on div/mod While reviewing a different fix, John and I noticed an oddity in one of the BPF program dumps that stood out, for example: # bpftool p d x i 13 0: (b7) r0 = 808464450 1: (b4) w4 = 808464432 2: (bc) w0 = w0 3: (15) if r0 == 0x0 goto pc+1 4: (9c) w4 %= w0 [...] In line 2 we noticed that the mov32 would 32 bit truncate the original src register for the div/mod operation. While for the two operations the dst register is typically marked unknown e.g. from adjust_scalar_min_max_vals() the src register is not, and thus verifier keeps tracking original bounds, simplified: 0: R1=ctx(id=0,off=0,imm=0) R10=fp0 0: (b7) r0 = -1 1: R0_w=invP-1 R1=ctx(id=0,off=0,imm=0) R10=fp0 1: (b7) r1 = -1 2: R0_w=invP-1 R1_w=invP-1 R10=fp0 2: (3c) w0 /= w1 3: R0_w=invP(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R1_w=invP-1 R10=fp0 3: (77) r1 >>= 32 4: R0_w=invP(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R1_w=invP4294967295 R10=fp0 4: (bf) r0 = r1 5: R0_w=invP4294967295 R1_w=invP4294967295 R10=fp0 5: (95) exit processed 6 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0 Runtime result of r0 at exit is 0 instead of expected -1. Remove the verifier mov32 src rewrite in div/mod and replace it with a jmp32 test instead. After the fix, we result in the following code generation when having dividend r1 and divisor r6: div, 64 bit: div, 32 bit: 0: (b7) r6 = 8 0: (b7) r6 = 8 1: (b7) r1 = 8 1: (b7) r1 = 8 2: (55) if r6 != 0x0 goto pc+2 2: (56) if w6 != 0x0 goto pc+2 3: (ac) w1 ^= w1 3: (ac) w1 ^= w1 4: (05) goto pc+1 4: (05) goto pc+1 5: (3f) r1 /= r6 5: (3c) w1 /= w6 6: (b7) r0 = 0 6: (b7) r0 = 0 7: (95) exit 7: (95) exit mod, 64 bit: mod, 32 bit: 0: (b7) r6 = 8 0: (b7) r6 = 8 1: (b7) r1 = 8 1: (b7) r1 = 8 2: (15) if r6 == 0x0 goto pc+1 2: (16) if w6 == 0x0 goto pc+1 3: (9f) r1 %= r6 3: (9c) w1 %= w6 4: (b7) r0 = 0 4: (b7) r0 = 0 5: (95) exit 5: (95) exit x86 in particular can throw a 'divide error' exception for div instruction not only for divisor being zero, but also for the case when the quotient is too large for the designated register. For the edx:eax and rdx:rax dividend pair it is not an issue in x86 BPF JIT since we always zero edx (rdx). Hence really the only protection needed is against divisor being zero. Fixes: 68fda450a7df ("bpf: fix 32-bit divide by zero") Co-developed-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org>	2021-02-10 01:32:40 +01:00
Daniel Borkmann	fd675184fc	bpf: Fix verifier jmp32 pruning decision logic Anatoly has been fuzzing with kBdysch harness and reported a hang in one of the outcomes: func#0 @0 0: R1=ctx(id=0,off=0,imm=0) R10=fp0 0: (b7) r0 = 808464450 1: R0_w=invP808464450 R1=ctx(id=0,off=0,imm=0) R10=fp0 1: (b4) w4 = 808464432 2: R0_w=invP808464450 R1=ctx(id=0,off=0,imm=0) R4_w=invP808464432 R10=fp0 2: (9c) w4 %= w0 3: R0_w=invP808464450 R1=ctx(id=0,off=0,imm=0) R4_w=invP(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0 3: (66) if w4 s> 0x30303030 goto pc+0 R0_w=invP808464450 R1=ctx(id=0,off=0,imm=0) R4_w=invP(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff),s32_max_value=808464432) R10=fp0 4: R0_w=invP808464450 R1=ctx(id=0,off=0,imm=0) R4_w=invP(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff),s32_max_value=808464432) R10=fp0 4: (7f) r0 >>= r0 5: R0_w=invP(id=0) R1=ctx(id=0,off=0,imm=0) R4_w=invP(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff),s32_max_value=808464432) R10=fp0 5: (9c) w4 %= w0 6: R0_w=invP(id=0) R1=ctx(id=0,off=0,imm=0) R4_w=invP(id=0) R10=fp0 6: (66) if w0 s> 0x3030 goto pc+0 R0_w=invP(id=0,s32_max_value=12336) R1=ctx(id=0,off=0,imm=0) R4_w=invP(id=0) R10=fp0 7: R0=invP(id=0,s32_max_value=12336) R1=ctx(id=0,off=0,imm=0) R4=invP(id=0) R10=fp0 7: (d6) if w0 s<= 0x303030 goto pc+1 9: R0=invP(id=0,s32_max_value=12336) R1=ctx(id=0,off=0,imm=0) R4=invP(id=0) R10=fp0 9: (95) exit propagating r0 from 6 to 7: safe 4: R0_w=invP808464450 R1=ctx(id=0,off=0,imm=0) R4_w=invP(id=0,umin_value=808464433,umax_value=2147483647,var_off=(0x0; 0x7fffffff)) R10=fp0 4: (7f) r0 >>= r0 5: R0_w=invP(id=0) R1=ctx(id=0,off=0,imm=0) R4_w=invP(id=0,umin_value=808464433,umax_value=2147483647,var_off=(0x0; 0x7fffffff)) R10=fp0 5: (9c) w4 %= w0 6: R0_w=invP(id=0) R1=ctx(id=0,off=0,imm=0) R4_w=invP(id=0) R10=fp0 6: (66) if w0 s> 0x3030 goto pc+0 R0_w=invP(id=0,s32_max_value=12336) R1=ctx(id=0,off=0,imm=0) R4_w=invP(id=0) R10=fp0 propagating r0 7: safe propagating r0 from 6 to 7: safe processed 15 insns (limit 1000000) max_states_per_insn 0 total_states 1 peak_states 1 mark_read 1 The underlying program was xlated as follows: # bpftool p d x i 10 0: (b7) r0 = 808464450 1: (b4) w4 = 808464432 2: (bc) w0 = w0 3: (15) if r0 == 0x0 goto pc+1 4: (9c) w4 %= w0 5: (66) if w4 s> 0x30303030 goto pc+0 6: (7f) r0 >>= r0 7: (bc) w0 = w0 8: (15) if r0 == 0x0 goto pc+1 9: (9c) w4 %= w0 10: (66) if w0 s> 0x3030 goto pc+0 11: (d6) if w0 s<= 0x303030 goto pc+1 12: (05) goto pc-1 13: (95) exit The verifier rewrote original instructions it recognized as dead code with 'goto pc-1', but reality differs from verifier simulation in that we are actually able to trigger a hang due to hitting the 'goto pc-1' instructions. Taking a closer look at the verifier analysis, the reason is that it misjudges its pruning decision at the first 'from 6 to 7: safe' occasion. What happens is that while both old/cur registers are marked as precise, they get misjudged for the jmp32 case as range_within() yields true, meaning that the prior verification path with a wider register bound could be verified successfully and therefore the current path with a narrower register bound is deemed safe as well whereas in reality it's not. R0 old/cur path's bounds compare as follows: old: smin_value=0x8000000000000000,smax_value=0x7fffffffffffffff,umin_value=0x0,umax_value=0xffffffffffffffff,var_off=(0x0; 0xffffffffffffffff) cur: smin_value=0x8000000000000000,smax_value=0x7fffffff7fffffff,umin_value=0x0,umax_value=0xffffffff7fffffff,var_off=(0x0; 0xffffffff7fffffff) old: s32_min_value=0x80000000,s32_max_value=0x00003030,u32_min_value=0x00000000,u32_max_value=0xffffffff cur: s32_min_value=0x00003031,s32_max_value=0x7fffffff,u32_min_value=0x00003031,u32_max_value=0x7fffffff The 64 bit bounds generally look okay and while the information that got propagated from 32 to 64 bit looks correct as well, it's not precise enough for judging a conditional jmp32. Given the latter only operates on subregisters we also need to take these into account as well for a range_within() probe in order to be able to prune paths. Extending the range_within() constraint to both bounds will be able to tell us that the old signed 32 bit bounds are not wider than the cur signed 32 bit bounds. With the fix in place, the program will now verify the 'goto' branch case as it should have been: [...] 6: R0_w=invP(id=0) R1=ctx(id=0,off=0,imm=0) R4_w=invP(id=0) R10=fp0 6: (66) if w0 s> 0x3030 goto pc+0 R0_w=invP(id=0,s32_max_value=12336) R1=ctx(id=0,off=0,imm=0) R4_w=invP(id=0) R10=fp0 7: R0=invP(id=0,s32_max_value=12336) R1=ctx(id=0,off=0,imm=0) R4=invP(id=0) R10=fp0 7: (d6) if w0 s<= 0x303030 goto pc+1 9: R0=invP(id=0,s32_max_value=12336) R1=ctx(id=0,off=0,imm=0) R4=invP(id=0) R10=fp0 9: (95) exit 7: R0_w=invP(id=0,smax_value=9223372034707292159,umax_value=18446744071562067967,var_off=(0x0; 0xffffffff7fffffff),s32_min_value=12337,u32_min_value=12337,u32_max_value=2147483647) R1=ctx(id=0,off=0,imm=0) R4_w=invP(id=0) R10=fp0 7: (d6) if w0 s<= 0x303030 goto pc+1 R0_w=invP(id=0,smax_value=9223372034707292159,umax_value=18446744071562067967,var_off=(0x0; 0xffffffff7fffffff),s32_min_value=3158065,u32_min_value=3158065,u32_max_value=2147483647) R1=ctx(id=0,off=0,imm=0) R4_w=invP(id=0) R10=fp0 8: R0_w=invP(id=0,smax_value=9223372034707292159,umax_value=18446744071562067967,var_off=(0x0; 0xffffffff7fffffff),s32_min_value=3158065,u32_min_value=3158065,u32_max_value=2147483647) R1=ctx(id=0,off=0,imm=0) R4_w=invP(id=0) R10=fp0 8: (30) r0 = (u8 )skb[808464432] BPF_LD_[ABS\|IND] uses reserved fields processed 11 insns (limit 1000000) max_states_per_insn 1 total_states 1 peak_states 1 mark_read 1 The bug is quite subtle in the sense that when verifier would determine that a given branch is dead code, it would (here: wrongly) remove these instructions from the program and hard-wire the taken branch for privileged programs instead of the 'goto pc-1' rewrites which will cause hard to debug problems. Fixes: 3f50f132d840 ("bpf: Verifier, do explicit ALU32 bounds tracking") Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org>	2021-02-10 01:31:46 +01:00
Daniel Borkmann	ee114dd64c	bpf: Fix verifier jsgt branch analysis on max bound Fix incorrect is_branch{32,64}_taken() analysis for the jsgt case. The return code for both will tell the caller whether a given conditional jump is taken or not, e.g. 1 means branch will be taken [for the involved registers] and the goto target will be executed, 0 means branch will not be taken and instead we fall-through to the next insn, and last but not least a -1 denotes that it is not known at verification time whether a branch will be taken or not. Now while the jsgt has the branch-taken case correct with reg->s32_min_value > sval, the branch-not-taken case is off-by-one when testing for reg->s32_max_value < sval since the branch will also be taken for reg->s32_max_value == sval. The jgt branch analysis, for example, gets this right. Fixes: 3f50f132d840 ("bpf: Verifier, do explicit ALU32 bounds tracking") Fixes: 4f7b3e82589e ("bpf: improve verifier branch analysis") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org>	2021-02-10 01:31:45 +01:00
Tom Zanussi	8b5ab6bd0b	tracing: Add a backward-compatibility check for synthetic event creation The synthetic event parsing rework now requires semicolons between synthetic event fields. That requirement breaks existing users who might already have used the old synthetic event command format, so this adds an inner loop that can parse more than one field, if present, between semicolons. For each field, parse_synth_field() checks in which version that field was introduced, using check_field_version(). The caller, __create_synth_event() can then use that version information to determine whether or not to enforce the requirement on the command as a whole. In the future, if/when new features are added, the requirement will be that any field/string containing the new feature must use semicolons, and the check_field_version() check can then check for those and enforce it. Using a version number allows this scheme to be extended if necessary. Link: https://lkml.kernel.org/r/74fcc500d561b40ce91c5ee94818c70c6b0c9330.1612208610.git.zanussi@kernel.org [ zanussi: added check_field_version() comment from rostedt@goodmis.org ] Signed-off-by: Tom Zanussi <zanussi@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2021-02-09 12:52:15 -05:00
Tom Zanussi	8d3e816523	tracing: Update synth command errors Since array types are handled differently, errors referencing them also need to be handled differently. Add and use a new INVALID_ARRAY_SPEC error. Also add INVALID_CMD and INVALID_DYN_CMD to catch and display the correct form for badly-formed commands, which can also be used in place of CMD_INCOMPLETE, which is removed, and remove CMD_TOO_LONG, since it's no longer used. Link: https://lkml.kernel.org/r/b9dd434dc6458dcff11adc6ed616fe93a8794770.1612208610.git.zanussi@kernel.org Signed-off-by: Tom Zanussi <zanussi@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2021-02-09 12:52:15 -05:00
Tom Zanussi	c9e759b1e8	tracing: Rework synthetic event command parsing Now that command parsing has been delegated to the create functions and we're no longer constrained by argv_split(), we can modify the synthetic event command parser to better match the higher-level structure of the synthetic event commands, which is basically an event name followed by a set of semicolon-separated fields. Since we're also now passed the raw command, we can also save it directly and can get rid of save_cmdstr(). Link: https://lkml.kernel.org/r/cb9e2be92d992ce59f2b4f132264a5d467f3933f.1612208610.git.zanussi@kernel.org Signed-off-by: Tom Zanussi <zanussi@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2021-02-09 12:52:15 -05:00
Masami Hiramatsu	d262271d04	tracing/dynevent: Delegate parsing to create function Delegate command parsing to each create function so that the command syntax can be customized. This requires changes to the kprobe/uprobe/synthetic event handling, which are also included here. Link: https://lkml.kernel.org/r/e488726f49cbdbc01568618f8680584306c4c79f.1612208610.git.zanussi@kernel.org Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> [ zanussi@kernel.org: added synthetic event modifications ] Signed-off-by: Tom Zanussi <zanussi@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2021-02-09 12:52:15 -05:00
Masami Hiramatsu	33b1d14668	kprobes: Warn if the kprobe is reregistered Warn if the kprobe is reregistered, since there must be a software bug (actively used resource must not be re-registered) and caller must be fixed. Link: https://lkml.kernel.org/r/161236436734.194052.4058506306336814476.stgit@devnote2 Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Ananth N Mavinakayanahalli <ananth@linux.ibm.com> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2021-02-09 12:44:32 -05:00
Steven Rostedt (VMware)	7211f0a257	tracepoints: Code clean up Restructure the code a bit to make it simpler, fix some formatting problems and add READ_ONCE/WRITE_ONCE to make sure there's no compiler load/store tearing to the variables that can be accessed across CPUs. Started with Mathieu Desnoyers's patch: Link: https://lore.kernel.org/lkml/20210203175741.20665-1-mathieu.desnoyers@efficios.com/ And will keep his signature, but I will take the responsibility of this being correct, and keep the authorship. Link: https://lkml.kernel.org/r/20210204143004.61126582@gandalf.local.home Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2021-02-09 12:27:29 -05:00
Christoph Hellwig	81d88ce550	dma-mapping: remove the {alloc,free}_noncoherent methods It turns out allowing non-contigous allocations here was a rather bad idea, as we'll now need to define ways to get the pages for mmaping or dma_buf sharing. Revert this change and stick to the original concept. A different API for the use case of non-contigous allocations will be added back later. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Tomasz Figa <tfiga@chromium.org> Tested-by: Ricardo Ribalda <ribalda@chromium.org>:wq	2021-02-09 18:01:38 +01:00
Saravana Kannan	ed1054a02a	irqdomain: Mark fwnodes when their irqdomain is added/removed This allows fw_devlink to recognize irqdomain drivers that don't use the device-driver model to initialize the device. fw_devlink will use this information to make sure consumers of such irqdomain aren't indefinitely blocked from probing, waiting for the irqdomain device to appear and bind to a driver. Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Saravana Kannan <saravanak@google.com> Link: https://lore.kernel.org/r/20210205222644.2357303-7-saravanak@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-02-09 14:31:06 +01:00
Linus Torvalds	e0756cfc7d	tracing: Fix output of top level event "enable" file When writing a tool for enabling events in the tracing system, an anomaly was discovered. The top level event "enable" file would never show "1" when all events were enabled. The system and event "enable" files worked as expected. The reason was because the top level event "enable" file included the "ftrace" tracer events, which are not controlled by the "enable" file and would cause the output to be wrong. This appears to have been a bug since it was created. -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCYCGOmxQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6qhDFAQDjSrHmSC0ziTck9QMXSUdxLs0gjENr R0n5WPZ/mRboxQD/aWlw99TnuSwFDzB0gTlwDuDd1Ge2snqqmFCRTscU7gE= =Pig3 -----END PGP SIGNATURE----- Merge tag 'trace-v5.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fix from Steven Rostedt: "Fix output of top level event tracing 'enable' file. When writing a tool for enabling events in the tracing system, an anomaly was discovered. The top level event 'enable' file would never show '1' when all events were enabled. The system and event 'enable' files worked as expected. The reason was because the top level event 'enable' file included the 'ftrace' tracer events, which are not controlled by the 'enable' file and would cause the output to be wrong. This appears to have been a bug since it was created" * tag 'trace-v5.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Do not count ftrace events in top level enable output	2021-02-08 11:32:39 -08:00
Sumit Garg	93f7a6d818	kdb: Make memory allocations more robust Currently kdb uses in_interrupt() to determine whether its library code has been called from the kgdb trap handler or from a saner calling context such as driver init. This approach is broken because in_interrupt() alone isn't able to determine kgdb trap handler entry from normal task context. This can happen during normal use of basic features such as breakpoints and can also be trivially reproduced using: echo g > /proc/sysrq-trigger We can improve this by adding check for in_dbg_master() instead which explicitly determines if we are running in debugger context. Cc: stable@vger.kernel.org Signed-off-by: Sumit Garg <sumit.garg@linaro.org> Link: https://lore.kernel.org/r/1611313556-4004-1-git-send-email-sumit.garg@linaro.org Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>	2021-02-08 13:42:50 +00:00
Christoph Hellwig	367948220f	module: remove EXPORT_UNUSED_SYMBOL* EXPORT_UNUSED_SYMBOL* is not actually used anywhere. Remove the unused functionality as we generally just remove unused code anyway. Reviewed-by: Miroslav Benes <mbenes@suse.cz> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jessica Yu <jeyu@kernel.org>	2021-02-08 12:28:07 +01:00
Christoph Hellwig	f1c3d73e97	module: remove EXPORT_SYMBOL_GPL_FUTURE As far as I can tell this has never been used at all, and certainly not any time recently. Reviewed-by: Miroslav Benes <mbenes@suse.cz> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jessica Yu <jeyu@kernel.org>	2021-02-08 12:28:02 +01:00
Christoph Hellwig	00cc2c1cd3	module: move struct symsearch to module.c struct symsearch is only used inside of module.h, so move the definition out of module.h. Reviewed-by: Miroslav Benes <mbenes@suse.cz> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jessica Yu <jeyu@kernel.org>	2021-02-08 12:27:43 +01:00
Christoph Hellwig	0b96615cdc	module: pass struct find_symbol_args to find_symbol Simplify the calling convention by passing the find_symbol_args structure to find_symbol instead of initializing it inside the function. Reviewed-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jessica Yu <jeyu@kernel.org>	2021-02-08 12:25:19 +01:00
Christoph Hellwig	71e4b309dc	module: merge each_symbol_section into find_symbol each_symbol_section is only called by find_symbol, so merge the two functions. Reviewed-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jessica Yu <jeyu@kernel.org>	2021-02-08 12:25:07 +01:00
Christoph Hellwig	a7c38f2cd3	module: remove each_symbol_in_section each_symbol_in_section just contains a trivial loop over its arguments. Just open code the loop in the two callers. Reviewed-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jessica Yu <jeyu@kernel.org>	2021-02-08 12:24:54 +01:00
Christoph Hellwig	922f2a7c82	module: mark module_mutex static Except for two lockdep asserts module_mutex is only used in module.c. Remove the two asserts given that the functions they are in are not exported and just called from the module code, and mark module_mutex static. Reviewed-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jessica Yu <jeyu@kernel.org>	2021-02-08 12:24:26 +01:00
Christoph Hellwig	3e3552056a	kallsyms: only build {,module_}kallsyms_on_each_symbol when required kallsyms_on_each_symbol and module_kallsyms_on_each_symbol are only used by the livepatching code, so don't build them if livepatching is not enabled. Reviewed-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jessica Yu <jeyu@kernel.org>	2021-02-08 12:24:04 +01:00
Christoph Hellwig	013c1667cf	kallsyms: refactor {,module_}kallsyms_on_each_symbol Require an explicit call to module_kallsyms_on_each_symbol to look for symbols in modules instead of the call from kallsyms_on_each_symbol, and acquire module_mutex inside of module_kallsyms_on_each_symbol instead of leaving that up to the caller. Note that this slightly changes the behavior for the livepatch code in that the symbols from vmlinux are not iterated anymore if objname is set, but that actually is the desired behavior in this case. Reviewed-by: Petr Mladek <pmladek@suse.com> Acked-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jessica Yu <jeyu@kernel.org>	2021-02-08 12:22:08 +01:00
Christoph Hellwig	a006050575	module: use RCU to synchronize find_module Allow for a RCU-sched critical section around find_module, following the lower level find_module_all helper, and switch the two callers outside of module.c to use such a RCU-sched critical section instead of module_mutex. Reviewed-by: Petr Mladek <pmladek@suse.com> Acked-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jessica Yu <jeyu@kernel.org>	2021-02-08 12:21:40 +01:00
Christoph Hellwig	089049f6c9	module: unexport find_module and module_mutex find_module is not used by modular code any more, and random driver code has no business calling it to start with. Reviewed-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jessica Yu <jeyu@kernel.org>	2021-02-08 12:21:16 +01:00
Linus Torvalds	ff92acb220	dma-mapping fixes for 5.11: - fix a 32 vs 64-bit padding issue in the new benchmark code (Barry Song) -----BEGIN PGP SIGNATURE----- iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAmAgE/ALHGhjaEBsc3Qu ZGUACgkQD55TZVIEUYM1pw/+MDNm/z5v8hNUkffBuEygZz36VP2Nupc9pDS8ctFF 0YracQ9SWmFFFzpXKwkMA49QvQR07hBodqBrd+lDsuXtwaSu5lAnZa3H24l3eZGO UYaNIl3n/yYM0ALOD0OZ6OPmj/RHMJMQSHtEiVRjBusCNIrgZd5EBP0h0my3Wu1D nRbbZDdoeI9jVCiYfiIh8UasJKGtL32LYiQDQMlUL+IA3Vuh3dCS9CojURuOs4EU 9+U80MKH5TMwHaSQqQXr8bosiDY4IImOhUvlEiy1c4bk0Uof6IOuq/LmucqCzLPw srUZjY7paz8ntO5M2jIH1UbUmeE9/4YH35xv3DVGYCOu24TohLUO4WP4T9VNUtx7 vQk1weBs4q6IWYkGNdYaomM4514u/59MBd24MdQsnQxxYzPFzSxX7VmK2tFNUHuS AqgUppT4IqkBqGMMcJmnOM48Xhy+q996cpkWZCtfGKoFclIaoEC+kD3YBNfvm1vs 9upivyD9Ht1h/4jfWFvSKyxKF257AoueYugYVd57pNY6PNIbTf221CW6d57lzPA6 rCpQLUlN6A6QQ9ifa7FtSbClj7PQrbUb0iFcdAerJU8FgyURMbncpNoc+t54Lxyw zO+tLUn+yZ+6ji7kydsOqs/RIt5chi7cDsv+p+yUqlBdBDyb3UisihAhiYlKtpju Bu0= =OqA5 -----END PGP SIGNATURE----- Merge tag 'dma-mapping-5.11-2' of git://git.infradead.org/users/hch/dma-mapping Pull dma-mapping fix from Christoph Hellwig: "Fix a 32 vs 64-bit padding issue in the new benchmark code (Barry Song)" * tag 'dma-mapping-5.11-2' of git://git.infradead.org/users/hch/dma-mapping: dma-mapping: benchmark: use u8 for reserved field in uAPI structure	2021-02-07 10:40:48 -08:00
Linus Torvalds	fc6c0ae53a	- Prevent device managed IRQ allocation helpers from returning IRQ 0. - A fix for MSI activation of PCI endpoints with multiple MSIs. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmAf9E8ACgkQEsHwGGHe VUrdnBAAshn35KlffL7TayhPnO9FArEHw9GRoRdVOvfLp/NEQsALlEFx3ecaYo5j Rxoh/+/UdIx3pp/OTWu6uDnAxSnwctNZ50o1MFSiXZlYkoC/vVawauOPS29+W3bL 40fhGcA8RNx6Hi7a0Cgj0uioxmRJpZ0x8NvLzKT5uvkPYnRfLQSf7xqrkhQR9pm/ lJaG11aa/LNXndamYlrC1PllkDmX2UwZ6z0XBP9PJf6tDHlfR8sLHhGJ1E/ACaY6 Vw03DKsXHdiqqa+1bc8XduagHfchL4RCQXe9FS0IymH0a3lrjdOtdqZznTHR8S7N uwyPyNSdQDOV6Ni+qgc/Icoxfkj0/ZXytD4wkgpLP6ShUnGUaO6PrA5tm7CX/eoj 900eh1p2ZHHB5UP3FtG1ldUV0vn2HVtk7XOwSiPURoUldcBAnvJThQvxFA2wkeZA BnhTfoWCl2cncyWmUndNJ5kQFObGW7u8V6rU8kHgKNQDUKrD7hOGgOeFcPQ4j4I6 lXqrHKXu3yGCxVNZKt+4Ay5rRVQL8vKzXjDZbHhmLAomxuX4BCOqTCgWVFszX2Nr 3mLHw13tXAYobFDnq24CfPhljgGj7HUIOvadOJtoTG/5Kb4M7hybyqnlHRx8GVMh fOS3/o6TKhHQbfwMkx1Km3EiKQkDmvhJrzp/fQ6NcxXa8PY65T8= =v33D -----END PGP SIGNATURE----- Merge tag 'irq_urgent_for_v5.11_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Borislav Petkov: - Prevent device managed IRQ allocation helpers from returning IRQ 0 - A fix for MSI activation of PCI endpoints with multiple MSIs * tag 'irq_urgent_for_v5.11_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq: Prevent [devm_]irq_alloc_desc from returning irq 0 genirq/msi: Activate Multi-MSI early when MSI_FLAG_ACTIVATE_EARLY is set	2021-02-07 10:25:01 -08:00
Linus Torvalds	c6792d44d8	- For syscall user dispatch, separate ptctl operation from syscall redirection range specification before the API has been made official in 5.11. - Ensure tasks using the generic syscall code do trap after returning from a syscall when single-stepping is requested. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmAfz7gACgkQEsHwGGHe VUp+8hAAlNdy5EJVBVEBT8U6K9ZxHJ2Mnk/uPteD8Sq9o37dndfJ5utrXd52h9om JFfcsIVO7Ej2i7bKNVzM1FgUeO5UqtwGoZyJxuyT4ma+MZIjFibaem0+ousovJiU MhB6Vl+jkEBIEJXg2z9btoLTa86SPJM77u+gtJXaeQegcNJENY1jpUHYlV22q90/ b3b3MTVNNbw3bQty5hwWSU9G6PEXa888CJ+lEeuSjMQrVTmQ5i5oSMfYbUMCZIwm RQGcC/8qlDFfECBP9qMfq6sSoGnJ9uYmcT2Dzo7NiZHvBhtkzoWP4myjVF5g1oc/ H5nUwrG2EXem73xuAdxbPe1nqVoU2byd658GjZ0St/Zcb5usanNEOkgJa3f+O3X5 eRT5u9PFzhaTo2UDcLo02DlEqi/4Ed7bXJ2gxryHHxVi91Dr4G1uR+PL04MXJ6r8 8YCf10c5qOrQ8u5DJ7/yq7uZkNpecdwzvEpQWkR7SmEjY0hNo2yt0Lt8JcD6eFcv Jx27bETAseUTrynnJJmyG7y+HvDds5M+t1gj8NPPs7vA/XkdEFRUdKoDGCJE+p6+ y+cvRemx5p9YTiiTIEaiG187jR3M460DOvmT54xHcIWEWoJz3WfcRfXUqkx4xWOB TdJW5qTUnIkPr8XvHVcJUl6o9HIODclJCgZ7F7ceUP8XF2s2ATw= =l5j7 -----END PGP SIGNATURE----- Merge tag 'core_urgent_for_v5.11_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull syscall entry fixes from Borislav Petkov: - For syscall user dispatch, separate prctl operation from syscall redirection range specification before the API has been made official in 5.11. - Ensure tasks using the generic syscall code do trap after returning from a syscall when single-stepping is requested. * tag 'core_urgent_for_v5.11_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: entry: Use different define for selector variable in SUD entry: Ensure trap after single-step on system call return	2021-02-07 10:16:24 -08:00
Linus Torvalds	814daadbf0	- Use a freezable workqueue for RTC sync because the sync can happen at any time and trigger suspend assertion checks in the i2c subsystem. - Correct a previous RTC validation change to check only bit 6 in register D because some Intel machines use bits 0-5. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmAfxt8ACgkQEsHwGGHe VUqGnQ//W1gu/MyIGauA2Ds6WHtvgguyOLUfjQbykSXXHol9aygcdray6Zhca/+D 6bf7gudkIQVYy6A38dD6tH1/2brHelY9SsxJ/MOhKJ2zh3wistdV4tJsH682Dp8G 9BgmLYkc/QRuSMh04GKL+UoXxdv3IsDy6q2dZfMoQj6cDwx65JL2qdIp4HvAYZ+B FwF8BJxakLGr4ZHRurYQaT/+OKwc6rrF1/ix8zGl6sN8BATZTbcn0SVHWiiaoNlj TVXDLoVUHWw1X3xWdLwZlhD0SPsc1f3nO8Y+q/86zbf0r9YUJVq5dhuAqRRcAl2L CQDDmOUJLmtf6S2ZOcbQIRbC0gjQulMGoEOVZclYa8x1eeUywBwyHSZwVVzhvyVC jvtXu9yW7Y0kAbKbQnL42hwVJra+0fIwIG1ay3h2kZzlBKazxSom2JozFuhcQ/6M gNbHk8QZ4FPDNVl/gN6hxDtKcVv6ObvZGZnNbr6xjRCUSJ57O/kcmq/vkwYeRof/ vS2SPaY6OifrBYQVuH10CxpE4HJBA309eQ1vdwHtfq5+IcJE50XBNNm5VG1xu5h3 RQQINsQXg8+mERT1Jkpyy/JTTnBje2Hp0qxyC6FYRwDsBjNv8HjrhZT/H2rTWioG a3D9BZ0tcnJK/pu47FlA9gKQ2WMrnSJ7K2nHjHam5su0iIZTRk4= =UHNm -----END PGP SIGNATURE----- Merge tag 'timers_urgent_for_v5.11_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Borislav Petkov: "Two more timers-related fixes for v5.11: - Use a freezable workqueue for RTC sync because the sync can happen at any time and trigger suspend assertion checks in the i2c subsystem. - Correct a previous RTC validation change to check only bit 6 in register D because some Intel machines use bits 0-5" * tag 'timers_urgent_for_v5.11_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: ntp: Use freezable workqueue for RTC synchronization rtc: mc146818: Dont test for bit 0-5 in Register D	2021-02-07 09:55:26 -08:00
Gabriel Krisman Bertazi	36a6c843fd	entry: Use different define for selector variable in SUD Michael Kerrisk suggested that, from an API perspective, it is a bad idea to share the PR_SYS_DISPATCH_ defines between the prctl operation and the selector variable. Therefore, define two new constants to be used by SUD's selector variable and update the corresponding documentation and test cases. While this changes the API syscall user dispatch has never been part of a Linux release, it will show up for the first time in 5.11. Suggested-by: Michael Kerrisk (man-pages) <mtk.manpages@gmail.com> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210205184321.2062251-1-krisman@collabora.com	2021-02-06 00:21:42 +01:00
Gabriel Krisman Bertazi	6342adcaa6	entry: Ensure trap after single-step on system call return Commit 299155244770 ("entry: Drop usage of TIF flags in the generic syscall code") introduced a bug on architectures using the generic syscall entry code, in which processes stopped by PTRACE_SYSCALL do not trap on syscall return after receiving a TIF_SINGLESTEP. The reason is that the meaning of TIF_SINGLESTEP flag is overloaded to cause the trap after a system call is executed, but since the above commit, the syscall call handler only checks for the SYSCALL_WORK flags on the exit work. Split the meaning of TIF_SINGLESTEP such that it only means single-step mode, and create a new type of SYSCALL_WORK to request a trap immediately after a syscall in single-step mode. In the current implementation, the SYSCALL_WORK flag shadows the TIF_SINGLESTEP flag for simplicity. Update x86 to flip this bit when a tracer enables single stepping. Fixes: 299155244770 ("entry: Drop usage of TIF flags in the generic syscall code") Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Kyle Huey <me@kylehuey.com> Link: https://lore.kernel.org/r/87h7mtc9pr.fsf_-_@collabora.com	2021-02-06 00:21:42 +01:00
Steven Rostedt (VMware)	2d396cb3b1	tracing: Do not create "enable" or "filter" files for ftrace event subsystem The ftrace event subsystem is only created for showing the format files of events created by the ftrace tracers, and are not trace events. The ftrace subsystem currently has both the "enable" and "filter" files that in other subsystems are used to enable/disable all events within the subsystem or set a filter for all the subsystem events. As ftrace subsystem events do not use enable or filter operations, these files are useless in the ftrace subsystem. Remove them. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2021-02-05 18:04:39 -05:00
Steven Rostedt (VMware)	256cfdd6fd	tracing: Do not count ftrace events in top level enable output The file /sys/kernel/tracing/events/enable is used to enable all events by echoing in "1", or disabling all events when echoing in "0". To know if all events are enabled, disabled, or some are enabled but not all of them, cating the file should show either "1" (all enabled), "0" (all disabled), or "X" (some enabled but not all of them). This works the same as the "enable" files in the individule system directories (like tracing/events/sched/enable). But when all events are enabled, the top level "enable" file shows "X". The reason is that its checking the "ftrace" events, which are special events that only exist for their format files. These include the format for the function tracer events, that are enabled when the function tracer is enabled, but not by the "enable" file. The check includes these events, which will always be disabled, and even though all true events are enabled, the top level "enable" file will show "X" instead of "1". To fix this, have the check test the event's flags to see if it has the "IGNORE_ENABLE" flag set, and if so, not test it. Cc: stable@vger.kernel.org Fixes: 553552ce1796c ("tracing: Combine event filter_active and enable into single flags field") Reported-by: "Yordan Karadzhov (VMware)" <y.karadz@gmail.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2021-02-05 15:40:04 -05:00
Johannes Berg	55b6f763d8	init/gcov: allow CONFIG_CONSTRUCTORS on UML to fix module gcov On ARCH=um, loading a module doesn't result in its constructors getting called, which breaks module gcov since the debugfs files are never registered. On the other hand, in-kernel constructors have already been called by the dynamic linker, so we can't call them again. Get out of this conundrum by allowing CONFIG_CONSTRUCTORS to be selected, but avoiding the in-kernel constructor calls. Also remove the "if !UML" from GCOV selecting CONSTRUCTORS now, since we really do want CONSTRUCTORS, just not kernel binary ones. Link: https://lkml.kernel.org/r/20210120172041.c246a2cac2fb.I1358f584b76f1898373adfed77f4462c8705b736@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Jessica Yu <jeyu@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2021-02-05 11:03:47 -08:00
Alexey Dobriyan	174bcc691f	timens: Delete no-op time_ns_init() Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Andrei Vagin <avagin@gmail.com> Link: https://lore.kernel.org/r/20201228215402.GA572900@localhost.localdomain	2021-02-05 19:32:09 +01:00
Alexandre Belloni	b5c28ea601	alarmtimer: Update kerneldoc Update kerneldoc comments to reflect the actual arguments and return values of the documented functions. Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210202013457.3482388-1-alexandre.belloni@bootlin.com	2021-02-05 19:26:41 +01:00
Geert Uytterhoeven	24c242ec7a	ntp: Use freezable workqueue for RTC synchronization The bug fixed by commit e3fab2f3de081e98 ("ntp: Fix RTC synchronization on 32-bit platforms") revealed an underlying issue: RTC synchronization may happen anytime, even while the system is partially suspended. On systems where the RTC is connected to an I2C bus, the I2C bus controller may already or still be suspended, triggering a WARNING during suspend or resume from s2ram: WARNING: CPU: 0 PID: 124 at drivers/i2c/i2c-core.h:54 __i2c_transfer+0x634/0x680 i2c i2c-6: Transfer while suspended [...] Workqueue: events_power_efficient sync_hw_clock [...] (__i2c_transfer) (i2c_transfer) (regmap_i2c_read) ... (da9063_rtc_set_time) (rtc_set_time) (sync_hw_clock) (process_one_work) Fix this race condition by using the freezable instead of the normal power-efficient workqueue. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Rafael J. Wysocki <rafael@kernel.org> Link: https://lore.kernel.org/r/20210125143039.1051912-1-geert+renesas@glider.be	2021-02-05 18:03:13 +01:00
Peter Zijlstra	7f82e631d2	locking/lockdep: Avoid unmatched unlock Commit f6f48e180404 ("lockdep: Teach lockdep about "USED" <- "IN-NMI" inversions") overlooked that print_usage_bug() releases the graph_lock and called it without the graph lock held. Fixes: f6f48e180404 ("lockdep: Teach lockdep about "USED" <- "IN-NMI" inversions") Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Waiman Long <longman@redhat.com> Link: https://lkml.kernel.org/r/YBfkuyIfB1+VRxXP@hirez.programming.kicks-ass.net	2021-02-05 17:20:15 +01:00
Barry Song	9dc00b25ea	dma-mapping: benchmark: pretend DMA is transmitting In a real dma mapping user case, after dma_map is done, data will be transmit. Thus, in multi-threaded user scenario, IOMMU contention should not be that severe. For example, if users enable multiple threads to send network packets through 1G/10G/100Gbps NIC, usually the steps will be: map -> transmission -> unmap. Transmission delay reduces the contention of IOMMU. Here a delay is added to simulate the transmission between map and unmap so that the tested result could be more accurate for TX and simple RX. A typical TX transmission for NIC would be like: map -> TX -> unmap since the socket buffers come from OS. Simple RX model eg. disk driver, is also map -> RX -> unmap, but real RX model in a NIC could be more complicated considering packets can come spontaneously and many drivers are using pre-mapped buffers pool. This is in the TBD list. Signed-off-by: Barry Song <song.bao.hua@hisilicon.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2021-02-05 12:48:46 +01:00
Barry Song	9f5f8ec501	dma-mapping: benchmark: use u8 for reserved field in uAPI structure The original code put five u32 before a u64 expansion[10] array. Five is odd, this will cause trouble in the extension of the structure by adding new features. This patch moves to use u8 for reserved field to avoid future alignment risk. Meanwhile, it also clears the memory of struct map_benchmark in tools, otherwise, if users use old version to run on newer kernel, the random expansion value will cause side effect on newer kernel. Signed-off-by: Barry Song <song.bao.hua@hisilicon.com> Signed-off-by: Christoph Hellwig <hch@lst.de>	2021-02-05 12:48:46 +01:00
Yonghong Song	23a2d70c7a	bpf: Refactor BPF_PSEUDO_CALL checking as a helper function There is no functionality change. This refactoring intends to facilitate next patch change with BPF_PSEUDO_FUNC. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210204234827.1628953-1-yhs@fb.com	2021-02-04 21:51:47 -08:00
KP Singh	ba90c2cc02	bpf: Allow usage of BPF ringbuffer in sleepable programs The BPF ringbuffer map is pre-allocated and the implementation logic does not rely on disabling preemption or per-cpu data structures. Using the BPF ringbuffer sleepable LSM and tracing programs does not trigger any warnings with DEBUG_ATOMIC_SLEEP, DEBUG_PREEMPT, PROVE_RCU and PROVE_LOCKING and LOCKDEP enabled. This allows helpers like bpf_copy_from_user and bpf_ima_inode_hash to write to the BPF ring buffer from sleepable BPF programs. Signed-off-by: KP Singh <kpsingh@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210204193622.3367275-2-kpsingh@kernel.org	2021-02-04 16:35:00 -08:00
Stephen Zhang	0759d80728	kdb: kdb_support: Fix debugging information problem There are several common patterns. 0: kdb_printf("...",...); which is the normal one. 1: kdb_printf("%s: "...,__func__,...) We could improve '1' to this : #define kdb_func_printf(format, args...) \ kdb_printf("%s: " format, __func__, ## args) 2: if(KDB_DEBUG(AR)) kdb_printf("%s "...,__func__,...); We could improve '2' to this : #define kdb_dbg_printf(mask, format, args...) \ do { \ if (KDB_DEBUG(mask)) \ kdb_func_printf(format, ## args); \ } while (0) In addition, we changed the format code of size_t to %zu. Signed-off-by: Stephen Zhang <stephenzhangzsd@gmail.com> Link: https://lore.kernel.org/r/1612440429-6391-1-git-send-email-stephenzhangzsd@gmail.com Reviewed-by: Douglas Anderson <dianders@chromium.org> [daniel.thompson@linaro.org: Minor typo and line length fixes in the patch description] Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>	2021-02-04 15:55:04 +00:00
wengjianfeng	cbd026e1d8	kernel: debug: fix typo issue change 'regster' to 'register'. Signed-off-by: wengjianfeng <wengjianfeng@yulong.com> Link: https://lore.kernel.org/r/20210203081034.9004-1-samirweng1979@163.com Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>	2021-02-04 15:26:03 +00:00
Lukas Bulwahn	2da2687b51	kgdb: rectify kernel-doc for kgdb_unregister_io_module() The command 'find ./kernel/debug/ \| xargs ./scripts/kernel-doc -none' reported a typo in the kernel-doc of kgdb_unregister_io_module(). Rectify the kernel-doc, such that no issues remain for ./kernel/debug/. Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Link: https://lore.kernel.org/r/20210125144847.21896-1-lukas.bulwahn@gmail.com Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>	2021-02-04 15:23:25 +00:00
Ben Gardon	f3d4b4b1dc	sched: Add cond_resched_rwlock Safely rescheduling while holding a spin lock is essential for keeping long running kernel operations running smoothly. Add the facility to cond_resched rwlocks. CC: Ingo Molnar <mingo@redhat.com> CC: Will Deacon <will@kernel.org> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Davidlohr Bueso <dbueso@suse.de> Acked-by: Waiman Long <longman@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Ben Gardon <bgardon@google.com> Message-Id: <20210202185734.1680553-9-bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-02-04 05:27:43 -05:00
Bui Quang Minh	6183f4d3a0	bpf: Check for integer overflow when using roundup_pow_of_two() On 32-bit architecture, roundup_pow_of_two() can return 0 when the argument has upper most bit set due to resulting 1UL << 32. Add a check for this case. Fixes: d5a3b1f69186 ("bpf: introduce BPF_MAP_TYPE_STACK_TRACE") Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210127063653.3576-1-minhquangbui99@gmail.com	2021-02-03 21:45:33 +01:00
Linus Torvalds	dbc15d24f9	Tracing fixes: - Initialize tracing-graph-pause at task creation, not start of function tracing. Causes the pause counter to be corrupted. - Set "pause-on-trace" for latency tracers as that option breaks their output (regression). - Fix the wrong error return for setting kretprobes on future modules (before they are loaded). - Fix re-registering the same kretprobe. - Add missing value check for added RCU variable reload. -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCYBnNzhQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6qlpoAP4hU98lfAButfYTuuS7Id+/r21bB4lG 9HHB72wkpEfs8AEAlTDC5c3eXhnXXJC4a8b4sGv1wvBiHL2ZoW/yQ/4oZgA= =hpY/ -----END PGP SIGNATURE----- Merge tag 'trace-v5.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: - Initialize tracing-graph-pause at task creation, not start of function tracing, to avoid corrupting the pause counter. - Set "pause-on-trace" for latency tracers as that option breaks their output (regression). - Fix the wrong error return for setting kretprobes on future modules (before they are loaded). - Fix re-registering the same kretprobe. - Add missing value check for added RCU variable reload. * tag 'trace-v5.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracepoint: Fix race between tracing and removing tracepoint kretprobe: Avoid re-registration of the same kretprobe earlier tracing/kprobe: Fix to support kretprobe events on unloaded modules tracing: Use pause-on-trace with the latency tracers fgraph: Initialize tracing_graph_pause at task creation	2021-02-03 10:02:00 -08:00
Alexei Starovoitov	548f1191d8	bpf: Unbreak BPF_PROG_TYPE_KPROBE when kprobe is called via do_int3 The commit 0d00449c7a28 ("x86: Replace ist_enter() with nmi_enter()") converted do_int3 handler to be "NMI-like". That made old if (in_nmi()) check abort execution of bpf programs attached to kprobe when kprobe is firing via int3 (For example when kprobe is placed in the middle of the function). Remove the check to restore user visible behavior. Fixes: 0d00449c7a28 ("x86: Replace ist_enter() with nmi_enter()") Reported-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Link: https://lore.kernel.org/bpf/20210203070636.70926-1-alexei.starovoitov@gmail.com	2021-02-03 15:54:22 +01:00
Brendan Jackman	37086bfdc7	bpf: Propagate stack bounds to registers in atomics w/ BPF_FETCH When BPF_FETCH is set, atomic instructions load a value from memory into a register. The current verifier code first checks via check_mem_access whether we can access the memory, and then checks via check_reg_arg whether we can write into the register. For loads, check_reg_arg has the side-effect of marking the register's value as unkonwn, and check_mem_access has the side effect of propagating bounds from memory to the register. This currently only takes effect for stack memory. Therefore with the current order, bounds information is thrown away, but by simply reversing the order of check_reg_arg vs. check_mem_access, we can instead propagate bounds smartly. A simple test is added with an infinite loop that can only be proved unreachable if this propagation is present. This is implemented both with C and directly in test_verifier using assembly. Suggested-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Brendan Jackman <jackmanb@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210202135002.4024825-1-jackmanb@google.com	2021-02-02 18:23:29 -08:00

... 3 4 5 6 7 ...

35739 Commits