linux

iv/linux

Author	SHA1	Message	Date
Ayush Sawal	717df0f4cd	chtls: Fix hardware tid leak send_abort_rpl() is not calculating cpl_abort_req_rss offset and ends up sending wrong TID with abort_rpl WR causng tid leaks. Replaced send_abort_rpl() with chtls_send_abort_rpl() as it is redundant. Fixes: cc35c88ae4db ("crypto : chtls - CPL handler definition") Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com> Signed-off-by: Ayush Sawal <ayush.sawal@chelsio.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-07 17:06:01 -08:00
Linus Torvalds	c4cc3b1de3	gcc-plugins fix for v5.11-rc3 - Bump c++ standard version for latest GCC versions (Valdis Kletnieks) -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAl/3lmMACgkQiXL039xt wCbVwxAAoPLbLZHjBkG654VWl1YK9tsRIcGomGqCmzsgkO9dBJPj58VPeVBSlnl2 A58YJdz7m0iq9Tv1UH+fkOW5EziIMIsozho09JZpAAn7hqPw0eQP56EudJudoXIN UFtt3C+bqEUCfYpmwhUl1aV2SBAyB6QQZ6nn+J6Lrxq0w6KbYzNTeaKBwnGcDkGN RZQpMfY+UJjzAFm17/N1UhyOBR+EfjdN9PDi46omFRikfsP3KmaUVdl05JZ3ONfr oN7JGuoNv1PSHPXslqMzgB/8h5DuCASUfLPDCZr8wk9EZ87gcM3xVHHDpQOnLxuU V26/YB7IBh8nWmwfpZcfsT7CdGq108JlxyuSGezxNziW3yHLNHMeZ29Arlh2jJ+U Z2PunpGTZhBcod7MFobIgLTnnXU9i+4re9Y6soJq1P9g9cfwd3q8YxixFMYvWgyh KmtF9eF6a+EOATutC+lLByNnZ5+DisEfGyMiGXEv+cbozT74Dx2H8ISQshW03tym iyP94giJAf7DVJ7mMVTG1XlM+Pl6dQC9p51nu5pl5DBM0Ryj8Hunu8605JJ44qIb 8jroMSV0SoqMXlf2bN5XKLGRSpyrKz59dKjfw/Iu0v+2fd796ZugbRGouR27VekK WFQsql6OXsYHgK8uh1pxnZvP2swrjjXhHiLUGwr6e+b2YArrf0E= =jtTd -----END PGP SIGNATURE----- Merge tag 'gcc-plugins-v5.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull gcc-plugins fix from Kees Cook: "Bump c++ standard version for latest GCC versions (Valdis Kletnieks)" * tag 'gcc-plugins-v5.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: gcc-plugins: fix gcc 11 indigestion with plugins...	2021-01-07 16:03:19 -08:00
Tom Lendacky	647daca25d	KVM: SVM: Add support for booting APs in an SEV-ES guest Typically under KVM, an AP is booted using the INIT-SIPI-SIPI sequence, where the guest vCPU register state is updated and then the vCPU is VMRUN to begin execution of the AP. For an SEV-ES guest, this won't work because the guest register state is encrypted. Following the GHCB specification, the hypervisor must not alter the guest register state, so KVM must track an AP/vCPU boot. Should the guest want to park the AP, it must use the AP Reset Hold exit event in place of, for example, a HLT loop. First AP boot (first INIT-SIPI-SIPI sequence): Execute the AP (vCPU) as it was initialized and measured by the SEV-ES support. It is up to the guest to transfer control of the AP to the proper location. Subsequent AP boot: KVM will expect to receive an AP Reset Hold exit event indicating that the vCPU is being parked and will require an INIT-SIPI-SIPI sequence to awaken it. When the AP Reset Hold exit event is received, KVM will place the vCPU into a simulated HLT mode. Upon receiving the INIT-SIPI-SIPI sequence, KVM will make the vCPU runnable. It is again up to the guest to then transfer control of the AP to the proper location. To differentiate between an actual HLT and an AP Reset Hold, a new MP state is introduced, KVM_MP_STATE_AP_RESET_HOLD, which the vCPU is placed in upon receiving the AP Reset Hold exit event. Additionally, to communicate the AP Reset Hold exit event up to userspace (if needed), a new exit reason is introduced, KVM_EXIT_AP_RESET_HOLD. A new x86 ops function is introduced, vcpu_deliver_sipi_vector, in order to accomplish AP booting. For VMX, vcpu_deliver_sipi_vector is set to the original SIPI delivery function, kvm_vcpu_deliver_sipi_vector(). SVM adds a new function that, for non SEV-ES guests, invokes the original SIPI delivery function, kvm_vcpu_deliver_sipi_vector(), but for SEV-ES guests, implements the logic above. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Message-Id: <e8fbebe8eb161ceaabdad7c01a5859a78b424d5e.1609791600.git.thomas.lendacky@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-01-07 18:11:37 -05:00
Maxim Levitsky	f2c7ef3ba9	KVM: nSVM: cancel KVM_REQ_GET_NESTED_STATE_PAGES on nested vmexit It is possible to exit the nested guest mode, entered by svm_set_nested_state prior to first vm entry to it (e.g due to pending event) if the nested run was not pending during the migration. In this case we must not switch to the nested msr permission bitmap. Also add a warning to catch similar cases in the future. Fixes: a7d5c7ce41ac1 ("KVM: nSVM: delay MSR permission processing to first nested VM run") Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210107093854.882483-2-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-01-07 18:11:35 -05:00
Maxim Levitsky	56fe28de8c	KVM: nSVM: mark vmcb as dirty when forcingly leaving the guest mode We overwrite most of vmcb fields while doing so, so we must mark it as dirty. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210107093854.882483-5-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-01-07 18:11:34 -05:00
Maxim Levitsky	81f76adad5	KVM: nSVM: correctly restore nested_run_pending on migration The code to store it on the migration exists, but no code was restoring it. One of the side effects of fixing this is that L1->L2 injected events are no longer lost when migration happens with nested run pending. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210107093854.882483-3-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-01-07 18:11:33 -05:00
Ben Gardon	c0dba6e468	KVM: x86/mmu: Clarify TDP MMU page list invariants The tdp_mmu_roots and tdp_mmu_pages in struct kvm_arch should only contain pages with tdp_mmu_page set to true. tdp_mmu_pages should not contain any pages with a non-zero root_count and tdp_mmu_roots should only contain pages with a positive root_count, unless a thread holds the MMU lock and is in the process of modifying the list. Various functions expect these invariants to be maintained, but they are not explictily documented. Add to the comments on both fields to document the above invariants. Signed-off-by: Ben Gardon <bgardon@google.com> Message-Id: <20210107001935.3732070-2-bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-01-07 18:11:32 -05:00
Ben Gardon	a889ea54b3	KVM: x86/mmu: Ensure TDP MMU roots are freed after yield Many TDP MMU functions which need to perform some action on all TDP MMU roots hold a reference on that root so that they can safely drop the MMU lock in order to yield to other threads. However, when releasing the reference on the root, there is a bug: the root will not be freed even if its reference count (root_count) is reduced to 0. To simplify acquiring and releasing references on TDP MMU root pages, and to ensure that these roots are properly freed, move the get/put operations into another TDP MMU root iterator macro. Moving the get/put operations into an iterator macro also helps simplify control flow when a root does need to be freed. Note that using the list_for_each_entry_safe macro would not have been appropriate in this situation because it could keep a pointer to the next root across an MMU lock release + reacquire, during which time that root could be freed. Reported-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Fixes: faaf05b00aec ("kvm: x86/mmu: Support zapping SPTEs in the TDP MMU") Fixes: 063afacd8730 ("kvm: x86/mmu: Support invalidate range MMU notifier for TDP MMU") Fixes: a6a0b05da9f3 ("kvm: x86/mmu: Support dirty logging for the TDP MMU") Fixes: 14881998566d ("kvm: x86/mmu: Support disabling dirty logging for the tdp MMU") Signed-off-by: Ben Gardon <bgardon@google.com> Message-Id: <20210107001935.3732070-1-bgardon@google.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-01-07 18:11:31 -05:00
Lai Jiangshan	88bf56d04b	kvm: check tlbs_dirty directly In kvm_mmu_notifier_invalidate_range_start(), tlbs_dirty is used as: need_tlb_flush \|= kvm->tlbs_dirty; with need_tlb_flush's type being int and tlbs_dirty's type being long. It means that tlbs_dirty is always used as int and the higher 32 bits is useless. We need to check tlbs_dirty in a correct way and this change checks it directly without propagating it to need_tlb_flush. Note: it's _extremely_ unlikely this neglecting of higher 32 bits can cause problems in practice. It would require encountering tlbs_dirty on a 4 billion count boundary, and KVM would need to be using shadow paging or be running a nested guest. Cc: stable@vger.kernel.org Fixes: a4ee1ca4a36e ("KVM: MMU: delay flush all tlbs on sync_page path") Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com> Message-Id: <20201217154118.16497-1-jiangshanlai@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-01-07 18:11:30 -05:00
Stephen Zhang	de7860c8a3	KVM: x86: change in pv_eoi_get_pending() to make code more readable Signed-off-by: Stephen Zhang <stephenzhangzsd@gmail.com> Message-Id: <1608277897-1932-1-git-send-email-stephenzhangzsd@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-01-07 18:11:29 -05:00
Jakub Kicinski	0565ff56cd	Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Alexei Starovoitov says: ==================== pull-request: bpf 2021-01-07 We've added 4 non-merge commits during the last 10 day(s) which contain a total of 4 files changed, 14 insertions(+), 7 deletions(-). The main changes are: 1) Fix task_iter bug caused by the merge conflict resolution, from Yonghong. 2) Fix resolve_btfids for multiple type hierarchies, from Jiri. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: bpftool: Fix compilation failure for net.o with older glibc tools/resolve_btfids: Warn when having multiple IDs for single type bpf: Fix a task_iter bug caused by a merge conflict resolution selftests/bpf: Fix a compile error for BPF_F_BPRM_SECUREEXEC ==================== Link: https://lore.kernel.org/r/20210107221555.64959-1-alexei.starovoitov@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-07 15:10:27 -08:00
Sean Christopherson	7f0c1f1a82	MAINTAINERS: Really update email address for Sean Christopherson Use my @google.com address in MAINTAINERS, somehow only the .mailmap entry was added when the original update patch was applied. Fixes: c2b1209d852f ("MAINTAINERS: Update email address for Sean Christopherson") Cc: kvm@vger.kernel.org Reported-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210106182916.331743-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-01-07 18:07:33 -05:00
Paolo Bonzini	2f80d502d6	KVM: x86: fix shift out of bounds reported by UBSAN Since we know that e >= s, we can reassociate the left shift, changing the shifted number from 1 to 2 in exchange for decreasing the right hand side by 1. Reported-by: syzbot+e87846c48bf72bc85311@syzkaller.appspotmail.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-01-07 18:07:32 -05:00
Andrew Jones	b268b6f0bd	KVM: selftests: Implement perf_test_util more conventionally It's not conventional C to put non-inline functions in header files. Create a source file for the functions instead. Also reduce the amount of globals and rename the functions to something less generic. Reviewed-by: Ben Gardon <bgardon@google.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Jones <drjones@redhat.com> Message-Id: <20201218141734.54359-4-drjones@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-01-07 18:07:31 -05:00
Andrew Jones	1133e17ea7	KVM: selftests: Use vm_create_with_vcpus in create_vm Reviewed-by: Ben Gardon <bgardon@google.com> Signed-off-by: Andrew Jones <drjones@redhat.com> Message-Id: <20201218141734.54359-3-drjones@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-01-07 18:07:30 -05:00
Andrew Jones	e42ac777d6	KVM: selftests: Factor out guest mode code demand_paging_test, dirty_log_test, and dirty_log_perf_test have redundant guest mode code. Factor it out. Also, while adding a new include, remove the ones we don't need. Reviewed-by: Ben Gardon <bgardon@google.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Jones <drjones@redhat.com> Message-Id: <20201218141734.54359-2-drjones@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-01-07 18:07:29 -05:00
Uros Bizjak	52782d5b63	KVM/SVM: Remove leftover __svm_vcpu_run prototype from svm.c Commit 16809ecdc1e8a moved __svm_vcpu_run the prototype to svm.h, but forgot to remove the original from svm.c. Fixes: 16809ecdc1e8a ("KVM: SVM: Provide an updated VMRUN invocation for SEV-ES guests") Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Message-Id: <20201220200339.65115-1-ubizjak@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-01-07 18:07:28 -05:00
Nathan Chancellor	f65cf84ee7	KVM: SVM: Add register operand to vmsave call in sev_es_vcpu_load When using LLVM's integrated assembler (LLVM_IAS=1) while building x86_64_defconfig + CONFIG_KVM=y + CONFIG_KVM_AMD=y, the following build error occurs: $ make LLVM=1 LLVM_IAS=1 arch/x86/kvm/svm/sev.o arch/x86/kvm/svm/sev.c:2004:15: error: too few operands for instruction asm volatile(__ex("vmsave") : : "a" (__sme_page_pa(sd->save_area)) : "memory"); ^ arch/x86/kvm/svm/sev.c:28:17: note: expanded from macro '__ex' #define __ex(x) __kvm_handle_fault_on_reboot(x) ^ ./arch/x86/include/asm/kvm_host.h:1646:10: note: expanded from macro '__kvm_handle_fault_on_reboot' "666: \n\t" \ ^ <inline asm>:2:2: note: instantiated into assembly here vmsave ^ 1 error generated. This happens because LLVM currently does not support calling vmsave without the fixed register operand (%rax for 64-bit and %eax for 32-bit). This will be fixed in LLVM 12 but the kernel currently supports LLVM 10.0.1 and newer so this needs to be handled. Add the proper register using the _ASM_AX macro, which matches the vmsave call in vmenter.S. Fixes: 861377730aa9 ("KVM: SVM: Provide support for SEV-ES vCPU loading") Link: https://reviews.llvm.org/D93524 Link: https://github.com/ClangBuiltLinux/linux/issues/1216 Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Message-Id: <20201219063711.3526947-1-natechancellor@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-01-07 18:07:27 -05:00
Paolo Bonzini	bc351f0726	Merge branch 'kvm-master' into kvm-next Fixes to get_mmio_spte, destined to 5.10 stable branch.	2021-01-07 18:06:52 -05:00
Sean Christopherson	9aa418792f	KVM: x86/mmu: Optimize not-present/MMIO SPTE check in get_mmio_spte() Check only the terminal leaf for a "!PRESENT \|\| MMIO" SPTE when looking for reserved bits on valid, non-MMIO SPTEs. The get_walk() helpers terminate their walks if a not-present or MMIO SPTE is encountered, i.e. the non-terminal SPTEs have already been verified to be regular SPTEs. This eliminates an extra check-and-branch in a relatively hot loop. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20201218003139.2167891-5-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-01-07 18:00:27 -05:00
Sean Christopherson	dde81f9477	KVM: x86/mmu: Use raw level to index into MMIO walks' sptes array Bump the size of the sptes array by one and use the raw level of the SPTE to index into the sptes array. Using the SPTE level directly improves readability by eliminating the need to reason out why the level is being adjusted when indexing the array. The array is on the stack and is not explicitly initialized; bumping its size is nothing more than a superficial adjustment to the stack frame. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20201218003139.2167891-4-seanjc@google.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-01-07 18:00:26 -05:00
Sean Christopherson	39b4d43e60	KVM: x86/mmu: Get root level from walkers when retrieving MMIO SPTE Get the so called "root" level from the low level shadow page table walkers instead of manually attempting to calculate it higher up the stack, e.g. in get_mmio_spte(). When KVM is using PAE shadow paging, the starting level of the walk, from the callers perspective, is not the CR3 root but rather the PDPTR "root". Checking for reserved bits from the CR3 root causes get_mmio_spte() to consume uninitialized stack data due to indexing into sptes[] for a level that was not filled by get_walk(). This can result in false positives and/or negatives depending on what garbage happens to be on the stack. Opportunistically nuke a few extra newlines. Fixes: 95fb5b0258b7 ("kvm: x86/mmu: Support MMIO in the TDP MMU") Reported-by: Richard Herbert <rherbert@sympatico.ca> Cc: Ben Gardon <bgardon@google.com> Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20201218003139.2167891-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-01-07 18:00:24 -05:00
Sean Christopherson	2aa078932f	KVM: x86/mmu: Use -1 to flag an undefined spte in get_mmio_spte() Return -1 from the get_walk() helpers if the shadow walk doesn't fill at least one spte, which can theoretically happen if the walk hits a not-present PDPTR. Returning the root level in such a case will cause get_mmio_spte() to return garbage (uninitialized stack data). In practice, such a scenario should be impossible as KVM shouldn't get a reserved-bit page fault with a not-present PDPTR. Note, using mmu->root_level in get_walk() is wrong for other reasons, too, but that's now a moot point. Fixes: 95fb5b0258b7 ("kvm: x86/mmu: Support MMIO in the TDP MMU") Cc: Ben Gardon <bgardon@google.com> Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20201218003139.2167891-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-01-07 18:00:23 -05:00
Jakub Kicinski	704a0f858e	Merge branch 'net-fix-netfilter-defrag-ip-tunnel-pmtu-blackhole' Florian Westphal says: ==================== net: fix netfilter defrag/ip tunnel pmtu blackhole Christian Perle reported a PMTU blackhole due to unexpected interaction between the ip defragmentation that comes with connection tracking and ip tunnels. Unfortunately setting 'nopmtudisc' on the tunnel breaks the test scenario even without netfilter. Christinas setup looks like this: +--------+ +---------+ +--------+ \|Router A\|-------\|Wanrouter\|-------\|Router B\| \| \|.IPIP..\| \|..IPIP.\| \| +--------+ +---------+ +--------+ / mtu 1400 \ / \ +--------+ +--------+ \|Client A\| \|Client B\| +--------+ +--------+ MTU is 1500 everywhere, except on Router A to Wanrouter and Wanrouter to Router B. Router A and Router B use IPIP tunnel interfaces to tunnel traffic between Client A and Client B over WAN. Client A sends a 1400 byte UDP datagram to Client B. This packet gets encapsulated in the IPIP tunnel. This works, packet is received on client B. When conntrack (or anything else that forces ip defragmentation) is enabled on Router A, the packet gets dropped on Router A after encapsulation because they exceed the link MTU. Setting the 'nopmtudisc' flag on the IPIP tunnel makes things worse, no packets pass even in the no-netfilter scenario. Patch one is a reproducer script for selftest infra. Patch two is a fix for 'nopmtudisc' behaviour so ip_tunnel will send an icmp error to Client A. This allows 'nopmtudisc' tunnel to forward the UDP datagrams. Patch three enables ip refragmentation for all reassembled packets, just like ipv6. ==================== Link: https://lore.kernel.org/r/20210105231523.622-1-fw@strlen.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-07 14:42:37 -08:00
Florian Westphal	bb4cc1a188	net: ip: always refragment ip defragmented packets Conntrack reassembly records the largest fragment size seen in IPCB. However, when this gets forwarded/transmitted, fragmentation will only be forced if one of the fragmented packets had the DF bit set. In that case, a flag in IPCB will force fragmentation even if the MTU is large enough. This should work fine, but this breaks with ip tunnels. Consider client that sends a UDP datagram of size X to another host. The client fragments the datagram, so two packets, of size y and z, are sent. DF bit is not set on any of these packets. Middlebox netfilter reassembles those packets back to single size-X packet, before routing decision. packet-size-vs-mtu checks in ip_forward are irrelevant, because DF bit isn't set. At output time, ip refragmentation is skipped as well because x is still smaller than the mtu of the output device. If ttransmit device is an ip tunnel, the packet size increases to x+overhead. Also, tunnel might be configured to force DF bit on outer header. In this case, packet will be dropped (exceeds MTU) and an ICMP error is generated back to sender. But sender already respects the announced MTU, all the packets that it sent did fit the announced mtu. Force refragmentation as per original sizes unconditionally so ip tunnel will encapsulate the fragments instead. The only other solution I see is to place ip refragmentation in the ip_tunnel code to handle this case. Fixes: d6b915e29f4ad ("ip_fragment: don't forward defragmented DF packet") Reported-by: Christian Perle <christian.perle@secunet.com> Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-07 14:42:36 -08:00
Florian Westphal	50c661670f	net: fix pmtu check in nopmtudisc mode For some reason ip_tunnel insist on setting the DF bit anyway when the inner header has the DF bit set, EVEN if the tunnel was configured with 'nopmtudisc'. This means that the script added in the previous commit cannot be made to work by adding the 'nopmtudisc' flag to the ip tunnel configuration. Doing so breaks connectivity even for the without-conntrack/netfilter scenario. When nopmtudisc is set, the tunnel will skip the mtu check, so no icmp error is sent to client. Then, because inner header has DF set, the outer header gets added with DF bit set as well. IP stack then sends an error to itself because the packet exceeds the device MTU. Fixes: 23a3647bc4f93 ("ip_tunnels: Use skb-len to PMTU check.") Cc: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-07 14:42:36 -08:00
Florian Westphal	9e7a67dee2	selftests: netfilter: add selftest for ipip pmtu discovery with enabled connection tracking Convert Christians bug description into a reproducer. Cc: Shuah Khan <shuah@kernel.org> Reported-by: Christian Perle <christian.perle@secunet.com> Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-07 14:42:31 -08:00
Vineet Gupta	bb12433bf5	ARC: unbork 5.11 bootup: fix snafu in _TIF_NOTIFY_SIGNAL handling Linux 5.11.rcX was failing to boot on ARC HSDK board. Turns out we have a couple of issues, this being the first one, and I'm to blame as I didn't pay attention during review. TIF_NOTIFY_SIGNAL support requires checking multiple TIF_* bits in kernel return code path. Old code only needed to check a single bit so BBIT0 <TIF_SIGPENDING> worked. New code needs to check multiple bits so AND <bit-mask> instruction. So needs to use bit mask variant _TIF_SIGPENDING Cc: Jens Axboe <axboe@kernel.dk> Fixes: 53855e12588743ea128 ("arc: add support for TIF_NOTIFY_SIGNAL") Link: https://github.com/foss-for-synopsys-dwc-arc-processors/linux/issues/34 Signed-off-by: Vineet Gupta <vgupta@synopsys.com>	2021-01-07 14:04:42 -08:00
Bhaskar Chowdhury	9d54ee78ae	docs: admin-guide: bootconfig: Fix feils to fails s/feils/fails/p Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20210107125610.1576368-1-unixbhaskar@gmail.com Signed-off-by: Jonathan Corbet <corbet@lwn.net>	2021-01-07 14:44:58 -07:00
Randy Dunlap	25942e5ecb	Documentation/admin-guide: kernel-parameters: hyphenate comma-separated Hyphenate "comma separated" when it is used as a compound adjective. hyphenated. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20210101040831.4148-1-rdunlap@infradead.org Signed-off-by: Jonathan Corbet <corbet@lwn.net>	2021-01-07 14:44:02 -07:00
Jonathan Neuschäfer	a734a7235e	docs: binfmt-misc: Fix .rst formatting "name below" is not part of the /proc path and should not be formatted in monospace. "doesn``t" is rendered in HTML with a double backtick. Revert it back to "doesn't". Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net> Link: https://lore.kernel.org/r/20210101211447.1021412-1-j.neuschaefer@gmx.net Signed-off-by: Jonathan Corbet <corbet@lwn.net>	2021-01-07 14:43:03 -07:00
Miguel Ojeda	0ef597c3ac	docs: remove mention of ENABLE_MUST_CHECK We removed ENABLE_MUST_CHECK in 196793946264 ("Compiler Attributes: remove CONFIG_ENABLE_MUST_CHECK"), so let's remove docs' mentions. At the same time, fix the outdated text related to ENABLE_WARN_DEPRECATED that wasn't removed in 3337d5cfe5e08 ("configs: get rid of obsolete CONFIG_ENABLE_WARN_DEPRECATED"). Finally, reflow the paragraph. Signed-off-by: Miguel Ojeda <ojeda@kernel.org> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Link: https://lore.kernel.org/r/20210105055815.GA5173@kernel.org Signed-off-by: Jonathan Corbet <corbet@lwn.net>	2021-01-07 14:41:38 -07:00
Lukas Bulwahn	f3562f5e00	docs: octeontx2: tune rst markup Commit 80b9414832a1 ("docs: octeontx2: Add Documentation for NPA health reporters") added new documentation with improper formatting for rst, and caused a few new warnings for make htmldocs in octeontx2.rst:169--202. Tune markup and formatting for better presentation in the HTML view. Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: George Cherian <george.cherian@marvell.com> Link: https://lore.kernel.org/r/20210106161735.21751-1-lukas.bulwahn@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-07 12:47:31 -08:00
Dinghao Liu	5b0bb12c58	net/mlx5e: Fix memleak in mlx5e_create_l2_table_groups When mlx5_create_flow_group() fails, ft->g should be freed just like when kvzalloc() fails. The caller of mlx5e_create_l2_table_groups() does not catch this issue on failure, which leads to memleak. Fixes: 33cfaaa8f36f ("net/mlx5e: Split the main flow steering table") Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-01-07 12:22:51 -08:00
Dinghao Liu	7a6eb072a9	net/mlx5e: Fix two double free cases mlx5e_create_ttc_table_groups() frees ft->g on failure of kvzalloc(), but such failure will be caught by its caller in mlx5e_create_ttc_table() and ft->g will be freed again in mlx5e_destroy_flow_table(). The same issue also occurs in mlx5e_create_ttc_table_groups(). Set ft->g to NULL after kfree() to avoid double free. Fixes: 7b3722fa9ef6 ("net/mlx5e: Support RSS for GRE tunneled packets") Fixes: 33cfaaa8f36f ("net/mlx5e: Split the main flow steering table") Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-01-07 12:22:50 -08:00
Leon Romanovsky	4d8be21112	net/mlx5: Release devlink object if adev fails Add missed freeing previously allocated devlink object. Fixes: a925b5e309c9 ("net/mlx5: Register mlx5 devices to auxiliary virtual bus") Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-01-07 12:22:50 -08:00
Aya Levin	b1c0aca3d3	net/mlx5e: ethtool, Fix restriction of autoneg with 56G Prior to this patch, configuring speed to 50G with autoneg off over devices supporting 50G per lane failed. Support for 50G per lane introduced a new set of link-modes, on which driver always performed a speed validation as if only legacy link-modes were configured. Fix driver speed validation to force setting autoneg over 56G only if in legacy link-mode. Fixes: 3d7cadae51f1 ("net/mlx5e: ethtool, Fix analysis of speed setting") Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Eran Ben Elisha <eranbe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-01-07 12:22:50 -08:00
Maor Dickman	e13ed0ac06	net/mlx5e: In skb build skip setting mark in switchdev mode sop_drop_qpn field in the cqe is used by two features, in SWITCHDEV mode to restore the chain id in case of a miss and in LEGACY mode to support skbedit mark action. In build RX skb, the skb mark field is set regardless of the configured mode which cause a corruption of the mark field in case of switchdev mode. Fix by overriding the mark value back to 0 in the representor tc update skb flow. Fixes: 8f1e0b97cc70 ("net/mlx5: E-Switch, Mark miss packets with new chain id mapping") Signed-off-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Raed Salem <raeds@nvidia.com> Reviewed-by: Oz Shlomo <ozsh@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-01-07 12:22:50 -08:00
Alaa Hleihel	25c904b59a	net/mlx5: E-Switch, fix changing vf VLANID Adding vf VLANID for the first time, or after having cleared previously defined VLANID works fine, however, attempting to change an existing vf VLANID clears the rules on the firmware, but does not add new rules for the new vf VLANID. Fix this by changing the logic in function esw_acl_egress_lgcy_setup() so that it will always configure egress rules. Fixes: ea651a86d468 ("net/mlx5: E-Switch, Refactor eswitch egress acl codes") Signed-off-by: Alaa Hleihel <alaa@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-01-07 12:22:49 -08:00
Moshe Shemesh	b544011f0e	net/mlx5e: Fix SWP offsets when vlan inserted by driver In case WQE includes inline header the vlan is inserted by driver even if vlan offload is set. On geneve over vlan interface where software parser is used the SWP offsets should be updated according to the added vlan. Fixes: e3cfc7e6b7bd ("net/mlx5e: TX, Add geneve tunnel stateless offload support") Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-01-07 12:22:49 -08:00
Oz Shlomo	eed38eeee7	net/mlx5e: CT: Use per flow counter when CT flow accounting is enabled Connection counters may be shared for both directions when the counter is used for connection aging purposes. However, if TC flow accounting is enabled then a unique counter is required per direction. Instantiate a unique counter per direction if the conntrack accounting extension is enabled. Use a shared counter when the connection accounting extension is disabled. Fixes: 1edae2335adf ("net/mlx5e: CT: Use the same counter for both directions") Signed-off-by: Oz Shlomo <ozsh@nvidia.com> Reported-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Paul Blakey <paulb@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-01-07 12:22:49 -08:00
Mark Zhang	0f2dcade69	net/mlx5: Use port_num 1 instead of 0 when delete a RoCE address In multi-port mode, FW reports syndrome 0x2ea48 (invalid vhca_port_number) if the port_num is not 1 or 2. Fixes: 80f09dfc237f ("net/mlx5: Eswitch, enable RoCE loopback traffic") Signed-off-by: Mark Zhang <markzhang@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-01-07 12:22:48 -08:00
Aya Levin	9c9be85f6b	net/mlx5e: Add missing capability check for uplink follow Expose firmware indication that it supports setting eswitch uplink state to follow (follow the physical link). Condition setting the eswitch uplink admin-state with this capability bit. Older FW may not support the uplink state setting. Fixes: 7d0314b11cdd ("net/mlx5e: Modify uplink state on interface up/down") Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-01-07 12:22:48 -08:00
Mark Zhang	abf8ef953a	net/mlx5: Check if lag is supported before creating one This patch fixes a memleak issue by preventing to create a lag and add PFs if lag is not supported. comm “python3”, pid 349349, jiffies 4296985507 (age 1446.976s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ……………. 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ……………. backtrace: [<000000005b216ae7>] mlx5_lag_add+0x1d5/0×3f0 [mlx5_core] [<000000000445aa55>] mlx5e_nic_enable+0x66/0×1b0 [mlx5_core] [<00000000c56734c3>] mlx5e_attach_netdev+0x16e/0×200 [mlx5_core] [<0000000030439d1f>] mlx5e_attach+0x5c/0×90 [mlx5_core] [<0000000018fd8615>] mlx5e_add+0x1a4/0×410 [mlx5_core] [<0000000068bc504b>] mlx5_add_device+0x72/0×120 [mlx5_core] [<000000009fce51f9>] mlx5_register_device+0x77/0xb0 [mlx5_core] [<00000000d0d81ff3>] mlx5_load_one+0xc58/0×1eb0 [mlx5_core] [<0000000045077adc>] init_one+0x3ea/0×920 [mlx5_core] [<0000000043287674>] pci_device_probe+0xcd/0×150 [<00000000dafd3279>] really_probe+0x1c9/0×4b0 [<00000000f06bdd84>] driver_probe_device+0x5d/0×140 [<00000000e3d508b6>] device_driver_attach+0x4f/0×60 [<0000000084fba0f0>] bind_store+0xbf/0×120 [<00000000bf6622b3>] kernfs_fop_write+0x114/0×1b0 Fixes: 9b412cc35f00 ("net/mlx5e: Add LAG warning if bond slave is not lag master") Signed-off-by: Mark Zhang <markzhang@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-01-07 12:22:48 -08:00
Linus Torvalds	f5e6c33025	spi: Fixes for v5.11 A couple of core fixes here, both to do with handling of drivers which don't report their maximum speed since we factored some of the handling for transfer speeds out into the core in the previous release. There's also some driver specific fixes, including a relatively large set for some races around timeouts in spi-geni-qcom. -----BEGIN PGP SIGNATURE----- iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAl/3NqwACgkQJNaLcl1U h9C04gf8DHM+rGA6XsQfxjlqPmu0KOq+QVfDEYvXEbpWIQ1YWyB3kK3H6ftvFMJz bWW1xBjJL6nav9pZwRllLu2P1hkXek0MpZFFRp00MrUbUW06ct103xiIGVmVPUMQ A4w44nt7WJj/k9fRyvAjrAp1LQ8pGl0H1m6kifZLpNm8GlUMrCyFz7kfWs6ojHOo uBfru37WFiSfgQeDVTBd9v0A7Nu/dEnMlHCbikifpwsB/Q0GTfs8GG4pIo+QehwQ VhOCi/x+p7JW3Tksp4DM6jawzpBCoTRt5EzNzbprIyPde3vzUj4rjiqkcAxgW3hu oiwkkjM5e3/0oVHebSkmRVuG2qlT8w== =hP9t -----END PGP SIGNATURE----- Merge tag 'spi-fix-v5.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fixes from Mark Brown: "A couple of core fixes here, both to do with handling of drivers which don't report their maximum speed since we factored some of the handling for transfer speeds out into the core in the previous release. There's also some driver specific fixes, including a relatively large set for some races around timeouts in spi-geni-qcom" * tag 'spi-fix-v5.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: fix the divide by 0 error when calculating xfer waiting time spi: Fix the clamping of spi->max_speed_hz spi: altera: fix return value for altera_spi_txrx() spi: stm32: FIFO threshold level - fix align packet size spi: spi-geni-qcom: Print an error when we timeout setting the CS spi: spi-geni-qcom: Don't try to set CS if an xfer is pending spi: spi-geni-qcom: Fail new xfers if xfer/cancel/abort pending spi: spi-geni-qcom: Fix geni_spi_isr() NULL dereference in timeout case	2021-01-07 12:21:32 -08:00
Linus Torvalds	a1a7b4f324	regulator: Fixes for v5.11 A few minor driver specific fixes, mostly DT bindings document bits, plus a new device ID. -----BEGIN PGP SIGNATURE----- iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAl/3Pe0ACgkQJNaLcl1U h9Cigwf/QVwn08GK8+fsRCRWT/VffFeK2u9QlgUvqRLy8gFoU/GE2GczbU7yOBmy ELYHxuXBBGXU2tX5o0xE280KzPsRN5IFHRpj5HOGvdtpzaL4ch2xomVttzXoPaCQ i16YcTD5AR+yzYhnY/SWExgVpNx5qbYdYEuYBHf0v1jTF/L+eH5YKYBJibgzMooy tfSNFM8Ie6d1b8UUbMETdaA7UHd0QIqphJj11DFsNJ4wfH+CqjMqI11hTImADTJK tng15k+I9R3kE57I79Cdzpcd2/yl4eyTNbRbqx7non1nMQMx0tsN/wQk5mD2JVXM uYY6Xb2enz2lloNCcGiQLFmgRvXWBw== =sZgX -----END PGP SIGNATURE----- Merge tag 'regulator-fix-v5.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator fixes from Mark Brown: "A few minor driver specific fixes, mostly DT bindings document bits, plus a new device ID" * tag 'regulator-fix-v5.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: qcom-rpmh: add QCOM_COMMAND_DB dependency regulator: qcom-rpmh-regulator: correct hfsmps515 definition dt-bindings: regulator: qcom,rpmh-regulator: add pm8009 revision regulator: bd718x7: Add enable times regulator: pf8x00: Use specific compatible strings for devices	2021-01-07 12:18:22 -08:00
Amanoel Dawod	384b77fd48	Fonts: font_ter16x32: Update font with new upstream Terminus release This is just a maintenance patch to update font_ter16x32.c with changes and minor fixes added in new upstream Terminus v4.49. >From release notes of new version 4.49, this brings: - Altered ascii grave in some sizes to be more useful as a back quote. - Fixed 21B5, added 21B2 and 21B3. Just as my initial submission of the font, above changes were obtained from new ter-i32b.psf font source. Terminus font sources are available for download at SourceForge: https://sourceforge.net/projects/terminus-font/files/terminus-font-4.49/ Simply running `make` in source directory will build the .psf font files. Signed-off-by: Amanoel Dawod <kernel@amanoeldawod.com> Link: https://lore.kernel.org/r/20201226235840.26290-1-kernel@amanoeldawod.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-01-07 21:06:25 +01:00
Sean Tranchetti	5316a7c013	tools: selftests: add test for changing routes with PTMU exceptions Adds new 2 new tests to the PTMU script: pmtu_ipv4/6_route_change. These tests explicitly test for a recently discovered problem in the IPv6 routing framework where PMTU exceptions were not properly released when replacing a route via "ip route change ...". After creating PMTU exceptions, the route from the device A to R1 will be replaced with a new route, then device A will be deleted. If the PMTU exceptions were properly cleaned up by the kernel, this device deletion will succeed. Otherwise, the unregistration of the device will stall, and messages such as the following will be logged in dmesg: unregister_netdevice: waiting for veth_A-R1 to become free. Usage count = 4 Signed-off-by: Sean Tranchetti <stranche@codeaurora.org> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/1609892546-11389-2-git-send-email-stranche@quicinc.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-07 12:03:36 -08:00
Sean Tranchetti	d8f5c29653	net: ipv6: fib: flush exceptions when purging route Route removal is handled by two code paths. The main removal path is via fib6_del_route() which will handle purging any PMTU exceptions from the cache, removing all per-cpu copies of the DST entry used by the route, and releasing the fib6_info struct. The second removal location is during fib6_add_rt2node() during a route replacement operation. This path also calls fib6_purge_rt() to handle cleaning up the per-cpu copies of the DST entries and releasing the fib6_info associated with the older route, but it does not flush any PMTU exceptions that the older route had. Since the older route is removed from the tree during the replacement, we lose any way of accessing it again. As these lingering DSTs and the fib6_info struct are holding references to the underlying netdevice struct as well, unregistering that device from the kernel can never complete. Fixes: 2b760fcf5cfb3 ("ipv6: hook up exception table to store dst cache") Signed-off-by: Sean Tranchetti <stranche@codeaurora.org> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/1609892546-11389-1-git-send-email-stranche@quicinc.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-07 12:03:16 -08:00
Linus Torvalds	fc37784dc7	regmap: Fixes for v5.11 A couple of small fixes for leaks when attaching a device to a preexisting regmap. -----BEGIN PGP SIGNATURE----- iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAl/3OOIACgkQJNaLcl1U h9BHLwf/Q5iRceYjgTZT3EbDqtkf4eX60dwoaoCdSBOtzdzgX1kYdFI2xTeqjMTh sXKTA/OfapyWQrPIWW4vx5jkrum9Wz7hoQJhxp6jLl/lSj2c1TR3U7Nx+b2OA4Ap mPrdlfxHF47jZtsjfPtVUwNyb456ZL94H0IX/kefIUVp8uQusDFBoYLefi+T3+PA cRKlJqpcPpq2amZnXYCRhi8FpFPhFZezfAiddBzpab3xE+RFihgWl9NV2kuk99nR 9SD5ZhoDQF5IQnRXhfHGp2vhXFTk+BgpDuj+TFsF9TkWNgH8gtEQwmskvf7MJFvd Bihwcb8I7Me3ZPE7NI+8YqJEHLXnJg== =2RSs -----END PGP SIGNATURE----- Merge tag 'regmap-fix-v5.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap Pull regmap fixes from Mark Brown: "A couple of small fixes for leaks when attaching a device to a preexisting regmap" * tag 'regmap-fix-v5.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap: regmap: debugfs: Fix a reversed if statement in regmap_debugfs_init() regmap: debugfs: Fix a memory leak when calling regmap_attach_dev	2021-01-07 11:57:56 -08:00

1 2 3 4 5 ...

982661 Commits