linux

iv/linux

Author	SHA1	Message	Date
YiFei Zhu	4e15f460be	Documentation/bpf: Document CGROUP_STORAGE map type The machanics and usage are not very straightforward. Given the changes it's better to document how it works and how to use it, rather than having to rely on the examples and implementation to infer what is going on. Signed-off-by: YiFei Zhu <zhuyifei@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/b412edfbb05cb1077c9e2a36a981a54ee23fa8b3.1595565795.git.zhuyifei@google.com	2020-07-25 20:16:36 -07:00
YiFei Zhu	3573f38401	selftests/bpf: Test CGROUP_STORAGE behavior on shared egress + ingress This mirrors the original egress-only test. The cgroup_storage is now extended to have two packet counters, one for egress and one for ingress. We also extend to have two egress programs to test that egress will always share with other egress origrams in the same cgroup. The behavior of the counters are exactly the same as the original egress-only test. The test is split into two, one "isolated" test that when the key type is struct bpf_cgroup_storage_key, which contains the attach type, programs of different attach types will see different storages. The other, "shared" test that when the key type is u64, programs of different attach types will see the same storage if they are attached to the same cgroup. Signed-off-by: YiFei Zhu <zhuyifei@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/c756f5f1521227b8e6e90a453299dda722d7324d.1595565795.git.zhuyifei@google.com	2020-07-25 20:16:36 -07:00
YiFei Zhu	7d9c342789	bpf: Make cgroup storages shared between programs on the same cgroup This change comes in several parts: One, the restriction that the CGROUP_STORAGE map can only be used by one program is removed. This results in the removal of the field 'aux' in struct bpf_cgroup_storage_map, and removal of relevant code associated with the field, and removal of now-noop functions bpf_free_cgroup_storage and bpf_cgroup_storage_release. Second, we permit a key of type u64 as the key to the map. Providing such a key type indicates that the map should ignore attach type when comparing map keys. However, for simplicity newly linked storage will still have the attach type at link time in its key struct. cgroup_storage_check_btf is adapted to accept u64 as the type of the key. Third, because the storages are now shared, the storages cannot be unconditionally freed on program detach. There could be two ways to solve this issue: * A. Reference count the usage of the storages, and free when the last program is detached. * B. Free only when the storage is impossible to be referred to again, i.e. when either the cgroup_bpf it is attached to, or the map itself, is freed. Option A has the side effect that, when the user detach and reattach a program, whether the program gets a fresh storage depends on whether there is another program attached using that storage. This could trigger races if the user is multi-threaded, and since nondeterminism in data races is evil, go with option B. The both the map and the cgroup_bpf now tracks their associated storages, and the storage unlink and free are removed from cgroup_bpf_detach and added to cgroup_bpf_release and cgroup_storage_map_free. The latter also new holds the cgroup_mutex to prevent any races with the former. Fourth, on attach, we reuse the old storage if the key already exists in the map, via cgroup_storage_lookup. If the storage does not exist yet, we create a new one, and publish it at the last step in the attach process. This does not create a race condition because for the whole attach the cgroup_mutex is held. We keep track of an array of new storages that was allocated and if the process fails only the new storages would get freed. Signed-off-by: YiFei Zhu <zhuyifei@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/d5401c6106728a00890401190db40020a1f84ff1.1595565795.git.zhuyifei@google.com	2020-07-25 20:16:35 -07:00
YiFei Zhu	9e5bd1f763	selftests/bpf: Test CGROUP_STORAGE map can't be used by multiple progs The current assumption is that the lifetime of a cgroup storage is tied to the program's attachment. The storage is created in cgroup_bpf_attach, and released upon cgroup_bpf_detach and cgroup_bpf_release. Because the current semantics is that each attachment gets a completely independent cgroup storage, and you can have multiple programs attached to the same (cgroup, attach type) pair, the key of the CGROUP_STORAGE map, looking up the map with this pair could yield multiple storages, and that is not permitted. Therefore, the kernel verifier checks that two programs cannot share the same CGROUP_STORAGE map, even if they have different expected attach types, considering that the actual attach type does not always have to be equal to the expected attach type. The test creates a CGROUP_STORAGE map and make it shared across two different programs, one cgroup_skb/egress and one /ingress. It asserts that the two programs cannot be both loaded, due to verifier failure from the above reason. Signed-off-by: YiFei Zhu <zhuyifei@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/30a6b0da67ae6b0296c4d511bfb19c5f3d035916.1595565795.git.zhuyifei@google.com	2020-07-25 20:16:35 -07:00
YiFei Zhu	d4a89c1eb8	selftests/bpf: Add test for CGROUP_STORAGE map on multiple attaches This test creates a parent cgroup, and a child of that cgroup. It attaches a cgroup_skb/egress program that simply counts packets, to a global variable (ARRAY map), and to a CGROUP_STORAGE map. The program is first attached to the parent cgroup only, then to parent and child. The test cases sends a message within the child cgroup, and because the program is inherited across parent / child cgroups, it will trigger the egress program for both the parent and child, if they exist. The program, when looking up a CGROUP_STORAGE map, uses the cgroup and attach type of the attachment parameters; therefore, both attaches uses different cgroup storages. We assert that all packet counts returns what we expects. Signed-off-by: YiFei Zhu <zhuyifei@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/5a20206afa4606144691c7caa0d1b997cd60dec0.1595565795.git.zhuyifei@google.com	2020-07-25 20:16:35 -07:00
Alexei Starovoitov	90065c0647	Merge branch 'fix-bpf_get_stack-with-PEBS' Song Liu says: ==================== Calling get_perf_callchain() on perf_events from PEBS entries may cause unwinder errors. To fix this issue, perf subsystem fetches callchain early, and marks perf_events are marked with __PERF_SAMPLE_CALLCHAIN_EARLY. Similar issue exists when BPF program calls get_perf_callchain() via helper functions. For more information about this issue, please refer to discussions in [1]. This set fixes this issue with helper proto bpf_get_stackid_pe and bpf_get_stack_pe. [1] https://lore.kernel.org/lkml/ED7B9430-6489-4260-B3C5-9CFA2E3AA87A@fb.com/ Changes v4 => v5: 1. Return -EPROTO instead of -EINVAL on PERF_EVENT_IOC_SET_BPF errors. (Alexei) 2. Let libbpf print a hint message when PERF_EVENT_IOC_SET_BPF returns -EPROTO. (Alexei) Changes v3 => v4: 1. Fix error check logic in bpf_get_stackid_pe and bpf_get_stack_pe. (Alexei) 2. Do not allow attaching BPF programs with bpf_get_stack\|stackid to perf_event with precise_ip > 0, but not proper callchain. (Alexei) 3. Add selftest get_stackid_cannot_attach. Changes v2 => v3: 1. Fix handling of stackmap skip field. (Andrii) 2. Simplify the code in a few places. (Andrii) Changes v1 => v2: 1. Simplify the design and avoid introducing new helper function. (Andrii) ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2020-07-25 20:16:35 -07:00
Song Liu	346938e938	selftests/bpf: Add get_stackid_cannot_attach This test confirms that BPF program that calls bpf_get_stackid() cannot attach to perf_event with precise_ip > 0 but not PERF_SAMPLE_CALLCHAIN; and cannot attach if the perf_event has exclude_callchain_kernel. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200723180648.1429892-6-songliubraving@fb.com	2020-07-25 20:16:35 -07:00
Song Liu	1da4864c2b	selftests/bpf: Add callchain_stackid This tests new helper function bpf_get_stackid_pe and bpf_get_stack_pe. These two helpers have different implementation for perf_event with PEB entries. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200723180648.1429892-5-songliubraving@fb.com	2020-07-25 20:16:35 -07:00
Song Liu	d4b4dd6ce7	libbpf: Print hint when PERF_EVENT_IOC_SET_BPF returns -EPROTO The kernel prevents potential unwinder warnings and crashes by blocking BPF program with bpf_get_[stack\|stackid] on perf_event without PERF_SAMPLE_CALLCHAIN, or with exclude_callchain_[kernel\|user]. Print a hint message in libbpf to help the user debug such issues. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200723180648.1429892-4-songliubraving@fb.com	2020-07-25 20:16:35 -07:00
Alexei Starovoitov	909e446b32	Merge branch 'bpf_iter-for-map-elems' Yonghong Song says: ==================== Bpf iterator has been implemented for task, task_file, bpf_map, ipv6_route, netlink, tcp and udp so far. For map elements, there are two ways to traverse all elements from user space: 1. using BPF_MAP_GET_NEXT_KEY bpf subcommand to get elements one by one. 2. using BPF_MAP_LOOKUP_BATCH bpf subcommand to get a batch of elements. Both these approaches need to copy data from kernel to user space in order to do inspection. This patch implements bpf iterator for map elements. User can have a bpf program in kernel to run with each map element, do checking, filtering, aggregation, modifying values etc. without copying data to user space. Patch #1 and #2 are refactoring. Patch #3 implements readonly/readwrite buffer support in verifier. Patches #4 - #7 implements map element support for hash, percpu hash, lru hash lru percpu hash, array, percpu array and sock local storage maps. Patches #8 - #9 are libbpf and bpftool support. Patches #10 - #13 are selftests for implemented map element iterators. Changelogs: v3 -> v4: . fix a kasan failure triggered by a failed bpf_iter link_create, not just free_link but need cleanup_link. (Alexei) v2 -> v3: . rebase on top of latest bpf-next v1 -> v2: . support to modify map element values. (Alexei) . map key/values can be used with helper arguments for those arguments with ARG_PTR_TO_MEM or ARG_PTR_TO_INIT_MEM register type. (Alexei) . remove usused variable. (kernel test robot) ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2020-07-25 20:16:34 -07:00
Song Liu	5d99cb2c86	bpf: Fail PERF_EVENT_IOC_SET_BPF when bpf_get_[stack\|stackid] cannot work bpf_get_[stack\|stackid] on perf_events with precise_ip uses callchain attached to perf_sample_data. If this callchain is not presented, do not allow attaching BPF program that calls bpf_get_[stack\|stackid] to this event. In the error case, -EPROTO is returned so that libbpf can identify this error and print proper hint message. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200723180648.1429892-3-songliubraving@fb.com	2020-07-25 20:16:34 -07:00
Yonghong Song	9efcc4ad7a	selftests/bpf: Add a test for out of bound rdonly buf access If the bpf program contains out of bound access w.r.t. a particular map key/value size, the verification will be still okay, e.g., it will be accepted by verifier. But it will be rejected during link_create time. A test is added here to ensure link_create failure did happen if out of bound access happened. $ ./test_progs -n 4 ... #4/23 rdonly-buf-out-of-bound:OK ... Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200723184124.591700-1-yhs@fb.com	2020-07-25 20:16:34 -07:00
Song Liu	7b04d6d60f	bpf: Separate bpf_get_[stack\|stackid] for perf events BPF Calling get_perf_callchain() on perf_events from PEBS entries may cause unwinder errors. To fix this issue, the callchain is fetched early. Such perf_events are marked with __PERF_SAMPLE_CALLCHAIN_EARLY. Similarly, calling bpf_get_[stack\|stackid] on perf_events from PEBS may also cause unwinder errors. To fix this, add separate version of these two helpers, bpf_get_[stack\|stackid]_pe. These two hepers use callchain in bpf_perf_event_data_kern->data->callchain. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200723180648.1429892-2-songliubraving@fb.com	2020-07-25 20:16:34 -07:00
Yonghong Song	3b1c420bd8	selftests/bpf: Add a test for bpf sk_storage_map iterator Added one test for bpf sk_storage_map_iterator. $ ./test_progs -n 4 ... #4/22 bpf_sk_storage_map:OK ... Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200723184122.591591-1-yhs@fb.com	2020-07-25 20:16:34 -07:00
Yonghong Song	60dd49ea65	selftests/bpf: Add test for bpf array map iterators Two subtests are added. $ ./test_progs -n 4 ... #4/20 bpf_array_map:OK #4/21 bpf_percpu_array_map:OK ... The bpf_array_map subtest also tested bpf program changing array element values and send key/value to user space through bpf_seq_write() interface. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200723184121.591367-1-yhs@fb.com	2020-07-25 20:16:34 -07:00
Yonghong Song	2a7c2fff7d	selftests/bpf: Add test for bpf hash map iterators Two subtests are added. $ ./test_progs -n 4 ... #4/18 bpf_hash_map:OK #4/19 bpf_percpu_hash_map:OK ... Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200723184120.590916-1-yhs@fb.com	2020-07-25 20:16:33 -07:00
Yonghong Song	d8793aca70	tools/bpftool: Add bpftool support for bpf map element iterator The optional parameter "map MAP" can be added to "bpftool iter" command to create a bpf iterator for map elements. For example, bpftool iter pin ./prog.o /sys/fs/bpf/p1 map id 333 For map element bpf iterator "map MAP" parameter is required. Otherwise, bpf link creation will return an error. Quentin Monnet kindly provided bash-completion implementation for new "map MAP" option. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200723184119.590799-1-yhs@fb.com	2020-07-25 20:16:33 -07:00
Yonghong Song	cd31039a73	tools/libbpf: Add support for bpf map element iterator Add map_fd to bpf_iter_attach_opts and flags to bpf_link_create_opts. Later on, bpftool or selftest will be able to create a bpf map element iterator by passing map_fd to the kernel during link creation time. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200723184117.590673-1-yhs@fb.com	2020-07-25 20:16:33 -07:00
Yonghong Song	5ce6e77c7e	bpf: Implement bpf iterator for sock local storage map The bpf iterator for bpf sock local storage map is implemented. User space interacts with sock local storage map with fd as a key and storage value. In kernel, passing fd to the bpf program does not really make sense. In this case, the sock itself is passed to bpf program. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200723184116.590602-1-yhs@fb.com	2020-07-25 20:16:33 -07:00
Yonghong Song	d3cc2ab546	bpf: Implement bpf iterator for array maps The bpf iterators for array and percpu array are implemented. Similar to hash maps, for percpu array map, bpf program will receive values from all cpus. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200723184115.590532-1-yhs@fb.com	2020-07-25 20:16:33 -07:00
Yonghong Song	d6c4503cc2	bpf: Implement bpf iterator for hash maps The bpf iterators for hash, percpu hash, lru hash and lru percpu hash are implemented. During link time, bpf_iter_reg->check_target() will check map type and ensure the program access key/value region is within the map defined key/value size limit. For percpu hash and lru hash maps, the bpf program will receive values for all cpus. The map element bpf iterator infrastructure will prepare value properly before passing the value pointer to the bpf program. This patch set supports readonly map keys and read/write map values. It does not support deleting map elements, e.g., from hash tables. If there is a user case for this, the following mechanism can be used to support map deletion for hashtab, etc. - permit a new bpf program return value, e.g., 2, to let bpf iterator know the map element should be removed. - since bucket lock is taken, the map element will be queued. - once bucket lock is released after all elements under this bucket are traversed, all to-be-deleted map elements can be deleted. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200723184114.590470-1-yhs@fb.com	2020-07-25 20:16:33 -07:00
Alexei Starovoitov	a228a64fc1	bpf: Add bpf_prog iterator It's mostly a copy paste of commit 6086d29def80 ("bpf: Add bpf_map iterator") that is use to implement bpf_seq_file opreations to traverse all bpf programs. v1->v2: Tweak to use build time btf_id Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net>	2020-07-25 20:16:32 -07:00
Yonghong Song	a5cbe05a66	bpf: Implement bpf iterator for map elements The bpf iterator for map elements are implemented. The bpf program will receive four parameters: bpf_iter_meta meta: the meta data bpf_map map: the bpf_map whose elements are traversed void key: the key of one element void value: the value of the same element Here, meta and map pointers are always valid, and key has register type PTR_TO_RDONLY_BUF_OR_NULL and value has register type PTR_TO_RDWR_BUF_OR_NULL. The kernel will track the access range of key and value during verification time. Later, these values will be compared against the values in the actual map to ensure all accesses are within range. A new field iter_seq_info is added to bpf_map_ops which is used to add map type specific information, i.e., seq_ops, init/fini seq_file func and seq_file private data size. Subsequent patches will have actual implementation for bpf_map_ops->iter_seq_info. In user space, BPF_ITER_LINK_MAP_FD needs to be specified in prog attr->link_create.flags, which indicates that attr->link_create.target_fd is a map_fd. The reason for such an explicit flag is for possible future cases where one bpf iterator may allow more than one possible customization, e.g., pid and cgroup id for task_file. Current kernel internal implementation only allows the target to register at most one required bpf_iter_link_info. To support the above case, optional bpf_iter_link_info's are needed, the target can be extended to register such link infos, and user provided link_info needs to match one of target supported ones. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200723184112.590360-1-yhs@fb.com	2020-07-25 20:16:32 -07:00
Yonghong Song	3f9969f2c0	bpf: Fix pos computation for bpf_iter seq_ops->start() Currently, the pos pointer in bpf iterator map/task/task_file seq_ops->start() is always incremented. This is incorrect. It should be increased only if pos is 0 (for SEQ_START_TOKEN) since these start() function actually returns the first real object. If pos is not 0, it merely found the object based on the state in seq->private, and not really advancing the pos. This patch fixed this issue by only incrementing pos if it is 0. Note that the old *pos calculation, although not correct, does not affect correctness of bpf_iter as bpf_iter seq_file->read() does not support llseek. This patch also renamed "mid" in bpf_map iterator seq_file private data to "map_id" for better clarity. Fixes: 6086d29def80 ("bpf: Add bpf_map iterator") Fixes: eaaacd23910f ("bpf: Add task and task/file iterator targets") Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200722195156.4029817-1-yhs@fb.com	2020-07-25 20:16:32 -07:00
Yonghong Song	afbf21dce6	bpf: Support readonly/readwrite buffers in verifier Readonly and readwrite buffer register states are introduced. Totally four states, PTR_TO_RDONLY_BUF[_OR_NULL] and PTR_TO_RDWR_BUF[_OR_NULL] are supported. As suggested by their respective names, PTR_TO_RDONLY_BUF[_OR_NULL] are for readonly buffers and PTR_TO_RDWR_BUF[_OR_NULL] for read/write buffers. These new register states will be used by later bpf map element iterator. New register states share some similarity to PTR_TO_TP_BUFFER as it will calculate accessed buffer size during verification time. The accessed buffer size will be later compared to other metrics during later attach/link_create time. Similar to reg_state PTR_TO_BTF_ID_OR_NULL in bpf iterator programs, PTR_TO_RDONLY_BUF_OR_NULL or PTR_TO_RDWR_BUF_OR_NULL reg_types can be set at prog->aux->bpf_ctx_arg_aux, and bpf verifier will retrieve the values during btf_ctx_access(). Later bpf map element iterator implementation will show how such information will be assigned during target registeration time. The verifier is also enhanced such that PTR_TO_RDONLY_BUF can be passed to ARG_PTR_TO_MEM[_OR_NULL] helper argument, and PTR_TO_RDWR_BUF can be passed to ARG_PTR_TO_MEM[_OR_NULL] or ARG_PTR_TO_UNINIT_MEM. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200723184111.590274-1-yhs@fb.com	2020-07-25 20:16:32 -07:00
Jakub Sitnicki	86176a1821	selftests/bpf: Test BPF socket lookup and reuseport with connections Cover the case when BPF socket lookup returns a socket that belongs to a reuseport group, and the reuseport group contains connected UDP sockets. Ensure that the presence of connected UDP sockets in reuseport group does not affect the socket lookup result. Socket selected by reuseport should always be used as result in such case. Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Link: https://lore.kernel.org/bpf/20200722161720.940831-3-jakub@cloudflare.com	2020-07-25 20:16:32 -07:00
Yonghong Song	f9c7927295	bpf: Refactor to provide aux info to bpf_iter_init_seq_priv_t This patch refactored target bpf_iter_init_seq_priv_t callback function to accept additional information. This will be needed in later patches for map element targets since a particular map should be passed to traverse elements for that particular map. In the future, other information may be passed to target as well, e.g., pid, cgroup id, etc. to customize the iterator. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200723184110.590156-1-yhs@fb.com	2020-07-25 20:16:32 -07:00
Yonghong Song	14fc6bd6b7	bpf: Refactor bpf_iter_reg to have separate seq_info member There is no functionality change for this patch. Struct bpf_iter_reg is used to register a bpf_iter target, which includes information for both prog_load, link_create and seq_file creation. This patch puts fields related seq_file creation into a different structure. This will be useful for map elements iterator where one iterator covers different map types and different map types may have different seq_ops, init/fini private_data function and private_data size. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200723184109.590030-1-yhs@fb.com	2020-07-25 20:16:32 -07:00
Jakub Sitnicki	c8a2983c4d	udp: Don't discard reuseport selection when group has connections When BPF socket lookup prog selects a socket that belongs to a reuseport group, and the reuseport group has connected sockets in it, the socket selected by reuseport will be discarded, and socket returned by BPF socket lookup will be used instead. Modify this behavior so that the socket selected by reuseport running after BPF socket lookup always gets used. Ignore the fact that the reuseport group might have connections because it is only relevant when scoring sockets during regular hashtable-based lookup. Fixes: 72f7e9440e9b ("udp: Run SK_LOOKUP BPF program on socket lookup") Fixes: 6d4201b1386b ("udp6: Run SK_LOOKUP BPF program on socket lookup") Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Link: https://lore.kernel.org/bpf/20200722161720.940831-2-jakub@cloudflare.com	2020-07-25 20:16:01 -07:00
Andrii Nakryiko	f3c93a93b5	tools/bpftool: Strip BPF .o files before skeleton generation Strip away DWARF info from .bpf.o files, before generating BPF skeletons. This reduces bpftool binary size from 3.43MB to 2.58MB. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20200722043804.2373298-1-andriin@fb.com	2020-07-25 20:15:32 -07:00
David S. Miller	a57066b1a0	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net The UDP reuseport conflict was a little bit tricky. The net-next code, via bpf-next, extracted the reuseport handling into a helper so that the BPF sk lookup code could invoke it. At the same time, the logic for reuseport handling of unconnected sockets changed via commit efc6b6f6c3113e8b203b9debfb72d81e0f3dcace which changed the logic to carry on the reuseport result into the rest of the lookup loop if we do not return immediately. This requires moving the reuseport_has_conns() logic into the callers. While we are here, get rid of inline directives as they do not belong in foo.c files. The other changes were cases of more straightforward overlapping modifications. Signed-off-by: David S. Miller <davem@davemloft.net>	2020-07-25 17:49:04 -07:00
Linus Torvalds	04300d66f0	RISC-V Fixes for 5.8-rc7 I have a few more fixes this week: * A fix to avoid using SBI calls during kasan initialization, as the SBI calls themselves have not been probed yet. * Three fixes related to systems with multiple memory regions. -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAl8cebwTHHBhbG1lckBk YWJiZWx0LmNvbQAKCRAuExnzX7sYiV4UEACunm3LQGjjZUbHX/1/X2iAdNBcnbiZ SFSXaKuAiDzSgT0//UMloo33eywo3afGCTOCysi3zrwNU+Hbj5l9byTMnvXy9hGv QCFf2aEma9Ebk9ImdLDjkHQy2e1SiQRHQiKQcGTtP4LN470rrysh0a8qLLVAGucJ j5fDPcy9smRw4x0lGHqLXDmVIm8nQNJAj6iKVQYzA2mSJC2+RoKcPcunnXvoOUwl q/lcp41BavZyYsDIoAblRCA1/sfFL9BfocL4fUBFh4HUp/qcqwsGPv+U5QoEzkQ8 9LLLuXnHXgU5BTv4zVCUqXjMza6fgZj+qMpxXLQig0DRBetuQY2gpCdt2JAkcXTj XvkHavxkA6206myl9xY6OJueC/25I6dp0eDENBpUNET8fqbeFBhIUCuNtAzT+98J OcMImyFQgoesGU+6dkdHr6tSJHDcI3PzXlVUNH3hkIQq5Ny10g+jjAInZo6iXfpW yBu+AzvippbD6C8/L+ntP94y9F/Me5rZKWMBhT2gTLm6JLrtC/Wl+ol2fZF8gmIU wdT6TUykDY6u6bdqPSRKISzy0HfuREQNJU3qqoI496BMWQ1DaQjHB1482fdtdELI KUFkM6u9iudWu5Dap45rSCjeZu01cLqnQqO5M3LYUrCy0D02j5SiGfUwRYo+eD0S me68CtmbUPsH7Q== =FG/F -----END PGP SIGNATURE----- Merge tag 'riscv-for-linus-5.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux into master Pull RISC-V fixes from Palmer Dabbelt: "A few more fixes this week: - A fix to avoid using SBI calls during kasan initialization, as the SBI calls themselves have not been probed yet. - Three fixes related to systems with multiple memory regions" * tag 'riscv-for-linus-5.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: Parse all memory blocks to remove unusable memory RISC-V: Do not rely on initrd_start/end computed during early dt parsing RISC-V: Set maximum number of mapped pages correctly riscv: kasan: use local_tlb_flush_all() to avoid uninitialized __sbi_rfence	2020-07-25 14:42:11 -07:00
Linus Torvalds	fbe0d451bc	Misc fixes: - Fix a section end page alignment assumption that was causing crashes - Fix ORC unwinding on freshly forked tasks which haven't executed yet and which have empty user task stacks - Fix the debug.exception-trace=1 sysctl dumping of user stacks, which was broken by recent maccess changes Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl8cGlwRHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1hVGxAAhW5LoNLv6BsMtKrFhBJRzPhoO8YcB8ep Z/MaaiEcS8kcSVx1nQrNNhqzwIGBlexHTyoddYQr/LeM3ahOoXc4+jZI7Es6WKdc Rhofalp0pryJrqj52orAVAUeHikmiqmadANKKmUNtXcjWWoxexgab28gLwJPPA55 SKw1/tqQ6JTf/M6tLnalib/VcigN7F/r797LgqLan9pSkjMGj6f1zMFmqTsNDi05 Si5gqRv2GlAraelRQU7emMZGu1bXhfJvGix7BB1/u6nRimY15vIFtccgt4akJmFN tBUwNHFeLrYsoX1txjMTtGzASBqtjaD6A1K2geqmuZ1lRL1YuGnUl7n4ccgAvQdJ eY6iiLiXDcjZLZXYe5tfdV/4p3HZ3MYm/IRkaw2KfiGAIv0Gii/U0mShO+lX1PAZ BdelgJv2fcOxDiNe0RM4EniBXQUcqDIOpQ1fPryYsFLeeA1LCK2/b8WLOPHuK8YG oxAgU9vV7OA7nfBdxIPszvX1jXOiQPglsW2gfxlKbGAA9j3uKDGhGnmFFYatYUMF OEOf/5iuw3MZE/+gAG685ksk4ibgW4Kv6ahAEdzs5hh37uM8SiC3oWVWQf2YWGHG ITswqKY2oUqghWa3ftXEtPLKstsYbRG7KCRfBXh6AN2ay7JrDJg6sItCL4UI5BjK dkDJ4pKLXug= =oamN -----END PGP SIGNATURE----- Merge tag 'x86-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into master Pull x86 fixes from Ingo Molnar: "Misc fixes: - Fix a section end page alignment assumption that was causing crashes - Fix ORC unwinding on freshly forked tasks which haven't executed yet and which have empty user task stacks - Fix the debug.exception-trace=1 sysctl dumping of user stacks, which was broken by recent maccess changes" * tag 'x86-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/dumpstack: Dump user space code correctly again x86/stacktrace: Fix reliable check for empty user task stacks x86/unwind/orc: Fix ORC for newly forked tasks x86, vmlinux.lds: Page-align end of ..page_aligned sections	2020-07-25 14:25:47 -07:00
Linus Torvalds	78b1afe22d	Fix a interaction/regression between uprobes based shared library tracing & GDB. Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl8cHqwRHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1iyTxAAghbrqeeT6V4lj8xfBif5BOvexEPDaur5 2OPXRMLg4YWYCCC8Wwdj43Nnz3IuG5A75acyXkxmqLgvIu4QE7EAm8CYuyy0LWZa NK7A8v5fbDOKJJL/WdSZ/xoFP27KLgE4u0gaeBBOp9Chib/bWo6h63jxsOpbQybE u5u6x4uAm90RKRocYnfHaMXxI8kDVnRRdfwmVjAcy3dbu3uth02vdfqlmXrmPh1A kpqH/EnIlw9DrJCGOdDJPCmCTjcj5xYyy3qA7C6DkY3X+oe4AYtMWnfCPOgV0k7v xGq0j9jIa0kynHQsZxjX28CObuumMrh6ZbI7/UfE8W8yccpUQKkzTWKhqzb/9Wc7 E5SivCiykAhUPCbhcwZIihW6nwjWSF54Z1UCxi/PA1RMRTe89e96881Xfgh/huW7 TCHnK9lRyWAP+ceJcj72BEobRElfsHyeTbd0xwS/a+LBCftxp/r8Fm1DHrrBaGtL WE/lHXxgRR2b/ieAVhlEUNwd7tn4YBjcad8WstZTYIZHMxtmPvUKjaDhMuWNclIH QtjRYy1vA0gAf2Lbs3EP52sTZXyxlksftkwlca0z8VF5SezDER0R4V4EgQqIa8zC mLr7jmnKqEeOp6vkdhTSnDHYNHuVw4OpGCY+28vh17b/z6MFNokAq5Bd/fV07oKg nassUOSteTw= =eCSv -----END PGP SIGNATURE----- Merge tag 'perf-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into master Pull uprobe fix from Ingo Molnar: "Fix an interaction/regression between uprobes based shared library tracing & GDB" * tag 'perf-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: uprobes: Change handle_swbp() to send SIGTRAP with si_code=SI_KERNEL, to fix GDB regression	2020-07-25 13:55:38 -07:00
Linus Torvalds	a7b36c2b13	Fix a suspend/resume regression (crash) on TI AM3/AM4 SoC's. Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl8cDeIRHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1gVeA/8Cs67mJ9j3oto917z1GLElBVy7A+od/Pn TqWEtwmonJA/ZM/TvuvfFxHFg6kgPlDx0x5pHYUows/8DGsftdveDQw4bGhrCN0i Xghcn4gB+jdOaykpLQacEukYt/nPU0IOf79XqySiu774lWHvzwQI+2MIb7xBHBMO npUXFzgeSlgjFkMm1lVa9QVPwVc1zQ6gKPJdLpDlrgW3cCrenM7QCAg3NG3Gx5TT OXK4QZEvIuym+HWwbZRA2mOUSjm8U1Z1o95jiWUro7o5Xe+Agqi5zsLJc7lxX57+ LYwrXNx/zPZ+oBlzZhZZrAxn1bJP7DVS13ZCEL9Miy0Uslr7QBPyoGHiV4P8tL3v cOn7kY+P7YT96Fx3ahkK3pJN+nUZIAsRrwZbgOP4bZHKdVi3zqUEilSdZNcXBZ9y nj9mY/2pX9SOMIvoaosku2hkKs5EKIkh3qzhvbfA8fXsQC3lttrPGRW7hgPmxm7W Fa+6liOi+t+d0rGgmi1fhHSn/L2gzmsLy5R1l6XyK15nDSALd4TrdtDrlJ8vbbl5 oRAN9mt9qPqNCQQKjcnAIXuPeJp2Pt6KphlPApHVa3J/eVzug1dFWyjia/e+js8W Gejz1TV6SDdU137TmC4E6suAZ5nRSEBwiE+MiNnnG0DIvz7cp5FSIWCXOgt5WPl5 BXS4CSwnF24= =7P/h -----END PGP SIGNATURE----- Merge tag 'timers-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into master Pull timer fix from Ingo Molnar: "Fix a suspend/resume regression (crash) on TI AM3/AM4 SoC's" * tag 'timers-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: clocksource/drivers/timer-ti-dm: Fix suspend and resume for am3 and am4	2020-07-25 13:27:12 -07:00
Linus Torvalds	3077805eeb	Fix a race introduced by the recent loadavg race fix, plus add a debug check for a hard to debug case of bogus wakeup function flags. Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl8cDFkRHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1h2Hw//WW67DawJjUVoz7Hwjgf02ECgY1XspmoW 5oJjtXFU6pvOThL4U8qcb1HGOtWV/MqFajv25RX4lIyaxBo8AD5mazVQCwOc3s2C nR9ZM2kYRgSKbrSp2EOoGYCTw76do9y8beahZ5btvSkjNzD1YgDJwwcuGeMZwp3w cFg8GeeUbLRFcsVVMx+apirg2KR9rUp1jqjVdjsln8C/VngjWPYFdktgAglcXfry 5Op2SOkYUw4gw/vX8jm6YyyxtqZHbH5lsDN7zk12W3Xwqh9tH2ounk+ehzFlT+cY HD8PvB39cT4UYtF7R0R7W6cYuLJkbF6XHpRhqBvwd4M7u6yzABp5S2W9ZfZPjmQR nNTV2MBK9LUWTtMhXby7w7ooooSFqRydEUpEfTrmD8xYmMGYMmNkjpgErXKtIyk+ a124ouvP8pwmI0WxA9w8rPKAJnuN0TN60RvtEE4RpxvaL4Os79QcTObhwuxdarxm 8nJvQRqts0O3+i9dWF8hS9ZFWkQZfX9GfiEK6zR2qaqjxtVwvagc58XLBcRBd9pn CI8JvJ1pZ8AFC6dy/szZ/6awrEJzynfeInYXnVX8UQ7Cn+bBQdTAlgXopLY58ybY RKTIq4/Ov9D7g3RggoK+aEjryxJflN0xaz9Dus8fvveIZuFHtsra0u7FM3ZHz38x rmP1A6Dc3Yc= =XPX3 -----END PGP SIGNATURE----- Merge tag 'sched-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into master Pull scheduler fixes from Ingo Molnar: "Fix a race introduced by the recent loadavg race fix, plus add a debug check for a hard to debug case of bogus wakeup function flags" * tag 'sched-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched: Warn if garbage is passed to default_wake_function() sched: Fix race against ptrace_freeze_trace()	2020-07-25 13:24:40 -07:00
Linus Torvalds	17baa44286	Various EFI fixes: - Fix the layering violation in the use of the EFI runtime services availability mask in users of the 'efivars' abstraction - Revert build fix for GCC v4.8 which is no longer supported - Clean up some x86 EFI stub details, some of which are borderline bugs that copy around garbage into padding fields - let's fix these out of caution. - Fix build issues while working on RISC-V support - Avoid --whole-archive when linking the stub on arm64 Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl8cCIkRHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1hGsxAAsPwL5ghYykViUQId0OjNmI7eRgISKRso 3n3BaFmgYMp9eq1gndzPp7A5Ty0NPsr8FiwuZ7setk9YoLuTUT1MHVtAzd6xkxlR 838CwTDvW5HvB69uxPQDHA/1mcH1smH4Iew/J7QXP+o6Zrg+BWOjKNtTiKFwawNC m4Tkdvvq52wzykqbeuRhXxLetKFOH//R1V0s4M6nNySuY6gQGJQ2LPyaiMN6eq1V LE8+wGcNlIRUOeC8RkEA7CE9g92jkGZ07uJDA09OP0J5WBNWLcxMJM2mBZ1Ho6uc sNNOeTy76sjuXQvUBWCnjBr3/qqXnVGSuIq8NS1hS4L7HagdQ/peqFJRzvWBHT90 b9wUZv09ioFm9lb6/P6NL16sn/WPknCD7umxpfp5HKrlL2p7puvsvDXtuWgyhhPG M+X9ZX1iaA54cA9bU6cXFzrNMw/DnYjHFsECF915EXjItNZToGs0rd1lf7ArgH2P +3HgvLods73ufObKH2pUfY7EU2Ly1oJsNpK3RmpoOUzehW+++S80KdytkaAB10kT dKp9LlTj6gC+lnC45J9NtpHlCodz/Rc0lBHQpxIqlk/p9grUuH4zn714Ii/FfNQg i29GX9cCdgv+4KzmJCNTqyZvGFZ5m3K8f41x7iD+/ygqPLsP9PfV5RGRGau+FfL8 ezInhKdITqM= =n3nF -----END PGP SIGNATURE----- Merge tag 'efi-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into master Pull EFI fixes from Ingo Molnar: "Various EFI fixes: - Fix the layering violation in the use of the EFI runtime services availability mask in users of the 'efivars' abstraction - Revert build fix for GCC v4.8 which is no longer supported - Clean up some x86 EFI stub details, some of which are borderline bugs that copy around garbage into padding fields - let's fix these out of caution. - Fix build issues while working on RISC-V support - Avoid --whole-archive when linking the stub on arm64" * tag 'efi-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: efi: Revert "efi/x86: Fix build with gcc 4" efi/efivars: Expose RT service availability via efivars abstraction efi/libstub: Move the function prototypes to header file efi/libstub: Fix gcc error around __umoddi3 for 32 bit builds efi/libstub/arm64: link stub lib.a conditionally efi/x86: Only copy upto the end of setup_header efi/x86: Remove unused variables	2020-07-25 13:18:42 -07:00
Linus Torvalds	7cb3a5c5f6	Fixfor a recently discovered regression in rename to older servers caused by a recent patch -----BEGIN PGP SIGNATURE----- iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAl8bmiQACgkQiiy9cAdy T1EfFwv+IeD4kNshoBF5NT//+7j+9n+R8z74v5bJUEmoabZMuCaWuYl/uUcBqIBH l29Xd60aF8V2jd3xIt/SEPNFOAvvBSQIjn8X52Lkquo+QzgiOElSUL8aNZe17T9L N1S+nhwk2udlveBtWZRxHT50WNdKQCGKEwUDn9ouUbuKxLrcwWc1gRCxkegmJXey NRaynzdCRYFx7uLdctvesTDB4P9MJn+5vX7L1BdI7MxXzsSEI1yHpM5OBL9e13ki swZ7P16qyCA6TK8EYuYLCeXKEK0X+wGeu/y8JZ7RjO9ozqrvECvguumNbcUCs+zf vLWQkaeC2M1L5L4WV71sY5Pi7CUGw9WGaU/FBJUK65+hYqUWETt351BwbIRwVlrx mX7C7Dh50AM84/54u169Sk1p/S8vjdvkmx8YyRWK+t7kmZx6TP7XKXvYN1vEHS12 Y5ceRKmNHLdRqS3nY6FK7TKdgVWK2hgNODDdVbUZ+iPoJztSmH7j2RkTzxJz6qBi vzBAPJiG =Cvv4 -----END PGP SIGNATURE----- Merge tag '5.8-rc6-cifs-fix' of git://git.samba.org/sfrench/cifs-2.6 into master Pull cifs fix from Steve French: "A fix for a recently discovered regression in rename to older servers caused by a recent patch" * tag '5.8-rc6-cifs-fix' of git://git.samba.org/sfrench/cifs-2.6: Revert "cifs: Fix the target file was deleted when rename failed."	2020-07-25 12:53:46 -07:00
Linus Torvalds	1b64b2e244	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net into master Pull networking fixes from David Miller: 1) Fix RCU locaking in iwlwifi, from Johannes Berg. 2) mt76 can access uninitialized NAPI struct, from Felix Fietkau. 3) Fix race in updating pause settings in bnxt_en, from Vasundhara Volam. 4) Propagate error return properly during unbind failures in ax88172a, from George Kennedy. 5) Fix memleak in adf7242_probe, from Liu Jian. 6) smc_drv_probe() can leak, from Wang Hai. 7) Don't muck with the carrier state if register_netdevice() fails in the bonding driver, from Taehee Yoo. 8) Fix memleak in dpaa_eth_probe, from Liu Jian. 9) Need to check skb_put_padto() return value in hsr_fill_tag(), from Murali Karicheri. 10) Don't lose ionic RSS hash settings across FW update, from Shannon Nelson. 11) Fix clobbered SKB control block in act_ct, from Wen Xu. 12) Missing newlink in "tx_timeout" sysfs output, from Xiongfeng Wang. 13) IS_UDPLITE cleanup a long time ago, incorrectly handled transformations involving UDPLITE_RECV_CC. From Miaohe Lin. 14) Unbalanced locking in netdevsim, from Taehee Yoo. 15) Suppress false-positive error messages in qed driver, from Alexander Lobakin. 16) Out of bounds read in ax25_connect and ax25_sendmsg, from Peilin Ye. 17) Missing SKB release in cxgb4's uld_send(), from Navid Emamdoost. 18) Uninitialized value in geneve_changelink(), from Cong Wang. 19) Fix deadlock in xen-netfront, from Andera Righi. 19) flush_backlog() frees skbs with IRQs disabled, so should use dev_kfree_skb_irq() instead of kfree_skb(). From Subash Abhinov Kasiviswanathan. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (111 commits) drivers/net/wan: lapb: Corrected the usage of skb_cow dev: Defer free of skbs in flush_backlog qrtr: orphan socket in qrtr_release() xen-netfront: fix potential deadlock in xennet_remove() flow_offload: Move rhashtable inclusion to the source file geneve: fix an uninitialized value in geneve_changelink() bonding: check return value of register_netdevice() in bond_newlink() tcp: allow at most one TLP probe per flight AX.25: Prevent integer overflows in connect and sendmsg cxgb4: add missing release on skb in uld_send() net: atlantic: fix PTP on AQC10X AX.25: Prevent out-of-bounds read in ax25_sendmsg() sctp: shrink stream outq when fails to do addstream reconf sctp: shrink stream outq only when new outcnt < old outcnt AX.25: Fix out-of-bounds read in ax25_connect() enetc: Remove the mdio bus on PF probe bailout net: ethernet: ti: add NETIF_F_HW_TC hw feature flag for taprio offload net: ethernet: ave: Fix error returns in ave_init drivers/net/wan/x25_asy: Fix to make it work ipvs: fix the connection sync failed in some cases ...	2020-07-25 11:50:59 -07:00
Atish Patra	fa5a198359	riscv: Parse all memory blocks to remove unusable memory Currently, maximum physical memory allowed is equal to -PAGE_OFFSET. That's why we remove any memory blocks spanning beyond that size. However, it is done only for memblock containing linux kernel which will not work if there are multiple memblocks. Process all memory blocks to figure out how much memory needs to be removed and remove at the end instead of updating the memblock list in place. Signed-off-by: Atish Patra <atish.patra@wdc.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>	2020-07-24 22:08:25 -07:00
Atish Patra	4400231c8a	RISC-V: Do not rely on initrd_start/end computed during early dt parsing Currently, initrd_start/end are computed during early_init_dt_scan but used during arch_setup. We will get the following panic if initrd is used and CONFIG_DEBUG_VIRTUAL is turned on. [ 0.000000] ------------[ cut here ]------------ [ 0.000000] kernel BUG at arch/riscv/mm/physaddr.c:33! [ 0.000000] Kernel BUG [#1] [ 0.000000] Modules linked in: [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.8.0-rc4-00015-ged0b226fed02 #886 [ 0.000000] epc: ffffffe0002058d2 ra : ffffffe0000053f0 sp : ffffffe001001f40 [ 0.000000] gp : ffffffe00106e250 tp : ffffffe001009d40 t0 : ffffffe00107ee28 [ 0.000000] t1 : 0000000000000000 t2 : ffffffe000a2e880 s0 : ffffffe001001f50 [ 0.000000] s1 : ffffffe0001383e8 a0 : ffffffe00c087e00 a1 : 0000000080200000 [ 0.000000] a2 : 00000000010bf000 a3 : ffffffe00106f3c8 a4 : ffffffe0010bf000 [ 0.000000] a5 : ffffffe000000000 a6 : 0000000000000006 a7 : 0000000000000001 [ 0.000000] s2 : ffffffe00106f068 s3 : ffffffe00106f070 s4 : 0000000080200000 [ 0.000000] s5 : 0000000082200000 s6 : 0000000000000000 s7 : 0000000000000000 [ 0.000000] s8 : 0000000080011010 s9 : 0000000080012700 s10: 0000000000000000 [ 0.000000] s11: 0000000000000000 t3 : 000000000001fe30 t4 : 000000000001fe30 [ 0.000000] t5 : 0000000000000000 t6 : ffffffe00107c471 [ 0.000000] status: 0000000000000100 badaddr: 0000000000000000 cause: 0000000000000003 [ 0.000000] random: get_random_bytes called from print_oops_end_marker+0x22/0x46 with crng_init=0 To avoid the error, initrd_start/end can be computed from phys_initrd_start/size in setup itself. It also improves the initrd placement by aligning the start and size with the page size. Fixes: 76d2a0493a17 ("RISC-V: Init and Halt Code") Signed-off-by: Atish Patra <atish.patra@wdc.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>	2020-07-24 21:24:27 -07:00
Xie He	8754e1379e	drivers/net/wan: lapb: Corrected the usage of skb_cow This patch fixed 2 issues with the usage of skb_cow in LAPB drivers "lapbether" and "hdlc_x25": 1) After skb_cow fails, kfree_skb should be called to drop a reference to the skb. But in both drivers, kfree_skb is not called. 2) skb_cow should be called before skb_push so that is can ensure the safety of skb_push. But in "lapbether", it is incorrectly called after skb_push. More details about these 2 issues: 1) The behavior of calling kfree_skb on failure is also the behavior of netif_rx, which is called by this function with "return netif_rx(skb);". So this function should follow this behavior, too. 2) In "lapbether", skb_cow is called after skb_push. This results in 2 logical issues: a) skb_push is not protected by skb_cow; b) An extra headroom of 1 byte is ensured after skb_push. This extra headroom has no use in this function. It also has no use in the upper-layer function that this function passes the skb to (x25_lapb_receive_frame in net/x25/x25_dev.c). So logically skb_cow should instead be called before skb_push. Cc: Eric Dumazet <edumazet@google.com> Cc: Martin Schiller <ms@dev.tdt.de> Signed-off-by: Xie He <xie.he.0141@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-07-24 20:17:42 -07:00
David S. Miller	dfecd3e00c	Merge branch 'net-dsa-mv88e6xxx-port-mtu-support' Chris Packham says: ==================== net: dsa: mv88e6xxx: port mtu support This series connects up the mv88e6xxx switches to the dsa infrastructure for configuring the port MTU. The first patch is also a bug fix which might be a candiatate for stable. I've rebased this series on top of net-next/master to pick up Andrew's change for the gigabit switches. Patch 1 and 2 are unchanged (aside from adding Andrew's Reviewed-by). Patch 3 is reworked to make use of the existing mtu support. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-07-24 20:03:28 -07:00
Chris Packham	1baf0fac10	net: dsa: mv88e6xxx: Use chip-wide max frame size for MTU Some of the chips in the mv88e6xxx family don't support jumbo configuration per port. But they do have a chip-wide max frame size that can be used. Use this to approximate the behaviour of configuring a port based MTU. Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-07-24 20:03:27 -07:00
Chris Packham	e8b34c67d6	net: dsa: mv88e6xxx: Support jumbo configuration on 6190/6190X The MV88E6190 and MV88E6190X both support per port jumbo configuration just like the other GE switches. Install the appropriate ops. Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-07-24 20:03:27 -07:00
Chris Packham	0f3c66a3c7	net: dsa: mv88e6xxx: MV88E6097 does not support jumbo configuration The MV88E6097 chip does not support configuring jumbo frames. Prior to commit 5f4366660d65 only the 6352, 6351, 6165 and 6320 chips configured jumbo mode. The refactor accidentally added the function for the 6097. Remove the erroneous function pointer assignment. Fixes: 5f4366660d65 ("net: dsa: mv88e6xxx: Refactor setting of jumbo frames") Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-07-24 20:03:27 -07:00
Subash Abhinov Kasiviswanathan	7df5cb75cf	dev: Defer free of skbs in flush_backlog IRQs are disabled when freeing skbs in input queue. Use the IRQ safe variant to free skbs here. Fixes: 145dd5f9c88f ("net: flush the softnet backlog in process context") Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-07-24 19:59:22 -07:00
Atish Patra	d0d8aae645	RISC-V: Set maximum number of mapped pages correctly Currently, maximum number of mapper pages are set to the pfn calculated from the memblock size of the memblock containing kernel. This will work until that memblock spans the entire memory. However, it will be set to a wrong value if there are multiple memblocks defined in kernel (e.g. with efi runtime services). Set the the maximum value to the pfn calculated from dram size. Signed-off-by: Atish Patra <atish.patra@wdc.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>	2020-07-24 18:53:42 -07:00
Linus Torvalds	23ee3e4e5b	pci-v5.8-fixes-2 -----BEGIN PGP SIGNATURE----- iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAl8beM8UHGJoZWxnYWFz QGdvb2dsZS5jb20ACgkQWYigwDrT+vygXhAAwYMQgoLF+HZDVGWK6q/YmN+Q0ygb tXtnKt/ehf+zrvWiWZxmRVPZ71MjR9Ut6kXQ+MWlADkLkBeIxsCMj4N8ZjGSp/Gh UMfFwVYoUhJG3AM8uko6hCugsYsxagFA5zr53crHIE9AHCp/2Ca4lq1VrBB5k5OQ v4bghphxtPZXtUelHHcbmylU8EmnPG79XzL1VYqgeOd98nSrn34c68Ib7tezLF3I zgRraYghlyI6TGvbA55vdouMs+z+eYYN2hIu6FEXDx2lm81IvFYD5LdXwfb78Ns3 4+BcyuqHcsyvHuxD77Y/JrlmLdFopFRyhexfc887znqD2nXc0kAAtLXKM+d+/bXz vV8qtsDVWhYXS6cwvDuYSjsa1DiaWyJNQ/4tIuu5hYofXfi7ltyATkVrAjE0QPaC QpWnMVKecXSjd//VyTSvAAgO3Bk27uhXfTy3n7lpCTcpzMJj04t6go6u8/SOLtz9 ECWs4WWQVoi7zX65chcXtwce2+plQA6Pu7duzNLo5hP5KIIH4Ij0I19w1yGZueUm V5CIRAsVXPfQPJ45nhKwHldHJpZ/uXM97nYH5F9H14j9VXrrMJWptzyZiH53ZKvE 9GLBfHGcyKBCHcOhJtEJqgf7gkrxGpfSVcacYmPClB8SOsTd67mzIVRdH3xS7DUk j8VlWiNB266bHaY= =hzUW -----END PGP SIGNATURE----- Merge tag 'pci-v5.8-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci into master Pull PCI fixes from Bjorn Helgaas: - Reject invalid IRQ 0 command line argument for virtio_mmio because IRQ 0 now generates warnings (Bjorn Helgaas) - Revert "PCI/PM: Assume ports without DLL Link Active train links in 100 ms", which broke nouveau (Bjorn Helgaas) * tag 'pci-v5.8-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: Revert "PCI/PM: Assume ports without DLL Link Active train links in 100 ms" virtio-mmio: Reject invalid IRQ 0 command line argument	2020-07-24 18:30:24 -07:00
Cong Wang	af9f691f0f	qrtr: orphan socket in qrtr_release() We have to detach sock from socket in qrtr_release(), otherwise skb->sk may still reference to this socket when the skb is released in tun->queue, particularly sk->sk_wq still points to &sock->wq, which leads to a UAF. Reported-and-tested-by: syzbot+6720d64f31c081c2f708@syzkaller.appspotmail.com Fixes: 28fb4e59a47d ("net: qrtr: Expose tunneling endpoint to user space") Cc: Bjorn Andersson <bjorn.andersson@linaro.org> Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-07-24 17:29:52 -07:00

1 2 3 4 5 ...

936579 Commits