linux

iv/linux

Author	SHA1	Message	Date
Andrii Nakryiko	0cfdcd6378	libbpf: Add base BTF accessor Add ability to get base BTF. It can be also used to check if BTF is split BTF. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20201202065244.530571-3-andrii@kernel.org	2020-12-03 10:17:26 -08:00
Andrii Nakryiko	71ccb50074	tools/bpftool: Emit name <anon> for anonymous BTFs For consistency of output, emit "name <anon>" for BTFs without the name. This keeps output more consistent and obvious. Suggested-by: Song Liu <songliubraving@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20201202065244.530571-2-andrii@kernel.org	2020-12-03 10:17:26 -08:00
Eric Sandeen	72d1249e2f	uapi: fix statx attribute value overlap for DAX & MOUNT_ROOT STATX_ATTR_MOUNT_ROOT and STATX_ATTR_DAX got merged with the same value, so one of them needs fixing. Move STATX_ATTR_DAX. While we're in here, clarify the value-matching scheme for some of the attributes, and explain why the value for DAX does not match. Fixes: 80340fe3605c ("statx: add mount_root") Fixes: 712b2698e4c0 ("fs/stat: Define DAX statx attribute") Link: https://lore.kernel.org/linux-fsdevel/7027520f-7c79-087e-1d00-743bdefa1a1e@redhat.com/ Link: https://lore.kernel.org/lkml/20201202214629.1563760-1-ira.weiny@intel.com/ Reported-by: David Howells <dhowells@redhat.com> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: David Howells <dhowells@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Cc: <stable@vger.kernel.org> # 5.8 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-12-03 10:03:14 -08:00
Uwe Kleine-König	062c9cdf60	pwm: sl28cpld: fix getting driver data in pwm callbacks Currently .get_state() and .apply() use dev_get_drvdata() on the struct device related to the pwm chip. This only works after .probe() called platform_set_drvdata() which in this driver happens only after pwmchip_add() and so comes possibly too late. Instead of setting the driver data earlier use the traditional container_of approach as this way the driver data is conceptually and computational nearer. Fixes: 9db33d221efc ("pwm: Add support for sl28cpld PWM controller") Tested-by: Michael Walle <michael@walle.cc> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-12-03 09:57:37 -08:00
Willy Tarreau	4f134b89a2	lib/syscall: fix syscall registers retrieval on 32-bit platforms Lilith >_> and Claudio Bozzato of Cisco Talos security team reported that collect_syscall() improperly casts the syscall registers to 64-bit values leaking the uninitialized last 24 bytes on 32-bit platforms, that are visible in /proc/self/syscall. The cause is that info->data.args are u64 while syscall_get_arguments() uses longs, as hinted by the bogus pointer cast in the function. Let's just proceed like the other call places, by retrieving the registers into an array of longs before assigning them to the caller's array. This was successfully tested on x86_64, i386 and ppc32. Reference: CVE-2020-28588, TALOS-2020-1211 Fixes: 631b7abacd02 ("ptrace: Remove maxargs from task_current_syscall()") Cc: Greg KH <greg@kroah.com> Reviewed-by: Kees Cook <keescook@chromium.org> Tested-by: Michael Ellerman <mpe@ellerman.id.au> (ppc32) Signed-off-by: Willy Tarreau <w@1wt.eu> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-12-03 09:52:44 -08:00
Thomas Karlsson	d4bff72c84	macvlan: Support for high multicast packet rate Background: Broadcast and multicast packages are enqueued for later processing. This queue was previously hardcoded to 1000. This proved insufficient for handling very high packet rates. This resulted in packet drops for multicast. While at the same time unicast worked fine. The change: This patch make the queue length adjustable to accommodate for environments with very high multicast packet rate. But still keeps the default value of 1000 unless specified. The queue length is specified as a request per macvlan using the IFLA_MACVLAN_BC_QUEUE_LEN parameter. The actual used queue length will then be the maximum of any macvlan connected to the same port. The actual used queue length for the port can be retrieved (read only) by the IFLA_MACVLAN_BC_QUEUE_LEN_USED parameter for verification. This will be followed up by a patch to iproute2 in order to adjust the parameter from userspace. Signed-off-by: Thomas Karlsson <thomas.karlsson@paneda.se> Link: https://lore.kernel.org/r/dd4673b2-7eab-edda-6815-85c67ce87f63@paneda.se Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-12-03 08:21:29 -08:00
Dan Carpenter	74a8c816fa	rtw88: debug: Fix uninitialized memory in debugfs code This code does not ensure that the whole buffer is initialized and none of the callers check for errors so potentially none of the buffer is initialized. Add a memset to eliminate this bug. Fixes: e3037485c68e ("rtw88: new Realtek 802.11ac driver") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/X8ilOfVz3pf0T5ec@mwanda	2020-12-03 18:00:45 +02:00
Alexei Starovoitov	97306be45f	Merge branch 'switch to memcg-based memory accounting' Roman Gushchin says: ==================== Currently bpf is using the memlock rlimit for the memory accounting. This approach has its downsides and over time has created a significant amount of problems: 1) The limit is per-user, but because most bpf operations are performed as root, the limit has a little value. 2) It's hard to come up with a specific maximum value. Especially because the counter is shared with non-bpf use cases (e.g. memlock()). Any specific value is either too low and creates false failures or is too high and useless. 3) Charging is not connected to the actual memory allocation. Bpf code should manually calculate the estimated cost and charge the counter, and then take care of uncharging, including all fail paths. It adds to the code complexity and makes it easy to leak a charge. 4) There is no simple way of getting the current value of the counter. We've used drgn for it, but it's far from being convenient. 5) Cryptic -EPERM is returned on exceeding the limit. Libbpf even had a function to "explain" this case for users. 6) rlimits are generally considered as (at least partially) obsolete. They do not provide a comprehensive system for the control of physical resources: memory, cpu, io etc. All resource control developments in the recent years were related to cgroups. In order to overcome these problems let's switch to the memory cgroup-based memory accounting of bpf objects. With the recent addition of the percpu memory accounting, now it's possible to provide a comprehensive accounting of the memory used by bpf programs and maps. This approach has the following advantages: 1) The limit is per-cgroup and hierarchical. It's way more flexible and allows a better control over memory usage by different workloads. 2) The actual memory consumption is taken into account. It happens automatically on the allocation time if __GFP_ACCOUNT flags is passed. Uncharging is also performed automatically on releasing the memory. So the code on the bpf side becomes simpler and safer. 3) There is a simple way to get the current value and statistics. Cgroup-based accounting adds new requirements: 1) The kernel config should have CONFIG_CGROUPS and CONFIG_MEMCG_KMEM enabled. These options are usually enabled, maybe excluding tiny builds for embedded devices. 2) The system should have a configured cgroup hierarchy, including reasonable memory limits and/or guarantees. Modern systems usually delegate this task to systemd or similar task managers. Without meeting these requirements there are no limits on how much memory bpf can use and a non-root user is able to hurt the system by allocating too much. But because per-user rlimits do not provide a functional system to protect and manage physical resources anyway, anyone who seriously depends on it, should use cgroups. When a bpf map is created, the memory cgroup of the process which creates the map is recorded. Subsequently all memory allocation related to the bpf map are charged to the same cgroup. It includes allocations made from interrupts and by any processes. Bpf program memory is charged to the memory cgroup of a process which loads the program. The patchset consists of the following parts: 1) 4 mm patches are required on the mm side, otherwise vmallocs cannot be mapped to userspace 2) memcg-based accounting for various bpf objects: progs and maps 3) removal of the rlimit-based accounting 4) removal of rlimit adjustments in userspace samples v9: - always charge the saved memory cgroup, by Daniel, Toke and Alexei - added bpf_map_kzalloc() - rebase and minor fixes v8: - extended the cover letter to be more clear on new requirements, by Daniel - an approximate value is provided by map memlock info, by Alexei v7: - introduced bpf_map_kmalloc_node() and bpf_map_alloc_percpu(), by Alexei - switched allocations made from an interrupt context to new helpers, by Daniel - rebase and minor fixes v6: - rebased to the latest version of the remote charging API - fixed signatures, added acks v5: - rebased to the latest version of the remote charging API - implemented kmem accounting from an interrupt context, by Shakeel - rebased to latest changes in mm allowed to map vmallocs to userspace - fixed a build issue in kselftests, by Alexei - fixed a use-after-free bug in bpf_map_free_deferred() - added bpf line info coverage, by Shakeel - split bpf map charging preparations into a separate patch v4: - covered allocations made from an interrupt context, by Daniel - added some clarifications to the cover letter v3: - droped the userspace part for further discussions/refinements, by Andrii and Song v2: - fixed build issue, caused by the remaining rlimit-based accounting for sockhash maps ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2020-12-02 18:46:19 -08:00
Roman Gushchin	5b0764b2d3	bpf: samples: Do not touch RLIMIT_MEMLOCK Since bpf is not using rlimit memlock for the memory accounting and control, do not change the limit in sample applications. Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20201201215900.3569844-35-guro@fb.com	2020-12-02 18:32:47 -08:00
Roman Gushchin	3ac1f01b43	bpf: Eliminate rlimit-based memory accounting for bpf progs Do not use rlimit-based memory accounting for bpf progs. It has been replaced with memcg-based memory accounting. Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20201201215900.3569844-34-guro@fb.com	2020-12-02 18:32:47 -08:00
Roman Gushchin	80ee81e040	bpf: Eliminate rlimit-based memory accounting infra for bpf maps Remove rlimit-based accounting infrastructure code, which is not used anymore. To provide a backward compatibility, use an approximation of the bpf map memory footprint as a "memlock" value, available to a user via map info. The approximation is based on the maximal number of elements and key and value sizes. Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20201201215900.3569844-33-guro@fb.com	2020-12-02 18:32:47 -08:00
Roman Gushchin	ab31be378a	bpf: Eliminate rlimit-based memory accounting for bpf local storage maps Do not use rlimit-based memory accounting for bpf local storage maps. It has been replaced with the memcg-based memory accounting. Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20201201215900.3569844-32-guro@fb.com	2020-12-02 18:32:47 -08:00
Roman Gushchin	819a4f3235	bpf: Eliminate rlimit-based memory accounting for xskmap maps Do not use rlimit-based memory accounting for xskmap maps. It has been replaced with the memcg-based memory accounting. Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20201201215900.3569844-31-guro@fb.com	2020-12-02 18:32:47 -08:00
Roman Gushchin	370868107b	bpf: Eliminate rlimit-based memory accounting for stackmap maps Do not use rlimit-based memory accounting for stackmap maps. It has been replaced with the memcg-based memory accounting. Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20201201215900.3569844-30-guro@fb.com	2020-12-02 18:32:47 -08:00
Roman Gushchin	0d2c4f9640	bpf: Eliminate rlimit-based memory accounting for sockmap and sockhash maps Do not use rlimit-based memory accounting for sockmap and sockhash maps. It has been replaced with the memcg-based memory accounting. Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20201201215900.3569844-29-guro@fb.com	2020-12-02 18:32:47 -08:00
Roman Gushchin	abbdd0813f	bpf: Eliminate rlimit-based memory accounting for bpf ringbuffer Do not use rlimit-based memory accounting for bpf ringbuffer. It has been replaced with the memcg-based memory accounting. bpf_ringbuf_alloc() can't return anything except ERR_PTR(-ENOMEM) and a valid pointer, so to simplify the code make it return NULL in the first case. This allows to drop a couple of lines in ringbuf_map_alloc() and also makes it look similar to other memory allocating function like kmalloc(). Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20201201215900.3569844-28-guro@fb.com	2020-12-02 18:32:47 -08:00
Roman Gushchin	db54330d3e	bpf: Eliminate rlimit-based memory accounting for reuseport_array maps Do not use rlimit-based memory accounting for reuseport_array maps. It has been replaced with the memcg-based memory accounting. Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20201201215900.3569844-27-guro@fb.com	2020-12-02 18:32:47 -08:00
Roman Gushchin	a37fb7ef24	bpf: Eliminate rlimit-based memory accounting for queue_stack_maps maps Do not use rlimit-based memory accounting for queue_stack maps. It has been replaced with the memcg-based memory accounting. Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20201201215900.3569844-26-guro@fb.com	2020-12-02 18:32:46 -08:00
Roman Gushchin	cbddcb574d	bpf: Eliminate rlimit-based memory accounting for lpm_trie maps Do not use rlimit-based memory accounting for lpm_trie maps. It has been replaced with the memcg-based memory accounting. Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20201201215900.3569844-25-guro@fb.com	2020-12-02 18:32:46 -08:00
Roman Gushchin	755e5d5536	bpf: Eliminate rlimit-based memory accounting for hashtab maps Do not use rlimit-based memory accounting for hashtab maps. It has been replaced with the memcg-based memory accounting. Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20201201215900.3569844-24-guro@fb.com	2020-12-02 18:32:46 -08:00
Roman Gushchin	844f157f6c	bpf: Eliminate rlimit-based memory accounting for devmap maps Do not use rlimit-based memory accounting for devmap maps. It has been replaced with the memcg-based memory accounting. Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20201201215900.3569844-23-guro@fb.com	2020-12-02 18:32:46 -08:00
Roman Gushchin	087b0d39fe	bpf: Eliminate rlimit-based memory accounting for cgroup storage maps Do not use rlimit-based memory accounting for cgroup storage maps. It has been replaced with the memcg-based memory accounting. Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20201201215900.3569844-22-guro@fb.com	2020-12-02 18:32:46 -08:00
Roman Gushchin	711cabaf14	bpf: Eliminate rlimit-based memory accounting for cpumap maps Do not use rlimit-based memory accounting for cpumap maps. It has been replaced with the memcg-based memory accounting. Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20201201215900.3569844-21-guro@fb.com	2020-12-02 18:32:46 -08:00
Roman Gushchin	f043733f31	bpf: Eliminate rlimit-based memory accounting for bpf_struct_ops maps Do not use rlimit-based memory accounting for bpf_struct_ops maps. It has been replaced with the memcg-based memory accounting. Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20201201215900.3569844-20-guro@fb.com	2020-12-02 18:32:46 -08:00
Roman Gushchin	1bc5975613	bpf: Eliminate rlimit-based memory accounting for arraymap maps Do not use rlimit-based memory accounting for arraymap maps. It has been replaced with the memcg-based memory accounting. Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20201201215900.3569844-19-guro@fb.com	2020-12-02 18:32:46 -08:00
Roman Gushchin	28e1dcdef0	bpf: Refine memcg-based memory accounting for xskmap maps Extend xskmap memory accounting to include the memory taken by the xsk_map_node structure. Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20201201215900.3569844-18-guro@fb.com	2020-12-02 18:32:46 -08:00
Roman Gushchin	7846dd9f83	bpf: Refine memcg-based memory accounting for sockmap and sockhash maps Include internal metadata into the memcg-based memory accounting. Also include the memory allocated on updating an element. Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20201201215900.3569844-17-guro@fb.com	2020-12-02 18:32:46 -08:00
Roman Gushchin	e9aae8beba	bpf: Memcg-based memory accounting for bpf local storage maps Account memory used by bpf local storage maps: per-socket, per-inode and per-task storages. Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20201201215900.3569844-16-guro@fb.com	2020-12-02 18:32:45 -08:00
Roman Gushchin	be4035c734	bpf: Memcg-based memory accounting for bpf ringbuffer Enable the memcg-based memory accounting for the memory used by the bpf ringbuffer. Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20201201215900.3569844-15-guro@fb.com	2020-12-02 18:32:45 -08:00
Roman Gushchin	353e7af4bf	bpf: Memcg-based memory accounting for lpm_trie maps Include lpm trie and lpm trie node objects into the memcg-based memory accounting. Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20201201215900.3569844-14-guro@fb.com	2020-12-02 18:32:45 -08:00
Roman Gushchin	881456811a	bpf: Refine memcg-based memory accounting for hashtab maps Include percpu objects and the size of map metadata into the accounting. Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20201201215900.3569844-13-guro@fb.com	2020-12-02 18:32:45 -08:00
Roman Gushchin	1440290adf	bpf: Refine memcg-based memory accounting for devmap maps Include map metadata and the node size (struct bpf_dtab_netdev) into the accounting. Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20201201215900.3569844-12-guro@fb.com	2020-12-02 18:32:45 -08:00
Roman Gushchin	3a61c7c58b	bpf: Memcg-based memory accounting for cgroup storage maps Account memory used by cgroup storage maps including metadata structures. Account the percpu memory for the percpu flavor of cgroup storage. Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20201201215900.3569844-11-guro@fb.com	2020-12-02 18:32:45 -08:00
Roman Gushchin	e88cc05b61	bpf: Refine memcg-based memory accounting for cpumap maps Include metadata and percpu data into the memcg-based memory accounting. Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20201201215900.3569844-10-guro@fb.com	2020-12-02 18:32:45 -08:00
Roman Gushchin	6d192c7938	bpf: Refine memcg-based memory accounting for arraymap maps Include percpu arrays and auxiliary data into the memcg-based memory accounting. Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20201201215900.3569844-9-guro@fb.com	2020-12-02 18:32:45 -08:00
Roman Gushchin	d5299b67dd	bpf: Memcg-based memory accounting for bpf maps This patch enables memcg-based memory accounting for memory allocated by __bpf_map_area_alloc(), which is used by many types of bpf maps for large initial memory allocations. Please note, that __bpf_map_area_alloc() should not be used outside of map creation paths without setting the active memory cgroup to the map's memory cgroup. Following patches in the series will refine the accounting for some of the map types. Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20201201215900.3569844-8-guro@fb.com	2020-12-02 18:32:45 -08:00
Roman Gushchin	48edc1f78a	bpf: Prepare for memcg-based memory accounting for bpf maps Bpf maps can be updated from an interrupt context and in such case there is no process which can be charged. It makes the memory accounting of bpf maps non-trivial. Fortunately, after commit 4127c6504f25 ("mm: kmem: enable kernel memcg accounting from interrupt contexts") and commit b87d8cefe43c ("mm, memcg: rework remote charging API to support nesting") it's finally possible. To make the ownership model simple and consistent, when the map is created, the memory cgroup of the current process is recorded. All subsequent allocations related to the bpf map are charged to the same memory cgroup. It includes allocations made by any processes (even if they do belong to a different cgroup) and from interrupts. This commit introduces 3 new helpers, which will be used by following commits to enable the accounting of bpf maps memory: - bpf_map_kmalloc_node() - bpf_map_kzalloc() - bpf_map_alloc_percpu() They are wrapping popular memory allocation functions. They set the active memory cgroup to the map's memory cgroup and add __GFP_ACCOUNT to the passed gfp flags. Then they call into the corresponding memory allocation function and restore the original active memory cgroup. These helpers are supposed to use everywhere except the map creation path. During the map creation when the map structure is allocated by itself, it cannot be passed to those helpers. In those cases default memory allocation function will be used with the __GFP_ACCOUNT flag. Signed-off-by: Roman Gushchin <guro@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20201201215900.3569844-7-guro@fb.com	2020-12-02 18:32:34 -08:00
Roman Gushchin	ddf8503c7c	bpf: Memcg-based memory accounting for bpf progs Include memory used by bpf programs into the memcg-based accounting. This includes the memory used by programs itself, auxiliary data, statistics and bpf line info. A memory cgroup containing the process which loads the program is getting charged. Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20201201215900.3569844-6-guro@fb.com	2020-12-02 18:28:06 -08:00
Roman Gushchin	18b2db3b03	mm: Convert page kmemcg type to a page memcg flag PageKmemcg flag is currently defined as a page type (like buddy, offline, table and guard). Semantically it means that the page was accounted as a kernel memory by the page allocator and has to be uncharged on the release. As a side effect of defining the flag as a page type, the accounted page can't be mapped to userspace (look at page_has_type() and comments above). In particular, this blocks the accounting of vmalloc-backed memory used by some bpf maps, because these maps do map the memory to userspace. One option is to fix it by complicating the access to page->mapcount, which provides some free bits for page->page_type. But it's way better to move this flag into page->memcg_data flags. Indeed, the flag makes no sense without enabled memory cgroups and memory cgroup pointer set in particular. This commit replaces PageKmemcg() and __SetPageKmemcg() with PageMemcgKmem() and an open-coded OR operation setting the memcg pointer with the MEMCG_DATA_KMEM bit. __ClearPageKmemcg() can be simple deleted, as the whole memcg_data is zeroed at once. As a bonus, on !CONFIG_MEMCG build the PageMemcgKmem() check will be compiled out. Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Shakeel Butt <shakeelb@google.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.com> Link: https://lkml.kernel.org/r/20201027001657.3398190-5-guro@fb.com Link: https://lore.kernel.org/bpf/20201201215900.3569844-5-guro@fb.com	2020-12-02 18:28:06 -08:00
Roman Gushchin	87944e2992	mm: Introduce page memcg flags The lowest bit in page->memcg_data is used to distinguish between struct memory_cgroup pointer and a pointer to a objcgs array. All checks and modifications of this bit are open-coded. Let's formalize it using page memcg flags, defined in enum page_memcg_data_flags. Additional flags might be added later. Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Shakeel Butt <shakeelb@google.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.com> Link: https://lkml.kernel.org/r/20201027001657.3398190-4-guro@fb.com Link: https://lore.kernel.org/bpf/20201201215900.3569844-4-guro@fb.com	2020-12-02 18:28:06 -08:00
Roman Gushchin	270c6a7146	mm: memcontrol/slab: Use helpers to access slab page's memcg_data To gather all direct accesses to struct page's memcg_data field in one place, let's introduce 3 new helpers to use in the slab accounting code: struct obj_cgroup *page_objcgs(struct page page); struct obj_cgroup *page_objcgs_check(struct page page); bool set_page_objcgs(struct page page, struct obj_cgroup *objcgs); They are similar to the corresponding API for generic pages, except that the setter can return false, indicating that the value has been already set from a different thread. Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Shakeel Butt <shakeelb@google.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Link: https://lkml.kernel.org/r/20201027001657.3398190-3-guro@fb.com Link: https://lore.kernel.org/bpf/20201201215900.3569844-3-guro@fb.com	2020-12-02 18:28:06 -08:00
Roman Gushchin	bcfe06bf26	mm: memcontrol: Use helpers to read page's memcg data Patch series "mm: allow mapping accounted kernel pages to userspace", v6. Currently a non-slab kernel page which has been charged to a memory cgroup can't be mapped to userspace. The underlying reason is simple: PageKmemcg flag is defined as a page type (like buddy, offline, etc), so it takes a bit from a page->mapped counter. Pages with a type set can't be mapped to userspace. But in general the kmemcg flag has nothing to do with mapping to userspace. It only means that the page has been accounted by the page allocator, so it has to be properly uncharged on release. Some bpf maps are mapping the vmalloc-based memory to userspace, and their memory can't be accounted because of this implementation detail. This patchset removes this limitation by moving the PageKmemcg flag into one of the free bits of the page->mem_cgroup pointer. Also it formalizes accesses to the page->mem_cgroup and page->obj_cgroups using new helpers, adds several checks and removes a couple of obsolete functions. As the result the code became more robust with fewer open-coded bit tricks. This patch (of 4): Currently there are many open-coded reads of the page->mem_cgroup pointer, as well as a couple of read helpers, which are barely used. It creates an obstacle on a way to reuse some bits of the pointer for storing additional bits of information. In fact, we already do this for slab pages, where the last bit indicates that a pointer has an attached vector of objcg pointers instead of a regular memcg pointer. This commits uses 2 existing helpers and introduces a new helper to converts all read sides to calls of these helpers: struct mem_cgroup page_memcg(struct page page); struct mem_cgroup page_memcg_rcu(struct page page); struct mem_cgroup page_memcg_check(struct page page); page_memcg_check() is intended to be used in cases when the page can be a slab page and have a memcg pointer pointing at objcg vector. It does check the lowest bit, and if set, returns NULL. page_memcg() contains a VM_BUG_ON_PAGE() check for the page not being a slab page. To make sure nobody uses a direct access, struct page's mem_cgroup/obj_cgroups is converted to unsigned long memcg_data. Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Shakeel Butt <shakeelb@google.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.com> Link: https://lkml.kernel.org/r/20201027001657.3398190-1-guro@fb.com Link: https://lkml.kernel.org/r/20201027001657.3398190-2-guro@fb.com Link: https://lore.kernel.org/bpf/20201201215900.3569844-2-guro@fb.com	2020-12-02 18:28:05 -08:00
Zhang Changzhong	832e09798c	vxlan: fix error return code in __vxlan_dev_create() Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Fixes: 0ce1822c2a08 ("vxlan: add adjacent link to limit depth level") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com> Link: https://lore.kernel.org/r/1606903122-2098-1-git-send-email-zhangchangzhong@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-12-02 18:04:02 -08:00
Zhang Changzhong	aba84871bd	net: pasemi: fix error return code in pasemi_mac_open() Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Fixes: 72b05b9940f0 ("pasemi_mac: RX/TX ring management cleanup") Fixes: 8d636d8bc5ff ("pasemi_mac: jumbo frame support") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com> Link: https://lore.kernel.org/r/1606903035-1838-1-git-send-email-zhangchangzhong@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-12-02 18:03:58 -08:00
Zhang Changzhong	ff9924897f	cxgb3: fix error return code in t3_sge_alloc_qset() Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Fixes: b1fb1f280d09 ("cxgb3 - Fix dma mapping error path") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com> Acked-by: Raju Rangoju <rajur@chelsio.com> Link: https://lore.kernel.org/r/1606902965-1646-1-git-send-email-zhangchangzhong@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-12-02 18:03:53 -08:00
Jonas Bonn	cec85994c6	bareudp: constify device_type declaration device_type may be declared as const. Signed-off-by: Jonas Bonn <jonas@norrbonn.se> Link: https://lore.kernel.org/r/20201202122324.564918-1-jonas@norrbonn.se Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-12-02 18:00:18 -08:00
Jakub Kicinski	db77471259	Merge branch 'nfc-s3fwrn5-support-a-uart-interface' Bongsu Jeon says: ==================== nfc: s3fwrn5: Support a UART interface S3FWRN82 is the Samsung's NFC chip that supports the UART communication. Before adding the UART driver module, I did refactoring the s3fwrn5_i2c module to reuse the common blocks. ==================== Link: https://lore.kernel.org/r/1606909661-3814-1-git-send-email-bongsu.jeon@samsung.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-12-02 17:55:27 -08:00
Bongsu Jeon	3f52c2cb7e	nfc: s3fwrn5: Support a UART interface Since S3FWRN82 NFC Chip, The UART interface can be used. S3FWRN82 uses NCI protocol and supports I2C and UART interface. Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Bongsu Jeon <bongsu.jeon@samsung.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-12-02 17:55:25 -08:00
Bongsu Jeon	b3799d592f	nfc: s3fwrn5: extract the common phy blocks Extract the common phy blocks to reuse it. The UART module will use the common blocks. Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Bongsu Jeon <bongsu.jeon@samsung.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-12-02 17:55:25 -08:00
Bongsu Jeon	337da14995	nfc: s3fwrn5: reduce the EN_WAIT_TIME The delay of 20ms is enough to enable and wake up the Samsung's nfc chip. Acked-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Bongsu Jeon <bongsu.jeon@samsung.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-12-02 17:55:25 -08:00

... 5 6 7 8 9 ...

970040 Commits