iv/linux

Author	SHA1	Message	Date
wenxu	10f4e76587	netfilter: nft_flow_offload: fix interaction with vrf slave device In the forward chain, the iif is changed from slave device to master vrf device. Thus, flow offload does not find a match on the lower slave device. This patch uses the cached route, ie. dst->dev, to update the iif and oif fields in the flow entry. After this patch, the following example works fine: # ip addr add dev eth0 1.1.1.1/24 # ip addr add dev eth1 10.0.0.1/24 # ip link add user1 type vrf table 1 # ip l set user1 up # ip l set dev eth0 master user1 # ip l set dev eth1 master user1 # nft add table firewall # nft add flowtable f fb1 { hook ingress priority 0 \; devices = { eth0, eth1 } \; } # nft add chain f ftb-all {type filter hook forward priority 0 \; policy accept \; } # nft add rule f ftb-all ct zone 1 ip protocol tcp flow offload @fb1 # nft add rule f ftb-all ct zone 1 ip protocol udp flow offload @fb1 Signed-off-by: wenxu <wenxu@ucloud.cn> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2019-01-11 00:55:37 +01:00
Shakeel Butt	e2c8d550a9	netfilter: ebtables: account ebt_table_info to kmemcg The [ip,ip6,arp]_tables use x_tables_info internally and the underlying memory is already accounted to kmemcg. Do the same for ebtables. The syzbot, by using setsockopt(EBT_SO_SET_ENTRIES), was able to OOM the whole system from a restricted memcg, a potential DoS. By accounting the ebt_table_info, the memory used for ebt_table_info can be contained within the memcg of the allocating process. However the lifetime of ebt_table_info is independent of the allocating process and is tied to the network namespace. So, the oom-killer will not be able to relieve the memory pressure due to ebt_table_info memory. The memory for ebt_table_info is allocated through vmalloc. Currently vmalloc does not handle the oom-killed allocating process correctly and one large allocation can bypass memcg limit enforcement. So, with this patch, at least the small allocations will be contained. For large allocations, we need to fix vmalloc. Reported-by: syzbot+7713f3aa67be76b1552c@syzkaller.appspotmail.com Signed-off-by: Shakeel Butt <shakeelb@google.com> Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2019-01-11 00:55:36 +01:00
wenxu	a799aea098	netfilter: nft_flow_offload: Fix reverse route lookup Using the following example: client 1.1.1.7 ---> 2.2.2.7 which dnat to 10.0.0.7 server The first reply packet (ie. syn+ack) uses an incorrect destination address for the reverse route lookup since it uses: daddr = ct->tuplehash[!dir].tuple.dst.u3.ip; which is 2.2.2.7 in the scenario that is described above, while this should be: daddr = ct->tuplehash[dir].tuple.src.u3.ip; that is 10.0.0.7. Signed-off-by: wenxu <wenxu@ucloud.cn> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2019-01-09 23:25:02 +01:00
Pablo Neira Ayuso	715849ab31	netfilter: nf_tables: selective rule dump needs table to be specified Table needs to be specified for selective rule dumps per chain. Fixes: 241faeceb849c ("netfilter: nf_tables: Speed up selective rule dumps") Reported-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2019-01-08 23:31:18 +01:00
Taehee Yoo	b91d903688	netfilter: nf_tables: fix leaking object reference count There is no code that decreases the reference count of stateful objects in the error path of nft_add_set_elem(). This causes a leak of the reference count of stateful objects. Test commands: $ nft add table ip filter $ nft add counter ip filter c1 $ nft add map ip filter m1 { type ipv4_addr : counter \;} $ nft add element ip filter m1 { 1 : c1 } $ nft add element ip filter m1 { 1 : c1 } $ nft delete element ip filter m1 { 1 } $ nft delete counter ip filter c1 Result: Error: Could not process rule: Device or resource busy delete counter ip filter c1 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ At the second 'nft add element ip filter m1 { 1 : c1 }', the reference count of 'c1' is increased and then the element is inserted into 'm1'. But 'm1' already has the same element, so the insertion returns -EEXIST, and the error path does not decrease the reference count of 'c1'. Because of this leaked reference, 'c1' can't be removed by 'nft delete counter ip filter c1'. Fixes: 8aeff920dcc9 ("netfilter: nf_tables: add stateful object reference to set elements") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2019-01-08 23:31:17 +01:00
Phil Sutter	310529e663	netfilter: nf_tables: Fix for endless loop when dumping ruleset __nf_tables_dump_rules() stores the current idx value into cb->args[0] before returning to caller. With multiple chains present, cb->args[0] is therefore updated after each chain's rules have been traversed. This though causes the final nf_tables_dump_rules() run (which should return an skb->len of zero since no rules are left to dump) to continue dumping rules for each but the first chain. Fix this by moving the cb->args[0] update to nf_tables_dump_rules(). With no final action to be performed anymore in __nf_tables_dump_rules(), drop 'out_unfinished' jump label and 'rc' variable - instead return the appropriate value directly. Fixes: 241faeceb849c ("netfilter: nf_tables: Speed up selective rule dumps") Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2019-01-08 23:01:13 +01:00
Bryan Whitehead	a0071840d2	lan743x: Remove phy_read from link status change function It has been noticed that some phys do not have the registers required by the previous implementation. To fix this, instead of using phy_read, the required information is extracted from the phy_device structure. Fixes: 23f0703c125b ("lan743x: Add main source files for new lan743x driver") Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-08 16:26:12 -05:00
Eugene Syromiatnikov	b7ea4894aa	ptp: uapi: change _IOW to _IOWR in PTP_SYS_OFFSET_EXTENDED definition The ioctl command is read/write (or just read, if the fact that user space writes the n_samples field is ignored). Signed-off-by: Eugene Syromiatnikov <esyr@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-08 16:22:56 -05:00
Eugene Syromiatnikov	895ac1376d	ptp: check that rsv field is zero in struct ptp_sys_offset_extended Otherwise it is impossible to use it for something else, as it will break userspace that puts garbage there. The same check should be done in other structures, but the fact that data in reserved fields is ignored is already part of the kernel ABI. Signed-off-by: Eugene Syromiatnikov <esyr@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-08 16:22:56 -05:00
David S. Miller	977e4899c9	Merge ra.kernel.org:/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2019-01-08 The following pull-request contains BPF updates for your net tree. The main changes are: 1) Fix BSD'ism in sendmsg(2) to rewrite unspecified IPv6 dst for unconnected UDP sockets with [::1] _after_ cgroup BPF invocation, from Andrey. 2) Follow-up fix to the speculation fix where we need to reject a corner case for sanitation when ptr and scalars are mixed in the same alu op. Also, some unrelated minor doc fixes, from Daniel. 3) Fix BPF kselftest's incorrect uses of create_and_get_cgroup() by not assuming fd of zero value to be the result of an error case, from Stanislav. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-07 22:49:35 -05:00
Alexei Starovoitov	2dc0f02da1	Merge branch 'bpf-doc-updates' Daniel Borkmann says: ==================== Two trivial doc follow-ups to i) remove deprecated kern_version mentioning in the design qa and ii) to mention stand-alone build and license of libbpf. Thanks! ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-01-07 15:52:00 -08:00
Daniel Borkmann	80f21ff987	bpf, doc: add note for libbpf's stand-alone build Given this came up a couple of times, add a note to libbpf's readme about the semi-automated mirror for a stand-alone build which is officially managed by BPF folks. While at it, also explicitly state the libbpf license in the readme file. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-01-07 15:52:00 -08:00
Daniel Borkmann	a769fa7208	bpf, doc: update design qa to reflect kern_version requirement Update the bpf_design_QA.rst to also reflect recent changes in 6c4fc209fcf9 ("bpf: remove useless version check for prog load"). Suggested-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-01-07 15:52:00 -08:00
Stanislav Fomichev	a8911d6d58	selftests/bpf: fix incorrect users of create_and_get_cgroup We have some tests that assume create_and_get_cgroup returns -1 on error which is incorrect (it returns 0 on error). Since fd might be zero in general case, change create_and_get_cgroup to return -1 on error and fix the users that assume 0 on error. Fixes: f269099a7e7a ("tools/bpf: add a selftest for bpf_get_current_cgroup_id() helper") Fixes: 7d2c6cfc5411 ("bpf: use --cgroup in test_suite if supplied") v2: - instead of fixing the uses that assume -1 on error, convert the users that assume 0 on error (fd might be zero in general case) Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-01-07 13:15:55 -08:00
Cong Wang	26d92e951f	smc: move unhash as early as possible in smc_release() In smc_release() we release smc->clcsock before unhashing the smc sock, but a parallel smc_diag_dump() may still be reading smc->clcsock; this could cause a use-after-free, as reported by syzbot. Reported-and-tested-by: syzbot+fbd1e5476e4c94c7b34e@syzkaller.appspotmail.com Fixes: 51f1de79ad8e ("net/smc: replace sock_put worker by socket refcounting") Cc: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Reported-by: syzbot+0bf2e01269f1274b4b03@syzkaller.appspotmail.com Reported-by: syzbot+e3132895630f957306bc@syzkaller.appspotmail.com Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-07 14:40:27 -05:00
Jason Gunthorpe	7acf8b36a2	phy: ti: Fix compilation failures without REGMAP This driver requires regmap or the compile fails: drivers/phy/ti/phy-gmii-sel.c:43:27: error: array type has incomplete element type ‘struct reg_field’ const struct reg_field (*regfields)[PHY_GMII_SEL_LAST]; Add it to kconfig. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-07 14:21:37 -05:00
JianJhen Chen	4c84edc11b	net: bridge: fix a bug on using a neighbour cache entry without checking its state When handling DNAT'ed packets on a bridge device, the neighbour cache entry from lookup was used without checking its state. It means that a cache entry in the NUD_STALE state will be used directly instead of entering the NUD_DELAY state to confirm the reachability of the neighbor. This problem becomes worse after commit 2724680bceee ("neigh: Keep neighbour cache entries if number of them is small enough."), since all neighbour cache entries in the NUD_STALE state will be kept in the neighbour table as long as the number of cache entries does not exceed the value specified in gc_thresh1. This commit validates the state of a neighbour cache entry before using the entry. Signed-off-by: JianJhen Chen <kchen@synology.com> Reviewed-by: JinLin Chen <jlchen@synology.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-07 12:09:02 -05:00
Gustavo A. R. Silva	f87d8ad923	tipc: fix memory leak in tipc_nl_compat_publ_dump There is a memory leak in case genlmsg_put fails. Fix this by freeing args before return. Addresses-Coverity-ID: 1476406 ("Resource leak") Fixes: 46273cf7e009 ("tipc: fix a missing check of genlmsg_put") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-07 11:42:08 -05:00
Bjørn Mork	a29c3c09ba	cdc_ether: trivial whitespace readability fix This function is unreadable enough without indenting mismatches and unnecessary line breaks. Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-07 11:33:18 -05:00
Jacob Wen	eeb2c4fb6a	rds: use DIV_ROUND_UP instead of ceil Yes indeed, DIV_ROUND_UP is in kernel.h. Signed-off-by: Jacob Wen <jian.w.wen@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-07 07:22:36 -08:00
Heiner Kallweit	10262b0b53	r8169: don't try to read counters if chip is in a PCI power-save state Avoid log spam caused by trying to read counters from the chip whilst it is in a PCI power-save state. Reference: https://bugzilla.kernel.org/show_bug.cgi?id=107421 Fixes: 1ef7286e7f36 ("r8169: Dereference MMIO address immediately before use") Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-07 07:19:25 -08:00
Oliver Hartkopp	0aaa81377c	can: gw: ensure DLC boundaries after CAN frame modification Muyu Yu provided a POC where user root with CAP_NET_ADMIN can create a CAN frame modification rule that makes the data length code a higher value than the available CAN frame data size. In combination with a configured checksum calculation where the result is stored relative to the end of the data (e.g. cgw_csum_xor_rel) the tail of the skb (e.g. frag_list pointer in skb_shared_info) can be rewritten, which can ultimately cause a system crash. Michal Kubecek suggested dropping frames that have a DLC exceeding the available space after the modification process and provided a patch that can handle CAN FD frames too. Within this patch we also limit the length for the checksum calculations to the maximum of Classic CAN data length (8). CAN frames that are dropped by these additional checks are counted with the CGW_DELETED counter which indicates misconfigurations in can-gw rules. This fixes CVE-2019-3701. Reported-by: Muyu Yu <ieatmuttonchuan@gmail.com> Reported-by: Marcus Meissner <meissner@suse.de> Suggested-by: Michal Kubecek <mkubecek@suse.cz> Tested-by: Muyu Yu <ieatmuttonchuan@gmail.com> Tested-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Cc: linux-stable <stable@vger.kernel.org> # >= v3.2 Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-07 05:17:51 -08:00
Stephen Warren	01cd364a15	net/mlx4: replace pci_{,un}map_sg with dma_{,un}map_sg pci_{,un}map_sg are deprecated and replaced by dma_{,un}map_sg. This is especially relevant since the rest of the driver uses the DMA API. Fix the driver to use the replacement APIs. Signed-off-by: Stephen Warren <swarren@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-07 05:14:17 -08:00
Stephen Warren	f65e192af3	net/mlx4: Get rid of page operation after dma_alloc_coherent This patch solves a crash at the time of mlx4 driver unload or system shutdown. The crash occurs because dma_alloc_coherent() returns one value in mlx4_alloc_icm_coherent(), but a different value is passed to dma_free_coherent() in mlx4_free_icm_coherent(). In turn this is because when allocated, that pointer is passed to sg_set_buf() to record it, then when freed it is re-calculated by calling lowmem_page_address(sg_page()) which returns a different value. Solve this by recording the value that dma_alloc_coherent() returns, and passing this to dma_free_coherent(). This patch is roughly equivalent to commit 378efe798ecf ("RDMA/hns: Get rid of page operation after dma_alloc_coherent"). Based-on-code-from: Christoph Hellwig <hch@lst.de> Signed-off-by: Stephen Warren <swarren@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-07 05:14:17 -08:00
Alexei Starovoitov	97274b6126	Merge branch 'reject-ptr-scalar-mix' Daniel Borkmann says: ==================== Follow-up fix to 979d63d50c0c ("bpf: prevent out of bounds speculation on pointer arithmetic") in order to reject a corner case for sanitation when ptr / scalars are mixed in the same alu op. ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-01-05 21:32:39 -08:00
Daniel Borkmann	1cbbcfbbd5	bpf: add various test cases for alu op on mixed dst register types Add a couple of test_verifier tests to check sanitation of alu op insn with pointer and scalar type coming from different paths. This also includes BPF insns of the test reproducer provided by Jann Horn. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-01-05 21:32:38 -08:00
Daniel Borkmann	d3bd7413e0	bpf: fix sanitation of alu op with pointer / scalar type from different paths While 979d63d50c0c ("bpf: prevent out of bounds speculation on pointer arithmetic") took care of rejecting alu op on pointer when e.g. pointer came from two different map values with different map properties such as value size, Jann reported that a case was not covered yet when a given alu op is used in both "ptr_reg += reg" and "numeric_reg += reg" from different branches where we would incorrectly try to sanitize based on the pointer's limit. Catch this corner case and reject the program instead. Fixes: 979d63d50c0c ("bpf: prevent out of bounds speculation on pointer arithmetic") Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-01-05 21:32:38 -08:00
David Ahern	d4a7e9bb74	ipv6: Take rcu_read_lock in __inet6_bind for mapped addresses I realized the last patch calls dev_get_by_index_rcu in a branch not holding the rcu lock. Add the calls to rcu_read_lock and rcu_read_unlock. Fixes: ec90ad334986 ("ipv6: Consider sk_bound_dev_if when binding a socket to a v4 mapped address") Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-05 14:17:07 -08:00
Alexei Starovoitov	466f89e9ec	Merge branch 'udpv6_sendmsg-addr_any-fix' Andrey Ignatov says: ==================== The patch set fixes BSD'ism in sys_sendmsg to rewrite unspecified destination IPv6 for unconnected UDP sockets in sys_sendmsg with [::1] in case when either CONFIG_CGROUP_BPF is enabled or when sys_sendmsg BPF hook sets destination IPv6 to [::]. Patch 1 is the fix and provides more details. Patch 2 adds two test cases to verify the fix. v1->v2: * Fix compile error in patch 1. ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-01-04 20:23:34 -08:00
Andrey Ignatov	976b4f3a46	selftests/bpf: Test [::] -> [::1] rewrite in sys_sendmsg in test_sock_addr Test that sys_sendmsg BPF hook doesn't break sys_sendmsg behaviour to rewrite destination IPv6 = [::] with [::1] (BSD'ism). Two test cases are added: 1) User passes dst IPv6 = [::] and BPF_CGROUP_UDP6_SENDMSG program doesn't touch it. 2) User passes dst IPv6 != [::], but BPF_CGROUP_UDP6_SENDMSG program rewrites it with [::]. In both cases [::1] is used by sys_sendmsg code eventually and datagram is sent successfully for unconnected UDP socket. Example of relevant output: Test case: sendmsg6: set dst IP = [::] (BSD'ism) .. [PASS] Test case: sendmsg6: preserve dst IP = [::] (BSD'ism) .. [PASS] Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-01-04 20:23:33 -08:00
Andrey Ignatov	e8e3698408	bpf: Fix [::] -> [::1] rewrite in sys_sendmsg sys_sendmsg has supported unspecified destination IPv6 (wildcard) for unconnected UDP sockets since 876c7f41. When [::] is passed by user as destination, sys_sendmsg rewrites it with [::1] to be consistent with BSD (see "BSD'ism" comment in the code). This didn't work when cgroup-bpf was enabled, though, since the [::] -> [::1] rewrite happened before control was passed to the cgroup-bpf block, where fl6.daddr was updated with the user-supplied sockaddr_in6.sin6_addr (which might or might not be changed by a BPF program). So if the user passed [::] as dst IPv6, it was first rewritten with [::1] by the original code from 876c7f41, but then rewritten back to [::] by the cgroup-bpf block. It happened even when BPF_CGROUP_UDP6_SENDMSG program was not present (CONFIG_CGROUP_BPF=y was enough). The fix is to apply BSD'ism after cgroup-bpf block so that [::] is replaced with [::1] no matter where it came from: passed by user to sys_sendmsg or set by BPF_CGROUP_UDP6_SENDMSG program. Fixes: 1cedee13d25a ("bpf: Hooks for sys_sendmsg") Reported-by: Nitin Rawat <nitin.rawat@intel.com> Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-01-04 20:23:33 -08:00
David Ahern	ec90ad3349	ipv6: Consider sk_bound_dev_if when binding a socket to a v4 mapped address Similar to c5ee066333eb ("ipv6: Consider sk_bound_dev_if when binding a socket to an address"), binding a socket to v4 mapped addresses needs to consider if the socket is bound to a device. This problem also exists from the beginning of git history. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-04 17:18:58 -08:00
Jeff Kirsher	ae84e4a8eb	ixgbe: fix Kconfig when driver is not a module The new ability added to the driver to use mii_bus to handle MII related ioctls is causing compile issues when the driver is compiled into the kernel (i.e. not a module). The problem was in selecting MDIO_DEVICE instead of the preferred PHYLIB Kconfig option. The reason being that MDIO_DEVICE had a dependency on PHYLIB and would be compiled as a module when PHYLIB was a module, no matter whether ixgbe was compiled into the kernel. CC: Dave Jones <davej@codemonkey.org.uk> CC: Steve Douthit <stephend@silicom-usa.com> CC: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Reviewed-by: Stephen Douthit <stephend@silicom-usa.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-04 14:02:16 -08:00
Eric Dumazet	8d93367045	ipv6: make icmp6_send() robust against null skb->dev syzbot was able to crash one host with the following stack trace: kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] PREEMPT SMP KASAN CPU: 0 PID: 8625 Comm: syz-executor4 Not tainted 4.20.0+ #8 RIP: 0010:dev_net include/linux/netdevice.h:2169 [inline] RIP: 0010:icmp6_send+0x116/0x2d30 net/ipv6/icmp.c:426 icmpv6_send smack_socket_sock_rcv_skb security_sock_rcv_skb sk_filter_trim_cap __sk_receive_skb dccp_v6_do_rcv release_sock This is because an RX packet found the socket owned by the user and was stored into the socket backlog. Before leaving the RCU protected section, skb->dev was cleared in __sk_receive_skb(). When the socket backlog was finally handled at release_sock() time, the skb was fed to smack_socket_sock_rcv_skb() and then icmp6_send(). We could fix the bug in smack_socket_sock_rcv_skb(), or simply make icmp6_send() more robust against such a possibility. In the future we might provide the net pointer to icmp6_send() instead of inferring it. Fixes: d66a8acbda92 ("Smack: Inform peer that IPv6 traffic has been blocked") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Piotr Sawicki <p.sawicki2@partner.samsung.com> Cc: Casey Schaufler <casey@schaufler-ca.com> Reported-by: syzbot <syzkaller@googlegroups.com> Acked-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-04 13:40:03 -08:00
Peter Oskolkov	3271a48218	selftests: net: fix/improve ip_defrag selftest Commit ade446403bfb ("net: ipv4: do not handle duplicate fragments as overlapping") changed IPv4 defragmentation so that duplicate fragments, as well as _some_ fragments completely covered by previously delivered fragments, do not lead to the whole frag queue being discarded. This makes the existing ip_defrag selftest flaky. This patch * makes sure that negative IPv4 defrag tests generate truly overlapping fragments that trigger defrag queue drops; * tests that duplicate IPv4 fragments do not trigger defrag queue drops; * makes a couple of minor tweaks to the test aimed at increasing its code coverage and reducing its flakiness. Signed-off-by: Peter Oskolkov <posk@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-04 13:38:39 -08:00
Daniele Palmas	f87118d576	qmi_wwan: add MTU default to qmap network interface This patch adds MTU default value to qmap network interface in order to avoid "RTNETLINK answers: No buffer space available" error when setting an ipv6 address. Signed-off-by: Daniele Palmas <dnlplm@gmail.com> Acked-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-04 13:35:42 -08:00
David S. Miller	75e7fb0a87	Merge branch 'hns-fixes' Huazhong Tan says: ==================== net: hns: Bugfixes for HNS driver This patchset includes bugfixes for the HNS ethernet controller driver. Every patch is independent. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-04 13:33:57 -08:00
Yonglong Liu	bb989501ab	net: hns: Fix use after free identified by SLUB debug When SLUB debug is enabled and the hns_enet_drv module is then removed, SLUB debug identifies a use-after-free bug: [134.189505] Unable to handle kernel paging request at virtual address 006b6b6b6b6b6b6b [134.197553] Mem abort info: [134.200381] ESR = 0x96000004 [134.203487] Exception class = DABT (current EL), IL = 32 bits [134.209497] SET = 0, FnV = 0 [134.212596] EA = 0, S1PTW = 0 [134.215777] Data abort info: [134.218701] ISV = 0, ISS = 0x00000004 [134.222596] CM = 0, WnR = 0 [134.225606] [006b6b6b6b6b6b6b] address between user and kernel address ranges [134.232851] Internal error: Oops: 96000004 [#1] SMP [134.237798] CPU: 21 PID: 27834 Comm: rmmod Kdump: loaded Tainted: G OE 4.19.5-1.2.34.aarch64 #1 [134.247856] Hardware name: Huawei TaiShan 2280 /BC11SPCD, BIOS 1.58 10/24/2018 [134.255181] pstate: 20000005 (nzCv daif -PAN -UAO) [134.260044] pc : hns_ae_put_handle+0x38/0x60 [134.264372] lr : hns_ae_put_handle+0x24/0x60 [134.268700] sp : ffff00001be93c50 [134.272054] x29: ffff00001be93c50 x28: ffff802faaec8040 [134.277442] x27: 0000000000000000 x26: 0000000000000000 [134.282830] x25: 0000000056000000 x24: 0000000000000015 [134.288284] x23: ffff0000096fe098 x22: ffff000001050070 [134.293671] x21: ffff801fb3c044a0 x20: ffff80afb75ec098 [134.303287] x19: ffff80afb75ec098 x18: 0000000000000000 [134.312945] x17: 0000000000000000 x16: 0000000000000000 [134.322517] x15: 0000000000000002 x14: 0000000000000000 [134.332030] x13: dead000000000100 x12: ffff7e02bea3c988 [134.341487] x11: ffff80affbee9e68 x10: 0000000000000000 [134.351033] x9 : 6fffff8000008101 x8 : 0000000000000000 [134.360569] x7 : dead000000000100 x6 : ffff000009579748 [134.370059] x5 : 0000000000210d00 x4 : 0000000000000000 [134.379550] x3 : 0000000000000001 x2 : 0000000000000000 [134.388813] x1 : 6b6b6b6b6b6b6b6b x0 : 0000000000000000 [134.397993] Process rmmod (pid: 27834, stack limit = 0x00000000d474b7fd) [134.408498] Call trace: [134.414611] hns_ae_put_handle+0x38/0x60 [134.422208] hnae_put_handle+0xd4/0x108 [134.429563] hns_nic_dev_remove+0x60/0xc0 [hns_enet_drv] [134.438342] platform_drv_remove+0x2c/0x70 [134.445958] device_release_driver_internal+0x174/0x208 [134.454810] driver_detach+0x70/0xd8 [134.461913] bus_remove_driver+0x64/0xe8 [134.469396] driver_unregister+0x34/0x60 [134.476822] platform_driver_unregister+0x20/0x30 [134.485130] hns_nic_dev_driver_exit+0x14/0x6e4 [hns_enet_drv] [134.494634] __arm64_sys_delete_module+0x238/0x290 struct hnae_handle is a member of struct hnae_vf_cb, so using hnae_handle after vf_cb has been freed causes a use-after-free panic. This patch frees vf_cb only after hnae_handle is no longer used. Signed-off-by: Yonglong Liu <liuyonglong@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-04 13:33:57 -08:00
Yonglong Liu	c77804be53	net: hns: Fix WARNING when hns modules installed Commit 308c6cafde01 ("net: hns: All ports can not work when insmod hns ko after rmmod.") added phy_stop to hns_nic_init_phy(). On the "net" branch this works, but on the "net-next" branch it causes a WARNING when the hns modules are loaded; see commit 2b3e88ea6528 ("net: phy: improve phy state checking"): [10.092168] ------------[ cut here ]------------ [10.092171] called from state READY [10.092189] WARNING: CPU: 4 PID: 1 at ../drivers/net/phy/phy.c:854 phy_stop+0x90/0xb0 [10.092192] Modules linked in: [10.092197] CPU: 4 PID:1 Comm:swapper/0 Not tainted 4.20.0-rc7-next-20181220 #1 [10.092200] Hardware name: Huawei TaiShan 2280 /D05, BIOS Hisilicon D05 UEFI 16.12 Release 05/15/2017 [10.092202] pstate: 60000005 (nZCv daif -PAN -UAO) [10.092205] pc : phy_stop+0x90/0xb0 [10.092208] lr : phy_stop+0x90/0xb0 [10.092209] sp : ffff00001159ba90 [10.092212] x29: ffff00001159ba90 x28: 0000000000000007 [10.092215] x27: ffff000011180068 x26: ffff0000110a5620 [10.092218] x25: ffff0000113b6000 x24: ffff842f96dac000 [10.092221] x23: 0000000000000000 x22: 0000000000000000 [10.092223] x21: ffff841fb8425e18 x20: ffff801fb3a56438 [10.092226] x19: ffff801fb3a56000 x18: ffffffffffffffff [10.092228] x17: 0000000000000000 x16: 0000000000000000 [10.092231] x15: ffff00001122d6c8 x14: ffff00009159b7b7 [10.092234] x13: ffff00001159b7c5 x12: ffff000011245000 [10.092236] x11: 0000000005f5e0ff x10: ffff00001159b750 [10.092239] x9 : 00000000ffffffd0 x8 : 0000000000000465 [10.092242] x7 : ffff0000112457f8 x6 : ffff0000113bd7ce [10.092245] x5 : 0000000000000000 x4 : 0000000000000000 [10.092247] x3 : 00000000ffffffff x2 : ffff000011245828 [10.092250] x1 : 4b5860bd05871300 x0 : 0000000000000000 [10.092253] Call trace: [10.092255] phy_stop+0x90/0xb0 [10.092260] hns_nic_init_phy+0xf8/0x110 [10.092262] hns_nic_try_get_ae+0x4c/0x3b0 [10.092264] hns_nic_dev_probe+0x1fc/0x480 [10.092268] platform_drv_probe+0x50/0xa0 [10.092271] really_probe+0x1f4/0x298 [10.092273] driver_probe_device+0x58/0x108 [10.092275] __driver_attach+0xdc/0xe0 [10.092278] bus_for_each_dev+0x74/0xc8 [10.092280] driver_attach+0x20/0x28 [10.092283] bus_add_driver+0x1b8/0x228 [10.092285] driver_register+0x60/0x110 [10.092288] __platform_driver_register+0x40/0x48 [10.092292] hns_nic_dev_driver_init+0x18/0x20 [10.092296] do_one_initcall+0x5c/0x180 [10.092299] kernel_init_freeable+0x198/0x240 [10.092303] kernel_init+0x10/0x108 [10.092306] ret_from_fork+0x10/0x18 [10.092308] ---[ end trace 1396dd0278e397eb ]--- This WARNING occurs because phy_stop is called before phy_start. The root cause of the problem addressed by commit 308c6cafde01 is that, in hns_nic_init_phy, the flag phydev->supported is changed after phy_connect_direct. phydev->supported is 0x6ff when the hns modules are loaded, so the Fiber Port power is not changed (see marvell.c) and stays on, which is the default. The flag is then changed to 0x6f, so the Fiber Port power is switched off when the hns modules are removed. When the hns modules are installed again, phydev->supported has its default value 0x6ff, so the Fiber Port power (now off) is not changed, and the MAC link does not come up. The solution is to change the phy flags before phy_connect_direct. Fixes: 308c6cafde01 ("net: hns: All ports can not work when insmod hns ko after rmmod.") Signed-off-by: Yonglong Liu <liuyonglong@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-04 13:33:57 -08:00
Linus Walleij	cff1e01f16	net: dsa: mt7530: Drop unused GPIO include This driver uses GPIO descriptors only, <linux/of_gpio.h> is not used so drop the include. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-04 13:07:23 -08:00
David S. Miller	0c06a09197	Merge branch 'GUE-error-recursion' Stefano Brivio says: ==================== Fix two further potential unbounded recursions in GUE error handlers Patch 1/2 takes care of preventing the issue fixed by commit 11789039da53 ("fou: Prevent unbounded recursion in GUE error handler") also with UDP-Lite payloads -- I just realised this might happen from a syzbot report. Patch 2/2 fixes the issue for both UDP and UDP-Lite on IPv6, which I also forgot to deal with in that same commit. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-04 13:06:07 -08:00
Stefano Brivio	44039e0017	fou6: Prevent unbounded recursion in GUE error handler I forgot to deal with IPv6 in commit 11789039da53 ("fou: Prevent unbounded recursion in GUE error handler"). Now syzbot reported what might be the same type of issue, caused by gue6_err(), that is, handling exceptions for direct UDP encapsulation in GUE (UDP-in-UDP) leads to unbounded recursion in the GUE exception handler. As it probably doesn't make sense to set up GUE this way, and it's currently not even possible to configure this, skip exception handling for UDP (or UDP-Lite) packets encapsulated in UDP (or UDP-Lite) packets with GUE on IPv6. Reported-by: syzbot+4ad25edc7a33e4ab91e0@syzkaller.appspotmail.com Reported-by: Willem de Bruijn <willemdebruijn.kernel@gmail.com> Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Fixes: b8a51b38e4d4 ("fou, fou6: ICMP error handlers for FoU and GUE") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-04 13:06:07 -08:00
Stefano Brivio	bc6e019b6e	fou: Prevent unbounded recursion in GUE error handler also with UDP-Lite In commit 11789039da53 ("fou: Prevent unbounded recursion in GUE error handler"), I didn't take care of the case where UDP-Lite is encapsulated into UDP or UDP-Lite with GUE. From a syzbot report about a possibly similar issue with GUE on IPv6, I just realised the same thing might happen with a UDP-Lite inner payload. Also skip exception handling for inner UDP-Lite protocol. Fixes: 11789039da53 ("fou: Prevent unbounded recursion in GUE error handler") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-04 13:06:07 -08:00
Yi-Hung Wei	41e4e2cd75	openvswitch: Fix IPv6 later frags parsing The previous commit fa642f08839b ("openvswitch: Derive IP protocol number for IPv6 later frags") introduces IP protocol number parsing for IPv6 later frags that can mess up the network header length calculation logic, i.e. nh_len < 0. However, the network header length calculation is mainly for deriving the transport layer header in the key extraction process which the later fragment does not apply. Therefore, this commit skips the network header length calculation to fix the issue. Reported-by: Chris Mi <chrism@mellanox.com> Reported-by: Greg Rose <gvrose8192@gmail.com> Fixes: fa642f08839b ("openvswitch: Derive IP protocol number for IPv6 later frags") Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-04 13:00:02 -08:00
Claudiu Beznea	ba3e1847d6	net: macb: remove unnecessary code Commit 653e92a9175e ("net: macb: add support for padding and fcs computation") introduced a bug fixed by commit 899ecaedd155 ("net: ethernet: cadence: fix socket buffer corruption problem"). Code removed in this patch is not reachable at all so remove it. Fixes: 653e92a9175e ("net: macb: add support for padding and fcs computation") Cc: Tristram Ha <Tristram.Ha@microchip.com> Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-04 12:59:09 -08:00
Linus Walleij	a09b42ba1a	net: dsa: microchip: Drop unused GPIO includes This driver does not use the old GPIO includes so drop them. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-04 12:58:25 -08:00
David S. Miller	ebdefe4656	Merge branch 'qed-fixes' Denis Bolotin says: ==================== qed: Misc fixes in qed This patch series fixes 2 potential bugs in qed. Please consider applying to net. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-04 12:57:31 -08:00
Denis Bolotin	46721c3d9e	qed: Fix qed_ll2_post_rx_buffer_notify_fw() by adding a write memory barrier Make sure chain element is updated before ringing the doorbell. Signed-off-by: Denis Bolotin <dbolotin@marvell.com> Signed-off-by: Ariel Elior <aelior@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-04 12:57:30 -08:00
Denis Bolotin	2d533a9287	qed: Fix qed_chain_set_prod() for PBL chains with non power of 2 page count In PBL chains with non power of 2 page count, the producer is not at the beginning of the chain when index is 0 after a wrap. Therefore, after the producer index wrap around, page index should be calculated more carefully. Signed-off-by: Denis Bolotin <dbolotin@marvell.com> Signed-off-by: Ariel Elior <aelior@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-04 12:57:30 -08:00
David Rientjes	f8c468e853	net, skbuff: do not prefer skb allocation fails early Commit dcda9b04713c ("mm, tree wide: replace __GFP_REPEAT by __GFP_RETRY_MAYFAIL with more useful semantic") replaced __GFP_REPEAT in alloc_skb_with_frags() with __GFP_RETRY_MAYFAIL when the allocation may directly reclaim. The previous behavior would require reclaim up to 1 << order pages for skb aligned header_len of order > PAGE_ALLOC_COSTLY_ORDER before failing, otherwise the allocations in alloc_skb() would loop in the page allocator looking for memory. __GFP_RETRY_MAYFAIL makes both allocations failable under memory pressure, including for the HEAD allocation. This can cause, among many other things, write() to fail with ENOTCONN during RPC when under memory pressure. These allocations should succeed as they did previous to dcda9b04713c even if it requires calling the oom killer and additional looping in the page allocator to find memory. There is no way to specify the previous behavior of __GFP_REPEAT, but it's unlikely to be necessary since the previous behavior only guaranteed that 1 << order pages would be reclaimed before failing for order > PAGE_ALLOC_COSTLY_ORDER. That reclaim is not guaranteed to be contiguous memory, so repeating for such large orders is usually not beneficial. Removing the setting of __GFP_RETRY_MAYFAIL to restore the previous behavior, specifically not allowing alloc_skb() to fail for small orders and oom kill if necessary rather than allowing RPCs to fail. Fixes: dcda9b04713c ("mm, tree wide: replace __GFP_REPEAT by __GFP_RETRY_MAYFAIL with more useful semantic") Signed-off-by: David Rientjes <rientjes@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-04 12:53:16 -08:00
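
The qed_chain_set_prod() entry above (2d533a9287) says that in a chain with a non power of 2 page count, the producer is not at the beginning of the chain when its index is 0 after a wrap. The arithmetic behind that can be sketched with a toy model. Everything below is an assumption made for illustration: the struct toy_chain type, both helper functions, the 16-bit producer index and the page sizes are hypothetical and are not the qed driver's data structures or its actual fix.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy model of a paged ring; NOT the qed driver's struct. */
struct toy_chain {
	uint32_t elems_per_page;	/* elements stored in each page */
	uint32_t page_cnt;		/* number of pages in the ring */
};

/*
 * Naive mapping: derive the page from the wrapped 16-bit producer index
 * alone, which implicitly assumes index 0 sits at the start of page 0.
 */
static uint32_t naive_page_idx(const struct toy_chain *c, uint16_t prod_idx)
{
	return (prod_idx / c->elems_per_page) % c->page_cnt;
}

/*
 * Reference mapping: derive the page from the total number of elements
 * produced so far, tracked in a counter wide enough not to wrap here.
 */
static uint32_t true_page_idx(const struct toy_chain *c, uint64_t produced)
{
	uint64_t capacity = (uint64_t)c->elems_per_page * c->page_cnt;

	return (uint32_t)((produced % capacity) / c->elems_per_page);
}

/* Self-check: compare both mappings right after the 16-bit index wraps. */
static void toy_chain_selfcheck(void)
{
	struct toy_chain odd = { 4, 3 };	/* 12 elements, not a power of 2 */
	struct toy_chain pow2 = { 4, 4 };	/* 16 elements, a power of 2 */
	uint64_t produced = 65536;		/* 2^16 elements produced */
	uint16_t idx = (uint16_t)produced;	/* wrapped index, now 0 */

	/* 65536 is a multiple of 16, so both mappings agree on page 0. */
	assert(true_page_idx(&pow2, produced) == naive_page_idx(&pow2, idx));

	/* 65536 % 12 == 4, so the producer is really on page 1, not 0. */
	assert(naive_page_idx(&odd, idx) == 0);
	assert(true_page_idx(&odd, produced) == 1);
}
```

In this model the two mappings agree for the power of 2 ring because 2^16 is a multiple of its capacity. For the 12-element ring the wrapped index 0 lands 4 elements into the ring, which is why the page index has to be computed more carefully after a wrap.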

809756 Commits