iv/linux

Author	SHA1	Message	Date
Li RongQing	982c17b9e3	net: remove BUG_ON from __pskb_pull_tail If list is a NULL pointer, the following access of list will trigger a panic, which is the same as BUG_ON. Signed-off-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-15 15:07:50 -08:00
Eric Dumazet	08e14fe429	net_sched: sch_fq: ensure maxrate fq parameter applies to EDT flows When EDT conversion happened, fq lost the ability to enforce a maxrate for all flows. It kept it for non EDT flows. This commit restores the functionality. Tested: tc qd replace dev eth0 root fq maxrate 500Mbit netperf -P0 -H host -- -O THROUGHPUT 489.75 Fixes: ab408b6dc744 ("tcp: switch tcp and sch_fq to new earliest departure time model") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-15 11:42:12 -08:00
Amritha Nambiar	5c72299fba	net: sched: cls_flower: Classify packets using port ranges Added support in tc flower for filtering based on port ranges. Example: 1. Match on a port range: ------------------------- $ tc filter add dev enp4s0 protocol ip parent ffff:\ prio 1 flower ip_proto tcp dst_port range 20-30 skip_hw\ action drop $ tc -s filter show dev enp4s0 parent ffff: filter protocol ip pref 1 flower chain 0 filter protocol ip pref 1 flower chain 0 handle 0x1 eth_type ipv4 ip_proto tcp dst_port range 20-30 skip_hw not_in_hw action order 1: gact action drop random type none pass val 0 index 1 ref 1 bind 1 installed 85 sec used 3 sec Action statistics: Sent 460 bytes 10 pkt (dropped 10, overlimits 0 requeues 0) backlog 0b 0p requeues 0 2. Match on IP address and port range: -------------------------------------- $ tc filter add dev enp4s0 protocol ip parent ffff:\ prio 1 flower dst_ip 192.168.1.1 ip_proto tcp dst_port range 100-200\ skip_hw action drop $ tc -s filter show dev enp4s0 parent ffff: filter protocol ip pref 1 flower chain 0 handle 0x2 eth_type ipv4 ip_proto tcp dst_ip 192.168.1.1 dst_port range 100-200 skip_hw not_in_hw action order 1: gact action drop random type none pass val 0 index 2 ref 1 bind 1 installed 58 sec used 2 sec Action statistics: Sent 920 bytes 20 pkt (dropped 20, overlimits 0 requeues 0) backlog 0b 0p requeues 0 v4: 1. Added condition before setting port key. 2. Organized setting and dumping port range keys into functions and added validation of input range. v3: 1. Moved new fields in UAPI enum to the end of enum. 2. Removed couple of empty lines. v2: Addressed Jiri's comments: 1. Added separate functions for dst and src comparisons. 2. Removed endpoint enum. 3. Added new bit TCA_FLOWER_FLAGS_RANGE to decide normal/range lookup. 4. Cleaned up fl_lookup function. Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-15 11:38:23 -08:00
Cong Wang	7fe50ac83f	net: dump more useful information in netdev_rx_csum_fault() Currently netdev_rx_csum_fault() only shows a device name, we need more information about the skb for debugging csum failures. Sample output: ens3: hw csum failure dev features: 0x0000000000014b89 skb len=84 data_len=0 pkt_type=0 gso_size=0 gso_type=0 nr_frags=0 ip_summed=0 csum=0 csum_complete_sw=0 csum_valid=0 csum_level=0 Note, I use pr_err() just to be consistent with the existing one. Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-15 11:37:04 -08:00
David Howells	7150ceaacb	rxrpc: Fix life check The life-checking function, which is used by kAFS to make sure that a call is still live in the event of a pending signal, only samples the received packet serial number counter; it doesn't actually provoke a change in the counter, rather relying on the server to happen to give us a packet in the time window. Fix this by adding a function to force a ping to be transmitted. kAFS then keeps track of whether there's been a stall, and if so, uses the new function to ping the server, resetting the timeout to allow the reply to come back. If there's a stall, a ping and the call is still stalled in the same place after another period, then the call will be aborted. Fixes: bc5e3a546d55 ("rxrpc: Use MSG_WAITALL to tell sendmsg() to temporarily ignore signals") Fixes: f4d15fb6f99a ("rxrpc: Provide functions for allowing cleaner handling of signals") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-15 11:35:40 -08:00
Linus Torvalds	94ca5c18e1	NFS client bugfixes for Linux 4.20 Highlights include: Stable fixes: - Don't exit the NFSv4 state manager without clearing NFS4CLNT_MANAGER_RUNNING Bugfixes: - Fix an Oops when destroying the RPCSEC_GSS credential cache - Fix an Oops during delegation callbacks - Ensure that the NFSv4 state manager exits the loop on SIGKILL - Fix a bogus get/put in generic_key_to_expire() Merge tag 'nfs-for-4.20-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull NFS client bugfixes from Trond Myklebust: "Highlights include: Stable fixes: - Don't exit the NFSv4 state manager without clearing NFS4CLNT_MANAGER_RUNNING Bugfixes: - Fix an Oops when destroying the RPCSEC_GSS credential cache - Fix an Oops during delegation callbacks - Ensure that the NFSv4 state manager exits the loop on SIGKILL - Fix a bogus get/put in generic_key_to_expire()" * tag 'nfs-for-4.20-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFSv4: Fix an Oops during delegation callbacks SUNRPC: Fix a bogus get/put in generic_key_to_expire() SUNRPC: Fix a Oops when destroying the RPCSEC_GSS credential cache NFSv4: Ensure that the state manager exits the loop on SIGKILL NFSv4: Don't exit the state manager without clearing NFS4CLNT_MANAGER_RUNNING	2018-11-15 10:59:37 -06:00
Xin Long	f8504f4ca0	l2tp: fix a sock refcnt leak in l2tp_tunnel_register This issue happens when trying to add an existent tunnel. It doesn't call sock_put() before returning -EEXIST to release the sock refcnt that was held by calling sock_hold() before the existence check. This patch is to fix it by holding the sock after doing the existence check. Fixes: f6cd651b056f ("l2tp: fix race in duplicate tunnel detection") Reported-by: Jianlin Shi <jishi@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Reviewed-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-14 22:49:31 -08:00
Linus Torvalds	4e4490d438	Three nfsd bugfixes. None are new bugs, but they all take a little effort to hit, which might explain why they weren't found sooner. Merge tag 'nfsd-4.20-1' of git://linux-nfs.org/~bfields/linux Pull nfsd fixes from Bruce Fields: "Three nfsd bugfixes. None are new bugs, but they all take a little effort to hit, which might explain why they weren't found sooner" * tag 'nfsd-4.20-1' of git://linux-nfs.org/~bfields/linux: SUNRPC: drop pointless static qualifier in xdr_get_next_encode_buffer() nfsd: COPY and CLONE operations require the saved filehandle to be set sunrpc: correct the computation for page_ptr when truncating	2018-11-14 15:31:15 -06:00
Jakub Kicinski	c0b7490b19	net: sched: red: notify drivers about RED's limit parameter RED qdisc's limit parameter changes the behaviour of the qdisc, for instance if it's set to 0 qdisc will drop all the packets. When replace operation happens and parameter is set to non-0 a new fifo qdisc will be instantiated and replace the old child qdisc which will be destroyed. Drivers need to know the parameter, even if they don't impose the actual limit to be able to reliably reconstruct the Qdisc hierarchy. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-14 08:51:28 -08:00
Jakub Kicinski	d577a3d279	net: sched: mq: offload a graft notification Drivers offloading Qdiscs should have reasonable certainty the offloaded behaviour matches the SW path. This is impossible if the driver does not know about all Qdiscs or when Qdiscs move and are reused. Send a graft notification from MQ. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-14 08:51:28 -08:00
Jakub Kicinski	bf2a752bea	net: sched: red: offload a graft notification Drivers offloading Qdiscs should have reasonable certainty the offloaded behaviour matches the SW path. This is impossible if the driver does not know about all Qdiscs or when Qdiscs move and are reused. Send a graft notification from RED. The drivers are expected to simply stop offloading the Qdisc, if a non-standard child is ever grafted onto it. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-14 08:51:28 -08:00
Jakub Kicinski	98b0e5f684	net: sched: provide notification for graft on root Drivers are currently not notified when a Qdisc is grafted as root. This requires special casing Qdiscs added with parent = TC_H_ROOT in the driver. Also there is no notification sent to the driver when an existing Qdisc is grafted as root. Add these very simple notifications; drivers should now be able to track their Qdisc tree fully. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-14 08:51:27 -08:00
David S. Miller	11123ab9d9	linux-can-fixes-for-4.20-20181109 Merge tag 'linux-can-fixes-for-4.20-20181109' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2018-11-09 this is a pull request of 20 patches for net/master. First we have a patch by Oliver Hartkopp which changes the raw socket's raw_sendmsg() to return an error value if the user tries to send a CANFD frame to a CAN-2.0 device. The next two patches are by Jimmy Assarsson and fix potential problems in the kvaser_usb driver. YueHaibing's patches for the ucan driver fix a compile time warning and remove a duplicate include. Eugeniu Rosca's patch adds more binding documentation to the rcar_can driver bindings. The next two patches are by Fabrizio Castro for the rcar_can driver; they fix a problem in the driver's probe function and document the r8a774a1 binding. Lukas Wunner's patch fixes a reception problem in the hi311x driver by switching from edge to level triggered interrupts. The next three patches all target the flexcan driver. Pankaj Bansal's patch unconditionally unlocks the last mailbox used for RX. Alexander Stein provides a better workaround for a hardware limitation when sending RTR frames, by using the last mailbox for TX, resulting in fewer lost frames. The patch by me simplifies the driver by making a runtime value a compile time constant. The following 4 patches are by me and provide the groundwork for the next patches by Oleksij Rempel. To avoid code duplication, common code in the common CAN driver infrastructure is factored out and error handling is cleaned up. The next 4 patches are by Oleksij Rempel and fix the problem in the flexcan driver that other processes see TX frames arrive out of order with regard to an RX'ed frame (which is sent by a different system on the CAN bus as the result of our TX frame). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-13 08:43:05 -08:00
Trond Myklebust	e3d5e573a5	SUNRPC: Fix a bogus get/put in generic_key_to_expire() Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-11-12 16:39:13 -05:00
Trond Myklebust	a652a4bc21	SUNRPC: Fix a Oops when destroying the RPCSEC_GSS credential cache Commit 07d02a67b7fa causes a use-after free in the RPCSEC_GSS credential destroy code, because the call to get_rpccred() in gss_destroying_context() will now always fail to increment the refcount. While we could just replace the get_rpccred() with a refcount_set(), that would have the unfortunate consequence of resurrecting a credential in the credential cache for which we are in the process of destroying the RPCSEC_GSS context. Rather than do this, we choose to make a copy that is never added to the cache and use that to destroy the context. Fixes: 07d02a67b7fa ("SUNRPC: Simplify lookup code") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-11-12 16:39:13 -05:00
Xin Long	6ba8457402	sctp: process sk_reuseport in sctp_get_port_local When socks' sk_reuseport is set, the same port and address are allowed to be bound into these socks who have the same uid. Note that the difference from sk_reuse is that it allows multiple socks to listen on the same port and address. Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-12 09:09:51 -08:00
Xin Long	76c6d988ae	sctp: add sock_reuseport for the sock in __sctp_hash_endpoint This is a part of sk_reuseport support for sctp. It defines a helper sctp_bind_addrs_check() to check if the bind_addrs in two socks are matched. It will add sock_reuseport if they are completely matched, and return err if they are partly matched, and alloc sock_reuseport if all socks are not matched at all. It will work until sk_reuseport support is added in sctp_get_port_local() in the next patch. v1->v2: - use 'laddr->valid && laddr2->valid' check instead as Marcelo pointed in sctp_bind_addrs_check(). Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-12 09:09:51 -08:00
Xin Long	532ae2f10e	sctp: do reuseport_select_sock in __sctp_rcv_lookup_endpoint This is a part of sk_reuseport support for sctp, and it selects a sock by the hashkey of lport, paddr and dport by default. It will work until sk_reuseport support is added in sctp_get_port_local() in the next patch. v1->v2: - define lport as __be16 instead of __be32 as Marcelo pointed in __sctp_rcv_lookup_endpoint(). Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-12 09:09:51 -08:00
Linus Lüssing	016fd28568	batman-adv: enable MCAST by default at compile time Thanks to rigorous testing in wireless community mesh networks several issues with multicast entries in the translation table were found and fixed in the last 1.5 years. Now we see the first larger networks (a few hundred nodes) with a batman-adv version with multicast optimizations enabled arising, with no TT / multicast optimization related issues so far. Therefore it seems safe to enable multicast optimizations by default. Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>	2018-11-12 10:41:51 +01:00
Sven Eckelmann	fb939135a6	batman-adv: Move CRC16 dependency to BATMAN_ADV_BLA The commit ced72933a5e8 ("batman-adv: use CRC32C instead of CRC16 in TT code") switched the translation table code from crc16 to crc32c. The (optional) bridge loop avoidance code is the only user of this function. batman-adv should only select CRC16 when it is actually using it. Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>	2018-11-12 10:41:51 +01:00
Sven Eckelmann	d2d489b7d8	batman-adv: Add inconsistent multicast netlink dump detection The netlink dump functionality transfers a large number of entries from the kernel to userspace. It is rather likely that the transfer has to be interrupted and later continued. During that time, it can happen that either new entries are added or removed. The userspace could then either receive some entries multiple times or miss entries. Commit 670dc2833d14 ("netlink: advertise incomplete dumps") introduced a mechanism to inform userspace about this problem. Userspace can then decide whether it is necessary or not to retry dumping the information again. The netlink dump functions have to be switched to exclusive locks to avoid changes while the current message is prepared. The already existing generation sequence counter from the hash helper can be used for this simple hash. Reported-by: Matthias Schiffer <mschiffer@universe-factory.net> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>	2018-11-12 10:41:51 +01:00
Sven Eckelmann	6b7b40aad5	batman-adv: Add inconsistent local TT netlink dump detection The netlink dump functionality transfers a large number of entries from the kernel to userspace. It is rather likely that the transfer has to be interrupted and later continued. During that time, it can happen that either new entries are added or removed. The userspace could then either receive some entries multiple times or miss entries. Commit 670dc2833d14 ("netlink: advertise incomplete dumps") introduced a mechanism to inform userspace about this problem. Userspace can then decide whether it is necessary or not to retry dumping the information again. The netlink dump functions have to be switched to exclusive locks to avoid changes while the current message is prepared. The already existing generation sequence counter from the hash helper can be used for this simple hash. Reported-by: Matthias Schiffer <mschiffer@universe-factory.net> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>	2018-11-12 10:41:51 +01:00
Sven Eckelmann	6f81652a47	batman-adv: Add inconsistent dat netlink dump detection The netlink dump functionality transfers a large number of entries from the kernel to userspace. It is rather likely that the transfer has to be interrupted and later continued. During that time, it can happen that either new entries are added or removed. The userspace could then either receive some entries multiple times or miss entries. Commit 670dc2833d14 ("netlink: advertise incomplete dumps") introduced a mechanism to inform userspace about this problem. Userspace can then decide whether it is necessary or not to retry dumping the information again. The netlink dump functions have to be switched to exclusive locks to avoid changes while the current message is prepared. The already existing generation sequence counter from the hash helper can be used for this simple hash. Reported-by: Matthias Schiffer <mschiffer@universe-factory.net> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>	2018-11-12 10:41:51 +01:00
Sven Eckelmann	24d71b9232	batman-adv: Add inconsistent claim netlink dump detection The netlink dump functionality transfers a large number of entries from the kernel to userspace. It is rather likely that the transfer has to be interrupted and later continued. During that time, it can happen that either new entries are added or removed. The userspace could then either receive some entries multiple times or miss entries. Commit 670dc2833d14 ("netlink: advertise incomplete dumps") introduced a mechanism to inform userspace about this problem. Userspace can then decide whether it is necessary or not to retry dumping the information again. The netlink dump functions have to be switched to exclusive locks to avoid changes while the current message is prepared. The already existing generation sequence counter from the hash helper can be used for this simple hash. Reported-by: Matthias Schiffer <mschiffer@universe-factory.net> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>	2018-11-12 10:41:51 +01:00
Sven Eckelmann	b00d0e6a2c	batman-adv: Add inconsistent backbone netlink dump detection The netlink dump functionality transfers a large number of entries from the kernel to userspace. It is rather likely that the transfer has to be interrupted and later continued. During that time, it can happen that either new entries are added or removed. The userspace could then either receive some entries multiple times or miss entries. Commit 670dc2833d14 ("netlink: advertise incomplete dumps") introduced a mechanism to inform userspace about this problem. Userspace can then decide whether it is necessary or not to retry dumping the information again. The netlink dump functions have to be switched to exclusive locks to avoid changes while the current message is prepared. The already existing generation sequence counter from the hash helper can be used for this simple hash. Reported-by: Matthias Schiffer <mschiffer@universe-factory.net> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>	2018-11-12 10:41:51 +01:00
Sven Eckelmann	05abd7bcc9	batman-adv: Store modification counter via hash helpers Multiple datastructures use the hash helper functions to add and remove entries from the simple hlist based hashes. These are often also dumped to userspace via netlink and thus should have a generation sequence counter. Reported-by: Matthias Schiffer <mschiffer@universe-factory.net> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>	2018-11-12 10:41:51 +01:00
Sven Eckelmann	fb69be6979	batman-adv: Add inconsistent hardif netlink dump detection The netlink dump functionality transfers a large number of entries from the kernel to userspace. It is rather likely that the transfer has to be interrupted and later continued. During that time, it can happen that either new entries are added or removed. The userspace could then either receive some entries multiple times or miss entries. Commit 670dc2833d14 ("netlink: advertise incomplete dumps") introduced a mechanism to inform userspace about this problem. Userspace can then decide whether it is necessary or not to retry dumping the information again. The netlink dump functions have to be switched to exclusive locks to avoid changes while the current message is prepared. And an external generation sequence counter is introduced which tracks all modifications of the list. Reported-by: Matthias Schiffer <mschiffer@universe-factory.net> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>	2018-11-12 10:41:51 +01:00
Sven Eckelmann	9264c85c8b	batman-adv: Add inconsistent gateway netlink dump detection The netlink dump functionality transfers a large number of entries from the kernel to userspace. It is rather likely that the transfer has to be interrupted and later continued. During that time, it can happen that either new entries are added or removed. The userspace could then either receive some entries multiple times or miss entries. Commit 670dc2833d14 ("netlink: advertise incomplete dumps") introduced a mechanism to inform userspace about this problem. Userspace can then decide whether it is necessary or not to retry dumping the information again. The netlink dump functions have to be switched to exclusive locks to avoid changes while the current message is prepared. And an external generation sequence counter is introduced which tracks all modifications of the list. Reported-by: Matthias Schiffer <mschiffer@universe-factory.net> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>	2018-11-12 10:41:51 +01:00
Sven Eckelmann	694127c1dd	batman-adv: Fix description for BATMAN_ADV_DEBUG The debug messages of batman-adv are not printed to the kernel log at all but can be stored (depending on the compile setting) in the tracing buffer or the batadv specific log buffer. There is also no debug module parameter but a batadv netdev specific log_level setting to enable/disable different classes of debug messages at runtime. Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>	2018-11-12 10:41:50 +01:00
Sven Eckelmann	0dacc7fab6	batman-adv: Allow to use BATMAN_ADV_DEBUG without BATMAN_ADV_DEBUGFS The BATMAN_ADV_DEBUGFS portion of batman-adv is marked as deprecated. Thus all required functionality should be available without it. The debug log was already modified to also output via the kernel tracing function but still retained its BATMAN_ADV_DEBUGFS functionality. Separate the entry point for the debug log from the debugfs portions to make it possible to build with BATMAN_ADV_DEBUG and without BATMAN_ADV_DEBUGFS. Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>	2018-11-12 10:41:50 +01:00
Sven Eckelmann	95d8f85c91	batman-adv: Improve includes for trace functionality The batadv_dbg trace event uses different functionality and datastructures which are not directly associated with the trace infrastructure. It should not be expected that the trace headers indirectly provide them and instead include the required headers directly. Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>	2018-11-12 10:41:50 +01:00
Sven Eckelmann	a5dac4da72	batman-adv: Add includes for deprecation warning The commit 00caf6a2b318 ("batman-adv: Mark debugfs functionality as deprecated") introduced various messages to inform the user about the deprecation of the debugfs based functionality. The messages also include the context/task in which this problem was observed. The datastructures and functions to access this information require special headers. These should be included directly instead of depending on a more complex and fragile include chain. Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>	2018-11-12 10:41:50 +01:00
Sven Eckelmann	01468225f3	batman-adv: Drop unused lockdep include The commit dee222c7b20c ("batman-adv: Move OGM rebroadcast stats to orig_ifinfo") removed all used functionality of the include linux/lockdep.h from batadv_iv_ogm.c. Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>	2018-11-12 10:41:50 +01:00
Simon Wunderlich	3987b6a4cc	batman-adv: Start new development cycle Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>	2018-11-12 10:41:50 +01:00
Sven Eckelmann	d7d8bbb40a	batman-adv: Expand merged fragment buffer for full packet The complete size ("total_size") of the fragmented packet is stored in the fragment header and in the size of the fragment chain. When the fragments are ready for merge, the skbuff's tail of the first fragment is expanded to have enough room after the data pointer for at least total_size. This means that it gets expanded by total_size - first_skb->len. But this is ignoring the fact that after expanding the buffer, the fragment header is pulled from this buffer. Assuming that the tailroom of the buffer was already 0, the buffer after the data pointer of the skbuff is now only total_size - len(fragment_header) large. When the merge function is then processing the remaining fragments, the code to copy the data over to the merged skbuff will cause an skb_over_panic when it tries to actually put enough data to fill the total_size bytes of the packet. The size of the skb_pull must therefore also be taken into account when the buffer's tailroom is expanded. Fixes: 610bfc6bc99b ("batman-adv: Receive fragmented packets and merge") Reported-by: Martin Weinelt <martin@darmstadt.freifunk.net> Co-authored-by: Linus Lüssing <linus.luessing@c0d3.blue> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>	2018-11-12 10:41:29 +01:00
Sven Eckelmann	f4156f9656	batman-adv: Use explicit tvlv padding for ELP packets The announcement messages of batman-adv COMPAT_VERSION 15 have the possibility to announce additional information via a dynamic TVLV part. This part is optional for the ELP packets and currently not parsed by the Linux implementation. Still out-of-tree versions are using it to transport things like neighbor hashes to optimize the rebroadcast behavior. Since the ELP broadcast packets are smaller than the minimal ethernet packet, it often has to be padded. This is often done (as specified in RFC894) with octets of zero and thus work perfectly fine with the TVLV part (making it a zero length and thus empty). But not all ethernet compatible hardware seems to follow this advice. To avoid ambiguous situations when parsing the TVLV header, just force the 4 bytes (TVLV length + padding) after the required ELP header to zero. Fixes: d6f94d91f766 ("batman-adv: ELP - adding basic infrastructure") Reported-by: Linus Lüssing <linus.luessing@c0d3.blue> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>	2018-11-12 10:41:29 +01:00
David S. Miller	2b9b7502df	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2018-11-11 17:57:54 -08:00
Eric Dumazet	48872c11b7	net_sched: sch_fq: add dctcp-like marking Similar to 80ba92fa1a92 ("codel: add ce_threshold attribute") After EDT adoption, it became easier to implement DCTCP-like CE marking. In many cases, queues are not building in the network fabric but on the hosts themselves. If packets leaving fq missed their Earliest Departure Time by XXX usec, we mark them with ECN CE. This gives a feedback (after one RTT) to the sender to slow down and find better operating mode. Example : tc qd replace dev eth0 root fq ce_threshold 2.5ms Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-11 13:59:21 -08:00
Eric Dumazet	c73e5807e4	tcp: tsq: no longer use limit_output_bytes for paced flows FQ pacing guarantees that paced packets queued by one flow do not add head-of-line blocking for other flows. After TCP GSO conversion, increasing limit_output_bytes to 1 MB is safe, since this maps to 16 skbs at most in qdisc or device queues. (or slightly more if some drivers lower {gso_max_segs|size}) We still can queue at most 1 ms worth of traffic (this can be scaled by wifi drivers if they need to) Tested: # ethtool -c eth0 | egrep "tx-usecs:|tx-frames:" # 40 Gbit mlx4 NIC tx-usecs: 16 tx-frames: 16 # tc qdisc replace dev eth0 root fq # for f in {1..10};do netperf -P0 -H lpaa24,6 -o THROUGHPUT;done Before patch: 27711 26118 27107 27377 27712 27388 27340 27117 27278 27509 After patch: 37434 36949 36658 36998 37711 37291 37605 36659 36544 37349 Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-11 13:57:03 -08:00
Eric Dumazet	a682850a11	tcp: get rid of tcp_tso_should_defer() dependency on HZ/jiffies tcp_tso_should_defer() first heuristic is to not defer if last send is "old enough". Its current implementation uses jiffies and its low granularity. TSO autodefer performance should not rely on kernel HZ :/ After EDT conversion, we have state variables in nanoseconds that can allow us to properly implement the heuristic. This patch increases TSO chunk sizes on medium rate flows, especially when receivers do not use GRO or similar aggregation. It also reduces bursts for HZ=100 or HZ=250 kernels, making TCP behavior more uniform. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-11 13:54:53 -08:00
Eric Dumazet	f1c6ea3827	tcp: refine tcp_tso_should_defer() after EDT adoption tcp_tso_should_defer() last step tries to check if the probable next ACK packet is coming in less than half rtt. Problem is that the head->tstamp might be in the future, so we need to use signed arithmetics to avoid overflows. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-11 13:54:53 -08:00
Eric Dumazet	1c09f7d073	tcp: do not try to defer skbs with eor mark (MSG_EOR) Applications using MSG_EOR are giving a strong hint to TCP stack : Subsequent sendmsg() can not append more bytes to skbs having the EOR mark. Do not try to TSO defer such skbs, there is really no hope. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-11 13:54:53 -08:00
Yafang Shao	5e13a0d3f5	tcp: minor optimization in tcp ack fast path processing Bitwise operation is a little faster. So I replace after() with using the flag FLAG_SND_UNA_ADVANCED as it is already set before. In addition, there's another similar improvement in tcp_cwnd_reduction(). Cc: Joe Perches <joe@perches.com> Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-11 10:24:18 -08:00
Eric Dumazet	7236ead1b1	act_mirred: clear skb->tstamp on redirect If sch_fq is used at ingress, skbs that might have been timestamped by net_timestamp_set() if a packet capture is requesting timestamps could be delayed by arbitrary amount of time, since sch_fq time base is MONOTONIC. Fix this problem by moving code from sch_netem.c to act_mirred.c. Fixes: fb420d5d91c1 ("tcp/fq: move back to CLOCK_MONOTONIC") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-11 10:21:31 -08:00
Jon Maloy	7ab412d33b	tipc: fix link re-establish failure When a link failure is detected locally, the link is reset, the flag link->in_session is set to false, and a RESET_MSG with the 'stopping' bit set is sent to the peer. The purpose of this bit is to inform the peer that this endpoint just is going down, and that the peer should handle the reception of this particular RESET message as a local failure. This forces the peer to accept another RESET or ACTIVATE message from this endpoint before it can re-establish the link. This again is necessary to ensure that link session numbers are properly exchanged before the link comes up again. If a failure is detected locally at the same time at the peer endpoint this will do the same, which is also a correct behavior. However, when receiving such messages, the endpoints will not distinguish between 'stopping' RESETs and ordinary ones when it comes to updating session numbers. Both endpoints will copy the received session number and set their 'in_session' flags to true at the reception, while they are still expecting another RESET from the peer before they can go ahead and re-establish. This is contradictory, since, after applying the validation check referred to below, the 'in_session' flag will cause rejection of all such messages, and the link will never come up again. We now fix this by not only handling received RESET/STOPPING messages as a local failure, but also by omitting to set a new session number and the 'in_session' flag in such cases. Fixes: 7ea817f4e832 ("tipc: check session number before accepting link protocol messages") Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-11 10:03:38 -08:00
LUU Duc Canh	31c4f4cc32	tipc: improve broadcast retransmission algorithm Currently, the broadcast retransmission algorithm is using the 'prev_retr' field in struct tipc_link to time stamp the latest broadcast retransmission occasion. This helps to restrict retransmission of individual broadcast packets to max once per 10 milliseconds, even though all other criteria for retransmission are met. We now move this time stamp to the control block of each individual packet, and remove other limiting criteria. This simplifies the retransmission algorithm, and eliminates any risk of logical errors in selecting which packets can be retransmitted. Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: LUU Duc Canh <canh.d.luu@dektech.com.au> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-11 09:57:46 -08:00
John Hurley	7f76fa3675	net: sched: register callbacks for indirect tc block binds Currently drivers can register to receive TC block bind/unbind callbacks by implementing the setup_tc ndo in any of their given netdevs. However, drivers may also be interested in binds to higher level devices (e.g. tunnel drivers) to potentially offload filters applied to them. Introduce indirect block devs which allows drivers to register callbacks for block binds on other devices. The callback is triggered when the device is bound to a block, allowing the driver to register for rules applied to that block using already available functions. Freeing an indirect block callback will trigger an unbind event (if necessary) to direct the driver to remove any offloaded rules and unreg any block rule callbacks. It is the responsibility of the implementing driver to clean any registered indirect block callbacks before exiting, if the block is still active at such a time. Allow registering an indirect block dev callback for a device that is already bound to a block. In this case (if it is an ingress block), register and also trigger the callback meaning that any already installed rules can be replayed to the calling driver. Signed-off-by: John Hurley <john.hurley@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-11 09:54:52 -08:00
David S. Miller	e15e067d06	sctp: Fix SKB list traversal in sctp_intl_store_ordered(). Same change as made to sctp_intl_store_reasm(). To be fully correct, an iterator has an undefined value when something like skb_queue_walk() naturally terminates. This will actually matter when SKB queues are converted over to list_head. Formalize what this code ends up doing with the current implementation. Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-10 19:32:23 -08:00
David S. Miller	348bbc25c4	sctp: Fix SKB list traversal in sctp_intl_store_reasm(). To be fully correct, an iterator has an undefined value when something like skb_queue_walk() naturally terminates. This will actually matter when SKB queues are converted over to list_head. Formalize what this code ends up doing with the current implementation. Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-10 19:28:27 -08:00
David S. Miller	9e733177c7	iucv: Remove SKB list assumptions. Eliminate the assumption that SKBs and SKB list heads can be cast to each other in SKB list handling code. This change also appears to fix a bug since the list->next pointer is sampled outside of holding the SKB queue lock. Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-10 16:55:11 -08:00


53816 Commits