IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
This operation is handled by nf_synproxy_ipv4_init() now.
Fixes: d7f9b2f18eae ("netfilter: synproxy: extract SYNPROXY infrastructure from {ipt, ip6t}_SYNPROXY")
Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
ip netns exec ns1 ip a a dev eth0 10.0.0.7/24
ip netns exec ns2 ip link a link eth0 name vlan type vlan id 200
ip netns exec ns2 ip a a dev vlan 10.0.0.8/24
ip l add dev br0 type bridge vlan_filtering 1
brctl addif br0 veth1
brctl addif br0 veth2
bridge vlan add dev veth1 vid 200 pvid untagged
bridge vlan add dev veth2 vid 200
A two fragment packet sent from ns2 contains the vlan tag 200. In the
bridge conntrack, this packet will defrag to one skb with fraglist.
When the packet is forwarded to ns1 through veth1, the first skb vlan
tag will be cleared by the "untagged" flags. But the vlan tag in the
second skb is still tagged, so the second fragment ends up with tag 200
to ns1. So if the first fragment packet doesn't contain the vlan tag,
all of the remain should not contain vlan tag.
Fixes: 3c171f496ef5 ("netfilter: bridge: add connection tracking system")
Signed-off-by: wenxu <wenxu@ucloud.cn>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
When CONFIG_IPV6 is disabled, the bridge netfilter code
produces a link error:
ERROR: "br_ip6_fragment" [net/bridge/netfilter/nf_conntrack_bridge.ko] undefined!
ERROR: "nf_ct_frag6_gather" [net/bridge/netfilter/nf_conntrack_bridge.ko] undefined!
The problem is that it assumes that whenever IPV6 is not a loadable
module, we can call the functions direction. This is clearly
not true when IPV6 is disabled.
There are two other functions defined like this in linux/netfilter_ipv6.h,
so change them all the same way.
Fixes: 764dd163ac92 ("netfilter: nf_conntrack_bridge: add support for IPv6")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Prevent a UAF in brnf_exit_net().
When unregister_net_sysctl_table() is called the ctl_hdr pointer will
obviously be freed and so accessing it righter after is invalid. Fix
this by stashing a pointer to the table we want to free before we
unregister the sysctl header.
Note that syzkaller falsely chased this down to the drm tree so the
Fixes tag that syzkaller requested would be wrong. This commit uses a
different but the correct Fixes tag.
/* Splat */
BUG: KASAN: use-after-free in br_netfilter_sysctl_exit_net
net/bridge/br_netfilter_hooks.c:1121 [inline]
BUG: KASAN: use-after-free in brnf_exit_net+0x38c/0x3a0
net/bridge/br_netfilter_hooks.c:1141
Read of size 8 at addr ffff8880a4078d60 by task kworker/u4:4/8749
CPU: 0 PID: 8749 Comm: kworker/u4:4 Not tainted 5.2.0-rc5-next-20190618 #17
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google
01/01/2011
Workqueue: netns cleanup_net
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x172/0x1f0 lib/dump_stack.c:113
print_address_description.cold+0xd4/0x306 mm/kasan/report.c:351
__kasan_report.cold+0x1b/0x36 mm/kasan/report.c:482
kasan_report+0x12/0x20 mm/kasan/common.c:614
__asan_report_load8_noabort+0x14/0x20 mm/kasan/generic_report.c:132
br_netfilter_sysctl_exit_net net/bridge/br_netfilter_hooks.c:1121 [inline]
brnf_exit_net+0x38c/0x3a0 net/bridge/br_netfilter_hooks.c:1141
ops_exit_list.isra.0+0xaa/0x150 net/core/net_namespace.c:154
cleanup_net+0x3fb/0x960 net/core/net_namespace.c:553
process_one_work+0x989/0x1790 kernel/workqueue.c:2269
worker_thread+0x98/0xe40 kernel/workqueue.c:2415
kthread+0x354/0x420 kernel/kthread.c:255
ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
Allocated by task 11374:
save_stack+0x23/0x90 mm/kasan/common.c:71
set_track mm/kasan/common.c:79 [inline]
__kasan_kmalloc mm/kasan/common.c:489 [inline]
__kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:462
kasan_kmalloc+0x9/0x10 mm/kasan/common.c:503
__do_kmalloc mm/slab.c:3645 [inline]
__kmalloc+0x15c/0x740 mm/slab.c:3654
kmalloc include/linux/slab.h:552 [inline]
kzalloc include/linux/slab.h:743 [inline]
__register_sysctl_table+0xc7/0xef0 fs/proc/proc_sysctl.c:1327
register_net_sysctl+0x29/0x30 net/sysctl_net.c:121
br_netfilter_sysctl_init_net net/bridge/br_netfilter_hooks.c:1105 [inline]
brnf_init_net+0x379/0x6a0 net/bridge/br_netfilter_hooks.c:1126
ops_init+0xb3/0x410 net/core/net_namespace.c:130
setup_net+0x2d3/0x740 net/core/net_namespace.c:316
copy_net_ns+0x1df/0x340 net/core/net_namespace.c:439
create_new_namespaces+0x400/0x7b0 kernel/nsproxy.c:103
unshare_nsproxy_namespaces+0xc2/0x200 kernel/nsproxy.c:202
ksys_unshare+0x444/0x980 kernel/fork.c:2822
__do_sys_unshare kernel/fork.c:2890 [inline]
__se_sys_unshare kernel/fork.c:2888 [inline]
__x64_sys_unshare+0x31/0x40 kernel/fork.c:2888
do_syscall_64+0xfd/0x680 arch/x86/entry/common.c:301
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Freed by task 9:
save_stack+0x23/0x90 mm/kasan/common.c:71
set_track mm/kasan/common.c:79 [inline]
__kasan_slab_free+0x102/0x150 mm/kasan/common.c:451
kasan_slab_free+0xe/0x10 mm/kasan/common.c:459
__cache_free mm/slab.c:3417 [inline]
kfree+0x10a/0x2c0 mm/slab.c:3746
__rcu_reclaim kernel/rcu/rcu.h:215 [inline]
rcu_do_batch kernel/rcu/tree.c:2092 [inline]
invoke_rcu_callbacks kernel/rcu/tree.c:2310 [inline]
rcu_core+0xcc7/0x1500 kernel/rcu/tree.c:2291
__do_softirq+0x25c/0x94c kernel/softirq.c:292
The buggy address belongs to the object at ffff8880a4078d40
which belongs to the cache kmalloc-512 of size 512
The buggy address is located 32 bytes inside of
512-byte region [ffff8880a4078d40, ffff8880a4078f40)
The buggy address belongs to the page:
page:ffffea0002901e00 refcount:1 mapcount:0 mapping:ffff8880aa400a80
index:0xffff8880a40785c0
flags: 0x1fffc0000000200(slab)
raw: 01fffc0000000200 ffffea0001d636c8 ffffea0001b07308 ffff8880aa400a80
raw: ffff8880a40785c0 ffff8880a40780c0 0000000100000004 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff8880a4078c00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8880a4078c80: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
> ffff8880a4078d00: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
^
ffff8880a4078d80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8880a4078e00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
Reported-by: syzbot+43a3fa52c0d9c5c94f41@syzkaller.appspotmail.com
Fixes: 22567590b2e6 ("netfilter: bridge: namespace bridge netfilter sysctls")
Signed-off-by: Christian Brauner <christian@brauner.io>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This helper function is never used and it is intended to avoid a direct
dependency with the ipv6 module.
Fixes: d7f9b2f18eae ("netfilter: synproxy: extract SYNPROXY infrastructure from {ipt, ip6t}_SYNPROXY")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
When either CONFIG_IPV6 or CONFIG_SYN_COOKIES are disabled, the kernel
fails to build:
include/linux/netfilter_ipv6.h:180:9: error: implicit declaration of function '__cookie_v6_init_sequence'
[-Werror,-Wimplicit-function-declaration]
return __cookie_v6_init_sequence(iph, th, mssp);
include/linux/netfilter_ipv6.h:194:9: error: implicit declaration of function '__cookie_v6_check'
[-Werror,-Wimplicit-function-declaration]
return __cookie_v6_check(iph, th, cookie);
net/ipv6/netfilter.c:237:26: error: use of undeclared identifier '__cookie_v6_init_sequence'; did you mean 'cookie_init_sequence'?
net/ipv6/netfilter.c:238:21: error: use of undeclared identifier '__cookie_v6_check'; did you mean '__cookie_v4_check'?
Fix the IS_ENABLED() checks to match the function declaration
and definitions for these.
Fixes: 3006a5224f15 ("netfilter: synproxy: remove module dependency on IPv6 SYNPROXY")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Currently, the expiration of every element in a set or map
is a read-only parameter generated at kernel side.
This change will permit to set a certain expiration date
per element that will be required, for example, during
stateful replication among several nodes.
This patch handles the NFTA_SET_ELEM_EXPIRATION in order
to configure the expiration parameter per element, or
will use the timeout in the case that the expiration
is not set.
Signed-off-by: Laura Garcia Liebana <nevola@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
nf_ct_helper_ext_add may return null, which must then be checked.
Fixes: 857b46027d6f ("netfilter: nft_ct: add ct expectations support")
Reported-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Stéphane Veyret <sveyret@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Currently functions nf_synproxy_{ipc4|ipv6}_init return an uninitialized
garbage value in variable ret on a successful return. Fix this by
returning zero on success.
Addresses-Coverity: ("Uninitialized scalar variable")
Fixes: d7f9b2f18eae ("netfilter: synproxy: extract SYNPROXY infrastructure from {ipt, ip6t}_SYNPROXY")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Add common functions into nf_synproxy_core.c to prepare for nftables support.
The prototypes of the functions used by {ipt, ip6t}_SYNPROXY are in the new
file nf_synproxy.h
Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This is a prerequisite for the infrastructure module NETFILTER_SYNPROXY.
The new module is needed to avoid duplicated code for the SYNPROXY
nftables support.
Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This new UAPI file is going to be used by the xt and nft common SYNPROXY
infrastructure. It is needed to avoid duplicated code.
Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Jozsef Kadlecsik says:
====================
ipset patches for nf-next
- Remove useless memset() calls, nla_parse_nested/nla_parse
erase the tb array properly, from Florent Fourcot.
- Merge the uadd and udel functions, the code is nicer
this way, also from Florent Fourcot.
- Add a missing check for the return value of a
nla_parse[_deprecated] call, from Aditya Pakki.
- Add the last missing check for the return value
of nla_parse[_deprecated] call.
- Fix error path and release the references properly
in set_target_v3_checkentry().
- Fix memory accounting which is reported to userspace
for hash types on resize, from Stefano Brivio.
- Update my email address to kadlec@netfilter.org.
The patch covers all places in the source tree where
my kadlec@blackhole.kfki.hu address could be found.
====================
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Currently, the /proc/sys/net/bridge folder is only created in the initial
network namespace. This patch ensures that the /proc/sys/net/bridge folder
is available in each network namespace if the module is loaded and
disappears from all network namespaces when the module is unloaded.
In doing so the patch makes the sysctls:
bridge-nf-call-arptables
bridge-nf-call-ip6tables
bridge-nf-call-iptables
bridge-nf-filter-pppoe-tagged
bridge-nf-filter-vlan-tagged
bridge-nf-pass-vlan-input-dev
apply per network namespace. This unblocks some use-cases where users would
like to e.g. not do bridge filtering for bridges in a specific network
namespace while doing so for bridges located in another network namespace.
The netfilter rules are afaict already per network namespace so it should
be safe for users to specify whether bridge devices inside a network
namespace are supposed to go through iptables et al. or not. Also, this can
already be done per-bridge by setting an option for each individual bridge
via Netlink. It should also be possible to do this for all bridges in a
network namespace via sysctls.
Cc: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This ports the sysctls to use struct brnf_net.
With this patch we make it possible to namespace the br_netfilter module in
the following patch.
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
____nf_conntrack_find() performs checks on the conntrack objects in
this order:
1. if (nf_ct_is_expired(ct))
This fetches ct->timeout, in third cache line.
The hnnode that is used to store the list pointers resides in the first
(origin) or second (reply tuple) cache lines.
This test rarely passes, but its necessary to reap obsolete entries.
2. if (nf_ct_is_dying(ct))
This fetches ct->status, also in third cache line.
The test is useless, and can be removed:
Consider:
cpu0 cpu1
ct = ____nf_conntrack_find()
atomic_inc_not_zero(ct) -> ok
nf_ct_key_equal -> ok
is_dying -> DYING bit not set, ok
set_bit(ct, DYING);
... unhash ... etc.
return ct
-> returning a ct with dying bit set, despite
having a test for it.
This (unlikely) case is fine - refcount prevents ct from getting free'd.
3. if (nf_ct_key_equal(h, tuple, zone, net))
nf_ct_key_equal checks in following order:
1. Tuple equal (first or second cacheline)
2. Zone equal (third cacheline)
3. confirmed bit set (->status, third cacheline)
4. net namespace match (third cacheline).
Swapping "timeout" and "cpu" places timeout in the first cacheline.
This has two advantages:
1. For a conntrack that won't even match the original tuple,
we will now only fetch the first and maybe the second cacheline
instead of always accessing the 3rd one as well.
2. in case of TCP ct->timeout changes frequently because we
reduce/increase it when there are packets outstanding in the network.
The first cacheline contains both the reference count and the ct spinlock,
i.e. moving timeout there avoids writes to 3rd cacheline.
The restart sequence in __nf_conntrack_find() is removed, if we found a
candidate, but then fail to increment the refcount or discover the tuple
has changed (object recycling), just pretend we did not find an entry.
A second lookup won't find anything until another CPU adds a new conntrack
with identical tuple into the hash table, which is very unlikely.
We have the confirmation-time checks (when we hold hash lock) that deal
with identical entries and even perform clash resolution in some cases.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch allows to add, list and delete expectations via nft objref
infrastructure and assigning these expectations via nft rule.
This allows manual port triggering when no helper is defined to manage a
specific protocol. For example, if I have an online game which protocol
is based on initial connection to TCP port 9753 of the server, and where
the server opens a connection to port 9876, I can set rules as follow:
table ip filter {
ct expectation mygame {
protocol udp;
dport 9876;
timeout 2m;
size 1;
}
chain input {
type filter hook input priority 0; policy drop;
tcp dport 9753 ct expectation set "mygame";
}
chain output {
type filter hook output priority 0; policy drop;
udp dport 9876 ct status expected accept;
}
}
Signed-off-by: Stéphane Veyret <sveyret@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
It's better to use my kadlec@netfilter.org email address in
the source code. I might not be able to use
kadlec@blackhole.kfki.hu in the future.
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
If a fresh array block is allocated during resize, the current in-memory
set size should be increased by the size of the block, not replaced by it.
Before the fix, adding entries to a hash set type, leading to a table
resize, caused an inconsistent memory size to be reported. This becomes
more obvious when swapping sets with similar sizes:
# cat hash_ip_size.sh
#!/bin/sh
FAIL_RETRIES=10
tries=0
while [ ${tries} -lt ${FAIL_RETRIES} ]; do
ipset create t1 hash:ip
for i in `seq 1 4345`; do
ipset add t1 1.2.$((i / 255)).$((i % 255))
done
t1_init="$(ipset list t1|sed -n 's/Size in memory: \(.*\)/\1/p')"
ipset create t2 hash:ip
for i in `seq 1 4360`; do
ipset add t2 1.2.$((i / 255)).$((i % 255))
done
t2_init="$(ipset list t2|sed -n 's/Size in memory: \(.*\)/\1/p')"
ipset swap t1 t2
t1_swap="$(ipset list t1|sed -n 's/Size in memory: \(.*\)/\1/p')"
t2_swap="$(ipset list t2|sed -n 's/Size in memory: \(.*\)/\1/p')"
ipset destroy t1
ipset destroy t2
tries=$((tries + 1))
if [ ${t1_init} -lt 10000 ] || [ ${t2_init} -lt 10000 ]; then
echo "FAIL after ${tries} tries:"
echo "T1 size ${t1_init}, after swap ${t1_swap}"
echo "T2 size ${t2_init}, after swap ${t2_swap}"
exit 1
fi
done
echo "PASS"
# echo -n 'func hash_ip4_resize +p' > /sys/kernel/debug/dynamic_debug/control
# ./hash_ip_size.sh
[ 2035.018673] attempt to resize set t1 from 10 to 11, t 00000000fe6551fa
[ 2035.078583] set t1 resized from 10 (00000000fe6551fa) to 11 (00000000172a0163)
[ 2035.080353] Table destroy by resize 00000000fe6551fa
FAIL after 4 tries:
T1 size 9064, after swap 71128
T2 size 71128, after swap 9064
Reported-by: NOYB <JunkYardMail1@Frontier.com>
Fixes: 9e41f26a505c ("netfilter: ipset: Count non-static extension memory for userspace")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
In dump_init() the outdated comment was incorrect and we had a missing
validation check of nla_parse_deprecated().
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
When nla_parse fails, we should not use the results (the first
argument). The fix checks if it fails, and if so, returns its error code
upstream.
Signed-off-by: Aditya Pakki <pakki001@umn.edu>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Both functions are using exactly the same code, except the command value
passed to call_ad function.
Signed-off-by: Florent Fourcot <florent.fourcot@wifirst.fr>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
One of the memset call is buggy: it does not erase full array, but only pointer size.
Moreover, after a check, first step of nla_parse_nested/nla_parse is to
erase tb array as well. We can remove both calls safely.
Signed-off-by: Florent Fourcot <florent.fourcot@wifirst.fr>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
CONFIG_NETFILTER=m and CONFIG_NF_DEFRAG_IPV6 is not set
ERROR: "nf_ct_frag6_gather" [net/ipv6/ipv6.ko] undefined!
Fixes: c9bb6165a16e ("netfilter: nf_conntrack_bridge: fix CONFIG_IPV6=y")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
small cleanup: "struct request_sock_queue *queue" parameter of reqsk_queue_unlink
func is never used in the func, so we can remove it.
Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the early days of phylib we had a functionality that changed to the
next lower speed in fixed mode if no link was established after a
certain period of time. This functionality has been removed years ago,
and state PHY_FORCING isn't needed any longer. Instead we can go from
UP to RUNNING or NOLINK directly (same as in autoneg mode).
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The variable cache_allocs is to indicate how many frags (KiB) are in one
rds connection frag cache.
The command "rds-info -Iv" will output the rds connection cache
statistics as below:
"
RDS IB Connections:
LocalAddr RemoteAddr Tos SL LocalDev RemoteDev
1.1.1.14 1.1.1.14 58 255 fe80::2:c903🅰️7a31 fe80::2:c903🅰️7a31
send_wr=256, recv_wr=1024, send_sge=8, rdma_mr_max=4096,
rdma_mr_size=257, cache_allocs=12
"
This means that there are about 12KiB frag in this rds connection frag
cache.
Since rds.h in rds-tools is not related with the kernel rds.h, the change
in kernel rds.h does not affect rds-tools.
rds-info in rds-tools 2.0.5 and 2.0.6 is tested with this commit. It works
well.
Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Biao Huang says:
====================
complete dwmac-mediatek driver and fix flow control issue
Changes in v2:
patch#1: there is no extra action in mediatek_dwmac_remove, remove it
v1:
This series mainly complete dwmac-mediatek driver:
1. add power on/off operations for dwmac-mediatek.
2. disable rx watchdog to reduce rx path reponding time.
3. change the default value of tx-frames from 25 to 1, so
ptp4l will test pass by default.
and also fix the issue that flow control won't be disabled any more
once being enabled.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Current dwmac4_flow_ctrl will not clear
GMAC_RX_FLOW_CTRL_RFE/GMAC_RX_FLOW_CTRL_RFE bits,
so MAC hw will keep flow control on although expecting
flow control off by ethtool. Add codes to fix it.
Fixes: 477286b53f55 ("stmmac: add GMAC4 core support")
Signed-off-by: Biao Huang <biao.huang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
the default value of tx-frames is 25, it's too late when
passing tstamp to stack, then the ptp4l will fail:
ptp4l -i eth0 -f gPTP.cfg -m
ptp4l: selected /dev/ptp0 as PTP clock
ptp4l: port 1: INITIALIZING to LISTENING on INITIALIZE
ptp4l: port 0: INITIALIZING to LISTENING on INITIALIZE
ptp4l: port 1: link up
ptp4l: timed out while polling for tx timestamp
ptp4l: increasing tx_timestamp_timeout may correct this issue,
but it is likely caused by a driver bug
ptp4l: port 1: send peer delay response failed
ptp4l: port 1: LISTENING to FAULTY on FAULT_DETECTED (FT_UNSPECIFIED)
ptp4l tests pass when changing the tx-frames from 25 to 1 with
ethtool -C option.
It should be fine to set tx-frames default value to 1, so ptp4l will pass
by default.
Signed-off-by: Biao Huang <biao.huang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
disable rx watchdog for dwmac-mediatek, then the hw will
issue a rx interrupt once receiving a packet, so the responding time
for rx path will be reduced.
Signed-off-by: Biao Huang <biao.huang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
add Ethernet power on/off operations in init/exit flow.
Signed-off-by: Biao Huang <biao.huang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
IS_ERR() already calls unlikely(), so this extra likely() call
around the !IS_ERR() is not needed.
Signed-off-by: Enrico Weigelt <info@metux.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
IS_ERR() already calls unlikely(), so this extra unlikely() call
around IS_ERR() is not needed.
Signed-off-by: Enrico Weigelt <info@metux.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
IS_ERR() already calls unlikely(), so this extra unlikely() call
around IS_ERR() is not needed.
Signed-off-by: Enrico Weigelt <info@metux.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
IS_ERR() already calls unlikely(), so this extra likely() call
around the !IS_ERR() is not needed.
Signed-off-by: Enrico Weigelt <info@metux.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
IS_ERR() already calls unlikely(), so this extra likely() call
around the !IS_ERR() is not needed.
Signed-off-by: Enrico Weigelt <info@metux.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along
with memory for some number of elements for that array. For example:
struct nfp_tun_active_tuns {
...
struct route_ip_info {
__be32 ipv4;
__be32 egress_port;
__be32 extra[2];
} tun_info[];
};
Make use of the struct_size() helper instead of an open-coded version
in order to avoid any potential type mistakes.
So, replace the following form:
sizeof(struct nfp_tun_active_tuns) + sizeof(struct route_ip_info) * count
with:
struct_size(payload, tun_info, count)
This code was detected with the help of Coccinelle.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The PF driver state flag __I40E_VIRTCHNL_OP_PENDING needs to be
checked and set at the beginning of i40e_ndo_set_vf_mac. Otherwise,
if there are error conditions before it, the flag will be cleared
unexpectedly by this function to cause potential race conditions.
Hence move the check to the top of this function.
Signed-off-by: Lihong Yang <lihong.yang@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The VF configuration returned in i40e_ndo_get_vf_config is
already stored by the PF. There is no dependency on any
specific state of the VF to return the configuration.
Drop the check against I40E_VF_STATE_INIT since it is not
needed.
Signed-off-by: Lihong Yang <lihong.yang@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
10GbE Intel Wired LAN Driver Updates 2019-06-05
This series contains updates to mainly ixgbe, with a few updates to
i40e, net, ice and hns2 driver.
Jan adds support for tracking each queue pair for whether or not AF_XDP
zero copy is enabled. Also updated the ixgbe driver to use the
netdev-provided umems so that we do not need to contain these structures
in our own adapter structure.
William Tu provides two fixes for AF_XDP statistics which were causing
incorrect counts.
Jake reduces the PTP transmit timestamp timeout from 15 seconds to 1 second,
which is still well after the maximum expected delay. Also fixes an
issues with the PTP SDP pin setup which was not properly aligning on a
full second, so updated the code to account for the cyclecounter
multiplier and simplify the code to make the intent of the calculations
more clear. Updated the function header comments to help with the code
documentation. Added support for SDP/PPS output for x550 devices, which
is slightly different than x540 devices that currently have this
support.
Anirudh adds a new define for Link Layer Discovery Protocol to the
networking core, so that drivers do not have to create and use their own
definitions. In addition, update all the drivers currently defining
their own LLDP define to use the new networking core define.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
If ixgbevf_write_msg_read_ack fails, return its error code upstream
Signed-off-by: Kangjie Lu <kjlu@umn.edu>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Similar to the X540 hardware, enable support for generating a 1pps
output signal on SDP0.
This support is slightly different to the X540 hardware, because of the
register layout changes. First, the system time register is now
represented in 'cycles' and 'billions of cycles'. Second, we need to
also program the TSSDP register, as well as the ESDP register. Third,
the clock output uses only FREQOUT, instead of a full 64bit value for
the output clock period. Finally, we have to use the ST0 bit instead of
the SYNCLK bit in the TSAUXC register.
This support should work even for the hardware with a higher frequency
clock, as it carefully takes into account the multiply and shift of the
cycle counter used.
We also set the pps configuration to 1, since we now support generating
a pulse per second output.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Remove references to HCLGE_MAC_ETHERTYPE_LLDP and use ETH_P_LLDP instead.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Remove references to IXGBE_ETH_P_LLD and use ETH_P_LLDP instead.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Remove references to I40E_ETH_P_LLDP and use ETH_P_LLDP instead.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add a new define ETH_P_LLDP for Link Layer Discovery Protocol (LLDP)
ethertype.
Suggested-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>