IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
If CTA_STATUS is present, but CTA_STATUS_MASK is not, then the
mask is automatically set to 'status', so that kernel returns those
entries that have all of the requested bits set.
This makes more sense than using a all-one mask since we'd hardly
ever find a match.
There are no other checks for status bits, so if e.g. userspace
sets impossible combinations it will get an empty dump.
If kernel would reject unknown status bits, then a program that works on
a future kernel that has IPS_FOO bit fails on old kernels.
Same for 'impossible' combinations:
Kernel never sets ASSURED without first having set SEEN_REPLY, but its
possible that a future kernel could do so.
Therefore no sanity tests other than a 0-mask.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
ctnetlink dumps can be filtered based on the connmark.
Prepare for status bit filtering by using a named structure and by
moving the mark parsing code to a helper.
Else ctnetlink_alloc_filter size grows a bit too big for my taste
when status handling is added.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
If any of these modules is loaded, hooks get registered in all netns:
Before: 'unshare -n nft list hooks' shows:
family bridge hook prerouting {
-2147483648 ebt_broute
-0000000300 ebt_nat_hook
}
family bridge hook input {
-0000000200 ebt_filter_hook
}
family bridge hook forward {
-0000000200 ebt_filter_hook
}
family bridge hook output {
+0000000100 ebt_nat_hook
+0000000200 ebt_filter_hook
}
family bridge hook postrouting {
+0000000300 ebt_nat_hook
}
This adds 'template 'tables' for ebtables.
Each ebtable_foo registers the table as a template, with an init function
that gets called once the first get/setsockopt call is made.
ebtables core then searches the (per netns) list of tables.
If no table is found, it searches the list of templates instead.
If a template entry exists, the init function is called which will
enable the table and register the hooks (so packets are diverted
to the table).
If no entry is found in the template list, request_module is called.
After this, hook registration is delayed until the 'ebtables'
(set/getsockopt) request is made for a given table and will only
happen in the specific namespace.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
clusterip is now handled via net_generic.
NOTRACK is tiny compared to rest of xt_CT feature set, even the existing
deprecation warning is bigger than the actual functionality.
Just remove the warning, its not worth keeping/adding a net_generic one.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Do not register the arp mangling hooks from pernet init path.
As-is, load of the module is enough for these hooks to become active
in each net namespace.
Use checkentry instead so hook is only added if a CLUSTERIP rule is used.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
TCP and UDP are built-in conntrack protocol trackers and the flowtable
only supports for TCP and UDP, remove this call.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Use nfnetlink_unicast() which already translates EAGAIN to ENOBUFS,
since EAGAIN is reserved to report missing module dependencies to the
nfnetlink core.
e0241ae6ac59 ("netfilter: use nfnetlink_unicast() forgot to update
this spot.
Reported-by: Yajun Deng <yajun.deng@linux.dev>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Leon Romanovsky says:
====================
Clean devlink net namespace operations
This short series continues my work on devlink core code to make devlink
reload less prone to errors and harden it from API abuse.
Despite first patch being a clear fix, I would ask you to apply it to
net-next anyway, because the fixed patch is anyway old and it will
help us to eliminate merge conflicts that will arise for following
patches or even for the second one.
====================
Link: https://lore.kernel.org/r/cover.1627578998.git.leonro@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
There is no need in extra call indirection and check from impossible
flow where someone tries to set namespace without prior call
to devlink_alloc().
Instead of this extra logic and additional EXPORT_SYMBOL, use specialized
devlink allocation function that receives net namespace as an argument.
Such specialized API allows clear view when devlink initialized in wrong
net namespace and/or kernel users don't try to change devlink namespace
under the hood.
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
If skb_dst_set_noref() is invoked with a NULL dst, the 'slow_gro'
field is cleared, too. That could lead to wrong behavior if
the skb later enters the GRO stage.
Fix the potential issue replacing preserving a non-zero value of
the 'slow_gro' field.
Additionally, fix a comment typo.
Reported-by: Sabrina Dubroca <sd@queasysnail.net>
Reported-by: Jakub Kicinski <kuba@kernel.org>
Fixes: 8a886b142bd0 ("sk_buff: track dst status in slow_gro")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/r/aa42529252dc8bb02bd42e8629427040d1058537.1627662501.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
No need for multiple spaces in variable declaration (the code does not
use them in other places). No functional change.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Several functions receive pointers to u8, sk_buff or other structs but
do not modify the contents so make them const. This allows doing the
same for local variables and in total makes the code a little bit safer.
This makes const also data passed as "unsigned long opt" argument to
nci_request() function. Usual flow for such functions is:
1. Receive "u8 *" and store it (the pointer) in a structure
allocated on stack (e.g. struct nci_set_config_param),
2. Call nci_request() or __nci_request() passing a callback function an
the pointer to the structure via an "unsigned long opt",
3. nci_request() calls the callback which dereferences "unsigned long
opt" in a read-only way.
This converts all above paths to use proper pointer to const data, so
entire flow is safer.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Few pointers to struct nfc_target and struct nfc_se can be made const.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Several functions receive pointers to u8, char or sk_buff but do not
modify the contents so make them const. This allows doing the same for
local variables and in total makes the code a little bit safer.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The nfc_llc_init() is used only in other __init annotated context.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The af_nfc_exit() is used only in other __exit annotated context
(nfc_exit()).
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The device_node in nfcmrvl_spi_parse_dt() cannot be const as it is
passed to OF functions which modify it.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
refcount_t type should be used instead of int when fib_treeref is used as
a reference counter,and avoid use-after-free risks.
Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20210729071350.28919-1-yajun.deng@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In the function bcm_enetsw_probe(), 'ret' will be assigned by
bcm_enet_change_mtu(), so 'ret = 0' make no sense.
Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
Signed-off-by: Tang Bin <tangbin@cmss.chinamobile.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
DSA has gained the recent ability to deal gracefully with upper
interfaces it cannot offload, such as the bridge, bonding or team
drivers. When such uppers exist, the ports are still in standalone mode
as far as the hardware is concerned.
But when we deliver packets to the software bridge in order for that to
do the forwarding, there is an unpleasant surprise in that the bridge
will refuse to forward them. This is because we unconditionally set
skb->offload_fwd_mark = true, meaning that the bridge thinks the frames
were already forwarded in hardware by us.
Since dp->bridge_dev is populated only when there is hardware offload
for it, but not in the software fallback case, let's introduce a new
helper that can be called from the tagger data path which sets the
skb->offload_fwd_mark accordingly to zero when there is no hardware
offload for bridging. This lets the bridge forward packets back to other
interfaces of our switch, if needed.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Tobias Waldekranz <tobias@waldekranz.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When an ipvlan device is created on a bond device, the link state
of the ipvlan device may be abnormal. This is because bonding device
allows to add physical network card device in the down state and so
NETDEV_CHANGE event will not be notified to other listeners, so ipvlan
has no chance to update its link status.
The following steps can cause such problems:
1) bond0 is down
2) ip link add link bond0 name ipvlan type ipvlan mode l2
3) echo +enp2s7 >/sys/class/net/bond0/bonding/slaves
4) ip link set bond0 up
After these steps, use ip link command, we found ipvlan has NO-CARRIER:
ipvlan@bond0: <NO-CARRIER, BROADCAST,MULTICAST,UP,M-DOWN> mtu ...>
We can deal with this problem like VLAN: Add handling of NETDEV_UP
events. If we receive NETDEV_UP event, we will update the link status
of the ipvlan.
Signed-off-by: Di Zhu <zhudi21@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
currently, only 'ingress' and 'clsact ingress' qdiscs store the tc 'chain
id' in the skb extension. However, userspace programs (like ovs) are able
to setup egress rules, and datapath gets confused in case it doesn't find
the 'chain id' for a packet that's "recirculated" by tc.
Change tcf_classify() to have the same semantic as tcf_classify_ingress()
so that a single function can be called in ingress / egress, using the tc
ingress / egress block respectively.
Suggested-by: Alaa Hleilel <alaa@nvidia.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ioana Ciornei says:
====================
dpaa2-switch: add mirroring support
This patch set adds per port and per VLAN mirroring in dpaa2-switch.
The first 4 patches are just cosmetic changes. We renamed the
dpaa2_switch_acl_tbl structure into dpaa2_switch_filter_block so that we
can reuse it for filters that do not use the ACL table and reorganized
the addition of trap, redirect and drop filters into a separate
function. All this just to make for a more streamlined addition of the
support for mirroring.
The next 4 patches are actually adding the advertised support. Mirroring
rules can be added in shared blocks, the driver will replicate the same
configuration on all the switch ports part of the same block.
The last patch documents the feature, presents its behavior and
limitations and gives a couple of examples.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Document the mirroring capabilities of the dpaa2-switch driver,
any restrictions that are imposed and some example commands.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When mirroring rules are added in shared filter blocks, the same
mirroring rule has to be configured on all the switch ports that are
part of the same block.
In case a switch port joins a shared block after mirroring filters have
been already added to it, then all the mirror rules should be offloaded
to the port. The reverse, removal of mirroring rules, has to be done at
block unbind.
For this purpose, the dpaa2_switch_block_offload_mirror() and
dpaa2_switch_block_unoffload_mirror() functions are added and called
upon binding and unbinding a switch port to/from a block.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Using the infrastructure added in the previous patch, extend tc-flower
support with FLOW_ACTION_MIRRED based on VLAN.
Tested with:
tc qdisc add dev eth8 ingress_block 1 clsact
tc filter add block 1 ingress protocol 802.1q flower skip_sw \
vlan_id 100 action mirred egress mirror dev eth6
tc filter del block 1 ingress pref 49152
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for per port mirroring for the DPAA2 switch. We support
only single mirror port, therefore we allow mirroring rules only as long
as the destination port is always the same.
Unlike all the actions (drop, redirect, trap) already supported by the
dpaa2-switch driver, adding mirroring filters in shared blocks is not
achieved by a singular ACL entry added in a table shared by the ports.
This is why, when a new mirror filter is added in a block we have to got
through all the switch ports sharing it and configure the filter
individually on all.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the necessary MC API for setting up and configuring the mirroring
feature on the DPSW DPAA2 object.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Extract the necessary steps to offload a filter by using the ACL table
in a separate function - dpaa2_switch_cls_matchall_replace_acl().
This is intended to help with the code readability when the mirroring
support is added.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Extract the necessary steps to offload a filter by using the ACL table
in a separate function - dpaa2_switch_cls_flower_replace_acl().
This is intended to help with the code readability when the mirroring
support is added.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Until now, shared filter blocks were implemented only by ACL tables
shared between ports. Going forward, when the mirroring support will be
added, this will not be true anymore.
Rename the dpaa2_switch_acl_tbl into dpaa2_switch_filter_block so that
we make it clear that the structure is used not only for filters that
use the ACL table but will be used for all the filters that are added in
a block.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Until now, the dpaa2_switch_tc_parse_action() function was used for all
the supported tc actions since all of them were implemented by adding
ACL table entries. In the next commits, the dpaa2-switch driver will
gain mirroring support which is not using the same HW feature.
Make sure that we specify the ACL in the function name so that we make
it clear that it's only used for specific actions.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Removing the qede module version which is not needed and not allowed
with inbox drivers.
Signed-off-by: Prabhakar Kushwaha <pkushwaha@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: Shai Malin <smalin@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Removing the qed module version which is not needed and not allowed
with inbox drivers.
Signed-off-by: Prabhakar Kushwaha <pkushwaha@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: Shai Malin <smalin@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean says:
====================
NXP SJA1105 VLAN regressions
These are 3 patches to fix issues seen with some more varied testing
done after the changes in the "Traffic termination for sja1105 ports
under VLAN-aware bridge" series were made:
https://patchwork.kernel.org/project/netdevbpf/cover/20210726165536.1338471-1-vladimir.oltean@nxp.com/
Issue 1: traffic no longer works on a port after leaving a VLAN-aware bridge
Issue 2: untagged traffic not dropped if pvid is absent from a VLAN-aware port
Issue 3: PTP and STP broken on ports under a VLAN-aware bridge
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
On RX, a control packet with SJA1110 will have:
- an in-band control extension (DSA tag) composed of a header and an
optional trailer (if it is a timestamp frame). We can (and do) deduce
the source port and switch id from this.
- a VLAN header, which can either be the tag_8021q RX VLAN (pvid) or the
bridge VLAN. The sja1105_vlan_rcv() function attempts to deduce the
source port and switch id a second time from this.
The basic idea is that even though we don't need the source port
information from the tag_8021q header if it's a control packet, we do
need to strip that header before we pass it on to the network stack.
The problem is that we call sja1105_vlan_rcv for ports under VLAN-aware
bridges, and that function tells us it couldn't identify a tag_8021q
header, so we need to perform imprecise RX by VID. Well, we don't,
because we already know the source port and switch ID.
This patch drops the return value from sja1105_vlan_rcv and we just look
at the source_port and switch_id values from sja1105_rcv and sja1110_rcv
which were initialized to -1. If they are still -1 it means we need to
perform imprecise RX.
Fixes: 884be12f8566 ("net: dsa: sja1105: add support for imprecise RX")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Surprisingly, this configuration:
ip link add br0 type bridge vlan_filtering 1
ip link set swp2 master br0
bridge vlan del dev swp2 vid 1
still has the sja1105 switch sending untagged packets to the CPU (and
failing to decode them, since dsa_find_designated_bridge_port_by_vid
searches by VID 1 and rightfully finds no bridge VLAN 1 on a port).
Dumping the switch configuration, the VLANs are managed properly:
- the pvid of swp2 is 1 in the MAC Configuration Table, but
- only the CPU port is in the port membership of VLANID 1 in the VLAN
Lookup Table
When the ingress packets are tagged with VID 1, they are properly
dropped. But when they are untagged, they are able to reach the CPU
port. Also, when the pvid in the MAC Configuration Table is changed to
e.g. 55 (an unused VLAN), the untagged packets are also dropped.
So it looks like:
- the switch bypasses ingress VLAN membership checks for untagged traffic
- the reason why the untagged traffic is dropped when I make the pvid 55
is due to the lack of valid destination ports in VLAN 55, rather than
an ingress membership violation
- the ingress VLAN membership cheks are only done for VLAN-tagged traffic
Interesting. It looks like there is an explicit bit to drop untagged
traffic, so we should probably be using that to preserve user expectations.
Note that only VLAN-aware ports should drop untagged packets due to no
pvid - when VLAN-unaware, the software bridge doesn't do this even if
there is no pvid on any bridge port and on the bridge itself. So the new
sja1105_drop_untagged() function cannot simply be called with "false"
from sja1105_bridge_vlan_add() and with "true" from sja1105_bridge_vlan_del.
Instead, we need to also consider the VLAN awareness state. That means
we need to hook the "drop untagged" setting in all the same places where
the "commit pvid" logic is, and it needs to factor in all the state when
flipping the "drop untagged" bit: is our current pvid in the VLAN Lookup
Table, and is the current port in that VLAN's port membership list?
VLAN-unaware ports will never drop untagged frames because these checks
always succeed by construction, and the tag_8021q VLANs cannot be changed
by the user.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that we no longer have the ultra-central sja1105_build_vlan_table(),
we need to be more careful about checking all corner cases manually.
For example, when a port leaves a VLAN-aware bridge, it becomes
standalone so its pvid should become a tag_8021q RX VLAN again. However,
sja1105_commit_pvid() only gets called from sja1105_bridge_vlan_add()
and from sja1105_vlan_filtering(), and no VLAN awareness change takes
place (VLAN filtering is a global setting for sja1105, so the switch
remains VLAN-aware overall).
This means that we need to put another sja1105_commit_pvid() call in
sja1105_bridge_member().
Fixes: 6dfd23d35e75 ("net: dsa: sja1105: delete vlan delta save/restore logic")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeremy Kerr says:
====================
Add Management Component Transport Protocol support
This series adds core MCTP support to the kernel. From the Kconfig
description:
Management Component Transport Protocol (MCTP) is an in-system
protocol for communicating between management controllers and
their managed devices (peripherals, host processors, etc.). The
protocol is defined by DMTF specification DSP0236.
This option enables core MCTP support. For communicating with other
devices, you'll want to enable a driver for a specific hardware
channel.
This implementation allows a sockets-based API for sending and receiving
MCTP messages via sendmsg/recvmsg on SOCK_DGRAM sockets. Kernel stack
control is all via netlink, using existing RTM_* messages. The userspace
ABI change is fairly small; just the necessary AF_/ETH_P_/ARPHDR_
constants, a new sockaddr, and a new netlink attribute.
For MAINTAINERS, I've just included netdev@ as the list entry. I'm happy
to alter this based on preferences here - an alternative would be the
OpenBMC list (the main user of the MCTP interface), or we can create a
new list entirely.
We have a couple of interface drivers almost ready to go at the moment,
but those can wait until the core code has some review.
This is v4 of the series; v1 and v2 were both RFC.
selinux folks: CCing 01/15 due to the new PF_MCTP protocol family.
linux-doc folks: CCing 15/15 for the new MCTP overview document.
Review, comments, questions etc. are most welcome.
Cheers,
Jeremy
v2:
- change to match spec terminology: controller -> component
- require specific capabilities for bind() & sendmsg()
- add address and tag defintions to uapi
- add selinux AF_MCTP table definitions
- remove strict cflags; warnings are present in common headers
v3:
- require caps for MCTP bind() & send()
- comment typo fixes
- switch to an array for local EIDs
- fix addrinfo dump iteration & error path
- add RTM_DELADDR
- remove GENMASK() and BIT() from uapi
v4:
- drop tun patch; that can be submitted separately
- keep nipa happy: add maintainer CCs, including doc and selinux
- net-next rebase
- Include AF_MCTP in af_family_slock_keys and pf_family_names
- Introduce MODULE_ definitions earlier
- upstream change: set_link_af no longer called with RTNL held
- add kdoc for net_device.mctp_ptr
- don't inline mctp_rt_match_eid
- require rtm_type == RTN_UNICAST in route management handlers
- remove unused RTAX policy table
- fix mctp_sock->keys rcu annotations
- fix spurious rcu_read_unlock in route input
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This change adds a brief document about the sockets API provided for
sending and receiving MCTP messages from userspace.
This is roughly based on the OpenBMC design document, at:
https://github.com/openbmc/docs/blob/master/designs/mctp/mctp-kernel.md
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently we have a compile-time default network
(MCTP_INITIAL_DEFAULT_NET). This change introduces a default_net field
on the net namespace, allowing future configuration for new interfaces.
Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that we have a neighbour implementation, hook it up to the output
path to set the dest hardware address for outgoing packets.
Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
This change implements MCTP fragmentation (based on route & device MTU),
and corresponding reassembly.
The MCTP specification only allows for fragmentation on the originating
message endpoint, and reassembly on the destination endpoint -
intermediate nodes do not need to reassemble/refragment. Consequently,
we only fragment in the local transmit path, and reassemble
locally-bound packets. Messages are required to be in-order, so we
simply cancel reassembly on out-of-order or missing packets.
In the fragmentation path, we just break up the message into MTU-sized
fragments; the skb structure is a simple copy for now, which we can later
improve with a shared data implementation.
For reassembly, we keep track of incoming message fragments using the
existing tag infrastructure, allocating a key on the (src,dest,tag)
tuple, and reassembles matching fragments into a skb->frag_list.
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Start filling-out the socket syscalls: bind, sendmsg & recvmsg.
This requires an input route implementation, so we add to
mctp_route_input, allowing lookups on binds & message tags. This just
handles single-packet messages at present, we will add fragmentation in
a future change.
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
This change adds the netlink interfaces for manipulating the MCTP
neighbour table.
Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add an initial neighbour table implementation, to be used in the route
output path.
Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>