12598 Commits

Author SHA1 Message Date
Eli Britstein
6663cf821c flow_offload: Fix flow action infrastructure
Implementation of macro "flow_action_for_each" introduced in
commit e3ab786b42535 ("flow_offload: add flow action infrastructure")
and used in commit 738678817573c ("drivers: net: use flow action
infrastructure") iterated the first item twice and did not reach the
last one. Fix it.

Fixes: e3ab786b42535 ("flow_offload: add flow action infrastructure")
Fixes: 738678817573c ("drivers: net: use flow action infrastructure")
Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-11 21:30:14 -08:00
Jakub Kicinski
14fd1901e7 devlink: add a generic board.manufacture version name
At Jiri's suggestion add a generic "board.manufacture"
version identifier.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-11 20:39:56 -08:00
Johannes Berg
5d4071abd9 cfg80211: fix and clean up cfg80211_gen_new_bssid()
Fix cfg80211_gen_new_bssid() to not rely on u64 modulo arithmetic,
which isn't needed since we really just want to mask there. Also,
clean it up to calculate the mask only once and use GENMASK_ULL()
instead of open-coding the mask calculation.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-02-11 16:54:08 +01:00
Eli Cohen
eddd2cf195 net: Change TCA_ACT_* to TCA_ID_* to match that of TCA_ID_POLICE
Modify the kernel users of the TCA_ACT_* macros to use TCA_ID_*. For
example, use TCA_ID_GACT instead of TCA_ACT_GACT. This will align with
TCA_ID_POLICE and also differentiates these identifier, used in struct
tc_action_ops type field, from other macros starting with TCA_ACT_.

To make things clearer, we name the enum defining the TCA_ID_*
identifiers and also change the "type" field of struct tc_action to
id.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-10 09:28:43 -08:00
Lorenzo Bianconi
c09551c6ff net: ipv4: use a dedicated counter for icmp_v4 redirect packets
According to the algorithm described in the comment block at the
beginning of ip_rt_send_redirect, the host should try to send
'ip_rt_redirect_number' ICMP redirect packets with an exponential
backoff and then stop sending them at all assuming that the destination
ignores redirects.
If the device has previously sent some ICMP error packets that are
rate-limited (e.g TTL expired) and continues to receive traffic,
the redirect packets will never be transmitted. This happens since
peer->rate_tokens will be typically greater than 'ip_rt_redirect_number'
and so it will never be reset even if the redirect silence timeout
(ip_rt_redirect_silence) has elapsed without receiving any packet
requiring redirects.

Fix it by using a dedicated counter for the number of ICMP redirect
packets that has been sent by the host

I have not been able to identify a given commit that introduced the
issue since ip_rt_send_redirect implements the same rate-limiting
algorithm from commit 1da177e4c3f4 ("Linux-2.6.12-rc2")

Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-08 21:50:15 -08:00
Jiri Pirko
7c62cfb8c5 devlink: publish params only after driver init is done
Currently, user can do dump or get of param values right after the
devlink params are registered. However the driver may not be initialized
which is an issue. The same problem happens during notification
upon param registration. Allow driver to publish devlink params
whenever it is ready to handle get() ops. Note that this cannot
be resolved by init reordering, as the "driverinit" params have
to be available before the driver is initialized (it needs the param
values there).

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Cc: Michael Chan <michael.chan@broadcom.com>
Cc: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-08 15:02:49 -08:00
David S. Miller
a655fe9f19 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
An ipvlan bug fix in 'net' conflicted with the abstraction away
of the IPV6 specific support in 'net-next'.

Similarly, a bug fix for mlx5 in 'net' conflicted with the flow
action conversion in 'net-next'.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-08 15:00:17 -08:00
Johannes Berg
851ae31d34 cfg80211: add missing kernel-doc for multi-BSSID fields
Add the missing kernel-doc for the new multi-BSSID fields
in struct cfg80211_bss.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-02-08 14:12:25 +01:00
Sara Sharon
caf56338c2 mac80211: indicate support for multiple BSSID
Set multi-bssid support flags according to driver support.

Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-02-08 13:56:37 +01:00
Sara Sharon
78ac51f815 mac80211: support multi-bssid
Add support for multi-bssid.

This includes:
- Parsing multi-bssid element
- Overriding DTIM values
- Taking into account in various places the inner BSSID instead of
  transmitter BSSID
- Save aside some multi-bssid properties needed by drivers

Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-02-08 13:56:37 +01:00
Sara Sharon
0cd01efb03 cfg80211: save multi-bssid properties
When the new IEs are generated, the multiple BSSID elements
are not saved. Save aside properties that are needed later
for PS.

Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-02-08 13:51:50 +01:00
Sara Sharon
7ece9c372b cfg80211: make BSSID generation function inline
This will enable reuse by mac80211.

Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-02-08 13:51:50 +01:00
Sara Sharon
213ed579d3 cfg80211: parse multi-bssid only if HW supports it
Parsing and exposing nontransmitted APs is problematic
when underlying HW doesn't support it. Do it only if
driver indicated support. Allow HE restriction as well,
since the HE spec defined the exact manner that Multiple
BSSID set should behave. APs that not support the HE
spec will have less predictable Multiple BSSID set
support/behavior

Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-02-08 13:51:50 +01:00
Sara Sharon
7011ba583f cfg80211: Move Multiple BSS info to struct cfg80211_bss to be visible
Previously the transmitted BSS and the non-trasmitted BSS list were
defined in struct cfg80211_internal_bss. Move them to struct cfg80211_bss
since mac80211 needs this info.

Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-02-08 13:51:50 +01:00
Johannes Berg
49a68e0d88 cfg80211: add various struct element finding helpers
We currently have a number of helpers to find elements that just
return a u8 *, change those to return a struct element and add
inlines to deal with the u8 * compatibility.

Note that the match behaviour is changed to start the natch at
the data, so conversion from _ie_match to _elem_match need to
be done carefully.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-02-08 13:51:50 +01:00
Eran Ben Elisha
c8e1da0bf9 devlink: Add health report functionality
Upon error discover, every driver can report it to the devlink health
mechanism via devlink_health_report function, using the appropriate
reporter registered to it. Driver can pass error specific context which
will be delivered to it as part of the dump / recovery callbacks.

Once an error is reported, devlink health will do the following actions:
* A log is being send to the kernel trace events buffer
* Health status and statistics are being updated for the reporter instance
* Object dump is being taken and stored at the reporter instance (as long
  as there is no other dump which is already stored)
* Auto recovery attempt is being done. Depends on:
  - Auto Recovery configuration
  - Grace period vs. Time since last recover

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-07 10:34:28 -08:00
Eran Ben Elisha
a0bdcc59d1 devlink: Add health reporter create/destroy functionality
Devlink health reporter is an instance for reporting, diagnosing and
recovering from run time errors discovered by the reporters.
Define it's data structure and supported operations.
In addition, expose devlink API to create and destroy a reporter.
Each devlink instance will hold it's own reporters list.

As part of the allocation, driver shall provide a set of callbacks which
will be used by devlink in order to handle health reports and user
commands related to this reporter. In addition, driver is entitled to
provide some priv pointer, which can be fetched from the reporter by
devlink_health_reporter_priv function.

For each reporter, devlink will hold a metadata of statistics,
dump msg and status.

For passing dumps and diagnose data to the user-space, it will use devlink
fmsg API.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-07 10:34:28 -08:00
Eran Ben Elisha
1db64e8733 devlink: Add devlink formatted message (fmsg) API
Devlink fmsg is a mechanism to pass descriptors between drivers and
devlink, in json-like format. The API allows the driver to add nested
attributes such as object, object pair and value array, in addition to
attributes such as name and value.

Driver can use this API to fill the fmsg context in a format which will be
translated by the devlink to the netlink message later.
There is no memory allocation in advance (other than the initial list
head), and it dynamically allocates messages descriptors and add them to
the list on the fly.

When it needs to send the data using SKBs to the netlink layer, it
fragments the data between different SKBs. In order to do this
fragmentation, it uses virtual nests attributes, to avoid actual
nesting use which cannot be divided between different SKBs.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-07 10:34:28 -08:00
Kalle Valo
3479f74ee4 Third batch of iwlwifi patches intended for v5.1
* Work on the new debugging infrastructure continues;
 * HE radiotap;
 * Support for new FW version 44;
 * A couple of new FW API changes;
 * A bunch of fixes for static analyzer reported issues;
 * General bugfixes;
 * Other cleanups and small fixes;
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEF3LNfgb2BPWm68smoUecoho8xfoFAlxYFbcACgkQoUecoho8
 xfoFhg//eJLoosJx5BIX7vJ0b4uUJ7gjTj67qMja7RBVUXxMfYcn7Yrztlenm+H7
 yIsZe7I0Jap88WH3HKYU/G6ASFiyXZo6TrUt4rzY3Xuy3SIgSG5gmnt4XcQRSSBd
 mYp+hjmz5PJPx2lzRGccQ167oOQZ/DHLn7JwuuLmgtLfz4RMHpUtitOQf9WGlKx4
 nure2JLFZ4yV+lng6XBPma/lelgi9q8L8bu7izOhJkh0saSDlSWQUcn5kWoWG5av
 syQsrxb3FH6wfZijZXW4USolpThgCcXxTzd0IFPPXIyx/z6PEZK0yBNvGeiiPxm9
 bWT1fJGrSea+82qY2vTVE1NLKd46S8jATSxSawqwGFQRv7EeLW1IdDzpFlbqU+Rn
 1dUaOHzIUtK6MdCzNco4rcYZlvmFnMlqROQexnCp/sNp1J+eOfL1aqog3wRZc7sN
 IAtZJbwMcLC1YHunMhFOCs1+imXVrhBBN8tEJGoQHtBvIQpTumX6VFGeGBmXiyVX
 FLJLF0aeYg970OMcwqNuMzNIcSk13uhw3F/M/jIIN8IyK3L3bX6kiaUevUMgYK3T
 +3zPj41hJWGFelLa7FNryqt1S4Vbi12EubuK1gv1QcpvXro9OsUEXzeJw7PWEPj0
 r3JBiIIkfpFul88F6a3163dNQoo03k4Nd69dNww/RdD7+GGuZWU=
 =ckmd
 -----END PGP SIGNATURE-----

Merge tag 'iwlwifi-next-for-kalle-2019-02-04' of git://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-next

Third batch of iwlwifi patches intended for v5.1

* Work on the new debugging infrastructure continues;
* HE radiotap;
* Support for new FW version 44;
* A couple of new FW API changes;
* A bunch of fixes for static analyzer reported issues;
* General bugfixes;
* Other cleanups and small fixes;
2019-02-07 11:34:26 +02:00
Florian Fainelli
bccb30254a net: Get rid of SWITCHDEV_ATTR_ID_PORT_PARENT_ID
Now that we have a dedicated NDO for getting a port's parent ID, get rid
of SWITCHDEV_ATTR_ID_PORT_PARENT_ID and convert all callers to use the
NDO exclusively. This is a preliminary change to getting rid of
switchdev_ops eventually.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06 14:17:03 -08:00
Pablo Neira Ayuso
8bec2833fb flow_offload: add wake-up-on-lan and queue to flow_action
These actions need to be added to support the ethtool_rx_flow interface.
The queue action includes a field to specify the RSS context, that is
set via FLOW_RSS flow type flag and the rss_context field in struct
ethtool_rxnfc, plus the corresponding queue index. FLOW_RSS implies that
rss_context is non-zero, therefore, queue.ctx == 0 means that FLOW_RSS
was not set. Also add a field to store the vf index which is stored in
the ethtool_rxnfc ring_cookie field.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06 10:38:26 -08:00
Pablo Neira Ayuso
2cd173e6d5 cls_flower: don't expose TC actions to drivers anymore
Now that drivers have been converted to use the flow action
infrastructure, remove this field from the tc_cls_flower_offload
structure.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06 10:38:26 -08:00
Pablo Neira Ayuso
3b1903ef97 flow_offload: add statistics retrieval infrastructure and use it
This patch provides the flow_stats structure that acts as container for
tc_cls_flower_offload, then we can use to restore the statistics on the
existing TC actions. Hence, tcf_exts_stats_update() is not used from
drivers anymore.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06 10:38:25 -08:00
Pablo Neira Ayuso
3a7b68617d cls_api: add translator to flow_action representation
This patch implements a new function to translate from native TC action
to the new flow_action representation. Moreover, this patch also updates
cls_flower to use this new function.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06 10:38:25 -08:00
Pablo Neira Ayuso
e3ab786b42 flow_offload: add flow action infrastructure
This new infrastructure defines the nic actions that you can perform
from existing network drivers. This infrastructure allows us to avoid a
direct dependency with the native software TC action representation.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06 10:38:25 -08:00
Pablo Neira Ayuso
8f2566225a flow_offload: add flow_rule and flow_match structures and use them
This patch wraps the dissector key and mask - that flower uses to
represent the matching side - around the flow_match structure.

To avoid a follow up patch that would edit the same LoCs in the drivers,
this patch also wraps this new flow match structure around the flow rule
object. This new structure will also contain the flow actions in follow
up patches.

This introduces two new interfaces:

	bool flow_rule_match_key(rule, dissector_id)

that returns true if a given matching key is set on, and:

	flow_rule_match_XYZ(rule, &match);

To fetch the matching side XYZ into the match container structure, to
retrieve the key and the mask with one single call.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-06 10:38:25 -08:00
Cong Wang
f75a2804da xfrm: destroy xfrm_state synchronously on net exit path
xfrm_state_put() moves struct xfrm_state to the GC list
and schedules the GC work to clean it up. On net exit call
path, xfrm_state_flush() is called to clean up and
xfrm_flush_gc() is called to wait for the GC work to complete
before exit.

However, this doesn't work because one of the ->destructor(),
ipcomp_destroy(), schedules the same GC work again inside
the GC work. It is hard to wait for such a nested async
callback. This is also why syzbot still reports the following
warning:

 WARNING: CPU: 1 PID: 33 at net/ipv6/xfrm6_tunnel.c:351 xfrm6_tunnel_net_exit+0x2cb/0x500 net/ipv6/xfrm6_tunnel.c:351
 ...
  ops_exit_list.isra.0+0xb0/0x160 net/core/net_namespace.c:153
  cleanup_net+0x51d/0xb10 net/core/net_namespace.c:551
  process_one_work+0xd0c/0x1ce0 kernel/workqueue.c:2153
  worker_thread+0x143/0x14a0 kernel/workqueue.c:2296
  kthread+0x357/0x430 kernel/kthread.c:246
  ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352

In fact, it is perfectly fine to bypass GC and destroy xfrm_state
synchronously on net exit call path, because it is in process context
and doesn't need a work struct to do any blocking work.

This patch introduces xfrm_state_put_sync() which simply bypasses
GC, and lets its callers to decide whether to use this synchronous
version. On net exit path, xfrm_state_fini() and
xfrm6_tunnel_net_exit() use it. And, as ipcomp_destroy() itself is
blocking, it can use xfrm_state_put_sync() directly too.

Also rename xfrm_state_gc_destroy() to ___xfrm_state_destroy() to
reflect this change.

Fixes: b48c05ab5d32 ("xfrm: Fix warning in xfrm6_tunnel_net_exit.")
Reported-and-tested-by: syzbot+e9aebef558e3ed673934@syzkaller.appspotmail.com
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2019-02-05 06:29:20 +01:00
Pablo Neira Ayuso
f6ac858589 netfilter: nf_tables: unbind set in rule from commit path
Anonymous sets that are bound to rules from the same transaction trigger
a kernel splat from the abort path due to double set list removal and
double free.

This patch updates the logic to search for the transaction that is
responsible for creating the set and disable the set list removal and
release, given the rule is now responsible for this. Lookup is reverse
since the transaction that adds the set is likely to be at the tail of
the list.

Moreover, this patch adds the unbind step to deliver the event from the
commit path.  This should not be done from the worker thread, since we
have no guarantees of in-order delivery to the listener.

This patch removes the assumption that both activate and deactivate
callbacks need to be provided.

Fixes: cd5125d8f518 ("netfilter: nf_tables: split set destruction in deactivate and destroy phase")
Reported-by: Mikhail Morfikov <mmorfikov@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-02-04 17:29:17 +01:00
Johannes Berg
74cf15cb69 iwlwifi: mvm: add HE TB PPDU SIG-A BW to radiotap
Expose the trigger-based PPDU SIG-A bandwidth to radiotap in
the newly defined bits thereof.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
2019-02-04 12:28:06 +02:00
Jakub Kicinski
bff5731d43 net: devlink: report cell size of shared buffers
Shared buffer allocation is usually done in cell increments.
Drivers will either round up the allocation or refuse the
configuration if it's not an exact multiple of cell size.
Drivers know exactly the cell size of shared buffer, so help
out users by providing this information in dumps.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-03 11:25:34 -08:00
Deepa Dinamani
887feae36a socket: Add SO_TIMESTAMP[NS]_NEW
Add SO_TIMESTAMP_NEW and SO_TIMESTAMPNS_NEW variants of
socket timestamp options.
These are the y2038 safe versions of the SO_TIMESTAMP_OLD
and SO_TIMESTAMPNS_OLD for all architectures.

Note that the format of scm_timestamping.ts[0] is not changed
in this patch.

Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Cc: jejb@parisc-linux.org
Cc: ralf@linux-mips.org
Cc: rth@twiddle.net
Cc: linux-alpha@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: linux-parisc@vger.kernel.org
Cc: linux-rdma@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: sparclinux@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-03 11:17:31 -08:00
Jakub Kicinski
ddb6e99e2d ethtool: add compat for devlink info
If driver did not fill the fw_version field, try to call into
the new devlink get_info op and collect the versions that way.
We assume ethtool was always reporting running versions.

v4:
 - use IS_REACHABLE() to avoid problems with DEVLINK=m (kbuildbot).
v3 (Jiri):
 - do a dump and then parse it instead of special handling;
 - concatenate all versions (well, all that fit :)).

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-01 15:30:31 -08:00
Jakub Kicinski
785bd550c4 devlink: add generic info version names
Add defines and docs for generic info versions.

v3:
 - add docs;
 - separate patch (Jiri).

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-01 15:30:30 -08:00
Jakub Kicinski
fc6fae7dd9 devlink: add version reporting to devlink info API
ethtool -i has a few fixed-size fields which can be used to report
firmware version and expansion ROM version. Unfortunately, modern
hardware has more firmware components. There is usually some
datapath microcode, management controller, PXE drivers, and a
CPLD load. Running ethtool -i on modern controllers reveals the
fact that vendors cram multiple values into firmware version field.

Here are some examples from systems I could lay my hands on quickly:

tg3:  "FFV20.2.17 bc 5720-v1.39"
i40e: "6.01 0x800034a4 1.1747.0"
nfp:  "0.0.3.5 0.25 sriov-2.1.16 nic"

Add a new devlink API to allow retrieving multiple versions, and
provide user-readable name for those versions.

While at it break down the versions into three categories:
 - fixed - this is the board/fixed component version, usually vendors
           report information like the board version in the PCI VPD,
           but it will benefit from naming and common API as well;
 - running - this is the running firmware version;
 - stored - this is firmware in the flash, after firmware update
            this value will reflect the flashed version, while the
            running version may only be updated after reboot.

v3:
 - add per-type helpers instead of using the special argument (Jiri).
RFCv2:
 - remove the nesting in attr DEVLINK_ATTR_INFO_VERSIONS (now
   versions are mixed with other info attrs)l
 - have the driver report versions from the same callback as
   other info.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-01 15:30:30 -08:00
Jakub Kicinski
f9cf22882c devlink: add device information API
ethtool -i has served us well for a long time, but its showing
its limitations more and more. The device information should
also be reported per device not per-netdev.

Lay foundation for a simple devlink-based way of reading device
info. Add driver name and device serial number as initial pieces
of information exposed via this new API.

v3:
 - rename helpers (Jiri);
 - rename driver name attr (Jiri);
 - remove double spacing in commit message (Jiri).
RFC v2:
 - wrap the skb into an opaque structure (Jiri);
 - allow the serial number of be any length (Jiri & Andrew);
 - add driver name (Jonathan).

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-01 15:30:30 -08:00
Dave Watson
5b053e121f net: tls: Set async_capable for tls zerocopy only if we see EINPROGRESS
Currently we don't zerocopy if the crypto framework async bit is set.
However some crypto algorithms (such as x86 AESNI) support async,
but in the context of sendmsg, will never run asynchronously.  Instead,
check for actual EINPROGRESS return code before assuming algorithm is
async.

Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-01 15:05:07 -08:00
Dave Watson
130b392c6c net: tls: Add tls 1.3 support
TLS 1.3 has minor changes from TLS 1.2 at the record layer.

* Header now hardcodes the same version and application content type in
  the header.
* The real content type is appended after the data, before encryption (or
  after decryption).
* The IV is xored with the sequence number, instead of concatinating four
  bytes of IV with the explicit IV.
* Zero-padding:  No exlicit length is given, we search backwards from the
  end of the decrypted data for the first non-zero byte, which is the
  content type.  Currently recv supports reading zero-padding, but there
  is no way for send to add zero padding.

Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-01 15:00:55 -08:00
Dave Watson
a2ef9b6a22 net: tls: Refactor tls aad space size calculation
TLS 1.3 has a different AAD size, use a variable in the code to
make TLS 1.3 support easy.

Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-01 15:00:55 -08:00
Dave Watson
fb99bce712 net: tls: Support 256 bit keys
Wire up support for 256 bit keys from the setsockopt to the crypto
framework

Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-01 15:00:55 -08:00
Johannes Berg
7d4194633b mac80211: fix missing/malformed documentation
Fix the missing and malformed documentation that kernel-doc and
sphinx warn about. While at it, also add some things to the docs
to fix missing links.

Sadly, the only way I could find to fix this was to add some
trailing whitespace.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-02-01 12:11:13 +01:00
Johannes Berg
9874b71fa1 cfg80211: add missing documentation that kernel-doc warns about
Add the missing documentation that kernel-doc continually warns
about, to get rid of all that noise.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-02-01 11:53:15 +01:00
Johannes Berg
23323289b1 netlink: reduce NLA_POLICY_NESTED{,_ARRAY} arguments
In typical cases, there's no need to pass both the maxattr
and the policy array pointer, as the maxattr should just be
ARRAY_SIZE(policy) - 1. Therefore, to be less error prone,
just remove the maxattr argument from the default macros
and deduce the size accordingly.

Leave the original macros with a leading underscore to use
here and in case somebody needs to pass a policy pointer
where the policy isn't declared in the same place and thus
ARRAY_SIZE() cannot be used.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-02-01 11:06:55 +01:00
Johannes Berg
752cfee90d Merge remote-tracking branch 'net-next/master' into mac80211-next
Merge net-next so that we get the changes from net, which would
otherwise conflict with the NLA_POLICY_NESTED/_ARRAY changes.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-02-01 11:05:35 +01:00
Matteo Croce
5ac4a12df5 cfg80211: fix typo
Fix spelling mistake in cfg80211.h: "lenght" -> "length".
The typo is also in the special comment block which
translates to documentation.

Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-02-01 11:05:07 +01:00
Toke Høiland-Jørgensen
cb86880ee4 mac80211: Fix documentation strings for airtime-related variables
There was a typo in the documentation for weight_multiplier in mac80211.h,
and the doc was missing entirely for airtime and airtime_weight in sta_info.h.

Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2019-02-01 11:04:53 +01:00
Daniel Borkmann
d5256083f6 ipvlan, l3mdev: fix broken l3s mode wrt local routes
While implementing ipvlan l3 and l3s mode for kubernetes CNI plugin,
I ran into the issue that while l3 mode is working fine, l3s mode
does not have any connectivity to kube-apiserver and hence all pods
end up in Error state as well. The ipvlan master device sits on
top of a bond device and hostns traffic to kube-apiserver (also running
in hostns) is DNATed from 10.152.183.1:443 to 139.178.29.207:37573
where the latter is the address of the bond0. While in l3 mode, a
curl to https://10.152.183.1:443 or to https://139.178.29.207:37573
works fine from hostns, neither of them do in case of l3s. In the
latter only a curl to https://127.0.0.1:37573 appeared to work where
for local addresses of bond0 I saw kernel suddenly starting to emit
ARP requests to query HW address of bond0 which remained unanswered
and neighbor entries in INCOMPLETE state. These ARP requests only
happen while in l3s.

Debugging this further, I found the issue is that l3s mode is piggy-
backing on l3 master device, and in this case local routes are using
l3mdev_master_dev_rcu(dev) instead of net->loopback_dev as per commit
f5a0aab84b74 ("net: ipv4: dst for local input routes should use l3mdev
if relevant") and 5f02ce24c269 ("net: l3mdev: Allow the l3mdev to be
a loopback"). I found that reverting them back into using the
net->loopback_dev fixed ipvlan l3s connectivity and got everything
working for the CNI.

Now judging from 4fbae7d83c98 ("ipvlan: Introduce l3s mode") and the
l3mdev paper in [0] the only sole reason why ipvlan l3s is relying
on l3 master device is to get the l3mdev_ip_rcv() receive hook for
setting the dst entry of the input route without adding its own
ipvlan specific hacks into the receive path, however, any l3 domain
semantics beyond just that are breaking l3s operation. Note that
ipvlan also has the ability to dynamically switch its internal
operation from l3 to l3s for all ports via ipvlan_set_port_mode()
at runtime. In any case, l3 vs l3s soley distinguishes itself by
'de-confusing' netfilter through switching skb->dev to ipvlan slave
device late in NF_INET_LOCAL_IN before handing the skb to L4.

Minimal fix taken here is to add a IFF_L3MDEV_RX_HANDLER flag which,
if set from ipvlan setup, gets us only the wanted l3mdev_l3_rcv() hook
without any additional l3mdev semantics on top. This should also have
minimal impact since dev->priv_flags is already hot in cache. With
this set, l3s mode is working fine and I also get things like
masquerading pod traffic on the ipvlan master properly working.

  [0] https://netdevconf.org/1.2/papers/ahern-what-is-l3mdev-paper.pdf

Fixes: f5a0aab84b74 ("net: ipv4: dst for local input routes should use l3mdev if relevant")
Fixes: 5f02ce24c269 ("net: l3mdev: Allow the l3mdev to be a loopback")
Fixes: 4fbae7d83c98 ("ipvlan: Introduce l3s mode")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Mahesh Bandewar <maheshb@google.com>
Cc: David Ahern <dsa@cumulusnetworks.com>
Cc: Florian Westphal <fw@strlen.de>
Cc: Martynas Pumputis <m@lambda.lt>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-30 22:13:34 -08:00
Xin Long
7efba10d6b sctp: add SCTP_FUTURE_ASOC and SCTP_CURRENT_ASSOC for SCTP_STREAM_SCHEDULER sockopt
Check with SCTP_ALL_ASSOC instead in sctp_setsockopt_scheduler and
check with SCTP_FUTURE_ASSOC instead in sctp_getsockopt_scheduler,
it's compatible with 0.

SCTP_CURRENT_ASSOC is supported for SCTP_STREAM_SCHEDULER in this
patch. It also adds default_ss in sctp_sock to support
SCTP_FUTURE_ASSOC.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-30 00:44:08 -08:00
Xin Long
8add543e36 sctp: add SCTP_FUTURE_ASSOC for SCTP_PEER_ADDR_THLDS sockopt
Check with SCTP_FUTURE_ASSOC instead in
sctp_set/getsockopt_paddr_thresholds, it's compatible with 0.

It also adds pf_retrans in sctp_sock to support SCTP_FUTURE_ASSOC.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-30 00:44:06 -08:00
Vasundhara Volam
b639583f9e devlink: Add a generic wake_on_lan port parameter
wake_on_lan - Enables Wake on Lan for this port. If enabled,
the controller asserts a wake pin based on the WOL type.

v2->v3:
- Define only WOL types used now and define them as bitfield, so that
  mutliple WOL types can be enabled upon power on.
- Modify "wake-on-lan" name to "wake_on_lan" to be symmetric with
  previous definitions.
- Rename DEVLINK_PARAM_WOL_XXX to DEVLINK_PARAM_WAKE_XXX to be
  symmetrical with ethtool WOL definitions.

Cc: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-29 22:13:09 -08:00
Vasundhara Volam
c1e5786d67 devlink: Add devlink notifications support for port params
Add notification call for devlink port param set, register and unregister
functions.
Add devlink_port_param_value_changed() function to enable the driver notify
devlink on value change. Driver should use this function after value was
changed on any configuration mode part to driverinit.

v7->v8:
Order devlink_port_param_value_changed() definitions followed by
devlink_param_value_changed()

Cc: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-29 22:13:09 -08:00