IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
In NETDEV_CHANGEUPPER event the upper_info field is valid
only when linking is true. Otherwise it should be ignored.
Fixes: 7907f23adc18 (net/mlx5: Implement RoCE LAG feature)
Signed-off-by: Talat Batheesh <talatb@mellanox.com>
Reviewed-by: Aviv Heller <avivh@mellanox.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz says:
====================
qed: load/unload mfw series
This series correct the unload flow and greatly enhances its
initialization flow in regard to interactions between driver
and management firmware.
Patch #1 makes sure unloading is done under management-firmware's
'criticial section' protection.
Patches #2 - #4 move driver into using a newer scheme for loading
in regard to the MFW; This newer scheme would help cleaning the device
in case a previous instance has dirtied it [preboot, PDA, etc.].
Patches #5 - #6 let driver inform management-firmware on number of
resources which are dependent on the non-management firmware used.
Patch #7 then uses a new resource [BDQ] instead of some set value.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Until now, qed used some port-defined value as BDQ index for both iSCSI
and FCoE.
As management firmware now treats BDQ as a resource and tells each PF
its BDQ-range, start using a valure from that range instead.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Management firmware is used as an arbiter between the various PFs
in matters of resources, but some of the resources that need to
be divided are dependent on the non-management firmware used,
so management firmware first needs to be told how many resources
there are before trying to divide them.
As part of the initialization sequence, driver would first inform
the management firmware of the available resources under
a dedicated resource lock, and afterwards request for various
resources which might be based on the previous set values.
Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Global locking can't properly be used to synchronize between different
PFs in all scenarios, as those instances might reside in different
logical partitions [e.g., when a PF is assigned via PDA to some VM].
The management firmware provides a generic infrastructure for
device locks. For each 'resource', it's guaranteed it could be acquired
by at most a single PF at any given time [or by management firmware].
This patch adds the necessary logic in qed for utilizing said
infrastructure, implementing lock/unlock internal APIs.
Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
During HW initialization, driver would set various registers to their
needed values - but it assumes all registers start at their reset-value,
so there's no need to re-configure a register's default value.
This assumption might be incorrect, e.g., in case of preboot driver
running and initializing the driver prior to our driver.
To overcome this, we now ask management firmware to initiate a PF-flr
early during the initialization sequence. That would return everything
in the PF's scope back to default and prevent previous configurations
from still being applied.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Management firmware is used as an arbiter between the various PFs
in regard to loading - it causes the various PFs to load/unload
sequentially and informs each of its appropriate rule in the init.
But the existing flow is too weak to handle some scenarios where
PFs aren't properly cleaned prior to loading.
The significant scenarios falling under this criteria:
a. Preboot drivers in some environment can't properly unload.
b. Unexpected driver replacement [kdump, PDA].
Modern management firmware supports a more intricate loading flow,
where the driver has the ability to overcome previous limitations.
This moves qed into using this newer scheme.
Notice new scheme is backward compatible, so new drivers would
still be able to load properly on top of older management firmwares
and vice versa.
Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We'll soon need additional information, so start by changing
the infrastructure to receive the initializing variables
via a parameter struct.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Management firmware is used as arbiter between different PFs
which are loading/unloading, but in order to use the synchronization
it offers the contending configurations need to be applied either
between their LOAD_REQ <-> LOAD_DONE or UNLOAD_REQ <-> UNLOAD_DONE
management firmware commands.
Existing HW stop flow utilizes 2 different functions: qed_hw_stop() and
qed_hw_reset() which don't abide this requirement; Most of the closure
is doing outside the scope of the unload request.
This patch removes qed_hw_reset() and places the relevant stop
functionality underneath the management firmware protection.
Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Parthasarathy Bhuvaragan says:
====================
tipc: subscription refcount simplifications
The first patch makes the subscription refcount cleanup lockless and
the second updates the subscription refcount policy.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When a new subscription object is inserted into name_seq->subscriptions
list, it's under name_seq->lock protection; when a subscription is
deleted from the list, it's also under the same lock protection;
similarly, when accessing a subscription by going through subscriptions
list, the entire process is also protected by the name_seq->lock.
Therefore, if subscription refcount is increased before it's inserted
into subscriptions list, and its refcount is decreased after it's
deleted from the list, it will be unnecessary to hold refcount at all
before accessing subscription object which is obtained by going through
subscriptions list under name_seq->lock protection.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After a subscription object is created, it's inserted into its
subscriber subscrp_list list under subscriber lock protection,
similarly, before it's destroyed, it should be first removed from
its subscriber->subscrp_list. Since the subscription list is
accessed with subscriber lock, all the subscriptions are valid
during the lock duration. Hence in tipc_subscrb_subscrp_delete(), we
remove subscription get/put and the extra subscriber unlock/lock.
After this change, the subscriptions refcount cleanup is very simple
and does not access any lock.
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
moxart_mac_start_xmit() doesn't care where tx_tail is, tx_head can
catch and pass tx_tail, which is bad because moxart_tx_finished()
isn't guaranteed to catch up on freeing resources from tx_tail.
Add a check in moxart_mac_start_xmit() stopping the queue at the
end of the circular buffer. Also add a check in moxart_tx_finished()
waking the queue if the buffer has TX_WAKE_THRESHOLD or more
free descriptors.
While we're at it, move spin_lock_irq() to happen before our
descriptor pointer is assigned in moxart_mac_start_xmit().
Addresses https://bugzilla.kernel.org/show_bug.cgi?id=99451
Signed-off-by: Jonas Jensen <jonas.jensen@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A driver must not access the two fields directly but should instead use
the helper functions to set the values and keep a consistent internal
state:
ethernet/stmicro/stmmac/stmmac_main.c: In function 'stmmac_dvr_probe':
ethernet/stmicro/stmmac/stmmac_main.c:4083:8: error: 'struct net_device' has no member named 'real_num_rx_queues'; did you mean 'real_num_tx_queues'?
Fixes: a8f5102af2a7 ("net: stmmac: TX and RX queue priority configuration")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
gcc-7 points out that the AVMB1_ADDCARD ioctl results in an unintialized
value ending up in the cardnr parameter:
drivers/isdn/capi/kcapi.c: In function 'old_capi_manufacturer':
drivers/isdn/capi/kcapi.c:1042:24: error: 'cdef.cardnr' may be used uninitialized in this function [-Werror=maybe-uninitialized]
cparams.cardnr = cdef.cardnr;
This has been broken since before the start of the git history, so
either the value is not used for anything important, or the ioctl
command doesn't get called in practice.
Setting the cardnr to zero avoids the warning and makes sure
we have consistent behavior.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
With the RPM driver transitioned to RPMSG we can reuse the SMD-RPM
driver ontop of GLINK for 8996, without any modifications.
Acked-by: Andy Gross <andy.gross@linaro.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove the standalone SMD implementation as we have transitioned the
client drivers to use the RPMSG based one.
Also remove all dependencies on QCOM_SMD from Kconfig files, in order to
keep them selectable in the absence of the removed symbol.
Acked-by: Andy Gross <andy.gross@linaro.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
By moving these client drivers to use RPMSG instead of the direct SMD
API we can reuse them ontop of the newly added GLINK wire-protocol
support found in the 820 and 835 Qualcomm platforms.
As the new (RPMSG-based) and old SMD implementations are mutually
exclusive we have to change all client drivers in one commit, to make
sure we have a working system before and after this transition.
Acked-by: Andy Gross <andy.gross@linaro.org>
Acked-by: Kalle Valo <kvalo@codeaurora.org>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
vxlan driver already implicitly supports installing
of external fdb entries with NTF_EXT_LEARNED. This
patch just makes sure these entries are not aged
by the vxlan driver. An external entity managing these
entries will age them out. This is consistent with
the use of NTF_EXT_LEARNED in the bridge driver.
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Laight noticed the support for MSG_MORE with datamsg->force_delay
didn't really work as we expected, as the first msg with MSG_MORE set
would always block the following chunks' dequeuing.
This Patch is to rewrite it by saving the MSG_MORE flag into assoc as
David Laight suggested.
asoc->force_delay is used to save MSG_MORE flag before a msg is sent.
All chunks in queue would not be sent out if asoc->force_delay is set
by the msg with MSG_MORE flag, until a new msg without MSG_MORE flag
clears asoc->force_delay.
Note that this change would not affect the flush is generated by other
triggers, like asoc->state != ESTABLISHED, queue size > pmtu etc.
v1->v2:
Not clear asoc->force_delay after sending the msg with MSG_MORE flag.
Fixes: 4ea0c32f5f42 ("sctp: add support for MSG_MORE")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: David Laight <david.laight@aculab.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko says:
====================
Add support for pipeline debug (dpipe)
Arkadi says:
While doing the hardware offloading process much of the hardware
specifics cannot be presented. An example for such is the routing
LPM algorithm which differ in hardware implementation from the
kernel software implementation. The only information the user receives
is whether specific route is offloaded or not, but he cannot really
understand the underlying implementation nor get the specific statistics
related to that process.
Another example is ACL offload using TC which is commonly implemented
using TCAM memory. Currently there is no capability to gain visibility
into the TCAM structure and to debug suboptimal resource allocation.
This patchset introduces capability for exporting the ASICs pipeline
abstraction via devlink infrastructure, which should serve as an
complementary tool. This infrastructure allows the user to get visibility
into the ASIC by modeling it as a set of match/action tables.
The main objects defined:
Table - abstraction for a single pipeline stage. Contains the
available match/actions and counter availability.
Entry - entry in a specific table with specific matches/actions
values and dedicated counter.
Header/field - tuples which describes the tables behavior.
As an example one of the ASIC's L3 blocks will be modeled. The egress
rif (router interface) table is the final step in the L3 pipeline
processing which does match on the internal rif index which was
determined before by the routing logic. The erif table determines
whether to forward or drop the packet and updates the corresponding
rif L3 statistics.
To expose this internal resources a special metadata header will
be introduced that describes the internal information gathered by
the ASIC's pipeline and contains the following fields: rif_port_index,
forward and drop.
Some internal hardware resources have direct mapping to kernel
objects. For example the rif_port_index is mapped to the net-devices
ifindex. By providing this mapping the users gains visibility into
the offloading process.
Follow-up work will include exporting more L3 tables which will give
visibility into the routing process.
First stage is adding support for dpipe in devlink. Next add support
in spectrum driver. Finally implement egress router interface
(erif) table for spectrum ASIC as an example.
---
v1->v2: Please see individual patches
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement dpipe's table ops for erif table which provide:
1. Getting the entries in the table with the associate values.
- match on "mlxsw_meta:erif_index"
- action on "mlxsw_meta:forwared_out"
2. Synchronize the hardware in case of enabling/disabling counters which
mean removing erif counters from all interfaces.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add rif helper function to access the rif index and rif devices ifindex.
This functions will be used by dpipe in order to dump the rif table.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for counter allocation on router interfaces. The allocation
depends on the counter state of relevant table. In case the counting is
disabled or no counters left the counter index will be set as invalid.
Also a counter pool for router allocation is added.
Signed-off-by: Arakdi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The RICNT register retrieves per port performance counter. It will be
used to query the router interfaces statistics.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add definition for egress router interface table. This table describes
the final part in the routing pipeline. This table matches the egress
interface index (rif index, which is set by the previous stages and
determine the out port) and makes the decision of forwarding the packet
towards the L2 logic or dropping it.
The metadata header is added to represent this internal information.
The rif index field is mapped logically to netdevice ifindex.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add placeholder for dpipe. Support for specific tables and headers will
be introduced in following patches. The headers are shared between all
mlxsw_sp instances.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Update RITR for counter support. This allows adding counters for
ASIC's router ports.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The pipeline debug is used to export the pipeline abstractions for the
main objects - tables, headers and entries. The only support for set is
for changing the counter parameter on specific table.
The basic structures:
Header - can represent a real protocol header information or internal
metadata. Generic protocol headers like IPv4 can be shared
between drivers. Each driver can add local headers.
Field - part of a header. Can represent protocol field or specific ASIC
metadata field. Hardware special metadata fields can be mapped
to different resources, for example switch ASIC ports can have
internal number which from the systems point of view is mapped
to netdeivce ifindex.
Match - represent specific match rule. Can describe match on specific
field or header. The header index should be specified as well
in order to support several header instances of the same type
(tunneling).
Action - represents specific action rule. Actions can describe operations
on specific field values for example like set, increment, etc.
And header operation like add and delete.
Value - represents value which can be associated with specific match or
action.
Table - represents a hardware block which can be described with match/
action behavior. The match/action can be done on the packets
data or on the internal metadata that it gathered along the
packets traversal throw the pipeline which is vendor specific
and should be exported in order to provide understanding of
ASICs behavior.
Entry - represents single record in a specific table. The entry is
identified by specific combination of values for match/action.
Prior to accessing the tables/entries the drivers provide the header/
field data base which is used by driver to user-space. The data base
is split between the shared headers and unique headers.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Menzel reported a warning:
WARNING: CPU: 0 PID: 774 at /build/linux-ROBWaj/linux-4.9.13/kernel/trace/trace_functions_graph.c:233 ftrace_return_to_handler+0x1aa/0x1e0
Bad frame pointer: expected f6919d98, received f6919db0
from func acpi_pm_device_sleep_wake return to c43b6f9d
The warning means that function graph tracing is broken for the
acpi_pm_device_sleep_wake() function. That's because the ACPI Makefile
unconditionally sets the '-Os' gcc flag to optimize for size. That's an
issue because mcount-based function graph tracing is incompatible with
'-Os' on x86, thanks to the following gcc bug:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=42109
I have another patch pending which will ensure that mcount-based
function graph tracing is never used with CONFIG_CC_OPTIMIZE_FOR_SIZE on
x86.
But this patch is needed in addition to that one because the ACPI
Makefile overrides that config option for no apparent reason. It has
had this flag since the beginning of git history, and there's no related
comment, so I don't know why it's there. As far as I can tell, there's
no reason for it to be there. The appropriate behavior is for it to
honor CONFIG_CC_OPTIMIZE_FOR_{SIZE,PERFORMANCE} like the rest of the
kernel.
Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: All applicable <stable@vger.kernel.org>
When removing a GHES device notified by SCI, list_del_rcu() is used,
ghes_remove() should call synchronize_rcu() before it goes on to call
kfree(ghes), otherwise concurrent RCU readers may still hold this list
entry after it has been freed.
Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: "Huang, Ying" <ying.huang@intel.com>
Fixes: 81e88fdc432a (ACPI, APEI, Generic Hardware Error Source POLL/IRQ/NMI notification type support)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
No platform-device is required for IO(x)APICs, so don't even
create them.
[ rjw: This fixes a problem with leaking platform device objects
after IOAPIC/IOxAPIC hot-removal events.]
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The on-stack resource-window 'win' in setup_res() is not
properly initialized. This causes the pointers in the
embedded 'struct resource' to contain stale addresses.
These pointers (in my case the ->child pointer) later get
propagated to the global iomem_resources list, causing a #GP
exception when the list is traversed in
iomem_map_sanity_check().
Fixes: c183619b63ec (x86/irq, ACPI: Implement ACPI driver to support IOAPIC hotplug)
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Fixes to multiple issues in virtio. Most notably
a regression fix for crashes reported by Fedora users.
Hybernate is still reportedly broken, working on it.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJY2qEcAAoJECgfDbjSjVRpM/oH/3GPZOh9/tMzDFDaDljqtWQy
PGVb74/3+O55xOOq9nyyS3+6BlCXmiUcynxg61QUOUqUuHPPdH/OntyyPgG0pYkx
271W81C1yc2xFp/qkOiMWKiPmsbJ7ykVg37NWtxm7Phf4RgX3wgymq87hWr4Td1G
q9k6oyMCmvJUECJVxOVHjPt+oYQ7zQkFBNB8kSNlj67gbe533jkPt46MMlXbX7fQ
lPdJTnLXN/GQxnVtw5AAiWF87z0wNVUefrLe9sHW3KOeGBdne4NXblvz3WF/iPq4
N96thgm7QOP3NgAqbaUa7Fb0+jxyi2DNYFrVPxnf+nOOQy/AVUX6GRZJ2Tu6gF0=
=oSO5
-----END PGP SIGNATURE-----
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio fixes from Michael Tsirkin:
"Fixes to multiple issues in virtio.
Most notably a regression fix for crashes reported by Fedora users.
Hibernate is still reportedly broken, working on it"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
virtio_balloon: prevent uninitialized variable use
virtio-balloon: use actual number of stats for stats queue buffers
virtio_balloon: init 1st buffer in stats vq
virtio_pci: fix out of bound access for msix_names
The latest gcc-7.0.1 snapshot reports a new warning:
virtio/virtio_balloon.c: In function 'update_balloon_stats':
virtio/virtio_balloon.c:258:26: error: 'events[2]' is used uninitialized in this function [-Werror=uninitialized]
virtio/virtio_balloon.c:260:26: error: 'events[3]' is used uninitialized in this function [-Werror=uninitialized]
virtio/virtio_balloon.c:261:56: error: 'events[18]' is used uninitialized in this function [-Werror=uninitialized]
virtio/virtio_balloon.c:262:56: error: 'events[17]' is used uninitialized in this function [-Werror=uninitialized]
This seems absolutely right, so we should add an extra check to
prevent copying uninitialized stack data into the statistics.
>From all I can tell, this has been broken since the statistics code
was originally added in 2.6.34.
Fixes: 9564e138b1f6 ("virtio: Add memory statistics reporting to the balloon driver (V4)")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The virtio balloon driver contained a not-so-obvious invariant that
update_balloon_stats has to update exactly VIRTIO_BALLOON_S_NR counters
in order to send valid stats to the host. This commit fixes it by having
update_balloon_stats return the actual number of counters, and its
callers use it when pushing buffers to the stats virtqueue.
Note that it is still out of spec to change the number of counters
at run-time. "Driver MUST supply the same subset of statistics in all
buffers submitted to the statsq."
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
When init_vqs runs, virtio_balloon.stats is either uninitialized or
contains stale values. The host updates its state with garbage data
because it has no way of knowing that this is just a marker buffer
used for signaling.
This patch updates the stats before pushing the initial buffer.
Alternative fixes:
* Push an empty buffer in init_vqs. Not easily done with the current
virtio implementation and violates the spec "Driver MUST supply the
same subset of statistics in all buffers submitted to the statsq".
* Push a buffer with invalid tags in init_vqs. Violates the same
spec clause, plus "invalid tag" is not really defined.
Note: the spec says:
When using the legacy interface, the device SHOULD ignore all values in
the first buffer in the statsq supplied by the driver after device
initialization. Note: Historically, drivers supplied an uninitialized
buffer in the first buffer.
Unfortunately QEMU does not seem to implement the recommendation
even for the legacy interface.
Cc: stable@vger.kernel.org
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Fedora has received multiple reports of crashes when running
4.11 as a guest
https://bugzilla.redhat.com/show_bug.cgi?id=1430297https://bugzilla.redhat.com/show_bug.cgi?id=1434462https://bugzilla.kernel.org/show_bug.cgi?id=194911https://bugzilla.redhat.com/show_bug.cgi?id=1433899
The crashes are not always consistent but they are generally
some flavor of oops or GPF in virtio related code. Multiple people
have done bisections (Thank you Thorsten Leemhuis and
Richard W.M. Jones) and found this commit to be at fault
07ec51480b5eb1233f8c1b0f5d7a7c8d1247c507 is the first bad commit
commit 07ec51480b5eb1233f8c1b0f5d7a7c8d1247c507
Author: Christoph Hellwig <hch@lst.de>
Date: Sun Feb 5 18:15:19 2017 +0100
virtio_pci: use shared interrupts for virtqueues
The issue seems to be an out of bounds access to the msix_names
array corrupting kernel memory.
Fixes: 07ec51480b5e ("virtio_pci: use shared interrupts for virtqueues")
Reported-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Tested-by: Thorsten Leemhuis <linux@leemhuis.info>
Fix a filelayout GETDEVICEINFO call hang triggered from the LAYOUTGET
pnfs_layout_process where the GETDEVICEINFO call is waiting for a
session slot, and the LAYOUGET call is waiting for pnfs_layout_process
to complete before freeing the slot GETDEVICEINFO is waiting for..
This occurs in testing against the pynfs pNFS server where the
the on-wire reply highest_slotid and slot id are zero, and the
target high slot id is 8 (negotiated in CREATE_SESSION).
The internal fore channel slot table max_slotid, the maximum allowed
table slotid value, has been reduced via nfs41_set_max_slotid_locked
from 8 to 1. Thus there is one slot (slotid 0) available for use but
it has not been freed by LAYOUTGET proir to the GETDEVICEINFO request.
In order to ensure that layoutrecall callbacks are processed in the
correct order, nfs4_proc_layoutget processing needs to be finished
e.g. pnfs_layout_process) before giving up the slot that identifies
the layoutget (see referring_call_exists).
Move the filelayout_check_layout nfs4_find_get_device call outside of
the pnfs_layout_process call tree.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
In preparation for moving the filelayout getdeviceinfo call from
filelayout_alloc_lseg called by pnfs_process_layout
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
This includes calling the parsing code that translates from pedit
speak to the HW API, allocation (deallocation) of a modify header
context and setting the modify header id associated with this
context to the FTE of that flow.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
This includes calling the parsing code that translates from pedit
speak to the HW API, allocation (deallocation) of a modify header
context and setting the modify header id associated with this
context to the FTE of that flow.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Parse/translate a set of TC pedit actions to be formed in the HW API format.
User-space provides set of keys where each one of them is made of: command (add or
set), header-type, byte offset within that header along with a 32 bit mask and value.
The mask dictates what bits in the 32 bit word that starts on the offset we should
be dealing with, but under negative polarity (unset bits are to be modified).
We do a 1st pass over the set of keys while using the header-type and offset to
fill the masks and the values into a data-structure containting all the
supported network headers.
We then do a 2nd pass over the set of fields to re-write supported by the HW,
where for each such candidate field, we use the masks filled on the 1st pass to
realize if we should offloading re-write it.
In case offloading is required, we fill a HW descriptor with the following:
(1) the header field to modify
(2) the bit offset within the field from where to modify (set command only)
(3) the value to set/add
(4) the length in bits 1...32 to modify (set command only)
Note that it's possible for a given pedit mask to dictate modifying the
same header field multiple times or to modify multiple header fields.
Currently such combinations are not supported for offloading, hence, for set
commands, the offset within the field is always zero, and the length to modify
is the field size.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Amir Vadai <amir@vadai.me>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
HW drivers will use the header-type and command fields from the extended
keys, and some fields (e.g mask, val, offset) from the legacy keys.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Implement the low-level commands to support packet header re-write.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Add the definitions related to creation/deletion of a modify header
context and the modify header steering action which are used for HW
packet header modify (re-write) as part of steering. Add as well the
modify header id into two intermediate structs and set it to the FTE.
Note that as the push/pop vlan steering actions are emulated by the
ewitch management code, we're not breaking any compatibility while
changing their values to make room for the modify header action which
is not emulated and whose value is part of the FW API. The new bit
values for the emulated actions are at the end of the possible range.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Move the commands related to scheduling elements and vport qos to
a suitable location (according to the MLX5_CMD_OP enum values) in
the command string and internal error helpers.
This patch doesn't change any functionality.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
There are bunch of places in the code where the intermediate struct
that keeps the elements related to flow actions is initialized with
the same default values. Put that into a small DECLARE type helper.
This patch doesn't change any functionality.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The code for adding tc fdb flows leaves things half set when it fails
in the middle. Currently we are not leaking things (e.g eswitch
vlan reference, encap reference and HW resources) since the main
code to add flower rules does a cleanup by calling mlx5e_tc_del_flow().
This cleanup further works just b/c we're checking there if the HW rule
for the flow we are attempting to delete is valid before touching it, and
since under the current possible combinations of supported actions it's okay
to go and blidnly deref or delete all the action related resources (encap, vlan).
Instead, do things properly, namely make sure that if add flow fails we
clean all what was allocated or referenced. Now, the flow delete code can
blindly deref/deallocate both the rule and the actions related resources and
when more action combinations are introduced (such as the upcoming header
re-write) we are fine with clear and robust code.
While here, align all of nic/fdb parse actions/add flow functions to get
mlx5e_tc_flow struct param and pick the attributes or whatever else needed
from there.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>