linux

iv/linux

Author	SHA1	Message	Date
Daniel Borkmann	de8f3a83b0	bpf: add meta pointer for direct access This work enables generic transfer of metadata from XDP into skb. The basic idea is that we can make use of the fact that the resulting skb must be linear and already comes with a larger headroom for supporting bpf_xdp_adjust_head(), which mangles xdp->data. Here, we base our work on a similar principle and introduce a small helper bpf_xdp_adjust_meta() for adjusting a new pointer called xdp->data_meta. Thus, the packet has a flexible and programmable room for meta data, followed by the actual packet data. struct xdp_buff is therefore laid out that we first point to data_hard_start, then data_meta directly prepended to data followed by data_end marking the end of packet. bpf_xdp_adjust_head() takes into account whether we have meta data already prepended and if so, memmove()s this along with the given offset provided there's enough room. xdp->data_meta is optional and programs are not required to use it. The rationale is that when we process the packet in XDP (e.g. as DoS filter), we can push further meta data along with it for the XDP_PASS case, and give the guarantee that a clsact ingress BPF program on the same device can pick this up for further post-processing. Since we work with skb there, we can also set skb->mark, skb->priority or other skb meta data out of BPF, thus having this scratch space generic and programmable allows for more flexibility than defining a direct 1:1 transfer of potentially new XDP members into skb (it's also more efficient as we don't need to initialize/handle each of such new members). The facility also works together with GRO aggregation. The scratch space at the head of the packet can be multiple of 4 byte up to 32 byte large. Drivers not yet supporting xdp->data_meta can simply be set up with xdp->data_meta as xdp->data + 1 as bpf_xdp_adjust_meta() will detect this and bail out, such that the subsequent match against xdp->data for later access is guaranteed to fail. The verifier treats xdp->data_meta/xdp->data the same way as we treat xdp->data/xdp->data_end pointer comparisons. The requirement for doing the compare against xdp->data is that it hasn't been modified from it's original address we got from ctx access. It may have a range marking already from prior successful xdp->data/xdp->data_end pointer comparisons though. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-26 13:36:44 -07:00
Jacob Keller	742c987575	i40e/i40evf: avoid dynamic ITR updates when polling or low packet rate The dynamic ITR algorithm depends on a calculation of usecs which assumes that the interrupts have been firing constantly at the interrupt throttle rate. This is not guaranteed because we could have a low packet rate, or have been polling in software. We'll estimate whether this is the case by using jiffies to determine if we've been too long. If the time difference of jiffies is larger we are guaranteed to have an incorrect calculation. If the time difference of jiffies is smaller we might have been polling some but the difference shouldn't affect the calculation too much. This ensures that we don't get stuck in BULK latency during certain rare situations where we receive bursts of packets that force us into NAPI polling. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-08-27 16:15:24 -07:00
Jacob Keller	0a2c7722be	i40e/i40evf: remove ULTRA latency mode Since commit `c56625d597` ("i40e/i40evf: change dynamic interrupt thresholds") a new higher latency ITR setting called I40E_ULTRA_LATENCY was added with a cryptic comment about how it was meant for adjusting Rx more aggressively when streaming small packets. This mode was attempting to calculate packets per second and then kick in when we have a huge number of small packets. Unfortunately, the ULTRA setting was kicking in for workloads it wasn't intended for including single-thread UDP_STREAM workloads. This wasn't caught for a variety of reasons. First, the ip_defrag routines were improved somewhat which makes the UDP_STREAM test still reasonable at 10GbE, even when dropped down to 8k interrupts a second. Additionally, some other obvious workloads appear to work fine, such as TCP_STREAM. The number 40k doesn't make sense for a number of reasons. First, we absolutely can do more than 40k packets per second. Second, we calculate the value inline in an integer, which sometimes can overflow resulting in using incorrect values. If we fix this overflow it makes it even more likely that we'll enter ULTRA mode which is the opposite of what we want. The ULTRA mode was added originally as a way to reduce CPU utilization during a small packet workload where we weren't keeping up anyways. It should never have been kicking in during these other workloads. Given the issues outlined above, let's remove the ULTRA latency mode. If necessary, a better solution to the CPU utilization issue for small packet workloads will be added in a future patch. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-08-27 16:12:15 -07:00
Jacob Keller	6d9777298b	i40e: invert logic for checking incorrect cpu vs irq affinity In commit `96db776a36` ("i40e/vf: fix interrupt affinity bug") we added some code to force exit of polling in case we did not have the correct CPU. This is important since it was possible for the IRQ affinity to be changed while the CPU is pegged at 100%. This can result in the polling routine being stuck on the wrong CPU until traffic finally stops. Unfortunately, the implementation, "if the CPU is correct, exit as normal, otherwise, fall-through to the end-polling exit" is incredibly confusing to reason about. In this case, the normal flow looks like the exception, while the exception actually occurs far away from the if statement and comment. We recently discovered and fixed a bug in this code because we were incorrectly initializing the affinity mask. Re-write the code so that the exceptional case is handled at the check, rather than having the logic be spread through the regular exit flow. This does end up with minor code duplication, but the resulting code is much easier to reason about. The new logic is identical, but inverted. If we are running on a CPU not in our affinity mask, we'll exit polling. However, the code flow is much easier to understand. Note that we don't actually have to check for MSI-X, because in the MSI case we'll only have one q_vector, but its default affinity mask should be correct as it includes all CPUs when it's initialized. Further, we could at some point add code to setup the notifier for the non-MSI-X case and enable this workaround for that case too, if desired, though there isn't much gain since its unlikely to be the common case. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-08-27 16:10:48 -07:00
Jacob Keller	9254c0e34e	i40e: move enabling icr0 into i40e_update_enable_itr If we don't have MSI-X enabled, we handle interrupts on all icr0. This is a special case, so let's move the conditional into i40e_update_enable_itr() in order to make i40e_napi_poll easier to read about. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-08-27 16:07:13 -07:00
David S. Miller	3118e6e19d	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net The UDP offload conflict is dealt with by simply taking what is in net-next where we have removed all of the UFO handling code entirely. The TCP conflict was a case of local variables in a function being removed from both net and net-next. In netvsc we had an assignment right next to where a missing set of u64 stats sync object inits were added. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-09 16:28:45 -07:00
Florian Fainelli	7d6d067790	i40e: Initialize 64-bit statistics TX ring seqcount On 32-bit hosts and with CONFIG_DEBUG_LOCK_ALLOC we should be seeing a lockdep splat indicating this seqcount is not correctly initialized, fix that. Fixes: `980e9b1186` ("i40e: Add support for 64 bit netstats") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-01 20:06:06 -07:00
Jesse Brandeburg	b85c94b617	i40e/i40evf: remove mismatched type warnings Compiler reported several places where driver compared signed and unsigned types. Cast or change the types to remove the warnings. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-07-26 03:25:20 -07:00
Jesse Brandeburg	601a2e7ac5	i40e/i40evf: make IPv6 ATR code clearer This just reorders some local vars and makes the code flow clearer. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-07-26 03:25:20 -07:00
Björn Töpel	74608d17fe	i40e: add support for XDP_TX action This patch adds proper XDP_TX action support. For each Tx ring, an additional XDP Tx ring is allocated and setup. This version does the DMA mapping in the fast-path, which will penalize performance for IOMMU enabled systems. Further, debugfs support is not wired up for the XDP Tx rings. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-20 18:17:09 -07:00
Björn Töpel	0c8493d90b	i40e: add XDP support for pass and drop actions This commit adds basic XDP support for i40e derived NICs. All XDP actions will end up in XDP_DROP. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-20 18:17:09 -07:00
David S. Miller	0ddead90b2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net The conflicts were two cases of overlapping changes in batman-adv and the qed driver. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-15 11:59:32 -04:00
Jacob Keller	6964e53f55	i40e: fix handling of HW ATR eviction A recent commit to refactor the driver and remove the hw_disabled_flags field accidentally introduced two regressions. First, we overwrote pf->flags which removed various key flags including the MSI-X settings. Additionally, it was intended that we have now two flags, HW_ATR_EVICT_CAPABLE and HW_ATR_EVICT_ENABLED, but this was not done, and we accidentally were mis-using HW_ATR_EVICT_CAPABLE everywhere. This patch adds the missing piece, HW_ATR_EVICT_ENABLED, and safely updates pf->flags instead of overwriting it. Without this patch we will have many problems including disabling MSI-X support, and we'll attempt to use HW ATR eviction on devices which do not support it. Fixes: `47994c119a` ("i40e: remove hw_disabled_flags in favor of using separate flag bits", 2017-04-19) Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-12 18:53:02 -04:00
David S. Miller	216fe8f021	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Just some simple overlapping changes in marvell PHY driver and the DSA core code. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-06 22:20:08 -04:00
Björn Töpel	2aae918c7a	i40e/i40evf: proper update of the page_offset field In `f8b45b74cc` ("i40e/i40evf: Use build_skb to build frames") i40e_build_skb updates the page_offset field with an incorrect offset, which can lead to data corruption. This patch updates page_offset correctly, by properly setting truesize. Note that the bug only appears on architectures where PAGE_SIZE is 8192 or larger. Fixes: `f8b45b74cc` ("i40e/i40evf: Use build_skb to build frames") Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Acked-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-06-06 02:49:15 -07:00
Jacob Keller	0bc0706b46	i40e: check for Tx timestamp timeouts during watchdog The i40e driver has logic to handle only one Tx timestamp at a time, using a state bit lock to avoid multiple requests at once. It may be possible, if incredibly unlikely, that a Tx timestamp event is requested but never completes. Since we use an interrupt scheme to determine when the Tx timestamp occurred we would never clear the state bit in this case. Add an i40e_ptp_tx_hang() function similar to the already existing i40e_ptp_rx_hang() function. This function runs in the watchdog routine and makes sure we eventually recover from this case instead of permanently disabling Tx timestamps. Note: there is no currently known way to cause this without hacking the driver code to force it. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-05-31 03:12:06 -07:00
Jacob Keller	2955faca04	i40e: add statistic indicating number of skipped Tx timestamps The i40e driver can only handle one Tx timestamp request at a time. This means it is possible for an application timestamp request to be ignored. There is no easy way for an administrator to determine if this occurred. Add a new statistic which tracks this, tx_hwtstamp_skipped. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-05-31 03:09:14 -07:00
Jacob Keller	69077577af	i40e: avoid permanent lock of *_PTP_TX_IN_PROGRESS The i40e driver uses a bit lock to indicate when a Tx timestamp is in progress to avoid attempting to timestamp multiple packets at once. This is required because hardware only has registers to handle one request at a time. There is a corner case where we failed to cleanup the bit lock after a failed transmit. This can potentially result in a state bit being locked forever. Add some cleanup code to i40e_xmit_frame_ring to check and make sure we cleanup incase of these failures. We also modify i40e_tx_map to return an error code indication DMA failure. Reported-by: Reported-by: David Mirabito <davidm@metamako.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-05-31 03:08:09 -07:00
Jacob Keller	47994c119a	i40e: remove hw_disabled_flags in favor of using separate flag bits The hw_disabled_flags field was added as a way of signifying that a feature was automatically or temporarily disabled. However, we actually only use this for FDir features. Replace its use with new _AUTO_DISABLED flags instead. This is more readable, because you aren't setting an _ENABLED flag to disable* the feature. Additionally, clean up a few areas where we used these bits. First, we don't really need to set the auto-disable flag for ATR if we're fully disabling the feature via ethtool. Second, we should always clear the auto-disable bits in case they somehow got set when the feature was disabled. However, avoid displaying a message that we've re-enabled the feature. Third, we shouldn't be re-enabling ATR in the SB ntuple add flow, because it might have been disabled due to space constraints. Instead, we should just wait for the fdir_check_and_reenable to be called by the watchdog. Overall, this change allows us to simplify some code by removing an extra field we didn't need, and the result should make it more clear as to what we're actually doing with these flags. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-30 04:53:58 -07:00
Jacob Keller	0da36b9774	i40e: use DECLARE_BITMAP for state fields Instead of assuming our flags fit within an unsigned long, use DECLARE_BITMAP which will ensure that we always allocate enough space. Additionally, use __I40E_STATE_SIZE__ markers as the last element of the enumeration so that the size of the BITMAP is compile-time assigned rather than programmer-time assigned. This ensures that potential future flag additions do not actually overrun the array. This is especially important as 32bit systems would only have 32bit longs instead of 64bit longs as we generally have assumed in the prior code. This change also removes a dereference of the state fields throughout the code, so it does have a bit of code churn. The conversions were automated using sed replacements with an alternation s/&(vsi->back\|vsi\|pf)->state/\1->state/ s/&adapter->vsi.state/adapter->vsi.state/ For debugfs, we modify the printing so that we can display chunks of the state value on new lines. This ensures that we can print the entire set of state values. Additionally, we now print them as 08lx to ensure that they display nicely. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-30 04:48:13 -07:00
Jacob Keller	d19cb64b92	i40e: separate PF and VSI state flags Avoid using the same named flags for both vsi->state and pf->state. This makes code review easier, as it is more likely that future authors will use the correct state field when checking bits. Previous commits already found issues with at least one check, and possibly others may be incorrect. This reduces confusion as it is more clear what each flag represents, and which flags are valid for which state field. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-30 04:42:30 -07:00
Scott Peterson	ed0980c440	i40e/i40evf: Add tracepoints This patch adds tracepoints to the i40e and i40evf drivers to which BPF programs can be attached for feature testing and verification. It's expected that an attached BPF program will identify and count or log some interesting subset of traffic. The bcc-tools package is helpful there for containing all the BPF arcana in a handy Python wrapper. Though you can make these tracepoints log trace messages, the messages themselves probably won't be very useful (other to verify the tracepoint is being called while you're debugging your BPF program). The idea here is that tracepoints have such low performance cost when disabled that we can leave these in the upstream drivers. This may eventually enable the instrumentation of unmodified customer systems should the need arise to verify a NIC feature is working as expected. In general this enables one set of feature verification tools to be used on these drivers whether they're built with the kernel or separately. Users are advised against using these tracepoints for anything other than a diagnostic tool. They have a performance impact when enabled, and their exact placement and form may change as we see how well they work in practice for the purposes above. Change-ID: Id6014a7322c0e6d08068114dd20bd156f2f6435e Signed-off-by: Scott Peterson <scott.d.peterson@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-19 16:47:31 -07:00
Alexander Duyck	0e626ff7cc	i40e: Fix support for flow director programming status This patch fixes an issue I introduced when I converted the code over to using the length field to determine if a descriptor was done or not. It turns out that we are also processing programming descriptors in the Rx path and need to have these processed even though the length field will be 0 on these packets. What will happen with a programming descriptor is that we will receive a descriptor that has the SPH bit set, and the header length and packet length fields cleared. To account for this we should be checking for the bit for split header being set even though we aren't actually using header split. This bit is set in the length field to indicate if a programming descriptor response is contained in the descriptor. Since we don't support header split we don't need to perform the extra checks of using a fixed value for the entire length field. In addition I am moving the function for checking if a filter is a programming status filter into the i40e_txrx.c file since there is no longer support for FCoE it doesn't make sense to keep this file in i40e.h. Change-ID: I12c359c3dc70adb9d6b92b27324bb2c7f04c1a06 Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-19 16:36:18 -07:00
Alexander Duyck	f8b45b74cc	i40e/i40evf: Use build_skb to build frames This patch is meant to improve the performance of the Rx path. Specifically by using build_skb we have several distinct advantages. In the case of small frames we were previously using a copy-break approach. This means that we were allocating a page fragment to use for skb->head, and were having to copy the packet into that region. Both of those calls are now avoided since we just build the skb around the data. In the case of large frames the gains are much more significant. Specifically we were having to allocate skb->head, and copy the headers as before. However in addition we were having to parse the header using eth_get_headlen which could be quite expensive. All of this is avoided by building the frame around the data. I have seen gains as high as 30% when using VXLAN for instance due to just header pulling overhead. Finally with all this in place it also sets us up to start looking at enabling XDP. Specifically we now have a path in which the data is in the page and the frame is built around it. So if we parse it with XDP before we call build_skb we can take care of any necessary processing there. Change-ID: Id4bdd618e94473d41f892417e5d8019639e421e3 Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-08 02:53:51 -07:00
Alexander Duyck	ca9ec0888d	i40e/i40evf: Add support for padding start of frames This patch adds padding to the start of frames to make room for headroom for us to eventually start using build_skb. Right now we guarantee at least NET_SKB_PAD + NET_IP_ALIGN, however we allocate more space if more is available. For example on x86 the headroom should be 192 bytes. On systems that have too large of a cache line size to support storing 1.5K padding and shared info we default to using 3K buffers and reserve everything that isn't used for skb_shared_info or the data buffer for headroom. Change-ID: I33c641c9a1ea10cf7cc484c2d20985368d2d709a Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-08 02:53:51 -07:00
Alexander Duyck	98efd69493	i40e/i40evf: Add support for using order 1 pages with a 3K buffer There are situations where adding padding to the front and back of an Rx buffer will require that we add additional padding. Specifically if NET_IP_ALIGN is non-zero, or the MTU size is larger than 7.5K we would need to use 2K buffers which leaves us with no room for the padding. To preemptively address these cases I am adding support for 3K buffers to the Rx path so that we can provide the additional padding needed in the event of NET_IP_ALIGN being non-zero or a cache line being greater than 64. Change-ID: I938bc1ba611285428df39a613cd66f98e60b55c7 Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-08 02:53:50 -07:00
Alan Brady	17daabb5e8	i40e: Simplify i40e_detect_recover_hung_queue logic This patch greatly reduces the unneeded complexity in the i40e_detect_recover_hung_queue code path. The previous implementation set a 'hung bit' which would then get cleared while polling. If the detection routine was called a second time with the bit already set, we would issue a software interrupt. This patch makes it such that if interrupts are disabled and we have pending TX descriptors, we trigger a software interrupt since in, the worst case, queues are already clean and we have an extra interrupt. Additionally this patch removes the workaround for lost interrupts as calling napi_reschedule in this context can cause software interrupts to fire on the wrong CPU. Change-ID: Iae108582a3ceb6229ed1d22e4ed6e69cf97aad8d Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-08 02:53:49 -07:00
Alexander Duyck	e8c5f7231c	i40e: Swap use of pf->flags and pf->hw_disabled_flags for ATR Eviction This is a minor cleanup so that we are always updating pf->flags when we make a change to the private flags instead of updating a mix of either pf->flags and/or pf->hw_disabled_flags. In addition I went through and cleaned out all the spots where we were using the X722 define in regards to this flag. Lastly since we changed the logic I went through and flushed out any redundancy and cleaned up the handling of the flags in the Tx path. Change-ID: I79ff95a7272bb2533251ff11ef91e89ccb80b610 Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-08 02:53:49 -07:00
Jacob Keller	a346fb836c	i40e: update error message when trying to add invalid filters Re-word the error message displayed when adding a filter with an invalid flow type. Additionally, report a distinct error message when the IPv4 protocol is at fault. Change-ID: Iba3d85b87f8d383c97c8bdd180df34a6adf3ee67 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-04-08 02:53:48 -07:00
Alexander Duyck	fa2343e903	i40e/i40evf: Break i40e_fetch_rx_buffer up to allow for reuse of frag code This patch is meant to clean up the code in preparation for us adding support for build_skb. Specifically we deconstruct i40e_fetch_buffer into several functions so that those functions can later be reused when we add a path for build_skb. Specifically with this change we split out the code for adding a page to an exiting skb. Change-ID: Iab1efbab6b8b97cb60ab9fdd0be1d37a056a154d Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-29 02:15:06 -07:00
Alexander Duyck	a0cfc3130e	i40e/i40evf: Pull out code for cleaning up Rx buffers This patch pulls out the code responsible for handling buffer recycling and page counting and distributes it through several functions. This allows us to commonize the bits that handle either freeing or recycling the buffers. As far as the page count tracking one change to the logic is that pagecnt_bias is decremented as soon as we call i40e_get_rx_buffer. It is then the responsibility of the function that pulls the data to either increment the pagecnt_bias if the buffer can be recycled as-is, or to update page_offset so that we are pointing at the correct location for placement of the next buffer. Change-ID: Ibac576360cb7f0b1627f2a993d13c1a8a2bf60af Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-29 02:15:06 -07:00
Alexander Duyck	9a064128fc	i40e/i40evf: Pull code for grabbing and syncing rx_buffer from fetch_buffer This patch pulls the code responsible for fetching the Rx buffer and synchronizing DMA into a function, specifically called i40e_get_rx_buffer. The general idea is to allow for better code reuse by pulling this out of i40e_fetch_rx_buffer. We dropped a couple of prefetches since the time between the prefetch being called and the data being accessed was too small to be useful. Change-ID: I4885fce4b2637dbedc8e16431169d23d3d7e79b9 Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-29 02:15:06 -07:00
Alexander Duyck	d57c0e08c7	i40e/i40evf: Use length to determine if descriptor is done This change makes it so that we use the length of the packet instead of the DD status bit to determine if a new descriptor is ready to be processed. The obvious advantage is that it cuts down on reads as we don't really even need the DD bit if going from a 0 to a non-zero value on size is enough to inform us that the packet has been completed. Change-ID: Iebdf9cdb36c454ef092df27199b92ad09c374231 Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-29 02:15:06 -07:00
Alexander Duyck	9eed69a914	i40e: Drop FCoE code from core driver files Looking over the code for FCoE it looks like the Rx path has been broken at least since the last major Rx refactor almost a year ago. It seems like FCoE isn't supported for any of the Fortville/Fortpark hardware so there isn't much point in carrying the code around, especially if it is broken and untested. Change-ID: I892de8fa551cb129ce2361e738ff82ce55fa229e Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-27 16:47:43 -07:00
Alexander Duyck	a5b268e4b1	i40e/i40evf: Clean-up process_skb_fields This is a minor clean-up to make the i40e/i40evf process_skb_fields function look a little more like what we have in igb. The Rx checksum function called out a need for skb->protocol but I can't see where it actually needs it. I am assuming this is something that was likely refactored out some time ago as the Rx checksum code has gone through a few rewrites. Change-ID: I0b4668a34d90b61b66ded7c7c26e19a3e2d06251 Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-27 16:47:43 -07:00
Alexander Duyck	741b8b832a	i40e/i40evf: Fix use after free in Rx cleanup path We need to reset skb back to NULL when we have freed it in the Rx cleanup path. I found one spot where this wasn't occurring so this patch fixes it. Change-ID: Iaca68934200732cd4a63eb0bd83b539c95f8c4dd Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-27 16:45:14 -07:00
Alexander Duyck	1793668c3b	i40e/i40evf: Update code to better handle incrementing page count Update the driver code so that we do bulk updates of the page reference count instead of just incrementing it by one reference at a time. The advantage to doing this is that we cut down on atomic operations and this in turn should give us a slight improvement in cycles per packet. In addition if we eventually move this over to using build_skb the gains will be more noticeable. I also found and fixed a store forwarding stall from where we were assigning "new_buff = old_buff". By breaking it up into individual copies we can avoid this and as a result the performance is slightly improved. Change-ID: I1d3880dece4133eca3c32423b04a5467321ccc52 Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-27 16:45:13 -07:00
Jacob Keller	f223c8752a	i40e: add support for SCTPv4 FDir filters Enable FDir filters for SCTPv4 packets using the ethtool ntuple interface to enable filters. The ethtool API does not allow masking on the verification tag. Change-Id: I093e88a8143994c7e6f4b7b17a0bd5cf861d18e4 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-23 21:13:33 -07:00
Jacob Keller	0e588de17f	i40e: implement support for flexible word payload Add support for flexible payloads passed via ethtool user-def field. This support is somewhat limited due to hardware design. The input set can only be programmed once per filter type, and the flexible offset is part of this filter input set. This means that the user cannot program both a regular and a flexible filter at the same time for a given flow type. Additionally, the user may not program two flexible filters of the same flow type with different offsets, although they are allowed to configure different values at that offset location. We support a single flexible word (2byte) value per protocol type, and we handle the FLX_PIT register using a list of flexible entries so that each flow type may be configured separately. Due to hardware implementation, the flexible data is offset from the start of the packet payload, and thus may not be in part of the header data. For this reason, the offset provided by the user defined data is interpreted as a byte offset from the start of the matching payload. Previous implementations have tried to represent the offset as from the start of the frame, but this is not feasible because header sizes may change due to options. Change-Id: 36ed27995e97de63f9aea5ade5778ff038d6f811 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-23 21:13:33 -07:00
Jacob Keller	097dbf5250	i40e: add counters for UDP/IPv4 and IPv4 filters In preparation for adding code to properly check the mask values, we will need to know the number of active filters for each type. Add counters for each filter type. Rename the already existing fd_tcp_rule to fd_tcp4_filter_cnt to match the style of other names. To avoid style warnings, avoid assigning multiple parameters at once, and fix up one other case where we did so previously. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-20 16:45:22 -07:00
Jacob Keller	377cc24980	i40e: exit ATR mode only when adding TCP/IPv4 filter succeeds Move ATR exit check after we have sent the TCP/IPv4 filter to the ring successfully. This avoids an issue where we potentially update the filter count without actually succeeding in adding the filter. Now, we only increment the fd_tcp_rule after we've succeeded. Additionally, we will re-enable ATR mode only after deletion of the filter is actually posted to the FDIR ring. Change-ID: If5c1dea422081cc5e2de65618b01b4c3bf6bd586 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-20 16:45:20 -07:00
Jacob Keller	e5187ee3ee	i40e: return immediately when failing to add fdir filter Instead of setting err=true and checking this to determine when to free the raw_packet near the end of the function, simply kfree and return immediately. The resulting code is a bit cleaner and has one less variable. This also resolves a subtle bug in the ipv4 case which could fail to add the first filter and then never free the memory, resulting in a small memory leak. Change-ID: I7583aac033481dc794b4acaa14445059c8930ff1 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Avinash Dayanand <avinash.dayanand@intel.com> Reviewed-by: Alan Brady <alan.brady@intel.com> Reviewed-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-20 16:45:20 -07:00
Jacob Keller	8ce43dce6f	i40e: don't use arrays for (src\|dst)_ip The code originally included src_ip and dst_ip with enough space to support ipv6 filters. However, no actual support for ipv6 filters has been implemented. Thus, remove the arrays and just use __be32 values. Should ipv6 support be added in the future, we can replace these with a union that has sizes for both values. Change-Id: I1bc04032244a80eb6ebc8a4e6c723a4a665c1dd5 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-20 16:45:20 -07:00
Harshitha Ramamurthy	b77ac97593	i40e: rename auto_disable_flags to hw_disabled_flags A previous commit introduced a field that tracks the features that are disabled due to HW resource limitations as opposed to the featured disabled by the user. This patch changes the name of the field to make it more readable since it might get confusing when looking at code containing both the flags field and the auto_disable_features field together. Change-ID: Idcc9888659698f6fe3ccff17c8c3f09b5026f708 Signed-off-by: Harshitha Ramamurthy <harshitha.ramamurthy@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-15 02:02:02 -07:00
Alexander Duyck	59605bc096	i40e/i40evf: Add support for mapping pages with DMA attributes This patch adds support for DMA_ATTR_SKIP_CPU_SYNC and DMA_ATTR_WEAK_ORDERING. By enabling both of these for the Rx path we are able to see performance improvements on architectures that implement either one due to the fact that page mapping and unmapping only has to sync what is actually being used instead of the entire buffer. In addition by enabling the weak ordering attribute enables a performance improvement for architectures that can associate a memory ordering with a DMA buffer such as Sparc. Change-ID: If176824e8231c5b24b8a5d55b339a6026738fc75 Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-03-15 02:00:14 -07:00
Jacob Keller	b9c015d421	i40e: mark the value passed to csum_replace_by_diff as __wsum Fix, or rather, avoid a sparse warning caused by the fact that csum_replace_by_diff expects to receive a __wsum value. Since the calculation appears to work, simply typecast the passed paylen value to __wsum to avoid the warning. This seems pretty fishy since __wsum was obviously annotated as a separate type on purpose, so this throws the entire calculation into question. Since it currently appears to behave as expected, the typecast is probably safe. Change-ID: I4fdc5cddd589abc16098176e8a61127e761488f4 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-18 20:35:36 -08:00
Carolyn Wyborny	3c234c4709	i40e: Fix Adaptive ITR enabling This patch fixes a bug introduced with the addition of the per queue ITR feature support in ethtool. With that addition, there were functions added which converted the ITR settings to binary values. The IS_ENABLED macros that run on those values check whether a bit is set or not and with the value being binary, the bit check always returned ITR disabled which prevents any updating of the ITR rate. This patch fixes the problem by changing the functions to return the current ITR value instead and renaming it to better reflect its function. These functions now provide a value which will be accurately asessed and update the ITR as intended. Change-ID: I14f1d088d052e27f652aaa3113e186415ddea1fc Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-18 20:35:36 -08:00
Jacob Keller	a158aeaf5b	i40e: update comment explaining where FDIR buffers are freed The original comment implies that the only location where the raw_packet buffer will be freed is in i40e_clean_tx_ring() which is incorrect. In fact this isn't even the normal case. Update the comment explaining where the memory is freed. Change-ID: Ie0defc35ed1c3af183f81fdc60b6d783707a5595 Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-11 20:39:01 -08:00
Scott Peterson	9b37c93731	i40e/i40evf: eliminate i40e_pull_tail() Reorganize the i40e_pull_tail() logic, doing it in i40e_add_rx_frag() where it's cheaper. The igb driver does this the same way. Also renames i40e_page_is_reserved() to reflect what it actually tests. Change-ID: Icd9cc507aae1fcdc02308b3a09034111b4c24071 Signed-off-by: Scott Peterson <scott.d.peterson@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-11 20:39:01 -08:00
Scott Peterson	e72e56597b	i40e/i40evf: Moves skb from i40e_rx_buffer to i40e_ring This patch reduces the size of struct i40e_rx_buffer by one pointer, and makes the i40e driver a little more consistent with the igb driver in terms of packets that span buffers. We do this by moving the skb field from struct i40e_rx_buffer to struct i40e_ring. We pass the skb we already have (or NULL if we don't) to i40e_fetch_rx_buffer(), which skips the skb allocation if we already have one for this packet. Change-ID: I4ad48a531844494ba0c5d8e1a62209a057f661b0 Signed-off-by: Scott Peterson <scott.d.peterson@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-02-11 20:39:01 -08:00

1 2 3 4 5 ...

311 Commits