425 lines
12 KiB
C
Raw Normal View History

/* SPDX-License-Identifier: GPL-2.0 */
/* Copyright (c) 2018, Sensor-Technik Wiedemann GmbH
* Copyright (c) 2018-2019, Vladimir Oltean <olteanv@gmail.com>
*/
#ifndef _SJA1105_H
#define _SJA1105_H
net: dsa: sja1105: Add support for the PTP clock The design of this PHC driver is influenced by the switch's behavior w.r.t. timestamping. It exposes two PTP counters, one free-running (PTPTSCLK) and the other offset- and frequency-corrected in hardware through PTPCLKVAL, PTPCLKADD and PTPCLKRATE. The MACs can sample either of these for frame timestamps. However, the user manual warns that taking timestamps based on the corrected clock is less than useful, as the switch can deliver corrupted timestamps in a variety of circumstances. Therefore, this PHC uses the free-running PTPTSCLK together with a timecounter/cyclecounter structure that translates it into a software time domain. Thus, the settime/adjtime and adjfine callbacks are hardware no-ops. The timestamps (introduced in a further patch) will also be translated to the correct time domain before being handed over to the userspace PTP stack. The introduction of a second set of PHC operations that operate on the hardware PTPCLKVAL/PTPCLKADD/PTPCLKRATE in the future is somewhat unavoidable, as the TTEthernet core uses the corrected PTP time domain. However, the free-running counter + timecounter structure combination will suffice for now, as the resulting timestamps yield a sub-50 ns synchronization offset in steady state using linuxptp. For this patch, in absence of frame timestamping, the operations of the switch PHC were tested by syncing it to the system time as a local slave clock with: phc2sys -s CLOCK_REALTIME -c swp2 -O 0 -m -S 0.01 Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-08 15:04:34 +03:00
#include <linux/ptp_clock_kernel.h>
#include <linux/timecounter.h>
#include <linux/dsa/sja1105.h>
net: dsa: sja1105: implement cross-chip bridging operations sja1105 uses dsa_8021q for DSA tagging, a format which is VLAN at heart and which is compatible with cascading. A complete description of this tagging format is in net/dsa/tag_8021q.c, but a quick summary is that each external-facing port tags incoming frames with a unique pvid, and this special VLAN is transmitted as tagged towards the inside of the system, and as untagged towards the exterior. The tag encodes the switch id and the source port index. This means that cross-chip bridging for dsa_8021q only entails adding the dsa_8021q pvids of one switch to the RX filter of the other switches. Everything else falls naturally into place, as long as the bottom-end of ports (the leaves in the tree) is comprised exclusively of dsa_8021q-compatible (i.e. sja1105 switches). Otherwise, there would be a chance that a front-panel switch transmits a packet tagged with a dsa_8021q header, header which it wouldn't be able to remove, and which would hence "leak" out. The only use case I tested (due to lack of board availability) was when the sja1105 switches are part of disjoint trees (however, this doesn't change the fact that multiple sja1105 switches still need unique switch identifiers in such a system). But in principle, even "true" single-tree setups (with DSA links) should work just as fine, except for a small change which I can't test: dsa_towards_port should be used instead of dsa_upstream_port (I made the assumption that the routing port that any sja1105 should use towards its neighbours is the CPU port. That might not hold true in other setups). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-10 19:37:43 +03:00
#include <linux/dsa/8021q.h>
#include <net/dsa.h>
#include <linux/mutex.h>
#include "sja1105_static_config.h"
#define SJA1105ET_FDB_BIN_SIZE 4
/* The hardware value is in multiples of 10 ms.
* The passed parameter is in multiples of 1 ms.
*/
#define SJA1105_AGEING_TIME_MS(ms) ((ms) / 10)
net: dsa: sja1105: add support for the SJA1110 switch family The SJA1110 is basically an SJA1105 with more ports, some integrated PHYs (100base-T1 and 100base-TX) and an embedded microcontroller which can be disabled, and the switch core can be controlled by a host running Linux, over SPI. This patch contains: - the static and dynamic config packing functions, for the tables that are common with SJA1105 - one more static config tables which is "unique" to the SJA1110 (actually it is a rehash of stuff that was placed somewhere else in SJA1105): the PCP Remapping Table - a reset and clock configuration procedure for the SJA1110 switch. This resets just the switch subsystem, and gates off the clock which powers on the embedded microcontroller. - an RGMII delay configuration procedure for SJA1110, which is very similar to SJA1105, but different enough for us to be unable to reuse it (this is a pattern that repeats itself) - some adaptations to dynamic config table entries which are no longer programmed in the same way. For example, to delete a VLAN, you used to write an entry through the dynamic reconfiguration interface with the desired VLAN ID, and with the VALIDENT bit set to false. Now, the VLAN table entries contain a TYPE_ENTRY field, which must be set to zero (in a backwards-incompatible way) in order for the entry to be deleted, or to some other entry for the VLAN to match "inner tagged" or "outer tagged" packets. - a similar thing for the static config: the xMII Mode Parameters Table encoding for SGMII and MII (the latter just when attached to a 100base-TX PHY) just isn't what it used to be in SJA1105. They are identical, except there is an extra "special" bit which needs to be set. Set it. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 12:25:36 +03:00
#define SJA1105_NUM_L2_POLICERS SJA1110_MAX_L2_POLICING_COUNT
net: dsa: sja1105: parse {rx, tx}-internal-delay-ps properties for RGMII delays This change does not fix any functional issue or address any real life use case that wasn't possible before. It is just a small step in the process of standardizing the way in which Ethernet MAC drivers may apply RGMII delays (traditionally these have been applied by PHYs, with no clear definition of what to do in the case of a fixed-link). The sja1105 driver used to apply MAC-level RGMII delays on the RX data lines when in fixed-link mode and using a phy-mode of "rgmii-rxid" or "rgmii-id" and on the TX data lines when using "rgmii-txid" or "rgmii-id". But the standard definitions don't say anything about behaving differently when the port is in fixed-link vs when it isn't, and the new device tree bindings are about having a way of applying the delays in a way that is independent of the phy-mode and of the fixed-link property. When the {rx,tx}-internal-delay-ps properties are present, use them, otherwise fall back to the old behavior and warn. One other thing to note is that the SJA1105 hardware applies a delay value in degrees rather than in picoseconds (the delay in ps changes depending on the frequency of the RGMII clock - 125 MHz at 1G, 25 MHz at 100M, 2.5MHz at 10M). I assume that is fine, we calculate the phase shift of the internal delay lines assuming that the device tree meant gigabit, and we let the hardware scale those according to the link speed. Link: https://patchwork.kernel.org/project/netdevbpf/patch/20210723173108.459770-6-prasanna.vengateshan@microchip.com/ Link: https://patchwork.ozlabs.org/project/netdev/patch/20200616074955.GA9092@laureti-dev/#2461123 Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18 22:29:52 +03:00
/* Calculated assuming 1Gbps, where the clock has 125 MHz (8 ns period)
* To avoid floating point operations, we'll multiply the degrees by 10
* to get a "phase" and get 1 decimal point precision.
*/
#define SJA1105_RGMII_DELAY_PS_TO_PHASE(ps) \
(((ps) * 360) / 800)
#define SJA1105_RGMII_DELAY_PHASE_TO_PS(phase) \
((800 * (phase)) / 360)
#define SJA1105_RGMII_DELAY_PHASE_TO_HW(phase) \
(((phase) - 738) / 9)
#define SJA1105_RGMII_DELAY_PS_TO_HW(ps) \
SJA1105_RGMII_DELAY_PHASE_TO_HW(SJA1105_RGMII_DELAY_PS_TO_PHASE(ps))
/* Valid range in degrees is a value between 73.8 and 101.7
* in 0.9 degree increments
*/
#define SJA1105_RGMII_DELAY_MIN_PS \
SJA1105_RGMII_DELAY_PHASE_TO_PS(738)
#define SJA1105_RGMII_DELAY_MAX_PS \
SJA1105_RGMII_DELAY_PHASE_TO_PS(1017)
typedef enum {
SPI_READ = 0,
SPI_WRITE = 1,
} sja1105_spi_rw_mode_t;
net: dsa: sja1105: Configure the Time-Aware Scheduler via tc-taprio offload This qdisc offload is the closest thing to what the SJA1105 supports in hardware for time-based egress shaping. The switch core really is built around SAE AS6802/TTEthernet (a TTTech standard) but can be made to operate similarly to IEEE 802.1Qbv with some constraints: - The gate control list is a global list for all ports. There are 8 execution threads that iterate through this global list in parallel. I don't know why 8, there are only 4 front-panel ports. - Care must be taken by the user to make sure that two execution threads never get to execute a GCL entry simultaneously. I created a O(n^4) checker for this hardware limitation, prior to accepting a taprio offload configuration as valid. - The spec says that if a GCL entry's interval is shorter than the frame length, you shouldn't send it (and end up in head-of-line blocking). Well, this switch does anyway. - The switch has no concept of ADMIN and OPER configurations. Because it's so simple, the TAS settings are loaded through the static config tables interface, so there isn't even place for any discussion about 'graceful switchover between ADMIN and OPER'. You just reset the switch and upload a new OPER config. - The switch accepts multiple time sources for the gate events. Right now I am using the standalone clock source as opposed to PTP. So the base time parameter doesn't really do much. Support for the PTP clock source will be added in a future series. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-15 05:00:02 +03:00
#include "sja1105_tas.h"
#include "sja1105_ptp.h"
net: dsa: sja1105: Configure the Time-Aware Scheduler via tc-taprio offload This qdisc offload is the closest thing to what the SJA1105 supports in hardware for time-based egress shaping. The switch core really is built around SAE AS6802/TTEthernet (a TTTech standard) but can be made to operate similarly to IEEE 802.1Qbv with some constraints: - The gate control list is a global list for all ports. There are 8 execution threads that iterate through this global list in parallel. I don't know why 8, there are only 4 front-panel ports. - Care must be taken by the user to make sure that two execution threads never get to execute a GCL entry simultaneously. I created a O(n^4) checker for this hardware limitation, prior to accepting a taprio offload configuration as valid. - The spec says that if a GCL entry's interval is shorter than the frame length, you shouldn't send it (and end up in head-of-line blocking). Well, this switch does anyway. - The switch has no concept of ADMIN and OPER configurations. Because it's so simple, the TAS settings are loaded through the static config tables interface, so there isn't even place for any discussion about 'graceful switchover between ADMIN and OPER'. You just reset the switch and upload a new OPER config. - The switch accepts multiple time sources for the gate events. Right now I am using the standalone clock source as opposed to PTP. So the base time parameter doesn't really do much. Support for the PTP clock source will be added in a future series. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-15 05:00:02 +03:00
enum sja1105_stats_area {
MAC,
HL1,
HL2,
ETHER,
__MAX_SJA1105_STATS_AREA,
};
/* Keeps the different addresses between E/T and P/Q/R/S */
struct sja1105_regs {
u64 device_id;
u64 prod_id;
u64 status;
u64 port_control;
u64 rgu;
net: dsa: sja1105: implement tc-gate using time-triggered virtual links Restrict the TTEthernet hardware support on this switch to operate as closely as possible to IEEE 802.1Qci as possible. This means that it can perform PTP-time-based ingress admission control on streams identified by {DMAC, VID, PCP}, which is useful when trying to ensure the determinism of traffic scheduled via IEEE 802.1Qbv. The oddity comes from the fact that in hardware (and in TTEthernet at large), virtual links always need a full-blown action, including not only the type of policing, but also the list of destination ports. So in practice, a single tc-gate action will result in all packets getting dropped. Additional actions (either "trap" or "redirect") need to be specified in the same filter rule such that the conforming packets are actually forwarded somewhere. Apart from the VL Lookup, Policing and Forwarding tables which need to be programmed for each flow (virtual link), the Schedule engine also needs to be told to open/close the admission gates for each individual virtual link. A fairly accurate (and detailed) description of how that works is already present in sja1105_tas.c, since it is already used to trigger the egress gates for the tc-taprio offload (IEEE 802.1Qbv). Key point here, we remember that the schedule engine supports 8 "subschedules" (execution threads that iterate through the global schedule in parallel, and that no 2 hardware threads must execute a schedule entry at the same time). For tc-taprio, each egress port used one of these 8 subschedules, leaving a total of 4 subschedules unused. In principle we could have allocated 1 subschedule for the tc-gate offload of each ingress port, but actually the schedules of all virtual links installed on each ingress port would have needed to be merged together, before they could have been programmed to hardware. So simplify our life and just merge the entire tc-gate configuration, for all virtual links on all ingress ports, into a single subschedule. Be sure to check that against the usual hardware scheduling conflicts, and program it to hardware alongside any tc-taprio subschedule that may be present. The following scenarios were tested: 1. Quantitative testing: tc qdisc add dev swp2 clsact tc filter add dev swp2 ingress flower skip_sw \ dst_mac 42:be:24:9b:76:20 \ action gate index 1 base-time 0 \ sched-entry OPEN 1200 -1 -1 \ sched-entry CLOSE 1200 -1 -1 \ action trap ping 192.168.1.2 -f PING 192.168.1.2 (192.168.1.2) 56(84) bytes of data. ............................. --- 192.168.1.2 ping statistics --- 948 packets transmitted, 467 received, 50.7384% packet loss, time 9671ms 2. Qualitative testing (with a phase-aligned schedule - the clocks are synchronized by ptp4l, not shown here): Receiver (sja1105): tc qdisc add dev swp2 clsact now=$(phc_ctl /dev/ptp1 get | awk '/clock time is/ {print $5}') && \ sec=$(echo $now | awk -F. '{print $1}') && \ base_time="$(((sec + 2) * 1000000000))" && \ echo "base time ${base_time}" tc filter add dev swp2 ingress flower skip_sw \ dst_mac 42:be:24:9b:76:20 \ action gate base-time ${base_time} \ sched-entry OPEN 60000 -1 -1 \ sched-entry CLOSE 40000 -1 -1 \ action trap Sender (enetc): now=$(phc_ctl /dev/ptp0 get | awk '/clock time is/ {print $5}') && \ sec=$(echo $now | awk -F. '{print $1}') && \ base_time="$(((sec + 2) * 1000000000))" && \ echo "base time ${base_time}" tc qdisc add dev eno0 parent root taprio \ num_tc 8 \ map 0 1 2 3 4 5 6 7 \ queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \ base-time ${base_time} \ sched-entry S 01 50000 \ sched-entry S 00 50000 \ flags 2 ping -A 192.168.1.1 PING 192.168.1.1 (192.168.1.1): 56 data bytes ... ^C --- 192.168.1.1 ping statistics --- 1425 packets transmitted, 1424 packets received, 0% packet loss round-trip min/avg/max = 0.322/0.361/0.990 ms And just for comparison, with the tc-taprio schedule deleted: ping -A 192.168.1.1 PING 192.168.1.1 (192.168.1.1): 56 data bytes ... ^C --- 192.168.1.1 ping statistics --- 33 packets transmitted, 19 packets received, 42% packet loss round-trip min/avg/max = 0.336/0.464/0.597 ms Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-05 22:20:56 +03:00
u64 vl_status;
u64 config;
u64 rmii_pll1;
net: dsa: sja1105: configure the PTP_CLK pin as EXT_TS or PER_OUT The SJA1105 switch family has a PTP_CLK pin which emits a signal with fixed 50% duty cycle, but variable frequency and programmable start time. On the second generation (P/Q/R/S) switches, this pin supports even more functionality. The use case described by the hardware documents talks about synchronization via oneshot pulses: given 2 sja1105 switches, arbitrarily designated as a master and a slave, the master emits a single pulse on PTP_CLK, while the slave is configured to timestamp this pulse received on its PTP_CLK pin (which must obviously be configured as input). The difference between the timestamps then exactly becomes the slave offset to the master. The only trouble with the above is that the hardware is very much tied into this use case only, and not very generic beyond that: - When emitting a oneshot pulse, instead of being told when to emit it, the switch just does it "now" and tells you later what time it was, via the PTPSYNCTS register. [ Incidentally, this is the same register that the slave uses to collect the ext_ts timestamp from, too. ] - On the sync slave, there is no interrupt mechanism on reception of a new extts, and no FIFO to buffer them, because in the foreseen use case, software is in control of both the master and the slave pins, so it "knows" when there's something to collect. These 2 problems mean that: - We don't support (at least yet) the quirky oneshot mode exposed by the hardware, just normal periodic output. - We abuse the hardware a little bit when we expose generic extts. Because there's no interrupt mechanism, we need to poll at double the frequency we expect to receive a pulse. Currently that means a non-configurable "twice a second". Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-24 00:59:24 +02:00
u64 ptppinst;
u64 ptppindur;
net: dsa: sja1105: Add support for the PTP clock The design of this PHC driver is influenced by the switch's behavior w.r.t. timestamping. It exposes two PTP counters, one free-running (PTPTSCLK) and the other offset- and frequency-corrected in hardware through PTPCLKVAL, PTPCLKADD and PTPCLKRATE. The MACs can sample either of these for frame timestamps. However, the user manual warns that taking timestamps based on the corrected clock is less than useful, as the switch can deliver corrupted timestamps in a variety of circumstances. Therefore, this PHC uses the free-running PTPTSCLK together with a timecounter/cyclecounter structure that translates it into a software time domain. Thus, the settime/adjtime and adjfine callbacks are hardware no-ops. The timestamps (introduced in a further patch) will also be translated to the correct time domain before being handed over to the userspace PTP stack. The introduction of a second set of PHC operations that operate on the hardware PTPCLKVAL/PTPCLKADD/PTPCLKRATE in the future is somewhat unavoidable, as the TTEthernet core uses the corrected PTP time domain. However, the free-running counter + timecounter structure combination will suffice for now, as the resulting timestamps yield a sub-50 ns synchronization offset in steady state using linuxptp. For this patch, in absence of frame timestamping, the operations of the switch PHC were tested by syncing it to the system time as a local slave clock with: phc2sys -s CLOCK_REALTIME -c swp2 -O 0 -m -S 0.01 Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-08 15:04:34 +03:00
u64 ptp_control;
u64 ptpclkval;
net: dsa: sja1105: Add support for the PTP clock The design of this PHC driver is influenced by the switch's behavior w.r.t. timestamping. It exposes two PTP counters, one free-running (PTPTSCLK) and the other offset- and frequency-corrected in hardware through PTPCLKVAL, PTPCLKADD and PTPCLKRATE. The MACs can sample either of these for frame timestamps. However, the user manual warns that taking timestamps based on the corrected clock is less than useful, as the switch can deliver corrupted timestamps in a variety of circumstances. Therefore, this PHC uses the free-running PTPTSCLK together with a timecounter/cyclecounter structure that translates it into a software time domain. Thus, the settime/adjtime and adjfine callbacks are hardware no-ops. The timestamps (introduced in a further patch) will also be translated to the correct time domain before being handed over to the userspace PTP stack. The introduction of a second set of PHC operations that operate on the hardware PTPCLKVAL/PTPCLKADD/PTPCLKRATE in the future is somewhat unavoidable, as the TTEthernet core uses the corrected PTP time domain. However, the free-running counter + timecounter structure combination will suffice for now, as the resulting timestamps yield a sub-50 ns synchronization offset in steady state using linuxptp. For this patch, in absence of frame timestamping, the operations of the switch PHC were tested by syncing it to the system time as a local slave clock with: phc2sys -s CLOCK_REALTIME -c swp2 -O 0 -m -S 0.01 Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-08 15:04:34 +03:00
u64 ptpclkrate;
net: dsa: sja1105: Implement state machine for TAS with PTP clock source Tested using the following bash script and the tc from iproute2-next: #!/bin/bash set -e -u -o pipefail NSEC_PER_SEC="1000000000" gatemask() { local tc_list="$1" local mask=0 for tc in ${tc_list}; do mask=$((${mask} | (1 << ${tc}))) done printf "%02x" ${mask} } if ! systemctl is-active --quiet ptp4l; then echo "Please start the ptp4l service" exit fi now=$(phc_ctl /dev/ptp1 get | gawk '/clock time is/ { print $5; }') # Phase-align the base time to the start of the next second. sec=$(echo "${now}" | gawk -F. '{ print $1; }') base_time="$(((${sec} + 1) * ${NSEC_PER_SEC}))" tc qdisc add dev swp5 parent root handle 100 taprio \ num_tc 8 \ map 0 1 2 3 5 6 7 \ queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \ base-time ${base_time} \ sched-entry S $(gatemask 7) 100000 \ sched-entry S $(gatemask "0 1 2 3 4 5 6") 400000 \ clockid CLOCK_TAI flags 2 The "state machine" is a workqueue invoked after each manipulation command on the PTP clock (reset, adjust time, set time, adjust frequency) which checks over the state of the time-aware scheduler. So it is not monitored periodically, only in reaction to a PTP command typically triggered from a userspace daemon (linuxptp). Otherwise there is no reason for things to go wrong. Now that the timecounter/cyclecounter has been replaced with hardware operations on the PTP clock, the TAS Kconfig now depends upon PTP and the standalone clocksource operating mode has been removed. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-12 02:11:54 +02:00
u64 ptpclkcorp;
net: dsa: sja1105: configure the PTP_CLK pin as EXT_TS or PER_OUT The SJA1105 switch family has a PTP_CLK pin which emits a signal with fixed 50% duty cycle, but variable frequency and programmable start time. On the second generation (P/Q/R/S) switches, this pin supports even more functionality. The use case described by the hardware documents talks about synchronization via oneshot pulses: given 2 sja1105 switches, arbitrarily designated as a master and a slave, the master emits a single pulse on PTP_CLK, while the slave is configured to timestamp this pulse received on its PTP_CLK pin (which must obviously be configured as input). The difference between the timestamps then exactly becomes the slave offset to the master. The only trouble with the above is that the hardware is very much tied into this use case only, and not very generic beyond that: - When emitting a oneshot pulse, instead of being told when to emit it, the switch just does it "now" and tells you later what time it was, via the PTPSYNCTS register. [ Incidentally, this is the same register that the slave uses to collect the ext_ts timestamp from, too. ] - On the sync slave, there is no interrupt mechanism on reception of a new extts, and no FIFO to buffer them, because in the foreseen use case, software is in control of both the master and the slave pins, so it "knows" when there's something to collect. These 2 problems mean that: - We don't support (at least yet) the quirky oneshot mode exposed by the hardware, just normal periodic output. - We abuse the hardware a little bit when we expose generic extts. Because there's no interrupt mechanism, we need to poll at double the frequency we expect to receive a pulse. Currently that means a non-configurable "twice a second". Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-24 00:59:24 +02:00
u64 ptpsyncts;
net: dsa: sja1105: Implement state machine for TAS with PTP clock source Tested using the following bash script and the tc from iproute2-next: #!/bin/bash set -e -u -o pipefail NSEC_PER_SEC="1000000000" gatemask() { local tc_list="$1" local mask=0 for tc in ${tc_list}; do mask=$((${mask} | (1 << ${tc}))) done printf "%02x" ${mask} } if ! systemctl is-active --quiet ptp4l; then echo "Please start the ptp4l service" exit fi now=$(phc_ctl /dev/ptp1 get | gawk '/clock time is/ { print $5; }') # Phase-align the base time to the start of the next second. sec=$(echo "${now}" | gawk -F. '{ print $1; }') base_time="$(((${sec} + 1) * ${NSEC_PER_SEC}))" tc qdisc add dev swp5 parent root handle 100 taprio \ num_tc 8 \ map 0 1 2 3 5 6 7 \ queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \ base-time ${base_time} \ sched-entry S $(gatemask 7) 100000 \ sched-entry S $(gatemask "0 1 2 3 4 5 6") 400000 \ clockid CLOCK_TAI flags 2 The "state machine" is a workqueue invoked after each manipulation command on the PTP clock (reset, adjust time, set time, adjust frequency) which checks over the state of the time-aware scheduler. So it is not monitored periodically, only in reaction to a PTP command typically triggered from a userspace daemon (linuxptp). Otherwise there is no reason for things to go wrong. Now that the timecounter/cyclecounter has been replaced with hardware operations on the PTP clock, the TAS Kconfig now depends upon PTP and the standalone clocksource operating mode has been removed. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-12 02:11:54 +02:00
u64 ptpschtm;
u64 ptpegr_ts[SJA1105_MAX_NUM_PORTS];
u64 pad_mii_tx[SJA1105_MAX_NUM_PORTS];
u64 pad_mii_rx[SJA1105_MAX_NUM_PORTS];
u64 pad_mii_id[SJA1105_MAX_NUM_PORTS];
u64 cgu_idiv[SJA1105_MAX_NUM_PORTS];
u64 mii_tx_clk[SJA1105_MAX_NUM_PORTS];
u64 mii_rx_clk[SJA1105_MAX_NUM_PORTS];
u64 mii_ext_tx_clk[SJA1105_MAX_NUM_PORTS];
u64 mii_ext_rx_clk[SJA1105_MAX_NUM_PORTS];
u64 rgmii_tx_clk[SJA1105_MAX_NUM_PORTS];
u64 rmii_ref_clk[SJA1105_MAX_NUM_PORTS];
u64 rmii_ext_tx_clk[SJA1105_MAX_NUM_PORTS];
u64 stats[__MAX_SJA1105_STATS_AREA][SJA1105_MAX_NUM_PORTS];
u64 mdio_100base_tx;
u64 mdio_100base_t1;
u64 pcs_base[SJA1105_MAX_NUM_PORTS];
};
struct sja1105_mdio_private {
struct sja1105_private *priv;
};
enum {
SJA1105_SPEED_AUTO,
SJA1105_SPEED_10MBPS,
SJA1105_SPEED_100MBPS,
SJA1105_SPEED_1000MBPS,
SJA1105_SPEED_2500MBPS,
SJA1105_SPEED_MAX,
};
enum sja1105_internal_phy_t {
SJA1105_NO_PHY = 0,
SJA1105_PHY_BASE_TX,
SJA1105_PHY_BASE_T1,
};
struct sja1105_info {
u64 device_id;
/* Needed for distinction between P and R, and between Q and S
* (since the parts with/without SGMII share the same
* switch core and device_id)
*/
u64 part_no;
/* E/T and P/Q/R/S have partial timestamps of different sizes.
* They must be reconstructed on both families anyway to get the full
* 64-bit values back.
*/
int ptp_ts_bits;
/* Also SPI commands are of different sizes to retrieve
* the egress timestamps.
*/
int ptpegr_ts_bytes;
int num_cbs_shapers;
int max_frame_mem;
net: dsa: sja1105: add support for the SJA1110 switch family The SJA1110 is basically an SJA1105 with more ports, some integrated PHYs (100base-T1 and 100base-TX) and an embedded microcontroller which can be disabled, and the switch core can be controlled by a host running Linux, over SPI. This patch contains: - the static and dynamic config packing functions, for the tables that are common with SJA1105 - one more static config tables which is "unique" to the SJA1110 (actually it is a rehash of stuff that was placed somewhere else in SJA1105): the PCP Remapping Table - a reset and clock configuration procedure for the SJA1110 switch. This resets just the switch subsystem, and gates off the clock which powers on the embedded microcontroller. - an RGMII delay configuration procedure for SJA1110, which is very similar to SJA1105, but different enough for us to be unable to reuse it (this is a pattern that repeats itself) - some adaptations to dynamic config table entries which are no longer programmed in the same way. For example, to delete a VLAN, you used to write an entry through the dynamic reconfiguration interface with the desired VLAN ID, and with the VALIDENT bit set to false. Now, the VLAN table entries contain a TYPE_ENTRY field, which must be set to zero (in a backwards-incompatible way) in order for the entry to be deleted, or to some other entry for the VLAN to match "inner tagged" or "outer tagged" packets. - a similar thing for the static config: the xMII Mode Parameters Table encoding for SGMII and MII (the latter just when attached to a 100base-TX PHY) just isn't what it used to be in SJA1105. They are identical, except there is an extra "special" bit which needs to be set. Set it. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 12:25:36 +03:00
int num_ports;
bool multiple_cascade_ports;
/* Every {port, TXQ} has its own CBS shaper */
bool fixed_cbs_mapping;
net: dsa: add support for the SJA1110 native tagging protocol The SJA1110 has improved a few things compared to SJA1105: - To send a control packet from the host port with SJA1105, one needed to program a one-shot "management route" over SPI. This is no longer true with SJA1110, you can actually send "in-band control extensions" in the packets sent by DSA, these are in fact DSA tags which contain the destination port and switch ID. - When receiving a control packet from the switch with SJA1105, the source port and switch ID were written in bytes 3 and 4 of the destination MAC address of the frame (which was a very poor shot at a DSA header). If the control packet also had an RX timestamp, that timestamp was sent in an actual follow-up packet, so there were reordering concerns on multi-core/multi-queue DSA masters, where the metadata frame with the RX timestamp might get processed before the actual packet to which that timestamp belonged (there is no way to pair a packet to its timestamp other than the order in which they were received). On SJA1110, this is no longer true, control packets have the source port, switch ID and timestamp all in the DSA tags. - Timestamps from the switch were partial: to get a 64-bit timestamp as required by PTP stacks, one would need to take the partial 24-bit or 32-bit timestamp from the packet, then read the current PTP time very quickly, and then patch in the high bits of the current PTP time into the captured partial timestamp, to reconstruct what the full 64-bit timestamp must have been. That is awful because packet processing is done in NAPI context, but reading the current PTP time is done over SPI and therefore needs sleepable context. But it also aggravated a few things: - Not only is there a DSA header in SJA1110, but there is a DSA trailer in fact, too. So DSA needs to be extended to support taggers which have both a header and a trailer. Very unconventional - my understanding is that the trailer exists because the timestamps couldn't be prepared in time for putting them in the header area. - Like SJA1105, not all packets sent to the CPU have the DSA tag added to them, only control packets do: * the ones which match the destination MAC filters/traps in MAC_FLTRES1 and MAC_FLTRES0 * the ones which match FDB entries which have TRAP or TAKETS bits set So we could in theory hack something up to request the switch to take timestamps for all packets that reach the CPU, and those would be DSA-tagged and contain the source port / switch ID by virtue of the fact that there needs to be a timestamp trailer provided. BUT: - The SJA1110 does not parse its own DSA tags in a way that is useful for routing in cross-chip topologies, a la Marvell. And the sja1105 driver already supports cross-chip bridging from the SJA1105 days. It does that by automatically setting up the DSA links as VLAN trunks which contain all the necessary tag_8021q RX VLANs that must be communicated between the switches that span the same bridge. So when using tag_8021q on sja1105, it is possible to have 2 switches with ports sw0p0, sw0p1, sw1p0, sw1p1, and 2 VLAN-unaware bridges br0 and br1, and br0 can take sw0p0 and sw1p0, and br1 can take sw0p1 and sw1p1, and forwarding will happen according to the expected rules of the Linux bridge. We like that, and we don't want that to go away, so as a matter of fact, the SJA1110 tagger still needs to support tag_8021q. So the sja1110 tagger is a hybrid between tag_8021q for data packets, and the native hardware support for control packets. On RX, packets have a 13-byte trailer if they contain an RX timestamp. That trailer is padded in such a way that its byte 8 (the start of the "residence time" field - not parsed by Linux because we don't care) is aligned on a 16 byte boundary. So the padding has a variable length between 0 and 15 bytes. The DSA header contains the offset of the beginning of the padding relative to the beginning of the frame (and the end of the padding is obviously the end of the packet minus 13 bytes, the length of the trailer). So we discard it. Packets which don't have a trailer contain the source port and switch ID information in the header (they are "trap-to-host" packets). Packets which have a trailer contain the source port and switch ID in the trailer. On TX, the destination port mask and switch ID is always in the trailer, so we always need to say in the header that a trailer is present. The header needs a custom EtherType and this was chosen as 0xdadc, after 0xdada which is for Marvell and 0xdadb which is for VLANs in VLAN-unaware mode on SJA1105 (and SJA1110 in fact too). Because we use tag_8021q in concert with the native tagging protocol, control packets will have 2 DSA tags. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11 22:01:29 +03:00
enum dsa_tag_protocol tag_proto;
const struct sja1105_dynamic_table_ops *dyn_ops;
const struct sja1105_table_ops *static_ops;
const struct sja1105_regs *regs;
2021-02-12 17:16:00 +02:00
bool can_limit_mcast_flood;
int (*reset_cmd)(struct dsa_switch *ds);
net: dsa: sja1105: Error out if RGMII delays are requested in DT Documentation/devicetree/bindings/net/ethernet.txt is confusing because it says what the MAC should not do, but not what it *should* do: * "rgmii-rxid" (RGMII with internal RX delay provided by the PHY, the MAC should not add an RX delay in this case) The gap in semantics is threefold: 1. Is it illegal for the MAC to apply the Rx internal delay by itself, and simplify the phy_mode (mask off "rgmii-rxid" into "rgmii") before passing it to of_phy_connect? The documentation would suggest yes. 1. For "rgmii-rxid", while the situation with the Rx clock skew is more or less clear (needs to be added by the PHY), what should the MAC driver do about the Tx delays? Is it an implicit wild card for the MAC to apply delays in the Tx direction if it can? What if those were already added as serpentine PCB traces, how could that be made more obvious through DT bindings so that the MAC doesn't attempt to add them twice and again potentially break the link? 3. If the interface is a fixed-link and therefore the PHY object is fixed (a purely software entity that obviously cannot add clock skew), what is the meaning of the above property? So an interpretation of the RGMII bindings was chosen that hopefully does not contradict their intention but also makes them more applied. The SJA1105 driver understands to act upon "rgmii-*id" phy-mode bindings if the port is in the PHY role (either explicitly, or if it is a fixed-link). Otherwise it always passes the duty of setting up delays to the PHY driver. The error behavior that this patch adds is required on SJA1105E/T where the MAC really cannot apply internal delays. If the other end of the fixed-link cannot apply RGMII delays either (this would be specified through its own DT bindings), then the situation requires PCB delays. For SJA1105P/Q/R/S, this is however hardware supported and the error is thus only temporary. I created a stub function pointer for configuring delays per-port on RXC and TXC, and will implement it when I have access to a board with this hardware setup. Meanwhile do not allow the user to select an invalid configuration. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-02 23:23:32 +03:00
int (*setup_rgmii_delay)(const void *ctx, int port);
/* Prototypes from include/net/dsa.h */
int (*fdb_add_cmd)(struct dsa_switch *ds, int port,
const unsigned char *addr, u16 vid);
int (*fdb_del_cmd)(struct dsa_switch *ds, int port,
const unsigned char *addr, u16 vid);
void (*ptp_cmd_packing)(u8 *buf, struct sja1105_ptp_cmd *cmd,
enum packing_op op);
bool (*rxtstamp)(struct dsa_switch *ds, int port, struct sk_buff *skb);
net: dsa: sja1105: implement TX timestamping for SJA1110 The TX timestamping procedure for SJA1105 is a bit unconventional because the transmit procedure itself is unconventional. Control packets (and therefore PTP as well) are transmitted to a specific port in SJA1105 using "management routes" which must be written over SPI to the switch. These are one-shot rules that match by destination MAC address on traffic coming from the CPU port, and select the precise destination port for that packet. So to transmit a packet from NET_TX softirq context, we actually need to defer to a process context so that we can perform that SPI write before we send the packet. The DSA master dev_queue_xmit() runs in process context, and we poll until the switch confirms it took the TX timestamp, then we annotate the skb clone with that TX timestamp. This is why the sja1105 driver does not need an skb queue for TX timestamping. But the SJA1110 is a bit (not much!) more conventional, and you can request 2-step TX timestamping through the DSA header, as well as give the switch a cookie (timestamp ID) which it will give back to you when it has the timestamp. So now we do need a queue for keeping the skb clones until their TX timestamps become available. The interesting part is that the metadata frames from SJA1105 haven't disappeared completely. On SJA1105 they were used as follow-ups which contained RX timestamps, but on SJA1110 they are actually TX completion packets, which contain a variable (up to 32) array of timestamps. Why an array? Because: - not only is the TX timestamp on the egress port being communicated, but also the RX timestamp on the CPU port. Nice, but we don't care about that, so we ignore it. - because a packet could be multicast to multiple egress ports, each port takes its own timestamp, and the TX completion packet contains the individual timestamps on each port. This is unconventional because switches typically have a timestamping FIFO and raise an interrupt, but this one doesn't. So the tagger needs to detect and parse meta frames, and call into the main switch driver, which pairs the timestamps with the skbs in the TX timestamping queue which are waiting for one. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11 22:01:31 +03:00
void (*txtstamp)(struct dsa_switch *ds, int port, struct sk_buff *skb);
int (*clocking_setup)(struct sja1105_private *priv);
int (*pcs_mdio_read_c45)(struct mii_bus *bus, int phy, int mmd,
int reg);
int (*pcs_mdio_write_c45)(struct mii_bus *bus, int phy, int mmd,
int reg, u16 val);
net: dsa: sja1105: properly power down the microcontroller clock for SJA1110 It turns out that powering down the BASE_TIMER_CLK does not turn off the microcontroller, just its timers, including the one for the watchdog. So the embedded microcontroller is still running, and potentially still doing things. To prevent unwanted interference, we should power down the BASE_MCSS_CLK as well (MCSS = microcontroller subsystem). The trouble is that currently we turn off the BASE_TIMER_CLK for SJA1110 from the .clocking_setup() method, mostly because this is a Clock Generation Unit (CGU) setting which was traditionally configured in that method for SJA1105. But in SJA1105, the CGU was used for bringing up the port clocks at the proper speeds, and in SJA1110 it's not (but rather for initial configuration), so it's best that we rebrand the sja1110_clocking_setup() method into what it really is - an implementation of the .disable_microcontroller() method. Since disabling the microcontroller only needs to be done once, at probe time, we can choose the best place to do that as being in sja1105_setup(), before we upload the static config to the device. This guarantees that the static config being used by the switch afterwards is really ours. Note that the procedure to upload a static config necessarily resets the switch. This already did not reset the microcontroller, only the switch core, so since the .disable_microcontroller() method is guaranteed to be called by that point, if it's disabled, it remains disabled. Add a comment to make that clear. With the code movement for SJA1110 from .clocking_setup() to .disable_microcontroller(), both methods are optional and are guarded by "if" conditions. Tested by enabling in the device tree the rev-mii switch port 0 that goes towards the microcontroller, and flashing a firmware that would have networking. Without this patch, the microcontroller can be pinged, with this patch it cannot. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-18 14:52:54 +03:00
int (*disable_microcontroller)(struct sja1105_private *priv);
const char *name;
bool supports_mii[SJA1105_MAX_NUM_PORTS];
bool supports_rmii[SJA1105_MAX_NUM_PORTS];
bool supports_rgmii[SJA1105_MAX_NUM_PORTS];
bool supports_sgmii[SJA1105_MAX_NUM_PORTS];
bool supports_2500basex[SJA1105_MAX_NUM_PORTS];
enum sja1105_internal_phy_t internal_phy[SJA1105_MAX_NUM_PORTS];
const u64 port_speed[SJA1105_SPEED_MAX];
};
enum sja1105_key_type {
SJA1105_KEY_BCAST,
SJA1105_KEY_TC,
SJA1105_KEY_VLAN_UNAWARE_VL,
SJA1105_KEY_VLAN_AWARE_VL,
};
struct sja1105_key {
enum sja1105_key_type type;
union {
/* SJA1105_KEY_TC */
struct {
int pcp;
} tc;
/* SJA1105_KEY_VLAN_UNAWARE_VL */
/* SJA1105_KEY_VLAN_AWARE_VL */
struct {
u64 dmac;
u16 vid;
u16 pcp;
} vl;
};
};
enum sja1105_rule_type {
SJA1105_RULE_BCAST_POLICER,
SJA1105_RULE_TC_POLICER,
SJA1105_RULE_VL,
};
enum sja1105_vl_type {
SJA1105_VL_NONCRITICAL,
SJA1105_VL_RATE_CONSTRAINED,
SJA1105_VL_TIME_TRIGGERED,
};
struct sja1105_rule {
struct list_head list;
unsigned long cookie;
unsigned long port_mask;
struct sja1105_key key;
enum sja1105_rule_type type;
/* Action */
union {
/* SJA1105_RULE_BCAST_POLICER */
struct {
int sharindx;
} bcast_pol;
/* SJA1105_RULE_TC_POLICER */
struct {
int sharindx;
} tc_pol;
/* SJA1105_RULE_VL */
struct {
enum sja1105_vl_type type;
net: dsa: sja1105: implement tc-gate using time-triggered virtual links Restrict the TTEthernet hardware support on this switch to operate as closely as possible to IEEE 802.1Qci as possible. This means that it can perform PTP-time-based ingress admission control on streams identified by {DMAC, VID, PCP}, which is useful when trying to ensure the determinism of traffic scheduled via IEEE 802.1Qbv. The oddity comes from the fact that in hardware (and in TTEthernet at large), virtual links always need a full-blown action, including not only the type of policing, but also the list of destination ports. So in practice, a single tc-gate action will result in all packets getting dropped. Additional actions (either "trap" or "redirect") need to be specified in the same filter rule such that the conforming packets are actually forwarded somewhere. Apart from the VL Lookup, Policing and Forwarding tables which need to be programmed for each flow (virtual link), the Schedule engine also needs to be told to open/close the admission gates for each individual virtual link. A fairly accurate (and detailed) description of how that works is already present in sja1105_tas.c, since it is already used to trigger the egress gates for the tc-taprio offload (IEEE 802.1Qbv). Key point here, we remember that the schedule engine supports 8 "subschedules" (execution threads that iterate through the global schedule in parallel, and that no 2 hardware threads must execute a schedule entry at the same time). For tc-taprio, each egress port used one of these 8 subschedules, leaving a total of 4 subschedules unused. In principle we could have allocated 1 subschedule for the tc-gate offload of each ingress port, but actually the schedules of all virtual links installed on each ingress port would have needed to be merged together, before they could have been programmed to hardware. So simplify our life and just merge the entire tc-gate configuration, for all virtual links on all ingress ports, into a single subschedule. Be sure to check that against the usual hardware scheduling conflicts, and program it to hardware alongside any tc-taprio subschedule that may be present. The following scenarios were tested: 1. Quantitative testing: tc qdisc add dev swp2 clsact tc filter add dev swp2 ingress flower skip_sw \ dst_mac 42:be:24:9b:76:20 \ action gate index 1 base-time 0 \ sched-entry OPEN 1200 -1 -1 \ sched-entry CLOSE 1200 -1 -1 \ action trap ping 192.168.1.2 -f PING 192.168.1.2 (192.168.1.2) 56(84) bytes of data. ............................. --- 192.168.1.2 ping statistics --- 948 packets transmitted, 467 received, 50.7384% packet loss, time 9671ms 2. Qualitative testing (with a phase-aligned schedule - the clocks are synchronized by ptp4l, not shown here): Receiver (sja1105): tc qdisc add dev swp2 clsact now=$(phc_ctl /dev/ptp1 get | awk '/clock time is/ {print $5}') && \ sec=$(echo $now | awk -F. '{print $1}') && \ base_time="$(((sec + 2) * 1000000000))" && \ echo "base time ${base_time}" tc filter add dev swp2 ingress flower skip_sw \ dst_mac 42:be:24:9b:76:20 \ action gate base-time ${base_time} \ sched-entry OPEN 60000 -1 -1 \ sched-entry CLOSE 40000 -1 -1 \ action trap Sender (enetc): now=$(phc_ctl /dev/ptp0 get | awk '/clock time is/ {print $5}') && \ sec=$(echo $now | awk -F. '{print $1}') && \ base_time="$(((sec + 2) * 1000000000))" && \ echo "base time ${base_time}" tc qdisc add dev eno0 parent root taprio \ num_tc 8 \ map 0 1 2 3 4 5 6 7 \ queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \ base-time ${base_time} \ sched-entry S 01 50000 \ sched-entry S 00 50000 \ flags 2 ping -A 192.168.1.1 PING 192.168.1.1 (192.168.1.1): 56 data bytes ... ^C --- 192.168.1.1 ping statistics --- 1425 packets transmitted, 1424 packets received, 0% packet loss round-trip min/avg/max = 0.322/0.361/0.990 ms And just for comparison, with the tc-taprio schedule deleted: ping -A 192.168.1.1 PING 192.168.1.1 (192.168.1.1): 56 data bytes ... ^C --- 192.168.1.1 ping statistics --- 33 packets transmitted, 19 packets received, 42% packet loss round-trip min/avg/max = 0.336/0.464/0.597 ms Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-05 22:20:56 +03:00
unsigned long destports;
int sharindx;
int maxlen;
int ipv;
u64 base_time;
u64 cycle_time;
int num_entries;
struct action_gate_entry *entries;
struct flow_stats stats;
} vl;
};
};
struct sja1105_flow_block {
struct list_head rules;
bool l2_policer_used[SJA1105_NUM_L2_POLICERS];
int num_virtual_links;
};
struct sja1105_private {
struct sja1105_static_config static_config;
net: dsa: sja1105: parse {rx, tx}-internal-delay-ps properties for RGMII delays This change does not fix any functional issue or address any real life use case that wasn't possible before. It is just a small step in the process of standardizing the way in which Ethernet MAC drivers may apply RGMII delays (traditionally these have been applied by PHYs, with no clear definition of what to do in the case of a fixed-link). The sja1105 driver used to apply MAC-level RGMII delays on the RX data lines when in fixed-link mode and using a phy-mode of "rgmii-rxid" or "rgmii-id" and on the TX data lines when using "rgmii-txid" or "rgmii-id". But the standard definitions don't say anything about behaving differently when the port is in fixed-link vs when it isn't, and the new device tree bindings are about having a way of applying the delays in a way that is independent of the phy-mode and of the fixed-link property. When the {rx,tx}-internal-delay-ps properties are present, use them, otherwise fall back to the old behavior and warn. One other thing to note is that the SJA1105 hardware applies a delay value in degrees rather than in picoseconds (the delay in ps changes depending on the frequency of the RGMII clock - 125 MHz at 1G, 25 MHz at 100M, 2.5MHz at 10M). I assume that is fine, we calculate the phase shift of the internal delay lines assuming that the device tree meant gigabit, and we let the hardware scale those according to the link speed. Link: https://patchwork.kernel.org/project/netdevbpf/patch/20210723173108.459770-6-prasanna.vengateshan@microchip.com/ Link: https://patchwork.ozlabs.org/project/netdev/patch/20200616074955.GA9092@laureti-dev/#2461123 Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18 22:29:52 +03:00
int rgmii_rx_delay_ps[SJA1105_MAX_NUM_PORTS];
int rgmii_tx_delay_ps[SJA1105_MAX_NUM_PORTS];
phy_interface_t phy_mode[SJA1105_MAX_NUM_PORTS];
bool fixed_link[SJA1105_MAX_NUM_PORTS];
unsigned long ucast_egress_floods;
unsigned long bcast_egress_floods;
unsigned long hwts_tx_en;
net: dsa: sja1105: always enable the send_meta options incl_srcpt has the limitation, mentioned in commit b4638af8885a ("net: dsa: sja1105: always enable the INCL_SRCPT option"), that frames with a MAC DA of 01:80:c2:xx:yy:zz will be received as 01:80:c2:00:00:zz unless PTP RX timestamping is enabled. The incl_srcpt option was initially unconditionally enabled, then that changed with commit 42824463d38d ("net: dsa: sja1105: Limit use of incl_srcpt to bridge+vlan mode"), then again with b4638af8885a ("net: dsa: sja1105: always enable the INCL_SRCPT option"). Bottom line is that it now needs to be always enabled, otherwise the driver does not have a reliable source of information regarding source_port and switch_id for link-local traffic (tag_8021q VLANs may be imprecise since now they identify an entire bridging domain when ports are not standalone). If we accept that PTP RX timestamping (and therefore, meta frame generation) is always enabled in hardware, then that limitation could be avoided and packets with any MAC DA can be properly received, because meta frames do contain the original bytes from the MAC DA of their associated link-local packet. This change enables meta frame generation unconditionally, which also has the nice side effects of simplifying the switch control path (a switch reset is no longer required on hwtstamping settings change) and the tagger data path (it no longer needs to be informed whether to expect meta frames or not - it always does). Fixes: 227d07a07ef1 ("net: dsa: sja1105: Add support for traffic through standalone ports") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-07-04 01:05:45 +03:00
unsigned long hwts_rx_en;
const struct sja1105_info *info;
size_t max_xfer_len;
struct spi_device *spidev;
struct dsa_switch *ds;
net: dsa: sja1105: delete vlan delta save/restore logic With the best_effort_vlan_filtering mode now gone, the driver does not have 3 operating modes anymore (VLAN-unaware, VLAN-aware and best effort), but only 2. The idea is that we will gain support for network stack I/O through a VLAN-aware bridge, using the data plane offload framework (imprecise RX, imprecise TX). So the VLAN-aware use case will be more functional. But standalone ports that are part of the same switch when some other ports are under a VLAN-aware bridge should work too. Termination on those should work through the tag_8021q RX VLAN and TX VLAN. This was not possible using the old logic, because: - in VLAN-unaware mode, only the tag_8021q VLANs were committed to hw - in VLAN-aware mode, only the bridge VLANs were committed to hw - in best-effort VLAN mode, both the tag_8021q and bridge VLANs were committed to hw The strategy for the new VLAN-aware mode is to allow the bridge and the tag_8021q VLANs to coexist in the VLAN table at the same time. [ yes, we need to make sure that the bridge cannot install a tag_8021q VLAN, but ] This means that the save/restore logic introduced by commit ec5ae61076d0 ("net: dsa: sja1105: save/restore VLANs using a delta commit method") does not serve a purpose any longer. We can delete it and restore the old code that simply adds a VLAN to the VLAN table and calls it a day. Note that we keep the sja1105_commit_pvid() function from those days, but adapt it slightly. Ports that are under a VLAN-aware bridge use the bridge's pvid, ports that are standalone or under a VLAN-unaware bridge use the tag_8021q pvid, for local termination or VLAN-unaware forwarding. Now, when the vlan_filtering property is toggled for the bridge, the pvid of the ports beneath it is the only thing that's changing, we no longer delete some VLANs and restore others. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-26 19:55:31 +03:00
u16 bridge_pvid[SJA1105_MAX_NUM_PORTS];
u16 tag_8021q_pvid[SJA1105_MAX_NUM_PORTS];
struct sja1105_flow_block flow_block;
/* Serializes transmission of management frames so that
* the switch doesn't confuse them with one another.
*/
struct mutex mgmt_lock;
/* PTP two-step TX timestamp ID, and its serialization lock */
spinlock_t ts_id_lock;
u8 ts_id;
net: dsa: sja1105: serialize access to the dynamic config interface The sja1105 hardware seems as concurrent as can be, but when we create a background script that adds/removes a rain of FDB entries without the rtnl_mutex taken, then in parallel we do another operation like run 'bridge fdb show', we can notice these errors popping up: sja1105 spi2.0: port 2 failed to read back entry for 00:01:02:03:00:40 vid 0: -ENOENT sja1105 spi2.0: port 2 failed to add 00:01:02:03:00:40 vid 0 to fdb: -2 sja1105 spi2.0: port 2 failed to read back entry for 00:01:02:03:00:46 vid 0: -ENOENT sja1105 spi2.0: port 2 failed to add 00:01:02:03:00:46 vid 0 to fdb: -2 Luckily what is going on does not require a major rework in the driver. The sja1105_dynamic_config_read() function sends multiple SPI buffers to the peripheral until the operation completes. We should not do anything until the hardware clears the VALID bit. But since there is no locking (i.e. right now we are implicitly serialized by the rtnl_mutex, but if we remove that), it might be possible that the process which performs the dynamic config read is preempted and another one performs a dynamic config write. What will happen in that case is that sja1105_dynamic_config_read(), when it resumes, expects to see VALIDENT set for the entry it reads back. But it won't. This can be corrected by introducing a mutex for serializing SPI accesses to the dynamic config interface which should be atomic with respect to each other. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-24 20:17:50 +03:00
/* Serializes access to the dynamic config interface */
struct mutex dynamic_config_lock;
struct devlink_region **regions;
struct sja1105_cbs_entry *cbs;
struct mii_bus *mdio_base_t1;
struct mii_bus *mdio_base_tx;
struct mii_bus *mdio_pcs;
struct dw_xpcs *xpcs[SJA1105_MAX_NUM_PORTS];
struct sja1105_ptp_data ptp_data;
net: dsa: sja1105: Configure the Time-Aware Scheduler via tc-taprio offload This qdisc offload is the closest thing to what the SJA1105 supports in hardware for time-based egress shaping. The switch core really is built around SAE AS6802/TTEthernet (a TTTech standard) but can be made to operate similarly to IEEE 802.1Qbv with some constraints: - The gate control list is a global list for all ports. There are 8 execution threads that iterate through this global list in parallel. I don't know why 8, there are only 4 front-panel ports. - Care must be taken by the user to make sure that two execution threads never get to execute a GCL entry simultaneously. I created a O(n^4) checker for this hardware limitation, prior to accepting a taprio offload configuration as valid. - The spec says that if a GCL entry's interval is shorter than the frame length, you shouldn't send it (and end up in head-of-line blocking). Well, this switch does anyway. - The switch has no concept of ADMIN and OPER configurations. Because it's so simple, the TAS settings are loaded through the static config tables interface, so there isn't even place for any discussion about 'graceful switchover between ADMIN and OPER'. You just reset the switch and upload a new OPER config. - The switch accepts multiple time sources for the gate events. Right now I am using the standalone clock source as opposed to PTP. So the base time parameter doesn't really do much. Support for the PTP clock source will be added in a future series. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-15 05:00:02 +03:00
struct sja1105_tas_data tas_data;
};
#include "sja1105_dynamic_config.h"
struct sja1105_spi_message {
u64 access;
u64 read_count;
u64 address;
};
net: dsa: sja1105: Configure the Time-Aware Scheduler via tc-taprio offload This qdisc offload is the closest thing to what the SJA1105 supports in hardware for time-based egress shaping. The switch core really is built around SAE AS6802/TTEthernet (a TTTech standard) but can be made to operate similarly to IEEE 802.1Qbv with some constraints: - The gate control list is a global list for all ports. There are 8 execution threads that iterate through this global list in parallel. I don't know why 8, there are only 4 front-panel ports. - Care must be taken by the user to make sure that two execution threads never get to execute a GCL entry simultaneously. I created a O(n^4) checker for this hardware limitation, prior to accepting a taprio offload configuration as valid. - The spec says that if a GCL entry's interval is shorter than the frame length, you shouldn't send it (and end up in head-of-line blocking). Well, this switch does anyway. - The switch has no concept of ADMIN and OPER configurations. Because it's so simple, the TAS settings are loaded through the static config tables interface, so there isn't even place for any discussion about 'graceful switchover between ADMIN and OPER'. You just reset the switch and upload a new OPER config. - The switch accepts multiple time sources for the gate events. Right now I am using the standalone clock source as opposed to PTP. So the base time parameter doesn't really do much. Support for the PTP clock source will be added in a future series. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-15 05:00:02 +03:00
/* From sja1105_main.c */
enum sja1105_reset_reason {
SJA1105_VLAN_FILTERING = 0,
SJA1105_AGEING_TIME,
SJA1105_SCHEDULING,
SJA1105_BEST_EFFORT_POLICING,
SJA1105_VIRTUAL_LINKS,
};
int sja1105_static_config_reload(struct sja1105_private *priv,
enum sja1105_reset_reason reason);
int sja1105_vlan_filtering(struct dsa_switch *ds, int port, bool enabled,
struct netlink_ext_ack *extack);
void sja1105_frame_memory_partitioning(struct sja1105_private *priv);
/* From sja1105_mdio.c */
int sja1105_mdiobus_register(struct dsa_switch *ds);
void sja1105_mdiobus_unregister(struct dsa_switch *ds);
int sja1105_pcs_mdio_read_c45(struct mii_bus *bus, int phy, int mmd, int reg);
int sja1105_pcs_mdio_write_c45(struct mii_bus *bus, int phy, int mmd, int reg,
u16 val);
int sja1110_pcs_mdio_read_c45(struct mii_bus *bus, int phy, int mmd, int reg);
int sja1110_pcs_mdio_write_c45(struct mii_bus *bus, int phy, int mmd, int reg,
u16 val);
/* From sja1105_devlink.c */
int sja1105_devlink_setup(struct dsa_switch *ds);
void sja1105_devlink_teardown(struct dsa_switch *ds);
int sja1105_devlink_info_get(struct dsa_switch *ds,
struct devlink_info_req *req,
struct netlink_ext_ack *extack);
/* From sja1105_spi.c */
int sja1105_xfer_buf(const struct sja1105_private *priv,
sja1105_spi_rw_mode_t rw, u64 reg_addr,
net: dsa: sja1105: Switch to scatter/gather API for SPI This reworks the SPI transfer implementation to make use of more of the SPI core features. The main benefit is to avoid the memcpy in sja1105_xfer_buf(). The memcpy was only needed because the function was transferring a single buffer at a time. So it needed to copy the caller-provided buffer at buf + 4, to store the SPI message header in the "headroom" area. But the SPI core supports scatter-gather messages, comprised of multiple transfers. We can actually use those to break apart every SPI message into 2 transfers: one for the header and one for the actual payload. To keep the behavior the same regarding the chip select signal, it is necessary to tell the SPI core to de-assert the chip select after each chunk. This was not needed before, because each spi_message contained only 1 single transfer. The meaning of the per-transfer cs_change=1 is: - If the transfer is the last one of the message, keep CS asserted - Otherwise, deassert CS We need to deassert CS in the "otherwise" case, which was implicit before. Avoiding the memcpy creates yet another opportunity. The device can't process more than 256 bytes of SPI payload at a time, so the sja1105_xfer_long_buf() function used to exist, to split the larger caller buffer into chunks. But these chunks couldn't be used as scatter/gather buffers for spi_message until now, because of that memcpy (we would have needed more memory for each chunk). So we can now remove the sja1105_xfer_long_buf() function and have a single implementation for long and short buffers. Another benefit is lower usage of stack memory. Previously we had to store 2 SPI buffers for each chunk. Due to the elimination of the memcpy, we can now send pointers to the actual chunks from the caller-supplied buffer to the SPI core. Since the patch merges two functions into a rewritten implementation, the function prototype was also changed, mainly for cosmetic consistency with the structures used within it. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-12 01:31:15 +03:00
u8 *buf, size_t len);
int sja1105_xfer_u32(const struct sja1105_private *priv,
net: dsa: sja1105: Implement the .gettimex64 system call for PTP Through the PTP_SYS_OFFSET_EXTENDED ioctl, it is possible for userspace applications (i.e. phc2sys) to compensate for the delays incurred while reading the PHC's time. The task itself of taking the software timestamp is delegated to the SPI subsystem, through the newly introduced API in struct spi_transfer. The goal is to cross-timestamp I/O operations on the switch's PTP clock with values in the local system clock (CLOCK_REALTIME). For that we need to understand a bit of the hardware internals. The 'read PTP time' message is a 12 byte structure, first 4 bytes of which represent the SPI header, and the last 8 bytes represent the 64-bit PTP time. The switch itself starts processing the command immediately after receiving the last bit of the address, i.e. at the middle of byte 3 (last byte of header). The PTP time is shadowed to a buffer register in the switch, and retrieved atomically during the subsequent SPI frames. A similar thing goes on for the 'write PTP time' message, although in that case the switch waits until the 64-bit PTP time becomes fully available before taking any action. So the byte that needs to be software-timestamped is byte 11 (last) of the transfer. The patch creates a common (and local) sja1105_xfer implementation for the SPI I/O, and offers 3 front-ends: - sja1105_xfer_u32 and sja1105_xfer_u64: these are capable of optionally requesting a PTP timestamp - sja1105_xfer_buf: this is for large transfers (e.g. the static config buffer) and other misc data, and there is no point in giving timestamping capabilities to this. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-09 13:32:22 +02:00
sja1105_spi_rw_mode_t rw, u64 reg_addr, u32 *value,
struct ptp_system_timestamp *ptp_sts);
int sja1105_xfer_u64(const struct sja1105_private *priv,
net: dsa: sja1105: Implement the .gettimex64 system call for PTP Through the PTP_SYS_OFFSET_EXTENDED ioctl, it is possible for userspace applications (i.e. phc2sys) to compensate for the delays incurred while reading the PHC's time. The task itself of taking the software timestamp is delegated to the SPI subsystem, through the newly introduced API in struct spi_transfer. The goal is to cross-timestamp I/O operations on the switch's PTP clock with values in the local system clock (CLOCK_REALTIME). For that we need to understand a bit of the hardware internals. The 'read PTP time' message is a 12 byte structure, first 4 bytes of which represent the SPI header, and the last 8 bytes represent the 64-bit PTP time. The switch itself starts processing the command immediately after receiving the last bit of the address, i.e. at the middle of byte 3 (last byte of header). The PTP time is shadowed to a buffer register in the switch, and retrieved atomically during the subsequent SPI frames. A similar thing goes on for the 'write PTP time' message, although in that case the switch waits until the 64-bit PTP time becomes fully available before taking any action. So the byte that needs to be software-timestamped is byte 11 (last) of the transfer. The patch creates a common (and local) sja1105_xfer implementation for the SPI I/O, and offers 3 front-ends: - sja1105_xfer_u32 and sja1105_xfer_u64: these are capable of optionally requesting a PTP timestamp - sja1105_xfer_buf: this is for large transfers (e.g. the static config buffer) and other misc data, and there is no point in giving timestamping capabilities to this. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-09 13:32:22 +02:00
sja1105_spi_rw_mode_t rw, u64 reg_addr, u64 *value,
struct ptp_system_timestamp *ptp_sts);
int static_config_buf_prepare_for_upload(struct sja1105_private *priv,
void *config_buf, int buf_len);
int sja1105_static_config_upload(struct sja1105_private *priv);
int sja1105_inhibit_tx(const struct sja1105_private *priv,
unsigned long port_bitmap, bool tx_inhibited);
extern const struct sja1105_info sja1105e_info;
extern const struct sja1105_info sja1105t_info;
extern const struct sja1105_info sja1105p_info;
extern const struct sja1105_info sja1105q_info;
extern const struct sja1105_info sja1105r_info;
extern const struct sja1105_info sja1105s_info;
net: dsa: sja1105: add support for the SJA1110 switch family The SJA1110 is basically an SJA1105 with more ports, some integrated PHYs (100base-T1 and 100base-TX) and an embedded microcontroller which can be disabled, and the switch core can be controlled by a host running Linux, over SPI. This patch contains: - the static and dynamic config packing functions, for the tables that are common with SJA1105 - one more static config tables which is "unique" to the SJA1110 (actually it is a rehash of stuff that was placed somewhere else in SJA1105): the PCP Remapping Table - a reset and clock configuration procedure for the SJA1110 switch. This resets just the switch subsystem, and gates off the clock which powers on the embedded microcontroller. - an RGMII delay configuration procedure for SJA1110, which is very similar to SJA1105, but different enough for us to be unable to reuse it (this is a pattern that repeats itself) - some adaptations to dynamic config table entries which are no longer programmed in the same way. For example, to delete a VLAN, you used to write an entry through the dynamic reconfiguration interface with the desired VLAN ID, and with the VALIDENT bit set to false. Now, the VLAN table entries contain a TYPE_ENTRY field, which must be set to zero (in a backwards-incompatible way) in order for the entry to be deleted, or to some other entry for the VLAN to match "inner tagged" or "outer tagged" packets. - a similar thing for the static config: the xMII Mode Parameters Table encoding for SGMII and MII (the latter just when attached to a 100base-TX PHY) just isn't what it used to be in SJA1105. They are identical, except there is an extra "special" bit which needs to be set. Set it. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 12:25:36 +03:00
extern const struct sja1105_info sja1110a_info;
extern const struct sja1105_info sja1110b_info;
extern const struct sja1105_info sja1110c_info;
extern const struct sja1105_info sja1110d_info;
/* From sja1105_clocking.c */
typedef enum {
XMII_MAC = 0,
XMII_PHY = 1,
} sja1105_mii_role_t;
typedef enum {
XMII_MODE_MII = 0,
XMII_MODE_RMII = 1,
XMII_MODE_RGMII = 2,
net: dsa: sja1105: Add support for the SGMII port SJA1105 switches R and S have one SerDes port with an 802.3z quasi-compatible PCS, hardwired on port 4. The other ports are still MII/RMII/RGMII. The PCS performs rate adaptation to lower link speeds; the MAC on this port is hardwired at gigabit. Only full duplex is supported. The SGMII port can be configured as part of the static config tables, as well as through a dedicated SPI address region for its pseudo-clause-22 registers. However it looks like the static configuration is not able to change some out-of-reset values (like the value of MII_BMCR), so at the end of the day, having code for it is utterly pointless. We are just going to use the pseudo-C22 interface. Because the PCS gets reset when the switch resets, we have to add even more restoration logic to sja1105_static_config_reload, otherwise the SGMII port breaks after operations such as enabling PTP timestamping which require a switch reset. >From PHYLINK perspective, the switch supports *only* SGMII (it doesn't support 1000Base-X). It also doesn't expose access to the raw config word for in-band AN in registers MII_ADV/MII_LPA. It is able to work in the following modes: - Forced speed - SGMII in-band AN slave (speed received from PHY) - SGMII in-band AN master (acting as a PHY) The latter mode is not supported by this patch. It is even unclear to me how that would be described. There is some code for it left in the patch, but 'an_master' is always passed as false. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-20 13:29:37 +02:00
XMII_MODE_SGMII = 3,
} sja1105_phy_interface_t;
int sja1105pqrs_setup_rgmii_delay(const void *ctx, int port);
net: dsa: sja1105: add support for the SJA1110 switch family The SJA1110 is basically an SJA1105 with more ports, some integrated PHYs (100base-T1 and 100base-TX) and an embedded microcontroller which can be disabled, and the switch core can be controlled by a host running Linux, over SPI. This patch contains: - the static and dynamic config packing functions, for the tables that are common with SJA1105 - one more static config tables which is "unique" to the SJA1110 (actually it is a rehash of stuff that was placed somewhere else in SJA1105): the PCP Remapping Table - a reset and clock configuration procedure for the SJA1110 switch. This resets just the switch subsystem, and gates off the clock which powers on the embedded microcontroller. - an RGMII delay configuration procedure for SJA1110, which is very similar to SJA1105, but different enough for us to be unable to reuse it (this is a pattern that repeats itself) - some adaptations to dynamic config table entries which are no longer programmed in the same way. For example, to delete a VLAN, you used to write an entry through the dynamic reconfiguration interface with the desired VLAN ID, and with the VALIDENT bit set to false. Now, the VLAN table entries contain a TYPE_ENTRY field, which must be set to zero (in a backwards-incompatible way) in order for the entry to be deleted, or to some other entry for the VLAN to match "inner tagged" or "outer tagged" packets. - a similar thing for the static config: the xMII Mode Parameters Table encoding for SGMII and MII (the latter just when attached to a 100base-TX PHY) just isn't what it used to be in SJA1105. They are identical, except there is an extra "special" bit which needs to be set. Set it. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 12:25:36 +03:00
int sja1110_setup_rgmii_delay(const void *ctx, int port);
int sja1105_clocking_setup_port(struct sja1105_private *priv, int port);
int sja1105_clocking_setup(struct sja1105_private *priv);
net: dsa: sja1105: properly power down the microcontroller clock for SJA1110 It turns out that powering down the BASE_TIMER_CLK does not turn off the microcontroller, just its timers, including the one for the watchdog. So the embedded microcontroller is still running, and potentially still doing things. To prevent unwanted interference, we should power down the BASE_MCSS_CLK as well (MCSS = microcontroller subsystem). The trouble is that currently we turn off the BASE_TIMER_CLK for SJA1110 from the .clocking_setup() method, mostly because this is a Clock Generation Unit (CGU) setting which was traditionally configured in that method for SJA1105. But in SJA1105, the CGU was used for bringing up the port clocks at the proper speeds, and in SJA1110 it's not (but rather for initial configuration), so it's best that we rebrand the sja1110_clocking_setup() method into what it really is - an implementation of the .disable_microcontroller() method. Since disabling the microcontroller only needs to be done once, at probe time, we can choose the best place to do that as being in sja1105_setup(), before we upload the static config to the device. This guarantees that the static config being used by the switch afterwards is really ours. Note that the procedure to upload a static config necessarily resets the switch. This already did not reset the microcontroller, only the switch core, so since the .disable_microcontroller() method is guaranteed to be called by that point, if it's disabled, it remains disabled. Add a comment to make that clear. With the code movement for SJA1110 from .clocking_setup() to .disable_microcontroller(), both methods are optional and are guarded by "if" conditions. Tested by enabling in the device tree the rev-mii switch port 0 that goes towards the microcontroller, and flashing a firmware that would have networking. Without this patch, the microcontroller can be pinged, with this patch it cannot. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-18 14:52:54 +03:00
int sja1110_disable_microcontroller(struct sja1105_private *priv);
/* From sja1105_ethtool.c */
void sja1105_get_ethtool_stats(struct dsa_switch *ds, int port, u64 *data);
void sja1105_get_strings(struct dsa_switch *ds, int port,
u32 stringset, u8 *data);
int sja1105_get_sset_count(struct dsa_switch *ds, int port, int sset);
/* From sja1105_dynamic_config.c */
int sja1105_dynamic_config_read(struct sja1105_private *priv,
enum sja1105_blk_idx blk_idx,
int index, void *entry);
int sja1105_dynamic_config_write(struct sja1105_private *priv,
enum sja1105_blk_idx blk_idx,
int index, void *entry, bool keep);
enum sja1105_iotag {
SJA1105_C_TAG = 0, /* Inner VLAN header */
SJA1105_S_TAG = 1, /* Outer VLAN header */
};
net: dsa: sja1105: add support for the SJA1110 switch family The SJA1110 is basically an SJA1105 with more ports, some integrated PHYs (100base-T1 and 100base-TX) and an embedded microcontroller which can be disabled, and the switch core can be controlled by a host running Linux, over SPI. This patch contains: - the static and dynamic config packing functions, for the tables that are common with SJA1105 - one more static config tables which is "unique" to the SJA1110 (actually it is a rehash of stuff that was placed somewhere else in SJA1105): the PCP Remapping Table - a reset and clock configuration procedure for the SJA1110 switch. This resets just the switch subsystem, and gates off the clock which powers on the embedded microcontroller. - an RGMII delay configuration procedure for SJA1110, which is very similar to SJA1105, but different enough for us to be unable to reuse it (this is a pattern that repeats itself) - some adaptations to dynamic config table entries which are no longer programmed in the same way. For example, to delete a VLAN, you used to write an entry through the dynamic reconfiguration interface with the desired VLAN ID, and with the VALIDENT bit set to false. Now, the VLAN table entries contain a TYPE_ENTRY field, which must be set to zero (in a backwards-incompatible way) in order for the entry to be deleted, or to some other entry for the VLAN to match "inner tagged" or "outer tagged" packets. - a similar thing for the static config: the xMII Mode Parameters Table encoding for SGMII and MII (the latter just when attached to a 100base-TX PHY) just isn't what it used to be in SJA1105. They are identical, except there is an extra "special" bit which needs to be set. Set it. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08 12:25:36 +03:00
enum sja1110_vlan_type {
SJA1110_VLAN_INVALID = 0,
SJA1110_VLAN_C_TAG = 1, /* Single inner VLAN tag */
SJA1110_VLAN_S_TAG = 2, /* Single outer VLAN tag */
SJA1110_VLAN_D_TAG = 3, /* Double tagged, use outer tag for lookup */
};
enum sja1110_shaper_type {
SJA1110_LEAKY_BUCKET_SHAPER = 0,
SJA1110_CBS_SHAPER = 1,
};
u8 sja1105et_fdb_hash(struct sja1105_private *priv, const u8 *addr, u16 vid);
int sja1105et_fdb_add(struct dsa_switch *ds, int port,
const unsigned char *addr, u16 vid);
int sja1105et_fdb_del(struct dsa_switch *ds, int port,
const unsigned char *addr, u16 vid);
int sja1105pqrs_fdb_add(struct dsa_switch *ds, int port,
const unsigned char *addr, u16 vid);
int sja1105pqrs_fdb_del(struct dsa_switch *ds, int port,
const unsigned char *addr, u16 vid);
/* From sja1105_flower.c */
int sja1105_cls_flower_del(struct dsa_switch *ds, int port,
struct flow_cls_offload *cls, bool ingress);
int sja1105_cls_flower_add(struct dsa_switch *ds, int port,
struct flow_cls_offload *cls, bool ingress);
net: dsa: sja1105: implement tc-gate using time-triggered virtual links Restrict the TTEthernet hardware support on this switch to operate as closely as possible to IEEE 802.1Qci as possible. This means that it can perform PTP-time-based ingress admission control on streams identified by {DMAC, VID, PCP}, which is useful when trying to ensure the determinism of traffic scheduled via IEEE 802.1Qbv. The oddity comes from the fact that in hardware (and in TTEthernet at large), virtual links always need a full-blown action, including not only the type of policing, but also the list of destination ports. So in practice, a single tc-gate action will result in all packets getting dropped. Additional actions (either "trap" or "redirect") need to be specified in the same filter rule such that the conforming packets are actually forwarded somewhere. Apart from the VL Lookup, Policing and Forwarding tables which need to be programmed for each flow (virtual link), the Schedule engine also needs to be told to open/close the admission gates for each individual virtual link. A fairly accurate (and detailed) description of how that works is already present in sja1105_tas.c, since it is already used to trigger the egress gates for the tc-taprio offload (IEEE 802.1Qbv). Key point here, we remember that the schedule engine supports 8 "subschedules" (execution threads that iterate through the global schedule in parallel, and that no 2 hardware threads must execute a schedule entry at the same time). For tc-taprio, each egress port used one of these 8 subschedules, leaving a total of 4 subschedules unused. In principle we could have allocated 1 subschedule for the tc-gate offload of each ingress port, but actually the schedules of all virtual links installed on each ingress port would have needed to be merged together, before they could have been programmed to hardware. So simplify our life and just merge the entire tc-gate configuration, for all virtual links on all ingress ports, into a single subschedule. Be sure to check that against the usual hardware scheduling conflicts, and program it to hardware alongside any tc-taprio subschedule that may be present. The following scenarios were tested: 1. Quantitative testing: tc qdisc add dev swp2 clsact tc filter add dev swp2 ingress flower skip_sw \ dst_mac 42:be:24:9b:76:20 \ action gate index 1 base-time 0 \ sched-entry OPEN 1200 -1 -1 \ sched-entry CLOSE 1200 -1 -1 \ action trap ping 192.168.1.2 -f PING 192.168.1.2 (192.168.1.2) 56(84) bytes of data. ............................. --- 192.168.1.2 ping statistics --- 948 packets transmitted, 467 received, 50.7384% packet loss, time 9671ms 2. Qualitative testing (with a phase-aligned schedule - the clocks are synchronized by ptp4l, not shown here): Receiver (sja1105): tc qdisc add dev swp2 clsact now=$(phc_ctl /dev/ptp1 get | awk '/clock time is/ {print $5}') && \ sec=$(echo $now | awk -F. '{print $1}') && \ base_time="$(((sec + 2) * 1000000000))" && \ echo "base time ${base_time}" tc filter add dev swp2 ingress flower skip_sw \ dst_mac 42:be:24:9b:76:20 \ action gate base-time ${base_time} \ sched-entry OPEN 60000 -1 -1 \ sched-entry CLOSE 40000 -1 -1 \ action trap Sender (enetc): now=$(phc_ctl /dev/ptp0 get | awk '/clock time is/ {print $5}') && \ sec=$(echo $now | awk -F. '{print $1}') && \ base_time="$(((sec + 2) * 1000000000))" && \ echo "base time ${base_time}" tc qdisc add dev eno0 parent root taprio \ num_tc 8 \ map 0 1 2 3 4 5 6 7 \ queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \ base-time ${base_time} \ sched-entry S 01 50000 \ sched-entry S 00 50000 \ flags 2 ping -A 192.168.1.1 PING 192.168.1.1 (192.168.1.1): 56 data bytes ... ^C --- 192.168.1.1 ping statistics --- 1425 packets transmitted, 1424 packets received, 0% packet loss round-trip min/avg/max = 0.322/0.361/0.990 ms And just for comparison, with the tc-taprio schedule deleted: ping -A 192.168.1.1 PING 192.168.1.1 (192.168.1.1): 56 data bytes ... ^C --- 192.168.1.1 ping statistics --- 33 packets transmitted, 19 packets received, 42% packet loss round-trip min/avg/max = 0.336/0.464/0.597 ms Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-05 22:20:56 +03:00
int sja1105_cls_flower_stats(struct dsa_switch *ds, int port,
struct flow_cls_offload *cls, bool ingress);
void sja1105_flower_setup(struct dsa_switch *ds);
void sja1105_flower_teardown(struct dsa_switch *ds);
struct sja1105_rule *sja1105_rule_find(struct sja1105_private *priv,
unsigned long cookie);
#endif