/* SPDX-License-Identifier: GPL-2.0 */
/* Copyright (c) 2018, Sensor-Technik Wiedemann GmbH
 * Copyright (c) 2018-2019, Vladimir Oltean <olteanv@gmail.com>
*/
#ifndef _SJA1105_H
#define _SJA1105_H

#include <linux/ptp_clock_kernel.h>
#include <linux/timecounter.h>
#include <linux/dsa/sja1105.h>
#include <linux/dsa/8021q.h>
#include <net/dsa.h>
#include <linux/mutex.h>
#include "sja1105_static_config.h"

#define SJA1105_NUM_PORTS		5
#define SJA1105_NUM_TC			8
#define SJA1105ET_FDB_BIN_SIZE		4
/* The hardware value is in multiples of 10 ms.
 * The passed parameter is in multiples of 1 ms.
 */
#define SJA1105_AGEING_TIME_MS(ms)	((ms) / 10)
#define SJA1105_NUM_L2_POLICERS		45

typedef enum {
	SPI_READ = 0,
	SPI_WRITE = 1,
} sja1105_spi_rw_mode_t;

#include "sja1105_tas.h"
#include "sja1105_ptp.h"

/* Keeps the different addresses between E/T and P/Q/R/S */
struct sja1105_regs {
	u64 device_id;
	u64 prod_id;
	u64 status;
	u64 port_control;
	u64 rgu;
	u64 vl_status;
	u64 config;
	u64 sgmii;
	u64 rmii_pll1;
	u64 ptppinst;
	u64 ptppindur;
	u64 ptp_control;
	u64 ptpclkval;
	u64 ptpclkrate;
	u64 ptpclkcorp;
	u64 ptpsyncts;
	u64 ptpschtm;
	u64 ptpegr_ts[SJA1105_NUM_PORTS];
	u64 pad_mii_tx[SJA1105_NUM_PORTS];
	u64 pad_mii_rx[SJA1105_NUM_PORTS];
	u64 pad_mii_id[SJA1105_NUM_PORTS];
	u64 cgu_idiv[SJA1105_NUM_PORTS];
	u64 mii_tx_clk[SJA1105_NUM_PORTS];
	u64 mii_rx_clk[SJA1105_NUM_PORTS];
	u64 mii_ext_tx_clk[SJA1105_NUM_PORTS];
	u64 mii_ext_rx_clk[SJA1105_NUM_PORTS];
	u64 rgmii_tx_clk[SJA1105_NUM_PORTS];
	u64 rmii_ref_clk[SJA1105_NUM_PORTS];
	u64 rmii_ext_tx_clk[SJA1105_NUM_PORTS];
	u64 mac[SJA1105_NUM_PORTS];
	u64 mac_hl1[SJA1105_NUM_PORTS];
	u64 mac_hl2[SJA1105_NUM_PORTS];
	u64 ether_stats[SJA1105_NUM_PORTS];
	u64 qlevel[SJA1105_NUM_PORTS];
};

struct sja1105_info {
	u64 device_id;
	/* Needed for distinction between P and R, and between Q and S
	 * (since the parts with/without SGMII share the same
	 * switch core and device_id)
	 */
	u64 part_no;
	/* E/T and P/Q/R/S have partial timestamps of different sizes.
	 * They must be reconstructed on both families anyway to get the full
	 * 64-bit values back.
	 */
	int ptp_ts_bits;
	/* Also the SPI commands are of different sizes to retrieve
	 * the egress timestamps.
	 */
	int ptpegr_ts_bytes;
	int num_cbs_shapers;
	const struct sja1105_dynamic_table_ops *dyn_ops;
	const struct sja1105_table_ops *static_ops;
	const struct sja1105_regs *regs;
	/* Both E/T and P/Q/R/S have quirks when it comes to popping the S-Tag
	 * from double-tagged frames. E/T will pop it only when it's equal to
	 * TPID from the General Parameters Table, while P/Q/R/S will only
	 * pop it when it's equal to TPID2.
	 */
	u16 qinq_tpid;
	int (*reset_cmd)(struct dsa_switch *ds);
	int (*setup_rgmii_delay)(const void *ctx, int port);
	/* Prototypes from include/net/dsa.h */
	int (*fdb_add_cmd)(struct dsa_switch *ds, int port,
			   const unsigned char *addr, u16 vid);
	int (*fdb_del_cmd)(struct dsa_switch *ds, int port,
			   const unsigned char *addr, u16 vid);
	void (*ptp_cmd_packing)(u8 *buf, struct sja1105_ptp_cmd *cmd,
				enum packing_op op);
	const char *name;
};

enum sja1105_key_type {
	SJA1105_KEY_BCAST,
	SJA1105_KEY_TC,
	SJA1105_KEY_VLAN_UNAWARE_VL,
	SJA1105_KEY_VLAN_AWARE_VL,
};

struct sja1105_key {
	enum sja1105_key_type type;

	union {
		/* SJA1105_KEY_TC */
		struct {
			int pcp;
		} tc;

		/* SJA1105_KEY_VLAN_UNAWARE_VL */
		/* SJA1105_KEY_VLAN_AWARE_VL */
		struct {
			u64 dmac;
			u16 vid;
			u16 pcp;
		} vl;
	};
};

enum sja1105_rule_type {
	SJA1105_RULE_BCAST_POLICER,
	SJA1105_RULE_TC_POLICER,
	SJA1105_RULE_VL,
};

enum sja1105_vl_type {
	SJA1105_VL_NONCRITICAL,
	SJA1105_VL_RATE_CONSTRAINED,
	SJA1105_VL_TIME_TRIGGERED,
};

struct sja1105_rule {
	struct list_head list;
	unsigned long cookie;
	unsigned long port_mask;
	struct sja1105_key key;
	enum sja1105_rule_type type;

	/* Action */
	union {
		/* SJA1105_RULE_BCAST_POLICER */
		struct {
			int sharindx;
		} bcast_pol;

		/* SJA1105_RULE_TC_POLICER */
		struct {
			int sharindx;
		} tc_pol;

		/* SJA1105_RULE_VL */
		struct {
			enum sja1105_vl_type type;
			unsigned long destports;
			int sharindx;
			int maxlen;
			int ipv;
			u64 base_time;
			u64 cycle_time;
			int num_entries;
			struct action_gate_entry *entries;
			struct flow_stats stats;
		} vl;
	};
};

struct sja1105_flow_block {
	struct list_head rules;
	bool l2_policer_used[SJA1105_NUM_L2_POLICERS];
	int num_virtual_links;
};

struct sja1105_bridge_vlan {
	struct list_head list;
	int port;
	u16 vid;
	bool pvid;
	bool untagged;
};

enum sja1105_vlan_state {
	SJA1105_VLAN_UNAWARE,
	SJA1105_VLAN_BEST_EFFORT,
	SJA1105_VLAN_FILTERING_FULL,
};

struct sja1105_private {
	struct sja1105_static_config static_config;
net: dsa: sja1105: Error out if RGMII delays are requested in DT
Documentation/devicetree/bindings/net/ethernet.txt is confusing because
it says what the MAC should not do, but not what it *should* do:
* "rgmii-rxid" (RGMII with internal RX delay provided by the PHY, the MAC
should not add an RX delay in this case)
The gap in semantics is threefold:
1. Is it illegal for the MAC to apply the Rx internal delay by itself,
and simplify the phy_mode (mask off "rgmii-rxid" into "rgmii") before
passing it to of_phy_connect? The documentation would suggest yes.
1. For "rgmii-rxid", while the situation with the Rx clock skew is more
or less clear (needs to be added by the PHY), what should the MAC
driver do about the Tx delays? Is it an implicit wild card for the
MAC to apply delays in the Tx direction if it can? What if those were
already added as serpentine PCB traces, how could that be made more
obvious through DT bindings so that the MAC doesn't attempt to add
them twice and again potentially break the link?
3. If the interface is a fixed-link and therefore the PHY object is
fixed (a purely software entity that obviously cannot add clock
skew), what is the meaning of the above property?
So an interpretation of the RGMII bindings was chosen that hopefully
does not contradict their intention but also makes them easier to apply
in practice.
The SJA1105 driver understands to act upon "rgmii-*id" phy-mode bindings
if the port is in the PHY role (either explicitly, or if it is a
fixed-link). Otherwise it always passes the duty of setting up delays to
the PHY driver.
The error behavior that this patch adds is required on SJA1105E/T where
the MAC really cannot apply internal delays. If the other end of the
fixed-link cannot apply RGMII delays either (this would be specified
through its own DT bindings), then the situation requires PCB delays.
For SJA1105P/Q/R/S, this is however hardware supported and the error is
thus only temporary. I created a stub function pointer for configuring
delays per-port on RXC and TXC, and will implement it when I have access
to a board with this hardware setup.
Meanwhile do not allow the user to select an invalid configuration.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-02 23:23:32 +03:00
	bool rgmii_rx_delay[SJA1105_NUM_PORTS];
	bool rgmii_tx_delay[SJA1105_NUM_PORTS];
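As a rough illustration of how the per-port delay flags above relate to the "rgmii-*id" phy-mode strings discussed in the log (a simplified userspace sketch for a port in the PHY role; `parse_rgmii_delays` is a hypothetical helper, not the driver's actual parsing code):

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical illustration: derive the RGMII delay flags for a port
 * acting in the PHY role from its phy-mode string.
 *   "rgmii"      - no internal delays requested from the MAC
 *   "rgmii-rxid" - MAC applies the RX delay
 *   "rgmii-txid" - MAC applies the TX delay
 *   "rgmii-id"   - MAC applies both delays
 */
static void parse_rgmii_delays(const char *phy_mode,
			       bool *rx_delay, bool *tx_delay)
{
	*rx_delay = !strcmp(phy_mode, "rgmii-rxid") ||
		    !strcmp(phy_mode, "rgmii-id");
	*tx_delay = !strcmp(phy_mode, "rgmii-txid") ||
		    !strcmp(phy_mode, "rgmii-id");
}
```

On SJA1105E/T, where the MAC cannot apply internal delays, a flag coming back true would be the condition for the error described above.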
2020-05-12 20:20:35 +03:00
	bool best_effort_vlan_filtering;
2019-05-02 23:23:30 +03:00
	const struct sja1105_info *info;
	struct gpio_desc *reset_gpio;
	struct spi_device *spidev;
	struct dsa_switch *ds;
net: dsa: sja1105: save/restore VLANs using a delta commit method
Managing the VLAN table that is present in hardware will become very
difficult once we add a third operating state
(best_effort_vlan_filtering). That is because correct cleanup (not too
little, not too much) becomes virtually impossible, when VLANs can be
added from the bridge layer, from dsa_8021q for basic tagging, for
cross-chip bridging, as well as retagging rules for sub-VLANs and
cross-chip sub-VLANs. So we need to rethink VLAN interaction with the
switch in a more scalable way.
In preparation for that, use the priv->expect_dsa_8021q boolean to
classify any VLAN request received through .port_vlan_add or
.port_vlan_del towards either one of 2 internal lists: bridge VLANs and
dsa_8021q VLANs.
Then, implement a central sja1105_build_vlan_table method that creates a
VLAN configuration from scratch based on the 2 lists of VLANs kept by
the driver, and based on the VLAN awareness state. Currently, if we are
VLAN-unaware, install the dsa_8021q VLANs, otherwise the bridge VLANs.
Then, implement a delta commit procedure that identifies which VLANs
from this new configuration are actually different from the config
previously committed to hardware. We apply the delta through the dynamic
configuration interface (we don't reset the switch). The result is that
the hardware should see the same sequence of operations as before this
patch.
This also helps remove the "br" argument passed to
dsa_8021q_crosschip_bridge_join, which it was only using to figure out
whether it should commit the configuration back to us or not, based on
the VLAN awareness state of the bridge. We can simplify that, by always
allowing those VLANs inside of our dsa_8021q_vlans list, and committing
those to hardware when necessary.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 20:20:29 +03:00
	struct list_head dsa_8021q_vlans;
	struct list_head bridge_vlans;
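The delta-commit idea from the log above can be illustrated with a minimal userspace sketch (hypothetical helper names; the real driver diffs full VLAN table entries kept on the two lists above, not bare VIDs):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sketch of a delta commit: given the VID set previously
 * committed to hardware and the newly built one, count how many
 * dynamic-config operations (additions plus deletions) are needed.
 */
static bool contains(const int *set, size_t n, int vid)
{
	for (size_t i = 0; i < n; i++)
		if (set[i] == vid)
			return true;
	return false;
}

static int delta_ops(const int *old, size_t n_old,
		     const int *new, size_t n_new)
{
	int ops = 0;

	/* VLANs present in the new config but not in hardware: add */
	for (size_t i = 0; i < n_new; i++)
		if (!contains(old, n_old, new[i]))
			ops++;
	/* VLANs in hardware but absent from the new config: delete */
	for (size_t i = 0; i < n_old; i++)
		if (!contains(new, n_new, old[i]))
			ops++;
	return ops;
}
```

VLANs common to both configurations generate no operations, which is what keeps the cleanup "not too little, not too much".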
2020-03-29 14:52:02 +03:00
	struct sja1105_flow_block flow_block;
2019-05-05 13:19:27 +03:00
	struct sja1105_port ports[SJA1105_NUM_PORTS];
	/* Serializes transmission of management frames so that
	 * the switch doesn't confuse them with one another.
	 */
	struct mutex mgmt_lock;
net: dsa: tag_8021q: add a context structure
While working on another tag_8021q driver implementation, some things
became apparent:
- It is not mandatory for a DSA driver to offload the tag_8021q VLANs by
using the VLAN table per se. For example, it can add custom TCAM rules
that simply encapsulate RX traffic, and redirect & decapsulate rules
for TX traffic. For such a driver, it makes no sense to receive the
tag_8021q configuration through the same callback as it receives the
VLAN configuration from the bridge and the 8021q modules.
- Currently, sja1105 (the only tag_8021q user) sets a
priv->expect_dsa_8021q variable to distinguish between the bridge
calling, and tag_8021q calling. That can be improved, to say the
least.
- The crosschip bridging operations are, in fact, stateful already. The
list of crosschip_links must be kept by the caller and passed to the
relevant tag_8021q functions.
So it would be nice if the tag_8021q configuration was more
self-contained. This patch attempts to do that.
Create a struct dsa_8021q_context which encapsulates a struct
dsa_switch, and has 2 function pointers for adding and deleting a VLAN.
These will replace the previous channel to the driver, which was through
the .port_vlan_add and .port_vlan_del callbacks of dsa_switch_ops.
Also put the list of crosschip_links into this dsa_8021q_context.
Drivers that don't support cross-chip bridging can simply omit to
initialize this list, as long as they don't call any cross-chip function.
The sja1105_vlan_add and sja1105_vlan_del functions are refactored into
a smaller sja1105_vlan_add_one, which now has 2 entry points:
- sja1105_vlan_add, from struct dsa_switch_ops
- sja1105_dsa_8021q_vlan_add, from the tag_8021q ops
But even this change is fairly trivial. It just reflects the fact that
for sja1105, the VLANs from these 2 channels end up in the same hardware
table. However that is not necessarily true in the general sense (and
that's the reason for making this change).
The rest of the patch is mostly plain refactoring of "ds" -> "ctx". The
dsa_8021q_context structure needs to be propagated because adding a VLAN
is now done through the ops function pointers inside of it.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-10 19:48:56 +03:00
	struct dsa_8021q_context *dsa_8021q_ctx;
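The context-plus-ops pattern described in the log above can be sketched in plain C (struct and callback names here are modeled on the description, not copied from the kernel headers):

```c
#include <assert.h>

/* Sketch of a context object carrying function pointers, so that
 * tag_8021q VLAN requests reach the driver through a dedicated
 * channel instead of the bridge's .port_vlan_add callback.
 */
struct example_8021q_ctx {
	int (*vlan_add)(struct example_8021q_ctx *ctx, int port, int vid);
	int (*vlan_del)(struct example_8021q_ctx *ctx, int port, int vid);
	int vlans_installed; /* toy driver state */
};

static int toy_vlan_add(struct example_8021q_ctx *ctx, int port, int vid)
{
	(void)port; (void)vid;
	ctx->vlans_installed++; /* stand-in for a hardware table write */
	return 0;
}

static int toy_vlan_del(struct example_8021q_ctx *ctx, int port, int vid)
{
	(void)port; (void)vid;
	ctx->vlans_installed--;
	return 0;
}
```

A driver that encapsulates traffic through TCAM rules instead of the VLAN table would simply supply different callbacks behind the same context.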
2020-05-12 20:20:27 +03:00
	enum sja1105_vlan_state vlan_state;
2020-05-28 03:27:58 +03:00
	struct sja1105_cbs_entry *cbs;
2019-06-08 15:04:40 +03:00
	struct sja1105_tagger_data tagger_data;
2019-10-12 02:18:15 +03:00
	struct sja1105_ptp_data ptp_data;
2019-09-15 05:00:02 +03:00
	struct sja1105_tas_data tas_data;
2019-05-02 23:23:30 +03:00
};

#include "sja1105_dynamic_config.h"

struct sja1105_spi_message {
	u64 access;
	u64 read_count;
	u64 address;
};
2019-09-15 05:00:02 +03:00
/* From sja1105_main.c */
2019-11-12 23:22:00 +02:00
enum sja1105_reset_reason {
	SJA1105_VLAN_FILTERING = 0,
	SJA1105_RX_HWTSTAMPING,
	SJA1105_AGEING_TIME,
	SJA1105_SCHEDULING,
2020-03-27 21:55:45 +02:00
	SJA1105_BEST_EFFORT_POLICING,
2020-05-05 22:20:55 +03:00
	SJA1105_VIRTUAL_LINKS,
2019-11-12 23:22:00 +02:00
};

int sja1105_static_config_reload(struct sja1105_private *priv,
				 enum sja1105_reset_reason reason);
2019-09-15 05:00:02 +03:00
2020-05-12 20:20:37 +03:00
void sja1105_frame_memory_partitioning(struct sja1105_private *priv);
2019-05-02 23:23:30 +03:00
/* From sja1105_spi.c */
2019-10-01 22:18:01 +03:00
int sja1105_xfer_buf(const struct sja1105_private *priv,
		     sja1105_spi_rw_mode_t rw, u64 reg_addr,
net: dsa: sja1105: Switch to scatter/gather API for SPI
This reworks the SPI transfer implementation to make use of more of the
SPI core features. The main benefit is to avoid the memcpy in
sja1105_xfer_buf().
The memcpy was only needed because the function was transferring a
single buffer at a time. So it needed to copy the caller-provided buffer
at buf + 4, to store the SPI message header in the "headroom" area.
But the SPI core supports scatter-gather messages, comprised of multiple
transfers. We can actually use those to break apart every SPI message
into 2 transfers: one for the header and one for the actual payload.
To keep the behavior the same regarding the chip select signal, it is
necessary to tell the SPI core to de-assert the chip select after each
chunk. This was not needed before, because each spi_message contained
only 1 single transfer.
The meaning of the per-transfer cs_change=1 is:
- If the transfer is the last one of the message, keep CS asserted
- Otherwise, deassert CS
We need to deassert CS in the "otherwise" case, which was implicit
before.
Avoiding the memcpy creates yet another opportunity. The device can't
process more than 256 bytes of SPI payload at a time, so the
sja1105_xfer_long_buf() function used to exist, to split the larger
caller buffer into chunks.
But these chunks couldn't be used as scatter/gather buffers for
spi_message until now, because of that memcpy (we would have needed more
memory for each chunk). So we can now remove the sja1105_xfer_long_buf()
function and have a single implementation for long and short buffers.
Another benefit is lower usage of stack memory. Previously we had to
store 2 SPI buffers for each chunk. Due to the elimination of the
memcpy, we can now send pointers to the actual chunks from the
caller-supplied buffer to the SPI core.
Since the patch merges two functions into a rewritten implementation,
the function prototype was also changed, mainly for cosmetic consistency
with the structures used within it.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-12 01:31:15 +03:00
		     u8 *buf, size_t len);
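The 256-byte chunking described in the scatter/gather log above can be sketched as follows (a userspace illustration of the chunk arithmetic only; the real code builds a 2-transfer spi_message per chunk, with cs_change deasserting chip select between chunks, and the 256-byte limit is an assumption taken from the log):

```c
#include <assert.h>
#include <stddef.h>

#define EXAMPLE_SPI_MSG_MAXLEN 256 /* max payload per SPI message (per the log) */

/* Number of spi_messages needed for a caller buffer of 'len' bytes.
 * Each chunk becomes one message: a 4-byte header transfer plus a
 * payload transfer pointing directly into the caller's buffer.
 */
static size_t num_chunks(size_t len)
{
	return (len + EXAMPLE_SPI_MSG_MAXLEN - 1) / EXAMPLE_SPI_MSG_MAXLEN;
}

/* Payload length of chunk number 'chunk' (0-based). */
static size_t chunk_len(size_t len, size_t chunk)
{
	size_t offset = chunk * EXAMPLE_SPI_MSG_MAXLEN;

	if (offset >= len)
		return 0;
	return (len - offset > EXAMPLE_SPI_MSG_MAXLEN) ?
	       EXAMPLE_SPI_MSG_MAXLEN : len - offset;
}
```

Because each payload transfer points into the caller's buffer, no per-chunk bounce buffer (and hence no memcpy) is needed.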
2019-10-01 22:18:00 +03:00
int sja1105_xfer_u32(const struct sja1105_private *priv,
2019-11-09 13:32:22 +02:00
		     sja1105_spi_rw_mode_t rw, u64 reg_addr, u32 *value,
		     struct ptp_system_timestamp *ptp_sts);
2019-10-01 22:18:00 +03:00
int sja1105_xfer_u64(const struct sja1105_private *priv,
2019-11-09 13:32:22 +02:00
		     sja1105_spi_rw_mode_t rw, u64 reg_addr, u64 *value,
		     struct ptp_system_timestamp *ptp_sts);
2019-05-02 23:23:30 +03:00
int sja1105_static_config_upload(struct sja1105_private *priv);
2019-06-08 16:03:43 +03:00
int sja1105_inhibit_tx(const struct sja1105_private *priv,
		       unsigned long port_bitmap, bool tx_inhibited);
2019-05-02 23:23:30 +03:00
2020-06-20 20:18:32 +03:00
extern const struct sja1105_info sja1105e_info;
extern const struct sja1105_info sja1105t_info;
extern const struct sja1105_info sja1105p_info;
extern const struct sja1105_info sja1105q_info;
extern const struct sja1105_info sja1105r_info;
extern const struct sja1105_info sja1105s_info;
2019-05-02 23:23:30 +03:00
/* From sja1105_clocking.c */
typedef enum {
	XMII_MAC = 0,
	XMII_PHY = 1,
} sja1105_mii_role_t;
typedef enum {
	XMII_MODE_MII = 0,
	XMII_MODE_RMII = 1,
	XMII_MODE_RGMII = 2,
2020-03-20 13:29:37 +02:00
	XMII_MODE_SGMII = 3,
2019-05-02 23:23:30 +03:00
} sja1105_phy_interface_t;
typedef enum {
	SJA1105_SPEED_10MBPS = 3,
	SJA1105_SPEED_100MBPS = 2,
	SJA1105_SPEED_1000MBPS = 1,
	SJA1105_SPEED_AUTO = 0,
} sja1105_speed_t;
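Note that the encoding above is not monotonic in link speed (10 Mbps maps to the largest value). A hypothetical helper, for illustration only, mapping Mbps to the hardware encoding defined in this header:

```c
#include <assert.h>

/* Illustrative only: map a link speed in Mbps to the hardware
 * encoding above (values taken from the sja1105_speed_t enum).
 */
static int example_speed_to_hw(int mbps)
{
	switch (mbps) {
	case 10:   return 3; /* SJA1105_SPEED_10MBPS */
	case 100:  return 2; /* SJA1105_SPEED_100MBPS */
	case 1000: return 1; /* SJA1105_SPEED_1000MBPS */
	default:   return 0; /* SJA1105_SPEED_AUTO */
	}
}
```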
2019-06-08 19:12:28 +03:00
int sja1105pqrs_setup_rgmii_delay(const void *ctx, int port);
2019-05-02 23:23:30 +03:00
int sja1105_clocking_setup_port(struct sja1105_private *priv, int port);
int sja1105_clocking_setup(struct sja1105_private *priv);
2019-05-02 23:23:35 +03:00
/* From sja1105_ethtool.c */
void sja1105_get_ethtool_stats(struct dsa_switch *ds, int port, u64 *data);
void sja1105_get_strings(struct dsa_switch *ds, int port,
			 u32 stringset, u8 *data);
int sja1105_get_sset_count(struct dsa_switch *ds, int port, int sset);
2019-05-02 23:23:30 +03:00
2019-05-02 23:23:35 +03:00
/* From sja1105_dynamic_config.c */
2019-05-02 23:23:30 +03:00
int sja1105_dynamic_config_read(struct sja1105_private *priv,
				enum sja1105_blk_idx blk_idx,
				int index, void *entry);
int sja1105_dynamic_config_write(struct sja1105_private *priv,
				 enum sja1105_blk_idx blk_idx,
				 int index, void *entry, bool keep);
2019-06-03 00:15:45 +03:00
enum sja1105_iotag {
	SJA1105_C_TAG = 0, /* Inner VLAN header */
	SJA1105_S_TAG = 1, /* Outer VLAN header */
};
2019-06-03 00:11:57 +03:00
u8 sja1105et_fdb_hash(struct sja1105_private *priv, const u8 *addr, u16 vid);
int sja1105et_fdb_add(struct dsa_switch *ds, int port,
		      const unsigned char *addr, u16 vid);
int sja1105et_fdb_del(struct dsa_switch *ds, int port,
		      const unsigned char *addr, u16 vid);
int sja1105pqrs_fdb_add(struct dsa_switch *ds, int port,
			const unsigned char *addr, u16 vid);
int sja1105pqrs_fdb_del(struct dsa_switch *ds, int port,
			const unsigned char *addr, u16 vid);
2019-05-02 23:23:31 +03:00
2020-03-29 14:52:02 +03:00
/* From sja1105_flower.c */
int sja1105_cls_flower_del(struct dsa_switch *ds, int port,
			   struct flow_cls_offload *cls, bool ingress);
int sja1105_cls_flower_add(struct dsa_switch *ds, int port,
			   struct flow_cls_offload *cls, bool ingress);
net: dsa: sja1105: implement tc-gate using time-triggered virtual links
Restrict the TTEthernet hardware support on this switch to operate as
closely to IEEE 802.1Qci as possible. This means that it can
perform PTP-time-based ingress admission control on streams identified
by {DMAC, VID, PCP}, which is useful when trying to ensure the
determinism of traffic scheduled via IEEE 802.1Qbv.
The oddity comes from the fact that in hardware (and in TTEthernet at
large), virtual links always need a full-blown action, including not
only the type of policing, but also the list of destination ports. So in
practice, a single tc-gate action will result in all packets getting
dropped. Additional actions (either "trap" or "redirect") need to be
specified in the same filter rule such that the conforming packets are
actually forwarded somewhere.
Apart from the VL Lookup, Policing and Forwarding tables which need to
be programmed for each flow (virtual link), the Schedule engine also
needs to be told to open/close the admission gates for each individual
virtual link. A fairly accurate (and detailed) description of how that
works is already present in sja1105_tas.c, since it is already used to
trigger the egress gates for the tc-taprio offload (IEEE 802.1Qbv). Key
point here, we remember that the schedule engine supports 8
"subschedules" (execution threads that iterate through the global
schedule in parallel), and that no 2 hardware threads may execute a
schedule entry at the same time. For tc-taprio, each egress port used
one of these 8 subschedules, leaving a total of 4 subschedules unused.
In principle we could have allocated 1 subschedule for the tc-gate
offload of each ingress port, but actually the schedules of all virtual
links installed on each ingress port would have needed to be merged
together, before they could have been programmed to hardware. So
simplify our life and just merge the entire tc-gate configuration, for
all virtual links on all ingress ports, into a single subschedule. Be
sure to check that against the usual hardware scheduling conflicts, and
program it to hardware alongside any tc-taprio subschedule that may be
present.
The following scenarios were tested:
1. Quantitative testing:
tc qdisc add dev swp2 clsact
tc filter add dev swp2 ingress flower skip_sw \
dst_mac 42:be:24:9b:76:20 \
action gate index 1 base-time 0 \
sched-entry OPEN 1200 -1 -1 \
sched-entry CLOSE 1200 -1 -1 \
action trap
ping 192.168.1.2 -f
PING 192.168.1.2 (192.168.1.2) 56(84) bytes of data.
.............................
--- 192.168.1.2 ping statistics ---
948 packets transmitted, 467 received, 50.7384% packet loss, time 9671ms
2. Qualitative testing (with a phase-aligned schedule - the clocks are
synchronized by ptp4l, not shown here):
Receiver (sja1105):
tc qdisc add dev swp2 clsact
now=$(phc_ctl /dev/ptp1 get | awk '/clock time is/ {print $5}') && \
sec=$(echo $now | awk -F. '{print $1}') && \
base_time="$(((sec + 2) * 1000000000))" && \
echo "base time ${base_time}"
tc filter add dev swp2 ingress flower skip_sw \
dst_mac 42:be:24:9b:76:20 \
action gate base-time ${base_time} \
sched-entry OPEN 60000 -1 -1 \
sched-entry CLOSE 40000 -1 -1 \
action trap
Sender (enetc):
now=$(phc_ctl /dev/ptp0 get | awk '/clock time is/ {print $5}') && \
sec=$(echo $now | awk -F. '{print $1}') && \
base_time="$(((sec + 2) * 1000000000))" && \
echo "base time ${base_time}"
tc qdisc add dev eno0 parent root taprio \
num_tc 8 \
map 0 1 2 3 4 5 6 7 \
queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
base-time ${base_time} \
sched-entry S 01 50000 \
sched-entry S 00 50000 \
flags 2
ping -A 192.168.1.1
PING 192.168.1.1 (192.168.1.1): 56 data bytes
...
^C
--- 192.168.1.1 ping statistics ---
1425 packets transmitted, 1424 packets received, 0% packet loss
round-trip min/avg/max = 0.322/0.361/0.990 ms
And just for comparison, with the tc-taprio schedule deleted:
ping -A 192.168.1.1
PING 192.168.1.1 (192.168.1.1): 56 data bytes
...
^C
--- 192.168.1.1 ping statistics ---
33 packets transmitted, 19 packets received, 42% packet loss
round-trip min/avg/max = 0.336/0.464/0.597 ms
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-05 22:20:56 +03:00
int sja1105_cls_flower_stats(struct dsa_switch *ds, int port,
			     struct flow_cls_offload *cls, bool ingress);
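A minimal sketch of the kind of hardware scheduling-conflict check mentioned in the log above (hypothetical, userspace-only; struct and function names are illustrative): two subschedule entries conflict if their windows overlap in time within the schedule cycle.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical illustration: entries executing on different hardware
 * threads must not run at the same time, so check whether two
 * [start, start + duration) windows within the cycle overlap.
 */
struct example_sched_entry {
	unsigned long start;    /* offset within the cycle, in ticks */
	unsigned long duration; /* in ticks */
};

static bool entries_conflict(const struct example_sched_entry *a,
			     const struct example_sched_entry *b)
{
	return a->start < b->start + b->duration &&
	       b->start < a->start + a->duration;
}
```

The merged tc-gate subschedule would need to pass this kind of check against every entry of any tc-taprio subschedule already programmed.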
2020-03-29 14:52:02 +03:00
void sja1105_flower_setup(struct dsa_switch *ds);
void sja1105_flower_teardown(struct dsa_switch *ds);
2020-05-05 22:20:55 +03:00
struct sja1105_rule *sja1105_rule_find(struct sja1105_private *priv,
				       unsigned long cookie);
2020-03-29 14:52:02 +03:00
2019-05-02 23:23:30 +03:00
#endif /* _SJA1105_H */