2006-01-02 19:04:38 +01:00
/*
* net/tipc/node.h: Include file for TIPC node management routines
2007-02-09 23:25:21 +09:00
*
2016-09-01 13:52:49 -04:00
* Copyright (c) 2000-2006, 2014-2016, Ericsson AB
2014-03-27 12:54:36 +08:00
* Copyright (c) 2005, 2010-2014, Wind River Systems
2006-01-02 19:04:38 +01:00
* All rights reserved.
*
2006-01-11 13:30:43 +01:00
* Redistribution and use in source and binary forms, with or without
2006-01-02 19:04:38 +01:00
* modification, are permitted provided that the following conditions are met:
*
2006-01-11 13:30:43 +01:00
* 1. Redistributions of source code must retain the above copyright
*    notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
*    notice, this list of conditions and the following disclaimer in the
*    documentation and/or other materials provided with the distribution.
* 3. Neither the names of the copyright holders nor the names of its
*    contributors may be used to endorse or promote products derived from
*    this software without specific prior written permission.
2006-01-02 19:04:38 +01:00
*
2006-01-11 13:30:43 +01:00
* Alternatively, this software may be distributed under the terms of the
* GNU General Public License ("GPL") version 2 as published by the Free
* Software Foundation.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
* AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
* LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
* CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
* SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
* INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
* CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
* ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
2006-01-02 19:04:38 +01:00
* POSSIBILITY OF SUCH DAMAGE.
*/
#ifndef _TIPC_NODE_H
#define _TIPC_NODE_H
2010-12-31 18:59:19 +00:00
#include "addr.h"
#include "net.h"
2006-01-02 19:04:38 +01:00
#include "bearer.h"
2014-06-25 20:41:33 -05:00
#include "msg.h"
2006-01-02 19:04:38 +01:00
2015-10-22 08:51:40 -04:00
/* Optional capabilities supported by this code version
*/
enum {
2018-09-28 20:23:21 +02:00
	TIPC_SYN_BIT          = (1),
2016-09-01 13:52:49 -04:00
	TIPC_BCAST_SYNCH      = (1 << 1),
	TIPC_BCAST_STATE_NACK = (1 << 2),
2017-01-18 13:50:53 -05:00
	TIPC_BLOCK_FLOWCTL    = (1 << 3),
tipc: introduce communication groups
As a preparation for introducing flow control for multicast and datagram
messaging we need a more strictly defined framework than we have now. A
socket must be able to keep track of exactly how many and which other
sockets it is allowed to communicate with at any moment, and keep the
necessary state for those.
We therefore introduce a new concept we have named Communication Group.
Sockets can join a group via a new setsockopt() call TIPC_GROUP_JOIN.
The call takes four parameters: 'type' serves as group identifier,
'instance' serves as a logical member identifier, and 'scope' indicates
the visibility of the group (node/cluster/zone). Finally, 'flags' makes
it possible to set certain properties for the member. For now, there is
only one flag, indicating if the creator of the socket wants to receive
a copy of broadcast or multicast messages it is sending via the socket,
and if it wants to be eligible as a destination for its own anycasts.
A group is closed, i.e., sockets which have not joined a group will
not be able to send messages to or receive messages from members of
the group, and vice versa.
Any member of a group can send multicast ('group broadcast') messages
to all group members, optionally including itself, using the primitive
send(). The messages are received via the recvmsg() primitive. A socket
can only be a member of one group at a time.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
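The membership rules in the message above can be sketched as a small user-space model. This is only an illustration of the stated semantics (closed groups, one group per socket, optional loopback), not the kernel implementation: `struct sock_model`, `group_join()` and `group_mcast_delivers()` are hypothetical names, and the 'scope' parameter is left out for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space model of one socket's group membership. */
struct sock_model {
	bool joined;		/* currently a member of a group? */
	uint32_t type;		/* group identifier */
	uint32_t instance;	/* logical member identifier */
	bool loopback;		/* wants copies of its own group traffic */
};

/* Join a group; fails if the socket is already a member of one. */
static bool group_join(struct sock_model *sk, uint32_t type,
		       uint32_t instance, bool loopback)
{
	assert(sk);
	if (sk->joined)
		return false;
	sk->joined = true;
	sk->type = type;
	sk->instance = instance;
	sk->loopback = loopback;
	return true;
}

/* Would a group multicast sent by 'src' be delivered to 'dst'? */
static bool group_mcast_delivers(const struct sock_model *src,
				 const struct sock_model *dst)
{
	/* Groups are closed: both ends must be members of the same group. */
	if (!src->joined || !dst->joined || src->type != dst->type)
		return false;
	/* A sender receives its own multicast only if it asked for it. */
	if (src == dst)
		return src->loopback;
	return true;
}
```

In this model a second `group_join()` on the same socket is rejected, and a socket that never joined neither reaches nor is reached by group members.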
2017-10-13 11:04:23 +02:00
	TIPC_BCAST_RCAST      = (1 << 4),
2018-07-10 01:07:35 +02:00
	TIPC_NODE_ID128       = (1 << 5),
2019-03-19 18:49:49 +07:00
	TIPC_LINK_PROTO_SEQNO = (1 << 6),
tipc: improve TIPC throughput by Gap ACK blocks
During unicast link transmission, it is very often observed that, because
of one or a few lost or disordered packets, the sending side quickly
reaches the send window limit and must wait for the packets to arrive
at the receiving side or, in the worst case, a retransmission must be
done first. The sending side cannot release a lot of subsequent packets
in its transmq even though all of them might have already been received
by the receiving side.
That is, when one or two packets are disordered or lost, dozens of packets
have to wait, which obviously reduces the overall throughput.
This commit introduces an algorithm to overcome this by using "Gap ACK
blocks". A Gap ACK block consists of <ack, gap> numbers that describe
the link deferdq, where packets have been received by the receiving
side but with gaps, for example:
link deferdq: [1 2 3 4 10 11 13 14 15 20]
--> Gap ACK blocks: <4, 5>, <11, 1>, <15, 4>, <20, 0>
The Gap ACK blocks will be sent to the sending side along with the
traditional ACK or NACK message. On receiving the message, the sending
side now releases from its transmq not only the packets acknowledged by
the ACK but also those covered by the Gap ACK blocks. So, more packets
can be enqueued and transmitted.
In addition, the sending side can now do "multi-retransmissions"
according to the Gaps reported in the Gap ACK blocks.
The new algorithm has been verified to greatly improve TIPC throughput,
especially under packet loss conditions.
So far, a maximum of 32 blocks has been enough, with no "Too few Gap
ACK blocks" reports at a 5.0% packet loss rate; however, this number
can be increased in the future if needed.
Also, the patch is backward compatible.
Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
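The <ack, gap> encoding in the example above can be reproduced with a short stand-alone routine. This is a minimal sketch, not the kernel code: `struct gap_ack` and `build_gap_acks()` are hypothetical names, the input is assumed to be sorted and free of duplicates, and sequence number wrap-around is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for this sketch: one <ack, gap> pair. */
struct gap_ack {
	uint16_t ack;	/* last sequence number of a contiguous run */
	uint16_t gap;	/* number of packets missing after that run */
};

/*
 * Build <ack, gap> pairs from a sorted list of received sequence numbers.
 * Returns the number of pairs written, at most 'max'.
 */
static size_t build_gap_acks(const uint16_t *seq, size_t n,
			     struct gap_ack *out, size_t max)
{
	size_t i, cnt = 0;

	for (i = 0; i < n && cnt < max; i++) {
		/* Still inside a contiguous run: keep scanning. */
		if (i + 1 < n && seq[i + 1] == (uint16_t)(seq[i] + 1))
			continue;
		/* The input must be sorted in ascending order. */
		assert(i + 1 == n || seq[i + 1] > seq[i]);
		out[cnt].ack = seq[i];
		out[cnt].gap = (i + 1 < n) ?
			       (uint16_t)(seq[i + 1] - seq[i] - 1) : 0;
		cnt++;
	}
	return cnt;
}
```

For the deferdq [1 2 3 4 10 11 13 14 15 20], the routine yields the four pairs listed in the message: <4, 5>, <11, 1>, <15, 4>, <20, 0>.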
2019-04-04 11:09:51 +07:00
	TIPC_MCAST_RBCTL      = (1 << 7),
2019-07-24 08:56:11 +07:00
	TIPC_GAP_ACK_BLOCK    = (1 << 8),
	TIPC_TUNNEL_ENHANCED  = (1 << 9)
2015-10-22 08:51:40 -04:00
};
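Since each capability is a single bit and the prototypes in this header carry capabilities as a u16, a peer's feature set can be tested with plain mask operations. The helpers below are a hedged sketch: `caps_has()` and `caps_common()` are hypothetical names, and only two of the bits are repeated so the block stands alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Two of the capability bits, repeated from the enum in this header. */
enum {
	TIPC_BCAST_RCAST   = (1 << 4),
	TIPC_GAP_ACK_BLOCK = (1 << 8)
};

/* The masks below are 16 bits wide, so every bit must fit. */
static_assert(TIPC_GAP_ACK_BLOCK <= UINT16_MAX,
	      "capability bits must fit in 16 bits");

/* True if every bit in 'wanted' is set in the peer's capability mask. */
static inline bool caps_has(uint16_t peer_caps, uint16_t wanted)
{
	return (peer_caps & wanted) == wanted;
}

/* Capabilities advertised by both ends. */
static inline uint16_t caps_common(uint16_t local_caps, uint16_t peer_caps)
{
	return local_caps & peer_caps;
}
```

A caller would typically pass the value obtained for a peer and check one bit before relying on the corresponding feature.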
2018-09-28 20:23:21 +02:00
#define TIPC_NODE_CAPABILITIES (TIPC_SYN_BIT           |  \
				TIPC_BCAST_SYNCH       |  \
				TIPC_BCAST_STATE_NACK  |  \
				TIPC_BCAST_RCAST       |  \
				TIPC_BLOCK_FLOWCTL     |  \
				TIPC_NODE_ID128        |  \
2019-03-19 18:49:49 +07:00
				TIPC_LINK_PROTO_SEQNO  |  \
2019-04-04 11:09:51 +07:00
				TIPC_MCAST_RBCTL       |  \
2019-07-24 08:56:11 +07:00
				TIPC_GAP_ACK_BLOCK     |  \
				TIPC_TUNNEL_ENHANCED)
2015-11-19 14:30:46 -05:00
#define INVALID_BEARER_ID -1
2015-10-22 08:51:40 -04:00
2015-01-09 15:27:05 +08:00
void tipc_node_stop(struct net *net);
2018-04-25 19:29:36 +02:00
bool tipc_node_get_id(struct net *net, u32 addr, u8 *id);
tipc: enable tracepoints in tipc
For debugging and tracing purposes, this commit enables tracepoints in
TIPC along with some general trace_events as shown below. It also
defines some 'tipc_*_dump()' functions that allow TIPC object data to be
dumped whenever needed, that is, for general debug purposes, i.e. not
just for the trace_events.
The following trace_events are now available:
- trace_tipc_skb_dump(): traces and dumps TIPC msg & skb data,
e.g. message type, user, droppable, skb truesize, cloned skb, etc.
- trace_tipc_list_dump(): traces and dumps any TIPC buffers or
queues, e.g. TIPC link transmq, socket receive queue, etc.
- trace_tipc_sk_dump(): traces and dumps TIPC socket data, e.g.
sk state, sk type, connection type, rmem_alloc, socket queues, etc.
- trace_tipc_link_dump(): traces and dumps TIPC link data, e.g.
link state, silent_intv_cnt, gap, bc_gap, link queues, etc.
- trace_tipc_node_dump(): traces and dumps TIPC node data, e.g.
node state, active links, capabilities, link entries, etc.
How to use:
Put the trace functions wherever we want to dump TIPC data or events.
Note:
a) The dump functions generate raw data only, to offload the trace
event's processing. Parsing the data may require a tool or script, but
this should be simple.
b) The trace_tipc_*_dump() events should be reserved for failure cases
only (e.g. the retransmission failure case) or for conditions we do not
expect to happen often. We can then consider enabling these events by
default, since they have almost no effect under normal conditions,
but once the rare condition or failure occurs, we get the full dumped
data for post-analysis.
For other trace purposes, we can reuse these trace classes as templates
for different events.
c) A trace_event is only effective when we enable it. To enable the
TIPC trace_events, echo 1 to 'enable' files in the events/tipc/
directory in the 'debugfs' file system. Normally, they are located at:
/sys/kernel/debug/tracing/events/tipc/
For example:
To enable the tipc_link_dump event:
echo 1 > /sys/kernel/debug/tracing/events/tipc/tipc_link_dump/enable
To enable all the TIPC trace_events:
echo 1 > /sys/kernel/debug/tracing/events/tipc/enable
To collect the trace data:
cat trace
or
cat trace_pipe > /trace.out &
To disable all the TIPC trace_events:
echo 0 > /sys/kernel/debug/tracing/events/tipc/enable
To clear the trace buffer:
echo > trace
d) As with other trace_events, features such as 'filter' and 'trigger'
can also be used with the TIPC trace_events.
For more details, have a look at:
Documentation/trace/ftrace.txt
MAINTAINERS | add two new files 'trace.h' & 'trace.c' in tipc
Acked-by: Ying Xue <ying.xue@windriver.com>
Tested-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 09:17:56 +07:00
u32 tipc_node_get_addr(struct tipc_node *node);
tipc: handle collisions of 32-bit node address hash values
When a 32-bit node address is generated from a 128-bit identifier,
there is a risk of collisions which must be discovered and handled.
We do this as follows:
- We don't apply the generated address immediately to the node, but do
instead initiate a 1 sec trial period to allow other cluster members
to discover and handle such collisions.
- During the trial period the node periodically sends out a new type
of message, DSC_TRIAL_MSG, using broadcast or emulated broadcast,
to all the other nodes in the cluster.
- When a node receives such a message, it must check that the
presented 32-bit identifier either is unused, or was used by the very
same peer in a previous session. In both cases it accepts the request
by not responding to it.
- If it finds that the same node has been up before using a different
address, it responds with a DSC_TRIAL_FAIL_MSG containing that
address.
- If it finds that the address has already been taken by some other
node, it generates a new, unused address and returns it to the
requester.
- During the trial period the requesting node must always be prepared
to accept a failure message, i.e., a message where a peer suggests a
different (or equal) address to the one tried. In those cases it
must apply the suggested value as trial address and restart the trial
period.
This algorithm ensures that in the vast majority of cases a node will
have the same address before and after a reboot. If a legacy user
configures the address explicitly, there will be no trial period and
messages, so this protocol addition is completely backwards compatible.
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
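The receiver-side decision described in the list above can be sketched as one function over a table of known peers. This is a simplified illustration, not the kernel code: `struct known_peer`, `addr_in_use()` and `check_trial()` are hypothetical names, a return value of 0 is assumed to mean "accept, send no response", and the linear probing used to pick a replacement address is a placeholder, since the message does not say how a new address is generated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ID_LEN 16	/* 128-bit node identity */

/* Hypothetical entry in a table of peers seen so far. */
struct known_peer {
	uint8_t id[ID_LEN];
	uint32_t addr;
};

/* Is 'addr' held by any known peer? */
static bool addr_in_use(const struct known_peer *tbl, size_t n, uint32_t addr)
{
	for (size_t i = 0; i < n; i++)
		if (tbl[i].addr == addr)
			return true;
	return false;
}

/*
 * Evaluate a trial request <id, addr> against the known peers.
 * Returns the address to suggest in a failure message, or 0 (assumed
 * sentinel) when the request is accepted and no response is sent.
 */
static uint32_t check_trial(const struct known_peer *tbl, size_t n,
			    const uint8_t *id, uint32_t addr)
{
	bool taken = false;

	assert(id && (tbl || n == 0));
	for (size_t i = 0; i < n; i++) {
		/* Same node seen before: accept only its previous address. */
		if (!memcmp(tbl[i].id, id, ID_LEN))
			return tbl[i].addr == addr ? 0 : tbl[i].addr;
		if (tbl[i].addr == addr)
			taken = true;
	}
	if (!taken)
		return 0;	/* address unused: accept silently */
	/* Taken by another node: suggest an unused one (placeholder probing). */
	do {
		addr++;
	} while (addr == 0 || addr_in_use(tbl, n, addr));
	return addr;
}
```

The requesting side would treat any non-zero suggestion as its new trial address and restart the trial period, as the last bullet describes.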
2018-03-22 20:42:51 +01:00
u32 tipc_node_try_addr(struct net *net, u8 *id, u32 addr);
void tipc_node_check_dest(struct net *net, u32 onode, u8 *peer_id128,
2015-07-30 18:24:22 -04:00
			  struct tipc_bearer *bearer,
			  u16 capabilities, u32 signature,
			  struct tipc_media_addr *maddr,
			  bool *respond, bool *dupl_addr);
2015-07-30 18:24:16 -04:00
void tipc_node_delete_links(struct net *net, int bearer_id);
2018-04-19 11:06:20 +02:00
void tipc_node_apply_property(struct net *net, struct tipc_bearer *b, int prop);
2015-01-09 15:27:05 +08:00
int tipc_node_get_linkname(struct net *net, u32 bearer_id, u32 node,
			   char *linkname, size_t len);
2015-07-16 16:54:24 -04:00
int tipc_node_xmit(struct net *net, struct sk_buff_head *list, u32 dnode,
		   int selector);
2017-10-13 11:04:21 +02:00
int tipc_node_distr_xmit(struct net *net, struct sk_buff_head *list);
2015-07-16 16:54:24 -04:00
int tipc_node_xmit_skb(struct net *net, struct sk_buff *skb, u32 dest,
		       u32 selector);
2015-11-19 14:30:42 -05:00
void tipc_node_subscribe(struct net *net, struct list_head *subscr, u32 addr);
void tipc_node_unsubscribe(struct net *net, struct list_head *subscr, u32 addr);
void tipc_node_broadcast(struct net *net, struct sk_buff *skb);
2015-01-09 15:27:05 +08:00
int tipc_node_add_conn(struct net *net, u32 dnode, u32 port, u32 peer_port);
void tipc_node_remove_conn(struct net *net, u32 dnode, u32 port);
2015-11-19 14:30:45 -05:00
int tipc_node_get_mtu(struct net *net, u32 addr, u32 sel);
2017-10-13 11:04:19 +02:00
bool tipc_node_is_up(struct net *net, u32 addr);
2016-05-02 11:58:46 -04:00
u16 tipc_node_get_capabilities(struct net *net, u32 addr);
2014-11-20 10:29:17 +01:00
int tipc_nl_node_dump(struct sk_buff *skb, struct netlink_callback *cb);
2015-11-19 14:30:46 -05:00
int tipc_nl_node_dump_link(struct sk_buff *skb, struct netlink_callback *cb);
2015-11-19 14:30:45 -05:00
int tipc_nl_node_reset_link_stats(struct sk_buff *skb, struct genl_info *info);
int tipc_nl_node_get_link(struct sk_buff *skb, struct genl_info *info);
int tipc_nl_node_set_link(struct sk_buff *skb, struct genl_info *info);
2016-08-18 10:33:52 +02:00
int tipc_nl_peer_rm(struct sk_buff *skb, struct genl_info *info);
2014-06-25 20:41:33 -05:00
2016-07-26 08:47:19 +02:00
int tipc_nl_node_set_monitor(struct sk_buff *skb, struct genl_info *info);
2016-07-26 08:47:20 +02:00
int tipc_nl_node_get_monitor(struct sk_buff *skb, struct genl_info *info);
2016-07-26 08:47:22 +02:00
int tipc_nl_node_dump_monitor(struct sk_buff *skb, struct netlink_callback *cb);
int tipc_nl_node_dump_monitor_peer(struct sk_buff *skb,
				   struct netlink_callback *cb);
2006-01-02 19:04:38 +01:00
#endif