/*
 * net/tipc/socket.c: TIPC socket API
 *
 * Copyright (c) 2001-2007, 2012-2014, Ericsson AB
tipc: introduce new TIPC server infrastructure
TIPC has two internal servers, one providing a subscription
service for topology events, and another providing the
configuration interface. These servers have previously been running
in BH context, accessing the TIPC-port (aka native) API directly.
Apart from these servers, even the TIPC socket implementation is
partially built on this API.
As this API may simultaneously be called via different paths and in
different contexts, a complex and costly lock policy is required
in order to protect TIPC internal resources.
To eliminate the need for this complex lock policy, we introduce
a new, generic service API that uses kernel sockets for message
passing instead of the native API. Once the topology and configuration
servers are converted to use this new service, all code pertaining
to the native API can be removed. This entails a significant
reduction in code amount and complexity, and opens up for a complete
rework of the locking policy in TIPC.
The new service also solves another problem:
As the current topology server works in BH context, it cannot easily
be blocked when sending of events fails due to congestion. In such
cases events may have to be silently dropped, something that is
unacceptable. Therefore, the new service keeps a dedicated outbound
queue receiving messages from BH context. Once messages are
inserted into this queue, we will immediately schedule a work from a
special workqueue. This way, messages/events from the topology server
are in reality sent in process context, and the server can block
if necessary.
Analogously, there is a new workqueue for receiving messages. Once a
notification about an arriving message is received in BH context, we
schedule a work from the receive workqueue to do the job of
receiving the message in process context.
As both sending and receiving of messages are now completed in process
context, subscribed events cannot be dropped any more.
As of this commit, this new server infrastructure is built but
not yet called by the existing TIPC code. Since the
conversion changes required in order to use it are significant,
the addition is kept here as a separate commit.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
 * Copyright (c) 2004-2008, 2010-2013, Wind River Systems
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions are met:
 *
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in the
 *    documentation and/or other materials provided with the distribution.
 * 3. Neither the names of the copyright holders nor the names of its
 *    contributors may be used to endorse or promote products derived from
 *    this software without specific prior written permission.
 *
 * Alternatively, this software may be distributed under the terms of the
 * GNU General Public License ("GPL") version 2 as published by the Free
 * Software Foundation.
 *
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
 * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
 * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
 * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
 * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
 * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
 * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
 * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
 * POSSIBILITY OF SUCH DAMAGE.
 */

#include "core.h"
#include "name_table.h"
#include "node.h"
#include "link.h"
#include <linux/export.h>
#include "config.h"
#include "socket.h"

#define SS_LISTENING		-1	/* socket is listening */
#define SS_READY		-2	/* socket is connectionless */

#define CONN_TIMEOUT_DEFAULT	8000	/* default connect timeout = 8s */
#define CONN_PROBING_INTERVAL	3600000	/* [ms] => 1 h */
#define TIPC_FWD_MSG		1
#define TIPC_CONN_OK		0
#define TIPC_CONN_PROBING	1

/**
 * struct tipc_sock - TIPC socket structure
 * @sk: socket - interacts with 'port' and with user via the socket API
 * @connected: non-zero if port is currently connected to a peer port
 * @conn_type: TIPC type used when connection was established
 * @conn_instance: TIPC instance used when connection was established
 * @published: non-zero if port has one or more associated names
 * @max_pkt: maximum packet size "hint" used when building messages sent by port
 * @ref: unique reference to port in TIPC object registry
 * @phdr: preformatted message header used when sending messages
 * @sock_list: adjacent ports in TIPC's global list of ports
 * @publications: list of publications for port
 * @pub_count: total # of publications port has made during its lifetime
 * @probing_state: connection probing state (TIPC_CONN_OK or TIPC_CONN_PROBING)
 * @probing_interval: interval between connection probes, in ms
 * @timer: timer whose handler is tipc_sk_timeout()
 * @conn_timeout: the time we can wait for an unresponded setup request
 * @dupl_rcvcnt: number of bytes counted twice, in both backlog and rcv queue
 * @link_cong: non-zero if owner must sleep because of link congestion
 * @sent_unacked: # messages sent by socket, and not yet acked by peer
 * @rcv_unacked: # messages read by user, but not yet acked back to peer
 */
struct tipc_sock {
	struct sock sk;
	int connected;
	u32 conn_type;
	u32 conn_instance;
	int published;
	u32 max_pkt;
	u32 ref;
	struct tipc_msg phdr;
	struct list_head sock_list;
	struct list_head publications;
	u32 pub_count;
	u32 probing_state;
	u32 probing_interval;
	struct timer_list timer;
	uint conn_timeout;
	atomic_t dupl_rcvcnt;
	bool link_cong;
	uint sent_unacked;
	uint rcv_unacked;
};

tipc: compensate for double accounting in socket rcv buffer
The function net/core/sock.c::__release_sock() runs a tight loop
to move buffers from the socket backlog queue to the receive queue.
As a security measure, sk_backlog.len of the receiving socket
is not set to zero until after the loop is finished, i.e., until
the whole backlog queue has been transferred to the receive queue.
During this transfer, the data that has already been moved is counted
both in the backlog queue and the receive queue, hence giving an
incorrect picture of the available queue space for new arriving buffers.
This leads to unnecessary rejection of buffers by sk_add_backlog(),
which in TIPC leads to unnecessarily broken connections.
In this commit, we compensate for this double accounting by adding
a counter that keeps track of it. The function socket.c::backlog_rcv()
receives buffers one by one from __release_sock(), and adds them to the
socket receive queue. If the transfer is successful, it increases a new
atomic counter 'tipc_sock::dupl_rcvcnt' with 'truesize' of the
transferred buffer. If a new buffer arrives during this transfer and
finds the socket busy (owned), we attempt to add it to the backlog.
However, when sk_add_backlog() is called, we adjust the 'limit'
parameter with the value of the new counter, so that the risk of
inadvertent rejection is eliminated.
It should be noted that this change does not invalidate the original
purpose of zeroing 'sk_backlog.len' after the full transfer. We set an
upper limit for dupl_rcvcnt, so that if a 'wild' sender (i.e., one that
doesn't respect the send window) keeps pumping in buffers to
sk_add_backlog(), he will eventually reach an upper limit,
(2 x TIPC_CONN_OVERLOAD_LIMIT). After that, no messages can be added
to the backlog, and the connection will be broken. Ordinary, well-
behaved senders will never reach this buffer limit at all.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
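
The compensation described in the commit message above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `struct acct_sock`, the `acct_*` helpers, and `ACCT_OVERLOAD_LIMIT` are hypothetical names invented for illustration, plain byte counts stand in for real `sk_buff` truesize accounting, and none of it is the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for TIPC_CONN_OVERLOAD_LIMIT; not the kernel's value. */
#define ACCT_OVERLOAD_LIMIT 4096u

/* Hypothetical model of the byte counters involved (not a kernel struct). */
struct acct_sock {
	unsigned int backlog_len;	/* bytes still accounted to the backlog */
	unsigned int rcvq_len;		/* bytes accounted to the receive queue */
	unsigned int rcvbuf;		/* nominal receive buffer limit */
	unsigned int dupl_rcvcnt;	/* bytes counted in both queues at once */
};

/* Models the backlog admission check: reject once both queues exceed 'limit'. */
static bool acct_add_backlog(struct acct_sock *s, unsigned int truesize,
			     unsigned int limit)
{
	assert(s != NULL);
	if (s->backlog_len + s->rcvq_len > limit)
		return false;
	s->backlog_len += truesize;
	return true;
}

/*
 * Models one step of the release loop: the buffer lands in the receive
 * queue, but backlog_len is not reduced until the whole loop has finished,
 * so the moved bytes are remembered in dupl_rcvcnt (up to a cap).
 */
static void acct_backlog_rcv(struct acct_sock *s, unsigned int truesize)
{
	assert(s != NULL);
	s->rcvq_len += truesize;
	if (s->dupl_rcvcnt < ACCT_OVERLOAD_LIMIT)
		s->dupl_rcvcnt += truesize;
}

/* Models a buffer arriving while the socket is owned by a process. */
static bool acct_arrive_owned(struct acct_sock *s, unsigned int truesize,
			      bool compensate)
{
	unsigned int limit;

	assert(s != NULL);
	if (s->backlog_len == 0)
		s->dupl_rcvcnt = 0;	/* nothing is double counted any more */
	limit = s->rcvbuf + (compensate ? s->dupl_rcvcnt : 0);
	return acct_add_backlog(s, truesize, limit);
}
```

In this model, with a 1000-byte limit, three 300-byte buffers in the backlog and two of them already moved to the receive queue, the uncompensated check sees 1500 bytes and rejects the next arrival, while the compensated limit of 1600 bytes accepts it. Because the counter stops growing once it reaches the cap, a sender that ignores the send window cannot inflate the limit indefinitely.
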
static int tipc_backlog_rcv(struct sock *sk, struct sk_buff *skb);
static void tipc_data_ready(struct sock *sk);
static void tipc_write_space(struct sock *sk);
static int tipc_release(struct socket *sock);
static int tipc_accept(struct socket *sock, struct socket *new_sock, int flags);
static int tipc_wait_for_sndmsg(struct socket *sock, long *timeo_p);
static void tipc_sk_timeout(unsigned long ref);
static int tipc_sk_publish(struct tipc_sock *tsk, uint scope,
			   struct tipc_name_seq const *seq);
static int tipc_sk_withdraw(struct tipc_sock *tsk, uint scope,
			    struct tipc_name_seq const *seq);
static u32 tipc_sk_ref_acquire(struct tipc_sock *tsk);
static void tipc_sk_ref_discard(u32 ref);
static struct tipc_sock *tipc_sk_get(u32 ref);
static struct tipc_sock *tipc_sk_get_next(u32 *ref);
static void tipc_sk_put(struct tipc_sock *tsk);

static const struct proto_ops packet_ops;
static const struct proto_ops stream_ops;
static const struct proto_ops msg_ops;

static struct proto tipc_proto;
static struct proto tipc_proto_kern;

/*
 * Revised TIPC socket locking policy:
 *
 * Most socket operations take the standard socket lock when they start
 * and hold it until they finish (or until they need to sleep).  Acquiring
 * this lock grants the owner exclusive access to the fields of the socket
 * data structures, with the exception of the backlog queue.  A few socket
 * operations can be done without taking the socket lock because they only
 * read socket information that never changes during the life of the socket.
 *
 * Socket operations may acquire the lock for the associated TIPC port if they
 * need to perform an operation on the port.  If any routine needs to acquire
 * both the socket lock and the port lock it must take the socket lock first
 * to avoid the risk of deadlock.
 *
 * The dispatcher handling incoming messages cannot grab the socket lock in
 * the standard fashion, since when invoked it runs at the BH level and cannot
 * block.  Instead, it checks to see if the socket lock is currently owned by
 * someone, and either handles the message itself or adds it to the socket's
 * backlog queue; in the latter case the queued message is processed once the
 * process owning the socket lock releases it.
 *
 * NOTE: Releasing the socket lock while an operation is sleeping overcomes
 * the problem of a blocked socket operation preventing any other operations
 * from occurring.  However, applications must be careful if they have
 * multiple threads trying to send (or receive) on the same socket, as these
 * operations might interfere with each other.  For example, doing a connect
 * and a receive at the same time might allow the receive to consume the
 * ACK message meant for the connect.  While additional work could be done
 * to try and overcome this, it doesn't seem to be worthwhile at the present.
 *
 * NOTE: Releasing the socket lock while an operation is sleeping also ensures
 * that another operation that must be performed in a non-blocking manner is
 * not delayed for very long because the lock has already been taken.
 *
 * NOTE: This code assumes that certain fields of a port/socket pair are
 * constant over its lifetime; such fields can be examined without taking
 * the socket lock and/or port lock, and do not need to be re-read even
 * after resuming processing after waiting.  These fields include:
 *   - socket type
 *   - pointer to socket sk structure (aka tipc_sock structure)
 *   - pointer to port structure
 *   - port reference
 */
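
The dispatcher rule in the locking policy above (handle the message directly when nobody owns the socket lock, otherwise park it on the backlog until the owner releases the lock) can be sketched as a single-threaded userspace model. Everything here is an assumption made for illustration: `struct own_sock`, the `own_*` helpers, and `OWN_BACKLOG_MAX` are hypothetical names, integers stand in for messages, and the spinlock that protects the real backlog queue is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define OWN_BACKLOG_MAX 8	/* hypothetical backlog capacity */

/* Hypothetical model of a socket with an owner flag and a backlog. */
struct own_sock {
	bool owned;			/* a process-context operation holds the lock */
	int backlog[OWN_BACKLOG_MAX];	/* messages deferred by the dispatcher */
	size_t backlog_cnt;		/* number of deferred messages */
	int handled;			/* number of messages processed so far */
};

/* Stand-in for the real per-message receive processing. */
static void own_handle(struct own_sock *s, int msg)
{
	(void)msg;
	s->handled++;
}

/* Process context: take ownership of the socket. */
static void own_lock(struct own_sock *s)
{
	assert(!s->owned);
	s->owned = true;
}

/*
 * Dispatcher path: must not sleep, so it never waits for the owner.
 * Returns false if the message had to be dropped (backlog full).
 */
static bool own_dispatch(struct own_sock *s, int msg)
{
	if (!s->owned) {
		own_handle(s, msg);
		return true;
	}
	if (s->backlog_cnt == OWN_BACKLOG_MAX)
		return false;
	s->backlog[s->backlog_cnt++] = msg;
	return true;
}

/* Process context: drain the backlog in arrival order, then drop ownership. */
static void own_release(struct own_sock *s)
{
	size_t i;

	assert(s->owned);
	for (i = 0; i < s->backlog_cnt; i++)
		own_handle(s, s->backlog[i]);
	s->backlog_cnt = 0;
	s->owned = false;
}
```

A message dispatched while the socket is unowned is handled immediately; messages dispatched between `own_lock()` and `own_release()` are only counted as handled once `own_release()` has drained the backlog.
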

static u32 tsk_peer_node(struct tipc_sock *tsk)
{
	return msg_destnode(&tsk->phdr);
}

static u32 tsk_peer_port(struct tipc_sock *tsk)
{
	return msg_destport(&tsk->phdr);
}

static bool tsk_unreliable(struct tipc_sock *tsk)
{
	return msg_src_droppable(&tsk->phdr) != 0;
}

static void tsk_set_unreliable(struct tipc_sock *tsk, bool unreliable)
{
	msg_set_src_droppable(&tsk->phdr, unreliable ? 1 : 0);
}

static bool tsk_unreturnable(struct tipc_sock *tsk)
{
	return msg_dest_droppable(&tsk->phdr) != 0;
}

static void tsk_set_unreturnable(struct tipc_sock *tsk, bool unreturnable)
{
	msg_set_dest_droppable(&tsk->phdr, unreturnable ? 1 : 0);
}

static int tsk_importance(struct tipc_sock *tsk)
{
	return msg_importance(&tsk->phdr);
}

static int tsk_set_importance(struct tipc_sock *tsk, int imp)
{
	if (imp > TIPC_CRITICAL_IMPORTANCE)
		return -EINVAL;
	msg_set_importance(&tsk->phdr, (u32)imp);
	return 0;
}

static struct tipc_sock *tipc_sk(const struct sock *sk)
{
	return container_of(sk, struct tipc_sock, sk);
}

static int tsk_conn_cong(struct tipc_sock *tsk)
{
	return tsk->sent_unacked >= TIPC_FLOWCTRL_WIN;
}

/**
 * tsk_advance_rx_queue - discard first buffer in socket receive queue
 *
 * Caller must hold socket lock
 */
static void tsk_advance_rx_queue(struct sock *sk)
{
	kfree_skb(__skb_dequeue(&sk->sk_receive_queue));
}

/**
 * tsk_rej_rx_queue - reject all buffers in socket receive queue
 *
 * Caller must hold socket lock
 */
static void tsk_rej_rx_queue(struct sock *sk)
{
	struct sk_buff *buf;
	u32 dnode;

	while ((buf = __skb_dequeue(&sk->sk_receive_queue))) {
		if (tipc_msg_reverse(buf, &dnode, TIPC_ERR_NO_PORT))
			tipc_link_xmit(buf, dnode, 0);
	}
}

/* tsk_peer_msg - verify if message was sent by connected port's peer
 *
 * Handles cases where the node's network address has changed from
 * the default of <0.0.0> to its configured setting.
 */
static bool tsk_peer_msg(struct tipc_sock *tsk, struct tipc_msg *msg)
{
	u32 peer_port = tsk_peer_port(tsk);
	u32 orig_node;
	u32 peer_node;

	if (unlikely(!tsk->connected))
		return false;

	if (unlikely(msg_origport(msg) != peer_port))
		return false;

	orig_node = msg_orignode(msg);
	peer_node = tsk_peer_node(tsk);

	if (likely(orig_node == peer_node))
		return true;

	if (!orig_node && (peer_node == tipc_own_addr))
		return true;

	if (!peer_node && (orig_node == tipc_own_addr))
		return true;

	return false;
}
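
The address-matching rules in tsk_peer_msg() can be exercised outside the kernel with a plain-C mirror of the same checks. This is a sketch only: `struct peer_conn` and `peer_msg_matches()` are hypothetical names, the message's origin port and node are passed as plain integers instead of being read from a `struct tipc_msg`, and the node's own address is passed as a parameter rather than taken from `tipc_own_addr`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical summary of the connection state that the check consults. */
struct peer_conn {
	bool connected;		/* socket currently has a peer */
	uint32_t peer_port;	/* peer's port reference */
	uint32_t peer_node;	/* peer's node address, 0 if still <0.0.0> */
};

/*
 * Mirrors the decision order of tsk_peer_msg(): the port must match, and
 * the node must either match exactly or differ only because one side still
 * carries the default address 0 while the other carries our own address.
 */
static bool peer_msg_matches(const struct peer_conn *c, uint32_t orig_port,
			     uint32_t orig_node, uint32_t own_addr)
{
	assert(c != 0);

	if (!c->connected)
		return false;
	if (orig_port != c->peer_port)
		return false;
	if (orig_node == c->peer_node)
		return true;
	if (!orig_node && c->peer_node == own_addr)
		return true;
	if (!c->peer_node && orig_node == own_addr)
		return true;
	return false;
}
```

The two special cases cover a node-local connection that was set up before the node received its configured address: one side of the comparison still holds 0 while the other already holds the configured address.
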

/**
 * tipc_sk_create - create a TIPC socket
 * @net: network namespace (must be default network)
 * @sock: pre-allocated socket structure
 * @protocol: protocol indicator (must be 0)
 * @kern: caused by kernel or by userspace?
 *
 * This routine creates additional data structures used by the TIPC socket,
 * initializes them, and links them together.
 *
 * Returns 0 on success, errno otherwise
 */
static int tipc_sk_create(struct net *net, struct socket *sock,
			  int protocol, int kern)
{
	const struct proto_ops *ops;
	socket_state state;
	struct sock *sk;
	struct tipc_sock *tsk;
	struct tipc_msg *msg;
	u32 ref;

	/* Validate arguments */
	if (unlikely(protocol != 0))
		return -EPROTONOSUPPORT;

	switch (sock->type) {
	case SOCK_STREAM:
		ops = &stream_ops;
		state = SS_UNCONNECTED;
		break;
	case SOCK_SEQPACKET:
		ops = &packet_ops;
		state = SS_UNCONNECTED;
		break;
	case SOCK_DGRAM:
	case SOCK_RDM:
		ops = &msg_ops;
		state = SS_READY;
		break;
	default:
		return -EPROTOTYPE;
	}

	/* Allocate socket's protocol area */
	if (!kern)
		sk = sk_alloc(net, AF_TIPC, GFP_KERNEL, &tipc_proto);
	else
		sk = sk_alloc(net, AF_TIPC, GFP_KERNEL, &tipc_proto_kern);

	if (sk == NULL)
		return -ENOMEM;

	tsk = tipc_sk(sk);
	ref = tipc_sk_ref_acquire(tsk);
	if (!ref) {
		pr_warn("Socket create failed; reference table exhausted\n");
		sk_free(sk);	/* release the sock allocated above */
		return -ENOMEM;
	}
	tsk->max_pkt = MAX_PKT_DEFAULT;
	tsk->ref = ref;
	INIT_LIST_HEAD(&tsk->publications);
	msg = &tsk->phdr;
	tipc_msg_init(msg, TIPC_LOW_IMPORTANCE, TIPC_NAMED_MSG,
		      NAMED_H_SIZE, 0);
	msg_set_origport(msg, ref);

	/* Finish initializing socket data structures */
	sock->ops = ops;
	sock->state = state;
	sock_init_data(sock, sk);
	k_init_timer(&tsk->timer, (Handler)tipc_sk_timeout, ref);
	sk->sk_backlog_rcv = tipc_backlog_rcv;
	sk->sk_rcvbuf = sysctl_tipc_rmem[1];
	sk->sk_data_ready = tipc_data_ready;
	sk->sk_write_space = tipc_write_space;
tsk - > conn_timeout = CONN_TIMEOUT_DEFAULT ;
2014-06-25 20:41:42 -05:00
tsk - > sent_unacked = 0 ;
2014-05-14 05:39:09 -04:00
atomic_set ( & tsk - > dupl_rcvcnt , 0 ) ;
2008-05-12 15:42:28 -07:00
2008-04-15 00:22:02 -07:00
if ( sock - > state = = SS_READY ) {
2014-08-22 18:09:20 -04:00
tsk_set_unreturnable ( tsk , true ) ;
2008-04-15 00:22:02 -07:00
if ( sock - > type = = SOCK_DGRAM )
2014-08-22 18:09:20 -04:00
tsk_set_unreliable ( tsk , true ) ;
2008-04-15 00:22:02 -07:00
}
2006-01-02 19:04:38 +01:00
return 0 ;
}
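The double-accounting compensation described in the commit message above can be sketched as a small user-space model. This is only an illustration under stated assumptions, not the kernel code: `struct rx_model`, the `model_*` helpers, and `MODEL_DUPL_CAP` are hypothetical names, the "queues full" test is a simplified form of the check made when a buffer is added to the backlog, and the capping rule is an assumed stand-in for the TIPC_CONN_OVERLOAD_LIMIT-based bound.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cap on the compensation counter (assumption, not the
 * kernel's value). */
#define MODEL_DUPL_CAP 1000u

struct rx_model {
	unsigned int rmem;    /* bytes charged to the receive queue */
	unsigned int backlog; /* bytes charged to the backlog; stale while draining */
	unsigned int dupl;    /* bytes counted in both queues (cf. dupl_rcvcnt) */
};

/* Simplified "queues full" test: both queues are summed against the limit. */
static bool model_queues_full(const struct rx_model *m, unsigned int limit)
{
	assert(m != NULL);
	return m->backlog + m->rmem > limit;
}

/* One iteration of the drain loop: the buffer is moved to the receive
 * queue, but the backlog length is deliberately left untouched. */
static void model_drain_one(struct rx_model *m, unsigned int truesize)
{
	assert(m != NULL);
	m->rmem += truesize;
	if (m->dupl < MODEL_DUPL_CAP)
		m->dupl += truesize;
}

/* End of the drain loop: only now is the backlog length reset. */
static void model_drain_done(struct rx_model *m)
{
	assert(m != NULL);
	m->backlog = 0;
}

/* Arrival of a new buffer while the socket is owned: the limit is widened
 * by the amount that is currently counted twice. */
static bool model_backlog_add(struct rx_model *m, unsigned int truesize,
			      unsigned int limit)
{
	assert(m != NULL);
	if (m->backlog == 0)
		m->dupl = 0; /* nothing is double counted any more */
	if (model_queues_full(m, limit + m->dupl))
		return false;
	m->backlog += truesize;
	return true;
}
```

With a limit of 1000 and a backlog of 900 bytes of which 600 have already been moved, the plain check sees 1500 bytes and would reject a new buffer, while the compensated check admits it.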
tipc: introduce new TIPC server infrastructure
TIPC has two internal servers, one providing a subscription
service for topology events, and another providing the
configuration interface. These servers have previously been running
in BH context, accessing the TIPC-port (aka native) API directly.
Apart from these servers, even the TIPC socket implementation is
partially built on this API.
As this API may simultaneously be called via different paths and in
different contexts, a complex and costly lock policy is required
in order to protect TIPC internal resources.
To eliminate the need for this complex lock policy, we introduce
a new, generic service API that uses kernel sockets for message
passing instead of the native API. Once the topology and configuration
servers are converted to use this new service, all code pertaining
to the native API can be removed. This entails a significant
reduction in code amount and complexity, and opens up for a complete
rework of the locking policy in TIPC.
The new service also solves another problem:
As the current topology server works in BH context, it cannot easily
be blocked when sending of events fails due to congestion. In such
cases events may have to be silently dropped, something that is
unacceptable. Therefore, the new service keeps a dedicated outbound
queue receiving messages from BH context. Once messages are
inserted into this queue, we will immediately schedule a work from a
special workqueue. This way, messages/events from the topology server
are in reality sent in process context, and the server can block
if necessary.
Analogously, there is a new workqueue for receiving messages. Once a
notification about an arriving message is received in BH context, we
schedule a work from the receive workqueue to do the job of
receiving the message in process context.
As both sending and receiving of messages are now completed in process
context, subscribed events can no longer be dropped.
As of this commit, the new server infrastructure is built but not yet
called by the existing TIPC code. Because the conversion changes
required to use it are significant, the addition is kept here as a
separate commit.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 10:54:39 -04:00
/**
* tipc_sock_create_local - create TIPC socket from inside TIPC module
* @ type : socket type - SOCK_RDM or SOCK_SEQPACKET
* @ res : receives a pointer to the newly created socket
*
* We cannot use sock_create_kern here because it bumps module user count .
* Since socket owner and creator is the same module we must make sure
* that module count remains zero for module local sockets , otherwise
* we cannot do rmmod .
*
* Returns 0 on success , errno otherwise
*/
int tipc_sock_create_local ( int type , struct socket * * res )
{
int rc ;
rc = sock_create_lite ( AF_TIPC , type , 0 , res ) ;
if ( rc < 0 ) {
pr_err ( " Failed to create kernel socket \n " ) ;
return rc ;
}
rc = tipc_sk_create ( & init_net , * res , 0 , 1 ) ;
if ( rc < 0 ) {
sock_release ( * res ) ;
* res = NULL ;
}
return rc ;
}
/**
* tipc_sock_release_local - release socket created by tipc_sock_create_local
* @ sock : the socket to be released .
*
* Module reference count is not incremented when such sockets are created ,
* so we must keep it from being decremented when they are released .
*/
void tipc_sock_release_local ( struct socket * sock )
{
2014-02-18 16:06:46 +08:00
tipc_release ( sock ) ;
2013-06-17 10:54:39 -04:00
sock - > ops = NULL ;
sock_release ( sock ) ;
}
/**
* tipc_sock_accept_local - accept a connection on a socket created
* with tipc_sock_create_local . Use this function so that the
* module reference count is not inadvertently incremented .
*
* @ sock : the accepting socket
* @ newsock : reference to the new socket to be created
* @ flags : socket flags
*/
int tipc_sock_accept_local ( struct socket * sock , struct socket * * newsock ,
2013-06-17 10:54:47 -04:00
int flags )
2013-06-17 10:54:39 -04:00
{
struct sock * sk = sock - > sk ;
int ret ;
ret = sock_create_lite ( sk - > sk_family , sk - > sk_type ,
sk - > sk_protocol , newsock ) ;
if ( ret < 0 )
return ret ;
2014-02-18 16:06:46 +08:00
ret = tipc_accept ( sock , * newsock , flags ) ;
2013-06-17 10:54:39 -04:00
if ( ret < 0 ) {
sock_release ( * newsock ) ;
return ret ;
}
( * newsock ) - > ops = sock - > ops ;
return ret ;
}
2006-01-02 19:04:38 +01:00
/**
2014-02-18 16:06:46 +08:00
* tipc_release - destroy a TIPC socket
2006-01-02 19:04:38 +01:00
* @ sock : socket to destroy
*
* This routine cleans up any messages that are still queued on the socket .
* For DGRAM and RDM socket types , all queued messages are rejected .
* For SEQPACKET and STREAM socket types , the first message is rejected
* and any others are discarded . ( If the first message on a STREAM socket
* is partially - read , it is discarded and the next one is rejected instead . )
2007-02-09 23:25:21 +09:00
*
2006-01-02 19:04:38 +01:00
* NOTE : Rejected messages are not necessarily returned to the sender ! They
* are returned or discarded according to the " destination droppable " setting
* specified for the message by the sender .
*
* Returns 0 on success , errno otherwise
*/
2014-02-18 16:06:46 +08:00
static int tipc_release ( struct socket * sock )
2006-01-02 19:04:38 +01:00
{
struct sock * sk = sock - > sk ;
2014-03-12 11:31:12 -04:00
struct tipc_sock * tsk ;
2006-01-02 19:04:38 +01:00
struct sk_buff * buf ;
2014-06-25 20:41:35 -05:00
u32 dnode ;
2006-01-02 19:04:38 +01:00
2008-04-15 00:22:02 -07:00
/*
* Exit if socket isn ' t fully initialized ( occurs when a failed accept ( )
* releases a pre - allocated child socket that was never used )
*/
if ( sk = = NULL )
2006-01-02 19:04:38 +01:00
return 0 ;
2007-02-09 23:25:21 +09:00
2014-03-12 11:31:12 -04:00
tsk = tipc_sk ( sk ) ;
2008-04-15 00:22:02 -07:00
lock_sock ( sk ) ;
/*
* Reject all unreceived messages , except on an active connection
* ( which disconnects locally & sends a ' FIN + ' to peer )
*/
2014-08-22 18:09:20 -04:00
dnode = tsk_peer_node ( tsk ) ;
2006-01-02 19:04:38 +01:00
while ( sock - > state ! = SS_DISCONNECTING ) {
2008-04-15 00:22:02 -07:00
buf = __skb_dequeue ( & sk - > sk_receive_queue ) ;
if ( buf = = NULL )
2006-01-02 19:04:38 +01:00
break ;
2013-10-18 07:23:16 +02:00
if ( TIPC_SKB_CB ( buf ) - > handle ! = NULL )
2011-11-04 13:24:29 -04:00
kfree_skb ( buf ) ;
2008-04-15 00:22:02 -07:00
else {
if ( ( sock - > state = = SS_CONNECTING ) | |
( sock - > state = = SS_CONNECTED ) ) {
sock - > state = SS_DISCONNECTING ;
2014-08-22 18:09:20 -04:00
tsk - > connected = 0 ;
tipc_node_remove_conn ( dnode , tsk - > ref ) ;
2008-04-15 00:22:02 -07:00
}
2014-06-25 20:41:35 -05:00
if ( tipc_msg_reverse ( buf , & dnode , TIPC_ERR_NO_PORT ) )
2014-07-16 20:41:03 -04:00
tipc_link_xmit ( buf , dnode , 0 ) ;
2008-04-15 00:22:02 -07:00
}
2006-01-02 19:04:38 +01:00
}
2014-08-22 18:09:20 -04:00
tipc_sk_withdraw ( tsk , 0 , NULL ) ;
tipc_sk_ref_discard ( tsk - > ref ) ;
k_cancel_timer ( & tsk - > timer ) ;
if ( tsk - > connected ) {
2014-08-22 18:09:13 -04:00
buf = tipc_msg_create ( TIPC_CRITICAL_IMPORTANCE , TIPC_CONN_MSG ,
SHORT_H_SIZE , 0 , dnode , tipc_own_addr ,
2014-08-22 18:09:20 -04:00
tsk_peer_port ( tsk ) ,
tsk - > ref , TIPC_ERR_NO_PORT ) ;
2014-08-22 18:09:13 -04:00
if ( buf )
2014-08-22 18:09:20 -04:00
tipc_link_xmit ( buf , dnode , tsk - > ref ) ;
tipc_node_remove_conn ( dnode , tsk - > ref ) ;
2014-08-22 18:09:13 -04:00
}
2014-08-22 18:09:20 -04:00
k_term_timer ( & tsk - > timer ) ;
2006-01-02 19:04:38 +01:00
2008-04-15 00:22:02 -07:00
/* Discard any remaining (connection-based) messages in receive queue */
2013-01-20 23:30:08 +01:00
__skb_queue_purge ( & sk - > sk_receive_queue ) ;
2006-01-02 19:04:38 +01:00
2008-04-15 00:22:02 -07:00
/* Reject any messages that accumulated in backlog queue */
sock - > state = SS_DISCONNECTING ;
release_sock ( sk ) ;
2006-01-02 19:04:38 +01:00
sock_put ( sk ) ;
2008-04-15 00:22:02 -07:00
sock - > sk = NULL ;
2006-01-02 19:04:38 +01:00
2014-04-06 15:56:14 +02:00
return 0 ;
2006-01-02 19:04:38 +01:00
}
/**
2014-02-18 16:06:46 +08:00
* tipc_bind - associate or disassociate TIPC name ( s ) with a socket
2006-01-02 19:04:38 +01:00
* @ sock : socket structure
* @ uaddr : socket address describing name ( s ) and desired operation
* @ uaddr_len : size of socket address data structure
2007-02-09 23:25:21 +09:00
*
2006-01-02 19:04:38 +01:00
* Name and name sequence binding is indicated using a positive scope value ;
* a negative scope value unbinds the specified name . Specifying no name
* ( i . e . a socket address length of 0 ) unbinds all names from the socket .
2007-02-09 23:25:21 +09:00
*
2006-01-02 19:04:38 +01:00
* Returns 0 on success , errno otherwise
2008-04-15 00:22:02 -07:00
*
* NOTE : This routine takes the socket lock around the publish / withdraw
* calls , so concurrent binds on the same socket are serialized .
2006-01-02 19:04:38 +01:00
*/
2014-02-18 16:06:46 +08:00
static int tipc_bind ( struct socket * sock , struct sockaddr * uaddr ,
int uaddr_len )
2006-01-02 19:04:38 +01:00
{
2013-12-27 10:18:28 +08:00
struct sock * sk = sock - > sk ;
2006-01-02 19:04:38 +01:00
struct sockaddr_tipc * addr = ( struct sockaddr_tipc * ) uaddr ;
2014-03-12 11:31:12 -04:00
struct tipc_sock * tsk = tipc_sk ( sk ) ;
2013-12-27 10:18:28 +08:00
int res = - EINVAL ;
2006-01-02 19:04:38 +01:00
2013-12-27 10:18:28 +08:00
lock_sock ( sk ) ;
if ( unlikely ( ! uaddr_len ) ) {
2014-08-22 18:09:20 -04:00
res = tipc_sk_withdraw ( tsk , 0 , NULL ) ;
2013-12-27 10:18:28 +08:00
goto exit ;
}
2007-02-09 23:25:21 +09:00
2013-12-27 10:18:28 +08:00
if ( uaddr_len < sizeof ( struct sockaddr_tipc ) ) {
res = - EINVAL ;
goto exit ;
}
if ( addr - > family ! = AF_TIPC ) {
res = - EAFNOSUPPORT ;
goto exit ;
}
2006-01-02 19:04:38 +01:00
if ( addr - > addrtype = = TIPC_ADDR_NAME )
addr - > addr . nameseq . upper = addr - > addr . nameseq . lower ;
2013-12-27 10:18:28 +08:00
else if ( addr - > addrtype ! = TIPC_ADDR_NAMESEQ ) {
res = - EAFNOSUPPORT ;
goto exit ;
}
2007-02-09 23:25:21 +09:00
tipc: convert topology server to use new server facility
As the new TIPC server infrastructure has been introduced, we can
now convert the TIPC topology server to it. We get two benefits
from doing this:
1) It simplifies the topology server locking policy. In the
original locking policy, we placed one spin lock pointer in the
tipc_subscriber structure to reuse the lock of the subscriber's
server port, controlling access to members of tipc_subscriber
instance. That is, we only used one lock to ensure both
tipc_port and tipc_subscriber members were safely accessed.
Now we introduce a separate spin lock that protects only the
tipc_subscriber structure itself, giving a finer-grained locking
policy. Moreover, the change will allow us to make the topology
server code more readable and maintainable.
2) It fixes a bug where sent subscription events may be lost when
the topology port is congested. Using the new service, the
topology server now queues sent events into an outgoing buffer,
and then wakes up a sender process which has been blocked in
workqueue context. The process will keep picking events from the
buffer and send them to their respective subscribers, using the
kernel socket interface, until the buffer is empty. Even if the
socket is congested during transmission there is no risk that
events may be dropped, since the sender process may block when
needed.
Some minor reordering of initialization is done, since we now
have a scenario where the topology server must be started after
socket initialization has taken place, as the former depends
on the latter. And overall, we see a simplification of the
TIPC subscriber code in making this changeover.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-17 10:54:40 -04:00
if ( ( addr - > addr . nameseq . type < TIPC_RESERVED_TYPES ) & &
2013-06-17 10:54:41 -04:00
( addr - > addr . nameseq . type ! = TIPC_TOP_SRV ) & &
2013-12-27 10:18:28 +08:00
( addr - > addr . nameseq . type ! = TIPC_CFG_SRV ) ) {
res = - EACCES ;
goto exit ;
}
2011-11-02 15:49:40 -04:00
2013-12-27 10:18:28 +08:00
res = ( addr - > scope > 0 ) ?
2014-08-22 18:09:20 -04:00
tipc_sk_publish ( tsk , addr - > scope , & addr - > addr . nameseq ) :
tipc_sk_withdraw ( tsk , - addr - > scope , & addr - > addr . nameseq ) ;
2013-12-27 10:18:28 +08:00
exit :
release_sock ( sk ) ;
return res ;
2006-01-02 19:04:38 +01:00
}
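The decision logic of tipc_bind() above (empty address, reserved name types, sign of the scope) can be summarized in a small stand-alone model. This is a sketch under assumptions, not the kernel routine: `model_bind()` and `enum bind_action` are hypothetical, the three constants are assumed to mirror TIPC_RESERVED_TYPES, TIPC_TOP_SRV and TIPC_CFG_SRV from the TIPC user-space header, and the address length, family and address type checks are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Assumed to mirror the values in the TIPC uapi header. */
#define MODEL_RESERVED_TYPES 64u
#define MODEL_TOP_SRV        1u
#define MODEL_CFG_SRV        0u

enum bind_action {
	BIND_PUBLISH,      /* positive scope: publish the name (sequence) */
	BIND_WITHDRAW,     /* non-positive scope: withdraw the given name */
	BIND_WITHDRAW_ALL  /* zero-length address: withdraw every name */
};

/* Returns 0 and stores the resulting action, or a negative errno. */
static int model_bind(size_t addr_len, unsigned int name_type, int scope,
		      enum bind_action *action)
{
	assert(action != NULL);

	if (addr_len == 0) {
		*action = BIND_WITHDRAW_ALL;
		return 0;
	}

	/* Reserved name types are refused, except the two internal servers. */
	if (name_type < MODEL_RESERVED_TYPES &&
	    name_type != MODEL_TOP_SRV && name_type != MODEL_CFG_SRV)
		return -EACCES;

	*action = (scope > 0) ? BIND_PUBLISH : BIND_WITHDRAW;
	return 0;
}
```

For example, a name of type 1000 with scope 2 is published, the same name with scope -2 is withdrawn, and a name of type 10 is refused with -EACCES.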
2007-02-09 23:25:21 +09:00
/**
2014-02-18 16:06:46 +08:00
* tipc_getname - get port ID of socket or peer socket
2006-01-02 19:04:38 +01:00
* @ sock : socket structure
* @ uaddr : area for returned socket address
* @ uaddr_len : area for returned length of socket address
2008-07-14 22:43:32 -07:00
* @ peer : 0 = own ID , 1 = current peer ID , 2 = current / former peer ID
2007-02-09 23:25:21 +09:00
*
2006-01-02 19:04:38 +01:00
* Returns 0 on success , errno otherwise
2008-04-15 00:22:02 -07:00
*
2008-07-14 22:43:32 -07:00
* NOTE : This routine doesn ' t need to take the socket lock since it only
* accesses socket information that is unchanging ( or which changes in
2010-12-31 18:59:32 +00:00
* a completely predictable manner ) .
2006-01-02 19:04:38 +01:00
*/
2014-02-18 16:06:46 +08:00
static int tipc_getname ( struct socket * sock , struct sockaddr * uaddr ,
int * uaddr_len , int peer )
2006-01-02 19:04:38 +01:00
{
struct sockaddr_tipc * addr = ( struct sockaddr_tipc * ) uaddr ;
2014-03-12 11:31:12 -04:00
struct tipc_sock * tsk = tipc_sk ( sock - > sk ) ;
2006-01-02 19:04:38 +01:00
2010-10-31 07:10:32 +00:00
memset ( addr , 0 , sizeof ( * addr ) ) ;
2008-04-15 00:22:02 -07:00
if ( peer ) {
2008-07-14 22:43:32 -07:00
if ( ( sock - > state ! = SS_CONNECTED ) & &
( ( peer ! = 2 ) | | ( sock - > state ! = SS_DISCONNECTING ) ) )
return - ENOTCONN ;
2014-08-22 18:09:20 -04:00
addr - > addr . id . ref = tsk_peer_port ( tsk ) ;
addr - > addr . id . node = tsk_peer_node ( tsk ) ;
2008-04-15 00:22:02 -07:00
} else {
2014-08-22 18:09:20 -04:00
addr - > addr . id . ref = tsk - > ref ;
2010-11-30 12:01:03 +00:00
addr - > addr . id . node = tipc_own_addr ;
2008-04-15 00:22:02 -07:00
}
2006-01-02 19:04:38 +01:00
* uaddr_len = sizeof ( * addr ) ;
addr - > addrtype = TIPC_ADDR_ID ;
addr - > family = AF_TIPC ;
addr - > scope = 0 ;
addr - > addr . name . domain = 0 ;
2008-04-15 00:22:02 -07:00
return 0 ;
2006-01-02 19:04:38 +01:00
}
/**
2014-02-18 16:06:46 +08:00
* tipc_poll - read and possibly block on pollmask
2006-01-02 19:04:38 +01:00
* @ file : file structure associated with the socket
* @ sock : socket for which to calculate the poll bits
* @ wait : poll table supplied by the caller , passed on to sock_poll_wait ( )
*
2008-03-26 16:48:21 -07:00
* Returns pollmask value
*
* COMMENTARY :
* It appears that the usual socket locking mechanisms are not useful here
* since the pollmask info is potentially out - of - date the moment this routine
* exits . TCP and other protocols seem to rely on higher level poll routines
* to handle any preventable race conditions , so TIPC will do the same . . .
*
* TIPC sets the returned events as follows :
2010-08-17 11:00:06 +00:00
*
* socket state flags set
* - - - - - - - - - - - - - - - - - - - - -
* unconnected no read flags
2012-10-16 16:47:06 +02:00
* POLLOUT if port is not congested
2010-08-17 11:00:06 +00:00
*
* connecting POLLIN / POLLRDNORM if ACK / NACK in rx queue
* no write flags
*
* connected POLLIN / POLLRDNORM if data in rx queue
* POLLOUT if port is not congested
*
* disconnecting POLLIN / POLLRDNORM / POLLHUP
* no write flags
*
* listening POLLIN if SYN in rx queue
* no write flags
*
* ready POLLIN / POLLRDNORM if data in rx queue
* [ connectionless ] POLLOUT ( since port cannot be congested )
*
* IMPORTANT : The fact that a read or write operation is indicated does NOT
* imply that the operation will succeed , merely that it should be performed
* and will not block .
2006-01-02 19:04:38 +01:00
*/
2014-02-18 16:06:46 +08:00
static unsigned int tipc_poll ( struct file * file , struct socket * sock ,
poll_table * wait )
2006-01-02 19:04:38 +01:00
{
2008-03-26 16:48:21 -07:00
struct sock * sk = sock - > sk ;
2014-03-12 11:31:12 -04:00
struct tipc_sock * tsk = tipc_sk ( sk ) ;
2010-08-17 11:00:06 +00:00
u32 mask = 0 ;
2008-03-26 16:48:21 -07:00
2012-08-21 11:16:57 +08:00
sock_poll_wait ( file , sk_sleep ( sk ) , wait ) ;
2008-03-26 16:48:21 -07:00
2010-08-17 11:00:06 +00:00
switch ( ( int ) sock - > state ) {
2012-10-16 16:47:06 +02:00
case SS_UNCONNECTED :
2014-06-25 20:41:42 -05:00
if ( ! tsk - > link_cong )
2012-10-16 16:47:06 +02:00
mask | = POLLOUT ;
break ;
2010-08-17 11:00:06 +00:00
case SS_READY :
case SS_CONNECTED :
2014-08-22 18:09:20 -04:00
if ( ! tsk - > link_cong & & ! tsk_conn_cong ( tsk ) )
2010-08-17 11:00:06 +00:00
mask | = POLLOUT ;
/* fall thru' */
case SS_CONNECTING :
case SS_LISTENING :
if ( ! skb_queue_empty ( & sk - > sk_receive_queue ) )
mask | = ( POLLIN | POLLRDNORM ) ;
break ;
case SS_DISCONNECTING :
mask = ( POLLIN | POLLRDNORM | POLLHUP ) ;
break ;
}
2008-03-26 16:48:21 -07:00
return mask ;
2006-01-02 19:04:38 +01:00
}
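The state-to-flags mapping implemented by the switch in tipc_poll() above can be exercised in isolation with a user-space model. This is a hedged sketch: `enum model_state` and `model_poll_mask()` are hypothetical, and the three boolean parameters stand in for tsk->link_cong, tsk_conn_cong() and a non-empty receive queue. The model follows the code rather than the comment table, so a "ready" socket reports POLLOUT only when the link is not congested.

```c
#define _XOPEN_SOURCE 700 /* make POLLRDNORM visible under strict -std modes */
#include <assert.h>
#include <poll.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the SS_* socket states used above. */
enum model_state {
	ST_UNCONNECTED,
	ST_READY,
	ST_CONNECTED,
	ST_CONNECTING,
	ST_LISTENING,
	ST_DISCONNECTING
};

static short model_poll_mask(enum model_state st, bool link_cong,
			     bool conn_cong, bool rx_nonempty)
{
	short mask = 0;

	switch (st) {
	case ST_UNCONNECTED:
		if (!link_cong)
			mask |= POLLOUT;
		break;
	case ST_READY:
	case ST_CONNECTED:
		if (!link_cong && !conn_cong)
			mask |= POLLOUT;
		/* fall through */
	case ST_CONNECTING:
	case ST_LISTENING:
		if (rx_nonempty)
			mask |= POLLIN | POLLRDNORM;
		break;
	case ST_DISCONNECTING:
		mask = POLLIN | POLLRDNORM | POLLHUP;
		break;
	}
	return mask;
}
```

Note that an unconnected socket never reports readability, even with data queued, because its case breaks before the shared receive-queue test.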
2014-07-16 20:41:01 -04:00
/**
* tipc_sendmcast - send multicast message
* @ sock : socket structure
* @ seq : destination address
* @ iov : message data to send
* @ dsz : total length of message data
* @ timeo : timeout to wait for wakeup
*
* Called from function tipc_sendmsg ( ) , which has done all sanity checks
* Returns the number of bytes sent on success , or errno
*/
static int tipc_sendmcast ( struct socket * sock , struct tipc_name_seq * seq ,
struct iovec * iov , size_t dsz , long timeo )
{
struct sock * sk = sock - > sk ;
2014-08-22 18:09:20 -04:00
struct tipc_msg * mhdr = & tipc_sk ( sk ) - > phdr ;
2014-07-16 20:41:01 -04:00
struct sk_buff * buf ;
uint mtu ;
int rc ;
msg_set_type ( mhdr , TIPC_MCAST_MSG ) ;
msg_set_lookup_scope ( mhdr , TIPC_CLUSTER_SCOPE ) ;
msg_set_destport ( mhdr , 0 ) ;
msg_set_destnode ( mhdr , 0 ) ;
msg_set_nametype ( mhdr , seq - > type ) ;
msg_set_namelower ( mhdr , seq - > lower ) ;
msg_set_nameupper ( mhdr , seq - > upper ) ;
msg_set_hdr_sz ( mhdr , MCAST_H_SIZE ) ;
new_mtu :
mtu = tipc_bclink_get_mtu ( ) ;
2014-07-16 20:41:03 -04:00
rc = tipc_msg_build ( mhdr , iov , 0 , dsz , mtu , & buf ) ;
2014-07-16 20:41:01 -04:00
if ( unlikely ( rc < 0 ) )
return rc ;
do {
rc = tipc_bclink_xmit ( buf ) ;
if ( likely ( rc > = 0 ) ) {
rc = dsz ;
break ;
}
if ( rc = = - EMSGSIZE )
goto new_mtu ;
if ( rc ! = - ELINKCONG )
break ;
2014-08-22 18:09:07 -04:00
tipc_sk ( sk ) - > link_cong = 1 ;
2014-07-16 20:41:01 -04:00
rc = tipc_wait_for_sndmsg ( sock , & timeo ) ;
if ( rc )
kfree_skb_list ( buf ) ;
} while ( ! rc ) ;
return rc ;
}
2014-07-16 20:41:00 -04:00
/* tipc_sk_mcast_rcv - Deliver multicast message to all destination sockets
*/
void tipc_sk_mcast_rcv ( struct sk_buff * buf )
{
struct tipc_msg * msg = buf_msg ( buf ) ;
struct tipc_port_list dports = { 0 , NULL , } ;
struct tipc_port_list * item ;
struct sk_buff * b ;
uint i , last , dst = 0 ;
u32 scope = TIPC_CLUSTER_SCOPE ;
if ( in_own_node ( msg_orignode ( msg ) ) )
scope = TIPC_NODE_SCOPE ;
/* Create destination port list: */
tipc_nametbl_mc_translate ( msg_nametype ( msg ) ,
msg_namelower ( msg ) ,
msg_nameupper ( msg ) ,
scope ,
& dports ) ;
last = dports . count ;
if ( ! last ) {
kfree_skb ( buf ) ;
return ;
}
for ( item = & dports ; item ; item = item - > next ) {
for ( i = 0 ; i < PLSIZE & & + + dst < = last ; i + + ) {
b = ( dst ! = last ) ? skb_clone ( buf , GFP_ATOMIC ) : buf ;
if ( ! b ) {
pr_warn ( " Failed to clone mcast rcv buffer \n " ) ;
continue ;
}
msg_set_destport ( msg , item - > ports [ i ] ) ;
tipc_sk_rcv ( b ) ;
}
}
tipc_port_list_free ( & dports ) ;
}
2014-06-25 20:41:41 -05:00
/**
* tipc_sk_proto_rcv - receive a connection management protocol message
* @ tsk : receiving socket
* @ dnode : node to send response message to , if any
* @ buf : buffer containing protocol message
* Returns 0 ( TIPC_OK ) if message was consumed , 1 ( TIPC_FWD_MSG ) if
* ( CONN_PROBE_REPLY ) message should be forwarded .
*/
2014-07-20 13:14:28 +08:00
static int tipc_sk_proto_rcv ( struct tipc_sock * tsk , u32 * dnode ,
struct sk_buff * buf )
2014-06-25 20:41:41 -05:00
{
struct tipc_msg * msg = buf_msg ( buf ) ;
2014-06-25 20:41:42 -05:00
int conn_cong ;
2014-06-25 20:41:41 -05:00
/* Ignore if connection cannot be validated: */
2014-08-22 18:09:18 -04:00
if ( ! tsk_peer_msg ( tsk , msg ) )
2014-06-25 20:41:41 -05:00
goto exit ;
2014-08-22 18:09:20 -04:00
tsk - > probing_state = TIPC_CONN_OK ;
2014-06-25 20:41:41 -05:00
if ( msg_type ( msg ) = = CONN_ACK ) {
2014-08-22 18:09:20 -04:00
conn_cong = tsk_conn_cong ( tsk ) ;
2014-06-25 20:41:42 -05:00
tsk - > sent_unacked - = msg_msgcnt ( msg ) ;
if ( conn_cong )
2014-08-22 18:09:07 -04:00
tsk - > sk . sk_write_space ( & tsk - > sk ) ;
2014-06-25 20:41:41 -05:00
} else if ( msg_type ( msg ) = = CONN_PROBE ) {
if ( ! tipc_msg_reverse ( buf , dnode , TIPC_OK ) )
return TIPC_OK ;
msg_set_type ( msg , CONN_PROBE_REPLY ) ;
return TIPC_FWD_MSG ;
}
/* Do nothing if msg_type() == CONN_PROBE_REPLY */
exit :
kfree_skb ( buf ) ;
return TIPC_OK ;
}
2007-02-09 23:25:21 +09:00
/**
2006-01-02 19:04:38 +01:00
* dest_name_check - verify user is permitted to send to specified port name
* @ dest : destination address
* @ m : descriptor for message to be sent
2007-02-09 23:25:21 +09:00
*
2006-01-02 19:04:38 +01:00
* Prevents restricted configuration commands from being issued by
* unauthorized users .
2007-02-09 23:25:21 +09:00
*
2006-01-02 19:04:38 +01:00
* Returns 0 if permission is granted , otherwise errno
*/
2006-03-20 22:37:04 -08:00
static int dest_name_check ( struct sockaddr_tipc * dest , struct msghdr * m )
2006-01-02 19:04:38 +01:00
{
struct tipc_cfg_msg_hdr hdr ;
2014-06-25 20:41:37 -05:00
if ( unlikely ( dest - > addrtype = = TIPC_ADDR_ID ) )
return 0 ;
2007-02-09 23:25:21 +09:00
if ( likely ( dest - > addr . name . name . type > = TIPC_RESERVED_TYPES ) )
return 0 ;
if ( likely ( dest - > addr . name . name . type = = TIPC_TOP_SRV ) )
return 0 ;
if ( likely ( dest - > addr . name . name . type ! = TIPC_CFG_SRV ) )
return - EACCES ;
2006-01-02 19:04:38 +01:00
2011-01-18 13:09:29 -05:00
if ( ! m - > msg_iovlen | | ( m - > msg_iov [ 0 ] . iov_len < sizeof ( hdr ) ) )
return - EMSGSIZE ;
2007-02-09 23:25:21 +09:00
if ( copy_from_user ( & hdr , m - > msg_iov [ 0 ] . iov_base , sizeof ( hdr ) ) )
2006-01-02 19:04:38 +01:00
return - EFAULT ;
2006-06-25 23:41:47 -07:00
if ( ( ntohs ( hdr . tcm_type ) & 0xC000 ) & & ( ! capable ( CAP_NET_ADMIN ) ) )
2006-01-02 19:04:38 +01:00
return - EACCES ;
2007-02-09 23:25:21 +09:00
2006-01-02 19:04:38 +01:00
return 0 ;
}
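The final check in dest_name_check() can be illustrated in isolation: a configuration command whose type has either of the two top bits set (mask 0xC000, in host byte order) is refused unless the caller holds CAP_NET_ADMIN. The sketch below is a user-space model only. The `sketch_*` helpers are hypothetical, and the capability test is replaced by a plain boolean argument.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Mask taken from dest_name_check(): the two top bits of tcm_type. */
#define SKETCH_RESTRICTED_MASK 0xC000u

/* tcm_type arrives in network byte order, as in the config header. */
static bool sketch_cmd_needs_admin(uint16_t tcm_type_net)
{
	return (ntohs(tcm_type_net) & SKETCH_RESTRICTED_MASK) != 0;
}

/* Returns 0 if the command may be issued, -EACCES otherwise. */
static int sketch_cfg_permission(uint16_t tcm_type_net, bool has_net_admin)
{
	if (sketch_cmd_needs_admin(tcm_type_net) && !has_net_admin)
		return -EACCES;
	return 0;
}
```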
2014-01-17 09:50:05 +08:00
static int tipc_wait_for_sndmsg ( struct socket * sock , long * timeo_p )
{
struct sock * sk = sock - > sk ;
2014-03-12 11:31:12 -04:00
struct tipc_sock * tsk = tipc_sk ( sk ) ;
2014-01-17 09:50:05 +08:00
DEFINE_WAIT ( wait ) ;
int done ;
do {
int err = sock_error ( sk ) ;
if ( err )
return err ;
if ( sock - > state = = SS_DISCONNECTING )
return - EPIPE ;
if ( ! * timeo_p )
return - EAGAIN ;
if ( signal_pending ( current ) )
return sock_intr_errno ( * timeo_p ) ;
prepare_to_wait ( sk_sleep ( sk ) , & wait , TASK_INTERRUPTIBLE ) ;
2014-06-25 20:41:42 -05:00
done = sk_wait_event ( sk , timeo_p , ! tsk - > link_cong ) ;
2014-01-17 09:50:05 +08:00
finish_wait ( sk_sleep ( sk ) , & wait ) ;
} while ( ! done ) ;
return 0 ;
}
2006-01-02 19:04:38 +01:00
/**
2014-02-18 16:06:46 +08:00
* tipc_sendmsg - send message in connectionless manner
2008-04-15 00:22:02 -07:00
* @ iocb : if NULL , indicates that socket lock is already held
2006-01-02 19:04:38 +01:00
* @ sock : socket structure
* @ m : message to send
2014-06-25 20:41:37 -05:00
* @ dsz : amount of user data to be sent
2007-02-09 23:25:21 +09:00
*
2006-01-02 19:04:38 +01:00
* Message must have a destination specified explicitly .
2007-02-09 23:25:21 +09:00
* Used for SOCK_RDM and SOCK_DGRAM messages ,
2006-01-02 19:04:38 +01:00
* and for ' SYN ' messages on SOCK_SEQPACKET and SOCK_STREAM connections .
* ( Note : ' SYN + ' is prohibited on SOCK_STREAM . )
2007-02-09 23:25:21 +09:00
*
2006-01-02 19:04:38 +01:00
* Returns the number of bytes sent on success , or errno otherwise
*/
2014-02-18 16:06:46 +08:00
static int tipc_sendmsg ( struct kiocb * iocb , struct socket * sock ,
2014-06-25 20:41:37 -05:00
struct msghdr * m , size_t dsz )
2006-01-02 19:04:38 +01:00
{
2014-06-25 20:41:37 -05:00
DECLARE_SOCKADDR ( struct sockaddr_tipc * , dest , m - > msg_name ) ;
2008-04-15 00:22:02 -07:00
struct sock * sk = sock - > sk ;
2014-03-12 11:31:12 -04:00
struct tipc_sock * tsk = tipc_sk ( sk ) ;
2014-08-22 18:09:20 -04:00
struct tipc_msg * mhdr = & tsk - > phdr ;
2014-06-25 20:41:37 -05:00
struct iovec * iov = m - > msg_iov ;
u32 dnode , dport ;
struct sk_buff * buf ;
struct tipc_name_seq * seq = & dest - > addr . nameseq ;
u32 mtu ;
2014-01-17 09:50:05 +08:00
long timeo ;
2014-06-25 20:41:37 -05:00
int rc = - EINVAL ;
2006-01-02 19:04:38 +01:00
if ( unlikely ( ! dest ) )
return - EDESTADDRREQ ;
2014-06-25 20:41:37 -05:00
2006-06-25 23:49:06 -07:00
if ( unlikely ( ( m - > msg_namelen < sizeof ( * dest ) ) | |
( dest - > family ! = AF_TIPC ) ) )
2006-01-02 19:04:38 +01:00
return - EINVAL ;
2014-06-25 20:41:37 -05:00
if ( dsz > TIPC_MAX_USER_MSG_SIZE )
2010-04-20 17:58:24 -04:00
return - EMSGSIZE ;
2006-01-02 19:04:38 +01:00
2008-04-15 00:22:02 -07:00
if ( iocb )
lock_sock ( sk ) ;
2014-06-25 20:41:37 -05:00
if ( unlikely ( sock - > state ! = SS_READY ) ) {
2008-04-15 00:22:02 -07:00
if ( sock - > state = = SS_LISTENING ) {
2014-06-25 20:41:37 -05:00
rc = - EPIPE ;
2008-04-15 00:22:02 -07:00
goto exit ;
}
if ( sock - > state ! = SS_UNCONNECTED ) {
2014-06-25 20:41:37 -05:00
rc = - EISCONN ;
2008-04-15 00:22:02 -07:00
goto exit ;
}
2014-08-22 18:09:20 -04:00
if ( tsk - > published ) {
2014-06-25 20:41:37 -05:00
rc = - EOPNOTSUPP ;
2008-04-15 00:22:02 -07:00
goto exit ;
}
2006-06-25 23:44:57 -07:00
if ( dest - > addrtype = = TIPC_ADDR_NAME ) {
2014-08-22 18:09:20 -04:00
tsk - > conn_type = dest - > addr . name . name . type ;
tsk - > conn_instance = dest - > addr . name . name . instance ;
2006-06-25 23:44:57 -07:00
}
2006-01-02 19:04:38 +01:00
}
2014-06-25 20:41:37 -05:00
rc = dest_name_check ( dest , m ) ;
if ( rc )
goto exit ;
2006-01-02 19:04:38 +01:00
2014-01-17 09:50:05 +08:00
timeo = sock_sndtimeo ( sk , m - > msg_flags & MSG_DONTWAIT ) ;
2014-06-25 20:41:37 -05:00
if ( dest - > addrtype = = TIPC_ADDR_MCAST ) {
rc = tipc_sendmcast ( sock , seq , iov , dsz , timeo ) ;
goto exit ;
} else if ( dest - > addrtype = = TIPC_ADDR_NAME ) {
u32 type = dest - > addr . name . name . type ;
u32 inst = dest - > addr . name . name . instance ;
u32 domain = dest - > addr . name . domain ;
dnode = domain ;
msg_set_type ( mhdr , TIPC_NAMED_MSG ) ;
msg_set_hdr_sz ( mhdr , NAMED_H_SIZE ) ;
msg_set_nametype ( mhdr , type ) ;
msg_set_nameinst ( mhdr , inst ) ;
msg_set_lookup_scope ( mhdr , tipc_addr_scope ( domain ) ) ;
dport = tipc_nametbl_translate ( type , inst , & dnode ) ;
msg_set_destnode ( mhdr , dnode ) ;
msg_set_destport ( mhdr , dport ) ;
if ( unlikely ( ! dport & & ! dnode ) ) {
rc = - EHOSTUNREACH ;
goto exit ;
2007-02-09 23:25:21 +09:00
}
2014-06-25 20:41:37 -05:00
} else if ( dest - > addrtype = = TIPC_ADDR_ID ) {
dnode = dest - > addr . id . node ;
msg_set_type ( mhdr , TIPC_DIRECT_MSG ) ;
msg_set_lookup_scope ( mhdr , 0 ) ;
msg_set_destnode ( mhdr , dnode ) ;
msg_set_destport ( mhdr , dest - > addr . id . ref ) ;
msg_set_hdr_sz ( mhdr , BASIC_H_SIZE ) ;
}
new_mtu :
2014-08-22 18:09:20 -04:00
mtu = tipc_node_get_mtu ( dnode , tsk - > ref ) ;
2014-07-16 20:41:03 -04:00
rc = tipc_msg_build ( mhdr , iov , 0 , dsz , mtu , & buf ) ;
2014-06-25 20:41:37 -05:00
if ( rc < 0 )
goto exit ;
do {
2014-08-22 18:09:07 -04:00
TIPC_SKB_CB ( buf ) - > wakeup_pending = tsk - > link_cong ;
2014-08-22 18:09:20 -04:00
rc = tipc_link_xmit ( buf , dnode , tsk - > ref ) ;
2014-06-25 20:41:37 -05:00
if ( likely ( rc > = 0 ) ) {
if ( sock - > state ! = SS_READY )
2008-04-15 00:22:02 -07:00
sock - > state = SS_CONNECTING ;
2014-06-25 20:41:37 -05:00
rc = dsz ;
2008-04-15 00:22:02 -07:00
break ;
2007-02-09 23:25:21 +09:00
}
2014-06-25 20:41:37 -05:00
if ( rc = = - EMSGSIZE )
goto new_mtu ;
if ( rc ! = - ELINKCONG )
2008-04-15 00:22:02 -07:00
break ;
2014-08-22 18:09:07 -04:00
tsk - > link_cong = 1 ;
2014-06-25 20:41:37 -05:00
rc = tipc_wait_for_sndmsg ( sock , & timeo ) ;
2014-07-06 20:38:50 -04:00
if ( rc )
kfree_skb_list ( buf ) ;
2014-06-25 20:41:37 -05:00
} while ( ! rc ) ;
2008-04-15 00:22:02 -07:00
exit :
if ( iocb )
release_sock ( sk ) ;
2014-06-25 20:41:37 -05:00
return rc ;
2006-01-02 19:04:38 +01:00
}
2014-01-17 09:50:06 +08:00
static int tipc_wait_for_sndpkt ( struct socket * sock , long * timeo_p )
{
struct sock * sk = sock - > sk ;
2014-03-12 11:31:12 -04:00
struct tipc_sock * tsk = tipc_sk ( sk ) ;
2014-01-17 09:50:06 +08:00
DEFINE_WAIT ( wait ) ;
int done ;
do {
int err = sock_error ( sk ) ;
if ( err )
return err ;
if ( sock - > state = = SS_DISCONNECTING )
return - EPIPE ;
else if ( sock - > state ! = SS_CONNECTED )
return - ENOTCONN ;
if ( ! * timeo_p )
return - EAGAIN ;
if ( signal_pending ( current ) )
return sock_intr_errno ( * timeo_p ) ;
prepare_to_wait ( sk_sleep ( sk ) , & wait , TASK_INTERRUPTIBLE ) ;
done = sk_wait_event ( sk , timeo_p ,
2014-06-25 20:41:42 -05:00
( ! tsk - > link_cong & &
2014-08-22 18:09:20 -04:00
! tsk_conn_cong ( tsk ) ) | |
! tsk - > connected ) ;
2014-01-17 09:50:06 +08:00
finish_wait ( sk_sleep ( sk ) , & wait ) ;
} while ( ! done ) ;
return 0 ;
}
2007-02-09 23:25:21 +09:00
/**
2014-06-25 20:41:38 -05:00
* tipc_send_stream - send stream - oriented data
* @ iocb : if NULL , indicates that socket lock is already held
2006-01-02 19:04:38 +01:00
* @ sock : socket structure
2014-06-25 20:41:38 -05:00
* @ m : data to send
* @ dsz : total length of data to be transmitted
2007-02-09 23:25:21 +09:00
*
2014-06-25 20:41:38 -05:00
* Used for SOCK_STREAM data .
2007-02-09 23:25:21 +09:00
*
2014-06-25 20:41:38 -05:00
* Returns the number of bytes sent on success ( or partial success ) ,
* or errno if no data sent
2006-01-02 19:04:38 +01:00
*/
2014-06-25 20:41:38 -05:00
static int tipc_send_stream ( struct kiocb * iocb , struct socket * sock ,
struct msghdr * m , size_t dsz )
2006-01-02 19:04:38 +01:00
{
2008-04-15 00:22:02 -07:00
struct sock * sk = sock - > sk ;
2014-03-12 11:31:12 -04:00
struct tipc_sock * tsk = tipc_sk ( sk ) ;
2014-08-22 18:09:20 -04:00
struct tipc_msg * mhdr = & tsk - > phdr ;
2014-06-25 20:41:38 -05:00
struct sk_buff * buf ;
2014-01-17 22:53:15 +01:00
DECLARE_SOCKADDR ( struct sockaddr_tipc * , dest , m - > msg_name ) ;
2014-08-22 18:09:20 -04:00
u32 ref = tsk - > ref ;
2014-06-25 20:41:38 -05:00
int rc = - EINVAL ;
2014-01-17 09:50:06 +08:00
long timeo ;
2014-06-25 20:41:38 -05:00
u32 dnode ;
uint mtu , send , sent = 0 ;
2006-01-02 19:04:38 +01:00
/* Handle implied connection establishment */
2014-06-25 20:41:38 -05:00
if ( unlikely ( dest ) ) {
rc = tipc_sendmsg ( iocb , sock , m , dsz ) ;
if ( dsz & & ( dsz = = rc ) )
2014-06-25 20:41:42 -05:00
tsk - > sent_unacked = 1 ;
2014-06-25 20:41:38 -05:00
return rc ;
}
if ( dsz > ( uint ) INT_MAX )
2010-04-20 17:58:24 -04:00
return - EMSGSIZE ;
2008-04-15 00:22:02 -07:00
if ( iocb )
lock_sock ( sk ) ;
2006-01-02 19:04:38 +01:00
2014-01-17 09:50:06 +08:00
if ( unlikely ( sock - > state ! = SS_CONNECTED ) ) {
if ( sock - > state = = SS_DISCONNECTING )
2014-06-25 20:41:38 -05:00
rc = - EPIPE ;
2014-01-17 09:50:06 +08:00
else
2014-06-25 20:41:38 -05:00
rc = - ENOTCONN ;
2014-01-17 09:50:06 +08:00
goto exit ;
}
2011-07-06 05:53:15 -04:00
2014-01-17 09:50:06 +08:00
timeo = sock_sndtimeo ( sk , m - > msg_flags & MSG_DONTWAIT ) ;
2014-08-22 18:09:20 -04:00
dnode = tsk_peer_node ( tsk ) ;
2014-06-25 20:41:38 -05:00
next :
2014-08-22 18:09:20 -04:00
mtu = tsk - > max_pkt ;
2014-06-25 20:41:38 -05:00
send = min_t ( uint , dsz - sent , TIPC_MAX_USER_MSG_SIZE ) ;
2014-07-16 20:41:03 -04:00
rc = tipc_msg_build ( mhdr , m - > msg_iov , sent , send , mtu , & buf ) ;
2014-06-25 20:41:38 -05:00
if ( unlikely ( rc < 0 ) )
goto exit ;
2007-02-09 23:25:21 +09:00
do {
2014-08-22 18:09:20 -04:00
if ( likely ( ! tsk_conn_cong ( tsk ) ) ) {
2014-07-16 20:41:03 -04:00
rc = tipc_link_xmit ( buf , dnode , ref ) ;
2014-06-25 20:41:38 -05:00
if ( likely ( ! rc ) ) {
2014-06-25 20:41:42 -05:00
tsk - > sent_unacked + + ;
2014-06-25 20:41:38 -05:00
sent + = send ;
if ( sent = = dsz )
break ;
goto next ;
}
if ( rc = = - EMSGSIZE ) {
2014-08-22 18:09:20 -04:00
tsk - > max_pkt = tipc_node_get_mtu ( dnode , ref ) ;
2014-06-25 20:41:38 -05:00
goto next ;
}
if ( rc ! = - ELINKCONG )
break ;
2014-08-22 18:09:07 -04:00
tsk - > link_cong = 1 ;
2014-06-25 20:41:38 -05:00
}
rc = tipc_wait_for_sndpkt ( sock , & timeo ) ;
2014-07-06 20:38:50 -04:00
if ( rc )
kfree_skb_list ( buf ) ;
2014-06-25 20:41:38 -05:00
} while ( ! rc ) ;
2014-01-17 09:50:06 +08:00
exit :
2008-04-15 00:22:02 -07:00
if ( iocb )
release_sock ( sk ) ;
2014-06-25 20:41:38 -05:00
return sent ? sent : rc ;
2006-01-02 19:04:38 +01:00
}
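tipc_send_stream() splits the user data into messages of at most TIPC_MAX_USER_MSG_SIZE bytes and loops until everything is sent. The arithmetic can be sketched on its own, as below. `sketch_stream_chunks` is a hypothetical helper, and the limit of 66000 bytes is an assumed value for TIPC_MAX_USER_MSG_SIZE.

```c
#include <stddef.h>

/* Assumed value of TIPC_MAX_USER_MSG_SIZE. */
#define SKETCH_MAX_USER_MSG_SIZE 66000u

/*
 * Count the messages needed for dsz bytes and report the size of the
 * last one. Like the kernel loop, at least one (possibly empty) message
 * is built even when dsz is zero.
 */
static size_t sketch_stream_chunks(size_t dsz, size_t *last)
{
	size_t sent = 0, count = 0, send;

	do {
		send = dsz - sent;
		if (send > SKETCH_MAX_USER_MSG_SIZE)
			send = SKETCH_MAX_USER_MSG_SIZE;
		sent += send;
		count++;
	} while (sent < dsz);

	if (last)
		*last = send;
	return count;
}
```

For example, 150000 bytes would go out as two full messages followed by one of 18000 bytes.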
2007-02-09 23:25:21 +09:00
/**
2014-06-25 20:41:38 -05:00
* tipc_send_packet - send a connection - oriented message
* @ iocb : if NULL , indicates that socket lock is already held
2006-01-02 19:04:38 +01:00
* @ sock : socket structure
2014-06-25 20:41:38 -05:00
* @ m : message to send
* @ dsz : length of data to be transmitted
2007-02-09 23:25:21 +09:00
*
2014-06-25 20:41:38 -05:00
* Used for SOCK_SEQPACKET messages .
2007-02-09 23:25:21 +09:00
*
2014-06-25 20:41:38 -05:00
* Returns the number of bytes sent on success , or errno otherwise
2006-01-02 19:04:38 +01:00
*/
2014-06-25 20:41:38 -05:00
static int tipc_send_packet ( struct kiocb * iocb , struct socket * sock ,
struct msghdr * m , size_t dsz )
2006-01-02 19:04:38 +01:00
{
2014-06-25 20:41:38 -05:00
if ( dsz > TIPC_MAX_USER_MSG_SIZE )
return - EMSGSIZE ;
2006-01-02 19:04:38 +01:00
2014-06-25 20:41:38 -05:00
return tipc_send_stream ( iocb , sock , m , dsz ) ;
2006-01-02 19:04:38 +01:00
}
2014-08-22 18:09:11 -04:00
/* tipc_sk_finish_conn - complete the setup of a connection
2006-01-02 19:04:38 +01:00
*/
2014-08-22 18:09:20 -04:00
static void tipc_sk_finish_conn ( struct tipc_sock * tsk , u32 peer_port ,
2014-08-22 18:09:11 -04:00
u32 peer_node )
2006-01-02 19:04:38 +01:00
{
2014-08-22 18:09:20 -04:00
struct tipc_msg * msg = & tsk - > phdr ;
2006-01-02 19:04:38 +01:00
2014-08-22 18:09:11 -04:00
msg_set_destnode ( msg , peer_node ) ;
msg_set_destport ( msg , peer_port ) ;
msg_set_type ( msg , TIPC_CONN_MSG ) ;
msg_set_lookup_scope ( msg , 0 ) ;
msg_set_hdr_sz ( msg , SHORT_H_SIZE ) ;
tipc: introduce non-blocking socket connect
TIPC has so far only supported blocking connect(), meaning that a call
to connect() doesn't return until either the connection is fully
established, or an error occurs. This has proved insufficient for many
users, so we now introduce non-blocking connect(), analogous to how
this is done in TCP and other protocols.
With this feature, if a connection cannot be established instantly,
connect() will return the error code "-EINPROGRESS".
If the user later calls connect() again, he will either have the
return code "-EALREADY" or "-EISCONN", depending on whether the
connection has been established or not.
The user must have explicitly set the socket to be non-blocking
(SOCK_NONBLOCK or O_NONBLOCK, depending on method used), so unless
for some reason they had set this already (the socket would anyway
remain blocking in current TIPC) this change should be completely
backwards compatible.
It is also now possible to call select() or poll() to wait for the
completion of a connection.
An effect of the above is that the actual completion of a connection
may now be performed asynchronously, independent of the calls from
user space. Therefore, we now execute this code in BH context, in
the function filter_rcv(), which is executed upon reception of
messages in the socket.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
[PG: minor refactoring for improved connect/disconnect function names]
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-11-29 18:51:19 -05:00
2014-08-22 18:09:20 -04:00
tsk - > probing_interval = CONN_PROBING_INTERVAL ;
tsk - > probing_state = TIPC_CONN_OK ;
tsk - > connected = 1 ;
k_start_timer ( & tsk - > timer , tsk - > probing_interval ) ;
tipc_node_add_conn ( peer_node , tsk - > ref , peer_port ) ;
tsk - > max_pkt = tipc_node_get_mtu ( peer_node , tsk - > ref ) ;
2006-01-02 19:04:38 +01:00
}
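The commit message quoted inside the function above says that a non-blocking connect() returns -EINPROGRESS and that poll() can then be used to wait for completion. A user-space caller might wait as sketched below. `sketch_wait_connected` is a hypothetical helper, and reading the outcome through SO_ERROR is the generic sockets idiom rather than anything specific to this file.

```c
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>

/*
 * Wait until a connect() started on a non-blocking socket has finished.
 * Returns 0 on success, -ETIMEDOUT if nothing happened within
 * timeout_ms, or a negative errno describing the failure.
 */
static int sketch_wait_connected(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLOUT };
	int err = 0;
	socklen_t len = sizeof(err);
	int n = poll(&pfd, 1, timeout_ms);

	if (n < 0)
		return -errno;
	if (n == 0)
		return -ETIMEDOUT;

	/* A pending socket error (e.g. ECONNREFUSED) is reported here. */
	if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
		return -errno;
	return -err;
}
```

The caller would invoke this only after connect() has failed with EINPROGRESS; any other return value from connect() already gives the final result.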
/**
* set_orig_addr - capture sender ' s address for received message
* @ m : descriptor for message info
* @ msg : received message header
2007-02-09 23:25:21 +09:00
*
2006-01-02 19:04:38 +01:00
* Note : Address is not captured if not requested by receiver .
*/
2006-03-20 22:37:04 -08:00
static void set_orig_addr ( struct msghdr * m , struct tipc_msg * msg )
2006-01-02 19:04:38 +01:00
{
2014-01-17 22:53:15 +01:00
DECLARE_SOCKADDR ( struct sockaddr_tipc * , addr , m - > msg_name ) ;
2006-01-02 19:04:38 +01:00
2007-02-09 23:25:21 +09:00
if ( addr ) {
2006-01-02 19:04:38 +01:00
addr - > family = AF_TIPC ;
addr - > addrtype = TIPC_ADDR_ID ;
2013-04-07 01:52:00 +00:00
memset ( & addr - > addr , 0 , sizeof ( addr - > addr ) ) ;
2006-01-02 19:04:38 +01:00
addr - > addr . id . ref = msg_origport ( msg ) ;
addr - > addr . id . node = msg_orignode ( msg ) ;
2010-12-31 18:59:32 +00:00
addr - > addr . name . domain = 0 ; /* could leave uninitialized */
addr - > scope = 0 ; /* could leave uninitialized */
2006-01-02 19:04:38 +01:00
m - > msg_namelen = sizeof ( struct sockaddr_tipc ) ;
}
}
/**
2014-08-22 18:09:20 -04:00
* tipc_sk_anc_data_recv - optionally capture ancillary data for received message
2006-01-02 19:04:38 +01:00
* @ m : descriptor for message info
* @ msg : received message header
2014-08-22 18:09:20 -04:00
* @ tsk : TIPC port associated with message
2007-02-09 23:25:21 +09:00
*
2006-01-02 19:04:38 +01:00
* Note : Ancillary data is not captured if not requested by receiver .
2007-02-09 23:25:21 +09:00
*
2006-01-02 19:04:38 +01:00
* Returns 0 if successful , otherwise errno
*/
2014-08-22 18:09:20 -04:00
static int tipc_sk_anc_data_recv ( struct msghdr * m , struct tipc_msg * msg ,
struct tipc_sock * tsk )
2006-01-02 19:04:38 +01:00
{
u32 anc_data [ 3 ] ;
u32 err ;
u32 dest_type ;
2006-06-25 23:45:24 -07:00
int has_name ;
2006-01-02 19:04:38 +01:00
int res ;
if ( likely ( m - > msg_controllen = = 0 ) )
return 0 ;
/* Optionally capture errored message object(s) */
err = msg ? msg_errcode ( msg ) : 0 ;
if ( unlikely ( err ) ) {
anc_data [ 0 ] = err ;
anc_data [ 1 ] = msg_data_sz ( msg ) ;
2010-12-31 18:59:33 +00:00
res = put_cmsg ( m , SOL_TIPC , TIPC_ERRINFO , 8 , anc_data ) ;
if ( res )
2006-01-02 19:04:38 +01:00
return res ;
2010-12-31 18:59:33 +00:00
if ( anc_data [ 1 ] ) {
res = put_cmsg ( m , SOL_TIPC , TIPC_RETDATA , anc_data [ 1 ] ,
msg_data ( msg ) ) ;
if ( res )
return res ;
}
2006-01-02 19:04:38 +01:00
}
/* Optionally capture message destination object */
dest_type = msg ? msg_type ( msg ) : TIPC_DIRECT_MSG ;
switch ( dest_type ) {
case TIPC_NAMED_MSG :
2006-06-25 23:45:24 -07:00
has_name = 1 ;
2006-01-02 19:04:38 +01:00
anc_data [ 0 ] = msg_nametype ( msg ) ;
anc_data [ 1 ] = msg_namelower ( msg ) ;
anc_data [ 2 ] = msg_namelower ( msg ) ;
break ;
case TIPC_MCAST_MSG :
2006-06-25 23:45:24 -07:00
has_name = 1 ;
2006-01-02 19:04:38 +01:00
anc_data [ 0 ] = msg_nametype ( msg ) ;
anc_data [ 1 ] = msg_namelower ( msg ) ;
anc_data [ 2 ] = msg_nameupper ( msg ) ;
break ;
case TIPC_CONN_MSG :
2014-08-22 18:09:20 -04:00
has_name = ( tsk - > conn_type ! = 0 ) ;
anc_data [ 0 ] = tsk - > conn_type ;
anc_data [ 1 ] = tsk - > conn_instance ;
anc_data [ 2 ] = tsk - > conn_instance ;
2006-01-02 19:04:38 +01:00
break ;
default :
2006-06-25 23:45:24 -07:00
has_name = 0 ;
2006-01-02 19:04:38 +01:00
}
2010-12-31 18:59:33 +00:00
if ( has_name ) {
res = put_cmsg ( m , SOL_TIPC , TIPC_DESTNAME , 12 , anc_data ) ;
if ( res )
return res ;
}
2006-01-02 19:04:38 +01:00
return 0 ;
}
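On the receiving side, the TIPC_DESTNAME object written by put_cmsg() above arrives as a control message carrying three 32-bit values (type, lower, upper). A user-space reader could extract it roughly as follows. `sketch_find_destname` and `struct sketch_destname` are hypothetical, and the fallback values for SOL_TIPC and TIPC_DESTNAME are assumptions used only when the system headers do not provide them.

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Fallbacks, assumed to match the kernel's definitions. */
#ifndef SOL_TIPC
#define SOL_TIPC 271
#endif
#ifndef TIPC_DESTNAME
#define TIPC_DESTNAME 3
#endif

struct sketch_destname {
	uint32_t type;
	uint32_t lower;
	uint32_t upper;
};

/* Returns 0 and fills *out if a TIPC_DESTNAME object is present, else -1. */
static int sketch_find_destname(struct msghdr *m, struct sketch_destname *out)
{
	struct cmsghdr *c;

	for (c = CMSG_FIRSTHDR(m); c; c = CMSG_NXTHDR(m, c)) {
		if (c->cmsg_level != SOL_TIPC || c->cmsg_type != TIPC_DESTNAME)
			continue;
		if (c->cmsg_len < CMSG_LEN(sizeof(*out)))
			return -1;	/* truncated object */
		memcpy(out, CMSG_DATA(c), sizeof(*out));
		return 0;
	}
	return -1;
}
```

The function is meant to be called on the msghdr returned by recvmsg(), after the caller has supplied a control buffer large enough for the object.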
2014-08-22 18:09:20 -04:00
static void tipc_sk_send_ack ( struct tipc_sock * tsk , uint ack )
2014-08-22 18:09:12 -04:00
{
struct sk_buff * buf = NULL ;
struct tipc_msg * msg ;
2014-08-22 18:09:20 -04:00
u32 peer_port = tsk_peer_port ( tsk ) ;
u32 dnode = tsk_peer_node ( tsk ) ;
2014-08-22 18:09:12 -04:00
2014-08-22 18:09:20 -04:00
if ( ! tsk - > connected )
2014-08-22 18:09:12 -04:00
return ;
buf = tipc_msg_create ( CONN_MANAGER , CONN_ACK , INT_H_SIZE , 0 , dnode ,
2014-08-22 18:09:20 -04:00
tipc_own_addr , peer_port , tsk - > ref , TIPC_OK ) ;
2014-08-22 18:09:12 -04:00
if ( ! buf )
return ;
msg = buf_msg ( buf ) ;
msg_set_msgcnt ( msg , ack ) ;
tipc_link_xmit ( buf , dnode , msg_link_selector ( msg ) ) ;
}
2014-05-23 15:55:12 -04:00
static int tipc_wait_for_rcvmsg ( struct socket * sock , long * timeop )
2014-01-17 09:50:07 +08:00
{
struct sock * sk = sock - > sk ;
DEFINE_WAIT ( wait ) ;
2014-05-23 15:55:12 -04:00
long timeo = * timeop ;
2014-01-17 09:50:07 +08:00
int err ;
for ( ; ; ) {
prepare_to_wait ( sk_sleep ( sk ) , & wait , TASK_INTERRUPTIBLE ) ;
2014-03-06 14:40:18 +01:00
if ( timeo & & skb_queue_empty ( & sk - > sk_receive_queue ) ) {
2014-01-17 09:50:07 +08:00
if ( sock - > state = = SS_DISCONNECTING ) {
err = - ENOTCONN ;
break ;
}
release_sock ( sk ) ;
timeo = schedule_timeout ( timeo ) ;
lock_sock ( sk ) ;
}
err = 0 ;
if ( ! skb_queue_empty ( & sk - > sk_receive_queue ) )
break ;
err = sock_intr_errno ( timeo ) ;
if ( signal_pending ( current ) )
break ;
err = - EAGAIN ;
if ( ! timeo )
break ;
}
finish_wait ( sk_sleep ( sk ) , & wait ) ;
2014-05-23 15:55:12 -04:00
* timeop = timeo ;
2014-01-17 09:50:07 +08:00
return err ;
}
2007-02-09 23:25:21 +09:00
/**
2014-02-18 16:06:46 +08:00
* tipc_recvmsg - receive packet - oriented message
2006-01-02 19:04:38 +01:00
* @ iocb : ( unused )
* @ sock : socket structure
* @ m : descriptor for message info
* @ buf_len : total size of user buffer area
* @ flags : receive flags
2007-02-09 23:25:21 +09:00
*
2006-01-02 19:04:38 +01:00
* Used for SOCK_DGRAM , SOCK_RDM , and SOCK_SEQPACKET messages .
* If the complete message doesn ' t fit in user area , truncate it .
*
* Returns size of returned message data , errno otherwise
*/
2014-02-18 16:06:46 +08:00
static int tipc_recvmsg ( struct kiocb * iocb , struct socket * sock ,
struct msghdr * m , size_t buf_len , int flags )
2006-01-02 19:04:38 +01:00
{
2008-04-15 00:22:02 -07:00
struct sock * sk = sock - > sk ;
2014-03-12 11:31:12 -04:00
struct tipc_sock * tsk = tipc_sk ( sk ) ;
2006-01-02 19:04:38 +01:00
struct sk_buff * buf ;
struct tipc_msg * msg ;
2014-01-17 09:50:07 +08:00
long timeo ;
2006-01-02 19:04:38 +01:00
unsigned int sz ;
u32 err ;
int res ;
2008-04-15 00:22:02 -07:00
/* Catch invalid receive requests */
2006-01-02 19:04:38 +01:00
if ( unlikely ( ! buf_len ) )
return - EINVAL ;
2008-04-15 00:22:02 -07:00
lock_sock ( sk ) ;
2006-01-02 19:04:38 +01:00
2008-04-15 00:22:02 -07:00
if ( unlikely ( sock - > state = = SS_UNCONNECTED ) ) {
res = - ENOTCONN ;
2006-01-02 19:04:38 +01:00
goto exit ;
}
2014-01-17 09:50:07 +08:00
timeo = sock_rcvtimeo ( sk , flags & MSG_DONTWAIT ) ;
2008-04-15 00:22:02 -07:00
restart :
2006-01-02 19:04:38 +01:00
2008-04-15 00:22:02 -07:00
/* Look for a message in receive queue; wait if necessary */
2014-05-23 15:55:12 -04:00
res = tipc_wait_for_rcvmsg ( sock , & timeo ) ;
2014-01-17 09:50:07 +08:00
if ( res )
goto exit ;
2006-01-02 19:04:38 +01:00
2008-04-15 00:22:02 -07:00
/* Look at first message in receive queue */
buf = skb_peek ( & sk - > sk_receive_queue ) ;
2006-01-02 19:04:38 +01:00
msg = buf_msg ( buf ) ;
sz = msg_data_sz ( msg ) ;
err = msg_errcode ( msg ) ;
/* Discard an empty non-errored message & try again */
if ( ( ! sz ) & & ( ! err ) ) {
2014-08-22 18:09:18 -04:00
tsk_advance_rx_queue ( sk ) ;
2006-01-02 19:04:38 +01:00
goto restart ;
}
/* Capture sender's address (optional) */
set_orig_addr ( m , msg ) ;
/* Capture ancillary data (optional) */
2014-08-22 18:09:20 -04:00
res = tipc_sk_anc_data_recv ( m , msg , tsk ) ;
2008-04-15 00:22:02 -07:00
if ( res )
2006-01-02 19:04:38 +01:00
goto exit ;
/* Capture message data (if valid) & compute return value (always) */
if ( ! err ) {
if ( unlikely ( buf_len < sz ) ) {
sz = buf_len ;
m - > msg_flags | = MSG_TRUNC ;
}
2011-02-21 09:45:40 -05:00
res = skb_copy_datagram_iovec ( buf , msg_hdr_sz ( msg ) ,
m - > msg_iov , sz ) ;
if ( res )
2006-01-02 19:04:38 +01:00
goto exit ;
res = sz ;
} else {
if ( ( sock - > state = = SS_READY ) | |
( ( err = = TIPC_CONN_SHUTDOWN ) | | m - > msg_control ) )
res = 0 ;
else
res = - ECONNRESET ;
}
/* Consume received message (optional) */
if ( likely ( ! ( flags & MSG_PEEK ) ) ) {
2008-04-15 00:06:12 -07:00
if ( ( sock - > state ! = SS_READY ) & &
2014-06-25 20:41:42 -05:00
( + + tsk - > rcv_unacked > = TIPC_CONNACK_INTV ) ) {
2014-08-22 18:09:20 -04:00
tipc_sk_send_ack ( tsk , tsk - > rcv_unacked ) ;
2014-06-25 20:41:42 -05:00
tsk - > rcv_unacked = 0 ;
}
2014-08-22 18:09:18 -04:00
tsk_advance_rx_queue ( sk ) ;
2007-02-09 23:25:21 +09:00
}
2006-01-02 19:04:38 +01:00
exit :
2008-04-15 00:22:02 -07:00
release_sock ( sk ) ;
2006-01-02 19:04:38 +01:00
return res ;
}
2007-02-09 23:25:21 +09:00
/**
2014-02-18 16:06:46 +08:00
* tipc_recv_stream - receive stream - oriented data
2006-01-02 19:04:38 +01:00
* @ iocb : ( unused )
* @ sock : socket structure
* @ m : descriptor for message info
* @ buf_len : total size of user buffer area
* @ flags : receive flags
2007-02-09 23:25:21 +09:00
*
* Used for SOCK_STREAM messages only . If not enough data is available ,
2006-01-02 19:04:38 +01:00
* the call optionally waits for more ; data is never truncated .
*
* Returns size of returned message data , errno otherwise
*/
2014-02-18 16:06:46 +08:00
static int tipc_recv_stream ( struct kiocb * iocb , struct socket * sock ,
struct msghdr * m , size_t buf_len , int flags )
2006-01-02 19:04:38 +01:00
{
2008-04-15 00:22:02 -07:00
struct sock * sk = sock - > sk ;
2014-03-12 11:31:12 -04:00
struct tipc_sock * tsk = tipc_sk ( sk ) ;
2006-01-02 19:04:38 +01:00
struct sk_buff * buf ;
struct tipc_msg * msg ;
2014-01-17 09:50:07 +08:00
long timeo ;
2006-01-02 19:04:38 +01:00
unsigned int sz ;
2010-08-17 11:00:04 +00:00
int sz_to_copy , target , needed ;
2006-01-02 19:04:38 +01:00
int sz_copied = 0 ;
u32 err ;
2008-04-15 00:22:02 -07:00
int res = 0 ;
2006-01-02 19:04:38 +01:00
2008-04-15 00:22:02 -07:00
/* Catch invalid receive attempts */
2006-01-02 19:04:38 +01:00
if ( unlikely ( ! buf_len ) )
return - EINVAL ;
2008-04-15 00:22:02 -07:00
lock_sock ( sk ) ;
2006-01-02 19:04:38 +01:00
2014-01-17 09:50:07 +08:00
if ( unlikely ( sock - > state = = SS_UNCONNECTED ) ) {
2008-04-15 00:22:02 -07:00
res = - ENOTCONN ;
2006-01-02 19:04:38 +01:00
goto exit ;
}
2010-08-17 11:00:04 +00:00
target = sock_rcvlowat ( sk , flags & MSG_WAITALL , buf_len ) ;
2014-01-17 09:50:07 +08:00
timeo = sock_rcvtimeo ( sk , flags & MSG_DONTWAIT ) ;
2006-01-02 19:04:38 +01:00
2012-04-30 15:29:02 -04:00
restart :
2008-04-15 00:22:02 -07:00
/* Look for a message in receive queue; wait if necessary */
2014-05-23 15:55:12 -04:00
res = tipc_wait_for_rcvmsg ( sock , & timeo ) ;
2014-01-17 09:50:07 +08:00
if ( res )
goto exit ;
2006-01-02 19:04:38 +01:00
2008-04-15 00:22:02 -07:00
/* Look at first message in receive queue */
buf = skb_peek ( & sk - > sk_receive_queue ) ;
2006-01-02 19:04:38 +01:00
msg = buf_msg ( buf ) ;
sz = msg_data_sz ( msg ) ;
err = msg_errcode ( msg ) ;
/* Discard an empty non-errored message & try again */
if ( ( ! sz ) & & ( ! err ) ) {
2014-08-22 18:09:18 -04:00
tsk_advance_rx_queue ( sk ) ;
2006-01-02 19:04:38 +01:00
goto restart ;
}
/* Optionally capture sender's address & ancillary data of first msg */
if ( sz_copied = = 0 ) {
set_orig_addr ( m , msg ) ;
2014-08-22 18:09:20 -04:00
res = tipc_sk_anc_data_recv ( m , msg , tsk ) ;
2008-04-15 00:22:02 -07:00
if ( res )
2006-01-02 19:04:38 +01:00
goto exit ;
}
/* Capture message data (if valid) & compute return value (always) */
if ( ! err ) {
2011-02-21 09:45:40 -05:00
u32 offset = ( u32 ) ( unsigned long ) ( TIPC_SKB_CB ( buf ) - > handle ) ;
2006-01-02 19:04:38 +01:00
2011-02-21 09:45:40 -05:00
sz - = offset ;
2006-01-02 19:04:38 +01:00
needed = ( buf_len - sz_copied ) ;
sz_to_copy = ( sz < = needed ) ? sz : needed ;
2011-02-21 09:45:40 -05:00
res = skb_copy_datagram_iovec ( buf , msg_hdr_sz ( msg ) + offset ,
m - > msg_iov , sz_to_copy ) ;
if ( res )
2006-01-02 19:04:38 +01:00
goto exit ;
2011-02-21 09:45:40 -05:00
2006-01-02 19:04:38 +01:00
sz_copied + = sz_to_copy ;
if ( sz_to_copy < sz ) {
if ( ! ( flags & MSG_PEEK ) )
2011-02-21 09:45:40 -05:00
TIPC_SKB_CB ( buf ) - > handle =
( void * ) ( unsigned long ) ( offset + sz_to_copy ) ;
2006-01-02 19:04:38 +01:00
goto exit ;
}
} else {
if ( sz_copied ! = 0 )
goto exit ; /* can't add error msg to valid data */
if ( ( err = = TIPC_CONN_SHUTDOWN ) | | m - > msg_control )
res = 0 ;
else
res = - ECONNRESET ;
}
/* Consume received message (optional) */
if ( likely ( ! ( flags & MSG_PEEK ) ) ) {
2014-06-25 20:41:42 -05:00
if ( unlikely ( + + tsk - > rcv_unacked > = TIPC_CONNACK_INTV ) ) {
2014-08-22 18:09:20 -04:00
tipc_sk_send_ack ( tsk , tsk - > rcv_unacked ) ;
2014-06-25 20:41:42 -05:00
tsk - > rcv_unacked = 0 ;
}
2014-08-22 18:09:18 -04:00
tsk_advance_rx_queue ( sk ) ;
2007-02-09 23:25:21 +09:00
}
2006-01-02 19:04:38 +01:00
/* Loop around if more data is required */
2009-11-29 16:55:45 -08:00
if ( ( sz_copied < buf_len ) & & /* didn't get all requested data */
( ! skb_queue_empty ( & sk - > sk_receive_queue ) | |
2010-08-17 11:00:04 +00:00
( sz_copied < target ) ) & & /* and more is ready or required */
2009-11-29 16:55:45 -08:00
( ! ( flags & MSG_PEEK ) ) & & /* and aren't just peeking at data */
( ! err ) ) /* and haven't reached a FIN */
2006-01-02 19:04:38 +01:00
goto restart ;
exit :
2008-04-15 00:22:02 -07:00
release_sock ( sk ) ;
2006-06-25 23:48:22 -07:00
return sz_copied ? sz_copied : res ;
2006-01-02 19:04:38 +01:00
}
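tipc_recv_stream() never truncates: when the user buffer fills up in the middle of a message, the number of bytes already delivered is stored as an offset on the buffer and the message stays queued for the next call. The sketch below models that with a plain array in place of the socket receive queue. All `sketch_*` names are hypothetical, and error messages, MSG_PEEK and waiting are left out.

```c
#include <stddef.h>
#include <string.h>

struct sketch_msg {
	const char *data;
	size_t len;
	size_t offset;	/* bytes already handed to the reader */
};

/*
 * Copy up to buf_len bytes from the queue q[*head .. n-1] into buf.
 * A partially read message stays at the head with its offset advanced.
 */
static size_t sketch_stream_read(struct sketch_msg *q, size_t n, size_t *head,
				 char *buf, size_t buf_len)
{
	size_t copied = 0;

	while (*head < n && copied < buf_len) {
		struct sketch_msg *m = &q[*head];
		size_t avail = m->len - m->offset;
		size_t needed = buf_len - copied;
		size_t take = avail <= needed ? avail : needed;

		memcpy(buf + copied, m->data + m->offset, take);
		copied += take;
		if (take < avail) {
			m->offset += take;	/* message stays queued */
			break;
		}
		(*head)++;			/* fully consumed */
	}
	return copied;
}
```

Reading 7 bytes from two queued 5-byte messages consumes the first one and leaves the second with an offset of 2, so the next read picks up the remaining 3 bytes.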
2012-08-21 11:16:57 +08:00
/**
* tipc_write_space - wake up thread if port congestion is released
* @ sk : socket
*/
static void tipc_write_space ( struct sock * sk )
{
struct socket_wq * wq ;
rcu_read_lock ( ) ;
wq = rcu_dereference ( sk - > sk_wq ) ;
if ( wq_has_sleeper ( wq ) )
wake_up_interruptible_sync_poll ( & wq - > wait , POLLOUT |
POLLWRNORM | POLLWRBAND ) ;
rcu_read_unlock ( ) ;
}
/**
* tipc_data_ready - wake up threads to indicate messages have been received
* @ sk : socket
*/
2014-04-11 16:15:36 -04:00
static void tipc_data_ready ( struct sock * sk )
2012-08-21 11:16:57 +08:00
{
struct socket_wq * wq ;
rcu_read_lock ( ) ;
wq = rcu_dereference ( sk - > sk_wq ) ;
if ( wq_has_sleeper ( wq ) )
wake_up_interruptible_sync_poll ( & wq - > wait , POLLIN |
POLLRDNORM | POLLRDBAND ) ;
rcu_read_unlock ( ) ;
}
2012-11-29 18:39:14 -05:00
/**
* filter_connect - Handle all incoming messages for a connection - based socket
2014-03-12 11:31:12 -04:00
* @ tsk : TIPC socket
2012-11-29 18:39:14 -05:00
* @ buf : pointer to the message buffer
*
2014-06-25 20:41:31 -05:00
* Returns 0 ( TIPC_OK ) if everything is ok , - TIPC_ERR_NO_PORT otherwise
2012-11-29 18:39:14 -05:00
*/
2014-06-25 20:41:31 -05:00
static int filter_connect ( struct tipc_sock * tsk , struct sk_buff * * buf )
2012-11-29 18:39:14 -05:00
{
2014-03-12 11:31:12 -04:00
struct sock * sk = & tsk - > sk ;
2014-03-12 11:31:09 -04:00
struct socket * sock = sk - > sk_socket ;
2012-11-29 18:39:14 -05:00
struct tipc_msg * msg = buf_msg ( * buf ) ;
2014-06-25 20:41:31 -05:00
int retval = - TIPC_ERR_NO_PORT ;
2012-11-29 18:39:14 -05:00
if ( msg_mcast ( msg ) )
return retval ;
switch ( ( int ) sock - > state ) {
case SS_CONNECTED :
/* Accept only connection-based messages sent by peer */
2014-08-22 18:09:18 -04:00
if ( tsk_peer_msg ( tsk , msg ) ) {
2012-11-29 18:39:14 -05:00
if ( unlikely ( msg_errcode ( msg ) ) ) {
sock - > state = SS_DISCONNECTING ;
2014-08-22 18:09:20 -04:00
tsk - > connected = 0 ;
2014-08-22 18:09:11 -04:00
/* let timer expire on its own */
2014-08-22 18:09:20 -04:00
tipc_node_remove_conn ( tsk_peer_node ( tsk ) ,
tsk - > ref ) ;
2012-11-29 18:39:14 -05:00
}
retval = TIPC_OK ;
}
break ;
case SS_CONNECTING :
/* Accept only ACK or NACK message */
2014-08-22 18:09:11 -04:00
if ( unlikely ( ! msg_connected ( msg ) ) )
break ;
2012-11-29 18:51:19 -05:00
if ( unlikely ( msg_errcode ( msg ) ) ) {
sock - > state = SS_DISCONNECTING ;
tipc: set sk_err correctly when connection fails
Should a connect fail, if the publication/server is unavailable or
due to some other error, a positive value will be returned and errno
is never set. If the application code checks for an explicit zero
return from connect (success) or a negative return (failure), it
will not catch the error and subsequent send() calls will fail as
shown from the strace snippet below.
socket(0x1e /* PF_??? */, SOCK_SEQPACKET, 0) = 3
connect(3, {sa_family=0x1e /* AF_??? */, sa_data="\2\1\322\4\0\0\322\4\0\0\0\0\0\0"}, 16) = 111
sendto(3, "test", 4, 0, NULL, 0) = -1 EPIPE (Broken pipe)
The reason for this behaviour is that TIPC wrongly inverts error
codes set in sk_err.
Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-08-28 09:29:58 +02:00
			sk->sk_err = ECONNREFUSED;
2012-11-29 18:51:19 -05:00
			retval = TIPC_OK;
			break;
		}
2014-08-22 18:09:11 -04:00
		if (unlikely(msg_importance(msg) > TIPC_CRITICAL_IMPORTANCE)) {
2012-11-29 18:51:19 -05:00
			sock->state = SS_DISCONNECTING;
2014-08-22 18:09:11 -04:00
			sk->sk_err = EINVAL;
2012-11-29 18:39:14 -05:00
			retval = TIPC_OK;
2012-11-29 18:51:19 -05:00
			break;
		}
2014-08-22 18:09:20 -04:00
		tipc_sk_finish_conn(tsk, msg_origport(msg), msg_orignode(msg));
		msg_set_importance(&tsk->phdr, msg_importance(msg));
2014-08-22 18:09:11 -04:00
		sock->state = SS_CONNECTED;
2012-11-29 18:51:19 -05:00
		/* If an incoming message is an 'ACK-', it should be
		 * discarded here because it doesn't contain useful
		 * data. In addition, we should try to wake up
		 * connect() routine if sleeping.
		 */
		if (msg_data_sz(msg) == 0) {
			kfree_skb(*buf);
			*buf = NULL;
			if (waitqueue_active(sk_sleep(sk)))
				wake_up_interruptible(sk_sleep(sk));
		}
		retval = TIPC_OK;
2012-11-29 18:39:14 -05:00
		break;
	case SS_LISTENING:
	case SS_UNCONNECTED:
		/* Accept only SYN message */
		if (!msg_connected(msg) && !(msg_errcode(msg)))
			retval = TIPC_OK;
		break;
	case SS_DISCONNECTING:
		break;
	default:
		pr_err("Unknown socket state %u\n", sock->state);
	}
	return retval;
}
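From user space, the non-blocking connect() described in the commit message above follows the same pattern as for TCP. The sketch below is a generic illustration, not TIPC-specific code: `wait_for_connect()` is a hypothetical helper name, and it assumes the caller has already called connect() on a non-blocking socket and received -1 with errno set to EINPROGRESS. It waits with poll() and then reads SO_ERROR, which is where a value stored in sk_err would be expected to surface.

```c
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: wait until a pending non-blocking connect()
 * on 'fd' has completed. Returns 0 once connected, or -1 with errno
 * set (ETIMEDOUT on timeout, otherwise the pending socket error).
 */
static int wait_for_connect(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLOUT };
	int err = 0;
	socklen_t len = sizeof(err);
	int n = poll(&pfd, 1, timeout_ms);

	if (n < 0)
		return -1;		/* poll() itself failed, errno is set */
	if (n == 0) {
		errno = ETIMEDOUT;	/* nothing happened within the timeout */
		return -1;
	}
	/* Fetch and clear the pending error recorded on the socket */
	if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
		return -1;
	if (err) {
		errno = err;		/* e.g. ECONNREFUSED */
		return -1;
	}
	if (!(pfd.revents & POLLOUT)) {
		errno = ENOTCONN;	/* hung up without becoming writable */
		return -1;
	}
	return 0;
}
```

A caller would typically invoke this only after connect() has failed with EINPROGRESS; any other failure from connect() should be handled directly.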
2013-01-20 23:30:09 +01:00
/**
* rcvbuf_limit - get proper overload limit of socket receive queue
* @sk: socket
* @buf: message
*
* For all connection-oriented messages, irrespective of importance,
* the default overload value (i.e. 67 MB) is set as limit.
*
* For all connectionless messages, the default queue limits are
* as follows:
*
2013-06-17 10:54:37 -04:00
* TIPC_LOW_IMPORTANCE (4 MB)
* TIPC_MEDIUM_IMPORTANCE (8 MB)
* TIPC_HIGH_IMPORTANCE (16 MB)
* TIPC_CRITICAL_IMPORTANCE (32 MB)
2013-01-20 23:30:09 +01:00
*
* Returns overload limit according to corresponding message importance
*/
static unsigned int rcvbuf_limit(struct sock *sk, struct sk_buff *buf)
{
	struct tipc_msg *msg = buf_msg(buf);
	if (msg_connected(msg))
2013-12-12 09:36:39 +08:00
		return sysctl_tipc_rmem[2];
	return sk->sk_rcvbuf >> TIPC_CRITICAL_IMPORTANCE <<
		msg_importance(msg);
2013-01-20 23:30:09 +01:00
}
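The shift expression in rcvbuf_limit() can be checked against the table in its comment with a small stand-alone sketch. The names below are hypothetical stand-ins, the importance values 0 to 3 mirror the TIPC_*_IMPORTANCE constants, and the 32 MB base is an assumption inferred from the table (it is the value for which the critical level yields 32 MB).

```c
/* Stand-ins for the TIPC importance levels (0 = low ... 3 = critical) */
enum {
	IMP_LOW = 0,
	IMP_MEDIUM = 1,
	IMP_HIGH = 2,
	IMP_CRITICAL = 3
};

/*
 * Same arithmetic as the connectionless branch of rcvbuf_limit():
 * scale the receive buffer down to the low-importance share, then
 * double it once per importance step. Shifts associate left to right,
 * so this is (rcvbuf >> IMP_CRITICAL) << importance.
 */
static unsigned int connless_rcvbuf_limit(unsigned int rcvbuf,
					  unsigned int importance)
{
	return rcvbuf >> IMP_CRITICAL << importance;
}
```

With an assumed base of 32 MB (`32u << 20`), the four levels evaluate to 4, 8, 16 and 32 MB, matching the comment.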
2007-02-09 23:25:21 +09:00
/**
2008-04-15 00:22:02 -07:00
* filter_rcv - validate incoming message
* @sk: socket
2006-01-02 19:04:38 +01:00
* @buf: message
2007-02-09 23:25:21 +09:00
*
2008-04-15 00:22:02 -07:00
* Enqueues message on receive queue if acceptable; optionally handles
* disconnect indication for a connected socket.
*
* Called with socket lock already taken; port lock may also be taken.
2007-02-09 23:25:21 +09:00
*
2014-06-25 20:41:31 -05:00
* Returns 0 (TIPC_OK) if message was consumed, -TIPC error code if message
2014-06-25 20:41:41 -05:00
* to be rejected, 1 (TIPC_FWD_MSG) if (CONN_MANAGER) message to be forwarded
2006-01-02 19:04:38 +01:00
*/
2014-06-25 20:41:31 -05:00
static int filter_rcv(struct sock *sk, struct sk_buff *buf)
2006-01-02 19:04:38 +01:00
{
2008-04-15 00:22:02 -07:00
	struct socket *sock = sk->sk_socket;
2014-03-12 11:31:12 -04:00
	struct tipc_sock *tsk = tipc_sk(sk);
2006-01-02 19:04:38 +01:00
	struct tipc_msg *msg = buf_msg(buf);
2013-01-20 23:30:09 +01:00
	unsigned int limit = rcvbuf_limit(sk, buf);
2014-06-25 20:41:41 -05:00
	u32 onode;
2014-06-25 20:41:31 -05:00
	int rc = TIPC_OK;
2006-01-02 19:04:38 +01:00
2014-06-25 20:41:41 -05:00
	if (unlikely(msg_user(msg) == CONN_MANAGER))
		return tipc_sk_proto_rcv(tsk, &onode, buf);
2014-06-25 20:41:40 -05:00
2014-08-22 18:09:07 -04:00
	if (unlikely(msg_user(msg) == SOCK_WAKEUP)) {
		kfree_skb(buf);
		tsk->link_cong = 0;
		sk->sk_write_space(sk);
		return TIPC_OK;
	}
2006-01-02 19:04:38 +01:00
	/* Reject message if it is wrong sort of message for socket */
2012-04-26 18:13:08 -04:00
	if (msg_type(msg) > TIPC_DIRECT_MSG)
2014-06-25 20:41:31 -05:00
		return -TIPC_ERR_NO_PORT;
2008-04-15 00:22:02 -07:00
2006-01-02 19:04:38 +01:00
	if (sock->state == SS_READY) {
2010-12-31 18:59:25 +00:00
		if (msg_connected(msg))
2014-06-25 20:41:31 -05:00
			return -TIPC_ERR_NO_PORT;
2006-01-02 19:04:38 +01:00
	} else {
2014-06-25 20:41:31 -05:00
		rc = filter_connect(tsk, &buf);
		if (rc != TIPC_OK || buf == NULL)
			return rc;
2006-01-02 19:04:38 +01:00
	}
	/* Reject message if there isn't room to queue it */
2013-01-20 23:30:09 +01:00
	if (sk_rmem_alloc_get(sk) + buf->truesize >= limit)
2014-06-25 20:41:31 -05:00
		return -TIPC_ERR_OVERLOAD;
2006-01-02 19:04:38 +01:00
2013-01-20 23:30:09 +01:00
	/* Enqueue message */
2013-10-18 07:23:16 +02:00
	TIPC_SKB_CB(buf)->handle = NULL;
2008-04-15 00:22:02 -07:00
	__skb_queue_tail(&sk->sk_receive_queue, buf);
2013-01-20 23:30:09 +01:00
	skb_set_owner_r(buf, sk);
2008-04-15 00:22:02 -07:00
2014-04-11 16:15:36 -04:00
	sk->sk_data_ready(sk);
2008-04-15 00:22:02 -07:00
	return TIPC_OK;
}
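The return convention documented for filter_rcv() (0 when the message was consumed, a negated TIPC error code when it should be rejected, 1 when it should be forwarded) can be summarised as a small classifier. This is only an illustrative sketch: `enum rcv_action` and `classify_rcv_result()` are hypothetical names that do not exist in the TIPC code.

```c
/* Hypothetical action labels for the three outcomes of filter_rcv() */
enum rcv_action {
	RCV_CONSUMED,	/* rc == 0: buffer was queued or freed */
	RCV_REJECT,	/* rc <  0: -rc is the TIPC error code to report */
	RCV_FORWARD	/* rc >  0: message is to be sent onwards */
};

/* Map a filter_rcv()-style return code to the action the caller takes */
static enum rcv_action classify_rcv_result(int rc)
{
	if (rc == 0)
		return RCV_CONSUMED;
	if (rc < 0)
		return RCV_REJECT;
	return RCV_FORWARD;
}
```

In the callers shown in this file, the reject case passes `-rc` to tipc_msg_reverse() before the buffer is handed to tipc_link_xmit(), while the forward case goes to tipc_link_xmit() directly.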
2006-01-02 19:04:38 +01:00
2008-04-15 00:22:02 -07:00
/**
tipc: compensate for double accounting in socket rcv buffer
The function net/core/sock.c::__release_sock() runs a tight loop
to move buffers from the socket backlog queue to the receive queue.
As a security measure, sk_backlog.len of the receiving socket
is not set to zero until after the loop is finished, i.e., until
the whole backlog queue has been transferred to the receive queue.
During this transfer, the data that has already been moved is counted
both in the backlog queue and the receive queue, hence giving an
incorrect picture of the available queue space for new arriving buffers.
This leads to unnecessary rejection of buffers by sk_add_backlog(),
which in TIPC leads to unnecessarily broken connections.
In this commit, we compensate for this double accounting by adding
a counter that keeps track of it. The function socket.c::backlog_rcv()
receives buffers one by one from __release_sock(), and adds them to the
socket receive queue. If the transfer is successful, it increases a new
atomic counter 'tipc_sock::dupl_rcvcnt' with 'truesize' of the
transferred buffer. If a new buffer arrives during this transfer and
finds the socket busy (owned), we attempt to add it to the backlog.
However, when sk_add_backlog() is called, we adjust the 'limit'
parameter with the value of the new counter, so that the risk of
inadvertent rejection is eliminated.
It should be noted that this change does not invalidate the original
purpose of zeroing 'sk_backlog.len' after the full transfer. We set an
upper limit for dupl_rcvcnt, so that if a 'wild' sender (i.e., one that
doesn't respect the send window) keeps pumping in buffers to
sk_add_backlog(), he will eventually reach an upper limit,
(2 x TIPC_CONN_OVERLOAD_LIMIT). After that, no messages can be added
to the backlog, and the connection will be broken. Ordinary, well-
behaved senders will never reach this buffer limit at all.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-14 05:39:09 -04:00
* tipc_backlog_rcv - handle incoming message from backlog queue
2008-04-15 00:22:02 -07:00
* @sk: socket
* @buf: message
*
* Caller must hold socket lock, but not port lock.
*
* Returns 0
*/
2014-05-14 05:39:09 -04:00
static int tipc_backlog_rcv(struct sock *sk, struct sk_buff *buf)
2008-04-15 00:22:02 -07:00
{
2014-06-25 20:41:31 -05:00
	int rc;
2014-06-25 20:41:35 -05:00
	u32 onode;
2014-05-14 05:39:09 -04:00
	struct tipc_sock *tsk = tipc_sk(sk);
2014-06-09 11:08:18 -05:00
	uint truesize = buf->truesize;
2008-04-15 00:22:02 -07:00
2014-06-25 20:41:31 -05:00
	rc = filter_rcv(sk, buf);
2014-05-14 05:39:09 -04:00
2014-06-25 20:41:41 -05:00
	if (likely(!rc)) {
		if (atomic_read(&tsk->dupl_rcvcnt) < TIPC_CONN_OVERLOAD_LIMIT)
			atomic_add(truesize, &tsk->dupl_rcvcnt);
		return 0;
	}
	if ((rc < 0) && !tipc_msg_reverse(buf, &onode, -rc))
		return 0;
2014-07-16 20:41:03 -04:00
	tipc_link_xmit(buf, onode, 0);
2014-05-14 05:39:09 -04:00
2008-04-15 00:22:02 -07:00
	return 0;
}
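The double-accounting compensation described in the commit message above can be modelled outside the kernel with plain integers. The sketch below is a simplified, single-threaded model under stated assumptions: `struct rcv_model`, `model_backlog_rcv()`, `model_backlog_limit()` and `MODEL_OVERLOAD_LIMIT` are hypothetical names, the cap value is arbitrary, and the kernel's atomic operations are replaced by ordinary arithmetic.

```c
/* Arbitrary cap for the model; not the real TIPC_CONN_OVERLOAD_LIMIT */
#define MODEL_OVERLOAD_LIMIT (1u << 20)

struct rcv_model {
	unsigned int backlog_len;	/* stands in for sk->sk_backlog.len */
	unsigned int dupl_rcvcnt;	/* stands in for tsk->dupl_rcvcnt */
};

/*
 * One buffer has been moved from the backlog to the receive queue.
 * Its truesize is still counted in backlog_len, so remember it as
 * double-accounted, as long as the counter is below the cap.
 */
static void model_backlog_rcv(struct rcv_model *m, unsigned int truesize)
{
	if (m->dupl_rcvcnt < MODEL_OVERLOAD_LIMIT)
		m->dupl_rcvcnt += truesize;
}

/*
 * Limit to use when adding a new buffer to the backlog: the base
 * limit plus whatever is currently counted twice. An empty backlog
 * means the transfer has finished, so the counter is reset first.
 */
static unsigned int model_backlog_limit(struct rcv_model *m,
					unsigned int base_limit)
{
	if (m->backlog_len == 0)
		m->dupl_rcvcnt = 0;
	return base_limit + m->dupl_rcvcnt;
}
```

For example, with 3000 bytes still accounted in the backlog and one 1000-byte buffer already transferred, a base limit of 5000 is raised to 6000; once the backlog length drops to zero, the limit falls back to 5000.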
/**
2014-03-12 11:31:10 -04:00
* tipc_sk_rcv - handle incoming message
2014-05-14 05:39:15 -04:00
* @buf: buffer containing arriving message
* Consumes buffer
* Returns 0 if success, or errno: -EHOSTUNREACH
2008-04-15 00:22:02 -07:00
*/
2014-05-14 05:39:15 -04:00
int tipc_sk_rcv(struct sk_buff *buf)
2008-04-15 00:22:02 -07:00
{
2014-05-14 05:39:15 -04:00
	struct tipc_sock *tsk;
	struct sock *sk;
	u32 dport = msg_destport(buf_msg(buf));
2014-06-25 20:41:31 -05:00
	int rc = TIPC_OK;
2014-05-14 05:39:09 -04:00
	uint limit;
2014-06-25 20:41:35 -05:00
	u32 dnode;
2014-05-14 05:39:15 -04:00
tipc: introduce message evaluation function
When a message arrives in a node and finds no destination
socket, we may need to drop it, reject it, or forward it after
a secondary destination lookup. The latter two cases currently
result in a code path that is perceived as complex, because it
follows a deep call chain via obscure functions such as
net_route_named_msg() and net_route_msg().
We now introduce a function, tipc_msg_eval(), that takes the
decision about whether such a message should be rejected or
forwarded, but leaves it to the caller to actually perform
the indicated action.
If the decision is 'reject', it is still the task of the recently
introduced function tipc_msg_reverse() to take the final decision
about whether the message is rejectable or not. In the latter case
it drops the message.
As a result of this change, we can finally eliminate the function
net_route_named_msg(), and hence become independent of net_route_msg().
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-25 20:41:36 -05:00
/* Validate destination and message */
2014-08-22 18:09:16 -04:00
	tsk = tipc_sk_get(dport);
2014-08-22 18:09:15 -04:00
	if (unlikely(!tsk)) {
2014-06-25 20:41:36 -05:00
		rc = tipc_msg_eval(buf, &dnode);
2014-05-14 05:39:15 -04:00
		goto exit;
	}
	sk = &tsk->sk;
	/* Queue message */
2008-04-15 00:22:02 -07:00
	bh_lock_sock(sk);
2014-05-14 05:39:15 -04:00
2008-04-15 00:22:02 -07:00
	if (!sock_owned_by_user(sk)) {
2014-06-25 20:41:31 -05:00
		rc = filter_rcv(sk, buf);
2008-04-15 00:22:02 -07:00
	} else {
2014-05-14 05:39:09 -04:00
		if (sk->sk_backlog.len == 0)
			atomic_set(&tsk->dupl_rcvcnt, 0);
		limit = rcvbuf_limit(sk, buf) + atomic_read(&tsk->dupl_rcvcnt);
		if (sk_add_backlog(sk, buf, limit))
2014-06-25 20:41:31 -05:00
			rc = -TIPC_ERR_OVERLOAD;
2008-04-15 00:22:02 -07:00
	}
	bh_unlock_sock(sk);
2014-08-22 18:09:16 -04:00
	tipc_sk_put(tsk);
2014-06-25 20:41:31 -05:00
	if (likely(!rc))
2014-05-14 05:39:15 -04:00
		return 0;
exit:
2014-06-25 20:41:36 -05:00
if ( ( rc < 0 ) & & ! tipc_msg_reverse ( buf , & dnode , - rc ) )
2014-06-25 20:41:35 -05:00
return - EHOSTUNREACH ;
2014-06-25 20:41:36 -05:00
2014-07-16 20:41:03 -04:00
tipc_link_xmit ( buf , dnode , 0 ) ;
2014-06-25 20:41:36 -05:00
return ( rc < 0 ) ? - EHOSTUNREACH : 0 ;
2006-01-02 19:04:38 +01:00
}
2014-01-17 09:50:03 +08:00
static int tipc_wait_for_connect ( struct socket * sock , long * timeo_p )
{
struct sock * sk = sock - > sk ;
DEFINE_WAIT ( wait ) ;
int done ;
do {
int err = sock_error ( sk ) ;
if ( err )
return err ;
if ( ! * timeo_p )
return - ETIMEDOUT ;
if ( signal_pending ( current ) )
return sock_intr_errno ( * timeo_p ) ;
prepare_to_wait ( sk_sleep ( sk ) , & wait , TASK_INTERRUPTIBLE ) ;
done = sk_wait_event ( sk , timeo_p , sock - > state ! = SS_CONNECTING ) ;
finish_wait ( sk_sleep ( sk ) , & wait ) ;
} while ( ! done ) ;
return 0 ;
}
2006-01-02 19:04:38 +01:00
/**
2014-02-18 16:06:46 +08:00
* tipc_connect - establish a connection to another TIPC port
2006-01-02 19:04:38 +01:00
* @ sock : socket structure
* @ dest : socket address for destination port
* @ destlen : size of socket address data structure
2008-04-15 00:22:02 -07:00
* @ flags : file - related flags associated with socket
2006-01-02 19:04:38 +01:00
*
* Returns 0 on success , errno otherwise
*/
2014-02-18 16:06:46 +08:00
static int tipc_connect ( struct socket * sock , struct sockaddr * dest ,
int destlen , int flags )
2006-01-02 19:04:38 +01:00
{
2008-04-15 00:22:02 -07:00
struct sock * sk = sock - > sk ;
2008-04-15 00:20:37 -07:00
struct sockaddr_tipc * dst = ( struct sockaddr_tipc * ) dest ;
struct msghdr m = { NULL , } ;
2014-01-17 09:50:03 +08:00
long timeout = ( flags & O_NONBLOCK ) ? 0 : tipc_sk ( sk ) - > conn_timeout ;
socket_state previous ;
2008-04-15 00:20:37 -07:00
int res ;
2008-04-15 00:22:02 -07:00
lock_sock ( sk ) ;
2008-04-15 00:20:37 -07:00
/* For now, TIPC does not allow use of connect() with DGRAM/RDM types */
2008-04-15 00:22:02 -07:00
if ( sock - > state = = SS_READY ) {
res = - EOPNOTSUPP ;
goto exit ;
}
2008-04-15 00:20:37 -07:00
/*
* Reject connection attempt using multicast address
*
* Note : tipc_sendmsg ( ) validates the rest of the address fields ,
* so there ' s no need to do it here
*/
2008-04-15 00:22:02 -07:00
if ( dst - > addrtype = = TIPC_ADDR_MCAST ) {
res = - EINVAL ;
goto exit ;
}
2014-01-17 09:50:03 +08:00
previous = sock - > state ;
tipc: introduce non-blocking socket connect
TIPC has so far only supported blocking connect(), meaning that a call
to connect() doesn't return until either the connection is fully
established, or an error occurs. This has proved insufficient for many
users, so we now introduce non-blocking connect(), analogous to how
this is done in TCP and other protocols.
With this feature, if a connection cannot be established instantly,
connect() will return the error code "-EINPROGRESS".
If the user later calls connect() again, the call returns either
"-EALREADY" (connection setup still in progress) or "-EISCONN"
(connection established).
The user must have explicitly set the socket to be non-blocking
(SOCK_NONBLOCK or O_NONBLOCK, depending on method used), so unless
for some reason they had set this already (the socket would anyway
remain blocking in current TIPC) this change should be completely
backwards compatible.
It is also now possible to call select() or poll() to wait for the
completion of a connection.
An effect of the above is that the actual completion of a connection
may now be performed asynchronously, independent of the calls from
user space. Therefore, we now execute this code in BH context, in
the function filter_rcv(), which is executed upon reception of
messages in the socket.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
[PG: minor refactoring for improved connect/disconnect function names]
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
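As a rough user-space illustration of the non-blocking connect() behaviour described in this commit message, the sketch below follows the usual POSIX pattern: set O_NONBLOCK, call connect(), and on EINPROGRESS wait with poll(). The helper name connect_nonblock() and its timeout_ms parameter are hypothetical. The sketch also assumes that completion is reported as writability (POLLOUT) and that a failed attempt can be read back through SO_ERROR, as with TCP; neither detail is confirmed by the text above.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <sys/socket.h>

/* Hypothetical helper: returns 0 once connected, or -1 with errno set.
 * A negative timeout_ms waits indefinitely.
 */
static int connect_nonblock(int fd, const struct sockaddr *addr,
			    socklen_t len, int timeout_ms)
{
	struct pollfd pfd;
	socklen_t errlen;
	int flags, n, err = 0;

	/* Switch the descriptor to non-blocking mode */
	flags = fcntl(fd, F_GETFL, 0);
	if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
		return -1;

	if (connect(fd, addr, len) == 0)
		return 0;		/* established immediately */
	if (errno != EINPROGRESS)
		return -1;		/* e.g. EALREADY, EISCONN, EINVAL */

	/* Wait for the connection attempt to finish (assumed: POLLOUT) */
	pfd.fd = fd;
	pfd.events = POLLOUT;
	pfd.revents = 0;
	do {
		n = poll(&pfd, 1, timeout_ms);
	} while (n < 0 && errno == EINTR);
	if (n < 0)
		return -1;
	if (n == 0) {
		errno = ETIMEDOUT;
		return -1;
	}

	/* Fetch the outcome of the asynchronous attempt */
	errlen = sizeof(err);
	if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &errlen) < 0)
		return -1;
	if (err) {
		errno = err;
		return -1;
	}
	return 0;
}
```

A caller would pass a SOCK_STREAM or SOCK_SEQPACKET socket together with a filled-in destination address; a second connect() on the same socket while setup is pending is expected to fail with EALREADY, as the message states.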
2012-11-29 18:51:19 -05:00
switch ( sock - > state ) {
case SS_UNCONNECTED :
/* Send a 'SYN-' to destination */
m . msg_name = dest ;
m . msg_namelen = destlen ;
/* For a non-blocking connect, set MSG_DONTWAIT so that
* tipc_sendmsg ( ) never blocks .
*/
if ( ! timeout )
m . msg_flags = MSG_DONTWAIT ;
2014-02-18 16:06:46 +08:00
res = tipc_sendmsg ( NULL , sock , & m , 0 ) ;
2012-11-29 18:51:19 -05:00
if ( ( res < 0 ) & & ( res ! = - EWOULDBLOCK ) )
goto exit ;
/* Just entered SS_CONNECTING state; the only
* difference is that return value in non - blocking
* case is EINPROGRESS , rather than EALREADY .
*/
res = - EINPROGRESS ;
/* fall through */
case SS_CONNECTING :
2014-01-17 09:50:03 +08:00
if ( previous = = SS_CONNECTING )
res = - EALREADY ;
if ( ! timeout )
goto exit ;
timeout = msecs_to_jiffies ( timeout ) ;
/* Wait until an 'ACK' or 'RST' arrives, or a timeout occurs */
res = tipc_wait_for_connect ( sock , & timeout ) ;
2012-11-29 18:51:19 -05:00
break ;
case SS_CONNECTED :
res = - EISCONN ;
break ;
default :
res = - EINVAL ;
2014-01-17 09:50:03 +08:00
break ;
2008-04-15 00:20:37 -07:00
}
2008-04-15 00:22:02 -07:00
exit :
release_sock ( sk ) ;
2008-04-15 00:20:37 -07:00
return res ;
2006-01-02 19:04:38 +01:00
}
2007-02-09 23:25:21 +09:00
/**
2014-02-18 16:06:46 +08:00
* tipc_listen - allow socket to listen for incoming connections
2006-01-02 19:04:38 +01:00
* @ sock : socket structure
* @ len : ( unused )
2007-02-09 23:25:21 +09:00
*
2006-01-02 19:04:38 +01:00
* Returns 0 on success , errno otherwise
*/
2014-02-18 16:06:46 +08:00
static int tipc_listen ( struct socket * sock , int len )
2006-01-02 19:04:38 +01:00
{
2008-04-15 00:22:02 -07:00
struct sock * sk = sock - > sk ;
int res ;
lock_sock ( sk ) ;
2006-01-02 19:04:38 +01:00
2011-07-06 06:01:13 -04:00
if ( sock - > state ! = SS_UNCONNECTED )
2008-04-15 00:22:02 -07:00
res = - EINVAL ;
else {
sock - > state = SS_LISTENING ;
res = 0 ;
}
release_sock ( sk ) ;
return res ;
2006-01-02 19:04:38 +01:00
}
2014-01-17 09:50:04 +08:00
static int tipc_wait_for_accept ( struct socket * sock , long timeo )
{
struct sock * sk = sock - > sk ;
DEFINE_WAIT ( wait ) ;
int err ;
/* True wake-one mechanism for incoming connections: only
* one process gets woken up , not the ' whole herd ' .
* Since we do not ' race & poll ' for established sockets
* anymore , the common case will execute the loop only once .
*/
for ( ; ; ) {
prepare_to_wait_exclusive ( sk_sleep ( sk ) , & wait ,
TASK_INTERRUPTIBLE ) ;
2014-03-06 14:40:18 +01:00
if ( timeo & & skb_queue_empty ( & sk - > sk_receive_queue ) ) {
2014-01-17 09:50:04 +08:00
release_sock ( sk ) ;
timeo = schedule_timeout ( timeo ) ;
lock_sock ( sk ) ;
}
err = 0 ;
if ( ! skb_queue_empty ( & sk - > sk_receive_queue ) )
break ;
err = - EINVAL ;
if ( sock - > state ! = SS_LISTENING )
break ;
err = sock_intr_errno ( timeo ) ;
if ( signal_pending ( current ) )
break ;
err = - EAGAIN ;
if ( ! timeo )
break ;
}
finish_wait ( sk_sleep ( sk ) , & wait ) ;
return err ;
}
2007-02-09 23:25:21 +09:00
/**
2014-02-18 16:06:46 +08:00
* tipc_accept - wait for connection request
2006-01-02 19:04:38 +01:00
* @ sock : listening socket
* @ new_sock : new socket that is to be connected
* @ flags : file - related flags associated with socket
2007-02-09 23:25:21 +09:00
*
2006-01-02 19:04:38 +01:00
* Returns 0 on success , errno otherwise
*/
2014-02-18 16:06:46 +08:00
static int tipc_accept ( struct socket * sock , struct socket * new_sock , int flags )
2006-01-02 19:04:38 +01:00
{
2012-12-04 11:01:55 -05:00
struct sock * new_sk , * sk = sock - > sk ;
2006-01-02 19:04:38 +01:00
struct sk_buff * buf ;
2014-08-22 18:09:20 -04:00
struct tipc_sock * new_tsock ;
2012-12-04 11:01:55 -05:00
struct tipc_msg * msg ;
2014-01-17 09:50:04 +08:00
long timeo ;
2008-04-15 00:22:02 -07:00
int res ;
2006-01-02 19:04:38 +01:00
2008-04-15 00:22:02 -07:00
lock_sock ( sk ) ;
2006-01-02 19:04:38 +01:00
2008-04-15 00:22:02 -07:00
if ( sock - > state ! = SS_LISTENING ) {
res = - EINVAL ;
2006-01-02 19:04:38 +01:00
goto exit ;
}
2014-01-17 09:50:04 +08:00
timeo = sock_rcvtimeo ( sk , flags & O_NONBLOCK ) ;
res = tipc_wait_for_accept ( sock , timeo ) ;
if ( res )
goto exit ;
2008-04-15 00:22:02 -07:00
buf = skb_peek ( & sk - > sk_receive_queue ) ;
2013-06-17 10:54:39 -04:00
res = tipc_sk_create ( sock_net ( sock - > sk ) , new_sock , 0 , 1 ) ;
2012-12-04 11:01:55 -05:00
if ( res )
goto exit ;
2006-01-02 19:04:38 +01:00
2012-12-04 11:01:55 -05:00
new_sk = new_sock - > sk ;
2014-08-22 18:09:20 -04:00
new_tsock = tipc_sk ( new_sk ) ;
2012-12-04 11:01:55 -05:00
msg = buf_msg ( buf ) ;
2006-01-02 19:04:38 +01:00
2012-12-04 11:01:55 -05:00
/* we lock on new_sk; but lockdep sees the lock on sk */
lock_sock_nested ( new_sk , SINGLE_DEPTH_NESTING ) ;
/*
* Reject any stray messages received by new socket
* before the socket lock was taken ( very , very unlikely )
*/
2014-08-22 18:09:18 -04:00
tsk_rej_rx_queue ( new_sk ) ;
2012-12-04 11:01:55 -05:00
/* Connect new socket to its peer */
2014-08-22 18:09:20 -04:00
tipc_sk_finish_conn ( new_tsock , msg_origport ( msg ) , msg_orignode ( msg ) ) ;
2012-12-04 11:01:55 -05:00
new_sock - > state = SS_CONNECTED ;
2014-08-22 18:09:20 -04:00
tsk_set_importance ( new_tsock , msg_importance ( msg ) ) ;
2012-12-04 11:01:55 -05:00
if ( msg_named ( msg ) ) {
2014-08-22 18:09:20 -04:00
new_tsock - > conn_type = msg_nametype ( msg ) ;
new_tsock - > conn_instance = msg_nameinst ( msg ) ;
2006-01-02 19:04:38 +01:00
}
2012-12-04 11:01:55 -05:00
/*
* Respond to ' SYN - ' by discarding it & returning ' ACK - ' .
* Respond to ' SYN + ' by queuing it on new socket .
*/
if ( ! msg_data_sz ( msg ) ) {
struct msghdr m = { NULL , } ;
2014-08-22 18:09:18 -04:00
tsk_advance_rx_queue ( sk ) ;
2014-02-18 16:06:46 +08:00
tipc_send_packet ( NULL , new_sock , & m , 0 ) ;
2012-12-04 11:01:55 -05:00
} else {
__skb_dequeue ( & sk - > sk_receive_queue ) ;
__skb_queue_head ( & new_sk - > sk_receive_queue , buf ) ;
2013-01-20 23:30:09 +01:00
skb_set_owner_r ( buf , new_sk ) ;
2012-12-04 11:01:55 -05:00
}
release_sock ( new_sk ) ;
2006-01-02 19:04:38 +01:00
exit :
2008-04-15 00:22:02 -07:00
release_sock ( sk ) ;
2006-01-02 19:04:38 +01:00
return res ;
}
/**
2014-02-18 16:06:46 +08:00
* tipc_shutdown - shutdown socket connection
2006-01-02 19:04:38 +01:00
* @ sock : socket structure
2008-03-06 15:05:38 -08:00
* @ how : direction to close ( must be SHUT_RDWR )
2006-01-02 19:04:38 +01:00
*
* Terminates connection ( if necessary ) , then purges socket ' s receive queue .
2007-02-09 23:25:21 +09:00
*
2006-01-02 19:04:38 +01:00
* Returns 0 on success , errno otherwise
*/
2014-02-18 16:06:46 +08:00
static int tipc_shutdown ( struct socket * sock , int how )
2006-01-02 19:04:38 +01:00
{
2008-04-15 00:22:02 -07:00
struct sock * sk = sock - > sk ;
2014-03-12 11:31:12 -04:00
struct tipc_sock * tsk = tipc_sk ( sk ) ;
2006-01-02 19:04:38 +01:00
struct sk_buff * buf ;
2014-08-22 18:09:10 -04:00
u32 dnode ;
2006-01-02 19:04:38 +01:00
int res ;
2008-03-06 15:05:38 -08:00
if ( how ! = SHUT_RDWR )
return - EINVAL ;
2006-01-02 19:04:38 +01:00
2008-04-15 00:22:02 -07:00
lock_sock ( sk ) ;
2006-01-02 19:04:38 +01:00
switch ( sock - > state ) {
2008-04-15 00:22:02 -07:00
case SS_CONNECTING :
2006-01-02 19:04:38 +01:00
case SS_CONNECTED :
restart :
2012-04-30 15:29:02 -04:00
/* Disconnect and send a 'FIN+' or 'FIN-' message to peer */
2008-04-15 00:22:02 -07:00
buf = __skb_dequeue ( & sk - > sk_receive_queue ) ;
if ( buf ) {
2013-10-18 07:23:16 +02:00
if ( TIPC_SKB_CB ( buf ) - > handle ! = NULL ) {
2011-11-04 13:24:29 -04:00
kfree_skb ( buf ) ;
2006-01-02 19:04:38 +01:00
goto restart ;
}
2014-08-22 18:09:10 -04:00
if ( tipc_msg_reverse ( buf , & dnode , TIPC_CONN_SHUTDOWN ) )
2014-08-22 18:09:20 -04:00
tipc_link_xmit ( buf , dnode , tsk - > ref ) ;
tipc_node_remove_conn ( dnode , tsk - > ref ) ;
2008-04-15 00:22:02 -07:00
} else {
2014-08-22 18:09:20 -04:00
dnode = tsk_peer_node ( tsk ) ;
2014-08-22 18:09:10 -04:00
buf = tipc_msg_create ( TIPC_CRITICAL_IMPORTANCE ,
TIPC_CONN_MSG , SHORT_H_SIZE ,
0 , dnode , tipc_own_addr ,
2014-08-22 18:09:20 -04:00
tsk_peer_port ( tsk ) ,
tsk - > ref , TIPC_CONN_SHUTDOWN ) ;
tipc_link_xmit ( buf , dnode , tsk - > ref ) ;
2006-01-02 19:04:38 +01:00
}
2014-08-22 18:09:20 -04:00
tsk - > connected = 0 ;
2008-04-15 00:22:02 -07:00
sock - > state = SS_DISCONNECTING ;
2014-08-22 18:09:20 -04:00
tipc_node_remove_conn ( dnode , tsk - > ref ) ;
2006-01-02 19:04:38 +01:00
/* fall through */
case SS_DISCONNECTING :
2012-10-29 09:38:15 -04:00
/* Discard any unreceived messages */
2013-01-20 23:30:08 +01:00
__skb_queue_purge ( & sk - > sk_receive_queue ) ;
2012-10-29 09:38:15 -04:00
/* Wake up anyone sleeping in poll */
sk - > sk_state_change ( sk ) ;
2006-01-02 19:04:38 +01:00
res = 0 ;
break ;
default :
res = - ENOTCONN ;
}
2008-04-15 00:22:02 -07:00
release_sock ( sk ) ;
2006-01-02 19:04:38 +01:00
return res ;
}
2014-08-22 18:09:09 -04:00
static void tipc_sk_timeout ( unsigned long ref )
{
2014-08-22 18:09:16 -04:00
struct tipc_sock * tsk ;
2014-08-22 18:09:09 -04:00
struct sock * sk ;
struct sk_buff * buf = NULL ;
u32 peer_port , peer_node ;
2014-08-22 18:09:16 -04:00
tsk = tipc_sk_get ( ref ) ;
2014-08-22 18:09:15 -04:00
if ( ! tsk )
2014-08-28 10:02:41 +08:00
return ;
2014-08-22 18:09:16 -04:00
2014-08-28 10:02:41 +08:00
sk = & tsk - > sk ;
2014-08-22 18:09:16 -04:00
bh_lock_sock ( sk ) ;
2014-08-22 18:09:20 -04:00
if ( ! tsk - > connected ) {
2014-08-22 18:09:16 -04:00
bh_unlock_sock ( sk ) ;
goto exit ;
2014-08-22 18:09:09 -04:00
}
2014-08-22 18:09:20 -04:00
peer_port = tsk_peer_port ( tsk ) ;
peer_node = tsk_peer_node ( tsk ) ;
2014-08-22 18:09:09 -04:00
2014-08-22 18:09:20 -04:00
if ( tsk - > probing_state = = TIPC_CONN_PROBING ) {
2014-08-22 18:09:09 -04:00
/* Previous probe not answered -> self abort */
buf = tipc_msg_create ( TIPC_CRITICAL_IMPORTANCE , TIPC_CONN_MSG ,
SHORT_H_SIZE , 0 , tipc_own_addr ,
peer_node , ref , peer_port ,
TIPC_ERR_NO_PORT ) ;
} else {
buf = tipc_msg_create ( CONN_MANAGER , CONN_PROBE , INT_H_SIZE ,
0 , peer_node , tipc_own_addr ,
peer_port , ref , TIPC_OK ) ;
2014-08-22 18:09:20 -04:00
tsk - > probing_state = TIPC_CONN_PROBING ;
k_start_timer ( & tsk - > timer , tsk - > probing_interval ) ;
2014-08-22 18:09:09 -04:00
}
bh_unlock_sock ( sk ) ;
2014-08-22 18:09:16 -04:00
if ( buf )
tipc_link_xmit ( buf , peer_node , ref ) ;
exit :
tipc_sk_put ( tsk ) ;
2014-08-22 18:09:09 -04:00
}
2014-08-22 18:09:20 -04:00
static int tipc_sk_publish ( struct tipc_sock * tsk , uint scope ,
2014-08-22 18:09:17 -04:00
struct tipc_name_seq const * seq )
{
struct publication * publ ;
u32 key ;
2014-08-22 18:09:20 -04:00
if ( tsk - > connected )
2014-08-22 18:09:17 -04:00
return - EINVAL ;
2014-08-22 18:09:20 -04:00
key = tsk - > ref + tsk - > pub_count + 1 ;
if ( key = = tsk - > ref )
2014-08-22 18:09:17 -04:00
return - EADDRINUSE ;
publ = tipc_nametbl_publish ( seq - > type , seq - > lower , seq - > upper ,
2014-08-22 18:09:20 -04:00
scope , tsk - > ref , key ) ;
2014-08-22 18:09:17 -04:00
if ( unlikely ( ! publ ) )
return - EINVAL ;
2014-08-22 18:09:20 -04:00
list_add ( & publ - > pport_list , & tsk - > publications ) ;
tsk - > pub_count + + ;
tsk - > published = 1 ;
2014-08-22 18:09:17 -04:00
return 0 ;
}
2014-08-22 18:09:20 -04:00
static int tipc_sk_withdraw ( struct tipc_sock * tsk , uint scope ,
2014-08-22 18:09:17 -04:00
struct tipc_name_seq const * seq )
{
struct publication * publ ;
struct publication * safe ;
int rc = - EINVAL ;
2014-08-22 18:09:20 -04:00
list_for_each_entry_safe ( publ , safe , & tsk - > publications , pport_list ) {
2014-08-22 18:09:17 -04:00
if ( seq ) {
if ( publ - > scope ! = scope )
continue ;
if ( publ - > type ! = seq - > type )
continue ;
if ( publ - > lower ! = seq - > lower )
continue ;
if ( publ - > upper ! = seq - > upper )
break ;
tipc_nametbl_withdraw ( publ - > type , publ - > lower ,
publ - > ref , publ - > key ) ;
rc = 0 ;
break ;
}
tipc_nametbl_withdraw ( publ - > type , publ - > lower ,
publ - > ref , publ - > key ) ;
rc = 0 ;
}
2014-08-22 18:09:20 -04:00
if ( list_empty ( & tsk - > publications ) )
tsk - > published = 0 ;
2014-08-22 18:09:17 -04:00
return rc ;
}
2014-08-22 18:09:20 -04:00
static int tipc_sk_show ( struct tipc_sock * tsk , char * buf ,
2014-08-22 18:09:14 -04:00
int len , int full_id )
{
struct publication * publ ;
int ret ;
if ( full_id )
ret = tipc_snprintf ( buf , len , " <%u.%u.%u:%u>: " ,
tipc_zone ( tipc_own_addr ) ,
tipc_cluster ( tipc_own_addr ) ,
2014-08-22 18:09:20 -04:00
tipc_node ( tipc_own_addr ) , tsk - > ref ) ;
2014-08-22 18:09:14 -04:00
else
2014-08-22 18:09:20 -04:00
ret = tipc_snprintf ( buf , len , " %-10u: " , tsk - > ref ) ;
2014-08-22 18:09:14 -04:00
2014-08-22 18:09:20 -04:00
if ( tsk - > connected ) {
u32 dport = tsk_peer_port ( tsk ) ;
u32 destnode = tsk_peer_node ( tsk ) ;
2014-08-22 18:09:14 -04:00
ret + = tipc_snprintf ( buf + ret , len - ret ,
" connected to <%u.%u.%u:%u> " ,
tipc_zone ( destnode ) ,
tipc_cluster ( destnode ) ,
tipc_node ( destnode ) , dport ) ;
2014-08-22 18:09:20 -04:00
if ( tsk - > conn_type ! = 0 )
2014-08-22 18:09:14 -04:00
ret + = tipc_snprintf ( buf + ret , len - ret ,
2014-08-22 18:09:20 -04:00
" via {%u,%u} " , tsk - > conn_type ,
tsk - > conn_instance ) ;
} else if ( tsk - > published ) {
2014-08-22 18:09:14 -04:00
ret + = tipc_snprintf ( buf + ret , len - ret , " bound to " ) ;
2014-08-22 18:09:20 -04:00
list_for_each_entry ( publ , & tsk - > publications , pport_list ) {
2014-08-22 18:09:14 -04:00
if ( publ - > lower = = publ - > upper )
ret + = tipc_snprintf ( buf + ret , len - ret ,
" {%u,%u} " , publ - > type ,
publ - > lower ) ;
else
ret + = tipc_snprintf ( buf + ret , len - ret ,
" {%u,%u,%u} " , publ - > type ,
publ - > lower , publ - > upper ) ;
}
}
ret + = tipc_snprintf ( buf + ret , len - ret , " \n " ) ;
return ret ;
}
struct sk_buff * tipc_sk_socks_show ( void )
{
struct sk_buff * buf ;
struct tlv_desc * rep_tlv ;
char * pb ;
int pb_len ;
struct tipc_sock * tsk ;
int str_len = 0 ;
u32 ref = 0 ;
buf = tipc_cfg_reply_alloc ( TLV_SPACE ( ULTRA_STRING_MAX_LEN ) ) ;
if ( ! buf )
return NULL ;
rep_tlv = ( struct tlv_desc * ) buf - > data ;
pb = TLV_DATA ( rep_tlv ) ;
pb_len = ULTRA_STRING_MAX_LEN ;
2014-08-22 18:09:16 -04:00
tsk = tipc_sk_get_next ( & ref ) ;
for ( ; tsk ; tsk = tipc_sk_get_next ( & ref ) ) {
lock_sock ( & tsk - > sk ) ;
2014-08-22 18:09:20 -04:00
str_len + = tipc_sk_show ( tsk , pb + str_len ,
2014-08-22 18:09:14 -04:00
pb_len - str_len , 0 ) ;
2014-08-22 18:09:16 -04:00
release_sock ( & tsk - > sk ) ;
tipc_sk_put ( tsk ) ;
2014-08-22 18:09:14 -04:00
}
str_len + = 1 ; /* for "\0" */
skb_put ( buf , TLV_SPACE ( str_len ) ) ;
TLV_SET ( rep_tlv , TIPC_TLV_ULTRA_STRING , NULL , str_len ) ;
return buf ;
}
/* tipc_sk_reinit: set non-zero address in all existing sockets
* when we go from standalone to network mode .
*/
void tipc_sk_reinit ( void )
{
struct tipc_msg * msg ;
u32 ref = 0 ;
2014-08-22 18:09:16 -04:00
struct tipc_sock * tsk = tipc_sk_get_next ( & ref ) ;
2014-08-22 18:09:14 -04:00
2014-08-22 18:09:16 -04:00
for ( ; tsk ; tsk = tipc_sk_get_next ( & ref ) ) {
lock_sock ( & tsk - > sk ) ;
2014-08-22 18:09:20 -04:00
msg = & tsk - > phdr ;
2014-08-22 18:09:14 -04:00
msg_set_prevnode ( msg , tipc_own_addr ) ;
msg_set_orignode ( msg , tipc_own_addr ) ;
2014-08-22 18:09:16 -04:00
release_sock ( & tsk - > sk ) ;
tipc_sk_put ( tsk ) ;
2014-08-22 18:09:14 -04:00
}
}
2014-08-22 18:09:19 -04:00
/**
* struct reference - TIPC socket reference entry
* @ tsk : pointer to socket associated with reference entry
* @ ref : reference value for socket ( combines instance & array index info )
*/
struct reference {
struct tipc_sock * tsk ;
u32 ref ;
} ;
/**
* struct ref_table - table of TIPC socket reference entries
* @ entries : pointer to array of reference entries
* @ capacity : array index of first unusable entry
* @ init_point : array index of first uninitialized entry
* @ first_free : array index of first unused socket reference entry
* @ last_free : array index of last unused socket reference entry
* @ index_mask : bitmask for array index portion of reference values
* @ start_mask : initial value for instance value portion of reference values
*/
struct ref_table {
struct reference * entries ;
u32 capacity ;
u32 init_point ;
u32 first_free ;
u32 last_free ;
u32 index_mask ;
u32 start_mask ;
} ;
/* Socket reference table consists of 2**N entries.
*
* State          Socket ptr   Reference
* -----          ----------   ---------
* In use         non-NULL     XXXX | own index
*                             ( XXXX changes each time entry is acquired )
* Free           NULL         YYYY | next free index
*                             ( YYYY is one more than last used XXXX )
* Uninitialized  NULL         0
*
* Entry 0 is not used ; this allows index 0 to denote the end of the free list .
*
* Note that a reference value of 0 does not necessarily indicate that an
* entry is uninitialized , since the last entry in the free list could also
* have a reference value of 0 ( although this is unlikely ) .
*/
static struct ref_table tipc_ref_table ;
static DEFINE_RWLOCK ( ref_table_lock ) ;
/**
* tipc_sk_ref_table_init - create reference table for sockets
*/
int tipc_sk_ref_table_init ( u32 req_sz , u32 start )
{
struct reference * table ;
u32 actual_sz ;
/* account for unused entry, then round up size to a power of 2 */
req_sz + + ;
for ( actual_sz = 16 ; actual_sz < req_sz ; actual_sz < < = 1 ) {
/* do nothing */
}
/* allocate table & mark all entries as uninitialized */
table = vzalloc ( actual_sz * sizeof ( struct reference ) ) ;
if ( table = = NULL )
return - ENOMEM ;
tipc_ref_table . entries = table ;
tipc_ref_table . capacity = req_sz ;
tipc_ref_table . init_point = 1 ;
tipc_ref_table . first_free = 0 ;
tipc_ref_table . last_free = 0 ;
tipc_ref_table . index_mask = actual_sz - 1 ;
tipc_ref_table . start_mask = start & ~ tipc_ref_table . index_mask ;
return 0 ;
}
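The sizing arithmetic in the function above can be checked in isolation. The following is a minimal user-space sketch, not kernel code: struct ref_sizing and ref_sizing_compute() are hypothetical names, and the assert() guarding against shift overflow is an addition that the original does not have.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space mirror of the sizing fields */
struct ref_sizing {
	uint32_t capacity;	/* array index of first unusable entry */
	uint32_t actual_sz;	/* allocated entries, a power of two >= 16 */
	uint32_t index_mask;	/* low bits that select the array slot */
	uint32_t start_mask;	/* initial instance bits for new entries */
};

static struct ref_sizing ref_sizing_compute(uint32_t req_sz, uint32_t start)
{
	struct ref_sizing s;
	uint32_t actual_sz = 16;

	/* Added guard: larger requests would overflow the shift below */
	assert(req_sz < 0x80000000u);

	/* Account for the unused entry 0, then round up to a power of 2 */
	req_sz++;
	while (actual_sz < req_sz)
		actual_sz <<= 1;

	s.capacity = req_sz;
	s.actual_sz = actual_sz;
	s.index_mask = actual_sz - 1;
	/* Keep only the bits of 'start' that lie above the index field */
	s.start_mask = start & ~s.index_mask;
	return s;
}
```

For a request of 100 sockets this gives 128 allocated entries and an index mask of 0x7f, so the low seven bits of every reference value hold the array index and the remaining bits hold the instance value.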
/**
* tipc_sk_ref_table_stop - destroy reference table for sockets
*/
void tipc_sk_ref_table_stop ( void )
{
if ( ! tipc_ref_table . entries )
return ;
vfree ( tipc_ref_table . entries ) ;
tipc_ref_table . entries = NULL ;
}
/* tipc_sk_ref_acquire - create reference to a socket
*
* Register a socket pointer in the reference table .
* Returns a unique reference value that is used from then on to retrieve the
* socket pointer , or to determine if the socket has been deregistered .
*/
u32 tipc_sk_ref_acquire ( struct tipc_sock * tsk )
{
u32 index ;
u32 index_mask ;
u32 next_plus_upper ;
u32 ref = 0 ;
struct reference * entry ;
if ( unlikely ( ! tsk ) ) {
pr_err ( " Attempt to acquire ref. to non-existent obj \n " ) ;
return 0 ;
}
if ( unlikely ( ! tipc_ref_table . entries ) ) {
pr_err ( " Ref. table not found in acquisition attempt \n " ) ;
return 0 ;
}
/* Take a free entry, if available; otherwise initialize a new one */
write_lock_bh ( & ref_table_lock ) ;
index = tipc_ref_table . first_free ;
entry = & tipc_ref_table . entries [ index ] ;
if ( likely ( index ) ) {
index_mask = tipc_ref_table . index_mask ;
next_plus_upper = entry - > ref ;
tipc_ref_table . first_free = next_plus_upper & index_mask ;
ref = ( next_plus_upper & ~ index_mask ) + index ;
} else if ( tipc_ref_table . init_point < tipc_ref_table . capacity ) {
index = tipc_ref_table . init_point + + ;
entry = & tipc_ref_table . entries [ index ] ;
ref = tipc_ref_table . start_mask + index ;
}
if ( ref ) {
entry - > ref = ref ;
entry - > tsk = tsk ;
}
write_unlock_bh ( & ref_table_lock ) ;
return ref ;
}
/* tipc_sk_ref_discard - invalidate reference to a socket
*
* Disallow future references to a socket and free up the entry for re - use .
*/
void tipc_sk_ref_discard ( u32 ref )
{
struct reference * entry ;
u32 index ;
u32 index_mask ;
if ( unlikely ( ! tipc_ref_table . entries ) ) {
pr_err ( " Ref. table not found during discard attempt \n " ) ;
return ;
}
index_mask = tipc_ref_table . index_mask ;
index = ref & index_mask ;
entry = & tipc_ref_table . entries [ index ] ;
write_lock_bh ( & ref_table_lock ) ;
if ( unlikely ( ! entry - > tsk ) ) {
pr_err ( " Attempt to discard ref. to non-existent socket \n " ) ;
goto exit ;
}
if ( unlikely ( entry - > ref ! = ref ) ) {
pr_err ( " Attempt to discard non-existent reference \n " ) ;
goto exit ;
}
/* Mark entry as unused; increment instance part of entry's
* reference to invalidate any subsequent references
*/
entry - > tsk = NULL ;
entry - > ref = ( ref & ~ index_mask ) + ( index_mask + 1 ) ;
/* Append entry to free entry list */
if ( unlikely ( tipc_ref_table . first_free = = 0 ) )
tipc_ref_table . first_free = index ;
else
tipc_ref_table . entries [ tipc_ref_table . last_free ] . ref | = index ;
tipc_ref_table . last_free = index ;
exit :
write_unlock_bh ( & ref_table_lock ) ;
}
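/* Illustration only, not part of the kernel file: a minimal user-space
 * sketch of the reference scheme used by tipc_sk_ref_acquire() and
 * tipc_sk_ref_discard() above. The low bits of a reference hold the table
 * index, the upper bits hold an instance count that is bumped on discard,
 * and a free entry reuses its low bits as the link to the next free entry.
 * The names (ref_table, table_acquire, ...) and the capacity CAP are
 * assumptions made for this sketch; locking and socket refcounts are left out.
 */

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CAP 8u                       /* hypothetical capacity, power of two */
#define INDEX_MASK (CAP - 1u)

struct ref_entry {
	void *obj;     /* NULL while the slot is unused */
	uint32_t ref;  /* in use: full reference; free: instance bits | next free index */
};

struct ref_table {
	struct ref_entry e[CAP];
	uint32_t first_free, last_free, init_point, start_mask;
};

static void table_init(struct ref_table *t, uint32_t start)
{
	for (size_t i = 0; i < CAP; i++) {
		t->e[i].obj = NULL;
		t->e[i].ref = 0;
	}
	t->first_free = t->last_free = 0;
	t->init_point = 1;                    /* index 0 is reserved: ref 0 means "none" */
	t->start_mask = start & ~INDEX_MASK;
}

/* Returns a non-zero reference, or 0 if the table is full. */
static uint32_t table_acquire(struct ref_table *t, void *obj)
{
	uint32_t index = t->first_free, ref = 0;

	assert(obj != NULL);
	if (index) {                          /* reuse a discarded slot */
		uint32_t next_plus_upper = t->e[index].ref;

		t->first_free = next_plus_upper & INDEX_MASK;
		ref = (next_plus_upper & ~INDEX_MASK) + index;
	} else if (t->init_point < CAP) {     /* initialize a fresh slot */
		index = t->init_point++;
		ref = t->start_mask + index;
	}
	if (ref) {
		t->e[index].ref = ref;
		t->e[index].obj = obj;
	}
	return ref;
}

/* Returns 0 on success, -1 if the reference is stale or unknown. */
static int table_discard(struct ref_table *t, uint32_t ref)
{
	uint32_t index = ref & INDEX_MASK;
	struct ref_entry *ent = &t->e[index];

	if (!ent->obj || ent->ref != ref)
		return -1;
	ent->obj = NULL;
	/* bump the instance part; low bits become 0, i.e. "end of free list" */
	ent->ref = (ref & ~INDEX_MASK) + (INDEX_MASK + 1u);
	if (t->first_free == 0)
		t->first_free = index;
	else
		t->e[t->last_free].ref |= index;  /* link old tail to this slot */
	t->last_free = index;
	return 0;
}

/* Returns the registered object, or NULL for a stale or unknown reference. */
static void *table_lookup(const struct ref_table *t, uint32_t ref)
{
	const struct ref_entry *ent = &t->e[ref & INDEX_MASK];

	return (ent->obj && ent->ref == ref) ? ent->obj : NULL;
}
```

/* In this sketch a slot that is discarded and then reused hands out a
 * different reference value for the same index, so a holder of the old
 * value gets NULL from table_lookup() instead of the new object.
 */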
/* tipc_sk_get - find referenced socket and return pointer to it
*/
struct tipc_sock * tipc_sk_get ( u32 ref )
{
struct reference * entry ;
struct tipc_sock * tsk ;
if ( unlikely ( ! tipc_ref_table . entries ) )
return NULL ;
read_lock_bh ( & ref_table_lock ) ;
entry = & tipc_ref_table . entries [ ref & tipc_ref_table . index_mask ] ;
tsk = entry - > tsk ;
if ( likely ( tsk & & ( entry - > ref = = ref ) ) )
sock_hold ( & tsk - > sk ) ;
else
tsk = NULL ;
read_unlock_bh ( & ref_table_lock ) ;
return tsk ;
}
/* tipc_sk_get_next - take a reference on & return next socket after referenced one
*/
struct tipc_sock * tipc_sk_get_next ( u32 * ref )
{
struct reference * entry ;
struct tipc_sock * tsk = NULL ;
uint index = * ref & tipc_ref_table . index_mask ;
read_lock_bh ( & ref_table_lock ) ;
while ( + + index < tipc_ref_table . capacity ) {
entry = & tipc_ref_table . entries [ index ] ;
if ( ! entry - > tsk )
continue ;
tsk = entry - > tsk ;
sock_hold ( & tsk - > sk ) ;
* ref = entry - > ref ;
break ;
}
read_unlock_bh ( & ref_table_lock ) ;
return tsk ;
}
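/* Illustration only, not part of the kernel file: a user-space sketch of
 * the cursor pattern behind tipc_sk_get_next(). The caller starts with
 * *ref == 0 and calls repeatedly; each call resumes the scan after the
 * index encoded in *ref. The names slot, slots_get_next and count_slots
 * are assumptions for this sketch. The real function also takes a socket
 * reference with sock_hold(), which the caller has to drop again (for
 * example via tipc_sk_put()); that part is not modelled here.
 */

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CAP 8u                       /* hypothetical capacity, power of two */
#define INDEX_MASK (CAP - 1u)

struct slot {
	void *obj;     /* NULL while the slot is unused */
	uint32_t ref;  /* full reference; low bits equal the slot index */
};

/* Return the next occupied slot after *ref and advance *ref to it,
 * or return NULL once the end of the table is reached. */
static void *slots_get_next(const struct slot *tbl, uint32_t *ref)
{
	uint32_t index;

	assert(tbl != NULL && ref != NULL);
	index = *ref & INDEX_MASK;
	while (++index < CAP) {
		if (!tbl[index].obj)
			continue;
		*ref = tbl[index].ref;
		return tbl[index].obj;
	}
	return NULL;
}

/* Walk the whole table with the cursor and count occupied slots. */
static unsigned int count_slots(const struct slot *tbl)
{
	uint32_t ref = 0;
	unsigned int n = 0;

	while (slots_get_next(tbl, &ref))
		n++;
	return n;
}
```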
static void tipc_sk_put ( struct tipc_sock * tsk )
{
sock_put ( & tsk - > sk ) ;
}
/**
* tipc_setsockopt - set socket option
* @ sock : socket structure
* @ lvl : option level
* @ opt : option identifier
* @ ov : pointer to new option value
* @ ol : length of option value
*
* For stream sockets only , accepts and ignores all IPPROTO_TCP options
* ( to ease compatibility ) .
*
* Returns 0 on success , errno otherwise
*/
static int tipc_setsockopt ( struct socket * sock , int lvl , int opt ,
char __user * ov , unsigned int ol )
{
struct sock * sk = sock - > sk ;
struct tipc_sock * tsk = tipc_sk ( sk ) ;
u32 value ;
int res ;
if ( ( lvl = = IPPROTO_TCP ) & & ( sock - > type = = SOCK_STREAM ) )
return 0 ;
if ( lvl ! = SOL_TIPC )
return - ENOPROTOOPT ;
if ( ol < sizeof ( value ) )
return - EINVAL ;
res = get_user ( value , ( u32 __user * ) ov ) ;
if ( res )
return res ;
lock_sock ( sk ) ;
switch ( opt ) {
case TIPC_IMPORTANCE :
res = tsk_set_importance ( tsk , value ) ;
break ;
case TIPC_SRC_DROPPABLE :
if ( sock - > type ! = SOCK_STREAM )
tsk_set_unreliable ( tsk , value ) ;
else
res = - ENOPROTOOPT ;
break ;
case TIPC_DEST_DROPPABLE :
tsk_set_unreturnable ( tsk , value ) ;
break ;
case TIPC_CONN_TIMEOUT :
tipc_sk ( sk ) - > conn_timeout = value ;
/* no need to set "res", since already 0 at this point */
break ;
default :
res = - EINVAL ;
}
release_sock ( sk ) ;
return res ;
}
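/* Illustration only, not part of the kernel file: a user-space sketch of
 * how an application might reach tipc_setsockopt() above. It assumes the
 * UAPI header <linux/tipc.h> is installed and provides SOL_TIPC,
 * TIPC_IMPORTANCE and the TIPC_*_IMPORTANCE levels. The helper name
 * tipc_set_importance is hypothetical. The option value is passed as a
 * 32-bit integer because the handler rejects lengths below sizeof(u32).
 */

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <linux/tipc.h>

/* Set the message importance of an AF_TIPC socket.
 * Returns 0 on success, or a negative errno value on failure. */
static int tipc_set_importance(int fd, unsigned int level)
{
	assert(level <= TIPC_CRITICAL_IMPORTANCE);
	if (setsockopt(fd, SOL_TIPC, TIPC_IMPORTANCE, &level, sizeof(level)) < 0)
		return -errno;
	return 0;
}
```

/* A caller would typically obtain fd from socket(AF_TIPC, SOCK_RDM, 0),
 * which only succeeds when TIPC support is available on the running kernel.
 */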
/**
* tipc_getsockopt - get socket option
* @ sock : socket structure
* @ lvl : option level
* @ opt : option identifier
* @ ov : receptacle for option value
* @ ol : receptacle for length of option value
*
* For stream sockets only , returns 0 length result for all IPPROTO_TCP options
* ( to ease compatibility ) .
*
* Returns 0 on success , errno otherwise
*/
static int tipc_getsockopt ( struct socket * sock , int lvl , int opt ,
char __user * ov , int __user * ol )
{
struct sock * sk = sock - > sk ;
struct tipc_sock * tsk = tipc_sk ( sk ) ;
int len ;
u32 value ;
int res ;
if ( ( lvl = = IPPROTO_TCP ) & & ( sock - > type = = SOCK_STREAM ) )
return put_user ( 0 , ol ) ;
if ( lvl ! = SOL_TIPC )
return - ENOPROTOOPT ;
res = get_user ( len , ol ) ;
if ( res )
return res ;
lock_sock ( sk ) ;
switch ( opt ) {
case TIPC_IMPORTANCE :
value = tsk_importance ( tsk ) ;
break ;
case TIPC_SRC_DROPPABLE :
value = tsk_unreliable ( tsk ) ;
break ;
case TIPC_DEST_DROPPABLE :
value = tsk_unreturnable ( tsk ) ;
break ;
case TIPC_CONN_TIMEOUT :
value = tsk - > conn_timeout ;
/* no need to set "res", since already 0 at this point */
break ;
case TIPC_NODE_RECVQ_DEPTH :
value = 0 ; /* was tipc_queue_size, now obsolete */
break ;
case TIPC_SOCK_RECVQ_DEPTH :
value = skb_queue_len ( & sk - > sk_receive_queue ) ;
break ;
default :
res = - EINVAL ;
}
release_sock ( sk ) ;
if ( res )
return res ; /* "get" failed */
if ( len < sizeof ( value ) )
return - EINVAL ;
if ( copy_to_user ( ov , & value , sizeof ( value ) ) )
return - EFAULT ;
return put_user ( sizeof ( value ) , ol ) ;
}
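/* Illustration only, not part of the kernel file: a user-space stand-in
 * for the tail of tipc_getsockopt() above, showing the value/length
 * protocol. The caller passes the size of its buffer in *ol and gets the
 * number of bytes written back in *ol. The name opt_copy_out is
 * hypothetical, and memcpy() stands in for copy_to_user()/put_user().
 * Unlike the kernel code, this sketch compares the length as a signed
 * value, so a negative length is rejected; that is a choice made here.
 */

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Copy a 32-bit option value out to the caller.
 * Returns 0 on success, -EINVAL if the caller's buffer is too small. */
static int opt_copy_out(void *ov, int *ol, uint32_t value)
{
	assert(ov != NULL && ol != NULL);
	if (*ol < (int)sizeof(value))
		return -EINVAL;              /* buffer cannot hold a u32 */
	memcpy(ov, &value, sizeof(value));
	*ol = (int)sizeof(value);            /* report the bytes actually written */
	return 0;
}
```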
static int tipc_ioctl ( struct socket * sk , unsigned int cmd , unsigned long arg )
{
struct tipc_sioc_ln_req lnr ;
void __user * argp = ( void __user * ) arg ;
switch ( cmd ) {
case SIOCGETLINKNAME :
if ( copy_from_user ( & lnr , argp , sizeof ( lnr ) ) )
return - EFAULT ;
if ( ! tipc_node_get_linkname ( lnr . bearer_id , lnr . peer ,
lnr . linkname , TIPC_MAX_LINK_NAME ) ) {
if ( copy_to_user ( argp , & lnr , sizeof ( lnr ) ) )
return - EFAULT ;
return 0 ;
}
return - EADDRNOTAVAIL ;
default :
return - ENOIOCTLCMD ;
}
}
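/* Illustration only, not part of the kernel file: a user-space sketch of
 * the SIOCGETLINKNAME request handled by tipc_ioctl() above. It assumes
 * <linux/tipc.h> provides SIOCGETLINKNAME, TIPC_MAX_LINK_NAME and
 * struct tipc_sioc_ln_req with the peer, bearer_id and linkname fields
 * used by the handler. The helper name tipc_link_name is hypothetical.
 */

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/tipc.h>

/* Look up the name of the link to node 'peer' on bearer 'bearer_id'.
 * 'name' must have room for TIPC_MAX_LINK_NAME bytes.
 * Returns 0 on success, or a negative errno value on failure. */
static int tipc_link_name(int fd, unsigned int peer, unsigned int bearer_id,
			  char *name)
{
	struct tipc_sioc_ln_req lnr;

	assert(name != NULL);
	memset(&lnr, 0, sizeof(lnr));
	lnr.peer = peer;
	lnr.bearer_id = bearer_id;
	if (ioctl(fd, SIOCGETLINKNAME, &lnr) < 0)
		return -errno;
	memcpy(name, lnr.linkname, TIPC_MAX_LINK_NAME);
	name[TIPC_MAX_LINK_NAME - 1] = '\0';  /* defensive termination */
	return 0;
}
```

/* Going by the handler above, a request for a link that does not exist
 * should come back as -EADDRNOTAVAIL from this helper.
 */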
/* Protocol switches for the various types of TIPC sockets */
static const struct proto_ops msg_ops = {
. owner = THIS_MODULE ,
. family = AF_TIPC ,
. release = tipc_release ,
. bind = tipc_bind ,
. connect = tipc_connect ,
. socketpair = sock_no_socketpair ,
. accept = sock_no_accept ,
. getname = tipc_getname ,
. poll = tipc_poll ,
. ioctl = tipc_ioctl ,
. listen = sock_no_listen ,
. shutdown = tipc_shutdown ,
. setsockopt = tipc_setsockopt ,
. getsockopt = tipc_getsockopt ,
. sendmsg = tipc_sendmsg ,
. recvmsg = tipc_recvmsg ,
. mmap = sock_no_mmap ,
. sendpage = sock_no_sendpage
} ;
static const struct proto_ops packet_ops = {
. owner = THIS_MODULE ,
. family = AF_TIPC ,
. release = tipc_release ,
. bind = tipc_bind ,
. connect = tipc_connect ,
. socketpair = sock_no_socketpair ,
. accept = tipc_accept ,
. getname = tipc_getname ,
. poll = tipc_poll ,
. ioctl = tipc_ioctl ,
. listen = tipc_listen ,
. shutdown = tipc_shutdown ,
. setsockopt = tipc_setsockopt ,
. getsockopt = tipc_getsockopt ,
. sendmsg = tipc_send_packet ,
. recvmsg = tipc_recvmsg ,
. mmap = sock_no_mmap ,
. sendpage = sock_no_sendpage
} ;
static const struct proto_ops stream_ops = {
. owner = THIS_MODULE ,
. family = AF_TIPC ,
. release = tipc_release ,
. bind = tipc_bind ,
. connect = tipc_connect ,
. socketpair = sock_no_socketpair ,
. accept = tipc_accept ,
. getname = tipc_getname ,
. poll = tipc_poll ,
. ioctl = tipc_ioctl ,
. listen = tipc_listen ,
. shutdown = tipc_shutdown ,
. setsockopt = tipc_setsockopt ,
. getsockopt = tipc_getsockopt ,
. sendmsg = tipc_send_stream ,
. recvmsg = tipc_recv_stream ,
. mmap = sock_no_mmap ,
. sendpage = sock_no_sendpage
} ;
static const struct net_proto_family tipc_family_ops = {
. owner = THIS_MODULE ,
. family = AF_TIPC ,
. create = tipc_sk_create
} ;
static struct proto tipc_proto = {
. name = " TIPC " ,
. owner = THIS_MODULE ,
. obj_size = sizeof ( struct tipc_sock ) ,
. sysctl_rmem = sysctl_tipc_rmem
} ;
static struct proto tipc_proto_kern = {
. name = " TIPC " ,
. obj_size = sizeof ( struct tipc_sock ) ,
. sysctl_rmem = sysctl_tipc_rmem
} ;
/**
* tipc_socket_init - initialize TIPC socket interface
*
* Returns 0 on success , errno otherwise
*/
int tipc_socket_init ( void )
{
int res ;
res = proto_register ( & tipc_proto , 1 ) ;
if ( res ) {
pr_err ( " Failed to register TIPC protocol type \n " ) ;
goto out ;
}
res = sock_register ( & tipc_family_ops ) ;
if ( res ) {
pr_err ( " Failed to register TIPC socket type \n " ) ;
proto_unregister ( & tipc_proto ) ;
goto out ;
}
out :
return res ;
}
/**
* tipc_socket_stop - stop TIPC socket interface
*/
void tipc_socket_stop ( void )
{
sock_unregister ( tipc_family_ops . family ) ;
proto_unregister ( & tipc_proto ) ;
}