/* SCTP kernel implementation
 * Copyright (c) 1999-2000 Cisco, Inc.
 * Copyright (c) 1999-2001 Motorola, Inc.
 * Copyright (c) 2002 International Business Machines, Corp.
 *
 * This file is part of the SCTP kernel implementation
 *
 * These functions are the methods for accessing the SCTP inqueue.
 *
 * An SCTP inqueue is a queue into which you push SCTP packets
 * (which might be bundles or fragments of chunks) and out of which you
 * pop SCTP whole chunks.
 *
 * This SCTP implementation is free software;
 * you can redistribute it and/or modify it under the terms of
 * the GNU General Public License as published by
 * the Free Software Foundation; either version 2, or (at your option)
 * any later version.
 *
 * This SCTP implementation is distributed in the hope that it
 * will be useful, but WITHOUT ANY WARRANTY; without even the implied
 *                 ************************
 * warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
 * See the GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with GNU CC; see the file COPYING.  If not, see
 * <http://www.gnu.org/licenses/>.
 *
 * Please send any bug reports or fixes you make to the
 * email address(es):
 *    lksctp developers <linux-sctp@vger.kernel.org>
 *
 * Written or modified by:
 *    La Monte H.P. Yarroll <piggy@acm.org>
 *    Karl Knutson          <karl@athena.chicago.il.us>
 */

#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

#include <net/sctp/sctp.h>
#include <net/sctp/sm.h>
#include <linux/interrupt.h>
#include <linux/slab.h>

/* Initialize an SCTP inqueue.  */
void sctp_inq_init(struct sctp_inq *queue)
{
	INIT_LIST_HEAD(&queue->in_chunk_list);
	queue->in_progress = NULL;

	/* Create a task for delivering data.  */
	INIT_WORK(&queue->immediate, NULL);
}

/* Release the memory associated with an SCTP inqueue.  */
void sctp_inq_free(struct sctp_inq *queue)
{
	struct sctp_chunk *chunk, *tmp;

	/* Empty the queue.  */
	list_for_each_entry_safe(chunk, tmp, &queue->in_chunk_list, list) {
		list_del_init(&chunk->list);
		sctp_chunk_free(chunk);
	}

	/* If there is a packet which is currently being worked on,
	 * free it as well.
	 */
	if (queue->in_progress) {
		sctp_chunk_free(queue->in_progress);
		queue->in_progress = NULL;
	}
}

/* Put a new packet in an SCTP inqueue.
 * We assume that packet->sctp_hdr is set and in host byte order.
 */
void sctp_inq_push(struct sctp_inq *q, struct sctp_chunk *chunk)
{
	/* Directly call the packet handling routine. */
	if (chunk->rcvr->dead) {
		sctp_chunk_free(chunk);
		return;
	}

	/* We are now calling this either from the soft interrupt
	 * or from the backlog processing.
	 * Eventually, we should clean up inqueue to not rely
	 * on the BH related data structures.
	 */
	list_add_tail(&chunk->list, &q->in_chunk_list);
	if (chunk->asoc)
		chunk->asoc->stats.ipackets++;
	q->immediate.func(&q->immediate);
}

/* Peek at the next chunk on the inqueue. */
struct sctp_chunkhdr *sctp_inq_peek(struct sctp_inq *queue)
{
	struct sctp_chunk *chunk;
	struct sctp_chunkhdr *ch = NULL;

	chunk = queue->in_progress;
	/* If there are no more chunks in this packet, say so */
	if (chunk->singleton ||
	    chunk->end_of_packet ||
	    chunk->pdiscard)
		return NULL;

	ch = (struct sctp_chunkhdr *)chunk->chunk_end;

	return ch;
}


/* Extract a chunk from an SCTP inqueue.
 *
 * WARNING:  If you need to put the chunk on another queue, you need to
 * make a shallow copy (clone) of it.
 */
struct sctp_chunk *sctp_inq_pop(struct sctp_inq *queue)
{
	struct sctp_chunk *chunk;
	struct sctp_chunkhdr *ch = NULL;

	/* The assumption is that we are safe to process the chunks
	 * at this time.
	 */

	chunk = queue->in_progress;
	if (chunk) {
		/* There is a packet that we have been working on.
		 * Any post processing work to do before we move on?
		 */
		if (chunk->singleton ||
		    chunk->end_of_packet ||
		    chunk->pdiscard) {
			if (chunk->head_skb == chunk->skb) {
				chunk->skb = skb_shinfo(chunk->skb)->frag_list;
				goto new_skb;
			}
			if (chunk->skb->next) {
				chunk->skb = chunk->skb->next;
				goto new_skb;
			}

			if (chunk->head_skb)
				chunk->skb = chunk->head_skb;
			sctp_chunk_free(chunk);
			chunk = queue->in_progress = NULL;
		} else {
			/* Nothing to do. Next chunk in the packet, please. */
			ch = (struct sctp_chunkhdr *)chunk->chunk_end;
			/* Force chunk->skb->data to chunk->chunk_end.  */
			skb_pull(chunk->skb, chunk->chunk_end - chunk->skb->data);
			/* We are guaranteed to pull an SCTP header. */
		}
	}

	/* Do we need to take the next packet out of the queue to process? */
	if (!chunk) {
		struct list_head *entry;

next_chunk:
		/* Is the queue empty?  */
		entry = sctp_list_dequeue(&queue->in_chunk_list);
		if (!entry)
			return NULL;

		chunk = list_entry(entry, struct sctp_chunk, list);

		if ((skb_shinfo(chunk->skb)->gso_type & SKB_GSO_SCTP) == SKB_GSO_SCTP) {
			/* GSO-marked skbs but without frags, handle
			 * them normally
			 */
			if (skb_shinfo(chunk->skb)->frag_list)
				chunk->head_skb = chunk->skb;

			/* skbs with "cover letter" */
			if (chunk->head_skb && chunk->skb->data_len == chunk->skb->len)
				chunk->skb = skb_shinfo(chunk->skb)->frag_list;

			if (WARN_ON(!chunk->skb)) {
				/* chunk->skb is NULL in this branch, so it must not
				 * be dereferenced; take the netns from head_skb,
				 * which is the only path that can leave skb NULL.
				 */
				__SCTP_INC_STATS(dev_net(chunk->head_skb->dev), SCTP_MIB_IN_PKT_DISCARDS);
				sctp_chunk_free(chunk);
				goto next_chunk;
			}
		}

		if (chunk->asoc)
			sock_rps_save_rxhash(chunk->asoc->base.sk, chunk->skb);

		queue->in_progress = chunk;

new_skb:
		/* This is the first chunk in the packet.  */
		ch = (struct sctp_chunkhdr *)chunk->skb->data;
		chunk->singleton = 1;
		chunk->data_accepted = 0;
		chunk->pdiscard = 0;
		chunk->auth = 0;
		chunk->has_asconf = 0;
		chunk->end_of_packet = 0;
		if (chunk->head_skb) {
			struct sctp_input_cb
				*cb = SCTP_INPUT_CB(chunk->skb),
				*head_cb = SCTP_INPUT_CB(chunk->head_skb);

			cb->chunk = head_cb->chunk;
			cb->af = head_cb->af;
		}
	}

	chunk->chunk_hdr = ch;
	chunk->chunk_end = ((__u8 *)ch) + SCTP_PAD4(ntohs(ch->length));
	skb_pull(chunk->skb, sizeof(*ch));
	chunk->subh.v = NULL; /* Subheader is no longer valid.  */

	if (chunk->chunk_end + sizeof(*ch) < skb_tail_pointer(chunk->skb)) {
		/* This is not a singleton */
		chunk->singleton = 0;
	} else if (chunk->chunk_end > skb_tail_pointer(chunk->skb)) {
		/* Discard inside state machine. */
		chunk->pdiscard = 1;
		chunk->chunk_end = skb_tail_pointer(chunk->skb);
	} else {
		/* We are at the end of the packet, so mark the chunk
		 * in case we need to send a SACK.
		 */
		chunk->end_of_packet = 1;
	}

	pr_debug("+++sctp_inq_pop+++ chunk:%p[%s], length:%d, skb->len:%d\n",
		 chunk, sctp_cname(SCTP_ST_CHUNK(chunk->chunk_hdr->type)),
		 ntohs(chunk->chunk_hdr->length), chunk->skb->len);

	return chunk;
}

/* Set a top-half handler.
 *
 * Originally, the top-half handler was scheduled as a BH.  We now
 * call the handler directly in sctp_inq_push() at a time that
 * we know we are lock safe.
 * The intent is that this routine will pull stuff out of the
 * inqueue and process it.
 */
void sctp_inq_set_th_handler(struct sctp_inq *q, work_func_t callback)
{
	INIT_WORK(&q->immediate, callback);
}