Commit Graph

22637 Commits

Author SHA1 Message Date
David S. Miller
011e3c6325 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2012-04-12 19:41:23 -04:00
Eldad Zack
c1412fce7e net/ipv6/exthdrs.c: Strict PadN option checking
Added strict checking of PadN, as PadN can be used to increase header
size and thus push the protocol header into the 2nd fragment.

PadN is used to align the options within the Hop-by-Hop or
Destination Options header to 64-bit boundaries. The maximum valid
size is thus 7 bytes.
RFC 4942 recommends to actively check the "payload" itself and
ensure that it contains only zeroes.

See also RFC 4942 section 2.1.9.5.

Signed-off-by: Eldad Zack <eldad@fogrefinery.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-12 17:36:44 -04:00
Linus Torvalds
174808af90 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Fix bluetooth userland regression reported by Keith Packard, from
    Gustavo Padovan.

 2) Revert ath9k PS idle change, from Sujith Manoharan.

 3) Correct default TCP memory limits (again), from Eric Dumazet.

 4) Fix tcp_rcv_rtt_update() accidental use of unscaled RTT, from Neal
    Cardwell.

 5) We made a facility for layers like wireless to say how much tailroom
    they need in the SKB for link layer stuff such as wireless
    encryption etc., but TCP works hard to fill every SKB out to the end
    defeating this specification.

    This leads to every TCP packet getting reallocated by the wireless
    code in order to have the right amount of tailroom available.

    Fix TCP to only fill SKBs out to the real amount of data area it
    asked for during the allocation, this way it won't eat into the
    slack added for the device's tailroom needs.

    Reported by Marc Merlin and fixed by Eric Dumazet.

 6) Leaks, endian bugs, and new device IDs in bluetooth from Santosh
    Nayak, João Paulo Rechi Vita, Cho, Yu-Chen, Andrei Emeltchenko,
    AceLan Kao, and Andrei Emeltchenko.

 7) OOPS on tty_close fix in bluetooth's hci_ldisc from Johan Hovold.

 8) netfilter erroneously scales TCP window twice, fix from Changli Gao.

 9) Memleak fix in wext-core from Julia Lawall.

10) Consistently handle invalid TCP packets in ipv4 vs.  ipv6 conntrack,
    from Jozsef Kadlecsik.

11) Validate IP header length properly in netfilter conntrack's
    ipv4_get_l4proto().

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (39 commits)
  NFC: Fix the LLCP Tx fragmentation loop
  rtlwifi: Add missing DMA buffer unmapping for PCI drivers
  rtlwifi: Preallocate USB read buffers and eliminate kalloc in read routine
  tcp: avoid order-1 allocations on wifi and tx path
  net: allow pskb_expand_head() to get maximum tailroom
  bridge: Do not send queries on multicast group leaves
  MAINTAINERS: Mark NATSEMI driver as orphan'd.
  tcp: fix tcp_rcv_rtt_update() use of an unscaled RTT sample
  tcp: restore correct limit
  Revert "ath9k: fix going to full-sleep on PS idle"
  rt2x00: Fix rfkill_polling register function.
  bcma: fix build error on MIPS; implicit pcibios_enable_device
  netfilter: nf_conntrack: fix incorrect logic in nf_conntrack_init_net
  netfilter: nf_ct_ipv4: packets with wrong ihl are invalid
  netfilter: nf_ct_ipv4: handle invalid IPv4 and IPv6 packets consistently
  net/wireless/wext-core.c: add missing kfree
  rtlwifi: Fix oops on rate-control failure
  mac80211: Convert WARN_ON to WARN_ON_ONCE
  rtlwifi: rtl8192de: Fix firmware initialization
  nl80211: ensure interface is up in various APIs
  ...
2012-04-12 14:04:33 -07:00
Jeffrin Jose
46ba5b23c3 net: Fixed coding style issues relating to braces.
Fixed coding style issues in net/core/utils.c
in relation with braces placement.

Signed-off-by: Jeffrin Jose <ahiliation@yahoo.co.in>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-12 16:35:48 -04:00
Eldad Zack
6fdbd1648b net/ipv6/ipv6_sockglue.c: Removed redundant extern
extern int sysctl_mld_max_msf is already defined in linux/ipv6.h.

Signed-off-by: Eldad Zack <eldad@fogrefinery.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-12 16:14:47 -04:00
Shuah Khan
e1e420c71b net/core: simple_strtoul cleanup
Changed net/core/net-sysfs.c: netdev_store() to use kstrtoul()
instead of obsolete simple_strtoul().

Signed-off-by: Shuah Khan <shuahkhan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-12 16:09:08 -04:00
Alexey I. Froloff
e35f30c131 Treat ND option 31 as userland (DNSSL support)
As specified in RFC6106, DNSSL option contains one or more domain names
of DNS suffixes.  8-bit identifier of the DNSSL option type as assigned
by the IANA is 31.  This option should also be treated as userland.

Signed-off-by: Alexey I. Froloff <raorn@raorn.name>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-12 15:56:57 -04:00
John W. Linville
5d94994422 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem 2012-04-12 09:55:22 -04:00
Samuel Ortiz
b4838d12e1 NFC: Fix the LLCP Tx fragmentation loop
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-04-11 15:09:33 -04:00
Eric Dumazet
a21d45726a tcp: avoid order-1 allocations on wifi and tx path
Marc Merlin reported many order-1 allocations failures in TX path on its
wireless setup, that dont make any sense with MTU=1500 network, and non
SG capable hardware.

After investigation, it turns out TCP uses sk_stream_alloc_skb() and
used as a convention skb_tailroom(skb) to know how many bytes of data
payload could be put in this skb (for non SG capable devices)

Note : these skb used kmalloc-4096 (MTU=1500 + MAX_HEADER +
sizeof(struct skb_shared_info) being above 2048)

Later, mac80211 layer need to add some bytes at the tail of skb
(IEEE80211_ENCRYPT_TAILROOM = 18 bytes) and since no more tailroom is
available has to call pskb_expand_head() and request order-1
allocations.

This patch changes sk_stream_alloc_skb() so that only
sk->sk_prot->max_header bytes of headroom are reserved, and use a new
skb field, avail_size to hold the data payload limit.

This way, order-0 allocations done by TCP stack can leave more than 2 KB
of tailroom and no more allocation is performed in mac80211 layer (or
any layer needing some tailroom)

avail_size is unioned with mark/dropcount, since mark will be set later
in IP stack for output packets. Therefore, skb size is unchanged.

Reported-by: Marc MERLIN <marc@merlins.org>
Tested-by: Marc MERLIN <marc@merlins.org>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-11 10:11:12 -04:00
Eric Dumazet
87151b8689 net: allow pskb_expand_head() to get maximum tailroom
Marc Merlin reported many order-1 allocations failures in TX path on its
wireless setup, that dont make any sense with MTU=1500 network, and non
SG capable hardware.

Turns out part of the problem comes from pskb_expand_head() not using
ksize() to get exact head size given by kmalloc(). Doing the same thing
than __alloc_skb() allows more tailroom in skb and can prevent future
reallocations.

As a bonus, struct skb_shared_info becomes cache line aligned.

Reported-by: Marc MERLIN <marc@merlins.org>
Tested-by: Marc MERLIN <marc@merlins.org>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-11 10:10:43 -04:00
Herbert Xu
996304bbea bridge: Do not send queries on multicast group leaves
As it stands the bridge IGMP snooping system will respond to
group leave messages with queries for remaining membership.
This is both unnecessary and undesirable.  First of all any
multicast routers present should be doing this rather than us.
What's more the queries that we send may end up upsetting other
multicast snooping swithces in the system that are buggy.

In fact, we can simply remove the code that send these queries
because the existing membership expiry mechanism doesn't rely
on them anyway.

So this patch simply removes all code associated with group
queries in response to group leave messages.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-11 09:43:13 -04:00
Simon Wunderlich
7a5cc24277 batman-adv: add bridge loop avoidance compile option
The define CONFIG_BATMAN_ADV_BLA switches the bridge loop avoidance
on - skip it, and the bridge loop avoidance is not compiled in.

This is useful if binary size should be saved or the feature is
not needed.

Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-04-11 14:29:00 +02:00
Simon Wunderlich
38ef3d1d91 batman-adv: form groups in the bridge loop avoidance
backbone gateways may be part of the same LAN, but participate
in different meshes. With this patch, backbone gateways form groups by
applying the groupid of another backbone gateway if it is higher. After
forming the group, they only accept messages from backbone gateways of
the same group.

Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-04-11 14:28:59 +02:00
Simon Wunderlich
b1a8c04b8a batman-adv: drop STP over batman
Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-04-11 14:28:59 +02:00
Simon Wunderlich
fe2da6ff27 batman-adv: add broadcast duplicate check
When multiple backbone gateways relay the same broadcast from the
backbone into the mesh, other nodes in the mesh may receive this
broadcast multiple times. To avoid this, the crc checksums of
received broadcasts are recorded and new broadcast packets with
the same content may be dropped if received by another gateway.

Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-04-11 14:28:59 +02:00
Simon Wunderlich
20ff9d593f batman-adv: don't let backbone gateways exchange tt entries
As the backbone gateways are connected to the same backbone, they
should announce the same clients on the backbone non-exclusively.

Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-04-11 14:28:59 +02:00
Simon Wunderlich
db08e6e557 batman-adv: allow multiple entries in tt_global_entries
as backbone gateways will all independently announce the same clients,
also the tt global table must be able to hold multiple originators per
client entry.

Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-04-11 14:28:59 +02:00
Simon Wunderlich
9bf8e4d425 batman-adv: export claim tables through debugfs
Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-04-11 14:28:59 +02:00
Simon Wunderlich
c867305509 batman-adv: make bridge loop avoidance switchable
Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-04-11 14:28:59 +02:00
Simon Wunderlich
23721387c4 batman-adv: add basic bridge loop avoidance code
This second version of the bridge loop avoidance for batman-adv
avoids loops between the mesh and a backbone (usually a LAN).

By connecting multiple batman-adv mesh nodes to the same ethernet
segment a loop can be created when the soft-interface is bridged
into that ethernet segment. A simple visualization of the loop
involving the most common case - a LAN as ethernet segment:

node1  <-- LAN  -->  node2
  |                   |
wifi   <-- mesh -->  wifi

Packets from the LAN (e.g. ARP broadcasts) will circle forever from
node1 or node2 over the mesh back into the LAN.

With this patch, batman recognizes backbone gateways, nodes which are
part of the mesh and backbone/LAN at the same time. Each backbone
gateway "claims" clients from within the mesh to handle them
exclusively. By restricting that only responsible backbone gateways
may handle their claimed clients traffic, loops are effectively
avoided.

Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-04-11 14:28:58 +02:00
Simon Wunderlich
a7f6ee9493 batman-adv: remove old bridge loop avoidance code
The functionality is to be replaced by an improved implementation,
so first clean up.

Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-04-11 14:28:58 +02:00
Marek Lindner
8681a1c4dd batman-adv: encourage batman to take shorter routes by changing the default hop penalty
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-04-11 14:28:58 +02:00
Sven Eckelmann
de7aae6570 batman-adv: Remove declaration of only locally used functions
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Acked-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-04-11 14:28:58 +02:00
Sven Eckelmann
0079d2cef1 batman-adv: Replace bitarray operations with bitmap
bitarray.c consists mostly of functionality that is already available as part
of the standard kernel API. batman-adv could use architecture optimized code
and reduce the binary size by switching to the standard functions.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-04-11 14:28:58 +02:00
Antonio Quartulli
c1faead333 batman-adv: use ETH_ALEN instead of hardcoded numeric constants
In packet.h the numeric constant 6 is used instead of the more portable ETH_ALEN
define. This patch substitute any hardcoded value with such define.

Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Acked-by: Sven Eckelmann <sven@narfation.org>
2012-04-11 14:28:58 +02:00
Antonio Quartulli
10e3cd6a25 batman-adv: clean up Kconfig
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-04-11 14:28:58 +02:00
Linus Torvalds
94fb175c04 dmaengine-fixes for 3.4-rc3
1/ regression fix for Xen as it now trips over a broken assumption
    about the dma address size on 32-bit builds
 
 2/ new quirk for netdma to ignore dma channels that cannot meet
    netdma alignment requirements
 
 3/ fixes for two long standing issues in ioatdma (ring size overflow)
    and iop-adma (potential stack corruption)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJPhIfhAAoJEB7SkWpmfYgCguIQAL4qF+RC9/JggSHIjfOrYiPd
 yboV80GqqQHHBwy8hfZVUrIEPMebvD/xUIk6iUQNXR+6EA8Ln0jukvQMpWNnI+Cc
 TXgA5Ok70an4PD1MqnCsWyCJjsyPyhprbRHurxBcesf+y96POJxhING0rcKvft50
 mvYnbtrkYe9M9x3b8TBGc0JaTVeL29Ck3FtkTz4uUktbkhRNfCcfEd28NRQpf8MB
 vkjbjRGBQmGsnKxYCaEhlF1GPJyTlYjg4BBWtseJgb2R9s7tvJrkotFea/NmSfjq
 XCuVKjpiFp3YyJuxJERWdwqRWvyAZFfcYyZX440nG0b7GBgSn+T7A9XhUs8vMboi
 tLwoDfBbJDlKMaFpHex7Z6RtZZmVl3gWDNZTqpG44n4pabd4RPip04f0k7Wfs+cp
 tzU9hGAOvgsZ8w4/JgxH8YJOZbIGzbDGOA1IhWcbxIbmFTblMiFnV3TC7qfhoRbR
 8qtScIE7bUck2MYVlMMn9utd9tvKFa6HNgo41+f78/4+U7zQ/VrsbA/DWQct40R5
 5k+EEvyYFUzIXn79E0GVN5h4NHH5gfAs3MZ7jIgwgHedBp4Ki68XYKNu+pIV3YwG
 CFTPn1mVOXnCdt+fsjG5tL9Jecx1Mij6w3nWU93ZU6cHmC77YmU+DLxPIGuyR1a2
 EmpObwfq5peXzkgQpEsB
 =F3IR
 -----END PGP SIGNATURE-----

Merge tag 'dmaengine-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/dmaengine

Pull dmaengine fixes from Dan Williams:

1/ regression fix for Xen as it now trips over a broken assumption
   about the dma address size on 32-bit builds

2/ new quirk for netdma to ignore dma channels that cannot meet
   netdma alignment requirements

3/ fixes for two long standing issues in ioatdma (ring size overflow)
   and iop-adma (potential stack corruption)

* tag 'dmaengine-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/dmaengine:
  netdma: adding alignment check for NETDMA ops
  ioatdma: DMA copy alignment needed to address IOAT DMA silicon errata
  ioat: ring size variables need to be 32bit to avoid overflow
  iop-adma: Corrected array overflow in RAID6 Xscale(R) test.
  ioat: fix size of 'completion' for Xen
2012-04-10 15:30:16 -07:00
Neal Cardwell
18a223e0b9 tcp: fix tcp_rcv_rtt_update() use of an unscaled RTT sample
Fix a code path in tcp_rcv_rtt_update() that was comparing scaled and
unscaled RTT samples.

The intent in the code was to only use the 'm' measurement if it was a
new minimum.  However, since 'm' had not yet been shifted left 3 bits
but 'new_sample' had, this comparison would nearly always succeed,
leading us to erroneously set our receive-side RTT estimate to the 'm'
sample when that sample could be nearly 8x too high to use.

The overall effect is to often cause the receive-side RTT estimate to
be significantly too large (up to 40% too large for brief periods in
my tests).

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-10 14:47:09 -04:00
Eric Dumazet
5fb84b1428 tcp: restore correct limit
Commit c43b874d5d (tcp: properly initialize tcp memory limits) tried
to fix a regression added in commits 4acb4190 & 3dc43e3,
but still get it wrong.

Result is machines with low amount of memory have too small tcp_rmem[2]
value and slow tcp receives : Per socket limit being 1/1024 of memory
instead of 1/128 in old kernels, so rcv window is capped to small
values.

Fix this to match comment and previous behavior.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Glauber Costa <glommer@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-10 14:39:26 -04:00
David S. Miller
ecd159fc5f Merge branch 'master' of git://1984.lsi.us.es/net 2012-04-10 14:38:31 -04:00
David S. Miller
06eb4eafbd Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2012-04-10 14:30:45 -04:00
Gao feng
6ba900676b netfilter: nf_conntrack: fix incorrect logic in nf_conntrack_init_net
in function nf_conntrack_init_net,when nf_conntrack_timeout_init falied,
we should call nf_conntrack_ecache_fini to do rollback.
but the current code calls nf_conntrack_timeout_fini.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-04-10 13:00:38 +02:00
Jozsef Kadlecsik
07153c6ec0 netfilter: nf_ct_ipv4: packets with wrong ihl are invalid
It was reported that the Linux kernel sometimes logs:

klogd: [2629147.402413] kernel BUG at net / netfilter /
nf_conntrack_proto_tcp.c: 447!
klogd: [1072212.887368] kernel BUG at net / netfilter /
nf_conntrack_proto_tcp.c: 392

ipv4_get_l4proto() in nf_conntrack_l3proto_ipv4.c and tcp_error() in
nf_conntrack_proto_tcp.c should catch malformed packets, so the errors
at the indicated lines - TCP options parsing - should not happen.
However, tcp_error() relies on the "dataoff" offset to the TCP header,
calculated by ipv4_get_l4proto().  But ipv4_get_l4proto() does not check
bogus ihl values in IPv4 packets, which then can slip through tcp_error()
and get caught at the TCP options parsing routines.

The patch fixes ipv4_get_l4proto() by invalidating packets with bogus
ihl value.

The patch closes netfilter bugzilla id 771.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-04-10 12:50:49 +02:00
Jozsef Kadlecsik
8430eac2f6 netfilter: nf_ct_ipv4: handle invalid IPv4 and IPv6 packets consistently
IPv6 conntrack marked invalid packets as INVALID and let the user
drop those by an explicit rule, while IPv4 conntrack dropped such
packets itself.

IPv4 conntrack is changed so that it marks INVALID packets and let
the user to drop them.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-04-10 00:38:34 +02:00
Julia Lawall
7ab2485b69 net/wireless/wext-core.c: add missing kfree
Free extra as done in the error-handling code just above.

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-04-09 15:54:50 -04:00
Johannes Berg
2b5f8b0b44 nl80211: ensure interface is up in various APIs
The nl80211 handling code should ensure as much as
it can that the interface is in a valid state, it
can certainly ensure the interface is running.

Not doing so can cause calls through mac80211 into
the driver that result in warnings and unspecified
behaviour in the driver.

Cc: stable@vger.kernel.org
Reported-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-04-09 15:54:46 -04:00
Johannes Berg
3f9768a5d2 mac80211: fix association beacon wait timeout
The TU_TO_EXP_TIME() macro already includes the
"jiffies +" piece of the calculation, so don't
add jiffies again.

Reported-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Tested-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2012-04-09 15:54:46 -04:00
John W. Linville
41833af713 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth 2012-04-09 15:47:49 -04:00
Pablo Neira Ayuso
95ad2f873d netfilter: ip6_tables: ip6t_ext_hdr is now static inline
We may hit this in xt_LOG:

net/built-in.o:xt_LOG.c:function dump_ipv6_packet:
	error: undefined reference to 'ip6t_ext_hdr'

happens with these config options:

CONFIG_NETFILTER_XT_TARGET_LOG=y
CONFIG_IP6_NF_IPTABLES=m

ip6t_ext_hdr is fairly small and it is called in the packet path.
Make it static inline.

Reported-by: Simon Kirby <sim@netnation.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-04-09 16:29:34 +02:00
Changli Gao
6ee0b693bd netfilter: nf_ct_tcp: don't scale the size of the window up twice
For a picked up connection, the window win is scaled twice: one is by the
initialization code, and the other is by the sender updating code.

I use the temporary variable swin instead of modifying the variable win.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-04-09 16:29:30 +02:00
Linus Torvalds
23f347ef63 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking updates from David Miller:

 1) Fix inaccuracies in network driver interface documentation, from Ben
    Hutchings.

 2) Fix handling of negative offsets in BPF JITs, from Jan Seiffert.

 3) Compile warning, locking, and refcounting fixes in netfilter's
    xt_CT, from Pablo Neira Ayuso.

 4) phonet sendmsg needs to validate user length just like any other
    datagram protocol, fix from Sasha Levin.

 5) Ipv6 multicast code uses wrong loop index, from RongQing Li.

 6) Link handling and firmware fixes in bnx2x driver from Yaniv Rosner
    and Yuval Mintz.

 7) mlx4 erroneously allocates 4 pages at a time, regardless of page
    size, fix from Thadeu Lima de Souza Cascardo.

 8) SCTP socket option wasn't extended in a backwards compatible way,
    fix from Thomas Graf.

 9) Add missing address change event emissions to bonding, from Shlomo
    Pongratz.

10) /proc/net/dev regressed because it uses a private offset to track
    where we are in the hash table, but this doesn't track the offset
    pullback that the seq_file code does resulting in some entries being
    missed in large dumps.

    Fix from Eric Dumazet.

11) do_tcp_sendpage() unloads the send queue way too fast, because it
    invokes tcp_push() when it shouldn't.  Let the natural sequence
    generated by the splice paths, and the assosciated MSG_MORE
    settings, guide the tcp_push() calls.

    Otherwise what goes out of TCP is spaghetti and doesn't batch
    effectively into GSO/TSO clusters.

    From Eric Dumazet.

12) Once we put a SKB into either the netlink receiver's queue or a
    socket error queue, it can be consumed and freed up, therefore we
    cannot touch it after queueing it like that.

    Fixes from Eric Dumazet.

13) PPP has this annoying behavior in that for every transmit call it
    immediately stops the TX queue, then calls down into the next layer
    to transmit the PPP frame.

    But if that next layer can take it immediately, it just un-stops the
    TX queue right before returning from the transmit method.

    Besides being useless work, it makes several facilities unusable, in
    particular things like the equalizers.  Well behaved devices should
    only stop the TX queue when they really are full, and in PPP's case
    when it gets backlogged to the downstream device.

    David Woodhouse therefore fixed PPP to not stop the TX queue until
    it's downstream can't take data any more.

14) IFF_UNICAST_FLT got accidently lost in some recent stmmac driver
    changes, re-add.  From Marc Kleine-Budde.

15) Fix link flaps in ixgbe, from Eric W. Multanen.

16) Descriptor writeback fixes in e1000e from Matthew Vick.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (47 commits)
  net: fix a race in sock_queue_err_skb()
  netlink: fix races after skb queueing
  doc, net: Update ndo_start_xmit return type and values
  doc, net: Remove instruction to set net_device::trans_start
  doc, net: Update netdev operation names
  doc, net: Update documentation of synchronisation for TX multiqueue
  doc, net: Remove obsolete reference to dev->poll
  ethtool: Remove exception to the requirement of holding RTNL lock
  MAINTAINERS: update for Marvell Ethernet drivers
  bonding: properly unset current_arp_slave on slave link up
  phonet: Check input from user before allocating
  tcp: tcp_sendpages() should call tcp_push() once
  ipv6: fix array index in ip6_mc_add_src()
  mlx4: allocate just enough pages instead of always 4 pages
  stmmac: re-add IFF_UNICAST_FLT for dwmac1000
  bnx2x: Clear MDC/MDIO warning message
  bnx2x: Fix BCM57711+BCM84823 link issue
  bnx2x: Clear BCM84833 LED after fan failure
  bnx2x: Fix BCM84833 PHY FW version presentation
  bnx2x: Fix link issue for BCM8727 boards.
  ...
2012-04-06 10:37:38 -07:00
Eric Dumazet
110c43304d net: fix a race in sock_queue_err_skb()
As soon as an skb is queued into socket error queue, another thread
can consume it, so we are not allowed to reference skb anymore, or risk
use after free.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-06 05:07:21 -04:00
Eric Dumazet
4a7e7c2ad5 netlink: fix races after skb queueing
As soon as an skb is queued into socket receive_queue, another thread
can consume it, so we are not allowed to reference skb anymore, or risk
use after free.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-06 04:21:06 -04:00
Sasha Levin
bcf1b70ac6 phonet: Check input from user before allocating
A phonet packet is limited to USHRT_MAX bytes, this is never checked during
tx which means that the user can specify any size he wishes, and the kernel
will attempt to allocate that size.

In the good case, it'll lead to the following warning, but it may also cause
the kernel to kick in the OOM and kill a random task on the server.

[ 8921.744094] WARNING: at mm/page_alloc.c:2255 __alloc_pages_slowpath+0x65/0x730()
[ 8921.749770] Pid: 5081, comm: trinity Tainted: G        W    3.4.0-rc1-next-20120402-sasha #46
[ 8921.756672] Call Trace:
[ 8921.758185]  [<ffffffff810b2ba7>] warn_slowpath_common+0x87/0xb0
[ 8921.762868]  [<ffffffff810b2be5>] warn_slowpath_null+0x15/0x20
[ 8921.765399]  [<ffffffff8117eae5>] __alloc_pages_slowpath+0x65/0x730
[ 8921.769226]  [<ffffffff81179c8a>] ? zone_watermark_ok+0x1a/0x20
[ 8921.771686]  [<ffffffff8117d045>] ? get_page_from_freelist+0x625/0x660
[ 8921.773919]  [<ffffffff8117f3a8>] __alloc_pages_nodemask+0x1f8/0x240
[ 8921.776248]  [<ffffffff811c03e0>] kmalloc_large_node+0x70/0xc0
[ 8921.778294]  [<ffffffff811c4bd4>] __kmalloc_node_track_caller+0x34/0x1c0
[ 8921.780847]  [<ffffffff821b0e3c>] ? sock_alloc_send_pskb+0xbc/0x260
[ 8921.783179]  [<ffffffff821b3c65>] __alloc_skb+0x75/0x170
[ 8921.784971]  [<ffffffff821b0e3c>] sock_alloc_send_pskb+0xbc/0x260
[ 8921.787111]  [<ffffffff821b002e>] ? release_sock+0x7e/0x90
[ 8921.788973]  [<ffffffff821b0ff0>] sock_alloc_send_skb+0x10/0x20
[ 8921.791052]  [<ffffffff824cfc20>] pep_sendmsg+0x60/0x380
[ 8921.792931]  [<ffffffff824cb4a6>] ? pn_socket_bind+0x156/0x180
[ 8921.794917]  [<ffffffff824cb50f>] ? pn_socket_autobind+0x3f/0x90
[ 8921.797053]  [<ffffffff824cb63f>] pn_socket_sendmsg+0x4f/0x70
[ 8921.798992]  [<ffffffff821ab8e7>] sock_aio_write+0x187/0x1b0
[ 8921.801395]  [<ffffffff810e325e>] ? sub_preempt_count+0xae/0xf0
[ 8921.803501]  [<ffffffff8111842c>] ? __lock_acquire+0x42c/0x4b0
[ 8921.805505]  [<ffffffff821ab760>] ? __sock_recv_ts_and_drops+0x140/0x140
[ 8921.807860]  [<ffffffff811e07cc>] do_sync_readv_writev+0xbc/0x110
[ 8921.809986]  [<ffffffff811958e7>] ? might_fault+0x97/0xa0
[ 8921.811998]  [<ffffffff817bd99e>] ? security_file_permission+0x1e/0x90
[ 8921.814595]  [<ffffffff811e17e2>] do_readv_writev+0xe2/0x1e0
[ 8921.816702]  [<ffffffff810b8dac>] ? do_setitimer+0x1ac/0x200
[ 8921.818819]  [<ffffffff810e2ec1>] ? get_parent_ip+0x11/0x50
[ 8921.820863]  [<ffffffff810e325e>] ? sub_preempt_count+0xae/0xf0
[ 8921.823318]  [<ffffffff811e1926>] vfs_writev+0x46/0x60
[ 8921.825219]  [<ffffffff811e1a3f>] sys_writev+0x4f/0xb0
[ 8921.827127]  [<ffffffff82658039>] system_call_fastpath+0x16/0x1b
[ 8921.829384] ---[ end trace dffe390f30db9eb7 ]---

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Acked-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-05 19:05:56 -04:00
Eric Dumazet
35f9c09fe9 tcp: tcp_sendpages() should call tcp_push() once
commit 2f53384424 (tcp: allow splice() to build full TSO packets) added
a regression for splice() calls using SPLICE_F_MORE.

We need to call tcp_flush() at the end of the last page processed in
tcp_sendpages(), or else transmits can be deferred and future sends
stall.

Add a new internal flag, MSG_SENDPAGE_NOTLAST, acting like MSG_MORE, but
with different semantic.

For all sendpage() providers, its a transparent change. Only
sock_sendpage() and tcp_sendpages() can differentiate the two different
flags provided by pipe_to_sendpage()

Reported-by: Tom Herbert <therbert@google.com>
Cc: Nandita Dukkipati <nanditad@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: H.K. Jerry Chu <hkchu@google.com>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Mahesh Bandewar <maheshb@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail>com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-05 19:04:27 -04:00
Linus Torvalds
5d32c88f0b Merge branch 'akpm' (Andrew's patch-bomb)
Merge batch of fixes from Andrew Morton:
 "The simple_open() cleanup was held back while I wanted for laggards to
  merge things.

  I still need to send a few checkpoint/restore patches.  I've been
  wobbly about merging them because I'm wobbly about the overall
  prospects for success of the project.  But after speaking with Pavel
  at the LSF conference, it sounds like they're further toward
  completion than I feared - apparently davem is at the "has stopped
  complaining" stage regarding the net changes.  So I need to go back
  and re-review those patchs and their (lengthy) discussion."

* emailed from Andrew Morton <akpm@linux-foundation.org>: (16 patches)
  memcg swap: use mem_cgroup_uncharge_swap fix
  backlight: add driver for DA9052/53 PMIC v1
  C6X: use set_current_blocked() and block_sigmask()
  MAINTAINERS: add entry for sparse checker
  MAINTAINERS: fix REMOTEPROC F: typo
  alpha: use set_current_blocked() and block_sigmask()
  simple_open: automatically convert to simple_open()
  scripts/coccinelle/api/simple_open.cocci: semantic patch for simple_open()
  libfs: add simple_open()
  hugetlbfs: remove unregister_filesystem() when initializing module
  drivers/rtc/rtc-88pm860x.c: fix rtc irq enable callback
  fs/xattr.c:setxattr(): improve handling of allocation failures
  fs/xattr.c:listxattr(): fall back to vmalloc() if kmalloc() failed
  fs/xattr.c: suppress page allocation failure warnings from sys_listxattr()
  sysrq: use SEND_SIG_FORCED instead of force_sig()
  proc: fix mount -t proc -o AAA
2012-04-05 15:30:34 -07:00
Dave Jiang
a2bd1140a2 netdma: adding alignment check for NETDMA ops
This is the fallout from adding memcpy alignment workaround for certain
IOATDMA hardware. NetDMA will only use DMA engine that can handle byte align
ops.

Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2012-04-05 15:27:12 -07:00
Stephen Boyd
234e340582 simple_open: automatically convert to simple_open()
Many users of debugfs copy the implementation of default_open() when
they want to support a custom read/write function op.  This leads to a
proliferation of the default_open() implementation across the entire
tree.

Now that the common implementation has been consolidated into libfs we
can replace all the users of this function with simple_open().

This replacement was done with the following semantic patch:

<smpl>
@ open @
identifier open_f != simple_open;
identifier i, f;
@@
-int open_f(struct inode *i, struct file *f)
-{
(
-if (i->i_private)
-f->private_data = i->i_private;
|
-f->private_data = i->i_private;
)
-return 0;
-}

@ has_open depends on open @
identifier fops;
identifier open.open_f;
@@
struct file_operations fops = {
...
-.open = open_f,
+.open = simple_open,
...
};
</smpl>

[akpm@linux-foundation.org: checkpatch fixes]
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Julia Lawall <Julia.Lawall@lip6.fr>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-04-05 15:25:50 -07:00
RongQing.Li
ce713ee5a1 net: replace continue with break to reduce unnecessary loop in xxx_xmarksources
The conditional which decides to skip inactive filters does not
change with the change of loop index, so it is unnecessary to
check them many times.

Signed-off-by: RongQing.Li <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-05 05:42:18 -04:00