Commit Graph

299513 Commits

Author SHA1 Message Date
majianpeng
7f59388108 net/ipv4:Remove two memleak reports by kmemleak_not_leak.
Signed-off-by: majianpeng <majianpeng@gmail.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-18 00:20:28 -04:00
Eric Dumazet
85bba52de1 net: filter: remove unused cpu_off in sparc JIT
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-17 22:55:59 -04:00
françois romieu
0c20494050 dmfe: enforce consistent timing delay.
The driver does not always use the same timing for what looks like
the same operations.

- DCR0
  Use the same udelay everywhere for reset. Upper bound is 100 us.
- DCR9
  Use 5us delay for srom clock. 1us delay for phy_write_1bit (writes
  PHY_DATA_[01]) are not changed as they stay withing a 2,5MHz MDIO
  clock range.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Reviewed-by: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-17 22:54:13 -04:00
David S. Miller
584c5e2ad3 net: filter: Fix some more small issues in sparc JIT.
Fix mixed space and tabs.

Put bpf_jit_load_*[] externs into bpf_jit.h

"while(0)" --> "while (0)"
"COND (X)" --> "COND(X)"
Document branch offset calculations, and bpf_error's return
sequence.

Document the reason we need to emit three nops between the
%y register write and the divide instruction.

Remove erroneous trailing semicolons from emit_read_y() and
emit_write_y().

Based upon feedback from Sam Ravnborg.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-17 16:43:46 -04:00
David S. Miller
7b56f76edf net: filter: Fix some minor issues in sparc JIT.
Correct conventions comments.  %o4 and %o5 were swapped,
%g3 was not documented.

Use r_TMP instead of r_SKB_DATA + r_OFF where possible in
assembler stubs.

Correct discussion of %o4 and %o5 in one of bpf_jit_compile()'s
comments.

Based upon feedback from Richard Mortimer.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-17 16:08:29 -04:00
Hayes Wang
b3d7b2f2f0 r8169: support the new RTL8411 chip.
Compared with previous chipsets, it needs no special action trough the
jumbo{enable/disable} helpers to operate with jumbo frames.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Acked-by: Francois Romieu <romieu@fr.zoreil.com>
2012-04-17 11:22:41 +02:00
Hayes Wang
5f886e0890 r8169: adjust some functions of 8111f
Put some settings of 8111f into one function which may be reused.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
2012-04-17 11:22:41 +02:00
Hayes Wang
7e18dca162 r8169: support the new RTL8402 chip.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
2012-04-17 11:22:41 +02:00
Hayes Wang
beb1fe184f r8169: add device specific CSI access helpers.
New chipsets need it.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
2012-04-17 11:22:40 +02:00
Hayes Wang
0004299ad4 r8169: modify pll power function
Adjust r810x_pll_power_down, r810x_pll_power_up, and r8168_pll_power_up.
Always power up device during rtl_open. For r810x, turn off more power
when the WOL is disabled.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
2012-04-17 11:22:40 +02:00
Francois Romieu
d387b427c9 r8169: 8168c and later require bit 0x20 to be set in Config2 for PME signaling.
The new 84xx stopped flying below the radars.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Hayes Wang <hayeswang@realtek.com>
2012-04-17 11:22:22 +02:00
Francois Romieu
851e602219 r8169: Config1 is read-only on 8168c and later.
Suggested by Hayes.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Hayes Wang <hayeswang@realtek.com>
2012-04-17 11:21:19 +02:00
Randy Dunlap
ecffe75f93 hippi: fix printk format in rrunner.c
Fix printk format warning (from i386 build):

drivers/net/hippi/rrunner.c:146:9: warning: format '%08llx' expects type 'long long unsigned int', but argument 3 has type 'resource_size_t'

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Jes Sorensen <jes@trained-monkey.org>
Cc:	linux-hippi@sunsite.dk
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-16 23:48:38 -04:00
David S. Miller
7335baed84 Merge branch 'master' of git://gitorious.org/linux-can/linux-can-next 2012-04-16 23:47:37 -04:00
David S. Miller
2809a2087c net: filter: Just In Time compiler for sparc
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-16 23:15:14 -04:00
Axel Lin
fb7944b369 net/can: use module_pci_driver
This patch converts the drivers in drivers/net/can/* to use
module_pci_driver() macro which makes the code smaller and a bit simpler.

Signed-off-by: Axel Lin <axel.lin@gmail.com>
Cc: Wolfgang Grandegger <wg@grandegger.com>
Cc: Marc Kleine-Budde <mkl@pengutronix.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: linux-can@vger.kernel.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2012-04-16 21:08:18 +02:00
Daniel Baluta
a75afd4770 can: fix sparse warning for cgw_list
Make cgw_list static to remove the following sparse warning:
net/can/gw.c:69:1: warning: symbol 'cgw_list' was not declared.
Should it be static?

Signed-off-by: Daniel Baluta <dbaluta@ixiacom.com>
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2012-04-16 21:08:18 +02:00
Neal Cardwell
f4f9f6e75d tcp: restore formatting of macros for tcp_skb_cb sacked field
Commit b82d1bb4 inadvertendly placed unrelated new code between
TCPCB_EVER_RETRANS and TCPCB_RETRANS and the other macros that refer
to the sacked field in the struct tcp_skb_cb (probably because there
was a misleading empty line there). This commit fixes up the
formatting so that all macros related to the sacked field are adjacent
again.

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-16 14:38:16 -04:00
Tony Zelenoff
65cff87201 atl1: remove unused member from atl1_adapter structure
Signed-off-by: Tony Zelenoff <antonz@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-16 14:21:21 -04:00
Paul Gortmaker
43c880dff3 drivers/net: fix unresolved 64bit math in mellanox/mlx4/en_dcb_nl.c
Commit 109d244605

    "net/mlx4_en: Set max rate-limit for a TC"

introduced 64 bit math operations into mlx4_en_dcbnl_ieee_setmaxrate()

causing the following final link failure on an x86_32 allmodconfig

  ERROR: "__udivdi3" [drivers/net/ethernet/mellanox/mlx4/mlx4_en.ko] undefined!

Convert it to use div_u64() instead.

Cc: Amir Vadai <amirv@mellanox.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-16 02:12:11 -04:00
David S. Miller
56845d78ce Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/atheros/atlx/atl1.c
	drivers/net/ethernet/atheros/atlx/atl1.h

Resolved a conflict between a DMA error bug fix and NAPI
support changes in the atl1 driver.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-15 13:19:04 -04:00
John Fastabend
df8ef8f3aa macvlan: add FDB bridge ops and macvlan flags
This adds FDB bridge ops to the macvlan device passthru mode.
Additionally a flags field was added and a NOPROMISC bit to
allow users to use passthru mode without the driver calling
dev_set_promiscuity(). The flags field is a u16 placed in a
4 byte hole (consuming 2 bytes) of the macvlan_dev struct.

We want to do this so that the macvlan driver or stack
above the macvlan driver does not have to process every
packet. For the use case where we know all the MAC addresses
of the endstations above us this works well.

This patch is a result of Roopa Prabhu's work. Follow up
patches are needed for VEPA and VEB macvlan modes.

v2: Change from distinct nopromisc mode to a flags field to
    configure this. This avoids the tendency to add a new
    mode every time we need some slightly different behavior.
v3: fix error in dev_set_promiscuity and add change and get
    link attributes for flags.

CC: Roopa Prabhu <roprabhu@cisco.com>
CC: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-15 13:06:05 -04:00
Greg Rose
2b2027124f ixgbe: UTA table incorrectly programmed
The UTA table was being set to the functional equivalent of promiscuous
mode.  This was resulting in traffic from the virtual function being
flooded onto the wire and the PF device. This resulted in additional
overhead for VF traffic sent to the network and in the case of traffic
sent to the PF or another VF resulted in unwanted packets on the wire.

This was actually not the intended behavior. Now that we can program
the embedded switch correctly we can remove this snippit of code. Users
who want to support this should configure the FDB correctly using the
FDB ops.

Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-15 13:06:05 -04:00
John Fastabend
9dcb373c55 ixgbe: allow RAR table to be updated in promisc mode
This allows RAR table updates while in promiscuous. With
SR-IOV enabled it is valuable to allow the RAR table to
be updated even when in promisc mode to configure forwarding

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-15 13:06:05 -04:00
John Fastabend
0f4b0add85 ixgbe: enable FDB netdevice ops
Enable FDB ops on ixgbe when in SR-IOV mode.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-15 13:06:04 -04:00
John Fastabend
3ff661c38c net: rtnetlink notify events for FDB NTF_SELF adds and deletes
It is useful to be able to monitor for FDB events in user space.
This patch adds support to generate netlink events when a change
is made to a device supporting the FDB ops.

This brings embedded switches inline with the SW net/bridge which
triggers events on FDB updates as well.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-15 13:06:04 -04:00
John Fastabend
d83b060360 net: add fdb generic dump routine
This adds a generic dump routine drivers can call. It
should be sufficient to handle any bridging model that
uses the unicast address list. This should be most SR-IOV
enabled NICs.

v2: return error on nlmsg_put and use -EMSGSIZE instead
    of -ENOMEM this is inline other usages

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-15 13:06:04 -04:00
John Fastabend
12a9463445 net: addr_list: add exclusive dev_uc_add and dev_mc_add
This adds a dev_uc_add_excl() and dev_mc_add_excl() calls
similar to the original dev_{uc|mc}_add() except it sets
the global bit and returns -EEXIST for duplicat entires.

This is useful for drivers that support SR-IOV, macvlan
devices and any other devices that need to manage the
unicast and multicast lists.

v2: fix typo UNICAST should be MULTICAST in dev_mc_add_excl()

CC: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-15 13:06:04 -04:00
John Fastabend
77162022ab net: add generic PF_BRIDGE:RTM_ FDB hooks
This adds two new flags NTF_MASTER and NTF_SELF that can
now be used to specify where PF_BRIDGE netlink commands should
be sent. NTF_MASTER sends the commands to the 'dev->master'
device for parsing. Typically this will be the linux net/bridge,
or open-vswitch devices. Also without any flags set the command
will be handled by the master device as well so that current user
space tools continue to work as expected.

The NTF_SELF flag will push the PF_BRIDGE commands to the
device. In the basic example below the commands are then parsed
and programmed in the embedded bridge.

Note if both NTF_SELF and NTF_MASTER bits are set then the
command will be sent to both 'dev->master' and 'dev' this allows
user space to easily keep the embedded bridge and software bridge
in sync.

There is a slight complication in the case with both flags set
when an error occurs. To resolve this the rtnl handler clears
the NTF_ flag in the netlink ack to indicate which sets completed
successfully. The add/del handlers will abort as soon as any
error occurs.

To support this new net device ops were added to call into
the device and the existing bridging code was refactored
to use these. There should be no required changes in user space
to support the current bridge behavior.

A basic setup with a SR-IOV enabled NIC looks like this,

          veth0  veth2
            |      |
          ------------
          |  bridge0 |   <---- software bridging
          ------------
               /
               /
  ethx.y      ethx
    VF         PF
     \         \          <---- propagate FDB entries to HW
     \         \
  --------------------
  |  Embedded Bridge |    <---- hardware offloaded switching
  --------------------

In this case the embedded bridge must be managed to allow 'veth0'
to communicate with 'ethx.y' correctly. At present drivers managing
the embedded bridge either send frames onto the network which
then get dropped by the switch OR the embedded bridge will flood
these frames. With this patch we have a mechanism to manage the
embedded bridge correctly from user space. This example is specific
to SR-IOV but replacing the VF with another PF or dropping this
into the DSA framework generates similar management issues.

Examples session using the 'br'[1] tool to add, dump and then
delete a mac address with a new "embedded" option and enabled
ixgbe driver:

# br fdb add 22:35:19:ac:60:59 dev eth3
# br fdb
port    mac addr                flags
veth0   22:35:19:ac:60:58       static
veth0   9a:5f:81:f7:f6:ec       local
eth3    00:1b:21:55:23:59       local
eth3    22:35:19:ac:60:59       static
veth0   22:35:19:ac:60:57       static
#br fdb add 22:35:19:ac:60:59 embedded dev eth3
#br fdb
port    mac addr                flags
veth0   22:35:19:ac:60:58       static
veth0   9a:5f:81:f7:f6:ec       local
eth3    00:1b:21:55:23:59       local
eth3    22:35:19:ac:60:59       static
veth0   22:35:19:ac:60:57       static
eth3    22:35:19:ac:60:59       local embedded
#br fdb del 22:35:19:ac:60:59 embedded dev eth3

I added a couple lines to 'br' to set the flags correctly is all. It
is my opinion that the merit of this patch is now embedded and SW
bridges can both be modeled correctly in user space using very nearly
the same message passing.

[1] 'br' tool was published as an RFC here and will be renamed 'bridge'
    http://patchwork.ozlabs.org/patch/117664/

Thanks to Jamal Hadi Salim, Stephen Hemminger and Ben Hutchings for
valuable feedback, suggestions, and review.

v2: fixed api descriptions and error case with both NTF_SELF and
    NTF_MASTER set plus updated patch description.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-15 13:06:04 -04:00
Tony Zelenoff
136cd14e1e atl1: do not drop rx/tx interrupts before they are scheduled
To prevent interrupts lost they should be dropped only if
they are scheduled via napi interfaces. In other case, there is
exists situation when napi handler process TX interrupt, stay in
RX processing and in that moment any other interrupt received.
Then before this patch TX bit in ISR will be cleaned, napi
schedule will not occur in case of currently processing event and
TX interrupt definitely will be lost.

Signed-off-by: Tony Zelenoff <antonz@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-15 13:00:12 -04:00
Tony Zelenoff
2a9bc71e9a atl1: do not process interrupts in cycle in handler
As the rx/tx handled inside napi handler, the cycle is
not needed now, because only the rx/tx need such kind of
processing.

Signed-off-by: Tony Zelenoff <antonz@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-15 13:00:12 -04:00
Tony Zelenoff
73650f28ae atl1: enable errors and link ints when rx/tx scheduled
Signed-off-by: Tony Zelenoff <antonz@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-15 12:56:02 -04:00
Tony Zelenoff
aa45ba90b5 atl1: add value to check ability of reenabling IRQs
Unfortunately it is not clear from code is usage of
IMR register possible or not. So, to prevent possible
side-effects of reading this register i prefer store
interrupts enable flag separately.

Signed-off-by: Tony Zelenoff <antonz@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-15 12:56:02 -04:00
Tony Zelenoff
02d5d11bfa atl1: make function to set imr of card
This function should be used later to set/remove proper
bits in imr to disable only rx ints.

Signed-off-by: Tony Zelenoff <antonz@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-15 12:56:02 -04:00
Tony Zelenoff
5c3d52ef5a atl1: use defined functions to disable irq
Looks like direct writes to IMR register is not good idea,
because there are exist functions to make this work.

Signed-off-by: Tony Zelenoff <antonz@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-15 12:56:01 -04:00
Tony Zelenoff
0dbab2fb1d atl1: add napi process of tx interrupts
Make the tx ints processing same as rx ones via napi.
The idea got from e1000. The interrupt disabling is
still not fine grained.

Signed-off-by: Tony Zelenoff <antonz@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-15 12:55:08 -04:00
Tony Zelenoff
6294512bbe atl1: make driver napi compatible
This is first step, here there is no fine interrupt
disabling which cause TX/ERR interrupts stalling when
RX scheduled ints processed.

Signed-off-by: Tony Zelenoff <antonz@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-15 12:55:07 -04:00
Tony Zelenoff
3e1d83f711 atl1: handle rx in separate condition
Remove rx from unlikely optimization in case of rx is very
likely thing for network card. This also reduce code a bit.

Signed-off-by: Tony Zelenoff <antonz@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-15 12:54:15 -04:00
Herbert Xu
c5c2326059 bridge: Add multicast_querier toggle and disable queries by default
Sending general queries was implemented as an optimisation to speed
up convergence on start-up.  In order to prevent interference with
multicast routers a zero source address has to be used.

Unfortunately these packets appear to cause some multicast-aware
switches to misbehave, e.g., by disrupting multicast packets to us.

Since the multicast snooping feature still functions without sending
our own queries, this patch will change the default to not send
queries.

For those that need queries in order to speed up convergence on start-up,
a toggle is provided to restore the previous behaviour.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-15 12:51:35 -04:00
Herbert Xu
c83b8fab06 bridge: Restart queries when last querier expires
As it stands when we discover that a real querier (one that queries
with a non-zero source address) we stop querying.  However, even
after said querier has fallen off the edge of the earth, we will
never restart querying (unless the bridge itself is restarted).

This patch fixes this by kicking our own querier into gear when
the timer for other queriers expire.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-15 12:51:34 -04:00
Herbert Xu
748572162a bridge: Add br_multicast_start_querier
This patch adds the helper br_multicast_start_querier so that
the code which starts the queriers in br_multicast_toggle can
be reused elsewhere.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-15 12:51:34 -04:00
Eric Dumazet
95c9617472 net: cleanup unsigned to unsigned int
Use of "unsigned int" is preferred to bare "unsigned" in net tree.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-15 12:44:40 -04:00
Daniel Baluta
5e73ea1a31 ipv4: fix checkpatch errors
Fix checkpatch errors of the following type:
	* ERROR: "foo * bar" should be "foo *bar"
	* ERROR: "(foo*)" should be "(foo *)"

Signed-off-by: Daniel Baluta <dbaluta@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-15 12:37:19 -04:00
Jason Wang
586d17c5a0 virtio-net: send gratuitous packets when needed
As hypervior does not have the knowledge of guest network configuration, it's
better to ask guest to send gratuitous packets when needed.

This patch implements VIRTIO_NET_F_GUEST_ANNOUNCE feature: hypervisor would
notice the guest when it thinks it's time for guest to announce the link
presnece. Guest tests VIRTIO_NET_S_ANNOUNCE bit during config change interrupt
and woule send gratuitous packets through netif_notify_peers() and ack the
notification through ctrl vq.

We need to make sure the atomicy of read and ack in guest otherwise we may ack
more times than being notified. This is done through handling the whole config
change interrupt in an non-reentrant workqueue.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-15 03:23:31 -04:00
Peter Hüwe
8831a3f2c9 isdn/hysdn: Convert to kstrtoul_from_user
This patch replaces the code for getting an number from a
userspace buffer by a simple call to kstroul_from_user.
This makes it easier to read and less error prone.

Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-15 03:23:31 -04:00
David S. Miller
cf22f9a2b8 ipv6: Remove unused argument to addrconf_dad_start().
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-14 21:37:40 -04:00
Masanari Iida
fd9071ec61 net: Fix spelling typo in net
Correct spelling typo within drivers/net.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-14 15:29:02 -04:00
Vijay Subramanian
a8cb05b238 tcp: Remove redundant code entering quickack mode
tcp_enter_quickack_mode() already calls tcp_incr_quickack() and sets
icsk->icsk_ack.ato  to TCP_ATO_MIN. This patch removes the duplication.

Signed-off-by: Vijay Subramanian <subramanian.vijay@gmail.com>
Reviewed-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-14 15:29:02 -04:00
Alex Copot
aacd9289af tcp: bind() use stronger condition for bind_conflict
We must try harder to get unique (addr, port) pairs when
doing port autoselection for sockets with SO_REUSEADDR
option set.

We achieve this by adding a relaxation parameter to
inet_csk_bind_conflict. When 'relax' parameter is off
we return a conflict whenever the current searched
pair (addr, port) is not unique.

This tries to address the problems reported in patch:
	8d238b25b1
	Revert "tcp: bind() fix when many ports are bound"

Tests where ran for creating and binding(0) many sockets
on 100 IPs. The results are, on average:

	* 60000 sockets, 600 ports / IP:
		* 0.210 s, 620 (IP, port) duplicates without patch
		* 0.219 s, no duplicates with patch
	* 100000 sockets, 1000 ports / IP:
		* 0.371 s, 1720 duplicates without patch
		* 0.373 s, no duplicates with patch
	* 200000 sockets, 2000 ports / IP:
		* 0.766 s, 6900 duplicates without patch
		* 0.768 s, no duplicates with patch
	* 500000 sockets, 5000 ports / IP:
		* 2.227 s, 41500 duplicates without patch
		* 2.284 s, no duplicates with patch

Signed-off-by: Alex Copot <alex.mihai.c@gmail.com>
Signed-off-by: Daniel Baluta <dbaluta@ixiacom.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-14 15:28:55 -04:00
Eric Dumazet
c72e118334 inet: makes syn_ack_timeout mandatory
There are two struct request_sock_ops providers, tcp and dccp.

inet_csk_reqsk_queue_prune() can avoid testing syn_ack_timeout being
NULL if we make it non NULL like syn_ack_timeout

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Cc: dccp@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-14 15:24:26 -04:00