Commit Graph

1856 Commits

Author SHA1 Message Date
Stephen Hemminger
8b42ec3926 [BRIDGE]: netfilter VLAN macro cleanup
Fix the VLAN macros in bridge netfilter code. Macros should
not depend on magic variables.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:58:05 -08:00
Stephen Hemminger
f8a2602861 [BRIDGE]: netfilter dont use __constant_htons
Only use__constant_htons() for initializers and switch cases.
For other uses, it is just as efficient and clearer to use htons

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:57:46 -08:00
Stephen Hemminger
789bc3e5b6 [BRIDGE]: netfilter whitespace
Run br_netfilter through Lindent to fix whitespace.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:57:32 -08:00
Stephen Hemminger
d5513a7d32 [BRIDGE]: optimize frame pass up
The netfilter hook that is used to receive frames doesn't need to be a
stub.  It is only called in two ways, both of which ignore the return
value.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:57:18 -08:00
Stephen Hemminger
cee4854122 [BRIDGE]: use kzalloc
Use kzalloc versus kmalloc+memset. Also don't need to do
memset() of bridge address since it is in netdev private data
that is already zero'd in alloc_netdev.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:57:03 -08:00
Stephen Hemminger
3b781fa10b [BRIDGE]: use kcalloc
Use kcalloc rather than kmalloc + memset.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:56:50 -08:00
Stephen Hemminger
a95fcacdc3 [BRIDGE]: use setup_timer
Use the now standard setup_timer function.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:56:38 -08:00
Stephen Hemminger
e3efe08e9a [BRIDGE]: remove unneeded bh disables
The STP timers run off softirq (kernel timers), so there is no need to
disable bottom half in the spin locks.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:56:25 -08:00
Andrew Morton
9ebddc1aa3 [BRIDGE] br_netfilter: Warning fixes.
net/bridge/br_netfilter.c: In function `br_nf_pre_routing':
net/bridge/br_netfilter.c:427: warning: unused variable `vhdr'
net/bridge/br_netfilter.c:445: warning: unused variable `vhdr'

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:55:24 -08:00
Andrew Morton
74ca4e5acd [BRIDGE] ebtables: Build fix.
net/bridge/netfilter/ebtables.c:1481: warning: initialization makes pointer from integer without a cast

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:55:02 -08:00
David S. Miller
dbeff12b4d [INET]: Fix typo in Arnaldo's connection sock compat fixups.
"struct inet_csk" --> "struct inet_connection_sock" :-)

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:52:32 -08:00
Arnaldo Carvalho de Melo
8ca0d17bd7 [DCCP] feat: Pass dccp_minisock ptr where only the minisock is used
This is in preparation for having a dccp_minisock embedded into
dccp_request_sock so that feature negotiation can be done prior to
creating the full blown dccp_sock.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:51:53 -08:00
Arnaldo Carvalho de Melo
a4bf390242 [DCCP] minisock: Rename struct dccp_options to struct dccp_minisock
This will later be included in struct dccp_request_sock so that we can
have per connection feature negotiation state while in the 3way
handshake, when we clone the DCCP_ROLE_LISTEN socket (in
dccp_create_openreq_child) we'll just copy this state from
dreq_minisock to dccps_minisock.

Also the feature negotiation and option parsing code will mostly touch
dccps_minisock, which will simplify some stuff.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:50:58 -08:00
Arnaldo Carvalho de Melo
543d9cfeec [NET]: Identation & other cleanups related to compat_[gs]etsockopt cset
No code changes, just tidying up, in some cases moving EXPORT_SYMBOLs
to just after the function exported, etc.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:48:35 -08:00
Arnaldo Carvalho de Melo
f94691acf9 [SK_BUFF]: export skb_pull_rcsum
*** Warning: "skb_pull_rcsum" [net/bridge/bridge.ko] undefined!
*** Warning: "skb_pull_rcsum" [net/8021q/8021q.ko] undefined!
*** Warning: "skb_pull_rcsum" [drivers/net/pppoe.ko] undefined!
*** Warning: "skb_pull_rcsum" [drivers/net/ppp_generic.ko] undefined!

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:47:55 -08:00
Arnaldo Carvalho de Melo
dec73ff029 [ICSK] compat: Introduce inet_csk_compat_[gs]etsockopt
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:46:16 -08:00
Arnaldo Carvalho de Melo
d1d47beef8 [SNAP]: Remove leftover unused hdr variable
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:45:37 -08:00
Dmitry Mishin
3fdadf7d27 [NET]: {get|set}sockopt compatibility layer
This patch extends {get|set}sockopt compatibility layer in order to
move protocol specific parts to their place and avoid huge universal
net/compat.c file in the future.

Signed-off-by: Dmitry Mishin <dim@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:45:21 -08:00
Dave Jones
c750360938 [IPV6]: remove useless test in ip6_append_data
We've already dereferenced 'np' a dozen
times at this point, so it's safe to say it's not null.

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:44:52 -08:00
Adrian Bunk
afb5bb5744 [PKT_SCHED]: Let NET_CLS_ACT no longer depend on EXPERIMENTAL
This option should IMHO no longer depend on EXPERIMENTAL.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
ACKed-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:44:24 -08:00
Herbert Xu
cbb042f9e1 [NET]: Replace skb_pull/skb_postpull_rcsum with skb_pull_rcsum
We're now starting to have quite a number of places that do skb_pull
followed immediately by an skb_postpull_rcsum.  We can merge these two
operations into one function with skb_pull_rcsum.  This makes sense
since most pull operations on receive skb's need to update the
checksum.

I've decided to make this out-of-line since it is fairly big and the
fast path where hardware checksums are enabled need to call
csum_partial anyway.

Since this is a brand new function we get to add an extra check on the
len argument.  As it is most callers of skb_pull ignore its return
value which essentially means that there is no check on the len
argument.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:43:56 -08:00
Steven Whitehouse
ecba320f2e [DECnet]: Use RCU locking in dn_rules.c
As per Robert Olsson's patch for ipv4, this is the DECnet
version to keep the code "in step". It changes the list
of rules to use RCU rather than an rwlock.

Inspired-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: Steven Whitehouse <steve@chygwyn.com>
Signed-off-by: Patrick Caulfield <patrick@tykepenguin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:43:28 -08:00
Patrick Caulfield
c60992db46 [DECnet]: Patch to fix recvmsg() flag check
This patch means that 64bit kernel/32bit userland platforms will now
work correctly with DECnet.

Signed-off-by: Patrick Caulfield <patrick@tykepenguin.com>
Signed-off-by: Steven Whitehouse <steve@chygwyn.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:43:05 -08:00
Steven Whitehouse
c4ea94ab37 [DECnet]: Endian annotation and fixes for DECnet.
The typedef for dn_address has been removed in favour of using __le16
or __u16 directly as appropriate. All the DECnet header files are
updated accordingly.

The byte ordering of dn_eth2dn() and dn_dn2eth() are both changed
since just about all their callers wanted network order rather than
host order, so the conversion is now done in the functions themselves.

Several missed endianess conversions have been picked up during the
conversion process. The nh_gw field in struct dn_fib_info has been
changed from a 32 bit field to 16 bits as it ought to be.

One or two cases of using htons rather than dn_htons in the routing
code have been found and fixed.

There are still a few warnings to fix, but this patch deals with the
important cases.

Signed-off-by: Steven Whitehouse <steve@chygwyn.com>
Signed-off-by: Patrick Caulfield <patrick@tykepenguin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:42:39 -08:00
Catherine Zhang
2c7946a7bf [SECURITY]: TCP/UDP getpeersec
This patch implements an application of the LSM-IPSec networking
controls whereby an application can determine the label of the
security association its TCP or UDP sockets are currently connected to
via getsockopt and the auxiliary data mechanism of recvmsg.

Patch purpose:

This patch enables a security-aware application to retrieve the
security context of an IPSec security association a particular TCP or
UDP socket is using.  The application can then use this security
context to determine the security context for processing on behalf of
the peer at the other end of this connection.  In the case of UDP, the
security context is for each individual packet.  An example
application is the inetd daemon, which could be modified to start
daemons running at security contexts dependent on the remote client.

Patch design approach:

- Design for TCP
The patch enables the SELinux LSM to set the peer security context for
a socket based on the security context of the IPSec security
association.  The application may retrieve this context using
getsockopt.  When called, the kernel determines if the socket is a
connected (TCP_ESTABLISHED) TCP socket and, if so, uses the dst_entry
cache on the socket to retrieve the security associations.  If a
security association has a security context, the context string is
returned, as for UNIX domain sockets.

- Design for UDP
Unlike TCP, UDP is connectionless.  This requires a somewhat different
API to retrieve the peer security context.  With TCP, the peer
security context stays the same throughout the connection, thus it can
be retrieved at any time between when the connection is established
and when it is torn down.  With UDP, each read/write can have
different peer and thus the security context might change every time.
As a result the security context retrieval must be done TOGETHER with
the packet retrieval.

The solution is to build upon the existing Unix domain socket API for
retrieving user credentials.  Linux offers the API for obtaining user
credentials via ancillary messages (i.e., out of band/control messages
that are bundled together with a normal message).

Patch implementation details:

- Implementation for TCP
The security context can be retrieved by applications using getsockopt
with the existing SO_PEERSEC flag.  As an example (ignoring error
checking):

getsockopt(sockfd, SOL_SOCKET, SO_PEERSEC, optbuf, &optlen);
printf("Socket peer context is: %s\n", optbuf);

The SELinux function, selinux_socket_getpeersec, is extended to check
for labeled security associations for connected (TCP_ESTABLISHED ==
sk->sk_state) TCP sockets only.  If so, the socket has a dst_cache of
struct dst_entry values that may refer to security associations.  If
these have security associations with security contexts, the security
context is returned.

getsockopt returns a buffer that contains a security context string or
the buffer is unmodified.

- Implementation for UDP
To retrieve the security context, the application first indicates to
the kernel such desire by setting the IP_PASSSEC option via
getsockopt.  Then the application retrieves the security context using
the auxiliary data mechanism.

An example server application for UDP should look like this:

toggle = 1;
toggle_len = sizeof(toggle);

setsockopt(sockfd, SOL_IP, IP_PASSSEC, &toggle, &toggle_len);
recvmsg(sockfd, &msg_hdr, 0);
if (msg_hdr.msg_controllen > sizeof(struct cmsghdr)) {
    cmsg_hdr = CMSG_FIRSTHDR(&msg_hdr);
    if (cmsg_hdr->cmsg_len <= CMSG_LEN(sizeof(scontext)) &&
        cmsg_hdr->cmsg_level == SOL_IP &&
        cmsg_hdr->cmsg_type == SCM_SECURITY) {
        memcpy(&scontext, CMSG_DATA(cmsg_hdr), sizeof(scontext));
    }
}

ip_setsockopt is enhanced with a new socket option IP_PASSSEC to allow
a server socket to receive security context of the peer.  A new
ancillary message type SCM_SECURITY.

When the packet is received we get the security context from the
sec_path pointer which is contained in the sk_buff, and copy it to the
ancillary message space.  An additional LSM hook,
selinux_socket_getpeersec_udp, is defined to retrieve the security
context from the SELinux space.  The existing function,
selinux_socket_getpeersec does not suit our purpose, because the
security context is copied directly to user space, rather than to
kernel space.

Testing:

We have tested the patch by setting up TCP and UDP connections between
applications on two machines using the IPSec policies that result in
labeled security associations being built.  For TCP, we can then
extract the peer security context using getsockopt on either end.  For
UDP, the receiving end can retrieve the security context using the
auxiliary data mechanism of recvmsg.

Signed-off-by: Catherine Zhang <cxzhang@watson.ibm.com>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:41:23 -08:00
Patrick McHardy
be33690d8f [XFRM]: Fix aevent related crash
When xfrm_user isn't loaded xfrm_nl is NULL, which makes IPsec crash because
xfrm_aevent_is_on passes the NULL pointer to netlink_has_listeners as socket.
A second problem is that the xfrm_nl pointer is not cleared when the socket
is releases at module unload time.

Protect references of xfrm_nl from outside of xfrm_user by RCU, check
that the socket is present in xfrm_aevent_is_on and set it to NULL
when unloading xfrm_user.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:40:54 -08:00
Rick Jones
15d99e02ba [TCP]: sysctl to allow TCP window > 32767 sans wscale
Back in the dark ages, we had to be conservative and only allow 15-bit
window fields if the window scale option was not negotiated.  Some
ancient stacks used a signed 16-bit quantity for the window field of
the TCP header and would get confused.

Those days are long gone, so we can use the full 16-bits by default
now.

There is a sysctl added so that we can still interact with such old
stacks

Signed-off-by: Rick Jones <rick.jones2@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:40:29 -08:00
Neil Horman
abd596a4b6 [IPV4] ARP: Alloc acceptance of unsolicited ARP via netdevice sysctl.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:39:47 -08:00
Per Liden
87546b1c25 [TIPC]: Avoid compiler warning
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:38:33 -08:00
Per Liden
de70c5ba43 [TIPC]: Reduce stack usage
The node_map struct can be quite large (516 bytes) and allocating two of
them on the stack is not a good idea since we might only have a 4K stack
to start with.

Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:38:14 -08:00
Adrian Bunk
988f088a8e [TIPC]: Cleanups
This patch contains the following possible cleanups:
- make needlessly global code static
- #if 0 the following unused global functions:
  - name_table.c: tipc_nametbl_print()
  - name_table.c: tipc_nametbl_dump()
  - net.c: tipc_net_next_node()

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:37:52 -08:00
Per Liden
7c501a5960 [TIPC]: Remove unused functions
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:37:27 -08:00
Sam Ravnborg
05790c6456 [TIPC]: Remove inlines from *.c
With reference to latest discussions on linux-kernel with respect to
inline here is a patch for tipc to remove all inlines as used in
the .c files. See also chapter 14 in Documentation/CodingStyle.

Before:
   text        data     bss     dec     hex filename
 102990        5292    1752  110034   1add2 tipc.o

Now:
   text        data     bss     dec     hex filename
 101190        5292    1752  108234   1a6ca tipc.o

This is a nice text size reduction which will improve icache usage.
In some cases bigger (> 4 lines) functions where declared inline
and used in many places, they are most probarly no longer inlined by gcc
resulting in the size reduction.
There are several one liners that no longer are declared inline, but gcc
should inline these just fine without the inline hint.

With this patch applied one warning is added about an unused static
function - that was hidded by utilising inline before.
The function in question were kept so this patch is solely a
inline removal patch.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:37:04 -08:00
Sam Ravnborg
1fc54d8f49 [TIPC]: Fix simple sparse warnings
Tried to run the new tipc stack through sparse.
Following patch fixes all cases where 0 was used
as replacement of NULL.
Use NULL to document this is a pointer and to silence sparse.

This brough sparse warning count down with 127 to 24 warnings.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:36:47 -08:00
David S. Miller
edb2c34fb2 [NETFILTER]: Fix warnings in ip_nat_snmp_basic.c
net/ipv4/netfilter/ip_nat_snmp_basic.c: In function 'asn1_header_decode':
net/ipv4/netfilter/ip_nat_snmp_basic.c:248: warning: 'len' may be used uninitialized in this function
net/ipv4/netfilter/ip_nat_snmp_basic.c:248: warning: 'def' may be used uninitialized in this function
net/ipv4/netfilter/ip_nat_snmp_basic.c: In function 'snmp_translate':
net/ipv4/netfilter/ip_nat_snmp_basic.c:672: warning: 'l' may be used uninitialized in this function
net/ipv4/netfilter/ip_nat_snmp_basic.c:668: warning: 'type' may be used uninitialized in this function

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:36:21 -08:00
David S. Miller
fb9504964d [DCCP]: Fix uninitialized var warnings in dccp_parse_options().
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:36:01 -08:00
Ingo Molnar
57b47a53ec [NET]: sem2mutex part 2
Semaphore to mutex conversion.

The conversion was generated via scripts, and the result was validated
automatically via a script as well.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:35:41 -08:00
Arjan van de Ven
4a3e2f711a [NET] sem2mutex: net/
Semaphore to mutex conversion.

The conversion was generated via scripts, and the result was validated
automatically via a script as well.

Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:33:17 -08:00
Stephen Hemminger
1533306186 [NET]: dev_put/dev_hold cleanup
Get rid of the old __dev_put macro that is just a hold over from pre 2.6
kernel.  And turn dev_hold into an inline instead of a macro.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:32:28 -08:00
Arnaldo Carvalho de Melo
2d0817d11e [DCCP] options: Make dccp_insert_options & friends yell on error
And not the silly LIMIT_NETDEBUG and silently return without inserting
the option requested.

Also drop some old debugging messages associated to option insertion.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:32:06 -08:00
Arnaldo Carvalho de Melo
110bae4efb [DCCP]: Remove leftover dccp_send_response prototype
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:31:46 -08:00
Arnaldo Carvalho de Melo
c5fed1597e [DCCP]: ditch dccp_v[46]_ctl_send_ack
Merging it with its only user: dccp_v[46]_reqsk_send_ack.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:31:26 -08:00
Arnaldo Carvalho de Melo
118b2c9532 [DCCP]: Use sk->sk_prot->max_header consistently for non-data packets
Using this also provides opportunities for introducing
inet_csk_alloc_skb that would call alloc_skb, account it to the sock
and skb_reserve(max_header), but I'll leave this for later, for now
using sk_prot->max_header consistently is enough.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:31:09 -08:00
Arnaldo Carvalho de Melo
e5a6de915b [DCCP] options: Fix handling of ackvecs in DATA packets
I.e. they should be just ignored, but we have to use 'break', not 'continue',
as we have to possibly reset the mandatory flag.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:30:51 -08:00
David S. Miller
aa837b5bbd [ATM]: Fix build after neigh->parms->neigh_destructor change.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:30:23 -08:00
Benjamin LaHaise
6cb153cab9 [NET]: use fget_light() in net/socket.c
Here's an updated copy of the patch to use fget_light in net/socket.c.
Rerunning the tests show a drop of ~80Mbit/s on average, which looks
bad until you see the drop in cpu usage from ~89% to ~82%.  That will
get fixed in another patch...

Before: max 8113.70, min 8026.32, avg 8072.34
 87380  16384  16384    10.01      8045.55   87.11    87.11    1.774   1.774
 87380  16384  16384    10.01      8065.14   90.86    90.86    1.846   1.846
 87380  16384  16384    10.00      8077.76   89.85    89.85    1.822   1.822
 87380  16384  16384    10.00      8026.32   89.80    89.80    1.833   1.833
 87380  16384  16384    10.01      8108.59   89.81    89.81    1.815   1.815
 87380  16384  16384    10.01      8034.53   89.01    89.01    1.815   1.815
 87380  16384  16384    10.00      8113.70   90.45    90.45    1.827   1.827
 87380  16384  16384    10.00      8111.37   89.90    89.90    1.816   1.816
 87380  16384  16384    10.01      8077.75   87.96    87.96    1.784   1.784
 87380  16384  16384    10.00      8062.70   90.25    90.25    1.834   1.834

After: max 8035.81, min 7963.69, avg 7998.14
 87380  16384  16384    10.01      8000.93   82.11    82.11    1.682   1.682
 87380  16384  16384    10.01      8016.17   83.67    83.67    1.710   1.710
 87380  16384  16384    10.01      7963.69   83.47    83.47    1.717   1.717
 87380  16384  16384    10.01      8014.35   81.71    81.71    1.671   1.671
 87380  16384  16384    10.00      7967.68   83.41    83.41    1.715   1.715
 87380  16384  16384    10.00      7995.22   81.00    81.00    1.660   1.660
 87380  16384  16384    10.00      8002.61   83.90    83.90    1.718   1.718
 87380  16384  16384    10.00      8035.81   81.71    81.71    1.666   1.666
 87380  16384  16384    10.01      8005.36   82.56    82.56    1.690   1.690
 87380  16384  16384    10.00      7979.61   82.50    82.50    1.694   1.694

Signed-off-by: Benjamin LaHaise <bcrl@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:27:12 -08:00
Stephen Hemminger
8aca8a27d9 [NET]: minor net_rx_action optimization
The functions list_del followed by list_add_tail is equivalent to the
existing inline list_move_tail. list_move_tail avoids unnecessary
_LIST_POISON.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:26:39 -08:00
Michael S. Tsirkin
c5ecd62c25 [NET]: Move destructor from neigh->ops to neigh_params
struct neigh_ops currently has a destructor field, which no in-kernel
drivers outside of infiniband use.  The infiniband/ulp/ipoib in-tree
driver stashes some info in the neighbour structure (the results of
the second-stage lookup from ARP results to real link-level path), and
it uses neigh->ops->destructor to get a callback so it can clean up
this extra info when a neighbour is freed.  We've run into problems
with this: since the destructor is in an ops field that is shared
between neighbours that may belong to different net devices, there's
no way to set/clear it safely.

The following patch moves this field to neigh_parms where it can be
safely set, together with its twin neigh_setup.  Two additional
patches in the patch series update ipoib to use this new interface.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:25:41 -08:00
Luiz Capitulino
53dcb0e38c [PKTGEN]: Updates version.
Due to the thread's lock changes, we're at a new version now.

Signed-off-by: Luiz Capitulino <lcapitulino@mandriva.com.br>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:25:05 -08:00
Luiz Capitulino
6146e6a43b [PKTGEN]: Removes thread_{un,}lock() macros.
As suggested by Arnaldo, this patch replaces the
thread_lock()/thread_unlock() by directly calls to
mutex_lock()/mutex_unlock().

This change makes the code a bit more readable, and the direct calls
are used everywhere in the kernel.

Signed-off-by: Luiz Capitulino <lcapitulino@mandriva.com.br>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 22:24:45 -08:00