27243 Commits

Author SHA1 Message Date
Cong Wang
7b024082b2 openvswitch: fix the calculation of checksum for vlan header
In vlan_insert_tag(), we insert a 4-byte VLAN header _after_
mac header:

        memmove(skb->data, skb->data + VLAN_HLEN, 2 * ETH_ALEN);
        ...
        veth->h_vlan_proto = htons(ETH_P_8021Q);
        ...
        veth->h_vlan_TCI = htons(vlan_tci);

so after it, we should recompute the checksum to include these 4 bytes.
skb->data still points to the mac header, therefore VLAN header is at
(2 * ETH_ALEN = 12) bytes after it, not (ETH_HLEN = 14) bytes.

This can also be observed via tcpdump:

         0x0000:  ffff ffff ffff 5254 005d 6f6e 8100 000a
         0x0010:  0806 0001 0800 0604 0001 5254 005d 6f6e
         0x0020:  c0a8 026e 0000 0000 0000 c0a8 0282

Similar for __pop_vlan_tci(), the vlan header we remove is the one
overwritten in:

	memmove(skb->data + VLAN_HLEN, skb->data, 2 * ETH_ALEN);

Therefore the VLAN_HLEN = 4 bytes after 2 * ETH_ALEN is the part
we want to sub from checksum.

Cc: David S. Miller <davem@davemloft.net>
Cc: Jesse Gross <jesse@nicira.com>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-02-22 17:02:38 -08:00
Rich Lane
17b682a048 openvswitch: Fix parsing invalid LLC/SNAP ethertypes
Before this patch, if an LLC/SNAP packet with OUI 00:00:00 had an
ethertype less than 1536 the flow key given to userspace in the upcall
would contain the invalid ethertype (for example, 3). If userspace
attempted to insert a kernel flow for this key it would be rejected
by ovs_flow_from_nlattrs.

This patch allows OVS to pass the OFTest pktact.DirectBadLlcPackets.

Signed-off-by: Rich Lane <rlane@bigswitch.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-02-22 17:02:28 -08:00
Rich Lane
a15ff76c95 openvswitch: Call genlmsg_end in queue_userspace_packet
Without genlmsg_end the upcall message ends (according to nlmsg_len)
after the struct ovs_header.

Signed-off-by: Rich Lane <rlane@bigswitch.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-02-22 17:02:11 -08:00
Rich Lane
cb7c5bdffb openvswitch: Fix ovs_vport_cmd_new return value on success
If the pointer does not represent an error then the PTR_ERR
macro may still return a nonzero value.

Signed-off-by: Rich Lane <rlane@bigswitch.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-02-22 17:01:57 -08:00
Rich Lane
734907e82d openvswitch: Fix ovs_vport_cmd_del return value on success
If the pointer does not represent an error then the PTR_ERR macro may still
return a nonzero value. The fix is the same as in ovs_vport_cmd_set.

Signed-off-by: Rich Lane <rlane@bigswitch.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
2013-02-22 17:01:49 -08:00
Trond Myklebust
a9a6b52ee1 SUNRPC: Don't start the retransmission timer when out of socket space
If the socket is full, we're better off just waiting until it empties,
or until the connection is broken. The reason why we generally don't
want to time out is that the call to xprt->ops->release_xprt() will
trigger a connection reset, which isn't helpful...

Let's make an exception for soft RPC calls, since they have to provide
timeout guarantees.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
2013-02-22 15:17:17 -05:00
Yuchung Cheng
1b63edd6ec tcp: fix SYN-data space mis-accounting
In fast open the sender unncessarily reduces the space available
for data in SYN by 12 bytes.  This is because in the sender
incorrectly reserves space for TS option twice in tcp_send_syn_data():
tcp_mtu_to_mss() already accounts for TS option space. But it further
reserves MAX_TCP_OPTION_SPACE when computing the payload space.

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-22 15:10:19 -05:00
stephen hemminger
cbda4eaffa sock: only define socket limit if mem cgroup configured
The mem cgroup socket limit is only used if the config option is
enabled. Found with sparse

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-22 15:10:19 -05:00
Alexander Duyck
2bb60cb9b7 net: Fix locking bug in netif_set_xps_queue
Smatch found a locking bug in netif_set_xps_queue in which we were not
releasing the lock in the case of an allocation failure.

This change corrects that so that we release the xps_map_mutex before
returning -ENOMEM in the case of an allocation failure.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-22 15:10:19 -05:00
Li Wei
5b0520425e ipv4: fix error handling in icmp_protocol.
Now we handle icmp errors in each transport protocol's err_handler,
for icmp protocols, that is ping_err. Since this handler only care
of those icmp errors triggered by echo request, errors triggered
by echo reply(which sent by kernel) are sliently ignored.

So wrap ping_err() with icmp_err() to deal with those icmp errors.

Signed-off-by: Li Wei <lw@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-22 15:10:18 -05:00
Linus Torvalds
81ec44a6c6 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 update from Martin Schwidefsky:
 "The most prominent change in this patch set is the software dirty bit
  patch for s390.  It removes __HAVE_ARCH_PAGE_TEST_AND_CLEAR_DIRTY and
  the page_test_and_clear_dirty primitive which makes the common memory
  management code a bit less obscure.

  Heiko fixed most of the PCI related fallout, more often than not
  missing GENERIC_HARDIRQS dependencies.  Notable is one of the 3270
  patches which adds an export to tty_io to be able to resize a tty.

  The rest is the usual bunch of cleanups and bug fixes."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (42 commits)
  s390/module: Add missing R_390_NONE relocation type
  drivers/gpio: add missing GENERIC_HARDIRQ dependency
  drivers/input: add couple of missing GENERIC_HARDIRQS dependencies
  s390/cleanup: rename SPP to LPP
  s390/mm: implement software dirty bits
  s390/mm: Fix crst upgrade of mmap with MAP_FIXED
  s390/linker skript: discard exit.data at runtime
  drivers/media: add missing GENERIC_HARDIRQS dependency
  s390/bpf,jit: add vlan tag support
  drivers/net,AT91RM9200: add missing GENERIC_HARDIRQS dependency
  iucv: fix kernel panic at reboot
  s390/Kconfig: sort list of arch selected config options
  phylib: remove !S390 dependeny from Kconfig
  uio: remove !S390 dependency from Kconfig
  dasd: fix sysfs cleanup in dasd_generic_remove
  s390/pci: fix hotplug module init
  s390/pci: cleanup clp page allocation
  s390/pci: cleanup clp inline assembly
  s390/perf: cpum_cf: fallback to software sampling events
  s390/mm: provide PAGE_SHARED define
  ...
2013-02-21 17:54:03 -08:00
Linus Torvalds
9afa3195b9 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial tree from Jiri Kosina:
 "Assorted tiny fixes queued in trivial tree"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (22 commits)
  DocBook: update EXPORT_SYMBOL entry to point at export.h
  Documentation: update top level 00-INDEX file with new additions
  ARM: at91/ide: remove unsused at91-ide Kconfig entry
  percpu_counter.h: comment code for better readability
  x86, efi: fix comment typo in head_32.S
  IB: cxgb3: delay freeing mem untill entirely done with it
  net: mvneta: remove unneeded version.h include
  time: x86: report_lost_ticks doesn't exist any more
  pcmcia: avoid static analysis complaint about use-after-free
  fs/jfs: Fix typo in comment : 'how may' -> 'how many'
  of: add missing documentation for of_platform_populate()
  btrfs: remove unnecessary cur_trans set before goto loop in join_transaction
  sound: soc: Fix typo in sound/codecs
  treewide: Fix typo in various drivers
  btrfs: fix comment typos
  Update ibmvscsi module name in Kconfig.
  powerpc: fix typo (utilties -> utilities)
  of: fix spelling mistake in comment
  h8300: Fix home page URL in h8300/README
  xtensa: Fix home page URL in Kconfig
  ...
2013-02-21 17:40:58 -08:00
Linus Torvalds
7c2db36e73 Merge branch 'akpm' (incoming from Andrew)
Merge misc patches from Andrew Morton:

 - Florian has vanished so I appear to have become fbdev maintainer
   again :(

 - Joel and Mark are distracted to welcome to the new OCFS2 maintainer

 - The backlight queue

 - Small core kernel changes

 - lib/ updates

 - The rtc queue

 - Various random bits

* akpm: (164 commits)
  rtc: rtc-davinci: use devm_*() functions
  rtc: rtc-max8997: use devm_request_threaded_irq()
  rtc: rtc-max8907: use devm_request_threaded_irq()
  rtc: rtc-da9052: use devm_request_threaded_irq()
  rtc: rtc-wm831x: use devm_request_threaded_irq()
  rtc: rtc-tps80031: use devm_request_threaded_irq()
  rtc: rtc-lp8788: use devm_request_threaded_irq()
  rtc: rtc-coh901331: use devm_clk_get()
  rtc: rtc-vt8500: use devm_*() functions
  rtc: rtc-tps6586x: use devm_request_threaded_irq()
  rtc: rtc-imxdi: use devm_clk_get()
  rtc: rtc-cmos: use dev_warn()/dev_dbg() instead of printk()/pr_debug()
  rtc: rtc-pcf8583: use dev_warn() instead of printk()
  rtc: rtc-sun4v: use pr_warn() instead of printk()
  rtc: rtc-vr41xx: use dev_info() instead of printk()
  rtc: rtc-rs5c313: use pr_err() instead of printk()
  rtc: rtc-at91rm9200: use dev_dbg()/dev_err() instead of printk()/pr_debug()
  rtc: rtc-rs5c372: use dev_dbg()/dev_warn() instead of printk()/pr_debug()
  rtc: rtc-ds2404: use dev_err() instead of printk()
  rtc: rtc-efi: use dev_err()/dev_warn()/pr_err() instead of printk()
  ...
2013-02-21 17:38:49 -08:00
Christian Kujau
242260fb85 sun.com documentation fixes
After I came across a help text for SUNGEM mentioning a broken sun.com
URL, I felt like fixing those up, as they are now pointing to oracle.com
URLs.

Signed-off-by: Christian Kujau <lists@nerdbynature.de>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-21 17:22:20 -08:00
Eric Dumazet
08dcdbf6a7 ipv6: use a stronger hash for tcp
It looks like its possible to open thousands of TCP IPv6
sessions on a server, all landing in a single slot of TCP hash
table. Incoming packets have to lookup sockets in a very
long list.

We should hash all bits from foreign IPv6 addresses, using
a salt and hash mix, not a simple XOR.

inet6_ehashfn() can also separately use the ports, instead
of xoring them.

Reported-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-21 18:15:58 -05:00
Linus Torvalds
21eaab6d19 tty/serial patches for 3.9-rc1
Here's the big tty/serial driver patches for 3.9-rc1.
 
 More tty port rework and fixes from Jiri here, as well as lots of
 individual serial driver updates and fixes.
 
 All of these have been in the linux-next tree for a while.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iEYEABECAAYFAlEmZYQACgkQMUfUDdst+ylJDgCg0B0nMevUUdM4hLvxunbbiyXM
 HUEAoIOedqriNNPvX4Bwy0hjeOEaWx0g
 =vi6x
 -----END PGP SIGNATURE-----

Merge tag 'tty-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty/serial patches from Greg Kroah-Hartman:
 "Here's the big tty/serial driver patches for 3.9-rc1.

  More tty port rework and fixes from Jiri here, as well as lots of
  individual serial driver updates and fixes.

  All of these have been in the linux-next tree for a while."

* tag 'tty-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (140 commits)
  tty: mxser: improve error handling in mxser_probe() and mxser_module_init()
  serial: imx: fix uninitialized variable warning
  serial: tegra: assume CONFIG_OF
  TTY: do not update atime/mtime on read/write
  lguest: select CONFIG_TTY to build properly.
  ARM defconfigs: add missing inclusions of linux/platform_device.h
  fb/exynos: include platform_device.h
  ARM: sa1100/assabet: include platform_device.h directly
  serial: imx: Fix recursive locking bug
  pps: Fix build breakage from decoupling pps from tty
  tty: Remove ancient hardpps()
  pps: Additional cleanups in uart_handle_dcd_change
  pps: Move timestamp read into PPS code proper
  pps: Don't crash the machine when exiting will do
  pps: Fix a use-after free bug when unregistering a source.
  pps: Use pps_lookup_dev to reduce ldisc coupling
  pps: Add pps_lookup_dev() function
  tty: serial: uartlite: Support uartlite on big and little endian systems
  tty: serial: uartlite: Fix sparse and checkpatch warnings
  serial/arc-uart: Miscll DT related updates (Grant's review comments)
  ...

Fix up trivial conflicts, mostly just due to the TTY config option
clashing with the EXPERIMENTAL removal.
2013-02-21 13:41:04 -08:00
Li Wei
b531ed61a2 ipv4: fix a bug in ping_err().
We should get 'type' and 'code' from the outer ICMP header.

Signed-off-by: Li Wei <lw@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-21 15:25:00 -05:00
Linus Torvalds
06991c28f3 Driver core patches for 3.9-rc1
Here is the big driver core merge for 3.9-rc1
 
 There are two major series here, both of which touch lots of drivers all
 over the kernel, and will cause you some merge conflicts:
   - add a new function called devm_ioremap_resource() to properly be
     able to check return values.
   - remove CONFIG_EXPERIMENTAL
 
 If you need me to provide a merged tree to handle these resolutions,
 please let me know.
 
 Other than those patches, there's not much here, some minor fixes and
 updates.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iEYEABECAAYFAlEmV0cACgkQMUfUDdst+yncCQCfbmnQZju7kzWXk6PjdFuKspT9
 weAAoMCzcAtEzzc4LXuUxxG/sXBVBCjW
 =yWAQ
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core patches from Greg Kroah-Hartman:
 "Here is the big driver core merge for 3.9-rc1

  There are two major series here, both of which touch lots of drivers
  all over the kernel, and will cause you some merge conflicts:

   - add a new function called devm_ioremap_resource() to properly be
     able to check return values.

   - remove CONFIG_EXPERIMENTAL

  Other than those patches, there's not much here, some minor fixes and
  updates"

Fix up trivial conflicts

* tag 'driver-core-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (221 commits)
  base: memory: fix soft/hard_offline_page permissions
  drivercore: Fix ordering between deferred_probe and exiting initcalls
  backlight: fix class_find_device() arguments
  TTY: mark tty_get_device call with the proper const values
  driver-core: constify data for class_find_device()
  firmware: Ignore abort check when no user-helper is used
  firmware: Reduce ifdef CONFIG_FW_LOADER_USER_HELPER
  firmware: Make user-mode helper optional
  firmware: Refactoring for splitting user-mode helper code
  Driver core: treat unregistered bus_types as having no devices
  watchdog: Convert to devm_ioremap_resource()
  thermal: Convert to devm_ioremap_resource()
  spi: Convert to devm_ioremap_resource()
  power: Convert to devm_ioremap_resource()
  mtd: Convert to devm_ioremap_resource()
  mmc: Convert to devm_ioremap_resource()
  mfd: Convert to devm_ioremap_resource()
  media: Convert to devm_ioremap_resource()
  iommu: Convert to devm_ioremap_resource()
  drm: Convert to devm_ioremap_resource()
  ...
2013-02-21 12:05:51 -08:00
Shani Michaeli
7083e42ee2 IB/core: Add "type 2" memory windows support
This patch enhances the IB core support for Memory Windows (MWs).

MWs allow an application to have better/flexible control over remote
access to memory.

Two types of MWs are supported, with the second type having two flavors:

    Type 1  - associated with PD only
    Type 2A - associated with QPN only
    Type 2B - associated with PD and QPN

Applications can allocate a MW once, and then repeatedly bind the MW
to different ranges in MRs that are associated to the same PD. Type 1
windows are bound through a verb, while type 2 windows are bound by
posting a work request.

The 32-bit memory key is composed of a 24-bit index and an 8-bit
key. The key is changed with each bind, thus allowing more control
over the peer's use of the memory key.

The changes introduced are the following:

* add memory window type enum and a corresponding parameter to ib_alloc_mw.
* type 2 memory window bind work request support.
* create a struct that contains the common part of the bind verb struct
  ibv_mw_bind and the bind work request into a single struct.
* add the ib_inc_rkey helper function to advance the tag part of an rkey.

Consumer interface details:

* new device capability flags IB_DEVICE_MEM_WINDOW_TYPE_2A and
  IB_DEVICE_MEM_WINDOW_TYPE_2B are added to indicate device support
  for these features.

  Devices can set either IB_DEVICE_MEM_WINDOW_TYPE_2A or
  IB_DEVICE_MEM_WINDOW_TYPE_2B if it supports type 2A or type 2B
  memory windows. It can set neither to indicate it doesn't support
  type 2 windows at all.

* modify existing provides and consumers code to the new param of
  ib_alloc_mw and the ib_mw_bind_info structure

Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Shani Michaeli <shanim@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-02-21 11:51:45 -08:00
Linus Torvalds
2171ee8f43 NFS client bugfixes for Linux 3.9
- Fix an Oops in the pNFS layoutget code
 - Fix a number of NFSv4 and v4.1 state recovery deadlocks and hangs
   due to the interaction of the session drain lock and state management
   locks.
 - Remove task->tk_xprt, which was hiding a lot of RCU dereferencing bugs
 - Fix a long standing NFSv3 posix lock recovery bug.
 - Revert commit 324d003b0cd82151adbaecefef57b73f7959a469. It turned out
   that the root cause of the deadlock was due to interactions with the
   workqueues that have now been resolved.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.13 (GNU/Linux)
 
 iQIcBAABAgAGBQJRJUfVAAoJEGcL54qWCgDySIoP/2QC2qNKqfcoIwOqOAR0pu5m
 wHybR67WXt/R6EK17l273apHfnJQb/IFFKRBfAIAW0vy4bJLwHMtutIeUweFWht2
 hmluYqVWAa9lPwaY1YMolo8Mfizaj0cY3QHv2ogF4poY0SLO6L8O/xo8n8nrn/pk
 QJW9tcvz43A2tbW92t833sQMFQSSjs9L2tvxM/BEahLrSEE4dl6WMWscmHQ/IXt/
 Hs3ADc4//dDF36k5lu7tU0qaVNDacjMXCV6Bw0HUCWCwjWzqmPxSlTDKAcer2vpm
 UawKDFTF4X1QYwbtGlNju0xhsSGRg7ZNRVEeRxqSROT280nzpp7C3CjM9sECtM2l
 SR9RqoYfBsEAyWZtsALApuSBZZZHOvUfVJEL1Cm0ZSeEbWSOO/DT07KRVgfb97h4
 hmE7pn5htgXdiGN2YCphMjrqjaoP4aKmULenj1h1jHbDOgrlqp2QLWqg/5NxMsTw
 M3vmYqQWkW60JN6e1pWJbsoyUkSMTVycRNrZ6WgShtd1lrz3tg66rDXYWdIhU6pT
 6qAKrkXnM5JkLoMoImNQod00Vr1ZXpVc8DLoFtug8rGohCdFJhWkYHefqVJ9rFhv
 2S2NpKDsFiA0q3aK0/YAkUnykYDAnLn7BM5SvhJjq8mKru4bP70PrN9LO8aqhE4E
 fFgj2MGG5QwrL0/GMuls
 =dthb
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-3.9-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:

 - Fix an Oops in the pNFS layoutget code

 - Fix a number of NFSv4 and v4.1 state recovery deadlocks and hangs due
   to the interaction of the session drain lock and state management
   locks.

 - Remove task->tk_xprt, which was hiding a lot of RCU dereferencing
   bugs

 - Fix a long standing NFSv3 posix lock recovery bug.

 - Revert commit 324d003b0cd8 ("NFS: add nfs_sb_deactive_async to avoid
   deadlock").  It turned out that the root cause of the deadlock was
   due to interactions with the workqueues that have now been resolved.

* tag 'nfs-for-3.9-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (22 commits)
  NLM: Ensure that we resend all pending blocking locks after a reclaim
  umount oops when remove blocklayoutdriver first
  sunrpc: silence build warning in gss_fill_context
  nfs: remove kfree() redundant null checks
  NFSv4.1: Don't decode skipped layoutgets
  NFSv4.1: Fix bulk recall and destroy of layouts
  NFSv4.1: Fix an ABBA locking issue with session and state serialisation
  NFSv4: Fix a reboot recovery race when opening a file
  NFSv4: Ensure delegation recall and byte range lock removal don't conflict
  NFSv4: Fix up the return values of nfs4_open_delegation_recall
  NFSv4.1: Don't lose locks when a server reboots during delegation return
  NFSv4.1: Prevent deadlocks between state recovery and file locking
  NFSv4: Allow the state manager to mark an open_owner as being recovered
  SUNRPC: Add missing static declaration to _gss_mech_get_by_name
  Revert "NFS: add nfs_sb_deactive_async to avoid deadlock"
  SUNRPC: Nuke the tk_xprt macro
  SUNRPC: Avoid RCU dereferences in the transport bind and connect code
  SUNRPC: Fix an RCU dereference in xprt_reserve
  SUNRPC: Pass pointers to struct rpc_xprt to the congestion window
  SUNRPC: Fix an RCU dereference in xs_local_rpcbind
  ...
2013-02-21 09:23:01 -08:00
Jozsef Kadlecsik
dd82088dab netfilter: ipset: "Directory not empty" error message
When an entry flagged with "nomatch" was tested by ipset, it
returned the error message "Kernel error received:
Directory not empty" instead of "<element> is NOT in set <setname>"
(reported by John Brendler).

The internal error code was not properly transformed before returning
to userspace, fixed.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2013-02-21 17:35:43 +01:00
Nicolas Dichtel
85dfb745ee af_key: initialize satype in key_notify_policy_flush()
This field was left uninitialized. Some user daemons perform check against this
field.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-02-21 07:14:38 +01:00
Linus Torvalds
a0b1c42951 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking update from David Miller:

 1) Checkpoint/restarted TCP sockets now can properly propagate the TCP
    timestamp offset.  From Andrey Vagin.

 2) VMWARE VM VSOCK layer, from Andy King.

 3) Much improved support for virtual functions and SR-IOV in bnx2x,
    from Ariel ELior.

 4) All protocols on ipv4 and ipv6 are now network namespace aware, and
    all the compatability checks for initial-namespace-only protocols is
    removed.  Thanks to Tom Parkin for helping deal with the last major
    holdout, L2TP.

 5) IPV6 support in netpoll and network namespace support in pktgen,
    from Cong Wang.

 6) Multiple Registration Protocol (MRP) and Multiple VLAN Registration
    Protocol (MVRP) support, from David Ward.

 7) Compute packet lengths more accurately in the packet scheduler, from
    Eric Dumazet.

 8) Use per-task page fragment allocator in skb_append_datato_frags(),
    also from Eric Dumazet.

 9) Add support for connection tracking labels in netfilter, from
    Florian Westphal.

10) Fix default multicast group joining on ipv6, and add anti-spoofing
    checks to 6to4 and 6rd.  From Hannes Frederic Sowa.

11) Make ipv4/ipv6 fragmentation memory limits more reasonable in modern
    times, rearrange inet frag datastructures for better cacheline
    locality, and move more operations outside of locking.  From Jesper
    Dangaard Brouer.

12) Instead of strict master <--> slave relationships, allow arbitrary
    scenerios with "upper device lists".  From Jiri Pirko.

13) Improve rate limiting accuracy in TBF and act_police, also from Jiri
    Pirko.

14) Add a BPF filter netfilter match target, from Willem de Bruijn.

15) Orphan and delete a bunch of pre-historic networking drivers from
    Paul Gortmaker.

16) Add TSO support for GRE tunnels, from Pravin B SHelar.  Although
    this still needs some minor bug fixing before it's %100 correct in
    all cases.

17) Handle unresolved IPSEC states like ARP, with a resolution packet
    queue.  From Steffen Klassert.

18) Remove TCP Appropriate Byte Count support (ABC), from Stephen
    Hemminger.  This was long overdue.

19) Support SO_REUSEPORT, from Tom Herbert.

20) Allow locking a socket BPF filter, so that it cannot change after a
    process drops capabilities.

21) Add VLAN filtering to bridge, from Vlad Yasevich.

22) Bring ipv6 on-par with ipv4 and do not cache neighbour entries in
    the ipv6 routes, from YOSHIFUJI Hideaki.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1538 commits)
  ipv6: fix race condition regarding dst->expires and dst->from.
  net: fix a wrong assignment in skb_split()
  ip_gre: remove an extra dst_release()
  ppp: set qdisc_tx_busylock to avoid LOCKDEP splat
  atl1c: restore buffer state
  net: fix a build failure when !CONFIG_PROC_FS
  net: ipv4: fix waring -Wunused-variable
  net: proc: fix build failed when procfs is not configured
  Revert "xen: netback: remove redundant xenvif_put"
  net: move procfs code to net/core/net-procfs.c
  qmi_wwan, cdc-ether: add ADU960S
  bonding: set sysfs device_type to 'bond'
  bonding: fix bond_release_all inconsistencies
  b44: use netdev_alloc_skb_ip_align()
  xen: netback: remove redundant xenvif_put
  net: fec: Do a sanity check on the gpio number
  ip_gre: propogate target device GSO capability to the tunnel device
  ip_gre: allow CSUM capable devices to handle packets
  bonding: Fix initialize after use for 3ad machine state spinlock
  bonding: Fix race condition between bond_enslave() and bond_3ad_update_lacp_rate()
  ...
2013-02-20 18:58:50 -08:00
YOSHIFUJI Hideaki / 吉藤英明
ecd9883724 ipv6: fix race condition regarding dst->expires and dst->from.
Eric Dumazet wrote:
| Some strange crashes happen in rt6_check_expired(), with access
| to random addresses.
|
| At first glance, it looks like the RTF_EXPIRES and
| stuff added in commit 1716a96101c49186b
| (ipv6: fix problem with expired dst cache)
| are racy : same dst could be manipulated at the same time
| on different cpus.
|
| At some point, our stack believes rt->dst.from contains a dst pointer,
| while its really a jiffie value (as rt->dst.expires shares the same area
| of memory)
|
| rt6_update_expires() should be fixed, or am I missing something ?
|
| CC Neil because of https://bugzilla.redhat.com/show_bug.cgi?id=892060

Because we do not have any locks for dst_entry, we cannot change
essential structure in the entry; e.g., we cannot change reference
to other entity.

To fix this issue, split 'from' and 'expires' field in dst_entry
out of union.  Once it is 'from' is assigned in the constructor,
keep the reference until the very last stage of the life time of
the object.

Of course, it is unsafe to change 'from', so make rt6_set_from simple
just for fresh entries.

Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Reported-by: Neil Horman <nhorman@tuxdriver.com>
CC: Gao Feng <gaofeng@cn.fujitsu.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reported-by: Steinar H. Gunderson <sesse@google.com>
Reviewed-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-20 15:11:45 -05:00
Amerigo Wang
68534c682e net: fix a wrong assignment in skb_split()
commit c9af6db4c11ccc6c3e7f1 (net: Fix possible wrong checksum generation)
has a suspicous piece:

	-       skb_shinfo(skb1)->gso_type = skb_shinfo(skb)->gso_type;
	-
	+       skb_shinfo(skb)->tx_flags = skb_shinfo(skb1)->tx_flags & SKBTX_SHARED_FRAG;

skb1 is the new skb, therefore should be on the left side of the assignment.
This patch fixes it.

Cc: Pravin B Shelar <pshelar@nicira.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-20 15:11:44 -05:00
Linus Torvalds
1eaec8212e Merge branch 'for-3.9-cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Pull workqueue [delayed_]work_pending() cleanups from Tejun Heo:
 "This is part of on-going cleanups to remove / minimize usages of
  workqueue interfaces which are deprecated and/or misleading.

  This round drops a number of usages of [delayed_]work_pending(), which
  are dangerous as they lack any form of synchronization and thus often
  lead to buggy / unnecessary code.  There are a couple legitimate use
  cases in kernel.  Hopefully, they can be converted and
  [delayed_]work_pending() can be removed completely.  Even if not,
  removing most of misuses should make it more difficult to find
  examples of misuses and thus slow down growth of them.

  These changes are independent from other workqueue changes."

* 'for-3.9-cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  wimax/i2400m: fix i2400m->wake_tx_skb handling
  kprobes: fix wait_for_kprobe_optimizer()
  ipw2x00: simplify scan_event handling
  video/exynos: don't use [delayed_]work_pending()
  tty/max3100: don't use [delayed_]work_pending()
  x86/mce: don't use [delayed_]work_pending()
  rfkill: don't use [delayed_]work_pending()
  wl1251: don't use [delayed_]work_pending()
  thinkpad_acpi: don't use [delayed_]work_pending()
  mwifiex: don't use [delayed_]work_pending()
  sja1000: don't use [delayed_]work_pending()
2013-02-19 21:58:52 -08:00
David S. Miller
c6b5380797 Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next
John W. Linville says:

====================
One last batch of stragglers intended for 3.9...

For the iwlwifi pull, Johannes says:

"I hadn't expected to ask you to pull iwlwifi-next again, but I have a
number of fixes most of which I'd also send in after rc1, so here it is.

The first commit is a merge error between mac80211-next and
iwlwifi-next; in addition I have fixes for P2P scanning and MVM driver
MAC (virtual interface) management from Ilan, a CT-kill (critical
temperature) fix from Eytan, and myself fixed three different little but
annoying bugs in the MVM driver.

The only ones I might not send for -rc1 are Emmanuel's debug patch, but
OTOH it should help greatly if there are any issues, and my own time
event debugging patch that I used to find the race condition but we
decided to keep it for the future."

For the mac80211 pull, Johannes says:

"Like iwlwifi-next, this would almost be suitable for rc1.

I have a fix for station management on non-TDLS drivers, a CAB queue
crash fix for mesh, a fix for an annoying (but harmless) warning, a
tracing fix and a documentation fix. Other than that, only a few mesh
cleanups."

Along with that is a fix for memory corruption in rtlwifi, an
orinoco_usb fix to avoid allocating a DMA buffer on the stack, an a
hostap fix to return -ENOMEM instead of -1 after a memory allocation
failure.  The remaining bits implement 802.11ac support for the mwifiex
driver -- I think that is still worth getting into 3.9.

Please let me know if there are problems!
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-19 22:24:50 -05:00
Eric Dumazet
4aa896c4ba ip_gre: remove an extra dst_release()
commit 68c331631143 (v4 GRE: Add TCP segmentation offload for GRE)
introduced a bug in error path.

dst is attached to skb, so will be released when skb is freed.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-19 22:24:04 -05:00
Alex Elder
4c7a08c83a Merge branch 'testing' of github.com:ceph/ceph-client into into linux-3.8-ceph 2013-02-19 19:21:08 -06:00
Alex Elder
903bb32e89 libceph: drop return value from page vector copy routines
The return values provided for ceph_copy_to_page_vector() and
ceph_copy_from_page_vector() serve no purpose, so get rid of them.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-02-19 19:14:05 -06:00
Alex Elder
b324814e84 libceph: use void pointers in page vector functions
The functions used for working with ceph page vectors are defined
with char pointers, but they're really intended to operate on
untyped data.  Change the types of these function parameters
to (void *) to reflect this.

(Note that the functions now assume void pointer arithmetic works
like arithmetic on char pointers.)

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-02-19 19:14:04 -06:00
Alex Elder
fbfab53966 libceph: allow STAT osd operations
Add support for CEPH_OSD_OP_STAT operations in the osd client
and in rbd.

This operation sends no data to the osd; everything required is
encoded in identity of the target object.

The result will be ENOENT if the object doesn't exist.  If it does
exist and no other error occurs the server returns the size and last
modification time of the target object as output data (in little
endian format).  The size is a 64 bit unsigned and the time is
ceph_timespec structure (two unsigned 32-bit integers, representing
a seconds and nanoseconds value).

This resolves:
    http://tracker.ceph.com/issues/4007

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-02-19 19:14:03 -06:00
Alex Elder
f44246e394 libceph: simplify data length calculation
Simplify the way the data length recorded in a message header is
calculated in ceph_osdc_build_request().

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-02-19 19:14:02 -06:00
John W. Linville
0b7164458f Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem 2013-02-19 14:56:34 -05:00
Cong Wang
cd0615746b net: fix a build failure when !CONFIG_PROC_FS
When !CONFIG_PROC_FS dev_mcast_init() is not defined,
actually we can just merge dev_mcast_init() into
dev_proc_init().

Reported-by: Gao feng <gaofeng@cn.fujitsu.com>
Cc: Gao feng <gaofeng@cn.fujitsu.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-19 13:18:13 -05:00
Gao feng
082c7ca42b net: ipv4: fix waring -Wunused-variable
the vars ip_rt_gc_timeout is used only when
CONFIG_SYSCTL is selected.

move these vars into CONFIG_SYSCTL.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-19 13:18:13 -05:00
Cong Wang
900ff8c632 net: move procfs code to net/core/net-procfs.c
Similar to net/core/net-sysfs.c, group procfs code to
a single unit.

Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-19 00:51:10 -05:00
Dmitry Kravkov
eb6b9a8cad ip_gre: propogate target device GSO capability to the tunnel device
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-19 00:51:09 -05:00
Dmitry Kravkov
aa0e51cdda ip_gre: allow CSUM capable devices to handle packets
If device is not able to handle checksumming it will
be handled in dev_xmit

Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-19 00:51:09 -05:00
David S. Miller
2ccba5433b Merge branch 'master' of git://1984.lsi.us.es/nf-next
Pablo Neira Ayuso says:

====================
The following patchset contain updates for your net-next tree, they are:

* Fix (for just added) connlabel dependencies, from Florian Westphal.

* Add aliasing support for conntrack, thus users can either use -m state
  or -m conntrack from iptables while using the same kernel module, from
  Jozsef Kadlecsik.

* Some code refactoring for the CT target to merge common code in
  revision 0 and 1, from myself.

* Add aliasing support for CT, based on patch from Jozsef Kadlecsik.

* Add one mutex per nfnetlink subsystem, from myself.

* Improved logging for packets that are dropped by helpers, from myself.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-18 23:42:09 -05:00
David S. Miller
6338a53a2b Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net into net
Pull in 'net' to take in the bug fixes that didn't make it into
3.8-final.

Also, deal with the semantic conflict of the change made to
net/ipv6/xfrm6_policy.c   A missing rt6->n neighbour release
was added to 'net', but in 'net-next' we no longer cache the
neighbour entries in the ipv6 routes so that change is not
appropriate there.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-18 23:34:21 -05:00
Pablo Neira Ayuso
b20ab9cc63 netfilter: nf_ct_helper: better logging for dropped packets
Connection tracking helpers have to drop packets under exceptional
situations. Currently, the user gets the following logging message
in case that happens:

	nf_ct_%s: dropping packet ...

However, depending on the helper, there are different reasons why a
packet can be dropped.

This patch modifies the existing code to provide more specific
error message in the scope of each helper to help users to debug
the reason why the packet has been dropped, ie:

	nf_ct_%s: dropping packet: reason ...

Thanks to Joe Perches for many formatting suggestions.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-02-19 02:48:05 +01:00
Eric Dumazet
8064b3cf75 ipv6: fix a sparse warning
net/ipv6/reassembly.c:82:72: warning: incorrect type in argument 3 (different base types)
net/ipv6/reassembly.c:82:72:    expected unsigned int [unsigned] [usertype] c
net/ipv6/reassembly.c:82:72:    got restricted __be32 [usertype] id

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-18 15:28:00 -05:00
John W. Linville
a9908ebf5c Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next 2013-02-18 15:27:42 -05:00
David S. Miller
40d1ae57a0 Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next
John W. Linville says:

====================
This probably is the last big pull request for wireless bits
for 3.9.  Of course, I'm sure there will be a few stragglers here
and there...surely a few bug fixes as well... :-) (In fact, I see
that Johannes has already queued-up a few more for me while I was
preparing this...)

Included are a number of pulls...

For mac80211-next, Johannes says:

"The biggest change I have is undoubtedly Marco's mesh powersave
implementation. Beyond that, I have a patch from Emmanuel to modify the
DTIM period API in mac80211, scan improvements and a removal of some
previous workaround code from Stanislaw, dynamic short slot time from
Thomas and 64-bit station byte counters from Vladimir. I also made a
number of changes myself, some related to WoWLAN, some auth/deauth
improvements and most of them BSS list cleanups."

"This time, I have relatively large number of fixes in various areas of
the code (a memory leak in regulatory, an RX race in mac80211, the new
radar checking caused a P2P device problem, some mesh issues with
stations, an older bug in tracing and for kernel-doc) as well as a
number of small new features. The biggest (in the diffstat) is my work
on hidden SSID tracking."

"Please pull to get
 * radar detection work from Simon
 * mesh improvements from Thomas
 * a connection monitoring/powersave fix from Wojciech
 * TDLS-related station management work from Jouni
 * VLAN crypto fixes from Michael Braun
 * CCK support in minstrel_ht from Felix
 * an SMPS (not SMSP, oops) related improvement in mac80211 (Emmanuel)
 * some WoWLAN work from Amitkumar Karwar: pattern match offset and a
   documentation fix
 * some WoWLAN work from myself (TCP connection wakeup feature API)
 * and a lot of VHT (and some HT) work (also from myself)

And a number of more random cleanups/fixes. I merged mac80211/master to
avoid a merge problem there."

And regarding iwlwifi-next, Johannes says:

"We continue work on our new driver, but I also have a WoWLAN and AP mode
improvement for the previous driver and a change to use threaded
interrupts to prepare us for working with non-PCIe devices."

Regarding wl12xx, Luca says:

"A few more patches intended for 3.9.  Mostly some clean-ups I've been
doing to make it easier to support device-tree.  Also including one bug
fix for wl12xx where the rates we advertise were wrong and an update in
the wlconf structure to support newer firmwares."

For the nfc-next bits, Samuel says:

"This is the second NFC pull request for 3.9.

We have:

- A few pn533 fixes on top of Waldemar refactorization of the driver, one of
  them fixes target mode.

- A new driver for Inside Secure microread chipset. It supports two
  physical layers: i2c and MEI. The MEI one depends on a patchset that's
  been sent to Greg Kroah-Hartman for inclusion into the 3.9 kernel [1]. The
  dependency is a KConfig one which means this code is not buildable as long
  as the MEI API is not usptream."

"This 3rd NFC pull request for 3.9 contains a fix for the microread MEI
physical layer support, as the MEI bus API changed.

From the MEI code, we now pass the MEI id back to the driver probe routine,
and we also pass a name and a MEI id table through the mei_bus_driver
structure. A few renames as well like e.g. mei_bus_driver to mei_driver or
mei_bus_client to mei_device in order to be closer to the driver model
practices."

For the ath6kl bits, Kalle says:

"There's not anything special here, most of the patches are just code
cleanup. The only functional changes are using the beacon interval from user
space and fixing a crash which happens when inserting and removing the
module in a loop."

Also, I pulled the wireless tree in order to resolve some pending
merge issues.  On top of that, there is a bunch of work on brcmfmac
that leads up to P2P support.  Also, mwifiex, rtlwifi, and a variety
of other drivers see some basic cleanups and minor enhancements.

Please let me know if there are problems!
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-18 15:12:07 -05:00
Andy King
6cf1c5fc26 VSOCK: Don't reject PF_VSOCK protocol
Allow our own family as the protocol value for socket creation.

Reported-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Andy King <acking@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-18 15:02:51 -05:00
Dmitry Torokhov
7ccd7de691 VSOCK: get rid of vsock_version.h
There isn't really a need to have a separate file for it.

Acked-by: Andy King <acking@vmware.com>
Signed-off-by: Dmitry Torokhov <dtor@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-18 15:02:51 -05:00
Dmitry Torokhov
7777ac3860 VSOCK: get rid of EXPORT_SYMTAB
This is the default behavior for a looooooong time.

Acked-by: Andy King <acking@vmware.com>
Signed-off-by: Dmitry Torokhov <dtor@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-18 15:02:51 -05:00
Romain KUNTZ
18cf0d0784 xfrm: release neighbor upon dst destruction
Neighbor is cloned in xfrm6_fill_dst but seems to never be released.
Neighbor entry should be released when XFRM6 dst entry is destroyed
in xfrm6_dst_destroy, otherwise references may be kept forever on
the device pointed by the neighbor entry.

I may not have understood all the subtleties of XFRM & dst so I would
be happy to receive comments on this patch.

Signed-off-by: Romain Kuntz <r.kuntz@ipflavors.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-18 14:57:29 -05:00
Gao feng
ece31ffd53 net: proc: change proc_net_remove to remove_proc_entry
proc_net_remove is only used to remove proc entries
that under /proc/net,it's not a general function for
removing proc entries of netns. if we want to remove
some proc entries which under /proc/net/stat/, we still
need to call remove_proc_entry.

this patch use remove_proc_entry to replace proc_net_remove.
we can remove proc_net_remove after this patch.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-18 14:53:08 -05:00