Commit Graph

288589 Commits

Author SHA1 Message Date
Linus Torvalds
5928a2b60c Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU changes for v3.4 from Ingo Molnar.  The major features of this
series are:

 - making RCU more aggressive about entering dyntick-idle mode in order
   to improve energy efficiency

 - converting a few more call_rcu()s to kfree_rcu()s

 - applying a number of rcutree fixes and cleanups to rcutiny

 - removing CONFIG_SMP #ifdefs from treercu

 - allowing RCU CPU stall times to be set via sysfs

 - adding CPU-stall capability to rcutorture

 - adding more RCU-abuse diagnostics

 - updating documentation

 - fixing yet more issues located by the still-ongoing top-to-bottom
   inspection of RCU, this time with a special focus on the CPU-hotplug
   code path.

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (48 commits)
  rcu: Stop spurious warnings from synchronize_sched_expedited
  rcu: Hold off RCU_FAST_NO_HZ after timer posted
  rcu: Eliminate softirq-mediated RCU_FAST_NO_HZ idle-entry loop
  rcu: Add RCU_NONIDLE() for idle-loop RCU read-side critical sections
  rcu: Allow nesting of rcu_idle_enter() and rcu_idle_exit()
  rcu: Remove redundant check for rcu_head misalignment
  PTR_ERR should be called before its argument is cleared.
  rcu: Convert WARN_ON_ONCE() in rcu_lock_acquire() to lockdep
  rcu: Trace only after NULL-pointer check
  rcu: Call out dangers of expedited RCU primitives
  rcu: Rework detection of use of RCU by offline CPUs
  lockdep: Add CPU-idle/offline warning to lockdep-RCU splat
  rcu: No interrupt disabling for rcu_prepare_for_idle()
  rcu: Move synchronize_sched_expedited() to rcutree.c
  rcu: Check for illegal use of RCU from offlined CPUs
  rcu: Update stall-warning documentation
  rcu: Add CPU-stall capability to rcutorture
  rcu: Make documentation give more realistic rcutorture duration
  rcutorture: Permit holding off CPU-hotplug operations during boot
  rcu: Print scheduling-clock information on RCU CPU stall-warning messages
  ...
2012-03-20 10:10:18 -07:00
Linus Torvalds
5ed59af850 Merge branch 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core/locking changes for v3.4 from Ingo Molnar

* 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  futex: Simplify return logic
  futex: Cover all PI opcodes with cmpxchg enabled check
2012-03-19 17:11:15 -07:00
Linus Torvalds
b7f077d7bc Merge branch 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core/iommu changes for v3.4 from Ingo Molnar

* 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/iommu/intel: Increase the number of iommus supported to MAX_IO_APICS
  x86/iommu/intel: Fix identity mapping for sandy bridge
2012-03-19 17:10:38 -07:00
Linus Torvalds
b0e37d7ac6 Merge branch 'dcache-word-accesses'
* branch 'dcache-word-accesses':
  vfs: use 'unsigned long' accesses for dcache name comparison and hashing

This does the name hashing and lookup using word-sized accesses when
that is efficient, namely on x86 (although any little-endian machine
with good unaligned accesses would do).

It does very much depend on little-endian logic, but it's a very hot
couple of functions under some real loads, and this patch improves the
performance of __d_lookup_rcu() and link_path_walk() by up to about 30%.
Giving a 10% improvement on some very pathname-heavy benchmarks.

Because we do make unaligned accesses past the filename, the
optimization is disabled when CONFIG_DEBUG_PAGEALLOC is active, and we
effectively depend on the fact that on x86 we don't really ever have the
last page of usable RAM followed immediately by any IO memory (due to
ACPI tables, BIOS buffer areas etc).

Some of the bit operations we do are a bit "subtle".  It's commented,
but you do need to really think about the code.  Or just consider it
black magic.

Thanks to people on G+ for some of the optimized bit tricks.
2012-03-19 16:37:28 -07:00
Linus Torvalds
6d7d1a0dc7 vfs: get rid of batshit-insane pointless dentry hash calculations
For some odd historical reason, the final mixing round for the dentry
cache hash table lookup had an insane "xor with big constant" logic.  In
two places.

The big constant that is being xor'ed is GOLDEN_RATIO_PRIME, which is a
fairly random-looking number that is designed to be *multiplied* with so
that the bits get spread out over a whole long-word.

But xor'ing with it is insane.  It doesn't really even change the hash -
it really only shifts the hash around in the hash table.  To make
matters worse, the insane big constant is different on 32-bit and 64-bit
builds, even though the name hash bits we use are always 32-bit (and the
bits from the pointer we mix in effectively are too).

It's all total voodoo programming, in other words.

Now, some testing and analysis of the hash chains shows that the rest of
the hash function seems to be fairly good.  It does pick the right bits
of the parent dentry pointer, for example, and while it's generally a
bad idea to use an xor to mix down the upper bits (because if there is a
repeating pattern, the xor can cause "destructive interference"), it
seems to not have been a disaster.

For example, replacing the hash with the normal "hash_long()" code (that
uses the GOLDEN_RATIO_PRIME constant correctly, btw) actually just makes
the hash worse.  The hand-picked hash knew which bits of the pointer had
the highest entropy, and hash_long() ends up mixing bits less optimally
at least in some trivial tests.

So the hash function overall seems fine, it just has that really odd
"shift result around by a constant xor".

So get rid of the silly xor, and replace the down-mixing of the bits
with an add instead of an xor that tends to not have the same kind of
destructive interference issues.  Some stats on the resulting hash
chains shows that they look statistically identical before and after,
but the code is simpler and no longer makes you go "WTF?".

Also, the incoming hash really is just "unsigned int", not a long, and
there's no real point to worry about the high 26 bits of the dentry
pointer for the 64-bit case, because they are all going to be identical
anyway.

So also change the hashing to be done in the more natural 'unsigned int'
that is the real size of the actual hashed data anyway.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-19 16:19:53 -07:00
Linus Torvalds
c16fa4f2ad Linux 3.3 2012-03-18 16:15:34 -07:00
Jason Baron
93dc6107a7 Don't limit non-nested epoll paths
Commit 28d82dc1c4 ("epoll: limit paths") that I did to limit the
number of possible wakeup paths in epoll is causing a few applications
to longer work (dovecot for one).

The original patch is really about limiting the amount of epoll nesting
(since epoll fds can be attached to other fds). Thus, we probably can
allow an unlimited number of paths of depth 1. My current patch limits
it at 1000. And enforce the limits on paths that have a greater depth.

This is captured in: https://bugzilla.redhat.com/show_bug.cgi?id=681578

Signed-off-by: Jason Baron <jbaron@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-18 12:25:04 -07:00
Linus Torvalds
c579bc7e31 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking changes from David Miller:
 "1) icmp6_dst_alloc() returns NULL instead of ERR_PTR() leading to
     crashes, particularly during shutdown.  Reported by Dave Jones and
     fixed by Eric Dumazet.

  2) hyperv and wimax/i2400m return NETDEV_TX_BUSY when they have
     already freed the SKB, which causes crashes as to the caller this
     means requeue the packet.  Fixes from Eric Dumazet.

  3) usbnet driver doesn't allocate the right amount of headroom on
     fresh RX SKBs, fix from Eric Dumazet.

  4) Fix regression in ip6_mc_find_dev_rcu(), as an RCU lookup it
     abolutely should not take a reference to 'dev', this leads to
     leaks.  Fix from RonQing Li.

  5) Fix netfilter ctnetlink race between delete and timeout expiration.
     From Pablo Neira Ayuso.

  6) Revert SFQ change which causes regressions, specifically queueing
     to tail can lead to unavoidable flow starvation.  From Eric
     Dumazet.

  7) Fix a memory leak and a crash on corrupt firmware files in bnx2x,
     from Michal Schmidt."

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  netfilter: ctnetlink: fix race between delete and timeout expiration
  ipv6: Don't dev_hold(dev) in ip6_mc_find_dev_rcu.
  wimax/i2400m: fix erroneous NETDEV_TX_BUSY use
  net/hyperv: fix erroneous NETDEV_TX_BUSY use
  net/usbnet: reserve headroom on rx skbs
  bnx2x: fix memory leak in bnx2x_init_firmware()
  bnx2x: fix a crash on corrupt firmware file
  sch_sfq: revert dont put new flow at the end of flows
  ipv6: fix icmp6_dst_alloc()
2012-03-17 19:22:24 -07:00
Linus Torvalds
96ee0499c5 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar.

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf tools, x86: Build perf on older user-space as well
  perf tools: Use scnprintf where applicable
  perf tools: Incorrect use of snprintf results in SEGV
2012-03-17 09:54:16 -07:00
Pablo Neira Ayuso
a16a1647fa netfilter: ctnetlink: fix race between delete and timeout expiration
Kerin Millar reported hardlockups while running `conntrackd -c'
in a busy firewall. That system (with several processors) was
acting as backup in a primary-backup setup.

After several tries, I found a race condition between the deletion
operation of ctnetlink and timeout expiration. This patch fixes
this problem.

Tested-by: Kerin Millar <kerframil@gmail.com>
Reported-by: Kerin Millar <kerframil@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-17 01:47:08 -07:00
RongQing.Li
c577923756 ipv6: Don't dev_hold(dev) in ip6_mc_find_dev_rcu.
ip6_mc_find_dev_rcu() is called with rcu_read_lock(), so don't
need to dev_hold().
With dev_hold(), not corresponding dev_put(), will lead to leak.

[ bug introduced in 96b52e61be (ipv6: mcast: RCU conversions) ]

Signed-off-by: RongQing.Li <roy.qing.li@gmail.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-16 21:56:42 -07:00
Linus Torvalds
cb1ecf25a8 Merge branch 'akpm' (more patches from Andrew)
Merge some more email patches from Andrew Morton:
 "A couple of nilfs fixes"

* emailed from Andrew Morton <akpm@linux-foundation.org>:
  nilfs2: fix NULL pointer dereference in nilfs_load_super_block()
  nilfs2: clamp ns_r_segments_percentage to [1, 99]
2012-03-16 17:14:55 -07:00
Ryusuke Konishi
d7178c79d9 nilfs2: fix NULL pointer dereference in nilfs_load_super_block()
According to the report from Slicky Devil, nilfs caused kernel oops at
nilfs_load_super_block function during mount after he shrank the
partition without resizing the filesystem:

 BUG: unable to handle kernel NULL pointer dereference at 00000048
 IP: [<d0d7a08e>] nilfs_load_super_block+0x17e/0x280 [nilfs2]
 *pde = 00000000
 Oops: 0000 [#1] PREEMPT SMP
 ...
 Call Trace:
  [<d0d7a87b>] init_nilfs+0x4b/0x2e0 [nilfs2]
  [<d0d6f707>] nilfs_mount+0x447/0x5b0 [nilfs2]
  [<c0226636>] mount_fs+0x36/0x180
  [<c023d961>] vfs_kern_mount+0x51/0xa0
  [<c023ddae>] do_kern_mount+0x3e/0xe0
  [<c023f189>] do_mount+0x169/0x700
  [<c023fa9b>] sys_mount+0x6b/0xa0
  [<c04abd1f>] sysenter_do_call+0x12/0x28
 Code: 53 18 8b 43 20 89 4b 18 8b 4b 24 89 53 1c 89 43 24 89 4b 20 8b 43
 20 c7 43 2c 00 00 00 00 23 75 e8 8b 50 68 89 53 28 8b 54 b3 20 <8b> 72
 48 8b 7a 4c 8b 55 08 89 b3 84 00 00 00 89 bb 88 00 00 00
 EIP: [<d0d7a08e>] nilfs_load_super_block+0x17e/0x280 [nilfs2] SS:ESP 0068:ca9bbdcc
 CR2: 0000000000000048

This turned out due to a defect in an error path which runs if the
calculated location of the secondary super block was invalid.

This patch fixes it and eliminates the reported oops.

Reported-by: Slicky Devil <slicky.dvl@gmail.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Tested-by: Slicky Devil <slicky.dvl@gmail.com>
Cc: <stable@vger.kernel.org>	[2.6.30+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-16 17:14:44 -07:00
Haogang Chen
3d777a6406 nilfs2: clamp ns_r_segments_percentage to [1, 99]
ns_r_segments_percentage is read from the disk.  Bogus or malicious
value could cause integer overflow and malfunction due to meaningless
disk usage calculation.  This patch reports error when mounting such
bogus volumes.

Signed-off-by: Haogang Chen <haogangchen@gmail.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-16 17:14:44 -07:00
Linus Torvalds
33e9ee8dbd Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull maintainer update from James Morris:
 "Please pull this patch which adds Serge as maintainer of the
  capabilities code, as discussed on lwn and the lsm list.

  New capabilities must be signed off by the maintainer, and new uses of
  any capabilities should at be cc'd to the maintainer."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  MAINTAINERS: Add Serge as maintainer of capabilities
2012-03-16 17:04:02 -07:00
Linus Torvalds
9fc005c017 Remove dead code from entry.S which causes a build failure when using
a newer assembler (v2.22 complains about it, v2.20 ignores it).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJPY0NTAAoJEOiN4VijXeFPpGcQAJ/tk1n7nlgVkmxvVpiqGj3s
 IFrdJJ8SQkj9VTSWQ9FVgxIpJIJ7J57FHoLJtVmz7HbDs6/LgSY/jSX40SzYlrgt
 84Wh8HD8Ycqi26YSr1+yLMfPsUhbp5d1065fl6E0XNSDkd5m9qH2Qmf4RenVLWcG
 P01jDnNAmz/NXHKLVlCZMdZlrUnFYZn5W5K8RidAc19Hv2nsDTa1p7OLpRhy05ST
 ZTXF5fioQoPwcCpsFkjDjANtltw1x4TvCxFG1ubttwBkIOZ0OUr3yTfd3AtShazV
 VC2etsiAcmKasUsIvCzh6EYd2t+i6GG+V7tM+i1ATlznGFE6PbGUHIWEWBVeo2XC
 7nq8S40d9Z9uExuFo3AtYSZhylwbU6v0ah+eGljhozQ8XBDLiCvTRX7ZpFq5AX4C
 F5nigoCrdvpw9mL06e9x4aVl1UgXlk+lkyjNE+EUhXf/0RtVf+2qDC2Is9enwdxR
 NDO6lSi1+pTgmDCgs6WhKaAQhijbAsi48JfGUj6C5ahMTBGTF5SbhjNzYZQU68A/
 2QsuuR0s9hwlXVldOcx1yxBVFuijX6L8CpWJSJPgvWI52f+9ZXNMPbj9Ty7efGKJ
 /xjvNjw48LHqDrzqzpGaZ1M8/JriuOTUOZV3khSpsk9tSSxXQ6LXsPjdoxziU/ed
 56umOZgs6HHe6ND1mwNe
 =dMe3
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://linux-c6x.org/git/projects/linux-c6x-upstreaming

Pull c6x bugfix from Mark Salter:
 "Remove dead code from entry.S which causes a build failure when using
  a newer assembler (v2.22 complains about it, v2.20 ignores it)."

* tag 'for-linus' of git://linux-c6x.org/git/projects/linux-c6x-upstreaming:
  C6X: remove dead code from entry.S
2012-03-16 17:03:15 -07:00
Anton Blanchard
c017386352 afs: Remote abort can cause BUG in rxrpc code
When writing files to afs I sometimes hit a BUG:

kernel BUG at fs/afs/rxrpc.c:179!

With a backtrace of:

	afs_free_call
	afs_make_call
	afs_fs_store_data
	afs_vnode_store_data
	afs_write_back_from_locked_page
	afs_writepages_region
	afs_writepages

The cause is:

	ASSERT(skb_queue_empty(&call->rx_queue));

Looking at a tcpdump of the session the abort happens because we
are exceeding our disk quota:

	rx abort fs reply store-data error diskquota exceeded (32)

So the abort error is valid. We hit the BUG because we haven't
freed all the resources for the call.

By freeing any skbs in call->rx_queue before calling afs_free_call
we avoid hitting leaking memory and avoid hitting the BUG.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-16 17:01:41 -07:00
Anton Blanchard
2c724fb927 afs: Read of file returns EBADMSG
A read of a large file on an afs mount failed:

# cat junk.file > /dev/null
cat: junk.file: Bad message

Looking at the trace, call->offset wrapped since it is only an
unsigned short. In afs_extract_data:

        _enter("{%u},{%zu},%d,,%zu", call->offset, len, last, count);
...

        if (call->offset < count) {
                if (last) {
                        _leave(" = -EBADMSG [%d < %zu]", call->offset, count);
                        return -EBADMSG;
                }

Which matches the trace:

[cat   ] ==> afs_extract_data({65132},{524},1,,65536)
[cat   ] <== afs_extract_data() = -EBADMSG [0 < 65536]

call->offset went from 65132 to 0. Fix this by making call->offset an
unsigned int.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-16 17:01:41 -07:00
Mark Salter
6e780cf5c0 C6X: remove dead code from entry.S
The ENDPROC() on sys_fadvise64_c6x() in arch/c6x/kernel/entry.S is
outside of the conditional block with the matching ENTRY() macro. This
leads a newer (v2.22 vs. v2.20) assembler to complain:

  /tmp/ccGZBaPT.s: Assembler messages:
  /tmp/ccGZBaPT.s: Error: .size expression for sys_fadvise64_c6x does not evaluate to a constant

The conditional block became dead code when c6x switched to generic
unistd.h and should be removed along with the offending ENDPROC().

Signed-off-by: Mark Salter <msalter@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
2012-03-16 09:27:57 -04:00
Eric Dumazet
b8fbaef586 wimax/i2400m: fix erroneous NETDEV_TX_BUSY use
A driver start_xmit() method cannot free skb and return NETDEV_TX_BUSY,
since caller is going to reuse freed skb.

In fact netif_tx_stop_queue() / netif_stop_queue() is needed before
returning NETDEV_TX_BUSY or you can trigger a ksoftirqd fatal loop.

In case of memory allocation error, only safe way is to drop the packet
and return NETDEV_TX_OK

Also increments tx_dropped counter

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Inaky Perez-Gonzalez <inaky.perez-gonzalez@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-16 02:01:41 -07:00
Eric Dumazet
bb6d5e76fb net/hyperv: fix erroneous NETDEV_TX_BUSY use
A driver start_xmit() method cannot free skb and return NETDEV_TX_BUSY,
since caller is going to reuse freed skb.

This is mostly a revert of commit bf769375c (staging: hv: fix the return
status of netvsc_start_xmit())

In fact netif_tx_stop_queue() / netif_stop_queue() is needed before
returning NETDEV_TX_BUSY or you can trigger a ksoftirqd fatal loop.

In case of memory allocation error, only safe way is to drop the packet
and return NETDEV_TX_OK

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-16 02:01:17 -07:00
Eric Dumazet
7bdd402706 net/usbnet: reserve headroom on rx skbs
network drivers should reserve some headroom on incoming skbs so that we
dont need expensive reallocations, eg forwarding packets in tunnels.

This NET_SKB_PAD padding is done in various helpers, like
__netdev_alloc_skb_ip_align() in this patch, combining NET_SKB_PAD and
NET_IP_ALIGN magic.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Oliver Neukum <oneukum@suse.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Oliver Neukum <oneukum@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-16 02:01:05 -07:00
Michal Schmidt
c0ea452e42 bnx2x: fix memory leak in bnx2x_init_firmware()
When cycling the interface down and up, bnx2x_init_firmware() knows that
the firmware is already loaded, but nevertheless it allocates certain
arrays anew (init_data, init_ops, init_ops_offsets, iro_arr). The old
arrays are leaked.

Fix the leaks by returning early if the firmware was already loaded.
Because if the firmware is loaded, so are the arrays.

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Acked-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-16 01:57:26 -07:00
Michal Schmidt
127d0a198a bnx2x: fix a crash on corrupt firmware file
If the requested firmware is deemed corrupt and then released, reset the
pointer to NULL in order to avoid double-freeing it in
bnx2x_release_firmware() or dereferencing it in bnx2x_init_firmware().

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Acked-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-16 01:57:26 -07:00
Eric Dumazet
cc34eb672e sch_sfq: revert dont put new flow at the end of flows
This reverts commit d47a0ac7b6 (sch_sfq: dont put new flow at the end of
flows)

As Jesper found out, patch sounded great but has bad side effects.

In stress situation, pushing new flows in front of the queue can prevent
old flows doing any progress. Packets can stay in SFQ queue for
unlimited amount of time.

It's possible to add heuristics to limit this problem, but this would
add complexity outside of SFQ scope.

A more sensible answer to Dave Taht concerns (who reported the issued I
tried to solve in original commit) is probably to use a qdisc hierarchy
so that high prio packets dont enter a potentially crowded SFQ qdisc.

Reported-by: Jesper Dangaard Brouer <jdb@comx.dk>
Cc: Dave Taht <dave.taht@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-16 01:55:25 -07:00
Eric Dumazet
122bdf67f1 ipv6: fix icmp6_dst_alloc()
commit 87a115783 ( ipv6: Move xfrm_lookup() call down into
icmp6_dst_alloc().) forgot to convert one error path, leading
to crashes in mld_sendpack()

Many thanks to Dave Jones for providing a very complete bug report.

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-16 01:53:42 -07:00
James Morris
95d16c7211 MAINTAINERS: Add Serge as maintainer of capabilities
Add Serge as maintainer of capabilities, per suggestion on LWN:
http://lwn.net/Articles/486306/

Signed-off-by: James Morris <james.l.morris@oracle.com>
2012-03-16 12:05:48 +11:00
Linus Torvalds
0c4d0670f6 Merge branch 'akpm' (Andrew's patch-bomb)
Merge patches from Andrew Morton:
 "Nine patches - some bug fixes and some MAINTAINERS fiddling."

* emailed from Andrew Morton <akpm@linux-foundation.org>:
  drivers/video/backlight/s6e63m0.c: fix corruption storing gamma mode
  MAINTAINERS: add entry for exynos mipi display drivers
  MAINTAINERS: fix link to Gustavo Padovans tree
  MAINTAINERS: add Johan to Bluetooth maintainers
  MAINTAINERS: Gustavo has moved
  prctl: use CAP_SYS_RESOURCE for PR_SET_MM option
  rapidio/tsi721: fix bug in register offset definitions
  MAINTAINERS: update ST's Mailing list for SPEAr
  memcg: free mem_cgroup by RCU to fix oops
2012-03-15 17:16:22 -07:00
Linus Torvalds
7c32442ff8 Merge branch 'i2c-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging
Pull i2c subsystem fixes from Jean Delvare.

* 'i2c-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
  i2c-algo-bit: Fix spurious SCL timeouts under heavy load
  i2c-core: Comment says "transmitted" but means "received"
2012-03-15 17:14:35 -07:00
Linus Torvalds
538e7e96fd Five patches since v3.3-rc7:
fecfb64 hwmon: (zl6100) Enable interval between chip accesses for all chips
 c43524b hwmon: (w83627ehf) Describe undocumented pwm attributes
 aacb6b0 hwmon: (w83627ehf) Fix temp2 source for W83627UHG
 32260d9 hwmon: (w83627ehf) Fix memory leak in probe function
 33fa9b6 hwmon: (w83627ehf) Fix writing into fan_stop_time for NCT6775F/NCT6776F
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.10 (GNU/Linux)
 
 iQIcBAABAgAGBQJPYNeWAAoJEMsfJm/On5mBDYwP/RlQHZAQpars+Bw50A+MauP0
 ifYNGAWYo9WE2J6spd9LST0ESycLW74Lg/PJMbq/zfbFS3OgaKfS5nKcHTOmhyyC
 dyDdekrcvQpYDBWnE+Ir3nKeJ/byHbnog7SOs+AEHmE8gc66jc1s/La93b669odA
 /cHN0FGST83b0qu+UEEkC8Xv1Y/rp52vF4GaZZEM7bA1tgWxY7u15d2A2lsj7il1
 aiVCublpSHBhDs9W5SEaZKJOEkETX0CDnx95txIwl+4SFhL6XymqZmymlSIKT8rv
 FkxWckbI/qV4a7NQ3tI+QZw18YGn+kbIIkTfuhjxZiWrZB8Kd3VRbdyNcFEGDLCU
 h1dNAcBcE1eXf2GUe9wPxs11LbFEN5ZmDxX1Z32LfOxwi+9nd7C+MLoabxx1A/Hp
 aav6q8VA5NNKj4iDwbKx6+hjVkUkglzHw4QPGd+Htaq2htiZKYHQAG2qAIMGQHf2
 aa2ATAajx+n7O4U3fcf5hDBx9+Swg8EkuwN787wQdyfB/6X95JiGpkJrgKWFYiyt
 4ba7cOe/PhoGW6qFUEB+uVmiHTwJbI+WSS5RIXX/PZLvcxxqa4AhNd5L55tsghfd
 Ylh71Y5nL/NBt7JW/gkEeyuulD5dSYmfiabytb/KzrC3ek6Uj36mB78/eTJukESg
 PLMy0g9gtuuwmm7tjHxB
 =3+SR
 -----END PGP SIGNATURE-----

Merge tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging

Pull hwmon fixes from Guenter Roeck.

* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (zl6100) Enable interval between chip accesses for all chips
  hwmon: (w83627ehf) Describe undocumented pwm attributes
  hwmon: (w83627ehf) Fix temp2 source for W83627UHG
  hwmon: (w83627ehf) Fix memory leak in probe function
  hwmon: (w83627ehf) Fix writing into fan_stop_time for NCT6775F/NCT6776F
2012-03-15 17:13:39 -07:00
Linus Torvalds
0c48ca8512 Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm exynos/intel updates from Dave Airlie:
 "Two minor updates from Jesse for Intel SNB fixes, and a few fixes from
  Samsung for exynos.  The pull req has Alan's commit in it since Intel
  based their tree on my tree at that time, but it all seems fine wrt
  merging."

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm exynos: use drm_fb_helper_set_par directly
  drm/exynos: Fix fb_videomode <-> drm_mode_modeinfo conversion
  drm/exynos: fix runtime_pm fimd device state on probe
  drm/exynos: use correct 'exynos-drm' name for platform device
  drm/i915: support 32 bit BGR formats in sprite planes
  drm/i915: fix color order for BGR formats on SNB
  drm/gma500: Fix Cedarview boot failures in 3.3-rc
2012-03-15 17:07:25 -07:00
Linus Torvalds
72c79bdbda Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media fixes from Mauro Carvalho Chehab:
 "For 4 fixes for 3.3 (all trivial):
       - uvc video driver: fixes a division by zero;
       - davinci: add module.h to fix compilation;
       - smsusb: fix the delivery system setting;
       - smsdvb: the get_frontend implementation there is broken.

  The smsdvb patch has 127 lines, but it is trivial: instead of
  returning a cache of the set_frontend (with is wrong, as it doesn't
  have the updated values for the data, and the implementation there is
  buggy), it copies the information of the detected DVB parameters from
  the smsdvb private structures into the corresponding DVBv5 struct
  fields."

* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  [media] smsdvb: fix get_frontend
  [media] smsusb: fix the default delivery system setting
  [media] media: davinci: added module.h to resolve unresolved macros
  [media] [FOR,v3.3] uvcvideo: Avoid division by 0 in timestamp calculation
2012-03-15 17:06:05 -07:00
Linus Torvalds
fe83558a33 Merge branch '3.3-urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull target fixes from Nicholas Bellinger:
 "This series addresses two recently reported regression bugs related to
  legacy SCSI reservation usage in target core, and iscsi-target
  reservation conflict handling.

  The second patch in particular addresses possible data-corruption with
  SCSI reservations that is specific to iscsi-target fabric LUNs with
  multiple client writers.  Both patches need to go into v3.2 stable
  ASAP, and the branch based on the last target-pending/3.3-rc-fixes
  HEAD.

  Again, thanks to Martin Svec for his help to identify and address this
  regression bug with iscsi-target."

* '3.3-urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
  iscsi-target: Fix reservation conflict -EBUSY response handling bug
  target: Fix compatible reservation handling (CRH=1) with legacy RESERVE/RELEASE
2012-03-15 17:04:56 -07:00
Dan Carpenter
cf2b94daab drivers/video/backlight/s6e63m0.c: fix corruption storing gamma mode
strict_strtoul() writes a long but ->gamma_mode only has space to store an
int, so on 64 bit systems we end up scribbling over ->gamma_table_count as
well.  I've changed it to use kstrtouint() instead.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-15 17:03:04 -07:00
Donghwa Lee
5eb1eb4ea1 MAINTAINERS: add entry for exynos mipi display drivers
I'd like to add Inki Dae, Donghwa Lee and Kyungmin Park as maintainers
who developers for exynos mipi display drivers for
video/driver/exynos/exynos_mipi* and include/video/exynos_mipi*.

Signed-off-by: Donghwa Lee <dh09.lee@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Kukjin Kim <kgene.kim@samsung.com>
Cc: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-15 17:03:04 -07:00
Johan Hedberg
4c57c7e4af MAINTAINERS: fix link to Gustavo Padovans tree
Gustavo's tree is called just bluetooth.git and not bluetooth-2.6.git
anymore.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Cc: Marcel Holtmann <marcel@holtmann.org>
Cc: "Gustavo F. Padovan" <padovan@profusion.mobi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-15 17:03:03 -07:00
Johan Hedberg
eb491ecac1 MAINTAINERS: add Johan to Bluetooth maintainers
I've been coordinating Bluetooth patches in my tree for some time and
it's possible I'll do it in the future too, so add myself to the
Bluetooth sections as well as mention my tree there.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Cc: Marcel Holtmann <marcel@holtmann.org>
Cc: "Gustavo F. Padovan" <padovan@profusion.mobi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-15 17:03:03 -07:00
Gustavo Padovan
960d4d1ba6 MAINTAINERS: Gustavo has moved
This is going to be the primary e-mail for kernel development.

Signed-off-by: Gustavo Padovan <gustavo@padovan.org>
Cc: Johan Hedberg <johan.hedberg@gmail.com>
Cc: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-15 17:03:03 -07:00
Cyrill Gorcunov
79f0713d40 prctl: use CAP_SYS_RESOURCE for PR_SET_MM option
CAP_SYS_ADMIN is already overloaded left and right, so to have more
fine-grained access control use CAP_SYS_RESOURCE here.

The CAP_SYS_RESOUCE is chosen because this prctl option allows a current
process to adjust some fields of memory map descriptor which rather
represents what the process owns: pointers to code, data, stack
segments, command line, auxiliary vector data and etc.

Suggested-by: Michael Kerrisk <mtk.manpages@gmail.com>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Paul Bolle <pebolle@tiscali.nl>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-15 17:03:03 -07:00
Alexandre Bounine
9bbad7da76 rapidio/tsi721: fix bug in register offset definitions
Fix indexed register offset definitions that use decimal (wrong) instead
of hexadecimal (correct) notation for indexing multipliers.

Incorrect definitions do not affect Tsi721 driver in its current default
configuration because it uses only IDB queue 0.  Loss of inbound
doorbell functionality should be observed if queue other than 0 is used.

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Chul Kim <chul.kim@idt.com>
Cc: <stable@vger.kernel.org>		[3.2+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-15 17:03:03 -07:00
Viresh Kumar
fbfa0748d8 MAINTAINERS: update ST's Mailing list for SPEAr
We have created a ST's Mailing list for SPEAr.  This can be accessed
from non-st email ids.  I want people to cc this list, when they have
changes specific to SPEAr.  So, its better to get this updated in
MAINTAINERS file.

linux-arm-kernel@lists.infradead.org is also added for SPEAr.

Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-15 17:03:03 -07:00
Hugh Dickins
59927fb984 memcg: free mem_cgroup by RCU to fix oops
After fixing the GPF in mem_cgroup_lru_del_list(), three times one
machine running a similar load (moving and removing memcgs while
swapping) has oopsed in mem_cgroup_zone_nr_lru_pages(), when retrieving
memcg zone numbers for get_scan_count() for shrink_mem_cgroup_zone():
this is where a struct mem_cgroup is first accessed after being chosen
by mem_cgroup_iter().

Just what protects a struct mem_cgroup from being freed, in between
mem_cgroup_iter()'s css_get_next() and its css_tryget()? css_tryget()
fails once css->refcnt is zero with CSS_REMOVED set in flags, yes: but
what if that memory is freed and reused for something else, which sets
"refcnt" non-zero? Hmm, and scope for an indefinite freeze if refcnt is
left at zero but flags are cleared.

It's tempting to move the css_tryget() into css_get_next(), to make it
really "get" the css, but I don't think that actually solves anything:
the same difficulty in moving from css_id found to stable css remains.

But we already have rcu_read_lock() around the two, so it's easily fixed
if __mem_cgroup_free() just uses kfree_rcu() to free mem_cgroup.

However, a big struct mem_cgroup is allocated with vzalloc() instead of
kzalloc(), and we're not allowed to vfree() at interrupt time: there
doesn't appear to be a general vfree_rcu() to help with this, so roll
our own using schedule_work().  The compiler decently removes
vfree_work() and vfree_rcu() when the config doesn't need them.

Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Ying Han <yinghan@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-15 17:03:03 -07:00
Ville Syrjala
8ee161ce5e i2c-algo-bit: Fix spurious SCL timeouts under heavy load
When the system is under heavy load, there can be a significant delay
between the getscl() and time_after() calls inside sclhi(). That delay
may cause the time_after() check to trigger after SCL has gone high,
causing sclhi() to return -ETIMEDOUT.

To fix the problem, double check that SCL is still low after the
timeout has been reached, before deciding to return -ETIMEDOUT.

Signed-off-by: Ville Syrjala <syrjala@sci.fi>
Cc: stable@vger.kernel.org
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2012-03-15 18:11:05 +01:00
Wolfram Sang
834aa6f30c i2c-core: Comment says "transmitted" but means "received"
Fix that. Also convert this and the related comment to proper commenting
style.

Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
2012-03-15 18:11:05 +01:00
Dave Airlie
bb2551da10 Merge branch 'exynos-drm-fixes' of git://git.infradead.org/users/kmpark/linux-samsung into drm-fixes
* 'exynos-drm-fixes' of git://git.infradead.org/users/kmpark/linux-samsung:
  drm exynos: use drm_fb_helper_set_par directly
  drm/exynos: Fix fb_videomode <-> drm_mode_modeinfo conversion
  drm/exynos: fix runtime_pm fimd device state on probe
  drm/exynos: use correct 'exynos-drm' name for platform device
2012-03-15 09:41:26 +00:00
Sascha Hauer
34418c25d6 drm exynos: use drm_fb_helper_set_par directly
info->fix.visual already is correctly set from drm_fb_helper_fill_fix.
info->fix.line_length is also set from drm_fb_helper_fill_fix,
so drm_fb_helper_set_par directly instead of a custom
exynos_drm_fbdev_set_par.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
2012-03-15 11:39:00 +09:00
Laurent Pinchart
f7d86075fa drm/exynos: Fix fb_videomode <-> drm_mode_modeinfo conversion
The fb_videomode structure stores the front porch and back porch in the
right_margin and left_margin fields respectively. right_margin should
thus be computed with hsync_start - hdisplay, and left_margin with
htotal - hsync_end. The same holds for the vertical direction.

       Active               Front           Sync            Back
       Region               Porch                           Porch
<-------------------><----------------><-------------><---------------->

  //////////////////|
 ////////////////// |
//////////////////  |..................               ..................
                                       _______________

<------ xres -------><- right_margin -><- hsync_len -><- left_margin -->

<---- hdisplay ----->
<------------ hsync_start ------------>
<--------------------- hsync_end -------------------->
<--------------------------------- htotal ----------------------------->

Fix the fb_videomode <-> drm_mode_modeinfo conversion functions
accordingly.

Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Joonyoung Shim <jy0922.shim@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
2012-03-15 11:39:00 +09:00
Marek Szyprowski
0d8ce3ae37 drm/exynos: fix runtime_pm fimd device state on probe
A call to pm_runtime_set_active() forces device to be at the active
state and skips calling its runtime suspend/resume callbacks. This
results in a freeze with a new power domain code based on gen_pd. Fimd
driver does all required runtime power management calls, so this
pm_runtime_set_active call is buggy. This patch removes it and corrects
clock management in probe function (clocks are now enabled by
pm_runtime_get_sync() call).

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
2012-03-15 11:38:59 +09:00
Marek Szyprowski
9866b6c64b drm/exynos: use correct 'exynos-drm' name for platform device
Currently Exynos DRM driver uses DRIVER_NAME ('exynos') name for the
core platform device. This is confusing, because it doesn't refer to the
function the platform device is performing. This patch renames the
platform device to the 'exynos-drm', which matches the convention for
naming the platform devices. The name used inside DRM subsystem has not
been changed.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
2012-03-15 11:38:59 +09:00
Linus Torvalds
f1cbd03f5e Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
 "Been sitting on this for a while, but lets get this out the door.
  This fixes various important bugs for 3.3 final, along with a few more
  trivial ones.  Please pull!"

* 'for-linus' of git://git.kernel.dk/linux-block:
  block: fix ioc leak in put_io_context
  block, sx8: fix pointer math issue getting fw version
  Block: use a freezable workqueue for disk-event polling
  drivers/block/DAC960: fix -Wuninitialized warning
  drivers/block/DAC960: fix DAC960_V2_IOCTL_Opcode_T -Wenum-compare warning
  block: fix __blkdev_get and add_disk race condition
  block: Fix setting bio flags in drivers (sd_dif/floppy)
  block: Fix NULL pointer dereference in sd_revalidate_disk
  block: exit_io_context() should call elevator_exit_icq_fn()
  block: simplify ioc_release_fn()
  block: replace icq->changed with icq->flags
2012-03-14 17:16:45 -07:00