50166 Commits

Author SHA1 Message Date
Jay Vosburgh
a816c7c712 bonding: Improve IGMP join processing
In active-backup mode, the current bonding code duplicates IGMP
traffic to all slaves, so that switches are up to date in case of a
failover from an active to a backup interface.  If bonding then fails
back to the original active interface, it is likely that the "active
slave" switch's IGMP forwarding for the port will be out of date until
some event occurs to refresh the switch (e.g., a membership query).

	This patch alters the behavior of bonding to no longer flood
IGMP to all ports, and to issue IGMP JOINs to the newly active port at
the time of a failover.  This ensures that switches are kept up to date
for all cases.

	"GOELLESCH Niels" <niels.goellesch@eurocontrol.int> originally
reported this problem, and included a patch.  His original patch was
modified by Jay Vosburgh to additionally remove the existing IGMP flood
behavior, use RCU, streamline code paths, fix trailing white space, and
adjust for style.

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-03-06 06:08:11 -05:00
Jay Vosburgh
e245cb71d4 bonding: only receive ARPs for us
The ARP validation code only needs ARPs for the bonding device.

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-03-06 06:08:11 -05:00
Jay Vosburgh
c4f283b1f2 bonding: fix double dev_add_pack
Bonding can erroneously register the same packet_type to receive
ARPs (for use by ARP validation): once at device open time, and once via
sysfs.  Since sysfs can change the validate setting (and thus register
or unregister) at any time, a flag is needed to synchronize with device
open in order to avoid double registrations, and the simplest place is
within the packet_type structure itself.  Double unregister is not an
issue.
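
A minimal userspace sketch of the guard-flag idea described above, not the bonding driver's actual code. The `struct handler` type, its `registered` and `add_calls` fields, and both helper names are hypothetical stand-ins; the increment of `add_calls` stands in for the real dev_add_pack() call. The sketch does no locking, which real code racing between device open and sysfs would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a packet handler descriptor; this is NOT the
 * kernel's struct packet_type. */
struct handler {
    bool registered;  /* guard flag kept inside the structure itself */
    int  add_calls;   /* counts how often the "real" registration ran */
};

/* Register at most once, no matter how many callers ask for it. */
static void handler_register(struct handler *h)
{
    assert(h != NULL);
    if (h->registered)
        return;            /* already registered: nothing to do */
    h->add_calls++;        /* stands in for dev_add_pack() */
    h->registered = true;
}

/* Unregistering twice is harmless: the second call finds the flag clear. */
static void handler_unregister(struct handler *h)
{
    assert(h != NULL);
    if (!h->registered)
        return;
    h->registered = false; /* stands in for dev_remove_pack() */
}
```

With this shape, calling handler_register() from both an "open" path and a "sysfs" path performs the underlying registration only once.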

	Bug reported by Ulrich Oelmann <ulrich.oelmann@web.de>.

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-03-06 06:08:11 -05:00
Ingo Molnar
c3442e2965 [PATCH] paravirt: re-enable COMPAT_VDSO
CONFIG_PARAVIRT broke old glibc bootup: it silently turned off the
selectability of CONFIG_COMPAT_VDSO and thus rendered distro kernels
unbootable on old-style VDSO glibc setups.

the proper solution is to keep COMPAT_VDSO available - if a hypervisor
needs any modification of that concept then we'll judge those changes in
full context, once those changes are submitted.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 08:34:25 -08:00
Linus Torvalds
227c5fe799 Merge branch 'for-linus' of git://git.o-hand.com/linux-rpurdie-backlight
* 'for-linus' of git://git.o-hand.com/linux-rpurdie-backlight:
  backlight: Allow enable/disable of fb backlights, fixing regressions
  backlight: Fix nvidia backlight initial brightness
2007-03-05 08:25:43 -08:00
Ingo Molnar
6ebf622b25 [PATCH] disable NMI watchdog by default
there's a new NMI watchdog related problem: KVM crashes on certain
bzImages because ... we enable the NMI watchdog by default (even if the
user does not ask for it), and no other OS on this planet does that so
KVM doesn't have emulation for that yet. So KVM injects a #GP, which
crashes the Linux guest:

 general protection fault: 0000 [#1]
 PREEMPT SMP
 Modules linked in:
 CPU:    0
 EIP:    0060:[<c011a8ae>]    Not tainted VLI
 EFLAGS: 00000246   (2.6.20-rc5-rt0 #3)
 EIP is at setup_apic_nmi_watchdog+0x26d/0x3d3

and no, i did /not/ request an nmi_watchdog on the boot command line!

Solution: turn off that darn thing! It's a debug tool, not a 'make life
harder' tool!!

with this patch the KVM guest boots up just fine.

And with this my laptop (Lenovo T60) also stopped its sporadic hard
hanging (sometimes in acpi_init(), sometimes later during bootup,
sometimes much later during actual use) as well. It hung with both
nmi_watchdog=1 and nmi_watchdog=2, so it's generally the fact of NMI
injection that is causing problems, not the NMI watchdog variant, nor
any particular bootup code.

[ NMI breaks on some systems, esp in combination with SMM -Arjan ]

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 08:23:51 -08:00
Ingo Molnar
0d05ad2c09 [PATCH] paravirt: let users decide whether they want VMI
do not use default=y for CONFIG_VMI (we do not do that for any driver or
special-hardware feature): the overwhelming majority of Linux users does
not need it, and interested users and distributions can enable it
as-needed.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 08:23:51 -08:00
Ingo Molnar
e9417fb324 [PATCH] paravirt: clarify VMI description
Clarify the description of the CONFIG_VMI option: describe the reality
that VMI is a VMWare-only interface for now. Once that changes and
another hypervisor adopts the VMI ABI we can change the text.

As can be seen from the Xen paravirtualization patches submitted to lkml
the Xen project has chosen its own, non-VMI interface between Xen and
the para-Linux - so remove Xen from the description.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 08:23:51 -08:00
Ingo Molnar
3f1a73b6dd [PATCH] paravirt: remove NO_IDLE_HZ on x86
Remove the mistaken turning on of NO_IDLE_HZ on x86+PARAVIRT kernels.

It's an obsolete, limited form of dynticks.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 08:23:51 -08:00
David Miller
8690ba446d [PATCH] video/aty/mach64_ct.c: fix bogus delay loop
CT based mach64 cards were reported to hang on sparc64 boxes when
compiled with gcc-4.1.x and later.

Looking at this piece of code, it's no surprise.  A critical
delay was implemented as an empty for() loop, and gcc 4.0.x
and previous did not optimize it away, so we did get a delay.

But gcc-4.1.x and later can optimize it away, and we get crashes.

Use a real udelay() to fix this.  Fix verified on SunBlade100.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 08:12:54 -08:00
Adrian Bunk
8f48561223 [PATCH] arch/i386/kernel/vmi.c must #include <asm/kmap_types.h>
CC      arch/i386/kernel/vmi.o
/home/bunk/linux/kernel-2.6/linux-2.6.21-rc2-mm1/arch/i386/kernel/vmi.c: In function 'vmi_map_pt_hook':
/home/bunk/linux/kernel-2.6/linux-2.6.21-rc2-mm1/arch/i386/kernel/vmi.c:387: error: 'KM_PTE0' undeclared (first use in this function)
/home/bunk/linux/kernel-2.6/linux-2.6.21-rc2-mm1/arch/i386/kernel/vmi.c:387: error: (Each undeclared identifier is reported only once
/home/bunk/linux/kernel-2.6/linux-2.6.21-rc2-mm1/arch/i386/kernel/vmi.c:387: error: for each function it appears in.)
/home/bunk/linux/kernel-2.6/linux-2.6.21-rc2-mm1/arch/i386/kernel/vmi.c:387: error: 'KM_PTE1' undeclared (first use in this function)
make[2]: *** [arch/i386/kernel/vmi.o] Error 1

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:54 -08:00
Sam Ravnborg
00f8b0c185 [PATCH] usb-storage: do not rebuild when kernel version changes
Replacing use of UTS_RELEASE with utsname()->release avoids that the
usb-storage driver is recompiled each time the kernel version changes.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:54 -08:00
Roland Kletzing
f9c99463b0 [PATCH] Documentation for io-accounting / reporting via procfs
Add some documentation for the new and very useful io-accounting feature.
It's being added to Documentation/filesystems/proc.txt

Signed-off-by: Roland Kletzing <devzero@web.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:54 -08:00
Antonino A. Daplas
721c04c65f [PATCH] atyfb: Fix kconfig error
Fix the following compile error:

  MODPOST 327 modules
WARNING: "aty_st_lcd" [drivers/video/aty/atyfb.ko] undefined!
WARNING: "aty_ld_lcd" [drivers/video/aty/atyfb.ko] undefined!
make[1]: *** [__modpost] Error 1
make: *** [modules] Error 2

Signed-off-by: Antonino Daplas <adaplas@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:53 -08:00
Michal Piotrowski
6346190b2f [PATCH] char/epca.c: remove unused function
"drivers/char/epca.c:2741: warning: 'get_termio' defined but not used"

Signed-off-by: Michal Piotrowski <michal.k.k.piotrowski@gmail.com>
Acked-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:53 -08:00
Heiko Carstens
e81ce1f7ec [PATCH] timer/hrtimer: take per cpu locks in sane order
Doing something like this on a two cpu system

  # echo 0 > /sys/devices/system/cpu/cpu0/online
  # echo 1 > /sys/devices/system/cpu/cpu0/online
  # echo 0 > /sys/devices/system/cpu/cpu1/online

will give me this:

  =======================================================
  [ INFO: possible circular locking dependency detected ]
  2.6.21-rc2-g562aa1d4-dirty #7
  -------------------------------------------------------
  bash/1282 is trying to acquire lock:
   (&cpu_base->lock_key){.+..}, at: [<000000000005f17e>] hrtimer_cpu_notify+0xc6/0x240

  but task is already holding lock:
   (&cpu_base->lock_key#2){.+..}, at: [<000000000005f174>] hrtimer_cpu_notify+0xbc/0x240

  which lock already depends on the new lock.

This happens because we have the following code in kernel/hrtimer.c:

  migrate_hrtimers(int cpu)
  [...]
  old_base = &per_cpu(hrtimer_bases, cpu);
  new_base = &get_cpu_var(hrtimer_bases);
  [...]
  spin_lock(&new_base->lock);
  spin_lock(&old_base->lock);

Which means the spinlocks are taken in an order which depends on which cpu
gets shut down from which other cpu. Therefore lockdep complains that there
might be an ABBA deadlock. Since migrate_hrtimers() gets only called on
cpu hotplug it's safe to assume that it isn't executed concurrently on a

The same problem exists in kernel/timer.c: migrate_timers().

As pointed out by Christian Borntraeger one possible solution to avoid
the locking order complaints would be to make sure that the locks are
always taken in the same order. E.g. by taking the lock of the cpu with
the lower number first.

To achieve this we introduce two new spinlock functions double_spin_lock
and double_spin_unlock which lock or unlock two locks in a given order.
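
The ordering rule can be sketched in userspace C as follows. This is only an analogue built on C11 `atomic_flag`, not the kernel implementation: `struct spinlock`, the `l1_first` parameter, and the `lock_cpu_pair()`/`unlock_cpu_pair()` helpers are assumptions made for illustration. The point is that both callers agree on the order (lower cpu number first) regardless of which cpu is shutting down which.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy spinlock for illustration; initialize with { ATOMIC_FLAG_INIT }. */
struct spinlock {
    atomic_flag held;
};

static void spin_lock(struct spinlock *l)
{
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        ; /* busy-wait until the current holder clears the flag */
}

static void spin_unlock(struct spinlock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Take two distinct locks in the order selected by l1_first. */
static void double_spin_lock(struct spinlock *l1, struct spinlock *l2,
                             bool l1_first)
{
    assert(l1 != l2);
    if (l1_first) {
        spin_lock(l1);
        spin_lock(l2);
    } else {
        spin_lock(l2);
        spin_lock(l1);
    }
}

/* Release in the reverse of the acquisition order. */
static void double_spin_unlock(struct spinlock *l1, struct spinlock *l2,
                               bool l1_first)
{
    if (l1_first) {
        spin_unlock(l2);
        spin_unlock(l1);
    } else {
        spin_unlock(l1);
        spin_unlock(l2);
    }
}

/* Hypothetical callers: the lock of the lower-numbered cpu always goes first. */
static void lock_cpu_pair(struct spinlock *locks, int cpu_a, int cpu_b)
{
    double_spin_lock(&locks[cpu_a], &locks[cpu_b], cpu_a < cpu_b);
}

static void unlock_cpu_pair(struct spinlock *locks, int cpu_a, int cpu_b)
{
    double_spin_unlock(&locks[cpu_a], &locks[cpu_b], cpu_a < cpu_b);
}
```

Because lock_cpu_pair(locks, 0, 1) and lock_cpu_pair(locks, 1, 0) both take locks[0] before locks[1], the ABBA pattern cannot arise between them.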

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: John Stultz <johnstul@us.ibm.com>
Cc: Christian Borntraeger <cborntra@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:53 -08:00
john stultz
6bb74df481 [PATCH] clocksource init adjustments (fix bug #7426)
This patch resolves the issue found here:
http://bugme.osdl.org/show_bug.cgi?id=7426

The basic summary is:
Currently we register most of i386/x86_64 clocksources at module_init
time. Then we enable clocksource selection at late_initcall time. This
causes some problems for drivers that use gettimeofday for init
calibration routines (specifically the es1968 driver in this case),
where during module_init, the only clocksource available is the low-res
jiffies clocksource. This may cause slight calibration errors, due to
the small sampling time used.

It should be noted that drivers that require fine grained time may not
function on architectures that do not have better than jiffies
resolution timekeeping (there are a few). However, this does not
discount the reasonable need for such fine-grained timekeeping at init
time.

Thus the solution here is to register clocksources earlier (ideally when
the hardware is being initialized), and then we enable clocksource
selection at fs_initcall (before device_initcall).

This patch should probably get some testing time in -mm, since
clocksource selection is one of the most important issues for correct
timekeeping, and I've only been able to test this on a few of my own
boxes.

Signed-off-by: John Stultz <johnstul@us.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:53 -08:00
David Rientjes
4540768011 [PATCH] x86_64: remove unusued 'flags' variable
Removes unused 'flags' variable from setup_IO_APIC_irq().

Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:53 -08:00
Christian Krafft
4ff31d7757 [PATCH] ipmi: check, if default ports are accessible on PPC
ipmi_si_intf tries to access default ports, if no device could be found
elsewhere.  On PPC we have a function to check, if these legacy IO ports
are accessible.  This patch adds a check for these ports on PPC.  This
patch fixes a breakage of IPMI module on PPC machines without a BMC.

Signed-off-by: Christian Krafft <krafft@de.ibm.com>
Acked-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Corey Minyard <minyard@acm.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:53 -08:00
Dmitriy Monakhov
a8fa74ab52 [PATCH] ecryptfs: handle AOP_TRUNCATED_PAGE better
- In fact we don't have to fail if AOP_TRUNCATED_PAGE was returned from
  prepare_write or commit_write. It is better to retry the attempt where
  that is possible.

- Rearrange the ecryptfs_get_lower_page() error handling logic to make it cleaner.

Signed-off-by: Dmitriy Monakhov <dmonakhov@openvz.org>
Acked-by: Michael Halcrow <mhalcrow@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:53 -08:00
Dmitriy Monakhov
82b1652840 [PATCH] ecryptfs: lower root result must be adirectory
- Currently, after path_lookup succeeds we don't have any guarantee that
  the result is a directory. This must be explicitly demanded.
- path_lookup can't return a negative dentry, so the inode check is useless.

Signed-off-by: Dmitriy Monakhov <dmonakhov@openvz.org>
Acked-by: Michael Halcrow <mhalcrow@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:53 -08:00
Thomas Gleixner
a5f5e43e2b [PATCH] fix "NMI appears to be stuck"
Testing NMI watchdog ... CPU#0: NMI appears to be stuck (54->54)!
  CPU#1: NMI appears to be stuck (0->0)!

Keep the PIT/HPET alive when nmi_watchdog = 1 is given on the command
line.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:53 -08:00
NeilBrown
6d3baf2eb8 [PATCH] md: fix for raid6 reshape
Recent patch for raid6 reshape had a change missing that showed up in
subsequent review.

Many places in the raid5 code used "conf->raid_disks-1" to mean "number of
data disks".  With raid6 that had to be changed to "conf->raid_disks -
conf->max_degraded" or similar.  One place was missed.

This bug means that if a raid6 reshape were aborted in the middle the
recorded position would be wrong.  On restart it would either fail (as the
position wasn't on an appropriate boundary) or would leave a section of the
array unreshaped, causing data corruption.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:53 -08:00
Zachary Amsden
c6b36e9a3c [PATCH] vmi: smp fixes
Critical fixes for SMP.

Fix a couple functions which needed to be __devinit and fix a bogus parameter
to AP startup that just so happened to work because the low virtual mapping of
memory was still established.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:53 -08:00
Zachary Amsden
772205f62e [PATCH] vmi: apic ops
Use para_fill instead of directly setting the APIC ops to the result of the
vmi_get_function call - this allows one to implement a VMI ROM without
implementing APIC functions, just using the native APIC functions.

While doing this, I realized that there is a lot more cleanup that should have
been done.  Basically, we should never assume that the ROM implements a
specific set of functions, and always allow fallback to the native
implementation.

This is critical for future compatibility.

Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
Signed-off-by: Zachary Amsden <zach@vmware.com>

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:52 -08:00
Zachary Amsden
a9eddc9528 [PATCH] vmi: fix nohz compile
More goo from hrtimers integration.  We do compile and run properly with NO_HZ
enabled.  There was a period when we didn't because of a missing export, but
that was since fixed.

And with the clocksource code now firmly in place, we can get rid of code that
fixes up the wallclock, since this is done in the common infrastructure.  This
actually fixes a timer bug as well, that was caused by do_settimeofday no
longer being callable with interrupts disabled due to the use of
on_each_cpu().

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:52 -08:00
Zachary Amsden
e30fab3ad3 [PATCH] vmi: pit override
The time_init_hook in paravirt-ops no longer functions in the correct manner
after the integration of the hrtimers code.  The problem is that now the call
path for time initialization is:

  time_init :
       late_time_init = hpet_time_init;

  late_time_init -> hpet_time_init:
       setup_pit_timer (BAD)
       do_time_init --> (via paravirt.h)
          time_init_hook --> (via arch_hooks.h)
              time_init_hook (in SUBARCH/setup.c)

If this isn't confusing enough, the paravirt case goes through an indirect
function pointer in the paravirt-ops table.  The problem is, by the time the
paravirt hook is called, the pit timer is already enabled.

But paravirt guests have their own timer, and don't want to use the PIT.
Rather than intensify the struggle for power going on here, just make it all
nice and simple and just unconditionally do all timer setup in the
late_time_init hook.  This also has the advantage of enabling timers in the
same place in all code paths, so everyone has the same bugs and we don't have
outliers who break other code because they turn on timer too early or too
late.

So the paravirt-ops time init function is now by default hpet_time_init, which
is the time init function used for native hardware.  Paravirt guests have the
chance to override this when they setup the paravirt-ops table, and should
need no change.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:52 -08:00
Zachary Amsden
eda08b1bef [PATCH] vmi: paravirt drop udelay op
Not respecting udelay causes problems with any virtual hardware that is passed
through to real hardware.  This can be noticed by any device that interacts
with the real world in real time - like AP startup, which takes real time.  Or
keyboard LEDs, which should blink in real-time.  Or floppy drives, but only
when passed through to a real floppy controller on OSes which can't
sufficiently buffer the floppy commands to emulate a zero latency floppy.  Or
IDE drives, when connecting to a physical CDROM.

This was mostly a hack to get the kernel to boot faster, but it introduced a
number of misvirtualization bugs, and Alan and Pavel argued pretty strongly
against it.  We were the only client, and now want to clean up this cruft.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:52 -08:00
Zachary Amsden
9a1c13e91f [PATCH] vmi: fix highpte
Provide a PT map hook for HIGHPTE kernels to designate where they are mapping
page tables.  This information is required so the physical address of PTE
updates can be determined; otherwise, the mm layer would have to carry the
physical address all the way to each PTE modification callsite, which is even
more hideous than the macros required to provide the proper hooks.

So let's not mess up arch neutral code to achieve this, but keep the horror in
an #ifdef HIGHPTE in include/asm-i386/pgtable.h.  I had to use macros here
because some types are not yet defined in all the include paths for this
header.

This patch is absolutely required for HIGHPTE kernels to operate properly with
VMI.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:52 -08:00
Zachary Amsden
1182d8528b [PATCH] vmi: cpu cycles fix
In order to share the common code in tsc.c which does CPU Khz calibration, we
need to make an accurate value of CPU speed available to the tsc.c code.  This
value loses a lot of precision in a VM because of the timing differences with
real hardware, but we need it to be as precise as possible so the guest can
make accurate time calculations with the cycle counters.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:52 -08:00
Zachary Amsden
6cb9a8350a [PATCH] vmi: sched clock paravirt op fix
The custom_sched_clock hook is broken.  The result from sched_clock needs to
be in nanoseconds, not in CPU cycles.  The TSC is insufficient for this
purpose, because TSC is poorly defined in a virtual environment, and mostly
represents real world time instead of scheduled process time (which can be
interrupted without notice when a virtual machine is descheduled).

To make the scheduler consistent, we must expose a different nature of time,
that is scheduled time.  So deprecate this custom_sched_clock hack and turn it
into a paravirt-op, as it should have been all along.  This allows the tsc.c
code which converts cycles to nanoseconds to be shared by all paravirt-ops
backends.
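
As a rough illustration of what converting cycles to nanoseconds involves, here is a fixed-point sketch. It is written under the assumption of a known, constant `cpu_khz`; the function names and the scale factor of 2^10 are choices made for this example and should not be read as the exact tsc.c code. The idea is ns = cycles * 10^6 / cpu_khz, with the division precomputed into a scale so the hot path is a multiply and a shift.

```c
#include <assert.h>
#include <stdint.h>

/* Fixed-point precision used for the precomputed scale (2^10). */
#define CYC2NS_SCALE_FACTOR 10

/* Precompute (10^6 << 10) / cpu_khz once, e.g. after calibration. */
static uint64_t cyc2ns_scale_from_khz(uint32_t cpu_khz)
{
    assert(cpu_khz != 0);
    return (UINT64_C(1000000) << CYC2NS_SCALE_FACTOR) / cpu_khz;
}

/* Convert a cycle count to nanoseconds using the precomputed scale.
 * Note: cycles * scale can overflow 64 bits for very large cycle counts;
 * this sketch does not guard against that. */
static uint64_t cycles_to_ns(uint64_t cycles, uint64_t scale)
{
    return (cycles * scale) >> CYC2NS_SCALE_FACTOR;
}
```

For a 1 GHz clock (cpu_khz = 1000000) the scale is 1024, so one cycle maps to one nanosecond; the precision of the result depends directly on how accurate cpu_khz is.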

It is unfortunate to add a new paravirt-op, but this is a very distinct
abstraction which is clearly different for all virtual machine
implementations, and it gets rid of an ugly indirect function which I
ashamedly admit I hacked in to try to get this to work earlier, and then even
got in the wrong units.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:52 -08:00
Zachary Amsden
7507ba34e8 [PATCH] vmi: timer fixes round two
Critical bugfixes for the VMI-Timer code.

1) Do not setup a one shot alarm if we are keeping the periodic alarm
   armed.  Additionally, since the periodic alarm can be run at a lower rate
   than HZ, let's fixup the guard to the no-idle-hz mode appropriately.  This
   fixes the bug where the no-idle-hz mode might have a higher interrupt rate
   than the non-idle case.

2) The interrupt handler can no longer adjust xtime due to nested lock
   acquisition.  Drop this.  We don't need to check for wallclock time at
   every tick, it can be done in userspace instead.

3) Add a bypass to disable noidle operation.  This is useful as a last
   minute workaround, or testing measure.

4) The code to skip the IO_APIC timer testing (no_timer_check) should be
   conditional on IO_APIC, not SMP, since UP kernels can have this configured
   in as well.

Signed-off-by: Dan Hecht <dhecht@vmware.com>
Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:52 -08:00
Christoph Lameter
0dc952dc3e [PATCH] Page migration: Fix vma flag checking
Currently we do not check for vma flags if sys_move_pages is called to move
individual pages.  If sys_migrate_pages is called to move pages then we
check for vm_flags that indicate a non migratable vma but that still
includes VM_LOCKED and we can migrate mlocked pages.

Extract the vma_migratable check from mm/mempolicy.c, fix it and put it
into migrate.h so that it can be used from both locations.

Problem was spotted by Lee Schermerhorn

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:51 -08:00
Paul Mundt
1f2b69f9bd [PATCH] fb: sm501fb off-by-1 sysfs store
Currently sm501fb_crtsrc_store() won't allow the routing to be changed via
echos from userspace in to the sysfs file.  The reason for this is that the
strnicmp() for both heads uses a sizeof() for the string length, which ends
up being strlen() + 1 (\0 in the normal case, but the echo gives a newline,
which is where the issue occurs), this then causes a mismatch and
subsequently bails with the -EINVAL.

In addition to this, the hardcoded lengths were then used for the store
length that was returned, which ended up being erroneous and resulting in a
write error.  There's also no point in returning anything but the full
length since it will -EINVAL out on a mismatch well before then anyways.

sizeof("string") is great for making sure you have space in your buffer,
but rather less so for string comparisons :-)
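
The off-by-one can be reproduced in a few lines of standalone C. This is a sketch, not the driver code: `ci_equal_n()` is a small stand-in for the kernel's strnicmp(), and the keyword "crt" and the two `match_*` helpers are illustrative assumptions. Comparing sizeof("crt") bytes includes the terminating NUL, so input that arrives with a trailing newline (as echo produces) fails to match.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Case-insensitive comparison of at most n bytes, stopping at a NUL.
 * Stand-in for strnicmp(); returns true when the compared bytes match. */
static bool ci_equal_n(const char *a, const char *b, size_t n)
{
    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < n; i++) {
        if (tolower((unsigned char)a[i]) != tolower((unsigned char)b[i]))
            return false;
        if (a[i] == '\0')
            return true;   /* both strings ended together */
    }
    return true;
}

/* Buggy: sizeof("crt") is 4, so the NUL is compared against '\n'. */
static bool match_buggy(const char *buf)
{
    return ci_equal_n(buf, "crt", sizeof("crt"));
}

/* Fixed: compare only the 3 visible characters of the keyword. */
static bool match_fixed(const char *buf)
{
    return ci_equal_n(buf, "crt", sizeof("crt") - 1);
}
```

Here match_buggy("crt\n") rejects the input while match_fixed("crt\n") accepts it; both accept a bare "crt".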

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: Ben Dooks <ben-linux@fluff.org>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:51 -08:00
Con Kolivas
69f7c0a1be [PATCH] sched: remove SMT nice
Remove the SMT-nice feature which idles sibling cpus on SMT cpus to
facilitate nice working properly where cpu power is shared.  The idling of
cpus in the presence of runnable tasks is considered too fragile, easy to
break with outside code, and the complexity of managing this system if an
architecture comes along with many logical cores sharing cpu power will be
unworkable.

Remove the associated per_cpu_gain variable in sched_domains used only by
this code.

Also:

  The reason is that with dynticks enabled, this code breaks without yet
  further tweaks so dynticks brought on the rapid demise of this code.  So
  either we tweak this code or kill it off entirely.  It was Ingo's preference
  to kill it off.  Either way this needs to happen for 2.6.21 since dynticks
  has gone in.

Signed-off-by: Con Kolivas <kernel@kolivas.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:51 -08:00
Hugh Dickins
759b9775c2 [PATCH] shmem and simple const super_operations
shmem's super_operations were missed from the recent const-ification;
and simple_fill_super()'s, which can share with get_sb_pseudo()'s.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Acked-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:51 -08:00
Johannes Berg
cc2cccaec8 [PATCH] schedule wext/rtnl for removal
Since wext is being replaced as fast as we can (it'll probably stick around
for legacy drivers though) and the wext/netlink stuff was never really
used, this schedules it for removal.

The removal schedule is tight but there are no users of the code, the main
user of the wext user interface are the wireless-tools, they only have an
alpha version using the netlink interface and even that is incomplete.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:51 -08:00
Maciej W. Rozycki
de320199c0 [PATCH] dz: remove struct pt_regs references
Remove remaining references to saved registers now that
uart_handle_sysrq_char() does not want them.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:51 -08:00
David Brownell
49015bee40 [PATCH] gpio_keys driver shouldn't be ARM-specific
The gpio_keys driver is wrongly ARM-specific; it can't build on
other platforms with GPIO support.  This fixes that problem.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: Dmitry Torokhov <dtor@mail.ru>
Cc: pHilipp Zabel <philipp.zabel@gmail.com>
Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Ben Nizette <ben.nizette@iinet.net.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:51 -08:00
David Brownell
0a938b9768 [PATCH] add CONFIG_GENERIC_GPIO
Most drivers using GPIOs already know they are running on a system that
supports the generic GPIO calls, because of other platform dependencies.
But the generic GPIO-based LED and input button drivers can't know that.

So this patch adds a Kconfig hook, GENERIC_GPIO, to mark the platforms
where <asm/gpio.h> will do the right thing.  Currently that's a bunch of
ARMs, and AVR32; more are on the way.

It also fixes a dependency bug for the gpio button input driver; it was
wrong to start with, now it covers all platforms with GENERIC_GPIO.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Acked-by: Richard Purdie <rpurdie@rpsys.net>
Cc: Arnaud Patard <arnaud.patard@rtp-net.org>
Cc: <raph@8d.com>
Cc: <msvoboda@ra.rockwell.com>
Cc: pHilipp Zabel <philipp.zabel@gmail.com>
Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
Cc: Dmitry Torokhov <dtor@mail.ru>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:51 -08:00
Tony Breeds
1ad7c31107 [PATCH] Fix soft lockup with iSeries viocd driver
Fix soft lockup with iSeries viocd driver, caused by eventually calling
end_that_request_first() with nr_bytes 0.

Some versions of hald do an SG_IO ioctl on the viocd device which becomes a
request with hard_nr_sectors and hard_cur_sectors set to zero.  Passing zero
as the number of sectors to end_request() (which calls
end_that_request_first()) causes an infinite loop when the bio is being freed.

This patch makes sure that the zero is never passed.  It only requires some
number larger than the request size to terminate the loop.
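
As a rough illustration of the failure mode, and not the driver's actual
code: the toy model below completes a request in steps of a given byte
count, so a step of zero never makes progress.  The names
bytes_to_complete and completion_steps, and the 512-byte sector size, are
assumptions made for this sketch only.

```c
#include <stddef.h>

#define SECTOR_SIZE 512u /* assumed sector size for this sketch */

/*
 * Hypothetical helper: pick the byte count to report as completed.
 * A zero-sector request (as produced by the SG_IO ioctl) is bumped to
 * one sector so the completion loop below always makes progress.
 */
static size_t bytes_to_complete(size_t hard_nr_sectors)
{
	size_t sectors = hard_nr_sectors ? hard_nr_sectors : 1;

	return sectors * SECTOR_SIZE;
}

/*
 * Toy model of a completion loop: consume `step` bytes per pass until
 * `total` bytes are accounted for.  Returns the number of passes, or -1
 * if `step` is zero and the loop could never terminate.
 */
static long completion_steps(size_t total, size_t step)
{
	long passes = 0;

	if (step == 0)
		return -1; /* a real loop would spin here forever */

	do {
		total = (total > step) ? total - step : 0;
		passes++;
	} while (total > 0);

	return passes;
}
```

Any non-zero step at least as large as the request ends the loop after a
single pass, which is why clamping to one sector is enough in this model.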

The lockup is triggered by hald, interrogating the device.

Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:51 -08:00
David Brownell
5fdc2abe39 [PATCH] parport is an orphan
The writing on the wall seems to be that the parport stack is orphaned,
rather than maintained by four folk ...  and having a webpage that says the
latest patches are based on a 2.5 kernel.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Acked-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:50 -08:00
Dmitriy Monakhov
ad5f119679 [PATCH] ecryptfs: check xattr operation support fix
- ecryptfs_write_inode_size_to_metadata() error code was ignored.
- i_op->setxattr() must be supported by lower fs because it is used below.

Signed-off-by: Monakhov Dmitriy <dmonakhov@openvz.org>
Acked-by: Michael Halcrow <mhalcrow@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:50 -08:00
Eric W. Biederman
58e0543e8f [PATCH] msi: support masking msi irqs without a mask bit
For devices that do not support msi-x we only support 1 interrupt.  Therefore
we can disable that one interrupt by disabling the msi capability itself.  If
we leave the intx interrupts disabled while we have the msi capability
disabled no interrupts should be delivered from that device.

Devices with just the minimal msi support (and thus hitting this code path)
include things like the intel e1000 nic, so it looks like this is going to be a
fairly common case and thus important to get right.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:50 -08:00
Eric W. Biederman
b1cbf4e4dd [PATCH] msi: fix up the msi enable/disable logic
enable/disable_msi_mode have several side effects which keep them from being
generally useful.  So this patch replaces them with two much more
targeted functions: msi_set_enable and msix_set_enable.
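
A minimal sketch of what such a targeted helper could look like, operating
on a plain 16-bit copy of the control register rather than on real PCI
config space.  The enable-bit positions (bit 0 for MSI, bit 15 for MSI-X)
follow the PCI specification; the names set_enable_bit,
model_msi_set_enable and model_msix_set_enable are hypothetical and are
not the kernel's implementation.

```c
#include <stdint.h>

#define MSI_FLAGS_ENABLE  0x0001u /* MSI Message Control, bit 0 */
#define MSIX_FLAGS_ENABLE 0x8000u /* MSI-X Message Control, bit 15 */

/* Set or clear one enable bit and leave every other bit untouched. */
static uint16_t set_enable_bit(uint16_t control, uint16_t bit, int enable)
{
	control &= (uint16_t)~bit;
	if (enable)
		control |= bit;
	return control;
}

/* Model of an MSI-only toggle: no side effects beyond the enable bit. */
static uint16_t model_msi_set_enable(uint16_t control, int enable)
{
	return set_enable_bit(control, MSI_FLAGS_ENABLE, enable);
}

/* Model of an MSI-X-only toggle: no side effects beyond the enable bit. */
static uint16_t model_msix_set_enable(uint16_t control, int enable)
{
	return set_enable_bit(control, MSIX_FLAGS_ENABLE, enable);
}
```

Since only the enable bit changes in this model, the rest of the control
word is preserved across an enable/disable pair.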

This patch makes pci_dev->msi_enabled and pci_dev->msix_enabled the definitive
way to test if linux has enabled the msi capability, and has the appropriate
msi data structures set up.

This patch ensures that while writing the msi messages in save/restore and
during device initialization we have the msi capability disabled so we don't
get into races.  The pci spec requires that we do not have the msi capability
enabled and the msi messages unmasked while we write the messages.  Completely
disabling the capability is overkill but it is easy :)

Care has been taken so we never have both a msi capability and intx enabled
simultaneously.  We haven't run into a problem yet but better safe than sorry.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:50 -08:00
Eric W. Biederman
f5f2b13129 [PATCH] msi: sanely support hardware level msi disabling
In some cases when we are not using msi we need a way to ensure that the
hardware does not have an msi capability enabled.  Currently the code has been
calling disable_msi_mode to try and achieve that.  However disable_msi_mode
has several other side effects and is only available when msi support is
compiled in so it isn't really appropriate.

Instead this patch implements pci_msi_off which disables all msi and msix
capabilities unconditionally with no additional side effects.

pci_disable_device was redundantly clearing the bus master enable flag and
clearing the msi enable bit.  A device that is not allowed to perform bus
mastering operations cannot generate intx or msi interrupt messages as those
are essentially a special case of dma, and require bus mastering.  So the call
in pci_disable_device to disable msi capabilities was redundant.

quirk_pcie_pxh also called disable_msi_mode and is updated to use pci_msi_off.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:50 -08:00
Jean Delvare
58a53b246b [PATCH] io_apic.h needs apicdef.h
A -mm patch caused:

In file included from drivers/pci/quirks.c:532:
include/asm/io_apic.h:61: error: "MAX_IO_APICS" undeclared here (not in a function)

So let's include the needed header.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:50 -08:00
Andrew Morton
d5dedf99e4 [PATCH] cyclades: return closing_wait
In http://bugzilla.kernel.org/show_bug.cgi?id=8065, Shen points out that the
cyclades driver forgets to return closing_wait to userspace.

Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Shen <shanlu@cs.uiuc.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:50 -08:00
Richard Purdie
202d4e6025 backlight: Allow enable/disable of fb backlights, fixing regressions
Enabling the backlight by default appears to cause problems for many
users. This patch disables backlight controls unless explicitly
enabled by users via a module parameter. Since the backlight is known
to work for PMAC users, default to enabled in that case.

Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
2007-03-05 08:49:38 +00:00
Richard Purdie
238576e12f backlight: Fix nvidia backlight initial brightness
Fix a mix-up when the nvidia driver was converted, resulting
in the backlight having an incorrect initial brightness.

Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
2007-03-05 08:49:38 +00:00