linux

Author	SHA1	Message	Date
David Rientjes	41f7f60d31	cpusets: fix obsolete comment mm migration is no longer done in cpuset_update_task_memory_state() so it can no longer take current->mm->mmap_sem, so fix the obsolete comment. [ This changed in commit `04c19fa6f1` ("cpuset: migrate all tasks in cpuset at once") when the mm migration was moved from cpuset_update_task_memory_state() to update_nodemask() ] Signed-off-by: David Rientjes <rientjes@google.com> Cc: Paul Jackson <pj@sgi.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-03-05 17:53:33 -08:00
Linus Torvalds	103926c689	Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6: (27 commits) [SCSI] mpt fusion: don't oops if NumPhys==0 [SCSI] iscsi class: regression - fix races with state manipulation and blocking/unblocking [SCSI] qla4xxx: regression - add start scan callout [SCSI] qla4xxx: fix host reset dpc race [SCSI] tgt: fix build errors when dprintk is defined [SCSI] tgt: set the data length properly [SCSI] tgt: stop zero'ing scsi_cmnd [SCSI] ibmvstgt: set up scsi_host properly before __scsi_alloc_queue [SCSI] docbook: fix fusion source files [SCSI] docbook: fix scsi source file [SCSI] qla2xxx: Update version number to 8.02.00-k9. [SCSI] qla2xxx: Correct usage of inconsistent timeout values while issuing ELS commands. [SCSI] qla2xxx: Correct discrepancies during OVERRUN handling on FWI2-capable cards. [SCSI] qla2xxx: Correct needless clean-up resets during shutdown. [SCSI] arcmsr: update version and changelog [SCSI] ps3rom: disable clustering [SCSI] ps3rom: fix wrong resid calculation bug [SCSI] mvsas: fix phy sas address [SCSI] gdth: fix to internal commands execution [SCSI] gdth: bugfix for the at-exit problems ...	2008-03-05 17:49:59 -08:00
Linus Torvalds	da71aeb614	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6: NFS: use new LSM interfaces to explicitly set mount options LSM/SELinux: Interfaces to allow FS to control mount options	2008-03-05 17:49:38 -08:00
Linus Torvalds	9af6b056a2	Merge branch 'fixes-25' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq * 'fixes-25' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq: [CPUFREQ] fix section mismatch warnings [CPUFREQ] Remove debugging message from e_powersaver [CPUFREQ] Fix missing cpufreq_cpu_put() call in ->store [CPUFREQ] Fix missing cpufreq_cpu_put() call in ->show	2008-03-05 17:49:01 -08:00
Linus Torvalds	8cce3e7cbe	Merge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6 * 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6: [S390] incorrect reipl nss name. [S390] Load disabled wait psw if reipl fails. [S390] Fix IPL from NSS. [S390] zcrypt: fix ap_device_list handling [S390] sclp_vt220: speed up console output for interactive work [S390] dasd: fix reference counting in display method for proc/dasd/devices [S390] dasd: let dasd erp matching recognize alias recovery [S390] Get rid of memcpy gcc warning workaround. [S390] idle: Fix machine check handling in idle loop. [S390] Update default configuration.	2008-03-05 17:47:41 -08:00
Eric Paris	f9c3a38021	NFS: use new LSM interfaces to explicitly set mount options NFS and SELinux worked together previously because SELinux had NFS specific knowledge built in. This design was approved by both groups back in 2004 but the recent NFS changes to use nfs_parsed_mount_data and the usage of nfs_clone_mount_data showed this to be a poor fragile solution. This patch fixes the NFS functionality regression by making use of the new LSM interfaces to allow an FS to explicitly set its own mount options. The explicit setting of mount options is done in the nfs get_sb functions which are called before the generic vfs hooks try to set mount options for filesystems which use text mount data. This does not currently support NFSv4 as that functionality did not exist in previous kernels and thus there is no regression. I will be adding the needed code, which I believe to be the exact same as the v3 code, in nfs4_get_sb for 2.6.26. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: James Morris <jmorris@namei.org>	2008-03-06 08:40:59 +11:00
Eric Paris	e000752989	LSM/SELinux: Interfaces to allow FS to control mount options Introduce new LSM interfaces to allow an FS to deal with their own mount options. This includes a new string parsing function exported from the LSM that an FS can use to get a security data blob and a new security data blob. This is particularly useful for an FS which uses binary mount data, like NFS, which does not pass strings into the vfs to be handled by the loaded LSM. Also fix a BUG() in both SELinux and SMACK when dealing with binary mount data. If the binary mount data is less than one page the copy_page() in security_sb_copy_data() can cause an illegal page fault and boom. Remove all NFSisms from the SELinux code since they were broken by past NFS changes. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Acked-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: James Morris <jmorris@namei.org>	2008-03-06 08:40:53 +11:00
Krzysztof Oledzki	51f39eae14	[SCSI] mpt fusion: don't oops if NumPhys==0 Don't oops if NumPhys==0, instead return -ENODEV. This patch fixes http://bugzilla.kernel.org/show_bug.cgi?id=9909 Signed-off-by: Krzysztof Piotr Oledzki <ole@ans.pl> Acked-by: Eric Moore <Eric.Moore@lsi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-03-05 14:57:57 -06:00
Sam Ravnborg	f6ebef30e2	[CPUFREQ] fix section mismatch warnings Fix the following warnings: WARNING: vmlinux.o(.text+0xfe6711): Section mismatch in reference from the function cpufreq_unregister_driver() to the variable .cpuinit.data:cpufreq_cpu_notifier WARNING: vmlinux.o(.text+0xfe68af): Section mismatch in reference from the function cpufreq_register_driver() to the variable .cpuinit.data:cpufreq_cpu_notifier WARNING: vmlinux.o(.exit.text+0xc4fa): Section mismatch in reference from the function cpufreq_stats_exit() to the variable .cpuinit.data:cpufreq_stat_cpu_notifier The warnings were caused by references to unregister_hotcpu_notifier() from normal functions or exit functions. This is flagged by modpost as a potential error because it does not know that for the non HOTPLUG_CPU scenario the unregister_hotcpu_notifier() is a nop. Silence the warning by replacing the __initdata annotation with a __refdata annotation. Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Dave Jones <davej@codemonkey.org.uk>	2008-03-05 14:45:31 -05:00
Dave Jones	0e5aa8d621	[CPUFREQ] Remove debugging message from e_powersaver We don't need to printk a message every time we transition. Leave the code there, but ifdef'd out, as it's useful when adding support for new processors. Reported-by: Petr Titěra <P.Titera@century.cz> Signed-off-by: Dave Jones <davej@redhat.com>	2008-03-05 14:45:31 -05:00
Dave Jones	a07530b445	[CPUFREQ] Fix missing cpufreq_cpu_put() call in ->store refactor to use gotos instead of explicit exit paths Signed-off-by: Dave Jones <davej@redhat.com>	2008-03-05 14:45:31 -05:00
Dave Jones	0db4a8a99f	[CPUFREQ] Fix missing cpufreq_cpu_put() call in ->show refactor to use gotos instead of explicit exit paths Signed-off-by: Dave Jones <davej@redhat.com>	2008-03-05 14:45:31 -05:00
Mike Christie	45ab33b6c1	[SCSI] iscsi class: regression - fix races with state manipulation and blocking/unblocking For qla4xxx, we could be starting a session, but some error (network, target, IO from a device that got started, etc) could cause the session to fail, and during that the block/unblock and state manipulation could race with each other. This patch just has those operations done in the single threaded iscsi eh work queue, so that way they are serialized. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-03-05 12:04:09 -06:00
Mike Christie	024f801f52	[SCSI] qla4xxx: regression - add start scan callout We are seeing EXIST errors from sysfs during device addition. We need a start scan callout so we do not start scanning sessions found during hba setup, before the async scsi scan code is ready. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Acked-by: David C Somayajulu <david.somayajulu@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-03-05 12:03:54 -06:00
Mike Christie	50a29aec9c	[SCSI] qla4xxx: fix host reset dpc race The host reset callout could be starting to reset the hba at the same time the dpc thread is. This creates lots of problems because they both want to do weird things with the firmware and interrupts, etc. This patch just has the host reset function fully shut down the dpc thread before resetting the hba. This patch also moves the setting of the session online bit to fix a potential race with the dpc thread and iscsi recovery thread. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Acked-by: David C Somayajulu <david.somayajulu@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-03-05 12:03:17 -06:00
Jeff Garzik	a878539ef9	ahci: work around ATI SB600 h/w quirk This addresses the recent ATI SB600 errata, where the hardware does not like 256-length PRD entries during FPDMA (aka NCQ). It hurts performance on SB600, but it is more important to get a correct patch eliminating the data corruption/lockups, and then later on tune for performance. We simply limit each command to a maximum of 255 sectors, on SB600. Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2008-03-05 07:53:06 -05:00
Alan Cox	6ddd68615a	pata_hpt*, pata_serverworks: fix UDMA masking When masking, mask out the modes that are unsupported not the ones that are supported. This makes life happier. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-03-05 07:46:34 -05:00
Hongjie Yang	583b33bc83	[S390] incorrect reipl nss name. /sys/firmware/reipl/nss/name contains the nss name when defsys or savesys command has been executed. If the defsys or savesys command fails the kernel_nss_name has to be cleared since a reipl on that nss name won't be possible. Signed-off-by: Hongjie Yang <hongjie@us.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-03-05 12:37:20 +01:00
Michael Holzheu	208e559155	[S390] Load disabled wait psw if reipl fails. Normally this should not happen, but it's cleaner to do it that way. Signed-off-by: Michael Holzheu <holzheu@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-03-05 12:37:19 +01:00
Heiko Carstens	684de39bd7	[S390] Fix IPL from NSS. IPL from NSS didn't work because the memory detection routine omits any memory sections with a size lower than what MAX_ORDER defines. This causes the detection routine to skip the first memory segment which has a size of 1MB. Which later on will let the kernel think that there is no memory available at all. Since in addition the z/VM memory increment size is 1MB force MAX_ORDER to be 9, so we can support 1MB segments. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-03-05 12:37:19 +01:00
Ralph Wuerthner	faa582ca80	[S390] zcrypt: fix ap_device_list handling In ap_device_probe() we can add the new ap device to the internal device list only if the device probe function successfully returns. Otherwise we might end up with an invalid device in the internal ap device list. Signed-off-by: Ralph Wuerthner <rwuerthn@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-03-05 12:37:19 +01:00
Christian Borntraeger	fa331ffc56	[S390] sclp_vt220: speed up console output for interactive work Currently an output buffer can wait up to HZ/2 until the buffer is flushed. The wait time is noticeable in interactive tools like mc. Change the value to HZ/20, which seems enough for interactive work. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-03-05 12:37:18 +01:00
Stefan Weinhuber	a5e2383991	[S390] dasd: fix reference counting in display method for proc/dasd/devices Using the /proc/dasd/devices interface leaves the reference counter of alias devices in an inconsistent state. A process that tries to set such a device offline afterwards will hang. The dasd_devices_show function returns immediately for alias devices and this code path was missing a dasd_put_device call. Signed-off-by: Stefan Weinhuber <wein@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-03-05 12:37:18 +01:00
Stefan Weinhuber	5c12f2406c	[S390] dasd: let dasd erp matching recognize alias recovery When a request fails that was started on an alias device then the first recovery step is to retry it on the base device. If the recovery request fails again with the same symptoms, the next step should not be a simple retry, but should be a proper recovery based on sense data, etc. To do so, the dasd recovery functions need to recognize the alias recovery step in the erp chain by comparing the start devices. Signed-off-by: Stefan Weinhuber <wein@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-03-05 12:37:18 +01:00
Heiko Carstens	98c7b388af	[S390] Get rid of memcpy gcc warning workaround. Compile smp.o with -Wno-nonnull so gcc stops warning about memcpy being used with a null parameter. Also remove the workaround code and use a char * cast instead of a void * cast to do computations. Cc: Bastian Blank <bastian@waldi.eu.org> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-03-05 12:37:17 +01:00
Heiko Carstens	5ccd0e43bb	[S390] idle: Fix machine check handling in idle loop. If a machine check handling is pending when the idle loop is entered default_idle will be left with timer ticks and virtual timer disabled. Fix this by "calling" the idle_chain. Also a BUG_ON(!in_interrupt) in start_hz_timer must be removed since the function now gets called from non interrupt context as well. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-03-05 12:37:17 +01:00
Martin Schwidefsky	9361a492cd	[S390] Update default configuration. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-03-05 12:37:16 +01:00
Linus Torvalds	29e8c3c304	Linux 2.6.25-rc4	2008-03-04 20:33:54 -08:00
Pavel Roskin	9b37ccfc63	module: allow ndiswrapper to use GPL-only symbols A change after 2.6.24 broke ndiswrapper by accidentally removing its access to GPL-only symbols. Revert that change and add comments about the reasons why ndiswrapper and driverloader are treated in a special way. Signed-off-by: Pavel Roskin <proski@gnu.org> Acked-by: Greg KH <gregkh@suse.de> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jon Masters <jonathan@jonmasters.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-03-04 20:29:40 -08:00
Linus Torvalds	27d0483aa1	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (22 commits) [IPCONFIG]: The kernel gets no IP from some DHCP servers b43legacy: Fix module init message rndis_wlan: fix broken data copy libertas: compare the current command with response libertas: fix sanity check on sequence number in command response p54: fix eeprom parser length sanity checks p54: fix EEPROM structure endianness ssb: Add pcibios_enable_device() return value check rc80211-pid: fix rate adjustment [ESP]: Add select on AUTHENC [TCP]: Improve ipv4 established hash function. [NETPOLL]: Revert two bogus cleanups that broke netconsole. [PPPOL2TP]: Add missing sock_put() in pppol2tp_tunnel_closeall() Subject: [PPPOL2TP] add missing sock_put() in pppol2tp_recv_dequeue() [BLUETOOTH]: l2cap info_timer delete fix in hci_conn_del [NET]: Fix race in generic address resolution. iucv: fix build error on !SMP [TCP]: Must count fack_count also when skipping [TUN]: Fix RTNL-locking in tun/tap driver [SCTP]: Use proc_create to setup de->proc_fops. ...	2008-03-04 20:20:58 -08:00
Linus Torvalds	665c1ef836	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6: [SPARC]: Fix link errors with gcc-4.3 sparc64: replace remaining __FUNCTION__ occurances sparc: replace remaining __FUNCTION__ occurances [SPARC]: Add reboot_command[] extern decl to asm/system.h [SPARC]: Mark linux_sparc_{fpu,chips} static.	2008-03-04 20:20:32 -08:00
Stephen Hemminger	dea75bdfa5	[IPCONFIG]: The kernel gets no IP from some DHCP servers From: Stephen Hemminger <shemminger@linux-foundation.org> Based upon a patch by Marcel Wappler: This patch fixes a DHCP issue of the kernel: some DHCP servers (i.e. in the Linksys WRT54Gv5) are very strict about the contents of the DHCPDISCOVER packet they receive from clients. Table 5 in RFC2131 page 36 requests the fields 'ciaddr' and 'siaddr' MUST be set to '0'. These DHCP servers ignore Linux kernel's DHCP discovery packets with these two fields set to '255.255.255.255' (in contrast to popular DHCP clients, such as 'dhclient' or 'udhcpc'). This leads to a not booting system. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-03-04 17:03:49 -08:00
David S. Miller	3123e666ea	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-2.6	2008-03-04 16:44:01 -08:00
Linus Torvalds	71ca44dac4	Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6: [IA64] fix ia64 kprobes compilation [IA64] move gcc_intrin.h from header-y to unifdef-y [IA64] workaround tiger ia64_sal_get_physical_id_info hang [IA64] move defconfig to arch/ia64/configs/ [IA64] Fix irq migration in multiple vector domain [IA64] signal(ia64_ia32): add a signal stack overflow check [IA64] signal(ia64): add a signal stack overflow check [IA64] CONFIG_SGI_SN2 - auto select NUMA and ACPI_NUMA	2008-03-04 16:39:23 -08:00
Linus Torvalds	2c6f2db13a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-2.6: debugfs: fix sparse warnings Driver core: Fix cleanup when failing device_add(). driver core: Remove dpm_sysfs_remove() from error path of device_add() PM: fix new mutex-locking bug in the PM core PM: Do not acquire device semaphores upfront during suspend kobject: properly initialize ksets sysfs: CONFIG_SYSFS_DEPRECATED fix driver core: fix up Kconfig text for CONFIG_SYSFS_DEPRECATED	2008-03-04 16:37:35 -08:00
Linus Torvalds	12f981f902	Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/pci-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/pci-2.6: pci: hotplug: pciehp: fix error code path in hpc_power_off_slot PCI: Add DECLARE_PCI_DEVICE_TABLE macro PCI: fix up error messages for pci_bus registering PCI: fix section mismatch warning in pci_scan_child_bus PCI: consolidate duplicated MSI enable functions PCI: use dev_printk in quirk messages	2008-03-04 16:37:10 -08:00
Linus Torvalds	10955d2251	Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: USB: ftdi_sio - really enable EM1010PC USB: remove incorrect struct class_device from the printer gadget USB: pxa2xx_udc: fix misuse of clock enable/disable calls USB: ftdi_sio: Workaround for broken Matrix Orbital serial port USB: Add support for AXESSTEL MV110H CDMA modem usb-storage: update earlier scatter-gather bug fix USB: isp116x: fix enumeration on boot USB: ehci: handle large bulk URBs correctly (again) USB: spruce up the device blacklist USB: fix comment of struct usb_interface USB: update Kconfig entry for USB_SUSPEND usb: Add support for the mos7820/7840-based B&B USB/RS485 converter to mos7840.c	2008-03-04 16:36:53 -08:00
Masami Hiramatsu	b2a5cd6938	kprobes: fix a null pointer bug in register_kretprobe() Fix a bug in register_kretprobe() which does not check rp->kp.symbol_name == NULL before calling kprobe_lookup_name. For maintainability, this introduces kprobe_addr helper function which resolves addr field. It is used by register_kprobe and register_kretprobe. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Jim Keniston <jkenisto@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-03-04 16:35:19 -08:00
Randy Dunlap	1913130553	input: add I2C to config since the driver makes several i2c() calls Add to help text that the Intel I2C ICH (i801) driver is also needed for this kernel. Add LEDS_CLASS to config since the driver makes led_classdev_() calls: ERROR: "led_classdev_register" [drivers/input/misc/apanel.ko] undefined! ERROR: "__led_classdev_unregister" [drivers/input/misc/apanel.ko] undefined! Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-03-04 16:35:18 -08:00
Josef Bacik	92587216f8	ext3: fix mount option parsing The "resize" option won't be noticed as it comes after the NULL option, so if you try to mount (or in this case remount) with that option it won't be recognized. Signed-off-by: Josef Bacik <jbacik@redhat.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-03-04 16:35:18 -08:00
Nishanth Aravamudan	348e1e04b5	hugetlb: fix pool shrinking while in restricted cpuset Adam Litke noticed that currently we grow the hugepage pool independent of any cpuset the running process may be in, but when shrinking the pool, the cpuset is checked. This leads to inconsistency when shrinking the pool in a restricted cpuset -- an administrator may have been able to grow the pool on a node restricted by a containing cpuset, but they cannot shrink it there. There are two options: either prevent growing of the pool outside of the cpuset or allow shrinking outside of the cpuset. From previous discussions on linux-mm, /proc/sys/vm/nr_hugepages is an administrative interface that should not be restricted by cpusets. So allow shrinking the pool by removing pages from nodes outside of current's cpuset. Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Acked-by: Adam Litke <agl@us.ibm.com> Cc: William Irwin <wli@holomorphy.com> Cc: Lee Schermerhorn <Lee.Schermerhonr@hp.com> Cc: Christoph Lameter <clameter@sgi.com> Cc: Paul Jackson <pj@sgi.com> Cc: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-03-04 16:35:18 -08:00
Adam Litke	ac09b3a151	hugetlb: close a difficult to trigger reservation race A hugetlb reservation may be inadequately backed in the event of racing allocations and frees when utilizing surplus huge pages. Consider the following series of events in processes A and B: A) Allocates some surplus pages to satisfy a reservation B) Frees some huge pages A) A notices the extra free pages and drops hugetlb_lock to free some of its surplus pages back to the buddy allocator. B) Allocates some huge pages A) Reacquires hugetlb_lock and returns from gather_surplus_huge_pages() Avoid this by committing the reservation after pages have been allocated but before dropping the lock to free excess pages. For parity, release the reservation in return_unused_surplus_pages(). This patch also corrects the cpuset_mems_nr() error path in hugetlb_acct_memory(). If the cpuset check fails, uncommit the reservation, but also be sure to return any surplus huge pages that may have been allocated to back the failed reservation. Thanks to Andy Whitcroft for discovering this. Signed-off-by: Adam Litke <agl@us.ibm.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Andy Whitcroft <apw@shadowen.org> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: Andy Whitcroft <apw@shadowen.org> Cc: Mel Gorman <mel@csn.ul.ie> Cc: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-03-04 16:35:18 -08:00
K.Tanaka	a07e6ab41b	md: the md RAID10 resync thread could cause a md RAID10 array deadlock This message describes another issue about md RAID10 found by testing the 2.6.24 md RAID10 using new scsi fault injection framework. Abstract: When a scsi error results in disabling a disk during RAID10 recovery, the resync threads of md RAID10 could stall. This case, the raid array has already been broken and it may not matter. But I think stall is not preferable. If it occurs, even shutdown or reboot will fail because of resource busy. The deadlock mechanism: The r10bio_s structure has a "remaining" member to keep track of BIOs yet to be handled when recovering. The "remaining" counter is incremented when building a BIO in sync_request() and is decremented when finish a BIO in end_sync_write(). If building a BIO fails for some reasons in sync_request(), the "remaining" should be decremented if it has already been incremented. I found a case where this decrement is forgotten. This causes a md_do_sync() deadlock because md_do_sync() waits for md_done_sync() called by end_sync_write(), but end_sync_write() never calls md_done_sync() because of the "remaining" counter mismatch. For example, this problem would be reproduced in the following case: Personalities : [raid10] md0 : active raid10 sdf1[4] sde1[5](F) sdd1[2] sdc1[1] sdb1[6](F) 3919616 blocks 64K chunks 2 near-copies [4/2] [_UU_] [>....................] recovery = 2.2% (45376/1959808) finish=0.7min speed=45376K/sec This case, sdf1 is recovering, sdb1 and sde1 are disabled. An additional error with detaching sdd will cause a deadlock. md0 : active raid10 sdf1[4] sde1[5](F) sdd1[6](F) sdc1[1] sdb1[7](F) 3919616 blocks 64K chunks 2 near-copies [4/1] [_U__] [=>...................] recovery = 5.0% (99520/1959808) finish=5.9min speed=5237K/sec 2739 ? S< 0:17 [md0_raid10] 28608 ? D< 0:00 [md0_resync] 28629 pts/1 Ss 0:00 bash 28830 pts/1 R+ 0:00 ps ax 31819 ? D< 0:00 [kjournald] The resync thread keeps working, but actually it is deadlocked. Patch: By this patch, the remaining counter will be decremented if needed. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-03-04 16:35:18 -08:00
NeilBrown	1c830532f6	md: fix possible raid1/raid10 deadlock on read error during resync Thanks to K.Tanaka and the scsi fault injection framework, here is a fix for another possible deadlock in raid1/raid10 error handling. If a read request returns an error while a resync is happening and a resync request is pending, the attempt to fix the error will block until the resync progresses, and the resync will block until the read request completes. Thus a deadlock. This patch fixes the problem. Cc: "K.Tanaka" <k-tanaka@ce.jp.nec.com> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-03-04 16:35:18 -08:00
Keld Simonsen	8ed3a19563	md: don't attempt read-balancing for raid10 'far' layouts This patch changes the disk to be read for layout "far > 1" to always be the disk with the lowest block address. Thus the chunks to be read will always be (for a fully functioning array) from the first band of stripes, and the raid will then work as a raid0 consisting of the first band of stripes. Some advantages: The fastest part which is the outer sectors of the disks involved will be used. The outer blocks of a disk may be as much as 100 % faster than the inner blocks. Average seek time will be smaller, as seeks will always be confined to the first part of the disks. Mixed disks with different performance characteristics will work better, as they will work as raid0, the sequential read rate will be number of disks involved times the IO rate of the slowest disk. If a disk is malfunctioning, the first disk which is working, and has the lowest block address for the logical block will be used. Signed-off-by: Keld Simonsen <keld@dkuug.dk> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-03-04 16:35:18 -08:00
NeilBrown	27c529bb8e	md: lock access to rdev attributes properly When we access attributes of an rdev (component device on an md array) through sysfs, we really need to lock the array against concurrent changes. We currently do that when we change an attribute, but not when we read an attribute. We need to lock when reading as well else rdev->mddev could become NULL while we are accessing it. So add appropriate locking (mddev_lock) to rdev_attr_show. rdev_size_store requires some extra care as well as it needs to unlock the mddev while scanning other mddevs for overlapping regions. We currently assume that rdev->mddev will still be unchanged after the scan, but that cannot be certain. So take a copy of rdev->mddev for use at the end of the function. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-03-04 16:35:18 -08:00
NeilBrown	2515619823	md: make sure a reshape is started when device switches to read-write A resync/reshape/recovery thread will refuse to progress when the array is marked read-only. So whenever the array is marked not read-only, it is important to wake up the resync thread. There is one place we didn't do this. The problem manifests if the start_ro module parameter is set, and a raid5 array that is in the middle of a reshape (restripe) is started. The array will initially be semi-read-only (meaning it acts like it is readonly until the first write). So the reshape will not proceed. On the first write, the array will become read-write, but the reshape will not be started, and there is no event which will ever restart that thread. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-03-04 16:35:18 -08:00
NeilBrown	d0fae18f1b	md: clean up irregularity with raid autodetect When a raid1 array is stopped, all components currently get added to the list for auto-detection. However we should really only add components that were found by autodetection in the first place. So add a flag to record that information, and use it. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-03-04 16:35:18 -08:00
NeilBrown	a1801f858e	md: guard against possible bad array geometry in v1 metadata Make sure the data doesn't start before the end of the superblock when the superblock is at the start of the device. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-03-04 16:35:17 -08:00
NeilBrown	8311c29d40	md: reduce CPU wastage on idle md array with a write-intent bitmap On an md array with a write-intent bitmap, a thread wakes up every few seconds and scans the bitmap looking for work to do. If the array is idle, there will be no work to do, but a lot of scanning is done to discover this. So cache the fact that the bitmap is completely clean, and avoid scanning the whole bitmap when the cache is known to be clean. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-03-04 16:35:17 -08:00
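
The idea in commit 8311c29d40 (remember that the write-intent bitmap is completely clean so the periodic scan can be skipped) can be sketched roughly as follows. This is a minimal illustration in plain C, not the md driver's code: `struct intent_bitmap`, `bitmap_mark_dirty()`, `bitmap_daemon_scan()` and the one-byte-per-chunk layout are all hypothetical names and simplifications chosen for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified write-intent bitmap: one byte per chunk. */
struct intent_bitmap {
    unsigned char *bits;   /* non-zero means the chunk has pending work */
    size_t nbytes;
    bool allclean;         /* cached result: last full scan left nothing to do */
};

/* Any new write intent invalidates the cached "all clean" state. */
void bitmap_mark_dirty(struct intent_bitmap *bm, size_t idx)
{
    bm->bits[idx] = 1;
    bm->allclean = false;
}

/*
 * Periodic scan. Returns how many entries were examined, so the saving
 * is visible to the caller. "Handling" an entry is modelled here as
 * simply clearing it.
 */
size_t bitmap_daemon_scan(struct intent_bitmap *bm)
{
    size_t i;

    if (bm->allclean)
        return 0;              /* idle array: skip the whole scan */

    for (i = 0; i < bm->nbytes; i++)
        bm->bits[i] = 0;       /* process and clear pending work */

    bm->allclean = true;       /* nothing left until the next mark_dirty */
    return bm->nbytes;
}
```

With this shape, an idle array pays for one full scan and then only a flag check per wake-up until something marks the bitmap dirty again.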

86973 Commits
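
Commits a07530b445 and 0db4a8a99f describe fixing a missing cpufreq_cpu_put() by refactoring to gotos instead of explicit exit paths. The general pattern can be sketched as below, under loud assumptions: `struct policy_ref`, `policy_get()`, `policy_put()` and `show_value()` are hypothetical stand-ins written for this example and are not the cpufreq API.

```c
#include <stddef.h>

/* Hypothetical reference-counted object standing in for a policy. */
struct policy_ref {
    int refcount;
    int value;
};

struct policy_ref *policy_get(struct policy_ref *p)
{
    if (p)
        p->refcount++;
    return p;
}

void policy_put(struct policy_ref *p)
{
    p->refcount--;
}

/*
 * Every path taken after a successful policy_get() leaves through the
 * "fail" label, so the reference is always dropped. An early
 * "return -1" in the middle of the function would leak it instead.
 */
int show_value(struct policy_ref *p, int have_handler, int *out)
{
    int ret = -1;

    if (!policy_get(p))
        goto no_policy;

    if (!have_handler)
        goto fail;          /* error path still drops the reference */

    *out = p->value;
    ret = 0;
fail:
    policy_put(p);
no_policy:
    return ret;
}
```

The design choice is that the cleanup lives in one place: adding a new error check later only needs a `goto fail`, not a remembered put call.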