28530 Commits

Author SHA1 Message Date
Bartlomiej Zolnierkiewicz
6b5cde3629 cmd64x: set IDE_HFLAG_SERIALIZE explictly for CMD646
* Set IDE_HFLAG_SERIALIZE explictly for CMD646.

* Remove no longer needed ide_cmd646 chipset type (which has
  a nice side-effect of fixing handling of unexpected IRQs).

Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-12-29 20:27:32 +01:00
Bartlomiej Zolnierkiewicz
27c01c2db0 ide-cd: remove obsolete seek optimization
It doesn't make much sense nowadays and is problematic on some drives.

Cc: Borislav Petkov <petkovbb@googlemail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-12-29 20:27:32 +01:00
Bartlomiej Zolnierkiewicz
2a2ca6a961 ide: replace the global ide_lock spinlock by per-hwgroup spinlocks (v2)
Now that (almost) all host drivers have been fixed not to abuse ide_lock
and core code usage of ide_lock has been sanitized we may safely replace
ide_lock by per-hwgroup locks.

This patch is partially based on earlier patch from Ravikiran G Thirumalai.

While at it:
- don't use deprecated HWIF() and HWGROUP() macros
- update locking documentation in ide.h

v2:
Add missing spin_lock_init(&hwgroup->lock).  (Noticed by Elias Oltmanns)

Cc: Vaibhav V. Nivargi <vaibhav.nivargi@gmail.com>
Cc: Alok N. Kataria <alokk@calsoftinc.com>
Cc: Shai Fultheim <shai@scalex86.org>
Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Cc: Elias Oltmanns <eo@nebensachen.de>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-12-29 20:27:31 +01:00
Robert Love
85b4aa4926 [SCSI] fcoe: Fibre Channel over Ethernet
Encapsulation protocol for running Fibre Channel over Ethernet interfaces.
Creates virtual Fibre Channel host adapters using libfc.

This layer is the LLD to the scsi-ml. It allocates the Scsi_Host, utilizes
libfc for Fibre Channel protocol processing and interacts with netdev to
send/receive Ethernet packets.

Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-12-29 11:24:33 -06:00
Robert Love
42e9a92fe6 [SCSI] libfc: A modular Fibre Channel library
libFC is composed of 4 blocks supported by an exchange manager
and a framing library. The upper 4 layers are fc_lport, fc_disc,
fc_rport and fc_fcp. A LLD that uses libfc could choose to
either use libfc's block, or using the transport template
defined in libfc.h, override one or more blocks with its own
implementation.

The EM (Exchange Manager) manages exhcanges/sequences for all
commands- ELS, CT and FCP.

The framing library frames ELS and CT commands.

The fc_lport block manages the library's representation of the
host's FC enabled ports.

The fc_disc block manages discovery of targets as well as
handling changes that occur in the FC fabric (via. RSCN events).

The fc_rport block manages the library's representation of other
entities in the FC fabric. Currently the library uses this block
for targets, its peer when in point-to-point mode and the
directory server, but can be extended for other entities if
needed.

The fc_fcp block interacts with the scsi-ml and handles all
I/O.

Signed-off-by: Robert Love <robert.w.love@intel.com>
[jejb: added include of delay.h to fix ppc64 compile prob spotted by sfr]
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-12-29 11:24:33 -06:00
Robert Love
f032c2f7cd [SCSI] FC protocol definition header files
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-12-29 11:24:32 -06:00
FUJITA Tomonori
f4f4e47e4a [SCSI] add residual argument to scsi_execute and scsi_execute_req
scsi_execute() and scsi_execute_req() discard the residual length
information. Some callers need it. This adds residual argument
(optional) to scsi_execute and scsi_execute_req.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-12-29 11:24:24 -06:00
Mike Christie
6df19a791b [SCSI] libiscsi_tcp: support padding offload
cxgb3i does not offload the processing of the header,
but it will always process the padding. This patch
adds a padding offload flag to detect when the LLD
supports this.

The patch also modifies the header processing so that
we do not try to read/bypass the header dugest in the
skb. cxgb3i will not include it with the header like
with other offload cards.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-12-29 11:24:23 -06:00
Mike Christie
2ff79d52d5 [SCSI] libiscsi: pass opcode into alloc_pdu callout
We do not need to allocate a itt for data_out, so this
passes the opcode to the alloc_pdu callout.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-12-29 11:24:23 -06:00
Mike Christie
262ef63627 [SCSI] libiscsi: allow drivers to modify the itt sent to the target
bnx2i and cxgb3i need to encode LLD info in the itt so that
the firmware/hardware can process the pdu. This patch allows
the LLDs to encode info in the task->hdr->itt that they
setup in the alloc_pdu callout (any resources that are allocated
can be freed with the pdu in the cleanup_task callout). If
the LLD encodes info in the itt they should implement a
parse_pdu_itt callout. If parse_pdu_itt is not implemented
libiscsi will do the right thing for the LLD.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-12-29 11:24:22 -06:00
Mike Christie
a081c13e39 [SCSI] iscsi_tcp: split module into lib and lld
As explained in the previous mails, cxgb3i needs iscsi_tcp's
r2t/data_out and data_in procesing so this just moves functions
that both drivers want to use to a new module libiscsi_tcp. The
next patch will hook iscsi_tcp in.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-12-29 11:24:22 -06:00
Mike Christie
577577da6d [SCSI] libiscsi: prepare libiscsi for new offload engines by modifying unsol data code
cxgb3i offloads data transfers. It does not offload the entire scsi/iscsi
procssing like qla4xxx and it does not offload the iscsi sequence
processing like how bnx2i does. cxgb3i relies on iscsi_tcp for the
seqeunce handling so this changes how we transfer unsolicitied data by
adding a common r2t struct and helpers.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-12-29 11:24:20 -06:00
Mike Christie
63c62f1cb9 [SCSI] iscsi_tcp: prepare helpers for LLDs that can offload some operations
cxgb3i is unlike qla4xxx and bnx2i in that it does not offload entire
scsi commands or iscsi sequences. Instead it only offloads the transfer
of a ISCSI DATA_IN pdu's data, the digests and padding. This patch fixes up the
iscsi tcp recv path so that it exports its skb recv processing so
cxgb3i and other drivers can call them. All they have to do is pass
the function the skb with the hdr or data pdu header and this function
will do the rest.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-12-29 11:24:20 -06:00
James Bottomley
b29f841378 [SCSI] remove timeout from struct scsi_device
by removing the unused timeout parameter we ensure a compile failure if
anyone is accidentally still using it rather than the block timeout.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-12-29 11:24:20 -06:00
Jiri Slaby
d61c72e52b DMI: add dmi_match
Add a wrapper for testing system_info which will handle also NULL
system infos.

This will be used by the ata PIIX driver.

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Cc: Alexandru Romanescu <a_romanescu@yahoo.co.uk>
Cc: Tejun Heo <tj@kernel.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-12-29 07:39:34 -05:00
Yinghai Lu
43a256322a sparseirq: move __weak symbols into separate compilation unit
GCC has a bug with __weak alias functions: if the functions are in
the same compilation unit as their call site, GCC can decide to
inline them - and thus rob the linker of the opportunity to override
the weak alias with the real thing.

So move all the IRQ handling related __weak symbols to kernel/irq/chip.c.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-29 12:15:49 +01:00
Pekka Enberg
3c506efd7e Merge branch 'topic/failslab' into for-linus
Conflicts:

	mm/slub.c

Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2008-12-29 11:47:05 +02:00
Pekka Enberg
fd37617e69 Merge branches 'topic/fixes', 'topic/cleanups' and 'topic/documentation' into for-linus 2008-12-29 11:45:47 +02:00
Pascal Terjan
dfcd361028 slab: Fix comment on #endif
This #endif in slab.h is described as closing the inner block while it's for
the big CONFIG_NUMA one. That makes reading the code a bit harder.

This trivial patch fixes the comment.

Signed-off-by: Pascal Terjan <pterjan@mandriva.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2008-12-29 11:40:56 +02:00
Akinobu Mita
773ff60e84 SLUB: failslab support
Currently fault-injection capability for SLAB allocator is only
available to SLAB. This patch makes it available to SLUB, too.

[penberg@cs.helsinki.fi: unify slab and slub implementations]
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Matt Mackall <mpm@selenic.com>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2008-12-29 11:27:46 +02:00
Eric Anholt
fede5c91c4 drm: Add a debug node for vblank state.
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dave Airlie <airlied@linux.ie>
2008-12-29 17:47:27 +10:00
Kristian Høgsberg
3c4fdcfb29 drm: pin new and unpin old buffer when setting a mode.
This removes the requirement for user space to pin a buffer before
setting a mode that is backed by the pixels from that buffer.

Signed-off-by: Kristian Høgsberg <krh@redhat.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dave Airlie <airlied@linux.ie>
2008-12-29 17:47:27 +10:00
Eric Anholt
8d391aa410 drm/i915: Add missing userland definitions for gem init/execbuffer.
fdo bug #19132.

Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2008-12-29 17:47:25 +10:00
Dave Airlie
dfef245922 i915/drm: provide compat defines for userspace for certain struct members.
Painfully userspace started using new names that were never actually to be
used from the external repo.

Also fill out the gaps in the structure for old/new userspace compat

Add compat defines for these structs.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2008-12-29 17:47:25 +10:00
Kristian H�gsberg
0c7c266475 drm: drop DRM_IOCTL_MODE_REPLACEFB, add+remove works just as well.
The replace fb ioctl replaces the backing buffer object for a modesetting
framebuffer object.  This can be acheived by just creating a new
framebuffer backed by the new buffer object, setting that for the crtcs
in question and then removing the old framebuffer object.

Signed-off-by: Kristian Hogsberg <krh@redhat.com>
Acked-by: Jakob Bornecrantz <jakob@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2008-12-29 17:47:25 +10:00
Jakob Bornecrantz
e0c8463a8b drm: sanitise drm modesetting API + remove unused hotplug
The initially merged modesetting API has some uglies in it, this
cleans up the struct members and ioctl ordering for initial submission.

It also removes the unneeded hotplug infrastructure.

airlied:- I've pulled this patch in from git modesetting-gem tree.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2008-12-29 17:47:25 +10:00
Jesse Barnes
79e539453b DRM: i915: add mode setting support
This commit adds i915 driver support for the DRM mode setting APIs.
Currently, VGA, LVDS, SDVO DVI & VGA, TV and DVO LVDS outputs are
supported.  HDMI, DisplayPort and additional SDVO output support will
follow.

Support for the mode setting code is controlled by the new 'modeset'
module option.  A new config option, CONFIG_DRM_I915_KMS controls the
default behavior, and whether a PCI ID list is built into the module for
use by user level module utilities.

Note that if mode setting is enabled, user level drivers that access
display registers directly or that don't use the kernel graphics memory
manager will likely corrupt kernel graphics memory, disrupt output
configuration (possibly leading to hangs and/or blank displays), and
prevent panic/oops messages from appearing.  So use caution when
enabling this code; be sure your user level code supports the new
interfaces.

A new SysRq key, 'g', provides emergency support for switching back to
the kernel's framebuffer console; which is useful for testing.

Co-authors: Dave Airlie <airlied@linux.ie>, Hong Liu <hong.liu@intel.com>

Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2008-12-29 17:47:23 +10:00
Dave Airlie
f453ba0460 DRM: add mode setting support
Add mode setting support to the DRM layer.

This is a fairly big chunk of work that allows DRM drivers to provide
full output control and configuration capabilities to userspace.  It was
motivated by several factors:
  - the fb layer's APIs aren't suited for anything but simple
    configurations
  - coordination between the fb layer, DRM layer, and various userspace
    drivers is poor to non-existent (radeonfb excepted)
  - user level mode setting drivers makes displaying panic & oops
    messages more difficult
  - suspend/resume of graphics state is possible in many more
    configurations with kernel level support

This commit just adds the core DRM part of the mode setting APIs.
Driver specific commits using these new structure and APIs will follow.

Co-authors: Jesse Barnes <jbarnes@virtuousgeek.org>, Jakob Bornecrantz <jakob@tungstengraphics.com>
Contributors: Alan Hourihane <alanh@tungstengraphics.com>, Maarten Maathuis <madman2003@gmail.com>

Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2008-12-29 17:47:23 +10:00
Jesse Barnes
de151cf67c drm/i915: add GEM GTT mapping support
Use the new core GEM object mapping code to allow GTT mapping of GEM
objects on i915.  The fault handler will make sure a fence register is
allocated too, if the object in question is tiled.

Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2008-12-29 17:47:23 +10:00
Jesse Barnes
a2c0a97b78 drm: GEM mmap support
Add core support for mapping of GEM objects.  Drivers should provide a
vm_operations_struct if they want to support page faulting of objects.
The code for handling GEM object offsets was taken from TTM, which was
written by Thomas Hellström.

Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2008-12-29 17:47:22 +10:00
Vegard Nossum
1147c9cdd0 drm: fix leak of uninitialized data to userspace
...so drm_getunique() is trying to copy some uninitialized data to
userspace. The ECX register contains the number of words that are
left to copy -- so there are 5 * 4 = 20 bytes left. The offset of the
first uninitialized byte (counting from the start of the string) is
also 20 (i.e. 0xf65d2294&((1 << 5)-1) == 20). So somebody tried to
copy 40 bytes when the string was only 19 long.

In drm_set_busid() we have this code:

        dev->unique_len = 40;
        dev->unique = drm_alloc(dev->unique_len + 1, DRM_MEM_DRIVER);
      ...
        len = snprintf(dev->unique, dev->unique_len, pci:%04x:%02x:%02x.%d",

...so it seems that dev->unique is never updated to reflect the
actual length of the string. The remaining bytes (20 in this case)
are random uninitialized bytes that are copied into userspace.

This patch fixes the problem by setting dev->unique_len after the
snprintf().

airlied- I've had to fix this up to store the alloced size so
we have it for drm_free later.

Reported-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Signed-off-by: Vegard Nossum <vegardno@thuin.ifi.uio.no>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2008-12-29 17:47:22 +10:00
Dave Airlie
7c1c2871a6 drm: move to kref per-master structures.
This is step one towards having multiple masters sharing a drm
device in order to get fast-user-switching to work.

It splits out the information associated with the drm master
into a separate kref counted structure, and allocates this when
a master opens the device node. It also allows the current master
to abdicate (say while VT switched), and a new master to take over
the hardware.

It moves the Intel and radeon drivers to using the sarea from
within the new master structures.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2008-12-29 17:47:22 +10:00
Dave Airlie
e7f7ab45eb drm: cleanup exit path for module unload
The current sub-module unload exit path is a mess, it tries
to abuse the idr. Just keep a list of devices per driver struct
and free them in-order on rmmod.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2008-12-29 17:47:21 +10:00
Jens Axboe
b3a6ffe16b Get rid of CONFIG_LSF
We have two seperate config entries for large devices/files. One
is CONFIG_LBD that guards just the devices, the other is CONFIG_LSF
that handles large files. This doesn't make a lot of sense, you typically
want both or none. So get rid of CONFIG_LSF and change CONFIG_LBD wording
to indicate that it covers both.

Acked-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-12-29 08:29:51 +01:00
Jens Axboe
a6f23657d3 block: add one-hit cache for disk partition lookup
disk_map_sector_rcu() returns a partition from a sector offset,
which we use for IO statistics on a per-partition basis. The
lookup itself is an O(N) list lookup, where N is the number of
partitions. This actually hurts performance quite a bit, even
on the lower end partitions. On higher numbered partitions,
it can get pretty bad.

Solve this by adding a one-hit cache for partition lookup.
This makes the lookup O(1) for the case where we do most IO to
one partition. Even for mixed partition workloads, amortized cost
is pretty close to O(1) since the natural IO batching makes the
one-hit cache last for lots of IOs.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-12-29 08:29:51 +01:00
Jens Axboe
b374d18a4b block: get rid of elevator_t typedef
Just use struct elevator_queue everywhere instead.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-12-29 08:29:50 +01:00
Jens Axboe
abf137dd77 aio: make the lookup_ioctx() lockless
The mm->ioctx_list is currently protected by a reader-writer lock,
so we always grab that lock on the read side for doing ioctx
lookups. As the workload is extremely reader biased, turn this into
an rcu hlist so we can make lookup_ioctx() lockless. Get rid of
the rwlock and use a spinlock for providing update side exclusion.

There's usually only 1 entry on this list, so it doesn't make sense
to look into fancier data structures.

Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-12-29 08:29:50 +01:00
Jens Axboe
392ddc3298 bio: add support for inlining a number of bio_vecs inside the bio
When we go and allocate a bio for IO, we actually do two allocations.
One for the bio itself, and one for the bi_io_vec that holds the
actual pages we are interested in.

This feature inlines a definable amount of io vecs inside the bio
itself, so we eliminate the bio_vec array allocation for IO's up
to a certain size. It defaults to 4 vecs, which is typically 16k
of IO.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-12-29 08:29:50 +01:00
Jens Axboe
bb799ca020 bio: allow individual slabs in the bio_set
Instead of having a global bio slab cache, add a reference to one
in each bio_set that is created. This allows for personalized slabs
in each bio_set, so that they can have bios of different sizes.

This means we can personalize the bios we return. File systems may
want to embed the bio inside another structure, to avoid allocation
more items (and stuffing them in ->bi_private) after the get a bio.
Or we may want to embed a number of bio_vecs directly at the end
of a bio, to avoid doing two allocations to return a bio. This is now
possible.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-12-29 08:29:23 +01:00
Jens Axboe
1b43449869 bio: move the slab pointer inside the bio_set
In preparation for adding differently sized bios.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-12-29 08:28:46 +01:00
Jens Axboe
7ff9345ffa bio: only mempool back the largest bio_vec slab cache
We only very rarely need the mempool backing, so it makes sense to
get rid of all but one of the mempool in a bio_set. So keep the
largest bio_vec count mempool so we can always honor the largest
allocation, and "upgrade" callers that fail.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-12-29 08:28:46 +01:00
Tejun Heo
58eea927d2 block: simplify empty barrier implementation
Empty barrier required special handling in __elv_next_request() to
complete it without letting the low level driver see it.

With previous changes, barrier code is now flexible enough to skip the
BAR step using the same barrier sequence selection mechanism.  Drop
the special handling and mask off q->ordered from start_ordered().

Remove blk_empty_barrier() test which now has no user.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-12-29 08:28:45 +01:00
Tejun Heo
8f11b3e99a block: make barrier completion more robust
Barrier completion had the following assumptions.

* start_ordered() couldn't finish the whole sequence properly.  If all
  actions are to be skipped, q->ordseq is set correctly but the actual
  completion was never triggered thus hanging the barrier request.

* Drain completion in elv_complete_request() assumed that there's
  always at least one request in the queue when drain completes.

Both assumptions are true but these assumptions need to be removed to
improve empty barrier implementation.  This patch makes the following
changes.

* Make start_ordered() use blk_ordered_complete_seq() to mark skipped
  steps complete and notify __elv_next_request() that it should fetch
  the next request if the whole barrier has completed inside
  start_ordered().

* Make drain completion path in elv_complete_request() check whether
  the queue is empty.  Empty queue also indicates drain completion.

* While at it, convert 0/1 return from blk_do_ordered() to false/true.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-12-29 08:28:45 +01:00
Tejun Heo
f671620e7d block: make every barrier action optional
In all barrier sequences, the barrier write itself was always assumed
to be issued and thus didn't have corresponding control flag.  This
patch adds QUEUE_ORDERED_DO_BAR and unify action mask handling in
start_ordered() such that any barrier action can be skipped.

This patch doesn't introduce any visible behavior changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-12-29 08:28:45 +01:00
Tejun Heo
313e42999d block: reorganize QUEUE_ORDERED_* constants
Separate out ordering type (drain,) and action masks (preflush,
postflush, fua) from visible ordering mode selectors
(QUEUE_ORDERED_*).  Ordering types are now named QUEUE_ORDERED_BY_*
while action masks are named QUEUE_ORDERED_DO_*.

This change is necessary to add QUEUE_ORDERED_DO_BAR and make it
optional to improve empty barrier implementation.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-12-29 08:28:44 +01:00
Richard Kennedy
ba744d5e29 block: reorder struct bio to remove padding on 64bit
Remove 8 bytes of padding from struct bio which also removes 16 bytes from
struct bio_pair to make it 248 bytes.  bio_pair then fits into one fewer
cache lines & into a smaller slab.

Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-12-29 08:28:44 +01:00
Cheng Renquan
64d01dc9e1 block: use cancel_work_sync() instead of kblockd_flush_work()
After many improvements on kblockd_flush_work, it is now identical to
cancel_work_sync, so a direct call to cancel_work_sync is suggested.

The only difference is that cancel_work_sync is a GPL symbol,
so no non-GPL modules anymore.

Signed-off-by: Cheng Renquan <crquan@gmail.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-12-29 08:28:44 +01:00
Keith Mannthey
08bafc0341 block: Supress Buffer I/O errors when SCSI REQ_QUIET flag set
Allow the scsi request REQ_QUIET flag to be propagated to the buffer
file system layer. The basic ideas is to pass the flag from the scsi
request to the bio (block IO) and then to the buffer layer.  The buffer
layer can then suppress needless printks.

This patch declutters the kernel log by removed the 40-50 (per lun)
buffer io error messages seen during a boot in my multipath setup . It
is a good chance any real errors will be missed in the "noise" it the
logs without this patch.

During boot I see blocks of messages like
"
__ratelimit: 211 callbacks suppressed
Buffer I/O error on device sdm, logical block 5242879
Buffer I/O error on device sdm, logical block 5242879
Buffer I/O error on device sdm, logical block 5242847
Buffer I/O error on device sdm, logical block 1
Buffer I/O error on device sdm, logical block 5242878
Buffer I/O error on device sdm, logical block 5242879
Buffer I/O error on device sdm, logical block 5242879
Buffer I/O error on device sdm, logical block 5242879
Buffer I/O error on device sdm, logical block 5242879
Buffer I/O error on device sdm, logical block 5242872
"
in my logs.

My disk environment is multipath fiber channel using the SCSI_DH_RDAC
code and multipathd.  This topology includes an "active" and "ghost"
path for each lun. IO's to the "ghost" path will never complete and the
SCSI layer, via the scsi device handler rdac code, quick returns the IOs
to theses paths and sets the REQ_QUIET scsi flag to suppress the scsi
layer messages.

 I am wanting to extend the QUIET behavior to include the buffer file
system layer to deal with these errors as well. I have been running this
patch for a while now on several boxes without issue.  A few runs of
bonnie++ show no noticeable difference in performance in my setup.

Thanks for John Stultz for the quiet_error finalization.

Submitted-by:  Keith Mannthey <kmannth@us.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-12-29 08:28:44 +01:00
Fernando Luis Vázquez Cao
88e740f165 block: add queue flag for paravirt frontend drivers
As is the case with SSD devices, we do not want to idle in AS/CFQ when
the block device is a paravirt front-end driver. This patch adds a flag
(QUEUE_FLAG_VIRT) which should be used by front-end drivers such as
virtio_blk and xen-blkfront to indicate a paravirtualized device.

Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-12-29 08:28:41 +01:00
Lachlan McIlroy
0a8c5395f9 [XFS] Fix merge failures
Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6

Conflicts:

	fs/xfs/linux-2.6/xfs_cred.h
	fs/xfs/linux-2.6/xfs_globals.h
	fs/xfs/linux-2.6/xfs_ioctl.c
	fs/xfs/xfs_vnodeops.h

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-12-29 16:47:18 +11:00