linux

iv/linux

Author	SHA1	Message	Date
Ed Cashin	1b8a1636ce	aoe: update cap on outstanding commands based on config query response The ATA over Ethernet config query response contains a "buffer count" field reflecting the AoE target's capacity to buffer incoming AoE commands. By taking the current value of this field into accound, we increase performance throughput or avoid network congestion, when the value has increased or decreased, respectively. Signed-off-by: Ed Cashin <ecashin@coraid.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-12-17 17:15:23 -08:00
Ed Cashin	4e78dd144b	aoe: print warning regarding a common reason for dropped transmits Dropped transmits are not common, but when they do occur, increasing the transmit queue length often helps. Signed-off-by: Ed Cashin <ecashin@coraid.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-12-17 17:15:23 -08:00
Ed Cashin	662a889608	aoe: describe the behavior of the "err" character device Signed-off-by: Ed Cashin <ecashin@coraid.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-12-17 17:15:23 -08:00
Linus Torvalds	9228ff9038	Merge branch 'for-3.8/drivers' of git://git.kernel.dk/linux-block Pull block driver update from Jens Axboe: "Now that the core bits are in, here are the driver bits for 3.8. The branch contains: - A huge pile of drbd bits that were dumped from the 3.7 merge window. Following that, it was both made perfectly clear that there is going to be no more over-the-wall pulls and how the situation on individual pulls can be improved. - A few cleanups from Akinobu Mita for drbd and cciss. - Queue improvement for loop from Lukas. This grew into adding a generic interface for waiting/checking an even with a specific lock, allowing this to be pulled out of md and now loop and drbd is also using it. - A few fixes for xen back/front block driver from Roger Pau Monne. - Partition improvements from Stephen Warren, allowing partiion UUID to be used as an identifier." * 'for-3.8/drivers' of git://git.kernel.dk/linux-block: (609 commits) drbd: update Kconfig to match current dependencies drbd: Fix drbdsetup wait-connect, wait-sync etc... commands drbd: close race between drbd_set_role and drbd_connect drbd: respect no-md-barriers setting also when changed online via disk-options drbd: Remove obsolete check drbd: fixup after wait_even_lock_irq() addition to generic code loop: Limit the number of requests in the bio list wait: add wait_event_lock_irq() interface xen-blkfront: free allocated page xen-blkback: move free persistent grants code block: partition: msdos: provide UUIDs for partitions init: reduce PARTUUID min length to 1 from 36 block: store partition_meta_info.uuid as a string cciss: use check_signature() cciss: cleanup bitops usage drbd: use copy_highpage drbd: if the replication link breaks during handshake, keep retrying drbd: check return of kmalloc in receive_uuids drbd: Broadcast sync progress no more often than once per second drbd: don't try to clear bits once the disk has failed ...	2012-12-17 13:39:11 -08:00
Alex Elder	b8f5c6edca	rbd: don't use ENOTSUPP ENOTSUPP is not a standard errno (it shows up as "Unknown error 524" in an error message). This is what was getting produced when the the local rbd code does not implement features required by a discovered rbd image. Change the error code returned in this case to ENXIO. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2012-12-17 12:07:32 -06:00
Alex Elder	2fd82b9e92	rbd: get rid of RBD_MAX_SEG_NAME_LEN RBD_MAX_SEG_NAME_LEN represents the maximum length of an rbd object name (i.e., one of the objects providing storage backing an rbd image). Another symbol, MAX_OBJ_NAME_SIZE, is used in the osd client code to define the maximum length of any object name in an osd request. Right now they disagree, with RBD_MAX_SEG_NAME_LEN being too big. There's no real benefit at this point to defining the rbd object name length limit separate from any other object name, so just get rid of RBD_MAX_SEG_NAME_LEN and use MAX_OBJ_NAME_SIZE in its place. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2012-12-17 08:37:29 -06:00
Alex Elder	42382b709b	rbd: do not allow remove of mounted-on image There is no check in rbd_remove() to see if anybody holds open the image being removed. That's not cool. Add a simple open count that goes up and down with opens and closes (releases) of the device, and don't allow an rbd image to be removed if the count is non-zero. Protect the updates of the open count value with ctl_mutex to ensure the underlying rbd device doesn't get removed while concurrently being opened. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2012-12-17 08:36:59 -06:00
Roger Pau Monne	7dc341175a	xen-blkback: implement safe iterator for the list of persistent grants Change foreach_grant iterator to a safe version, that allows freeing the element while iterating. Also move the free code in free_persistent_gnts to prevent freeing the element before the rb_next call. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad@kernel.org> Cc: xen-devel@lists.xen.org Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2012-12-07 15:13:09 -05:00
Lars Ellenberg	d2ec180c23	drbd: update Kconfig to match current dependencies We no longer need the connector. But we need libcrc32c. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-12-06 13:08:29 +01:00
Philipp Reisner	ef86b77957	drbd: Fix drbdsetup wait-connect, wait-sync etc... commands This was introduces when moving the code over from the 8.3 codebase with commit 328e0f125bf41f4f33f684db22015f92cb44fe56 Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-12-06 13:04:34 +01:00
Philipp Reisner	13c76aba78	drbd: close race between drbd_set_role and drbd_connect drbd_set_role(, R_PRIMARY, ) does the state change to Primary, some more housekeeping, and possibly generates a new UUID set. All of this holding the "state_mutex". The connection handshake involves sending of various state information, including the current data generation UUID set, and two connection state changes from C_WF_CONNECTION to C_WF_REPORT_PARAMS further to a number of different outcomes, resync being one of them. If the connection handshake happens between the state change to Primary and the generation of the new UUIDs, the resync decision based on the old UUID set may be confused, depending on circumstances. Make sure that, before we do the handshake, any promotion to Primary role will either be complete (including the housekeeping stuff), or can see, and serialize with, the ongoing handshake, based on the "STATE_SENT" bit, which is set when we start the handshake, and cleared only when we leave C_WF_REPORT_PARAMS again. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-12-06 13:00:33 +01:00
Lars Ellenberg	691631c065	drbd: respect no-md-barriers setting also when changed online via disk-options We need to propagate the configuration into the flag bits, or it won't be effective. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-12-06 13:00:04 +01:00
Philipp Reisner	298307ed1d	drbd: Remove obsolete check Smatch complained about it this redundanct check. The check was introduced in 2006-09-13. On 2007-07-24 the body of the function was enclosed by get_ldev()/put_ldev() reference counting. Since then the check is useless and miss leading. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-12-06 12:09:55 +01:00
Jens Axboe	84ad6845fb	Merge branch 'stable/for-jens-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into for-3.8/drivers	2012-12-01 09:42:41 +01:00
Jens Axboe	2cecb73098	drbd: fixup after wait_even_lock_irq() addition to generic code Compiling drbd yields: drivers/block/drbd/drbd_state.c: In function ‘_conn_request_state’: drivers/block/drbd/drbd_state.c:1804:5: error: macro "wait_event_lock_irq" passed 4 arguments, but takes just 3 drivers/block/drbd/drbd_state.c:1801:3: error: ‘wait_event_lock_irq’ undeclared (first use in this function) drivers/block/drbd/drbd_state.c:1801:3: note: each undeclared identifier is reported only once for each function it appears in drivers/block/drbd/drbd_state.c: At top level: drivers/block/drbd/drbd_state.c:1734:1: warning: ‘_conn_rq_cond’ defined but not used [-Wunused-function] Due to drbd having copied the MD definition for wait_event_lock_irq() as well. Kill them. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2012-11-30 21:20:15 +01:00
Lukas Czerner	7b5a35225b	loop: Limit the number of requests in the bio list Currently there is not limitation of number of requests in the loop bio list. This can lead into some nasty situations when the caller spawns tons of bio requests taking huge amount of memory. This is even more obvious with discard where blkdev_issue_discard() will submit all bios for the range and wait for them to finish afterwards. On really big loop devices and slow backing file system this can lead to OOM situation as reported by Dave Chinner. With this patch we will wait in loop_make_request() if the number of bios in the loop bio list would exceed 'nr_congestion_on'. We'll wake up the process as we process the bios form the list. Some threshold hysteresis is in place to avoid high frequency oscillation. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Reported-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2012-11-30 11:48:05 +01:00
Roger Pau Monne	07c540a0b5	xen-blkfront: free allocated page Free the page allocated for the persistent grant. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2012-11-26 14:58:11 -05:00
Roger Pau Monne	4d4f270f18	xen-blkback: move free persistent grants code Move the code that frees persistent grants from the red-black tree to a function. This will make it easier for other consumers to move this to a common place. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2012-11-26 14:58:11 -05:00
Selvan Mani	836413e8c7	mtip32xx: Fix padding issue Hi Jens, Another tiny patch. Removed __packed before the struct smart_attr and added __packed at end of the structure to fix padding issue. Signed-off-by: Selvan Mani <smani@micron.com> Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2012-11-23 14:32:55 +01:00
Ed Cashin	11cfb6ff73	aoe: avoid running request handler on plugged queue Calling the request handler directly on a plugged queue defeats the performance improvements provided by the plugging mechanism. Use the __blk_run_queue function instead of calling the request handler directly, so that we don't interfere with the block layer's ability to plug the queue. Signed-off-by: Ed Cashin <ecashin@coraid.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2012-11-23 14:32:55 +01:00
Wei Yongjun	298d80152c	mtip32xx: fix potential NULL pointer dereference in mtip_timeout_function() The dereference to port should be moved below the NULL test. dpatch engine is used to auto generate this patch. (https://github.com/weiyj/dpatch) Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2012-11-23 14:32:55 +01:00
Jens Axboe	7c5d62388e	mtip32xx: fix shift larger than type warning If we're building a 32-bit kernel and CONFIG_LBADF isn't set, sector_t is 32-bits wide. The shifts by 32 and 40 are thus larger than we support. Cast the sector offset to a u64 to avoid these warnings. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2012-11-23 14:32:55 +01:00
Selvan Mani	4b9e884523	mtip32xx: Fix incorrect mask used for erase mode Previous commit use value 3 for erasemode mask. Changing the mask to correct value to 2 Signed-off-by: Selvan Mani <smani@micron.com> Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2012-11-23 14:32:55 +01:00
Selvan Mani	eda4531492	mtip32xx: Fix to make lba address correct in big-endian systems Earlier lba address was assigned directly to lba_low and lba_low_ex, which would result in a different number (bytes reversed) in big-endian systems. Now assigning lba address byte-by-byte to fis. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Selvan Mani <smani@micron.com> Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2012-11-23 14:32:55 +01:00
Selvan Mani	3208795e61	mtip32xx: fix potential crash on SEC_ERASE_UNIT The mtip driver lifted this code from elsewhere and then added a special handling check for SEC_ERASE_UNIT. If the caller tries to do a security erase but passes no output data for the command then outbuf is not allocated and the driver duly explodes. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Selvan Mani <smani@micron.com> Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2012-11-23 14:32:54 +01:00
Jiri Kosina	eac7cc52c6	floppy: destroy floppy workqueue before cleaning up the queue We need to first destroy the floppy_wq workqueue before cleaning up the queue. Otherwise we might race with still pending work with the workqueue, but all the block queue already gone. This might lead to various oopses, such as CPU 0 Pid: 6, comm: kworker/u:0 Not tainted 3.7.0-rc4 #1 Bochs Bochs RIP: 0010:[<ffffffff8134eef5>] [<ffffffff8134eef5>] blk_peek_request+0xd5/0x1c0 RSP: 0000:ffff88000dc7dd88 EFLAGS: 00010092 RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000000 RDX: ffff88000f602688 RSI: ffffffff81fd95d8 RDI: 6b6b6b6b6b6b6b6b RBP: ffff88000dc7dd98 R08: ffffffff81fd95c8 R09: 0000000000000000 R10: ffffffff81fd9480 R11: 0000000000000001 R12: 6b6b6b6b6b6b6b6b R13: ffff88000dc7dfd8 R14: ffff88000dc7dfd8 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffffffff81e21000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000000 CR3: 0000000001e11000 CR4: 00000000000006f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process kworker/u:0 (pid: 6, threadinfo ffff88000dc7c000, task ffff88000dc5ecc0) Stack: 0000000000000000 0000000000000000 ffff88000dc7ddb8 ffffffff8134efee ffff88000dc7ddb8 0000000000000000 ffff88000dc7dde8 ffffffff814aef3c ffffffff81e75d80 ffff88000dc0c640 ffff88000fbfb000 ffffffff814aed90 Call Trace: [<ffffffff8134efee>] blk_fetch_request+0xe/0x30 [<ffffffff814aef3c>] redo_fd_request+0x1ac/0x400 [<ffffffff814aed90>] ? start_motor+0x130/0x130 [<ffffffff8106b526>] process_one_work+0x136/0x450 [<ffffffff8106af65>] ? manage_workers+0x205/0x2e0 [<ffffffff8106bb6d>] worker_thread+0x14d/0x420 [<ffffffff8106ba20>] ? rescuer_thread+0x1a0/0x1a0 [<ffffffff8107075a>] kthread+0xba/0xc0 [<ffffffff810706a0>] ? __kthread_parkme+0x80/0x80 [<ffffffff818b553a>] ret_from_fork+0x7a/0xb0 [<ffffffff810706a0>] ? __kthread_parkme+0x80/0x80 Code: 0f 84 c0 00 00 00 83 f8 01 0f 85 e2 00 00 00 81 4b 40 00 00 80 00 48 89 df e8 58 f8 ff ff be fb ff ff ff fe ff ff <49> 8b 1c 24 49 39 dc 0f 85 2e ff ff ff 41 0f b6 84 24 28 04 00 RIP [<ffffffff8134eef5>] blk_peek_request+0xd5/0x1c0 RSP <ffff88000dc7dd88> Reported-by: Fengguang Wu <fengguang.wu@intel.com> Tested-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2012-11-23 14:32:54 +01:00
Akinobu Mita	d48c152a41	cciss: use check_signature() Use check_signature() to find a signature in the mmio address. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Mike Miller <mike.miller@hp.com> Cc: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2012-11-23 14:28:34 +01:00
Akinobu Mita	1f118bc479	cciss: cleanup bitops usage - Remove unnecessary correction of bit and address - Use BITS_TO_LONGS macro to calculate bitmap size - Use bitmap_zero() Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Mike Miller <mike.miller@hp.com> Cc: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2012-11-23 14:27:22 +01:00
Keith Busch	2b19603415	NVMe: Initialize iod nents to 0 For commands that do not map a scatter list, we need to initilaize the iod's number of sg entries (nents) to 0 and not unmap in this case. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2012-11-13 09:13:50 -05:00
Keith Busch	6ecec74520	NVMe: Define SMART log This data structure is defined in the NVMe specification. It's not used by the kernel, but is available for use by userspace software. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2012-11-13 09:13:50 -05:00
Keith Busch	08df1e0565	NVMe: Add result to nvme_get_features nvme_get_features() was not returning the result. Add a parameter to return the result in (similar to nvme_set_features()) and change all callers. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2012-11-13 09:13:49 -05:00
Keith Busch	f4f117f64b	NVMe: Set result from user admin command The ioctl data structure includes space for the 'result' of the admin command to be returned; it just wasn't filled in. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2012-11-13 09:13:49 -05:00
Keith Busch	3295874b60	NVMe: End queued bio requests when freeing queue If the queue has bios queued on it when it is freed, bio_endio() must be called for them first. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2012-11-13 09:13:49 -05:00
Keith Busch	859361a228	NVMe: Free cmdid on nvme_submit_bio error nvme_map_bio() is called after the cmdid is allocated, so we have to free the cmdid before returning from nvme_submit_bio() if nvme_map_bio() returned an error. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>	2012-11-13 09:13:49 -05:00
Jens Axboe	8d0ff3924b	Merge branch 'stable/for-jens-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into for-3.8/drivers	2012-11-12 09:18:47 -07:00
Akinobu Mita	f1d6a328bb	drbd: use copy_highpage Use copy_highpage() to copy from one page to another. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-09 14:22:26 +01:00
Lars Ellenberg	ed635cb067	drbd: if the replication link breaks during handshake, keep retrying The 8.3.12 commit drbd: Bugfix for the connection behavior fixes a "wasted established connection", if a former connection attempt failed during its early stages. However it opened a window for a regression, if a connection attempt fails during its last stages. The result was a terminated receiver thread, that left behind the supposedly transient "C_UNCONNECTED" state. Any later requests to change the connection state fail, as they wait for the connection state to "stabilize". Fix: short circuit and keep retrying to restablish a new connection, if we don't reach C_WF_REPORT_PARAMS. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-09 14:22:19 +01:00
Jing Wang	063eacf88c	drbd: check return of kmalloc in receive_uuids Signed-off-by: Jing Wang <windsdaemon@gmail.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-09 14:22:10 +01:00
Philipp Reisner	986836503e	Merge branch 'drbd-8.4_ed6' into for-3.8-drivers-drbd-8.4_ed6	2012-11-09 14:20:23 +01:00
Philipp Reisner	328e0f125b	drbd: Broadcast sync progress no more often than once per second Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-09 14:11:43 +01:00
Philipp Reisner	518a4d53b2	drbd: don't try to clear bits once the disk has failed If the disk has failed already, there is no point trying to change the bitmap. drbd_set_out_of_sync() already had this safeguard, time to add it to drbd_set_in_sync() as well. This also prevents some warning messages, like FIXME asender in bm_change_bits_to, bitmap locked for 'detach' by worker if our disk fails during resync, while there are some resync acks queued up. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-09 14:11:42 +01:00
Philipp Reisner	fd0017c124	drbd: fix regression: potential NULL pointer dereference recent commit drbd: always write bitmap on detach introduced a bitmap writeout during detach, which obviously needs some meta data device to write to. Unfortunately, that same error path may be taken if we fail to attach, e.g. due to UUID mismatch, after we changed state to D_ATTACHING, but before the lower level device pointer is even assigned. We need to test for presence of mdev->ldev. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-09 14:11:42 +01:00
Philipp Reisner	4035e4c2eb	drbd: Fix clearing of MDF_AL_DISABLED Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-09 14:11:42 +01:00
Lars Ellenberg	42839f6536	drbd: log request sector offset and size for IO errors Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-09 14:11:41 +01:00
Lars Ellenberg	edc9f5eb7a	drbd: always write bitmap on detach If we detach due to local read-error (which sets a bit in the bitmap), stay Primary, and then re-attach (which re-reads the bitmap from disk), we potentially lost the "out-of-sync" (or, "bad block") information in the bitmap. Always (try to) write out the changed bitmap pages before going diskless. That way, we don't lose the bit for the bad block, the next resync will fetch it from the peer, and rewrite it locally, which may result in block reallocation in some lower layer (or the hardware), and thereby "heal" the bad blocks. If the bitmap writeout errors out as well, we will (again: try to) mark the "we need a full sync" bit in our super block, if it was a READ error; writes are covered by the activity log already. If that superblock does not make it to disk either, we are sorry. Maybe we just lost an entire disk or controller (or iSCSI connection), and there actually are no bad blocks at all, so we don't need to re-fetch from the peer, there is no "auto-healing" necessary. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-09 14:11:41 +01:00
Lars Ellenberg	e34b677d09	drbd: wait for meta data IO completion even with failed disk, unless force-detached The intention of force-detach is to be able to deal with a completely unresponsive lower level IO stack, which does not even deliver error completions anymore, but no completion at all. In all other cases, we must still wait for the meta data IO completion. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-09 14:11:40 +01:00
Lars Ellenberg	8747d30af9	drbd: a few more GFP_KERNEL -> GFP_NOIO This has not yet been observed, but conceivably, when using GFP_KERNEL allocations from drbd_md_sync(), drbd_flush_after_epoch() or receive_SyncParam(), we could trigger additional IO to our own device, or an other device in a criss-cross setup, and end up in a local deadlock, or potentially a distributed deadlock in a criss-cross setup involving the peer blocked in a similar way waiting for us to make progress. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-09 14:11:40 +01:00
Lars Ellenberg	bc891c9ae3	drbd: fix potential deadlock during bitmap (re-)allocation The former comment arguing that GFP_KERNEL was good enough was wrong: it did not take resize into account at all, and assumed the only path leading here was the normal attach on a still secondary device, so no deadlock would be possible. Both resize on a Primary, or attach on a diskless Primary, could potentially deadlock. drbd_bm_resize() is called while IO to the respective device is suspended, so we must use GFP_NOIO to avoid potential deadlock. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-09 14:11:39 +01:00
Lars Ellenberg	a506c13a4d	drbd: use list_move_tail instead of list_del/list_add_tail Using list_move_tail() instead of list_del() + list_add_tail(). spatch with a semantic match is used to found this problem. (http://coccinelle.lip6.fr/) Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-09 14:11:39 +01:00
Philipp Reisner	1b6dd252e6	drbd: panic on delayed completion of aborted requests "aborting" requests, or force-detaching the disk, is intended for completely blocked/hung local backing devices which do no longer complete requests at all, not even do error completions. In this situation, usually a hard-reset and failover is the only way out. By "aborting", basically faking a local error-completion, we allow for a more graceful swichover by cleanly migrating services. Still the affected node has to be rebooted "soon". By completing these requests, we allow the upper layers to re-use the associated data pages. If later the local backing device "recovers", and now DMAs some data from disk into the original request pages, in the best case it will just put random data into unused pages; but typically it will corrupt meanwhile completely unrelated data, causing all sorts of damage. Which means delayed successful completion, especially for READ requests, is a reason to panic(). We assume that a delayed error completion is OK, though we still will complain noisily about it. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2012-11-09 14:11:39 +01:00

... 3 4 5 6 7 ...

3531 Commits