block: rate-limit the error message from failing commands

When performing a cable pull test w/ active stress I/O using fio over a dual port Intel 82599 FCoE CNA, w/ 256LUNs on one port and about 32LUNs on the other, it is observed that the system becomes not usable due to scsi-ml being busy printing the error messages for all the failing commands. I don't believe this problem is specific to FCoE and these commands are anyway failing due to link being down (DID_NO_CONNECT), just rate-limit the messages here to solve this issue. v2->v1: use __ratelimit() as Tomas Henzl mentioned as the proper way for rate-limit per function. However, in this case, the failed i/o gets to blk_end_request_err() and then blk_update_request(), which also has to be rate-limited, as added in the v2 of this patch. v3-v2: resolved conflict to apply on current 3.6-rc3 upstream tip. Signed-off-by: Yi Zou <yi.zou@intel.com> Cc: www.Open-FCoE.org <devel@open-fcoe.org> Cc: Tomas Henzl <thenzl@redhat.com> Cc: <linux-scsi@vger.kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-08-30 16:26:25 -07:00 · 2012-08-30 16:26:25 -07:00 · 37d7b34f05
commit 37d7b34f05
parent 155e36d40c
1 changed files with 5 additions and 3 deletions
--- a/block/blk-core.c
+++ b/block/blk-core.c
@ -2254,9 +2254,11 @@ bool blk_update_request(struct request *req, int error, unsigned int nr_bytes)
 			error_type = "I/O";
 			break;
 		}
-		printk(KERN_ERR "end_request: %s error, dev %s, sector %llu\n",
-		       error_type, req->rq_disk ? req->rq_disk->disk_name : "?",
-		       (unsigned long long)blk_rq_pos(req));
+		printk_ratelimited(KERN_ERR "end_request: %s error, dev %s, sector %llu\n",
+				   error_type, req->rq_disk ?
+				   req->rq_disk->disk_name : "?",
+				   (unsigned long long)blk_rq_pos(req));
+
 	}

 	blk_account_io_completion(req, nr_bytes);