linux/drivers/md
Yufen Yu e236858243 md/raid5: set default stripe_size as 4096
In RAID5, if an issued bio is bigger than stripe_size, it is split
into stripe_size units that are processed one by one. Even for a bio
smaller than stripe_size, RAID5 still requests at least stripe_size
of data from disk.
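
The splitting behaviour described above can be pictured with a small model.
This is a simplified sketch, not the code in raid5.c: the helper name
stripe_units is hypothetical, and it ignores chunk layout and parity. It
only shows which stripe_size-aligned units a byte range touches.

```python
def stripe_units(offset, length, stripe_size=4096):
    """List the (start, size) units a byte range touches.

    Hypothetical helper for illustration only; every touched unit is
    reported at full stripe_size, mirroring the "at least stripe_size"
    behaviour described in the text.
    """
    if length <= 0:
        return []
    first = offset // stripe_size * stripe_size
    last = (offset + length - 1) // stripe_size * stripe_size
    return [(start, stripe_size)
            for start in range(first, last + 1, stripe_size)]


if __name__ == "__main__":
    # A 16KB request with 4KB stripe_size is handled as four units.
    print(stripe_units(0, 16384, 4096))
    # A 4KB request with 64KB stripe_size still touches one full 64KB unit.
    print(stripe_units(4096, 4096, 65536))
```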

Nowadays, stripe_size is equal to PAGE_SIZE. Since filesystems
usually issue bios in units of 4KB, this is not a problem when PAGE_SIZE
is 4KB. But with a 64KB PAGE_SIZE, a bio from the filesystem requests 4KB
of data while RAID5 issues at least stripe_size (64KB) of IO each time.
That wastes disk bandwidth and xor computation.
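
The size of the waste follows from simple arithmetic. The sketch below
assumes requests aligned to stripe_size and an array that always works in
whole stripe_size units; io_amplification is a hypothetical name used only
for illustration.

```python
def io_amplification(request_bytes, stripe_size):
    """Ratio of bytes handled by the array to bytes requested.

    Assumes the request is aligned to stripe_size boundaries and that
    the array always works in whole stripe_size units (a simplification).
    """
    units = -(-request_bytes // stripe_size)  # ceiling division
    return units * stripe_size / request_bytes


# A 4KB filesystem request with stripe_size equal to a 4KB or 64KB PAGE_SIZE.
for page_size in (4096, 65536):
    print(page_size, io_amplification(4096, page_size))
```

Under these assumptions a 4KB request costs 16 times as much IO and xor
work with a 64KB stripe_size as with a 4KB one.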

To avoid the waste, we want to make stripe_size configurable. This
patch just sets the default stripe_size to 4096. Users can also set a
value bigger than 4KB for special requirements, such as when the
issued IO size is known to be more than 4KB.

To evaluate the new feature, we create a raid5 device '/dev/md5' with
4 SSD disks and test it on an arm64 machine with 64KB PAGE_SIZE.

1) We format /dev/md5 with mkfs.ext4 and mount ext4 with the default
 configuration on the /mnt directory. Then we test it with dbench using
 the command: dbench -D /mnt -t 1000 10. The results are:

 'stripe_size = 64KB'

  Operation      Count    AvgLat    MaxLat
  ----------------------------------------
  NTCreateX    9805011     0.021    64.728
  Close        7202525     0.001     0.120
  Rename        415213     0.051    44.681
  Unlink       1980066     0.079    93.147
  Deltree          240     1.793     6.516
  Mkdir            120     0.004     0.007
  Qpathinfo    8887512     0.007    37.114
  Qfileinfo    1557262     0.001     0.030
  Qfsinfo      1629582     0.012     0.152
  Sfileinfo     798756     0.040    57.641
  Find         3436004     0.019    57.782
  WriteX       4887239     0.021    57.638
  ReadX        15370483     0.005    37.818
  LockX          31934     0.003     0.022
  UnlockX        31933     0.001     0.021
  Flush         687205    13.302   530.088

 Throughput 307.799 MB/sec  10 clients  10 procs  max_latency=530.091 ms
 -------------------------------------------------------

 'stripe_size = 4KB'

  Operation      Count    AvgLat    MaxLat
  ----------------------------------------
  NTCreateX    11999166     0.021    36.380
  Close        8814128     0.001     0.122
  Rename        508113     0.051    29.169
  Unlink       2423242     0.070    38.141
  Deltree          300     1.885     7.155
  Mkdir            150     0.004     0.006
  Qpathinfo    10875921     0.007    35.485
  Qfileinfo    1905837     0.001     0.032
  Qfsinfo      1994304     0.012     0.125
  Sfileinfo     977450     0.029    26.489
  Find         4204952     0.019     9.361
  WriteX       5981890     0.019    27.804
  ReadX        18809742     0.004    33.491
  LockX          39074     0.003     0.025
  UnlockX        39074     0.001     0.014
  Flush         841022    10.712   458.848

 Throughput 376.777 MB/sec  10 clients  10 procs  max_latency=458.852 ms
 -------------------------------------------------------

 It shows that setting stripe_size to 4KB gives higher throughput
 (376.777 vs 307.799 MB/sec) and lower latency than setting it to 64KB.
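
To put the dbench comparison in relative terms, the numbers reported above
can be fed through a percent-change calculation. This is only a sketch:
the figures are copied from the two result tables, and percent_change is a
hypothetical helper.

```python
def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


# (stripe_size = 64KB, stripe_size = 4KB) pairs taken from the dbench output.
results = {
    "throughput_mb_s": (307.799, 376.777),
    "max_latency_ms": (530.091, 458.852),
    "flush_avg_latency_ms": (13.302, 10.712),
}

for name, (at_64k, at_4k) in results.items():
    print(f"{name}: {percent_change(at_64k, at_4k):+.1f}%")
```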

 2) We evaluate IO throughput for /dev/md5 with fio using the config:

 [4KB randwrite]
 direct=1
 numjobs=2
 iodepth=64
 ioengine=libaio
 filename=/dev/md5
 bs=4KB
 rw=randwrite

 [1MB write]
 direct=1
 numjobs=2
 iodepth=64
 ioengine=libaio
 filename=/dev/md5
 bs=1MB
 rw=write

 The results are as follows:

               | stripe_size(64KB) | stripe_size(4KB)
 --------------+-------------------+-----------------
 4KB randwrite |     15MB/s        |      100MB/s
 --------------+-------------------+-----------------
 1MB write     |   1000MB/s        |      700MB/s

 The results show that when the IO size is bigger than 4KB (here, 1MB),
 the 64KB stripe_size has much higher throughput. But for 4KB randwrite,
 where the IO issued to the device is smaller, the 4KB stripe_size
 performs better.
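
The trade-off in the fio table can be summarised per workload. The sketch
below uses only the four throughput figures reported above; the dictionary
layout and the helper names are illustrative assumptions.

```python
# Throughput in MB/s from the fio table, keyed by workload and stripe_size.
fio_mb_s = {
    "4KB randwrite": {"64KB": 15, "4KB": 100},
    "1MB write": {"64KB": 1000, "4KB": 700},
}


def best_stripe_size(row):
    """Return the stripe_size with the highest throughput for one workload."""
    return max(row, key=row.get)


def speedup(row):
    """Ratio of the better throughput to the worse one."""
    return max(row.values()) / min(row.values())


for job, row in fio_mb_s.items():
    print(f"{job}: best stripe_size={best_stripe_size(row)}, "
          f"{speedup(row):.2f}x over the other setting")
```

The small random writes gain far more from the 4KB setting than the large
sequential writes lose, which is consistent with choosing 4096 as the default.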

Normally, the default value (4096) gives relatively good performance.
But if each issued IO is bigger than 4096, setting a value larger than
4096 may give better performance.

Here, we just set the default stripe_size to 4096; we will add support
for setting a different stripe_size through a sysfs interface in a
following patch.

Signed-off-by: Yufen Yu <yuyufen@huawei.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
2020-07-21 17:18:17 -07:00
..
bcache block: rename generic_make_request to submit_bio_noacct 2020-07-01 07:27:24 -06:00
persistent-data treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
dm-bio-prison-v1.c dm bio prison: replace spin_lock_irqsave with spin_lock_irq 2019-11-05 14:53:03 -05:00
dm-bio-prison-v1.h
dm-bio-prison-v2.c dm bio prison v2: use true/false for bool variable 2020-01-07 12:07:08 -05:00
dm-bio-prison-v2.h
dm-bio-record.h dm bio record: save/restore bi_end_io and bi_integrity 2020-03-03 10:02:46 -05:00
dm-bufio.c - Largest change for this cycle is the DM zoned target's metadata 2020-06-05 15:45:03 -07:00
dm-builtin.c
dm-cache-background-tracker.c
dm-cache-background-tracker.h
dm-cache-block-types.h
dm-cache-metadata.c
dm-cache-metadata.h
dm-cache-policy-internal.h
dm-cache-policy-smq.c
dm-cache-policy.c
dm-cache-policy.h
dm-cache-target.c block: rename generic_make_request to submit_bio_noacct 2020-07-01 07:27:24 -06:00
dm-clone-metadata.c dm clone metadata: Fix return type of dm_clone_nr_of_hydrated_regions() 2020-03-27 14:42:51 -04:00
dm-clone-metadata.h dm clone metadata: Fix return type of dm_clone_nr_of_hydrated_regions() 2020-03-27 14:42:51 -04:00
dm-clone-target.c block: rename generic_make_request to submit_bio_noacct 2020-07-01 07:27:24 -06:00
dm-core.h
dm-crypt.c block: rename generic_make_request to submit_bio_noacct 2020-07-01 07:27:24 -06:00
dm-delay.c block: rename generic_make_request to submit_bio_noacct 2020-07-01 07:27:24 -06:00
dm-dust.c dm dust: change ret to r in dust_map_write 2020-01-07 11:43:36 -05:00
dm-ebs-target.c dm ebs: use dm_bufio_forget_buffers 2020-06-05 14:59:42 -04:00
dm-era-target.c block: rename generic_make_request to submit_bio_noacct 2020-07-01 07:27:24 -06:00
dm-exception-store.c
dm-exception-store.h
dm-flakey.c block: rework zone reporting 2019-11-12 19:12:07 -07:00
dm-historical-service-time.c dm mpath: add Historical Service Time Path Selector 2020-05-15 10:29:36 -04:00
dm-init.c docs: device-mapper: move it to the admin-guide 2019-07-15 11:03:01 -03:00
dm-integrity.c block: rename generic_make_request to submit_bio_noacct 2020-07-01 07:27:24 -06:00
dm-io.c
dm-ioctl.c dm ioctl: use struct_size() helper in retrieve_deps() 2020-06-17 12:31:45 -04:00
dm-kcopyd.c dm kcopyd: always complete failed jobs 2019-08-15 15:57:39 -04:00
dm-linear.c dm,dax: Add dax zero_page_range operation 2020-04-02 19:15:03 -07:00
dm-log-userspace-base.c
dm-log-userspace-transfer.c
dm-log-userspace-transfer.h
dm-log-writes.c dm: replace zero-length array with flexible-array 2020-05-20 17:09:44 -04:00
dm-log.c
dm-mpath.c block: rename generic_make_request to submit_bio_noacct 2020-07-01 07:27:24 -06:00
dm-mpath.h
dm-path-selector.c
dm-path-selector.h dm mpath: pass IO start time to path selector 2020-05-15 10:29:36 -04:00
dm-queue-length.c dm mpath: pass IO start time to path selector 2020-05-15 10:29:36 -04:00
dm-raid1.c block: rename generic_make_request to submit_bio_noacct 2020-07-01 07:27:24 -06:00
dm-raid.c dm: replace zero-length array with flexible-array 2020-05-20 17:09:44 -04:00
dm-region-hash.c
dm-round-robin.c
dm-rq.c blk-mq: move failure injection out of blk_mq_complete_request 2020-06-24 09:15:57 -06:00
dm-rq.h
dm-service-time.c dm mpath: pass IO start time to path selector 2020-05-15 10:29:36 -04:00
dm-snap-persistent.c block: rename generic_make_request to submit_bio_noacct 2020-07-01 07:27:24 -06:00
dm-snap-transient.c
dm-snap.c block: rename generic_make_request to submit_bio_noacct 2020-07-01 07:27:24 -06:00
dm-stats.c dm: replace zero-length array with flexible-array 2020-05-20 17:09:44 -04:00
dm-stats.h
dm-stripe.c dm: replace zero-length array with flexible-array 2020-05-20 17:09:44 -04:00
dm-switch.c dm: replace zero-length array with flexible-array 2020-05-20 17:09:44 -04:00
dm-sysfs.c
dm-table.c dm: remove the make_request_fn check in device_area_is_invalid 2020-04-25 09:45:43 -06:00
dm-target.c
dm-thin-metadata.c dm thin metadata: fix lockdep complaint 2020-02-27 12:00:53 -05:00
dm-thin-metadata.h dm thin metadata: Add support for a pre-commit callback 2019-12-05 17:05:24 -05:00
dm-thin.c block: rename generic_make_request to submit_bio_noacct 2020-07-01 07:27:24 -06:00
dm-uevent.c
dm-uevent.h
dm-unstripe.c
dm-verity-fec.c dm verity fec: fix hash block number in verity_fec_decode 2020-04-16 16:16:38 -04:00
dm-verity-fec.h
dm-verity-target.c block: rename generic_make_request to submit_bio_noacct 2020-07-01 07:27:24 -06:00
dm-verity-verify-sig.c dm verity: add root hash pkcs#7 signature verification 2019-08-23 10:13:14 -04:00
dm-verity-verify-sig.h dm verity: add root hash pkcs#7 signature verification 2019-08-23 10:13:14 -04:00
dm-verity.h dm verity: add root hash pkcs#7 signature verification 2019-08-23 10:13:14 -04:00
dm-writecache.c Linux 5.8-rc4 2020-07-08 08:02:13 -06:00
dm-zero.c
dm-zoned-metadata.c dm zoned: Fix reclaim zone selection 2020-06-19 12:29:39 -04:00
dm-zoned-reclaim.c dm zoned: fix uninitialized pointer dereference 2020-06-17 12:13:08 -04:00
dm-zoned-target.c Linux 5.8-rc4 2020-07-08 08:02:13 -06:00
dm-zoned.h dm zoned: select reclaim zone based on device index 2020-06-05 14:59:53 -04:00
dm.c Linux 5.8-rc4 2020-07-08 08:02:13 -06:00
dm.h dm: make dm_table_find_target return NULL 2019-08-23 10:13:12 -04:00
Kconfig treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
Makefile dm mpath: add Historical Service Time Path Selector 2020-05-15 10:29:36 -04:00
md-bitmap.c md: fix deadlock causing by sysfs_notify 2020-07-14 22:58:51 -07:00
md-bitmap.h
md-cluster.c md-cluster: fix wild pointer of unlock_all_bitmaps() 2020-07-14 23:38:32 -07:00
md-cluster.h
md-faulty.c block: rename generic_make_request to submit_bio_noacct 2020-07-01 07:27:24 -06:00
md-linear.c block: rename generic_make_request to submit_bio_noacct 2020-07-01 07:27:24 -06:00
md-linear.h md/raid1: Replace zero-length array with flexible-array 2020-05-13 12:02:23 -07:00
md-multipath.c block: rename generic_make_request to submit_bio_noacct 2020-07-01 07:27:24 -06:00
md-multipath.h
md.c md: Fix compilation warning 2020-07-15 22:46:07 -07:00
md.h md: fix deadlock causing by sysfs_notify 2020-07-14 22:58:51 -07:00
raid0.c block: rename generic_make_request to submit_bio_noacct 2020-07-01 07:27:24 -06:00
raid0.h md/raid0: avoid RAID0 data corruption due to layout confusion. 2019-09-13 13:10:05 -07:00
raid1-10.c md: raid1-10: Unify r{1,10}bio_pool_free 2019-06-15 01:37:35 -06:00
raid1.c block: rename generic_make_request to submit_bio_noacct 2020-07-01 07:27:24 -06:00
raid1.h md/raid1: Replace zero-length array with flexible-array 2020-05-13 12:02:23 -07:00
raid5-cache.c md/raid456: convert macro STRIPE_* to RAID5_STRIPE_* 2020-07-21 17:18:12 -07:00
raid5-log.h
raid5-ppl.c md/raid456: convert macro STRIPE_* to RAID5_STRIPE_* 2020-07-21 17:18:12 -07:00
raid5.c md/raid5: set default stripe_size as 4096 2020-07-21 17:18:17 -07:00
raid5.h md/raid5: set default stripe_size as 4096 2020-07-21 17:18:17 -07:00
raid10.c md: raid10: Fix compilation warning 2020-07-15 22:46:07 -07:00
raid10.h md/raid1: Replace zero-length array with flexible-array 2020-05-13 12:02:23 -07:00