IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
In host control mode, reads are the major source of activation trials.
Keep track of those reads counters, for both active as well inactive
regions.
We reset the read counter upon write - we are only interested in "clean"
reads.
Keep those counters normalized, as we are using those reads as a
comparative score, to make various decisions. If during consecutive
normalizations an active region has exhaust its reads - inactivate it.
While at it, protect the {active,inactive}_count stats by adding them into
the applicable handler.
Link: https://lore.kernel.org/r/20210712095039.8093-5-avri.altman@wdc.com
Reviewed-by: Daejun Park <daejun7.park@samsung.com>
Signed-off-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In device control mode, the device may recommend the host to either
activate or inactivate a region, and the host should follow. Meaning those
are not actually recommendations, but more of instructions.
Conversely, in host control mode, the recommendation protocol is slightly
changed:
a) The device may only recommend the host to update a subregion of an
already-active region. And,
b) The device may *not* recommend to inactivate a region.
Furthermore, in host control mode, the host may choose not to follow any of
the device's recommendations. However, in case of a recommendation to
update an active and clean subregion, it is better to follow those
recommendation because otherwise the host has no other way to know that
some internal relocation took place.
Link: https://lore.kernel.org/r/20210712095039.8093-3-avri.altman@wdc.com
Reviewed-by: Daejun Park <daejun7.park@samsung.com>
Signed-off-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
We will use control_mode later when we need to differentiate between device
and host control modes.
Link: https://lore.kernel.org/r/20210712095039.8093-2-avri.altman@wdc.com
Reviewed-by: Daejun Park <daejun7.park@samsung.com>
Signed-off-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Version 2.0 of HBP supports reads of varying sizes from 4KB to 1MB.
A read operation <= 32KB is supported as single HPB read. A read between
36KB and 1MB is supported by a combination of write buffer command and HPB
read command to deliver more PPN. The write buffer commands may not be
issued immediately due to busy tags. To use HPB read more aggressively, the
driver can requeue the write buffer command. The requeue threshold is
implemented as timeout and can be modified with requeue_timeout_ms entry in
sysfs.
[mkp: REQ_OP_DRV_* and blk_rq_is_passthrough()]
Link: https://lore.kernel.org/r/20210712090025epcms2p3b3d94f6f1b2cfa394e3d9ba130ca0fa7@epcms2p3
Tested-by: Can Guo <cang@codeaurora.org>
Tested-by: Stanley Chu <stanley.chu@mediatek.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Can Guo <cang@codeaurora.org>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Daejun Park <daejun7.park@samsung.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If the logical address of a read I/O belongs to an active sub-region, the
HPB driver modifies the read I/O command to an HPB read. The driver
modifies the UFS UPIU instead of modifying the existing SCSI command.
In HPB version 1.0, the maximum read I/O size that can be converted to HPB
read is 4KB.
The dirty map of the active sub-region prevents an incorrect HPB read that
has stale physical page number which is updated by previous write I/O.
[mkp: REQ_OP_DRV_* and blk_rq_is_passthrough()]
Link: https://lore.kernel.org/r/20210712085936epcms2p4b0ec5c8cecdeea6cc043d684363842b6@epcms2p4
Tested-by: Bean Huo <beanhuo@micron.com>
Tested-by: Can Guo <cang@codeaurora.org>
Tested-by: Stanley Chu <stanley.chu@mediatek.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Can Guo <cang@codeaurora.org>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Acked-by: Avri Altman <Avri.Altman@wdc.com>
Signed-off-by: Daejun Park <daejun7.park@samsung.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Implement L2P map management in HPB.
The HPB divides logical addresses into several regions. A region consists
of several sub-regions. The sub-region is a basic unit where L2P mapping is
managed. The driver loads L2P mapping data of each sub-region. The loaded
sub-region is called active-state. The HPB driver unloads L2P mapping data
as region unit. The unloaded region is called inactive-state.
Sub-region/region candidates to be loaded and unloaded are delivered from
the UFS device. The UFS device delivers the recommended active sub-region
and inactivate region to the driver using sense data. The HPB module
performs L2P mapping management on the host through the delivered
information.
A pinned region is a preset region on the UFS device that is always
in activate-state.
The data structures for map data requests and L2P mappings use the mempool
API, minimizing allocation overhead while avoiding static allocation.
The mininum size of the memory pool used in the HPB is implemented
as a module parameter so that it can be configurable by the user.
To guarantee a minimum memory pool size of 4MB: ufshpb_host_map_kbytes=4096.
The map_work manages active/inactive via 2 "to-do" lists:
- hpb->lh_inact_rgn: regions to be inactivated
- hpb->lh_act_srgn: subregions to be activated
These lists are maintained on I/O completion.
[mkp: switch to REQ_OP_DRV_*]
Link: https://lore.kernel.org/r/20210712085859epcms2p36e420f19564f6cd0c4a45d54949619eb@epcms2p3
Tested-by: Bean Huo <beanhuo@micron.com>
Tested-by: Can Guo <cang@codeaurora.org>
Tested-by: Stanley Chu <stanley.chu@mediatek.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Can Guo <cang@codeaurora.org>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Acked-by: Avri Altman <Avri.Altman@wdc.com>
Signed-off-by: Daejun Park <daejun7.park@samsung.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Implement Host Performance Buffer (HPB) initialization and add function
calls to UFS core driver.
NAND flash-based storage devices, including UFS, have mechanisms to
translate logical addresses of I/O requests to the corresponding physical
addresses of the flash storage. In UFS, logical-to-physical-address (L2P)
map data, which is required to identify the physical address for the
requested I/Os, can only be partially stored in SRAM from NAND flash. Due
to this partial loading, accessing the flash address area, where the L2P
information for that address is not loaded in the SRAM, can result in
serious performance degradation.
The basic concept of HPB is to cache L2P mapping entries in host system
memory so that both physical block address (PBA) and logical block address
(LBA) can be delivered in HPB read command. The HPB read command allows to
read data faster than a regular read command in UFS since it provides the
physical address (HPB Entry) of the desired logical block in addition to
its logical address. The UFS device can access the physical block in NAND
directly without searching and uploading L2P mapping table. This improves
read performance because the NAND read operation for uploading L2P mapping
table is removed.
In HPB initialization, the host checks if the UFS device supports HPB
feature and retrieves related device capabilities. Then, HPB parameters are
configured in the device.
Total start-up time of popular applications was measured and the difference
observed between HPB being enabled and disabled. Popular applications are
12 game apps and 24 non-game apps. Each test cycle consists of running 36
applications in sequence. We repeated the cycle for observing performance
improvement by L2P mapping cache hit in HPB.
The following is the test environment:
- kernel version: 4.4.0
- RAM: 8GB
- UFS 2.1 (64GB)
Results:
+-------+----------+----------+-------+
| cycle | baseline | with HPB | diff |
+-------+----------+----------+-------+
| 1 | 272.4 | 264.9 | -7.5 |
| 2 | 250.4 | 248.2 | -2.2 |
| 3 | 226.2 | 215.6 | -10.6 |
| 4 | 230.6 | 214.8 | -15.8 |
| 5 | 232.0 | 218.1 | -13.9 |
| 6 | 231.9 | 212.6 | -19.3 |
+-------+----------+----------+-------+
We also measured HPB performance using iozone:
$ iozone -r 4k -+n -i2 -ecI -t 16 -l 16 -u 16 -s $IO_RANGE/16 -F \
mnt/tmp_1 mnt/tmp_2 mnt/tmp_3 mnt/tmp_4 mnt/tmp_5 mnt/tmp_6 mnt/tmp_7 \
mnt/tmp_8 mnt/tmp_9 mnt/tmp_10 mnt/tmp_11 mnt/tmp_12 mnt/tmp_13 \
mnt/tmp_14 mnt/tmp_15 mnt/tmp_16
Results:
+----------+--------+---------+
| IO range | HPB on | HPB off |
+----------+--------+---------+
| 1 GB | 294.8 | 300.87 |
| 4 GB | 293.51 | 179.35 |
| 8 GB | 294.85 | 162.52 |
| 16 GB | 293.45 | 156.26 |
| 32 GB | 277.4 | 153.25 |
+----------+--------+---------+
Link: https://lore.kernel.org/r/20210712085830epcms2p8c1288b7f7a81b044158a18232617b572@epcms2p8
Reported-by: kernel test robot <lkp@intel.com>
Tested-by: Bean Huo <beanhuo@micron.com>
Tested-by: Can Guo <cang@codeaurora.org>
Tested-by: Stanley Chu <stanley.chu@mediatek.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Can Guo <cang@codeaurora.org>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Acked-by: Avri Altman <Avri.Altman@wdc.com>
Signed-off-by: Daejun Park <daejun7.park@samsung.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>