- A couple stable fixes for the DM writecache target.
- A stable fix for the DM cache target that fixes the potential for data corruption after an unclean shutdown of a cache device using writeback mode. - Update DM integrity target to allow the metadata to be stored on a separate device from data. - Fix DM kcopyd and the snapshot target to cond_resched() where appropriate and be more efficient with processing completed work. - A few fixes and improvements for DM crypt. - Add DM delay target feature to configure delay of flushes independent of writes. - Update DM thin-provisioning target to include metadata_low_watermark threshold in pool status. - Fix stale DM thin-provisioning Documentation. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJbddbuAAoJEMUj8QotnQNaoZIH/1qxc4x26bUfkVwBCiqU0cqJ 8PD8D5UBFB7+JXeI4P9p7ONVJNpk291QGeX+CBOXfhdeBBcXmuHnavJFcmH6+1Y3 omPEIvKnsODEMyZuznA8HasvlZIDPNOESaDt/8vhDPkDmPLWi3h+fUO1ay+NRtua J+8ZTe35P5SY70uLFE3nTeZScZD8KAO9Py1W+5Lz1godS6UUHhNOAcm0zzw5Nvnc 2gu2HZaCx0erFNou5Kj5Y4a/z/cITlAyXHQyzXBINk/X4sSwvzRr2tSy/FG91CuO kHduB0Tgo8eSeEta8jcOcYCm601XLlXzkIM69Z7c3fmCY+b8dY4ybHPxApTEv7c= =blC6 -----END PGP SIGNATURE----- Merge tag 'for-4.19/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper updates from Mike Snitzer: - A couple stable fixes for the DM writecache target. - A stable fix for the DM cache target that fixes the potential for data corruption after an unclean shutdown of a cache device using writeback mode. - Update DM integrity target to allow the metadata to be stored on a separate device from data. - Fix DM kcopyd and the snapshot target to cond_resched() where appropriate and be more efficient with processing completed work. - A few fixes and improvements for DM crypt. - Add DM delay target feature to configure delay of flushes independent of writes. - Update DM thin-provisioning target to include metadata_low_watermark threshold in pool status. - Fix stale DM thin-provisioning Documentation. * tag 'for-4.19/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (26 commits) dm writecache: fix a crash due to reading past end of dirty_bitmap dm crypt: don't decrease device limits dm cache metadata: set dirty on all cache blocks after a crash dm snapshot: remove stale FIXME in snapshot_map() dm snapshot: improve performance by switching out_of_order_list to rbtree dm kcopyd: avoid softlockup in run_complete_job dm cache metadata: save in-core policy_hint_size to on-disk superblock dm thin: stop no_space_timeout worker when switching to write-mode dm kcopyd: return void from dm_kcopyd_copy() dm thin: include metadata_low_watermark threshold in pool status dm writecache: report start_sector in status line dm crypt: convert essiv from ahash to shash dm crypt: use wake_up_process() instead of a wait queue dm integrity: recalculate checksums on creation dm integrity: flush journal on suspend when using separate metadata device dm integrity: use version 2 for separate metadata dm integrity: allow separate metadata device dm integrity: add ic->start in get_data_sector() dm integrity: report provided data sectors in the status dm integrity: implement fair range locks ...
This commit is contained in:
commit
b0e5c29426
@ -5,7 +5,8 @@ Device-Mapper's "delay" target delays reads and/or writes
|
||||
and maps them to different devices.
|
||||
|
||||
Parameters:
|
||||
<device> <offset> <delay> [<write_device> <write_offset> <write_delay>]
|
||||
<device> <offset> <delay> [<write_device> <write_offset> <write_delay>
|
||||
[<flush_device> <flush_offset> <flush_delay>]]
|
||||
|
||||
With separate write parameters, the first set is only used for reads.
|
||||
Offsets are specified in sectors.
|
||||
|
@ -113,6 +113,10 @@ internal_hash:algorithm(:key) (the key is optional)
|
||||
from an upper layer target, such as dm-crypt. The upper layer
|
||||
target should check the validity of the integrity tags.
|
||||
|
||||
recalculate
|
||||
Recalculate the integrity tags automatically. It is only valid
|
||||
when using internal hash.
|
||||
|
||||
journal_crypt:algorithm(:key) (the key is optional)
|
||||
Encrypt the journal using given algorithm to make sure that the
|
||||
attacker can't read the journal. You can use a block cipher here
|
||||
|
@ -28,17 +28,18 @@ administrator some freedom, for example to:
|
||||
Status
|
||||
======
|
||||
|
||||
These targets are very much still in the EXPERIMENTAL state. Please
|
||||
do not yet rely on them in production. But do experiment and offer us
|
||||
feedback. Different use cases will have different performance
|
||||
characteristics, for example due to fragmentation of the data volume.
|
||||
These targets are considered safe for production use. But different use
|
||||
cases will have different performance characteristics, for example due
|
||||
to fragmentation of the data volume.
|
||||
|
||||
If you find this software is not performing as expected please mail
|
||||
dm-devel@redhat.com with details and we'll try our best to improve
|
||||
things for you.
|
||||
|
||||
Userspace tools for checking and repairing the metadata are under
|
||||
development.
|
||||
Userspace tools for checking and repairing the metadata have been fully
|
||||
developed and are available as 'thin_check' and 'thin_repair'. The name
|
||||
of the package that provides these utilities varies by distribution (on
|
||||
a Red Hat distribution it is named 'device-mapper-persistent-data').
|
||||
|
||||
Cookbook
|
||||
========
|
||||
@ -280,7 +281,7 @@ ii) Status
|
||||
<transaction id> <used metadata blocks>/<total metadata blocks>
|
||||
<used data blocks>/<total data blocks> <held metadata root>
|
||||
ro|rw|out_of_data_space [no_]discard_passdown [error|queue]_if_no_space
|
||||
needs_check|-
|
||||
needs_check|- metadata_low_watermark
|
||||
|
||||
transaction id:
|
||||
A 64-bit number used by userspace to help synchronise with metadata
|
||||
@ -327,6 +328,11 @@ ii) Status
|
||||
thin-pool can be made fully operational again. '-' indicates
|
||||
needs_check is not set.
|
||||
|
||||
metadata_low_watermark:
|
||||
Value of metadata low watermark in blocks. The kernel sets this
|
||||
value internally but userspace needs to know this value to
|
||||
determine if an event was caused by crossing this threshold.
|
||||
|
||||
iii) Messages
|
||||
|
||||
create_thin <dev id>
|
||||
|
@ -363,7 +363,7 @@ static int __write_initial_superblock(struct dm_cache_metadata *cmd)
|
||||
disk_super->version = cpu_to_le32(cmd->version);
|
||||
memset(disk_super->policy_name, 0, sizeof(disk_super->policy_name));
|
||||
memset(disk_super->policy_version, 0, sizeof(disk_super->policy_version));
|
||||
disk_super->policy_hint_size = 0;
|
||||
disk_super->policy_hint_size = cpu_to_le32(0);
|
||||
|
||||
__copy_sm_root(cmd, disk_super);
|
||||
|
||||
@ -701,6 +701,7 @@ static int __commit_transaction(struct dm_cache_metadata *cmd,
|
||||
disk_super->policy_version[0] = cpu_to_le32(cmd->policy_version[0]);
|
||||
disk_super->policy_version[1] = cpu_to_le32(cmd->policy_version[1]);
|
||||
disk_super->policy_version[2] = cpu_to_le32(cmd->policy_version[2]);
|
||||
disk_super->policy_hint_size = cpu_to_le32(cmd->policy_hint_size);
|
||||
|
||||
disk_super->read_hits = cpu_to_le32(cmd->stats.read_hits);
|
||||
disk_super->read_misses = cpu_to_le32(cmd->stats.read_misses);
|
||||
@ -1322,6 +1323,7 @@ static int __load_mapping_v1(struct dm_cache_metadata *cmd,
|
||||
|
||||
dm_oblock_t oblock;
|
||||
unsigned flags;
|
||||
bool dirty = true;
|
||||
|
||||
dm_array_cursor_get_value(mapping_cursor, (void **) &mapping_value_le);
|
||||
memcpy(&mapping, mapping_value_le, sizeof(mapping));
|
||||
@ -1332,8 +1334,10 @@ static int __load_mapping_v1(struct dm_cache_metadata *cmd,
|
||||
dm_array_cursor_get_value(hint_cursor, (void **) &hint_value_le);
|
||||
memcpy(&hint, hint_value_le, sizeof(hint));
|
||||
}
|
||||
if (cmd->clean_when_opened)
|
||||
dirty = flags & M_DIRTY;
|
||||
|
||||
r = fn(context, oblock, to_cblock(cb), flags & M_DIRTY,
|
||||
r = fn(context, oblock, to_cblock(cb), dirty,
|
||||
le32_to_cpu(hint), hints_valid);
|
||||
if (r) {
|
||||
DMERR("policy couldn't load cache block %llu",
|
||||
@ -1361,7 +1365,7 @@ static int __load_mapping_v2(struct dm_cache_metadata *cmd,
|
||||
|
||||
dm_oblock_t oblock;
|
||||
unsigned flags;
|
||||
bool dirty;
|
||||
bool dirty = true;
|
||||
|
||||
dm_array_cursor_get_value(mapping_cursor, (void **) &mapping_value_le);
|
||||
memcpy(&mapping, mapping_value_le, sizeof(mapping));
|
||||
@ -1372,8 +1376,9 @@ static int __load_mapping_v2(struct dm_cache_metadata *cmd,
|
||||
dm_array_cursor_get_value(hint_cursor, (void **) &hint_value_le);
|
||||
memcpy(&hint, hint_value_le, sizeof(hint));
|
||||
}
|
||||
if (cmd->clean_when_opened)
|
||||
dirty = dm_bitset_cursor_get_value(dirty_cursor);
|
||||
|
||||
dirty = dm_bitset_cursor_get_value(dirty_cursor);
|
||||
r = fn(context, oblock, to_cblock(cb), dirty,
|
||||
le32_to_cpu(hint), hints_valid);
|
||||
if (r) {
|
||||
|
@ -1188,9 +1188,8 @@ static void copy_complete(int read_err, unsigned long write_err, void *context)
|
||||
queue_continuation(mg->cache->wq, &mg->k);
|
||||
}
|
||||
|
||||
static int copy(struct dm_cache_migration *mg, bool promote)
|
||||
static void copy(struct dm_cache_migration *mg, bool promote)
|
||||
{
|
||||
int r;
|
||||
struct dm_io_region o_region, c_region;
|
||||
struct cache *cache = mg->cache;
|
||||
|
||||
@ -1203,11 +1202,9 @@ static int copy(struct dm_cache_migration *mg, bool promote)
|
||||
c_region.count = cache->sectors_per_block;
|
||||
|
||||
if (promote)
|
||||
r = dm_kcopyd_copy(cache->copier, &o_region, 1, &c_region, 0, copy_complete, &mg->k);
|
||||
dm_kcopyd_copy(cache->copier, &o_region, 1, &c_region, 0, copy_complete, &mg->k);
|
||||
else
|
||||
r = dm_kcopyd_copy(cache->copier, &c_region, 1, &o_region, 0, copy_complete, &mg->k);
|
||||
|
||||
return r;
|
||||
dm_kcopyd_copy(cache->copier, &c_region, 1, &o_region, 0, copy_complete, &mg->k);
|
||||
}
|
||||
|
||||
static void bio_drop_shared_lock(struct cache *cache, struct bio *bio)
|
||||
@ -1449,12 +1446,7 @@ static void mg_full_copy(struct work_struct *ws)
|
||||
}
|
||||
|
||||
init_continuation(&mg->k, mg_upgrade_lock);
|
||||
|
||||
if (copy(mg, is_policy_promote)) {
|
||||
DMERR_LIMIT("%s: migration copy failed", cache_device_name(cache));
|
||||
mg->k.input = BLK_STS_IOERR;
|
||||
mg_complete(mg, false);
|
||||
}
|
||||
copy(mg, is_policy_promote);
|
||||
}
|
||||
|
||||
static void mg_copy(struct work_struct *ws)
|
||||
@ -2250,7 +2242,7 @@ static int parse_features(struct cache_args *ca, struct dm_arg_set *as,
|
||||
{0, 2, "Invalid number of cache feature arguments"},
|
||||
};
|
||||
|
||||
int r;
|
||||
int r, mode_ctr = 0;
|
||||
unsigned argc;
|
||||
const char *arg;
|
||||
struct cache_features *cf = &ca->features;
|
||||
@ -2264,14 +2256,20 @@ static int parse_features(struct cache_args *ca, struct dm_arg_set *as,
|
||||
while (argc--) {
|
||||
arg = dm_shift_arg(as);
|
||||
|
||||
if (!strcasecmp(arg, "writeback"))
|
||||
if (!strcasecmp(arg, "writeback")) {
|
||||
cf->io_mode = CM_IO_WRITEBACK;
|
||||
mode_ctr++;
|
||||
}
|
||||
|
||||
else if (!strcasecmp(arg, "writethrough"))
|
||||
else if (!strcasecmp(arg, "writethrough")) {
|
||||
cf->io_mode = CM_IO_WRITETHROUGH;
|
||||
mode_ctr++;
|
||||
}
|
||||
|
||||
else if (!strcasecmp(arg, "passthrough"))
|
||||
else if (!strcasecmp(arg, "passthrough")) {
|
||||
cf->io_mode = CM_IO_PASSTHROUGH;
|
||||
mode_ctr++;
|
||||
}
|
||||
|
||||
else if (!strcasecmp(arg, "metadata2"))
|
||||
cf->metadata_version = 2;
|
||||
@ -2282,6 +2280,11 @@ static int parse_features(struct cache_args *ca, struct dm_arg_set *as,
|
||||
}
|
||||
}
|
||||
|
||||
if (mode_ctr > 1) {
|
||||
*error = "Duplicate cache io_mode features requested";
|
||||
return -EINVAL;
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
|
@ -99,7 +99,7 @@ struct crypt_iv_operations {
|
||||
};
|
||||
|
||||
struct iv_essiv_private {
|
||||
struct crypto_ahash *hash_tfm;
|
||||
struct crypto_shash *hash_tfm;
|
||||
u8 *salt;
|
||||
};
|
||||
|
||||
@ -144,7 +144,7 @@ struct crypt_config {
|
||||
struct workqueue_struct *io_queue;
|
||||
struct workqueue_struct *crypt_queue;
|
||||
|
||||
wait_queue_head_t write_thread_wait;
|
||||
spinlock_t write_thread_lock;
|
||||
struct task_struct *write_thread;
|
||||
struct rb_root write_tree;
|
||||
|
||||
@ -327,25 +327,22 @@ static int crypt_iv_plain64be_gen(struct crypt_config *cc, u8 *iv,
|
||||
static int crypt_iv_essiv_init(struct crypt_config *cc)
|
||||
{
|
||||
struct iv_essiv_private *essiv = &cc->iv_gen_private.essiv;
|
||||
AHASH_REQUEST_ON_STACK(req, essiv->hash_tfm);
|
||||
struct scatterlist sg;
|
||||
SHASH_DESC_ON_STACK(desc, essiv->hash_tfm);
|
||||
struct crypto_cipher *essiv_tfm;
|
||||
int err;
|
||||
|
||||
sg_init_one(&sg, cc->key, cc->key_size);
|
||||
ahash_request_set_tfm(req, essiv->hash_tfm);
|
||||
ahash_request_set_callback(req, CRYPTO_TFM_REQ_MAY_SLEEP, NULL, NULL);
|
||||
ahash_request_set_crypt(req, &sg, essiv->salt, cc->key_size);
|
||||
desc->tfm = essiv->hash_tfm;
|
||||
desc->flags = CRYPTO_TFM_REQ_MAY_SLEEP;
|
||||
|
||||
err = crypto_ahash_digest(req);
|
||||
ahash_request_zero(req);
|
||||
err = crypto_shash_digest(desc, cc->key, cc->key_size, essiv->salt);
|
||||
shash_desc_zero(desc);
|
||||
if (err)
|
||||
return err;
|
||||
|
||||
essiv_tfm = cc->iv_private;
|
||||
|
||||
err = crypto_cipher_setkey(essiv_tfm, essiv->salt,
|
||||
crypto_ahash_digestsize(essiv->hash_tfm));
|
||||
crypto_shash_digestsize(essiv->hash_tfm));
|
||||
if (err)
|
||||
return err;
|
||||
|
||||
@ -356,7 +353,7 @@ static int crypt_iv_essiv_init(struct crypt_config *cc)
|
||||
static int crypt_iv_essiv_wipe(struct crypt_config *cc)
|
||||
{
|
||||
struct iv_essiv_private *essiv = &cc->iv_gen_private.essiv;
|
||||
unsigned salt_size = crypto_ahash_digestsize(essiv->hash_tfm);
|
||||
unsigned salt_size = crypto_shash_digestsize(essiv->hash_tfm);
|
||||
struct crypto_cipher *essiv_tfm;
|
||||
int r, err = 0;
|
||||
|
||||
@ -408,7 +405,7 @@ static void crypt_iv_essiv_dtr(struct crypt_config *cc)
|
||||
struct crypto_cipher *essiv_tfm;
|
||||
struct iv_essiv_private *essiv = &cc->iv_gen_private.essiv;
|
||||
|
||||
crypto_free_ahash(essiv->hash_tfm);
|
||||
crypto_free_shash(essiv->hash_tfm);
|
||||
essiv->hash_tfm = NULL;
|
||||
|
||||
kzfree(essiv->salt);
|
||||
@ -426,7 +423,7 @@ static int crypt_iv_essiv_ctr(struct crypt_config *cc, struct dm_target *ti,
|
||||
const char *opts)
|
||||
{
|
||||
struct crypto_cipher *essiv_tfm = NULL;
|
||||
struct crypto_ahash *hash_tfm = NULL;
|
||||
struct crypto_shash *hash_tfm = NULL;
|
||||
u8 *salt = NULL;
|
||||
int err;
|
||||
|
||||
@ -436,14 +433,14 @@ static int crypt_iv_essiv_ctr(struct crypt_config *cc, struct dm_target *ti,
|
||||
}
|
||||
|
||||
/* Allocate hash algorithm */
|
||||
hash_tfm = crypto_alloc_ahash(opts, 0, CRYPTO_ALG_ASYNC);
|
||||
hash_tfm = crypto_alloc_shash(opts, 0, 0);
|
||||
if (IS_ERR(hash_tfm)) {
|
||||
ti->error = "Error initializing ESSIV hash";
|
||||
err = PTR_ERR(hash_tfm);
|
||||
goto bad;
|
||||
}
|
||||
|
||||
salt = kzalloc(crypto_ahash_digestsize(hash_tfm), GFP_KERNEL);
|
||||
salt = kzalloc(crypto_shash_digestsize(hash_tfm), GFP_KERNEL);
|
||||
if (!salt) {
|
||||
ti->error = "Error kmallocing salt storage in ESSIV";
|
||||
err = -ENOMEM;
|
||||
@ -454,7 +451,7 @@ static int crypt_iv_essiv_ctr(struct crypt_config *cc, struct dm_target *ti,
|
||||
cc->iv_gen_private.essiv.hash_tfm = hash_tfm;
|
||||
|
||||
essiv_tfm = alloc_essiv_cipher(cc, ti, salt,
|
||||
crypto_ahash_digestsize(hash_tfm));
|
||||
crypto_shash_digestsize(hash_tfm));
|
||||
if (IS_ERR(essiv_tfm)) {
|
||||
crypt_iv_essiv_dtr(cc);
|
||||
return PTR_ERR(essiv_tfm);
|
||||
@ -465,7 +462,7 @@ static int crypt_iv_essiv_ctr(struct crypt_config *cc, struct dm_target *ti,
|
||||
|
||||
bad:
|
||||
if (hash_tfm && !IS_ERR(hash_tfm))
|
||||
crypto_free_ahash(hash_tfm);
|
||||
crypto_free_shash(hash_tfm);
|
||||
kfree(salt);
|
||||
return err;
|
||||
}
|
||||
@ -1620,36 +1617,31 @@ static int dmcrypt_write(void *data)
|
||||
struct rb_root write_tree;
|
||||
struct blk_plug plug;
|
||||
|
||||
DECLARE_WAITQUEUE(wait, current);
|
||||
|
||||
spin_lock_irq(&cc->write_thread_wait.lock);
|
||||
spin_lock_irq(&cc->write_thread_lock);
|
||||
continue_locked:
|
||||
|
||||
if (!RB_EMPTY_ROOT(&cc->write_tree))
|
||||
goto pop_from_list;
|
||||
|
||||
set_current_state(TASK_INTERRUPTIBLE);
|
||||
__add_wait_queue(&cc->write_thread_wait, &wait);
|
||||
|
||||
spin_unlock_irq(&cc->write_thread_wait.lock);
|
||||
spin_unlock_irq(&cc->write_thread_lock);
|
||||
|
||||
if (unlikely(kthread_should_stop())) {
|
||||
set_current_state(TASK_RUNNING);
|
||||
remove_wait_queue(&cc->write_thread_wait, &wait);
|
||||
break;
|
||||
}
|
||||
|
||||
schedule();
|
||||
|
||||
set_current_state(TASK_RUNNING);
|
||||
spin_lock_irq(&cc->write_thread_wait.lock);
|
||||
__remove_wait_queue(&cc->write_thread_wait, &wait);
|
||||
spin_lock_irq(&cc->write_thread_lock);
|
||||
goto continue_locked;
|
||||
|
||||
pop_from_list:
|
||||
write_tree = cc->write_tree;
|
||||
cc->write_tree = RB_ROOT;
|
||||
spin_unlock_irq(&cc->write_thread_wait.lock);
|
||||
spin_unlock_irq(&cc->write_thread_lock);
|
||||
|
||||
BUG_ON(rb_parent(write_tree.rb_node));
|
||||
|
||||
@ -1693,7 +1685,9 @@ static void kcryptd_crypt_write_io_submit(struct dm_crypt_io *io, int async)
|
||||
return;
|
||||
}
|
||||
|
||||
spin_lock_irqsave(&cc->write_thread_wait.lock, flags);
|
||||
spin_lock_irqsave(&cc->write_thread_lock, flags);
|
||||
if (RB_EMPTY_ROOT(&cc->write_tree))
|
||||
wake_up_process(cc->write_thread);
|
||||
rbp = &cc->write_tree.rb_node;
|
||||
parent = NULL;
|
||||
sector = io->sector;
|
||||
@ -1706,9 +1700,7 @@ static void kcryptd_crypt_write_io_submit(struct dm_crypt_io *io, int async)
|
||||
}
|
||||
rb_link_node(&io->rb_node, parent, rbp);
|
||||
rb_insert_color(&io->rb_node, &cc->write_tree);
|
||||
|
||||
wake_up_locked(&cc->write_thread_wait);
|
||||
spin_unlock_irqrestore(&cc->write_thread_wait.lock, flags);
|
||||
spin_unlock_irqrestore(&cc->write_thread_lock, flags);
|
||||
}
|
||||
|
||||
static void kcryptd_crypt_write_convert(struct dm_crypt_io *io)
|
||||
@ -2831,7 +2823,7 @@ static int crypt_ctr(struct dm_target *ti, unsigned int argc, char **argv)
|
||||
goto bad;
|
||||
}
|
||||
|
||||
init_waitqueue_head(&cc->write_thread_wait);
|
||||
spin_lock_init(&cc->write_thread_lock);
|
||||
cc->write_tree = RB_ROOT;
|
||||
|
||||
cc->write_thread = kthread_create(dmcrypt_write, cc, "dmcrypt_write");
|
||||
@ -3069,11 +3061,11 @@ static void crypt_io_hints(struct dm_target *ti, struct queue_limits *limits)
|
||||
*/
|
||||
limits->max_segment_size = PAGE_SIZE;
|
||||
|
||||
if (cc->sector_size != (1 << SECTOR_SHIFT)) {
|
||||
limits->logical_block_size = cc->sector_size;
|
||||
limits->physical_block_size = cc->sector_size;
|
||||
blk_limits_io_min(limits, cc->sector_size);
|
||||
}
|
||||
limits->logical_block_size =
|
||||
max_t(unsigned short, limits->logical_block_size, cc->sector_size);
|
||||
limits->physical_block_size =
|
||||
max_t(unsigned, limits->physical_block_size, cc->sector_size);
|
||||
limits->io_min = max_t(unsigned, limits->io_min, cc->sector_size);
|
||||
}
|
||||
|
||||
static struct target_type crypt_target = {
|
||||
|
@ -17,6 +17,13 @@
|
||||
|
||||
#define DM_MSG_PREFIX "delay"
|
||||
|
||||
struct delay_class {
|
||||
struct dm_dev *dev;
|
||||
sector_t start;
|
||||
unsigned delay;
|
||||
unsigned ops;
|
||||
};
|
||||
|
||||
struct delay_c {
|
||||
struct timer_list delay_timer;
|
||||
struct mutex timer_lock;
|
||||
@ -25,19 +32,16 @@ struct delay_c {
|
||||
struct list_head delayed_bios;
|
||||
atomic_t may_delay;
|
||||
|
||||
struct dm_dev *dev_read;
|
||||
sector_t start_read;
|
||||
unsigned read_delay;
|
||||
unsigned reads;
|
||||
struct delay_class read;
|
||||
struct delay_class write;
|
||||
struct delay_class flush;
|
||||
|
||||
struct dm_dev *dev_write;
|
||||
sector_t start_write;
|
||||
unsigned write_delay;
|
||||
unsigned writes;
|
||||
int argc;
|
||||
};
|
||||
|
||||
struct dm_delay_info {
|
||||
struct delay_c *context;
|
||||
struct delay_class *class;
|
||||
struct list_head list;
|
||||
unsigned long expires;
|
||||
};
|
||||
@ -77,7 +81,7 @@ static struct bio *flush_delayed_bios(struct delay_c *dc, int flush_all)
|
||||
{
|
||||
struct dm_delay_info *delayed, *next;
|
||||
unsigned long next_expires = 0;
|
||||
int start_timer = 0;
|
||||
unsigned long start_timer = 0;
|
||||
struct bio_list flush_bios = { };
|
||||
|
||||
mutex_lock(&delayed_bios_lock);
|
||||
@ -87,10 +91,7 @@ static struct bio *flush_delayed_bios(struct delay_c *dc, int flush_all)
|
||||
sizeof(struct dm_delay_info));
|
||||
list_del(&delayed->list);
|
||||
bio_list_add(&flush_bios, bio);
|
||||
if ((bio_data_dir(bio) == WRITE))
|
||||
delayed->context->writes--;
|
||||
else
|
||||
delayed->context->reads--;
|
||||
delayed->class->ops--;
|
||||
continue;
|
||||
}
|
||||
|
||||
@ -100,7 +101,6 @@ static struct bio *flush_delayed_bios(struct delay_c *dc, int flush_all)
|
||||
} else
|
||||
next_expires = min(next_expires, delayed->expires);
|
||||
}
|
||||
|
||||
mutex_unlock(&delayed_bios_lock);
|
||||
|
||||
if (start_timer)
|
||||
@ -117,6 +117,50 @@ static void flush_expired_bios(struct work_struct *work)
|
||||
flush_bios(flush_delayed_bios(dc, 0));
|
||||
}
|
||||
|
||||
static void delay_dtr(struct dm_target *ti)
|
||||
{
|
||||
struct delay_c *dc = ti->private;
|
||||
|
||||
destroy_workqueue(dc->kdelayd_wq);
|
||||
|
||||
if (dc->read.dev)
|
||||
dm_put_device(ti, dc->read.dev);
|
||||
if (dc->write.dev)
|
||||
dm_put_device(ti, dc->write.dev);
|
||||
if (dc->flush.dev)
|
||||
dm_put_device(ti, dc->flush.dev);
|
||||
|
||||
mutex_destroy(&dc->timer_lock);
|
||||
|
||||
kfree(dc);
|
||||
}
|
||||
|
||||
static int delay_class_ctr(struct dm_target *ti, struct delay_class *c, char **argv)
|
||||
{
|
||||
int ret;
|
||||
unsigned long long tmpll;
|
||||
char dummy;
|
||||
|
||||
if (sscanf(argv[1], "%llu%c", &tmpll, &dummy) != 1) {
|
||||
ti->error = "Invalid device sector";
|
||||
return -EINVAL;
|
||||
}
|
||||
c->start = tmpll;
|
||||
|
||||
if (sscanf(argv[2], "%u%c", &c->delay, &dummy) != 1) {
|
||||
ti->error = "Invalid delay";
|
||||
return -EINVAL;
|
||||
}
|
||||
|
||||
ret = dm_get_device(ti, argv[0], dm_table_get_mode(ti->table), &c->dev);
|
||||
if (ret) {
|
||||
ti->error = "Device lookup failed";
|
||||
return ret;
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
/*
|
||||
* Mapping parameters:
|
||||
* <device> <offset> <delay> [<write_device> <write_offset> <write_delay>]
|
||||
@ -128,134 +172,89 @@ static void flush_expired_bios(struct work_struct *work)
|
||||
static int delay_ctr(struct dm_target *ti, unsigned int argc, char **argv)
|
||||
{
|
||||
struct delay_c *dc;
|
||||
unsigned long long tmpll;
|
||||
char dummy;
|
||||
int ret;
|
||||
|
||||
if (argc != 3 && argc != 6) {
|
||||
ti->error = "Requires exactly 3 or 6 arguments";
|
||||
if (argc != 3 && argc != 6 && argc != 9) {
|
||||
ti->error = "Requires exactly 3, 6 or 9 arguments";
|
||||
return -EINVAL;
|
||||
}
|
||||
|
||||
dc = kmalloc(sizeof(*dc), GFP_KERNEL);
|
||||
dc = kzalloc(sizeof(*dc), GFP_KERNEL);
|
||||
if (!dc) {
|
||||
ti->error = "Cannot allocate context";
|
||||
return -ENOMEM;
|
||||
}
|
||||
|
||||
dc->reads = dc->writes = 0;
|
||||
|
||||
ret = -EINVAL;
|
||||
if (sscanf(argv[1], "%llu%c", &tmpll, &dummy) != 1) {
|
||||
ti->error = "Invalid device sector";
|
||||
goto bad;
|
||||
}
|
||||
dc->start_read = tmpll;
|
||||
|
||||
if (sscanf(argv[2], "%u%c", &dc->read_delay, &dummy) != 1) {
|
||||
ti->error = "Invalid delay";
|
||||
goto bad;
|
||||
}
|
||||
|
||||
ret = dm_get_device(ti, argv[0], dm_table_get_mode(ti->table),
|
||||
&dc->dev_read);
|
||||
if (ret) {
|
||||
ti->error = "Device lookup failed";
|
||||
goto bad;
|
||||
}
|
||||
|
||||
ret = -EINVAL;
|
||||
dc->dev_write = NULL;
|
||||
if (argc == 3)
|
||||
goto out;
|
||||
|
||||
if (sscanf(argv[4], "%llu%c", &tmpll, &dummy) != 1) {
|
||||
ti->error = "Invalid write device sector";
|
||||
goto bad_dev_read;
|
||||
}
|
||||
dc->start_write = tmpll;
|
||||
|
||||
if (sscanf(argv[5], "%u%c", &dc->write_delay, &dummy) != 1) {
|
||||
ti->error = "Invalid write delay";
|
||||
goto bad_dev_read;
|
||||
}
|
||||
|
||||
ret = dm_get_device(ti, argv[3], dm_table_get_mode(ti->table),
|
||||
&dc->dev_write);
|
||||
if (ret) {
|
||||
ti->error = "Write device lookup failed";
|
||||
goto bad_dev_read;
|
||||
}
|
||||
|
||||
out:
|
||||
ret = -EINVAL;
|
||||
dc->kdelayd_wq = alloc_workqueue("kdelayd", WQ_MEM_RECLAIM, 0);
|
||||
if (!dc->kdelayd_wq) {
|
||||
DMERR("Couldn't start kdelayd");
|
||||
goto bad_queue;
|
||||
}
|
||||
|
||||
ti->private = dc;
|
||||
timer_setup(&dc->delay_timer, handle_delayed_timer, 0);
|
||||
|
||||
INIT_WORK(&dc->flush_expired_bios, flush_expired_bios);
|
||||
INIT_LIST_HEAD(&dc->delayed_bios);
|
||||
mutex_init(&dc->timer_lock);
|
||||
atomic_set(&dc->may_delay, 1);
|
||||
dc->argc = argc;
|
||||
|
||||
ret = delay_class_ctr(ti, &dc->read, argv);
|
||||
if (ret)
|
||||
goto bad;
|
||||
|
||||
if (argc == 3) {
|
||||
ret = delay_class_ctr(ti, &dc->write, argv);
|
||||
if (ret)
|
||||
goto bad;
|
||||
ret = delay_class_ctr(ti, &dc->flush, argv);
|
||||
if (ret)
|
||||
goto bad;
|
||||
goto out;
|
||||
}
|
||||
|
||||
ret = delay_class_ctr(ti, &dc->write, argv + 3);
|
||||
if (ret)
|
||||
goto bad;
|
||||
if (argc == 6) {
|
||||
ret = delay_class_ctr(ti, &dc->flush, argv + 3);
|
||||
if (ret)
|
||||
goto bad;
|
||||
goto out;
|
||||
}
|
||||
|
||||
ret = delay_class_ctr(ti, &dc->flush, argv + 6);
|
||||
if (ret)
|
||||
goto bad;
|
||||
|
||||
out:
|
||||
dc->kdelayd_wq = alloc_workqueue("kdelayd", WQ_MEM_RECLAIM, 0);
|
||||
if (!dc->kdelayd_wq) {
|
||||
ret = -EINVAL;
|
||||
DMERR("Couldn't start kdelayd");
|
||||
goto bad;
|
||||
}
|
||||
|
||||
ti->num_flush_bios = 1;
|
||||
ti->num_discard_bios = 1;
|
||||
ti->per_io_data_size = sizeof(struct dm_delay_info);
|
||||
ti->private = dc;
|
||||
return 0;
|
||||
|
||||
bad_queue:
|
||||
if (dc->dev_write)
|
||||
dm_put_device(ti, dc->dev_write);
|
||||
bad_dev_read:
|
||||
dm_put_device(ti, dc->dev_read);
|
||||
bad:
|
||||
kfree(dc);
|
||||
delay_dtr(ti);
|
||||
return ret;
|
||||
}
|
||||
|
||||
static void delay_dtr(struct dm_target *ti)
|
||||
{
|
||||
struct delay_c *dc = ti->private;
|
||||
|
||||
destroy_workqueue(dc->kdelayd_wq);
|
||||
|
||||
dm_put_device(ti, dc->dev_read);
|
||||
|
||||
if (dc->dev_write)
|
||||
dm_put_device(ti, dc->dev_write);
|
||||
|
||||
mutex_destroy(&dc->timer_lock);
|
||||
|
||||
kfree(dc);
|
||||
}
|
||||
|
||||
static int delay_bio(struct delay_c *dc, int delay, struct bio *bio)
|
||||
static int delay_bio(struct delay_c *dc, struct delay_class *c, struct bio *bio)
|
||||
{
|
||||
struct dm_delay_info *delayed;
|
||||
unsigned long expires = 0;
|
||||
|
||||
if (!delay || !atomic_read(&dc->may_delay))
|
||||
if (!c->delay || !atomic_read(&dc->may_delay))
|
||||
return DM_MAPIO_REMAPPED;
|
||||
|
||||
delayed = dm_per_bio_data(bio, sizeof(struct dm_delay_info));
|
||||
|
||||
delayed->context = dc;
|
||||
delayed->expires = expires = jiffies + msecs_to_jiffies(delay);
|
||||
delayed->expires = expires = jiffies + msecs_to_jiffies(c->delay);
|
||||
|
||||
mutex_lock(&delayed_bios_lock);
|
||||
|
||||
if (bio_data_dir(bio) == WRITE)
|
||||
dc->writes++;
|
||||
else
|
||||
dc->reads++;
|
||||
|
||||
c->ops++;
|
||||
list_add_tail(&delayed->list, &dc->delayed_bios);
|
||||
|
||||
mutex_unlock(&delayed_bios_lock);
|
||||
|
||||
queue_timeout(dc, expires);
|
||||
@ -282,23 +281,28 @@ static void delay_resume(struct dm_target *ti)
|
||||
static int delay_map(struct dm_target *ti, struct bio *bio)
|
||||
{
|
||||
struct delay_c *dc = ti->private;
|
||||
struct delay_class *c;
|
||||
struct dm_delay_info *delayed = dm_per_bio_data(bio, sizeof(struct dm_delay_info));
|
||||
|
||||
if ((bio_data_dir(bio) == WRITE) && (dc->dev_write)) {
|
||||
bio_set_dev(bio, dc->dev_write->bdev);
|
||||
if (bio_sectors(bio))
|
||||
bio->bi_iter.bi_sector = dc->start_write +
|
||||
dm_target_offset(ti, bio->bi_iter.bi_sector);
|
||||
|
||||
return delay_bio(dc, dc->write_delay, bio);
|
||||
if (bio_data_dir(bio) == WRITE) {
|
||||
if (unlikely(bio->bi_opf & REQ_PREFLUSH))
|
||||
c = &dc->flush;
|
||||
else
|
||||
c = &dc->write;
|
||||
} else {
|
||||
c = &dc->read;
|
||||
}
|
||||
delayed->class = c;
|
||||
bio_set_dev(bio, c->dev->bdev);
|
||||
if (bio_sectors(bio))
|
||||
bio->bi_iter.bi_sector = c->start + dm_target_offset(ti, bio->bi_iter.bi_sector);
|
||||
|
||||
bio_set_dev(bio, dc->dev_read->bdev);
|
||||
bio->bi_iter.bi_sector = dc->start_read +
|
||||
dm_target_offset(ti, bio->bi_iter.bi_sector);
|
||||
|
||||
return delay_bio(dc, dc->read_delay, bio);
|
||||
return delay_bio(dc, c, bio);
|
||||
}
|
||||
|
||||
#define DMEMIT_DELAY_CLASS(c) \
|
||||
DMEMIT("%s %llu %u", (c)->dev->name, (unsigned long long)(c)->start, (c)->delay)
|
||||
|
||||
static void delay_status(struct dm_target *ti, status_type_t type,
|
||||
unsigned status_flags, char *result, unsigned maxlen)
|
||||
{
|
||||
@ -307,17 +311,19 @@ static void delay_status(struct dm_target *ti, status_type_t type,
|
||||
|
||||
switch (type) {
|
||||
case STATUSTYPE_INFO:
|
||||
DMEMIT("%u %u", dc->reads, dc->writes);
|
||||
DMEMIT("%u %u %u", dc->read.ops, dc->write.ops, dc->flush.ops);
|
||||
break;
|
||||
|
||||
case STATUSTYPE_TABLE:
|
||||
DMEMIT("%s %llu %u", dc->dev_read->name,
|
||||
(unsigned long long) dc->start_read,
|
||||
dc->read_delay);
|
||||
if (dc->dev_write)
|
||||
DMEMIT(" %s %llu %u", dc->dev_write->name,
|
||||
(unsigned long long) dc->start_write,
|
||||
dc->write_delay);
|
||||
DMEMIT_DELAY_CLASS(&dc->read);
|
||||
if (dc->argc >= 6) {
|
||||
DMEMIT(" ");
|
||||
DMEMIT_DELAY_CLASS(&dc->write);
|
||||
}
|
||||
if (dc->argc >= 9) {
|
||||
DMEMIT(" ");
|
||||
DMEMIT_DELAY_CLASS(&dc->flush);
|
||||
}
|
||||
break;
|
||||
}
|
||||
}
|
||||
@ -328,12 +334,15 @@ static int delay_iterate_devices(struct dm_target *ti,
|
||||
struct delay_c *dc = ti->private;
|
||||
int ret = 0;
|
||||
|
||||
ret = fn(ti, dc->dev_read, dc->start_read, ti->len, data);
|
||||
ret = fn(ti, dc->read.dev, dc->read.start, ti->len, data);
|
||||
if (ret)
|
||||
goto out;
|
||||
ret = fn(ti, dc->write.dev, dc->write.start, ti->len, data);
|
||||
if (ret)
|
||||
goto out;
|
||||
ret = fn(ti, dc->flush.dev, dc->flush.start, ti->len, data);
|
||||
if (ret)
|
||||
goto out;
|
||||
|
||||
if (dc->dev_write)
|
||||
ret = fn(ti, dc->dev_write, dc->start_write, ti->len, data);
|
||||
|
||||
out:
|
||||
return ret;
|
||||
|
@ -31,6 +31,8 @@
|
||||
#define MIN_LOG2_INTERLEAVE_SECTORS 3
|
||||
#define MAX_LOG2_INTERLEAVE_SECTORS 31
|
||||
#define METADATA_WORKQUEUE_MAX_ACTIVE 16
|
||||
#define RECALC_SECTORS 8192
|
||||
#define RECALC_WRITE_SUPER 16
|
||||
|
||||
/*
|
||||
* Warning - DEBUG_PRINT prints security-sensitive data to the log,
|
||||
@ -44,7 +46,8 @@
|
||||
*/
|
||||
|
||||
#define SB_MAGIC "integrt"
|
||||
#define SB_VERSION 1
|
||||
#define SB_VERSION_1 1
|
||||
#define SB_VERSION_2 2
|
||||
#define SB_SECTORS 8
|
||||
#define MAX_SECTORS_PER_BLOCK 8
|
||||
|
||||
@ -57,9 +60,12 @@ struct superblock {
|
||||
__u64 provided_data_sectors; /* userspace uses this value */
|
||||
__u32 flags;
|
||||
__u8 log2_sectors_per_block;
|
||||
__u8 pad[3];
|
||||
__u64 recalc_sector;
|
||||
};
|
||||
|
||||
#define SB_FLAG_HAVE_JOURNAL_MAC 0x1
|
||||
#define SB_FLAG_RECALCULATING 0x2
|
||||
|
||||
#define JOURNAL_ENTRY_ROUNDUP 8
|
||||
|
||||
@ -139,6 +145,7 @@ struct alg_spec {
|
||||
|
||||
struct dm_integrity_c {
|
||||
struct dm_dev *dev;
|
||||
struct dm_dev *meta_dev;
|
||||
unsigned tag_size;
|
||||
__s8 log2_tag_size;
|
||||
sector_t start;
|
||||
@ -170,7 +177,8 @@ struct dm_integrity_c {
|
||||
unsigned short journal_section_sectors;
|
||||
unsigned journal_sections;
|
||||
unsigned journal_entries;
|
||||
sector_t device_sectors;
|
||||
sector_t data_device_sectors;
|
||||
sector_t meta_device_sectors;
|
||||
unsigned initial_sectors;
|
||||
unsigned metadata_run;
|
||||
__s8 log2_metadata_run;
|
||||
@ -178,7 +186,7 @@ struct dm_integrity_c {
|
||||
__u8 sectors_per_block;
|
||||
|
||||
unsigned char mode;
|
||||
bool suspending;
|
||||
int suspending;
|
||||
|
||||
int failed;
|
||||
|
||||
@ -186,6 +194,7 @@ struct dm_integrity_c {
|
||||
|
||||
/* these variables are locked with endio_wait.lock */
|
||||
struct rb_root in_progress;
|
||||
struct list_head wait_list;
|
||||
wait_queue_head_t endio_wait;
|
||||
struct workqueue_struct *wait_wq;
|
||||
|
||||
@ -210,6 +219,11 @@ struct dm_integrity_c {
|
||||
struct workqueue_struct *writer_wq;
|
||||
struct work_struct writer_work;
|
||||
|
||||
struct workqueue_struct *recalc_wq;
|
||||
struct work_struct recalc_work;
|
||||
u8 *recalc_buffer;
|
||||
u8 *recalc_tags;
|
||||
|
||||
struct bio_list flush_bio_list;
|
||||
|
||||
unsigned long autocommit_jiffies;
|
||||
@ -233,7 +247,14 @@ struct dm_integrity_c {
|
||||
struct dm_integrity_range {
|
||||
sector_t logical_sector;
|
||||
unsigned n_sectors;
|
||||
struct rb_node node;
|
||||
bool waiting;
|
||||
union {
|
||||
struct rb_node node;
|
||||
struct {
|
||||
struct task_struct *task;
|
||||
struct list_head wait_entry;
|
||||
};
|
||||
};
|
||||
};
|
||||
|
||||
struct dm_integrity_io {
|
||||
@ -337,10 +358,14 @@ static commit_id_t dm_integrity_commit_id(struct dm_integrity_c *ic, unsigned i,
|
||||
static void get_area_and_offset(struct dm_integrity_c *ic, sector_t data_sector,
|
||||
sector_t *area, sector_t *offset)
|
||||
{
|
||||
__u8 log2_interleave_sectors = ic->sb->log2_interleave_sectors;
|
||||
|
||||
*area = data_sector >> log2_interleave_sectors;
|
||||
*offset = (unsigned)data_sector & ((1U << log2_interleave_sectors) - 1);
|
||||
if (!ic->meta_dev) {
|
||||
__u8 log2_interleave_sectors = ic->sb->log2_interleave_sectors;
|
||||
*area = data_sector >> log2_interleave_sectors;
|
||||
*offset = (unsigned)data_sector & ((1U << log2_interleave_sectors) - 1);
|
||||
} else {
|
||||
*area = 0;
|
||||
*offset = data_sector;
|
||||
}
|
||||
}
|
||||
|
||||
#define sector_to_block(ic, n) \
|
||||
@ -379,6 +404,9 @@ static sector_t get_data_sector(struct dm_integrity_c *ic, sector_t area, sector
|
||||
{
|
||||
sector_t result;
|
||||
|
||||
if (ic->meta_dev)
|
||||
return offset;
|
||||
|
||||
result = area << ic->sb->log2_interleave_sectors;
|
||||
if (likely(ic->log2_metadata_run >= 0))
|
||||
result += (area + 1) << ic->log2_metadata_run;
|
||||
@ -386,6 +414,8 @@ static sector_t get_data_sector(struct dm_integrity_c *ic, sector_t area, sector
|
||||
result += (area + 1) * ic->metadata_run;
|
||||
|
||||
result += (sector_t)ic->initial_sectors + offset;
|
||||
result += ic->start;
|
||||
|
||||
return result;
|
||||
}
|
||||
|
||||
@ -395,6 +425,14 @@ static void wraparound_section(struct dm_integrity_c *ic, unsigned *sec_ptr)
|
||||
*sec_ptr -= ic->journal_sections;
|
||||
}
|
||||
|
||||
static void sb_set_version(struct dm_integrity_c *ic)
|
||||
{
|
||||
if (ic->meta_dev || ic->sb->flags & cpu_to_le32(SB_FLAG_RECALCULATING))
|
||||
ic->sb->version = SB_VERSION_2;
|
||||
else
|
||||
ic->sb->version = SB_VERSION_1;
|
||||
}
|
||||
|
||||
static int sync_rw_sb(struct dm_integrity_c *ic, int op, int op_flags)
|
||||
{
|
||||
struct dm_io_request io_req;
|
||||
@ -406,7 +444,7 @@ static int sync_rw_sb(struct dm_integrity_c *ic, int op, int op_flags)
|
||||
io_req.mem.ptr.addr = ic->sb;
|
||||
io_req.notify.fn = NULL;
|
||||
io_req.client = ic->io;
|
||||
io_loc.bdev = ic->dev->bdev;
|
||||
io_loc.bdev = ic->meta_dev ? ic->meta_dev->bdev : ic->dev->bdev;
|
||||
io_loc.sector = ic->start;
|
||||
io_loc.count = SB_SECTORS;
|
||||
|
||||
@ -753,7 +791,7 @@ static void rw_journal(struct dm_integrity_c *ic, int op, int op_flags, unsigned
|
||||
io_req.notify.fn = NULL;
|
||||
}
|
||||
io_req.client = ic->io;
|
||||
io_loc.bdev = ic->dev->bdev;
|
||||
io_loc.bdev = ic->meta_dev ? ic->meta_dev->bdev : ic->dev->bdev;
|
||||
io_loc.sector = ic->start + SB_SECTORS + sector;
|
||||
io_loc.count = n_sectors;
|
||||
|
||||
@ -857,7 +895,7 @@ static void copy_from_journal(struct dm_integrity_c *ic, unsigned section, unsig
|
||||
io_req.notify.context = data;
|
||||
io_req.client = ic->io;
|
||||
io_loc.bdev = ic->dev->bdev;
|
||||
io_loc.sector = ic->start + target;
|
||||
io_loc.sector = target;
|
||||
io_loc.count = n_sectors;
|
||||
|
||||
r = dm_io(&io_req, 1, &io_loc, NULL);
|
||||
@ -867,13 +905,27 @@ static void copy_from_journal(struct dm_integrity_c *ic, unsigned section, unsig
|
||||
}
|
||||
}
|
||||
|
||||
static bool add_new_range(struct dm_integrity_c *ic, struct dm_integrity_range *new_range)
|
||||
static bool ranges_overlap(struct dm_integrity_range *range1, struct dm_integrity_range *range2)
|
||||
{
|
||||
return range1->logical_sector < range2->logical_sector + range2->n_sectors &&
|
||||
range2->logical_sector + range2->n_sectors > range2->logical_sector;
|
||||
}
|
||||
|
||||
static bool add_new_range(struct dm_integrity_c *ic, struct dm_integrity_range *new_range, bool check_waiting)
|
||||
{
|
||||
struct rb_node **n = &ic->in_progress.rb_node;
|
||||
struct rb_node *parent;
|
||||
|
||||
BUG_ON((new_range->logical_sector | new_range->n_sectors) & (unsigned)(ic->sectors_per_block - 1));
|
||||
|
||||
if (likely(check_waiting)) {
|
||||
struct dm_integrity_range *range;
|
||||
list_for_each_entry(range, &ic->wait_list, wait_entry) {
|
||||
if (unlikely(ranges_overlap(range, new_range)))
|
||||
return false;
|
||||
}
|
||||
}
|
||||
|
||||
parent = NULL;
|
||||
|
||||
while (*n) {
|
||||
@ -898,7 +950,22 @@ static bool add_new_range(struct dm_integrity_c *ic, struct dm_integrity_range *
|
||||
static void remove_range_unlocked(struct dm_integrity_c *ic, struct dm_integrity_range *range)
|
||||
{
|
||||
rb_erase(&range->node, &ic->in_progress);
|
||||
wake_up_locked(&ic->endio_wait);
|
||||
while (unlikely(!list_empty(&ic->wait_list))) {
|
||||
struct dm_integrity_range *last_range =
|
||||
list_first_entry(&ic->wait_list, struct dm_integrity_range, wait_entry);
|
||||
struct task_struct *last_range_task;
|
||||
if (!ranges_overlap(range, last_range))
|
||||
break;
|
||||
last_range_task = last_range->task;
|
||||
list_del(&last_range->wait_entry);
|
||||
if (!add_new_range(ic, last_range, false)) {
|
||||
last_range->task = last_range_task;
|
||||
list_add(&last_range->wait_entry, &ic->wait_list);
|
||||
break;
|
||||
}
|
||||
last_range->waiting = false;
|
||||
wake_up_process(last_range_task);
|
||||
}
|
||||
}
|
||||
|
||||
static void remove_range(struct dm_integrity_c *ic, struct dm_integrity_range *range)
|
||||
@ -910,6 +977,19 @@ static void remove_range(struct dm_integrity_c *ic, struct dm_integrity_range *r
|
||||
spin_unlock_irqrestore(&ic->endio_wait.lock, flags);
|
||||
}
|
||||
|
||||
static void wait_and_add_new_range(struct dm_integrity_c *ic, struct dm_integrity_range *new_range)
|
||||
{
|
||||
new_range->waiting = true;
|
||||
list_add_tail(&new_range->wait_entry, &ic->wait_list);
|
||||
new_range->task = current;
|
||||
do {
|
||||
__set_current_state(TASK_UNINTERRUPTIBLE);
|
||||
spin_unlock_irq(&ic->endio_wait.lock);
|
||||
io_schedule();
|
||||
spin_lock_irq(&ic->endio_wait.lock);
|
||||
} while (unlikely(new_range->waiting));
|
||||
}
|
||||
|
||||
static void init_journal_node(struct journal_node *node)
|
||||
{
|
||||
RB_CLEAR_NODE(&node->node);
|
||||
@ -1599,8 +1679,12 @@ retry:
|
||||
|
||||
dio->range.n_sectors = min(dio->range.n_sectors,
|
||||
ic->free_sectors << ic->sb->log2_sectors_per_block);
|
||||
if (unlikely(!dio->range.n_sectors))
|
||||
goto sleep;
|
||||
if (unlikely(!dio->range.n_sectors)) {
|
||||
if (from_map)
|
||||
goto offload_to_thread;
|
||||
sleep_on_endio_wait(ic);
|
||||
goto retry;
|
||||
}
|
||||
range_sectors = dio->range.n_sectors >> ic->sb->log2_sectors_per_block;
|
||||
ic->free_sectors -= range_sectors;
|
||||
journal_section = ic->free_section;
|
||||
@ -1654,22 +1738,20 @@ retry:
|
||||
}
|
||||
}
|
||||
}
|
||||
if (unlikely(!add_new_range(ic, &dio->range))) {
|
||||
if (unlikely(!add_new_range(ic, &dio->range, true))) {
|
||||
/*
|
||||
* We must not sleep in the request routine because it could
|
||||
* stall bios on current->bio_list.
|
||||
* So, we offload the bio to a workqueue if we have to sleep.
|
||||
*/
|
||||
sleep:
|
||||
if (from_map) {
|
||||
offload_to_thread:
|
||||
spin_unlock_irq(&ic->endio_wait.lock);
|
||||
INIT_WORK(&dio->work, integrity_bio_wait);
|
||||
queue_work(ic->wait_wq, &dio->work);
|
||||
return;
|
||||
} else {
|
||||
sleep_on_endio_wait(ic);
|
||||
goto retry;
|
||||
}
|
||||
wait_and_add_new_range(ic, &dio->range);
|
||||
}
|
||||
spin_unlock_irq(&ic->endio_wait.lock);
|
||||
|
||||
@ -1701,14 +1783,18 @@ sleep:
|
||||
bio->bi_end_io = integrity_end_io;
|
||||
|
||||
bio->bi_iter.bi_size = dio->range.n_sectors << SECTOR_SHIFT;
|
||||
bio->bi_iter.bi_sector += ic->start;
|
||||
generic_make_request(bio);
|
||||
|
||||
if (need_sync_io) {
|
||||
wait_for_completion_io(&read_comp);
|
||||
if (unlikely(ic->recalc_wq != NULL) &&
|
||||
ic->sb->flags & cpu_to_le32(SB_FLAG_RECALCULATING) &&
|
||||
dio->range.logical_sector + dio->range.n_sectors > le64_to_cpu(ic->sb->recalc_sector))
|
||||
goto skip_check;
|
||||
if (likely(!bio->bi_status))
|
||||
integrity_metadata(&dio->work);
|
||||
else
|
||||
skip_check:
|
||||
dec_in_flight(dio);
|
||||
|
||||
} else {
|
||||
@ -1892,8 +1978,8 @@ static void do_journal_write(struct dm_integrity_c *ic, unsigned write_start,
|
||||
io->range.n_sectors = (k - j) << ic->sb->log2_sectors_per_block;
|
||||
|
||||
spin_lock_irq(&ic->endio_wait.lock);
|
||||
while (unlikely(!add_new_range(ic, &io->range)))
|
||||
sleep_on_endio_wait(ic);
|
||||
if (unlikely(!add_new_range(ic, &io->range, true)))
|
||||
wait_and_add_new_range(ic, &io->range);
|
||||
|
||||
if (likely(!from_replay)) {
|
||||
struct journal_node *section_node = &ic->journal_tree[i * ic->journal_section_entries];
|
||||
@ -1981,7 +2067,7 @@ static void integrity_writer(struct work_struct *w)
|
||||
unsigned prev_free_sectors;
|
||||
|
||||
/* the following test is not needed, but it tests the replay code */
|
||||
if (READ_ONCE(ic->suspending))
|
||||
if (READ_ONCE(ic->suspending) && !ic->meta_dev)
|
||||
return;
|
||||
|
||||
spin_lock_irq(&ic->endio_wait.lock);
|
||||
@ -2008,6 +2094,108 @@ static void integrity_writer(struct work_struct *w)
|
||||
spin_unlock_irq(&ic->endio_wait.lock);
|
||||
}
|
||||
|
||||
static void recalc_write_super(struct dm_integrity_c *ic)
|
||||
{
|
||||
int r;
|
||||
|
||||
dm_integrity_flush_buffers(ic);
|
||||
if (dm_integrity_failed(ic))
|
||||
return;
|
||||
|
||||
sb_set_version(ic);
|
||||
r = sync_rw_sb(ic, REQ_OP_WRITE, 0);
|
||||
if (unlikely(r))
|
||||
dm_integrity_io_error(ic, "writing superblock", r);
|
||||
}
|
||||
|
||||
static void integrity_recalc(struct work_struct *w)
|
||||
{
|
||||
struct dm_integrity_c *ic = container_of(w, struct dm_integrity_c, recalc_work);
|
||||
struct dm_integrity_range range;
|
||||
struct dm_io_request io_req;
|
||||
struct dm_io_region io_loc;
|
||||
sector_t area, offset;
|
||||
sector_t metadata_block;
|
||||
unsigned metadata_offset;
|
||||
__u8 *t;
|
||||
unsigned i;
|
||||
int r;
|
||||
unsigned super_counter = 0;
|
||||
|
||||
spin_lock_irq(&ic->endio_wait.lock);
|
||||
|
||||
next_chunk:
|
||||
|
||||
if (unlikely(READ_ONCE(ic->suspending)))
|
||||
goto unlock_ret;
|
||||
|
||||
range.logical_sector = le64_to_cpu(ic->sb->recalc_sector);
|
||||
if (unlikely(range.logical_sector >= ic->provided_data_sectors))
|
||||
goto unlock_ret;
|
||||
|
||||
get_area_and_offset(ic, range.logical_sector, &area, &offset);
|
||||
range.n_sectors = min((sector_t)RECALC_SECTORS, ic->provided_data_sectors - range.logical_sector);
|
||||
if (!ic->meta_dev)
|
||||
range.n_sectors = min(range.n_sectors, (1U << ic->sb->log2_interleave_sectors) - (unsigned)offset);
|
||||
|
||||
if (unlikely(!add_new_range(ic, &range, true)))
|
||||
wait_and_add_new_range(ic, &range);
|
||||
|
||||
spin_unlock_irq(&ic->endio_wait.lock);
|
||||
|
||||
if (unlikely(++super_counter == RECALC_WRITE_SUPER)) {
|
||||
recalc_write_super(ic);
|
||||
super_counter = 0;
|
||||
}
|
||||
|
||||
if (unlikely(dm_integrity_failed(ic)))
|
||||
goto err;
|
||||
|
||||
io_req.bi_op = REQ_OP_READ;
|
||||
io_req.bi_op_flags = 0;
|
||||
io_req.mem.type = DM_IO_VMA;
|
||||
io_req.mem.ptr.addr = ic->recalc_buffer;
|
||||
io_req.notify.fn = NULL;
|
||||
io_req.client = ic->io;
|
||||
io_loc.bdev = ic->dev->bdev;
|
||||
io_loc.sector = get_data_sector(ic, area, offset);
|
||||
io_loc.count = range.n_sectors;
|
||||
|
||||
r = dm_io(&io_req, 1, &io_loc, NULL);
|
||||
if (unlikely(r)) {
|
||||
dm_integrity_io_error(ic, "reading data", r);
|
||||
goto err;
|
||||
}
|
||||
|
||||
t = ic->recalc_tags;
|
||||
for (i = 0; i < range.n_sectors; i += ic->sectors_per_block) {
|
||||
integrity_sector_checksum(ic, range.logical_sector + i, ic->recalc_buffer + (i << SECTOR_SHIFT), t);
|
||||
t += ic->tag_size;
|
||||
}
|
||||
|
||||
metadata_block = get_metadata_sector_and_offset(ic, area, offset, &metadata_offset);
|
||||
|
||||
r = dm_integrity_rw_tag(ic, ic->recalc_tags, &metadata_block, &metadata_offset, t - ic->recalc_tags, TAG_WRITE);
|
||||
if (unlikely(r)) {
|
||||
dm_integrity_io_error(ic, "writing tags", r);
|
||||
goto err;
|
||||
}
|
||||
|
||||
spin_lock_irq(&ic->endio_wait.lock);
|
||||
remove_range_unlocked(ic, &range);
|
||||
ic->sb->recalc_sector = cpu_to_le64(range.logical_sector + range.n_sectors);
|
||||
goto next_chunk;
|
||||
|
||||
err:
|
||||
remove_range(ic, &range);
|
||||
return;
|
||||
|
||||
unlock_ret:
|
||||
spin_unlock_irq(&ic->endio_wait.lock);
|
||||
|
||||
recalc_write_super(ic);
|
||||
}
|
||||
|
||||
static void init_journal(struct dm_integrity_c *ic, unsigned start_section,
|
||||
unsigned n_sections, unsigned char commit_seq)
|
||||
{
|
||||
@ -2210,17 +2398,22 @@ static void dm_integrity_postsuspend(struct dm_target *ti)
|
||||
|
||||
del_timer_sync(&ic->autocommit_timer);
|
||||
|
||||
ic->suspending = true;
|
||||
WRITE_ONCE(ic->suspending, 1);
|
||||
|
||||
if (ic->recalc_wq)
|
||||
drain_workqueue(ic->recalc_wq);
|
||||
|
||||
queue_work(ic->commit_wq, &ic->commit_work);
|
||||
drain_workqueue(ic->commit_wq);
|
||||
|
||||
if (ic->mode == 'J') {
|
||||
if (ic->meta_dev)
|
||||
queue_work(ic->writer_wq, &ic->writer_work);
|
||||
drain_workqueue(ic->writer_wq);
|
||||
dm_integrity_flush_buffers(ic);
|
||||
}
|
||||
|
||||
ic->suspending = false;
|
||||
WRITE_ONCE(ic->suspending, 0);
|
||||
|
||||
BUG_ON(!RB_EMPTY_ROOT(&ic->in_progress));
|
||||
|
||||
@ -2232,6 +2425,16 @@ static void dm_integrity_resume(struct dm_target *ti)
|
||||
struct dm_integrity_c *ic = (struct dm_integrity_c *)ti->private;
|
||||
|
||||
replay_journal(ic);
|
||||
|
||||
if (ic->recalc_wq && ic->sb->flags & cpu_to_le32(SB_FLAG_RECALCULATING)) {
|
||||
__u64 recalc_pos = le64_to_cpu(ic->sb->recalc_sector);
|
||||
if (recalc_pos < ic->provided_data_sectors) {
|
||||
queue_work(ic->recalc_wq, &ic->recalc_work);
|
||||
} else if (recalc_pos > ic->provided_data_sectors) {
|
||||
ic->sb->recalc_sector = cpu_to_le64(ic->provided_data_sectors);
|
||||
recalc_write_super(ic);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
static void dm_integrity_status(struct dm_target *ti, status_type_t type,
|
||||
@ -2243,7 +2446,13 @@ static void dm_integrity_status(struct dm_target *ti, status_type_t type,
|
||||
|
||||
switch (type) {
|
||||
case STATUSTYPE_INFO:
|
||||
DMEMIT("%llu", (unsigned long long)atomic64_read(&ic->number_of_mismatches));
|
||||
DMEMIT("%llu %llu",
|
||||
(unsigned long long)atomic64_read(&ic->number_of_mismatches),
|
||||
(unsigned long long)ic->provided_data_sectors);
|
||||
if (ic->sb->flags & cpu_to_le32(SB_FLAG_RECALCULATING))
|
||||
DMEMIT(" %llu", (unsigned long long)le64_to_cpu(ic->sb->recalc_sector));
|
||||
else
|
||||
DMEMIT(" -");
|
||||
break;
|
||||
|
||||
case STATUSTYPE_TABLE: {
|
||||
@ -2251,19 +2460,25 @@ static void dm_integrity_status(struct dm_target *ti, status_type_t type,
|
||||
watermark_percentage += ic->journal_entries / 2;
|
||||
do_div(watermark_percentage, ic->journal_entries);
|
||||
arg_count = 5;
|
||||
arg_count += !!ic->meta_dev;
|
||||
arg_count += ic->sectors_per_block != 1;
|
||||
arg_count += !!(ic->sb->flags & cpu_to_le32(SB_FLAG_RECALCULATING));
|
||||
arg_count += !!ic->internal_hash_alg.alg_string;
|
||||
arg_count += !!ic->journal_crypt_alg.alg_string;
|
||||
arg_count += !!ic->journal_mac_alg.alg_string;
|
||||
DMEMIT("%s %llu %u %c %u", ic->dev->name, (unsigned long long)ic->start,
|
||||
ic->tag_size, ic->mode, arg_count);
|
||||
if (ic->meta_dev)
|
||||
DMEMIT(" meta_device:%s", ic->meta_dev->name);
|
||||
if (ic->sectors_per_block != 1)
|
||||
DMEMIT(" block_size:%u", ic->sectors_per_block << SECTOR_SHIFT);
|
||||
if (ic->sb->flags & cpu_to_le32(SB_FLAG_RECALCULATING))
|
||||
DMEMIT(" recalculate");
|
||||
DMEMIT(" journal_sectors:%u", ic->initial_sectors - SB_SECTORS);
|
||||
DMEMIT(" interleave_sectors:%u", 1U << ic->sb->log2_interleave_sectors);
|
||||
DMEMIT(" buffer_sectors:%u", 1U << ic->log2_buffer_sectors);
|
||||
DMEMIT(" journal_watermark:%u", (unsigned)watermark_percentage);
|
||||
DMEMIT(" commit_time:%u", ic->autocommit_msec);
|
||||
if (ic->sectors_per_block != 1)
|
||||
DMEMIT(" block_size:%u", ic->sectors_per_block << SECTOR_SHIFT);
|
||||
|
||||
#define EMIT_ALG(a, n) \
|
||||
do { \
|
||||
@ -2286,7 +2501,10 @@ static int dm_integrity_iterate_devices(struct dm_target *ti,
|
||||
{
|
||||
struct dm_integrity_c *ic = ti->private;
|
||||
|
||||
return fn(ti, ic->dev, ic->start + ic->initial_sectors + ic->metadata_run, ti->len, data);
|
||||
if (!ic->meta_dev)
|
||||
return fn(ti, ic->dev, ic->start + ic->initial_sectors + ic->metadata_run, ti->len, data);
|
||||
else
|
||||
return fn(ti, ic->dev, 0, ti->len, data);
|
||||
}
|
||||
|
||||
static void dm_integrity_io_hints(struct dm_target *ti, struct queue_limits *limits)
|
||||
@ -2319,26 +2537,38 @@ static void calculate_journal_section_size(struct dm_integrity_c *ic)
|
||||
static int calculate_device_limits(struct dm_integrity_c *ic)
|
||||
{
|
||||
__u64 initial_sectors;
|
||||
sector_t last_sector, last_area, last_offset;
|
||||
|
||||
calculate_journal_section_size(ic);
|
||||
initial_sectors = SB_SECTORS + (__u64)ic->journal_section_sectors * ic->journal_sections;
|
||||
if (initial_sectors + METADATA_PADDING_SECTORS >= ic->device_sectors || initial_sectors > UINT_MAX)
|
||||
if (initial_sectors + METADATA_PADDING_SECTORS >= ic->meta_device_sectors || initial_sectors > UINT_MAX)
|
||||
return -EINVAL;
|
||||
ic->initial_sectors = initial_sectors;
|
||||
|
||||
ic->metadata_run = roundup((__u64)ic->tag_size << (ic->sb->log2_interleave_sectors - ic->sb->log2_sectors_per_block),
|
||||
(__u64)(1 << SECTOR_SHIFT << METADATA_PADDING_SECTORS)) >> SECTOR_SHIFT;
|
||||
if (!(ic->metadata_run & (ic->metadata_run - 1)))
|
||||
ic->log2_metadata_run = __ffs(ic->metadata_run);
|
||||
else
|
||||
ic->log2_metadata_run = -1;
|
||||
if (!ic->meta_dev) {
|
||||
sector_t last_sector, last_area, last_offset;
|
||||
|
||||
get_area_and_offset(ic, ic->provided_data_sectors - 1, &last_area, &last_offset);
|
||||
last_sector = get_data_sector(ic, last_area, last_offset);
|
||||
ic->metadata_run = roundup((__u64)ic->tag_size << (ic->sb->log2_interleave_sectors - ic->sb->log2_sectors_per_block),
|
||||
(__u64)(1 << SECTOR_SHIFT << METADATA_PADDING_SECTORS)) >> SECTOR_SHIFT;
|
||||
if (!(ic->metadata_run & (ic->metadata_run - 1)))
|
||||
ic->log2_metadata_run = __ffs(ic->metadata_run);
|
||||
else
|
||||
ic->log2_metadata_run = -1;
|
||||
|
||||
if (ic->start + last_sector < last_sector || ic->start + last_sector >= ic->device_sectors)
|
||||
return -EINVAL;
|
||||
get_area_and_offset(ic, ic->provided_data_sectors - 1, &last_area, &last_offset);
|
||||
last_sector = get_data_sector(ic, last_area, last_offset);
|
||||
if (last_sector < ic->start || last_sector >= ic->meta_device_sectors)
|
||||
return -EINVAL;
|
||||
} else {
|
||||
__u64 meta_size = ic->provided_data_sectors * ic->tag_size;
|
||||
meta_size = (meta_size + ((1U << (ic->log2_buffer_sectors + SECTOR_SHIFT)) - 1))
|
||||
>> (ic->log2_buffer_sectors + SECTOR_SHIFT);
|
||||
meta_size <<= ic->log2_buffer_sectors;
|
||||
if (ic->initial_sectors + meta_size < ic->initial_sectors ||
|
||||
ic->initial_sectors + meta_size > ic->meta_device_sectors)
|
||||
return -EINVAL;
|
||||
ic->metadata_run = 1;
|
||||
ic->log2_metadata_run = 0;
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
@ -2350,7 +2580,6 @@ static int initialize_superblock(struct dm_integrity_c *ic, unsigned journal_sec
|
||||
|
||||
memset(ic->sb, 0, SB_SECTORS << SECTOR_SHIFT);
|
||||
memcpy(ic->sb->magic, SB_MAGIC, 8);
|
||||
ic->sb->version = SB_VERSION;
|
||||
ic->sb->integrity_tag_size = cpu_to_le16(ic->tag_size);
|
||||
ic->sb->log2_sectors_per_block = __ffs(ic->sectors_per_block);
|
||||
if (ic->journal_mac_alg.alg_string)
|
||||
@ -2360,28 +2589,55 @@ static int initialize_superblock(struct dm_integrity_c *ic, unsigned journal_sec
|
||||
journal_sections = journal_sectors / ic->journal_section_sectors;
|
||||
if (!journal_sections)
|
||||
journal_sections = 1;
|
||||
ic->sb->journal_sections = cpu_to_le32(journal_sections);
|
||||
|
||||
if (!interleave_sectors)
|
||||
interleave_sectors = DEFAULT_INTERLEAVE_SECTORS;
|
||||
ic->sb->log2_interleave_sectors = __fls(interleave_sectors);
|
||||
ic->sb->log2_interleave_sectors = max((__u8)MIN_LOG2_INTERLEAVE_SECTORS, ic->sb->log2_interleave_sectors);
|
||||
ic->sb->log2_interleave_sectors = min((__u8)MAX_LOG2_INTERLEAVE_SECTORS, ic->sb->log2_interleave_sectors);
|
||||
if (!ic->meta_dev) {
|
||||
ic->sb->journal_sections = cpu_to_le32(journal_sections);
|
||||
if (!interleave_sectors)
|
||||
interleave_sectors = DEFAULT_INTERLEAVE_SECTORS;
|
||||
ic->sb->log2_interleave_sectors = __fls(interleave_sectors);
|
||||
ic->sb->log2_interleave_sectors = max((__u8)MIN_LOG2_INTERLEAVE_SECTORS, ic->sb->log2_interleave_sectors);
|
||||
ic->sb->log2_interleave_sectors = min((__u8)MAX_LOG2_INTERLEAVE_SECTORS, ic->sb->log2_interleave_sectors);
|
||||
|
||||
ic->provided_data_sectors = 0;
|
||||
for (test_bit = fls64(ic->device_sectors) - 1; test_bit >= 3; test_bit--) {
|
||||
__u64 prev_data_sectors = ic->provided_data_sectors;
|
||||
ic->provided_data_sectors = 0;
|
||||
for (test_bit = fls64(ic->meta_device_sectors) - 1; test_bit >= 3; test_bit--) {
|
||||
__u64 prev_data_sectors = ic->provided_data_sectors;
|
||||
|
||||
ic->provided_data_sectors |= (sector_t)1 << test_bit;
|
||||
if (calculate_device_limits(ic))
|
||||
ic->provided_data_sectors = prev_data_sectors;
|
||||
ic->provided_data_sectors |= (sector_t)1 << test_bit;
|
||||
if (calculate_device_limits(ic))
|
||||
ic->provided_data_sectors = prev_data_sectors;
|
||||
}
|
||||
if (!ic->provided_data_sectors)
|
||||
return -EINVAL;
|
||||
} else {
|
||||
ic->sb->log2_interleave_sectors = 0;
|
||||
ic->provided_data_sectors = ic->data_device_sectors;
|
||||
ic->provided_data_sectors &= ~(sector_t)(ic->sectors_per_block - 1);
|
||||
|
||||
try_smaller_buffer:
|
||||
ic->sb->journal_sections = cpu_to_le32(0);
|
||||
for (test_bit = fls(journal_sections) - 1; test_bit >= 0; test_bit--) {
|
||||
__u32 prev_journal_sections = le32_to_cpu(ic->sb->journal_sections);
|
||||
__u32 test_journal_sections = prev_journal_sections | (1U << test_bit);
|
||||
if (test_journal_sections > journal_sections)
|
||||
continue;
|
||||
ic->sb->journal_sections = cpu_to_le32(test_journal_sections);
|
||||
if (calculate_device_limits(ic))
|
||||
ic->sb->journal_sections = cpu_to_le32(prev_journal_sections);
|
||||
|
||||
}
|
||||
if (!le32_to_cpu(ic->sb->journal_sections)) {
|
||||
if (ic->log2_buffer_sectors > 3) {
|
||||
ic->log2_buffer_sectors--;
|
||||
goto try_smaller_buffer;
|
||||
}
|
||||
return -EINVAL;
|
||||
}
|
||||
}
|
||||
|
||||
if (!ic->provided_data_sectors)
|
||||
return -EINVAL;
|
||||
|
||||
ic->sb->provided_data_sectors = cpu_to_le64(ic->provided_data_sectors);
|
||||
|
||||
sb_set_version(ic);
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
@ -2828,6 +3084,7 @@ static int dm_integrity_ctr(struct dm_target *ti, unsigned argc, char **argv)
|
||||
{0, 9, "Invalid number of feature args"},
|
||||
};
|
||||
unsigned journal_sectors, interleave_sectors, buffer_sectors, journal_watermark, sync_msec;
|
||||
bool recalculate;
|
||||
bool should_write_sb;
|
||||
__u64 threshold;
|
||||
unsigned long long start;
|
||||
@ -2848,6 +3105,7 @@ static int dm_integrity_ctr(struct dm_target *ti, unsigned argc, char **argv)
|
||||
ti->per_io_data_size = sizeof(struct dm_integrity_io);
|
||||
|
||||
ic->in_progress = RB_ROOT;
|
||||
INIT_LIST_HEAD(&ic->wait_list);
|
||||
init_waitqueue_head(&ic->endio_wait);
|
||||
bio_list_init(&ic->flush_bio_list);
|
||||
init_waitqueue_head(&ic->copy_to_journal_wait);
|
||||
@ -2883,13 +3141,12 @@ static int dm_integrity_ctr(struct dm_target *ti, unsigned argc, char **argv)
|
||||
goto bad;
|
||||
}
|
||||
|
||||
ic->device_sectors = i_size_read(ic->dev->bdev->bd_inode) >> SECTOR_SHIFT;
|
||||
journal_sectors = min((sector_t)DEFAULT_MAX_JOURNAL_SECTORS,
|
||||
ic->device_sectors >> DEFAULT_JOURNAL_SIZE_FACTOR);
|
||||
journal_sectors = 0;
|
||||
interleave_sectors = DEFAULT_INTERLEAVE_SECTORS;
|
||||
buffer_sectors = DEFAULT_BUFFER_SECTORS;
|
||||
journal_watermark = DEFAULT_JOURNAL_WATERMARK;
|
||||
sync_msec = DEFAULT_SYNC_MSEC;
|
||||
recalculate = false;
|
||||
ic->sectors_per_block = 1;
|
||||
|
||||
as.argc = argc - DIRECT_ARGUMENTS;
|
||||
@ -2908,7 +3165,7 @@ static int dm_integrity_ctr(struct dm_target *ti, unsigned argc, char **argv)
|
||||
goto bad;
|
||||
}
|
||||
if (sscanf(opt_string, "journal_sectors:%u%c", &val, &dummy) == 1)
|
||||
journal_sectors = val;
|
||||
journal_sectors = val ? val : 1;
|
||||
else if (sscanf(opt_string, "interleave_sectors:%u%c", &val, &dummy) == 1)
|
||||
interleave_sectors = val;
|
||||
else if (sscanf(opt_string, "buffer_sectors:%u%c", &val, &dummy) == 1)
|
||||
@ -2917,7 +3174,17 @@ static int dm_integrity_ctr(struct dm_target *ti, unsigned argc, char **argv)
|
||||
journal_watermark = val;
|
||||
else if (sscanf(opt_string, "commit_time:%u%c", &val, &dummy) == 1)
|
||||
sync_msec = val;
|
||||
else if (sscanf(opt_string, "block_size:%u%c", &val, &dummy) == 1) {
|
||||
else if (!memcmp(opt_string, "meta_device:", strlen("meta_device:"))) {
|
||||
if (ic->meta_dev) {
|
||||
dm_put_device(ti, ic->meta_dev);
|
||||
ic->meta_dev = NULL;
|
||||
}
|
||||
r = dm_get_device(ti, strchr(opt_string, ':') + 1, dm_table_get_mode(ti->table), &ic->meta_dev);
|
||||
if (r) {
|
||||
ti->error = "Device lookup failed";
|
||||
goto bad;
|
||||
}
|
||||
} else if (sscanf(opt_string, "block_size:%u%c", &val, &dummy) == 1) {
|
||||
if (val < 1 << SECTOR_SHIFT ||
|
||||
val > MAX_SECTORS_PER_BLOCK << SECTOR_SHIFT ||
|
||||
(val & (val -1))) {
|
||||
@ -2941,6 +3208,8 @@ static int dm_integrity_ctr(struct dm_target *ti, unsigned argc, char **argv)
|
||||
"Invalid journal_mac argument");
|
||||
if (r)
|
||||
goto bad;
|
||||
} else if (!strcmp(opt_string, "recalculate")) {
|
||||
recalculate = true;
|
||||
} else {
|
||||
r = -EINVAL;
|
||||
ti->error = "Invalid argument";
|
||||
@ -2948,6 +3217,21 @@ static int dm_integrity_ctr(struct dm_target *ti, unsigned argc, char **argv)
|
||||
}
|
||||
}
|
||||
|
||||
ic->data_device_sectors = i_size_read(ic->dev->bdev->bd_inode) >> SECTOR_SHIFT;
|
||||
if (!ic->meta_dev)
|
||||
ic->meta_device_sectors = ic->data_device_sectors;
|
||||
else
|
||||
ic->meta_device_sectors = i_size_read(ic->meta_dev->bdev->bd_inode) >> SECTOR_SHIFT;
|
||||
|
||||
if (!journal_sectors) {
|
||||
journal_sectors = min((sector_t)DEFAULT_MAX_JOURNAL_SECTORS,
|
||||
ic->data_device_sectors >> DEFAULT_JOURNAL_SIZE_FACTOR);
|
||||
}
|
||||
|
||||
if (!buffer_sectors)
|
||||
buffer_sectors = 1;
|
||||
ic->log2_buffer_sectors = min((int)__fls(buffer_sectors), 31 - SECTOR_SHIFT);
|
||||
|
||||
r = get_mac(&ic->internal_hash, &ic->internal_hash_alg, &ti->error,
|
||||
"Invalid internal hash", "Error setting internal hash key");
|
||||
if (r)
|
||||
@ -3062,7 +3346,7 @@ static int dm_integrity_ctr(struct dm_target *ti, unsigned argc, char **argv)
|
||||
should_write_sb = true;
|
||||
}
|
||||
|
||||
if (ic->sb->version != SB_VERSION) {
|
||||
if (!ic->sb->version || ic->sb->version > SB_VERSION_2) {
|
||||
r = -EINVAL;
|
||||
ti->error = "Unknown version";
|
||||
goto bad;
|
||||
@ -3083,11 +3367,19 @@ static int dm_integrity_ctr(struct dm_target *ti, unsigned argc, char **argv)
|
||||
goto bad;
|
||||
}
|
||||
/* make sure that ti->max_io_len doesn't overflow */
|
||||
if (ic->sb->log2_interleave_sectors < MIN_LOG2_INTERLEAVE_SECTORS ||
|
||||
ic->sb->log2_interleave_sectors > MAX_LOG2_INTERLEAVE_SECTORS) {
|
||||
r = -EINVAL;
|
||||
ti->error = "Invalid interleave_sectors in the superblock";
|
||||
goto bad;
|
||||
if (!ic->meta_dev) {
|
||||
if (ic->sb->log2_interleave_sectors < MIN_LOG2_INTERLEAVE_SECTORS ||
|
||||
ic->sb->log2_interleave_sectors > MAX_LOG2_INTERLEAVE_SECTORS) {
|
||||
r = -EINVAL;
|
||||
ti->error = "Invalid interleave_sectors in the superblock";
|
||||
goto bad;
|
||||
}
|
||||
} else {
|
||||
if (ic->sb->log2_interleave_sectors) {
|
||||
r = -EINVAL;
|
||||
ti->error = "Invalid interleave_sectors in the superblock";
|
||||
goto bad;
|
||||
}
|
||||
}
|
||||
ic->provided_data_sectors = le64_to_cpu(ic->sb->provided_data_sectors);
|
||||
if (ic->provided_data_sectors != le64_to_cpu(ic->sb->provided_data_sectors)) {
|
||||
@ -3101,20 +3393,28 @@ static int dm_integrity_ctr(struct dm_target *ti, unsigned argc, char **argv)
|
||||
ti->error = "Journal mac mismatch";
|
||||
goto bad;
|
||||
}
|
||||
|
||||
try_smaller_buffer:
|
||||
r = calculate_device_limits(ic);
|
||||
if (r) {
|
||||
if (ic->meta_dev) {
|
||||
if (ic->log2_buffer_sectors > 3) {
|
||||
ic->log2_buffer_sectors--;
|
||||
goto try_smaller_buffer;
|
||||
}
|
||||
}
|
||||
ti->error = "The device is too small";
|
||||
goto bad;
|
||||
}
|
||||
if (!ic->meta_dev)
|
||||
ic->log2_buffer_sectors = min(ic->log2_buffer_sectors, (__u8)__ffs(ic->metadata_run));
|
||||
|
||||
if (ti->len > ic->provided_data_sectors) {
|
||||
r = -EINVAL;
|
||||
ti->error = "Not enough provided sectors for requested mapping size";
|
||||
goto bad;
|
||||
}
|
||||
|
||||
if (!buffer_sectors)
|
||||
buffer_sectors = 1;
|
||||
ic->log2_buffer_sectors = min3((int)__fls(buffer_sectors), (int)__ffs(ic->metadata_run), 31 - SECTOR_SHIFT);
|
||||
|
||||
threshold = (__u64)ic->journal_entries * (100 - journal_watermark);
|
||||
threshold += 50;
|
||||
@ -3138,8 +3438,40 @@ static int dm_integrity_ctr(struct dm_target *ti, unsigned argc, char **argv)
|
||||
(unsigned long long)ic->provided_data_sectors);
|
||||
DEBUG_print(" log2_buffer_sectors %u\n", ic->log2_buffer_sectors);
|
||||
|
||||
ic->bufio = dm_bufio_client_create(ic->dev->bdev, 1U << (SECTOR_SHIFT + ic->log2_buffer_sectors),
|
||||
1, 0, NULL, NULL);
|
||||
if (recalculate && !(ic->sb->flags & cpu_to_le32(SB_FLAG_RECALCULATING))) {
|
||||
ic->sb->flags |= cpu_to_le32(SB_FLAG_RECALCULATING);
|
||||
ic->sb->recalc_sector = cpu_to_le64(0);
|
||||
}
|
||||
|
||||
if (ic->sb->flags & cpu_to_le32(SB_FLAG_RECALCULATING)) {
|
||||
if (!ic->internal_hash) {
|
||||
r = -EINVAL;
|
||||
ti->error = "Recalculate is only valid with internal hash";
|
||||
goto bad;
|
||||
}
|
||||
ic->recalc_wq = alloc_workqueue("dm-intergrity-recalc", WQ_MEM_RECLAIM, 1);
|
||||
if (!ic->recalc_wq ) {
|
||||
ti->error = "Cannot allocate workqueue";
|
||||
r = -ENOMEM;
|
||||
goto bad;
|
||||
}
|
||||
INIT_WORK(&ic->recalc_work, integrity_recalc);
|
||||
ic->recalc_buffer = vmalloc(RECALC_SECTORS << SECTOR_SHIFT);
|
||||
if (!ic->recalc_buffer) {
|
||||
ti->error = "Cannot allocate buffer for recalculating";
|
||||
r = -ENOMEM;
|
||||
goto bad;
|
||||
}
|
||||
ic->recalc_tags = kvmalloc((RECALC_SECTORS >> ic->sb->log2_sectors_per_block) * ic->tag_size, GFP_KERNEL);
|
||||
if (!ic->recalc_tags) {
|
||||
ti->error = "Cannot allocate tags for recalculating";
|
||||
r = -ENOMEM;
|
||||
goto bad;
|
||||
}
|
||||
}
|
||||
|
||||
ic->bufio = dm_bufio_client_create(ic->meta_dev ? ic->meta_dev->bdev : ic->dev->bdev,
|
||||
1U << (SECTOR_SHIFT + ic->log2_buffer_sectors), 1, 0, NULL, NULL);
|
||||
if (IS_ERR(ic->bufio)) {
|
||||
r = PTR_ERR(ic->bufio);
|
||||
ti->error = "Cannot initialize dm-bufio";
|
||||
@ -3171,9 +3503,11 @@ static int dm_integrity_ctr(struct dm_target *ti, unsigned argc, char **argv)
|
||||
ic->just_formatted = true;
|
||||
}
|
||||
|
||||
r = dm_set_target_max_io_len(ti, 1U << ic->sb->log2_interleave_sectors);
|
||||
if (r)
|
||||
goto bad;
|
||||
if (!ic->meta_dev) {
|
||||
r = dm_set_target_max_io_len(ti, 1U << ic->sb->log2_interleave_sectors);
|
||||
if (r)
|
||||
goto bad;
|
||||
}
|
||||
|
||||
if (!ic->internal_hash)
|
||||
dm_integrity_set(ti, ic);
|
||||
@ -3192,6 +3526,7 @@ static void dm_integrity_dtr(struct dm_target *ti)
|
||||
struct dm_integrity_c *ic = ti->private;
|
||||
|
||||
BUG_ON(!RB_EMPTY_ROOT(&ic->in_progress));
|
||||
BUG_ON(!list_empty(&ic->wait_list));
|
||||
|
||||
if (ic->metadata_wq)
|
||||
destroy_workqueue(ic->metadata_wq);
|
||||
@ -3201,6 +3536,12 @@ static void dm_integrity_dtr(struct dm_target *ti)
|
||||
destroy_workqueue(ic->commit_wq);
|
||||
if (ic->writer_wq)
|
||||
destroy_workqueue(ic->writer_wq);
|
||||
if (ic->recalc_wq)
|
||||
destroy_workqueue(ic->recalc_wq);
|
||||
if (ic->recalc_buffer)
|
||||
vfree(ic->recalc_buffer);
|
||||
if (ic->recalc_tags)
|
||||
kvfree(ic->recalc_tags);
|
||||
if (ic->bufio)
|
||||
dm_bufio_client_destroy(ic->bufio);
|
||||
mempool_exit(&ic->journal_io_mempool);
|
||||
@ -3208,6 +3549,8 @@ static void dm_integrity_dtr(struct dm_target *ti)
|
||||
dm_io_client_destroy(ic->io);
|
||||
if (ic->dev)
|
||||
dm_put_device(ti, ic->dev);
|
||||
if (ic->meta_dev)
|
||||
dm_put_device(ti, ic->meta_dev);
|
||||
dm_integrity_free_page_list(ic, ic->journal);
|
||||
dm_integrity_free_page_list(ic, ic->journal_io);
|
||||
dm_integrity_free_page_list(ic, ic->journal_xor);
|
||||
@ -3248,7 +3591,7 @@ static void dm_integrity_dtr(struct dm_target *ti)
|
||||
|
||||
static struct target_type integrity_target = {
|
||||
.name = "integrity",
|
||||
.version = {1, 1, 0},
|
||||
.version = {1, 2, 0},
|
||||
.module = THIS_MODULE,
|
||||
.features = DM_TARGET_SINGLETON | DM_TARGET_INTEGRITY,
|
||||
.ctr = dm_integrity_ctr,
|
||||
|
@ -487,6 +487,8 @@ static int run_complete_job(struct kcopyd_job *job)
|
||||
if (atomic_dec_and_test(&kc->nr_jobs))
|
||||
wake_up(&kc->destroyq);
|
||||
|
||||
cond_resched();
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
@ -741,9 +743,9 @@ static void split_job(struct kcopyd_job *master_job)
|
||||
}
|
||||
}
|
||||
|
||||
int dm_kcopyd_copy(struct dm_kcopyd_client *kc, struct dm_io_region *from,
|
||||
unsigned int num_dests, struct dm_io_region *dests,
|
||||
unsigned int flags, dm_kcopyd_notify_fn fn, void *context)
|
||||
void dm_kcopyd_copy(struct dm_kcopyd_client *kc, struct dm_io_region *from,
|
||||
unsigned int num_dests, struct dm_io_region *dests,
|
||||
unsigned int flags, dm_kcopyd_notify_fn fn, void *context)
|
||||
{
|
||||
struct kcopyd_job *job;
|
||||
int i;
|
||||
@ -818,16 +820,14 @@ int dm_kcopyd_copy(struct dm_kcopyd_client *kc, struct dm_io_region *from,
|
||||
job->progress = 0;
|
||||
split_job(job);
|
||||
}
|
||||
|
||||
return 0;
|
||||
}
|
||||
EXPORT_SYMBOL(dm_kcopyd_copy);
|
||||
|
||||
int dm_kcopyd_zero(struct dm_kcopyd_client *kc,
|
||||
unsigned num_dests, struct dm_io_region *dests,
|
||||
unsigned flags, dm_kcopyd_notify_fn fn, void *context)
|
||||
void dm_kcopyd_zero(struct dm_kcopyd_client *kc,
|
||||
unsigned num_dests, struct dm_io_region *dests,
|
||||
unsigned flags, dm_kcopyd_notify_fn fn, void *context)
|
||||
{
|
||||
return dm_kcopyd_copy(kc, NULL, num_dests, dests, flags, fn, context);
|
||||
dm_kcopyd_copy(kc, NULL, num_dests, dests, flags, fn, context);
|
||||
}
|
||||
EXPORT_SYMBOL(dm_kcopyd_zero);
|
||||
|
||||
|
@ -326,9 +326,8 @@ static void recovery_complete(int read_err, unsigned long write_err,
|
||||
dm_rh_recovery_end(reg, !(read_err || write_err));
|
||||
}
|
||||
|
||||
static int recover(struct mirror_set *ms, struct dm_region *reg)
|
||||
static void recover(struct mirror_set *ms, struct dm_region *reg)
|
||||
{
|
||||
int r;
|
||||
unsigned i;
|
||||
struct dm_io_region from, to[DM_KCOPYD_MAX_REGIONS], *dest;
|
||||
struct mirror *m;
|
||||
@ -367,10 +366,8 @@ static int recover(struct mirror_set *ms, struct dm_region *reg)
|
||||
if (!errors_handled(ms))
|
||||
set_bit(DM_KCOPYD_IGNORE_ERROR, &flags);
|
||||
|
||||
r = dm_kcopyd_copy(ms->kcopyd_client, &from, ms->nr_mirrors - 1, to,
|
||||
flags, recovery_complete, reg);
|
||||
|
||||
return r;
|
||||
dm_kcopyd_copy(ms->kcopyd_client, &from, ms->nr_mirrors - 1, to,
|
||||
flags, recovery_complete, reg);
|
||||
}
|
||||
|
||||
static void reset_ms_flags(struct mirror_set *ms)
|
||||
@ -388,7 +385,6 @@ static void do_recovery(struct mirror_set *ms)
|
||||
{
|
||||
struct dm_region *reg;
|
||||
struct dm_dirty_log *log = dm_rh_dirty_log(ms->rh);
|
||||
int r;
|
||||
|
||||
/*
|
||||
* Start quiescing some regions.
|
||||
@ -398,11 +394,8 @@ static void do_recovery(struct mirror_set *ms)
|
||||
/*
|
||||
* Copy any already quiesced regions.
|
||||
*/
|
||||
while ((reg = dm_rh_recovery_start(ms->rh))) {
|
||||
r = recover(ms, reg);
|
||||
if (r)
|
||||
dm_rh_recovery_end(reg, 0);
|
||||
}
|
||||
while ((reg = dm_rh_recovery_start(ms->rh)))
|
||||
recover(ms, reg);
|
||||
|
||||
/*
|
||||
* Update the in sync flag.
|
||||
|
@ -85,7 +85,7 @@ struct dm_snapshot {
|
||||
* A list of pending exceptions that completed out of order.
|
||||
* Protected by kcopyd single-threaded callback.
|
||||
*/
|
||||
struct list_head out_of_order_list;
|
||||
struct rb_root out_of_order_tree;
|
||||
|
||||
mempool_t pending_pool;
|
||||
|
||||
@ -200,7 +200,7 @@ struct dm_snap_pending_exception {
|
||||
/* A sequence number, it is used for in-order completion. */
|
||||
sector_t exception_sequence;
|
||||
|
||||
struct list_head out_of_order_entry;
|
||||
struct rb_node out_of_order_node;
|
||||
|
||||
/*
|
||||
* For writing a complete chunk, bypassing the copy.
|
||||
@ -1173,7 +1173,7 @@ static int snapshot_ctr(struct dm_target *ti, unsigned int argc, char **argv)
|
||||
atomic_set(&s->pending_exceptions_count, 0);
|
||||
s->exception_start_sequence = 0;
|
||||
s->exception_complete_sequence = 0;
|
||||
INIT_LIST_HEAD(&s->out_of_order_list);
|
||||
s->out_of_order_tree = RB_ROOT;
|
||||
mutex_init(&s->lock);
|
||||
INIT_LIST_HEAD(&s->list);
|
||||
spin_lock_init(&s->pe_lock);
|
||||
@ -1539,28 +1539,41 @@ static void copy_callback(int read_err, unsigned long write_err, void *context)
|
||||
pe->copy_error = read_err || write_err;
|
||||
|
||||
if (pe->exception_sequence == s->exception_complete_sequence) {
|
||||
struct rb_node *next;
|
||||
|
||||
s->exception_complete_sequence++;
|
||||
complete_exception(pe);
|
||||
|
||||
while (!list_empty(&s->out_of_order_list)) {
|
||||
pe = list_entry(s->out_of_order_list.next,
|
||||
struct dm_snap_pending_exception, out_of_order_entry);
|
||||
next = rb_first(&s->out_of_order_tree);
|
||||
while (next) {
|
||||
pe = rb_entry(next, struct dm_snap_pending_exception,
|
||||
out_of_order_node);
|
||||
if (pe->exception_sequence != s->exception_complete_sequence)
|
||||
break;
|
||||
next = rb_next(next);
|
||||
s->exception_complete_sequence++;
|
||||
list_del(&pe->out_of_order_entry);
|
||||
rb_erase(&pe->out_of_order_node, &s->out_of_order_tree);
|
||||
complete_exception(pe);
|
||||
cond_resched();
|
||||
}
|
||||
} else {
|
||||
struct list_head *lh;
|
||||
struct rb_node *parent = NULL;
|
||||
struct rb_node **p = &s->out_of_order_tree.rb_node;
|
||||
struct dm_snap_pending_exception *pe2;
|
||||
|
||||
list_for_each_prev(lh, &s->out_of_order_list) {
|
||||
pe2 = list_entry(lh, struct dm_snap_pending_exception, out_of_order_entry);
|
||||
if (pe2->exception_sequence < pe->exception_sequence)
|
||||
break;
|
||||
while (*p) {
|
||||
pe2 = rb_entry(*p, struct dm_snap_pending_exception, out_of_order_node);
|
||||
parent = *p;
|
||||
|
||||
BUG_ON(pe->exception_sequence == pe2->exception_sequence);
|
||||
if (pe->exception_sequence < pe2->exception_sequence)
|
||||
p = &((*p)->rb_left);
|
||||
else
|
||||
p = &((*p)->rb_right);
|
||||
}
|
||||
list_add(&pe->out_of_order_entry, lh);
|
||||
|
||||
rb_link_node(&pe->out_of_order_node, parent, p);
|
||||
rb_insert_color(&pe->out_of_order_node, &s->out_of_order_tree);
|
||||
}
|
||||
}
|
||||
|
||||
@ -1694,8 +1707,6 @@ static int snapshot_map(struct dm_target *ti, struct bio *bio)
|
||||
if (!s->valid)
|
||||
return DM_MAPIO_KILL;
|
||||
|
||||
/* FIXME: should only take write lock if we need
|
||||
* to copy an exception */
|
||||
mutex_lock(&s->lock);
|
||||
|
||||
if (!s->valid || (unlikely(s->snapshot_overflowed) &&
|
||||
|
@ -1220,18 +1220,13 @@ static struct dm_thin_new_mapping *get_next_mapping(struct pool *pool)
|
||||
static void ll_zero(struct thin_c *tc, struct dm_thin_new_mapping *m,
|
||||
sector_t begin, sector_t end)
|
||||
{
|
||||
int r;
|
||||
struct dm_io_region to;
|
||||
|
||||
to.bdev = tc->pool_dev->bdev;
|
||||
to.sector = begin;
|
||||
to.count = end - begin;
|
||||
|
||||
r = dm_kcopyd_zero(tc->pool->copier, 1, &to, 0, copy_complete, m);
|
||||
if (r < 0) {
|
||||
DMERR_LIMIT("dm_kcopyd_zero() failed");
|
||||
copy_complete(1, 1, m);
|
||||
}
|
||||
dm_kcopyd_zero(tc->pool->copier, 1, &to, 0, copy_complete, m);
|
||||
}
|
||||
|
||||
static void remap_and_issue_overwrite(struct thin_c *tc, struct bio *bio,
|
||||
@ -1257,7 +1252,6 @@ static void schedule_copy(struct thin_c *tc, dm_block_t virt_block,
|
||||
struct dm_bio_prison_cell *cell, struct bio *bio,
|
||||
sector_t len)
|
||||
{
|
||||
int r;
|
||||
struct pool *pool = tc->pool;
|
||||
struct dm_thin_new_mapping *m = get_next_mapping(pool);
|
||||
|
||||
@ -1296,19 +1290,8 @@ static void schedule_copy(struct thin_c *tc, dm_block_t virt_block,
|
||||
to.sector = data_dest * pool->sectors_per_block;
|
||||
to.count = len;
|
||||
|
||||
r = dm_kcopyd_copy(pool->copier, &from, 1, &to,
|
||||
0, copy_complete, m);
|
||||
if (r < 0) {
|
||||
DMERR_LIMIT("dm_kcopyd_copy() failed");
|
||||
copy_complete(1, 1, m);
|
||||
|
||||
/*
|
||||
* We allow the zero to be issued, to simplify the
|
||||
* error path. Otherwise we'd need to start
|
||||
* worrying about decrementing the prepare_actions
|
||||
* counter.
|
||||
*/
|
||||
}
|
||||
dm_kcopyd_copy(pool->copier, &from, 1, &to,
|
||||
0, copy_complete, m);
|
||||
|
||||
/*
|
||||
* Do we need to zero a tail region?
|
||||
@ -2520,6 +2503,8 @@ static void set_pool_mode(struct pool *pool, enum pool_mode new_mode)
|
||||
case PM_WRITE:
|
||||
if (old_mode != new_mode)
|
||||
notify_of_pool_mode_change(pool, "write");
|
||||
if (old_mode == PM_OUT_OF_DATA_SPACE)
|
||||
cancel_delayed_work_sync(&pool->no_space_timeout);
|
||||
pool->out_of_data_space = false;
|
||||
pool->pf.error_if_no_space = pt->requested_pf.error_if_no_space;
|
||||
dm_pool_metadata_read_write(pool->pmd);
|
||||
@ -3890,6 +3875,8 @@ static void pool_status(struct dm_target *ti, status_type_t type,
|
||||
else
|
||||
DMEMIT("- ");
|
||||
|
||||
DMEMIT("%llu ", (unsigned long long)calc_metadata_threshold(pt));
|
||||
|
||||
break;
|
||||
|
||||
case STATUSTYPE_TABLE:
|
||||
@ -3979,7 +3966,7 @@ static struct target_type pool_target = {
|
||||
.name = "thin-pool",
|
||||
.features = DM_TARGET_SINGLETON | DM_TARGET_ALWAYS_WRITEABLE |
|
||||
DM_TARGET_IMMUTABLE,
|
||||
.version = {1, 19, 0},
|
||||
.version = {1, 20, 0},
|
||||
.module = THIS_MODULE,
|
||||
.ctr = pool_ctr,
|
||||
.dtr = pool_dtr,
|
||||
@ -4353,7 +4340,7 @@ static void thin_io_hints(struct dm_target *ti, struct queue_limits *limits)
|
||||
|
||||
static struct target_type thin_target = {
|
||||
.name = "thin",
|
||||
.version = {1, 19, 0},
|
||||
.version = {1, 20, 0},
|
||||
.module = THIS_MODULE,
|
||||
.ctr = thin_ctr,
|
||||
.dtr = thin_dtr,
|
||||
|
@ -457,7 +457,7 @@ static void ssd_commit_flushed(struct dm_writecache *wc)
|
||||
COMPLETION_INITIALIZER_ONSTACK(endio.c),
|
||||
ATOMIC_INIT(1),
|
||||
};
|
||||
unsigned bitmap_bits = wc->dirty_bitmap_size * BITS_PER_LONG;
|
||||
unsigned bitmap_bits = wc->dirty_bitmap_size * 8;
|
||||
unsigned i = 0;
|
||||
|
||||
while (1) {
|
||||
@ -2240,6 +2240,8 @@ static void writecache_status(struct dm_target *ti, status_type_t type,
|
||||
DMEMIT("%c %s %s %u ", WC_MODE_PMEM(wc) ? 'p' : 's',
|
||||
wc->dev->name, wc->ssd_dev->name, wc->block_size);
|
||||
extra_args = 0;
|
||||
if (wc->start_sector)
|
||||
extra_args += 2;
|
||||
if (wc->high_wm_percent_set)
|
||||
extra_args += 2;
|
||||
if (wc->low_wm_percent_set)
|
||||
@ -2254,6 +2256,8 @@ static void writecache_status(struct dm_target *ti, status_type_t type,
|
||||
extra_args++;
|
||||
|
||||
DMEMIT("%u", extra_args);
|
||||
if (wc->start_sector)
|
||||
DMEMIT(" start_sector %llu", (unsigned long long)wc->start_sector);
|
||||
if (wc->high_wm_percent_set) {
|
||||
x = (uint64_t)wc->freelist_high_watermark * 100;
|
||||
x += wc->n_blocks / 2;
|
||||
@ -2280,7 +2284,7 @@ static void writecache_status(struct dm_target *ti, status_type_t type,
|
||||
|
||||
static struct target_type writecache_target = {
|
||||
.name = "writecache",
|
||||
.version = {1, 1, 0},
|
||||
.version = {1, 1, 1},
|
||||
.module = THIS_MODULE,
|
||||
.ctr = writecache_ctr,
|
||||
.dtr = writecache_dtr,
|
||||
|
@ -161,10 +161,8 @@ static int dmz_reclaim_copy(struct dmz_reclaim *zrc,
|
||||
|
||||
/* Copy the valid region */
|
||||
set_bit(DMZ_RECLAIM_KCOPY, &zrc->flags);
|
||||
ret = dm_kcopyd_copy(zrc->kc, &src, 1, &dst, flags,
|
||||
dmz_reclaim_kcopy_end, zrc);
|
||||
if (ret)
|
||||
return ret;
|
||||
dm_kcopyd_copy(zrc->kc, &src, 1, &dst, flags,
|
||||
dmz_reclaim_kcopy_end, zrc);
|
||||
|
||||
/* Wait for copy to complete */
|
||||
wait_on_bit_io(&zrc->flags, DMZ_RECLAIM_KCOPY,
|
||||
|
@ -62,9 +62,9 @@ void dm_kcopyd_client_destroy(struct dm_kcopyd_client *kc);
|
||||
typedef void (*dm_kcopyd_notify_fn)(int read_err, unsigned long write_err,
|
||||
void *context);
|
||||
|
||||
int dm_kcopyd_copy(struct dm_kcopyd_client *kc, struct dm_io_region *from,
|
||||
unsigned num_dests, struct dm_io_region *dests,
|
||||
unsigned flags, dm_kcopyd_notify_fn fn, void *context);
|
||||
void dm_kcopyd_copy(struct dm_kcopyd_client *kc, struct dm_io_region *from,
|
||||
unsigned num_dests, struct dm_io_region *dests,
|
||||
unsigned flags, dm_kcopyd_notify_fn fn, void *context);
|
||||
|
||||
/*
|
||||
* Prepare a callback and submit it via the kcopyd thread.
|
||||
@ -81,9 +81,9 @@ void *dm_kcopyd_prepare_callback(struct dm_kcopyd_client *kc,
|
||||
dm_kcopyd_notify_fn fn, void *context);
|
||||
void dm_kcopyd_do_callback(void *job, int read_err, unsigned long write_err);
|
||||
|
||||
int dm_kcopyd_zero(struct dm_kcopyd_client *kc,
|
||||
unsigned num_dests, struct dm_io_region *dests,
|
||||
unsigned flags, dm_kcopyd_notify_fn fn, void *context);
|
||||
void dm_kcopyd_zero(struct dm_kcopyd_client *kc,
|
||||
unsigned num_dests, struct dm_io_region *dests,
|
||||
unsigned flags, dm_kcopyd_notify_fn fn, void *context);
|
||||
|
||||
#endif /* __KERNEL__ */
|
||||
#endif /* _LINUX_DM_KCOPYD_H */
|
||||
|
Loading…
x
Reference in New Issue
Block a user