/*
 * Copyright (C) 2001-2004 Sistina Software, Inc. All rights reserved.
 * Copyright (C) 2004-2009 Red Hat, Inc. All rights reserved.
 *
 * This file is part of LVM2.
 *
 * This copyrighted material is made available to anyone wishing to use,
 * modify, copy, or redistribute it subject to the terms and conditions
 * of the GNU Lesser General Public License v.2.1.
 *
 * You should have received a copy of the GNU Lesser General Public License
 * along with this program; if not, write to the Free Software Foundation,
 * Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
 */

#include "tools.h"

int vgcreate(struct cmd_context *cmd, int argc, char **argv)
{
	struct processing_handle *handle;
	struct pvcreate_params pp;
	struct vgcreate_params vp_new;
	struct vgcreate_params vp_def;
	struct volume_group *vg;
	const char *tag;
	char *vg_name;
	struct arg_value_group_list *current_group;

	if (!argc) {
		log_error("Please provide volume group name and "
			  "physical volumes");
		return EINVALID_CMD_LINE;
	}

	/* The first argument is the VG name; the remaining ones are PV names. */
	vg_name = argv[0];
	argc--;
	argv++;

	pvcreate_params_set_defaults(&pp);

	if (!pvcreate_params_from_args(cmd, &pp))
		return EINVALID_CMD_LINE;

	pp.pv_count = argc;
	pp.pv_names = argv;

	/* Don't create a new PV on top of an existing PV like pvcreate does. */
	pp.preserve_existing = 1;

	if (!vgcreate_params_set_defaults(cmd, &vp_def, NULL))
		return EINVALID_CMD_LINE;
	vp_def.vg_name = vg_name;
	if (!vgcreate_params_set_from_args(cmd, &vp_new, &vp_def))
		return EINVALID_CMD_LINE;

	if (!vgcreate_params_validate(cmd, &vp_new))
		return EINVALID_CMD_LINE;

	if (!lockf_global(cmd, "ex"))
		return_ECMD_FAILED;
	if (!lockd_global_create(cmd, "ex", vp_new.lock_type))
		return_ECMD_FAILED;

	clear_hint_file(cmd);

	/*
	 * Check if the VG name already exists.  This should be done before
	 * creating PVs on any of the devices.
	 *
	 * When searching if a VG name exists, acquire the VG lock,
	 * then do the initial label scan which reads all devices and
	 * populates lvmcache with any VG name it finds.  If the VG name
	 * we want to use exists, then the label scan will find it,
	 * and the fmt_from_vgname call (used to check if the name exists)
	 * will return non-NULL.
	 */

	if (!lock_vol(cmd, vp_new.vg_name, LCK_VG_WRITE, NULL)) {
		log_error("Can't get lock for %s.", vp_new.vg_name);
		return ECMD_FAILED;
	}

	lvmcache_label_scan(cmd);

	if (lvmcache_fmt_from_vgname(cmd, vp_new.vg_name, NULL, 0)) {
		unlock_vg(cmd, NULL, vp_new.vg_name);
		log_error("A volume group called %s already exists.", vp_new.vg_name);
		return ECMD_FAILED;
	}

	if (!(handle = init_processing_handle(cmd, NULL))) {
		log_error("Failed to initialize processing handle.");
		return ECMD_FAILED;
	}

	if (!pvcreate_each_device(cmd, handle, &pp)) {
		destroy_processing_handle(cmd, handle);
		return_ECMD_FAILED;
	}

	if (!(vg = vg_create(cmd, vp_new.vg_name)))
		goto_bad;

	if (vg->fid->fmt->features & FMT_CONFIG_PROFILE)
		vg->profile = vg->cmd->profile_params->global_metadata_profile;

	if (!vg_set_extent_size(vg, vp_new.extent_size) ||
	    !vg_set_max_lv(vg, vp_new.max_lv) ||
	    !vg_set_max_pv(vg, vp_new.max_pv) ||
	    !vg_set_alloc_policy(vg, vp_new.alloc) ||
	    !vg_set_system_id(vg, vp_new.system_id) ||
	    !vg_set_mda_copies(vg, vp_new.vgmetadatacopies))
		goto_bad;

	/* attach the pv's */
	if (!vg_extend_each_pv(vg, &pp))
		goto_bad;

	if (vp_new.max_lv != vg->max_lv)
		log_warn("WARNING: Setting maxlogicalvolumes to %d "
			 "(0 means unlimited)", vg->max_lv);

	if (vp_new.max_pv != vg->max_pv)
		log_warn("WARNING: Setting maxphysicalvolumes to %d "
			 "(0 means unlimited)", vg->max_pv);

	if (arg_is_set(cmd, addtag_ARG)) {
		dm_list_iterate_items(current_group, &cmd->arg_value_groups) {
			if (!grouped_arg_is_set(current_group->arg_values, addtag_ARG))
				continue;

			if (!(tag = grouped_arg_str_value(current_group->arg_values, addtag_ARG, NULL))) {
				log_error("Failed to get tag");
				goto bad;
			}

			if (!vg_change_tag(vg, tag, 1))
				goto_bad;
		}
	}

	if (!archive(vg))
		goto_bad;

	/* Store VG on disk(s) */
	if (!vg_write(vg) || !vg_commit(vg))
		goto_bad;

	/*
	 * The VG is initially written without lock_type set, i.e. it starts as
	 * a local VG.  lockd_init_vg() then writes the VG a second time with
	 * both lock_type and lock_args set.
	 */
	if (!lockd_init_vg(cmd, vg, vp_new.lock_type, 0)) {
		log_error("Failed to initialize lock args for lock type %s",
			  vp_new.lock_type);
		vg_remove_pvs(vg);
		vg_remove_direct(vg);
		goto_bad;
	}

	unlock_vg(cmd, vg, vp_new.vg_name);

	backup(vg);

	log_print_unless_silent("Volume group \"%s\" successfully created%s%s",
				vg->name,
				vg->system_id ? " with system ID " : "", vg->system_id ? : "");

	/*
	 * Start the VG lockspace because it will likely be used right away.
	 * Optionally wait for the start to complete so the VG can be fully
	 * used after this command completes (otherwise, the VG can only be
	 * read without locks until the lockspace is done starting.)
	 */
	if (vg_is_shared(vg)) {
		const char *start_opt = arg_str_value(cmd, lockopt_ARG, NULL);

		if (!lockd_start_vg(cmd, vg, 1, NULL)) {
			log_error("Failed to start locking");
			goto out;
		}

		lock_global(cmd, "un");

		if (!start_opt || !strcmp(start_opt, "wait")) {
			/* It is OK if the user does Ctrl-C to cancel the wait. */
			log_print_unless_silent("Starting locking. Waiting until locks are ready...");
			lockd_start_wait(cmd);
		} else if (!strcmp(start_opt, "nowait")) {
			log_print_unless_silent("Starting locking. VG is read-only until locks are ready.");
		}
	}
out:
	release_vg(vg);
	destroy_processing_handle(cmd, handle);
	return ECMD_PROCESSED;

bad:
	unlock_vg(cmd, vg, vp_new.vg_name);
	release_vg(vg);
	destroy_processing_handle(cmd, handle);
	return ECMD_FAILED;
}