1
0
mirror of https://gitlab.com/libvirt/libvirt.git synced 2025-03-20 06:50:22 +03:00

nwfilter: Fix deadlock between nwfilter-list and VM startup/migration

The `nwfilterBindingCreateXML` and `nwfilterConnectListAllNWFilters`
APIs can acquire locks on multiple instances of virNWFilterObj. There
is no guarantee they will acquire these locks in the same order as
each other. Thus there is a potential for deadlock if they run
concurrently acquiring locks on the same filter objects.

This flaw has always existed, but historically was rare, because
virNWFilterObjList previously used an array. This meant iteration
over filters had a fixed order, matching order of loading filters
into libvirt.  The set of filter references would have to be just
right to expose the lock ordering deadlock.

In 8.2.0, commit c4fb52dc72b312431a3a28e3a163b38441a95665 switched
to use a hash table, introducing non-determinism to the iteration
order, as hash buckets vary based on the hash seed. As such almost
any filter with references is exposed to the deadlock risk now.

It is not easy  to guarantee lock ordering on the virNWFilterObj
instances, so acquiring `driverMutex` first, will serve to serialize
all lock acquisition on virNWFilterObj instances, avoiding the
deadlock scenario.

The major cost is that concurrency of the driver is significantly
reduced, with few other APIs able to run in parallel with updating
firewall rules.

A long term solution to this problem needs significant changes

 * The mutex on virNWFilterObj would need to change to a R/W
   lock.
 * The filter instantiation/teardown process would need to split
   into two phases. The first phase would resolve all the required
   virNWFilterObj instances & acquire read locks, while holding
   the 'driverMutex'. The second phase of running iptables/ebtables
   commands would then run without driverMutex held.
 * The filter define/undefine APIs would need to acquire write
   locks, other APIs only read locks.

This would allow concurrency of filter instantiation/teardown
with everything except for filter defnie/undefine, which was
the original desire.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
[DPB: rewrite commit message & add inline comment]
Signed-off-by: Dion Bosschieter <dionbosschieter@gmail.com>
This commit is contained in:
Dion Bosschieter 2025-02-18 15:56:32 +01:00 committed by Daniel P. Berrangé
parent 8acc0b76c6
commit 92de6563c6

View File

@ -754,11 +754,21 @@ nwfilterBindingCreateXML(virConnectPtr conn,
if (!(ret = virGetNWFilterBinding(conn, def->portdevname, def->filter)))
goto cleanup;
VIR_WITH_MUTEX_LOCK_GUARD(&driver->updateLock) {
if (virNWFilterInstantiateFilter(driver, def) < 0) {
virNWFilterBindingObjListRemove(driver->bindings, obj);
g_clear_pointer(&ret, virObjectUnref);
goto cleanup;
/*
* XXX: holding 'driverMutex' here is a very bad thing for
* concurrency but we have no choice. The instantiate process
* will acquire locks on multiple filters in arbitrary order.
* Unless we serialize on 'driverMutex', this risks deadlock
* with other APIs (eg list all filters) which also acquire
* locks on mutliple filters in arbitrarily different order.
*/
VIR_WITH_MUTEX_LOCK_GUARD(&driverMutex) {
VIR_WITH_MUTEX_LOCK_GUARD(&driver->updateLock) {
if (virNWFilterInstantiateFilter(driver, def) < 0) {
virNWFilterBindingObjListRemove(driver->bindings, obj);
g_clear_pointer(&ret, virObjectUnref);
goto cleanup;
}
}
}
@ -783,6 +793,9 @@ nwfilterBindingCreateXML(virConnectPtr conn,
static int
nwfilterBindingDelete(virNWFilterBindingPtr binding)
{
/* XXX: holding driverMutex is very bad for concurrency
* see nwfilterBindingCreate comment for more info */
VIR_LOCK_GUARD lock = virLockGuardLock(&driverMutex);
virNWFilterBindingObj *obj;
virNWFilterBindingDef *def;
int ret = -1;