11 Commits

Albert Esteve
fc6e58d9e6 reuse: addheader test/*.py
Add an SPDX header to Python files with
the 'py' extension in the test directory.
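
A minimal example of the kind of header added; the exact copyright
holder and license identifier below are assumptions, not taken from
this commit:

    # SPDX-FileCopyrightText: Red Hat, Inc.
    # SPDX-License-Identifier: GPL-2.0-or-later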

Signed-off-by: Albert Esteve <aesteve@redhat.com>
2022-10-18 13:04:20 +02:00
Nir Soffer
3591ab302e auth: Use the unused event only during cancellation
We track usage of the ticket using the self._ongoing set, and the
self._unused event.

When we added the first operation to an empty ongoing set, we cleared
the unused event. When the last operation was removed from the ongoing
set, we set the unused event. We also logged a debug message when
changing the flag.

Clearing the unused event is pointless, since the only case when we care
about it is during cancellation, which can happen exactly once in a
ticket's lifetime. While the ticket is not canceled, we use the size of
the ongoing set to tell if the ticket is active or not.

Now we never clear the event, and we set it during cancellation when the
last operation is removed.
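
A minimal sketch of the new behavior; the attribute names follow this
message (self._ongoing, self._unused), while self._canceled and the
surrounding structure are illustrative:

    def _add_operation(self, op):
        # No need to clear the unused event; while the ticket is not
        # canceled we only look at the size of the ongoing set.
        self._ongoing.add(op)

    def _remove_operation(self, op):
        self._ongoing.remove(op)
        # Set the unused event only during cancellation, when the last
        # ongoing operation is removed.
        if self._canceled and not self._ongoing:
            self._unused.set()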

Because of the way the auth benchmarks are written, we actually cleared
and set the event on every request, so this change has a dramatic effect
on the benchmarks, speeding up the run benchmarks by a factor of 30.

Since the benchmarks are too fast now, adjust the io size and image size
so the benchmarks run for a longer time, giving more stable results.

The benchmarks with this change:

test_run_benchmark[ro-nbdcopy-1] 1 workers, 102400 ops, 0.337 s, 303474.35 ops/s
test_run_benchmark[ro-nbdcopy-2] 2 workers, 102400 ops, 0.360 s, 284764.91 ops/s
test_run_benchmark[ro-nbdcopy-4] 4 workers, 102400 ops, 0.377 s, 271749.58 ops/s
test_run_benchmark[ro-nbdcopy-8] 8 workers, 102400 ops, 0.374 s, 273571.81 ops/s
test_run_benchmark[ro-imageio-1] 1 workers, 102400 ops, 0.328 s, 312260.46 ops/s
test_run_benchmark[ro-imageio-2] 2 workers, 102400 ops, 1.216 s, 84186.99 ops/s
test_run_benchmark[ro-imageio-4] 4 workers, 102400 ops, 1.205 s, 84961.28 ops/s
test_run_benchmark[ro-imageio-8] 8 workers, 102400 ops, 1.330 s, 76999.42 ops/s
test_run_benchmark[rw-nbdcopy-1] 1 workers, 102400 ops, 0.229 s, 447124.14 ops/s
test_run_benchmark[rw-nbdcopy-2] 2 workers, 102400 ops, 0.232 s, 441208.16 ops/s
test_run_benchmark[rw-nbdcopy-4] 4 workers, 102400 ops, 0.230 s, 444717.75 ops/s
test_run_benchmark[rw-nbdcopy-8] 8 workers, 102400 ops, 0.229 s, 447196.19 ops/s
test_run_benchmark[rw-imageio-1] 1 workers, 102400 ops, 0.212 s, 483681.45 ops/s
test_run_benchmark[rw-imageio-2] 2 workers, 102400 ops, 0.834 s, 122720.70 ops/s
test_run_benchmark[rw-imageio-4] 4 workers, 102400 ops, 1.023 s, 100070.60 ops/s
test_run_benchmark[rw-imageio-8] 8 workers, 102400 ops, 1.064 s, 96203.37 ops/s
test_transferred_benchmark[1] 1 workers, 10000 ops, 0.027 s, 372717.89 ops/s
test_transferred_benchmark[2] 2 workers, 10000 ops, 0.038 s, 262446.05 ops/s
test_transferred_benchmark[4] 4 workers, 10000 ops, 0.062 s, 160467.17 ops/s
test_transferred_benchmark[8] 8 workers, 10000 ops, 0.112 s, 89388.93 ops/s

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
2022-02-14 16:14:26 +02:00
Nir Soffer
cc62116cdd auth: Don't track ranges for read-write ticket
When using a read-write ticket, we don't report the number of
transferred bytes, so there is no point in tracking the ranges.
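
A minimal sketch of the idea; the attribute names (self._writable,
self._completed_ranges) and the callback are illustrative, not the
actual code:

    def _on_operation_done(self, op):
        if self._writable:
            # Read-write ticket: we never report transferred bytes,
            # so there is no point in recording the range.
            return
        # Read-only ticket: record the completed range so transferred
        # bytes can be computed later.
        self._completed_ranges.append((op.offset, op.offset + op.length))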

This is so far the biggest optimization, lowering the overhead per
request by 20%. However, since the engine never uses read-write mode,
this does not improve real flows.

The most important benefit is proving that additional optimization of
range measurement is not effective, since 80% of the time is spent
elsewhere.

test_run_benchmark[ro-nbdcopy-1] 1 workers, 25600 ops, 1.884 s, 13588.32 ops/s
test_run_benchmark[ro-nbdcopy-2] 2 workers, 25600 ops, 2.229 s, 11483.24 ops/s
test_run_benchmark[ro-nbdcopy-4] 4 workers, 25600 ops, 2.305 s, 11104.09 ops/s
test_run_benchmark[ro-nbdcopy-8] 8 workers, 25600 ops, 2.410 s, 10623.58 ops/s
test_run_benchmark[ro-imageio-1] 1 workers, 25600 ops, 1.896 s, 13505.05 ops/s
test_run_benchmark[ro-imageio-2] 2 workers, 25600 ops, 2.270 s, 11275.83 ops/s
test_run_benchmark[ro-imageio-4] 4 workers, 25600 ops, 2.269 s, 11280.55 ops/s
test_run_benchmark[ro-imageio-8] 8 workers, 25600 ops, 2.361 s, 10841.85 ops/s
test_run_benchmark[rw-nbdcopy-1] 1 workers, 25600 ops, 1.786 s, 14332.42 ops/s
test_run_benchmark[rw-nbdcopy-2] 2 workers, 25600 ops, 2.188 s, 11700.01 ops/s
test_run_benchmark[rw-nbdcopy-4] 4 workers, 25600 ops, 2.098 s, 12204.83 ops/s
test_run_benchmark[rw-nbdcopy-8] 8 workers, 25600 ops, 1.932 s, 13247.69 ops/s
test_run_benchmark[rw-imageio-1] 1 workers, 25600 ops, 1.719 s, 14895.74 ops/s
test_run_benchmark[rw-imageio-2] 2 workers, 25600 ops, 2.353 s, 10881.34 ops/s
test_run_benchmark[rw-imageio-4] 4 workers, 25600 ops, 2.192 s, 11679.60 ops/s
test_run_benchmark[rw-imageio-8] 8 workers, 25600 ops, 2.192 s, 11677.66 ops/s

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
2022-02-09 10:20:40 +01:00
Nir Soffer
a57ea2348c tests: Unify auth benchmarks terms
Update the transferred benchmark to use workers, ops, and ops/s like the
run operation benchmark. This test shows that computing transferred
bytes is pretty fast, since we cannot have more than 8 workers, and in
the worst case for a sane client we will have 16 ranges (8 completed
ranges and 8 ongoing ranges).

Theoretically, a client can send many non-contiguous requests, which
would be much more expensive to compute; this is not tested yet.
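
As a rough sketch of what this benchmark measures, computing transferred
bytes amounts to merging a handful of ranges; the helper below is
illustrative, not the actual implementation:

    def transferred(ranges):
        # ranges: list of (start, end) pairs for completed and ongoing
        # operations, at most ~16 for a sane client with 8 workers.
        total = 0
        last_end = -1
        for start, end in sorted(ranges):
            if start > last_end:
                # Disjoint range, count it fully.
                total += end - start
            elif end > last_end:
                # Overlapping range, count only the extension.
                total += end - last_end
            last_end = max(last_end, end)
        return total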

Example run with new format:

test_transferred_benchmark[1] 1 workers, 10000 ops, 0.032 s, 314831.22 ops/s
test_transferred_benchmark[2] 2 workers, 10000 ops, 0.046 s, 216797.16 ops/s
test_transferred_benchmark[4] 4 workers, 10000 ops, 0.083 s, 120809.15 ops/s
test_transferred_benchmark[8] 8 workers, 10000 ops, 0.159 s, 63036.90 ops/s

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
2022-02-08 09:12:06 +01:00
Nir Soffer
4738ce99c3 test: Improve run operation benchmark
The test was written in a way that is hard to relate to a real image
transfer, and the operations did not simulate the way real operations
are submitted.

Change the test to simulate the transfer of a 50 GiB image with typical
clients:

- nbdcopy - split the image into 128 MiB segments, and run 4 workers,
  each sending one segment at a time.

- imageio - simulate the imageio client, using a pool of workers to send
  requests for the same area.

The imageio client typically uses larger requests, so it has smaller
overhead. I'm using the same request size for both imageio and nbdcopy
so we can compare how the different threading models affect running
operations.

Both clients are tested with 1, 2, 4, and 8 threads. Both nbdcopy and
imageio use 4 workers by default.
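
A minimal sketch of the two simulated clients, assuming a
run_op(offset, length) callback that runs one operation against the
ticket; the function names and structure are illustrative, not the
test's actual code:

    from concurrent.futures import ThreadPoolExecutor

    GiB = 1024**3
    MiB = 1024**2
    IMAGE_SIZE = 50 * GiB      # 50 GiB image
    REQUEST_SIZE = 2 * MiB     # request size used for both clients
    SEGMENT_SIZE = 128 * MiB   # nbdcopy segment size

    def nbdcopy_style(run_op, workers=4):
        # Split the image into 128 MiB segments; each worker sends one
        # segment at a time, one request after another.
        segments = [(off, min(off + SEGMENT_SIZE, IMAGE_SIZE))
                    for off in range(0, IMAGE_SIZE, SEGMENT_SIZE)]

        def send_segment(seg):
            start, end = seg
            for off in range(start, end, REQUEST_SIZE):
                run_op(off, min(REQUEST_SIZE, end - off))

        with ThreadPoolExecutor(max_workers=workers) as pool:
            list(pool.map(send_segment, segments))

    def imageio_style(run_op, workers=4):
        # A pool of workers sending requests for the same image,
        # pulling offsets from a shared iterator.
        offsets = range(0, IMAGE_SIZE, REQUEST_SIZE)
        with ThreadPoolExecutor(max_workers=workers) as pool:
            list(pool.map(lambda off: run_op(off, REQUEST_SIZE), offsets))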

Instead of nanoseconds per op, show operations per second, which is a
more relevant metric for this application.

In the scale lab we never got more than 1 GiB/s, so transferring a
50 GiB image will take 50 seconds. Using 4 workers and a request size of
2 MiB, we have an overhead of 2.38 seconds (4.75%).

test_run_benchmark[nbdcopy-1] 1 workers, 25600 ops, 1.927 s, 13284.22 ops/s
test_run_benchmark[nbdcopy-2] 2 workers, 25600 ops, 2.243 s, 11413.15 ops/s
test_run_benchmark[nbdcopy-4] 4 workers, 25600 ops, 2.301 s, 11123.47 ops/s
test_run_benchmark[nbdcopy-8] 8 workers, 25600 ops, 2.487 s, 10291.53 ops/s
test_run_benchmark[imageio-1] 1 workers, 25600 ops, 1.872 s, 13671.72 ops/s
test_run_benchmark[imageio-2] 2 workers, 25600 ops, 2.220 s, 11533.88 ops/s
test_run_benchmark[imageio-4] 4 workers, 25600 ops, 2.389 s, 10714.05 ops/s
test_run_benchmark[imageio-8] 8 workers, 25600 ops, 2.318 s, 11044.73 ops/s

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
2022-02-08 09:12:06 +01:00
Nir Soffer
03e2f602ad logging: Replace ticket id with transfer id
Use the public transfer id instead of the sensitive ticket id used for
authentication. This makes it easier to follow a transfer by grepping
for the transfer id.

The ticket id is still logged when adding a ticket; we may want to
remove it from the log, or log only part of it.

Fixes #26

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
2022-01-24 13:36:42 +01:00
Nir Soffer
7f3b14bd3e auth: Use default transfer id
Engine before 4.2.7 did not pass a transfer id in the ticket. Use a
likely-unique string based on the first half of the ticket uuid. This
will allow logging the transfer id instead of the ticket id when working
with an older engine.
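
A minimal sketch of the fallback, assuming the ticket is a dict with
"transfer_id" and "uuid" keys; the exact format of the generated id is
illustrative:

    def transfer_id(ticket):
        # Engine 4.2.7 and later include a transfer id in the ticket.
        if ticket.get("transfer_id"):
            return ticket["transfer_id"]
        # Older engine: derive a likely-unique string from the first
        # half of the ticket uuid (36 characters -> first 18).
        return "(ticket/{})".format(ticket["uuid"][:18])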

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
2022-01-24 13:36:42 +01:00
Nir Soffer
d5e9c757e0 http: Configurable inactivity timeout
We used a 60-second timeout for disconnecting inactive clients. This is
too long to protect from bad clients leaving open connections, and too
short for applications that need a long timeout [1].

Change the default timeout to 15 seconds, configurable via
daemon:auth_timeout. This timeout is used for new unauthorized
connections. If a connection does not authorize within this timeout, it
is disconnected.

When a connection is authorized during the first request, the connection
timeout is increased to ticket.inactivity_timeout. The default value is
60 seconds, configurable via daemon:inactivity_timeout.
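
A minimal sketch of the timeout handling, assuming a config object
exposing daemon.auth_timeout and daemon.inactivity_timeout and a
per-connection socket; the function and attribute names are
illustrative:

    def on_connect(sock, config):
        # New, unauthorized connections get the short auth timeout
        # (default 15 seconds, daemon:auth_timeout).
        sock.settimeout(config.daemon.auth_timeout)

    def on_authorized(sock, ticket, config):
        # After the first authorized request, switch to the ticket's
        # inactivity timeout (default 60 seconds,
        # daemon:inactivity_timeout).
        sock.settimeout(
            ticket.inactivity_timeout or config.daemon.inactivity_timeout)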

Applications with special needs can request a larger timeout when
creating an image transfer. Engine needs to include the transfer
inactivity timeout in the ticket.

[1] https://bugzilla.redhat.com/2032324

Fixes #14.

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
2021-12-20 21:13:12 +02:00
Nir Soffer
a94ca88833 auth: Close connections context in cancel()
Once a ticket is marked as cancelled, no new operation can start, and
new connections cannot register with the ticket.

If there are no ongoing operations, we need to close the connections'
backend and buffer before we remove the ticket.

If we have ongoing operations, we need to wait until they finish. The
operations will discover that the ticket was canceled and close their
context and the underlying connection.

However, if there are idle connections, we need to close their contexts
when we finish the wait.

When we close a connection's context, we don't remove the context from
the ticket, since that would cause the connection to create a new
context on the next request. The context will be removed when the
connection tries to start a new operation.
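
A minimal sketch of the cleanup after the wait; the names
(self._connections, ctx) are illustrative:

    def _close_idle_contexts(self):
        # Close the context of idle connections, but keep the contexts
        # registered in the ticket so a later request does not create a
        # new context; the context is removed only when the connection
        # tries to start a new operation.
        for ctx in self._connections.values():
            ctx.close()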

If cancelling the ticket timed out, the user needs to poll the ticket's
"active" property. When the ticket becomes inactive, the user must
delete the ticket again.

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
2021-12-20 21:13:12 +02:00
Nir Soffer
e4d08ceb39 auth: Cancel ticket faster
Previously we marked the ticket as canceled immediately, but if the
ticket had registered connections, we waited until the connections were
closed or the timeout expired.

This mechanism was OK when connections timed out after 60 seconds of
inactivity, but we want to allow a much longer inactivity timeout, which
means that canceling a ticket can be too slow.

I found out that recent virt-v2v tries to finalize a transfer without
closing the connection to imageio. This results in a timeout when
finalizing the transfer, since virt-v2v sets a very large inactivity
timeout (3600 seconds).

We also know about 2 other buggy clients (ansible upload, QE image
transfer tests) that tried to finalize a transfer without closing the
connection.

Waiting until connections time out was not helpful to anyone, since once
a ticket is marked as cancelled, any request will fail immediately with
an authorization error.

Change the cancel flow to wait for ongoing operations instead of
connections. This means that a ticket can be canceled immediately if
there are no ongoing operations (e.g. virt-v2v), or we wait a very short
time until all ongoing operations finish.
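
A minimal sketch of the new cancel flow, assuming an ongoing-operations
set and an event that is set when the last ongoing operation finishes;
the names and the timeout parameter are illustrative:

    def cancel(self, timeout=60):
        self._canceled = True
        if not self._ongoing:
            # No ongoing operations (e.g. virt-v2v): cancel immediately.
            return True
        # Otherwise wait a short time until the ongoing operations
        # finish; the last one to finish sets the unused event.
        return self._unused.wait(timeout)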

If there are idle connections when the ticket is canceled, they will
remain until the connection times out or the user sends a request. This
does not affect the server in any way except consuming resources. We can
check later if there is a good way to abort these connections
immediately.

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
2021-12-20 21:13:12 +02:00
Nir Soffer
3fc44d98fc pypi: Eliminate the daemon directory
This makes it easier to work with the project and to improve packaging.
For example, README.md is now at the expected location, so it is
packaged automatically for PyPI.

Change-Id: Ib1a456054de34146bf2a4f39a69ccf1756b99e41
Signed-off-by: Nir Soffer <nsoffer@redhat.com>
2021-10-21 19:39:45 +03:00