2303 Commits

Author SHA1 Message Date
w. ian douglas
b59762f734
Very minor misspelling in some tests (#705)
Fix misspelling "faiover" instead of "failover" in two test cases.

Signed-off-by: w. ian douglas <ian.douglas@iandouglas.com>
2024-06-28 23:56:30 +02:00
Binbin
2979fe6060
CLUSTER SLOT-STATS ORDERBY when stats are the same, compare by slot in ascending order (#710)
Test failed in my local:
```
*** [err]: CLUSTER SLOT-STATS ORDERBY LIMIT correct response pagination, where limit is less than number of assigned slots in tests/unit/cluster/slot-stats.tcl
Expected [dict exists 0 0 1 0 2 0 3 0 4 0 16383] (context: type source line 64 file /xxx/tests/unit/cluster/slot-stats.tcl cmd {assert {[dict exists $expected_slots $slot]}} proc ::assert_slot_visibility level 1)
```

It seems that when the stat is equal, that is, when the key-count is
equal,
the qsort performance will be different. When the stat is equal, we
compare
by slot (in ascending order).

Signed-off-by: Binbin <binloveplay1314@qq.com>
2024-06-28 08:03:03 -07:00
Binbin
518f0bf79b
Fix limit undefined behavior crash in CLUSTER SLOT-STATS (#709)
We did not set a default value for limit, but it will be used
in addReplyOrderBy later, the undefined behavior may crash the
server since the value could be negative and crash will happen
in addReplyArrayLen.

An interesting reproducible example (limit reuses the value of -1):
```
> cluster slot-stats orderby key-count desc limit -1
(error) ERR Limit has to lie in between 1 and 16384 (maximum number of slots).
> cluster slot-stats orderby key-count desc
Error: Server closed the connection
```

Set the default value of limit to 16384.

---------

Signed-off-by: Binbin <binloveplay1314@qq.com>
2024-06-28 08:02:52 -07:00
Kyle Kim (kimkyle@)
1269532fbd
Introduce CLUSTER SLOT-STATS command (#20). (#351)
The command provides detailed slot usage statistics upon invocation,
with initial support for key-count metric. cpu-usec (approved) and
memory-bytes (pending-approval) metrics will soon follow after the
merger of this PR.

---------

Signed-off-by: Kyle Kim <kimkyle@amazon.com>
Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
2024-06-27 16:58:27 -07:00
zhaozhao.zz
28c5a17edf
replica redirect read&write to primary in standalone mode (#325)
To implement #319 

1. replica is able to redirect read and write commands to it's primary
in standalone mode
    * reply with "-REDIRECT primary-ip:port"
2. add a subcommand `CLIENT CAPA redirect`, a client can announce the
capability to handle redirection
    * if a client can handle redirection, the data access commands (read and
write) will be redirected
3. allow `readonly` and `readwrite` command in standalone mode, may be a
breaking change
    * a client with redirect capability cannot process read commands on a
replica by default
    * use READONLY command can allow read commands on a replica

---------

Signed-off-by: zhaozhao.zz <zhaozhao.zz@alibaba-inc.com>
2024-06-27 19:00:45 +08:00
Kyle Kim (kimkyle@)
b49eaad367
Introduce a minimal debugger for .tcl integration test suite. (#683)
Introduce a break-point function called `bp`, based on the tcl wiki's
minimal debugger.

```tcl
 proc bp {{s {}}} {
    if ![info exists ::bp_skip] {
        set ::bp_skip [list]
    } elseif {[lsearch -exact $::bp_skip $s]>=0} return
    if [catch {info level -1} who] {set who ::}
    while 1 {
        puts -nonewline "$who/$s> "; flush stdout
        gets stdin line
        if {$line=="c"} {puts "continuing.."; break}
        if {$line=="i"} {set line "info locals"}
        catch {uplevel 1 $line} res
        puts $res
    }
 }
```

```
... your test code before break-point
bp 1
... your test code after break-point
```

The `bp 1` will give back the tcl interpreter to the developer, and
allow you to interactively print local variables (through `puts`), run
functions and so forth.

Source: https://wiki.tcl-lang.org/page/A+minimal+debugger

---------

Signed-off-by: Kyle Kim <kimkyle@amazon.com>
Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
2024-06-25 10:24:53 -07:00
Madelyn Olson
ce79539047
Fail tests immediately if the server is no longer running (#678)
Fix a minor inconvenience I have when writing tests. If I have a typo or
forget to generate the tls certificates, the start_server handle will
just loop for 2 minutes before printing the error. This just fails and
prints as soon as it sees the error.

Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
2024-06-21 15:29:05 +08:00
Binbin
bf1fb1fd36
Fix copy-paste error in scripts eviction test (#671)
The test needs to test "return 2" but mistakenly uses "return 1".
Also remove a extra debug print.

Signed-off-by: Binbin <binloveplay1314@qq.com>
2024-06-20 10:28:47 +08:00
kukey
ae2d4217e1
Add new SCRIPT SHOW subcommand to dump script via sha1 (#617)
In some scenarios, the business may not be able to find the
previously used Lua script and only have a SHA signature.
Or there are multiple identical evalsha's args in monitor/slowlog,
and admin is not able to distinguish the script body.

Add a new script subcommmand to show the contents of script
given the scripts sha1. Returns a NOSCRIPT error if the script
is not present in the cache.

Usage: `SCRIPT SHOW sha1`
Complexity: `O(1)`

Closes #604.
Doc PR: https://github.com/valkey-io/valkey-doc/pull/143

---------

Signed-off-by: wei.kukey <wei.kukey@gmail.com>
Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
2024-06-18 17:48:58 -07:00
Ping Xie
4135894a5d
Update remaining master references to primary (#660)
Signed-off-by: Ping Xie <pingxie@google.com>
2024-06-17 20:31:15 -07:00
Binbin
db6d3c1138
Only primary with slots has the right to mark a node as failed (#634)
In markNodeAsFailingIfNeeded we will count needed_quorum and failures,
needed_quorum is the half the cluster->size and plus one, and
cluster-size
is the size of primary node which contain slots, but when counting
failures, we dit not check if primary has slots.

Only the primary has slots that has the rights to vote, adding a new
clusterNodeIsVotingPrimary to formalize this concept.

Release notes:

bugfix where nodes not in the quorum group might spuriously mark nodes
as failed

---------

Signed-off-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Ping Xie <pingxie@outlook.com>
2024-06-16 20:46:08 -07:00
Sankar
a81c32079c
Make cluster meet reliable under link failures (#461)
When there is a link failure while an ongoing MEET request is sent the
sending node stops sending anymore MEET and starts sending PINGs. Since
every node responds to PINGs from unknown nodes with a PONG, the
receiving node never adds the sending node. But the sending node adds
the receiving node when it sees a PONG. This can lead to asymmetry in
cluster membership. This changes makes the sender keep sending MEET
until it sees a PONG, avoiding the asymmetry.

---------

Signed-off-by: Sankar <1890648+srgsanky@users.noreply.github.com>
2024-06-16 20:37:09 -07:00
Binbin
d5496e42bc
Lua scripts promoted from eval to script load to avoid evict (#637)
In ad28d222edcef9d4496fd7a94656013f07dd08e5, we added a Lua eval
scripts eviction. If the script was previously added via EVAL, we
promote it to SCRIPT LOAD, prevent it from being evicted later.

Signed-off-by: Binbin <binloveplay1314@qq.com>
2024-06-14 08:32:19 -07:00
Ping Xie
5d9d41868d
Replace DEBUG RESTART with pause_server and resume_server (#652) 2024-06-13 17:52:50 -07:00
uriyage
d211078a27
Fix query buffer resized test flakiness (#646)
Added a wait_for_condition to avoid the timing issue.
```
*** [err]: query buffer resized correctly in tests/unit/querybuf.tcl
Expected 11 >= 16384 && 11 <= 32770 (context: type eval line 24 cmd {assert {$orig_test_client_qbuf >= 16384 && $orig_test_client_qbuf <= $MAX_QUERY_BUFFER_SIZE}} proc ::test)
*** [err]: query buffer resized correctly when not idle in tests/unit/querybuf.tcl
Expected 11 > 32768 (context: type eval line 14 cmd {assert {$orig_test_client_qbuf > 32768}} proc ::test)
*** [err]: query buffer resized correctly with fat argv in tests/unit/querybuf.tcl
query buffer should not be resized when client idle time smaller than 2s
```

Signed-off-by: Uri Yagelnik <uriy@amazon.com>
2024-06-13 18:07:07 +08:00
Madelyn Olson
627d387ad8
Improve reliability of querybuf test (#639)
We've been seeing some pretty consistent failures from
`test-valgrind-test` and `test-sanitizer-address` because of the
querybuf test periodically failing. I tracked it down to the test
periodically taking too long and the client cron getting triggered. A
simple solution is to just disable the cron during the key race
condition. I was able to run this locally for 100 iterations without
seeing a failure.

Example:
https://github.com/valkey-io/valkey/actions/runs/9474458354/job/26104103514
and
https://github.com/valkey-io/valkey/actions/runs/9474458354/job/26104106830.

Signed-off-by: Madelyn Olson <matolson@amazon.com>
2024-06-12 14:27:42 -07:00
Ping Xie
aad6769a80
Replicate slot migration states via RDB aux fields (#586) 2024-06-07 20:32:27 -07:00
Ping Xie
54c9747935
Remove master and slave from source code (#591)
External facing interfaces are not affected.

---------

Signed-off-by: Ping Xie <pingxie@google.com>
2024-06-07 14:21:33 -07:00
Madelyn Olson
bce240eab7
Replace masteruser and masterauth with primaryuser and primaryauth (#598)
Make the one backwards compatible config change we are allowed to
replace for removing master from our API.

`masterauth` and `masteruser` are still used as an alias, but aren't
explicitly referenced. As an addendum to
https://github.com/valkey-io/valkey/pull/591, it would be good to have
this in 8. Given the related PR for updated other references for master,
I just updated the ones around this specific change.

Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
2024-06-07 00:46:52 -07:00
Viktor Söderqvist
ad5fd5b95c
More rebranding (#606)
More rebranding of

* Log messages (#252)
* The DENIED error reply
* Internal function names and comments, mainly Lua API

---------

Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
2024-06-07 01:40:55 +02:00
Viktor Söderqvist
278ce0cae0
Rebrand the Lua debugger (#603)
Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
2024-06-06 19:53:17 +02:00
Madelyn Olson
b95e7c384f
Skip tls for xgroup read regression since it doesn't matter (#595)
"Client blocked on XREADGROUP while stream's slot is migrated" uses the
migrate command, which requires special handling for TLS and non-tls.
This was not being handled, so was throwing an error.

Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
2024-06-03 11:49:15 -07:00
uriyage
b72e43ed16
Adjust query buffer resized correctly test to non-jemalloc allocators. (#593)
Test `query buffer resized correctly` start to fail
(https://github.com/valkey-io/valkey/actions/runs/9278013807) with
non-jemalloc allocators after
https://github.com/valkey-io/valkey/pull/258 PR.

With Jemalloc we allocate ~20K for the query buffer, in the test we read
1 byte in the first read, in the second read we make sure we have at
least 16KB free place in the query buffer and we have as Jemalloc
allocated 20KB, But with non jemalloc we allocate in the first read
exactly 16KB. in the second read we check and see that we don't have
16KB free space as we already read 1 byte hence we reallocate this time
greedly (*2 of the requested size of 16KB+1) hence the test condition
that the querybuf size is < 32KB is no longer true

The `query buffer resized correctly test` starts
[failing](https://github.com/valkey-io/valkey/actions/runs/9278013807)
with non-jemalloc allocators after PR #258 .

With jemalloc, we allocate ~20KB for the query buffer. In the test, we
read 1 byte initially and then ensure there is at least 16KB of free
space in the buffer for the second read, which is satisfied by
jemalloc's 20KB allocation. However, with non-jemalloc allocators, the
first read allocates exactly 16KB. When we check again, we don't have
16KB free due to the 1 byte already read. This triggers a greedy
reallocation (doubling the requested size of 16KB+1), causing the query
buffer size to exceed the 32KB limit, thus failing the test condition.

This PR adjusted the test query buffer upper limit to be 32KB +2.

Signed-off-by: Uri Yagelnik <uriy@amazon.com>
2024-06-03 11:15:28 -07:00
naglera
28e055af0b
Deflake chained replicas disconnect (#574)
Deflake chained replicas disconnect when replica re-connect with the
same master. sync_partial_ok counter might get incremented if replica
timed out during test.

Signed-off-by: naglera <anagler123@gmail.com>
2024-06-02 20:53:39 -07:00
Ping Xie
30f277a86d
Enable debug asserts for cluster and sentinel tests (#588)
Also make `enable-debug-assert` an immutable config

Address review comments in #584

---------

Signed-off-by: Ping Xie <pingxie@google.com>
2024-06-02 13:15:08 -07:00
Chen Tianjie
d16b4ec1b9
Unshare object to avoid LRU/LFU being messed up (#250)
When LRU/LFU enabled, Valkey does not allow using shared objects, as
value objects may be shared among many different keys and they can't
share LRU/LFU information.

However `maxmemory-policy` is modifiable at runtime. If LRU/LFU is not
enabled at start, but then enabled when some shared objects are already
used, there could be some confusion in LRU/LFU information.

For `set` command it is OK since it is going to create a new object when
LRU/LFU enabled, but `get` command will not unshare the object and just
update LRU/LFU information.

So we may duplicate the object in this case. It is a one-time task for
each key using shared objects, unless this is the case for so many keys,
there should be no serious performance degradation.

Still, LRU will be updated anyway, no matter LRU/LFU is enabled or not,
because `OBJECT IDLETIME` needs it, unless `maxmemory-policy` is set to
LFU. So idle time of a key may still be messed up.

---------

Signed-off-by: chentianjie.ctj <chentianjie.ctj@alibaba-inc.com>
Signed-off-by: Chen Tianjie <TJ_Chen@outlook.com>
2024-06-01 10:09:20 +02:00
Ping Xie
2b97aa6171
Introduce enable-debug-assert to enable/disable debug asserts at runtime (#584)
Introduce a new hidden server configuration, `enable-debug-assert`, which
allows selectively enabling or disabling, at runtime, expensive or risky
assertions used primarily for debugging and testing.

Fix #569

---------

Signed-off-by: Ping Xie <pingxie@google.com>
2024-05-31 22:50:08 -07:00
nitaicaro
6fb90adf4b
Fix crash where command duration is not reset when client is blocked … (#526)
In #11012, we changed the way command durations were computed to handle
the same command being executed multiple times. In #11970, we added an
assert if the duration is not properly reset, potentially indicating
that a call to report statistics was missed.

I found an edge case where this happens - easily reproduced by blocking
a client on `XGROUPREAD` and migrating the stream's slot. This causes
the engine to process the `XGROUPREAD` command twice:

1. First time, we are blocked on the stream, so we wait for unblock to
come back to it a second time. In most cases, when we come back to
process the command second time after unblock, we process the command
normally, which includes recording the duration and then resetting it.
2. After unblocking we come back to process the command, and this is
where we hit the edge case - at this point, we had already migrated the
slot to another node, so we return a `MOVED` response. But when we do
that, we don’t reset the duration field.

Fix: also reset the duration when returning a `MOVED` response. I think
this is right, because the client should redirect the command to the
right node, which in turn will calculate the execution duration.

Also wrote a test which reproduces this, it fails without the fix and
passes with it.

---------

Signed-off-by: Nitai Caro <caronita@amazon.com>
Co-authored-by: Nitai Caro <caronita@amazon.com>
2024-05-30 12:55:00 -07:00
Binbin
6bab2d7968
Make sure clear the CLUSTER SLOTS cache on time when updating hostname (#564)
In #53, we will cache the CLUSTER SLOTS response to improve the
throughput and reduct the latency.

In the code snippet below, the second cluster slots will use the
old hostname:
```
config set cluster-preferred-endpoint-type hostname
config set cluster-announce-hostname old-hostname.com
multi
cluster slots
config set cluster-announce-hostname new-hostname.com
cluster slots
exec
```

When updating the hostname, in updateAnnouncedHostname, we will set
CLUSTER_TODO_SAVE_CONFIG and we will do a clearCachedClusterSlotsResponse
in clusterSaveConfigOrDie, so harmless in most cases.

Move the clearCachedClusterSlotsResponse call to clusterDoBeforeSleep
instead of scheduling it to be called in clusterSaveConfigOrDie.

Signed-off-by: Binbin <binloveplay1314@qq.com>
2024-05-30 10:44:12 +08:00
LiiNen
96dcd1183a
Change BITCOUNT 'end' as optional like BITPOS (#118)
_This change is the thing I suggested to redis when it was BSD, and is
not just migration - this is of course more advanced_

### Issue
There is weird difference in syntax between BITPOS and BITCOUNT:
```
BITPOS key bit [start [end [BYTE | BIT]]]
BITCOUNT key [start end [BYTE | BIT]]
```

I think this might cause confusion in terms of usability.
It was not just a syntax typo error, and really works differently.
The results below are with unstable build:
```
> get TEST:ABCD
"ABCD"
> BITPOS TEST:ABCD 1 0 -1
(integer) 1
> BITCOUNT TEST:ABCD 0 -1
(integer) 9
> BITPOS TEST:ABCD 1 0
(integer) 1
> BITCOUNT TEST:ABCD 0
(error) ERR syntax error
```

### What did I fix

simply changes logic, to accept BITCOUNT also without 'end' - 'end'
become optional, like BITPOS
```
> GET TEST:ABCD
"ABCD"
> BITPOS TEST:ABCD 1 0 -1
(integer) 1
> BITCOUNT TEST:ABCD 0 -1
(integer) 9
> BITPOS TEST:ABCD 1 0
(integer) 1
> BITCOUNT TEST:ABCD 0
(integer) 9
```

Of course, I also fixed syntax hint:
```
# ASIS 
> BITCOUNT key [start end [BYTE|BIT]]
# TOBE
> BITCOUNT key [start [end [BYTE|BIT]]]
```

![image](https://github.com/valkey-io/valkey/assets/38001238/8485f58e-6785-4106-9f3f-45e62f90d24b)


### Moreover ...
I hadn't noticed that there was very small dead code in these command
logic, when I wrote PR to redis.
I found it now, when write code again, so I wrote it in valkey.
``` c
/* asis unstable */

/* bitcountCommand() */
if (!strcasecmp(c->argv[4]->ptr,"bit")) isbit = 1;
// ...
if (c->argc < 4) {
    if (isbit) end = (totlen<<3) + 7;
    else end = totlen-1;
}

/* bitposCommand() */
if (!strcasecmp(c->argv[5]->ptr,"bit")) isbit = 1;
// ...
if (c->argc < 5) {
    if (isbit) end = (totlen<<3) + 7;
    else end = totlen-1;
}
```
Bit variable (actually int) "isbit" is only being set as 1, when 'BIT'
is declared.
But we were checking whether 'isbit' is true or false in this 'if'
phrase, even if isbit could never be 1, because argc is always less than
4 (or 5 in bitpos).



I think this minor fixes will make valkey command operation more
consistent.
Of course, this PR contains just changing args from "required" to
"optional", so it will never hurt previous users.

Thanks,

---------

Signed-off-by: LiiNen <kjeonghoon065@gmail.com>
Co-authored-by: Madelyn Olson <34459052+madolson@users.noreply.github.com>
2024-05-28 15:01:28 -04:00
uriyage
fd58b73f0a
Introduce shared query buffer for client reads (#258)
This PR optimizes client query buffer handling in Valkey by introducing
a shared query buffer that is used by default for client reads. This
reduces memory usage by ~20KB per client by avoiding allocations for
most clients using short (<16KB) complete commands. For larger or
partial commands, the client still gets its own private buffer.

The primary changes are:

* Adding a shared query buffer `shared_qb` that clients use by default
* Modifying client querybuf initialization and reset logic
* Copying any partial query from shared to private buffer before command
execution
* Freeing idle client query buffers when empty to allow reuse of shared
buffer
* Master client query buffers are kept private as their contents need to
be preserved for replication stream

In addition to the memory savings, this change shows a 3% improvement in
latency and throughput when running with 1000 active clients.

The memory reduction may also help reduce the need to evict clients when
reaching max memory limit, as the query buffer is the main memory
consumer per client.

---------

Signed-off-by: Uri Yagelnik <uriy@amazon.com>
Signed-off-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
2024-05-28 11:09:37 -07:00
Viktor Söderqvist
4e44f5aae9
Fix races in test for tot-net-in, tot-net-out, tot-cmds (#559)
The races are between the '$rd' client and the 'r' client in the test
case.

Test case "client input output and command process statistics" in
unit/introspection.

---------

Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
2024-05-28 17:13:16 +02:00
Ping Xie
e4ead9442b
Make CLUSTER SETSLOT with TIMEOUT 0 block indefinitely (#556)
This aligns the behaviour with established Valkey commands with a
TIMEOUT argument, such as BLPOP.

Fix #422

Signed-off-by: Ping Xie <pingxie@google.com>
2024-05-27 07:11:24 -07:00
Viktor Söderqvist
d72ba06dd0
Make cluster replicas return ASK and TRYAGAIN (#495)
After READONLY, make a cluster replica behave as its primary regarding
returning ASK redirects and TRYAGAIN.

Without this patch, a client reading from a replica cannot tell if a key
doesn't exist or if it has already been migrated to another shard as
part of an ongoing slot migration. Therefore, without an ASK redirect in
this situation, offloading reads to cluster replicas wasn't reliable.

Note: The target of a redirect is always a primary. If a client wants to
continue reading from a replica after following a redirect, it needs to
figure out the replicas of that new primary using CLUSTER SHARDS or
similar.

This is related to #21 and has been made possible by the introduction of
Replication of Slot Migration States in #445.

----

Release notes:

During cluster slot migration, replicas are able to return -ASK
redirects and -TRYAGAIN.

---------

Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
2024-05-24 17:58:03 +02:00
Roshan Khatri
c4782066e7
Cache CLUSTER SLOTS response for improving throughput and reduced latency. (#53)
This commit adds a logic to cache `CLUSTER SLOTS` response for reduced
latency and also updates the cache when a change in the cluster is
detected.

Historically, `CLUSTER SLOTS` command was deprecated, however all the
server clients have been using `CLUSTER SLOTS` and have not migrated to
`CLUSTER SHARDS`. In future this logic can be added to any other
commands to improve the performance of the engine.

---------

Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
2024-05-22 14:21:41 -07:00
Viktor Söderqvist
efa8ba519b
Finish postponed SCAN changes (#501)
Commit 07ed0eafa98a66 introduced some SCAN improvements, but some
changes were postponed to a later version (8.0), which this PR finishes:

1. Prepare to move the TYPE filtering to the scan callback as well. this
was put on hold since it has side effects that can be considered a
breaking change, which is that we will not attempt to do lazy expire
(delete) a key that was filtered by not matching the TYPE (changing it
would mean TYPE filter starts behaving the same as MATCH filter already
does in that respect).
2. when the specified key TYPE filter is an unknown type, server will
reply a error immediately instead of doing a full scan that comes back
empty handed.

Fixes #235

Release notes:

> SCAN: Expired keys that don't match the TYPE argument for the SCAN are
no longer deleted by SCAN

Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
2024-05-17 13:35:31 +02:00
Ping Xie
fd53f17a61
Use pause_process to stop a node to make Valgrind happy, hopefully (#508)
Signed-off-by: Ping Xie <pingxie@google.com>
2024-05-16 22:59:00 -07:00
Lipeng Zhu
7a9951fb80
Correct the actual allocated size from allocator when call sdsRedize to align the logic with sdsnewlen function. (#476)
This patch try to correct the actual allocated size from allocator
when call sdsRedize to align the logic with sdsnewlen function.

Maybe the https://github.com/valkey-io/valkey/pull/453 optimization
should depend on this.

Signed-off-by: Lipeng Zhu <lipeng.zhu@intel.com>
2024-05-15 18:22:50 -07:00
Arthur Lee
3de5c71f48
[Feat] Support fast fail option for tcl test cases (#482)
This PR added a new option for tcl test case which will fail fast once any test cases fail.
This can be useful while running redis CI pipeline, and you want to accelerate the CI pipeline.

usage for example

> ./runtest --single unit/type/hash --fast-fail

---------

Signed-off-by: arthur.lee <arthur-lee@qq.com>
2024-05-15 06:55:24 -07:00
Ping Xie
ac47ca2d47
Suppress ASAN errors on tests that intentially crash the server via crash-memcheck-enabled no (#489)
Fix daily CI run errors like
https://github.com/valkey-io/valkey/actions/runs/9039450198/job/24842308071#step:6:4176

Signed-off-by: Ping Xie <pingxie@google.com>
2024-05-12 16:08:47 -07:00
Binbin
fdd023ff82
Migrate cluster mode tests to normal framework (#442)
We currently has two disjoint TCL frameworks:
1. Normal testing framework, which trigger by runtest, which individually
launches nodes for testing.
2. Cluster framework, which trigger by runtest-cluster, which pre-allocates
N nodes and uses them for testing large configurations.

The normal TCL testing framework is much more readily tested and is also
automatically run as part of the CI for new PRs. The runtest-cluster since
it runs very slowly (cannot be parallelized), it currently only runs in daily
CI, this results in some changes to the cluster not being exposed in PR CI
in time.

This PR migrate the Cluster mode tests to normal framework. Some cluster
tests are kept in runtest-cluster because of timing issues or not yet
supported, we can process them later.

Signed-off-by: Binbin <binloveplay1314@qq.com>
2024-05-09 10:14:47 +08:00
Chen Tianjie
ba9dd7b23a
Add noscores option to command ZSCAN. (#324)
Command syntax is now:
```
ZSCAN key cursor [MATCH pattern] [COUNT count] [NOSCORES]
```
Return format:
```
127.0.0.1:6379> zadd z 1 a 2 b 3 c
(integer) 3
127.0.0.1:6379> zscan z 0
1) "0"
2) 1) "a"
   2) "1"
   3) "b"
   4) "2"
   5) "c"
   6) "3"
127.0.0.1:6379> zscan z 0 noscores
1) "0"
2) 1) "a"
   2) "b"
   3) "c"
```
when NOSCORES is on, the command will only return members in the zset,
without scores.

For client side parsing the command return, I believe it is fine as long
as the command is backwards compatible. The return structure are still
lists, what has changed is the content. And clients can tell the
difference by the subcommand they use.

Since `novalues` option of `HSCAN` is already accepted
(redis/redis#12765), I think similar thing can be done to `ZSCAN`.

---------

Signed-off-by: Chen Tianjie <TJ_Chen@outlook.com>
2024-05-07 14:39:28 +08:00
Ping Xie
6e7af9471c
Slot migration improvement (#445) 2024-05-06 21:40:28 -07:00
Chen Tianjie
cc703aa3bc
Input output traffic stats and command process count for each client. (#327)
We already have global stats for input traffic, output traffic and how
many commands have been executed.

However, some users have the difficulty of locating the IP(s) which have
heavy network traffic. So here some stats for single client are
introduced.
```              
tot-net-in   // Total network input bytes read from the client
tot-net-out  // Total network output bytes sent to the client
tot-cmds     // Total commands the client has executed     
```                             
These three stats are shown in `CLIENT LIST` and `CLIENT INFO`.

Though the metrics are handled in hot paths of the code, personally I
don't think it will slow down the server. Considering all other complex
operations handled nearby, this is only a small and simple operation.

However we do need to be cautious when adding more and more metrics, as
discussed in redis/redis#12640, we may need to find a way to tell
whether this has obvious performance degradation.

---------

Signed-off-by: Chen Tianjie <TJ_Chen@outlook.com>
2024-05-05 21:52:59 -07:00
Shivshankar
0cb97653b7
Change default pidfile from redis.pid to valkey.pid (#378)
Changes the default value for the `pidfile` config.

The template config file `valkey.conf` already contains `pidfile
/var/run/valkey_6379.pid`. This is not a default. The default is what
you get when you start valkey without config.

Tests suites config pidfile changed to valkey accordingly.

Signed-off-by: Shivshankar-Reddy <shiva.sheri.github@gmail.com>
2024-04-30 18:45:08 +02:00
Shivshankar
52f9291f79
Rename redis to valkey in test suite logs and test names. (#366)
This PR covers below cases.
1. test suite's prints(i.e., puts statement logs).
2. Links refering to redis issues.
3. test names contains redis.

Signed-off-by: Shivshankar-Reddy <shiva.sheri.github@gmail.com>
2024-04-25 15:13:21 +08:00
Wen Hui
191be272b4
Rename redis.tcl to valkey.tcl (#283)
Includes some more changes e.g. the README under tests and some script under utils.

Signed-off-by: hwware <wen.hui.ware@gmail.com>
2024-04-24 20:54:52 +02:00
Shivshankar
8baf322742
Rename remaining test procedures (#355)
Renamed below procedures and variables (missed in #287) as follows.

redis_cluster             ->     valkey_cluster
redis1                    ->     valkey1
redis2                    ->     valkey2
get_redis_dir             ->     get_valkey_dir
test_redis_cli_rdb_dump   ->     test_valkey_cli_rdb_dump
test_redis_cli_repl       ->     test_valkey_cli_repl
redis-cli                 ->     valkey-cli
redis_reset_state         ->     valkey_reset_state
redisHandle               ->     valkeyHandle
redis_safe_read           ->     valkey_safe_read
redis_safe_gets           ->     valkey_safe_gets
redis_write               ->     valkey_write
redis_read_reply          ->     valkey_read_reply
redis_readable            ->     valkey_readable
redis_readnl              ->     valkey_readnl
redis_writenl             ->     valkey_writenl
redis_read_map            ->     valkey_read_map
redis_read_line           ->     valkey_read_line
redis_read_null           ->     valkey_read_null
redis_read_bool           ->     valkey_read_bool
redis_read_double         ->     valkey_read_double
redis_read_verbatim_str   ->     valkey_read_verbatim_str
redis_call_callback       ->     valkey_call_callback

---------

Signed-off-by: Shivshankar-Reddy <shiva.sheri.github@gmail.com>
2024-04-24 18:01:33 +02:00
Shivshankar
669f1d3014
redisbenchmark to valkeybenchmark in test directory and some test name rename. (#347)
This pr covers following chnages.
1. redisbenchmark to valkeybenchmark in test directory 
2. Removed redis from some test's title and changed the name
accordingly.
3. Updated test suite name and redis-server to valkey readme in test
directory.

---------

Signed-off-by: Shivshankar-Reddy <shiva.sheri.github@gmail.com>
2024-04-23 10:51:53 -07:00
Dmitry Polyakovsky
f0e1edc273
Updated modules examples and tests to use valkey vs redis (#349)
Scope of the changes:
- updated example modules to reference Valkey vs Redis in variable names
- updated tests to use valkeymodule.h
- updated vars in tests/modules to use valkey vs redis in variable names

Summary of the testing:
- ran make for all modules, loaded them into valkey-server and tested
commands
- ran make for test/modules
- ran make test for the entire codebase

---------

Signed-off-by: Dmitry Polyakovsky <dmitry.polyakovky@oracle.com>
Co-authored-by: Dmitry Polyakovsky <dmitry.polyakovky@oracle.com>
2024-04-23 17:55:44 +02:00