IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Doesn't really matter for tests, but I just came across it.
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Add commentary to link commit 86c7688 (MR !3447) to the upstream
fix for ICU-22610 in case there is subsequent breakage.
Signed-off-by: Earl Chew <earl_chew@yahoo.com>
Reviewed-by: Andreas Schneider <asn@samba.org>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
Autobuild-User(master): Andrew Bartlett <abartlet@samba.org>
Autobuild-Date(master): Fri Nov 8 00:20:38 UTC 2024 on atb-devel-224
Suppose that ‘slen’ is equal to (size_t)-1. A few lines up, we had:
if (lastp != 0) goto slow_path;
Therefore, ‘lastp’ must evaluate to false.
Now suppose that ‘slen’ is not equal to (size_t)-1. In that case, we
would have executed:
if (slen != 0) goto slow_path;
Therefore, ‘slen’ must evaluate to false.
Consequently, this code can be seen to be unreachable.
Signed-off-by: Joseph Sutton <josephsutton@catalyst.net.nz>
Reviewed-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
This is a common case, and we can save a bit of work.
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
If strncasecmp_ldb() encounters invalid utf-8 bytes, it compares those
as greater than any valid bytes (that is, it sorts them to the end of
the list).
If an invalid sequence is encountered in both strings at once, the
rest of the strings are now compared using the default ldb_comparison_fold
rules, as implemented in ldb_comparison_fold_ascii(). That is, each
byte is compared individually, [a-z] are translated to [A-Z], and runs of
spaces are collapsed into single spaces.
There is no perfect answer in this case, but this solution is stable,
fine-grained, and probably close to what is expected. This
byte-by-byte comparison is equivalent to a utf-8 comparison without
case-folding of multibyte codes.
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
This is a function for comparing strings in a way that suits a
case-insenstive syntaxes in LDB.
We have it here, rahter than in LDB itself, because it needs the
upcase table. By default uses ASCII-only comparisons. SSSD and
OpenChange use it in that configuration, but Samba replaces the
comparison and casefold functions with Unicode aware versions.
Until now Samba has done that in a bad way; this will allow it to do
better.
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
The reworked ICU libraries configuration code used [] as
default for conf.env['icu-libs']. This breaks dependency analysis
in samba_deps.py because SAMBA_SUBSYSTEM() expects deps to be
a string.
Signed-off-by: Earl Chew <earl_chew@yahoo.com>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
Reviewed-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Volker Lendecke <vl@samba.org>
Autobuild-User(master): Andreas Schneider <asn@cryptomilk.org>
Autobuild-Date(master): Tue May 14 14:44:06 UTC 2024 on atb-devel-224
Rather than probing for icu-i18n, icu-uc, and icudata libraries
separately, only probe for icu-i18n, and icu-uc, as direct dependencies
This avoids overlinking with icudata, and allows the package
to build even when ICU is not installed as a system library.
RN: Only use icu-i18n and icu-uc to express ICU dependency
BUG: https://bugzilla.samba.org/show_bug.cgi?id=15623
Signed-off-by: Earl Chew <earl_chew@yahoo.com>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
Reviewed-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Extend library_flags() to return the libraries provided by
pkg-config --libs.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=15623
Signed-off-by: Earl Chew <earl_chew@yahoo.com>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
Reviewed-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
the less/greater conparisons were not case-sensitive, which made the whole
function non-transitive.
I think codepoint_cmpi() is currently only used for equality tests, so
nothing will change.
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
is codepoint_cmpi as case-insensitive as it claims when it comes to
inequalities? (no, it is not!).
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
If these are truly unicode codepoints (< ~2m) there is no overflow,
but the type is defined as uint32_t.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=15625
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
We now test cases:
1. where the first string compares less
2. one of the strings ends before the other
3. the strings differ on a character other than the first.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=15625
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
strncasecmp_m is supposed to return a negative, zero, or positive
number, not necessarily the difference between the codepoints in
the first character that differs, which we have been asserting up to
now.
This fixes a knownfail on 32 bit.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=15625
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
strcasecmp_m is supposed to return a negative, zero, or positive
number, depending on whether the first argument is less than, equal to,
or greater than the second argument (respectively).
We have been asserting that it returns exactly the difference between
the codepoints in the first character that differs.
This fixes a knownfail on 32 bit.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=15625
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
A u16string is supposed to contain UTF‐16 code units, but
ndr_pull_u16string() and ndr_push_u16string() fail to correctly ensure
this on big‐endian systems. Code that relies on the u16string array
containing correct values will then fail.
Fix ndr_pull_u16string() and ndr_push_u16string() to work on big‐endian
systems, ensuring that other code can use these strings without having
to worry about first encoding them to little‐endian.
Signed-off-by: Joseph Sutton <josephsutton@catalyst.net.nz>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
This is in line with ‘talloc_str[n]dup()’.
Signed-off-by: Joseph Sutton <josephsutton@catalyst.net.nz>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
This function returns the length in bytes — at most ‘n’ — of a UTF‐16
string excluding the null terminator.
Signed-off-by: Joseph Sutton <josephsutton@catalyst.net.nz>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
Change ‘<’ to ‘<=’ so that we check the final UTF‐16 code unit in our
search for the null terminator. This makes no difference to the result:
if we’ve reached the final code unit without finding a terminator, the
final code unit will be included in the length whether it is a null
terminator or not.
Why make this change? We’re about to factor out this loop into a new
function, utf16_len_n(), where including the final code unit *will*
matter.
Signed-off-by: Joseph Sutton <josephsutton@catalyst.net.nz>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
This function returns the length in bytes of a UTF‐16 string excluding
the null terminator.
Signed-off-by: Joseph Sutton <josephsutton@catalyst.net.nz>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
The new name indicates that — contrary to functions such as strnlen() —
the length may include the terminator.
Signed-off-by: Joseph Sutton <josephsutton@catalyst.net.nz>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
The new name indicates that — contrary to functions such as strnlen() —
the length may include the terminator.
Signed-off-by: Joseph Sutton <josephsutton@catalyst.net.nz>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
Consider the non-conforment utf-8 sequence "\xf5\x80\x80\x80", which
would encode 0x140000. We would set the high byte of the first
surrogate to 0xd8 | (0x130000 >> 18), or 0xdc, which is an invalid
start for a high surrogate, making the sequence as a whole invalid (as
you would expect -- the Unicode range was set precisely to that
covered by utf-16 surrogates).
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
At present, if we meet a string like "hello \xed\xa7\x96 world", the
bytes in the middle will be converted into half of a surrogate pair,
and the UTF-16 will be invalid. It is better to error out immediately,
because the UTF-8 string is already invalid.
https://learn.microsoft.com/en-us/windows/win32/api/Stringapiset/nf-stringapiset-widechartomultibyte#remarks
is a citation for the statement about this being a pre-Vista
problem.
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
because it wasn't entirely obvious (a zero length string returns a
length 1 result).
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
Previous commit to the "embarrassing" line was ce10a7a673 "Fix
typo in comment", which did not completely fix the typo in the
comment.
But there are no gotos anymore, so no embarrassment, however spelt.
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
CH_DISPLAY was removed in commit
125a2ff262, but NUM_CHARSETS was not
updated to match.
By assigning to NUM_CHARSETS the last enumeration value in charset_t, we
guard against its falling out of sync again.
Signed-off-by: Joseph Sutton <josephsutton@catalyst.net.nz>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
This is an alternative patch for MR2339: It seems that Windows AD in
turkish locale is ASCII-compatible with 'i'. Björn tells me that the
turkish locale is the only one where upper/lower casing letters in the
ASCII range is not compatible to ASCII.
Simplify our code by not calling the locale-specific standard
toupper/tolower for the ASCII range but rely on our tables.
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Alexander Bokovoy <ab@samba.org>
Reviewed-by: Andreas Schneider <asn@samba.org>
Autobuild-User(master): Volker Lendecke <vl@samba.org>
Autobuild-Date(master): Mon Apr 4 11:45:24 UTC 2022 on sn-devel-184
The DEBUG macro was missing and the CFStringGetBytes() was triggering a
-Werror,-Wpointer-sign build failure.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14862
Signed-off-by: Alex Richardson <Alexander.Richardson@cl.cam.ac.uk>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
Reviewed-by: Jeremy Allison <jra@samba.org>
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Jeremy Allison <jra@samba.org>
Autobuild-User(master): Jeremy Allison <jra@samba.org>
Autobuild-Date(master): Fri Jun 18 04:27:17 UTC 2021 on sn-devel-184
If we allow a string that encodes say '\0' as a multi-byte sequence,
we are open to confusion where we mix NUL terminated strings with
sized data blobs, which is to say EVERYWHERE.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14684
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Jeremy Allison <jra@samba.org>