samba-mirror

mirror of https://github.com/samba-team/samba.git synced 2025-01-07 17:18:11 +03:00

Author	SHA1	Message	Date
Volker Lendecke	edc1f99ffa	lib: Move some R/W "data" segment to R/O "text" Doesn't really matter for tests, but I just came across it. Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Martin Schwenke <martin@meltin.net>	2024-12-02 04:53:33 +00:00
Earl Chew	1655413f12	Describe implication of upstream ICU-22610 Add commentary to link commit 86c7688 (MR !3447) to the upstream fix for ICU-22610 in case there is subsequent breakage. Signed-off-by: Earl Chew <earl_chew@yahoo.com> Reviewed-by: Andreas Schneider <asn@samba.org> Reviewed-by: Andrew Bartlett <abartlet@samba.org> Autobuild-User(master): Andrew Bartlett <abartlet@samba.org> Autobuild-Date(master): Fri Nov 8 00:20:38 UTC 2024 on atb-devel-224	2024-11-08 00:20:38 +00:00
Douglas Bagnall	f914f53913	util:charset: s/the the\b/the/ in comments Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz> Reviewed-by: Volker Lendecke <vl@samba.org>	2024-11-06 10:57:35 +00:00
Joseph Sutton	228dd73cae	util:charset: Remove unreachable code (CID 1272948) Suppose that ‘slen’ is equal to (size_t)-1. A few lines up, we had: if (lastp != 0) goto slow_path; Therefore, ‘lastp’ must evaluate to false. Now suppose that ‘slen’ is not equal to (size_t)-1. In that case, we would have executed: if (slen != 0) goto slow_path; Therefore, ‘slen’ must evaluate to false. Consequently, this code can be seen to be unreachable. Signed-off-by: Joseph Sutton <josephsutton@catalyst.net.nz> Reviewed-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>	2024-08-28 04:24:39 +00:00
Volker Lendecke	a8405ed15b	lib: Remove unused strnrchr_w Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Andreas Schneider <asn@samba.org>	2024-07-04 15:26:36 +00:00
Douglas Bagnall	f9797950fd	util:charset: strncasecmp_ldb avoids iconv for ASCII This is a common case, and we can save a bit of work. Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2024-05-22 23:12:32 +00:00
Douglas Bagnall	55397514db	util:charset: strncasecmp_ldb degrades to ASCII strncasecmp If strncasecmp_ldb() encounters invalid utf-8 bytes, it compares those as greater than any valid bytes (that is, it sorts them to the end of the list). If an invalid sequence is encountered in both strings at once, the rest of the strings are now compared using the default ldb_comparison_fold rules, as implemented in ldb_comparison_fold_ascii(). That is, each byte is compared individually, [a-z] are translated to [A-Z], and runs of spaces are collapsed into single spaces. There is no perfect answer in this case, but this solution is stable, fine-grained, and probably close to what is expected. This byte-by-byte comparison is equivalent to a utf-8 comparison without case-folding of multibyte codes. Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2024-05-22 23:12:32 +00:00
Douglas Bagnall	eb91e3437b	util:charset: add strncasecmp_ldb() This is a function for comparing strings in a way that suits case-insensitive syntaxes in LDB. We have it here, rather than in LDB itself, because it needs the upcase table. By default LDB uses ASCII-only comparisons. SSSD and OpenChange use it in that configuration, but Samba replaces the comparison and casefold functions with Unicode aware versions. Until now Samba has done that in a bad way; this will allow it to do better. Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2024-05-22 23:12:32 +00:00
Douglas Bagnall	f9fbc7a506	lib/util/charset: be explicit about INVALID_CODEPOINT value Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2024-05-22 23:12:32 +00:00
Earl Chew	68a1200f66	Restore empty string default for conf.env['icu-libs'] The reworked ICU libraries configuration code used [] as default for conf.env['icu-libs']. This breaks dependency analysis in samba_deps.py because SAMBA_SUBSYSTEM() expects deps to be a string. Signed-off-by: Earl Chew <earl_chew@yahoo.com> Reviewed-by: Andrew Bartlett <abartlet@samba.org> Reviewed-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz> Reviewed-by: Volker Lendecke <vl@samba.org> Autobuild-User(master): Andreas Schneider <asn@cryptomilk.org> Autobuild-Date(master): Tue May 14 14:44:06 UTC 2024 on atb-devel-224	2024-05-14 14:44:06 +00:00
Earl Chew	05807488fd	Combine ICU libraries icu-i18n and icu-uc into a single dependency Rather than probing for icu-i18n, icu-uc, and icudata libraries separately, only probe for icu-i18n, and icu-uc, as direct dependencies This avoids overlinking with icudata, and allows the package to build even when ICU is not installed as a system library. RN: Only use icu-i18n and icu-uc to express ICU dependency BUG: https://bugzilla.samba.org/show_bug.cgi?id=15623 Signed-off-by: Earl Chew <earl_chew@yahoo.com> Reviewed-by: Andrew Bartlett <abartlet@samba.org> Reviewed-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>	2024-05-10 00:26:35 +00:00
Earl Chew	363c331857	Augment library_flags() to return libraries Extend library_flags() to return the libraries provided by pkg-config --libs. BUG: https://bugzilla.samba.org/show_bug.cgi?id=15623 Signed-off-by: Earl Chew <earl_chew@yahoo.com> Reviewed-by: Andrew Bartlett <abartlet@samba.org> Reviewed-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>	2024-05-10 00:26:35 +00:00
Douglas Bagnall	13af2cb021	lib:util: codepoint_cmpi: be transitive and case-insensitive The less/greater comparisons were case-sensitive, which made the whole function non-transitive. I think codepoint_cmpi() is currently only used for equality tests, so nothing will change. Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2024-05-07 23:25:35 +00:00
Douglas Bagnall	310d59c7cc	lib:util:tests: more tests for codepoint_cmpi is codepoint_cmpi as case-insensitive as it claims when it comes to inequalities? (no, it is not!). Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2024-05-07 23:25:35 +00:00
Douglas Bagnall	997b72d79e	util: charset:util_str: use NUMERIC_CMP in strncasecmp_m_handle BUG: https://bugzilla.samba.org/show_bug.cgi?id=15625 Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2024-04-10 22:56:33 +00:00
Douglas Bagnall	f07ae69907	util:charset:codepoints: codepoint_cmpi warning about non-transitivity BUG: https://bugzilla.samba.org/show_bug.cgi?id=15625 Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2024-04-10 22:56:33 +00:00
Douglas Bagnall	675fdeee3d	util:charset:codepoints: codepoint_cmpi uses NUMERIC_CMP() If these are truly unicode codepoints (< ~2m) there is no overflow, but the type is defined as uint32_t. BUG: https://bugzilla.samba.org/show_bug.cgi?id=15625 Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2024-04-10 22:56:33 +00:00
Douglas Bagnall	f788a39999	util:charset:util_str: use NUMERIC_CMP in strcasecmp_m_handle BUG: https://bugzilla.samba.org/show_bug.cgi?id=15625 Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2024-04-10 22:56:33 +00:00
Douglas Bagnall	a512759d7b	torture:charset: test more of strcasecmp_m We now test cases: 1. where the first string compares less 2. one of the strings ends before the other 3. the strings differ on a character other than the first. BUG: https://bugzilla.samba.org/show_bug.cgi?id=15625 Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2024-04-10 22:56:33 +00:00
Douglas Bagnall	dda0bb6fc7	torture:charset: use < and > assertions for strncasecmp_m strncasecmp_m is supposed to return a negative, zero, or positive number, not necessarily the difference between the codepoints in the first character that differs, which we have been asserting up to now. This fixes a knownfail on 32 bit. BUG: https://bugzilla.samba.org/show_bug.cgi?id=15625 Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2024-04-10 22:56:33 +00:00
Douglas Bagnall	ac0a8cd92c	torture:charset: use < and > assertions for strcasecmp_m strcasecmp_m is supposed to return a negative, zero, or positive number, depending on whether the first argument is less than, equal to, or greater than the second argument (respectively). We have been asserting that it returns exactly the difference between the codepoints in the first character that differs. This fixes a knownfail on 32 bit. BUG: https://bugzilla.samba.org/show_bug.cgi?id=15625 Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2024-04-10 22:56:33 +00:00
Joseph Sutton	346844b730	librpc: Change type of ‘u16string’ from ‘const uint16_t *’ to ‘const unsigned char *’ A u16string is supposed to contain UTF‐16 code units, but ndr_pull_u16string() and ndr_push_u16string() fail to correctly ensure this on big‐endian systems. Code that relies on the u16string array containing correct values will then fail. Fix ndr_pull_u16string() and ndr_push_u16string() to work on big‐endian systems, ensuring that other code can use these strings without having to worry about first encoding them to little‐endian. Signed-off-by: Joseph Sutton <josephsutton@catalyst.net.nz> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2023-12-21 23:48:46 +00:00
Joseph Sutton	1947bd6d6d	util/charset: Remove trailing whitespace Signed-off-by: Joseph Sutton <josephsutton@catalyst.net.nz> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2023-12-08 02:28:33 +00:00
Joseph Sutton	4629fc7c61	util/charset: Have talloc_utf16_str[n]dup() accept NULL pointers This is in line with ‘talloc_str[n]dup()’. Signed-off-by: Joseph Sutton <josephsutton@catalyst.net.nz> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2023-11-20 21:50:32 +00:00
Joseph Sutton	939ceb233e	util/charset: Add talloc_utf16_str[n]dup() Signed-off-by: Joseph Sutton <josephsutton@catalyst.net.nz> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2023-11-16 05:18:36 +00:00
Joseph Sutton	b6ff89f6fb	util/charset: Include missing headers Signed-off-by: Joseph Sutton <josephsutton@catalyst.net.nz> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2023-11-16 05:18:36 +00:00
Joseph Sutton	3f0809f1ee	util/charset: Remove unnecessary cast Signed-off-by: Joseph Sutton <josephsutton@catalyst.net.nz> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2023-11-16 05:18:36 +00:00
Joseph Sutton	ec3e420840	util/charset: Prefer PULL_LE_U16() to older SVAL() macro Signed-off-by: Joseph Sutton <josephsutton@catalyst.net.nz> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2023-11-15 22:07:36 +00:00
Joseph Sutton	99e0a0f21a	util/charset/tests: Add tests for UTF‐16 string length functions Signed-off-by: Joseph Sutton <josephsutton@catalyst.net.nz> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2023-11-15 22:07:36 +00:00
Joseph Sutton	a46746381b	util/charset: Add utf16_len_n() This function returns the length in bytes — at most ‘n’ — of a UTF‐16 string excluding the null terminator. Signed-off-by: Joseph Sutton <josephsutton@catalyst.net.nz> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2023-11-15 22:07:36 +00:00
Joseph Sutton	74a5a3b74e	util/charset: Include final UTF‐16 code unit in length calculation loop Change ‘<’ to ‘<=’ so that we check the final UTF‐16 code unit in our search for the null terminator. This makes no difference to the result: if we’ve reached the final code unit without finding a terminator, the final code unit will be included in the length whether it is a null terminator or not. Why make this change? We’re about to factor out this loop into a new function, utf16_len_n(), where including the final code unit will matter. Signed-off-by: Joseph Sutton <josephsutton@catalyst.net.nz> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2023-11-15 22:07:36 +00:00
Joseph Sutton	516f35b5a1	util/charset: Add utf16_len() This function returns the length in bytes of a UTF‐16 string excluding the null terminator. Signed-off-by: Joseph Sutton <josephsutton@catalyst.net.nz> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2023-11-15 22:07:36 +00:00
Joseph Sutton	16996d145b	util/charset: Rename utf16_len() to utf16_null_terminated_len() The new name indicates that — contrary to functions such as strnlen() — the length may include the terminator. Signed-off-by: Joseph Sutton <josephsutton@catalyst.net.nz> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2023-11-15 22:07:36 +00:00
Joseph Sutton	542e5a3039	util/charset: Rename utf16_len_n() to utf16_null_terminated_len_n() The new name indicates that — contrary to functions such as strnlen() — the length may include the terminator. Signed-off-by: Joseph Sutton <josephsutton@catalyst.net.nz> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2023-11-15 22:07:36 +00:00
Joseph Sutton	982238e914	util/charset: Remove trailing whitespace Signed-off-by: Joseph Sutton <josephsutton@catalyst.net.nz> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2023-11-15 22:07:36 +00:00
Douglas Bagnall	3960eabca7	libutil/iconv: avoid overflow in surrogate pairs Consider the non-conformant utf-8 sequence "\xf5\x80\x80\x80", which would encode 0x140000. We would set the high byte of the first surrogate to 0xd8 | (0x130000 >> 18), or 0xdc, which is an invalid start for a high surrogate, making the sequence as a whole invalid (as you would expect -- the Unicode range was set precisely to that covered by utf-16 surrogates). Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2023-10-26 01:24:32 +00:00
Douglas Bagnall	949fe57077	libutil/iconv: don't allow wtf-8 surrogate pairs At present, if we meet a string like "hello \xed\xa7\x96 world", the bytes in the middle will be converted into half of a surrogate pair, and the UTF-16 will be invalid. It is better to error out immediately, because the UTF-8 string is already invalid. https://learn.microsoft.com/en-us/windows/win32/api/Stringapiset/nf-stringapiset-widechartomultibyte#remarks is a citation for the statement about this being a pre-Vista problem. Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2023-10-26 01:24:32 +00:00
Douglas Bagnall	d7481f94e0	util/charset/torture: test convert_string_talloc with emptyish strings because it wasn't entirely obvious (a zero length string returns a length 1 result). Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2023-10-26 01:24:32 +00:00
Douglas Bagnall	b5a728e81e	util/convert string: remove inaccurate misspelt comment Previous commit to the "embarrassing" line was `ce10a7a673` "Fix typo in comment", which did not completely fix the typo in the comment. But there are no gotos anymore, so no embarrassment, however spelt. Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2023-10-26 01:24:32 +00:00
Douglas Bagnall	df8ab7edfa	util/charset: disambiguate docs for convert_string twins Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2023-10-26 01:24:32 +00:00
Douglas Bagnall	7cf4efe768	lib/util/charset: @param typos Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2023-10-26 01:24:32 +00:00
Douglas Bagnall	e4da279b1c	util/str: helper to check for utf-8 validity Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2023-09-26 23:45:36 +00:00
Joseph Sutton	dd2b568721	lib:charset: Fix code spelling Signed-off-by: Joseph Sutton <josephsutton@catalyst.net.nz> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2023-09-11 02:42:41 +00:00
Joseph Sutton	355fd3c7bf	lib:charset: Update NUM_CHARSETS to reflect true value CH_DISPLAY was removed in commit `125a2ff262`, but NUM_CHARSETS was not updated to match. By assigning to NUM_CHARSETS the last enumeration value in charset_t, we guard against its falling out of sync again. Signed-off-by: Joseph Sutton <josephsutton@catalyst.net.nz> Reviewed-by: Andrew Bartlett <abartlet@samba.org>	2023-08-08 04:39:37 +00:00
Andreas Schneider	cfa53c8a80	lib:util: Fix code spelling Signed-off-by: Andreas Schneider <asn@samba.org> Reviewed-by: Joseph Sutton <josephsutton@catalyst.net.nz>	2023-04-14 05:25:33 +00:00
Volker Lendecke	7fe12e79f9	lib: Fix a typo Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Jeremy Allison <jra@samba.org>	2022-08-26 18:54:37 +00:00
Volker Lendecke	4171736339	lib: Stay ASCII-compatible for toupper_m/tolower_m This is an alternative patch for MR2339: It seems that Windows AD in the Turkish locale is ASCII-compatible with 'i'. Björn tells me that the Turkish locale is the only one where upper/lower casing letters in the ASCII range is not compatible with ASCII. Simplify our code by not calling the locale-specific standard toupper/tolower for the ASCII range but rely on our tables. Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Alexander Bokovoy <ab@samba.org> Reviewed-by: Andreas Schneider <asn@samba.org> Autobuild-User(master): Volker Lendecke <vl@samba.org> Autobuild-Date(master): Mon Apr 4 11:45:24 UTC 2022 on sn-devel-184	2022-04-04 11:45:24 +00:00
Alex Richardson	2564e96e83	charset_macosxfs.c: fix compilation on macOS The DEBUG macro was missing and the CFStringGetBytes() was triggering a -Werror,-Wpointer-sign build failure. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14862 Signed-off-by: Alex Richardson <Alexander.Richardson@cl.cam.ac.uk> Reviewed-by: Andrew Bartlett <abartlet@samba.org> Reviewed-by: Jeremy Allison <jra@samba.org>	2021-10-13 01:42:35 +00:00
Douglas Bagnall	4711ad9e81	util/charset: warn loudly on unexpected E2BIG Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz> Reviewed-by: Jeremy Allison <jra@samba.org> Autobuild-User(master): Jeremy Allison <jra@samba.org> Autobuild-Date(master): Fri Jun 18 04:27:17 UTC 2021 on sn-devel-184	2021-06-18 04:27:16 +00:00
Douglas Bagnall	1ea1816629	util/iconv: reject improperly packed UTF-8 If we allow a string that encodes say '\0' as a multi-byte sequence, we are open to confusion where we mix NUL terminated strings with sized data blobs, which is to say EVERYWHERE. BUG: https://bugzilla.samba.org/show_bug.cgi?id=14684 Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz> Reviewed-by: Jeremy Allison <jra@samba.org>	2021-06-18 03:39:28 +00:00
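
Several of the commits listed above (1ea1816629, 949fe57077, 3960eabca7) tighten UTF-8 decoding: overlong encodings, encoded surrogate halves, and values past U+10FFFF are all rejected. The following is a minimal sketch of those three checks, not Samba's actual code. The names `sketch_decode_utf8` and `SKETCH_INVALID_CODEPOINT` are hypothetical and exist only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel for this sketch; not Samba's INVALID_CODEPOINT. */
#define SKETCH_INVALID_CODEPOINT UINT32_MAX

/*
 * Decode one codepoint from at most len bytes of s.
 * On success, store the number of bytes consumed in *used.
 * On failure, return SKETCH_INVALID_CODEPOINT and set *used to 0.
 */
uint32_t sketch_decode_utf8(const uint8_t *s, size_t len, size_t *used)
{
	/* Smallest codepoint that legitimately needs n bytes (index n). */
	static const uint32_t min_for_len[5] = {0, 0, 0x80, 0x800, 0x10000};
	uint32_t cp;
	size_t n, i;

	*used = 0;
	if (len == 0) {
		return SKETCH_INVALID_CODEPOINT;
	}
	if (s[0] < 0x80) {
		/* Plain ASCII, one byte. */
		*used = 1;
		return s[0];
	}
	if ((s[0] & 0xe0) == 0xc0) {
		n = 2;
		cp = s[0] & 0x1f;
	} else if ((s[0] & 0xf0) == 0xe0) {
		n = 3;
		cp = s[0] & 0x0f;
	} else if ((s[0] & 0xf8) == 0xf0) {
		n = 4;
		cp = s[0] & 0x07;
	} else {
		/* Stray continuation byte or invalid lead byte. */
		return SKETCH_INVALID_CODEPOINT;
	}
	if (len < n) {
		return SKETCH_INVALID_CODEPOINT;
	}
	for (i = 1; i < n; i++) {
		if ((s[i] & 0xc0) != 0x80) {
			return SKETCH_INVALID_CODEPOINT;
		}
		cp = (cp << 6) | (s[i] & 0x3f);
	}
	if (cp < min_for_len[n]) {
		/* Overlong ("improperly packed"), e.g. NUL as "\xc0\x80". */
		return SKETCH_INVALID_CODEPOINT;
	}
	if (cp >= 0xd800 && cp <= 0xdfff) {
		/* A surrogate half encoded directly, e.g. "\xed\xa7\x96". */
		return SKETCH_INVALID_CODEPOINT;
	}
	if (cp > 0x10ffff) {
		/* Beyond the range UTF-16 surrogate pairs can express. */
		return SKETCH_INVALID_CODEPOINT;
	}
	*used = n;
	return cp;
}
```

With this sketch, the byte sequences quoted in the commit messages ("\xf5\x80\x80\x80" and "\xed\xa7\x96") are both rejected, as is a two-byte encoding of NUL.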


266 Commits
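
The commits for bug 15625 replace subtraction-style comparisons with NUMERIC_CMP(), and the matching test changes assert only on the sign of the result. The sketch below illustrates that idea under stated assumptions: `SKETCH_NUMERIC_CMP` and `sketch_codepoint_cmp` are hypothetical names, and the macro is a stand-in written for this example rather than a copy of Samba's definition.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for a sign-only comparison macro.
 * It yields -1, 0, or 1, so no subtraction result is ever
 * squeezed into an int.
 */
#define SKETCH_NUMERIC_CMP(a, b) (((a) > (b)) - ((a) < (b)))

/*
 * Compare two uint32_t values. Returning (int)(a - b) instead would
 * be unreliable here: the unsigned difference can exceed INT_MAX,
 * and converting it to int may then produce the wrong sign.
 */
int sketch_codepoint_cmp(uint32_t a, uint32_t b)
{
	return SKETCH_NUMERIC_CMP(a, b);
}
```

Callers should test the result with `< 0`, `== 0`, or `> 0` rather than expecting a particular magnitude, which matches the assertions described in dda0bb6fc7 and ac0a8cd92c.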