This is a bit of a grabbag, but it's the best I could come up with without having lots of single-item sections.
23 KiB
title |
---|
Coding Style |
Coding Style
Formatting
-
8ch indent, no tabs, except for files in
man/
which are 2ch indent, and still no tabs, and shell scripts, which are 4ch indent, and no tabs either. -
We prefer
/* comments */
over// comments
in code you commit, please. This way// comments
are left for developers to use for local, temporary commenting of code for debug purposes (i.e. uncommittable stuff), making such comments easily discernible from explanatory, documenting code comments (i.e. committable stuff). -
Don't break code lines too eagerly. We do not force line breaks at 80ch, all of today's screens should be much larger than that. But then again, don't overdo it, ~109ch should be enough really. The
.editorconfig
,.vimrc
and.dir-locals.el
files contained in the repository will set this limit up for you automatically, if you let them (as well as a few other things). Please note that emacs loads.dir-locals.el
automatically, but vim needs to be configured to load.vimrc
, see that file for instructions. -
Try to write this:
void foo() { }
instead of this:
void foo() { }
-
Single-line
if
blocks should not be enclosed in{}
. Write this:if (foobar) waldo();
instead of this:
if (foobar) { waldo(); }
-
Do not write
foo ()
, writefoo()
.
Code Organization and Semantics
-
Please name structures in
PascalCase
(with exceptions, such as public API structs), variables and functions insnake_case
. -
Avoid static variables, except for caches and very few other cases. Think about thread-safety! While most of our code is never used in threaded environments, at least the library code should make sure it works correctly in them. Instead of doing a lot of locking for that, we tend to prefer using TLS to do per-thread caching (which only works for small, fixed-size cache objects), or we disable caching for any thread that is not the main thread. Use
is_main_thread()
to detect whether the calling thread is the main thread. -
Do not write functions that clobber call-by-reference variables on failure. Use temporary variables for these cases and change the passed in variables only on success.
-
The order in which header files are included doesn't matter too much. systemd-internal headers must not rely on an include order, so it is safe to include them in any order possible. However, to not clutter global includes, and to make sure internal definitions will not affect global headers, please always include the headers of external components first (these are all headers enclosed in <>), followed by our own exported headers (usually everything that's prefixed by
sd-
), and then followed by internal headers. Furthermore, in all three groups, order all includes alphabetically so duplicate includes can easily be detected. -
Please avoid using global variables as much as you can. And if you do use them make sure they are static at least, instead of exported. Especially in library-like code it is important to avoid global variables. Why are global variables bad? They usually hinder generic reusability of code (since they break in threaded programs, and usually would require locking there), and as the code using them has side-effects make programs non-transparent. That said, there are many cases where they explicitly make a lot of sense, and are OK to use. For example, the log level and target in
log.c
is stored in a global variable, and that's OK and probably expected by most. Also in many cases we cache data in global variables. If you add more caches like this, please be careful however, and think about threading. Only use static variables if you are sure that thread-safety doesn't matter in your case. Alternatively, consider using TLS, which is pretty easy to use with gcc'sthread_local
concept. It's also OK to store data that is inherently global in global variables, for example data parsed from command lines, see below. -
You might wonder what kind of common code belongs in
src/shared/
and what belongs insrc/basic/
. The split is like this: anything that is used to implement the public shared object we provide (sd-bus, sd-login, sd-id128, nss-systemd, nss-mymachines, nss-resolve, nss-myhostname, pam_systemd), must be located insrc/basic
(those objects are not allowed to link to libsystemd-shared.so). Conversely, anything which is shared between multiple components and does not need to be insrc/basic/
, should be insrc/shared/
.To summarize:
src/basic/
- may be used by all code in the tree
- may not use any code outside of
src/basic/
src/libsystemd/
- may be used by all code in the tree, except for code in
src/basic/
- may not use any code outside of
src/basic/
,src/libsystemd/
src/shared/
- may be used by all code in the tree, except for code in
src/basic/
,src/libsystemd/
,src/nss-*
,src/login/pam_systemd.*
, and files undersrc/journal/
that end up inlibjournal-client.a
convenience library. - may not use any code outside of
src/basic/
,src/libsystemd/
,src/shared/
-
Our focus is on the GNU libc (glibc), not any other libcs. If other libcs are incompatible with glibc it's on them. However, if there are equivalent POSIX and Linux/GNU-specific APIs, we generally prefer the POSIX APIs. If there aren't, we are happy to use GNU or Linux APIs, and expect non-GNU implementations of libc to catch up with glibc.
Using C Constructs
-
Preferably allocate local variables on the top of the block:
{ int a, b; a = 5; b = a; }
-
Do not mix function invocations with variable definitions in one line. Wrong:
{ int a = foobar(); uint64_t x = 7; }
Right:
{ int a; uint64_t x = 7; a = foobar(); }
-
Use
goto
for cleaning up, and only use it for that. i.e. you may only jump to the end of a function, and little else. Never jump backwards! -
To minimize strict aliasing violations, we prefer unions over casting.
-
Instead of using
memzero()
/memset()
to initialize structs allocated on the stack, please try to use c99 structure initializers. It's short, prettier and actually even faster at execution. Hence:struct foobar t = { .foo = 7, .bar = "bazz", };
instead of:
struct foobar t; zero(t); t.foo = 7; t.bar = "bazz";
-
To implement an endless loop, use
for (;;)
rather thanwhile (1)
. The latter is a bit ugly anyway, since you probably really meantwhile (true)
. To avoid the discussion what the right always-true expression for an infinite while loop is, our recommendation is to simply write it without any such expression by usingfor (;;)
. -
To determine the length of a constant string
"foo"
, don't bother withsizeof("foo")-1
, please usestrlen()
instead (both gcc and clang optimize the call away for fixed strings). The only exception is when declaring an array. In that case use STRLEN, which evaluates to a static constant and doesn't force the compiler to create a VLA.
Destructors
-
The destructors always deregister the object from the next bigger object, not the other way around.
-
For robustness reasons, destructors should be able to destruct half-initialized objects, too.
-
When you define a destructor or
unref()
call for an object, please accept aNULL
object and simply treat this as NOP. This is similar to how libcfree()
works, which acceptsNULL
pointers and becomes a NOP for them. By following this scheme a lot ofif
checks can be removed before invoking your destructor, which makes the code substantially more readable and robust. -
Related to this: when you define a destructor or
unref()
call for an object, please make it return the same type it takes and always returnNULL
from it. This allows writing code like this:p = foobar_unref(p);
which will always work regardless if
p
is initialized or not,x and guarantees thatp
isNULL
afterwards, all in just one line.
Error Handling
-
Error codes are returned as negative
Exxx
. e.g.return -EINVAL
. There are some exceptions: for constructors, it is OK to returnNULL
on OOM. For lookup functions,NULL
is fine too for "not found".Be strict with this. When you write a function that can fail due to more than one cause, it really should have an
int
as the return value for the error code. -
Do not bother with error checking whether writing to stdout/stderr worked.
-
Do not log errors from "library" code, only do so from "main program" code. (With one exception: it is OK to log with DEBUG level from any code, with the exception of maybe inner loops).
-
In public API calls, you must validate all your input arguments for programming error with
assert_return()
and return a sensible return code. In all other calls, it is recommended to check for programming errors with a more brutalassert()
. We are more forgiving to public users than for ourselves! Note thatassert()
andassert_return()
really only should be used for detecting programming errors, not for runtime errors.assert()
andassert_return()
by usage of_likely_()
inform the compiler that he should not expect these checks to fail, and they inform fellow programmers about the expected validity and range of parameters. -
When you invoke certain calls like
unlink()
, ormkdir_p()
and you know it is safe to ignore the error it might return (because a later call would detect the failure anyway, or because the error is in an error path and you thus couldn't do anything about it anyway), then make this clear by casting the invocation explicitly to(void)
. Code checks like Coverity understand that, and will not complain about ignored error codes. Hence, please use this:(void) unlink("/foo/bar/baz");
instead of just this:
unlink("/foo/bar/baz");
Don't cast function calls to
(void)
that return no error conditions. Specifically, the variousxyz_unref()
calls that return aNULL
object shouldn't be cast to(void)
, since not using the return value does not hide any errors. -
When returning a return code from
main()
, please preferably useEXIT_FAILURE
andEXIT_SUCCESS
as defined by libc.
Logging
-
For every function you add, think about whether it is a "logging" function or a "non-logging" function. "Logging" functions do logging on their own, "non-logging" function never log on their own and expect their callers to log. All functions in "library" code, i.e. in
src/shared/
and suchlike must be "non-logging". Every time a "logging" function calls a "non-logging" function, it should log about the resulting errors. If a "logging" function calls another "logging" function, then it should not generate log messages, so that log messages are not generated twice for the same errors. -
If possible, do a combined log & return operation:
r = operation(...); if (r < 0) return log_(error|warning|notice|...)_errno(r, "Failed to ...: %m");
If the error value is "synthetic", i.e. it was not received from the called function, use
SYNTHETIC_ERRNO
wrapper to tell the logging system to not log the errno value, but still return it:n = read(..., s, sizeof s); if (n != sizeof s) return log_error_errno(SYNTHETIC_ERRNO(EIO), "Failed to read ...");
Memory Allocation
-
Always check OOM. There is no excuse. In program code, you can use
log_oom()
for then printing a short message, but not in "library" code. -
Avoid fixed-size string buffers, unless you really know the maximum size and that maximum size is small. They are a source of errors, since they possibly result in truncated strings. It is often nicer to use dynamic memory,
alloca()
or VLAs. If you do allocate fixed-size strings on the stack, then it is probably only OK if you either use a maximum size such asLINE_MAX
, or count in detail the maximum size a string can have. (DECIMAL_STR_MAX
andDECIMAL_STR_WIDTH
macros are your friends for this!)Or in other words, if you use
char buf[256]
then you are likely doing something wrong! -
Make use of
_cleanup_free_
and friends. It makes your code much nicer to read (and shorter)! -
Use
alloca()
, but never forget that it is not OK to invokealloca()
within a loop or within function call parameters.alloca()
memory is released at the end of a function, and not at the end of a{}
block. Thus, if you invoke it in a loop, you keep increasing the stack pointer without ever releasing memory again. (VLAs have better behavior in this case, so consider using them as an alternative.) Regarding not usingalloca()
within function parameters, see the BUGS section of thealloca(3)
man page. -
If you want to concatenate two or more strings, consider using
strjoina()
orstrjoin()
rather thanasprintf()
, as the latter is a lot slower. This matters particularly in inner loops (but note thatstrjoina()
cannot be used there).
Runtime Behaviour
-
Avoid leaving long-running child processes around, i.e.
fork()
s that are not followed quickly by anexecv()
in the child. Resource management is unclear in this case, and memory CoW will result in unexpected penalties in the parent much, much later on. -
Don't block execution for arbitrary amounts of time using
usleep()
or a similar call, unless you really know what you do. Just "giving something some time", or so is a lazy excuse. Always wait for the proper event, instead of doing time-based poll loops. -
Whenever installing a signal handler, make sure to set
SA_RESTART
for it, so that interrupted system calls are automatically restarted, and we minimize hassles with handlingEINTR
(in particular asEINTR
handling is pretty broken on Linux). -
When applying C-style unescaping as well as specifier expansion on the same string, always apply the C-style unescaping fist, followed by the specifier expansion. When doing the reverse, make sure to escape
%
in specifier-style first (i.e.%
→%%
), and then do C-style escaping where necessary. -
Be exceptionally careful when formatting and parsing floating point numbers. Their syntax is locale dependent (i.e.
5.000
in en_US is generally understood as 5, while in de_DE as 5000.). -
Make sure to enforce limits on every user controllable resource. If the user can allocate resources in your code, your code must enforce some form of limits after which it will refuse operation. It's fine if it is hard-coded (at least initially), but it needs to be there. This is particularly important for objects that unprivileged users may allocate, but also matters for everything else any user may allocated.
Types
-
Think about the types you use. If a value cannot sensibly be negative, do not use
int
, but useunsigned
. -
Use
char
only for actual characters. Useuint8_t
orint8_t
when you actually mean a byte-sized signed or unsigned integers. When referring to a generic byte, we generally prefer the unsigned variantuint8_t
. Do not use types based onshort
. They never make sense. Useint
,long
,long long
, all in unsigned and signed fashion, and the fixed-size typesuint8_t
,uint16_t
,uint32_t
,uint64_t
,int8_t
,int16_t
,int32_t
and so on, as well assize_t
, but nothing else. Do not use kernel types likeu32
and so on, leave that to the kernel. -
Stay uniform. For example, always use
usec_t
for time values. Do not mixusec
andmsec
, andusec
and whatnot. -
Never use the
off_t
type, and particularly avoid it in public APIs. It's really weirdly defined, as it usually is 64-bit and we don't support it any other way, but it could in theory also be 32-bit. Which one it is depends on a compiler switch chosen by the compiled program, which hence corrupts APIs using it unless they can also follow the program's choice. Moreover, in systemd we should parse values the same way on all architectures and cannot exposeoff_t
values over D-Bus. To avoid any confusion regarding conversion and ABIs, always use simplyuint64_t
directly. -
Unless you allocate an array,
double
is always a better choice thanfloat
. Processors speakdouble
natively anyway, so there is no speed benefit, and on calls likeprintf()
float
s get promoted todouble
s anyway, so there is no point. -
Use the bool type for booleans, not integers. One exception: in public headers (i.e those in
src/systemd/sd-*.h
) use integers after all, asbool
is C99 and in our public APIs we try to stick to C89 (with a few extension).
Deadlocks
-
Do not issue NSS requests (that includes user name and host name lookups) from PID 1 as this might trigger deadlocks when those lookups involve synchronously talking to services that we would need to start up.
-
Do not synchronously talk to any other service from PID 1, due to risk of deadlocks.
File Descriptors
-
When you allocate a file descriptor, it should be made
O_CLOEXEC
right from the beginning, as none of our files should leak to forked binaries by default. Hence, whenever you open a file,O_CLOEXEC
must be specified, right from the beginning. This also applies to sockets. Effectively, this means that all invocations to:open()
must getO_CLOEXEC
passed,socket()
andsocketpair()
must getSOCK_CLOEXEC
passed,recvmsg()
must getMSG_CMSG_CLOEXEC
set,F_DUPFD_CLOEXEC
should be used instead ofF_DUPFD
, and so on,- invocations of
fopen()
should takee
.
-
It's a good idea to use
O_NONBLOCK
when opening 'foreign' regular files, i.e. file system objects that are supposed to be regular files whose paths where specified by the user and hence might actually refer to other types of file system objects. This is a good idea so that we don't end up blocking on 'strange' file nodes, for example if the user pointed us to a FIFO or device node which may block when opening. Moreover even for actual regular filesO_NONBLOCK
has a benefit: it bypasses any mandatory lock that might be in effect on the regular file. If in doubt consider turning offO_NONBLOCK
again after opening.
Command Line
-
If you parse a command line, and want to store the parsed parameters in global variables, please consider prefixing their names with
arg_
. We have been following this naming rule in most of our tools, and we should continue to do so, as it makes it easy to identify command line parameter variables, and makes it clear why it is OK that they are global variables. -
Command line option parsing:
- Do not print full
help()
on error, be specific about the error. - Do not print messages to stdout on error.
- Do not POSIX_ME_HARDER unless necessary, i.e. avoid
+
in option string.
- Do not print full
Exporting Symbols
-
Variables and functions must be static, unless they have a prototype, and are supposed to be exported.
-
Public API calls (i.e. functions exported by our shared libraries) must be marked
_public_
and need to be prefixed withsd_
. No other functions should be prefixed like that. -
When exposing public C APIs, be careful what function parameters you make
const
. For example, a parameter taking a context object should probably not beconst
, even if you are writing an otherwise read-only accessor function for it. The reason is that making itconst
fixates the contract that your call won't alter the object ever, as part of the API. However, that's often quite a promise, given that this even prohibits object-internal caching or lazy initialization of object variables. Moreover, it's usually not too useful for client applications. Hence, please be careful and avoidconst
on object parameters, unless you are very sureconst
is appropriate.
Referencing Concepts
-
When referring to a configuration file option in the documentation and such, please always suffix it with
=
, to indicate that it is a configuration file setting. -
When referring to a command line option in the documentation and such, please always prefix with
--
or-
(as appropriate), to indicate that it is a command line option. -
When referring to a file system path that is a directory, please always suffix it with
/
, to indicate that it is a directory, not a regular file (or other file system object).
Functions to Avoid
-
Use
memzero()
or even betterzero()
instead ofmemset(..., 0, ...)
-
Please use
streq()
andstrneq()
instead ofstrcmp()
,strncmp()
where applicable (i.e. wherever you just care about equality/inequality, not about the sorting order). -
Never use
strtol()
,atoi()
and similar calls. Usesafe_atoli()
,safe_atou32()
and suchlike instead. They are much nicer to use in most cases and correctly check for parsing errors. -
htonl()
/ntohl()
andhtons()
/ntohs()
are weird. Please usehtobe32()
andhtobe16()
instead, it's much more descriptive, and actually says what really is happening, after allhtonl()
andhtons()
don't operate onlong
s andshort
s as their name would suggest, but onuint32_t
anduint16_t
. Also, "network byte order" is just a weird name for "big endian", hence we might want to call it "big endian" right-away. -
Please never use
dup()
. Usefcntl(fd, F_DUPFD_CLOEXEC, 3)
instead. For two reason: first, you wantO_CLOEXEC
set on the newfd
(see above). Second,dup()
will happily duplicate yourfd
as 0, 1, 2, i.e. stdin, stdout, stderr, should thosefd
s be closed. Given the special semantics of thosefd
s, it's probably a good idea to avoid them.F_DUPFD_CLOEXEC
with3
as parameter avoids them. -
Don't use
fgets()
, it's too hard to properly handle errors such as overly long lines. Useread_line()
instead, which is our own function that handles this much nicer. -
Don't invoke
exit()
, ever. It is not replacement for proper error handling. Please escalate errors up your call chain, and use normalreturn
to exit from the main function of a process. If youfork()
ed off a child process, please use_exit()
instead ofexit()
, so that the exit handlers are not run. -
We never use the POSIX version of
basename()
(which glibc defines it inlibgen.h
), only the GNU version (which glibc defines instring.h
). The only reason to includelibgen.h
is becausedirname()
is needed. Every time you need that please immediately undefinebasename()
, and add a comment about it, so that no code ever ends up using the POSIX version!
Committing to git
-
Commit message subject lines should be prefixed with an appropriate component name of some kind. For example "journal: ", "nspawn: " and so on.
-
Do not use "Signed-Off-By:" in your commit messages. That's a kernel thing we don't do in the systemd project.