Merge remote-tracking branch 'wireless/main' into wireless-next

Pull in wireless/main content since some new code would
otherwise conflict with it.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>

commit dfd2d876b3
6
.gitignore
vendored
@@ -37,6 +37,8 @@
 *.o
 *.o.*
 *.patch
+*.rmeta
+*.rsi
 *.s
 *.so
 *.so.dbg
@@ -97,6 +99,7 @@ modules.order
 !.gitattributes
 !.gitignore
 !.mailmap
+!.rustfmt.toml

 #
 # Generated include files
@@ -162,3 +165,6 @@ x509.genkey

 # Documentation toolchain
 sphinx_*/
+
+# Rust analyzer configuration
+/rust-project.json
12
.rustfmt.toml
Normal file
@@ -0,0 +1,12 @@
+edition = "2021"
+newline_style = "Unix"
+
+# Unstable options that help catching some mistakes in formatting and that we may want to enable
+# when they become stable.
+#
+# They are kept here since they are useful to run from time to time.
+#format_code_in_doc_comments = true
+#reorder_impl_items = true
+#comment_width = 100
+#wrap_comments = true
+#normalize_comments = true
@@ -3,7 +3,7 @@ Date: May 2011
 KernelVersion: 3.0
 Contact: Rafał Miłecki <zajec5@gmail.com>
 Description:
-Each BCMA core has it's manufacturer id. See
+Each BCMA core has its manufacturer id. See
 include/linux/bcma/bcma.h for possible values.

 What: /sys/bus/bcma/devices/.../id
@@ -31,7 +31,7 @@ Description: 'FCoE Controller' instances on the fcoe bus.
 1) Write interface name to ctlr_create 2) Configure the FCoE
 Controller (ctlr_X) 3) Enable the FCoE Controller to begin
 discovery and login. The FCoE Controller is destroyed by
-writing it's name, i.e. ctlr_X to the ctlr_delete file.
+writing its name, i.e. ctlr_X to the ctlr_delete file.

 Attributes:

@@ -18,7 +18,7 @@ Description:
 on the signal from which time of flight measurements are
 taken.
 The appropriate values to take is dependent on both the
-sensor and it's operating environment:
+sensor and its operating environment:
 * as3935 (0-31 range)
 18 = indoors (default)
 14 = outdoors
@@ -296,7 +296,7 @@ Description: Processor frequency boosting control

 This switch controls the boost setting for the whole system.
 Boosting allows the CPU and the firmware to run at a frequency
-beyond it's nominal limit.
+beyond its nominal limit.

 More details can be found in
 Documentation/admin-guide/pm/cpufreq.rst
@@ -2,8 +2,8 @@ What: /sys/bus/platform/devices/ci_hdrc.0/role
 Date: Mar 2017
 Contact: Peter Chen <peter.chen@nxp.com>
 Description:
-It returns string "gadget" or "host" when read it, it indicates
-current controller role.
+When read, it returns string "gadget" or "host", indicating
+the current controller role.

-It will do role switch when write "gadget" or "host" to it.
+It will do role switch when "gadget" or "host" is written to it.
 Only controller at dual-role configuration supports writing.
@@ -152,7 +152,7 @@ Description:
 case further investigation is required to determine which
 device is causing the problem. Note that genuine RTC clock
 values (such as when pm_trace has not been used), can still
-match a device and output it's name here.
+match a device and output its name here.

 What: /sys/power/pm_async
 Date: January 2009
@@ -66,8 +66,13 @@ over a rather long period of time, but improvements are always welcome!
 As a rough rule of thumb, any dereference of an RCU-protected
 pointer must be covered by rcu_read_lock(), rcu_read_lock_bh(),
 rcu_read_lock_sched(), or by the appropriate update-side lock.
-Disabling of preemption can serve as rcu_read_lock_sched(), but
-is less readable and prevents lockdep from detecting locking issues.
+Explicit disabling of preemption (preempt_disable(), for example)
+can serve as rcu_read_lock_sched(), but is less readable and
+prevents lockdep from detecting locking issues.
+
+Please not that you *cannot* rely on code known to be built
+only in non-preemptible kernels. Such code can and will break,
+especially in kernels built with CONFIG_PREEMPT_COUNT=y.

 Letting RCU-protected pointers "leak" out of an RCU read-side
 critical section is every bit as bad as letting them leak out
@@ -185,6 +190,9 @@ over a rather long period of time, but improvements are always welcome!

 5. If call_rcu() or call_srcu() is used, the callback function will
 be called from softirq context. In particular, it cannot block.
+If you need the callback to block, run that code in a workqueue
+handler scheduled from the callback. The queue_rcu_work()
+function does this for you in the case of call_rcu().

 6. Since synchronize_rcu() can block, it cannot be called
 from any sort of irq context. The same rule applies
@@ -297,7 +305,8 @@ over a rather long period of time, but improvements are always welcome!
 the machine.

 d. Periodically invoke synchronize_rcu(), permitting a limited
-number of updates per grace period.
+number of updates per grace period. Better yet, periodically
+invoke rcu_barrier() to wait for all outstanding callbacks.

 The same cautions apply to call_srcu() and kfree_rcu().

@@ -477,6 +486,6 @@ over a rather long period of time, but improvements are always welcome!
 So if you need to wait for both an RCU grace period and for
 all pre-existing call_rcu() callbacks, you will need to execute
 both rcu_barrier() and synchronize_rcu(), if necessary, using
-something like workqueues to to execute them concurrently.
+something like workqueues to execute them concurrently.

 See rcubarrier.rst for more information.
@@ -61,7 +61,7 @@ checking of rcu_dereference() primitives:
 rcu_access_pointer(p):
 Return the value of the pointer and omit all barriers,
 but retain the compiler constraints that prevent duplicating
-or coalescsing. This is useful when when testing the
+or coalescsing. This is useful when testing the
 value of the pointer itself, for example, against NULL.

 The rcu_dereference_check() check expression can be any boolean
@@ -128,10 +128,16 @@ Follow these rules to keep your RCU code working properly:
 This sort of comparison occurs frequently when scanning
 RCU-protected circular linked lists.

-Note that if checks for being within an RCU read-side
-critical section are not required and the pointer is never
-dereferenced, rcu_access_pointer() should be used in place
-of rcu_dereference().
+Note that if the pointer comparison is done outside
+of an RCU read-side critical section, and the pointer
+is never dereferenced, rcu_access_pointer() should be
+used in place of rcu_dereference(). In most cases,
+it is best to avoid accidental dereferences by testing
+the rcu_access_pointer() return value directly, without
+assigning it to a variable.
+
+Within an RCU read-side critical section, there is little
+reason to use rcu_access_pointer().

 - The comparison is against a pointer that references memory
 that was initialized "a long time ago." The reason
@@ -6,13 +6,15 @@ What is RCU? -- "Read, Copy, Update"
 Please note that the "What is RCU?" LWN series is an excellent place
 to start learning about RCU:

-| 1. What is RCU, Fundamentally? http://lwn.net/Articles/262464/
-| 2. What is RCU? Part 2: Usage http://lwn.net/Articles/263130/
-| 3. RCU part 3: the RCU API http://lwn.net/Articles/264090/
-| 4. The RCU API, 2010 Edition http://lwn.net/Articles/418853/
-| 2010 Big API Table http://lwn.net/Articles/419086/
-| 5. The RCU API, 2014 Edition http://lwn.net/Articles/609904/
-| 2014 Big API Table http://lwn.net/Articles/609973/
+| 1. What is RCU, Fundamentally? https://lwn.net/Articles/262464/
+| 2. What is RCU? Part 2: Usage https://lwn.net/Articles/263130/
+| 3. RCU part 3: the RCU API https://lwn.net/Articles/264090/
+| 4. The RCU API, 2010 Edition https://lwn.net/Articles/418853/
+| 2010 Big API Table https://lwn.net/Articles/419086/
+| 5. The RCU API, 2014 Edition https://lwn.net/Articles/609904/
+| 2014 Big API Table https://lwn.net/Articles/609973/
+| 6. The RCU API, 2019 Edition https://lwn.net/Articles/777036/
+| 2019 Big API Table https://lwn.net/Articles/777165/


 What is RCU?
@@ -915,13 +917,18 @@ which an RCU reference is held include:
 The understanding that RCU provides a reference that only prevents a
 change of type is particularly visible with objects allocated from a
 slab cache marked ``SLAB_TYPESAFE_BY_RCU``. RCU operations may yield a
-reference to an object from such a cache that has been concurrently
-freed and the memory reallocated to a completely different object,
-though of the same type. In this case RCU doesn't even protect the
-identity of the object from changing, only its type. So the object
-found may not be the one expected, but it will be one where it is safe
-to take a reference or spinlock and then confirm that the identity
-matches the expectations.
+reference to an object from such a cache that has been concurrently freed
+and the memory reallocated to a completely different object, though of
+the same type. In this case RCU doesn't even protect the identity of the
+object from changing, only its type. So the object found may not be the
+one expected, but it will be one where it is safe to take a reference
+(and then potentially acquiring a spinlock), allowing subsequent code
+to check whether the identity matches expectations. It is tempting
+to simply acquire the spinlock without first taking the reference, but
+unfortunately any spinlock in a ``SLAB_TYPESAFE_BY_RCU`` object must be
+initialized after each and every call to kmem_cache_alloc(), which renders
+reference-free spinlock acquisition completely unsafe. Therefore, when
+using ``SLAB_TYPESAFE_BY_RCU``, make proper use of a reference counter.

 With traditional reference counting -- such as that implemented by the
 kref library in Linux -- there is typically code that runs when the last
@@ -1057,14 +1064,20 @@ SRCU: Initialization/cleanup::
 init_srcu_struct
 cleanup_srcu_struct

-All: lockdep-checked RCU-protected pointer access::
+All: lockdep-checked RCU utility APIs::

-rcu_access_pointer
-rcu_dereference_raw
 RCU_LOCKDEP_WARN
 rcu_sleep_check
 RCU_NONIDLE

+All: Unchecked RCU-protected pointer access::
+
+rcu_dereference_raw
+
+All: Unchecked RCU-protected pointer access with dereferencing prohibited::
+
+rcu_access_pointer
+
 See the comment headers in the source code (or the docbook generated
 from them) for more information.

@@ -262,8 +262,6 @@ Compiling the kernel
 - Make sure you have at least gcc 5.1 available.
 For more information, refer to :ref:`Documentation/process/changes.rst <changes>`.

-Please note that you can still run a.out user programs with this kernel.
-
 - Do a ``make`` to create a compressed kernel image. It is also
 possible to do ``make install`` if you have lilo installed to suit the
 kernel makefiles, but you may want to check your particular lilo setup first.
@@ -332,85 +330,10 @@ Compiling the kernel
 If something goes wrong
 -----------------------

-- If you have problems that seem to be due to kernel bugs, please check
-the file MAINTAINERS to see if there is a particular person associated
-with the part of the kernel that you are having trouble with. If there
-isn't anyone listed there, then the second best thing is to mail
-them to me (torvalds@linux-foundation.org), and possibly to any other
-relevant mailing-list or to the newsgroup.
+If you have problems that seem to be due to kernel bugs, please follow the
+instructions at 'Documentation/admin-guide/reporting-issues.rst'.

-- In all bug-reports, *please* tell what kernel you are talking about,
-how to duplicate the problem, and what your setup is (use your common
-sense). If the problem is new, tell me so, and if the problem is
-old, please try to tell me when you first noticed it.
-
-- If the bug results in a message like::
-
-unable to handle kernel paging request at address C0000010
-Oops: 0002
-EIP: 0010:XXXXXXXX
-eax: xxxxxxxx ebx: xxxxxxxx ecx: xxxxxxxx edx: xxxxxxxx
-esi: xxxxxxxx edi: xxxxxxxx ebp: xxxxxxxx
-ds: xxxx es: xxxx fs: xxxx gs: xxxx
-Pid: xx, process nr: xx
-xx xx xx xx xx xx xx xx xx xx
-
-or similar kernel debugging information on your screen or in your
-system log, please duplicate it *exactly*. The dump may look
-incomprehensible to you, but it does contain information that may
-help debugging the problem. The text above the dump is also
-important: it tells something about why the kernel dumped code (in
-the above example, it's due to a bad kernel pointer). More information
-on making sense of the dump is in Documentation/admin-guide/bug-hunting.rst
-
-- If you compiled the kernel with CONFIG_KALLSYMS you can send the dump
-as is, otherwise you will have to use the ``ksymoops`` program to make
-sense of the dump (but compiling with CONFIG_KALLSYMS is usually preferred).
-This utility can be downloaded from
-https://www.kernel.org/pub/linux/utils/kernel/ksymoops/ .
-Alternatively, you can do the dump lookup by hand:
-
-- In debugging dumps like the above, it helps enormously if you can
-look up what the EIP value means. The hex value as such doesn't help
-me or anybody else very much: it will depend on your particular
-kernel setup. What you should do is take the hex value from the EIP
-line (ignore the ``0010:``), and look it up in the kernel namelist to
-see which kernel function contains the offending address.
-
-To find out the kernel function name, you'll need to find the system
-binary associated with the kernel that exhibited the symptom. This is
-the file 'linux/vmlinux'. To extract the namelist and match it against
-the EIP from the kernel crash, do::
-
-nm vmlinux | sort | less
-
-This will give you a list of kernel addresses sorted in ascending
-order, from which it is simple to find the function that contains the
-offending address. Note that the address given by the kernel
-debugging messages will not necessarily match exactly with the
-function addresses (in fact, that is very unlikely), so you can't
-just 'grep' the list: the list will, however, give you the starting
-point of each kernel function, so by looking for the function that
-has a starting address lower than the one you are searching for but
-is followed by a function with a higher address you will find the one
-you want. In fact, it may be a good idea to include a bit of
-"context" in your problem report, giving a few lines around the
-interesting one.
-
-If you for some reason cannot do the above (you have a pre-compiled
-kernel image or similar), telling me as much about your setup as
-possible will help. Please read
-'Documentation/admin-guide/reporting-issues.rst' for details.
-
-- Alternatively, you can use gdb on a running kernel. (read-only; i.e. you
-cannot change values or set break points.) To do this, first compile the
-kernel with -g; edit arch/x86/Makefile appropriately, then do a ``make
-clean``. You'll also need to enable CONFIG_PROC_FS (via ``make config``).
-
-After you've rebooted with the new kernel, do ``gdb vmlinux /proc/kcore``.
-You can now use all the usual gdb commands. The command to look up the
-point where your system crashed is ``l *0xXXXXXXXX``. (Replace the XXXes
-with the EIP value.)
-
-gdb'ing a non-running kernel currently fails because ``gdb`` (wrongly)
-disregards the starting offset for which the kernel is compiled.
+Hints on understanding kernel bug reports are in
+'Documentation/admin-guide/bug-hunting.rst'. More on debugging the kernel
+with gdb is in 'Documentation/dev-tools/gdb-kernel-debugging.rst' and
+'Documentation/dev-tools/kgdb.rst'.
@@ -1,13 +0,0 @@
-.. SPDX-License-Identifier: GPL-2.0
-
-===============
-Overriding DSDT
-===============
-
-Linux supports a method of overriding the BIOS DSDT:
-
-CONFIG_ACPI_CUSTOM_DSDT - builds the image into the kernel.
-
-When to use this method is described in detail on the
-Linux/ACPI home page:
-https://01.org/linux-acpi/documentation/overriding-dsdt
@@ -613,6 +613,7 @@ kernel command line.
 eibrs                   enhanced IBRS
 eibrs,retpoline         enhanced IBRS + Retpolines
 eibrs,lfence            enhanced IBRS + LFENCE
+ibrs                    use IBRS to protect kernel

 Not specifying this option is equivalent to
 spectre_v2=auto.
@@ -200,7 +200,7 @@ prb

 A pointer to the printk ringbuffer (struct printk_ringbuffer). This
 may be pointing to the static boot ringbuffer or the dynamically
-allocated ringbuffer, depending on when the the core dump occurred.
+allocated ringbuffer, depending on when the core dump occurred.
 Used by user-space tools to read the active kernel log buffer.

 printk_rb_static
@@ -3801,6 +3801,10 @@

 nox2apic [X86-64,APIC] Do not enable x2APIC mode.

+NOTE: this parameter will be ignored on systems with the
+LEGACY_XAPIC_DISABLED bit set in the
+IA32_XAPIC_DISABLE_STATUS MSR.
+
 nps_mtm_hs_ctr= [KNL,ARC]
 This parameter sets the maximum duration, in
 cycles, each HW thread of the CTOP can run
@@ -65,7 +65,7 @@ HugePages_Surp
 may be temporarily larger than the maximum number of surplus huge
 pages when the system is under memory pressure.
 Hugepagesize
-is the default hugepage size (in Kb).
+is the default hugepage size (in kB).
 Hugetlb
 is the total amount of memory (in kB), consumed by huge
 pages of all sizes.
@@ -102,6 +102,9 @@ Values:
 - 1 - enable JIT hardening for unprivileged users only
 - 2 - enable JIT hardening for all users

+where "privileged user" in this context means a process having
+CAP_BPF or CAP_SYS_ADMIN in the root user name space.
+
 bpf_jit_kallsyms
 ----------------

@@ -134,6 +134,12 @@ More detailed explanation for tainting
 scsi/snic on something else than x86_64, scsi/ips on non
 x86/x86_64/itanium, have broken firmware settings for the
 irqchip/irq-gic on arm64 ...).
+- x86/x86_64: Microcode late loading is dangerous and will result in
+tainting the kernel. It requires that all CPUs rendezvous to make sure
+the update happens when the system is as quiescent as possible. However,
+a higher priority MCE/SMI/NMI can move control flow away from that
+rendezvous and interrupt the update, which can be detrimental to the
+machine.

 3) ``R`` if a module was force unloaded by ``rmmod -f``, ``' '`` if all
 modules were unloaded normally.
30
Documentation/bpf/clang-notes.rst
Normal file
@@ -0,0 +1,30 @@
+.. contents::
+.. sectnum::
+
+==========================
+Clang implementation notes
+==========================
+
+This document provides more details specific to the Clang/LLVM implementation of the eBPF instruction set.
+
+Versions
+========
+
+Clang defined "CPU" versions, where a CPU version of 3 corresponds to the current eBPF ISA.
+
+Clang can select the eBPF ISA version using ``-mcpu=v3`` for example to select version 3.
+
+Arithmetic instructions
+=======================
+
+For CPU versions prior to 3, Clang v7.0 and later can enable ``BPF_ALU`` support with
+``-Xclang -target-feature -Xclang +alu32``. In CPU version 3, support is automatically included.
+
+Atomic operations
+=================
+
+Clang can generate atomic instructions by default when ``-mcpu=v3`` is
+enabled. If a lower version for ``-mcpu`` is set, the only atomic instruction
+Clang can generate is ``BPF_ADD`` *without* ``BPF_FETCH``. If you need to enable
+the atomics features, while keeping a lower ``-mcpu`` version, you can use
+``-Xclang -target-feature -Xclang +alu32``.
@@ -26,6 +26,8 @@ that goes into great technical depth about the BPF Architecture.
 classic_vs_extended.rst
 bpf_licensing
 test_debug
+clang-notes
+linux-notes
 other

 .. only:: subproject and html
@@ -1,7 +1,12 @@
+.. contents::
+.. sectnum::
+
+========================================
+eBPF Instruction Set Specification, v1.0
+========================================
+
+This document specifies version 1.0 of the eBPF instruction set.

-====================
-eBPF Instruction Set
-====================

 Registers and calling convention
 ================================
@@ -11,10 +16,10 @@ all of which are 64-bits wide.

 The eBPF calling convention is defined as:

-* R0: return value from function calls, and exit value for eBPF programs
-* R1 - R5: arguments for function calls
-* R6 - R9: callee saved registers that function calls will preserve
-* R10: read-only frame pointer to access stack
+* R0: return value from function calls, and exit value for eBPF programs
+* R1 - R5: arguments for function calls
+* R6 - R9: callee saved registers that function calls will preserve
+* R10: read-only frame pointer to access stack

 R0 - R5 are scratch registers and eBPF programs needs to spill/fill them if
 necessary across calls.
@@ -24,17 +29,17 @@ Instruction encoding

 eBPF has two instruction encodings:

-* the basic instruction encoding, which uses 64 bits to encode an instruction
-* the wide instruction encoding, which appends a second 64-bit immediate value
-(imm64) after the basic instruction for a total of 128 bits.
+* the basic instruction encoding, which uses 64 bits to encode an instruction
+* the wide instruction encoding, which appends a second 64-bit immediate value
+(imm64) after the basic instruction for a total of 128 bits.

 The basic instruction encoding looks as follows:

-=============  =======  ===============  ====================  ============
-32 bits (MSB)  16 bits  4 bits           4 bits                8 bits (LSB)
-=============  =======  ===============  ====================  ============
-immediate      offset   source register  destination register  opcode
-=============  =======  ===============  ====================  ============
+=============  =======  ===============  ====================  ============
+32 bits (MSB)  16 bits  4 bits           4 bits                8 bits (LSB)
+=============  =======  ===============  ====================  ============
+immediate      offset   source register  destination register  opcode
+=============  =======  ===============  ====================  ============

 Note that most instructions do not use all of the fields.
 Unused fields shall be cleared to zero.
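The layout in the hunk above can be sketched in C as follows. This is only an illustration, assuming the instruction is viewed as a single 64-bit value with the opcode in the least significant byte. The names ``bpf_fields``, ``bpf_pack`` and ``bpf_unpack`` are hypothetical, and treating ``offset`` and ``imm`` as signed is an assumption that the table itself does not state.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of one basic (64-bit) instruction, per the encoding table. */
struct bpf_fields {
    uint8_t opcode;  /* bits 0-7 (LSB) */
    uint8_t dst_reg; /* bits 8-11 */
    uint8_t src_reg; /* bits 12-15 */
    int16_t offset;  /* bits 16-31, signedness assumed */
    int32_t imm;     /* bits 32-63 (MSB), signedness assumed */
};

/* Pack the five fields into one 64-bit instruction word. */
static uint64_t bpf_pack(struct bpf_fields f)
{
    /* Register numbers must fit in their 4-bit fields. */
    assert(f.dst_reg <= 0xf && f.src_reg <= 0xf);

    return (uint64_t)f.opcode
         | ((uint64_t)f.dst_reg << 8)
         | ((uint64_t)f.src_reg << 12)
         | ((uint64_t)(uint16_t)f.offset << 16)
         | ((uint64_t)(uint32_t)f.imm << 32);
}

/* Split a 64-bit instruction word back into its fields. */
static struct bpf_fields bpf_unpack(uint64_t insn)
{
    struct bpf_fields f;

    f.opcode  = (uint8_t)(insn & 0xff);
    f.dst_reg = (uint8_t)((insn >> 8) & 0xf);
    f.src_reg = (uint8_t)((insn >> 12) & 0xf);
    f.offset  = (int16_t)(uint16_t)(insn >> 16);
    f.imm     = (int32_t)(uint32_t)(insn >> 32);
    return f;
}
```

Calling ``bpf_unpack(bpf_pack(f))`` should return the same field values that went in, which is a quick way to check that the shifts match the table.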
@@ -44,30 +49,30 @@ Instruction classes

 The three LSB bits of the 'opcode' field store the instruction class:

-=========  =====  ===============================
-class      value  description
-=========  =====  ===============================
-BPF_LD     0x00   non-standard load operations
-BPF_LDX    0x01   load into register operations
-BPF_ST     0x02   store from immediate operations
-BPF_STX    0x03   store from register operations
-BPF_ALU    0x04   32-bit arithmetic operations
-BPF_JMP    0x05   64-bit jump operations
-BPF_JMP32  0x06   32-bit jump operations
-BPF_ALU64  0x07   64-bit arithmetic operations
-=========  =====  ===============================
+=========  =====  ===============================  ===================================
+class      value  description                      reference
+=========  =====  ===============================  ===================================
+BPF_LD     0x00   non-standard load operations     `Load and store instructions`_
+BPF_LDX    0x01   load into register operations    `Load and store instructions`_
+BPF_ST     0x02   store from immediate operations  `Load and store instructions`_
+BPF_STX    0x03   store from register operations   `Load and store instructions`_
+BPF_ALU    0x04   32-bit arithmetic operations     `Arithmetic and jump instructions`_
+BPF_JMP    0x05   64-bit jump operations           `Arithmetic and jump instructions`_
+BPF_JMP32  0x06   32-bit jump operations           `Arithmetic and jump instructions`_
+BPF_ALU64  0x07   64-bit arithmetic operations     `Arithmetic and jump instructions`_
+=========  =====  ===============================  ===================================

 Arithmetic and jump instructions
 ================================

-For arithmetic and jump instructions (BPF_ALU, BPF_ALU64, BPF_JMP and
-BPF_JMP32), the 8-bit 'opcode' field is divided into three parts:
+For arithmetic and jump instructions (``BPF_ALU``, ``BPF_ALU64``, ``BPF_JMP`` and
+``BPF_JMP32``), the 8-bit 'opcode' field is divided into three parts:

-==============  ======  =================
-4 bits (MSB)    1 bit   3 bits (LSB)
-==============  ======  =================
-operation code  source  instruction class
-==============  ======  =================
+==============  ======  =================
+4 bits (MSB)    1 bit   3 bits (LSB)
+==============  ======  =================
+operation code  source  instruction class
+==============  ======  =================

 The 4th bit encodes the source operand:

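As a rough illustration of the three-part opcode split described in the hunk above, the following C sketch builds and takes apart an 8-bit opcode. The class values come from the class table. The ``BPF_K``/``BPF_X`` source values and the ``BPF_ADD``/``BPF_XOR`` operation codes are taken from the kernel's UAPI definitions rather than from this hunk, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Instruction classes: the three LSB bits of the opcode. */
enum { BPF_LD = 0x00, BPF_LDX = 0x01, BPF_ST = 0x02, BPF_STX = 0x03,
       BPF_ALU = 0x04, BPF_JMP = 0x05, BPF_JMP32 = 0x06, BPF_ALU64 = 0x07 };

/* Source operand bit (assumed values: immediate vs. register). */
enum { BPF_K = 0x00, BPF_X = 0x08 };

/* Two sample operation codes: the four MSB bits of the opcode. */
enum { BPF_ADD = 0x00, BPF_XOR = 0xa0 };

/* Combine operation code, source bit and class into one opcode byte. */
static uint8_t bpf_make_opcode(uint8_t code, uint8_t source, uint8_t cls)
{
    /* Each part must stay inside its own bit range. */
    assert((code & ~0xf0) == 0);
    assert((source & ~0x08) == 0);
    assert((cls & ~0x07) == 0);
    return (uint8_t)(code | source | cls);
}

static uint8_t bpf_class(uint8_t opcode)  { return opcode & 0x07; }
static uint8_t bpf_source(uint8_t opcode) { return opcode & 0x08; }
static uint8_t bpf_code(uint8_t opcode)   { return opcode & 0xf0; }
```

Under these assumptions, ``bpf_make_opcode(BPF_ADD, BPF_X, BPF_ALU)`` yields 0x0c and ``bpf_make_opcode(BPF_XOR, BPF_K, BPF_ALU64)`` yields 0xa7.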
@@ -84,66 +89,66 @@ The four MSB bits store the operation code.
 Arithmetic instructions
 -----------------------

-BPF_ALU uses 32-bit wide operands while BPF_ALU64 uses 64-bit wide operands for
+``BPF_ALU`` uses 32-bit wide operands while ``BPF_ALU64`` uses 64-bit wide operands for
 otherwise identical operations.
-The code field encodes the operation as below:
+The 'code' field encodes the operation as below:

-========  =====  =================================================
-code      value  description
-========  =====  =================================================
-BPF_ADD   0x00   dst += src
-BPF_SUB   0x10   dst -= src
-BPF_MUL   0x20   dst \*= src
-BPF_DIV   0x30   dst /= src
-BPF_OR    0x40   dst \|= src
-BPF_AND   0x50   dst &= src
-BPF_LSH   0x60   dst <<= src
-BPF_RSH   0x70   dst >>= src
-BPF_NEG   0x80   dst = ~src
-BPF_MOD   0x90   dst %= src
-BPF_XOR   0xa0   dst ^= src
-BPF_MOV   0xb0   dst = src
-BPF_ARSH  0xc0   sign extending shift right
-BPF_END   0xd0   byte swap operations (see separate section below)
-========  =====  =================================================
+========  =====  ==========================================================
+code      value  description
+========  =====  ==========================================================
+BPF_ADD   0x00   dst += src
+BPF_SUB   0x10   dst -= src
+BPF_MUL   0x20   dst \*= src
+BPF_DIV   0x30   dst /= src
+BPF_OR    0x40   dst \|= src
+BPF_AND   0x50   dst &= src
+BPF_LSH   0x60   dst <<= src
+BPF_RSH   0x70   dst >>= src
+BPF_NEG   0x80   dst = ~src
+BPF_MOD   0x90   dst %= src
+BPF_XOR   0xa0   dst ^= src
+BPF_MOV   0xb0   dst = src
+BPF_ARSH  0xc0   sign extending shift right
+BPF_END   0xd0   byte swap operations (see `Byte swap instructions`_ below)
+========  =====  ==========================================================

-BPF_ADD | BPF_X | BPF_ALU means::
+``BPF_ADD | BPF_X | BPF_ALU`` means::

 dst_reg = (u32) dst_reg + (u32) src_reg;

-BPF_ADD | BPF_X | BPF_ALU64 means::
+``BPF_ADD | BPF_X | BPF_ALU64`` means::

 dst_reg = dst_reg + src_reg

-BPF_XOR | BPF_K | BPF_ALU means::
+``BPF_XOR | BPF_K | BPF_ALU`` means::

 src_reg = (u32) src_reg ^ (u32) imm32

-BPF_XOR | BPF_K | BPF_ALU64 means::
+``BPF_XOR | BPF_K | BPF_ALU64`` means::

 src_reg = src_reg ^ imm32


 Byte swap instructions
-----------------------
+~~~~~~~~~~~~~~~~~~~~~~

 The byte swap instructions use an instruction class of ``BPF_ALU`` and a 4-bit
-code field of ``BPF_END``.
+'code' field of ``BPF_END``.

 The byte swap instructions operate on the destination register
 only and do not use a separate source register or immediate value.

-The 1-bit source operand field in the opcode is used to to select what byte
+The 1-bit source operand field in the opcode is used to select what byte
 order the operation convert from or to:

-=========  =====  =================================================
-source     value  description
-=========  =====  =================================================
-BPF_TO_LE  0x00   convert between host byte order and little endian
-BPF_TO_BE  0x08   convert between host byte order and big endian
-=========  =====  =================================================
+=========  =====  =================================================
+source     value  description
+=========  =====  =================================================
+BPF_TO_LE  0x00   convert between host byte order and little endian
+BPF_TO_BE  0x08   convert between host byte order and big endian
+=========  =====  =================================================

-The imm field encodes the width of the swap operations. The following widths
+The 'imm' field encodes the width of the swap operations. The following widths
 are supported: 16, 32 and 64.

 Examples:
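The byte swap rules in the hunk above (source bit selects the target byte order, 'imm' selects the width) could be modelled in C roughly as below. This is a sketch, not the kernel's implementation: ``bpf_end`` is a hypothetical helper, it relies on the glibc/BSD ``<endian.h>`` conversion macros, and it assumes that 16- and 32-bit swaps use only the low bits of the register and zero-extend the result.

```c
#define _DEFAULT_SOURCE /* asks glibc to expose htobe16() and friends */
#include <assert.h>
#include <endian.h>
#include <stdint.h>

/* Source-bit values from the byte swap table. */
enum { BPF_TO_LE = 0x00, BPF_TO_BE = 0x08 };

/*
 * Hypothetical model of a byte swap instruction: returns the value
 * left in dst_reg. 'source' is the 1-bit source field, 'imm' the width.
 * Assumption: narrower widths drop the upper bits and zero-extend.
 */
static uint64_t bpf_end(uint64_t dst_reg, int source, int imm)
{
    assert(source == BPF_TO_LE || source == BPF_TO_BE);
    assert(imm == 16 || imm == 32 || imm == 64);

    if (source == BPF_TO_BE) {
        if (imm == 16)
            return htobe16((uint16_t)dst_reg);
        if (imm == 32)
            return htobe32((uint32_t)dst_reg);
        return htobe64(dst_reg);
    }
    if (imm == 16)
        return htole16((uint16_t)dst_reg);
    if (imm == 32)
        return htole32((uint32_t)dst_reg);
    return htole64(dst_reg);
}
```

On a little-endian host the ``BPF_TO_LE`` cases leave the low bits unchanged and the ``BPF_TO_BE`` cases reverse the bytes. On a big-endian host the roles are reversed.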
@@ -156,35 +161,31 @@ Examples:

 dst_reg = htobe64(dst_reg)

-``BPF_FROM_LE`` and ``BPF_FROM_BE`` exist as aliases for ``BPF_TO_LE`` and
-``BPF_TO_BE`` respectively.
-
-
 Jump instructions
 -----------------

-BPF_JMP32 uses 32-bit wide operands while BPF_JMP uses 64-bit wide operands for
+``BPF_JMP32`` uses 32-bit wide operands while ``BPF_JMP`` uses 64-bit wide operands for
 otherwise identical operations.
-The code field encodes the operation as below:
+The 'code' field encodes the operation as below:

-========  =====  =========================  ============
-code      value  description                notes
-========  =====  =========================  ============
-BPF_JA    0x00   PC += off                  BPF_JMP only
-BPF_JEQ   0x10   PC += off if dst == src
-BPF_JGT   0x20   PC += off if dst > src     unsigned
-BPF_JGE   0x30   PC += off if dst >= src    unsigned
-BPF_JSET  0x40   PC += off if dst & src
-BPF_JNE   0x50   PC += off if dst != src
-BPF_JSGT  0x60   PC += off if dst > src     signed
-BPF_JSGE  0x70   PC += off if dst >= src    signed
-BPF_CALL  0x80   function call
-BPF_EXIT  0x90   function / program return  BPF_JMP only
-BPF_JLT   0xa0   PC += off if dst < src     unsigned
-BPF_JLE   0xb0   PC += off if dst <= src    unsigned
-BPF_JSLT  0xc0   PC += off if dst < src     signed
-BPF_JSLE  0xd0   PC += off if dst <= src    signed
-========  =====  =========================  ============
+========  =====  =========================  ============
+code      value  description                notes
+========  =====  =========================  ============
+BPF_JA    0x00   PC += off                  BPF_JMP only
+BPF_JEQ   0x10   PC += off if dst == src
+BPF_JGT   0x20   PC += off if dst > src     unsigned
+BPF_JGE   0x30   PC += off if dst >= src    unsigned
+BPF_JSET  0x40   PC += off if dst & src
+BPF_JNE   0x50   PC += off if dst != src
+BPF_JSGT  0x60   PC += off if dst > src     signed
+BPF_JSGE  0x70   PC += off if dst >= src    signed
+BPF_CALL  0x80   function call
+BPF_EXIT  0x90   function / program return  BPF_JMP only
+BPF_JLT   0xa0   PC += off if dst < src     unsigned
+BPF_JLE   0xb0   PC += off if dst <= src    unsigned
+BPF_JSLT  0xc0   PC += off if dst < src     signed
+BPF_JSLE  0xd0   PC += off if dst <= src    signed
||||
======== ===== ========================= ============
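The signed and unsigned notes in the table can be made concrete with a small Python sketch that decides whether a conditional jump is taken. The function names ``to_signed`` and ``jump_taken`` are hypothetical. ``bits`` is 64 for ``BPF_JMP`` and 32 for ``BPF_JMP32``. ``BPF_CALL`` and ``BPF_EXIT`` are left out because they are not comparisons.

```python
# Code values copied from the table above.
BPF_JA, BPF_JEQ, BPF_JGT, BPF_JGE = 0x00, 0x10, 0x20, 0x30
BPF_JSET, BPF_JNE, BPF_JSGT, BPF_JSGE = 0x40, 0x50, 0x60, 0x70
BPF_JLT, BPF_JLE, BPF_JSLT, BPF_JSLE = 0xa0, 0xb0, 0xc0, 0xd0


def to_signed(value, bits):
    """Reinterpret the low `bits` bits of `value` as two's complement."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def jump_taken(code, dst, src, bits=64):
    """Return True if 'PC += off' would be applied for this comparison."""
    mask = (1 << bits) - 1
    u_dst, u_src = dst & mask, src & mask          # unsigned view
    s_dst, s_src = to_signed(dst, bits), to_signed(src, bits)  # signed view
    conditions = {
        BPF_JA: True,
        BPF_JEQ: u_dst == u_src,
        BPF_JGT: u_dst > u_src,
        BPF_JGE: u_dst >= u_src,
        BPF_JSET: (u_dst & u_src) != 0,
        BPF_JNE: u_dst != u_src,
        BPF_JSGT: s_dst > s_src,
        BPF_JSGE: s_dst >= s_src,
        BPF_JLT: u_dst < u_src,
        BPF_JLE: u_dst <= u_src,
        BPF_JSLT: s_dst < s_src,
        BPF_JSLE: s_dst <= s_src,
    }
    if code not in conditions:
        raise ValueError("not a jump comparison: %#x" % code)
    return conditions[code]
```

With an all-ones 64-bit ``dst``, ``BPF_JGT`` against 1 is taken (a large unsigned value) while ``BPF_JSGT`` is not (the same bits read as -1).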
|
||||
|
||||
The eBPF program needs to store the return value into register R0 before doing a
|
||||
BPF_EXIT.
|
||||
@ -193,14 +194,26 @@ BPF_EXIT.
|
||||
Load and store instructions
|
||||
===========================
|
||||
|
||||
For load and store instructions (BPF_LD, BPF_LDX, BPF_ST and BPF_STX), the
|
||||
For load and store instructions (``BPF_LD``, ``BPF_LDX``, ``BPF_ST``, and ``BPF_STX``), the
|
||||
8-bit 'opcode' field is divided as:
|
||||
|
||||
============ ====== =================
|
||||
3 bits (MSB) 2 bits 3 bits (LSB)
|
||||
============ ====== =================
|
||||
mode size instruction class
|
||||
============ ====== =================
|
||||
============ ====== =================
|
||||
3 bits (MSB) 2 bits 3 bits (LSB)
|
||||
============ ====== =================
|
||||
mode size instruction class
|
||||
============ ====== =================
|
||||
|
||||
The mode modifier is one of:
|
||||
|
||||
============= ===== ==================================== =============
|
||||
mode modifier value description reference
|
||||
============= ===== ==================================== =============
|
||||
BPF_IMM 0x00 64-bit immediate instructions `64-bit immediate instructions`_
|
||||
BPF_ABS 0x20 legacy BPF packet access (absolute) `Legacy BPF Packet access instructions`_
|
||||
BPF_IND 0x40 legacy BPF packet access (indirect) `Legacy BPF Packet access instructions`_
|
||||
BPF_MEM 0x60 regular load and store operations `Regular load and store operations`_
|
||||
BPF_ATOMIC 0xc0 atomic operations `Atomic operations`_
|
||||
============= ===== ==================================== =============
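To show how the three opcode fields combine, here is a short Python sketch that splits a load or store opcode into mode, size and instruction class. The mode values are those of the table above. The size values other than ``BPF_DW`` and the class values are taken from the wider instruction set and are assumptions as far as this section is concerned. The function name ``decode_ldst_opcode`` is hypothetical.

```python
# Mode modifier (3 most significant bits), from the table above.
MODES = {0x00: "BPF_IMM", 0x20: "BPF_ABS", 0x40: "BPF_IND",
         0x60: "BPF_MEM", 0xc0: "BPF_ATOMIC"}
# Size modifier (2 bits); only BPF_DW appears in this section.
SIZES = {0x00: "BPF_W", 0x08: "BPF_H", 0x10: "BPF_B", 0x18: "BPF_DW"}
# Instruction class (3 least significant bits), load/store classes only.
CLASSES = {0x00: "BPF_LD", 0x01: "BPF_LDX", 0x02: "BPF_ST", 0x03: "BPF_STX"}


def decode_ldst_opcode(opcode):
    """Split an 8-bit load/store opcode into (mode, size, class) names."""
    mode = opcode & 0xe0   # 3 bits (MSB)
    size = opcode & 0x18   # 2 bits
    cls = opcode & 0x07    # 3 bits (LSB)
    if cls not in CLASSES:
        raise ValueError("not a load/store class: %#x" % cls)
    if mode not in MODES:
        raise ValueError("unknown mode modifier: %#x" % mode)
    return MODES[mode], SIZES[size], CLASSES[cls]
```

For example, ``decode_ldst_opcode(0xdb)`` yields the ``BPF_ATOMIC | BPF_DW | BPF_STX`` combination used for 64-bit atomic operations.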
|
||||
|
||||
The size modifier is one of:
|
||||
|
||||
@ -213,19 +226,6 @@ The size modifier is one of:
|
||||
BPF_DW 0x18 double word (8 bytes)
|
||||
============= ===== =====================
|
||||
|
||||
The mode modifier is one of:
|
||||
|
||||
============= ===== ====================================
|
||||
mode modifier value description
|
||||
============= ===== ====================================
|
||||
BPF_IMM 0x00 64-bit immediate instructions
|
||||
BPF_ABS 0x20 legacy BPF packet access (absolute)
|
||||
BPF_IND 0x40 legacy BPF packet access (indirect)
|
||||
BPF_MEM 0x60 regular load and store operations
|
||||
BPF_ATOMIC 0xc0 atomic operations
|
||||
============= ===== ====================================
|
||||
|
||||
|
||||
Regular load and store operations
|
||||
---------------------------------
|
||||
|
||||
@ -256,44 +256,42 @@ by other eBPF programs or means outside of this specification.
|
||||
All atomic operations supported by eBPF are encoded as store operations
|
||||
that use the ``BPF_ATOMIC`` mode modifier as follows:
|
||||
|
||||
* ``BPF_ATOMIC | BPF_W | BPF_STX`` for 32-bit operations
|
||||
* ``BPF_ATOMIC | BPF_DW | BPF_STX`` for 64-bit operations
|
||||
* 8-bit and 16-bit wide atomic operations are not supported.
|
||||
* ``BPF_ATOMIC | BPF_W | BPF_STX`` for 32-bit operations
|
||||
* ``BPF_ATOMIC | BPF_DW | BPF_STX`` for 64-bit operations
|
||||
* 8-bit and 16-bit wide atomic operations are not supported.
|
||||
|
||||
The imm field is used to encode the actual atomic operation.
|
||||
The 'imm' field is used to encode the actual atomic operation.
|
||||
Simple atomic operations use a subset of the values defined to encode
|
||||
arithmetic operations in the imm field to encode the atomic operation:
|
||||
arithmetic operations in the 'imm' field to encode the atomic operation:
|
||||
|
||||
======== ===== ===========
|
||||
imm value description
|
||||
======== ===== ===========
|
||||
BPF_ADD 0x00 atomic add
|
||||
BPF_OR 0x40 atomic or
|
||||
BPF_AND 0x50 atomic and
|
||||
BPF_XOR 0xa0 atomic xor
|
||||
======== ===== ===========
|
||||
======== ===== ===========
|
||||
imm value description
|
||||
======== ===== ===========
|
||||
BPF_ADD 0x00 atomic add
|
||||
BPF_OR 0x40 atomic or
|
||||
BPF_AND 0x50 atomic and
|
||||
BPF_XOR 0xa0 atomic xor
|
||||
======== ===== ===========
|
||||
|
||||
|
||||
``BPF_ATOMIC | BPF_W | BPF_STX`` with imm = BPF_ADD means::
|
||||
``BPF_ATOMIC | BPF_W | BPF_STX`` with 'imm' = BPF_ADD means::
|
||||
|
||||
*(u32 *)(dst_reg + off16) += src_reg
|
||||
|
||||
``BPF_ATOMIC | BPF_DW | BPF_STX`` with imm = BPF_ADD means::
|
||||
``BPF_ATOMIC | BPF_DW | BPF_STX`` with 'imm' = BPF_ADD means::
|
||||
|
||||
*(u64 *)(dst_reg + off16) += src_reg
|
||||
|
||||
``BPF_XADD`` is a deprecated name for ``BPF_ATOMIC | BPF_ADD``.
|
||||
|
||||
In addition to the simple atomic operations, there also is a modifier and
|
||||
two complex atomic operations:
|
||||
|
||||
=========== ================ ===========================
|
||||
imm value description
|
||||
=========== ================ ===========================
|
||||
BPF_FETCH 0x01 modifier: return old value
|
||||
BPF_XCHG 0xe0 | BPF_FETCH atomic exchange
|
||||
BPF_CMPXCHG 0xf0 | BPF_FETCH atomic compare and exchange
|
||||
=========== ================ ===========================
|
||||
=========== ================ ===========================
|
||||
imm value description
|
||||
=========== ================ ===========================
|
||||
BPF_FETCH 0x01 modifier: return old value
|
||||
BPF_XCHG 0xe0 | BPF_FETCH atomic exchange
|
||||
BPF_CMPXCHG 0xf0 | BPF_FETCH atomic compare and exchange
|
||||
=========== ================ ===========================
|
||||
|
||||
The ``BPF_FETCH`` modifier is optional for simple atomic operations, and
|
||||
always set for the complex atomic operations. If the ``BPF_FETCH`` flag
|
||||
@ -309,16 +307,10 @@ The ``BPF_CMPXCHG`` operation atomically compares the value addressed by
|
||||
value that was at ``dst_reg + off`` before the operation is zero-extended
|
||||
and loaded back to ``R0``.
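The interaction between the simple operations, the ``BPF_FETCH`` modifier and the two complex operations can be sketched in Python as follows. This is a single-threaded model, so it illustrates only the data flow, not atomicity. Memory is modelled as a dictionary, and the function name ``atomic_op`` and its return convention (the new ``src_reg`` and ``R0`` values) are hypothetical.

```python
# 'imm' values from the tables above.
BPF_ADD, BPF_OR, BPF_AND, BPF_XOR = 0x00, 0x40, 0x50, 0xa0
BPF_FETCH = 0x01
BPF_XCHG = 0xe0 | BPF_FETCH
BPF_CMPXCHG = 0xf0 | BPF_FETCH


def atomic_op(mem, addr, imm, src, r0=0, bits=64):
    """Apply one atomic operation to mem[addr]; return (src_reg, R0).

    `bits` is 32 for BPF_W and 64 for BPF_DW.
    """
    mask = (1 << bits) - 1
    old = mem[addr] & mask
    if imm == BPF_CMPXCHG:
        # Compare with R0; store src_reg only on a match.
        if old == (r0 & mask):
            mem[addr] = src & mask
        return src, old      # the old value always lands in R0
    if imm == BPF_XCHG:
        mem[addr] = src & mask
        return old, r0       # the old value lands in src_reg
    op = imm & ~BPF_FETCH
    if op == BPF_ADD:
        new = old + src
    elif op == BPF_OR:
        new = old | src
    elif op == BPF_AND:
        new = old & src
    elif op == BPF_XOR:
        new = old ^ src
    else:
        raise ValueError("unknown atomic operation: %#x" % imm)
    mem[addr] = new & mask
    # With BPF_FETCH, src_reg is overwritten with the pre-operation value.
    return (old if imm & BPF_FETCH else src), r0
```

A plain ``BPF_ADD`` leaves ``src_reg`` untouched, while ``BPF_ADD | BPF_FETCH`` additionally hands back the value that was in memory before the addition.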
|
||||
|
||||
Clang can generate atomic instructions by default when ``-mcpu=v3`` is
|
||||
enabled. If a lower version for ``-mcpu`` is set, the only atomic instruction
|
||||
Clang can generate is ``BPF_ADD`` *without* ``BPF_FETCH``. If you need to enable
|
||||
the atomics features, while keeping a lower ``-mcpu`` version, you can use
|
||||
``-Xclang -target-feature -Xclang +alu32``.
|
||||
|
||||
64-bit immediate instructions
|
||||
-----------------------------
|
||||
|
||||
Instructions with the ``BPF_IMM`` mode modifier use the wide instruction
|
||||
Instructions with the ``BPF_IMM`` 'mode' modifier use the wide instruction
|
||||
encoding for an extra imm64 value.
|
||||
|
||||
There is currently only one such instruction.
|
||||
@ -331,36 +323,6 @@ There is currently only one such instruction.
|
||||
Legacy BPF Packet access instructions
|
||||
-------------------------------------
|
||||
|
||||
eBPF has special instructions for access to packet data that have been
|
||||
carried over from classic BPF to retain the performance of legacy socket
|
||||
filters running in the eBPF interpreter.
|
||||
|
||||
The instructions come in two forms: ``BPF_ABS | <size> | BPF_LD`` and
|
||||
``BPF_IND | <size> | BPF_LD``.
|
||||
|
||||
These instructions are used to access packet data and can only be used when
|
||||
the program context is a pointer to networking packet. ``BPF_ABS``
|
||||
accesses packet data at an absolute offset specified by the immediate data
|
||||
and ``BPF_IND`` accesses packet data at an offset that includes the value of
|
||||
a register in addition to the immediate data.
|
||||
|
||||
These instructions have seven implicit operands:
|
||||
|
||||
* Register R6 is an implicit input that must contain pointer to a
|
||||
struct sk_buff.
|
||||
* Register R0 is an implicit output which contains the data fetched from
|
||||
the packet.
|
||||
* Registers R1-R5 are scratch registers that are clobbered after a call to
|
||||
``BPF_ABS | BPF_LD`` or ``BPF_IND | BPF_LD`` instructions.
|
||||
|
||||
These instructions have an implicit program exit condition as well. When an
|
||||
eBPF program is trying to access the data beyond the packet boundary, the
|
||||
program execution will be aborted.
|
||||
|
||||
``BPF_ABS | BPF_W | BPF_LD`` means::
|
||||
|
||||
R0 = ntohl(*(u32 *) (((struct sk_buff *) R6)->data + imm32))
|
||||
|
||||
``BPF_IND | BPF_W | BPF_LD`` means::
|
||||
|
||||
R0 = ntohl(*(u32 *) (((struct sk_buff *) R6)->data + src_reg + imm32))
|
||||
eBPF previously introduced special instructions for access to packet data that were
|
||||
carried over from classic BPF. However, these instructions are
|
||||
deprecated and should no longer be used.
|
||||
|
@ -137,14 +137,22 @@ KF_ACQUIRE and KF_RET_NULL flags.
|
||||
--------------------------
|
||||
|
||||
The KF_TRUSTED_ARGS flag is used for kfuncs taking pointer arguments. It
|
||||
indicates that all pointer arguments will always be refcounted, and have
|
||||
their offset set to 0. It can be used to enforce that a pointer to a refcounted
|
||||
object acquired from a kfunc or BPF helper is passed as an argument to this
|
||||
kfunc without any modifications (e.g. pointer arithmetic) such that it is
|
||||
trusted and points to the original object. This flag is often used for kfuncs
|
||||
that operate (change some property, perform some operation) on an object that
|
||||
was obtained using an acquire kfunc. Such kfuncs need an unchanged pointer to
|
||||
ensure the integrity of the operation being performed on the expected object.
|
||||
indicates that all pointer arguments will always have a guaranteed lifetime,
|
||||
and pointers to kernel objects are always passed to helpers in their unmodified
|
||||
form (as obtained from acquire kfuncs).
|
||||
|
||||
It can be used to enforce that a pointer to a refcounted object acquired from a
|
||||
kfunc or BPF helper is passed as an argument to this kfunc without any
|
||||
modifications (e.g. pointer arithmetic) such that it is trusted and points to
|
||||
the original object.
|
||||
|
||||
Meanwhile, it is also allowed to pass pointers to normal memory to such kfuncs,
|
||||
but those can have a non-zero offset.
|
||||
|
||||
This flag is often used for kfuncs that operate (change some property, perform
|
||||
some operation) on an object that was obtained using an acquire kfunc. Such
|
||||
kfuncs need an unchanged pointer to ensure the integrity of the operation being
|
||||
performed on the expected object.
|
||||
|
||||
2.4.6 KF_SLEEPABLE flag
|
||||
-----------------------
|
||||
|
53
Documentation/bpf/linux-notes.rst
Normal file
@ -0,0 +1,53 @@
|
||||
.. contents::
|
||||
.. sectnum::
|
||||
|
||||
==========================
|
||||
Linux implementation notes
|
||||
==========================
|
||||
|
||||
This document provides more details specific to the Linux kernel implementation of the eBPF instruction set.
|
||||
|
||||
Byte swap instructions
|
||||
======================
|
||||
|
||||
``BPF_FROM_LE`` and ``BPF_FROM_BE`` exist as aliases for ``BPF_TO_LE`` and ``BPF_TO_BE`` respectively.
|
||||
|
||||
Legacy BPF Packet access instructions
|
||||
=====================================
|
||||
|
||||
As mentioned in the `ISA standard documentation <instruction-set.rst#legacy-bpf-packet-access-instructions>`_,
|
||||
Linux has special eBPF instructions for access to packet data that have been
|
||||
carried over from classic BPF to retain the performance of legacy socket
|
||||
filters running in the eBPF interpreter.
|
||||
|
||||
The instructions come in two forms: ``BPF_ABS | <size> | BPF_LD`` and
|
||||
``BPF_IND | <size> | BPF_LD``.
|
||||
|
||||
These instructions are used to access packet data and can only be used when
|
||||
the program context is a pointer to a networking packet. ``BPF_ABS``
|
||||
accesses packet data at an absolute offset specified by the immediate data
|
||||
and ``BPF_IND`` accesses packet data at an offset that includes the value of
|
||||
a register in addition to the immediate data.
|
||||
|
||||
These instructions have seven implicit operands:
|
||||
|
||||
* Register R6 is an implicit input that must contain a pointer to a
|
||||
struct sk_buff.
|
||||
* Register R0 is an implicit output which contains the data fetched from
|
||||
the packet.
|
||||
* Registers R1-R5 are scratch registers that are clobbered by the
|
||||
instruction.
|
||||
|
||||
These instructions have an implicit program exit condition as well. If an
|
||||
eBPF program attempts to access data beyond the packet boundary, the
|
||||
program execution will be aborted.
|
||||
|
||||
``BPF_ABS | BPF_W | BPF_LD`` (0x20) means::
|
||||
|
||||
R0 = ntohl(*(u32 *) (((struct sk_buff *) R6)->data + imm))
|
||||
|
||||
where ``ntohl()`` converts a 32-bit value from network byte order to host byte order.
|
||||
|
||||
``BPF_IND | BPF_W | BPF_LD`` (0x40) means::
|
||||
|
||||
R0 = ntohl(*(u32 *) (((struct sk_buff *) R6)->data + src + imm))
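The two load forms and the implicit exit condition can be modelled with a short Python sketch. The packet is represented as a ``bytes`` object standing in for the ``data`` area of the ``struct sk_buff``. The names ``ProgramAborted``, ``ld_abs_w`` and ``ld_ind_w`` are hypothetical, and rejecting negative offsets is a simplifying assumption of the sketch.

```python
import struct


class ProgramAborted(Exception):
    """Stands in for the implicit program exit on out-of-bounds access."""


def _load_w(packet, offset):
    # Abort instead of reading beyond the packet boundary.
    if offset < 0 or offset + 4 > len(packet):
        raise ProgramAborted("access beyond packet boundary")
    # "!I" reads a 32-bit unsigned value in network byte order and
    # returns it as a host integer, which is what ntohl() achieves.
    return struct.unpack_from("!I", packet, offset)[0]


def ld_abs_w(packet, imm):
    """Model of BPF_ABS | BPF_W | BPF_LD: returns the new R0."""
    return _load_w(packet, imm)


def ld_ind_w(packet, src, imm):
    """Model of BPF_IND | BPF_W | BPF_LD: returns the new R0."""
    return _load_w(packet, src + imm)
```

Given the five-byte packet ``de ad be ef 01``, ``ld_abs_w(packet, 0)`` reads the first four bytes as one big-endian word, and ``ld_abs_w(packet, 2)`` raises ``ProgramAborted`` because only three bytes remain.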
|
@ -31,7 +31,7 @@ The map uses key of type of either ``__u64 cgroup_inode_id`` or
|
||||
};
|
||||
|
||||
``cgroup_inode_id`` is the inode id of the cgroup directory.
|
||||
``attach_type`` is the the program's attach type.
|
||||
``attach_type`` is the program's attach type.
|
||||
|
||||
Linux 5.9 added support for type ``__u64 cgroup_inode_id`` as the key type.
|
||||
When this key type is used, then all attach types of the particular cgroup and
|
||||
@ -155,7 +155,7 @@ However, the BPF program can still only associate with one map of each type
|
||||
``BPF_MAP_TYPE_CGROUP_STORAGE`` or more than one
|
||||
``BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE``.
|
||||
|
||||
In all versions, userspace may use the the attach parameters of cgroup and
|
||||
In all versions, userspace may use the attach parameters of cgroup and
|
||||
attach type pair in ``struct bpf_cgroup_storage_key`` as the key to the BPF map
|
||||
APIs to read or update the storage for a given attachment. For Linux 5.9
|
||||
attach type shared storages, only the first value in the struct, cgroup inode
|
||||
|
@ -15,6 +15,18 @@
|
||||
import sys
|
||||
import os
|
||||
import sphinx
|
||||
import shutil
|
||||
|
||||
# helper
|
||||
# ------
|
||||
|
||||
def have_command(cmd):
|
||||
"""Search ``cmd`` in the ``PATH`` environment.
|
||||
|
||||
If found, return True.
|
||||
If not found, return False.
|
||||
"""
|
||||
return shutil.which(cmd) is not None
|
||||
|
||||
# Get Sphinx version
|
||||
major, minor, patch = sphinx.version_info[:3]
|
||||
@ -107,7 +119,32 @@ else:
|
||||
autosectionlabel_prefix_document = True
|
||||
autosectionlabel_maxdepth = 2
|
||||
|
||||
extensions.append("sphinx.ext.imgmath")
|
||||
# Load math renderer:
|
||||
# For html builder, load imgmath only when its dependencies are met.
|
||||
# mathjax is the default math renderer since Sphinx 1.8.
|
||||
have_latex = have_command('latex')
|
||||
have_dvipng = have_command('dvipng')
|
||||
load_imgmath = have_latex and have_dvipng
|
||||
|
||||
# Respect SPHINX_IMGMATH (for html docs only)
|
||||
if 'SPHINX_IMGMATH' in os.environ:
|
||||
env_sphinx_imgmath = os.environ['SPHINX_IMGMATH']
|
||||
if 'yes' in env_sphinx_imgmath:
|
||||
load_imgmath = True
|
||||
elif 'no' in env_sphinx_imgmath:
|
||||
load_imgmath = False
|
||||
else:
|
||||
sys.stderr.write("Unknown env SPHINX_IMGMATH=%s ignored.\n" % env_sphinx_imgmath)
|
||||
|
||||
# Always load imgmath for Sphinx <1.8 or for epub docs
|
||||
load_imgmath = (load_imgmath or (major == 1 and minor < 8)
|
||||
or 'epub' in sys.argv)
|
||||
|
||||
if load_imgmath:
|
||||
extensions.append("sphinx.ext.imgmath")
|
||||
math_renderer = 'imgmath'
|
||||
else:
|
||||
math_renderer = 'mathjax'
|
||||
|
||||
# Add any paths that contain templates here, relative to this directory.
|
||||
templates_path = ['_templates']
|
||||
@ -333,7 +370,8 @@ html_static_path = ['sphinx-static']
|
||||
html_use_smartypants = False
|
||||
|
||||
# Custom sidebar templates, maps document names to template names.
|
||||
#html_sidebars = {}
|
||||
# Note that the RTD theme ignores this.
|
||||
html_sidebars = { '**': ['searchbox.html', 'localtoc.html', 'sourcelink.html']}
|
||||
|
||||
# Additional templates that should be rendered to pages, maps page names to
|
||||
# template names.
|
||||
|
@ -43,10 +43,11 @@ annotated objects like this, tools can be run on them to generate more useful
|
||||
information. In particular, on properly annotated objects, ``objtool`` can be
|
||||
run to check and fix the object if needed. Currently, ``objtool`` can report
|
||||
missing frame pointer setup/destruction in functions. It can also
|
||||
automatically generate annotations for :doc:`ORC unwinder <x86/orc-unwinder>`
|
||||
automatically generate annotations for the ORC unwinder
|
||||
(Documentation/x86/orc-unwinder.rst)
|
||||
for most code. Both of these are especially important to support reliable
|
||||
stack traces which are in turn necessary for :doc:`Kernel live patching
|
||||
<livepatch/livepatch>`.
|
||||
stack traces which are in turn necessary for kernel live patching
|
||||
(Documentation/livepatch/livepatch.rst).
|
||||
|
||||
Caveat and Discussion
|
||||
---------------------
|
@ -560,7 +560,7 @@ available:
|
||||
* cpuhp_state_remove_instance(state, node)
|
||||
* cpuhp_state_remove_instance_nocalls(state, node)
|
||||
|
||||
The arguments are the same as for the the cpuhp_state_add_instance*()
|
||||
The arguments are the same as for the cpuhp_state_add_instance*()
|
||||
variants above.
|
||||
|
||||
The functions differ in how the installed callbacks are treated:
|
||||
|
@ -23,6 +23,7 @@ it.
|
||||
printk-formats
|
||||
printk-index
|
||||
symbol-namespaces
|
||||
asm-annotations
|
||||
|
||||
Data structures and low-level utilities
|
||||
=======================================
|
||||
@ -44,6 +45,8 @@ Library functionality that is used throughout the kernel.
|
||||
this_cpu_ops
|
||||
timekeeping
|
||||
errseq
|
||||
wrappers/atomic_t
|
||||
wrappers/atomic_bitops
|
||||
|
||||
Low level entry and exit
|
||||
========================
|
||||
@ -67,6 +70,7 @@ Documentation/locking/index.rst for more related documentation.
|
||||
local_ops
|
||||
padata
|
||||
../RCU/index
|
||||
wrappers/memory-barriers.rst
|
||||
|
||||
Low-level hardware management
|
||||
=============================
|
||||
|
@ -71,7 +71,7 @@ variety of methods:
|
||||
Note that irq domain lookups must happen in contexts that are
|
||||
compatible with a RCU read-side critical section.
|
||||
|
||||
The irq_create_mapping() function must be called *atleast once*
|
||||
The irq_create_mapping() function must be called *at least once*
|
||||
before any call to irq_find_mapping(), lest the descriptor will not
|
||||
be allocated.
|
||||
|
||||
|
@ -625,6 +625,16 @@ Examples::
|
||||
%p4cc Y10 little-endian (0x20303159)
|
||||
%p4cc NV12 big-endian (0xb231564e)
|
||||
|
||||
Rust
|
||||
----
|
||||
|
||||
::
|
||||
|
||||
%pA
|
||||
|
||||
Only intended to be used from Rust code to format ``core::fmt::Arguments``.
|
||||
Do *not* use it from C.
|
||||
|
||||
Thanks
|
||||
======
|
||||
|
||||
|
18
Documentation/core-api/wrappers/atomic_bitops.rst
Normal file
@ -0,0 +1,18 @@
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
This is a simple wrapper to bring atomic_bitops.txt into the RST world
|
||||
until such a time as that file can be converted directly.
|
||||
|
||||
=============
|
||||
Atomic bitops
|
||||
=============
|
||||
|
||||
.. raw:: latex
|
||||
|
||||
\footnotesize
|
||||
|
||||
.. include:: ../../atomic_bitops.txt
|
||||
:literal:
|
||||
|
||||
.. raw:: latex
|
||||
|
||||
\normalsize
|
19
Documentation/core-api/wrappers/atomic_t.rst
Normal file
@ -0,0 +1,19 @@
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
This is a simple wrapper to bring atomic_t.txt into the RST world
|
||||
until such a time as that file can be converted directly.
|
||||
|
||||
============
|
||||
Atomic types
|
||||
============
|
||||
|
||||
.. raw:: latex
|
||||
|
||||
\footnotesize
|
||||
|
||||
.. include:: ../../atomic_t.txt
|
||||
:literal:
|
||||
|
||||
.. raw:: latex
|
||||
|
||||
\normalsize
|
||||
|
18
Documentation/core-api/wrappers/memory-barriers.rst
Normal file
@ -0,0 +1,18 @@
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
This is a simple wrapper to bring memory-barriers.txt into the RST world
|
||||
until such a time as that file can be converted directly.
|
||||
|
||||
============================
|
||||
Linux kernel memory barriers
|
||||
============================
|
||||
|
||||
.. raw:: latex
|
||||
|
||||
\footnotesize
|
||||
|
||||
.. include:: ../../memory-barriers.txt
|
||||
:literal:
|
||||
|
||||
.. raw:: latex
|
||||
|
||||
\normalsize
|
@ -57,6 +57,11 @@ properties:
|
||||
- description: interrupt ID for I2C event
|
||||
- description: interrupt ID for I2C error
|
||||
|
||||
interrupt-names:
|
||||
items:
|
||||
- const: event
|
||||
- const: error
|
||||
|
||||
resets:
|
||||
maxItems: 1
|
||||
|
||||
@ -92,6 +97,8 @@ properties:
|
||||
- description: register offset within syscfg
|
||||
- description: register bitmask for FMP bit
|
||||
|
||||
wakeup-source: true
|
||||
|
||||
required:
|
||||
- compatible
|
||||
- reg
|
||||
|
@ -144,6 +144,12 @@ properties:
|
||||
Mark the corresponding energy efficient ethernet mode as
|
||||
broken and request the ethernet to stop advertising it.
|
||||
|
||||
pses:
|
||||
$ref: /schemas/types.yaml#/definitions/phandle-array
|
||||
maxItems: 1
|
||||
description:
|
||||
Specifies a reference to a node representing a Power Sourcing Equipment.
|
||||
|
||||
phy-is-integrated:
|
||||
$ref: /schemas/types.yaml#/definitions/flag
|
||||
description:
|
||||
|
@ -128,7 +128,7 @@ examples:
|
||||
|
||||
i2c-int-rising;
|
||||
|
||||
reset-n-io = <&gpio3 19 GPIO_ACTIVE_HIGH>;
|
||||
reset-n-io = <&gpio3 19 GPIO_ACTIVE_LOW>;
|
||||
};
|
||||
};
|
||||
|
||||
@ -151,7 +151,7 @@ examples:
|
||||
interrupt-parent = <&gpio1>;
|
||||
interrupts = <17 IRQ_TYPE_EDGE_RISING>;
|
||||
|
||||
reset-n-io = <&gpio3 19 GPIO_ACTIVE_HIGH>;
|
||||
reset-n-io = <&gpio3 19 GPIO_ACTIVE_LOW>;
|
||||
};
|
||||
};
|
||||
|
||||
@ -162,7 +162,7 @@ examples:
|
||||
nfc {
|
||||
compatible = "marvell,nfc-uart";
|
||||
|
||||
reset-n-io = <&gpio3 16 GPIO_ACTIVE_HIGH>;
|
||||
reset-n-io = <&gpio3 16 GPIO_ACTIVE_LOW>;
|
||||
|
||||
hci-muxed;
|
||||
flow-control;
|
||||
|
@ -0,0 +1,40 @@
|
||||
# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
|
||||
%YAML 1.2
|
||||
---
|
||||
$id: http://devicetree.org/schemas/net/pse-pd/podl-pse-regulator.yaml#
|
||||
$schema: http://devicetree.org/meta-schemas/core.yaml#
|
||||
|
||||
title: Regulator based Power Sourcing Equipment
|
||||
|
||||
maintainers:
|
||||
- Oleksij Rempel <o.rempel@pengutronix.de>
|
||||
|
||||
description: Regulator based PoDL PSE controller. The device must be referenced
|
||||
by the PHY node to control power injection to the Ethernet cable.
|
||||
|
||||
allOf:
|
||||
- $ref: "pse-controller.yaml#"
|
||||
|
||||
properties:
|
||||
compatible:
|
||||
const: podl-pse-regulator
|
||||
|
||||
'#pse-cells':
|
||||
const: 0
|
||||
|
||||
pse-supply:
|
||||
description: Power supply for the PSE controller
|
||||
|
||||
additionalProperties: false
|
||||
|
||||
required:
|
||||
- compatible
|
||||
- pse-supply
|
||||
|
||||
examples:
|
||||
- |
|
||||
ethernet-pse {
|
||||
compatible = "podl-pse-regulator";
|
||||
pse-supply = <&reg_t1l1>;
|
||||
#pse-cells = <0>;
|
||||
};
|
@ -0,0 +1,33 @@
|
||||
# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
|
||||
%YAML 1.2
|
||||
---
|
||||
$id: http://devicetree.org/schemas/net/pse-pd/pse-controller.yaml#
|
||||
$schema: http://devicetree.org/meta-schemas/core.yaml#
|
||||
|
||||
title: Power Sourcing Equipment (PSE).
|
||||
|
||||
description: Binding for the Power Sourcing Equipment (PSE) as defined in the
|
||||
IEEE 802.3 specification. It is designed for hardware which is delivering
|
||||
power over twisted pair/ethernet cable. The ethernet-pse nodes should be
|
||||
used to describe PSE controller and referenced by the ethernet-phy node.
|
||||
|
||||
maintainers:
|
||||
- Oleksij Rempel <o.rempel@pengutronix.de>
|
||||
|
||||
properties:
|
||||
$nodename:
|
||||
pattern: "^ethernet-pse(@.*)?$"
|
||||
|
||||
"#pse-cells":
|
||||
description:
|
||||
Used to uniquely identify a PSE instance within an IC. Will be
|
||||
0 on PSE nodes with only a single output and at least 1 on nodes
|
||||
controlling several outputs.
|
||||
enum: [0, 1]
|
||||
|
||||
required:
|
||||
- "#pse-cells"
|
||||
|
||||
additionalProperties: true
|
||||
|
||||
...
|
@ -14,6 +14,9 @@ when it is embedded in source files.
|
||||
reasons. The kernel source contains tens of thousands of kernel-doc
|
||||
comments. Please stick to the style described here.
|
||||
|
||||
.. note:: kernel-doc does not cover Rust code: please see
|
||||
Documentation/rust/general-information.rst instead.
|
||||
|
||||
The kernel-doc structure is extracted from the comments, and proper
|
||||
`Sphinx C Domain`_ function and type descriptions with anchors are
|
||||
generated from them. The descriptions are filtered for special kernel-doc
|
||||
|
@ -48,10 +48,6 @@ or ``virtualenv``, depending on how your distribution packaged Python 3.
|
||||
on the Sphinx version, it should be installed separately,
|
||||
with ``pip install sphinx_rtd_theme``.
|
||||
|
||||
#) Some ReST pages contain math expressions. Due to the way Sphinx works,
|
||||
those expressions are written using LaTeX notation. It needs texlive
|
||||
installed with amsfonts and amsmath in order to evaluate them.
|
||||
|
||||
In summary, if you want to install Sphinx version 2.4.4, you should do::
|
||||
|
||||
$ virtualenv sphinx_2.4.4
|
||||
@ -86,6 +82,27 @@ Depending on the distribution, you may also need to install a series of
|
||||
``texlive`` packages that provide the minimal set of functionalities
|
||||
required for ``XeLaTeX`` to work.
|
||||
|
||||
Math Expressions in HTML
|
||||
------------------------
|
||||
|
||||
Some ReST pages contain math expressions. Due to the way Sphinx works,
|
||||
those expressions are written using LaTeX notation.
|
||||
There are two options for Sphinx to render math expressions in html output.
|
||||
One is an extension called `imgmath`_ which converts math expressions into
|
||||
images and embeds them in html pages.
|
||||
The other is an extension called `mathjax`_ which delegates math rendering
|
||||
to JavaScript capable web browsers.
|
||||
The former was the only option for pre-6.1 kernel documentation and it
|
||||
requires quite a few texlive packages including amsfonts and amsmath among
|
||||
others.
|
||||
|
||||
Since kernel release 6.1, html pages with math expressions can be built
|
||||
without installing any texlive packages. See `Choice of Math Renderer`_ for
|
||||
further info.
|
||||
|
||||
.. _imgmath: https://www.sphinx-doc.org/en/master/usage/extensions/math.html#module-sphinx.ext.imgmath
|
||||
.. _mathjax: https://www.sphinx-doc.org/en/master/usage/extensions/math.html#module-sphinx.ext.mathjax
|
||||
|
||||
.. _sphinx-pre-install:
|
||||
|
||||
Checking for Sphinx dependencies
|
||||
@ -164,6 +181,38 @@ To remove the generated documentation, run ``make cleandocs``.
|
||||
as well would improve the quality of images embedded in PDF
|
||||
documents, especially for kernel releases 5.18 and later.
|
||||
|
||||
Choice of Math Renderer
|
||||
-----------------------
|
||||
|
||||
Since kernel release 6.1, mathjax works as a fallback math renderer for
|
||||
html output.\ [#sph1_8]_
|
||||
|
||||
The math renderer is chosen depending on available commands as shown below:
|
||||
|
||||
.. table:: Math Renderer Choices for HTML
|
||||
|
||||
============= ================= ============
|
||||
Math renderer Required commands Image format
|
||||
============= ================= ============
|
||||
imgmath latex, dvipng PNG (raster)
|
||||
mathjax
|
||||
============= ================= ============
|
||||
|
||||
The choice can be overridden by setting an environment variable
|
||||
``SPHINX_IMGMATH`` as shown below:
|
||||
|
||||
.. table:: Effect of Setting ``SPHINX_IMGMATH``
|
||||
|
||||
====================== ========
|
||||
Setting Renderer
|
||||
====================== ========
|
||||
``SPHINX_IMGMATH=yes`` imgmath
|
||||
``SPHINX_IMGMATH=no`` mathjax
|
||||
====================== ========
|
||||
|
||||
.. [#sph1_8] Fallback of math renderer requires Sphinx >=1.8.
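The selection rules in the two tables above can be summarised as a small pure function. This is a sketch that mirrors the logic added to ``conf.py`` in this change. The function name ``choose_math_renderer`` and its parameters are hypothetical, introduced only so that the decision can be exercised without a Sphinx installation.

```python
def choose_math_renderer(have_latex, have_dvipng, environ=None,
                         sphinx_version=(2, 4, 4), argv=()):
    """Return 'imgmath' or 'mathjax' following the rules described above."""
    environ = {} if environ is None else environ

    # Default: imgmath only when both required commands are available.
    load_imgmath = have_latex and have_dvipng

    # SPHINX_IMGMATH overrides the default; unknown values are ignored.
    setting = environ.get('SPHINX_IMGMATH')
    if setting is not None:
        if 'yes' in setting:
            load_imgmath = True
        elif 'no' in setting:
            load_imgmath = False

    # imgmath is always used for Sphinx < 1.8 and for epub builds.
    major, minor = sphinx_version[:2]
    load_imgmath = (load_imgmath or (major == 1 and minor < 8)
                    or 'epub' in argv)

    return 'imgmath' if load_imgmath else 'mathjax'
```

For instance, a machine without ``dvipng`` falls back to mathjax unless ``SPHINX_IMGMATH=yes`` is set, and an epub build selects imgmath regardless of the setting.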
|
||||
|
||||
|
||||
Writing Documentation
|
||||
=====================
|
||||
|
||||
|
@ -301,6 +301,7 @@ IO region
|
||||
devm_release_region()
|
||||
devm_release_resource()
|
||||
devm_request_mem_region()
|
||||
devm_request_free_mem_region()
|
||||
devm_request_region()
|
||||
devm_request_resource()
|
||||
|
||||
@ -334,7 +335,7 @@ IRQ
|
||||
devm_irq_alloc_descs_from()
|
||||
devm_irq_alloc_generic_chip()
|
||||
devm_irq_setup_generic_chip()
|
||||
devm_irq_sim_init()
|
||||
devm_irq_domain_create_sim()
|
||||
|
||||
LED
|
||||
devm_led_classdev_register()
|
||||
@ -392,7 +393,9 @@ PHY
|
||||
PINCTRL
|
||||
devm_pinctrl_get()
|
||||
devm_pinctrl_put()
|
||||
devm_pinctrl_get_select()
|
||||
devm_pinctrl_register()
|
||||
devm_pinctrl_register_and_init()
|
||||
devm_pinctrl_unregister()
|
||||
|
||||
POWER
|
||||
@ -427,6 +430,8 @@ SLAVE DMA ENGINE
|
||||
devm_acpi_dma_controller_register()
|
||||
|
||||
SPI
|
||||
devm_spi_alloc_master()
|
||||
devm_spi_alloc_slave()
|
||||
devm_spi_register_master()
|
||||
|
||||
WATCHDOG
|
||||
|
@ -100,7 +100,7 @@ I believe platform_data is available for this, but if rather not, moving
|
||||
the isa_driver pointer to the private struct isa_dev is of course fine as
|
||||
well.
|
||||
|
||||
Then, if the the driver did not provide a .match, it matches. If it did,
|
||||
Then, if the driver did not provide a .match, it matches. If it did,
|
||||
the driver match() method is called to determine a match.
|
||||
|
||||
If it did **not** match, dev->platform_data is reset to indicate this to
|
||||
|
@ -86,17 +86,24 @@ Module Options
|
||||
Special configuration for udlfb is usually unnecessary. There are a few
|
||||
options, however.
|
||||
|
||||
From the command line, pass options to modprobe
|
||||
modprobe udlfb fb_defio=0 console=1 shadow=1
|
||||
From the command line, pass options to modprobe::
|
||||
|
||||
Or modify options on the fly at /sys/module/udlfb/parameters directory via
|
||||
sudo nano fb_defio
|
||||
change the parameter in place, and save the file.
|
||||
modprobe udlfb fb_defio=0 console=1 shadow=1
|
||||
|
||||
Unplug/replug USB device to apply with new settings
|
||||
Or change options on the fly by editing
|
||||
/sys/module/udlfb/parameters/PARAMETER_NAME ::
|
||||
|
||||
Or for permanent option, create file like /etc/modprobe.d/udlfb.conf with text
|
||||
options udlfb fb_defio=0 console=1 shadow=1
|
||||
cd /sys/module/udlfb/parameters
|
||||
ls # to see a list of parameter names
|
||||
sudo nano PARAMETER_NAME
|
||||
# change the parameter in place, and save the file.
|
||||
|
||||
Unplug/replug USB device to apply with new settings.
|
||||
|
||||
Or to apply options permanently, create a modprobe configuration file
|
||||
like /etc/modprobe.d/udlfb.conf with text::
|
||||
|
||||
options udlfb fb_defio=0 console=1 shadow=1
|
||||
|
||||
Accepted boolean options:
|
||||
|
||||
|
@ -122,7 +122,7 @@ volumes, calling::
|
||||
to tell fscache that a volume has been withdrawn. This waits for all
|
||||
outstanding accesses on the volume to complete before returning.
|
||||
|
||||
When the the cache is completely withdrawn, fscache should be notified by
|
||||
When the cache is completely withdrawn, fscache should be notified by
|
||||
calling::
|
||||
|
||||
void fscache_relinquish_cache(struct fscache_cache *cache);
|
||||
|
@ -456,15 +456,15 @@ The ext4 superblock is laid out as follows in
|
||||
* - 0x277
|
||||
- __u8
|
||||
- s_lastcheck_hi
|
||||
- Upper 8 bits of the s_lastcheck_hi field.
|
||||
- Upper 8 bits of the s_lastcheck field.
|
||||
* - 0x278
|
||||
- __u8
|
||||
- s_first_error_time_hi
|
||||
- Upper 8 bits of the s_first_error_time_hi field.
|
||||
- Upper 8 bits of the s_first_error_time field.
|
||||
* - 0x279
|
||||
- __u8
|
||||
- s_last_error_time_hi
|
||||
- Upper 8 bits of the s_last_error_time_hi field.
|
||||
- Upper 8 bits of the s_last_error_time field.
|
||||
* - 0x27A
|
||||
- __u8
|
||||
- s_pad[2]
|
||||
|
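As a rough illustration of how the ``_hi`` bytes listed above extend their 32-bit counterparts, the sketch below joins a low word and its high byte into one 40-bit value and splits it again. The helper names ``combine_hi_lo()`` and ``split_hi_lo()`` are hypothetical, and the sketch assumes both fields have already been converted from their on-disk little-endian form to host byte order.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: join a 32-bit field (e.g. s_lastcheck) with its
 * 8-bit "_hi" extension (e.g. s_lastcheck_hi) into one 40-bit value.
 * Both inputs are assumed to be in host byte order already.
 */
static uint64_t combine_hi_lo(uint32_t lo, uint8_t hi)
{
	return ((uint64_t)hi << 32) | lo;
}

/* Hypothetical inverse: split a value back into the two stored fields. */
static void split_hi_lo(uint64_t value, uint32_t *lo, uint8_t *hi)
{
	assert(value < (1ULL << 40));	/* only 40 bits can be stored */
	*lo = (uint32_t)value;		/* lower 32 bits */
	*hi = (uint8_t)(value >> 32);	/* upper 8 bits */
}
```

With the high byte left at zero the combined value equals the legacy 32-bit field, so older images remain readable under this scheme.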
@ -286,9 +286,8 @@ compress_algorithm=%s:%d Control compress algorithm and its compress level, now,
|
||||
algorithm level range
|
||||
lz4 3 - 16
|
||||
zstd 1 - 22
|
||||
compress_log_size=%u Support configuring compress cluster size, the size will
|
||||
be 4KB * (1 << %u), 16KB is minimum size, also it's
|
||||
default size.
|
||||
compress_log_size=%u Support configuring compress cluster size. The size will
|
||||
be 4KB * (1 << %u). The default and minimum sizes are 16KB.
|
||||
compress_extension=%s Support adding specified extension, so that f2fs can enable
|
||||
compression on those corresponding files, e.g. if all files
|
||||
with '.ext' have a high compression rate, we can set the '.ext'
|
||||
|
@ -661,7 +661,7 @@ idmappings::
|
||||
mount idmapping: u0:k10000:r10000
|
||||
|
||||
Assume a file owned by ``u1000`` is read from disk. The filesystem maps this id
|
||||
to ``k21000`` according to it's idmapping. This is what is stored in the
|
||||
to ``k21000`` according to its idmapping. This is what is stored in the
|
||||
inode's ``i_uid`` and ``i_gid`` fields.
|
||||
|
||||
When the caller queries the ownership of this file via ``stat()`` the kernel
|
||||
|
@ -176,7 +176,7 @@ Then userspace.
|
||||
The requirement for a static, fixed preallocated system area comes from how
|
||||
qnx6fs deals with writes.
|
||||
|
||||
Each superblock got it's own half of the system area. So superblock #1
|
||||
Each superblock got its own half of the system area. So superblock #1
|
||||
always uses blocks from the lower half while superblock #2 just writes to
|
||||
blocks represented by the upper half bitmap system area bits.
|
||||
|
||||
|
@ -227,7 +227,7 @@ Files
|
||||
from the data buffer, updating the value of the specified signal
|
||||
notification register. The signal notification register will
|
||||
either be replaced with the input data or will be updated to the
|
||||
bitwise OR or the old value and the input data, depending on the
|
||||
bitwise OR of the old value and the input data, depending on the
|
||||
contents of the signal1_type, or signal2_type respectively,
|
||||
file.
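The two update behaviours can be sketched as a pure function. This is a model of the described semantics only, not spufs code: ``signal_write()`` and the ``enum signal_mode`` names are hypothetical, and the encoding of the mode (0 for overwrite, 1 for logical OR) is an assumption about the contents of the ``signalN_type`` file.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed encoding of the signal1_type / signal2_type contents. */
enum signal_mode {
	SIGNAL_OVERWRITE = 0,	/* register is replaced by the input data */
	SIGNAL_OR = 1		/* input data is ORed into the old value */
};

/* Return the new register value after writing 'data' in the given mode. */
static uint32_t signal_write(uint32_t old, uint32_t data, enum signal_mode mode)
{
	assert(mode == SIGNAL_OVERWRITE || mode == SIGNAL_OR);
	return mode == SIGNAL_OR ? (old | data) : data;
}
```

In OR mode, bits from successive writes accumulate until the register is read, whereas in overwrite mode only the most recent write survives.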
|
||||
|
||||
|
@ -100,7 +100,7 @@ transactions together::
|
||||
|
||||
ntp = xfs_trans_dup(tp);
|
||||
xfs_trans_commit(tp);
|
||||
xfs_log_reserve(ntp);
|
||||
xfs_trans_reserve(ntp);
|
||||
|
||||
This results in a series of "rolling transactions" where the inode is locked
|
||||
across the entire chain of transactions. Hence while this series of rolling
|
||||
@ -191,7 +191,7 @@ transaction rolling mechanism to re-reserve space on every transaction roll. We
|
||||
know from the implementation of the permanent transactions how many transaction
|
||||
rolls are likely for the common modifications that need to be made.
|
||||
|
||||
For example, and inode allocation is typically two transactions - one to
|
||||
For example, an inode allocation is typically two transactions - one to
|
||||
physically allocate a free inode chunk on disk, and another to allocate an inode
|
||||
from an inode chunk that has free inodes in it. Hence for an inode allocation
|
||||
transaction, we might set the reservation log count to a value of 2 to indicate
|
||||
@ -200,7 +200,7 @@ chain. Each time a permanent transaction rolls, it consumes an entire unit
|
||||
reservation.
|
||||
|
||||
Hence when the permanent transaction is first allocated, the log space
|
||||
reservation is increases from a single unit reservation to multiple unit
|
||||
reservation is increased from a single unit reservation to multiple unit
|
||||
reservations. That multiple is defined by the reservation log count, and this
|
||||
means we can roll the transaction multiple times before we have to re-reserve
|
||||
log space when we roll the transaction. This ensures that the common
|
||||
@ -259,7 +259,7 @@ the next transaction in the sequeunce, but we have none remaining. We cannot
|
||||
sleep during the transaction commit process waiting for new log space to become
|
||||
available, as we may end up on the end of the FIFO queue and the items we have
|
||||
locked while we sleep could end up pinning the tail of the log before there is
|
||||
enough free space in the log to fulfil all of the pending reservations and
|
||||
enough free space in the log to fulfill all of the pending reservations and
|
||||
then wake up transaction commit in progress.
|
||||
|
||||
To take a new reservation without sleeping requires us to be able to take a
|
||||
@ -551,14 +551,14 @@ Essentially, this shows that an item that is in the AIL can still be modified
|
||||
and relogged, so any tracking must be separate to the AIL infrastructure. As
|
||||
such, we cannot reuse the AIL list pointers for tracking committed items, nor
|
||||
can we store state in any field that is protected by the AIL lock. Hence the
|
||||
committed item tracking needs it's own locks, lists and state fields in the log
|
||||
committed item tracking needs its own locks, lists and state fields in the log
|
||||
item.
|
||||
|
||||
Similar to the AIL, tracking of committed items is done through a new list
|
||||
called the Committed Item List (CIL). The list tracks log items that have been
|
||||
committed and have formatted memory buffers attached to them. It tracks objects
|
||||
in transaction commit order, so when an object is relogged it is removed from
|
||||
it's place in the list and re-inserted at the tail. This is entirely arbitrary
|
||||
its place in the list and re-inserted at the tail. This is entirely arbitrary
|
||||
and done to make it easy for debugging - the last items in the list are the
|
||||
ones that are most recently modified. Ordering of the CIL is not necessary for
|
||||
transactional integrity (as discussed in the next section) so the ordering is
|
||||
@ -615,7 +615,7 @@ those changes into the current checkpoint context. We then initialise a new
|
||||
context and attach that to the CIL for aggregation of new transactions.
|
||||
|
||||
This allows us to unlock the CIL immediately after transfer of all the
|
||||
committed items and effectively allow new transactions to be issued while we
|
||||
committed items and effectively allows new transactions to be issued while we
|
||||
are formatting the checkpoint into the log. It also allows concurrent
|
||||
checkpoints to be written into the log buffers in the case of log force heavy
|
||||
workloads, just like the existing transaction commit code does. This, however,
|
||||
@ -884,9 +884,9 @@ pin the object the first time it is inserted into the CIL - if it is already in
|
||||
the CIL during a transaction commit, then we do not pin it again. Because there
|
||||
can be multiple outstanding checkpoint contexts, we can still see elevated pin
|
||||
counts, but as each checkpoint completes the pin count will retain the correct
|
||||
value according to it's context.
|
||||
value according to its context.
|
||||
|
||||
Just to make matters more slightly more complex, this checkpoint level context
|
||||
Just to make matters slightly more complex, this checkpoint level context
|
||||
for the pin count means that the pinning of an item must take place under the
|
||||
CIL commit/flush lock. If we pin the object outside this lock, we cannot
|
||||
guarantee which context the pin count is associated with. This is because of
|
||||
|
@ -21,7 +21,7 @@ possible we decided to do following:
|
||||
- Devices behind real busses where there is a connector resource
|
||||
are represented as struct spi_device or struct i2c_device. Note
|
||||
that standard UARTs are not busses so there is no struct uart_device,
|
||||
although some of them may be represented by sturct serdev_device.
|
||||
although some of them may be represented by struct serdev_device.
|
||||
|
||||
As both ACPI and Device Tree represent a tree of devices (and their
|
||||
resources) this implementation follows the Device Tree way as much as
|
||||
@ -205,7 +205,7 @@ Here is what the ACPI namespace for a SPI slave might look like::
|
||||
}
|
||||
...
|
||||
|
||||
The SPI device drivers only need to add ACPI IDs in a similar way than with
|
||||
The SPI device drivers only need to add ACPI IDs in a similar way to
|
||||
the platform device drivers. Below is an example where we add ACPI support
|
||||
to at25 SPI eeprom driver (this is meant for the above ACPI snippet)::
|
||||
|
||||
@ -362,7 +362,7 @@ These GPIO numbers are controller relative and path "\\_SB.PCI0.GPI0"
|
||||
specifies the path to the controller. In order to use these GPIOs in Linux
|
||||
we need to translate them to the corresponding Linux GPIO descriptors.
|
||||
|
||||
There is a standard GPIO API for that and is documented in
|
||||
There is a standard GPIO API for that and it is documented in
|
||||
Documentation/admin-guide/gpio/.
|
||||
|
||||
In the above example we can get the corresponding two GPIO descriptors with
|
||||
@ -538,8 +538,8 @@ information.
|
||||
PCI hierarchy representation
|
||||
============================
|
||||
|
||||
Sometimes could be useful to enumerate a PCI device, knowing its position on the
|
||||
PCI bus.
|
||||
Sometimes it could be useful to enumerate a PCI device, knowing its position on
|
||||
the PCI bus.
|
||||
|
||||
For example, some systems use PCI devices soldered directly on the mother board,
|
||||
in a fixed position (ethernet, Wi-Fi, serial ports, etc.). In these conditions it
|
||||
@ -550,7 +550,7 @@ To identify a PCI device, a complete hierarchical description is required, from
|
||||
the chipset root port to the final device, through all the intermediate
|
||||
bridges/switches of the board.
|
||||
|
||||
For example, let us assume to have a system with a PCIe serial port, an
|
||||
For example, let's assume we have a system with a PCIe serial port, an
|
||||
Exar XR17V3521, soldered on the main board. This UART chip also includes
|
||||
16 GPIOs and we want to add the property ``gpio-line-names`` [1] to these pins.
|
||||
In this case, the ``lspci`` output for this component is::
|
||||
@ -593,8 +593,8 @@ of the chipset bridge (also called "root port") with address::
|
||||
|
||||
Bus: 0 - Device: 14 - Function: 1
|
||||
|
||||
To find this information is necessary disassemble the BIOS ACPI tables, in
|
||||
particular the DSDT (see also [2])::
|
||||
To find this information, it is necessary to disassemble the BIOS ACPI tables,
|
||||
in particular the DSDT (see also [2])::
|
||||
|
||||
mkdir ~/tables/
|
||||
cd ~/tables/
|
||||
|
@ -41,26 +41,23 @@ But it is likely that they will all eventually be added.
|
||||
What should an OEM do if they want to support Linux and Windows
|
||||
using the same BIOS image? Often they need to do something different
|
||||
for Linux to deal with how Linux is different from Windows.
|
||||
Here the BIOS should ask exactly what it wants to know:
|
||||
|
||||
In this case, the OEM should create custom ASL to be executed by the
|
||||
Linux kernel and changes to Linux kernel drivers to execute this custom
|
||||
ASL. The easiest way to accomplish this is to introduce a device specific
|
||||
method (_DSM) that is called from the Linux kernel.
|
||||
|
||||
In the past the kernel used to support something like:
|
||||
_OSI("Linux-OEM-my_interface_name")
|
||||
where 'OEM' is needed if this is an OEM-specific hook,
|
||||
and 'my_interface_name' describes the hook, which could be a
|
||||
quirk, a bug, or a bug-fix.
|
||||
|
||||
In addition, the OEM should send a patch to upstream Linux
|
||||
via the linux-acpi@vger.kernel.org mailing list. When that patch
|
||||
is checked into Linux, the OS will answer "YES" when the BIOS
|
||||
on the OEM's system uses _OSI to ask if the interface is supported
|
||||
by the OS. Linux distributors can back-port that patch for Linux
|
||||
pre-installs, and it will be included by all distributions that
|
||||
re-base to upstream. If the distribution can not update the kernel binary,
|
||||
they can also add an acpi_osi=Linux-OEM-my_interface_name
|
||||
cmdline parameter to the boot loader, as needed.
|
||||
|
||||
If the string refers to a feature where the upstream kernel
|
||||
eventually grows support, a patch should be sent to remove
|
||||
the string when that support is added to the kernel.
|
||||
However this was discovered to be abused by other BIOS vendors to change
|
||||
completely unrelated code on completely unrelated systems. This prompted
|
||||
an evaluation of all of its uses. This uncovered that they aren't needed
|
||||
for any of the original reasons. As such, the kernel will not respond to
|
||||
any custom Linux-* strings by default.
|
||||
|
||||
That was easy. Read on, to find out how to do it wrong.
|
||||
|
||||
|
@ -1,11 +1,5 @@
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
|
||||
.. The Linux Kernel documentation master file, created by
|
||||
sphinx-quickstart on Fri Feb 12 13:51:46 2016.
|
||||
You can adapt this file completely to your liking, but it should at least
|
||||
contain the root `toctree` directive.
|
||||
|
||||
.. _linux_doc:
|
||||
|
||||
The Linux Kernel documentation
|
||||
@ -18,26 +12,73 @@ documents into a coherent whole. Please note that improvements to the
|
||||
documentation are welcome; join the linux-doc list at vger.kernel.org if
|
||||
you want to help out.
|
||||
|
||||
Licensing documentation
|
||||
-----------------------
|
||||
Working with the development community
|
||||
--------------------------------------
|
||||
|
||||
The following describes the license of the Linux kernel source code
|
||||
(GPLv2), how to properly mark the license of individual files in the source
|
||||
tree, as well as links to the full license text.
|
||||
The essential guides for interacting with the kernel's development
|
||||
community and getting your work upstream.
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 1
|
||||
|
||||
process/development-process
|
||||
process/submitting-patches
|
||||
Code of conduct <process/code-of-conduct>
|
||||
maintainer/index
|
||||
All development-process docs <process/index>
|
||||
|
||||
|
||||
Internal API manuals
|
||||
--------------------
|
||||
|
||||
Manuals for use by developers working to interface with the rest of the
|
||||
kernel.
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 1
|
||||
|
||||
core-api/index
|
||||
driver-api/index
|
||||
subsystem-apis
|
||||
Locking in the kernel <locking/index>
|
||||
|
||||
Development tools and processes
|
||||
-------------------------------
|
||||
|
||||
Various other manuals with useful information for all kernel developers.
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 1
|
||||
|
||||
process/license-rules
|
||||
doc-guide/index
|
||||
dev-tools/index
|
||||
dev-tools/testing-overview
|
||||
kernel-hacking/index
|
||||
trace/index
|
||||
fault-injection/index
|
||||
livepatch/index
|
||||
rust/index
|
||||
|
||||
* :ref:`kernel_licensing`
|
||||
|
||||
User-oriented documentation
|
||||
---------------------------
|
||||
|
||||
The following manuals are written for *users* of the kernel — those who are
|
||||
trying to get it to work optimally on a given system.
|
||||
trying to get it to work optimally on a given system and application
|
||||
developers seeking information on the kernel's user-space APIs.
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
:maxdepth: 1
|
||||
|
||||
admin-guide/index
|
||||
kbuild/index
|
||||
The kernel build system <kbuild/index>
|
||||
admin-guide/reporting-issues.rst
|
||||
User-space tools <tools/index>
|
||||
userspace-api/index
|
||||
|
||||
See also: the `Linux man pages <https://www.kernel.org/doc/man-pages/>`_,
|
||||
which are kept separately from the kernel's own documentation.
|
||||
|
||||
Firmware-related documentation
|
||||
------------------------------
|
||||
@ -45,106 +86,11 @@ The following holds information on the kernel's expectations regarding the
|
||||
platform firmwares.
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
:maxdepth: 1
|
||||
|
||||
firmware-guide/index
|
||||
devicetree/index
|
||||
|
||||
Application-developer documentation
|
||||
-----------------------------------
|
||||
|
||||
The user-space API manual gathers together documents describing aspects of
|
||||
the kernel interface as seen by application developers.
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
|
||||
userspace-api/index
|
||||
|
||||
|
||||
Introduction to kernel development
|
||||
----------------------------------
|
||||
|
||||
These manuals contain overall information about how to develop the kernel.
|
||||
The kernel community is quite large, with thousands of developers
|
||||
contributing over the course of a year. As with any large community,
|
||||
knowing how things are done will make the process of getting your changes
|
||||
merged much easier.
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
|
||||
process/index
|
||||
dev-tools/index
|
||||
doc-guide/index
|
||||
kernel-hacking/index
|
||||
trace/index
|
||||
maintainer/index
|
||||
fault-injection/index
|
||||
livepatch/index
|
||||
|
||||
|
||||
Kernel API documentation
|
||||
------------------------
|
||||
|
||||
These books get into the details of how specific kernel subsystems work
|
||||
from the point of view of a kernel developer. Much of the information here
|
||||
is taken directly from the kernel source, with supplemental material added
|
||||
as needed (or at least as we managed to add it — probably *not* all that is
|
||||
needed).
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
|
||||
driver-api/index
|
||||
core-api/index
|
||||
locking/index
|
||||
accounting/index
|
||||
block/index
|
||||
cdrom/index
|
||||
cpu-freq/index
|
||||
fb/index
|
||||
fpga/index
|
||||
hid/index
|
||||
i2c/index
|
||||
iio/index
|
||||
isdn/index
|
||||
infiniband/index
|
||||
leds/index
|
||||
netlabel/index
|
||||
networking/index
|
||||
pcmcia/index
|
||||
power/index
|
||||
target/index
|
||||
timers/index
|
||||
spi/index
|
||||
w1/index
|
||||
watchdog/index
|
||||
virt/index
|
||||
input/index
|
||||
hwmon/index
|
||||
gpu/index
|
||||
security/index
|
||||
sound/index
|
||||
crypto/index
|
||||
filesystems/index
|
||||
mm/index
|
||||
bpf/index
|
||||
usb/index
|
||||
PCI/index
|
||||
scsi/index
|
||||
misc-devices/index
|
||||
scheduler/index
|
||||
mhi/index
|
||||
peci/index
|
||||
|
||||
Architecture-agnostic documentation
|
||||
-----------------------------------
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
|
||||
asm-annotations
|
||||
|
||||
Architecture-specific documentation
|
||||
-----------------------------------
|
||||
@ -163,9 +109,8 @@ of the documentation body, or may require some adjustments and/or conversion
|
||||
to ReStructured Text format, or are simply too old.
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
:maxdepth: 1
|
||||
|
||||
tools/index
|
||||
staging/index
|
||||
|
||||
|
||||
|
@ -90,7 +90,11 @@ e.g., on Ubuntu for gcc-10::
|
||||
|
||||
Or on Fedora::
|
||||
|
||||
dnf install gcc-plugin-devel
|
||||
dnf install gcc-plugin-devel libmpc-devel
|
||||
|
||||
Or on Fedora when using cross-compilers that include plugins::
|
||||
|
||||
dnf install libmpc-devel
|
||||
|
||||
Enable the GCC plugin infrastructure and some plugin(s) you want to use
|
||||
in the kernel config::
|
||||
@ -99,6 +103,19 @@ in the kernel config::
|
||||
CONFIG_GCC_PLUGIN_LATENT_ENTROPY=y
|
||||
...
|
||||
|
||||
Run gcc (native or cross-compiler) to ensure plugin headers are detected::
|
||||
|
||||
gcc -print-file-name=plugin
|
||||
CROSS_COMPILE=arm-linux-gnu- ${CROSS_COMPILE}gcc -print-file-name=plugin
|
||||
|
||||
The word "plugin" means they are not detected::
|
||||
|
||||
plugin
|
||||
|
||||
A full path means they are detected::
|
||||
|
||||
/usr/lib/gcc/x86_64-redhat-linux/12/plugin
|
||||
|
||||
To compile the minimum tool set including the plugin(s)::
|
||||
|
||||
make scripts
|
||||
|
@ -48,6 +48,10 @@ KCFLAGS
|
||||
-------
|
||||
Additional options to the C compiler (for built-in and modules).
|
||||
|
||||
KRUSTFLAGS
|
||||
----------
|
||||
Additional options to the Rust compiler (for built-in and modules).
|
||||
|
||||
CFLAGS_KERNEL
|
||||
-------------
|
||||
Additional options for $(CC) when used to compile
|
||||
@ -57,6 +61,15 @@ CFLAGS_MODULE
|
||||
-------------
|
||||
Additional module specific options to use for $(CC).
|
||||
|
||||
RUSTFLAGS_KERNEL
|
||||
----------------
|
||||
Additional options for $(RUSTC) when used to compile
|
||||
code that is compiled as built-in.
|
||||
|
||||
RUSTFLAGS_MODULE
|
||||
----------------
|
||||
Additional module specific options to use for $(RUSTC).
|
||||
|
||||
LDFLAGS_MODULE
|
||||
--------------
|
||||
Additional options used for $(LD) when linking modules.
|
||||
@ -69,6 +82,10 @@ HOSTCXXFLAGS
|
||||
------------
|
||||
Additional flags to be passed to $(HOSTCXX) when building host programs.
|
||||
|
||||
HOSTRUSTFLAGS
|
||||
-------------
|
||||
Additional flags to be passed to $(HOSTRUSTC) when building host programs.
|
||||
|
||||
HOSTLDFLAGS
|
||||
-----------
|
||||
Additional flags to be passed when linking host programs.
|
||||
|
@ -29,8 +29,9 @@ This document describes the Linux kernel Makefiles.
|
||||
--- 4.1 Simple Host Program
|
||||
--- 4.2 Composite Host Programs
|
||||
--- 4.3 Using C++ for host programs
|
||||
--- 4.4 Controlling compiler options for host programs
|
||||
--- 4.5 When host programs are actually built
|
||||
--- 4.4 Using Rust for host programs
|
||||
--- 4.5 Controlling compiler options for host programs
|
||||
--- 4.6 When host programs are actually built
|
||||
|
||||
=== 5 Userspace Program support
|
||||
--- 5.1 Simple Userspace Program
|
||||
@ -835,7 +836,24 @@ Both possibilities are described in the following.
|
||||
qconf-cxxobjs := qconf.o
|
||||
qconf-objs := check.o
|
||||
|
||||
4.4 Controlling compiler options for host programs
|
||||
4.4 Using Rust for host programs
|
||||
--------------------------------
|
||||
|
||||
Kbuild offers support for host programs written in Rust. However,
|
||||
since a Rust toolchain is not mandatory for kernel compilation,
|
||||
it may only be used in scenarios where Rust is required to be
|
||||
available (e.g. when ``CONFIG_RUST`` is enabled).
|
||||
|
||||
Example::
|
||||
|
||||
hostprogs := target
|
||||
target-rust := y
|
||||
|
||||
Kbuild will compile ``target`` using ``target.rs`` as the crate root,
|
||||
located in the same directory as the ``Makefile``. The crate may
|
||||
consist of several source files (see ``samples/rust/hostprogs``).
|
||||
|
||||
4.5 Controlling compiler options for host programs
|
||||
--------------------------------------------------
|
||||
|
||||
When compiling host programs, it is possible to set specific flags.
|
||||
@ -867,7 +885,7 @@ Both possibilities are described in the following.
|
||||
When linking qconf, it will be passed the extra option
|
||||
"-L$(QTDIR)/lib".
|
||||
|
||||
4.5 When host programs are actually built
|
||||
4.6 When host programs are actually built
|
||||
-----------------------------------------
|
||||
|
||||
Kbuild will only build host-programs when they are referenced
|
||||
@ -1181,6 +1199,17 @@ When kbuild executes, the following steps are followed (roughly):
|
||||
The first example utilises the trick that a config option expands
|
||||
to 'y' when selected.
|
||||
|
||||
KBUILD_RUSTFLAGS
|
||||
$(RUSTC) compiler flags
|
||||
|
||||
Default value - see top level Makefile
|
||||
Append or modify as required per architecture.
|
||||
|
||||
Often, the KBUILD_RUSTFLAGS variable depends on the configuration.
|
||||
|
||||
Note that target specification file generation (for ``--target``)
|
||||
is handled in ``scripts/generate_rust_target.rs``.
|
||||
|
||||
KBUILD_AFLAGS_KERNEL
|
||||
Assembler options specific for built-in
|
||||
|
||||
@ -1208,6 +1237,19 @@ When kbuild executes, the following steps are followed (roughly):
|
||||
are used for $(CC).
|
||||
From commandline CFLAGS_MODULE shall be used (see kbuild.rst).
|
||||
|
||||
KBUILD_RUSTFLAGS_KERNEL
|
||||
$(RUSTC) options specific for built-in
|
||||
|
||||
$(KBUILD_RUSTFLAGS_KERNEL) contains extra Rust compiler flags used to
|
||||
compile resident kernel code.
|
||||
|
||||
KBUILD_RUSTFLAGS_MODULE
|
||||
Options for $(RUSTC) when building modules
|
||||
|
||||
$(KBUILD_RUSTFLAGS_MODULE) is used to add arch-specific options that
|
||||
are used for $(RUSTC).
|
||||
From commandline RUSTFLAGS_MODULE shall be used (see kbuild.rst).
|
||||
|
||||
KBUILD_LDFLAGS_MODULE
|
||||
Options for $(LD) when linking modules
|
||||
|
||||
|
@ -39,7 +39,7 @@ as the writer can invalidate a pointer that the reader is following.
|
||||
Sequence counters (``seqcount_t``)
|
||||
==================================
|
||||
|
||||
This is the the raw counting mechanism, which does not protect against
|
||||
This is the raw counting mechanism, which does not protect against
|
||||
multiple writers. Write side critical sections must thus be serialized
|
||||
by an external lock.
|
||||
|
||||
|
@ -52,7 +52,7 @@ CONTENTS
|
||||
|
||||
- Varieties of memory barrier.
|
||||
- What may not be assumed about memory barriers?
|
||||
- Data dependency barriers (historical).
|
||||
- Address-dependency barriers (historical).
|
||||
- Control dependencies.
|
||||
- SMP barrier pairing.
|
||||
- Examples of memory barrier sequences.
|
||||
@ -187,9 +187,9 @@ As a further example, consider this sequence of events:
|
||||
B = 4; Q = P;
|
||||
P = &B; D = *Q;
|
||||
|
||||
There is an obvious data dependency here, as the value loaded into D depends on
|
||||
the address retrieved from P by CPU 2. At the end of the sequence, any of the
|
||||
following results are possible:
|
||||
There is an obvious address dependency here, as the value loaded into D depends
|
||||
on the address retrieved from P by CPU 2. At the end of the sequence, any of
|
||||
the following results are possible:
|
||||
|
||||
(Q == &A) and (D == 1)
|
||||
(Q == &B) and (D == 2)
|
||||
@ -391,58 +391,62 @@ Memory barriers come in four basic varieties:
|
||||
memory system as time progresses. All stores _before_ a write barrier
|
||||
will occur _before_ all the stores after the write barrier.
|
||||
|
||||
[!] Note that write barriers should normally be paired with read or data
|
||||
dependency barriers; see the "SMP barrier pairing" subsection.
|
||||
[!] Note that write barriers should normally be paired with read or
|
||||
address-dependency barriers; see the "SMP barrier pairing" subsection.
|
||||
|
||||
|
||||
(2) Data dependency barriers.
|
||||
(2) Address-dependency barriers (historical).
|
||||
|
||||
A data dependency barrier is a weaker form of read barrier. In the case
|
||||
where two loads are performed such that the second depends on the result
|
||||
of the first (eg: the first load retrieves the address to which the second
|
||||
load will be directed), a data dependency barrier would be required to
|
||||
make sure that the target of the second load is updated after the address
|
||||
obtained by the first load is accessed.
|
||||
An address-dependency barrier is a weaker form of read barrier. In the
|
||||
case where two loads are performed such that the second depends on the
|
||||
result of the first (eg: the first load retrieves the address to which
|
||||
the second load will be directed), an address-dependency barrier would
|
||||
be required to make sure that the target of the second load is updated
|
||||
after the address obtained by the first load is accessed.
|
||||
|
||||
A data dependency barrier is a partial ordering on interdependent loads
|
||||
only; it is not required to have any effect on stores, independent loads
|
||||
or overlapping loads.
|
||||
An address-dependency barrier is a partial ordering on interdependent
|
||||
loads only; it is not required to have any effect on stores, independent
|
||||
loads or overlapping loads.
|
||||
|
||||
As mentioned in (1), the other CPUs in the system can be viewed as
|
||||
committing sequences of stores to the memory system that the CPU being
|
||||
considered can then perceive. A data dependency barrier issued by the CPU
|
||||
under consideration guarantees that for any load preceding it, if that
|
||||
load touches one of a sequence of stores from another CPU, then by the
|
||||
time the barrier completes, the effects of all the stores prior to that
|
||||
touched by the load will be perceptible to any loads issued after the data
|
||||
dependency barrier.
|
||||
considered can then perceive. An address-dependency barrier issued by
|
||||
the CPU under consideration guarantees that for any load preceding it,
|
||||
if that load touches one of a sequence of stores from another CPU, then
|
||||
by the time the barrier completes, the effects of all the stores prior to
|
||||
that touched by the load will be perceptible to any loads issued after
|
||||
the address-dependency barrier.
|
||||
|
||||
See the "Examples of memory barrier sequences" subsection for diagrams
|
||||
showing the ordering constraints.
|
||||
|
||||
[!] Note that the first load really has to have a _data_ dependency and
|
||||
[!] Note that the first load really has to have an _address_ dependency and
|
||||
not a control dependency. If the address for the second load is dependent
|
||||
on the first load, but the dependency is through a conditional rather than
|
||||
actually loading the address itself, then it's a _control_ dependency and
|
||||
a full read barrier or better is required. See the "Control dependencies"
|
||||
subsection for more information.
|
||||
|
||||
[!] Note that data dependency barriers should normally be paired with
|
||||
[!] Note that address-dependency barriers should normally be paired with
|
||||
write barriers; see the "SMP barrier pairing" subsection.
|
||||
|
||||
[!] Kernel release v5.9 removed kernel APIs for explicit address-
|
||||
dependency barriers. Nowadays, APIs for marking loads from shared
|
||||
variables such as READ_ONCE() and rcu_dereference() provide implicit
|
||||
address-dependency barriers.
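
The publish-then-dereference pattern described here can be sketched in userspace C11. This is only an illustration, not kernel code: `struct msg`, `publish()` and `read_payload()` are hypothetical names, the release store stands in for a write barrier followed by WRITE_ONCE(), and the acquire load stands in for READ_ONCE() with its implicit address-dependency barrier. An acquire load is stronger than an address dependency strictly requires.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct msg {
	int payload;
};

static struct msg slot;			/* plays the role of B */
static _Atomic(struct msg *) shared;	/* plays the role of P, starts NULL */

/* Writer: initialise the target first, then publish its address. */
void publish(int value)
{
	slot.payload = value;
	/* Stand-in for <write barrier> followed by WRITE_ONCE(P, &B). */
	atomic_store_explicit(&shared, &slot, memory_order_release);
}

/* Reader: load the pointer once, then dereference that same value. */
bool read_payload(int *out)
{
	struct msg *q;

	assert(out != NULL);
	/* Stand-in for Q = READ_ONCE(P) plus the implicit barrier. */
	q = atomic_load_explicit(&shared, memory_order_acquire);
	if (!q)
		return false;	/* nothing published yet */

	*out = q->payload;	/* D = *Q: address-dependent load */
	return true;
}
```

If the reader sees the new pointer, it is also guaranteed to see the payload that was written before the pointer was published.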
|
||||
|
||||
(3) Read (or load) memory barriers.
|
||||
|
||||
A read barrier is a data dependency barrier plus a guarantee that all the
|
||||
LOAD operations specified before the barrier will appear to happen before
|
||||
all the LOAD operations specified after the barrier with respect to the
|
||||
other components of the system.
|
||||
A read barrier is an address-dependency barrier plus a guarantee that all
|
||||
the LOAD operations specified before the barrier will appear to happen
|
||||
before all the LOAD operations specified after the barrier with respect to
|
||||
the other components of the system.
|
||||
|
||||
A read barrier is a partial ordering on loads only; it is not required to
|
||||
have any effect on stores.
|
||||
|
||||
Read memory barriers imply data dependency barriers, and so can substitute
|
||||
for them.
|
||||
Read memory barriers imply address-dependency barriers, and so can
|
||||
substitute for them.
|
||||
|
||||
[!] Note that read barriers should normally be paired with write barriers;
|
||||
see the "SMP barrier pairing" subsection.
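
As a rough userspace analogy for pairing a write barrier with a read barrier, the sketch below uses C11 fences. It is an assumption-laden illustration rather than the kernel API: `producer()` and `consumer()` are hypothetical names, and the release and acquire fences only approximate smp_wmb() and smp_rmb() (both C11 fences order more than their kernel counterparts do).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

static int data;		/* plain payload, written before the flag */
static atomic_int ready;	/* flag announcing that the payload is valid */

void producer(int value)
{
	data = value;
	/* Approximates <write barrier>: payload store before flag store. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&ready, 1, memory_order_relaxed);
}

bool consumer(int *out)
{
	assert(out != NULL);
	if (!atomic_load_explicit(&ready, memory_order_relaxed))
		return false;	/* flag not set: payload may not be valid */

	/* Approximates <read barrier>: flag load before payload load. */
	atomic_thread_fence(memory_order_acquire);
	*out = data;
	return true;
}
```

Dropping either fence breaks the pairing: the writer's ordering is only useful if the reader also orders its two loads.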
|
||||
@ -550,17 +554,21 @@ There are certain things that the Linux kernel memory barriers do not guarantee:
|
||||
Documentation/core-api/dma-api.rst
|
||||
|
||||
|
||||
DATA DEPENDENCY BARRIERS (HISTORICAL)
|
||||
-------------------------------------
|
||||
ADDRESS-DEPENDENCY BARRIERS (HISTORICAL)
|
||||
----------------------------------------
|
||||
|
||||
As of v4.15 of the Linux kernel, an smp_mb() was added to READ_ONCE() for
|
||||
DEC Alpha, which means that about the only people who need to pay attention
|
||||
to this section are those working on DEC Alpha architecture-specific code
|
||||
and those working on READ_ONCE() itself. For those who need it, and for
|
||||
those who are interested in the history, here is the story of
|
||||
data-dependency barriers.
|
||||
address-dependency barriers.
|
||||
|
||||
The usage requirements of data dependency barriers are a little subtle, and
|
||||
[!] While address dependencies are observed in both load-to-load and
|
||||
load-to-store relations, address-dependency barriers are not necessary
|
||||
for load-to-store situations.
|
||||
|
||||
The requirement of address-dependency barriers is a little subtle, and
|
||||
it's not always obvious that they're needed. To illustrate, consider the
|
||||
following sequence of events:
|
||||
|
||||
@ -570,11 +578,14 @@ following sequence of events:
|
||||
B = 4;
|
||||
<write barrier>
|
||||
WRITE_ONCE(P, &B);
|
||||
Q = READ_ONCE(P);
|
||||
Q = READ_ONCE_OLD(P);
|
||||
D = *Q;
|
||||
|
||||
There's a clear data dependency here, and it would seem that by the end of the
|
||||
sequence, Q must be either &A or &B, and that:
|
||||
[!] READ_ONCE_OLD() corresponds to READ_ONCE() of pre-4.15 kernel, which
|
||||
doesn't imply an address-dependency barrier.
|
||||
|
||||
There's a clear address dependency here, and it would seem that by the end of
|
||||
the sequence, Q must be either &A or &B, and that:
|
||||
|
||||
(Q == &A) implies (D == 1)
|
||||
(Q == &B) implies (D == 4)
|
||||
@ -588,8 +599,8 @@ While this may seem like a failure of coherency or causality maintenance, it
|
||||
isn't, and this behaviour can be observed on certain real CPUs (such as the DEC
|
||||
Alpha).
|
||||
|
||||
To deal with this, a data dependency barrier or better must be inserted
|
||||
between the address load and the data load:
|
||||
To deal with this, READ_ONCE() provides an implicit address-dependency barrier
|
||||
since kernel release v4.15:
|
||||
|
||||
CPU 1 CPU 2
|
||||
=============== ===============
|
||||
@ -598,7 +609,7 @@ between the address load and the data load:
|
||||
<write barrier>
|
||||
WRITE_ONCE(P, &B);
|
||||
Q = READ_ONCE(P);
|
||||
<data dependency barrier>
|
||||
<implicit address-dependency barrier>
|
||||
D = *Q;
|
||||
|
||||
This enforces the occurrence of one of the two implications, and prevents the
|
||||
@ -615,13 +626,13 @@ odd-numbered bank is idle, one can see the new value of the pointer P (&B),
|
||||
but the old value of the variable B (2).
|
||||
|
||||
|
||||
A data-dependency barrier is not required to order dependent writes
|
||||
because the CPUs that the Linux kernel supports don't do writes
|
||||
until they are certain (1) that the write will actually happen, (2)
|
||||
of the location of the write, and (3) of the value to be written.
|
||||
An address-dependency barrier is not required to order dependent writes
|
||||
because the CPUs that the Linux kernel supports don't do writes until they
|
||||
are certain (1) that the write will actually happen, (2) of the location of
|
||||
the write, and (3) of the value to be written.
|
||||
But please carefully read the "CONTROL DEPENDENCIES" section and the
|
||||
Documentation/RCU/rcu_dereference.rst file: The compiler can and does
|
||||
break dependencies in a great many highly creative ways.
|
||||
Documentation/RCU/rcu_dereference.rst file: The compiler can and does break
|
||||
dependencies in a great many highly creative ways.
|
||||
|
||||
CPU 1 CPU 2
|
||||
=============== ===============
|
||||
@ -629,12 +640,12 @@ break dependencies in a great many highly creative ways.
|
||||
B = 4;
|
||||
<write barrier>
|
||||
WRITE_ONCE(P, &B);
|
||||
Q = READ_ONCE(P);
|
||||
Q = READ_ONCE_OLD(P);
|
||||
WRITE_ONCE(*Q, 5);
|
||||
|
||||
Therefore, no data-dependency barrier is required to order the read into
|
||||
Therefore, no address-dependency barrier is required to order the read into
|
||||
Q with the store into *Q. In other words, this outcome is prohibited,
|
||||
even without a data-dependency barrier:
|
||||
even without an implicit address-dependency barrier of modern READ_ONCE():
|
||||
|
||||
(Q == &B) && (B == 4)
|
||||
|
||||
@ -645,12 +656,12 @@ can be used to record rare error conditions and the like, and the CPUs'
|
||||
naturally occurring ordering prevents such records from being lost.
|
||||
|
||||
|
||||
Note well that the ordering provided by a data dependency is local to
|
||||
Note well that the ordering provided by an address dependency is local to
|
||||
the CPU containing it. See the section on "Multicopy atomicity" for
|
||||
more information.
|
||||
|
||||
|
||||
The data dependency barrier is very important to the RCU system,
|
||||
The address-dependency barrier is very important to the RCU system,
|
||||
for example. See rcu_assign_pointer() and rcu_dereference() in
|
||||
include/linux/rcupdate.h. This permits the current target of an RCU'd
|
||||
pointer to be replaced with a new modified target, without the replacement
|
||||
@ -667,20 +678,21 @@ not understand them. The purpose of this section is to help you prevent
|
||||
the compiler's ignorance from breaking your code.
|
||||
|
||||
A load-load control dependency requires a full read memory barrier, not
|
||||
simply a data dependency barrier to make it work correctly. Consider the
|
||||
following bit of code:
|
||||
simply an (implicit) address-dependency barrier to make it work correctly.
|
||||
Consider the following bit of code:
|
||||
|
||||
q = READ_ONCE(a);
|
||||
<implicit address-dependency barrier>
|
||||
if (q) {
|
||||
<data dependency barrier> /* BUG: No data dependency!!! */
|
||||
/* BUG: No address dependency!!! */
|
||||
p = READ_ONCE(b);
|
||||
}
|
||||
|
||||
This will not have the desired effect because there is no actual data
|
||||
This will not have the desired effect because there is no actual address
|
||||
dependency, but rather a control dependency that the CPU may short-circuit
|
||||
by attempting to predict the outcome in advance, so that other CPUs see
|
||||
the load from b as having happened before the load from a. In such a
|
||||
case what's actually required is:
|
||||
the load from b as having happened before the load from a. In such a case
|
||||
what's actually required is:
|
||||
|
||||
q = READ_ONCE(a);
|
||||
if (q) {
|
||||
@ -927,9 +939,9 @@ General barriers pair with each other, though they also pair with most
|
||||
other types of barriers, albeit without multicopy atomicity. An acquire
|
||||
barrier pairs with a release barrier, but both may also pair with other
|
||||
barriers, including of course general barriers. A write barrier pairs
|
||||
with a data dependency barrier, a control dependency, an acquire barrier,
|
||||
with an address-dependency barrier, a control dependency, an acquire barrier,
|
||||
a release barrier, a read barrier, or a general barrier. Similarly a
|
||||
read barrier, control dependency, or a data dependency barrier pairs
|
||||
read barrier, control dependency, or an address-dependency barrier pairs
|
||||
with a write barrier, an acquire barrier, a release barrier, or a
|
||||
general barrier:
|
||||
|
||||
@ -948,7 +960,7 @@ Or:
|
||||
a = 1;
|
||||
<write barrier>
|
||||
WRITE_ONCE(b, &a); x = READ_ONCE(b);
|
||||
<data dependency barrier>
|
||||
<implicit address-dependency barrier>
|
||||
y = *x;
|
||||
|
||||
Or even:
|
||||
@ -968,8 +980,8 @@ Basically, the read barrier always has to be there, even though it can be of
|
||||
the "weaker" type.
|
||||
|
||||
[!] Note that the stores before the write barrier would normally be expected to
|
||||
match the loads after the read barrier or the data dependency barrier, and vice
|
||||
versa:
|
||||
match the loads after the read barrier or the address-dependency barrier, and
|
||||
vice versa:
|
||||
|
||||
CPU 1 CPU 2
|
||||
=================== ===================
|
||||
@ -1021,8 +1033,8 @@ STORE B, STORE C } all occurring before the unordered set of { STORE D, STORE E
|
||||
V
|
||||
|
||||
|
||||
Secondly, data dependency barriers act as partial orderings on data-dependent
|
||||
loads. Consider the following sequence of events:
|
||||
Secondly, address-dependency barriers act as partial orderings on address-
|
||||
dependent loads. Consider the following sequence of events:
|
||||
|
||||
CPU 1 CPU 2
|
||||
======================= =======================
|
||||
@ -1067,8 +1079,8 @@ effectively random order, despite the write barrier issued by CPU 1:
|
||||
In the above example, CPU 2 perceives that B is 7, despite the load of *C
|
||||
(which would be B) coming after the LOAD of C.
|
||||
|
||||
If, however, a data dependency barrier were to be placed between the load of C
|
||||
and the load of *C (ie: B) on CPU 2:
|
||||
If, however, an address-dependency barrier were to be placed between the load
|
||||
of C and the load of *C (ie: B) on CPU 2:
|
||||
|
||||
CPU 1 CPU 2
|
||||
======================= =======================
|
||||
@ -1078,7 +1090,7 @@ and the load of *C (ie: B) on CPU 2:
|
||||
<write barrier>
|
||||
STORE C = &B LOAD X
|
||||
STORE D = 4 LOAD C (gets &B)
|
||||
<data dependency barrier>
|
||||
<address-dependency barrier>
|
||||
LOAD *C (reads B)
|
||||
|
||||
then the following will occur:
|
||||
@ -1101,7 +1113,7 @@ then the following will occur:
|
||||
| +-------+ | |
|
||||
| | X->9 |------>| |
|
||||
| +-------+ | |
|
||||
Makes sure all effects ---> \ ddddddddddddddddd | |
|
||||
Makes sure all effects ---> \ aaaaaaaaaaaaaaaaa | |
|
||||
prior to the store of C \ +-------+ | |
|
||||
are perceptible to ----->| B->2 |------>| |
|
||||
subsequent loads +-------+ | |
|
||||
@ -1292,7 +1304,7 @@ Which might appear as this:
|
||||
LOAD with immediate effect : : +-------+
|
||||
|
||||
|
||||
Placing a read barrier or a data dependency barrier just before the second
|
||||
Placing a read barrier or an address-dependency barrier just before the second
|
||||
load:
|
||||
|
||||
CPU 1 CPU 2
|
||||
@ -1816,20 +1828,20 @@ which may then reorder things however it wishes.
|
||||
CPU MEMORY BARRIERS
|
||||
-------------------
|
||||
|
||||
The Linux kernel has eight basic CPU memory barriers:
|
||||
The Linux kernel has seven basic CPU memory barriers:
|
||||
|
||||
TYPE MANDATORY SMP CONDITIONAL
|
||||
=============== ======================= ===========================
|
||||
GENERAL mb() smp_mb()
|
||||
WRITE wmb() smp_wmb()
|
||||
READ rmb() smp_rmb()
|
||||
DATA DEPENDENCY READ_ONCE()
|
||||
TYPE MANDATORY SMP CONDITIONAL
|
||||
======================= =============== ===============
|
||||
GENERAL mb() smp_mb()
|
||||
WRITE wmb() smp_wmb()
|
||||
READ rmb() smp_rmb()
|
||||
ADDRESS DEPENDENCY READ_ONCE()
|
||||
|
||||
|
||||
All memory barriers except the data dependency barriers imply a compiler
|
||||
barrier. Data dependencies do not impose any additional compiler ordering.
|
||||
All memory barriers except the address-dependency barriers imply a compiler
|
||||
barrier. Address dependencies do not impose any additional compiler ordering.
|
||||
|
||||
Aside: In the case of data dependencies, the compiler would be expected
|
||||
Aside: In the case of address dependencies, the compiler would be expected
|
||||
to issue the loads in the correct order (eg. `a[b]` would have to load
|
||||
the value of b before loading a[b]), however there is no guarantee in
|
||||
the C specification that the compiler may not speculate the value of b
|
||||
@ -2749,7 +2761,8 @@ is discarded from the CPU's cache and reloaded. To deal with this, the
|
||||
appropriate part of the kernel must invalidate the overlapping bits of the
|
||||
cache on each CPU.
|
||||
|
||||
See Documentation/core-api/cachetlb.rst for more information on cache management.
|
||||
See Documentation/core-api/cachetlb.rst for more information on cache
|
||||
management.
|
||||
|
||||
|
||||
CACHE COHERENCY VS MMIO
|
||||
@ -2889,8 +2902,8 @@ AND THEN THERE'S THE ALPHA
|
||||
The DEC Alpha CPU is one of the most relaxed CPUs there is. Not only that,
|
||||
some versions of the Alpha CPU have a split data cache, permitting them to have
|
||||
two semantically-related cache lines updated at separate times. This is where
|
||||
the data dependency barrier really becomes necessary as this synchronises both
|
||||
caches with the memory coherence system, thus making it seem like pointer
|
||||
the address-dependency barrier really becomes necessary as this synchronises
|
||||
both caches with the memory coherence system, thus making it seem like pointer
|
||||
changes vs new data occur in the right order.
|
||||
|
||||
The Alpha defines the Linux kernel's memory model, although as of v4.15
|
||||
|
@ -197,7 +197,7 @@ unevictable list for the memory cgroup and node being scanned.
|
||||
There may be situations where a page is mapped into a VM_LOCKED VMA, but the
|
||||
page is not marked as PG_mlocked. Such pages will make it all the way to
|
||||
shrink_active_list() or shrink_page_list() where they will be detected when
|
||||
vmscan walks the reverse map in page_referenced() or try_to_unmap(). The page
|
||||
vmscan walks the reverse map in folio_referenced() or try_to_unmap(). The page
|
||||
is culled to the unevictable list when it is released by the shrinker.
|
||||
|
||||
To "cull" an unevictable page, vmscan simply puts the page back on the LRU list
|
||||
@ -267,7 +267,7 @@ the LRU. Such pages can be "noticed" by memory management in several places:
|
||||
(4) in the fault path and when a VM_LOCKED stack segment is expanded; or
|
||||
|
||||
(5) as mentioned above, in vmscan:shrink_page_list() when attempting to
|
||||
reclaim a page in a VM_LOCKED VMA by page_referenced() or try_to_unmap().
|
||||
reclaim a page in a VM_LOCKED VMA by folio_referenced() or try_to_unmap().
|
||||
|
||||
mlocked pages become unlocked and rescued from the unevictable list when:
|
||||
|
||||
@ -547,7 +547,7 @@ vmscan's shrink_inactive_list() and shrink_page_list() also divert obviously
|
||||
unevictable pages found on the inactive lists to the appropriate memory cgroup
|
||||
and node unevictable list.
|
||||
|
||||
rmap's page_referenced_one(), called via vmscan's shrink_active_list() or
|
||||
rmap's folio_referenced_one(), called via vmscan's shrink_active_list() or
|
||||
shrink_page_list(), and rmap's try_to_unmap_one() called via shrink_page_list(),
|
||||
check for (3) pages still mapped into VM_LOCKED VMAs, and call mlock_vma_page()
|
||||
to correct them. Such pages are culled to the unevictable list when released
|
||||
|
@ -220,6 +220,8 @@ Userspace to kernel:
|
||||
``ETHTOOL_MSG_PHC_VCLOCKS_GET`` get PHC virtual clocks info
|
||||
``ETHTOOL_MSG_MODULE_SET`` set transceiver module parameters
|
||||
``ETHTOOL_MSG_MODULE_GET`` get transceiver module parameters
|
||||
``ETHTOOL_MSG_PSE_SET`` set PSE parameters
|
||||
``ETHTOOL_MSG_PSE_GET`` get PSE parameters
|
||||
===================================== =================================
|
||||
|
||||
Kernel to userspace:
|
||||
@ -260,6 +262,7 @@ Kernel to userspace:
|
||||
``ETHTOOL_MSG_STATS_GET_REPLY`` standard statistics
|
||||
``ETHTOOL_MSG_PHC_VCLOCKS_GET_REPLY`` PHC virtual clocks info
|
||||
``ETHTOOL_MSG_MODULE_GET_REPLY`` transceiver module parameters
|
||||
``ETHTOOL_MSG_PSE_GET_REPLY`` PSE parameters
|
||||
======================================== =================================
|
||||
|
||||
``GET`` requests are sent by userspace applications to retrieve device
|
||||
@ -1627,6 +1630,62 @@ For SFF-8636 modules, low power mode is forced by the host according to table
|
||||
For CMIS modules, low power mode is forced by the host according to table 6-12
|
||||
in revision 5.0 of the specification.
|
||||
|
||||
PSE_GET
|
||||
=======
|
||||
|
||||
Gets PSE attributes.
|
||||
|
||||
Request contents:
|
||||
|
||||
===================================== ====== ==========================
|
||||
``ETHTOOL_A_PSE_HEADER`` nested request header
|
||||
===================================== ====== ==========================
|
||||
|
||||
Kernel response contents:
|
||||
|
||||
====================================== ====== =============================
|
||||
``ETHTOOL_A_PSE_HEADER`` nested reply header
|
||||
``ETHTOOL_A_PODL_PSE_ADMIN_STATE`` u32 Operational state of the PoDL
|
||||
PSE functions
|
||||
``ETHTOOL_A_PODL_PSE_PW_D_STATUS`` u32 power detection status of the
|
||||
PoDL PSE.
|
||||
====================================== ====== =============================
|
||||
|
||||
When set, the optional ``ETHTOOL_A_PODL_PSE_ADMIN_STATE`` attribute identifies
|
||||
the operational state of the PoDL PSE functions. The operational state of the
|
||||
PSE function can be changed using the ``ETHTOOL_A_PODL_PSE_ADMIN_CONTROL``
|
||||
action. This option corresponds to ``IEEE 802.3-2018`` 30.15.1.1.2
|
||||
aPoDLPSEAdminState. Possible values are:
|
||||
|
||||
.. kernel-doc:: include/uapi/linux/ethtool.h
|
||||
:identifiers: ethtool_podl_pse_admin_state
|
||||
|
||||
When set, the optional ``ETHTOOL_A_PODL_PSE_PW_D_STATUS`` attribute identifies
|
||||
the power detection status of the PoDL PSE. The status depends on the internal PSE
|
||||
state machine and automatic PD classification support. This option is
|
||||
corresponding to ``IEEE 802.3-2018`` 30.15.1.1.3 aPoDLPSEPowerDetectionStatus.
|
||||
Possible values are:
|
||||
|
||||
.. kernel-doc:: include/uapi/linux/ethtool.h
|
||||
:identifiers: ethtool_podl_pse_pw_d_status
|
||||
|
||||
PSE_SET
|
||||
=======
|
||||
|
||||
Sets PSE parameters.
|
||||
|
||||
Request contents:
|
||||
|
||||
====================================== ====== =============================
|
||||
``ETHTOOL_A_PSE_HEADER`` nested request header
|
||||
``ETHTOOL_A_PODL_PSE_ADMIN_CONTROL`` u32 Control PoDL PSE Admin state
|
||||
====================================== ====== =============================
|
||||
|
||||
When set, the optional ``ETHTOOL_A_PODL_PSE_ADMIN_CONTROL`` attribute is used
|
||||
to control PoDL PSE Admin functions. This option implements
|
||||
``IEEE 802.3-2018`` 30.15.1.2.1 acPoDLPSEAdminControl. See
|
||||
``ETHTOOL_A_PODL_PSE_ADMIN_STATE`` for supported values.
|
||||
|
||||
Request translation
|
||||
===================
|
||||
|
||||
|
@ -256,8 +256,10 @@ The tags in common use are:
|
||||
- Cc: the named person received a copy of the patch and had the
|
||||
opportunity to comment on it.
|
||||
|
||||
Be careful in the addition of tags to your patches: only Cc: is appropriate
|
||||
for addition without the explicit permission of the person named.
|
||||
Be careful in the addition of tags to your patches, as only Cc: is appropriate
|
||||
for addition without the explicit permission of the person named; using
|
||||
Reported-by: is fine most of the time as well, but ask for permission if
|
||||
the bug was reported in private.
|
||||
|
||||
|
||||
Sending the patch
|
||||
|
@ -31,6 +31,8 @@ you probably needn't concern yourself with pcmciautils.
|
||||
====================== =============== ========================================
|
||||
GNU C 5.1 gcc --version
|
||||
Clang/LLVM (optional) 11.0.0 clang --version
|
||||
Rust (optional) 1.62.0 rustc --version
|
||||
bindgen (optional) 0.56.0 bindgen --version
|
||||
GNU make 3.81 make --version
|
||||
bash 4.2 bash --version
|
||||
binutils 2.23 ld -v
|
||||
@ -80,6 +82,29 @@ kernels. Older releases aren't guaranteed to work, and we may drop workarounds
|
||||
from the kernel that were used to support older versions. Please see additional
|
||||
docs on :ref:`Building Linux with Clang/LLVM <kbuild_llvm>`.
|
||||
|
||||
Rust (optional)
|
||||
---------------
|
||||
|
||||
A particular version of the Rust toolchain is required. Newer versions may or
|
||||
may not work because the kernel depends on some unstable Rust features, for
|
||||
the moment.
|
||||
|
||||
Each Rust toolchain comes with several "components", some of which are required
|
||||
(like ``rustc``) and some that are optional. The ``rust-src`` component (which
|
||||
is optional) needs to be installed to build the kernel. Other components are
|
||||
useful for developing.
|
||||
|
||||
Please see Documentation/rust/quick-start.rst for instructions on how to
|
||||
satisfy the build requirements of Rust support. In particular, the ``Makefile``
|
||||
target ``rustavailable`` is useful to check why the Rust toolchain may not
|
||||
be detected.
|
||||
|
||||
bindgen (optional)
|
||||
------------------
|
||||
|
||||
``bindgen`` is used to generate the Rust bindings to the C side of the kernel.
|
||||
It depends on ``libclang``.
|
||||
|
||||
Make
|
||||
----
|
||||
|
||||
@ -348,6 +373,12 @@ Sphinx
|
||||
Please see :ref:`sphinx_install` in :ref:`Documentation/doc-guide/sphinx.rst <sphinxdoc>`
|
||||
for details about Sphinx requirements.
|
||||
|
||||
rustdoc
|
||||
-------
|
||||
|
||||
``rustdoc`` is used to generate the documentation for Rust code. Please see
|
||||
Documentation/rust/general-information.rst for more information.
|
||||
|
||||
Getting updated software
|
||||
========================
|
||||
|
||||
@ -364,6 +395,16 @@ Clang/LLVM
|
||||
|
||||
- :ref:`Getting LLVM <getting_llvm>`.
|
||||
|
||||
Rust
|
||||
----
|
||||
|
||||
- Documentation/rust/quick-start.rst.
|
||||
|
||||
bindgen
|
||||
-------
|
||||
|
||||
- Documentation/rust/quick-start.rst.
|
||||
|
||||
Make
|
||||
----
|
||||
|
||||
|
@ -51,7 +51,7 @@ the Technical Advisory Board (TAB) or other maintainers if you're
|
||||
uncertain how to handle situations that come up. It will not be
|
||||
considered a violation report unless you want it to be. If you are
|
||||
uncertain about approaching the TAB or any other maintainers, please
|
||||
reach out to our conflict mediator, Mishi Choudhary <mishi@linux.com>.
|
||||
reach out to our conflict mediator, Joanna Lee <joanna.lee@gesmer.com>.
|
||||
|
||||
In the end, "be kind to each other" is really what the end goal is for
|
||||
everybody. We know everyone is human and we all fail at times, but the
|
||||
@ -127,10 +127,12 @@ are listed at https://kernel.org/code-of-conduct.html. Members can not
|
||||
access reports made before they joined or after they have left the
|
||||
committee.
|
||||
|
||||
The initial Code of Conduct Committee consists of volunteer members of
|
||||
the TAB, as well as a professional mediator acting as a neutral third
|
||||
party. The first task of the committee is to establish documented
|
||||
processes, which will be made public.
|
||||
The Code of Conduct Committee consists of volunteer community members
|
||||
appointed by the TAB, as well as a professional mediator acting as a
|
||||
neutral third party. The processes the Code of Conduct committee will
|
||||
use to address reports are varied and will depend on the individual
|
||||
circumstance, however, this file serves as documentation for the
|
||||
general process used.
|
||||
|
||||
Any member of the committee, including the mediator, can be contacted
|
||||
directly if a reporter does not wish to include the full committee in a
|
||||
@ -141,16 +143,16 @@ processes (see above) and consults with the TAB as needed and
|
||||
appropriate, for instance to request and receive information about the
|
||||
kernel community.
|
||||
|
||||
Any decisions by the committee will be brought to the TAB, for
|
||||
implementation of enforcement with the relevant maintainers if needed.
|
||||
A decision by the Code of Conduct Committee can be overturned by the TAB
|
||||
by a two-thirds vote.
|
||||
Any decisions regarding enforcement recommendations will be brought to
|
||||
the TAB for implementation of enforcement with the relevant maintainers
|
||||
if needed. A decision by the Code of Conduct Committee can be overturned
|
||||
by the TAB by a two-thirds vote.
|
||||
|
||||
At quarterly intervals, the Code of Conduct Committee and TAB will
|
||||
provide a report summarizing the anonymised reports that the Code of
|
||||
Conduct committee has received and their status, as well as details of any
|
||||
overridden decisions including complete and identifiable voting details.
|
||||
|
||||
We expect to establish a different process for Code of Conduct Committee
|
||||
staffing beyond the bootstrap period. This document will be updated
|
||||
with that information when this occurs.
|
||||
Because how we interpret and enforce the Code of Conduct will evolve over
|
||||
time, this document will be updated when necessary to reflect any
|
||||
changes.
|
||||
|
@ -1186,6 +1186,68 @@ expression used. For instance:
|
||||
#endif /* CONFIG_SOMETHING */
|
||||
|
||||
|
||||
22) Do not crash the kernel
|
||||
---------------------------
|
||||
|
||||
In general, the decision to crash the kernel belongs to the user, rather
|
||||
than to the kernel developer.
|
||||
|
||||
Avoid panic()
|
||||
*************
|
||||
|
||||
panic() should be used with care and primarily only during system boot.
|
||||
panic() is, for example, acceptable when running out of memory during boot and
|
||||
not being able to continue.
|
||||
|
||||
Use WARN() rather than BUG()
|
||||
****************************
|
||||
|
||||
Do not add new code that uses any of the BUG() variants, such as BUG(),
|
||||
BUG_ON(), or VM_BUG_ON(). Instead, use a WARN*() variant, preferably
|
||||
WARN_ON_ONCE(), and possibly with recovery code. Recovery code is not
|
||||
required if there is no reasonable way to at least partially recover.
|
||||
|
||||
"I'm too lazy to do error handling" is not an excuse for using BUG(). Major
|
||||
internal corruptions with no way of continuing may still use BUG(), but need
|
||||
good justification.
|
||||
|
||||
Use WARN_ON_ONCE() rather than WARN() or WARN_ON()
|
||||
**************************************************
|
||||
|
||||
WARN_ON_ONCE() is generally preferred over WARN() or WARN_ON(), because it
|
||||
is common for a given warning condition, if it occurs at all, to occur
|
||||
multiple times. This can fill up and wrap the kernel log, and can even slow
|
||||
the system enough that the excessive logging turns into its own, additional
|
||||
problem.
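
The recover-instead-of-crash idea and the report-once behaviour can be sketched in userspace as follows. `SKETCH_WARN_ON_ONCE()` is a hypothetical stand-in written for this example, not the kernel macro, and `lookup_slot()` and `warnings_emitted` are likewise made up. Like kernel code, it relies on a GNU C statement expression.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

static int warnings_emitted;	/* lets callers observe how often we logged */

/*
 * Stand-in for WARN_ON_ONCE(): report the first failure seen at this
 * call site, stay silent afterwards, and always hand the condition back
 * so that the caller can run its recovery code.
 */
#define SKETCH_WARN_ON_ONCE(cond) ({				\
	static bool warned_;					\
	bool failed_ = !!(cond);				\
	if (failed_ && !warned_) {				\
		warned_ = true;					\
		warnings_emitted++;				\
		fprintf(stderr, "WARNING: %s at %s:%d\n",	\
			#cond, __FILE__, __LINE__);		\
	}							\
	failed_;						\
})

/* Hypothetical helper: reject an impossible index instead of crashing. */
int lookup_slot(const int *table, int nr_slots, int idx, int *out)
{
	if (SKETCH_WARN_ON_ONCE(idx < 0 || idx >= nr_slots))
		return -EINVAL;	/* recovery path where BUG_ON() would halt */

	*out = table[idx];
	return 0;
}
```

Repeated bad calls still fail with -EINVAL each time, but only the first one is logged, so the log cannot be flooded from this call site.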
|
||||
|
||||
Do not WARN lightly
|
||||
*******************
|
||||
|
||||
WARN*() is intended for unexpected, this-should-never-happen situations.
|
||||
WARN*() macros are not to be used for anything that is expected to happen
|
||||
during normal operation. These are not pre- or post-condition asserts, for
|
||||
example. Again: WARN*() must not be used for a condition that is expected
|
||||
to trigger easily, for example, by user space actions. pr_warn_once() is a
|
||||
possible alternative, if you need to notify the user of a problem.
|
||||
|
||||
Do not worry about panic_on_warn users
|
||||
**************************************
|
||||
|
||||
A few more words about panic_on_warn: Remember that ``panic_on_warn`` is an
|
||||
available kernel option, and that many users set this option. This is why
|
||||
there is a "Do not WARN lightly" writeup, above. However, the existence of
|
||||
panic_on_warn users is not a valid reason to avoid the judicious use of
|
||||
WARN*(). That is because, whoever enables panic_on_warn has explicitly
|
||||
asked the kernel to crash if a WARN*() fires, and such users must be
|
||||
prepared to deal with the consequences of a system that is somewhat more
|
||||
likely to crash.
|
||||
|
||||
Use BUILD_BUG_ON() for compile-time assertions
|
||||
**********************************************
|
||||
|
||||
The use of BUILD_BUG_ON() is acceptable and encouraged, because it is a
|
||||
compile-time assertion that has no effect at runtime.
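
A compile-time assertion of this kind might look like the userspace sketch below. `SKETCH_BUILD_BUG_ON()` is a hypothetical stand-in built on C11 `_Static_assert`, and `struct wire_hdr` is an invented layout used only for illustration. As with the kernel macro, the build fails when the condition is true.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical on-wire header whose layout must not change silently. */
struct wire_hdr {
	uint16_t type;
	uint16_t len;
	uint32_t seq;
};

/* Stand-in for BUILD_BUG_ON(): break the build if cond is true. */
#define SKETCH_BUILD_BUG_ON(cond) \
	_Static_assert(!(cond), "BUILD_BUG_ON failed: " #cond)

size_t wire_hdr_size(void)
{
	/* Checked by the compiler; generates no code at runtime. */
	SKETCH_BUILD_BUG_ON(sizeof(struct wire_hdr) != 8);
	return sizeof(struct wire_hdr);
}
```

If someone later adds a field to the struct, compilation stops at the assertion and no runtime check is needed.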
|
||||
|
||||
Appendix I) References
|
||||
----------------------
|
||||
|
||||
|
@ -138,17 +138,20 @@ be NUL terminated. This can lead to various linear read overflows and
|
||||
other misbehavior due to the missing termination. It also NUL-pads
|
||||
the destination buffer if the source contents are shorter than the
|
||||
destination buffer size, which may be a needless performance penalty
|
||||
for callers using only NUL-terminated strings. The safe replacement is
|
||||
for callers using only NUL-terminated strings.
|
||||
|
||||
When the destination is required to be NUL-terminated, the replacement is
|
||||
strscpy(), though care must be given to any cases where the return value
|
||||
of strncpy() was used, since strscpy() does not return a pointer to the
|
||||
destination, but rather a count of non-NUL bytes copied (or negative
|
||||
errno when it truncates). Any cases still needing NUL-padding should
|
||||
instead use strscpy_pad().
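
The return-value difference matters when converting callers, so here is a minimal userspace sketch of the strscpy() contract as described above. `sketch_strscpy()` is a hypothetical function written for this example, not the kernel implementation. It assumes -E2BIG as the truncation error, which is what the kernel returns, and uses `long` where the kernel uses `ssize_t`.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Always NUL-terminate dst. Return the number of non-NUL bytes copied,
 * or -E2BIG if src did not fit and had to be truncated.
 */
long sketch_strscpy(char *dst, const char *src, size_t size)
{
	size_t i;

	if (size == 0)
		return -E2BIG;	/* no room even for the terminator */

	/* Copy at most size - 1 bytes, leaving space for the NUL. */
	for (i = 0; i < size - 1 && src[i] != '\0'; i++)
		dst[i] = src[i];
	dst[i] = '\0';

	/* If we stopped before the end of src, the copy was truncated. */
	return src[i] == '\0' ? (long)i : -E2BIG;
}
```

A caller that used the pointer returned by strncpy() therefore has to be rewritten to check for a negative result instead.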
|
||||
|
||||
If a caller is using non-NUL-terminated strings, strncpy() can
|
||||
still be used, but destinations should be marked with the `__nonstring
|
||||
If a caller is using non-NUL-terminated strings, strtomem() should be
|
||||
used, and the destinations should be marked with the `__nonstring
|
||||
<https://gcc.gnu.org/onlinedocs/gcc/Common-Variable-Attributes.html>`_
|
||||
attribute to avoid future compiler warnings.
|
||||
attribute to avoid future compiler warnings. For cases still needing
|
||||
NUL-padding, strtomem_pad() can be used.
|
||||
|
||||
strlcpy()
|
||||
---------
|
||||
|
@ -5,6 +5,7 @@
|
||||
|
||||
.. _process_index:
|
||||
|
||||
=============================================
|
||||
Working with the kernel development community
|
||||
=============================================
|
||||
|
||||
|
@ -121,57 +121,56 @@ edit your ``~/.gnupg/gpg-agent.conf`` file to set your own values::
|
||||
to remove anything you had in place for older versions of GnuPG, as
|
||||
it may not be doing the right thing any more.
|
||||
|
||||
Set up a refresh cronjob
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
.. _protect_your_key:
|
||||
|
||||
You will need to regularly refresh your keyring in order to get the
|
||||
latest changes on other people's public keys, which is best done with a
|
||||
daily cronjob::
|
||||
|
||||
@daily /usr/bin/gpg2 --refresh >/dev/null 2>&1
|
||||
|
||||
Check the full path to your ``gpg`` or ``gpg2`` command and use the
|
||||
``gpg2`` command if regular ``gpg`` for you is the legacy GnuPG v.1.
|
||||
|
||||
.. _master_key:
|
||||
|
||||
Protect your master PGP key
|
||||
===========================
|
||||
Protect your PGP key
|
||||
====================
|
||||
|
||||
This guide assumes that you already have a PGP key that you use for Linux
|
||||
kernel development purposes. If you do not yet have one, please see the
|
||||
"`Protecting Code Integrity`_" document mentioned earlier for guidance
|
||||
on how to create a new one.
|
||||
|
||||
You should also make a new key if your current one is weaker than 2048 bits
|
||||
(RSA).
|
||||
You should also make a new key if your current one is weaker than 2048
|
||||
bits (RSA).
|
||||
|
||||
Master key vs. Subkeys
|
||||
----------------------
|
||||
Understanding PGP Subkeys
|
||||
-------------------------
|
||||
|
||||
Subkeys are fully independent PGP keypairs that are tied to the "master"
|
||||
key using certifying key signatures (certificates). It is important to
|
||||
understand the following:
|
||||
A PGP key rarely consists of a single keypair -- usually it is a
|
||||
collection of independent subkeys that can be used for different
|
||||
purposes based on their capabilities, assigned at their creation time.
|
||||
PGP defines four capabilities that a key can have:
|
||||
|
||||
1. There are no technical differences between the "master key" and "subkeys."
|
||||
2. At creation time, we assign functional limitations to each key by
|
||||
giving it specific capabilities.
|
||||
3. A PGP key can have 4 capabilities:
|
||||
- **[S]** keys can be used for signing
|
||||
- **[E]** keys can be used for encryption
|
||||
- **[A]** keys can be used for authentication
|
||||
- **[C]** keys can be used for certifying other keys
|
||||
|
||||
- **[S]** key can be used for signing
|
||||
- **[E]** key can be used for encryption
|
||||
- **[A]** key can be used for authentication
|
||||
- **[C]** key can be used for certifying other keys
|
||||
The key with the **[C]** capability is often called the "master" key,
|
||||
but this terminology is misleading because it implies that the Certify
|
||||
key can be used in place of any other subkey on the same chain (like
|
||||
a physical "master key" can be used to open the locks made for other
|
||||
keys). Since this is not the case, this guide will refer to it as "the
|
||||
Certify key" to avoid any ambiguity.
|
||||
|
||||
4. A single key may have multiple capabilities.
|
||||
5. A subkey is fully independent from the master key. A message
|
||||
encrypted to a subkey cannot be decrypted with the master key. If you
|
||||
lose your private subkey, it cannot be recreated from the master key
|
||||
in any way.
|
||||
It is critical to fully understand the following:
|
||||
|
||||
The key carrying the **[C]** (certify) capability is considered the
|
||||
"master" key because it is the only key that can be used to indicate
|
||||
relationship with other keys. Only the **[C]** key can be used to:
|
||||
1. All subkeys are fully independent from each other. If you lose a
|
||||
private subkey, it cannot be restored or recreated from any other
|
||||
private key on your chain.
|
||||
2. With the exception of the Certify key, there can be multiple subkeys
|
||||
with identical capabilities (e.g. you can have 2 valid encryption
|
||||
subkeys, 3 valid signing subkeys, but only one valid certification
|
||||
subkey). All subkeys are fully independent -- a message encrypted to
|
||||
one **[E]** subkey cannot be decrypted with any other **[E]** subkey
|
||||
you may also have.
|
||||
3. A single subkey may have multiple capabilities (e.g. your **[C]** key
|
||||
can also be your **[S]** key).
|
||||
|
||||
The key carrying the **[C]** (certify) capability is the only key that
|
||||
can be used to indicate relationship with other keys. Only the **[C]**
|
||||
key can be used to:
|
||||
|
||||
- add or revoke other keys (subkeys) with S/E/A capabilities
|
||||
- add, change or revoke identities (uids) associated with the key
|
||||
@ -180,7 +179,7 @@ relationship with other keys. Only the **[C]** key can be used to:
|
||||
|
||||
By default, GnuPG creates the following when generating new keys:
|
||||
|
||||
- A master key carrying both Certify and Sign capabilities (**[SC]**)
|
||||
- One subkey carrying both Certify and Sign capabilities (**[SC]**)
|
||||
- A separate subkey with the Encryption capability (**[E]**)
|
||||
|
||||
If you used the default parameters when generating your key, then that
|
||||
@ -192,9 +191,6 @@ for example::
|
||||
uid [ultimate] Alice Dev <adev@kernel.org>
|
||||
ssb rsa2048 2018-01-23 [E] [expires: 2020-01-23]
|
||||
|
||||
Any key carrying the **[C]** capability is your master key, regardless
|
||||
of any other capabilities it may have assigned to it.
|
||||
|
||||
The long line under the ``sec`` entry is your key fingerprint --
|
||||
whenever you see ``[fpr]`` in the examples below, that 40-character
|
||||
string is what it refers to.
|
||||
@ -215,37 +211,30 @@ strong passphrase. To set it or change it, use::
|
||||
Create a separate Signing subkey
|
||||
--------------------------------
|
||||
|
||||
Our goal is to protect your master key by moving it to offline media, so
|
||||
if you only have a combined **[SC]** key, then you should create a separate
|
||||
signing subkey::
|
||||
Our goal is to protect your Certify key by moving it to offline media,
|
||||
so if you only have a combined **[SC]** key, then you should create a
|
||||
separate signing subkey::
|
||||
|
||||
$ gpg --quick-addkey [fpr] ed25519 sign
|
||||
|
||||
Remember to tell the keyservers about this change, so others can pull down
|
||||
your new subkey::
|
||||
|
||||
$ gpg --send-key [fpr]
|
||||
|
||||
.. note:: ECC support in GnuPG
|
||||
|
||||
GnuPG 2.1 and later has full support for Elliptic Curve
|
||||
Cryptography, with ability to combine ECC subkeys with traditional
|
||||
RSA master keys. The main upside of ECC cryptography is that it is
|
||||
much faster computationally and creates much smaller signatures when
|
||||
RSA keys. The main upside of ECC cryptography is that it is much
|
||||
faster computationally and creates much smaller signatures when
|
||||
compared byte for byte with 2048+ bit RSA keys. Unless you plan on
|
||||
using a smartcard device that does not support ECC operations, we
|
||||
recommend that you create an ECC signing subkey for your kernel
|
||||
work.
|
||||
|
||||
If for some reason you prefer to stay with RSA subkeys, just replace
|
||||
"ed25519" with "rsa2048" in the above command. Additionally, if you
|
||||
plan to use a hardware device that does not support ED25519 ECC
|
||||
keys, like Nitrokey Pro or a Yubikey, then you should use
|
||||
"nistp256" instead or "ed25519."
|
||||
Note that if you plan to use a hardware device that does not
|
||||
support ED25519 ECC keys, you should choose "nistp256" instead of
|
||||
"ed25519."
|
||||
|
||||
|
||||
Back up your master key for disaster recovery
|
||||
---------------------------------------------
|
||||
Back up your Certify key for disaster recovery
|
||||
----------------------------------------------
|
||||
|
||||
The more signatures you have on your PGP key from other developers, the
|
||||
more reasons you have to create a backup version that lives on something
|
||||
@ -277,9 +266,7 @@ home, such as your bank vault.
|
||||
Your printer is probably no longer a simple dumb device connected to
|
||||
your parallel port, but since the output is still encrypted with
|
||||
your passphrase, printing out even to "cloud-integrated" modern
|
||||
printers should remain a relatively safe operation. One option is to
|
||||
change the passphrase on your master key immediately after you are
|
||||
done with paperkey.
|
||||
printers should remain a relatively safe operation.
|
||||
|
||||
Back up your whole GnuPG directory
|
||||
----------------------------------
|
||||
@ -300,7 +287,7 @@ will use for backup purposes. You will need to encrypt them using LUKS
|
||||
-- refer to your distro's documentation on how to accomplish this.
|
||||
|
||||
For the encryption passphrase, you can use the same one as on your
|
||||
master key.
|
||||
PGP key.
|
||||
|
||||
Once the encryption process is over, re-insert the USB drive and make
|
||||
sure it gets properly mounted. Copy your entire ``.gnupg`` directory
|
||||
@ -319,7 +306,7 @@ far away, because you'll need to use it every now and again for things
|
||||
like editing identities, adding or revoking subkeys, or signing other
|
||||
people's keys.
|
||||
|
||||
Remove the master key from your homedir
|
||||
Remove the Certify key from your homedir
|
||||
----------------------------------------
|
||||
|
||||
The files in our home directory are not as well protected as we like to
|
||||
@ -334,7 +321,7 @@ think. They can be leaked or stolen via many different means:
|
||||
Protecting your key with a good passphrase greatly helps reduce the risk
|
||||
of any of the above, but passphrases can be discovered via keyloggers,
|
||||
shoulder-surfing, or any number of other means. For this reason, the
|
||||
recommended setup is to remove your master key from your home directory
|
||||
recommended setup is to remove your Certify key from your home directory
|
||||
and store it on offline storage.
|
||||
|
||||
.. warning::
|
||||
@ -343,7 +330,7 @@ and store it on offline storage.
|
||||
your GnuPG directory in its entirety. What we are about to do will
|
||||
render your key useless if you do not have a usable backup!
|
||||
|
||||
First, identify the keygrip of your master key::
|
||||
First, identify the keygrip of your Certify key::
|
||||
|
||||
$ gpg --with-keygrip --list-key [fpr]
|
||||
|
||||
@ -359,7 +346,7 @@ The output will be something like this::
|
||||
Keygrip = 3333000000000000000000000000000000000000
|
||||
|
||||
Find the keygrip entry that is beneath the ``pub`` line (right under the
|
||||
master key fingerprint). This will correspond directly to a file in your
|
||||
Certify key fingerprint). This will correspond directly to a file in your
|
||||
``~/.gnupg`` directory::
|
||||
|
||||
$ cd ~/.gnupg/private-keys-v1.d
|
||||
@ -369,13 +356,13 @@ master key fingerprint). This will correspond directly to a file in your
|
||||
3333000000000000000000000000000000000000.key
|
||||
|
||||
All you have to do is simply remove the .key file that corresponds to
|
||||
the master keygrip::
|
||||
the Certify key keygrip::
|
||||
|
||||
$ cd ~/.gnupg/private-keys-v1.d
|
||||
$ rm 1111000000000000000000000000000000000000.key
|
||||
|
||||
Now, if you issue the ``--list-secret-keys`` command, it will show that
|
||||
the master key is missing (the ``#`` indicates it is not available)::
|
||||
the Certify key is missing (the ``#`` indicates it is not available)::
|
||||
|
||||
$ gpg --list-secret-keys
|
||||
sec# rsa2048 2018-01-24 [SC] [expires: 2020-01-24]
|
||||
@ -404,7 +391,7 @@ file, which still contains your private keys.
|
||||
Move the subkeys to a dedicated crypto device
|
||||
=============================================
|
||||
|
||||
Even though the master key is now safe from being leaked or stolen, the
|
||||
Even though the Certify key is now safe from being leaked or stolen, the
|
||||
subkeys are still in your home directory. Anyone who manages to get
|
||||
their hands on those will be able to decrypt your communication or fake
|
||||
your signatures (if they know the passphrase). Furthermore, each time a
|
||||
@ -447,7 +434,8 @@ functionality. There are several options available:
|
||||
- `Yubikey 5`_: proprietary hardware and software, but cheaper than
|
||||
Nitrokey Pro and comes available in the USB-C form that is more useful
|
||||
with newer laptops. Offers additional security features such as FIDO
|
||||
U2F, among others, and now finally supports ECC keys (NISTP).
|
||||
U2F, among others, and now finally supports NISTP and ED25519 ECC
|
||||
keys.
|
||||
|
||||
`LWN has a good review`_ of some of the above models, as well as several
|
||||
others. Your choice will depend on cost, shipping availability in your
|
||||
@ -460,7 +448,7 @@ geographical region, and open/proprietary hardware considerations.
|
||||
Foundation.
|
||||
|
||||
.. _`Nitrokey Start`: https://shop.nitrokey.com/shop/product/nitrokey-start-6
|
||||
.. _`Nitrokey Pro 2`: https://shop.nitrokey.com/shop/product/nitrokey-pro-2-3
|
||||
.. _`Nitrokey Pro 2`: https://shop.nitrokey.com/shop/product/nkpr2-nitrokey-pro-2-3
|
||||
.. _`Yubikey 5`: https://www.yubico.com/products/yubikey-5-overview/
|
||||
.. _Gnuk: https://www.fsij.org/doc-gnuk/
|
||||
.. _`LWN has a good review`: https://lwn.net/Articles/736231/
|
||||
@ -627,10 +615,10 @@ Other common GnuPG operations
|
||||
Here is a quick reference for some common operations you'll need to do
|
||||
with your PGP key.
|
||||
|
||||
Mounting your master key offline storage
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
Mounting your safe offline storage
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
You will need your master key for any of the operations below, so you
|
||||
You will need your Certify key for any of the operations below, so you
|
||||
will first need to mount your backup offline storage and tell GnuPG to
|
||||
use it::
|
||||
|
||||
@ -644,7 +632,7 @@ your regular home directory location).
|
||||
Extending key expiration date
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
The master key has the default expiration date of 2 years from the date
|
||||
The Certify key has the default expiration date of 2 years from the date
|
||||
of creation. This is done both for security reasons and to make obsolete
|
||||
keys eventually disappear from keyservers.
|
||||
|
||||
@ -685,6 +673,7 @@ remote end.
|
||||
|
||||
.. _`Agent Forwarding over SSH`: https://wiki.gnupg.org/AgentForwarding
|
||||
|
||||
.. _pgp_with_git:
|
||||
|
||||
Using PGP with Git
|
||||
==================
|
||||
@ -828,6 +817,63 @@ You can tell git to always sign commits::
|
||||
|
||||
.. _verify_identities:
|
||||
|
||||
|
||||
How to work with signed patches
|
||||
-------------------------------
|
||||
|
||||
It is possible to use your PGP key to sign patches sent to kernel
|
||||
developer mailing lists. Since existing email signature mechanisms
|
||||
(PGP-Mime or PGP-inline) tend to cause problems with regular code
|
||||
review tasks, you should use the tool kernel.org created for this
|
||||
purpose that puts cryptographic attestation signatures into message
|
||||
headers (a-la DKIM):
|
||||
|
||||
- `Patatt Patch Attestation`_
|
||||
|
||||
.. _`Patatt Patch Attestation`: https://pypi.org/project/patatt/
|
||||
|
||||
Installing and configuring patatt
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Patatt is packaged for many distributions already, so please check there
|
||||
first. You can also install it from pypi using "``pip install patatt``".
|
||||
|
||||
If you already have your PGP key configured with git (via the
|
||||
``user.signingKey`` configuration parameter), then patatt requires no
|
||||
further configuration. You can start signing your patches by installing
|
||||
the git-send-email hook in the repository you want::
|
||||
|
||||
patatt install-hook
|
||||
|
||||
Now any patches you send with ``git send-email`` will be automatically
|
||||
signed with your cryptographic signature.
|
||||
|
||||
Checking patatt signatures
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
If you are using ``b4`` to retrieve and apply patches, then it will
|
||||
automatically attempt to verify all DKIM and patatt signatures it
|
||||
encounters, for example::
|
||||
|
||||
$ b4 am 20220720205013.890942-1-broonie@kernel.org
|
||||
[...]
|
||||
Checking attestation on all messages, may take a moment...
|
||||
---
|
||||
✓ [PATCH v1 1/3] kselftest/arm64: Correct buffer allocation for SVE Z registers
|
||||
✓ [PATCH v1 2/3] arm64/sve: Document our actual ABI for clearing registers on syscall
|
||||
✓ [PATCH v1 3/3] kselftest/arm64: Enforce actual ABI for SVE syscalls
|
||||
---
|
||||
✓ Signed: openpgp/broonie@kernel.org
|
||||
✓ Signed: DKIM/kernel.org
|
||||
|
||||
.. note::
|
||||
|
||||
Patatt and b4 are still in active development and you should check
|
||||
the latest documentation for these projects for any new or updated
|
||||
features.
|
||||
|
||||
.. _kernel_identities:
|
||||
|
||||
How to verify kernel developer identities
|
||||
=========================================
|
||||
|
||||
@ -899,65 +945,17 @@ the new default in GnuPG v2). To set it, add (or modify) the
|
||||
|
||||
trust-model tofu+pgp
|
||||
|
||||
How to use keyservers (more) safely
|
||||
-----------------------------------
|
||||
Using the kernel.org web of trust repository
|
||||
--------------------------------------------
|
||||
|
||||
If you get a "No public key" error when trying to validate someone's
|
||||
tag, then you should attempt to lookup that key using a keyserver. It is
|
||||
important to keep in mind that there is absolutely no guarantee that the
|
||||
key you retrieve from PGP keyservers belongs to the actual person --
|
||||
that much is by design. You are supposed to use the Web of Trust to
|
||||
establish key validity.
|
||||
Kernel.org maintains a git repository with developers' public keys as a
|
||||
replacement for replicating keyserver networks that have gone mostly
|
||||
dark in the past few years. The full documentation for how to set up
|
||||
that repository as your source of public keys can be found here:
|
||||
|
||||
How to properly maintain the Web of Trust is beyond the scope of this
|
||||
document, simply because doing it properly requires both effort and
|
||||
dedication that tends to be beyond the caring threshold of most human
|
||||
beings. Here are some shortcuts that will help you reduce the risk of
|
||||
importing a malicious key.
|
||||
- `Kernel developer PGP Keyring`_
|
||||
|
||||
First, let's say you've tried to run ``git verify-tag`` but it returned
|
||||
an error saying the key is not found::
|
||||
If you are a kernel developer, please consider submitting your key for
|
||||
inclusion into that keyring.
|
||||
|
||||
$ git verify-tag sunxi-fixes-for-4.15-2
|
||||
gpg: Signature made Sun 07 Jan 2018 10:51:55 PM EST
|
||||
gpg: using RSA key DA73759BF8619E484E5A3B47389A54219C0F2430
|
||||
gpg: issuer "wens@...org"
|
||||
gpg: Can't check signature: No public key
|
||||
|
||||
Let's query the keyserver for more info about that key fingerprint (the
|
||||
fingerprint probably belongs to a subkey, so we can't use it directly
|
||||
without finding out the ID of the master key it is associated with)::
|
||||
|
||||
$ gpg --search DA73759BF8619E484E5A3B47389A54219C0F2430
|
||||
gpg: data source: hkp://keys.gnupg.net
|
||||
(1) Chen-Yu Tsai <wens@...org>
|
||||
4096 bit RSA key C94035C21B4F2AEB, created: 2017-03-14, expires: 2019-03-15
|
||||
Keys 1-1 of 1 for "DA73759BF8619E484E5A3B47389A54219C0F2430". Enter number(s), N)ext, or Q)uit > q
|
||||
|
||||
Locate the ID of the master key in the output, in our example
|
||||
``C94035C21B4F2AEB``. Now display the key of Linus Torvalds that you
|
||||
have on your keyring::
|
||||
|
||||
$ gpg --list-key torvalds@kernel.org
|
||||
pub rsa2048 2011-09-20 [SC]
|
||||
ABAF11C65A2970B130ABE3C479BE3E4300411886
|
||||
uid [ unknown] Linus Torvalds <torvalds@kernel.org>
|
||||
sub rsa2048 2011-09-20 [E]
|
||||
|
||||
Next, find a trust path from Linus Torvalds to the key-id you found via ``gpg
|
||||
--search`` of the unknown key. For this, you can use several tools including
|
||||
https://github.com/mricon/wotmate,
|
||||
https://git.kernel.org/pub/scm/docs/kernel/pgpkeys.git/tree/graphs, and
|
||||
https://the.earth.li/~noodles/pathfind.html.
|
||||
|
||||
If you get a few decent trust paths, then it's a pretty good indication
|
||||
that it is a valid key. You can add it to your keyring from the
|
||||
keyserver now::
|
||||
|
||||
$ gpg --recv-key C94035C21B4F2AEB
|
||||
|
||||
This process is not perfect, and you are obviously trusting the
|
||||
administrators of the PGP Pathfinder service to not be malicious (in
|
||||
fact, this goes against :ref:`devs_not_infra`). However, if you
|
||||
do not carefully maintain your own web of trust, then it is a marked
|
||||
improvement over blindly trusting keyservers.
|
||||
.. _`Kernel developer PGP Keyring`: https://korg.docs.kernel.org/pgpkeys.html
|
||||
|
@ -97,6 +97,12 @@ text, like this:
|
||||
|
||||
commit <sha1> upstream.
|
||||
|
||||
or alternatively:
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
[ Upstream commit <sha1> ]
|
||||
|
||||
Additionally, some patches submitted via :ref:`option_1` may have additional
|
||||
patch prerequisites which can be cherry-picked. This can be specified in the
|
||||
following format in the sign-off area:
|
||||
|
@ -715,8 +715,8 @@ references.
|
||||
|
||||
.. _backtraces:
|
||||
|
||||
Backtraces in commit mesages
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
Backtraces in commit messages
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
Backtraces help document the call chain leading to a problem. However,
|
||||
not all backtraces are helpful. For example, early boot call chains are
|
||||
|
19
Documentation/rust/arch-support.rst
Normal file
@ -0,0 +1,19 @@
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
Arch Support
|
||||
============
|
||||
|
||||
Currently, the Rust compiler (``rustc``) uses LLVM for code generation,
|
||||
which limits the supported architectures that can be targeted. In addition,
|
||||
support for building the kernel with LLVM/Clang varies (please see
|
||||
Documentation/kbuild/llvm.rst). This support is needed for ``bindgen``
|
||||
which uses ``libclang``.
|
||||
|
||||
Below is a general summary of architectures that currently work. Level of
|
||||
support corresponds to ``S`` values in the ``MAINTAINERS`` file.
|
||||
|
||||
============ ================ ==============================================
|
||||
Architecture Level of support Constraints
|
||||
============ ================ ==============================================
|
||||
``x86`` Maintained ``x86_64`` only.
|
||||
============ ================ ==============================================
|
216
Documentation/rust/coding-guidelines.rst
Normal file
@ -0,0 +1,216 @@
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
Coding Guidelines
|
||||
=================
|
||||
|
||||
This document describes how to write Rust code in the kernel.
|
||||
|
||||
|
||||
Style & formatting
|
||||
------------------
|
||||
|
||||
The code should be formatted using ``rustfmt``. In this way, a person
|
||||
contributing from time to time to the kernel does not need to learn and
|
||||
remember one more style guide. More importantly, reviewers and maintainers
|
||||
do not need to spend time pointing out style issues anymore, and thus
|
||||
fewer patch roundtrips may be needed to land a change.
|
||||
|
||||
.. note:: Conventions on comments and documentation are not checked by
|
||||
``rustfmt``. Thus, those still need to be taken care of.
|
||||
|
||||
The default settings of ``rustfmt`` are used. This means the idiomatic Rust
|
||||
style is followed. For instance, 4 spaces are used for indentation rather
|
||||
than tabs.
|
||||
|
||||
It is convenient to instruct editors/IDEs to format while typing,
|
||||
when saving or at commit time. However, if for some reason reformatting
|
||||
the entire kernel Rust sources is needed at some point, the following can be
|
||||
run::
|
||||
|
||||
make LLVM=1 rustfmt
|
||||
|
||||
It is also possible to check if everything is formatted (printing a diff
|
||||
otherwise), for instance for a CI, with::
|
||||
|
||||
make LLVM=1 rustfmtcheck
|
||||
|
||||
Like ``clang-format`` for the rest of the kernel, ``rustfmt`` works on
|
||||
individual files, and does not require a kernel configuration. Sometimes it may
|
||||
even work with broken code.
|
||||
|
||||
|
||||
Comments
|
||||
--------
|
||||
|
||||
"Normal" comments (i.e. ``//``, rather than code documentation which starts
|
||||
with ``///`` or ``//!``) are written in Markdown the same way as documentation
|
||||
comments are, even though they will not be rendered. This improves consistency,
|
||||
simplifies the rules and allows moving content between the two kinds of
|
||||
comments more easily. For instance:
|
||||
|
||||
.. code-block:: rust
|
||||
|
||||
// `object` is ready to be handled now.
|
||||
f(object);
|
||||
|
||||
Furthermore, just like documentation, comments are capitalized at the beginning
|
||||
of a sentence and ended with a period (even if it is a single sentence). This
|
||||
includes ``// SAFETY:``, ``// TODO:`` and other "tagged" comments, e.g.:
|
||||
|
||||
.. code-block:: rust
|
||||
|
||||
// FIXME: The error should be handled properly.
|
||||
|
||||
Comments should not be used for documentation purposes: comments are intended
|
||||
for implementation details, not users. This distinction is useful even if the
|
||||
reader of the source file is both an implementor and a user of an API. In fact,
|
||||
sometimes it is useful to use both comments and documentation at the same time.
|
||||
For instance, for a ``TODO`` list or to comment on the documentation itself.
|
||||
For the latter case, comments can be inserted in the middle; that is, closer to
|
||||
the line of documentation to be commented. For any other case, comments are
|
||||
written after the documentation, e.g.:
|
||||
|
||||
.. code-block:: rust

	/// Returns a new [`Foo`].
	///
	/// # Examples
	///
	// TODO: Find a better example.
	/// ```
	/// let foo = f(42);
	/// ```
	// FIXME: Use fallible approach.
	pub fn f(x: i32) -> Foo {
	    // ...
	}
|
||||
|
||||
One special kind of comment is the ``// SAFETY:`` comment. These must appear
|
||||
before every ``unsafe`` block, and they explain why the code inside the block is
|
||||
correct/sound, i.e. why it cannot trigger undefined behavior in any case, e.g.:
|
||||
|
||||
.. code-block:: rust
|
||||
|
||||
// SAFETY: `p` is valid by the safety requirements.
|
||||
unsafe { *p = 0; }
|
||||
|
||||
``// SAFETY:`` comments are not to be confused with the ``# Safety`` sections
|
||||
in code documentation. ``# Safety`` sections specify the contract that callers
|
||||
(for functions) or implementors (for traits) need to abide by. ``// SAFETY:``
|
||||
comments show why a call (for functions) or implementation (for traits) actually
|
||||
respects the preconditions stated in a ``# Safety`` section or the language
|
||||
reference.
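The split between the two can be sketched as follows. This is a minimal illustration, not kernel code: the ``zero`` and ``reset_local`` names are hypothetical and exist only for this example. The ``# Safety`` section states the contract of the unsafe function, and each ``// SAFETY:`` comment explains why a particular ``unsafe`` block upholds it.

```rust
/// Writes zero through the given raw pointer.
///
/// # Safety
///
/// `p` must be non-null, properly aligned and valid for writes.
// Hypothetical helper, used only to illustrate the convention.
pub unsafe fn zero(p: *mut i32) {
    // SAFETY: `p` is valid for writes by the safety requirements of this
    // function.
    unsafe { *p = 0 };
}

/// Returns the value of a local variable after resetting it via [`zero`].
// Hypothetical caller, used only to illustrate the convention.
pub fn reset_local() -> i32 {
    let mut value = 42;

    // SAFETY: The pointer is derived from a live, exclusive reference to
    // `value`, so it is non-null, aligned and valid for writes.
    unsafe { zero(&mut value) };

    value
}
```

The caller's comment refers back to the conditions listed under ``# Safety``, which lets a reviewer check each ``unsafe`` block against a written contract.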
|
||||
|
||||
|
||||
Code documentation
|
||||
------------------
|
||||
|
||||
Rust kernel code is not documented like C kernel code (i.e. via kernel-doc).
|
||||
Instead, the usual system for documenting Rust code is used: the ``rustdoc``
|
||||
tool, which uses Markdown (a lightweight markup language).
|
||||
|
||||
To learn Markdown, there are many guides available out there. For instance,
|
||||
the one at:
|
||||
|
||||
https://commonmark.org/help/
|
||||
|
||||
This is what a well-documented Rust function may look like:
|
||||
|
||||
.. code-block:: rust

	/// Returns the contained [`Some`] value, consuming the `self` value,
	/// without checking that the value is not [`None`].
	///
	/// # Safety
	///
	/// Calling this method on [`None`] is *[undefined behavior]*.
	///
	/// [undefined behavior]: https://doc.rust-lang.org/reference/behavior-considered-undefined.html
	///
	/// # Examples
	///
	/// ```
	/// let x = Some("air");
	/// assert_eq!(unsafe { x.unwrap_unchecked() }, "air");
	/// ```
	pub unsafe fn unwrap_unchecked(self) -> T {
	    match self {
	        Some(val) => val,

	        // SAFETY: The safety contract must be upheld by the caller.
	        None => unsafe { hint::unreachable_unchecked() },
	    }
	}
|
||||
|
||||
This example showcases a few ``rustdoc`` features and some conventions followed
|
||||
in the kernel:
|
||||
|
||||
- The first paragraph must be a single sentence briefly describing what
|
||||
the documented item does. Further explanations must go in extra paragraphs.
|
||||
|
||||
- Unsafe functions must document their safety preconditions under
|
||||
a ``# Safety`` section.
|
||||
|
||||
- While not shown here, if a function may panic, the conditions under which
|
||||
that happens must be described under a ``# Panics`` section.
|
||||
|
||||
Please note that panicking should be very rare and used only with a good
|
||||
reason. In almost all cases, a fallible approach should be used, typically
|
||||
returning a ``Result``.
|
||||
|
||||
- If providing examples of usage would help readers, they must be written in
|
||||
a section called ``# Examples``.
|
||||
|
||||
- Rust items (functions, types, constants...) must be linked appropriately
|
||||
(``rustdoc`` will create a link automatically).
|
||||
|
||||
- Any ``unsafe`` block must be preceded by a ``// SAFETY:`` comment
|
||||
describing why the code inside is sound.
|
||||
|
||||
While sometimes the reason might look trivial and therefore unneeded,
|
||||
writing these comments is not just a good way of documenting what has been
|
||||
taken into account, but most importantly, it provides a way to know that
|
||||
there are no *extra* implicit constraints.
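The ``# Panics`` convention and the preference for a fallible approach, both listed above, can be sketched side by side. This is only an illustration under stated assumptions: ``div``, ``checked_div`` and ``DivisionByZero`` are hypothetical names invented for this example and do not come from the kernel crate.

```rust
/// Error returned by [`checked_div`] when the divisor is zero.
// Hypothetical error type, used only for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub struct DivisionByZero;

/// Divides `dividend` by `divisor`.
///
/// # Panics
///
/// Panics if `divisor` is zero.
// Hypothetical function showing how a panic condition is documented.
pub fn div(dividend: u32, divisor: u32) -> u32 {
    assert!(divisor != 0, "division by zero");
    dividend / divisor
}

/// Divides `dividend` by `divisor`, reporting a zero divisor as an error.
///
/// This is the fallible counterpart of [`div`]: instead of panicking, it
/// returns [`DivisionByZero`] so that the caller decides how to recover.
// Hypothetical function showing the preferred, fallible approach.
pub fn checked_div(dividend: u32, divisor: u32) -> Result<u32, DivisionByZero> {
    if divisor == 0 {
        return Err(DivisionByZero);
    }

    Ok(dividend / divisor)
}
```

In both functions the first documentation paragraph is a single sentence, and the linked items are written as ``[`name`]`` so that ``rustdoc`` can resolve them.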
|
||||
|
||||
To learn more about how to write documentation for Rust and extra features,
|
||||
please take a look at the ``rustdoc`` book at:
|
||||
|
||||
https://doc.rust-lang.org/rustdoc/how-to-write-documentation.html
|
||||
|
||||
|
||||
Naming
|
||||
------
|
||||
|
||||
Rust kernel code follows the usual Rust naming conventions:
|
||||
|
||||
https://rust-lang.github.io/api-guidelines/naming.html
|
||||
|
||||
When existing C concepts (e.g. macros, functions, objects...) are wrapped into
|
||||
a Rust abstraction, a name as close as reasonably possible to the C side should
|
||||
be used in order to avoid confusion and to improve readability when switching
|
||||
back and forth between the C and Rust sides. For instance, macros such as
|
||||
``pr_info`` from C are named the same in the Rust side.
|
||||
|
||||
Having said that, casing should be adjusted to follow the Rust naming
|
||||
conventions, and namespacing introduced by modules and types should not be
|
||||
repeated in the item names. For instance, when wrapping constants like:
|
||||
|
||||
.. code-block:: c
|
||||
|
||||
#define GPIO_LINE_DIRECTION_IN 0
|
||||
#define GPIO_LINE_DIRECTION_OUT 1
|
||||
|
||||
The equivalent in Rust may look like (ignoring documentation):
|
||||
|
||||
.. code-block:: rust

	pub mod gpio {
	    pub enum LineDirection {
	        In = bindings::GPIO_LINE_DIRECTION_IN as _,
	        Out = bindings::GPIO_LINE_DIRECTION_OUT as _,
	    }
	}
|
||||
|
||||
That is, the equivalent of ``GPIO_LINE_DIRECTION_IN`` would be referred to as
|
||||
``gpio::LineDirection::In``. In particular, it should not be named
|
||||
``gpio::gpio_line_direction::GPIO_LINE_DIRECTION_IN``.
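To show how the resulting names read at a call site, here is a self-contained sketch. It makes two assumptions: the nested ``bindings`` module is a hypothetical stand-in for the generated bindings, with the values copied from the C ``#define`` lines above, and ``is_input`` is a hypothetical helper that exists only for this example.

```rust
pub mod gpio {
    /// Stand-in for the generated bindings (hypothetical; it only keeps this
    /// sketch self-contained). The values mirror the C `#define`s.
    mod bindings {
        pub const GPIO_LINE_DIRECTION_IN: u32 = 0;
        pub const GPIO_LINE_DIRECTION_OUT: u32 = 1;
    }

    /// Direction of a GPIO line.
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    pub enum LineDirection {
        In = bindings::GPIO_LINE_DIRECTION_IN as _,
        Out = bindings::GPIO_LINE_DIRECTION_OUT as _,
    }
}

/// Returns whether `direction` configures the line as an input.
// Hypothetical helper, used only to show the names at a call site.
pub fn is_input(direction: gpio::LineDirection) -> bool {
    direction == gpio::LineDirection::In
}
```

The module and the enum carry the ``gpio`` and ``LineDirection`` parts of the name, so the variants stay as short as ``In`` and ``Out`` while the discriminants still match the C values.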
|
79
Documentation/rust/general-information.rst
Normal file
@ -0,0 +1,79 @@
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
General Information
|
||||
===================
|
||||
|
||||
This document contains useful information to know when working with
|
||||
the Rust support in the kernel.
|
||||
|
||||
|
||||
Code documentation
|
||||
------------------
|
||||
|
||||
Rust kernel code is documented using ``rustdoc``, its built-in documentation
|
||||
generator.
|
||||
|
||||
The generated HTML docs include integrated search, linked items (e.g. types,
|
||||
functions, constants), source code, etc. They may be read at (TODO: link when
|
||||
in mainline and generated alongside the rest of the documentation):
|
||||
|
||||
http://kernel.org/
|
||||
|
||||
The docs can also be easily generated and read locally. This is quite fast
|
||||
(same order as compiling the code itself) and no special tools or environment
|
||||
are needed. This has the added advantage that they will be tailored to
|
||||
the particular kernel configuration used. To generate them, use the ``rustdoc``
|
||||
target with the same invocation used for compilation, e.g.::
|
||||
|
||||
make LLVM=1 rustdoc
|
||||
|
||||
To read the docs locally in your web browser, run e.g.::
|
||||
|
||||
xdg-open rust/doc/kernel/index.html
|
||||
|
||||
To learn about how to write the documentation, please see coding-guidelines.rst.
|
||||
|
||||
|
||||
Extra lints
|
||||
-----------
|
||||
|
||||
While ``rustc`` is a very helpful compiler, some extra lints and analyses are
|
||||
available via ``clippy``, a Rust linter. To enable it, pass ``CLIPPY=1`` to
|
||||
the same invocation used for compilation, e.g.::
|
||||
|
||||
make LLVM=1 CLIPPY=1
|
||||
|
||||
Please note that Clippy may change code generation, thus it should not be
|
||||
enabled while building a production kernel.
|
||||
|
||||
|
||||
Abstractions vs. bindings
|
||||
-------------------------
|
||||
|
||||
Abstractions are Rust code wrapping kernel functionality from the C side.
|
||||
|
||||
In order to use functions and types from the C side, bindings are created.
|
||||
Bindings are the declarations for Rust of those functions and types from
|
||||
the C side.
|
||||
|
||||
For instance, one may write a ``Mutex`` abstraction in Rust which wraps
|
||||
a ``struct mutex`` from the C side and calls its functions through the bindings.
|
||||
|
||||
Abstractions are not available for all the kernel internal APIs and concepts,
|
||||
but it is intended that coverage is expanded as time goes on. "Leaf" modules
|
||||
(e.g. drivers) should not use the C bindings directly. Instead, subsystems
|
||||
should provide as-safe-as-possible abstractions as needed.
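As a rough illustration of this split, the sketch below wraps a made-up binding in a safe type. The ``bindings`` module, ``counter`` and ``counter_inc`` are hypothetical stand-ins written in plain Rust so that the example compiles on its own; real bindings are generated declarations of C items and are not reproduced here.

```rust
/// Hypothetical stand-in for generated bindings (not a real kernel API).
#[allow(non_camel_case_types)]
mod bindings {
    /// Stand-in for a C `struct counter`.
    #[repr(C)]
    pub struct counter {
        pub value: u32,
    }

    /// Stand-in for a C function; a real binding would be an
    /// `extern "C"` declaration rather than a Rust body.
    pub unsafe fn counter_inc(c: *mut counter) {
        // SAFETY: the caller guarantees that `c` is valid and not aliased.
        unsafe { (*c).value += 1 };
    }
}

/// Safe abstraction: callers never touch raw pointers or `unsafe`.
pub struct Counter {
    inner: bindings::counter,
}

impl Counter {
    /// Creates a counter starting at zero.
    pub fn new() -> Self {
        Self {
            inner: bindings::counter { value: 0 },
        }
    }

    /// Increments the counter through the (stand-in) binding.
    pub fn inc(&mut self) {
        // SAFETY: `self.inner` is valid and exclusively borrowed here.
        unsafe { bindings::counter_inc(&mut self.inner) };
    }

    /// Returns the current value.
    pub fn value(&self) -> u32 {
        self.inner.value
    }
}
```

A "leaf" user would only call ``Counter::new()``, ``inc()`` and ``value()``; the ``unsafe`` blocks and their safety justification stay inside the abstraction.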
|
||||
|
||||
|
||||
Conditional compilation
|
||||
-----------------------
|
||||
|
||||
Rust code has access to conditional compilation based on the kernel
|
||||
configuration:
|
||||
|
||||
.. code-block:: rust
|
||||
|
||||
#[cfg(CONFIG_X)] // Enabled (`y` or `m`)
|
||||
#[cfg(CONFIG_X="y")] // Enabled as a built-in (`y`)
|
||||
#[cfg(CONFIG_X="m")] // Enabled as a module (`m`)
|
||||
#[cfg(not(CONFIG_X))] // Disabled
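A minimal sketch of how these conditions might be used, assuming a hypothetical ``CONFIG_X`` option. Outside of a kernel build nothing sets ``CONFIG_X``, so a standalone compile takes the "disabled" branch (newer compilers may warn about the unknown ``cfg`` name, hence the ``allow``):

```rust
/// Reports how the hypothetical `CONFIG_X` option was configured.
#[allow(unknown_lints, unexpected_cfgs)]
pub fn config_x_state() -> &'static str {
    if cfg!(CONFIG_X = "y") {
        // Enabled as a built-in.
        "built-in"
    } else if cfg!(CONFIG_X = "m") {
        // Enabled as a module.
        "module"
    } else {
        // Not enabled at all.
        "disabled"
    }
}
```

The ``cfg!`` macro evaluates the same predicates as the ``#[cfg(...)]`` attribute, but as a boolean expression, which keeps every branch type-checked regardless of the configuration.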
|
22
Documentation/rust/index.rst
Normal file
@ -0,0 +1,22 @@
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
Rust
|
||||
====
|
||||
|
||||
Documentation related to Rust within the kernel. To start using Rust
|
||||
in the kernel, please read the quick-start.rst guide.
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 1
|
||||
|
||||
quick-start
|
||||
general-information
|
||||
coding-guidelines
|
||||
arch-support
|
||||
|
||||
.. only:: subproject and html
|
||||
|
||||
Indices
|
||||
=======
|
||||
|
||||
* :ref:`genindex`
|
232
Documentation/rust/quick-start.rst
Normal file
@ -0,0 +1,232 @@
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
Quick Start
|
||||
===========
|
||||
|
||||
This document describes how to get started with kernel development in Rust.
|
||||
|
||||
|
||||
Requirements: Building
|
||||
----------------------
|
||||
|
||||
This section explains how to fetch the tools needed for building.
|
||||
|
||||
Some of these requirements might be available from Linux distributions
|
||||
under names like ``rustc``, ``rust-src``, ``rust-bindgen``, etc. However,
|
||||
at the time of writing, they are likely not to be recent enough unless
|
||||
the distribution tracks the latest releases.
|
||||
|
||||
To easily check whether the requirements are met, the following target
|
||||
can be used::
|
||||
|
||||
make LLVM=1 rustavailable
|
||||
|
||||
This triggers the same logic used by Kconfig to determine whether
|
||||
``RUST_IS_AVAILABLE`` should be enabled; but it also explains why not
|
||||
if that is the case.
|
||||
|
||||
|
||||
rustc
|
||||
*****
|
||||
|
||||
A particular version of the Rust compiler is required. Newer versions may or
|
||||
may not work because, for the moment, the kernel depends on some unstable
|
||||
Rust features.
|
||||
|
||||
If ``rustup`` is being used, enter the checked out source code directory
|
||||
and run::
|
||||
|
||||
rustup override set $(scripts/min-tool-version.sh rustc)
|
||||
|
||||
Otherwise, fetch a standalone installer or install ``rustup`` from:
|
||||
|
||||
https://www.rust-lang.org
|
||||
|
||||
|
||||
Rust standard library source
|
||||
****************************
|
||||
|
||||
The Rust standard library source is required because the build system will
|
||||
cross-compile ``core`` and ``alloc``.
|
||||
|
||||
If ``rustup`` is being used, run::
|
||||
|
||||
rustup component add rust-src
|
||||
|
||||
The components are installed per toolchain, thus upgrading the Rust compiler
|
||||
version later on requires re-adding the component.
|
||||
|
||||
Otherwise, if a standalone installer is used, the Rust repository may be cloned
|
||||
into the installation folder of the toolchain::
|
||||
|
||||
git clone --recurse-submodules \
|
||||
--branch $(scripts/min-tool-version.sh rustc) \
|
||||
https://github.com/rust-lang/rust \
|
||||
$(rustc --print sysroot)/lib/rustlib/src/rust
|
||||
|
||||
In this case, upgrading the Rust compiler version later on requires manually
|
||||
updating this clone.
|
||||
|
||||
|
||||
libclang
|
||||
********
|
||||
|
||||
``libclang`` (part of LLVM) is used by ``bindgen`` to understand the C code
|
||||
in the kernel, which means LLVM needs to be installed; like when the kernel
|
||||
is compiled with ``CC=clang`` or ``LLVM=1``.
|
||||
|
||||
Linux distributions are likely to have a suitable one available, so it is
|
||||
best to check that first.
|
||||
|
||||
There are also some binaries for several systems and architectures uploaded at:
|
||||
|
||||
https://releases.llvm.org/download.html
|
||||
|
||||
Otherwise, building LLVM takes quite a while, but it is not a complex process:
|
||||
|
||||
https://llvm.org/docs/GettingStarted.html#getting-the-source-code-and-building-llvm
|
||||
|
||||
Please see Documentation/kbuild/llvm.rst for more information and further ways
|
||||
to fetch pre-built releases and distribution packages.
|
||||
|
||||
|
||||
bindgen
|
||||
*******
|
||||
|
||||
The bindings to the C side of the kernel are generated at build time using
|
||||
the ``bindgen`` tool. A particular version is required.
|
||||
|
||||
Install it via (note that this will download and build the tool from source)::
|
||||
|
||||
cargo install --locked --version $(scripts/min-tool-version.sh bindgen) bindgen
|
||||
|
||||
|
||||
Requirements: Developing
|
||||
------------------------
|
||||
|
||||
This section explains how to fetch the tools needed for developing. That is,
|
||||
they are not needed when just building the kernel.
|
||||
|
||||
|
||||
rustfmt
|
||||
*******
|
||||
|
||||
The ``rustfmt`` tool is used to automatically format all the Rust kernel code,
|
||||
including the generated C bindings (for details, please see
|
||||
coding-guidelines.rst).
|
||||
|
||||
If ``rustup`` is being used, its ``default`` profile already installs the tool,
|
||||
thus nothing needs to be done. If another profile is being used, the component
|
||||
can be installed manually::
|
||||
|
||||
rustup component add rustfmt
|
||||
|
||||
The standalone installers also come with ``rustfmt``.
|
||||
|
||||
|
||||
clippy
|
||||
******
|
||||
|
||||
``clippy`` is a Rust linter. Running it provides extra warnings for Rust code.
|
||||
It can be run by passing ``CLIPPY=1`` to ``make`` (for details, please see
|
||||
general-information.rst).
|
||||
|
||||
If ``rustup`` is being used, its ``default`` profile already installs the tool,
|
||||
thus nothing needs to be done. If another profile is being used, the component
|
||||
can be installed manually::
|
||||
|
||||
rustup component add clippy
|
||||
|
||||
The standalone installers also come with ``clippy``.
|
||||
|
||||
|
||||
cargo
|
||||
*****
|
||||
|
||||
``cargo`` is the Rust native build system. It is currently required to run
|
||||
the tests since it is used to build a custom standard library that contains
|
||||
the facilities provided by the custom ``alloc`` in the kernel. The tests can
|
||||
be run using the ``rusttest`` Make target.
|
||||
|
||||
If ``rustup`` is being used, all the profiles already install the tool,
|
||||
thus nothing needs to be done.
|
||||
|
||||
The standalone installers also come with ``cargo``.
|
||||
|
||||
|
||||
rustdoc
|
||||
*******
|
||||
|
||||
``rustdoc`` is the documentation tool for Rust. It generates pretty HTML
|
||||
documentation for Rust code (for details, please see
|
||||
general-information.rst).
|
||||
|
||||
``rustdoc`` is also used to test the examples provided in documented Rust code
|
||||
(called doctests or documentation tests). The ``rusttest`` Make target uses
|
||||
this feature.
|
||||
|
||||
If ``rustup`` is being used, all the profiles already install the tool,
|
||||
thus nothing needs to be done.
|
||||
|
||||
The standalone installers also come with ``rustdoc``.
|
||||
|
||||
|
||||
rust-analyzer
|
||||
*************
|
||||
|
||||
The `rust-analyzer <https://rust-analyzer.github.io/>`_ language server can
|
||||
be used with many editors to enable syntax highlighting, completion, go to
|
||||
definition, and other features.
|
||||
|
||||
``rust-analyzer`` needs a configuration file, ``rust-project.json``, which
|
||||
can be generated by the ``rust-analyzer`` Make target.
|
||||
|
||||
|
||||
Configuration
|
||||
-------------
|
||||
|
||||
``Rust support`` (``CONFIG_RUST``) needs to be enabled in the ``General setup``
|
||||
menu. The option is only shown if a suitable Rust toolchain is found (see
|
||||
above), as long as the other requirements are met. In turn, this will make
|
||||
visible the rest of the options that depend on Rust.
|
||||
|
||||
Afterwards, go to::
|
||||
|
||||
Kernel hacking
|
||||
-> Sample kernel code
|
||||
-> Rust samples
|
||||
|
||||
And enable some sample modules either as built-in or as loadable.
|
||||
|
||||
|
||||
Building
|
||||
--------
|
||||
|
||||
Building a kernel with a complete LLVM toolchain is the best supported setup
|
||||
at the moment. That is::
|
||||
|
||||
make LLVM=1
|
||||
|
||||
For architectures that do not support a full LLVM toolchain, use::
|
||||
|
||||
make CC=clang
|
||||
|
||||
Using GCC also works for some configurations, but it is very experimental at
|
||||
the moment.
|
||||
|
||||
|
||||
Hacking
|
||||
-------
|
||||
|
||||
To dive deeper, take a look at the source code of the samples
|
||||
at ``samples/rust/``, the Rust support code under ``rust/`` and
|
||||
the ``Rust hacking`` menu under ``Kernel hacking``.
|
||||
|
||||
If GDB/Binutils is used and Rust symbols are not getting demangled, the reason
|
||||
is that the toolchain does not support Rust's new v0 mangling scheme yet.
|
||||
There are a few ways out:
|
||||
|
||||
- Install a newer release (GDB >= 10.2, Binutils >= 2.36).
|
||||
|
||||
- Some versions of GDB (e.g. vanilla GDB 10.1) are able to use
|
||||
the pre-demangled names embedded in the debug info (``CONFIG_DEBUG_INFO``).
|
@ -94,7 +94,7 @@ other HZ detail. Thus the CFS scheduler has no notion of "timeslices" in the
|
||||
way the previous scheduler had, and has no heuristics whatsoever. There is
|
||||
only one central tunable (you have to switch on CONFIG_SCHED_DEBUG):
|
||||
|
||||
/proc/sys/kernel/sched_min_granularity_ns
|
||||
/sys/kernel/debug/sched/min_granularity_ns
|
||||
|
||||
which can be used to tune the scheduler from "desktop" (i.e., low latencies) to
|
||||
"server" (i.e., good batching) workloads. It defaults to a setting suitable
|
||||
|
@ -7,7 +7,7 @@ Landlock LSM: kernel documentation
|
||||
==================================
|
||||
|
||||
:Author: Mickaël Salaün
|
||||
:Date: May 2022
|
||||
:Date: September 2022
|
||||
|
||||
Landlock's goal is to create scoped access-control (i.e. sandboxing). To
|
||||
harden a whole system, this feature should be available to any process,
|
||||
@ -49,13 +49,13 @@ Filesystem access rights
|
||||
------------------------
|
||||
|
||||
All access rights are tied to an inode and what can be accessed through it.
|
||||
Reading the content of a directory doesn't imply to be allowed to read the
|
||||
Reading the content of a directory does not imply to be allowed to read the
|
||||
content of a listed inode. Indeed, a file name is local to its parent
|
||||
directory, and an inode can be referenced by multiple file names thanks to
|
||||
(hard) links. Being able to unlink a file only has a direct impact on the
|
||||
directory, not the unlinked inode. This is the reason why
|
||||
`LANDLOCK_ACCESS_FS_REMOVE_FILE` or `LANDLOCK_ACCESS_FS_REFER` are not allowed
|
||||
to be tied to files but only to directories.
|
||||
``LANDLOCK_ACCESS_FS_REMOVE_FILE`` or ``LANDLOCK_ACCESS_FS_REFER`` are not
|
||||
allowed to be tied to files but only to directories.
|
||||
|
||||
Tests
|
||||
=====
|
||||
|
@ -14,45 +14,3 @@ Unsorted Documentation
|
||||
static-keys
|
||||
tee
|
||||
xz
|
||||
|
||||
Atomic Types
|
||||
============
|
||||
|
||||
.. raw:: latex
|
||||
|
||||
\footnotesize
|
||||
|
||||
.. include:: ../atomic_t.txt
|
||||
:literal:
|
||||
|
||||
.. raw:: latex
|
||||
|
||||
\normalsize
|
||||
|
||||
Atomic bitops
|
||||
=============
|
||||
|
||||
.. raw:: latex
|
||||
|
||||
\footnotesize
|
||||
|
||||
.. include:: ../atomic_bitops.txt
|
||||
:literal:
|
||||
|
||||
.. raw:: latex
|
||||
|
||||
\normalsize
|
||||
|
||||
Memory Barriers
|
||||
===============
|
||||
|
||||
.. raw:: latex
|
||||
|
||||
\footnotesize
|
||||
|
||||
.. include:: ../memory-barriers.txt
|
||||
:literal:
|
||||
|
||||
.. raw:: latex
|
||||
|
||||
\normalsize
|
||||
|
58
Documentation/subsystem-apis.rst
Normal file
@ -0,0 +1,58 @@
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
==============================
|
||||
Kernel subsystem documentation
|
||||
==============================
|
||||
|
||||
These books get into the details of how specific kernel subsystems work
|
||||
from the point of view of a kernel developer. Much of the information here
|
||||
is taken directly from the kernel source, with supplemental material added
|
||||
as needed (or at least as we managed to add it — probably *not* all that is
|
||||
needed).
|
||||
|
||||
**Fixme**: much more organizational work is needed here.
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 1
|
||||
|
||||
driver-api/index
|
||||
core-api/index
|
||||
locking/index
|
||||
accounting/index
|
||||
block/index
|
||||
cdrom/index
|
||||
cpu-freq/index
|
||||
fb/index
|
||||
fpga/index
|
||||
hid/index
|
||||
i2c/index
|
||||
iio/index
|
||||
isdn/index
|
||||
infiniband/index
|
||||
leds/index
|
||||
netlabel/index
|
||||
networking/index
|
||||
pcmcia/index
|
||||
power/index
|
||||
target/index
|
||||
timers/index
|
||||
spi/index
|
||||
w1/index
|
||||
watchdog/index
|
||||
virt/index
|
||||
input/index
|
||||
hwmon/index
|
||||
gpu/index
|
||||
security/index
|
||||
sound/index
|
||||
crypto/index
|
||||
filesystems/index
|
||||
mm/index
|
||||
bpf/index
|
||||
usb/index
|
||||
PCI/index
|
||||
scsi/index
|
||||
misc-devices/index
|
||||
scheduler/index
|
||||
mhi/index
|
||||
peci/index
|
@ -412,7 +412,7 @@ Extended error information
|
||||
Because the default sort key above is 'hitcount', the above shows a
|
||||
the list of call_sites by increasing hitcount, so that at the bottom
|
||||
we see the functions that made the most kmalloc calls during the
|
||||
run. If instead we we wanted to see the top kmalloc callers in
|
||||
run. If instead we wanted to see the top kmalloc callers in
|
||||
terms of the number of bytes requested rather than the number of
|
||||
calls, and we wanted the top caller to appear at the top, we can use
|
||||
the 'sort' parameter, along with the 'descending' modifier::
|
||||
|
@ -328,8 +328,8 @@ Configuring Kprobes
|
||||
===================
|
||||
|
||||
When configuring the kernel using make menuconfig/xconfig/oldconfig,
|
||||
ensure that CONFIG_KPROBES is set to "y". Under "General setup", look
|
||||
for "Kprobes".
|
||||
ensure that CONFIG_KPROBES is set to "y", look for "Kprobes" under
|
||||
"General architecture-dependent options".
|
||||
|
||||
So that you can load and unload Kprobes-based instrumentation modules,
|
||||
make sure "Loadable module support" (CONFIG_MODULES) and "Module
|
||||
|
@ -20,7 +20,7 @@ For example::
|
||||
[root@f32 ~]# cd /sys/kernel/tracing/
|
||||
[root@f32 tracing]# echo timerlat > current_tracer
|
||||
|
||||
It is possible to follow the trace by reading the trace trace file::
|
||||
It is possible to follow the trace by reading the trace file::
|
||||
|
||||
[root@f32 tracing]# cat trace
|
||||
# tracer: timerlat
|
||||
|
@ -1,39 +0,0 @@
|
||||
Chinese translated version of Documentation/core-api/irq/index.rst
|
||||
|
||||
If you have any comment or update to the content, please contact the
|
||||
original document maintainer directly. However, if you have a problem
|
||||
communicating in English you can also ask the Chinese maintainer for
|
||||
help. Contact the Chinese maintainer if this translation is outdated
|
||||
or if there is a problem with the translation.
|
||||
|
||||
Maintainer: Eric W. Biederman <ebiederman@xmission.com>
|
||||
Chinese maintainer: Fu Wei <tekkamanninja@gmail.com>
|
||||
---------------------------------------------------------------------
|
||||
Documentation/core-api/irq/index.rst 的中文翻译
|
||||
|
||||
如果想评论或更新本文的内容,请直接联系原文档的维护者。如果你使用英文
|
||||
交流有困难的话,也可以向中文版维护者求助。如果本翻译更新不及时或者翻
|
||||
译存在问题,请联系中文版维护者。
|
||||
英文版维护者: Eric W. Biederman <ebiederman@xmission.com>
|
||||
中文版维护者: 傅炜 Fu Wei <tekkamanninja@gmail.com>
|
||||
中文版翻译者: 傅炜 Fu Wei <tekkamanninja@gmail.com>
|
||||
中文版校译者: 傅炜 Fu Wei <tekkamanninja@gmail.com>
|
||||
|
||||
|
||||
以下为正文
|
||||
---------------------------------------------------------------------
|
||||
何为 IRQ?
|
||||
|
||||
一个 IRQ 是来自某个设备的一个中断请求。目前,它们可以来自一个硬件引脚,
|
||||
或来自一个数据包。多个设备可能连接到同个硬件引脚,从而共享一个 IRQ。
|
||||
|
||||
一个 IRQ 编号是用于告知硬件中断源的内核标识。通常情况下,这是一个
|
||||
全局 irq_desc 数组的索引,但是除了在 linux/interrupt.h 中的实现,
|
||||
具体的细节是体系结构特定的。
|
||||
|
||||
一个 IRQ 编号是设备上某个可能的中断源的枚举。通常情况下,枚举的编号是
|
||||
该引脚在系统内中断控制器的所有输入引脚中的编号。对于 ISA 总线中的情况,
|
||||
枚举的是在两个 i8259 中断控制器中 16 个输入引脚。
|
||||
|
||||
架构可以对 IRQ 编号指定额外的含义,在硬件涉及任何手工配置的情况下,
|
||||
是被提倡的。ISA 的 IRQ 是一个分配这类额外含义的典型例子。
|
139
Documentation/translations/zh_CN/PCI/acpi-info.rst
Normal file
@ -0,0 +1,139 @@
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
.. include:: ../disclaimer-zh_CN.rst
|
||||
|
||||
:Original: Documentation/PCI/acpi-info.rst
|
||||
|
||||
:翻译:
|
||||
|
||||
司延腾 Yanteng Si <siyanteng@loongson.cn>
|
||||
|
||||
:校译:
|
||||
|
||||
|
||||
=====================
|
||||
PCI主桥的ACPI注意事项
|
||||
=====================
|
||||
|
||||
一般的规则是,ACPI命名空间应该描述操作系统可能使用的所有东西,除非有其他方法让操作系
|
||||
统找到它[1, 2]。
|
||||
|
||||
例如,没有标准的硬件机制来枚举PCI主桥,所以ACPI命名空间必须描述每个主桥、访问它
|
||||
下面的PCI配置空间的方法、主桥转发到PCI的地址空间窗口(使用_CRS)以及传统的INTx
|
||||
中断的路由(使用_PRT)。
|
||||
|
||||
在主桥下面的PCI设备,通常不需要通过ACPI描述。操作系统可以通过标准的PCI枚举机制来
|
||||
发现它们,使用配置访问来发现和识别设备,并读取和测量它们的BAR。然而,如果ACPI为它们
|
||||
提供电源管理或热插拔功能,或者如果设备有由平台中断控制器连接的INTx中断,需要一个_PRT
|
||||
来描述这些连接,这种情况下ACPI可以描述PCI设备。
|
||||
|
||||
ACPI资源描述是通过ACPI命名空间中设备的_CRS对象完成的[2]。_CRS就像一个通用的PCI BAR:
|
||||
操作系统可以读取_CRS并找出正在消耗的资源,即使它没有该设备的驱动程序[3]。这一点很重要,
|
||||
因为它意味着一个旧的操作系统可以正确地工作,即使是在操作系统不知道的新设备的系统上。新设
|
||||
备可能什么都不做,但操作系统至少可以确保没有资源与它们冲突。
|
||||
|
||||
像MCFG、HPET、ECDT等静态表,不是保留地址空间的机制。静态表是在操作系统在启动初期且在它
|
||||
能够解析ACPI命名空间之前需要知道的东西。如果定义了一个新的表,即使旧的操作系统忽略了这
|
||||
个表,它也需要正常运行。_CRS允许这样做,因为它是通用的,可以被旧的操作系统解析;而静态表
|
||||
则不允许。
|
||||
|
||||
如果操作系统要管理一个通过ACPI描述的不可发现的设备,该设备将有一个特定的_HID/_CID,以
|
||||
告诉操作系统与之绑定的驱动程序,并且_CRS告诉操作系统和驱动程序该设备的寄存器在哪里。
|
||||
|
||||
PCI主桥是PNP0A03或PNP0A08设备。它们的_CRS应该描述它们所消耗的所有地址空间。这包括它
|
||||
们转发到PCI总线上的所有窗口,以及不转发到PCI的主桥本身的寄存器。主桥的寄存器包括次要/下
|
||||
级总线寄存器,决定了桥下面的总线范围,窗口寄存器描述了桥洞,等等。这些都是设备相关的,非
|
||||
架构相关的东西,所以PNP0A03/PNP0A08驱动可以管理它们的唯一方法是通过_PRS/_CRS/_SRS,
|
||||
它包含了特定于设备的细节。主桥寄存器也包括ECAM空间,因为它是由主桥消耗的。
|
||||
|
||||
ACPI定义了一个Consumer/Producer位来区分桥寄存器(“Consumer”下文译作消费者)和
|
||||
桥洞(“Producer”下文译作生产者)[4, 5],但是早期的BIOS没有正确使用这个位。其结果
|
||||
是,目前的ACPI规范只为扩展地址空间描述符定义了消费者/生产者;在旧的QWord/Word/Word地
|
||||
址空间描述符中,该位应该被忽略。因此,操作系统必须假定所有的QWord/Word/Word描述符都是
|
||||
窗口。
|
||||
|
||||
在增加扩展地址空间描述符之前,消费者/生产者的失败意味着没有办法描述PNP0A03/PNP0A08设
|
||||
备本身的桥寄存器。解决办法是在PNP0C02捕捉器中描述桥寄存器(包括ECAM空间)[6]。
|
||||
除了ECAM之外,桥寄存器空间反正是特定于设备的,所以通用的PNP0A03/PNP0A08驱动程
|
||||
序(pci_root.c)没有必要了解它。
|
||||
|
||||
新的架构应该能够在PNP0A03设备中使用“消费者”扩展地址空间描述符,用于桥寄存器,包括
|
||||
ECAM,尽管对[6]的严格解释可能禁止这样做。旧的x86和ia64内核假定所有的地址空间描述
|
||||
符,包括“消费者”扩展地址空间的描述符,都是窗口,所以在这些架构上以这种方式描述桥寄
|
||||
存器是不安全的。
|
||||
|
||||
PNP0C02“主板”设备基本上是万能的。除了“不要将这些资源用于其他用途”之外,没有其他的编
|
||||
程模型。因此,PNP0C02 _CRS应该声明ACPI命名空间中(1)没有被_CRS声明的任何其他设备对
|
||||
象的地址空间,(2)不应该被OS分配给其他东西。
|
||||
|
||||
除非有一个标准的固件接口用于配置访问,例如ia64 SAL接口[7],否则PCIe规范要求使用增强
|
||||
型配置访问方法(ECAM)。主桥消耗ECAM内存地址空间并将内存访问转换为PCI配置访问。该规范
|
||||
定义了ECAM地址空间的布局和功能;只有地址空间的基础是特定于设备的。ACPI操作系统从静态
|
||||
MCFG表或PNP0A03设备中的_CBA方法中了解基础地址。
|
||||
|
||||
MCFG表必须描述非热插拔主桥的ECAM空间[8]。由于MCFG是一个静态表,不能通过热插拔更新,
|
||||
PNP0A03设备中的_CBA方法描述了可热插拔主桥的ECAM空间[9]。请注意,对于MCFG和_CBA,
|
||||
基址总是对应于总线0,即使桥器下面的总线范围(通过_CRS报告)不从0开始。
|
||||
|
||||
|
||||
[1] ACPI 6.2, sec 6.1:
|
||||
对于任何在非枚举类型的总线上的设备(例如,ISA总线),OSPM会枚举设备的标识符,ACPI
|
||||
系统固件必须为每个设备提供一个_HID对象...以使OSPM能够做到这一点。
|
||||
|
||||
[2] ACPI 6.2, sec 3.7:
|
||||
操作系统枚举主板设备时,只需通过读取ACPI命名空间来寻找具有硬件ID的设备。
|
||||
|
||||
ACPI枚举的每个设备都包括ACPI命名空间中ACPI定义的对象,该对象报告设备可能占用的硬
|
||||
件资源[_PRS],报告设备当前使用的资源[_CRS]的对象,以及配置这些资源的对象[_SRS]。
|
||||
这些信息被即插即用操作系统(OSPM)用来配置设备。
|
||||
|
||||
[3] ACPI 6.2, sec 6.2:
|
||||
OSPM使用设备配置对象来配置通过ACPI列举的设备的硬件资源。设备配置对象提供了关于当前
|
||||
和可能的资源需求的信息,共享资源之间的关系,以及配置硬件资源的方法。
|
||||
|
||||
当OSPM枚举一个设备时,它调用_PRS来确定该设备的资源需求。它也可以调用_CRS来找到该设
|
||||
备的当前资源设置。利用这些信息,即插即用系统决定设备应该消耗什么资源,并通过调用设备
|
||||
的_SRS控制方法来设置这些资源。
|
||||
|
||||
在ACPI中,设备可以消耗资源(例如,传统的键盘),提供资源(例如,一个专有的PCI桥),
|
||||
或者两者都做。除非另有规定,设备的资源被假定为来自设备层次结构中设备上方最近的匹配资
|
||||
源。
|
||||
|
||||
[4] ACPI 6.2, sec 6.4.3.5.1, 2, 3, 4:
|
||||
QWord/DWord/Word 地址空间描述符 (.1, .2, .3)
|
||||
常规标志: Bit [0] 被忽略。
|
||||
|
||||
扩展地址空间描述符 (.4)
|
||||
常规标志: Bit [0] 消费者/生产者:
|
||||
|
||||
* 1 – 这个设备消费这个资源
|
||||
* 0 – 该设备生产和消费该资源
|
||||
|
||||
[5] ACPI 6.2, sec 19.6.43:
|
||||
ResourceUsage指定内存范围是由这个设备(ResourceConsumer)消费还是传递给子设备
|
||||
(ResourceProducer)。如果没有指定,那么就假定是ResourceConsumer。
|
||||
|
||||
[6] PCI Firmware 3.2, sec 4.1.2:
|
||||
如果操作系统不能原生的懂得保留MMCFG区域,MMCFG区域必须由固件保留。在MCFG表中或通
|
||||
过_CBA方法(见第4.1.3节)报告的地址范围必须通过声明主板资源来保留。对于大多数系统,
|
||||
主板资源将出现在ACPI命名空间的根部(在_SB下),在一个节点的_HID为EISAID(PNP0C0
|
||||
2),在这种情况下的资源不应该要求在根PCI总线的_CRS。这些资源可以选择在Int15 E820
|
||||
或EFIGetMemoryMap中作为保留内存返回,但必须始终通过ACPI作为主板资源报告。
|
||||
|
||||
[7] PCI Express 4.0, sec 7.2.2:
|
||||
对于PC兼容的系统,或者没有实现允许访问配置空间的处理器架构特定固件接口标准的系统,需
|
||||
要使用本节中定义的ECAM。
|
||||
|
||||
[8] PCI Firmware 3.2, sec 4.1.2:
|
||||
MCFG表是一个ACPI表,用于沟通的基础地址对应的非热的可移动的PCI段组范围内的PCI段组在
|
||||
启动时提供给操作系统。这对PC兼容系统来说是必需的。
|
||||
|
||||
MCFG表仅用于沟通在启动时系统可用的PCI段组对应的基址。
|
||||
|
||||
[9] PCI Firmware 3.2, sec 4.1.3:
|
||||
_CBA (Memory mapped Configuration Base Address) 控制方法是一个可选的ACPI对
|
||||
象,用于返回热插拔主桥的64位内存映射的配置基址。_CBA 返回的基址是与处理器相关的地址。
|
||||
_CBA 控制方法被评估为一个整数。
|
||||
|
||||
这个控制方法出现在主桥对象下。当_CBA方法出现在一个活动的主桥对象下时,操作系统会评
|
||||
估这个结构,以确定内存映射的配置基址,对应于_CRS方法中指定的总线编号范围的PCI段组。
|
||||
一个包含_CBA方法的ACPI命名空间对象也必须包含一个相应的_SEG方法。
|
@ -10,9 +10,6 @@
|
||||
:校译:
|
||||
|
||||
|
||||
|
||||
.. _cn_PCI_index.rst:
|
||||
|
||||
===================
|
||||
Linux PCI总线子系统
|
||||
===================
|
||||
@ -26,12 +23,12 @@ Linux PCI总线子系统
|
||||
pci-iov-howto
|
||||
msi-howto
|
||||
sysfs-pci
|
||||
acpi-info
|
||||
|
||||
|
||||
Todolist:
|
||||
|
||||
acpi-info
|
||||
pci-error-recovery
|
||||
pcieaer-howto
|
||||
endpoint/index
|
||||
boot-interrupts
|
||||
* pci-error-recovery
|
||||
* pcieaer-howto
|
||||
* endpoint/index
|
||||
* boot-interrupts
|
||||
|
@ -6,10 +6,10 @@
|
||||
|
||||
吴想成 Wu XiangCheng <bobwxc@email.cn>
|
||||
|
||||
Linux内核5.x版本 <http://kernel.org/>
|
||||
Linux内核6.x版本 <http://kernel.org/>
|
||||
=========================================
|
||||
|
||||
以下是Linux版本5的发行注记。仔细阅读它们,
|
||||
以下是Linux版本6的发行注记。仔细阅读它们,
|
||||
它们会告诉你这些都是什么,解释如何安装内核,以及遇到问题时该如何做。
|
||||
|
||||
什么是Linux?
|
||||
@ -61,27 +61,27 @@ Linux内核5.x版本 <http://kernel.org/>
|
||||
- 如果您要安装完整的源代码,请把内核tar档案包放在您有权限的目录中(例如您
|
||||
的主目录)并将其解包::
|
||||
|
||||
xz -cd linux-5.x.tar.xz | tar xvf -
|
||||
xz -cd linux-6.x.tar.xz | tar xvf -
|
||||
|
||||
将“X”替换成最新内核的版本号。
|
||||
|
||||
【不要】使用 /usr/src/linux 目录!这里有一组库头文件使用的内核头文件
|
||||
(通常是不完整的)。它们应该与库匹配,而不是被内核的变化搞得一团糟。
|
||||
|
||||
- 您还可以通过打补丁在5.x版本之间升级。补丁以xz格式分发。要通过打补丁进行
|
||||
安装,请获取所有较新的补丁文件,进入内核源代码(linux-5.x)的目录并
|
||||
- 您还可以通过打补丁在6.x版本之间升级。补丁以xz格式分发。要通过打补丁进行
|
||||
安装,请获取所有较新的补丁文件,进入内核源代码(linux-6.x)的目录并
|
||||
执行::
|
||||
|
||||
xz -cd ../patch-5.x.xz | patch -p1
|
||||
xz -cd ../patch-6.x.xz | patch -p1
|
||||
|
||||
请【按顺序】替换所有大于当前源代码树版本的“x”,这样就可以了。您可能想要
|
||||
删除备份文件(文件名类似xxx~ 或 xxx.orig),并确保没有失败的补丁(文件名
|
||||
类似xxx# 或 xxx.rej)。如果有,不是你就是我犯了错误。
|
||||
|
||||
与5.x内核的补丁不同,5.x.y内核(也称为稳定版内核)的补丁不是增量的,而是
|
||||
直接应用于基本的5.x内核。例如,如果您的基本内核是5.0,并且希望应用5.0.3
|
||||
补丁,则不应先应用5.0.1和5.0.2的补丁。类似地,如果您运行的是5.0.2内核,
|
||||
并且希望跳转到5.0.3,那么在应用5.0.3补丁之前,必须首先撤销5.0.2补丁
|
||||
与6.x内核的补丁不同,6.x.y内核(也称为稳定版内核)的补丁不是增量的,而是
|
||||
直接应用于基本的6.x内核。例如,如果您的基本内核是6.0,并且希望应用6.0.3
|
||||
补丁,则不应先应用6.0.1和6.0.2的补丁。类似地,如果您运行的是6.0.2内核,
|
||||
并且希望跳转到6.0.3,那么在应用6.0.3补丁之前,必须首先撤销6.0.2补丁
|
||||
(即patch -R)。更多关于这方面的内容,请阅读
|
||||
:ref:`Documentation/process/applying-patches.rst <applying_patches>` 。
|
||||
|
||||
@ -103,7 +103,7 @@ Linux内核5.x版本 <http://kernel.org/>
|
||||
软件要求
|
||||
---------
|
||||
|
||||
编译和运行5.x内核需要各种软件包的最新版本。请参考
|
||||
编译和运行6.x内核需要各种软件包的最新版本。请参考
|
||||
:ref:`Documentation/process/changes.rst <changes>`
|
||||
来了解最低版本要求以及如何升级软件包。请注意,使用过旧版本的这些包可能会
|
||||
导致很难追踪的间接错误,因此不要以为在生成或操作过程中出现明显问题时可以
|
||||
@ -116,12 +116,12 @@ Linux内核5.x版本 <http://kernel.org/>
|
||||
``make O=output/dir`` 选项可以为输出文件(包括 .config)指定备用位置。
|
||||
例如::
|
||||
|
||||
kernel source code: /usr/src/linux-5.x
|
||||
kernel source code: /usr/src/linux-6.x
|
||||
build directory: /home/name/build/kernel
|
||||
|
||||
要配置和构建内核,请使用::
|
||||
|
||||
cd /usr/src/linux-5.x
|
||||
cd /usr/src/linux-6.x
|
||||
make O=/home/name/build/kernel menuconfig
|
||||
make O=/home/name/build/kernel
|
||||
sudo make O=/home/name/build/kernel modules_install install
|
||||
@ -227,8 +227,6 @@ Linux内核5.x版本 <http://kernel.org/>
|
||||
- 确保您至少有gcc 5.1可用。
|
||||
有关更多信息,请参阅 :ref:`Documentation/process/changes.rst <changes>` 。
|
||||
|
||||
请注意,您仍然可以使用此内核运行a.out用户程序。
|
||||
|
||||
- 执行 ``make`` 来创建压缩内核映像。如果您安装了lilo以适配内核makefile,
|
||||
那么也可以进行 ``make install`` ,但是您可能需要先检查特定的lilo设置。
|
||||
|
||||
@ -282,67 +280,12 @@ Linux内核5.x版本 <http://kernel.org/>
|
||||
若遇到问题
|
||||
-----------
|
||||
|
||||
- 如果您发现了一些可能由于内核缺陷所导致的问题,请检查MAINTAINERS(维护者)
|
||||
文件看看是否有人与令您遇到麻烦的内核部分相关。如果无人在此列出,那么第二
|
||||
个最好的方案就是把它们发给我(torvalds@linux-foundation.org),也可能发送
|
||||
到任何其他相关的邮件列表或新闻组。
|
||||
如果您发现了一些可能由于内核缺陷所导致的问题,请参阅:
|
||||
Documentation/translations/zh_CN/admin-guide/reporting-issues.rst 。
|
||||
|
||||
- 在所有的缺陷报告中,【请】告诉我们您在说什么内核,如何复现问题,以及您的
|
||||
设置是什么的(使用您的常识)。如果问题是新的,请告诉我;如果问题是旧的,
|
||||
请尝试告诉我您什么时候首次注意到它。
|
||||
想要理解内核错误报告,请参阅:
|
||||
Documentation/translations/zh_CN/admin-guide/bug-hunting.rst 。
|
||||
|
||||
- 如果缺陷导致如下消息::
|
||||
|
||||
unable to handle kernel paging request at address C0000010
|
||||
Oops: 0002
|
||||
EIP: 0010:XXXXXXXX
|
||||
eax: xxxxxxxx ebx: xxxxxxxx ecx: xxxxxxxx edx: xxxxxxxx
|
||||
esi: xxxxxxxx edi: xxxxxxxx ebp: xxxxxxxx
|
||||
ds: xxxx es: xxxx fs: xxxx gs: xxxx
|
||||
Pid: xx, process nr: xx
|
||||
xx xx xx xx xx xx xx xx xx xx
|
||||
|
||||
或者类似的内核调试信息显示在屏幕上或在系统日志里,请【如实】复制它。
|
||||
可能对你来说转储(dump)看起来不可理解,但它确实包含可能有助于调试问题的
|
||||
信息。转储上方的文本也很重要:它说明了内核转储代码的原因(在上面的示例中,
|
||||
是由于内核指针错误)。更多关于如何理解转储的信息,请参见
|
||||
Documentation/admin-guide/bug-hunting.rst。
|
||||
|
||||
- 如果使用 CONFIG_KALLSYMS 编译内核,则可以按原样发送转储,否则必须使用
|
||||
``ksymoops`` 程序来理解转储(但通常首选使用CONFIG_KALLSYMS编译)。
|
||||
此实用程序可从
|
||||
https://www.kernel.org/pub/linux/utils/kernel/ksymoops/ 下载。
|
||||
或者,您可以手动执行转储查找:
|
||||
|
||||
- 在调试像上面这样的转储时,如果您可以查找EIP值的含义,这将非常有帮助。
|
||||
十六进制值本身对我或其他任何人都没有太大帮助:它会取决于特定的内核设置。
|
||||
您应该做的是从EIP行获取十六进制值(忽略 ``0010:`` ),然后在内核名字列表
|
||||
中查找它,以查看哪个内核函数包含有问题的地址。
|
||||
|
||||
要找到内核函数名,您需要找到与显示症状的内核相关联的系统二进制文件。就是
|
||||
文件“linux/vmlinux”。要提取名字列表并将其与内核崩溃中的EIP进行匹配,
|
||||
请执行::
|
||||
|
||||
nm vmlinux | sort | less
|
||||
|
||||
这将为您提供一个按升序排序的内核地址列表,从中很容易找到包含有问题的地址
|
||||
的函数。请注意,内核调试消息提供的地址不一定与函数地址完全匹配(事实上,
|
||||
这是不可能的),因此您不能只“grep”列表:不过列表将为您提供每个内核函数
|
||||
的起点,因此通过查找起始地址低于你正在搜索的地址,但后一个函数的高于的
|
||||
函数,你会找到您想要的。实际上,在您的问题报告中加入一些“上下文”可能是
|
||||
一个好主意,给出相关的上下几行。
|
||||
|
||||
如果您由于某些原因无法完成上述操作(如您使用预编译的内核映像或类似的映像),
|
||||
请尽可能多地告诉我您的相关设置信息,这会有所帮助。有关详细信息请阅读
|
||||
‘Documentation/admin-guide/reporting-issues.rst’。
|
||||
|
||||
- 或者,您可以在正在运行的内核上使用gdb(只读的;即不能更改值或设置断点)。
|
||||
为此,请首先使用-g编译内核;适当地编辑arch/x86/Makefile,然后执行 ``make
|
||||
clean`` 。您还需要启用CONFIG_PROC_FS(通过 ``make config`` )。
|
||||
|
||||
使用新内核重新启动后,执行 ``gdb vmlinux /proc/kcore`` 。现在可以使用所有
|
||||
普通的gdb命令。查找系统崩溃点的命令是 ``l *0xXXXXXXXX`` (将xxx替换为EIP
|
||||
值)。
|
||||
|
||||
用gdb无法调试一个当前未运行的内核是由于gdb(错误地)忽略了编译内核的起始
|
||||
偏移量。
|
||||
更多用GDB调试内核的信息,请参阅:
|
||||
Documentation/translations/zh_CN/dev-tools/gdb-kernel-debugging.rst
|
||||
和 Documentation/dev-tools/kgdb.rst 。
|
||||
|
293
Documentation/translations/zh_CN/admin-guide/bootconfig.rst
Normal file
@ -0,0 +1,293 @@
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
.. include:: ../disclaimer-zh_CN.rst
|
||||
|
||||
:Original: Documentation/admin-guide/bootconfig.rst
|
||||
|
||||
:Translator: 吴想成 Wu XiangCheng <bobwxc@email.cn>
|
||||
|
||||
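==================
Boot Configuration
==================

:Author: Masami Hiramatsu <mhiramat@kernel.org>

Overview
========

The boot configuration extends the existing kernel command line to support
additional key-value data when booting the kernel, in a more efficient way.
This allows administrators to pass a configuration file with structured keys.

Config File Syntax
==================

The boot config syntax is a very simple key-value structure. Each key consists
of dot-connected words, and the key and value are connected by ``=``. The value
is terminated by a semicolon (``;``) or a newline (``\n``). For an array value,
the elements are separated by a comma (``,``). ::

    KEY[.WORD[...]] = VALUE[, VALUE2[...]][;]

Unlike the kernel command line syntax, spaces are allowed around the comma and
``=``.

Each key word may contain only letters, digits, hyphens (``-``) and underscores
(``_``). Values may contain printable characters and spaces, except for the
delimiters: semicolon (``;``), newline (``\n``), comma (``,``), hash (``#``)
and closing brace (``}``).

To use those delimiters in a value, enclose it in double quotes (``"VALUE"``)
or single quotes (``'VALUE'``). Note that quotes cannot be escaped.

The value of a key may be empty, or the key may have no value at all. Such keys
are used to check whether the key exists (like a boolean).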
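The flat form of the syntax described above can be sketched as a small parser. This is only an illustration, not the kernel's parser: ``is_valid_key`` and ``parse_line`` are hypothetical names, "letters and digits" is assumed to mean ASCII alphanumerics, and quoting, comments, braces and the other assignment operators are deliberately left out.

```rust
/// Returns true if `key` is made of dot-separated words that contain
/// only ASCII letters, digits, `-` and `_` (ASCII is an assumption).
fn is_valid_key(key: &str) -> bool {
    !key.is_empty()
        && key.split('.').all(|word| {
            !word.is_empty()
                && word
                    .chars()
                    .all(|c| c.is_ascii_alphanumeric() || c == '-' || c == '_')
        })
}

/// Parses one unquoted, comment-free `KEY = VALUE[, VALUE2...][;]` line.
/// Returns `None` when the key is not valid.
fn parse_line(line: &str) -> Option<(String, Vec<String>)> {
    // Drop surrounding whitespace and an optional trailing `;`.
    let line = line.trim().trim_end_matches(';');
    // A key without `=` is allowed and simply has no value.
    let (key, values) = match line.split_once('=') {
        Some((k, v)) => (k.trim(), v),
        None => (line.trim(), ""),
    };
    if !is_valid_key(key) {
        return None;
    }
    // Spaces around `,` are allowed, so trim every array element.
    let values = values
        .split(',')
        .map(str::trim)
        .filter(|v| !v.is_empty())
        .map(String::from)
        .collect();
    Some((key.to_string(), values))
}
```

For instance, ``parse_line("foo.bar = a, b;")`` yields the key ``foo.bar`` with the values ``a`` and ``b``, while a bare ``flag`` yields a key with an empty value list.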
|
||||
|
||||
Key-Value Syntax
----------------

The boot config file syntax allows the user to merge keys that share part of
their name by using braces. For example::

    foo.bar.baz = value1
    foo.bar.qux.quux = value2

can also be written as::

    foo.bar {
        baz = value1
        qux.quux = value2
    }

or, more compactly, as::

    foo.bar { baz = value1; qux.quux = value2 }

In both styles, identical key words are merged automatically when the
configuration is parsed at boot, so similar trees or key-values can be
appended.
|
||||
|
||||
相同关键字的值
|
||||
--------------
|
||||
|
||||
禁止两个或多个值或数组共享同一个关键字。例如::
|
||||
|
||||
foo = bar, baz
|
||||
foo = qux # !错误! 我们不可以重定义相同的关键字
|
||||
|
||||
如果你想要更新值,必须显式使用覆盖操作符 ``:=`` 。例如::
|
||||
|
||||
foo = bar, baz
|
||||
foo := qux
|
||||
|
||||
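The value of the ``foo`` key then becomes ``qux``. This is useful for overriding default values by adding a (partial) custom boot config, without having to parse the default boot config.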
|
||||
|
||||
If you want to append a value to an existing key as an array member, you can use the ``+=`` operator. For example::
|
||||
|
||||
foo = bar, baz
|
||||
foo += qux
|
||||
|
||||
In this case, the ``foo`` key holds ``bar``, ``baz`` and ``qux``.
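The behaviour of the three operators (``=``, ``:=`` and ``+=``) can be modelled in a few lines of Python. This is only a sketch of the semantics described in this section, not the kernel parser; the ``apply`` helper and its dict-based store are hypothetical, and creating a missing key on ``+=`` is an assumption.

```python
def apply(config, key, op, values):
    """Apply one bootconfig-style assignment to a dict of key -> list of values.

    Hypothetical helper for illustration; it mirrors the rules in the text,
    not the kernel's actual parser.
    """
    if op == "=":
        # Plain assignment: re-defining an existing key is an error.
        if key in config:
            raise ValueError(f"key {key!r} is already defined")
        config[key] = list(values)
    elif op == ":=":
        # Override: replace whatever value was there before.
        config[key] = list(values)
    elif op == "+=":
        # Append: extend the existing array (assumed to create the key if missing).
        config.setdefault(key, []).extend(values)
    else:
        raise ValueError(f"unknown operator {op!r}")
    return config


config = {}
apply(config, "foo", "=", ["bar", "baz"])
apply(config, "foo", "+=", ["qux"])    # foo now holds bar, baz and qux
apply(config, "foo", ":=", ["quux"])   # foo now holds only quux
```

Keeping every value as a list makes a single value and an array look the same to the caller, which is why ``+=`` needs no special case here.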
|
||||
|
||||
Moreover, a value and sub-keys can coexist under a parent key. For example, the following config is allowed. ::
|
||||
|
||||
foo = value1
|
||||
foo.bar = value2
|
||||
foo := value3 # This will update foo's value.
|
||||
|
||||
Note that a bare value cannot be placed directly inside a structured key; it has to be defined outside the braces. For example::
|
||||
|
||||
foo {
|
||||
bar = value1
|
||||
bar {
|
||||
baz = value2
|
||||
qux = value3
|
||||
}
|
||||
}
|
||||
|
||||
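Also, the position of the value node under a key is fixed. If a value and sub-keys both exist, the value is always the first child node of the key. Thus if the user specifies the sub-keys first, e.g.::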
|
||||
|
||||
foo.bar = value1
|
||||
foo = value2
|
||||
|
||||
then in the program (and in /proc/bootconfig) it will be shown as below::
|
||||
|
||||
foo = value2
|
||||
foo.bar = value1
|
||||
|
||||
Comments
--------
|
||||
|
||||
The config syntax accepts shell-script style comments. A comment starts with a hash (``#``) and continues until the newline (``\n``).
|
||||
|
||||
::
|
||||
|
||||
# comment line
|
||||
foo = value # value is set to foo.
|
||||
bar = 1, # 1st element
|
||||
2, # 2nd element
|
||||
3 # 3rd element
|
||||
|
||||
This is parsed as below::
|
||||
|
||||
foo = value
|
||||
bar = 1, 2, 3
|
||||
|
||||
Note that you cannot put a comment between a value and its delimiter (``,`` or ``;``). The following config is a syntax error::
|
||||
|
||||
key = 1 # comment
|
||||
,2
|
||||
|
||||
|
||||
/proc/bootconfig
|
||||
================
|
||||
|
||||
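/proc/bootconfig is the user-space interface to the boot config. Unlike /proc/cmdline, this file shows its content as a key-value list. Each key-value pair is shown on one line in the following style::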
|
||||
|
||||
KEY[.WORDS...] = "[VALUE]"[,"VALUE2"...]
|
||||
|
||||
|
||||
Boot Kernel With a Boot Config
==============================
|
||||
|
||||
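There are two ways to boot the kernel with a boot config: attaching the boot config to the initrd image, or embedding it in the kernel itself.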
|
||||
|
||||
*initrd: initial RAM disk*
|
||||
|
||||
Attaching a Boot Config to Initrd
---------------------------------
|
||||
|
||||
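Since the boot configuration file is loaded with the initrd by default, it is appended to the end of the initrd (initramfs) image file, followed by padding, size, checksum and a 12-byte magic word, as below::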
|
||||
|
||||
[initrd][bootconfig][padding][size(le32)][checksum(le32)][#BOOTCONFIG\n]
|
||||
|
||||
The size and checksum fields are unsigned 32-bit little-endian values.
|
||||
|
||||
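When the boot configuration is added to the initrd image, the total file size is aligned to 4 bytes. Null characters (``\0``) are added to fill the gap. Thus ``size`` is the length of the bootconfig file plus the padding bytes.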
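The footer layout can be sketched in Python as follows. This is a simplified model, not the real ``bootconfig`` tool: the function names are hypothetical, and the checksum is assumed to be the plain byte sum of the data modulo 2**32.

```python
import struct

MAGIC = b"#BOOTCONFIG\n"   # the 12-byte magic word from the layout above
ALIGN = 4


def append_bootconfig(initrd: bytes, bootconfig: bytes) -> bytes:
    """Return initrd + bootconfig + padding + size + checksum + magic."""
    footer_len = 4 + 4 + len(MAGIC)
    total = len(initrd) + len(bootconfig) + footer_len
    pad = -total % ALIGN                  # bytes needed to reach 4-byte alignment
    data = bootconfig + b"\0" * pad
    size = len(data)                      # config length plus padding
    # Assumption: the checksum is the byte sum of the data, modulo 2**32.
    checksum = sum(data) & 0xFFFFFFFF
    return initrd + data + struct.pack("<II", size, checksum) + MAGIC


def extract_bootconfig(image: bytes) -> bytes:
    """Read the footer at the end of the image and return the config data."""
    if not image.endswith(MAGIC):
        raise ValueError("no bootconfig magic found")
    footer_start = len(image) - len(MAGIC) - 8
    size, checksum = struct.unpack_from("<II", image, footer_start)
    data = image[footer_start - size:footer_start]
    if sum(data) & 0xFFFFFFFF != checksum:
        raise ValueError("checksum mismatch")
    return data.rstrip(b"\0")             # drop the padding


image = append_bootconfig(b"initrd-bytes", b"foo = bar\n")
```

Because the footer sits at the very end of the file, a reader only needs the total image length to find it, which is why a wrong initrd size from the boot loader makes the data unreachable.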
|
||||
|
||||
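The Linux kernel decodes the last part of the initrd image in memory to get the boot configuration data. Because of this "piggyback" method, there is no need to change or update the boot loader or the kernel image itself, as long as the boot loader passes the correct initrd file size. If the boot loader unexpectedly passes a longer size, the kernel fails to find the bootconfig data.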
|
||||
|
||||
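For this operation, the Linux kernel provides the ``bootconfig`` command under tools/bootconfig, which lets an administrator remove a config file from, or append one to, an initrd image. You can build it with the following command::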
|
||||
|
||||
# make -C tools/bootconfig
|
||||
|
||||
To add your boot config file to an initrd image, run the command below (old data is removed automatically)::
|
||||
|
||||
# tools/bootconfig/bootconfig -a your-config /boot/initrd.img-X.Y.Z
|
||||
|
||||
To remove the config from the image, use the -d option::
|
||||
|
||||
# tools/bootconfig/bootconfig -d /boot/initrd.img-X.Y.Z
|
||||
|
||||
Then add ``bootconfig`` to the kernel command line to tell the kernel to look for the boot config at the end of the initrd file.
|
||||
|
||||
Embedding a Boot Config into Kernel
-----------------------------------
|
||||
|
||||
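If you cannot use an initrd, you can also embed the boot config file in the kernel through Kconfig options. In this case, you need to recompile the kernel with the following options::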
|
||||
|
||||
CONFIG_BOOT_CONFIG_EMBED=y
|
||||
CONFIG_BOOT_CONFIG_EMBED_FILE="/PATH/TO/BOOTCONFIG/FILE"
|
||||
|
||||
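``CONFIG_BOOT_CONFIG_EMBED_FILE`` requires an absolute path, or a path relative to the source tree or object tree, to the boot config file. The kernel embeds it as the default boot config.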
|
||||
|
||||
Just as when attaching the boot config to the initrd, you also need to add ``bootconfig`` to the kernel command line to enable the embedded boot config.
|
||||
|
||||
Note that even if you set this option, you can still override the embedded boot config with another boot config attached to the initrd.
|
||||
|
||||
Kernel Parameters via Boot Config
=================================
|
||||
|
||||
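In addition to the kernel command line, the boot config can be used to pass kernel parameters. All key-value pairs under the ``kernel`` key are passed to the kernel command line directly. Moreover, the key-value pairs under ``init`` are passed to the init process via the command line. The parameters are concatenated with the user-given kernel command line string in the following order, so that command line parameters can override boot config parameters (this depends on how each subsystem handles parameters, but in general an earlier parameter is overwritten by a later one)::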
|
||||
|
||||
[bootconfig params][cmdline params] -- [bootconfig init params][cmdline init params]
|
||||
|
||||
If the boot config file gives the following kernel/init parameters::
|
||||
|
||||
kernel {
|
||||
root = 01234567-89ab-cdef-0123-456789abcd
|
||||
}
|
||||
init {
|
||||
splash
|
||||
}
|
||||
|
||||
this will be copied into the kernel command line string as follows::
|
||||
|
||||
root="01234567-89ab-cdef-0123-456789abcd" -- splash
|
||||
|
||||
If the user gives some other command line such as::
|
||||
|
||||
ro bootconfig -- quiet
|
||||
|
||||
then the final kernel command line will be::
|
||||
|
||||
root="01234567-89ab-cdef-0123-456789abcd" ro bootconfig -- splash quiet
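The concatenation order can be reproduced with a short Python sketch. ``compose_cmdline`` is a hypothetical helper written for this illustration; it assumes parameters contain no embedded spaces and that the user command line uses a single `` -- `` separator.

```python
def compose_cmdline(bc_kernel, bc_init, user_cmdline):
    """Combine bootconfig parameters with the user-given command line.

    bc_kernel / bc_init: parameter strings taken from the "kernel" and "init"
    keys of the boot config; user_cmdline: what the boot loader passed.
    """
    # Split the user command line into its kernel part and its init part.
    kernel_part, sep, init_part = user_cmdline.partition(" -- ")
    # Bootconfig parameters come first, so later user parameters can override them.
    kernel_params = bc_kernel + kernel_part.split()
    init_params = bc_init + (init_part.split() if sep else [])
    line = " ".join(kernel_params)
    if init_params:
        line += " -- " + " ".join(init_params)
    return line


final = compose_cmdline(
    ['root="01234567-89ab-cdef-0123-456789abcd"'],
    ["splash"],
    "ro bootconfig -- quiet",
)
```

With the inputs from the example above, ``final`` matches the final kernel command line shown there.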
|
||||
|
||||
|
||||
Config File Limitation
======================
|
||||
|
||||
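Currently the maximum config size is 32KB and the total number of key words (not key-value entries) must be under 1024 nodes. Note: this is the number of nodes, not entries; an entry consumes at least 2 nodes (a key word and a value). So in theory there can be up to 512 key-value pairs. If keys contain 3 words on average, there can be 256 key-value pairs. In most cases the number of config items will be under 100 entries and smaller than 8KB, so this should be enough. If the node count exceeds 1024, the parser returns an error even if the file size is smaller than 32KB. (Note that this maximum size does not include the padding null characters.) In any case, since the ``bootconfig`` command verifies the boot config when appending it to the initrd image, the user can notice the problem before booting.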
|
||||
|
||||
|
||||
Bootconfig APIs
===============
|
||||
|
||||
Users can query or iterate over key-value pairs. It is also possible to find a root (prefix) key node and then look up key-values under that node.
|
||||
|
||||
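If you have a key string, you can query the value of that key directly with xbc_find_value(). If you want to know which keys exist in the boot config, you can use xbc_for_each_key_value() to iterate over the key-value pairs. Note that you need to use xbc_array_for_each_value() to access the values of an array, e.g.::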
|
||||
|
||||
vnode = NULL;
|
||||
xbc_find_value("key.word", &vnode);
|
||||
if (vnode && xbc_node_is_array(vnode))
|
||||
xbc_array_for_each_value(vnode, value) {
|
||||
printk("%s ", value);
|
||||
}
|
||||
|
||||
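If you want to look up keys that share a prefix string, you can use xbc_find_node() to find the node for that prefix, and then iterate over the keys under the prefix node with xbc_node_for_each_key_value().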
|
||||
|
||||
But the most typical usage is to get a named value or a named array under a prefix, e.g.::
|
||||
|
||||
root = xbc_find_node("key.prefix");
|
||||
value = xbc_node_find_value(root, "option", &vnode);
|
||||
...
|
||||
xbc_node_for_each_array_value(root, "array-option", value, anode) {
|
||||
...
|
||||
}
|
||||
|
||||
This accesses the value of "key.prefix.option" and the array of "key.prefix.array-option".
|
||||
|
||||
Locking is not needed, since the config is read-only after initialization. If you need to modify it, you must copy all data and keys.
|
||||
|
||||
|
||||
Functions and structures
========================
|
||||
|
||||
For the kernel-doc of the related definitions, see:
|
||||
|
||||
- include/linux/bootconfig.h
|
||||
- lib/bootconfig.c
|
@ -63,6 +63,7 @@ Todolist:
|
||||
.. toctree::
|
||||
:maxdepth: 1
|
||||
|
||||
bootconfig
|
||||
clearing-warn-once
|
||||
cpu-load
|
||||
cputopology
|
||||
@ -80,7 +81,6 @@ Todolist:
|
||||
* binderfs
|
||||
* binfmt-misc
|
||||
* blockdev/index
|
||||
* bootconfig
|
||||
* braille-console
|
||||
* btmrvl
|
||||
* cgroup-v1/index
|
||||
|
210
Documentation/translations/zh_CN/core-api/circular-buffers.rst
Normal file
@ -0,0 +1,210 @@
|
||||
.. SPDX-License-Identifier: GPL-2.0+
|
||||
|
||||
.. include:: ../disclaimer-zh_CN.rst
|
||||
|
||||
:Original: Documentation/core-api/circular-buffers.rst
|
||||
|
||||
:Translator:
|
||||
|
||||
周彬彬 Binbin Zhou <zhoubinbin@loongson.cn>
|
||||
|
||||
:Proofreader:
|
||||
|
||||
司延腾 Yanteng Si <siyanteng@loongson.cn>
|
||||
吴想成 Wu Xiangcheng <bobwxc@email.cn>
|
||||
时奎亮 Alex Shi <alexs@kernel.org>
|
||||
|
||||
================
Circular Buffers
================
|
||||
|
||||
:Author: David Howells <dhowells@redhat.com>
:Author: Paul E. McKenney <paulmck@linux.ibm.com>
|
||||
|
||||
|
||||
Linux provides a number of features that can be used to implement circular buffering. There are two sets of such features:
|
||||
|
||||
(1) Convenience functions for determining information about power-of-2 sized buffers.
|
||||
|
||||
(2) Memory barriers that can replace a lock shared between the producer and the consumer of objects in the buffer.
|
||||
|
||||
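To use these facilities, as discussed below, there needs to be just one producer and just one consumer. Multiple producers can be handled by serialising them, and multiple consumers can be handled by serialising them.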
|
||||
|
||||
.. Contents:
|
||||
|
||||
(*) What is a circular buffer?

(*) Measuring power-of-2 buffers.

(*) Using memory barriers with circular buffers.
- The producer.
- The consumer.

(*) Further reading.
|
||||
|
||||
|
||||
|
||||
What is a circular buffer?
==========================
|
||||
|
||||
First of all, what is a circular buffer? A circular buffer is a buffer of fixed, finite size that has two indices:
|
||||
|
||||
(1) A 'head' index - the point at which the producer inserts items into the buffer.
|
||||
|
||||
(2) A 'tail' index - the point at which the consumer finds the next item in the buffer.
|
||||
|
||||
Typically, the buffer is empty when the tail pointer equals the head pointer, and the buffer is full when the head pointer is one less than the tail pointer.
|
||||
|
||||
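The head index is incremented when items are added, and the tail index when items are removed. The tail index should never jump past the head index, and both indices should be wrapped to 0 when they reach the end of the buffer, thus allowing an unlimited amount of data to flow through the buffer.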
|
||||
|
||||
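Typically, all items have the same unit size, but this is not strictly required to use the techniques below. The indices can be increased by more than 1 if multiple items or variable-sized items are to be included in the buffer, provided that neither index overtakes the other. The implementer must be careful, however, as a region more than one unit in size may wrap the end of the buffer and be broken into two segments.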
|
||||
|
||||
Measuring power-of-2 buffers
============================
|
||||
|
||||
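Calculating the occupancy or the remaining capacity of an arbitrarily sized circular buffer is normally a slow operation, requiring a modulus (divide) instruction. However, if the buffer size is a power of 2, a much quicker bitwise-AND instruction can be used instead.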
|
||||
|
||||
Linux provides a set of macros for handling power-of-2 circular buffers. They can be used via::
|
||||
|
||||
#include <linux/circ_buf.h>
|
||||
|
||||
The macros are:
|
||||
|
||||
(#) Measure the remaining capacity of a buffer::
|
||||
|
||||
CIRC_SPACE(head_index, tail_index, buffer_size);
|
||||
|
||||
This returns the amount of space left in the buffer[1] into which items can be inserted.
|
||||
|
||||
|
||||
(#) Measure the maximum consecutive immediate space in a buffer::
|
||||
|
||||
CIRC_SPACE_TO_END(head_index, tail_index, buffer_size);
|
||||
|
||||
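This returns the amount of consecutive space left in the buffer[1] into which items can be inserted immediately without having to wrap back to the beginning of the buffer.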
|
||||
|
||||
|
||||
(#) Measure the occupancy of a buffer::
|
||||
|
||||
CIRC_CNT(head_index, tail_index, buffer_size);
|
||||
|
||||
This returns the number of items currently occupying the buffer[2].
|
||||
|
||||
|
||||
(#) Measure the non-wrapping occupancy of a buffer::
|
||||
|
||||
CIRC_CNT_TO_END(head_index, tail_index, buffer_size);
|
||||
|
||||
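This returns the number of consecutive items[2] that can be extracted from the buffer without having to wrap back to the beginning of the buffer.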
|
||||
|
||||
Each of these macros nominally returns a value between 0 and buffer_size-1, however:
|
||||
|
||||
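(1) CIRC_SPACE*() are intended to be used in the producer. To the producer they return a lower bound, as the producer controls the head index, but the consumer may still be depleting the buffer on another CPU and moving the tail index.

To the consumer it shows an upper bound, as the producer may be busy depleting the space.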
|
||||
|
||||
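(2) CIRC_CNT*() are intended to be used in the consumer. To the consumer they return a lower bound, as the consumer controls the tail index, but the producer may still be filling the buffer on another CPU and moving the head index.

To the producer it shows an upper bound, as the consumer may be busy emptying the buffer.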
|
||||
|
||||
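(3) To a third party, the order in which the writes to the indices by the producer and the consumer become visible cannot be guaranteed, as they are independent and may be made on different CPUs. The result in such a situation is therefore merely a guess, and may even be wrong.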
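The four measurements can be modelled in Python to see how the bitwise-AND arithmetic behaves. The function names below are hypothetical lower-case counterparts of the macros, and the bodies reflect my understanding of the arithmetic in ``linux/circ_buf.h``; treat them as a sketch rather than a copy of the header.

```python
def circ_cnt(head, tail, size):
    """Number of items in the buffer; size must be a power of 2."""
    return (head - tail) & (size - 1)


def circ_space(head, tail, size):
    """Free slots; one slot is always left empty."""
    return circ_cnt(tail, head + 1, size)


def circ_cnt_to_end(head, tail, size):
    """Items that can be read without wrapping past the end of the buffer."""
    end = size - tail
    n = (head + end) & (size - 1)
    return n if n < end else end


def circ_space_to_end(head, tail, size):
    """Free slots that can be written without wrapping past the end."""
    end = size - 1 - head
    n = (end + tail) & (size - 1)
    return n if n <= end else end + 1


# An 8-slot buffer whose occupied region wraps: items sit in slots 6, 7 and 0.
head, tail, size = 1, 6, 8
occupied = circ_cnt(head, tail, size)
free = circ_space(head, tail, size)
```

Note that occupancy plus free space is ``size - 1``, not ``size``, because one slot stays empty to distinguish a full buffer from an empty one.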
|
||||
|
||||
Using memory barriers with circular buffers
===========================================
|
||||
|
||||
By using memory barriers in conjunction with circular buffers, you can avoid the need to:
|
||||
|
||||
(1) use a single lock to govern access to both ends of the buffer, thus allowing the buffer to be filled and emptied at the same time; and
|
||||
|
||||
(2) use atomic counter operations.
|
||||
|
||||
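There are two sides to this: the producer that fills the buffer, and the consumer that empties it. Only one producer should be filling the buffer at any one time, and only one consumer should be emptying it at any one time, but the two sides can operate simultaneously.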
|
||||
|
||||
|
||||
The producer
------------
|
||||
|
||||
The producer will look something like this::
|
||||
|
||||
spin_lock(&producer_lock);
|
||||
|
||||
unsigned long head = buffer->head;
|
||||
/* The spin_unlock() and next spin_lock() provide needed ordering. */
|
||||
unsigned long tail = READ_ONCE(buffer->tail);
|
||||
|
||||
if (CIRC_SPACE(head, tail, buffer->size) >= 1) {
|
||||
/* insert one item into the buffer */
|
||||
struct item *item = buffer[head];
|
||||
|
||||
produce_item(item);
|
||||
|
||||
smp_store_release(&buffer->head,
|
||||
(head + 1) & (buffer->size - 1));
|
||||
|
||||
/* wake_up() will make sure that the head is committed before waking anyone up */
|
||||
wake_up(consumer);
|
||||
}
|
||||
|
||||
spin_unlock(&producer_lock);
|
||||
|
||||
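This instructs the CPU that the contents of the new item must be written before the head index makes it available to the consumer, and that the revised head index must be written before the consumer is woken.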
|
||||
|
||||
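Note that wake_up() does not guarantee any sort of barrier unless something is actually awakened. We therefore cannot rely on it for ordering. However, there is always one element of the array left empty, so the producer must produce two elements before it could possibly corrupt the element currently being read by the consumer. Therefore, the unlock-lock pair between consecutive invocations of the consumer provides the necessary ordering between the read of the index indicating that the consumer has vacated a given element and the write by the producer to that same element.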
|
||||
|
||||
|
||||
The consumer
------------
|
||||
|
||||
The consumer will look something like this::
|
||||
|
||||
spin_lock(&consumer_lock);
|
||||
|
||||
/* Read index before reading contents at that index. */
|
||||
unsigned long head = smp_load_acquire(&buffer->head);
|
||||
unsigned long tail = buffer->tail;
|
||||
|
||||
if (CIRC_CNT(head, tail, buffer->size) >= 1) {
|
||||
|
||||
/* extract one item from the buffer */
|
||||
struct item *item = buffer[tail];
|
||||
|
||||
consume_item(item);
|
||||
|
||||
/* Finish reading descriptor before incrementing tail. */
|
||||
smp_store_release(&buffer->tail,
|
||||
(tail + 1) & (buffer->size - 1));
|
||||
}
|
||||
|
||||
spin_unlock(&consumer_lock);
|
||||
|
||||
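This instructs the CPU to make sure the index is up to date before reading the new item, and then to make sure it has finished reading the item before it writes the new tail pointer, which erases the item.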
|
||||
|
||||
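Note the use of READ_ONCE() and smp_load_acquire() to read the opposing index (the one owned by the other side). This prevents the compiler from discarding and reloading its cached value. It is not strictly needed if you can be sure that the opposing index will only be used once. smp_load_acquire() additionally forces the CPU to order against subsequent memory references. Similarly, smp_store_release() is used in both algorithms to write the thread's own index. This documents the fact that we are writing to something that can be read concurrently, prevents the compiler from tearing the store, and enforces ordering against previous accesses.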
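The index handling of the producer and the consumer can be modelled in Python. This is a single-threaded sketch only: Python has no counterpart to smp_store_release() or smp_load_acquire(), so the comments merely mark where the ordering matters. The ``RingBuffer`` class and its method names are hypothetical.

```python
class RingBuffer:
    """Single-threaded model of a power-of-2 ring buffer (no barriers, no locks)."""

    def __init__(self, size):
        if size < 2 or size & (size - 1):
            raise ValueError("size must be a power of 2")
        self.size = size
        self.items = [None] * size
        self.head = 0   # written only by the producer
        self.tail = 0   # written only by the consumer

    def produce(self, item):
        # Space check: one slot always stays empty, so "full" means head + 1 == tail.
        if ((self.tail - (self.head + 1)) & (self.size - 1)) < 1:
            return False
        self.items[self.head] = item                    # write the item first ...
        self.head = (self.head + 1) & (self.size - 1)   # ... then publish the new head
        return True

    def consume(self):
        # Occupancy check: nothing to read when head == tail.
        if ((self.head - self.tail) & (self.size - 1)) < 1:
            return None
        item = self.items[self.tail]                    # read the item first ...
        self.tail = (self.tail + 1) & (self.size - 1)   # ... then release the slot
        return item


rb = RingBuffer(4)
accepted = [rb.produce(n) for n in range(5)]   # only size - 1 items fit
drained = [rb.consume() for _ in range(4)]     # the last read finds the buffer empty
```

Each side writes only its own index and reads the other, which is the property that lets the kernel version replace a shared lock with acquire/release pairs.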
|
||||
|
||||
|
||||
Further reading
===============
|
||||
|
||||
For a description of Linux's memory barrier facilities, see Documentation/memory-barriers.txt.
|
@ -0,0 +1,23 @@
|
||||
.. SPDX-License-Identifier: GPL-2.0+
|
||||
|
||||
.. include:: ../disclaimer-zh_CN.rst
|
||||
|
||||
:Original: Documentation/core-api/generic-radix-tree.rst
|
||||
|
||||
:Translator:
|
||||
|
||||
周彬彬 Binbin Zhou <zhoubinbin@loongson.cn>
|
||||
|
||||
=================================
Generic radix trees/sparse arrays
=================================
|
||||
|
||||
For details on generic radix trees/sparse arrays, see "DOC: Generic radix trees/sparse arrays" in include/linux/generic-radix-tree.h.
|
||||
|
||||
Generic radix tree functions
----------------------------
|
||||
|
||||
The API is documented in the following kernel code:
|
||||
|
||||
include/linux/generic-radix-tree.h
|
80
Documentation/translations/zh_CN/core-api/idr.rst
Normal file
@ -0,0 +1,80 @@
|
||||
.. SPDX-License-Identifier: GPL-2.0+
|
||||
|
||||
.. include:: ../disclaimer-zh_CN.rst
|
||||
|
||||
:Original: Documentation/core-api/idr.rst
|
||||
|
||||
:Translator:
|
||||
|
||||
周彬彬 Binbin Zhou <zhoubinbin@loongson.cn>
|
||||
|
||||
:Proofreader:
|
||||
|
||||
司延腾 Yanteng Si <siyanteng@loongson.cn>
|
||||
吴想成 Wu Xiangcheng <bobwxc@email.cn>
|
||||
时奎亮 Alex Shi <alexs@kernel.org>
|
||||
|
||||
=============
ID Allocation
=============
|
||||
|
||||
:Author: Matthew Wilcox
|
||||
|
||||
Overview
========
|
||||
|
||||
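A common problem to solve is allocating identifiers (IDs), which are generally numbers that identify a thing. Examples include file descriptors, process IDs, packet identifiers in networking protocols, SCSI tags and device instance numbers. The IDR and the IDA provide a reasonable solution to this problem so that not everybody has to invent their own. The IDR provides the ability to map an ID to a pointer, while the IDA provides only ID allocation and is therefore more memory-efficient.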
|
||||
|
||||
The IDR interface is deprecated; please use the ``XArray`` instead.
|
||||
|
||||
IDR usage
=========
|
||||
|
||||
Start by initialising an IDR, either with DEFINE_IDR() for statically allocated IDRs or with idr_init() for dynamically allocated IDRs.
|
||||
|
||||
You can call idr_alloc() to allocate an unused ID. Look up the pointer associated with that ID by calling idr_find(), and free the ID by calling idr_remove().
|
||||
|
||||
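If you need to change the pointer associated with an ID, you can call idr_replace(). One common reason to do this is to reserve an ID by passing a ``NULL`` pointer to the allocation function, initialise the object with the reserved ID, and finally insert the initialised object into the IDR.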
|
||||
|
||||
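Some users need to allocate IDs larger than ``INT_MAX``. So far all of these users have been content with a ``UINT_MAX`` limit, and they use idr_alloc_u32(). If you need IDs that do not fit in a u32, we will work with you to address your needs.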
|
||||
|
||||
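If you need to allocate IDs sequentially, you can use idr_alloc_cyclic(). The IDR becomes less efficient when dealing with larger IDs, so using this function comes at a slight cost.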
|
||||
|
||||
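To perform an action on all pointers used by the IDR, you can use either the callback-based idr_for_each() or the iterator-style idr_for_each_entry(). You may need to use idr_for_each_entry_continue() to continue an iteration. You can also use idr_get_next() if the iterator does not fit your needs.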
|
||||
|
||||
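When you have finished using an IDR, you can call idr_destroy() to release the memory used by the IDR. This does not free the objects pointed to from the IDR; if you want to do that, use one of the iterators.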
|
||||
|
||||
You can use idr_is_empty() to find out whether any IDs are currently allocated.
|
||||
|
||||
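If you need to hold a lock while allocating a new ID from the IDR, you may need to pass a restrictive set of GFP flags, which can lead to the IDR being unable to allocate memory. To work around this, you can call idr_preload() before taking the lock, and then idr_preload_end() after the allocation.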
|
||||
|
||||
For details on IDR synchronisation, see "DOC: idr sync" in include/linux/idr.h.
|
||||
|
||||
IDA usage
=========
|
||||
|
||||
For details on IDA usage, see "DOC: IDA description" in lib/idr.c.
|
||||
|
||||
Functions and structures
========================
|
||||
|
||||
The API is documented in the following kernel code:
|
||||
|
||||
include/linux/idr.h
|
||||
|
||||
lib/idr.c
|
@ -44,15 +44,15 @@
|
||||
assoc_array
|
||||
xarray
|
||||
rbtree
|
||||
idr
|
||||
circular-buffers
|
||||
generic-radix-tree
|
||||
packing
|
||||
|
||||
Todolist:
|
||||
|
||||
|
||||
|
||||
idr
|
||||
circular-buffers
|
||||
generic-radix-tree
|
||||
packing
|
||||
this_cpu_ops
|
||||
timekeeping
|
||||
errseq
|
||||
|
160
Documentation/translations/zh_CN/core-api/packing.rst
Normal file
@ -0,0 +1,160 @@
|
||||
.. SPDX-License-Identifier: GPL-2.0+
|
||||
|
||||
.. include:: ../disclaimer-zh_CN.rst
|
||||
|
||||
:Original: Documentation/core-api/packing.rst
|
||||
|
||||
:Translator:
|
||||
|
||||
周彬彬 Binbin Zhou <zhoubinbin@loongson.cn>
|
||||
|
||||
:Proofreader:
|
||||
|
||||
司延腾 Yanteng Si <siyanteng@loongson.cn>
|
||||
吴想成 Wu Xiangcheng <bobwxc@email.cn>
|
||||
时奎亮 Alex Shi <alexs@kernel.org>
|
||||
|
||||
================================================
Generic bitfield packing and unpacking functions
================================================
|
||||
|
||||
Problem statement
-----------------
|
||||
|
||||
When working with hardware, one has to choose between several approaches of interfacing with it.
|
||||
|
||||
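One can memory-map a pointer to a carefully crafted struct over the hardware device's memory region, and access its fields as struct members (potentially declared as bitfields). But writing code this way makes it less portable, due to potential endianness mismatches between the CPU and the hardware device.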
|
||||
|
||||
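Additionally, one has to pay close attention when translating register definitions from the hardware documentation into bitfield indices for the structs. Also, some hardware (typically networking equipment) tends to group its register fields in ways that violate any reasonable word boundaries (sometimes even 64-bit ones). This creates the inconvenience of having to define "high" and "low" portions of register fields within the struct.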
|
||||
|
||||
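A more robust alternative to struct field definitions is to extract the required fields by shifting the appropriate number of bits. But this still does not protect against endianness mismatches, unless all memory accesses are performed byte by byte. Also, the code can easily get cluttered, and the high-level idea might get lost among the many bit shifts required.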
|
||||
|
||||
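Many drivers take the bit-shifting approach and then attempt to reduce the clutter with tailored macros, but more often than not the shortcuts these macros take still prevent the code from being truly portable.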
|
||||
|
||||
The solution
------------
|
||||
|
||||
This API deals with 2 basic operations:
|
||||
|
||||
- Packing a CPU-usable number into a memory buffer (with hardware constraints/quirks).
- Unpacking a memory buffer (with hardware constraints/quirks) into a CPU-usable number.
|
||||
|
||||
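The API offers an abstraction over said hardware constraints and quirks and over CPU endianness, and therefore over possible mismatches between the two.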
|
||||
|
||||
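The basic unit of these API functions is the u64. From the CPU's perspective, bit 63 always means bit offset 7 of byte 7, albeit only logically. The question is: where do we lay this bit out in memory?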
|
||||
|
||||
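The following examples cover the memory layout of a packed u64 field. The byte offsets in the packed buffer are always implicitly 0, 1, ... 7. What the examples show is where the logical bytes and bits sit.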
|
||||
|
||||
1. Normally (no quirks), we would do it like this:
|
||||
|
||||
::
|
||||
|
||||
63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32
|
||||
7 6 5 4
|
||||
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
|
||||
3 2 1 0
|
||||
|
||||
That is, the MSByte (7) of the CPU-usable u64 sits at memory offset 0, and the LSByte (0) of the u64 sits at memory offset 7.
|
||||
|
||||
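This corresponds to what most people would regard as "big endian", where bit i corresponds to the number 2^i. This is also referred to in the code comments as "logical" notation.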
|
||||
|
||||
|
||||
2. If QUIRK_MSB_ON_THE_RIGHT is set, we do it like this:
|
||||
|
||||
::
|
||||
|
||||
56 57 58 59 60 61 62 63 48 49 50 51 52 53 54 55 40 41 42 43 44 45 46 47 32 33 34 35 36 37 38 39
|
||||
7 6 5 4
|
||||
24 25 26 27 28 29 30 31 16 17 18 19 20 21 22 23 8 9 10 11 12 13 14 15 0 1 2 3 4 5 6 7
|
||||
3 2 1 0
|
||||
|
||||
That is, QUIRK_MSB_ON_THE_RIGHT does not affect byte positioning, but inverts bit offsets inside a byte.
|
||||
|
||||
|
||||
3. If QUIRK_LITTLE_ENDIAN is set, we do it like this:
|
||||
|
||||
::
|
||||
|
||||
39 38 37 36 35 34 33 32 47 46 45 44 43 42 41 40 55 54 53 52 51 50 49 48 63 62 61 60 59 58 57 56
|
||||
4 5 6 7
|
||||
7 6 5 4 3 2 1 0 15 14 13 12 11 10 9 8 23 22 21 20 19 18 17 16 31 30 29 28 27 26 25 24
|
||||
0 1 2 3
|
||||
|
||||
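Therefore, QUIRK_LITTLE_ENDIAN means that inside the memory region, every byte of each 4-byte word is placed at its mirrored position relative to the boundary of that word.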
|
||||
|
||||
|
||||
4. If QUIRK_MSB_ON_THE_RIGHT and QUIRK_LITTLE_ENDIAN are both set, we do it like this:
|
||||
|
||||
::
|
||||
|
||||
32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63
|
||||
4 5 6 7
|
||||
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
|
||||
0 1 2 3
|
||||
|
||||
|
||||
5. If just QUIRK_LSW32_IS_FIRST is set, we do it like this:
|
||||
|
||||
::
|
||||
|
||||
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
|
||||
3 2 1 0
|
||||
63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32
|
||||
7 6 5 4
|
||||
|
||||
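In this case the 8-byte memory region is interpreted as follows: the first 4 bytes correspond to the least significant 4-byte word, the next 4 bytes to the more significant 4-byte word.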
|
||||
|
||||
6. If QUIRK_LSW32_IS_FIRST and QUIRK_MSB_ON_THE_RIGHT are set, we do it like this:
|
||||
|
||||
::
|
||||
|
||||
24 25 26 27 28 29 30 31 16 17 18 19 20 21 22 23 8 9 10 11 12 13 14 15 0 1 2 3 4 5 6 7
|
||||
3 2 1 0
|
||||
56 57 58 59 60 61 62 63 48 49 50 51 52 53 54 55 40 41 42 43 44 45 46 47 32 33 34 35 36 37 38 39
|
||||
7 6 5 4
|
||||
|
||||
|
||||
7. If QUIRK_LSW32_IS_FIRST and QUIRK_LITTLE_ENDIAN are set, it looks like this:
|
||||
|
||||
::
|
||||
|
||||
7 6 5 4 3 2 1 0 15 14 13 12 11 10 9 8 23 22 21 20 19 18 17 16 31 30 29 28 27 26 25 24
|
||||
0 1 2 3
|
||||
39 38 37 36 35 34 33 32 47 46 45 44 43 42 41 40 55 54 53 52 51 50 49 48 63 62 61 60 59 58 57 56
|
||||
4 5 6 7
|
||||
|
||||
|
||||
8. If QUIRK_LSW32_IS_FIRST, QUIRK_LITTLE_ENDIAN and QUIRK_MSB_ON_THE_RIGHT are set, it looks like this:
|
||||
|
||||
::
|
||||
|
||||
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
|
||||
0 1 2 3
|
||||
32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63
|
||||
4 5 6 7
|
||||
|
||||
|
||||
We always think of our offsets as if there were no quirk, and we translate them afterwards, before accessing the memory region.
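The eight layouts above follow from three independent transformations, which can be sketched in Python. ``locate_bit`` is a hypothetical helper for illustration and the flag values are chosen for this sketch; it is derived from the diagrams in this section, not from the kernel implementation.

```python
# Flag values are illustrative; only their one-hot nature matters here.
QUIRK_MSB_ON_THE_RIGHT = 1 << 0
QUIRK_LITTLE_ENDIAN = 1 << 1
QUIRK_LSW32_IS_FIRST = 1 << 2


def locate_bit(logical_bit, quirks=0):
    """Map a logical bit (0..63) of a u64 to (memory byte offset, bit within byte).

    Bit 7 within a byte is the leftmost position in the diagrams above.
    """
    byte, bit = divmod(logical_bit, 8)
    offset = 7 - byte                 # no quirks: logical byte 7 sits at offset 0
    if quirks & QUIRK_LITTLE_ENDIAN:
        offset ^= 3                   # mirror the byte inside its 4-byte word
    if quirks & QUIRK_LSW32_IS_FIRST:
        offset ^= 4                   # swap the two 4-byte words
    if quirks & QUIRK_MSB_ON_THE_RIGHT:
        bit = 7 - bit                 # reverse the bit order inside the byte
    return offset, bit


ALL_QUIRKS = QUIRK_MSB_ON_THE_RIGHT | QUIRK_LITTLE_ENDIAN | QUIRK_LSW32_IS_FIRST
top_left = {
    "none": locate_bit(63),
    "msb_right": locate_bit(56, QUIRK_MSB_ON_THE_RIGHT),
    "little_endian": locate_bit(39, QUIRK_LITTLE_ENDIAN),
    "lsw32_first": locate_bit(31, QUIRK_LSW32_IS_FIRST),
    "all": locate_bit(0, ALL_QUIRKS),
}
```

Each entry of ``top_left`` names the logical bit drawn in the top-left corner of the matching diagram, so every one of them maps to byte offset 0, leftmost bit.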
|
||||
|
||||
Intended use
------------
|
||||
|
||||
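Drivers that opt to use this API first need to identify which combination of the above 3 quirks (8 combinations in total) matches what the hardware documentation describes. Then they should wrap the packing() function, creating a new xxx_packing() that calls it with the proper set of QUIRK_* one-hot bits.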
|
||||
|
||||
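The packing() function returns an int-encoded error code, which protects the programmer against incorrect API use. These errors are not expected to occur at runtime, so it is reasonable for xxx_packing() to return void and simply swallow them. Optionally it can dump the stack or print the error description.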
|
37
Documentation/translations/zh_CN/devicetree/changesets.rst
Normal file
@ -0,0 +1,37 @@
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
.. include:: ../disclaimer-zh_CN.rst
|
||||
|
||||
:Original: Documentation/devicetree/changesets.rst
|
||||
|
||||
:Translator:
|
||||
|
||||
司延腾 Yanteng Si <siyanteng@loongson.cn>
|
||||
|
||||
:Proofreader:
|
||||
|
||||
|
||||
=====================
Devicetree Changesets
=====================
|
||||
|
||||
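A Devicetree changeset is a method which allows one to apply changes to the live tree in such a way that either the full set of changes is applied, or none of them are. If an error occurs partway through applying the changeset, the tree is rolled back to its previous state. A changeset can also be removed after it has been applied.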
|
||||
|
||||
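When a changeset is applied, all of the changes are applied to the tree at once, before the OF_RECONFIG notifiers are emitted. This is so that the receiver sees a complete and consistent state of the tree when it receives the notifier.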
|
||||
|
||||
The sequence of a changeset is as follows.
|
||||
|
||||
1. of_changeset_init() - initializes a changeset.
|
||||
|
||||
2. A number of DT tree change calls, of_changeset_attach_node(), of_changeset_detach_node(), of_changeset_add_property(), of_changeset_remove_property(), of_changeset_update_property(), to prepare a set of changes. No changes to the active tree are made at this point. All the change operations are recorded in the of_changeset `entries` list.
|
||||
|
||||
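3. of_changeset_apply() - applies the changes to the tree. Either the entire changeset is applied, or, if there is an error, the tree is restored to its previous state. The core ensures proper serialization through locking. An unlocked version, __of_changeset_apply, is available if needed.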
|
||||
|
||||
If a successfully applied changeset needs to be removed, it can be done with of_changeset_revert().
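The record-then-apply sequence and its all-or-nothing behaviour can be illustrated with a toy Python model. The ``Changeset`` class below is hypothetical and works on a plain dict of properties; it restores a snapshot on failure, which is a simplification and is not claimed to be how the kernel implements rollback.

```python
class Changeset:
    """Toy model of an all-or-nothing changeset over a dict-based tree."""

    def __init__(self):
        self.entries = []          # recorded operations; nothing is applied yet

    def add_property(self, name, value):
        self.entries.append(("add", name, value))

    def remove_property(self, name):
        self.entries.append(("remove", name, None))

    def apply(self, tree):
        snapshot = dict(tree)      # remember the previous state for rollback
        try:
            for action, name, value in self.entries:
                if action == "add":
                    if name in tree:
                        raise KeyError(f"{name} already exists")
                    tree[name] = value
                else:
                    del tree[name]  # raises KeyError if the property is missing
        except KeyError:
            tree.clear()
            tree.update(snapshot)  # roll the tree back to its previous state
            return False
        return True


tree = {"compatible": "corp,foo"}

good = Changeset()
good.add_property("status", "okay")

bad = Changeset()
bad.add_property("label", "x")
bad.remove_property("missing")     # this entry fails, so "label" must not stick

ok = good.apply(tree)
failed = bad.apply(tree)
```

After both calls the tree contains the change from ``good`` only; the partially applied ``bad`` changeset leaves no trace.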
|
@ -0,0 +1,31 @@
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
.. include:: ../disclaimer-zh_CN.rst
|
||||
|
||||
:Original: Documentation/devicetree/dynamic-resolution-notes.rst
|
||||
|
||||
:Translator:
|
||||
|
||||
司延腾 Yanteng Si <siyanteng@loongson.cn>
|
||||
|
||||
:Proofreader:
|
||||
|
||||
=================================
Devicetree Dynamic Resolver Notes
=================================
|
||||
|
||||
This document describes the implementation of the in-kernel DeviceTree resolver, residing in drivers/of/resolver.c.
|
||||
|
||||
How does the resolver work?
---------------------------
|
||||
|
||||
The resolver is given as input an arbitrary tree compiled with the proper dtc option and having a /plugin/ tag. This generates the appropriate __fixups__ and __local_fixups__ nodes.
|
||||
|
||||
The resolver works through the following steps in sequence:
|
||||
|
||||
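1. Get the maximum device tree phandle value from the live tree, plus 1.
2. Adjust all the local phandles of the tree to be resolved by that amount.
3. Using the __local_fixups__ node information, adjust all local references by the same amount.
4. For each property in the __fixups__ node, locate the node it references in the live tree. This is the label used to tag the node.
5. Retrieve the phandle of the target of the fixup.
6. For each fixup in the property, locate the node:property:offset location and replace it with the phandle value.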
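Steps 1 to 3 amount to shifting every local phandle, and every local reference to one, by the same offset. A minimal Python sketch follows; the function name and the dict-based inputs are hypothetical stand-ins for the real tree structures.

```python
def resolve_local_phandles(live_phandles, overlay_phandles, overlay_refs):
    """Shift an overlay's phandles past the live tree's maximum (steps 1 to 3).

    live_phandles:    phandle values already used in the live tree
    overlay_phandles: node name -> local phandle in the overlay
    overlay_refs:     property name -> local phandle that it refers to
    """
    # Step 1: the adjustment is the largest live phandle plus one.
    delta = max(live_phandles, default=0) + 1
    # Step 2: move every local phandle out of the range used by the live tree.
    phandles = {node: p + delta for node, p in overlay_phandles.items()}
    # Step 3: adjust local references by the same amount so they still match.
    refs = {prop: p + delta for prop, p in overlay_refs.items()}
    return phandles, refs


phandles, refs = resolve_local_phandles(
    live_phandles=[1, 2, 5],
    overlay_phandles={"bar": 1, "baz": 2},
    overlay_refs={"bar/clocks": 2},
)
```

Since both maps move by the same offset, a reference that pointed at ``baz`` before the adjustment still points at ``baz`` afterwards, and no adjusted phandle can collide with one in the live tree.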
|
@ -24,21 +24,16 @@ Open Firmware and Devicetree
|
||||
|
||||
usage-model
|
||||
of_unittest
|
||||
|
||||
Todolist:
|
||||
|
||||
* kernel-api
|
||||
kernel-api
|
||||
|
||||
Devicetree Overlays
|
||||
===================
|
||||
.. toctree::
|
||||
:maxdepth: 1
|
||||
|
||||
Todolist:
|
||||
|
||||
* changesets
|
||||
* dynamic-resolution-notes
|
||||
* overlay-notes
|
||||
changesets
|
||||
dynamic-resolution-notes
|
||||
overlay-notes
|
||||
|
||||
Devicetree Bindings
|
||||
===================
|
||||
|
58
Documentation/translations/zh_CN/devicetree/kernel-api.rst
Normal file
@ -0,0 +1,58 @@
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
.. include:: ../disclaimer-zh_CN.rst
|
||||
|
||||
:Original: Documentation/devicetree/kernel-api.rst
|
||||
|
||||
:Translator:
|
||||
|
||||
司延腾 Yanteng Si <siyanteng@loongson.cn>
|
||||
|
||||
:Proofreader:
|
||||
|
||||
|
||||
=====================
DeviceTree Kernel API
=====================
|
||||
|
||||
Core functions
--------------
|
||||
|
||||
The API is documented in the following kernel code:
|
||||
|
||||
drivers/of/base.c
|
||||
|
||||
include/linux/of.h
|
||||
|
||||
drivers/of/property.c
|
||||
|
||||
include/linux/of_graph.h
|
||||
|
||||
drivers/of/address.c
|
||||
|
||||
drivers/of/irq.c
|
||||
|
||||
drivers/of/fdt.c
|
||||
|
||||
Driver model functions
----------------------
|
||||
|
||||
The API is documented in the following kernel code:
|
||||
|
||||
include/linux/of_device.h
|
||||
|
||||
drivers/of/device.c
|
||||
|
||||
include/linux/of_platform.h
|
||||
|
||||
drivers/of/platform.c
|
||||
|
||||
Overlay and Dynamic DT functions
--------------------------------
|
||||
|
||||
The API is documented in the following kernel code:
|
||||
|
||||
drivers/of/resolver.c
|
||||
|
||||
drivers/of/dynamic.c
|
||||
|
||||
drivers/of/overlay.c
|
140
Documentation/translations/zh_CN/devicetree/overlay-notes.rst
Normal file
@ -0,0 +1,140 @@
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
.. include:: ../disclaimer-zh_CN.rst
|
||||
|
||||
:Original: Documentation/devicetree/overlay-notes.rst
|
||||
|
||||
:Translator:
|
||||
|
||||
司延腾 Yanteng Si <siyanteng@loongson.cn>
|
||||
|
||||
:Proofreader:
|
||||
|
||||
========================
Devicetree Overlay Notes
========================
|
||||
|
||||
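This document describes the implementation of the in-kernel device tree overlay functionality residing in drivers/of/overlay.c, and is a companion document to Documentation/devicetree/dynamic-resolution-notes.rst[1].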
|
||||
|
||||
How overlays work
-----------------
|
||||
|
||||
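The purpose of a Devicetree overlay is to modify the kernel's live tree, and to have the modification affect the state of the kernel in a way that reflects the changes. Since the kernel mainly deals with devices, any new device node that results in an active device should have that device created, while if a device node is either disabled or removed altogether, the affected device should be deregistered.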
|
||||
|
||||
Let's take an example where we have a foo board with the following base tree::
|
||||
|
||||
---- foo.dts ---------------------------------------------------------------
|
||||
/* FOO platform */
|
||||
/dts-v1/;
|
||||
/ {
|
||||
compatible = "corp,foo";
|
||||
|
||||
/* shared resources */
|
||||
res: res {
|
||||
};
|
||||
|
||||
/* On chip peripherals */
|
||||
ocp: ocp {
|
||||
/* peripherals that are always instantiated */
|
||||
peripheral1 { ... };
|
||||
};
|
||||
};
|
||||
---- foo.dts ---------------------------------------------------------------
|
||||
|
||||
The overlay bar.dts,
|
||||
::
|
||||
|
||||
---- bar.dts - overlay target location by label ----------------------------
|
||||
/dts-v1/;
|
||||
/plugin/;
|
||||
&ocp {
|
||||
/* bar peripheral */
|
||||
bar {
|
||||
compatible = "corp,bar";
|
||||
... /* various properties and child nodes */
|
||||
};
|
||||
};
|
||||
---- bar.dts ---------------------------------------------------------------
|
||||
|
||||
when loaded (and resolved as described in [1]) should result in foo+bar.dts::
|
||||
|
||||
---- foo+bar.dts -----------------------------------------------------------
|
||||
/* FOO platform + bar peripheral */
|
||||
/ {
|
||||
compatible = "corp,foo";
|
||||
|
||||
/* shared resources */
|
||||
res: res {
|
||||
};
|
||||
|
||||
/* On chip peripherals */
|
||||
ocp: ocp {
|
||||
/* peripherals that are always instantiated */
|
||||
peripheral1 { ... };
|
||||
|
||||
/* bar peripheral */
|
||||
bar {
|
||||
compatible = "corp,bar";
|
||||
... /* various properties and child nodes */
|
||||
};
|
||||
};
|
||||
};
|
||||
---- foo+bar.dts -----------------------------------------------------------
|
||||
|
||||
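As a result of the overlay, a new device node (bar) has been created, so a bar platform device will be registered, and if a matching device driver is loaded the device will be created as expected.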
|
||||
|
||||
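If the base DT was not compiled with the -@ option, the "&ocp" label will not be available to resolve the overlay node(s) to the proper location in the base DT. In this case, the target path can be provided. The target-location-by-label syntax is preferred, because the overlay can then be applied to any base DT containing the label, no matter where the label occurs in the DT.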
|
||||
|
||||
The above bar.dts example, modified to use the target path syntax, is::
|
||||
|
||||
---- bar.dts - overlay target location by explicit path --------------------
|
||||
/dts-v1/;
|
||||
/plugin/;
|
||||
&{/ocp} {
|
||||
/* bar peripheral */
|
||||
bar {
|
||||
compatible = "corp,bar";
|
||||
... /* various properties and child nodes */
|
||||
};
|
||||
};
|
||||
---- bar.dts ---------------------------------------------------------------
|
||||
|
||||
|
||||
Overlay in-kernel API
---------------------
|
||||
|
||||
The API is quite easy to use.
|
||||
|
||||
1) Call of_overlay_fdt_apply() to create and apply an overlay changeset. The return value is an error or a cookie identifying this overlay.
|
||||
|
||||
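2) Call of_overlay_remove() to remove and clean up the overlay changeset previously created via the call to of_overlay_fdt_apply(). Removing an overlay changeset that has another overlay stacked on top of it is not permitted.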
|
||||
|
||||
Finally, if you need to remove all overlays in one go, just call of_overlay_remove_all(), which removes every single one in the correct order.
|
||||
|
||||
You can optionally register notifiers that get called on overlay operations. See of_overlay_notifier_register/unregister and enum of_overlay_notify_action for details.
|
||||
|
||||
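A notifier callback for OF_OVERLAY_PRE_APPLY, OF_OVERLAY_POST_APPLY or OF_OVERLAY_PRE_REMOVE may store pointers to a device tree node in the overlay or to its content, but these pointers must not persist past the notifier callback for OF_OVERLAY_POST_REMOVE. The memory containing the overlay will be kfree()ed after the OF_OVERLAY_POST_REMOVE notifiers are called. Note that the memory will be kfree()ed even if the notifier for OF_OVERLAY_POST_REMOVE returns an error.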
|
||||
|
||||
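The changeset notifiers in drivers/of/dynamic.c are a second type of notifier that can be triggered by applying or removing an overlay. These notifiers are not allowed to store pointers to a device tree node in the overlay or to its content. The overlay code does not protect against such pointers remaining in use when the memory containing the overlay is freed as a result of removing the overlay.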
|
||||
|
||||
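Any other code that retains a pointer to the overlay nodes or data is considered to be a bug, because after the overlay is removed the pointer will refer to freed memory.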
|
||||
|
||||
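Users of overlays must be especially aware of the overall operations that occur on the system, to ensure that other kernel code does not retain any pointers to the overlay nodes or data. An example of inadvertent use of such pointers is a driver or subsystem module that is loaded after an overlay has been applied and that scans the entire devicetree, or a large portion of it, including the overlay nodes.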
|
69
Documentation/translations/zh_CN/driver-api/gpio/index.rst
Normal file
@ -0,0 +1,69 @@
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
.. include:: ../../disclaimer-zh_CN.rst
|
||||
|
||||
:Original: Documentation/driver-api/gpio/index.rst
|
||||
|
||||
:Translator:
|
||||
|
||||
司延腾 Yanteng Si <siyanteng@loongson.cn>
|
||||
|
||||
:Proofreader:
|
||||
|
||||
===================================
General Purpose Input/Output (GPIO)
===================================
|
||||
|
||||
Contents:
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
|
||||
legacy
|
||||
|
||||
Todolist:
|
||||
|
||||
* intro
|
||||
* using-gpio
|
||||
* driver
|
||||
* consumer
|
||||
* board
|
||||
* drivers-on-gpio
|
||||
* bt8xxgpio
|
||||
|
||||
Core
====
|
||||
|
||||
The API is documented in the following kernel code:
|
||||
|
||||
include/linux/gpio/driver.h
|
||||
|
||||
drivers/gpio/gpiolib.c
|
||||
|
||||
ACPI support
============
|
||||
|
||||
The API is documented in the following kernel code:
|
||||
|
||||
drivers/gpio/gpiolib-acpi.c
|
||||
|
||||
Device tree support
===================
|
||||
|
||||
The API is documented in the following kernel code:
|
||||
|
||||
drivers/gpio/gpiolib-of.c
|
||||
|
||||
Device-managed API
==================
|
||||
|
||||
The API is documented in the following kernel code:
|
||||
|
||||
drivers/gpio/gpiolib-devres.c
|
||||
|
||||
sysfs helpers
=============
|
||||
|
||||
The API is documented in the following kernel code:
|
||||
|
||||
drivers/gpio/gpiolib-sysfs.c
|
Some files were not shown because too many files have changed in this diff