Merge branch 'master' into mm-hotfixes-stable
commit 44f10dbefd
@ -254,6 +254,7 @@ ForEachMacros:
|
||||
- 'for_each_free_mem_range'
|
||||
- 'for_each_free_mem_range_reverse'
|
||||
- 'for_each_func_rsrc'
|
||||
- 'for_each_group_device'
|
||||
- 'for_each_group_evsel'
|
||||
- 'for_each_group_member'
|
||||
- 'for_each_hstate'
|
||||
|
1
.gitattributes
vendored
@ -2,3 +2,4 @@
|
||||
*.[ch] diff=cpp
|
||||
*.dts diff=dts
|
||||
*.dts[io] diff=dts
|
||||
*.rs diff=rust
|
||||
|
3
.mailmap
@ -183,6 +183,8 @@ Henrik Rydberg <rydberg@bitmath.org>
|
||||
Herbert Xu <herbert@gondor.apana.org.au>
|
||||
Huacai Chen <chenhuacai@kernel.org> <chenhc@lemote.com>
|
||||
Huacai Chen <chenhuacai@kernel.org> <chenhuacai@loongson.cn>
|
||||
J. Bruce Fields <bfields@fieldses.org> <bfields@redhat.com>
|
||||
J. Bruce Fields <bfields@fieldses.org> <bfields@citi.umich.edu>
|
||||
Jacob Shin <Jacob.Shin@amd.com>
|
||||
Jaegeuk Kim <jaegeuk@kernel.org> <jaegeuk@google.com>
|
||||
Jaegeuk Kim <jaegeuk@kernel.org> <jaegeuk.kim@samsung.com>
|
||||
@ -330,6 +332,7 @@ Mauro Carvalho Chehab <mchehab@kernel.org> <m.chehab@samsung.com>
|
||||
Mauro Carvalho Chehab <mchehab@kernel.org> <mchehab@s-opensource.com>
|
||||
Maxim Mikityanskiy <maxtram95@gmail.com> <maximmi@mellanox.com>
|
||||
Maxim Mikityanskiy <maxtram95@gmail.com> <maximmi@nvidia.com>
|
||||
Maxime Ripard <mripard@kernel.org> <maxime@cerno.tech>
|
||||
Maxime Ripard <mripard@kernel.org> <maxime.ripard@bootlin.com>
|
||||
Maxime Ripard <mripard@kernel.org> <maxime.ripard@free-electrons.com>
|
||||
Mayuresh Janorkar <mayur@ti.com>
|
||||
|
6
CREDITS
@ -383,6 +383,12 @@ E: tomas@nocrew.org
|
||||
W: http://tomas.nocrew.org/
|
||||
D: dsp56k device driver
|
||||
|
||||
N: Srivatsa S. Bhat
|
||||
E: srivatsa@csail.mit.edu
|
||||
D: Maintainer of Generic Paravirt-Ops subsystem
|
||||
D: Maintainer of VMware hypervisor interface
|
||||
D: Maintainer of VMware virtual PTP clock driver (ptp_vmw)
|
||||
|
||||
N: Ross Biro
|
||||
E: ross.biro@gmail.com
|
||||
D: Original author of the Linux networking code
|
||||
|
@ -13,6 +13,11 @@ Description:
|
||||
Specifies the duration of the LED blink in milliseconds.
|
||||
Defaults to 50 ms.
|
||||
|
||||
With hw_control ON, the interval value MUST be set to the
|
||||
default value and cannot be changed.
|
||||
Trying to set any value in this specific mode will return
|
||||
an EINVAL error.
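As a rough illustration of the rule above, a user-space helper could check hw_control before touching interval and treat EINVAL as "hardware owns the blink rate". This is only a sketch: the function name `set_blink_interval` is hypothetical, and `led_dir` is assumed to be a directory such as /sys/class/leds/<led> that exposes the `interval` and (on newer kernels) `hw_control` attributes described in this file.

```python
import errno
from pathlib import Path


def set_blink_interval(led_dir, interval_ms):
    """Try to set the blink interval; return False if hardware control forbids it.

    Hypothetical helper: led_dir is assumed to be /sys/class/leds/<led>.
    """
    led = Path(led_dir)
    hw_control = led / "hw_control"
    # hw_control may be missing on kernels that predate the attribute.
    if hw_control.exists() and hw_control.read_text().strip() == "1":
        return False
    try:
        (led / "interval").write_text(f"{interval_ms}\n")
    except OSError as exc:
        # The mode may have switched to hardware control after the check.
        if exc.errno == errno.EINVAL:
            return False
        raise
    return True
```

Checking hw_control first avoids a write that is documented to fail, while the EINVAL handler covers the case where the mode changes between the check and the write.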
|
||||
|
||||
What: /sys/class/leds/<led>/link
|
||||
Date: Dec 2017
|
||||
KernelVersion: 4.16
|
||||
@ -39,6 +44,9 @@ Description:
|
||||
If set to 1, the LED will blink for the milliseconds specified
|
||||
in interval to signal transmission.
|
||||
|
||||
With hw_control ON, the blink interval is controlled by hardware
|
||||
and won't reflect the value set in interval.
|
||||
|
||||
What: /sys/class/leds/<led>/rx
|
||||
Date: Dec 2017
|
||||
KernelVersion: 4.16
|
||||
@ -50,3 +58,84 @@ Description:
|
||||
|
||||
If set to 1, the LED will blink for the milliseconds specified
|
||||
in interval to signal reception.
|
||||
|
||||
With hw_control ON, the blink interval is controlled by hardware
|
||||
and won't reflect the value set in interval.
|
||||
|
||||
What: /sys/class/leds/<led>/hw_control
|
||||
Date: Jun 2023
|
||||
KernelVersion: 6.5
|
||||
Contact: linux-leds@vger.kernel.org
|
||||
Description:
|
||||
Communicate whether the LED trigger modes are driven by hardware
|
||||
or software fallback is used.
|
||||
|
||||
If 0, the LED is using software fallback to blink.
|
||||
|
||||
If 1, the LED is using hardware control to blink and signal the
|
||||
requested modes.
|
||||
|
||||
What: /sys/class/leds/<led>/link_10
|
||||
Date: Jun 2023
|
||||
KernelVersion: 6.5
|
||||
Contact: linux-leds@vger.kernel.org
|
||||
Description:
|
||||
Signal the link speed state of 10Mbps of the named network device.
|
||||
|
||||
If set to 0 (default), the LED's normal state is off.
|
||||
|
||||
If set to 1, the LED's normal state reflects the link state
|
||||
speed of 10Mbps of the named network device.
|
||||
Setting this value also immediately changes the LED state.
|
||||
|
||||
What: /sys/class/leds/<led>/link_100
|
||||
Date: Jun 2023
|
||||
KernelVersion: 6.5
|
||||
Contact: linux-leds@vger.kernel.org
|
||||
Description:
|
||||
Signal the link speed state of 100Mbps of the named network device.
|
||||
|
||||
If set to 0 (default), the LED's normal state is off.
|
||||
|
||||
If set to 1, the LED's normal state reflects the link state
|
||||
speed of 100Mbps of the named network device.
|
||||
Setting this value also immediately changes the LED state.
|
||||
|
||||
What: /sys/class/leds/<led>/link_1000
|
||||
Date: Jun 2023
|
||||
KernelVersion: 6.5
|
||||
Contact: linux-leds@vger.kernel.org
|
||||
Description:
|
||||
Signal the link speed state of 1000Mbps of the named network device.
|
||||
|
||||
If set to 0 (default), the LED's normal state is off.
|
||||
|
||||
If set to 1, the LED's normal state reflects the link state
|
||||
speed of 1000Mbps of the named network device.
|
||||
Setting this value also immediately changes the LED state.
|
||||
|
||||
What: /sys/class/leds/<led>/half_duplex
|
||||
Date: Jun 2023
|
||||
KernelVersion: 6.5
|
||||
Contact: linux-leds@vger.kernel.org
|
||||
Description:
|
||||
Signal the link half duplex state of the named network device.
|
||||
|
||||
If set to 0 (default), the LED's normal state is off.
|
||||
|
||||
If set to 1, the LED's normal state reflects the link half
|
||||
duplex state of the named network device.
|
||||
Setting this value also immediately changes the LED state.
|
||||
|
||||
What: /sys/class/leds/<led>/full_duplex
|
||||
Date: Jun 2023
|
||||
KernelVersion: 6.5
|
||||
Contact: linux-leds@vger.kernel.org
|
||||
Description:
|
||||
Signal the link full duplex state of the named network device.
|
||||
|
||||
If set to 0 (default), the LED's normal state is off.
|
||||
|
||||
If set to 1, the LED's normal state reflects the link full
|
||||
duplex state of the named network device.
|
||||
Setting this value also immediately changes the LED state.
|
||||
|
@ -670,7 +670,7 @@ Description: Preferred MTE tag checking mode
|
||||
"async" Prefer asynchronous mode
|
||||
================ ==============================================
|
||||
|
||||
See also: Documentation/arm64/memory-tagging-extension.rst
|
||||
See also: Documentation/arch/arm64/memory-tagging-extension.rst
|
||||
|
||||
What: /sys/devices/system/cpu/nohz_full
|
||||
Date: Apr 2015
|
||||
|
@ -2071,41 +2071,7 @@ call.
|
||||
|
||||
Because RCU avoids interrupting idle CPUs, it is illegal to execute an
|
||||
RCU read-side critical section on an idle CPU. (Kernels built with
|
||||
``CONFIG_PROVE_RCU=y`` will splat if you try it.) The RCU_NONIDLE()
|
||||
macro and ``_rcuidle`` event tracing is provided to work around this
|
||||
restriction. In addition, rcu_is_watching() may be used to test
|
||||
whether or not it is currently legal to run RCU read-side critical
|
||||
sections on this CPU. I learned of the need for diagnostics on the one
|
||||
hand and RCU_NONIDLE() on the other while inspecting idle-loop code.
|
||||
Steven Rostedt supplied ``_rcuidle`` event tracing, which is used quite
|
||||
heavily in the idle loop. However, there are some restrictions on the
|
||||
code placed within RCU_NONIDLE():
|
||||
|
||||
#. Blocking is prohibited. In practice, this is not a serious
|
||||
restriction given that idle tasks are prohibited from blocking to
|
||||
begin with.
|
||||
#. Although nesting RCU_NONIDLE() is permitted, they cannot nest
|
||||
indefinitely deeply. However, given that they can be nested on the
|
||||
order of a million deep, even on 32-bit systems, this should not be a
|
||||
serious restriction. This nesting limit would probably be reached
|
||||
long after the compiler OOMed or the stack overflowed.
|
||||
#. Any code path that enters RCU_NONIDLE() must sequence out of that
|
||||
same RCU_NONIDLE(). For example, the following is grossly
|
||||
illegal:
|
||||
|
||||
::
|
||||
|
||||
1 RCU_NONIDLE({
|
||||
2 do_something();
|
||||
3 goto bad_idea; /* BUG!!! */
|
||||
4 do_something_else();});
|
||||
5 bad_idea:
|
||||
|
||||
|
||||
It is just as illegal to transfer control into the middle of
|
||||
RCU_NONIDLE()'s argument. Yes, in theory, you could transfer in
|
||||
as long as you also transferred out, but in practice you could also
|
||||
expect to get sharply worded review comments.
|
||||
``CONFIG_PROVE_RCU=y`` will splat if you try it.)
|
||||
|
||||
It is similarly socially unacceptable to interrupt an ``nohz_full`` CPU
|
||||
running in userspace. RCU must therefore track ``nohz_full`` userspace
|
||||
|
@ -1117,7 +1117,6 @@ All: lockdep-checked RCU utility APIs::
|
||||
|
||||
RCU_LOCKDEP_WARN
|
||||
rcu_sleep_check
|
||||
RCU_NONIDLE
|
||||
|
||||
All: Unchecked RCU-protected pointer access::
|
||||
|
||||
|
@ -508,9 +508,6 @@ cache_miss_collisions
|
||||
cache miss, but raced with a write and data was already present (usually 0
|
||||
since the synchronization for cache misses was rewritten)
|
||||
|
||||
cache_readaheads
|
||||
Count of times readahead occurred.
|
||||
|
||||
Sysfs - cache set
|
||||
~~~~~~~~~~~~~~~~~
|
||||
|
||||
|
@ -297,7 +297,7 @@ Lock order is as follows::
|
||||
|
||||
Page lock (PG_locked bit of page->flags)
|
||||
mm->page_table_lock or split pte_lock
|
||||
lock_page_memcg (memcg->move_lock)
|
||||
folio_memcg_lock (memcg->move_lock)
|
||||
mapping->i_pages lock
|
||||
lruvec->lru_lock.
|
||||
|
||||
|
@ -1580,6 +1580,13 @@ PAGE_SIZE multiple when read back.
|
||||
|
||||
Healthy workloads are not expected to reach this limit.
|
||||
|
||||
memory.swap.peak
|
||||
A read-only single value file which exists on non-root
|
||||
cgroups.
|
||||
|
||||
The max swap usage recorded for the cgroup and its
|
||||
descendants since the creation of the cgroup.
|
||||
|
||||
memory.swap.max
|
||||
A read-write single value file which exists on non-root
|
||||
cgroups. The default is "max".
|
||||
@ -2022,31 +2029,33 @@ that attribute:
|
||||
no-change
|
||||
Do not modify the I/O priority class.
|
||||
|
||||
none-to-rt
|
||||
For requests that do not have an I/O priority class (NONE),
|
||||
change the I/O priority class into RT. Do not modify
|
||||
the I/O priority class of other requests.
|
||||
promote-to-rt
|
||||
For requests that have a non-RT I/O priority class, change it into RT.
|
||||
Also change the priority level of these requests to 4. Do not modify
|
||||
the I/O priority of requests that have priority class RT.
|
||||
|
||||
restrict-to-be
|
||||
For requests that do not have an I/O priority class or that have I/O
|
||||
priority class RT, change it into BE. Do not modify the I/O priority
|
||||
class of requests that have priority class IDLE.
|
||||
priority class RT, change it into BE. Also change the priority level
|
||||
of these requests to 0. Do not modify the I/O priority class of
|
||||
requests that have priority class IDLE.
|
||||
|
||||
idle
|
||||
Change the I/O priority class of all requests into IDLE, the lowest
|
||||
I/O priority class.
|
||||
|
||||
none-to-rt
|
||||
Deprecated. Just an alias for promote-to-rt.
|
||||
|
||||
The following numerical values are associated with the I/O priority policies:
|
||||
|
||||
+-------------+---+
| no-change   | 0 |
+-------------+---+
| none-to-rt  | 1 |
+-------------+---+
| rt-to-be    | 2 |
+-------------+---+
| all-to-idle | 3 |
+-------------+---+

+----------------+---+
| no-change      | 0 |
+----------------+---+
| rt-to-be       | 2 |
+----------------+---+
| all-to-idle    | 3 |
+----------------+---+
|
||||
|
||||
The numerical value that corresponds to each I/O priority class is as follows:
|
||||
|
||||
@ -2062,9 +2071,13 @@ The numerical value that corresponds to each I/O priority class is as follows:
|
||||
|
||||
The algorithm to set the I/O priority class for a request is as follows:
|
||||
|
||||
- Translate the I/O priority class policy into a number.
|
||||
- Change the request I/O priority class into the maximum of the I/O priority
|
||||
class policy number and the numerical I/O priority class.
|
||||
- If I/O priority class policy is promote-to-rt, change the request I/O
|
||||
priority class to IOPRIO_CLASS_RT and change the request I/O priority
|
||||
level to 4.
|
||||
- If I/O priority class policy is not promote-to-rt, translate the I/O priority
|
||||
class policy into a number, then change the request I/O priority class
|
||||
into the maximum of the I/O priority class policy number and the numerical
|
||||
I/O priority class.
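The class-selection steps can be sketched in a few lines of Python. This is a reading of the documentation text, not the kernel implementation. Several things are assumptions: the function name `apply_prio_policy` is hypothetical; the class numbers (NONE=0, RT=1, BE=2, IDLE=3) are the usual ioprio values and are not shown in this hunk; the mapping of the policy names restrict-to-be and idle onto the table rows 2 and 3 is inferred; and comparing (class, level) pairs is one way to model "also change the priority level of these requests to 0".

```python
# Assumed numeric I/O priority classes (not shown in this hunk).
IOPRIO_CLASS_NONE, IOPRIO_CLASS_RT, IOPRIO_CLASS_BE, IOPRIO_CLASS_IDLE = 0, 1, 2, 3

# Policy numbers; restrict-to-be -> 2 and idle -> 3 are inferred from the table.
POLICY_NUMBER = {"no-change": 0, "restrict-to-be": 2, "idle": 3}


def apply_prio_policy(policy, prio_class, level):
    """Return the (class, level) a request ends up with under a policy."""
    if policy == "none-to-rt":
        # Documented as a deprecated alias for promote-to-rt.
        policy = "promote-to-rt"
    if policy == "promote-to-rt":
        # Requests already in class RT are left untouched.
        if prio_class == IOPRIO_CLASS_RT:
            return prio_class, level
        return IOPRIO_CLASS_RT, 4
    # Otherwise take the maximum of the policy number and the request class;
    # when the class changes, the level falls back to 0.
    return max((prio_class, level), (POLICY_NUMBER[policy], 0))
```

For example, under restrict-to-be a request with class RT and level 3 becomes class BE with level 0, while a request already in class IDLE keeps its class and level.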
|
||||
|
||||
PID
|
||||
---
|
||||
@ -2437,7 +2450,7 @@ Miscellaneous controller provides 3 interface files. If two misc resources (res_
|
||||
res_b 10
|
||||
|
||||
misc.current
|
||||
A read-only flat-keyed file shown in the non-root cgroups. It shows
|
||||
A read-only flat-keyed file shown in all cgroups. It shows
|
||||
the current usage of the resources in the cgroup and its children.::
|
||||
|
||||
$ cat misc.current
|
||||
|
@ -304,7 +304,7 @@
|
||||
EL0 is indicated by /sys/devices/system/cpu/aarch32_el0
|
||||
and hot-unplug operations may be restricted.
|
||||
|
||||
See Documentation/arm64/asymmetric-32bit.rst for more
|
||||
See Documentation/arch/arm64/asymmetric-32bit.rst for more
|
||||
information.
|
||||
|
||||
amd_iommu= [HW,X86-64]
|
||||
@ -323,6 +323,7 @@
|
||||
option with care.
|
||||
pgtbl_v1 - Use v1 page table for DMA-API (Default).
|
||||
pgtbl_v2 - Use v2 page table for DMA-API.
|
||||
irtcachedis - Disable Interrupt Remapping Table (IRT) caching.
|
||||
|
||||
amd_iommu_dump= [HW,X86-64]
|
||||
Enable AMD IOMMU driver option to dump the ACPI table
|
||||
@ -429,6 +430,9 @@
|
||||
arm64.nosme [ARM64] Unconditionally disable Scalable Matrix
|
||||
Extension support
|
||||
|
||||
arm64.nomops [ARM64] Unconditionally disable Memory Copy and Memory
|
||||
Set instructions support
|
||||
|
||||
ataflop= [HW,M68k]
|
||||
|
||||
atarimouse= [HW,MOUSE] Atari Mouse
|
||||
@ -818,20 +822,6 @@
|
||||
Format:
|
||||
<first_slot>,<last_slot>,<port>,<enum_bit>[,<debug>]
|
||||
|
||||
cpu0_hotplug [X86] Turn on CPU0 hotplug feature when
|
||||
CONFIG_BOOTPARAM_HOTPLUG_CPU0 is off.
|
||||
Some features depend on CPU0. Known dependencies are:
|
||||
1. Resume from suspend/hibernate depends on CPU0.
|
||||
Suspend/hibernate will fail if CPU0 is offline and you
|
||||
need to online CPU0 before suspend/hibernate.
|
||||
2. PIC interrupts also depend on CPU0. CPU0 can't be
|
||||
removed if a PIC interrupt is detected.
|
||||
It's said poweroff/reboot may depend on CPU0 on some
|
||||
machines although I haven't seen such issues so far
|
||||
after CPU0 is offline on a few tested machines.
|
||||
If the dependencies are under your control, you can
|
||||
turn on cpu0_hotplug.
|
||||
|
||||
cpuidle.off=1 [CPU_IDLE]
|
||||
disable the cpuidle sub-system
|
||||
|
||||
@ -852,6 +842,12 @@
|
||||
on every CPU online, such as boot, and resume from suspend.
|
||||
Default: 10000
|
||||
|
||||
cpuhp.parallel=
|
||||
[SMP] Enable/disable parallel bringup of secondary CPUs
|
||||
Format: <bool>
|
||||
Default is enabled if CONFIG_HOTPLUG_PARALLEL=y. Otherwise
|
||||
the parameter has no effect.
|
||||
|
||||
crash_kexec_post_notifiers
|
||||
Run kdump after running panic-notifiers and dumping
|
||||
kmsg. This only for the users who doubt kdump always
|
||||
@ -2117,6 +2113,16 @@
|
||||
disable
|
||||
Do not enable intel_pstate as the default
|
||||
scaling driver for the supported processors
|
||||
active
|
||||
Use intel_pstate driver to bypass the scaling
|
||||
governors layer of cpufreq and provide its own
|
||||
algorithms for p-state selection. There are two
|
||||
P-state selection algorithms provided by
|
||||
intel_pstate in the active mode: powersave and
|
||||
performance. The way they both operate depends
|
||||
on whether or not the hardware managed P-states
|
||||
(HWP) feature has been enabled in the processor
|
||||
and possibly on the processor model.
|
||||
passive
|
||||
Use intel_pstate as a scaling driver, but configure it
|
||||
to work with generic cpufreq governors (instead of
|
||||
@ -2551,12 +2557,13 @@
|
||||
If the value is 0 (the default), KVM will pick a period based
|
||||
on the ratio, such that a page is zapped after 1 hour on average.
|
||||
|
||||
kvm-amd.nested= [KVM,AMD] Allow nested virtualization in KVM/SVM.
|
||||
Default is 1 (enabled)
|
||||
kvm-amd.nested= [KVM,AMD] Control nested virtualization feature in
|
||||
KVM/SVM. Default is 1 (enabled).
|
||||
|
||||
kvm-amd.npt= [KVM,AMD] Disable nested paging (virtualized MMU)
|
||||
for all guests.
|
||||
Default is 1 (enabled) if in 64-bit or 32-bit PAE mode.
|
||||
kvm-amd.npt= [KVM,AMD] Control KVM's use of Nested Page Tables,
|
||||
a.k.a. Two-Dimensional Page Tables. Default is 1
|
||||
(enabled). Disable by KVM if hardware lacks support
|
||||
for NPT.
|
||||
|
||||
kvm-arm.mode=
|
||||
[KVM,ARM] Select one of KVM/arm64's modes of operation.
|
||||
@ -2602,30 +2609,33 @@
|
||||
Format: <integer>
|
||||
Default: 5
|
||||
|
||||
kvm-intel.ept= [KVM,Intel] Disable extended page tables
|
||||
(virtualized MMU) support on capable Intel chips.
|
||||
Default is 1 (enabled)
|
||||
kvm-intel.ept= [KVM,Intel] Control KVM's use of Extended Page Tables,
|
||||
a.k.a. Two-Dimensional Page Tables. Default is 1
|
||||
(enabled). Disable by KVM if hardware lacks support
|
||||
for EPT.
|
||||
|
||||
kvm-intel.emulate_invalid_guest_state=
|
||||
[KVM,Intel] Disable emulation of invalid guest state.
|
||||
Ignored if kvm-intel.enable_unrestricted_guest=1, as
|
||||
guest state is never invalid for unrestricted guests.
|
||||
This param doesn't apply to nested guests (L2), as KVM
|
||||
never emulates invalid L2 guest state.
|
||||
Default is 1 (enabled)
|
||||
[KVM,Intel] Control whether to emulate invalid guest
|
||||
state. Ignored if kvm-intel.enable_unrestricted_guest=1,
|
||||
as guest state is never invalid for unrestricted
|
||||
guests. This param doesn't apply to nested guests (L2),
|
||||
as KVM never emulates invalid L2 guest state.
|
||||
Default is 1 (enabled).
|
||||
|
||||
kvm-intel.flexpriority=
|
||||
[KVM,Intel] Disable FlexPriority feature (TPR shadow).
|
||||
Default is 1 (enabled)
|
||||
[KVM,Intel] Control KVM's use of FlexPriority feature
|
||||
(TPR shadow). Default is 1 (enabled). Disable by KVM if
|
||||
hardware lacks support for it.
|
||||
|
||||
kvm-intel.nested=
|
||||
[KVM,Intel] Enable VMX nesting (nVMX).
|
||||
Default is 0 (disabled)
|
||||
[KVM,Intel] Control nested virtualization feature in
|
||||
KVM/VMX. Default is 1 (enabled).
|
||||
|
||||
kvm-intel.unrestricted_guest=
|
||||
[KVM,Intel] Disable unrestricted guest feature
|
||||
(virtualized real and unpaged mode) on capable
|
||||
Intel chips. Default is 1 (enabled)
|
||||
[KVM,Intel] Control KVM's use of unrestricted guest
|
||||
feature (virtualized real and unpaged mode). Default
|
||||
is 1 (enabled). Disable by KVM if EPT is disabled or
|
||||
hardware lacks support for it.
|
||||
|
||||
kvm-intel.vmentry_l1d_flush=[KVM,Intel] Mitigation for L1 Terminal Fault
|
||||
CVE-2018-3620.
|
||||
@ -2639,9 +2649,10 @@
|
||||
|
||||
Default is cond (do L1 cache flush in specific instances)
|
||||
|
||||
kvm-intel.vpid= [KVM,Intel] Disable Virtual Processor Identification
|
||||
feature (tagged TLBs) on capable Intel chips.
|
||||
Default is 1 (enabled)
|
||||
kvm-intel.vpid= [KVM,Intel] Control KVM's use of Virtual Processor
|
||||
Identification feature (tagged TLBs). Default is 1
|
||||
(enabled). Disable by KVM if hardware lacks support
|
||||
for it.
|
||||
|
||||
l1d_flush= [X86,INTEL]
|
||||
Control mitigation for L1D based snooping vulnerability.
|
||||
@ -3423,6 +3434,10 @@
|
||||
[HW] Make the MicroTouch USB driver use raw coordinates
|
||||
('y', default) or cooked coordinates ('n')
|
||||
|
||||
mtrr=debug [X86]
|
||||
Enable printing debug information related to MTRR
|
||||
registers at boot time.
|
||||
|
||||
mtrr_chunk_size=nn[KMG] [X86]
|
||||
used for mtrr cleanup. It is largest continuous chunk
|
||||
that could hold holes aka. UC entries.
|
||||
@ -3702,8 +3717,8 @@
|
||||
|
||||
nohibernate [HIBERNATION] Disable hibernation and resume.
|
||||
|
||||
nohlt [ARM,ARM64,MICROBLAZE,SH] Forces the kernel to busy wait
|
||||
in do_idle() and not use the arch_cpu_idle()
|
||||
nohlt [ARM,ARM64,MICROBLAZE,MIPS,SH] Forces the kernel to
|
||||
busy wait in do_idle() and not use the arch_cpu_idle()
|
||||
implementation; requires CONFIG_GENERIC_IDLE_POLL_SETUP
|
||||
to be effective. This is useful on platforms where the
|
||||
sleep(SH) or wfi(ARM,ARM64) instructions do not work
|
||||
@ -3838,7 +3853,7 @@
|
||||
nosmp [SMP] Tells an SMP kernel to act as a UP kernel,
|
||||
and disable the IO APIC. legacy for "maxcpus=0".
|
||||
|
||||
nosmt [KNL,S390] Disable symmetric multithreading (SMT).
|
||||
nosmt [KNL,MIPS,S390] Disable symmetric multithreading (SMT).
|
||||
Equivalent to smt=1.
|
||||
|
||||
[KNL,X86] Disable symmetric multithreading (SMT).
|
||||
@ -4736,43 +4751,6 @@
|
||||
the propagation of recent CPU-hotplug changes up
|
||||
the rcu_node combining tree.
|
||||
|
||||
rcutree.use_softirq= [KNL]
|
||||
If set to zero, move all RCU_SOFTIRQ processing to
|
||||
per-CPU rcuc kthreads. Defaults to a non-zero
|
||||
value, meaning that RCU_SOFTIRQ is used by default.
|
||||
Specify rcutree.use_softirq=0 to use rcuc kthreads.
|
||||
|
||||
But note that CONFIG_PREEMPT_RT=y kernels disable
|
||||
this kernel boot parameter, forcibly setting it
|
||||
to zero.
|
||||
|
||||
rcutree.rcu_fanout_exact= [KNL]
|
||||
Disable autobalancing of the rcu_node combining
|
||||
tree. This is used by rcutorture, and might
|
||||
possibly be useful for architectures having high
|
||||
cache-to-cache transfer latencies.
|
||||
|
||||
rcutree.rcu_fanout_leaf= [KNL]
|
||||
Change the number of CPUs assigned to each
|
||||
leaf rcu_node structure. Useful for very
|
||||
large systems, which will choose the value 64,
|
||||
and for NUMA systems with large remote-access
|
||||
latencies, which will choose a value aligned
|
||||
with the appropriate hardware boundaries.
|
||||
|
||||
rcutree.rcu_min_cached_objs= [KNL]
|
||||
Minimum number of objects which are cached and
|
||||
maintained per one CPU. Object size is equal
|
||||
to PAGE_SIZE. The cache allows to reduce the
|
||||
pressure to page allocator, also it makes the
|
||||
whole algorithm to behave better in low memory
|
||||
condition.
|
||||
|
||||
rcutree.rcu_delay_page_cache_fill_msec= [KNL]
|
||||
Set the page-cache refill delay (in milliseconds)
|
||||
in response to low-memory conditions. The range
|
||||
of permitted values is in the range 0:100000.
|
||||
|
||||
rcutree.jiffies_till_first_fqs= [KNL]
|
||||
Set delay from grace-period initialization to
|
||||
first attempt to force quiescent states.
|
||||
@ -4811,21 +4789,6 @@
|
||||
When RCU_NOCB_CPU is set, also adjust the
|
||||
priority of NOCB callback kthreads.
|
||||
|
||||
rcutree.rcu_divisor= [KNL]
|
||||
Set the shift-right count to use to compute
|
||||
the callback-invocation batch limit bl from
|
||||
the number of callbacks queued on this CPU.
|
||||
The result will be bounded below by the value of
|
||||
the rcutree.blimit kernel parameter. Every bl
|
||||
callbacks, the softirq handler will exit in
|
||||
order to allow the CPU to do other work.
|
||||
|
||||
Please note that this callback-invocation batch
|
||||
limit applies only to non-offloaded callback
|
||||
invocation. Offloaded callbacks are instead
|
||||
invoked in the context of an rcuoc kthread, which
|
||||
scheduler will preempt as it does any other task.
|
||||
|
||||
rcutree.nocb_nobypass_lim_per_jiffy= [KNL]
|
||||
On callback-offloaded (rcu_nocbs) CPUs,
|
||||
RCU reduces the lock contention that would
|
||||
@ -4839,14 +4802,6 @@
|
||||
the ->nocb_bypass queue. The definition of "too
|
||||
many" is supplied by this kernel boot parameter.
|
||||
|
||||
rcutree.rcu_nocb_gp_stride= [KNL]
|
||||
Set the number of NOCB callback kthreads in
|
||||
each group, which defaults to the square root
|
||||
of the number of CPUs. Larger numbers reduce
|
||||
the wakeup overhead on the global grace-period
|
||||
kthread, but increases that same overhead on
|
||||
each group's NOCB grace-period kthread.
|
||||
|
||||
rcutree.qhimark= [KNL]
|
||||
Set threshold of queued RCU callbacks beyond which
|
||||
batch limiting is disabled.
|
||||
@ -4864,6 +4819,56 @@
|
||||
on rcutree.qhimark at boot time and to zero to
|
||||
disable more aggressive help enlistment.
|
||||
|
||||
rcutree.rcu_delay_page_cache_fill_msec= [KNL]
|
||||
Set the page-cache refill delay (in milliseconds)
|
||||
in response to low-memory conditions. The range
|
||||
of permitted values is in the range 0:100000.
|
||||
|
||||
rcutree.rcu_divisor= [KNL]
|
||||
Set the shift-right count to use to compute
|
||||
the callback-invocation batch limit bl from
|
||||
the number of callbacks queued on this CPU.
|
||||
The result will be bounded below by the value of
|
||||
the rcutree.blimit kernel parameter. Every bl
|
||||
callbacks, the softirq handler will exit in
|
||||
order to allow the CPU to do other work.
|
||||
|
||||
Please note that this callback-invocation batch
|
||||
limit applies only to non-offloaded callback
|
||||
invocation. Offloaded callbacks are instead
|
||||
invoked in the context of an rcuoc kthread, which
|
||||
scheduler will preempt as it does any other task.
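The arithmetic described for rcutree.rcu_divisor can be illustrated with a small sketch. The function name `batch_limit` is hypothetical, the sample numbers are illustrative rather than documented defaults, and the sketch assumes a non-negative shift count; it only mirrors the wording above (shift right, then bound below by rcutree.blimit).

```python
def batch_limit(queued_callbacks, rcu_divisor, blimit):
    """Callback-invocation batch limit bl, as described in the text.

    The queue length is shifted right by rcu_divisor, and the result
    is bounded below by blimit.
    """
    return max(blimit, queued_callbacks >> rcu_divisor)


# With 12800 queued callbacks, a shift count of 7 and a floor of 10,
# the batch limit is 12800 >> 7 = 100.
print(batch_limit(12800, 7, 10))
# With only 100 queued callbacks the shifted value is 0, so the floor applies.
print(batch_limit(100, 7, 10))
```

A larger shift count shrinks the batch, so the handler yields the CPU more often when many callbacks are queued.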
|
||||
|
||||
rcutree.rcu_fanout_exact= [KNL]
|
||||
Disable autobalancing of the rcu_node combining
|
||||
tree. This is used by rcutorture, and might
|
||||
possibly be useful for architectures having high
|
||||
cache-to-cache transfer latencies.
|
||||
|
||||
rcutree.rcu_fanout_leaf= [KNL]
|
||||
Change the number of CPUs assigned to each
|
||||
leaf rcu_node structure. Useful for very
|
||||
large systems, which will choose the value 64,
|
||||
and for NUMA systems with large remote-access
|
||||
latencies, which will choose a value aligned
|
||||
with the appropriate hardware boundaries.
|
||||
|
||||
rcutree.rcu_min_cached_objs= [KNL]
|
||||
Minimum number of objects which are cached and
|
||||
maintained per one CPU. Object size is equal
|
||||
to PAGE_SIZE. The cache allows to reduce the
|
||||
pressure to page allocator, also it makes the
|
||||
whole algorithm to behave better in low memory
|
||||
condition.
|
||||
|
||||
rcutree.rcu_nocb_gp_stride= [KNL]
|
||||
Set the number of NOCB callback kthreads in
|
||||
each group, which defaults to the square root
|
||||
of the number of CPUs. Larger numbers reduce
|
||||
the wakeup overhead on the global grace-period
|
||||
kthread, but increases that same overhead on
|
||||
each group's NOCB grace-period kthread.
|
||||
|
||||
rcutree.rcu_kick_kthreads= [KNL]
|
||||
Cause the grace-period kthread to get an extra
|
||||
wake_up() if it sleeps three times longer than
|
||||
@ -4871,6 +4876,13 @@
|
||||
This wake_up() will be accompanied by a
|
||||
WARN_ONCE() splat and an ftrace_dump().
|
||||
|
||||
rcutree.rcu_resched_ns= [KNL]
|
||||
Limit the time spent invoking a batch of RCU
|
||||
callbacks to the specified number of nanoseconds.
|
||||
By default, this limit is checked only once
|
||||
every 32 callbacks in order to limit the pain
|
||||
inflicted by local_clock() overhead.
|
||||
|
||||
rcutree.rcu_unlock_delay= [KNL]
|
||||
In CONFIG_RCU_STRICT_GRACE_PERIOD=y kernels,
|
||||
this specifies an rcu_read_unlock()-time delay
|
||||
@ -4885,6 +4897,16 @@
|
||||
rcu_node tree with an eye towards determining
|
||||
why a new grace period has not yet started.
|
||||
|
||||
rcutree.use_softirq= [KNL]
|
||||
If set to zero, move all RCU_SOFTIRQ processing to
|
||||
per-CPU rcuc kthreads. Defaults to a non-zero
|
||||
value, meaning that RCU_SOFTIRQ is used by default.
|
||||
Specify rcutree.use_softirq=0 to use rcuc kthreads.
|
||||
|
||||
But note that CONFIG_PREEMPT_RT=y kernels disable
|
||||
this kernel boot parameter, forcibly setting it
|
||||
to zero.
|
||||
|
||||
rcuscale.gp_async= [KNL]
|
||||
Measure performance of asynchronous
|
||||
grace-period primitives such as call_rcu().
|
||||
@ -5087,8 +5109,17 @@
|
||||
|
||||
rcutorture.stall_cpu_block= [KNL]
|
||||
Sleep while stalling if set. This will result
|
||||
in warnings from preemptible RCU in addition
|
||||
to any other stall-related activity.
|
||||
in warnings from preemptible RCU in addition to
|
||||
any other stall-related activity. Note that
|
||||
in kernels built with CONFIG_PREEMPTION=n and
|
||||
CONFIG_PREEMPT_COUNT=y, this parameter will
|
||||
cause the CPU to pass through a quiescent state.
|
||||
Given CONFIG_PREEMPTION=n, this will suppress
|
||||
RCU CPU stall warnings, but will instead result
|
||||
in scheduling-while-atomic splats.
|
||||
|
||||
Use of this module parameter results in splats.
|
||||
|
||||
|
||||
rcutorture.stall_cpu_holdoff= [KNL]
|
||||
Time to wait (s) after boot before inducing stall.
|
||||
@ -5452,7 +5483,12 @@
|
||||
port and the regular usb controller gets disabled.
|
||||
|
||||
root= [KNL] Root filesystem
|
||||
See name_to_dev_t comment in init/do_mounts.c.
|
||||
Usually this is a block device specifier of some kind,
|
||||
see the early_lookup_bdev comment in
|
||||
block/early-lookup.c for details.
|
||||
Alternatively this can be "ram" for the legacy initial
|
||||
ramdisk, "nfs" and "cifs" for root on a network file
|
||||
system, or "mtd" and "ubi" for mounting from raw flash.
|
||||
|
||||
rootdelay= [KNL] Delay (in seconds) to pause before attempting to
|
||||
mount the root filesystem
|
||||
@ -5735,7 +5771,7 @@
|
||||
1: Fast pin select (default)
|
||||
2: ATC IRMode
|
||||
|
||||
smt= [KNL,S390] Set the maximum number of threads (logical
|
||||
smt= [KNL,MIPS,S390] Set the maximum number of threads (logical
|
||||
CPUs) to use per physical CPU on systems capable of
|
||||
symmetric multithreading (SMT). Will be capped to the
|
||||
actual hardware limit.
|
||||
@ -6563,6 +6599,12 @@
|
||||
unknown_nmi_panic
|
||||
[X86] Cause panic on unknown NMI.
|
||||
|
||||
unwind_debug [X86-64]
|
||||
Enable unwinder debug output. This can be
|
||||
useful for debugging certain unwinder error
|
||||
conditions, including corrupt stacks and
|
||||
bad/missing unwinder metadata.
|
||||
|
||||
usbcore.authorized_default=
|
||||
[USB] Default USB device authorization:
|
||||
(default -1 = authorized except for wireless USB,
|
||||
@ -6931,6 +6973,18 @@
|
||||
it can be updated at runtime by writing to the
|
||||
corresponding sysfs file.
|
||||
|
||||
workqueue.cpu_intensive_thresh_us=
|
||||
Per-cpu work items which run for longer than this
|
||||
threshold are automatically considered CPU intensive
|
||||
and excluded from concurrency management to prevent
|
||||
them from noticeably delaying other per-cpu work
|
||||
items. Default is 10000 (10ms).
|
||||
|
||||
If CONFIG_WQ_CPU_INTENSIVE_REPORT is set, the kernel
|
||||
will report the work functions which violate this
|
||||
threshold repeatedly. They are likely good
|
||||
candidates for using WQ_UNBOUND workqueues instead.
|
||||
|
||||
workqueue.disable_numa
|
||||
By default, all work items queued to unbound
|
||||
workqueues are affine to the NUMA nodes they're
|
||||
|
@ -119,9 +119,9 @@ set size has chronologically changed.::
|
||||
Data Access Pattern Aware Memory Management
|
||||
===========================================
|
||||
|
||||
Below three commands make every memory region of size >=4K that doesn't
|
||||
accessed for >=60 seconds in your workload to be swapped out. ::
|
||||
Below command makes every memory region of size >=4K that has not been accessed for
|
||||
>=60 seconds in your workload to be swapped out. ::
|
||||
|
||||
$ echo "#min-size max-size min-acc max-acc min-age max-age action" > test_scheme
|
||||
$ echo "4K max 0 0 60s max pageout" >> test_scheme
|
||||
$ damo schemes -c test_scheme <pid of your workload>
|
||||
$ sudo damo schemes --damos_access_rate 0 0 --damos_sz_region 4K max \
|
||||
--damos_age 60s max --damos_action pageout \
|
||||
<pid of your workload>
|
||||
|
@ -10,9 +10,8 @@ DAMON provides below interfaces for different users.
|
||||
`This <https://github.com/awslabs/damo>`_ is for privileged people such as
|
||||
system administrators who want a just-working human-friendly interface.
|
||||
Using this, users can use the DAMON’s major features in a human-friendly way.
|
||||
It may not be highly tuned for special cases, though. It supports both
|
||||
virtual and physical address spaces monitoring. For more detail, please
|
||||
refer to its `usage document
|
||||
It may not be highly tuned for special cases, though. For more detail,
|
||||
please refer to its `usage document
|
||||
<https://github.com/awslabs/damo/blob/next/USAGE.md>`_.
|
||||
- *sysfs interface.*
|
||||
:ref:`This <sysfs_interface>` is for privileged user space programmers who
|
||||
@ -20,11 +19,7 @@ DAMON provides below interfaces for different users.
|
||||
features by reading from and writing to special sysfs files. Therefore,
|
||||
you can write and use your personalized DAMON sysfs wrapper programs that
|
||||
reads/writes the sysfs files instead of you. The `DAMON user space tool
|
||||
<https://github.com/awslabs/damo>`_ is one example of such programs. It
|
||||
supports both virtual and physical address spaces monitoring. Note that this
|
||||
interface provides only simple :ref:`statistics <damos_stats>` for the
|
||||
monitoring results. For detailed monitoring results, DAMON provides a
|
||||
:ref:`tracepoint <tracepoint>`.
|
||||
<https://github.com/awslabs/damo>`_ is one example of such programs.
|
||||
- *debugfs interface. (DEPRECATED!)*
|
||||
:ref:`This <debugfs_interface>` is almost identical to :ref:`sysfs interface
|
||||
<sysfs_interface>`. This is deprecated, so users should move to the
|
||||
@ -139,7 +134,7 @@ scheme of the kdamond. Writing ``clear_schemes_tried_regions`` to ``state``
|
||||
file clears the DAMON-based operating scheme action tried regions directory for
|
||||
each DAMON-based operation scheme of the kdamond. For details of the
|
||||
DAMON-based operation scheme action tried regions directory, please refer to
|
||||
:ref:tried_regions section <sysfs_schemes_tried_regions>`.
|
||||
:ref:`tried_regions section <sysfs_schemes_tried_regions>`.
|
||||
|
||||
If the state is ``on``, reading ``pid`` shows the pid of the kdamond thread.
|
||||
|
||||
@ -259,12 +254,9 @@ be equal or smaller than ``start`` of directory ``N+1``.
|
||||
contexts/<N>/schemes/
|
||||
---------------------
|
||||
|
||||
For usual DAMON-based data access aware memory management optimizations, users
|
||||
would normally want the system to apply a memory management action to a memory
|
||||
region of a specific access pattern. DAMON receives such formalized operation
|
||||
schemes from the user and applies those to the target memory regions. Users
|
||||
can get and set the schemes by reading from and writing to files under this
|
||||
directory.
|
||||
The directory for DAMON-based Operation Schemes (:ref:`DAMOS
|
||||
<damon_design_damos>`). Users can get and set the schemes by reading from and
|
||||
writing to files under this directory.
|
||||
|
||||
In the beginning, this directory has only one file, ``nr_schemes``. Writing a
|
||||
number (``N``) to the file creates the number of child directories named ``0``
|
||||
@ -277,12 +269,12 @@ In each scheme directory, five directories (``access_pattern``, ``quotas``,
|
||||
``watermarks``, ``filters``, ``stats``, and ``tried_regions``) and one file
|
||||
(``action``) exist.
|
||||
|
||||
The ``action`` file is for setting and getting what action you want to apply to
|
||||
memory regions having specific access pattern of the interest. The keywords
|
||||
that can be written to and read from the file and their meaning are as below.
|
||||
The ``action`` file is for setting and getting the scheme's :ref:`action
|
||||
<damon_design_damos_action>`. The keywords that can be written to and read
|
||||
from the file and their meaning are as below.
|
||||
|
||||
Note that support of each action depends on the running DAMON operations set
|
||||
`implementation <sysfs_contexts>`.
|
||||
:ref:`implementation <sysfs_contexts>`.
|
||||
|
||||
- ``willneed``: Call ``madvise()`` for the region with ``MADV_WILLNEED``.
|
||||
Supported by ``vaddr`` and ``fvaddr`` operations set.
|
||||
@ -304,32 +296,21 @@ Note that support of each action depends on the running DAMON operations set
|
||||
schemes/<N>/access_pattern/
|
||||
---------------------------
|
||||
|
||||
The target access pattern of each DAMON-based operation scheme is constructed
|
||||
with three ranges including the size of the region in bytes, number of
|
||||
monitored accesses per aggregate interval, and number of aggregated intervals
|
||||
for the age of the region.
|
||||
The directory for the target access :ref:`pattern
|
||||
<damon_design_damos_access_pattern>` of the given DAMON-based operation scheme.
|
||||
|
||||
Under the ``access_pattern`` directory, three directories (``sz``,
|
||||
``nr_accesses``, and ``age``) each having two files (``min`` and ``max``)
|
||||
exist. You can set and get the access pattern for the given scheme by writing
|
||||
to and reading from the ``min`` and ``max`` files under ``sz``,
|
||||
``nr_accesses``, and ``age`` directories, respectively.
|
||||
``nr_accesses``, and ``age`` directories, respectively. Note that the ``min``
|
||||
and the ``max`` form a closed interval.
|
||||
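As a rough illustration of the closed-interval semantics described above, the following Python sketch checks whether a region falls inside a scheme's access pattern. The dictionary layout and the function names are hypothetical and exist only for this example; only the ``sz``, ``nr_accesses`` and ``age`` names and their ``min``/``max`` pairs come from the text.

```python
# Hypothetical model of an access pattern: each key maps to a (min, max) pair.
# Both bounds are inclusive, because ``min`` and ``max`` form a closed interval.

def in_closed_interval(value, lower, upper):
    """Return True if lower <= value <= upper (both bounds included)."""
    return lower <= value <= upper


def region_matches(region, pattern):
    """Return True if every property of the region is inside its range."""
    return all(
        in_closed_interval(region[key], *pattern[key])
        for key in ("sz", "nr_accesses", "age")
    )


# Example pattern: regions of at least 4096 bytes that were never accessed.
UNLIMITED = 2**64 - 1
pattern = {
    "sz": (4096, UNLIMITED),
    "nr_accesses": (0, 0),
    "age": (0, UNLIMITED),
}
```

A region whose ``sz`` is exactly 4096 matches, since the lower bound is included.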
|
||||
schemes/<N>/quotas/
|
||||
-------------------
|
||||
|
||||
Optimal ``target access pattern`` for each ``action`` is workload dependent, so
|
||||
not easy to find. Worse yet, setting a scheme of some action too aggressive
|
||||
can cause severe overhead. To avoid such overhead, users can limit time and
|
||||
size quota for each scheme. In detail, users can ask DAMON to try to use only
|
||||
up to specific time (``time quota``) for applying the action, and to apply the
|
||||
action to only up to specific amount (``size quota``) of memory regions having
|
||||
the target access pattern within a given time interval (``reset interval``).
|
||||
|
||||
When the quota limit is expected to be exceeded, DAMON prioritizes found memory
|
||||
regions of the ``target access pattern`` based on their size, access frequency,
|
||||
and age. For personalized prioritization, users can set the weights for the
|
||||
three properties.
|
||||
The directory for the :ref:`quotas <damon_design_damos_quotas>` of the given
|
||||
DAMON-based operation scheme.
|
||||
|
||||
Under ``quotas`` directory, three files (``ms``, ``bytes``,
|
||||
``reset_interval_ms``) and one directory (``weights``) having three files
|
||||
@ -337,23 +318,26 @@ Under ``quotas`` directory, three files (``ms``, ``bytes``,
|
||||
|
||||
You can set the ``time quota`` in milliseconds, ``size quota`` in bytes, and
|
||||
``reset interval`` in milliseconds by writing the values to the three files,
|
||||
respectively. You can also set the prioritization weights for size, access
|
||||
frequency, and age in per-thousand unit by writing the values to the three
|
||||
files under the ``weights`` directory.
|
||||
respectively. Then, DAMON tries to use only up to ``time quota`` milliseconds
|
||||
for applying the ``action`` to memory regions of the ``access_pattern``, and to
|
||||
apply the action to only up to ``bytes`` bytes of memory regions within the
|
||||
``reset_interval_ms``. Setting both ``ms`` and ``bytes`` zero disables the
|
||||
quota limits.
|
||||
|
||||
You can also set the :ref:`prioritization weights
|
||||
<damon_design_damos_quotas_prioritization>` for size, access frequency, and age
|
||||
in per-thousand unit by writing the values to the three files under the
|
||||
``weights`` directory.
|
||||
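A minimal sketch of how a wrapper program might fill in the three quota files follows. The helper names are hypothetical, and the example writes into a temporary directory standing in for a real ``quotas`` directory, so it can be run without DAMON or root privileges; only the file names ``ms``, ``bytes`` and ``reset_interval_ms`` are taken from the text.

```python
import os
import tempfile


def set_quotas(quotas_dir, ms, sz_bytes, reset_interval_ms):
    """Write the time quota, size quota and reset interval to their files."""
    values = {
        "ms": ms,
        "bytes": sz_bytes,
        "reset_interval_ms": reset_interval_ms,
    }
    for name, value in values.items():
        with open(os.path.join(quotas_dir, name), "w") as f:
            f.write(str(value))


def quotas_disabled(ms, sz_bytes):
    """Setting both ``ms`` and ``bytes`` to zero disables the quota limits."""
    return ms == 0 and sz_bytes == 0


# Stand-in directory (assumption): on a real system this would be the
# ``quotas`` directory of the scheme being configured.
demo_dir = tempfile.mkdtemp()
set_quotas(demo_dir, ms=10, sz_bytes=128 * 1024 * 1024, reset_interval_ms=1000)
```

With these example values, DAMON would be asked to spend at most 10 milliseconds and touch at most 128 MiB per one-second reset interval.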
|
||||
schemes/<N>/watermarks/
|
||||
-----------------------
|
||||
|
||||
To allow easy activation and deactivation of each scheme based on system
|
||||
status, DAMON provides a feature called watermarks. The feature receives five
|
||||
values called ``metric``, ``interval``, ``high``, ``mid``, and ``low``. The
|
||||
``metric`` is the system metric such as free memory ratio that can be measured.
|
||||
If the metric value of the system is higher than the value in ``high`` or lower
|
||||
than ``low`` at the moment, the scheme is deactivated.  If the value is lower
|
||||
than ``mid``, the scheme is activated.
|
||||
The directory for the :ref:`watermarks <damon_design_damos_watermarks>` of the
|
||||
given DAMON-based operation scheme.
|
||||
|
||||
Under the watermarks directory, five files (``metric``, ``interval_us``,
|
||||
``high``, ``mid``, and ``low``) for setting each value exist. You can set and
|
||||
``high``, ``mid``, and ``low``) for setting the metric, the time interval
|
||||
between check of the metric, and the three watermarks exist. You can set and
|
||||
get the five values by writing to the files, respectively.
|
||||
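The activation rule for the three watermarks can be sketched as a small state function. This is a simplified model, not the kernel implementation: the function name is hypothetical, and keeping the previous state when the metric lies between ``mid`` and ``high`` is an assumption, since the text only states the deactivation and activation conditions.

```python
def next_scheme_state(active, metric, high, mid, low):
    """Return whether the scheme should be active after one metric check."""
    # Above ``high`` or below ``low``: the scheme is deactivated.
    if metric > high or metric < low:
        return False
    # Below ``mid`` (and not below ``low``): the scheme is activated.
    if metric < mid:
        return True
    # Between ``mid`` and ``high``: assumption, keep the previous state.
    return active
```

For example, with ``high=50``, ``mid=30`` and ``low=10``, a metric value of 20 activates an inactive scheme, while a value of 60 or 5 deactivates it.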
|
||||
Keywords and meanings of those that can be written to the ``metric`` file are
|
||||
@ -367,12 +351,8 @@ The ``interval`` should written in microseconds unit.
|
||||
schemes/<N>/filters/
|
||||
--------------------
|
||||
|
||||
Users could know something more than the kernel for specific types of memory.
|
||||
In the case, users could do their own management for the memory and hence
|
||||
doesn't want DAMOS bothers that. Users could limit DAMOS by setting the access
|
||||
pattern of the scheme and/or the monitoring regions for the purpose, but that
|
||||
can be inefficient in some cases. In such cases, users could set non-access
|
||||
pattern driven filters using files in this directory.
|
||||
The directory for the :ref:`filters <damon_design_damos_filters>` of the given
|
||||
DAMON-based operation scheme.
|
||||
|
||||
In the beginning, this directory has only one file, ``nr_filters``. Writing a
|
||||
number (``N``) to the file creates the number of child directories named ``0``
|
||||
@ -432,13 +412,17 @@ starting from ``0`` under this directory. Each directory contains files
|
||||
exposing detailed information about each of the memory region that the
|
||||
corresponding scheme's ``action`` has tried to be applied under this directory,
|
||||
during next :ref:`aggregation interval <sysfs_monitoring_attrs>`. The
|
||||
information includes address range, ``nr_accesses``, , and ``age`` of the
|
||||
region.
|
||||
information includes address range, ``nr_accesses``, and ``age`` of the region.
|
||||
|
||||
The directories will be removed when another special keyword,
|
||||
``clear_schemes_tried_regions``, is written to the relevant
|
||||
``kdamonds/<N>/state`` file.
|
||||
|
||||
The expected usage of this directory is investigations of schemes' behaviors,
|
||||
and query-like efficient data access monitoring results retrievals. For the
|
||||
latter use case, in particular, users can set the ``action`` as ``stat`` and
|
||||
set the ``access pattern`` as their interested pattern that they want to query.
|
||||
|
||||
tried_regions/<N>/
|
||||
------------------
|
||||
|
||||
@ -600,15 +584,10 @@ update.
|
||||
Schemes
|
||||
-------
|
||||
|
||||
For usual DAMON-based data access aware memory management optimizations, users
|
||||
would simply want the system to apply a memory management action to a memory
|
||||
region of a specific access pattern. DAMON receives such formalized operation
|
||||
schemes from the user and applies those to the target processes.
|
||||
|
||||
Users can get and set the schemes by reading from and writing to ``schemes``
|
||||
debugfs file. Reading the file also shows the statistics of each scheme. To
|
||||
the file, each of the schemes should be represented in each line in below
|
||||
form::
|
||||
Users can get and set the DAMON-based operation :ref:`schemes
|
||||
<damon_design_damos>` by reading from and writing to ``schemes`` debugfs file.
|
||||
Reading the file also shows the statistics of each scheme. To the file, each
|
||||
of the schemes should be represented in each line in below form::
|
||||
|
||||
<target access pattern> <action> <quota> <watermarks>
|
||||
|
||||
@ -617,8 +596,9 @@ You can disable schemes by simply writing an empty string to the file.
|
||||
Target Access Pattern
|
||||
~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
The ``<target access pattern>`` is constructed with three ranges in below
|
||||
form::
|
||||
The target access :ref:`pattern <damon_design_damos_access_pattern>` of the
|
||||
scheme. The ``<target access pattern>`` is constructed with three ranges in
|
||||
below form::
|
||||
|
||||
min-size max-size min-acc max-acc min-age max-age
|
||||
|
||||
@ -631,9 +611,9 @@ closed interval.
|
||||
Action
|
||||
~~~~~~
|
||||
|
||||
The ``<action>`` is a predefined integer for memory management actions, which
|
||||
DAMON will apply to the regions having the target access pattern. The
|
||||
supported numbers and their meanings are as below.
|
||||
The ``<action>`` is a predefined integer for memory management :ref:`actions
|
||||
<damon_design_damos_action>`. The supported numbers and their meanings are as
|
||||
below.
|
||||
|
||||
- 0: Call ``madvise()`` for the region with ``MADV_WILLNEED``. Ignored if
|
||||
``target`` is ``paddr``.
|
||||
@ -649,10 +629,8 @@ supported numbers and their meanings are as below.
|
||||
Quota
|
||||
~~~~~
|
||||
|
||||
Optimal ``target access pattern`` for each ``action`` is workload dependent, so
|
||||
not easy to find. Worse yet, setting a scheme of some action too aggressive
|
||||
can cause severe overhead. To avoid such overhead, users can limit time and
|
||||
size quota for the scheme via the ``<quota>`` in below form::
|
||||
Users can set the :ref:`quotas <damon_design_damos_quotas>` of the given scheme
|
||||
via the ``<quota>`` in below form::
|
||||
|
||||
<ms> <sz> <reset interval> <priority weights>
|
||||
|
||||
@ -662,19 +640,17 @@ the action to memory regions of the ``target access pattern`` within the
|
||||
``<sz>`` bytes of memory regions within the ``<reset interval>``. Setting both
|
||||
``<ms>`` and ``<sz>`` zero disables the quota limits.
|
||||
|
||||
When the quota limit is expected to be exceeded, DAMON prioritizes found memory
|
||||
regions of the ``target access pattern`` based on their size, access frequency,
|
||||
and age. For personalized prioritization, users can set the weights for the
|
||||
three properties in ``<priority weights>`` in below form::
|
||||
For the :ref:`prioritization <damon_design_damos_quotas_prioritization>`, users
|
||||
can set the weights for the three properties in ``<priority weights>`` in below
|
||||
form::
|
||||
|
||||
<size weight> <access frequency weight> <age weight>
|
||||
|
||||
Watermarks
|
||||
~~~~~~~~~~
|
||||
|
||||
Some schemes would need to run based on current value of the system's specific
|
||||
metrics like free memory ratio. For such cases, users can specify watermarks
|
||||
for the condition.::
|
||||
Users can specify :ref:`watermarks <damon_design_damos_watermarks>` of the
|
||||
given scheme via ``<watermarks>`` in below form::
|
||||
|
||||
<metric> <check interval> <high mark> <middle mark> <low mark>
|
||||
|
||||
@ -797,10 +773,12 @@ root directory only.
|
||||
Tracepoint for Monitoring Results
|
||||
=================================
|
||||
|
||||
DAMON provides the monitoring results via a tracepoint,
|
||||
``damon:damon_aggregated``. While the monitoring is turned on, you could
|
||||
record the tracepoint events and show results using tracepoint supporting tools
|
||||
like ``perf``. For example::
|
||||
Users can get the monitoring results via the :ref:`tried_regions
|
||||
<sysfs_schemes_tried_regions>` or a tracepoint, ``damon:damon_aggregated``.
|
||||
While the tried regions directory is useful for getting a snapshot, the
|
||||
tracepoint is useful for getting a full record of the results. While the
|
||||
monitoring is turned on, you could record the tracepoint events and show
|
||||
results using tracepoint supporting tools like ``perf``. For example::
|
||||
|
||||
# echo on > monitor_on
|
||||
# perf record -e damon:damon_aggregated &
|
||||
|
@ -56,14 +56,14 @@ Example usage of perf::
|
||||
For HiSilicon uncore PMU v2 whose identifier is 0x30, the topology is the same
|
||||
as PMU v1, but some new functions are added to the hardware.
|
||||
|
||||
(a) L3C PMU supports filtering by core/thread within the cluster which can be
|
||||
1. L3C PMU supports filtering by core/thread within the cluster which can be
|
||||
specified as a bitmap::
|
||||
|
||||
$# perf stat -a -e hisi_sccl3_l3c0/config=0x02,tt_core=0x3/ sleep 5
|
||||
|
||||
This will only count the operations from core/thread 0 and 1 in this cluster.
|
||||
|
||||
(b) Tracetag allows the user to choose to count only read, write or atomic
|
||||
2. Tracetag allows the user to choose to count only read, write or atomic
|
||||
operations via the tt_req parameter in perf. The default value counts all
|
||||
operations. tt_req is 3 bits: 3'b100 represents read operations, 3'b101
|
||||
represents write operations, 3'b110 represents atomic store operations and
|
||||
@ -73,14 +73,16 @@ represents write operations, 3'b110 represents atomic store operations and
|
||||
|
||||
This will only count the read operations in this cluster.
|
||||
|
||||
(c) Datasrc allows the user to check where the data comes from. It is 5 bits.
|
||||
3. Datasrc allows the user to check where the data comes from. It is 5 bits.
|
||||
Some important codes are as follows:
|
||||
5'b00001: comes from L3C in this die;
|
||||
5'b01000: comes from L3C in the cross-die;
|
||||
5'b01001: comes from L3C which is in another socket;
|
||||
5'b01110: comes from the local DDR;
|
||||
5'b01111: comes from the cross-die DDR;
|
||||
5'b10000: comes from cross-socket DDR;
|
||||
|
||||
- 5'b00001: comes from L3C in this die;
|
||||
- 5'b01000: comes from L3C in the cross-die;
|
||||
- 5'b01001: comes from L3C which is in another socket;
|
||||
- 5'b01110: comes from the local DDR;
|
||||
- 5'b01111: comes from the cross-die DDR;
|
||||
- 5'b10000: comes from cross-socket DDR;
|
||||
|
||||
etc. It is mainly helpful for finding which data source is nearest to the CPU
|
||||
cores. If datasrc_cfg is used in a multi-chip system, the datasrc_skt shall be
|
||||
configured in perf command::
|
||||
@ -88,15 +90,25 @@ configured in perf command::
|
||||
$# perf stat -a -e hisi_sccl3_l3c0/config=0xb9,datasrc_cfg=0xE/,
|
||||
hisi_sccl3_l3c0/config=0xb9,datasrc_cfg=0xF/ sleep 5
|
||||
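To relate the hexadecimal ``datasrc_cfg`` values used in the perf command to the 5-bit codes listed above, a small lookup table can help. This is only a sketch: the table and function names are hypothetical, and the table contains just the codes named in this document, not the full hardware encoding.

```python
# Codes copied from the list above; other 5-bit values exist but are not
# described in this document.
DATASRC_CODES = {
    0b00001: "L3C in this die",
    0b01000: "L3C in the cross-die",
    0b01001: "L3C in another socket",
    0b01110: "local DDR",
    0b01111: "cross-die DDR",
    0b10000: "cross-socket DDR",
}


def describe_datasrc(cfg):
    """Map a 5-bit datasrc_cfg value to the description given in the text."""
    if not 0 <= cfg < 32:
        raise ValueError("datasrc_cfg is a 5-bit field")
    return DATASRC_CODES.get(cfg, "not listed in this document")
```

With this table, ``describe_datasrc(0xE)`` and ``describe_datasrc(0xF)`` correspond to the two events in the command above, local DDR and cross-die DDR respectively.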
|
||||
(d)Some HiSilicon SoCs encapsulate multiple CPU and IO dies. Each CPU die
|
||||
4. Some HiSilicon SoCs encapsulate multiple CPU and IO dies. Each CPU die
|
||||
contains several Compute Clusters (CCLs). The I/O dies are called Super I/O
|
||||
clusters (SICL) containing multiple I/O clusters (ICLs). Each CCL/ICL in the
|
||||
SoC has a unique ID. Each ID is 11 bits, including a 6-bit SCCL-ID and a 5-bit
|
||||
CCL/ICL-ID. For I/O die, the ICL-ID is followed by:
|
||||
5'b00000: I/O_MGMT_ICL;
|
||||
5'b00001: Network_ICL;
|
||||
5'b00011: HAC_ICL;
|
||||
5'b10000: PCIe_ICL;
|
||||
|
||||
- 5'b00000: I/O_MGMT_ICL;
|
||||
- 5'b00001: Network_ICL;
|
||||
- 5'b00011: HAC_ICL;
|
||||
- 5'b10000: PCIe_ICL;
|
||||
|
||||
5. uring_channel: UC PMU events 0x47~0x59 support filtering by tx request
|
||||
uring channel. It is 2 bits. Some important codes are as follows:
|
||||
|
||||
- 2'b11: count the events which are sent to the uring_ext (MATA) channel;
|
||||
- 2'b01: is the same as 2'b11;
|
||||
- 2'b10: count the events which are sent to the uring (non-MATA) channel;
|
||||
- 2'b00: default value, count the events which are sent to both the uring and
|
||||
uring_ext channels;
|
||||
|
||||
Users can configure IDs to count data coming from a specific CCL/ICL by setting
|
||||
srcid_cmd & srcid_msk, and data destined for a specific CCL/ICL by setting
|
||||
|
@ -949,7 +949,7 @@ user space can read performance monitor counter registers directly.
|
||||
|
||||
The default value is 0 (access disabled).
|
||||
|
||||
See Documentation/arm64/perf.rst for more information.
|
||||
See Documentation/arch/arm64/perf.rst for more information.
|
||||
|
||||
|
||||
pid_max
|
||||
|
@ -386,8 +386,8 @@ Default : 0 (for compatibility reasons)
|
||||
txrehash
|
||||
--------
|
||||
|
||||
Controls default hash rethink behaviour on listening socket when SO_TXREHASH
|
||||
option is set to SOCK_TXREHASH_DEFAULT (i. e. not overridden by setsockopt).
|
||||
Controls default hash rethink behaviour on socket when SO_TXREHASH option is set
|
||||
to SOCK_TXREHASH_DEFAULT (i. e. not overridden by setsockopt).
|
||||
|
||||
If set to 1 (default), hash rethink is performed on listening socket.
|
||||
If set to 0, hash rethink is not performed.
|
||||
|
@ -17,16 +17,37 @@ For ACPI on arm64, tables also fall into the following categories:
|
||||
|
||||
- Recommended: BERT, EINJ, ERST, HEST, PCCT, SSDT
|
||||
|
||||
- Optional: BGRT, CPEP, CSRT, DBG2, DRTM, ECDT, FACS, FPDT, IBFT,
|
||||
IORT, MCHI, MPST, MSCT, NFIT, PMTT, RASF, SBST, SLIT, SPMI, SRAT,
|
||||
STAO, TCPA, TPM2, UEFI, XENV
|
||||
- Optional: AGDI, BGRT, CEDT, CPEP, CSRT, DBG2, DRTM, ECDT, FACS, FPDT,
|
||||
HMAT, IBFT, IORT, MCHI, MPAM, MPST, MSCT, NFIT, PMTT, PPTT, RASF, SBST,
|
||||
SDEI, SLIT, SPMI, SRAT, STAO, TCPA, TPM2, UEFI, XENV
|
||||
|
||||
- Not supported: BOOT, DBGP, DMAR, ETDT, HPET, IVRS, LPIT, MSDM, OEMx,
|
||||
PSDT, RSDT, SLIC, WAET, WDAT, WDRT, WPBT
|
||||
- Not supported: AEST, APMT, BOOT, DBGP, DMAR, ETDT, HPET, IVRS, LPIT,
|
||||
MSDM, OEMx, PDTT, PSDT, RAS2, RSDT, SLIC, WAET, WDAT, WDRT, WPBT
|
||||
|
||||
====== ========================================================================
|
||||
Table Usage for ARMv8 Linux
|
||||
====== ========================================================================
|
||||
AEST Signature Reserved (signature == "AEST")
|
||||
|
||||
**Arm Error Source Table**
|
||||
|
||||
This table informs the OS of any error nodes in the system that are
|
||||
compliant with the Arm RAS architecture.
|
||||
|
||||
AGDI Signature Reserved (signature == "AGDI")
|
||||
|
||||
**Arm Generic diagnostic Dump and Reset Device Interface Table**
|
||||
|
||||
This table describes a non-maskable event, that is used by the platform
|
||||
firmware, to request the OS to generate a diagnostic dump and reset the device.
|
||||
|
||||
APMT Signature Reserved (signature == "APMT")
|
||||
|
||||
**Arm Performance Monitoring Table**
|
||||
|
||||
This table describes the properties of PMU support implemented by
|
||||
components in the system.
|
||||
|
||||
BERT Section 18.3 (signature == "BERT")
|
||||
|
||||
**Boot Error Record Table**
|
||||
@ -47,6 +68,13 @@ BGRT Section 5.2.22 (signature == "BGRT")
|
||||
Optional, not currently supported, with no real use-case for an
|
||||
ARM server.
|
||||
|
||||
CEDT Signature Reserved (signature == "CEDT")
|
||||
|
||||
**CXL Early Discovery Table**
|
||||
|
||||
This table allows the OS to discover any CXL Host Bridges and the Host
|
||||
Bridge registers.
|
||||
|
||||
CPEP Section 5.2.18 (signature == "CPEP")
|
||||
|
||||
**Corrected Platform Error Polling table**
|
||||
@ -184,6 +212,15 @@ HEST Section 18.3.2 (signature == "HEST")
|
||||
Must be supplied if RAS support is provided by the platform. It
|
||||
is recommended this table be supplied.
|
||||
|
||||
HMAT Section 5.2.28 (signature == "HMAT")
|
||||
|
||||
**Heterogeneous Memory Attribute Table**
|
||||
|
||||
This table describes the memory attributes, such as memory side cache
|
||||
attributes and bandwidth and latency details, related to Memory Proximity
|
||||
Domains. The OS uses this information to optimize the system memory
|
||||
configuration.
|
||||
|
||||
HPET Signature Reserved (signature == "HPET")
|
||||
|
||||
**High Precision Event timer Table**
|
||||
@ -241,6 +278,13 @@ MCHI Signature Reserved (signature == "MCHI")
|
||||
|
||||
Optional, not currently supported.
|
||||
|
||||
MPAM Signature Reserved (signature == "MPAM")
|
||||
|
||||
**Memory Partitioning And Monitoring table**
|
||||
|
||||
This table allows the OS to discover the MPAM controls implemented by
|
||||
the subsystems.
|
||||
|
||||
MPST Section 5.2.21 (signature == "MPST")
|
||||
|
||||
**Memory Power State Table**
|
||||
@ -281,18 +325,39 @@ PCCT Section 14.1 (signature == "PCCT)
|
||||
Recommended for use on arm64; use of PCC is recommended when using CPPC
|
||||
to control performance and power for platform processors.
|
||||
|
||||
PDTT Section 5.2.29 (signature == "PDTT")
|
||||
|
||||
**Platform Debug Trigger Table**
|
||||
|
||||
This table describes PCC channels used to gather debug logs of
|
||||
non-architectural features.
|
||||
|
||||
|
||||
PMTT Section 5.2.21.12 (signature == "PMTT")
|
||||
|
||||
**Platform Memory Topology Table**
|
||||
|
||||
Optional, not currently supported.
|
||||
|
||||
PPTT Section 5.2.30 (signature == "PPTT")
|
||||
|
||||
**Processor Properties Topology Table**
|
||||
|
||||
This table provides the processor and cache topology.
|
||||
|
||||
PSDT Section 5.2.11.3 (signature == "PSDT")
|
||||
|
||||
**Persistent System Description Table**
|
||||
|
||||
Obsolete table, will not be supported.
|
||||
|
||||
RAS2 Section 5.2.21 (signature == "RAS2")
|
||||
|
||||
**RAS Features 2 table**
|
||||
|
||||
This table provides interfaces for the RAS capabilities implemented in
|
||||
the platform.
|
||||
|
||||
RASF Section 5.2.20 (signature == "RASF")
|
||||
|
||||
**RAS Feature table**
|
||||
@ -318,6 +383,12 @@ SBST Section 5.2.14 (signature == "SBST")
|
||||
|
||||
Optional, not currently supported.
|
||||
|
||||
SDEI Signature Reserved (signature == "SDEI")
|
||||
|
||||
**Software Delegated Exception Interface table**
|
||||
|
||||
This table advertises the presence of the SDEI interface.
|
||||
|
||||
SLIC Signature Reserved (signature == "SLIC")
|
||||
|
||||
**Software LIcensing table**
|
@ -1,40 +1,41 @@
|
||||
=====================
|
||||
ACPI on ARMv8 Servers
|
||||
=====================
|
||||
===================
|
||||
ACPI on Arm systems
|
||||
===================
|
||||
|
||||
ACPI can be used for ARMv8 general purpose servers designed to follow
|
||||
the ARM SBSA (Server Base System Architecture) [0] and SBBR (Server
|
||||
Base Boot Requirements) [1] specifications. Please note that the SBBR
|
||||
can be retrieved simply by visiting [1], but the SBSA is currently only
|
||||
available to those with an ARM login due to ARM IP licensing concerns.
|
||||
ACPI can be used for Armv8 and Armv9 systems designed to follow
|
||||
the BSA (Arm Base System Architecture) [0] and BBR (Arm
|
||||
Base Boot Requirements) [1] specifications. Both BSA and BBR are publicly
|
||||
accessible documents.
|
||||
Arm Servers, in addition to being BSA compliant, comply with a set
|
||||
of rules defined in SBSA (Server Base System Architecture) [2].
|
||||
|
||||
The ARMv8 kernel implements the reduced hardware model of ACPI version
|
||||
The Arm kernel implements the reduced hardware model of ACPI version
|
||||
5.1 or later. Links to the specification and all external documents
|
||||
it refers to are managed by the UEFI Forum. The specification is
|
||||
available at http://www.uefi.org/specifications and documents referenced
|
||||
by the specification can be found via http://www.uefi.org/acpi.
|
||||
|
||||
If an ARMv8 system does not meet the requirements of the SBSA and SBBR,
|
||||
If an Arm system does not meet the requirements of the BSA and BBR,
|
||||
or cannot be described using the mechanisms defined in the required ACPI
|
||||
specifications, then ACPI may not be a good fit for the hardware.
|
||||
|
||||
While the documents mentioned above set out the requirements for building
|
||||
industry-standard ARMv8 servers, they also apply to more than one operating
|
||||
industry-standard Arm systems, they also apply to more than one operating
|
||||
system. The purpose of this document is to describe the interaction between
|
||||
ACPI and Linux only, on an ARMv8 system -- that is, what Linux expects of
|
||||
ACPI and Linux only, on an Arm system -- that is, what Linux expects of
|
||||
ACPI and what ACPI can expect of Linux.
|
||||
|
||||
|
||||
Why ACPI on ARM?
|
||||
Why ACPI on Arm?
|
||||
----------------
|
||||
Before examining the details of the interface between ACPI and Linux, it is
|
||||
useful to understand why ACPI is being used. Several technologies already
|
||||
exist in Linux for describing non-enumerable hardware, after all. In this
|
||||
section we summarize a blog post [2] from Grant Likely that outlines the
|
||||
reasoning behind ACPI on ARMv8 servers. Actually, we snitch a good portion
|
||||
section we summarize a blog post [3] from Grant Likely that outlines the
|
||||
reasoning behind ACPI on Arm systems. Actually, we snitch a good portion
|
||||
of the summary text almost directly, to be honest.
|
||||
|
||||
The short form of the rationale for ACPI on ARM is:
|
||||
The short form of the rationale for ACPI on Arm is:
|
||||
|
||||
- ACPI’s byte code (AML) allows the platform to encode hardware behavior,
|
||||
while DT explicitly does not support this. For hardware vendors, being
|
||||
@ -47,7 +48,7 @@ The short form of the rationale for ACPI on ARM is:
|
||||
|
||||
- In the enterprise server environment, ACPI has established bindings (such
|
||||
as for RAS) which are currently used in production systems. DT does not.
|
||||
Such bindings could be defined in DT at some point, but doing so means ARM
|
||||
Such bindings could be defined in DT at some point, but doing so means Arm
|
||||
and x86 would end up using completely different code paths in both firmware
|
||||
and the kernel.
|
||||
|
||||
@ -108,7 +109,7 @@ recent version of the kernel.
|
||||
|
||||
Relationship with Device Tree
|
||||
-----------------------------
|
||||
ACPI support in drivers and subsystems for ARMv8 should never be mutually
|
||||
ACPI support in drivers and subsystems for Arm should never be mutually
|
||||
exclusive with DT support at compile time.
|
||||
|
||||
At boot time the kernel will only use one description method depending on
|
||||
@ -121,11 +122,11 @@ time).
|
||||
|
||||
Booting using ACPI tables
|
||||
-------------------------
|
||||
The only defined method for passing ACPI tables to the kernel on ARMv8
|
||||
The only defined method for passing ACPI tables to the kernel on Arm
|
||||
is via the UEFI system configuration table. Just so it is explicit, this
|
||||
means that ACPI is only supported on platforms that boot via UEFI.
|
||||
|
||||
When an ARMv8 system boots, it can either have DT information, ACPI tables,
|
||||
When an Arm system boots, it can either have DT information, ACPI tables,
|
||||
or in some very unusual cases, both. If no command line parameters are used,
|
||||
the kernel will try to use DT for device enumeration; if there is no DT
|
||||
present, the kernel will try to use ACPI tables, but only if they are present.
|
||||
@ -169,7 +170,7 @@ hardware reduced mode must be set to zero.
|
||||
|
||||
For the ACPI core to operate properly, and in turn provide the information
|
||||
the kernel needs to configure devices, it expects to find the following
|
||||
tables (all section numbers refer to the ACPI 6.1 specification):
|
||||
tables (all section numbers refer to the ACPI 6.5 specification):
|
||||
|
||||
- RSDP (Root System Description Pointer), section 5.2.5
|
||||
|
||||
@ -184,20 +185,76 @@ tables (all section numbers refer to the ACPI 6.1 specification):
|
||||
|
||||
- GTDT (Generic Timer Description Table), section 5.2.24
|
||||
|
||||
- PPTT (Processor Properties Topology Table), section 5.2.30
|
||||
|
||||
- DBG2 (DeBuG port table 2), section 5.2.6, specifically Table 5-6.
|
||||
|
||||
- APMT (Arm Performance Monitoring unit Table), section 5.2.6, specifically Table 5-6.
|
||||
|
||||
- AGDI (Arm Generic diagnostic Dump and Reset Device Interface Table), section 5.2.6, specifically Table 5-6.
|
||||
|
||||
- If PCI is supported, the MCFG (Memory mapped ConFiGuration
|
||||
Table), section 5.2.6, specifically Table 5-31.
|
||||
Table), section 5.2.6, specifically Table 5-6.
|
||||
|
||||
- If booting without a console=<device> kernel parameter is
|
||||
supported, the SPCR (Serial Port Console Redirection table),
|
||||
section 5.2.6, specifically Table 5-31.
|
||||
section 5.2.6, specifically Table 5-6.
|
||||
|
||||
- If necessary to describe the I/O topology, SMMUs and GIC ITSs,
|
||||
the IORT (Input Output Remapping Table, section 5.2.6, specifically
|
||||
Table 5-31).
|
||||
Table 5-6).
|
||||
|
||||
- If NUMA is supported, the following tables are required:
|
||||
|
||||
- SRAT (System Resource Affinity Table), section 5.2.16
|
||||
|
||||
- SLIT (System Locality distance Information Table), section 5.2.17
|
||||
|
||||
- If NUMA is supported, and the system contains heterogeneous memory,
|
||||
the HMAT (Heterogeneous Memory Attribute Table), section 5.2.28.
|
||||
|
||||
- If the ACPI Platform Error Interfaces are required, the following
|
||||
tables are conditionally required:
|
||||
|
||||
- BERT (Boot Error Record Table, section 18.3.1)
|
||||
|
||||
- EINJ (Error INJection table, section 18.6.1)
|
||||
|
||||
- ERST (Error Record Serialization Table, section 18.5)
|
||||
|
||||
- HEST (Hardware Error Source Table, section 18.3.2)
|
||||
|
||||
- SDEI (Software Delegated Exception Interface table, section 5.2.6,
|
||||
specifically Table 5-6)
|
||||
|
||||
- AEST (Arm Error Source Table, section 5.2.6,
|
||||
specifically Table 5-6)
|
||||
|
||||
- RAS2 (ACPI RAS2 feature table, section 5.2.21)
|
||||
|
||||
- If the system contains controllers using PCC channel, the
|
||||
PCCT (Platform Communications Channel Table), section 14.1
|
||||
|
||||
- If the system contains a controller to capture board-level system state,
|
||||
and communicates with the host via PCC, the PDTT (Platform Debug Trigger
|
||||
Table), section 5.2.29.
|
||||
|
||||
- If NVDIMM is supported, the NFIT (NVDIMM Firmware Interface Table), section 5.2.26
|
||||
|
||||
- If video framebuffer is present, the BGRT (Boot Graphics Resource Table), section 5.2.23
|
||||
|
||||
- If IPMI is implemented, the SPMI (Server Platform Management Interface),
|
||||
section 5.2.6, specifically Table 5-6.
|
||||
|
||||
- If the system contains a CXL Host Bridge, the CEDT (CXL Early Discovery
|
||||
Table), section 5.2.6, specifically Table 5-6.
|
||||
|
||||
- If the system supports MPAM, the MPAM (Memory Partitioning And Monitoring table), section 5.2.6,
|
||||
specifically Table 5-6.
|
||||
|
||||
- If the system lacks persistent storage, the IBFT (ISCSI Boot Firmware
|
||||
Table), section 5.2.6, specifically Table 5-6.
|
||||
|
||||
- If NUMA is supported, the SRAT (System Resource Affinity Table)
|
||||
and SLIT (System Locality distance Information Table), sections
|
||||
5.2.16 and 5.2.17, respectively.
|
||||
|
||||
If the above tables are not all present, the kernel may or may not be
|
||||
able to boot properly since it may not be able to configure all of the
|
||||
@ -269,16 +326,14 @@ Drivers should look for device properties in the _DSD object ONLY; the _DSD
|
||||
object is described in the ACPI specification section 6.2.5, but this only
|
||||
describes how to define the structure of an object returned via _DSD, and
|
||||
how specific data structures are defined by specific UUIDs. Linux should
|
||||
only use the _DSD Device Properties UUID [5]:
|
||||
only use the _DSD Device Properties UUID [4]:
|
||||
|
||||
- UUID: daffd814-6eba-4d8c-8a91-bc9bbf4aa301
|
||||
|
||||
- https://www.uefi.org/sites/default/files/resources/_DSD-device-properties-UUID.pdf
|
||||
|
||||
The UEFI Forum provides a mechanism for registering device properties [4]
|
||||
so that they may be used across all operating systems supporting ACPI.
|
||||
Device properties that have not been registered with the UEFI Forum should
|
||||
not be used.
|
||||
Common device properties can be registered by creating a pull request to [4] so
|
||||
that they may be used across all operating systems supporting ACPI.
|
||||
Device properties that have not been registered with the UEFI Forum can be used
|
||||
but not as "uefi-" common properties.
|
||||
|
||||
Before creating new device properties, check to be sure that they have not
|
||||
been defined before and either registered in the Linux kernel documentation
|
||||
@ -306,7 +361,7 @@ process.
|
||||
|
||||
Once registration and review have been completed, the kernel provides an
|
||||
interface for looking up device properties in a manner independent of
|
||||
whether DT or ACPI is being used. This API should be used [6]; it can
|
||||
whether DT or ACPI is being used. This API should be used [5]; it can
|
||||
eliminate some duplication of code paths in driver probing functions and
|
||||
discourage divergence between DT bindings and ACPI device properties.
|
||||
|
||||
@ -448,15 +503,15 @@ ASWG
|
||||
----
|
||||
The ACPI specification changes regularly. During the year 2014, for instance,
|
||||
version 5.1 was released and version 6.0 substantially completed, with most of
|
||||
the changes being driven by ARM-specific requirements. Proposed changes are
|
||||
the changes being driven by Arm-specific requirements. Proposed changes are
|
||||
presented and discussed in the ASWG (ACPI Specification Working Group) which
|
||||
is a part of the UEFI Forum. The current version of the ACPI specification
|
||||
is 6.1 release in January 2016.
|
||||
is 6.5, released in August 2022.
|
||||
|
||||
Participation in this group is open to all UEFI members. Please see
|
||||
http://www.uefi.org/workinggroup for details on group membership.
|
||||
|
||||
It is the intent of the ARMv8 ACPI kernel code to follow the ACPI specification
|
||||
It is the intent of the Arm ACPI kernel code to follow the ACPI specification
|
||||
as closely as possible, and to only implement functionality that complies with
|
||||
the released standards from UEFI ASWG. As a practical matter, there will be
|
||||
vendors that provide bad ACPI tables or violate the standards in some way.
|
||||
@ -470,12 +525,12 @@ likely be willing to assist in submitting ECRs.
|
||||
|
||||
Linux Code
|
||||
----------
|
||||
Individual items specific to Linux on ARM, contained in the Linux
|
||||
Individual items specific to Linux on Arm, contained in the Linux
|
||||
source code, are in the list that follows:
|
||||
|
||||
ACPI_OS_NAME
|
||||
This macro defines the string to be returned when
|
||||
an ACPI method invokes the _OS method. On ARM64
|
||||
an ACPI method invokes the _OS method. On Arm
|
||||
systems, this macro will be "Linux" by default.
|
||||
The command line parameter acpi_os=<string>
|
||||
can be used to set it to some other value. The
|
||||
@ -485,36 +540,28 @@ ACPI_OS_NAME
|
||||
ACPI Objects
|
||||
------------
|
||||
Detailed expectations for ACPI tables and objects are listed in the file
|
||||
Documentation/arm64/acpi_object_usage.rst.
|
||||
Documentation/arch/arm64/acpi_object_usage.rst.
|
||||
|
||||
|
||||
References
|
||||
----------
|
||||
[0] http://silver.arm.com
|
||||
document ARM-DEN-0029, or newer:
|
||||
"Server Base System Architecture", version 2.3, dated 27 Mar 2014
|
||||
[0] https://developer.arm.com/documentation/den0094/latest
|
||||
document Arm-DEN-0094: "Arm Base System Architecture", version 1.0C, dated 6 Oct 2022
|
||||
|
||||
[1] http://infocenter.arm.com/help/topic/com.arm.doc.den0044a/Server_Base_Boot_Requirements.pdf
|
||||
Document ARM-DEN-0044A, or newer: "Server Base Boot Requirements, System
|
||||
Software on ARM Platforms", dated 16 Aug 2014
|
||||
[1] https://developer.arm.com/documentation/den0044/latest
|
||||
Document Arm-DEN-0044: "Arm Base Boot Requirements", version 2.0G, dated 15 Apr 2022
|
||||
|
||||
[2] http://www.secretlab.ca/archives/151,
|
||||
[2] https://developer.arm.com/documentation/den0029/latest
|
||||
Document Arm-DEN-0029: "Arm Server Base System Architecture", version 7.1, dated 06 Oct 2022
|
||||
|
||||
[3] http://www.secretlab.ca/archives/151,
|
||||
10 Jan 2015, Copyright (c) 2015,
|
||||
Linaro Ltd., written by Grant Likely.
|
||||
|
||||
[3] AMD ACPI for Seattle platform documentation
|
||||
http://amd-dev.wpengine.netdna-cdn.com/wordpress/media/2012/10/Seattle_ACPI_Guide.pdf
|
||||
[4] _DSD (Device Specific Data) Implementation Guide
|
||||
https://github.com/UEFI/DSD-Guide/blob/main/dsd-guide.pdf
|
||||
|
||||
|
||||
[4] http://www.uefi.org/acpi
|
||||
please see the link for the "ACPI _DSD Device
|
||||
Property Registry Instructions"
|
||||
|
||||
[5] http://www.uefi.org/acpi
|
||||
please see the link for the "_DSD (Device
|
||||
Specific Data) Implementation Guide"
|
||||
|
||||
[6] Kernel code for the unified device
|
||||
[5] Kernel code for the unified device
|
||||
property interface can be found in
|
||||
include/linux/property.h and drivers/base/property.c.
|
||||
|
@ -379,6 +379,38 @@ Before jumping into the kernel, the following conditions must be met:
|
||||
|
||||
- SMCR_EL2.EZT0 (bit 30) must be initialised to 0b1.
|
||||
|
||||
For CPUs with Memory Copy and Memory Set instructions (FEAT_MOPS):
|
||||
|
||||
- If the kernel is entered at EL1 and EL2 is present:
|
||||
|
||||
- HCRX_EL2.MSCEn (bit 11) must be initialised to 0b1.
|
||||
|
||||
For CPUs with the Extended Translation Control Register feature (FEAT_TCR2):
|
||||
|
||||
- If EL3 is present:
|
||||
|
||||
- SCR_EL3.TCR2En (bit 43) must be initialised to 0b1.
|
||||
|
||||
- If the kernel is entered at EL1 and EL2 is present:
|
||||
|
||||
- HCRX_EL2.TCR2En (bit 14) must be initialised to 0b1.
|
||||
|
||||
For CPUs with the Stage 1 Permission Indirection Extension feature (FEAT_S1PIE):
|
||||
|
||||
- If EL3 is present:
|
||||
|
||||
- SCR_EL3.PIEn (bit 45) must be initialised to 0b1.
|
||||
|
||||
- If the kernel is entered at EL1 and EL2 is present:
|
||||
|
||||
- HFGRTR_EL2.nPIR_EL1 (bit 58) must be initialised to 0b1.
|
||||
|
||||
- HFGWTR_EL2.nPIR_EL1 (bit 58) must be initialised to 0b1.
|
||||
|
||||
- HFGRTR_EL2.nPIRE0_EL1 (bit 57) must be initialised to 0b1.
|
||||
|
||||
- HFGWTR_EL2.nPIRE0_EL1 (bit 57) must be initialised to 0b1.
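The bit requirements listed above lend themselves to a table-driven check. The following is a minimal sketch, assuming the caller already has the raw 64-bit register values in a dictionary; `REQUIRED_BITS` and `missing_bits` are hypothetical names, not kernel or firmware APIs. The last FEAT_S1PIE entry assumes the write-trap register HFGWTR_EL2 is the one intended for nPIRE0_EL1, matching the nPIR_EL1 pair.

```python
# (register, field, bit) triples copied from the requirements above.
# Which registers apply depends on whether EL3/EL2 are present; that
# decision is left to the caller, who simply omits absent registers.
REQUIRED_BITS = {
    "FEAT_MOPS": [("HCRX_EL2", "MSCEn", 11)],
    "FEAT_TCR2": [("SCR_EL3", "TCR2En", 43), ("HCRX_EL2", "TCR2En", 14)],
    "FEAT_S1PIE": [
        ("SCR_EL3", "PIEn", 45),
        ("HFGRTR_EL2", "nPIR_EL1", 58),
        ("HFGWTR_EL2", "nPIR_EL1", 58),
        ("HFGRTR_EL2", "nPIRE0_EL1", 57),
        ("HFGWTR_EL2", "nPIRE0_EL1", 57),  # assumption: HFGWTR_EL2 is meant
    ],
}


def missing_bits(feature, regs):
    """Return the (register, field) pairs whose required bit is not 0b1.

    `regs` maps a register name to its raw value. Registers absent from
    `regs` are skipped (for example SCR_EL3 when EL3 is not present).
    """
    missing = []
    for reg, field, bit in REQUIRED_BITS[feature]:
        if reg in regs and not (regs[reg] >> bit) & 1:
            missing.append((reg, field))
    return missing
```

For example, `missing_bits("FEAT_TCR2", {"SCR_EL3": 1 << 43, "HCRX_EL2": 0})` reports only the HCRX_EL2 field as unset.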
|
||||
|
||||
The requirements described above for CPU mode, caches, MMUs, architected
|
||||
timers, coherency and system registers apply to all CPUs. All CPUs must
|
||||
enter the kernel in the same exception level. Where the values documented
|
@ -288,6 +288,8 @@ infrastructure:
|
||||
+------------------------------+---------+---------+
| Name                         |  bits   | visible |
+------------------------------+---------+---------+
| MOPS                         | [19-16] |    y    |
+------------------------------+---------+---------+
| RPRES                        | [7-4]   |    y    |
+------------------------------+---------+---------+
| WFXT                         | [3-0]   |    y    |
|
@ -102,7 +102,7 @@ HWCAP_ASIMDHP
|
||||
|
||||
HWCAP_CPUID
|
||||
EL0 access to certain ID registers is available, to the extent
|
||||
described by Documentation/arm64/cpu-feature-registers.rst.
|
||||
described by Documentation/arch/arm64/cpu-feature-registers.rst.
|
||||
|
||||
These ID registers may imply the availability of features.
|
||||
|
||||
@ -163,12 +163,12 @@ HWCAP_SB
|
||||
HWCAP_PACA
|
||||
Functionality implied by ID_AA64ISAR1_EL1.APA == 0b0001 or
|
||||
ID_AA64ISAR1_EL1.API == 0b0001, as described by
|
||||
Documentation/arm64/pointer-authentication.rst.
|
||||
Documentation/arch/arm64/pointer-authentication.rst.
|
||||
|
||||
HWCAP_PACG
|
||||
Functionality implied by ID_AA64ISAR1_EL1.GPA == 0b0001 or
|
||||
ID_AA64ISAR1_EL1.GPI == 0b0001, as described by
|
||||
Documentation/arm64/pointer-authentication.rst.
|
||||
Documentation/arch/arm64/pointer-authentication.rst.
|
||||
|
||||
HWCAP2_DCPODP
|
||||
Functionality implied by ID_AA64ISAR1_EL1.DPB == 0b0010.
|
||||
@ -226,7 +226,7 @@ HWCAP2_BTI
|
||||
|
||||
HWCAP2_MTE
|
||||
Functionality implied by ID_AA64PFR1_EL1.MTE == 0b0010, as described
|
||||
by Documentation/arm64/memory-tagging-extension.rst.
|
||||
by Documentation/arch/arm64/memory-tagging-extension.rst.
|
||||
|
||||
HWCAP2_ECV
|
||||
Functionality implied by ID_AA64MMFR0_EL1.ECV == 0b0001.
|
||||
@ -239,11 +239,11 @@ HWCAP2_RPRES
|
||||
|
||||
HWCAP2_MTE3
|
||||
Functionality implied by ID_AA64PFR1_EL1.MTE == 0b0011, as described
|
||||
by Documentation/arm64/memory-tagging-extension.rst.
|
||||
by Documentation/arch/arm64/memory-tagging-extension.rst.
|
||||
|
||||
HWCAP2_SME
|
||||
Functionality implied by ID_AA64PFR1_EL1.SME == 0b0001, as described
|
||||
by Documentation/arm64/sme.rst.
|
||||
by Documentation/arch/arm64/sme.rst.
|
||||
|
||||
HWCAP2_SME_I16I64
|
||||
Functionality implied by ID_AA64SMFR0_EL1.I16I64 == 0b1111.
|
||||
@ -302,6 +302,9 @@ HWCAP2_SMEB16B16
|
||||
HWCAP2_SMEF16F16
|
||||
Functionality implied by ID_AA64SMFR0_EL1.F16F16 == 0b1
|
||||
|
||||
HWCAP2_MOPS
|
||||
Functionality implied by ID_AA64ISAR2_EL1.MOPS == 0b0001.
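As an illustration of what "ID_AA64ISAR2_EL1.MOPS == 0b0001" means in terms of bits, the sketch below extracts a field from a raw register value. It assumes the MOPS field occupies bits [19-16], as in the cpu-feature-registers table changed elsewhere in this diff; `id_field` and `has_mops` are hypothetical helper names, and obtaining the register value itself is not shown.

```python
def id_field(reg_value: int, msb: int, lsb: int) -> int:
    """Extract the unsigned field reg_value[msb:lsb] (inclusive bounds)."""
    width = msb - lsb + 1
    return (reg_value >> lsb) & ((1 << width) - 1)


def has_mops(id_aa64isar2_el1: int) -> bool:
    """Apply the documented condition: the MOPS field equals 0b0001."""
    return id_field(id_aa64isar2_el1, 19, 16) == 0b0001
```

Userspace would normally test the HWCAP2_MOPS bit rather than read the register; the helper only makes the field layout concrete.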
|
||||
|
||||
4. Unused AT_HWCAP bits
|
||||
-----------------------
|
||||
|
@ -15,11 +15,13 @@ ARM64 Architecture
|
||||
cpu-feature-registers
|
||||
elf_hwcaps
|
||||
hugetlbpage
|
||||
kdump
|
||||
legacy_instructions
|
||||
memory
|
||||
memory-tagging-extension
|
||||
perf
|
||||
pointer-authentication
|
||||
ptdump
|
||||
silicon-errata
|
||||
sme
|
||||
sve
|
92
Documentation/arch/arm64/kdump.rst
Normal file
@ -0,0 +1,92 @@
|
||||
=======================================
|
||||
crashkernel memory reservation on arm64
|
||||
=======================================
|
||||
|
||||
Author: Baoquan He <bhe@redhat.com>
|
||||
|
||||
The kdump mechanism captures the vmcore of a corrupted kernel so that it
can be analyzed afterwards. To do this, memory must be reserved in
advance to pre-load the kdump kernel, which is booted if corruption
happens.

The memory reserved for kdump is sized to accommodate, at a minimum, the
kdump kernel and the user space programs needed for vmcore collection.
|
||||
|
||||
Kernel parameter
|
||||
================
|
||||
|
||||
Through the kernel parameters below, memory can be reserved accordingly
|
||||
during the early stage of the first kernel booting so that a continuous
|
||||
large chunk of memory can be found. The low memory reservation needs to
|
||||
be considered if the crashkernel is reserved from the high memory area.
|
||||
|
||||
- crashkernel=size@offset
|
||||
- crashkernel=size
|
||||
- crashkernel=size,high crashkernel=size,low
|
||||
|
||||
Low memory and high memory
|
||||
==========================
|
||||
|
||||
For kdump reservations, low memory is the memory area under a specific
|
||||
limit, usually decided by the accessible address bits of the DMA-capable
|
||||
devices needed by the kdump kernel to run. Those devices not related to
|
||||
vmcore dumping can be ignored. On arm64, the low memory upper bound is
|
||||
not fixed: it is 1G on the RPi4 platform but 4G on most other systems.
|
||||
On special kernels built with CONFIG_ZONE_(DMA|DMA32) disabled, the
|
||||
whole system RAM is low memory. Outside of the low memory described
|
||||
above, the rest of system RAM is considered high memory.
|
||||
|
||||
Implementation
|
||||
==============
|
||||
|
||||
1) crashkernel=size@offset
|
||||
--------------------------
|
||||
|
||||
The crashkernel memory must be reserved at the user-specified region or
|
||||
fail if already occupied.
|
||||
|
||||
|
||||
2) crashkernel=size
|
||||
-------------------
|
||||
|
||||
The crashkernel memory region will be reserved in any available position
|
||||
according to the search order:
|
||||
|
||||
Firstly, the kernel searches the low memory area for an available region
|
||||
with the specified size.
|
||||
|
||||
If searching for low memory fails, the kernel falls back to searching
|
||||
the high memory area for an available region of the specified size. If
|
||||
the reservation in high memory succeeds, a default size reservation in
|
||||
the low memory will be done. Currently the default size is 128M,
|
||||
sufficient for the low memory needs of the kdump kernel.
|
||||
|
||||
Note: crashkernel=size is the recommended option for crashkernel
|
||||
reservations. The user would not need to know the system memory layout
|
||||
for a specific platform.
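The search order for crashkernel=size can be modelled in a few lines. This is a sketch under simplifying assumptions, not the kernel implementation: each area is represented only by the size of its largest free contiguous region, the names are hypothetical, and the case where the default low reservation itself fails is not modelled because the text does not describe it.

```python
DEFAULT_LOW_SIZE = 128 << 20  # 128M default low reservation stated above


def reserve_crashkernel(size, low_free, high_free):
    """Model the crashkernel=size search order.

    low_free / high_free: largest free contiguous region (in bytes) in
    the low and high memory areas. Returns a list of (area, bytes)
    reservations, or an empty list if no region is large enough.
    """
    if low_free >= size:
        # First choice: the whole reservation fits in low memory.
        return [("low", size)]
    if high_free >= size:
        # Fallback: reserve in high memory, plus the default low chunk.
        return [("high", size), ("low", DEFAULT_LOW_SIZE)]
    return []
```

For instance, a 1G request on a system with only 512M free in low memory but 8G free in high memory yields a 1G high reservation plus the 128M low reservation.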
|
||||
|
||||
3) crashkernel=size,high crashkernel=size,low
|
||||
---------------------------------------------
|
||||
|
||||
crashkernel=size,(high|low) are an important supplement to
|
||||
crashkernel=size. They allow the user to specify how much memory needs
|
||||
to be allocated from the high memory and low memory respectively. On
|
||||
many systems the low memory is precious and crashkernel reservations
|
||||
from this area should be kept to a minimum.
|
||||
|
||||
To reserve memory for crashkernel=size,high, searching is first
|
||||
attempted from the high memory region. If the reservation succeeds, the
|
||||
low memory reservation will be done subsequently.
|
||||
|
||||
If reservation from the high memory failed, the kernel falls back to
|
||||
searching the low memory with the specified size in crashkernel=,high.
|
||||
If it succeeds, no further reservation for low memory is needed.
|
||||
|
||||
Notes:
|
||||
|
||||
- If crashkernel=,low is not specified, the default low memory
|
||||
reservation will be done automatically.
|
||||
|
||||
- If crashkernel=0,low is specified, it means that the low memory
|
||||
reservation is omitted intentionally.
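The crashkernel=size,high and crashkernel=size,low rules, including the two notes, can be sketched as follows. As before this is an illustrative model rather than kernel code: areas are reduced to their largest free contiguous region, the names are hypothetical, and the 128M default is assumed to be the same default low size mentioned for crashkernel=size.

```python
DEFAULT_LOW_SIZE = 128 << 20  # assumption: same 128M default as crashkernel=size


def reserve_high_low(high_size, low_size, low_free, high_free):
    """Model crashkernel=size,high with an optional crashkernel=size,low.

    low_size is None when crashkernel=,low is not given (the default low
    reservation is used) and 0 for crashkernel=0,low (no low reservation).
    Returns a list of (area, bytes) reservations, or [] on failure.
    """
    if low_size is None:
        low_size = DEFAULT_LOW_SIZE
    if high_free >= high_size:
        reservations = [("high", high_size)]
        if low_size > 0:
            reservations.append(("low", low_size))
        return reservations
    if low_free >= high_size:
        # Fallback: the ,high size is taken from low memory instead, and
        # no further low memory reservation is needed.
        return [("low", high_size)]
    return []
```

Passing `low_size=0` corresponds to crashkernel=0,low and suppresses the low reservation entirely.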
|
@ -221,7 +221,7 @@ programs should not retry in case of a non-zero system call return.
|
||||
``NT_ARM_TAGGED_ADDR_CTRL`` allow ``ptrace()`` access to the tagged
|
||||
address ABI control and MTE configuration of a process as per the
|
||||
``prctl()`` options described in
|
||||
Documentation/arm64/tagged-address-abi.rst and above. The corresponding
|
||||
Documentation/arch/arm64/tagged-address-abi.rst and above. The corresponding
|
||||
``regset`` is 1 element of 8 bytes (``sizeof(long)``).
|
||||
|
||||
Core dump support
|
@ -33,8 +33,8 @@ AArch64 Linux memory layout with 4KB pages + 4 levels (48-bit)::
|
||||
   0000000000000000  0000ffffffffffff   256TB  user
   ffff000000000000  ffff7fffffffffff   128TB  kernel logical memory map
  [ffff600000000000  ffff7fffffffffff]   32TB  [kasan shadow region]
-  ffff800000000000  ffff800007ffffff   128MB  modules
-  ffff800008000000  fffffbffefffffff   124TB  vmalloc
+  ffff800000000000  ffff80007fffffff     2GB  modules
+  ffff800080000000  fffffbffefffffff   124TB  vmalloc
   fffffbfff0000000  fffffbfffdffffff   224MB  fixed mappings (top down)
   fffffbfffe000000  fffffbfffe7fffff     8MB  [guard region]
   fffffbfffe800000  fffffbffff7fffff    16MB  PCI I/O space
|
||||
@ -50,8 +50,8 @@ AArch64 Linux memory layout with 64KB pages + 3 levels (52-bit with HW support):
|
||||
   0000000000000000  000fffffffffffff     4PB  user
   fff0000000000000  ffff7fffffffffff    ~4PB  kernel logical memory map
  [fffd800000000000  ffff7fffffffffff]  512TB  [kasan shadow region]
-  ffff800000000000  ffff800007ffffff   128MB  modules
-  ffff800008000000  fffffbffefffffff   124TB  vmalloc
+  ffff800000000000  ffff80007fffffff     2GB  modules
+  ffff800080000000  fffffbffefffffff   124TB  vmalloc
   fffffbfff0000000  fffffbfffdffffff   224MB  fixed mappings (top down)
   fffffbfffe000000  fffffbfffe7fffff     8MB  [guard region]
   fffffbfffe800000  fffffbffff7fffff    16MB  PCI I/O space
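The size column in these layout tables follows directly from the inclusive start and end addresses. A small sketch (hypothetical helper names) that can be used to check the modules region, which appears in this hunk both as ffff800000000000-ffff800007ffffff (128MB) and as ffff800000000000-ffff80007fffffff (2GB):

```python
def region_size(start_hex: str, end_hex: str) -> int:
    """Size in bytes of an inclusive [start, end] address range."""
    return int(end_hex, 16) - int(start_hex, 16) + 1


def human(size: int) -> str:
    """Render a byte count using the binary units used in the layout tables."""
    for unit, shift in (("PB", 50), ("TB", 40), ("GB", 30), ("MB", 20), ("KB", 10)):
        if size >= 1 << shift:
            return f"{size / (1 << shift):g}{unit}"
    return f"{size}B"
```

Sizes that are not an exact multiple of a unit come out fractional, which is why the tables round some entries (the vmalloc range, for example, is slightly under the listed 124TB).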
|
96
Documentation/arch/arm64/ptdump.rst
Normal file
@ -0,0 +1,96 @@
|
||||
======================
|
||||
Kernel page table dump
|
||||
======================
|
||||
|
||||
ptdump is a debugfs interface that provides a detailed dump of the
|
||||
kernel page tables. It offers a comprehensive overview of the kernel
|
||||
virtual memory layout as well as the attributes associated with the
|
||||
various regions in a human-readable format. It is useful to dump the
|
||||
kernel page tables to verify permissions and memory types. Examining the
|
||||
page table entries and permissions helps identify potential security
|
||||
vulnerabilities such as mappings with overly permissive access rights or
|
||||
improper memory protections.
|
||||
|
||||
Memory hotplug allows dynamic expansion or contraction of available
|
||||
memory without requiring a system reboot. To maintain the consistency
|
||||
and integrity of the memory management data structures, arm64 makes use
|
||||
of the ``mem_hotplug_lock`` semaphore in write mode. Additionally, in
|
||||
read mode, ``mem_hotplug_lock`` supports an efficient implementation of
|
||||
``get_online_mems()`` and ``put_online_mems()``. These protect the
|
||||
offlining of memory being accessed by the ptdump code.
|
||||
|
||||
In order to dump the kernel page tables, enable the following
|
||||
configurations and mount debugfs::
|
||||
|
||||
CONFIG_GENERIC_PTDUMP=y
|
||||
CONFIG_PTDUMP_CORE=y
|
||||
CONFIG_PTDUMP_DEBUGFS=y
|
||||
|
||||
mount -t debugfs nodev /sys/kernel/debug
|
||||
cat /sys/kernel/debug/kernel_page_tables
|
||||
|
||||
On analysing the output of ``cat /sys/kernel/debug/kernel_page_tables``
|
||||
one can derive information about the virtual address range of the entry,
|
||||
followed by size of the memory region covered by this entry, the
|
||||
hierarchical structure of the page tables and finally the attributes
|
||||
associated with each page. The page attributes provide information about
|
||||
access permissions, execution capability, type of mapping such as leaf
|
||||
level PTE or block level PGD, PMD and PUD, and access status of a page
|
||||
within the kernel memory. Assessing these attributes can assist in
|
||||
understanding the memory layout, access patterns and security
|
||||
characteristics of the kernel pages.
|
||||
|
||||
Kernel virtual memory layout example::
|
||||
|
||||
start address end address size attributes
+---------------------------------------------------------------------------------------+
| ---[ Linear Mapping start ]---------------------------------------------------------- |
| .................. |
| 0xfff0000000000000-0xfff0000000210000 2112K PTE RW NX SHD AF UXN MEM/NORMAL-TAGGED |
| 0xfff0000000210000-0xfff0000001c00000 26560K PTE ro NX SHD AF UXN MEM/NORMAL |
| .................. |
| ---[ Linear Mapping end ]------------------------------------------------------------ |
+---------------------------------------------------------------------------------------+
| ---[ Modules start ]----------------------------------------------------------------- |
| .................. |
| 0xffff800000000000-0xffff800008000000 128M PTE |
| .................. |
| ---[ Modules end ]------------------------------------------------------------------- |
+---------------------------------------------------------------------------------------+
| ---[ vmalloc() area ]---------------------------------------------------------------- |
| .................. |
| 0xffff800008010000-0xffff800008200000 1984K PTE ro x SHD AF UXN MEM/NORMAL |
| 0xffff800008200000-0xffff800008e00000 12M PTE ro x SHD AF CON UXN MEM/NORMAL |
| .................. |
| ---[ vmalloc() end ]----------------------------------------------------------------- |
+---------------------------------------------------------------------------------------+
| ---[ Fixmap start ]------------------------------------------------------------------ |
| .................. |
| 0xfffffbfffdb80000-0xfffffbfffdb90000 64K PTE ro x SHD AF UXN MEM/NORMAL |
| 0xfffffbfffdb90000-0xfffffbfffdba0000 64K PTE ro NX SHD AF UXN MEM/NORMAL |
| .................. |
| ---[ Fixmap end ]-------------------------------------------------------------------- |
+---------------------------------------------------------------------------------------+
| ---[ PCI I/O start ]----------------------------------------------------------------- |
| .................. |
| 0xfffffbfffe800000-0xfffffbffff800000 16M PTE |
| .................. |
| ---[ PCI I/O end ]------------------------------------------------------------------- |
+---------------------------------------------------------------------------------------+
| ---[ vmemmap start ]----------------------------------------------------------------- |
| .................. |
| 0xfffffc0002000000-0xfffffc0002200000 2M PTE RW NX SHD AF UXN MEM/NORMAL |
| 0xfffffc0002200000-0xfffffc0020000000 478M PTE |
| .................. |
| ---[ vmemmap end ]------------------------------------------------------------------- |
+---------------------------------------------------------------------------------------+
|
||||
|
||||
``cat /sys/kernel/debug/kernel_page_tables`` output::
|
||||
|
||||
0xfff0000001c00000-0xfff0000080000000 2020M PTE RW NX SHD AF UXN MEM/NORMAL-TAGGED
0xfff0000080000000-0xfff0000800000000 30G PMD
0xfff0000800000000-0xfff0000800700000 7M PTE RW NX SHD AF UXN MEM/NORMAL-TAGGED
0xfff0000800700000-0xfff0000800710000 64K PTE ro NX SHD AF UXN MEM/NORMAL-TAGGED
0xfff0000800710000-0xfff0000880000000 2089920K PTE RW NX SHD AF UXN MEM/NORMAL-TAGGED
0xfff0000880000000-0xfff0040000000000 4062G PMD
0xfff0040000000000-0xffff800000000000 3964T PGD
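Output lines of this shape are easy to post-process. The sketch below parses one line into its address range, size, level and attribute list; it is a minimal example based only on the sample output shown here, so the regular expression and the attribute spellings (``RW``, ``ro``, ``x``, ``NX``) are assumptions drawn from those samples rather than a specification of the format. `parse_line` and `is_writable_and_executable` are hypothetical names.

```python
import re

# One entry: "0x<start>-0x<end> <size><unit> <level> [attributes...]"
LINE_RE = re.compile(
    r"^0x(?P<start>[0-9a-f]+)-0x(?P<end>[0-9a-f]+)\s+"
    r"(?P<size>\d+)(?P<unit>[KMGT])\s+(?P<level>\w+)\s*(?P<attrs>.*)$"
)
UNIT = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30, "T": 1 << 40}


def parse_line(line):
    """Parse one kernel_page_tables entry; return None for other lines."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None  # e.g. "---[ Modules start ]---" marker lines
    return {
        "start": int(m["start"], 16),
        "end": int(m["end"], 16),
        "size": int(m["size"]) * UNIT[m["unit"]],
        "level": m["level"],
        "attrs": m["attrs"].split(),  # empty list for unmapped ranges
    }


def is_writable_and_executable(entry):
    """Flag mappings that carry both the RW and x attributes."""
    return "RW" in entry["attrs"] and "x" in entry["attrs"]
```

Feeding every line of the file through `parse_line` and filtering with `is_writable_and_executable` is one way to look for the overly permissive mappings mentioned in the introduction.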
|
@ -140,6 +140,10 @@ stable kernels.
|
||||
+----------------+-----------------+-----------------+-----------------------------+
| ARM            | MMU-500         | #841119,826419  | N/A                         |
+----------------+-----------------+-----------------+-----------------------------+
| ARM            | MMU-600         | #1076982,1209401| N/A                         |
+----------------+-----------------+-----------------+-----------------------------+
| ARM            | MMU-700         | #2268618,2812531| N/A                         |
+----------------+-----------------+-----------------+-----------------------------+
+----------------+-----------------+-----------------+-----------------------------+
| Broadcom       | Brahma-B53      | N/A             | ARM64_ERRATUM_845719        |
+----------------+-----------------+-----------------+-----------------------------+
|
||||
@ -214,3 +218,7 @@ stable kernels.
|
||||
+----------------+-----------------+-----------------+-----------------------------+
| Fujitsu        | A64FX           | E#010001        | FUJITSU_ERRATUM_010001      |
+----------------+-----------------+-----------------+-----------------------------+

+----------------+-----------------+-----------------+-----------------------------+
| ASR            | ASR8601         | #8601001        | N/A                         |
+----------------+-----------------+-----------------+-----------------------------+
|
@ -465,4 +465,4 @@ References
|
||||
[2] arch/arm64/include/uapi/asm/ptrace.h
|
||||
AArch64 Linux ptrace ABI definitions
|
||||
|
||||
[3] Documentation/arm64/cpu-feature-registers.rst
|
||||
[3] Documentation/arch/arm64/cpu-feature-registers.rst
|
@ -606,7 +606,7 @@ References
|
||||
[2] arch/arm64/include/uapi/asm/ptrace.h
|
||||
AArch64 Linux ptrace ABI definitions
|
||||
|
||||
[3] Documentation/arm64/cpu-feature-registers.rst
|
||||
[3] Documentation/arch/arm64/cpu-feature-registers.rst
|
||||
|
||||
[4] ARM IHI0055C
|
||||
http://infocenter.arm.com/help/topic/com.arm.doc.ihi0055c/IHI0055C_beta_aapcs64.pdf
|
@ -107,7 +107,7 @@ following behaviours are guaranteed:
|
||||
|
||||
|
||||
A definition of the meaning of tagged pointers on AArch64 can be found
|
||||
in Documentation/arm64/tagged-pointers.rst.
|
||||
in Documentation/arch/arm64/tagged-pointers.rst.
|
||||
|
||||
3. AArch64 Tagged Address ABI Exceptions
|
||||
-----------------------------------------
|
Some files were not shown because too many files have changed in this diff.