Merge branch 'char-misc-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc into fpga-dfl-for-5.4

This commit is contained in:
Moritz Fischer 2019-09-03 19:35:07 -07:00
commit af9ca4b0bd
1971 changed files with 22378 additions and 14136 deletions

3
.gitignore vendored
View File

@ -142,3 +142,6 @@ x509.genkey
# Kdevelop4
*.kdev4
# Clang's compilation database file
/compile_commands.json

View File

@ -64,6 +64,9 @@ Dengcheng Zhu <dzhu@wavecomp.com> <dengcheng.zhu@imgtec.com>
Dengcheng Zhu <dzhu@wavecomp.com> <dczhu@mips.com>
Dengcheng Zhu <dzhu@wavecomp.com> <dengcheng.zhu@gmail.com>
Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Dmitry Safonov <0x7f454c46@gmail.com> <dsafonov@virtuozzo.com>
Dmitry Safonov <0x7f454c46@gmail.com> <d.safonov@partner.samsung.com>
Dmitry Safonov <0x7f454c46@gmail.com> <dima@arista.com>
Domen Puncer <domen@coderock.org>
Douglas Gilbert <dougg@torque.net>
Ed L. Cashin <ecashin@coraid.com>
@ -98,6 +101,7 @@ Jason Gunthorpe <jgg@ziepe.ca> <jgunthorpe@obsidianresearch.com>
Javi Merino <javi.merino@kernel.org> <javi.merino@arm.com>
<javier@osg.samsung.com> <javier.martinez@collabora.co.uk>
Jean Tourrilhes <jt@hpl.hp.com>
<jean-philippe@linaro.org> <jean-philippe.brucker@arm.com>
Jeff Garzik <jgarzik@pretzel.yyz.us>
Jeff Layton <jlayton@kernel.org> <jlayton@redhat.com>
Jeff Layton <jlayton@kernel.org> <jlayton@poochiereds.net>
@ -116,6 +120,7 @@ John Stultz <johnstul@us.ibm.com>
Juha Yrjola <at solidboot.com>
Juha Yrjola <juha.yrjola@nokia.com>
Juha Yrjola <juha.yrjola@solidboot.com>
Julien Thierry <julien.thierry.kdev@gmail.com> <julien.thierry@arm.com>
Kay Sievers <kay.sievers@vrfy.org>
Kenneth W Chen <kenneth.w.chen@intel.com>
Konstantin Khlebnikov <koct9i@gmail.com> <k.khlebnikov@samsung.com>
@ -132,6 +137,7 @@ Linus Lüssing <linus.luessing@c0d3.blue> <linus.luessing@ascom.ch>
Li Yang <leoyang.li@nxp.com> <leo@zh-kernel.org>
Li Yang <leoyang.li@nxp.com> <leoli@freescale.com>
Maciej W. Rozycki <macro@mips.com> <macro@imgtec.com>
Marc Zyngier <maz@kernel.org> <marc.zyngier@arm.com>
Marcin Nowakowski <marcin.nowakowski@mips.com> <marcin.nowakowski@imgtec.com>
Mark Brown <broonie@sirena.org.uk>
Mark Yao <markyao0591@gmail.com> <mark.yao@rock-chips.com>
@ -157,6 +163,8 @@ Matt Ranostay <mranostay@gmail.com> Matthew Ranostay <mranostay@embeddedalley.co
Matt Ranostay <mranostay@gmail.com> <matt.ranostay@intel.com>
Matt Ranostay <matt.ranostay@konsulko.com> <matt@ranostay.consulting>
Matt Redfearn <matt.redfearn@mips.com> <matt.redfearn@imgtec.com>
Maxime Ripard <mripard@kernel.org> <maxime.ripard@bootlin.com>
Maxime Ripard <mripard@kernel.org> <maxime.ripard@free-electrons.com>
Mayuresh Janorkar <mayur@ti.com>
Michael Buesch <m@bues.ch>
Michel Dänzer <michel@tungstengraphics.com>

View File

@ -12,7 +12,8 @@ Description: (RW) Configure MSC operating mode:
- "single", for contiguous buffer mode (high-order alloc);
- "multi", for multiblock mode;
- "ExI", for DCI handler mode;
- "debug", for debug mode.
- "debug", for debug mode;
- any of the currently loaded buffer sinks.
If operating mode changes, existing buffer is deallocated,
provided there are no active users and tracing is not enabled,
otherwise the write will fail.

View File

@ -21,3 +21,26 @@ Contact: Wu Hao <hao.wu@intel.com>
Description: Read-only. It returns Bitstream (static FPGA region) meta
data, which includes the synthesis date, seed and other
information of this static FPGA region.
What: /sys/bus/platform/devices/dfl-fme.0/cache_size
Date: August 2019
KernelVersion: 5.4
Contact: Wu Hao <hao.wu@intel.com>
Description: Read-only. It returns cache size of this FPGA device.
What: /sys/bus/platform/devices/dfl-fme.0/fabric_version
Date: August 2019
KernelVersion: 5.4
Contact: Wu Hao <hao.wu@intel.com>
Description: Read-only. It returns fabric version of this FPGA device.
Userspace applications need this information to select
best data channels per different fabric design.
What: /sys/bus/platform/devices/dfl-fme.0/socket_id
Date: August 2019
KernelVersion: 5.4
Contact: Wu Hao <hao.wu@intel.com>
Description: Read-only. It returns socket_id to indicate which socket
this FPGA belongs to, only valid for integrated solution.
User only needs this information, in case standard numa node
can't provide correct information.

View File

@ -14,3 +14,35 @@ Description: Read-only. User can program different PR bitstreams to FPGA
Accelerator Function Unit (AFU) for different functions. It
returns uuid which could be used to identify which PR bitstream
is programmed in this AFU.
What: /sys/bus/platform/devices/dfl-port.0/power_state
Date: August 2019
KernelVersion: 5.4
Contact: Wu Hao <hao.wu@intel.com>
Description: Read-only. It reports the APx (AFU Power) state, different APx
means different throttling level. When reading this file, it
returns "0" - Normal / "1" - AP1 / "2" - AP2 / "6" - AP6.
What: /sys/bus/platform/devices/dfl-port.0/ap1_event
Date: August 2019
KernelVersion: 5.4
Contact: Wu Hao <hao.wu@intel.com>
Description: Read-write. Read this file for AP1 (AFU Power State 1) event.
It's used to indicate transient AP1 state. Write 1 to this
file to clear AP1 event.
What: /sys/bus/platform/devices/dfl-port.0/ap2_event
Date: August 2019
KernelVersion: 5.4
Contact: Wu Hao <hao.wu@intel.com>
Description: Read-write. Read this file for AP2 (AFU Power State 2) event.
It's used to indicate transient AP2 state. Write 1 to this
file to clear AP2 event.
What: /sys/bus/platform/devices/dfl-port.0/ltr
Date: August 2019
KernelVersion: 5.4
Contact: Wu Hao <hao.wu@intel.com>
Description: Read-write. Read or set AFU latency tolerance reporting value.
Set ltr to 1 if the AFU can tolerate latency >= 40us or set it
to 0 if it is latency sensitive.

View File

@ -9,7 +9,7 @@ Linux PCI Bus Subsystem
:numbered:
pci
picebus-howto
pciebus-howto
pci-iov-howto
msi-howto
acpi-info

View File

@ -403,7 +403,7 @@ That is, the recovery API only requires that:
.. note::
Implementation details for the powerpc platform are discussed in
the file Documentation/powerpc/eeh-pci-error-recovery.txt
the file Documentation/powerpc/eeh-pci-error-recovery.rst
As of this writing, there is a growing list of device drivers with
patches implementing error recovery. Not all of these patches are in
@ -422,3 +422,6 @@ That is, the recovery API only requires that:
- drivers/net/cxgb3
- drivers/net/s2io.c
- drivers/net/qlge
The End
-------

View File

@ -1,7 +1,7 @@
Using hlist_nulls to protect read-mostly linked lists and
objects using SLAB_TYPESAFE_BY_RCU allocations.
Please read the basics in Documentation/RCU/listRCU.txt
Please read the basics in Documentation/RCU/listRCU.rst
Using special makers (called 'nulls') is a convenient way
to solve following problem :

View File

@ -1,10 +0,0 @@
# -*- coding: utf-8; mode: python -*-
project = 'Linux Kernel User Documentation'
tags.add("subproject")
latex_documents = [
('index', 'linux-user.tex', 'Linux Kernel User Documentation',
'The kernel development community', 'manual'),
]

View File

@ -41,10 +41,11 @@ Related CVEs
The following CVE entries describe Spectre variants:
============= ======================= =================
============= ======================= ==========================
CVE-2017-5753 Bounds check bypass Spectre variant 1
CVE-2017-5715 Branch target injection Spectre variant 2
============= ======================= =================
CVE-2019-1125 Spectre v1 swapgs Spectre variant 1 (swapgs)
============= ======================= ==========================
Problem
-------
@ -78,6 +79,13 @@ There are some extensions of Spectre variant 1 attacks for reading data
over the network, see :ref:`[12] <spec_ref12>`. However such attacks
are difficult, low bandwidth, fragile, and are considered low risk.
Note that, despite "Bounds Check Bypass" name, Spectre variant 1 is not
only about user-controlled array bounds checks. It can affect any
conditional checks. The kernel entry code interrupt, exception, and NMI
handlers all have conditional swapgs checks. Those may be problematic
in the context of Spectre v1, as kernel code can speculatively run with
a user GS.
Spectre variant 2 (Branch Target Injection)
-------------------------------------------
@ -132,6 +140,9 @@ not cover all possible attack vectors.
1. A user process attacking the kernel
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Spectre variant 1
~~~~~~~~~~~~~~~~~
The attacker passes a parameter to the kernel via a register or
via a known address in memory during a syscall. Such parameter may
be used later by the kernel as an index to an array or to derive
@ -144,7 +155,40 @@ not cover all possible attack vectors.
potentially be influenced for Spectre attacks, new "nospec" accessor
macros are used to prevent speculative loading of data.
Spectre variant 2 attacker can :ref:`poison <poison_btb>` the branch
Spectre variant 1 (swapgs)
~~~~~~~~~~~~~~~~~~~~~~~~~~
An attacker can train the branch predictor to speculatively skip the
swapgs path for an interrupt or exception. If they initialize
the GS register to a user-space value, if the swapgs is speculatively
skipped, subsequent GS-related percpu accesses in the speculation
window will be done with the attacker-controlled GS value. This
could cause privileged memory to be accessed and leaked.
For example:
::
if (coming from user space)
swapgs
mov %gs:<percpu_offset>, %reg
mov (%reg), %reg1
When coming from user space, the CPU can speculatively skip the
swapgs, and then do a speculative percpu load using the user GS
value. So the user can speculatively force a read of any kernel
value. If a gadget exists which uses the percpu value as an address
in another load/store, then the contents of the kernel value may
become visible via an L1 side channel attack.
A similar attack exists when coming from kernel space. The CPU can
speculatively do the swapgs, causing the user GS to get used for the
rest of the speculative window.
Spectre variant 2
~~~~~~~~~~~~~~~~~
A spectre variant 2 attacker can :ref:`poison <poison_btb>` the branch
target buffer (BTB) before issuing syscall to launch an attack.
After entering the kernel, the kernel could use the poisoned branch
target buffer on indirect jump and jump to gadget code in speculative
@ -280,11 +324,18 @@ The sysfs file showing Spectre variant 1 mitigation status is:
The possible values in this file are:
======================================= =================================
'Mitigation: __user pointer sanitation' Protection in kernel on a case by
case base with explicit pointer
sanitation.
======================================= =================================
.. list-table::
* - 'Not affected'
- The processor is not vulnerable.
* - 'Vulnerable: __user pointer sanitization and usercopy barriers only; no swapgs barriers'
- The swapgs protections are disabled; otherwise it has
protection in the kernel on a case by case base with explicit
pointer sanitation and usercopy LFENCE barriers.
* - 'Mitigation: usercopy/swapgs barriers and __user pointer sanitization'
- Protection in the kernel on a case by case base with explicit
pointer sanitation, usercopy LFENCE barriers, and swapgs LFENCE
barriers.
However, the protections are put in place on a case by case basis,
and there is no guarantee that all possible attack vectors for Spectre
@ -366,12 +417,27 @@ Turning on mitigation for Spectre variant 1 and Spectre variant 2
1. Kernel mitigation
^^^^^^^^^^^^^^^^^^^^
Spectre variant 1
~~~~~~~~~~~~~~~~~
For the Spectre variant 1, vulnerable kernel code (as determined
by code audit or scanning tools) is annotated on a case by case
basis to use nospec accessor macros for bounds clipping :ref:`[2]
<spec_ref2>` to avoid any usable disclosure gadgets. However, it may
not cover all attack vectors for Spectre variant 1.
Copy-from-user code has an LFENCE barrier to prevent the access_ok()
check from being mis-speculated. The barrier is done by the
barrier_nospec() macro.
For the swapgs variant of Spectre variant 1, LFENCE barriers are
added to interrupt, exception and NMI entry where needed. These
barriers are done by the FENCE_SWAPGS_KERNEL_ENTRY and
FENCE_SWAPGS_USER_ENTRY macros.
Spectre variant 2
~~~~~~~~~~~~~~~~~
For Spectre variant 2 mitigation, the compiler turns indirect calls or
jumps in the kernel into equivalent return trampolines (retpolines)
:ref:`[3] <spec_ref3>` :ref:`[9] <spec_ref9>` to go to the target
@ -473,6 +539,12 @@ Mitigation control on the kernel command line
Spectre variant 2 mitigation can be disabled or force enabled at the
kernel command line.
nospectre_v1
[X86,PPC] Disable mitigations for Spectre Variant 1
(bounds check bypass). With this option data leaks are
possible in the system.
nospectre_v2
[X86] Disable all mitigations for the Spectre variant 2

View File

@ -2545,7 +2545,7 @@
mem_encrypt=on: Activate SME
mem_encrypt=off: Do not activate SME
Refer to Documentation/virtual/kvm/amd-memory-encryption.rst
Refer to Documentation/virt/kvm/amd-memory-encryption.rst
for details on when memory encryption can be activated.
mem_sleep_default= [SUSPEND] Default system suspend mode:
@ -2604,7 +2604,7 @@
expose users to several CPU vulnerabilities.
Equivalent to: nopti [X86,PPC]
kpti=0 [ARM64]
nospectre_v1 [PPC]
nospectre_v1 [X86,PPC]
nobp=0 [S390]
nospectre_v2 [X86,PPC,S390,ARM64]
spectre_v2_user=off [X86]
@ -2965,9 +2965,9 @@
nosmt=force: Force disable SMT, cannot be undone
via the sysfs control file.
nospectre_v1 [PPC] Disable mitigations for Spectre Variant 1 (bounds
check bypass). With this option data leaks are possible
in the system.
nospectre_v1 [X86,PPC] Disable mitigations for Spectre Variant 1
(bounds check bypass). With this option data leaks are
possible in the system.
nospectre_v2 [X86,PPC_FSL_BOOK3E,ARM64] Disable all mitigations for
the Spectre variant 2 (indirect branch prediction)
@ -4090,6 +4090,13 @@
Run specified binary instead of /init from the ramdisk,
used for early userspace startup. See initrd.
rdrand= [X86]
force - Override the decision by the kernel to hide the
advertisement of RDRAND support (this affects
certain AMD processors because of buggy BIOS
support, specifically around the suspend/resume
path).
rdt= [HW,X86,RDT]
Turn on/off individual RDT features. List is:
cmt, mbmtotal, mbmlocal, l3cat, l3cdp, l2cat, l2cdp,

View File

@ -53,7 +53,7 @@ disabled, there is ``khugepaged`` daemon that scans memory and
collapses sequences of basic pages into huge pages.
The THP behaviour is controlled via :ref:`sysfs <thp_sysfs>`
interface and using madivse(2) and prctl(2) system calls.
interface and using madvise(2) and prctl(2) system calls.
Transparent Hugepage Support maximizes the usefulness of free memory
if compared to the reservation approach of hugetlbfs by allowing all

View File

@ -39,7 +39,6 @@ Table : Subdirectories in /proc/sys/net
802 E802 protocol ax25 AX25
ethernet Ethernet protocol rose X.25 PLP layer
ipv4 IP version 4 x25 X.25 protocol
ipx IPX token-ring IBM token ring
bridge Bridging decnet DEC net
ipv6 IP version 6 tipc TIPC
========= =================== = ========== ==================
@ -401,33 +400,7 @@ interface.
(network) that the route leads to, the router (may be directly connected), the
route flags, and the device the route is using.
5. IPX
------
The IPX protocol has no tunable values in proc/sys/net.
The IPX protocol does, however, provide proc/net/ipx. This lists each IPX
socket giving the local and remote addresses in Novell format (that is
network:node:port). In accordance with the strange Novell tradition,
everything but the port is in hex. Not_Connected is displayed for sockets that
are not tied to a specific remote address. The Tx and Rx queue sizes indicate
the number of bytes pending for transmission and reception. The state
indicates the state the socket is in and the uid is the owning uid of the
socket.
The /proc/net/ipx_interface file lists all IPX interfaces. For each interface
it gives the network number, the node number, and indicates if the network is
the primary network. It also indicates which device it is bound to (or
Internal for internal networks) and the Frame Type if appropriate. Linux
supports 802.3, 802.2, 802.2 SNAP and DIX (Blue Book) ethernet framing for
IPX.
The /proc/net/ipx_route table holds a list of IPX routes. For each route it
gives the destination network, the router node (or Directly) and the network
address of the router (or Connected) for internal networks.
6. TIPC
5. TIPC
-------
tipc_rmem

View File

@ -16,6 +16,8 @@ import sys
import os
import sphinx
from subprocess import check_output
# Get Sphinx version
major, minor, patch = sphinx.version_info[:3]
@ -276,10 +278,21 @@ latex_elements = {
\\setsansfont{DejaVu Sans}
\\setromanfont{DejaVu Serif}
\\setmonofont{DejaVu Sans Mono}
'''
}
# At least one book (translations) may have Asian characters
# with are only displayed if xeCJK is used
cjk_cmd = check_output(['fc-list', '--format="%{family[0]}\n"']).decode('utf-8', 'ignore')
if cjk_cmd.find("Noto Sans CJK SC") >= 0:
print ("enabling CJK for LaTeX builder")
latex_elements['preamble'] += '''
% This is needed for translations
\\usepackage{xeCJK}
\\setCJKmainfont{Noto Sans CJK SC}
'''
# Fix reference escape troubles with Sphinx 1.4.x
if major == 1 and minor > 3:
latex_elements['preamble'] += '\\renewcommand*{\\DUrole}[2]{ #2 }\n'
@ -410,6 +423,21 @@ latex_documents = [
'The kernel development community', 'manual'),
]
# Add all other index files from Documentation/ subdirectories
for fn in os.listdir('.'):
doc = os.path.join(fn, "index")
if os.path.exists(doc + ".rst"):
has = False
for l in latex_documents:
if l[0] == doc:
has = True
break
if not has:
latex_documents.append((doc, fn + '.tex',
'Linux %s Documentation' % fn.capitalize(),
'The kernel development community',
'manual'))
# The name of an image file (relative to this directory) to place at the top of
# the title page.
#latex_logo = None

View File

@ -1,10 +0,0 @@
# -*- coding: utf-8; mode: python -*-
project = "Core-API Documentation"
tags.add("subproject")
latex_documents = [
('index', 'core-api.tex', project,
'The kernel development community', 'manual'),
]

View File

@ -1,10 +0,0 @@
# -*- coding: utf-8; mode: python -*-
project = 'Linux Kernel Crypto API'
tags.add("subproject")
latex_documents = [
('index', 'crypto-api.tex', 'Linux Kernel Crypto API manual',
'The kernel development community', 'manual'),
]

View File

@ -1,10 +0,0 @@
# -*- coding: utf-8; mode: python -*-
project = "Development tools for the kernel"
tags.add("subproject")
latex_documents = [
('index', 'dev-tools.tex', project,
'The kernel development community', 'manual'),
]

View File

@ -19,7 +19,9 @@ quiet_cmd_mk_schema = SCHEMA $@
DT_DOCS = $(shell \
cd $(srctree)/$(src) && \
find * \( -name '*.yaml' ! -name $(DT_TMP_SCHEMA) \) \
find * \( -name '*.yaml' ! \
-name $(DT_TMP_SCHEMA) ! \
-name '*.example.dt.yaml' \) \
)
DT_SCHEMA_FILES ?= $(addprefix $(src)/,$(DT_DOCS))

View File

@ -136,7 +136,9 @@ Required properties:
OCOTP bindings based on SCU Message Protocol
------------------------------------------------------------
Required properties:
- compatible: Should be "fsl,imx8qxp-scu-ocotp"
- compatible: Should be one of:
"fsl,imx8qm-scu-ocotp",
"fsl,imx8qxp-scu-ocotp".
- #address-cells: Must be 1. Contains byte index
- #size-cells: Must be 1. Contains byte length

View File

@ -703,4 +703,4 @@ cpus {
https://www.devicetree.org/specifications/
[6] ARM Linux Kernel documentation - Booting AArch64 Linux
Documentation/arm64/booting.txt
Documentation/arm64/booting.rst

View File

@ -1,7 +1,7 @@
# SPDX-License-Identifier: GPL-2.0
%YAML 1.2
---
$id: http://devicetree.org/schemas/arm/shmobile.yaml#
$id: http://devicetree.org/schemas/arm/renesas.yaml#
$schema: http://devicetree.org/meta-schemas/core.yaml#
title: Renesas SH-Mobile, R-Mobile, and R-Car Platform Device Tree Bindings

View File

@ -1,7 +1,7 @@
# SPDX-License-Identifier: GPL-2.0
%YAML 1.2
---
$id: http://devicetree.org/schemas/arm/milbeaut.yaml#
$id: http://devicetree.org/schemas/arm/socionext/milbeaut.yaml#
$schema: http://devicetree.org/meta-schemas/core.yaml#
title: Milbeaut platforms device tree bindings

View File

@ -1,7 +1,7 @@
# SPDX-License-Identifier: GPL-2.0
%YAML 1.2
---
$id: http://devicetree.org/schemas/arm/ti/davinci.yaml#
$id: http://devicetree.org/schemas/arm/ti/ti,davinci.yaml#
$schema: http://devicetree.org/meta-schemas/core.yaml#
title: Texas Instruments DaVinci Platforms Device Tree Bindings

View File

@ -1,7 +1,7 @@
# SPDX-License-Identifier: GPL-2.0
%YAML 1.2
---
$id: http://devicetree.org/schemas/phy/allwinner,sun4i-a10-ccu.yaml#
$id: http://devicetree.org/schemas/clock/allwinner,sun4i-a10-ccu.yaml#
$schema: http://devicetree.org/meta-schemas/core.yaml#
title: Allwinner Clock Control Unit Device Tree Bindings

View File

@ -2,7 +2,7 @@
# Copyright 2019 Linaro Ltd.
%YAML 1.2
---
$id: "http://devicetree.org/schemas/firmware/intel-ixp4xx-network-processing-engine.yaml#"
$id: "http://devicetree.org/schemas/firmware/intel,ixp4xx-network-processing-engine.yaml#"
$schema: "http://devicetree.org/meta-schemas/core.yaml#"
title: Intel IXP4xx Network Processing Engine

View File

@ -1,7 +1,7 @@
# SPDX-License-Identifier: GPL-2.0
%YAML 1.2
---
$id: http://devicetree.org/schemas/iio/accelerometers/adi,adxl345.yaml#
$id: http://devicetree.org/schemas/iio/accel/adi,adxl345.yaml#
$schema: http://devicetree.org/meta-schemas/core.yaml#
title: Analog Devices ADXL345/ADXL375 3-Axis Digital Accelerometers

View File

@ -1,7 +1,7 @@
# SPDX-License-Identifier: GPL-2.0
%YAML 1.2
---
$id: http://devicetree.org/schemas/iio/accelerometers/adi,adxl372.yaml#
$id: http://devicetree.org/schemas/iio/accel/adi,adxl372.yaml#
$schema: http://devicetree.org/meta-schemas/core.yaml#
title: Analog Devices ADXL372 3-Axis, +/-(200g) Digital Accelerometer

View File

@ -5,21 +5,19 @@ Required properties:
- compatible: should be "amazon,al-fic"
- reg: physical base address and size of the registers
- interrupt-controller: identifies the node as an interrupt controller
- #interrupt-cells: must be 2.
First cell defines the index of the interrupt within the controller.
Second cell is used to specify the trigger type and must be one of the
following:
- bits[3:0] trigger type and level flags
1 = low-to-high edge triggered
4 = active high level-sensitive
- interrupt-parent: specifies the parent interrupt controller.
- #interrupt-cells : must be 2. Specifies the number of cells needed to encode
an interrupt source. Supported trigger types are low-to-high edge
triggered and active high level-sensitive.
- interrupts: describes which input line in the interrupt parent, this
fic's output is connected to. This field property depends on the parent's
binding
Please refer to interrupts.txt in this directory for details of the common
Interrupt Controllers bindings used by client devices.
Example:
amazon_fic: interrupt-controller@0xfd8a8500 {
amazon_fic: interrupt-controller@fd8a8500 {
compatible = "amazon,al-fic";
interrupt-controller;
#interrupt-cells = <2>;

View File

@ -2,7 +2,7 @@
# Copyright 2018 Linaro Ltd.
%YAML 1.2
---
$id: "http://devicetree.org/schemas/interrupt/intel-ixp4xx-interrupt.yaml#"
$id: "http://devicetree.org/schemas/interrupt-controller/intel,ixp4xx-interrupt.yaml#"
$schema: "http://devicetree.org/meta-schemas/core.yaml#"
title: Intel IXP4xx XScale Networking Processors Interrupt Controller

View File

@ -1,20 +1,30 @@
* ARC-HS Interrupt Distribution Unit
This optional 2nd level interrupt controller can be used in SMP configurations for
dynamic IRQ routing, load balancing of common/external IRQs towards core intc.
This optional 2nd level interrupt controller can be used in SMP configurations
for dynamic IRQ routing, load balancing of common/external IRQs towards core
intc.
Properties:
- compatible: "snps,archs-idu-intc"
- interrupt-controller: This is an interrupt controller.
- #interrupt-cells: Must be <1>.
- #interrupt-cells: Must be <1> or <2>.
Value of the cell specifies the "common" IRQ from peripheral to IDU. Number N
of the particular interrupt line of IDU corresponds to the line N+24 of the
core interrupt controller.
Value of the first cell specifies the "common" IRQ from peripheral to IDU.
Number N of the particular interrupt line of IDU corresponds to the line N+24
of the core interrupt controller.
intc accessed via the special ARC AUX register interface, hence "reg" property
is not specified.
The (optional) second cell specifies any of the following flags:
- bits[3:0] trigger type and level flags
1 = low-to-high edge triggered
2 = NOT SUPPORTED (high-to-low edge triggered)
4 = active high level-sensitive <<< DEFAULT
8 = NOT SUPPORTED (active low level-sensitive)
When no second cell is specified, the interrupt is assumed to be level
sensitive.
The interrupt controller is accessed via the special ARC AUX register
interface, hence "reg" property is not specified.
Example:
core_intc: core-interrupt-controller {

View File

@ -2,7 +2,7 @@
# Copyright 2019 Linaro Ltd.
%YAML 1.2
---
$id: "http://devicetree.org/schemas/misc/intel-ixp4xx-ahb-queue-manager.yaml#"
$id: "http://devicetree.org/schemas/misc/intel,ixp4xx-ahb-queue-manager.yaml#"
$schema: "http://devicetree.org/meta-schemas/core.yaml#"
title: Intel IXP4xx AHB Queue Manager

View File

@ -1,7 +1,7 @@
# SPDX-License-Identifier: GPL-2.0
%YAML 1.2
---
$id: http://devicetree.org/schemas/net/allwinner,sun8i-a83t-gmac.yaml#
$id: http://devicetree.org/schemas/net/allwinner,sun8i-a83t-emac.yaml#
$schema: http://devicetree.org/meta-schemas/core.yaml#
title: Allwinner A83t EMAC Device Tree Bindings

View File

@ -12,6 +12,7 @@ Required properties:
- "microchip,ksz8565"
- "microchip,ksz9893"
- "microchip,ksz9563"
- "microchip,ksz8563"
Optional properties:

View File

@ -7,18 +7,6 @@ Required properties:
- phy-mode : See ethernet.txt file in the same directory
Optional properties:
- phy-reset-gpios : Should specify the gpio for phy reset
- phy-reset-duration : Reset duration in milliseconds. Should present
only if property "phy-reset-gpios" is available. Missing the property
will have the duration be 1 millisecond. Numbers greater than 1000 are
invalid and 1 millisecond will be used instead.
- phy-reset-active-high : If present then the reset sequence using the GPIO
specified in the "phy-reset-gpios" property is reversed (H=reset state,
L=operation state).
- phy-reset-post-delay : Post reset delay in milliseconds. If present then
a delay of phy-reset-post-delay milliseconds will be observed after the
phy-reset-gpios has been toggled. Can be omitted thus no delay is
observed. Delay is in range of 1ms to 1000ms. Other delays are invalid.
- phy-supply : regulator that powers the Ethernet PHY.
- phy-handle : phandle to the PHY device connected to this device.
- fixed-link : Assume a fixed link. See fixed-link.txt in the same directory.
@ -47,11 +35,27 @@ Optional properties:
For imx6sx, "int0" handles all 3 queues and ENET_MII. "pps" is for the pulse
per second interrupt associated with 1588 precision time protocol(PTP).
Optional subnodes:
- mdio : specifies the mdio bus in the FEC, used as a container for phy nodes
according to phy.txt in the same directory
Deprecated optional properties:
To avoid these, create a phy node according to phy.txt in the same
directory, and point the fec's "phy-handle" property to it. Then use
the phy's reset binding, again described by phy.txt.
- phy-reset-gpios : Should specify the gpio for phy reset
- phy-reset-duration : Reset duration in milliseconds. Should present
only if property "phy-reset-gpios" is available. Missing the property
will have the duration be 1 millisecond. Numbers greater than 1000 are
invalid and 1 millisecond will be used instead.
- phy-reset-active-high : If present then the reset sequence using the GPIO
specified in the "phy-reset-gpios" property is reversed (H=reset state,
L=operation state).
- phy-reset-post-delay : Post reset delay in milliseconds. If present then
a delay of phy-reset-post-delay milliseconds will be observed after the
phy-reset-gpios has been toggled. Can be omitted thus no delay is
observed. Delay is in range of 1ms to 1000ms. Other delays are invalid.
Example:
ethernet@83fec000 {

View File

@ -15,10 +15,10 @@ Required properties:
Use "atmel,sama5d4-gem" for the GEM IP (10/100) available on Atmel sama5d4 SoCs.
Use "cdns,zynq-gem" Xilinx Zynq-7xxx SoC.
Use "cdns,zynqmp-gem" for Zynq Ultrascale+ MPSoC.
Use "sifive,fu540-macb" for SiFive FU540-C000 SoC.
Use "sifive,fu540-c000-gem" for SiFive FU540-C000 SoC.
Or the generic form: "cdns,emac".
- reg: Address and length of the register set for the device
For "sifive,fu540-macb", second range is required to specify the
For "sifive,fu540-c000-gem", second range is required to specify the
address and length of the registers for GEMGXL Management block.
- interrupts: Should contain macb interrupt
- phy-mode: See ethernet.txt file in the same directory.

View File

@ -37,13 +37,13 @@ required:
examples:
- |
sid@1c23800 {
efuse@1c23800 {
compatible = "allwinner,sun4i-a10-sid";
reg = <0x01c23800 0x10>;
};
- |
sid@1c23800 {
efuse@1c23800 {
compatible = "allwinner,sun7i-a20-sid";
reg = <0x01c23800 0x200>;
};

View File

@ -2,7 +2,7 @@ Freescale i.MX6 On-Chip OTP Controller (OCOTP) device tree bindings
This binding represents the on-chip eFuse OTP controller found on
i.MX6Q/D, i.MX6DL/S, i.MX6SL, i.MX6SX, i.MX6UL, i.MX6ULL/ULZ, i.MX6SLL,
i.MX7D/S, i.MX7ULP and i.MX8MQ SoCs.
i.MX7D/S, i.MX7ULP, i.MX8MQ, i.MX8MM and i.MX8MN SoCs.
Required properties:
- compatible: should be one of
@ -16,6 +16,7 @@ Required properties:
"fsl,imx7ulp-ocotp" (i.MX7ULP),
"fsl,imx8mq-ocotp" (i.MX8MQ),
"fsl,imx8mm-ocotp" (i.MX8MM),
"fsl,imx8mn-ocotp" (i.MX8MN),
followed by "syscon".
- #address-cells : Should be 1
- #size-cells : Should be 1

View File

@ -0,0 +1,45 @@
# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
%YAML 1.2
---
$id: http://devicetree.org/schemas/nvmem/nvmem-consumer.yaml#
$schema: http://devicetree.org/meta-schemas/core.yaml#
title: NVMEM (Non Volatile Memory) Consumer Device Tree Bindings
maintainers:
- Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
select: true
properties:
nvmem:
$ref: /schemas/types.yaml#/definitions/phandle-array
description:
List of phandle to the nvmem providers.
nvmem-cells:
$ref: /schemas/types.yaml#/definitions/phandle-array
description:
List of phandle to the nvmem data cells.
nvmem-names:
$ref: /schemas/types.yaml#/definitions/string-array
description:
Names for the each nvmem provider.
nvmem-cell-names:
$ref: /schemas/types.yaml#/definitions/string-array
description:
Names for each nvmem-cells specified.
dependencies:
nvmem-names: [ nvmem ]
nvmem-cell-names: [ nvmem-cells ]
examples:
- |
tsens {
/* ... */
nvmem-cells = <&tsens_calibration>;
nvmem-cell-names = "calibration";
};

View File

@ -1,80 +1 @@
= NVMEM(Non Volatile Memory) Data Device Tree Bindings =
This binding is intended to represent the location of hardware
configuration data stored in NVMEMs like eeprom, efuses and so on.
On a significant proportion of boards, the manufacturer has stored
some data on NVMEM, for the OS to be able to retrieve these information
and act upon it. Obviously, the OS has to know about where to retrieve
these data from, and where they are stored on the storage device.
This document is here to document this.
= Data providers =
Contains bindings specific to provider drivers and data cells as children
of this node.
Optional properties:
read-only: Mark the provider as read only.
= Data cells =
These are the child nodes of the provider which contain data cell
information like offset and size in nvmem provider.
Required properties:
reg: specifies the offset in byte within the storage device.
Optional properties:
bits: Is pair of bit location and number of bits, which specifies offset
in bit and number of bits within the address range specified by reg property.
Offset takes values from 0-7.
For example:
/* Provider */
qfprom: qfprom@700000 {
...
/* Data cells */
tsens_calibration: calib@404 {
reg = <0x404 0x10>;
};
tsens_calibration_bckp: calib_bckp@504 {
reg = <0x504 0x11>;
bits = <6 128>
};
pvs_version: pvs-version@6 {
reg = <0x6 0x2>
bits = <7 2>
};
speed_bin: speed-bin@c{
reg = <0xc 0x1>;
bits = <2 3>;
};
...
};
= Data consumers =
Are device nodes which consume nvmem data cells/providers.
Required-properties:
nvmem-cells: list of phandle to the nvmem data cells.
nvmem-cell-names: names for the each nvmem-cells specified. Required if
nvmem-cells is used.
Optional-properties:
nvmem : list of phandles to nvmem providers.
nvmem-names: names for the each nvmem provider. required if nvmem is used.
For example:
tsens {
...
nvmem-cells = <&tsens_calibration>;
nvmem-cell-names = "calibration";
};
This file has been moved to nvmem.yaml and nvmem-consumer.yaml.

View File

@ -0,0 +1,93 @@
# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
%YAML 1.2
---
$id: http://devicetree.org/schemas/nvmem/nvmem.yaml#
$schema: http://devicetree.org/meta-schemas/core.yaml#
title: NVMEM (Non Volatile Memory) Device Tree Bindings
maintainers:
- Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
description: |
This binding is intended to represent the location of hardware
configuration data stored in NVMEMs like eeprom, efuses and so on.
On a significant proportion of boards, the manufacturer has stored
some data on NVMEM, for the OS to be able to retrieve these
information and act upon it. Obviously, the OS has to know about
where to retrieve these data from, and where they are stored on the
storage device.
properties:
$nodename:
pattern: "^(eeprom|efuse|nvram)(@.*|-[0-9a-f])*$"
"#address-cells":
const: 1
"#size-cells":
const: 1
read-only:
$ref: /schemas/types.yaml#/definitions/flag
description:
Mark the provider as read only.
patternProperties:
"^.*@[0-9a-f]+$":
type: object
properties:
reg:
maxItems: 1
description:
Offset and size in bytes within the storage device.
bits:
maxItems: 1
items:
items:
- minimum: 0
maximum: 7
description:
Offset in bit within the address range specified by reg.
- minimum: 1
description:
Size in bit within the address range specified by reg.
required:
- reg
additionalProperties: false
examples:
- |
qfprom: eeprom@700000 {
#address-cells = <1>;
#size-cells = <1>;
/* ... */
/* Data cells */
tsens_calibration: calib@404 {
reg = <0x404 0x10>;
};
tsens_calibration_bckp: calib_bckp@504 {
reg = <0x504 0x11>;
bits = <6 128>;
};
pvs_version: pvs-version@6 {
reg = <0x6 0x2>;
bits = <7 2>;
};
speed_bin: speed-bin@c{
reg = <0xc 0x1>;
bits = <2 3>;
};
};
...

View File

@ -1,7 +1,7 @@
# SPDX-License-Identifier: GPL-2.0
%YAML 1.2
---
$id: http://devicetree.org/schemas/display/allwinner,sun6i-a31-mipi-dphy.yaml#
$id: http://devicetree.org/schemas/phy/allwinner,sun6i-a31-mipi-dphy.yaml#
$schema: http://devicetree.org/meta-schemas/core.yaml#
title: Allwinner A31 MIPI D-PHY Controller Device Tree Bindings

View File

@ -37,7 +37,8 @@ properties:
hwlocks: true
st,syscfg:
$ref: "/schemas/types.yaml#/definitions/phandle-array"
allOf:
- $ref: "/schemas/types.yaml#/definitions/phandle-array"
description: Should be phandle/offset/mask
items:
- description: Phandle to the syscon node which includes IRQ mux selection.

View File

@ -1,162 +0,0 @@
===================
RISC-V CPU Bindings
===================
The device tree allows to describe the layout of CPUs in a system through
the "cpus" node, which in turn contains a number of subnodes (ie "cpu")
defining properties for every cpu.
Bindings for CPU nodes follow the Devicetree Specification, available from:
https://www.devicetree.org/specifications/
with updates for 32-bit and 64-bit RISC-V systems provided in this document.
===========
Terminology
===========
This document uses some terminology common to the RISC-V community that is not
widely used, the definitions of which are listed here:
* hart: A hardware execution context, which contains all the state mandated by
the RISC-V ISA: a PC and some registers. This terminology is designed to
disambiguate software's view of execution contexts from any particular
microarchitectural implementation strategy. For example, my Intel laptop is
described as having one socket with two cores, each of which has two hyper
threads. Therefore this system has four harts.
=====================================
cpus and cpu node bindings definition
=====================================
The RISC-V architecture, in accordance with the Devicetree Specification,
requires the cpus and cpu nodes to be present and contain the properties
described below.
- cpus node
Description: Container of cpu nodes
The node name must be "cpus".
A cpus node must define the following properties:
- #address-cells
Usage: required
Value type: <u32>
Definition: must be set to 1
- #size-cells
Usage: required
Value type: <u32>
Definition: must be set to 0
- cpu node
Description: Describes a hart context
PROPERTIES
- device_type
Usage: required
Value type: <string>
Definition: must be "cpu"
- reg
Usage: required
Value type: <u32>
Definition: The hart ID of this CPU node
- compatible:
Usage: required
Value type: <stringlist>
Definition: must contain "riscv", may contain one of
"sifive,rocket0"
- mmu-type:
Usage: optional
Value type: <string>
Definition: Specifies the CPU's MMU type. Possible values are
"riscv,sv32"
"riscv,sv39"
"riscv,sv48"
- riscv,isa:
Usage: required
Value type: <string>
Definition: Contains the RISC-V ISA string of this hart. These
ISA strings are defined by the RISC-V ISA manual.
Example: SiFive Freedom U540G Development Kit
---------------------------------------------
This system contains two harts: a hart marked as disabled that's used for
low-level system tasks and should be ignored by Linux, and a second hart that
Linux is allowed to run on.
cpus {
#address-cells = <1>;
#size-cells = <0>;
timebase-frequency = <1000000>;
cpu@0 {
clock-frequency = <1600000000>;
compatible = "sifive,rocket0", "riscv";
device_type = "cpu";
i-cache-block-size = <64>;
i-cache-sets = <128>;
i-cache-size = <16384>;
next-level-cache = <&L15 &L0>;
reg = <0>;
riscv,isa = "rv64imac";
status = "disabled";
L10: interrupt-controller {
#interrupt-cells = <1>;
compatible = "riscv,cpu-intc";
interrupt-controller;
};
};
cpu@1 {
clock-frequency = <1600000000>;
compatible = "sifive,rocket0", "riscv";
d-cache-block-size = <64>;
d-cache-sets = <64>;
d-cache-size = <32768>;
d-tlb-sets = <1>;
d-tlb-size = <32>;
device_type = "cpu";
i-cache-block-size = <64>;
i-cache-sets = <64>;
i-cache-size = <32768>;
i-tlb-sets = <1>;
i-tlb-size = <32>;
mmu-type = "riscv,sv39";
next-level-cache = <&L15 &L0>;
reg = <1>;
riscv,isa = "rv64imafdc";
status = "okay";
tlb-split;
L13: interrupt-controller {
#interrupt-cells = <1>;
compatible = "riscv,cpu-intc";
interrupt-controller;
};
};
};
Example: Spike ISA Simulator with 1 Hart
----------------------------------------
This device tree matches the Spike ISA golden model as run with `spike -p1`.
cpus {
cpu@0 {
device_type = "cpu";
reg = <0x00000000>;
status = "okay";
compatible = "riscv";
riscv,isa = "rv64imafdc";
mmu-type = "riscv,sv48";
clock-frequency = <0x3b9aca00>;
interrupt-controller {
#interrupt-cells = <0x00000001>;
interrupt-controller;
compatible = "riscv,cpu-intc";
}
}
}

View File

@ -10,6 +10,18 @@ maintainers:
- Paul Walmsley <paul.walmsley@sifive.com>
- Palmer Dabbelt <palmer@sifive.com>
description: |
This document uses some terminology common to the RISC-V community
that is not widely used, the definitions of which are listed here:
hart: A hardware execution context, which contains all the state
mandated by the RISC-V ISA: a PC and some registers. This
terminology is designed to disambiguate software's view of execution
contexts from any particular microarchitectural implementation
strategy. For example, an Intel laptop containing one socket with
two cores, each of which has two hyperthreads, could be described as
having four harts.
properties:
compatible:
items:
@ -50,6 +62,10 @@ properties:
User-Level ISA document, available from
https://riscv.org/specifications/
While the isa strings in ISA specification are case
insensitive, letters in the riscv,isa string must be all
lowercase to simplify parsing.
timebase-frequency:
type: integer
minimum: 1

View File

@ -19,7 +19,7 @@ properties:
compatible:
items:
- enum:
- sifive,freedom-unleashed-a00
- sifive,hifive-unleashed-a00
- const: sifive,fu540-c000
- const: sifive,fu540
...

View File

@ -73,7 +73,6 @@ patternProperties:
Compatible of the SPI device.
reg:
maxItems: 1
minimum: 0
maximum: 256
description:

View File

@ -2,7 +2,7 @@
# Copyright 2018 Linaro Ltd.
%YAML 1.2
---
$id: "http://devicetree.org/schemas/timer/intel-ixp4xx-timer.yaml#"
$id: "http://devicetree.org/schemas/timer/intel,ixp4xx-timer.yaml#"
$schema: "http://devicetree.org/meta-schemas/core.yaml#"
title: Intel IXP4xx XScale Networking Processors Timers

View File

@ -64,10 +64,8 @@ Optional properties :
- power-on-time-ms : Specifies the time it takes from the time the host
initiates the power-on sequence to a port until the port has adequate
power. The value is given in ms in a 0 - 510 range (default is 100ms).
- swap-dx-lanes : Specifies the downstream ports which will swap the
differential-pair (D+/D-), default is not-swapped.
- swap-us-lanes : Selects the upstream port differential-pair (D+/D-)
swapping (boolean, default is not-swapped)
- swap-dx-lanes : Specifies the ports which will swap the differential-pair
(D+/D-), default is not-swapped.
Examples:
usb2512b@2c {

View File

@ -1,10 +0,0 @@
# -*- coding: utf-8; mode: python -*-
project = 'Linux Kernel Documentation Guide'
tags.add("subproject")
latex_documents = [
('index', 'kernel-doc-guide.tex', 'Linux Kernel Documentation Guide',
'The kernel development community', 'manual'),
]

View File

@ -1,10 +0,0 @@
# -*- coding: utf-8; mode: python -*-
project = "Linux 802.11 Driver Developer's Guide"
tags.add("subproject")
latex_documents = [
('index', '80211.tex', project,
'The kernel development community', 'manual'),
]

View File

@ -1,10 +0,0 @@
# -*- coding: utf-8; mode: python -*-
project = "The Linux driver implementer's API guide"
tags.add("subproject")
latex_documents = [
('index', 'driver-api.tex', project,
'The kernel development community', 'manual'),
]

View File

@ -233,7 +233,7 @@ Userspace Interface
Several sysfs attributes are generated by the Generic Counter interface,
and reside under the /sys/bus/counter/devices/counterX directory, where
counterX refers to the respective counter device. Please see
Documentation/ABI/testing/sys-bus-counter-generic-sysfs for detailed
Documentation/ABI/testing/sysfs-bus-counter for detailed
information on each Generic Counter interface sysfs attribute.
Through these sysfs attributes, programs and scripts may interact with
@ -325,7 +325,7 @@ sysfs attributes, where Y is the unique ID of the respective Count:
For a more detailed breakdown of the available Generic Counter interface
sysfs attributes, please refer to the
Documentation/ABI/testing/sys-bus-counter file.
Documentation/ABI/testing/sysfs-bus-counter file.
The Signals and Counts associated with the Counter device are registered
to the system as well by the counter_register function. The

View File

@ -179,8 +179,8 @@ PHY Mappings
In order to get reference to a PHY without help from DeviceTree, the framework
offers lookups which can be compared to clkdev that allow clk structures to be
bound to devices. A lookup can be made be made during runtime when a handle to
the struct phy already exists.
bound to devices. A lookup can be made during runtime when a handle to the
struct phy already exists.
The framework offers the following API for registering and unregistering the
lookups::

View File

@ -1,10 +0,0 @@
# -*- coding: utf-8; mode: python -*-
project = "Device Power Management"
tags.add("subproject")
latex_documents = [
('index', 'pm.tex', project,
'The kernel development community', 'manual'),
]

View File

@ -13,7 +13,8 @@ a) SMB3 (and SMB3.1.1) missing optional features:
- T10 copy offload ie "ODX" (copy chunk, and "Duplicate Extents" ioctl
currently the only two server side copy mechanisms supported)
b) improved sparse file support
b) improved sparse file support (fiemap and SEEK_HOLE are implemented
but additional features would be supportable by the protocol).
c) Directory entry caching relies on a 1 second timer, rather than
using Directory Leases, currently only the root file handle is cached longer
@ -21,9 +22,13 @@ using Directory Leases, currently only the root file handle is cached longer
d) quota support (needs minor kernel change since quota calls
to make it to network filesystems or deviceless filesystems)
e) Additional use cases where we use "compoounding" (e.g. open/query/close
and open/setinfo/close) to reduce the number of roundtrips, and also
open to reduce redundant opens (using deferred close and reference counts more).
e) Additional use cases can be optimized to use "compounding"
(e.g. open/query/close and open/setinfo/close) to reduce the number
of roundtrips to the server and improve performance. Various cases
(stat, statfs, create, unlink, mkdir) already have been improved by
using compounding but more can be done. In addition we could significantly
reduce redundant opens by using deferred close (with handle caching leases)
and better using reference counters on file handles.
f) Finish inotify support so kde and gnome file list windows
will autorefresh (partially complete by Asser). Needs minor kernel
@ -43,18 +48,17 @@ mount or a per server basis to client UIDs or nobody if no mapping
exists. Also better integration with winbind for resolving SID owners
k) Add tools to take advantage of more smb3 specific ioctls and features
(passthrough ioctl/fsctl for sending various SMB3 fsctls to the server
is in progress, and a passthrough query_info call is already implemented
in cifs.ko to allow smb3 info levels queries to be sent from userspace)
(passthrough ioctl/fsctl is now implemented in cifs.ko to allow sending
various SMB3 fsctls and query info and set info calls directly from user space)
Add tools to make setting various non-POSIX metadata attributes easier
from tools (e.g. extending what was done in smb-info tool).
l) encrypted file support
m) improved stats gathering tools (perhaps integration with nfsometer?)
to extend and make easier to use what is currently in /proc/fs/cifs/Stats
n) allow setting more NTFS/SMB3 file attributes remotely (currently limited to compressed
file attribute via chflags) and improve user space tools for managing and
viewing them.
n) Add support for claims based ACLs ("DAC")
o) mount helper GUI (to simplify the various configuration options on mount)
@ -82,6 +86,8 @@ so far).
w) Add support for additional strong encryption types, and additional spnego
authentication mechanisms (see MS-SMB2)
x) Finish support for SMB3.1.1 compression
KNOWN BUGS
====================================
See http://bugzilla.samba.org - search on product "CifsVFS" for

View File

@ -1,10 +0,0 @@
# -*- coding: utf-8; mode: python -*-
project = "Linux Filesystems API"
tags.add("subproject")
latex_documents = [
('index', 'filesystems.tex', project,
'The kernel development community', 'manual'),
]

View File

@ -1,10 +0,0 @@
# -*- coding: utf-8; mode: python -*-
project = "Linux GPU Driver Developer's Guide"
tags.add("subproject")
latex_documents = [
('index', 'gpu.tex', project,
'The kernel development community', 'manual'),
]

View File

@ -9,7 +9,7 @@ Supported chips:
Addresses scanned: PCI space
Datasheet: http://support.amd.com/us/Processor_TechDocs/32559.pdf
Datasheet: http://www.amd.com/system/files/TechDocs/32559.pdf
Author: Rudolf Marek

View File

@ -111,9 +111,11 @@ needed).
netlabel/index
networking/index
pcmcia/index
power/index
target/index
timers/index
watchdog/index
virtual/index
input/index
hwmon/index
gpu/index
@ -143,6 +145,7 @@ implementation.
arm64/index
ia64/index
m68k/index
powerpc/index
riscv/index
s390/index
sh/index

View File

@ -1,10 +0,0 @@
# -*- coding: utf-8; mode: python -*-
project = "The Linux input driver subsystem"
tags.add("subproject")
latex_documents = [
('index', 'linux-input.tex', project,
'The kernel development community', 'manual'),
]

View File

@ -1,10 +0,0 @@
# -*- coding: utf-8; mode: python -*-
project = "Kernel Hacking Guides"
tags.add("subproject")
latex_documents = [
('index', 'kernel-hacking.tex', project,
'The kernel development community', 'manual'),
]

View File

@ -82,7 +82,7 @@ itself. The read lock allows many concurrent readers. Anything that
**changes** the list will have to get the write lock.
NOTE! RCU is better for list traversal, but requires careful
attention to design detail (see Documentation/RCU/listRCU.txt).
attention to design detail (see Documentation/RCU/listRCU.rst).
Also, you cannot "upgrade" a read-lock to a write-lock, so if you at _any_
time need to do any changes (even if you don't do it every time), you have
@ -90,7 +90,7 @@ to get the write-lock at the very beginning.
NOTE! We are working hard to remove reader-writer spinlocks in most
cases, so please don't add a new one without consensus. (Instead, see
Documentation/RCU/rcu.txt for complete information.)
Documentation/RCU/rcu.rst for complete information.)
----

View File

@ -1,10 +0,0 @@
# -*- coding: utf-8; mode: python -*-
project = 'Linux Kernel Development Documentation'
tags.add("subproject")
latex_documents = [
('index', 'maintainer.tex', 'Linux Kernel Development Documentation',
'The kernel development community', 'manual'),
]

View File

@ -1,12 +0,0 @@
# -*- coding: utf-8; mode: python -*-
# SPDX-License-Identifier: GPL-2.0
project = 'Linux Media Subsystem Documentation'
tags.add("subproject")
latex_documents = [
('index', 'media.tex', 'Linux Media Subsystem Documentation',
'The kernel development community', 'manual'),
]

View File

@ -548,7 +548,7 @@ There are certain things that the Linux kernel memory barriers do not guarantee:
[*] For information on bus mastering DMA and coherency please read:
Documentation/PCI/pci.rst
Documentation/driver-api/pci/pci.rst
Documentation/DMA-API-HOWTO.txt
Documentation/DMA-API.txt

View File

@ -20,3 +20,4 @@ fit into other categories.
isl29003
lis3lv02d
max6875
xilinx_sdfec

View File

@ -1,10 +0,0 @@
# -*- coding: utf-8; mode: python -*-
project = "Linux Networking Documentation"
tags.add("subproject")
latex_documents = [
('index', 'networking.tex', project,
'The kernel development community', 'manual'),
]

View File

@ -424,13 +424,24 @@ Statistics
Following minimum set of TLS-related statistics should be reported
by the driver:
* ``rx_tls_decrypted`` - number of successfully decrypted TLS segments
* ``tx_tls_encrypted`` - number of in-order TLS segments passed to device
for encryption
* ``rx_tls_decrypted_packets`` - number of successfully decrypted RX packets
which were part of a TLS stream.
* ``rx_tls_decrypted_bytes`` - number of TLS payload bytes in RX packets
which were successfully decrypted.
* ``tx_tls_encrypted_packets`` - number of TX packets passed to the device
for encryption of their TLS payload.
* ``tx_tls_encrypted_bytes`` - number of TLS payload bytes in TX packets
passed to the device for encryption.
* ``tx_tls_ctx`` - number of TLS TX HW offload contexts added to device for
encryption.
* ``tx_tls_ooo`` - number of TX packets which were part of a TLS stream
but did not arrive in the expected order
* ``tx_tls_drop_no_sync_data`` - number of TX packets dropped because
they arrived out of order and associated record could not be found
but did not arrive in the expected order.
* ``tx_tls_drop_no_sync_data`` - number of TX packets which were part of
a TLS stream dropped, because they arrived out of order and associated
record could not be found.
* ``tx_tls_drop_bypass_req`` - number of TX packets which were part of a TLS
stream dropped, because they contain both data that has been encrypted by
software and data that expects hardware crypto offload.
Notable corner cases, exceptions and additional requirements
============================================================
@ -495,21 +506,3 @@ Drivers should ignore the changes to TLS the device feature flags.
These flags will be acted upon accordingly by the core ``ktls`` code.
TLS device feature flags only control adding of new TLS connection
offloads, old connections will remain active after flags are cleared.
Known bugs
==========
skb_orphan() leaks clear text
-----------------------------
Currently drivers depend on the :c:member:`sk` member of
:c:type:`struct sk_buff <sk_buff>` to identify segments requiring
encryption. Any operation which removes or does not preserve the socket
association such as :c:func:`skb_orphan` or :c:func:`skb_clone`
will cause the driver to miss the packets and lead to clear text leaks.
Redirects leak clear text
-------------------------
In the RX direction, if segment has already been decrypted by the device
and it gets redirected or mirrored - clear text will be transmitted out.

View File

@ -204,8 +204,8 @@ Ethernet device, which instead of receiving packets from a physical
media, receives them from user space program and instead of sending
packets via physical media sends them to the user space program.
Let's say that you configured IPX on the tap0, then whenever
the kernel sends an IPX packet to tap0, it is passed to the application
Let's say that you configured IPv6 on the tap0, then whenever
the kernel sends an IPv6 packet to tap0, it is passed to the application
(VTun for example). The application encrypts, compresses and sends it to
the other side over TCP or UDP. The application on the other side decompresses
and decrypts the data received and writes the packet to the TAP device,

View File

@ -1,4 +1,4 @@
:orphan:
.. SPDX-License-Identifier: GPL-2.0
================
Power Management

View File

@ -1,5 +1,7 @@
========================
The PowerPC boot wrapper
------------------------
========================
Copyright (C) Secret Lab Technologies Ltd.
PowerPC image targets compresses and wraps the kernel image (vmlinux) with
@ -21,6 +23,7 @@ it uses the wrapper script (arch/powerpc/boot/wrapper) to generate target
image. The details of the build system is discussed in the next section.
Currently, the following image format targets exist:
==================== ========================================================
cuImage.%: Backwards compatible uImage for older version of
U-Boot (for versions that don't understand the device
tree). This image embeds a device tree blob inside
@ -29,31 +32,36 @@ Currently, the following image format targets exist:
with boot wrapper code that extracts data from the old
bd_info structure and loads the data into the device
tree before jumping into the kernel.
Because of the series of #ifdefs found in the
Because of the series of #ifdefs found in the
bd_info structure used in the old U-Boot interfaces,
cuImages are platform specific. Each specific
U-Boot platform has a different platform init file
which populates the embedded device tree with data
from the platform specific bd_info file. The platform
specific cuImage platform init code can be found in
arch/powerpc/boot/cuboot.*.c. Selection of the correct
`arch/powerpc/boot/cuboot.*.c`. Selection of the correct
cuImage init code for a specific board can be found in
the wrapper structure.
dtbImage.%: Similar to zImage, except device tree blob is embedded
inside the image instead of provided by firmware. The
output image file can be either an elf file or a flat
binary depending on the platform.
dtbImages are used on systems which do not have an
dtbImages are used on systems which do not have an
interface for passing a device tree directly.
dtbImages are similar to simpleImages except that
dtbImages have platform specific code for extracting
data from the board firmware, but simpleImages do not
talk to the firmware at all.
PlayStation 3 support uses dtbImage. So do Embedded
PlayStation 3 support uses dtbImage. So do Embedded
Planet boards using the PlanetCore firmware. Board
specific initialization code is typically found in a
file named arch/powerpc/boot/<platform>.c; but this
can be overridden by the wrapper script.
simpleImage.%: Firmware independent compressed image that does not
depend on any particular firmware interface and embeds
a device tree blob. This image is a flat binary that
@ -61,14 +69,16 @@ Currently, the following image format targets exist:
Firmware cannot pass any configuration data to the
kernel with this image type and it depends entirely on
the embedded device tree for all information.
The simpleImage is useful for booting systems with
The simpleImage is useful for booting systems with
an unknown firmware interface or for booting from
a debugger when no firmware is present (such as on
the Xilinx Virtex platform). The only assumption that
simpleImage makes is that RAM is correctly initialized
and that the MMU is either off or has RAM mapped to
base address 0.
simpleImage also supports inserting special platform
simpleImage also supports inserting special platform
specific initialization code to the start of the bootup
sequence. The virtex405 platform uses this feature to
ensure that the cache is invalidated before caching
@ -81,9 +91,11 @@ Currently, the following image format targets exist:
named (virtex405-<board>.dts). Search the wrapper
script for 'virtex405' and see the file
arch/powerpc/boot/virtex405-head.S for details.
treeImage.%; Image format for used with OpenBIOS firmware found
on some ppc4xx hardware. This image embeds a device
tree blob inside the image.
uImage: Native image format used by U-Boot. The uImage target
does not add any boot code. It just wraps a compressed
vmlinux in the uImage data structure. This image
@ -91,12 +103,14 @@ Currently, the following image format targets exist:
a device tree to the kernel at boot. If using an older
version of U-Boot, then you need to use a cuImage
instead.
zImage.%: Image format which does not embed a device tree.
Used by OpenFirmware and other firmware interfaces
which are able to supply a device tree. This image
expects firmware to provide the device tree at boot.
Typically, if you have general purpose PowerPC
hardware then you want this image format.
==================== ========================================================
Image types which embed a device tree blob (simpleImage, dtbImage, treeImage,
and cuImage) all generate the device tree blob from a file in the

View File

@ -1,3 +1,4 @@
============
CPU Families
============
@ -8,8 +9,8 @@ and are supported by arch/powerpc.
Book3S (aka sPAPR)
------------------
- Hash MMU
- Mix of 32 & 64 bit
- Hash MMU
- Mix of 32 & 64 bit::
+--------------+ +----------------+
| Old POWER | --------------> | RS64 (threads) |
@ -108,8 +109,8 @@ Book3S (aka sPAPR)
IBM BookE
---------
- Software loaded TLB.
- All 32 bit
- Software loaded TLB.
- All 32 bit::
+--------------+
| 401 |
@ -155,8 +156,8 @@ IBM BookE
Motorola/Freescale 8xx
----------------------
- Software loaded with hardware assist.
- All 32 bit
- Software loaded with hardware assist.
- All 32 bit::
+-------------+
| MPC8xx Core |
@ -166,9 +167,9 @@ Motorola/Freescale 8xx
Freescale BookE
---------------
- Software loaded TLB.
- e6500 adds HW loaded indirect TLB entries.
- Mix of 32 & 64 bit
- Software loaded TLB.
- e6500 adds HW loaded indirect TLB entries.
- Mix of 32 & 64 bit::
+--------------+
| e200 |
@ -207,8 +208,8 @@ Freescale BookE
IBM A2 core
-----------
- Book3E, software loaded TLB + HW loaded indirect TLB entries.
- 64 bit
- Book3E, software loaded TLB + HW loaded indirect TLB entries.
- 64 bit::
+--------------+ +----------------+
| A2 core | --> | WSP |

View File

@ -1,3 +1,7 @@
============
CPU Features
============
Hollis Blanchard <hollis@austin.ibm.com>
5 Jun 2002
@ -32,7 +36,7 @@ anyways).
After detecting the processor type, the kernel patches out sections of code
that shouldn't be used by writing nop's over it. Using cpufeatures requires
just 2 macros (found in arch/powerpc/include/asm/cputable.h), as seen in head.S
transfer_to_handler:
transfer_to_handler::
#ifdef CONFIG_ALTIVEC
BEGIN_FTR_SECTION

View File

@ -1,3 +1,4 @@
====================================
Coherent Accelerator Interface (CXL)
====================================
@ -21,6 +22,8 @@ Introduction
Hardware overview
=================
::
POWER8/9 FPGA
+----------+ +---------+
| | | |
@ -59,14 +62,16 @@ Hardware overview
the fault. The context to which this fault is serviced is based on
who owns that acceleration function.
POWER8 <-----> PSL Version 8 is compliant to the CAIA Version 1.0.
POWER9 <-----> PSL Version 9 is compliant to the CAIA Version 2.0.
- POWER8 and PSL Version 8 are compliant to the CAIA Version 1.0.
- POWER9 and PSL Version 9 are compliant to the CAIA Version 2.0.
This PSL Version 9 provides new features such as:
* Interaction with the nest MMU on the P9 chip.
* Native DMA support.
* Supports sending ASB_Notify messages for host thread wakeup.
* Supports Atomic operations.
* ....
* etc.
Cards with a PSL9 won't work on a POWER8 system and cards with a
PSL8 won't work on a POWER9 system.
@ -147,7 +152,9 @@ User API
master devices.
A userspace library libcxl is available here:
https://github.com/ibm-capi/libcxl
This provides a C interface to this kernel API.
open
@ -165,7 +172,8 @@ open
When all available contexts are allocated the open call will fail
and return -ENOSPC.
Note: IRQs need to be allocated for each context, which may limit
Note:
IRQs need to be allocated for each context, which may limit
the number of contexts that can be created, and therefore
how many times the device can be opened. The POWER8 CAPP
supports 2040 IRQs and 3 are used by the kernel, so 2037 are
@ -186,7 +194,9 @@ ioctl
updated as userspace allocates and frees memory. This ioctl
returns once the AFU context is started.
Takes a pointer to a struct cxl_ioctl_start_work:
Takes a pointer to a struct cxl_ioctl_start_work
::
struct cxl_ioctl_start_work {
__u64 flags;
@ -269,7 +279,7 @@ read
The buffer passed to read() must be at least 4K bytes.
The result of the read will be a buffer of one or more events,
each event is of type struct cxl_event, of varying size.
each event is of type struct cxl_event, of varying size::
struct cxl_event {
struct cxl_event_header header;
@ -280,7 +290,9 @@ read
};
};
The struct cxl_event_header is defined as:
The struct cxl_event_header is defined as
::
struct cxl_event_header {
__u16 type;
@ -307,7 +319,9 @@ read
For future extensions and padding.
If the event type is CXL_EVENT_AFU_INTERRUPT then the event
structure is defined as:
structure is defined as
::
struct cxl_event_afu_interrupt {
__u16 flags;
@ -326,7 +340,9 @@ read
For future extensions and padding.
If the event type is CXL_EVENT_DATA_STORAGE then the event
structure is defined as:
structure is defined as
::
struct cxl_event_data_storage {
__u16 flags;
@ -356,7 +372,9 @@ read
For future extensions
If the event type is CXL_EVENT_AFU_ERROR then the event structure
is defined as:
is defined as
::
struct cxl_event_afu_error {
__u16 flags;
@ -393,15 +411,15 @@ open
ioctl
-----
CXL_IOCTL_DOWNLOAD_IMAGE:
CXL_IOCTL_VALIDATE_IMAGE:
CXL_IOCTL_DOWNLOAD_IMAGE / CXL_IOCTL_VALIDATE_IMAGE:
Starts and controls flashing a new FPGA image. Partial
reconfiguration is not supported (yet), so the image must contain
a copy of the PSL and AFU(s). Since an image can be quite large,
the caller may have to iterate, splitting the image in smaller
chunks.
Takes a pointer to a struct cxl_adapter_image:
Takes a pointer to a struct cxl_adapter_image::
struct cxl_adapter_image {
__u64 flags;
__u64 data;
@ -442,7 +460,7 @@ Udev rules
The following udev rules could be used to create a symlink to the
most logical chardev to use in any programming mode (afuX.Yd for
dedicated, afuX.Ys for afu directed), since the API is virtually
identical for each:
identical for each::
SUBSYSTEM=="cxl", ATTRS{mode}=="dedicated_process", SYMLINK="cxl/%b"
SUBSYSTEM=="cxl", ATTRS{mode}=="afu_directed", \

View File

@ -1,3 +1,7 @@
================================
Coherent Accelerator (CXL) Flash
================================
Introduction
============
@ -28,7 +32,7 @@ Introduction
responsible for the initialization of the adapter, setting up the
special path for user space access, and performing error recovery. It
communicates directly the Flash Accelerator Functional Unit (AFU)
as described in Documentation/powerpc/cxl.txt.
as described in Documentation/powerpc/cxl.rst.
The cxlflash driver supports two, mutually exclusive, modes of
operation at the device (LUN) level:
@ -58,7 +62,7 @@ Overview
The CXL Flash Adapter Driver establishes a master context with the
AFU. It uses memory mapped I/O (MMIO) for this control and setup. The
Adapter Problem Space Memory Map looks like this:
Adapter Problem Space Memory Map looks like this::
+-------------------------------+
| 512 * 64 KB User MMIO |
@ -375,7 +379,7 @@ CXL Flash Driver Host IOCTLs
Each host adapter instance that is supported by the cxlflash driver
has a special character device associated with it to enable a set of
host management function. These character devices are hosted in a
class dedicated for cxlflash and can be accessed via /dev/cxlflash/*.
class dedicated for cxlflash and can be accessed via `/dev/cxlflash/*`.
Applications can be written to perform various functions using the
host ioctl APIs below.

View File

@ -1,10 +1,11 @@
=====================
DAWR issues on POWER9
============================
=====================
On POWER9 the Data Address Watchpoint Register (DAWR) can cause a checkstop
if it points to cache inhibited (CI) memory. Currently Linux has no way to
disinguish CI memory when configuring the DAWR, so (for now) the DAWR is
disabled by this commit:
disabled by this commit::
commit 9654153158d3e0684a1bdb76dbababdb7111d5a0
Author: Michael Neuling <mikey@neuling.org>
@ -12,7 +13,7 @@ disabled by this commit:
powerpc: Disable DAWR in the base POWER9 CPU features
Technical Details:
============================
==================
DAWR has 6 different ways of being set.
1) ptrace
@ -37,7 +38,7 @@ DAWR on the migration.
For xmon, the 'bd' command will return an error on P9.
Consequences for users
============================
======================
For GDB watchpoints (ie 'watch' command) on POWER9 bare metal , GDB
will accept the command. Unfortunately since there is no hardware
@ -57,8 +58,8 @@ trapped in GDB. The watchpoint is remembered, so if the guest is
migrated back to the POWER8 host, it will start working again.
Force enabling the DAWR
=============================
Kernels (since ~v5.2) have an option to force enable the DAWR via:
=======================
Kernels (since ~v5.2) have an option to force enable the DAWR via::
echo Y > /sys/kernel/debug/powerpc/dawr_enable_dangerous
@ -86,5 +87,7 @@ dawr_enable_dangerous file will fail if the hypervisor doesn't support
writing the DAWR.
To double check the DAWR is working, run this kernel selftest:
tools/testing/selftests/powerpc/ptrace/ptrace-hwbreak.c
Any errors/failures/skips mean something is wrong.

View File

@ -1,5 +1,6 @@
DSCR (Data Stream Control Register)
================================================
===================================
DSCR (Data Stream Control Register)
===================================
DSCR register in powerpc allows user to have some control of prefetch of data
stream in the processor. Please refer to the ISA documents or related manual
@ -10,14 +11,17 @@ user interface.
(A) Data Structures:
(1) thread_struct:
(1) thread_struct::
dscr /* Thread DSCR value */
dscr_inherit /* Thread has changed default DSCR */
(2) PACA:
(2) PACA::
dscr_default /* per-CPU DSCR default value */
(3) sysfs.c:
(3) sysfs.c::
dscr_default /* System DSCR default value */
(B) Scheduler Changes:
@ -35,8 +39,8 @@ user interface.
(C) SYSFS Interface:
Global DSCR default: /sys/devices/system/cpu/dscr_default
CPU specific DSCR default: /sys/devices/system/cpu/cpuN/dscr
- Global DSCR default: /sys/devices/system/cpu/dscr_default
- CPU specific DSCR default: /sys/devices/system/cpu/cpuN/dscr
Changing the global DSCR default in the sysfs will change all the CPU
specific DSCR defaults immediately in their PACA structures. Again if

View File

@ -1,10 +1,10 @@
==========================
PCI Bus EEH Error Recovery
==========================
Linas Vepstas <linas@austin.ibm.com>
PCI Bus EEH Error Recovery
--------------------------
Linas Vepstas
<linas@austin.ibm.com>
12 January 2005
12 January 2005
Overview:
@ -153,7 +153,7 @@ beforehand.
Next, it uses the Linux kernel notifier chain/work queue mechanism to
allow any interested parties to find out about the failure. Device
drivers, or other parts of the kernel, can use
eeh_register_notifier(struct notifier_block *) to find out about EEH
`eeh_register_notifier(struct notifier_block *)` to find out about EEH
events. The event will include a pointer to the pci device, the
device node and some state info. Receivers of the event can "do as
they wish"; the default handler will be described further in this
@ -162,10 +162,13 @@ section.
To assist in the recovery of the device, eeh.c exports the
following functions:
rtas_set_slot_reset() -- assert the PCI #RST line for 1/8th of a second
rtas_configure_bridge() -- ask firmware to configure any PCI bridges
rtas_set_slot_reset()
assert the PCI #RST line for 1/8th of a second
rtas_configure_bridge()
ask firmware to configure any PCI bridges
located topologically under the pci slot.
eeh_save_bars() and eeh_restore_bars(): save and restore the PCI
eeh_save_bars() and eeh_restore_bars():
save and restore the PCI
config-space info for a device and any devices under it.
@ -191,7 +194,7 @@ events get delivered to user-space scripts.
Following is an example sequence of events that cause a device driver
close function to be called during the first phase of an EEH reset.
The following sequence is an example of the pcnet32 device driver.
The following sequence is an example of the pcnet32 device driver::
rpa_php_unconfig_pci_adapter (struct slot *) // in rpaphp_pci.c
{
@ -241,53 +244,54 @@ The following sequence is an example of the pcnet32 device driver.
}}}}}}
in drivers/pci/pci_driver.c,
struct device_driver->remove() is just pci_device_remove()
which calls struct pci_driver->remove() which is pcnet32_remove_one()
which calls unregister_netdev() (in net/core/dev.c)
which calls dev_close() (in net/core/dev.c)
which calls dev->stop() which is pcnet32_close()
which then does the appropriate shutdown.
in drivers/pci/pci_driver.c,
struct device_driver->remove() is just pci_device_remove()
which calls struct pci_driver->remove() which is pcnet32_remove_one()
which calls unregister_netdev() (in net/core/dev.c)
which calls dev_close() (in net/core/dev.c)
which calls dev->stop() which is pcnet32_close()
which then does the appropriate shutdown.
---
Following is the analogous stack trace for events sent to user-space
when the pci device is unconfigured.
rpa_php_unconfig_pci_adapter() { // in rpaphp_pci.c
calls
pci_remove_bus_device (struct pci_dev *) { // in /drivers/pci/remove.c
Following is the analogous stack trace for events sent to user-space
when the pci device is unconfigured::
rpa_php_unconfig_pci_adapter() { // in rpaphp_pci.c
calls
pci_destroy_dev (struct pci_dev *) {
pci_remove_bus_device (struct pci_dev *) { // in /drivers/pci/remove.c
calls
device_unregister (&dev->dev) { // in /drivers/base/core.c
pci_destroy_dev (struct pci_dev *) {
calls
device_del(struct device * dev) { // in /drivers/base/core.c
device_unregister (&dev->dev) { // in /drivers/base/core.c
calls
kobject_del() { //in /libs/kobject.c
device_del(struct device * dev) { // in /drivers/base/core.c
calls
kobject_uevent() { // in /libs/kobject.c
kobject_del() { //in /libs/kobject.c
calls
kset_uevent() { // in /lib/kobject.c
kobject_uevent() { // in /libs/kobject.c
calls
kset->uevent_ops->uevent() // which is really just
a call to
dev_uevent() { // in /drivers/base/core.c
kset_uevent() { // in /lib/kobject.c
calls
dev->bus->uevent() which is really just a call to
pci_uevent () { // in drivers/pci/hotplug.c
which prints device name, etc....
kset->uevent_ops->uevent() // which is really just
a call to
dev_uevent() { // in /drivers/base/core.c
calls
dev->bus->uevent() which is really just a call to
pci_uevent () { // in drivers/pci/hotplug.c
which prints device name, etc....
}
}
}
then kobject_uevent() sends a netlink uevent to userspace
--> userspace uevent
(during early boot, nobody listens to netlink events and
kobject_uevent() executes uevent_helper[], which runs the
event process /sbin/hotplug)
then kobject_uevent() sends a netlink uevent to userspace
--> userspace uevent
(during early boot, nobody listens to netlink events and
kobject_uevent() executes uevent_helper[], which runs the
event process /sbin/hotplug)
}
}
}
kobject_del() then calls sysfs_remove_dir(), which would
trigger any user-space daemon that was watching /sysfs,
and notice the delete event.
kobject_del() then calls sysfs_remove_dir(), which would
trigger any user-space daemon that was watching /sysfs,
and notice the delete event.
Pro's and Con's of the Current Design
@ -299,12 +303,12 @@ individual device drivers, so that the current design throws a wide net.
The biggest negative of the design is that it potentially disturbs
network daemons and file systems that didn't need to be disturbed.
-- A minor complaint is that resetting the network card causes
- A minor complaint is that resetting the network card causes
user-space back-to-back ifdown/ifup burps that potentially disturb
network daemons, that didn't need to even know that the pci
card was being rebooted.
-- A more serious concern is that the same reset, for SCSI devices,
- A more serious concern is that the same reset, for SCSI devices,
causes havoc to mounted file systems. Scripts cannot post-facto
unmount a file system without flushing pending buffers, but this
is impossible, because I/O has already been stopped. Thus,
@ -322,7 +326,7 @@ network daemons and file systems that didn't need to be disturbed.
from the block layer. It would be very natural to add an EEH
reset into this chain of events.
-- If a SCSI error occurs for the root device, all is lost unless
- If a SCSI error occurs for the root device, all is lost unless
the sysadmin had the foresight to run /bin, /sbin, /etc, /var
and so on, out of ramdisk/tmpfs.
@ -330,5 +334,3 @@ network daemons and file systems that didn't need to be disturbed.
Conclusions
-----------
There's forward progress ...

View File

@ -1,7 +1,8 @@
======================
Firmware-Assisted Dump
======================
Firmware-Assisted Dump
------------------------
July 2011
July 2011
The goal of firmware-assisted dump is to enable the dump of
a crashed system, and to do so from a fully-reset system, and
@ -27,11 +28,11 @@ in production use.
Comparing with kdump or other strategies, firmware-assisted
dump offers several strong, practical advantages:
-- Unlike kdump, the system has been reset, and loaded
- Unlike kdump, the system has been reset, and loaded
with a fresh copy of the kernel. In particular,
PCI and I/O devices have been reinitialized and are
in a clean, consistent state.
-- Once the dump is copied out, the memory that held the dump
- Once the dump is copied out, the memory that held the dump
is immediately available to the running kernel. And therefore,
unlike kdump, fadump doesn't need a 2nd reboot to get back
the system to the production configuration.
@ -40,17 +41,18 @@ The above can only be accomplished by coordination with,
and assistance from the Power firmware. The procedure is
as follows:
-- The first kernel registers the sections of memory with the
- The first kernel registers the sections of memory with the
Power firmware for dump preservation during OS initialization.
These registered sections of memory are reserved by the first
kernel during early boot.
-- When a system crashes, the Power firmware will save
- When a system crashes, the Power firmware will save
the low memory (boot memory of size larger of 5% of system RAM
or 256MB) of RAM to the previous registered region. It will
also save system registers, and hardware PTE's.
NOTE: The term 'boot memory' means size of the low memory chunk
NOTE:
The term 'boot memory' means size of the low memory chunk
that is required for a kernel to boot successfully when
booted with restricted memory. By default, the boot memory
size will be the larger of 5% of system RAM or 256MB.
@ -64,12 +66,12 @@ as follows:
as fadump uses a predefined offset to reserve memory
for boot memory dump preservation in case of a crash.
-- After the low memory (boot memory) area has been saved, the
- After the low memory (boot memory) area has been saved, the
firmware will reset PCI and other hardware state. It will
*not* clear the RAM. It will then launch the bootloader, as
normal.
-- The freshly booted kernel will notice that there is a new
- The freshly booted kernel will notice that there is a new
node (ibm,dump-kernel) in the device tree, indicating that
there is crash data available from a previous boot. During
the early boot OS will reserve rest of the memory above
@ -77,17 +79,18 @@ as follows:
size. This will make sure that the second kernel will not
touch any of the dump memory area.
-- User-space tools will read /proc/vmcore to obtain the contents
- User-space tools will read /proc/vmcore to obtain the contents
of memory, which holds the previous crashed kernel dump in ELF
format. The userspace tools may copy this info to disk, or
network, nas, san, iscsi, etc. as desired.
-- Once the userspace tool is done saving dump, it will echo
- Once the userspace tool is done saving dump, it will echo
'1' to /sys/kernel/fadump_release_mem to release the reserved
memory back to general use, except the memory required for
next firmware-assisted dump registration.
e.g.
e.g.::
# echo 1 > /sys/kernel/fadump_release_mem
Please note that the firmware-assisted dump feature
@ -95,7 +98,7 @@ is only available on Power6 and above systems with recent
firmware versions.
Implementation details:
----------------------
-----------------------
During boot, a check is made to see if firmware supports
this feature on that particular machine. If it does, then
@ -121,7 +124,7 @@ Allocator (CMA) for memory reservation if CMA is configured for kernel.
With CMA reservation this memory will be available for applications to
use it, while kernel is prevented from using it. With this fadump will
still be able to capture all of the kernel memory and most of the user
space memory except the user pages that were present in CMA region.
space memory except the user pages that were present in CMA region::
o Memory Reservation during first kernel
@ -166,7 +169,7 @@ The tools to examine the dump will be same as the ones
used for kdump.
How to enable firmware-assisted dump (fadump):
-------------------------------------
----------------------------------------------
1. Set config option CONFIG_FA_DUMP=y and build kernel.
2. Boot into linux kernel with 'fadump=on' kernel cmdline option.
@ -177,19 +180,20 @@ How to enable firmware-assisted dump (fadump):
to specify size of the memory to reserve for boot memory dump
preservation.
NOTE: 1. 'fadump_reserve_mem=' parameter has been deprecated. Instead
use 'crashkernel=' to specify size of the memory to reserve
for boot memory dump preservation.
2. If firmware-assisted dump fails to reserve memory then it
will fallback to existing kdump mechanism if 'crashkernel='
option is set at kernel cmdline.
3. if user wants to capture all of user space memory and ok with
reserved memory not available to production system, then
'fadump=nocma' kernel parameter can be used to fallback to
old behaviour.
NOTE:
1. 'fadump_reserve_mem=' parameter has been deprecated. Instead
use 'crashkernel=' to specify size of the memory to reserve
for boot memory dump preservation.
2. If firmware-assisted dump fails to reserve memory then it
will fallback to existing kdump mechanism if 'crashkernel='
option is set at kernel cmdline.
3. if user wants to capture all of user space memory and ok with
reserved memory not available to production system, then
'fadump=nocma' kernel parameter can be used to fallback to
old behaviour.
Sysfs/debugfs files:
------------
--------------------
Firmware-assisted dump feature uses sysfs file system to hold
the control files and debugfs file to display memory reserved region.
@ -197,20 +201,20 @@ the control files and debugfs file to display memory reserved region.
Here is the list of files under kernel sysfs:
/sys/kernel/fadump_enabled
This is used to display the fadump status.
0 = fadump is disabled
1 = fadump is enabled
- 0 = fadump is disabled
- 1 = fadump is enabled
This interface can be used by kdump init scripts to identify if
fadump is enabled in the kernel and act accordingly.
/sys/kernel/fadump_registered
This is used to display the fadump registration status as well
as to control (start/stop) the fadump registration.
0 = fadump is not registered.
1 = fadump is registered and ready to handle system crash.
- 0 = fadump is not registered.
- 1 = fadump is registered and ready to handle system crash.
To register fadump echo 1 > /sys/kernel/fadump_registered and
echo 0 > /sys/kernel/fadump_registered for un-register and stop the
@ -219,13 +223,12 @@ Here is the list of files under kernel sysfs:
easily integrated with kdump service start/stop.
/sys/kernel/fadump_release_mem
This file is available only when fadump is active during
second kernel. This is used to release the reserved memory
region that are held for saving crash dump. To release the
reserved memory echo 1 to it:
reserved memory echo 1 to it::
echo 1 > /sys/kernel/fadump_release_mem
echo 1 > /sys/kernel/fadump_release_mem
After echo 1, the content of the /sys/kernel/debug/powerpc/fadump_region
file will change to reflect the new memory reservations.
@ -238,38 +241,39 @@ Here is the list of files under powerpc debugfs:
(Assuming debugfs is mounted on /sys/kernel/debug directory.)
/sys/kernel/debug/powerpc/fadump_region
This file shows the reserved memory regions if fadump is
enabled otherwise this file is empty. The output format
is:
<region>: [<start>-<end>] <reserved-size> bytes, Dumped: <dump-size>
is::
<region>: [<start>-<end>] <reserved-size> bytes, Dumped: <dump-size>
e.g.
Contents when fadump is registered during first kernel
Contents when fadump is registered during first kernel::
# cat /sys/kernel/debug/powerpc/fadump_region
CPU : [0x0000006ffb0000-0x0000006fff001f] 0x40020 bytes, Dumped: 0x0
HPTE: [0x0000006fff0020-0x0000006fff101f] 0x1000 bytes, Dumped: 0x0
DUMP: [0x0000006fff1020-0x0000007fff101f] 0x10000000 bytes, Dumped: 0x0
# cat /sys/kernel/debug/powerpc/fadump_region
CPU : [0x0000006ffb0000-0x0000006fff001f] 0x40020 bytes, Dumped: 0x0
HPTE: [0x0000006fff0020-0x0000006fff101f] 0x1000 bytes, Dumped: 0x0
DUMP: [0x0000006fff1020-0x0000007fff101f] 0x10000000 bytes, Dumped: 0x0
Contents when fadump is active during second kernel
Contents when fadump is active during second kernel::
# cat /sys/kernel/debug/powerpc/fadump_region
CPU : [0x0000006ffb0000-0x0000006fff001f] 0x40020 bytes, Dumped: 0x40020
HPTE: [0x0000006fff0020-0x0000006fff101f] 0x1000 bytes, Dumped: 0x1000
DUMP: [0x0000006fff1020-0x0000007fff101f] 0x10000000 bytes, Dumped: 0x10000000
: [0x00000010000000-0x0000006ffaffff] 0x5ffb0000 bytes, Dumped: 0x5ffb0000
# cat /sys/kernel/debug/powerpc/fadump_region
CPU : [0x0000006ffb0000-0x0000006fff001f] 0x40020 bytes, Dumped: 0x40020
HPTE: [0x0000006fff0020-0x0000006fff101f] 0x1000 bytes, Dumped: 0x1000
DUMP: [0x0000006fff1020-0x0000007fff101f] 0x10000000 bytes, Dumped: 0x10000000
: [0x00000010000000-0x0000006ffaffff] 0x5ffb0000 bytes, Dumped: 0x5ffb0000
NOTE: Please refer to Documentation/filesystems/debugfs.txt on
NOTE:
Please refer to Documentation/filesystems/debugfs.txt on
how to mount the debugfs filesystem.
TODO:
-----
o Need to come up with the better approach to find out more
- Need to come up with the better approach to find out more
accurate boot memory size that is required for a kernel to
boot successfully when booted with restricted memory.
o The fadump implementation introduces a fadump crash info structure
- The fadump implementation introduces a fadump crash info structure
in the scratch area before the ELF core header. The idea of introducing
this structure is to pass some important crash info data to the second
kernel which will help second kernel to populate ELF core header with
@ -277,7 +281,9 @@ TODO:
design implementation does not address a possibility of introducing
additional fields (in future) to this structure without affecting
compatibility. Need to come up with the better approach to address this.
The possible approaches are:
1. Introduce version field for version tracking, bump up the version
whenever a new field is added to the structure in future. The version
field can be used to find out what fields are valid for the current
@ -285,8 +291,11 @@ TODO:
2. Reserve the area of predefined size (say PAGE_SIZE) for this
structure and have unused area as reserved (initialized to zero)
for future field additions.
The advantage of approach 1 over 2 is we don't need to reserve extra space.
---
Author: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
This document is based on the original documentation written for phyp
assisted dump by Linas Vepstas and Manish Ahuja.

View File

@ -1,19 +1,22 @@
===========================================================================
HVCS
IBM "Hypervisor Virtual Console Server" Installation Guide
for Linux Kernel 2.6.4+
Copyright (C) 2004 IBM Corporation
===============================================================
HVCS IBM "Hypervisor Virtual Console Server" Installation Guide
===============================================================
===========================================================================
NOTE:Eight space tabs are the optimum editor setting for reading this file.
===========================================================================
for Linux Kernel 2.6.4+
Author(s) : Ryan S. Arnold <rsa@us.ibm.com>
Date Created: March, 02, 2004
Last Changed: August, 24, 2004
Copyright (C) 2004 IBM Corporation
---------------------------------------------------------------------------
Table of contents:
.. ===========================================================================
.. NOTE:Eight space tabs are the optimum editor setting for reading this file.
.. ===========================================================================
Author(s): Ryan S. Arnold <rsa@us.ibm.com>
Date Created: March, 02, 2004
Last Changed: August, 24, 2004
.. Table of contents:
1. Driver Introduction:
2. System Requirements
@ -27,8 +30,8 @@ Table of contents:
8. Questions & Answers:
9. Reporting Bugs:
---------------------------------------------------------------------------
1. Driver Introduction:
=======================
This is the device driver for the IBM Hypervisor Virtual Console Server,
"hvcs". The IBM hvcs provides a tty driver interface to allow Linux user
@ -38,8 +41,8 @@ ppc64 system. Physical hardware consoles per partition are not practical
on this hardware so system consoles are accessed by this driver using
firmware interfaces to virtual terminal devices.
---------------------------------------------------------------------------
2. System Requirements:
=======================
This device driver was written using 2.6.4 Linux kernel APIs and will only
build and run on kernels of this version or later.
@ -52,8 +55,8 @@ Sysfs must be mounted on the system so that the user can determine which
major and minor numbers are associated with each vty-server. Directions
for sysfs mounting are outside the scope of this document.
---------------------------------------------------------------------------
3. Build Options:
=================
The hvcs driver registers itself as a tty driver. The tty layer
dynamically allocates a block of major and minor numbers in a quantity
@ -65,11 +68,11 @@ If the default number of device entries is adequate then this driver can be
built into the kernel. If not, the default can be over-ridden by inserting
the driver as a module with insmod parameters.
---------------------------------------------------------------------------
3.1 Built-in:
-------------
The following menuconfig example demonstrates selecting to build this
driver into the kernel.
driver into the kernel::
Device Drivers --->
Character devices --->
@ -77,11 +80,11 @@ driver into the kernel.
Begin the kernel make process.
---------------------------------------------------------------------------
3.2 Module:
-----------
The following menuconfig example demonstrates selecting to build this
driver as a kernel module.
driver as a kernel module::
Device Drivers --->
Character devices --->
@ -89,11 +92,11 @@ driver as a kernel module.
The make process will build the following kernel modules:
hvcs.ko
hvcserver.ko
- hvcs.ko
- hvcserver.ko
To insert the module with the default allocation execute the following
commands in the order they appear:
commands in the order they appear::
insmod hvcserver.ko
insmod hvcs.ko
@ -103,7 +106,7 @@ be inserted first, otherwise the hvcs module will not find some of the
symbols it expects.
To override the default use an insmod parameter as follows (requesting 4
tty devices as an example):
tty devices as an example)::
insmod hvcs.ko hvcs_parm_num_devs=4
@ -115,31 +118,31 @@ source file before building.
NOTE: The length of time it takes to insmod the driver seems to be related
to the number of tty interfaces the registering driver requests.
In order to remove the driver module execute the following command:
In order to remove the driver module execute the following command::
rmmod hvcs.ko
The recommended method for installing hvcs as a module is to use depmod to
build a current modules.dep file in /lib/modules/`uname -r` and then
execute:
execute::
modprobe hvcs hvcs_parm_num_devs=4
modprobe hvcs hvcs_parm_num_devs=4
The modules.dep file indicates that hvcserver.ko needs to be inserted
before hvcs.ko and modprobe uses this file to smartly insert the modules in
the proper order.
The following modprobe command is used to remove hvcs and hvcserver in the
proper order:
proper order::
modprobe -r hvcs
modprobe -r hvcs
---------------------------------------------------------------------------
4. Installation:
================
The tty layer creates sysfs entries which contain the major and minor
numbers allocated for the hvcs driver. The following snippet of "tree"
output of the sysfs directory shows where these numbers are presented:
output of the sysfs directory shows where these numbers are presented::
sys/
|-- *other sysfs base dirs*
@ -164,7 +167,7 @@ output of the sysfs directory shows where these numbers are presented:
|-- *other sysfs base dirs*
For the above examples the following output is a result of cat'ing the
"dev" entry in the hvcs directory:
"dev" entry in the hvcs directory::
Pow5:/sys/class/tty/hvcs0/ # cat dev
254:0
@ -184,7 +187,7 @@ systems running hvcs will already have the device entries created or udev
will do it automatically.
Given the example output above, to manually create a /dev/hvcs* node entry
mknod can be used as follows:
mknod can be used as follows::
mknod /dev/hvcs0 c 254 0
mknod /dev/hvcs1 c 254 1
@ -195,15 +198,15 @@ Using mknod to manually create the device entries makes these device nodes
persistent. Once created they will exist prior to the driver insmod.
Attempting to connect an application to /dev/hvcs* prior to insertion of
the hvcs module will result in an error message similar to the following:
the hvcs module will result in an error message similar to the following::
"/dev/hvcs*: No such device".
NOTE: Just because there is a device node present doesn't mean that there
is a vty-server device configured for that node.
---------------------------------------------------------------------------
5. Connection
=============
Since this driver controls devices that provide a tty interface a user can
interact with the device node entries using any standard tty-interactive
@ -249,7 +252,7 @@ vty-server adapter is associated with which /dev/hvcs* node a special sysfs
attribute has been added to each vty-server sysfs entry. This entry is
called "index" and showing it reveals an integer that refers to the
/dev/hvcs* entry to use to connect to that device. For instance cating the
index attribute of vty-server adapter 30000004 shows the following.
index attribute of vty-server adapter 30000004 shows the following::
Pow5:/sys/bus/vio/drivers/hvcs/30000004 # cat index
2
@ -262,8 +265,8 @@ system the /dev/hvcs* entry that interacts with a particular vty-server
adapter is not guaranteed to remain the same across system reboots. Look
in the Q & A section for more on this issue.
---------------------------------------------------------------------------
6. Disconnection
================
As a security feature to prevent the delivery of stale data to an
unintended target the Power5 system firmware disables the fetching of data
@ -305,7 +308,7 @@ connection between the vty-server and target vty ONLY if the vterm_state
previously read '1'. The write directive is ignored if the vterm_state
read '0' or if any value other than '0' was written to the vterm_state
attribute. The following example will show the method used for verifying
the vty-server connection status and disconnecting a vty-server connection.
the vty-server connection status and disconnecting a vty-server connection::
Pow5:/sys/bus/vio/drivers/hvcs/30000004 # cat vterm_state
1
@ -318,12 +321,12 @@ the vty-server connection status and disconnecting a vty-server connection.
All vty-server connections are automatically terminated when the device is
hotplug removed and when the module is removed.
---------------------------------------------------------------------------
7. Configuration
================
Each vty-server has a sysfs entry in the /sys/devices/vio directory, which
is symlinked in several other sysfs tree directories, notably under the
hvcs driver entry, which looks like the following example:
hvcs driver entry, which looks like the following example::
Pow5:/sys/bus/vio/drivers/hvcs # ls
. .. 30000003 30000004 rescan
@ -344,7 +347,7 @@ completed or was never executed.
Vty-server entries in this directory are a 32 bit partition unique unit
address that is created by firmware. An example vty-server sysfs entry
looks like the following:
looks like the following::
Pow5:/sys/bus/vio/drivers/hvcs/30000004 # ls
. current_vty devspec name partner_vtys
@ -352,21 +355,21 @@ looks like the following:
Each entry is provided, by default with a "name" attribute. Reading the
"name" attribute will reveal the device type as shown in the following
example:
example::
Pow5:/sys/bus/vio/drivers/hvcs/30000003 # cat name
vty-server
Each entry is also provided, by default, with a "devspec" attribute which
reveals the full device specification when read, as shown in the following
example:
example::
Pow5:/sys/bus/vio/drivers/hvcs/30000004 # cat devspec
/vdevice/vty-server@30000004
Each vty-server sysfs dir is provided with two read-only attributes that
provide lists of easily parsed partner vty data: "partner_vtys" and
"partner_clcs".
"partner_clcs"::
Pow5:/sys/bus/vio/drivers/hvcs/30000004 # cat partner_vtys
30000000
@ -396,7 +399,7 @@ A vty-server can only be connected to a single vty at a time. The entry,
read.
The current_vty can be changed by writing a valid partner clc to the entry
as in the following example:
as in the following example::
Pow5:/sys/bus/vio/drivers/hvcs/30000004 # echo U5112.428.10304
8A-V4-C0 > current_vty
@ -408,9 +411,9 @@ currently open connection is freed.
Information on the "vterm_state" attribute was covered earlier on the
chapter entitled "disconnection".
---------------------------------------------------------------------------
8. Questions & Answers:
===========================================================================
=======================
Q: What are the security concerns involving hvcs?
A: There are three main security concerns:
@ -429,6 +432,7 @@ A: There are three main security concerns:
partition) will experience the previously logged in session.
---------------------------------------------------------------------------
Q: How do I multiplex a console that I grab through hvcs so that other
people can see it:
@ -440,6 +444,7 @@ term type "screen" to others. This means that curses based programs may
not display properly in screen sessions.
---------------------------------------------------------------------------
Q: Why are the colors all messed up?
Q: Why are the control characters acting strange or not working?
Q: Why is the console output all strange and unintelligible?
@ -455,6 +460,7 @@ disconnect from the console. This will ensure that the next user gets
their own TERM type set when they login.
---------------------------------------------------------------------------
Q: When I try to CONNECT kermit to an hvcs device I get:
"Sorry, can't open connection: /dev/hvcs*"What is happening?
@ -490,6 +496,7 @@ A: There is not a corresponding vty-server device that maps to an existing
/dev/hvcs* entry.
---------------------------------------------------------------------------
Q: When I try to CONNECT kermit to an hvcs device I get:
"Sorry, write access to UUCP lockfile directory denied."
@ -497,6 +504,7 @@ A: The /dev/hvcs* entry you have specified doesn't exist where you said it
does? Maybe you haven't inserted the module (on systems with udev).
---------------------------------------------------------------------------
Q: If I already have one Linux partition installed can I use hvcs on said
partition to provide the console for the install of a second Linux
partition?
@ -505,6 +513,7 @@ A: Yes granted that your are connected to the /dev/hvcs* device using
kermit or cu or some other program that doesn't provide terminal emulation.
---------------------------------------------------------------------------
Q: Can I connect to more than one partition's console at a time using this
driver?
@ -512,6 +521,7 @@ A: Yes. Of course this means that there must be more than one vty-server
configured for this partition and each must point to a disconnected vty.
---------------------------------------------------------------------------
Q: Does the hvcs driver support dynamic (hotplug) addition of devices?
A: Yes, if you have dlpar and hotplug enabled for your system and it has
@ -519,6 +529,7 @@ been built into the kernel the hvcs drivers is configured to dynamically
handle additions of new devices and removals of unused devices.
---------------------------------------------------------------------------
Q: For some reason /dev/hvcs* doesn't map to the same vty-server adapter
after a reboot. What happened?
@ -533,6 +544,7 @@ on how to determine which vty-server goes with which /dev/hvcs* node.
Hint; look at the sysfs "index" attribute for the vty-server.
---------------------------------------------------------------------------
Q: Can I use /dev/hvcs* as a conduit to another partition and use a tty
device on that partition as the other end of the pipe?
@ -554,7 +566,9 @@ read or write to /dev/hvcs*. Now you have a tty conduit between two
partitions.
---------------------------------------------------------------------------
9. Reporting Bugs:
==================
The proper channel for reporting bugs is either through the Linux OS
distribution company that provided your OS or by posting issues to the

View File

@ -0,0 +1,34 @@
.. SPDX-License-Identifier: GPL-2.0
=======
powerpc
=======
.. toctree::
:maxdepth: 1
bootwrapper
cpu_families
cpu_features
cxl
cxlflash
dawr-power9
dscr
eeh-pci-error-recovery
firmware-assisted-dump
hvcs
isa-versions
mpc52xx
pci_iov_resource_on_powernv
pmu-ebb
ptrace
qe_firmware
syscall64-abi
transactional_memory
.. only:: subproject and html
Indices
=======
* :ref:`genindex`

View File

@ -1,13 +1,12 @@
:orphan:
==========================
CPU to ISA Version Mapping
==========================
Mapping of some CPU versions to relevant ISA versions.
========= ====================
========= ====================================================================
CPU Architecture version
========= ====================
========= ====================================================================
Power9 Power ISA v3.0B
Power8 Power ISA v2.07
Power7 Power ISA v2.06
@ -24,7 +23,7 @@ PPC970 - PowerPC User Instruction Set Architecture Book I v2.01
- PowerPC Virtual Environment Architecture Book II v2.01
- PowerPC Operating Environment Architecture Book III v2.01
- Plus Altivec/VMX ~= 2.03
========= ====================
========= ====================================================================
Key Features
@ -60,9 +59,9 @@ Power5 No
PPC970 No
========== ====
========== ====================
========== ====================================
CPU Transactional Memory
========== ====================
========== ====================================
Power9 Yes (* see transactional_memory.txt)
Power8 Yes
Power7 No
@ -73,4 +72,4 @@ Power5++ No
Power5+ No
Power5 No
PPC970 No
========== ====================
========== ====================================

View File

@ -1,11 +1,13 @@
=============================
Linux 2.6.x on MPC52xx family
-----------------------------
=============================
For the latest info, go to http://www.246tNt.com/mpc52xx/
To compile/use :
- U-Boot:
- U-Boot::
# <edit Makefile to set ARCH=ppc & CROSS_COMPILE=... ( also EXTRAVERSION
if you wish to ).
# make lite5200_defconfig
@ -16,7 +18,8 @@ To compile/use :
=> tftpboot 400000 pRamdisk
=> bootm 200000 400000
- DBug:
- DBug::
# <edit Makefile to set ARCH=ppc & CROSS_COMPILE=... ( also EXTRAVERSION
if you wish to ).
# make lite5200_defconfig
@ -28,7 +31,8 @@ To compile/use :
DBug> dn -i zImage.initrd.lite5200
Some remarks :
Some remarks:
- The port is named mpc52xxx, and config options are PPC_MPC52xx. The MGT5100
is not supported, and I'm not sure anyone is interesting in working on it
so. I didn't took 5xxx because there's apparently a lot of 5xxx that have

View File

@ -1,6 +1,13 @@
===================================================
PCI Express I/O Virtualization Resource on Powerenv
===================================================
Wei Yang <weiyang@linux.vnet.ibm.com>
Benjamin Herrenschmidt <benh@au1.ibm.com>
Bjorn Helgaas <bhelgaas@google.com>
26 Aug 2014
This document describes the requirement from hardware for PCI MMIO resource
@ -10,6 +17,7 @@ Endpoints and the implementation on P8 (IODA2). The next two sections talks
about considerations on enabling SRIOV on IODA2.
1. Introduction to Partitionable Endpoints
==========================================
A Partitionable Endpoint (PE) is a way to group the various resources
associated with a device or a set of devices to provide isolation between
@ -35,6 +43,7 @@ is a completely separate HW entity that replicates the entire logic, so has
its own set of PEs, etc.
2. Implementation of Partitionable Endpoints on P8 (IODA2)
==========================================================
P8 supports up to 256 Partitionable Endpoints per PHB.
@ -149,6 +158,7 @@ P8 supports up to 256 Partitionable Endpoints per PHB.
sense, but we haven't done it yet.
3. Considerations for SR-IOV on PowerKVM
========================================
* SR-IOV Background
@ -224,7 +234,7 @@ P8 supports up to 256 Partitionable Endpoints per PHB.
IODA supports 256 PEs, so segmented windows contain 256 segments, so if
total_VFs is less than 256, we have the situation in Figure 1.0, where
segments [total_VFs, 255] of the M64 window may map to some MMIO range on
other devices:
other devices::
0 1 total_VFs - 1
+------+------+- -+------+------+
@ -243,7 +253,7 @@ P8 supports up to 256 Partitionable Endpoints per PHB.
Figure 1.0 Direct map VF(n) BAR space
Our current solution is to allocate 256 segments even if the VF(n) BAR
space doesn't need that much, as shown in Figure 1.1:
space doesn't need that much, as shown in Figure 1.1::
0 1 total_VFs - 1 255
+------+------+- -+------+------+- -+------+------+
@ -269,6 +279,7 @@ P8 supports up to 256 Partitionable Endpoints per PHB.
responds to segments [total_VFs, 255].
4. Implications for the Generic PCI Code
========================================
The PCIe SR-IOV spec requires that the base of the VF(n) BAR space be
aligned to the size of an individual VF BAR.

View File

@ -1,3 +1,4 @@
========================
PMU Event Based Branches
========================

View File

@ -0,0 +1,156 @@
======
Ptrace
======
GDB intends to support the following hardware debug features of BookE
processors:
4 hardware breakpoints (IAC)
2 hardware watchpoints (read, write and read-write) (DAC)
2 value conditions for the hardware watchpoints (DVC)
For that, we need to extend ptrace so that GDB can query and set these
resources. Since we're extending, we're trying to create an interface
that's extendable and that covers both BookE and server processors, so
that GDB doesn't need to special-case each of them. We added the
following 3 new ptrace requests.
1. PTRACE_PPC_GETHWDEBUGINFO
============================
Query for GDB to discover the hardware debug features. The main info to
be returned here is the minimum alignment for the hardware watchpoints.
BookE processors don't have restrictions here, but server processors have
an 8-byte alignment restriction for hardware watchpoints. We'd like to avoid
adding special cases to GDB based on what it sees in AUXV.
Since we're at it, we added other useful info that the kernel can return to
GDB: this query will return the number of hardware breakpoints, hardware
watchpoints and whether it supports a range of addresses and a condition.
The query will fill the following structure provided by the requesting process::
struct ppc_debug_info {
unit32_t version;
unit32_t num_instruction_bps;
unit32_t num_data_bps;
unit32_t num_condition_regs;
unit32_t data_bp_alignment;
unit32_t sizeof_condition; /* size of the DVC register */
uint64_t features; /* bitmask of the individual flags */
};
features will have bits indicating whether there is support for::
#define PPC_DEBUG_FEATURE_INSN_BP_RANGE 0x1
#define PPC_DEBUG_FEATURE_INSN_BP_MASK 0x2
#define PPC_DEBUG_FEATURE_DATA_BP_RANGE 0x4
#define PPC_DEBUG_FEATURE_DATA_BP_MASK 0x8
#define PPC_DEBUG_FEATURE_DATA_BP_DAWR 0x10
2. PTRACE_SETHWDEBUG
Sets a hardware breakpoint or watchpoint, according to the provided structure::
struct ppc_hw_breakpoint {
uint32_t version;
#define PPC_BREAKPOINT_TRIGGER_EXECUTE 0x1
#define PPC_BREAKPOINT_TRIGGER_READ 0x2
#define PPC_BREAKPOINT_TRIGGER_WRITE 0x4
uint32_t trigger_type; /* only some combinations allowed */
#define PPC_BREAKPOINT_MODE_EXACT 0x0
#define PPC_BREAKPOINT_MODE_RANGE_INCLUSIVE 0x1
#define PPC_BREAKPOINT_MODE_RANGE_EXCLUSIVE 0x2
#define PPC_BREAKPOINT_MODE_MASK 0x3
uint32_t addr_mode; /* address match mode */
#define PPC_BREAKPOINT_CONDITION_MODE 0x3
#define PPC_BREAKPOINT_CONDITION_NONE 0x0
#define PPC_BREAKPOINT_CONDITION_AND 0x1
#define PPC_BREAKPOINT_CONDITION_EXACT 0x1 /* different name for the same thing as above */
#define PPC_BREAKPOINT_CONDITION_OR 0x2
#define PPC_BREAKPOINT_CONDITION_AND_OR 0x3
#define PPC_BREAKPOINT_CONDITION_BE_ALL 0x00ff0000 /* byte enable bits */
#define PPC_BREAKPOINT_CONDITION_BE(n) (1<<((n)+16))
uint32_t condition_mode; /* break/watchpoint condition flags */
uint64_t addr;
uint64_t addr2;
uint64_t condition_value;
};
A request specifies one event, not necessarily just one register to be set.
For instance, if the request is for a watchpoint with a condition, both the
DAC and DVC registers will be set in the same request.
With this GDB can ask for all kinds of hardware breakpoints and watchpoints
that the BookE supports. COMEFROM breakpoints available in server processors
are not contemplated, but that is out of the scope of this work.
ptrace will return an integer (handle) uniquely identifying the breakpoint or
watchpoint just created. This integer will be used in the PTRACE_DELHWDEBUG
request to ask for its removal. Return -ENOSPC if the requested breakpoint
can't be allocated on the registers.
Some examples of using the structure to:
- set a breakpoint in the first breakpoint register::
p.version = PPC_DEBUG_CURRENT_VERSION;
p.trigger_type = PPC_BREAKPOINT_TRIGGER_EXECUTE;
p.addr_mode = PPC_BREAKPOINT_MODE_EXACT;
p.condition_mode = PPC_BREAKPOINT_CONDITION_NONE;
p.addr = (uint64_t) address;
p.addr2 = 0;
p.condition_value = 0;
- set a watchpoint which triggers on reads in the second watchpoint register::
p.version = PPC_DEBUG_CURRENT_VERSION;
p.trigger_type = PPC_BREAKPOINT_TRIGGER_READ;
p.addr_mode = PPC_BREAKPOINT_MODE_EXACT;
p.condition_mode = PPC_BREAKPOINT_CONDITION_NONE;
p.addr = (uint64_t) address;
p.addr2 = 0;
p.condition_value = 0;
- set a watchpoint which triggers only with a specific value::
p.version = PPC_DEBUG_CURRENT_VERSION;
p.trigger_type = PPC_BREAKPOINT_TRIGGER_READ;
p.addr_mode = PPC_BREAKPOINT_MODE_EXACT;
p.condition_mode = PPC_BREAKPOINT_CONDITION_AND | PPC_BREAKPOINT_CONDITION_BE_ALL;
p.addr = (uint64_t) address;
p.addr2 = 0;
p.condition_value = (uint64_t) condition;
- set a ranged hardware breakpoint::
p.version = PPC_DEBUG_CURRENT_VERSION;
p.trigger_type = PPC_BREAKPOINT_TRIGGER_EXECUTE;
p.addr_mode = PPC_BREAKPOINT_MODE_RANGE_INCLUSIVE;
p.condition_mode = PPC_BREAKPOINT_CONDITION_NONE;
p.addr = (uint64_t) begin_range;
p.addr2 = (uint64_t) end_range;
p.condition_value = 0;
- set a watchpoint in server processors (BookS)::
p.version = 1;
p.trigger_type = PPC_BREAKPOINT_TRIGGER_RW;
p.addr_mode = PPC_BREAKPOINT_MODE_RANGE_INCLUSIVE;
or
p.addr_mode = PPC_BREAKPOINT_MODE_EXACT;
p.condition_mode = PPC_BREAKPOINT_CONDITION_NONE;
p.addr = (uint64_t) begin_range;
/* For PPC_BREAKPOINT_MODE_RANGE_INCLUSIVE addr2 needs to be specified, where
* addr2 - addr <= 8 Bytes.
*/
p.addr2 = (uint64_t) end_range;
p.condition_value = 0;
3. PTRACE_DELHWDEBUG
Takes an integer which identifies an existing breakpoint or watchpoint
(i.e., the value returned from PTRACE_SETHWDEBUG), and deletes the
corresponding breakpoint or watchpoint..

View File

@ -1,151 +0,0 @@
GDB intends to support the following hardware debug features of BookE
processors:
4 hardware breakpoints (IAC)
2 hardware watchpoints (read, write and read-write) (DAC)
2 value conditions for the hardware watchpoints (DVC)
For that, we need to extend ptrace so that GDB can query and set these
resources. Since we're extending, we're trying to create an interface
that's extendable and that covers both BookE and server processors, so
that GDB doesn't need to special-case each of them. We added the
following 3 new ptrace requests.
1. PTRACE_PPC_GETHWDEBUGINFO
Query for GDB to discover the hardware debug features. The main info to
be returned here is the minimum alignment for the hardware watchpoints.
BookE processors don't have restrictions here, but server processors have
an 8-byte alignment restriction for hardware watchpoints. We'd like to avoid
adding special cases to GDB based on what it sees in AUXV.
Since we're at it, we added other useful info that the kernel can return to
GDB: this query will return the number of hardware breakpoints, hardware
watchpoints and whether it supports a range of addresses and a condition.
The query will fill the following structure provided by the requesting process:
struct ppc_debug_info {
unit32_t version;
unit32_t num_instruction_bps;
unit32_t num_data_bps;
unit32_t num_condition_regs;
unit32_t data_bp_alignment;
unit32_t sizeof_condition; /* size of the DVC register */
uint64_t features; /* bitmask of the individual flags */
};
features will have bits indicating whether there is support for:
#define PPC_DEBUG_FEATURE_INSN_BP_RANGE 0x1
#define PPC_DEBUG_FEATURE_INSN_BP_MASK 0x2
#define PPC_DEBUG_FEATURE_DATA_BP_RANGE 0x4
#define PPC_DEBUG_FEATURE_DATA_BP_MASK 0x8
#define PPC_DEBUG_FEATURE_DATA_BP_DAWR 0x10
2. PTRACE_SETHWDEBUG
Sets a hardware breakpoint or watchpoint, according to the provided structure:
struct ppc_hw_breakpoint {
uint32_t version;
#define PPC_BREAKPOINT_TRIGGER_EXECUTE 0x1
#define PPC_BREAKPOINT_TRIGGER_READ 0x2
#define PPC_BREAKPOINT_TRIGGER_WRITE 0x4
uint32_t trigger_type; /* only some combinations allowed */
#define PPC_BREAKPOINT_MODE_EXACT 0x0
#define PPC_BREAKPOINT_MODE_RANGE_INCLUSIVE 0x1
#define PPC_BREAKPOINT_MODE_RANGE_EXCLUSIVE 0x2
#define PPC_BREAKPOINT_MODE_MASK 0x3
uint32_t addr_mode; /* address match mode */
#define PPC_BREAKPOINT_CONDITION_MODE 0x3
#define PPC_BREAKPOINT_CONDITION_NONE 0x0
#define PPC_BREAKPOINT_CONDITION_AND 0x1
#define PPC_BREAKPOINT_CONDITION_EXACT 0x1 /* different name for the same thing as above */
#define PPC_BREAKPOINT_CONDITION_OR 0x2
#define PPC_BREAKPOINT_CONDITION_AND_OR 0x3
#define PPC_BREAKPOINT_CONDITION_BE_ALL 0x00ff0000 /* byte enable bits */
#define PPC_BREAKPOINT_CONDITION_BE(n) (1<<((n)+16))
uint32_t condition_mode; /* break/watchpoint condition flags */
uint64_t addr;
uint64_t addr2;
uint64_t condition_value;
};
A request specifies one event, not necessarily just one register to be set.
For instance, if the request is for a watchpoint with a condition, both the
DAC and DVC registers will be set in the same request.
With this GDB can ask for all kinds of hardware breakpoints and watchpoints
that the BookE supports. COMEFROM breakpoints available in server processors
are not contemplated, but that is out of the scope of this work.
ptrace will return an integer (handle) uniquely identifying the breakpoint or
watchpoint just created. This integer will be used in the PTRACE_DELHWDEBUG
request to ask for its removal. Return -ENOSPC if the requested breakpoint
can't be allocated on the registers.
Some examples of using the structure to:
- set a breakpoint in the first breakpoint register
p.version = PPC_DEBUG_CURRENT_VERSION;
p.trigger_type = PPC_BREAKPOINT_TRIGGER_EXECUTE;
p.addr_mode = PPC_BREAKPOINT_MODE_EXACT;
p.condition_mode = PPC_BREAKPOINT_CONDITION_NONE;
p.addr = (uint64_t) address;
p.addr2 = 0;
p.condition_value = 0;
- set a watchpoint which triggers on reads in the second watchpoint register
p.version = PPC_DEBUG_CURRENT_VERSION;
p.trigger_type = PPC_BREAKPOINT_TRIGGER_READ;
p.addr_mode = PPC_BREAKPOINT_MODE_EXACT;
p.condition_mode = PPC_BREAKPOINT_CONDITION_NONE;
p.addr = (uint64_t) address;
p.addr2 = 0;
p.condition_value = 0;
- set a watchpoint which triggers only with a specific value
p.version = PPC_DEBUG_CURRENT_VERSION;
p.trigger_type = PPC_BREAKPOINT_TRIGGER_READ;
p.addr_mode = PPC_BREAKPOINT_MODE_EXACT;
p.condition_mode = PPC_BREAKPOINT_CONDITION_AND | PPC_BREAKPOINT_CONDITION_BE_ALL;
p.addr = (uint64_t) address;
p.addr2 = 0;
p.condition_value = (uint64_t) condition;
- set a ranged hardware breakpoint
p.version = PPC_DEBUG_CURRENT_VERSION;
p.trigger_type = PPC_BREAKPOINT_TRIGGER_EXECUTE;
p.addr_mode = PPC_BREAKPOINT_MODE_RANGE_INCLUSIVE;
p.condition_mode = PPC_BREAKPOINT_CONDITION_NONE;
p.addr = (uint64_t) begin_range;
p.addr2 = (uint64_t) end_range;
p.condition_value = 0;
- set a watchpoint in server processors (BookS)
p.version = 1;
p.trigger_type = PPC_BREAKPOINT_TRIGGER_RW;
p.addr_mode = PPC_BREAKPOINT_MODE_RANGE_INCLUSIVE;
or
p.addr_mode = PPC_BREAKPOINT_MODE_EXACT;
p.condition_mode = PPC_BREAKPOINT_CONDITION_NONE;
p.addr = (uint64_t) begin_range;
/* For PPC_BREAKPOINT_MODE_RANGE_INCLUSIVE addr2 needs to be specified, where
* addr2 - addr <= 8 Bytes.
*/
p.addr2 = (uint64_t) end_range;
p.condition_value = 0;
3. PTRACE_DELHWDEBUG
Takes an integer which identifies an existing breakpoint or watchpoint
(i.e., the value returned from PTRACE_SETHWDEBUG), and deletes the
corresponding breakpoint or watchpoint..

View File

@ -1,23 +1,23 @@
Freescale QUICC Engine Firmware Uploading
-----------------------------------------
=========================================
Freescale QUICC Engine Firmware Uploading
=========================================
(c) 2007 Timur Tabi <timur at freescale.com>,
Freescale Semiconductor
Table of Contents
=================
.. Table of Contents
I - Software License for Firmware
I - Software License for Firmware
II - Microcode Availability
II - Microcode Availability
III - Description and Terminology
III - Description and Terminology
IV - Microcode Programming Details
IV - Microcode Programming Details
V - Firmware Structure Layout
V - Firmware Structure Layout
VI - Sample Code for Creating Firmware Files
VI - Sample Code for Creating Firmware Files
Revision Information
====================
@ -39,7 +39,7 @@ http://opensource.freescale.com. For other firmware files, please contact
your Freescale representative or your operating system vendor.
III - Description and Terminology
================================
=================================
In this document, the term 'microcode' refers to the sequence of 32-bit
integers that compose the actual QE microcode.
@ -89,7 +89,7 @@ being fixed in the RAM package utilizing they should be activated. This data
structure signals the microcode which of these virtual traps is active.
This structure contains 6 words that the application should copy to some
specific been defined. This table describes the structure.
specific been defined. This table describes the structure::
---------------------------------------------------------------
| Offset in | | Destination Offset | Size of |
@ -119,7 +119,7 @@ Extended Modes
This is a double word bit array (64 bits) that defines special functionality
which has an impact on the software drivers. Each bit has its own impact
and has special instructions for the s/w associated with it. This structure is
described in this table:
described in this table::
-----------------------------------------------------------------------
| Bit # | Name | Description |
@ -220,7 +220,8 @@ The 'model' field is a 16-bit number that matches the actual SOC. The
'major' and 'minor' fields are the major and minor revision numbers,
respectively, of the SOC.
For example, to match the 8323, revision 1.0:
For example, to match the 8323, revision 1.0::
soc.model = 8323
soc.major = 1
soc.minor = 0
@ -273,10 +274,10 @@ library and available to any driver that calles qe_get_firmware_info().
'reserved'.
After the last microcode is a 32-bit CRC. It can be calculated using
this algorithm:
this algorithm::
u32 crc32(const u8 *p, unsigned int len)
{
u32 crc32(const u8 *p, unsigned int len)
{
unsigned int i;
u32 crc = 0;
@ -286,7 +287,7 @@ u32 crc32(const u8 *p, unsigned int len)
crc = (crc >> 1) ^ ((crc & 1) ? 0xedb88320 : 0);
}
return crc;
}
}
VI - Sample Code for Creating Firmware Files
============================================

View File

@ -5,12 +5,12 @@ Power Architecture 64-bit Linux system call ABI
syscall
=======
syscall calling sequence[*] matches the Power Architecture 64-bit ELF ABI
syscall calling sequence\ [1]_ matches the Power Architecture 64-bit ELF ABI
specification C function calling sequence, including register preservation
rules, with the following differences.
[*] Some syscalls (typically low-level management functions) may have
different calling sequences (e.g., rt_sigreturn).
.. [1] Some syscalls (typically low-level management functions) may have
different calling sequences (e.g., rt_sigreturn).
Parameters and return value
---------------------------
@ -33,12 +33,14 @@ Register preservation rules
Register preservation rules match the ELF ABI calling sequence with the
following differences:
r0: Volatile. (System call number.)
r3: Volatile. (Parameter 1, and return value.)
r4-r8: Volatile. (Parameters 2-6.)
cr0: Volatile (cr0.SO is the return error condition)
cr1, cr5-7: Nonvolatile.
lr: Nonvolatile.
=========== ============= ========================================
r0 Volatile (System call number.)
r3 Volatile (Parameter 1, and return value.)
r4-r8 Volatile (Parameters 2-6.)
cr0 Volatile (cr0.SO is the return error condition)
cr1, cr5-7 Nonvolatile
lr Nonvolatile
=========== ============= ========================================
All floating point and vector data registers as well as control and status
registers are nonvolatile.
@ -90,9 +92,12 @@ The vsyscall may or may not use the caller's stack frame save areas.
Register preservation rules
---------------------------
r0: Volatile.
cr1, cr5-7: Volatile.
lr: Volatile.
=========== ========
r0 Volatile
cr1, cr5-7 Volatile
lr Volatile
=========== ========
Invocation
----------

View File

@ -1,3 +1,4 @@
============================
Transactional Memory support
============================
@ -17,29 +18,29 @@ instructions are presented to delimit transactions; transactions are
guaranteed to either complete atomically or roll back and undo any partial
changes.
A simple transaction looks like this:
A simple transaction looks like this::
begin_move_money:
tbegin
beq abort_handler
begin_move_money:
tbegin
beq abort_handler
ld r4, SAVINGS_ACCT(r3)
ld r5, CURRENT_ACCT(r3)
subi r5, r5, 1
addi r4, r4, 1
std r4, SAVINGS_ACCT(r3)
std r5, CURRENT_ACCT(r3)
ld r4, SAVINGS_ACCT(r3)
ld r5, CURRENT_ACCT(r3)
subi r5, r5, 1
addi r4, r4, 1
std r4, SAVINGS_ACCT(r3)
std r5, CURRENT_ACCT(r3)
tend
tend
b continue
b continue
abort_handler:
... test for odd failures ...
abort_handler:
... test for odd failures ...
/* Retry the transaction if it failed because it conflicted with
* someone else: */
b begin_move_money
/* Retry the transaction if it failed because it conflicted with
* someone else: */
b begin_move_money
The 'tbegin' instruction denotes the start point, and 'tend' the end point.
@ -123,7 +124,7 @@ Transaction-aware signal handlers can read the transactional register state
from the second ucontext. This will be necessary for crash handlers to
determine, for example, the address of the instruction causing the SIGSEGV.
Example signal handler:
Example signal handler::
void crash_handler(int sig, siginfo_t *si, void *uc)
{
@ -133,9 +134,9 @@ Example signal handler:
if (ucp_link) {
u64 msr = ucp->uc_mcontext.regs->msr;
/* May have transactional ucontext! */
#ifndef __powerpc64__
#ifndef __powerpc64__
msr |= ((u64)transactional_ucp->uc_mcontext.regs->msr) << 32;
#endif
#endif
if (MSR_TM_ACTIVE(msr)) {
/* Yes, we crashed during a transaction. Oops. */
fprintf(stderr, "Transaction to be restarted at 0x%llx, but "
@ -176,6 +177,7 @@ Failure cause codes used by kernel
These are defined in <asm/reg.h>, and distinguish different reasons why the
kernel aborted a transaction:
====================== ================================
TM_CAUSE_RESCHED Thread was rescheduled.
TM_CAUSE_TLBI Software TLB invalid.
TM_CAUSE_FAC_UNAV FP/VEC/VSX unavailable trap.
@ -184,6 +186,7 @@ kernel aborted a transaction:
TM_CAUSE_MISC Currently unused.
TM_CAUSE_ALIGNMENT Alignment fault.
TM_CAUSE_EMULATE Emulation that touched memory.
====================== ================================
These can be checked by the user program's abort handler as TEXASR[0:7]. If
bit 7 is set, it indicates that the error is consider persistent. For example
@ -203,7 +206,7 @@ POWER9
======
TM on POWER9 has issues with storing the complete register state. This
is described in this commit:
is described in this commit::
commit 4bb3c7a0208fc13ca70598efd109901a7cd45ae7
Author: Paul Mackerras <paulus@ozlabs.org>

View File

@ -1,10 +0,0 @@
# -*- coding: utf-8; mode: python -*-
project = 'Linux Kernel Development Documentation'
tags.add("subproject")
latex_documents = [
('index', 'process.tex', 'Linux Kernel Development Documentation',
'The kernel development community', 'manual'),
]

View File

@ -119,3 +119,17 @@ array may exceed the remaining memory in the stack segment. This could
lead to a crash, possible overwriting sensitive contents at the end of the
stack (when built without `CONFIG_THREAD_INFO_IN_TASK=y`), or overwriting
memory adjacent to the stack (when built without `CONFIG_VMAP_STACK=y`)
Implicit switch case fall-through
---------------------------------
The C language allows switch cases to "fall through" when
a "break" statement is missing at the end of a case. This,
however, introduces ambiguity in the code, as it's not always
clear if the missing break is intentional or a bug. As there
have been a long list of flaws `due to missing "break" statements
<https://cwe.mitre.org/data/definitions/484.html>`_, we no longer allow
"implicit fall-through". In order to identify an intentional fall-through
case, we have adopted the marking used by static analyzers: a comment
saying `/* Fall through */`. Once the C++17 `__attribute__((fallthrough))`
is more widely handled by C compilers, static analyzers, and IDEs, we can
switch to using that instead.

View File

@ -0,0 +1,279 @@
Embargoed hardware issues
=========================
Scope
-----
Hardware issues which result in security problems are a different category
of security bugs than pure software bugs which only affect the Linux
kernel.
Hardware issues like Meltdown, Spectre, L1TF etc. must be treated
differently because they usually affect all Operating Systems ("OS") and
therefore need coordination across different OS vendors, distributions,
hardware vendors and other parties. For some of the issues, software
mitigations can depend on microcode or firmware updates, which need further
coordination.
.. _Contact:
Contact
-------
The Linux kernel hardware security team is separate from the regular Linux
kernel security team.
The team only handles the coordination of embargoed hardware security
issues. Reports of pure software security bugs in the Linux kernel are not
handled by this team and the reporter will be guided to contact the regular
Linux kernel security team (:ref:`Documentation/admin-guide/
<securitybugs>`) instead.
The team can be contacted by email at <hardware-security@kernel.org>. This
is a private list of security officers who will help you to coordinate an
issue according to our documented process.
The list is encrypted and email to the list can be sent by either PGP or
S/MIME encrypted and must be signed with the reporter's PGP key or S/MIME
certificate. The list's PGP key and S/MIME certificate are available from
https://www.kernel.org/....
While hardware security issues are often handled by the affected hardware
vendor, we welcome contact from researchers or individuals who have
identified a potential hardware flaw.
Hardware security officers
^^^^^^^^^^^^^^^^^^^^^^^^^^
The current team of hardware security officers:
- Linus Torvalds (Linux Foundation Fellow)
- Greg Kroah-Hartman (Linux Foundation Fellow)
- Thomas Gleixner (Linux Foundation Fellow)
Operation of mailing-lists
^^^^^^^^^^^^^^^^^^^^^^^^^^
The encrypted mailing-lists which are used in our process are hosted on
Linux Foundation's IT infrastructure. By providing this service Linux
Foundation's director of IT Infrastructure security technically has the
ability to access the embargoed information, but is obliged to
confidentiality by his employment contract. Linux Foundation's director of
IT Infrastructure security is also responsible for the kernel.org
infrastructure.
The Linux Foundation's current director of IT Infrastructure security is
Konstantin Ryabitsev.
Non-disclosure agreements
-------------------------
The Linux kernel hardware security team is not a formal body and therefore
unable to enter into any non-disclosure agreements. The kernel community
is aware of the sensitive nature of such issues and offers a Memorandum of
Understanding instead.
Memorandum of Understanding
---------------------------
The Linux kernel community has a deep understanding of the requirement to
keep hardware security issues under embargo for coordination between
different OS vendors, distributors, hardware vendors and other parties.
The Linux kernel community has successfully handled hardware security
issues in the past and has the necessary mechanisms in place to allow
community compliant development under embargo restrictions.
The Linux kernel community has a dedicated hardware security team for
initial contact, which oversees the process of handling such issues under
embargo rules.
The hardware security team identifies the developers (domain experts) who
will form the initial response team for a particular issue. The initial
response team can bring in further developers (domain experts) to address
the issue in the best technical way.
All involved developers pledge to adhere to the embargo rules and to keep
the received information confidential. Violation of the pledge will lead to
immediate exclusion from the current issue and removal from all related
mailing-lists. In addition, the hardware security team will also exclude
the offender from future issues. The impact of this consequence is a highly
effective deterrent in our community. In case a violation happens the
hardware security team will inform the involved parties immediately. If you
or anyone becomes aware of a potential violation, please report it
immediately to the Hardware security officers.
Process
^^^^^^^
Due to the globally distributed nature of Linux kernel development,
face-to-face meetings are almost impossible to address hardware security
issues. Phone conferences are hard to coordinate due to time zones and
other factors and should be only used when absolutely necessary. Encrypted
email has been proven to be the most effective and secure communication
method for these types of issues.
Start of Disclosure
"""""""""""""""""""
Disclosure starts by contacting the Linux kernel hardware security team by
email. This initial contact should contain a description of the problem and
a list of any known affected hardware. If your organization builds or
distributes the affected hardware, we encourage you to also consider what
other hardware could be affected.
The hardware security team will provide an incident-specific encrypted
mailing-list which will be used for initial discussion with the reporter,
further disclosure and coordination.
The hardware security team will provide the disclosing party a list of
developers (domain experts) who should be informed initially about the
issue after confirming with the developers that they will adhere to this
Memorandum of Understanding and the documented process. These developers
form the initial response team and will be responsible for handling the
issue after initial contact. The hardware security team is supporting the
response team, but is not necessarily involved in the mitigation
development process.
While individual developers might be covered by a non-disclosure agreement
via their employer, they cannot enter individual non-disclosure agreements
in their role as Linux kernel developers. They will, however, agree to
adhere to this documented process and the Memorandum of Understanding.
Disclosure
""""""""""
The disclosing party provides detailed information to the initial response
team via the specific encrypted mailing-list.
From our experience the technical documentation of these issues is usually
a sufficient starting point and further technical clarification is best
done via email.
Mitigation development
""""""""""""""""""""""
The initial response team sets up an encrypted mailing-list or repurposes
an existing one if appropriate. The disclosing party should provide a list
of contacts for all other parties who have already been, or should be
informed about the issue. The response team contacts these parties so they
can name experts who should be subscribed to the mailing-list.
Using a mailing-list is close to the normal Linux development process and
has been successfully used in developing mitigations for various hardware
security issues in the past.
The mailing-list operates in the same way as normal Linux development.
Patches are posted, discussed and reviewed and if agreed on applied to a
non-public git repository which is only accessible to the participating
developers via a secure connection. The repository contains the main
development branch against the mainline kernel and backport branches for
stable kernel versions as necessary.
The initial response team will identify further experts from the Linux
kernel developer community as needed and inform the disclosing party about
their participation. Bringing in experts can happen at any time of the
development process and often needs to be handled in a timely manner.
Coordinated release
"""""""""""""""""""
The involved parties will negotiate the date and time where the embargo
ends. At that point the prepared mitigations are integrated into the
relevant kernel trees and published.
While we understand that hardware security issues need coordinated embargo
time, the embargo time should be constrained to the minimum time which is
required for all involved parties to develop, test and prepare the
mitigations. Extending embargo time artificially to meet conference talk
dates or other non-technical reasons is creating more work and burden for
the involved developers and response teams as the patches need to be kept
up to date in order to follow the ongoing upstream kernel development,
which might create conflicting changes.
CVE assignment
""""""""""""""
Neither the hardware security team nor the initial response team assign
CVEs, nor are CVEs required for the development process. If CVEs are
provided by the disclosing party they can be used for documentation
purposes.
Process ambassadors
-------------------
For assistance with this process we have established ambassadors in various
organizations, who can answer questions about or provide guidance on the
reporting process and further handling. Ambassadors are not involved in the
disclosure of a particular issue, unless requested by a response team or by
an involved disclosed party. The current ambassadors list:
============= ========================================================
ARM
AMD
IBM
Intel
Qualcomm
Microsoft
VMware
XEN
Canonical Tyler Hicks <tyhicks@canonical.com>
Debian Ben Hutchings <ben@decadent.org.uk>
Oracle Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Red Hat Josh Poimboeuf <jpoimboe@redhat.com>
SUSE Jiri Kosina <jkosina@suse.cz>
Amazon
Google
============== ========================================================
If you want your organization to be added to the ambassadors list, please
contact the hardware security team. The nominated ambassador has to
understand and support our process fully and is ideally well connected in
the Linux kernel community.
Encrypted mailing-lists
-----------------------
We use encrypted mailing-lists for communication. The operating principle
of these lists is that email sent to the list is encrypted either with the
list's PGP key or with the list's S/MIME certificate. The mailing-list
software decrypts the email and re-encrypts it individually for each
subscriber with the subscriber's PGP key or S/MIME certificate. Details
about the mailing-list software and the setup which is used to ensure the
security of the lists and protection of the data can be found here:
https://www.kernel.org/....
List keys
^^^^^^^^^
For initial contact see :ref:`Contact`. For incident specific mailing-lists
the key and S/MIME certificate are conveyed to the subscribers by email
sent from the specific list.
Subscription to incident specific lists
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Subscription is handled by the response teams. Disclosed parties who want
to participate in the communication send a list of potential subscribers to
the response team so the response team can validate subscription requests.
Each subscriber needs to send a subscription request to the response team
by email. The email must be signed with the subscriber's PGP key or S/MIME
certificate. If a PGP key is used, it must be available from a public key
server and is ideally connected to the Linux kernel's PGP web of trust. See
also: https://www.kernel.org/signature.html.
The response team verifies that the subscriber request is valid and adds
the subscriber to the list. After subscription the subscriber will receive
email from the mailing-list which is signed either with the list's PGP key
or the list's S/MIME certificate. The subscriber's email client can extract
the PGP key or the S/MIME certificate from the signature so the subscriber
can send encrypted email to the list.

View File

@ -45,6 +45,7 @@ Other guides to the community that are of interest to most developers are:
submit-checklist
kernel-docs
deprecated
embargoed-hardware-issues
These are some overall technical guides that have been put here for now for
lack of a better place.

View File

@ -180,6 +180,13 @@ The process of how these work together.
add it to an iommu_group and a vfio_group. Then we could pass through
the mdev to a guest.
VFIO-CCW Regions
----------------
The vfio-ccw driver exposes MMIO regions to accept requests from and return
results to userspace.
vfio-ccw I/O region
-------------------
@ -205,6 +212,25 @@ irb_area stores the I/O result.
ret_code stores a return code for each access of the region.
This region is always available.
vfio-ccw cmd region
-------------------
The vfio-ccw cmd region is used to accept asynchronous instructions
from userspace::
#define VFIO_CCW_ASYNC_CMD_HSCH (1 << 0)
#define VFIO_CCW_ASYNC_CMD_CSCH (1 << 1)
struct ccw_cmd_region {
__u32 command;
__u32 ret_code;
} __packed;
This region is exposed via region type VFIO_REGION_SUBTYPE_CCW_ASYNC_CMD.
Currently, CLEAR SUBCHANNEL and HALT SUBCHANNEL use this region.
vfio-ccw operation details
--------------------------
@ -306,9 +332,8 @@ Together with the corresponding work in QEMU, we can bring the passed
through DASD/ECKD device online in a guest now and use it as a block
device.
While the current code allows the guest to start channel programs via
START SUBCHANNEL, support for HALT SUBCHANNEL or CLEAR SUBCHANNEL is
not yet implemented.
The current code allows the guest to start channel programs via
START SUBCHANNEL, and to issue HALT SUBCHANNEL and CLEAR SUBCHANNEL.
vfio-ccw supports classic (command mode) channel I/O only. Transport
mode (HPF) is not supported.

View File

@ -1,10 +0,0 @@
# -*- coding: utf-8; mode: python -*-
project = "SuperH architecture implementation manual"
tags.add("subproject")
latex_documents = [
('index', 'sh.tex', project,
'The kernel development community', 'manual'),
]

View File

@ -1,10 +0,0 @@
# -*- coding: utf-8; mode: python -*-
project = "Linux Sound Subsystem Documentation"
tags.add("subproject")
latex_documents = [
('index', 'sound.tex', project,
'The kernel development community', 'manual'),
]

View File

@ -21,6 +21,29 @@ def loadConfig(namespace):
and os.path.normpath(namespace["__file__"]) != os.path.normpath(config_file) ):
config_file = os.path.abspath(config_file)
# Let's avoid one conf.py file just due to latex_documents
start = config_file.find('Documentation/')
if start >= 0:
start = config_file.find('/', start + 1)
end = config_file.rfind('/')
if start >= 0 and end > 0:
dir = config_file[start + 1:end]
print("source directory: %s" % dir)
new_latex_docs = []
latex_documents = namespace['latex_documents']
for l in latex_documents:
if l[0].find(dir + '/') == 0:
has = True
fn = l[0][len(dir) + 1:]
new_latex_docs.append((fn, l[1], l[2], l[3], l[4]))
break
namespace['latex_documents'] = new_latex_docs
# If there is an extra conf.py file, load it
if os.path.isfile(config_file):
sys.stdout.write("load additional sphinx-config: %s\n" % config_file)
config = namespace.copy()
@ -29,4 +52,6 @@ def loadConfig(namespace):
del config['__file__']
namespace.update(config)
else:
sys.stderr.write("WARNING: additional sphinx-config not found: %s\n" % config_file)
config = namespace.copy()
config['tags'].add("subproject")
namespace.update(config)

Some files were not shown because too many files have changed in this diff Show More