linux/Documentation
Jonathan Cameron c5e22feffd topology: Represent clusters of CPUs within a die
Both ACPI and DT provide the ability to describe additional layers of
topology between that of individual cores and higher level constructs
such as the level at which the last level cache is shared.
In ACPI this can be represented in PPTT as a Processor Hierarchy
Node Structure [1] that is the parent of the CPU cores and in turn
has a parent Processor Hierarchy Node Structure representing
a higher level of topology.

For example, Kunpeng 920 has 6 or 8 clusters in each NUMA node, and each
cluster has 4 CPUs. All clusters share the L3 cache data, but each cluster
has its own local L3 tag. In addition, the CPUs within each cluster share
some internal system bus.

+-----------------------------------+                          +---------+
|  +------+    +------+             +--------------------------+         |
|  | CPU0 |    | CPU1 |             |    +-----------+         |         |
|  +------+    +------+             |    |           |         |         |
|                                   +----+    L3     |         |         |
|  +------+    +------+   cluster   |    |    tag    |         |         |
|  | CPU2 |    | CPU3 |             |    |           |         |         |
|  +------+    +------+             |    +-----------+         |         |
|                                   |                          |         |
+-----------------------------------+                          |         |
+-----------------------------------+                          |         |
|  +------+    +------+             +--------------------------+         |
|  |      |    |      |             |    +-----------+         |         |
|  +------+    +------+             |    |           |         |         |
|                                   |    |    L3     |         |         |
|  +------+    +------+             +----+    tag    |         |         |
|  |      |    |      |             |    |           |         |         |
|  +------+    +------+             |    +-----------+         |         |
|                                   |                          |         |
+-----------------------------------+                          |   L3    |
                                                               |   data  |
+-----------------------------------+                          |         |
|  +------+    +------+             |    +-----------+         |         |
|  |      |    |      |             |    |           |         |         |
|  +------+    +------+             +----+    L3     |         |         |
|                                   |    |    tag    |         |         |
|  +------+    +------+             |    |           |         |         |
|  |      |    |      |             |    +-----------+         |         |
|  +------+    +------+             +--------------------------+         |
+-----------------------------------+                          |         |
+-----------------------------------+                          |         |
|  +------+    +------+             +--------------------------+         |
|  |      |    |      |             |    +-----------+         |         |
|  +------+    +------+             |    |           |         |         |
|                                   +----+    L3     |         |         |
|  +------+    +------+             |    |    tag    |         |         |
|  |      |    |      |             |    |           |         |         |
|  +------+    +------+             |    +-----------+         |         |
|                                   |                          |         |
+-----------------------------------+                          |         |
+-----------------------------------+                          |         |
|  +------+    +------+             +--------------------------+         |
|  |      |    |      |             |   +-----------+          |         |
|  +------+    +------+             |   |           |          |         |
|                                   |   |    L3     |          |         |
|  +------+    +------+             +---+    tag    |          |         |
|  |      |    |      |             |   |           |          |         |
|  +------+    +------+             |   +-----------+          |         |
|                                   |                          |         |
+-----------------------------------+                          |         |
+-----------------------------------+                          |         |
|  +------+    +------+             +--------------------------+         |
|  |      |    |      |             |  +-----------+           |         |
|  +------+    +------+             |  |           |           |         |
|                                   |  |    L3     |           |         |
|  +------+    +------+             +--+    tag    |           |         |
|  |      |    |      |             |  |           |           |         |
|  +------+    +------+             |  +-----------+           |         |
|                                   |                          +---------+
+-----------------------------------+

That means spreading tasks among clusters will bring more bandwidth
while packing tasks within one cluster will lead to smaller cache
synchronization latency. So both kernel and userspace will have
a chance to leverage this topology to deploy tasks accordingly to
achieve either smaller cache latency within one cluster or an even
distribution of load among clusters for higher throughput.
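
The trade-off above can be illustrated with a small userspace sketch. It
is not part of this patch: the function name place_tasks, the policy
names "pack" and "spread", and the example CPU numbering are all
hypothetical, chosen only to show the two placement strategies.

```python
def place_tasks(clusters, n_tasks, policy):
    """Pick one CPU per task (hypothetical helper, illustration only).

    clusters: list of CPU-id lists, one list per cluster.
    policy:   "pack" fills one cluster before moving to the next,
              favouring low cache synchronization latency;
              "spread" round-robins across clusters,
              favouring aggregate bandwidth.
    """
    if policy == "pack":
        order = [cpu for cluster in clusters for cpu in cluster]
    elif policy == "spread":
        depth = max((len(c) for c in clusters), default=0)
        order = [c[i] for i in range(depth) for c in clusters if i < len(c)]
    else:
        raise ValueError("unknown policy: %r" % (policy,))
    return order[:n_tasks]


# Two clusters of 4 CPUs each (assumed numbering).
example = [[0, 1, 2, 3], [4, 5, 6, 7]]
packed = place_tasks(example, 4, "pack")
spread = place_tasks(example, 4, "spread")
```

On Linux, the chosen CPUs could then be applied to a task with
os.sched_setaffinity(); whether packing or spreading is the better
choice depends on the workload.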

This patch exposes cluster topology to both kernel and userspace.
Libraries like hwloc can discover clusters through cluster_cpus and
related sysfs attributes. A PoC of hwloc support is at [2].
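
As a rough illustration of how a userspace consumer might read these
attributes, the sketch below parses the kernel CPU list format (for
example "0-3") and groups CPUs by cluster. It assumes the attribute is
available as /sys/devices/system/cpu/cpuN/topology/cluster_cpus_list;
that path and the helper names parse_cpu_list and read_cluster_map are
assumptions for illustration, not taken from the text above.

```python
import glob
import os


def parse_cpu_list(text):
    """Parse a CPU list string such as "0-3,8-11" into a sorted list."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def read_cluster_map(sysfs_root="/sys/devices/system/cpu"):
    """Return {cpu: [CPUs in the same cluster]}.

    The result is empty if the (assumed) attribute does not exist,
    for example on a kernel without cluster topology support.
    """
    clusters = {}
    pattern = os.path.join(sysfs_root, "cpu[0-9]*", "topology",
                           "cluster_cpus_list")
    for path in glob.glob(pattern):
        cpu_dir = path.split(os.sep)[-3]      # e.g. "cpu12"
        with open(path) as f:
            clusters[int(cpu_dir[3:])] = parse_cpu_list(f.read())
    return clusters
```

Calling read_cluster_map() on a system that exposes the attribute would
return one entry per CPU; on other systems it simply returns an empty
dictionary.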

Note that this patch only handles the ACPI case.

Special consideration is needed for SMT processors, where it is
necessary to move 2 levels up the hierarchy from the leaf nodes
(thus skipping the processor core level).
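
The SMT rule above can be sketched with a toy hierarchy. This is only a
model of the idea and not the kernel's PPTT parsing code: the Node class,
the find_cluster_node helper and the is_smt_thread flag are hypothetical
names introduced for illustration.

```python
class Node:
    """A toy processor hierarchy node with a link to its parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent


def find_cluster_node(leaf, is_smt_thread):
    """Walk up from a leaf node to the candidate cluster node.

    Non-SMT: the leaf is a core, so the cluster is 1 level up.
    SMT:     the leaf is a thread, so the core level is skipped
             and the cluster is 2 levels up.
    Returns None if the hierarchy is too shallow.
    """
    levels = 2 if is_smt_thread else 1
    node = leaf
    for _ in range(levels):
        if node is None:
            return None
        node = node.parent
    return node


# package -> cluster -> core -> thread
package = Node("package")
cluster = Node("cluster", package)
core = Node("core", cluster)
thread = Node("thread", core)
```

In this model, find_cluster_node(core, False) and
find_cluster_node(thread, True) both resolve to the same cluster node.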

Note that arm64 / ACPI does not provide any means of identifying
a die level in the topology, but that may be unrelated to the cluster
level.

[1] ACPI Specification 6.3 - section 5.2.29.1 Processor Hierarchy Node
    Structure (Type 0)
[2] https://github.com/hisilicon/hwloc/tree/linux-cluster

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Tian Tao <tiantao6@hisilicon.com>
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210924085104.44806-2-21cnbao@gmail.com
2021-10-15 11:25:15 +02:00
ABI topology: Represent clusters of CPUs within a die 2021-10-15 11:25:15 +02:00
accounting
admin-guide topology: Represent clusters of CPUs within a die 2021-10-15 11:25:15 +02:00
arm Documentation: arm: marvell: Add 88F6825 model into list 2021-08-24 13:26:32 -06:00
arm64 Merge remote-tracking branch 'tip/sched/arm64' into for-next/core 2021-08-31 09:10:00 +01:00
block Documentation: block: blk-mq: Fix small typo in multi-queue docs 2021-08-24 13:30:00 -06:00
bpf libbpf: Rename libbpf documentation index file 2021-08-18 08:45:25 -07:00
cdrom
core-api irqchip fixes for 5.15, take #1 2021-09-24 14:11:04 +02:00
cpu-freq cpufreq: Remove ready() callback 2021-09-02 18:04:17 +02:00
crypto
dev-tools Merge branch 'akpm' (patches from Andrew) 2021-09-08 12:55:35 -07:00
devicetree interconnect fixes for v5.15 2021-09-27 17:22:16 +02:00
doc-guide
driver-api cxl for v5.15 2021-09-09 11:48:27 -07:00
fault-injection Char / Misc driver changes for 5.15-rc1 2021-09-01 08:35:06 -07:00
fb
features RISC-V Patches for the 5.15 Merge Window, Part 2 2021-09-11 14:29:42 -07:00
filesystems block-5.15-2021-09-11 2021-09-11 10:19:51 -07:00
firmware_class
firmware-guide
fpga
gpu drm fixes for 5.15-rc1 2021-09-10 11:22:23 -07:00
hid
hwmon hwmon: (k10temp) Remove residues of current and voltage 2021-09-12 17:56:36 -07:00
i2c Documentation: i2c: add i2c-sysfs into index 2021-08-10 22:58:32 +02:00
ia64
ide
iio
infiniband
input
isdn
kbuild Merge branch 'akpm' (patches from Andrew) 2021-09-08 12:55:35 -07:00
kernel-hacking docs: kernel-hacking: Remove inappropriate text 2021-09-03 15:56:45 -06:00
leds Documentation: leds: standartizing LED names 2021-08-20 10:26:24 +02:00
litmus-tests
livepatch
locking Documentation: locking: fix references 2021-08-24 13:20:39 -06:00
m68k
maintainer
mhi
mips
misc-devices
netlabel
networking Doc: networking: Fox a typo in ice.rst 2021-09-21 11:01:25 +01:00
nios2
nvdimm
openrisc
parisc
PCI pci-v5.15-changes 2021-09-07 19:13:42 -07:00
pcmcia
power Documentation: power: include kernel-doc in Energy Model doc 2021-09-07 21:17:28 +02:00
powerpc powerpc/doc: Fix htmldocs errors 2021-08-27 00:56:34 +10:00
process Merge branch 'gcc-min-version-5.1' (make gcc-5.1 the minimum version) 2021-09-13 10:43:04 -07:00
RCU
riscv
s390
scheduler sched/fair: Add document for burstable CFS bandwidth 2021-10-05 15:51:41 +02:00
scsi
security
sh
sound Yet another set of documentation changes: 2021-09-01 18:49:47 -07:00
sparc
sphinx docs: sphinx-requirements: Move sphinx_rtd_theme to top 2021-08-12 09:15:38 -06:00
sphinx-static
spi
staging
target
timers
trace Tracing updates for 5.15: 2021-09-05 11:50:41 -07:00
translations Merge branch 'gcc-min-version-5.1' (make gcc-5.1 the minimum version) 2021-09-13 10:43:04 -07:00
usb docs: usb: fix malformed table 2021-08-05 12:31:51 +02:00
userspace-api virtio,vdpa,vhost: features, fixes 2021-09-11 14:48:42 -07:00
virt ARM: 2021-09-07 13:40:51 -07:00
vm Merge branch 'akpm' (patches from Andrew) 2021-09-08 12:55:35 -07:00
w1
watchdog
x86 Another collection of documentation patches, mostly fixes but also includes 2021-09-08 16:28:14 -07:00
xtensa
.gitignore
arch.rst
asm-annotations.rst
atomic_bitops.txt
atomic_t.txt
Changes
CodingStyle
conf.py docs: pdfdocs: Fix typo in CJK-language specific font settings 2021-09-06 16:53:39 -06:00
COPYING-logo
docutils.conf
dontdiff
index.rst
Kconfig
logo.gif
Makefile
memory-barriers.txt
SubmittingPatches
watch_queue.rst