IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
- By popular demand, print the previous microcode revision an update
was done over
- Remove more code related to the now gone MICROCODE_OLD_INTERFACE
- Document the problems stemming from microcode late loading
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmM8QL8ACgkQEsHwGGHe
VUoDxw/9FA3rOAZD7N0PI/vspMUxEDQVYV60tfuuynao72HZv+tfJbRTXe42p3ZO
B+kRPFud4lAOE1ykDHJ2A2OZzvthGfYlUnMyvk1IvK/gOwkkiSH4c6sVSrOYWtl7
uoIN/3J83BMZoWNOKqrg1OOzotzkTyeucPXdWF+sRkfVzBIgbDqtplbFFCP4abPK
WxatY2hkTfBCiN92OSOLaMGg0POpmycy+6roR2Qr5rWrC7nfREVNbKdOyEykZsfV
U2gPm0A953sZ3Ye6waFib+qjJdyR7zBQRCJVEGOB6g8BlNwqGv/TY7NIUWSVFT9Y
qcAnD3hI0g0UTYdToBUvYEpfD8zC9Wg3tZEpZSBRKh3AR2+Xt44VKQFO4L9uIt6g
hWFMBLsFiYnBmKW3arNLQcdamE34GRhwUfXm0OjHTvTWb3aFO1I9+NBCaHp19KVy
HD13wGSyj5V9SAVD0ztRFut4ZESejDyYBw9joB2IsjkY2IJmAAsRFgV0KXqUvQLX
TX13hnhm894UfQ+4KCXnA0UeEDoXhwAbYFxR89yGeOxoGe1oaPXr9C1/r88YLq0n
ekjIVZ3G97PIxmayj3cv9YrRIrrJi4PWF1Raey6go3Ma+rNBRnya5UF6Noch1lHh
HeF7t84BZ5Ub6GweWYaMHQZCA+wMCZMYYuCMNzN7b54yRtQuvCc=
=lWDD
-----END PGP SIGNATURE-----
Merge tag 'x86_microcode_for_v6.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x75 microcode loader updates from Borislav Petkov:
- Get rid of a single ksize() usage
- By popular demand, print the previous microcode revision an update
was done over
- Remove more code related to the now gone MICROCODE_OLD_INTERFACE
- Document the problems stemming from microcode late loading
* tag 'x86_microcode_for_v6.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/microcode/AMD: Track patch allocation size explicitly
x86/microcode: Print previous version of microcode after reload
x86/microcode: Remove ->request_microcode_user()
x86/microcode: Document the whole late loading problem
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmM7/C0ACgkQEsHwGGHe
VUp3jA//caeFnxfKU8F6B+D+sqXub+fRC88U/5j5TVgueiFdpgbHxbeb5isV2DW8
/1as5LN6PCy0V9ax873PblaRdL+ir2KVjzTOQhv1fsBWtRJgRAte6ltHfncST3jM
Qm9ELCSYlwX6wgPDFWrLd5/jCHLq/7ffvWATwt4ZS9/+ebwJJ3JRU82aydvSAe+M
tYDLA0RYjrTKB7bfhlxft8n3/ttOwmaQ9sPw6J9KWL9VZ8d7uA0cZgHMFgclr9xb
+cTuOQwJ+a+njS3tXYWH0lU4lLw6tRcWrl3/gHpiuiwU85xdoClyCcsmvZYMLy1s
cv4CmAQqs0dSlcII/F+6pTh+E8gcpzm2+uiP8k56dCK3c9OhtdKcPRcxm169JID+
hKZ9rcscx1e/Q3rzhRtC4w2LguDcARe2b0OSMsGH40uZ/UOHdnYVk8oD81O9LWrB
T4DgjJCcSf4y0KGDXtjJ39POfe0zFHgx8nr5JOL9Las6gZtXZHRmCzBacIU6JeJo
5VEl6Yx1ca7aiuNTTcsmmW3zbsR9J6VppFZ6ma9jTTM8HbINM6g4UDepkeubNUdv
tAJVbb6MEDGtT1ybH4oalrYkPfhYDwRkahqPtMY0ROtlzxO65q2DI3rH2gtyboas
oP4hfEDtzKiJj4FiIRMKbryKz7MBWImHMRovRGKjB/Y48mTxX3M=
=e8H8
-----END PGP SIGNATURE-----
Merge tag 'x86_apic_for_v6.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 APIC update from Borislav Petkov:
- Add support for locking the APIC in X2APIC mode to prevent SGX
enclave leaks
* tag 'x86_apic_for_v6.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/apic: Don't disable x2APIC if locked
Since 2d1c498072de ("mm: memcontrol: make swap tracking an integral part
of memory control"), CONFIG_MEMCG_SWAP hasn't been a user-visible config
option anymore, it just means CONFIG_MEMCG && CONFIG_SWAP.
Update the sites accordingly and drop the symbol.
[ While touching the docs, remove two references to CONFIG_MEMCG_KMEM,
which hasn't been a user-visible symbol for over half a decade. ]
Link: https://lkml.kernel.org/r/20220926135704.400818-5-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Shakeel Butt <shakeelb@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The swapaccounting= commandline option already does very little today. To
close a trivial containment failure case, the swap ownership tracking part
of the swap controller has recently become mandatory (see commit
2d1c498072de ("mm: memcontrol: make swap tracking an integral part of
memory control") for details), which makes up the majority of the work
during swapout, swapin, and the swap slot map.
The only thing left under this flag is the page_counter operations and the
visibility of the swap control files in the first place, which are rather
meager savings. There also aren't many scenarios, if any, where
controlling the memory of a cgroup while allowing it unlimited access to a
global swap space is a workable resource isolation strategy.
On the other hand, there have been several bugs and confusion around the
many possible swap controller states (cgroup1 vs cgroup2 behavior, memory
accounting without swap accounting, memcg runtime disabled).
This puts the maintenance overhead of retaining the toggle above its
practical benefits. Deprecate it.
Link: https://lkml.kernel.org/r/20220926135704.400818-3-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Suggested-by: Shakeel Butt <shakeelb@google.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The main benefit of THPs are that they can be mapped at the pmd level,
increasing the likelihood of TLB hit and spending less cycles in page
table walks. pte-mapped hugepages - that is - hugepage-aligned compound
pages of order HPAGE_PMD_ORDER mapped by ptes - although being contiguous
in physical memory, don't have this advantage. In fact, one could argue
they are detrimental to system performance overall since they occupy a
precious hugepage-aligned/sized region of physical memory that could
otherwise be used more effectively. Additionally, pte-mapped hugepages
can be the cheapest memory to collapse for khugepaged since no new
hugepage allocation or copying of memory contents is necessary - we only
need to update the mapping page tables.
In the anonymous collapse path, we are able to collapse pte-mapped
hugepages (albeit, perhaps suboptimally), but the file/shmem path makes no
effort when compound pages (of any order) are encountered.
Identify pte-mapped hugepages in the file/shmem collapse path. The
final step of which makes a racy check of the value of the pmd to
ensure it maps a pte table. This should be fine, since races that
result in false-positive (i.e. attempt collapse even though we
shouldn't) will fail later in collapse_pte_mapped_thp() once we
actually lock mmap_lock and reinspect the pmd value. Races that result
in false-negatives (i.e. where we decide to not attempt collapse, but
should have) shouldn't be an issue, since in the worst case, we do
nothing - which is what we've done up to this point. We make a similar
check in retract_page_tables(). If we do think we've found a
pte-mapped hugepgae in khugepaged context, attempt to update page
tables mapping this hugepage.
Note that these collapses still count towards the
/sys/kernel/mm/transparent_hugepage/khugepaged/pages_collapsed counter,
and if the pte-mapped hugepage was also mapped into multiple process'
address spaces, could be incremented for each page table update. Since we
increment the counter when a pte-mapped hugepage is successfully added to
the list of to-collapse pte-mapped THPs, it's possible that we never
actually update the page table either. This is different from how
file/shmem pages_collapsed accounting works today where only a successful
page cache update is counted (it's also possible here that no page tables
are actually changed). Though it incurs some slop, this is preferred to
either not accounting for the event at all, or plumbing through data in
struct mm_slot on whether to account for the collapse or not.
Also note that work still needs to be done to support arbitrary compound
pages, and that this should all be converted to using folios.
[shy828301@gmail.com: Spelling mistake, update comment, and add Documentation]
Link: https://lore.kernel.org/linux-mm/CAHbLzkpHwZxFzjfX9nxVoRhzup8WMjMfyL6Xiq8mZ9M-N3ombw@mail.gmail.com/
Link: https://lkml.kernel.org/r/20220907144521.3115321-3-zokeefe@google.com
Link: https://lkml.kernel.org/r/20220922224046.1143204-3-zokeefe@google.com
Signed-off-by: Zach O'Keefe <zokeefe@google.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Chris Kennelly <ckennelly@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: James Houghton <jthoughton@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Rongwei Wang <rongwei.wang@linux.alibaba.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
A user who reads THP_ZERO_PAGE_ALLOC may be more concerned about the huge
zero pages that are really allocated for thp. It is misleading to
increase THP_ZERO_PAGE_ALLOC twice if two threads call get_huge_zero_page
concurrently. Don't increase the value if the huge page is not really
used.
Update Documentation/admin-guide/mm/transhuge.rst to suit.
Link: https://lkml.kernel.org/r/20220909021653.3371879-1-liushixin2@huawei.com
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Commit b18402726bd1 ("Docs/admin-guide/mm/damon/usage: document DAMON
sysfs interface") announced the DAMON debugfs interface deprecation plan,
but it is not so aggressively announced. As the deprecation time is
coming, this commit makes the announce more easy to be found by adding the
note at the beginning of the DAMON debugfs interface usage document.
Link: https://lkml.kernel.org/r/20220909202901.57977-8-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Yun Levi <ppbuk5246@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
'Getting Started' document of DAMON says DAMON user-space tool, damo[1],
is using DAMON debugfs interface, and therefore it needs to ensure debugfs
is mounted. However, the latest version of the tool is using DAMON sysfs
interface. Moreover, DAMON debugfs interface is going to be deprecated as
announced by commit b18402726bd1 ("Docs/admin-guide/mm/damon/usage:
document DAMON sysfs interface").
This commit therefore update the document to tell readers about DAMON
sysfs interface dependency instead and never mention about debugfs
interface, which will be deprecated.
[1] https://github.com/awslabs/damo
Link: https://lkml.kernel.org/r/20220909202901.57977-7-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Yun Levi <ppbuk5246@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The title of the DAMON document for admin-guide, 'Monitoring Data
Accesses', could confuse readers in some ways. First of all, DAMON is not
the only single way for data access monitoring. And the document is for
not only the data access monitoring but also data access pattern based
memory management optimizations (DAMOS). This commit updates the title to
'DAMON: Data Access MONitor', which more explicitly explains what the
document describes.
Link: https://lkml.kernel.org/r/20220909202901.57977-5-sj@kernel.org
Fixes: c4ba6014aec3 ("Documentation: add documents for DAMON")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Yun Levi <ppbuk5246@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
- Reimplement acpi_get_pci_dev() using the list of physical devices
associated with the given ACPI device object (Rafael Wysocki).
- Rename ACPI device object reference counting functions (Rafael
Wysocki).
- Rearrange ACPI device object initialization code (Rafael Wysocki).
- Drop parent field from struct acpi_device (Rafael Wysocki).
- Extend the the int3472-tps68470 driver to support multiple consumers
of a single TPS68470 along with the requisite framework-level
support (Daniel Scally).
- Filter out non-memory resources in is_memory(), add a helper
function to find all memory type resources of an ACPI device object
and use that function in 3 places (Heikki Krogerus).
- Add IRQ override quirks for Asus Vivobook K3402ZA/K3502ZA and ASUS
model S5402ZA (Tamim Khan, Kellen Renshaw).
- Fix acpi_dev_state_d0() kerneldoc (Sakari Ailus).
- Fix up suspend-to-idle support on ASUS Rembrandt laptops (Mario
Limonciello).
- Clean up ACPI platform devices support code (Andy Shevchenko, John
Garry).
- Clean up ACPI bus management code (Andy Shevchenko, ye xingchen).
- Add support for multiple DMA windows with different offsets to the
ACPI device enumeration code and use it on LoongArch (Jianmin Lv).
- Clean up the ACPI LPSS (Intel SoC) driver (Andy Shevchenko).
- Add a quirk for Dell Inspiron 14 2-in-1 for StorageD3Enable (Mario
Limonciello).
- Drop unused dev_fmt() and redundant 'HMAT' prefix from the HMAT
parsing code (Liu Shixin).
- Make ACPI FPDT parsing code avoid calling acpi_os_map_memory() on
invalid physical addresses (Hans de Goede).
- Silence missing-declarations warning related to Apple device
properties management (Lukas Wunner).
- Disable frequency invariance in the CPPC library if registers used
by cppc_get_perf_ctrs() are accessed via PCC (Jeremy Linton).
- Add ACPI disabled check to acpi_cpc_valid() (Perry Yuan).
- Fix Tx acknowledge in the PCC address space handler (Huisong Li).
- Use wait_for_completion_timeout() for PCC mailbox operations (Huisong
Li).
- Release resources on PCC address space setup failure path (Rafael
Mendonca).
- Remove unneeded result variables from APEI code (ye xingchen).
- Print total number of records found during BERT log parsing (Dmitry
Monakhov).
- Drop support for 3 _OSI strings that should not be necessary any
more and update documentation on custom _OSI strings so that adding
new ones is not encouraged any more (Mario Limonciello).
- Drop unneeded result variable from ec_write() (ye xingchen).
- Remove the leftover struct acpi_ac_bl from the ACPI AC driver (Hanjun
Guo).
- Reorder symbols to get rid of a few forward declarations in the ACPI
fan driver (Uwe Kleine-König).
- Add Toshiba Satellite/Portege Z830 ACPI backlight quirk (Arvid
Norlander).
- Add ARM DMA-330 controller to the supported list in the ACPI AMBA
driver (Vijayenthiran Subramaniam).
- Drop references to non-functional 01.org/linux-acpi web site from
MAINTAINERS and Kconfig help texts (Rafael Wysocki).
- Replace strlcpy() with unused retval with strscpy() in the ACPI
support code (Wolfram Sang).
- Do not initialize ret in main() in the pfrut utility (Shi junming).
- Drop useless ACPI DSDT override documentation (Rafael Wysocki).
- Fix a few typos and wording mistakes in the ACPI device enumeration
documentation (Jean Delvare).
- Introduce acpi_dev_uid_to_integer() to convert a _UID string into an
integer value (Andy Shevchenko).
- Use acpi_dev_uid_to_integer() in several places to unify _UID
handling (Andy Shevchenko).
- Drop unused pnpid32_to_pnpid() declaration from PNP code (Gaosheng
Cui).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmM7OhkSHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRx/TkQALQ4TN451dPSj9jcYSNY6qZ/9b4P9Iym
TmRf3wO3+IVZQ8JajeKKRuVKNsW3sC0RcFkJJVmgZkydJBr1Uui2L0ZLzi8axGNy
RlbZm5NyBeFnlP0fA8Gb2iRMXVAUcRIx+RvZCulxxFmgQ8UhoU4wlVZWlEcko4TQ
hGp++lJYcRHR1NbVLSXZhFvzopKLdhGL6vB1Awsjb/I7TVqn23+k4jVRV1DYkIQ7
qgFM+Z7osRVZiVQbaPoOgdykeSa43qXu7Vgs7F/QeJuIiUYx59xDh0/WCJBxnuDM
cHGiaNnvuJghKmCg43X8+joaHEH/jCFyvBVGfiSzRvjz03WOPRs1XztwdEiCi+py
RcZGzrPaXmkCjNeytPRooiifyqm95HT7aMBN/aTvKBXDaGRrfPheXF+i2idl24HM
NrHqMaa0+5qoDGHLUEaf5znlCHfS+3lwq6+lGVrq/UGf6B3cP+9HwOyevEW493JX
4nuv69Y517moR9W3mBU8sAn5mUjshcka7pghRj7QnuoqRqWLbU3lIz8oUDHr84cI
ixpIPvt2KlZ5UjnN9aqu/6k70JkJvy4SrKjnx4iqu03ePmMrRc0Hcpy7+VMlgumD
tgN9aW+YDgy0/Z5QmO1MOvFodVmA5sX6+gnX1neAjuDdIo3LkJptlkO1fCx2jfQu
cgPQk1CtPOos
=xyUK
-----END PGP SIGNATURE-----
Merge tag 'acpi-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI updates from Rafael Wysocki:
"ACPI and PNP updates for 6.1-rc1.
These rearrange the ACPI device object initialization code (to get rid
of a redundant parent pointer from struct acpi_device among other
things), unify the _UID handling, drop support for some _OSI strings
that should not be necessary any more, add new IDs to support more
hardware and some more quirks, fix a few issues and clean up code all
over.
Specifics:
- Reimplement acpi_get_pci_dev() using the list of physical devices
associated with the given ACPI device object (Rafael Wysocki)
- Rename ACPI device object reference counting functions (Rafael
Wysocki)
- Rearrange ACPI device object initialization code (Rafael Wysocki)
- Drop parent field from struct acpi_device (Rafael Wysocki)
- Extend the the int3472-tps68470 driver to support multiple
consumers of a single TPS68470 along with the requisite
framework-level support (Daniel Scally)
- Filter out non-memory resources in is_memory(), add a helper
function to find all memory type resources of an ACPI device object
and use that function in 3 places (Heikki Krogerus)
- Add IRQ override quirks for Asus Vivobook K3402ZA/K3502ZA and ASUS
model S5402ZA (Tamim Khan, Kellen Renshaw)
- Fix acpi_dev_state_d0() kerneldoc (Sakari Ailus)
- Fix up suspend-to-idle support on ASUS Rembrandt laptops (Mario
Limonciello)
- Clean up ACPI platform devices support code (Andy Shevchenko, John
Garry)
- Clean up ACPI bus management code (Andy Shevchenko, ye xingchen)
- Add support for multiple DMA windows with different offsets to the
ACPI device enumeration code and use it on LoongArch (Jianmin Lv)
- Clean up the ACPI LPSS (Intel SoC) driver (Andy Shevchenko)
- Add a quirk for Dell Inspiron 14 2-in-1 for StorageD3Enable (Mario
Limonciello)
- Drop unused dev_fmt() and redundant 'HMAT' prefix from the HMAT
parsing code (Liu Shixin)
- Make ACPI FPDT parsing code avoid calling acpi_os_map_memory() on
invalid physical addresses (Hans de Goede)
- Silence missing-declarations warning related to Apple device
properties management (Lukas Wunner)
- Disable frequency invariance in the CPPC library if registers used
by cppc_get_perf_ctrs() are accessed via PCC (Jeremy Linton)
- Add ACPI disabled check to acpi_cpc_valid() (Perry Yuan)
- Fix Tx acknowledge in the PCC address space handler (Huisong Li)
- Use wait_for_completion_timeout() for PCC mailbox operations
(Huisong Li)
- Release resources on PCC address space setup failure path (Rafael
Mendonca)
- Remove unneeded result variables from APEI code (ye xingchen)
- Print total number of records found during BERT log parsing (Dmitry
Monakhov)
- Drop support for 3 _OSI strings that should not be necessary any
more and update documentation on custom _OSI strings so that adding
new ones is not encouraged any more (Mario Limonciello)
- Drop unneeded result variable from ec_write() (ye xingchen)
- Remove the leftover struct acpi_ac_bl from the ACPI AC driver
(Hanjun Guo)
- Reorder symbols to get rid of a few forward declarations in the
ACPI fan driver (Uwe Kleine-König)
- Add Toshiba Satellite/Portege Z830 ACPI backlight quirk (Arvid
Norlander)
- Add ARM DMA-330 controller to the supported list in the ACPI AMBA
driver (Vijayenthiran Subramaniam)
- Drop references to non-functional 01.org/linux-acpi web site from
MAINTAINERS and Kconfig help texts (Rafael Wysocki)
- Replace strlcpy() with unused retval with strscpy() in the ACPI
support code (Wolfram Sang)
- Do not initialize ret in main() in the pfrut utility (Shi junming)
- Drop useless ACPI DSDT override documentation (Rafael Wysocki)
- Fix a few typos and wording mistakes in the ACPI device enumeration
documentation (Jean Delvare)
- Introduce acpi_dev_uid_to_integer() to convert a _UID string into
an integer value (Andy Shevchenko)
- Use acpi_dev_uid_to_integer() in several places to unify _UID
handling (Andy Shevchenko)
- Drop unused pnpid32_to_pnpid() declaration from PNP code (Gaosheng
Cui)"
* tag 'acpi-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (79 commits)
ACPI: LPSS: Deduplicate skipping device in acpi_lpss_create_device()
ACPI: LPSS: Replace loop with first entry retrieval
ACPI: x86: s2idle: Add another ID to s2idle_dmi_table
ACPI: x86: s2idle: Fix a NULL pointer dereference
MAINTAINERS: Drop records pointing to 01.org/linux-acpi
ACPI: Kconfig: Drop link to https://01.org/linux-acpi
ACPI: docs: Drop useless DSDT override documentation
ACPI: DPTF: Drop stale link from Kconfig help
ACPI: x86: s2idle: Add a quirk for ASUSTeK COMPUTER INC. ROG Flow X13
ACPI: x86: s2idle: Add a quirk for Lenovo Slim 7 Pro 14ARH7
ACPI: x86: s2idle: Add a quirk for ASUS ROG Zephyrus G14
ACPI: x86: s2idle: Add a quirk for ASUS TUF Gaming A17 FA707RE
ACPI: x86: s2idle: Add module parameter to prefer Microsoft GUID
ACPI: x86: s2idle: If a new AMD _HID is missing assume Rembrandt
ACPI: x86: s2idle: Move _HID handling for AMD systems into structures
platform/x86: int3472: Add board data for Surface Go2 IR camera
platform/x86: int3472: Support multiple gpio lookups in board data
platform/x86: int3472: Support multiple clock consumers
ACPI: bus: Add iterator for dependent devices
ACPI: scan: Add acpi_dev_get_next_consumer_dev()
...
Merge miscellaneous ACPI material, ACPI tools changes and ACPI
documentation updates for 6.1-rc1:
- Drop references to non-functional 01.org/linux-acpi web site from
MAINTAINERS and Kconfig help texts (Rafael Wysocki).
- Replace strlcpy() with unused retval with strscpy() in the ACPI
support code (Wolfram Sang).
- Do not initialize ret in main() in the pfrut utility (Shi junming).
- Drop useless ACPI DSDT override documentation (Rafael Wysocki).
- Fix a few typos and wording mistakes in the ACPI device enumeration
documentation (Jean Delvare).
* acpi-misc:
MAINTAINERS: Drop records pointing to 01.org/linux-acpi
ACPI: Kconfig: Drop link to https://01.org/linux-acpi
ACPI: DPTF: Drop stale link from Kconfig help
ACPI: move from strlcpy() with unused retval to strscpy()
* acpi-tools:
ACPI: tools: pfrut: Do not initialize ret in main()
* acpi-docs:
ACPI: docs: Drop useless DSDT override documentation
ACPI: docs: enumeration: Fix a few typos and wording mistakes
but a few significant changes even so:
- A complete rewriting of the top-level index.rst file, which mostly
reflects itself in a redone top page in the HTML-rendered docs. The hope
is that the new organization will be a friendlier starting point for
both users and developers.
- Some math-rendering improvements.
- A coding-style.rst update on the use of BUG() and WARN()
- A big maintainer-PHP guide update.
- Some code-of-conduct updates
- More Chinese translation work
Plus the usual pile of typo fixes, corrections, and updates.
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEIw+MvkEiF49krdp9F0NaE2wMflgFAmM7BksPHGNvcmJldEBs
d24ubmV0AAoJEBdDWhNsDH5Y8i4H/ihd1ppgVYy1yvFL3L1KkcsNyt3bFUa6hide
qmkhqpzjsNmbTOaW19Y6epCzRzvxG7M9hzztIewt1BhRDvgRC8GaQNNRw/IBs0B6
kprisINC2/ap4JjCroYWepfd+H8NSiVxqtd8hVSMWDSh2cK9vw0qVqQq59I+gght
64pA4F2nPO6bamZzAELTdWRj0ITL1A/V/jYj+T074B094arc4HyekIQ5Jn9GTCmt
jFBH9yxAb3l8K7KgzH7FgxKY/an0HxKDh4Cnx2Jv+dcocgCwy1iXCuyEZbFd9GEB
UyhPcCyrIe/I2B9U9LrqLvXA8LW7jwE+MZMqZpaRkxcIdE2gEFQ=
=M7tR
-----END PGP SIGNATURE-----
Merge tag 'docs-6.1' of git://git.lwn.net/linux
Pull documentation updates from Jonathan Corbet:
"There's not a huge amount of activity in the docs tree this time
around, but a few significant changes even so:
- A complete rewriting of the top-level index.rst file, which mostly
reflects itself in a redone top page in the HTML-rendered docs. The
hope is that the new organization will be a friendlier starting
point for both users and developers.
- Some math-rendering improvements.
- A coding-style.rst update on the use of BUG() and WARN()
- A big maintainer-PHP guide update.
- Some code-of-conduct updates
- More Chinese translation work
Plus the usual pile of typo fixes, corrections, and updates"
* tag 'docs-6.1' of git://git.lwn.net/linux: (66 commits)
checkpatch: warn on usage of VM_BUG_ON() and other BUG variants
coding-style.rst: document BUG() and WARN() rules ("do not crash the kernel")
Documentation: devres: add missing IO helper
Documentation: devres: update IRQ helper
Documentation/mm: modify page_referenced to folio_referenced
Documentation/CoC: Reflect current CoC interpretation and practices
docs/doc-guide: Add documentation on SPHINX_IMGMATH
docs: process/5.Posting.rst: clarify use of Reported-by: tag
docs, kprobes: Fix the wrong location of Kprobes
docs: add a man-pages link to the front page
docs: put atomic*.txt and memory-barriers.txt into the core-api book
docs: move asm-annotations.rst into core-api
docs: remove some index.rst cruft
docs: reconfigure the HTML left column
docs: Rewrite the front page
docs: promote the title of process/index.rst
Documentation: devres: add missing SPI helper
Documentation: devres: add missing PINCTRL helpers
docs: hugetlbpage.rst: fix a typo of hugepage size
docs/zh_CN: Add new translation of admin-guide/bootconfig.rst
...
This patch adds the kunit.enable module parameter that will need to be
set to true in addition to KUNIT being enabled for KUnit tests to run.
The default value is true giving backwards compatibility. However, for
the production+testing use case the new config option
KUNIT_DEFAULT_ENABLED can be set to N requiring the tester to opt-in
by passing kunit.enable=1 to the kernel.
Signed-off-by: Joe Fradley <joefradley@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
* for-next/misc:
: Miscellaneous patches
arm64/kprobe: Optimize the performance of patching single-step slot
ARM64: reloc_test: add __init/__exit annotations to module init/exit funcs
arm64/mm: fold check for KFENCE into can_set_direct_map()
arm64: uaccess: simplify uaccess_mask_ptr()
arm64: mte: move register initialization to C
arm64: mm: handle ARM64_KERNEL_USES_PMD_MAPS in vmemmap_populate()
arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()
arm64: support huge vmalloc mappings
arm64: spectre: increase parameters that can be used to turn off bhb mitigation individually
arm64: run softirqs on the per-CPU IRQ stack
arm64: compat: Implement misalignment fixups for multiword loads
Because https://01.org/linux-acpi web site has become permanently
inaccessible, the "Overriding DSDT" document in the kernel tree
pointing to it as the main source of information is useless (and
the config option name mentioned by it is incorrect), so drop it
and drop the pointer to it from the ACPI Kconfig.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
commit 7c693f54c873691 ("x86/speculation: Add spectre_v2=ibrs option to support Kernel IBRS")
adds the "ibrs " option in
Documentation/admin-guide/kernel-parameters.txt but omits it to
Documentation/admin-guide/hw-vuln/spectre.rst, add it.
Signed-off-by: Lin Yujun <linyujun809@huawei.com>
Link: https://lore.kernel.org/r/20220830123614.23007-1-linyujun809@huawei.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
The current section 'If something goes wrong' makes a number of suggestions
for debugging, bug hunting and reporting issues, which are quite briefly
described in that section.
However, the suggestions are also well covered in other kernel
documentation or sometimes simply outdated. Here, each suggestion in that
section is summarized, and then followed with its assessment, and the
derived action for each suggestion:
- use MAINTAINERS and mailing list: covered in 'Reporting issues',
summarized in the short guide, detailed in its further section.
Reporting issues even provides some specific examples that guides
readers well through the needed steps. Refer to 'Reporting issues'.
- contact Linus Torvalds: probably outdated as currently described.
nevertheless covered in 'Reporting issues'. Reporting issues points out
to contact the relevant kernel maintainers first, and after some
patience and failed attempts with those maintainers, contacting Linus
Torvalds might be okay. Refer to 'Reporting issues'.
- tell what kernel, how to duplicate, the setup, if the problem is new
or old and when did you notice: covered in 'Reporting issues',
especially in Step-by-step guide how to report issues to the kernel
maintainers. Refer to 'Reporting issues'.
- duplicate kernel bug reports exactly: covered in 'Reporting issues',
especially in Write and send the report. Refer to 'Reporting issues'.
- read 'Bug hunting': keep this reference. Refer to 'Bug hunting'.
- compile the kernel with CONFIG_KALLSYMS: covered in 'Reporting issues',
especially in Decode failure messages. Refer to 'Reporting issues'.
- alternatively, use ksymoops: ksymoops at the mentioned URL seems not to
be maintained anymore. It was released roughly once a year until
version 2.4.11 in 2005, but has not seen a new release since then. The
information in ./scripts/ksymoops/README is from 1999, and does not
give more insight on its actual maintenance state either. Ksymoops is
mentioned as system utility in changes.rst, but also not recommended
there. Drop the explanation on using ksymoops.
- alternatively, lookup dump manually with the EIP and nm to determine
the function in which the kernel crashes: this method seems already a
quite advanced and low-level debugging method. Even all the further
references on bug hunting and debugging do not mention it. Drop this
alternative method and limit mentioning methods explained in the other
existing kernel documentation.
- read 'Reporting issues': keep this reference.
Refer to 'Reporting issues'.
- use gdb for debugging: some specific details, e.g., edit
arch/x86/Makefile, are probably outdated or limited to one (historic
important) setup. Using gdb is covered in 'Bug hunting', 'Debugging
kernel and modules via gdb' and 'Using kgdb, kdb and the kernel
debugger internals'. Refer to those three documents.
Overall, it is sufficient to refer to reporting-issues.rst,
bug-hunting.rst, gdb-kernel-debugging.rst and kgdb.rst and this way cover
the existing suggestions.
'Reporting issues' is quite new and probably up to date. 'Bug hunting',
'Debugging kernel and modules via gdb' and 'Using kgdb, kdb and the kernel
debugger internals' might need some revisit and update, but they are
generally in an acceptable state for referring to them.
Replace the existing suggestions by reference to other existing kernel
documentation covering those suggestions---partly even nicely summarized
and then explained in greater detail.
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Link: https://lore.kernel.org/r/20220720041325.15693-3-lukas.bulwahn@gmail.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Running a.out user programs with the latest kernel release is a very rare
and uncommon use case nowadays. The support of a.out user programs is only
remaining for the alpha architecture and is not defined and activated in
the architecture's Kconfig (so even the activation of this support requires
to modify the Kconfig file and not just kernel build configuration).
The discussion on a.out support in 2019 (see Link) shows that the support
of a.out user programs is just remaining for a special corner case from
some (alpha architecture) users.
There is no need to point out and mention this special feature to the
general audience of kernel users. Delete the reference to this historic and
special feature.
Link: https://lore.kernel.org/all/CAHk-=wgt7M6yA5BJCJo0nF22WgPJnN8CvViL9CAJmd+S+Civ6w@mail.gmail.com/
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Link: https://lore.kernel.org/r/20220720041325.15693-2-lukas.bulwahn@gmail.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Add the description of KSM profit and how to determine it separately in
system-wide range and inner a single process.
Link: https://lkml.kernel.org/r/20220830144003.299870-1-xu.xin16@zte.com.cn
Signed-off-by: xu xin <xu.xin16@zte.com.cn>
Reviewed-by: Xiaokai Ran <ran.xiaokai@zte.com.cn>
Reviewed-by: Yang Yang <yang.yang29@zte.com.cn>
Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Izik Eidus <izik.eidus@ravellosystems.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
for-6.0 has the following fix for cgroup_get_from_id().
836ac87d ("cgroup: fix cgroup_get_from_id")
which conflicts with the following two commits in for-6.1.
4534dee9 ("cgroup: cgroup: Honor caller's cgroup NS when resolving cgroup id")
fa7e439c ("cgroup: Homogenize cgroup_get_from_id() return value")
While the resolution is straightforward, the code ends up pretty ugly
afterwards. Let's pull for-6.0-fixes into for-6.1 so that the code can be
fixed up there.
Signed-off-by: Tejun Heo <tj@kernel.org>
Alibaba's T-Head SoC implements uncore PMU for performance and functional
debugging to facilitate system maintenance. Document it to provide guidance
on how to use it.
Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Link: https://lore.kernel.org/r/20220914022326.88550-2-xueshuai@linux.alibaba.com
Signed-off-by: Will Deacon <will@kernel.org>
As commit 559089e0a93d ("vmalloc: replace VM_NO_HUGE_VMAP with
VM_ALLOW_HUGE_VMAP"), the use of hugepage mappings for vmalloc
is an opt-in strategy, so it is saftly to support huge vmalloc
mappings on arm64, for now, it is used in kvmalloc() and
alloc_large_system_hash().
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Link: https://lore.kernel.org/r/20220911044423.139229-1-wangkefeng.wang@huawei.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmMeQ2keHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGYRMH+gLNHiGirGZlm2GQ
tKaZQUy7MiXuIP0hGDonDIIIAmIVhnjm9MDG8KT4W8AvEd7ukncyYqJfwWeWQPhP
4mZcf6l3Z8Ke+qiaFpXpMPCxTyWcln1ox0EoNx2g9gdPxZntaRuuaTQVljUfTiey
aVPHxve8ip3G7jDoJnuLSxESOqWxkb8v/SshBP1E5bF5BZ+cgZRqq7FNigFqxjbk
wF29K09BVOPjdgkSvY/b0/SnL5KlSdMAv+FrPcJNGivcdIPgf/qJks5cI2HRUo7o
CpKgbcLorCVyD+d+zLonJBwIy3arbmKD8JqYnfdTSIqVOUqHXWUDfeydsH32u1Gu
lPSI2Hw=
=7LTL
-----END PGP SIGNATURE-----
Merge 6.0-rc5 into driver-core-next
We need the driver core and debugfs changes in this branch.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Print the machine hardware name (UTS_MACHINE) in /proc/sys/kernel/arch.
This helps people who debug kernel with initramfs with minimal environment
(i.e. without coreutils or even busybox) or allow to open sysfs file
instead of run 'uname -m' in high level languages.
Link: https://lkml.kernel.org/r/20220901194403.3819-1-pvorel@suse.cz
Signed-off-by: Petr Vorel <pvorel@suse.cz>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: David Sterba <dsterba@suse.com>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
In commit 2f1ee0913ce5 ("Revert "mm: use early_pfn_to_nid in
page_ext_init""), we call page_ext_init() after page_alloc_init_late() to
avoid some panic problem. It seems that we cannot track early page
allocations in current kernel even if page structure has been initialized
early.
This patch introduces a new boot parameter 'early_page_ext' to resolve
this problem. If we pass it to the kernel, page_ext_init() will be moved
up and the feature 'deferred initialization of struct pages' will be
disabled to initialize the page allocator early and prevent the panic
problem above. It can help us to catch early page allocations. This is
useful especially when we find that the free memory value is not the same
right after different kernel booting.
[akpm@linux-foundation.org: fix section issue by removing __meminitdata]
Link: https://lkml.kernel.org/r/20220825102714.669-1-lizhe.67@bytedance.com
Signed-off-by: Li Zhe <lizhe.67@bytedance.com>
Suggested-by: Michal Hocko <mhocko@suse.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mark-PK Tsai <mark-pk.tsai@mediatek.com>
Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
In NUMA balancing memory tiering mode, if there are hot pages in slow
memory node and cold pages in fast memory node, we need to promote/demote
hot/cold pages between the fast and cold memory nodes.
A choice is to promote/demote as fast as possible. But the CPU cycles and
memory bandwidth consumed by the high promoting/demoting throughput will
hurt the latency of some workload because of accessing inflating and slow
memory bandwidth contention.
A way to resolve this issue is to restrict the max promoting/demoting
throughput. It will take longer to finish the promoting/demoting. But
the workload latency will be better. This is implemented in this patch as
the page promotion rate limit mechanism.
The number of the candidate pages to be promoted to the fast memory node
via NUMA balancing is counted, if the count exceeds the limit specified by
the users, the NUMA balancing promotion will be stopped until the next
second.
A new sysctl knob kernel.numa_balancing_promote_rate_limit_MBps is added
for the users to specify the limit.
Link: https://lkml.kernel.org/r/20220713083954.34196-3-ying.huang@intel.com
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Tested-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@suse.com>
Cc: osalvador <osalvador@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Wei Xu <weixugc@google.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Zhong Jiang <zhongjiang-ali@linux.alibaba.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Currently only 12 characters of the cma name is being used as the debug
directories where as the cma name can be of length CMA_MAX_NAME(=64)
characters. One side problem with this is having 2 cma's with first
common 12 characters would end up in trying to create directories with
same name and fails with -EEXIST thus can limit cma debug functionality.
The 'cma-' prefix is used initially where cma areas don't have any names
and are represented by simple integer values. Since now each cma would be
having its own name, drop 'cma-' prefix for the cma debug directories as
they are clearly evident that they are for cma debug through creating them
in /sys/kernel/debug/cma/ path.
Link: https://lkml.kernel.org/r/1660223729-22461-1-git-send-email-quic_charante@quicinc.com
Signed-off-by: Charan Teja Kalla <quic_charante@quicinc.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Pavan Kondeti <quic_pkondeti@quicinc.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Explain the different ways to create a new userfaultfd, and how access
control works for each way.
[axelrasmussen@google.com: improve wording in documentation, per Mike]
Link: https://lkml.kernel.org/r/20220819205201.658693-5-axelrasmussen@google.com
Link: https://lkml.kernel.org/r/20220808175614.3885028-5-axelrasmussen@google.com
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dmitry V. Levin <ldv@altlinux.org>
Cc: Gleb Fotengauer-Malinovskiy <glebfm@altlinux.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nadav Amit <namit@vmware.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zhang Yi <yi.zhang@huawei.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
In our environment, it was found that the mitigation BHB has a great
impact on the benchmark performance. For example, in the lmbench test,
the "process fork && exit" test performance drops by 20%.
So it is necessary to have the ability to turn off the mitigation
individually through cmdline, thus avoiding having to compile the
kernel by adjusting the config.
Signed-off-by: Liu Song <liusong@linux.alibaba.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/1661514050-22263-1-git-send-email-liusong@linux.alibaba.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
PSI accounts stalls for each cgroup separately and aggregates it
at each level of the hierarchy. This may cause non-negligible overhead
for some workloads when under deep level of the hierarchy.
commit 3958e2d0c34e ("cgroup: make per-cgroup pressure stall tracking configurable")
make PSI to skip per-cgroup stall accounting, only account system-wide
to avoid this each level overhead.
But for our use case, we also want leaf cgroup PSI stats accounted for
userspace adjustment on that cgroup, apart from only system-wide adjustment.
So this patch introduce a per-cgroup PSI accounting disable/re-enable
interface "cgroup.pressure", which is a read-write single value file that
allowed values are "0" and "1", the defaults is "1" so per-cgroup
PSI stats is enabled by default.
Implementation details:
It should be relatively straight-forward to disable and re-enable
state aggregation, time tracking, averaging on a per-cgroup level,
if we can live with losing history from while it was disabled.
I.e. the avgs will restart from 0, total= will have gaps.
But it's hard or complex to stop/restart groupc->tasks[] updates,
which is not implemented in this patch. So we always update
groupc->tasks[] and PSI_ONCPU bit in psi_group_change() even when
the cgroup PSI stats is disabled.
Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Link: https://lkml.kernel.org/r/20220907090332.2078-1-zhouchengming@bytedance.com
Now PSI already tracked workload pressure stall information for
CPU, memory and IO. Apart from these, IRQ/SOFTIRQ could have
obvious impact on some workload productivity, such as web service
workload.
When CONFIG_IRQ_TIME_ACCOUNTING, we can get IRQ/SOFTIRQ delta time
from update_rq_clock_task(), in which we can record that delta
to CPU curr task's cgroups as PSI_IRQ_FULL status.
Note we don't use PSI_IRQ_SOME since IRQ/SOFTIRQ always happen in
the current task on the CPU, make nothing productive could run
even if it were runnable, so we only use PSI_IRQ_FULL.
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Link: https://lore.kernel.org/r/20220825164111.29534-8-zhouchengming@bytedance.com
Rework/modernize docs:
- use /proc/dynamic_debug/control in examples
its *always* there (when dyndbg is config'd), even when <debugfs> is not.
drop <debugfs> talk, its a distraction here.
- alias ddcmd='echo $* > /proc/dynamic_debug/control
focus on args: declutter, hide boilerplate, make pwd independent.
- swap sections: Viewing before Controlling. control file as Catalog.
- focus on use by a system administrator
add an alias to make examples more readable
drop grep-101 lessons, admins know this.
- use init/main.c as 1st example, thread it thru doc where useful.
everybodys kernel boots, runs these.
- add *prdbg* api section
to the bottom of the file, its for developers more than admins.
move list of api functions there.
- simplify - drop extra words, phrases, sentences.
- add "decorator" flags line to unify "prefix", trim fmlt descriptions
CC: linux-doc@vger.kernel.org
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20220904214134.408619-20-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add an explanation of the new "class CLASS_NAME" syntax and meaning,
noting that the module determines if CLASS_NAME applies to it.
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Link: https://lore.kernel.org/r/20220904214134.408619-19-jim.cromie@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
CONFIG_VIRT_CPU_ACCOUNTING_GEN under pseries does not provide stolen
time accounting unless CONFIG_PARAVIRT_TIME_ACCOUNTING is enabled.
Implement this using the VPA accumulated wait counters.
Note this will not work on current KVM hosts because KVM does not
implement the VPA dispatch counters (yet). It could be implemented
with the dispatch trace log as it is for VIRT_CPU_ACCOUNTING_NATIVE,
but that is not necessary for the more limited accounting provided
by PARAVIRT_TIME_ACCOUNTING, and it is more expensive, complex, and
has downsides like potential log wrap.
From Shrikanth:
[...] it was tested on Power10 [PowerVM] Shared LPAR. system has two
LPAR. we will call first one LPAR1 and second one as LPAR2. Test was
carried out in SMT=1. Similar observation was seen in SMT=8 as well.
LPAR config header from each LPAR is below. LPAR1 is twice as big as
LPAR2. Since Both are sharing the same underlying hardware, work
stealing will happen when both the LPAR's are contending for the same
resource.
LPAR1:
type=Shared mode=Uncapped smt=Off lcpu=40 cpus=40 ent=20.00
LPAR2:
type=Shared mode=Uncapped smt=Off lcpu=20 cpus=40 ent=10.00
mpstat was used to check for the utilization. stress-ng has been used
as the workload. Few cases are tested. when the both LPAR are idle
there is no steal time. when LPAR1 starts running at 100% which
consumes all of the physical resource, steal time starts to get
accounted. With LPAR1 running at 100% and LPAR2 starts running, steal
time starts increasing. This is as expected. When the LPAR2 Load is
increased further, steal time increases further.
Case 1: 0% LPAR1; 0% LPAR2
%usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
0.00 0.00 0.05 0.00 0.00 0.00 0.00 0.00 0.00 99.95
Case 2: 100% LPAR1; 0% LPAR2
%usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
97.68 0.00 0.00 0.00 0.00 0.00 2.32 0.00 0.00 0.00
Case 3: 100% LPAR1; 50% LPAR2
%usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
86.34 0.00 0.10 0.00 0.00 0.03 13.54 0.00 0.00 0.00
Case 4: 100% LPAR1; 100% LPAR2
%usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
78.54 0.00 0.07 0.00 0.00 0.02 21.36 0.00 0.00 0.00
Case 5: 50% LPAR1; 100% LPAR2
%usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
49.37 0.00 0.00 0.00 0.00 0.00 1.17 0.00 0.00 49.47
Patch is accounting for the steal time and basic tests are holding
good.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Tested-by: Shrikanth Hegde <sshegde@linux.ibm.com>
[mpe: Add SPDX tag to new paravirt_api_clock.h]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220902085316.2071519-3-npiggin@gmail.com
Update Documentation/admin-guide/cgroup-v2.rst on the newly introduced
"isolated" cpuset partition type as well as other changes made in other
cpuset patches.
Signed-off-by: Waiman Long <longman@redhat.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
The APIC supports two modes, legacy APIC (or xAPIC), and Extended APIC
(or x2APIC). X2APIC mode is mostly compatible with legacy APIC, but
it disables the memory-mapped APIC interface in favor of one that uses
MSRs. The APIC mode is controlled by the EXT bit in the APIC MSR.
The MMIO/xAPIC interface has some problems, most notably the APIC LEAK
[1]. This bug allows an attacker to use the APIC MMIO interface to
extract data from the SGX enclave.
Introduce support for a new feature that will allow the BIOS to lock
the APIC in x2APIC mode. If the APIC is locked in x2APIC mode and the
kernel tries to disable the APIC or revert to legacy APIC mode a GP
fault will occur.
Introduce support for a new MSR (IA32_XAPIC_DISABLE_STATUS) and handle
the new locked mode when the LEGACY_XAPIC_DISABLED bit is set by
preventing the kernel from trying to disable the x2APIC.
On platforms with the IA32_XAPIC_DISABLE_STATUS MSR, if SGX or TDX are
enabled the LEGACY_XAPIC_DISABLED will be set by the BIOS. If
legacy APIC is required, then it SGX and TDX need to be disabled in the
BIOS.
[1]: https://aepicleak.com/aepicleak.pdf
Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
Link: https://lkml.kernel.org/r/20220816231943.1152579-1-daniel.sneddon@linux.intel.com
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEIw+MvkEiF49krdp9F0NaE2wMflgFAmMM3aoPHGNvcmJldEBs
d24ubmV0AAoJEBdDWhNsDH5YsdsH/09lM6s1trLFE8AVRGcSVJUelbZNZfm2WI7J
7ElIVVVsBvRfLMVpjCkPrKjRV2jyJ+wDtIs1g54ra/J9/xyWSmlp8jDLvB+PgqGR
zxhZX1/dbGEoRwqY3S5Vm5i3DsdBKMcGGc1nIM2xzZXdRoptvbzbZOHFGU/mqdC5
QZeVe6ycCzsIgGz6aKlhHEszp/Jd2T4Zj/kJ9n95IxNB3l8uKTnSn/s0cQfGRZU/
dtednWNs+1nUMGMerfgq7rstE0GfQFbfWL+2bYlDuSS3RgsaMXD+jwhxx7RV4oM2
XCbgDoYW+rg4jArWOvBLWW+EFK5ywYYBjhGXDm041jaKi/vPRbo=
=aQJl
-----END PGP SIGNATURE-----
Merge tag 'docs-6.0-fixes' of git://git.lwn.net/linux
Pull documentation fixes from Jonathan Corbet:
"A handful of fixes for documentation and the docs build system"
* tag 'docs-6.0-fixes' of git://git.lwn.net/linux:
docs/conf.py: add function attribute '__fix_address' to conf.py
Docs/admin-guide/mm/damon/usage: fix the example code snip
docs: Update version number from 5.x to 6.x in README.rst
docs/ja_JP/SubmittingPatches: Remove reference to submitting-drivers.rst
docs: kerneldoc-preamble: Test xeCJK.sty before loading
- Fix PAT on Xen, which caused i915 driver failures
- Fix compat INT 80 entry crash on Xen PV guests
- Fix 'MMIO Stale Data' mitigation status reporting on older Intel CPUs
- Fix RSB stuffing regressions
- Fix ORC unwinding on ftrace trampolines
- Add Intel Raptor Lake CPU model number
- Fix (work around) a SEV-SNP bootloader bug providing bogus values in
boot_params->cc_blob_address, by ignoring the value on !SEV-SNP bootups.
- Fix SEV-SNP early boot failure
- Fix the objtool list of noreturn functions and annotate snp_abort(),
which bug confused objtool on gcc-12.
- Fix the documentation for retbleed
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmMLgUERHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1hU1hAAj9bKO+D07gROpkRhXpLvbqm+mUqk6It8
qEuyLkC/xvD9N//Pya0ZPPE+l23EHQxM/L0tXCYwsc8Zlx3fr678FQktHZuxULaP
qlGmwub8PvsyMesXBGtgHJ4RT8L07FelBR+E/TpqCSbv/geM7mcc0ojyOKo+OmbD
vbsUTDNqsMYndzePUBrv/vJRqVLGlbd1a22DKE/YiYBbZu5hughNUB0bHYhYdsml
mutcRsVTKRbZOi4wiuY/pnveMf/z9wAwKOWd9EBaDWILygVSHkBJJQyhHigBh3F8
H1hFn1q4ybULdrAUT4YND9Tjb6U8laU0fxg8Z43bay7bFXXElqoj3W2qka9NjAim
QvWSFvTYQRTIWt6sCshVsUBWj6fBZdHxcJ8Fh+ucJJp+JWl9/aD0A41vK2Jx0LIt
p8YzBRKEqd1Q/7QD855BArN7HNtQGgShNt4oPVZ7nPnjuQt0+lngu8NR+6X3RKpX
r8rordyPvzgPL6W+1uV+8hnz1w+YU2xplAbK+zTijwgJVgyf8khSlZQNpFKULrou
zjtzo/2nB+4C4bvfetNnaOGhi1/AdCHZHyZE35rotpd73SLHvdOrH0Ll9oCVfVrC
UWbC1E67cHQw97Ni/4CrCsJRBULK01uyszVCxlEkSYInf0UsKlnmy+TqxizwsCVy
reYQ0ePyWg0=
=ZpCJ
-----END PGP SIGNATURE-----
Merge tag 'x86-urgent-2022-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc x86 fixes from Ingo Molnar:
- Fix PAT on Xen, which caused i915 driver failures
- Fix compat INT 80 entry crash on Xen PV guests
- Fix 'MMIO Stale Data' mitigation status reporting on older Intel CPUs
- Fix RSB stuffing regressions
- Fix ORC unwinding on ftrace trampolines
- Add Intel Raptor Lake CPU model number
- Fix (work around) a SEV-SNP bootloader bug providing bogus values in
boot_params->cc_blob_address, by ignoring the value on !SEV-SNP
bootups.
- Fix SEV-SNP early boot failure
- Fix the objtool list of noreturn functions and annotate snp_abort(),
which bug confused objtool on gcc-12.
- Fix the documentation for retbleed
* tag 'x86-urgent-2022-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
Documentation/ABI: Mention retbleed vulnerability info file for sysfs
x86/sev: Mark snp_abort() noreturn
x86/sev: Don't use cc_platform_has() for early SEV-SNP calls
x86/boot: Don't propagate uninitialized boot_params->cc_blob_address
x86/cpu: Add new Raptor Lake CPU model number
x86/unwind/orc: Unwind ftrace trampolines with correct ORC entry
x86/nospec: Fix i386 RSB stuffing
x86/nospec: Unwreck the RSB stuffing
x86/bugs: Add "unknown" reporting for MMIO Stale Data
x86/entry: Fix entry_INT80_compat for Xen PV guests
x86/PAT: Have pat_enabled() properly reflect state when running on Xen
- Fix workaround for Cortex-A76 erratum #1286807
- Add workaround for AMU erratum #2457168 on Cortex-A510
- Drop reference to removed CONFIG_ARCH_RANDOM #define
- Fix parsing of the "rodata=full" cmdline option
- Fix a bunch of issues in the SME register state switching and sigframe code
- Fix incorrect extraction of the CTR_EL0.CWG register field
- Fix ACPI cache topology probing when the PPTT is not present
- Trivial comment and whitespace fixes
-----BEGIN PGP SIGNATURE-----
iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmMIr4YQHHdpbGxAa2Vy
bmVsLm9yZwAKCRC3rHDchMFjNOpcB/9PQM3DtbFcCFJeetFxJYVSPHikHfEj4SHY
H6NRNLA9JwwCqA1/RKbJQDpFS34JdL5UgtNuUFbg4zkH2aaeML7WKuRg396jv2ke
GmHpOkNeIaae0NQes3MLZQf+Heh2oimRYwlPXJc23vhAIZagwby3yR/9POrxKm6p
kamLIrmwU1mU7pwr7HEkRr1HOd/eOJ+q6P3UYrFlAgAp5PEfVGgcdjbHJ1gY1KWw
HwR6s2yWQ71+Aug9wMS56TQszBedyuIOeKezAI+IbThF6t/kQlfJna+hqvPp0zEE
mz1hss7ACpfDURdga35B1bE146aZ1SKEsXXaB853JI76K6D534Y9
=3oRd
-----END PGP SIGNATURE-----
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"A bumper crop of arm64 fixes for -rc3.
The largest change is fixing our parsing of the 'rodata=full' command
line option, which kstrtobool() started treating as 'rodata=false'.
The fix actually makes the parsing of that option much less fragile
and updates the documentation at the same time.
We still have a boot issue pending when KASLR is disabled at compile
time, but there's a fresh fix on the list which I'll send next week if
it holds up to testing.
Summary:
- Fix workaround for Cortex-A76 erratum #1286807
- Add workaround for AMU erratum #2457168 on Cortex-A510
- Drop reference to removed CONFIG_ARCH_RANDOM #define
- Fix parsing of the "rodata=full" cmdline option
- Fix a bunch of issues in the SME register state switching and sigframe code
- Fix incorrect extraction of the CTR_EL0.CWG register field
- Fix ACPI cache topology probing when the PPTT is not present
- Trivial comment and whitespace fixes"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64/sme: Don't flush SVE register state when handling SME traps
arm64/sme: Don't flush SVE register state when allocating SME storage
arm64/signal: Flush FPSIMD register state when disabling streaming mode
arm64/signal: Raise limit on stack frames
arm64/cache: Fix cache_type_cwg() for register generation
arm64/sysreg: Guard SYS_FIELD_ macros for asm
arm64/sysreg: Directly include bitfield.h
arm64: cacheinfo: Fix incorrect assignment of signed error value to unsigned fw_level
arm64: errata: add detection for AMEVCNTR01 incrementing incorrectly
arm64: fix rodata=full
arm64: Fix comment typo
docs/arm64: elf_hwcaps: unify newlines in HWCAP lists
arm64: adjust KASLR relocation after ARCH_RANDOM removal
arm64: Fix match_list for erratum 1286807 on Arm Cortex-A76
The workflow example code is not working since it got the file names
wrong. So fix this.
Fixes: b18402726bd1 ("Docs/admin-guide/mm/damon/usage: document DAMON sysfs interface")
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Kairui Song <kasong@tencent.com>
Link: https://lore.kernel.org/r/20220823114053.53305-1-ryncsn@gmail.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
A quick 'grep "5\.x" . -R' on Documentation shows that README.rst,
2.Process.rst and applying-patches.rst all mention the version number "5.x"
for kernel releases.
As the next release will be version 6.0, updating the version number to 6.x
in README.rst seems reasonable.
The description in 2.Process.rst is just a description of recent kernel
releases, it was last updated in the beginning of 2020, and can be
revisited at any time on a regular basis, independent of changing the
version number from 5 to 6. So, there is no need to update this document
now when transitioning from 5.x to 6.x numbering.
The document applying-patches.rst is probably obsolete for most users
anyway, a reader will sufficiently well understand the steps, even it
mentions version 5 rather than version 6. So, do not update that to a
version 6.x numbering scheme.
Update version number from 5.x to 6.x in README.rst only.
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Link: https://lore.kernel.org/r/20220824080836.23087-1-lukas.bulwahn@gmail.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Stephen Rothwell reported htmldocs warning when merging net-next tree:
Documentation/admin-guide/sysctl/net.rst:37: WARNING: Malformed table.
Text in column margin in table line 4.
========= =================== = ========== ==================
Directory Content Directory Content
========= =================== = ========== ==================
802 E802 protocol mptcp Multipath TCP
appletalk Appletalk protocol netfilter Network Filter
ax25 AX25 netrom NET/ROM
bridge Bridging rose X.25 PLP layer
core General parameter tipc TIPC
ethernet Ethernet protocol unix Unix domain sockets
ipv4 IP version 4 x25 X.25 protocol
ipv6 IP version 6
========= =================== = ========== ==================
The warning above is caused by cells in second "Content" column of
/proc/sys/net subdirectory table which are in column margin.
Align these cells against the column header to fix the warning.
Link: https://lore.kernel.org/linux-next/20220823134905.57ed08d5@canb.auug.org.au/
Fixes: 1202cdd665315c ("Remove DECnet support from kernel")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Link: https://lore.kernel.org/r/20220824035804.204322-1-bagasdotme@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>