IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
commit 003af997c8a945493859dd1a2d015cc9387ff27a upstream.
When trying to allocate a hugepage with no reserved ones free, it may be
allowed in case a number of overcommit hugepages was configured (using
/proc/sys/vm/nr_overcommit_hugepages) and that number wasn't reached.
This allows for a behavior of having extra hugepages allocated
dynamically, if there're resources for it. Some sysadmins even prefer not
reserving any hugepages and setting a big number of overcommit hugepages.
But while attempting to allocate overcommit hugepages in a multi node
system (either NUMA or mempolicy/cpuset) said allocations might randomly
fail even when there're resources available for the allocation.
This happens due to allowed_mems_nr() only accounting for the number of
free hugepages in the nodes the current process belongs to and the surplus
hugepage allocation is done so it can be allocated in any node. In case
one or more of the requested surplus hugepages are allocated in a
different node, the whole allocation will fail due allowed_mems_nr()
returning a lower value.
So allocate surplus hugepages in one of the nodes the current process
belongs to.
Easy way to reproduce this issue is to use a 2+ NUMA nodes system:
# echo 0 >/proc/sys/vm/nr_hugepages
# echo 1 >/proc/sys/vm/nr_overcommit_hugepages
# numactl -m0 ./tools/testing/selftests/mm/map_hugetlb 2
Repeating the execution of map_hugetlb test application will eventually
fail when the hugepage ends up allocated in a different node.
[aris@ruivo.org: v2]
Link: https://lkml.kernel.org/r/20240701212343.GG844599@cathedrallabs.org
Link: https://lkml.kernel.org/r/20240621190050.mhxwb65zn37doegp@redhat.com
Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Aristeu Rozanski <aris@ruivo.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Vishal Moola <vishal.moola@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 39705a6c29f8a2b93cf5b99528a55366c50014d1 upstream.
When a process' cred struct is replaced, this _almost_ always invokes
the cred_prepare LSM hook; but in one special case (when
KEYCTL_SESSION_TO_PARENT updates the parent's credentials), the
cred_transfer LSM hook is used instead. Landlock only implements the
cred_prepare hook, not cred_transfer, so KEYCTL_SESSION_TO_PARENT causes
all information on Landlock restrictions to be lost.
This basically means that a process with the ability to use the fork()
and keyctl() syscalls can get rid of all Landlock restrictions on
itself.
Fix it by adding a cred_transfer hook that does the same thing as the
existing cred_prepare hook. (Implemented by having hook_cred_prepare()
call hook_cred_transfer() so that the two functions are less likely to
accidentally diverge in the future.)
Cc: stable@kernel.org
Fixes: 385975dca53e ("landlock: Set up the security framework and manage credentials")
Signed-off-by: Jann Horn <jannh@google.com>
Link: https://lore.kernel.org/r/20240724-landlock-houdini-fix-v1-1-df89a4560ca3@google.com
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a8bd68e4329f9a0ad1b878733e0f80be6a971649 ]
When mtk-cmdq unbinds, a WARN_ON message with condition
pm_runtime_get_sync() < 0 occurs.
According to the call tracei below:
cmdq_mbox_shutdown
mbox_free_channel
mbox_controller_unregister
__devm_mbox_controller_unregister
...
The root cause can be deduced to be calling pm_runtime_get_sync() after
calling pm_runtime_disable() as observed below:
1. CMDQ driver uses devm_mbox_controller_register() in cmdq_probe()
to bind the cmdq device to the mbox_controller, so
devm_mbox_controller_unregister() will automatically unregister
the device bound to the mailbox controller when the device-managed
resource is removed. That means devm_mbox_controller_unregister()
and cmdq_mbox_shoutdown() will be called after cmdq_remove().
2. CMDQ driver also uses devm_pm_runtime_enable() in cmdq_probe() after
devm_mbox_controller_register(), so that devm_pm_runtime_disable()
will be called after cmdq_remove(), but before
devm_mbox_controller_unregister().
To fix this problem, cmdq_probe() needs to move
devm_mbox_controller_register() after devm_pm_runtime_enable() to make
devm_pm_runtime_disable() be called after
devm_mbox_controller_unregister().
Fixes: 623a6143a845 ("mailbox: mediatek: Add Mediatek CMDQ driver")
Signed-off-by: Jason-JH.Lin <jason-jh.lin@mediatek.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b5ef17917f3a797a7b12d1edd51f676554e44a07 ]
Two TXDB_V2 channels are used between Linux and System Manager(SM).
Channel0 for normal TX, Channel 1 for notification completion.
The TXDB_V2 trigger logic is using imx_mu_xcr_rmw which uses
read/modify/update logic.
Note: clear MUB GSR BITs, the MUA side GCR BITs will also got cleared per
hardware design.
Channel0 Linux
read GCR->modify GCR->write GCR->M33 SM->read GSR----->clear GSR
|-(1)-|
Channel1 Linux start in time slot(1)
read GCR->modify GCR->write GCR->M33 SM->read GSR->clear GSR
So Channel1 read GCR will read back the GCR that Channel0 wrote, because
M33 has not finish clear GSR, this means Channel1 GCR writing will
trigger Channel1 and Channel0 interrupt both which is wrong.
Channel0 will be freed(SCMI channel status set to FREE) in M33 SM when
processing the 1st Channel0 interrupt. So when 2nd interrupt trigger
(channel 0/1 trigger together), SM will see a freed Channel0, and report
protocol error.
To address the issue, not using read/modify/update logic, just use
write, because write 0 to GCR will be ignored. And after write MUA GCR,
wait the SM to clear MUB GSR by looping MUA GCR value.
Fixes: 5bfe4067d350 ("mailbox: imx: support channel type tx doorbell v2")
Reviewed-by: Ranjani Vaidyanathan <ranjani.vaidyanathan@nxp.com>
Signed-off-by: Peng Fan <peng.fan@nxp.com>
Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0a02bc0a34cd53c7fe5bf4bae6efb56ad47677fa ]
Multiple mailbox users can share one interrupt line. This flag was
mistakenly dropped as part of the FIFO removal. Mark the IRQ as shared.
Reported-by: Beleswar Padhi <b-padhi@ti.com>
Fixes: 3f58c1f4206f ("mailbox: omap: Remove kernel FIFO message queuing")
Signed-off-by: Andrew Davis <afd@ti.com>
Tested-by: Beleswar Padhi <b-padhi@ti.com>
Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a8631f6d6344d976096b1efafdb2fbb3111bd790 ]
ret variable was used to test reset status, get from
reset_control_status() call. But this variable was overwritten by
ti_sci_proc_get_status() a few lines bellow.
And as ti_sci_proc_get_status() returns 0 or a negative value (in this
latter case, followed by a return), the expression !ret was always true,
Clearly, this was not what was intended:
In the comment above it's said that "requires both local and module
resets to be deasserted"; if reset_control_status() returns 0 it means
that the reset line is deasserted.
So, it's pretty clear that the return value of reset_control_status()
was intended to be used instead of ti_sci_proc_get_status() return
value.
This could lead in an incorrect IPC-only mode detection if reset line is
asserted (so reset_control_status() return > 0) and c_state != 0 and
halted == 0.
In this case, the old code would have detected an IPC-only mode instead
of a mismatched mode.
Fixes: 1168af40b1ad ("remoteproc: k3-r5: Add support for IPC-only mode for all R5Fs")
Signed-off-by: Richard Genoud <richard.genoud@bootlin.com>
Reviewed-by: Hari Nagalla <hnagalla@ti.com>
Link: https://lore.kernel.org/r/20240621150058.319524-2-richard.genoud@bootlin.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 67ca3f98070ffdf308b91e08a477fcb1e9684ae8 ]
The current code doesn't check whether platform_get_resource_byname()
succeeded to get the l1tcm memory, which is optional, before attempting
to map it. This results in the following error message when it is
missing:
mtk-scp 10500000.scp: error -EINVAL: invalid resource (null)
Add a check so that the remapping is only attempted if the memory region
exists. This also allows to simplify the logic handling failure to
remap, since a failure then is always a failure.
Fixes: ca23ecfdbd44 ("remoteproc/mediatek: support L1TCM")
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Reviewed-by: Tzung-Bi Shih <tzungbi@kernel.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Link: https://lore.kernel.org/r/20240627-scp-invalid-resource-l1tcm-v1-1-7d221e6c495a@collabora.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f8b6c1eb76f73ed721facd58d0cfb08513aad34c ]
If iio_read_channel_processed() fails, 'val->intval' is not updated, but it
is still *1000 just after. So, in case of error, the *1000 accumulate and
'val->intval' becomes erroneous.
So instead of rescaling the value after the fact, use the dedicated scaling
API. This way the result is updated only when needed. In case of error, the
previous value is kept, unmodified.
This should also reduce any inaccuracies resulting from the scaling.
Finally, this is also slightly more efficient as it saves a function call
and a multiplication.
Fixes: fb24ccfbe1e0 ("power: supply: add Ingenic JZ47xx battery driver.")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Artur Rojek <contact@artur-rojek.eu>
Link: https://lore.kernel.org/r/51e49c18574003db1e20c9299061a5ecd1661a3c.1719121781.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3288757087cbb93b91019ba6b7de53a1908c9d48 ]
The ab8500_charger_get_[ac|vbus]_[current|voltage]() functions should
return an error code on error.
Up to now, an un-initialized value is returned.
This makes the error handling of the callers un-reliable.
Return the error code instead, to fix the issue.
Fixes: 97ab78bac5d0 ("power: supply: ab8500_charger: Convert to IIO ADC")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/f9f65642331c9e40aaebb888589db043db80b7eb.1719037737.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3892b11eac5aaaeefbf717f1953288b77759d9e2 ]
Currently, there are some places to set CSR.PRMD.PWE, the first one is
in hw_breakpoint_thread_switch() to enable user space singlestep via
checking TIF_SINGLESTEP, the second one is in hw_breakpoint_control() to
enable user space watchpoint. For the latter case, it should also check
TIF_LOAD_WATCH to make the logic correct and clear.
Fixes: c8e57ab0995c ("LoongArch: Trigger user-space watchpoints correctly")
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 72d04bdcf3f7d7e07d82f9757946f68802a7270a ]
Configuration for sbq:
depth=64, wake_batch=6, shift=6, map_nr=1
1. There are 64 requests in progress:
map->word = 0xFFFFFFFFFFFFFFFF
2. After all the 64 requests complete, and no more requests come:
map->word = 0xFFFFFFFFFFFFFFFF, map->cleared = 0xFFFFFFFFFFFFFFFF
3. Now two tasks try to allocate requests:
T1: T2:
__blk_mq_get_tag .
__sbitmap_queue_get .
sbitmap_get .
sbitmap_find_bit .
sbitmap_find_bit_in_word .
__sbitmap_get_word -> nr=-1 __blk_mq_get_tag
sbitmap_deferred_clear __sbitmap_queue_get
/* map->cleared=0xFFFFFFFFFFFFFFFF */ sbitmap_find_bit
if (!READ_ONCE(map->cleared)) sbitmap_find_bit_in_word
return false; __sbitmap_get_word -> nr=-1
mask = xchg(&map->cleared, 0) sbitmap_deferred_clear
atomic_long_andnot() /* map->cleared=0 */
if (!(map->cleared))
return false;
/*
* map->cleared is cleared by T1
* T2 fail to acquire the tag
*/
4. T2 is the sole tag waiter. When T1 puts the tag, T2 cannot be woken
up due to the wake_batch being set at 6. If no more requests come, T1
will wait here indefinitely.
This patch achieves two purposes:
1. Check on ->cleared and update on both ->cleared and ->word need to
be done atomically, and using spinlock could be the simplest solution.
2. Add extra check in sbitmap_deferred_clear(), to identify whether
->word has free bits.
Fixes: ea86ea2cdced ("sbitmap: ammortize cost of clearing bits")
Signed-off-by: Yang Yang <yang.yang@vivo.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20240716082644.659566-1-yang.yang@vivo.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6ab42fe21c84d72da752923b4bd7075344f4a362 ]
pgalloc_tag_sub() might call page_ext_put() using a page different from
the one used in page_ext_get() call. This does not pose an issue since
page_ext_put() ignores this parameter as long as it's non-NULL but
technically this is wrong. Fix it by storing the original page used in
page_ext_get() and passing it to page_ext_put().
Link: https://lkml.kernel.org/r/20240711220457.1751071-3-surenb@google.com
Fixes: be25d1d4e822 ("mm: create new codetag references during page splitting")
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Sourav Panda <souravpanda@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8e64d2356cbc800b4cd0e3e614797f76bcf0cdb8 ]
dasd_add_busid() can return an error via ERR_PTR() if an allocation
fails. However, two callsites in dasd_copy_pair_store() do not check
the result, potentially resulting in a NULL pointer dereference. Fix
this by checking the result with IS_ERR() and returning the error up
the stack.
Fixes: a91ff09d39f9b ("s390/dasd: add copy pair setup")
Signed-off-by: Carlos López <clopez@suse.de>
Signed-off-by: Stefan Haberland <sth@linux.ibm.com>
Link: https://lore.kernel.org/r/20240715112434.2111291-3-sth@linux.ibm.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7ea981070fd9ec24bc0111636038193aebb0289c ]
set_huge_pte_at() expects the size of the hugepage as an int, not the
psize which is the index of the page definition in table mmu_psize_defs[]
Link: https://lkml.kernel.org/r/97f2090011e25d99b6b0aae73e22e1b921c5d1fb.1719928057.git.christophe.leroy@csgroup.eu
Fixes: 935d4f0c6dc8 ("mm: hugetlb: add huge page size param to set_huge_pte_at()")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fff42f213824fa434a4b6cf906b4331fe6e9302b ]
The commit 1bbe254e4336 ("md-cluster: check for timeout while a
new disk adding") is correct in terms of code syntax but not
suite real clustered code logic.
When a timeout occurs while adding a new disk, if recv_daemon()
bypasses the unlock for ack_lockres:CR, another node will be waiting
to grab EX lock. This will cause the cluster to hang indefinitely.
How to fix:
1. In dlm_lock_sync(), change the wait behaviour from forever to a
timeout, This could avoid the hanging issue when another node
fails to handle cluster msg. Another result of this change is
that if another node receives an unknown msg (e.g. a new msg_type),
the old code will hang, whereas the new code will timeout and fail.
This could help cluster_md handle new msg_type from different
nodes with different kernel/module versions (e.g. The user only
updates one leg's kernel and monitors the stability of the new
kernel).
2. The old code for __sendmsg() always returns 0 (success) under the
design (must successfully unlock ->message_lockres). This commit
makes this function return an error number when an error occurs.
Fixes: 1bbe254e4336 ("md-cluster: check for timeout while a new disk adding")
Signed-off-by: Heming Zhao <heming.zhao@suse.com>
Reviewed-by: Su Yue <glass.su@suse.com>
Acked-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20240709104120.22243-1-heming.zhao@suse.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit eb95678ee930d67d79fc83f0a700245ae7230455 ]
We skip the run_truncate_head call also for $MFT::$ATTR_BITMAP.
Otherwise wnd_map()/run_lookup_entry will not find the disk position for the bitmap parts.
Fixes: 0e5b044cbf3a ("fs/ntfs3: Refactoring attr_set_size to restore after errors")
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d392e85fd1e8d58e460c17ca7d0d5c157848d9c1 ]
The 'nocase' option was mistakenly added as fsparam_flag_no
with the 'no' prefix, causing the case-insensitive mode to require
the 'nonocase' option to be enabled.
Fixes: a3a956c78efa ("fs/ntfs3: Add option "nocase"")
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 463927a8902a9f22c3633960119410f57d4c8920 ]
`rtc_add_offset()` is called by `__rtc_read_time()`
and `__rtc_read_alarm()` to add the RTC's offset to
the raw read-outs from the device drivers. However,
in the latter case, a fix-up algorithm is run if
the RTC device does not report a full `struct rtc_time`
alarm value. In that case, the offset was forgot to be
added.
Fixes: fd6792bb022e ("rtc: fix alarm read and set offset")
Signed-off-by: Csókás, Bence <csokas.bence@prolan.hu>
Link: https://lore.kernel.org/r/20240619140451.2800578-1-csokas.bence@prolan.hu
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0f3819e8c483771a59cf9d3190cd68a7a990083c ]
According to the C standard 3.4.3p3, the result of signed integer overflow
is undefined. The macro nilfs_cnt32_ge(), which compares two sequence
numbers, uses signed integer subtraction that can overflow, and therefore
the result of the calculation may differ from what is expected due to
undefined behavior in different environments.
Similar to an earlier change to the jiffies-related comparison macros in
commit 5a581b367b5d ("jiffies: Avoid undefined behavior from signed
overflow"), avoid this potential issue by changing the definition of the
macro to perform the subtraction as unsigned integers, then cast the
result to a signed integer for comparison.
Link: https://lkml.kernel.org/r/20130727225828.GA11864@linux.vnet.ibm.com
Link: https://lkml.kernel.org/r/20240702183512.6390-1-konishi.ryusuke@gmail.com
Fixes: 9ff05123e3bf ("nilfs2: segment constructor")
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 34ec4344a5dabbb39e23e8daf30779892c0211a6 ]
Patch series "selftests/damon: test DAMOS tried regions and
{min,max}_nr_regions".
This patch series fix a minor issue in a program for DAMON selftest, and
implement new functionality selftests for DAMOS tried regions and
{min,max}_nr_regions. The test for max_nr_regions also test the recovery
from online tuning-caused limit violation, which was fixed by a previous
patch [1] titled "mm/damon/core: merge regions aggressively when
max_nr_regions is unmet".
The first patch fixes a minor problem in the articial memory access
pattern generator for tests. Following 3 patches (2-4) implement schemes
tried regions test. Then a couple of patches (5-6) implementing static
setup based {min,max}_nr_regions functionality test follows. Final two
patches (7-8) implement dynamic max_nr_regions update test.
[1] https://lore.kernel.org/20240624210650.53960C2BBFC@smtp.kernel.org
This patch (of 8):
'access_memory' is an artificial memory access pattern generator for DAMON
tests. It creates and accesses memory regions that the user specified the
number and size via the command line. However, real access part of the
program ignores the user-specified size of each region. Instead, it uses
a hard-coded value, 10 MiB. Fix it to use user-defined size.
Note that all existing 'access_memory' users are setting the region size
as 10 MiB. Hence no real problem has happened so far.
Link: https://lkml.kernel.org/r/20240625180538.73134-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20240625180538.73134-2-sj@kernel.org
Fixes: b5906f5f7359 ("selftests/damon: add a test for update_schemes_tried_regions sysfs command")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2c1f057e5be63e890f2dd89e4c25ab5eef084a91 ]
We added PM_MMAP_EXCLUSIVE in 2015 via commit 77bb499bb60f ("pagemap: add
mmap-exclusive bit for marking pages mapped only here"), when THPs could
not be partially mapped and page_mapcount() returned something that was
true for all pages of the THP.
In 2016, we added support for partially mapping THPs via commit
53f9263baba6 ("mm: rework mapcount accounting to enable 4k mapping of
THPs") but missed to determine PM_MMAP_EXCLUSIVE as well per page.
Checking page_mapcount() on the head page does not tell the whole story.
We should check each individual page. In a future without per-page
mapcounts it will be different, but we'll change that to be consistent
with PTE-mapped THPs once we deal with that.
Link: https://lkml.kernel.org/r/20240607122357.115423-4-david@redhat.com
Fixes: 53f9263baba6 ("mm: rework mapcount accounting to enable 4k mapping of THPs")
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Lance Yang <ioworker0@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit da7f31ed0f4df8f61e8195e527aa83dd54896ba3 ]
Relying on the mapcount for non-present PTEs that reference pages doesn't
make any sense: they are not accounted in the mapcount, so page_mapcount()
== 1 won't return the result we actually want to know.
While we don't check the mapcount for migration entries already, we could
end up checking it for swap, hwpoison, device exclusive, ... entries,
which we really shouldn't.
There is one exception: device private entries, which we consider
fake-present (e.g., incremented the mapcount). But we won't care about
that for now for PM_MMAP_EXCLUSIVE, because indicating PM_SWAP for them
although they are fake-present already sounds suspiciously wrong.
Let's never indicate PM_MMAP_EXCLUSIVE without PM_PRESENT.
Link: https://lkml.kernel.org/r/20240607122357.115423-3-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Lance Yang <ioworker0@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Stable-dep-of: 2c1f057e5be6 ("fs/proc/task_mmu: properly detect PM_MMAP_EXCLUSIVE per page of PMD-mapped THPs")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3f9f022e975d930709848a86a1c79775b0585202 ]
Patch series "fs/proc: move page_mapcount() to fs/proc/internal.h".
With all other page_mapcount() users in the tree gone, move
page_mapcount() to fs/proc/internal.h, rename it and extend the
documentation to prevent future (ab)use.
... of course, I find some issues while working on that code that I sort
first ;)
We'll now only end up calling page_mapcount() [now
folio_precise_page_mapcount()] on pages mapped via present page table
entries. Except for /proc/kpagecount, that still does questionable
things, but we'll leave that legacy interface as is for now.
Did a quick sanity check. Likely we would want some better selfestest for
/proc/$/pagemap + smaps. I'll see if I can find some time to write some
more.
This patch (of 6):
Looks like we never taught pagemap_pmd_range() about the existence of
PMD-mapped file THPs. Seems to date back to the times when we first added
support for non-anon THPs in the form of shmem THP.
Link: https://lkml.kernel.org/r/20240607122357.115423-1-david@redhat.com
Link: https://lkml.kernel.org/r/20240607122357.115423-2-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Fixes: 800d8c63b2e9 ("shmem: add huge pages support")
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Lance Yang <ioworker0@gmail.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bfd2428f3a80647af681df4793e473258aa755da ]
The Pin Multiplex attachment in Rev.1.10 of the R-Car V4H Series
Hardware User's Manual still has two alternate pins named both TCLK3
and TCLK4. To differentiate, the pin control driver uses "TCLK[34]" and
"TCLK[34]_X". In addition, there are alternate pins without suffix, and
with an "_A" or "_B" suffix.
Increase uniformity by adopting R-Car V4M naming:
- Rename "TCLK2_B" to "TCLK2_C",
- Rename "TCLK[12]_A" to "TCLK[12]_B",
- Rename "TCLK[12]" to "TCLK[12]_A",
- Rename "TCLK[34]_A" to "TCLK[34]_C",
- Rename "TCLK[34]_X" to "TCLK[34]_A",
- Rename "TCLK[34]" to "TCLK[34]_B".
Fixes: ad9bb2fec66262b0 ("pinctrl: renesas: Initial R8A779G0 (R-Car V4H) PFC support")
Fixes: 0df46188a58895e1 ("pinctrl: renesas: r8a779g0: Add missing TCLKx_A/TCLKx_B/TCLKx_X")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/2845ff1f8fe1fd8d23d2f307ad5e8eb8243da608.1717754960.git.geert+renesas@glider.be
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5350f38150a171322b50c0a48efa671885f87050 ]
(H)SCIF instance 3 has two alternate pin groups: "hscif3" and
"hscif3_a", resp. "scif3" and "scif3_a", but the actual meanings of the
pins within the groups do not match.
Increase uniformity by adopting R-Car V4M naming:
- Rename "hscif3_a" to "hscif3_b",
- Rename "hscif3" to "hscif3_a",
- Rename "scif3" to "scif3_b".
While at it, remove unneeded separators.
Fixes: ad9bb2fec66262b0 ("pinctrl: renesas: Initial R8A779G0 (R-Car V4H) PFC support")
Fixes: 050442ae4c74f830 ("pinctrl: renesas: r8a779g0: Add pins, groups and functions")
Fixes: 213b713255defaa6 ("pinctrl: renesas: r8a779g0: Add missing HSCIF3_A")
Fixes: 49e4697656bdd1cd ("pinctrl: renesas: r8a779g0: Add missing SCIF3")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/61fdde58e369e8070ffd3c5811c089e6219c7ecc.1717754960.git.geert+renesas@glider.be
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3cf834a1669ea433aeee4c82c642776899c87451 ]
The Pin Multiplex attachment in Rev.1.10 of the R-Car V4H Series
Hardware User's Manual still has two alternate pin groups (GP0_14-18
and GP1_6-10) each named both HSCIF1 and SCIF1. To differentiate, the
pin control driver uses "(h)scif1" and "(h)scif1_x", which were
considered temporary names until the conflict was sorted out.
Fix this by adopting R-Car V4M naming:
- Rename "(h)scif1" to "(h)scif1_a",
- Rename "(h)scif1_x" to "(h)scif1_b".
Adopt the R-Car V4M naming "(h)scif1_a" and "(h)scif1_b" to increase
uniformity.
While at it, remove unneeded separators.
Fixes: ad9bb2fec66262b0 ("pinctrl: renesas: Initial R8A779G0 (R-Car V4H) PFC support")
Fixes: 050442ae4c74f830 ("pinctrl: renesas: r8a779g0: Add pins, groups and functions")
Fixes: cf4f7891847bc558 ("pinctrl: renesas: r8a779g0: Add missing HSCIF1_X")
Fixes: 9c151c2be92becf2 ("pinctrl: renesas: r8a779g0: Add missing SCIF1_X")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/5009130d1867e12abf9b231c8838fd05e2b28bee.1717754960.git.geert+renesas@glider.be
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4976d61ca39ce51f422e094de53b46e2e3ac5c0d ]
The Pin Multiplex attachment in Rev.1.10 of the R-Car V4H Series
Hardware User's Manual still has two alternate pins named both
"FXR_TXEN[AB]". To differentiate, the pin control driver uses
"FXR_TXEN[AB]" and "FXR_TXEN[AB]_X", which were considered temporary
names until the conflict was sorted out.
Fix this by adopting R-Car V4M naming:
- Rename "FXR_TXEN[AB]" to "FXR_TXEN[AB]_A",
- Rename "FXR_TXEN[AB]_X" to "FXR_TXEN[AB]_B".
Fixes: ad9bb2fec66262b0 ("pinctrl: renesas: Initial R8A779G0 (R-Car V4H) PFC support")
Fixes: 1c2646b5cebfff07 ("pinctrl: renesas: r8a779g0: Add missing FlexRay")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/5e1e9abb46c311d4c54450d991072d6d0e66f14c.1717754960.git.geert+renesas@glider.be
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 94d4154792abf30ee6081d35beaeef035816e294 ]
struct rtc_device is allocated twice in probe(), once with
devm_kzalloc(), and then with devm_rtc_allocate_device().
The allocation with devm_kzalloc() is lost and superfluous.
Fixes: 9f67c1e63976 ("rtc: tps6594: Add driver for TPS6594 RTC")
Signed-off-by: Richard Genoud <richard.genoud@bootlin.com>
Link: https://lore.kernel.org/r/20240618141851.1810000-2-richard.genoud@bootlin.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f28d0866d8ff798aa497971f93d0cc58f442d946 ]
Clusters allocated for Extended Attributes, must be freed
when rolling back inode creation.
Fixes: 82cae269cfa95 ("fs/ntfs3: Add initialization of super block")
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 69505fe98f198ee813898cbcaf6770949636430b ]
The issue was detected due to xfstest 465 failing.
Fixes: 4342306f0f0d ("fs/ntfs3: Add file operations and implementation")
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7f500f2011c0bbb6e1cacab74b4c99222e60248e ]
of_get_next_child() will increase refcount of the returned node, need
use of_node_put() on it when done.
Per current implementation, 'child' will be override by
for_each_child_of_node(np, child), so use of_get_child_count to avoid
refcount leakage.
Fixes: 17723111e64f ("pinctrl: add pinctrl-mxs support")
Signed-off-by: Peng Fan <peng.fan@nxp.com>
Link: https://lore.kernel.org/20240504-pinctrl-cleanup-v2-18-26c5f2dc1181@nxp.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9b401f4a7170125365160c9af267a41ff6b39001 ]
This driver calls pinctrl_register_and_init() which is not
devm_ managed, it will leads memory leak if pinctrl_enable()
fails. Replace it with devm_pinctrl_register_and_init().
And add missing of_node_put() in the error path.
Fixes: 5038a66dad01 ("pinctrl: core: delete incorrect free in pinctrl_enable()")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://lore.kernel.org/r/20240606023704.3931561-4-yangyingliang@huawei.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8f773bfbdd428819328a2d185976cfc6ae811cd3 ]
This driver calls pinctrl_register_and_init() which is not
devm_ managed, it will leads memory leak if pinctrl_enable()
fails. Replace it with devm_pinctrl_register_and_init().
And call pcs_free_resources() if pinctrl_enable() fails.
Fixes: 5038a66dad01 ("pinctrl: core: delete incorrect free in pinctrl_enable()")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://lore.kernel.org/r/20240606023704.3931561-3-yangyingliang@huawei.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ae1cf4759972c5fe665ee4c5e0c29de66fe3cf4a ]
In devm_pinctrl_register(), if pinctrl_enable() fails in pinctrl_register(),
the "pctldev" has not been added to dev resources, so devm_pinctrl_dev_release()
can not be called, it leads memory leak.
Introduce pinctrl_uninit_controller(), call it in the error path to free memory.
Fixes: 5038a66dad01 ("pinctrl: core: delete incorrect free in pinctrl_enable()")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://lore.kernel.org/r/20240606023704.3931561-2-yangyingliang@huawei.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0f9579d9e0331b6255132ac06bdf2c0a01cceb90 ]
After switching from pages to folio [1], it became evident that
the initialization of .dirty_folio for page cache operations was missed for
compressed files.
[1] https://lore.kernel.org/ntfs3/20240422193203.3534108-1-willy@infradead.org
Fixes: 82cae269cfa95 ("fs/ntfs3: Add initialization of super block")
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 24c5100aceedcd47af89aaa404d4c96cd2837523 ]
An additional condition causes the mft record to be read from disk
and get the file type dt_type.
Fixes: 22457c047ed97 ("fs/ntfs3: Modified fix directory element type detection")
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>