61307b7be4
documented (hopefully adequately) in the respective changelogs. Notable series include: - Lucas Stach has provided some page-mapping cleanup/consolidation/maintainability work in the series "mm/treewide: Remove pXd_huge() API". - In the series "Allow migrate on protnone reference with MPOL_PREFERRED_MANY policy", Donet Tom has optimized mempolicy's MPOL_PREFERRED_MANY mode, yielding almost doubled performance in one test. - In their series "Memory allocation profiling" Kent Overstreet and Suren Baghdasaryan have contributed a means of determining (via /proc/allocinfo) whereabouts in the kernel memory is being allocated: number of calls and amount of memory. - Matthew Wilcox has provided the series "Various significant MM patches" which does a number of rather unrelated things, but in largely similar code sites. - In his series "mm: page_alloc: freelist migratetype hygiene" Johannes Weiner has fixed the page allocator's handling of migratetype requests, with resulting improvements in compaction efficiency. - In the series "make the hugetlb migration strategy consistent" Baolin Wang has fixed a hugetlb migration issue, which should improve hugetlb allocation reliability. - Liu Shixin has hit an I/O meltdown caused by readahead in a memory-tight memcg. Addressed in the series "Fix I/O high when memory almost met memcg limit". - In the series "mm/filemap: optimize folio adding and splitting" Kairui Song has optimized pagecache insertion, yielding ~10% performance improvement in one test. - Baoquan He has cleaned up and consolidated the early zone initialization code in the series "mm/mm_init.c: refactor free_area_init_core()". - Baoquan has also redone some MM initializatio code in the series "mm/init: minor clean up and improvement". - MM helper cleanups from Christoph Hellwig in his series "remove follow_pfn". - More cleanups from Matthew Wilcox in the series "Various page->flags cleanups". - Vlastimil Babka has contributed maintainability improvements in the series "memcg_kmem hooks refactoring". - More folio conversions and cleanups in Matthew Wilcox's series "Convert huge_zero_page to huge_zero_folio" "khugepaged folio conversions" "Remove page_idle and page_young wrappers" "Use folio APIs in procfs" "Clean up __folio_put()" "Some cleanups for memory-failure" "Remove page_mapping()" "More folio compat code removal" - David Hildenbrand chipped in with "fs/proc/task_mmu: convert hugetlb functions to work on folis". - Code consolidation and cleanup work related to GUP's handling of hugetlbs in Peter Xu's series "mm/gup: Unify hugetlb, part 2". - Rick Edgecombe has developed some fixes to stack guard gaps in the series "Cover a guard gap corner case". - Jinjiang Tu has fixed KSM's behaviour after a fork+exec in the series "mm/ksm: fix ksm exec support for prctl". - Baolin Wang has implemented NUMA balancing for multi-size THPs. This is a simple first-cut implementation for now. The series is "support multi-size THP numa balancing". - Cleanups to vma handling helper functions from Matthew Wilcox in the series "Unify vma_address and vma_pgoff_address". - Some selftests maintenance work from Dev Jain in the series "selftests/mm: mremap_test: Optimizations and style fixes". - Improvements to the swapping of multi-size THPs from Ryan Roberts in the series "Swap-out mTHP without splitting". - Kefeng Wang has significantly optimized the handling of arm64's permission page faults in the series "arch/mm/fault: accelerate pagefault when badaccess" "mm: remove arch's private VM_FAULT_BADMAP/BADACCESS" - GUP cleanups from David Hildenbrand in "mm/gup: consistently call it GUP-fast". - hugetlb fault code cleanups from Vishal Moola in "Hugetlb fault path to use struct vm_fault". - selftests build fixes from John Hubbard in the series "Fix selftests/mm build without requiring "make headers"". - Memory tiering fixes/improvements from Ho-Ren (Jack) Chuang in the series "Improved Memory Tier Creation for CPUless NUMA Nodes". Fixes the initialization code so that migration between different memory types works as intended. - David Hildenbrand has improved follow_pte() and fixed an errant driver in the series "mm: follow_pte() improvements and acrn follow_pte() fixes". - David also did some cleanup work on large folio mapcounts in his series "mm: mapcount for large folios + page_mapcount() cleanups". - Folio conversions in KSM in Alex Shi's series "transfer page to folio in KSM". - Barry Song has added some sysfs stats for monitoring multi-size THP's in the series "mm: add per-order mTHP alloc and swpout counters". - Some zswap cleanups from Yosry Ahmed in the series "zswap same-filled and limit checking cleanups". - Matthew Wilcox has been looking at buffer_head code and found the documentation to be lacking. The series is "Improve buffer head documentation". - Multi-size THPs get more work, this time from Lance Yang. His series "mm/madvise: enhance lazyfreeing with mTHP in madvise_free" optimizes the freeing of these things. - Kemeng Shi has added more userspace-visible writeback instrumentation in the series "Improve visibility of writeback". - Kemeng Shi then sent some maintenance work on top in the series "Fix and cleanups to page-writeback". - Matthew Wilcox reduces mmap_lock traffic in the anon vma code in the series "Improve anon_vma scalability for anon VMAs". Intel's test bot reported an improbable 3x improvement in one test. - SeongJae Park adds some DAMON feature work in the series "mm/damon: add a DAMOS filter type for page granularity access recheck" "selftests/damon: add DAMOS quota goal test" - Also some maintenance work in the series "mm/damon/paddr: simplify page level access re-check for pageout" "mm/damon: misc fixes and improvements" - David Hildenbrand has disabled some known-to-fail selftests ni the series "selftests: mm: cow: flag vmsplice() hugetlb tests as XFAIL". - memcg metadata storage optimizations from Shakeel Butt in "memcg: reduce memory consumption by memcg stats". - DAX fixes and maintenance work from Vishal Verma in the series "dax/bus.c: Fixups for dax-bus locking". -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZkgQYwAKCRDdBJ7gKXxA jrdKAP9WVJdpEcXxpoub/vVE0UWGtffr8foifi9bCwrQrGh5mgEAx7Yf0+d/oBZB nvA4E0DcPrUAFy144FNM0NTCb7u9vAw= =V3R/ -----END PGP SIGNATURE----- Merge tag 'mm-stable-2024-05-17-19-19' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull mm updates from Andrew Morton: "The usual shower of singleton fixes and minor series all over MM, documented (hopefully adequately) in the respective changelogs. Notable series include: - Lucas Stach has provided some page-mapping cleanup/consolidation/ maintainability work in the series "mm/treewide: Remove pXd_huge() API". - In the series "Allow migrate on protnone reference with MPOL_PREFERRED_MANY policy", Donet Tom has optimized mempolicy's MPOL_PREFERRED_MANY mode, yielding almost doubled performance in one test. - In their series "Memory allocation profiling" Kent Overstreet and Suren Baghdasaryan have contributed a means of determining (via /proc/allocinfo) whereabouts in the kernel memory is being allocated: number of calls and amount of memory. - Matthew Wilcox has provided the series "Various significant MM patches" which does a number of rather unrelated things, but in largely similar code sites. - In his series "mm: page_alloc: freelist migratetype hygiene" Johannes Weiner has fixed the page allocator's handling of migratetype requests, with resulting improvements in compaction efficiency. - In the series "make the hugetlb migration strategy consistent" Baolin Wang has fixed a hugetlb migration issue, which should improve hugetlb allocation reliability. - Liu Shixin has hit an I/O meltdown caused by readahead in a memory-tight memcg. Addressed in the series "Fix I/O high when memory almost met memcg limit". - In the series "mm/filemap: optimize folio adding and splitting" Kairui Song has optimized pagecache insertion, yielding ~10% performance improvement in one test. - Baoquan He has cleaned up and consolidated the early zone initialization code in the series "mm/mm_init.c: refactor free_area_init_core()". - Baoquan has also redone some MM initializatio code in the series "mm/init: minor clean up and improvement". - MM helper cleanups from Christoph Hellwig in his series "remove follow_pfn". - More cleanups from Matthew Wilcox in the series "Various page->flags cleanups". - Vlastimil Babka has contributed maintainability improvements in the series "memcg_kmem hooks refactoring". - More folio conversions and cleanups in Matthew Wilcox's series: "Convert huge_zero_page to huge_zero_folio" "khugepaged folio conversions" "Remove page_idle and page_young wrappers" "Use folio APIs in procfs" "Clean up __folio_put()" "Some cleanups for memory-failure" "Remove page_mapping()" "More folio compat code removal" - David Hildenbrand chipped in with "fs/proc/task_mmu: convert hugetlb functions to work on folis". - Code consolidation and cleanup work related to GUP's handling of hugetlbs in Peter Xu's series "mm/gup: Unify hugetlb, part 2". - Rick Edgecombe has developed some fixes to stack guard gaps in the series "Cover a guard gap corner case". - Jinjiang Tu has fixed KSM's behaviour after a fork+exec in the series "mm/ksm: fix ksm exec support for prctl". - Baolin Wang has implemented NUMA balancing for multi-size THPs. This is a simple first-cut implementation for now. The series is "support multi-size THP numa balancing". - Cleanups to vma handling helper functions from Matthew Wilcox in the series "Unify vma_address and vma_pgoff_address". - Some selftests maintenance work from Dev Jain in the series "selftests/mm: mremap_test: Optimizations and style fixes". - Improvements to the swapping of multi-size THPs from Ryan Roberts in the series "Swap-out mTHP without splitting". - Kefeng Wang has significantly optimized the handling of arm64's permission page faults in the series "arch/mm/fault: accelerate pagefault when badaccess" "mm: remove arch's private VM_FAULT_BADMAP/BADACCESS" - GUP cleanups from David Hildenbrand in "mm/gup: consistently call it GUP-fast". - hugetlb fault code cleanups from Vishal Moola in "Hugetlb fault path to use struct vm_fault". - selftests build fixes from John Hubbard in the series "Fix selftests/mm build without requiring "make headers"". - Memory tiering fixes/improvements from Ho-Ren (Jack) Chuang in the series "Improved Memory Tier Creation for CPUless NUMA Nodes". Fixes the initialization code so that migration between different memory types works as intended. - David Hildenbrand has improved follow_pte() and fixed an errant driver in the series "mm: follow_pte() improvements and acrn follow_pte() fixes". - David also did some cleanup work on large folio mapcounts in his series "mm: mapcount for large folios + page_mapcount() cleanups". - Folio conversions in KSM in Alex Shi's series "transfer page to folio in KSM". - Barry Song has added some sysfs stats for monitoring multi-size THP's in the series "mm: add per-order mTHP alloc and swpout counters". - Some zswap cleanups from Yosry Ahmed in the series "zswap same-filled and limit checking cleanups". - Matthew Wilcox has been looking at buffer_head code and found the documentation to be lacking. The series is "Improve buffer head documentation". - Multi-size THPs get more work, this time from Lance Yang. His series "mm/madvise: enhance lazyfreeing with mTHP in madvise_free" optimizes the freeing of these things. - Kemeng Shi has added more userspace-visible writeback instrumentation in the series "Improve visibility of writeback". - Kemeng Shi then sent some maintenance work on top in the series "Fix and cleanups to page-writeback". - Matthew Wilcox reduces mmap_lock traffic in the anon vma code in the series "Improve anon_vma scalability for anon VMAs". Intel's test bot reported an improbable 3x improvement in one test. - SeongJae Park adds some DAMON feature work in the series "mm/damon: add a DAMOS filter type for page granularity access recheck" "selftests/damon: add DAMOS quota goal test" - Also some maintenance work in the series "mm/damon/paddr: simplify page level access re-check for pageout" "mm/damon: misc fixes and improvements" - David Hildenbrand has disabled some known-to-fail selftests ni the series "selftests: mm: cow: flag vmsplice() hugetlb tests as XFAIL". - memcg metadata storage optimizations from Shakeel Butt in "memcg: reduce memory consumption by memcg stats". - DAX fixes and maintenance work from Vishal Verma in the series "dax/bus.c: Fixups for dax-bus locking"" * tag 'mm-stable-2024-05-17-19-19' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (426 commits) memcg, oom: cleanup unused memcg_oom_gfp_mask and memcg_oom_order selftests/mm: hugetlb_madv_vs_map: avoid test skipping by querying hugepage size at runtime mm/hugetlb: add missing VM_FAULT_SET_HINDEX in hugetlb_wp mm/hugetlb: add missing VM_FAULT_SET_HINDEX in hugetlb_fault selftests: cgroup: add tests to verify the zswap writeback path mm: memcg: make alloc_mem_cgroup_per_node_info() return bool mm/damon/core: fix return value from damos_wmark_metric_value mm: do not update memcg stats for NR_{FILE/SHMEM}_PMDMAPPED selftests: cgroup: remove redundant enabling of memory controller Docs/mm/damon/maintainer-profile: allow posting patches based on damon/next tree Docs/mm/damon/maintainer-profile: change the maintainer's timezone from PST to PT Docs/mm/damon/design: use a list for supported filters Docs/admin-guide/mm/damon/usage: fix wrong schemes effective quota update command Docs/admin-guide/mm/damon/usage: fix wrong example of DAMOS filter matching sysfs file selftests/damon: classify tests for functionalities and regressions selftests/damon/_damon_sysfs: use 'is' instead of '==' for 'None' selftests/damon/_damon_sysfs: find sysfs mount point from /proc/mounts selftests/damon/_damon_sysfs: check errors from nr_schemes file reads mm/damon/core: initialize ->esz_bp from damos_quota_init_priv() selftests/damon: add a test for DAMOS quota goal ...
187 lines
4.6 KiB
C
187 lines
4.6 KiB
C
// SPDX-License-Identifier: GPL-2.0
|
|
/*
|
|
* arch/alpha/lib/checksum.c
|
|
*
|
|
* This file contains network checksum routines that are better done
|
|
* in an architecture-specific manner due to speed..
|
|
* Comments in other versions indicate that the algorithms are from RFC1071
|
|
*
|
|
* accelerated versions (and 21264 assembly versions ) contributed by
|
|
* Rick Gorton <rick.gorton@alpha-processor.com>
|
|
*/
|
|
|
|
#include <linux/module.h>
|
|
#include <linux/string.h>
|
|
#include <net/checksum.h>
|
|
|
|
#include <asm/byteorder.h>
|
|
#include <asm/checksum.h>
|
|
|
|
static inline unsigned short from64to16(unsigned long x)
|
|
{
|
|
/* Using extract instructions is a bit more efficient
|
|
than the original shift/bitmask version. */
|
|
|
|
union {
|
|
unsigned long ul;
|
|
unsigned int ui[2];
|
|
unsigned short us[4];
|
|
} in_v, tmp_v, out_v;
|
|
|
|
in_v.ul = x;
|
|
tmp_v.ul = (unsigned long) in_v.ui[0] + (unsigned long) in_v.ui[1];
|
|
|
|
/* Since the bits of tmp_v.sh[3] are going to always be zero,
|
|
we don't have to bother to add that in. */
|
|
out_v.ul = (unsigned long) tmp_v.us[0] + (unsigned long) tmp_v.us[1]
|
|
+ (unsigned long) tmp_v.us[2];
|
|
|
|
/* Similarly, out_v.us[2] is always zero for the final add. */
|
|
return out_v.us[0] + out_v.us[1];
|
|
}
|
|
|
|
/*
|
|
* computes the checksum of the TCP/UDP pseudo-header
|
|
* returns a 16-bit checksum, already complemented.
|
|
*/
|
|
__sum16 csum_tcpudp_magic(__be32 saddr, __be32 daddr,
|
|
__u32 len, __u8 proto, __wsum sum)
|
|
{
|
|
return (__force __sum16)~from64to16(
|
|
(__force u64)saddr + (__force u64)daddr +
|
|
(__force u64)sum + ((len + proto) << 8));
|
|
}
|
|
EXPORT_SYMBOL(csum_tcpudp_magic);
|
|
|
|
__wsum csum_tcpudp_nofold(__be32 saddr, __be32 daddr,
|
|
__u32 len, __u8 proto, __wsum sum)
|
|
{
|
|
unsigned long result;
|
|
|
|
result = (__force u64)saddr + (__force u64)daddr +
|
|
(__force u64)sum + ((len + proto) << 8);
|
|
|
|
/* Fold down to 32-bits so we don't lose in the typedef-less
|
|
network stack. */
|
|
/* 64 to 33 */
|
|
result = (result & 0xffffffff) + (result >> 32);
|
|
/* 33 to 32 */
|
|
result = (result & 0xffffffff) + (result >> 32);
|
|
return (__force __wsum)result;
|
|
}
|
|
EXPORT_SYMBOL(csum_tcpudp_nofold);
|
|
|
|
/*
|
|
* Do a 64-bit checksum on an arbitrary memory area..
|
|
*
|
|
* This isn't a great routine, but it's not _horrible_ either. The
|
|
* inner loop could be unrolled a bit further, and there are better
|
|
* ways to do the carry, but this is reasonable.
|
|
*/
|
|
static inline unsigned long do_csum(const unsigned char * buff, int len)
|
|
{
|
|
int odd, count;
|
|
unsigned long result = 0;
|
|
|
|
if (len <= 0)
|
|
goto out;
|
|
odd = 1 & (unsigned long) buff;
|
|
if (odd) {
|
|
result = *buff << 8;
|
|
len--;
|
|
buff++;
|
|
}
|
|
count = len >> 1; /* nr of 16-bit words.. */
|
|
if (count) {
|
|
if (2 & (unsigned long) buff) {
|
|
result += *(unsigned short *) buff;
|
|
count--;
|
|
len -= 2;
|
|
buff += 2;
|
|
}
|
|
count >>= 1; /* nr of 32-bit words.. */
|
|
if (count) {
|
|
if (4 & (unsigned long) buff) {
|
|
result += *(unsigned int *) buff;
|
|
count--;
|
|
len -= 4;
|
|
buff += 4;
|
|
}
|
|
count >>= 1; /* nr of 64-bit words.. */
|
|
if (count) {
|
|
unsigned long carry = 0;
|
|
do {
|
|
unsigned long w = *(unsigned long *) buff;
|
|
count--;
|
|
buff += 8;
|
|
result += carry;
|
|
result += w;
|
|
carry = (w > result);
|
|
} while (count);
|
|
result += carry;
|
|
result = (result & 0xffffffff) + (result >> 32);
|
|
}
|
|
if (len & 4) {
|
|
result += *(unsigned int *) buff;
|
|
buff += 4;
|
|
}
|
|
}
|
|
if (len & 2) {
|
|
result += *(unsigned short *) buff;
|
|
buff += 2;
|
|
}
|
|
}
|
|
if (len & 1)
|
|
result += *buff;
|
|
result = from64to16(result);
|
|
if (odd)
|
|
result = ((result >> 8) & 0xff) | ((result & 0xff) << 8);
|
|
out:
|
|
return result;
|
|
}
|
|
|
|
/*
|
|
* This is a version of ip_compute_csum() optimized for IP headers,
|
|
* which always checksum on 4 octet boundaries.
|
|
*/
|
|
__sum16 ip_fast_csum(const void *iph, unsigned int ihl)
|
|
{
|
|
return (__force __sum16)~do_csum(iph,ihl*4);
|
|
}
|
|
EXPORT_SYMBOL(ip_fast_csum);
|
|
|
|
/*
|
|
* computes the checksum of a memory block at buff, length len,
|
|
* and adds in "sum" (32-bit)
|
|
*
|
|
* returns a 32-bit number suitable for feeding into itself
|
|
* or csum_tcpudp_magic
|
|
*
|
|
* this function must be called with even lengths, except
|
|
* for the last fragment, which may be odd
|
|
*
|
|
* it's best to have buff aligned on a 32-bit boundary
|
|
*/
|
|
__wsum csum_partial(const void *buff, int len, __wsum sum)
|
|
{
|
|
unsigned long result = do_csum(buff, len);
|
|
|
|
/* add in old sum, and carry.. */
|
|
result += (__force u32)sum;
|
|
/* 32+c bits -> 32 bits */
|
|
result = (result & 0xffffffff) + (result >> 32);
|
|
return (__force __wsum)result;
|
|
}
|
|
|
|
EXPORT_SYMBOL(csum_partial);
|
|
|
|
/*
|
|
* this routine is used for miscellaneous IP-like checksums, mainly
|
|
* in icmp.c
|
|
*/
|
|
__sum16 ip_compute_csum(const void *buff, int len)
|
|
{
|
|
return (__force __sum16)~from64to16(do_csum(buff,len));
|
|
}
|
|
EXPORT_SYMBOL(ip_compute_csum);
|