Merge branch 'akpm' (patches from Andrew)

Merge updates from Andrew Morton:

 - fsnotify fix

 - poll() timeout fix

 - a few scripts/ tweaks

 - debugobjects updates

 - the (small) ocfs2 queue

 - Minor fixes to kernel/padata.c

 - Maybe half of the MM queue

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (117 commits)
  mm, page_alloc: restore the original nodemask if the fast path allocation failed
  mm, page_alloc: uninline the bad page part of check_new_page()
  mm, page_alloc: don't duplicate code in free_pcp_prepare
  mm, page_alloc: defer debugging checks of pages allocated from the PCP
  mm, page_alloc: defer debugging checks of freed pages until a PCP drain
  cpuset: use static key better and convert to new API
  mm, page_alloc: inline pageblock lookup in page free fast paths
  mm, page_alloc: remove unnecessary variable from free_pcppages_bulk
  mm, page_alloc: pull out side effects from free_pages_check
  mm, page_alloc: un-inline the bad part of free_pages_check
  mm, page_alloc: check multiple page fields with a single branch
  mm, page_alloc: remove field from alloc_context
  mm, page_alloc: avoid looking up the first zone in a zonelist twice
  mm, page_alloc: shortcut watermark checks for order-0 pages
  mm, page_alloc: reduce cost of fair zone allocation policy retry
  mm, page_alloc: shorten the page allocator fast path
  mm, page_alloc: check once if a zone has isolated pageblocks
  mm, page_alloc: move __GFP_HARDWALL modifications out of the fastpath
  mm, page_alloc: simplify last cpupid reset
  mm, page_alloc: remove unnecessary initialisation from __alloc_pages_nodemask()
  ...
Linus Torvalds 2016-05-19 20:00:06 -07:00
commit a05a70db34
122 changed files with 2310 additions and 1631 deletions

View File

@ -316,8 +316,8 @@
</itemizedlist>
</para>
<para>
The function returns 1 when the fixup was successful,
otherwise 0. The return value is used to update the
The function returns true when the fixup was successful,
otherwise false. The return value is used to update the
statistics.
</para>
<para>
@ -341,8 +341,8 @@
</itemizedlist>
</para>
<para>
The function returns 1 when the fixup was successful,
otherwise 0. The return value is used to update the
The function returns true when the fixup was successful,
otherwise false. The return value is used to update the
statistics.
</para>
<para>
@ -359,7 +359,8 @@
statically initialized object or not. In case it is it calls
debug_object_init() and debug_object_activate() to make the
object known to the tracker and marked active. In this case
the function should return 0 because this is not a real fixup.
the function should return false because this is not a real
fixup.
</para>
</sect1>
@ -376,8 +377,8 @@
</itemizedlist>
</para>
<para>
The function returns 1 when the fixup was successful,
otherwise 0. The return value is used to update the
The function returns true when the fixup was successful,
otherwise false. The return value is used to update the
statistics.
</para>
</sect1>
@ -397,8 +398,8 @@
</itemizedlist>
</para>
<para>
The function returns 1 when the fixup was successful,
otherwise 0. The return value is used to update the
The function returns true when the fixup was successful,
otherwise false. The return value is used to update the
statistics.
</para>
</sect1>
@ -414,8 +415,8 @@
debug bucket.
</para>
<para>
The function returns 1 when the fixup was successful,
otherwise 0. The return value is used to update the
The function returns true when the fixup was successful,
otherwise false. The return value is used to update the
statistics.
</para>
<para>
@ -427,7 +428,8 @@
case. The fixup function should check if this is a legitimate
case of a statically initialized object or not. In this case only
debug_object_init() should be called to make the object known to
the tracker. Then the function should return 0 because this is not
the tracker. Then the function should return false because this
is not
a real fixup.
</para>
</sect1>
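
Note on the hunks above: the documentation now describes the fixup callbacks as returning true or false instead of 1 or 0. A minimal userspace sketch of that convention follows. Every name in it (demo_obj, demo_state, demo_fixup_activate, demo_report_activate, demo_fixups) is hypothetical and only mimics the shape of a fixup callback; it is not the debugobjects API itself.

```c
#include <stdbool.h>

/* Hypothetical object states, loosely modelled on a debug-object tracker. */
enum demo_state { DEMO_STATE_NONE, DEMO_STATE_ACTIVE };

struct demo_obj {
	enum demo_state state;
	bool is_static;		/* true for statically initialized objects */
};

static unsigned int demo_fixups;	/* statistics: number of real fixups */

/*
 * Hypothetical activate-time fixup. Returns true only when a real
 * repair was made, false otherwise.
 */
static bool demo_fixup_activate(struct demo_obj *obj)
{
	if (obj->state != DEMO_STATE_NONE)
		return false;	/* nothing this sketch knows how to repair */

	if (obj->is_static) {
		/*
		 * Legitimate statically initialized object: make it known
		 * and mark it active. This is not a real fixup.
		 */
		obj->state = DEMO_STATE_ACTIVE;
		return false;
	}

	/* Activation of an uninitialized object: repair it. */
	obj->state = DEMO_STATE_ACTIVE;
	return true;
}

/* The caller uses the boolean result only to update the statistics. */
static void demo_report_activate(struct demo_obj *obj)
{
	if (demo_fixup_activate(obj))
		demo_fixups++;
}
```

In this sketch a statically initialized object ends up active without incrementing demo_fixups, which matches the "not a real fixup" wording in the text.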

View File

@ -2168,6 +2168,14 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
[KNL,SH] Allow user to override the default size for
per-device physically contiguous DMA buffers.
memhp_default_state=online/offline
[KNL] Set the initial state for the memory hotplug
onlining policy. If not specified, the default value is
set according to the
CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE kernel config
option.
See Documentation/memory-hotplug.txt.
memmap=exactmap [KNL,X86] Enable setting of an exact
E820 memory map, as specified by the user.
Such memmap=exactmap lines can be constructed based on

View File

@ -261,10 +261,11 @@ it according to the policy which can be read from "auto_online_blocks" file:
% cat /sys/devices/system/memory/auto_online_blocks
The default is "offline" which means the newly added memory is not in a
ready-to-use state and you have to "online" the newly added memory blocks
manually. Automatic onlining can be requested by writing "online" to
"auto_online_blocks" file:
The default depends on the CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE kernel config
option. If it is disabled the default is "offline" which means the newly added
memory is not in a ready-to-use state and you have to "online" the newly added
memory blocks manually. Automatic onlining can be requested by writing "online"
to "auto_online_blocks" file:
% echo online > /sys/devices/system/memory/auto_online_blocks

View File

@ -57,6 +57,7 @@ Currently, these files are in /proc/sys/vm:
- panic_on_oom
- percpu_pagelist_fraction
- stat_interval
- stat_refresh
- swappiness
- user_reserve_kbytes
- vfs_cache_pressure
@ -755,6 +756,19 @@ is 1 second.
==============================================================
stat_refresh
Any read or write (by root only) flushes all the per-cpu vm statistics
into their global totals, for more accurate reports when testing
e.g. cat /proc/sys/vm/stat_refresh /proc/meminfo
As a side-effect, it also checks for negative totals (elsewhere reported
as 0) and "fails" with EINVAL if any are found, with a warning in dmesg.
(At time of writing, a few stats are known sometimes to be found negative,
with no ill effects: errors and warnings on these stats are suppressed.)
==============================================================
swappiness
This control is used to define how aggressive the kernel will swap

View File

@ -394,9 +394,9 @@ hugepage natively. Once finished you can drop the page table lock.
Refcounting on THP is mostly consistent with refcounting on other compound
pages:
- get_page()/put_page() and GUP operate in head page's ->_count.
- get_page()/put_page() and GUP operate in head page's ->_refcount.
- ->_count in tail pages is always zero: get_page_unless_zero() never
- ->_refcount in tail pages is always zero: get_page_unless_zero() never
succeed on tail pages.
- map/unmap of the pages with PTE entry increment/decrement ->_mapcount
@ -426,15 +426,15 @@ requests to split pinned huge page: it expects page count to be equal to
sum of mapcount of all sub-pages plus one (split_huge_page caller must
have reference for head page).
split_huge_page uses migration entries to stabilize page->_count and
split_huge_page uses migration entries to stabilize page->_refcount and
page->_mapcount.
We safe against physical memory scanners too: the only legitimate way
scanner can get reference to a page is get_page_unless_zero().
All tail pages has zero ->_count until atomic_add(). It prevent scanner
All tail pages has zero ->_refcount until atomic_add(). It prevent scanner
from geting reference to tail page up to the point. After the atomic_add()
we don't care about ->_count value. We already known how many references
we don't care about ->_refcount value. We already known how many references
with should uncharge from head page.
For head page get_page_unless_zero() will succeed and we don't mind. It's

View File

@ -61,8 +61,6 @@ static inline void set_pmd_at(struct mm_struct *mm, unsigned long addr,
extern void update_mmu_cache_pmd(struct vm_area_struct *vma, unsigned long addr,
pmd_t *pmd);
#define has_transparent_hugepage() 1
/* Generic variants assume pgtable_t is struct page *, hence need for these */
#define __HAVE_ARCH_PGTABLE_DEPOSIT
extern void pgtable_trans_huge_deposit(struct mm_struct *mm, pmd_t *pmdp,

View File

@ -281,11 +281,6 @@ static inline void set_pmd_at(struct mm_struct *mm, unsigned long addr,
flush_pmd_entry(pmdp);
}
static inline int has_transparent_hugepage(void)
{
return 1;
}
#endif /* __ASSEMBLY__ */
#endif /* _ASM_PGTABLE_3LEVEL_H */

View File

@ -316,11 +316,6 @@ static inline int pmd_protnone(pmd_t pmd)
#define set_pmd_at(mm, addr, pmdp, pmd) set_pte_at(mm, addr, (pte_t *)pmdp, pmd_pte(pmd))
static inline int has_transparent_hugepage(void)
{
return 1;
}
#define __pgprot_modify(prot,mask,bits) \
__pgprot((pgprot_val(prot) & ~(mask)) | (bits))

View File

@ -307,6 +307,7 @@ static __init int setup_hugepagesz(char *opt)
} else if (ps == PUD_SIZE) {
hugetlb_add_hstate(PUD_SHIFT - PAGE_SHIFT);
} else {
hugetlb_bad_size();
pr_err("hugepagesz: Unsupported page size %lu K\n", ps >> 10);
return 0;
}

View File

@ -239,6 +239,7 @@ static __init int setup_hugepagesz(char *opt)
if (ps == (1 << HPAGE_SHIFT)) {
hugetlb_add_hstate(HPAGE_SHIFT - PAGE_SHIFT);
} else {
hugetlb_bad_size();
pr_err("hugepagesz: Unsupported page size %lu M\n",
ps >> 20);
return 0;

View File

@ -533,6 +533,7 @@ static inline int io_remap_pfn_range(struct vm_area_struct *vma,
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
#define has_transparent_hugepage has_transparent_hugepage
extern int has_transparent_hugepage(void);
static inline int pmd_trans_huge(pmd_t pmd)

View File

@ -405,19 +405,20 @@ void add_wired_entry(unsigned long entrylo0, unsigned long entrylo1,
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
int __init has_transparent_hugepage(void)
int has_transparent_hugepage(void)
{
unsigned int mask;
unsigned long flags;
static unsigned int mask = -1;
local_irq_save(flags);
write_c0_pagemask(PM_HUGE_MASK);
back_to_back_c0_hazard();
mask = read_c0_pagemask();
write_c0_pagemask(PM_DEFAULT_MASK);
local_irq_restore(flags);
if (mask == -1) { /* first call comes during __init */
unsigned long flags;
local_irq_save(flags);
write_c0_pagemask(PM_HUGE_MASK);
back_to_back_c0_hazard();
mask = read_c0_pagemask();
write_c0_pagemask(PM_DEFAULT_MASK);
local_irq_restore(flags);
}
return mask == PM_HUGE_MASK;
}
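
Note on the hunk above: the function drops __init and caches the probed page mask in a function-local static, using -1 as a "not probed yet" sentinel, so later calls can skip the register access. A minimal userspace sketch of that caching pattern follows. The names demo_probe_pagemask, demo_probe_calls, DEMO_HUGE_MASK and DEMO_UNPROBED are hypothetical stand-ins, and the mask value is arbitrary.

```c
#include <stdbool.h>

#define DEMO_HUGE_MASK	0x001fe000u		/* arbitrary stand-in value */
#define DEMO_UNPROBED	((unsigned int)-1)	/* sentinel: not probed yet */

static int demo_probe_calls;	/* how many times the "hardware" was probed */

/* Hypothetical stand-in for writing and reading back a hardware register. */
static unsigned int demo_probe_pagemask(void)
{
	demo_probe_calls++;
	return DEMO_HUGE_MASK;
}

static bool demo_has_huge_pages(void)
{
	static unsigned int mask = DEMO_UNPROBED;

	/* Only the first call pays for the probe; later calls reuse it. */
	if (mask == DEMO_UNPROBED)
		mask = demo_probe_pagemask();

	return mask == DEMO_HUGE_MASK;
}
```

The sketch does no locking. The patch's own comment says the first call comes during __init; the sketch likewise assumes the first call is not concurrent with any other.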

View File

@ -219,6 +219,7 @@ extern void set_pmd_at(struct mm_struct *mm, unsigned long addr,
pmd_t *pmdp, pmd_t pmd);
extern void update_mmu_cache_pmd(struct vm_area_struct *vma, unsigned long addr,
pmd_t *pmd);
#define has_transparent_hugepage has_transparent_hugepage
extern int has_transparent_hugepage(void);
#endif /* CONFIG_TRANSPARENT_HUGEPAGE */

View File

@ -65,7 +65,6 @@ extern int gup_hugepte(pte_t *ptep, unsigned long sz, unsigned long addr,
struct page **pages, int *nr);
#ifndef CONFIG_TRANSPARENT_HUGEPAGE
#define pmd_large(pmd) 0
#define has_transparent_hugepage() 0
#endif
pte_t *__find_linux_pte_or_hugepte(pgd_t *pgdir, unsigned long ea,
bool *is_thp, unsigned *shift);

View File

@ -772,8 +772,10 @@ static int __init hugepage_setup_sz(char *str)
size = memparse(str, &str);
if (add_huge_page_size(size) != 0)
printk(KERN_WARNING "Invalid huge page size specified(%llu)\n", size);
if (add_huge_page_size(size) != 0) {
hugetlb_bad_size();
pr_err("Invalid huge page size specified(%llu)\n", size);
}
return 1;
}

View File

@ -1223,6 +1223,7 @@ static inline int pmd_trans_huge(pmd_t pmd)
return pmd_val(pmd) & _SEGMENT_ENTRY_LARGE;
}
#define has_transparent_hugepage has_transparent_hugepage
static inline int has_transparent_hugepage(void)
{
return MACHINE_HAS_HPAGE ? 1 : 0;

View File

@ -681,8 +681,6 @@ static inline unsigned long pmd_trans_huge(pmd_t pmd)
return pte_val(pte) & _PAGE_PMD_HUGE;
}
#define has_transparent_hugepage() 1
static inline pmd_t pmd_mkold(pmd_t pmd)
{
pte_t pte = __pte(pmd_val(pmd));

View File

@ -487,7 +487,6 @@ static inline pmd_t pmd_modify(pmd_t pmd, pgprot_t newprot)
}
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
#define has_transparent_hugepage() 1
#define pmd_trans_huge pmd_huge_page
#endif /* CONFIG_TRANSPARENT_HUGEPAGE */

View File

@ -962,9 +962,7 @@ static void __init setup_numa_mapping(void)
cpumask_set_cpu(best_cpu, &node_2_cpu_mask[node]);
cpu_2_node[best_cpu] = node;
cpumask_clear_cpu(best_cpu, &unbound_cpus);
node = next_node(node, default_nodes);
if (node == MAX_NUMNODES)
node = first_node(default_nodes);
node = next_node_in(node, default_nodes);
}
/* Print out node assignments and set defaults for disabled cpus */

View File

@ -308,11 +308,16 @@ static bool saw_hugepagesz;
static __init int setup_hugepagesz(char *opt)
{
int rc;
if (!saw_hugepagesz) {
saw_hugepagesz = true;
memset(huge_shift, 0, sizeof(huge_shift));
}
return __setup_hugepagesz(memparse(opt, NULL));
rc = __setup_hugepagesz(memparse(opt, NULL));
if (rc)
hugetlb_bad_size();
return rc;
}
__setup("hugepagesz=", setup_hugepagesz);

View File

@ -679,7 +679,7 @@ static void __init init_free_pfn_range(unsigned long start, unsigned long end)
* Hacky direct set to avoid unnecessary
* lock take/release for EVERY page here.
*/
p->_count.counter = 0;
p->_refcount.counter = 0;
p->_mapcount.counter = -1;
}
init_page_count(page);

View File

@ -181,6 +181,7 @@ static inline int pmd_trans_huge(pmd_t pmd)
return (pmd_val(pmd) & (_PAGE_PSE|_PAGE_DEVMAP)) == _PAGE_PSE;
}
#define has_transparent_hugepage has_transparent_hugepage
static inline int has_transparent_hugepage(void)
{
return boot_cpu_has(X86_FEATURE_PSE);

View File

@ -165,6 +165,7 @@ static __init int setup_hugepagesz(char *opt)
} else if (ps == PUD_SIZE && boot_cpu_has(X86_FEATURE_GBPAGES)) {
hugetlb_add_hstate(PUD_SHIFT - PAGE_SHIFT);
} else {
hugetlb_bad_size();
printk(KERN_ERR "hugepagesz: Unsupported page size %lu M\n",
ps >> 20);
return 0;

View File

@ -617,9 +617,7 @@ static void __init numa_init_array(void)
if (early_cpu_to_node(i) != NUMA_NO_NODE)
continue;
numa_set_node(i, rr);
rr = next_node(rr, node_online_map);
if (rr == MAX_NUMNODES)
rr = first_node(node_online_map);
rr = next_node_in(rr, node_online_map);
}
}
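
Note on the hunk above (and the similar one in setup_numa_mapping): the open-coded "advance, then wrap to the first node" sequence is replaced by a single next_node_in() call. A minimal userspace sketch of the wrap-around semantics follows, using a plain unsigned long as the node mask. The demo_* helpers and DEMO_MAX_NODES are hypothetical and are not the kernel's nodemask implementation.

```c
#include <limits.h>

#define DEMO_MAX_NODES	((int)(sizeof(unsigned long) * CHAR_BIT))

/* Lowest node set in mask, or DEMO_MAX_NODES if the mask is empty. */
static int demo_first_node(unsigned long mask)
{
	for (int n = 0; n < DEMO_MAX_NODES; n++)
		if (mask & (1UL << n))
			return n;
	return DEMO_MAX_NODES;
}

/* Next node set in mask strictly after node, or DEMO_MAX_NODES. */
static int demo_next_node(int node, unsigned long mask)
{
	for (int n = node + 1; n < DEMO_MAX_NODES; n++)
		if (mask & (1UL << n))
			return n;
	return DEMO_MAX_NODES;
}

/* Like demo_next_node(), but wraps around to the first node in mask. */
static int demo_next_node_in(int node, unsigned long mask)
{
	int next = demo_next_node(node, mask);

	if (next == DEMO_MAX_NODES)
		next = demo_first_node(mask);
	return next;
}
```

With nodes 0, 1 and 3 set in the mask, repeated calls starting from node 0 visit 1, 3, 0, 1, and so on, which is the round-robin behaviour the removed three-line sequence produced.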

View File

@ -861,7 +861,7 @@ rqbiocnt(struct request *r)
* discussion.
*
* We cannot use get_page in the workaround, because it insists on a
* positive page count as a precondition. So we use _count directly.
* positive page count as a precondition. So we use _refcount directly.
*/
static void
bio_pageinc(struct bio *bio)

View File

@ -1164,7 +1164,7 @@ static void msc_mmap_close(struct vm_area_struct *vma)
if (!atomic_dec_and_mutex_lock(&msc->mmap_count, &msc->buf_mutex))
return;
/* drop page _counts */
/* drop page _refcounts */
for (pg = 0; pg < msc->nr_pages; pg++) {
struct page *page = msc_buffer_get_page(msc, pg);

View File

@ -517,7 +517,7 @@ int dib0700_download_firmware(struct usb_device *udev, const struct firmware *fw
if (nb_packet_buffer_size < 1)
nb_packet_buffer_size = 1;
/* get the fimware version */
/* get the firmware version */
usb_control_msg(udev, usb_rcvctrlpipe(udev, 0),
REQUEST_GET_VERSION,
USB_TYPE_VENDOR | USB_DIR_IN, 0, 0,

View File

@ -23,7 +23,7 @@ static void nicvf_get_page(struct nicvf *nic)
if (!nic->rb_pageref || !nic->rb_page)
return;
atomic_add(nic->rb_pageref, &nic->rb_page->_count);
page_ref_add(nic->rb_page, nic->rb_pageref);
nic->rb_pageref = 0;
}

View File

@ -433,8 +433,8 @@ static int mlx5e_alloc_rx_fragmented_mpwqe(struct mlx5e_rq *rq,
for (i = 0; i < MLX5_MPWRQ_PAGES_PER_WQE; i++) {
if (unlikely(mlx5e_alloc_and_map_page(rq, wi, i)))
goto err_unmap;
atomic_add(mlx5e_mpwqe_strides_per_page(rq),
&wi->umr.dma_info[i].page->_count);
page_ref_add(wi->umr.dma_info[i].page,
mlx5e_mpwqe_strides_per_page(rq));
wi->skbs_frags[i] = 0;
}
@ -452,8 +452,8 @@ err_unmap:
while (--i >= 0) {
dma_unmap_page(rq->pdev, wi->umr.dma_info[i].addr, PAGE_SIZE,
PCI_DMA_FROMDEVICE);
atomic_sub(mlx5e_mpwqe_strides_per_page(rq),
&wi->umr.dma_info[i].page->_count);
page_ref_sub(wi->umr.dma_info[i].page,
mlx5e_mpwqe_strides_per_page(rq));
put_page(wi->umr.dma_info[i].page);
}
dma_unmap_single(rq->pdev, wi->umr.mtt_addr, mtt_sz, PCI_DMA_TODEVICE);
@ -477,8 +477,8 @@ void mlx5e_free_rx_fragmented_mpwqe(struct mlx5e_rq *rq,
for (i = 0; i < MLX5_MPWRQ_PAGES_PER_WQE; i++) {
dma_unmap_page(rq->pdev, wi->umr.dma_info[i].addr, PAGE_SIZE,
PCI_DMA_FROMDEVICE);
atomic_sub(mlx5e_mpwqe_strides_per_page(rq) - wi->skbs_frags[i],
&wi->umr.dma_info[i].page->_count);
page_ref_sub(wi->umr.dma_info[i].page,
mlx5e_mpwqe_strides_per_page(rq) - wi->skbs_frags[i]);
put_page(wi->umr.dma_info[i].page);
}
dma_unmap_single(rq->pdev, wi->umr.mtt_addr, mtt_sz, PCI_DMA_TODEVICE);
@ -527,8 +527,8 @@ static int mlx5e_alloc_rx_linear_mpwqe(struct mlx5e_rq *rq,
*/
split_page(wi->dma_info.page, MLX5_MPWRQ_WQE_PAGE_ORDER);
for (i = 0; i < MLX5_MPWRQ_PAGES_PER_WQE; i++) {
atomic_add(mlx5e_mpwqe_strides_per_page(rq),
&wi->dma_info.page[i]._count);
page_ref_add(&wi->dma_info.page[i],
mlx5e_mpwqe_strides_per_page(rq));
wi->skbs_frags[i] = 0;
}
@ -551,8 +551,8 @@ void mlx5e_free_rx_linear_mpwqe(struct mlx5e_rq *rq,
dma_unmap_page(rq->pdev, wi->dma_info.addr, rq->wqe_sz,
PCI_DMA_FROMDEVICE);
for (i = 0; i < MLX5_MPWRQ_PAGES_PER_WQE; i++) {
atomic_sub(mlx5e_mpwqe_strides_per_page(rq) - wi->skbs_frags[i],
&wi->dma_info.page[i]._count);
page_ref_sub(&wi->dma_info.page[i],
mlx5e_mpwqe_strides_per_page(rq) - wi->skbs_frags[i]);
put_page(&wi->dma_info.page[i]);
}
}
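
Note on the hunks above: the conversion from atomic_add()/atomic_sub() on page->_count to page_ref_add()/page_ref_sub() also swaps the argument order, from (value, pointer to counter) to (page, value). A minimal userspace sketch using C11 atomics follows. struct demo_page and the demo_page_ref_* helpers are hypothetical stand-ins and do not reproduce the kernel helpers' actual definitions.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for struct page; only the refcount is modelled. */
struct demo_page {
	atomic_int _refcount;
};

/* Page first, amount second: the order used by the converted call sites. */
static inline void demo_page_ref_add(struct demo_page *page, int nr)
{
	atomic_fetch_add(&page->_refcount, nr);
}

static inline void demo_page_ref_sub(struct demo_page *page, int nr)
{
	atomic_fetch_sub(&page->_refcount, nr);
}

static inline void demo_page_ref_inc(struct demo_page *page)
{
	atomic_fetch_add(&page->_refcount, 1);
}

static inline int demo_page_ref_count(struct demo_page *page)
{
	return atomic_load(&page->_refcount);
}
```

Hiding the field behind helpers means call sites no longer name the counter directly, which is what lets a rename such as _count to _refcount touch only the helper bodies.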

View File

@ -920,7 +920,7 @@ static inline int qede_realloc_rx_buffer(struct qede_dev *edev,
* network stack to take the ownership of the page
* which can be recycled multiple times by the driver.
*/
atomic_inc(&curr_cons->data->_count);
page_ref_inc(curr_cons->data);
qede_reuse_page(edev, rxq, curr_cons);
}
@ -1036,7 +1036,7 @@ static int qede_fill_frag_skb(struct qede_dev *edev,
/* Incr page ref count to reuse on allocation failure
* so that it doesn't get freed while freeing SKB.
*/
atomic_inc(&current_bd->data->_count);
page_ref_inc(current_bd->data);
goto out;
}
@ -1487,7 +1487,7 @@ static int qede_rx_int(struct qede_fastpath *fp, int budget)
* freeing SKB.
*/
atomic_inc(&sw_rx_data->data->_count);
page_ref_inc(sw_rx_data->data);
rxq->rx_alloc_errors++;
qede_recycle_rx_bd_ring(rxq, edev,
fp_cqe->bd_num);

View File

@ -356,7 +356,7 @@ struct bfi_ioc_image_hdr_s {
u8 port0_mode; /* device mode for port 0 */
u8 port1_mode; /* device mode for port 1 */
u32 exec; /* exec vector */
u32 bootenv; /* fimware boot env */
u32 bootenv; /* firmware boot env */
u32 rsvd_b[2];
struct bfi_ioc_fwver_s fwver;
u32 md5sum[BFI_IOC_MD5SUM_SZ];

View File

@ -26,7 +26,7 @@
* Much of the functionality of this driver was determined from reading
* the source code for the Windows driver.
*
* The FPGA on the board requires fimware, which is available from
* The FPGA on the board requires firmware, which is available from
* http://www.comedi.org in the comedi_nonfree_firmware tarball.
*
* Configuration options: not applicable, uses PCI auto config

View File

@ -255,17 +255,17 @@ out:
*/
static void free_more_memory(void)
{
struct zone *zone;
struct zoneref *z;
int nid;
wakeup_flusher_threads(1024, WB_REASON_FREE_MORE_MEM);
yield();
for_each_online_node(nid) {
(void)first_zones_zonelist(node_zonelist(nid, GFP_NOFS),
gfp_zone(GFP_NOFS), NULL,
&zone);
if (zone)
z = first_zones_zonelist(node_zonelist(nid, GFP_NOFS),
gfp_zone(GFP_NOFS), NULL);
if (z->zone)
try_to_free_pages(node_zonelist(nid, GFP_NOFS), 0,
GFP_NOFS, NULL);
}

View File

@ -1583,15 +1583,15 @@ static int ep_send_events(struct eventpoll *ep,
return ep_scan_ready_list(ep, ep_send_events_proc, &esed, 0, false);
}
static inline struct timespec ep_set_mstimeout(long ms)
static inline struct timespec64 ep_set_mstimeout(long ms)
{
struct timespec now, ts = {
struct timespec64 now, ts = {
.tv_sec = ms / MSEC_PER_SEC,
.tv_nsec = NSEC_PER_MSEC * (ms % MSEC_PER_SEC),
};
ktime_get_ts(&now);
return timespec_add_safe(now, ts);
ktime_get_ts64(&now);
return timespec64_add_safe(now, ts);
}
/**
@ -1621,11 +1621,11 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,
ktime_t expires, *to = NULL;
if (timeout > 0) {
struct timespec end_time = ep_set_mstimeout(timeout);
struct timespec64 end_time = ep_set_mstimeout(timeout);
slack = select_estimate_accuracy(&end_time);
to = &expires;
*to = timespec_to_ktime(end_time);
*to = timespec64_to_ktime(end_time);
} else if (timeout == 0) {
/*
* Avoid the unnecessary trip to the wait queue loop, if the
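
Note on ep_set_mstimeout() above: the relative timeout is built by splitting the millisecond count into whole seconds and a nanosecond remainder before it is added to the current time. A minimal userspace sketch of just that split follows, using the C library's struct timespec rather than the kernel's timespec64. demo_ms_to_timespec and the DEMO_* constants are hypothetical names.

```c
#include <time.h>

#define DEMO_MSEC_PER_SEC	1000L
#define DEMO_NSEC_PER_MSEC	1000000L

/* Split a millisecond count into seconds plus a nanosecond remainder. */
static struct timespec demo_ms_to_timespec(long ms)
{
	struct timespec ts = {
		.tv_sec = ms / DEMO_MSEC_PER_SEC,
		.tv_nsec = DEMO_NSEC_PER_MSEC * (ms % DEMO_MSEC_PER_SEC),
	};

	return ts;
}
```

For a non-negative input the remainder is always below one second, so the result is already normalized. The sketch omits the addition to the current time that the kernel function performs next.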

View File

@ -56,6 +56,13 @@ static inline void fsnotify_clear_marks_by_mount(struct vfsmount *mnt)
fsnotify_destroy_marks(&real_mount(mnt)->mnt_fsnotify_marks,
&mnt->mnt_root->d_lock);
}
/* prepare for freeing all marks associated with given group */
extern void fsnotify_detach_group_marks(struct fsnotify_group *group);
/*
* wait for fsnotify_mark_srcu period to end and free all marks in destroy_list
*/
extern void fsnotify_mark_destroy_list(void);
/*
* update the dentry->d_flags of all of inode's children to indicate if inode cares
* about events that happen to its children.

View File

@ -47,12 +47,21 @@ static void fsnotify_final_destroy_group(struct fsnotify_group *group)
*/
void fsnotify_destroy_group(struct fsnotify_group *group)
{
/* clear all inode marks for this group */
fsnotify_clear_marks_by_group(group);
/* clear all inode marks for this group, attach them to destroy_list */
fsnotify_detach_group_marks(group);
synchronize_srcu(&fsnotify_mark_srcu);
/*
* Wait for fsnotify_mark_srcu period to end and free all marks in
* destroy_list
*/
fsnotify_mark_destroy_list();
/* clear the notification queue of all events */
/*
* Since we have waited for fsnotify_mark_srcu in
* fsnotify_mark_destroy_list() there can be no outstanding event
* notification against this group. So clearing the notification queue
* of all events is reliable now.
*/
fsnotify_flush_notify(group);
/*

View File

@ -97,8 +97,8 @@ struct srcu_struct fsnotify_mark_srcu;
static DEFINE_SPINLOCK(destroy_lock);
static LIST_HEAD(destroy_list);
static void fsnotify_mark_destroy(struct work_struct *work);
static DECLARE_DELAYED_WORK(reaper_work, fsnotify_mark_destroy);
static void fsnotify_mark_destroy_workfn(struct work_struct *work);
static DECLARE_DELAYED_WORK(reaper_work, fsnotify_mark_destroy_workfn);
void fsnotify_get_mark(struct fsnotify_mark *mark)
{
@ -173,11 +173,15 @@ void fsnotify_detach_mark(struct fsnotify_mark *mark)
}
/*
* Free fsnotify mark. The freeing is actually happening from a kthread which
* first waits for srcu period end. Caller must have a reference to the mark
* or be protected by fsnotify_mark_srcu.
* Prepare mark for freeing and add it to the list of marks prepared for
* freeing. The actual freeing must happen after SRCU period ends and the
* caller is responsible for this.
*
* The function returns true if the mark was added to the list of marks for
* freeing. The function returns false if someone else has already called
* __fsnotify_free_mark() for the mark.
*/
void fsnotify_free_mark(struct fsnotify_mark *mark)
static bool __fsnotify_free_mark(struct fsnotify_mark *mark)
{
struct fsnotify_group *group = mark->group;
@ -185,17 +189,11 @@ void fsnotify_free_mark(struct fsnotify_mark *mark)
/* something else already called this function on this mark */
if (!(mark->flags & FSNOTIFY_MARK_FLAG_ALIVE)) {
spin_unlock(&mark->lock);
return;
return false;
}
mark->flags &= ~FSNOTIFY_MARK_FLAG_ALIVE;
spin_unlock(&mark->lock);
spin_lock(&destroy_lock);
list_add(&mark->g_list, &destroy_list);
spin_unlock(&destroy_lock);
queue_delayed_work(system_unbound_wq, &reaper_work,
FSNOTIFY_REAPER_DELAY);
/*
* Some groups like to know that marks are being freed. This is a
* callback to the group function to let it know that this mark
@ -203,6 +201,25 @@ void fsnotify_free_mark(struct fsnotify_mark *mark)
*/
if (group->ops->freeing_mark)
group->ops->freeing_mark(mark, group);
spin_lock(&destroy_lock);
list_add(&mark->g_list, &destroy_list);
spin_unlock(&destroy_lock);
return true;
}
/*
* Free fsnotify mark. The freeing is actually happening from a workqueue which
* first waits for srcu period end. Caller must have a reference to the mark
* or be protected by fsnotify_mark_srcu.
*/
void fsnotify_free_mark(struct fsnotify_mark *mark)
{
if (__fsnotify_free_mark(mark)) {
queue_delayed_work(system_unbound_wq, &reaper_work,
FSNOTIFY_REAPER_DELAY);
}
}
void fsnotify_destroy_mark(struct fsnotify_mark *mark,
@ -468,11 +485,29 @@ void fsnotify_clear_marks_by_group_flags(struct fsnotify_group *group,
}
/*
* Given a group, destroy all of the marks associated with that group.
* Given a group, prepare for freeing all the marks associated with that group.
* The marks are attached to the list of marks prepared for destruction, the
* caller is responsible for freeing marks in that list after SRCU period has
* ended.
*/
void fsnotify_clear_marks_by_group(struct fsnotify_group *group)
void fsnotify_detach_group_marks(struct fsnotify_group *group)
{
fsnotify_clear_marks_by_group_flags(group, (unsigned int)-1);
struct fsnotify_mark *mark;
while (1) {
mutex_lock_nested(&group->mark_mutex, SINGLE_DEPTH_NESTING);
if (list_empty(&group->marks_list)) {
mutex_unlock(&group->mark_mutex);
break;
}
mark = list_first_entry(&group->marks_list,
struct fsnotify_mark, g_list);
fsnotify_get_mark(mark);
fsnotify_detach_mark(mark);
mutex_unlock(&group->mark_mutex);
__fsnotify_free_mark(mark);
fsnotify_put_mark(mark);
}
}
void fsnotify_duplicate_mark(struct fsnotify_mark *new, struct fsnotify_mark *old)
@ -499,7 +534,11 @@ void fsnotify_init_mark(struct fsnotify_mark *mark,
mark->free_mark = free_mark;
}
static void fsnotify_mark_destroy(struct work_struct *work)
/*
* Destroy all marks in destroy_list, waits for SRCU period to finish before
* actually freeing marks.
*/
void fsnotify_mark_destroy_list(void)
{
struct fsnotify_mark *mark, *next;
struct list_head private_destroy_list;
@ -516,3 +555,8 @@ static void fsnotify_mark_destroy(struct work_struct *work)
fsnotify_put_mark(mark);
}
}
static void fsnotify_mark_destroy_workfn(struct work_struct *work)
{
fsnotify_mark_destroy_list();
}

View File

@ -5351,7 +5351,7 @@ static int ocfs2_truncate_rec(handle_t *handle,
{
int ret;
u32 left_cpos, rec_range, trunc_range;
int wants_rotate = 0, is_rightmost_tree_rec = 0;
int is_rightmost_tree_rec = 0;
struct super_block *sb = ocfs2_metadata_cache_get_super(et->et_ci);
struct ocfs2_path *left_path = NULL;
struct ocfs2_extent_list *el = path_leaf_el(path);
@ -5457,7 +5457,6 @@ static int ocfs2_truncate_rec(handle_t *handle,
memset(rec, 0, sizeof(*rec));
ocfs2_cleanup_merge(el, index);
wants_rotate = 1;
next_free = le16_to_cpu(el->l_next_free_rec);
if (is_rightmost_tree_rec && next_free > 1) {

View File

@ -1456,7 +1456,6 @@ static void o2hb_region_release(struct config_item *item)
static int o2hb_read_block_input(struct o2hb_region *reg,
const char *page,
size_t count,
unsigned long *ret_bytes,
unsigned int *ret_bits)
{
@ -1499,8 +1498,8 @@ static ssize_t o2hb_region_block_bytes_store(struct config_item *item,
if (reg->hr_bdev)
return -EINVAL;
status = o2hb_read_block_input(reg, page, count,
&block_bytes, &block_bits);
status = o2hb_read_block_input(reg, page, &block_bytes,
&block_bits);
if (status)
return status;

View File

@ -580,7 +580,7 @@ struct ocfs2_extended_slot {
/*00*/ __u8 es_valid;
__u8 es_reserved1[3];
__le32 es_node_num;
/*10*/
/*08*/
};
/*

View File

@ -535,12 +535,8 @@ void ocfs2_put_slot(struct ocfs2_super *osb)
spin_unlock(&osb->osb_lock);
status = ocfs2_update_disk_slot(osb, si, slot_num);
if (status < 0) {
if (status < 0)
mlog_errno(status);
goto bail;
}
bail:
ocfs2_free_slot_info(osb);
}

View File

@ -142,7 +142,7 @@ u64 stable_page_flags(struct page *page)
/*
* Caveats on high order pages: page->_count will only be set
* Caveats on high order pages: page->_refcount will only be set
* -1 on the head page; SLUB/SLQB do the same for PG_slab;
* SLOB won't set PG_slab at all on compound pages.
*/

View File

@ -47,7 +47,7 @@
#define MAX_SLACK (100 * NSEC_PER_MSEC)
static long __estimate_accuracy(struct timespec *tv)
static long __estimate_accuracy(struct timespec64 *tv)
{
long slack;
int divfactor = 1000;
@ -70,10 +70,10 @@ static long __estimate_accuracy(struct timespec *tv)
return slack;
}
u64 select_estimate_accuracy(struct timespec *tv)
u64 select_estimate_accuracy(struct timespec64 *tv)
{
u64 ret;
struct timespec now;
struct timespec64 now;
/*
* Realtime tasks get a slack of 0 for obvious reasons.
@ -82,8 +82,8 @@ u64 select_estimate_accuracy(struct timespec *tv)
if (rt_task(current))
return 0;
ktime_get_ts(&now);
now = timespec_sub(*tv, now);
ktime_get_ts64(&now);
now = timespec64_sub(*tv, now);
ret = __estimate_accuracy(&now);
if (ret < current->timer_slack_ns)
return current->timer_slack_ns;
@ -260,7 +260,7 @@ EXPORT_SYMBOL(poll_schedule_timeout);
/**
* poll_select_set_timeout - helper function to setup the timeout value
* @to: pointer to timespec variable for the final timeout
* @to: pointer to timespec64 variable for the final timeout
* @sec: seconds (from user space)
* @nsec: nanoseconds (from user space)
*
@ -269,26 +269,28 @@ EXPORT_SYMBOL(poll_schedule_timeout);
*
* Returns -EINVAL if sec/nsec are not normalized. Otherwise 0.
*/
int poll_select_set_timeout(struct timespec *to, long sec, long nsec)
int poll_select_set_timeout(struct timespec64 *to, time64_t sec, long nsec)
{
struct timespec ts = {.tv_sec = sec, .tv_nsec = nsec};
struct timespec64 ts = {.tv_sec = sec, .tv_nsec = nsec};
if (!timespec_valid(&ts))
if (!timespec64_valid(&ts))
return -EINVAL;
/* Optimize for the zero timeout value here */
if (!sec && !nsec) {
to->tv_sec = to->tv_nsec = 0;
} else {
ktime_get_ts(to);
*to = timespec_add_safe(*to, ts);
ktime_get_ts64(to);
*to = timespec64_add_safe(*to, ts);
}
return 0;
}
static int poll_select_copy_remaining(struct timespec *end_time, void __user *p,
static int poll_select_copy_remaining(struct timespec64 *end_time,
void __user *p,
int timeval, int ret)
{
struct timespec64 rts64;
struct timespec rts;
struct timeval rtv;
@ -302,16 +304,18 @@ static int poll_select_copy_remaining(struct timespec *end_time, void __user *p,
if (!end_time->tv_sec && !end_time->tv_nsec)
return ret;
ktime_get_ts(&rts);
rts = timespec_sub(*end_time, rts);
if (rts.tv_sec < 0)
rts.tv_sec = rts.tv_nsec = 0;
ktime_get_ts64(&rts64);
rts64 = timespec64_sub(*end_time, rts64);
if (rts64.tv_sec < 0)
rts64.tv_sec = rts64.tv_nsec = 0;
rts = timespec64_to_timespec(rts64);
if (timeval) {
if (sizeof(rtv) > sizeof(rtv.tv_sec) + sizeof(rtv.tv_usec))
memset(&rtv, 0, sizeof(rtv));
rtv.tv_sec = rts.tv_sec;
rtv.tv_usec = rts.tv_nsec / NSEC_PER_USEC;
rtv.tv_sec = rts64.tv_sec;
rtv.tv_usec = rts64.tv_nsec / NSEC_PER_USEC;
if (!copy_to_user(p, &rtv, sizeof(rtv)))
return ret;
@ -396,7 +400,7 @@ static inline void wait_key_set(poll_table *wait, unsigned long in,
wait->_key |= POLLOUT_SET;
}
int do_select(int n, fd_set_bits *fds, struct timespec *end_time)
int do_select(int n, fd_set_bits *fds, struct timespec64 *end_time)
{
ktime_t expire, *to = NULL;
struct poll_wqueues table;
@ -522,7 +526,7 @@ int do_select(int n, fd_set_bits *fds, struct timespec *end_time)
* pointer to the expiry value.
*/
if (end_time && !to) {
expire = timespec_to_ktime(*end_time);
expire = timespec64_to_ktime(*end_time);
to = &expire;
}
@ -545,7 +549,7 @@ int do_select(int n, fd_set_bits *fds, struct timespec *end_time)
* I'm trying ERESTARTNOHAND which restart only when you want to.
*/
int core_sys_select(int n, fd_set __user *inp, fd_set __user *outp,
fd_set __user *exp, struct timespec *end_time)
fd_set __user *exp, struct timespec64 *end_time)
{
fd_set_bits fds;
void *bits;
@ -622,7 +626,7 @@ out_nofds:
SYSCALL_DEFINE5(select, int, n, fd_set __user *, inp, fd_set __user *, outp,
fd_set __user *, exp, struct timeval __user *, tvp)
{
struct timespec end_time, *to = NULL;
struct timespec64 end_time, *to = NULL;
struct timeval tv;
int ret;
@ -648,15 +652,17 @@ static long do_pselect(int n, fd_set __user *inp, fd_set __user *outp,
const sigset_t __user *sigmask, size_t sigsetsize)
{
sigset_t ksigmask, sigsaved;
struct timespec ts, end_time, *to = NULL;
struct timespec ts;
struct timespec64 ts64, end_time, *to = NULL;
int ret;
if (tsp) {
if (copy_from_user(&ts, tsp, sizeof(ts)))
return -EFAULT;
ts64 = timespec_to_timespec64(ts);
to = &end_time;
if (poll_select_set_timeout(to, ts.tv_sec, ts.tv_nsec))
if (poll_select_set_timeout(to, ts64.tv_sec, ts64.tv_nsec))
return -EINVAL;
}
@ -779,7 +785,7 @@ static inline unsigned int do_pollfd(struct pollfd *pollfd, poll_table *pwait,
}
static int do_poll(struct poll_list *list, struct poll_wqueues *wait,
struct timespec *end_time)
struct timespec64 *end_time)
{
poll_table* pt = &wait->pt;
ktime_t expire, *to = NULL;
@ -854,7 +860,7 @@ static int do_poll(struct poll_list *list, struct poll_wqueues *wait,
* pointer to the expiry value.
*/
if (end_time && !to) {
expire = timespec_to_ktime(*end_time);
expire = timespec64_to_ktime(*end_time);
to = &expire;
}
@ -868,7 +874,7 @@ static int do_poll(struct poll_list *list, struct poll_wqueues *wait,
sizeof(struct pollfd))
int do_sys_poll(struct pollfd __user *ufds, unsigned int nfds,
struct timespec *end_time)
struct timespec64 *end_time)
{
struct poll_wqueues table;
int err = -EFAULT, fdcount, len, size;
@ -936,7 +942,7 @@ static long do_restart_poll(struct restart_block *restart_block)
{
struct pollfd __user *ufds = restart_block->poll.ufds;
int nfds = restart_block->poll.nfds;
struct timespec *to = NULL, end_time;
struct timespec64 *to = NULL, end_time;
int ret;
if (restart_block->poll.has_timeout) {
@ -957,7 +963,7 @@ static long do_restart_poll(struct restart_block *restart_block)
SYSCALL_DEFINE3(poll, struct pollfd __user *, ufds, unsigned int, nfds,
int, timeout_msecs)
{
struct timespec end_time, *to = NULL;
struct timespec64 end_time, *to = NULL;
int ret;
if (timeout_msecs >= 0) {
@ -993,7 +999,8 @@ SYSCALL_DEFINE5(ppoll, struct pollfd __user *, ufds, unsigned int, nfds,
size_t, sigsetsize)
{
sigset_t ksigmask, sigsaved;
struct timespec ts, end_time, *to = NULL;
struct timespec ts;
struct timespec64 end_time, *to = NULL;
int ret;
if (tsp) {
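
The poll_select_copy_remaining() hunk above works out how much of the timeout is left by subtracting the current time from end_time and clamping a negative result to zero. A minimal userspace sketch of that arithmetic follows. The names struct ts64, ts64_sub() and ts64_remaining() are hypothetical stand-ins for struct timespec64 and timespec64_sub(); this is not the kernel implementation.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000L

/* Hypothetical stand-in for the kernel's struct timespec64. */
struct ts64 {
	int64_t tv_sec;
	long    tv_nsec;
};

/* Normalized a - b, in the spirit of timespec64_sub(). */
static struct ts64 ts64_sub(struct ts64 a, struct ts64 b)
{
	struct ts64 d = { a.tv_sec - b.tv_sec, a.tv_nsec - b.tv_nsec };

	if (d.tv_nsec < 0) {
		d.tv_sec--;
		d.tv_nsec += NSEC_PER_SEC;
	}
	return d;
}

/* Time left until end_time, clamped to zero once the deadline has passed. */
static struct ts64 ts64_remaining(struct ts64 end_time, struct ts64 now)
{
	struct ts64 left = ts64_sub(end_time, now);

	if (left.tv_sec < 0)
		left.tv_sec = left.tv_nsec = 0;
	return left;
}

/* Usage: 1.5 s before the deadline leaves 1.5 s; past it leaves zero. */
static void ts64_remaining_check(void)
{
	struct ts64 end = { 10, 0 }, early = { 8, 500000000L }, late = { 11, 0 };
	struct ts64 left = ts64_remaining(end, early);

	assert(left.tv_sec == 1 && left.tv_nsec == 500000000L);
	left = ts64_remaining(end, late);
	assert(left.tv_sec == 0 && left.tv_nsec == 0);
}
```

Keeping the seconds field 64 bits wide through the subtraction is the point of the conversion: only the final copy to userspace narrows the value again.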

@ -806,4 +806,12 @@ static inline int pmd_clear_huge(pmd_t *pmd)
#define io_remap_pfn_range remap_pfn_range
#endif
#ifndef has_transparent_hugepage
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
#define has_transparent_hugepage() 1
#else
#define has_transparent_hugepage() 0
#endif
#endif
#endif /* _ASM_GENERIC_PGTABLE_H */

@ -83,34 +83,34 @@ extern void *__alloc_bootmem(unsigned long size,
unsigned long goal);
extern void *__alloc_bootmem_nopanic(unsigned long size,
unsigned long align,
unsigned long goal);
unsigned long goal) __malloc;
extern void *__alloc_bootmem_node(pg_data_t *pgdat,
unsigned long size,
unsigned long align,
unsigned long goal);
unsigned long goal) __malloc;
void *__alloc_bootmem_node_high(pg_data_t *pgdat,
unsigned long size,
unsigned long align,
unsigned long goal);
unsigned long goal) __malloc;
extern void *__alloc_bootmem_node_nopanic(pg_data_t *pgdat,
unsigned long size,
unsigned long align,
unsigned long goal);
unsigned long goal) __malloc;
void *___alloc_bootmem_node_nopanic(pg_data_t *pgdat,
unsigned long size,
unsigned long align,
unsigned long goal,
unsigned long limit);
unsigned long limit) __malloc;
extern void *__alloc_bootmem_low(unsigned long size,
unsigned long align,
unsigned long goal);
unsigned long goal) __malloc;
void *__alloc_bootmem_low_nopanic(unsigned long size,
unsigned long align,
unsigned long goal);
unsigned long goal) __malloc;
extern void *__alloc_bootmem_low_node(pg_data_t *pgdat,
unsigned long size,
unsigned long align,
unsigned long goal);
unsigned long goal) __malloc;
#ifdef CONFIG_NO_BOOTMEM
/* We are using top down, so it is safe to use 0 here */

@ -39,12 +39,12 @@ extern int sysctl_compact_unevictable_allowed;
extern int fragmentation_index(struct zone *zone, unsigned int order);
extern unsigned long try_to_compact_pages(gfp_t gfp_mask, unsigned int order,
int alloc_flags, const struct alloc_context *ac,
enum migrate_mode mode, int *contended);
unsigned int alloc_flags, const struct alloc_context *ac,
enum migrate_mode mode, int *contended);
extern void compact_pgdat(pg_data_t *pgdat, int order);
extern void reset_isolation_suitable(pg_data_t *pgdat);
extern unsigned long compaction_suitable(struct zone *zone, int order,
int alloc_flags, int classzone_idx);
unsigned int alloc_flags, int classzone_idx);
extern void defer_compaction(struct zone *zone, int order);
extern bool compaction_deferred(struct zone *zone, int order);

@ -142,6 +142,7 @@
#if GCC_VERSION >= 30400
#define __must_check __attribute__((warn_unused_result))
#define __malloc __attribute__((__malloc__))
#endif
#if GCC_VERSION >= 40000

@ -357,6 +357,10 @@ static __always_inline void __write_once_size(volatile void *p, void *res, int s
#define __deprecated_for_modules
#endif
#ifndef __malloc
#define __malloc
#endif
/*
* Allow us to avoid 'defined but not used' warnings on functions and data,
* as well as force them to be emitted to the assembly file.
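
The two hunks above define __malloc as GCC's malloc function attribute and give it an empty fallback for compilers that do not provide it. The attribute tells the compiler that the returned pointer does not alias any other pointer that is valid when the function returns. A small userspace sketch of the same define-with-fallback pattern follows. The names demo_malloc and dup_string() are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Same shape as the kernel's __malloc: real attribute on GCC-compatible
 * compilers, empty definition elsewhere. */
#if defined(__GNUC__)
#define demo_malloc __attribute__((__malloc__))
#else
#define demo_malloc
#endif

/* The attribute goes on the declaration, as in the prototypes above. */
static char *dup_string(const char *s) demo_malloc;

/* Returns a freshly allocated copy of s, or NULL on allocation failure. */
static char *dup_string(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (copy)
		memcpy(copy, s, len);
	return copy;
}

/* Usage: the copy compares equal but lives at a different address. */
static void dup_string_check(void)
{
	const char *src = "bootmem";
	char *copy = dup_string(src);

	assert(copy && copy != src && strcmp(copy, src) == 0);
	free(copy);
}
```

The annotation only fits functions that hand back new storage, which is presumably why the diff adds it to allocators and to helpers such as kasprintf() but not to kvasprintf_const().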

@ -16,26 +16,26 @@
#ifdef CONFIG_CPUSETS
extern struct static_key cpusets_enabled_key;
extern struct static_key_false cpusets_enabled_key;
static inline bool cpusets_enabled(void)
{
return static_key_false(&cpusets_enabled_key);
return static_branch_unlikely(&cpusets_enabled_key);
}
static inline int nr_cpusets(void)
{
/* jump label reference count + the top-level cpuset */
return static_key_count(&cpusets_enabled_key) + 1;
return static_key_count(&cpusets_enabled_key.key) + 1;
}
static inline void cpuset_inc(void)
{
static_key_slow_inc(&cpusets_enabled_key);
static_branch_inc(&cpusets_enabled_key);
}
static inline void cpuset_dec(void)
{
static_key_slow_dec(&cpusets_enabled_key);
static_branch_dec(&cpusets_enabled_key);
}
extern int cpuset_init(void);
@ -48,16 +48,25 @@ extern nodemask_t cpuset_mems_allowed(struct task_struct *p);
void cpuset_init_current_mems_allowed(void);
int cpuset_nodemask_valid_mems_allowed(nodemask_t *nodemask);
extern int __cpuset_node_allowed(int node, gfp_t gfp_mask);
extern bool __cpuset_node_allowed(int node, gfp_t gfp_mask);
static inline int cpuset_node_allowed(int node, gfp_t gfp_mask)
static inline bool cpuset_node_allowed(int node, gfp_t gfp_mask)
{
return nr_cpusets() <= 1 || __cpuset_node_allowed(node, gfp_mask);
if (cpusets_enabled())
return __cpuset_node_allowed(node, gfp_mask);
return true;
}
static inline int cpuset_zone_allowed(struct zone *z, gfp_t gfp_mask)
static inline bool __cpuset_zone_allowed(struct zone *z, gfp_t gfp_mask)
{
return cpuset_node_allowed(zone_to_nid(z), gfp_mask);
return __cpuset_node_allowed(zone_to_nid(z), gfp_mask);
}
static inline bool cpuset_zone_allowed(struct zone *z, gfp_t gfp_mask)
{
if (cpusets_enabled())
return __cpuset_zone_allowed(z, gfp_mask);
return true;
}
extern int cpuset_mems_allowed_intersects(const struct task_struct *tsk1,
@ -172,14 +181,19 @@ static inline int cpuset_nodemask_valid_mems_allowed(nodemask_t *nodemask)
return 1;
}
static inline int cpuset_node_allowed(int node, gfp_t gfp_mask)
static inline bool cpuset_node_allowed(int node, gfp_t gfp_mask)
{
return 1;
return true;
}
static inline int cpuset_zone_allowed(struct zone *z, gfp_t gfp_mask)
static inline bool __cpuset_zone_allowed(struct zone *z, gfp_t gfp_mask)
{
return 1;
return true;
}
static inline bool cpuset_zone_allowed(struct zone *z, gfp_t gfp_mask)
{
return true;
}
static inline int cpuset_mems_allowed_intersects(const struct task_struct *tsk1,

@ -38,8 +38,10 @@ struct debug_obj {
* @name: name of the object typee
* @debug_hint: function returning address, which have associated
* kernel symbol, to allow identify the object
* @is_static_object return true if the obj is static, otherwise return false
* @fixup_init: fixup function, which is called when the init check
* fails
* fails. All fixup functions must return true if fixup
* was successful, otherwise return false
* @fixup_activate: fixup function, which is called when the activate check
* fails
* @fixup_destroy: fixup function, which is called when the destroy check
@ -51,12 +53,13 @@ struct debug_obj {
*/
struct debug_obj_descr {
const char *name;
void *(*debug_hint) (void *addr);
int (*fixup_init) (void *addr, enum debug_obj_state state);
int (*fixup_activate) (void *addr, enum debug_obj_state state);
int (*fixup_destroy) (void *addr, enum debug_obj_state state);
int (*fixup_free) (void *addr, enum debug_obj_state state);
int (*fixup_assert_init)(void *addr, enum debug_obj_state state);
void *(*debug_hint)(void *addr);
bool (*is_static_object)(void *addr);
bool (*fixup_init)(void *addr, enum debug_obj_state state);
bool (*fixup_activate)(void *addr, enum debug_obj_state state);
bool (*fixup_destroy)(void *addr, enum debug_obj_state state);
bool (*fixup_free)(void *addr, enum debug_obj_state state);
bool (*fixup_assert_init)(void *addr, enum debug_obj_state state);
};
#ifdef CONFIG_DEBUG_OBJECTS

@ -609,14 +609,14 @@ typedef int (*dr_match_t)(struct device *dev, void *res, void *match_data);
#ifdef CONFIG_DEBUG_DEVRES
extern void *__devres_alloc_node(dr_release_t release, size_t size, gfp_t gfp,
int nid, const char *name);
int nid, const char *name) __malloc;
#define devres_alloc(release, size, gfp) \
__devres_alloc_node(release, size, gfp, NUMA_NO_NODE, #release)
#define devres_alloc_node(release, size, gfp, nid) \
__devres_alloc_node(release, size, gfp, nid, #release)
#else
extern void *devres_alloc_node(dr_release_t release, size_t size, gfp_t gfp,
int nid);
int nid) __malloc;
static inline void *devres_alloc(dr_release_t release, size_t size, gfp_t gfp)
{
return devres_alloc_node(release, size, gfp, NUMA_NO_NODE);
@ -648,12 +648,12 @@ extern void devres_remove_group(struct device *dev, void *id);
extern int devres_release_group(struct device *dev, void *id);
/* managed devm_k.alloc/kfree for device drivers */
extern void *devm_kmalloc(struct device *dev, size_t size, gfp_t gfp);
extern void *devm_kmalloc(struct device *dev, size_t size, gfp_t gfp) __malloc;
extern __printf(3, 0)
char *devm_kvasprintf(struct device *dev, gfp_t gfp, const char *fmt,
va_list ap);
va_list ap) __malloc;
extern __printf(3, 4)
char *devm_kasprintf(struct device *dev, gfp_t gfp, const char *fmt, ...);
char *devm_kasprintf(struct device *dev, gfp_t gfp, const char *fmt, ...) __malloc;
static inline void *devm_kzalloc(struct device *dev, size_t size, gfp_t gfp)
{
return devm_kmalloc(dev, size, gfp | __GFP_ZERO);
@ -671,7 +671,7 @@ static inline void *devm_kcalloc(struct device *dev,
return devm_kmalloc_array(dev, n, size, flags | __GFP_ZERO);
}
extern void devm_kfree(struct device *dev, void *p);
extern char *devm_kstrdup(struct device *dev, const char *s, gfp_t gfp);
extern char *devm_kstrdup(struct device *dev, const char *s, gfp_t gfp) __malloc;
extern void *devm_kmemdup(struct device *dev, const void *src, size_t len,
gfp_t gfp);

@ -359,8 +359,6 @@ extern void fsnotify_clear_vfsmount_marks_by_group(struct fsnotify_group *group)
extern void fsnotify_clear_inode_marks_by_group(struct fsnotify_group *group);
/* run all the marks in a group, and clear all of the marks where mark->flags & flags is true*/
extern void fsnotify_clear_marks_by_group_flags(struct fsnotify_group *group, unsigned int flags);
/* run all the marks in a group, and flag them to be freed */
extern void fsnotify_clear_marks_by_group(struct fsnotify_group *group);
extern void fsnotify_get_mark(struct fsnotify_mark *mark);
extern void fsnotify_put_mark(struct fsnotify_mark *mark);
extern void fsnotify_unmount_inodes(struct super_block *sb);

@ -28,9 +28,7 @@ extern int zap_huge_pmd(struct mmu_gather *tlb,
extern int mincore_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
unsigned long addr, unsigned long end,
unsigned char *vec);
extern bool move_huge_pmd(struct vm_area_struct *vma,
struct vm_area_struct *new_vma,
unsigned long old_addr,
extern bool move_huge_pmd(struct vm_area_struct *vma, unsigned long old_addr,
unsigned long new_addr, unsigned long old_end,
pmd_t *old_pmd, pmd_t *new_pmd);
extern int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,

@ -338,6 +338,7 @@ int huge_add_to_page_cache(struct page *page, struct address_space *mapping,
/* arch callback */
int __init alloc_bootmem_huge_page(struct hstate *h);
void __init hugetlb_bad_size(void);
void __init hugetlb_add_hstate(unsigned order);
struct hstate *size_to_hstate(unsigned long size);

@ -5,16 +5,16 @@
#include <linux/mm.h>
static inline int is_vm_hugetlb_page(struct vm_area_struct *vma)
static inline bool is_vm_hugetlb_page(struct vm_area_struct *vma)
{
return !!(vma->vm_flags & VM_HUGETLB);
}
#else
static inline int is_vm_hugetlb_page(struct vm_area_struct *vma)
static inline bool is_vm_hugetlb_page(struct vm_area_struct *vma)
{
return 0;
return false;
}
#endif

@ -412,9 +412,9 @@ extern __printf(3, 4)
int scnprintf(char *buf, size_t size, const char *fmt, ...);
extern __printf(3, 0)
int vscnprintf(char *buf, size_t size, const char *fmt, va_list args);
extern __printf(2, 3)
extern __printf(2, 3) __malloc
char *kasprintf(gfp_t gfp, const char *fmt, ...);
extern __printf(2, 0)
extern __printf(2, 0) __malloc
char *kvasprintf(gfp_t gfp, const char *fmt, va_list args);
extern __printf(2, 0)
const char *kvasprintf_const(gfp_t gfp, const char *fmt, va_list args);

@ -658,12 +658,6 @@ mem_cgroup_get_lru_size(struct lruvec *lruvec, enum lru_list lru)
return 0;
}
static inline void
mem_cgroup_update_lru_size(struct lruvec *lruvec, enum lru_list lru,
int increment)
{
}
static inline unsigned long
mem_cgroup_node_nr_lru_pages(struct mem_cgroup *memcg,
int nid, unsigned int lru_mask)

@ -247,16 +247,16 @@ static inline void mem_hotplug_done(void) {}
#ifdef CONFIG_MEMORY_HOTREMOVE
extern int is_mem_section_removable(unsigned long pfn, unsigned long nr_pages);
extern bool is_mem_section_removable(unsigned long pfn, unsigned long nr_pages);
extern void try_offline_node(int nid);
extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages);
extern void remove_memory(int nid, u64 start, u64 size);
#else
static inline int is_mem_section_removable(unsigned long pfn,
static inline bool is_mem_section_removable(unsigned long pfn,
unsigned long nr_pages)
{
return 0;
return false;
}
static inline void try_offline_node(int nid) {}

@ -172,14 +172,14 @@ extern int mpol_parse_str(char *str, struct mempolicy **mpol);
extern void mpol_to_str(char *buffer, int maxlen, struct mempolicy *pol);
/* Check if a vma is migratable */
static inline int vma_migratable(struct vm_area_struct *vma)
static inline bool vma_migratable(struct vm_area_struct *vma)
{
if (vma->vm_flags & (VM_IO | VM_PFNMAP))
return 0;
return false;
#ifndef CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION
if (vma->vm_flags & VM_HUGETLB)
return 0;
return false;
#endif
/*
@ -190,8 +190,8 @@ static inline int vma_migratable(struct vm_area_struct *vma)
if (vma->vm_file &&
gfp_zone(mapping_gfp_mask(vma->vm_file->f_mapping))
< policy_zone)
return 0;
return 1;
return false;
return true;
}
extern int mpol_misplaced(struct page *, struct vm_area_struct *, unsigned long);
@ -228,6 +228,12 @@ static inline void mpol_free_shared_policy(struct shared_policy *p)
{
}
static inline struct mempolicy *
mpol_shared_policy_lookup(struct shared_policy *sp, unsigned long idx)
{
return NULL;
}
#define vma_policy(vma) NULL
static inline int

@ -5,6 +5,7 @@
#define _LINUX_MEMPOOL_H
#include <linux/wait.h>
#include <linux/compiler.h>
struct kmem_cache;
@ -31,7 +32,7 @@ extern mempool_t *mempool_create_node(int min_nr, mempool_alloc_t *alloc_fn,
extern int mempool_resize(mempool_t *pool, int new_min_nr);
extern void mempool_destroy(mempool_t *pool);
extern void * mempool_alloc(mempool_t *pool, gfp_t gfp_mask);
extern void *mempool_alloc(mempool_t *pool, gfp_t gfp_mask) __malloc;
extern void mempool_free(void *element, mempool_t *pool);
/*

@ -447,14 +447,14 @@ unsigned long vmalloc_to_pfn(const void *addr);
* On nommu, vmalloc/vfree wrap through kmalloc/kfree directly, so there
* is no special casing required.
*/
static inline int is_vmalloc_addr(const void *x)
static inline bool is_vmalloc_addr(const void *x)
{
#ifdef CONFIG_MMU
unsigned long addr = (unsigned long)x;
return addr >= VMALLOC_START && addr < VMALLOC_END;
#else
return 0;
return false;
#endif
}
#ifdef CONFIG_MMU
@ -734,7 +734,7 @@ static inline void get_page(struct page *page)
page = compound_head(page);
/*
* Getting a normal page or the head of a compound page
* requires to already have an elevated page->_count.
* requires to already have an elevated page->_refcount.
*/
VM_BUG_ON_PAGE(page_ref_count(page) <= 0, page);
page_ref_inc(page);
@ -850,10 +850,7 @@ extern int page_cpupid_xchg_last(struct page *page, int cpupid);
static inline void page_cpupid_reset_last(struct page *page)
{
int cpupid = (1 << LAST_CPUPID_SHIFT) - 1;
page->flags &= ~(LAST_CPUPID_MASK << LAST_CPUPID_PGSHIFT);
page->flags |= (cpupid & LAST_CPUPID_MASK) << LAST_CPUPID_PGSHIFT;
page->flags |= LAST_CPUPID_MASK << LAST_CPUPID_PGSHIFT;
}
#endif /* LAST_CPUPID_NOT_IN_PAGE_FLAGS */
#else /* !CONFIG_NUMA_BALANCING */
@ -1032,26 +1029,7 @@ static inline pgoff_t page_file_index(struct page *page)
return page->index;
}
/*
* Return true if this page is mapped into pagetables.
* For compound page it returns true if any subpage of compound page is mapped.
*/
static inline bool page_mapped(struct page *page)
{
int i;
if (likely(!PageCompound(page)))
return atomic_read(&page->_mapcount) >= 0;
page = compound_head(page);
if (atomic_read(compound_mapcount_ptr(page)) >= 0)
return true;
if (PageHuge(page))
return false;
for (i = 0; i < hpage_nr_pages(page); i++) {
if (atomic_read(&page[i]._mapcount) >= 0)
return true;
}
return false;
}
bool page_mapped(struct page *page);
/*
* Return true only if the page has been allocated with
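
The page_cpupid_reset_last() hunk above replaces a clear-then-set sequence with a single OR. This is safe when the value being stored has every bit of the field set, because clearing the field first changes nothing in the result. The sketch below checks that equivalence in userspace. The DEMO_* constants are made-up values, and it assumes the mask equals (1 << shift) - 1, as the removed cpupid initializer suggests.

```c
#include <assert.h>

#define DEMO_CPUPID_SHIFT   8   /* field width in bits (assumed) */
#define DEMO_CPUPID_MASK    ((1UL << DEMO_CPUPID_SHIFT) - 1)
#define DEMO_CPUPID_PGSHIFT 20  /* field position in flags (assumed) */

/* Old form: clear the field, then store an all-ones cpupid into it. */
static unsigned long reset_last_old(unsigned long flags)
{
	int cpupid = (1 << DEMO_CPUPID_SHIFT) - 1;

	flags &= ~(DEMO_CPUPID_MASK << DEMO_CPUPID_PGSHIFT);
	flags |= (cpupid & DEMO_CPUPID_MASK) << DEMO_CPUPID_PGSHIFT;
	return flags;
}

/* New form: OR the shifted mask in directly. */
static unsigned long reset_last_new(unsigned long flags)
{
	return flags | (DEMO_CPUPID_MASK << DEMO_CPUPID_PGSHIFT);
}

/* Usage: both forms agree for empty, mixed and all-ones flag words. */
static void reset_last_check(void)
{
	unsigned long samples[] = { 0UL, 0x12345678UL, ~0UL };

	for (unsigned int i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
		assert(reset_last_old(samples[i]) == reset_last_new(samples[i]));
}
```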

@ -22,22 +22,34 @@ static inline int page_is_file_cache(struct page *page)
return !PageSwapBacked(page);
}
static __always_inline void __update_lru_size(struct lruvec *lruvec,
enum lru_list lru, int nr_pages)
{
__mod_zone_page_state(lruvec_zone(lruvec), NR_LRU_BASE + lru, nr_pages);
}
static __always_inline void update_lru_size(struct lruvec *lruvec,
enum lru_list lru, int nr_pages)
{
#ifdef CONFIG_MEMCG
mem_cgroup_update_lru_size(lruvec, lru, nr_pages);
#else
__update_lru_size(lruvec, lru, nr_pages);
#endif
}
static __always_inline void add_page_to_lru_list(struct page *page,
struct lruvec *lruvec, enum lru_list lru)
{
int nr_pages = hpage_nr_pages(page);
mem_cgroup_update_lru_size(lruvec, lru, nr_pages);
update_lru_size(lruvec, lru, hpage_nr_pages(page));
list_add(&page->lru, &lruvec->lists[lru]);
__mod_zone_page_state(lruvec_zone(lruvec), NR_LRU_BASE + lru, nr_pages);
}
static __always_inline void del_page_from_lru_list(struct page *page,
struct lruvec *lruvec, enum lru_list lru)
{
int nr_pages = hpage_nr_pages(page);
mem_cgroup_update_lru_size(lruvec, lru, -nr_pages);
list_del(&page->lru);
__mod_zone_page_state(lruvec_zone(lruvec), NR_LRU_BASE + lru, -nr_pages);
update_lru_size(lruvec, lru, -hpage_nr_pages(page));
}
/**

@ -73,9 +73,9 @@ struct page {
unsigned long counters;
#else
/*
* Keep _count separate from slub cmpxchg_double data.
* As the rest of the double word is protected by
* slab_lock but _count is not.
* Keep _refcount separate from slub cmpxchg_double
* data. As the rest of the double word is protected by
* slab_lock but _refcount is not.
*/
unsigned counters;
#endif
@ -97,7 +97,11 @@ struct page {
};
int units; /* SLOB */
};
atomic_t _count; /* Usage count, see below. */
/*
* Usage count, *USE WRAPPER FUNCTION*
* when manual accounting. See page_ref.h
*/
atomic_t _refcount;
};
unsigned int active; /* SLAB */
};
@ -248,7 +252,7 @@ struct page_frag_cache {
__u32 offset;
#endif
/* we maintain a pagecount bias, so that we dont dirty cache line
* containing page->_count every time we allocate a fragment.
* containing page->_refcount every time we allocate a fragment.
*/
unsigned int pagecnt_bias;
bool pfmemalloc;

@ -85,13 +85,6 @@ extern int page_group_by_mobility_disabled;
get_pfnblock_flags_mask(page, page_to_pfn(page), \
PB_migrate_end, MIGRATETYPE_MASK)
static inline int get_pfnblock_migratetype(struct page *page, unsigned long pfn)
{
BUILD_BUG_ON(PB_migrate_end - PB_migrate != 2);
return get_pfnblock_flags_mask(page, pfn, PB_migrate_end,
MIGRATETYPE_MASK);
}
struct free_area {
struct list_head free_list[MIGRATE_TYPES];
unsigned long nr_free;
@ -747,7 +740,8 @@ extern struct mutex zonelists_mutex;
void build_all_zonelists(pg_data_t *pgdat, struct zone *zone);
void wakeup_kswapd(struct zone *zone, int order, enum zone_type classzone_idx);
bool zone_watermark_ok(struct zone *z, unsigned int order,
unsigned long mark, int classzone_idx, int alloc_flags);
unsigned long mark, int classzone_idx,
unsigned int alloc_flags);
bool zone_watermark_ok_safe(struct zone *z, unsigned int order,
unsigned long mark, int classzone_idx);
enum memmap_context {
@ -828,10 +822,7 @@ static inline int is_highmem_idx(enum zone_type idx)
static inline int is_highmem(struct zone *zone)
{
#ifdef CONFIG_HIGHMEM
int zone_off = (char *)zone - (char *)zone->zone_pgdat->node_zones;
return zone_off == ZONE_HIGHMEM * sizeof(*zone) ||
(zone_off == ZONE_MOVABLE * sizeof(*zone) &&
zone_movable_is_highmem());
return is_highmem_idx(zone_idx(zone));
#else
return 0;
#endif
@ -922,6 +913,10 @@ static inline int zonelist_node_idx(struct zoneref *zoneref)
#endif /* CONFIG_NUMA */
}
struct zoneref *__next_zones_zonelist(struct zoneref *z,
enum zone_type highest_zoneidx,
nodemask_t *nodes);
/**
* next_zones_zonelist - Returns the next zone at or below highest_zoneidx within the allowed nodemask using a cursor within a zonelist as a starting point
* @z - The cursor used as a starting point for the search
@ -934,9 +929,14 @@ static inline int zonelist_node_idx(struct zoneref *zoneref)
* being examined. It should be advanced by one before calling
* next_zones_zonelist again.
*/
struct zoneref *next_zones_zonelist(struct zoneref *z,
static __always_inline struct zoneref *next_zones_zonelist(struct zoneref *z,
enum zone_type highest_zoneidx,
nodemask_t *nodes);
nodemask_t *nodes)
{
if (likely(!nodes && zonelist_zone_idx(z) <= highest_zoneidx))
return z;
return __next_zones_zonelist(z, highest_zoneidx, nodes);
}
/**
* first_zones_zonelist - Returns the first zone at or below highest_zoneidx within the allowed nodemask in a zonelist
@ -952,13 +952,10 @@ struct zoneref *next_zones_zonelist(struct zoneref *z,
*/
static inline struct zoneref *first_zones_zonelist(struct zonelist *zonelist,
enum zone_type highest_zoneidx,
nodemask_t *nodes,
struct zone **zone)
nodemask_t *nodes)
{
struct zoneref *z = next_zones_zonelist(zonelist->_zonerefs,
return next_zones_zonelist(zonelist->_zonerefs,
highest_zoneidx, nodes);
*zone = zonelist_zone(z);
return z;
}
/**
@ -973,10 +970,17 @@ static inline struct zoneref *first_zones_zonelist(struct zonelist *zonelist,
* within a given nodemask
*/
#define for_each_zone_zonelist_nodemask(zone, z, zlist, highidx, nodemask) \
for (z = first_zones_zonelist(zlist, highidx, nodemask, &zone); \
for (z = first_zones_zonelist(zlist, highidx, nodemask), zone = zonelist_zone(z); \
zone; \
z = next_zones_zonelist(++z, highidx, nodemask), \
zone = zonelist_zone(z)) \
zone = zonelist_zone(z))
#define for_next_zone_zonelist_nodemask(zone, z, zlist, highidx, nodemask) \
for (zone = z->zone; \
zone; \
z = next_zones_zonelist(++z, highidx, nodemask), \
zone = zonelist_zone(z))
/**
* for_each_zone_zonelist - helper macro to iterate over valid zones in a zonelist at or below a given zone index

@ -43,8 +43,10 @@
*
* int first_node(mask) Number lowest set bit, or MAX_NUMNODES
* int next_node(node, mask) Next node past 'node', or MAX_NUMNODES
* int next_node_in(node, mask) Next node past 'node', or wrap to first,
* or MAX_NUMNODES
* int first_unset_node(mask) First node not set in mask, or
* MAX_NUMNODES.
* MAX_NUMNODES
*
* nodemask_t nodemask_of_node(node) Return nodemask with bit 'node' set
* NODE_MASK_ALL Initializer - all bits set
@ -259,6 +261,13 @@ static inline int __next_node(int n, const nodemask_t *srcp)
return min_t(int,MAX_NUMNODES,find_next_bit(srcp->bits, MAX_NUMNODES, n+1));
}
/*
* Find the next present node in src, starting after node n, wrapping around to
* the first node in src if needed. Returns MAX_NUMNODES if src is empty.
*/
#define next_node_in(n, src) __next_node_in((n), &(src))
int __next_node_in(int node, const nodemask_t *srcp);
static inline void init_nodemask_of_node(nodemask_t *mask, int node)
{
nodes_clear(*mask);
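
The comment on next_node_in() above describes a wrapping search: return the next set node after n, fall back to the first set node when the end of the mask is reached, and return MAX_NUMNODES for an empty mask. A minimal userspace sketch of that behaviour over a 64-bit mask follows. DEMO_MAX_NODES and the demo_* helpers are hypothetical names; the kernel version operates on nodemask_t.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MAX_NODES 64

/* Lowest set bit at or above 'start', or DEMO_MAX_NODES if there is none. */
static int demo_find_next(uint64_t mask, int start)
{
	for (int n = start; n < DEMO_MAX_NODES; n++)
		if (mask & (UINT64_C(1) << n))
			return n;
	return DEMO_MAX_NODES;
}

/* Next set node after 'node', wrapping around to the first set node. */
static int demo_next_node_in(int node, uint64_t mask)
{
	int next = demo_find_next(mask, node + 1);

	if (next == DEMO_MAX_NODES)
		next = demo_find_next(mask, 0);
	return next;
}

/* Usage: nodes 1, 3 and 5 are set; the search wraps from 5 back to 1. */
static void demo_next_node_in_check(void)
{
	uint64_t mask = (UINT64_C(1) << 1) | (UINT64_C(1) << 3) | (UINT64_C(1) << 5);

	assert(demo_next_node_in(1, mask) == 3);
	assert(demo_next_node_in(5, mask) == 1);
	assert(demo_next_node_in(0, 0) == DEMO_MAX_NODES);
}
```

With a single node set, the search wraps back to that same node, so round-robin callers do not need a special case for one-node masks.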

@ -72,6 +72,14 @@ static inline bool oom_task_origin(const struct task_struct *p)
extern void mark_oom_victim(struct task_struct *tsk);
#ifdef CONFIG_MMU
extern void try_oom_reaper(struct task_struct *tsk);
#else
static inline void try_oom_reaper(struct task_struct *tsk)
{
}
#endif
extern unsigned long oom_badness(struct task_struct *p,
struct mem_cgroup *memcg, const nodemask_t *nodemask,
unsigned long totalpages);

@ -175,11 +175,6 @@ extern int padata_do_parallel(struct padata_instance *pinst,
extern void padata_do_serial(struct padata_priv *padata);
extern int padata_set_cpumask(struct padata_instance *pinst, int cpumask_type,
cpumask_var_t cpumask);
extern int padata_set_cpumasks(struct padata_instance *pinst,
cpumask_var_t pcpumask,
cpumask_var_t cbcpumask);
extern int padata_add_cpu(struct padata_instance *pinst, int cpu, int mask);
extern int padata_remove_cpu(struct padata_instance *pinst, int cpu, int mask);
extern int padata_start(struct padata_instance *pinst);
extern void padata_stop(struct padata_instance *pinst);
extern int padata_register_cpumask_notifier(struct padata_instance *pinst,

@ -371,10 +371,15 @@ PAGEFLAG(Idle, idle, PF_ANY)
#define PAGE_MAPPING_KSM 2
#define PAGE_MAPPING_FLAGS (PAGE_MAPPING_ANON | PAGE_MAPPING_KSM)
static __always_inline int PageAnonHead(struct page *page)
{
return ((unsigned long)page->mapping & PAGE_MAPPING_ANON) != 0;
}
static __always_inline int PageAnon(struct page *page)
{
page = compound_head(page);
return ((unsigned long)page->mapping & PAGE_MAPPING_ANON) != 0;
return PageAnonHead(page);
}
#ifdef CONFIG_KSM

@ -63,17 +63,17 @@ static inline void __page_ref_unfreeze(struct page *page, int v)
static inline int page_ref_count(struct page *page)
{
return atomic_read(&page->_count);
return atomic_read(&page->_refcount);
}
static inline int page_count(struct page *page)
{
return atomic_read(&compound_head(page)->_count);
return atomic_read(&compound_head(page)->_refcount);
}
static inline void set_page_count(struct page *page, int v)
{
atomic_set(&page->_count, v);
atomic_set(&page->_refcount, v);
if (page_ref_tracepoint_active(__tracepoint_page_ref_set))
__page_ref_set(page, v);
}
@ -89,35 +89,35 @@ static inline void init_page_count(struct page *page)
static inline void page_ref_add(struct page *page, int nr)
{
atomic_add(nr, &page->_count);
atomic_add(nr, &page->_refcount);
if (page_ref_tracepoint_active(__tracepoint_page_ref_mod))
__page_ref_mod(page, nr);
}
static inline void page_ref_sub(struct page *page, int nr)
{
atomic_sub(nr, &page->_count);
atomic_sub(nr, &page->_refcount);
if (page_ref_tracepoint_active(__tracepoint_page_ref_mod))
__page_ref_mod(page, -nr);
}
static inline void page_ref_inc(struct page *page)
{
atomic_inc(&page->_count);
atomic_inc(&page->_refcount);
if (page_ref_tracepoint_active(__tracepoint_page_ref_mod))
__page_ref_mod(page, 1);
}
static inline void page_ref_dec(struct page *page)
{
atomic_dec(&page->_count);
atomic_dec(&page->_refcount);
if (page_ref_tracepoint_active(__tracepoint_page_ref_mod))
__page_ref_mod(page, -1);
}
static inline int page_ref_sub_and_test(struct page *page, int nr)
{
int ret = atomic_sub_and_test(nr, &page->_count);
int ret = atomic_sub_and_test(nr, &page->_refcount);
if (page_ref_tracepoint_active(__tracepoint_page_ref_mod_and_test))
__page_ref_mod_and_test(page, -nr, ret);
@ -126,7 +126,7 @@ static inline int page_ref_sub_and_test(struct page *page, int nr)
static inline int page_ref_dec_and_test(struct page *page)
{
int ret = atomic_dec_and_test(&page->_count);
int ret = atomic_dec_and_test(&page->_refcount);
if (page_ref_tracepoint_active(__tracepoint_page_ref_mod_and_test))
__page_ref_mod_and_test(page, -1, ret);
@ -135,7 +135,7 @@ static inline int page_ref_dec_and_test(struct page *page)
static inline int page_ref_dec_return(struct page *page)
{
int ret = atomic_dec_return(&page->_count);
int ret = atomic_dec_return(&page->_refcount);
if (page_ref_tracepoint_active(__tracepoint_page_ref_mod_and_return))
__page_ref_mod_and_return(page, -1, ret);
@ -144,7 +144,7 @@ static inline int page_ref_dec_return(struct page *page)
static inline int page_ref_add_unless(struct page *page, int nr, int u)
{
int ret = atomic_add_unless(&page->_count, nr, u);
int ret = atomic_add_unless(&page->_refcount, nr, u);
if (page_ref_tracepoint_active(__tracepoint_page_ref_mod_unless))
__page_ref_mod_unless(page, nr, ret);
@ -153,7 +153,7 @@ static inline int page_ref_add_unless(struct page *page, int nr, int u)
static inline int page_ref_freeze(struct page *page, int count)
{
int ret = likely(atomic_cmpxchg(&page->_count, count, 0) == count);
int ret = likely(atomic_cmpxchg(&page->_refcount, count, 0) == count);
if (page_ref_tracepoint_active(__tracepoint_page_ref_freeze))
__page_ref_freeze(page, count, ret);
@ -165,7 +165,7 @@ static inline void page_ref_unfreeze(struct page *page, int count)
VM_BUG_ON_PAGE(page_count(page) != 0, page);
VM_BUG_ON(count == 0);
atomic_set(&page->_count, count);
atomic_set(&page->_refcount, count);
if (page_ref_tracepoint_active(__tracepoint_page_ref_unfreeze))
__page_ref_unfreeze(page, count);
}

@ -90,12 +90,12 @@ void release_pages(struct page **pages, int nr, bool cold);
/*
* speculatively take a reference to a page.
* If the page is free (_count == 0), then _count is untouched, and 0
* is returned. Otherwise, _count is incremented by 1 and 1 is returned.
* If the page is free (_refcount == 0), then _refcount is untouched, and 0
* is returned. Otherwise, _refcount is incremented by 1 and 1 is returned.
*
* This function must be called inside the same rcu_read_lock() section as has
* been used to lookup the page in the pagecache radix-tree (or page table):
* this allows allocators to use a synchronize_rcu() to stabilize _count.
* this allows allocators to use a synchronize_rcu() to stabilize _refcount.
*
* Unless an RCU grace period has passed, the count of all pages coming out
* of the allocator must be considered unstable. page_count may return higher
@ -111,7 +111,7 @@ void release_pages(struct page **pages, int nr, bool cold);
* 2. conditionally increment refcount
* 3. check the page is still in pagecache (if no, goto 1)
*
* Remove-side that cares about stability of _count (eg. reclaim) has the
* Remove-side that cares about stability of _refcount (eg. reclaim) has the
* following (with tree_lock held for write):
* A. atomically check refcount is correct and set it to 0 (atomic_cmpxchg)
* B. remove page from pagecache

@ -96,7 +96,7 @@ extern void poll_initwait(struct poll_wqueues *pwq);
extern void poll_freewait(struct poll_wqueues *pwq);
extern int poll_schedule_timeout(struct poll_wqueues *pwq, int state,
ktime_t *expires, unsigned long slack);
extern u64 select_estimate_accuracy(struct timespec *tv);
extern u64 select_estimate_accuracy(struct timespec64 *tv);
static inline int poll_schedule(struct poll_wqueues *pwq, int state)
@ -153,12 +153,13 @@ void zero_fd_set(unsigned long nr, unsigned long *fdset)
#define MAX_INT64_SECONDS (((s64)(~((u64)0)>>1)/HZ)-1)
extern int do_select(int n, fd_set_bits *fds, struct timespec *end_time);
extern int do_select(int n, fd_set_bits *fds, struct timespec64 *end_time);
extern int do_sys_poll(struct pollfd __user * ufds, unsigned int nfds,
struct timespec *end_time);
struct timespec64 *end_time);
extern int core_sys_select(int n, fd_set __user *inp, fd_set __user *outp,
fd_set __user *exp, struct timespec *end_time);
fd_set __user *exp, struct timespec64 *end_time);
extern int poll_select_set_timeout(struct timespec *to, long sec, long nsec);
extern int poll_select_set_timeout(struct timespec64 *to, time64_t sec,
long nsec);
#endif /* _LINUX_POLL_H */

View File

@ -315,8 +315,8 @@ static __always_inline int kmalloc_index(size_t size)
}
#endif /* !CONFIG_SLOB */
void *__kmalloc(size_t size, gfp_t flags) __assume_kmalloc_alignment;
void *kmem_cache_alloc(struct kmem_cache *, gfp_t flags) __assume_slab_alignment;
void *__kmalloc(size_t size, gfp_t flags) __assume_kmalloc_alignment __malloc;
void *kmem_cache_alloc(struct kmem_cache *, gfp_t flags) __assume_slab_alignment __malloc;
void kmem_cache_free(struct kmem_cache *, void *);
/*
@ -339,8 +339,8 @@ static __always_inline void kfree_bulk(size_t size, void **p)
}
#ifdef CONFIG_NUMA
void *__kmalloc_node(size_t size, gfp_t flags, int node) __assume_kmalloc_alignment;
void *kmem_cache_alloc_node(struct kmem_cache *, gfp_t flags, int node) __assume_slab_alignment;
void *__kmalloc_node(size_t size, gfp_t flags, int node) __assume_kmalloc_alignment __malloc;
void *kmem_cache_alloc_node(struct kmem_cache *, gfp_t flags, int node) __assume_slab_alignment __malloc;
#else
static __always_inline void *__kmalloc_node(size_t size, gfp_t flags, int node)
{
@ -354,12 +354,12 @@ static __always_inline void *kmem_cache_alloc_node(struct kmem_cache *s, gfp_t f
#endif
#ifdef CONFIG_TRACING
extern void *kmem_cache_alloc_trace(struct kmem_cache *, gfp_t, size_t) __assume_slab_alignment;
extern void *kmem_cache_alloc_trace(struct kmem_cache *, gfp_t, size_t) __assume_slab_alignment __malloc;
#ifdef CONFIG_NUMA
extern void *kmem_cache_alloc_node_trace(struct kmem_cache *s,
gfp_t gfpflags,
int node, size_t size) __assume_slab_alignment;
int node, size_t size) __assume_slab_alignment __malloc;
#else
static __always_inline void *
kmem_cache_alloc_node_trace(struct kmem_cache *s,
@ -392,10 +392,10 @@ kmem_cache_alloc_node_trace(struct kmem_cache *s,
}
#endif /* CONFIG_TRACING */
extern void *kmalloc_order(size_t size, gfp_t flags, unsigned int order) __assume_page_alignment;
extern void *kmalloc_order(size_t size, gfp_t flags, unsigned int order) __assume_page_alignment __malloc;
#ifdef CONFIG_TRACING
extern void *kmalloc_order_trace(size_t size, gfp_t flags, unsigned int order) __assume_page_alignment;
extern void *kmalloc_order_trace(size_t size, gfp_t flags, unsigned int order) __assume_page_alignment __malloc;
#else
static __always_inline void *
kmalloc_order_trace(size_t size, gfp_t flags, unsigned int order)
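
The `__malloc` annotation added to these prototypes appears to correspond to the compiler's `malloc` function attribute, which tells GCC and Clang that the returned pointer does not alias any other live object. A minimal userspace sketch, assuming a GCC-compatible compiler; `demo_malloc_attr` and `demo_zalloc()` are hypothetical names used only for illustration:

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical macro mirroring the role of the kernel's __malloc (GCC/Clang only). */
#define demo_malloc_attr __attribute__((__malloc__))

/* The attribute goes on the declaration, as in the prototypes above. */
static void *demo_zalloc(size_t size) demo_malloc_attr;

/* Returns a fresh zeroed buffer, or NULL on failure. */
static void *demo_zalloc(size_t size)
{
	return calloc(1, size);
}
```

The attribute changes nothing about the function's behaviour; it only gives the optimizer an aliasing guarantee, so it should be applied only to functions that really return fresh memory.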

View File

@ -80,6 +80,10 @@ struct kmem_cache {
struct kasan_cache kasan_info;
#endif
#ifdef CONFIG_SLAB_FREELIST_RANDOM
void *random_seq;
#endif
struct kmem_cache_node *node[MAX_NUMNODES];
};

View File

@ -119,7 +119,7 @@ char *strreplace(char *s, char old, char new);
extern void kfree_const(const void *x);
extern char *kstrdup(const char *s, gfp_t gfp);
extern char *kstrdup(const char *s, gfp_t gfp) __malloc;
extern const char *kstrdup_const(const char *s, gfp_t gfp);
extern char *kstrndup(const char *s, size_t len, gfp_t gfp);
extern void *kmemdup(const void *src, size_t len, gfp_t gfp);

View File

@ -65,7 +65,6 @@ static inline struct itimerspec64 itimerspec_to_itimerspec64(struct itimerspec *
# define timespec64_equal timespec_equal
# define timespec64_compare timespec_compare
# define set_normalized_timespec64 set_normalized_timespec
# define timespec64_add_safe timespec_add_safe
# define timespec64_add timespec_add
# define timespec64_sub timespec_sub
# define timespec64_valid timespec_valid
@ -134,15 +133,6 @@ static inline int timespec64_compare(const struct timespec64 *lhs, const struct
extern void set_normalized_timespec64(struct timespec64 *ts, time64_t sec, s64 nsec);
/*
* timespec64_add_safe assumes both values are positive and checks for
* overflow. It will return TIME_T_MAX if the returned value would be
* smaller than either of the arguments.
*/
extern struct timespec64 timespec64_add_safe(const struct timespec64 lhs,
const struct timespec64 rhs);
static inline struct timespec64 timespec64_add(struct timespec64 lhs,
struct timespec64 rhs)
{
@ -224,4 +214,11 @@ static __always_inline void timespec64_add_ns(struct timespec64 *a, u64 ns)
#endif
/*
* timespec64_add_safe assumes both values are positive and checks for
* overflow. It will return TIME64_MAX in case of overflow.
*/
extern struct timespec64 timespec64_add_safe(const struct timespec64 lhs,
const struct timespec64 rhs);
#endif /* _LINUX_TIME64_H */

View File

@ -163,12 +163,10 @@ static inline unsigned long zone_page_state_snapshot(struct zone *zone,
#ifdef CONFIG_NUMA
extern unsigned long node_page_state(int node, enum zone_stat_item item);
extern void zone_statistics(struct zone *, struct zone *, gfp_t gfp);
#else
#define node_page_state(node, item) global_page_state(item)
#define zone_statistics(_zl, _z, gfp) do { } while (0)
#endif /* CONFIG_NUMA */
@ -193,6 +191,10 @@ void quiet_vmstat(void);
void cpu_vm_stats_fold(int cpu);
void refresh_zone_stat_thresholds(void);
struct ctl_table;
int vmstat_refresh(struct ctl_table *, int write,
void __user *buffer, size_t *lenp, loff_t *ppos);
void drain_zonestat(struct zone *zone, struct per_cpu_pageset *);
int calculate_pressure_threshold(struct zone *zone);

View File

@ -1742,6 +1742,15 @@ config SLOB
endchoice
config SLAB_FREELIST_RANDOM
default n
depends on SLAB
bool "SLAB freelist randomization"
help
Randomizes the freelist order used on creating new SLABs. This
security feature reduces the predictability of the kernel slab
allocator against heap overflows.
config SLUB_CPU_PARTIAL
default y
depends on SLUB && SMP

View File

@ -61,7 +61,7 @@
#include <linux/cgroup.h>
#include <linux/wait.h>
struct static_key cpusets_enabled_key __read_mostly = STATIC_KEY_INIT_FALSE;
DEFINE_STATIC_KEY_FALSE(cpusets_enabled_key);
/* See "Frequency meter" comments, below. */
@ -2528,27 +2528,27 @@ static struct cpuset *nearest_hardwall_ancestor(struct cpuset *cs)
* GFP_KERNEL - any node in enclosing hardwalled cpuset ok
* GFP_USER - only nodes in current tasks mems allowed ok.
*/
int __cpuset_node_allowed(int node, gfp_t gfp_mask)
bool __cpuset_node_allowed(int node, gfp_t gfp_mask)
{
struct cpuset *cs; /* current cpuset ancestors */
int allowed; /* is allocation in zone z allowed? */
unsigned long flags;
if (in_interrupt())
return 1;
return true;
if (node_isset(node, current->mems_allowed))
return 1;
return true;
/*
* Allow tasks that have access to memory reserves because they have
* been OOM killed to get memory anywhere.
*/
if (unlikely(test_thread_flag(TIF_MEMDIE)))
return 1;
return true;
if (gfp_mask & __GFP_HARDWALL) /* If hardwall request, stop here */
return 0;
return false;
if (current->flags & PF_EXITING) /* Let dying task have memory */
return 1;
return true;
/* Not hardwall and node outside mems_allowed: scan up cpusets */
spin_lock_irqsave(&callback_lock, flags);
@ -2591,13 +2591,7 @@ int __cpuset_node_allowed(int node, gfp_t gfp_mask)
static int cpuset_spread_node(int *rotor)
{
int node;
node = next_node(*rotor, current->mems_allowed);
if (node == MAX_NUMNODES)
node = first_node(current->mems_allowed);
*rotor = node;
return node;
return *rotor = next_node_in(*rotor, current->mems_allowed);
}
int cpuset_mem_spread_node(void)

View File

@ -1410,7 +1410,7 @@ static int __init crash_save_vmcoreinfo_init(void)
VMCOREINFO_STRUCT_SIZE(list_head);
VMCOREINFO_SIZE(nodemask_t);
VMCOREINFO_OFFSET(page, flags);
VMCOREINFO_OFFSET(page, _count);
VMCOREINFO_OFFSET(page, _refcount);
VMCOREINFO_OFFSET(page, mapping);
VMCOREINFO_OFFSET(page, lru);
VMCOREINFO_OFFSET(page, _mapcount);

View File

@ -606,33 +606,6 @@ out_replace:
return 0;
}
/**
* padata_set_cpumasks - Set both parallel and serial cpumasks. The first
* one is used by parallel workers and the second one
* by the workers doing serialization.
*
* @pinst: padata instance
* @pcpumask: the cpumask to use for parallel workers
* @cbcpumask: the cpumask to use for serial workers
*/
int padata_set_cpumasks(struct padata_instance *pinst, cpumask_var_t pcpumask,
cpumask_var_t cbcpumask)
{
int err;
mutex_lock(&pinst->lock);
get_online_cpus();
err = __padata_set_cpumasks(pinst, pcpumask, cbcpumask);
put_online_cpus();
mutex_unlock(&pinst->lock);
return err;
}
EXPORT_SYMBOL(padata_set_cpumasks);
/**
* padata_set_cpumask: Sets specified by @cpumask_type cpumask to the value
* equivalent to @cpumask.
@ -674,6 +647,43 @@ out:
}
EXPORT_SYMBOL(padata_set_cpumask);
/**
* padata_start - start the parallel processing
*
* @pinst: padata instance to start
*/
int padata_start(struct padata_instance *pinst)
{
int err = 0;
mutex_lock(&pinst->lock);
if (pinst->flags & PADATA_INVALID)
err = -EINVAL;
__padata_start(pinst);
mutex_unlock(&pinst->lock);
return err;
}
EXPORT_SYMBOL(padata_start);
/**
* padata_stop - stop the parallel processing
*
* @pinst: padata instance to stop
*/
void padata_stop(struct padata_instance *pinst)
{
mutex_lock(&pinst->lock);
__padata_stop(pinst);
mutex_unlock(&pinst->lock);
}
EXPORT_SYMBOL(padata_stop);
#ifdef CONFIG_HOTPLUG_CPU
static int __padata_add_cpu(struct padata_instance *pinst, int cpu)
{
struct parallel_data *pd;
@ -694,42 +704,6 @@ static int __padata_add_cpu(struct padata_instance *pinst, int cpu)
return 0;
}
/**
* padata_add_cpu - add a cpu to one or both(parallel and serial)
* padata cpumasks.
*
* @pinst: padata instance
* @cpu: cpu to add
* @mask: bitmask of flags specifying to which cpumask @cpu should be added.
* The @mask may be any combination of the following flags:
* PADATA_CPU_SERIAL - serial cpumask
* PADATA_CPU_PARALLEL - parallel cpumask
*/
int padata_add_cpu(struct padata_instance *pinst, int cpu, int mask)
{
int err;
if (!(mask & (PADATA_CPU_SERIAL | PADATA_CPU_PARALLEL)))
return -EINVAL;
mutex_lock(&pinst->lock);
get_online_cpus();
if (mask & PADATA_CPU_SERIAL)
cpumask_set_cpu(cpu, pinst->cpumask.cbcpu);
if (mask & PADATA_CPU_PARALLEL)
cpumask_set_cpu(cpu, pinst->cpumask.pcpu);
err = __padata_add_cpu(pinst, cpu);
put_online_cpus();
mutex_unlock(&pinst->lock);
return err;
}
EXPORT_SYMBOL(padata_add_cpu);
static int __padata_remove_cpu(struct padata_instance *pinst, int cpu)
{
struct parallel_data *pd = NULL;
@ -789,43 +763,6 @@ int padata_remove_cpu(struct padata_instance *pinst, int cpu, int mask)
}
EXPORT_SYMBOL(padata_remove_cpu);
/**
* padata_start - start the parallel processing
*
* @pinst: padata instance to start
*/
int padata_start(struct padata_instance *pinst)
{
int err = 0;
mutex_lock(&pinst->lock);
if (pinst->flags & PADATA_INVALID)
err =-EINVAL;
__padata_start(pinst);
mutex_unlock(&pinst->lock);
return err;
}
EXPORT_SYMBOL(padata_start);
/**
* padata_stop - stop the parallel processing
*
* @pinst: padata instance to stop
*/
void padata_stop(struct padata_instance *pinst)
{
mutex_lock(&pinst->lock);
__padata_stop(pinst);
mutex_unlock(&pinst->lock);
}
EXPORT_SYMBOL(padata_stop);
#ifdef CONFIG_HOTPLUG_CPU
static inline int pinst_has_cpu(struct padata_instance *pinst, int cpu)
{
return cpumask_test_cpu(cpu, pinst->cpumask.pcpu) ||
@ -1091,7 +1028,6 @@ err_free_inst:
err:
return NULL;
}
EXPORT_SYMBOL(padata_alloc);
/**
* padata_free - free a padata instance

View File

@ -380,29 +380,9 @@ void destroy_rcu_head(struct rcu_head *head)
debug_object_free(head, &rcuhead_debug_descr);
}
/*
* fixup_activate is called when:
* - an active object is activated
* - an unknown object is activated (might be a statically initialized object)
* Activation is performed internally by call_rcu().
*/
static int rcuhead_fixup_activate(void *addr, enum debug_obj_state state)
static bool rcuhead_is_static_object(void *addr)
{
struct rcu_head *head = addr;
switch (state) {
case ODEBUG_STATE_NOTAVAILABLE:
/*
* This is not really a fixup. We just make sure that it is
* tracked in the object tracker.
*/
debug_object_init(head, &rcuhead_debug_descr);
debug_object_activate(head, &rcuhead_debug_descr);
return 0;
default:
return 1;
}
return true;
}
/**
@ -440,7 +420,7 @@ EXPORT_SYMBOL_GPL(destroy_rcu_head_on_stack);
struct debug_obj_descr rcuhead_debug_descr = {
.name = "rcu_head",
.fixup_activate = rcuhead_fixup_activate,
.is_static_object = rcuhead_is_static_object,
};
EXPORT_SYMBOL_GPL(rcuhead_debug_descr);
#endif /* #ifdef CONFIG_DEBUG_OBJECTS_RCU_HEAD */

View File

@ -1521,6 +1521,13 @@ static struct ctl_table vm_table[] = {
.mode = 0644,
.proc_handler = proc_dointvec_jiffies,
},
{
.procname = "stat_refresh",
.data = NULL,
.maxlen = 0,
.mode = 0600,
.proc_handler = vmstat_refresh,
},
#endif
#ifdef CONFIG_MMU
{

View File

@ -334,7 +334,7 @@ static void *hrtimer_debug_hint(void *addr)
* fixup_init is called when:
* - an active object is initialized
*/
static int hrtimer_fixup_init(void *addr, enum debug_obj_state state)
static bool hrtimer_fixup_init(void *addr, enum debug_obj_state state)
{
struct hrtimer *timer = addr;
@ -342,30 +342,25 @@ static int hrtimer_fixup_init(void *addr, enum debug_obj_state state)
case ODEBUG_STATE_ACTIVE:
hrtimer_cancel(timer);
debug_object_init(timer, &hrtimer_debug_descr);
return 1;
return true;
default:
return 0;
return false;
}
}
/*
* fixup_activate is called when:
* - an active object is activated
* - an unknown object is activated (might be a statically initialized object)
* - an unknown non-static object is activated
*/
static int hrtimer_fixup_activate(void *addr, enum debug_obj_state state)
static bool hrtimer_fixup_activate(void *addr, enum debug_obj_state state)
{
switch (state) {
case ODEBUG_STATE_NOTAVAILABLE:
WARN_ON_ONCE(1);
return 0;
case ODEBUG_STATE_ACTIVE:
WARN_ON(1);
default:
return 0;
return false;
}
}
@ -373,7 +368,7 @@ static int hrtimer_fixup_activate(void *addr, enum debug_obj_state state)
* fixup_free is called when:
* - an active object is freed
*/
static int hrtimer_fixup_free(void *addr, enum debug_obj_state state)
static bool hrtimer_fixup_free(void *addr, enum debug_obj_state state)
{
struct hrtimer *timer = addr;
@ -381,9 +376,9 @@ static int hrtimer_fixup_free(void *addr, enum debug_obj_state state)
case ODEBUG_STATE_ACTIVE:
hrtimer_cancel(timer);
debug_object_free(timer, &hrtimer_debug_descr);
return 1;
return true;
default:
return 0;
return false;
}
}

View File

@ -769,3 +769,24 @@ struct timespec timespec_add_safe(const struct timespec lhs,
return res;
}
/*
* Add two timespec64 values and do a safety check for overflow.
* It's assumed that both values are valid (>= 0).
* And, each timespec64 is in normalized form.
*/
struct timespec64 timespec64_add_safe(const struct timespec64 lhs,
const struct timespec64 rhs)
{
struct timespec64 res;
set_normalized_timespec64(&res, lhs.tv_sec + rhs.tv_sec,
lhs.tv_nsec + rhs.tv_nsec);
if (unlikely(res.tv_sec < lhs.tv_sec || res.tv_sec < rhs.tv_sec)) {
res.tv_sec = TIME64_MAX;
res.tv_nsec = 0;
}
return res;
}
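
The saturating behaviour of timespec64_add_safe() can be tried out in userspace with the sketch below. It is an approximation under stated assumptions: `struct demo_timespec64`, `DEMO_TIME64_MAX` and `demo_add_safe()` are hypothetical stand-ins, and the overflow test is done on an unsigned sum rather than by comparing the wrapped result against the inputs as the kernel code does, because signed overflow is undefined in portable C.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_NSEC_PER_SEC 1000000000L
#define DEMO_TIME64_MAX   INT64_MAX	/* assumed to match TIME64_MAX */

/* Hypothetical stand-in for struct timespec64. */
struct demo_timespec64 {
	int64_t tv_sec;
	long    tv_nsec;
};

/* Add two non-negative, normalized values; saturate on overflow. */
static struct demo_timespec64 demo_add_safe(struct demo_timespec64 lhs,
					    struct demo_timespec64 rhs)
{
	struct demo_timespec64 res;
	uint64_t sec;
	long nsec;

	/* Same preconditions as the kernel comment: valid and normalized. */
	assert(lhs.tv_sec >= 0 && rhs.tv_sec >= 0);
	assert(lhs.tv_nsec >= 0 && lhs.tv_nsec < DEMO_NSEC_PER_SEC);
	assert(rhs.tv_nsec >= 0 && rhs.tv_nsec < DEMO_NSEC_PER_SEC);

	/* Unsigned arithmetic cannot overflow here: 2 * INT64_MAX + 1 fits. */
	sec = (uint64_t)lhs.tv_sec + (uint64_t)rhs.tv_sec;
	nsec = lhs.tv_nsec + rhs.tv_nsec;
	if (nsec >= DEMO_NSEC_PER_SEC) {
		nsec -= DEMO_NSEC_PER_SEC;
		sec++;
	}

	if (sec > (uint64_t)DEMO_TIME64_MAX) {
		res.tv_sec = DEMO_TIME64_MAX;
		res.tv_nsec = 0;
		return res;
	}

	res.tv_sec = (int64_t)sec;
	res.tv_nsec = nsec;
	return res;
}
```

Saturating instead of wrapping means a timeout computed from a very large user-supplied value stays "far in the future" rather than landing in the past.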

View File

@ -489,11 +489,19 @@ static void *timer_debug_hint(void *addr)
return ((struct timer_list *) addr)->function;
}
static bool timer_is_static_object(void *addr)
{
struct timer_list *timer = addr;
return (timer->entry.pprev == NULL &&
timer->entry.next == TIMER_ENTRY_STATIC);
}
/*
* fixup_init is called when:
* - an active object is initialized
*/
static int timer_fixup_init(void *addr, enum debug_obj_state state)
static bool timer_fixup_init(void *addr, enum debug_obj_state state)
{
struct timer_list *timer = addr;
@ -501,9 +509,9 @@ static int timer_fixup_init(void *addr, enum debug_obj_state state)
case ODEBUG_STATE_ACTIVE:
del_timer_sync(timer);
debug_object_init(timer, &timer_debug_descr);
return 1;
return true;
default:
return 0;
return false;
}
}
@ -516,36 +524,22 @@ static void stub_timer(unsigned long data)
/*
* fixup_activate is called when:
* - an active object is activated
* - an unknown object is activated (might be a statically initialized object)
* - an unknown non-static object is activated
*/
static int timer_fixup_activate(void *addr, enum debug_obj_state state)
static bool timer_fixup_activate(void *addr, enum debug_obj_state state)
{
struct timer_list *timer = addr;
switch (state) {
case ODEBUG_STATE_NOTAVAILABLE:
/*
* This is not really a fixup. The timer was
* statically initialized. We just make sure that it
* is tracked in the object tracker.
*/
if (timer->entry.pprev == NULL &&
timer->entry.next == TIMER_ENTRY_STATIC) {
debug_object_init(timer, &timer_debug_descr);
debug_object_activate(timer, &timer_debug_descr);
return 0;
} else {
setup_timer(timer, stub_timer, 0);
return 1;
}
return 0;
setup_timer(timer, stub_timer, 0);
return true;
case ODEBUG_STATE_ACTIVE:
WARN_ON(1);
default:
return 0;
return false;
}
}
@ -553,7 +547,7 @@ static int timer_fixup_activate(void *addr, enum debug_obj_state state)
* fixup_free is called when:
* - an active object is freed
*/
static int timer_fixup_free(void *addr, enum debug_obj_state state)
static bool timer_fixup_free(void *addr, enum debug_obj_state state)
{
struct timer_list *timer = addr;
@ -561,9 +555,9 @@ static int timer_fixup_free(void *addr, enum debug_obj_state state)
case ODEBUG_STATE_ACTIVE:
del_timer_sync(timer);
debug_object_free(timer, &timer_debug_descr);
return 1;
return true;
default:
return 0;
return false;
}
}
@ -571,32 +565,23 @@ static int timer_fixup_free(void *addr, enum debug_obj_state state)
* fixup_assert_init is called when:
* - an untracked/uninit-ed object is found
*/
static int timer_fixup_assert_init(void *addr, enum debug_obj_state state)
static bool timer_fixup_assert_init(void *addr, enum debug_obj_state state)
{
struct timer_list *timer = addr;
switch (state) {
case ODEBUG_STATE_NOTAVAILABLE:
if (timer->entry.next == TIMER_ENTRY_STATIC) {
/*
* This is not really a fixup. The timer was
* statically initialized. We just make sure that it
* is tracked in the object tracker.
*/
debug_object_init(timer, &timer_debug_descr);
return 0;
} else {
setup_timer(timer, stub_timer, 0);
return 1;
}
setup_timer(timer, stub_timer, 0);
return true;
default:
return 0;
return false;
}
}
static struct debug_obj_descr timer_debug_descr = {
.name = "timer_list",
.debug_hint = timer_debug_hint,
.is_static_object = timer_is_static_object,
.fixup_init = timer_fixup_init,
.fixup_activate = timer_fixup_activate,
.fixup_free = timer_fixup_free,

View File

@ -433,11 +433,18 @@ static void *work_debug_hint(void *addr)
return ((struct work_struct *) addr)->func;
}
static bool work_is_static_object(void *addr)
{
struct work_struct *work = addr;
return test_bit(WORK_STRUCT_STATIC_BIT, work_data_bits(work));
}
/*
* fixup_init is called when:
* - an active object is initialized
*/
static int work_fixup_init(void *addr, enum debug_obj_state state)
static bool work_fixup_init(void *addr, enum debug_obj_state state)
{
struct work_struct *work = addr;
@ -445,42 +452,9 @@ static int work_fixup_init(void *addr, enum debug_obj_state state)
case ODEBUG_STATE_ACTIVE:
cancel_work_sync(work);
debug_object_init(work, &work_debug_descr);
return 1;
return true;
default:
return 0;
}
}
/*
* fixup_activate is called when:
* - an active object is activated
* - an unknown object is activated (might be a statically initialized object)
*/
static int work_fixup_activate(void *addr, enum debug_obj_state state)
{
struct work_struct *work = addr;
switch (state) {
case ODEBUG_STATE_NOTAVAILABLE:
/*
* This is not really a fixup. The work struct was
* statically initialized. We just make sure that it
* is tracked in the object tracker.
*/
if (test_bit(WORK_STRUCT_STATIC_BIT, work_data_bits(work))) {
debug_object_init(work, &work_debug_descr);
debug_object_activate(work, &work_debug_descr);
return 0;
}
WARN_ON_ONCE(1);
return 0;
case ODEBUG_STATE_ACTIVE:
WARN_ON(1);
default:
return 0;
return false;
}
}
@ -488,7 +462,7 @@ static int work_fixup_activate(void *addr, enum debug_obj_state state)
* fixup_free is called when:
* - an active object is freed
*/
static int work_fixup_free(void *addr, enum debug_obj_state state)
static bool work_fixup_free(void *addr, enum debug_obj_state state)
{
struct work_struct *work = addr;
@ -496,17 +470,17 @@ static int work_fixup_free(void *addr, enum debug_obj_state state)
case ODEBUG_STATE_ACTIVE:
cancel_work_sync(work);
debug_object_free(work, &work_debug_descr);
return 1;
return true;
default:
return 0;
return false;
}
}
static struct debug_obj_descr work_debug_descr = {
.name = "work_struct",
.debug_hint = work_debug_hint,
.is_static_object = work_is_static_object,
.fixup_init = work_fixup_init,
.fixup_activate = work_fixup_activate,
.fixup_free = work_fixup_free,
};

View File

@ -25,7 +25,7 @@ lib-y := ctype.o string.o vsprintf.o cmdline.o \
sha1.o md5.o irq_regs.o argv_split.o \
flex_proportions.o ratelimit.o show_mem.o \
is_single_threaded.o plist.o decompress.o kobject_uevent.o \
earlycpio.o seq_buf.o nmi_backtrace.o
earlycpio.o seq_buf.o nmi_backtrace.o nodemask.o
obj-$(CONFIG_ARCH_HAS_DEBUG_STRICT_USER_COPY_CHECKS) += usercopy.o
lib-$(CONFIG_MMU) += ioremap.o

View File

@ -269,16 +269,15 @@ static void debug_print_object(struct debug_obj *obj, char *msg)
* Try to repair the damage, so we have a better chance to get useful
* debug output.
*/
static int
debug_object_fixup(int (*fixup)(void *addr, enum debug_obj_state state),
static bool
debug_object_fixup(bool (*fixup)(void *addr, enum debug_obj_state state),
void * addr, enum debug_obj_state state)
{
int fixed = 0;
if (fixup)
fixed = fixup(addr, state);
debug_objects_fixups += fixed;
return fixed;
if (fixup && fixup(addr, state)) {
debug_objects_fixups++;
return true;
}
return false;
}
static void debug_object_is_on_stack(void *addr, int onstack)
@ -416,7 +415,7 @@ int debug_object_activate(void *addr, struct debug_obj_descr *descr)
state = obj->state;
raw_spin_unlock_irqrestore(&db->lock, flags);
ret = debug_object_fixup(descr->fixup_activate, addr, state);
return ret ? -EINVAL : 0;
return ret ? 0 : -EINVAL;
case ODEBUG_STATE_DESTROYED:
debug_print_object(obj, "activate");
@ -432,14 +431,21 @@ int debug_object_activate(void *addr, struct debug_obj_descr *descr)
raw_spin_unlock_irqrestore(&db->lock, flags);
/*
* This happens when a static object is activated. We
* let the type specific code decide whether this is
* true or not.
* We are here when a static object is activated. We
* let the type specific code confirm whether this is
* true or not. If true, we just make sure that the
* static object is tracked in the object tracker. If
* not, this must be a bug, so we try to fix it up.
*/
if (debug_object_fixup(descr->fixup_activate, addr,
ODEBUG_STATE_NOTAVAILABLE)) {
if (descr->is_static_object && descr->is_static_object(addr)) {
/* track this static object */
debug_object_init(addr, descr);
debug_object_activate(addr, descr);
} else {
debug_print_object(&o, "activate");
return -EINVAL;
ret = debug_object_fixup(descr->fixup_activate, addr,
ODEBUG_STATE_NOTAVAILABLE);
return ret ? 0 : -EINVAL;
}
return 0;
}
@ -603,12 +609,18 @@ void debug_object_assert_init(void *addr, struct debug_obj_descr *descr)
raw_spin_unlock_irqrestore(&db->lock, flags);
/*
* Maybe the object is static. Let the type specific
* code decide what to do.
* Maybe the object is static, and we let the type specific
* code confirm. Track this static object if true, else invoke
* fixup.
*/
if (debug_object_fixup(descr->fixup_assert_init, addr,
ODEBUG_STATE_NOTAVAILABLE))
if (descr->is_static_object && descr->is_static_object(addr)) {
/* Track this static object */
debug_object_init(addr, descr);
} else {
debug_print_object(&o, "assert_init");
debug_object_fixup(descr->fixup_assert_init, addr,
ODEBUG_STATE_NOTAVAILABLE);
}
return;
}
@ -793,11 +805,18 @@ struct self_test {
static __initdata struct debug_obj_descr descr_type_test;
static bool __init is_static_object(void *addr)
{
struct self_test *obj = addr;
return obj->static_init;
}
/*
* fixup_init is called when:
* - an active object is initialized
*/
static int __init fixup_init(void *addr, enum debug_obj_state state)
static bool __init fixup_init(void *addr, enum debug_obj_state state)
{
struct self_test *obj = addr;
@ -805,37 +824,31 @@ static int __init fixup_init(void *addr, enum debug_obj_state state)
case ODEBUG_STATE_ACTIVE:
debug_object_deactivate(obj, &descr_type_test);
debug_object_init(obj, &descr_type_test);
return 1;
return true;
default:
return 0;
return false;
}
}
/*
* fixup_activate is called when:
* - an active object is activated
* - an unknown object is activated (might be a statically initialized object)
* - an unknown non-static object is activated
*/
static int __init fixup_activate(void *addr, enum debug_obj_state state)
static bool __init fixup_activate(void *addr, enum debug_obj_state state)
{
struct self_test *obj = addr;
switch (state) {
case ODEBUG_STATE_NOTAVAILABLE:
if (obj->static_init == 1) {
debug_object_init(obj, &descr_type_test);
debug_object_activate(obj, &descr_type_test);
return 0;
}
return 1;
return true;
case ODEBUG_STATE_ACTIVE:
debug_object_deactivate(obj, &descr_type_test);
debug_object_activate(obj, &descr_type_test);
return 1;
return true;
default:
return 0;
return false;
}
}
@ -843,7 +856,7 @@ static int __init fixup_activate(void *addr, enum debug_obj_state state)
* fixup_destroy is called when:
* - an active object is destroyed
*/
static int __init fixup_destroy(void *addr, enum debug_obj_state state)
static bool __init fixup_destroy(void *addr, enum debug_obj_state state)
{
struct self_test *obj = addr;
@ -851,9 +864,9 @@ static int __init fixup_destroy(void *addr, enum debug_obj_state state)
case ODEBUG_STATE_ACTIVE:
debug_object_deactivate(obj, &descr_type_test);
debug_object_destroy(obj, &descr_type_test);
return 1;
return true;
default:
return 0;
return false;
}
}
@ -861,7 +874,7 @@ static int __init fixup_destroy(void *addr, enum debug_obj_state state)
* fixup_free is called when:
* - an active object is freed
*/
static int __init fixup_free(void *addr, enum debug_obj_state state)
static bool __init fixup_free(void *addr, enum debug_obj_state state)
{
struct self_test *obj = addr;
@ -869,9 +882,9 @@ static int __init fixup_free(void *addr, enum debug_obj_state state)
case ODEBUG_STATE_ACTIVE:
debug_object_deactivate(obj, &descr_type_test);
debug_object_free(obj, &descr_type_test);
return 1;
return true;
default:
return 0;
return false;
}
}
@ -917,6 +930,7 @@ out:
static __initdata struct debug_obj_descr descr_type_test = {
.name = "selftest",
.is_static_object = is_static_object,
.fixup_init = fixup_init,
.fixup_activate = fixup_activate,
.fixup_destroy = fixup_destroy,
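
The new division of labour between is_static_object() and the bool-returning fixup callbacks can be modelled in a few lines of userspace C. This sketch covers only the "object not yet tracked" branch of activation and leaves out the object tracker itself; every `demo_*` name is hypothetical and not part of the debugobjects API.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

enum demo_state { DEMO_NOTAVAILABLE, DEMO_ACTIVE };

/* Hypothetical, reduced version of a debug object descriptor. */
struct demo_descr {
	bool (*is_static_object)(void *addr);
	bool (*fixup_activate)(void *addr, enum demo_state state);
};

static unsigned int demo_fixups;	/* counts successful fixups */

/* Mirrors the new helper shape: true only if a fixup ran and succeeded. */
static bool demo_fixup(bool (*fixup)(void *addr, enum demo_state state),
		       void *addr, enum demo_state state)
{
	if (fixup && fixup(addr, state)) {
		demo_fixups++;
		return true;
	}
	return false;
}

/* Untracked object is activated: static objects are accepted, others fixed up. */
static int demo_activate_untracked(void *addr, const struct demo_descr *descr)
{
	if (descr->is_static_object && descr->is_static_object(addr))
		return 0;	/* the real code would start tracking here */

	return demo_fixup(descr->fixup_activate, addr, DEMO_NOTAVAILABLE) ?
		0 : -EINVAL;
}

/* Example object type and callbacks, for illustration only. */
struct demo_obj {
	bool static_init;
	bool repaired;
};

static bool demo_is_static(void *addr)
{
	return ((struct demo_obj *)addr)->static_init;
}

static bool demo_fix_activate(void *addr, enum demo_state state)
{
	if (state != DEMO_NOTAVAILABLE)
		return false;
	((struct demo_obj *)addr)->repaired = true;
	return true;
}
```

With this split, the fixup callback no longer needs to recognise static objects itself, which is why the per-type fixup_activate functions in this series shrink or disappear.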

30
lib/nodemask.c Normal file
View File

@ -0,0 +1,30 @@
#include <linux/nodemask.h>
#include <linux/module.h>
#include <linux/random.h>
int __next_node_in(int node, const nodemask_t *srcp)
{
int ret = __next_node(node, srcp);
if (ret == MAX_NUMNODES)
ret = __first_node(srcp);
return ret;
}
EXPORT_SYMBOL(__next_node_in);
#ifdef CONFIG_NUMA
/*
* Return the bit number of a random bit set in the nodemask.
* (returns NUMA_NO_NODE if nodemask is empty)
*/
int node_random(const nodemask_t *maskp)
{
int w, bit = NUMA_NO_NODE;
w = nodes_weight(*maskp);
if (w)
bit = bitmap_ord_to_pos(maskp->bits,
get_random_int() % w, MAX_NUMNODES);
return bit;
}
#endif
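
The wrap-around lookup done by __next_node_in() can be illustrated with a plain 64-bit mask. This is a sketch only: `DEMO_MAX_NODES` and the `demo_*` helpers are hypothetical, and a real nodemask_t is a bitmap that may span several words.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MAX_NODES 64	/* assumed size; plays the role of MAX_NUMNODES */

/* First set bit strictly after 'node', or DEMO_MAX_NODES if there is none. */
static int demo_next_node(int node, uint64_t mask)
{
	int n;

	assert(node >= -1 && node < DEMO_MAX_NODES);
	for (n = node + 1; n < DEMO_MAX_NODES; n++)
		if (mask & (UINT64_C(1) << n))
			return n;
	return DEMO_MAX_NODES;
}

/* Lowest set bit, or DEMO_MAX_NODES for an empty mask. */
static int demo_first_node(uint64_t mask)
{
	return demo_next_node(-1, mask);
}

/* Same shape as __next_node_in(): search forward, then wrap to the start. */
static int demo_next_node_in(int node, uint64_t mask)
{
	int ret = demo_next_node(node, mask);

	if (ret == DEMO_MAX_NODES)
		ret = demo_first_node(mask);
	return ret;
}
```

A round-robin rotor such as the one in cpuset_spread_node() then reduces to repeatedly feeding the previous result back in as `node`.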

View File

@ -19,7 +19,7 @@ static DEFINE_SPINLOCK(percpu_counters_lock);
static struct debug_obj_descr percpu_counter_debug_descr;
static int percpu_counter_fixup_free(void *addr, enum debug_obj_state state)
static bool percpu_counter_fixup_free(void *addr, enum debug_obj_state state)
{
struct percpu_counter *fbc = addr;
@ -27,9 +27,9 @@ static int percpu_counter_fixup_free(void *addr, enum debug_obj_state state)
case ODEBUG_STATE_ACTIVE:
percpu_counter_destroy(fbc);
debug_object_free(fbc, &percpu_counter_debug_descr);
return 1;
return true;
default:
return 0;
return false;
}
}

View File

@ -192,6 +192,22 @@ config MEMORY_HOTPLUG_SPARSE
def_bool y
depends on SPARSEMEM && MEMORY_HOTPLUG
config MEMORY_HOTPLUG_DEFAULT_ONLINE
bool "Online the newly added memory blocks by default"
default n
depends on MEMORY_HOTPLUG
help
This option sets the default policy setting for memory hotplug
onlining policy (/sys/devices/system/memory/auto_online_blocks) which
determines what happens to newly added memory regions. Policy setting
can always be changed at runtime.
See Documentation/memory-hotplug.txt for more information.
Say Y here if you want all hot-plugged memory blocks to appear in
'online' state by default.
Say N here if you want the default policy to keep all hot-plugged
memory blocks in 'offline' state.
config MEMORY_HOTREMOVE
bool "Allow for memory hot remove"
select MEMORY_ISOLATION
@ -268,11 +284,6 @@ config ARCH_ENABLE_HUGEPAGE_MIGRATION
config PHYS_ADDR_T_64BIT
def_bool 64BIT || ARCH_PHYS_ADDR_T_64BIT
config ZONE_DMA_FLAG
int
default "0" if !ZONE_DMA
default "1"
config BOUNCE
bool "Enable bounce buffers"
default y

View File

@ -42,6 +42,11 @@ static inline void count_compact_events(enum vm_event_item item, long delta)
#define CREATE_TRACE_POINTS
#include <trace/events/compaction.h>
#define block_start_pfn(pfn, order) round_down(pfn, 1UL << (order))
#define block_end_pfn(pfn, order) ALIGN((pfn) + 1, 1UL << (order))
#define pageblock_start_pfn(pfn) block_start_pfn(pfn, pageblock_order)
#define pageblock_end_pfn(pfn) block_end_pfn(pfn, pageblock_order)
static unsigned long release_freepages(struct list_head *freelist)
{
struct page *page, *next;
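
The arithmetic behind the new block_start_pfn()/block_end_pfn() macros can be checked in userspace. The functions below are a hypothetical restatement, assuming the usual power-of-two meaning of round_down() and ALIGN(); the order value 9 in the usage note is just an example, not necessarily pageblock_order on any given configuration.

```c
#include <assert.h>
#include <limits.h>

/* Start of the order-aligned block containing pfn (round down). */
static unsigned long demo_block_start_pfn(unsigned long pfn, unsigned int order)
{
	assert(order < sizeof(unsigned long) * CHAR_BIT);
	return pfn & ~((1UL << order) - 1);
}

/* One past the last pfn of the block containing pfn: ALIGN(pfn + 1, size). */
static unsigned long demo_block_end_pfn(unsigned long pfn, unsigned int order)
{
	unsigned long size;

	assert(order < sizeof(unsigned long) * CHAR_BIT);
	size = 1UL << order;
	return (pfn + size) & ~(size - 1);
}
```

For order 9 (512 pages per block), pfn 1000 falls in the block [512, 1024), and a pfn that is already block-aligned, such as 1024, starts its own block [1024, 1536). Using pfn + 1 in the end calculation is what keeps an aligned pfn from getting an empty block.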
@ -161,7 +166,7 @@ static void reset_cached_positions(struct zone *zone)
zone->compact_cached_migrate_pfn[0] = zone->zone_start_pfn;
zone->compact_cached_migrate_pfn[1] = zone->zone_start_pfn;
zone->compact_cached_free_pfn =
round_down(zone_end_pfn(zone) - 1, pageblock_nr_pages);
pageblock_start_pfn(zone_end_pfn(zone) - 1);
}
/*
@ -519,10 +524,10 @@ isolate_freepages_range(struct compact_control *cc,
LIST_HEAD(freelist);
pfn = start_pfn;
block_start_pfn = pfn & ~(pageblock_nr_pages - 1);
block_start_pfn = pageblock_start_pfn(pfn);
if (block_start_pfn < cc->zone->zone_start_pfn)
block_start_pfn = cc->zone->zone_start_pfn;
block_end_pfn = ALIGN(pfn + 1, pageblock_nr_pages);
block_end_pfn = pageblock_end_pfn(pfn);
for (; pfn < end_pfn; pfn += isolated,
block_start_pfn = block_end_pfn,
@ -538,8 +543,8 @@ isolate_freepages_range(struct compact_control *cc,
* scanning range to right one.
*/
if (pfn >= block_end_pfn) {
block_start_pfn = pfn & ~(pageblock_nr_pages - 1);
block_end_pfn = ALIGN(pfn + 1, pageblock_nr_pages);
block_start_pfn = pageblock_start_pfn(pfn);
block_end_pfn = pageblock_end_pfn(pfn);
block_end_pfn = min(block_end_pfn, end_pfn);
}
@ -633,12 +638,13 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
{
struct zone *zone = cc->zone;
unsigned long nr_scanned = 0, nr_isolated = 0;
struct list_head *migratelist = &cc->migratepages;
struct lruvec *lruvec;
unsigned long flags = 0;
bool locked = false;
struct page *page = NULL, *valid_page = NULL;
unsigned long start_pfn = low_pfn;
bool skip_on_failure = false;
unsigned long next_skip_pfn = 0;
/*
* Ensure that there are not too many pages isolated from the LRU
@ -659,10 +665,37 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
if (compact_should_abort(cc))
return 0;
if (cc->direct_compaction && (cc->mode == MIGRATE_ASYNC)) {
skip_on_failure = true;
next_skip_pfn = block_end_pfn(low_pfn, cc->order);
}
/* Time to isolate some pages for migration */
for (; low_pfn < end_pfn; low_pfn++) {
bool is_lru;
if (skip_on_failure && low_pfn >= next_skip_pfn) {
/*
* We have isolated all migration candidates in the
* previous order-aligned block, and did not skip it due
* to failure. We should migrate the pages now and
* hopefully succeed compaction.
*/
if (nr_isolated)
break;
/*
* We failed to isolate in the previous order-aligned
* block. Set the new boundary to the end of the
* current block. Note we can't simply increase
* next_skip_pfn by 1 << order, as low_pfn might have
* been incremented by a higher number due to skipping
* a compound or a high-order buddy page in the
* previous loop iteration.
*/
next_skip_pfn = block_end_pfn(low_pfn, cc->order);
}
/*
* Periodically drop the lock (if held) regardless of its
* contention, to give chance to IRQs. Abort async compaction
@ -674,7 +707,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
break;
if (!pfn_valid_within(low_pfn))
continue;
goto isolate_fail;
nr_scanned++;
page = pfn_to_page(low_pfn);
@ -729,11 +762,11 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
if (likely(comp_order < MAX_ORDER))
low_pfn += (1UL << comp_order) - 1;
continue;
goto isolate_fail;
}
if (!is_lru)
continue;
goto isolate_fail;
/*
* Migration will fail if an anonymous page is pinned in memory,
@ -742,7 +775,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
*/
if (!page_mapping(page) &&
page_count(page) > page_mapcount(page))
continue;
goto isolate_fail;
/* If we already hold the lock, we can skip some rechecking */
if (!locked) {
@ -753,7 +786,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
/* Recheck PageLRU and PageCompound under lock */
if (!PageLRU(page))
continue;
goto isolate_fail;
/*
* Page become compound since the non-locked check,
@ -762,7 +795,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
*/
if (unlikely(PageCompound(page))) {
low_pfn += (1UL << compound_order(page)) - 1;
continue;
goto isolate_fail;
}
}
@ -770,7 +803,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
/* Try isolate the page */
if (__isolate_lru_page(page, isolate_mode) != 0)
continue;
goto isolate_fail;
VM_BUG_ON_PAGE(PageCompound(page), page);
@ -778,15 +811,55 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
del_page_from_lru_list(page, lruvec, page_lru(page));
isolate_success:
list_add(&page->lru, migratelist);
list_add(&page->lru, &cc->migratepages);
cc->nr_migratepages++;
nr_isolated++;
/*
* Record where we could have freed pages by migration and not
* yet flushed them to buddy allocator.
* - this is the lowest page that was isolated and likely be
* then freed by migration.
*/
if (!cc->last_migrated_pfn)
cc->last_migrated_pfn = low_pfn;
/* Avoid isolating too much */
if (cc->nr_migratepages == COMPACT_CLUSTER_MAX) {
++low_pfn;
break;
}
continue;
isolate_fail:
if (!skip_on_failure)
continue;
/*
* We have isolated some pages, but then failed. Release them
* instead of migrating, as we cannot form the cc->order buddy
* page anyway.
*/
if (nr_isolated) {
if (locked) {
spin_unlock_irqrestore(&zone->lru_lock, flags);
locked = false;
}
acct_isolated(zone, cc);
putback_movable_pages(&cc->migratepages);
cc->nr_migratepages = 0;
cc->last_migrated_pfn = 0;
nr_isolated = 0;
}
if (low_pfn < next_skip_pfn) {
low_pfn = next_skip_pfn - 1;
/*
* The check near the loop beginning would have updated
* next_skip_pfn too, but this is a bit simpler.
*/
next_skip_pfn += 1UL << cc->order;
}
}
/*
@ -834,10 +907,10 @@ isolate_migratepages_range(struct compact_control *cc, unsigned long start_pfn,
/* Scan block by block. First and last block may be incomplete */
pfn = start_pfn;
block_start_pfn = pfn & ~(pageblock_nr_pages - 1);
block_start_pfn = pageblock_start_pfn(pfn);
if (block_start_pfn < cc->zone->zone_start_pfn)
block_start_pfn = cc->zone->zone_start_pfn;
block_end_pfn = ALIGN(pfn + 1, pageblock_nr_pages);
block_end_pfn = pageblock_end_pfn(pfn);
for (; pfn < end_pfn; pfn = block_end_pfn,
block_start_pfn = block_end_pfn,
@ -924,10 +997,10 @@ static void isolate_freepages(struct compact_control *cc)
* is using.
*/
isolate_start_pfn = cc->free_pfn;
block_start_pfn = cc->free_pfn & ~(pageblock_nr_pages-1);
block_start_pfn = pageblock_start_pfn(cc->free_pfn);
block_end_pfn = min(block_start_pfn + pageblock_nr_pages,
zone_end_pfn(zone));
low_pfn = ALIGN(cc->migrate_pfn + 1, pageblock_nr_pages);
low_pfn = pageblock_end_pfn(cc->migrate_pfn);
/*
* Isolate free pages until enough are available to migrate the
@ -1070,7 +1143,6 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone,
unsigned long block_start_pfn;
unsigned long block_end_pfn;
unsigned long low_pfn;
unsigned long isolate_start_pfn;
struct page *page;
const isolate_mode_t isolate_mode =
(sysctl_compact_unevictable_allowed ? ISOLATE_UNEVICTABLE : 0) |
@ -1081,12 +1153,12 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone,
* initialized by compact_zone()
*/
low_pfn = cc->migrate_pfn;
block_start_pfn = cc->migrate_pfn & ~(pageblock_nr_pages - 1);
block_start_pfn = pageblock_start_pfn(low_pfn);
if (block_start_pfn < zone->zone_start_pfn)
block_start_pfn = zone->zone_start_pfn;
/* Only scan within a pageblock boundary */
block_end_pfn = ALIGN(low_pfn + 1, pageblock_nr_pages);
block_end_pfn = pageblock_end_pfn(low_pfn);
/*
* Iterate over whole pageblocks until we find the first suitable.
@ -1125,7 +1197,6 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone,
continue;
/* Perform the isolation */
isolate_start_pfn = low_pfn;
low_pfn = isolate_migratepages_block(cc, low_pfn,
block_end_pfn, isolate_mode);
@ -1134,15 +1205,6 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone,
return ISOLATE_ABORT;
}
/*
* Record where we could have freed pages by migration and not
* yet flushed them to buddy allocator.
* - this is the lowest page that could have been isolated and
* then freed by migration.
*/
if (cc->nr_migratepages && !cc->last_migrated_pfn)
cc->last_migrated_pfn = isolate_start_pfn;
/*
* Either we isolated something and proceed with migration. Or
* we failed and compact_zone should decide if we should
@ -1251,7 +1313,8 @@ static int compact_finished(struct zone *zone, struct compact_control *cc,
* COMPACT_CONTINUE - If compaction should run now
*/
static unsigned long __compaction_suitable(struct zone *zone, int order,
int alloc_flags, int classzone_idx)
unsigned int alloc_flags,
int classzone_idx)
{
int fragindex;
unsigned long watermark;
@ -1296,7 +1359,8 @@ static unsigned long __compaction_suitable(struct zone *zone, int order,
}
unsigned long compaction_suitable(struct zone *zone, int order,
int alloc_flags, int classzone_idx)
unsigned int alloc_flags,
int classzone_idx)
{
unsigned long ret;
@ -1343,7 +1407,7 @@ static int compact_zone(struct zone *zone, struct compact_control *cc)
cc->migrate_pfn = zone->compact_cached_migrate_pfn[sync];
cc->free_pfn = zone->compact_cached_free_pfn;
if (cc->free_pfn < start_pfn || cc->free_pfn >= end_pfn) {
cc->free_pfn = round_down(end_pfn - 1, pageblock_nr_pages);
cc->free_pfn = pageblock_start_pfn(end_pfn - 1);
zone->compact_cached_free_pfn = cc->free_pfn;
}
if (cc->migrate_pfn < start_pfn || cc->migrate_pfn >= end_pfn) {
@ -1398,6 +1462,18 @@ static int compact_zone(struct zone *zone, struct compact_control *cc)
ret = COMPACT_CONTENDED;
goto out;
}
/*
* We failed to migrate at least one page in the current
* order-aligned block, so skip the rest of it.
*/
if (cc->direct_compaction &&
(cc->mode == MIGRATE_ASYNC)) {
cc->migrate_pfn = block_end_pfn(
cc->migrate_pfn - 1, cc->order);
/* Draining pcplists is useless in this case */
cc->last_migrated_pfn = 0;
}
}
check_drain:
@ -1411,7 +1487,7 @@ check_drain:
if (cc->order > 0 && cc->last_migrated_pfn) {
int cpu;
unsigned long current_block_start =
cc->migrate_pfn & ~((1UL << cc->order) - 1);
block_start_pfn(cc->migrate_pfn, cc->order);
if (cc->last_migrated_pfn < current_block_start) {
cpu = get_cpu();
@ -1436,7 +1512,7 @@ out:
cc->nr_freepages = 0;
VM_BUG_ON(free_pfn == 0);
/* The cached pfn is always the first in a pageblock */
free_pfn &= ~(pageblock_nr_pages-1);
free_pfn = pageblock_start_pfn(free_pfn);
/*
* Only go back, not forward. The cached pfn might have been
* already reset to zone end in compact_finished()
@ -1456,7 +1532,7 @@ out:
static unsigned long compact_zone_order(struct zone *zone, int order,
gfp_t gfp_mask, enum migrate_mode mode, int *contended,
int alloc_flags, int classzone_idx)
unsigned int alloc_flags, int classzone_idx)
{
unsigned long ret;
struct compact_control cc = {
@ -1497,8 +1573,8 @@ int sysctl_extfrag_threshold = 500;
* This is the main entry point for direct page compaction.
*/
unsigned long try_to_compact_pages(gfp_t gfp_mask, unsigned int order,
int alloc_flags, const struct alloc_context *ac,
enum migrate_mode mode, int *contended)
unsigned int alloc_flags, const struct alloc_context *ac,
enum migrate_mode mode, int *contended)
{
int may_enter_fs = gfp_mask & __GFP_FS;
int may_perform_io = gfp_mask & __GFP_IO;
@ -1526,7 +1602,7 @@ unsigned long try_to_compact_pages(gfp_t gfp_mask, unsigned int order,
status = compact_zone_order(zone, order, gfp_mask, mode,
&zone_contended, alloc_flags,
ac->classzone_idx);
ac_classzone_idx(ac));
rc = max(status, rc);
/*
* It takes at least one zone that wasn't lock contended
@ -1536,7 +1612,7 @@ unsigned long try_to_compact_pages(gfp_t gfp_mask, unsigned int order,
/* If a normal allocation would succeed, stop compacting */
if (zone_watermark_ok(zone, order, low_wmark_pages(zone),
ac->classzone_idx, alloc_flags)) {
ac_classzone_idx(ac), alloc_flags)) {
/*
* We think the allocation will succeed in this zone,
* but it is not certain, hence the false. The caller
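The `skip_on_failure` logic added to `isolate_migratepages_block()` in the hunks above can be sketched in user space as follows. This is a simplified model under stated assumptions, not the kernel code: pages are reduced to a hypothetical `can_isolate` array, locking and LRU handling are omitted, "releasing" isolated pages is just resetting a counter, and the `toy_` names are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* First pfn past the order-aligned block containing pfn. */
static unsigned long toy_block_end(unsigned long pfn, unsigned int order)
{
	unsigned long size = 1UL << order;

	return (pfn + size) & ~(size - 1);
}

/*
 * Scan [low_pfn, end_pfn). Stop at the first order-aligned boundary
 * reached with pages isolated; on any failure, drop what was isolated
 * and jump to the end of the current order-aligned block.
 * Returns the pfn where scanning stopped.
 */
static unsigned long toy_scan(const bool *can_isolate, unsigned long low_pfn,
			      unsigned long end_pfn, unsigned int order,
			      unsigned long *nr_isolated_out)
{
	unsigned long nr_isolated = 0;
	unsigned long next_skip_pfn = toy_block_end(low_pfn, order);

	assert(can_isolate && nr_isolated_out);

	for (; low_pfn < end_pfn; low_pfn++) {
		if (low_pfn >= next_skip_pfn) {
			/* A whole block was isolated: go migrate it. */
			if (nr_isolated)
				break;
			next_skip_pfn = toy_block_end(low_pfn, order);
		}

		if (can_isolate[low_pfn]) {
			nr_isolated++;
			continue;
		}

		/* Failure: the block cannot become free, release and skip. */
		nr_isolated = 0;
		if (low_pfn < next_skip_pfn) {
			low_pfn = next_skip_pfn - 1;
			next_skip_pfn += 1UL << order;
		}
	}

	*nr_isolated_out = nr_isolated;
	return low_pfn;
}
```

For an order-2 target (blocks of four pages), a failure on pfn 2 discards pfns 0 and 1 and resumes at pfn 4; if pfns 4 to 7 all isolate, the scan stops at pfn 8 with four pages ready for migration.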


@ -213,7 +213,7 @@ void __delete_from_page_cache(struct page *page, void *shadow)
* some other bad page check should catch it later.
*/
page_mapcount_reset(page);
atomic_sub(mapcount, &page->_count);
page_ref_sub(page, mapcount);
}
}


@ -112,16 +112,12 @@ EXPORT_PER_CPU_SYMBOL(__kmap_atomic_idx);
unsigned int nr_free_highpages (void)
{
pg_data_t *pgdat;
struct zone *zone;
unsigned int pages = 0;
for_each_online_pgdat(pgdat) {
pages += zone_page_state(&pgdat->node_zones[ZONE_HIGHMEM],
NR_FREE_PAGES);
if (zone_movable_is_highmem())
pages += zone_page_state(
&pgdat->node_zones[ZONE_MOVABLE],
NR_FREE_PAGES);
for_each_populated_zone(zone) {
if (is_highmem(zone))
pages += zone_page_state(zone, NR_FREE_PAGES);
}
return pages;


@ -1698,20 +1698,17 @@ int zap_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
return 1;
}
bool move_huge_pmd(struct vm_area_struct *vma, struct vm_area_struct *new_vma,
unsigned long old_addr,
bool move_huge_pmd(struct vm_area_struct *vma, unsigned long old_addr,
unsigned long new_addr, unsigned long old_end,
pmd_t *old_pmd, pmd_t *new_pmd)
{
spinlock_t *old_ptl, *new_ptl;
pmd_t pmd;
struct mm_struct *mm = vma->vm_mm;
if ((old_addr & ~HPAGE_PMD_MASK) ||
(new_addr & ~HPAGE_PMD_MASK) ||
old_end - old_addr < HPAGE_PMD_SIZE ||
(new_vma->vm_flags & VM_NOHUGEPAGE))
old_end - old_addr < HPAGE_PMD_SIZE)
return false;
/*
@ -3113,7 +3110,7 @@ static void __split_huge_page_tail(struct page *head, int tail,
VM_BUG_ON_PAGE(page_ref_count(page_tail) != 0, page_tail);
/*
* tail_page->_count is zero and not changing from under us. But
* tail_page->_refcount is zero and not changing from under us. But
* get_page_unless_zero() may be running from under us on the
* tail_page. If we used atomic_set() below instead of atomic_inc(), we
* would then run atomic_set() concurrently with
@ -3340,7 +3337,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
if (mlocked)
lru_add_drain();
/* Prevent deferred_split_scan() touching ->_count */
/* Prevent deferred_split_scan() touching ->_refcount */
spin_lock_irqsave(&pgdata->split_queue_lock, flags);
count = page_count(head);
mapcount = total_mapcount(head);


@ -51,6 +51,7 @@ __initdata LIST_HEAD(huge_boot_pages);
static struct hstate * __initdata parsed_hstate;
static unsigned long __initdata default_hstate_max_huge_pages;
static unsigned long __initdata default_hstate_size;
static bool __initdata parsed_valid_hugepagesz = true;
/*
* Protects updates to hugepage_freelists, hugepage_activelist, nr_huge_pages,
@ -144,7 +145,8 @@ static long hugepage_subpool_get_pages(struct hugepage_subpool *spool,
}
}
if (spool->min_hpages != -1) { /* minimum size accounting */
/* minimum size accounting */
if (spool->min_hpages != -1 && spool->rsv_hpages) {
if (delta > spool->rsv_hpages) {
/*
* Asking for more reserves than those already taken on
@ -182,7 +184,8 @@ static long hugepage_subpool_put_pages(struct hugepage_subpool *spool,
if (spool->max_hpages != -1) /* maximum size accounting */
spool->used_hpages -= delta;
if (spool->min_hpages != -1) { /* minimum size accounting */
/* minimum size accounting */
if (spool->min_hpages != -1 && spool->used_hpages < spool->min_hpages) {
if (spool->rsv_hpages + delta <= spool->min_hpages)
ret = 0;
else
@ -937,9 +940,7 @@ err:
*/
static int next_node_allowed(int nid, nodemask_t *nodes_allowed)
{
nid = next_node(nid, *nodes_allowed);
if (nid == MAX_NUMNODES)
nid = first_node(*nodes_allowed);
nid = next_node_in(nid, *nodes_allowed);
VM_BUG_ON(nid >= MAX_NUMNODES);
return nid;
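Several hunks in this diff replace a `next_node()` call followed by a manual wrap to `first_node()` with a single `next_node_in()` call. The pattern being folded together can be sketched as below; this is a toy model that assumes a 64-bit mask in place of `nodemask_t`, with hypothetical `toy_` names and `TOY_MAX_NODES` standing in for `MAX_NUMNODES`.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAX_NODES 64

/* Next set bit strictly after nid, or TOY_MAX_NODES if there is none. */
static int toy_next_node(int nid, uint64_t nodes)
{
	assert(nid >= -1 && nid < TOY_MAX_NODES);
	for (int n = nid + 1; n < TOY_MAX_NODES; n++)
		if (nodes & (UINT64_C(1) << n))
			return n;
	return TOY_MAX_NODES;
}

/* Lowest set bit, or TOY_MAX_NODES for an empty mask. */
static int toy_first_node(uint64_t nodes)
{
	return toy_next_node(-1, nodes);
}

/*
 * Next set bit after nid, wrapping around to the first set bit.
 * Only an empty mask yields TOY_MAX_NODES.
 */
static int toy_next_node_in(int nid, uint64_t nodes)
{
	int next = toy_next_node(nid, nodes);

	if (next == TOY_MAX_NODES)
		next = toy_first_node(nodes);
	return next;
}
```

In this model a caller still has to handle the empty-mask case, which is consistent with the callers in this diff keeping their `>= MAX_NUMNODES` checks after the conversion.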
@ -1030,8 +1031,8 @@ static int __alloc_gigantic_page(unsigned long start_pfn,
return alloc_contig_range(start_pfn, end_pfn, MIGRATE_MOVABLE);
}
static bool pfn_range_valid_gigantic(unsigned long start_pfn,
unsigned long nr_pages)
static bool pfn_range_valid_gigantic(struct zone *z,
unsigned long start_pfn, unsigned long nr_pages)
{
unsigned long i, end_pfn = start_pfn + nr_pages;
struct page *page;
@ -1042,6 +1043,9 @@ static bool pfn_range_valid_gigantic(unsigned long start_pfn,
page = pfn_to_page(i);
if (page_zone(page) != z)
return false;
if (PageReserved(page))
return false;
@ -1074,7 +1078,7 @@ static struct page *alloc_gigantic_page(int nid, unsigned int order)
pfn = ALIGN(z->zone_start_pfn, nr_pages);
while (zone_spans_last_pfn(z, pfn, nr_pages)) {
if (pfn_range_valid_gigantic(pfn, nr_pages)) {
if (pfn_range_valid_gigantic(z, pfn, nr_pages)) {
/*
* We release the zone lock here because
* alloc_contig_range() will also lock the zone
@ -2659,6 +2663,11 @@ static int __init hugetlb_init(void)
subsys_initcall(hugetlb_init);
/* Should be called on processing a hugepagesz=... option */
void __init hugetlb_bad_size(void)
{
parsed_valid_hugepagesz = false;
}
void __init hugetlb_add_hstate(unsigned int order)
{
struct hstate *h;
@ -2678,8 +2687,8 @@ void __init hugetlb_add_hstate(unsigned int order)
for (i = 0; i < MAX_NUMNODES; ++i)
INIT_LIST_HEAD(&h->hugepage_freelists[i]);
INIT_LIST_HEAD(&h->hugepage_activelist);
h->next_nid_to_alloc = first_node(node_states[N_MEMORY]);
h->next_nid_to_free = first_node(node_states[N_MEMORY]);
h->next_nid_to_alloc = first_memory_node;
h->next_nid_to_free = first_memory_node;
snprintf(h->name, HSTATE_NAME_LEN, "hugepages-%lukB",
huge_page_size(h)/1024);
@ -2691,11 +2700,17 @@ static int __init hugetlb_nrpages_setup(char *s)
unsigned long *mhp;
static unsigned long *last_mhp;
if (!parsed_valid_hugepagesz) {
pr_warn("hugepages = %s preceded by "
"an unsupported hugepagesz, ignoring\n", s);
parsed_valid_hugepagesz = true;
return 1;
}
/*
* !hugetlb_max_hstate means we haven't parsed a hugepagesz= parameter yet,
* so this hugepages= parameter goes to the "default hstate".
*/
if (!hugetlb_max_hstate)
else if (!hugetlb_max_hstate)
mhp = &default_hstate_max_huge_pages;
else
mhp = &parsed_hstate->max_huge_pages;

View File

@ -58,7 +58,7 @@ static inline unsigned long ra_submit(struct file_ra_state *ra,
}
/*
* Turn a non-refcounted page (->_count == 0) into refcounted with
* Turn a non-refcounted page (->_refcount == 0) into refcounted with
* a count of one.
*/
static inline void set_page_refcounted(struct page *page)
@ -102,13 +102,14 @@ extern pmd_t *mm_find_pmd(struct mm_struct *mm, unsigned long address);
struct alloc_context {
struct zonelist *zonelist;
nodemask_t *nodemask;
struct zone *preferred_zone;
int classzone_idx;
struct zoneref *preferred_zoneref;
int migratetype;
enum zone_type high_zoneidx;
bool spread_dirty_pages;
};
#define ac_classzone_idx(ac) zonelist_zone_idx(ac->preferred_zoneref)
/*
* Locate the struct page for both the matching buddy in our
* pair (buddy1) and the combined O(n+1) page they form (page).
@ -175,7 +176,7 @@ struct compact_control {
bool direct_compaction; /* False from kcompactd or /proc/... */
int order; /* order a direct compactor needs */
const gfp_t gfp_mask; /* gfp mask of a direct compactor */
const int alloc_flags; /* alloc flags of a direct compactor */
const unsigned int alloc_flags; /* alloc flags of a direct compactor */
const int classzone_idx; /* zone index of a direct compactor */
struct zone *zone;
int contended; /* Signal need_sched() or lock


@ -1023,22 +1023,40 @@ out:
* @lru: index of lru list the page is sitting on
* @nr_pages: positive when adding or negative when removing
*
* This function must be called when a page is added to or removed from an
* lru list.
* This function must be called under lru_lock, just before a page is added
* to or just after a page is removed from an lru list (that ordering being
* so as to allow it to check that lru_size 0 is consistent with list_empty).
*/
void mem_cgroup_update_lru_size(struct lruvec *lruvec, enum lru_list lru,
int nr_pages)
{
struct mem_cgroup_per_zone *mz;
unsigned long *lru_size;
long size;
bool empty;
__update_lru_size(lruvec, lru, nr_pages);
if (mem_cgroup_disabled())
return;
mz = container_of(lruvec, struct mem_cgroup_per_zone, lruvec);
lru_size = mz->lru_size + lru;
*lru_size += nr_pages;
VM_BUG_ON((long)(*lru_size) < 0);
empty = list_empty(lruvec->lists + lru);
if (nr_pages < 0)
*lru_size += nr_pages;
size = *lru_size;
if (WARN_ONCE(size < 0 || empty != !size,
"%s(%p, %d, %d): lru_size %ld but %sempty\n",
__func__, lruvec, lru, nr_pages, size, empty ? "" : "not ")) {
VM_BUG_ON(1);
*lru_size = 0;
}
if (nr_pages > 0)
*lru_size += nr_pages;
}
bool task_in_mem_cgroup(struct task_struct *task, struct mem_cgroup *memcg)
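The new comment on `mem_cgroup_update_lru_size()` says the size must be updated just before a page is added and just after a page is removed, so that a size of zero can be checked against an empty list. A small user-space sketch of why that ordering works, assuming a hypothetical `struct toy_lru` that tracks a counter next to the number of list entries (the real code inspects a `list_head` and warns instead of returning a flag):

```c
#include <assert.h>
#include <stdbool.h>

struct toy_lru {
	long size;       /* accounted size, like *lru_size */
	long nr_on_list; /* stand-in for the list; 0 means list_empty() */
};

/*
 * Mirror of the update ordering: subtract before the check, add after
 * it, so the check always sees the size that matches the list as it is
 * right now. Returns false when the two disagree (size is then reset).
 */
static bool toy_update_lru_size(struct toy_lru *lru, int nr_pages)
{
	bool empty, ok = true;

	assert(lru);
	empty = (lru->nr_on_list == 0);

	if (nr_pages < 0)
		lru->size += nr_pages;

	if (lru->size < 0 || empty != !lru->size) {
		lru->size = 0;
		ok = false;
	}

	if (nr_pages > 0)
		lru->size += nr_pages;

	return ok;
}

/* Add: account first, then link the pages. */
static bool toy_lru_add(struct toy_lru *lru, int nr)
{
	bool ok = toy_update_lru_size(lru, nr);

	lru->nr_on_list += nr;
	return ok;
}

/* Remove: unlink the pages first, then account. */
static bool toy_lru_del(struct toy_lru *lru, int nr)
{
	lru->nr_on_list -= nr;
	return toy_update_lru_size(lru, -nr);
}
```

If the calls were ordered the other way round, a correct add to an empty list would be checked with the list already non-empty while the size is still zero, and would trip the consistency check spuriously.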
@ -1257,6 +1275,7 @@ static bool mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
*/
if (fatal_signal_pending(current) || task_will_free_mem(current)) {
mark_oom_victim(current);
try_oom_reaper(current);
goto unlock;
}
@ -1389,14 +1408,11 @@ int mem_cgroup_select_victim_node(struct mem_cgroup *memcg)
mem_cgroup_may_update_nodemask(memcg);
node = memcg->last_scanned_node;
node = next_node(node, memcg->scan_nodes);
if (node == MAX_NUMNODES)
node = first_node(memcg->scan_nodes);
node = next_node_in(node, memcg->scan_nodes);
/*
* We call this when we hit limit, not when pages are added to LRU.
* No LRU may hold pages because all pages are UNEVICTABLE or
* memcg is too small and all pages are not on LRU. In that case,
* we use curret node.
* mem_cgroup_may_update_nodemask might have seen no reclaimmable pages
* last time it really checked all the LRUs due to rate limiting.
* Fallback to the current node in that case for simplicity.
*/
if (unlikely(node == MAX_NUMNODES))
node = numa_node_id();


@ -78,9 +78,24 @@ static struct {
#define memhp_lock_acquire() lock_map_acquire(&mem_hotplug.dep_map)
#define memhp_lock_release() lock_map_release(&mem_hotplug.dep_map)
#ifndef CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE
bool memhp_auto_online;
#else
bool memhp_auto_online = true;
#endif
EXPORT_SYMBOL_GPL(memhp_auto_online);
static int __init setup_memhp_default_state(char *str)
{
if (!strcmp(str, "online"))
memhp_auto_online = true;
else if (!strcmp(str, "offline"))
memhp_auto_online = false;
return 1;
}
__setup("memhp_default_state=", setup_memhp_default_state);
void get_online_mems(void)
{
might_sleep();
@ -1410,7 +1425,7 @@ static struct page *next_active_pageblock(struct page *page)
}
/* Checks if this range of memory is likely to be hot-removable. */
int is_mem_section_removable(unsigned long start_pfn, unsigned long nr_pages)
bool is_mem_section_removable(unsigned long start_pfn, unsigned long nr_pages)
{
struct page *page = pfn_to_page(start_pfn);
struct page *end_page = page + nr_pages;
@ -1418,12 +1433,12 @@ int is_mem_section_removable(unsigned long start_pfn, unsigned long nr_pages)
/* Check the starting page of each pageblock within the range */
for (; page < end_page; page = next_active_pageblock(page)) {
if (!is_pageblock_removable_nolock(page))
return 0;
return false;
cond_resched();
}
/* All pageblocks in the memory block are likely to be hot-removable */
return 1;
return true;
}
/*


@ -97,7 +97,6 @@
#include <asm/tlbflush.h>
#include <asm/uaccess.h>
#include <linux/random.h>
#include "internal.h"
@ -347,9 +346,7 @@ static void mpol_rebind_nodemask(struct mempolicy *pol, const nodemask_t *nodes,
BUG();
if (!node_isset(current->il_next, tmp)) {
current->il_next = next_node(current->il_next, tmp);
if (current->il_next >= MAX_NUMNODES)
current->il_next = first_node(tmp);
current->il_next = next_node_in(current->il_next, tmp);
if (current->il_next >= MAX_NUMNODES)
current->il_next = numa_node_id();
}
@ -1709,9 +1706,7 @@ static unsigned interleave_nodes(struct mempolicy *policy)
struct task_struct *me = current;
nid = me->il_next;
next = next_node(nid, policy->v.nodes);
if (next >= MAX_NUMNODES)
next = first_node(policy->v.nodes);
next = next_node_in(nid, policy->v.nodes);
if (next < MAX_NUMNODES)
me->il_next = next;
return nid;
@ -1744,18 +1739,18 @@ unsigned int mempolicy_slab_node(void)
return interleave_nodes(policy);
case MPOL_BIND: {
struct zoneref *z;
/*
* Follow bind policy behavior and start allocation at the
* first node.
*/
struct zonelist *zonelist;
struct zone *zone;
enum zone_type highest_zoneidx = gfp_zone(GFP_KERNEL);
zonelist = &NODE_DATA(node)->node_zonelists[0];
(void)first_zones_zonelist(zonelist, highest_zoneidx,
&policy->v.nodes,
&zone);
return zone ? zone->node : node;
z = first_zones_zonelist(zonelist, highest_zoneidx,
&policy->v.nodes);
return z->zone ? z->zone->node : node;
}
default:
@ -1763,23 +1758,25 @@ unsigned int mempolicy_slab_node(void)
}
}
/* Do static interleaving for a VMA with known offset. */
/*
* Do static interleaving for a VMA with known offset @n. Returns the n'th
* node in pol->v.nodes (starting from n=0), wrapping around if n exceeds the
* number of present nodes.
*/
static unsigned offset_il_node(struct mempolicy *pol,
struct vm_area_struct *vma, unsigned long off)
struct vm_area_struct *vma, unsigned long n)
{
unsigned nnodes = nodes_weight(pol->v.nodes);
unsigned target;
int c;
int nid = NUMA_NO_NODE;
int i;
int nid;
if (!nnodes)
return numa_node_id();
target = (unsigned int)off % nnodes;
c = 0;
do {
target = (unsigned int)n % nnodes;
nid = first_node(pol->v.nodes);
for (i = 0; i < target; i++)
nid = next_node(nid, pol->v.nodes);
c++;
} while (c <= target);
return nid;
}
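The rewritten `offset_il_node()` above picks the n'th node of the policy's nodemask, wrapping when n exceeds the number of nodes. A self-contained sketch of the same selection, assuming a 64-bit mask in place of `nodemask_t` and a caller-supplied fallback in place of `numa_node_id()`; all `toy_` names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAX_NODES 64

/* Number of set bits in the mask. */
static unsigned int toy_nodes_weight(uint64_t nodes)
{
	unsigned int w = 0;

	for (; nodes; nodes &= nodes - 1)
		w++;
	return w;
}

/* Next set bit strictly after nid, or TOY_MAX_NODES if there is none. */
static int toy_next_node(int nid, uint64_t nodes)
{
	assert(nid >= -1 && nid < TOY_MAX_NODES);
	for (int n = nid + 1; n < TOY_MAX_NODES; n++)
		if (nodes & (UINT64_C(1) << n))
			return n;
	return TOY_MAX_NODES;
}

/* n'th set node (counting from 0), wrapping modulo the node count. */
static int toy_offset_il_node(uint64_t nodes, unsigned long n, int fallback)
{
	unsigned int nnodes = toy_nodes_weight(nodes);
	unsigned int target;
	int nid;

	if (!nnodes)
		return fallback;

	target = (unsigned int)n % nnodes;
	nid = toy_next_node(-1, nodes); /* first set node */
	for (unsigned int i = 0; i < target; i++)
		nid = toy_next_node(nid, nodes);
	return nid;
}
```

Because `target` is always smaller than the number of set bits, the loop never walks past the last set node, so no wrap-around check is needed inside it.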
@ -1805,21 +1802,6 @@ static inline unsigned interleave_nid(struct mempolicy *pol,
return interleave_nodes(pol);
}
/*
* Return the bit number of a random bit set in the nodemask.
* (returns NUMA_NO_NODE if nodemask is empty)
*/
int node_random(const nodemask_t *maskp)
{
int w, bit = NUMA_NO_NODE;
w = nodes_weight(*maskp);
if (w)
bit = bitmap_ord_to_pos(maskp->bits,
get_random_int() % w, MAX_NUMNODES);
return bit;
}
#ifdef CONFIG_HUGETLBFS
/*
* huge_zonelist(@vma, @addr, @gfp_flags, @mpol)
@ -2284,7 +2266,7 @@ static void sp_free(struct sp_node *n)
int mpol_misplaced(struct page *page, struct vm_area_struct *vma, unsigned long addr)
{
struct mempolicy *pol;
struct zone *zone;
struct zoneref *z;
int curnid = page_to_nid(page);
unsigned long pgoff;
int thiscpu = raw_smp_processor_id();
@ -2316,6 +2298,7 @@ int mpol_misplaced(struct page *page, struct vm_area_struct *vma, unsigned long
break;
case MPOL_BIND:
/*
* allows binding to multiple nodes.
* use current page if in policy nodemask,
@ -2324,11 +2307,11 @@ int mpol_misplaced(struct page *page, struct vm_area_struct *vma, unsigned long
*/
if (node_isset(curnid, pol->v.nodes))
goto out;
(void)first_zones_zonelist(
z = first_zones_zonelist(
node_zonelist(numa_node_id(), GFP_HIGHUSER),
gfp_zone(GFP_HIGHUSER),
&pol->v.nodes, &zone);
polnid = zone->node;
&pol->v.nodes);
polnid = z->zone->node;
break;
default:

Some files were not shown because too many files have changed in this diff