linux/Documentation/filesystems
Vlastimil Babka bf9683d699 mm, documentation: clarify /proc/pid/status VmSwap limitations for shmem
This series is based on Jerome Marchand's [1] so let me quote the first
paragraph from there:

There are several shortcomings with the accounting of shared memory
(sysV shm, shared anonymous mapping, mapping to a tmpfs file).  The
values in /proc/<pid>/status and statm don't make it possible to distinguish
between shmem memory and a shared mapping to a regular file, even though
their implications on memory usage are quite different: at reclaim, file
mapping can be dropped or written back on disk while shmem needs a place
in swap.  As for shmem pages that are swapped-out or in swap cache, they
aren't accounted at all.

My original motivation is that a customer found it confusing (IMHO
rightfully) that e.g. top output for process swap usage is unreliable
with respect to swapped-out shmem pages, which are not accounted for.

The fundamental difference between private anonymous and shmem pages is
that the latter have their PTEs converted to pte_none, and not swapents.  As
such, they are not accounted to the number of swapents visible e.g.  in
/proc/pid/status VmSwap row.  It might be theoretically possible to use
swapents when swapping out shmem (without extra cost, as one has to
change all mappers anyway), and on swap in only convert the swapent for
the faulting process, leaving swapents in other processes until they
also fault (so again no extra cost).  But I don't know how many
assumptions this would break, and it would be too disruptive a change for
a relatively small benefit.

Instead, my approach is to document the limitation of VmSwap, and
provide a means to determine the swap usage for shmem areas, for those
who are interested and willing to pay the price, using /proc/pid/smaps,
because outside of ipcs I don't think it is currently possible to
determine the usage at all.  The previous patchset [1] did introduce new
shmem-specific fields into smaps output, and functions to determine the
values.  I take a simpler approach, noting that smaps output already has
a "Swap: X kB" line, where currently X == 0 always for shmem areas.  I
think we can just consider this a bug and provide the proper value by
consulting the radix tree, as e.g.  mincore_page() does.  In the patch
changelog I explain why this is also not perfect (and cannot be without
swapents), but still arguably much better than showing a 0.
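
As an illustration of the smaps-based approach described above, here is a
minimal sketch that sums the "Swap: X kB" lines of a smaps dump.  The helper
names and the sample text are made up for this example and are not part of
the series; whether the shmem mappings report a non-zero value depends on
running a kernel that includes this change.

```python
def smaps_swap_kb(smaps_text):
    """Sum the 'Swap:' lines (in kB) found in /proc/<pid>/smaps text."""
    total = 0
    for line in smaps_text.splitlines():
        # Match 'Swap:' exactly; 'SwapPss:' is a separate field.
        if line.startswith("Swap:"):
            total += int(line.split()[1])
    return total


def read_smaps_swap_kb(pid="self"):
    """Hypothetical convenience wrapper that reads the live file."""
    with open(f"/proc/{pid}/smaps") as f:
        return smaps_swap_kb(f.read())


# Made-up excerpt with two mappings, for illustration only.
SAMPLE_SMAPS = """\
7f3c00000000-7f3c00400000 rw-s 00000000 00:05 12345 /dev/shm/example
Rss:                1024 kB
Swap:                512 kB
SwapPss:             512 kB
7f3c00400000-7f3c00500000 rw-p 00000000 00:00 0
Rss:                 256 kB
Swap:                128 kB
"""

print(smaps_swap_kb(SAMPLE_SMAPS))
```

Walking smaps is the "price" mentioned above: the kernel has to visit every
mapping of the process, so this is considerably more expensive than reading
a single counter from /proc/pid/status.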

The last two patches are adapted from Jerome's patchset and provide a
VmRSS breakdown to RssAnon, RssFile and RssShm in /proc/pid/status.
Hugh noted that this is a welcome addition, and I agree that it might
help e.g. with debugging process memory usage, at the albeit non-zero
but still rather low cost of an extra per-mm counter and some page flag
checks.

[1] http://lwn.net/Articles/611966/

This patch (of 6):

The documentation for /proc/pid/status does not mention that the value
of VmSwap counts only swapped out anonymous private pages, and not
swapped out pages of the underlying shmem objects (for shmem mappings).
This is not obvious, so document this limitation.
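
For reference, a small sketch of reading VmSwap from the status text.  The
function name and the sample values are assumptions made for this example;
as explained above, the returned number covers only swapped-out anonymous
private pages and does not include swapped-out shmem pages.

```python
def status_kb(status_text, field):
    """Return the kB value of `field` in /proc/<pid>/status text, or None."""
    prefix = field + ":"
    for line in status_text.splitlines():
        if line.startswith(prefix):
            # Lines look like 'VmSwap:\t     256 kB'.
            return int(line.split()[1])
    return None


# Made-up excerpt, for illustration only.
SAMPLE_STATUS = "Name:\tcat\nVmRSS:\t    2048 kB\nVmSwap:\t     256 kB\n"

print(status_kb(SAMPLE_STATUS, "VmSwap"))
```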

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Acked-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-14 16:00:49 -08:00
..
caching FS-Cache: Count the number of initialised operations 2015-04-02 14:28:53 +01:00
cifs update CIFS TODO list 2014-08-02 01:26:07 -05:00
configfs configfs: implement binary attributes 2016-01-04 12:31:46 +01:00
nfs ipconfig: send Client-identifier in DHCP requests 2015-10-18 19:23:52 -07:00
pohmelfs
.gitignore Documentation: update .gitignore files 2014-09-26 11:02:59 +02:00
9p.txt 9p: update documentation 2014-01-24 10:55:21 -06:00
00-INDEX dax: replace XIP documentation with DAX documentation 2015-02-16 17:56:03 -08:00
adfs.txt
affs.txt affs: add mount option to avoid filename truncates 2014-04-07 16:36:08 -07:00
afs.txt
autofs4-mount-control.txt doc: fix double words 2014-03-21 13:16:58 +01:00
autofs4.txt autofs: the documentation I wanted to read 2014-10-14 02:18:17 +02:00
automount-support.txt Documentation: remove outdated information from automount-support.txt 2015-05-15 01:10:38 -04:00
befs.txt
bfs.txt
btrfs.txt Documentation: filesystems: btrfs: Fixed typos and whitespace 2015-07-09 14:31:06 -06:00
ceph.txt
coda.txt
cramfs.txt
dax.txt dax: add huge page fault support 2015-09-08 15:35:28 -07:00
debugfs.txt debugfs: Pass bool pointer to debugfs_create_bool() 2015-10-04 11:36:07 +01:00
devpts.txt
directory-locking vfs: take i_mutex on renamed file 2013-11-09 00:16:40 -05:00
dlmfs.txt ocfs2: update web page + git tree in documentation 2015-02-28 09:57:50 -08:00
dnotify_test.c
dnotify.txt
ecryptfs.txt
efivarfs.txt
exofs.txt
ext2.txt fs: Remove ext3 filesystem driver 2015-07-23 20:59:40 +02:00
ext3.txt fs: Remove ext3 filesystem driver 2015-07-23 20:59:40 +02:00
ext4.txt ext4: add DAX functionality 2015-02-16 17:56:04 -08:00
f2fs.txt f2fs: introduce new option for controlling data flush 2015-12-16 09:25:48 -08:00
fiemap.txt fsioctl.c: make generic_block_fiemap() signal-tolerant 2015-02-10 14:30:30 -08:00
files.txt
fuse.txt
gfs2-glocks.txt gfs2: Remove gl_spin define 2015-10-29 12:57:48 -05:00
gfs2-uevents.txt
gfs2.txt
hfs.txt
hfsplus.txt Documentation: update URL to hfsplus Technote 1150 2014-02-20 14:48:51 +01:00
hpfs.txt
inotify.txt inotify: update documentation to reflect code changes 2015-02-10 14:30:28 -08:00
isofs.txt
jfs.txt doc: fix misspellings with 'codespell' tool 2013-05-28 12:02:12 +02:00
Locking switch ->get_link() to delayed_call, kill ->put_link() 2015-12-30 13:01:03 -05:00
locks.txt
logfs.txt
Makefile configfs: remove old API 2015-10-13 22:17:57 -07:00
mandatory-locking.txt
ncpfs.txt
nilfs2.txt nilfs2: update project's web site in nilfs2.txt 2014-04-03 16:21:26 -07:00
ntfs.txt NTFS: Remove changelog from Documentation/filesystems/ntfs.txt. 2014-10-16 12:43:57 +01:00
ocfs2.txt ocfs2: update web page + git tree in documentation 2015-02-28 09:57:50 -08:00
omfs.txt
overlayfs.txt Remove email address from Documentation/filesystems/overlayfs.txt 2015-11-11 10:04:53 -07:00
path-lookup.md Documentation: add new description of path-name lookup. 2015-11-02 18:18:25 -07:00
path-lookup.txt Documentation: add new description of path-name lookup. 2015-11-02 18:18:25 -07:00
porting switch ->get_link() to delayed_call, kill ->put_link() 2015-12-30 13:01:03 -05:00
proc.txt mm, documentation: clarify /proc/pid/status VmSwap limitations for shmem 2016-01-14 16:00:49 -08:00
qnx6.txt doc: filesystems : Fix typo in Documentations/filesystems 2013-08-20 13:45:56 +02:00
quota.txt quota: Update documentation 2015-05-18 11:23:07 +02:00
ramfs-rootfs-initramfs.txt initmpfs: use initramfs if rootfstype= or root= specified 2013-09-11 15:59:38 -07:00
relay.txt doc: Fix typo in doucmentations 2013-07-25 12:34:15 +02:00
romfs.txt
seq_file.txt Documentation: update seq_file 2014-12-29 15:40:18 -07:00
sharedsubtree.txt doc: spelling error changes 2014-05-05 15:32:05 +02:00
spufs.txt
squashfs.txt Squashfs: Add LZ4 compression configuration option 2014-11-27 18:48:44 +00:00
sysfs-pci.txt
sysfs-tagging.txt sysfs-tagging.txt: fix pre-kernfs references 2015-09-13 14:38:51 -06:00
sysfs.txt sysfs.txt: mention that store method buffers are null-terminated 2015-09-13 14:38:51 -06:00
sysv-fs.txt
tmpfs.txt
ubifs.txt
udf.txt
ufs.txt
vfat.txt fs/fat/: add support for DOS 1.x formatted volumes 2014-06-06 16:08:10 -07:00
vfs.txt switch ->get_link() to delayed_call, kill ->put_link() 2015-12-30 13:01:03 -05:00
xfs-delayed-logging-design.txt
xfs-self-describing-metadata.txt xfs: add metadata CRC documentation 2013-04-27 13:27:43 -05:00
xfs.txt xfs: fix kernel version in docs 2015-06-01 07:15:38 +10:00