26672 Commits

Author SHA1 Message Date
Sachin Prabhu
8830d7e07a cifs: use standard token parser for mount options
Use the standard token parser instead of the long if condition to parse
cifs mount options.

This was first proposed by Scott Lovenberg
http://lists.samba.org/archive/linux-cifs-client/2010-May/006079.html

Mount options have been grouped together in terms of their input types.
Aliases for username, password, domain and credentials have been added.
The password parser has been modified to make it easier to read.

Since the patch was first proposed, the following bugs have been fixed
1) Allow blank 'pass' option to be passed by the cifs mount helper when
using sec=none.
2) Do not explicitly set vol->nullauth to 0. This causes a problem
when using sec=none while also using a username.

Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
2012-03-23 14:40:56 -04:00
Jeff Layton
27ac5755ae cifs: remove /proc/fs/cifs/OplockEnabled
We've had a deprecation warning on this file for 2 releases now. Remove
it as promised for 3.4.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
2012-03-23 14:40:56 -04:00
Jeff Layton
da82f7e755 cifs: convert cifs_iovec_write to use async writes
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Pavel Shilovsky <piastry@etersoft.ru>
2012-03-23 14:40:56 -04:00
Jeff Layton
597b027f69 cifs: call cifs_update_eof with i_lock held
cifs_update_eof has the potential to be racy if multiple threads are
trying to modify it at the same time. Protect modifications of the
server_eof value with the inode->i_lock.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
2012-03-23 14:40:56 -04:00
Jeff Layton
e9492871fb cifs: abstract out function to marshal up the iovec array for async writes
We'll need to do something a bit different depending on the caller.
Abstract the code that marshals the page array into an iovec.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Pavel Shilovsky <piastry@etersoft.ru>
2012-03-23 14:40:56 -04:00
Jeff Layton
a7103b99e4 cifs: fix up get_numpages
Use DIV_ROUND_UP. Also, PAGE_SIZE is more appropriate here since these
aren't pagecache pages.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Pavel Shilovsky <piastry@etersoft.ru>
2012-03-23 14:40:56 -04:00
Jeff Layton
35ebb4155f cifs: make cifsFileInfo_get return the cifsFileInfo pointer
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Pavel Shilovsky <piastry@etersoft.ru>
2012-03-23 14:40:56 -04:00
Jeff Layton
e94f7ba124 cifs: fix allocation in cifs_write_allocate_pages
The gfp flags are currently set to __GPF_HIGHMEM, which doesn't allow
for any reclaim. Make this more resilient by or'ing that with
GFP_KERNEL. Also, get rid of the goto and unify the exit codepath.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Pavel Shilovsky <piastry@etersoft.ru>
2012-03-23 14:40:56 -04:00
Jeff Layton
c2e8764009 cifs: allow caller to specify completion op when allocating writedata
We'll need a different set of write completion ops when not writing out
of the pagecache.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Pavel Shilovsky <piastry@etersoft.ru>
2012-03-23 14:40:55 -04:00
Jeff Layton
fe5f5d2e90 cifs: add pid field to cifs_writedata
We'll need this to handle rwpidforward option correctly when we use
async writes in the aio_write op.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Pavel Shilovsky <piastry@etersoft.ru>
2012-03-23 14:40:55 -04:00
Jeff Layton
da472fc847 cifs: add new cifsiod_wq workqueue
...and convert existing cifs users of system_nrt_wq to use that instead.

Also, make it freezable, and set WQ_MEM_RECLAIM since we use it to
deal with write reply handling.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
2012-03-23 14:40:53 -04:00
Pavel Shilovsky
7c9421e1a9 CIFS: Change mid_q_entry structure fields
to be protocol-unspecific and big enough to keep both CIFS
and SMB2 values.

Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru>
2012-03-23 14:28:03 -04:00
Pavel Shilovsky
243d04b6e6 CIFS: Expand CurrentMid field
While in CIFS/SMB we have 16 bit mid, in SMB2 it is 64 bit.
Convert the existing field to 64 bit and mask off higher bits
for CIFS/SMB.

Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru>
2012-03-23 14:28:03 -04:00
Pavel Shilovsky
5ffef7bf1d CIFS: Separate protocol-specific code from cifs_readv_receive code
Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru>
2012-03-23 14:28:03 -04:00
Pavel Shilovsky
d4e4854fd1 CIFS: Separate protocol-specific code from demultiplex code
Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru>
2012-03-23 14:28:02 -04:00
Pavel Shilovsky
792af7b05b CIFS: Separate protocol-specific code from transport routines
that lets us use this functions for SMB2.

Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru>
2012-03-23 14:28:02 -04:00
Linus Torvalds
e57f146b28 - Improve error messages
- Clean-up i_nlink management
 - Minor clean-ups
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJPbHqpAAoJECmIfjd9wqK0PGYP/i06JD3HM0uxU59TB0NM4q/7
 zhSe8epwlFFXIWkpq8ljjgqAqwFkQicvRwK4TG0IosBa3/q7LxpONpkQKOvvtaZT
 K4VROqHtPXBwtiAkjqso+ukjxqtJEaro8hQiljt06LR/kPPFjXFROqENMWMQPsP2
 bmeU5LfuYaFmVOLwI1jbCQTZ21Qrc1Wxvv3A3qSMIHsvLATSGjoN2vdRLhnefYh5
 CREBi5jr1mU/1vh+j4IjQ3GZWW+dg88G8munPOANA3ZOm+OOrbV/aq4txpasXkw7
 3lL//w070NV/GfbdXMqTj1ASlT3zOozFu4iaXgHdpMmNwmNk/IF0LSFye3iC+VdL
 pz0mlD6PN0mZhw+6qVEQgVs5MfQ7cZrvYZkY7tWEMlzuzBImR3v0TJ5Dg6CfaRbv
 wRHOXE70DefKigWT762TRbpaZBzSDImFTc9YdCYLtb9eEUnmpEP+Ci5/jK5EUjin
 CmPvxzCH60OCWb8CZIfdVCnHyfzY4YugTwp+PjjTWhc9iWwrMkIthlvGBAIfwjab
 6wyQrlFrKFhxLyPHGDMNYtPIazKXDsxPlvHAEXX24uXsXQNWyUBp4LNFUgfIzBsM
 wqTapNnl1Qmd9ZJYYhR7byhKhbU8Nb/3DUFaJEemviMeCncN+v0p0Qk5/lNjAiGz
 d1f5Yfu4EjQBT5piBR4C
 =Sq1l
 -----END PGP SIGNATURE-----

Merge tag 'upstream-3.4-rc1' of git://git.infradead.org/linux-ubifs

Pull UBIFS changes from Artem Bityutskiy:
 - Improve error messages
 - Clean-up i_nlink management
 - Minor clean-ups

* tag 'upstream-3.4-rc1' of git://git.infradead.org/linux-ubifs:
  UBIFS: improve error messages
  UBIFS: kill CUR_MAX_KEY_LEN macro
  UBIFS: do not use inc_link when i_nlink is zero
  UBIFS: make the dbg_lock spinlock static
  UBIFS: increase dumps loglevel
  UBIFS: amend recovery debugging message
2012-03-23 09:27:40 -07:00
Linus Torvalds
6e55f8ed81 One pstore patch
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.10 (GNU/Linux)
 
 iQIcBAABAgAGBQJPa7HFAAoJEKurIx+X31iBtm8P/1wtBftqpdXiNXbuphWMxZAb
 FmzVSrlCRpbPB4HQjY64bs8BCR3zCeeGPf19Mx3mDrrY8o5ENlWEz/8eZNsXeiyS
 WMNIPrUevxRfsnPO61TizZ0JvPiCouQarRovYh0FW32PC8/9I2yYIM0Mj+L1P+b9
 FZgjk5Jeg+kEoDZZM6HepX500/iZjsPBuyysJ91JPVNzQJNHPJUYvHwPx8xYWVva
 PwdrugkLPFEJSyyc1CNLt3kg9wtDNtWIKUvrjmRl6aRFeLmJBUVBhkxtZD2rrLIm
 oOvGY+UdlTCg5vHHAYtc4oAVnZiXUxFaotH0Msa8ots6ORvKAXKBBmXTwS2bce8Q
 DUtqs/CU6X3LykRDyuCRvjuwuH5e4WUM3SMCSWuxk8ZOCiYWHwqMaaWYIBUDAWIm
 AXFbP29M95hdWd/jvZUxwJCXRc2Vr/MEXNkQzPWIw8eAewP+2jbqLLICjXqVK0Vp
 vkHmhVTN07ajbz/dJsJ3K2JObVal+1YEL96DdW+nstdmgNPiDhyLibVS/WH8qOq/
 k9s175e3kLOaLU1LCBdKy4TbG3fGg7f1FUVePelbmTSF9gRSz0Mz3jMyrvoFS7jQ
 //ra4HlrbUHHwwsK8R5axZGa2ejezLIc7/wdgRlEy2ZffqytNMPJ4RwCk0Ft1YJ2
 pbD9OUSUqwhRrtbxJWJd
 =MzvB
 -----END PGP SIGNATURE-----

Merge tag 'pstore-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux

Pull one pstore patch from Tony Luck

* tag 'pstore-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
  pstore: Introduce get_reason_str() to pstore
2012-03-23 09:24:07 -07:00
Linus Torvalds
49d99a2f9c Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs
Pull XFS updates from Ben Myers:
 "Scalability improvements for dquots, log grant code cleanups, plus
  bugfixes and cleanups large and small"

Fix up various trivial conflicts that were due to some of the earlier
patches already having been integrated into v3.3 as bugfixes, and then
there were development patches on top of those.  Easily merged by just
taking the newer version from the pulled branch.

* 'for-linus' of git://oss.sgi.com/xfs/xfs: (45 commits)
  xfs: fallback to vmalloc for large buffers in xfs_getbmap
  xfs: fallback to vmalloc for large buffers in xfs_attrmulti_attr_get
  xfs: remove remaining scraps of struct xfs_iomap
  xfs: fix inode lookup race
  xfs: clean up minor sparse warnings
  xfs: remove the global xfs_Gqm structure
  xfs: remove the per-filesystem list of dquots
  xfs: use per-filesystem radix trees for dquot lookup
  xfs: per-filesystem dquot LRU lists
  xfs: use common code for quota statistics
  xfs: reimplement fdatasync support
  xfs: split in-core and on-disk inode log item fields
  xfs: make xfs_inode_item_size idempotent
  xfs: log timestamp updates
  xfs: log file size updates at I/O completion time
  xfs: log file size updates as part of unwritten extent conversion
  xfs: do not require an ioend for new EOF calculation
  xfs: use per-filesystem I/O completion workqueues
  quota: make Q_XQUOTASYNC a noop
  xfs: include reservations in quota reporting
  ...
2012-03-23 09:19:22 -07:00
Linus Torvalds
1c3ddfe5ab Merge git://git.samba.org/sfrench/cifs-2.6
Pull CIFS fixes from Steve French

* git://git.samba.org/sfrench/cifs-2.6:
  cifs: clean up ordering in exit_cifs
  cifs: clean up call to cifs_dfs_release_automount_timer()
  CIFS: Delete echo_retries module parm
  CIFS: Prepare credits code for a slot reservation
  CIFS: Make wait_for_free_request killable
  CIFS: Introduce credit-based flow control
  CIFS: Simplify inFlight logic
  cifs: fix issue mounting of DFS ROOT when redirecting from one domain controller to the next
  CIFS: Respect negotiated MaxMpxCount
  CIFS: Fix a spurious error in cifs_push_posix_locks
2012-03-23 09:07:15 -07:00
Linus Torvalds
f63d395d47 NFS client updates for Linux 3.4
New features include:
 - Add NFS client support for containers.
   This should enable most of the necessary functionality, including
   lockd support, and support for rpc.statd, NFSv4 idmapper and
   RPCSEC_GSS upcalls into the correct network namespace from
   which the mount system call was issued.
 - NFSv4 idmapper scalability improvements
   Base the idmapper cache on the keyring interface to allow concurrent
   access to idmapper entries. Start the process of migrating users from
   the single-threaded daemon-based approach to the multi-threaded
   request-key based approach.
 - NFSv4.1 implementation id.
   Allows the NFSv4.1 client and server to mutually identify each other
   for logging and debugging purposes.
 - Support the 'vers=4.1' mount option for mounting NFSv4.1 instead of
   having to use the more counterintuitive 'vers=4,minorversion=1'.
 - SUNRPC tracepoints.
   Start the process of adding tracepoints in order to improve debugging
   of the RPC layer.
 - pNFS object layout support for autologin.
 
 Important bugfixes include:
 - Fix a bug in rpc_wake_up/rpc_wake_up_status that caused them to fail
   to wake up all tasks when applied to priority waitqueues.
 - Ensure that we handle read delegations correctly, when we try to
   truncate a file.
 - A number of fixes for NFSv4 state manager loops (mostly to do with
   delegation recovery).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJPalZbAAoJEGcL54qWCgDyCi4P+QHcmzQhJO7HWx3Pzjs67bFT
 xMSYaKHGWS4AJKUBVl5OKBxUExfrMHBNbElV3IKUIwBlDx8RVtnwfptKSe146iki
 dn4TrRO5es8nmI4hRDcGMlzJDZq4y0Qg//qiUFmojiNW/Avw0ljfMoVUejJJ09FV
 oeDk4EGtcxkEyH+g48ZjYbyspRnG8qtD3atf70Z3lYE0ELdG/B5Dyzw1RDrA5p73
 xJX3lqy8p/4ROzw/dmNoxdAXOrr3Q4/T58Bvp/lUglPy/EHyPmWzFoH0MU0C/PFu
 5VnAl6QDbNCTcIw9FvJlX/mIyErpNG9eKzUskUc9L9SA+B+J/i4rIap4KATRN3nH
 7QhE5qUacPuJnvxml7MPmlQTuft3fkAQ7NhKIWrbRi1QS9FmJC5NxctIb8loqlFn
 yIXdKeLfMshB+NyuFS9uzStX7SmV3eMgVd+5ZxRjYxm+PKJLw2KXeudArL6M5mHK
 3QeKZpqwaYQ3RfaTNpvAp0doiXHCO5UbWfI0Pe8xQs/QcMCNReffqV2G4IJKFAu6
 WpoN2UDQC9LCBifLw2nS7kku8+ZVXLQU8OC1NVl3TG15xD9cNLXuk3/y5llPGq4O
 odo52uLFpJohbDaHMj5RTKOfchTQCm2iyuVmxZEeAySypMSiAXmW7COSKHs/HxI1
 VBm+EI00Pvmm5+fUjIlp
 =LuHE
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-3.4-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client updates for Linux 3.4 from Trond Myklebust:
 "New features include:
   - Add NFS client support for containers.

     This should enable most of the necessary functionality, including
     lockd support, and support for rpc.statd, NFSv4 idmapper and
     RPCSEC_GSS upcalls into the correct network namespace from which
     the mount system call was issued.

   - NFSv4 idmapper scalability improvements

     Base the idmapper cache on the keyring interface to allow
     concurrent access to idmapper entries.  Start the process of
     migrating users from the single-threaded daemon-based approach to
     the multi-threaded request-key based approach.

   - NFSv4.1 implementation id.

     Allows the NFSv4.1 client and server to mutually identify each
     other for logging and debugging purposes.

   - Support the 'vers=4.1' mount option for mounting NFSv4.1 instead of
     having to use the more counterintuitive 'vers=4,minorversion=1'.

   - SUNRPC tracepoints.

     Start the process of adding tracepoints in order to improve
     debugging of the RPC layer.

   - pNFS object layout support for autologin.

  Important bugfixes include:

   - Fix a bug in rpc_wake_up/rpc_wake_up_status that caused them to
     fail to wake up all tasks when applied to priority waitqueues.

   - Ensure that we handle read delegations correctly, when we try to
     truncate a file.

   - A number of fixes for NFSv4 state manager loops (mostly to do with
     delegation recovery)."

* tag 'nfs-for-3.4-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (224 commits)
  NFS: fix sb->s_id in nfs debug prints
  xprtrdma: Remove assumption that each segment is <= PAGE_SIZE
  xprtrdma: The transport should not bug-check when a dup reply is received
  pnfs-obj: autologin: Add support for protocol autologin
  NFS: Remove nfs4_setup_sequence from generic rename code
  NFS: Remove nfs4_setup_sequence from generic unlink code
  NFS: Remove nfs4_setup_sequence from generic read code
  NFS: Remove nfs4_setup_sequence from generic write code
  NFS: Fix more NFS debug related build warnings
  SUNRPC/LOCKD: Fix build warnings when CONFIG_SUNRPC_DEBUG is undefined
  nfs: non void functions must return a value
  SUNRPC: Kill compiler warning when RPC_DEBUG is unset
  SUNRPC/NFS: Add Kbuild dependencies for NFS_DEBUG/RPC_DEBUG
  NFS: Use cond_resched_lock() to reduce latencies in the commit scans
  NFSv4: It is not safe to dereference lsp->ls_state in release_lockowner
  NFS: ncommit count is being double decremented
  SUNRPC: We must not use list_for_each_entry_safe() in rpc_wake_up()
  Try using machine credentials for RENEW calls
  NFSv4.1: Fix a few issues in filelayout_commit_pagelist
  NFSv4.1: Clean ups and bugfixes for the pNFS read/writeback/commit code
  ...
2012-03-23 08:53:47 -07:00
Linus Torvalds
aab008db80 Cleanups: rename of flush to invalidate, moving reporting of statistics
into debugfs, and use __read_mostly as neccessary.
 Also add a MAINTAINER file for cleancache API files.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQEcBAABAgAGBQJPZ1sjAAoJEFjIrFwIi8fJCSkIALqBBCkuPSusvdebT59Oq145
 EW2eEwJzVft0vlvS/IPl+W37TH5VWhBVCW8frAYMpLaY5X/W1ZwzFpx76T4acAiM
 wkXwsyYXs6N13OOFoH5gmf3cwproAioRxbEeALucxeMR4rK5Yw+oZXT+hSu2KaLh
 1LtHyx+u+NLb0QOseAuDcmyjY9r6aLBA1HMcVD2z+4UW1n/9NWexQP3ShYP9uMDs
 GnyDzZ8vWXTcoc4Auj0rpaNsT5d47ltGegKASZmmvS3QyMHbZ4sk2HECnQY4wSee
 alPJwDyb6mOJmQ3e4s940onjfZPgd8/cW/DVEUrveH+dz6Eqjxqz41dVRVXiLz8=
 =tUNC
 -----END PGP SIGNATURE-----

Merge tag 'stable/for-linus-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/mm

Pull cleancache changes from Konrad Rzeszutek Wilk:
 "This has some patches for the cleancache API that should have been
  submitted a _long_ time ago.  They are basically cleanups:

   - rename of flush to invalidate

   - moving reporting of statistics into debugfs

   - use __read_mostly as necessary.

  Oh, and also the MAINTAINERS file change.  The files (except the
  MAINTAINERS file) have been in #linux-next for months now.  The late
  addition of MAINTAINERS file is a brain-fart on my side - didn't
  realize I needed that just until I was typing this up - and I based
  that patch on v3.3 - so the tree is on top of v3.3."

* tag 'stable/for-linus-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/mm:
  MAINTAINERS: Adding cleancache API to the list.
  mm: cleancache: Use __read_mostly as appropiate.
  mm: cleancache: report statistics via debugfs instead of sysfs.
  mm: zcache/tmem/cleancache: s/flush/invalidate/
  mm: cleancache: s/flush/invalidate/
2012-03-22 19:52:47 -07:00
Linus Torvalds
f7493e5d9c vfs: tidy up sparse warnings in fs/namei.c
While doing the fs/namei.c cleanups, I ran sparse on it, and it pointed
out other large integers and a couple of cases of us using '0' instead
of the proper 'NULL'.

Sparse still doesn't understand some of the conditional locking going
on, but that's no excuse for not fixing up the trivial stuff.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-22 16:10:40 -07:00
Linus Torvalds
989412bbd2 vfs: tidy up fs/namei.c byte-repeat word constants
In commit commit 1de5b41cd3b2 ("fs/namei.c: fix warnings on 32-bit")
Andrew said that there must be a tidier way of doing this.

This is that tidier way.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-22 15:58:27 -07:00
Randy Dunlap
1f1e6e523e fs: fix kernel-doc warnings in dcache.c
Fix kernel-doc warnings in fs/dcache.c:

  Warning(fs/dcache.c:1743): No description found for parameter 'seqp'
  Warning(fs/dcache.c:1743): Excess function parameter 'seq' description in '__d_lookup_rcu'

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-22 15:49:18 -07:00
Al Viro
f132c5be05 Fix full_name_hash() behaviour when length is a multiple of 8
We want it to match what hash_name() is doing, which means extra
multiply by 9 in this case...

Reported-and-Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-22 15:10:43 -07:00
Lucas De Marchi
4e474a00d7 sysctl: protect poll() in entries that may go away
Protect code accessing ctl_table by grabbing the header with grab_header()
and after releasing with sysctl_head_finish().  This is needed if poll()
is called in entries created by modules: currently only hostname and
domainname support poll(), but this bug may be triggered when/if modules
use it and if user called poll() in a file that doesn't support it.

Dave Jones reported the following when using a syscall fuzzer while
hibernating/resuming:

RIP: 0010:[<ffffffff81233e3e>]  [<ffffffff81233e3e>] proc_sys_poll+0x4e/0x90
RAX: 0000000000000145 RBX: ffff88020cab6940 RCX: 0000000000000000
RDX: ffffffff81233df0 RSI: 6b6b6b6b6b6b6b6b RDI: ffff88020cab6940
[ ... ]
Code: 00 48 89 fb 48 89 f1 48 8b 40 30 4c 8b 60 e8 b8 45 01 00 00 49 83
7c 24 28 00 74 2e 49 8b 74 24 30 48 85 f6 74 24 48 85 c9 75 32 <8b> 16
b8 45 01 00 00 48 63 d2 49 39 d5 74 10 8b 06 48 98 48 89

If an entry goes away while we are polling() it, ctl_table may not exist
anymore.

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2012-03-22 14:46:56 -07:00
Dave Chinner
c999a223c2 xfs: introduce an allocation workqueue
We currently have significant issues with the amount of stack that
allocation in XFS uses, especially in the writeback path. We can
easily consume 4k of stack between mapping the page, manipulating
the bmap btree and allocating blocks from the free list. Not to
mention btree block readahead and other functionality that issues IO
in the allocation path.

As a result, we can no longer fit allocation in the writeback path
in the stack space provided on x86_64. To alleviate this problem,
introduce an allocation workqueue and move all allocations to a
seperate context. This can be easily added as an interposing layer
into xfs_alloc_vextent(), which takes a single argument structure
and does not return until the allocation is complete or has failed.

To do this, add a work structure and a completion to the allocation
args structure. This allows xfs_alloc_vextent to queue the args onto
the workqueue and wait for it to be completed by the worker. This
can be done completely transparently to the caller.

The worker function needs to ensure that it sets and clears the
PF_TRANS flag appropriately as it is being run in an active
transaction context. Work can also be queued in a memory reclaim
context, so a rescuer is needed for the workqueue.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
2012-03-22 16:12:24 -05:00
Dave Chinner
1a1d772433 xfs: Fix open flag handling in open_by_handle code
Sparse identified some unsafe handling of open flags in the xfs open
by handle ioctl code. Update the code to use the correct access
macros to ensure that we handle the open flags correctly.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2012-03-22 15:56:52 -05:00
Kamal Dasu
5575acc780 xfs: fix deadlock in xfs_rtfree_extent
To fix the deadlock caused by repeatedly calling xfs_rtfree_extent

 - removed xfs_ilock() and xfs_trans_ijoin() from xfs_rtfree_extent(),
   instead added asserts that the inode is locked and has an inode_item
   attached to it.
 - in xfs_bunmapi() when dealing with an inode with the rt flag
   call xfs_ilock() and xfs_trans_ijoin() so that the
   reference count is bumped on the inode and attached it to the
   transaction before calling into xfs_bmap_del_extent, similar to
   what we do in xfs_bmap_rtalloc.

Signed-off-by: Kamal Dasu <kdasu.kdev@gmail.com>
Reviewed-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Ben Myers <bpm@sgi.com>
2012-03-22 15:31:06 -05:00
Gerard Snitselaar
1c2ccc66bc fs: xfs: fix section mismatch in linux-next
xfs_qm_exit() is called in init_xfs_fs().

Signed-off-by: Gerard Snitselaar <dev@snitselaar.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
2012-03-22 13:48:55 -05:00
Linus Torvalds
95211279c5 Merge branch 'akpm' (Andrew's patch-bomb)
Merge first batch of patches from Andrew Morton:
 "A few misc things and all the MM queue"

* emailed from Andrew Morton <akpm@linux-foundation.org>: (92 commits)
  memcg: avoid THP split in task migration
  thp: add HPAGE_PMD_* definitions for !CONFIG_TRANSPARENT_HUGEPAGE
  memcg: clean up existing move charge code
  mm/memcontrol.c: remove unnecessary 'break' in mem_cgroup_read()
  mm/memcontrol.c: remove redundant BUG_ON() in mem_cgroup_usage_unregister_event()
  mm/memcontrol.c: s/stealed/stolen/
  memcg: fix performance of mem_cgroup_begin_update_page_stat()
  memcg: remove PCG_FILE_MAPPED
  memcg: use new logic for page stat accounting
  memcg: remove PCG_MOVE_LOCK flag from page_cgroup
  memcg: simplify move_account() check
  memcg: remove EXPORT_SYMBOL(mem_cgroup_update_page_stat)
  memcg: kill dead prev_priority stubs
  memcg: remove PCG_CACHE page_cgroup flag
  memcg: let css_get_next() rely upon rcu_read_lock()
  cgroup: revert ss_id_lock to spinlock
  idr: make idr_get_next() good for rcu_read_lock()
  memcg: remove unnecessary thp check in page stat accounting
  memcg: remove redundant returns
  memcg: enum lru_list lru
  ...
2012-03-22 09:04:48 -07:00
Alex Elder
3489b42a72 ceph: fix three bugs, two in ceph_vxattrcb_file_layout()
In ceph_vxattrcb_file_layout(), there is a check to determine
whether a preferred PG should be formatted into the output buffer.
That check assumes that a preferred PG number of 0 indicates "no
preference," but that is wrong.  No preference is indicated by a
negative (specifically, -1) PG number.

In addition, if that condition yields true, the preferred value
is formatted into a sized buffer, but the size consumed by the
earlier snprintf() call is not accounted for, opening up the
possibilty of a buffer overrun.

Finally, in ceph_vxattrcb_dir_rctime() where the nanoseconds part of
the time displayed did not include leading 0's, which led to
erroneous (sub-second portion of) time values being shown.

This fixes these three issues:
    http://tracker.newdream.net/issues/2155
    http://tracker.newdream.net/issues/2156
    http://tracker.newdream.net/issues/2157

Signed-off-by: Alex Elder <elder@dreamhost.com>
Reviewed-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:52 -05:00
Alex Elder
cffaba15cd ceph: ensure Boolean options support both senses
Many ceph-related Boolean options offer the ability to both enable
and disable a feature.  For all those that don't offer this, add
a new option so that they do.

Note that ceph_show_options()--which reports mount options currently
in effect--only reports the option if it is different from the
default value.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:51 -05:00
Alex Elder
ee57741c52 rbd: make ceph_parse_options() return a pointer
ceph_parse_options() takes the address of a pointer as an argument
and uses it to return the address of an allocated structure if
successful.  With this interface is not evident at call sites that
the pointer is always initialized.  Change the interface to return
the address instead (or a pointer-coded error code) to make the
validity of the returned pointer obvious.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:47 -05:00
Alex Elder
18fa8b3fea ceph: make ceph_setxattr() and ceph_removexattr() more alike
This patch just rearranges a few bits of code to make more
portions of ceph_setxattr() and ceph_removexattr() identical.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:46 -05:00
Alex Elder
3ce6cd1233 ceph: avoid repeatedly computing the size of constant vxattr names
All names defined in the directory and file virtual extended
attribute tables are constant, and the size of each is known at
compile time.  So there's no need to compute their length every
time any file's attribute is listed.

Record the length of each string and use it when needed to determine
the space need to represent them.  In addition, compute the
aggregate size of strings in each table just once at initialization
time.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:46 -05:00
Alex Elder
aa4066ed7b ceph: encode type in vxattr callback routines
The names of the callback functions used for virtual extended
attributes are based only on the last component of the attribute
name.  Because of the way these are defined, this precludes allowing
a single (lowest) attribute name for different callbacks, dependent
on the type of file being operated on.  (For example, it might be
nice to support both "ceph.dir.layout" and "ceph.file.layout".)

Just change the callback names to avoid this problem.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:46 -05:00
Alex Elder
881a5fa200 ceph: drop "_cb" from name of struct ceph_vxattr_cb
A struct ceph_vxattr_cb does not represent a callback at all, but
rather a virtual extended attribute itself.  Drop the "_cb" suffix
from its name to reflect that.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:46 -05:00
Alex Elder
eb78808446 ceph: use macros to normalize vxattr table definitions
Entries in the ceph virtual extended attribute tables all follow a
distinct pattern in their definition.  Enforce this pattern through
the use of a macro.

Also, a null name field signals the end of the table, so make that
be the first field in the ceph_vxattr_cb structure.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:46 -05:00
Alex Elder
2289190719 ceph: use a symbolic name for "ceph." extended attribute namespace
Use symbolic constants to define the top-level prefix for "ceph."
extended attribute names.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:46 -05:00
Alex Elder
06476a69d8 ceph: pass inode rather than table to ceph_match_vxattr()
All callers of ceph_match_vxattr() determine what to pass as the
first argument by calling ceph_inode_vxattrs(inode).  Just do that
inside ceph_match_vxattr() itself, changing it to take an inode
rather than the vxattr pointer as its first argument.

Also ensure the function works correctly for an empty table (i.e.,
containing only a terminating null entry).

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:46 -05:00
Alex Elder
b829c1954d ceph: don't null-terminate xattr values
For some reason, ceph_setxattr() allocates an extra byte in which a
'\0' is stored past the end of an extended attribute value.  This is
not needed, and is potentially misleading, so get rid of it.

Signed-off-by: Alex Elder <elder@dreamhost.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:46 -05:00
Xi Wang
80834312a4 ceph: fix overflow check in build_snap_context()
The overflow check for a + n * b should be (n > (ULONG_MAX - a) / b),
rather than (n > ULONG_MAX / b - a).

Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:45 -05:00
Xi Wang
810339ec2f ceph: avoid panic with mismatched symlink sizes in fill_inode()
Return -EINVAL rather than panic if iinfo->symlink_len and inode->i_size
do not match.

Also use kstrndup rather than kmalloc/memcpy.

Signed-off-by: Xi Wang <xi.wang@gmail.com>
Reviewed-by: Alex Elder <elder@dreamhost.com>
2012-03-22 10:47:45 -05:00
Amon Ott
a661fc5611 ceph: use 2 instead of 1 as fallback for 32-bit inode number
The root directory of the Ceph mount has inode number 1, so falling back
to 1 always creates a collision. 2 is unused on my test systems and seems
less likely to collide.

Signed-off-by: Amon Ott <ao@m-privacy.de>
Signed-off-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:45 -05:00
Alex Elder
1ce208a6ce ceph: don't reset s_cap_ttl to zero
Avoid the need to check for a special zero s_cap_ttl value by just
using (jiffies - 1) as the value assigned to indicate "sometime in
the past."

Signed-off-by: Alex Elder <elder@dreamhost.com>
Reviewed-by: Sage Weil <sage@newdream.net>
2012-03-22 10:47:45 -05:00
Jan Kara
914b20070b btrfs: Fix busyloop in transaction_kthread()
When a filesystem got aborted due do error, transaction_kthread() will
busyloop.  Fix it by going to sleep in that case as well. Maybe we should
just stop transaction_kthread() when filesystem is aborted but that would be
more complex.

Signed-off-by: Jan Kara <jack@suse.cz>
2012-03-22 11:53:11 +01:00
Jeff Mahoney
79787eaab4 btrfs: replace many BUG_ONs with proper error handling
btrfs currently handles most errors with BUG_ON. This patch is a work-in-
 progress but aims to handle most errors other than internal logic
 errors and ENOMEM more gracefully.

 This iteration prevents most crashes but can run into lockups with
 the page lock on occasion when the timing "works out."

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
2012-03-22 11:52:54 +01:00
Artem Bityutskiy
182f514f88 ext4: remove useless s_dirt assignment
Clean-up ext4 a tiny bit by removing useless s_dirt assignment in
'ext4_fill_super()' because a bit later we anyway call
'ext4_setup_super()' which writes the superblock to the media
unconditionally.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2012-03-21 22:30:06 -04:00