27 Commits

Author SHA1 Message Date
Mike Marshall
9f08cfe944 Orangefs: update orangefs.txt
Al Viro has cleaned up the way ops are processed and waited for,
now orangefs.txt has an overview of how it works. Several recent
related commits have added to the comments in the code as well.

Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2016-02-26 14:39:08 -05:00
Mike Marshall
ca9f518ead Orangefs: code sanitation.
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2016-02-26 10:21:12 -05:00
Mike Marshall
adcf34a289 Orangefs: code sanitation
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2016-02-24 16:54:27 -05:00
Al Viro
05a50a5be8 orangefs: have ..._clean_interrupted_...() wait for copy to/from daemon
* turn all those list_del(&op->list) into list_del_init()
* don't pick ops that are already given up in control device
  ->read()/->write_iter().
* have orangefs_clean_interrupted_operation() notice if op is currently
  being copied to/from daemon (by said ->read()/->write_iter()) and
  wait for that to finish.
* when we are done copying to/from daemon and find that it had been
  given up while we were doing that, wake the waiting ..._clean_interrupted_...

As the result, we are guaranteed that orangefs_clean_interrupted_operation(op)
doesn't return until nobody else can see op.  Moreover, we don't need to play
with op refcounts anymore.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2016-02-19 13:45:56 -05:00
Al Viro
5964c1b839 orangefs: set correct ->downcall.status on failing to copy reply from daemon
... and clean the end of control device ->write_iter() while we are at it

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2016-02-19 13:45:55 -05:00
Al Viro
897c5df6cf orangefs: get rid of op->done
shouldn't be needed now

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2016-02-19 13:45:55 -05:00
Al Viro
ea2c9c9f65 orangefs: bufmap rewrite
new waiting-for-slot logics:
	* make request for slot wait for bufmap to be set up if it
comes before it's installed *OR* while it's running down
	* make closing control device wait for all slots to be freed
	* waiting itself rewritten to (open-coded) analogues of wait_event_...
primitives - we would need wait_event_locked() and, pardon an obscenely
long name, wait_event_interruptible_exclusive_timeout_locked().
	* we never wait for more than slot_timeout_secs in total and,
if during the wait the daemon goes away, we only allow
ORANGEFS_BUFMAP_WAIT_TIMEOUT_SECS for it to come back.
	* (cosmetical) bitmap is used instead of an array of zeroes and ones
	* old (and only reached if we are about to corrupt memory) waiting
for daemon restart in service_operation() removed.

[Martin's fixes folded]

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2016-02-19 13:45:54 -05:00
Al Viro
78699e29fd orangefs: delay freeing slot until cancel completes
Make cancels reuse the aborted read/write op, to make sure they do not
fail on lack of memory.

Don't issue a cancel unless the daemon has seen our read/write, has not
replied and isn't being shut down.

If cancel *is* issued, don't wait for it to complete; stash the slot
in there and just have it freed when cancel is finally replied to or
purged (and delay dropping the reference until then, obviously).

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2016-02-19 13:45:53 -05:00
Mike Marshall
5090c9670d Orangefs: improve gossip statement
There were two just alike, making it hard maybe to tell which one
you were looking at in syslog... so I changed it a little by adding
some extra interesting tidbits to it...

Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2016-02-04 13:29:27 -05:00
Al Viro
2a9e5c2260 orangefs: don't reinvent completion.h...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2016-01-23 15:20:11 -05:00
Al Viro
4f55e39732 if ORANGEFS_VFS_OP_FILE_IO request had been given up, don't bother waiting
... we are not going to get woken up anyway, so it's just going to time out
and whine.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2016-01-23 15:20:11 -05:00
Al Viro
727cbfea62 orangefs: get rid of MSECS_TO_JIFFIES
All timeouts are in _seconds_, so all calls are of form
MSECS_TO_JIFFIES(n * 1000), which is a convoluted way to
spell n * HZ.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2016-01-23 15:20:11 -05:00
Al Viro
ed42fe0593 orangefs: hopefully saner op refcounting and locking
* create with refcount 1
* make op_release() decrement and free if zero (i.e. old put_op()
  has become that).
* mark when submitter has given up waiting; from that point nobody
  else can move between the lists, change state, etc.
* have daemon read/write_iter grab a reference when picking op
  and *always* give it up in the end
* don't put into hash until we know it's been successfully passed to
  daemon

* move op->lock _lower_ than htab_in_progress_lock (and make sure
  to take it in purge_inprogress_ops())

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2016-01-23 13:03:12 -05:00
Al Viro
fee25ce125 orangefs: make sure that reopening pvfs2-req won't overlap with the end of close
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2016-01-23 12:55:24 -05:00
Al Viro
831d094979 orangefs: move wakeups into set_op_state_{serviced,purged}()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2016-01-23 12:42:43 -05:00
Al Viro
90e54e36c9 orangefs: ->poll() doesn't need spinlock
not just for list_empty()...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2016-01-23 12:42:43 -05:00
Al Viro
8016387ce7 orangefs: kill ioctl32 rudiments
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2016-01-23 12:42:43 -05:00
Al Viro
83595db052 orangefs: ->poll() is only called between successful ->open() and ->release()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2016-01-23 12:42:43 -05:00
Al Viro
fb6d2526e9 orangefs: generic_file_open() is pointless for character devices
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2016-01-23 12:42:43 -05:00
Mike Marshall
cf0c27715b Orangefs: make gossip statement more palatable to xtensa
Thanks to Intel's kbuild test robot

Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2016-01-19 12:04:40 -05:00
Mike Marshall
b3ae4755f5 Orangefs: implement .write_iter
Until now, orangefs_devreq_write_iter has just been a wrapper for
the old-fashioned orangefs_devreq_writev... linux would call
.write_iter with "struct kiocb *iocb" and "struct iov_iter *iter"
and .write_iter would just:

        return pvfs2_devreq_writev(iocb->ki_filp,
                                   iter->iov,
                                   iter->nr_segs,
                                   &iocb->ki_pos);

Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2016-01-13 11:18:12 -05:00
Martin Brandenburg
7d2214858f orangefs: Fix some more global namespace pollution.
This only changes the names of things, so there is no functional change.

Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2016-01-04 16:21:46 -05:00
Martin Brandenburg
a762ae6dc5 orangefs: Remove ``aligned'' upcall and downcall length macros.
There was previously MAX_ALIGNED_DEV_REQ_(UP|DOWN)SIZE macros which
evaluated to MAX_DEV_REQ_(UP|DOWN)SIZE+8. As it is unclear what this is
for, other than creating a situation where we accept more data than we
can parse, it is removed.

Signed-off-by: Mike Marshall <hubcap@omnibond.com>
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
2015-12-17 14:33:38 -05:00
Martin Brandenburg
90d26aa808 Orangefs: do not finalize bufmap if it was never initialized.
Found by the infant Orangefs fuzzer...

Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2015-12-15 15:37:53 -05:00
Mike Marshall
ce6c414e17 Orangefs: Don't wait the old-fashioned way.
Get rid of add_wait_queue, set_current_state, etc, and use the
wait_event() model.

Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2015-12-14 14:54:46 -05:00
Mike Marshall
97f100277c Orangefs: de-uglify orangefs_devreq_writev, and devorangefs-req.c in general
AV dislikes many parts of orangefs_devreq_writev. Besides making
orangefs_devreq_writev more easily readable and better commented,
this patch makes an effort to address some of the problems:

 > The 5th is quietly ignored unless trailer_size is positive and
 > status is zero. If trailer_size > 0 && status == 0, you verify that
 > the length of the 5th segment is no more than trailer_size and copy
 > it to vmalloc'ed buffer. Without bothering to zero the rest of that
 > buffer out.

It was just wrong to allow a 5th segment that is not exactly equal to
trailer_size. Now that that's fixed, there's nothing to zero out in
the vmalloced buffer - it is exactly the right size to hold the
5th segment.

 > Another API bogosity: when the 5th segment is present, successful writev()
 > returns the sum of sizes of the first 4.

Added size of 5th segment to writev return...

 > if concatenation of the first 4 segments is longer than
 > 16 + sizeof(struct pvfs2_downcall_s) by no more than sizeof(long) => whine
 > and proceed with garbage.

If 4th segment isn't exactly sizeof(struct pvfs2_downcall_s), whine and fail.

 > if the 32bit value 4 bytes into op->downcall is zero and 64bit
 > value following it is non-zero, the latter is interpreted as the size of
 > trailer data.

The latter is what userspace claimed was the length of the trailer data.
The kernel module now compares it to the trailer iovec's iov_len as a
sanity check.

 > if there's no trailer, the 5th segment (if present) is completely ignored.

Whine and fail if there should be no trailer, yet a 5th segment is present.

 > if vmalloc fails, act as if status (32bit at offset 5 into
 > op->downcall) had been -ENOMEM and don't look at the 5th segment at all.

whine and fail with -ENOMEM.

Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2015-12-14 13:32:05 -05:00
Mike Marshall
575e946125 Orangefs: change pvfs2 filenames to orangefs
Also changed references within source files that referred to
header files whose names had changed.

Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2015-12-04 12:56:14 -05:00