linux

iv/linux

Author	SHA1	Message	Date
Alex Elder	4c46459cae	libceph: report defined but unsupported osd ops If osd_req_encode_op() is given any opcode it doesn't recognize it reports an error. This patch fleshes out that routine to distinguish between well-defined but unsupported values and values that are simply bogus. This and the next commit are related to: http://tracker.ceph.com/issues/4126 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-18 12:20:22 -06:00
Alex Elder	2979ddb11b	libceph: update ceph_osd_op_name() Update ceph_osd_op_name() to include the newly-added definitions in "rados.h", and to match its counterpart in the user space code. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-18 12:20:18 -06:00
Alex Elder	4b568b1aaf	libceph: add ceph_osd_state_name() Add the definition of ceph_osd_state_name(), to match its counterpart in user space. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-18 12:20:13 -06:00
Alex Elder	2d2f522699	libceph: kill ceph_osdc_wait_event() There are no actual users of ceph_osdc_wait_event(). This would have been one-shot events, but we no longer support those so just get rid of this function. Since this leaves nothing else that waits for the completion of an event, we can get rid of the completion in a struct ceph_osd_event. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-18 12:20:05 -06:00
Alex Elder	3c663bbdcd	libceph: kill ceph_osdc_create_event() "one_shot" parameter There is only one caller of ceph_osdc_create_event(), and it provides 0 as its "one_shot" argument. Get rid of that argument and just use 0 in its place. Replace the code in handle_watch_notify() that executes if one_shot is nonzero in the event with a BUG_ON() call. While modifying "osd_client.c", give handle_watch_notify() static scope. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-18 12:20:00 -06:00
Alex Elder	60e56f1381	libceph: kill ceph_calc_raw_layout() There is no caller of ceph_calc_raw_layout() outside of libceph, so there's no need to export from the module. Furthermore, there is only one caller, in calc_layout(), and it is not much more than a simple wrapper for that function. So get rid of ceph_calc_raw_layout() and embed it instead within calc_layout(). While touching "osd_client.c", get rid of the unnecessary forward declaration of __send_request(). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-18 12:19:54 -06:00
Alex Elder	60789380ae	libdeph: don't export ceph_osdc_init() or ceph_osdc_stop() The only callers of ceph_osdc_init() and ceph_osdc_stop() ceph_create_client() and ceph_destroy_client() (respectively) and they are in the same kernel module as those two functions. There's therefore no need to export those interfaces, so don't. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-18 12:19:48 -06:00
Alex Elder	f9d2519945	libceph: lock outside send_queued() Two of the three callers of the osd client's send_queued() function already hold the osd client mutex and drop it before the call. Change send_queued() so it assumes the caller holds the mutex, and update all callers accordingly. Rename it __send_queued() to match the convention used elsewhere in the file with respect to the lock. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-18 12:19:44 -06:00
Alex Elder	a3bea47e8b	ceph: kill ceph_osdc_new_request() "num_reply" parameter The "num_reply" parameter to ceph_osdc_new_request() is never used inside that function, so get rid of it. Note that ceph_sync_write() passes 2 for that argument, while all other callers pass 1. It doesn't matter, but perhaps someone should verify this doesn't indicate a problem. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-18 12:19:39 -06:00
Alex Elder	2480882611	ceph: kill ceph_osdc_writepages() "flags" parameter There is only one caller of ceph_osdc_writepages(), and it always passes 0 as its "flags" argument. Get rid of that argument and replace its use in ceph_osdc_writepages() with 0. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-18 12:19:35 -06:00
Alex Elder	fbf8685fb1	ceph: kill ceph_osdc_writepages() "dosync" parameter There is only one caller of ceph_osdc_writepages(), and it always passes 0 as its "dosync" argument. Get rid of that argument and replace its use in ceph_osdc_writepages() with 0. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-18 12:19:28 -06:00
Alex Elder	87f979d390	ceph: kill ceph_osdc_writepages() "nofail" parameter There is only one caller of ceph_osdc_writepages(), and it always passes the value true as its "nofail" argument. Get rid of that argument and replace its use in ceph_osdc_writepages() with the constant value true. This and a number of cleanup patches that follow resolve: http://tracker.ceph.com/issues/4126 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-18 12:19:22 -06:00
Alex Elder	9cbb1d7268	libceph: don't require r_num_pages for bio requests There is a check in the completion path for osd requests that ensures the number of pages allocated is enough to hold the amount of incoming data expected. For bio requests coming from rbd the "number of pages" is not really meaningful (although total length would be). So stop requiring that nr_pages be supplied for bio requests. This is done by checking whether the pages pointer is null before checking the value of nr_pages. Note that this value is passed on to the messenger, but there it's only used for debugging--it's never used for validation. While here, change another spot that used r_pages in a debug message inappropriately, and also invalidate the r_con_filling_msg pointer after dropping a reference to it. This resolves: http://tracker.ceph.com/issues/3875 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-13 18:29:11 -08:00
Alex Elder	1e32d34cfa	rbd: don't take extra bio reference for osd client Currently, if the OSD client finds an osd request has had a bio list attached to it, it drops a reference to it (or rather, to the first entry on that list) when the request is released. The code that added that reference (i.e., the rbd client) is therefore required to take an extra reference to that first bio structure. The osd client doesn't really do anything with the bio pointer other than transfer it from the osd request structure to outgoing (for writes) and ingoing (for reads) messages. So it really isn't the right place to be taking or dropping references. Furthermore, the rbd client already holds references to all bio structures it passes to the osd client, and holds them until the request is completed. So there's no need for this extra reference whatsoever. So remove the bio_put() call in ceph_osdc_release_request(), as well as its matching bio_get() call in rbd_osd_req_create(). This change could lead to a crash if old libceph.ko was used with new rbd.ko. Add a compatibility check at rbd initialization time to avoid this possibilty. This resolves: http://tracker.ceph.com/issues/3798 and http://tracker.ceph.com/issues/3799 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-13 18:29:11 -08:00
Alex Elder	72fe25e346	libceph: add a compatibility check interface An upcoming change implements semantic change that could lead to a crash if an old version of the libceph kernel module is used with a new version of the rbd kernel module. In order to preclude that possibility, this adds a compatibilty check interface. If this interface doesn't exist, the modules are obviously not compatible. But if it does exist, this provides a way of letting the caller know whether it will operate properly with this libceph module. Perhaps confusingly, it returns false right now. The semantic change mentioned above will make it return true. This resolves: http://tracker.ceph.com/issues/3800 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-13 18:29:11 -08:00
Alex Elder	3ebc21f7bc	libceph: fix messenger CONFIG_BLOCK dependencies The ceph messenger has a few spots that are only used when bio messages are supported, and that's only when CONFIG_BLOCK is defined. This surrounds a couple of spots with #ifdef's that would cause a problem if CONFIG_BLOCK were not present in the kernel configuration. This resolves: http://tracker.ceph.com/issues/3976 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-13 18:27:40 -08:00
Alex Elder	969e5aa3b0	Merge branch 'testing' of github.com:ceph/ceph-client into v3.8-rc5-testing	2013-01-30 07:54:34 -06:00
Cong Ding	1ec3911dbd	libceph: fix undefined behavior when using snprintf() The variable "str" is used as both the source and destination in function snprintf(), which is undefined behavior based on C11. The original description in C11 is: "If copying takes place between objects that overlap, the behavior is undefined." And, the function of ceph_osdmap_state_str() is to return the osdmap state, so it should return "doesn't exist" when all the conditions are not satisfied. I fix it in this patch. [elder@inktank.com: shortened the commit message] Signed-off-by: Cong Ding <dinggnu@gmail.com> Reviewed-by: Alex Elder <elder@inktank.com>	2013-01-25 17:48:59 -06:00
Alex Elder	ae7ca4a35b	libceph: pass num_op with ops Both ceph_osdc_alloc_request() and ceph_osdc_build_request() are provided an array of ceph osd request operations. Rather than just passing the number of operations in the array, the caller is required append an additional zeroed operation structure to signal the end of the array. All callers know the number of operations at the time these functions are called, so drop the silly zero entry and supply that number directly. As a result, get_num_ops() is no longer needed. This also means that ceph_osdc_alloc_request() never uses its ops argument, so that can be dropped. Also rbd_create_rw_ops() no longer needs to add one to reserve room for the additional op. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-01-17 16:34:57 -06:00
Alex Elder	54a5400721	libceph: don't set pages or bio in ceph_osdc_alloc_request() Only one of the two callers of ceph_osdc_alloc_request() provides page or bio data for its payload. And essentially all that function was doing with those arguments was assigning them to fields in the osd request structure. Simplify ceph_osdc_alloc_request() by having the caller take care of making those assignments Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-01-17 16:34:57 -06:00
Alex Elder	d178a9e740	libceph: don't set flags in ceph_osdc_alloc_request() The only thing ceph_osdc_alloc_request() really does with the flags value it is passed is assign it to the newly-created osd request structure. Do that in the caller instead. Both callers subsequently call ceph_osdc_build_request(), so have that function (instead of ceph_osdc_alloc_request()) issue a warning if a request comes through with neither the read nor write flags set. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-01-17 15:52:05 -06:00
Alex Elder	e75b45cf36	libceph: drop osdc from ceph_calc_raw_layout() The osdc parameter to ceph_calc_raw_layout() is not used, so get rid of it. Consequently, the corresponding parameter in calc_layout() becomes unused, so get rid of that as well. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-01-17 15:52:05 -06:00
Alex Elder	4d6b250bf1	libceph: drop snapid in ceph_calc_raw_layout() A snapshot id must be provided to ceph_calc_raw_layout() even though it is not needed at all for calculating the layout. Where the snapshot id is needed is when building the request message for an osd operation. Drop the snapid parameter from ceph_calc_raw_layout() and pass that value instead in ceph_osdc_build_request(). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-01-17 15:52:05 -06:00
Alex Elder	e8afad656c	libceph: pass length to ceph_calc_file_object_mapping() ceph_calc_file_object_mapping() takes (among other things) a "file" offset and length, and based on the layout, determines the object number ("bno") backing the affected portion of the file's data and the offset into that object where the desired range begins. It also computes the size that should be used for the request--either the amount requested or something less if that would exceed the end of the object. This patch changes the input length parameter in this function so it is used only for input. That is, the argument will be passed by value rather than by address, so the value provided won't get updated by the function. The value would only get updated if the length would surpass the current object, and in that case the value it got updated to would be exactly that returned in *oxlen. Only one of the two callers is affected by this change. Update ceph_calc_raw_layout() so it records any updated value. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-01-17 15:52:04 -06:00
Alex Elder	0120be3c60	libceph: pass length to ceph_osdc_build_request() The len argument to ceph_osdc_build_request() is set up to be passed by address, but that function never updates its value so there's no need to do this. Tighten up the interface by passing the length directly. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-01-17 15:52:04 -06:00
Alex Elder	5b9d1b1cd4	libceph: kill op_needs_trail() Since every osd message is now prepared to include trailing data, there's no need to check ahead of time whether any operations will make use of the trail portion of the message. We can drop the second argument to get_num_ops(), and as a result we can also get rid of op_needs_trail() which is no longer used. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-01-17 15:52:03 -06:00
Alex Elder	c885837f7d	libceph: always allow trail in osd request An osd request structure contains an optional trail portion, which if present will contain data to be passed in the payload portion of the message containing the request. The trail field is a ceph_pagelist pointer, and if null it indicates there is no trail. A ceph_pagelist structure contains a length field, and it can legitimately hold value 0. Make use of this to change the interpretation of the "trail" of an osd request so that every osd request has trailing data, it just might have length 0. This means we change the r_trail field in a ceph_osd_request structure from a pointer to a structure that is always initialized. Note that in ceph_osdc_start_request(), the trail pointer (or now address of that structure) is assigned to a ceph message's trail field. Here's why that's still OK (looking at net/ceph/messenger.c): - What would have resulted in a null pointer previously will now refer to a 0-length page list. That message trail pointer is used in two functions, write_partial_msg_pages() and out_msg_pos_next(). - In write_partial_msg_pages(), a null page list pointer is handled the same as a message with 0-length trail, and both result in a "in_trail" variable set to false. The trail pointer is only used if in_trail is true. - The only other place the message trail pointer is used is out_msg_pos_next(). That function is only called by write_partial_msg_pages() and only touches the trail pointer if the in_trail value it is passed is true. Therefore a null ceph_msg->trail pointer is equivalent to a non-null pointer referring to a 0-length page list structure. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-01-17 15:52:03 -06:00
Alex Elder	af77f26caa	rbd: drop oid parameters from ceph_osdc_build_request() The last two parameters to ceph_osd_build_request() describe the object id, but the values passed always come from the osd request structure whose address is also provided. Get rid of those last two parameters. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-01-17 15:52:01 -06:00
Alex Elder	c3acb18196	libceph: reformat __reset_osd() Reformat __reset_osd() into three distinct blocks of code handling the three return cases. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-01-17 14:07:44 -06:00
Sage Weil	7d7c1f6136	crush: avoid recursion if we have already collided This saves us some cycles, but does not affect the placement result at all. This corresponds to ceph.git commit 4abb53d4f. Signed-off-by: Sage Weil <sage@inktank.com>	2013-01-17 12:42:39 -06:00
Jim Schutt	1604f488ac	libceph: for chooseleaf rules, retry CRUSH map descent from root if leaf is failed Add libceph support for a new CRUSH tunable recently added to Ceph servers. Consider the CRUSH rule step chooseleaf firstn 0 type <node_type> This rule means that <n> replicas will be chosen in a manner such that each chosen leaf's branch will contain a unique instance of <node_type>. When an object is re-replicated after a leaf failure, if the CRUSH map uses a chooseleaf rule the remapped replica ends up under the <node_type> bucket that held the failed leaf. This causes uneven data distribution across the storage cluster, to the point that when all the leaves but one fail under a particular <node_type> bucket, that remaining leaf holds all the data from its failed peers. This behavior also limits the number of peers that can participate in the re-replication of the data held by the failed leaf, which increases the time required to re-replicate after a failure. For a chooseleaf CRUSH rule, the tree descent has two steps: call them the inner and outer descents. If the tree descent down to <node_type> is the outer descent, and the descent from <node_type> down to a leaf is the inner descent, the issue is that a down leaf is detected on the inner descent, so only the inner descent is retried. In order to disperse re-replicated data as widely as possible across a storage cluster after a failure, we want to retry the outer descent. So, fix up crush_choose() to allow the inner descent to return immediately on choosing a failed leaf. Wire this up as a new CRUSH tunable. Note that after this change, for a chooseleaf rule, if the primary OSD in a placement group has failed, choosing a replacement may result in one of the other OSDs in the PG colliding with the new primary. This requires that OSD's data for that PG to need moving as well. This seems unavoidable but should be relatively rare. This corresponds to ceph.git commit 88f218181a9e6d2292e2697fc93797d0f6d6e5dc. Signed-off-by: Jim Schutt <jaschut@sandia.gov> Reviewed-by: Sage Weil <sage@inktank.com>	2013-01-17 12:42:39 -06:00
Yan, Zheng	a41bad1a9b	ceph: re-calculate truncate_size for strip object Otherwise osd may truncate the object to larger size. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Reviewed-by: Sage Weil <sage@inktank.com>	2013-01-17 12:42:37 -06:00
Linus Torvalds	58890c0669	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull Ceph fixes from Sage Weil: "Two of Alex's patches deal with a race when reseting server connections for open RBD images, one demotes some non-fatal BUGs to WARNs, and my patch fixes a protocol feature bit failure path." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: libceph: fix protocol feature mismatch failure path libceph: WARN, don't BUG on unexpected connection states libceph: always reset osds when kicking libceph: move linger requests sooner in kick_requests()	2013-01-02 17:32:49 -08:00
Sage Weil	0fa6ebc600	libceph: fix protocol feature mismatch failure path We should not set con->state to CLOSED here; that happens in ceph_fault() in the caller, where it first asserts that the state is not yet CLOSED. Avoids a BUG when the features don't match. Since the fail_protocol() has become a trivial wrapper, replace calls to it with direct calls to reset_connection(). Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com>	2012-12-27 20:27:04 -06:00
Alex Elder	122070a2ff	libceph: WARN, don't BUG on unexpected connection states A number of assertions in the ceph messenger are implemented with BUG_ON(), killing the system if connection's state doesn't match what's expected. At this point our state model is (evidently) not well understood enough for these assertions to trigger a BUG(). Convert all BUG_ON(con->state...) calls to be WARN_ON(con->state...) so we learn about these issues without killing the machine. We now recognize that a connection fault can occur due to a socket closure at any time, regardless of the state of the connection. So there is really nothing we can assert about the state of the connection at that point so eliminate that assertion. Reported-by: Ugis <ugis22@gmail.com> Tested-by: Ugis <ugis22@gmail.com> Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2012-12-27 20:27:04 -06:00
Alex Elder	e6d50f67a6	libceph: always reset osds when kicking When ceph_osdc_handle_map() is called to process a new osd map, kick_requests() is called to ensure all affected requests are updated if necessary to reflect changes in the osd map. This happens in two cases: whenever an incremental map update is processed; and when a full map update (or the last one if there is more than one) gets processed. In the former case, the kick_requests() call is followed immediately by a call to reset_changed_osds() to ensure any connections to osds affected by the map change are reset. But for full map updates this isn't done. Both cases should be doing this osd reset. Rather than duplicating the reset_changed_osds() call, move it into the end of kick_requests(). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2012-12-27 20:27:04 -06:00
Alex Elder	ab60b16d3c	libceph: move linger requests sooner in kick_requests() The kick_requests() function is called by ceph_osdc_handle_map() when an osd map change has been indicated. Its purpose is to re-queue any request whose target osd is different from what it was when it was originally sent. It is structured as two loops, one for incomplete but registered requests, and a second for handling completed linger requests. As a special case, in the first loop if a request marked to linger has not yet completed, it is moved from the request list to the linger list. This is as a quick and dirty way to have the second loop handle sending the request along with all the other linger requests. Because of the way it's done now, however, this quick and dirty solution can result in these incomplete linger requests never getting re-sent as desired. The problem lies in the fact that the second loop only arranges for a linger request to be sent if it appears its target osd has changed. This is the proper handling for completed linger requests (it avoids issuing the same linger request twice to the same osd). But although the linger requests added to the list in the first loop may have been sent, they have not yet completed, so they need to be re-sent regardless of whether their target osd has changed. The first required fix is we need to avoid calling __map_request() on any incomplete linger request. Otherwise the subsequent __map_request() call in the second loop will find the target osd has not changed and will therefore not re-send the request. Second, we need to be sure that a sent but incomplete linger request gets re-sent. If the target osd is the same with the new osd map as it was when the request was originally sent, this won't happen. This can be fixed through careful handling when we move these requests from the request list to the linger list, by unregistering the request before it is registered as a linger request. This works because a side-effect of unregistering the request is to make the request's r_osd pointer be NULL, and that will ensure the second loop actually re-sends the linger request. Processing of such a request is done at that point, so continue with the next one once it's been moved. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2012-12-27 20:27:04 -06:00
Linus Torvalds	40889e8d9f	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull Ceph update from Sage Weil: "There are a few different groups of commits here. The largest is Alex's ongoing work to enable the coming RBD features (cloning, striping). There is some cleanup in libceph that goes along with it. Cyril and David have fixed some problems with NFS reexport (leaking dentries and page locks), and there is a batch of patches from Yan fixing problems with the fs client when running against a clustered MDS. There are a few bug fixes mixed in for good measure, many of which will be going to the stable trees once they're upstream. My apologies for the late pull. There is still a gremlin in the rbd map/unmap code and I was hoping to include the fix for that as well, but we haven't been able to confirm the fix is correct yet; I'll send that in a separate pull once it's nailed down." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (68 commits) rbd: get rid of rbd_{get,put}_dev() libceph: register request before unregister linger libceph: don't use rb_init_node() in ceph_osdc_alloc_request() libceph: init event->node in ceph_osdc_create_event() libceph: init osd->o_node in create_osd() libceph: report connection fault with warning libceph: socket can close in any connection state rbd: don't use ENOTSUPP rbd: remove linger unconditionally rbd: get rid of RBD_MAX_SEG_NAME_LEN libceph: avoid using freed osd in __kick_osd_requests() ceph: don't reference req after put rbd: do not allow remove of mounted-on image libceph: Unlock unprocessed pages in start_read() error path ceph: call handle_cap_grant() for cap import message ceph: Fix __ceph_do_pending_vmtruncate ceph: Don't add dirty inode to dirty list if caps is in migration ceph: Fix infinite loop in __wake_requests ceph: Don't update i_max_size when handling non-auth cap bdi_register: add __printf verification, fix arg mismatch ...	2012-12-20 14:00:13 -08:00
Alex Elder	c89ce05e0c	libceph: register request before unregister linger In kick_requests(), we need to register the request before we unregister the linger request. Otherwise the unregister will reset the request's osd pointer to NULL. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2012-12-20 10:56:39 -06:00
Alex Elder	a978fa20fb	libceph: don't use rb_init_node() in ceph_osdc_alloc_request() The red-black node in the ceph osd request structure is initialized in ceph_osdc_alloc_request() using rbd_init_node(). We do need to initialize this, because in __unregister_request() we call RB_EMPTY_NODE(), which expects the node it's checking to have been initialized. But rb_init_node() is apparently overkill, and may in fact be on its way out. So use RB_CLEAR_NODE() instead. For a little more background, see this commit: `4c199a93` rbtree: empty nodes have no color" Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2012-12-20 10:56:33 -06:00
Alex Elder	3ee5234df6	libceph: init event->node in ceph_osdc_create_event() The red-black node node in the ceph osd event structure is not initialized in create_osdc_create_event(). Because this node can be the subject of a RB_EMPTY_NODE() call later on, we should ensure the node is initialized properly for that. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2012-12-20 10:56:28 -06:00
Alex Elder	f407731d12	libceph: init osd->o_node in create_osd() The red-black node node in the ceph osd structure is not initialized in create_osd(). Because this node can be the subject of a RB_EMPTY_NODE() call later on, we should ensure the node is initialized properly for that. Add a call to RB_CLEAR_NODE() initialize it. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2012-12-20 10:56:21 -06:00
Alex Elder	28362986f8	libceph: report connection fault with warning When a connection's socket disconnects, or if there's a protocol error of some kind on the connection, a fault is signaled and the connection is reset (closed and reopened, basically). We currently get an error message on the log whenever this occurs. A ceph connection will attempt to reestablish a socket connection repeatedly if a fault occurs. This means that these error messages will get repeatedly added to the log, which is undesirable. Change the error message to be a warning, so they don't get logged by default. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2012-12-20 10:56:13 -06:00
Alex Elder	7bb21d68c5	libceph: socket can close in any connection state A connection's socket can close for any reason, independent of the state of the connection (and without irrespective of the connection mutex). As a result, the connectino can be in pretty much any state at the time its socket is closed. Handle those other cases at the top of con_work(). Pull this whole block of code into a separate function to reduce the clutter. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2012-12-17 12:07:32 -06:00
Alex Elder	61c7403562	rbd: remove linger unconditionally In __unregister_linger_request(), the request is being removed from the osd client's req_linger list only when the request has a non-null osd pointer. It should be done whether or not the request currently has an osd. This is most likely a non-issue because I believe the request will always have an osd when this function is called. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2012-12-17 12:07:31 -06:00
Alex Elder	685a7555ca	libceph: avoid using freed osd in __kick_osd_requests() If an osd has no requests and no linger requests, __reset_osd() will just remove it with a call to __remove_osd(). That drops a reference to the osd, and therefore the osd may have been free by the time __reset_osd() returns. That function offers no indication this may have occurred, and as a result the osd will continue to be used even when it's no longer valid. Change__reset_osd() so it returns an error (ENODEV) when it deletes the osd being reset. And change __kick_osd_requests() so it returns immediately (before referencing osd again) if __reset_osd() returns any error. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2012-12-17 08:37:23 -06:00
Alex Elder	7d5f24812b	ceph: don't reference req after put In __unregister_request(), there is a call to list_del_init() referencing a request that was the subject of a call to ceph_osdc_put_request() on the previous line. This is not safe, because the request structure could have been freed by the time we reach the list_del_init(). Fix this by reversing the order of these lines. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-off-by: Sage Weil <sage@inktank.com>	2012-12-17 08:37:19 -06:00
Sage Weil	83aff95eb9	libceph: remove 'osdtimeout' option This would reset a connection with any OSD that had an outstanding request that was taking more than N seconds. The idea was that if the OSD was buggy, the client could compensate by resending the request. In reality, this only served to hide server bugs, and we haven't actually seen such a bug in quite a while. Moreover, the userspace client code never did this. More importantly, often the request is taking a long time because the OSD is trying to recover, or overloaded, and killing the connection and retrying would only make the situation worse by giving the OSD more work to do. Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com>	2012-12-13 08:13:06 -06:00
Alex Elder	72afc71ffc	libceph: define ceph_pg_pool_name_by_id() Define and export function ceph_pg_pool_name_by_id() to supply the name of a pg pool whose id is given. This will be used by the next patch. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2012-11-01 07:55:42 -05:00
Sage Weil	0ed7285e00	libceph: fix osdmap decode error paths Ensure that we set the err value correctly so that we do not pass a 0 value to ERR_PTR and confuse the calling code. (In particular, osd_client.c handle_map() will BUG(!newmap)). Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com>	2012-10-30 08:21:05 -05:00

1 2 3 4 5 ...

309 Commits