IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Because the minor_get and related functions use the return values for
errors, the compiler doesn't know that sysminor will always either 1) be
initialized in aoedev_by_aoeaddr by the call to minor_get, or 2) be
unused as the "goto out" is executed.
This patch avoids the compiler warning.
Signed-off-by: Ed Cashin <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
For some special-purpose systems where udev isn't present, static
allocation of minor numbers is desirable. This update distinguishes
different failure scenarios, to help the user understand what went
wrong.
Signed-off-by: Ed Cashin <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There is no need to call the request handler function in the I/O
completion routine. The user impact of not doing it is a more "nice" aoe
driver that is less susceptible to causing soft lockups.
Signed-off-by: Ed Cashin <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A misplaced comment was attached to the nout member of the aoetgt. This
change corrects the comment.
Signed-off-by: Ed Cashin <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The aoe driver will never be waiting for more than aoe_maxout AoE
commands from a given remote network port on an AoE target. Increasing
the cap increases performance. Users can tighten the setting to reduce
the amount of memory used for handling AoE traffic or the network
bandwidth used for AoE.
Signed-off-by: Ed Cashin <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Before the aoe driver was an I/O request handler, it was a
make_request-style block driver. Even so, there was a problem where
sysfs expected a request queue to exist, so one was provided in commit
7135a71b19be ("aoe: allocate unused request_queue for sysfs").
During the transition to the request-handler style, a patch was merged
that was based on a driver without the noop queue, and the noop queue
remained in place after the patch was merged, even though a new
functional queue was introduced by the patch, allocated through
blk_init_queue.
The user impact is a memory leak proportional to the number of AoE
targets discovered. This patch removes the memory leak and cleans up
vestiges of the old do-nothing queue from the aoeblk_gdalloc function.
Signed-off-by: Ed Cashin <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit f3b8e07af774 ("aoe: commands in retransmit queue use new
destination on failure") omits the copying of the coarse-grained time
when an AoE command was sent during the failover from one destination
MAC address on the AoE target to another.
The coarse-grained timing is only used when the system time changes or
an unlikely length of time has passed since the sending of the AoE
command. Users will not be impacted unless their system clock is very
inaccurate or something unusual (e.g., 10 GbE link reset) happens during
the period when the aoe driver is handling the failure of a port on the
AoE target. Being effected will mean that an AoE target could be
considered "down" too eagerly.
Signed-off-by: Ed Cashin <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When one remote MAC address isn't working as a destination for AoE
commands, the frames used to track information associated with the AoE
commands are moved to a new aoetgt (defined by the tuple of {AoE major,
AoE minor, target MAC address}).
This patch makes sure that the frames on the queue for retransmits that
need to be done are updated to use the new destination, so that
retransmits will be sent through a working network path.
Without this change, packets on the retransmit queue will be needlessly
retransmitted to the unresponsive destination MAC, possibly causing
premature target failure before there's time for the retransmit timer to
run again, decide to retransmit again, and finally update the destination
to a working MAC address on the AoE target.
Signed-off-by: Ed Cashin <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
These changes improve the accuracy of the decision about whether it's time
to retransmit an AoE command by using the microsecond-resolution
gettimeofday instead of jiffies.
Because the system time can jump suddenly, the decision reverts to using
jiffies if the high-resolution time difference is relatively large.
Otherwise the AoE targets could be considered failed inappropriately.
Signed-off-by: Ed Cashin <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With this bugfix in place the calculation of the criterion for "lateness"
is performed under lock. Without the lock, there is a chance that one of
the non-atomic operations performed on the round trip time statistics
could be incomplete, such that an incorrect lateness criterion would be
calculated.
Without this change, the effect of the bug would be rare unecessary but
benign retransmissions.
Signed-off-by: Ed Cashin <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The /dev/etherd/err character device provides low-level information about
normal but sometimes interesting AoE command retransmits and "unexpected
responses", i.e., responses for packets that have already been
retransmitted.
This change adds MAC addresses to the messages about unexpected responses,
so that when they occur, it's more easy to determine the network paths to
which they belong.
Signed-off-by: Ed Cashin <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The aoe driver already had some congestion handling, but it was limited in
its ability to cope with the kind of congestion that can arise on more
complex networks such as those involving paths through multiple ethernet
switches.
Some of the lessons from TCP's history of development can be applied to
improving the congestion control and avoidance on AoE storage networks.
These changes use familar concepts from Van Jacobson's "Congestion
Avoidance and Control" paper from '88, without adding significant
overhead.
This patch depends on an upcoming patch that covers the failover case when
AoE commands being retransmitted are transferred from one retransmit queue
to another. Another upcoming patch increases the timing accuracy.
Signed-off-by: Ed Cashin <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Make the aoe driver follow expected behavior when the user uses ioctl to
get the ATA device identify information, allowing access to model, serial
number, etc.
Signed-off-by: Ed Cashin <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The userland aoetools package includes an "aoe-stat" command that can
display a "payload size" column when the aoe driver exports this
information. Users can quickly see what amount of user data is
transferred inside each AoE command on the network, network headers
excluded.
Signed-off-by: Ed Cashin <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The GPFS filesystem is an example of an aoe user that requires the aoe
driver to support I/O request sizes larger than the default. Most users
will not need large I/O request sizes, because they would need to be split
up into multiple AoE commands anyway.
Signed-off-by: Ed Cashin <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Users sometimes want to cause the aoe driver to forget a particular
previously discovered device when it is no longer online. The aoetools
provide an "aoe-flush" command that users run to perform this
administrative task. The changes below provide the support needed in the
driver.
Signed-off-by: Ed Cashin <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The ATA over Ethernet config query response contains a "buffer count"
field reflecting the AoE target's capacity to buffer incoming AoE
commands.
By taking the current value of this field into accound, we increase
performance throughput or avoid network congestion, when the value
has increased or decreased, respectively.
Signed-off-by: Ed Cashin <ecashin@coraid.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dropped transmits are not common, but when they do occur, increasing
the transmit queue length often helps.
Signed-off-by: Ed Cashin <ecashin@coraid.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull block driver update from Jens Axboe:
"Now that the core bits are in, here are the driver bits for 3.8. The
branch contains:
- A huge pile of drbd bits that were dumped from the 3.7 merge
window. Following that, it was both made perfectly clear that
there is going to be no more over-the-wall pulls and how the
situation on individual pulls can be improved.
- A few cleanups from Akinobu Mita for drbd and cciss.
- Queue improvement for loop from Lukas. This grew into adding a
generic interface for waiting/checking an even with a specific
lock, allowing this to be pulled out of md and now loop and drbd is
also using it.
- A few fixes for xen back/front block driver from Roger Pau Monne.
- Partition improvements from Stephen Warren, allowing partiion UUID
to be used as an identifier."
* 'for-3.8/drivers' of git://git.kernel.dk/linux-block: (609 commits)
drbd: update Kconfig to match current dependencies
drbd: Fix drbdsetup wait-connect, wait-sync etc... commands
drbd: close race between drbd_set_role and drbd_connect
drbd: respect no-md-barriers setting also when changed online via disk-options
drbd: Remove obsolete check
drbd: fixup after wait_even_lock_irq() addition to generic code
loop: Limit the number of requests in the bio list
wait: add wait_event_lock_irq() interface
xen-blkfront: free allocated page
xen-blkback: move free persistent grants code
block: partition: msdos: provide UUIDs for partitions
init: reduce PARTUUID min length to 1 from 36
block: store partition_meta_info.uuid as a string
cciss: use check_signature()
cciss: cleanup bitops usage
drbd: use copy_highpage
drbd: if the replication link breaks during handshake, keep retrying
drbd: check return of kmalloc in receive_uuids
drbd: Broadcast sync progress no more often than once per second
drbd: don't try to clear bits once the disk has failed
...
ENOTSUPP is not a standard errno (it shows up as "Unknown error 524"
in an error message). This is what was getting produced when the
the local rbd code does not implement features required by a
discovered rbd image.
Change the error code returned in this case to ENXIO.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
RBD_MAX_SEG_NAME_LEN represents the maximum length of an rbd object
name (i.e., one of the objects providing storage backing an rbd
image).
Another symbol, MAX_OBJ_NAME_SIZE, is used in the osd client code to
define the maximum length of any object name in an osd request.
Right now they disagree, with RBD_MAX_SEG_NAME_LEN being too big.
There's no real benefit at this point to defining the rbd object
name length limit separate from any other object name, so just
get rid of RBD_MAX_SEG_NAME_LEN and use MAX_OBJ_NAME_SIZE in its
place.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
There is no check in rbd_remove() to see if anybody holds open the
image being removed. That's not cool.
Add a simple open count that goes up and down with opens and closes
(releases) of the device, and don't allow an rbd image to be removed
if the count is non-zero.
Protect the updates of the open count value with ctl_mutex to ensure
the underlying rbd device doesn't get removed while concurrently
being opened.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Change foreach_grant iterator to a safe version, that allows freeing
the element while iterating. Also move the free code in
free_persistent_gnts to prevent freeing the element before the rb_next
call.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad@kernel.org>
Cc: xen-devel@lists.xen.org
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
We no longer need the connector.
But we need libcrc32c.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
This was introduces when moving the code over from the 8.3 codebase
with commit 328e0f125bf41f4f33f684db22015f92cb44fe56
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
drbd_set_role(, R_PRIMARY, ) does the state change to Primary,
some more housekeeping, and possibly generates a new UUID set.
All of this holding the "state_mutex".
The connection handshake involves sending of various state information,
including the current data generation UUID set, and two connection
state changes from C_WF_CONNECTION to C_WF_REPORT_PARAMS further to
a number of different outcomes, resync being one of them.
If the connection handshake happens between the state change to Primary
and the generation of the new UUIDs, the resync decision based on the
old UUID set may be confused, depending on circumstances.
Make sure that, before we do the handshake, any promotion to Primary
role will either be complete (including the housekeeping stuff), or can
see, and serialize with, the ongoing handshake, based on the
"STATE_SENT" bit, which is set when we start the handshake, and cleared
only when we leave C_WF_REPORT_PARAMS again.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
We need to propagate the configuration into the flag bits,
or it won't be effective.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Smatch complained about it this redundanct check.
The check was introduced in 2006-09-13. On 2007-07-24 the body of the
function was enclosed by get_ldev()/put_ldev() reference counting.
Since then the check is useless and miss leading.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Compiling drbd yields:
drivers/block/drbd/drbd_state.c: In function ‘_conn_request_state’:
drivers/block/drbd/drbd_state.c:1804:5: error: macro "wait_event_lock_irq" passed 4 arguments, but takes just 3
drivers/block/drbd/drbd_state.c:1801:3: error: ‘wait_event_lock_irq’ undeclared (first use in this function)
drivers/block/drbd/drbd_state.c:1801:3: note: each undeclared identifier is reported only once for each function it appears in
drivers/block/drbd/drbd_state.c: At top level:
drivers/block/drbd/drbd_state.c:1734:1: warning: ‘_conn_rq_cond’ defined but not used [-Wunused-function]
Due to drbd having copied the MD definition for wait_event_lock_irq()
as well. Kill them.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Currently there is not limitation of number of requests in the loop bio
list. This can lead into some nasty situations when the caller spawns
tons of bio requests taking huge amount of memory. This is even more
obvious with discard where blkdev_issue_discard() will submit all bios
for the range and wait for them to finish afterwards. On really big loop
devices and slow backing file system this can lead to OOM situation as
reported by Dave Chinner.
With this patch we will wait in loop_make_request() if the number of
bios in the loop bio list would exceed 'nr_congestion_on'.
We'll wake up the process as we process the bios form the list. Some
threshold hysteresis is in place to avoid high frequency oscillation.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Reported-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Free the page allocated for the persistent grant.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Move the code that frees persistent grants from the red-black tree
to a function. This will make it easier for other consumers to move
this to a common place.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Hi Jens,
Another tiny patch.
Removed __packed before the struct smart_attr and added __packed at end of
the structure to fix padding issue.
Signed-off-by: Selvan Mani <smani@micron.com>
Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Calling the request handler directly on a plugged queue defeats
the performance improvements provided by the plugging mechanism.
Use the __blk_run_queue function instead of calling the request
handler directly, so that we don't interfere with the block
layer's ability to plug the queue.
Signed-off-by: Ed Cashin <ecashin@coraid.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The dereference to port should be moved below the NULL test.
dpatch engine is used to auto generate this patch.
(https://github.com/weiyj/dpatch)
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
If we're building a 32-bit kernel and CONFIG_LBADF isn't set,
sector_t is 32-bits wide. The shifts by 32 and 40 are thus
larger than we support.
Cast the sector offset to a u64 to avoid these warnings.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Previous commit use value 3 for erasemode mask.
Changing the mask to correct value to 2
Signed-off-by: Selvan Mani <smani@micron.com>
Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Earlier lba address was assigned directly to lba_low and lba_low_ex,
which would result in a different number (bytes reversed) in
big-endian systems. Now assigning lba address byte-by-byte to fis.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Selvan Mani <smani@micron.com>
Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The mtip driver lifted this code from elsewhere and then added a special
handling check for SEC_ERASE_UNIT. If the caller tries to do a security
erase but passes no output data for the command then outbuf is not
allocated and the driver duly explodes.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Selvan Mani <smani@micron.com>
Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Use check_signature() to find a signature in the mmio address.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Mike Miller <mike.miller@hp.com>
Cc: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
- Remove unnecessary correction of bit and address
- Use BITS_TO_LONGS macro to calculate bitmap size
- Use bitmap_zero()
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Mike Miller <mike.miller@hp.com>
Cc: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Use copy_highpage() to copy from one page to another.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>