IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
commit 7a21272b088894070391a94fdd1c67014020fa1d upstream.
Dan Carpenter reported a double-free of rcur if _defer_finish fails
while we're recovering CUI items. Fix the error recovery to prevent
this.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b24a978c377be5f14e798cb41238e66fe51aab2f upstream.
Use NOFS for allocating btree cursors, since they can be called
under the ilock.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2e1d23370e75d7d89350d41b4ab58c7f6a0e26b2 upstream.
When we create a new attribute, we first create a shortform
attribute, and try to fit the new attribute into it.
If that fails, we copy the (empty) attribute into a leaf attribute,
and do the copy again. Thus there can be a transient state where
we have an empty leaf attribute.
If we encounter this during log replay, the verifier will fail.
So add a test to ignore this part of the leaf attr verification
during log replay.
Thanks as usual to dchinner for spotting the problem.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1bb33a98702d8360947f18a44349df75ba555d5d upstream.
After various discussions on linux-fsdevel, it has been decided that it
is not necessary to cap the length of a dedupe request, and that
correctly-written userspace client programs will be able to absorb the
change. Therefore, remove the length clamping behavior.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ef388e2054feedaeb05399ed654bdb06f385d294 upstream.
The on-disk field di_size is used to set i_size, which is a signed
integer of loff_t. If the high bit of di_size is set, we'll end up with
a negative i_size, which will cause all sorts of problems. Since the
VFS won't let us create a file with such length, we should catch them
here in the verifier too.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0f352f8ee8412bd9d34fb2a6411241da61175c0e upstream.
We shouldn't assert if somehow we end up trying to add an attr fork to
an inode that apparently already has attr extents because this is an
indication of on-disk corruption. Instead, return an error code to
userspace.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 96a3aefb8ffde23180130460b0b2407b328eb727 upstream.
In xfs_dir3_data_read, we can encounter the situation where err == 0 and
*bpp == NULL if the given bno offset happens to be a hole; this leads to
a crash if we try to set the buffer type after the _da_read_buf call.
Holes can happen due to corrupt or malicious entries in the bmbt data,
so be a little more careful when we're handling buffers.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 356a3225222e5bc4df88aef3419fb6424f18ab69 upstream.
When reading into memory all extents of a btree-format inode fork,
complain if the number of extents we find is not the same as the number
of extents reported in the inode core. This is needed to stop an IO
action from accessing the garbage areas of the in-core fork.
[dchinner: removed redundant assert]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bb3be7e7c1c18e1b141d4cadeb98cc89ecf78099 upstream.
When we're reading a btree block, make sure that what we retrieved
matches the owner and level; and has a plausible number of records.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d2a047f31e86941fa896e0e3271536d50aba415e upstream.
There is no such thing as a zero-level AG btree since even a single-node
zero-records btree has one level. Btree cursor constructors read
cur_nlevels straight from disk and then access things like
cur_bufs[cur_nlevels - 1] which is /really/ bad if cur_nlevels is zero!
Therefore, strengthen the verifiers to prevent this possibility.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c44a1f22626c153976289e1cd67bdcdfefc16e1f upstream.
By inspection, xfs_bmap_trace_exlist isn't handling cow forks,
and will trace the data fork instead.
Fix this by setting state appropriately if whichfork
== XFS_COW_FORK.
()___()
< @ @ >
| |
{o_o}
(|)
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7710517fc37b1899722707883b54694ea710b3c0 upstream.
When xfs_bmap_trace_exlist called trace_xfs_extlist,
it sent in the "whichfork" var instead of the bmap "state"
as expected (even though state was already set up for this
purpose).
As a result, the xfs_bmap_class in tracing code used
"whichfork" not state in xfs_iext_state_to_fork(), and got
the wrong ifork pointer. It all goes downhill from
there, including an ASSERT when ifp_bytes is empty
by the time it reaches xfs_iext_get_ext():
XFS: Assertion failed: idx < ifp->if_bytes / sizeof(xfs_bmbt_rec_t)
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 200237d6746faaeaf7f4ff4abbf13f3917cee60a upstream.
We've missed properly setting the buffer type for
an AGI transaction in 3 spots now, so just move it
into xfs_read_agi() and set it if we are in a transaction
to avoid the problem in the future.
This is similar to how it is done in i.e. the dir3
and attr3 read functions.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f782088c9e5d08e9494c63e68b4e85716df3e5f8 upstream.
xfs_file_iomap_begin_delay() implements post-eof speculative
preallocation by extending the block count of the requested delayed
allocation. Now that xfs_bmapi_reserve_delalloc() has been updated to
handle prealloc blocks separately and tag the inode, update
xfs_file_iomap_begin_delay() to use the new parameter and rely on the
former to tag the inode.
Note that this patch does not change behavior.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 656152e552e5cbe0c11ad261b524376217c2fb13 upstream.
And only lookup the previous extent inside xfs_iomap_prealloc_size
if we actually need it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0260d8ff5f76617e3a55a1c471383ecb4404c3ad upstream.
COW fork reservation is implemented via delayed allocation. The code is
modeled after the traditional delalloc allocation code, but is slightly
different in terms of how preallocation occurs. Rather than post-eof
speculative preallocation, COW fork preallocation is implemented via a
COW extent size hint that is designed to minimize fragmentation as a
reflinked file is split over time.
xfs_reflink_reserve_cow() still uses logic that is oriented towards
dealing with post-eof speculative preallocation, however, and is stale
or not necessarily correct. First, the EOF alignment to the COW extent
size hint is implemented in xfs_bmapi_reserve_delalloc() (which does so
correctly by aligning the start and end offsets) and so is not necessary
in xfs_reflink_reserve_cow(). The backoff and retry logic on ENOSPC is
also ineffective for the same reason, as xfs_bmapi_reserve_delalloc()
will simply perform the same allocation request on the retry. Finally,
since the COW extent size hint aligns the start and end offset of the
range to allocate, the end_fsb != orig_end_fsb logic is not sufficient.
Indeed, if a write request happens to end on an aligned offset, it is
possible that we do not tag the inode for COW preallocation even though
xfs_bmapi_reserve_delalloc() may have preallocated at the start offset.
Kill the unnecessary, duplicate code in xfs_reflink_reserve_cow().
Remove the inode tag logic as well since xfs_bmapi_reserve_delalloc()
has been updated to tag the inode correctly.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 974ae922efd93b07b6cdf989ae959883f6f05fd8 upstream.
Speculative preallocation is currently processed entirely by the callers
of xfs_bmapi_reserve_delalloc(). The caller determines how much
preallocation to include, adjusts the extent length and passes down the
resulting request.
While this works fine for post-eof speculative preallocation, it is not
as reliable for COW fork preallocation. COW fork preallocation is
implemented via the cowextszhint, which aligns the start offset as well
as the length of the extent. Further, it is difficult for the caller to
accurately identify when preallocation occurs because the returned
extent could have been merged with neighboring extents in the fork.
To simplify this situation and facilitate further COW fork preallocation
enhancements, update xfs_bmapi_reserve_delalloc() to take a separate
preallocation parameter to incorporate into the allocation request. The
preallocation blocks value is tacked onto the end of the request and
adjusted to accommodate neighboring extents and extent size limits.
Since xfs_bmapi_reserve_delalloc() now knows precisely how much
preallocation was included in the allocation, it can also tag the inodes
appropriately to support preallocation reclaim.
Note that xfs_bmapi_reserve_delalloc() callers are not yet updated to
use the preallocation mechanism. This patch should not change behavior
outside of correctly tagging reflink inodes when start offset
preallocation occurs (which the caller does not handle correctly).
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 65c5f419788d623a0410eca1866134f5e4628594 upstream.
We can easily lookup the previous extent for the cases where we need it,
which saves the callers from looking it up for us later in the series.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fba3e594ef0ad911fa8f559732d588172f212d71 upstream.
It turns out that btrfs and xfs had differing interpretations of what
to do when the dedupe length is zero. Change xfs to follow btrfs'
semantics so that the userland interface is consistent.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fd26a88093bab6529ea2de819114ca92dbd1d71d upstream.
When we're estimating the amount of space it's going to take to satisfy
a delalloc reservation, we need to include the space that we might need
to grow the rmapbt. This helps us to avoid running out of space later
when _iomap_write_allocate needs more space than we reserved. Eryu Guan
observed this happening on generic/224 when sunit/swidth were set.
Reported-by: Eryu Guan <eguan@redhat.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 93533c7855c3c78c8a900cac65c8d669bb14935d upstream.
xfs_iext_lookup_extent looks up a single extent at the passed in offset,
and returns the extent covering the area, or the one behind it in case
of a hole, as well as the index of the returned extent in arguments,
as well as a simple bool as return value that is set to false if no
extent could be found because the offset is behind EOF. It is a simpler
replacement for xfs_bmap_search_extent that leaves looking up the rarely
needed previous extent to the caller and has a nicer calling convention.
xfs_iext_get_extent is a helper for iterating over the extent list,
it takes an extent index as input, and returns the extent at that index
in it's expanded form in an argument if it exists. The actual return
value is a bool whether the index is valid or not.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 98efe8af1c9ffac47e842b7a75ded903e2f028da upstream.
Filesystem shutdown testing on an older distro kernel has uncovered an
imbalanced locking pattern for the inode flush lock in
xfs_reclaim_inode(). Specifically, there is a double unlock sequence
between the call to xfs_iflush_abort() and xfs_reclaim_inode() at the
"reclaim:" label.
This actually does not cause obvious problems on current kernels due to
the current flush lock implementation. Older kernels use a counting
based flush lock mechanism, however, which effectively breaks the lock
indefinitely when an already unlocked flush lock is repeatedly unlocked.
Though this only currently occurs on filesystem shutdown, it has
reproduced the effect of elevating an fs shutdown to a system-wide crash
or hang.
As it turns out, the flush lock is not actually required for the reclaim
logic in xfs_reclaim_inode() because by that time we have already cycled
the flush lock once while holding ILOCK_EXCL. Therefore, remove the
additional flush lock/unlock cycle around the 'reclaim:' label and
update branches into this label to release the flush lock where
appropriate. Add an assert to xfs_ifunlock() to help prevent future
occurences of the same problem.
Reported-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bec9d48d7a303a5bb95c05961ff07ec7eeb59058 upstream.
Check the minimum block size on v5 filesystems.
[dchinner: cleaned up XFS_MIN_CRC_BLOCKSIZE check]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5d829300bee000980a09ac2ccb761cb25867b67c upstream.
The open-coded pattern:
ifp->if_bytes / (uint)sizeof(xfs_bmbt_rec_t)
is all over the xfs code; provide a new helper
xfs_iext_count(ifp) to count the number of inline extents
in an inode fork.
[dchinner: pick up several missed conversions]
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 04197b341f23b908193308b8d63d17ff23232598 upstream.
We've had reports of generic/095 causing XFS to BUG() in
__xfs_get_blocks() due to the existence of delalloc blocks on a
direct I/O read. generic/095 issues a mix of various types of I/O,
including direct and memory mapped I/O to a single file. This is
clearly not supported behavior and is known to lead to such
problems. E.g., the lack of exclusion between the direct I/O and
write fault paths means that a write fault can allocate delalloc
blocks in a region of a file that was previously a hole after the
direct read has attempted to flush/inval the file range, but before
it actually reads the block mapping. In turn, the direct read
discovers a delalloc extent and cannot proceed.
While the appropriate solution here is to not mix direct and memory
mapped I/O to the same regions of the same file, the current
BUG_ON() behavior is probably overkill as it can crash the entire
system. Instead, localize the failure to the I/O in question by
returning an error for a direct I/O that cannot be handled safely
due to delalloc blocks. Be careful to allow the case of a direct
write to post-eof delalloc blocks. This can occur due to speculative
preallocation and is safe as post-eof blocks are not accompanied by
dirty pages in pagecache (conversely, preallocation within eof must
have been zeroed, and thus dirtied, before the inode size could have
been increased beyond said blocks).
Finally, provide an additional warning if a direct I/O write occurs
while the file is memory mapped. This may not catch all problematic
scenarios, but provides a hint that some known-to-be-problematic I/O
methods are in use.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 399372349a7f9b2d7e56e4fa4467c69822d07024 upstream.
The cowblocks background scanner currently clears the cowblocks tag
for inodes without any real allocations in the cow fork. This
excludes inodes with only delalloc blocks in the cow fork. While we
might never expect to clear delalloc blocks from the cow fork in the
background scanner, it is not necessarily correct to clear the
cowblocks tag from such inodes.
For example, if the background scanner happens to process an inode
between a buffered write and writeback, the scanner catches the
inode in a state after delalloc blocks have been allocated to the
cow fork but before the delalloc blocks have been converted to real
blocks by writeback. The background scanner then incorrectly clears
the cowblocks tag, even if part of the aforementioned delalloc
reservation will not be remapped to the data fork (i.e., extra
blocks due to the cowextsize hint). This means that any such
additional blocks in the cow fork might never be reclaimed by the
background scanner and could persist until the inode itself is
reclaimed.
To address this problem, only skip and clear inodes without any cow
fork allocations whatsoever from the background scanner. While we
generally do not want to cancel delalloc reservations from the
background scanner, the pagecache dirty check following the
cowblocks check should prevent that situation. If we do end up with
delalloc cow fork blocks without a dirty address space mapping, this
is probably an indication that something has gone wrong and the
blocks should be reclaimed, as they may never be converted to a real
allocation.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4fd29ec47212c8cbf98916af519019ccc5e58e49 upstream.
Check the return value of xfs_trans_reserve_quota_nblks for errors.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e6fc6fcf4447c9266038c55c25e4c7c14bee110c upstream.
Source xfsprogs commit: ee3754254e8c186c99b6cdd4d59f741759d04acb
Kernel commit 5ef828c4 ("xfs: avoid false quotacheck after unclean
shutdown") made xfs_sb_from_disk() also call xfs_sb_quota_from_disk
by default.
However, when this was merged to libxfs, existing separate
calls to libxfs_sb_quota_from_disk remained, and calling it
twice in a row on a V4 superblock leads to issues, because:
if (sbp->sb_qflags & XFS_PQUOTA_ACCT) {
...
sbp->sb_pquotino = sbp->sb_gquotino;
sbp->sb_gquotino = NULLFSINO;
and after the second call, we have set both pquotino and gquotino
to NULLFSINO.
Fix this by making it safe to call twice, and also remove the extra
calls to libxfs_sb_quota_from_disk.
This is only spotted when running xfstests with "-m crc=0" because
the sb_from_disk change came about after V5 became default, and
the above behavior only exists on a V4 superblock.
Reported-by: Eryu Guan <eguan@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 26a137e31ffe6fbfdb008554a8d9b3d55bd5c86e upstream.
If the TPM we're connecting to uses a static burst count, it will report
a burst count of zero throughout the response read. However, get_burstcount
assumes that a response of zero indicates that the TPM is not ready to
receive more data. In this case, it returns a negative error code, which
is passed on to tpm_tis_{write,read}_bytes as a u16, causing
them to read/write far too many bytes.
This patch checks for negative return codes and bails out from recv_data
and tpm_tis_send_data.
Fixes: 1107d065fdf1 (tpm_tis: Introduce intermediate layer for TPM access)
Signed-off-by: Josh Zimmerman <joshz@google.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2ef32dee97fcf41987722a37eb6ff1a983915e99 upstream.
The previous spec version said "double Ytile planes minimum lines",
and I interpreted this as referring to what the spec calls "Y tile
minimum", but in fact it was referring to what the spec calls "Minimum
Scanlines for Y tile". I noticed that Mahesh Kumar had a different
interpretation, so I sent and email to the spec authors and got
clarification on the correct meaning. Also, BSpec was updated and
should be clear now.
Fixes: ee3d532fcb64 ("drm/i915/gen9: unconditionally apply the memory bandwidth WA")
Cc: stable@vger.kernel.org
Cc: Mahesh Kumar <mahesh1.kumar@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1478636531-6081-1-git-send-email-paulo.r.zanoni@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4349bd775cc8fd75cb648e3a2036a690f497de5c upstream.
We were storing viewport relative coordinates for AVIVO/DCE display
engines. However, radeon_crtc_cursor_set2 and radeon_cursor_reset pass
radeon_crtc->cursor_x/y as the x/y parameters of
radeon_cursor_move_locked, which would break if the CRTC isn't located
at (0, 0).
Cc: stable@vger.kernel.org
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6b7df3ce92ac82ec3f4a2953b6fed77da7b38aaa upstream.
__s390_dma_map_sg maps a dma-contiguous area. Although we only map
whole pages we have to take into account that the area doesn't start
or stop at a page boundary because we use the dma address to loop
over the individual sg entries. Failing to do that might lead to an
access of the wrong sg entry.
Fixes: ee877b81c6b9 ("s390/pci_dma: improve map_sg")
Reported-and-tested-by: Christoph Raisch <raisch@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ebb299a51059017ec253bd30781a83d1f6e11b24 upstream.
The s390 specific sched_domain_topology_level should always be used,
not only if the machine provides topology information. Luckily this
odd behaviour, that was by accident introduced with git commit
d05d15da18f5 ("s390/topology: delay initialization of topology cpu
masks") has currently no side effect.
Fixes: d05d15da18f5 ("s390/topology: delay initialization of topology cpumasks")
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 99e5cde5eae78bef95bfe7c16ccda87fb070149b upstream.
Make sure to drop any device reference taken by vio_find_node() when
adding and removing virtual I/O slots.
Fixes: 5eeb8c63a38f ("[PATCH] PCI Hotplug: rpaphp: Move VIO registration")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1c7de2b4ff886a45fbd2f4c3d4627e0f37a9dd77 upstream.
There is at least one Chelsio 10Gb card which uses VPD area to store some
non-standard blocks (example below). However pci_vpd_size() returns the
length of the first block only assuming that there can be only one VPD "End
Tag".
Since 4e1a635552d3 ("vfio/pci: Use kernel VPD access functions"), VFIO
blocks access beyond that offset, which prevents the guest "cxgb3" driver
from probing the device. The host system does not have this problem as its
driver accesses the config space directly without pci_read_vpd().
Add a quirk to override the VPD size to a bigger value. The maximum size
is taken from EEPROMSIZE in drivers/net/ethernet/chelsio/cxgb3/common.h.
We do not read the tag as the cxgb3 driver does as the driver supports
writing to EEPROM/VPD and when it writes, it only checks for 8192 bytes
boundary. The quirk is registered for all devices supported by the cxgb3
driver.
This adds a quirk to the PCI layer (not to the cxgb3 driver) as the cxgb3
driver itself accesses VPD directly and the problem only exists with the
vfio-pci driver (when cxgb3 is not running on the host and may not be even
loaded) which blocks accesses beyond the first block of VPD data. However
vfio-pci itself does not have quirks mechanism so we add it to PCI.
This is the controller:
Ethernet controller [0200]: Chelsio Communications Inc T310 10GbE Single Port Adapter [1425:0030]
This is what I parsed from its VPD:
===
b'\x82*\x0010 Gigabit Ethernet-SR PCI Express Adapter\x90J\x00EC\x07D76809 FN\x0746K'
0000 Large item 42 bytes; name 0x2 Identifier String
b'10 Gigabit Ethernet-SR PCI Express Adapter'
002d Large item 74 bytes; name 0x10
#00 [EC] len=7: b'D76809 '
#0a [FN] len=7: b'46K7897'
#14 [PN] len=7: b'46K7897'
#1e [MN] len=4: b'1037'
#25 [FC] len=4: b'5769'
#2c [SN] len=12: b'YL102035603V'
#3b [NA] len=12: b'00145E992ED1'
007a Small item 1 bytes; name 0xf End Tag
0c00 Large item 16 bytes; name 0x2 Identifier String
b'S310E-SR-X '
0c13 Large item 234 bytes; name 0x10
#00 [PN] len=16: b'TBD '
#13 [EC] len=16: b'110107730D2 '
#26 [SN] len=16: b'97YL102035603V '
#39 [NA] len=12: b'00145E992ED1'
#48 [V0] len=6: b'175000'
#51 [V1] len=6: b'266666'
#5a [V2] len=6: b'266666'
#63 [V3] len=6: b'2000 '
#6c [V4] len=2: b'1 '
#71 [V5] len=6: b'c2 '
#7a [V6] len=6: b'0 '
#83 [V7] len=2: b'1 '
#88 [V8] len=2: b'0 '
#8d [V9] len=2: b'0 '
#92 [VA] len=2: b'0 '
#97 [RV] len=80: b's\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'...
0d00 Large item 252 bytes; name 0x11
#00 [VC] len=16: b'122310_1222 dp '
#13 [VD] len=16: b'610-0001-00 H1\x00\x00'
#26 [VE] len=16: b'122310_1353 fp '
#39 [VF] len=16: b'610-0001-00 H1\x00\x00'
#4c [RW] len=173: b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'...
0dff Small item 0 bytes; name 0xf End Tag
10f3 Large item 13315 bytes; name 0x62
!!! unknown item name 98: b'\xd0\x03\x00@`\x0c\x08\x00\x00\x00\x00\x00\x00\x00\x00\x00'
===
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1600f62534b7b3da7978b43b52231a54c24df287 upstream.
Mellanox devices were marked as having INTx masking ability broken. As a
result, the VFIO driver fails to start when more than one device function
is passed-through to a VM if both have the same INTx pin.
Prior to Connect-IB, Mellanox devices exposed to the operating system one
PCI function per all ports. Starting from Connect-IB, the devices are
function-per-port. When passing the second function to a VM, VFIO will
fail to start.
Exclude ConnectX-4, ConnectX4-Lx and Connect-IB from the list of Mellanox
devices marked as having broken INTx masking:
- ConnectX-4 and ConnectX4-LX firmware version is checked. If INTx
masking is supported, we unmark the broken INTx masking.
- Connect-IB does not support INTx currently so will not cause any
problem.
[bhelgaas: call pci_disable_device() always, after iounmap()]
Fixes: 11e42532ada3 ("PCI: Assume all Mellanox devices have broken INTx masking")
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b88214ce4d7064992452765028bd50702414f15f upstream.
Convert all quirk_broken_intx_masking() quirks from HEADER to FINAL.
The quirk sets dev->broken_intx_masking, which is only used by
pci_intx_mask_supported(), which is not needed until after FINAL
quirks have been run.
[bhelgaas: changelog]
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7254383341bc6e1a61996accd836009f0c922b21 upstream.
Add Mellanox device IDs for use by the mlx4 driver and INTx quirks.
[bhelgaas: sorted and adapted from
http://lkml.kernel.org/r/1478011644-12080-1-git-send-email-noaos@mellanox.com]
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a45e2611b9bbd81288d97d02ce7e74a60a698d43 upstream.
We're trying to mask out bits[23:8] while retaining [32:24, 7:0], but we're
doing the inverse. That doesn't have too much effect, since we're setting
all the [23:8] bits to 1, and the other bits are only relevant for modes
we're currently not using. But we should get this right.
Fixes: ca1989084054 ("PCI: rockchip: Fix wrong transmitted FTS count")
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 45e9320f3a4ef9588ee50a2eb1891c4bfdbb07df upstream.
The calculation of negotiated lanes is wrong: it should be shifted by
PCIE_CORE_PL_CONF_LANE_SHIFT, but it is shifted by
PCIE_CORE_PL_CONF_LANE_MASK instead. Let's fix it.
Fixes: e77f847df54c ("PCI: rockchip: Add Rockchip PCIe controller support")
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 230436b3ef3fd7d4a1da19edf5e87bb2d74e0fc2 upstream.
gcc is unsure about the use of last_ofs_in_node, which might happen
without a prior initialization:
fs/f2fs//git/arm-soc/fs/f2fs/data.c: In function ‘f2fs_map_blocks’:
fs/f2fs/data.c:799:54: warning: ‘last_ofs_in_node’ may be used uninitialized in this function [-Wmaybe-uninitialized]
if (prealloc && dn.ofs_in_node != last_ofs_in_node + 1) {
As pointed out by Chao Yu, the code is actually correct as 'prealloc'
is only set if the last_ofs_in_node has been set, the two always
get updated together.
This initializes last_ofs_in_node to dn.ofs_in_node for each
new dnode at the start of the 'next_block' loop, which at that
point is a correct initialization as well. I assume that compilers
that correctly track the contents of the variables and do not
warn about the condition also figure out that they can eliminate
the extra assignment here.
Fixes: 46008c6d4232 ("f2fs: support in batch multi blocks preallocation")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 35782b233f37e48ecc469d9c7232f3f6a7fad41a upstream.
This patch removes percpu_count usage due to performance regression in iozone.
Fixes: 523be8a6b3 ("f2fs: use percpu_counter for page counters")
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e2342ca832726a840ca6bd196dd2cc073815b08a upstream.
md_open() gets a counted reference on an mddev using mddev_find().
If it ends up returning an error, it must drop this reference.
There are two error paths where the reference is not dropped.
One only happens if the process is signalled and an awkward time,
which is quite unlikely.
The other was introduced recently in commit af8d8e6f0.
Change the code to ensure the drop the reference when returning an error,
and make it harded to re-introduce this sort of bug in the future.
Reported-by: Marc Smith <marc.smith@mcc.edu>
Fixes: af8d8e6f0315 ("md: changes for MD_STILL_CLOSED flag")
Signed-off-by: NeilBrown <neilb@suse.com>
Acked-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1803b9a52c4e5a5dbb8a27126f6bc06939359753 upstream.
The core AES cipher implementation that uses ARMv8 Crypto Extensions
instructions erroneously loads the round keys as 64-bit quantities,
which causes the algorithm to fail when built for big endian. In
addition, the key schedule generation routine fails to take endianness
into account as well, when loading the combining the input key with
the round constants. So fix both issues.
Fixes: 12ac3efe74f8 ("arm64/crypto: use crypto instructions to generate AES key schedule")
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>