linux

iv/linux

Author	SHA1	Message	Date
Dave Chinner	a444d72e60	Merge branch 'xfs-4.10-misc-fixes-3' into for-next	2016-12-07 17:42:30 +11:00
Dave Chinner	cae028df53	xfs: optimise CRC updates Nick Piggin reported that the CRC overhead in an fsync heavy workload was higher than expected on a Power8 machine. Part of this was to do with the fact that the power8 CRC implementation is not efficient for CRC lengths of less than 512 bytes, and so the way we split the CRCs over the CRC field means a lot of the CRCs are reduced to being less than than optimal size. To optimise this, change the CRC update mechanism to zero the CRC field first, and then compute the CRC in one pass over the buffer and write the result back into the buffer. We can do this safely because anything writing a CRC has exclusive access to the buffer the CRC is being calculated over. We leave the CRC verify code the same - it still splits the CRC calculation - because we do not want read-only operations modifying the underlying buffer. This is because read-only operations may not have an exclusive access to the buffer guaranteed, and so temporary modifications could leak out to to other processes accessing the buffer concurrently. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>	2016-12-05 14:40:32 +11:00
Eric Sandeen	200237d674	xfs: Move AGI buffer type setting to xfs_read_agi We've missed properly setting the buffer type for an AGI transaction in 3 spots now, so just move it into xfs_read_agi() and set it if we are in a transaction to avoid the problem in the future. This is similar to how it is done in i.e. the dir3 and attr3 read functions. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>	2016-12-05 12:31:31 +11:00
Eric Sandeen	6b10b23ca9	xfs: set AGI buffer type in xlog_recover_clear_agi_bucket xlog_recover_clear_agi_bucket didn't set the type to XFS_BLFT_AGI_BUF, so we got a warning during log replay (or an ASSERT on a debug build). XFS (md0): Unknown buffer type 0! XFS (md0): _xfs_buf_ioapply: no ops on block 0xaea8802/0x1 Fix this, as was done in `f19b872b` for 2 other locations with the same problem. cc: <stable@vger.kernel.org> # 3.10 to current Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>	2016-12-05 12:31:06 +11:00
Darrick J. Wong	755c7bf5dd	libxfs: convert ushort to unsigned short Since xfsprogs dropped ushort in favor of unsigned short, do that here too. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>	2016-11-08 11:55:48 +11:00
Darrick J. Wong	17c12bcd30	xfs: when replaying bmap operations, don't let unlinked inodes get reaped Log recovery will iget an inode to replay BUI items and iput the inode when it's done. Unfortunately, if the inode was unlinked, the iput will see that i_nlink == 0 and decide to truncate & free the inode, which prevents us from replaying subsequent BUIs. We can't skip the BUIs because we have to replay all the redo items to ensure that atomic operations complete. Since unlinked inode recovery will reap the inode anyway, we can safely introduce a new inode flag to indicate that an inode is in this 'unlinked recovery' state and should not be auto-reaped in the drop_inode path. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2016-10-04 11:05:44 -07:00
Darrick J. Wong	77d61fe45e	xfs: log bmap intent items Provide a mechanism for higher levels to create BUI/BUD items, submit them to the log, and a stub function to deal with recovered BUI items. These parts will be connected to the rmapbt in a later patch. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2016-10-04 11:05:44 -07:00
Darrick J. Wong	a90c00f055	xfs: add refcount btree block detection to log recovery Identify refcountbt blocks in the log correctly so that we can validate them during log recovery. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2016-10-03 09:11:23 -07:00
Darrick J. Wong	f997ee2137	xfs: log refcount intent items Provide a mechanism for higher levels to create CUI/CUD items, submit them to the log, and a stub function to deal with recovered CUI items. These parts will be connected to the refcountbt in a later patch. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2016-10-03 09:11:21 -07:00
Brian Foster	5cd9cee98b	xfs: log recovery tracepoints to track current lsn and buffer submission Log recovery has particular rules around buffer submission along with tricky corner cases where independent transactions can share an LSN. As such, it can be difficult to follow when/why buffers are submitted during recovery. Add a couple tracepoints to post the current LSN of a record when a new record is being processed and when a buffer is being skipped due to LSN ordering. Also, update the recover item class to include the LSN of the current transaction for the item being processed. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>	2016-09-26 08:34:52 +10:00
Brian Foster	60a4a22251	xfs: update metadata LSN in buffers during log recovery Log recovery is currently broken for v5 superblocks in that it never updates the metadata LSN of buffers written out during recovery. The metadata LSN is recorded in various bits of metadata to provide recovery ordering criteria that prevents transient corruption states reported by buffer write verifiers. Without such ordering logic, buffer updates can be replayed out of order and lead to false positive transient corruption states. This is generally not a corruption vector on its own, but corruption detection shuts down the filesystem and ultimately prevents a mount if it occurs during log recovery. This requires an xfs_repair run that clears the log and potentially loses filesystem updates. This problem is avoided in most cases as metadata writes during normal filesystem operation update the metadata LSN appropriately. The problem with log recovery not updating metadata LSNs manifests if the system happens to crash shortly after log recovery itself. In this scenario, it is possible for log recovery to complete all metadata I/O such that the filesystem is consistent. If a crash occurs after that point but before the log tail is pushed forward by subsequent operations, however, the next mount performs the same log recovery over again. If a buffer is updated multiple times in the dirty range of the log, an earlier update in the log might not be valid based on the current state of the associated buffer after all of the updates in the log had been replayed (before the previous crash). If a verifier happens to detect such a problem, the filesystem claims corruption and immediately shuts down. This commonly manifests in practice as directory block verifier failures such as the following, likely due to directory verifiers being particularly detailed in their checks as compared to most others: ... Mounting V5 Filesystem XFS (dm-0): Starting recovery (logdev: internal) XFS (dm-0): Internal error XFS_WANT_CORRUPTED_RETURN at line ... of \ file fs/xfs/libxfs/xfs_dir2_data.c. Caller xfs_dir3_data_verify ... ... Update log recovery to update the metadata LSN of recovered buffers. Since metadata LSNs are already updated by write verifer functions via attached log items, attach a dummy log item to the buffer during validation and explicitly set the LSN of the current transaction. This ensures that the metadata LSN of a buffer is updated based on whether the recovery I/O actually completes, and if so, that subsequent recovery attempts identify that the buffer is already up to date with respect to the current transaction. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>	2016-09-26 08:34:27 +10:00
Brian Foster	040c52c0aa	xfs: don't warn on buffers not being recovered due to LSN The log recovery buffer validation function is invoked in cases where a buffer update may be skipped due to LSN ordering. If the validation function happens to come across directory conversion situations (e.g., a dir3 block to data conversion), it may warn about seeing a buffer log format of one type and a buffer with a magic number of another. This warning is not valid as the buffer update is ultimately skipped. This is indicated by a current_lsn of NULLCOMMITLSN provided by the caller. As such, update xlog_recover_validate_buf_type() to only warn in such cases when a buffer update is expected. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>	2016-09-26 08:32:50 +10:00
Brian Foster	22db9af248	xfs: pass current lsn to log recovery buffer validation The current LSN must be available to the buffer validation function to provide the ability to update the metadata LSN of the buffer. Pass the current_lsn value down to xlog_recover_validate_buf_type() in preparation. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>	2016-09-26 08:32:07 +10:00
Brian Foster	12818d24db	xfs: rework log recovery to submit buffers on LSN boundaries The fix to log recovery to update the metadata LSN in recovered buffers introduces the requirement that a buffer is submitted only once per current LSN. Log recovery currently submits buffers on transaction boundaries. This is not sufficient as the abstraction between log records and transactions allows for various scenarios where multiple transactions can share the same current LSN. If independent transactions share an LSN and both modify the same buffer, log recovery can incorrectly skip updates and leave the filesystem in an inconsisent state. In preparation for proper metadata LSN updates during log recovery, update log recovery to submit buffers for write on LSN change boundaries rather than transaction boundaries. Explicitly track the current LSN in a new struct xlog field to handle the various corner cases of when the current LSN may or may not change. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>	2016-09-26 08:22:16 +10:00
Darrick J. Wong	722e251770	xfs: remove the extents array from the rmap update done log item Nothing ever uses the extent array in the rmap update done redo item, so remove it before it is fixed in the on-disk log format. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>	2016-08-03 12:28:43 +10:00
Darrick J. Wong	a650e8f98e	xfs: add rmap btree block detection to log recovery Originally-From: Dave Chinner <dchinner@redhat.com> So such blocks can be correctly identified and have their operations structures attached to validate recovery has not resulted in a correct block. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>	2016-08-03 12:17:11 +10:00
Darrick J. Wong	9e88b5d867	xfs: log rmap intent items Provide a mechanism for higher levels to create RUI/RUD items, submit them to the log, and a stub function to deal with recovered RUI items. These parts will be connected to the rmapbt in a later patch. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>	2016-08-03 12:09:48 +10:00
Darrick J. Wong	525488520a	xfs: rmap btree requires more reserved free space Originally-From: Dave Chinner <dchinner@redhat.com> The rmap btree is allocated from the AGFL, which means we have to ensure ENOSPC is reported to userspace before we run out of free space in each AG. The last allocation in an AG can cause a full height rmap btree split, and that means we have to reserve at least this many blocks in each AG to be placed on the AGFL at ENOSPC. Update the various space calculation functions to handle this. Also, because the macros are now executing conditional code and are called quite frequently, convert them to functions that initialise variables in the struct xfs_mount, use the new variables everywhere and document the calculations better. [darrick.wong@oracle.com: don't reserve blocks if !rmap] [dchinner@redhat.com: update m_ag_max_usable after growfs] Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>	2016-08-03 11:38:24 +10:00
Darrick J. Wong	dc42375d5f	xfs: refactor redo intent item processing Refactor the EFI intent item recovery (and cancellation) functions into a general function that scans the AIL and an intent item type specific handler. Move the function that recovers a single EFI item into the extent free item code. We'll want the generalized function when we start wiring up more redo item types. Furthermore, ensure that log recovery only replays the redo items that were in the AIL prior to recovery by checking the item LSN against the largest LSN seen during log scanning. As written this should never happen, but we can be defensive anyway. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>	2016-08-03 11:23:49 +10:00
Darrick J. Wong	3ab78df2a5	xfs: rework xfs_bmap_free callers to use xfs_defer_ops Restructure everything that used xfs_bmap_free to use xfs_defer_ops instead. For now we'll just remove the old symbols and play some cpp magic to make it work; in the next patch we'll actually rename everything. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>	2016-08-03 11:15:38 +10:00
Dave Chinner	2a4ad5894c	Merge branch 'xfs-4.7-misc-fixes' into for-next	2016-05-20 10:33:17 +10:00
Christoph Hellwig	664b60f6ba	xfs: improve kmem_realloc Use krealloc to implement our realloc function. This helps to avoid new allocations if we are still in the slab bucket. At least for the bmap btree root that's actually the common case. This also allows removing the now unused oldsize argument. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>	2016-04-06 09:47:01 +10:00
Christoph Hellwig	253f4911f2	xfs: better xfs_trans_alloc interface Merge xfs_trans_reserve and xfs_trans_alloc into a single function call that returns a transaction with all the required log and block reservations, and which allows passing transaction flags directly to avoid the cumbersome _xfs_trans_alloc interface. While we're at it we also get rid of the transaction type argument that has been superflous since we stopped supporting the non-CIL logging mode. The guts of it will be removed in another patch. [dchinner: fixed transaction leak in error path in xfs_setattr_nonsize] Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>	2016-04-06 09:19:55 +10:00
Dave Chinner	ab9d1e4f7b	Merge branch 'xfs-misc-fixes-4.6-3' into for-next	2016-03-09 08:18:30 +11:00
Dave Chinner	7f0ed5461a	Merge branch 'xfs-buf-macro-cleanup-4.6' into for-next	2016-03-07 09:31:00 +11:00
Dave Chinner	a2bbcb60ff	Merge branch 'xfs-gut-icdinode-4.6' into for-next	2016-03-07 09:30:32 +11:00
Dave Chinner	c53473be45	Merge branch 'xfs-rt-fixes-4.6' into for-next	2016-03-07 09:29:04 +11:00
Dave Chinner	9deed09554	Merge branch 'xfs-torn-log-fixes-4.5' into for-next	2016-03-07 09:28:36 +11:00
Dave Chinner	a798011c8f	xfs: reinitialise per-AG structures if geometry changes during recovery If a crash occurs immediately after a filesystem grow operation, the updated superblock geometry is found only in the log. After we recover the log, the superblock is reread and re-initialised and so has the new geometry in memory. If the new geometry has more AGs than prior to the grow operation, then the new AGs will not have in-memory xfs_perag structurea associated with them. This will result in an oops when the first metadata buffer from a new AG is looked up in the buffer cache, as the block lies within the new geometry but then fails to find a perag structure on lookup. This is easily fixed by simply re-initialising the perag structure after re-reading the superblock at the conclusion of the first pahse of log recovery. This, however, does not fix the case of log recovery requiring access to metadata in the newly grown space. Fortunately for us, because the in-core superblock has not been updated, this will result in detection of access beyond the end of the filesystem and so recovery will fail at that point. If this proves to be a problem, then we can address it separately to the current reported issue. Reported-by: Alex Lyakas <alex@zadarastorage.com> Tested-by: Alex Lyakas <alex@zadarastorage.com> Signed-off-by: Dave Chinner <dchinner@redhat.com>	2016-03-07 08:39:36 +11:00
Brian Foster	7f6aff3a29	xfs: only run torn log write detection on dirty logs XFS uses CRC verification over a sub-range of the head of the log to detect and handle torn writes. This torn log write detection currently runs unconditionally at mount time, regardless of whether the log is dirty or clean. This is problematic in cases where a filesystem might end up being moved across different, incompatible (i.e., opposite byte-endianness) architectures. The problem lies in the fact that log data is not necessarily written in an architecture independent format. For example, certain bits of data are written in native endian format. Further, the size of certain log data structures differs (i.e., struct xlog_rec_header) depending on the word size of the cpu. This leads to false positive crc verification errors and ultimately failed mounts when a cleanly unmounted filesystem is mounted on a system with an incompatible architecture from data that was written near the head of the log. Update the log head/tail discovery code to run torn write detection only when the log is not clean. This means something other than an unmount record resides at the head of the log and log recovery is imminent. It is a requirement to run log recovery on the same type of host that had written the content of the dirty log and therefore CRC failures are legitimate corruptions in that scenario. Reported-by: Jan Beulich <JBeulich@suse.com> Tested-by: Jan Beulich <JBeulich@suse.com> Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>	2016-03-07 08:22:22 +11:00
Brian Foster	717bc0ebca	xfs: refactor in-core log state update to helper Once the record at the head of the log is identified and verified, the in-core log state is updated based on the record. This includes information such as the current head block and cycle, the start block of the last record written to the log, the tail lsn, etc. Once torn write detection is conditional, this logic will need to be reused. Factor the code to update the in-core log data structures into a new helper function. This patch does not change behavior. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>	2016-03-07 08:22:22 +11:00
Brian Foster	65b99a08b3	xfs: refactor unmount record detection into helper Once the mount sequence has identified the head and tail blocks of the physical log, the record at the head of the log is located and examined for an unmount record to determine if the log is clean. This currently occurs after torn write verification of the head region of the log. This must ultimately be separated from torn write verification and may need to be called again if the log head is walked back due to a torn write (to determine whether the new head record is an unmount record). Separate this logic into a new helper function. This patch does not change behavior. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>	2016-03-07 08:22:22 +11:00
Brian Foster	82ff6cc26e	xfs: separate log head record discovery from verification The code that locates the log record at the head of the log is buried in the log head verification function. This is fine when torn write verification occurs unconditionally, but this behavior is problematic for filesystems that might be moved across systems with different architectures. In preparation for separating examination of the log head for unmount records from torn write detection, lift the record location logic out of the log verification function and into the caller. This patch does not change behavior. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>	2016-03-07 08:22:22 +11:00
Dave Chinner	12877da584	xfs: remove XFS_BUF_ZEROFLAGS macro The places where we use this macro already clear unnecessary IO flags (e.g. through xfs_bwrite()) or never have unexpected IO flags set on them in the first place (e.g. iclog buffers). Remove the macro from these locations, and where necessary clear only the specific flags that are conditional in the current buffer context. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>	2016-02-10 15:01:30 +11:00
Dave Chinner	b68c08219a	xfs: remove XBF_WRITE flag wrapper macros They only set/clear/check a flag, no need for obfuscating this with a macro. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>	2016-02-10 15:01:11 +11:00
Dave Chinner	0cac682ff6	xfs: remove XBF_READ flag wrapper macros They only set/clear/check a flag, no need for obfuscating this with a macro. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>	2016-02-10 15:01:11 +11:00
Dave Chinner	1157b32c73	xfs: remove XBF_ASYNC flag wrapper macros They only set/clear/check a flag, no need for obfuscating this with a macro. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>	2016-02-10 15:01:11 +11:00
Dave Chinner	b0388bf108	xfs: remove XBF_DONE flag wrapper macros They only set/clear/check a flag, no need for obfuscating this with a macro. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>	2016-02-10 15:01:11 +11:00
Dave Chinner	c19b3b05ae	xfs: mode di_mode to vfs inode Move the di_mode value from the xfs_icdinode to the VFS inode, reducing the xfs_icdinode byte another 2 bytes and collapsing another 2 byte hole in the structure. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>	2016-02-09 16:54:58 +11:00
Dave Chinner	54d7b5c1d0	xfs: use vfs inode nlink field everywhere The VFS tracks the inode nlink just like the xfs_icdinode. We can remove the variable from the icdinode and use the VFS inode variable everywhere, reducing the size of the xfs_icdinode by a further 4 bytes. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>	2016-02-09 16:54:58 +11:00
Dave Chinner	3987848c7c	xfs: remove timestamps from incore inode The struct xfs_inode has two copies of the current timestamps in it, one in the vfs inode and one in the struct xfs_icdinode. Now that we no longer log the struct xfs_icdinode directly, we don't need to keep the timestamps in this structure. instead we can copy them straight out of the VFS inode when formatting the inode log item or the on-disk inode. This reduces the struct xfs_inode in size by 24 bytes. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>	2016-02-09 16:54:58 +11:00
Dave Chinner	f8d55aa052	xfs: introduce inode log format object We currently carry around and log an entire inode core in the struct xfs_inode. A lot of the information in the inode core is duplicated in the VFS inode, but we cannot remove this duplication of infomration because the inode core is logged directly in xfs_inode_item_format(). Add a new function xfs_inode_item_format_core() that copies the inode core data into a struct xfs_icdinode that is pulled directly from the log vector buffer. This means we no longer directly copy the inode core, but copy the structures one member at a time. This will be slightly less efficient than copying, but will allow us to remove duplicate and unnecessary items from the struct xfs_inode. To enable us to do this, call the new structure a xfs_log_dinode, so that we know it's different to the physical xfs_dinode and the in-core xfs_icdinode. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>	2016-02-09 16:54:58 +11:00
Dave Chinner	bf85e0998a	xfs: RT bitmap and summary buffers need verifiers Buffers without verifiers issue runtime warnings on XFS. We don't have anything we can actually verify in the RT buffers (no CRCs, not magic numbers, etc), but we still need verifiers to avoid the warnings. Add a set of dummy verifier operations for the realtime buffers and apply them in the appropriate places. Signed-off-by: Dave Chinner <dchinner@redhat.com> Tested-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>	2016-02-09 16:41:45 +11:00
Dave Chinner	f67ca6eca8	xfs: RT bitmap and summary buffers are not typed When logging buffers, we attach a type to them that follows the buffer all the way into the log and is used to identify the buffer contents in log recovery. Both the realtime summary buffers and the bitmap buffers do not have types defined or set, so when we try to log them we see assert failure: XFS: Assertion failed: (bip->bli_flags & XFS_BLI_STALE) \|\| (xfs_blft_from_flags(&bip->__bli_format) > XFS_BLFT_UNKNOWN_BUF && xfs_blft_from_flags(&bip->__bli_format) < XFS_BLFT_MAX_BUF), file: fs/xfs/xfs_buf_item.c, line: 294 Fix this by adding buffer log format types for these buffers, and add identification support into log recovery for them. Only build the log recovery support if CONFIG_XFS_RT=y - we can't get into log recovery for real time filesystems if support is not built into the kernel, and this avoids potential build problems. Signed-off-by: Dave Chinner <dchinner@redhat.com> Tested-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>	2016-02-09 16:41:31 +11:00
Darrick J. Wong	8e0bd4925b	xfs: fix endianness error when checking log block crc on big endian platforms Since the checksum function and the field are both __le32, don't perform endian conversion when comparing the two. This fixes mount failures on ppc64. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>	2016-02-08 11:03:58 +11:00
Dave Chinner	dde7f55bd0	Merge branch 'xfs-misc-fixes-for-4.5-2' into for-next	2016-01-12 07:04:30 +11:00
Dave Chinner	7d6a13f023	xfs: handle dquot buffer readahead in log recovery correctly When we do dquot readahead in log recovery, we do not use a verifier as the underlying buffer may not have dquots in it. e.g. the allocation operation hasn't yet been replayed. Hence we do not want to fail recovery because we detect an operation to be replayed has not been run yet. This problem was addressed for inodes in commit `d891400` ("xfs: inode buffers may not be valid during recovery readahead") but the problem was not recognised to exist for dquots and their buffers as the dquot readahead did not have a verifier. The result of not using a verifier is that when the buffer is then next read to replay a dquot modification, the dquot buffer verifier will only be attached to the buffer if readahead is not complete. Hence we can read the buffer, replay the dquot changes and then add it to the delwri submission list without it having a verifier attached to it. This then generates warnings in xfs_buf_ioapply(), which catches and warns about this case. Fix this and make it handle the same readahead verifier error cases as for inode buffers by adding a new readahead verifier that has a write operation as well as a read operation that marks the buffer as not done if any corruption is detected. Also make sure we don't run readahead if the dquot buffer has been marked as cancelled by recovery. This will result in readahead either succeeding and the buffer having a valid write verifier, or readahead failing and the buffer state requiring the subsequent read to resubmit the IO with the new verifier. In either case, this will result in the buffer always ending up with a valid write verifier on it. Note: we also need to fix the inode buffer readahead error handling to mark the buffer with EIO. Brian noticed the code I copied from there wrong during review, so fix it at the same time. Add comments linking the two functions that handle readahead verifier errors together so we don't forget this behavioural link in future. cc: <stable@vger.kernel.org> # 3.12 - current Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>	2016-01-12 07:04:01 +11:00
Brian Foster	7088c4136f	xfs: detect and trim torn writes during log recovery Certain types of storage, such as persistent memory, do not provide sector atomicity for writes. This means that if a crash occurs while XFS is writing log records, only part of those records might make it to the storage. This is problematic because log recovery uses the cycle value packed at the top of each log block to locate the head/tail of the log. This can lead to CRC verification failures during log recovery and an unmountable fs for a filesystem that is otherwise consistent. Update log recovery to incorporate log record CRC verification as part of the head/tail discovery process. Once the head is located via the traditional algorithm, run a CRC-only pass over the records up to the head of the log. If CRC verification fails, assume that the records are torn as a matter of policy and trim the head block back to the start of the first bad record. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>	2016-01-05 07:40:16 +11:00
Brian Foster	eed6b462fb	xfs: refactor log record start detection into a new helper As part of the head/tail discovery process, log recovery locates the head block and then reverse seeks to find the start of the last active record in the log. This is non-trivial as the record itself could have wrapped around the end of the physical log. Log recovery torn write detection potentially needs to walk further behind the last record in the log, as multiple log I/Os can be in-flight at one time during a crash event. Therefore, refactor the reverse log record header search mechanism into a new helper that supports the ability to seek past an arbitrary number of log records (or until the tail is hit). Update the head/tail search mechanism to call the new helper, but otherwise there is no change in log recovery behavior. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>	2016-01-04 15:55:10 +11:00
Brian Foster	6528250b71	xfs: support a crc verification only log record pass Log recovery torn write detection uses CRC verification over a range of the active log to identify torn writes. Since the generic log recovery pass code implements a superset of the functionality required for CRC verification, it can be easily modified to support a CRC verification only pass. Create a new CRC pass type and update the log record processing helper to skip everything beyond CRC verification when in this mode. This pass will be invoked in subsequent patches to implement torn write detection. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>	2016-01-04 15:55:10 +11:00

1 2 3 4 5 ...

326 Commits