Go to file
Jason Wang 7f466032dc vhost: access vq metadata through kernel virtual address
It was noticed that the copy_to/from_user() friends that was used to
access virtqueue metdata tends to be very expensive for dataplane
implementation like vhost since it involves lots of software checks,
speculation barriers, hardware feature toggling (e.g SMAP). The
extra cost will be more obvious when transferring small packets since
the time spent on metadata accessing become more significant.

This patch tries to eliminate those overheads by accessing them
through direct mapping of those pages. Invalidation callbacks is
implemented for co-operation with general VM management (swap, KSM,
THP or NUMA balancing). We will try to get the direct mapping of vq
metadata before each round of packet processing if it doesn't
exist. If we fail, we will simplely fallback to copy_to/from_user()
friends.

This invalidation and direct mapping access are synchronized through
spinlock and RCU. All matedata accessing through direct map is
protected by RCU, and the setup or invalidation are done under
spinlock.

This method might does not work for high mem page which requires
temporary mapping so we just fallback to normal
copy_to/from_user() and may not for arch that has virtual tagged cache
since extra cache flushing is needed to eliminate the alias. This will
result complex logic and bad performance. For those archs, this patch
simply go for copy_to/from_user() friends. This is done by ruling out
kernel mapping codes through ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE.

Note that this is only done when device IOTLB is not enabled. We
could use similar method to optimize IOTLB in the future.

Tests shows at most about 23% improvement on TX PPS when using
virtio-user + vhost_net + xdp1 + TAP on 2.6GHz Broadwell:

        SMAP on | SMAP off
Before: 5.2Mpps | 7.1Mpps
After:  6.4Mpps | 8.2Mpps

Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: James Bottomley <James.Bottomley@hansenpartnership.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David Miller <davem@davemloft.net>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: linux-mm@kvack.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-parisc@vger.kernel.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-06-05 21:09:18 -04:00
arch The usual smattering of fixes and tunings that came in too late for the 2019-05-26 13:45:15 -07:00
block blk-mq: fix hang caused by freeze/unfreeze sequence 2019-05-23 10:25:26 -06:00
certs treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 36 2019-05-24 17:27:11 +02:00
crypto treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 102 2019-05-24 17:39:00 +02:00
Documentation Devicetree fixes for 5.2: 2019-05-24 15:16:46 -07:00
drivers vhost: access vq metadata through kernel virtual address 2019-06-05 21:09:18 -04:00
fs Bug fixes (including a regression fix) for ext4. 2019-05-25 15:03:12 -07:00
include libnvdimm fixes v5.2-rc2 2019-05-25 10:11:23 -07:00
init treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
ipc treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 52 2019-05-24 17:36:42 +02:00
kernel Make the GCC 9 warning for sub struct memset go away. 2019-05-26 13:49:40 -07:00
lib for-linus-20190524 2019-05-24 16:02:14 -07:00
LICENSES LICENSES: Rename other to deprecated 2019-05-03 06:34:32 -06:00
mm treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 98 2019-05-24 17:37:54 +02:00
net SPDX update for 5.2-rc2, round 2 2019-05-24 14:31:58 -07:00
samples treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 36 2019-05-24 17:27:11 +02:00
scripts Devicetree fixes for 5.2: 2019-05-24 15:16:46 -07:00
security SPDX update for 5.2-rc2, round 2 2019-05-24 14:31:58 -07:00
sound treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 119 2019-05-24 17:39:02 +02:00
tools virtio: add unlikely() to WARN_ON_ONCE() 2019-05-27 11:08:22 -04:00
usr user/Makefile: Fix typo and capitalization in comment section 2018-12-11 00:18:03 +09:00
virt The usual smattering of fixes and tunings that came in too late for the 2019-05-26 13:45:15 -07:00
.clang-format Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-04-17 11:26:25 -07:00
.cocciconfig
.get_maintainer.ignore Opt out of scripts/get_maintainer.pl 2019-05-16 10:53:40 -07:00
.gitattributes
.gitignore .gitignore: exclude .get_maintainer.ignore and .gitattributes 2019-05-18 11:49:54 +09:00
.mailmap A reasonably busy cycle for docs, including: 2019-05-08 12:42:50 -07:00
COPYING COPYING: use the new text with points to the license files 2018-03-23 12:41:45 -06:00
CREDITS Char/Misc driver patches for 5.1-rc1 2019-03-06 14:18:59 -08:00
Kbuild Kbuild updates for v5.1 2019-03-10 17:48:21 -07:00
Kconfig kconfig: move the "Executable file formats" menu to fs/Kconfig.binfmt 2018-08-02 08:06:55 +09:00
MAINTAINERS The usual smattering of fixes and tunings that came in too late for the 2019-05-26 13:45:15 -07:00
Makefile Linux 5.2-rc2 2019-05-26 16:49:19 -07:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.