linux/fs
Christophe Leroy 0ff28d9f46 splice: sendfile() at once fails for big files
Using sendfile with below small program to get MD5 sums of some files,
it appear that big files (over 64kbytes with 4k pages system) get a
wrong MD5 sum while small files get the correct sum.
This program uses sendfile() to send a file to an AF_ALG socket
for hashing.

/* md5sum2.c */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <linux/if_alg.h>

int main(int argc, char **argv)
{
	int sk = socket(AF_ALG, SOCK_SEQPACKET, 0);
	struct stat st;
	struct sockaddr_alg sa = {
		.salg_family = AF_ALG,
		.salg_type = "hash",
		.salg_name = "md5",
	};
	int n;

	bind(sk, (struct sockaddr*)&sa, sizeof(sa));

	for (n = 1; n < argc; n++) {
		int size;
		int offset = 0;
		char buf[4096];
		int fd;
		int sko;
		int i;

		fd = open(argv[n], O_RDONLY);
		sko = accept(sk, NULL, 0);
		fstat(fd, &st);
		size = st.st_size;
		sendfile(sko, fd, &offset, size);
		size = read(sko, buf, sizeof(buf));
		for (i = 0; i < size; i++)
			printf("%2.2x", buf[i]);
		printf("  %s\n", argv[n]);
		close(fd);
		close(sko);
	}
	exit(0);
}

Test below is done using official linux patch files. First result is
with a software based md5sum. Second result is with the program above.

root@vgoip:~# ls -l patch-3.6.*
-rw-r--r--    1 root     root         64011 Aug 24 12:01 patch-3.6.2.gz
-rw-r--r--    1 root     root         94131 Aug 24 12:01 patch-3.6.3.gz

root@vgoip:~# md5sum patch-3.6.*
b3ffb9848196846f31b2ff133d2d6443  patch-3.6.2.gz
c5e8f687878457db77cb7158c38a7e43  patch-3.6.3.gz

root@vgoip:~# ./md5sum2 patch-3.6.*
b3ffb9848196846f31b2ff133d2d6443  patch-3.6.2.gz
5fd77b24e68bb24dcc72d6e57c64790e  patch-3.6.3.gz

After investivation, it appears that sendfile() sends the files by blocks
of 64kbytes (16 times PAGE_SIZE). The problem is that at the end of each
block, the SPLICE_F_MORE flag is missing, therefore the hashing operation
is reset as if it was the end of the file.

This patch adds SPLICE_F_MORE to the flags when more data is pending.

With the patch applied, we get the correct sums:

root@vgoip:~# md5sum patch-3.6.*
b3ffb9848196846f31b2ff133d2d6443  patch-3.6.2.gz
c5e8f687878457db77cb7158c38a7e43  patch-3.6.3.gz

root@vgoip:~# ./md5sum2 patch-3.6.*
b3ffb9848196846f31b2ff133d2d6443  patch-3.6.2.gz
c5e8f687878457db77cb7158c38a7e43  patch-3.6.3.gz

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-05-06 09:27:41 -06:00
..
9p switch generic_write_checks() to iocb and iter 2015-04-11 22:30:21 -04:00
adfs adfs: return correct return values 2015-04-17 09:04:08 -04:00
affs Merge branch 'akpm' (patches from Andrew) 2015-04-17 09:04:38 -04:00
afs make new_sync_{read,write}() static 2015-04-11 22:29:40 -04:00
autofs4 autofs: switch to __vfs_write() 2015-04-11 22:29:37 -04:00
befs befs: replace typedef befs_inode_info by structure 2015-04-17 09:04:03 -04:00
bfs bfs: correct return values 2015-04-17 09:04:09 -04:00
btrfs mirror O_APPEND and O_DIRECT into iocb->ki_flags 2015-04-11 22:30:22 -04:00
cachefiles Cachefiles: Fix up scripted S_ISDIR/S_ISREG/S_ISLNK conversions 2015-02-22 11:38:41 -05:00
ceph mirror O_APPEND and O_DIRECT into iocb->ki_flags 2015-04-11 22:30:22 -04:00
cifs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2015-04-16 23:27:56 -04:00
coda make new_sync_{read,write}() static 2015-04-11 22:29:40 -04:00
configfs configfs: Fix inconsistent use of file_inode() vs file->f_path.dentry->d_inode 2015-04-15 15:05:31 -04:00
cramfs
debugfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2015-04-16 23:27:56 -04:00
devpts
dlm netlink: make nlmsg_end() and genlmsg_end() void 2015-01-18 01:03:45 -05:00
ecryptfs make new_sync_{read,write}() static 2015-04-11 22:29:40 -04:00
efivarfs * Move efivarfs from the misc filesystem section to pseudo filesystem, 2015-01-29 19:16:40 +01:00
efs
exofs direct_IO: remove rw from a_ops->direct_IO() 2015-04-11 22:29:45 -04:00
exportfs VFS: (Scripted) Convert S_ISLNK/DIR/REG(dentry->d_inode) to d_is_*(dentry) 2015-02-22 11:38:41 -05:00
ext2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2015-04-16 23:27:56 -04:00
ext3 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2015-04-16 23:27:56 -04:00
ext4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2015-04-16 23:27:56 -04:00
f2fs direct_IO: remove rw from a_ops->direct_IO() 2015-04-11 22:29:45 -04:00
fat Merge branch 'akpm' (patches from Andrew) 2015-04-17 09:04:38 -04:00
freevxfs
fscache
fuse mirror O_APPEND and O_DIRECT into iocb->ki_flags 2015-04-11 22:30:22 -04:00
gfs2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2015-04-16 23:27:56 -04:00
hfs Merge branch 'akpm' (patches from Andrew) 2015-04-17 09:04:38 -04:00
hfsplus Merge branch 'akpm' (patches from Andrew) 2015-04-17 09:04:38 -04:00
hostfs Merge tag 'for-linus-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml 2015-04-15 13:49:27 -07:00
hpfs make new_sync_{read,write}() static 2015-04-11 22:29:40 -04:00
hppfs VFS: (Scripted) Convert S_ISLNK/DIR/REG(dentry->d_inode) to d_is_*(dentry) 2015-02-22 11:38:41 -05:00
hugetlbfs Merge branch 'akpm' (patches from Andrew) 2015-04-15 16:39:15 -07:00
isofs isofs: Fix bug in the way to check if the year is a leap year 2015-01-07 09:51:49 +01:00
jbd
jbd2 jbd2: complain about descriptor block checksum errors 2015-01-19 15:59:58 -05:00
jffs2 Merge branch 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2015-04-15 13:22:56 -07:00
jfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2015-04-16 23:27:56 -04:00
kernfs kernfs: handle poll correctly on 'direct_read' files. 2015-03-16 21:51:20 +01:00
lockd Merge branch 'for-3.20' of git://linux-nfs.org/~bfields/linux 2015-02-12 10:39:41 -08:00
logfs make new_sync_{read,write}() static 2015-04-11 22:29:40 -04:00
minix make new_sync_{read,write}() static 2015-04-11 22:29:40 -04:00
ncpfs switch generic_write_checks() to iocb and iter 2015-04-11 22:30:21 -04:00
nfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2015-04-16 23:27:56 -04:00
nfs_common
nfsd Merge branch 'akpm' (patches from Andrew) 2015-04-15 16:39:15 -07:00
nilfs2 Merge branch 'akpm' (patches from Andrew) 2015-04-17 09:04:38 -04:00
nls
notify fanotify: fix event filtering with FAN_ONDIR set 2015-03-12 18:46:08 -07:00
ntfs switch generic_write_checks() to iocb and iter 2015-04-11 22:30:21 -04:00
ocfs2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2015-04-16 23:27:56 -04:00
omfs make new_sync_{read,write}() static 2015-04-11 22:29:40 -04:00
openpromfs
overlayfs ovl: upper fs should not be R/O 2015-03-18 10:29:48 +01:00
proc proc: show locks in /proc/pid/fdinfo/X 2015-04-17 09:04:12 -04:00
pstore powerpc updates for 4.1 2015-04-16 13:53:32 -05:00
qnx4
qnx6
quota vfs: Add general support to enforce project quota limits 2015-03-18 21:55:08 +01:00
ramfs make new_sync_{read,write}() static 2015-04-11 22:29:40 -04:00
reiserfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2015-04-16 23:27:56 -04:00
romfs make new_sync_{read,write}() static 2015-04-11 22:29:40 -04:00
squashfs
sysfs sysfs: Only accept read/write permissions for file attributes 2015-03-25 13:27:57 +01:00
sysv make new_sync_{read,write}() static 2015-04-11 22:29:40 -04:00
tracefs tracing: Have mkdir and rmdir be part of tracefs 2015-02-03 12:48:43 -05:00
ubifs This pull request includes the following UBI/UBIFS changes: 2015-04-15 13:43:40 -07:00
udf Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2015-04-16 23:27:56 -04:00
ufs make new_sync_{read,write}() static 2015-04-11 22:29:40 -04:00
xfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2015-04-16 23:27:56 -04:00
aio.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2015-04-16 23:27:56 -04:00
anon_inodes.c
attr.c
bad_inode.c don't bother with most of the bad_file_ops methods 2015-02-20 04:03:58 -05:00
binfmt_aout.c
binfmt_elf_fdpic.c
binfmt_elf.c mm: fold arch_randomize_brk into ARCH_HAS_ELF_RANDOMIZE 2015-04-14 16:49:05 -07:00
binfmt_em86.c
binfmt_flat.c
binfmt_misc.c fs/binfmt_misc.c: simplify entry_status() 2015-04-17 09:03:59 -04:00
binfmt_script.c
block_dev.c blkdev_write_iter: expand generic_file_checks() call in there 2015-04-11 22:29:48 -04:00
buffer.c page_writeback: clean up mess around cancel_dirty_page() 2015-04-14 16:49:01 -07:00
char_dev.c fs: introduce f_op->mmap_capabilities for nommu mmap support 2015-01-20 14:02:58 -07:00
compat_binfmt_elf.c
compat_ioctl.c Bluetooth: bnep: Add support for get bnep features via ioctl 2015-04-03 23:21:34 +02:00
compat.c
coredump.c coredump: accept any write method 2015-04-11 22:29:39 -04:00
dax.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2015-04-16 23:27:56 -04:00
dcache.c VFS: Impose ordering on accesses of d_inode and d_flags 2015-04-15 15:05:28 -04:00
dcookies.c
direct-io.c Remove rw from {,__,do_}blockdev_direct_IO() 2015-04-11 22:29:44 -04:00
drop_caches.c vmscan: per memory cgroup slab shrinkers 2015-02-12 18:54:09 -08:00
eventfd.c eventfd: don't take the spinlock in eventfd_poll 2015-02-17 14:34:52 -08:00
eventpoll.c epoll: optimize setting task running after blocking 2015-02-13 21:21:40 -08:00
exec.c fs/exec.c:de_thread: move notify_count write under lock 2015-04-17 09:04:07 -04:00
fcntl.c vfs: renumber FMODE_NONOTIFY and add to uniqueness check 2015-01-08 15:10:52 -08:00
fhandle.c
file_table.c ->aio_read and ->aio_write removed 2015-04-11 22:29:43 -04:00
file.c mm: rcu-protected get_mm_exe_file() 2015-04-17 09:04:07 -04:00
filesystems.c
fs_pin.c switch the IO-triggering parts of umount to fs_pin 2015-01-25 23:17:29 -05:00
fs_struct.c
fs-writeback.c fs: add dirtytime_expire_seconds sysctl 2015-03-17 12:23:32 -04:00
inode.c Merge branch 'lazytime' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2015-02-17 16:12:34 -08:00
internal.h trylock_super(): replacement for grab_super_passive() 2015-02-22 11:38:42 -05:00
ioctl.c fsioctl.c: make generic_block_fiemap() signal-tolerant 2015-02-10 14:30:30 -08:00
Kconfig dax: does not work correctly with virtual aliasing caches 2015-02-16 17:56:04 -08:00
Kconfig.binfmt mm: split ET_DYN ASLR from mmap ASLR 2015-04-14 16:49:05 -07:00
libfs.c VFS: (Scripted) Convert S_ISLNK/DIR/REG(dentry->d_inode) to d_is_*(dentry) 2015-02-22 11:38:41 -05:00
locks.c proc: show locks in /proc/pid/fdinfo/X 2015-04-17 09:04:12 -04:00
Makefile This adds the new tracefs file system. This has been in linux-next for 2015-04-14 10:22:29 -07:00
mbcache.c
mount.h switch the IO-triggering parts of umount to fs_pin 2015-01-25 23:17:29 -05:00
mpage.c
namei.c VFS: Make pathwalk use d_is_reg() rather than S_ISREG() 2015-04-15 15:05:30 -04:00
namespace.c VFS: (Scripted) Convert S_ISLNK/DIR/REG(dentry->d_inode) to d_is_*(dentry) 2015-02-22 11:38:41 -05:00
no-block.c
nsfs.c
open.c ->aio_read and ->aio_write removed 2015-04-11 22:29:43 -04:00
pipe.c make new_sync_{read,write}() static 2015-04-11 22:29:40 -04:00
pnode.c
pnode.h
posix_acl.c VFS: (Scripted) Convert S_ISLNK/DIR/REG(dentry->d_inode) to d_is_*(dentry) 2015-02-22 11:38:41 -05:00
proc_namespace.c vfs: add support for a lazytime mount option 2015-02-05 02:45:00 -05:00
read_write.c new_sync_write(): discard ->ki_pos unless the return value is positive 2015-04-11 22:29:46 -04:00
readdir.c
select.c all arches, signal: move restart_block to struct task_struct 2015-02-12 18:54:12 -08:00
seq_file.c bitmap, cpumask, nodemask: remove dedicated formatting functions 2015-02-13 21:21:39 -08:00
signalfd.c
splice.c splice: sendfile() at once fails for big files 2015-05-06 09:27:41 -06:00
stack.c
stat.c switch security_inode_getattr() to struct path * 2015-04-11 22:24:32 -04:00
statfs.c
super.c cleancache: remove limit on the number of cleancache enabled filesystems 2015-04-14 16:49:03 -07:00
sync.c vfs: add support for a lazytime mount option 2015-02-05 02:45:00 -05:00
timerfd.c
utimes.c
xattr.c