linux/fs
Davide Libenzi e1ad7468c7 signal/timer/event: eventfd core
This is a very simple and lightweight file descriptor that can be used as an
event wait/dispatch mechanism by userspace (both wait and dispatch) and by the
kernel (dispatch only).  It can be used instead of pipe(2) in all cases where
a pipe would be used only to signal events.  Its kernel overhead is much lower
than that of a pipe, and it does not consume two fds.  When used in the
kernel, it can offer an fd-bridge to enable, for example, functionalities like
KAIO or syslets/threadlets to signal the completion of certain operations to
an fd.  More generally, an eventfd can be used by the kernel to signal
readiness, in a POSIX poll/select way, of interfaces that would otherwise be
incompatible with it.  The API is:

int eventfd(unsigned int count);

The eventfd API accepts an initial "count" parameter, and returns an eventfd
fd.  It supports poll(2) (POLLIN, POLLOUT, POLLERR), read(2) and write(2).

The POLLIN flag is raised when the internal counter is greater than zero.

The POLLOUT flag is raised when at least a value of "1" can be written to the
internal counter.

The POLLERR flag is raised when an overflow in the counter value is detected.
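
The POLLIN and POLLOUT rules above can be checked from userspace with a
zero-timeout poll(2).  This is a minimal sketch, not part of the patch: it
assumes the glibc wrapper from <sys/eventfd.h>, whose two-argument form
eventfd(initval, flags) postdates the one-argument call described here, and
revents_for_count() is a hypothetical helper name.

```c
#include <poll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/*
 * Create an eventfd whose counter starts at `count`, poll it once without
 * blocking, and return the reported revents (or -1 on failure).
 * Assumes the glibc eventfd(initval, flags) wrapper.
 */
static short revents_for_count(unsigned int count)
{
    struct pollfd pfd = { .events = POLLIN | POLLOUT };
    short revents = -1;

    pfd.fd = eventfd(count, 0);
    if (pfd.fd < 0)
        return -1;

    /* A timeout of 0 makes poll(2) report the current state and return. */
    if (poll(&pfd, 1, 0) >= 0)
        revents = pfd.revents;

    close(pfd.fd);
    return revents;
}
```

With a counter of zero only POLLOUT should be reported; with a non-zero
counter both POLLIN and POLLOUT should be set.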

The write(2) operation can never overflow the counter, since it blocks until
the addition fits (unless O_NONBLOCK is set, in which case -EAGAIN is
returned).

The eventfd_signal() function, however, can overflow it, since it must not
sleep during its operation.
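
The non-blocking write(2) case can be sketched as follows.  The sketch makes
two assumptions that are not taken from this patch: the glibc wrapper
eventfd(initval, flags), and a maximum storable counter value of
UINT64_MAX - 1, which is what current kernels document.
errno_of_write_to_full_counter() is a hypothetical helper name.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/*
 * Fill the counter, then try to add one more with O_NONBLOCK set.
 * Returns the errno of the second write, 0 if that write unexpectedly
 * succeeded, or -1 if the setup failed.
 */
static int errno_of_write_to_full_counter(void)
{
    uint64_t max = UINT64_MAX - 1;  /* assumed largest storable value */
    uint64_t one = 1;
    int result = -1;
    int efd = eventfd(0, 0);

    if (efd < 0)
        return -1;

    /* Set O_NONBLOCK so a write that does not fit fails instead of blocking. */
    if (fcntl(efd, F_SETFL, O_NONBLOCK) == 0 &&
        write(efd, &max, sizeof(max)) == (ssize_t)sizeof(max)) {
        if (write(efd, &one, sizeof(one)) < 0)
            result = errno;     /* capture before close() can change it */
        else
            result = 0;
    }

    close(efd);
    return result;
}
```

On a kernel that behaves as described, the helper should return EAGAIN.
Without O_NONBLOCK the second write would block until a read(2) makes room.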

The read(2) function reads the __u64 counter value, and resets the internal
value to zero.  If the value read is equal to (__u64) -1, an overflow happened
on the internal counter (due to 2^64 eventfd_signal() posts that have never
been retired - unlikely, but possible).

The write(2) call writes an __u64 count value, and adds it to the current
counter.  The eventfd fd supports O_NONBLOCK also.
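
The read(2)/write(2) semantics above can be illustrated with a short
userspace sketch.  It assumes the glibc wrapper eventfd(initval, flags) from
<sys/eventfd.h> rather than the one-argument call in this patch, and the
helper names (post_events, collect_events, posted_total) are hypothetical.

```c
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Add `value` to the counter.  Returns 0 on success, -1 on failure. */
static int post_events(int efd, uint64_t value)
{
    return write(efd, &value, sizeof(value)) == (ssize_t)sizeof(value) ? 0 : -1;
}

/* Fetch the counter and reset it to zero.  Returns 0 on success, -1 on failure. */
static int collect_events(int efd, uint64_t *count)
{
    return read(efd, count, sizeof(*count)) == (ssize_t)sizeof(*count) ? 0 : -1;
}

/*
 * Start the counter at `initial`, post `a` and then `b`, and return what a
 * single read(2) reports.  UINT64_MAX is returned if any step fails.
 */
static uint64_t posted_total(unsigned int initial, uint64_t a, uint64_t b)
{
    uint64_t total = UINT64_MAX;
    int efd = eventfd(initial, 0);

    if (efd < 0)
        return UINT64_MAX;

    if (post_events(efd, a) != 0 || post_events(efd, b) != 0 ||
        collect_events(efd, &total) != 0)
        total = UINT64_MAX;

    close(efd);
    return total;
}
```

Because writes accumulate, posted_total(5, 3, 4) should report 12 from one
read: the reader learns how many events were posted, not how many writes
took place.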

On the kernel side, we have:

struct file *eventfd_fget(int fd);
int eventfd_signal(struct file *file, unsigned int n);

eventfd_fget() should be called to get a struct file * from an eventfd fd
(this is an fget() plus a check that f_op is the eventfd fops pointer).

The kernel can then call eventfd_signal() every time it wants to post an event
to userspace.  The eventfd_signal() function can be called from any context.

A simple eventfd() test and benchmark is available here:

http://www.xmailserver.org/eventfd-bench.c

This is the eventfd-based version of pipetest-4 (pipe(2) based):

http://www.xmailserver.org/pipetest-4.c

Not that performance matters much in the eventfd case, but eventfd-bench
shows almost double the performance of pipetest-4.

[akpm@linux-foundation.org: fix i386 build]
[akpm@linux-foundation.org: add sys_eventfd to sys_ni.c]
Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-11 08:29:36 -07:00
..
9p header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
adfs slab allocators: Remove SLAB_DEBUG_INITIAL flag 2007-05-07 12:12:57 -07:00
affs affs: use zero_user_page 2007-05-09 12:30:55 -07:00
afs AFS: implement statfs 2007-05-11 08:29:32 -07:00
autofs Replace pid_t in autofs with struct pid reference 2007-05-11 08:29:36 -07:00
autofs4 Fix some coding-style errors in autofs 2007-05-11 08:29:36 -07:00
befs slab allocators: Remove SLAB_DEBUG_INITIAL flag 2007-05-07 12:12:57 -07:00
bfs slab allocators: Remove SLAB_DEBUG_INITIAL flag 2007-05-07 12:12:57 -07:00
cifs header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
coda slab allocators: Remove SLAB_DEBUG_INITIAL flag 2007-05-07 12:12:57 -07:00
configfs use simple_read_from_buffer() in fs/ 2007-05-09 12:30:49 -07:00
cramfs mm: make read_cache_page synchronous 2007-05-07 12:12:51 -07:00
debugfs remove "struct subsystem" as it is no longer needed 2007-05-02 18:57:59 -07:00
devpts devpts: add fsnotify create event 2007-05-08 11:14:59 -07:00
dlm Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw 2007-05-07 12:26:27 -07:00
ecryptfs header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
efs slab allocators: Remove SLAB_DEBUG_INITIAL flag 2007-05-07 12:12:57 -07:00
exportfs header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
ext2 ext3: copy i_flags to inode flags on write 2007-05-08 11:15:13 -07:00
ext3 ext3: use zero_user_page 2007-05-09 12:30:55 -07:00
ext4 header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
fat fat: fix VFAT compat ioctls on 64-bit systems 2007-05-08 11:15:14 -07:00
freevxfs freevxfs: possible null pointer dereference fix 2007-05-08 11:14:59 -07:00
fuse add filesystem subtype support 2007-05-08 11:15:01 -07:00
gfs2 header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
hfs is_power_of_2 in fs/hfs 2007-05-08 11:14:59 -07:00
hfsplus is_power_of_2 in fs/hfs 2007-05-08 11:14:59 -07:00
hostfs uml: hostfs style fixes 2007-05-08 11:14:57 -07:00
hpfs slab allocators: Remove SLAB_DEBUG_INITIAL flag 2007-05-07 12:12:57 -07:00
hppfs [PATCH] Mark struct super_operations const 2007-02-12 09:48:47 -08:00
hugetlbfs hugetlbfs: add NULL check in hugetlb_zero_setup() 2007-05-07 12:12:57 -07:00
isofs slab allocators: Remove SLAB_DEBUG_INITIAL flag 2007-05-07 12:12:57 -07:00
jbd fix file specification in comments 2007-05-09 08:58:16 +02:00
jbd2 fix file specification in comments 2007-05-09 08:58:16 +02:00
jffs2 Merge git://git.infradead.org/mtd-2.6 2007-05-09 13:10:11 -07:00
jfs Fix occurrences of "the the " 2007-05-09 08:57:56 +02:00
lockd header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
minix slab allocators: Remove SLAB_DEBUG_INITIAL flag 2007-05-07 12:12:57 -07:00
msdos [PATCH] mark struct inode_operations const 2 2007-02-12 09:48:46 -08:00
ncpfs header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
nfs NFS: Kill the obsolete NFS_PARANOIA 2007-05-09 17:58:01 -04:00
nfs_common [PATCH] nfs_common endianness annotations 2006-10-20 10:26:41 -07:00
nfsd knfsd: avoid Oops if buggy userspace performs confusing filehandle->dentry mapping 2007-05-09 12:30:54 -07:00
nls [PATCH] fs: make nls_cp936.c handle some U00XY characters and U20AC correctly 2006-12-07 08:39:46 -08:00
ntfs header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
ocfs2 ocfs2: kobject/kset foobar 2007-05-10 09:26:52 -07:00
openpromfs slab allocators: Remove SLAB_DEBUG_INITIAL flag 2007-05-07 12:12:57 -07:00
partitions small cleanup in gpt partition handling 2007-05-11 08:29:34 -07:00
proc smaps: only define clear_refs for CONFIG_MMU 2007-05-08 20:41:14 -07:00
qnx4 slab allocators: Remove SLAB_DEBUG_INITIAL flag 2007-05-07 12:12:57 -07:00
ramfs header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
reiserfs Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial 2007-05-09 12:54:17 -07:00
romfs slab allocators: Remove SLAB_DEBUG_INITIAL flag 2007-05-07 12:12:57 -07:00
smbfs smbfs: remove unnecessary allow_signal 2007-05-08 11:15:11 -07:00
sysfs use simple_read_from_buffer() in fs/ 2007-05-09 12:30:49 -07:00
sysv header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
udf udf: possible null pointer dereference while load_partition 2007-05-08 11:15:22 -07:00
ufs header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
vfat [PATCH] mark struct inode_operations const 3 2007-02-12 09:48:46 -08:00
xfs Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial 2007-05-09 12:54:17 -07:00
aio.c unify flush_work/flush_work_keventd and rename it to cancel_work_sync 2007-05-09 12:30:53 -07:00
anon_inodes.c signal/timer/event fds: anonymous inode source 2007-05-11 08:29:36 -07:00
attr.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
bad_inode.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
binfmt_aout.c [PATCH] VFS: change struct file to use struct path 2006-12-08 08:28:41 -08:00
binfmt_elf_fdpic.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
binfmt_elf.c Invalid return value of execve() resulting in oopses 2007-05-08 11:15:15 -07:00
binfmt_em86.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
binfmt_flat.c [PATCH] uclinux: correctly remap bin_fmtflat exe allocated mem regions 2007-02-09 10:45:33 -08:00
binfmt_misc.c use simple_read_from_buffer() in fs/ 2007-05-09 12:30:49 -07:00
binfmt_script.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
binfmt_som.c
bio.c KMEM_CACHE(): simplify slab cache creation 2007-05-07 12:12:55 -07:00
block_dev.c is_power_of_2 in fs/block_dev.c 2007-05-08 11:14:59 -07:00
buffer.c Add suspend-related notifications for CPU hotplug 2007-05-09 12:30:56 -07:00
char_dev.c [PATCH] remove protection of LANANA-reserved majors 2007-04-04 21:12:47 -07:00
compat_ioctl.c Allow compat_ioctl.c to compile without CONFIG_NET 2007-05-10 13:34:05 -07:00
compat.c signal/timer/event: timerfd compat code 2007-05-11 08:29:36 -07:00
dcache.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
dcookies.c [PATCH] slab: remove kmem_cache_t 2006-12-07 08:39:25 -08:00
direct-io.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial 2007-05-09 12:54:17 -07:00
dnotify.c [PATCH] VFS: change struct file to use struct path 2006-12-08 08:28:41 -08:00
dquot.c Introduce a handy list_first_entry macro 2007-05-08 11:15:11 -07:00
drop_caches.c [PATCH] remove invalidate_inode_pages() 2007-02-11 10:51:31 -08:00
eventfd.c signal/timer/event: eventfd core 2007-05-11 08:29:36 -07:00
eventpoll.c Introduce a handy list_first_entry macro 2007-05-08 11:15:11 -07:00
exec.c signal/timer/event: signalfd core 2007-05-11 08:29:36 -07:00
fcntl.c [PATCH] fdtable: Make fdarray and fdsets equal in size 2006-12-10 09:57:22 -08:00
fifo.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
file_table.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
file.c [PATCH] fdtable: Provide free_fdtable() wrapper 2006-12-22 08:55:50 -08:00
filesystems.c add filesystem subtype support 2007-05-08 11:15:01 -07:00
fs-writeback.c Write back inode data pages even when the inode itself is locked 2007-01-26 12:53:20 -08:00
generic_acl.c
inode.c inode numbering: make static counters in new_inode and iunique be 32 bits 2007-05-08 11:15:16 -07:00
inotify_user.c [PATCH] inotify: read return val fix 2007-02-12 09:48:28 -08:00
inotify.c Introduce a handy list_first_entry macro 2007-05-08 11:15:11 -07:00
internal.h cleanup compat ioctl handling 2007-05-08 11:15:09 -07:00
ioctl.c vfs: remove superflous sb == NULL checks 2007-05-08 11:15:02 -07:00
ioprio.c [PATCH] pid: replace do/while_each_task_pid with do/while_each_pid_task 2007-02-12 09:48:32 -08:00
Kconfig Remove obsolete fat_cvf help text 2007-05-09 08:58:15 +02:00
Kconfig.binfmt blackfin architecture 2007-05-07 12:12:58 -07:00
libfs.c fs/libfs.c: >80 columns line break fix 2007-05-09 06:44:57 +02:00
locks.c locks: fix F_GETLK regression (failure to find conflicts) 2007-05-10 20:25:59 -07:00
Makefile signal/timer/event: eventfd core 2007-05-11 08:29:36 -07:00
mbcache.c [PATCH] slab: remove kmem_cache_t 2006-12-07 08:39:25 -08:00
mpage.c consolidate generic_writepages and mpage_writepages 2007-05-11 08:29:35 -07:00
namei.c fs: use path_walk in do_path_lookup 2007-05-09 12:30:50 -07:00
namespace.c check privileges before setting mount propagation 2007-05-08 11:15:12 -07:00
nfsctl.c
no-block.c
open.c Remove suid/sgid bits on [f]truncate() 2007-05-08 20:10:00 -07:00
pipe.c VFS: delay the dentry name generation on sockets and pipes 2007-05-08 11:15:03 -07:00
pnode.c Introduce a handy list_first_entry macro 2007-05-08 11:15:11 -07:00
pnode.h [PATCH] rename struct namespace to struct mnt_namespace 2006-12-08 08:28:51 -08:00
posix_acl.c
quota_v1.c
quota_v2.c
quota.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
read_write.c use use SEEK_MAX to validate user lseek arguments 2007-05-08 11:14:59 -07:00
read_write.h
readdir.c ROUND_UP macro cleanup in fs/(select|compat|readdir).c 2007-05-08 11:15:09 -07:00
select.c Style fix in fs/select.c 2007-05-09 07:10:02 +02:00
seq_file.c [PATCH] VFS: change struct file to use struct path 2006-12-08 08:28:41 -08:00
signalfd.c signal/timer/event: signalfd core 2007-05-11 08:29:36 -07:00
splice.c [PATCH] splice: always call into page_cache_readahead() 2007-05-08 08:46:19 +02:00
stack.c [PATCH] fs/stack.c: Copy i_nlink after all other attributes are copied 2007-02-19 14:21:50 -08:00
stat.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
super.c add filesystem subtype support 2007-05-08 11:15:01 -07:00
sync.c Remove do_sync_file_range() 2007-05-08 11:15:04 -07:00
timerfd.c signal/timer/event: timerfd core 2007-05-11 08:29:36 -07:00
utimes.c utimensat implementation 2007-05-08 11:15:18 -07:00
xattr_acl.c [PATCH] remove many unneeded #includes of sched.h 2007-02-14 08:09:54 -08:00
xattr.c header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00