linux

iv/linux

Author	SHA1	Message	Date
Linus Torvalds	cf833d0b99	random: Use arch_get_random_int instead of cycle counter if avail We still don't use rdrand in /dev/random, which just seems stupid. We accept the cyclecounter* as a random input, but we don't accept rdrand? That's just broken. Sure, people can do things in user space (write to /dev/random, use rdrand in addition to /dev/random themselves etc etc), but that still seems to be a particularly stupid reason for saying "we shouldn't bother to try to do better in /dev/random". And even if somebody really doesn't trust rdrand as a source of random bytes, it seems singularly stupid to trust the cycle counter more. So I'd suggest the attached patch. I'm not going to even bother arguing that we should add more bits to the entropy estimate, because that's not the point - I don't care if /dev/random fills up slowly or not, I think it's just stupid to not use the bits we can get from rdrand and mix them into the strong randomness pool. Link: http://lkml.kernel.org/r/CA%2B55aFwn59N1=m651QAyTy-1gO1noGbK18zwKDwvwqnravA84A@mail.gmail.com Acked-by: "David S. Miller" <davem@davemloft.net> Acked-by: "Theodore Ts'o" <tytso@mit.edu> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Cc: Matt Mackall <mpm@selenic.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-12-29 16:49:45 -08:00
Luck, Tony	bd29e568a4	fix typo/thinko in get_random_bytes() If there is an architecture-specific random number generator we use it to acquire randomness one "long" at a time. We should put these random words into consecutive words in the result buffer - not just overwrite the first word again and again. Signed-off-by: Tony Luck <tony.luck@intel.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-11-17 11:42:54 -02:00
Linus Torvalds	8e6d539e0f	Merge branch 'x86-rdrand-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'x86-rdrand-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, random: Verify RDRAND functionality and allow it to be disabled x86, random: Architectural inlines to get random integers with RDRAND random: Add support for architectural random hooks Fix up trivial conflicts in drivers/char/random.c: the architectural random hooks touched "get_random_int()" that was simplified to use MD5 and not do the keyptr thing any more (see commit `6e5714eaf7`: "net: Compute protocol sequence numbers and fragment IDs using MD5").	2011-10-28 05:29:07 -07:00
David S. Miller	6e5714eaf7	net: Compute protocol sequence numbers and fragment IDs using MD5. Computers have become a lot faster since we compromised on the partial MD4 hash which we use currently for performance reasons. MD5 is a much safer choice, and is inline with both RFC1948 and other ISS generators (OpenBSD, Solaris, etc.) Furthermore, only having 24-bits of the sequence number be truly unpredictable is a very serious limitation. So the periodic regeneration and 8-bit counter have been removed. We compute and use a full 32-bit sequence number. For ipv6, DCCP was found to use a 32-bit truncated initial sequence number (it needs 43-bits) and that is fixed here as well. Reported-by: Dan Kaminsky <dan@doxpara.com> Tested-by: Willy Tarreau <w@1wt.eu> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-08-06 18:33:19 -07:00
H. Peter Anvin	63d7717326	random: Add support for architectural random hooks Add support for architecture-specific hooks into the kernel-directed random number generator interfaces. This patchset does not use the architecture random number generator interfaces for the userspace-directed interfaces (/dev/random and /dev/urandom), thus eliminating the need to distinguish between them based on a pool pointer. Changes in version 3: - Moved the hooks from extract_entropy() to get_random_bytes(). - Changes the hooks to inlines. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Matt Mackall <mpm@selenic.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "Theodore Ts'o" <tytso@mit.edu>	2011-07-31 13:54:50 -07:00
Eric Dumazet	87c48fa3b4	ipv6: make fragment identifications less predictable IPv6 fragment identification generation is way beyond what we use for IPv4 : It uses a single generator. Its not scalable and allows DOS attacks. Now inetpeer is IPv6 aware, we can use it to provide a more secure and scalable frag ident generator (per destination, instead of system wide) This patch : 1) defines a new secure_ipv6_id() helper 2) extends inet_getid() to provide 32bit results 3) extends ipv6_select_ident() with a new dest parameter Reported-by: Fernando Gont <fernando@gont.com.ar> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-07-21 21:25:58 -07:00
Lucas De Marchi	25985edced	Fix common misspellings Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>	2011-03-31 11:26:23 -03:00
Jarod Wilson	442a4fffff	random: update interface comments to reflect reality At present, the comment header in random.c makes no mention of add_disk_randomness, and instead, suggests that disk activity adds to the random pool by way of add_interrupt_randomness, which appears to not have been the case since sometime prior to the existence of git, and even prior to bitkeeper. Didn't look any further back. At least, as far as I can tell, there are no storage drivers setting IRQF_SAMPLE_RANDOM, which is a requirement for add_interrupt_randomness to trigger, so the only way for a disk to contribute entropy is by way of add_disk_randomness. Update comments accordingly, complete with special mention about solid state drives being a crappy source of entropy (see `e2e1a148bc` for reference). Signed-off-by: Jarod Wilson <jarod@redhat.com> Acked-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2011-02-21 22:42:42 +11:00
Christoph Lameter	b29c617af3	random: Use this_cpu_inc_return __this_cpu_inc can create a single instruction to do the same as __get_cpu_var()++. Cc: Richard Kennedy <richard@rsk.demon.co.uk> Cc: Matt Mackall <mpm@selenic.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2010-12-17 15:18:05 +01:00
Arnd Bergmann	6038f373a3	llseek: automatically add .llseek fop All file_operations should get a .llseek operation so we can make nonseekable_open the default for future file operations without a .llseek pointer. The three cases that we can automatically detect are no_llseek, seq_lseek and default_llseek. For cases where we can we can automatically prove that the file offset is always ignored, we use noop_llseek, which maintains the current behavior of not returning an error from a seek. New drivers should normally not use noop_llseek but instead use no_llseek and call nonseekable_open at open time. Existing drivers can be converted to do the same when the maintainer knows for certain that no user code relies on calling seek on the device file. The generated code is often incorrectly indented and right now contains comments that clarify for each added line why a specific variant was chosen. In the version that gets submitted upstream, the comments will be gone and I will manually fix the indentation, because there does not seem to be a way to do that using coccinelle. Some amount of new code is currently sitting in linux-next that should get the same modifications, which I will do at the end of the merge window. Many thanks to Julia Lawall for helping me learn to write a semantic patch that does all this. ===== begin semantic patch ===== // This adds an llseek= method to all file operations, // as a preparation for making no_llseek the default. // // The rules are // - use no_llseek explicitly if we do nonseekable_open // - use seq_lseek for sequential files // - use default_llseek if we know we access f_pos // - use noop_llseek if we know we don't access f_pos, // but we still want to allow users to call lseek // @ open1 exists @ identifier nested_open; @@ nested_open(...) { <+... nonseekable_open(...) ...+> } @ open exists@ identifier open_f; identifier i, f; identifier open1.nested_open; @@ int open_f(struct inode i, struct file f) { <+... ( nonseekable_open(...) \| nested_open(...) ) ...+> } @ read disable optional_qualifier exists @ identifier read_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; expression E; identifier func; @@ ssize_t read_f(struct file f, char p, size_t s, loff_t off) { <+... ( off = E \| off += E \| func(..., off, ...) \| E = off ) ...+> } @ read_no_fpos disable optional_qualifier exists @ identifier read_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; @@ ssize_t read_f(struct file f, char p, size_t s, loff_t off) { ... when != off } @ write @ identifier write_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; expression E; identifier func; @@ ssize_t write_f(struct file f, const char p, size_t s, loff_t off) { <+... ( off = E \| off += E \| func(..., off, ...) \| E = off ) ...+> } @ write_no_fpos @ identifier write_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; @@ ssize_t write_f(struct file f, const char p, size_t s, loff_t off) { ... when != off } @ fops0 @ identifier fops; @@ struct file_operations fops = { ... }; @ has_llseek depends on fops0 @ identifier fops0.fops; identifier llseek_f; @@ struct file_operations fops = { ... .llseek = llseek_f, ... }; @ has_read depends on fops0 @ identifier fops0.fops; identifier read_f; @@ struct file_operations fops = { ... .read = read_f, ... }; @ has_write depends on fops0 @ identifier fops0.fops; identifier write_f; @@ struct file_operations fops = { ... .write = write_f, ... }; @ has_open depends on fops0 @ identifier fops0.fops; identifier open_f; @@ struct file_operations fops = { ... .open = open_f, ... }; // use no_llseek if we call nonseekable_open //////////////////////////////////////////// @ nonseekable1 depends on !has_llseek && has_open @ identifier fops0.fops; identifier nso ~= "nonseekable_open"; @@ struct file_operations fops = { ... .open = nso, ... +.llseek = no_llseek, /* nonseekable / }; @ nonseekable2 depends on !has_llseek @ identifier fops0.fops; identifier open.open_f; @@ struct file_operations fops = { ... .open = open_f, ... +.llseek = no_llseek, / open uses nonseekable / }; // use seq_lseek for sequential files ///////////////////////////////////// @ seq depends on !has_llseek @ identifier fops0.fops; identifier sr ~= "seq_read"; @@ struct file_operations fops = { ... .read = sr, ... +.llseek = seq_lseek, / we have seq_read / }; // use default_llseek if there is a readdir /////////////////////////////////////////// @ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier readdir_e; @@ // any other fop is used that changes pos struct file_operations fops = { ... .readdir = readdir_e, ... +.llseek = default_llseek, / readdir is present / }; // use default_llseek if at least one of read/write touches f_pos ///////////////////////////////////////////////////////////////// @ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read.read_f; @@ // read fops use offset struct file_operations fops = { ... .read = read_f, ... +.llseek = default_llseek, / read accesses f_pos / }; @ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier write.write_f; @@ // write fops use offset struct file_operations fops = { ... .write = write_f, ... + .llseek = default_llseek, / write accesses f_pos / }; // Use noop_llseek if neither read nor write accesses f_pos /////////////////////////////////////////////////////////// @ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read_no_fpos.read_f; identifier write_no_fpos.write_f; @@ // write fops use offset struct file_operations fops = { ... .write = write_f, .read = read_f, ... +.llseek = noop_llseek, / read and write both use no f_pos / }; @ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier write_no_fpos.write_f; @@ struct file_operations fops = { ... .write = write_f, ... +.llseek = noop_llseek, / write uses no f_pos / }; @ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read_no_fpos.read_f; @@ struct file_operations fops = { ... .read = read_f, ... +.llseek = noop_llseek, / read uses no f_pos / }; @ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; @@ struct file_operations fops = { ... +.llseek = noop_llseek, / no read or write fn */ }; ===== End semantic patch ===== Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Julia Lawall <julia@diku.dk> Cc: Christoph Hellwig <hch@infradead.org>	2010-10-15 15:53:27 +02:00
Richard Kennedy	4015d9a865	random: Reorder struct entropy_store to remove padding on 64bits Re-order structure entropy_store to remove 8 bytes of padding on 64 bit builds, so shrinking this structure from 72 to 64 bytes and allowing it to fit into one cache line. Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk> Signed-off-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2010-07-31 19:58:00 +08:00
Matt Mackall	e954bc91bd	random: simplify fips mode Rather than dynamically allocate 10 bytes, move it to static allocation. This saves space and avoids the need for error checking. Signed-off-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2010-05-20 19:55:01 +10:00
Adam Buchbinder	c41b20e721	Fix misspellings of "truly" in comments. Some comments misspell "truly"; this fixes them. No code changes. Signed-off-by: Adam Buchbinder <adam.buchbinder@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2010-02-04 11:55:45 +01:00
Herbert Xu	cd1510cb5f	random: Remove unused inode variable The previous changeset left behind an unused inode variable. This patch removes it. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2010-02-02 06:50:27 +11:00
Matt Mackall	a996996dd7	random: drop weird m_time/a_time manipulation No other driver does anything remotely like this that I know of except for the tty drivers, and I can't see any reason for random/urandom to do it. In fact, it's a (trivial, harmless) timing information leak. And obviously, it generates power- and flash-cycle wasting I/O, especially if combined with something like hwrngd. Also, it breaks ubifs's expectations. Signed-off-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2010-02-02 06:50:23 +11:00
Joe Perches	35900771c0	random.c: use %pU to print UUIDs Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-12-15 08:53:33 -08:00
Eric W. Biederman	6d4561110a	sysctl: Drop & in front of every proc_handler. For consistency drop & in front of every proc_handler. Explicity taking the address is unnecessary and it prevents optimizations like stubbing the proc_handlers to NULL. Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Joe Perches <joe@perches.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2009-11-18 08:37:40 -08:00
Eric W. Biederman	894d249115	sysctl drivers: Remove dead binary sysctl support Now that sys_sysctl is a wrapper around /proc/sys all of the binary sysctl support elsewhere in the tree is dead code. Cc: Jens Axboe <axboe@kernel.dk> Cc: Corey Minyard <minyard@acm.org> Cc: Greg Kroah-Hartman <gregkh@suse.de> Cc: Matt Mackall <mpm@selenic.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Neil Brown <neilb@suse.de> Cc: "James E.J. Bottomley" <James.Bottomley@suse.de> Acked-by: Clemens Ladisch <clemens@ladisch.de> for drivers/char/hpet.c Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2009-11-12 02:04:58 -08:00
Alexey Dobriyan	8d65af789f	sysctl: remove "struct file *" argument of ->proc_handler It's unused. It isn't needed -- read or write flag is already passed and sysctl shouldn't care about the rest. It _was_ used in two places at arch/frv for some reason. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "David S. Miller" <davem@davemloft.net> Cc: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-24 07:21:04 -07:00
Neil Horman	5b739ef8a4	random: Add optional continuous repetition test to entropy store based rngs FIPS-140 requires that all random number generators implement continuous self tests in which each extracted block of data is compared against the last block for repetition. The ansi_cprng implements such a test, but it would be nice if the hw rng's did the same thing. Obviously its not something thats always needed, but it seems like it would be a nice feature to have on occasion. I've written the below patch which allows individual entropy stores to be flagged as desiring a continuous test to be run on them as is extracted. By default this option is off, but is enabled in the event that fips mode is selected during bootup. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-06-18 19:50:21 +08:00
Linus Torvalds	26a9a41823	Avoid ICE in get_random_int() with gcc-3.4.5 Martin Knoblauch reports that trying to build 2.6.30-rc6-git3 with RHEL4.3 userspace (gcc (GCC) 3.4.5 20051201 (Red Hat 3.4.5-2)) causes an internal compiler error (ICE): drivers/char/random.c: In function `get_random_int': drivers/char/random.c:1672: error: unrecognizable insn: (insn 202 148 150 0 /scratch/build/linux-2.6.30-rc6-git3/arch/x86/include/asm/tsc.h:23 (set (reg:SI 0 ax [91]) (subreg:SI (plus:DI (plus:DI (reg:DI 0 ax [88]) (subreg:DI (reg:SI 6 bp) 0)) (const_int -4 [0xfffffffffffffffc])) 0)) -1 (nil) (nil)) drivers/char/random.c:1672: internal compiler error: in extract_insn, at recog.c:2083 and after some debugging it turns out that it's due to the code trying to figure out the rough value of the current stack pointer by taking an address of an uninitialized variable and casting that to an integer. This is clearly a compiler bug, but it's not worth fighting - while the current stack kernel pointer might be somewhat hard to predict in user space, it's also not generally going to change for a lot of the call chains for a particular process. So just drop it, and mumble some incoherent curses at the compiler. Tested-by: Martin Knoblauch <spamtrap@knobisoft.de> Cc: Matt Mackall <mpm@selenic.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-05-19 11:25:35 -07:00
Linus Torvalds	8a0a9bd4db	random: make get_random_int() more random It's a really simple patch that basically just open-codes the current "secure_ip_id()" call, but when open-coding it we now use a _static_ hashing area, so that it gets updated every time. And to make sure somebody can't just start from the same original seed of all-zeroes, and then do the "half_md4_transform()" over and over until they get the same sequence as the kernel has, each iteration also mixes in the same old "current->pid + jiffies" we used - so we should now have a regular strong pseudo-number generator, but we also have one that doesn't have a single seed. Note: the "pid + jiffies" is just meant to be a tiny tiny bit of noise. It has no real meaning. It could be anything. I just picked the previous seed, it's just that now we keep the state in between calls and that will feed into the next result, and that should make all the difference. I made that hash be a per-cpu data just to avoid cache-line ping-pong: having multiple CPU's write to the same data would be fine for randomness, and add yet another layer of chaos to it, but since get_random_int() is supposed to be a fast interface I did it that way instead. I considered using "__raw_get_cpu_var()" to avoid any preemption overhead while still getting the hash be _mostly_ ping-pong free, but in the end good taste won out. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-05-07 11:59:06 -07:00
Anton Blanchard	417b43d4b7	random: align rekey_work's timer Align rekey_work. Even though it's infrequent, we may as well line it up. Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Matt Mackall <mpm@selenic.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-04-02 19:04:49 -07:00
Yinghai Lu	d178a1eb5c	sparseirq: fix build with unknown irq_desc struct Ingo Molnar wrote: > > tip/kernel/fork.c: In function 'copy_signal': > tip/kernel/fork.c:825: warning: unused variable 'ret' > tip/drivers/char/random.c: In function 'get_timer_rand_state': > tip/drivers/char/random.c:584: error: dereferencing pointer to incomplete type > tip/drivers/char/random.c: In function 'set_timer_rand_state': > tip/drivers/char/random.c:594: error: dereferencing pointer to incomplete type > make[3]: *** [drivers/char/random.o] Error 1 irq_desc is defined in linux/irq.h, so include it in the genirq case. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-11 16:06:03 +01:00
Yinghai Lu	d7e51e6689	sparseirq: make some func to be used with genirq Impact: clean up sparseirq fallout on random.c Ingo suggested to change some ifdef from SPARSE_IRQ to GENERIC_HARDIRQS so we could some #ifdef later if all arch support genirq Signed-off-by: Yinghai Lu <yinghai@kernel.org> Acked-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-11 04:46:26 +01:00
Matt Mackall	cda796a3d5	random: don't try to look at entropy_count outside the lock As a non-atomic value, it's only safe to look at entropy_count when the pool lock is held, so we move the BUG_ON inside the lock for correctness. Also remove the spurious comment. It's ok for entropy_count to temporarily exceed POOLBITS so long as it's left in a consistent state when the lock is released. This is a more correct, simple, and idiomatic fix for the bug in `8b76f46a2d`. I've left the reorderings introduced by that patch in place as they're harmless, even though they don't properly deal with potential atomicity issues. Signed-off-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:30 -08:00
Yinghai Lu	2f98357001	sparseirq: move set/get_timer_rand_state back to .c those two functions only used in that C file Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-03 12:01:23 -08:00
Yinghai Lu	0b8f1efad3	sparse irq_desc[] array: core kernel and x86 changes Impact: new feature Problem on distro kernels: irq_desc[NR_IRQS] takes megabytes of RAM with NR_CPUS set to large values. The goal is to be able to scale up to much larger NR_IRQS value without impacting the (important) common case. To solve this, we generalize irq_desc[NR_IRQS] to an (optional) array of irq_desc pointers. When CONFIG_SPARSE_IRQ=y is used, we use kzalloc_node to get irq_desc, this also makes the IRQ descriptors NUMA-local (to the site that calls request_irq()). This gets rid of the irq_cfg[] static array on x86 as well: irq_cfg now uses desc->chip_data for x86 to store irq_cfg. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-08 14:31:51 +01:00
Al Viro	233e70f422	saner FASYNC handling on file close As it is, all instances of ->release() for files that have ->fasync() need to remember to evict file from fasync lists; forgetting that creates a hole and we actually have a bunch that does forget. So let's keep our lives simple - let __fput() check FASYNC in file->f_flags and call ->fasync() there if it's been set. And lose that crap in ->release() instances - leaving it there is still valid, but we don't have to bother anymore. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-11-01 09:49:46 -07:00
Linus Torvalds	9301975ec2	Merge branch 'genirq-v28-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip This merges branches irq/genirq, irq/sparseirq-v4, timers/hpet-percpu and x86/uv. The sparseirq branch is just preliminary groundwork: no sparse IRQs are actually implemented by this tree anymore - just the new APIs are added while keeping the old way intact as well (the new APIs map 1:1 to irq_desc[]). The 'real' sparse IRQ support will then be a relatively small patch ontop of this - with a v2.6.29 merge target. * 'genirq-v28-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (178 commits) genirq: improve include files intr_remapping: fix typo io_apic: make irq_mis_count available on 64-bit too genirq: fix name space collisions of nr_irqs in arch/* genirq: fix name space collision of nr_irqs in autoprobe.c genirq: use iterators for irq_desc loops proc: fixup irq iterator genirq: add reverse iterator for irq_desc x86: move ack_bad_irq() to irq.c x86: unify show_interrupts() and proc helpers x86: cleanup show_interrupts genirq: cleanup the sparseirq modifications genirq: remove artifacts from sparseirq removal genirq: revert dynarray genirq: remove irq_to_desc_alloc genirq: remove sparse irq code genirq: use inline function for irq_to_desc genirq: consolidate nr_irqs and for_each_irq_desc() x86: remove sparse irq from Kconfig genirq: define nr_irqs for architectures with GENERIC_HARDIRQS=n ...	2008-10-20 13:23:01 -07:00
Alexey Dobriyan	f221e726bf	sysctl: simplify ->strategy name and nlen parameters passed to ->strategy hook are unused, remove them. In general ->strategy hook should know what it's doing, and don't do something tricky for which, say, pointer to original userspace array may be needed (name). Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> [ networking bits ] Cc: Ralf Baechle <ralf@linux-mips.org> Cc: David Howells <dhowells@redhat.com> Cc: Matt Mackall <mpm@selenic.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-16 11:21:47 -07:00
Thomas Gleixner	d6c88a507e	genirq: revert dynarray Revert the dynarray changes. They need more thought and polishing. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-10-16 16:53:15 +02:00
Thomas Gleixner	2cc21ef843	genirq: remove sparse irq code This code is not ready, but we need to rip it out instead of rebasing as we would lose the APIC/IO_APIC unification otherwise. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-10-16 16:53:15 +02:00
Yinghai Lu	3060d6fe28	x86: put timer_rand_state pointer into irq_desc irq_timer_state[] is a NR_IRQS sized array that is a side-by array to the real irq_desc[] array. Integrate that field into the (now dynamic) irq_desc dynamic array and save some RAM. v2: keep the old way to support arch not support irq_desc Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-10-16 16:52:31 +02:00
Yinghai Lu	eef1de76da	irqs: make irq_timer_state to use dyn_array Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-10-16 16:52:07 +02:00
Yinghai Lu	1f45f5621d	drivers/char: use nr_irqs convert them to nr_irqs. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-10-16 16:52:05 +02:00
Tejun Heo	f331c0296f	block: don't depend on consecutive minor space * Implement disk_devt() and part_devt() and use them to directly access devt instead of computing it from ->major and ->first_minor. Note that all references to ->major and ->first_minor outside of block layer is used to determine devt of the disk (the part0) and as ->major and ->first_minor will continue to represent devt for the disk, converting these users aren't strictly necessary. However, convert them for consistency. * Implement disk_max_parts() to avoid directly deferencing genhd->minors. * Update bdget_disk() such that it doesn't assume consecutive minor space. * Move devt computation from register_disk() to add_disk() and make it the only one (all other usages use the initially determined value). These changes clean up the code and will help disk->part dereference fix and extended block device numbers. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-10-09 08:56:05 +02:00
Andrew Morton	8b76f46a2d	drivers/char/random.c: fix a race which can lead to a bogus BUG() Fix a bug reported by and diagnosed by Aaron Straus. This is a regression intruduced into 2.6.26 by commit `adc782dae6` Author: Matt Mackall <mpm@selenic.com> Date: Tue Apr 29 01:03:07 2008 -0700 random: simplify and rename credit_entropy_store credit_entropy_bits() does: spin_lock_irqsave(&r->lock, flags); ... if (r->entropy_count > r->poolinfo->POOLBITS) r->entropy_count = r->poolinfo->POOLBITS; so there is a time window in which this BUG_ON(): static size_t account(struct entropy_store r, size_t nbytes, int min, int reserved) { unsigned long flags; BUG_ON(r->entropy_count > r->poolinfo->POOLBITS); / Hold lock while accounting */ spin_lock_irqsave(&r->lock, flags); can trigger. We could fix this by moving the assertion inside the lock, but it seems safer and saner to revert to the old behaviour wherein entropy_store.entropy_count at no time exceeds entropy_store.poolinfo->POOLBITS. Reported-by: Aaron Straus <aaron@merfinllc.com> Cc: Matt Mackall <mpm@selenic.com> Cc: Theodore Ts'o <tytso@mit.edu> Cc: <stable@kernel.org> [2.6.26.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-09-02 19:21:40 -07:00
Stephen Hemminger	9f59365374	nf_nat: use secure_ipv4_port_ephemeral() for NAT port randomization Use incoming network tuple as seed for NAT port randomization. This avoids concerns of leaking net_random() bits, and also gives better port distribution. Don't have NAT server, compile tested only. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> [ added missing EXPORT_SYMBOL_GPL ] Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-08-18 21:32:32 -07:00
Andrea Righi	27ac792ca0	PAGE_ALIGN(): correctly handle 64-bit values on 32-bit architectures On 32-bit architectures PAGE_ALIGN() truncates 64-bit values to the 32-bit boundary. For example: u64 val = PAGE_ALIGN(size); always returns a value < 4GB even if size is greater than 4GB. The problem resides in PAGE_MASK definition (from include/asm-x86/page.h for example): #define PAGE_SHIFT 12 #define PAGE_SIZE (_AC(1,UL) << PAGE_SHIFT) #define PAGE_MASK (~(PAGE_SIZE-1)) ... #define PAGE_ALIGN(addr) (((addr)+PAGE_SIZE-1)&PAGE_MASK) The "~" is performed on a 32-bit value, so everything in "and" with PAGE_MASK greater than 4GB will be truncated to the 32-bit boundary. Using the ALIGN() macro seems to be the right way, because it uses typeof(addr) for the mask. Also move the PAGE_ALIGN() definitions out of include/asm-*/page.h in include/linux/mm.h. See also lkml discussion: http://lkml.org/lkml/2008/6/11/237 [akpm@linux-foundation.org: fix drivers/media/video/uvc/uvc_queue.c] [akpm@linux-foundation.org: fix v850] [akpm@linux-foundation.org: fix powerpc] [akpm@linux-foundation.org: fix arm] [akpm@linux-foundation.org: fix mips] [akpm@linux-foundation.org: fix drivers/media/video/pvrusb2/pvrusb2-dvb.c] [akpm@linux-foundation.org: fix drivers/mtd/maps/uclinux.c] [akpm@linux-foundation.org: fix powerpc] Signed-off-by: Andrea Righi <righi.andrea@gmail.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:21 -07:00
Jeff Dike	9a6f70bbed	random: add async notification support to /dev/random Add async notification support to /dev/random. A little test case is below. Without this patch, you get: $ ./async-random Drained the pool Found more randomness With it, you get: $ ./async-random Drained the pool SIGIO Found more randomness #include <stdio.h> #include <stdlib.h> #include <signal.h> #include <errno.h> #include <fcntl.h> static void handler(int sig) { printf("SIGIO\n"); } int main(int argc, char **argv) { int fd, n, err, flags; if(signal(SIGIO, handler) < 0){ perror("setting SIGIO handler"); exit(1); } fd = open("/dev/random", O_RDONLY); if(fd < 0){ perror("open"); exit(1); } flags = fcntl(fd, F_GETFL); if (flags < 0){ perror("getting flags"); exit(1); } flags \|= O_NONBLOCK; if (fcntl(fd, F_SETFL, flags) < 0){ perror("setting flags"); exit(1); } while((err = read(fd, &n, sizeof(n))) > 0) ; if(err == 0){ printf("random returned 0\n"); exit(1); } else if(errno != EAGAIN){ perror("read"); exit(1); } flags \|= O_ASYNC; if (fcntl(fd, F_SETFL, flags) < 0){ perror("setting flags"); exit(1); } if (fcntl(fd, F_SETOWN, getpid()) < 0) { perror("Setting SIGIO"); exit(1); } printf("Drained the pool\n"); read(fd, &n, sizeof(n)); printf("Found more randomness\n"); return(0); } Signed-off-by: Jeff Dike <jdike@linux.intel.com> Signed-off-by: Matt Mackall <mpm@selenic.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:25 -07:00
Matt Mackall	adc782dae6	random: simplify and rename credit_entropy_store - emphasize bits in the name - make zero bits lock-free - simplify logic Signed-off-by: Matt Mackall <mpm@selenic.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:25 -07:00
Matt Mackall	e68e5b664e	random: make mixing interface byte-oriented Switch add_entropy_words to a byte-oriented interface, eliminating numerous casts and byte/word size rounding issues. This also reduces the overall bit/byte/word confusion in this code. We now mix a byte at a time into the word-based pool. This takes four times as many iterations, but should be negligible compared to hashing overhead. This also increases our pool churn, which adds some depth against some theoretical failure modes. The function name is changed to emphasize pool mixing and deemphasize entropy (the samples mixed in may not contain any). extract is added to the core function to make it clear that it extracts from the pool. Signed-off-by: Matt Mackall <mpm@selenic.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:25 -07:00
Matt Mackall	993ba2114c	random: simplify add_ptr logic The add_ptr variable wasn't used in a sensible way, use only i instead. i got reused later for a different purpose, use j instead. While we're here, put tap0 first in the tap list and add a comment. Signed-off-by: Matt Mackall <mpm@selenic.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:25 -07:00
Matt Mackall	6d38b82740	random: remove some prefetch logic The urandom output pool (ie the fast path) fits in one cacheline, so this is pretty unnecessary. Further, the output path has already fetched the entire pool to hash it before calling in here. (This was the only user of prefetch_range in the kernel, and it passed in words rather than bytes!) Signed-off-by: Matt Mackall <mpm@selenic.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:25 -07:00
Matt Mackall	feee76972b	random: eliminate redundant new_rotate variable - eliminate new_rotate - move input_rotate masking - simplify input_rotate update - move input_rotate update to end of inner loop for readability Signed-off-by: Matt Mackall <mpm@selenic.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:25 -07:00
Matt Mackall	433582093a	random: remove cacheline alignment for locks Earlier changes greatly reduce the number of times we grab the lock per output byte, so we shouldn't need this particular hack any more. Signed-off-by: Matt Mackall <mpm@selenic.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:24 -07:00
Matt Mackall	1c0ad3d492	random: make backtracking attacks harder At each extraction, we change (poolbits / 16) + 32 bits in the pool, or 96 bits in the case of the secondary pools. Thus, a brute-force backtracking attack on the pool state is less difficult than breaking the hash. In certain cases, this difficulty may be is reduced to 2^64 iterations. Instead, hash the entire pool in one go, then feedback the whole hash (160 bits) in one go. This will make backtracking at least as hard as inverting the hash. Signed-off-by: Matt Mackall <mpm@selenic.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:24 -07:00
Matt Mackall	ffd8d3fa58	random: improve variable naming, clear extract buffer - split the SHA variables apart into hash and workspace - rename data to extract - wipe extract and workspace after hashing Signed-off-by: Matt Mackall <mpm@selenic.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:24 -07:00
Matt Mackall	53c3f63e82	random: reuse rand_initialize Signed-off-by: Matt Mackall <mpm@selenic.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:24 -07:00

1 2

84 Commits