1232 Commits

Author SHA1 Message Date
Avi Kivity
27c2381068 KVM: Add __KERNEL__ guards to exported headers
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10 10:46:48 +03:00
Gleb Natapov
decde80b6a KVM: s390: remove unused structs
They are not used by common code without defines which s390 does not
have.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-09-10 08:33:15 +03:00
Gleb Natapov
a1b37100d9 KVM: Reduce runnability interface with arch support code
Remove kvm_cpu_has_interrupt() and kvm_arch_interrupt_allowed() from
interface between general code and arch code. kvm_arch_vcpu_runnable()
checks for interrupts instead.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10 08:33:13 +03:00
Marcelo Tosatti
2023a29cbe KVM: remove old KVMTRACE support code
Return EOPNOTSUPP for KVM_TRACE_ENABLE/PAUSE/DISABLE ioctls.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10 08:33:03 +03:00
Joerg Roedel
ec04b2604c KVM: Prepare memslot data structures for multiple hugepage sizes
[avi: fix build on non-x86]

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10 08:33:02 +03:00
Gleb Natapov
988a2cae6a KVM: Use macro to iterate over vcpus.
[christian: remove unused variables on s390]

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10 08:32:52 +03:00
Christian Ehrhardt
628eb9b8a8 KVM: s390: streamline memslot handling
This patch relocates the variables kvm-s390 uses to track guest mem addr/size.
As discussed dropping the variables at struct kvm_arch level allows to use the
common vcpu->request based mechanism to reload guest memory if e.g. changes
via set_memory_region.

The kick mechanism introduced in this series is used to ensure running vcpus
leave guest state to catch the update.

Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10 08:32:42 +03:00
Christian Ehrhardt
b1d16c495d KVM: s390: fix signal handling
If signal pending is true we exit without updating kvm_run, userspace
currently just does nothing and jumps to kvm_run again.
Since we did not set an exit_reason we might end up with a random one
(whatever was the last exit). Therefore it was possible to e.g. jump to
the psw position the last real interruption set.
Setting the INTR exit reason ensures that no old psw data is swapped
in on reentry.

Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10 08:32:42 +03:00
Christian Ehrhardt
9ace903d17 KVM: s390: infrastructure to kick vcpus out of guest state
To ensure vcpu's come out of guest context in certain cases this patch adds a
s390 specific way to kick them out of guest context. Currently it kicks them
out to rerun the vcpu_run path in the s390 code, but the mechanism itself is
expandable and with a new flag we could also add e.g. kicks to userspace etc.

Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10 08:32:42 +03:00
Avi Kivity
0ba12d1081 KVM: Move common KVM Kconfig items to new file virt/kvm/Kconfig
Reduce Kconfig code duplication.

Signed-off-by: Avi Kivity <avi@redhat.com>
2009-09-10 08:32:41 +03:00
Ingo Molnar
ed011b22ce Merge commit 'v2.6.31-rc9' into tracing/core
Merge reason: move from -rc5 to -rc9.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-06 06:11:42 +02:00
Christof Schmitt
b592e89ac9 [SCSI] zfcp: Remove duplicated code for debug timestamps
The timestamp calculation used for s390dbf output is the same in a
private zfcp function and in debug.c. Replace both with a common
inline function.

Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-09-05 08:49:48 -05:00
Jan Glauber
81bd5f6c96 crypto: sha-s390 - Fix warnings in import function
That patch should fix the warnings.

Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2009-09-05 16:27:35 +10:00
David Howells
ee18d64c1f KEYS: Add a keyctl to install a process's session keyring on its parent [try #6]
Add a keyctl to install a process's session keyring onto its parent.  This
replaces the parent's session keyring.  Because the COW credential code does
not permit one process to change another process's credentials directly, the
change is deferred until userspace next starts executing again.  Normally this
will be after a wait*() syscall.

To support this, three new security hooks have been provided:
cred_alloc_blank() to allocate unset security creds, cred_transfer() to fill in
the blank security creds and key_session_to_parent() - which asks the LSM if
the process may replace its parent's session keyring.

The replacement may only happen if the process has the same ownership details
as its parent, and the process has LINK permission on the session keyring, and
the session keyring is owned by the process, and the LSM permits it.

Note that this requires alteration to each architecture's notify_resume path.
This has been done for all arches barring blackfin, m68k* and xtensa, all of
which need assembly alteration to support TIF_NOTIFY_RESUME.  This allows the
replacement to be performed at the point the parent process resumes userspace
execution.

This allows the userspace AFS pioctl emulation to fully emulate newpag() and
the VIOCSETTOK and VIOCSETTOK2 pioctls, all of which require the ability to
alter the parent process's PAG membership.  However, since kAFS doesn't use
PAGs per se, but rather dumps the keys into the session keyring, the session
keyring of the parent must be replaced if, for example, VIOCSETTOK is passed
the newpag flag.

This can be tested with the following program:

	#include <stdio.h>
	#include <stdlib.h>
	#include <keyutils.h>

	#define KEYCTL_SESSION_TO_PARENT	18

	#define OSERROR(X, S) do { if ((long)(X) == -1) { perror(S); exit(1); } } while(0)

	int main(int argc, char **argv)
	{
		key_serial_t keyring, key;
		long ret;

		keyring = keyctl_join_session_keyring(argv[1]);
		OSERROR(keyring, "keyctl_join_session_keyring");

		key = add_key("user", "a", "b", 1, keyring);
		OSERROR(key, "add_key");

		ret = keyctl(KEYCTL_SESSION_TO_PARENT);
		OSERROR(ret, "KEYCTL_SESSION_TO_PARENT");

		return 0;
	}

Compiled and linked with -lkeyutils, you should see something like:

	[dhowells@andromeda ~]$ keyctl show
	Session Keyring
	       -3 --alswrv   4043  4043  keyring: _ses
	355907932 --alswrv   4043    -1   \_ keyring: _uid.4043
	[dhowells@andromeda ~]$ /tmp/newpag
	[dhowells@andromeda ~]$ keyctl show
	Session Keyring
	       -3 --alswrv   4043  4043  keyring: _ses
	1055658746 --alswrv   4043  4043   \_ user: a
	[dhowells@andromeda ~]$ /tmp/newpag hello
	[dhowells@andromeda ~]$ keyctl show
	Session Keyring
	       -3 --alswrv   4043  4043  keyring: hello
	340417692 --alswrv   4043  4043   \_ user: a

Where the test program creates a new session keyring, sticks a user key named
'a' into it and then installs it on its parent.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-09-02 21:29:22 +10:00
Heiko Carstens
b62e180cae locking: Inline spinlock code for all locking variants on s390
Speeds up several benchmarks in a measurable way, so inline
all spin-lock variants by default.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Horst Hartmann <horsth@linux.vnet.ibm.com>
Cc: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: David Miller <davem@davemloft.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: <linux-arch@vger.kernel.org>
LKML-Reference: <20090831124419.319518405@de.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-31 18:08:51 +02:00
Thomas Gleixner
f71bb0ac5e Merge branch 'timers/posixtimers' into timers/tracing
Merge reason: timer tracepoint patches depend on both branches

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-29 10:34:29 +02:00
Hendrik Brueckner
7515bf59f8 tracing: Add syscall tracepoints - s390 arch update
This patch includes s390 arch updates to synchronize with latest
core changes in the syscalls tracing area.

- tracing: Map syscall name to number (syscall_name_to_nr())
- tracing: Call arch_init_ftrace_syscalls at boot
- tracing: add support tracepoint ids (set_syscall_{enter,exit}_id())

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Jiaying Zhang <jiayingz@google.com>
Cc: Martin Bligh <mbligh@google.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
LKML-Reference: <20090825123111.GD4639@cetus.boeblingen.de.ibm.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-08-26 21:29:44 +02:00
Josh Stone
1c569f0264 tracing: Create generic syscall TRACE_EVENTs
This converts the syscall_enter/exit tracepoints into TRACE_EVENTs, so
you can have generic ftrace events that capture all system calls with
arguments and return values.  These generic events are also renamed to
sys_enter/exit, so they're more closely aligned to the specific
sys_enter_foo events.

Signed-off-by: Josh Stone <jistone@redhat.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Jiaying Zhang <jiayingz@google.com>
Cc: Martin Bligh <mbligh@google.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
LKML-Reference: <1251150194-1713-5-git-send-email-jistone@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-08-26 00:41:48 +02:00
Josh Stone
9741987586 tracing: Move tracepoint callbacks from declaration to definition
It's not strictly correct for the tracepoint reg/unreg callbacks to
occur when a client is hooking up, because the actual tracepoint may not
be present yet.  This happens to be fine for syscall, since that's in
the core kernel, but it would cause problems for tracepoints defined in
a module that hasn't been loaded yet.  It also means the reg/unreg has
to be EXPORTed for any modules to use the tracepoint (as in SystemTap).

This patch removes DECLARE_TRACE_WITH_CALLBACK, and instead introduces
DEFINE_TRACE_FN which stores the callbacks in struct tracepoint.  The
callbacks are used now when the active state of the tracepoint changes
in set_tracepoint & disable_tracepoint.

This also introduces TRACE_EVENT_FN, so ftrace events can also provide
registration callbacks if needed.

Signed-off-by: Josh Stone <jistone@redhat.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Jiaying Zhang <jiayingz@google.com>
Cc: Martin Bligh <mbligh@google.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
LKML-Reference: <1251150194-1713-4-git-send-email-jistone@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-08-26 00:36:41 +02:00
Josh Stone
6670000119 tracing: Rename FTRACE_SYSCALLS for tracepoints
s/HAVE_FTRACE_SYSCALLS/HAVE_SYSCALL_TRACEPOINTS/g
s/TIF_SYSCALL_FTRACE/TIF_SYSCALL_TRACEPOINT/g

The syscall enter/exit tracing is no longer specific to just ftrace, so
they now have names that reflect their tie to tracepoints instead.

Signed-off-by: Josh Stone <jistone@redhat.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Jiaying Zhang <jiayingz@google.com>
Cc: Martin Bligh <mbligh@google.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
LKML-Reference: <1251150194-1713-2-git-send-email-jistone@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-08-26 00:17:35 +02:00
Hendrik Brueckner
637952ca68 [S390] set preferred console based on conmode
setup_arch() unconditionally sets the preferred console to ttyS.
This breaks the use of 3270 devices as the console. Provide a new
function to set the default preferred console for s390. The preferred
console depends on the conmode parameter that is used to switch
between 3270 and 3215 terminal/console mode.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-08-23 18:10:01 +02:00
Ingo Molnar
5e9ad7df9f [S390] ftrace: update system call tracer support
Commit fb34a08c3 ("tracing: Add trace events for each syscall
entry/exit") changed the lowlevel API to ftrace syscall tracing
but did not update s390 which started making use of it recently.

This broke the s390 build, as reported by Paul Mundt.

Update the callbacks with the syscall number and the syscall
return code values. This allows per syscall tracepoints,
syscall argument enumeration /debug/tracing/events/syscalls/
and perfcounters support and integration on s390 too.

Reported-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <tip-fb34a08c3469b2be9eae626ccb96476b4687b810@git.kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-19 14:16:15 +02:00
Martin Schwidefsky
23970e389e timekeeping: Introduce read_boot_clock
Add the new function read_boot_clock to get the exact time the system
has been started. For architectures without support for exact boot
time a new weak function is added that returns 0.  Use the exact boot
time to initialize wall_to_monotonic, or xtime if the read_boot_clock
returned 0.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Acked-by: John Stultz <johnstul@us.ibm.com>
Cc: Daniel Walker <dwalker@fifo99.com>
LKML-Reference: <20090814134811.296703241@de.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-15 10:55:47 +02:00
Martin Schwidefsky
d4f587c67f timekeeping: Increase granularity of read_persistent_clock()
The persistent clock of some architectures (e.g. s390) have a
better granularity than seconds. To reduce the delta between the
host clock and the guest clock in a virtualized system change the 
read_persistent_clock function to return a struct timespec.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Acked-by: John Stultz <johnstul@us.ibm.com>
Cc: Daniel Walker <dwalker@fifo99.com>
LKML-Reference: <20090814134811.013873340@de.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-15 10:55:46 +02:00
Martin Schwidefsky
155ec60226 timekeeping: Introduce struct timekeeper
Add struct timekeeper to keep the internal values timekeeping.c needs
in regard to the currently selected clock source. This moves the
timekeeping intervals, xtime_nsec and the ntp error value from struct
clocksource to struct timekeeper. The raw_time is removed from the
clocksource as well. It gets treated like xtime as a global variable.
Eventually xtime raw_time should be moved to struct timekeeper.

[ tglx: minor cleanup ]

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Acked-by: John Stultz <johnstul@us.ibm.com>
Cc: Daniel Walker <dwalker@fifo99.com>
LKML-Reference: <20090814134809.613209842@de.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-15 10:55:46 +02:00
Martin Schwidefsky
f1b82746c1 clocksource: Cleanup clocksource selection
If a non high-resolution clocksource is first set as override clock
and then registered it becomes active even if the system is in one-shot
mode. Move the override check from sysfs_override_clocksource to the
clocksource selection. That fixes the bug and simplifies the code. The
check in clocksource_register for double registration of the same
clocksource is removed without replacement.

To find the initial clocksource a new weak function in jiffies.c is
defined that returns the jiffies clocksource. The architecture code
can then override the weak function with a more suitable clocksource,
e.g. the TOD clock on s390.

[ tglx: Folded in a fix from John Stultz ]

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Acked-by: John Stultz <johnstul@us.ibm.com>
Cc: Daniel Walker <dwalker@fifo99.com>
LKML-Reference: <20090814134808.388024160@de.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-15 10:55:46 +02:00
Tejun Heo
384be2b18a Merge branch 'percpu-for-linus' into percpu-for-next
Conflicts:
	arch/sparc/kernel/smp_64.c
	arch/x86/kernel/cpu/perf_counter.c
	arch/x86/kernel/setup_percpu.c
	drivers/cpufreq/cpufreq_ondemand.c
	mm/percpu.c

Conflicts in core and arch percpu codes are mostly from commit
ed78e1e078dd44249f88b1dd8c76dafb39567161 which substituted many
num_possible_cpus() with nr_cpu_ids.  As for-next branch has moved all
the first chunk allocators into mm/percpu.c, the changes are moved
from arch code to mm/percpu.c.

Signed-off-by: Tejun Heo <tj@kernel.org>
2009-08-14 14:45:31 +09:00
David S. Miller
aa11d958d1 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	arch/microblaze/include/asm/socket.h
2009-08-12 17:44:53 -07:00
Linus Torvalds
17d11ba149 Merge branch 'kvm-updates/2.6.31' of git://git.kernel.org/pub/scm/virt/kvm/kvm
* 'kvm-updates/2.6.31' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: Avoid redelivery of edge interrupt before next edge
  KVM: MMU: limit rmap chain length
  KVM: ia64: fix build failures due to ia64/unsigned long mismatches
  KVM: Make KVM_HPAGES_PER_HPAGE unsigned long to avoid build error on powerpc
  KVM: fix ack not being delivered when msi present
  KVM: s390: fix wait_queue handling
  KVM: VMX: Fix locking imbalance on emulation failure
  KVM: VMX: Fix locking order in handle_invalid_guest_state
  KVM: MMU: handle n_free_mmu_pages > n_alloc_mmu_pages in kvm_mmu_change_mmu_pages
  KVM: SVM: force new asid on vcpu migration
  KVM: x86: verify MTRR/PAT validity
  KVM: PIT: fix kpit_elapsed division by zero
  KVM: Fix KVM_GET_MSR_INDEX_LIST
2009-08-09 14:58:21 -07:00
Roel Kluin
53cb780adb [S390] KVM: Read buffer overflow
Check whether index is within bounds before testing the element.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-08-07 10:40:40 +02:00
Hendrik Brueckner
677c1dd706 [S390] kernel: Storing machine flags early in lowcore
Currently, the machine_flags are stored late in the startup
initialization which results in failing machine type checks
(e.g. for MACHINE_IS_VM).
To allow these checks, store the machine flags in the lowcore
when the machine type has been detected.

Moving the machine_flags to the lowcore has been introduced with
git commit 25097bf153391f7be4c591d47061b3dc4990dac2

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-08-07 10:40:39 +02:00
Jan Engelhardt
0d6038ee76 net: implement a SO_DOMAIN getsockoption
This sockopt goes in line with SO_TYPE and SO_PROTOCOL. It makes it
possible for userspace programs to pass around file descriptors — I
am referring to arguments-to-functions, but it may even work for the
fd passing over UNIX sockets — without needing to also pass the
auxiliary information (PF_INET6/IPPROTO_TCP).

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-05 13:02:57 -07:00
Jan Engelhardt
49c794e946 net: implement a SO_PROTOCOL getsockoption
Similar to SO_TYPE returning the socket type, SO_PROTOCOL allows to
retrieve the protocol used with a given socket.

I am not quite sure why we have that-many copies of socket.h, and why
the values are not the same on all arches either, but for where hex
numbers dominate, I use 0x1029 for SO_PROTOCOL as that seems to be
the next free unused number across a bunch of operating systems, or
so Google results make me want to believe. SO_PROTOCOL for others
just uses the next free Linux number, 38.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-08-05 13:02:56 -07:00
Christian Borntraeger
d3bc2f91b4 KVM: s390: fix wait_queue handling
There are two waitqueues in kvm for wait handling:
vcpu->wq for virt/kvm/kvm_main.c and
vpcu->arch.local_int.wq for the s390 specific wait code.

the wait handling in kvm_s390_handle_wait was broken by using different
wait_queues for add_wait queue and remove_wait_queue.

There are two options to fix the problem:
o  move all the s390 specific code to vcpu->wq and remove
   vcpu->arch.local_int.wq
o  move all the s390 specific code to vcpu->arch.local_int.wq

This patch chooses the 2nd variant for two reasons:
o  s390 does not use kvm_vcpu_block but implements its own enabled wait
   handling.
   Having a separate wait_queue make it clear, that our wait mechanism is
   different
o  the patch is much smaller

Report-by:  Julia Lawall <julia@diku.dk>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-08-05 13:59:46 +03:00
Stanislaw Gruszka
a42548a188 cputime: Optimize jiffies_to_cputime(1)
For powerpc with CONFIG_VIRT_CPU_ACCOUNTING
jiffies_to_cputime(1) is not compile time constant and run time
calculations are quite expensive. To optimize we use
precomputed value. For all other architectures is is
preprocessor definition.

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
LKML-Reference: <1248862529-6063-5-git-send-email-sgruszka@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-03 14:48:36 +02:00
Linus Torvalds
760dcc6e18 Merge branch 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6
* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6:
  [S390] zcrypt: fix scheduling of hrtimer ap_poll_timer
  [S390] vdso: clock_gettime of CLOCK_THREAD_CPUTIME_ID with noexec=on
  [S390] vdso: fix per cpu area allocation
  [S390] hibernation: fix register corruption on machine checks
  [S390] hibernation: fix lowcore handling
2009-07-27 12:16:38 -07:00
Benjamin Herrenschmidt
9e1b32caa5 mm: Pass virtual address to [__]p{te,ud,md}_free_tlb()
mm: Pass virtual address to [__]p{te,ud,md}_free_tlb()

Upcoming paches to support the new 64-bit "BookE" powerpc architecture
will need to have the virtual address corresponding to PTE page when
freeing it, due to the way the HW table walker works.

Basically, the TLB can be loaded with "large" pages that cover the whole
virtual space (well, sort-of, half of it actually) represented by a PTE
page, and which contain an "indirect" bit indicating that this TLB entry
RPN points to an array of PTEs from which the TLB can then create direct
entries. Thus, in order to invalidate those when PTE pages are deleted,
we need the virtual address to pass to tlbilx or tlbivax instructions.

The old trick of sticking it somewhere in the PTE page struct page sucks
too much, the address is almost readily available in all call sites and
almost everybody implemets these as macros, so we may as well add the
argument everywhere. I added it to the pmd and pud variants for consistency.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: David Howells <dhowells@redhat.com> [MN10300 & FRV]
Acked-by: Nick Piggin <npiggin@suse.de>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> [s390]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-27 12:10:38 -07:00
Martin Schwidefsky
1277580fe5 [S390] vdso: clock_gettime of CLOCK_THREAD_CPUTIME_ID with noexec=on
The combination of noexec=on and a clock_gettime call with clock id
CLOCK_THREAD_CPUTIME_ID is broken. The vdso code switches to the
access register mode to get access to the per-cpu data structure to
execute the magic ectg instruction. After the ectg instruction the
code always switches back to the primary mode but for noexec=on the
correct mode is the secondary mode. The effect of the bug is that the
user space program looses the access to all mappings without PROT_EXEC,
e.g. the stack. The problem is fixed by restoring the mode that has
been active before the switch to the access register mode.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-07-24 12:41:02 +02:00
Heiko Carstens
3a6ba4600d [S390] vdso: fix per cpu area allocation
vdso per cpu area allocation in smp_prepare_cpus() happens with GFP_KERNEL
but irqs disabled. Triggers this one:

Badness at kernel/lockdep.c:2280
Modules linked in:
CPU: 0 Not tainted 2.6.30 #2
Process swapper (pid: 1, task: 000000003fe88000, ksp: 000000003fe87eb8)
Krnl PSW : 0400c00180000000 0000000000083360 (lockdep_trace_alloc+0xec/0xf8)
[...]
Call Trace:
([<00000000000832b6>] lockdep_trace_alloc+0x42/0xf8)
 [<00000000000b1880>] __alloc_pages_internal+0x3e8/0x5c4
 [<00000000000b1b4a>] __get_free_pages+0x3a/0xb0
 [<0000000000026546>] vdso_alloc_per_cpu+0x6a/0x18c
 [<00000000005eff82>] smp_prepare_cpus+0x322/0x594
 [<00000000005e8232>] kernel_init+0x76/0x398
 [<000000000001bb1e>] kernel_thread_starter+0x6/0xc
 [<000000000001bb18>] kernel_thread_starter+0x0/0xc

Fix this by moving the allocation out of the irqs disabled section.

Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-07-24 12:41:01 +02:00
Heiko Carstens
c63b196afc [S390] hibernation: fix register corruption on machine checks
swsusp_arch_suspend() actually saves all cpu register contents on
hibernation.
Machine checks must be disabled since swsusp_arch_suspend() stores
register contents to their lowcore save areas. That's the same
place where register contents on machine checks would be saved.
To avoid register corruption disable machine checks.
We must also disable machine checks in the new psw mask for
program checks, since swsusp_arch_suspend() may generate program
checks.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-07-24 12:41:00 +02:00
Heiko Carstens
5f954c3426 [S390] hibernation: fix lowcore handling
Our swsusp_arch_suspend() backend implementation disables prefixing
by setting the contents of the prefix register to 0.
However afterwards common code functions are called which might
access percpu data structures.
Since the lowcore contains e.g. the percpu base pointer this isn't
a good idea. So fix this by copying the hibernating cpu's lowcore to
absolute address zero.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-07-24 12:41:00 +02:00
Herbert Xu
9fadfd1adf crypto: sha512-s390 - Add export/import support
This patch adds export/import support to sha512-s390 (which includes
sha384-s390).  The exported type is defined by struct sha512_state,
which is basically the entire descriptor state of sha512_generic.

Since sha512-s390 only supports a 64-bit byte count the import
function will reject anything that exceeds that.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2009-07-22 14:38:13 +08:00
Sachin Sant
2a549c364a crypto: s390 - Fix sha build failure
Use struct s390_sha_ctx instead of sha1/sha256_state struct to fix
s390 crypto build break.

Signed-off-by: Sachin Sant <sachinp@in.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2009-07-16 19:58:42 +08:00
Alexey Dobriyan
405f55712d headers: smp_lock.h redux
* Remove smp_lock.h from files which don't need it (including some headers!)
* Add smp_lock.h to files which do need it
* Make smp_lock.h include conditional in hardirq.h
  It's needed only for one kernel_locked() usage which is under CONFIG_PREEMPT

  This will make hardirq.h inclusion cheaper for every PREEMPT=n config
  (which includes allmodconfig/allyesconfig, BTW)

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-12 12:22:34 -07:00
Herbert Xu
f63559bef3 crypto: sha256-s390 - Add export/import support
This patch adds export/import support to sha256-s390.  The exported
type is defined by struct sha256_state, which is basically the entire
descriptor state of sha256_generic.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2009-07-11 18:23:34 +08:00
Herbert Xu
406f104b41 crypto: sha1-s390 - Add export/import support
This patch adds export/import support to sha1-s390.  The exported
type is defined by struct sha1_state, which is basically the entire
descriptor state of sha1_generic.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2009-07-11 18:23:34 +08:00
Linus Torvalds
eee33abe59 Merge branch 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6
* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6:
  [S390] define KTIME_SCALAR for 32-bit s390
  [S390] add generic atomic64 support for 31 bit
  [S390] improve suspend/resume error messages
  [S390] set SCHED_OMIT_FRAME_POINTER for s390
  [S390] add __ucmpdi2() helper function
  [S390] perf_counter build fix
  [S390] shutdown actions: save/return rc from init function
  [S390] dasd: correct debugfeature sense dump
  [S390] udelay: disable lockdep to avoid false positives
  [S390] monreader: fix dev_set_drvdata conversion
  [S390] sclp: fix compile error for !SCLP_CONSOLE
2009-07-10 19:12:51 -07:00
Peter Zijlstra
c99e6efe1b sched: INIT_PREEMPT_COUNT
Pull the initial preempt_count value into a single
definition site.

Maintainers for: alpha, ia64 and m68k, please have a look,
your arch code is funny.

The header magic is a bit odd, but similar to the KERNEL_DS
one, CPP waits with expanding these macros until the
INIT_THREAD_INFO macro itself is expanded, which is in
arch/*/kernel/init_task.c where we've already included
sched.h so we're good.

Cc: tony.luck@intel.com
Cc: rth@twiddle.net
Cc: geert@linux-m68k.org
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-10 14:24:05 -07:00
Tejun Heo
023bf6f1b8 linker script: unify usage of discard definition
Discarded sections in different archs share some commonality but have
considerable differences.  This led to linker script for each arch
implementing its own /DISCARD/ definition, which makes maintaining
tedious and adding new entries error-prone.

This patch makes all linker scripts to move discard definitions to the
end of the linker script and use the common DISCARDS macro.  As ld
uses the first matching section definition, archs can include default
discarded sections by including them earlier in the linker script.

ia64 is notable because it first throws away some ia64 specific
subsections and then include the rest of the sections into the final
image, so those sections must be discarded before the inclusion.

defconfig compile tested for x86, x86-64, powerpc, powerpc64, ia64,
alpha, sparc, sparc64 and s390.  Michal Simek tested microblaze.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Tested-by: Michal Simek <monstr@monstr.eu>
Cc: linux-arch@vger.kernel.org
Cc: Michal Simek <monstr@monstr.eu>
Cc: microblaze-uclinux@itee.uq.edu.au
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Tony Luck <tony.luck@intel.com>
2009-07-09 11:27:40 +09:00
Martin Schwidefsky
07606309ff [S390] define KTIME_SCALAR for 32-bit s390
32-bit s390 has efficient support for 64/32-bit conversions, define
KTIME_SCALAR to enable the use of the plain scalar nanosecond based
representation of ktime.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-07-07 16:37:54 +02:00