5374 Commits

Author SHA1 Message Date
Joerg Roedel
0feae533dd x86/amd-iommu: Add passthrough mode initialization functions
When iommu=pt is passed on kernel command line the devices
should run untranslated. This requires the allocation of a
special domain for that purpose. This patch implements the
allocation and initialization path for iommu=pt.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-09-03 16:15:42 +02:00
Joerg Roedel
2650815fb0 x86/amd-iommu: Add core functions for pd allocation/freeing
This patch factors some code of protection domain allocation
into seperate functions. This way the logic can be used to
allocate the passthrough domain later. As a side effect this
patch fixes an unlikely domain id leakage bug.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-09-03 16:15:34 +02:00
Joerg Roedel
ac0101d396 x86/dma: Mark iommu_pass_through as __read_mostly
This variable is read most of the time. This patch marks it
as such. It also documents the meaning the this variable
while at it.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-09-03 16:13:46 +02:00
Joerg Roedel
abdc5eb3d6 x86/amd-iommu: Change iommu_map_page to support multiple page sizes
This patch adds a map_size parameter to the iommu_map_page
function which makes it generic enough to handle multiple
page sizes. This also requires a change to alloc_pte which
is also done in this patch.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-09-03 16:11:17 +02:00
Joerg Roedel
a6b256b413 x86/amd-iommu: Support higher level PTEs in iommu_page_unmap
This patch changes fetch_pte and iommu_page_unmap to support
different page sizes too.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-09-03 16:11:08 +02:00
Joerg Roedel
8f7a017ce0 x86/amd-iommu: Use 2-level page tables for dma_ops domains
The driver now supports a dynamic number of levels for IO
page tables. This allows to reduce the number of levels for
dma_ops domains by one because a dma_ops domain has usually
an address space size between 128MB and 4G.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-09-03 16:03:49 +02:00
Joerg Roedel
bad1cac28a x86/amd-iommu: Remove bus_addr check in iommu_map_page
The driver now supports full 64 bit device address spaces.
So this check is not longer required.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-09-03 16:03:48 +02:00
Joerg Roedel
8c8c143cdc x86/amd-iommu: Remove last usages of IOMMU_PTE_L0_INDEX
This change allows to remove these old macros later.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-09-03 16:03:47 +02:00
Joerg Roedel
8bc3e12742 x86/amd-iommu: Change alloc_pte to support 64 bit address space
This patch changes the alloc_pte function to be able to map
pages into the whole 64 bit address space supported by AMD
IOMMU hardware from the old limit of 2**39 bytes.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-09-03 16:03:46 +02:00
Joerg Roedel
50020fb632 x86/amd-iommu: Introduce increase_address_space function
This function will be used to increase the address space
size of a protection domain.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-09-03 16:03:46 +02:00
Joerg Roedel
04bfdd8406 x86/amd-iommu: Flush domains if address space size was increased
Thist patch introduces the update_domain function which
propagates the larger address space of a protection domain
to the device table and flushes all relevant DTEs and the
domain TLB.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-09-03 16:03:45 +02:00
Joerg Roedel
407d733e30 x86/amd-iommu: Introduce set_dte_entry function
This function factors out some logic of attach_device to a
seperate function. This new function will be used to update
device table entries when necessary.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-09-03 16:03:44 +02:00
Joerg Roedel
6a0dbcbe4e x86/amd-iommu: Add a gneric version of amd_iommu_flush_all_devices
This patch adds a generic variant of
amd_iommu_flush_all_devices function which flushes only the
DTEs for a given protection domain.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-09-03 16:03:43 +02:00
Joerg Roedel
a6d41a4027 x86/amd-iommu: Use fetch_pte in amd_iommu_iova_to_phys
Don't reimplement the page table walker in this function.
Use the generic one.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-09-03 16:03:42 +02:00
Joerg Roedel
38a76eeeaf x86/amd-iommu: Use fetch_pte in iommu_unmap_page
Instead of reimplementing existing logic use fetch_pte to
walk the page table in iommu_unmap_page.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-09-03 16:03:41 +02:00
Joerg Roedel
9355a08186 x86/amd-iommu: Make fetch_pte aware of dynamic mapping levels
This patch changes the fetch_pte function in the AMD IOMMU
driver to support dynamic mapping levels.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-09-03 16:03:34 +02:00
Joerg Roedel
6a1eddd2f9 x86/amd-iommu: Reset command buffer if wait loop fails
Instead of a panic on an comletion wait loop failure, try to
recover from that event from resetting the command buffer.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-09-03 15:56:19 +02:00
Joerg Roedel
b26e81b871 x86/amd-iommu: Panic if IOMMU command buffer reset fails
To prevent the driver from doing recursive command buffer
resets, just panic when that recursion happens.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-09-03 15:55:35 +02:00
Joerg Roedel
a345b23b79 x86/amd-iommu: Reset command buffer on ILLEGAL_COMMAND_ERROR
On an ILLEGAL_COMMAND_ERROR the IOMMU stops executing
further commands. This patch changes the code to handle this
case better by resetting the command buffer in the IOMMU.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-09-03 15:55:34 +02:00
Joerg Roedel
93f1cc67cf x86/amd-iommu: Add reset function for command buffers
This patch factors parts of the command buffer
initialization code into a seperate function which can be
used to reset the command buffer later.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-09-03 15:55:34 +02:00
Joerg Roedel
d586d7852c x86/amd-iommu: Add function to flush all DTEs on one IOMMU
This function flushes all DTE entries on one IOMMU for all
devices behind this IOMMU. This is required for command
buffer resetting later.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-09-03 15:55:23 +02:00
Joerg Roedel
e0faf54ee8 x86/amd-iommu: fix broken check in amd_iommu_flush_all_devices
The amd_iommu_pd_table is indexed by protection domain
number and not by device id. So this check is broken and
must be removed.

Cc: stable@kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-09-03 15:49:56 +02:00
Joerg Roedel
ae908c22aa x86/amd-iommu: Remove redundant 'IOMMU' string
The 'IOMMU: ' prefix is not necessary because the
DUMP_printk macro already prints its own prefix.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-09-03 15:49:56 +02:00
Joerg Roedel
4c6f40d4e0 x86/amd-iommu: replace "AMD IOMMU" by "AMD-Vi"
This patch replaces the "AMD IOMMU" printk strings with the
official name for the hardware: "AMD-Vi".

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-09-03 15:49:56 +02:00
Joerg Roedel
f2430bd104 x86/amd-iommu: Remove some merge helper code
This patch removes some left-overs which where put into the code to
simplify merging code which also depends on changes in other trees.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-09-03 15:49:55 +02:00
Joerg Roedel
e394d72aa8 x86/amd-iommu: Introduce function for iommu-local domain flush
This patch introduces a function to flush all domain tlbs
for on one given IOMMU. This is required later to reset the
command buffer on one IOMMU.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-09-03 15:41:34 +02:00
Joerg Roedel
945b4ac44e x86/amd-iommu: Dump illegal command on ILLEGAL_COMMAND_ERROR
This patch adds code to dump the command which caused an
ILLEGAL_COMMAND_ERROR raised by the IOMMU hardware.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-09-03 14:28:04 +02:00
Joerg Roedel
e3e59876e8 x86/amd-iommu: Dump fault entry on DTE error
This patch adds code to dump the content of the device table
entry which caused an ILLEGAL_DEV_TABLE_ENTRY error from the
IOMMU hardware.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-09-03 14:17:08 +02:00
Ingo Molnar
f76bd108e5 Merge branch 'perfcounters/urgent' into perfcounters/core
Merge reason: We are going to modify a place modified by
              perfcounters/urgent.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-02 21:42:59 +02:00
David Howells
ee18d64c1f KEYS: Add a keyctl to install a process's session keyring on its parent [try #6]
Add a keyctl to install a process's session keyring onto its parent.  This
replaces the parent's session keyring.  Because the COW credential code does
not permit one process to change another process's credentials directly, the
change is deferred until userspace next starts executing again.  Normally this
will be after a wait*() syscall.

To support this, three new security hooks have been provided:
cred_alloc_blank() to allocate unset security creds, cred_transfer() to fill in
the blank security creds and key_session_to_parent() - which asks the LSM if
the process may replace its parent's session keyring.

The replacement may only happen if the process has the same ownership details
as its parent, and the process has LINK permission on the session keyring, and
the session keyring is owned by the process, and the LSM permits it.

Note that this requires alteration to each architecture's notify_resume path.
This has been done for all arches barring blackfin, m68k* and xtensa, all of
which need assembly alteration to support TIF_NOTIFY_RESUME.  This allows the
replacement to be performed at the point the parent process resumes userspace
execution.

This allows the userspace AFS pioctl emulation to fully emulate newpag() and
the VIOCSETTOK and VIOCSETTOK2 pioctls, all of which require the ability to
alter the parent process's PAG membership.  However, since kAFS doesn't use
PAGs per se, but rather dumps the keys into the session keyring, the session
keyring of the parent must be replaced if, for example, VIOCSETTOK is passed
the newpag flag.

This can be tested with the following program:

	#include <stdio.h>
	#include <stdlib.h>
	#include <keyutils.h>

	#define KEYCTL_SESSION_TO_PARENT	18

	#define OSERROR(X, S) do { if ((long)(X) == -1) { perror(S); exit(1); } } while(0)

	int main(int argc, char **argv)
	{
		key_serial_t keyring, key;
		long ret;

		keyring = keyctl_join_session_keyring(argv[1]);
		OSERROR(keyring, "keyctl_join_session_keyring");

		key = add_key("user", "a", "b", 1, keyring);
		OSERROR(key, "add_key");

		ret = keyctl(KEYCTL_SESSION_TO_PARENT);
		OSERROR(ret, "KEYCTL_SESSION_TO_PARENT");

		return 0;
	}

Compiled and linked with -lkeyutils, you should see something like:

	[dhowells@andromeda ~]$ keyctl show
	Session Keyring
	       -3 --alswrv   4043  4043  keyring: _ses
	355907932 --alswrv   4043    -1   \_ keyring: _uid.4043
	[dhowells@andromeda ~]$ /tmp/newpag
	[dhowells@andromeda ~]$ keyctl show
	Session Keyring
	       -3 --alswrv   4043  4043  keyring: _ses
	1055658746 --alswrv   4043  4043   \_ user: a
	[dhowells@andromeda ~]$ /tmp/newpag hello
	[dhowells@andromeda ~]$ keyctl show
	Session Keyring
	       -3 --alswrv   4043  4043  keyring: hello
	340417692 --alswrv   4043  4043   \_ user: a

Where the test program creates a new session keyring, sticks a user key named
'a' into it and then installs it on its parent.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-09-02 21:29:22 +10:00
Ingo Molnar
936e894a97 Merge commit 'v2.6.31-rc8' into x86/txt
Conflicts:
	arch/x86/kernel/reboot.c
	security/Kconfig

Merge reason: resolve the conflicts, bump up from rc3 to rc8.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-02 08:17:56 +02:00
Shane Wang
69575d3886 x86, intel_txt: clean up the impact on generic code, unbreak non-x86
Move tboot.h from asm to linux to fix the build errors of intel_txt
patch on non-X86 platforms. Remove the tboot code from generic code
init/main.c and kernel/cpu.c.

Signed-off-by: Shane Wang <shane.wang@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-09-01 18:25:07 -07:00
Prarit Bhargava
1a8e42fa81 [CPUFREQ] Create a blacklist for processors that should not load the acpi-cpufreq module.
Create a blacklist for processors that should not load the acpi-cpufreq module.

The initial entry in the blacklist function is the Intel 0f68 processor.  It's
specification update mentions errata AL30 which implies that cpufreq should not
run on this processor.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2009-09-01 12:45:20 -04:00
Mark Langsdorf
db39d5529d [CPUFREQ] Powernow-k8: Enable more than 2 low P-states
Remove an obsolete check that used to prevent there being more
than 2 low P-states.  Now that low-to-low P-states changes are
enabled, it prevents otherwise workable configurations with
multiple low P-states.

Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com>
Tested-by: Krists Krilovs <pow@pow.za.net>
Signed-off-by: Dave Jones <davej@redhat.com>
2009-09-01 12:45:20 -04:00
Catalin Marinas
acde31dc46 kmemleak: Ignore the aperture memory hole on x86_64
This block is allocated with alloc_bootmem() and scanned by kmemleak but
the kernel direct mapping may no longer exist. This patch tells kmemleak
to ignore this memory hole. The dma32_bootmem_ptr in
dma32_reserve_bootmem() is also ignored.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
2009-09-01 11:12:32 +01:00
H. Peter Anvin
ff55df53df x86, msr: Export the register-setting MSR functions via /dev/*/msr
Make it possible to access the all-register-setting/getting MSR
functions via the MSR driver.  This is implemented as an ioctl() on
the standard MSR device node.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Borislav Petkov <petkovbb@gmail.com>
2009-08-31 16:16:04 -07:00
H. Peter Anvin
0cc0213e73 x86, msr: Have the _safe MSR functions return -EIO, not -EFAULT
For some reason, the _safe MSR functions returned -EFAULT, not -EIO.
However, the only user which cares about the return code as anything
other than a boolean is the MSR driver, which wants -EIO.  Change it
to -EIO across the board.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
2009-08-31 15:15:23 -07:00
Borislav Petkov
6b0f43ddfa x86, AMD: Disable wrongly set X86_FEATURE_LAHF_LM CPUID bit
fbd8b1819e80ac5a176d085fdddc3a34d1499318 turns off the bit for
/proc/cpuinfo. However, a proper/full fix would be to additionally
turn off the bit in the CPUID output so that future callers get
correct CPU features info.

Do that by basically reversing what the BIOS wrongfully does at boot.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
LKML-Reference: <1251705011-18636-3-git-send-email-petkovbb@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-08-31 15:14:29 -07:00
Borislav Petkov
177fed1ee8 x86, msr: Rewrite AMD rd/wrmsr variants
Switch them to native_{rd,wr}msr_safe_regs and remove
pv_cpu_ops.read_msr_amd.

Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
LKML-Reference: <1251705011-18636-2-git-send-email-petkovbb@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-08-31 15:14:28 -07:00
Borislav Petkov
132ec92f3f x86, msr: Add rd/wrmsr interfaces with preset registers
native_{rdmsr,wrmsr}_safe_regs are two new interfaces which allow
presetting of a subset of eight x86 GPRs before executing the rd/wrmsr
instructions. This is needed at least on AMD K8 for accessing an erratum
workaround MSR.

Originally based on an idea by H. Peter Anvin.

Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
LKML-Reference: <1251705011-18636-1-git-send-email-petkovbb@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-08-31 15:14:26 -07:00
Thomas Gleixner
e11dadabf4 x86: apic namespace cleanup
boot_cpu_physical_apicid is a global variable and used as function
argument as well. Rename the function arguments to avoid confusion.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-31 21:30:47 +02:00
Thomas Gleixner
bc07844a33 x86: Distangle ioapic and i8259
The proposed Moorestown support patches use an extra feature flag
mechanism to make the ioapic work w/o an i8259. There is a much
simpler solution.

Most i8259 specific functions are already called dependend on the irq
number less than NR_IRQS_LEGACY. Replacing that constant by a
read_mostly variable which can be set to 0 by the platform setup code
allows us to achieve the same without any special feature flags.

That trivial change allows us to proceed with MRST w/o doing a full
blown overhaul of the ioapic code which would delay MRST unduly.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-31 19:23:09 +02:00
Thomas Gleixner
3f4110a48a x86: Add Moorestown early detection
Moorestown MID devices need to be detected early in the boot process
to setup and do not call x86_default_early_setup as there is no EBDA
region to reserve.

[ Copied the minimal code from Jacobs latest MRST series ]

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jacob Pan <jacob.jun.pan@intel.com>
2009-08-31 11:09:40 +02:00
Pan, Jacob jun
162bc7ab01 x86: Add hardware_subarch ID for Moorestown
x86 bootprotocol 2.07 has introduced hardware_subarch ID in the boot
parameters provided by FW. We use it to identify Moorestown platforms.

[ tglx: Cleanup and paravirt fix ]

Signed-off-by: Jacob Pan <jacob.jun.pan@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-31 11:09:40 +02:00
Thomas Gleixner
47a3d5da70 x86: Add early platform detection
Platforms like Moorestown require early setup and want to avoid the
call to reserve_ebda_region. The x86_init override is too late when
the MRST detection happens in setup_arch. Move the default i386
x86_init overrides and the call to reserve_ebda_region into a separate
function which is called as the default of a switch case depending on
the hardware_subarch id in boot params. This allows us to add a case
for MRST and let MRST have its own early setup function.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-31 11:09:40 +02:00
Thomas Gleixner
dd0a70c8f9 x86: Move tsc_init to late_time_init
We do not need the TSC before late_time_init. Move the tsc_init to the
late time init code so we can also utilize HPET for calibration (which
we claimed to do but never did except in some older kernel
version). This also helps Moorestown to calibrate the TSC with the
AHBT timer which needs to be initialized in late_time_init like HPET.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-31 09:35:47 +02:00
Thomas Gleixner
2d826404f0 x86: Move tsc_calibration to x86_init_ops
TSC calibration is modified by the vmware hypervisor and paravirt by
separate means. Moorestown wants to add its own calibration routine as
well. So make calibrate_tsc a proper x86_init_ops function and
override it by paravirt or by the early setup of the vmware
hypervisor.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-31 09:35:47 +02:00
Thomas Gleixner
47926214d8 x86: Replace the now identical time_32/64.c by time.c
Remove the redundant copy.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-31 09:35:46 +02:00
Thomas Gleixner
ef4512882d x86: time_32/64.c unify profile_pc
The code is identical except for the formatting and a useless
#ifdef. Make it the same.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-31 09:35:46 +02:00
Thomas Gleixner
08047c4f17 x86: Move calibrate_cpu to tsc.c
Move the code where it's only user is. Also we need to look whether
this hardwired hackery might interfere with perfcounters.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-31 09:35:46 +02:00