742 Commits

Author SHA1 Message Date
Jiang Liu
2e45528930 iommu/vt-d: Unify the way to process DMAR device scope array
Now we have a PCI bus notification based mechanism to update DMAR
device scope array, we could extend the mechanism to support boot
time initialization too, which will help to unify and simplify
the implementation.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
2014-03-04 17:51:06 +01:00
Jiang Liu
59ce0515cd iommu/vt-d: Update DRHD/RMRR/ATSR device scope caches when PCI hotplug happens
Current Intel DMAR/IOMMU driver assumes that all PCI devices associated
with DMAR/RMRR/ATSR device scope arrays are created at boot time and
won't change at runtime, so it caches pointers of associated PCI device
object. That assumption may be wrong now due to:
1) introduction of PCI host bridge hotplug
2) PCI device hotplug through sysfs interfaces.

Wang Yijing has tried to solve this issue by caching <bus, dev, func>
tupple instead of the PCI device object pointer, but that's still
unreliable because PCI bus number may change in case of hotplug.
Please refer to http://lkml.org/lkml/2013/11/5/64
Message from Yingjing's mail:
after remove and rescan a pci device
[  611.857095] dmar: DRHD: handling fault status reg 2
[  611.857109] dmar: DMAR:[DMA Read] Request device [86:00.3] fault addr ffff7000
[  611.857109] DMAR:[fault reason 02] Present bit in context entry is clear
[  611.857524] dmar: DRHD: handling fault status reg 102
[  611.857534] dmar: DMAR:[DMA Read] Request device [86:00.3] fault addr ffff6000
[  611.857534] DMAR:[fault reason 02] Present bit in context entry is clear
[  611.857936] dmar: DRHD: handling fault status reg 202
[  611.857947] dmar: DMAR:[DMA Read] Request device [86:00.3] fault addr ffff5000
[  611.857947] DMAR:[fault reason 02] Present bit in context entry is clear
[  611.858351] dmar: DRHD: handling fault status reg 302
[  611.858362] dmar: DMAR:[DMA Read] Request device [86:00.3] fault addr ffff4000
[  611.858362] DMAR:[fault reason 02] Present bit in context entry is clear
[  611.860819] IPv6: ADDRCONF(NETDEV_UP): eth3: link is not ready
[  611.860983] dmar: DRHD: handling fault status reg 402
[  611.860995] dmar: INTR-REMAP: Request device [[86:00.3] fault index a4
[  611.860995] INTR-REMAP:[fault reason 34] Present field in the IRTE entry is clear

This patch introduces a new mechanism to update the DRHD/RMRR/ATSR device scope
caches by hooking PCI bus notification.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
2014-03-04 17:51:06 +01:00
Jiang Liu
0e242612d9 iommu/vt-d: Use RCU to protect global resources in interrupt context
Global DMA and interrupt remapping resources may be accessed in
interrupt context, so use RCU instead of rwsem to protect them
in such cases.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
2014-03-04 17:51:05 +01:00
Jiang Liu
3a5670e8ac iommu/vt-d: Introduce a rwsem to protect global data structures
Introduce a global rwsem dmar_global_lock, which will be used to
protect DMAR related global data structures from DMAR/PCI/memory
device hotplug operations in process context.

DMA and interrupt remapping related data structures are read most,
and only change when memory/PCI/DMAR hotplug event happens.
So a global rwsem solution is adopted for balance between simplicity
and performance.

For interrupt remapping driver, function intel_irq_remapping_supported(),
dmar_table_init(), intel_enable_irq_remapping(), disable_irq_remapping(),
reenable_irq_remapping() and enable_drhd_fault_handling() etc
are called during booting, suspending and resuming with interrupt
disabled, so no need to take the global lock.

For interrupt remapping entry allocation, the locking model is:
	down_read(&dmar_global_lock);
	/* Find corresponding iommu */
	iommu = map_hpet_to_ir(id);
	if (iommu)
		/*
		 * Allocate remapping entry and mark entry busy,
		 * the IOMMU won't be hot-removed until the
		 * allocated entry has been released.
		 */
		index = alloc_irte(iommu, irq, 1);
	up_read(&dmar_global_lock);

For DMA remmaping driver, we only uses the dmar_global_lock rwsem to
protect functions which are only called in process context. For any
function which may be called in interrupt context, we will use RCU
to protect them in following patches.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
2014-03-04 17:51:05 +01:00
Jiang Liu
b683b230a2 iommu/vt-d: Introduce macro for_each_dev_scope() to walk device scope entries
Introduce for_each_dev_scope()/for_each_active_dev_scope() to walk
{active} device scope entries. This will help following RCU lock
related patches.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
2014-03-04 17:51:04 +01:00
Jiang Liu
b5f82ddf22 iommu/vt-d: Fix error in detect ATS capability
Current Intel IOMMU driver only matches a PCIe root port with the first
DRHD unit with the samge segment number. It will report false result
if there are multiple DRHD units with the same segment number, thus fail
to detect ATS capability for some PCIe devices.

This patch refines function dmar_find_matched_atsr_unit() to search all
DRHD units with the same segment number.

An example DMAR table entries as below:
[1D0h 0464  2]                Subtable Type : 0002 <Root Port ATS Capability>
[1D2h 0466  2]                       Length : 0028
[1D4h 0468  1]                        Flags : 00
[1D5h 0469  1]                     Reserved : 00
[1D6h 0470  2]           PCI Segment Number : 0000

[1D8h 0472  1]      Device Scope Entry Type : 02
[1D9h 0473  1]                 Entry Length : 08
[1DAh 0474  2]                     Reserved : 0000
[1DCh 0476  1]               Enumeration ID : 00
[1DDh 0477  1]               PCI Bus Number : 00
[1DEh 0478  2]                     PCI Path : [02, 00]

[1E0h 0480  1]      Device Scope Entry Type : 02
[1E1h 0481  1]                 Entry Length : 08
[1E2h 0482  2]                     Reserved : 0000
[1E4h 0484  1]               Enumeration ID : 00
[1E5h 0485  1]               PCI Bus Number : 00
[1E6h 0486  2]                     PCI Path : [03, 00]

[1E8h 0488  1]      Device Scope Entry Type : 02
[1E9h 0489  1]                 Entry Length : 08
[1EAh 0490  2]                     Reserved : 0000
[1ECh 0492  1]               Enumeration ID : 00
[1EDh 0493  1]               PCI Bus Number : 00
[1EEh 0494  2]                     PCI Path : [03, 02]

[1F0h 0496  1]      Device Scope Entry Type : 02
[1F1h 0497  1]                 Entry Length : 08
[1F2h 0498  2]                     Reserved : 0000
[1F4h 0500  1]               Enumeration ID : 00
[1F5h 0501  1]               PCI Bus Number : 00
[1F6h 0502  2]                     PCI Path : [03, 03]

[1F8h 0504  2]                Subtable Type : 0002 <Root Port ATS Capability>
[1FAh 0506  2]                       Length : 0020
[1FCh 0508  1]                        Flags : 00
[1FDh 0509  1]                     Reserved : 00
[1FEh 0510  2]           PCI Segment Number : 0000

[200h 0512  1]      Device Scope Entry Type : 02
[201h 0513  1]                 Entry Length : 08
[202h 0514  2]                     Reserved : 0000
[204h 0516  1]               Enumeration ID : 00
[205h 0517  1]               PCI Bus Number : 40
[206h 0518  2]                     PCI Path : [02, 00]

[208h 0520  1]      Device Scope Entry Type : 02
[209h 0521  1]                 Entry Length : 08
[20Ah 0522  2]                     Reserved : 0000
[20Ch 0524  1]               Enumeration ID : 00
[20Dh 0525  1]               PCI Bus Number : 40
[20Eh 0526  2]                     PCI Path : [02, 02]

[210h 0528  1]      Device Scope Entry Type : 02
[211h 0529  1]                 Entry Length : 08
[212h 0530  2]                     Reserved : 0000
[214h 0532  1]               Enumeration ID : 00
[215h 0533  1]               PCI Bus Number : 40
[216h 0534  2]                     PCI Path : [03, 00]

[218h 0536  2]                Subtable Type : 0002 <Root Port ATS Capability>
[21Ah 0538  2]                       Length : 0020
[21Ch 0540  1]                        Flags : 00
[21Dh 0541  1]                     Reserved : 00
[21Eh 0542  2]           PCI Segment Number : 0000

[220h 0544  1]      Device Scope Entry Type : 02
[221h 0545  1]                 Entry Length : 08
[222h 0546  2]                     Reserved : 0000
[224h 0548  1]               Enumeration ID : 00
[225h 0549  1]               PCI Bus Number : 80
[226h 0550  2]                     PCI Path : [02, 00]

[228h 0552  1]      Device Scope Entry Type : 02
[229h 0553  1]                 Entry Length : 08
[22Ah 0554  2]                     Reserved : 0000
[22Ch 0556  1]               Enumeration ID : 00
[22Dh 0557  1]               PCI Bus Number : 80
[22Eh 0558  2]                     PCI Path : [02, 02]

[230h 0560  1]      Device Scope Entry Type : 02
[231h 0561  1]                 Entry Length : 08
[232h 0562  2]                     Reserved : 0000
[234h 0564  1]               Enumeration ID : 00
[235h 0565  1]               PCI Bus Number : 80
[236h 0566  2]                     PCI Path : [03, 00]

[238h 0568  2]                Subtable Type : 0002 <Root Port ATS Capability>
[23Ah 0570  2]                       Length : 0020
[23Ch 0572  1]                        Flags : 00
[23Dh 0573  1]                     Reserved : 00
[23Eh 0574  2]           PCI Segment Number : 0000

[240h 0576  1]      Device Scope Entry Type : 02
[241h 0577  1]                 Entry Length : 08
[242h 0578  2]                     Reserved : 0000
[244h 0580  1]               Enumeration ID : 00
[245h 0581  1]               PCI Bus Number : C0
[246h 0582  2]                     PCI Path : [02, 00]

[248h 0584  1]      Device Scope Entry Type : 02
[249h 0585  1]                 Entry Length : 08
[24Ah 0586  2]                     Reserved : 0000
[24Ch 0588  1]               Enumeration ID : 00
[24Dh 0589  1]               PCI Bus Number : C0
[24Eh 0590  2]                     PCI Path : [02, 02]

[250h 0592  1]      Device Scope Entry Type : 02
[251h 0593  1]                 Entry Length : 08
[252h 0594  2]                     Reserved : 0000
[254h 0596  1]               Enumeration ID : 00
[255h 0597  1]               PCI Bus Number : C0
[256h 0598  2]                     PCI Path : [03, 00]

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
2014-03-04 17:51:04 +01:00
Jiang Liu
a4eaa86c0c iommu/vt-d: Check for NULL pointer when freeing IOMMU data structure
Domain id 0 will be assigned to invalid translation without allocating
domain data structure if DMAR unit supports caching mode. So in function
free_dmar_iommu(), we should check whether the domain pointer is NULL,
otherwise it will cause system crash as below:
[    6.790519] BUG: unable to handle kernel NULL pointer dereference at 00000000000000c8
[    6.799520] IP: [<ffffffff810e2dc8>] __lock_acquire+0x11f8/0x1430
[    6.806493] PGD 0
[    6.817972] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
[    6.823303] Modules linked in:
[    6.826862] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.14.0-rc1+ #126
[    6.834252] Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BRIVTIN1.86B.0047.R00.1402050741 02/05/2014
[    6.845951] task: ffff880455a80000 ti: ffff880455a88000 task.ti: ffff880455a88000
[    6.854437] RIP: 0010:[<ffffffff810e2dc8>]  [<ffffffff810e2dc8>] __lock_acquire+0x11f8/0x1430
[    6.864154] RSP: 0000:ffff880455a89ce0  EFLAGS: 00010046
[    6.870179] RAX: 0000000000000046 RBX: 0000000000000002 RCX: 0000000000000000
[    6.878249] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000000000c8
[    6.886318] RBP: ffff880455a89d40 R08: 0000000000000002 R09: 0000000000000001
[    6.894387] R10: 0000000000000000 R11: 0000000000000001 R12: ffff880455a80000
[    6.902458] R13: 0000000000000000 R14: 00000000000000c8 R15: 0000000000000000
[    6.910520] FS:  0000000000000000(0000) GS:ffff88045b800000(0000) knlGS:0000000000000000
[    6.919687] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    6.926198] CR2: 00000000000000c8 CR3: 0000000001e0e000 CR4: 00000000001407f0
[    6.934269] Stack:
[    6.936588]  ffffffffffffff10 ffffffff810f59db 0000000000000010 0000000000000246
[    6.945219]  ffff880455a89d10 0000000000000000 ffffffff82bcb980 0000000000000046
[    6.953850]  0000000000000000 0000000000000000 0000000000000002 0000000000000000
[    6.962482] Call Trace:
[    6.965300]  [<ffffffff810f59db>] ? vprintk_emit+0x4fb/0x5a0
[    6.971716]  [<ffffffff810e3185>] lock_acquire+0x185/0x200
[    6.977941]  [<ffffffff821fbbee>] ? init_dmars+0x839/0xa1d
[    6.984167]  [<ffffffff81870b06>] _raw_spin_lock_irqsave+0x56/0x90
[    6.991158]  [<ffffffff821fbbee>] ? init_dmars+0x839/0xa1d
[    6.997380]  [<ffffffff821fbbee>] init_dmars+0x839/0xa1d
[    7.003410]  [<ffffffff8147d575>] ? pci_get_dev_by_id+0x75/0xd0
[    7.010119]  [<ffffffff821fc146>] intel_iommu_init+0x2f0/0x502
[    7.016735]  [<ffffffff821a7947>] ? iommu_setup+0x27d/0x27d
[    7.023056]  [<ffffffff821a796f>] pci_iommu_init+0x28/0x52
[    7.029282]  [<ffffffff81002162>] do_one_initcall+0xf2/0x220
[    7.035702]  [<ffffffff810a4a29>] ? parse_args+0x2c9/0x450
[    7.041919]  [<ffffffff8219d1b1>] kernel_init_freeable+0x1c9/0x25b
[    7.048919]  [<ffffffff8219c8d2>] ? do_early_param+0x8a/0x8a
[    7.055336]  [<ffffffff8184d3f0>] ? rest_init+0x150/0x150
[    7.061461]  [<ffffffff8184d3fe>] kernel_init+0xe/0x100
[    7.067393]  [<ffffffff8187b5fc>] ret_from_fork+0x7c/0xb0
[    7.073518]  [<ffffffff8184d3f0>] ? rest_init+0x150/0x150
[    7.079642] Code: 01 76 18 89 05 46 04 36 01 41 be 01 00 00 00 e9 2f 02 00 00 0f 1f 80 00 00 00 00 41 be 01 00 00 00 e9 1d 02 00 00 0f 1f 44 00 00 <49> 81 3e c0 31 34 82 b8 01 00 00 00 0f 44 d8 41 83 ff 01 0f 87
[    7.104944] RIP  [<ffffffff810e2dc8>] __lock_acquire+0x11f8/0x1430
[    7.112008]  RSP <ffff880455a89ce0>
[    7.115988] CR2: 00000000000000c8
[    7.119784] ---[ end trace 13d756f0f462c538 ]---
[    7.125034] note: swapper/0[1] exited with preempt_count 1
[    7.131285] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009
[    7.131285]

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
2014-03-04 17:51:03 +01:00
Jiang Liu
9ebd682e5a iommu/vt-d: Fix incorrect iommu_count for si_domain
The iommu_count field in si_domain(static identity domain) is
initialized to zero and never increases. It will underflow
when tearing down iommu unit in function free_dmar_iommu()
and leak memory. So refine code to correctly manage
si_domain->iommu_count.

Warning message caused by si_domain memory leak:
[   14.609681] IOMMU: Setting RMRR:
[   14.613496] Ignoring identity map for HW passthrough device 0000:00:1a.0 [0xbdcfd000 - 0xbdd1dfff]
[   14.623809] Ignoring identity map for HW passthrough device 0000:00:1d.0 [0xbdcfd000 - 0xbdd1dfff]
[   14.634162] IOMMU: Prepare 0-16MiB unity mapping for LPC
[   14.640329] Ignoring identity map for HW passthrough device 0000:00:1f.0 [0x0 - 0xffffff]
[   14.673360] IOMMU: dmar init failed
[   14.678157] kmem_cache_destroy iommu_devinfo: Slab cache still has objects
[   14.686076] CPU: 12 PID: 1 Comm: swapper/0 Not tainted 3.13.0-rc1-gerry+ #59
[   14.694176] Hardware name: Intel Corporation LH Pass ........../SVRBD-ROW_T, BIOS SE5C600.86B.99.99.x059.091020121352 09/10/2012
[   14.707412]  0000000000000000 ffff88042dd33db0 ffffffff8156223d ffff880c2cc37c00
[   14.716407]  ffff88042dd33dc8 ffffffff811790b1 ffff880c2d3533b8 ffff88042dd33e00
[   14.725468]  ffffffff81dc7a6a ffffffff81b1e8e0 ffffffff81f84058 ffffffff81d8a711
[   14.734464] Call Trace:
[   14.737453]  [<ffffffff8156223d>] dump_stack+0x4d/0x66
[   14.743430]  [<ffffffff811790b1>] kmem_cache_destroy+0xf1/0x100
[   14.750279]  [<ffffffff81dc7a6a>] intel_iommu_init+0x122/0x56a
[   14.757035]  [<ffffffff81d8a711>] ? iommu_setup+0x27d/0x27d
[   14.763491]  [<ffffffff81d8a739>] pci_iommu_init+0x28/0x52
[   14.769846]  [<ffffffff81000342>] do_one_initcall+0x122/0x180
[   14.776506]  [<ffffffff81077738>] ? parse_args+0x1e8/0x320
[   14.782866]  [<ffffffff81d850e8>] kernel_init_freeable+0x1e1/0x26c
[   14.789994]  [<ffffffff81d84833>] ? do_early_param+0x88/0x88
[   14.796556]  [<ffffffff8154ffc0>] ? rest_init+0xd0/0xd0
[   14.802626]  [<ffffffff8154ffce>] kernel_init+0xe/0x130
[   14.808698]  [<ffffffff815756ac>] ret_from_fork+0x7c/0xb0
[   14.814963]  [<ffffffff8154ffc0>] ? rest_init+0xd0/0xd0
[   14.821640] kmem_cache_destroy iommu_domain: Slab cache still has objects
[   14.829456] CPU: 12 PID: 1 Comm: swapper/0 Not tainted 3.13.0-rc1-gerry+ #59
[   14.837562] Hardware name: Intel Corporation LH Pass ........../SVRBD-ROW_T, BIOS SE5C600.86B.99.99.x059.091020121352 09/10/2012
[   14.850803]  0000000000000000 ffff88042dd33db0 ffffffff8156223d ffff88102c1ee3c0
[   14.861222]  ffff88042dd33dc8 ffffffff811790b1 ffff880c2d3533b8 ffff88042dd33e00
[   14.870284]  ffffffff81dc7a76 ffffffff81b1e8e0 ffffffff81f84058 ffffffff81d8a711
[   14.879271] Call Trace:
[   14.882227]  [<ffffffff8156223d>] dump_stack+0x4d/0x66
[   14.888197]  [<ffffffff811790b1>] kmem_cache_destroy+0xf1/0x100
[   14.895034]  [<ffffffff81dc7a76>] intel_iommu_init+0x12e/0x56a
[   14.901781]  [<ffffffff81d8a711>] ? iommu_setup+0x27d/0x27d
[   14.908238]  [<ffffffff81d8a739>] pci_iommu_init+0x28/0x52
[   14.914594]  [<ffffffff81000342>] do_one_initcall+0x122/0x180
[   14.921244]  [<ffffffff81077738>] ? parse_args+0x1e8/0x320
[   14.927598]  [<ffffffff81d850e8>] kernel_init_freeable+0x1e1/0x26c
[   14.934738]  [<ffffffff81d84833>] ? do_early_param+0x88/0x88
[   14.941309]  [<ffffffff8154ffc0>] ? rest_init+0xd0/0xd0
[   14.947380]  [<ffffffff8154ffce>] kernel_init+0xe/0x130
[   14.953430]  [<ffffffff815756ac>] ret_from_fork+0x7c/0xb0
[   14.959689]  [<ffffffff8154ffc0>] ? rest_init+0xd0/0xd0
[   14.966299] kmem_cache_destroy iommu_iova: Slab cache still has objects
[   14.973923] CPU: 12 PID: 1 Comm: swapper/0 Not tainted 3.13.0-rc1-gerry+ #59
[   14.982020] Hardware name: Intel Corporation LH Pass ........../SVRBD-ROW_T, BIOS SE5C600.86B.99.99.x059.091020121352 09/10/2012
[   14.995263]  0000000000000000 ffff88042dd33db0 ffffffff8156223d ffff88042cb5c980
[   15.004265]  ffff88042dd33dc8 ffffffff811790b1 ffff880c2d3533b8 ffff88042dd33e00
[   15.013322]  ffffffff81dc7a82 ffffffff81b1e8e0 ffffffff81f84058 ffffffff81d8a711
[   15.022318] Call Trace:
[   15.025238]  [<ffffffff8156223d>] dump_stack+0x4d/0x66
[   15.031202]  [<ffffffff811790b1>] kmem_cache_destroy+0xf1/0x100
[   15.038038]  [<ffffffff81dc7a82>] intel_iommu_init+0x13a/0x56a
[   15.044786]  [<ffffffff81d8a711>] ? iommu_setup+0x27d/0x27d
[   15.051242]  [<ffffffff81d8a739>] pci_iommu_init+0x28/0x52
[   15.057601]  [<ffffffff81000342>] do_one_initcall+0x122/0x180
[   15.064254]  [<ffffffff81077738>] ? parse_args+0x1e8/0x320
[   15.070608]  [<ffffffff81d850e8>] kernel_init_freeable+0x1e1/0x26c
[   15.077747]  [<ffffffff81d84833>] ? do_early_param+0x88/0x88
[   15.084300]  [<ffffffff8154ffc0>] ? rest_init+0xd0/0xd0
[   15.090362]  [<ffffffff8154ffce>] kernel_init+0xe/0x130
[   15.096431]  [<ffffffff815756ac>] ret_from_fork+0x7c/0xb0
[   15.102693]  [<ffffffff8154ffc0>] ? rest_init+0xd0/0xd0
[   15.189273] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
2014-03-04 17:51:02 +01:00
Jiang Liu
92d03cc8d0 iommu/vt-d: Reduce duplicated code to handle virtual machine domains
Reduce duplicated code to handle virtual machine domains, there's no
functionality changes. It also improves code readability.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
2014-03-04 17:51:01 +01:00
Jiang Liu
e85bb5d4d1 iommu/vt-d: Free resources if failed to create domain for PCIe endpoint
Enhance function get_domain_for_dev() to release allocated resources
if failed to create domain for PCIe endpoint, otherwise the allocated
resources will get lost.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
2014-03-04 17:51:01 +01:00
Jiang Liu
745f2586e7 iommu/vt-d: Simplify function get_domain_for_dev()
Function get_domain_for_dev() is a little complex, simplify it
by factoring out dmar_search_domain_by_dev_info() and
dmar_insert_dev_info().

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
2014-03-04 17:51:01 +01:00
Jiang Liu
b94e4117f8 iommu/vt-d: Move private structures and variables into intel-iommu.c
Move private structures and variables into intel-iommu.c, which will
help to simplify locking policy for hotplug. Also delete redundant
declarations.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
2014-03-04 17:51:00 +01:00
Jiang Liu
bb3a6b7845 iommu/vt-d: Factor out dmar_alloc_dev_scope() for later reuse
Factor out function dmar_alloc_dev_scope() from dmar_parse_dev_scope()
for later reuse.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
2014-03-04 17:51:00 +01:00
Jiang Liu
7e7dfab71a iommu/vt-d: Avoid caching stale domain_device_info when hot-removing PCI device
Function device_notifier() in intel-iommu.c only remove domain_device_info
data structure associated with a PCI device when handling PCI device
driver unbinding events. If a PCI device has never been bound to a PCI
device driver, there won't be BUS_NOTIFY_UNBOUND_DRIVER event when
hot-removing the PCI device. So associated domain_device_info data
structure may get lost.

On the other hand, if iommu_pass_through is enabled, function
iommu_prepare_static_indentify_mapping() will create domain_device_info
data structure for each PCIe to PCIe bridge and PCIe endpoint,
no matter whether there are drivers associated with those PCIe devices
or not. So those domain_device_info data structures will get lost when
hot-removing the assocated PCIe devices if they have never bound to
any PCI device driver.

To be even worse, it's not only an memory leak issue, but also an
caching of stale information bug because the memory are kept in
device_domain_list and domain->devices lists.

Fix the bug by trying to remove domain_device_info data structure when
handling BUS_NOTIFY_DEL_DEVICE event.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
2014-03-04 17:51:00 +01:00
Jiang Liu
816997d03b iommu/vt-d: Avoid caching stale domain_device_info and fix memory leak
Function device_notifier() in intel-iommu.c fails to remove
device_domain_info data structures for PCI devices if they are
associated with si_domain because iommu_no_mapping() returns true
for those PCI devices. This will cause memory leak and caching of
stale information in domain->devices list.

So fix the issue by not calling iommu_no_mapping() and skipping check
of iommu_pass_through.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
2014-03-04 17:50:59 +01:00
Jiang Liu
989d51fc99 iommu/vt-d: Avoid double free of g_iommus on error recovery path
Array 'g_iommus' may be freed twice on error recovery path in function
init_dmars() and free_dmar_iommu(), thus cause random system crash as
below.

[    6.774301] IOMMU: dmar init failed
[    6.778310] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
[    6.785615] software IO TLB [mem 0x76bcf000-0x7abcf000] (64MB) mapped at [ffff880076bcf000-ffff88007abcefff]
[    6.796887] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
[    6.804173] Modules linked in:
[    6.807731] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.14.0-rc1+ #108
[    6.815122] Hardware name: Intel Corporation BRICKLAND/BRICKLAND, BIOS BRIVTIN1.86B.0047.R00.1402050741 02/05/2014
[    6.836000] task: ffff880455a80000 ti: ffff880455a88000 task.ti: ffff880455a88000
[    6.844487] RIP: 0010:[<ffffffff8143eea6>]  [<ffffffff8143eea6>] memcpy+0x6/0x110
[    6.853039] RSP: 0000:ffff880455a89cc8  EFLAGS: 00010293
[    6.859064] RAX: ffff006568636163 RBX: ffff00656863616a RCX: 0000000000000005
[    6.867134] RDX: 0000000000000005 RSI: ffffffff81cdc439 RDI: ffff006568636163
[    6.875205] RBP: ffff880455a89d30 R08: 000000000001bc3b R09: 0000000000000000
[    6.883275] R10: 0000000000000000 R11: ffffffff81cdc43e R12: ffff880455a89da8
[    6.891338] R13: ffff006568636163 R14: 0000000000000005 R15: ffffffff81cdc439
[    6.899408] FS:  0000000000000000(0000) GS:ffff88045b800000(0000) knlGS:0000000000000000
[    6.908575] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    6.915088] CR2: ffff88047e1ff000 CR3: 0000000001e0e000 CR4: 00000000001407f0
[    6.923160] Stack:
[    6.925487]  ffffffff8143c904 ffff88045b407e00 ffff006568636163 ffff006568636163
[    6.934113]  ffffffff8120a1a9 ffffffff81cdc43e 0000000000000007 0000000000000000
[    6.942747]  ffff880455a89da8 ffff006568636163 0000000000000007 ffffffff81cdc439
[    6.951382] Call Trace:
[    6.954197]  [<ffffffff8143c904>] ? vsnprintf+0x124/0x6f0
[    6.960323]  [<ffffffff8120a1a9>] ? __kmalloc_track_caller+0x169/0x360
[    6.967716]  [<ffffffff81440e1b>] kvasprintf+0x6b/0x80
[    6.973552]  [<ffffffff81432bf1>] kobject_set_name_vargs+0x21/0x70
[    6.980552]  [<ffffffff8143393d>] kobject_init_and_add+0x4d/0x90
[    6.987364]  [<ffffffff812067c9>] ? __kmalloc+0x169/0x370
[    6.993492]  [<ffffffff8102dbbc>] ? cache_add_dev+0x17c/0x4f0
[    7.000005]  [<ffffffff8102ddfa>] cache_add_dev+0x3ba/0x4f0
[    7.006327]  [<ffffffff821a87ca>] ? i8237A_init_ops+0x14/0x14
[    7.012842]  [<ffffffff821a87f8>] cache_sysfs_init+0x2e/0x61
[    7.019260]  [<ffffffff81002162>] do_one_initcall+0xf2/0x220
[    7.025679]  [<ffffffff810a4a29>] ? parse_args+0x2c9/0x450
[    7.031903]  [<ffffffff8219d1b1>] kernel_init_freeable+0x1c9/0x25b
[    7.038904]  [<ffffffff8219c8d2>] ? do_early_param+0x8a/0x8a
[    7.045322]  [<ffffffff8184d5e0>] ? rest_init+0x150/0x150
[    7.051447]  [<ffffffff8184d5ee>] kernel_init+0xe/0x100
[    7.057380]  [<ffffffff8187b87c>] ret_from_fork+0x7c/0xb0
[    7.063503]  [<ffffffff8184d5e0>] ? rest_init+0x150/0x150
[    7.069628] Code: 89 e5 53 48 89 fb 75 16 80 7f 3c 00 75 05 e8 d2 f9 ff ff 48 8b 43 58 48 2b 43 50 88 43 4e 5b 5d c3 90 90 90 90 48 89 f8 48 89 d1 <f3> a4 c3 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 20 4c 8b 06 4c 8b
[    7.094960] RIP  [<ffffffff8143eea6>] memcpy+0x6/0x110
[    7.100856]  RSP <ffff880455a89cc8>
[    7.104864] ---[ end trace b5d3fdc6c6c28083 ]---
[    7.110142] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[    7.110142]
[    7.120540] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff)

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
2014-03-04 17:50:59 +01:00
Laurent Pinchart
07a0203021 iommu/omap: Allocate archdata on the fly for DT-based devices
The OMAP IOMMU driver locates the IOMMU associated to a device using the
IOMMU name stored in the device archdata iommu field. That field is
expected to be populated by platform code and is left unset for DT-based
devices. This results in a crash when the IOMMU driver attaches a domain
to a device.

Fix this by allocating the archdata iommu structure when devices are
added and freeing when they are removed. Devices without an OF node, and
devices without an iommus property in their OF node are ignored. The
iommu name is initialized from the IOMMU device node name.

This should be simplified when removing non-DT support completely from
the IOMMU users as the IOMMU name won't be needed anymore, and the
IOMMU device pointer could then be stored in the archdata iommu field
directly.

Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
[s-anna@ti.com: updated to use device name instead of OF name]
Signed-off-by: Suman Anna <s-anna@ti.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
2014-03-04 17:02:43 +01:00
Suman Anna
b148d5fb2e iommu/omap: Enable bus-error back on supported iommus
The remoteproc MMUs in OMAP4+ SoCs have some additional debug
registers that can give out the PC value in addition to the
MMU fault address. The PC value can be extracted properly only
on the DSP cores, and is not available on the ARM processors
within the IPU sub-systems. Instead, the MMUs have been enhanced
to throw a bus-error response back to the IPU processors.

This functionality is programmable through the MMU_GP_REG register.
The cores are simply stalled if the MMU_GP_REG.BUS_ERR_BACK_EN bit
is not set. When set, a bus-error exception is raised allowing the
processor to handle it as a bus fault and provide additional debug
information. This feature is turned on by default by the driver on
iommus supporting it.

Signed-off-by: Subramaniam Chanderashekarapuram <subramaniam.ca@ti.com>
Signed-off-by: Suman Anna <s-anna@ti.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
2014-03-04 17:02:08 +01:00
Florian Vaussard
3c92748df9 iommu/omap: Add devicetree support
As OMAP2+ is moving to a full DT boot for all SoC families, commit
7ce93f3 "ARM: OMAP2+: Fix more missing data for omap3.dtsi file"
adds basic DT bits for OMAP3. But the driver is not yet converted,
so this will not work and driver will not be probed. Convert it!

The legacy boot mode is still supported until OMAP3 is converted
to DT-boot only.

Signed-off-by: Florian Vaussard <florian.vaussard@epfl.ch>
[s-anna@ti.com: dev_name adaptation and improved error checking]
Signed-off-by: Suman Anna <s-anna@ti.com>
[tony@atomide.com: Ack for arch/arm/*omap* parts]
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
2014-03-04 17:01:57 +01:00
Florian Vaussard
90e569c4ca iommu/omap: Allow enable/disable even without pdata
When booting with a devicetree, no platform data is provided.
Do not prematurely exit iommu_enable() and iommu_disable() in
such a case.

Note: As OMAP do not yet has a proper reset controller driver,
IOMMUs requiring a reset signal should use pdata-quirks as a
transitional solution.

Signed-off-by: Florian Vaussard <florian.vaussard@epfl.ch>
Signed-off-by: Suman Anna <s-anna@ti.com>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
2014-03-04 17:01:44 +01:00
Suman Anna
7ee08b9ef2 iommu/omap: Fix error return paths in omap_iommu_attach()
There are couple of issues with the error return paths in
omap_iommu_attach():
1. omap_iommu_attach() returns NULL or ERR_PTR in case of error,
   but omap_iommu_attach_dev() only checks for IS_ERR. Thus a NULL
   return value (in case driver_find_device fails) will cause the
   kernel to panic when omap_iommu_attach_dev() dereferences the
   pointer.
2. A try_module_get() failure returns a valid success value as
   returned from iommu_enable().

Both the above issues have been fixed up to return the proper
ERR_PTR.

Signed-off-by: Florian Vaussard <florian.vaussard@epfl.ch>
Signed-off-by: Suman Anna <s-anna@ti.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
2014-03-04 17:01:24 +01:00
Suman Anna
f129b3dfb5 iommu/omap: Convert to devm_* interfaces
Use the various devm_ interfaces to simplify the cleanup in
probe and remove functions.

Signed-off-by: Florian Vaussard <florian.vaussard@epfl.ch>
Signed-off-by: Suman Anna <s-anna@ti.com>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
2014-03-04 17:00:58 +01:00
Paul Bolle
b835443907 iommu/shmobile: Depend on ARCH_SHMOBILE
Commit 78a2e12f51d9 ("iommu: shmobile: Enable driver compilation with
COMPILE_TEST") added an optional dependency on SH_MOBILE. But that
Kconfig symbol doesn't exist. It seems ARCH_SHMOBILE was intended. Use
that.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
2014-03-04 16:35:33 +01:00
Jay Cornwall
e8d2d82d4a iommu/amd: Fix PASID format in INVALIDATE_IOTLB_PAGES command
This patch corrects the PASID format in the INVALIDATE_IOTLB_PAGES
command, which was caused by incorrect information in
the AMD IOMMU Architectural Specification v2.01 document.

    Incorrect format:
         cmd->data[0][16:23] = PASID[7:0]
         cmd->data[1][16:27] = PASID[19:8]

     Correct format:
         cmd->data[0][16:23] = PASID[15:8]
         cmd->data[1][16:23] = PASID[7:0]

However, this does not affect the IOMMUv2 hardware implementation,
and has been corrected since version 2.02 of the specification
(available through AMD NDA).

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
2014-03-04 15:10:17 +01:00
Joerg Roedel
dc03753b98 Merge branch 'for-joerg/arm-smmu/updates' of git://git.kernel.org/pub/scm/linux/kernel/git/will/linux into arm/smmu 2014-03-04 14:54:27 +01:00
Marek Szyprowski
68efd7d2fb arm: dma-mapping: remove order parameter from arm_iommu_create_mapping()
The 'order' parameter for IOMMU-aware dma-mapping implementation was
introduced mainly as a hack to reduce size of the bitmap used for
tracking IO virtual address space. Since now it is possible to dynamically
resize the bitmap, this hack is not needed and can be removed without any
impact on the client devices. This way the parameters for
arm_iommu_create_mapping() becomes much easier to understand. 'size'
parameter now means the maximum supported IO address space size.

The code will allocate (resize) bitmap in chunks, ensuring that a single
chunk is not larger than a single memory page to avoid unreliable
allocations of size larger than PAGE_SIZE in atomic context.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
2014-02-28 11:55:18 +01:00
Will Deacon
34fb4b37b7 iommu/arm-smmu: fix incorrect comment regarding TLB invalidation
Commit 1463fe44fd0f ("iommu/arm-smmu: Don't use VMIDs for stage-1
translations") moved our TLB invalidation from context creation time to
context destruction time, but forgot to update an associated comment.

This patch fixes the broken comment.

Signed-off-by: Will Deacon <will.deacon@arm.com>
2014-02-27 19:08:43 +00:00
Joe Perches
ff3a2b73b7 drivers/iommu/omap-iommu-debug.c: fix decimal permissions
These should have been octal.

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Hiroshi DOYU <Hiroshi.DOYU@nokia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-02-25 15:25:42 -08:00
Will Deacon
3aa80ea4c9 iommu/arm-smmu: provide option to dsb macro when publishing tables
On coherent systems, publishing new page tables to the SMMU walker is
achieved with a dsb instruction. In fact, this can be a dsb(ishst) which
also provides the mandatory barrier option for arm64.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2014-02-24 19:09:46 +00:00
Will Deacon
b410aed932 iommu/arm-smmu: clean up use of `flags' in page table handling code
Commit 972157cac528 ("arm/smmu: Use irqsafe spinlock for domain lock")
fixed our page table locks to be the irq{save,restore} variants, since
the DMA mapping API can be invoked from interrupt context.

This patch cleans up our use of the flags variable so we can distinguish
between IRQ flags (now `flags') and pte protection bits (now `prot').

Signed-off-by: Will Deacon <will.deacon@arm.com>
2014-02-24 19:09:45 +00:00
Andreas Herrmann
3a5df8ff35 iommu/arm-smmu: support buggy implementations with secure cfg accesses
In such a case we have to use secure aliases of some non-secure
registers.

This handling is switched on by DT property
"calxeda,smmu-secure-config-access" for an SMMU node.

Signed-off-by: Andreas Herrmann <andreas.herrmann@calxeda.com>
[will: merged with driver option handling patch]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2014-02-24 19:09:44 +00:00
Andreas Herrmann
636e97b0fb iommu/arm-smmu: set MAX_MASTER_STREAMIDS to MAX_PHANDLE_ARGS
The DT parsing code that determines stream IDs uses
of_parse_phandle_with_args and thus MAX_MASTER_STREAMIDS
is always bound by MAX_PHANDLE_ARGS.

Signed-off-by: Andreas Herrmann <andreas.herrmann@calxeda.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2014-02-24 19:09:43 +00:00
Joerg Roedel
972157cac5 arm/smmu: Use irqsafe spinlock for domain lock
As the lock might be used through DMA-API which is allowed
in interrupt context.

Signed-off-by: Joerg Roedel <joro@8bytes.org>
Acked-by: Will Deacon <will.deacon@arm.com>
2014-02-20 13:04:47 +01:00
Bjorn Helgaas
4b180d97b1 iommu/amd: Add include of <linux/irqreturn.h>
We currently include <linux/irqreturn.h> in <linux/pci.h>, but I'm about to
remove that from linux/pci.h, so add explicit includes where needed.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-02-19 11:28:44 -07:00
Will Deacon
d123cf82d3 iommu/arm-smmu: fix compilation issue when !CONFIG_ARM_AMBA
If !CONFIG_ARM_AMBA, we shouldn't try to register ourselves with the
amba_bustype.

Signed-off-by: Will Deacon <will.deacon@arm.com>
2014-02-10 17:02:27 +00:00
Will Deacon
57ca90f680 iommu/arm-smmu: set CBARn.BPSHCFG to NSH for s1-s2-bypass contexts
Whilst trying to bring-up an SMMUv2 implementation with the table
walker plumbed into a coherent interconnect, I noticed that the memory
transactions targetting the CPU caches from the SMMU were marked as
outer-shareable instead of inner-shareable.

After a bunch of digging, it seems that we actually need to program
CBARn.BPSHCFG for s1-s2-bypass contexts to act as non-shareable in order
for the shareability configured in the corresponding TTBCR not to be
overridden with an outer-shareable attribute.

Cc: <stable@vger.kernel.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2014-02-10 17:02:23 +00:00
Will Deacon
6dd35f45b8 iommu/arm-smmu: fix table flushing during initial allocations
Now that we populate page tables as we traverse them ("iommu/arm-smmu:
fix pud/pmd entry fill sequence"), we need to ensure that we flush out
our zeroed tables after initial allocation, to prevent speculative TLB
fills using bogus data.

This patch adds additional calls to arm_smmu_flush_pgtable during
initial table allocation, and moves the dsb required by coherent table
walkers into the helper.

Cc: <stable@vger.kernel.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2014-02-10 17:02:17 +00:00
Will Deacon
c9d09e2748 iommu/arm-smmu: really fix page table locking
Commit a44a9791e778 ("iommu/arm-smmu: use mutex instead of spinlock for
locking page tables") replaced the page table spinlock with a mutex, to
allow blocking allocations to satisfy lazy mapping requests.

Unfortunately, it turns out that IOMMU mappings are created from atomic
context (e.g. spinlock held during a dma_map), so this change doesn't
really help us in practice.

This patch is a partial revert of the offending commit, bringing back
the original spinlock but replacing our page table allocations for any
levels below the pgd (which is allocated during domain init) with
GFP_ATOMIC instead of GFP_KERNEL.

Cc: <stable@vger.kernel.org>
Reported-by: Andreas Herrmann <andreas.herrmann@calxeda.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2014-02-10 17:00:49 +00:00
Yifan Zhang
97a644208d iommu/arm-smmu: fix pud/pmd entry fill sequence
The ARM SMMU driver's population of puds and pmds is broken, since we
iterate over the next level of table repeatedly setting the current
level descriptor to point at the pmd being initialised. This is clearly
wrong when dealing with multiple pmds/puds.

This patch fixes the problem by moving the pud/pmd population out of the
loop and instead performing it when we allocate the next level (like we
correctly do for ptes already). The starting address for the next level
is then calculated prior to entering the loop.

Cc: <stable@vger.kernel.org>
Signed-off-by: Yifan Zhang <zhangyf@marvell.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2014-02-10 17:00:47 +00:00
Linus Torvalds
b3a4bcaa5a IOMMU Updates for Linux v3.14
A few patches have been queued up for this merge window:
 
 	* Improvements for the ARM-SMMU driver
 	  (IOMMU_EXEC support, IOMMU group support)
 	* Updates and fixes for the shmobile IOMMU driver
 	* Various fixes to generic IOMMU code and the
 	  Intel IOMMU driver
 	* Some cleanups in IOMMU drivers (dev_is_pci() usage)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJS6XrCAAoJECvwRC2XARrjgy4P/itemtg2U+603Ldje8WcPo0E
 OCO/0VVSmCTYKUJDZY0hiVwqmhe5gFL3Hm/gGwkS0+UJenFXMmi+aVaPp4pCpgH+
 dL2HD3dIEvi14bisrdxG/8MdR6mIx0qzKtnZLkKSR4LXwucyLHvC/DaCoOytb7Yk
 7s+eEuo0hj0jAkiqSG/zLEtKElTEnoAAkLOjMy46orecJ5q4HusPZekLtWZs2ETe
 x3NS63Unb9g1iSQJWIA7HnQlxWIr2+iynoamHHJRiVFzqRF0W0sGvQY3auG0DSCn
 70WRNE1rKfEkfXMJxosRQ4394YUQdAkt8MBENNcJcC6E1n5PBi0cEZXH6mCnEIlG
 jXzIKUY9fz68ZboaqIxXv4Hb+JLlPXCvPBvQzIQiKRgVxd8nncEjn5I9MHdf+je5
 BmJlzJLJvP4cFvW8Hc8k2Oq101b1kEcSCLARWWvE9/bk9xIUyrqBkR4XjC0vb6qq
 1HbKVdZ7KFKCkBHy9xMpr7CUjKiDiiLeUmqlhyjcK9spicuNIZQnC11HemL6/USP
 oR6Ext9RGhvz+ch656+5+L6f6FURVP8/ywKiJ3RjmvXV5/fCYo3WMitOB2qzlWCy
 SYXAczAOMOdOo+1Dxbghrr+7HzUWPqgfPmntZEPGMZhfuZ6xXr+7pGLjAhHb4vcR
 SZxqkDo1cprqrR9KFAWC
 =YKLk
 -----END PGP SIGNATURE-----

Merge tag 'iommu-updates-v3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull IOMMU Updates from Joerg Roedel:
 "A few patches have been queued up for this merge window:

   - improvements for the ARM-SMMU driver (IOMMU_EXEC support, IOMMU
     group support)
   - updates and fixes for the shmobile IOMMU driver
   - various fixes to generic IOMMU code and the Intel IOMMU driver
   - some cleanups in IOMMU drivers (dev_is_pci() usage)"

* tag 'iommu-updates-v3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (36 commits)
  iommu/vt-d: Fix signedness bug in alloc_irte()
  iommu/vt-d: free all resources if failed to initialize DMARs
  iommu/vt-d, trivial: clean sparse warnings
  iommu/vt-d: fix wrong return value of dmar_table_init()
  iommu/vt-d: release invalidation queue when destroying IOMMU unit
  iommu/vt-d: fix access after free issue in function free_dmar_iommu()
  iommu/vt-d: keep shared resources when failed to initialize iommu devices
  iommu/vt-d: fix invalid memory access when freeing DMAR irq
  iommu/vt-d, trivial: simplify code with existing macros
  iommu/vt-d, trivial: use defined macro instead of hardcoding
  iommu/vt-d: mark internal functions as static
  iommu/vt-d, trivial: clean up unused code
  iommu/vt-d, trivial: check suitable flag in function detect_intel_iommu()
  iommu/vt-d, trivial: print correct domain id of static identity domain
  iommu/vt-d, trivial: refine support of 64bit guest address
  iommu/vt-d: fix resource leakage on error recovery path in iommu_init_domains()
  iommu/vt-d: fix a race window in allocating domain ID for virtual machines
  iommu/vt-d: fix PCI device reference leakage on error recovery path
  drm/msm: Fix link error with !MSM_IOMMU
  iommu/vt-d: use dedicated bitmap to track remapping entry allocation status
  ...
2014-01-29 20:00:13 -08:00
Linus Torvalds
09da8dfa98 ACPI and power management updates for 3.14-rc1
- ACPI core changes to make it create a struct acpi_device object for every
    device represented in the ACPI tables during all namespace scans regardless
    of the current status of that device.  In accordance with this, ACPI hotplug
    operations will not delete those objects, unless the underlying ACPI tables
    go away.
 
  - On top of the above, new sysfs attribute for ACPI device objects allowing
    user space to check device status by triggering the execution of _STA for
    its ACPI object.  From Srinivas Pandruvada.
 
  - ACPI core hotplug changes reducing code duplication, integrating the
    PCI root hotplug with the core and reworking container hotplug.
 
  - ACPI core simplifications making it use ACPI_COMPANION() in the code
    "glueing" ACPI device objects to "physical" devices.
 
  - ACPICA update to upstream version 20131218.  This adds support for the
    DBG2 and PCCT tables to ACPICA, fixes some bugs and improves debug
    facilities.  From Bob Moore, Lv Zheng and Betty Dall.
 
  - Init code change to carry out the early ACPI initialization earlier.
    That should allow us to use ACPI during the timekeeping initialization
    and possibly to simplify the EFI initialization too.  From Chun-Yi Lee.
 
  - Clenups of the inclusions of ACPI headers in many places all over from
    Lv Zheng and Rashika Kheria (work in progress).
 
  - New helper for ACPI _DSM execution and rework of the code in drivers
    that uses _DSM to execute it via the new helper.  From Jiang Liu.
 
  - New Win8 OSI blacklist entries from Takashi Iwai.
 
  - Assorted ACPI fixes and cleanups from Al Stone, Emil Goode, Hanjun Guo,
    Lan Tianyu, Masanari Iida, Oliver Neukum, Prarit Bhargava, Rashika Kheria,
    Tang Chen, Zhang Rui.
 
  - intel_pstate driver updates, including proper Baytrail support, from
    Dirk Brandewie and intel_pstate documentation from Ramkumar Ramachandra.
 
  - Generic CPU boost ("turbo") support for cpufreq from Lukasz Majewski.
 
  - powernow-k6 cpufreq driver fixes from Mikulas Patocka.
 
  - cpufreq core fixes and cleanups from Viresh Kumar, Jane Li, Mark Brown.
 
  - Assorted cpufreq drivers fixes and cleanups from Anson Huang, John Tobias,
    Paul Bolle, Paul Walmsley, Sachin Kamat, Shawn Guo, Viresh Kumar.
 
  - cpuidle cleanups from Bartlomiej Zolnierkiewicz.
 
  - Support for hibernation APM events from Bin Shi.
 
  - Hibernation fix to avoid bringing up nonboot CPUs with ACPI EC disabled
    during thaw transitions from Bjørn Mork.
 
  - PM core fixes and cleanups from Ben Dooks, Leonardo Potenza, Ulf Hansson.
 
  - PNP subsystem fixes and cleanups from Dmitry Torokhov, Levente Kurusa,
    Rashika Kheria.
 
  - New tool for profiling system suspend from Todd E Brandt and a cpupower
    tool cleanup from One Thousand Gnomes.
 
 /
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJS3a1eAAoJEILEb/54YlRxnTgP/iGawvgjKWm6Qqp7WSIvd5gQ
 zZ6q75C6Pc/W2fq1+OzVGnpCF8WYFy+nFDAXOvUHjIXuoxSwFcuW5l4aMckgl/0a
 TXEWe9MJrCHHRfDApfFacCJ44U02bjJAD5vTyL/hKA+IHeinq4WCSojryYC+8jU0
 cBrUIV0aNH8r5JR2WJNAyv/U29rXsDUOu0I4qTqZ4YaZT6AignMjtLXn1e9AH1Pn
 DPZphTIo/HMnb+kgBOjt4snMk+ahVO9eCOxh/hH8ecnWExw9WynXoU5Nsna0tSZs
 ssyHC7BYexD3oYsG8D52cFUpp4FCsJ0nFQNa2kw0LY+0FBNay43LySisKYHZPXEs
 2WpESDv+/t7yhtnrvM+TtA7aBheKm2XMWGFSu/aERLE17jIidOkXKH5Y7ryYLNf/
 uyRKxNS0NcZWZ0G+/wuY02jQYNkfYz3k/nTr8BAUItRBjdporGIRNEnR9gPzgCUC
 uQhjXWMPulqubr8xbyefPWHTEzU2nvbXwTUWGjrBxSy8zkyy5arfqizUj+VG6afT
 NsboANoMHa9b+xdzigSFdA3nbVK6xBjtU6Ywntk9TIpODKF5NgfARx0H+oSH+Zrj
 32bMzgZtHw/lAbYsnQ9OnTY6AEWQYt6NMuVbTiLXrMHhM3nWwfg/XoN4nZqs6jPo
 IYvE6WhQZU6L6fptGHFC
 =dRf6
 -----END PGP SIGNATURE-----

Merge tag 'pm+acpi-3.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI and power management updates from Rafael Wysocki:
 "As far as the number of commits goes, the top spot belongs to ACPI
  this time with cpufreq in the second position and a handful of PM
  core, PNP and cpuidle updates.  They are fixes and cleanups mostly, as
  usual, with a couple of new features in the mix.

  The most visible change is probably that we will create struct
  acpi_device objects (visible in sysfs) for all devices represented in
  the ACPI tables regardless of their status and there will be a new
  sysfs attribute under those objects allowing user space to check that
  status via _STA.

  Consequently, ACPI device eject or generally hot-removal will not
  delete those objects, unless the table containing the corresponding
  namespace nodes is unloaded, which is extremely rare.  Also ACPI
  container hotplug will be handled quite a bit differently and cpufreq
  will support CPU boost ("turbo") generically and not only in the
  acpi-cpufreq driver.

  Specifics:

   - ACPI core changes to make it create a struct acpi_device object for
     every device represented in the ACPI tables during all namespace
     scans regardless of the current status of that device.  In
     accordance with this, ACPI hotplug operations will not delete those
     objects, unless the underlying ACPI tables go away.

   - On top of the above, new sysfs attribute for ACPI device objects
     allowing user space to check device status by triggering the
     execution of _STA for its ACPI object.  From Srinivas Pandruvada.

   - ACPI core hotplug changes reducing code duplication, integrating
     the PCI root hotplug with the core and reworking container hotplug.

   - ACPI core simplifications making it use ACPI_COMPANION() in the
     code "glueing" ACPI device objects to "physical" devices.

   - ACPICA update to upstream version 20131218.  This adds support for
     the DBG2 and PCCT tables to ACPICA, fixes some bugs and improves
     debug facilities.  From Bob Moore, Lv Zheng and Betty Dall.

   - Init code change to carry out the early ACPI initialization
     earlier.  That should allow us to use ACPI during the timekeeping
     initialization and possibly to simplify the EFI initialization too.
     From Chun-Yi Lee.

   - Clenups of the inclusions of ACPI headers in many places all over
     from Lv Zheng and Rashika Kheria (work in progress).

   - New helper for ACPI _DSM execution and rework of the code in
     drivers that uses _DSM to execute it via the new helper.  From
     Jiang Liu.

   - New Win8 OSI blacklist entries from Takashi Iwai.

   - Assorted ACPI fixes and cleanups from Al Stone, Emil Goode, Hanjun
     Guo, Lan Tianyu, Masanari Iida, Oliver Neukum, Prarit Bhargava,
     Rashika Kheria, Tang Chen, Zhang Rui.

   - intel_pstate driver updates, including proper Baytrail support,
     from Dirk Brandewie and intel_pstate documentation from Ramkumar
     Ramachandra.

   - Generic CPU boost ("turbo") support for cpufreq from Lukasz
     Majewski.

   - powernow-k6 cpufreq driver fixes from Mikulas Patocka.

   - cpufreq core fixes and cleanups from Viresh Kumar, Jane Li, Mark
     Brown.

   - Assorted cpufreq drivers fixes and cleanups from Anson Huang, John
     Tobias, Paul Bolle, Paul Walmsley, Sachin Kamat, Shawn Guo, Viresh
     Kumar.

   - cpuidle cleanups from Bartlomiej Zolnierkiewicz.

   - Support for hibernation APM events from Bin Shi.

   - Hibernation fix to avoid bringing up nonboot CPUs with ACPI EC
     disabled during thaw transitions from Bjørn Mork.

   - PM core fixes and cleanups from Ben Dooks, Leonardo Potenza, Ulf
     Hansson.

   - PNP subsystem fixes and cleanups from Dmitry Torokhov, Levente
     Kurusa, Rashika Kheria.

   - New tool for profiling system suspend from Todd E Brandt and a
     cpupower tool cleanup from One Thousand Gnomes"

* tag 'pm+acpi-3.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (153 commits)
  thermal: exynos: boost: Automatic enable/disable of BOOST feature (at Exynos4412)
  cpufreq: exynos4x12: Change L0 driver data to CPUFREQ_BOOST_FREQ
  Documentation: cpufreq / boost: Update BOOST documentation
  cpufreq: exynos: Extend Exynos cpufreq driver to support boost
  cpufreq / boost: Kconfig: Support for software-managed BOOST
  acpi-cpufreq: Adjust the code to use the common boost attribute
  cpufreq: Add boost frequency support in core
  intel_pstate: Add trace point to report internal state.
  cpufreq: introduce cpufreq_generic_get() routine
  ARM: SA1100: Create dummy clk_get_rate() to avoid build failures
  cpufreq: stats: create sysfs entries when cpufreq_stats is a module
  cpufreq: stats: free table and remove sysfs entry in a single routine
  cpufreq: stats: remove hotplug notifiers
  cpufreq: stats: handle cpufreq_unregister_driver() and suspend/resume properly
  cpufreq: speedstep: remove unused speedstep_get_state
  platform: introduce OF style 'modalias' support for platform bus
  PM / tools: new tool for suspend/resume performance optimization
  ACPI: fix module autoloading for ACPI enumerated devices
  ACPI: add module autoloading support for ACPI enumerated devices
  ACPI: fix create_modalias() return value handling
  ...
2014-01-24 15:51:02 -08:00
Alex Williamson
08336fd218 intel-iommu: fix off-by-one in pagetable freeing
dma_pte_free_level() has an off-by-one error when checking whether a pte
is completely covered by a range.  Take for example the case of
attempting to free pfn 0x0 - 0x1ff, ie.  512 entries covering the first
2M superpage.

The level_size() is 0x200 and we test:

  static void dma_pte_free_level(...
	...

	if (!(0 > 0 || 0x1ff < 0 + 0x200)) {
		...
	}

Clearly the 2nd test is true, which means we fail to take the branch to
clear and free the pagetable entry.  As a result, we're leaking
pagetables and failing to install new pages over the range.

This was found with a PCI device assigned to a QEMU guest using vfio-pci
without a VGA device present.  The first 1M of guest address space is
mapped with various combinations of 4K pages, but eventually the range
is entirely freed and replaced with a 2M contiguous mapping.
intel-iommu errors out with something like:

  ERROR: DMA PTE for vPFN 0x0 already set (to 5c2b8003 not 849c00083)

In this case 5c2b8003 is the pointer to the previous leaf page that was
neither freed nor cleared and 849c00083 is the superpage entry that
we're trying to replace it with.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-01-21 16:19:41 -08:00
Rafael J. Wysocki
98feb7cc61 Merge branch 'acpi-cleanup'
* acpi-cleanup: (22 commits)
  ACPI / tables: Return proper error codes from acpi_table_parse() and fix comment.
  ACPI / tables: Check if id is NULL in acpi_table_parse()
  ACPI / proc: Include appropriate header file in proc.c
  ACPI / EC: Remove unused functions and add prototype declaration in internal.h
  ACPI / dock: Include appropriate header file in dock.c
  ACPI / PCI: Include appropriate header file in pci_link.c
  ACPI / PCI: Include appropriate header file in pci_slot.c
  ACPI / EC: Mark the function acpi_ec_add_debugfs() as static in ec_sys.c
  ACPI / NVS: Include appropriate header file in nvs.c
  ACPI / OSL: Mark the function acpi_table_checksum() as static
  ACPI / processor: initialize a variable to silence compiler warning
  ACPI / processor: use ACPI_COMPANION() to get ACPI device
  ACPI: correct minor typos
  ACPI / sleep: Drop redundant acpi_disabled check
  ACPI / dock: Drop redundant acpi_disabled check
  ACPI / table: Replace '1' with specific error return values
  ACPI: remove trailing whitespace
  ACPI / IBFT: Fix incorrect <acpi/acpi.h> inclusion in iSCSI boot firmware module
  ACPI / i915: Fix incorrect <acpi/acpi.h> inclusions via <linux/acpi_io.h>
  SFI / ACPI: Fix warnings reported during builds with W=1
  ...

Conflicts:
	drivers/acpi/nvs.c
	drivers/hwmon/asus_atk0110.c
2014-01-12 23:44:09 +01:00
Joerg Roedel
dd1a175695 Merge branches 'arm/smmu', 'core', 'x86/vt-d', 'arm/shmobile', 'x86/amd', 'ppc/pamu', 'iommu/fixes' and 'arm/msm' into next 2014-01-09 13:06:59 +01:00
Dan Carpenter
9f4c7448f4 iommu/vt-d: Fix signedness bug in alloc_irte()
"index" needs to be signed for the error handling to work.  I deleted a
little bit of obsolete cruft related to "index" and "start_index" as
well.

Fixes: 360eb3c5687e ('iommu/vt-d: use dedicated bitmap to track remapping entry allocation status')

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
2014-01-09 13:05:45 +01:00
Jiang Liu
9bdc531ec6 iommu/vt-d: free all resources if failed to initialize DMARs
Enhance intel_iommu_init() to free all resources if failed to
initialize DMAR hardware.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
2014-01-09 12:44:30 +01:00
Jiang Liu
b707cb027e iommu/vt-d, trivial: clean sparse warnings
Clean up most sparse warnings in Intel DMA and interrupt remapping
drivers.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
2014-01-09 12:44:16 +01:00
Jiang Liu
cc05301fd5 iommu/vt-d: fix wrong return value of dmar_table_init()
If dmar_table_init() fails to detect DMAR table on the first call,
it will return wrong result on following calls because it always
sets dmar_table_initialized no matter if succeeds or fails to
detect DMAR table.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
2014-01-09 12:43:45 +01:00
Jiang Liu
a84da70b7b iommu/vt-d: release invalidation queue when destroying IOMMU unit
Release associated invalidation queue when destroying IOMMU unit
to avoid memory leak.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
2014-01-09 12:43:43 +01:00
Jiang Liu
5ced12af69 iommu/vt-d: fix access after free issue in function free_dmar_iommu()
Function free_dmar_iommu() may access domain->iommu_lock by
	spin_unlock_irqrestore(&domain->iommu_lock, flags);
after freeing corresponding domain structure.

Sample stack dump:
[    8.912818] =========================
[    8.917072] [ BUG: held lock freed! ]
[    8.921335] 3.13.0-rc1-gerry+ #12 Not tainted
[    8.926375] -------------------------
[    8.930629] swapper/0/1 is freeing memory ffff880c23b56040-ffff880c23b5613f, with a lock still held there!
[    8.941675]  (&(&domain->iommu_lock)->rlock){......}, at: [<ffffffff81dc775c>] init_dmars+0x72c/0x95b
[    8.952582] 1 lock held by swapper/0/1:
[    8.957031]  #0:  (&(&domain->iommu_lock)->rlock){......}, at: [<ffffffff81dc775c>] init_dmars+0x72c/0x95b
[    8.968487]
[    8.968487] stack backtrace:
[    8.973602] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.13.0-rc1-gerry+ #12
[    8.981556] Hardware name: Intel Corporation LH Pass ........../SVRBD-ROW_T, BIOS SE5C600.86B.99.99.x059.091020121352 09/10/2012
[    8.994742]  ffff880c23b56040 ffff88042dd33c98 ffffffff815617fd ffff88042dd38b28
[    9.003566]  ffff88042dd33cd0 ffffffff810a977a ffff880c23b56040 0000000000000086
[    9.012403]  ffff88102c4923c0 ffff88042ddb4800 ffffffff81b1e8c0 ffff88042dd33d28
[    9.021240] Call Trace:
[    9.024138]  [<ffffffff815617fd>] dump_stack+0x4d/0x66
[    9.030057]  [<ffffffff810a977a>] debug_check_no_locks_freed+0x15a/0x160
[    9.037723]  [<ffffffff811aa1c2>] kmem_cache_free+0x62/0x5b0
[    9.044225]  [<ffffffff81465e27>] domain_exit+0x197/0x1c0
[    9.050418]  [<ffffffff81dc7788>] init_dmars+0x758/0x95b
[    9.056527]  [<ffffffff81dc7dfa>] intel_iommu_init+0x351/0x438
[    9.063207]  [<ffffffff81d8a711>] ? iommu_setup+0x27d/0x27d
[    9.069601]  [<ffffffff81d8a739>] pci_iommu_init+0x28/0x52
[    9.075910]  [<ffffffff81000342>] do_one_initcall+0x122/0x180
[    9.082509]  [<ffffffff81077738>] ? parse_args+0x1e8/0x320
[    9.088815]  [<ffffffff81d850e8>] kernel_init_freeable+0x1e1/0x26c
[    9.095895]  [<ffffffff81d84833>] ? do_early_param+0x88/0x88
[    9.102396]  [<ffffffff8154f580>] ? rest_init+0xd0/0xd0
[    9.108410]  [<ffffffff8154f58e>] kernel_init+0xe/0x130
[    9.114423]  [<ffffffff81574a2c>] ret_from_fork+0x7c/0xb0
[    9.120612]  [<ffffffff8154f580>] ? rest_init+0xd0/0xd0

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
2014-01-09 12:43:42 +01:00