/*
 * PCI Express PCI Hot Plug Driver
 *
 * Copyright (C) 1995,2001 Compaq Computer Corporation
 * Copyright (C) 2001 Greg Kroah-Hartman (greg@kroah.com)
 * Copyright (C) 2001 IBM Corp.
 * Copyright (C) 2003-2004 Intel Corporation
 *
 * All rights reserved.
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or (at
 * your option) any later version.
 *
 * This program is distributed in the hope that it will be useful, but
 * WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or
 * NON INFRINGEMENT.  See the GNU General Public License for more
 * details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
 *
 * Send feedback to <greg@kroah.com>, <kristen.c.accardi@intel.com>
 *
 */
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/types.h>
#include <linux/signal.h>
#include <linux/jiffies.h>
#include <linux/timer.h>
#include <linux/pci.h>
#include <linux/interrupt.h>
#include <linux/time.h>
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the followings.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-24 17:04:11 +09:00
#include <linux/slab.h>

#include "../pci.h"
#include "pciehp.h"

static inline struct pci_dev *ctrl_dev(struct controller *ctrl)
{
	return ctrl->pcie->port;
}

static irqreturn_t pcie_isr(int irq, void *dev_id);
static void start_int_poll_timer(struct controller *ctrl, int sec);

/* This is the interrupt polling timeout function. */
static void int_poll_timeout(unsigned long data)
{
	struct controller *ctrl = (struct controller *)data;

	/* Poll for interrupt events by calling the ISR directly */
	pcie_isr(0, ctrl);

	init_timer(&ctrl->poll_timer);
	if (!pciehp_poll_time)
		pciehp_poll_time = 2; /* default polling interval is 2 sec */

	start_int_poll_timer(ctrl, pciehp_poll_time);
}

/* This function starts the interrupt polling timer. */
static void start_int_poll_timer(struct controller *ctrl, int sec)
{
	/* Clamp to sane value */
	if ((sec <= 0) || (sec > 60))
		sec = 2;

	ctrl->poll_timer.function = &int_poll_timeout;
	ctrl->poll_timer.data = (unsigned long)ctrl;
	ctrl->poll_timer.expires = jiffies + sec * HZ;
	add_timer(&ctrl->poll_timer);
}
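
/*
 * Illustrative sketch, not driver code: the range check above does not
 * clamp to the nearest bound. Any request outside 1..60 seconds falls
 * back to the 2 second default. The helper name poll_interval_sec() is
 * hypothetical and exists only to make that behaviour testable in
 * user space.
 */

```c
/* Hypothetical stand-in for the range check in start_int_poll_timer(). */
static int poll_interval_sec(int sec)
{
	/* Out-of-range requests fall back to 2 sec, not to 1 or 60. */
	if (sec <= 0 || sec > 60)
		return 2;
	return sec;
}
```

/*
 * With this rule a request for 61 seconds yields 2, not 60, while 60
 * itself is accepted unchanged.
 */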

static inline int pciehp_request_irq(struct controller *ctrl)
{
	int retval, irq = ctrl->pcie->irq;

	/* Install interrupt polling timer. Start with 10 sec delay */
	if (pciehp_poll_mode) {
		init_timer(&ctrl->poll_timer);
		start_int_poll_timer(ctrl, 10);
		return 0;
	}

	/* Installs the interrupt handler */
	retval = request_irq(irq, pcie_isr, IRQF_SHARED, MY_NAME, ctrl);
	if (retval)
		ctrl_err(ctrl, "Cannot get irq %d for the hotplug controller\n",
			 irq);
	return retval;
}

static inline void pciehp_free_irq(struct controller *ctrl)
{
	if (pciehp_poll_mode)
		del_timer_sync(&ctrl->poll_timer);
	else
		free_irq(ctrl->pcie->irq, ctrl);
}

PCI: pciehp: Compute timeout from hotplug command start time
If we issue a hotplug command, go do something else, then come back and
wait for the command to complete, we don't have to wait the whole timeout
period, because some of it elapsed while we were doing something else.
Keep track of the time we issued the command, and wait only until the
timeout period from that point has elapsed.
For controllers with errata like Intel CF118, we previously timed out
before issuing the second hotplug command:
At time T1 (during boot):
- Write DLLSCE, ABPE, PDCE, etc. to Slot Control
At time T2 (hotplug event):
- Wait for command completion (CC) in Slot Status
- Timeout at T2 + 1 second because CC is never set in Slot Status
- Write PCC, PIC, etc. to Slot Control
With this change, we wait until T1 + 1 second instead of T2 + 1 second.
If the hotplug event is more than 1 second after the boot-time
initialization, we won't wait for the timeout at all.
We still emit a "Timeout on hotplug command" message if it timed out; we
should see this on the first hotplug event on every controller with this
erratum, as well as on real errors on controllers without the erratum.
Link: http://www.intel.com/content/www/us/en/processors/xeon/xeon-e7-v2-spec-update.html
Tested-by: Rajat Jain <rajatxjain@gmail.com> (IDT 807a controller)
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
2014-06-14 09:55:49 -06:00
static int pcie_poll_cmd(struct controller *ctrl, int timeout)
{
	struct pci_dev *pdev = ctrl_dev(ctrl);
	u16 slot_status;

	while (true) {
		pcie_capability_read_word(pdev, PCI_EXP_SLTSTA, &slot_status);
PCI: pciehp: Handle invalid data when reading from non-existent devices
It's platform-dependent, but an MMIO read to a non-existent PCI device
generally returns data with all bits set. This happens when the host
bridge or Root Complex times out waiting for a response from the device and
fabricates return data to complete the CPU's read.
One example, reported in the bugzilla below, involved this hierarchy:
pci 0000:00:1c.0: PCI bridge to [bus 02-3a] Root Port
pci 0000:02:00.0: PCI bridge to [bus 03-0a] Upstream Port
pci 0000:03:03.0: PCI bridge to [bus 05-07] Downstream Port
pci 0000:05:00.0: PCI bridge to [bus 06-07] Thunderbolt Upstream Port
pci 0000:06:00.0: PCI bridge to [bus 07] Thunderbolt Downstream Port
pci 0000:07:00.0: BCM57762 NIC
Unplugging the Thunderbolt switch and the NIC below it resulted in this:
pciehp 0000:03:03.0: Surprise Removal
tg3 0000:07:00.0: tg3_abort_hw timed out, TX_MODE_ENABLE will not clear MAC_TX_MODE=ffffffff
pciehp 0000:06:00.0: unloading service driver pciehp
pciehp 0000:06:00.0: pcie_isr: intr_loc 11f
pciehp 0000:06:00.0: Switch interrupt received
pciehp 0000:06:00.0: Latch open on Slot
pciehp 0000:06:00.0: Attention button interrupt received
pciehp 0000:06:00.0: Button pressed on Slot
pciehp 0000:06:00.0: Presence/Notify input change
pciehp 0000:06:00.0: Card present on Slot
pciehp 0000:06:00.0: Power fault interrupt received
pciehp 0000:06:00.0: Data Link Layer State change
pciehp 0000:06:00.0: Link Up event
The pciehp driver correctly noticed that the Thunderbolt switch (05:00.0
and 06:00.0) and NIC (07:00.0) had been removed, and it called their driver
remove methods.
Since the NIC was already gone, tg3 received 0xffffffff when it tried to
read from the device. The resulting timeout is a tg3 issue and not of
interest here.
Similarly, since the 06:00.0 Thunderbolt switch was already gone,
pcie_isr() received 0xffff when it tried to read PCI_EXP_SLTSTA, and pciehp
thought that was valid status showing that many events had happened: the
latch had been opened, the attention button had been pressed, a card was
now present, and the link was now up. These are all wrong, of course, but
pciehp went on to try to power up and enumerate devices below the
non-existent bridge:
pciehp 0000:06:00.0: PCI slot - powering on due to button press
pciehp 0000:06:00.0: Surprise Insertion
pci 0000:07:00.0 id reading try 50 times with interval 20 ms to get ffffffff
[bhelgaas: changelog, also check in pcie_poll_cmd() & pcie_do_write_cmd()]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=99841
Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2015-07-21 12:25:30 -04:00
		if (slot_status == (u16) ~0) {
			ctrl_info(ctrl, "%s: no response from device\n",
				  __func__);
			return 0;
		}

		if (slot_status & PCI_EXP_SLTSTA_CC) {
			pcie_capability_write_word(pdev, PCI_EXP_SLTSTA,
						   PCI_EXP_SLTSTA_CC);
			return 1;
		}
		if (timeout < 0)
			break;
		msleep(10);
		timeout -= 10;
	}
	return 0;	/* timeout */
}
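
/*
 * Illustrative sketch, not driver code: the loop in pcie_poll_cmd()
 * always reads the status at least once, gives up early when the read
 * returns all ones, and otherwise spends its budget in 10 ms steps.
 * The user-space version below keeps only that control flow. The names
 * read_status_fn, poll_for_completion(), struct countdown and
 * countdown_read() are hypothetical, the sleep is replaced by plain
 * bookkeeping, and the write that clears the Command Completed bit is
 * omitted.
 */

```c
#include <stdint.h>

#define STATUS_CC    0x0010u	/* same bit value as PCI_EXP_SLTSTA_CC */
#define POLL_STEP_MS 10

/* Hypothetical status source standing in for the Slot Status read. */
typedef uint16_t (*read_status_fn)(void *ctx);

/*
 * Returns 1 if the completion bit was seen, 0 on timeout or when the
 * read returns all ones (device no longer responding).
 */
static int poll_for_completion(read_status_fn read_status, void *ctx,
			       int timeout_ms)
{
	for (;;) {
		uint16_t status = read_status(ctx);

		if (status == (uint16_t)~0u)
			return 0;	/* no response from device */
		if (status & STATUS_CC)
			return 1;
		if (timeout_ms < 0)
			break;
		/* The driver sleeps 10 ms here; the sketch only counts. */
		timeout_ms -= POLL_STEP_MS;
	}
	return 0;	/* timeout */
}

/* Example source: reports completion once reads_left reaches zero. */
struct countdown {
	int reads_left;
};

static uint16_t countdown_read(void *ctx)
{
	struct countdown *c = ctx;

	return (--c->reads_left <= 0) ? STATUS_CC : 0;
}
```

/*
 * Because the budget is checked only after the status read, even a
 * budget that has already run out still results in one read, so a
 * pending completion bit is not missed.
 */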

static void pcie_wait_cmd(struct controller *ctrl)
{
	unsigned int msecs = pciehp_poll_mode ? 2500 : 1000;
	unsigned long duration = msecs_to_jiffies(msecs);
	unsigned long cmd_timeout = ctrl->cmd_started + duration;
	unsigned long now, timeout;
	int rc;

	/*
	 * If the controller does not generate notifications for command
	 * completions, we never need to wait between writes.
	 */
	if (NO_CMD_CMPL(ctrl))
		return;

	if (!ctrl->cmd_busy)
		return;

	/*
	 * Even if the command has already timed out, we want to call
	 * pcie_poll_cmd() so it can clear PCI_EXP_SLTSTA_CC.
	 */
	now = jiffies;
	if (time_before_eq(cmd_timeout, now))
		timeout = 1;
	else
		timeout = cmd_timeout - now;

	if (ctrl->slot_ctrl & PCI_EXP_SLTCTL_HPIE &&
	    ctrl->slot_ctrl & PCI_EXP_SLTCTL_CCIE)
		rc = wait_event_timeout(ctrl->queue, !ctrl->cmd_busy, timeout);
	else
		rc = pcie_poll_cmd(ctrl, jiffies_to_msecs(timeout));

	/*
	 * Controllers with errata like Intel CF118 don't generate
	 * completion notifications unless the power/indicator/interlock
	 * control bits are changed.  On such controllers, we'll emit this
	 * timeout message when we wait for completion of commands that
	 * don't change those bits, e.g., commands that merely enable
	 * interrupts.
	 */
	if (!rc)
		ctrl_info(ctrl, "Timeout on hotplug command %#06x (issued %u msec ago)\n",
			  ctrl->slot_ctrl,
			  jiffies_to_msecs(jiffies - ctrl->cmd_started));
}
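
/*
 * Illustrative sketch, not driver code: pcie_wait_cmd() measures its
 * timeout from the moment the command was issued, and it never passes
 * a zero timeout on, so the status is still checked once after the
 * deadline. The user-space version below reproduces that arithmetic on
 * a 32-bit tick counter. The names tick_before_eq() and
 * remaining_ticks() are hypothetical; the signed-difference comparison
 * is assumed to behave like the kernel's time_before_eq() and relies on
 * the usual two's-complement conversion.
 */

```c
#include <stdint.h>

/* Wraparound-safe "a is before or equal to b" on a free-running counter. */
static int tick_before_eq(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) <= 0;
}

/*
 * Ticks still to wait for a command issued at 'started' with a budget
 * of 'duration' ticks.  Never returns 0, so the caller polls once more
 * even when the deadline has already passed.
 */
static uint32_t remaining_ticks(uint32_t started, uint32_t duration,
				uint32_t now)
{
	uint32_t deadline = started + duration;	/* may wrap; that is fine */

	if (tick_before_eq(deadline, now))
		return 1;
	return deadline - now;
}
```

/*
 * A command issued at tick 100 with a 1000 tick budget has 700 ticks
 * left at tick 400 and the minimum of 1 at tick 5000. The same holds
 * when the counter wraps between issue and deadline.
 */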

static void pcie_do_write_cmd(struct controller *ctrl, u16 cmd,
			      u16 mask, bool wait)
{
	struct pci_dev *pdev = ctrl_dev(ctrl);
	u16 slot_ctrl;

	mutex_lock(&ctrl->ctrl_lock);

	/*
	 * Always wait for any previous command that might still be in progress
	 */
PCI: pciehp: Wait for hotplug command completion lazily
Previously we issued a hotplug command and waited for it to complete. But
there's no need to wait until we're ready to issue the *next* command. The
next command will probably be much later, so the first one may have already
completed and we may not have to actually wait at all.
Because of hardware errata, some controllers generate command completion
events for some commands but not others. In the case of Intel CF118 (see
spec update reference), the controller indicates command completion only
for Slot Control writes that change the value of the following bits:
Power Controller Control
Power Indicator Control
Attention Indicator Control
Electromechanical Interlock Control
Changes to other bits, e.g., the interrupt enable bits, do not cause the
Command Completed bit to be set. Controllers from AMD and Nvidia are
reported to have similar errata.
These errata cause timeouts when pcie_enable_notification() enables
interrupts. Previously that timeout occurred at boot-time. With this
change, the timeout occurs later, when we change the state of the slot
power, indicators, or interlock. This speeds up boot but causes a timeout
at the first hotplug event on the slot. Subsequent events don't timeout
because only the first (boot-time) hotplug command updates Slot Control
without touching the power/indicator/interlock controls.
Link: http://www.intel.com/content/www/us/en/processors/xeon/xeon-e7-v2-spec-update.html
Tested-by: Rajat Jain <rajatxjain@gmail.com> (IDT 807a controller)
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
2014-06-13 15:06:40 -06:00
	pcie_wait_cmd(ctrl);

	pcie_capability_read_word(pdev, PCI_EXP_SLTCTL, &slot_ctrl);
	if (slot_ctrl == (u16) ~0) {
		ctrl_info(ctrl, "%s: no response from device\n", __func__);
		goto out;
	}

	slot_ctrl &= ~mask;
	slot_ctrl |= (cmd & mask);
	ctrl->cmd_busy = 1;
	smp_mb();
	pcie_capability_write_word(pdev, PCI_EXP_SLTCTL, slot_ctrl);
	ctrl->cmd_started = jiffies;
	ctrl->slot_ctrl = slot_ctrl;

	/*
	 * Optionally wait for the hardware to be ready for a new command,
	 * indicating completion of the above issued command.
	 */
	if (wait)
		pcie_wait_cmd(ctrl);

out:
	mutex_unlock(&ctrl->ctrl_lock);
}
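
/*
 * Illustrative sketch, not driver code: the register update in
 * pcie_do_write_cmd() is a masked read-modify-write that is skipped
 * when the read returns all ones. The user-space helper below isolates
 * that step. The name merge_slot_ctrl() and its return convention are
 * hypothetical; locking, the memory barrier and the config-space
 * accessors are left out.
 */

```c
#include <stdint.h>

/*
 * Merge 'cmd' into the current register value, touching only the bits
 * set in 'mask'.  Returns 0 and stores the value to write in *out, or
 * -1 if 'cur' is all ones, which the driver treats as "no response
 * from device".
 */
static int merge_slot_ctrl(uint16_t cur, uint16_t cmd, uint16_t mask,
			   uint16_t *out)
{
	if (cur == (uint16_t)~0u)
		return -1;

	*out = (uint16_t)((cur & ~mask) | (cmd & mask));
	return 0;
}
```

/*
 * Bits of 'cmd' that fall outside 'mask' are ignored, so a caller can
 * pass a full command word and rely on the mask to limit what changes.
 */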

/**
 * pcie_write_cmd - Issue controller command
 * @ctrl: controller to which the command is issued
 * @cmd:  command value written to slot control register
 * @mask: bitmask of slot control register to be modified
 */
static void pcie_write_cmd(struct controller *ctrl, u16 cmd, u16 mask)
{
	pcie_do_write_cmd(ctrl, cmd, mask, true);
}

/* Same as above without waiting for the hardware to latch */
static void pcie_write_cmd_nowait(struct controller *ctrl, u16 cmd, u16 mask)
{
	pcie_do_write_cmd(ctrl, cmd, mask, false);
}

bool pciehp_check_link_active(struct controller *ctrl)
{
	struct pci_dev *pdev = ctrl_dev(ctrl);
	u16 lnk_status;
	bool ret;

	pcie_capability_read_word(pdev, PCI_EXP_LNKSTA, &lnk_status);
	ret = !!(lnk_status & PCI_EXP_LNKSTA_DLLLA);

	if (ret)
		ctrl_dbg(ctrl, "%s: lnk_status = %x\n", __func__, lnk_status);

	return ret;
}
2012-01-27 10:55:13 -08:00
static void __pcie_wait_link_active ( struct controller * ctrl , bool active )
2008-10-22 14:31:44 +09:00
{
int timeout = 1000 ;
2014-02-04 18:28:43 -08:00
if ( pciehp_check_link_active ( ctrl ) = = active )
2008-10-22 14:31:44 +09:00
return ;
while ( timeout > 0 ) {
msleep ( 10 ) ;
timeout - = 10 ;
2014-02-04 18:28:43 -08:00
if ( pciehp_check_link_active ( ctrl ) = = active )
2008-10-22 14:31:44 +09:00
return ;
}
2012-01-27 10:55:13 -08:00
ctrl_dbg ( ctrl , " Data Link Layer Link Active not %s in 1000 msec \n " ,
active ? " set " : " cleared " ) ;
}
static void pcie_wait_link_active ( struct controller * ctrl )
{
__pcie_wait_link_active ( ctrl , true ) ;
}
2012-01-27 10:55:11 -08:00
static bool pci_bus_check_dev ( struct pci_bus * bus , int devfn )
{
u32 l ;
int count = 0 ;
int delay = 1000 , step = 20 ;
bool found = false ;
do {
found = pci_bus_read_dev_vendor_id ( bus , devfn , & l , 0 ) ;
count + + ;
if ( found )
break ;
msleep ( step ) ;
delay - = step ;
} while ( delay > 0 ) ;
if ( count > 1 & & pciehp_debug )
printk ( KERN_DEBUG " pci %04x:%02x:%02x.%d id reading try %d times with interval %d ms to get %08x \n " ,
pci_domain_nr ( bus ) , bus - > number , PCI_SLOT ( devfn ) ,
PCI_FUNC ( devfn ) , count , step , l ) ;
return found ;
}
2009-09-15 17:30:48 +09:00
int pciehp_check_link_status ( struct controller * ctrl )
2005-04-16 15:20:36 -07:00
{
2013-05-09 11:26:16 -06:00
struct pci_dev * pdev = ctrl_dev ( ctrl ) ;
2013-12-14 13:06:07 -07:00
bool found ;
2005-04-16 15:20:36 -07:00
u16 lnk_status ;
2014-04-18 20:13:49 -04:00
/*
* Data Link Layer Link Active Reporting is required for hot-plug
* capable downstream ports, but older controllers might not
* implement it. In that case, wait a fixed 1000 ms instead.
*/
if ( ctrl - > link_active_reporting )
pcie_wait_link_active ( ctrl ) ;
else
msleep ( 1000 ) ;
2008-10-22 14:31:44 +09:00
2012-01-27 10:55:11 -08:00
/* Wait 100 ms before reading PCI config space, then poll for up to 1 s */
msleep ( 100 ) ;
found = pci_bus_check_dev ( ctrl - > pcie - > port - > subordinate ,
PCI_DEVFN ( 0 , 0 ) ) ;
2011-11-10 16:40:37 +09:00
2013-12-14 13:06:07 -07:00
pcie_capability_read_word ( pdev , PCI_EXP_LNKSTA , & lnk_status ) ;
2008-09-05 12:11:26 +09:00
ctrl_dbg ( ctrl , " %s: lnk_status = %x \n " , __func__ , lnk_status ) ;
2008-12-19 15:19:02 +09:00
if ( ( lnk_status & PCI_EXP_LNKSTA_LT ) | |
! ( lnk_status & PCI_EXP_LNKSTA_NLW ) ) {
2015-06-15 16:28:29 -05:00
ctrl_err ( ctrl , " link training error: status %#06x \n " ,
lnk_status ) ;
2013-12-14 13:06:07 -07:00
return - 1 ;
2005-04-16 15:20:36 -07:00
}
2011-11-07 07:53:23 -08:00
pcie_update_link_speed ( ctrl - > pcie - > port - > subordinate , lnk_status ) ;
2013-12-14 13:06:07 -07:00
if ( ! found )
return - 1 ;
2012-01-27 10:55:11 -08:00
2013-12-14 13:06:07 -07:00
return 0 ;
2005-04-16 15:20:36 -07:00
}
2012-01-27 10:55:14 -08:00
static int __pciehp_link_set ( struct controller * ctrl , bool enable )
{
2013-05-09 11:26:16 -06:00
struct pci_dev * pdev = ctrl_dev ( ctrl ) ;
2012-01-27 10:55:14 -08:00
u16 lnk_ctrl ;
2013-12-14 13:06:07 -07:00
pcie_capability_read_word ( pdev , PCI_EXP_LNKCTL , & lnk_ctrl ) ;
2012-01-27 10:55:14 -08:00
if ( enable )
lnk_ctrl & = ~ PCI_EXP_LNKCTL_LD ;
else
lnk_ctrl | = PCI_EXP_LNKCTL_LD ;
2013-12-14 13:06:07 -07:00
pcie_capability_write_word ( pdev , PCI_EXP_LNKCTL , lnk_ctrl ) ;
2012-01-27 10:55:14 -08:00
ctrl_dbg ( ctrl , " %s: lnk_ctrl = %x \n " , __func__ , lnk_ctrl ) ;
2013-12-14 13:06:07 -07:00
return 0 ;
2012-01-27 10:55:14 -08:00
}
static int pciehp_link_enable ( struct controller * ctrl )
{
return __pciehp_link_set ( ctrl , true ) ;
}
PCI: pciehp: Allow exclusive userspace control of indicators
PCIe hotplug supports optional Attention and Power Indicators, which are
used internally by pciehp. Users can't control the Power Indicator, but
they can control the Attention Indicator by writing to a sysfs "attention"
file.
The Slot Control register has two bits for each indicator, and the PCIe
spec defines the encodings for each as (Reserved/On/Blinking/Off). For
sysfs "attention" writes, pciehp_set_attention_status() maps into these
encodings, so the only useful write values are 0 (Off), 1 (On), and 2
(Blinking).
However, some platforms use all four bits for platform-specific indicators,
and they need to allow direct user control of them while preventing pciehp
from using them at all.
Add a "hotplug_user_indicators" flag to the pci_dev structure. When set,
pciehp does not use either the Attention Indicator or the Power Indicator,
and the low four bits (values 0x0 - 0xf) of sysfs "attention" write values
are written directly to the Attention Indicator Control and Power Indicator
Control fields.
[bhelgaas: changelog, rename flag and accessors to s/attention/indicator/]
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-09-13 10:31:59 -06:00
int pciehp_get_raw_indicator_status ( struct hotplug_slot * hotplug_slot ,
u8 * status )
{
struct slot * slot = hotplug_slot - > private ;
struct pci_dev * pdev = ctrl_dev ( slot - > ctrl ) ;
u16 slot_ctrl ;
pcie_capability_read_word ( pdev , PCI_EXP_SLTCTL , & slot_ctrl ) ;
* status = ( slot_ctrl & ( PCI_EXP_SLTCTL_AIC | PCI_EXP_SLTCTL_PIC ) ) > > 6 ;
return 0 ;
}
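The raw indicator encoding described in the changelog above can be sketched outside the kernel. This is a minimal user-space illustration, not the driver's code: the mask values are local copies of the Slot Control field masks as I understand them from `pci_regs.h`, and the helper names `raw_indicator_get()`, `raw_indicator_set()` and `SLTCTL_IND_SHIFT` are hypothetical. In the raw 4-bit value, bits 1:0 carry the Attention Indicator Control field and bits 3:2 carry the Power Indicator Control field.

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of the Slot Control field masks (assumed to match pci_regs.h) */
#define SLTCTL_AIC 0x00c0       /* Attention Indicator Control, bits 7:6 */
#define SLTCTL_PIC 0x0300       /* Power Indicator Control, bits 9:8 */
#define SLTCTL_IND_SHIFT 6      /* hypothetical name for the shift by 6 */

/* Extract the raw 4-bit value: bits 1:0 = attention, bits 3:2 = power */
static uint8_t raw_indicator_get(uint16_t slot_ctrl)
{
	return (uint8_t)((slot_ctrl & (SLTCTL_AIC | SLTCTL_PIC)) >> SLTCTL_IND_SHIFT);
}

/* Return slot_ctrl with both indicator fields replaced by the raw value */
static uint16_t raw_indicator_set(uint16_t slot_ctrl, uint8_t raw)
{
	uint16_t mask = SLTCTL_AIC | SLTCTL_PIC;
	uint16_t cmd = (uint16_t)((raw & 0xf) << SLTCTL_IND_SHIFT);

	/* Read-modify-write: only the masked bits change */
	return (uint16_t)((slot_ctrl & ~mask) | (cmd & mask));
}
```

For example, a Slot Control value of 0x0140 (both fields set to the "On" encoding) yields a raw value of 5, and writing a raw value leaves every bit outside the two indicator fields unchanged.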
2013-12-14 13:06:16 -07:00
void pciehp_get_attention_status ( struct slot * slot , u8 * status )
2005-04-16 15:20:36 -07:00
{
2006-12-21 17:01:04 -08:00
struct controller * ctrl = slot - > ctrl ;
2013-05-09 11:26:16 -06:00
struct pci_dev * pdev = ctrl_dev ( ctrl ) ;
2005-04-16 15:20:36 -07:00
u16 slot_ctrl ;
2013-12-14 13:06:07 -07:00
pcie_capability_read_word ( pdev , PCI_EXP_SLTCTL , & slot_ctrl ) ;
2009-11-11 14:34:52 +09:00
ctrl_dbg ( ctrl , " %s: SLOTCTRL %x, value read %x \n " , __func__ ,
pci_pcie_cap ( ctrl - > pcie - > port ) + PCI_EXP_SLTCTL , slot_ctrl ) ;
2005-04-16 15:20:36 -07:00
2013-12-14 13:06:53 -07:00
switch ( slot_ctrl & PCI_EXP_SLTCTL_AIC ) {
case PCI_EXP_SLTCTL_ATTN_IND_ON :
2005-04-16 15:20:36 -07:00
* status = 1 ; /* On */
break ;
2013-12-14 13:06:53 -07:00
case PCI_EXP_SLTCTL_ATTN_IND_BLINK :
2005-04-16 15:20:36 -07:00
* status = 2 ; /* Blink */
break ;
2013-12-14 13:06:53 -07:00
case PCI_EXP_SLTCTL_ATTN_IND_OFF :
2005-04-16 15:20:36 -07:00
* status = 0 ; /* Off */
break ;
default :
* status = 0xFF ;
break ;
}
}
2013-12-14 13:06:16 -07:00
void pciehp_get_power_status ( struct slot * slot , u8 * status )
2005-04-16 15:20:36 -07:00
{
2006-12-21 17:01:04 -08:00
struct controller * ctrl = slot - > ctrl ;
2013-05-09 11:26:16 -06:00
struct pci_dev * pdev = ctrl_dev ( ctrl ) ;
2005-04-16 15:20:36 -07:00
u16 slot_ctrl ;
2013-12-14 13:06:07 -07:00
pcie_capability_read_word ( pdev , PCI_EXP_SLTCTL , & slot_ctrl ) ;
2009-11-11 14:34:52 +09:00
ctrl_dbg ( ctrl , " %s: SLOTCTRL %x value read %x \n " , __func__ ,
pci_pcie_cap ( ctrl - > pcie - > port ) + PCI_EXP_SLTCTL , slot_ctrl ) ;
2005-04-16 15:20:36 -07:00
2013-12-14 13:06:53 -07:00
switch ( slot_ctrl & PCI_EXP_SLTCTL_PCC ) {
case PCI_EXP_SLTCTL_PWR_ON :
* status = 1 ; /* On */
2005-04-16 15:20:36 -07:00
break ;
2013-12-14 13:06:53 -07:00
case PCI_EXP_SLTCTL_PWR_OFF :
* status = 0 ; /* Off */
2005-04-16 15:20:36 -07:00
break ;
default :
* status = 0xFF ;
break ;
}
}
2013-12-14 13:06:16 -07:00
void pciehp_get_latch_status ( struct slot * slot , u8 * status )
2005-04-16 15:20:36 -07:00
{
2013-12-14 13:06:07 -07:00
struct pci_dev * pdev = ctrl_dev ( slot - > ctrl ) ;
2005-04-16 15:20:36 -07:00
u16 slot_status ;
2013-12-14 13:06:07 -07:00
pcie_capability_read_word ( pdev , PCI_EXP_SLTSTA , & slot_status ) ;
2008-12-19 15:19:02 +09:00
* status = ! ! ( slot_status & PCI_EXP_SLTSTA_MRLSS ) ;
2005-04-16 15:20:36 -07:00
}
2013-12-14 13:06:16 -07:00
void pciehp_get_adapter_status ( struct slot * slot , u8 * status )
2005-04-16 15:20:36 -07:00
{
2013-12-14 13:06:07 -07:00
struct pci_dev * pdev = ctrl_dev ( slot - > ctrl ) ;
2005-04-16 15:20:36 -07:00
u16 slot_status ;
2013-12-14 13:06:07 -07:00
pcie_capability_read_word ( pdev , PCI_EXP_SLTSTA , & slot_status ) ;
2008-12-19 15:19:02 +09:00
* status = ! ! ( slot_status & PCI_EXP_SLTSTA_PDS ) ;
2005-04-16 15:20:36 -07:00
}
2009-09-15 17:30:48 +09:00
int pciehp_query_power_fault ( struct slot * slot )
2005-04-16 15:20:36 -07:00
{
2013-12-14 13:06:07 -07:00
struct pci_dev * pdev = ctrl_dev ( slot - > ctrl ) ;
2005-04-16 15:20:36 -07:00
u16 slot_status ;
2013-12-14 13:06:07 -07:00
pcie_capability_read_word ( pdev , PCI_EXP_SLTSTA , & slot_status ) ;
2008-12-19 15:19:02 +09:00
return ! ! ( slot_status & PCI_EXP_SLTSTA_PFD ) ;
2005-04-16 15:20:36 -07:00
}
2016-09-13 10:31:59 -06:00
int pciehp_set_raw_indicator_status ( struct hotplug_slot * hotplug_slot ,
u8 status )
{
struct slot * slot = hotplug_slot - > private ;
struct controller * ctrl = slot - > ctrl ;
pcie_write_cmd_nowait ( ctrl , status < < 6 ,
PCI_EXP_SLTCTL_AIC | PCI_EXP_SLTCTL_PIC ) ;
return 0 ;
}
2013-12-14 13:06:16 -07:00
void pciehp_set_attention_status ( struct slot * slot , u8 value )
2005-04-16 15:20:36 -07:00
{
2006-12-21 17:01:04 -08:00
struct controller * ctrl = slot - > ctrl ;
2007-05-31 09:43:34 -07:00
u16 slot_cmd ;
2005-04-16 15:20:36 -07:00
2013-12-15 17:23:54 -07:00
if ( ! ATTN_LED ( ctrl ) )
return ;
2005-04-16 15:20:36 -07:00
switch ( value ) {
2014-04-18 20:13:49 -04:00
case 0 : /* turn off */
2013-12-14 13:06:53 -07:00
slot_cmd = PCI_EXP_SLTCTL_ATTN_IND_OFF ;
2009-10-05 17:42:59 +09:00
break ;
case 1 : /* turn on */
2013-12-14 13:06:53 -07:00
slot_cmd = PCI_EXP_SLTCTL_ATTN_IND_ON ;
2009-10-05 17:42:59 +09:00
break ;
case 2 : /* blink */
2013-12-14 13:06:53 -07:00
slot_cmd = PCI_EXP_SLTCTL_ATTN_IND_BLINK ;
2009-10-05 17:42:59 +09:00
break ;
default :
2013-12-14 13:06:16 -07:00
return ;
2005-04-16 15:20:36 -07:00
}
2015-06-08 17:10:50 -06:00
pcie_write_cmd_nowait ( ctrl , slot_cmd , PCI_EXP_SLTCTL_AIC ) ;
2009-11-11 14:34:52 +09:00
ctrl_dbg ( ctrl , " %s: SLOTCTRL %x write cmd %x \n " , __func__ ,
pci_pcie_cap ( ctrl - > pcie - > port ) + PCI_EXP_SLTCTL , slot_cmd ) ;
2005-04-16 15:20:36 -07:00
}
2009-09-15 17:30:48 +09:00
void pciehp_green_led_on ( struct slot * slot )
2005-04-16 15:20:36 -07:00
{
2006-12-21 17:01:04 -08:00
struct controller * ctrl = slot - > ctrl ;
2007-08-09 16:09:34 -07:00
2013-12-15 17:23:54 -07:00
if ( ! PWR_LED ( ctrl ) )
return ;
2015-06-08 17:10:50 -06:00
pcie_write_cmd_nowait ( ctrl , PCI_EXP_SLTCTL_PWR_IND_ON ,
PCI_EXP_SLTCTL_PIC ) ;
2009-11-11 14:34:52 +09:00
ctrl_dbg ( ctrl , " %s: SLOTCTRL %x write cmd %x \n " , __func__ ,
2013-12-14 13:06:53 -07:00
pci_pcie_cap ( ctrl - > pcie - > port ) + PCI_EXP_SLTCTL ,
PCI_EXP_SLTCTL_PWR_IND_ON ) ;
2005-04-16 15:20:36 -07:00
}
2009-09-15 17:30:48 +09:00
void pciehp_green_led_off ( struct slot * slot )
2005-04-16 15:20:36 -07:00
{
2006-12-21 17:01:04 -08:00
struct controller * ctrl = slot - > ctrl ;
2005-04-16 15:20:36 -07:00
2013-12-15 17:23:54 -07:00
if ( ! PWR_LED ( ctrl ) )
return ;
2015-06-08 17:10:50 -06:00
pcie_write_cmd_nowait ( ctrl , PCI_EXP_SLTCTL_PWR_IND_OFF ,
PCI_EXP_SLTCTL_PIC ) ;
2009-11-11 14:34:52 +09:00
ctrl_dbg ( ctrl , " %s: SLOTCTRL %x write cmd %x \n " , __func__ ,
2013-12-14 13:06:53 -07:00
pci_pcie_cap ( ctrl - > pcie - > port ) + PCI_EXP_SLTCTL ,
PCI_EXP_SLTCTL_PWR_IND_OFF ) ;
2005-04-16 15:20:36 -07:00
}
2009-09-15 17:30:48 +09:00
void pciehp_green_led_blink ( struct slot * slot )
2005-04-16 15:20:36 -07:00
{
2006-12-21 17:01:04 -08:00
struct controller * ctrl = slot - > ctrl ;
2007-08-09 16:09:34 -07:00
2013-12-15 17:23:54 -07:00
if ( ! PWR_LED ( ctrl ) )
return ;
2015-06-08 17:10:50 -06:00
pcie_write_cmd_nowait ( ctrl , PCI_EXP_SLTCTL_PWR_IND_BLINK ,
PCI_EXP_SLTCTL_PIC ) ;
2009-11-11 14:34:52 +09:00
ctrl_dbg ( ctrl , " %s: SLOTCTRL %x write cmd %x \n " , __func__ ,
2013-12-14 13:06:53 -07:00
pci_pcie_cap ( ctrl - > pcie - > port ) + PCI_EXP_SLTCTL ,
PCI_EXP_SLTCTL_PWR_IND_BLINK ) ;
2005-04-16 15:20:36 -07:00
}
2014-04-18 20:13:49 -04:00
int pciehp_power_on_slot ( struct slot * slot )
2005-04-16 15:20:36 -07:00
{
2006-12-21 17:01:04 -08:00
struct controller * ctrl = slot - > ctrl ;
2013-05-09 11:26:16 -06:00
struct pci_dev * pdev = ctrl_dev ( ctrl ) ;
2007-05-31 09:43:34 -07:00
u16 slot_status ;
2013-12-14 13:06:07 -07:00
int retval ;
2005-04-16 15:20:36 -07:00
2005-11-23 15:44:54 -08:00
/* Clear sticky power-fault bit from previous power failures */
2013-12-14 13:06:07 -07:00
pcie_capability_read_word ( pdev , PCI_EXP_SLTSTA , & slot_status ) ;
2013-12-14 13:06:40 -07:00
if ( slot_status & PCI_EXP_SLTSTA_PFD )
pcie_capability_write_word ( pdev , PCI_EXP_SLTSTA ,
PCI_EXP_SLTSTA_PFD ) ;
2009-11-13 15:14:10 +09:00
ctrl - > power_fault_detected = 0 ;
2005-04-16 15:20:36 -07:00
2013-12-14 13:06:53 -07:00
pcie_write_cmd ( ctrl , PCI_EXP_SLTCTL_PWR_ON , PCI_EXP_SLTCTL_PCC ) ;
2009-11-11 14:34:52 +09:00
ctrl_dbg ( ctrl , " %s: SLOTCTRL %x write cmd %x \n " , __func__ ,
2013-12-14 13:06:53 -07:00
pci_pcie_cap ( ctrl - > pcie - > port ) + PCI_EXP_SLTCTL ,
PCI_EXP_SLTCTL_PWR_ON ) ;
2005-04-16 15:20:36 -07:00
2012-01-27 10:55:15 -08:00
retval = pciehp_link_enable ( ctrl ) ;
if ( retval )
ctrl_err ( ctrl , " %s: Can not enable the link! \n " , __func__ ) ;
2005-04-16 15:20:36 -07:00
return retval ;
}
2014-04-18 20:13:49 -04:00
void pciehp_power_off_slot ( struct slot * slot )
2005-04-16 15:20:36 -07:00
{
2006-12-21 17:01:04 -08:00
struct controller * ctrl = slot - > ctrl ;
2007-12-20 19:45:09 +09:00
2013-12-14 13:06:53 -07:00
pcie_write_cmd ( ctrl , PCI_EXP_SLTCTL_PWR_OFF , PCI_EXP_SLTCTL_PCC ) ;
2009-11-11 14:34:52 +09:00
ctrl_dbg ( ctrl , " %s: SLOTCTRL %x write cmd %x \n " , __func__ ,
2013-12-14 13:06:53 -07:00
pci_pcie_cap ( ctrl - > pcie - > port ) + PCI_EXP_SLTCTL ,
PCI_EXP_SLTCTL_PWR_OFF ) ;
2005-04-16 15:20:36 -07:00
}
PCI: pciehp: Process all hotplug events before looking for new ones
Previously we accumulated hotplug events, then processed them, essentially
like this:
events = 0
do {
status = read(Slot Status)
status &= EVENT_MASK # only look at events
events |= status # accumulate events
write(Slot Status, events) # clear events
} while (status)
process events
The problem is that as soon as we clear events in Slot Status, the hardware
may send notifications for new events, and we lose information about the
first events. For example, we might see two Presence Detect Changed
events, but lose the fact that the slot was temporarily empty:
read PCI_EXP_SLTSTA_PDC set, PCI_EXP_SLTSTA_PDS clear # slot empty
write PCI_EXP_SLTSTA_PDC # clear PDC event
read PCI_EXP_SLTSTA_PDC set, PCI_EXP_SLTSTA_PDS set # slot occupied
The current code does not process a removal; it only processes the
insertion, which fails because we didn't remove the original device.
To avoid this problem, read Slot Status once and process all the events
before reading it again, like this:
do {
read events
clear events
process events
} while (events)
[bhelgaas: changelog, add external loop around pciehp_isr()]
Tested-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Mayurkumar Patel <mayurkumar.patel@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
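The read, clear, process loop from the changelog above can be modelled with a toy register. This is a sketch under stated assumptions rather than the driver's implementation: `struct fake_slot`, `slot_read()`, `slot_clear()` and `handle_events()` are hypothetical names, the "hardware" is a scripted array that posts its next value right after each clear, and only the Presence Detect bits are modelled (bit values assumed to match `pci_regs.h`).

```c
#include <assert.h>
#include <stdint.h>

#define SLTSTA_PDC 0x0008  /* Presence Detect Changed: event bit, write 1 to clear */
#define SLTSTA_PDS 0x0040  /* Presence Detect State: plain status bit */
#define EVENT_MASK SLTSTA_PDC

/* Hypothetical stand-in for the Slot Status register */
struct fake_slot {
	uint16_t status;       /* value returned by the next read */
	const uint16_t *next;  /* values the "hardware" posts after each clear */
	int remaining;         /* number of scripted values left */
};

static uint16_t slot_read(const struct fake_slot *s)
{
	return s->status;
}

/* Clear the given event bits; a new event may arrive immediately afterwards */
static void slot_clear(struct fake_slot *s, uint16_t events)
{
	s->status &= (uint16_t)~(events & EVENT_MASK);
	if (s->remaining > 0) {
		s->status = *s->next++;
		s->remaining--;
	}
}

struct tally {
	int removals;
	int insertions;
};

/* Read once, clear, then process that snapshot before reading again */
static struct tally handle_events(struct fake_slot *s)
{
	struct tally t = { 0, 0 };
	uint16_t status, events;

	do {
		status = slot_read(s);
		events = status & EVENT_MASK;
		if (events)
			slot_clear(s, events);
		if (events & SLTSTA_PDC) {
			/* Use the presence state captured with this event */
			if (status & SLTSTA_PDS)
				t.insertions++;
			else
				t.removals++;
		}
	} while (events);

	return t;
}
```

Starting from a status of `SLTSTA_PDC` with presence clear, and a script that posts `SLTSTA_PDC | SLTSTA_PDS` after the first clear, the loop records one removal followed by one insertion, which is the information the accumulate-then-process scheme loses.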
2016-09-08 15:07:56 -05:00
static irqreturn_t pciehp_isr ( int irq , void * dev_id )
2005-04-16 15:20:36 -07:00
{
2006-12-21 17:01:04 -08:00
struct controller * ctrl = ( struct controller * ) dev_id ;
2013-05-09 11:26:16 -06:00
struct pci_dev * pdev = ctrl_dev ( ctrl ) ;
PCI: Add pci_ignore_hotplug() to ignore hotplug events for a device
Powering off a hot-pluggable device, e.g., with pci_set_power_state(D3cold),
normally generates a hot-remove event that unbinds the driver.
Some drivers expect to remain bound to a device even while they power it
off and back on again. This can be dangerous, because if the device is
removed or replaced while it is powered off, the driver doesn't know that
anything changed. But some drivers accept that risk.
Add pci_ignore_hotplug() for use by drivers that know their device cannot
be removed. Using pci_ignore_hotplug() tells the PCI core that hot-plug
events for the device should be ignored.
The radeon and nouveau drivers use this to switch between a low-power,
integrated GPU and a higher-power, higher-performance discrete GPU. They
power off the unused GPU, but they want to remain bound to it.
This is a reimplementation of f244d8b623da ("ACPIPHP / radeon / nouveau:
Fix VGA switcheroo problem related to hotplug") but extends it to work with
both acpiphp and pciehp.
This fixes problems on dual GPU systems where the radeon driver becomes
unusable, freezing every few seconds (see bugzillas below), because of
problems while suspending the device, as in bug 79701:
[drm] radeon: finishing device.
radeon 0000:01:00.0: Userspace still has active objects !
radeon 0000:01:00.0: ffff8800cb4ec288 ffff8800cb4ec000 16384 4294967297 force free
...
WARNING: CPU: 0 PID: 67 at /home/apw/COD/linux/drivers/gpu/drm/radeon/radeon_gart.c:234 radeon_gart_unbind+0xd2/0xe0 [radeon]()
trying to unbind memory from uninitialized GART !
or while resuming it, as in bug 77261:
radeon 0000:01:00.0: ring 0 stalled for more than 10158msec
radeon 0000:01:00.0: GPU lockup ...
radeon 0000:01:00.0: GPU pci config reset
pciehp 0000:00:01.0:pcie04: Card not present on Slot(1-1)
radeon 0000:01:00.0: GPU reset succeeded, trying to resume
*ERROR* radeon: dpm resume failed
radeon 0000:01:00.0: Wait for MC idle timedout !
Link: https://bugzilla.kernel.org/show_bug.cgi?id=77261
Link: https://bugzilla.kernel.org/show_bug.cgi?id=79701
Reported-by: Shawn Starr <shawn.starr@rogers.com>
Reported-by: Jose P. <lbdkmjdf@sharklasers.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Rajat Jain <rajatxjain@gmail.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Dave Airlie <airlied@redhat.com>
CC: stable@vger.kernel.org # v3.15+
2014-09-10 13:45:01 -06:00
struct pci_bus * subordinate = pdev - > subordinate ;
struct pci_dev * dev ;
2009-09-15 17:24:46 +09:00
struct slot * slot = ctrl - > slot ;
2016-09-08 17:30:38 -05:00
u16 status , events ;
2015-07-01 17:17:49 -05:00
u8 present ;
2015-06-14 21:35:13 -05:00
bool link ;
2005-04-16 15:20:36 -07:00
2016-05-13 13:15:31 +02:00
/* Interrupts cannot originate from a controller that's asleep */
if ( pdev - > current_state = = PCI_D3cold )
return IRQ_NONE ;
2016-09-08 15:07:56 -05:00
pcie_capability_read_word ( pdev , PCI_EXP_SLTSTA , & status ) ;
if ( status = = ( u16 ) ~ 0 ) {
ctrl_info ( ctrl , " %s: no response from device \n " , __func__ ) ;
return IRQ_NONE ;
}
2008-04-25 14:38:57 -07:00
/*
2016-09-08 15:07:56 -05:00
* Slot Status contains plain status bits as well as event
* notification bits ; right now we only want the event bits .
2008-04-25 14:38:57 -07:00
*/
2016-09-08 15:07:56 -05:00
events = status & ( PCI_EXP_SLTSTA_ABP | PCI_EXP_SLTSTA_PFD |
2016-09-08 17:30:38 -05:00
PCI_EXP_SLTSTA_PDC | PCI_EXP_SLTSTA_CC |
PCI_EXP_SLTSTA_DLLSC ) ;
2016-09-08 15:07:56 -05:00
if ( ! events )
return IRQ_NONE ;
2007-08-09 16:09:34 -07:00
2016-09-09 09:10:17 -05:00
/* Capture link status before clearing interrupts */
if ( events & PCI_EXP_SLTSTA_DLLSC )
link = pciehp_check_link_active ( ctrl ) ;
2016-09-08 15:07:56 -05:00
pcie_capability_write_word ( pdev , PCI_EXP_SLTSTA , events ) ;
2016-09-08 17:30:38 -05:00
ctrl_dbg ( ctrl , " pending interrupts %#06x from Slot Status \n " , events ) ;
2007-08-09 16:09:34 -07:00
2008-04-25 14:38:57 -07:00
/* Check Command Complete Interrupt Pending */
2016-09-08 17:30:38 -05:00
if ( events & PCI_EXP_SLTSTA_CC ) {
2006-12-21 17:01:10 -08:00
ctrl - > cmd_busy = 0 ;
2008-04-25 14:39:02 -07:00
smp_mb ( ) ;
2008-05-28 14:59:44 +09:00
wake_up ( & ctrl - > queue ) ;
2005-04-16 15:20:36 -07:00
}
2014-09-10 13:45:01 -06:00
if ( subordinate ) {
list_for_each_entry ( dev , & subordinate - > devices , bus_list ) {
if ( dev - > ignore_hotplug ) {
ctrl_dbg ( ctrl , " ignoring hotplug event %#06x (%s requested no hotplug) \n " ,
2016-09-08 17:30:38 -05:00
events , pci_name ( dev ) ) ;
2014-09-10 13:45:01 -06:00
				return IRQ_HANDLED;
			}
		}
	}
2008-04-25 14:38:57 -07:00
	/* Check Attention Button Pressed */
2016-09-08 17:30:38 -05:00
	if (events & PCI_EXP_SLTSTA_ABP) {
2016-09-08 15:19:58 -05:00
		ctrl_info(ctrl, "Slot(%s): Attention button pressed\n",
2015-06-14 21:35:13 -05:00
			  slot_name(slot));
		pciehp_queue_interrupt_event(slot, INT_BUTTON_PRESS);
	}
2006-12-21 17:01:04 -08:00
2016-11-19 00:32:45 -08:00
	/*
	 * Check Link Status Changed at higher precedence than Presence
	 * Detect Changed.  The PDS value may be set to "card present" from
	 * out-of-band detection, which may be in conflict with a Link Down
	 * and cause the wrong event to queue.
	 */
	if (events & PCI_EXP_SLTSTA_DLLSC) {
		ctrl_info(ctrl, "Slot(%s): Link %s\n", slot_name(slot),
			  link ? "Up" : "Down");
		pciehp_queue_interrupt_event(slot, link ? INT_LINK_UP :
					     INT_LINK_DOWN);
	} else if (events & PCI_EXP_SLTSTA_PDC) {
2016-09-09 09:10:17 -05:00
		present = !!(status & PCI_EXP_SLTSTA_PDS);
2016-09-08 15:19:58 -05:00
		ctrl_info(ctrl, "Slot(%s): Card %spresent\n", slot_name(slot),
			  present ? "" : "not ");
2015-06-14 21:35:13 -05:00
		pciehp_queue_interrupt_event(slot, present ? INT_PRESENCE_ON :
					     INT_PRESENCE_OFF);
	}
2006-12-21 17:01:04 -08:00
2008-04-25 14:38:57 -07:00
	/* Check Power Fault Detected */
2016-09-08 17:30:38 -05:00
	if ((events & PCI_EXP_SLTSTA_PFD) && !ctrl->power_fault_detected) {
2009-02-03 15:06:16 +09:00
		ctrl->power_fault_detected = 1;
2016-09-08 15:19:58 -05:00
		ctrl_err(ctrl, "Slot(%s): Power fault\n", slot_name(slot));
2015-06-14 21:35:13 -05:00
		pciehp_queue_interrupt_event(slot, INT_POWER_FAULT);
2009-02-03 15:06:16 +09:00
	}
2014-02-04 18:29:10 -08:00
2005-04-16 15:20:36 -07:00
	return IRQ_HANDLED;
}
PCI: pciehp: Process all hotplug events before looking for new ones
Previously we accumulated hotplug events, then processed them, essentially
like this:
  events = 0
  do {
    status = read(Slot Status)
    status &= EVENT_MASK          # only look at events
    events |= status              # accumulate events
    write(Slot Status, events)    # clear events
  } while (status)
  process events
The problem is that as soon as we clear events in Slot Status, the hardware
may send notifications for new events, and we lose information about the
first events. For example, we might see two Presence Detect Changed
events, but lose the fact that the slot was temporarily empty:
  read  PCI_EXP_SLTSTA_PDC set, PCI_EXP_SLTSTA_PDS clear  # slot empty
  write PCI_EXP_SLTSTA_PDC                                # clear PDC event
  read  PCI_EXP_SLTSTA_PDC set, PCI_EXP_SLTSTA_PDS set    # slot occupied
The current code does not process a removal; it only processes the
insertion, which fails because we didn't remove the original device.
To avoid this problem, read Slot Status once and process all the events
before reading it again, like this:
  do {
    read events
    clear events
    process events
  } while (events)
[bhelgaas: changelog, add external loop around pciehp_isr()]
Tested-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Mayurkumar Patel <mayurkumar.patel@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
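The difference between the two loops described in this changelog can be illustrated with a small user-space simulation. This is only a sketch under stated assumptions, not the driver's code: `struct sim_slot`, `sim_read()`, `sim_clear()`, `run_accumulate()` and `run_per_read()` are hypothetical names, the `SIM_*` constants are local stand-ins for the Slot Status bits, and the simulated hardware is assumed to report the next scripted presence change as soon as the pending event bit is cleared.

```c
#include <assert.h>

#define SIM_PDC        0x0008u /* stand-in for Presence Detect Changed (event) */
#define SIM_PDS        0x0040u /* stand-in for Presence Detect State (1 = present) */
#define SIM_EVENT_MASK SIM_PDC

/* Hypothetical simulated slot: a status word plus scripted future changes. */
struct sim_slot {
	unsigned int status;
	const int *pending;	/* upcoming presence states, in order */
	int npending;
};

struct tally {
	int removals;
	int insertions;
};

static unsigned int sim_read(const struct sim_slot *s)
{
	return s->status;
}

/* Clear event bits; assumed hardware then latches the next scripted change. */
static void sim_clear(struct sim_slot *s, unsigned int bits)
{
	s->status &= ~(bits & SIM_EVENT_MASK);
	if (s->npending > 0) {
		s->status = (*s->pending ? SIM_PDS : 0u) | SIM_PDC;
		s->pending++;
		s->npending--;
	}
}

static void process(unsigned int events, unsigned int status, struct tally *t)
{
	if (events & SIM_PDC) {
		if (status & SIM_PDS)
			t->insertions++;
		else
			t->removals++;
	}
}

/* Old scheme: accumulate events across reads, process once at the end. */
static struct tally run_accumulate(struct sim_slot *s)
{
	struct tally t = { 0, 0 };
	unsigned int status, events = 0;

	do {
		status = sim_read(s) & SIM_EVENT_MASK;
		events |= status;
		sim_clear(s, events);
	} while (status);
	process(events, sim_read(s), &t);
	return t;
}

/* New scheme: read once, clear, process that snapshot, then read again. */
static struct tally run_per_read(struct sim_slot *s)
{
	struct tally t = { 0, 0 };
	unsigned int status, events;

	do {
		status = sim_read(s);
		events = status & SIM_EVENT_MASK;
		if (events) {
			sim_clear(s, events);
			process(events, status, &t);
		}
	} while (events);
	return t;
}

/* Scenario from the changelog: slot seen empty, then occupied again. */
static const int sim_script[] = { 1 };

static struct sim_slot sim_make_slot(void)
{
	struct sim_slot s = { SIM_PDC, sim_script, 1 };
	return s;
}
```

Running both functions on a slot from `sim_make_slot()` shows the accumulating loop reporting only an insertion, while the per-read loop reports the removal followed by the insertion.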
2016-09-08 15:07:56 -05:00
static irqreturn_t pcie_isr(int irq, void *dev_id)
{
	irqreturn_t rc, handled = IRQ_NONE;

	/*
	 * To guarantee that all interrupt events are serviced, we need to
	 * re-inspect Slot Status register after clearing what is presumed
	 * to be the last pending interrupt.
	 */
	do {
		rc = pciehp_isr(irq, dev_id);
		if (rc == IRQ_HANDLED)
			handled = IRQ_HANDLED;
	} while (rc == IRQ_HANDLED);

	/* Return IRQ_HANDLED if we handled one or more events */
	return handled;
}
2013-12-14 13:06:16 -07:00
void pcie_enable_notification(struct controller *ctrl)
2007-11-21 15:07:55 -08:00
{
2008-04-25 14:39:05 -07:00
	u16 cmd, mask;
2005-04-16 15:20:36 -07:00
2009-11-13 15:14:10 +09:00
	/*
	 * TBD: Power fault detected software notification support.
	 *
	 * Power fault detected software notification is not enabled
	 * now, because it caused power fault detected interrupt storm
	 * on some machines. On those machines, power fault detected
	 * bit in the slot status register was set again immediately
	 * when it is cleared in the interrupt service routine, and
	 * next power fault detected interrupt was notified again.
	 */
2014-02-04 18:29:23 -08:00
	/*
	 * Always enable link events: thus link-up and link-down shall
	 * always be treated as hotplug and unplug respectively. Enable
	 * presence detect only if Attention Button is not present.
	 */
	cmd = PCI_EXP_SLTCTL_DLLSCE;
2008-04-25 14:39:06 -07:00
	if (ATTN_BUTTN(ctrl))
2008-12-19 15:19:02 +09:00
		cmd |= PCI_EXP_SLTCTL_ABPE;
2014-02-04 18:29:23 -08:00
	else
		cmd |= PCI_EXP_SLTCTL_PDCE;
2008-04-25 14:39:05 -07:00
	if (!pciehp_poll_mode)
2008-12-19 15:19:02 +09:00
		cmd |= PCI_EXP_SLTCTL_HPIE | PCI_EXP_SLTCTL_CCIE;
2008-04-25 14:39:05 -07:00
2008-12-19 15:19:02 +09:00
	mask = (PCI_EXP_SLTCTL_PDCE | PCI_EXP_SLTCTL_ABPE |
2015-07-01 17:17:49 -05:00
		PCI_EXP_SLTCTL_PFDE |
2014-02-04 18:29:23 -08:00
		PCI_EXP_SLTCTL_HPIE | PCI_EXP_SLTCTL_CCIE |
		PCI_EXP_SLTCTL_DLLSCE);
2008-04-25 14:39:05 -07:00
2015-06-08 17:10:50 -06:00
	pcie_write_cmd_nowait(ctrl, cmd, mask);
2014-09-22 20:36:09 -06:00
	ctrl_dbg(ctrl, "%s: SLOTCTRL %x write cmd %x\n", __func__,
		 pci_pcie_cap(ctrl->pcie->port) + PCI_EXP_SLTCTL, cmd);
2008-06-20 12:07:08 +09:00
}
static void pcie_disable_notification(struct controller *ctrl)
{
	u16 mask;
2013-12-14 13:06:16 -07:00
2008-12-19 15:19:02 +09:00
	mask = (PCI_EXP_SLTCTL_PDCE | PCI_EXP_SLTCTL_ABPE |
		PCI_EXP_SLTCTL_MRLSCE | PCI_EXP_SLTCTL_PFDE |
2009-10-05 17:40:02 +09:00
		PCI_EXP_SLTCTL_HPIE | PCI_EXP_SLTCTL_CCIE |
		PCI_EXP_SLTCTL_DLLSCE);
2013-12-14 13:06:16 -07:00
	pcie_write_cmd(ctrl, 0, mask);
2014-09-22 20:36:09 -06:00
	ctrl_dbg(ctrl, "%s: SLOTCTRL %x write cmd %x\n", __func__,
		 pci_pcie_cap(ctrl->pcie->port) + PCI_EXP_SLTCTL, 0);
2008-06-20 12:07:08 +09:00
}
2013-08-08 14:09:37 -06:00
/*
 * pciehp has a 1:1 bus:slot relationship so we ultimately want a secondary
2014-02-18 18:53:19 -08:00
 * bus reset of the bridge, but at the same time we want to ensure that it is
 * not seen as a hot-unplug, followed by the hot-plug of the device. Thus,
 * disable link state notification and presence detection change notification
 * momentarily, if we see that they could interfere. Also, clear any spurious
2013-08-08 14:09:37 -06:00
 * events after.
 */
int pciehp_reset_slot(struct slot *slot, int probe)
{
	struct controller *ctrl = slot->ctrl;
2013-05-09 11:26:16 -06:00
	struct pci_dev *pdev = ctrl_dev(ctrl);
2014-02-04 18:30:40 -08:00
	u16 stat_mask = 0, ctrl_mask = 0;
2013-08-08 14:09:37 -06:00
	if (probe)
		return 0;
2014-02-18 18:53:19 -08:00
	if (!ATTN_BUTTN(ctrl)) {
2014-02-04 18:30:40 -08:00
		ctrl_mask |= PCI_EXP_SLTCTL_PDCE;
		stat_mask |= PCI_EXP_SLTSTA_PDC;
2013-08-08 14:09:37 -06:00
	}
2014-02-04 18:30:40 -08:00
	ctrl_mask |= PCI_EXP_SLTCTL_DLLSCE;
	stat_mask |= PCI_EXP_SLTSTA_DLLSC;
	pcie_write_cmd(ctrl, 0, ctrl_mask);
2014-09-22 20:36:09 -06:00
	ctrl_dbg(ctrl, "%s: SLOTCTRL %x write cmd %x\n", __func__,
		 pci_pcie_cap(ctrl->pcie->port) + PCI_EXP_SLTCTL, 0);
2014-02-04 18:30:40 -08:00
	if (pciehp_poll_mode)
		del_timer_sync(&ctrl->poll_timer);
2013-08-08 14:09:37 -06:00
	pci_reset_bridge_secondary_bus(ctrl->pcie->port);
2014-02-04 18:30:40 -08:00
	pcie_capability_write_word(pdev, PCI_EXP_SLTSTA, stat_mask);
2015-06-08 17:10:50 -06:00
	pcie_write_cmd_nowait(ctrl, ctrl_mask, ctrl_mask);
2014-09-22 20:36:09 -06:00
	ctrl_dbg(ctrl, "%s: SLOTCTRL %x write cmd %x\n", __func__,
		 pci_pcie_cap(ctrl->pcie->port) + PCI_EXP_SLTCTL, ctrl_mask);
2014-02-04 18:30:40 -08:00
	if (pciehp_poll_mode)
		int_poll_timeout(ctrl->poll_timer.data);
2013-08-08 14:09:37 -06:00
	return 0;
}
2009-01-28 19:31:18 -08:00
int pcie_init_notification(struct controller *ctrl)
2008-06-20 12:07:08 +09:00
{
	if (pciehp_request_irq(ctrl))
		return -1;
2013-12-14 13:06:16 -07:00
	pcie_enable_notification(ctrl);
2009-01-28 19:31:18 -08:00
	ctrl->notification_enabled = 1;
2008-06-20 12:07:08 +09:00
	return 0;
}
static void pcie_shutdown_notification(struct controller *ctrl)
{
2009-01-28 19:31:18 -08:00
	if (ctrl->notification_enabled) {
		pcie_disable_notification(ctrl);
		pciehp_free_irq(ctrl);
		ctrl->notification_enabled = 0;
	}
2008-06-20 12:07:08 +09:00
}
static int pcie_init_slot(struct controller *ctrl)
{
	struct slot *slot;

	slot = kzalloc(sizeof(*slot), GFP_KERNEL);
	if (!slot)
		return -ENOMEM;
2013-07-03 15:04:57 -07:00
	slot->wq = alloc_workqueue("pciehp-%u", 0, 0, PSN(ctrl));
2013-01-11 10:15:54 +08:00
	if (!slot->wq)
		goto abort;
2008-06-20 12:07:08 +09:00
	slot->ctrl = ctrl;
	mutex_init(&slot->lock);
2014-02-04 18:31:11 -08:00
	mutex_init(&slot->hotplug_lock);
2008-06-20 12:07:08 +09:00
	INIT_DELAYED_WORK(&slot->work, pciehp_queue_pushbutton_work);
2009-09-15 17:24:46 +09:00
	ctrl->slot = slot;
2005-04-16 15:20:36 -07:00
	return 0;
2013-01-11 10:15:54 +08:00
abort:
	kfree(slot);
	return -ENOMEM;
2005-04-16 15:20:36 -07:00
}
2007-11-28 15:11:46 -08:00
2008-06-20 12:07:08 +09:00
static void pcie_cleanup_slot(struct controller *ctrl)
{
2009-09-15 17:24:46 +09:00
	struct slot *slot = ctrl->slot;
2008-06-20 12:07:08 +09:00
	cancel_delayed_work(&slot->work);
2013-01-11 10:15:54 +08:00
	destroy_workqueue(slot->wq);
2008-06-20 12:07:08 +09:00
	kfree(slot);
}
2008-04-25 14:39:08 -07:00
static inline void dbg_ctrl(struct controller *ctrl)
2007-11-28 15:11:46 -08:00
{
2009-09-15 17:30:14 +09:00
	struct pci_dev *pdev = ctrl->pcie->port;
2015-06-15 16:28:29 -05:00
	u16 reg16;
2007-11-28 15:11:46 -08:00
2008-04-25 14:39:08 -07:00
	if (!pciehp_debug)
		return;
2007-11-28 15:11:46 -08:00
2008-09-05 12:11:26 +09:00
	ctrl_info(ctrl, "Slot Capabilities : 0x%08x\n", ctrl->slot_cap);
2013-05-09 11:26:16 -06:00
	pcie_capability_read_word(pdev, PCI_EXP_SLTSTA, &reg16);
2008-09-05 12:11:26 +09:00
	ctrl_info(ctrl, "Slot Status : 0x%04x\n", reg16);
2013-05-09 11:26:16 -06:00
	pcie_capability_read_word(pdev, PCI_EXP_SLTCTL, &reg16);
2008-09-05 12:11:26 +09:00
	ctrl_info(ctrl, "Slot Control : 0x%04x\n", reg16);
2008-04-25 14:39:08 -07:00
}
2007-11-28 15:11:46 -08:00
2014-04-18 20:13:49 -04:00
#define FLAG(x, y)	(((x) & (y)) ? '+' : '-')
2013-12-14 13:06:36 -07:00
2008-06-20 12:07:08 +09:00
struct controller *pcie_init(struct pcie_device *dev)
2008-04-25 14:39:08 -07:00
{
2008-06-20 12:07:08 +09:00
	struct controller *ctrl;
2008-10-22 14:31:44 +09:00
	u32 slot_cap, link_cap;
2008-04-25 14:39:08 -07:00
	struct pci_dev *pdev = dev->port;
2007-11-28 15:11:46 -08:00
2008-06-20 12:07:08 +09:00
	ctrl = kzalloc(sizeof(*ctrl), GFP_KERNEL);
	if (!ctrl) {
2008-10-23 11:47:32 +09:00
		dev_err(&dev->device, "%s: Out of memory\n", __func__);
2008-06-20 12:07:08 +09:00
		goto abort;
	}
2008-08-22 17:16:48 +09:00
	ctrl->pcie = dev;
2013-12-14 13:06:07 -07:00
	pcie_capability_read_dword(pdev, PCI_EXP_SLTCAP, &slot_cap);
PCI: pciehp: Allow exclusive userspace control of indicators
PCIe hotplug supports optional Attention and Power Indicators, which are
used internally by pciehp. Users can't control the Power Indicator, but
they can control the Attention Indicator by writing to a sysfs "attention"
file.
The Slot Control register has two bits for each indicator, and the PCIe
spec defines the encodings for each as (Reserved/On/Blinking/Off). For
sysfs "attention" writes, pciehp_set_attention_status() maps into these
encodings, so the only useful write values are 0 (Off), 1 (On), and 2
(Blinking).
However, some platforms use all four bits for platform-specific indicators,
and they need to allow direct user control of them while preventing pciehp
from using them at all.
Add a "hotplug_user_indicators" flag to the pci_dev structure. When set,
pciehp does not use either the Attention Indicator or the Power Indicator,
and the low four bits (values 0x0 - 0xf) of sysfs "attention" write values
are written directly to the Attention Indicator Control and Power Indicator
Control fields.
[bhelgaas: changelog, rename flag and accessors to s/attention/indicator/]
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
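The two write paths described in this changelog can be sketched as plain bit manipulation. This is a hedged illustration rather than pciehp's implementation: the function names and the `SKETCH_*` constants are hypothetical, and the field positions (Attention Indicator Control in bits 7:6, Power Indicator Control in bits 9:8 of Slot Control) are assumptions of the sketch. The encodings follow the (Reserved/On/Blinking/Off) order given above.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field masks, local to this sketch. */
#define SKETCH_SLTCTL_AIC 0x00c0u /* Attention Indicator Control, bits 7:6 */
#define SKETCH_SLTCTL_PIC 0x0300u /* Power Indicator Control, bits 9:8 */

/*
 * Standard path: map a sysfs "attention" value (0 = Off, 1 = On,
 * 2 = Blinking) to the two-bit encoding (01b On, 10b Blinking, 11b Off).
 * Returns 0 on success, -1 for any other value.
 */
static int sketch_attention_to_sltctl(unsigned int value,
				      uint16_t *cmd, uint16_t *mask)
{
	switch (value) {
	case 0:
		*cmd = 0x00c0;	/* Off */
		break;
	case 1:
		*cmd = 0x0040;	/* On */
		break;
	case 2:
		*cmd = 0x0080;	/* Blinking */
		break;
	default:
		return -1;
	}
	*mask = SKETCH_SLTCTL_AIC;
	return 0;
}

/*
 * User-indicator path: the low four bits of the written value go
 * straight into the two adjacent indicator control fields.
 */
static void sketch_raw_to_sltctl(unsigned int value,
				 uint16_t *cmd, uint16_t *mask)
{
	*cmd = (uint16_t)((value & 0xfu) << 6);
	*mask = SKETCH_SLTCTL_AIC | SKETCH_SLTCTL_PIC;
}

/* Read-modify-write helper: only the bits selected by mask change. */
static uint16_t sketch_apply_cmd(uint16_t sltctl, uint16_t cmd, uint16_t mask)
{
	return (uint16_t)((sltctl & ~mask) | (cmd & mask));
}
```

In the standard path only the Attention Indicator field is touched, whereas the raw path overwrites both fields at once, which is why pciehp must stay away from them when the flag is set.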
2016-09-13 10:31:59 -06:00
	if (pdev->hotplug_user_indicators)
		slot_cap &= ~(PCI_EXP_SLTCAP_AIP | PCI_EXP_SLTCAP_PIP);
2008-04-25 14:39:08 -07:00
	ctrl->slot_cap = slot_cap;
2007-11-28 15:11:46 -08:00
	mutex_init(&ctrl->ctrl_lock);
	init_waitqueue_head(&ctrl->queue);
2008-04-25 14:39:08 -07:00
	dbg_ctrl(ctrl);
2014-06-14 10:56:31 -06:00
2014-04-18 20:13:49 -04:00
	/* Check if Data Link Layer Link Active Reporting is implemented */
	pcie_capability_read_dword(pdev, PCI_EXP_LNKCAP, &link_cap);
2015-06-15 16:28:29 -05:00
	if (link_cap & PCI_EXP_LNKCAP_DLLLARC)
2014-04-18 20:13:49 -04:00
		ctrl->link_active_reporting = 1;
2008-10-22 14:31:44 +09:00
2008-06-20 12:07:08 +09:00
	/* Clear all remaining event bits in Slot Status register */
2013-12-14 13:06:47 -07:00
	pcie_capability_write_word(pdev, PCI_EXP_SLTSTA,
		PCI_EXP_SLTSTA_ABP | PCI_EXP_SLTSTA_PFD |
		PCI_EXP_SLTSTA_MRLSC | PCI_EXP_SLTSTA_PDC |
2014-06-17 13:27:34 -06:00
		PCI_EXP_SLTSTA_CC | PCI_EXP_SLTSTA_DLLSC);
2007-11-28 15:11:46 -08:00
2015-06-15 16:28:29 -05:00
	ctrl_info(ctrl, "Slot #%d AttnBtn%c PwrCtrl%c MRL%c AttnInd%c PwrInd%c HotPlug%c Surprise%c Interlock%c NoCompl%c LLActRep%c\n",
2013-12-14 13:06:36 -07:00
		(slot_cap & PCI_EXP_SLTCAP_PSN) >> 19,
		FLAG(slot_cap, PCI_EXP_SLTCAP_ABP),
		FLAG(slot_cap, PCI_EXP_SLTCAP_PCP),
		FLAG(slot_cap, PCI_EXP_SLTCAP_MRLSP),
2015-06-15 16:28:29 -05:00
		FLAG(slot_cap, PCI_EXP_SLTCAP_AIP),
		FLAG(slot_cap, PCI_EXP_SLTCAP_PIP),
		FLAG(slot_cap, PCI_EXP_SLTCAP_HPC),
		FLAG(slot_cap, PCI_EXP_SLTCAP_HPS),
2013-12-14 13:06:36 -07:00
		FLAG(slot_cap, PCI_EXP_SLTCAP_EIP),
		FLAG(slot_cap, PCI_EXP_SLTCAP_NCCS),
		FLAG(link_cap, PCI_EXP_LNKCAP_DLLLARC));
2008-06-20 12:07:08 +09:00
	if (pcie_init_slot(ctrl))
		goto abort_ctrl;
2008-04-25 14:39:08 -07:00
2008-06-20 12:07:08 +09:00
	return ctrl;
abort_ctrl:
	kfree(ctrl);
2007-11-28 15:11:46 -08:00
abort:
2008-06-20 12:07:08 +09:00
	return NULL;
}
2009-09-15 17:30:48 +09:00
void pciehp_release_ctrl(struct controller *ctrl)
2008-06-20 12:07:08 +09:00
{
	pcie_shutdown_notification(ctrl);
	pcie_cleanup_slot(ctrl);
	kfree(ctrl);
2007-11-28 15:11:46 -08:00
}