439 Commits

Author SHA1 Message Date
Petr Pavlu
35a6550405 ipmi/watchdog: Stop watchdog timer when the current action is 'none'
commit 2253042d86f57d90a621ac2513a7a7a13afcf809 upstream.

When an IPMI watchdog timer is being stopped in ipmi_close() or
ipmi_ioctl(WDIOS_DISABLECARD), the current watchdog action is updated to
WDOG_TIMEOUT_NONE and _ipmi_set_timeout(IPMI_SET_TIMEOUT_NO_HB) is called
to install this action. The latter function ends up invoking
__ipmi_set_timeout() which makes the actual 'Set Watchdog Timer' IPMI
request.

For IPMI 1.0, this operation results in fully stopping the watchdog timer.
For IPMI >= 1.5, function __ipmi_set_timeout() always specifies the "don't
stop" flag in the prepared 'Set Watchdog Timer' IPMI request. This causes
that the watchdog timer has its action correctly updated to 'none' but the
timer continues to run. A problem is that IPMI firmware can then still log
an expiration event when the configured timeout is reached, which is
unexpected because the watchdog timer was requested to be stopped.

The patch fixes this problem by not setting the "don't stop" flag in
__ipmi_set_timeout() when the current action is WDOG_TIMEOUT_NONE which
results in stopping the watchdog timer. This makes the behaviour for
IPMI >= 1.5 consistent with IPMI 1.0. It also matches the logic in
__ipmi_heartbeat() which does not allow to reset the watchdog if the
current action is WDOG_TIMEOUT_NONE as that would start the timer.

Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
Message-Id: <10a41bdc-9c99-089c-8d89-fa98ce5ea080@suse.com>
Cc: stable@vger.kernel.org
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-20 16:21:09 +02:00
Wen Yang
62cd9aa3ac ipmi: fix hung processes in __get_guid()
[ Upstream commit 32830a0534700f86366f371b150b17f0f0d140d7 ]

The wait_event() function is used to detect command completion.
When send_guid_cmd() returns an error, smi_send() has not been
called to send data. Therefore, wait_event() should not be used
on the error path, otherwise it will cause the following warning:

[ 1361.588808] systemd-udevd   D    0  1501   1436 0x00000004
[ 1361.588813]  ffff883f4b1298c0 0000000000000000 ffff883f4b188000 ffff887f7e3d9f40
[ 1361.677952]  ffff887f64bd4280 ffffc90037297a68 ffffffff8173ca3b ffffc90000000010
[ 1361.767077]  00ffc90037297ad0 ffff887f7e3d9f40 0000000000000286 ffff883f4b188000
[ 1361.856199] Call Trace:
[ 1361.885578]  [<ffffffff8173ca3b>] ? __schedule+0x23b/0x780
[ 1361.951406]  [<ffffffff8173cfb6>] schedule+0x36/0x80
[ 1362.010979]  [<ffffffffa071f178>] get_guid+0x118/0x150 [ipmi_msghandler]
[ 1362.091281]  [<ffffffff810d5350>] ? prepare_to_wait_event+0x100/0x100
[ 1362.168533]  [<ffffffffa071f755>] ipmi_register_smi+0x405/0x940 [ipmi_msghandler]
[ 1362.258337]  [<ffffffffa0230ae9>] try_smi_init+0x529/0x950 [ipmi_si]
[ 1362.334521]  [<ffffffffa022f350>] ? std_irq_setup+0xd0/0xd0 [ipmi_si]
[ 1362.411701]  [<ffffffffa0232bd2>] init_ipmi_si+0x492/0x9e0 [ipmi_si]
[ 1362.487917]  [<ffffffffa0232740>] ? ipmi_pci_probe+0x280/0x280 [ipmi_si]
[ 1362.568219]  [<ffffffff810021a0>] do_one_initcall+0x50/0x180
[ 1362.636109]  [<ffffffff812231b2>] ? kmem_cache_alloc_trace+0x142/0x190
[ 1362.714330]  [<ffffffff811b2ae1>] do_init_module+0x5f/0x200
[ 1362.781208]  [<ffffffff81123ca8>] load_module+0x1898/0x1de0
[ 1362.848069]  [<ffffffff811202e0>] ? __symbol_put+0x60/0x60
[ 1362.913886]  [<ffffffff8130696b>] ? security_kernel_post_read_file+0x6b/0x80
[ 1362.998514]  [<ffffffff81124465>] SYSC_finit_module+0xe5/0x120
[ 1363.068463]  [<ffffffff81124465>] ? SYSC_finit_module+0xe5/0x120
[ 1363.140513]  [<ffffffff811244be>] SyS_finit_module+0xe/0x10
[ 1363.207364]  [<ffffffff81003c04>] do_syscall_64+0x74/0x180

Fixes: 50c812b2b951 ("[PATCH] ipmi: add full sysfs support")
Signed-off-by: Wen Yang <wenyang@linux.alibaba.com>
Cc: Corey Minyard <minyard@acm.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: openipmi-developer@lists.sourceforge.net
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org # 2.6.17-
Message-Id: <20200403090408.58745-1-wenyang@linux.alibaba.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-04-24 07:59:00 +02:00
Corey Minyard
dcbb950c78 ipmi:ssif: Handle a possible NULL pointer reference
[ Upstream commit 6b8526d3abc02c08a2f888e8c20b7ac9e5776dfe ]

In error cases a NULL can be passed to memcpy.  The length will always
be zero, so it doesn't really matter, but go ahead and check for NULL,
anyway, to be more precise and avoid static analysis errors.

Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-03-11 07:53:02 +01:00
Corey Minyard
7522f96fa4 ipmi_si: Only schedule continuously in the thread in maintenance mode
[ Upstream commit 340ff31ab00bca5c15915e70ad9ada3030c98cf8 ]

ipmi_thread() uses back-to-back schedule() to poll for command
completion which, on some machines, can push up CPU consumption and
heavily tax the scheduler locks leading to noticeable overall
performance degradation.

This was originally added so firmware updates through IPMI would
complete in a timely manner.  But we can't kill the scheduler
locks for that one use case.

Instead, only run schedule() continuously in maintenance mode,
where firmware updates should run.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-10-07 18:53:10 +02:00
Kamlakant Patel
a2a2a146ec ipmi:ssif: compare block number correctly for multi-part return messages
commit 55be8658c7e2feb11a5b5b33ee031791dbd23a69 upstream.

According to ipmi spec, block number is a number that is incremented,
starting with 0, for each new block of message data returned using the
middle transaction.

Here, the 'blocknum' is data[0] which always starts from zero(0) and
'ssif_info->multi_pos' starts from 1.
So, we need to add +1 to blocknum while comparing with multi_pos.

Fixes: 7d6380cd40f79 ("ipmi:ssif: Fix handling of multi-part return messages").
Reported-by: Kiran Kolukuluru <kirank@ami.com>
Signed-off-by: Kamlakant Patel <kamlakantp@marvell.com>
Message-Id: <1556106615-18722-1-git-send-email-kamlakantp@marvell.com>
[Also added a debug log if the block numbers don't match.]
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Cc: stable@vger.kernel.org # 4.4
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-21 18:49:00 +02:00
Corey Minyard
cac2590d25 ipmi:ssif: Fix handling of multi-part return messages
commit 7d6380cd40f7993f75c4bde5b36f6019237e8719 upstream.

The block number was not being compared right, it was off by one
when checking the response.

Some statistics wouldn't be incremented properly in some cases.

Check to see if that middle-part messages always have 31 bytes of
data.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
Cc: stable@vger.kernel.org # 4.4
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-26 09:38:36 +01:00
Jan Glauber
9aba7ddfc8 ipmi: Fix timer race with module unload
commit 0711e8c1b4572d076264e71b0002d223f2666ed7 upstream.

Please note that below oops is from an older kernel, but the same
race seems to be present in the upstream kernel too.

---8<---

The following panic was encountered during removing the ipmi_ssif
module:

[ 526.352555] Unable to handle kernel paging request at virtual address ffff000006923090
[ 526.360464] Mem abort info:
[ 526.363257] ESR = 0x86000007
[ 526.366304] Exception class = IABT (current EL), IL = 32 bits
[ 526.372221] SET = 0, FnV = 0
[ 526.375269] EA = 0, S1PTW = 0
[ 526.378405] swapper pgtable: 4k pages, 48-bit VAs, pgd = 000000008ae60416
[ 526.385185] [ffff000006923090] *pgd=000000bffcffe803, *pud=000000bffcffd803, *pmd=0000009f4731a003, *pte=0000000000000000
[ 526.396141] Internal error: Oops: 86000007 [#1] SMP
[ 526.401008] Modules linked in: nls_iso8859_1 ipmi_devintf joydev input_leds ipmi_msghandler shpchp sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear i2c_smbus hid_generic usbhid uas hid usb_storage ast aes_ce_blk i2c_algo_bit aes_ce_cipher qede ttm crc32_ce ptp crct10dif_ce drm_kms_helper ghash_ce syscopyarea sha2_ce sysfillrect sysimgblt pps_core fb_sys_fops sha256_arm64 sha1_ce mpt3sas qed drm raid_class ahci scsi_transport_sas libahci gpio_xlp i2c_xlp9xx aes_neon_bs aes_neon_blk crypto_simd cryptd aes_arm64 [last unloaded: ipmi_ssif]
[ 526.468085] CPU: 125 PID: 0 Comm: swapper/125 Not tainted 4.15.0-35-generic #38~lp1775396+build.1
[ 526.476942] Hardware name: To be filled by O.E.M. Saber/Saber, BIOS 0ACKL022 08/14/2018
[ 526.484932] pstate: 00400009 (nzcv daif +PAN -UAO)
[ 526.489713] pc : 0xffff000006923090
[ 526.493198] lr : call_timer_fn+0x34/0x178
[ 526.497194] sp : ffff000009b0bdd0
[ 526.500496] x29: ffff000009b0bdd0 x28: 0000000000000082
[ 526.505796] x27: 0000000000000002 x26: ffff000009515188
[ 526.511096] x25: ffff000009515180 x24: ffff0000090f1018
[ 526.516396] x23: ffff000009519660 x22: dead000000000200
[ 526.521696] x21: ffff000006923090 x20: 0000000000000100
[ 526.526995] x19: ffff809eeb466a40 x18: 0000000000000000
[ 526.532295] x17: 000000000000000e x16: 0000000000000007
[ 526.537594] x15: 0000000000000000 x14: 071c71c71c71c71c
[ 526.542894] x13: 0000000000000000 x12: 0000000000000000
[ 526.548193] x11: 0000000000000001 x10: ffff000009b0be88
[ 526.553493] x9 : 0000000000000000 x8 : 0000000000000005
[ 526.558793] x7 : ffff80befc1f8528 x6 : 0000000000000020
[ 526.564092] x5 : 0000000000000040 x4 : 0000000020001b20
[ 526.569392] x3 : 0000000000000000 x2 : ffff809eeb466a40
[ 526.574692] x1 : ffff000006923090 x0 : ffff809eeb466a40
[ 526.579992] Process swapper/125 (pid: 0, stack limit = 0x000000002eb50acc)
[ 526.586854] Call trace:
[ 526.589289] 0xffff000006923090
[ 526.592419] expire_timers+0xc8/0x130
[ 526.596070] run_timer_softirq+0xec/0x1b0
[ 526.600070] __do_softirq+0x134/0x328
[ 526.603726] irq_exit+0xc8/0xe0
[ 526.606857] __handle_domain_irq+0x6c/0xc0
[ 526.610941] gic_handle_irq+0x84/0x188
[ 526.614679] el1_irq+0xe8/0x180
[ 526.617822] cpuidle_enter_state+0xa0/0x328
[ 526.621993] cpuidle_enter+0x34/0x48
[ 526.625564] call_cpuidle+0x44/0x70
[ 526.629040] do_idle+0x1b8/0x1f0
[ 526.632256] cpu_startup_entry+0x2c/0x30
[ 526.636174] secondary_start_kernel+0x11c/0x130
[ 526.640694] Code: bad PC value
[ 526.643800] ---[ end trace d020b0b8417c2498 ]---
[ 526.648404] Kernel panic - not syncing: Fatal exception in interrupt
[ 526.654778] SMP: stopping secondary CPUs
[ 526.658734] Kernel Offset: disabled
[ 526.662211] CPU features: 0x5800c38
[ 526.665688] Memory Limit: none
[ 526.668768] ---[ end Kernel panic - not syncing: Fatal exception in interrupt

Prevent mod_timer from arming a timer that was already removed by
del_timer during module unload.

Signed-off-by: Jan Glauber <jglauber@cavium.com>
Cc: <stable@vger.kernel.org> # 3.19
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-13 11:16:46 -08:00
Corey Minyard
d11ec041b2 ipmi:bt: Set the timeout before doing a capabilities check
commit fe50a7d0393a552e4539da2d31261a59d6415950 upstream.

There was one place where the timeout value for an operation was
not being set, if a capabilities request was done from idle.  Move
the timeout value setting to before where that change might be
requested.

IMHO the cause here is the invisible returns in the macros.  Maybe
that's a job for later, though.

Reported-by: Nordmark Claes <Claes.Nordmark@tieto.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03 11:23:07 +02:00
Kamlakant Patel
b4cc441afd ipmi_ssif: Fix kernel panic at msg_done_handler
[ Upstream commit f002612b9d86613bc6fde0a444e0095225f6053e ]

This happens when BMC doesn't return any data and the code is trying
to print the value of data[2].

Getting following crash:
[  484.728410] Unable to handle kernel NULL pointer dereference at virtual address 00000002
[  484.736496] pgd = ffff0000094a2000
[  484.739885] [00000002] *pgd=00000047fcffe003, *pud=00000047fcffd003, *pmd=0000000000000000
[  484.748158] Internal error: Oops: 96000005 [#1] SMP
[...]
[  485.101451] Call trace:
[...]
[  485.188473] [<ffff000000a46e68>] msg_done_handler+0x668/0x700 [ipmi_ssif]
[  485.195249] [<ffff000000a456b8>] ipmi_ssif_thread+0x110/0x128 [ipmi_ssif]
[  485.202038] [<ffff0000080f1430>] kthread+0x108/0x138
[  485.206994] [<ffff0000080838e0>] ret_from_fork+0x10/0x30
[  485.212294] Code: aa1903e1 aa1803e0 b900227f 95fef6a5 (39400aa3)

Adding a check to validate the data len before printing data[2] to fix this issue.

Signed-off-by: Kamlakant Patel <kamlakant.patel@cavium.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:50:46 +02:00
Wei Yongjun
b5c7751a4a ipmi/powernv: Fix error return code in ipmi_powernv_probe()
[ Upstream commit e749d328b0b450aa78d562fa26a0cd8872325dd9 ]

Fix to return a negative error code from the request_irq() error
handling case instead of 0, as done elsewhere in this function.

Fixes: dce143c3381c ("ipmi/powernv: Convert to irq event interface")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:50:21 +02:00
Dan Carpenter
ad741ccdd8 ipmi_ssif: unlock on allocation failure
[ Upstream commit cf9806f32ef63b745f2486e0dbb2ac21f4ca44f0 ]

We should unlock and re-enable IRQs if this allocation fails.

Fixes: 259307074bfc ("ipmi: Add SMBus interface driver (SSIF) ")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-13 19:47:52 +02:00
Robert Lippert
e9e5ad6b0e ipmi/watchdog: fix wdog hang on panic waiting for ipmi response
[ Upstream commit 2c1175c2e8e5487233cabde358a19577562ac83e ]

Commit c49c097610fe ("ipmi: Don't call receive handler in the
panic context") means that the panic_recv_free is not called during a
panic and the atomic count does not drop to 0.

Fix this by only expecting one decrement of the atomic variable
which comes from panic_smi_free.

Signed-off-by: Robert Lippert <rlippert@google.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-24 11:00:18 +01:00
Masamitsu Yamazaki
f8dac5bfbd ipmi: Stop timers before cleaning up the module
commit 4f7f5551a760eb0124267be65763008169db7087 upstream.

System may crash after unloading ipmi_si.ko module
because a timer may remain and fire after the module cleaned up resources.

cleanup_one_si() contains the following processing.

        /*
         * Make sure that interrupts, the timer and the thread are
         * stopped and will not run again.
         */
        if (to_clean->irq_cleanup)
                to_clean->irq_cleanup(to_clean);
        wait_for_timer_and_thread(to_clean);

        /*
         * Timeouts are stopped, now make sure the interrupts are off
         * in the BMC.  Note that timers and CPU interrupts are off,
         * so no need for locks.
         */
        while (to_clean->curr_msg || (to_clean->si_state != SI_NORMAL)) {
                poll(to_clean);
                schedule_timeout_uninterruptible(1);
        }

si_state changes as following in the while loop calling poll(to_clean).

  SI_GETTING_MESSAGES
    => SI_CHECKING_ENABLES
     => SI_SETTING_ENABLES
      => SI_GETTING_EVENTS
       => SI_NORMAL

As written in the code comments above,
timers are expected to stop before the polling loop and not to run again.
But the timer is set again in the following process
when si_state becomes SI_SETTING_ENABLES.

  => poll
     => smi_event_handler
       => handle_transaction_done
          // smi_info->si_state == SI_SETTING_ENABLES
         => start_getting_events
           => start_new_msg
            => smi_mod_timer
              => mod_timer

As a result, before the timer set in start_new_msg() expires,
the polling loop may see si_state becoming SI_NORMAL
and the module clean-up finishes.

For example, hard LOCKUP and panic occurred as following.
smi_timeout was called after smi_event_handler,
kcs_event and hangs at port_inb()
trying to access I/O port after release.

    [exception RIP: port_inb+19]
    RIP: ffffffffc0473053  RSP: ffff88069fdc3d80  RFLAGS: 00000006
    RAX: ffff8806800f8e00  RBX: ffff880682bd9400  RCX: 0000000000000000
    RDX: 0000000000000ca3  RSI: 0000000000000ca3  RDI: ffff8806800f8e40
    RBP: ffff88069fdc3d80   R8: ffffffff81d86dfc   R9: ffffffff81e36426
    R10: 00000000000509f0  R11: 0000000000100000  R12: 0000000000]:000000
    R13: 0000000000000000  R14: 0000000000000246  R15: ffff8806800f8e00
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0000
 --- <NMI exception stack> ---

To fix the problem I defined a flag, timer_can_start,
as member of struct smi_info.
The flag is enabled immediately after initializing the timer
and disabled immediately before waiting for timer deletion.

Fixes: 0cfec916e86d ("ipmi: Start the timer and thread on internal msgs")
Signed-off-by: Yamazaki Masamitsu <m-yamazaki@ah.jp.nec.com>
[Adjusted for recent changes in the driver.]
[Some fairly major changes went into the IPMI driver in 4.15, so this
 required a backport as the code had changed and moved to a different
 file.  The 4.14 version of this patch moved some code under an
 if statement causing it to not apply to 4.7-4.13.]
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-16 16:25:46 +01:00
Corey Minyard
55b06b0fc0 ipmi: fix unsigned long underflow
commit 392a17b10ec4320d3c0e96e2a23ebaad1123b989 upstream.

When I set the timeout to a specific value such as 500ms, the timeout
event will not happen in time due to the overflow in function
check_msg_timeout:
...
	ent->timeout -= timeout_period;
	if (ent->timeout > 0)
		return;
...

The type of timeout_period is long, but ent->timeout is unsigned long.
This patch makes the type consistent.

Reported-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Tested-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-24 08:33:42 +01:00
Valentin Vidic
d933777b1b ipmi/watchdog: fix watchdog timeout set on reboot
commit 860f01e96981a68553f3ca49f574ff14fe955e72 upstream.

systemd by default starts watchdog on reboot and sets the timer to
ShutdownWatchdogSec=10min.  Reboot handler in ipmi_watchdog than reduces
the timer to 120s which is not enough time to boot a Xen machine with
a lot of RAM.  As a result the machine is rebooted the second time
during the long run of (XEN) Scrubbing Free RAM.....

Fix this by setting the timer to 120s only if it was previously
set to a low value.

Signed-off-by: Valentin Vidic <Valentin.Vidic@CARNet.hr>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-06 18:59:43 -07:00
Corey Minyard
1b9008cdae ipmi:ssif: Add missing unlock in error branch
commit 4495ec6d770e1bca7a04e93ac453ab6720c56c5d upstream.

When getting flags, a response to a different message would
result in a deadlock because of a missing unlock.  Add that
unlock and a comment.  Found by static analysis.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-27 15:08:02 -07:00
Tony Camuso
685e124ebc ipmi: use rcu lock around call to intf->handlers->sender()
commit cdea46566bb21ce309725a024208322a409055cc upstream.

A vendor with a system having more than 128 CPUs occasionally encounters
the following crash during shutdown. This is not an easily reproduceable
event, but the vendor was able to provide the following analysis of the
crash, which exhibits the same footprint each time.

crash> bt
PID: 0      TASK: ffff88017c70ce70  CPU: 5   COMMAND: "swapper/5"
 #0 [ffff88085c143ac8] machine_kexec at ffffffff81059c8b
 #1 [ffff88085c143b28] __crash_kexec at ffffffff811052e2
 #2 [ffff88085c143bf8] crash_kexec at ffffffff811053d0
 #3 [ffff88085c143c10] oops_end at ffffffff8168ef88
 #4 [ffff88085c143c38] no_context at ffffffff8167ebb3
 #5 [ffff88085c143c88] __bad_area_nosemaphore at ffffffff8167ec49
 #6 [ffff88085c143cd0] bad_area_nosemaphore at ffffffff8167edb3
 #7 [ffff88085c143ce0] __do_page_fault at ffffffff81691d1e
 #8 [ffff88085c143d40] do_page_fault at ffffffff81691ec5
 #9 [ffff88085c143d70] page_fault at ffffffff8168e188
    [exception RIP: unknown or invalid address]
    RIP: ffffffffa053c800  RSP: ffff88085c143e28  RFLAGS: 00010206
    RAX: ffff88017c72bfd8  RBX: ffff88017a8dc000  RCX: ffff8810588b5ac8
    RDX: ffff8810588b5a00  RSI: ffffffffa053c800  RDI: ffff8810588b5a00
    RBP: ffff88085c143e58   R8: ffff88017c70d408   R9: ffff88017a8dc000
    R10: 0000000000000002  R11: ffff88085c143da0  R12: ffff8810588b5ac8
    R13: 0000000000000100  R14: ffffffffa053c800  R15: ffff8810588b5a00
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
    <IRQ stack>
    [exception RIP: cpuidle_enter_state+82]
    RIP: ffffffff81514192  RSP: ffff88017c72be50  RFLAGS: 00000202
    RAX: 0000001e4c3c6f16  RBX: 000000000000f8a0  RCX: 0000000000000018
    RDX: 0000000225c17d03  RSI: ffff88017c72bfd8  RDI: 0000001e4c3c6f16
    RBP: ffff88017c72be78   R8: 000000000000237e   R9: 0000000000000018
    R10: 0000000000002494  R11: 0000000000000001  R12: ffff88017c72be20
    R13: ffff88085c14f8e0  R14: 0000000000000082  R15: 0000001e4c3bb400
    ORIG_RAX: ffffffffffffff10  CS: 0010  SS: 0018

This is the corresponding stack trace

It has crashed because the area pointed with RIP extracted from timer
element is already removed during a shutdown process.

The function is smi_timeout().

And we think ffff8810588b5a00 in RDX is a parameter struct smi_info

crash> rd ffff8810588b5a00 20
ffff8810588b5a00:  ffff8810588b6000 0000000000000000   .`.X............
ffff8810588b5a10:  ffff880853264400 ffffffffa05417e0   .D&S......T.....
ffff8810588b5a20:  24a024a000000000 0000000000000000   .....$.$........
ffff8810588b5a30:  0000000000000000 0000000000000000   ................
ffff8810588b5a30:  0000000000000000 0000000000000000   ................
ffff8810588b5a40:  ffffffffa053a040 ffffffffa053a060   @.S.....`.S.....
ffff8810588b5a50:  0000000000000000 0000000100000001   ................
ffff8810588b5a60:  0000000000000000 0000000000000e00   ................
ffff8810588b5a70:  ffffffffa053a580 ffffffffa053a6e0   ..S.......S.....
ffff8810588b5a80:  ffffffffa053a4a0 ffffffffa053a250   ..S.....P.S.....
ffff8810588b5a90:  0000000500000002 0000000000000000   ................

Unfortunately the top of this area is already detroyed by someone.
But because of two reasonns we think this is struct smi_info
 1) The address included in between  ffff8810588b5a70 and ffff8810588b5a80:
  are inside of ipmi_si_intf.c  see crash> module ffff88085779d2c0

 2) We've found the area which point this.
  It is offset 0x68 of  ffff880859df4000

crash> rd  ffff880859df4000 100
ffff880859df4000:  0000000000000000 0000000000000001   ................
ffff880859df4010:  ffffffffa0535290 dead000000000200   .RS.............
ffff880859df4020:  ffff880859df4020 ffff880859df4020    @.Y.... @.Y....
ffff880859df4030:  0000000000000002 0000000000100010   ................
ffff880859df4040:  ffff880859df4040 ffff880859df4040   @@.Y....@@.Y....
ffff880859df4050:  0000000000000000 0000000000000000   ................
ffff880859df4060:  0000000000000000 ffff8810588b5a00   .........Z.X....
ffff880859df4070:  0000000000000001 ffff880859df4078   ........x@.Y....

 If we regards it as struct ipmi_smi in shutdown process
 it looks consistent.

The remedy for this apparent race is affixed below.

Signed-off-by: Tony Camuso <tcamuso@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

This was first introduced in 7ea0ed2b5be817 ipmi: Make the
message handler easier to use for SMI interfaces
where some code was moved outside of the rcu_read_lock()
and the lock was not added.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
2017-07-27 15:08:02 -07:00
Joeseph Chang
46ba11b007 ipmi: Fix kernel panic at ipmi_ssif_thread()
commit 6de65fcfdb51835789b245203d1bfc8d14cb1e06 upstream.

msg_written_handler() may set ssif_info->multi_data to NULL
when using ipmitool to write fru.

Before setting ssif_info->multi_data to NULL, add new local
pointer "data_to_send" and store correct i2c data pointer to
it to fix NULL pointer kernel panic and incorrect ssif_info->multi_pos.

Signed-off-by: Joeseph Chang <joechang@codeaurora.org>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-20 14:28:41 +02:00
Cédric Le Goater
1c8018f7a7 ipmi/bt-bmc: change compatible node to 'aspeed, ast2400-ibt-bmc'
The Aspeed SoCs have two BT interfaces : one is IPMI compliant and the
other is H8S/2168 compliant.

The current ipmi/bt-bmc driver implements the IPMI version and we
should reflect its nature in the compatible node name using
'aspeed,ast2400-ibt-bmc' instead of 'aspeed,ast2400-bt-bmc'. The
latter should be used for a H8S interface driver if it is implemented
one day.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Olof Johansson <olof@lixom.net>
2016-11-17 16:31:09 -08:00
Xie XiuQi
bd85f4b37d ipmi: fix crash on reading version from proc after unregisted bmc
I meet a crash, which could be reproduce:
1) while true; do cat /proc/ipmi/0/version; done
2) modprobe -rv ipmi_si ipmi_msghandler ipmi_devintf

[82761.021137] IPMI BT: req2rsp=5 secs retries=2
[82761.034524] ipmi device interface
[82761.222218] ipmi_si ipmi_si.0: Found new BMC (man_id: 0x0007db, prod_id: 0x0001, dev_id: 0x01)
[82761.222230] ipmi_si ipmi_si.0: IPMI bt interface initialized
[82903.922740] BUG: unable to handle kernel NULL pointer dereference at 00000000000002d4
[82903.930952] IP: [<ffffffffa030d9e8>] smi_version_proc_show+0x18/0x40 [ipmi_msghandler]
[82903.939220] PGD 86693a067 PUD 865304067 PMD 0
[82903.943893] Thread overran stack, or stack corrupted
[82903.949034] Oops: 0000 [#1] SMP
[82903.983091] Modules linked in: ipmi_si(-) ipmi_msghandler binfmt_misc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter
...
[82904.057285]  pps_core scsi_transport_sas dm_mod vfio_iommu_type1 vfio xt_sctp nf_conntrack_proto_sctp nf_nat_proto_sctp
                nf_nat nf_conntrack sctp libcrc32c [last unloaded: ipmi_devintf]
[82904.073169] CPU: 37 PID: 28089 Comm: cat Tainted: GF          O   ---- -------   3.10.0-327.28.3.el7.x86_64 #1
[82904.083373] Hardware name: Huawei RH2288H V3/BC11HGSA0, BIOS 3.22 05/16/2016
[82904.090592] task: ffff880101cc2e00 ti: ffff880369c54000 task.ti: ffff880369c54000
[82904.098414] RIP: 0010:[<ffffffffa030d9e8>]  [<ffffffffa030d9e8>] smi_version_proc_show+0x18/0x40 [ipmi_msghandler]
[82904.109124] RSP: 0018:ffff880369c57e70  EFLAGS: 00010203
[82904.114608] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000024688470
[82904.121912] RDX: fffffffffffffff4 RSI: ffffffffa0313404 RDI: ffff8808670ce200
[82904.129218] RBP: ffff880369c57e70 R08: 0000000000019720 R09: ffffffff81204a27
[82904.136521] R10: ffff88046f803300 R11: 0000000000000246 R12: ffff880662399700
[82904.143828] R13: 0000000000000001 R14: ffff880369c57f48 R15: ffff8808670ce200
[82904.151128] FS:  00007fb70c9ca740(0000) GS:ffff88086e340000(0000) knlGS:0000000000000000
[82904.159557] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[82904.165473] CR2: 00000000000002d4 CR3: 0000000864c0c000 CR4: 00000000003407e0
[82904.172778] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[82904.180084] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[82904.187385] Stack:
[82904.189573]  ffff880369c57ee0 ffffffff81204f1a 00000000122a2427 0000000001426000
[82904.197392]  ffff8808670ce238 0000000000010000 0000000000000000 0000000000000fff
[82904.205198]  00000000122a2427 ffff880862079600 0000000001426000 ffff880369c57f48
[82904.212962] Call Trace:
[82904.219667]  [<ffffffff81204f1a>] seq_read+0xfa/0x3a0
[82904.224893]  [<ffffffff8124ce2d>] proc_reg_read+0x3d/0x80
[82904.230468]  [<ffffffff811e102c>] vfs_read+0x9c/0x170
[82904.235689]  [<ffffffff811e1b7f>] SyS_read+0x7f/0xe0
[82904.240816]  [<ffffffff81649209>] system_call_fastpath+0x16/0x1b
[82904.246991] Code: 30 a0 e8 0c 6f ef e0 5b 5d c3 66 0f 1f 84 00 00 00 00 00 0f 1f
               44 00 00 48 8b 47 78 55 48 c7 c6 04 34 31 a0 48 89 e5 48 8b 40 50 <0f>
	       b6 90 d4 02 00 00 31 c0 89 d1 83 e2 0f c0 e9 04 0f b6 c9 e8
[82904.267710] RIP  [<ffffffffa030d9e8>] smi_version_proc_show+0x18/0x40 [ipmi_msghandler]
[82904.276079]  RSP <ffff880369c57e70>
[82904.279734] CR2: 00000000000002d4
[82904.283731] ---[ end trace a69e4328b49dd7c4 ]---
[82904.328118] Kernel panic - not syncing: Fatal exception

Reading versin from /proc need bmc device struct available. So in this patch
we move add/remove_proc_entries between ipmi_bmc_register and ipmi_bmc_unregister.

Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
2016-10-03 09:09:47 -05:00
Wei Yongjun
d94655b405 ipmi/bt-bmc: remove redundant return value check of platform_get_resource()
Remove unneeded error handling on the result of a call
to platform_get_resource() when the value is passed to
devm_ioremap_resource().

Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
2016-09-30 09:23:22 -05:00
Cédric Le Goater
a3e6061bad ipmi/bt-bmc: add a dependency on ARCH_ASPEED
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
2016-09-29 19:05:06 -05:00
Joel Stanley
1a377a7921 ipmi: Fix ioremap error handling in bt-bmc
devm_ioremap_resource returns ERR_PTR so we can't check for NULL.

Signed-off-by: Joel Stanley <joel@jms.id.au>
Acked-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
2016-09-29 19:05:06 -05:00
Alistair Popple
54f9c4d077 ipmi: add an Aspeed BT IPMI BMC driver
This patch adds a simple device driver to expose the iBT interface on
Aspeed SOCs (AST2400 and AST2500) as a character device. Such SOCs are
commonly used as BMCs (BaseBoard Management Controllers) and this
driver implements the BMC side of the BT interface.

The BT (Block Transfer) interface is used to perform in-band IPMI
communication between a host and its BMC. Entire messages are buffered
before sending a notification to the other end, host or BMC, that
there is data to be read. Usually, the host emits requests and the BMC
responses but the specification provides a mean for the BMC to send
SMS Attention (BMC-to-Host attention or System Management Software
attention) messages.

For this purpose, the driver introduces a specific ioctl on the
device: 'BT_BMC_IOCTL_SMS_ATN' that can be used by the system running
on the BMC to signal the host of such an event.

The device name defaults to '/dev/ipmi-bt-host'

Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Joel Stanley <joel@jms.id.au>
[clg: - checkpatch fixes
      - added a devicetree binding documentation
      - replace 'bt_host' by 'bt_bmc' to reflect that the driver is
        the BMC side of the IPMI BT interface
      - renamed the device to 'ipmi-bt-host'
      - introduced a temporary buffer to copy_{to,from}_user
      - used platform_get_irq()
      - moved the driver under drivers/char/ipmi/ but kept it as a misc
        device
      - changed the compatible cell to "aspeed,ast2400-bt-bmc"
]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
[clg: - checkpatch --strict fixes
      - removed the use of devm_iounmap, devm_kfree in cleanup paths
      - introduced an atomic-t to limit opens to 1
      - introduced a mutex to protect write/read operations]
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
2016-09-29 19:05:06 -05:00
Linus Torvalds
66304207cd Merge branch 'i2c/for-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c updates from Wolfram Sang:
 "Here is the I2C pull request for 4.8:

   - the core and i801 driver gained support for SMBus Host Notify

   - core support for more than one address in DT

   - i2c_add_adapter() has now better error messages.  We can remove all
     error messages from drivers calling it as a next step.

   - bigger updates to rk3x driver to support rk3399 SoC

   - the at24 eeprom driver got refactored and can now read special
     variants with unique serials or fixed MAC addresses.

  The rest is regular driver updates and bugfixes"

* 'i2c/for-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (66 commits)
  i2c: i801: use IS_ENABLED() instead of checking for built-in or module
  Documentation: i2c: slave: give proper example for pm usage
  Documentation: i2c: slave: describe buffer problems a bit better
  i2c: bcm2835: Don't complain on -EPROBE_DEFER from getting our clock
  i2c: i2c-smbus: drop useless stubs
  i2c: efm32: fix a failure path in efm32_i2c_probe()
  Revert "i2c: core: Cleanup I2C ACPI namespace"
  Revert "i2c: core: Add function for finding the bus speed from ACPI"
  i2c: Update the description of I2C_SMBUS
  i2c: i2c-smbus: fix i2c_handle_smbus_host_notify documentation
  eeprom: at24: tweak the loop_until_timeout() macro
  eeprom: at24: add support for at24mac series
  eeprom: at24: support reading the serial number for 24csxx
  eeprom: at24: platform_data: use BIT() macro
  eeprom: at24: split at24_eeprom_write() into specialized functions
  eeprom: at24: split at24_eeprom_read() into specialized functions
  eeprom: at24: hide the read/write loop behind a macro
  eeprom: at24: call read/write functions via function pointers
  eeprom: at24: coding style fixes
  eeprom: at24: move at24_read() below at24_eeprom_write()
  ...
2016-07-27 14:19:25 -07:00
Tony Camuso
b07b58a3e4 ipmi: remove trydefaults parameter and default init
Parameter trydefaults=1 causes the ipmi_init to initialize ipmi through
the legacy port io space that was designated for ipmi. Architectures
that do not map legacy port io can panic when trydefaults=1.

Rather than implement build-time conditional exceptions for each
architecture that does not map legacy port io, we have removed legacy
port io from the driver.

Parameter 'trydefaults' has been removed. Attempts to use it hereafter
will evoke the "Unknown symbol in module, or unknown parameter" message.

The patch was built against a number of architectures and tested for
regressions and functionality on x86_64 and ARM64.

Signed-off-by: Tony Camuso <tcamuso@redhat.com>

Removed the config entry and the address source entry for default,
since neither were used any more.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
2016-07-27 10:24:38 -05:00
Benjamin Tissoires
b4f210541f i2c: add a protocol parameter to the alert callback
.alert() is meant to be generic, but there is currently no way
for the device driver to know which protocol generated the alert.
Add a parameter in .alert() to help the device driver to understand
what is given in data.

This patch is required to have the support of SMBus Host Notify protocol
through .alert().

Tested-by: Andrew Duggan <aduggan@synaptics.com>
For hwmon:
Acked-by: Guenter Roeck <linux@roeck-us.net>
For IPMI:
Acked-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2016-06-17 12:41:25 +02:00
Junichi Nomura
ae4ea9a246 ipmi: Remove smi_msg from waiting_rcv_msgs list before handle_one_recv_msg()
Commit 7ea0ed2b5be8 ("ipmi: Make the message handler easier to use for
SMI interfaces") changed handle_new_recv_msgs() to call handle_one_recv_msg()
for a smi_msg while the smi_msg is still connected to waiting_rcv_msgs list.
That could lead to following list corruption problems:

1) low-level function treats smi_msg as not connected to list

  handle_one_recv_msg() could end up calling smi_send(), which
  assumes the msg is not connected to list.

  For example, the following sequence could corrupt list by
  doing list_add_tail() for the entry still connected to other list.

    handle_new_recv_msgs()
      msg = list_entry(waiting_rcv_msgs)
      handle_one_recv_msg(msg)
        handle_ipmb_get_msg_cmd(msg)
          smi_send(msg)
            spin_lock(xmit_msgs_lock)
            list_add_tail(msg)
            spin_unlock(xmit_msgs_lock)

2) race between multiple handle_new_recv_msgs() instances

  handle_new_recv_msgs() once releases waiting_rcv_msgs_lock before calling
  handle_one_recv_msg() then retakes the lock and list_del() it.

  If others call handle_new_recv_msgs() during the window shown below
  list_del() will be done twice for the same smi_msg.

  handle_new_recv_msgs()
    spin_lock(waiting_rcv_msgs_lock)
    msg = list_entry(waiting_rcv_msgs)
    spin_unlock(waiting_rcv_msgs_lock)
  |
  | handle_one_recv_msg(msg)
  |
    spin_lock(waiting_rcv_msgs_lock)
    list_del(msg)
    spin_unlock(waiting_rcv_msgs_lock)

Fixes: 7ea0ed2b5be8 ("ipmi: Make the message handler easier to use for SMI interfaces")
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
[Added a comment to describe why this works.]
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Cc: stable@vger.kernel.org # 3.19
Tested-by: Ye Feng <yefeng.yl@alibaba-inc.com>
2016-06-13 08:56:28 -05:00
Corey Minyard
70f95b76f1 ipmi: Fix the I2C address extraction from SPMI tables
Unlike everywhere else in the IPMI specification, the I2C address
specified in the SPMI table is not shifted to the left one bit with
the LSB zero.  Instead it is not shifted with the MSB zero.

Reported-by: Sanjeev <singhsan@codeaurora.org>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
2016-05-16 19:49:49 -05:00
Corey Minyard
57a38f1340 IPMI: reserve memio regions separately
Commit d61a3ead2680 ("[PATCH] IPMI: reserve I/O ports separately")
changed the way I/O ports were reserved and includes this comment in
log:

 Some BIOSes reserve disjoint I/O regions in their ACPI tables for the IPMI
 controller.  This causes problems when trying to register the entire I/O
 region.  Therefore we must register each I/O port separately.

There is a similar problem with memio regions on an arm64 platform
(AMD Seattle). Where I see:

 ipmi message handler version 39.2
 ipmi_si AMDI0300:00: probing via device tree
 ipmi_si AMDI0300:00: ipmi_si: probing via ACPI
 ipmi_si AMDI0300:00: [mem 0xe0010000] regsize 1 spacing 4 irq 23
 ipmi_si: Adding ACPI-specified kcs state machine
 IPMI System Interface driver.
 ipmi_si: Trying ACPI-specified kcs state machine at mem \
          address 0xe0010000, slave address 0x0, irq 23
 ipmi_si: Could not set up I/O space

The problem is that the ACPI core registers disjoint regions for the
platform device:

e0010000-e0010000 : AMDI0300:00
e0010004-e0010004 : AMDI0300:00

and the ipmi_si driver tries to register one region e0010000-e0010004.

Based on a patch from Mark Salter <msalter@redhat.com>, who also wrote
all the above text.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
Tested-by: Mark Salter <msalter@redhat.com>
2016-05-16 19:49:48 -05:00
Corey Minyard
76824852a9 ipmi: Fix some minor coding style issues
Signed-off-by: Corey Minyard <cminyard@mvista.com>
2016-05-16 19:49:48 -05:00
Hidehiro Kawai
73cbf4a1dd ipmi/watchdog: use nmi_panic() when kernel panics in NMI handler
Commit 1717f2096b54 ("panic, x86: Fix re-entrance problem due to panic
on NMI") introduced nmi_panic() which prevents concurrent and recursive
execution of panic().  It also saves registers for the crash dump on x86
by later commit 58c5661f2144 ("panic, x86: Allow CPUs to save registers
even if looping in NMI context").

ipmi_watchdog driver can call panic() from NMI handler, so replace it
with nmi_panic().

Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Acked-by: Corey Minyard <cminyard@mvista.com>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Joe Lawrence
9f0257b39c ipmi: do not probe ACPI devices if si_tryacpi is unset
Extend the tryacpi module parameter to turn off acpi_ipmi_probe such
that hard-coded options (type, ports, address, etc.) have complete
control over the smi_info data structures setup by the driver.

Signed-off-by: Joe Lawrence <joe.lawrence@stratus.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
2016-03-18 07:01:24 -05:00
Corey Minyard
d9dffd2a0b ipmi_si: Avoid a wrong long timeout on transaction done
Under some circumstances, the IPMI state machine could return
a call without delay option but the driver would still do a long
delay because the result wasn't checked.  Instead of calling
the state machine after transaction done, just go back to the
top of the processing to start over.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
2016-03-18 07:01:23 -05:00
Corey Minyard
f813655a36 ipmi_si: Fix module parameter doc names
Several were tryacpi instead of their actual values.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
2016-03-18 07:01:23 -05:00
Corey Minyard
21c8f9154d ipmi_ssif: Fix logic around alert handling
There was a mistake in the logic, if an alert came in very quickly
it would hang the driver.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
2016-03-18 07:01:23 -05:00
Tony Camuso
58c9d61f86 ipmi: put acpi.h with the other headers
Enclosing '#include <linux/acpi.h>' within '#ifdef CONFIG_ACPI' is
unnecessary, since it has its own conditional compile for CONFIG_ACPI.

Commit 0fbcf4af7c83 ("ipmi: Convert the IPMI SI ACPI handling to a
platform device") exposed this as a problem for platforms that do not
support ACPI when it introduced a call to ACPI_PTR() macro outside of
the CONFIG_ACPI conditional compile. This would have been perfectly
acceptable if acpi.h were not conditionally excluded for the non-acpi
platform, because the conditional compile within acpi.h defines
ACPI_PTR() to return NULL when compiled for non acpi platforms.

Signed-off-by: Tony Camuso <tcamuso@redhat.com>

Fixed commit reference in header to conform to standard.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
2016-02-03 10:35:52 -06:00
Dave Jones
bb0dcebef9 ipmi: Remove unnecessary pci_disable_device.
We call cleanup_one_si from ipmi_pci_remove, which calls ->addr_source_cleanup,
 which gets set to point to ipmi_pci_cleanup, which does a pci_disable_device.

On return from this, we do a second pci_disable_device, which
results in the trace below.

ipmi_si 0000:00:16.0: disabling already-disabled device
Call Trace:
 [<ffffffff818ce54c>] dump_stack+0x45/0x57
 [<ffffffff810525f7>] warn_slowpath_common+0x97/0xe0
 [<ffffffff810526f6>] warn_slowpath_fmt+0x46/0x50
 [<ffffffff81497ca1>] pci_disable_device+0xb1/0xc0
 [<ffffffffa00851a5>] ipmi_pci_remove+0x25/0x30 [ipmi_si]
 [<ffffffff8149a696>] pci_device_remove+0x46/0xc0
 [<ffffffff8156801f>] __device_release_driver+0x7f/0xf0
 [<ffffffff81568978>] driver_detach+0xb8/0xc0
 [<ffffffff81567e50>] bus_remove_driver+0x50/0xa0
 [<ffffffff8156914e>] driver_unregister+0x2e/0x60
 [<ffffffff8149a3e5>] pci_unregister_driver+0x25/0x90
 [<ffffffffa0085804>] cleanup_ipmi_si+0xd4/0xf0 [ipmi_si]
 [<ffffffff810c727a>] SyS_delete_module+0x12a/0x200
 [<ffffffff818d4d72>] system_call_fastpath+0x12/0x17

Signed-off-by: Dave Jones <dsj@fb.com>
2016-01-12 15:08:49 -06:00
Krzysztof Kozlowski
aad756f869 char: ipmi: Drop owner assignment from i2c_driver
i2c_driver does not need to set an owner because i2c_register_driver()
will set it.

Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
2016-01-12 15:08:49 -06:00
LABBE Corentin
99ee67351b ipmi: constify some struct and char arrays
Lots of char arrays could be set as const since they contain only literal
char arrays.
We could in the same time make const some struct members who are pointer
to those const char arrays.

Signed-off-by: LABBE Corentin <clabbe.montjoie@gmail.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
2016-01-12 15:08:49 -06:00
Jan Stancek
27f972d3e0 ipmi: move timer init to before irq is setup
We encountered a panic on boot in ipmi_si on a dell per320 due to an
uninitialized timer as follows.

static int smi_start_processing(void       *send_info,
                                ipmi_smi_t intf)
{
        /* Try to claim any interrupts. */
        if (new_smi->irq_setup)
                new_smi->irq_setup(new_smi);

 --> IRQ arrives here and irq handler tries to modify uninitialized timer

    which triggers BUG_ON(!timer->function) in __mod_timer().

 Call Trace:
   <IRQ>
   [<ffffffffa0532617>] start_new_msg+0x47/0x80 [ipmi_si]
   [<ffffffffa053269e>] start_check_enables+0x4e/0x60 [ipmi_si]
   [<ffffffffa0532bd8>] smi_event_handler+0x1e8/0x640 [ipmi_si]
   [<ffffffff810f5584>] ? __rcu_process_callbacks+0x54/0x350
   [<ffffffffa053327c>] si_irq_handler+0x3c/0x60 [ipmi_si]
   [<ffffffff810efaf0>] handle_IRQ_event+0x60/0x170
   [<ffffffff810f245e>] handle_edge_irq+0xde/0x180
   [<ffffffff8100fc59>] handle_irq+0x49/0xa0
   [<ffffffff8154643c>] do_IRQ+0x6c/0xf0
   [<ffffffff8100ba53>] ret_from_intr+0x0/0x11

        /* Set up the timer that drives the interface. */
        setup_timer(&new_smi->si_timer, smi_timeout, (long)new_smi);

The following patch fixes the problem.

To: Openipmi-developer@lists.sourceforge.net
To: Corey Minyard <minyard@acm.org>
CC: linux-kernel@vger.kernel.org

Signed-off-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Tony Camuso <tcamuso@redhat.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Cc: stable@vger.kernel.org # Applies cleanly to 3.10-, needs small rework before
2015-12-09 13:13:06 -06:00
Jean-Yves Faye
c7f42c6390 ipmi watchdog : add panic_wdt_timeout parameter
In order to allow panic actions to be processed, the ipmi watchdog
driver sets a new timeout value on panic. The 255s timeout
was designed to allow kdump and others actions on panic, as in
http://lkml.iu.edu/hypermail/linux/kernel/0711.3/0258.html

This is counter-intuitive for a end-user who sets watchdog timeout
value to something like 30s and who expects BMC to reset the system
within 30s of a panic.

This commit allows user to configure the timeout on panic.

Signed-off-by: Jean-Yves Faye <jean-yves.faye@c-s.fr>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
2015-11-16 06:28:43 -06:00
Luis de Bethencourt
66f4401830 char: ipmi: Move MODULE_DEVICE_TABLE() to follow struct
The policy for drivers is to have MODULE_DEVICE_TABLE() just after the
struct used in it. For clarity.

Suggested-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Luis de Bethencourt <luisbg@osg.samsung.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
2015-11-15 21:08:26 -06:00
Corey Minyard
314ef52fe6 ipmi: Stop the timer immediately if idle
The IPMI driver would let the final timeout just happen, but it could
easily just stop the timer.  If the timer stop fails that's ok, that
should be rare.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
2015-11-15 21:08:26 -06:00
Corey Minyard
0cfec916e8 ipmi: Start the timer and thread on internal msgs
The timer and thread were not being started for internal messages,
so in interrupt mode if something hung the timer would never go
off and clean things up.  Factor out the internal message sending
and start the timer for those messages, too.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
Tested-by: Gouji, Masayuki <gouji.masayuki@jp.fujitsu.com>
Cc: stable@vger.kernel.org
2015-11-15 21:08:26 -06:00
Amitoj Kaur Chawla
526290aa62 char: ipmi: ipmi_ssif: Replace timeval with timespec64
This patch replaces timeval with timespec64 as 32 bit 'struct timeval'
will not give current time beyond 2038.

The patch changes the code to use ktime_get_real_ts64() which returns
a 'struct timespec64' instead of do_gettimeofday() which returns a
'struct timeval'

This patch also alters the format string in pr_info() for now.tv_sec
to incorporate 'long long' on 32 bit architectures.

Signed-off-by: Amitoj Kaur Chawla <amitoj1606@gmail.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-10-24 19:46:42 -07:00
Corey Minyard
bf2d087749 ipmi:ssif: Add a module parm to specify that SMBus alerts don't work
They are broken on some platforms, this gives people a chance to work
around it until the firmware is fixed.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
2015-09-03 15:02:31 -05:00
Brijesh Singh
acbd9ae70a ipmi: add of_device_id in MODULE_DEVICE_TABLE
Fix autoloading ipmi modules when using device tree.

Signed-off-by: Brijesh Singh <brijeshkumar.singh@amd.com>

Moved this change up into the CONFIG_OF section to account
for changes to the probing code.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
2015-09-03 15:02:31 -05:00
Corey Minyard
d08828973d ipmi: Compensate for BMCs that wont set the irq enable bit
It appears that some BMCs support interrupts but don't support setting
the irq enable bits.  The interrupts are just always on.  Sigh.
Add code to compensate.

The new code was very similar to another functions, so this also
factors out the common code into other functions.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
Tested-by: Henrik Korkuc <henrik@kirneh.eu>
2015-09-03 15:02:30 -05:00
Hidehiro Kawai
c49c097610 ipmi: Don't call receive handler in the panic context
Received handlers defined as ipmi_recv_hndl member of struct
ipmi_user_hndl can take a spinlock.  This means that if the kernel
panics while holding the lock, a deadlock may happen on the lock
while flushing queued messages in the panic context.

Calling the receive handler doesn't make much meanings in the panic
context, simply skip it to avoid possible deadlocks.

Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
2015-09-03 15:02:29 -05:00