linux/drivers
Russell King 025e4ab3db pcmcia: fix socket refcount decrementing on each resume
This fixes a memory-corrupting bug: not only does it cause the warning,
but as a result of dropping the refcount to zero, it causes the
pcmcia_socket0 device structure to be freed while it still has
references, causing slab caches corruption.  A fatal oops quickly
follows this warning - often even just a 'dmesg' following the warning
causes the kernel to oops.

While testing suspend/resume on an ARM device with PCMCIA support, and a
CF card inserted, I found that after five suspend and resumes, the
kernel would complain, and shortly die after with slab corruption.

  WARNING: at include/linux/kref.h:41 kobject_get+0x28/0x50()

As the message doesn't give a clue about which kobject, and the built-in
debugging in drivers/base/power/main.c happens too late, this was added
right before each get_device():

  printk("%s: %p [%s] %u\n", __func__, dev, kobject_name(&dev->kobj), atomic_read(&dev->kobj.kref.refcount));

and on the 3rd s2ram cycle, the following behaviour observed:

On the 3rd suspend/resume cycle:

  dpm_prepare: c1a0d998 [pcmcia_socket0] 3
  dpm_suspend: c1a0d998 [pcmcia_socket0] 3
  dpm_suspend_noirq: c1a0d998 [pcmcia_socket0] 3
  dpm_resume_noirq: c1a0d998 [pcmcia_socket0] 3
  dpm_resume: c1a0d998 [pcmcia_socket0] 3
  dpm_complete: c1a0d998 [pcmcia_socket0] 2

4th:

  dpm_prepare: c1a0d998 [pcmcia_socket0] 2
  dpm_suspend: c1a0d998 [pcmcia_socket0] 2
  dpm_suspend_noirq: c1a0d998 [pcmcia_socket0] 2
  dpm_resume_noirq: c1a0d998 [pcmcia_socket0] 2
  dpm_resume: c1a0d998 [pcmcia_socket0] 2
  dpm_complete: c1a0d998 [pcmcia_socket0] 1

5th:

  dpm_prepare: c1a0d998 [pcmcia_socket0] 1
  dpm_suspend: c1a0d998 [pcmcia_socket0] 1
  dpm_suspend_noirq: c1a0d998 [pcmcia_socket0] 1
  dpm_resume_noirq: c1a0d998 [pcmcia_socket0] 1
  dpm_resume: c1a0d998 [pcmcia_socket0] 1
  dpm_complete: c1a0d998 [pcmcia_socket0] 0
  ------------[ cut here ]------------
  WARNING: at include/linux/kref.h:41 kobject_get+0x28/0x50()
  Modules linked in: ucb1x00_core
  Backtrace:
  [<c0212090>] (dump_backtrace+0x0/0x110) from [<c04799dc>] (dump_stack+0x18/0x1c)
  [<c04799c4>] (dump_stack+0x0/0x1c) from [<c021cba0>] (warn_slowpath_common+0x50/0x68)
  [<c021cb50>] (warn_slowpath_common+0x0/0x68) from [<c021cbdc>] (warn_slowpath_null+0x24/0x28)
  [<c021cbb8>] (warn_slowpath_null+0x0/0x28) from [<c0335374>] (kobject_get+0x28/0x50)
  [<c033534c>] (kobject_get+0x0/0x50) from [<c03804f4>] (get_device+0x1c/0x24)
  [<c0388c90>] (dpm_complete+0x0/0x1a0) from [<c0389cc0>] (dpm_resume_end+0x1c/0x20)
  ...

Looking at commit 7b24e79882 ("pcmcia: split up central event handler"),
the following change was made to cs.c:

                return 0;
        }
 #endif
-
-       send_event(skt, CS_EVENT_PM_RESUME, CS_EVENT_PRI_LOW);
+       if (!(skt->state & SOCKET_CARDBUS) && (skt->callback))
+               skt->callback->early_resume(skt);
        return 0;
 }

And the corresponding change in ds.c is from:

-static int ds_event(struct pcmcia_socket *skt, event_t event, int priority)
-{
-       struct pcmcia_socket *s = pcmcia_get_socket(skt);
...
-       switch (event) {
...
-       case CS_EVENT_PM_RESUME:
-               if (verify_cis_cache(skt) != 0) {
-                       dev_dbg(&skt->dev, "cis mismatch - different card\n");
-                       /* first, remove the card */
-                       ds_event(skt, CS_EVENT_CARD_REMOVAL, CS_EVENT_PRI_HIGH);
-                       mutex_lock(&s->ops_mutex);
-                       destroy_cis_cache(skt);
-                       kfree(skt->fake_cis);
-                       skt->fake_cis = NULL;
-                       s->functions = 0;
-                       mutex_unlock(&s->ops_mutex);
-                       /* now, add the new card */
-                       ds_event(skt, CS_EVENT_CARD_INSERTION,
-                                CS_EVENT_PRI_LOW);
-               }
-               break;
...
-    }

-    pcmcia_put_socket(s);

-    return 0;
-} /* ds_event */

to:

+static int pcmcia_bus_early_resume(struct pcmcia_socket *skt)
+{
+       if (!verify_cis_cache(skt)) {
+               pcmcia_put_socket(skt);
+               return 0;
+       }

+       dev_dbg(&skt->dev, "cis mismatch - different card\n");

+       /* first, remove the card */
+       pcmcia_bus_remove(skt);
+       mutex_lock(&skt->ops_mutex);
+       destroy_cis_cache(skt);
+       kfree(skt->fake_cis);
+       skt->fake_cis = NULL;
+       skt->functions = 0;
+       mutex_unlock(&skt->ops_mutex);

+       /* now, add the new card */
+       pcmcia_bus_add(skt);
+       return 0;
+}

As can be seen, the original function called pcmcia_get_socket() and
pcmcia_put_socket() around the guts, whereas the replacement code
calls pcmcia_put_socket() only in one path.  This creates an imbalance
in the refcounting.

Testing with pcmcia_put_socket() put removed shows that the bug is gone:

  dpm_suspend: c1a10998 [pcmcia_socket0] 5
  dpm_suspend_noirq: c1a10998 [pcmcia_socket0] 5
  dpm_resume_noirq: c1a10998 [pcmcia_socket0] 5
  dpm_resume: c1a10998 [pcmcia_socket0] 5
  dpm_complete: c1a10998 [pcmcia_socket0] 5

Tested-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-02-08 19:03:51 -08:00
..
accessibility module_param: make bool parameters really bool (drivers & misc) 2012-01-13 09:32:20 +10:30
acpi ACPI: remove duplicated lines of merging problems with acpi_processor_add 2012-02-07 14:31:35 -08:00
amba
ata [libata] ata_piix: Add Toshiba Satellite Pro A120 to the quirks list 2012-01-17 20:50:53 -05:00
atm module_param: make bool parameters really bool (drivers & misc) 2012-01-13 09:32:20 +10:30
auxdisplay
base Here are some patches for the 3.3-rc1 tree. 2012-01-28 18:20:48 -08:00
bcma bcma: connect the bcma bus suspend/resume to the bcma driver suspend/resume 2012-01-17 09:54:08 -05:00
block Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client 2012-02-02 15:47:33 -08:00
bluetooth module_param: make bool parameters really bool (drivers & misc) 2012-01-13 09:32:20 +10:30
cdrom block: add and use scsi_blk_cmd_ioctl 2012-01-14 15:07:24 -08:00
char agp: fix scratch page cleanup 2012-01-26 18:36:48 +00:00
clk
clocksource
connector
cpufreq Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq 2012-01-11 18:53:33 -08:00
cpuidle
crypto Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2012-01-10 22:01:27 -08:00
dca
devfreq
dio
dma i.MX SDMA: Fix burstsize settings 2012-02-02 14:00:43 +05:30
edac module_param: make bool parameters really bool (drivers & misc) 2012-01-13 09:32:20 +10:30
eisa
firewire firewire: ohci: disable MSI on Ricoh controllers 2012-01-30 21:33:34 +01:00
firmware Merge commit '070680218379e15c1901f4bf21b98e3cbf12b527' into stable/for-linus-fixes-3.3 2012-01-12 11:53:55 -05:00
gpio gpio: Add missing spin_lock_init in gpio-ml-ioh driver 2012-02-01 21:59:37 -07:00
gpu drm/radeon/kms/blit: fix blit copy for very large buffers 2012-02-02 15:54:48 +00:00
hid HID: wiimote: fix invalid power_supply_powers call 2012-02-07 13:40:56 +01:00
hv
hwmon hwmon: (w83627ehf) Fix number of fans for NCT6776F 2012-02-04 18:08:23 -08:00
hwspinlock
i2c i2c: OMAP: Fix OMAP1 build error 2012-01-20 08:24:22 -08:00
ide block: add and use scsi_blk_cmd_ioctl 2012-01-14 15:07:24 -08:00
idle ACPI processor hotplug: Delay acpi_processor_start() call for hotplugged cores 2012-01-19 21:26:32 -05:00
ieee802154
infiniband IB/srpt: Don't return freed pointer from srpt_alloc_ioctx_ring() 2012-02-06 08:57:11 -08:00
input Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input 2012-02-04 10:57:42 -08:00
iommu
isdn Autogenerated GPG tag for Rusty D1ADB8F1: 15EE 8D6C AB0E 7F0C F999 BFCB D920 0E6C D1AD B8F1 2012-01-14 12:32:16 -08:00
leds drivers/leds/leds-lm3530.c: fix setting pltfm->als_vmax 2012-02-08 19:03:51 -08:00
lguest lguest: Make sure interrupt is allocated ok by lguest_setup_irq 2012-01-12 15:44:47 +10:30
macintosh module_param: make bool parameters really bool (drivers & misc) 2012-01-13 09:32:20 +10:30
mca
md Merge branch 'for-3.3/core' of git://git.kernel.dk/linux-block 2012-01-15 12:24:45 -08:00
media Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media 2012-01-26 17:04:47 -08:00
memstick module_param: make bool parameters really bool (drivers & misc) 2012-01-13 09:32:20 +10:30
message
mfd mfd: Avoid twl6040-codec PLL reconfiguration when not needed 2012-02-03 19:03:50 +01:00
misc lkdtm: avoid calling lkdtm_do_action() with spinlock held 2012-02-03 16:16:41 -08:00
mmc Merge branch 'next' of git://git.infradead.org/users/vkoul/slave-dma 2012-01-17 18:40:24 -08:00
mtd - Fix a regression in 16-bit Atmel NAND flash which was introduced in 3.1 2012-02-04 07:17:47 -08:00
net Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless 2012-01-27 20:40:18 -05:00
nfc
nubus
of 2nd set of device tree changes for v3.3 2012-01-14 13:25:55 -08:00
oprofile
parisc Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci 2012-01-11 18:50:26 -08:00
parport Autogenerated GPG tag for Rusty D1ADB8F1: 15EE 8D6C AB0E 7F0C F999 BFCB D920 0E6C D1AD B8F1 2012-01-14 12:32:16 -08:00
pci kernel-doc: fix new warnings in pci 2012-01-23 08:44:53 -08:00
pcmcia pcmcia: fix socket refcount decrementing on each resume 2012-02-08 19:03:51 -08:00
pinctrl pinctrl: add checks for empty function names 2012-01-26 14:13:11 +01:00
platform module_param: make bool parameters really bool (drivers & misc) 2012-01-13 09:32:20 +10:30
pnp
power module_param: make bool parameters really bool (drivers & misc) 2012-01-13 09:32:20 +10:30
pps
ps3
ptp
rapidio
regulator This fixes an integration issue with the regulator device tree bindings 2012-01-30 10:16:25 -08:00
rtc Revert "RTC: sa1100: remove redundant code of setting alarm" 2012-01-19 17:26:26 +00:00
s390 [S390] dasd: revalidate server for new pathgroup 2012-01-18 18:03:42 +01:00
sbus
scsi Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k 2012-01-26 12:43:57 -08:00
sfi
sh SH/R-Mobile updates for 3.3 merge window. 2012-01-11 23:29:20 -08:00
sn
spi Merge branch 'next' of git://git.infradead.org/users/vkoul/slave-dma 2012-01-17 18:40:24 -08:00
ssb
staging staging: fix go7007-usb license 2012-02-01 18:29:33 -08:00
target
tc
telephony
thermal thermal: Rename generate_netlink_event 2012-01-23 03:15:25 -05:00
tty drivers/tty/vt/vt_ioctl.c: fix KDFONTOP 32bit compatibility layer 2012-02-03 16:16:41 -08:00
uio Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci 2012-01-11 18:50:26 -08:00
usb Here are a bunch of USB patches for 3.3-rc1. 2012-01-30 11:38:28 -08:00
uwb
vhost vhost-net: add module alias (v2.1) 2012-01-13 10:12:23 -08:00
video fbdev fixes for 3.3 2012-02-07 15:54:02 -08:00
virt
virtio virtio: correct the memory barrier in virtqueue_kick_prepare() 2012-01-28 08:10:23 +10:30
vlynq
w1
watchdog watchdog: iTCO_wdt: add Intel Lynx Point DeviceIDs 2012-01-27 10:01:16 +01:00
xen xen/granttable: Disable grant v2 for HVM domains. 2012-01-27 11:14:16 -05:00
zorro
Kconfig
Makefile mmc: sdhci-pci: add platform data 2012-01-11 23:58:47 -05:00