linux

iv/linux

Author	SHA1	Message	Date
David Howells	499f9ff872	X.509: Fix leap year handling again commit ac4cbedfdf55455b4c447f17f0fa027dbf02b2a6 upstream. There are still a couple of minor issues in the X.509 leap year handling: (1) To avoid doing a modulus-by-400 in addition to a modulus-by-100 when determining whether the year is a leap year or not, I divided the year by 100 after doing the modulus-by-100, thereby letting the compiler do one instruction for both, and then did a modulus-by-4. Unfortunately, I then passed the now-modified year value to mktime64() to construct a time value. Since this isn't a fast path and since mktime64() does a bunch of divisions, just condense down to "% 400". It's also easier to read. (2) The default month length for any February where the year doesn't divide by four exactly is obtained from the month_length[] array where the value is 29, not 28. This is fixed by altering the table. Reported-by: Rudolf Polzer <rpolzer@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: David Woodhouse <David.Woodhouse@intel.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:46 -07:00
Boris BREZILLON	f08fc4eed8	crypto: marvell/cesa - forward devm_ioremap_resource() error code commit dfe97ad30e8c038261663a18b9e04b8b5bc07bea upstream. Forward devm_ioremap_resource() error code instead of returning -ENOMEM. Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Reported-by: Russell King - ARM Linux <linux@arm.linux.org.uk> Fixes: f63601fd616a ("crypto: marvell/cesa - add a new driver for Marvell's CESA") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:46 -07:00
Vladimir Zapolskiy	75efb5fe5c	crypto: ux500 - fix checks of error code returned by devm_ioremap_resource() commit b62917a2622ebcb03a500ef20da47be80d8c8951 upstream. The change fixes potential oops while accessing iomem on invalid address, if devm_ioremap_resource() fails due to some reason. The devm_ioremap_resource() function returns ERR_PTR() and never returns NULL, which makes useless a following check for NULL. Signed-off-by: Vladimir Zapolskiy <vz@mleia.com> Fixes: 5a4eea2658c93 ("crypto: ux500 - Use devm_xxx() managed function") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:46 -07:00
Vladimir Zapolskiy	90933f3fb6	crypto: atmel - fix checks of error code returned by devm_ioremap_resource() commit 9b52d55f4f0e2bb9a34abbcf99e05e17f1b3b281 upstream. The change fixes potential oops while accessing iomem on invalid address, if devm_ioremap_resource() fails due to some reason. The devm_ioremap_resource() function returns ERR_PTR() and never returns NULL, which makes useless a following check for NULL. Signed-off-by: Vladimir Zapolskiy <vz@mleia.com> Fixes: b0e8b3417a62 ("crypto: atmel - use devm_xxx() managed function") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:46 -07:00
Dan Carpenter	f69c1b51f6	crypto: keywrap - memzero the correct memory commit 2b8b28fd232233c22fb61009dd8b0587390d2875 upstream. We're clearing the wrong memory. The memory corruption is likely harmless because we weren't going to use that stack memory again but not zeroing is a potential information leak. Fixes: e28facde3c39 ('crypto: keywrap - add key wrapping block chaining mode') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:45 -07:00
Tom Lendacky	0cdc91f539	crypto: ccp - memset request context to zero during import commit ce0ae266feaf35930394bd770c69778e4ef03ba9 upstream. Since a crypto_ahash_import() can be called against a request context that has not had a crypto_ahash_init() performed, the request context needs to be cleared to insure there is no random data present. If not, the random data can result in a kernel oops during crypto_ahash_update(). Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:45 -07:00
Tom Lendacky	cc78d091bd	crypto: ccp - Don't assume export/import areas are aligned commit b31dde2a5cb1bf764282abf934266b7193c2bc7c upstream. Use a local variable for the exported and imported state so that alignment is not an issue. On export, set a local variable from the request context and then memcpy the contents of the local variable to the export memory area. On import, memcpy the import memory area into a local variable and then use the local variable to set the request context. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:45 -07:00
Tom Lendacky	8c5156ad2d	crypto: ccp - Limit the amount of information exported commit d1662165ae612ec8b5f94a6b07e65ea58b6dce34 upstream. Since the exported information can be exposed to user-space, instead of exporting the entire request context only export the minimum information needed. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:45 -07:00
Tom Lendacky	120e2febfc	crypto: ccp - Add hash state import and export support commit 952bce9792e6bf36fda09c2e5718abb5d9327369 upstream. Commit 8996eafdcbad ("crypto: ahash - ensure statesize is non-zero") added a check to prevent ahash algorithms from successfully registering if the import and export functions were not implemented. This prevents an oops in the hash_accept function of algif_hash. This commit causes the ccp-crypto module SHA support and AES CMAC support from successfully registering and causing the ccp-crypto module load to fail because the ahash import and export functions are not implemented. Update the CCP Crypto API support to provide import and export support for ahash algorithms. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:45 -07:00
Dmitry Tunin	28dee87525	Bluetooth: btusb: Add a new AR3012 ID 13d3:3472 commit 75c6aca4765dbe3d0c1507ab5052f2e373dc2331 upstream. T: Bus=01 Lev=01 Prnt=01 Port=04 Cnt=01 Dev#= 4 Spd=12 MxCh= 0 D: Ver= 1.10 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=13d3 ProdID=3472 Rev=00.01 C: #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=100mA I: If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb I: If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb BugLink: https://bugs.launchpad.net/bugs/1552925 Signed-off-by: Dmitry Tunin <hanipouspilot@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:44 -07:00
Dmitry Tunin	36a7f23fc7	Bluetooth: btusb: Add a new AR3012 ID 04ca:3014 commit 81d90442eac779938217c3444b240aa51fd3db47 upstream. T: Bus=01 Lev=01 Prnt=01 Port=04 Cnt=03 Dev#= 5 Spd=12 MxCh= 0 D: Ver= 1.10 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=04ca ProdID=3014 Rev=00.02 C: #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=100mA I: If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb I: If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb BugLink: https://bugs.launchpad.net/bugs/1546694 Signed-off-by: Dmitry Tunin <hanipouspilot@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:44 -07:00
Dmitry Tunin	fb243c3d81	Bluetooth: btusb: Add new AR3012 ID 13d3:3395 commit 609574eb46335cfac1421a07c0505627cbbab1f0 upstream. T: Bus=03 Lev=02 Prnt=02 Port=00 Cnt=01 Dev#= 3 Spd=12 MxCh= 0 D: Ver= 1.10 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=13d3 ProdID=3395 Rev=00.01 C: #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=100mA I: If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb I: If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb BugLink: https://bugs.launchpad.net/bugs/1542564 Reported-and-tested-by: Christopher Simerly <kilikopela29@gmail.com> Signed-off-by: Dmitry Tunin <hanipouspilot@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:44 -07:00
Vladis Dronov	b7f03eeaaf	ALSA: usb-audio: Fix double-free in error paths after snd_usb_add_audio_stream() call commit 836b34a935abc91e13e63053d0a83b24dfb5ea78 upstream. create_fixed_stream_quirk(), snd_usb_parse_audio_interface() and create_uaxx_quirk() functions allocate the audioformat object by themselves and free it upon error before returning. However, once the object is linked to a stream, it's freed again in snd_usb_audio_pcm_free(), thus it'll be double-freed, eventually resulting in a memory corruption. This patch fixes these failures in the error paths by unlinking the audioformat object before freeing it. Based on a patch by Takashi Iwai <tiwai@suse.de> [Note for stable backports: this patch requires the commit 902eb7fd1e4a ('ALSA: usb-audio: Minor code cleanup in create_fixed_stream_quirk()')] Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1283358 Reported-by: Ralf Spenneberg <ralf@spenneberg.net> Signed-off-by: Vladis Dronov <vdronov@redhat.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:44 -07:00
Takashi Iwai	4d073cfdf7	ALSA: usb-audio: Minor code cleanup in create_fixed_stream_quirk() commit 902eb7fd1e4af3ac69b9b30f8373f118c92b9729 upstream. Just a minor code cleanup: unify the error paths. Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:44 -07:00
Victor Clément	94bfaf24e6	ALSA: usb-audio: add Microsoft HD-5001 to quirks commit 0ef21100ae912f76ed89f76ecd894f4ffb3689c1 upstream. The Microsoft HD-5001 webcam microphone does not support sample rate reading as the HD-5000 one. This results in dmesg errors and sound hanging with pulseaudio. Signed-off-by: Victor Clément <victor.clement@openmailbox.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:43 -07:00
Takashi Iwai	f9f026d390	ALSA: usb-audio: Add sanity checks for endpoint accesses commit 447d6275f0c21f6cc97a88b3a0c601436a4cdf2a upstream. Add some sanity check codes before actually accessing the endpoint via get_endpoint() in order to avoid the invalid access through a malformed USB descriptor. Mostly just checking bNumEndpoints, but in one place (snd_microii_spdif_default_get()), the validity of iface and altsetting index is checked as well. Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=971125 Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:43 -07:00
Takashi Iwai	57f6ad5f15	ALSA: usb-audio: Fix NULL dereference in create_fixed_stream_quirk() commit 0f886ca12765d20124bd06291c82951fd49a33be upstream. create_fixed_stream_quirk() may cause a NULL-pointer dereference by accessing the non-existing endpoint when a USB device with a malformed USB descriptor is used. This patch avoids it simply by adding a sanity check of bNumEndpoints before the accesses. Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=971125 Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:43 -07:00
Josh Boyer	fbd40d7bee	Input: powermate - fix oops with malicious USB descriptors commit 9c6ba456711687b794dcf285856fc14e2c76074f upstream. The powermate driver expects at least one valid USB endpoint in its probe function. If given malicious descriptors that specify 0 for the number of endpoints, it will crash. Validate the number of endpoints on the interface before using them. The full report for this issue can be found here: http://seclists.org/bugtraq/2016/Mar/85 Reported-by: Ralf Spenneberg <ralf@spenneberg.net> Signed-off-by: Josh Boyer <jwboyer@fedoraproject.org> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:43 -07:00
Hans de Goede	5cede226da	pwc: Add USB id for Philips Spc880nc webcam commit 7445e45d19a09e5269dc85f17f9635be29d2f76c upstream. SPC 880NC PC camera discussions: http://www.pclinuxos.com/forum/index.php/topic,135688.0.html Reported-by: Kikim <klucznik0@op.pl> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:42 -07:00
Bjørn Mork	4910297147	USB: option: add "D-Link DWM-221 B1" device id commit d48d5691ebf88a15d95ba96486917ffc79256536 upstream. Thomas reports: "Windows: 00 diagnostics 01 modem 02 at-port 03 nmea 04 nic Linux: T: Bus=02 Lev=01 Prnt=01 Port=03 Cnt=01 Dev#= 4 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=2001 ProdID=7e19 Rev=02.32 S: Manufacturer=Mobile Connect S: Product=Mobile Connect S: SerialNumber=0123456789ABCDEF C: #Ifs= 6 Cfg#= 1 Atr=a0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option I: If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option I: If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan I: If#= 5 Alt= 0 #EPs= 2 Cls=08(stor.) Sub=06 Prot=50 Driver=usb-storage" Reported-by: Thomas Schäfer <tschaefer@t-online.de> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:42 -07:00
Josh Boyer	b3e0983cb9	USB: serial: ftdi_sio: Add support for ICP DAS I-756xU devices commit ea6db90e750328068837bed34cb1302b7a177339 upstream. A Fedora user reports that the ftdi_sio driver works properly for the ICP DAS I-7561U device. Further, the user manual for these devices instructs users to load the driver and add the ids using the sysfs interface. Add support for these in the driver directly so that the devices work out of the box instead of needing manual configuration. Reported-by: <thesource@mail.ru> Signed-off-by: Josh Boyer <jwboyer@fedoraproject.org> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:42 -07:00
Martyn Welch	df3dddcc64	USB: serial: cp210x: Adding GE Healthcare Device ID commit cddc9434e3dcc37a85c4412fb8e277d3a582e456 upstream. The CP2105 is used in the GE Healthcare Remote Alarm Box, with the Manufacturer ID of 0x1901 and Product ID of 0x0194. Signed-off-by: Martyn Welch <martyn.welch@collabora.co.uk> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:42 -07:00
Oliver Neukum	ca76906a77	USB: cypress_m8: add endpoint sanity check commit c55aee1bf0e6b6feec8b2927b43f7a09a6d5f754 upstream. An attack using missing endpoints exists. CVE-2016-3137 Signed-off-by: Oliver Neukum <ONeukum@suse.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:42 -07:00
Oliver Neukum	4f6ad5b0d2	USB: digi_acceleport: do sanity checking for the number of ports commit 5a07975ad0a36708c6b0a5b9fea1ff811d0b0c1f upstream. The driver can be crashed with devices that expose crafted descriptors with too few endpoints. See: http://seclists.org/bugtraq/2016/Mar/61 Signed-off-by: Oliver Neukum <ONeukum@suse.com> [johan: fix OOB endpoint check and add error messages ] Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:41 -07:00
Oliver Neukum	9deac9454b	USB: mct_u232: add sanity checking in probe commit 4e9a0b05257f29cf4b75f3209243ed71614d062e upstream. An attack using the lack of sanity checking in probe is known. This patch checks for the existence of a second port. CVE-2016-3136 Signed-off-by: Oliver Neukum <ONeukum@suse.com> [johan: add error message ] Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:41 -07:00
Oliver Neukum	b6c6426252	USB: usb_driver_claim_interface: add sanity checking commit 0b818e3956fc1ad976bee791eadcbb3b5fec5bfd upstream. Attacks that trick drivers into passing a NULL pointer to usb_driver_claim_interface() using forged descriptors are known. This thwarts them by sanity checking. Signed-off-by: Oliver Neukum <ONeukum@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:41 -07:00
Josh Boyer	850631bedd	USB: iowarrior: fix oops with malicious USB descriptors commit 4ec0ef3a82125efc36173062a50624550a900ae0 upstream. The iowarrior driver expects at least one valid endpoint. If given malicious descriptors that specify 0 for the number of endpoints, it will crash in the probe function. Ensure there is at least one endpoint on the interface before using it. The full report of this issue can be found here: http://seclists.org/bugtraq/2016/Mar/87 Reported-by: Ralf Spenneberg <ralf@spenneberg.net> Signed-off-by: Josh Boyer <jwboyer@fedoraproject.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:41 -07:00
Oliver Neukum	1ea680abf7	USB: cdc-acm: more sanity checking commit 8835ba4a39cf53f705417b3b3a94eb067673f2c9 upstream. An attack has become available which pretends to be a quirky device circumventing normal sanity checks and crashes the kernel by an insufficient number of interfaces. This patch adds a check to the code path for quirky devices. Signed-off-by: Oliver Neukum <ONeukum@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:41 -07:00
Hans de Goede	a85722c650	USB: uas: Reduce can_queue to MAX_CMNDS commit 55ff8cfbc4e12a7d2187df523938cc671fbebdd1 upstream. The uas driver can never queue more then MAX_CMNDS (- 1) tags and tags are shared between luns, so there is no need to claim that we can_queue some random large number. Not claiming that we can_queue 65536 commands, fixes the uas driver failing to initialize while allocating the tag map with a "Page allocation failure (order 7)" error on systems which have been running for a while and thus have fragmented memory. Reported-and-tested-by: Yves-Alexis Perez <corsac@corsac.net> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:40 -07:00
Oliver Neukum	aa563cf3bc	usb: hub: fix a typo in hub_port_init() leading to wrong logic commit 0d5ce778c43bf888328231bcdce05d5c860655aa upstream. A typo of j for i led to a logic bug. To rule out future confusion, the variable names are made meaningful. Signed-off-by: Oliver Neukum <ONeukum@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:40 -07:00
Oliver Neukum	951822beba	usb: retry reset if a device times out commit 264904ccc33c604d4b3141bbd33808152dfac45b upstream. Some devices I got show an inability to operate right after power on if they are already connected. They are beyond recovery if the descriptors are requested multiple times. So in case of a timeout we rather bail early and reset again. But it must be done only on the first loop lest we get into a reset/time out spiral that can be overcome with a retry. This patch is a rework of a patch that fell through the cracks. http://www.spinics.net/lists/linux-usb/msg103263.html Signed-off-by: Oliver Neukum <oneukum@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:40 -07:00
Bryn M. Reeves	8907d8a6fd	dm: fix rq_end_stats() NULL pointer in dm_requeue_original_request() commit 98dbc9c6c61698792e3a66f32f3bf066201d42d7 upstream. An "old" (.request_fn) DM 'struct request' stores a pointer to the associated 'struct dm_rq_target_io' in rq->special. dm_requeue_original_request(), previously named dm_requeue_unmapped_original_request(), called dm_unprep_request() to reset rq->special to NULL. But rq_end_stats() would go on to hit a NULL pointer deference because its call to tio_from_request() returned NULL. Fix this by calling rq_end_stats() _before_ dm_unprep_request() Signed-off-by: Bryn M. Reeves <bmr@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Fixes: e262f34741 ("dm stats: add support for request-based DM devices") Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:40 -07:00
Joe Thornber	7f47aea487	dm cache: make sure every metadata function checks fail_io commit d14fcf3dd79c0b8a8d0ba469c44a6b04f3a1403b upstream. Otherwise operations may be attempted that will only ever go on to crash (since the metadata device is either missing or unreliable if 'fail_io' is set). Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:40 -07:00
Joe Thornber	291e2b3900	dm thin metadata: don't issue prefetches if a transaction abort has failed commit 2eae9e4489b4cf83213fa3bd508b5afca3f01780 upstream. If a transaction abort has failed then we can no longer use the metadata device. Typically this happens if the superblock is unreadable. This fix addresses a crash seen during metadata device failure testing. Fixes: 8a01a6af75 ("dm thin: prefetch missing metadata pages") Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:40 -07:00
Mike Snitzer	5504a47088	dm: fix excessive dm-mq context switching commit 6acfe68bac7e6f16dc312157b1fa6e2368985013 upstream. Request-based DM's blk-mq support (dm-mq) was reported to be 50% slower than if an underlying null_blk device were used directly. One of the reasons for this drop in performance is that blk_insert_clone_request() was calling blk_mq_insert_request() with @async=true. This forced the use of kblockd_schedule_delayed_work_on() to run the blk-mq hw queues which ushered in ping-ponging between process context (fio in this case) and kblockd's kworker to submit the cloned request. The ftrace function_graph tracer showed: kworker-2013 => fio-12190 fio-12190 => kworker-2013 ... kworker-2013 => fio-12190 fio-12190 => kworker-2013 ... Fixing blk_insert_clone_request()'s blk_mq_insert_request() call to _not_ use kblockd to submit the cloned requests isn't enough to eliminate the observed context switches. In addition to this dm-mq specific blk-core fix, there are 2 DM core fixes to dm-mq that (when paired with the blk-core fix) completely eliminate the observed context switching: 1) don't blk_mq_run_hw_queues in blk-mq request completion Motivated by desire to reduce overhead of dm-mq, punting to kblockd just increases context switches. In my testing against a really fast null_blk device there was no benefit to running blk_mq_run_hw_queues() on completion (and no other blk-mq driver does this). So hopefully this change doesn't induce the need for yet another revert like commit 621739b00e16ca2d ! 2) use blk_mq_complete_request() in dm_complete_request() blk_complete_request() doesn't offer the traditional q->mq_ops vs .request_fn branching pattern that other historic block interfaces do (e.g. blk_get_request). Using blk_mq_complete_request() for blk-mq requests is important for performance. It should be noted that, like blk_complete_request(), blk_mq_complete_request() doesn't natively handle partial completions -- but the request-based DM-multipath target does provide the required partial completion support by dm.c:end_clone_bio() triggering requeueing of the request via dm-mpath.c:multipath_end_io()'s return of DM_ENDIO_REQUEUE. dm-mq fix #2 is _much_ more important than #1 for eliminating the context switches. Before: cpu : usr=15.10%, sys=59.39%, ctx=7905181, majf=0, minf=475 After: cpu : usr=20.60%, sys=79.35%, ctx=2008, majf=0, minf=472 With these changes multithreaded async read IOPs improved from ~950K to ~1350K for this dm-mq stacked on null_blk test-case. The raw read IOPs of the underlying null_blk device for the same workload is ~1950K. Fixes: 7fb4898e0 ("block: add blk-mq support to blk_insert_cloned_request()") Fixes: bfebd1cdb ("dm: add full blk-mq support to request-based DM") Reported-by: Sagi Grimberg <sagig@dev.mellanox.co.il> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Acked-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:40 -07:00
DingXiang	15c3af026b	dm snapshot: disallow the COW and origin devices from being identical commit 4df2bf466a9c9c92f40d27c4aa9120f4e8227bfc upstream. Otherwise loading a "snapshot" table using the same device for the origin and COW devices, e.g.: echo "0 20971520 snapshot 253:3 253:3 P 8" \| dmsetup create snap will trigger: BUG: unable to handle kernel NULL pointer dereference at 0000000000000098 [ 1958.979934] IP: [<ffffffffa040efba>] dm_exception_store_set_chunk_size+0x7a/0x110 [dm_snapshot] [ 1958.989655] PGD 0 [ 1958.991903] Oops: 0000 [#1] SMP ... [ 1959.059647] CPU: 9 PID: 3556 Comm: dmsetup Tainted: G IO 4.5.0-rc5.snitm+ #150 ... [ 1959.083517] task: ffff8800b9660c80 ti: ffff88032a954000 task.ti: ffff88032a954000 [ 1959.091865] RIP: 0010:[<ffffffffa040efba>] [<ffffffffa040efba>] dm_exception_store_set_chunk_size+0x7a/0x110 [dm_snapshot] [ 1959.104295] RSP: 0018:ffff88032a957b30 EFLAGS: 00010246 [ 1959.110219] RAX: 0000000000000000 RBX: 0000000000000008 RCX: 0000000000000001 [ 1959.118180] RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff880329334a00 [ 1959.126141] RBP: ffff88032a957b50 R08: 0000000000000000 R09: 0000000000000001 [ 1959.134102] R10: 000000000000000a R11: f000000000000000 R12: ffff880330884d80 [ 1959.142061] R13: 0000000000000008 R14: ffffc90001c13088 R15: ffff880330884d80 [ 1959.150021] FS: 00007f8926ba3840(0000) GS:ffff880333440000(0000) knlGS:0000000000000000 [ 1959.159047] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1959.165456] CR2: 0000000000000098 CR3: 000000032f48b000 CR4: 00000000000006e0 [ 1959.173415] Stack: [ 1959.175656] ffffc90001c13040 ffff880329334a00 ffff880330884ed0 ffff88032a957bdc [ 1959.183946] ffff88032a957bb8 ffffffffa040f225 ffff880329334a30 ffff880300000000 [ 1959.192233] ffffffffa04133e0 ffff880329334b30 0000000830884d58 00000000569c58cf [ 1959.200521] Call Trace: [ 1959.203248] [<ffffffffa040f225>] dm_exception_store_create+0x1d5/0x240 [dm_snapshot] [ 1959.211986] [<ffffffffa040d310>] snapshot_ctr+0x140/0x630 [dm_snapshot] [ 1959.219469] [<ffffffffa0005c44>] ? dm_split_args+0x64/0x150 [dm_mod] [ 1959.226656] [<ffffffffa0005ea7>] dm_table_add_target+0x177/0x440 [dm_mod] [ 1959.234328] [<ffffffffa0009203>] table_load+0x143/0x370 [dm_mod] [ 1959.241129] [<ffffffffa00090c0>] ? retrieve_status+0x1b0/0x1b0 [dm_mod] [ 1959.248607] [<ffffffffa0009e35>] ctl_ioctl+0x255/0x4d0 [dm_mod] [ 1959.255307] [<ffffffff813304e2>] ? memzero_explicit+0x12/0x20 [ 1959.261816] [<ffffffffa000a0c3>] dm_ctl_ioctl+0x13/0x20 [dm_mod] [ 1959.268615] [<ffffffff81215eb6>] do_vfs_ioctl+0xa6/0x5c0 [ 1959.274637] [<ffffffff81120d2f>] ? __audit_syscall_entry+0xaf/0x100 [ 1959.281726] [<ffffffff81003176>] ? do_audit_syscall_entry+0x66/0x70 [ 1959.288814] [<ffffffff81216449>] SyS_ioctl+0x79/0x90 [ 1959.294450] [<ffffffff8167e4ae>] entry_SYSCALL_64_fastpath+0x12/0x71 ... [ 1959.323277] RIP [<ffffffffa040efba>] dm_exception_store_set_chunk_size+0x7a/0x110 [dm_snapshot] [ 1959.333090] RSP <ffff88032a957b30> [ 1959.336978] CR2: 0000000000000098 [ 1959.344121] ---[ end trace b049991ccad1169e ]--- Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1195899 Signed-off-by: Ding Xiang <dingxiang@huawei.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:39 -07:00
Jerry Hoemann	83250e67a9	libnvdimm: Fix security issue with DSM IOCTL. commit 07accfa9d1a8bac8262f6d24a94a54d2d1f35149 upstream. Code attempts to prevent certain IOCTL DSM from being called when device is opened read only. This security feature can be trivially overcome by changing the size portion of the ioctl_command which isn't used. Check only the _IOC_NR (i.e. the command). Signed-off-by: Jerry Hoemann <jerry.hoemann@hpe.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:39 -07:00
Alan	b9d26f81ae	aic7xxx: Fix queue depth handling commit 5a51a7abca133860a6f4429655a9eda3c4afde32 upstream. We were setting the queue depth correctly, then setting it back to two. If you hit this as a bisection point then please send me an email as it would imply we've been hiding other bugs with this one. Signed-off-by: Alan Cox <alan@linux.intel.com> Reviewed-by: Hannes Reinicke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:39 -07:00
Maurizio Lombardi	e760edfb7b	be2iscsi: set the boot_kset pointer to NULL in case of failure commit 84bd64993f916bcf86270c67686ecf4cea7b8933 upstream. In beiscsi_setup_boot_info(), the boot_kset pointer should be set to NULL in case of failure otherwise an invalid pointer dereference may occur later. Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Jitendra Bhivare <jitendra.bhivare@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:39 -07:00
Vitaly Kuznetsov	c1f327046b	scsi: storvsc: fix SRB_STATUS_ABORTED handling commit ff06c5ffbcb4ffa542fb80c897be977956fafecc upstream. Commit 3209f9d780d1 ("scsi: storvsc: Fix a bug in the handling of SRB status flags") filtered SRB_STATUS_AUTOSENSE_VALID out effectively making the (SRB_STATUS_ABORTED \| SRB_STATUS_AUTOSENSE_VALID) case a dead code. The logic from this branch (e.g. storvsc_device_scan() call) is still required, fix the check. Fixes: 3209f9d780d1 ("scsi: storvsc: Fix a bug in the handling of SRB status flags") Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Acked-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:39 -07:00
Martin K. Petersen	67aa7e6dd9	sd: Fix discard granularity when LBPRZ=1 commit 6540a65da90c09590897310e31993b1f6e28485a upstream. Commit 397737223c59 ("sd: Make discard granularity match logical block size when LBPRZ=1") accidentally set the granularity to one byte instead of one logical block on devices that provide deterministic zeroes after UNMAP. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Reported-by: Mike Snitzer <snitzer@redhat.com> Reviewed-by: Ewan Milne <emilne@redhat.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Fixes: 397737223c59e89dca7305feb6528caef8fbef84 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:39 -07:00
Raghava Aditya Renukunta	e468298bd4	aacraid: Set correct msix count for EEH recovery commit ecc479e00db8eb110b200afe1effcb3df20ca7ae upstream. During EEH recovery number of online CPU's might change thereby changing the number of MSIx vectors. Since each fib is allocated to a vector, changes in the number of vectors causes fib to be sent thru invalid vectors.In addition the correct number of MSIx vectors is not updated in the INIT struct sent to the controller, when it is reinitialized. Fixed by reassigning vectors to fibs based on the updated number of MSIx vectors and updating the INIT structure before sending to controller. Fixes: MSI-X vector calculation for suspend/resume Signed-off-by: Raghava Aditya Renukunta <raghavaaditya.renukunta@pmcs.com> Reviewed-by: Shane Seymour <shane.seymour@hpe.com> Reviewed-by: Johannes Thumshirn <jthushirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:38 -07:00
Raghava Aditya Renukunta	f3a3019dfc	aacraid: Fix memory leak in aac_fib_map_free commit f88fa79a61726ce9434df9b4aede36961f709f17 upstream. aac_fib_map_free() calls pci_free_consistent() without checking that dev->hw_fib_va is not NULL and dev->max_fib_size is not zero.If they are indeed NULL/0, this will result in a hang as pci_free_consistent() will attempt to invalidate cache for the entire 64-bit address space (which would take a very long time). Fixed by adding a check to make sure that dev->hw_fib_va and dev->max_fib_size are not NULL and 0 respectively. Fixes: 9ad5204d6 - "[SCSI]aacraid: incorrect dma mapping mask during blinked recover or user initiated reset" Signed-off-by: Raghava Aditya Renukunta <raghavaaditya.renukunta@pmcs.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:38 -07:00
Raghava Aditya Renukunta	1624297ccc	aacraid: Fix RRQ overload commit 3f4ce057d51a9c0ed9b01ba693df685d230ffcae upstream. The driver utilizes an array of atomic variables to keep track of IO submissions to each vector. To submit an IO multiple threads iterate through the array to find a vector which has empty slots to send an IO. The reading and updating of the variable is not atomic, causing race conditions when a thread uses a full vector to submit an IO. Fixed by mapping each FIB to a vector, the submission path then uses said vector to submit IO thereby removing the possibly of a race condition.The vector assignment is started from 1 since vector 0 is reserved for the use of AIF management FIBS.If the number of MSIx vectors is 1 (MSI or INTx mode) then all the fibs are allocated to vector 0. Fixes: 495c0217 "aacraid: MSI-x support" Signed-off-by: Raghava Aditya Renukunta <raghavaaditya.renukunta@pmcs.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:38 -07:00
Douglas Gilbert	f5967a77df	sg: fix dxferp in from_to case commit 5ecee0a3ee8d74b6950cb41e8989b0c2174568d4 upstream. One of the strange things that the original sg driver did was let the user provide both a data-out buffer (it followed the sg_header+cdb) _and_ specify a reply length greater than zero. What happened was that the user data-out buffer was copied into some kernel buffers and then the mid level was told a read type operation would take place with the data from the device overwriting the same kernel buffers. The user would then read those kernel buffers back into the user space. From what I can tell, the above action was broken by commit fad7f01e61bf ("sg: set dxferp to NULL for READ with the older SG interface") in 2008 and syzkaller found that out recently. Make sure that a user space pointer is passed through when data follows the sg_header structure and command. Fix the abnormal case when a non-zero reply_len is also given. Fixes: fad7f01e61bf737fe8a3740d803f000db57ecac6 Signed-off-by: Douglas Gilbert <dgilbert@interlog.com> Reviewed-by: Ewan Milne <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:38 -07:00
Nadav Amit	8e8a1a17bc	x86/mm: TLB_REMOTE_SEND_IPI should count pages commit 18c98243ddf05a1827ad2c359c5ac051101e7ff7 upstream. TLB_REMOTE_SEND_IPI was recently introduced, but it counts bytes instead of pages. In addition, it does not report correctly the case in which flush_tlb_page flushes a page. Fix it to be consistent with other TLB counters. Fixes: 5b74283ab251b9d ("x86, mm: trace when an IPI is about to be sent") Signed-off-by: Nadav Amit <namit@vmware.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Rik van Riel <riel@redhat.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:38 -07:00
Andy Lutomirski	f71e846236	x86/iopl: Fix iopl capability check on Xen PV commit c29016cf41fe9fa994a5ecca607cf5f1cd98801e upstream. iopl(3) is supposed to work if iopl is already 3, even if unprivileged. This didn't work right on Xen PV. Fix it. Reviewewd-by: Jan Beulich <JBeulich@suse.com> Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jan Beulich <JBeulich@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/8ce12013e6e4c0a44a97e316be4a6faff31bd5ea.1458162709.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:38 -07:00
Andy Lutomirski	0f63ab5873	x86/iopl/64: Properly context-switch IOPL on Xen PV commit b7a584598aea7ca73140cb87b40319944dd3393f upstream. On Xen PV, regs->flags doesn't reliably reflect IOPL and the exit-to-userspace code doesn't change IOPL. We need to context switch it manually. I'm doing this without going through paravirt because this is specific to Xen PV. After the dust settles, we can merge this with the 32-bit code, tidy up the iopl syscall implementation, and remove the set_iopl pvop entirely. Fixes XSA-171. Reviewewd-by: Jan Beulich <JBeulich@suse.com> Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jan Beulich <JBeulich@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/693c3bd7aeb4d3c27c92c622b7d0f554a458173c.1458162709.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:38 -07:00
Dave Jones	1eeb322585	x86/apic: Fix suspicious RCU usage in smp_trace_call_function_interrupt() commit 7834c10313fb823e538f2772be78edcdeed2e6e3 upstream. Since 4.4, I've been able to trigger this occasionally: =============================== [ INFO: suspicious RCU usage. ] 4.5.0-rc7-think+ #3 Not tainted Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/20160315012054.GA17765@codemonkey.org.uk Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> ------------------------------- ./arch/x86/include/asm/msr-trace.h:47 suspicious rcu_dereference_check() usage! other info that might help us debug this: RCU used illegally from idle CPU! rcu_scheduler_active = 1, debug_locks = 1 RCU used illegally from extended quiescent state! no locks held by swapper/3/0. stack backtrace: CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.5.0-rc7-think+ #3 ffffffff92f821e0 1f3e5c340597d7fc ffff880468e07f10 ffffffff92560c2a ffff880462145280 0000000000000001 ffff880468e07f40 ffffffff921376a6 ffffffff93665ea0 0000cc7c876d28da 0000000000000005 ffffffff9383dd60 Call Trace: <IRQ> [<ffffffff92560c2a>] dump_stack+0x67/0x9d [<ffffffff921376a6>] lockdep_rcu_suspicious+0xe6/0x100 [<ffffffff925ae7a7>] do_trace_write_msr+0x127/0x1a0 [<ffffffff92061c83>] native_apic_msr_eoi_write+0x23/0x30 [<ffffffff92054408>] smp_trace_call_function_interrupt+0x38/0x360 [<ffffffff92d1ca60>] trace_call_function_interrupt+0x90/0xa0 <EOI> [<ffffffff92ac5124>] ? cpuidle_enter_state+0x1b4/0x520 Move the entering_irq() call before ack_APIC_irq(), because entering_irq() tells the RCU susbstems to end the extended quiescent state, so that the following trace call in ack_APIC_irq() works correctly. Suggested-by: Andi Kleen <ak@linux.intel.com> Fixes: 4787c368a9bc "x86/tracing: Add irq_enter/exit() in smp_trace_reschedule_interrupt()" Signed-off-by: Dave Jones <davej@codemonkey.org.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2016-04-12 09:08:38 -07:00
Thomas Gleixner	dc1441612f	x86/irq: Cure live lock in fixup_irqs() commit 551adc60573cb68e3d55cacca9ba1b7437313df7 upstream. Harry reported, that he's able to trigger a system freeze with cpu hot unplug. The freeze turned out to be a live lock caused by recent changes in irq_force_complete_move(). When fixup_irqs() and from there irq_force_complete_move() is called on the dying cpu, then all other cpus are in stop machine an wait for the dying cpu to complete the teardown. If there is a move of an interrupt pending then irq_force_complete_move() sends the cleanup IPI to the cpus in the old_domain mask and waits for them to clear the mask. That's obviously impossible as those cpus are firmly stuck in stop machine with interrupts disabled. I should have known that, but I completely overlooked it being concentrated on the locking issues around the vectors. And the existance of the call to __irq_complete_move() in the code, which actually sends the cleanup IPI made it reasonable to wait for that cleanup to complete. That call was bogus even before the recent changes as it was just a pointless distraction. We have to look at two cases: 1) The move_in_progress flag of the interrupt is set This means the ioapic has been updated with the new vector, but it has not fired yet. In theory there is a race: set_ioapic(new_vector) <-- Interrupt is raised before update is effective, i.e. it's raised on the old vector. So if the target cpu cannot handle that interrupt before the old vector is cleaned up, we get a spurious interrupt and in the worst case the ioapic irq line becomes stale, but my experiments so far have only resulted in spurious interrupts. But in case of cpu hotplug this should be a non issue because if the affinity update happens right before all cpus rendevouz in stop machine, there is no way that the interrupt can be blocked on the target cpu because all cpus loops first with interrupts enabled in stop machine, so the old vector is not yet cleaned up when the interrupt fires. So the only way to run into this issue is if the delivery of the interrupt on the apic/system bus would be delayed beyond the point where the target cpu disables interrupts in stop machine. I doubt that it can happen, but at least there is a theroretical chance. Virtualization might be able to expose this, but AFAICT the IOAPIC emulation is not as stupid as the real hardware. I've spent quite some time over the weekend to enforce that situation, though I was not able to trigger the delayed case. 2) The move_in_progress flag is not set and the old_domain cpu mask is not empty. That means, that an interrupt was delivered after the change and the cleanup IPI has been sent to the cpus in old_domain, but not all CPUs have responded to it yet. In both cases we can assume that the next interrupt will arrive on the new vector, so we can cleanup the old vectors on the cpus in the old_domain cpu mask. Fixes: 98229aa36caa "x86/irq: Plug vector cleanup race" Reported-by: Harry Junior <harryjr@outlook.fr> Tested-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Joe Lawrence <joe.lawrence@stratus.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Ben Hutchings <ben@decadent.org.uk> Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1603140931430.3657@nanos Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-12 09:08:37 -07:00

1 2 3 4 5 ...

563251 Commits