linux

iv/linux

Author	SHA1	Message	Date
Alex Elder	4a73ef27ad	libceph: record message data byte length Record the number of bytes of data in a page array rather than the number of pages in the array. It can be assumed that the page array is of sufficient size to hold the number of bytes indicated (and offset by the indicated alignment). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:16:42 -07:00
Alex Elder	27fa83852b	libceph: isolate other message data fields Define ceph_msg_data_set_pagelist(), ceph_msg_data_set_bio(), and ceph_msg_data_set_trail() to clearly abstract the assignment of the remaining data-related fields in a ceph message structure. Use the new functions in the osd client and mds client. This partially resolves: http://tracker.ceph.com/issues/4263 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:16:40 -07:00
Alex Elder	f1baeb2b9f	libceph: set page info with byte length When setting page array information for message data, provide the byte length rather than the page count ceph_msg_data_set_pages(). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:16:39 -07:00
Alex Elder	02afca6ca0	libceph: isolate message page field manipulation Define a function ceph_msg_data_set_pages(), which more clearly abstracts the assignment page-related fields for data in a ceph message structure. Use this new function in the osd client and mds client. Ideally, these fields would never be set more than once (with BUG_ON() calls to guarantee that). At the moment though the osd client sets these every time it receives a message, and in the event of a communication problem this can happen more than once. (This will be resolved shortly, but setting up these helpers first makes it all a bit easier to work with.) Rearrange the field order in a ceph_msg structure to group those that are used to define the possible data payloads. This partially resolves: http://tracker.ceph.com/issues/4263 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:16:38 -07:00
Alex Elder	e0c594878e	libceph: record byte count not page count Record the byte count for an osd request rather than the page count. The number of pages can always be derived from the byte count (and alignment/offset) but the reverse is not true. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:16:36 -07:00
Alex Elder	7b11ba3758	libceph: define CEPH_MSG_MAX_MIDDLE_LEN This is probably unnecessary but the code read as if it were wrong in read_partial_message(). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:16:29 -07:00
Alex Elder	0fff87ec79	libceph: separate read and write data An osd request defines information about where data to be read should be placed as well as where data to write comes from. Currently these are represented by common fields. Keep information about data for writing separate from data to be read by splitting these into data_in and data_out fields. This is the key patch in this whole series, in that it actually identifies which osd requests generate outgoing data and which generate incoming data. It's less obvious (currently) that an osd CALL op generates both outgoing and incoming data; that's the focus of some upcoming work. This resolves: http://tracker.ceph.com/issues/4127 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:16:27 -07:00
Alex Elder	2ac2b7a6d4	libceph: distinguish page and bio requests An osd request uses either pages or a bio list for its data. Use a union to record information about the two, and add a data type tag to select between them. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:16:25 -07:00
Alex Elder	2794a82a11	libceph: separate osd request data info Pull the fields in an osd request structure that define the data for the request out into a separate structure. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:16:24 -07:00
Alex Elder	153e5167e0	libceph: don't assign page info in ceph_osdc_new_request() Currently ceph_osdc_new_request() assigns an osd request's r_num_pages and r_alignment fields. The only thing it does after that is call ceph_osdc_build_request(), and that doesn't need those fields to be assigned. Move the assignment of those fields out of ceph_osdc_new_request() and into its caller. As a result, the page_align parameter is no longer used, so get rid of it. Note that in ceph_sync_write(), the value for req->r_num_pages had already been calculated earlier (as num_pages, and fortunately it was computed the same way). So don't bother recomputing it, but because it's not needed earlier, move that calculation after the call to ceph_osdc_new_request(). Hold off making the assignment to r_alignment, doing it instead r_pages and r_num_pages are getting set. Similarly, in start_read(), nr_pages already holds the number of pages in the array (and is calculated the same way), so there's no need to recompute it. Move the assignment of the page alignment down with the others there as well. This and the next few patches are preparation work for: http://tracker.ceph.com/issues/4127 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:16:23 -07:00
Alex Elder	41766f87f5	libceph: rename ceph_calc_object_layout() The purpose of ceph_calc_object_layout() is to fill in the pool number and seed for a ceph_pg structure provided, based on a given osd map and target object id. Currently that function takes a file layout parameter, but the only thing used out of that is its pool number. Change the function so it takes a pool number rather than the full file layout structure. Only update the ceph_pg if the pool is found in the osd map. Get rid of few useless lines of code from the function while there. Since the function now very clearly just fills in the ceph_pg structure it's provided, rename it ceph_calc_ceph_pg(). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:16:17 -07:00
Alex Elder	ec02a2f2ff	libceph: kill ceph_msg->pagelist_count The pagelist_count field is never actually used, so get rid of it. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:16:16 -07:00
Alex Elder	2a24d1f4bd	libceph: use (void ) for untyped data in osd ops Two of the fields defining osd operations are defined using (char ) while the data they represent are really untyped, not character strings. Change them to have type (void *). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:16:15 -07:00
Alex Elder	0d5af16435	libceph: complete lingering requests only once An osd request marked to linger will be re-submitted in the event a connection to the target osd gets dropped. Currently, if there is a callback function associated with a request it will be called each time a request is submitted--which for lingering requests can be more than once. Change it so a request--including lingering ones--will get completed (from the perspective of the user of the osd client) exactly once. This resolves: http://tracker.ceph.com/issues/3967 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:16:12 -07:00
Alex Elder	d4b515fa10	libceph: distinguish page array and pagelist count Use distinct fields for tracking the number of pages in a message's page array and in a message's page list. Currently only one or the other is used at a time, but that will be changing soon. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:14:28 -07:00
Alex Elder	07c09b7255	libceph: make ceph_msg->bio_seg be unsigned The bio_seg field is used by the ceph messenger in iterating through a bio. It should never have a negative value, so make it an unsigned. (I contemplated making it unsigned short to match the struct bio definition, but it offered no benefit.) Change variables used to hold bio_seg values to all be unsigned as well. Change two variable names in init_bio_iter() to match the convention used everywhere else. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:14:23 -07:00
Linus Torvalds	830ac8524f	Merge branch 'x86-kdump-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull kdump fixes from Peter Anvin: "The kexec/kdump people have found several problems with the support for loading over 4 GiB that was introduced in this merge cycle. This is partly due to a number of design problems inherent in the way the various pieces of kdump fit together (it is pretty horrifically manual in many places.) After a lot of iterations this is the patchset that was agreed upon, but of course it is now very late in the cycle. However, because it changes both the syntax and semantics of the crashkernel option, it would be desirable to avoid a stable release with the broken interfaces." I'm not happy with the timing, since originally the plan was to release the final 3.9 tomorrow. But apparently I'm doing an -rc8 instead... * 'x86-kdump-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: kexec: use Crash kernel for Crash kernel low x86, kdump: Change crashkernel_high/low= to crashkernel=,high/low x86, kdump: Retore crashkernel= to allocate under 896M x86, kdump: Set crashkernel_low automatically	2013-04-20 18:40:36 -07:00
Linus Torvalds	db93f8b420	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Peter Anvin: "Three groups of fixes: 1. Make sure we don't execute the early microcode patching if family < 6, since it would touch MSRs which don't exist on those families, causing crashes. 2. The Xen partial emulation of HyperV can be dealt with more gracefully than just disabling the driver. 3. More EFI variable space magic. In particular, variables hidden from runtime code need to be taken into account too." * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, microcode: Verify the family before dispatching microcode patching x86, hyperv: Handle Xen emulation of Hyper-V more gracefully x86,efi: Implement efi_no_storage_paranoia parameter efi: Export efi_query_variable_store() for efivars.ko x86/Kconfig: Make EFI select UCS2_STRING efi: Distinguish between "remaining space" and actually used space efi: Pass boot services variable info to runtime code Move utf16 functions to kernel core and rename x86,efi: Check max_size only if it is non-zero. x86, efivars: firmware bug workarounds should be in platform code	2013-04-20 18:38:48 -07:00
Linus Torvalds	c437d888c7	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking updates from David Miller: 1) ax88796 does 64-bit divides which causes link errors on ARM, fix from Arnd Bergmann. 2) Once an improper offload setting is detected on an SKB we don't rate limit the log message so we can very easily live lock. From Ben Greear. 3) Openvswitch cannot report vport configuration changes reliably because it didn't preallocate the netlink notification message before changing state. From Jesse Gross. 4) The effective UID/GID SCM credentials fix, from Linus. 5) When a user explicitly asks for wireless authentication, cfg80211 isn't told about the AP detachment leaving inconsistent state. Fix from Johannes Berg. 6) Fix self-MAC checks in batman-adv on multi-mesh nodes, from Antonio Quartulli. 7) Revert build_skb() change sin IGB driver, can result in memory corruption. From Alexander Duyck. 8) Fix setting VLANs on virtual functions in IXGBE, from Greg Rose. 9) Fix TSO races in qlcnic driver, from Sritej Velaga. 10) In bnx2x the kernel driver and UNDI firmware can try to program the chip at the same time, resulting in corruption. Add proper synchronization. From Dmitry Kravkov. 11) Fix corruption of status block in firmware ram in bxn2x, from Ariel Elior. 12) Fix load balancing hash regression of bonding driver in forwarding configurations, from Eric Dumazet. 13) Fix TS ECR regression in TCP by calling tcp_replace_ts_recent() in all the right spots, from Eric Dumazet. 14) Fix several bonding bugs having to do with address manintainence, including not removing address when configuration operations encounter errors, missed locking on the address lists, missing refcounting on VLAN objects, etc. All from Nikolay Aleksandrov. 15) Add workarounds for firmware bugs in LTE qmi_wwan devices, wherein the devices fail to add a proper ethernet header while on LTE networks but otherwise properly do so on 2G and 3G ones. From Bjørn Mork. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (38 commits) net: fix incorrect credentials passing net: rate-limit warn-bad-offload splats. net: ax88796: avoid 64 bit arithmetic qlge: Update version to 1.00.00.32. qlge: Fix ethtool autoneg advertising. qlge: Fix receive path to drop error frames net: qmi_wwan: prevent duplicate mac address on link (firmware bug workaround) net: qmi_wwan: fixup destination address (firmware bug workaround) net: qmi_wwan: fixup missing ethernet header (firmware bug workaround) bonding: in bond_mc_swap() bond's mc addr list is walked without lock bonding: disable netpoll on enslave failure bonding: primary_slave & curr_active_slave are not cleaned on enslave failure bonding: vlans don't get deleted on enslave failure bonding: mc addresses don't get deleted on enslave failure pkt_sched: fix error return code in fw_change_attrs() irda: small read past the end of array in debug code tcp: call tcp_replace_ts_recent() from tcp_ack() netfilter: xt_rpfilter: skip locally generated broadcast/multicast, too netfilter: ipset: bitmap:ip,mac: fix listing with timeout bonding: fix l23 and l34 load balancing in forwarding path ...	2013-04-20 18:21:05 -07:00
Linus Torvalds	83f1b4ba91	net: fix incorrect credentials passing Commit 257b5358b32f ("scm: Capture the full credentials of the scm sender") changed the credentials passing code to pass in the effective uid/gid instead of the real uid/gid. Obviously this doesn't matter most of the time (since normally they are the same), but it results in differences for suid binaries when the wrong uid/gid ends up being used. This just undoes that (presumably unintentional) part of the commit. Reported-by: Andy Lutomirski <luto@amacapital.net> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Serge E. Hallyn <serge@hallyn.com> Cc: David S. Miller <davem@davemloft.net> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-20 16:56:42 -04:00
H. Peter Anvin	c0a9f451e4	Merge remote-tracking branch 'efi/urgent' into x86/urgent Matt Fleming (1): x86, efivars: firmware bug workarounds should be in platform code Matthew Garrett (3): Move utf16 functions to kernel core and rename efi: Pass boot services variable info to runtime code efi: Distinguish between "remaining space" and actually used space Richard Weinberger (2): x86,efi: Check max_size only if it is non-zero. x86,efi: Implement efi_no_storage_paranoia parameter Sergey Vlasov (2): x86/Kconfig: Make EFI select UCS2_STRING efi: Export efi_query_variable_store() for efivars.ko Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2013-04-19 17:09:03 -07:00
Dan Carpenter	e15465e180	irda: small read past the end of array in debug code The "reason" can come from skb->data[] and it hasn't been capped so it can be from 0-255 instead of just 0-6. For example in irlmp_state_dtr() the code does: reason = skb->data[3]; ... irlmp_disconnect_indication(self, reason, skb); Also LMREASON has a couple other values which don't have entries in the irlmp_reasons[] array. And 0xff is a valid reason as well which means "unknown". So far as I can see we don't actually care about "reason" except for in the debug code. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-19 17:32:31 -04:00
Linus Torvalds	53d945e1a2	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse build fix from Miklos Szeredi: "This fixes android builds. The patch appears large, but is just search & replace." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: fix type definitions in uapi header	2013-04-19 09:12:55 -07:00
John W. Linville	5a22483e5a	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem	2013-04-18 15:01:30 -04:00
Linus Torvalds	0a82a8d132	Revert "block: add missing block_bio_complete() tracepoint" This reverts commit 3a366e614d0837d9fc23f78cdb1a1186ebc3387f. Wanlong Gao reports that it causes a kernel panic on his machine several minutes after boot. Reverting it removes the panic. Jens says: "It's not quite clear why that is yet, so I think we should just revert the commit for 3.9 final (which I'm assuming is pretty close). The wifi is crap at the LSF hotel, so sending this email instead of queueing up a revert and pull request." Reported-by: Wanlong Gao <gaowanlong@cn.fujitsu.com> Requested-by: Jens Axboe <axboe@kernel.dk> Cc: Tejun Heo <tj@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-04-18 09:00:26 -07:00
Yinghai Lu	55a20ee780	x86, kdump: Retore crashkernel= to allocate under 896M Vivek found old kexec-tools does not work new kernel anymore. So change back crashkernel= back to old behavoir, and add crashkernel_high= to let user decide if buffer could be above 4G, and also new kexec-tools will be needed. -v2: let crashkernel=X override crashkernel_high= update description about _high will be ignored by crashkernel=X -v3: update description about kernel-parameters.txt according to Vivek. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1366089828-19692-4-git-send-email-yinghai@kernel.org Acked-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2013-04-17 12:35:33 -07:00
Yinghai Lu	c729de8fce	x86, kdump: Set crashkernel_low automatically Chao said that kdump does does work well on his system on 3.8 without extra parameter, even iommu does not work with kdump. And now have to append crashkernel_low=Y in first kernel to make kdump work. We have now modified crashkernel=X to allocate memory beyong 4G (if available) and do not allocate low range for crashkernel if the user does not specify that with crashkernel_low=Y. This causes regression if iommu is not enabled. Without iommu, swiotlb needs to be setup in first 4G and there is no low memory available to second kernel. Set crashkernel_low automatically if the user does not specify that. For system that does support IOMMU with kdump properly, user could specify crashkernel_low=0 to save that 72M low ram. -v3: add swiotlb_size() according to Konrad. -v4: add comments what 8M is for according to hpa. also update more crashkernel_low= in kernel-parameters.txt -v5: update changelog according to Vivek. -v6: Change description about swiotlb referring according to HATAYAMA. Reported-by: WANG Chao <chaowang@redhat.com> Tested-by: WANG Chao <chaowang@redhat.com> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1366089828-19692-2-git-send-email-yinghai@kernel.org Acked-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2013-04-17 12:35:32 -07:00
Linus Torvalds	fca83168aa	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Fix erroneous netfilter drop of SIP packets generated by some Cisco phones, from Patrick McHardy. 2) Fix netfilter IPSET refcounting in list_set_add(), from Jozsef Kadlecsik. 3) Fix TCP syncookies route lookup key, we don't use the same values we would use for the usual SYN receive processing, from Dmitry Popov. 4) Fix NULL deref in bond_slave_netdev_event(), from Nikolay Aleksandrov. 5) When bonding enslave fails, we can forget to clear the IFF_BONDING bit, fix also from Nikolay Aleksandrov. 6) skb->csum_start is 16-bits, which is almost always just fine. But if we reallocate the headroom of an SKB this can push the skb->csum_start value outside of it's valid range. This can easily happen when collapsing multiple SKBs from the retransmit queue together. Fix from Thomas Graf. 7) Fix NULL deref in be2net driver due to missing check of __vlan_put_tag() return value, from Ivan Vecera. 8) tun_set_iff() returns zero instead of error code on failure, fix from Wei Yongjun. 9) Like GARP, 802 MRP needs to hold the app->lock when adding MAD events and queueing PDUs. Fix from David Ward. 10) Build fix, MVMDIO needs PHYLIB, from Thomas Petazzoni.. 11) Fix mac80211 static with ipv6 modular build, from Cong Wang. 12) If userland specifies a path cost explicitly, do not override it when the carrier state changes. From Stephen Hemminger. 13) mvnets calculates the TX queue to use incorrectly resulting in garbage pointer derefs and crashes, fix from Willy Tarreau. 14) cdc_mbim does erroneous sizeof(ETH_HLEN). Fix from Bjorn Mork. 15) IP fragmentation can leak a refcount-less route out from an RCU protected section. This results in crashes and all sorts of hard to diagnose behavior. Fix from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (24 commits) qlcnic: fix beaconing test for 82xx adapter net: drop dst before queueing fragments net: fec: fix regression in link change accounting net: cdc_mbim: remove bogus sizeof() drivers: net: ethernet: cpsw: get slave VLAN id from slave node instead of cpsw node net: mvneta: fix improper tx queue usage in mvneta_tx() esp4: fix error return code in esp_output() bridge: make user modified path cost sticky ipv6: statically link register_inet6addr_notifier() net: mvmdio: add select PHYLIB net/802/mrp: fix possible race condition when calling mrp_pdu_queue() tuntap: fix error return code in tun_set_iff() be2net: take care of __vlan_put_tag return value can: sja1000: fix handling on dt properties on little endian systems can: mcp251x: add missing IRQF_ONESHOT to request_threaded_irq netfilter: nf_nat: fix race when unloading protocol modules tcp: Reallocate headroom if it would overflow csum_start stmmac: prevent interrupt loop with MMC RX IPC Counter bonding: IFF_BONDING is not stripped on enslave failure bonding: fix netdev event NULL pointer dereference ...	2013-04-17 08:50:59 -07:00
Miklos Szeredi	4c82456eeb	fuse: fix type definitions in uapi header Commit 7e98d53086d18c877cb44e9065219335184024de (Synchronize fuse header with one used in library) added #ifdef __linux__ around defines if it is not set. The kernel build is self-contained and can be built on non-Linux toolchains. After the mentioned commit builds on non-Linux toolchains will try to include stdint.h and fail due to -nostdinc, and then fail with a bunch of undefined type errors. Fix by checking for __KERNEL__ instead of __linux__ and using the standard int types instead of the linux specific ones. Reported-by: Arve Hjønnevåg <arve@android.com> Reported-by: Colin Cross <ccross@android.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2013-04-17 12:30:40 +02:00
Linus Torvalds	b4cbb197c7	vm: add vm_iomap_memory() helper function Various drivers end up replicating the code to mmap() their memory buffers into user space, and our core memory remapping function may be very flexible but it is unnecessarily complicated for the common cases to use. Our internal VM uses pfn's ("page frame numbers") which simplifies things for the VM, and allows us to pass physical addresses around in a denser and more efficient format than passing a "phys_addr_t" around, and having to shift it up and down by the page size. But it just means that drivers end up doing that shifting instead at the interface level. It also means that drivers end up mucking around with internal VM things like the vma details (vm_pgoff, vm_start/end) way more than they really need to. So this just exports a function to map a certain physical memory range into user space (using a phys_addr_t based interface that is much more natural for a driver) and hides all the complexity from the driver. Some drivers will still end up tweaking the vm_page_prot details for things like prefetching or cacheability etc, but that's actually relevant to the driver, rather than caring about what the page offset of the mapping is into the particular IO memory region. Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-04-16 16:45:45 -07:00
Matthew Garrett	0635eb8a54	Move utf16 functions to kernel core and rename We want to be able to use the utf16 functions that are currently present in the EFI variables code in platform-specific code as well. Move them to the kernel core, and in the process rename them to accurately describe what they do - they don't handle UTF16, only UCS2. Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com>	2013-04-15 21:23:03 +01:00
Linus Torvalds	bb33db7a07	Merge branches 'timers-urgent-for-linus', 'irq-urgent-for-linus' and 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull {timer,irq,core} fixes from Thomas Gleixner: - timer: bug fix for a cpu hotplug race. - irq: single bugfix for a wrong return value, which prevents the calling function to invoke the software fallback. - core: bugfix which plugs two race confitions which can cause hotplug per cpu threads to end up on the wrong cpu. * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: hrtimer: Don't reinitialize a cpu_base lock on CPU_UP * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip: gic: fix irq_trigger return * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: kthread: Prevent unpark race which puts threads on the wrong cpu	2013-04-15 07:03:01 -07:00
Cong Wang	f88c91ddba	ipv6: statically link register_inet6addr_notifier() Tomas reported the following build error: net/built-in.o: In function `ieee80211_unregister_hw': (.text+0x10f0e1): undefined reference to `unregister_inet6addr_notifier' net/built-in.o: In function `ieee80211_register_hw': (.text+0x10f610): undefined reference to `register_inet6addr_notifier' make: *** [vmlinux] Error 1 when built IPv6 as a module. So we have to statically link these symbols. Reported-by: Tomas Melin <tomas.melin@iki.fi> Cc: Tomas Melin <tomas.melin@iki.fi> Cc: "David S. Miller" <davem@davemloft.net> Cc: YOSHIFUJI Hidaki <yoshfuji@linux-ipv6.org> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-14 15:24:17 -04:00
Linus Torvalds	3c91930f0c	Namhyung Kim found and fixed a bug that can crash the kernel by simply doing: echo 1234 \| tee -a /sys/kernel/debug/tracing/set_ftrace_pid Luckily, this can only be done by root, but still is a nasty bug. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQEcBAABAgAGBQJRaK2+AAoJEOdOSU1xswtMw48IAJPcSNMl1+epx5cPw8pwf+y6 YYvs/Ud3BMPBL+mpNPGNFWY+dWJsAtCtAgkLi0WgdL+b9iPNZrmQqqcP5xWV4uKV vRX2SPCQcyEn5keNnFdN3fN1R0+Gj4V8kLvxPqugzNrO9EHejx+TJFWjrONzkcSy g90lY45jfGWW0OS4GuSwHFhKDgcx8/kgb4Whv+xrKzTuX2QkU1BhG9WPsjiHWiL5 WRYjC4LWafrWaPd4cIkzMqj1eU/hL8BkiLLQHM1Tw8yD7t8OPzgmuJMZEh6Cx1iW /Xrm5QkNEcqQ/vSAC6aWUi22VEgRYDLg8WjngwuMgY1Qa3LE2ex8cUDyk7lJbas= =SFA8 -----END PGP SIGNATURE----- Merge tag 'trace-fixes-v3.9-rc-v3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull ftrace fixes from Steven Rostedt: "Namhyung Kim found and fixed a bug that can crash the kernel by simply doing: echo 1234 \| tee -a /sys/kernel/debug/tracing/set_ftrace_pid Luckily, this can only be done by root, but still is a nasty bug." * tag 'trace-fixes-v3.9-rc-v3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: ftrace: Move ftrace_filter_lseek out of CONFIG_DYNAMIC_FTRACE section tracing: Fix possible NULL pointer dereferences	2013-04-14 10:50:55 -07:00
Linus Torvalds	935d8aabd4	Add file_ns_capable() helper function for open-time capability checking Nothing is using it yet, but this will allow us to delay the open-time checks to use time, without breaking the normal UNIX permission semantics where permissions are determined by the opener (and the file descriptor can then be passed to a different process, or the process can drop capabilities). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-04-14 10:06:31 -07:00
Dave Hansen	1de14c3c5c	x86-32: Fix possible incomplete TLB invalidate with PAE pagetables This patch attempts to fix: https://bugzilla.kernel.org/show_bug.cgi?id=56461 The symptom is a crash and messages like this: chrome: Corrupted page table at address 34a03000 pdpt = 0000000000000000 pde = 0000000000000000 Bad pagetable: 000f [#1] PREEMPT SMP Ingo guesses this got introduced by commit 611ae8e3f520 ("x86/tlb: enable tlb flush range support for x86") since that code started to free unused pagetables. On x86-32 PAE kernels, that new code has the potential to free an entire PMD page and will clear one of the four page-directory-pointer-table (aka pgd_t entries). The hardware aggressively "caches" these top-level entries and invlpg does not actually affect the CPU's copy. If we clear one we HAVE to do a full TLB flush, otherwise we might continue using a freed pmd page. (note, we do this properly on the population side in pud_populate()). This patch tracks whenever we clear one of these entries in the 'struct mmu_gather', and ensures that we follow up with a full tlb flush. BTW, I disassembled and checked that: if (tlb->fullmm == 0) and if (!tlb->fullmm && !tlb->need_flush_all) generate essentially the same code, so there should be zero impact there to the !PAE case. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Artem S Tashkinov <t.artem@mailcity.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-04-12 16:56:47 -07:00
Steven Rostedt (Red Hat)	7f49ef69db	ftrace: Move ftrace_filter_lseek out of CONFIG_DYNAMIC_FTRACE section As ftrace_filter_lseek is now used with ftrace_pid_fops, it needs to be moved out of the #ifdef CONFIG_DYNAMIC_FTRACE section as the ftrace_pid_fops is defined when DYNAMIC_FTRACE is not. Cc: stable@vger.kernel.org Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2013-04-12 17:12:41 -04:00
Namhyung Kim	6a76f8c0ab	tracing: Fix possible NULL pointer dereferences Currently set_ftrace_pid and set_graph_function files use seq_lseek for their fops. However seq_open() is called only for FMODE_READ in the fops->open() so that if an user tries to seek one of those file when she open it for writing, it sees NULL seq_file and then panic. It can be easily reproduced with following command: $ cd /sys/kernel/debug/tracing $ echo 1234 \| sudo tee -a set_ftrace_pid In this example, GNU coreutils' tee opens the file with fopen(, "a") and then the fopen() internally calls lseek(). Link: http://lkml.kernel.org/r/1365663302-2170-1-git-send-email-namhyung@kernel.org Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: stable@vger.kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2013-04-12 14:43:34 -04:00
David S. Miller	8b5b8c2990	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf into netfilter Pablo Neira Ayuso says: ==================== The following patchset contains late netfilter fixes for your net tree, they are: * Don't drop segmented TCP packets in the SIP helper, we've got reports from users that this was breaking communications when the SIP phone messages are larger than the MTU, from Patrick McHardy. * Fix refcount leak in the ipset list set, from Jozsef Kadlecsik. * On hash set resizing, the nomatch flag was lost, thus entirely inverting the logic of the set matching, from Jozsef Kadlecsik. * Fix crash on NAT modules removal. Timer expiration may race with the module cleanup exit path while deleting conntracks, from Florian Westphal. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-12 14:26:39 -04:00
Thomas Gleixner	f2530dc71c	kthread: Prevent unpark race which puts threads on the wrong cpu The smpboot threads rely on the park/unpark mechanism which binds per cpu threads on a particular core. Though the functionality is racy: CPU0 CPU1 CPU2 unpark(T) wake_up_process(T) clear(SHOULD_PARK) T runs leave parkme() due to !SHOULD_PARK bind_to(CPU2) BUG_ON(wrong CPU) We cannot let the tasks move themself to the target CPU as one of those tasks is actually the migration thread itself, which requires that it starts running on the target cpu right away. The solution to this problem is to prevent wakeups in park mode which are not from unpark(). That way we can guarantee that the association of the task to the target cpu is working correctly. Add a new task state (TASK_PARKED) which prevents other wakeups and use this state explicitly for the unpark wakeup. Peter noticed: Also, since the task state is visible to userspace and all the parked tasks are still in the PID space, its a good hint in ps and friends that these tasks aren't really there for the moment. The migration thread has another related issue. CPU0 CPU1 Bring up CPU2 create_thread(T) park(T) wait_for_completion() parkme() complete() sched_set_stop_task() schedule(TASK_PARKED) The sched_set_stop_task() call is issued while the task is on the runqueue of CPU1 and that confuses the hell out of the stop_task class on that cpu. So we need the same synchronizaion before sched_set_stop_task(). Reported-by: Dave Jones <davej@redhat.com> Reported-and-tested-by: Dave Hansen <dave@sr71.net> Reported-and-tested-by: Borislav Petkov <bp@alien8.de> Acked-by: Peter Ziljstra <peterz@infradead.org> Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Cc: dhillf@gmail.com Cc: Ingo Molnar <mingo@kernel.org> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1304091635430.21884@ionos Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2013-04-12 14:18:43 +02:00
Linus Torvalds	fe2971a017	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) cfg80211_conn_scan() must be called with the sched_scan_mutex, fix from Artem Savkov. 2) Fix regression in TCP ICMPv6 processing, we do not want to treat redirects as socket errors, from Christoph Paasch. 3) Fix several recvmsg() msg_name kernel memory leaks into userspace, in ATM, AX25, Bluetooth, CAIF, IRDA, s390 IUCV, L2TP, LLC, Netrom, NFC, Rose, TIPC, and VSOCK. From Mathias Krause and Wei Yongjun. 4) Fix AF_IUCV handling of segmented SKBs in recvmsg(), from Ursula Braun and Eric Dumazet. 5) CAN gw.c code does kfree() on SLAB cache memory, use kmem_cache_free() instead. Fix from Wei Yongjun. 6) Fix LSM regression on TCP SYN/ACKs, some LSMs such as SELINUX want an skb->sk socket context available for these packets, but nothing else requires it. From Eric Dumazet and Paul Moore. 7) Fix ipv4 address lifetime processing so that we don't perform sleepable acts inside of rcu_read_lock() sections, do them in an rtnl_lock() section instead. From Jiri Pirko. 8) mvneta driver accidently sets HW features after device registry, it should do so beforehand. Fix from Willy Tarreau. 9) Fix bonding unload races more correctly, from Nikolay Aleksandrov and Veaceslav Falico. 10) rtnl_dump_ifinfo() and rtnl_calcit() invoke nlmsg_parse() with wrong header size argument. Fix from Michael Riesch. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (44 commits) lsm: add the missing documentation for the security_skb_owned_by() hook bnx2x: Prevent null pointer dereference in AFEX mode e100: Add dma mapping error check selinux: add a skb_owned_by() hook can: gw: use kmem_cache_free() instead of kfree() netrom: fix invalid use of sizeof in nr_recvmsg() qeth: fix qeth_wait_for_threads() deadlock for OSN devices af_iucv: fix recvmsg by replacing skb_pull() function rtnetlink: Call nlmsg_parse() with correct header length bonding: fix bonding_masters race condition in bond unloading Revert "bonding: remove sysfs before removing devices" net: mvneta: enable features before registering the driver hyperv: Fix RNDIS send_completion code path hyperv: Fix a kernel warning from netvsc_linkstatus_callback() net: ipv4: fix schedule while atomic bug in check_lifetime() net: ipv4: reset check_lifetime_work after changing lifetime bnx2x: Fix KR2 rapid link flap sctp: remove 'sridhar' from maintainers list VSOCK: Fix missing msg_namelen update in vsock_stream_recvmsg() VSOCK: vmci - fix possible info leak in vmci_transport_dgram_dequeue() ...	2013-04-10 14:15:27 -07:00
Paul Moore	6b07a24fc3	lsm: add the missing documentation for the security_skb_owned_by() hook Unfortunately we didn't catch the missing comments earlier when the patch was merged. Signed-off-by: Paul Moore <pmoore@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-10 15:40:39 -04:00
Rafał Miłecki	46fc4c9093	ssb: implement spurious tone avoidance And make use of it in b43. This fixes a regression introduced with 49d55cef5b1925a5c1efb6aaddaa40fc7c693335 b43: N-PHY: implement spurious tone avoidance This commit made BCM4322 use only MCS 0 on channel 13, which of course resulted in performance drop (down to 0.7Mb/s). Reported-by: Stefan Brüns <stefan.bruens@rwth-aachen.de> Signed-off-by: Rafał Miłecki <zajec5@gmail.com> Cc: Stable <stable@vger.kernel.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>	2013-04-10 10:31:26 -04:00
Linus Torvalds	e8f2b548de	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs fixes from Al Viro: "A nasty bug in fs/namespace.c caught by Andrey + a couple of less serious unpleasantness - ecryptfs misc device playing hopeless games with try_module_get() and palinfo procfs support being... not quite correctly done, to be polite." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: mnt: release locks on error path in do_loopback palinfo fixes procfs: add proc_remove_subtree() ecryptfs: close rmmod race	2013-04-09 12:22:49 -07:00
Jozsef Kadlecsik	6eb4c7e96e	netfilter: ipset: hash:net: nomatch flag not excluded on set resize If a resize is triggered the nomatch flag is not excluded at hashing, which leads to the element missed at lookup in the resized set. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-04-09 21:04:16 +02:00
Al Viro	8ce584c741	procfs: add proc_remove_subtree() just what it sounds like; do that only to procfs subtrees you've created - doing that to something shared with another driver is not only antisocial, but might cause interesting races with proc_create() and its ilk. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-04-09 14:09:17 -04:00
Linus Torvalds	386afc9114	spinlocks and preemption points need to be at least compiler barriers In UP and non-preempt respectively, the spinlocks and preemption disable/enable points are stubbed out entirely, because there is no regular code that can ever hit the kind of concurrency they are meant to protect against. However, while there is no regular code that can cause scheduling, we _do_ end up having some exceptional (literally!) code that can do so, and that we need to make sure does not ever get moved into the critical region by the compiler. In particular, get_user() and put_user() is generally implemented as inline asm statements (even if the inline asm may then make a call instruction to call out-of-line), and can obviously cause a page fault and IO as a result. If that inline asm has been scheduled into the middle of a preemption-safe (or spinlock-protected) code region, we obviously lose. Now, admittedly this is very unlikely to actually ever happen, and we've not seen examples of actual bugs related to this. But partly exactly because it's so hard to trigger and the resulting bug is so subtle, we should be extra careful to get this right. So make sure that even when preemption is disabled, and we don't have to generate any actual code to explicitly tell the system that we are in a preemption-disabled region, we need to at least tell the compiler not to move things around the critical region. This patch grew out of the same discussion that caused commits 79e5f05edcbf ("ARC: Add implicit compiler barrier to raw_local_irq* functions") and 3e2e0d2c222b ("tile: comment assumption about __insn_mtspr for <asm/irqflags.h>") to come about. Note for stable: use discretion when/if applying this. As mentioned, this bug may never have actually bitten anybody, and gcc may never have done the required code motion for it to possibly ever trigger in practice. Cc: stable@vger.kernel.org Cc: Steven Rostedt <srostedt@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-04-09 10:48:33 -07:00
Eric Dumazet	ca10b9e9a8	selinux: add a skb_owned_by() hook Commit 90ba9b1986b5ac (tcp: tcp_make_synack() can use alloc_skb()) broke certain SELinux/NetLabel configurations by no longer correctly assigning the sock to the outgoing SYNACK packet. Cost of atomic operations on the LISTEN socket is quite big, and we would like it to happen only if really needed. This patch introduces a new security_ops->skb_owned_by() method, that is a void operation unless selinux is active. Reported-by: Miroslav Vadkerti <mvadkert@redhat.com> Diagnosed-by: Paul Moore <pmoore@redhat.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: linux-security-module@vger.kernel.org Acked-by: James Morris <james.l.morris@oracle.com> Tested-by: Paul Moore <pmoore@redhat.com> Acked-by: Paul Moore <pmoore@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-09 13:23:11 -04:00
Matt Fleming	a6e4d5a03e	x86, efivars: firmware bug workarounds should be in platform code Let's not burden ia64 with checks in the common efivars code that we're not writing too much data to the variable store. That kind of thing is an x86 firmware bug, plain and simple. efi_query_variable_store() provides platforms with a wrapper in which they can perform checks and workarounds for EFI variable storage bugs. Cc: H. Peter Anvin <hpa@zytor.com> Cc: Matthew Garrett <mjg59@srcf.ucam.org> Signed-off-by: Matt Fleming <matt.fleming@intel.com>	2013-04-09 11:34:05 +01:00
Linus Torvalds	f465d40d85	1) Fix ATAPI regression, noticed mainly on tape drives, due to a commit which mistakenly changed an 'int' return type to a 'bool'. Broken by 4dce8ba94c7. 2) Add Slimtype DVD A DS8A8SH ATAPI quirk 3) ata_piix: Intel Haswell platform quirk 4) Avoid DMA'ing to stack buffer, when obtaining DEVSLP timings. IMO a mild regression, given that libata previously did not DMA to a stack buffer. Broken by 803739d2. 5) Fix regression impacting SMART and smartd, broken by 84a9a8cd9. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.13 (GNU/Linux) iQIVAwUAUWMoWSWzCDIBeCsvAQIFDhAA1ISwRA0681bZiTfJskaRcHPy7HkFP3Df VRmfsqsoNignQVddAbrGpYYt27NASZsTycQ8TGAUkPVVGFm7w0BOj9abxzPvBCoC pg3rJaBHyfXjjLXxSDjk6HxqjnEsgwiIeXvx5vtUCuinfMTdDkJdIHIuIHhL2SzC x39GfCf3O+mtzHYxqTu4X7QXwirJFMsMG5/ZBa1RCVfFGzqzlVyQXeRlEGLZ6wmI YJHId0LKUUZhYURvmlqTVQm9PEu8WsgS0lNqhHkwrj87Qi3jSwDQW/RnHPwUf5gG VHZHGVVu5FRkrDZsGjYdvQyZd9oywcbCIL7oK2mGe4V8Vnxii1KHlI8x08YSOjYX ASY5koJivzyLrCcwIqIsfHvk1+D0yZUjhfQSAqhi509h8Tc9Ge6V83XmAhRXy3b5 nO/+b63qMYxAiMTJyIw7BRQcYpOiU/Nohi2ieGY4tUPw4aBhU61Hnmkhy/YME5P4 nhKZwfNquXXN71WCV+xnchRIuw08EN9zUaeG5vGGnXYKoiIrlId//eyVgtcerh3e lFiuOQAoAY6PIZsnMQU8Z2toL5yXFkLWxmNDYi7RIIai3yChOtxyMTuXEOSKPpG/ rgEy5p5lchhRXzqpC7KhU53uVA14ePPs6LIUoRNiqtGIf+B1nTjUYmKZPTPEzNA9 ftOvFkMHfmQ= =Jhsf -----END PGP SIGNATURE----- Merge tag 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev Pull libata fixes from Jeff Garzik: "The HDIO_DRIVE_* fix is really the biggie. 1) Fix ATAPI regression, noticed mainly on tape drives, due to a commit which mistakenly changed an 'int' return type to a 'bool'. Broken by commit 4dce8ba94c75 ("libata: Use 'bool' return value for ata_id_XXX") 2) Add Slimtype DVD A DS8A8SH ATAPI quirk 3) ata_piix: Intel Haswell platform quirk 4) Avoid DMA'ing to stack buffer, when obtaining DEVSLP timings. IMO a mild regression, given that libata previously did not DMA to a stack buffer. Broken by commit commit 803739d25c23 ("[libata] replace sata_settings with devslp_timing") 5) Fix regression impacting SMART and smartd, broken by commit 84a9a8cd9d0a ("[libata] Set proper SK when CK_COND is set")" * tag 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: [libata] Fix HDIO_DRIVE_* ioctl() Linux 3.9 regression libata: fix DMA to stack in reading devslp_timing parameters ata_piix: Fix DVD not dectected at some Haswell platforms libata: Set max sector to 65535 for Slimtype DVD A DS8A8SH drive libata: Use integer return value for atapi_command_packet_set	2013-04-08 15:15:22 -07:00

1 2 3 4 5 ...

57117 Commits