56121 Commits

Author SHA1 Message Date
Zhang Rui
7e73c5ae6e PM: Introduce suspend state PM_SUSPEND_FREEZE
PM_SUSPEND_FREEZE state is a general state that
does not need any platform specific support, it equals
frozen processes + suspended devices + idle processors.

Compared with PM_SUSPEND_MEMORY,
PM_SUSPEND_FREEZE saves less power
because the system is still in a running state.
PM_SUSPEND_FREEZE has less resume latency because it does not
touch BIOS, and the processors are in idle state.

Compared with RTPM/idle,
PM_SUSPEND_FREEZE saves more power as
1. the processor has longer sleep time because processes are frozen.
   The deeper c-state the processor supports, more power saving we can get.
2. PM_SUSPEND_FREEZE uses system suspend code path, thus we can get
   more power saving from the devices that does not have good RTPM support.

This state is useful for
1) platforms that do not have STR, or have a broken STR.
2) platforms that have an extremely low power idle state,
   which can be used to replace STR.

The following describes how PM_SUSPEND_FREEZE state works.
1. echo freeze > /sys/power/state
2. the processes are frozen.
3. all the devices are suspended.
4. all the processors are blocked by a wait queue
5. all the processors idles and enters (Deep) c-state.
6. an interrupt fires.
7. a processor is woken up and handles the irq.
8. if it is a general event,
   a) the irq handler runs and quites.
   b) goto step 4.
9. if it is a real wake event, say, power button pressing, keyboard touch, mouse moving,
   a) the irq handler runs and activate the wakeup source
   b) wakeup_source_activate() notifies the wait queue.
   c) system starts resuming from PM_SUSPEND_FREEZE
10. all the devices are resumed.
11. all the processes are unfrozen.
12. system is back to working.

Known Issue:
The wakeup of this new PM_SUSPEND_FREEZE state may behave differently
from the previous suspend state.
Take ACPI platform for example, there are some GPEs that only enabled
when the system is in sleep state, to wake the system backk from S3/S4.
But we are not touching these GPEs during transition to PM_SUSPEND_FREEZE.
This means we may lose some wake event.
But on the other hand, as we do not disable all the Interrupts during
PM_SUSPEND_FREEZE, we may get some extra "wakeup" Interrupts, that are
not available for S3/S4.

The patches has been tested on an old Sony laptop, and here are the results:

Average Power:
1. RPTM/idle for half an hour:
   14.8W, 12.6W, 14.1W, 12.5W, 14.4W, 13.2W, 12.9W
2. Freeze for half an hour:
   11W, 10.4W, 9.4W, 11.3W 10.5W
3. RTPM/idle for three hours:
   11.6W
4. Freeze for three hours:
   10W
5. Suspend to Memory:
   0.5~0.9W

Average Resume Latency:
1. RTPM/idle with a black screen: (From pressing keyboard to screen back)
   Less than 0.2s
2. Freeze: (From pressing power button to screen back)
   2.50s
3. Suspend to Memory: (From pressing power button to screen back)
   4.33s

>From the results, we can see that all the platforms should benefit from
this patch, even if it does not have Low Power S0.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-02-09 22:30:44 +01:00
Grant Likely
0d73299ddf Merge branch spi-next from git://git.kernel.org/pub/scm/linux/kernel/git/broonie/misc.git
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2013-02-09 16:02:44 +00:00
Prarit Bhargava
84e345e4e2 time, Fix setting of hardware clock in NTP code
At init time, if the system time is "warped" forward in warp_clock()
it will differ from the hardware clock by sys_tz.tz_minuteswest.  This time
difference is not taken into account when ntp updates the hardware clock,
and this causes the system time to jump forward by this offset every reboot.

The kernel must take this offset into account when writing the system time
to the hardware clock in the ntp code.  This patch adds
persistent_clock_is_local which indicates that an offset has been applied
in warp_clock() and accounts for the "warp" before writing the hardware
clock.

x86 does not have this problem as rtc writes are software limited to a
+/-15 minute window relative to the current rtc time.  Other arches, such
as powerpc, however do a full synchronization of the system time to the
rtc and will see this problem.

[v2]: generated against tip/timers/core

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2013-02-08 15:07:05 -08:00
Thomas Gleixner
d6d3c4e656 OF: convert devtree lock from rw_lock to raw spinlock
With the locking cleanup in place (from "OF: Fixup resursive
locking code paths"), we can now do the conversion from the
rw_lock to a raw spinlock as required for preempt-rt.

The previous cleanup and this conversion were originally
separate since they predated when mainline got raw spinlock (in
commit c2f21ce2e31286a "locking: Implement new raw_spinlock").

So, at that point in time, the cleanup was considered plausible
for mainline, but not this conversion.  In any case, we've kept
them separate as it makes for easier review and better bisection.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[PG: taken from preempt-rt, update subject & add a commit log]
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
2013-02-08 17:02:40 -06:00
David S. Miller
fd5023111c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Synchronize with 'net' in order to sort out some l2tp, wireless, and
ipv6 GRE fixes that will be built on top of in 'net-next'.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-08 18:02:14 -05:00
Alexander Duyck
e5e6730588 skbuff: Move definition of NETDEV_FRAG_PAGE_MAX_SIZE
In order to address the fact that some devices cannot support the full 32K
frag size we need to have the value accessible somewhere so that we can use it
to do comparisons against what the device can support.  As such I am moving
the values out of skbuff.c and into skbuff.h.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-08 17:33:35 -05:00
Linus Torvalds
e06b84052a Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Revert iwlwifi reclaimed packet tracking, it causes problems for a
    bunch of folks.  From Emmanuel Grumbach.

 2) Work limiting code in brcmsmac wifi driver can clear tx status
    without processing the event.  From Arend van Spriel.

 3) rtlwifi USB driver processes wrong SKB, fix from Larry Finger.

 4) l2tp tunnel delete can race with close, fix from Tom Parkin.

 5) pktgen_add_device() failures are not checked at all, fix from Cong
    Wang.

 6) Fix unintentional removal of carrier off from tun_detach(),
    otherwise we confuse userspace, from Michael S.  Tsirkin.

 7) Don't leak socket reference counts and ubufs in vhost-net driver,
    from Jason Wang.

 8) vmxnet3 driver gets it's initial carrier state wrong, fix from Neil
    Horman.

 9) Protect against USB networking devices which spam the host with 0
    length frames, from Bjørn Mork.

10) Prevent neighbour overflows in ipv6 for locally destined routes,
    from Marcelo Ricardo.  This is the best short-term fix for this, a
    longer term fix has been implemented in net-next.

11) L2TP uses ipv4 datagram routines in it's ipv6 code, whoops.  This
    mistake is largely because the ipv6 functions don't even have some
    kind of prefix in their names to suggest they are ipv6 specific.
    From Tom Parkin.

12) Check SYN packet drops properly in tcp_rcv_fastopen_synack(), from
    Yuchung Cheng.

13) Fix races and TX skb freeing bugs in via-rhine's NAPI support, from
    Francois Romieu and your's truly.

14) Fix infinite loops and divides by zero in TCP congestion window
    handling, from Eric Dumazet, Neal Cardwell, and Ilpo Järvinen.

15) AF_PACKET tx ring handling can leak kernel memory to userspace, fix
    from Phil Sutter.

16) Fix error handling in ipv6 GRE tunnel transmit, from Tommi Rantala.

17) Protect XEN netback driver against hostile frontend putting garbage
    into the rings, don't leak pages in TX GOP checking, and add proper
    resource releasing in error path of xen_netbk_get_requests().  From
    Ian Campbell.

18) SCTP authentication keys should be cleared out and released with
    kzfree(), from Daniel Borkmann.

19) L2TP is a bit too clever trying to maintain skb->truesize, and ends
    up corrupting socket memory accounting to the point where packet
    sending is halted indefinitely.  Just remove the adjustments
    entirely, they aren't really needed.  From Eric Dumazet.

20) ATM Iphase driver uses a data type with the same name as the S390
    headers, rename to fix the build.  From Heiko Carstens.

21) Fix a typo in copying the inner network header offset from one SKB
    to another, from Pravin B Shelar.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (56 commits)
  net: sctp: sctp_endpoint_free: zero out secret key data
  net: sctp: sctp_setsockopt_auth_key: use kzfree instead of kfree
  atm/iphase: rename fregt_t -> ffreg_t
  net: usb: fix regression from FLAG_NOARP code
  l2tp: dont play with skb->truesize
  net: sctp: sctp_auth_key_put: use kzfree instead of kfree
  netback: correct netbk_tx_err to handle wrap around.
  xen/netback: free already allocated memory on failure in xen_netbk_get_requests
  xen/netback: don't leak pages on failure in xen_netbk_tx_check_gop.
  xen/netback: shutdown the ring if it contains garbage.
  net: qmi_wwan: add more Huawei devices, including E320
  net: cdc_ncm: add another Huawei vendor specific device
  ipv6/ip6_gre: fix error case handling in ip6gre_tunnel_xmit()
  tcp: fix for zero packets_in_flight was too broad
  brcmsmac: rework of mac80211 .flush() callback operation
  ssb: unregister gpios before unloading ssb
  bcma: unregister gpios before unloading bcma
  rtlwifi: Fix scheduling while atomic bug
  net: usbnet: fix tx_dropped statistics
  tcp: ipv6: Update MIB counters for drops
  ...
2013-02-09 07:55:24 +11:00
John W. Linville
c6d3b2046e This is the 2nd NFC pull request.
With this one we have a new NFC driver for Inside Secure microread and a few
 pn533 fixes.
 Microread is an HCI based NFC IP and the driver we're pushing supports tags
 R/W, and NFC p2p. It's supported over the i2c and MEI busses.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJRFN8vAAoJEIqAPN1PVmxKdMUP/jaXKl80SOEkno5Yjfp8oRln
 smr0LPo/m+jffoQw2Fp8icym2w8YSzrnPDpBOiPTtUI/7pAHT4rI86t3ZkeVU3jE
 U0RHNvGatehDCba87qD2LaN5T5YYLIHb824a5qU37wFQDphPowfhAoI0kfgaO+OY
 8VWeK7ANiEDfrRPNidhRgVflW+GPuh2SsW/3i9txWYRuovpcDTh/AzQjE5/R0vaH
 +IGcIc8ZqJI2kE7375GRhjMgdFXuQVzd/z7B8lMUKevjZAvXWBac+iOTdbLLAHtz
 0kNUiMkHPOoIuGKiJ6P+bBMJR3fckycGSDcpsNxP+ANQ30M8ZZBu+PxMV6tFdnhU
 CDPPiw93Kz93fwX2wzlbOUfRy/UVpMo1uZogWpZLvfljmD1kdzM+kIwMHeUzfOks
 lS9ZqWucZAyXMWCBd6qD617g7wsBGSB+xZi5/l25g0fRlsL4wirooDa/llIgdxE2
 SzkAp9ffthTNfFBh+fnMf17Ycx/CCiSBaPrvzgLR/UP/P1HEz4v/NQ7d/nPp4n4+
 hjU6vuviOU6e1leZppFcO/SgTptsTv4Mgbuz8NkeAUQvZIcTzB81tMhaOqsxwTQh
 R9oFNaPuldxpPyz5xQgtolp4NiaFADdSp+oFb5gk8TyUjqPaPNYZzA86l4yUldIz
 0YjGjDrAfLimDJlpYWkQ
 =YDW6
 -----END PGP SIGNATURE-----

Merge tag 'nfc-next-3.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-next

Samuel says:

"This is the 2nd NFC pull request.

With this one we have a new NFC driver for Inside Secure microread and a few
pn533 fixes.
Microread is an HCI based NFC IP and the driver we're pushing supports tags
R/W, and NFC p2p. It's supported over the i2c and MEI busses."

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2013-02-08 14:42:37 -05:00
Helge Deller
4f4ffc3a53 unbreak automounter support on 64-bit kernel with 32-bit userspace (v2)
automount-support is broken on the parisc architecture, because the existing
#if list does not include a check for defined(__hppa__). The HPPA (parisc)
architecture is similiar to other 64bit Linux targets where we have to define
autofs_wqt_t (which is passed back and forth to user space) as int type which
has a size of 32bit across 32 and 64bit kernels.

During the discussion on the mailing list, H. Peter Anvin suggested to invert
the #if list since only specific platforms (specifically those who do not have
a 32bit userspace, like IA64 and Alpha) should have autofs_wqt_t as unsigned
long type.

This suggestion is probably the best way to go, since Arm64 (and maybe others?)
seems to have a non-working automounter. So in the long run even for other new
upcoming architectures this inverted check seem to be the best solution, since
it will not require them to change this #if again (unless they are 64bit only).

Signed-off-by: Helge Deller <deller@gmx.de>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Acked-by: Ian Kent <raven@themaw.net>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
CC: James Bottomley <James.Bottomley@HansenPartnership.com>
CC: Rolf Eike Beer <eike-kernel@sf-tec.de>
2013-02-08 20:42:18 +01:00
John W. Linville
4d25a75bc6 Merge branch 'for-linville' of git://git.kernel.org/pub/scm/linux/kernel/git/luca/wl12xx 2013-02-08 14:41:45 -05:00
John W. Linville
3549c6b195 Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
Fixed-up drivers/net/wireless/iwlwifi/mvm/mac80211.c to change change
IEEE80211_HW_NEED_DTIM_PERIOD to IEEE80211_HW_NEED_DTIM_BEFORE_ASSOC
as requested by Johannes Berg. -- JWL

Signed-off-by: John W. Linville <linville@tuxdriver.com>
2013-02-08 14:39:54 -05:00
John W. Linville
f5237f278f Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem 2013-02-08 13:16:17 -05:00
Oleg Nesterov
bdf8647c44 uprobes: Introduce uprobe_apply()
Currently it is not possible to change the filtering constraints after
uprobe_register(), so a consumer can not, say, start to trace a task/mm
which was previously filtered out, or remove the no longer needed bp's.

Introduce uprobe_apply() which simply does register_for_each_vma() again
to consult uprobe_consumer->filter() and install/remove the breakpoints.
The only complication is that register_for_each_vma() can no longer
assume that uprobe->consumers should be consulter if is_register == T,
so we change it to accept "struct uprobe_consumer *new" instead.

Unlike uprobe_register(), uprobe_apply(true) doesn't do "unregister" if
register_for_each_vma() fails, it is up to caller to handle the error.

Note: we probably need to cleanup the current interface, it is strange
that uprobe_apply/unregister need inode/offset. We should either change
uprobe_register() to return "struct uprobe *", or add a private ->uprobe
member in uprobe_consumer. And in the long term uprobe_apply() should
take a single argument, uprobe or consumer, even "bool add" should go
away.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2013-02-08 18:28:04 +01:00
Oleg Nesterov
f22c1bb6b4 perf: Introduce hw_perf_event->tp_target and ->tp_list
sys_perf_event_open()->perf_init_event(event) is called before
find_get_context(event), this means that event->ctx == NULL when
class->reg(TRACE_REG_PERF_REGISTER/OPEN) is called and thus it
can't know if this event is per-task or system-wide.

This patch adds hw_perf_event->tp_target for PERF_TYPE_TRACEPOINT,
this is analogous to PERF_TYPE_BREAKPOINT/bp_target we already have.
The patch also moves ->bp_target up so that it can overlap with the
new member, this can help the compiler to generate the better code.

trace_uprobe_register() will use it for prefiltering to avoid the
unnecessary breakpoints in mm's we do not want to trace.

->tp_target doesn't have its own reference, but we can rely on the
fact that either sys_perf_event_open() holds a reference, or it is
equal to event->ctx->task. So this pointer is always valid until
free_event().

Also add the "struct list_head tp_list" into this union. It is not
strictly necessary, but it can simplify the next changes and we can
add it for free.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2013-02-08 18:28:02 +01:00
Oleg Nesterov
da1816b1ca uprobes: Teach handler_chain() to filter out the probed task
Currrently the are 2 problems with pre-filtering:

1. It is not possible to add/remove a task (mm) after uprobe_register()

2. A forked child inherits all breakpoints and uprobe_consumer can not
   control this.

This patch does the first step to improve the filtering. handler_chain()
removes the breakpoints installed by this uprobe from current->mm if all
handlers return UPROBE_HANDLER_REMOVE.

Note that handler_chain() relies on ->register_rwsem to avoid the race
with uprobe_register/unregister which can add/del a consumer, or even
remove and then insert the new uprobe at the same address.

Perhaps we will add uprobe_apply_mm(uprobe, mm, is_register) and teach
copy_mm() to do filter(UPROBE_FILTER_FORK), but I think this change makes
sense anyway.

Note: instead of checking the retcode from uc->handler, we could add
uc->filter(UPROBE_FILTER_BPHIT). But I think this is not optimal to
call 2 hooks in a row. This buys nothing, and if handler/filter do
something nontrivial they will probably do the same work twice.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
2013-02-08 17:47:11 +01:00
Oleg Nesterov
8a7f2fa0de uprobes: Reintroduce uprobe_consumer->filter()
Finally add uprobe_consumer->filter() and change consumer_filter()
to actually call this method.

Note that ->filter() accepts mm_struct, not task_struct. Because:

	1. We do not have for_each_mm_user(mm, task).

	2. Even if we implement for_each_mm_user(), ->filter() can
	   use it itself.

	3. It is not clear who will actually need this interface to
	   do the "nontrivial" filtering.

Another argument is "enum uprobe_filter_ctx", consumer->filter() can
use it to figure out why/where it was called. For example, perhaps
we can add UPROBE_FILTER_PRE_REGISTER used by build_map_info() to
quickly "nack" the unwanted mm's. In this case consumer should know
that it is called under ->i_mmap_mutex.

See the previous discussion at http://marc.info/?t=135214229700002
Perhaps we should pass more arguments, vma/vaddr?

Note: this patch obviously can't help to filter out the child created
by fork(), this will be addressed later.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
2013-02-08 17:47:10 +01:00
Oleg Nesterov
fe20d71f25 uprobes: Kill uprobe_consumer->filter()
uprobe_consumer->filter() is pointless in its current form, kill it.

We will add it back, but with the different signature/semantics. Perhaps
we will even re-introduce the callsite in handler_chain(), but not to
just skip uc->handler().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
2013-02-08 17:47:02 +01:00
Mika Westerberg
a0d2642e92 spi/pxa2xx: add support for Intel Low Power Subsystem SPI
Intel LPSS SPI is pretty much the same as the PXA27xx SPI except that it
has few additional features over the original:

	o FIFO depth is 256 entries
	o RX FIFO has one watermark
	o TX FIFO has two watermarks, low and high
	o chip select can be controlled by writing to a register

The new FIFO registers follow immediately the PXA27xx registers but then there
are some additional LPSS private registers at offset 1k or 2k from the base
address. For these private registers we add new accessors that take advantage
of drv_data->lpss_base once it is resolved.

We add a new type LPSS_SSP that can be used to distinguish the LPSS devices
from others.

Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Tested-by: Lu Cao <lucao@marvell.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2013-02-08 13:14:40 +00:00
Mika Westerberg
5928808ef6 spi/pxa2xx: add support for DMA engine
To be able to use DMA with this driver on non-PXA platforms we implement
support for the generic DMA engine API. This lets user to use different DMA
engines with little or no modification to the driver.

Request lines and channel numbers can be passed to the driver from the
platform specific data.

The DMA engine implementation will be selected by default even on PXA
platform. User can select the legacy DMA API by enabling Kconfig option
CONFIG_SPI_PXA2XX_PXADMA.

Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Tested-by: Lu Cao <lucao@marvell.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2013-02-08 12:15:28 +00:00
Mika Westerberg
cd7bed0034 spi/pxa2xx: break out the private DMA API usage into a separate file
The PXA SPI driver uses PXA platform specific private DMA implementation
which does not work on non-PXA platforms. In order to use this driver on
other platforms we break out the private DMA implementation into a separate
file that gets compiled only when CONFIG_SPI_PXA2XX_PXADMA is set. The DMA
functions are stubbed out if there is no DMA implementation selected (i.e
we are building on non-PXA platform).

While we are there we can kill the dummy DMA bits in pxa2xx_spi.h as they
are not needed anymore for CE4100.

Once this is done we can add the generic DMA engine support to the driver
that allows usage of any DMA controller that implements DMA engine API.

Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Tested-by: Lu Cao <lucao@marvell.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2013-02-08 12:15:21 +00:00
Luciano Coelho
6cc9efed70 wlcore: move wl12xx_platform_data up and make it truly optional
The platform data is used not only by wlcore-based drivers, but also
by wl1251.  Move it up in the directory hierarchy to reflect this.

Additionally, make it truly optional.  At the moment, disabling
platform data while wl1251_sdio or wlcore_sdio are enabled doesn't
work, but it will be necessary when device tree support is
implemented.

Signed-off-by: Luciano Coelho <coelho@ti.com>
Reviewed-by: Felipe Balbi <balbi@ti.com>
2013-02-08 10:05:02 +02:00
Luciano Coelho
afb43e6d88 wlcore: remove if_ops from platform_data
We can't pass pointers from the platform data to the modules, because
with DT it cannot be done.  Those pointers are not set by the board
files anyway.  It's the bus modules that set them, so they can be
safely removed from the platform data without changing any board
files.

Create a new structure that the bus modules pass to wlcore.  This
structure contains the if_ops pointers and a pointer to the actual
platform data.

Signed-off-by: Luciano Coelho <coelho@ti.com>
Reviewed-by: Felipe Balbi <balbi@ti.com>
2013-02-08 10:05:02 +02:00
Lucas Stach
9c79330d93 net: usb: fix regression from FLAG_NOARP code
In commit 6509141f9c2ba74df6cc72ec35cd1865276ae3a4 ("usbnet: add new
flag FLAG_NOARP for usb net devices"), the newly added flag NOARP was
using an already defined value, which broke drivers using flag
MULTI_PACKET.

Signed-off-by: Lucas Stach <dev@lynxeye.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-08 01:49:49 -05:00
Hauke Mehrtens
7e6c63f03d tg3: add support for Ethernet core in bcm4785
The BCM4785 or sometimes named BMC4705 is a Broadcom SoC which a
Gigabit 5750 Ethernet core. The core is connected via PCI with the rest
of the SoC, but it uses some extension.

This core does not use a firmware or an eeprom.

Some devices only have a switch which supports 100MBit/s, this
currently does not work with this driver.

This patch was original written by Michael Buesch <m@bues.ch> and is in
OpenWrt for some years now.

This was tested on a Linksys WRT610N V1 and older versions of this patch
were tested by other people on different devices.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-07 23:47:01 -05:00
Hauke Mehrtens
180996c305 ssb: get mac address from sprom struct for gige driver
The mac address is already stored in the sprom structure by the
platform code of the SoC this Ethernet core is found on, it just has to
be fetched from this structure instead of accessing the nvram here.
This patch also adds a return value to indicate if a mac address could
be fetched from the sprom structure.
When CONFIG_SSB_DRIVER_GIGE is not set the header file now also declares
ssb_gige_get_macaddr().

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Acked-by: Michael Buesch <m@bues.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-07 23:47:01 -05:00
David S. Miller
2de27f307f Merge branch 'mlx4'
Amir Vadai says:

====================
This series from Yan Burman adds support for unicast MAC address filtering and
ndo FDB operations.  It also includes some optimizations to loopback related
decisions and checks in the TX/RX fast path and one cleanup, all in separate
patches.

Today, when adding macvlan devices, the NIC goes into promiscuous mode, since
unicast MAC filtering is not supported. With these changes, macvlan devices can
be added without the penalty of promiscuous mode.

If for some reason adding a unicast address filter fails e.g as of missing space in
the HW mac table, the device forces itself into promiscuous mode (and out of this
forced state when enough space is available).

Also, now it is possible to have bridge under multi-function configuration that include
PF and VFs.  In order to use bridge over PF/VFs, VM MAC fdb entries must be added e.g.
using 'bridge fdb add' command.

Changes from v1 - based on more comments from Eric Dumazet:
* added failure handling when adding unicast address filter

Changes from v0 - based on comments from Eric Dumazet:
* Removed unneeded synchronize_rcu()
* Use kfree_rcu() instead of synchronize_rcu() + kfree()
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-07 23:28:32 -05:00
Yan Burman
16a10ffd20 net/mlx4: Move Ethernet related functionality from mlx4_core to mlx4_en
Move low level code that deals with management of Ethernet MACs and QPs from mlx4_core to mlx4_en.
Also convert the new functions to deal with MACs in form of char array instead of u64.

Actual functions moved:
mlx4_replace_mac
mlx4_get_eth_qp
mlx4_put_eth_qp

To conduct this change, some functionality had to be exported from the core,
the following functions were added:
mlx4_get_base_qp
__mlx4_replace_mac (low level function for CX1/A0 compatibility)

Signed-off-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-07 23:26:12 -05:00
Lai Jiangshan
511a0868be srcu: Remove checks preventing idle CPUs from calling srcu_read_lock()
SRCU has its own statemachine and no longer relies on normal RCU.
Its read-side critical section can now be used by an offline CPU, so this
commit removes the check and the comments, reverting the SRCU portion
of ff195cb6 (rcu: Warn when srcu_read_lock() is used in an extended
quiescent state).

It also makes the codes match the comments in whatisRCU.txt:

g.	Do you need read-side critical sections that are respected
	even though they are in the middle of the idle loop, during
	user-mode execution, or on an offlined CPU?  If so, SRCU is the
	only choice that will work for you.

[ paulmck: There is at least one remaining issue, namely use of lockdep
	   with tracing enabled. ]

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2013-02-07 15:15:00 -08:00
Lai Jiangshan
3bc97a782c srcu: Remove checks preventing offline CPUs from calling srcu_read_lock()
SRCU has its own statemachine and no longer relies on normal RCU.
Its read-side critical section can now be used by an offline CPU, so this
commit removes the check and the comments, reverting the SRCU portion
of c0d6d01b (rcu: Check for illegal use of RCU from offlined CPUs).

It also makes the code match the comments in whatisRCU.txt:

g.	Do you need read-side critical sections that are respected
	even though they are in the middle of the idle loop, during
	user-mode execution, or on an offlined CPU?  If so, SRCU is the
	only choice that will work for you.

[ paulmck: There is at least one remaining issue, namely use of lockdep
	   with tracing enabled. ]

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2013-02-07 15:10:39 -08:00
Clark Williams
8bd75c77b7 sched/rt: Move rt specific bits into new header file
Move rt scheduler definitions out of include/linux/sched.h into
new file include/linux/sched/rt.h

Signed-off-by: Clark Williams <williams@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/20130207094707.7b9f825f@riff.lan
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-02-07 20:51:08 +01:00
Clark Williams
ce0dbbbb30 sched/rt: Add a tuning knob to allow changing SCHED_RR timeslice
Add a /proc/sys/kernel scheduler knob named
sched_rr_timeslice_ms that allows global changing of the
SCHED_RR timeslice value. User visable value is in milliseconds
but is stored as jiffies.  Setting to 0 (zero) resets to the
default (currently 100ms).

Signed-off-by: Clark Williams <williams@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/20130207094704.13751796@riff.lan
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-02-07 20:51:07 +01:00
Clark Williams
cf4aebc292 sched: Move sched.h sysctl bits into separate header
Move the sysctl-related bits from include/linux/sched.h into
a new file: include/linux/sched/sysctl.h. Then update source
files requiring access to those bits by including the new
header file.

Signed-off-by: Clark Williams <williams@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/20130207094659.06dced96@riff.lan
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-02-07 20:50:54 +01:00
Eric Wong
634734b63a fuse: allow control of adaptive readdirplus use
For some filesystems (e.g. GlusterFS), the cost of performing a
normal readdir and readdirplus are identical.  Since adaptively
using readdirplus has no benefit for those systems, give
users/filesystems the option to control adaptive readdirplus use.

v2 of this patch incorporates Miklos's suggestion to simplify the code,
as well as improving consistency of macro names and documentation.

Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2013-02-07 14:25:44 +01:00
Miklos Szeredi
7e98d53086 Synchronize fuse header with one used in library
The library one has provisions for use in *BSD, add them to the kernel one too.
They don't hurt and ease maintenance.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2013-02-07 11:58:12 +01:00
Lai Jiangshan
60c057bca2 workqueue: add delayed_work->wq to simplify reentrancy handling
To avoid executing the same work item from multiple CPUs concurrently,
a work_struct records the last pool it was on in its ->data so that,
on the next queueing, the pool can be queried to determine whether the
work item is still executing or not.

A delayed_work goes through timer before actually being queued on the
target workqueue and the timer needs to know the target workqueue and
CPU.  This is currently achieved by modifying delayed_work->work.data
such that it points to the cwq which points to the target workqueue
and the last CPU the work item was on.  __queue_delayed_work()
extracts the last CPU from delayed_work->work.data and then combines
it with the target workqueue to create new work.data.

The only thing this rather ugly hack achieves is encoding the target
workqueue into delayed_work->work.data without using a separate field,
which could be a trade off one can make; unfortunately, this entangles
work->data management between regular workqueue and delayed_work code
by setting cwq pointer before the work item is actually queued and
becomes a hindrance for further improvements of work->data handling.

This can be easily made sane by adding a target workqueue field to
delayed_work.  While delayed_work is used widely in the kernel and
this does make it a bit larger (<5%), I think this is the right
trade-off especially given the prospect of much saner handling of
work->data which currently involves quite tricky memory barrier
dancing, and don't expect to see any measureable effect.

Add delayed_work->wq and drop the delayed_work->work.data overloading.

tj: Rewrote the description.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2013-02-06 18:04:53 -08:00
Lai Jiangshan
6be195886a workqueue: replace WORK_CPU_NONE/LAST with WORK_CPU_END
Now that workqueue has moved away from gcwqs, workqueue no longer has
the need to have a CPU identifier indicating "no cpu associated" - we
now use WORK_OFFQ_POOL_NONE instead - and most uses of WORK_CPU_NONE
are gone.

The only left usage is as the end marker for for_each_*wq*()
iterators, where the name WORK_CPU_NONE is confusing w/o actual
WORK_CPU_NONE usages.  Similarly, WORK_CPU_LAST which equals
WORK_CPU_NONE no longer makes sense.

Replace both WORK_CPU_NONE and LAST with WORK_CPU_END.  This patch
doesn't introduce any functional difference.

tj: s/WORK_CPU_LAST/WORK_CPU_END/ and rewrote the description.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2013-02-06 18:04:53 -08:00
Linus Torvalds
2110cf029a Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block layer updates from Jens Axboe:
 "I've got a few bits pending for 3.8 final, that I better get sent out.
  It's all been sitting for a while, I consider it safe.

  It contains:

   - Two bug fixes for mtip32xx, fixing a driver hang and a crash.

   - A few-liner protocol error fix for drbd.

   - A few fixes for the xen block front/back driver, fixing a potential
     data corruption issue.

   - A race fix for disk_clear_events(), causing spurious warnings.  Out
     of the Chrome OS base.

   - A deadlock fix for disk_clear_events(), moving it to the a
     unfreezable workqueue.  Also from the Chrome OS base."

* 'for-linus' of git://git.kernel.dk/linux-block:
  drbd: fix potential protocol error and resulting disconnect/reconnect
  mtip32xx: fix for crash when the device surprise removed during rebuild
  mtip32xx: fix for driver hang after a command timeout
  block: prevent race/cleanup
  block: remove deadlock in disk_clear_events
  xen-blkfront: handle bvecs with partial data
  llist/xen-blkfront: implement safe version of llist_for_each_entry
  xen-blkback: implement safe iterator for the list of persistent grants
2013-02-07 08:38:33 +11:00
Eric Dumazet
cd431e7385 macvlan: add multicast filter
Setting up IPv6 addresses on configurations with many macvlans
is not really working, as many multicast messages are dropped.

Add a multicast filter to macvlan to reduce the amount of cloned
skbs and overhead.

Successfully tested with 1024 macvlans on one ethernet device.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ben Greear <greearb@candelatech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-06 15:59:47 -05:00
Cong Wang
12b0004d1d net: adjust skb_gso_segment() for calling in rx path
skb_gso_segment() is almost always called in tx path,
except for openvswitch. It calls this function when
it receives the packet and tries to queue it to user-space.
In this special case, the ->ip_summed check inside
skb_gso_segment() is no longer true, as ->ip_summed value
has different meanings on rx path.

This patch adjusts skb_gso_segment() so that we can at least
avoid such warnings on checksum.

Cc: Jesse Gross <jesse@nicira.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-06 15:58:00 -05:00
Flavio Leitner
e185483e6b team: allow userspace to take control over carrier
Some modes don't require any special carrier handling so
in these cases, the kernel can control the carrier as for
any other interface.  However, some other modes, e.g. lacp,
requires more than just that, so userspace needs to control
the carrier itself.

The daemon today is ready to control it, but the kernel
still can change it based on events.

This fix so that either kernel or userspace is controlling
the carrier.

Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-06 15:48:09 -05:00
Mugunthan V N
3b72c2fe0c drivers: net:ethernet: cpsw: add support for VLAN
adding support for VLAN interface for cpsw.

CPSW VLAN Capability
* Can filter VLAN packets in Hardware

Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-06 15:46:40 -05:00
Neil Horman
ca99ca14c9 netpoll: protect napi_poll and poll_controller during dev_[open|close]
Ivan Vercera was recently backporting commit
9c13cb8bb477a83b9a3c9e5a5478a4e21294a760 to a RHEL kernel, and I noticed that,
while this patch protects the tg3 driver from having its ndo_poll_controller
routine called during device initalization, it does nothing for the driver
during shutdown. I.e. it would be entirely possible to have the
ndo_poll_controller method (or subsequently the ndo_poll) routine called for a
driver in the netpoll path on CPU A while in parallel on CPU B, the ndo_close or
ndo_open routine could be called.  Given that the two latter routines tend to
initizlize and free many data structures that the former two rely on, the result
can easily be data corruption or various other crashes.  Furthermore, it seems
that this is potentially a problem with all net drivers that support netpoll,
and so this should ideally be fixed in a common path.

As Ben H Pointed out to me, we can't preform dev_open/dev_close in atomic
context, so I've come up with this solution.  We can use a mutex to sleep in
open/close paths and just do a mutex_trylock in the napi poll path and abandon
the poll attempt if we're locked, as we'll just retry the poll on the next send
anyway.

I've tested this here by flooding netconsole with messages on a system whos nic
driver I modfied to periodically return NETDEV_TX_BUSY, so that the netpoll tx
workqueue would be forced to send frames and poll the device.  While this was
going on I rapidly ifdown/up'ed the interface and watched for any problems.
I've not found any.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Ivan Vecera <ivecera@redhat.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: Ben Hutchings <bhutchings@solarflare.com>
CC: Francois Romieu <romieu@fr.zoreil.com>
CC: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-06 15:45:03 -05:00
Guenter Roeck
5372d2d71c hwmon: Driver for Maxim MAX6697 and compatibles
Add support for MAX6581, MAX6602, MAX6622, MAX6636, MAX6689, MAX6693,
MAX6694, MAX6697, MAX6698, and MAX6699 temperature sensors

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Jean Delvare <khali@linux-fr.org>
2013-02-06 09:57:56 -08:00
Steffen Trumtrar
c0a05bf018 of: add 'const' to of_node_full_name parameter
As the function just returns the np->full_name or the string "<no-node>", the
passed device_node pointer is not changed in any way.

The passed parameter can therefore be a const pointer.

Also, fix the following error from checkpatch.pl:

ERROR: "foo* bar" should be "foo *bar"
+static inline const char* of_node_full_name(const struct device_node *np)

Signed-off-by: Steffen Trumtrar <s.trumtrar@pengutronix.de>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2013-02-06 11:06:36 +00:00
Michal Kubecek
8d068875ca xfrm: make gc_thresh configurable in all namespaces
The xfrm gc threshold can be configured via xfrm{4,6}_gc_thresh
sysctl but currently only in init_net, other namespaces always
use the default value. This can substantially limit the number
of IPsec tunnels that can be effectively used.

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-02-06 11:36:29 +01:00
Steffen Klassert
a0073fe18e xfrm: Add a state resolution packet queue
As the default, we blackhole packets until the key manager resolves
the states. This patch implements a packet queue where IPsec packets
are queued until the states are resolved. We generate a dummy xfrm
bundle, the output routine of the returned route enqueues the packet
to a per policy queue and arms a timer that checks for state resolution
when dst_output() is called. Once the states are resolved, the packets
are sent out of the queue. If the states are not resolved after some
time, the queue is flushed.

This patch keeps the defaut behaviour to blackhole packets as long
as we have no states. To enable the packet queue the sysctl
xfrm_larval_drop must be switched off.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2013-02-06 08:31:10 +01:00
Linus Torvalds
0f632118a1 USB fixes for 3.8-rc6
Here are a few tiny USB fixes for 3.8-rc6.
 
 Nothing major here, some host controller bug fixes to resolve a number
 of bugs that people have reported, and a bunch of additional device ids
 are added to a number of drivers (which caused code to be deleted from
 the usb-storage driver, always nice.)
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iEYEABECAAYFAlERRT4ACgkQMUfUDdst+ylxAQCghY23x/0vT9oCsrWZzKeOFOMB
 J64AoNjVngpCpIcmh/yk1tgjSRSQu3dn
 =p4nA
 -----END PGP SIGNATURE-----

Merge tag 'usb-3.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB fixes from Greg Kroah-Hartman:
 "Here are a few tiny USB fixes for 3.8-rc6.

  Nothing major here, some host controller bug fixes to resolve a number
  of bugs that people have reported, and a bunch of additional device
  ids are added to a number of drivers (which caused code to be deleted
  from the usb-storage driver, always nice)"

* tag 'usb-3.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (22 commits)
  USB: storage: optimize to match the Huawei USB storage devices and support new switch command
  USB: storage: Define a new macro for USB storage match rules
  USB: ftdi_sio: add Zolix FTDI PID
  USB: option: add Changhong CH690
  USB: ftdi_sio: add PID/VID entries for ELV WS 300 PC II
  USB: add OWL CM-160 support to cp210x driver
  USB: EHCI: fix bug in scheduling periodic split transfers
  USB: EHCI: fix for leaking isochronous data
  USB: option: add support for Telit LE920
  USB: qcserial: add Telit Gobi QDL device
  USB: EHCI: fix timer bug affecting port resume
  USB: UHCI: notify usbcore about port resumes
  USB: EHCI: notify usbcore about port resumes
  USB: add usb_hcd_{start,end}_port_resume
  USB: EHCI: unlink one async QH at a time
  USB: EHCI: remove ASS/PSS polling timeout
  usb: Using correct way to clear usb3.0 device's remote wakeup feature.
  usb: Prevent dead ports when xhci is not enabled
  USB: XHCI: fix memory leak of URB-private data
  drivers: xhci: fix incorrect bit test
  ...
2013-02-06 08:32:32 +11:00
Stephen Hemminger
ca2eb5679f tcp: remove Appropriate Byte Count support
TCP Appropriate Byte Count was added by me, but later disabled.
There is no point in maintaining it since it is a potential source
of bugs and Linux already implements other better window protection
heuristics.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-05 14:51:16 -05:00
David S. Miller
188d1f76d0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/intel/e1000e/ethtool.c
	drivers/net/vmxnet3/vmxnet3_drv.c
	drivers/net/wireless/iwlwifi/dvm/tx.c
	net/ipv6/route.c

The ipv6 route.c conflict is simple, just ignore the 'net' side change
as we fixed the same problem in 'net-next' by eliminating cached
neighbours from ipv6 routes.

The e1000e conflict is an addition of a new statistic in the ethtool
code, trivial.

The vmxnet3 conflict is about one change in 'net' removing a guarding
conditional, whilst in 'net-next' we had a netdev_info() conversion.

The iwlwifi conflict is dealing with a WARN_ON() conversion in
'net-next' vs. a revert happening in 'net'.

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-02-05 14:12:20 -05:00
Grant Likely
f305a0a8d7 Merge branch 'broonie/spi-next' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/misc.git
Minor features and bug fixes for PXA, OMAP and GPIO deivce drivers and a
cosmetic change to the bitbang driver.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2013-02-05 12:30:13 +00:00