IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
I noticed we were sending wrong IPv4 ID in TCP flows when MTU discovery
is disabled.
Note how GSO/TSO packets do not have monotonically incrementing ID.
06:37:41.575531 IP (id 14227, proto: TCP (6), length: 4396)
06:37:41.575534 IP (id 14272, proto: TCP (6), length: 65212)
06:37:41.575544 IP (id 14312, proto: TCP (6), length: 57972)
06:37:41.575678 IP (id 14317, proto: TCP (6), length: 7292)
06:37:41.575683 IP (id 14361, proto: TCP (6), length: 63764)
It appears I introduced this bug in linux-3.1.
inet_getid() must return the old value of peer->ip_id_count,
not the new one.
Lets revert this part, and remove the prevention of
a null identification field in IPv6 Fragment Extension Header,
which is dubious and not even done properly.
Fixes: 87c48fa3b4 ("ipv6: make fragment identifications less predictable")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This interface is unusable, as the cdc-wdm character device doesn't reply to
any QMI command. Also, the out-of-tree Sierra Wireless GobiNet driver fully
skips it.
Signed-off-by: Aleksander Morgado <aleksander@aleksander.es>
Acked-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
A set of new VID/PIDs retrieved from the out-of-tree GobiNet/GobiSerial
Sierra Wireless drivers.
Signed-off-by: Aleksander Morgado <aleksander@aleksander.es>
Acked-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
br_handle_local_finish() is allowing us to insert an FDB entry with
disallowed vlan. For example, when port 1 and 2 are communicating in
vlan 10, and even if vlan 10 is disallowed on port 3, port 3 can
interfere with their communication by spoofed src mac address with
vlan id 10.
Note: Even if it is judged that a frame should not be learned, it should
not be dropped because it is destined for not forwarding layer but higher
layer. See IEEE 802.1Q-2011 8.13.10.
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Acked-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Any process is able to send netlink messages with leftover bytes.
Make the warning rate-limited to prevent too much log spam.
The warning is supposed to help find userspace bugs, so print the
triggering command name to implicate the buggy program.
[v2: Use pr_warn_ratelimited instead of printk_ratelimited.]
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There has been a number incidents recently where customers running KVM have
reported that VM hosts on different Hypervisors are unreachable. Based on
pcap traces we found that the bridge was broadcasting the ARP request out
onto the network. However some NICs have an inbuilt switch which on occasions
were broadcasting the VMs ARP request back through the physical NIC on the
Hypervisor. This resulted in the bridge changing ports and incorrectly learning
that the VMs mac address was external. As a result the ARP reply was directed
back onto the external network and VM never updated it's ARP cache. This patch
will notify the bridge command, after a fdb has been updated to identify such
port toggling.
Signed-off-by: Jon Maxwell <jmaxwell37@gmail.com>
Reviewed-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
After 1e785f48d2 ("net: Start with correct mac_len in
skb_network_protocol") skb->mac_len is used as a start of the
calculation in skb_network_protocol() but that is not always correct. If
skb->protocol == 8021Q/AD, usually the vlan header is already inserted
in the skb (i.e. vlan reorder hdr == 0). Usually when the packet enters
dev_hard_xmit it has mac_len == 0 so we take 2 bytes from the
destination mac address (skb->data + VLAN_HLEN) as a type in
skb_network_protocol() and return vlan_depth == 4. In the case where TSO is
off, then the mac_len is set but it's == 18 (ETH_HLEN + VLAN_HLEN), so
skb_network_protocol() returns a type from inside the packet and
offset == 22. Also make vlan_depth unsigned as suggested before.
As suggested by Eric Dumazet, move the while() loop in the if() so we
can avoid additional testing in fast path.
Here are few netperf tests + debug printk's to illustrate:
cat netperf.tso-on.reorder-on.bugged
- Vlan -> device (reorder on, default, this case is okay)
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
192.168.3.1 () port 0 AF_INET
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec
87380 16384 16384 10.00 7111.54
[ 81.605435] skb->len 65226 skb->gso_size 1448 skb->proto 0x800
skb->mac_len 0 vlan_depth 0 type 0x800
- Vlan -> device (reorder off, bad)
cat netperf.tso-on.reorder-off.bugged
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
192.168.3.1 () port 0 AF_INET
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec
87380 16384 16384 10.00 241.35
[ 204.578332] skb->len 1518 skb->gso_size 0 skb->proto 0x8100
skb->mac_len 0 vlan_depth 4 type 0x5301
0x5301 are the last two bytes of the destination mac.
And if we stop TSO, we may get even the following:
[ 83.343156] skb->len 2966 skb->gso_size 1448 skb->proto 0x8100
skb->mac_len 18 vlan_depth 22 type 0xb84
Because mac_len already accounts for VLAN_HLEN.
After the fix:
cat netperf.tso-on.reorder-off.fixed
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
192.168.3.1 () port 0 AF_INET
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec
87380 16384 16384 10.01 5001.46
[ 81.888489] skb->len 65230 skb->gso_size 1448 skb->proto 0x8100
skb->mac_len 0 vlan_depth 18 type 0x800
CC: Vlad Yasevich <vyasevic@redhat.com>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: Daniel Borkman <dborkman@redhat.com>
CC: David S. Miller <davem@davemloft.net>
Fixes:1e785f48d29a ("net: Start with correct mac_len in
skb_network_protocol")
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- prevent NULL dereference in multicast code
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJTiZ58AAoJEJgn97Bh2u9e2x4QAIjWLDIo6feo4jH8l6q6R5iO
cH/EXCtqk9GHNwvfZDNt+pF19ejzVk/TPnmmXTZ4QcElS9GuXe5WdWxiGcS5KEwa
0UNDRp8fgcBSV1Kqc/vbyKiQ4j69QtC1PPfLWUtxj/GYE0qHX/A1OzB9zROvoHJ7
sa3l8O5XRWiaxBYDkT0RfhHH0jeDdvm3I9yt8B+4B6c71094VIsfGXBVPp4tPrdg
nkuzBdwF0HFPiFrlsfboJDLcXPLpRR93H1GsmfELYd5jQ4rtUhlcuEESq6573tvB
TV93tkm/zmbwtInMoPI29qKL8t2478cJH7SvKvM4NiqMsB1zOhknhUXzElh9TPGA
xyNivxJraYJzL53XguBFO8A8fP1k/E8Z6UQXJbgry4lu+6qZ60e0/J8zGxGpSamP
i1JX0MAVPX6T4MAlZ70LMxfmzJ5sSNkkYyXobG+aBa/AgzRsXVvG4So1qi364COx
btCxgBXK1Z20ZuNclY8/J06D8EbTXI5y5MCSDvMCOHQlb5mjBl34RtFVw+5/QXkg
v2suc7T/YLOPNtZktZC2506caPHoOlwEVvkyA55p+qdkcD/Dd5Iv4Hndi+g+C5gv
O2ja7gUQco1R8ElormKW9rE7OvjiUlowNJmguXWAdzc9FC0yISpP66BAGjBqwhF9
6YibEebXMQICjxAVTEAM
=fvfu
-----END PGP SIGNATURE-----
Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge
Included changes:
- prevent NULL dereference in multicast code
Antonion Quartulli says:
====================
pull request net: batman-adv 20140527
here you have another very small fix intended for net/linux-3.15.
It prevents some multicast functions from dereferencing a NULL pointer.
(Actually it was nothing more than a typo)
I hope it is not too late for such a small patch.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Was introduced with 4c8755d69c
("batman-adv: Send multicast packets to nodes with a WANT_ALL flag")
Reported-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Acked-by: Antonio Quartulli <antonio@meshcoding.com>
Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
Pablo Neira Ayuso says:
====================
The following patchset contains a late fix for IPVS:
* Fix crash when trying to remove the transport header with non-linear
skbuffs, this was introduced in 3.6-rc. Patch from Peter Christensen
via the IPVS folks.
I'll pass this to -stable once this hits mainstream.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iEYEABECAAYFAlOFkZ4ACgkQjTAFq1RaXHOYCQCeJwwa8XQMw2HkGemGMBSpcUHb
MLwAoISCoqgrXWq/lBBzDqlVXik8fR4t
=gHVg
-----END PGP SIGNATURE-----
Merge tag 'linux-can-fixes-for-3.15-20140528' of git://gitorious.org/linux-can/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2014-05-28
here's a pull request for v3.15, hope it's not too late.
Oliver Hartkopp fixed a bug in the CAN led trigger device renaming code.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Reset the GIDs assigned to a VF in the port RoCE GID table when
that guest goes down (either crashes or goes down cleanly).
As part of this fix, we refactor the RoCE gid table driver copy,
moving it to the mlx4_port_info structure (together with the MAC
and VLAN tables).
As with the MAC and VLAN tables, we now use a mutex per port
for the GID table so that modifying the driver copy and
modifying the firmware copy of a port GID table becomes an
atomic operation (thus avoiding driver-copy/FW-copy mismatches).
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Aggreagation of version 1-2 because of version 1 can hit
PLB errors too. If it's not set so we missing events for PLB bits
and driver can't process those interrupts.
Signed-off-by: Ivan Mikhaylov <ivan@ru.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In chips of emac/rgmii b'000' for 0/1 channel isn't suitable which
resulted in non working network interface in this mode.
Signed-off-by: Ivan Mikhaylov <ivan@ru.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit a1ef7bd9fc ("can: rename LED trigger name on netdev renames") renames
the led trigger names according to the changed netdevice name.
As not every CAN driver supports and initializes the led triggers, checking for
the CAN private datastructure with safe_candev_priv() in the notifier chain is
not enough.
This patch adds a check when CONFIG_CAN_LEDS is enabled and the driver does not
support led triggers.
For stable 3.9+
Cc: Fabio Baltieri <fabio.baltieri@gmail.com>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Acked-by: Kurt Van Dijck <dev.kurt@vandijck-laurijssen.be>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Receiving a ICMP response to an IPIP packet in a non-linear skb could
cause a kernel panic in __skb_pull.
The problem was introduced in
commit f2edb9f770 ("ipvs: implement
passive PMTUD for IPIP packets").
Signed-off-by: Peter Christensen <pch@ordbogen.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
John W. Linville says:
====================
pull request: wireless 2014-05-23
I have two more fixes intended for the 3.15 stream...
For the iwlwifi one, Emmanuel says:
"A race has been discovered in the beacon filtering code. Since the
fix is too big for 3.15, I disable here the feature."
For the bluetooth one, Gustavo says:
"This pull request contains a very important fix for 3.15. Here we fix the
permissions of a debugfs file that would otherwise allow unauthorized users
to write content to it."
Please let me know if there are problems!
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This function is called from dcbnl_build_peer_app(). The "info"
struct isn't initialized at all so we disclose 2 bytes of uninitialized
stack data. We should clear it before passing it to the user.
Fixes: 48365e4852 ('qlcnic: dcb: Add support for CEE Netlink interface.')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull sparc fixes from David Miller:
"A small bunch of bug fixes, in particular:
1) On older cpus we need a different chunk of virtual address space
to map the huge page TSB.
2) Missing memory barrier in Niagara2 memcpy.
3) trinity showed some places where fault validation was
unnecessarily loud on sparc64
4) Some sysfs printf's need a type adjustment, from Toralf Förster"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc64: fix format string mismatch in arch/sparc/kernel/sysfs.c
sparc64: Add membar to Niagara2 memcpy code.
sparc64: Fix huge TSB mapping on pre-UltraSPARC-III cpus.
sparc64: Don't bark so loudly about 32-bit tasks generating 64-bit fault addresses.
Pull networking fixes from David Miller:
"It looks like a sizeble collection but this is nearly 3 weeks of bug
fixing while you were away.
1) Fix crashes over IPSEC tunnels with NAT, the latter can reroute
the packet through a non-IPSEC protected path and the code has to
be able to handle SKBs attached to routes lacking an attached xfrm
state. From Steffen Klassert.
2) Fix OOPSs in ipv4 and ipv6 ipsec layers for unsupported
sub-protocols, also from Steffen Klassert.
3) Set local_df on fragmented netfilter skbs otherwise we won't be
able to forward successfully, from Florian Westphal.
4) cdc_mbim ipv6 neighbour code does __vlan_find_dev_deep without
holding RCU lock, from Bjorn Mork.
5) local_df test in ip_may_fragment is inverted, from Florian
Westphal.
6) jme driver doesn't check for DMA mapping failures, from Neil
Horman.
7) qlogic driver doesn't calculate number of TX queues properly, from
Shahed Shaikh.
8) fib_info_cnt can drift irreversibly positive if we fail to
allocate the fi->fib_metrics array, from Sergey Popovich.
9) Fix use after free in ip6_route_me_harder(), also from Sergey
Popovich.
10) When SYSCTL is disabled, we don't handle local_port_range and
ping_group_range defaults properly at all, from Cong Wang.
11) Unaccelerated VLAN tagged frames improperly handled by cdc_mbim
driver, fix from Bjorn Mork.
12) cassini driver needs nested lock annotations for TX locking, from
Emil Goode.
13) On init error ipv6 VTI driver can unregister pernet ops twice,
oops. Fix from Mahtias Krause.
14) If macvlan device is down, don't propagate IFF_ALLMULTI changes,
from Peter Christensen.
15) Missing NULL pointer check while parsing netlink config options in
ip6_tnl_validate(). From Susant Sahani.
16) Fix handling of neighbour entries during ipv6 router reachability
probing, from Duan Jiong.
17) x86 and s390 JIT address randomization has some address
calculation bugs leading to crashes, from Alexei Starovoitov and
Heiko Carstens.
18) Clear up those uglies with nop patching and net_get_random_once(),
from Hannes Frederic Sowa.
19) Option length miscalculated in ip6_append_data(), fix also from
Hannes Frederic Sowa.
20) A while ago we fixed a race during device unregistry when a
namespace went down, turns out there is a second place that needs
similar protection. From Cong Wang.
21) In the new Altera TSE driver multicast filtering isn't working,
disable it and just use promisc mode until the cause is found.
From Vince Bridgers.
22) When we disable router enabling in ipv6 we have to flush the
cached routes explicitly, from Duan Jiong.
23) NBMA tunnels should not cache routes on the tunnel object because
the key is variable, from Timo Teräs.
24) With stacked devices GRO information in skb->cb[] can be not setup
properly, make sure it is in all code paths. From Eric Dumazet.
25) Really fix stacked vlan locking, multiple levels of nesting with
intervening non-vlan devices are possible. From Vlad Yasevich.
26) Fallback ipip tunnel device's mtu is not setup properly, from
Steffen Klassert.
27) The packet scheduler's tcindex filter can crash because we
structure copy objects with list_head's inside, oops. From Cong
Wang.
28) Fix CHECKSUM_COMPLETE handling for ipv6 GRE tunnels, from Eric
Dumazet.
29) In some configurations 'itag' in __mkroute_input() can end up
being used uninitialized because of how fib_validate_source()
works. Fix it by explitly initializing itag to zero like all the
other fib_validate_source() callers do, from Li RongQing"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (116 commits)
batman: fix a bogus warning from batadv_is_on_batman_iface()
ipv4: initialise the itag variable in __mkroute_input
bonding: Send ALB learning packets using the right source
bonding: Don't assume 802.1Q when sending alb learning packets.
net: doc: Update references to skb->rxhash
stmmac: Remove unbalanced clk_disable call
ipv6: gro: fix CHECKSUM_COMPLETE support
net_sched: fix an oops in tcindex filter
can: peak_pci: prevent use after free at netdev removal
ip_tunnel: Initialize the fallback device properly
vlan: Fix build error wth vlan_get_encap_level()
can: c_can: remove obsolete STRICT_FRAME_ORDERING Kconfig option
MAINTAINERS: Pravin Shelar is Open vSwitch maintainer.
bnx2x: Convert return 0 to return rc
bonding: Fix alb mode to only use first level vlans.
bonding: Fix stacked device detection in arp monitoring
macvlan: Fix lockdep warnings with stacked macvlan devices
vlan: Fix lockdep warning with stacked vlan devices.
net: Allow for more then a single subclass for netif_addr_lock
net: Find the nesting level of a given device by type.
...
Pull scheduler fixes from Ingo Molnar:
"The biggest commit is an irqtime accounting loop latency fix, the rest
are misc fixes all over the place: deadline scheduling, docs, numa,
balancer and a bad to-idle latency fix"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/numa: Initialize newidle balance stats in sd_numa_init()
sched: Fix updating rq->max_idle_balance_cost and rq->next_balance in idle_balance()
sched: Skip double execution of pick_next_task_fair()
sched: Use CPUPRI_NR_PRIORITIES instead of MAX_RT_PRIO in cpupri check
sched/deadline: Fix memory leak
sched/deadline: Fix sched_yield() behavior
sched: Sanitize irq accounting madness
sched/docbook: Fix 'make htmldocs' warnings caused by missing description
Pull perf fixes from Ingo Molnar:
"The biggest changes are fixes for races that kept triggering Trinity
crashes, plus liblockdep build fixes and smaller misc fixes.
The liblockdep bits in perf/urgent are a pull mistake - they should
have been in locking/urgent - but by the time I noticed other commits
were added and testing was done :-/ Sorry about that"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf: Fix a race between ring_buffer_detach() and ring_buffer_attach()
perf: Prevent false warning in perf_swevent_add
perf: Limit perf_event_attr::sample_period to 63 bits
tools/liblockdep: Remove all build files when doing make clean
tools/liblockdep: Build liblockdep from tools/Makefile
perf/x86/intel: Fix Silvermont's event constraints
perf: Fix perf_event_init_context()
perf: Fix race in removing an event
Pull drm radeon and nouveau fixes from Dave Airlie:
"Fixes for the other big two.
The radeon VCE one is large but it fixes some userspace triggerable
issues, otherwise its blackscreens and oopses.
Nouveau fixes a bleeding laptop panel issue when displayport is used
sometimes"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/radeon/pm: don't allow debugfs/sysfs access when PX card is off (v2)
drm/radeon: avoid segfault on device open when accel is not working.
drm/radeon: fix typo in finding PLL params
drm/radeon: fix register typo on si
drm/radeon: fix buffer placement under memory pressure v2
drm/radeon: fix page directory update size estimation
drm/radeon: handle non-VGA class pci devices with ATRM
drm/radeon: fix DCE83 check for mullins
drm/radeon: check VCE relocation buffer range v3
drm/radeon: also try GART for CPU accessed buffers
drm/gf119-/disp: fix nasty bug which can clobber SOR0's clock setup
drm/nvd9/therm: handle another kind of PWM fan
Merge misc fixes from Andrew Morton:
"9 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
MAINTAINERS: add closing angle bracket to Vince Bridgers' email address
Documentation: fix DOCBOOKS=... building
ocfs2: fix double kmem_cache_destroy in dlm_init
mm/memory-failure.c: fix memory leak by race between poison and unpoison
wait: swap EXIT_ZOMBIE(Z) and EXIT_DEAD(X) chars in TASK_STATE_TO_CHAR_STR
memcg: fix swapcache charge from kernel thread context
mm: madvise: fix MADV_WILLNEED on shmem swapouts
mm/filemap.c: avoid always dirtying mapping->flags on O_DIRECT
hwpoison, hugetlb: lock_page/unlock_page does not match for handling a free hugepage
Prior to commit 4266129964 ("[media] DocBook: Move all media docbook
stuff into its own directory") it was possible to build only a single
(or more) book(s) by calling, for example
make htmldocs DOCBOOKS=80211.xml
This now fails:
cp: target `.../Documentation/DocBook//media_api' is not a directory
Ignore errors from that copy to make this possible again.
Fixes: 4266129964 ("[media] DocBook: Move all media docbook stuff into its own directory")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In dlm_init, if create dlm_lockname_cache failed in
dlm_init_master_caches, it will destroy dlm_lockres_cache which created
before twice. And this will cause system die when loading modules.
Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When a memory error happens on an in-use page or (free and in-use)
hugepage, the victim page is isolated with its refcount set to one.
When you try to unpoison it later, unpoison_memory() calls put_page()
for it twice in order to bring the page back to free page pool (buddy or
free hugepage list). However, if another memory error occurs on the
page which we are unpoisoning, memory_failure() returns without
releasing the refcount which was incremented in the same call at first,
which results in memory leak and unconsistent num_poisoned_pages
statistics. This patch fixes it.
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: <stable@vger.kernel.org> [2.6.32+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In commit ad86622b47 ("wait: swap EXIT_ZOMBIE and EXIT_DEAD to hide
EXIT_TRACE from user-space") the order of task state definitions were
changed: EXIT_DEAD and EXIT_ZOMBIE were swapped. Though the charterers
for the states in TASK_STATE_TO_CHAR_STR string were not updated. This
patch synchronizes the string to the order of definitions.
Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 284f39afea ("mm: memcg: push !mm handling out to page cache
charge function") explicitly checks for page cache charges without any
mm context (from kernel thread context[1]).
This seemed to be the only possible case where memory could be charged
without mm context so commit 03583f1a63 ("memcg: remove unnecessary
!mm check from try_get_mem_cgroup_from_mm()") removed the mm check from
get_mem_cgroup_from_mm(). This however caused another NULL ptr
dereference during early boot when loopback kernel thread splices to
tmpfs as reported by Stephan Kulow:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000360
IP: get_mem_cgroup_from_mm.isra.42+0x2b/0x60
Oops: 0000 [#1] SMP
Modules linked in: btrfs dm_multipath dm_mod scsi_dh multipath raid10 raid456 async_raid6_recov async_memcpy async_pq raid6_pq async_xor xor async_tx raid1 raid0 md_mod parport_pc parport nls_utf8 isofs usb_storage iscsi_ibft iscsi_boot_sysfs arc4 ecb fan thermal nfs lockd fscache nls_iso8859_1 nls_cp437 sg st hid_generic usbhid af_packet sunrpc sr_mod cdrom ata_generic uhci_hcd virtio_net virtio_blk ehci_hcd usbcore ata_piix floppy processor button usb_common virtio_pci virtio_ring virtio edd squashfs loop ppa]
CPU: 0 PID: 97 Comm: loop1 Not tainted 3.15.0-rc5-5-default #1
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
Call Trace:
__mem_cgroup_try_charge_swapin+0x40/0xe0
mem_cgroup_charge_file+0x8b/0xd0
shmem_getpage_gfp+0x66b/0x7b0
shmem_file_splice_read+0x18f/0x430
splice_direct_to_actor+0xa2/0x1c0
do_lo_receive+0x5a/0x60 [loop]
loop_thread+0x298/0x720 [loop]
kthread+0xc6/0xe0
ret_from_fork+0x7c/0xb0
Also Branimir Maksimovic reported the following oops which is tiggered
for the swapcache charge path from the accounting code for kernel threads:
CPU: 1 PID: 160 Comm: kworker/u8:5 Tainted: P OE 3.15.0-rc5-core2-custom #159
Hardware name: System manufacturer System Product Name/MAXIMUSV GENE, BIOS 1903 08/19/2013
task: ffff880404e349b0 ti: ffff88040486a000 task.ti: ffff88040486a000
RIP: get_mem_cgroup_from_mm.isra.42+0x2b/0x60
Call Trace:
__mem_cgroup_try_charge_swapin+0x45/0xf0
mem_cgroup_charge_file+0x9c/0xe0
shmem_getpage_gfp+0x62c/0x770
shmem_write_begin+0x38/0x40
generic_perform_write+0xc5/0x1c0
__generic_file_aio_write+0x1d1/0x3f0
generic_file_aio_write+0x4f/0xc0
do_sync_write+0x5a/0x90
do_acct_process+0x4b1/0x550
acct_process+0x6d/0xa0
do_exit+0x827/0xa70
kthread+0xc3/0xf0
This patch fixes the issue by reintroducing mm check into
get_mem_cgroup_from_mm. We could do the same trick in
__mem_cgroup_try_charge_swapin as we do for the regular page cache path
but it is not worth troubles. The check is not that expensive and it is
better to have get_mem_cgroup_from_mm more robust.
[1] - http://marc.info/?l=linux-mm&m=139463617808941&w=2
Fixes: 03583f1a63 ("memcg: remove unnecessary !mm check from try_get_mem_cgroup_from_mm()")
Reported-and-tested-by: Stephan Kulow <coolo@suse.com>
Reported-by: Branimir Maksimovic <branimir.maksimovic@gmail.com>
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
MADV_WILLNEED currently does not read swapped out shmem pages back in.
Commit 0cd6144aad ("mm + fs: prepare for non-page entries in page
cache radix trees") made find_get_page() filter exceptional radix tree
entries but failed to convert all find_get_page() callers that WANT
exceptional entries over to find_get_entry(). One of them is shmem swap
readahead in madvise, which now skips over any swap-out records.
Convert it to find_get_entry().
Fixes: 0cd6144aad ("mm + fs: prepare for non-page entries in page cache radix trees")
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In some testing I ran today (some fio jobs that spread over two nodes),
we end up spending 40% of the time in filemap_check_errors(). That
smells fishy. Looking further, this is basically what happens:
blkdev_aio_read()
generic_file_aio_read()
filemap_write_and_wait_range()
if (!mapping->nr_pages)
filemap_check_errors()
and filemap_check_errors() always attempts two test_and_clear_bit() on
the mapping flags, thus dirtying it for every single invocation. The
patch below tests each of these bits before clearing them, avoiding this
issue. In my test case (4-socket box), performance went from 1.7M IOPS
to 4.0M IOPS.
Signed-off-by: Jens Axboe <axboe@fb.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
For handling a free hugepage in memory failure, the race will happen if
another thread hwpoisoned this hugepage concurrently. So we need to
check PageHWPoison instead of !PageHWPoison.
If hwpoison_filter(p) returns true or a race happens, then we need to
unlock_page(hpage).
Signed-off-by: Chen Yucong <slaoub@gmail.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Tested-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: <stable@vger.kernel.org> [2.6.36+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The 'renameat2()' system call was incorrectly added as a ENTRY_COMP() in
the parisc system call table by commit 18e480aa07 ("parisc: add
renameat2 syscall"). That causes a link-time error due to there not
being any compat version of that system call:
arch/parisc/kernel/built-in.o: In function `sys_call_table':
(.rodata+0xad0): undefined reference to `compat_sys_renameat2'
make: *** [vmlinux] Error 1
Easily fixed by marking the system call as being the same for compat as
for native by using ENTRY_SAME() instead of ENTRY_COMP().
Reported-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Miklos Szeredi <miklos@szeredi.hu>
Acked-by: Helge Deller <deller@gmx.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
batman tries to search dev->iflink to check if it's a batman interface,
but ->iflink could be 0, which is not a valid ifindex. It should just
avoid iflink == 0 case.
Reported-by: Jet Chen <jet.chen@intel.com>
Tested-by: Jet Chen <jet.chen@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Antonio Quartulli <antonio@open-mesh.com>
Cc: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
the value of itag is a random value from stack, and may not be initiated by
fib_validate_source, which called fib_combine_itag if CONFIG_IP_ROUTE_CLASSID
is not set
This will make the cached dst uncertainty
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ALB learning packets are currentlyalways sent using the slave mac
address for all vlans configured on top of bond. This is not always
correct, as vlans may change their mac address.
This patch introduced a concept of strict matching where the
source of learning packets can either strictly match the address
passed in, or it can determine a more correct address to use.
There are 3 casese to consider:
1) Switchover. In this case, we have a new active slave and we need
tell the switch about all addresses available on the slave.
2) Monitor. We'll periodically refresh learning info for all slaves.
In this case, we refresh all addresses for current active, and just
the slave address for other slaves.
3) Teaching of disabled adddress. This happens as part of the
failover and in this case, we alwyas to use just the address
provided.
CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
TLB/ALB learning packets always assume 802.1Q vlan protocol, but
that is no longer the case since we now have support for Q-in-Q
on top of bonding. Pass the vlan protocol to alb_send_lp_vid()
so that the packets are properly tagged.
CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Acked-by: Veaceslav Falico <vfalico@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iEYEABECAAYFAlN8ibsACgkQjTAFq1RaXHPXQQCdF6gWW/tCObrWO8cWHQJCRij+
FVwAn2cAfA0W8goptL45550Nhzd1ijQf
=HF4f
-----END PGP SIGNATURE-----
Merge tag 'linux-can-fixes-for-3.15-20140521' of git://gitorious.org/linux-can/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2014-05-21
this is a pull request for net/master, for the v3.15 release cycle, with a
single patch. Christopher R. Baker found a use after free during unloading of
the peak_pci driver. This is fixes in a patch by Stephane Grosjean.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
In commit 61b905da33 ("net: Rename skb->rxhash to skb->hash"), skb->rxhash
was renamed to skb->hash. Update references in Documentation
accordingly.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The stmmac_open call was calling clk_disable_unprepare on phy init
failure, but it never calls clk_prepare_enable, this causes
a WARN_ON in the clk framework to trigger if for some reason phy init
fails.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Acked-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
radeon fixes, VCE one is big but does fix a userspace crash.
* 'drm-fixes-3.15' of git://people.freedesktop.org/~deathsimple/linux:
drm/radeon/pm: don't allow debugfs/sysfs access when PX card is off (v2)
drm/radeon: avoid segfault on device open when accel is not working.
drm/radeon: fix typo in finding PLL params
drm/radeon: fix register typo on si
drm/radeon: fix buffer placement under memory pressure v2
drm/radeon: fix page directory update size estimation
drm/radeon: handle non-VGA class pci devices with ATRM
drm/radeon: fix DCE83 check for mullins
drm/radeon: check VCE relocation buffer range v3
drm/radeon: also try GART for CPU accessed buffers
fixes nasty panel bleeding bug.
* 'drm-nouveau-next' of git://anongit.freedesktop.org/git/nouveau/linux-2.6:
drm/gf119-/disp: fix nasty bug which can clobber SOR0's clock setup
drm/nvd9/therm: handle another kind of PWM fan
When GRE support was added in linux-3.14, CHECKSUM_COMPLETE handling
broke on GRE+IPv6 because we did not update/use the appropriate csum :
GRO layer is supposed to use/update NAPI_GRO_CB(skb)->csum instead of
skb->csum
Tested using a GRE tunnel and IPv6 traffic. GRO aggregation now happens
at the first level (ethernet device) instead of being done in gre
tunnel. Native IPv6+TCP is still properly aggregated.
Fixes: bf5a755f5e ("net-gre-gro: Add GRE support to the GRO stack")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jerry Chu <hkchu@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull two powerpc fixes from Ben Herrenschmidt:
"Here are a couple of fixes for 3.15. One from Anton fixes a nasty
regression I introduced when trying to fix a loss of irq_work whose
consequences is that we can completely lose timer interrupts on a
CPU... not pretty.
The other one is a change to our PCIe reset hook to use a firmware
call instead of direct config space accesses to trigger a fundamental
reset on the root port. This is necessary so that the FW gets a
chance to disable the link down error monitoring, which would
otherwise trip and cause subsequent fatal EEH error"
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc: irq work racing with timer interrupt can result in timer interrupt hang
powerpc/powernv: Reset root port in firmware