IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Pull thermal management fixes from Zhang Rui:
- Fix a wrong __percpu structure declaration in intel_powerclamp driver
(Luc Van Oostenryck)
- Fix truncated name of the idle injection kthreads created by
intel_powerclamp driver (Zhang Rui)
- Fix the missing UUID supports in int3400 thermal driver (Matthew
Garrett)
- Fix a crash when accessing the debugfs of bcm2835 SoC thermal driver
(Phil Elwell)
- A couple of trivial fixes/cleanups in some SoC thermal drivers
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux:
thermal/intel_powerclamp: fix truncated kthread name
thermal: mtk: Allocate enough space for mtk_thermal.
thermal/int340x_thermal: fix mode setting
thermal/int340x_thermal: Add additional UUIDs
thermal: cpu_cooling: Remove unused cur_freq variable
thermal: bcm2835: Fix crash in bcm2835_thermal_debugfs
thermal: samsung: Fix incorrect check after code merge
thermal/intel_powerclamp: fix __percpu declaration of worker_data
-----BEGIN PGP SIGNATURE-----
iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAlyWPM0ACgkQiiy9cAdy
T1FKCAwAsrZ2WlTdSp5/1Ogwa18vrS5dMHnMipOaZytG6HxBDXPGDzohNQxHbLQK
ShxQIrcPoVmB6WVrzpEYrPGamDETogp+ennVHwngTNDP7TN/U/oSVzBSJ/ZW32uO
w6LSXm3upVNluQLalLhy95xRUhZrt/FkCGp8BkTduR9VObfDtSHouCvsdMl0gXL1
qHZM1LguJsB0ziWNQpeXvFar63NO5bJEBtvP+sc+sGjuoRskE6Bz68GCg3t+Mbp+
73M/iVil8HWMHZELs1JwPslakbp1xDAcz7fjcizrVXcMaW8xqT68rylORC4aY3p3
bNqDTCPRjufE0ktjSi8ld8g9W3W/TbyizBVUDUHYqDzRomx5sx74vz57yCdqyKZX
/MENHMcrQ8OcB1OXKKTtgbh3dEcwSbrKHKuO+0XMz4We1S5Py8qRRTVxissqxWx6
aSeXYDC2XLlsPO9/OY6h08pQ21i7qO8wbuCbmpunaMgnk1vurF0B5Xt8s27FSoEK
M32p50xP
=Q90v
-----END PGP SIGNATURE-----
Merge tag '5.1-rc1-cifs-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb3 fixes from Steve French:
- two fixes for stable for guest mount problems with smb3.1.1
- two fixes for crediting (SMB3 flow control) on resent requests
- a byte range lock leak fix
- two fixes for incorrect rc mappings
* tag '5.1-rc1-cifs-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: update internal module version number
SMB3: Fix SMB3.1.1 guest mounts to Samba
cifs: Fix slab-out-of-bounds when tracing SMB tcon
cifs: allow guest mounts to work for smb3.11
fix incorrect error code mapping for OBJECTID_NOT_FOUND
cifs: fix that return -EINVAL when do dedupe operation
CIFS: Fix an issue with re-sending rdata when transport returning -EAGAIN
CIFS: Fix an issue with re-sending wdata when transport returning -EAGAIN
- Series to fix a memory leak in hd44780
while introducing charlcd_free().
From Andy Shevchenko
- Series to clean up the Kconfig menus and
a couple of improvements for charlcd.
From Mans Rullgard
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEPjU5OPd5QIZ9jqqOGXyLc2htIW0FAlyWPZIACgkQGXyLc2ht
IW1Jkg/8DJPkAjGqmhSMXesHj/M4+0sIRq3MDLFoZTLBzWfalCCUeu/f8qun0Eb3
kGHRHuCfu1tJF6EP/k4UcPLvsmrh9iKRc/GYcgyHyuOxxkOhmyxd890FwdpZb5ev
Mawv0otK8Helv4W8hxQhO+BN1R2qPVhM7ddQ3dX2190N46Q3SzAvie3itdNzwF6C
voT9XfHXjqenSA/xjEZbTfF18zndlGE+inbouyw6vT/O4dIu++ljUVNLOahvUXPt
tEiZ3buwfIuF0tFnVllmSJToGlEJ47fdCZ0IP+eK0ay5DzhdDayimd/OdZPHCHL/
DY2Cvp5ugB6BwZLLSVpIGKNL/GAnA4686+zy3uRykCs1eDZsM2gSYe3Xat2aQCpy
101GUfgym+3VFtv+wkYk83Lxr2cCPOc5a700xzx9GSDtXAF+MY+Ahi8UoKnOa9lg
48MKgXMlCN69/Hd/LWQmc8qVAx3AQQxlWroKV8hK9N1ZRklY4J5ZhNkQjfUhO2FF
6U8+s6pq3s2Im3AKvsCoAxWZa+sBPJhH+yYWszxfqap+k8p4Ligku3iQgoM0OaMI
mxgv5cSZWeuO/tG7WTiJAuc0KgZen2w04iBYOYL6jyhMdzLjfs24CrJU9y0bLKFd
GY7jkxDuv6TsS5GqHalDepXyk1Y41C1jaddz1AcUvBqyVTRPjj0=
=MO2/
-----END PGP SIGNATURE-----
Merge tag 'auxdisplay-for-linus-v5.1-rc2' of git://github.com/ojeda/linux
Pull auxdisplay updates from Miguel Ojeda:
"A few fixes and improvements for auxdisplay:
- Series to fix a memory leak in hd44780 while introducing
charlcd_free(). From Andy Shevchenko
- Series to clean up the Kconfig menus and a couple of improvements
for charlcd. From Mans Rullgard"
* tag 'auxdisplay-for-linus-v5.1-rc2' of git://github.com/ojeda/linux:
auxdisplay: charlcd: make backlight initial state configurable
auxdisplay: charlcd: simplify init message display
auxdisplay: deconfuse configuration
auxdisplay: hd44780: Convert to use charlcd_free()
auxdisplay: panel: Convert to use charlcd_free()
auxdisplay: charlcd: Introduce charlcd_free() helper
auxdisplay: charlcd: Move to_priv() to charlcd namespace
auxdisplay: hd44780: Fix memory leak on ->remove()
Six fixes to four drivers and two core fixes. One core fix simply
corrects a missed destroy_rcu_head() but the other is hopefully the
end of an ongoing effort to make suspend/resume play nicely with scsi
quiesce.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXJZbZCYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishaIGAP9dOI8E
7RJRVPzXDEq7A2YE8whsWiQw7lB/BwMIYQenIwD/YySqYO8hTIDzXzkCuR94OY5L
HTWvluKkLILX/wAFM50=
=ItVl
-----END PGP SIGNATURE-----
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Six fixes to four drivers and two core fixes.
One core fix simply corrects a missed destroy_rcu_head() but the other
is hopefully the end of an ongoing effort to make suspend/resume play
nicely with scsi quiesce"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: ibmvscsi: Fix empty event pool access during host removal
scsi: ibmvscsi: Protect ibmvscsi_head from concurrent modificaiton
scsi: hisi_sas: Add softreset in hisi_sas_I_T_nexus_reset()
scsi: qla2xxx: Fix NULL pointer crash due to stale CPUID
scsi: qla2xxx: Fix FC-AL connection target discovery
scsi: core: Avoid that a kernel warning appears during system resume
scsi: core: Also call destroy_rcu_head() for passthrough requests
scsi: iscsi: flush running unbind operations when removing a session
Since board support for the CLPS711X platform was removed,
remove the board support from the clps711x-timer driver.
Signed-off-by: Alexander Shiyan <shc_work@mail.ru>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lkml.kernel.org/r/20181220111626.17140-1-shc_work@mail.ru
Igor Russkikh says:
====================
net: aquantia: RX performance optimization patches
Here is a set of patches targeting for performance improvement
on various platforms and protocols.
Our main target was rx performance on iommu systems, notably
NVIDIA Jetson TX2 and NVIDIA Xavier platforms.
We introduce page reuse strategy to better deal with iommu dma mapping costs.
With it we see 80-90% of page reuse under some test configurations on UDP traffic.
This shows good improvements on other systems with IOMMU hardware, like
AMD Ryzen.
We've also improved TCP LRO configuration parameters, allowing packets to better
coalesce.
Page reuse tests were carried out using iperf3, iperf2, netperf and pktgen.
Mainly on UDP traffic, with various packet lengths.
Jetson TX2, UDP, Default MTU:
RX Lost Datagrams
Before: Max: 69% Min: 68% Avg: 68.5%
After: Max: 41% Min: 38% Avg: 39.2%
Maximum throughput
Before: 1.27 Gbits/sec
After: 2.41 Gbits/sec
AMD Ryzen 5 2400G, UDP, Default MTU:
RX Lost Datagrams
Before: Max: 12% Min: 4.5% Avg: 7.17%
After: Max: 6.2% Min: 2.3% Avg: 4.26%
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver is now constantly tested in our lab on aarch64 hardware:
Jetson tx2, Pascal and Xavier tegra based hardware.
Many of tegra smmu related HW bugs were fixed or workarounded already.
Thus, add ARM64 into Kconfig.
Add also COMPILE_TEST dependency.
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Default LRO HW configuration was very conservative.
Low Number of Descriptors per LRO Sequence, small session
timeout, inefficient settings in interrupt generation logic.
Change max number of LRO descriptors from 2 to 16 to
increase performance. Increase maximum coalescing interval
in HW to 250uS. Tune up HW LRO interrupt generation setting
to prevent hw issues with long LRO sessions.
Signed-off-by: Nikita Danilov <nikita.danilov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For multigig rates 1K ring size is often not enough and causes extra
packet drops in hardware.
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This correlates with default internet MTU. This also allows page
flip/reuse to be activated, since each allocated RX page now serves for
two frags/packets.
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Before that, we've refilled ring even on single descriptor move.
Under high packet load that caused page allocation logic to be triggered
too often. That made overall ring processing slower.
Moreover, with page buffer reuse implemented, we should give a chance
higher networking levels to process received packets faster, release
the pages they consumed and therefore give a higher chance for these
pages to be reused.
RX ring is now refilled only when AQ_CFG_RX_REFILL_THRES or more
descriptors were processed (32 by default). Under regular traffic this
gives quite enough time for packet to be consumed and page to be reused.
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We introduce internal aq_rxpage wrapper over regular page
where extra field is tracked: rxpage offset inside of allocated page.
This offset allows to reuse one page for multiple packets.
When needed (for example with large frames processing), allocated
pageorder could be customized. This gives even larger page reuse
efficiency.
page_ref_count is used to track page users. If during rx refill
underlying page has users, we increase pg_off by rx frame size
thus the top half of the page is reused.
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Atlantic driver used 14 bytes preallocated skb size. That made L3 protocol
processing inefficient because pskb_pull had to fetch all the L3/L4 headers
from extra fragments.
Specially on UDP flows that caused extra packet drops because CPU was
overloaded with pskb_pull.
This patch uses eth_get_headlen for skb preallocation.
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This series includes updates to mlx5 driver,
1) Compiler warnings cleanup from Saeed Mahameed
2) Parav Pandit simplifies sriov enable/disables
3) Gustavo A. R. Silva, Removes a redundant assignment
4) Moshe Shemesh, Adds Geneve tunnel stateless offload support
5) Eli Britstein, Adds the Support for VLAN modify action and
Replaces TC VLAN pop and push actions with VLAN modify
Note: This series includes two simple non-mlx5 patches,
1) Declare IANA_VXLAN_UDP_PORT definition in include/net/vxlan.h,
and use it in some drivers.
2) Declare GENEVE_UDP_PORT definition in include/net/geneve.h,
and use it in mlx5 and nfp drivers.
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJclTLsAAoJEEg/ir3gV/o+7bEH/1sz4oKP2mfhKSbG/I/g7Q3D
ifnccYq2EyXd1HzeglXpzLndO8wPve9qr/ANKrrKIYYCxc8FpCdb4aJD1Ucuylbb
XHHdfbTIPMa3vjhKtR/Fydht4RkY5IBBsgXywBcNL3ofxmnleNt9JRSr76Yhr2sy
Q3H30X+UvwAAQJBY1X+P8RiJcSklLu0UPG2KtTXcCz8YRgOWK0JtEiQyQu6yET4u
zbVxYixwKgsR9uhwNXqLxVMsaWFue9cYmVSMLigDx7fRZvj6Ao9REEUflt1hCEoR
jOXm1Avnsg9TKnwmgiBjrWQQQ4h+IMfZLK8EtuxVcraBUjtQRVnPak5JjZMjDuc=
=7t4R
-----END PGP SIGNATURE-----
Merge tag 'mlx5-updates-2019-03-20' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2019-03-20
This series includes updates to mlx5 driver,
1) Compiler warnings cleanup from Saeed Mahameed
2) Parav Pandit simplifies sriov enable/disables
3) Gustavo A. R. Silva, Removes a redundant assignment
4) Moshe Shemesh, Adds Geneve tunnel stateless offload support
5) Eli Britstein, Adds the Support for VLAN modify action and
Replaces TC VLAN pop and push actions with VLAN modify
Note: This series includes two simple non-mlx5 patches,
1) Declare IANA_VXLAN_UDP_PORT definition in include/net/vxlan.h,
and use it in some drivers.
2) Declare GENEVE_UDP_PORT definition in include/net/geneve.h,
and use it in mlx5 and nfp drivers.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
100GbE Intel Wired LAN Driver Updates 2019-03-22
This series contains updates to ice driver only.
Akeem enables MAC anti-spoofing by default when a new VSI is being
created. Fixes an issue when reclaiming VF resources back to the pool
after reset, by freeing VF resources separately using the first VF
vector index to traverse the list, instead of starting at the last
assigned vectors list. Added support for VF & PF promiscuous mode in
the ice driver. Fixed the PF driver from letting the VF know it is "not
trusted" when it attempts to add more than its permitted additional MAC
addresses. Altered how the driver gets the VF VSIs instances, instead
of using the mailbox messages to retrieve VSIs, get it directly via the
VF object in the PF data structure.
Bruce fixes return values to resolve static analysis warnings. Made
whitespace changes to increase readability and reduce code wrapping.
Anirudh cleans up code by removing a function prototype that was never
implemented and removed an unused field in the ice_sched_vsi_info
structure.
Kiran fixes a potential divide by zero issue by adding a check.
Victor cleans up the transmit scheduler by adjusting the stack variable
usage and added/modified debug prints to make them more useful.
Yashaswini updates the driver in VEB mode to ensure that the LAN_EN bit
is set if all the right conditions are met.
Christopher ensures the loopback enable bit is not set for prune switch
rules, since all transmit traffic would be looped back to the internal
switch and dropped.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet says:
====================
tcp: add rx/tx cache to reduce lock contention
On hosts with many cpus we can observe a very serious contention
on spinlocks used in mm slab layer.
The following can happen quite often :
1) TX path
sendmsg() allocates one (fclone) skb on CPU A, sends a clone.
ACK is received on CPU B, and consumes the skb that was in the retransmit
queue.
2) RX path
network driver allocates skb on CPU C
recvmsg() happens on CPU D, freeing the skb after it has been delivered
to user space.
In both cases, we are hitting the asymetric alloc/free pattern
for which slab has to drain alien caches. At 8 Mpps per second,
this represents 16 Mpps alloc/free per second and has a huge penalty.
In an interesting experiment, I tried to use a single kmem_cache for all the skbs
(in skb_init() : skbuff_fclone_cache = skbuff_head_cache =
kmem_cache_create("skbuff_fclone_cache", sizeof(struct sk_buff_fclones),);
qnd most of the contention disappeared, since cpus could better use
their local slab per-cpu cache.
But we can do actually better, in the following patches.
TX : at ACK time, no longer free the skb but put it back in a tcp socket cache,
so that next sendmsg() can reuse it immediately.
RX : at recvmsg() time, do not free the skb but put it in a tcp socket cache
so that it can be freed by the cpu feeding the incoming packets in BH.
This increased the performance of small RPC benchmark by about 10 % on a host
with 112 hyperthreads.
v2 : - Solved a race condition : sk_stream_alloc_skb() to make sure the prior
clone has been freed.
- Really test rps_needed in sk_eat_skb() as claimed.
- Fixed rps_needed use in drivers/net/tun.c
v3: Added a #ifdef CONFIG_RPS, to avoid compile error (kbuild robot)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Often times, recvmsg() system calls and BH handling for a particular
TCP socket are done on different cpus.
This means the incoming skb had to be allocated on a cpu,
but freed on another.
This incurs a high spinlock contention in slab layer for small rpc,
but also a high number of cache line ping pongs for larger packets.
A full size GRO packet might use 45 page fragments, meaning
that up to 45 put_page() can be involved.
More over performing the __kfree_skb() in the recvmsg() context
adds a latency for user applications, and increase probability
of trapping them in backlog processing, since the BH handler
might found the socket owned by the user.
This patch, combined with the prior one increases the rpc
performance by about 10 % on servers with large number of cores.
(tcp_rr workload with 10,000 flows and 112 threads reach 9 Mpps
instead of 8 Mpps)
This also increases single bulk flow performance on 40Gbit+ links,
since in this case there are often two cpus working in tandem :
- CPU handling the NIC rx interrupts, feeding the receive queue,
and (after this patch) freeing the skbs that were consumed.
- CPU in recvmsg() system call, essentially 100 % busy copying out
data to user space.
Having at most one skb in a per-socket cache has very little risk
of memory exhaustion, and since it is protected by socket lock,
its management is essentially free.
Note that if rps/rfs is used, we do not enable this feature, because
there is high chance that the same cpu is handling both the recvmsg()
system call and the TCP rx path, but that another cpu did the skb
allocations in the device driver right before the RPS/RFS logic.
To properly handle this case, it seems we would need to record
on which cpu skb was allocated, and use a different channel
to give skbs back to this cpu.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On hosts with a lot of cores, RPC workloads suffer from heavy contention on slab spinlocks.
20.69% [kernel] [k] queued_spin_lock_slowpath
5.64% [kernel] [k] _raw_spin_lock
3.83% [kernel] [k] syscall_return_via_sysret
3.48% [kernel] [k] __entry_text_start
1.76% [kernel] [k] __netif_receive_skb_core
1.64% [kernel] [k] __fget
For each sendmsg(), we allocate one skb, and free it at the time ACK packet comes.
In many cases, ACK packets are handled by another cpus, and this unfortunately
incurs heavy costs for slab layer.
This patch uses an extra pointer in socket structure, so that we try to reuse
the same skb and avoid these expensive costs.
We cache at most one skb per socket so this should be safe as far as
memory pressure is concerned.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We prefer static_branch_unlikely() over static_key_false() these days.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Paolo Abeni says:
====================
net: dev: BYPASS for lockless qdisc
This patch series is aimed at improving xmit performances of lockless qdisc
in the uncontended scenario.
After the lockless refactor pfifo_fast can't leverage the BYPASS optimization.
Due to retpolines the overhead for the avoidables enqueue and dequeue operations
has increased and we see measurable regressions.
The first patch introduces the BYPASS code path for lockless qdisc, and the
second one optimizes such path further. Overall this avoids up to 3 indirect
calls per xmit packet. Detailed performance figures are reported in the 2nd
patch.
v2 -> v3:
- qdisc_is_empty() has a const argument (Eric)
v1 -> v2:
- use really an 'empty' flag instead of 'not_empty', as
suggested by Eric
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
With commit c5ad119fb6 ("net: sched: pfifo_fast use skb_array")
pfifo_fast no longer benefit from the TCQ_F_CAN_BYPASS optimization.
Due to retpolines the cost of the enqueue()/dequeue() pair has become
relevant and we observe measurable regression for the uncontended
scenario when the packet-rate is below line rate.
After commit 46b1c18f9d ("net: sched: put back q.qlen into a
single location") we can check for empty qdisc with a reasonably
fast operation even for nolock qdiscs.
This change extends TCQ_F_CAN_BYPASS support to nolock qdisc.
The new chunk of code mirrors closely the existing one for traditional
qdisc, leveraging a newly introduced helper to read atomically the
qdisc length.
Tested with pktgen in queue xmit mode, with pfifo_fast, a MQ
device, and MQ root qdisc:
threads vanilla patched
kpps kpps
1 2465 2889
2 4304 5188
4 7898 9589
Same as above, but with a single queue device:
threads vanilla patched
kpps kpps
1 2556 2827
2 2900 2900
4 5000 5000
8 4700 4700
No mesaurable changes in the contended scenarios, and more 10%
improvement in the uncontended ones.
v1 -> v2:
- rebased after flag name change
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Tested-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The queue is marked not empty after acquiring the seqlock,
and it's up to the NOLOCK qdisc clearing such flag on dequeue.
Since the empty status lays on the same cache-line of the
seqlock, it's always hot on cache during the updates.
This makes the empty flag update a little bit loosy. Given
the lack of synchronization between enqueue and dequeue, this
is unavoidable.
v2 -> v3:
- qdisc_is_empty() has a const argument (Eric)
v1 -> v2:
- use really an 'empty' flag instead of 'not_empty', as
suggested by Eric
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add documentation to the tcp_ca_state enum, since this enum is
exposed in uapi.
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Cc: Sowmini Varadhan <sowmini05@gmail.com>
Acked-by: Sowmini Varadhan <sowmini05@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
clang produces a false-positive warning as it fails to notice
that "lost = true" implies that "ret" is initialized:
net/rxrpc/output.c:402:6: error: variable 'ret' is used uninitialized whenever 'if' condition is true [-Werror,-Wsometimes-uninitialized]
if (lost)
^~~~
net/rxrpc/output.c:437:6: note: uninitialized use occurs here
if (ret >= 0) {
^~~
net/rxrpc/output.c:402:2: note: remove the 'if' if its condition is always false
if (lost)
^~~~~~~~~
net/rxrpc/output.c:339:9: note: initialize the variable 'ret' to silence this warning
int ret, opt;
^
= 0
Rearrange the code to make that more obvious and avoid the warning.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When checking the code with clang -Wsometimes-uninitialized we get the
following warning:
if (!tipc_link_is_establishing(l)) {
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
net/tipc/node.c:847:46: note: uninitialized use occurs here
tipc_bearer_xmit(n->net, bearer_id, &xmitq, maddr);
net/tipc/node.c:831:2: note: remove the 'if' if its condition is always
true
if (!tipc_link_is_establishing(l)) {
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
net/tipc/node.c:821:31: note: initialize the variable 'maddr' to silence
this warning
struct tipc_media_addr *maddr;
We fix this by initializing 'maddr' to NULL. For the matter of clarity,
we also test if 'xmitq' is non-empty before we use it and 'maddr'
further down in the function. It will never happen that 'xmitq' is non-
empty at the same time as 'maddr' is NULL, so this is a sufficient test.
Fixes: 598411d70f ("tipc: make resetting of links non-atomic")
Reported-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tcp_clock_ns() (aka ktime_get_ns()) is using monotonic clock,
so the checks we had in tcp_mstamp_refresh() are no longer
relevant.
This patch removes cpu stall (when the cache line is not hot)
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A new mirred action is created by the tcf_mirred_init function. This
contains a list head struct which is inserted into a global list on
successful creation of a new action. However, after a creation, it is
still possible to error out and call the tcf_idr_release function. This,
in turn, calls the act_mirr cleanup function via __tcf_idr_release and
__tcf_action_put. This cleanup function tries to delete the list entry
which is as yet uninitialised, leading to a NULL pointer exception.
Fix this by initialising the list entry on creation of a new action.
Bug report:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
PGD 8000000840c73067 P4D 8000000840c73067 PUD 858dcc067 PMD 0
Oops: 0002 [#1] SMP PTI
CPU: 32 PID: 5636 Comm: handler194 Tainted: G OE 5.0.0+ #186
Hardware name: Dell Inc. PowerEdge R730/0599V5, BIOS 1.3.6 06/03/2015
RIP: 0010:tcf_mirred_release+0x42/0xa7 [act_mirred]
Code: f0 90 39 c0 e8 52 04 57 c8 48 c7 c7 b8 80 39 c0 e8 94 fa d4 c7 48 8b 93 d0 00 00 00 48 8b 83 d8 00 00 00 48 c7 c7 f0 90 39 c0 <48> 89 42 08 48 89 10 48 b8 00 01 00 00 00 00 ad de 48 89 83 d0 00
RSP: 0018:ffffac4aa059f688 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff9dcd1b214d00 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff9dcd1fa165f8 RDI: ffffffffc03990f0
RBP: ffff9dccf9c7af80 R08: 0000000000000a3b R09: 0000000000000000
R10: ffff9dccfa11f420 R11: 0000000000000000 R12: 0000000000000001
R13: ffff9dcd16b433c0 R14: ffff9dcd1b214d80 R15: 0000000000000000
FS: 00007f441bfff700(0000) GS:ffff9dcd1fa00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000008 CR3: 0000000839e64004 CR4: 00000000001606e0
Call Trace:
tcf_action_cleanup+0x59/0xca
__tcf_action_put+0x54/0x6b
__tcf_idr_release.cold.33+0x9/0x12
tcf_mirred_init.cold.20+0x22e/0x3b0 [act_mirred]
tcf_action_init_1+0x3d0/0x4c0
tcf_action_init+0x9c/0x130
tcf_exts_validate+0xab/0xc0
fl_change+0x1ca/0x982 [cls_flower]
tc_new_tfilter+0x647/0x8d0
? load_balance+0x14b/0x9e0
rtnetlink_rcv_msg+0xe3/0x370
? __switch_to_asm+0x40/0x70
? __switch_to_asm+0x34/0x70
? _cond_resched+0x15/0x30
? __kmalloc_node_track_caller+0x1d4/0x2b0
? rtnl_calcit.isra.31+0xf0/0xf0
netlink_rcv_skb+0x49/0x110
netlink_unicast+0x16f/0x210
netlink_sendmsg+0x1df/0x390
sock_sendmsg+0x36/0x40
___sys_sendmsg+0x27b/0x2c0
? futex_wake+0x80/0x140
? do_futex+0x2b9/0xac0
? ep_scan_ready_list.constprop.22+0x1f2/0x210
? ep_poll+0x7a/0x430
__sys_sendmsg+0x47/0x80
do_syscall_64+0x55/0x100
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fixes: 4e232818bd ("net: sched: act_mirred: remove dependency on rtnl lock")
Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bartek reported that after few cable unplug/replug cycles suddenly
replug isn't detected any longer. His system uses a RTL8106, I wasn't
able to reproduce the issue with RTL8168g. According to his bisect
the referenced commit caused the regression. As Realtek doesn't
release datasheets or errata it's hard to say what's the actual root
cause, but this change was reported to fix the issue.
Fixes: 38caff5a44 ("r8169: handle all interrupt events in the hard irq handler")
Reported-by: Bartosz Skrzypczak <barteks2x@gmail.com>
Suggested-by: Bartosz Skrzypczak <barteks2x@gmail.com>
Tested-by: Bartosz Skrzypczak <barteks2x@gmail.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The call to of_get_child_by_name returns a node pointer with refcount
incremented thus it must be explicitly decremented after the last
usage.
Detected by coccinelle with the following warnings:
./drivers/net/ethernet/ti/netcp_ethss.c:3661:2-8: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 3654, but without a corresponding object release within this function.
./drivers/net/ethernet/ti/netcp_ethss.c:3665:2-8: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 3654, but without a corresponding object release within this function.
Signed-off-by: Wen Yang <wen.yang99@zte.com.cn>
Cc: Wingman Kwok <w-kwok2@ti.com>
Cc: Murali Karicheri <m-karicheri2@ti.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
The call to ehea_get_eth_dn returns a node pointer with refcount
incremented thus it must be explicitly decremented after the last
usage.
Detected by coccinelle with the following warnings:
./drivers/net/ethernet/ibm/ehea/ehea_main.c:3163:2-8: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 3154, but without a corresponding object release within this function.
Signed-off-by: Wen Yang <wen.yang99@zte.com.cn>
Cc: Douglas Miller <dougmill@linux.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
The call to of_parse_phandle returns a node pointer with refcount
incremented thus it must be explicitly decremented after the last
usage.
Detected by coccinelle with the following warnings:
./drivers/net/ethernet/xilinx/xilinx_axienet_main.c:1624:1-7: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 1569, but without a corresponding object release within this function.
Signed-off-by: Wen Yang <wen.yang99@zte.com.cn>
Cc: Anirudha Sarangi <anirudh@xilinx.com>
Cc: John Linn <John.Linn@xilinx.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Michal Simek <michal.simek@xilinx.com>
Cc: netdev@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
After 90eff9096c ("net: phy: Allow splitting MDIO bus/device support
from PHYs") the various MDIO bus drivers were no longer parented with
config PHYLIB but with config MDIO_BUS which is not a menuconfig, fix
this by depending on MDIO_DEVICE which is a menuconfig.
This is visually nicer and less confusing for users.
Fixes: 90eff9096c ("net: phy: Allow splitting MDIO bus/device support from PHYs")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The tristate prompt should have been replaced rather than defined a few
lines below, rebase mistake.
Fixes: 17cc982176 ("net: phy: Move Omega PHY entry to Cygnus PHY driver")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAlyWVysQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpn5lD/0bEg76kbuwOUy5+FDqOpF0MNOU7xZcYcsI
YkkaKkUi2YQL6NJlkU7AhtPwep+J2sgSnDW9Ho9WIXbsnsO6UF79uIdcix6zJGIl
WnZZ3BLgWeciCfrzFpn3FFZnm/AKJSPWPmllUFvmUYT9GdRgN4ZnHBsS1HTlJ1m5
5HhwLtaYOsZ75NxWBRqWspmtFe+XZ/CrjGgmvIF8FjSuIP2q0RrOmCF1XAA82umd
ehiU1ZtQ+v4FHxmJWjzMWhrCj2y0gmPb+DotIqefFjVnd/G+LrFGMD1fsLoQVFDy
L5VzSOGj1E4KXfDpIeGnz/08dpqXmOkvsSaNnv1U7vA7SCkbodR/BA1EKJrvk5v7
MGkkcQDaU/WzC41RCyVQNWAWjzNLKbruXQ+1HqCx5eh7uthvMQMXDvGf4Jgeq+/E
vGzrEKZ6qI78Vy0mXSy4dfFbFaNTjCkE2jbIG7BQx5zdtnS9/VPXNkpZxPrGLM+P
/fTsLXghU9lKn6WHVtLpQsfJr0OMjyC9JA23pTX2G9MtBhDcyuRs+uCeQgG6cIkl
F15LGuOY7YGYxRsegdinFaoldnHersUDx19c+uFdrB0k0A/A6KeGHuZx7aJPkW1L
M89FkyJr2ZBgc26PvKz6j1Hwl2MKJC5h8TpPES/QnulWh4FbqqH3a501Qa1AQuxC
1me95iy74w==
=l4lx
-----END PGP SIGNATURE-----
Merge tag 'io_uring-20190323' of git://git.kernel.dk/linux-block
Pull io_uring fixes and improvements from Jens Axboe:
"The first five in this series are heavily inspired by the work Al did
on the aio side to fix the races there.
The last two re-introduce a feature that was in io_uring before it got
merged, but which I pulled since we didn't have a good way to have
BVEC iters that already have a stable reference. These aren't
necessarily related to block, it's just how io_uring pins fixed
buffers"
* tag 'io_uring-20190323' of git://git.kernel.dk/linux-block:
block: add BIO_NO_PAGE_REF flag
iov_iter: add ITER_BVEC_FLAG_NO_REF flag
io_uring: mark me as the maintainer
io_uring: retry bulk slab allocs as single allocs
io_uring: fix poll races
io_uring: fix fget/fput handling
io_uring: add prepped flag
io_uring: make io_read/write return an integer
io_uring: use regular request ref counts
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAlyWVjYQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpghzEADViI9AC1ixO/4/L09T3LD6WrKa+GfAfl95
JhbBB5CzaSYtw1lW2sciXO3vD6Smgbs9Mir3SvKxj5QjbMjKUhlLGf2jwfsiokY2
PU/rZkd5ueioQ/jm6ABgos1hXOVz6EjDO2B8ERBPTJ19vLWVtpTAh9yAGYWFnY8m
etiNAFpcM3iVeeaLFlCEbWk4O2A3oPvwm6rz41h81mbKTmFc6BBNSSb/FwtsRfP4
vLK+JOBBy1jEwpjULoi9FmNVa+QCjfvAhK3kwc482+AoF5HHFeo5LnLOMjcaPTSy
W3Nefh+Jhzm0ERSX+q4biJ1ly/doFpmFfdmzXbFaWkLBQENx2MZkb1MS8SSxRV1N
hxRACtY8DYAlDLDJ6SsLGgJ0js6282hGPPR5DVxp4VP1iWvobUx/QcoR14lVvpFt
1g/jFDuU18JW5lxY/gIYT6PGjZRpdqdaqhhI6XmmMj3V0zo6Z4UrX1FxaoNn9fqP
2JQKUpSvlq0ZFUwHOn91sRbv9Zb1mKgWRsjTUPnFL8dAiHOgl7ZS8qvgBRkKsl4D
54aG2pvBSVOa+8S+UAFfYJkZPiv8eHg8WqeIy3e8J+AROtuP2ccitt36cvltl/MS
yivVzUenyv+SZSlTMuYbQfQRmm5CTit0NSDrclrB/qc1w7EbmjxIF64ohXdatuI5
xCKmvid0Wg==
=4Ktc
-----END PGP SIGNATURE-----
Merge tag 'for-linus-20190323' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"A set of fixes/changes that should go into this series. This contains:
- Kernel doc / comment updates (Bart, Shenghui)
- Un-export of core-only used function (Bart)
- Fix race on loop file access (Dongli)
- pf/pcd queue cleanup fixes (me)
- Use appropriate helper for RESTART bit set (Yufen)
- Use named identifier for classic poll (Yufen)"
* tag 'for-linus-20190323' of git://git.kernel.dk/linux-block:
sbitmap: trivial - update comment for sbitmap_deferred_clear_bit
blkcg: Fix kernel-doc warnings
blk-iolatency: #include "blk.h"
block: Unexport blk_mq_add_to_requeue_list()
block: add BLK_MQ_POLL_CLASSIC for hybrid poll and return EINVAL for unexpected value
blk-mq: remove unused 'nr_expired' from blk_mq_hw_ctx
loop: access lo_backing_file only when the loop device is Lo_bound
blk-mq: use blk_mq_sched_mark_restart_hctx to set RESTART
paride/pcd: cleanup queues when detection fails
paride/pf: cleanup queues when detection fails
for stable.
-----BEGIN PGP SIGNATURE-----
iQFHBAABCAAxFiEEydHwtzie9C7TfviiSn/eOAIR84sFAlyVCYsTHGlkcnlvbW92
QGdtYWlsLmNvbQAKCRBKf944AhHzi6oFB/9Lw2OMiNrdMAgfOzcVkB4dbXbRtryG
rBsTUc/fwFrCl2N+wRmJKUkDP1e1phM9CHaGEoBYB+vL2Hhj8kjyJN4NY1g09ilt
w8XT6Pkg2Uui4kr5bnWzDS+oO8hm1ES2PbzDnIX1UiLStHypzQFwzvlm+TQE8jxb
pGNyNnbASJ+2mk+xAz7EkLGlNcAgULUT7oC/w0xj0neM739aYS3IsZgWylnw41CS
59TDyWROUUePLQnvieXmp4l58fn7xpvwrUE0KQgYec+j2t2SsKnHUoI86qZFYF2Q
NsE/K2fmIvKxWrmNw+0uuOVj+mNr2Jr5m/q+Nu61asOjs7Hz5CwUp7nd
=DkXC
-----END PGP SIGNATURE-----
Merge tag 'ceph-for-5.1-rc2' of git://github.com/ceph/ceph-client
Pull ceph fixes from Ilya Dryomov:
"A follow up for the new alloc_size logic and a blacklisting fix,
marked for stable"
* tag 'ceph-for-5.1-rc2' of git://github.com/ceph/ceph-client:
rbd: drop wait_for_latest_osdmap()
libceph: wait for latest osdmap in ceph_monc_blacklist_add()
rbd: set io_min, io_opt and discard_granularity to alloc_size
The ext4 fstrim implementation uses the block bitmaps to find free space
that can be discarded. If we haven't replayed the journal, the bitmaps
will be stale and we absolutely *cannot* use stale metadata to zap the
underlying storage.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
During a read failover, we may end up changing the value of
the pgio_mirror_idx, so make sure that we record the layout
stats before that update.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Specifying a retrans=0 mount parameter to a NFS/TCP mount, is
inadvertently causing the NFS client to rewrite any specified
timeout parameter to the default of 60 seconds.
Fixes: a956beda19 ("NFS: Allow the mount option retrans=0")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Currently, we are releasing the indirect buffer where we are done with
it in ext4_ind_remove_space(), so we can see the brelse() and
BUFFER_TRACE() everywhere. It seems fragile and hard to read, and we
may probably forget to release the buffer some day. This patch cleans
up the code by putting of the code which releases the buffers to the
end of the function.
Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
If the transport is still connected, then we do want to allow
RPC_SOFTCONN tasks to retry. They should time out if and only if
the connection is broken.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
All indirect buffers get by ext4_find_shared() should be released no
mater the branch should be freed or not. But now, we forget to release
the lower depth indirect buffers when removing space from the same
higher depth indirect block. It will lead to buffer leak and futher
more, it may lead to quota information corruption when using old quota,
consider the following case.
- Create and mount an empty ext4 filesystem without extent and quota
features,
- quotacheck and enable the user & group quota,
- Create some files and write some data to them, and then punch hole
to some files of them, it may trigger the buffer leak problem
mentioned above.
- Disable quota and run quotacheck again, it will create two new
aquota files and write the checked quota information to them, which
probably may reuse the freed indirect block(the buffer and page
cache was not freed) as data block.
- Enable quota again, it will invoke
vfs_load_quota_inode()->invalidate_bdev() to try to clean unused
buffers and pagecache. Unfortunately, because of the buffer of quota
data block is still referenced, quota code cannot read the up to date
quota info from the device and lead to quota information corruption.
This problem can be reproduced by xfstests generic/231 on ext3 file
system or ext4 file system without extent and quota features.
This patch fix this problem by releasing the missing indirect buffers,
in ext4_ind_remove_space().
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: stable@kernel.org
In preparation to enabling -Wimplicit-fallthrough, mark switch
cases where we are expecting to fall through.
With -Wimplicit-fallthrough added to CFLAGS:
kernel/irq/manage.c: In function ‘irq_do_set_affinity’:
kernel/irq/manage.c:198:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
cpumask_copy(desc->irq_common_data.affinity, mask);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
kernel/irq/manage.c:199:2: note: here
case IRQ_SET_MASK_OK_NOCOPY:
^~~~
Annotate it.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/20190228213714.GA9246@embeddedor
For all riscv architectures (RV32, RV64 and RV128), the clocksource
is a 64 bit incrementing counter.
Fix the clock source mask accordingly.
Tested on both 64bit and 32 bit virt machine in QEMU.
Fixes: 62b0194368 ("clocksource: new RISC-V SBI timer driver")
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Anup Patel <anup@brainfault.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: linux-riscv@lists.infradead.org
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Anup Patel <Anup.Patel@wdc.com>
Cc: Damien Le Moal <Damien.LeMoal@wdc.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20190322215411.19362-1-atish.patra@wdc.com
On machines where the GART aperture is mapped over physical RAM,
/proc/kcore contains the GART aperture range. Accessing the GART range via
/proc/kcore results in a kernel crash.
vmcore used to have the same issue, until it was fixed with commit
2a3e83c6f9 ("x86/gart: Exclude GART aperture from vmcore")', leveraging
existing hook infrastructure in vmcore to let /proc/vmcore return zeroes
when attempting to read the aperture region, and so it won't read from the
actual memory.
Apply the same workaround for kcore. First implement the same hook
infrastructure for kcore, then reuse the hook functions introduced in the
previous vmcore fix. Just with some minor adjustment, rename some functions
for more general usage, and simplify the hook infrastructure a bit as there
is no module usage yet.
Suggested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Kairui Song <kasong@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jiri Bohac <jbohac@suse.cz>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Omar Sandoval <osandov@fb.com>
Cc: Dave Young <dyoung@redhat.com>
Link: https://lkml.kernel.org/r/20190308030508.13548-1-kasong@redhat.com
Workaround problem with Samba responses to SMB3.1.1
null user (guest) mounts. The server doesn't set the
expected flag in the session setup response so we have
to do a similar check to what is done in smb3_validate_negotiate
where we also check if the user is a null user (but not sec=krb5
since username might not be passed in on mount for Kerberos case).
Note that the commit below tightened the conditions and forced signing
for the SMB2-TreeConnect commands as per MS-SMB2.
However, this should only apply to normal user sessions and not for
cases where there is no user (even if server forgets to set the flag
in the response) since we don't have anything useful to sign with.
This is especially important now that the more secure SMB3.1.1 protocol
is in the default dialect list.
An earlier patch ("cifs: allow guest mounts to work for smb3.11") fixed
the guest mounts to Windows.
Fixes: 6188f28bf6 ("Tree connect for SMB3.1.1 must be signed for non-encrypted shares")
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Paulo Alcantara <palcantara@suse.de>
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
This patch fixes the following KASAN report:
[ 779.044746] BUG: KASAN: slab-out-of-bounds in string+0xab/0x180
[ 779.044750] Read of size 1 at addr ffff88814f327968 by task trace-cmd/2812
[ 779.044756] CPU: 1 PID: 2812 Comm: trace-cmd Not tainted 5.1.0-rc1+ #62
[ 779.044760] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-0-ga698c89-prebuilt.qemu.org 04/01/2014
[ 779.044761] Call Trace:
[ 779.044769] dump_stack+0x5b/0x90
[ 779.044775] ? string+0xab/0x180
[ 779.044781] print_address_description+0x6c/0x23c
[ 779.044787] ? string+0xab/0x180
[ 779.044792] ? string+0xab/0x180
[ 779.044797] kasan_report.cold.3+0x1a/0x32
[ 779.044803] ? string+0xab/0x180
[ 779.044809] string+0xab/0x180
[ 779.044816] ? widen_string+0x160/0x160
[ 779.044822] ? vsnprintf+0x5bf/0x7f0
[ 779.044829] vsnprintf+0x4e7/0x7f0
[ 779.044836] ? pointer+0x4a0/0x4a0
[ 779.044841] ? seq_buf_vprintf+0x79/0xc0
[ 779.044848] seq_buf_vprintf+0x62/0xc0
[ 779.044855] trace_seq_printf+0x113/0x210
[ 779.044861] ? trace_seq_puts+0x110/0x110
[ 779.044867] ? trace_raw_output_prep+0xd8/0x110
[ 779.044876] trace_raw_output_smb3_tcon_class+0x9f/0xc0
[ 779.044882] print_trace_line+0x377/0x890
[ 779.044888] ? tracing_buffers_read+0x300/0x300
[ 779.044893] ? ring_buffer_read+0x58/0x70
[ 779.044899] s_show+0x6e/0x140
[ 779.044906] seq_read+0x505/0x6a0
[ 779.044913] vfs_read+0xaf/0x1b0
[ 779.044919] ksys_read+0xa1/0x130
[ 779.044925] ? kernel_write+0xa0/0xa0
[ 779.044931] ? __do_page_fault+0x3d5/0x620
[ 779.044938] do_syscall_64+0x63/0x150
[ 779.044944] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 779.044949] RIP: 0033:0x7f62c2c2db31
[ 779.044955] Code: fe ff ff 48 8d 3d 17 9e 09 00 48 83 ec 08 e8 96 02
02 00 66 0f 1f 44 00 00 8b 05 fa fc 2c 00 48 63 ff 85 c0 75 13 31 c0
0f 05 <48> 3d 00 f0 ff ff 77 57 f3 c3 0f 1f 44 00 00 55 53 48 89 d5 48
89
[ 779.044958] RSP: 002b:00007ffd6e116678 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[ 779.044964] RAX: ffffffffffffffda RBX: 0000560a38be9260 RCX: 00007f62c2c2db31
[ 779.044966] RDX: 0000000000002000 RSI: 00007ffd6e116710 RDI: 0000000000000003
[ 779.044966] RDX: 0000000000002000 RSI: 00007ffd6e116710 RDI: 0000000000000003
[ 779.044969] RBP: 00007f62c2ef5420 R08: 0000000000000000 R09: 0000000000000003
[ 779.044972] R10: ffffffffffffffa8 R11: 0000000000000246 R12: 00007ffd6e116710
[ 779.044975] R13: 0000000000002000 R14: 0000000000000d68 R15: 0000000000002000
[ 779.044981] Allocated by task 1257:
[ 779.044987] __kasan_kmalloc.constprop.5+0xc1/0xd0
[ 779.044992] kmem_cache_alloc+0xad/0x1a0
[ 779.044997] getname_flags+0x6c/0x2a0
[ 779.045003] user_path_at_empty+0x1d/0x40
[ 779.045008] do_faccessat+0x12a/0x330
[ 779.045012] do_syscall_64+0x63/0x150
[ 779.045017] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 779.045019] Freed by task 1257:
[ 779.045023] __kasan_slab_free+0x12e/0x180
[ 779.045029] kmem_cache_free+0x85/0x1b0
[ 779.045034] filename_lookup.part.70+0x176/0x250
[ 779.045039] do_faccessat+0x12a/0x330
[ 779.045043] do_syscall_64+0x63/0x150
[ 779.045048] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 779.045052] The buggy address belongs to the object at ffff88814f326600
which belongs to the cache names_cache of size 4096
[ 779.045057] The buggy address is located 872 bytes to the right of
4096-byte region [ffff88814f326600, ffff88814f327600)
[ 779.045058] The buggy address belongs to the page:
[ 779.045062] page:ffffea00053cc800 count:1 mapcount:0 mapping:ffff88815b191b40 index:0x0 compound_mapcount: 0
[ 779.045067] flags: 0x200000000010200(slab|head)
[ 779.045075] raw: 0200000000010200 dead000000000100 dead000000000200 ffff88815b191b40
[ 779.045081] raw: 0000000000000000 0000000000070007 00000001ffffffff 0000000000000000
[ 779.045083] page dumped because: kasan: bad access detected
[ 779.045085] Memory state around the buggy address:
[ 779.045089] ffff88814f327800: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 779.045093] ffff88814f327880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 779.045097] >ffff88814f327900: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 779.045099] ^
[ 779.045103] ffff88814f327980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 779.045107] ffff88814f327a00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 779.045109] ==================================================================
[ 779.045110] Disabling lock debugging due to kernel taint
Correctly assign tree name str for smb3_tcon event.
Signed-off-by: Paulo Alcantara (SUSE) <paulo@paulo.ac>
Signed-off-by: Steve French <stfrench@microsoft.com>
Fix Guest/Anonymous sessions so that they work with SMB 3.11.
The commit noted below tightened the conditions and forced signing for
the SMB2-TreeConnect commands as per MS-SMB2.
However, this should only apply to normal user sessions and not for
Guest/Anonumous sessions.
Fixes: 6188f28bf6 ("Tree connect for SMB3.1.1 must be signed for non-encrypted shares")
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
It was mapped to EIO which can be confusing when user space
queries for an object GUID for an object for which the server
file system doesn't support (or hasn't saved one).
As Amir Goldstein suggested this is similar to ENOATTR
(equivalently ENODATA in Linux errno definitions) so
changing NT STATUS code mapping for OBJECTID_NOT_FOUND
to ENODATA.
Signed-off-by: Steve French <stfrench@microsoft.com>
CC: Amir Goldstein <amir73il@gmail.com>