The schedule method now needs _rcu list-traversal
primitive for svc->destinations.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
The schedule method now needs _rcu list-traversal
primitive for svc->destinations.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
The schedule method now needs _rcu list-traversal
primitive for svc->destinations. As the previous entry
could be unlinked, limit the list traversals to 2 when
lookup started from previous entry.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
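The bounded traversal described in the message above can be sketched in userspace C. This is only an illustration under stated assumptions: struct dest and pick_next() are hypothetical stand-ins, not the kernel's struct ip_vs_dest or its scheduler code, and plain pointers are used where the kernel uses _rcu list primitives.

```c
#include <stddef.h>

/* Hypothetical stand-in for a destination entry (not struct ip_vs_dest). */
struct dest {
	struct dest *next;	/* circular list that runs through a head sentinel */
	int weight;		/* 0 means "do not schedule to this entry" */
};

/*
 * Return the next usable entry after 'start', wrapping around once.
 * 'head' is the list sentinel and is never returned.
 * If 'start' has been unlinked, the walk may never meet 'start' again,
 * so the number of passes over the list is limited to 2.
 */
struct dest *pick_next(struct dest *head, struct dest *start)
{
	struct dest *d = start;
	int pass = 0;

	do {
		for (d = d->next; d != head; d = d->next) {
			if (d->weight > 0)
				return d;	/* usable entry found */
			if (d == start)
				return NULL;	/* full circle, nothing usable */
		}
		pass++;
		d = head;	/* second pass continues from the list head */
	} while (pass < 2 && start != head);

	return NULL;
}
```

When the lookup starts at the head itself, one pass already covers every entry, so the second pass is skipped.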
The schedule method now needs _rcu list-traversal
primitive for svc->destinations.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
The schedule method now needs _rcu list-traversal
primitive for svc->destinations.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
The schedule method now needs _rcu list-traversal
primitive for svc->destinations. The read_lock for sched_lock is
removed. The set.lock is removed because now it is used in
rare cases, mostly under sched_lock.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
The schedule method now needs _rcu list-traversal
primitive for svc->destinations. The read_lock for sched_lock is
removed. Use a dead flag to prevent new entries from being created
while the scheduler is reclaimed. Use hlist for the hash table.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
Use the new add_dest and del_dest methods
to reassign dests.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
ip_vs_dest_hold will be used under RCU lock
while ip_vs_dest_put can be called even after dest
is removed from service, as it happens for conns and
some schedulers.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
Allow schedulers to use rcu_dereference when
returning destination on lookup. The RCU read-side critical
section will allow ip_vs_bind_dest to get dest refcnt as
preparation for the step where destinations will be
deleted without an IP_VS_WAIT_WHILE guard that holds the
packet processing during update.
Add new optional scheduler methods add_dest,
del_dest and upd_dest. For now the methods are called
together with update_service but update_service will be
removed in a following change.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
The global list with schedulers ip_vs_schedulers
is accessed only from user context - configuration and
scheduler module [un]registration. Use ip_vs_sched_mutex
instead.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
We have many fields to set and few to reset, so
use kmem_cache_alloc instead of a zeroing allocation to save some cycles.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
__ip_vs_conn_in_get and ip_vs_conn_out_get are
hot places. Optimize them, so that ports are matched first.
By moving net and fwmark below, on 32-bit arch we can fit
caddr in 32-byte cache line and all addresses in 64-byte
cache line.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Convert __ip_vs_conntbl_lock_array as follows:
- readers that do not modify conn lists will use RCU lock
- updaters that modify lists will use spinlock_t
Now for conn lookups we will use RCU read-side
critical section. Without using __ip_vs_conn_get such
places have access to connection fields and can
dereference some pointers like pe and pe_data plus
the ability to update timer expiration. If full access
is required we contend for reference.
We add a barrier in __ip_vs_conn_put, so that
other CPUs see the refcnt operation after other writes.
With the introduction of ip_vs_conn_unlink()
we try to reorganize ip_vs_conn_expire(), so that
unhashing of connections that should stay longer is
avoided, even if only for a very short time.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
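The idea of "contending for a reference" from inside a read-side section can be sketched with C11 atomics. This is a userspace illustration only: struct conn, conn_try_get() and conn_put() are hypothetical names, and the release ordering stands in for the barrier mentioned in the message; the kernel code uses its own atomic and barrier primitives.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a connection entry. */
struct conn {
	atomic_int refcnt;	/* 0 means the entry is being reclaimed */
};

/*
 * Take a reference only if the entry is still live.
 * A reader that merely found the entry under RCU must not
 * resurrect an entry whose count has already dropped to 0.
 */
bool conn_try_get(struct conn *cp)
{
	int old = atomic_load(&cp->refcnt);

	while (old != 0) {
		/* On failure 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&cp->refcnt, &old, old + 1))
			return true;
	}
	return false;
}

/*
 * Drop a reference. Release ordering makes earlier writes to the
 * entry visible to other CPUs before they observe the lower count.
 */
void conn_put(struct conn *cp)
{
	atomic_fetch_sub_explicit(&cp->refcnt, 1, memory_order_release);
}
```

A lookup that only reads fields can skip conn_try_get() entirely, which is the cheaper path the message describes.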
Allow the readers to use the RCU lock, and use a global mutex
instead of a spinlock for PE module registrations. All PE modules
need to call synchronize_rcu in their module exit handler.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
rs_lock was used to protect rs_table (hash table)
from updaters (under global mutex) and readers (packet handlers).
We can remove rs_lock by using RCU lock for readers. Reclaiming
dest only with kfree_rcu is enough because the readers access
only fields from the ip_vs_dest structure.
Use hlist for rs_table.
As we are now using hlist_del_rcu, introduce in_rs_table
flag as replacement for the list_empty checks which do not
work with RCU. It is needed because only NAT dests are in
the rs_table.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
We use locks like tcp_app_lock, udp_app_lock,
sctp_app_lock to protect access to the protocol hash tables
from readers in packet context while the application
instances (inc) are [un]registered under global mutex.
As the hash tables are mostly read when conns are
created and bound to app, use RCU for readers and reclaim
app instance after grace period.
Simplify ip_vs_app_inc_get because we use usecnt
only for statistics and rely on module refcounting.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Currently when forwarding requests to real servers
we use dst_lock and atomic operations when cloning the
dst_cache value. As the dst_cache value does not change
most of the time it is better to use RCU and to lock
dst_lock only when we need to replace the obsoleted dst.
For this to work we keep dst_cache in new structure protected
by RCU. For packets to remote real servers we will use noref
version of dst_cache, it will be valid while we are in RCU
read-side critical section because now dst_release for replaced
dsts will be invoked after the grace period. Packets to
local real servers that are passed to local stack with
NF_ACCEPT need a dst clone.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Consolidate the PMTU checks, ICMP sending and
skb_dst modification in __ip_vs_get_out_rt and
__ip_vs_get_out_rt_v6. Now skb_dst is changed early
to simplify the transmitters.
Make sure update_pmtu is called only for local clients.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
We run in contexts like ip_rcv, ipv6_rcv and br_handle_frame,
so we do not expect shared skbs.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
After commit 70e7341673 (ipv4: Show that ip_send_reply()
is purely unicast routine.) we do not need to reroute DNAT-ed
traffic over loopback because reply uses iph daddr and not
rt_spec_dst.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Move and give better names to two functions:
- ip_vs_dst_reset to __ip_vs_dst_cache_reset
- __ip_vs_dev_reset to ip_vs_forget_dev
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
It was a bad idea to hide return statements in macros.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
The real server becomes unreachable on a down event, so there is
no need to wait for device unregistration. This should help in
releasing dsts early before dst->dev is replaced with lo.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Avoid replacing the cached route for real server
on every packet with different TOS. I doubt that routing
by TOS for real server is used at all, so we should be
better with such optimization.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Rename skb_dst_set_noref to __skb_dst_set_noref and
add force flag as suggested by David Miller. The new wrapper
skb_dst_set_noref_force will force dst entries that are not
cached to be attached as skb dst without taking reference
as long as provided dst is reclaimed after RCU grace period.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Simon Horman <horms@verge.net.au>
Add a new constant ETH_P_802_3_MIN, the smallest value of the ethernet
type field that is interpreted as an EtherType (Ethernet II frame). Frames
with a lower value in that field are treated as 802.3 frames, where the
field holds a length.
Also update all the users of this value that David Miller and
I could find to use the new constant.
Also correct a bug in util.c. The comparison with ETH_P_802_3_MIN
should be >= not >.
As suggested by Jesse Gross.
Compile tested only.
Cc: David Miller <davem@davemloft.net>
Cc: Jesse Gross <jesse@nicira.com>
Cc: Karsten Keil <isdn@linux-pingi.de>
Cc: John W. Linville <linville@tuxdriver.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Bart De Schuymer <bart.de.schuymer@pandora.be>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Marcel Holtmann <marcel@holtmann.org>
Cc: Gustavo Padovan <gustavo@padovan.org>
Cc: Johan Hedberg <johan.hedberg@gmail.com>
Cc: linux-bluetooth@vger.kernel.org
Cc: netfilter-devel@vger.kernel.org
Cc: bridge@lists.linux-foundation.org
Cc: linux-wireless@vger.kernel.org
Cc: linux1394-devel@lists.sourceforge.net
Cc: linux-media@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: dev@openvswitch.org
Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Acked-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
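The comparison fixed in util.c can be illustrated with a short userspace check. The constant value 0x0600 is the one the kernel defines for ETH_P_802_3_MIN; the function name is_ethertype() is a hypothetical helper for this sketch, not a kernel function.

```c
#include <stdbool.h>
#include <stdint.h>

#define ETH_P_802_3_MIN 0x0600	/* same value as in the kernel's if_ether.h */

/*
 * 'type_or_len' is the 16-bit field that follows the source MAC
 * address, already converted to host byte order.
 * Values of 0x0600 and above are EtherTypes; lower values are
 * treated as an 802.3 length.
 */
bool is_ethertype(uint16_t type_or_len)
{
	/* ">=" and not ">": 0x0600 itself already counts as an EtherType. */
	return type_or_len >= ETH_P_802_3_MIN;
}
```

With ">" the boundary value 0x0600 would be misclassified, which is the off-by-one the message mentions.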
Conflicts:
include/net/ipip.h
The changes made to ipip.h in 'net' were already included
in 'net-next' before that header was moved to another location.
Signed-off-by: David S. Miller <davem@davemloft.net>
Move the protection of netns_frags.nqueues updates under the LRU_lock
instead of the write lock, as they are located on the same cacheline.
This is also needed when transitioning to per hash bucket locking.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The LRU list has been protected by its own lock since commit 3ef0eb0db4
(net: frag, move LRU list maintenance outside of rwlock), and
no longer by a read_lock.
This makes it possible to remove the inet_frag_queue that is about
to be "evicted" from the LRU list head. This avoids the problem of
several CPUs grabbing the same frag queue.
Note that we cannot remove the inet_frag_lru_del() call in fq_unlink(),
called by inet_frag_kill(), because inet_frag_kill() is also used in
other situations. Thus, we use list_del_init() to allow this
double list_del to work.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
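Why list_del_init() tolerates a second deletion can be shown with a minimal userspace re-implementation of the list helpers. The names mirror the kernel's list.h, but the code below is a simplified sketch written for this illustration, not the kernel source.

```c
#include <stdbool.h>

struct list_head {
	struct list_head *next, *prev;
};

/* An initialized node points at itself, i.e. it is an empty list. */
void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

bool list_empty(const struct list_head *h)
{
	return h->next == h;
}

void list_add_tail(struct list_head *n, struct list_head *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/*
 * Unlink 'e' and re-initialize it. Because 'e' then points at
 * itself, a second call only rewrites e's own pointers and leaves
 * the list it used to be on untouched.
 */
void list_del_init(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	init_list_head(e);
}
```

Calling list_del_init() twice on the same entry therefore leaves the remaining list intact, which is the property the eviction path relies on when inet_frag_kill() later unlinks the queue again.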
The kernel has a fast and pretty implementation of the isxdigit() function.
Let's use it.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Switch to the new helper skb_probe_transport_header() to do the l4 header
probing for untrusted sources. For packets with partial csum, the header should
already have been set by skb_partial_csum_set().
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For untrusted packets with partial checksum, we need to set the transport header
for precise packet length estimation. We can just let skb_partial_csum_set()
do this to avoid an extra call to skb_flow_dissect() and simplify the caller.
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This formatting issue was introduced with commit d4ac32365d.
Signed-off-by: Tony Cheneau <tony.cheneau@amnesiak.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch lets dgram_recvmsg fill in the sockaddr struct in
msg->msg_name with the source address of the packet.
This is used by the userland functions recvmsg and recvfrom to get the
sender's address.
[Stefan: Changed from old zigbee legacy tree to mainline]
Signed-off-by: Stephen Röttger <stephen.roettger@zero-entropy.de>
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
1) Always increment IPV4 ID field in encapsulated GSO packets, even
when DF is set. Regression fix from Pravin B Shelar.
2) Fix per-net subsystem initialization in netfilter conntrack,
otherwise we may access dynamically allocated memory before it is
actually allocated. From Gao Feng.
3) Fix DMA buffer lengths in iwl3945 driver, from Stanislaw Gruszka.
4) Fix race between submission of sync vs async commands in mwifiex
driver, from Amitkumar Karwar.
5) Add missing cancel of command timer in mwifiex driver, from Bing
Zhao.
6) Missing SKB free in rtlwifi USB driver, from Jussi Kivilinna.
7) Thermal layer tries to use a genetlink multicast string that is
longer than the 16 character limit. Fix it and add a BUG check to
prevent this kind of thing from happening in the future.
From Masatake YAMATO.
8) Fix many bugs in the handling of the teardown of L2TP connections,
UDP encapsulation instances, and sockets. From Tom Parkin.
9) Missing socket release in IRDA, from Kees Cook.
10) Fix fec driver modular build, from Fabio Estevam.
11) Erroneous use of kfree() instead of free_netdev() in lantiq_etop,
from Wei Yongjun.
12) Fix bugs in handling of queue numbers and steering rules in mlx4
driver, from Moshe Lazer, Hadar Hen Zion, and Or Gerlitz.
13) Some FOO_DIAG_MAX constants were defined off by one, fix from Andrey
Vagin.
14) TCP segmentation deferral is unintentionally done too strongly,
breaking ACK clocking. Fix from Eric Dumazet.
15) net_enable_timestamp() can legitimately be invoked from software
interrupts, and in a way that is safe, so remove the WARN_ON().
Also from Eric Dumazet.
16) Fix use after free in VLANs, from Cong Wang.
17) Fix TCP slow start retransmit storms after SACK reneging, from
Yuchung Cheng.
18) Unix socket release should mark a socket dead before NULL'ing out
sock->sk, otherwise we can race. Fix from Paul Moore.
19) IPV6 addrconf code can try to free static memory, from Hong Zhiguo.
20) Fix register mis-programming, NULL pointer derefs, and wrong PHC
clock frequency in IGB driver. From Lior Levy, Alex Williamson, Jiri
Benc, and Jeff Kirsher.
21) skb->ip_summed logic in pch_gbe driver is reversed, breaking packet
forwarding. Fix from Veaceslav Falico.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (65 commits)
ipv4: Fix ip-header identification for gso packets.
bonding: remove already created master sysfs link on failure
af_unix: dont send SCM_CREDENTIAL when dest socket is NULL
pch_gbe: fix ip_summed checksum reporting on rx
igb: fix PHC stopping on max freq
igb: make sensor info static
igb: SR-IOV init reordering
igb: Fix null pointer dereference
igb: fix i350 anti spoofing config
ixgbevf: don't release the soft entries
ipv6: fix bad free of addrconf_init_net
unix: fix a race condition in unix_release()
tcp: undo spurious timeout after SACK reneging
bnx2x: fix assignment of signed expression to unsigned variable
bridge: fix crash when set mac address of br interface
8021q: fix a potential use-after-free
net: remove a WARN_ON() in net_enable_timestamp()
tcp: preserve ACK clocking in TSO
net: fix *_DIAG_MAX constants
net/mlx4_core: Disallow releasing VF QPs which have steering rules
...
Merge tag 'nfs-for-3.9-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfixes from Trond Myklebust:
- Fix an NFSv4 idmapper regression
- Fix an Oops in the pNFS blocks client
- Fix up various issues with pNFS layoutcommit
- Ensure correct read ordering of variables in
rpc_wake_up_task_queue_locked
* tag 'nfs-for-3.9-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
SUNRPC: Add barriers to ensure read ordering in rpc_wake_up_task_queue_locked
NFSv4.1: Add a helper pnfs_commit_and_return_layout
NFSv4.1: Always clear the NFS_INO_LAYOUTCOMMIT in layoutreturn
NFSv4.1: Fix a race in pNFS layoutcommit
pnfs-block: removing DM device maybe cause oops when call dev_remove
NFSv4: Fix the string length returned by the idmapper
The IP header ID needs to be incremented even if the IP_DF flag is set.
This behaviour was changed in commit 490ab08127
(IP_GRE: Fix IP-Identification).
The following patch fixes it so that the identification is always
incremented.
Reported-by: Cong Wang <amwang@redhat.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
gso_segs was reset to zero when the kernel received packets from an untrusted
source, but this zero value was then used to estimate the precise packet length,
which is wrong. This patch estimates the correct gso_segs value before using
it in qdisc_pkt_len_init().
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
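The estimate described above amounts to a rounded-up division of the payload by the segment size. The sketch below is a userspace illustration: estimate_gso_segs() and wire_len() are hypothetical names, the guard for degenerate inputs is an assumption added for this example, and the parameters stand for the skb length, the combined header length and gso_size.

```c
#include <stdint.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Estimate how many segments a GSO packet will be split into.
 * 'len' is the total packet length, 'hdr_len' the length of the
 * headers repeated in every segment, 'gso_size' the payload bytes
 * per segment. Degenerate inputs count as a single segment
 * (an assumption made for this sketch).
 */
uint32_t estimate_gso_segs(uint32_t len, uint32_t hdr_len, uint32_t gso_size)
{
	if (gso_size == 0 || len <= hdr_len)
		return 1;
	return DIV_ROUND_UP(len - hdr_len, gso_size);
}

/*
 * Length on the wire: every segment after the first carries
 * its own copy of the headers.
 */
uint32_t wire_len(uint32_t len, uint32_t hdr_len, uint32_t gso_size)
{
	uint32_t segs = estimate_gso_segs(len, hdr_len, gso_size);

	return len + (segs - 1) * hdr_len;
}
```

With a segment count of zero, the second function would subtract one header length instead of adding any, which is why the stale zero value gave a wrong estimate.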
Set the transport header because 1) some drivers need it (e.g. ixgbe needs the
l4 header to do atr) and 2) precise packet length estimation (introduced in
1def9238) needs the l4 header to compute the header length.
This patch first tries to get the l4 header for packet sockets through
skb_flow_dissect(), and pretends there is no l4 header if skb_flow_dissect() fails.
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The previous code would just compress the UDP header and send the compressed
UDP header along with the uncompressed one.
Signed-off-by: Tony Cheneau <tony.cheneau@amnesiak.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sets the sequence number in the frame format. Without this fix, the sequence
number is always set to 0. This makes traffic analysis very hard.
Signed-off-by: Tony Cheneau <tony.cheneau@amnesiak.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bring over the mac802154_dev_get_dsn() function that was present in the
Linux ZigBee kernel. This function is called by the 6LoWPAN code in
order to properly set the DSN (Data Sequence Number) value in the IEEE
802.15.4 frame.
Signed-off-by: Tony Cheneau <tony.cheneau@amnesiak.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add pr_debug() call in order to debug 6LoWPAN fragmentation and
reassembly.
Signed-off-by: Tony Cheneau <tony.cheneau@amnesiak.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The first fragment, FRAG1, must contain some payload according to the
specs. However, as it is currently written, the first fragment will
remain empty and only contain the 6lowpan headers.
This patch also extracts the transport layer information from the first
fragment. This information is used later on when uncompressing UDP
header.
Thanks to Wolf-Bastian Pöttner for noticing that the offset value was
not properly initialized.
Signed-off-by: Tony Cheneau <tony.cheneau@amnesiak.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The IEEE 802.15.4 standard uses the 0xFFFF short address (2 bytes) for message
broadcasting.
Signed-off-by: Tony Cheneau <tony.cheneau@amnesiak.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
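A check for the broadcast short address mentioned above could look like the following userspace sketch. SHORT_ADDR_BROADCAST and is_broadcast_short_addr() are hypothetical names chosen for this example; only the value 0xFFFF comes from the message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name; 0xFFFF is the 2-byte broadcast short address. */
#define SHORT_ADDR_BROADCAST 0xFFFF

/* 'short_addr' is the 16-bit short address in host byte order. */
bool is_broadcast_short_addr(uint16_t short_addr)
{
	return short_addr == SHORT_ADDR_BROADCAST;
}
```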