IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Pull workqueue fix from Tejun Heo:
"Another fix for a really old bug.
It only affects drain_workqueue() which isn't used often and even then
triggers only during a pretty small race window, so it isn't too
surprising that it stayed hidden for so long.
The fix is straight-forward and low-risk. Kudos to Li Bin for
reporting and fixing the bug"
* 'for-4.14-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: Fix NULL pointer dereference
The active development of cgroups v2 sometimes leads to a creation
of interfaces, which are not turned on by default (to provide
backward compatibility). It's handy to know from userspace, which
cgroup v2 features are supported without calculating it based
on the kernel version. So, let's export the list of such features
using /sys/kernel/cgroup/features pseudo-file.
The list is hardcoded and has to be extended when new functionality
is added. Each feature is printed on a new line.
Example:
$ cat /sys/kernel/cgroup/features
nsdelegate
Signed-off-by: Roman Gushchin <guro@fb.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: kernel-team@fb.com
Signed-off-by: Tejun Heo <tj@kernel.org>
Delegatable cgroup v2 control files may require special handling
(e.g. chowning), and the exact list of such files varies between
kernel versions (and likely to be extended in the future).
To guarantee correctness of this list and simplify the life
of userspace (systemd, first of all), let's export the list
via /sys/kernel/cgroup/delegate pseudo-file.
Format is siple: each control file name is printed on a new line.
Example:
$ cat /sys/kernel/cgroup/delegate
cgroup.procs
cgroup.subtree_control
Signed-off-by: Roman Gushchin <guro@fb.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: kernel-team@fb.com
Signed-off-by: Tejun Heo <tj@kernel.org>
Pull perf fixes from Ingo Molnar:
"Various fixes:
- synchronize kernel and tooling headers
- cgroup support fix
- two tooling fixes"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tools/headers: Synchronize kernel ABI headers
perf/cgroup: Fix perf cgroup hierarchy support
perf tools: Unwind properly location after REJECT
perf symbols: Fix memory corruption because of zero length symbols
Cgroup v2 lacks the device controller, provided by cgroup v1.
This patch adds a new eBPF program type, which in combination
of previously added ability to attach multiple eBPF programs
to a cgroup, will provide a similar functionality, but with some
additional flexibility.
This patch introduces a BPF_PROG_TYPE_CGROUP_DEVICE program type.
A program takes major and minor device numbers, device type
(block/character) and access type (mknod/read/write) as parameters
and returns an integer which defines if the operation should be
allowed or terminated with -EPERM.
Signed-off-by: Roman Gushchin <guro@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thanks to the ability to load a program for a specific device,
running verifier twice is no longer needed.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If TC program is loaded with skip_sw flag, we should allow
the device-specific programs to be accepted.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pass the netdev pointer to bpf_prog_get_type(). This way
BPF code can decide whether the device matches what the
code was loaded/translated for.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Extend struct bpf_prog_info to contain information about program
being bound to a device. Since the netdev may get destroyed while
program still exists we need a flag to indicate the program is
loaded for a device, even if the device is gone.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The fact that we don't know which device the program is going
to be used on is quite limiting in current eBPF infrastructure.
We have to reverse or limit the changes which kernel makes to
the loaded bytecode if we want it to be offloaded to a networking
device. We also have to invent new APIs for debugging and
troubleshooting support.
Make it possible to load programs for a specific netdev. This
helps us to bring the debug information closer to the core
eBPF infrastructure (e.g. we will be able to reuse the verifer
log in device JIT). It allows device JITs to perform translation
on the original bytecode.
__bpf_prog_get() when called to get a reference for an attachment
point will now refuse to give it if program has a device assigned.
Following patches will add a version of that function which passes
the expected netdev in. @type argument in __bpf_prog_get() is
renamed to attach_type to make it clearer that it's only set on
attachment.
All calls to ndo_bpf are protected by rtnl, only verifier callbacks
are not. We need a wait queue to make sure netdev doesn't get
destroyed while verifier is still running and calling its driver.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After commit 674e75411fc2 (sched: cpufreq: Allow remote cpufreq
callbacks) we stopped to always read the utilization for the CPU we
are running the governor on, and instead we read it for the CPU
which we've been told has updated utilization. This is stored in
sugov_cpu->cpu.
The value is set in sugov_register() but we clear it in sugov_start()
which leads to always looking at the utilization of CPU0 instead of
the correct one.
Fix this by consolidating the initialization code into sugov_start().
Fixes: 674e75411fc2 (sched: cpufreq: Allow remote cpufreq callbacks)
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Reviewed-by: Patrick Bellasi <patrick.bellasi@arm.com>
Reviewed-by: Brendan Jackman <brendan.jackman@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Files removed in 'net-next' had their license header updated
in 'net'. We take the remove from 'net-next'.
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds two arm64-specific prctls, to permit userspace to
control its vector length:
* PR_SVE_SET_VL: set the thread's SVE vector length and vector
length inheritance mode.
* PR_SVE_GET_VL: get the same information.
Although these prctls resemble instruction set features in the SVE
architecture, they provide additional control: the vector length
inheritance mode is Linux-specific and nothing to do with the
architecture, and the architecture does not permit EL0 to set its
own vector length directly. Both can be used in portable tools
without requiring the use of SVE instructions.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
[will: Fixed up prctl constants to avoid clash with PDEATHSIG]
Signed-off-by: Will Deacon <will.deacon@arm.com>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJZ0WQ6AAoJEHm+PkMAQRiGuloH/3sF4qfBhPuJo8OTf0uCtQ18
4Ux9zZbm81df/Jjz0exAp1Jqk+TvdIS3OXPWcKilvbUBP16hQcsxFTnI/5QF+YcN
87aNr+OCMJzOBK4suN1yhzO46NYHeIizdB0PTZVL1Zsto69Tt31D8VJmgH6oBxAw
Isb/nAkOr31dZ9PI5UEExTIanUt6EywVb0UswA+2rNl3h1UkeasQCpMpK2n6HBhU
kVD7sxEd/CN0MmfhB0HrySSam/BeSpOtzoU9bemOwrU2uu9+5+2rqMe7Gsdj4nX6
3Kk+7FQNktlrhxCZIFN/+CdusOUuDd8r/75d7DnsRK5YvSb0sZzJkfD3Nba68Ms=
=7J2+
-----END PGP SIGNATURE-----
Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux
Merge 4.14-rc3 in order to pick up the new timer_setup function.
This reverts commit b5149873a0c299195b5346fe4dc2c5b04ae2f995.
It conflicts with the following isolcpus change from the sched branch.
edb9382175c3 ("sched/isolation: Move isolcpus= handling to the housekeeping code")
Let's revert for now.
Signed-off-by: Tejun Heo <tj@kernel.org>
The bpf_verifer_ops array is generated dynamically and may be
empty depending on configuration, which then causes an out
of bounds access:
kernel/bpf/verifier.c: In function 'bpf_check':
kernel/bpf/verifier.c:4320:29: error: array subscript is above array bounds [-Werror=array-bounds]
This adds a check to the start of the function as a workaround.
I would assume that the function is never called in that configuration,
so the warning is probably harmless.
Fixes: 00176a34d9e2 ("bpf: remove the verifier ops from program structure")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
I ran into this link error with the latest net-next plus linux-next
trees when networking is disabled:
kernel/bpf/verifier.o:(.rodata+0x2958): undefined reference to `tc_cls_act_analyzer_ops'
kernel/bpf/verifier.o:(.rodata+0x2970): undefined reference to `xdp_analyzer_ops'
It seems that the code was written to deal with varying contents of
the arrray, but the actual #ifdef was missing. Both tc_cls_act_analyzer_ops
and xdp_analyzer_ops are defined in the core networking code, so adding
a check for CONFIG_NET seems appropriate here, and I've verified this with
many randconfig builds
Fixes: 4f9218aaf8a4 ("bpf: move knowledge about post-translation offsets out of verifier")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
- A number of MIPS GIC updates and cleanups
- One GICv4 update
- Another firmware workaround for GICv2
- Support for Mason8 GPIOs
- Tiny documentation fix
-----BEGIN PGP SIGNATURE-----
iQJJBAABCAAzFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAln7UeAVHG1hcmMuenlu
Z2llckBhcm0uY29tAAoJECPQ0LrRPXpDtLkQAII8Ij6h2nWN8MaSOcrOsar58PU9
gfa8HidjH7QRht5xcpaAYXO0+VO7etP37JpTlzcJfpaAU8tFTKAMJKbWaHRe2KKV
YbfRYWhnyizmb2Sn5+JDKlVVPAgR9tZN9UQyDceERb4yZmGOc9nHBObJIPAYtvnr
zKLPo5ldVs7bCQan+uGXMyncVGgJ70rYCNPhHqzyNbAUgkI3XA85Wk9NgBXQCr/2
8SiE6hs+QdkELy545Pi0oPxvqY5q4f2QmgOjWhQtJgbAQvWpizMi2Qz4zH8Gs89t
Iln6zaw8315vfPSiyXckbIlMoUUzejLuxtvwvdgdrjE5vogNkj+Z874R3yNbLZ2H
CW+DZz9gv+e2uzIiqBBpw1DatWBQQxoXvUVW1fyEOqwZPpr7JvAKwJ9Nwhc0kUdG
H2AjO1KL/WqiqCpEDAYF+zY3zLUSAaoNjMWX8+nlaUstgQO4hSyPyU3RUJXmpHf8
DHIVelZlhurKjQP6SfH1QKs+Wqv2QAZb8Otb8Pt/IMw73mXmRMbMPB7uOc2A7TqG
uceAQuOyVBG+QF2dGiXZ5RqTNtL1gWns+S+abJCtRkD+JV83teZw3I4sMnVpeqtO
rdavJVejrePck8Tgd1qOCGfXUQrBhCG1xJFkt/Y9tWbduU0R4V1y5J7LcGBLMYSy
cV5WD3fejfjayTlT
=Aozt
-----END PGP SIGNATURE-----
Merge tag 'irqchip-4.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core
Pull the second batch of irqchip updates for 4.15 from marc Zyngier:
- A number of MIPS GIC updates and cleanups
- One GICv4 update
- Another firmware workaround for GICv2
- Support for Mason8 GPIOs
- Tiny documentation fix
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note 930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
SPDX license identifier # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note 270
GPL-2.0+ WITH Linux-syscall-note 169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17
LGPL-2.1+ WITH Linux-syscall-note 15
GPL-1.0+ WITH Linux-syscall-note 14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5
LGPL-2.0+ WITH Linux-syscall-note 4
LGPL-2.1 WITH Linux-syscall-note 3
((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3
((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWfswbQ8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ykvEwCfXU1MuYFQGgMdDmAZXEc+xFXZvqgAoKEcHDNA
6dVh26uchcEQLN/XqUDt
=x306
-----END PGP SIGNATURE-----
Merge tag 'spdx_identifiers-4.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull initial SPDX identifiers from Greg KH:
"License cleanup: add SPDX license identifiers to some files
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the
'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally
binding shorthand, which can be used instead of the full boiler plate
text.
This patch is based on work done by Thomas Gleixner and Kate Stewart
and Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset
of the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to
license had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier to be applied
to a file was done in a spreadsheet of side by side results from of
the output of two independent scanners (ScanCode & Windriver)
producing SPDX tag:value files created by Philippe Ombredanne.
Philippe prepared the base worksheet, and did an initial spot review
of a few 1000 files.
The 4.13 kernel was the starting point of the analysis with 60,537
files assessed. Kate Stewart did a file by file comparison of the
scanner results in the spreadsheet to determine which SPDX license
identifier(s) to be applied to the file. She confirmed any
determination that was not immediately clear with lawyers working with
the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained
>5 lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that
was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note 930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
SPDX license identifier # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note 270
GPL-2.0+ WITH Linux-syscall-note 169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17
LGPL-2.1+ WITH Linux-syscall-note 15
GPL-1.0+ WITH Linux-syscall-note 14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5
LGPL-2.0+ WITH Linux-syscall-note 4
LGPL-2.1 WITH Linux-syscall-note 3
((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3
((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that
became the concluded license(s).
- when there was disagreement between the two scanners (one detected
a license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply
(and which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases,
confirmation by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights.
The Windriver scanner is based on an older version of FOSSology in
part, so they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot
checks in about 15000 files.
In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect
the correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial
patch version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch
license was not GPL-2.0 WITH Linux-syscall-note to ensure that the
applied SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
* tag 'spdx_identifiers-4.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
License cleanup: add SPDX license identifier to uapi header files with a license
License cleanup: add SPDX license identifier to uapi header files with no license
License cleanup: add SPDX GPL-2.0 license identifier to files with no license
In commit 30d6e0a4190d ("futex: Remove duplicated code and fix undefined
behaviour"), I let FUTEX_WAKE_OP to fail on invalid op. Namely when op
should be considered as shift and the shift is out of range (< 0 or > 31).
But strace's test suite does this madness:
futex(0x7fabd78bcffc, 0x5, 0xfacefeed, 0xb, 0x7fabd78bcffc, 0xa0caffee);
futex(0x7fabd78bcffc, 0x5, 0xfacefeed, 0xb, 0x7fabd78bcffc, 0xbadfaced);
futex(0x7fabd78bcffc, 0x5, 0xfacefeed, 0xb, 0x7fabd78bcffc, 0xffffffff);
When I pick the first 0xa0caffee, it decodes as:
0x80000000 & 0xa0caffee: oparg is shift
0x70000000 & 0xa0caffee: op is FUTEX_OP_OR
0x0f000000 & 0xa0caffee: cmp is FUTEX_OP_CMP_EQ
0x00fff000 & 0xa0caffee: oparg is sign-extended 0xcaf = -849
0x00000fff & 0xa0caffee: cmparg is sign-extended 0xfee = -18
That means the op tries to do this:
(futex |= (1 << (-849))) == -18
which is completely bogus. The new check of op in the code is:
if (encoded_op & (FUTEX_OP_OPARG_SHIFT << 28)) {
if (oparg < 0 || oparg > 31)
return -EINVAL;
oparg = 1 << oparg;
}
which results obviously in the "Invalid argument" errno:
FAIL: futex
===========
futex(0x7fabd78bcffc, 0x5, 0xfacefeed, 0xb, 0x7fabd78bcffc, 0xa0caffee) = -1: Invalid argument
futex.test: failed test: ../futex failed with code 1
So let us soften the failure to print only a (ratelimited) message, crop
the value and continue as if it were right. When userspace keeps up, we
can switch this to return -EINVAL again.
[v2] Do not return 0 immediatelly, proceed with the cropped value.
Fixes: 30d6e0a4190d ("futex: Remove duplicated code and fix undefined behaviour")
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Darren Hart <dvhart@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note 930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
SPDX license identifier # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note 270
GPL-2.0+ WITH Linux-syscall-note 169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17
LGPL-2.1+ WITH Linux-syscall-note 15
GPL-1.0+ WITH Linux-syscall-note 14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5
LGPL-2.0+ WITH Linux-syscall-note 4
LGPL-2.1 WITH Linux-syscall-note 3
((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3
((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Follow-up to 0fd4759c5515 ("bpf: fix pattern matches for direct
packet access") to cover also the remaining data_meta/data matches
in the verifier. The matches are also refactored a bit to simplify
handling of all the cases.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Two minor cleanups after Dave's recent merge in f8ddadc4db6c
("Merge git://git.kernel.org...") of net into net-next in
order to get the code in line with what was done originally
in the net tree: i) use max() instead of max_t() since both
ranges are u16, ii) don't split the direct access test cases
in the middle with bpf_exit test cases from 390ee7e29fc
("bpf: enforce return code for cgroup-bpf programs").
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Discovered that the compiler laid-out asm code in suboptimal way
when studying perf report during benchmarking of cpumap. Help
the compiler by the marking unlikely code paths.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Smooth Cong Wang's bug fix into 'net-next'. Basically put
the bulk of the tcf_block_put() logic from 'net' into
tcf_block_put_ext(), but after the offload unbind.
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull signal bugfix from Eric Biederman:
"When making the generic support for SIGEMT conditional on the presence
of SIGEMT I made a typo that causes it to fail to activate. It was
noticed comparatively quickly but the bug report just made it to me
today"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
signal: Fix name of SIGEMT in #if defined() check
Commit cc731525f26a ("signal: Remove kernel interal si_code magic")
added a check for SIGMET and NSIGEMT being defined. That SIGMET should
in fact be SIGEMT, with SIGEMT being defined in
arch/{alpha,mips,sparc}/include/uapi/asm/signal.h
This was actually pointed out by BenHutchings in a lwn.net comment
here https://lwn.net/Comments/734608/
Fixes: cc731525f26a ("signal: Remove kernel interal si_code magic")
Signed-off-by: Andrew Clayton <andrew@digital-domain.net>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Guenter reported:
There is still a problem. When running
echo 6 > /proc/sys/kernel/watchdog_thresh
echo 5 > /proc/sys/kernel/watchdog_thresh
repeatedly, the message
NMI watchdog: Enabled. Permanently consumes one hw-PMU counter.
stops after a while (after ~10-30 iterations, with fluctuations).
Maybe watchdog_cpus needs to be atomic ?
That's correct as this again is affected by the asynchronous nature of the
smpboot thread unpark mechanism.
CPU 0 CPU1 CPU2
write(watchdog_thresh, 6)
stop()
park()
update()
start()
unpark()
thread->unpark()
cnt++;
write(watchdog_thresh, 5) thread->unpark()
stop()
park() thread->park()
cnt--; cnt++;
update()
start()
unpark()
That's not a functional problem, it just affects the informational message.
Convert watchdog_cpus to atomic_t to prevent the problem
Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20171101181126.j727fqjmdthjz4xk@redhat.com
Guenter reported a crash in the watchdog/perf code, which is caused by
cleanup() and enable() running concurrently. The reason for this is:
The watchdog functions are serialized via the watchdog_mutex and cpu
hotplug locking, but the enable of the perf based watchdog happens in
context of the unpark callback of the smpboot thread. But that unpark
function is not synchronous inside the locking. The unparking of the thread
just wakes it up and leaves so there is no guarantee when the thread is
executing.
If it starts running _before_ the cleanup happened then it will create a
event and overwrite the dead event pointer. The new event is then cleaned
up because the event is marked dead.
lock(watchdog_mutex);
lockup_detector_reconfigure();
cpus_read_lock();
stop();
park()
update();
start();
unpark()
cpus_read_unlock(); thread runs()
overwrite dead event ptr
cleanup();
free new event, which is active inside perf....
unlock(watchdog_mutex);
The park side is safe as that actually waits for the thread to reach
parked state.
Commit a33d44843d45 removed the protection against this kind of scenario
under the stupid assumption that the hotplug serialization and the
watchdog_mutex cover everything.
Bring it back.
Reverts: a33d44843d45 ("watchdog/hardlockup/perf: Simplify deferred event destroy")
Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Thomas Feels-stupid Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Don Zickus <dzickus@redhat.com>
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1710312145190.1942@nanos
clockevent_device::next_event holds the next timer event of a clock event
device. The value is updated in clockevents_program_event(), i.e. when the
hardware timer is armed for the next expiry.
When there are no software timers armed on a CPU, the corresponding per CPU
clockevent device is brought into ONESHOT_STOPPED state, but
clockevent_device::next_event is not updated, because
clockevents_program_event() is not called.
So the content of clockevent_device::next_event is stale, which is not an
issue when real hardware is used. But the hrtimer broadcast device relies
on that information and the stale value causes spurious wakeups.
Update clockevent_device::next_event to KTIME_MAX when it has been brought
into ONESHOT_STOPPED state to avoid spurious wakeups. This reflects the
proper expiry time of the stopped timer: infinity.
[ tglx: Massaged changelog ]
Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: viresh.kumar@linaro.org
Link: https://lkml.kernel.org/r/1509043042-32486-1-git-send-email-psodagud@codeaurora.org
When the irqaffinity= kernel parameter is passed in a CPUMASK_OFFSTACK=y
kernel, it fails to boot, because zalloc_cpumask_var() cannot be used before
initializing the slab allocator to allocate a cpumask.
So, use alloc_bootmem_cpumask_var() instead.
Also do some cleanups while at it: in init_irq_default_affinity() remove
an #ifdef via using cpumask_available().
Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20171026045800.27087-1-rakib.mullick@gmail.com
Link: http://lkml.kernel.org/r/20171101041451.12581-1-rakib.mullick@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Dmitry (through syzbot) reported being able to trigger the WARN in
get_pi_state() and a use-after-free on:
raw_spin_lock_irq(&pi_state->pi_mutex.wait_lock);
Both are due to this race:
exit_pi_state_list() put_pi_state()
lock(&curr->pi_lock)
while() {
pi_state = list_first_entry(head);
hb = hash_futex(&pi_state->key);
unlock(&curr->pi_lock);
dec_and_test(&pi_state->refcount);
lock(&hb->lock)
lock(&pi_state->pi_mutex.wait_lock) // uaf if pi_state free'd
lock(&curr->pi_lock);
....
unlock(&curr->pi_lock);
get_pi_state(); // WARN; refcount==0
The problem is we take the reference count too late, and don't allow it
being 0. Fix it by using inc_not_zero() and simply retrying the loop
when we fail to get a refcount. In that case put_pi_state() should
remove the entry from the list.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Gratian Crisan <gratian.crisan@ni.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: dvhart@infradead.org
Cc: syzbot <bot+2af19c9e1ffe4d4ee1d16c56ae7580feaee75765@syzkaller.appspotmail.com>
Cc: syzkaller-bugs@googlegroups.com
Cc: <stable@vger.kernel.org>
Fixes: c74aef2d06a9 ("futex: Fix pi_state->owner serialization")
Link: http://lkml.kernel.org/r/20171031101853.xpfh72y643kdfhjs@hirez.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
During perf event attaching/detaching bpf programs,
the tp_event->prog_array change is protected by the
bpf_event_mutex lock in both attaching and deteching
functions. Although tp_event->prog_array is a rcu
pointer, rcu_derefrence is not needed to access it
since mutex lock will guarantee ordering.
Verified through "make C=2" that sparse
locking check still happy with the new change.
Also change the label name in perf_event_{attach,detach}_bpf_prog
from "out" to "unlock" to reflect the code action after the label.
Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that SK_REDIRECT is no longer a valid return code. Remove it
from the UAPI completely. Then do a namespace remapping internal
to sockmap so SK_REDIRECT is no longer externally visible.
Patchs primary change is to do a namechange from SK_REDIRECT to
__SK_REDIRECT
Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
the verifier got progressively smarter over time and size of its internal
state grew as well. Time to reduce the memory consumption.
Before:
sizeof(struct bpf_verifier_state) = 6520
After:
sizeof(struct bpf_verifier_state) = 896
It's done by observing that majority of BPF programs use little to
no stack whereas verifier kept all of 512 stack slots ready always.
Instead dynamically reallocate struct verifier state when stack
access is detected.
Runtime difference before vs after is within a noise.
The number of processed instructions stays the same.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Consolidate the code to write to the new mapping at the end of the
function to remove the duplication. Move the increase in the number
of mappings into insert_extent, keeping the logic together.
Just a small increase in readability and maintainability.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
There is no good reason for this code duplication, the number of cache
line accesses not the number of instructions are the bottleneck in
this code.
Therefore simplify maintenance by removing unnecessary code.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
This is important so reading /proc/<pid>/{uid_map,gid_map,projid_map} while
the map is being written does not do strange things.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Consolidate reading the number of extents and computing the return
value in the map_id_down, map_id_range_down and map_id_range.
This removal of one read of extents makes one smp_rmb unnecessary
and makes the code safe it is executed during the map write. Reading
the number of extents twice and depending on the result being the same
is not safe, as it could be 0 the first time and > 5 the second time,
which would lead to misinterpreting the union fields.
The consolidation of the return value just removes a duplicate
caluculation which should make it easier to understand and maintain
the code.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
There are quite some use cases where users run into the current limit for
{g,u}id mappings. Consider a user requesting us to map everything but 999, and
1001 for a given range of 1000000000 with a sub{g,u}id layout of:
some-user:100000:1000000000
some-user:999:1
some-user:1000:1
some-user:1001:1
some-user:1002:1
This translates to:
MAPPING-TYPE | CONTAINER | HOST | RANGE |
-------------|-----------|---------|-----------|
uid | 999 | 999 | 1 |
uid | 1001 | 1001 | 1 |
uid | 0 | 1000000 | 999 |
uid | 1000 | 1001000 | 1 |
uid | 1002 | 1001002 | 999998998 |
------------------------------------------------
gid | 999 | 999 | 1 |
gid | 1001 | 1001 | 1 |
gid | 0 | 1000000 | 999 |
gid | 1000 | 1001000 | 1 |
gid | 1002 | 1001002 | 999998998 |
which is already the current limit.
As discussed at LPC simply bumping the number of limits is not going to work
since this would mean that struct uid_gid_map won't fit into a single cache-line
anymore thereby regressing performance for the base-cases. The same problem
seems to arise when using a single pointer. So the idea is to use
struct uid_gid_extent {
u32 first;
u32 lower_first;
u32 count;
};
struct uid_gid_map { /* 64 bytes -- 1 cache line */
u32 nr_extents;
union {
struct uid_gid_extent extent[UID_GID_MAP_MAX_BASE_EXTENTS];
struct {
struct uid_gid_extent *forward;
struct uid_gid_extent *reverse;
};
};
};
For the base cases we will only use the struct uid_gid_extent extent member. If
we go over UID_GID_MAP_MAX_BASE_EXTENTS mappings we perform a single 4k
kmalloc() which means we can have a maximum of 340 mappings
(340 * size(struct uid_gid_extent) = 4080). For the latter case we use two
pointers "forward" and "reverse". The forward pointer points to an array sorted
by "first" and the reverse pointer points to an array sorted by "lower_first".
We can then perform binary search on those arrays.
Performance Testing:
When Eric introduced the extent-based struct uid_gid_map approach he measured
the performanc impact of his idmap changes:
> My benchmark consisted of going to single user mode where nothing else was
> running. On an ext4 filesystem opening 1,000,000 files and looping through all
> of the files 1000 times and calling fstat on the individuals files. This was
> to ensure I was benchmarking stat times where the inodes were in the kernels
> cache, but the inode values were not in the processors cache. My results:
> v3.4-rc1: ~= 156ns (unmodified v3.4-rc1 with user namespace support disabled)
> v3.4-rc1-userns-: ~= 155ns (v3.4-rc1 with my user namespace patches and user namespace support disabled)
> v3.4-rc1-userns+: ~= 164ns (v3.4-rc1 with my user namespace patches and user namespace support enabled)
I used an identical approach on my laptop. Here's a thorough description of what
I did. I built a 4.14.0-rc4 mainline kernel with my new idmap patches applied. I
booted into single user mode and used an ext4 filesystem to open/create
1,000,000 files. Then I looped through all of the files calling fstat() on each
of them 1000 times and calculated the mean fstat() time for a single file. (The
test program can be found below.)
Here are the results. For fun, I compared the first version of my patch which
scaled linearly with the new version of the patch:
| # MAPPINGS | PATCH-V1 | PATCH-NEW |
|--------------|------------|-----------|
| 0 mappings | 158 ns | 158 ns |
| 1 mappings | 164 ns | 157 ns |
| 2 mappings | 170 ns | 158 ns |
| 3 mappings | 175 ns | 161 ns |
| 5 mappings | 187 ns | 165 ns |
| 10 mappings | 218 ns | 199 ns |
| 50 mappings | 528 ns | 218 ns |
| 100 mappings | 980 ns | 229 ns |
| 200 mappings | 1880 ns | 239 ns |
| 300 mappings | 2760 ns | 240 ns |
| 340 mappings | not tested | 248 ns |
Here's the test program I used. I asked Eric what he did and this is a more
"advanced" implementation of the idea. It's pretty straight-forward:
#define __GNU_SOURCE
#define __STDC_FORMAT_MACROS
#include <errno.h>
#include <dirent.h>
#include <fcntl.h>
#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/stat.h>
#include <sys/time.h>
#include <sys/types.h>
int main(int argc, char *argv[])
{
int ret;
size_t i, k;
int fd[1000000];
int times[1000];
char pathname[4096];
struct stat st;
struct timeval t1, t2;
uint64_t time_in_mcs;
uint64_t sum = 0;
if (argc != 2) {
fprintf(stderr, "Please specify a directory where to create "
"the test files\n");
exit(EXIT_FAILURE);
}
for (i = 0; i < sizeof(fd) / sizeof(fd[0]); i++) {
sprintf(pathname, "%s/idmap_test_%zu", argv[1], i);
fd[i]= open(pathname, O_RDWR | O_CREAT, S_IXUSR | S_IXGRP | S_IXOTH);
if (fd[i] < 0) {
ssize_t j;
for (j = i; j >= 0; j--)
close(fd[j]);
exit(EXIT_FAILURE);
}
}
for (k = 0; k < 1000; k++) {
ret = gettimeofday(&t1, NULL);
if (ret < 0)
goto close_all;
for (i = 0; i < sizeof(fd) / sizeof(fd[0]); i++) {
ret = fstat(fd[i], &st);
if (ret < 0)
goto close_all;
}
ret = gettimeofday(&t2, NULL);
if (ret < 0)
goto close_all;
time_in_mcs = (1000000 * t2.tv_sec + t2.tv_usec) -
(1000000 * t1.tv_sec + t1.tv_usec);
printf("Total time in micro seconds: %" PRIu64 "\n",
time_in_mcs);
printf("Total time in nanoseconds: %" PRIu64 "\n",
time_in_mcs * 1000);
printf("Time per file in nanoseconds: %" PRIu64 "\n",
(time_in_mcs * 1000) / 1000000);
times[k] = (time_in_mcs * 1000) / 1000000;
}
close_all:
for (i = 0; i < sizeof(fd) / sizeof(fd[0]); i++)
close(fd[i]);
if (ret < 0)
exit(EXIT_FAILURE);
for (k = 0; k < 1000; k++) {
sum += times[k];
}
printf("Mean time per file in nanoseconds: %" PRIu64 "\n", sum / 1000);
exit(EXIT_SUCCESS);;
}
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
CC: Serge Hallyn <serge@hallyn.com>
CC: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
- Add a struct containing two pointer to extents and wrap both the static extent
array and the struct into a union. This is done in preparation for bumping the
{g,u}idmap limits for user namespaces.
- Add brackets around anonymous union when using designated initializers to
initialize members in order to please gcc <= 4.4.
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>